Index: head/UPDATING =================================================================== --- head/UPDATING (revision 186118) +++ head/UPDATING (revision 186119) @@ -1,1181 +1,1190 @@ Updating Information for FreeBSD current users This file is maintained and copyrighted by M. Warner Losh. See end of file for further details. For commonly done items, please see the COMMON ITEMS: section later in the file. Items affecting the ports and packages system can be found in /usr/ports/UPDATING. Please read that file before running portupgrade. NOTE TO PEOPLE WHO THINK THAT FreeBSD 8.x IS SLOW: FreeBSD 8.x has many debugging features turned on, in both the kernel and userland. These features attempt to detect incorrect use of system primitives, and encourage loud failure through extra sanity checking and fail-stop semantics. They also substantially impact system performance. If you want to do performance measurement, benchmarking, and optimization, you'll want to turn them off. This includes various WITNESS-related kernel options, INVARIANTS, malloc debugging flags in userland, and various verbose features in the kernel. Many developers choose to disable these features on build machines to maximize performance. (To disable malloc debugging, run ln -s aj /etc/malloc.conf.) +20081214: + __FreeBSD_version 800059 incorporates the new arp-v2 rewrite. + RTF_CLONING, RTF_LLINFO and RTF_WASCLONED flags are eliminated. + The new code reduces struct rtentry{} by 16 bytes on 32-bit + architectures and 40 bytes on 64-bit architectures. The userland + applications "arp" and "ndp" have been updated accordingly. + The output from "netstat -r" shows only routing entries and + none of the L2 information. + 20081130: __FreeBSD_version 800057 marks the switchover from the binary ath hal to source code. Users must add the line: options AH_SUPPORT_AR5416 to their kernel config files when specifying: device ath_hal The ath_hal module no longer exists; the code is now compiled together with the driver in the ath module. It is now possible to tailor chip support (i.e. reduce the set of chips and thereby the code size); consult ath_hal(4) for details. 20081121: __FreeBSD_version 800054 adds memory barriers to <machine/atomic.h>, new interfaces to ifnet to facilitate multiple hardware transmit queues for cards that support them, and a lock-less ring-buffer implementation to enable drivers to more efficiently manage queueing of packets. 20081117: A new version of ZFS (version 13) has been merged to -HEAD. This version has the zpool attribute "listsnapshots" off by default, which means "zfs list" does not show snapshots, and is the same as Solaris behavior. 20081028: dummynet(4) ABI has changed. ipfw(8) needs to be recompiled. 20081009: The uhci, ohci, ehci and slhci USB Host controller drivers have been put into separate modules. If you load the usb module separately through loader.conf you will need to load the appropriate *hci module as well. E.g. for a UHCI-based USB 2.0 controller add the following to loader.conf: uhci_load="YES" ehci_load="YES" 20081009: The ABI used by the PMC toolset has changed. Please keep userland (libpmc(3)) and the kernel module (hwpmc(4)) in sync. 20080820: The TTY subsystem of the kernel has been replaced by a new implementation, which provides better scalability and an improved driver model. Most common drivers have been migrated to the new TTY subsystem, while others have not.
The following drivers have not yet been ported to the new TTY layer: PCI/ISA: cy, digi, rc, rp, sio USB: ubser, ucycom Line disciplines: ng_h4, ng_tty, ppp, sl, snp Adding these drivers to your kernel configuration file will cause compilation to fail. 20080818: ntpd has been upgraded to 4.2.4p5. 20080801: OpenSSH has been upgraded to 5.1p1. For many years, FreeBSD's version of OpenSSH preferred DSA over RSA for host and user authentication keys. With this upgrade, we've switched to the vendor's default of RSA over DSA. This may cause upgraded clients to warn about unknown host keys even for previously known hosts. Users should follow the usual procedure for verifying host keys before accepting the RSA key. This can be circumvented by setting the "HostKeyAlgorithms" option to "ssh-dss,ssh-rsa" in ~/.ssh/config or on the ssh command line. Please note that the sequence of keys offered for authentication has been changed as well. You may want to specify IdentityFile in a different order to revert this behavior. 20080713: The sio(4) driver has been removed from the i386 and amd64 kernel configuration files. This means uart(4) is now the default serial port driver on those platforms as well. To prevent collisions with the sio(4) driver, the uart(4) driver uses different names for its device nodes. This means the onboard serial port will now most likely be called "ttyu0" instead of "ttyd0". You may need to reconfigure applications to use the new device names. When using the serial port as a boot console, be sure to update /boot/device.hints and /etc/ttys before booting the new kernel. If you forget to do so, you can still manually specify the hints at the loader prompt: set hint.uart.0.at="isa" set hint.uart.0.port="0x3F8" set hint.uart.0.flags="0x10" set hint.uart.0.irq="4" boot -s 20080609: The gpt(8) utility has been removed. Use gpart(8) to partition disks instead. 20080603: The version that the Linuxulator emulates was changed from 2.4.2 to 2.6.16. If you experience any problems with Linux binaries please try setting the sysctl compat.linux.osrelease to 2.4.2, and if that fixes the problem, contact the emulation mailing list. 20080525: ISDN4BSD (I4B) was removed from the src tree. You may need to update your kernel configuration and remove relevant entries. 20080509: I have checked in code to support multiple routing tables. See the man pages setfib(1) and setfib(2). This is a hopefully backwards compatible version, but to make use of it you need to compile your kernel with options ROUTETABLES=2 (or more, up to 16). 20080420: The 802.11 wireless support was redone to enable multi-bss operation on devices that are capable. The underlying device is no longer used directly but instead wlanX devices are cloned with ifconfig. This requires changes to rc.conf files. For example, change: ifconfig_ath0="WPA DHCP" to wlans_ath0=wlan0 ifconfig_wlan0="WPA DHCP" see rc.conf(5) for more details. In addition, mergemaster of /etc/rc.d is highly recommended. Simultaneous update of userland and kernel wouldn't hurt either. As part of the multi-bss changes the wlan_scan_ap and wlan_scan_sta modules were merged into the base wlan module. All references to these modules (e.g. in kernel config files) must be removed. 20080408: psm(4) has gained write(2) support in native operation level. Arbitrary commands can be written to /dev/psm%d and status can be read back from it. Therefore, an application is responsible for status validation and error recovery. It is a no-op in other operation levels.
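	For illustration only, a minimal userland sketch of this interface
	might look like the following (the device node, the required
	native operation level, and the exact command set and response
	framing are assumptions here; see psm(4) for the authoritative
	details):

		#include <fcntl.h>
		#include <stdio.h>
		#include <unistd.h>

		int
		main(void)
		{
			/* 0xe9 is the PS/2 "status request" command;
			   adjust for the commands you actually need. */
			unsigned char cmd = 0xe9;
			unsigned char status[3];
			int fd;

			if ((fd = open("/dev/psm0", O_RDWR)) == -1) {
				perror("open /dev/psm0");
				return (1);
			}
			if (write(fd, &cmd, 1) != 1)
				perror("write");
			else if (read(fd, status, sizeof(status)) > 0)
				printf("status: %02x %02x %02x\n",
				    status[0], status[1], status[2]);
			close(fd);
			return (0);
		}

	As noted above, the application, not the driver, is expected to
	validate the returned status bytes and recover from errors.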
20080312: Support for KSE threading has been removed from the kernel. To run legacy applications linked against KSE, libmap.conf may be used. The following libmap.conf may be used to ensure compatibility with any prior release: libpthread.so.1 libthr.so.1 libpthread.so.2 libthr.so.2 libkse.so.3 libthr.so.3 20080301: The layout of struct vmspace has changed. This affects libkvm and any executables that link against libkvm and use the kvm_getprocs() function. In particular, but not exclusively, it affects ps(1), fstat(1), pkill(1), systat(1), top(1) and w(1). The effects are minimal, but it's advisable to upgrade world nonetheless. 20080229: The latest em driver no longer has support for the 82575 adapter; this has been moved to the igb driver. The split was done to make new features that are incompatible with older hardware easier to do. 20080220: The new geom_lvm(4) geom class has been renamed to geom_linux_lvm(4); likewise, the kernel option is now GEOM_LINUX_LVM. 20080211: The default NFS mount mode has changed from UDP to TCP for increased reliability. If you rely on (insecurely) NFS mounting across a firewall you may need to update your firewall rules. 20080208: Belatedly note the addition of m_collapse for compacting mbuf chains. 20080126: The fts(3) structures have been changed to use adequate integer types for their members and so to be able to cope with huge file trees. The old fts(3) ABI is preserved through symbol versioning in libc, so third-party binaries using fts(3) should still work, although they will not take advantage of the extended types. At the same time, some third-party software might fail to build after this change due to unportable assumptions made in its source code about fts(3) structure members. Such software should be fixed by its vendor or, in the worst case, in the ports tree. __FreeBSD_version 800015 marks this change for the unlikely case that a portable fix is impossible. 20080123: To upgrade to -current after this date, you must be running FreeBSD not older than 6.0-RELEASE. Upgrading to -current from 5.x now requires a stopover at a RELENG_6 or RELENG_7 system. 20071128: The ADAPTIVE_GIANT kernel option has been retired because its functionality is the default now. 20071118: The AT keyboard emulation of sunkbd(4) has been turned on by default. In order to make the special symbols of the Sun keyboards driven by sunkbd(4) work under X these now have to be configured the same way as Sun USB keyboards driven by ukbd(4) (which also does AT keyboard emulation), e.g.: Option "XkbLayout" "us" Option "XkbRules" "xorg" Option "XkbSymbols" "pc(pc105)+sun_vndr/usb(sun_usb)+us" 20071024: It has been decided that it is desirable to provide ABI backwards compatibility to the FreeBSD 4/5/6 versions of the PCIOCGETCONF, PCIOCREAD and PCIOCWRITE IOCTLs, which was broken with the introduction of PCI domain support (see the 20070930 entry). Unfortunately, this required the ABI of PCIOCGETCONF to be broken again in order to be able to provide backwards compatibility to the old version of that IOCTL. Thus consumers of PCIOCGETCONF have to be recompiled again. As for prominent ports, this time it affects neither pciutils nor xorg-server; the hal port, however, needs to be rebuilt. 20071020: The misnamed kthread_create() and friends have been renamed to kproc_create() etc. Many of the callers already used kproc_start(). I will return kthread_create() and friends in a while with implementations that actually create threads, not procs. Renaming corresponds with version 800002.
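	As an illustrative sketch only (hypothetical module code; check
	sys/kthread.h for the exact prototype), a caller that used to
	invoke kthread_create() would now do something like:

		#include <sys/param.h>
		#include <sys/systm.h>
		#include <sys/proc.h>
		#include <sys/kthread.h>

		static struct proc *example_proc;	/* hypothetical */

		static void
		example_main(void *arg)
		{
			/* ... periodic work goes here ... */
			kproc_exit(0);
		}

		static int
		example_start(void)
		{
			/* formerly: kthread_create(example_main, ...) */
			return (kproc_create(example_main, NULL,
			    &example_proc, 0, 0, "example"));
		}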
20071010: RELENG_7 branched. 20071009: Setting WITHOUT_LIBPTHREAD now means WITHOUT_LIBKSE and WITHOUT_LIBTHR are set. 20070930: The PCI code has been made aware of PCI domains. This means that the location strings as used by pciconf(8) etc. are now in the following format: pci<domain>:<bus>:<slot>[:<function>]. It also means that consumers of <sys/pciio.h> potentially need to be recompiled; this includes the hal and xorg-server ports. 20070928: The caching daemon (cached) was renamed to nscd. The nscd.conf configuration file should be used instead of cached.conf, and the nscd_enable, nscd_pidfile and nscd_flags options should be used instead of cached_enable, cached_pidfile and cached_flags in rc.conf. 20070921: The getfacl(1) utility now prints owning user and group name instead of owning uid and gid in the three line comment header. This is the same behavior as getfacl(1) on Solaris and Linux. 20070704: The new IPsec code is now compiled in using the IPSEC option. The IPSEC option now requires "device crypto" be defined in your kernel configuration. The FAST_IPSEC kernel option is now deprecated. 20070702: The packet filter (pf) code has been updated to OpenBSD 4.1. Please note the changed syntax - keep state is now on by default. Also note the fact that ftp-proxy(8) has been changed from bottom up and has been moved from libexec to usr/sbin. Changes in the ALTQ handling also affect users of IPFW's ALTQ capabilities. 20070701: Remove KAME IPsec in favor of FAST_IPSEC, which is now the only IPsec supported by FreeBSD. The new IPsec stack supports both IPv4 and IPv6. The kernel option will change after the code changes have settled in. For now the kernel option IPSEC is deprecated and FAST_IPSEC is the only option; that will change after some settling time. 20070701: The wicontrol(8) utility has been removed from the base system. wi(4) cards should be configured using ifconfig(8); see the man page for more information. 20070612: The i386/amd64 GENERIC kernel now defaults to the nfe(4) driver instead of the nve(4) driver. Please update your configuration accordingly. 20070612: By default, /etc/rc.d/sendmail no longer rebuilds the aliases database if it is missing or older than the aliases file. If desired, set the new rc.conf option sendmail_rebuild_aliases to "YES" to restore that functionality. 20070612: The IPv4 multicast socket code has been considerably modified, and moved to the file sys/netinet/in_mcast.c. Initial support for the RFC 3678 Source-Specific Multicast Socket API has been added to the IPv4 network stack. Strict multicast and broadcast reception is now the default for UDP/IPv4 sockets; the net.inet.udp.strict_mcast_mship sysctl variable has now been removed. The RFC 1724 hack for interface selection has been removed; the use of the Linux-derived ip_mreqn structure with IP_MULTICAST_IF has been added to replace it. Consumers such as routed will soon be updated to reflect this. These changes affect users who are running routed(8) or rdisc(8) from the FreeBSD base system on point-to-point or unnumbered interfaces. 20070610: The net80211 layer has changed significantly and all wireless drivers that depend on it need to be recompiled. Further, these changes require that any program that interacts with the wireless support in the kernel be recompiled; this includes: ifconfig, wpa_supplicant, hostapd, and wlanstats. Users must also, for the moment, kldload the wlan_scan_sta and/or wlan_scan_ap modules if they use modules for wireless support. These modules implement scanning support for station and ap modes, respectively.
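	For example (illustrative), a station-mode user relying on
	modules would load the scanning policy module before bringing
	the interface up:

		kldload wlan_scan_sta

	or have it loaded at boot by adding wlan_scan_sta_load="YES"
	(and/or wlan_scan_ap_load="YES" for hostap mode) to
	loader.conf(5).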
Failure to load the appropriate module before marking a wireless interface up will result in a message to the console and the device not operating properly. 20070610: The pam_nologin(8) module ceases to provide an authentication function and starts providing an account management function. Consequent changes to /etc/pam.d should be brought in using mergemaster(8). Third-party files in /usr/local/etc/pam.d may need manual editing as follows. Locate this line (or similar): auth required pam_nologin.so no_warn and change it according to this example: account required pam_nologin.so no_warn That is, the first word needs to be changed from "auth" to "account". The new line can be moved to the account section within the file for clarity. Not updating pam.conf(5) files will result in nologin(5) being ignored by the respective services. 20070529: The ether_ioctl() function has been synchronized with ioctl(2) and ifnet.if_ioctl. Due to that, the size of one of its arguments has changed on 64-bit architectures. All kernel modules using ether_ioctl() need to be rebuilt on such architectures. 20070516: Improved INCLUDE_CONFIG_FILE support has been introduced to the config(8) utility. In order to take advantage of this new functionality, you are expected to recompile and install src/usr.sbin/config. If you don't rebuild config(8), and your kernel configuration depends on INCLUDE_CONFIG_FILE, the kernel build will be broken because of a missing "kernconfstring" symbol. 20070513: Symbol versioning is enabled by default. To disable it, use option WITHOUT_SYMVER. It is not advisable to attempt to disable symbol versioning once it is enabled; your installworld will break because a symbol version-less libc will get installed before the install tools. As a result, the old install tools, which previously had symbol dependencies to FBSD_1.0, will fail because the freshly installed libc will not have them. The default threading library (providing "libpthread") has been changed to libthr. If you wish to have libkse as your default, use option DEFAULT_THREAD_LIB=libkse for the buildworld. 20070423: The ABI breakage in sendmail(8)'s libmilter has been repaired so it is no longer necessary to recompile mail filters (aka, milters). If you recompiled mail filters after the 20070408 note, it is not necessary to recompile them again. 20070417: The new trunk(4) driver has been renamed to lagg(4) as it better reflects its purpose. ifconfig will need to be recompiled. 20070408: sendmail(8) has been updated to version 8.14.1. Mail filters (aka, milters) compiled against the libmilter included in the base operating system should be recompiled. 20070302: Firmware images for ipw(4) and iwi(4) are now included in the base tree. In order to use them one must agree to the respective LICENSE in share/doc/legal and define legal.intel_<driver>.license_ack=1 (e.g., legal.intel_ipw.license_ack=1) via loader.conf(5) or kenv(1). Make sure to deinstall the now deprecated modules from the respective firmware ports. 20070228: The name resolution/mapping functions addr2ascii(3) and ascii2addr(3) were removed from FreeBSD's libc. These originally came from INRIA IPv6. Nothing in FreeBSD ever used them. They may be regarded as deprecated in previous releases. The AF_LINK support for getnameinfo(3) was merged from NetBSD to replace it as a more portable (and re-entrant) API. 20070224: To support interrupt filtering, a modification to the newbus API has occurred; the ABI was broken and __FreeBSD_version was bumped to 700031. Please make sure that your kernel and modules are in sync.
For more info: http://docs.freebsd.org/cgi/mid.cgi?20070221233124.GA13941 20070224: The IPv6 multicast forwarding code may now be loaded into GENERIC kernels by loading the ip_mroute.ko module. This is built into the module unless WITHOUT_INET6 or WITHOUT_INET6_SUPPORT options are set; see src.conf(5) for more information. 20070214: The output of netstat -r has changed. Without -n, we now only print a "network name" without the prefix length if the network address and mask exactly match a Class A/B/C network, and an entry exists in the nsswitch "networks" map. With -n, we print the full unabbreviated CIDR network prefix in the form "a.b.c.d/p". 0.0.0.0/0 is always printed as "default". This change is in preparation for changes such as equal-cost multipath, and to more generally assist operational deployment of FreeBSD as a modern IPv4 router. 20070210: PIM has been turned on by default in the IPv4 multicast routing code. The kernel option 'PIM' has now been removed. PIM is now built by default if option 'MROUTING' is specified. It may now be loaded into GENERIC kernels by loading the ip_mroute.ko module. 20070207: Support for IPIP tunnels (VIFF_TUNNEL) in IPv4 multicast routing has been removed. Its functionality may be achieved by explicitly configuring gif(4) interfaces and using the 'phyint' keyword in mrouted.conf. XORP does not support source-routed IPv4 multicast tunnels nor the integrated IPIP tunneling, therefore it is not affected by this change. The __FreeBSD_version macro has been bumped to 700030. 20061221: Support for PCI Message Signalled Interrupts has been re-enabled in the bge driver, only for those chips which are believed to support it properly. If there are any problems, MSI can be disabled completely by setting the 'hw.pci.enable_msi' and 'hw.pci.enable_msix' tunables to 0 in the loader. 20061214: Support for PCI Message Signalled Interrupts has been disabled again in the bge driver. Many revisions of the hardware fail to support it properly. Support can be re-enabled by removing the #define of BGE_DISABLE_MSI in "src/sys/dev/bge/if_bge.c". 20061214: Support for PCI Message Signalled Interrupts has been added to the bge driver. If there are any problems, MSI can be disabled completely by setting the 'hw.pci.enable_msi' and 'hw.pci.enable_msix' tunables to 0 in the loader. 20061205: The removal of several facets of the experimental Threading system from the kernel means that the proc and thread structures have changed quite a bit. I suggest all kernel modules that might reference these structures be recompiled, especially the linux module. 20061126: Sound infrastructure has been updated with various fixes and improvements. Most of the changes are pretty much transparent, with the exception of the following: 1) All sound driver specific sysctls (hw.snd.pcm%d.*) have been moved to their own dev sysctl nodes, for example: hw.snd.pcm0.vchans -> dev.pcm.0.vchans 2) /dev/dspr%d.%d has been deprecated. Each channel now has its own chardev in the form of "dsp%d.%d", where the mode letter is p = playback, r = record and v = virtual, respectively. Users are encouraged to use these devs instead of (old) "/dev/dsp%d.%d". This does not affect those who are using "/dev/dsp". 20061122: geom(4)'s gmirror(8) class metadata structure has been rev'd from v3 to v4. If you update across this point and your metadata is converted for you, you will not be easily able to downgrade since the /boot/kernel.old/geom_mirror.ko kernel module will be unable to read the v4 metadata.
You can resolve this by doing the following from the loader(8) prompt: set vfs.root.mountfrom="ufs:/dev/XXX" where XXX is the root slice of one of the disks that composed the mirror (i.e.: /dev/ad0s1a). You can then rebuild the array the same way you built it originally. 20061122: The following binaries have been disconnected from the build: mount_devfs, mount_ext2fs, mount_fdescfs, mount_procfs, mount_linprocfs, and mount_std. The functionality of these programs has been moved into the mount program. For example, to mount a devfs filesystem, instead of using mount_devfs, use: "mount -t devfs". This does not affect entries in /etc/fstab, since entries in /etc/fstab are always processed with "mount -t fstype". 20061113: Support for PCI Message Signalled Interrupts on i386 and amd64 has been added to the kernel and various drivers will soon be updated to use MSI when it is available. If there are any problems, MSI can be disabled completely by setting the 'hw.pci.enable_msi' and 'hw.pci.enable_msix' tunables to 0 in the loader. 20061110: The MUTEX_PROFILING option has been renamed to LOCK_PROFILING. The lockmgr object layout has been changed as a result of having a lock_object embedded in it. As a consequence all file system kernel modules must be re-compiled. The mutex profiling man page has not yet been updated to reflect this change. 20061026: KSE in the kernel has now been made optional and turned on by default. Use 'nooption KSE' in your kernel config to turn it off. All kernel modules *must* be recompiled after this change. Thereafter, modules from a KSE kernel should be compatible with modules from a NOKSE kernel due to the temporary padding fields added to 'struct proc'. 20060929: mrouted and its utilities have been removed from the base system. 20060927: Some ioctl(2) command codes have changed. Full backward ABI compatibility is provided if the "options COMPAT_FREEBSD6" is present in the kernel configuration file. Make sure to add this option to your kernel config file, or recompile X.Org and the rest of the ports; otherwise they may refuse to work. 20060924: tcpslice has been removed from the base system. 20060913: The sizes of struct tcpcb (and struct xtcpcb) have changed due to the rewrite of TCP syncookies. Tools like netstat, sockstat, and systat need to be rebuilt. 20060903: libpcap updated to v0.9.4 and tcpdump to v3.9.4. 20060816: The IPFIREWALL_FORWARD_EXTENDED option is gone and the behaviour for IPFIREWALL_FORWARD is now as it was before when it was first committed and for years after. The behaviour is now ON. 20060725: enigma(1)/crypt(1) utility has been changed on 64 bit architectures. Now it can decrypt files created on different architectures. Unfortunately, it is no longer able to decrypt a cipher text generated with an older version on 64 bit architectures. If you have such a file, you need the old utility to decrypt it. 20060709: The interface version of the i4b kernel part has changed. So after updating the kernel sources and compiling a new kernel, the i4b user space tools in "/usr/src/usr.sbin/i4b" must also be rebuilt, and vice versa. 20060627: The XBOX kernel now defaults to the nfe(4) driver instead of the nve(4) driver. Please update your configuration accordingly. 20060514: The i386-only lnc(4) driver for the AMD Am7990 LANCE and Am79C9xx PCnet family of NICs has been removed.
The new le(4) driver serves as an equivalent but cross-platform replacement, with the pcn(4) driver still providing performance-optimized support for the subset of AMD Am79C971 PCnet-FAST and greater chips as before. 20060511: The machdep.* sysctls and the adjkerntz utility have been modified a bit. The new adjkerntz utility uses the new sysctl names and sysctlbyname() calls, so it may be impossible to run an old /sbin/adjkerntz utility in single-user mode with a new kernel. Replace the `adjkerntz -i' step before `make installworld' with: /usr/obj/usr/src/sbin/adjkerntz/adjkerntz -i and proceed as usual with the rest of the installworld-stage steps. Otherwise, you risk installing binaries with their timestamp set several hours in the future, especially if you are running with local time set to GMT+X hours. 20060412: The ip6fw utility has been removed. The behavior provided by ip6fw has been in ipfw2 for a good while and the rc.d scripts have been updated to deal with it. There are some rules that might not migrate cleanly. Use rc.firewall6 as a template to rewrite rules. 20060428: The puc(4) driver has been overhauled. The ebus(4) and sbus(4) attachments have been removed. Make sure to configure scc(4) on sparc64. Note also that by default puc(4) will use uart(4) and not sio(4) for serial ports because interrupt handling has been optimized for multi-port serial cards and only uart(4) implements the interface to support it. 20060330: The scc(4) driver replaces puc(4) for Serial Communications Controllers (SCCs) like the Siemens SAB82532 and the Zilog Z8530. On sparc64, it is advised to add scc(4) to the kernel configuration to make sure that the serial ports remain functional. 20060317: Most world/kernel related NO_* build options changed names. New knobs have common prefixes WITHOUT_*/WITH_* (modelled after FreeBSD ports) and should be set in /etc/src.conf (the src.conf(5) manpage is provided). Full backwards compatibility is maintained for the time being, though it's highly recommended to start moving old options out of the system-wide /etc/make.conf file into the new /etc/src.conf while also properly renaming them. More conversions will likely follow. Posting to current@: http://lists.freebsd.org/pipermail/freebsd-current/2006-March/061725.html 20060305: The NETSMBCRYPTO kernel option has been retired because its functionality is always included in NETSMB and smbfs.ko now. 20060303: The TDFX_LINUX kernel option was retired and replaced by the tdfx_linux device. The latter can be loaded as the 3dfx_linux.ko kernel module. Loading it alone should suffice to get 3dfx support for Linux apps because it will pull in 3dfx.ko and linux.ko through its dependencies. 20060204: The 'audit' group was added to support the new auditing functionality in the base system. Be sure to follow the directions for updating, including the requirement to run mergemaster -p. 20060201: The kernel ABI to file system modules was changed on i386. Please make sure that your kernel and modules are in sync. 20060118: This actually occurred some time ago, but installing the kernel now also installs a bunch of symbol files for the kernel modules. This increases the size of /boot/kernel to about 67Mbytes. You will need twice this if you will eventually back this up to kernel.old on your next install. If you have a shortage of room in your root partition, you should add -DINSTALL_NODEBUG to your make arguments or add INSTALL_NODEBUG="yes" to your /etc/make.conf. 20060113: libc's malloc implementation has been replaced.
This change has the potential to uncover application bugs that previously went unnoticed. See the malloc(3) manual page for more details. 20060112: The generic netgraph(4) cookie has been changed. If you upgrade the kernel past this point, you also need to upgrade userland and netgraph(4) utilities like ports/net/mpd or ports/net/mpd4. 20060106: si(4)'s device files now contain the unit number. Uses of {cua,tty}A[0-9a-f] should be replaced by {cua,tty}A0[0-9a-f]. 20060106: The kernel ABI was mostly destroyed due to a change in the size of struct lock_object which is nested in other structures such as mutexes which are nested in all sorts of other structures. Make sure your kernel and modules are in sync. 20051231: The page coloring algorithm in the VM subsystem was converted from tuning with kernel options to autotuning. Please remove any PQ_* option except PQ_NOOPT from your kernel config. 20051211: The net80211-related tools in the tools/tools/ath directory have been moved to tools/tools/net80211 and renamed with a "wlan" prefix. Scripts that use them should be adjusted accordingly. 20051202: Scripts in the local_startup directories (as defined in /etc/defaults/rc.conf) that have the new rc.d semantics will now be run as part of the base system rcorder. If there are errors or problems with one of these local scripts, it could cause boot problems. If you encounter such problems, boot in single-user mode and remove that script from the */rc.d directory. Please report the problem to the port's maintainer, and the freebsd-ports@freebsd.org mailing list. 20051129: The nodev mount option was deprecated in RELENG_6 (where it was a no-op), and is now unsupported. If you have nodev or dev listed in /etc/fstab, remove it; otherwise it will result in a mount error. 20051129: ABI between ipfw(4) and ipfw(8) has been changed. You need to rebuild ipfw(8) when rebuilding the kernel. 20051108: rp(4)'s device files now contain the unit number. Uses of {cua,tty}R[0-9a-f] should be replaced by {cua,tty}R0[0-9a-f]. 20051029: /etc/rc.d/ppp-user has been renamed to /etc/rc.d/ppp. Its /etc/rc.conf.d configuration file has been `ppp' from the beginning, and hence there is no need to touch it. 20051014: Now most modules get their build-time options from the kernel configuration file. A few modules still have fixed options due to their non-conformant implementation, but they will be corrected eventually. You may need to review the options of the modules in use, explicitly specify the non-default options in the kernel configuration file, and rebuild the kernel and modules afterwards. 20051001: kern.polling.enable sysctl MIB is now deprecated. Use ifconfig(8) to turn on polling(4) for your interfaces. 20050927: The old bridge(4) implementation was retired. The new if_bridge(4) serves as a fully functional replacement. 20050722: The ai_addrlen of a struct addrinfo was changed to a socklen_t to conform to POSIX-2001. This change broke ABI compatibility on 64-bit architectures. You have to recompile userland programs that use getaddrinfo(3) on 64-bit architectures. 20050711: RELENG_6 branched here. 20050629: The pccard_ifconfig rc.conf variable has been removed and a new variable, ifconfig_DEFAULT, has been introduced. Unlike pccard_ifconfig, ifconfig_DEFAULT applies to ALL interfaces that do not have ifconfig_ifn entries rather than just those in removable_interfaces.
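	For example (illustrative), a laptop that previously used

		pccard_ifconfig="DHCP"

	in rc.conf(5) would now use

		ifconfig_DEFAULT="DHCP"

	keeping in mind that this default now applies to every interface
	without its own ifconfig_ifn entry.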
20050616: Some previous versions of PAM have permitted the use of non-absolute paths in /etc/pam.conf or /etc/pam.d/* when referring to third party PAM modules in /usr/local/lib. A change has been made to require the use of absolute paths in order to avoid ambiguity and dependence on library path configuration, which may affect existing configurations. 20050610: Major changes to network interface API. All drivers must be recompiled. Drivers not in the base system will need to be updated to the new APIs. 20050609: Changes were made to kinfo_proc in sys/user.h. Please recompile userland, or commands like `fstat', `pkill', `ps', `top' and `w' will not behave correctly. The API and ABI for hwpmc(4) have changed with the addition of sampling support. Please recompile lib/libpmc(3) and usr.sbin/{pmcstat,pmccontrol}. 20050606: The OpenBSD dhclient was imported in place of the ISC dhclient and the network interface configuration scripts were updated accordingly. If you use DHCP to configure your interfaces, you must now run devd. Also, DNS updating was lost so you will need to find a workaround if you use this feature. The '_dhcp' user was added to support the OpenBSD dhclient. Be sure to run mergemaster -p (like you are supposed to do every time anyway). 20050605: if_bridge was added to the tree. This has changed struct ifnet. Please recompile userland and all network related modules. 20050603: The n_net of a struct netent was changed to a uint32_t, and the first argument of getnetbyaddr() was changed to a uint32_t, to conform to POSIX-2001. These changes broke ABI compatibility on 64-bit architectures. With these changes, the shlib major of libpcap was bumped. You have to recompile userland programs that use getnetbyaddr(3), getnetbyname(3), getnetent(3) and/or libpcap on 64-bit architectures. 20050528: Kernel parsing of extra options on '#!' first lines of shell scripts has changed. Lines with multiple options likely will fail after this date. For full details, please see http://people.freebsd.org/~gad/Updating-20050528.txt 20050503: The packet filter (pf) code has been updated to OpenBSD 3.7. Please note the changed anchor syntax and the fact that authpf(8) now needs a mounted fdescfs(5) to function. 20050415: The NO_MIXED_MODE kernel option has been removed from the i386 and amd64 platforms as its use has been superseded by the new local APIC timer code. Any kernel config files containing this option should be updated. 20050227: The on-disk format of LC_CTYPE files was changed to be machine independent. Please make sure NOT to use NO_CLEAN buildworld when crossing this point. Crossing this point also requires a recompile or reinstall of all locale-dependent packages. 20050225: The ifi_epoch member of struct if_data has been changed to contain the uptime at which the interface was created or the statistics zeroed rather than the wall clock time because wallclock time may go backwards. This should have no impact unless an SNMP implementation is using this value (I know of none at this point.) 20050224: The acpi_perf and acpi_throttle drivers are now part of the acpi(4) main module. They are no longer built separately. 20050223: The layout of struct image_params has changed. You have to recompile all compatibility modules (linux, svr4, etc) for use with the new kernel. 20050223: The p4tcc driver has been merged into cpufreq(4). This makes "options CPU_ENABLE_TCC" obsolete. Please load cpufreq.ko or compile in "device cpufreq" to restore this functionality.
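	For example, to load the module at boot one would typically add
	the following to loader.conf(5):

		cpufreq_load="YES"

	or add "device cpufreq" to the kernel configuration file, as
	noted above.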
20050220: The responsibility of recomputing the file system summary of a SoftUpdates-enabled dirty volume has been transferred to the background fsck. A rebuild of the fsck(8) utility is recommended if you have updated the kernel. To get the old behavior (recompute file system summary at mount time), you can set vfs.ffs.compute_summary_at_mount=1 before mounting the new volume. 20050206: The cpufreq import is complete. As part of this, the sysctls for acpi(4) throttling have been removed. The power_profile script has been updated, so you can use performance/economy_cpu_freq in rc.conf(5) to set AC on/offline cpu frequencies. 20050206: NG_VERSION has been increased. Recompiling the kernel (or ng_socket.ko) requires recompiling libnetgraph and userland netgraph utilities. 20050114: Support for abbreviated forms of a number of ipfw options is now deprecated. Warnings are printed to stderr indicating the correct full form when a match occurs. Some abbreviations may be supported at a later date based on user feedback. To be considered for support, abbreviations must be in use prior to this commit and unlikely to be confused with current key words. 20041221: By popular demand, a lot of NOFOO options were renamed to NO_FOO (see bsd.compat.mk for a full list). The old spellings are still supported, but will cause annoying warnings on stderr. Make sure you upgrade properly (see the COMMON ITEMS: section later in this file). 20041219: Auto-loading of ancillary wlan modules such as wlan_wep has been temporarily disabled; you need to statically configure the modules you need into your kernel or explicitly load them prior to use. Specifically, if you intend to use WEP encryption with an 802.11 device, load/configure wlan_wep; if you want to use WPA with the ath driver, load/configure wlan_tkip, wlan_ccmp, and wlan_xauth as required. 20041213: The behaviour of ppp(8) has changed slightly. If lqr is enabled (``enable lqr''), older versions would revert to LCP ECHO mode on negotiation failure. Now, ``enable echo'' is required for this behaviour. The ppp version number has been bumped to 3.4.2 to reflect the change. 20041201: The wlan support has been updated to split the crypto support into separate modules. For static WEP you must configure the wlan_wep module in your system or build and install the module in a place where it can be loaded (the kernel will auto-load the module when a wep key is configured). 20041201: The ath driver has been updated to split the tx rate control algorithm into a separate module. You need to include either ath_rate_onoe or ath_rate_amrr when configuring the kernel. 20041116: Support for systems with an 80386 CPU has been removed. Please use FreeBSD 5.x or earlier on systems with an 80386. 20041110: We have had a hack which would mount the root filesystem R/W if the device were named 'md*'. As part of the vnode work I'm doing I have had to remove this hack. People building systems which use preloaded MD root filesystems may need to insert a "/sbin/mount -u -o rw /dev/md0 /" in their /etc/rc scripts. 20041104: FreeBSD 5.3 shipped here. 20041102: The size of struct tcpcb has changed again due to the removal of RFC1644 T/TCP. You have to recompile userland programs that read kmem for tcp sockets directly (netstat, sockstat, etc.) 20041022: The size of struct tcpcb has changed. You have to recompile userland programs that read kmem for tcp sockets directly (netstat, sockstat, etc.) 20041016: RELENG_5 branched here. For older entries, please see updating in the RELENG_5 branch.
COMMON ITEMS: General Notes ------------- Avoid using make -j when upgrading. From time to time in the past there have been problems using -j with buildworld and/or installworld. This is especially true when upgrading between "distant" versions (eg one that cross a major release boundary or several minor releases, or when several months have passed on the -current branch). Sometimes, obscure build problems are the result of environment poisoning. This can happen because the make utility reads its environment when searching for values for global variables. To run your build attempts in an "environmental clean room", prefix all make commands with 'env -i '. See the env(1) manual page for more details. When upgrading from one major version to another it is generally best to upgrade to the latest code in the currently installed branch first, then do an upgrade to the new branch. This is the best-tested upgrade path, and has the highest probability of being successful. Please try this approach before reporting problems with a major version upgrade. To build a kernel ----------------- If you are updating from a prior version of FreeBSD (even one just a few days old), you should follow this procedure. It is the most failsafe as it uses a /usr/obj tree with a fresh mini-buildworld, make kernel-toolchain make -DALWAYS_CHECK_MAKE buildkernel KERNCONF=YOUR_KERNEL_HERE make -DALWAYS_CHECK_MAKE installkernel KERNCONF=YOUR_KERNEL_HERE To test a kernel once --------------------- If you just want to boot a kernel once (because you are not sure if it works, or if you want to boot a known bad kernel to provide debugging information) run make installkernel KERNCONF=YOUR_KERNEL_HERE KODIR=/boot/testkernel nextboot -k testkernel To just build a kernel when you know that it won't mess you up -------------------------------------------------------------- This assumes you are already running a 5.X system. Replace ${arch} with the architecture of your machine (e.g. "i386", "alpha", "amd64", "ia64", "pc98", "sparc64", etc). cd src/sys/${arch}/conf config KERNEL_NAME_HERE cd ../compile/KERNEL_NAME_HERE make depend make make install If this fails, go to the "To build a kernel" section. To rebuild everything and install it on the current system. ----------------------------------------------------------- # Note: sometimes if you are running current you gotta do more than # is listed here if you are upgrading from a really old current. make buildworld make kernel KERNCONF=YOUR_KERNEL_HERE [1] [3] mergemaster -p [5] make installworld make delete-old mergemaster [4] To cross-install current onto a separate partition -------------------------------------------------- # In this approach we use a separate partition to hold # current's root, 'usr', and 'var' directories. A partition # holding "/", "/usr" and "/var" should be about 2GB in # size. make buildworld make buildkernel KERNCONF=YOUR_KERNEL_HERE make installworld DESTDIR=${CURRENT_ROOT} make distribution DESTDIR=${CURRENT_ROOT} # if newfs'd make installkernel KERNCONF=YOUR_KERNEL_HERE DESTDIR=${CURRENT_ROOT} cp /etc/fstab ${CURRENT_ROOT}/etc/fstab # if newfs'd To upgrade in-place from 5.x-stable to current ---------------------------------------------- make buildworld [9] make kernel KERNCONF=YOUR_KERNEL_HERE [8] [1] [3] mergemaster -p [5] make installworld make delete-old mergemaster -i [4] Make sure that you've read the UPDATING file to understand the tweaks to various things you need. 
At this point in the life cycle of current, things change often and you are on your own to cope. The defaults can also change, so please read ALL of the UPDATING entries. Also, if you are tracking -current, you must be subscribed to freebsd-current@freebsd.org. Make sure that before you update your sources that you have read and understood all the recent messages there. If in doubt, please track -stable which has much fewer pitfalls. [1] If you have third party modules, such as vmware, you should disable them at this point so they don't crash your system on reboot. [3] From the bootblocks, boot -s, and then do fsck -p mount -u / mount -a cd src adjkerntz -i # if CMOS is wall time Also, when doing a major release upgrade, it is required that you boot into single user mode to do the installworld. [4] Note: This step is non-optional. Failure to do this step can result in a significant reduction in the functionality of the system. Attempting to do it by hand is not recommended and those that pursue this avenue should read this file carefully, as well as the archives of freebsd-current and freebsd-hackers mailing lists for potential gotchas. [5] Usually this step is a noop. However, from time to time you may need to do this if you get unknown user in the following step. It never hurts to do it all the time. You may need to install a new mergemaster (cd src/usr.sbin/mergemaster && make install) after the buildworld before this step if you last updated from current before 20020224 or from -stable before 20020408. [8] In order to have a kernel that can run the 4.x binaries needed to do an installworld, you must include the COMPAT_FREEBSD4 option in your kernel. Failure to do so may leave you with a system that is hard to boot to recover. A similar kernel option COMPAT_FREEBSD5 is required to run the 5.x binaries on more recent kernels. Make sure that you merge any new devices from GENERIC since the last time you updated your kernel config file. [9] When checking out sources, you must include the -P flag to have cvs prune empty directories. If CPUTYPE is defined in your /etc/make.conf, make sure to use the "?=" instead of the "=" assignment operator, so that buildworld can override the CPUTYPE if it needs to. MAKEOBJDIRPREFIX must be defined in an environment variable, and not on the command line, or in /etc/make.conf. buildworld will warn if it is improperly defined. FORMAT: This file contains a list, in reverse chronological order, of major breakages in tracking -current. Not all things will be listed here, and it only starts on October 16, 2004. Updating files can found in previous releases if your system is older than this. Copyright information: Copyright 1998-2005 M. Warner Losh. All Rights Reserved. Redistribution, publication, translation and use, with or without modification, in full or in part, in any form or format of this document are permitted without further permission from the author. THIS DOCUMENT IS PROVIDED BY WARNER LOSH ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL WARNER LOSH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. If you find this document useful, and you want to, you may buy the author a beer. Contact Warner Losh if you have any questions about your use of this document. $FreeBSD$ Index: head/contrib/bsnmp/snmp_mibII/mibII.c =================================================================== --- head/contrib/bsnmp/snmp_mibII/mibII.c (revision 186118) +++ head/contrib/bsnmp/snmp_mibII/mibII.c (revision 186119) @@ -1,1790 +1,1725 @@ /* * Copyright (c) 2001-2003 * Fraunhofer Institute for Open Communication Systems (FhG Fokus). * All rights reserved. * * Author: Harti Brandt * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $Begemot: mibII.c 516 2006-10-27 15:54:02Z brandt_h $ * * Implementation of the standard interfaces and ip MIB. 
*/ #include "mibII.h" #include "mibII_oid.h" #include #include /*****************************/ /* our module */ static struct lmodule *module; /* routing socket */ static int route; static void *route_fd; /* if-index allocator */ static uint32_t next_if_index = 1; -/* re-fetch arp table */ -static int update_arp; +/* currently fetching the arp table */ static int in_update_arp; /* OR registrations */ static u_int ifmib_reg; static u_int ipmib_reg; static u_int tcpmib_reg; static u_int udpmib_reg; static u_int ipForward_reg; /*****************************/ /* list of all IP addresses */ struct mibifa_list mibifa_list = TAILQ_HEAD_INITIALIZER(mibifa_list); /* list of all interfaces */ struct mibif_list mibif_list = TAILQ_HEAD_INITIALIZER(mibif_list); /* list of dynamic interface names */ struct mibdynif_list mibdynif_list = SLIST_HEAD_INITIALIZER(mibdynif_list); /* list of all interface index mappings */ struct mibindexmap_list mibindexmap_list = STAILQ_HEAD_INITIALIZER(mibindexmap_list); /* list of all stacking entries */ struct mibifstack_list mibifstack_list = TAILQ_HEAD_INITIALIZER(mibifstack_list); /* list of all receive addresses */ struct mibrcvaddr_list mibrcvaddr_list = TAILQ_HEAD_INITIALIZER(mibrcvaddr_list); /* list of all NetToMedia entries */ struct mibarp_list mibarp_list = TAILQ_HEAD_INITIALIZER(mibarp_list); /* number of interfaces */ int32_t mib_if_number; /* last change of table */ uint64_t mib_iftable_last_change; /* last change of stack table */ uint64_t mib_ifstack_last_change; /* if this is set, one of our lists may be bad. refresh them when idle */ int mib_iflist_bad; /* network socket */ int mib_netsock; /* last time refreshed */ uint64_t mibarpticks; /* info on system clocks */ struct clockinfo clockinfo; /* list of all New if registrations */ static struct newifreg_list newifreg_list = TAILQ_HEAD_INITIALIZER(newifreg_list); /* baud rate of fastest interface */ uint64_t mibif_maxspeed; /* user-forced update interval */ u_int mibif_force_hc_update_interval; /* current update interval */ u_int mibif_hc_update_interval; /* HC update timer handle */ static void *hc_update_timer; /*****************************/ static const struct asn_oid oid_ifMIB = OIDX_ifMIB; static const struct asn_oid oid_ipMIB = OIDX_ipMIB; static const struct asn_oid oid_tcpMIB = OIDX_tcpMIB; static const struct asn_oid oid_udpMIB = OIDX_udpMIB; static const struct asn_oid oid_ipForward = OIDX_ipForward; static const struct asn_oid oid_linkDown = OIDX_linkDown; static const struct asn_oid oid_linkUp = OIDX_linkUp; static const struct asn_oid oid_ifIndex = OIDX_ifIndex; /*****************************/ /* * Find an interface */ struct mibif * mib_find_if(u_int idx) { struct mibif *ifp; TAILQ_FOREACH(ifp, &mibif_list, link) if (ifp->index == idx) return (ifp); return (NULL); } struct mibif * mib_find_if_sys(u_int sysindex) { struct mibif *ifp; TAILQ_FOREACH(ifp, &mibif_list, link) if (ifp->sysindex == sysindex) return (ifp); return (NULL); } struct mibif * mib_find_if_name(const char *name) { struct mibif *ifp; TAILQ_FOREACH(ifp, &mibif_list, link) if (strcmp(ifp->name, name) == 0) return (ifp); return (NULL); } /* * Check whether an interface is dynamic. The argument may include the * unit number. This assumes, that the name part does NOT contain digits. 
*/ int mib_if_is_dyn(const char *name) { size_t len; struct mibdynif *d; for (len = 0; name[len] != '\0' && isalpha(name[len]) ; len++) ; SLIST_FOREACH(d, &mibdynif_list, link) if (strlen(d->name) == len && strncmp(d->name, name, len) == 0) return (1); return (0); } /* set an interface name to dynamic mode */ void mib_if_set_dyn(const char *name) { struct mibdynif *d; SLIST_FOREACH(d, &mibdynif_list, link) if (strcmp(name, d->name) == 0) return; if ((d = malloc(sizeof(*d))) == NULL) err(1, NULL); strcpy(d->name, name); SLIST_INSERT_HEAD(&mibdynif_list, d, link); } /* * register for interface creations */ int mib_register_newif(int (*func)(struct mibif *), const struct lmodule *mod) { struct newifreg *reg; TAILQ_FOREACH(reg, &newifreg_list, link) if (reg->mod == mod) { reg->func = func; return (0); } if ((reg = malloc(sizeof(*reg))) == NULL) { syslog(LOG_ERR, "newifreg: %m"); return (-1); } reg->mod = mod; reg->func = func; TAILQ_INSERT_TAIL(&newifreg_list, reg, link); return (0); } void mib_unregister_newif(const struct lmodule *mod) { struct newifreg *reg; TAILQ_FOREACH(reg, &newifreg_list, link) if (reg->mod == mod) { TAILQ_REMOVE(&newifreg_list, reg, link); free(reg); return; } } struct mibif * mib_first_if(void) { return (TAILQ_FIRST(&mibif_list)); } struct mibif * mib_next_if(const struct mibif *ifp) { return (TAILQ_NEXT(ifp, link)); } /* * Change the admin status of an interface */ int mib_if_admin(struct mibif *ifp, int up) { struct ifreq ifr; strncpy(ifr.ifr_name, ifp->name, sizeof(ifr.ifr_name)); if (ioctl(mib_netsock, SIOCGIFFLAGS, &ifr) == -1) { syslog(LOG_ERR, "SIOCGIFFLAGS(%s): %m", ifp->name); return (-1); } if (up) ifr.ifr_flags |= IFF_UP; else ifr.ifr_flags &= ~IFF_UP; if (ioctl(mib_netsock, SIOCSIFFLAGS, &ifr) == -1) { syslog(LOG_ERR, "SIOCSIFFLAGS(%s): %m", ifp->name); return (-1); } (void)mib_fetch_ifmib(ifp); return (0); } /* * Generate a link up/down trap */ static void link_trap(struct mibif *ifp, int up) { struct snmp_value ifindex; ifindex.var = oid_ifIndex; ifindex.var.subs[ifindex.var.len++] = ifp->index; ifindex.syntax = SNMP_SYNTAX_INTEGER; ifindex.v.integer = ifp->index; snmp_send_trap(up ? 
&oid_linkUp : &oid_linkDown, &ifindex, (struct snmp_value *)NULL); } /** * Fetch the GENERIC IFMIB and update the HC counters */ static int fetch_generic_mib(struct mibif *ifp, const struct ifmibdata *old) { int name[6]; size_t len; struct mibif_private *p = ifp->private; name[0] = CTL_NET; name[1] = PF_LINK; name[2] = NETLINK_GENERIC; name[3] = IFMIB_IFDATA; name[4] = ifp->sysindex; name[5] = IFDATA_GENERAL; len = sizeof(ifp->mib); if (sysctl(name, 6, &ifp->mib, &len, NULL, 0) == -1) { if (errno != ENOENT) syslog(LOG_WARNING, "sysctl(ifmib, %s) failed %m", ifp->name); return (-1); } /* * Assume that one of the two following compounds is optimized away */ if (ULONG_MAX >= 0xffffffffffffffffULL) { p->hc_inoctets = ifp->mib.ifmd_data.ifi_ibytes; p->hc_outoctets = ifp->mib.ifmd_data.ifi_obytes; p->hc_omcasts = ifp->mib.ifmd_data.ifi_omcasts; p->hc_opackets = ifp->mib.ifmd_data.ifi_opackets; p->hc_imcasts = ifp->mib.ifmd_data.ifi_imcasts; p->hc_ipackets = ifp->mib.ifmd_data.ifi_ipackets; } else if (ULONG_MAX >= 0xffffffff) { #define UPDATE(HC, MIB) \ if (old->ifmd_data.MIB > ifp->mib.ifmd_data.MIB) \ p->HC += (0x100000000ULL + \ ifp->mib.ifmd_data.MIB) - \ old->ifmd_data.MIB; \ else \ p->HC += ifp->mib.ifmd_data.MIB - \ old->ifmd_data.MIB; UPDATE(hc_inoctets, ifi_ibytes) UPDATE(hc_outoctets, ifi_obytes) UPDATE(hc_omcasts, ifi_omcasts) UPDATE(hc_opackets, ifi_opackets) UPDATE(hc_imcasts, ifi_imcasts) UPDATE(hc_ipackets, ifi_ipackets) #undef UPDATE } else abort(); return (0); } /** * Update the 64-bit interface counters */ static void update_hc_counters(void *arg __unused) { struct mibif *ifp; struct ifmibdata oldmib; TAILQ_FOREACH(ifp, &mibif_list, link) { oldmib = ifp->mib; (void)fetch_generic_mib(ifp, &oldmib); } } /** * Recompute the poll timer for the HC counters */ void mibif_reset_hc_timer(void) { u_int ticks; if ((ticks = mibif_force_hc_update_interval) == 0) { if (mibif_maxspeed <= IF_Mbps(10)) { /* at 10Mbps overflow needs 3436 seconds */ ticks = 3000 * 100; /* 50 minutes */ } else if (mibif_maxspeed <= IF_Mbps(100)) { /* at 100Mbps overflow needs 343 seconds */ ticks = 300 * 100; /* 5 minutes */ } else if (mibif_maxspeed < IF_Mbps(622)) { /* at 622Mbps overflow needs 53 seconds */ ticks = 40 * 100; /* 40 seconds */ } else if (mibif_maxspeed <= IF_Mbps(1000)) { /* at 1Gbps overflow needs 34 seconds */ ticks = 20 * 100; /* 20 seconds */ } else { /* at 10Gbps overflow needs 3.4 seconds */ ticks = 100; /* 1 seconds */ } } if (ticks == mibif_hc_update_interval) return; if (hc_update_timer != NULL) { timer_stop(hc_update_timer); hc_update_timer = NULL; } update_hc_counters(NULL); if ((hc_update_timer = timer_start_repeat(ticks * 10, ticks * 10, update_hc_counters, NULL, module)) == NULL) { syslog(LOG_ERR, "timer_start(%u): %m", ticks); return; } mibif_hc_update_interval = ticks; } /* * Fetch new MIB data. */ int mib_fetch_ifmib(struct mibif *ifp) { int name[6]; size_t len; void *newmib; struct ifmibdata oldmib = ifp->mib; if (fetch_generic_mib(ifp, &oldmib) == -1) return (-1); /* * Quoting RFC2863, 3.1.15: "... LinkUp and linkDown traps are * generated just after ifOperStatus leaves, or just before it * enters, the down state, respectively;" */ if (ifp->trap_enable && ifp->mib.ifmd_data.ifi_link_state != oldmib.ifmd_data.ifi_link_state && (ifp->mib.ifmd_data.ifi_link_state == LINK_STATE_DOWN || oldmib.ifmd_data.ifi_link_state == LINK_STATE_DOWN)) link_trap(ifp, ifp->mib.ifmd_data.ifi_link_state == LINK_STATE_UP ? 
1 : 0); ifp->flags &= ~(MIBIF_HIGHSPEED | MIBIF_VERYHIGHSPEED); if (ifp->mib.ifmd_data.ifi_baudrate > 20000000) { ifp->flags |= MIBIF_HIGHSPEED; if (ifp->mib.ifmd_data.ifi_baudrate > 650000000) ifp->flags |= MIBIF_VERYHIGHSPEED; } if (ifp->mib.ifmd_data.ifi_baudrate > mibif_maxspeed) { mibif_maxspeed = ifp->mib.ifmd_data.ifi_baudrate; mibif_reset_hc_timer(); } /* * linkspecific MIB */ name[0] = CTL_NET; name[1] = PF_LINK; name[2] = NETLINK_GENERIC; name[3] = IFMIB_IFDATA; name[4] = ifp->sysindex; name[5] = IFDATA_LINKSPECIFIC; if (sysctl(name, 6, NULL, &len, NULL, 0) == -1) { syslog(LOG_WARNING, "sysctl linkmib estimate (%s): %m", ifp->name); if (ifp->specmib != NULL) { ifp->specmib = NULL; ifp->specmiblen = 0; } goto out; } if (len == 0) { if (ifp->specmib != NULL) { ifp->specmib = NULL; ifp->specmiblen = 0; } goto out; } if (ifp->specmiblen != len) { if ((newmib = realloc(ifp->specmib, len)) == NULL) { ifp->specmib = NULL; ifp->specmiblen = 0; goto out; } ifp->specmib = newmib; ifp->specmiblen = len; } if (sysctl(name, 6, ifp->specmib, &len, NULL, 0) == -1) { syslog(LOG_WARNING, "sysctl linkmib (%s): %m", ifp->name); if (ifp->specmib != NULL) { ifp->specmib = NULL; ifp->specmiblen = 0; } } out: ifp->mibtick = get_ticks(); return (0); } /* find first/next address for a given interface */ struct mibifa * mib_first_ififa(const struct mibif *ifp) { struct mibifa *ifa; TAILQ_FOREACH(ifa, &mibifa_list, link) if (ifp->index == ifa->ifindex) return (ifa); return (NULL); } struct mibifa * mib_next_ififa(struct mibifa *ifa0) { struct mibifa *ifa; ifa = ifa0; while ((ifa = TAILQ_NEXT(ifa, link)) != NULL) if (ifa->ifindex == ifa0->ifindex) return (ifa); return (NULL); } /* * Allocate a new IFA */ static struct mibifa * alloc_ifa(u_int ifindex, struct in_addr addr) { struct mibifa *ifa; uint32_t ha; if ((ifa = malloc(sizeof(struct mibifa))) == NULL) { syslog(LOG_ERR, "ifa: %m"); return (NULL); } ifa->inaddr = addr; ifa->ifindex = ifindex; ha = ntohl(ifa->inaddr.s_addr); ifa->index.len = 4; ifa->index.subs[0] = (ha >> 24) & 0xff; ifa->index.subs[1] = (ha >> 16) & 0xff; ifa->index.subs[2] = (ha >> 8) & 0xff; ifa->index.subs[3] = (ha >> 0) & 0xff; ifa->flags = 0; ifa->inbcast.s_addr = 0; ifa->inmask.s_addr = 0xffffffff; INSERT_OBJECT_OID(ifa, &mibifa_list); return (ifa); } /* * Delete an interface address */ static void destroy_ifa(struct mibifa *ifa) { TAILQ_REMOVE(&mibifa_list, ifa, link); free(ifa); } /* * Helper routine to extract the sockaddr structures from a routing * socket message. */ void mib_extract_addrs(int addrs, u_char *info, struct sockaddr **out) { u_int i; for (i = 0; i < RTAX_MAX; i++) { if ((addrs & (1 << i)) != 0) { *out = (struct sockaddr *)(void *)info; info += roundup((*out)->sa_len, sizeof(long)); } else *out = NULL; out++; } } /* * save the phys address of an interface. Handle receive address entries here. 
*/ static void get_physaddr(struct mibif *ifp, struct sockaddr_dl *sdl, u_char *ptr) { u_char *np; struct mibrcvaddr *rcv; if (sdl->sdl_alen == 0) { /* no address */ if (ifp->physaddrlen != 0) { if ((rcv = mib_find_rcvaddr(ifp->index, ifp->physaddr, ifp->physaddrlen)) != NULL) mib_rcvaddr_delete(rcv); free(ifp->physaddr); ifp->physaddr = NULL; ifp->physaddrlen = 0; } return; } if (ifp->physaddrlen != sdl->sdl_alen) { /* length changed */ if (ifp->physaddrlen) { /* delete olf receive address */ if ((rcv = mib_find_rcvaddr(ifp->index, ifp->physaddr, ifp->physaddrlen)) != NULL) mib_rcvaddr_delete(rcv); } if ((np = realloc(ifp->physaddr, sdl->sdl_alen)) == NULL) { free(ifp->physaddr); ifp->physaddr = NULL; ifp->physaddrlen = 0; return; } ifp->physaddr = np; ifp->physaddrlen = sdl->sdl_alen; } else if (memcmp(ifp->physaddr, ptr, ifp->physaddrlen) == 0) { /* no change */ return; } else { /* address changed */ /* delete olf receive address */ if ((rcv = mib_find_rcvaddr(ifp->index, ifp->physaddr, ifp->physaddrlen)) != NULL) mib_rcvaddr_delete(rcv); } memcpy(ifp->physaddr, ptr, ifp->physaddrlen); /* make new receive address */ if ((rcv = mib_rcvaddr_create(ifp, ifp->physaddr, ifp->physaddrlen)) != NULL) rcv->flags |= MIBRCVADDR_HW; } /* * Free an interface */ static void mibif_free(struct mibif *ifp) { struct mibif *ifp1; struct mibindexmap *map; struct mibifa *ifa, *ifa1; struct mibrcvaddr *rcv, *rcv1; struct mibarp *at, *at1; if (ifp->xnotify != NULL) (*ifp->xnotify)(ifp, MIBIF_NOTIFY_DESTROY, ifp->xnotify_data); (void)mib_ifstack_delete(ifp, NULL); (void)mib_ifstack_delete(NULL, ifp); TAILQ_REMOVE(&mibif_list, ifp, link); /* if this was the fastest interface - recompute this */ if (ifp->mib.ifmd_data.ifi_baudrate == mibif_maxspeed) { mibif_maxspeed = ifp->mib.ifmd_data.ifi_baudrate; TAILQ_FOREACH(ifp1, &mibif_list, link) if (ifp1->mib.ifmd_data.ifi_baudrate > mibif_maxspeed) mibif_maxspeed = ifp1->mib.ifmd_data.ifi_baudrate; mibif_reset_hc_timer(); } free(ifp->private); if (ifp->physaddr != NULL) free(ifp->physaddr); if (ifp->specmib != NULL) free(ifp->specmib); STAILQ_FOREACH(map, &mibindexmap_list, link) if (map->mibif == ifp) { map->mibif = NULL; break; } /* purge interface addresses */ ifa = TAILQ_FIRST(&mibifa_list); while (ifa != NULL) { ifa1 = TAILQ_NEXT(ifa, link); if (ifa->ifindex == ifp->index) destroy_ifa(ifa); ifa = ifa1; } /* purge receive addresses */ rcv = TAILQ_FIRST(&mibrcvaddr_list); while (rcv != NULL) { rcv1 = TAILQ_NEXT(rcv, link); if (rcv->ifindex == ifp->index) mib_rcvaddr_delete(rcv); rcv = rcv1; } /* purge ARP entries */ at = TAILQ_FIRST(&mibarp_list); while (at != NULL) { at1 = TAILQ_NEXT(at, link); if (at->index.subs[0] == ifp->index) mib_arp_delete(at); at = at1; } free(ifp); mib_if_number--; mib_iftable_last_change = this_tick; } /* * Create a new interface */ static struct mibif * mibif_create(u_int sysindex, const char *name) { struct mibif *ifp; struct mibindexmap *map; if ((ifp = malloc(sizeof(*ifp))) == NULL) { syslog(LOG_WARNING, "%s: %m", __func__); return (NULL); } memset(ifp, 0, sizeof(*ifp)); if ((ifp->private = malloc(sizeof(struct mibif_private))) == NULL) { syslog(LOG_WARNING, "%s: %m", __func__); free(ifp); return (NULL); } memset(ifp->private, 0, sizeof(struct mibif_private)); ifp->sysindex = sysindex; strcpy(ifp->name, name); strcpy(ifp->descr, name); ifp->spec_oid = oid_zeroDotZero; map = NULL; if (!mib_if_is_dyn(ifp->name)) { /* non-dynamic. 
look whether we know the interface */ STAILQ_FOREACH(map, &mibindexmap_list, link) if (strcmp(map->name, ifp->name) == 0) { ifp->index = map->ifindex; map->mibif = ifp; break; } /* assume it has a connector if it is not dynamic */ ifp->has_connector = 1; ifp->trap_enable = 1; } if (map == NULL) { /* new interface - get new index */ if (next_if_index > 0x7fffffff) errx(1, "ifindex wrap"); if ((map = malloc(sizeof(*map))) == NULL) { syslog(LOG_ERR, "ifmap: %m"); free(ifp); return (NULL); } map->ifindex = next_if_index++; map->sysindex = ifp->sysindex; strcpy(map->name, ifp->name); map->mibif = ifp; STAILQ_INSERT_TAIL(&mibindexmap_list, map, link); } else { /* re-instantiate. Introduce a counter discontinuity */ ifp->counter_disc = get_ticks(); } ifp->index = map->ifindex; ifp->mib.ifmd_data.ifi_link_state = LINK_STATE_UNKNOWN; INSERT_OBJECT_INT(ifp, &mibif_list); mib_if_number++; mib_iftable_last_change = this_tick; /* instantiate default ifStack entries */ (void)mib_ifstack_create(ifp, NULL); (void)mib_ifstack_create(NULL, ifp); return (ifp); } /* * Inform all interested parties about a new interface */ static void notify_newif(struct mibif *ifp) { struct newifreg *reg; TAILQ_FOREACH(reg, &newifreg_list, link) if ((*reg->func)(ifp)) return; } /* * This is called for new interfaces after we have fetched the interface * MIB. If this is a broadcast interface try to guess the broadcast address * depending on the interface type. */ static void check_llbcast(struct mibif *ifp) { static u_char ether_bcast[6] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; static u_char arcnet_bcast = 0; struct mibrcvaddr *rcv; if (!(ifp->mib.ifmd_flags & IFF_BROADCAST)) return; switch (ifp->mib.ifmd_data.ifi_type) { case IFT_ETHER: case IFT_FDDI: case IFT_ISO88025: if (mib_find_rcvaddr(ifp->index, ether_bcast, 6) == NULL && (rcv = mib_rcvaddr_create(ifp, ether_bcast, 6)) != NULL) rcv->flags |= MIBRCVADDR_BCAST; break; case IFT_ARCNET: if (mib_find_rcvaddr(ifp->index, &arcnet_bcast, 1) == NULL && (rcv = mib_rcvaddr_create(ifp, &arcnet_bcast, 1)) != NULL) rcv->flags |= MIBRCVADDR_BCAST; break; } } /* * Retrieve the current interface list from the system. 
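
mib_refresh_iflist() below enumerates interfaces through the kernel's interface MIB rather than getifaddrs(): it reads net.link.generic.system.ifcount and then probes each system index under IFMIB_IFDATA/IFDATA_GENERAL, tolerating ENOENT holes left by detached interfaces. A stand-alone sketch of that enumeration (illustrative only, error handling trimmed):

	#include <sys/types.h>
	#include <sys/socket.h>
	#include <sys/sysctl.h>
	#include <net/if.h>
	#include <net/if_mib.h>
	#include <stdio.h>

	/* Illustrative helper: list the kernel's interfaces by system index. */
	static void
	list_interfaces(void)
	{
		struct ifmibdata mib;
		size_t len;
		int name[6], count, idx;

		len = sizeof(count);
		if (sysctlbyname("net.link.generic.system.ifcount", &count,
		    &len, NULL, 0) == -1)
			return;

		name[0] = CTL_NET;
		name[1] = PF_LINK;
		name[2] = NETLINK_GENERIC;
		name[3] = IFMIB_IFDATA;
		name[5] = IFDATA_GENERAL;
		for (idx = 1; idx <= count; idx++) {
			name[4] = idx;
			len = sizeof(mib);
			if (sysctl(name, 6, &mib, &len, NULL, 0) == -1)
				continue;	/* ENOENT: detached interface */
			printf("%d: %s\n", idx, mib.ifmd_name);
		}
	}
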
*/ void mib_refresh_iflist(void) { struct mibif *ifp, *ifp1; size_t len; u_short idx; int name[6]; int count; struct ifmibdata mib; TAILQ_FOREACH(ifp, &mibif_list, link) ifp->flags &= ~MIBIF_FOUND; len = sizeof(count); if (sysctlbyname("net.link.generic.system.ifcount", &count, &len, NULL, 0) == -1) { syslog(LOG_ERR, "ifcount: %m"); return; } name[0] = CTL_NET; name[1] = PF_LINK; name[2] = NETLINK_GENERIC; name[3] = IFMIB_IFDATA; name[5] = IFDATA_GENERAL; for (idx = 1; idx <= count; idx++) { name[4] = idx; len = sizeof(mib); if (sysctl(name, 6, &mib, &len, NULL, 0) == -1) { if (errno == ENOENT) continue; syslog(LOG_ERR, "ifmib(%u): %m", idx); return; } if ((ifp = mib_find_if_sys(idx)) != NULL) { ifp->flags |= MIBIF_FOUND; continue; } /* Unknown interface - create */ if ((ifp = mibif_create(idx, mib.ifmd_name)) != NULL) { ifp->flags |= MIBIF_FOUND; (void)mib_fetch_ifmib(ifp); check_llbcast(ifp); notify_newif(ifp); } } /* * Purge interfaces that disappeared */ ifp = TAILQ_FIRST(&mibif_list); while (ifp != NULL) { ifp1 = TAILQ_NEXT(ifp, link); if (!(ifp->flags & MIBIF_FOUND)) mibif_free(ifp); ifp = ifp1; } } /* * Find an interface address */ struct mibifa * mib_find_ifa(struct in_addr addr) { struct mibifa *ifa; TAILQ_FOREACH(ifa, &mibifa_list, link) if (ifa->inaddr.s_addr == addr.s_addr) return (ifa); return (NULL); } /* - * Process a new ARP entry - */ -static void -process_arp(const struct rt_msghdr *rtm, const struct sockaddr_dl *sdl, - const struct sockaddr_in *sa) -{ - struct mibif *ifp; - struct mibarp *at; - - /* IP arp table entry */ - if (sdl->sdl_alen == 0) { - update_arp = 1; - return; - } - if ((ifp = mib_find_if_sys(sdl->sdl_index)) == NULL) - return; - /* have a valid entry */ - if ((at = mib_find_arp(ifp, sa->sin_addr)) == NULL && - (at = mib_arp_create(ifp, sa->sin_addr, - sdl->sdl_data + sdl->sdl_nlen, sdl->sdl_alen)) == NULL) - return; - - if (rtm->rtm_rmx.rmx_expire == 0) - at->flags |= MIBARP_PERM; - else - at->flags &= ~MIBARP_PERM; - at->flags |= MIBARP_FOUND; -} - -/* * Handle a routing socket message. 
*/ static void handle_rtmsg(struct rt_msghdr *rtm) { struct sockaddr *addrs[RTAX_MAX]; struct if_msghdr *ifm; struct ifa_msghdr *ifam; struct ifma_msghdr *ifmam; #ifdef RTM_IFANNOUNCE struct if_announcemsghdr *ifan; #endif struct mibif *ifp; struct sockaddr_dl *sdl; struct sockaddr_in *sa; struct mibifa *ifa; struct mibrcvaddr *rcv; u_char *ptr; if (rtm->rtm_version != RTM_VERSION) { syslog(LOG_ERR, "Bogus RTM version %u", rtm->rtm_version); return; } switch (rtm->rtm_type) { case RTM_NEWADDR: ifam = (struct ifa_msghdr *)rtm; mib_extract_addrs(ifam->ifam_addrs, (u_char *)(ifam + 1), addrs); if (addrs[RTAX_IFA] == NULL || addrs[RTAX_NETMASK] == NULL) break; sa = (struct sockaddr_in *)(void *)addrs[RTAX_IFA]; if ((ifa = mib_find_ifa(sa->sin_addr)) == NULL) { /* unknown address */ if ((ifp = mib_find_if_sys(ifam->ifam_index)) == NULL) { syslog(LOG_WARNING, "RTM_NEWADDR for unknown " "interface %u", ifam->ifam_index); break; } if ((ifa = alloc_ifa(ifp->index, sa->sin_addr)) == NULL) break; } sa = (struct sockaddr_in *)(void *)addrs[RTAX_NETMASK]; ifa->inmask = sa->sin_addr; if (addrs[RTAX_BRD] != NULL) { sa = (struct sockaddr_in *)(void *)addrs[RTAX_BRD]; ifa->inbcast = sa->sin_addr; } ifa->flags |= MIBIFA_FOUND; break; case RTM_DELADDR: ifam = (struct ifa_msghdr *)rtm; mib_extract_addrs(ifam->ifam_addrs, (u_char *)(ifam + 1), addrs); if (addrs[RTAX_IFA] == NULL) break; sa = (struct sockaddr_in *)(void *)addrs[RTAX_IFA]; if ((ifa = mib_find_ifa(sa->sin_addr)) != NULL) { ifa->flags |= MIBIFA_FOUND; if (!(ifa->flags & MIBIFA_DESTROYED)) destroy_ifa(ifa); } break; case RTM_NEWMADDR: ifmam = (struct ifma_msghdr *)rtm; mib_extract_addrs(ifmam->ifmam_addrs, (u_char *)(ifmam + 1), addrs); if (addrs[RTAX_IFA] == NULL || addrs[RTAX_IFA]->sa_family != AF_LINK) break; sdl = (struct sockaddr_dl *)(void *)addrs[RTAX_IFA]; if ((rcv = mib_find_rcvaddr(sdl->sdl_index, sdl->sdl_data + sdl->sdl_nlen, sdl->sdl_alen)) == NULL) { /* unknown address */ if ((ifp = mib_find_if_sys(sdl->sdl_index)) == NULL) { syslog(LOG_WARNING, "RTM_NEWMADDR for unknown " "interface %u", sdl->sdl_index); break; } if ((rcv = mib_rcvaddr_create(ifp, sdl->sdl_data + sdl->sdl_nlen, sdl->sdl_alen)) == NULL) break; rcv->flags |= MIBRCVADDR_VOLATILE; } rcv->flags |= MIBRCVADDR_FOUND; break; case RTM_DELMADDR: ifmam = (struct ifma_msghdr *)rtm; mib_extract_addrs(ifmam->ifmam_addrs, (u_char *)(ifmam + 1), addrs); if (addrs[RTAX_IFA] == NULL || addrs[RTAX_IFA]->sa_family != AF_LINK) break; sdl = (struct sockaddr_dl *)(void *)addrs[RTAX_IFA]; if ((rcv = mib_find_rcvaddr(sdl->sdl_index, sdl->sdl_data + sdl->sdl_nlen, sdl->sdl_alen)) != NULL) mib_rcvaddr_delete(rcv); break; case RTM_IFINFO: ifm = (struct if_msghdr *)rtm; mib_extract_addrs(ifm->ifm_addrs, (u_char *)(ifm + 1), addrs); if ((ifp = mib_find_if_sys(ifm->ifm_index)) == NULL) break; if (addrs[RTAX_IFP] != NULL && addrs[RTAX_IFP]->sa_family == AF_LINK) { sdl = (struct sockaddr_dl *)(void *)addrs[RTAX_IFP]; ptr = sdl->sdl_data + sdl->sdl_nlen; get_physaddr(ifp, sdl, ptr); } (void)mib_fetch_ifmib(ifp); break; #ifdef RTM_IFANNOUNCE case RTM_IFANNOUNCE: ifan = (struct if_announcemsghdr *)rtm; ifp = mib_find_if_sys(ifan->ifan_index); switch (ifan->ifan_what) { case IFAN_ARRIVAL: if (ifp == NULL && (ifp = mibif_create(ifan->ifan_index, ifan->ifan_name)) != NULL) { (void)mib_fetch_ifmib(ifp); check_llbcast(ifp); notify_newif(ifp); } break; case IFAN_DEPARTURE: if (ifp != NULL) mibif_free(ifp); break; } break; #endif - case RTM_GET: - mib_extract_addrs(rtm->rtm_addrs, (u_char *)(rtm + 1), addrs); 
- if (rtm->rtm_flags & RTF_LLINFO) { - if (addrs[RTAX_DST] == NULL || - addrs[RTAX_GATEWAY] == NULL || - addrs[RTAX_DST]->sa_family != AF_INET || - addrs[RTAX_GATEWAY]->sa_family != AF_LINK) - break; - process_arp(rtm, - (struct sockaddr_dl *)(void *)addrs[RTAX_GATEWAY], - (struct sockaddr_in *)(void *)addrs[RTAX_DST]); - } else { - if (rtm->rtm_errno == 0 && (rtm->rtm_flags & RTF_UP)) - mib_sroute_process(rtm, addrs[RTAX_GATEWAY], - addrs[RTAX_DST], addrs[RTAX_NETMASK]); - } - break; - case RTM_ADD: - mib_extract_addrs(rtm->rtm_addrs, (u_char *)(rtm + 1), addrs); - if (rtm->rtm_flags & RTF_LLINFO) { - if (addrs[RTAX_DST] == NULL || - addrs[RTAX_GATEWAY] == NULL || - addrs[RTAX_DST]->sa_family != AF_INET || - addrs[RTAX_GATEWAY]->sa_family != AF_LINK) - break; - process_arp(rtm, - (struct sockaddr_dl *)(void *)addrs[RTAX_GATEWAY], - (struct sockaddr_in *)(void *)addrs[RTAX_DST]); - } else { - if (rtm->rtm_errno == 0 && (rtm->rtm_flags & RTF_UP)) - mib_sroute_process(rtm, addrs[RTAX_GATEWAY], - addrs[RTAX_DST], addrs[RTAX_NETMASK]); - } - break; - case RTM_DELETE: mib_extract_addrs(rtm->rtm_addrs, (u_char *)(rtm + 1), addrs); - if (rtm->rtm_errno == 0 && !(rtm->rtm_flags & RTF_LLINFO)) + + if (rtm->rtm_errno == 0 && (rtm->rtm_flags & RTF_UP)) mib_sroute_process(rtm, addrs[RTAX_GATEWAY], addrs[RTAX_DST], addrs[RTAX_NETMASK]); break; } } /* * send a routing message */ void mib_send_rtmsg(struct rt_msghdr *rtm, struct sockaddr *gw, struct sockaddr *dst, struct sockaddr *mask) { size_t len; struct rt_msghdr *msg; char *cp; ssize_t sent; len = sizeof(*rtm) + SA_SIZE(gw) + SA_SIZE(dst) + SA_SIZE(mask); if ((msg = malloc(len)) == NULL) { syslog(LOG_ERR, "%s: %m", __func__); return; } cp = (char *)(msg + 1); memset(msg, 0, sizeof(*msg)); msg->rtm_flags = 0; msg->rtm_version = RTM_VERSION; msg->rtm_addrs = RTA_DST | RTA_GATEWAY; memcpy(cp, dst, SA_SIZE(dst)); cp += SA_SIZE(dst); memcpy(cp, gw, SA_SIZE(gw)); cp += SA_SIZE(gw); if (mask != NULL) { memcpy(cp, mask, SA_SIZE(mask)); cp += SA_SIZE(mask); msg->rtm_addrs |= RTA_NETMASK; } msg->rtm_msglen = cp - (char *)msg; msg->rtm_type = RTM_GET; if ((sent = write(route, msg, msg->rtm_msglen)) == -1) { syslog(LOG_ERR, "%s: write: %m", __func__); free(msg); return; } if (sent != msg->rtm_msglen) { syslog(LOG_ERR, "%s: short write", __func__); free(msg); return; } free(msg); } /* * Fetch the routing table via sysctl */ u_char * mib_fetch_rtab(int af, int info, int arg, size_t *lenp) { int name[6]; u_char *buf, *newbuf; name[0] = CTL_NET; name[1] = PF_ROUTE; name[2] = 0; name[3] = af; name[4] = info; name[5] = arg; *lenp = 0; /* initial estimate */ if (sysctl(name, 6, NULL, lenp, NULL, 0) == -1) { syslog(LOG_ERR, "sysctl estimate (%d,%d,%d,%d,%d,%d): %m", name[0], name[1], name[2], name[3], name[4], name[5]); return (NULL); } if (*lenp == 0) return (NULL); buf = NULL; for (;;) { if ((newbuf = realloc(buf, *lenp)) == NULL) { syslog(LOG_ERR, "sysctl buffer: %m"); free(buf); return (NULL); } buf = newbuf; if (sysctl(name, 6, buf, lenp, NULL, 0) == 0) break; if (errno != ENOMEM) { syslog(LOG_ERR, "sysctl get: %m"); free(buf); return (NULL); } *lenp += *lenp / 8 + 1; } return (buf); } /* * Update the following info: interface, interface addresses, interface * receive addresses, arp-table. * This does not change the interface list itself. 
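
mib_fetch_rtab() above wraps the PF_ROUTE sysctl: it asks the kernel for a size estimate, allocates, and retries with a grown buffer on ENOMEM because the table can grow between the two calls. A trimmed sketch of the NET_RT_DUMP fetch and the rt_msghdr walk performed by mib_fetch_route() in mibII_route.c (hypothetical callback-based helper, without the retry loop):

	#include <sys/types.h>
	#include <sys/socket.h>
	#include <sys/sysctl.h>
	#include <net/route.h>
	#include <stdlib.h>

	/* Illustrative helper: dump the IPv4 routing table and visit each
	 * routing message with the supplied callback. */
	static void
	walk_route_dump(void (*cb)(struct rt_msghdr *))
	{
		int name[6] = { CTL_NET, PF_ROUTE, 0, AF_INET, NET_RT_DUMP, 0 };
		size_t len = 0;
		u_char *buf, *next;
		struct rt_msghdr *rtm;

		if (sysctl(name, 6, NULL, &len, NULL, 0) == -1 || len == 0)
			return;
		if ((buf = malloc(len)) == NULL)
			return;
		if (sysctl(name, 6, buf, &len, NULL, 0) == -1) {
			free(buf);
			return;
		}
		for (next = buf; next < buf + len; next += rtm->rtm_msglen) {
			rtm = (struct rt_msghdr *)(void *)next;
			cb(rtm);
		}
		free(buf);
	}
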
*/ static void update_ifa_info(void) { u_char *buf, *next; struct rt_msghdr *rtm; struct mibifa *ifa, *ifa1; struct mibrcvaddr *rcv, *rcv1; size_t needed; static const int infos[][3] = { { 0, NET_RT_IFLIST, 0 }, #ifdef NET_RT_IFMALIST { AF_LINK, NET_RT_IFMALIST, 0 }, #endif }; u_int i; TAILQ_FOREACH(ifa, &mibifa_list, link) ifa->flags &= ~MIBIFA_FOUND; TAILQ_FOREACH(rcv, &mibrcvaddr_list, link) rcv->flags &= ~MIBRCVADDR_FOUND; for (i = 0; i < sizeof(infos) / sizeof(infos[0]); i++) { if ((buf = mib_fetch_rtab(infos[i][0], infos[i][1], infos[i][2], &needed)) == NULL) continue; next = buf; while (next < buf + needed) { rtm = (struct rt_msghdr *)(void *)next; next += rtm->rtm_msglen; handle_rtmsg(rtm); } free(buf); } /* * Purge the address list of unused entries. These may happen for * interface aliases that are on the same subnet. We don't receive * routing socket messages for them. */ ifa = TAILQ_FIRST(&mibifa_list); while (ifa != NULL) { ifa1 = TAILQ_NEXT(ifa, link); if (!(ifa->flags & MIBIFA_FOUND)) destroy_ifa(ifa); ifa = ifa1; } rcv = TAILQ_FIRST(&mibrcvaddr_list); while (rcv != NULL) { rcv1 = TAILQ_NEXT(rcv, link); if (!(rcv->flags & (MIBRCVADDR_FOUND | MIBRCVADDR_BCAST | MIBRCVADDR_HW))) mib_rcvaddr_delete(rcv); rcv = rcv1; } } /* * Update arp table - */ + * +*/ void mib_arp_update(void) { struct mibarp *at, *at1; size_t needed; u_char *buf, *next; struct rt_msghdr *rtm; if (in_update_arp) return; /* Aaargh */ in_update_arp = 1; TAILQ_FOREACH(at, &mibarp_list, link) at->flags &= ~MIBARP_FOUND; - if ((buf = mib_fetch_rtab(AF_INET, NET_RT_FLAGS, RTF_LLINFO, &needed)) == NULL) { + if ((buf = mib_fetch_rtab(AF_INET, NET_RT_FLAGS, 0, &needed)) == NULL) { in_update_arp = 0; return; } - + next = buf; while (next < buf + needed) { rtm = (struct rt_msghdr *)(void *)next; next += rtm->rtm_msglen; handle_rtmsg(rtm); } free(buf); at = TAILQ_FIRST(&mibarp_list); while (at != NULL) { at1 = TAILQ_NEXT(at, link); if (!(at->flags & MIBARP_FOUND)) mib_arp_delete(at); at = at1; } mibarpticks = get_ticks(); - update_arp = 0; in_update_arp = 0; } /* * Intput on the routing socket. 
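
The mib_arp_update() hunk above is the module's side of the arp-v2 change made by this commit: RTF_LLINFO no longer exists, so the ARP table is requested from NET_RT_FLAGS with a zero flags argument. Code that has to build against both old and new kernels can guard on the macro, as the ipsend/44arp.c hunk later in this commit does; a hypothetical helper built on the module's mib_fetch_rtab():

	#include <sys/types.h>
	#include <sys/socket.h>

	#include "mibII.h"	/* assumed to declare mib_fetch_rtab() */

	/* Illustrative helper: fetch the link-layer/ARP table on either
	 * side of the arp-v2 transition. */
	static u_char *
	fetch_arp_table(size_t *lenp)
	{
	#ifdef RTF_LLINFO
		return (mib_fetch_rtab(AF_INET, NET_RT_FLAGS, RTF_LLINFO, lenp));
	#else
		return (mib_fetch_rtab(AF_INET, NET_RT_FLAGS, 0, lenp));
	#endif
	}
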
*/ static void route_input(int fd, void *udata __unused) { u_char buf[1024 * 16]; ssize_t n; struct rt_msghdr *rtm; if ((n = read(fd, buf, sizeof(buf))) == -1) err(1, "read(rt_socket)"); if (n == 0) errx(1, "EOF on rt_socket"); rtm = (struct rt_msghdr *)(void *)buf; if ((size_t)n != rtm->rtm_msglen) errx(1, "n=%zu, rtm_msglen=%u", (size_t)n, rtm->rtm_msglen); handle_rtmsg(rtm); } /* * execute and SIOCAIFADDR */ static int siocaifaddr(char *ifname, struct in_addr addr, struct in_addr mask, struct in_addr bcast) { struct ifaliasreq addreq; struct sockaddr_in *sa; memset(&addreq, 0, sizeof(addreq)); strncpy(addreq.ifra_name, ifname, sizeof(addreq.ifra_name)); sa = (struct sockaddr_in *)(void *)&addreq.ifra_addr; sa->sin_family = AF_INET; sa->sin_len = sizeof(*sa); sa->sin_addr = addr; sa = (struct sockaddr_in *)(void *)&addreq.ifra_mask; sa->sin_family = AF_INET; sa->sin_len = sizeof(*sa); sa->sin_addr = mask; sa = (struct sockaddr_in *)(void *)&addreq.ifra_broadaddr; sa->sin_family = AF_INET; sa->sin_len = sizeof(*sa); sa->sin_addr = bcast; return (ioctl(mib_netsock, SIOCAIFADDR, &addreq)); } /* * Exececute a SIOCDIFADDR */ static int siocdifaddr(const char *ifname, struct in_addr addr) { struct ifreq delreq; struct sockaddr_in *sa; memset(&delreq, 0, sizeof(delreq)); strncpy(delreq.ifr_name, ifname, sizeof(delreq.ifr_name)); sa = (struct sockaddr_in *)(void *)&delreq.ifr_addr; sa->sin_family = AF_INET; sa->sin_len = sizeof(*sa); sa->sin_addr = addr; return (ioctl(mib_netsock, SIOCDIFADDR, &delreq)); } /* * Verify an interface address without fetching the entire list */ static int verify_ifa(const char *name, struct mibifa *ifa) { struct ifreq req; struct sockaddr_in *sa; memset(&req, 0, sizeof(req)); strncpy(req.ifr_name, name, sizeof(req.ifr_name)); sa = (struct sockaddr_in *)(void *)&req.ifr_addr; sa->sin_family = AF_INET; sa->sin_len = sizeof(*sa); sa->sin_addr = ifa->inaddr; if (ioctl(mib_netsock, SIOCGIFADDR, &req) == -1) return (-1); if (ifa->inaddr.s_addr != sa->sin_addr.s_addr) { syslog(LOG_ERR, "%s: address mismatch", __func__); return (-1); } if (ioctl(mib_netsock, SIOCGIFNETMASK, &req) == -1) return (-1); if (ifa->inmask.s_addr != sa->sin_addr.s_addr) { syslog(LOG_ERR, "%s: netmask mismatch", __func__); return (-1); } return (0); } /* * Restore a deleted interface address. Don't wait for the routing socket * to update us. */ void mib_undestroy_ifa(struct mibifa *ifa) { struct mibif *ifp; if ((ifp = mib_find_if(ifa->ifindex)) == NULL) /* keep it destroyed */ return; if (siocaifaddr(ifp->name, ifa->inaddr, ifa->inmask, ifa->inbcast)) /* keep it destroyed */ return; ifa->flags &= ~MIBIFA_DESTROYED; } /* * Destroy an interface address */ int mib_destroy_ifa(struct mibifa *ifa) { struct mibif *ifp; if ((ifp = mib_find_if(ifa->ifindex)) == NULL) { /* ups. */ mib_iflist_bad = 1; return (-1); } if (siocdifaddr(ifp->name, ifa->inaddr)) { /* ups. */ syslog(LOG_ERR, "SIOCDIFADDR: %m"); mib_iflist_bad = 1; return (-1); } ifa->flags |= MIBIFA_DESTROYED; return (0); } /* * Rollback the modification of an address. Don't bother to wait for * the routing socket. */ void mib_unmodify_ifa(struct mibifa *ifa) { struct mibif *ifp; if ((ifp = mib_find_if(ifa->ifindex)) == NULL) { /* ups. */ mib_iflist_bad = 1; return; } if (siocaifaddr(ifp->name, ifa->inaddr, ifa->inmask, ifa->inbcast)) { /* ups. */ mib_iflist_bad = 1; return; } } /* * Modify an IFA. */ int mib_modify_ifa(struct mibifa *ifa) { struct mibif *ifp; if ((ifp = mib_find_if(ifa->ifindex)) == NULL) { /* ups. 
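
route_input() above consumes asynchronous routing-socket messages one read() at a time; mibII_init() further down opens the socket with socket(PF_ROUTE, SOCK_RAW, AF_UNSPEC). A self-contained sketch of such a listener (illustrative only, with a fixed-size buffer like the original's):

	#include <sys/types.h>
	#include <sys/socket.h>
	#include <net/route.h>
	#include <stdio.h>
	#include <unistd.h>

	/* Illustrative helper: print the type of every routing message seen. */
	static void
	watch_route_socket(void)
	{
		char buf[2048];
		ssize_t n;
		struct rt_msghdr *rtm;
		int s;

		if ((s = socket(PF_ROUTE, SOCK_RAW, AF_UNSPEC)) == -1)
			return;
		while ((n = read(s, buf, sizeof(buf))) > 0) {
			rtm = (struct rt_msghdr *)(void *)buf;
			if (rtm->rtm_version != RTM_VERSION)
				continue;
			printf("routing message type %u, len %u\n",
			    (unsigned)rtm->rtm_type, (unsigned)rtm->rtm_msglen);
		}
		(void)close(s);
	}
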
*/ mib_iflist_bad = 1; return (-1); } if (siocaifaddr(ifp->name, ifa->inaddr, ifa->inmask, ifa->inbcast)) { /* ups. */ mib_iflist_bad = 1; return (-1); } if (verify_ifa(ifp->name, ifa)) { /* ups. */ mib_iflist_bad = 1; return (-1); } return (0); } /* * Destroy a freshly created interface address. Don't bother to wait for * the routing socket. */ void mib_uncreate_ifa(struct mibifa *ifa) { struct mibif *ifp; if ((ifp = mib_find_if(ifa->ifindex)) == NULL) { /* ups. */ mib_iflist_bad = 1; return; } if (siocdifaddr(ifp->name, ifa->inaddr)) { /* ups. */ mib_iflist_bad = 1; return; } destroy_ifa(ifa); } /* * Create a new ifa and verify it */ struct mibifa * mib_create_ifa(u_int ifindex, struct in_addr addr, struct in_addr mask, struct in_addr bcast) { struct mibif *ifp; struct mibifa *ifa; if ((ifp = mib_find_if(ifindex)) == NULL) return (NULL); if ((ifa = alloc_ifa(ifindex, addr)) == NULL) return (NULL); ifa->inmask = mask; ifa->inbcast = bcast; if (siocaifaddr(ifp->name, ifa->inaddr, ifa->inmask, ifa->inbcast)) { syslog(LOG_ERR, "%s: %m", __func__); destroy_ifa(ifa); return (NULL); } if (verify_ifa(ifp->name, ifa)) { destroy_ifa(ifa); return (NULL); } return (ifa); } /* * Get all cloning interfaces and make them dynamic. * Hah! Whe should probably do this on a periodic basis (XXX). */ static void get_cloners(void) { struct if_clonereq req; char *buf, *cp; int i; memset(&req, 0, sizeof(req)); if (ioctl(mib_netsock, SIOCIFGCLONERS, &req) == -1) { syslog(LOG_ERR, "get cloners: %m"); return; } if ((buf = malloc(req.ifcr_total * IFNAMSIZ)) == NULL) { syslog(LOG_ERR, "%m"); return; } req.ifcr_count = req.ifcr_total; req.ifcr_buffer = buf; if (ioctl(mib_netsock, SIOCIFGCLONERS, &req) == -1) { syslog(LOG_ERR, "get cloners: %m"); free(buf); return; } for (cp = buf, i = 0; i < req.ifcr_total; i++, cp += IFNAMSIZ) mib_if_set_dyn(cp); free(buf); } /* * Idle function */ static void mibII_idle(void) { struct mibifa *ifa; if (mib_iflist_bad) { TAILQ_FOREACH(ifa, &mibifa_list, link) ifa->flags &= ~MIBIFA_DESTROYED; /* assume, that all cloning interfaces are dynamic */ get_cloners(); mib_refresh_iflist(); update_ifa_info(); mib_arp_update(); mib_iflist_bad = 0; } - if (update_arp) - mib_arp_update(); + + mib_arp_update(); } /* * Start the module */ static void mibII_start(void) { if ((route_fd = fd_select(route, route_input, NULL, module)) == NULL) { syslog(LOG_ERR, "fd_select(route): %m"); return; } mib_refresh_iflist(); update_ifa_info(); mib_arp_update(); (void)mib_fetch_route(); mib_iftable_last_change = 0; mib_ifstack_last_change = 0; ifmib_reg = or_register(&oid_ifMIB, "The MIB module to describe generic objects for network interface" " sub-layers.", module); ipmib_reg = or_register(&oid_ipMIB, "The MIB module for managing IP and ICMP implementations, but " "excluding their management of IP routes.", module); tcpmib_reg = or_register(&oid_tcpMIB, "The MIB module for managing TCP implementations.", module); udpmib_reg = or_register(&oid_udpMIB, "The MIB module for managing UDP implementations.", module); ipForward_reg = or_register(&oid_ipForward, "The MIB module for the display of CIDR multipath IP Routes.", module); } /* * Initialize the module */ static int mibII_init(struct lmodule *mod, int argc __unused, char *argv[] __unused) { size_t len; module = mod; len = sizeof(clockinfo); if (sysctlbyname("kern.clockrate", &clockinfo, &len, NULL, 0) == -1) { syslog(LOG_ERR, "kern.clockrate: %m"); return (-1); } if (len != sizeof(clockinfo)) { syslog(LOG_ERR, "kern.clockrate: wrong size"); return (-1); } if 
((route = socket(PF_ROUTE, SOCK_RAW, AF_UNSPEC)) == -1) { syslog(LOG_ERR, "PF_ROUTE: %m"); return (-1); } if ((mib_netsock = socket(PF_INET, SOCK_DGRAM, 0)) == -1) { syslog(LOG_ERR, "PF_INET: %m"); (void)close(route); return (-1); } (void)shutdown(mib_netsock, SHUT_RDWR); /* assume, that all cloning interfaces are dynamic */ get_cloners(); return (0); } static int mibII_fini(void) { if (route_fd != NULL) fd_deselect(route_fd); if (route != -1) (void)close(route); if (mib_netsock != -1) (void)close(mib_netsock); /* XXX free memory */ or_unregister(ipForward_reg); or_unregister(udpmib_reg); or_unregister(tcpmib_reg); or_unregister(ipmib_reg); or_unregister(ifmib_reg); return (0); } static void mibII_loading(const struct lmodule *mod, int loaded) { struct mibif *ifp; if (loaded == 1) return; TAILQ_FOREACH(ifp, &mibif_list, link) if (ifp->xnotify_mod == mod) { ifp->xnotify_mod = NULL; ifp->xnotify_data = NULL; ifp->xnotify = NULL; } mib_unregister_newif(mod); } const struct snmp_module config = { "This module implements the interface and ip groups.", mibII_init, mibII_fini, mibII_idle, /* idle */ NULL, /* dump */ NULL, /* config */ mibII_start, NULL, mibII_ctree, mibII_CTREE_SIZE, mibII_loading }; /* * Should have a list of these attached to each interface. */ void * mibif_notify(struct mibif *ifp, const struct lmodule *mod, mibif_notify_f func, void *data) { ifp->xnotify = func; ifp->xnotify_data = data; ifp->xnotify_mod = mod; return (ifp); } void mibif_unnotify(void *arg) { struct mibif *ifp = arg; ifp->xnotify = NULL; ifp->xnotify_data = NULL; ifp->xnotify_mod = NULL; } Index: head/contrib/bsnmp/snmp_mibII/mibII_route.c =================================================================== --- head/contrib/bsnmp/snmp_mibII/mibII_route.c (revision 186118) +++ head/contrib/bsnmp/snmp_mibII/mibII_route.c (revision 186119) @@ -1,516 +1,515 @@ /* * Copyright (c) 2001-2003 * Fraunhofer Institute for Open Communication Systems (FhG Fokus). * All rights reserved. * * Author: Harti Brandt * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* * $Begemot: bsnmp/snmp_mibII/mibII_route.c,v 1.9 2005/10/06 07:15:00 brandt_h Exp $ * * Routing table */ #include "support.h" #ifdef HAVE_SYS_TREE_H #include #else #include "tree.h" #endif #include "mibII.h" #include "mibII_oid.h" struct sroute { RB_ENTRY(sroute) link; uint32_t ifindex; uint8_t index[13]; uint8_t type; uint8_t proto; }; RB_HEAD(sroutes, sroute) sroutes = RB_INITIALIZER(&sroutes); RB_PROTOTYPE(sroutes, sroute, link, sroute_compare); #define ROUTE_UPDATE_INTERVAL (100 * 60 * 10) /* 10 min */ static uint64_t route_tick; static u_int route_total; /* * Compare two routes */ static int sroute_compare(struct sroute *s1, struct sroute *s2) { return (memcmp(s1->index, s2->index, 13)); } static void sroute_index_append(struct asn_oid *oid, u_int sub, const struct sroute *s) { int i; oid->len = sub + 13; for (i = 0; i < 13; i++) oid->subs[sub + i] = s->index[i]; } #if 0 static void sroute_print(const struct sroute *r) { u_int i; for (i = 0; i < 13 - 1; i++) printf("%u.", r->index[i]); printf("%u proto=%u type=%u", r->index[i], r->proto, r->type); } #endif /* * process routing message */ void mib_sroute_process(struct rt_msghdr *rtm, struct sockaddr *gw, struct sockaddr *dst, struct sockaddr *mask) { struct sockaddr_in *in_dst, *in_gw; struct in_addr in_mask; struct mibif *ifp; struct sroute key; struct sroute *r, *r1; in_addr_t ha; if (dst == NULL || gw == NULL || dst->sa_family != AF_INET || gw->sa_family != AF_INET) return; in_dst = (struct sockaddr_in *)(void *)dst; in_gw = (struct sockaddr_in *)(void *)gw; if (rtm->rtm_flags & RTF_HOST) in_mask.s_addr = 0xffffffff; else if (mask == NULL || mask->sa_len == 0) in_mask.s_addr = 0; else in_mask = ((struct sockaddr_in *)(void *)mask)->sin_addr; /* build the index */ ha = ntohl(in_dst->sin_addr.s_addr); key.index[0] = (ha >> 24) & 0xff; key.index[1] = (ha >> 16) & 0xff; key.index[2] = (ha >> 8) & 0xff; key.index[3] = (ha >> 0) & 0xff; ha = ntohl(in_mask.s_addr); key.index[4] = (ha >> 24) & 0xff; key.index[5] = (ha >> 16) & 0xff; key.index[6] = (ha >> 8) & 0xff; key.index[7] = (ha >> 0) & 0xff; /* ToS */ key.index[8] = 0; ha = ntohl(in_gw->sin_addr.s_addr); key.index[9] = (ha >> 24) & 0xff; key.index[10] = (ha >> 16) & 0xff; key.index[11] = (ha >> 8) & 0xff; key.index[12] = (ha >> 0) & 0xff; if (rtm->rtm_type == RTM_DELETE) { r = RB_FIND(sroutes, &sroutes, &key); if (r == 0) { #ifdef DEBUG_ROUTE syslog(LOG_WARNING, "%s: DELETE: %u.%u.%u.%u " "%u.%u.%u.%u %u %u.%u.%u.%u not found", __func__, key.index[0], key.index[1], key.index[2], key.index[3], key.index[4], key.index[5], key.index[6], key.index[7], key.index[8], key.index[9], key.index[10], key.index[11], key.index[12]); #endif return; } RB_REMOVE(sroutes, &sroutes, r); free(r); route_total--; #ifdef DEBUG_ROUTE printf("%s: DELETE: %u.%u.%u.%u " "%u.%u.%u.%u %u %u.%u.%u.%u\n", __func__, key.index[0], key.index[1], key.index[2], key.index[3], key.index[4], key.index[5], key.index[6], key.index[7], key.index[8], key.index[9], key.index[10], key.index[11], key.index[12]); #endif return; } /* GET or ADD */ ifp = NULL; if ((ifp = mib_find_if_sys(rtm->rtm_index)) == NULL) { if (rtm->rtm_type == RTM_ADD) { /* make it a get so the kernel fills the index */ mib_send_rtmsg(rtm, gw, dst, mask); return; } mib_iflist_bad = 1; } if ((r = malloc(sizeof(*r))) == NULL) { syslog(LOG_ERR, "%m"); return; } memcpy(r->index, key.index, sizeof(r->index)); r->ifindex = (ifp == NULL) ? 0 : ifp->index; - r->type = (rtm->rtm_flags & RTF_LLINFO) ? 3 : - (rtm->rtm_flags & RTF_REJECT) ? 
2 : 4; + r->type = (rtm->rtm_flags & RTF_REJECT) ? 2 : 4; /* cannot really know, what protocol it runs */ r->proto = (rtm->rtm_flags & RTF_LOCAL) ? 2 : (rtm->rtm_flags & RTF_STATIC) ? 3 : (rtm->rtm_flags & RTF_DYNAMIC) ? 4 : 10; r1 = RB_INSERT(sroutes, &sroutes, r); if (r1 != NULL) { #ifdef DEBUG_ROUTE syslog(LOG_WARNING, "%s: %u.%u.%u.%u " "%u.%u.%u.%u %u %u.%u.%u.%u duplicate route", __func__, key.index[0], key.index[1], key.index[2], key.index[3], key.index[4], key.index[5], key.index[6], key.index[7], key.index[8], key.index[9], key.index[10], key.index[11], key.index[12]); #endif r1->ifindex = r->ifindex; r1->type = r->type; r1->proto = r->proto; free(r); return; } route_total++; #ifdef DEBUG_ROUTE printf("%s: ADD/GET: %u.%u.%u.%u " "%u.%u.%u.%u %u %u.%u.%u.%u\n", __func__, key.index[0], key.index[1], key.index[2], key.index[3], key.index[4], key.index[5], key.index[6], key.index[7], key.index[8], key.index[9], key.index[10], key.index[11], key.index[12]); #endif } int mib_fetch_route(void) { u_char *rtab, *next; size_t len; struct sroute *r, *r1; struct rt_msghdr *rtm; struct sockaddr *addrs[RTAX_MAX]; if (route_tick != 0 && route_tick + ROUTE_UPDATE_INTERVAL > this_tick) return (0); /* * Remove all routes */ r = RB_MIN(sroutes, &sroutes); while (r != NULL) { r1 = RB_NEXT(sroutes, &sroutes, r); RB_REMOVE(sroutes, &sroutes, r); free(r); r = r1; } route_total = 0; if ((rtab = mib_fetch_rtab(AF_INET, NET_RT_DUMP, 0, &len)) == NULL) return (-1); next = rtab; for (next = rtab; next < rtab + len; next += rtm->rtm_msglen) { rtm = (struct rt_msghdr *)(void *)next; if (rtm->rtm_type != RTM_GET || !(rtm->rtm_flags & RTF_UP)) continue; mib_extract_addrs(rtm->rtm_addrs, (u_char *)(rtm + 1), addrs); mib_sroute_process(rtm, addrs[RTAX_GATEWAY], addrs[RTAX_DST], addrs[RTAX_NETMASK]); } #if 0 u_int n = 0; r = RB_MIN(sroutes, &sroutes); while (r != NULL) { printf("%u: ", n++); sroute_print(r); printf("\n"); r = RB_NEXT(sroutes, &sroutes, r); } #endif free(rtab); route_tick = get_ticks(); return (0); } /** * Find a route in the table. */ static struct sroute * sroute_get(const struct asn_oid *oid, u_int sub) { struct sroute key; int i; if (oid->len - sub != 13) return (NULL); for (i = 0; i < 13; i++) key.index[i] = oid->subs[sub + i]; return (RB_FIND(sroutes, &sroutes, &key)); } /** * Find next route in the table. There is no such RB_ macro, so must * dig into the innards of the RB stuff. */ static struct sroute * sroute_getnext(struct asn_oid *oid, u_int sub) { u_int i; int comp; struct sroute key; struct sroute *best; struct sroute *s; /* * We now, that the OID is at least the tableEntry OID. If it is, * the user wants the first route. */ if (oid->len == sub) return (RB_MIN(sroutes, &sroutes)); /* * This is also true for any index that consists of zeros and is * shorter than the full index. */ if (oid->len < sub + 13) { for (i = sub; i < oid->len; i++) if (oid->subs[i] != 0) break; if (i == oid->len) return (RB_MIN(sroutes, &sroutes)); /* * Now if the index is too short, we fill it with zeros and then * subtract one from the index. We can do this, because we now, * that there is at least one index element that is not zero. 
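
The 13 sub-ids of the ipCidrRouteTable index are the four destination octets, the four mask octets, the ToS, and the four next-hop octets (see how mib_sroute_process() fills key.index[]). For GETNEXT the code pads a short index with zeros and subtracts one, so that the strictly-greater RB search below also finds an entry that equals the zero-padded prefix. A stand-alone sketch of that decrement (hypothetical helper name):

	#include <stdint.h>

	/*
	 * Sketch of the trick described above: extend a partial 13-sub-id
	 * index with zeros and step it back by one.  Assumes at least one
	 * of the given sub-ids is non-zero.
	 */
	static void
	index_predecessor(uint32_t idx[13], unsigned have)
	{
		unsigned i;

		for (i = have; i < 13; i++)
			idx[i] = 0;
		for (i = 13; i-- > 0; ) {
			if (idx[i] != 0) {
				idx[i]--;
				break;
			}
			idx[i] = UINT32_MAX;	/* ASN_MAXID in the original */
		}
	}
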
*/ for (i = oid->len; i < sub + 13; i++) oid->subs[i] = 0; for (i = sub + 13 - 1; i >= sub; i--) { if (oid->subs[i] != 0) { oid->subs[i]--; break; } oid->subs[i] = ASN_MAXID; } oid->len = sub + 13; } /* build the index */ for (i = sub; i < sub + 13; i++) key.index[i - sub] = oid->subs[i]; /* now find the element */ best = NULL; s = RB_ROOT(&sroutes); while (s != NULL) { comp = sroute_compare(&key, s); if (comp >= 0) { /* The current element is smaller than what we search. * Forget about it and move to the right subtree. */ s = RB_RIGHT(s, link); continue; } /* the current element is larger than what we search. * forget about the right subtree (its even larger), but * the current element may be what we need. */ if (best == NULL || sroute_compare(s, best) < 0) /* this one's better */ best = s; s = RB_LEFT(s, link); } return (best); } /* * Table */ int op_route_table(struct snmp_context *ctx __unused, struct snmp_value *value, u_int sub, u_int iidx __unused, enum snmp_op op) { struct sroute *r; if (mib_fetch_route() == -1) return (SNMP_ERR_GENERR); switch (op) { case SNMP_OP_GETNEXT: if ((r = sroute_getnext(&value->var, sub)) == NULL) return (SNMP_ERR_NOSUCHNAME); sroute_index_append(&value->var, sub, r); break; case SNMP_OP_GET: if ((r = sroute_get(&value->var, sub)) == NULL) return (SNMP_ERR_NOSUCHNAME); break; case SNMP_OP_SET: if ((r = sroute_get(&value->var, sub)) == NULL) return (SNMP_ERR_NOSUCHNAME); return (SNMP_ERR_NOT_WRITEABLE); case SNMP_OP_ROLLBACK: case SNMP_OP_COMMIT: abort(); default: abort(); } switch (value->var.subs[sub - 1]) { case LEAF_ipCidrRouteDest: value->v.ipaddress[0] = r->index[0]; value->v.ipaddress[1] = r->index[1]; value->v.ipaddress[2] = r->index[2]; value->v.ipaddress[3] = r->index[3]; break; case LEAF_ipCidrRouteMask: value->v.ipaddress[0] = r->index[4]; value->v.ipaddress[1] = r->index[5]; value->v.ipaddress[2] = r->index[6]; value->v.ipaddress[3] = r->index[7]; break; case LEAF_ipCidrRouteTos: value->v.integer = r->index[8]; break; case LEAF_ipCidrRouteNextHop: value->v.ipaddress[0] = r->index[9]; value->v.ipaddress[1] = r->index[10]; value->v.ipaddress[2] = r->index[11]; value->v.ipaddress[3] = r->index[12]; break; case LEAF_ipCidrRouteIfIndex: value->v.integer = r->ifindex; break; case LEAF_ipCidrRouteType: value->v.integer = r->type; break; case LEAF_ipCidrRouteProto: value->v.integer = r->proto; break; case LEAF_ipCidrRouteAge: value->v.integer = 0; break; case LEAF_ipCidrRouteInfo: value->v.oid = oid_zeroDotZero; break; case LEAF_ipCidrRouteNextHopAS: value->v.integer = 0; break; case LEAF_ipCidrRouteMetric1: case LEAF_ipCidrRouteMetric2: case LEAF_ipCidrRouteMetric3: case LEAF_ipCidrRouteMetric4: case LEAF_ipCidrRouteMetric5: value->v.integer = -1; break; case LEAF_ipCidrRouteStatus: value->v.integer = 1; break; } return (SNMP_ERR_NOERROR); } /* * scalars */ int op_route(struct snmp_context *ctx __unused, struct snmp_value *value, u_int sub, u_int iidx __unused, enum snmp_op op) { switch (op) { case SNMP_OP_GETNEXT: abort(); case SNMP_OP_GET: break; case SNMP_OP_SET: return (SNMP_ERR_NOT_WRITEABLE); case SNMP_OP_ROLLBACK: case SNMP_OP_COMMIT: abort(); } if (mib_fetch_route() == -1) return (SNMP_ERR_GENERR); switch (value->var.subs[sub - 1]) { case LEAF_ipCidrRouteNumber: value->v.uint32 = route_total; break; } return (SNMP_ERR_NOERROR); } RB_GENERATE(sroutes, sroute, link, sroute_compare); Index: head/contrib/ipfilter/ipsend/44arp.c =================================================================== --- head/contrib/ipfilter/ipsend/44arp.c (revision 
186118) +++ head/contrib/ipfilter/ipsend/44arp.c (revision 186119) @@ -1,121 +1,126 @@ /* $FreeBSD$ */ /* * Based upon 4.4BSD's /usr/sbin/arp */ #include #include #include #include #include #if __FreeBSD_version >= 300000 # include #endif #include #include #if defined(__FreeBSD__) # include "radix_ipf.h" #endif #ifndef __osf__ # include #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "ipsend.h" #include "iplang/iplang.h" /* * lookup host and return * its IP address in address * (4 bytes) */ int resolve(host, address) char *host, *address; { struct hostent *hp; u_long add; add = inet_addr(host); if (add == -1) { if (!(hp = gethostbyname(host))) { fprintf(stderr, "unknown host: %s\n", host); return -1; } bcopy((char *)hp->h_addr, (char *)address, 4); return 0; } bcopy((char*)&add, address, 4); return 0; } int arp(addr, eaddr) char *addr, *eaddr; { int mib[6]; size_t needed; char *lim, *buf, *next; struct rt_msghdr *rtm; struct sockaddr_inarp *sin; struct sockaddr_dl *sdl; #ifdef IPSEND if (arp_getipv4(addr, ether) == 0) return 0; #endif if (!addr) return -1; mib[0] = CTL_NET; mib[1] = PF_ROUTE; mib[2] = 0; mib[3] = AF_INET; mib[4] = NET_RT_FLAGS; +#ifdef RTF_LLINFO mib[5] = RTF_LLINFO; +#else + mib[5] = 0; +#endif + if (sysctl(mib, 6, NULL, &needed, NULL, 0) == -1) { perror("route-sysctl-estimate"); exit(-1); } if ((buf = malloc(needed)) == NULL) { perror("malloc"); exit(-1); } if (sysctl(mib, 6, buf, &needed, NULL, 0) == -1) { perror("actual retrieval of routing table"); exit(-1); } lim = buf + needed; for (next = buf; next < lim; next += rtm->rtm_msglen) { rtm = (struct rt_msghdr *)next; sin = (struct sockaddr_inarp *)(rtm + 1); sdl = (struct sockaddr_dl *)(sin + 1); if (!bcmp(addr, (char *)&sin->sin_addr, sizeof(struct in_addr))) { bcopy(LLADDR(sdl), eaddr, sdl->sdl_alen); return 0; } } return -1; } Index: head/lib/libstand/if_ether.h =================================================================== --- head/lib/libstand/if_ether.h (revision 186118) +++ head/lib/libstand/if_ether.h (revision 186119) @@ -1,261 +1,261 @@ /* $NetBSD: if_ether.h,v 1.25 1997/01/17 17:06:06 mikel Exp $ */ /* * Copyright (c) 1982, 1986, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)if_ether.h 8.1 (Berkeley) 6/10/93 * * $FreeBSD$ */ /* * Ethernet address - 6 octets * this is only used by the ethers(3) functions. */ struct ether_addr { u_int8_t ether_addr_octet[6]; }; /* * Structure of a 10Mb/s Ethernet header. */ #define ETHER_ADDR_LEN 6 struct ether_header { u_int8_t ether_dhost[ETHER_ADDR_LEN]; u_int8_t ether_shost[ETHER_ADDR_LEN]; u_int16_t ether_type; }; #define ETHERTYPE_PUP 0x0200 /* PUP protocol */ #define ETHERTYPE_IP 0x0800 /* IP protocol */ #define ETHERTYPE_ARP 0x0806 /* address resolution protocol */ #define ETHERTYPE_REVARP 0x8035 /* reverse addr resolution protocol */ /* * The ETHERTYPE_NTRAILER packet types starting at ETHERTYPE_TRAIL have * (type-ETHERTYPE_TRAIL)*512 bytes of data followed * by an ETHER type (as given above) and then the (variable-length) header. */ #define ETHERTYPE_TRAIL 0x1000 /* Trailer packet */ #define ETHERTYPE_NTRAILER 16 #define ETHER_IS_MULTICAST(addr) (*(addr) & 0x01) /* is address mcast/bcast? */ #define ETHERMTU 1500 #define ETHERMIN (60-14) #ifdef _KERNEL /* * Macro to map an IP multicast address to an Ethernet multicast address. * The high-order 25 bits of the Ethernet address are statically assigned, * and the low-order 23 bits are taken from the low end of the IP address. */ #define ETHER_MAP_IP_MULTICAST(ipaddr, enaddr) \ /* struct in_addr *ipaddr; */ \ /* u_int8_t enaddr[ETHER_ADDR_LEN]; */ \ { \ (enaddr)[0] = 0x01; \ (enaddr)[1] = 0x00; \ (enaddr)[2] = 0x5e; \ (enaddr)[3] = ((u_int8_t *)ipaddr)[1] & 0x7f; \ (enaddr)[4] = ((u_int8_t *)ipaddr)[2]; \ (enaddr)[5] = ((u_int8_t *)ipaddr)[3]; \ } #endif /* * Ethernet Address Resolution Protocol. * * See RFC 826 for protocol description. Structure below is adapted * to resolving internet addresses. Field names used correspond to * RFC 826. */ struct ether_arp { struct arphdr ea_hdr; /* fixed-size header */ u_int8_t arp_sha[ETHER_ADDR_LEN]; /* sender hardware address */ u_int8_t arp_spa[4]; /* sender protocol address */ u_int8_t arp_tha[ETHER_ADDR_LEN]; /* target hardware address */ u_int8_t arp_tpa[4]; /* target protocol address */ }; #define arp_hrd ea_hdr.ar_hrd #define arp_pro ea_hdr.ar_pro #define arp_hln ea_hdr.ar_hln #define arp_pln ea_hdr.ar_pln #define arp_op ea_hdr.ar_op /* * Structure shared between the ethernet driver modules and * the address resolution code. For example, each ec_softc or il_softc * begins with this structure. 
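
ETHER_MAP_IP_MULTICAST above places the low-order 23 bits of the IP group address into the fixed 01:00:5e prefix, masking the top bit of the second octet. A small worked example of the mapping as written (224.1.2.3 should print 01:00:5e:01:02:03):

	#include <stdio.h>
	#include <stdint.h>

	int
	main(void)
	{
		uint8_t ip[4] = { 224, 1, 2, 3 };
		uint8_t en[6];

		en[0] = 0x01;
		en[1] = 0x00;
		en[2] = 0x5e;
		en[3] = ip[1] & 0x7f;	/* keep only the low 23 bits */
		en[4] = ip[2];
		en[5] = ip[3];
		printf("%02x:%02x:%02x:%02x:%02x:%02x\n",
		    en[0], en[1], en[2], en[3], en[4], en[5]);
		return (0);
	}
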
*/ struct arpcom { struct ifnet ac_if; /* network-visible interface */ u_int8_t ac_enaddr[ETHER_ADDR_LEN]; /* ethernet hardware address */ char ac__pad[2]; /* be nice to m68k ports */ LIST_HEAD(, ether_multi) ac_multiaddrs; /* list of ether multicast addrs */ int ac_multicnt; /* length of ac_multiaddrs list */ }; struct llinfo_arp { LIST_ENTRY(llinfo_arp) la_list; struct rtentry *la_rt; struct mbuf *la_hold; /* last packet until resolved/timeout */ long la_asked; /* last time we QUERIED for this addr */ #define la_timer la_rt->rt_rmx.rmx_expire /* deletion time in seconds */ }; struct sockaddr_inarp { u_int8_t sin_len; u_int8_t sin_family; u_int16_t sin_port; struct in_addr sin_addr; struct in_addr sin_srcaddr; u_int16_t sin_tos; u_int16_t sin_other; #define SIN_PROXY 1 }; /* * IP and ethernet specific routing flags */ #define RTF_USETRAILERS RTF_PROTO1 /* use trailers */ #define RTF_ANNOUNCE RTF_PROTO2 /* announce new arp entry */ #ifdef _KERNEL u_int8_t etherbroadcastaddr[ETHER_ADDR_LEN]; u_int8_t ether_ipmulticast_min[ETHER_ADDR_LEN]; u_int8_t ether_ipmulticast_max[ETHER_ADDR_LEN]; struct ifqueue arpintrq; void arpwhohas(struct arpcom *, struct in_addr *); void arpintr(void); int arpresolve(struct arpcom *, - struct rtentry *, struct mbuf *, struct sockaddr *, u_char *); + struct rtentry *, struct mbuf *, struct sockaddr *, u_char *, struct llentry **); void arp_ifinit(struct arpcom *, struct ifaddr *); void arp_rtrequest(int, struct rtentry *, struct sockaddr *); int ether_addmulti(struct ifreq *, struct arpcom *); int ether_delmulti(struct ifreq *, struct arpcom *); #endif /* _KERNEL */ /* * Ethernet multicast address structure. There is one of these for each * multicast address or range of multicast addresses that we are supposed * to listen to on a particular interface. They are kept in a linked list, * rooted in the interface's arpcom structure. (This really has nothing to * do with ARP, or with the Internet address family, but this appears to be * the minimally-disrupting place to put it.) */ struct ether_multi { u_int8_t enm_addrlo[ETHER_ADDR_LEN]; /* low or only address of range */ u_int8_t enm_addrhi[ETHER_ADDR_LEN]; /* high or only address of range */ struct arpcom *enm_ac; /* back pointer to arpcom */ u_int enm_refcount; /* no. claims to this addr/range */ LIST_ENTRY(ether_multi) enm_list; }; /* * Structure used by macros below to remember position when stepping through * all of the ether_multi records. */ struct ether_multistep { struct ether_multi *e_enm; }; /* * Macro for looking up the ether_multi record for a given range of Ethernet * multicast addresses connected to a given arpcom structure. If no matching * record is found, "enm" returns NULL. */ #define ETHER_LOOKUP_MULTI(addrlo, addrhi, ac, enm) \ /* u_int8_t addrlo[ETHER_ADDR_LEN]; */ \ /* u_int8_t addrhi[ETHER_ADDR_LEN]; */ \ /* struct arpcom *ac; */ \ /* struct ether_multi *enm; */ \ { \ for ((enm) = (ac)->ac_multiaddrs.lh_first; \ (enm) != NULL && \ (bcmp((enm)->enm_addrlo, (addrlo), ETHER_ADDR_LEN) != 0 || \ bcmp((enm)->enm_addrhi, (addrhi), ETHER_ADDR_LEN) != 0); \ (enm) = (enm)->enm_list.le_next); \ } /* * Macro to step through all of the ether_multi records, one at a time. * The current position is remembered in "step", which the caller must * provide. ETHER_FIRST_MULTI(), below, must be called to initialize "step" * and get the first record. Both macros return a NULL "enm" when there * are no remaining records. 
*/ #define ETHER_NEXT_MULTI(step, enm) \ /* struct ether_multistep step; */ \ /* struct ether_multi *enm; */ \ { \ if (((enm) = (step).e_enm) != NULL) \ (step).e_enm = (enm)->enm_list.le_next; \ } #define ETHER_FIRST_MULTI(step, ac, enm) \ /* struct ether_multistep step; */ \ /* struct arpcom *ac; */ \ /* struct ether_multi *enm; */ \ { \ (step).e_enm = (ac)->ac_multiaddrs.lh_first; \ ETHER_NEXT_MULTI((step), (enm)); \ } #ifdef _KERNEL void arp_rtrequest(int, struct rtentry *, struct sockaddr *); int arpresolve(struct arpcom *, struct rtentry *, struct mbuf *, - struct sockaddr *, u_char *); + struct sockaddr *, u_char *, struct llentry **); void arpintr(void); int arpioctl(u_long, caddr_t); void arp_ifinit(struct arpcom *, struct ifaddr *); void revarpinput(struct mbuf *); void in_revarpinput(struct mbuf *); void revarprequest(struct ifnet *); int revarpwhoarewe(struct ifnet *, struct in_addr *, struct in_addr *); int revarpwhoami(struct in_addr *, struct ifnet *); int db_show_arptab(void); #endif /* * Prototype ethers(3) functions. */ #ifndef _KERNEL #include __BEGIN_DECLS char * ether_ntoa(struct ether_addr *); struct ether_addr * ether_aton(char *); int ether_ntohost(char *, struct ether_addr *); int ether_hostton(char *, struct ether_addr *); int ether_line(char *, struct ether_addr *, char *); __END_DECLS #endif Index: head/libexec/bootpd/rtmsg.c =================================================================== --- head/libexec/bootpd/rtmsg.c (revision 186118) +++ head/libexec/bootpd/rtmsg.c (revision 186119) @@ -1,256 +1,255 @@ /* * Copyright (c) 1984, 1993 * The Regents of the University of California. All rights reserved. * Copyright (c) 1994 * Geoffrey M. Rehmet, All rights reserved. * * This code is derived from software which forms part of the 4.4-Lite * Berkeley software distribution, which was in derived from software * contributed to Berkeley by Sun Microsystems, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* * from arp.c 8.2 (Berkeley) 1/2/94 */ #include __FBSDID("$FreeBSD$"); #include /* * Verify that we are at least 4.4 BSD */ #if defined(BSD) #if BSD >= 199306 #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "report.h" static int rtmsg(int); static int s = -1; /* routing socket */ /* * Open the routing socket */ static void getsocket () { if (s < 0) { s = socket(PF_ROUTE, SOCK_RAW, 0); if (s < 0) { report(LOG_ERR, "socket %s", strerror(errno)); exit(1); } } else { /* * Drain the socket of any unwanted routing messages. */ int n; char buf[512]; ioctl(s, FIONREAD, &n); while (n > 0) { read(s, buf, sizeof buf); ioctl(s, FIONREAD, &n); } } } static struct sockaddr_in so_mask = {8, 0, 0, { 0xffffffff}}; static struct sockaddr_inarp blank_sin = {sizeof(blank_sin), AF_INET }, sin_m; static struct sockaddr_dl blank_sdl = {sizeof(blank_sdl), AF_LINK }, sdl_m; static int expire_time, flags, export_only, doing_proxy; static struct { struct rt_msghdr m_rtm; char m_space[512]; } m_rtmsg; /* * Set an individual arp entry */ int bsd_arp_set(ia, eaddr, len) struct in_addr *ia; char *eaddr; int len; { register struct sockaddr_inarp *sin = &sin_m; register struct sockaddr_dl *sdl; register struct rt_msghdr *rtm = &(m_rtmsg.m_rtm); u_char *ea; struct timeval time; int op = RTM_ADD; getsocket(); sdl_m = blank_sdl; sin_m = blank_sin; sin->sin_addr = *ia; ea = (u_char *)LLADDR(&sdl_m); bcopy(eaddr, ea, len); sdl_m.sdl_alen = len; doing_proxy = flags = export_only = expire_time = 0; /* make arp entry temporary */ gettimeofday(&time, 0); expire_time = time.tv_sec + 20 * 60; tryagain: if (rtmsg(RTM_GET) < 0) { report(LOG_WARNING, "rtmget: %s", strerror(errno)); return (1); } sin = (struct sockaddr_inarp *)(rtm + 1); sdl = (struct sockaddr_dl *)(sin->sin_len + (char *)sin); if (sin->sin_addr.s_addr == sin_m.sin_addr.s_addr) { if (sdl->sdl_family == AF_LINK && - (rtm->rtm_flags & RTF_LLINFO) && !(rtm->rtm_flags & RTF_GATEWAY)) switch (sdl->sdl_type) { case IFT_ETHER: case IFT_FDDI: case IFT_ISO88023: case IFT_ISO88024: case IFT_ISO88025: op = RTM_CHANGE; goto overwrite; } if (doing_proxy == 0) { report(LOG_WARNING, "set: can only proxy for %s\n", inet_ntoa(sin->sin_addr)); return (1); } if (sin_m.sin_other & SIN_PROXY) { report(LOG_WARNING, "set: proxy entry exists for non 802 device\n"); return(1); } sin_m.sin_other = SIN_PROXY; export_only = 1; goto tryagain; } overwrite: if (sdl->sdl_family != AF_LINK) { report(LOG_WARNING, "cannot intuit interface index and type for %s\n", inet_ntoa(sin->sin_addr)); return (1); } sdl_m.sdl_type = sdl->sdl_type; sdl_m.sdl_index = sdl->sdl_index; return (rtmsg(op)); } static int rtmsg(cmd) int cmd; { static int seq; int rlen; register struct rt_msghdr *rtm = &m_rtmsg.m_rtm; register char *cp = m_rtmsg.m_space; register int l; errno = 0; bzero((char *)&m_rtmsg, sizeof(m_rtmsg)); rtm->rtm_flags = flags; rtm->rtm_version = RTM_VERSION; switch (cmd) { default: 
report(LOG_ERR, "set_arp: internal wrong cmd - exiting"); exit(1); case RTM_ADD: case RTM_CHANGE: rtm->rtm_addrs |= RTA_GATEWAY; rtm->rtm_rmx.rmx_expire = expire_time; rtm->rtm_inits = RTV_EXPIRE; rtm->rtm_flags |= (RTF_HOST | RTF_STATIC); sin_m.sin_other = 0; if (doing_proxy) { if (export_only) sin_m.sin_other = SIN_PROXY; else { rtm->rtm_addrs |= RTA_NETMASK; rtm->rtm_flags &= ~RTF_HOST; } } /* FALLTHROUGH */ case RTM_GET: rtm->rtm_addrs |= RTA_DST; } #define NEXTADDR(w, s) \ if (rtm->rtm_addrs & (w)) { \ bcopy((char *)&s, cp, sizeof(s)); cp += sizeof(s);} NEXTADDR(RTA_DST, sin_m); NEXTADDR(RTA_GATEWAY, sdl_m); NEXTADDR(RTA_NETMASK, so_mask); rtm->rtm_msglen = cp - (char *)&m_rtmsg; l = rtm->rtm_msglen; rtm->rtm_seq = ++seq; rtm->rtm_type = cmd; if ((rlen = write(s, (char *)&m_rtmsg, l)) < 0) { if ((errno != ESRCH) && !(errno == EEXIST && cmd == RTM_ADD)){ report(LOG_WARNING, "writing to routing socket: %s", strerror(errno)); return (-1); } } do { l = read(s, (char *)&m_rtmsg, sizeof(m_rtmsg)); } while (l > 0 && (rtm->rtm_type != cmd || rtm->rtm_seq != seq || rtm->rtm_pid != getpid())); if (l < 0) report(LOG_WARNING, "arp: read from routing socket: %s\n", strerror(errno)); return (0); } #endif /* BSD */ #endif /* BSD >= 199306 */ Index: head/release/picobsd/tinyware/ns/ns.c =================================================================== --- head/release/picobsd/tinyware/ns/ns.c (revision 186118) +++ head/release/picobsd/tinyware/ns/ns.c (revision 186119) @@ -1,829 +1,825 @@ /*- * Copyright (c) 1998 Andrzej Bialecki * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ /* * Small replacement for netstat. Uses only sysctl(3) to get the info. 
*/ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include char *progname; int iflag = 0; int lflag = 0; /* print cpu load info */ int rflag = 0; int sflag = 0; int pflag = 0; int wflag = 0; /* repeat every wait seconds */ int delta = 0 ; extern char *optarg; extern int optind; void usage() { fprintf(stderr, "\n%s [-nrsil] [-p proto] [-w wait]\n", progname); fprintf(stderr, " proto: {ip|tcp|udp|icmp}\n\n"); } /* * The following parts related to retrieving the routing table and * interface information, were borrowed from R. Stevens' code examples * accompanying his excellent book. Thanks! */ char * sock_ntop(const struct sockaddr *sa, size_t salen) { char portstr[7]; static char str[128]; /* Unix domain is largest */ switch (sa->sa_family) { case 255: { int i = 0; u_long mask; u_int index = 1 << 31; u_short new_mask = 0; mask = ntohl(((struct sockaddr_in *)sa)->sin_addr.s_addr); while (mask & index) { new_mask++; index >>= 1; } sprintf(str, "/%hu", new_mask); return (str); } case AF_UNSPEC: case AF_INET: { struct sockaddr_in *sin = (struct sockaddr_in *)sa; if (inet_ntop(AF_INET, &sin->sin_addr, str, sizeof(str)) == NULL) return (NULL); if (ntohs(sin->sin_port) != 0) { snprintf(portstr, sizeof(portstr), ".%d", ntohs(sin->sin_port)); strcat(str, portstr); } if (strcmp(str, "0.0.0.0") == 0) sprintf(str, "default"); return (str); } case AF_UNIX: { struct sockaddr_un *unp = (struct sockaddr_un *)sa; /* * OK to have no pathname bound to the socket: * happens on every connect() unless client calls * bind() first. */ if (unp->sun_path[0] == 0) strcpy(str, "(no pathname bound)"); else snprintf(str, sizeof(str), "%s", unp->sun_path); return (str); } case AF_LINK: { struct sockaddr_dl *sdl = (struct sockaddr_dl *)sa; if (sdl->sdl_nlen > 0) { bcopy(&sdl->sdl_data[0], str, sdl->sdl_nlen); str[sdl->sdl_nlen] = '\0'; } else snprintf(str, sizeof(str), "link#%d", sdl->sdl_index); return (str); } default: snprintf(str, sizeof(str), "sock_ntop: unknown AF_xxx: %d, len %d", sa->sa_family, salen); return (str); } return (NULL); } char * Sock_ntop(const struct sockaddr *sa, size_t salen) { char *ptr; if ((ptr = sock_ntop(sa, salen)) == NULL) err(1, "sock_ntop error"); /* inet_ntop() sets errno */ return (ptr); } #define ROUNDUP(a,size) (((a) & ((size)-1))?(1+((a)|((size)-1))):(a)) #define NEXT_SA(ap) \ ap=(struct sockaddr *) \ ((caddr_t)ap+(ap->sa_len?ROUNDUP(ap->sa_len,sizeof(u_long)):\ sizeof(u_long))) void get_rtaddrs(int addrs, struct sockaddr *sa, struct sockaddr **rti_info) { int i; for (i = 0; i < RTAX_MAX; i++) { if (addrs & (1 << i)) { rti_info[i] = sa; NEXT_SA(sa); } else rti_info[i] = NULL; } } void get_flags(char *buf, int flags) { if (flags & 0x1) strcat(buf, "U"); if (flags & 0x2) strcat(buf, "G"); if (flags & 0x4) strcat(buf, "H"); if (flags & 0x8) strcat(buf, "r"); if (flags & 0x10) strcat(buf, "d"); #ifdef NEVER if (flags & 0x20) strcat(buf, "mod,"); #endif /*NEVER*/ if (flags & 0x100) strcat(buf, "C"); if (flags & 0x400) strcat(buf, "L"); if (flags & 0x800) strcat(buf, "S"); if (flags & 0x10000) strcat(buf, "c"); if (flags & 0x20000) strcat(buf, "W"); #ifdef NEVER if (flags & 0x200000) strcat(buf, ",LOC"); #endif /*NEVER*/ if (flags & 0x400000) strcat(buf, "b"); #ifdef NEVER if (flags & 0x800000) strcat(buf, ",MCA"); #endif /*NEVER*/ } int print_routing(char *proto) { int mib[6]; int i = 0; int rt_len; int if_len; int if_num; char 
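
The ROUNDUP()/NEXT_SA() macros in this file implement the usual routing-socket rule that each sockaddr is padded to a multiple of sizeof(u_long) (and an sa_len of zero still consumes one long). A tiny worked example of the bit trick, assuming a 64-bit u_long:

	#include <stdio.h>

	/* Same expression as in ns.c: round "a" up to a multiple of "size",
	 * where "size" is a power of two. */
	#define ROUNDUP(a, size) \
		(((a) & ((size) - 1)) ? (1 + ((a) | ((size) - 1))) : (a))

	int
	main(void)
	{
		/* expected output: 8 8 16 */
		printf("%d %d %d\n", ROUNDUP(5, 8), ROUNDUP(8, 8),
		    ROUNDUP(16, 8));
		return (0);
	}
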
*rt_buf; char *if_buf; char *next; char *lim; struct rt_msghdr *rtm; struct if_msghdr *ifm; struct if_msghdr **ifm_table; struct ifa_msghdr *ifam; struct sockaddr *sa; struct sockaddr *sa1; struct sockaddr *rti_info[RTAX_MAX]; struct sockaddr **if_table; struct rt_metrics rm; char fbuf[50]; /* keep a copy of statistics here for future use */ static unsigned *base_stats = NULL ; static unsigned base_len = 0 ; /* Get the routing table */ mib[0] = CTL_NET; mib[1] = PF_ROUTE; mib[2] = 0; mib[3] = 0; mib[4] = NET_RT_DUMP; mib[5] = 0; /*Estimate the size of table */ if (sysctl(mib, 6, NULL, &rt_len, NULL, 0) == -1) { perror("sysctl size"); exit(-1); } if ((rt_buf = (char *)malloc(rt_len)) == NULL) { perror("malloc"); exit(-1); } /* Now get it. */ if (sysctl(mib, 6, rt_buf, &rt_len, NULL, 0) == -1) { perror("sysctl get"); exit(-1); } /* Get the interfaces table */ mib[0] = CTL_NET; mib[1] = PF_ROUTE; mib[2] = 0; mib[3] = 0; mib[4] = NET_RT_IFLIST; mib[5] = 0; /* Estimate the size of table */ if (sysctl(mib, 6, NULL, &if_len, NULL, 0) == -1) { perror("sysctl size"); exit(-1); } if ((if_buf = (char *)malloc(if_len)) == NULL) { perror("malloc"); exit(-1); } /* Now get it. */ if (sysctl(mib, 6, if_buf, &if_len, NULL, 0) == -1) { perror("sysctl get"); exit(-1); } lim = if_buf + if_len; i = 0; for (next = if_buf, i = 0; next < lim; next += ifm->ifm_msglen) { ifm = (struct if_msghdr *)next; i++; } if_num = i; if_table = (struct sockaddr **)calloc(i, sizeof(struct sockaddr)); ifm_table = (struct if_msghdr **)calloc(i, sizeof(struct if_msghdr)); if (iflag) { printf("\nInterface table:\n"); printf("----------------\n"); printf("Name Mtu Network Address " "Ipkts Ierrs Opkts Oerrs Coll\n"); } /* scan the list and store base values */ i = 0 ; for (next = if_buf; next < lim; next += ifm->ifm_msglen) { ifm = (struct if_msghdr *)next; i++ ; } if (base_stats == NULL || i != base_len) { base_stats = calloc(i*5, sizeof(unsigned)); base_len = i ; } i = 0; for (next = if_buf; next < lim; next += ifm->ifm_msglen) { ifm = (struct if_msghdr *)next; if_table[i] = (struct sockaddr *)(ifm + 1); ifm_table[i] = ifm; sa = if_table[i]; if (iflag && sa->sa_family == AF_LINK) { struct sockaddr_dl *sdl = (struct sockaddr_dl *)sa; unsigned *bp = &base_stats[i*5]; printf("%-4s %-5d ", sock_ntop(if_table[i], if_table[i]->sa_len), ifm->ifm_data.ifi_mtu); if (sdl->sdl_alen == 6) { unsigned char *p = sdl->sdl_data + sdl->sdl_nlen; printf("%02x:%02x:%02x:%02x:%02x:%02x ", p[0], p[1], p[2], p[3], p[4], p[5]); } else printf(" "); printf("%9d%6d%9d%6d%6d\n", ifm->ifm_data.ifi_ipackets - bp[0], ifm->ifm_data.ifi_ierrors - bp[1], ifm->ifm_data.ifi_opackets - bp[2], ifm->ifm_data.ifi_oerrors - bp[3], ifm->ifm_data.ifi_collisions -bp[4]); if (delta > 0) { bp[0] = ifm->ifm_data.ifi_ipackets ; bp[1] = ifm->ifm_data.ifi_ierrors ; bp[2] = ifm->ifm_data.ifi_opackets ; bp[3] = ifm->ifm_data.ifi_oerrors ; bp[4] = ifm->ifm_data.ifi_collisions ; } } i++; } if (!rflag) { free(rt_buf); free(if_buf); free(if_table); free(ifm_table); return; } /* Now dump the routing table */ printf("\nRouting table:\n"); printf("--------------\n"); printf ("Destination Gateway Flags Netif Use\n"); lim = rt_buf + rt_len; for (next = rt_buf; next < lim; next += rtm->rtm_msglen) { rtm = (struct rt_msghdr *)next; sa = (struct sockaddr *)(rtm + 1); get_rtaddrs(rtm->rtm_addrs, sa, rti_info); - if (rtm->rtm_flags & RTF_WASCLONED) { - if ((rtm->rtm_flags & RTF_LLINFO) == 0) - continue; - } if ((sa = rti_info[RTAX_DST]) != NULL) { sprintf(fbuf, "%s", sock_ntop(sa, sa->sa_len)); if 
(((sa1 = rti_info[RTAX_NETMASK]) != NULL) && sa1->sa_family == 255) { strcat(fbuf, sock_ntop(sa1, sa1->sa_len)); } printf("%-19s", fbuf); } if ((sa = rti_info[RTAX_GATEWAY]) != NULL) { printf("%-19s", sock_ntop(sa, sa->sa_len)); } memset(fbuf, 0, sizeof(fbuf)); get_flags(fbuf, rtm->rtm_flags); printf("%-10s", fbuf); for (i = 0; i < if_num; i++) { ifm = ifm_table[i]; if ((ifm->ifm_index == rtm->rtm_index) && (ifm->ifm_data.ifi_type > 0)) { sa = if_table[i]; break; } } if (ifm->ifm_type == RTM_IFINFO) { get_rtaddrs(ifm->ifm_addrs, sa, rti_info); printf(" %s", Sock_ntop(sa, sa->sa_len)); } else if (ifm->ifm_type == RTM_NEWADDR) { ifam = (struct ifa_msghdr *)ifm_table[rtm->rtm_index - 1]; sa = (struct sockaddr *)(ifam + 1); get_rtaddrs(ifam->ifam_addrs, sa, rti_info); printf(" %s", Sock_ntop(sa, sa->sa_len)); } printf(" %u", rtm->rtm_use); printf("\n"); } free(rt_buf); free(if_buf); free(if_table); free(ifm_table); return; } print_ip_stats() { int mib[4]; int len; struct ipstat s; mib[0] = CTL_NET; mib[1] = PF_INET; mib[2] = IPPROTO_IP; #ifndef IPCTL_STATS printf("sorry, ip stats not available\n"); return -1; #else mib[3] = IPCTL_STATS; len = sizeof(struct ipstat); if (sysctl(mib, 4, &s, &len, NULL, 0) < 0) { perror("sysctl"); return (-1); } printf("\nIP statistics:\n"); printf("--------------\n"); printf(" %10lu total packets received\n", s.ips_total); printf("* Packets ok:\n"); printf(" %10lu fragments received\n", s.ips_fragments); printf(" %10lu forwarded\n", s.ips_forward); #if __FreeBSD_version > 300001 printf(" %10lu fast forwarded\n", s.ips_fastforward); #endif printf(" %10lu forwarded on same net (redirect)\n", s.ips_redirectsent); printf(" %10lu delivered to upper level\n", s.ips_delivered); printf(" %10lu total ip packets generated here\n", s.ips_localout); printf(" %10lu total packets reassembled ok\n", s.ips_reassembled); printf(" %10lu total datagrams successfully fragmented\n", s.ips_fragmented); printf(" %10lu output fragments created\n", s.ips_ofragments); printf(" %10lu total raw IP packets generated\n", s.ips_rawout); printf("\n* Bad packets:\n"); printf(" %10lu bad checksum\n", s.ips_badsum); printf(" %10lu too short\n", s.ips_tooshort); printf(" %10lu not enough data (too small)\n", s.ips_toosmall); printf(" %10lu more data than declared in header\n", s.ips_badhlen); printf(" %10lu less data than declared in header\n", s.ips_badlen); printf(" %10lu fragments dropped (dups, no mbuf)\n", s.ips_fragdropped); printf(" %10lu fragments timed out in reassembly\n", s.ips_fragtimeout); printf(" %10lu received for unreachable dest.\n", s.ips_cantforward); printf(" %10lu unknown or unsupported protocol\n", s.ips_noproto); printf(" %10lu lost due to no bufs etc.\n", s.ips_odropped); printf(" %10lu couldn't fragment (DF set, etc.)\n", s.ips_cantfrag); printf(" %10lu error in IP options processing\n", s.ips_badoptions); printf(" %10lu dropped due to no route\n", s.ips_noroute); printf(" %10lu bad IP version\n", s.ips_badvers); printf(" %10lu too long (more than max IP size)\n", s.ips_toolong); #if __FreeBSD_version > 300001 printf(" %10lu multicast for unregistered groups\n", s.ips_notmember); #endif #endif } print_tcp_stats() { int mib[4]; int len; struct tcpstat s; mib[0] = CTL_NET; mib[1] = PF_INET; mib[2] = IPPROTO_TCP; #ifndef TCPCTL_STATS printf("sorry, tcp stats not available\n"); return -1; #else mib[3] = TCPCTL_STATS; len = sizeof(struct tcpstat); if (sysctl(mib, 4, &s, &len, NULL, 0) < 0) { perror("sysctl"); return (-1); } printf("\nTCP statistics:\n"); 
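	/*
	 * The numeric MIB above names the same data that the kernel also
	 * exports as the "net.inet.tcp.stats" string OID; the counters
	 * printed below are the raw fields of struct tcpstat.
	 */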
printf("---------------\n"); printf("* Connections:\n"); printf(" %10lu initiated\n", s.tcps_connattempt); printf(" %10lu accepted\n", s.tcps_accepts); printf(" %10lu established\n", s.tcps_connects); printf(" %10lu dropped\n", s.tcps_drops); printf(" %10lu embryonic connections dropped\n", s.tcps_conndrops); printf(" %10lu closed (includes dropped)\n", s.tcps_closed); printf(" %10lu segments where we tried to get RTT\n", s.tcps_segstimed); printf(" %10lu times RTT successfully updated\n", s.tcps_rttupdated); printf(" %10lu delayed ACKs sent\n", s.tcps_delack); printf(" %10lu dropped in rxmt timeout\n", s.tcps_timeoutdrop); printf(" %10lu retrasmit timeouts\n", s.tcps_rexmttimeo); printf(" %10lu persist timeouts\n", s.tcps_persisttimeo); printf(" %10lu keepalive timeouts\n", s.tcps_keeptimeo); printf(" %10lu keepalive probes sent\n", s.tcps_keepprobe); printf(" %10lu dropped in keepalive\n", s.tcps_keepdrops); printf("* Packets sent:\n"); printf(" %10lu total packets sent\n", s.tcps_sndtotal); printf(" %10lu data packets sent\n", s.tcps_sndpack); printf(" %10lu data bytes sent\n", s.tcps_sndbyte); printf(" %10lu data packets retransmitted\n", s.tcps_sndrexmitpack); printf(" %10lu data bytes retransmitted\n", s.tcps_sndrexmitbyte); printf(" %10lu ACK-only packets sent\n", s.tcps_sndacks); printf(" %10lu window probes sent\n", s.tcps_sndprobe); printf(" %10lu URG-only packets sent\n", s.tcps_sndurg); printf(" %10lu window update-only packets sent\n", s.tcps_sndwinup); printf(" %10lu control (SYN,FIN,RST) packets sent\n", s.tcps_sndctrl); printf("* Packets received:\n"); printf(" %10lu total packets received\n", s.tcps_rcvtotal); printf(" %10lu packets in sequence\n", s.tcps_rcvpack); printf(" %10lu bytes in sequence\n", s.tcps_rcvbyte); printf(" %10lu packets with bad checksum\n", s.tcps_rcvbadsum); printf(" %10lu packets with bad offset\n", s.tcps_rcvbadoff); printf(" %10lu packets too short\n", s.tcps_rcvshort); printf(" %10lu duplicate-only packets\n", s.tcps_rcvduppack); printf(" %10lu duplicate-only bytes\n", s.tcps_rcvdupbyte); printf(" %10lu packets with some duplicate data\n", s.tcps_rcvpartduppack); printf(" %10lu duplicate bytes in partially dup. 
packets\n", s.tcps_rcvpartdupbyte); printf(" %10lu out-of-order packets\n", s.tcps_rcvoopack); printf(" %10lu out-of-order bytes\n", s.tcps_rcvoobyte); printf(" %10lu packets with data after window\n", s.tcps_rcvpackafterwin); printf(" %10lu bytes received after window\n", s.tcps_rcvbyteafterwin); printf(" %10lu packets received after 'close'\n", s.tcps_rcvafterclose); printf(" %10lu window probe packets\n", s.tcps_rcvwinprobe); printf(" %10lu duplicate ACKs\n", s.tcps_rcvdupack); printf(" %10lu ACKs for unsent data\n", s.tcps_rcvacktoomuch); printf(" %10lu ACK packets\n", s.tcps_rcvackpack); printf(" %10lu bytes ACKed by received ACKs\n", s.tcps_rcvackbyte); printf(" %10lu window update packets\n", s.tcps_rcvwinupd); printf(" %10lu segments dropped due to PAWS\n", s.tcps_pawsdrop); printf(" %10lu times header predict ok for ACKs\n", s.tcps_predack); printf(" %10lu times header predict ok for data packets\n", s.tcps_preddat); printf(" %10lu PCB cache misses\n", s.tcps_pcbcachemiss); printf(" %10lu times cached RTT in route updated\n", s.tcps_cachedrtt); printf(" %10lu times cached RTTVAR updated\n", s.tcps_cachedrttvar); printf(" %10lu times ssthresh updated\n", s.tcps_cachedssthresh); printf(" %10lu times RTT initialized from route\n", s.tcps_usedrtt); printf(" %10lu times RTTVAR initialized from route\n", s.tcps_usedrttvar); printf(" %10lu times ssthresh initialized from route\n", s.tcps_usedssthresh); printf(" %10lu timeout in persist state\n", s.tcps_persistdrop); printf(" %10lu bogus SYN, e.g. premature ACK\n", s.tcps_badsyn); printf(" %10lu resends due to MTU discovery\n", s.tcps_mturesent); printf(" %10lu listen queue overflows\n", s.tcps_listendrop); #endif } print_udp_stats() { int mib[4]; int len; struct udpstat s; mib[0] = CTL_NET; mib[1] = PF_INET; mib[2] = IPPROTO_UDP; mib[3] = UDPCTL_STATS; len = sizeof(struct udpstat); if (sysctl(mib, 4, &s, &len, NULL, 0) < 0) { perror("sysctl"); return (-1); } printf("\nUDP statistics:\n"); printf("---------------\n"); printf("* Packets received:\n"); printf(" %10lu total input packets\n", s.udps_ipackets); printf(" %10lu packets shorter than header (dropped)\n", s.udps_hdrops); printf(" %10lu bad checksum\n", s.udps_badsum); printf(" %10lu data length larger than packet\n", s.udps_badlen); printf(" %10lu no socket on specified port\n", s.udps_noport); printf(" %10lu of above, arrived as broadcast\n", s.udps_noportbcast); printf(" %10lu not delivered, input socket full\n", s.udps_fullsock); printf(" %10lu packets missing PCB cache\n", s.udpps_pcbcachemiss); printf(" %10lu packets not for hashed PCBs\n", s.udpps_pcbhashmiss); printf("* Packets sent:\n"); printf(" %10lu total output packets\n", s.udps_opackets); #if __FreeBSD_version > 300001 printf(" %10lu output packets on fast path\n", s.udps_fastout); #endif } char *icmp_names[] = { "echo reply", "#1", "#2", "destination unreachable", "source quench", "routing redirect", "#6", "#7", "echo", "router advertisement", "router solicitation", "time exceeded", "parameter problem", "time stamp", "time stamp reply", "information request", "information request reply", "address mask request", "address mask reply", }; print_icmp_stats() { int mib[4]; int len; int i; struct icmpstat s; mib[0] = CTL_NET; mib[1] = PF_INET; mib[2] = IPPROTO_ICMP; mib[3] = ICMPCTL_STATS; len = sizeof(struct icmpstat); if (sysctl(mib, 4, &s, &len, NULL, 0) < 0) { perror("sysctl"); return (-1); } printf("\nICMP statistics:\n"); printf("----------------\n"); printf("* Output histogram:\n"); for (i = 0; i < (ICMP_MAXTYPE + 
1); i++) { if (s.icps_outhist[i] > 0) printf("\t%10lu %s\n", s.icps_outhist[i], icmp_names[i]); } printf("* Input histogram:\n"); for (i = 0; i < (ICMP_MAXTYPE + 1); i++) { if (s.icps_inhist[i] > 0) printf("\t%10lu %s\n", s.icps_inhist[i], icmp_names[i]); } printf("* Other stats:\n"); printf(" %10lu calls to icmp_error\n", s.icps_error); printf(" %10lu no error 'cuz old ip too short\n", s.icps_oldshort); printf(" %10lu no error 'cuz old was icmp\n", s.icps_oldicmp); printf(" %10lu icmp code out of range\n", s.icps_badcode); printf(" %10lu packets shorter than min length\n", s.icps_tooshort); printf(" %10lu bad checksum\n", s.icps_checksum); printf(" %10lu calculated bound mismatch\n", s.icps_badlen); printf(" %10lu number of responses\n", s.icps_reflect); printf(" %10lu broad/multi-cast echo requests dropped\n", s.icps_bmcastecho); printf(" %10lu broad/multi-cast timestamp requests dropped\n", s.icps_bmcasttstamp); } int stats(char *proto) { if (!sflag) return 0; if (pflag) { if (proto == NULL) { fprintf(stderr, "Option '-p' requires paramter.\n"); usage(); exit(-1); } if (strcmp(proto, "ip") == 0) print_ip_stats(); if (strcmp(proto, "icmp") == 0) print_icmp_stats(); if (strcmp(proto, "udp") == 0) print_udp_stats(); if (strcmp(proto, "tcp") == 0) print_tcp_stats(); return (0); } print_ip_stats(); print_icmp_stats(); print_udp_stats(); print_tcp_stats(); return (0); } int main(int argc, char *argv[]) { char c; char *proto = NULL; progname = argv[0]; while ((c = getopt(argc, argv, "dilnrsp:w:")) != -1) { switch (c) { case 'd': /* print deltas in stats every w seconds */ delta++ ; break; case 'w': wflag = atoi(optarg); break; case 'n': /* ignored, just for compatibility with std netstat */ break; case 'r': rflag++; break; case 'i': iflag++; break; case 'l': lflag++; break; case 's': sflag++; rflag = 0; break; case 'p': pflag++; sflag++; proto = optarg; break; case '?': default: usage(); exit(0); break; } } if (rflag == 0 && sflag == 0 && iflag == 0) rflag = 1; argc -= optind; if (argc > 0) { usage(); exit(-1); } if (wflag) printf("\033[H\033[J"); again: if (wflag) { struct timeval t; gettimeofday(&t, NULL); printf("\033[H%s", ctime(&t.tv_sec)); } print_routing(proto); print_load_stats(); stats(proto); if (wflag) { sleep(wflag); goto again; } exit(0); } int print_load_stats(void) { static u_int32_t cp_time[5]; u_int32_t new_cp_time[5]; int l; int shz; static int stathz ; if (!lflag || !wflag) return 0; l = sizeof(new_cp_time) ; bzero(new_cp_time, l); if (sysctlbyname("kern.cp_time", new_cp_time, &l, NULL, 0) < 0) { warn("sysctl: retrieving cp_time length"); return 0; } if (stathz == 0) { struct clockinfo ci; bzero (&ci, sizeof(ci)); l = sizeof(ci) ; if (sysctlbyname("kern.clockrate", &ci, &l, NULL, 0) < 0) { warn("sysctl: retrieving clockinfo length"); return 0; } stathz = ci.stathz ; bcopy(new_cp_time, cp_time, sizeof(cp_time)); } shz = stathz * wflag ; if (shz == 0) shz = 1; #define X(i) ( (double)(new_cp_time[i] - cp_time[i])*100/shz ) printf("\nUSER %5.2f%% NICE %5.2f%% SYS %5.2f%% " "INTR %5.2f%% IDLE %5.2f%%\n", X(0), X(1), X(2), X(3), X(4) ); bcopy(new_cp_time, cp_time, sizeof(cp_time)); } Index: head/sbin/route/route.c =================================================================== --- head/sbin/route/route.c (revision 186118) +++ head/sbin/route/route.c (revision 186119) @@ -1,1667 +1,1661 @@ /* * Copyright (c) 1983, 1989, 1991, 1993 * The Regents of the University of California. All rights reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #ifndef lint static const char copyright[] = "@(#) Copyright (c) 1983, 1989, 1991, 1993\n\ The Regents of the University of California. All rights reserved.\n"; #endif /* not lint */ #ifndef lint #if 0 static char sccsid[] = "@(#)route.c 8.6 (Berkeley) 4/28/95"; #endif static const char rcsid[] = "$FreeBSD$"; #endif /* not lint */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include struct keytab { char *kt_cp; int kt_i; } keywords[] = { #include "keywords.h" {0, 0} }; struct ortentry route; union sockunion { struct sockaddr sa; struct sockaddr_in sin; #ifdef INET6 struct sockaddr_in6 sin6; #endif struct sockaddr_at sat; struct sockaddr_dl sdl; struct sockaddr_inarp sinarp; struct sockaddr_storage ss; /* added to avoid memory overrun */ } so_dst, so_gate, so_mask, so_genmask, so_ifa, so_ifp; typedef union sockunion *sup; int pid, rtm_addrs; int s; int forcehost, forcenet, doflush, nflag, af, qflag, tflag, keyword(); int iflag, verbose, aflen = sizeof (struct sockaddr_in); int locking, lockrest, debugonly; struct rt_metrics rt_metrics; u_long rtm_inits; uid_t uid; int atalk_aton(const char *, struct at_addr *); char *atalk_ntoa(struct at_addr); const char *routename(), *netname(); void flushroutes(), newroute(), monitor(), sockaddr(), sodump(), bprintf(); void print_getmsg(), print_rtmsg(), pmsg_common(), pmsg_addrs(), mask_addr(); #ifdef INET6 static int inet6_makenetandmask(struct sockaddr_in6 *, char *); #endif int getaddr(), rtmsg(), x25_makemask(); int prefixlen(); extern char *iso_ntoa(); void usage(const char *) __dead2; void usage(cp) const char *cp; { if (cp) warnx("bad keyword: %s", cp); (void) fprintf(stderr, "usage: route [-dnqtv] command [[modifiers] args]\n"); exit(EX_USAGE); /* NOTREACHED */ } int main(argc, argv) int argc; char **argv; { int ch; if (argc < 2) usage((char *)NULL); while ((ch = getopt(argc, argv, "nqdtv")) != -1) switch(ch) { case 'n': nflag = 1; break; case 'q': qflag = 1; 
break; case 'v': verbose = 1; break; case 't': tflag = 1; break; case 'd': debugonly = 1; break; case '?': default: usage((char *)NULL); } argc -= optind; argv += optind; pid = getpid(); uid = geteuid(); if (tflag) s = open(_PATH_DEVNULL, O_WRONLY, 0); else s = socket(PF_ROUTE, SOCK_RAW, 0); if (s < 0) err(EX_OSERR, "socket"); if (*argv) switch (keyword(*argv)) { case K_GET: uid = 0; /* FALLTHROUGH */ case K_CHANGE: case K_ADD: case K_DEL: case K_DELETE: newroute(argc, argv); /* NOTREACHED */ case K_MONITOR: monitor(); /* NOTREACHED */ case K_FLUSH: flushroutes(argc, argv); exit(0); /* NOTREACHED */ } usage(*argv); /* NOTREACHED */ } /* * Purge all entries in the routing tables not * associated with network interfaces. */ void flushroutes(argc, argv) int argc; char *argv[]; { size_t needed; int mib[6], rlen, seqno, count = 0; char *buf, *next, *lim; struct rt_msghdr *rtm; if (uid && !debugonly) { errx(EX_NOPERM, "must be root to alter routing table"); } shutdown(s, SHUT_RD); /* Don't want to read back our messages */ if (argc > 1) { argv++; if (argc == 2 && **argv == '-') switch (keyword(*argv + 1)) { case K_INET: af = AF_INET; break; #ifdef INET6 case K_INET6: af = AF_INET6; break; #endif case K_ATALK: af = AF_APPLETALK; break; case K_LINK: af = AF_LINK; break; default: goto bad; } else bad: usage(*argv); } retry: mib[0] = CTL_NET; mib[1] = PF_ROUTE; mib[2] = 0; /* protocol */ mib[3] = 0; /* wildcard address family */ mib[4] = NET_RT_DUMP; mib[5] = 0; /* no flags */ if (sysctl(mib, 6, NULL, &needed, NULL, 0) < 0) err(EX_OSERR, "route-sysctl-estimate"); if ((buf = malloc(needed)) == NULL) errx(EX_OSERR, "malloc failed"); if (sysctl(mib, 6, buf, &needed, NULL, 0) < 0) { if (errno == ENOMEM && count++ < 10) { warnx("Routing table grew, retrying"); sleep(1); free(buf); goto retry; } err(EX_OSERR, "route-sysctl-get"); } lim = buf + needed; if (verbose) (void) printf("Examining routing table from sysctl\n"); seqno = 0; /* ??? */ for (next = buf; next < lim; next += rtm->rtm_msglen) { rtm = (struct rt_msghdr *)next; if (verbose) print_rtmsg(rtm, rtm->rtm_msglen); if ((rtm->rtm_flags & RTF_GATEWAY) == 0) continue; if (af) { struct sockaddr *sa = (struct sockaddr *)(rtm + 1); if (sa->sa_family != af) continue; } if (debugonly) continue; rtm->rtm_type = RTM_DELETE; rtm->rtm_seq = seqno; rlen = write(s, next, rtm->rtm_msglen); if (rlen < 0 && errno == EPERM) err(1, "write to routing socket"); if (rlen < (int)rtm->rtm_msglen) { warn("write to routing socket"); (void) printf("got only %d for rlen\n", rlen); free(buf); goto retry; break; } seqno++; if (qflag) continue; if (verbose) print_rtmsg(rtm, rlen); else { struct sockaddr *sa = (struct sockaddr *)(rtm + 1); (void) printf("%-20.20s ", rtm->rtm_flags & RTF_HOST ? 
routename(sa) : netname(sa)); sa = (struct sockaddr *)(SA_SIZE(sa) + (char *)sa); (void) printf("%-20.20s ", routename(sa)); (void) printf("done\n"); } } } const char * routename(sa) struct sockaddr *sa; { char *cp; static char line[MAXHOSTNAMELEN + 1]; struct hostent *hp; static char domain[MAXHOSTNAMELEN + 1]; static int first = 1, n; if (first) { first = 0; if (gethostname(domain, MAXHOSTNAMELEN) == 0 && (cp = strchr(domain, '.'))) { domain[MAXHOSTNAMELEN] = '\0'; (void) strcpy(domain, cp + 1); } else domain[0] = 0; } if (sa->sa_len == 0) strcpy(line, "default"); else switch (sa->sa_family) { case AF_INET: { struct in_addr in; in = ((struct sockaddr_in *)sa)->sin_addr; cp = 0; if (in.s_addr == INADDR_ANY || sa->sa_len < 4) cp = "default"; if (cp == 0 && !nflag) { hp = gethostbyaddr((char *)&in, sizeof (struct in_addr), AF_INET); if (hp) { if ((cp = strchr(hp->h_name, '.')) && !strcmp(cp + 1, domain)) *cp = 0; cp = hp->h_name; } } if (cp) { strncpy(line, cp, sizeof(line) - 1); line[sizeof(line) - 1] = '\0'; } else (void) sprintf(line, "%s", inet_ntoa(in)); break; } #ifdef INET6 case AF_INET6: { struct sockaddr_in6 sin6; /* use static var for safety */ int niflags = 0; memset(&sin6, 0, sizeof(sin6)); memcpy(&sin6, sa, sa->sa_len); sin6.sin6_len = sizeof(struct sockaddr_in6); sin6.sin6_family = AF_INET6; #ifdef __KAME__ if (sa->sa_len == sizeof(struct sockaddr_in6) && (IN6_IS_ADDR_LINKLOCAL(&sin6.sin6_addr) || IN6_IS_ADDR_MC_LINKLOCAL(&sin6.sin6_addr)) && sin6.sin6_scope_id == 0) { sin6.sin6_scope_id = ntohs(*(u_int16_t *)&sin6.sin6_addr.s6_addr[2]); sin6.sin6_addr.s6_addr[2] = 0; sin6.sin6_addr.s6_addr[3] = 0; } #endif if (nflag) niflags |= NI_NUMERICHOST; if (getnameinfo((struct sockaddr *)&sin6, sin6.sin6_len, line, sizeof(line), NULL, 0, niflags) != 0) strncpy(line, "invalid", sizeof(line)); return(line); } #endif case AF_APPLETALK: (void) snprintf(line, sizeof(line), "atalk %s", atalk_ntoa(((struct sockaddr_at *)sa)->sat_addr)); break; case AF_LINK: return (link_ntoa((struct sockaddr_dl *)sa)); default: { u_short *s = (u_short *)sa; u_short *slim = s + ((sa->sa_len + 1) >> 1); char *cp = line + sprintf(line, "(%d)", sa->sa_family); char *cpe = line + sizeof(line); while (++s < slim && cp < cpe) /* start with sa->sa_data */ if ((n = snprintf(cp, cpe - cp, " %x", *s)) > 0) cp += n; else *cp = '\0'; break; } } return (line); } /* * Return the name of the network whose address is given. * The address is assumed to be that of a net or subnet, not a host. */ const char * netname(sa) struct sockaddr *sa; { char *cp = 0; static char line[MAXHOSTNAMELEN + 1]; struct netent *np = 0; u_long net, mask; u_long i; int n, subnetshift; switch (sa->sa_family) { case AF_INET: { struct in_addr in; in = ((struct sockaddr_in *)sa)->sin_addr; i = in.s_addr = ntohl(in.s_addr); if (in.s_addr == 0) cp = "default"; else if (!nflag) { if (IN_CLASSA(i)) { mask = IN_CLASSA_NET; subnetshift = 8; } else if (IN_CLASSB(i)) { mask = IN_CLASSB_NET; subnetshift = 8; } else { mask = IN_CLASSC_NET; subnetshift = 4; } /* * If there are more bits than the standard mask * would suggest, subnets must be in use. * Guess at the subnet mask, assuming reasonable * width subnet fields. 
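			 * For example, 172.16.10.0 is class B, but 0x0a00
			 * survives the standard 0xffff0000 mask, so the mask
			 * is extended to 0xffffff00 (a /24) before
			 * getnetbyaddr() is consulted.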
*/ while (in.s_addr &~ mask) mask = (long)mask >> subnetshift; net = in.s_addr & mask; while ((mask & 1) == 0) mask >>= 1, net >>= 1; np = getnetbyaddr(net, AF_INET); if (np) cp = np->n_name; } #define C(x) (unsigned)((x) & 0xff) if (cp) strncpy(line, cp, sizeof(line)); else if ((in.s_addr & 0xffffff) == 0) (void) sprintf(line, "%u", C(in.s_addr >> 24)); else if ((in.s_addr & 0xffff) == 0) (void) sprintf(line, "%u.%u", C(in.s_addr >> 24), C(in.s_addr >> 16)); else if ((in.s_addr & 0xff) == 0) (void) sprintf(line, "%u.%u.%u", C(in.s_addr >> 24), C(in.s_addr >> 16), C(in.s_addr >> 8)); else (void) sprintf(line, "%u.%u.%u.%u", C(in.s_addr >> 24), C(in.s_addr >> 16), C(in.s_addr >> 8), C(in.s_addr)); #undef C break; } #ifdef INET6 case AF_INET6: { struct sockaddr_in6 sin6; /* use static var for safety */ int niflags = 0; memset(&sin6, 0, sizeof(sin6)); memcpy(&sin6, sa, sa->sa_len); sin6.sin6_len = sizeof(struct sockaddr_in6); sin6.sin6_family = AF_INET6; #ifdef __KAME__ if (sa->sa_len == sizeof(struct sockaddr_in6) && (IN6_IS_ADDR_LINKLOCAL(&sin6.sin6_addr) || IN6_IS_ADDR_MC_LINKLOCAL(&sin6.sin6_addr)) && sin6.sin6_scope_id == 0) { sin6.sin6_scope_id = ntohs(*(u_int16_t *)&sin6.sin6_addr.s6_addr[2]); sin6.sin6_addr.s6_addr[2] = 0; sin6.sin6_addr.s6_addr[3] = 0; } #endif if (nflag) niflags |= NI_NUMERICHOST; if (getnameinfo((struct sockaddr *)&sin6, sin6.sin6_len, line, sizeof(line), NULL, 0, niflags) != 0) strncpy(line, "invalid", sizeof(line)); return(line); } #endif case AF_APPLETALK: (void) snprintf(line, sizeof(line), "atalk %s", atalk_ntoa(((struct sockaddr_at *)sa)->sat_addr)); break; case AF_LINK: return (link_ntoa((struct sockaddr_dl *)sa)); default: { u_short *s = (u_short *)sa->sa_data; u_short *slim = s + ((sa->sa_len + 1)>>1); char *cp = line + sprintf(line, "af %d:", sa->sa_family); char *cpe = line + sizeof(line); while (s < slim && cp < cpe) if ((n = snprintf(cp, cpe - cp, " %x", *s++)) > 0) cp += n; else *cp = '\0'; break; } } return (line); } void set_metric(value, key) char *value; int key; { int flag = 0; u_long noval, *valp = &noval; switch (key) { #define caseof(x, y, z) case x: valp = &rt_metrics.z; flag = y; break caseof(K_MTU, RTV_MTU, rmx_mtu); caseof(K_HOPCOUNT, RTV_HOPCOUNT, rmx_hopcount); caseof(K_EXPIRE, RTV_EXPIRE, rmx_expire); caseof(K_RECVPIPE, RTV_RPIPE, rmx_recvpipe); caseof(K_SENDPIPE, RTV_SPIPE, rmx_sendpipe); caseof(K_SSTHRESH, RTV_SSTHRESH, rmx_ssthresh); caseof(K_RTT, RTV_RTT, rmx_rtt); caseof(K_RTTVAR, RTV_RTTVAR, rmx_rttvar); } rtm_inits |= flag; if (lockrest || locking) rt_metrics.rmx_locks |= flag; if (locking) locking = 0; *valp = atoi(value); } void newroute(argc, argv) int argc; char **argv; { char *cmd, *dest = "", *gateway = "", *err; int ishost = 0, proxy = 0, ret, attempts, oerrno, flags = RTF_STATIC; int key; struct hostent *hp = 0; if (uid) { errx(EX_NOPERM, "must be root to alter routing table"); } cmd = argv[0]; if (*cmd != 'g') shutdown(s, SHUT_RD); /* Don't want to read back our messages */ while (--argc > 0) { if (**(++argv)== '-') { switch (key = keyword(1 + *argv)) { case K_LINK: af = AF_LINK; aflen = sizeof(struct sockaddr_dl); break; case K_INET: af = AF_INET; aflen = sizeof(struct sockaddr_in); break; #ifdef INET6 case K_INET6: af = AF_INET6; aflen = sizeof(struct sockaddr_in6); break; #endif case K_ATALK: af = AF_APPLETALK; aflen = sizeof(struct sockaddr_at); break; case K_SA: af = PF_ROUTE; aflen = sizeof(union sockunion); break; case K_IFACE: case K_INTERFACE: iflag++; break; case K_NOSTATIC: flags &= ~RTF_STATIC; break; - case 
K_LLINFO: - flags |= RTF_LLINFO; - break; case K_LOCK: locking = 1; break; case K_LOCKREST: lockrest = 1; break; case K_HOST: forcehost++; break; case K_REJECT: flags |= RTF_REJECT; break; case K_BLACKHOLE: flags |= RTF_BLACKHOLE; break; case K_PROTO1: flags |= RTF_PROTO1; break; case K_PROTO2: flags |= RTF_PROTO2; break; case K_PROXY: proxy = 1; - break; - case K_CLONING: - flags |= RTF_CLONING; break; case K_XRESOLVE: flags |= RTF_XRESOLVE; break; case K_STATIC: flags |= RTF_STATIC; break; case K_IFA: if (!--argc) usage((char *)NULL); (void) getaddr(RTA_IFA, *++argv, 0); break; case K_IFP: if (!--argc) usage((char *)NULL); (void) getaddr(RTA_IFP, *++argv, 0); break; case K_GENMASK: if (!--argc) usage((char *)NULL); (void) getaddr(RTA_GENMASK, *++argv, 0); break; case K_GATEWAY: if (!--argc) usage((char *)NULL); (void) getaddr(RTA_GATEWAY, *++argv, 0); break; case K_DST: if (!--argc) usage((char *)NULL); ishost = getaddr(RTA_DST, *++argv, &hp); dest = *argv; break; case K_NETMASK: if (!--argc) usage((char *)NULL); (void) getaddr(RTA_NETMASK, *++argv, 0); /* FALLTHROUGH */ case K_NET: forcenet++; break; case K_PREFIXLEN: if (!--argc) usage((char *)NULL); if (prefixlen(*++argv) == -1) { forcenet = 0; ishost = 1; } else { forcenet = 1; ishost = 0; } break; case K_MTU: case K_HOPCOUNT: case K_EXPIRE: case K_RECVPIPE: case K_SENDPIPE: case K_SSTHRESH: case K_RTT: case K_RTTVAR: if (!--argc) usage((char *)NULL); set_metric(*++argv, key); break; default: usage(1+*argv); } } else { if ((rtm_addrs & RTA_DST) == 0) { dest = *argv; ishost = getaddr(RTA_DST, *argv, &hp); } else if ((rtm_addrs & RTA_GATEWAY) == 0) { gateway = *argv; (void) getaddr(RTA_GATEWAY, *argv, &hp); } else { (void) getaddr(RTA_NETMASK, *argv, 0); forcenet = 1; } } } if (forcehost) { ishost = 1; #ifdef INET6 if (af == AF_INET6) { rtm_addrs &= ~RTA_NETMASK; memset((void *)&so_mask, 0, sizeof(so_mask)); } #endif } if (forcenet) ishost = 0; flags |= RTF_UP; if (ishost) flags |= RTF_HOST; if (iflag == 0) flags |= RTF_GATEWAY; if (proxy) { so_dst.sinarp.sin_other = SIN_PROXY; flags |= RTF_ANNOUNCE; } for (attempts = 1; ; attempts++) { errno = 0; if ((ret = rtmsg(*cmd, flags)) == 0) break; if (errno != ENETUNREACH && errno != ESRCH) break; if (af == AF_INET && *gateway && hp && hp->h_addr_list[1]) { hp->h_addr_list++; memmove(&so_gate.sin.sin_addr, hp->h_addr_list[0], MIN(hp->h_length, sizeof(so_gate.sin.sin_addr))); } else break; } if (*cmd == 'g') exit(ret != 0); if (!qflag) { oerrno = errno; (void) printf("%s %s %s", cmd, ishost? 
"host" : "net", dest); if (*gateway) { (void) printf(": gateway %s", gateway); if (attempts > 1 && ret == 0 && af == AF_INET) (void) printf(" (%s)", inet_ntoa(((struct sockaddr_in *)&route.rt_gateway)->sin_addr)); } if (ret == 0) { (void) printf("\n"); } else { switch (oerrno) { case ESRCH: err = "not in table"; break; case EBUSY: err = "entry in use"; break; case ENOBUFS: err = "not enough memory"; break; case EADDRINUSE: /* handle recursion avoidance in rt_setgate() */ err = "gateway uses the same route"; break; case EEXIST: err = "route already in table"; break; default: err = strerror(oerrno); break; } (void) printf(": %s\n", err); } } exit(ret != 0); } void inet_makenetandmask(net, sin, bits) u_long net, bits; struct sockaddr_in *sin; { u_long addr, mask = 0; char *cp; rtm_addrs |= RTA_NETMASK; if (net == 0) mask = addr = 0; else { if (net <= 0xff) addr = net << IN_CLASSA_NSHIFT; else if (net <= 0xffff) addr = net << IN_CLASSB_NSHIFT; else if (net <= 0xffffff) addr = net << IN_CLASSC_NSHIFT; else addr = net; if (bits != 0) mask = 0xffffffff << (32 - bits); else { if (IN_CLASSA(addr)) mask = IN_CLASSA_NET; else if (IN_CLASSB(addr)) mask = IN_CLASSB_NET; else if (IN_CLASSC(addr)) mask = IN_CLASSC_NET; else if (IN_MULTICAST(addr)) mask = IN_CLASSD_NET; else mask = 0xffffffff; } } sin->sin_addr.s_addr = htonl(addr); sin = &so_mask.sin; sin->sin_addr.s_addr = htonl(mask); sin->sin_len = 0; sin->sin_family = 0; cp = (char *)(&sin->sin_addr + 1); while (*--cp == 0 && cp > (char *)sin) ; sin->sin_len = 1 + cp - (char *)sin; } #ifdef INET6 /* * XXX the function may need more improvement... */ static int inet6_makenetandmask(sin6, plen) struct sockaddr_in6 *sin6; char *plen; { struct in6_addr in6; if (!plen) { if (IN6_IS_ADDR_UNSPECIFIED(&sin6->sin6_addr) && sin6->sin6_scope_id == 0) { plen = "0"; } else if ((sin6->sin6_addr.s6_addr[0] & 0xe0) == 0x20) { /* aggregatable global unicast - RFC2374 */ memset(&in6, 0, sizeof(in6)); if (!memcmp(&sin6->sin6_addr.s6_addr[8], &in6.s6_addr[8], 8)) plen = "64"; } } if (!plen || strcmp(plen, "128") == 0) return 1; rtm_addrs |= RTA_NETMASK; (void)prefixlen(plen); return 0; } #endif /* * Interpret an argument as a network address of some kind, * returning 1 if a host address, 0 if a network address. */ int getaddr(which, s, hpp) int which; char *s; struct hostent **hpp; { sup su; struct hostent *hp; struct netent *np; u_long val; char *q; int afamily; /* local copy of af so we can change it */ if (af == 0) { af = AF_INET; aflen = sizeof(struct sockaddr_in); } afamily = af; rtm_addrs |= which; switch (which) { case RTA_DST: su = &so_dst; break; case RTA_GATEWAY: su = &so_gate; if (iflag) { struct ifaddrs *ifap, *ifa; struct sockaddr_dl *sdl = NULL; if (getifaddrs(&ifap)) err(1, "getifaddrs"); for (ifa = ifap; ifa; ifa = ifa->ifa_next) { if (ifa->ifa_addr->sa_family != AF_LINK) continue; if (strcmp(s, ifa->ifa_name)) continue; sdl = (struct sockaddr_dl *)ifa->ifa_addr; } /* If we found it, then use it */ if (sdl) { /* * Copy is safe since we have a * sockaddr_storage member in sockunion{}. * Note that we need to copy before calling * freeifaddrs(). 
*/ memcpy(&su->sdl, sdl, sdl->sdl_len); } freeifaddrs(ifap); if (sdl) return(1); } break; case RTA_NETMASK: su = &so_mask; break; case RTA_GENMASK: su = &so_genmask; break; case RTA_IFP: su = &so_ifp; afamily = AF_LINK; break; case RTA_IFA: su = &so_ifa; break; default: usage("internal error"); /*NOTREACHED*/ } su->sa.sa_len = aflen; su->sa.sa_family = afamily; /* cases that don't want it have left already */ if (strcmp(s, "default") == 0) { /* * Default is net 0.0.0.0/0 */ switch (which) { case RTA_DST: forcenet++; #if 0 bzero(su, sizeof(*su)); /* for readability */ #endif (void) getaddr(RTA_NETMASK, s, 0); break; #if 0 case RTA_NETMASK: case RTA_GENMASK: bzero(su, sizeof(*su)); /* for readability */ #endif } return (0); } switch (afamily) { #ifdef INET6 case AF_INET6: { struct addrinfo hints, *res; int ecode; q = NULL; if (which == RTA_DST && (q = strchr(s, '/')) != NULL) *q = '\0'; memset(&hints, 0, sizeof(hints)); hints.ai_family = afamily; /*AF_INET6*/ hints.ai_socktype = SOCK_DGRAM; /*dummy*/ ecode = getaddrinfo(s, NULL, &hints, &res); if (ecode != 0 || res->ai_family != AF_INET6 || res->ai_addrlen != sizeof(su->sin6)) { (void) fprintf(stderr, "%s: %s\n", s, gai_strerror(ecode)); exit(1); } memcpy(&su->sin6, res->ai_addr, sizeof(su->sin6)); #ifdef __KAME__ if ((IN6_IS_ADDR_LINKLOCAL(&su->sin6.sin6_addr) || IN6_IS_ADDR_MC_LINKLOCAL(&su->sin6.sin6_addr)) && su->sin6.sin6_scope_id) { *(u_int16_t *)&su->sin6.sin6_addr.s6_addr[2] = htons(su->sin6.sin6_scope_id); su->sin6.sin6_scope_id = 0; } #endif freeaddrinfo(res); if (q != NULL) *q++ = '/'; if (which == RTA_DST) return (inet6_makenetandmask(&su->sin6, q)); return (0); } #endif /* INET6 */ case AF_APPLETALK: if (!atalk_aton(s, &su->sat.sat_addr)) errx(EX_NOHOST, "bad address: %s", s); rtm_addrs |= RTA_NETMASK; return(forcehost || su->sat.sat_addr.s_node != 0); case AF_LINK: link_addr(s, &su->sdl); return (1); case PF_ROUTE: su->sa.sa_len = sizeof(*su); sockaddr(s, &su->sa); return (1); case AF_INET: default: break; } if (hpp == NULL) hpp = &hp; *hpp = NULL; q = strchr(s,'/'); if (q && which == RTA_DST) { *q = '\0'; if ((val = inet_network(s)) != INADDR_NONE) { inet_makenetandmask( val, &su->sin, strtoul(q+1, 0, 0)); return (0); } *q = '/'; } if ((which != RTA_DST || forcenet == 0) && inet_aton(s, &su->sin.sin_addr)) { val = su->sin.sin_addr.s_addr; if (which != RTA_DST || forcehost || inet_lnaof(su->sin.sin_addr) != INADDR_ANY) return (1); else { val = ntohl(val); goto netdone; } } if (which == RTA_DST && forcehost == 0 && ((val = inet_network(s)) != INADDR_NONE || ((np = getnetbyname(s)) != NULL && (val = np->n_net) != 0))) { netdone: inet_makenetandmask(val, &su->sin, 0); return (0); } hp = gethostbyname(s); if (hp) { *hpp = hp; su->sin.sin_family = hp->h_addrtype; memmove((char *)&su->sin.sin_addr, hp->h_addr, MIN(hp->h_length, sizeof(su->sin.sin_addr))); return (1); } errx(EX_NOHOST, "bad address: %s", s); } int prefixlen(s) char *s; { int len = atoi(s), q, r; int max; char *p; rtm_addrs |= RTA_NETMASK; switch (af) { #ifdef INET6 case AF_INET6: max = 128; p = (char *)&so_mask.sin6.sin6_addr; break; #endif case AF_INET: max = 32; p = (char *)&so_mask.sin.sin_addr; break; default: (void) fprintf(stderr, "prefixlen not supported in this af\n"); exit(1); /*NOTREACHED*/ } if (len < 0 || max < len) { (void) fprintf(stderr, "%s: bad value\n", s); exit(1); } q = len >> 3; r = len & 7; so_mask.sa.sa_family = af; so_mask.sa.sa_len = aflen; memset((void *)p, 0, max / 8); if (q > 0) memset((void *)p, 0xff, q); if (r > 0) *((u_char *)p + q) = 
(0xff00 >> r) & 0xff; if (len == max) return -1; else return len; } void interfaces() { size_t needed; int mib[6]; char *buf, *lim, *next, count = 0; struct rt_msghdr *rtm; retry2: mib[0] = CTL_NET; mib[1] = PF_ROUTE; mib[2] = 0; /* protocol */ mib[3] = 0; /* wildcard address family */ mib[4] = NET_RT_IFLIST; mib[5] = 0; /* no flags */ if (sysctl(mib, 6, NULL, &needed, NULL, 0) < 0) err(EX_OSERR, "route-sysctl-estimate"); if ((buf = malloc(needed)) == NULL) errx(EX_OSERR, "malloc failed"); if (sysctl(mib, 6, buf, &needed, NULL, 0) < 0) { if (errno == ENOMEM && count++ < 10) { warnx("Routing table grew, retrying"); sleep(1); free(buf); goto retry2; } err(EX_OSERR, "actual retrieval of interface table"); } lim = buf + needed; for (next = buf; next < lim; next += rtm->rtm_msglen) { rtm = (struct rt_msghdr *)next; print_rtmsg(rtm, rtm->rtm_msglen); } } void monitor() { int n; char msg[2048]; verbose = 1; if (debugonly) { interfaces(); exit(0); } for(;;) { time_t now; n = read(s, msg, 2048); now = time(NULL); (void) printf("\ngot message of size %d on %s", n, ctime(&now)); print_rtmsg((struct rt_msghdr *)msg, n); } } struct { struct rt_msghdr m_rtm; char m_space[512]; } m_rtmsg; int rtmsg(cmd, flags) int cmd, flags; { static int seq; int rlen; char *cp = m_rtmsg.m_space; int l; #define NEXTADDR(w, u) \ if (rtm_addrs & (w)) {\ l = SA_SIZE(&(u.sa)); memmove(cp, &(u), l); cp += l;\ if (verbose) sodump(&(u),#u);\ } errno = 0; memset(&m_rtmsg, 0, sizeof(m_rtmsg)); if (cmd == 'a') cmd = RTM_ADD; else if (cmd == 'c') cmd = RTM_CHANGE; else if (cmd == 'g') { cmd = RTM_GET; if (so_ifp.sa.sa_family == 0) { so_ifp.sa.sa_family = AF_LINK; so_ifp.sa.sa_len = sizeof(struct sockaddr_dl); rtm_addrs |= RTA_IFP; } } else cmd = RTM_DELETE; #define rtm m_rtmsg.m_rtm rtm.rtm_type = cmd; rtm.rtm_flags = flags; rtm.rtm_version = RTM_VERSION; rtm.rtm_seq = ++seq; rtm.rtm_addrs = rtm_addrs; rtm.rtm_rmx = rt_metrics; rtm.rtm_inits = rtm_inits; if (rtm_addrs & RTA_NETMASK) mask_addr(); NEXTADDR(RTA_DST, so_dst); NEXTADDR(RTA_GATEWAY, so_gate); NEXTADDR(RTA_NETMASK, so_mask); NEXTADDR(RTA_GENMASK, so_genmask); NEXTADDR(RTA_IFP, so_ifp); NEXTADDR(RTA_IFA, so_ifa); rtm.rtm_msglen = l = cp - (char *)&m_rtmsg; if (verbose) print_rtmsg(&rtm, l); if (debugonly) return (0); if ((rlen = write(s, (char *)&m_rtmsg, l)) < 0) { if (errno == EPERM) err(1, "writing to routing socket"); warn("writing to routing socket"); return (-1); } if (cmd == RTM_GET) { do { l = read(s, (char *)&m_rtmsg, sizeof(m_rtmsg)); } while (l > 0 && (rtm.rtm_seq != seq || rtm.rtm_pid != pid)); if (l < 0) warn("read from routing socket"); else print_getmsg(&rtm, l); } #undef rtm return (0); } void mask_addr() { int olen = so_mask.sa.sa_len; char *cp1 = olen + (char *)&so_mask, *cp2; for (so_mask.sa.sa_len = 0; cp1 > (char *)&so_mask; ) if (*--cp1 != 0) { so_mask.sa.sa_len = 1 + cp1 - (char *)&so_mask; break; } if ((rtm_addrs & RTA_DST) == 0) return; switch (so_dst.sa.sa_family) { case AF_INET: #ifdef INET6 case AF_INET6: #endif case AF_APPLETALK: case 0: return; } cp1 = so_mask.sa.sa_len + 1 + (char *)&so_dst; cp2 = so_dst.sa.sa_len + 1 + (char *)&so_dst; while (cp2 > cp1) *--cp2 = 0; cp2 = so_mask.sa.sa_len + 1 + (char *)&so_mask; while (cp1 > so_dst.sa.sa_data) *--cp1 &= *--cp2; } char *msgtypes[] = { "", "RTM_ADD: Add Route", "RTM_DELETE: Delete Route", "RTM_CHANGE: Change Metrics or flags", "RTM_GET: Report Metrics", "RTM_LOSING: Kernel Suspects Partitioning", "RTM_REDIRECT: Told to use different route", "RTM_MISS: Lookup failed on this address", 
"RTM_LOCK: fix specified metrics", "RTM_OLDADD: caused by SIOCADDRT", "RTM_OLDDEL: caused by SIOCDELRT", "RTM_RESOLVE: Route created by cloning", "RTM_NEWADDR: address being added to iface", "RTM_DELADDR: address being removed from iface", "RTM_IFINFO: iface status change", "RTM_NEWMADDR: new multicast group membership on iface", "RTM_DELMADDR: multicast group membership removed from iface", "RTM_IFANNOUNCE: interface arrival/departure", 0, }; char metricnames[] = "\011pksent\010rttvar\7rtt\6ssthresh\5sendpipe\4recvpipe\3expire\2hopcount" "\1mtu"; char routeflags[] = "\1UP\2GATEWAY\3HOST\4REJECT\5DYNAMIC\6MODIFIED\7DONE\010MASK_PRESENT" "\011CLONING\012XRESOLVE\013LLINFO\014STATIC\015BLACKHOLE\016b016" "\017PROTO2\020PROTO1\021PRCLONING\022WASCLONED\023PROTO3\024CHAINDELETE" "\025PINNED\026LOCAL\027BROADCAST\030MULTICAST"; char ifnetflags[] = "\1UP\2BROADCAST\3DEBUG\4LOOPBACK\5PTP\6b6\7RUNNING\010NOARP" "\011PPROMISC\012ALLMULTI\013OACTIVE\014SIMPLEX\015LINK0\016LINK1" "\017LINK2\020MULTICAST"; char addrnames[] = "\1DST\2GATEWAY\3NETMASK\4GENMASK\5IFP\6IFA\7AUTHOR\010BRD"; void print_rtmsg(rtm, msglen) struct rt_msghdr *rtm; int msglen; { struct if_msghdr *ifm; struct ifa_msghdr *ifam; #ifdef RTM_NEWMADDR struct ifma_msghdr *ifmam; #endif struct if_announcemsghdr *ifan; char *state; if (verbose == 0) return; if (rtm->rtm_version != RTM_VERSION) { (void) printf("routing message version %d not understood\n", rtm->rtm_version); return; } if (msgtypes[rtm->rtm_type] != NULL) (void)printf("%s: ", msgtypes[rtm->rtm_type]); else (void)printf("#%d: ", rtm->rtm_type); (void)printf("len %d, ", rtm->rtm_msglen); switch (rtm->rtm_type) { case RTM_IFINFO: ifm = (struct if_msghdr *)rtm; (void) printf("if# %d, ", ifm->ifm_index); switch (ifm->ifm_data.ifi_link_state) { case LINK_STATE_DOWN: state = "down"; break; case LINK_STATE_UP: state = "up"; break; default: state = "unknown"; break; } (void) printf("link: %s, flags:", state); bprintf(stdout, ifm->ifm_flags, ifnetflags); pmsg_addrs((char *)(ifm + 1), ifm->ifm_addrs); break; case RTM_NEWADDR: case RTM_DELADDR: ifam = (struct ifa_msghdr *)rtm; (void) printf("metric %d, flags:", ifam->ifam_metric); bprintf(stdout, ifam->ifam_flags, routeflags); pmsg_addrs((char *)(ifam + 1), ifam->ifam_addrs); break; #ifdef RTM_NEWMADDR case RTM_NEWMADDR: case RTM_DELMADDR: ifmam = (struct ifma_msghdr *)rtm; pmsg_addrs((char *)(ifmam + 1), ifmam->ifmam_addrs); break; #endif case RTM_IFANNOUNCE: ifan = (struct if_announcemsghdr *)rtm; (void) printf("if# %d, what: ", ifan->ifan_index); switch (ifan->ifan_what) { case IFAN_ARRIVAL: printf("arrival"); break; case IFAN_DEPARTURE: printf("departure"); break; default: printf("#%d", ifan->ifan_what); break; } printf("\n"); break; default: (void) printf("pid: %ld, seq %d, errno %d, flags:", (long)rtm->rtm_pid, rtm->rtm_seq, rtm->rtm_errno); bprintf(stdout, rtm->rtm_flags, routeflags); pmsg_common(rtm); } } void print_getmsg(rtm, msglen) struct rt_msghdr *rtm; int msglen; { struct sockaddr *dst = NULL, *gate = NULL, *mask = NULL; struct sockaddr_dl *ifp = NULL; struct sockaddr *sa; char *cp; int i; (void) printf(" route to: %s\n", routename(&so_dst)); if (rtm->rtm_version != RTM_VERSION) { warnx("routing message version %d not understood", rtm->rtm_version); return; } if (rtm->rtm_msglen > msglen) { warnx("message length mismatch, in packet %d, returned %d", rtm->rtm_msglen, msglen); } if (rtm->rtm_errno) { errno = rtm->rtm_errno; warn("message indicates error %d", errno); return; } cp = ((char *)(rtm + 1)); if (rtm->rtm_addrs) 
for (i = 1; i; i <<= 1) if (i & rtm->rtm_addrs) { sa = (struct sockaddr *)cp; switch (i) { case RTA_DST: dst = sa; break; case RTA_GATEWAY: gate = sa; break; case RTA_NETMASK: mask = sa; break; case RTA_IFP: if (sa->sa_family == AF_LINK && ((struct sockaddr_dl *)sa)->sdl_nlen) ifp = (struct sockaddr_dl *)sa; break; } cp += SA_SIZE(sa); } if (dst && mask) mask->sa_family = dst->sa_family; /* XXX */ if (dst) (void)printf("destination: %s\n", routename(dst)); if (mask) { int savenflag = nflag; nflag = 1; (void)printf(" mask: %s\n", routename(mask)); nflag = savenflag; } if (gate && rtm->rtm_flags & RTF_GATEWAY) (void)printf(" gateway: %s\n", routename(gate)); if (ifp) (void)printf(" interface: %.*s\n", ifp->sdl_nlen, ifp->sdl_data); (void)printf(" flags: "); bprintf(stdout, rtm->rtm_flags, routeflags); #define lock(f) ((rtm->rtm_rmx.rmx_locks & __CONCAT(RTV_,f)) ? 'L' : ' ') #define msec(u) (((u) + 500) / 1000) /* usec to msec */ (void) printf("\n%s\n", "\ recvpipe sendpipe ssthresh rtt,msec rttvar hopcount mtu expire"); printf("%8ld%c ", rtm->rtm_rmx.rmx_recvpipe, lock(RPIPE)); printf("%8ld%c ", rtm->rtm_rmx.rmx_sendpipe, lock(SPIPE)); printf("%8ld%c ", rtm->rtm_rmx.rmx_ssthresh, lock(SSTHRESH)); printf("%8ld%c ", msec(rtm->rtm_rmx.rmx_rtt), lock(RTT)); printf("%8ld%c ", msec(rtm->rtm_rmx.rmx_rttvar), lock(RTTVAR)); printf("%8ld%c ", rtm->rtm_rmx.rmx_hopcount, lock(HOPCOUNT)); printf("%8ld%c ", rtm->rtm_rmx.rmx_mtu, lock(MTU)); if (rtm->rtm_rmx.rmx_expire) rtm->rtm_rmx.rmx_expire -= time(0); printf("%8ld%c\n", rtm->rtm_rmx.rmx_expire, lock(EXPIRE)); #undef lock #undef msec #define RTA_IGN (RTA_DST|RTA_GATEWAY|RTA_NETMASK|RTA_IFP|RTA_IFA|RTA_BRD) if (verbose) pmsg_common(rtm); else if (rtm->rtm_addrs &~ RTA_IGN) { (void) printf("sockaddrs: "); bprintf(stdout, rtm->rtm_addrs, addrnames); putchar('\n'); } #undef RTA_IGN } void pmsg_common(rtm) struct rt_msghdr *rtm; { (void) printf("\nlocks: "); bprintf(stdout, rtm->rtm_rmx.rmx_locks, metricnames); (void) printf(" inits: "); bprintf(stdout, rtm->rtm_inits, metricnames); pmsg_addrs(((char *)(rtm + 1)), rtm->rtm_addrs); } void pmsg_addrs(cp, addrs) char *cp; int addrs; { struct sockaddr *sa; int i; if (addrs == 0) { (void) putchar('\n'); return; } (void) printf("\nsockaddrs: "); bprintf(stdout, addrs, addrnames); (void) putchar('\n'); for (i = 1; i; i <<= 1) if (i & addrs) { sa = (struct sockaddr *)cp; (void) printf(" %s", routename(sa)); cp += SA_SIZE(sa); } (void) putchar('\n'); (void) fflush(stdout); } void bprintf(fp, b, s) FILE *fp; int b; u_char *s; { int i; int gotsome = 0; if (b == 0) return; while ((i = *s++) != 0) { if (b & (1 << (i-1))) { if (gotsome == 0) i = '<'; else i = ','; (void) putc(i, fp); gotsome = 1; for (; (i = *s) > 32; s++) (void) putc(i, fp); } else while (*s > 32) s++; } if (gotsome) (void) putc('>', fp); } int keyword(cp) char *cp; { struct keytab *kt = keywords; while (kt->kt_cp && strcmp(kt->kt_cp, cp)) kt++; return kt->kt_i; } void sodump(su, which) sup su; char *which; { switch (su->sa.sa_family) { case AF_LINK: (void) printf("%s: link %s; ", which, link_ntoa(&su->sdl)); break; case AF_INET: (void) printf("%s: inet %s; ", which, inet_ntoa(su->sin.sin_addr)); break; case AF_APPLETALK: (void) printf("%s: atalk %s; ", which, atalk_ntoa(su->sat.sat_addr)); break; } (void) fflush(stdout); } /* States*/ #define VIRGIN 0 #define GOTONE 1 #define GOTTWO 2 /* Inputs */ #define DIGIT (4*0) #define END (4*1) #define DELIM (4*2) void sockaddr(addr, sa) char *addr; struct sockaddr *sa; { char *cp = (char *)sa; int size = 
sa->sa_len; char *cplim = cp + size; int byte = 0, state = VIRGIN, new = 0 /* foil gcc */; memset(cp, 0, size); cp++; do { if ((*addr >= '0') && (*addr <= '9')) { new = *addr - '0'; } else if ((*addr >= 'a') && (*addr <= 'f')) { new = *addr - 'a' + 10; } else if ((*addr >= 'A') && (*addr <= 'F')) { new = *addr - 'A' + 10; } else if (*addr == 0) state |= END; else state |= DELIM; addr++; switch (state /* | INPUT */) { case GOTTWO | DIGIT: *cp++ = byte; /*FALLTHROUGH*/ case VIRGIN | DIGIT: state = GOTONE; byte = new; continue; case GOTONE | DIGIT: state = GOTTWO; byte = new + (byte << 4); continue; default: /* | DELIM */ state = VIRGIN; *cp++ = byte; byte = 0; continue; case GOTONE | END: case GOTTWO | END: *cp++ = byte; /* FALLTHROUGH */ case VIRGIN | END: break; } break; } while (cp < cplim); sa->sa_len = cp - (char *)sa; } int atalk_aton(const char *text, struct at_addr *addr) { u_int net, node; if (sscanf(text, "%u.%u", &net, &node) != 2 || net > 0xffff || node > 0xff) return(0); addr->s_net = htons(net); addr->s_node = node; return(1); } char * atalk_ntoa(struct at_addr at) { static char buf[20]; (void) snprintf(buf, sizeof(buf), "%u.%u", ntohs(at.s_net), at.s_node); return(buf); } Index: head/sbin/routed/table.c =================================================================== --- head/sbin/routed/table.c (revision 186118) +++ head/sbin/routed/table.c (revision 186119) @@ -1,2153 +1,2156 @@ /* * Copyright (c) 1983, 1988, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* * $FreeBSD$ */ #include "defs.h" #ifdef __NetBSD__ __RCSID("$NetBSD$"); #elif defined(__FreeBSD__) __RCSID("$FreeBSD$"); #else __RCSID("$Revision: 2.27 $"); #ident "$Revision: 2.27 $" #endif static struct rt_spare *rts_better(struct rt_entry *); static struct rt_spare rts_empty = {0,0,0,HOPCNT_INFINITY,0,0,0}; static void set_need_flash(void); #ifdef _HAVE_SIN_LEN static void masktrim(struct sockaddr_in *ap); #else static void masktrim(struct sockaddr_in_new *ap); #endif struct radix_node_head *rhead; /* root of the radix tree */ int need_flash = 1; /* flash update needed * start =1 to suppress the 1st */ struct timeval age_timer; /* next check of old routes */ struct timeval need_kern = { /* need to update kernel table */ EPOCH+MIN_WAITTIME-1, 0 }; int stopint; int total_routes; /* zap any old routes through this gateway */ naddr age_bad_gate; /* It is desirable to "aggregate" routes, to combine differing routes of * the same metric and next hop into a common route with a smaller netmask * or to suppress redundant routes, routes that add no information to * routes with smaller netmasks. * * A route is redundant if and only if any and all routes with smaller * but matching netmasks and nets are the same. Since routes are * kept sorted in the radix tree, redundant routes always come second. * * There are two kinds of aggregations. First, two routes of the same bit * mask and differing only in the least significant bit of the network * number can be combined into a single route with a coarser mask. * * Second, a route can be suppressed in favor of another route with a more * coarse mask provided no incompatible routes with intermediate masks * are present. The second kind of aggregation involves suppressing routes. * A route must not be suppressed if an incompatible route exists with * an intermediate mask, since the suppressed route would be covered * by the intermediate. * * This code relies on the radix tree walk encountering routes * sorted first by address, with the smallest address first. */ struct ag_info ag_slots[NUM_AG_SLOTS], *ag_avail, *ag_corsest, *ag_finest; /* #define DEBUG_AG */ #ifdef DEBUG_AG #define CHECK_AG() {int acnt = 0; struct ag_info *cag; \ for (cag = ag_avail; cag != 0; cag = cag->ag_fine) \ acnt++; \ for (cag = ag_corsest; cag != 0; cag = cag->ag_fine) \ acnt++; \ if (acnt != NUM_AG_SLOTS) { \ (void)fflush(stderr); \ abort(); \ } \ } #else #define CHECK_AG() #endif /* Output the contents of an aggregation table slot. * This function must always be immediately followed with the deletion * of the target slot. */ static void ag_out(struct ag_info *ag, void (*out)(struct ag_info *)) { struct ag_info *ag_cors; naddr bit; /* Forget it if this route should not be output for split-horizon. */ if (ag->ag_state & AGS_SPLIT_HZ) return; /* If we output both the even and odd twins, then the immediate parent, * if it is present, is redundant, unless the parent manages to * aggregate into something coarser. * On successive calls, this code detects the even and odd twins, * and marks the parent. * * Note that the order in which the radix tree code emits routes * ensures that the twins are seen before the parent is emitted. */ ag_cors = ag->ag_cors; if (ag_cors != 0 && ag_cors->ag_mask == ag->ag_mask<<1 && ag_cors->ag_dst_h == (ag->ag_dst_h & ag_cors->ag_mask)) { ag_cors->ag_state |= ((ag_cors->ag_dst_h == ag->ag_dst_h) ? AGS_REDUN0 : AGS_REDUN1); } /* Skip it if this route is itself redundant. 
* * It is ok to change the contents of the slot here, since it is * always deleted next. */ if (ag->ag_state & AGS_REDUN0) { if (ag->ag_state & AGS_REDUN1) return; /* quit if fully redundant */ /* make it finer if it is half-redundant */ bit = (-ag->ag_mask) >> 1; ag->ag_dst_h |= bit; ag->ag_mask |= bit; } else if (ag->ag_state & AGS_REDUN1) { /* make it finer if it is half-redundant */ bit = (-ag->ag_mask) >> 1; ag->ag_mask |= bit; } out(ag); } static void ag_del(struct ag_info *ag) { CHECK_AG(); if (ag->ag_cors == 0) ag_corsest = ag->ag_fine; else ag->ag_cors->ag_fine = ag->ag_fine; if (ag->ag_fine == 0) ag_finest = ag->ag_cors; else ag->ag_fine->ag_cors = ag->ag_cors; ag->ag_fine = ag_avail; ag_avail = ag; CHECK_AG(); } /* Flush routes waiting for aggregation. * This must not suppress a route unless it is known that among all * routes with coarser masks that match it, the one with the longest * mask is appropriate. This is ensured by scanning the routes * in lexical order, and with the most restrictive mask first * among routes to the same destination. */ void ag_flush(naddr lim_dst_h, /* flush routes to here */ naddr lim_mask, /* matching this mask */ void (*out)(struct ag_info *)) { struct ag_info *ag, *ag_cors; naddr dst_h; for (ag = ag_finest; ag != 0 && ag->ag_mask >= lim_mask; ag = ag_cors) { ag_cors = ag->ag_cors; /* work on only the specified routes */ dst_h = ag->ag_dst_h; if ((dst_h & lim_mask) != lim_dst_h) continue; if (!(ag->ag_state & AGS_SUPPRESS)) ag_out(ag, out); else for ( ; ; ag_cors = ag_cors->ag_cors) { /* Look for a route that can suppress the * current route */ if (ag_cors == 0) { /* failed, so output it and look for * another route to work on */ ag_out(ag, out); break; } if ((dst_h & ag_cors->ag_mask) == ag_cors->ag_dst_h) { /* We found a route with a coarser mask that * aggregates the current target. * * If it has a different next hop, it * cannot replace the target, so output * the target. */ if (ag->ag_gate != ag_cors->ag_gate && !(ag->ag_state & AGS_FINE_GATE) && !(ag_cors->ag_state & AGS_CORS_GATE)) { ag_out(ag, out); break; } /* If the coarse route has a good enough * metric, it suppresses the target. * If the suppressed target was redundant, * then mark the suppressor redundant. */ if (ag_cors->ag_pref <= ag->ag_pref) { if (AG_IS_REDUN(ag->ag_state) && ag_cors->ag_mask==ag->ag_mask<<1) { if (ag_cors->ag_dst_h == dst_h) ag_cors->ag_state |= AGS_REDUN0; else ag_cors->ag_state |= AGS_REDUN1; } if (ag->ag_tag != ag_cors->ag_tag) ag_cors->ag_tag = 0; if (ag->ag_nhop != ag_cors->ag_nhop) ag_cors->ag_nhop = 0; break; } } } /* That route has either been output or suppressed */ ag_cors = ag->ag_cors; ag_del(ag); } CHECK_AG(); } /* Try to aggregate a route with previous routes. */ void ag_check(naddr dst, naddr mask, naddr gate, naddr nhop, char metric, char pref, u_int new_seqno, u_short tag, u_short state, void (*out)(struct ag_info *)) /* output using this */ { struct ag_info *ag, *nag, *ag_cors; naddr xaddr; int x; dst = ntohl(dst); /* Punt non-contiguous subnet masks. * * (X & -X) contains a single bit if and only if X is a power of 2. * (X + (X & -X)) == 0 if and only if X is a power of 2. */ if ((mask & -mask) + mask != 0) { struct ag_info nc_ag; nc_ag.ag_dst_h = dst; nc_ag.ag_mask = mask; nc_ag.ag_gate = gate; nc_ag.ag_nhop = nhop; nc_ag.ag_metric = metric; nc_ag.ag_pref = pref; nc_ag.ag_tag = tag; nc_ag.ag_state = state; nc_ag.ag_seqno = new_seqno; out(&nc_ag); return; } /* Search for the right slot in the aggregation table. 
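	 * Slots are kept on a doubly linked list ordered from the coarsest
	 * mask (ag_corsest) to the finest (ag_finest), so the scan below
	 * moves toward finer masks until ag_mask >= mask.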
*/ ag_cors = 0; ag = ag_corsest; while (ag != 0) { if (ag->ag_mask >= mask) break; /* Suppress old routes (i.e. combine with compatible routes * with coarser masks) as we look for the right slot in the * aggregation table for the new route. * A route to an address less than the current destination * will not be affected by the current route or any route * seen hereafter. That means it is safe to suppress it. * This check keeps poor routes (e.g. with large hop counts) * from preventing suppression of finer routes. */ if (ag_cors != 0 && ag->ag_dst_h < dst && (ag->ag_state & AGS_SUPPRESS) && ag_cors->ag_pref <= ag->ag_pref && (ag->ag_dst_h & ag_cors->ag_mask) == ag_cors->ag_dst_h && (ag_cors->ag_gate == ag->ag_gate || (ag->ag_state & AGS_FINE_GATE) || (ag_cors->ag_state & AGS_CORS_GATE))) { /* If the suppressed target was redundant, * then mark the suppressor redundant. */ if (AG_IS_REDUN(ag->ag_state) && ag_cors->ag_mask == ag->ag_mask<<1) { if (ag_cors->ag_dst_h == dst) ag_cors->ag_state |= AGS_REDUN0; else ag_cors->ag_state |= AGS_REDUN1; } if (ag->ag_tag != ag_cors->ag_tag) ag_cors->ag_tag = 0; if (ag->ag_nhop != ag_cors->ag_nhop) ag_cors->ag_nhop = 0; ag_del(ag); CHECK_AG(); } else { ag_cors = ag; } ag = ag_cors->ag_fine; } /* If we find the even/odd twin of the new route, and if the * masks and so forth are equal, we can aggregate them. * We can probably promote one of the pair. * * Since the routes are encountered in lexical order, * the new route must be odd. However, the second or later * times around this loop, it could be the even twin promoted * from the even/odd pair of twins of the finer route. */ while (ag != 0 && ag->ag_mask == mask && ((ag->ag_dst_h ^ dst) & (mask<<1)) == 0) { /* Here we know the target route and the route in the current * slot have the same netmasks and differ by at most the * last bit. They are either for the same destination, or * for an even/odd pair of destinations. */ if (ag->ag_dst_h == dst) { /* We have two routes to the same destination. * Routes are encountered in lexical order, so a * route is never promoted until the parent route is * already present. So we know that the new route is * a promoted (or aggregated) pair and the route * already in the slot is the explicit route. * * Prefer the best route if their metrics differ, * or the aggregated one if not, following a sort * of longest-match rule. */ if (pref <= ag->ag_pref) { ag->ag_gate = gate; ag->ag_nhop = nhop; ag->ag_tag = tag; ag->ag_metric = metric; ag->ag_pref = pref; if (ag->ag_seqno < new_seqno) ag->ag_seqno = new_seqno; x = ag->ag_state; ag->ag_state = state; state = x; } /* Some bits are set if they are set on either route, * except when the route is for an interface. */ if (!(ag->ag_state & AGS_IF)) ag->ag_state |= (state & (AGS_AGGREGATE_EITHER | AGS_REDUN0 | AGS_REDUN1)); return; } /* If one of the routes can be promoted and the other can * be suppressed, it may be possible to combine them or * worthwhile to promote one. * * Any route that can be promoted is always * marked to be eligible to be suppressed. */ if (!((state & AGS_AGGREGATE) && (ag->ag_state & AGS_SUPPRESS)) && !((ag->ag_state & AGS_AGGREGATE) && (state & AGS_SUPPRESS))) break; /* A pair of even/odd twin routes can be combined * if either is redundant, or if they are via the * same gateway and have the same metric. */ if (AG_IS_REDUN(ag->ag_state) || AG_IS_REDUN(state) || (ag->ag_gate == gate && ag->ag_pref == pref && (state & ag->ag_state & AGS_AGGREGATE) != 0)) { /* We have both the even and odd pairs. 
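 * (e.g. 10.1.2.0/24 and 10.1.3.0/24, which will be replaced by the
 * single, coarser route 10.1.2.0/23).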
* Since the routes are encountered in order, * the route in the slot must be the even twin. * * Combine and promote (aggregate) the pair of routes. */ if (new_seqno < ag->ag_seqno) new_seqno = ag->ag_seqno; if (!AG_IS_REDUN(state)) state &= ~AGS_REDUN1; if (AG_IS_REDUN(ag->ag_state)) state |= AGS_REDUN0; else state &= ~AGS_REDUN0; state |= (ag->ag_state & AGS_AGGREGATE_EITHER); if (ag->ag_tag != tag) tag = 0; if (ag->ag_nhop != nhop) nhop = 0; /* Get rid of the even twin that was already * in the slot. */ ag_del(ag); } else if (ag->ag_pref >= pref && (ag->ag_state & AGS_AGGREGATE)) { /* If we cannot combine the pair, maybe the route * with the worse metric can be promoted. * * Promote the old, even twin, by giving its slot * in the table to the new, odd twin. */ ag->ag_dst_h = dst; xaddr = ag->ag_gate; ag->ag_gate = gate; gate = xaddr; xaddr = ag->ag_nhop; ag->ag_nhop = nhop; nhop = xaddr; x = ag->ag_tag; ag->ag_tag = tag; tag = x; /* The promoted route is even-redundant only if the * even twin was fully redundant. It is not * odd-redundant because the odd-twin will still be * in the table. */ x = ag->ag_state; if (!AG_IS_REDUN(x)) x &= ~AGS_REDUN0; x &= ~AGS_REDUN1; ag->ag_state = state; state = x; x = ag->ag_metric; ag->ag_metric = metric; metric = x; x = ag->ag_pref; ag->ag_pref = pref; pref = x; /* take the newest sequence number */ if (new_seqno <= ag->ag_seqno) new_seqno = ag->ag_seqno; else ag->ag_seqno = new_seqno; } else { if (!(state & AGS_AGGREGATE)) break; /* cannot promote either twin */ /* Promote the new, odd twin by shaving its * mask and address. * The promoted route is odd-redundant only if the * odd twin was fully redundant. It is not * even-redundant because the even twin is still in * the table. */ if (!AG_IS_REDUN(state)) state &= ~AGS_REDUN1; state &= ~AGS_REDUN0; if (new_seqno < ag->ag_seqno) new_seqno = ag->ag_seqno; else ag->ag_seqno = new_seqno; } mask <<= 1; dst &= mask; if (ag_cors == 0) { ag = ag_corsest; break; } ag = ag_cors; ag_cors = ag->ag_cors; } /* When we can no longer promote and combine routes, * flush the old route in the target slot. Also flush * any finer routes that we know will never be aggregated by * the new route. * * In case we moved toward coarser masks, * get back where we belong */ if (ag != 0 && ag->ag_mask < mask) { ag_cors = ag; ag = ag->ag_fine; } /* Empty the target slot */ if (ag != 0 && ag->ag_mask == mask) { ag_flush(ag->ag_dst_h, ag->ag_mask, out); ag = (ag_cors == 0) ? ag_corsest : ag_cors->ag_fine; } #ifdef DEBUG_AG (void)fflush(stderr); if (ag == 0 && ag_cors != ag_finest) abort(); if (ag_cors == 0 && ag != ag_corsest) abort(); if (ag != 0 && ag->ag_cors != ag_cors) abort(); if (ag_cors != 0 && ag_cors->ag_fine != ag) abort(); CHECK_AG(); #endif /* Save the new route on the end of the table. 
*/ nag = ag_avail; ag_avail = nag->ag_fine; nag->ag_dst_h = dst; nag->ag_mask = mask; nag->ag_gate = gate; nag->ag_nhop = nhop; nag->ag_metric = metric; nag->ag_pref = pref; nag->ag_tag = tag; nag->ag_state = state; nag->ag_seqno = new_seqno; nag->ag_fine = ag; if (ag != 0) ag->ag_cors = nag; else ag_finest = nag; nag->ag_cors = ag_cors; if (ag_cors == 0) ag_corsest = nag; else ag_cors->ag_fine = nag; CHECK_AG(); } #define NAME0_LEN 14 static const char * rtm_type_name(u_char type) { static const char *rtm_types[] = { "RTM_ADD", "RTM_DELETE", "RTM_CHANGE", "RTM_GET", "RTM_LOSING", "RTM_REDIRECT", "RTM_MISS", "RTM_LOCK", "RTM_OLDADD", "RTM_OLDDEL", "RTM_RESOLVE", "RTM_NEWADDR", "RTM_DELADDR", #ifdef RTM_OIFINFO "RTM_OIFINFO", #endif "RTM_IFINFO", "RTM_NEWMADDR", "RTM_DELMADDR" }; #define NEW_RTM_PAT "RTM type %#x" static char name0[sizeof(NEW_RTM_PAT)+2]; if (type > sizeof(rtm_types)/sizeof(rtm_types[0]) || type == 0) { snprintf(name0, sizeof(name0), NEW_RTM_PAT, type); return name0; } else { return rtm_types[type-1]; } #undef NEW_RTM_PAT } /* Trim a mask in a sockaddr * Produce a length of 0 for an address of 0. * Otherwise produce the index of the first zero byte. */ void #ifdef _HAVE_SIN_LEN masktrim(struct sockaddr_in *ap) #else masktrim(struct sockaddr_in_new *ap) #endif { char *cp; if (ap->sin_addr.s_addr == 0) { ap->sin_len = 0; return; } cp = (char *)(&ap->sin_addr.s_addr+1); while (*--cp == 0) continue; ap->sin_len = cp - (char*)ap + 1; } /* Tell the kernel to add, delete or change a route */ static void rtioctl(int action, /* RTM_DELETE, etc */ naddr dst, naddr gate, naddr mask, int metric, int flags) { struct { struct rt_msghdr w_rtm; struct sockaddr_in w_dst; struct sockaddr_in w_gate; #ifdef _HAVE_SA_LEN struct sockaddr_in w_mask; #else struct sockaddr_in_new w_mask; #endif } w; long cc; # define PAT " %-10s %s metric=%d flags=%#x" # define ARGS rtm_type_name(action), rtname(dst,mask,gate), metric, flags again: memset(&w, 0, sizeof(w)); w.w_rtm.rtm_msglen = sizeof(w); w.w_rtm.rtm_version = RTM_VERSION; w.w_rtm.rtm_type = action; w.w_rtm.rtm_flags = flags; w.w_rtm.rtm_seq = ++rt_sock_seqno; w.w_rtm.rtm_addrs = RTA_DST|RTA_GATEWAY; if (metric != 0 || action == RTM_CHANGE) { w.w_rtm.rtm_rmx.rmx_hopcount = metric; w.w_rtm.rtm_inits |= RTV_HOPCOUNT; } w.w_dst.sin_family = AF_INET; w.w_dst.sin_addr.s_addr = dst; w.w_gate.sin_family = AF_INET; w.w_gate.sin_addr.s_addr = gate; #ifdef _HAVE_SA_LEN w.w_dst.sin_len = sizeof(w.w_dst); w.w_gate.sin_len = sizeof(w.w_gate); #endif if (mask == HOST_MASK) { w.w_rtm.rtm_flags |= RTF_HOST; w.w_rtm.rtm_msglen -= sizeof(w.w_mask); } else { w.w_rtm.rtm_addrs |= RTA_NETMASK; w.w_mask.sin_addr.s_addr = htonl(mask); #ifdef _HAVE_SA_LEN masktrim(&w.w_mask); if (w.w_mask.sin_len == 0) w.w_mask.sin_len = sizeof(long); w.w_rtm.rtm_msglen -= (sizeof(w.w_mask) - w.w_mask.sin_len); #endif } #ifndef NO_INSTALL cc = write(rt_sock, &w, w.w_rtm.rtm_msglen); if (cc < 0) { if (errno == ESRCH && (action == RTM_CHANGE || action == RTM_DELETE)) { trace_act("route disappeared before" PAT, ARGS); if (action == RTM_CHANGE) { action = RTM_ADD; goto again; } return; } msglog("write(rt_sock)" PAT ": %s", ARGS, strerror(errno)); return; } else if (cc != w.w_rtm.rtm_msglen) { msglog("write(rt_sock) wrote %ld instead of %d for" PAT, cc, w.w_rtm.rtm_msglen, ARGS); return; } #endif if (TRACEKERNEL) trace_misc("write kernel" PAT, ARGS); #undef PAT #undef ARGS } #define KHASH_SIZE 71 /* should be prime */ #define KHASH(a,m) khash_bins[((a) ^ (m)) % KHASH_SIZE] static struct khash 
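/* The khash table is the daemon's image of the kernel forwarding table,
 * hashed by destination and mask via KHASH() above.  The KS_* state bits
 * record what must still be done to bring the kernel into line with the
 * daemon's own routes.
 */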
{ struct khash *k_next; naddr k_dst; naddr k_mask; naddr k_gate; short k_metric; u_short k_state; #define KS_NEW 0x001 #define KS_DELETE 0x002 /* need to delete the route */ #define KS_ADD 0x004 /* add to the kernel */ #define KS_CHANGE 0x008 /* tell kernel to change the route */ #define KS_DEL_ADD 0x010 /* delete & add to change the kernel */ #define KS_STATIC 0x020 /* Static flag in kernel */ #define KS_GATEWAY 0x040 /* G flag in kernel */ #define KS_DYNAMIC 0x080 /* result of redirect */ #define KS_DELETED 0x100 /* already deleted from kernel */ #define KS_CHECK 0x200 time_t k_keep; #define K_KEEP_LIM 30 time_t k_redirect_time; /* when redirected route 1st seen */ } *khash_bins[KHASH_SIZE]; static struct khash* kern_find(naddr dst, naddr mask, struct khash ***ppk) { struct khash *k, **pk; for (pk = &KHASH(dst,mask); (k = *pk) != 0; pk = &k->k_next) { if (k->k_dst == dst && k->k_mask == mask) break; } if (ppk != 0) *ppk = pk; return k; } static struct khash* kern_add(naddr dst, naddr mask) { struct khash *k, **pk; k = kern_find(dst, mask, &pk); if (k != 0) return k; k = (struct khash *)rtmalloc(sizeof(*k), "kern_add"); memset(k, 0, sizeof(*k)); k->k_dst = dst; k->k_mask = mask; k->k_state = KS_NEW; k->k_keep = now.tv_sec; *pk = k; return k; } /* If a kernel route has a non-zero metric, check that it is still in the * daemon table, and not deleted by interfaces coming and going. */ static void kern_check_static(struct khash *k, struct interface *ifp) { struct rt_entry *rt; struct rt_spare new; if (k->k_metric == 0) return; memset(&new, 0, sizeof(new)); new.rts_ifp = ifp; new.rts_gate = k->k_gate; new.rts_router = (ifp != 0) ? ifp->int_addr : loopaddr; new.rts_metric = k->k_metric; new.rts_time = now.tv_sec; rt = rtget(k->k_dst, k->k_mask); if (rt != 0) { if (!(rt->rt_state & RS_STATIC)) rtchange(rt, rt->rt_state | RS_STATIC, &new, 0); } else { rtadd(k->k_dst, k->k_mask, RS_STATIC, &new); } } /* operate on a kernel entry */ static void kern_ioctl(struct khash *k, int action, /* RTM_DELETE, etc */ int flags) { switch (action) { case RTM_DELETE: k->k_state &= ~KS_DYNAMIC; if (k->k_state & KS_DELETED) return; k->k_state |= KS_DELETED; break; case RTM_ADD: k->k_state &= ~KS_DELETED; break; case RTM_CHANGE: if (k->k_state & KS_DELETED) { action = RTM_ADD; k->k_state &= ~KS_DELETED; } break; } rtioctl(action, k->k_dst, k->k_gate, k->k_mask, k->k_metric, flags); } /* add a route the kernel told us */ static void rtm_add(struct rt_msghdr *rtm, struct rt_addrinfo *info, time_t keep) { struct khash *k; struct interface *ifp; naddr mask; if (rtm->rtm_flags & RTF_HOST) { mask = HOST_MASK; } else if (INFO_MASK(info) != 0) { mask = ntohl(S_ADDR(INFO_MASK(info))); } else { msglog("ignore %s without mask", rtm_type_name(rtm->rtm_type)); return; } k = kern_add(S_ADDR(INFO_DST(info)), mask); if (k->k_state & KS_NEW) k->k_keep = now.tv_sec+keep; if (INFO_GATE(info) == 0) { trace_act("note %s without gateway", rtm_type_name(rtm->rtm_type)); k->k_metric = HOPCNT_INFINITY; } else if (INFO_GATE(info)->sa_family != AF_INET) { trace_act("note %s with gateway AF=%d", rtm_type_name(rtm->rtm_type), INFO_GATE(info)->sa_family); k->k_metric = HOPCNT_INFINITY; } else { k->k_gate = S_ADDR(INFO_GATE(info)); k->k_metric = rtm->rtm_rmx.rmx_hopcount; if (k->k_metric < 0) k->k_metric = 0; else if (k->k_metric > HOPCNT_INFINITY-1) k->k_metric = HOPCNT_INFINITY-1; } k->k_state &= ~(KS_DELETE | KS_ADD | KS_CHANGE | KS_DEL_ADD | KS_DELETED | KS_GATEWAY | KS_STATIC | KS_NEW | KS_CHECK); if (rtm->rtm_flags & RTF_GATEWAY) k->k_state 
|= KS_GATEWAY; if (rtm->rtm_flags & RTF_STATIC) k->k_state |= KS_STATIC; if (0 != (rtm->rtm_flags & (RTF_DYNAMIC | RTF_MODIFIED))) { if (INFO_AUTHOR(info) != 0 && INFO_AUTHOR(info)->sa_family == AF_INET) ifp = iflookup(S_ADDR(INFO_AUTHOR(info))); else ifp = 0; if (supplier && (ifp == 0 || !(ifp->int_state & IS_REDIRECT_OK))) { /* Routers are not supposed to listen to redirects, * so delete it if it came via an unknown interface * or the interface does not have special permission. */ k->k_state &= ~KS_DYNAMIC; k->k_state |= KS_DELETE; LIM_SEC(need_kern, 0); trace_act("mark for deletion redirected %s --> %s" " via %s", addrname(k->k_dst, k->k_mask, 0), naddr_ntoa(k->k_gate), ifp ? ifp->int_name : "unknown interface"); } else { k->k_state |= KS_DYNAMIC; k->k_redirect_time = now.tv_sec; trace_act("accept redirected %s --> %s via %s", addrname(k->k_dst, k->k_mask, 0), naddr_ntoa(k->k_gate), ifp ? ifp->int_name : "unknown interface"); } return; } /* If it is not a static route, quit until the next comparison * between the kernel and daemon tables, when it will be deleted. */ if (!(k->k_state & KS_STATIC)) { k->k_state |= KS_DELETE; LIM_SEC(need_kern, k->k_keep); return; } /* Put static routes with real metrics into the daemon table so * they can be advertised. * * Find the interface toward the gateway. */ ifp = iflookup(k->k_gate); if (ifp == 0) msglog("static route %s --> %s impossibly lacks ifp", addrname(S_ADDR(INFO_DST(info)), mask, 0), naddr_ntoa(k->k_gate)); kern_check_static(k, ifp); } /* deal with packet loss */ static void rtm_lose(struct rt_msghdr *rtm, struct rt_addrinfo *info) { if (INFO_GATE(info) == 0 || INFO_GATE(info)->sa_family != AF_INET) { trace_act("ignore %s without gateway", rtm_type_name(rtm->rtm_type)); return; } if (rdisc_ok) rdisc_age(S_ADDR(INFO_GATE(info))); age(S_ADDR(INFO_GATE(info))); } /* Make the gateway slot of an info structure point to something * useful. If it is not already useful, but it specifies an interface, * then fill in the sockaddr_in provided and point it there. */ static int get_info_gate(struct sockaddr **sap, struct sockaddr_in *rsin) { struct sockaddr_dl *sdl = (struct sockaddr_dl *)*sap; struct interface *ifp; if (sdl == 0) return 0; if ((sdl)->sdl_family == AF_INET) return 1; if ((sdl)->sdl_family != AF_LINK) return 0; ifp = ifwithindex(sdl->sdl_index, 1); if (ifp == 0) return 0; rsin->sin_addr.s_addr = ifp->int_addr; #ifdef _HAVE_SA_LEN rsin->sin_len = sizeof(*rsin); #endif rsin->sin_family = AF_INET; *sap = (struct sockaddr*)rsin; return 1; } /* Clean the kernel table by copying it to the daemon image. * Eventually the daemon will delete any extra routes. 
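 * Every entry in the khash image is first marked KS_CHECK.  Routes
 * returned by the NET_RT_DUMP sysctl clear that mark as they pass
 * through rtm_add(), so any entry still marked afterwards has
 * disappeared from the kernel and is purged with del_static().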
*/ void flush_kern(void) { static char *sysctl_buf; static size_t sysctl_buf_size = 0; size_t needed; int mib[6]; char *next, *lim; struct rt_msghdr *rtm; struct sockaddr_in gate_sin; struct rt_addrinfo info; int i; struct khash *k; for (i = 0; i < KHASH_SIZE; i++) { for (k = khash_bins[i]; k != 0; k = k->k_next) { k->k_state |= KS_CHECK; } } mib[0] = CTL_NET; mib[1] = PF_ROUTE; mib[2] = 0; /* protocol */ mib[3] = 0; /* wildcard address family */ mib[4] = NET_RT_DUMP; mib[5] = 0; /* no flags */ for (;;) { if ((needed = sysctl_buf_size) != 0) { if (sysctl(mib, 6, sysctl_buf,&needed, 0, 0) >= 0) break; if (errno != ENOMEM && errno != EFAULT) BADERR(1,"flush_kern: sysctl(RT_DUMP)"); free(sysctl_buf); needed = 0; } if (sysctl(mib, 6, 0, &needed, 0, 0) < 0) BADERR(1,"flush_kern: sysctl(RT_DUMP) estimate"); /* Kludge around the habit of some systems, such as * BSD/OS 3.1, to not admit how many routes are in the * kernel, or at least to be quite wrong. */ needed += 50*(sizeof(*rtm)+5*sizeof(struct sockaddr)); sysctl_buf = rtmalloc(sysctl_buf_size = needed, "flush_kern sysctl(RT_DUMP)"); } lim = sysctl_buf + needed; for (next = sysctl_buf; next < lim; next += rtm->rtm_msglen) { rtm = (struct rt_msghdr *)next; if (rtm->rtm_msglen == 0) { msglog("zero length kernel route at " " %#lx in buffer %#lx before %#lx", (u_long)rtm, (u_long)sysctl_buf, (u_long)lim); break; } rt_xaddrs(&info, (struct sockaddr *)(rtm+1), (struct sockaddr *)(next + rtm->rtm_msglen), rtm->rtm_addrs); if (INFO_DST(&info) == 0 || INFO_DST(&info)->sa_family != AF_INET) continue; +#if defined (RTF_LLINFO) /* ignore ARP table entries on systems with a merged route * and ARP table. */ if (rtm->rtm_flags & RTF_LLINFO) continue; - +#endif #if defined(RTF_WASCLONED) && defined(__FreeBSD__) /* ignore cloned routes */ if (rtm->rtm_flags & RTF_WASCLONED) continue; #endif /* ignore multicast addresses */ if (IN_MULTICAST(ntohl(S_ADDR(INFO_DST(&info))))) continue; if (!get_info_gate(&INFO_GATE(&info), &gate_sin)) continue; /* Note static routes and interface routes, and also * preload the image of the kernel table so that * we can later clean it, as well as avoid making * unneeded changes. Keep the old kernel routes for a * few seconds to allow a RIP or router-discovery * response to be heard. */ rtm_add(rtm,&info,MIN_WAITTIME); } for (i = 0; i < KHASH_SIZE; i++) { for (k = khash_bins[i]; k != 0; k = k->k_next) { if (k->k_state & KS_CHECK) { msglog("%s --> %s disappeared from kernel", addrname(k->k_dst, k->k_mask, 0), naddr_ntoa(k->k_gate)); del_static(k->k_dst, k->k_mask, k->k_gate, 1); } } } } /* Listen to announcements from the kernel */ void read_rt(void) { long cc; struct interface *ifp; struct sockaddr_in gate_sin; naddr mask, gate; union { struct { struct rt_msghdr rtm; struct sockaddr addrs[RTAX_MAX]; } r; struct if_msghdr ifm; } m; char str[100], *strp; struct rt_addrinfo info; for (;;) { cc = read(rt_sock, &m, sizeof(m)); if (cc <= 0) { if (cc < 0 && errno != EWOULDBLOCK) LOGERR("read(rt_sock)"); return; } if (m.r.rtm.rtm_version != RTM_VERSION) { msglog("bogus routing message version %d", m.r.rtm.rtm_version); continue; } /* Ignore our own results. 
*/ if (m.r.rtm.rtm_type <= RTM_CHANGE && m.r.rtm.rtm_pid == mypid) { static int complained = 0; if (!complained) { msglog("receiving our own change messages"); complained = 1; } continue; } if (m.r.rtm.rtm_type == RTM_IFINFO || m.r.rtm.rtm_type == RTM_NEWADDR || m.r.rtm.rtm_type == RTM_DELADDR) { ifp = ifwithindex(m.ifm.ifm_index, m.r.rtm.rtm_type != RTM_DELADDR); if (ifp == 0) trace_act("note %s with flags %#x" " for unknown interface index #%d", rtm_type_name(m.r.rtm.rtm_type), m.ifm.ifm_flags, m.ifm.ifm_index); else trace_act("note %s with flags %#x for %s", rtm_type_name(m.r.rtm.rtm_type), m.ifm.ifm_flags, ifp->int_name); /* After being informed of a change to an interface, * check them all now if the check would otherwise * be a long time from now, if the interface is * not known, or if the interface has been turned * off or on. */ if (ifinit_timer.tv_sec-now.tv_sec>=CHECK_BAD_INTERVAL || ifp == 0 || ((ifp->int_if_flags ^ m.ifm.ifm_flags) & IFF_UP) != 0) ifinit_timer.tv_sec = now.tv_sec; continue; } #ifdef RTM_OIFINFO if (m.r.rtm.rtm_type == RTM_OIFINFO) continue; /* ignore compat message */ #endif strcpy(str, rtm_type_name(m.r.rtm.rtm_type)); strp = &str[strlen(str)]; if (m.r.rtm.rtm_type <= RTM_CHANGE) strp += sprintf(strp," from pid %d",m.r.rtm.rtm_pid); rt_xaddrs(&info, m.r.addrs, &m.r.addrs[RTAX_MAX], m.r.rtm.rtm_addrs); if (INFO_DST(&info) == 0) { trace_act("ignore %s without dst", str); continue; } if (INFO_DST(&info)->sa_family != AF_INET) { trace_act("ignore %s for AF %d", str, INFO_DST(&info)->sa_family); continue; } mask = ((INFO_MASK(&info) != 0) ? ntohl(S_ADDR(INFO_MASK(&info))) : (m.r.rtm.rtm_flags & RTF_HOST) ? HOST_MASK : std_mask(S_ADDR(INFO_DST(&info)))); strp += sprintf(strp, ": %s", addrname(S_ADDR(INFO_DST(&info)), mask, 0)); if (IN_MULTICAST(ntohl(S_ADDR(INFO_DST(&info))))) { trace_act("ignore multicast %s", str); continue; } +#if defined(RTF_LLINFO) if (m.r.rtm.rtm_flags & RTF_LLINFO) { trace_act("ignore ARP %s", str); continue; } - +#endif + #if defined(RTF_WASCLONED) && defined(__FreeBSD__) if (m.r.rtm.rtm_flags & RTF_WASCLONED) { trace_act("ignore cloned %s", str); continue; } #endif if (get_info_gate(&INFO_GATE(&info), &gate_sin)) { gate = S_ADDR(INFO_GATE(&info)); strp += sprintf(strp, " --> %s", naddr_ntoa(gate)); } else { gate = 0; } if (INFO_AUTHOR(&info) != 0) strp += sprintf(strp, " by authority of %s", saddr_ntoa(INFO_AUTHOR(&info))); switch (m.r.rtm.rtm_type) { case RTM_ADD: case RTM_CHANGE: case RTM_REDIRECT: if (m.r.rtm.rtm_errno != 0) { trace_act("ignore %s with \"%s\" error", str, strerror(m.r.rtm.rtm_errno)); } else { trace_act("%s", str); rtm_add(&m.r.rtm,&info,0); } break; case RTM_DELETE: if (m.r.rtm.rtm_errno != 0 && m.r.rtm.rtm_errno != ESRCH) { trace_act("ignore %s with \"%s\" error", str, strerror(m.r.rtm.rtm_errno)); } else { trace_act("%s", str); del_static(S_ADDR(INFO_DST(&info)), mask, gate, 1); } break; case RTM_LOSING: trace_act("%s", str); rtm_lose(&m.r.rtm,&info); break; default: trace_act("ignore %s", str); break; } } } /* after aggregating, note routes that belong in the kernel */ static void kern_out(struct ag_info *ag) { struct khash *k; /* Do not install bad routes if they are not already present. * This includes routes that had RS_NET_SYN for interfaces that * recently died. 
*/ if (ag->ag_metric == HOPCNT_INFINITY) { k = kern_find(htonl(ag->ag_dst_h), ag->ag_mask, 0); if (k == 0) return; } else { k = kern_add(htonl(ag->ag_dst_h), ag->ag_mask); } if (k->k_state & KS_NEW) { /* will need to add new entry to the kernel table */ k->k_state = KS_ADD; if (ag->ag_state & AGS_GATEWAY) k->k_state |= KS_GATEWAY; k->k_gate = ag->ag_gate; k->k_metric = ag->ag_metric; return; } if (k->k_state & KS_STATIC) return; /* modify existing kernel entry if necessary */ if (k->k_gate != ag->ag_gate || k->k_metric != ag->ag_metric) { /* Must delete bad interface routes etc. to change them. */ if (k->k_metric == HOPCNT_INFINITY) k->k_state |= KS_DEL_ADD; k->k_gate = ag->ag_gate; k->k_metric = ag->ag_metric; k->k_state |= KS_CHANGE; } /* If the daemon thinks the route should exist, forget * about any redirections. * If the daemon thinks the route should exist, eventually * override manual intervention by the operator. */ if ((k->k_state & (KS_DYNAMIC | KS_DELETED)) != 0) { k->k_state &= ~KS_DYNAMIC; k->k_state |= (KS_ADD | KS_DEL_ADD); } if ((k->k_state & KS_GATEWAY) && !(ag->ag_state & AGS_GATEWAY)) { k->k_state &= ~KS_GATEWAY; k->k_state |= (KS_ADD | KS_DEL_ADD); } else if (!(k->k_state & KS_GATEWAY) && (ag->ag_state & AGS_GATEWAY)) { k->k_state |= KS_GATEWAY; k->k_state |= (KS_ADD | KS_DEL_ADD); } /* Deleting-and-adding is necessary to change aspects of a route. * Just delete instead of deleting and then adding a bad route. * Otherwise, we want to keep the route in the kernel. */ if (k->k_metric == HOPCNT_INFINITY && (k->k_state & KS_DEL_ADD)) k->k_state |= KS_DELETE; else k->k_state &= ~KS_DELETE; #undef RT } /* ARGSUSED */ static int walk_kern(struct radix_node *rn, struct walkarg *argp UNUSED) { #define RT ((struct rt_entry *)rn) char metric, pref; u_int ags = 0; /* Do not install synthetic routes */ if (RT->rt_state & RS_NET_SYN) return 0; if (!(RT->rt_state & RS_IF)) { /* This is an ordinary route, not for an interface. */ /* aggregate, ordinary good routes without regard to * their metric */ pref = 1; ags |= (AGS_GATEWAY | AGS_SUPPRESS | AGS_AGGREGATE); /* Do not install host routes directly to hosts, to avoid * interfering with ARP entries in the kernel table. */ if (RT_ISHOST(RT) && ntohl(RT->rt_dst) == RT->rt_gate) return 0; } else { /* This is an interface route. * Do not install routes for "external" remote interfaces. */ if (RT->rt_ifp != 0 && (RT->rt_ifp->int_state & IS_EXTERNAL)) return 0; /* Interfaces should override received routes. */ pref = 0; ags |= (AGS_IF | AGS_CORS_GATE); /* If it is not an interface, or an alias for an interface, * it must be a "gateway." * * If it is a "remote" interface, it is also a "gateway" to * the kernel if is not an alias. */ if (RT->rt_ifp == 0 || (RT->rt_ifp->int_state & IS_REMOTE)) ags |= (AGS_GATEWAY | AGS_SUPPRESS | AGS_AGGREGATE); } /* If RIP is off and IRDP is on, let the route to the discovered * route suppress any RIP routes. Eventually the RIP routes * will time-out and be deleted. This reaches the steady-state * quicker. */ if ((RT->rt_state & RS_RDISC) && rip_sock < 0) ags |= AGS_CORS_GATE; metric = RT->rt_metric; if (metric == HOPCNT_INFINITY) { /* if the route is dead, so try hard to aggregate. */ pref = HOPCNT_INFINITY; ags |= (AGS_FINE_GATE | AGS_SUPPRESS); ags &= ~(AGS_IF | AGS_CORS_GATE); } ag_check(RT->rt_dst, RT->rt_mask, RT->rt_gate, 0, metric,pref, 0, 0, ags, kern_out); return 0; #undef RT } /* Update the kernel table to match the daemon table. 
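 * walk_kern() and kern_out() first record the desired state of each
 * route in the khash image; the sweep below then issues RTM_ADD,
 * RTM_CHANGE and RTM_DELETE requests as required.  Every surviving
 * entry is re-marked KS_DELETE so that routes which drop out of the
 * daemon table are removed on the next pass.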
*/ static void fix_kern(void) { int i; struct khash *k, **pk; need_kern = age_timer; /* Walk daemon table, updating the copy of the kernel table. */ (void)rn_walktree(rhead, walk_kern, 0); ag_flush(0,0,kern_out); for (i = 0; i < KHASH_SIZE; i++) { for (pk = &khash_bins[i]; (k = *pk) != 0; ) { /* Do not touch static routes */ if (k->k_state & KS_STATIC) { kern_check_static(k,0); pk = &k->k_next; continue; } /* check hold on routes deleted by the operator */ if (k->k_keep > now.tv_sec) { /* ensure we check when the hold is over */ LIM_SEC(need_kern, k->k_keep); /* mark for the next cycle */ k->k_state |= KS_DELETE; pk = &k->k_next; continue; } if ((k->k_state & KS_DELETE) && !(k->k_state & KS_DYNAMIC)) { kern_ioctl(k, RTM_DELETE, 0); *pk = k->k_next; free(k); continue; } if (k->k_state & KS_DEL_ADD) kern_ioctl(k, RTM_DELETE, 0); if (k->k_state & KS_ADD) { kern_ioctl(k, RTM_ADD, ((0 != (k->k_state & (KS_GATEWAY | KS_DYNAMIC))) ? RTF_GATEWAY : 0)); } else if (k->k_state & KS_CHANGE) { kern_ioctl(k, RTM_CHANGE, ((0 != (k->k_state & (KS_GATEWAY | KS_DYNAMIC))) ? RTF_GATEWAY : 0)); } k->k_state &= ~(KS_ADD|KS_CHANGE|KS_DEL_ADD); /* Mark this route to be deleted in the next cycle. * This deletes routes that disappear from the * daemon table, since the normal aging code * will clear the bit for routes that have not * disappeared from the daemon table. */ k->k_state |= KS_DELETE; pk = &k->k_next; } } } /* Delete a static route in the image of the kernel table. */ void del_static(naddr dst, naddr mask, naddr gate, int gone) { struct khash *k; struct rt_entry *rt; /* Just mark it in the table to be deleted next time the kernel * table is updated. * If it has already been deleted, mark it as such, and set its * keep-timer so that it will not be deleted again for a while. * This lets the operator delete a route added by the daemon * and add a replacement. */ k = kern_find(dst, mask, 0); if (k != 0 && (gate == 0 || k->k_gate == gate)) { k->k_state &= ~(KS_STATIC | KS_DYNAMIC | KS_CHECK); k->k_state |= KS_DELETE; if (gone) { k->k_state |= KS_DELETED; k->k_keep = now.tv_sec + K_KEEP_LIM; } } rt = rtget(dst, mask); if (rt != 0 && (rt->rt_state & RS_STATIC)) rtbad(rt); } /* Delete all routes generated from ICMP Redirects that use a given gateway, * as well as old redirected routes. */ void del_redirects(naddr bad_gate, time_t old) { int i; struct khash *k; for (i = 0; i < KHASH_SIZE; i++) { for (k = khash_bins[i]; k != 0; k = k->k_next) { if (!(k->k_state & KS_DYNAMIC) || (k->k_state & KS_STATIC)) continue; if (k->k_gate != bad_gate && k->k_redirect_time > old && !supplier) continue; k->k_state |= KS_DELETE; k->k_state &= ~KS_DYNAMIC; need_kern.tv_sec = now.tv_sec; trace_act("mark redirected %s --> %s for deletion", addrname(k->k_dst, k->k_mask, 0), naddr_ntoa(k->k_gate)); } } } /* Start the daemon tables. 
*/ extern int max_keylen; void rtinit(void) { int i; struct ag_info *ag; /* Initialize the radix trees */ max_keylen = sizeof(struct sockaddr_in); rn_init(); rn_inithead((void**)&rhead, 32); /* mark all of the slots in the table free */ ag_avail = ag_slots; for (ag = ag_slots, i = 1; i < NUM_AG_SLOTS; i++) { ag->ag_fine = ag+1; ag++; } } #ifdef _HAVE_SIN_LEN static struct sockaddr_in dst_sock = {sizeof(dst_sock), AF_INET, 0, {0}, {0}}; static struct sockaddr_in mask_sock = {sizeof(mask_sock), AF_INET, 0, {0}, {0}}; #else static struct sockaddr_in_new dst_sock = {_SIN_ADDR_SIZE, AF_INET}; static struct sockaddr_in_new mask_sock = {_SIN_ADDR_SIZE, AF_INET}; #endif static void set_need_flash(void) { if (!need_flash) { need_flash = 1; /* Do not send the flash update immediately. Wait a little * while to hear from other routers. */ no_flash.tv_sec = now.tv_sec + MIN_WAITTIME; } } /* Get a particular routing table entry */ struct rt_entry * rtget(naddr dst, naddr mask) { struct rt_entry *rt; dst_sock.sin_addr.s_addr = dst; mask_sock.sin_addr.s_addr = htonl(mask); masktrim(&mask_sock); rt = (struct rt_entry *)rhead->rnh_lookup(&dst_sock,&mask_sock,rhead); if (!rt || rt->rt_dst != dst || rt->rt_mask != mask) return 0; return rt; } /* Find a route to dst as the kernel would. */ struct rt_entry * rtfind(naddr dst) { dst_sock.sin_addr.s_addr = dst; return (struct rt_entry *)rhead->rnh_matchaddr(&dst_sock, rhead); } /* add a route to the table */ void rtadd(naddr dst, naddr mask, u_int state, /* rt_state for the entry */ struct rt_spare *new) { struct rt_entry *rt; naddr smask; int i; struct rt_spare *rts; rt = (struct rt_entry *)rtmalloc(sizeof (*rt), "rtadd"); memset(rt, 0, sizeof(*rt)); for (rts = rt->rt_spares, i = NUM_SPARES; i != 0; i--, rts++) rts->rts_metric = HOPCNT_INFINITY; rt->rt_nodes->rn_key = (caddr_t)&rt->rt_dst_sock; rt->rt_dst = dst; rt->rt_dst_sock.sin_family = AF_INET; #ifdef _HAVE_SIN_LEN rt->rt_dst_sock.sin_len = dst_sock.sin_len; #endif if (mask != HOST_MASK) { smask = std_mask(dst); if ((smask & ~mask) == 0 && mask > smask) state |= RS_SUBNET; } mask_sock.sin_addr.s_addr = htonl(mask); masktrim(&mask_sock); rt->rt_mask = mask; rt->rt_state = state; rt->rt_spares[0] = *new; rt->rt_time = now.tv_sec; rt->rt_poison_metric = HOPCNT_INFINITY; rt->rt_seqno = update_seqno; if (++total_routes == MAX_ROUTES) msglog("have maximum (%d) routes", total_routes); if (TRACEACTIONS) trace_add_del("Add", rt); need_kern.tv_sec = now.tv_sec; set_need_flash(); if (0 == rhead->rnh_addaddr(&rt->rt_dst_sock, &mask_sock, rhead, rt->rt_nodes)) { msglog("rnh_addaddr() failed for %s mask=%#lx", naddr_ntoa(dst), (u_long)mask); free(rt); } } /* notice a changed route */ void rtchange(struct rt_entry *rt, u_int state, /* new state bits */ struct rt_spare *new, char *label) { if (rt->rt_metric != new->rts_metric) { /* Fix the kernel immediately if it seems the route * has gone bad, since there may be a working route that * aggregates this route. */ if (new->rts_metric == HOPCNT_INFINITY) { need_kern.tv_sec = now.tv_sec; if (new->rts_time >= now.tv_sec - EXPIRE_TIME) new->rts_time = now.tv_sec - EXPIRE_TIME; } rt->rt_seqno = update_seqno; set_need_flash(); } if (rt->rt_gate != new->rts_gate) { need_kern.tv_sec = now.tv_sec; rt->rt_seqno = update_seqno; set_need_flash(); } state |= (rt->rt_state & RS_SUBNET); /* Keep various things from deciding ageless routes are stale. */ if (!AGE_RT(state, new->rts_ifp)) new->rts_time = now.tv_sec; if (TRACEACTIONS) trace_change(rt, state, new, label ? 
label : "Chg "); rt->rt_state = state; rt->rt_spares[0] = *new; } /* check for a better route among the spares */ static struct rt_spare * rts_better(struct rt_entry *rt) { struct rt_spare *rts, *rts1; int i; /* find the best alternative among the spares */ rts = rt->rt_spares+1; for (i = NUM_SPARES, rts1 = rts+1; i > 2; i--, rts1++) { if (BETTER_LINK(rt,rts1,rts)) rts = rts1; } return rts; } /* switch to a backup route */ void rtswitch(struct rt_entry *rt, struct rt_spare *rts) { struct rt_spare swap; char label[10]; /* Do not change permanent routes */ if (0 != (rt->rt_state & (RS_MHOME | RS_STATIC | RS_RDISC | RS_NET_SYN | RS_IF))) return; /* find the best alternative among the spares */ if (rts == 0) rts = rts_better(rt); /* Do not bother if it is not worthwhile. */ if (!BETTER_LINK(rt, rts, rt->rt_spares)) return; swap = rt->rt_spares[0]; (void)sprintf(label, "Use #%d", (int)(rts - rt->rt_spares)); rtchange(rt, rt->rt_state & ~(RS_NET_SYN | RS_RDISC), rts, label); if (swap.rts_metric == HOPCNT_INFINITY) { *rts = rts_empty; } else { *rts = swap; } } void rtdelete(struct rt_entry *rt) { struct khash *k; if (TRACEACTIONS) trace_add_del("Del", rt); k = kern_find(rt->rt_dst, rt->rt_mask, 0); if (k != 0) { k->k_state |= KS_DELETE; need_kern.tv_sec = now.tv_sec; } dst_sock.sin_addr.s_addr = rt->rt_dst; mask_sock.sin_addr.s_addr = htonl(rt->rt_mask); masktrim(&mask_sock); if (rt != (struct rt_entry *)rhead->rnh_deladdr(&dst_sock, &mask_sock, rhead)) { msglog("rnh_deladdr() failed"); } else { free(rt); total_routes--; } } void rts_delete(struct rt_entry *rt, struct rt_spare *rts) { trace_upslot(rt, rts, &rts_empty); *rts = rts_empty; } /* Get rid of a bad route, and try to switch to a replacement. */ void rtbad(struct rt_entry *rt) { struct rt_spare new; /* Poison the route */ new = rt->rt_spares[0]; new.rts_metric = HOPCNT_INFINITY; rtchange(rt, rt->rt_state & ~(RS_IF | RS_LOCAL | RS_STATIC), &new, 0); rtswitch(rt, 0); } /* Junk a RS_NET_SYN or RS_LOCAL route, * unless it is needed by another interface. */ void rtbad_sub(struct rt_entry *rt) { struct interface *ifp, *ifp1; struct intnet *intnetp; u_int state; ifp1 = 0; state = 0; if (rt->rt_state & RS_LOCAL) { /* Is this the route through loopback for the interface? * If so, see if it is used by any other interfaces, such * as a point-to-point interface with the same local address. */ for (ifp = ifnet; ifp != 0; ifp = ifp->int_next) { /* Retain it if another interface needs it. */ if (ifp->int_addr == rt->rt_ifp->int_addr) { state |= RS_LOCAL; ifp1 = ifp; break; } } } if (!(state & RS_LOCAL)) { /* Retain RIPv1 logical network route if there is another * interface that justifies it. */ if (rt->rt_state & RS_NET_SYN) { for (ifp = ifnet; ifp != 0; ifp = ifp->int_next) { if ((ifp->int_state & IS_NEED_NET_SYN) && rt->rt_mask == ifp->int_std_mask && rt->rt_dst == ifp->int_std_addr) { state |= RS_NET_SYN; ifp1 = ifp; break; } } } /* or if there is an authority route that needs it. */ for (intnetp = intnets; intnetp != 0; intnetp = intnetp->intnet_next) { if (intnetp->intnet_addr == rt->rt_dst && intnetp->intnet_mask == rt->rt_mask) { state |= (RS_NET_SYN | RS_NET_INT); break; } } } if (ifp1 != 0 || (state & RS_NET_SYN)) { struct rt_spare new = rt->rt_spares[0]; new.rts_ifp = ifp1; rtchange(rt, ((rt->rt_state & ~(RS_NET_SYN|RS_LOCAL)) | state), &new, 0); } else { rtbad(rt); } } /* Called while walking the table looking for sick interfaces * or after a time change. 
*/ /* ARGSUSED */ int walk_bad(struct radix_node *rn, struct walkarg *argp UNUSED) { #define RT ((struct rt_entry *)rn) struct rt_spare *rts; int i; /* fix any spare routes through the interface */ rts = RT->rt_spares; for (i = NUM_SPARES; i != 1; i--) { rts++; if (rts->rts_metric < HOPCNT_INFINITY && (rts->rts_ifp == 0 || (rts->rts_ifp->int_state & IS_BROKE))) rts_delete(RT, rts); } /* Deal with the main route */ /* finished if it has been handled before or if its interface is ok */ if (RT->rt_ifp == 0 || !(RT->rt_ifp->int_state & IS_BROKE)) return 0; /* Bad routes for other than interfaces are easy. */ if (0 == (RT->rt_state & (RS_IF | RS_NET_SYN | RS_LOCAL))) { rtbad(RT); return 0; } rtbad_sub(RT); return 0; #undef RT } /* Check the age of an individual route. */ /* ARGSUSED */ static int walk_age(struct radix_node *rn, struct walkarg *argp UNUSED) { #define RT ((struct rt_entry *)rn) struct interface *ifp; struct rt_spare *rts; int i; /* age all of the spare routes, including the primary route * currently in use */ rts = RT->rt_spares; for (i = NUM_SPARES; i != 0; i--, rts++) { ifp = rts->rts_ifp; if (i == NUM_SPARES) { if (!AGE_RT(RT->rt_state, ifp)) { /* Keep various things from deciding ageless * routes are stale */ rts->rts_time = now.tv_sec; continue; } /* forget RIP routes after RIP has been turned off. */ if (rip_sock < 0) { rtdelete(RT); return 0; } } /* age failing routes */ if (age_bad_gate == rts->rts_gate && rts->rts_time >= now_stale) { rts->rts_time -= SUPPLY_INTERVAL; } /* trash the spare routes when they go bad */ if (rts->rts_metric < HOPCNT_INFINITY && now_garbage > rts->rts_time && i != NUM_SPARES) rts_delete(RT, rts); } /* finished if the active route is still fresh */ if (now_stale <= RT->rt_time) return 0; /* try to switch to an alternative */ rtswitch(RT, 0); /* Delete a dead route after it has been publically mourned. */ if (now_garbage > RT->rt_time) { rtdelete(RT); return 0; } /* Start poisoning a bad route before deleting it. */ if (now.tv_sec - RT->rt_time > EXPIRE_TIME) { struct rt_spare new = RT->rt_spares[0]; new.rts_metric = HOPCNT_INFINITY; rtchange(RT, RT->rt_state, &new, 0); } return 0; } /* Watch for dead routes and interfaces. */ void age(naddr bad_gate) { struct interface *ifp; int need_query = 0; /* If not listening to RIP, there is no need to age the routes in * the table. */ age_timer.tv_sec = (now.tv_sec + ((rip_sock < 0) ? NEVER : SUPPLY_INTERVAL)); /* Check for dead IS_REMOTE interfaces by timing their * transmissions. */ for (ifp = ifnet; ifp; ifp = ifp->int_next) { if (!(ifp->int_state & IS_REMOTE)) continue; /* ignore unreachable remote interfaces */ if (!check_remote(ifp)) continue; /* Restore remote interface that has become reachable */ if (ifp->int_state & IS_BROKE) if_ok(ifp, "remote "); if (ifp->int_act_time != NEVER && now.tv_sec - ifp->int_act_time > EXPIRE_TIME) { msglog("remote interface %s to %s timed out after" " %ld:%ld", ifp->int_name, naddr_ntoa(ifp->int_dstaddr), (now.tv_sec - ifp->int_act_time)/60, (now.tv_sec - ifp->int_act_time)%60); if_sick(ifp); } /* If we have not heard from the other router * recently, ask it. */ if (now.tv_sec >= ifp->int_query_time) { ifp->int_query_time = NEVER; need_query = 1; } } /* Age routes. */ age_bad_gate = bad_gate; (void)rn_walktree(rhead, walk_age, 0); /* delete old redirected routes to keep the kernel table small * and prevent blackholes */ del_redirects(bad_gate, now.tv_sec-STALE_TIME); /* Update the kernel routing table. 
*/ fix_kern(); /* poke reticent remote gateways */ if (need_query) rip_query(); } Index: head/share/man/man4/route.4 =================================================================== --- head/share/man/man4/route.4 (revision 186118) +++ head/share/man/man4/route.4 (revision 186119) @@ -1,331 +1,331 @@ .\" Copyright (c) 1990, 1991, 1993 .\" The Regents of the University of California. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. All advertising materials mentioning features or use of this software .\" must display the following acknowledgement: .\" This product includes software developed by the University of .\" California, Berkeley and its contributors. .\" 4. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" From: @(#)route.4 8.6 (Berkeley) 4/19/94 .\" $FreeBSD$ .\" .Dd November 4, 2004 .Dt ROUTE 4 .Os .Sh NAME .Nm route .Nd kernel packet forwarding database .Sh SYNOPSIS .In sys/types.h .In sys/time.h .In sys/socket.h .In net/if.h .In net/route.h .Ft int .Fn socket PF_ROUTE SOCK_RAW "int family" .Sh DESCRIPTION .Fx provides some packet routing facilities. The kernel maintains a routing information database, which is used in selecting the appropriate network interface when transmitting packets. .Pp A user process (or possibly multiple co-operating processes) maintains this database by sending messages over a special kind of socket. This supplants fixed size .Xr ioctl 2 Ns 's used in earlier releases. Routing table changes may only be carried out by the super user. .Pp The operating system may spontaneously emit routing messages in response to external events, such as receipt of a re-direct, or failure to locate a suitable route for a request. The message types are described in greater detail below. .Pp Routing database entries come in two flavors: for a specific host, or for all hosts on a generic subnetwork (as specified by a bit mask and value under the mask. The effect of wildcard or default route may be achieved by using a mask of all zeros, and there may be hierarchical routes. 
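.Pp
For illustration only (the names below are not part of any interface),
a destination matches such an entry when it agrees with the entry's
value under the entry's mask; a host entry carries a mask of all ones,
and the all-zero mask of a default entry matches every destination:
.Bd -literal -offset indent
#include <netinet/in.h>

/* purely illustrative sketch of "value under the mask" matching */
static int
rt_matches(in_addr_t dst, in_addr_t value, in_addr_t mask)
{
	return ((dst & mask) == value);
}
.Ed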
.Pp When the system is booted and addresses are assigned to the network interfaces, each protocol family installs a routing table entry for each interface when it is ready for traffic. Normally the protocol specifies the route through each interface as a .Dq direct connection to the destination host or network. If the route is direct, the transport layer of a protocol family usually requests the packet be sent to the same host specified in the packet. Otherwise, the interface is requested to address the packet to the gateway listed in the routing entry (i.e., the packet is forwarded). .Pp When routing a packet, the kernel will attempt to find the most specific route matching the destination. (If there are two different mask and value-under-the-mask pairs that match, the more specific is the one with more bits in the mask. A route to a host is regarded as being supplied with a mask of as many ones as there are bits in the destination). If no entry is found, the destination is declared to be unreachable, and a routing-miss message is generated if there are any listeners on the routing control socket described below. .Pp A wildcard routing entry is specified with a zero destination address value, and a mask of all zeroes. Wildcard routes will be used when the system fails to find other routes matching the destination. The combination of wildcard routes and routing redirects can provide an economical mechanism for routing traffic. .Pp One opens the channel for passing routing control messages by using the socket call shown in the synopsis above: .Pp The .Fa family parameter may be .Dv AF_UNSPEC which will provide routing information for all address families, or can be restricted to a specific address family by specifying which one is desired. There can be more than one routing socket open per system. .Pp Messages are formed by a header followed by a small number of sockaddrs (now variable length particularly in the .Tn ISO case), interpreted by position, and delimited by the new length entry in the sockaddr. An example of a message with four addresses might be an .Tn ISO redirect: Destination, Netmask, Gateway, and Author of the redirect. The interpretation of which address are present is given by a bit mask within the header, and the sequence is least significant to most significant bit within the vector. .Pp Any messages sent to the kernel are returned, and copies are sent to all interested listeners. The kernel will provide the process ID for the sender, and the sender may use an additional sequence field to distinguish between outstanding messages. However, message replies may be lost when kernel buffers are exhausted. .Pp The kernel may reject certain messages, and will indicate this by filling in the .Ar rtm_errno field. The routing code returns .Er EEXIST if requested to duplicate an existing entry, .Er ESRCH if requested to delete a non-existent entry, or .Er ENOBUFS if insufficient resources were available to install a new route. In the current implementation, all routing processes run locally, and the values for .Ar rtm_errno are available through the normal .Em errno mechanism, even if the routing reply message is lost. .Pp A process may avoid the expense of reading replies to its own messages by issuing a .Xr setsockopt 2 call indicating that the .Dv SO_USELOOPBACK option at the .Dv SOL_SOCKET level is to be turned off. A process may ignore all messages from the routing socket by doing a .Xr shutdown 2 system call for further input. 
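.Pp
As a sketch of the exchange described above (error handling and the
parsing of the reply sockaddrs are trimmed, and the destination
192.0.2.1 is only an example), a process might look up a single route
with an
.Dv RTM_GET
request as follows:
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/socket.h>
#include <net/if.h>
#include <net/route.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	struct {
		struct rt_msghdr m_rtm;
		struct sockaddr_in m_dst;
	} m;
	union {
		struct rt_msghdr r_rtm;
		char r_space[2048];
	} reply;
	int s;

	if ((s = socket(PF_ROUTE, SOCK_RAW, AF_INET)) == -1)
		return (1);

	memset(&m, 0, sizeof(m));
	m.m_rtm.rtm_msglen = sizeof(m);
	m.m_rtm.rtm_version = RTM_VERSION;
	m.m_rtm.rtm_type = RTM_GET;
	m.m_rtm.rtm_addrs = RTA_DST;	/* one sockaddr follows */
	m.m_rtm.rtm_seq = 1;
	m.m_dst.sin_len = sizeof(m.m_dst);
	m.m_dst.sin_family = AF_INET;
	inet_pton(AF_INET, "192.0.2.1", &m.m_dst.sin_addr);

	if (write(s, &m, m.m_rtm.rtm_msglen) == -1)
		return (1);

	/*
	 * A real program would match the reply to the request by
	 * rtm_pid and rtm_seq; the sockaddrs named by rtm_addrs
	 * follow the header in the order of the RTA_* bits.
	 */
	if (read(s, &reply, sizeof(reply)) > 0)
		printf("rtm_errno %d\en", reply.r_rtm.rtm_errno);
	close(s);
	return (0);
}
.Ed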
.Pp If a route is in use when it is deleted, the routing entry will be marked down and removed from the routing table, but the resources associated with it will not be reclaimed until all references to it are released. User processes can obtain information about the routing entry to a specific destination by using a .Dv RTM_GET message, or by calling .Xr sysctl 3 . .Pp Messages include: .Bd -literal #define RTM_ADD 0x1 /* Add Route */ #define RTM_DELETE 0x2 /* Delete Route */ #define RTM_CHANGE 0x3 /* Change Metrics, Flags, or Gateway */ #define RTM_GET 0x4 /* Report Information */ #define RTM_LOSING 0x5 /* Kernel Suspects Partitioning */ #define RTM_REDIRECT 0x6 /* Told to use different route */ #define RTM_MISS 0x7 /* Lookup failed on this address */ #define RTM_LOCK 0x8 /* fix specified metrics */ -#define RTM_RESOLVE 0xb /* request to resolve dst to LL addr */ +#define RTM_RESOLVE 0xb /* request to resolve dst to LL addr - unused */ #define RTM_NEWADDR 0xc /* address being added to iface */ #define RTM_DELADDR 0xd /* address being removed from iface */ #define RTM_IFINFO 0xe /* iface going up/down etc. */ #define RTM_NEWMADDR 0xf /* mcast group membership being added to if */ #define RTM_DELMADDR 0x10 /* mcast group membership being deleted */ #define RTM_IFANNOUNCE 0x11 /* iface arrival/departure */ .Ed .Pp A message header consists of one of the following: .Bd -literal struct rt_msghdr { u_short rtm_msglen; /* to skip over non-understood messages */ u_char rtm_version; /* future binary compatibility */ u_char rtm_type; /* message type */ u_short rtm_index; /* index for associated ifp */ int rtm_flags; /* flags, incl. kern & message, e.g. DONE */ int rtm_addrs; /* bitmask identifying sockaddrs in msg */ pid_t rtm_pid; /* identify sender */ int rtm_seq; /* for sender to identify action */ int rtm_errno; /* why failed */ int rtm_use; /* from rtentry */ u_long rtm_inits; /* which metrics we are initializing */ struct rt_metrics rtm_rmx; /* metrics themselves */ }; struct if_msghdr { u_short ifm_msglen; /* to skip over non-understood messages */ u_char ifm_version; /* future binary compatibility */ u_char ifm_type; /* message type */ int ifm_addrs; /* like rtm_addrs */ int ifm_flags; /* value of if_flags */ u_short ifm_index; /* index for associated ifp */ struct if_data ifm_data; /* statistics and other data about if */ }; struct ifa_msghdr { u_short ifam_msglen; /* to skip over non-understood messages */ u_char ifam_version; /* future binary compatibility */ u_char ifam_type; /* message type */ int ifam_addrs; /* like rtm_addrs */ int ifam_flags; /* value of ifa_flags */ u_short ifam_index; /* index for associated ifp */ int ifam_metric; /* value of ifa_metric */ }; struct ifma_msghdr { u_short ifmam_msglen; /* to skip over non-understood messages */ u_char ifmam_version; /* future binary compatibility */ u_char ifmam_type; /* message type */ int ifmam_addrs; /* like rtm_addrs */ int ifmam_flags; /* value of ifa_flags */ u_short ifmam_index; /* index for associated ifp */ }; struct if_announcemsghdr { u_short ifan_msglen; /* to skip over non-understood messages */ u_char ifan_version; /* future binary compatibility */ u_char ifan_type; /* message type */ u_short ifan_index; /* index for associated ifp */ char ifan_name[IFNAMSIZ]; /* if name, e.g. 
"en0" */ u_short ifan_what; /* what type of announcement */ }; .Ed .Pp The .Dv RTM_IFINFO message uses a .Ar if_msghdr header, the .Dv RTM_NEWADDR and .Dv RTM_DELADDR messages use a .Ar ifa_msghdr header, the .Dv RTM_NEWMADDR and .Dv RTM_DELMADDR messages use a .Vt ifma_msghdr header, the .Dv RTM_IFANNOUNCE message uses a .Vt if_announcemsghdr header, and all other messages use the .Ar rt_msghdr header. .Pp The .Dq Li "struct rt_metrics" and the flag bits are as defined in .Xr rtentry 9 . .Pp Specifiers for metric values in rmx_locks and rtm_inits are: .Bd -literal #define RTV_MTU 0x1 /* init or lock _mtu */ #define RTV_HOPCOUNT 0x2 /* init or lock _hopcount */ #define RTV_EXPIRE 0x4 /* init or lock _expire */ #define RTV_RPIPE 0x8 /* init or lock _recvpipe */ #define RTV_SPIPE 0x10 /* init or lock _sendpipe */ #define RTV_SSTHRESH 0x20 /* init or lock _ssthresh */ #define RTV_RTT 0x40 /* init or lock _rtt */ #define RTV_RTTVAR 0x80 /* init or lock _rttvar */ .Ed .Pp Specifiers for which addresses are present in the messages are: .Bd -literal #define RTA_DST 0x1 /* destination sockaddr present */ #define RTA_GATEWAY 0x2 /* gateway sockaddr present */ #define RTA_NETMASK 0x4 /* netmask sockaddr present */ -#define RTA_GENMASK 0x8 /* cloning mask sockaddr present */ +#define RTA_GENMASK 0x8 /* cloning mask sockaddr present - unused */ #define RTA_IFP 0x10 /* interface name sockaddr present */ #define RTA_IFA 0x20 /* interface addr sockaddr present */ #define RTA_AUTHOR 0x40 /* sockaddr for author of redirect */ #define RTA_BRD 0x80 /* for NEWADDR, broadcast or p-p dest addr */ .Ed .Sh SEE ALSO .Xr sysctl 3 , .Xr route 8 , .Xr rtentry 9 .Pp The constants for the .Va rtm_flags field are documented in the manual page for the .Xr route 8 utility. .Sh HISTORY A .Dv PF_ROUTE protocol family first appeared in .Bx 4.3 reno . Index: head/share/man/man9/rtalloc.9 =================================================================== --- head/share/man/man9/rtalloc.9 (revision 186118) +++ head/share/man/man9/rtalloc.9 (revision 186119) @@ -1,266 +1,240 @@ .\" .\" Copyright 1996 Massachusetts Institute of Technology .\" .\" Permission to use, copy, modify, and distribute this software and .\" its documentation for any purpose and without fee is hereby .\" granted, provided that both the above copyright notice and this .\" permission notice appear in all copies, that both the above .\" copyright notice and this permission notice appear in all .\" supporting documentation, and that the name of M.I.T. not be used .\" in advertising or publicity pertaining to distribution of the .\" software without specific, written prior permission. M.I.T. makes .\" no representations about the suitability of this software for any .\" purpose. It is provided "as is" without express or implied .\" warranty. .\" .\" THIS SOFTWARE IS PROVIDED BY M.I.T. ``AS IS''. M.I.T. DISCLAIMS .\" ALL EXPRESS OR IMPLIED WARRANTIES WITH REGARD TO THIS SOFTWARE, .\" INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF .\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT .\" SHALL M.I.T. 
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, .\" SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT .\" LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF .\" USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND .\" ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, .\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT .\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ -.Dd October 11, 2004 +.\" +.Dd December 11, 2008 .Os .Dt RTALLOC 9 .Sh NAME .Nm rtalloc , .Nm rtalloc_ign , .Nm rtalloc1 , .Nm rtfree .Nd look up a route in the kernel routing table .Sh SYNOPSIS .In sys/types.h .In sys/socket.h .In net/route.h .Ft void .Fn rtalloc "struct route *ro" .Ft void .Fn rtalloc_ign "struct route *ro" "u_long flags" .Ft "struct rtentry *" .Fn rtalloc1 "struct sockaddr *sa" "int report" "u_long flags" .Ft void .Fn rtfree "struct rt_entry *rt" .Fn RTFREE "struct rt_entry *rt" .Fn RT_LOCK "struct rt_entry *rt" .Fn RT_UNLOCK "struct rt_entry *rt" .Fn RT_ADDREF "struct rt_entry *rt" .Fn RT_REMREF "struct rt_entry *rt" .Sh DESCRIPTION The kernel uses a radix tree structure to manage routes for the networking subsystem. The .Fn rtalloc family of routines is used by protocols to query this structure for a route corresponding to a particular end-node address, and to cause certain protocol\- and interface-specific actions to take place. .\" XXX - -mdoc should contain a standard request for getting em and .\" en dashes. .Pp -When a route with the flag -.Dv RTF_CLONING -is retrieved, and the action of this flag is not masked, the -.Nm -facility automatically generates a new route using information in the -old route as a template, and -sends an -.Dv RTM_RESOLVE -message to the appropriate interface-address route-management routine -.Pq Fn ifa->ifa_rtrequest . -This generated route is called -.Em cloned , -and has -.Dv RTF_WASCLONED -flag set. .Dv RTF_PRCLONING flag is obsolete and thus ignored by facility. If the .Dv RTF_XRESOLVE flag is set, then the .Dv RTM_RESOLVE message is sent instead on the .Xr route 4 socket interface, requesting that an external program resolve the address in question and modify the route appropriately. .Pp The default interface is .Fn rtalloc . Its only argument is .Fa ro , a pointer to a .Dq Li "struct route" , which is defined as follows: .Bd -literal -offset indent struct route { struct sockaddr ro_dst; struct rtentry *ro_rt; }; .Ed .Pp Thus, this function can only be used for address families which are smaller than the default .Dq Li "struct sockaddr" . Before calling .Fn rtalloc for the first time, callers should ensure that unused bits of the structure are set to zero. On subsequent calls, .Fn rtalloc returns without performing a lookup if .Fa ro->ro_rt is non-null and the .Dv RTF_UP flag is set in the route's .Li rt_flags field. .Pp The .Fn rtalloc_ign -interface can be used when the default actions of -.Fn rtalloc -in the presence of the -.Dv RTF_CLONING -flag is undesired. +interface can be used when the caller does not want to receive +the returned +.Fa rtentry +locked. The .Fa ro argument is the same as .Fn rtalloc , but there is additionally a .Fa flags -argument, which lists the flags in the route which are to be -.Em ignored -(in most cases this is -.Dv RTF_CLONING -flag). +argument, which is now only used to pass +.Dv RTF_RNH_LOCKED +indicating that the radix tree lock is already held. 
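.Pp
The usual lookup-and-release pattern in protocol code looks roughly
like the following sketch, where the destination
.Va dst
is assumed to be a
.Vt "struct in_addr"
supplied by the caller:
.Bd -literal -offset indent
struct route ro;
struct sockaddr_in *sin;

bzero(&ro, sizeof(ro));
sin = (struct sockaddr_in *)&ro.ro_dst;
sin->sin_len = sizeof(*sin);
sin->sin_family = AF_INET;
sin->sin_addr = dst;

rtalloc_ign(&ro, 0);
if (ro.ro_rt == NULL)
	return (EHOSTUNREACH);	/* no usable route */

/* ... use ro.ro_rt->rt_ifp and ro.ro_rt->rt_gateway ... */

RTFREE(ro.ro_rt);
.Ed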
Both .Fn rtalloc and .Fn rtalloc_ign functions return a pointer to an unlocked .Vt "struct rtentry" . .Pp The .Fn rtalloc1 function is the most general form of .Fn rtalloc (and both of the other forms are implemented as calls to rtalloc1). It does not use the .Dq Li "struct route" , and is therefore suitable for address families which require more space than is in a traditional .Dq Li "struct sockaddr" . Instead, it takes a .Dq Li "struct sockaddr *" directly as the .Fa sa argument. The second argument, .Fa report , -controls whether -.Dv RTM_RESOLVE -requests are sent to the lower layers when an -.Dv RTF_CLONING -or -.Dv RTF_PRCLONING -route is cloned. -Ordinarily a value of one should be passed, except -in the processing of those lower layers which use the cloning -facility. +controls whether the lower layers are notified when a lookup fails. The third argument, .Fa flags , is a set of flags to ignore, as in .Fn rtalloc_ign . The .Fn rtalloc1 function returns a pointer to a locked .Vt "struct rtentry" . .Pp The .Fn rtfree function frees a locked route entry, e.g., a previously allocated by .Fn rtalloc1 . .Pp The .Fn RTFREE macro is used to free unlocked route entries, previously allocated by .Fn rtalloc or .Fn rtalloc_ign . The .Fn RTFREE macro decrements the reference count on the routing table entry (see below), and frees it if the reference count has reached zero. .Pp The preferred usage is allocating a route using .Fn rtalloc or .Fn rtalloc_ign and freeing using .Fn RTFREE . .Pp The .Fn RT_LOCK macro is used to lock a routing table entry. The .Fn RT_UNLOCK macro is used to unlock a routing table entry. .Pp The .Fn RT_ADDREF macro increments the reference count on a previously locked route entry. The .Fn RT_REMREF macro decrements the reference count on a previously locked route entry. .Sh RETURN VALUES The .Fn rtalloc , .Fn rtalloc_ign and .Fn rtfree functions do not return a value. The .Fn rtalloc1 function returns a pointer to a routing-table entry if it succeeds, otherwise a null pointer. Lack of a route should in most cases be translated to the .Xr errno 2 value .Er EHOSTUNREACH . .Sh SEE ALSO .Xr route 4 , .Xr rtentry 9 .Sh HISTORY The .Nm facility first appeared in .Bx 4.2 , although with much different internals. The .Fn rtalloc_ign function and the .Fa flags argument to .Fn rtalloc1 first appeared in .Fx 2.0 . Routing table locking was introduced in .Fx 5.2 . .Sh AUTHORS This manual page was written by .An Garrett Wollman , as were the changes to implement .Dv RTF_PRCLONING and the .Fn rtalloc_ign function and the .Fa flags argument to .Fn rtalloc1 . Index: head/share/man/man9/rtentry.9 =================================================================== --- head/share/man/man9/rtentry.9 (revision 186118) +++ head/share/man/man9/rtentry.9 (revision 186119) @@ -1,303 +1,251 @@ .\" .\" Copyright 1996 Massachusetts Institute of Technology .\" .\" Permission to use, copy, modify, and distribute this software and .\" its documentation for any purpose and without fee is hereby .\" granted, provided that both the above copyright notice and this .\" permission notice appear in all copies, that both the above .\" copyright notice and this permission notice appear in all .\" supporting documentation, and that the name of M.I.T. not be used .\" in advertising or publicity pertaining to distribution of the .\" software without specific, written prior permission. M.I.T. makes .\" no representations about the suitability of this software for any .\" purpose. 
It is provided "as is" without express or implied .\" warranty. .\" .\" THIS SOFTWARE IS PROVIDED BY M.I.T. ``AS IS''. M.I.T. DISCLAIMS .\" ALL EXPRESS OR IMPLIED WARRANTIES WITH REGARD TO THIS SOFTWARE, .\" INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF .\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT .\" SHALL M.I.T. BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, .\" SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT .\" LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF .\" USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND .\" ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, .\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT .\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd October 7, 2004 +.Dd December 11, 2008 .Os .Dt RTENTRY 9 .Sh NAME .Nm rtentry .Nd structure of an entry in the kernel routing table .Sh SYNOPSIS .In sys/types.h .In sys/socket.h .In net/route.h .Sh DESCRIPTION The kernel provides a common mechanism by which all protocols can store and retrieve entries from a central table of routes. Parts of this mechanism are also used to interact with user-level processes by means of a socket in the .Xr route 4 pseudo-protocol family. The .In net/route.h header file defines the structures and manifest constants used in this facility. .Pp The basic structure of a route is defined by .Vt "struct rtentry" , which includes the following fields: .Bl -tag -offset indent -width 6n .It Vt "struct radix_node rt_nodes[2]" ; Glue used by the radix-tree routines. These members also include in their substructure the key (i.e., destination address) and mask used when the route was created. The .Fn rt_key rt and .Fn rt_mask rt macros can be used to extract this information (in the form of a .Vt "struct sockaddr *" ) given a .Vt "struct rtentry *" . .It Vt "struct sockaddr *rt_gateway" ; The .Dq target of the route, which can either represent a destination in its own right (some protocols will put a link-layer address here), or some intermediate stop on the way to that destination (if the .Dv RTF_GATEWAY flag is set). -.It Vt "u_long rt_flags" ; +.It Vt "int rt_flags" ; See below. +.It Vt "int rt_refcnt" ; +Route entries are reference-counted; this field indicates the number +of external (to the radix tree) references. .It Vt "struct ifnet *rt_ifp" ; .It Vt "struct ifaddr *rt_ifa" ; These two fields represent the .Dq answer , as it were, to the question posed by a route lookup; that is, they name the interface and interface address to be used in sending a packet to the destination or set of destinations which this route represents. .It Vt "struct rt_metrics_lite rt_rmx" ; See below. -.It Vt "long rt_refcnt" ; -Route entries are reference-counted; this field indicates the number -of external (to the radix tree) references. -If the .Dv RTF_UP flag is not present, the .Fn rtfree function will delete the route from the radix tree when the last reference drops. -.It Vt "struct sockaddr *rt_genmask" ; -When the -.Fn rtalloc -family of functions performs a cloning operation as requested by the -.Dv RTF_CLONING -flag, this field is used as the mask for the new route which is -inserted into the table. -If this field is a null pointer, then a host -route is generated. 
-.It Vt "caddr_t rt_llinfo" ; -When the -.Dv RTF_LLINFO -flag is set, this field contains information specific to the link -layer represented by the named interface address. -(It is normally managed by the -.Va rt_ifa->ifa_rtrequest -routine.) -Protocols such as -.Xr arp 4 -use this field to reference per-destination state internal to that -protocol. .It Vt "struct rtentry *rt_gwroute" ; This member is a reference to a route whose destination is .Va rt_gateway . It is only used for .Dv RTF_GATEWAY routes. -.It Vt "struct rtentry *rt_parent" ; -A reference to the route from which this route was cloned, or a null -pointer if this route was not generated by cloning. -See also the -.Dv RTF_WASCLONED -flag. .It Vt "struct mtx rt_mtx" ; Mutex to lock this routing entry. .El .Pp The following flag bits are defined: .Bl -tag -offset indent -width ".Dv RTF_BLACKHOLE" -compact .It Dv RTF_UP The route is not deleted. .It Dv RTF_GATEWAY The route points to an intermediate destination and not the ultimate recipient; the .Va rt_gateway and .Va rt_gwroute fields name that destination. .It Dv RTF_HOST This is a host route. .It Dv RTF_REJECT The destination is presently unreachable. This should result in an .Er EHOSTUNREACH error from output routines. .It Dv RTF_DYNAMIC This route was created dynamically by .Fn rtredirect . .It Dv RTF_MODIFIED This route was modified by .Fn rtredirect . .It Dv RTF_DONE Used only in the .Xr route 4 protocol, indicating that the request was executed. -.It Dv RTF_CLONING -When this route is returned as a result of a lookup, automatically -create a new route using this one as a template and -.Va rt_genmask -(if present) as a mask. .It Dv RTF_XRESOLVE When this route is returned as a result of a lookup, send a report on the .Xr route 4 interface requesting that an external process perform resolution for this route. -(Used in conjunction with -.Dv RTF_CLONING . ) -.It Dv RTF_LLINFO -Indicates that this route represents information being managed by a -link layer's adaptation layer (e.g., -.Tn ARP ) . .It Dv RTF_STATIC Indicates that this route was manually added by means of the .Xr route 8 command. .It Dv RTF_BLACKHOLE Requests that output sent via this route be discarded. .It Dv RTF_PROTO1 .It Dv RTF_PROTO2 .It Dv RTF_PROTO3 Protocol-specific. .It Dv RTF_PRCLONING This flag is obsolete and simply ignored by facility. -.It Dv RTF_WASCLONED -Indicates that this route was generated as a result of cloning -requested by the -.Dv RTF_CLONING -flag. -When set, the -.Va rt_parent -field indicates the route from which this one was generated. .It Dv RTF_PINNED (Reserved for future use to indicate routes which are not to be modified by a routing protocol.) .It Dv RTF_LOCAL Indicates that the destination of this route is an address configured as belonging to this system. .It Dv RTF_BROADCAST Indicates that the destination is a broadcast address. .It Dv RTF_MULTICAST Indicates that the destination is a multicast address. .El .Pp Every route has associated with it a set of metrics, stored in .Vt "struct rt_metrics_lite" . Metrics are supplied in .Vt "struct rt_metrics" passed with routing control messages via .Xr route 4 API. Currently only .Vt rmx_mtu , rmx_expire , and .Vt rmx_pksent metrics are used in .Vt "struct rt_metrics_lite" . All others are ignored. .Pp The following metrics are defined by .Vt "struct rt_metrics" : .Bl -tag -offset indent -width 6n .It Vt "u_long rmx_locks" ; Flag bits indicating which metrics the kernel is not permitted to dynamically modify. 
.It Vt "u_long rmx_mtu" ; MTU for this path. .It Vt "u_long rmx_hopcount" ; Number of intermediate systems on the path to this destination. .It Vt "u_long rmx_expire" ; The time (a la .Xr time 3 ) at which this route should expire, or zero if it should never expire. It is the responsibility of individual protocol suites to ensure that routes are actually deleted once they expire. .It Vt "u_long rmx_recvpipe" ; Nominally, the bandwidth-delay product for the path .Em from the destination .Em to this system. In practice, this value is used to set the size of the receive buffer (and thus the window in sliding-window protocols like .Tn TCP ) . .It Vt "u_long rmx_sendpipe" ; As before, but in the opposite direction. .It Vt "u_long rmx_ssthresh" ; The slow-start threshold used in .Tn TCP congestion-avoidance. .It Vt "u_long rmx_rtt" ; The round-trip time to this destination, in units of .Dv RMX_RTTUNIT per second. .It Vt "u_long rmx_rttvar" ; The average deviation of the round-trip time to this destination, in units of .Dv RMX_RTTUNIT per second. .It Vt "u_long rmx_pksent" ; A count of packets successfully sent via this route. .It Vt "u_long rmx_filler[4]" ; .\" XXX badly named Empty space available for protocol-specific information. .El .Sh SEE ALSO .Xr route 4 , .Xr route 8 , .Xr rtalloc 9 .Sh HISTORY The .Vt rtentry structure first appeared in .Bx 4.2 . The radix-tree representation of the routing table and the .Vt rt_metrics structure first appeared in .Bx 4.3 reno . .Sh AUTHORS This manual page was written by .An Garrett Wollman . .Sh BUGS There are a number of historical relics remaining in this interface. The .Va rt_gateway and .Va rmx_filler fields could be named better. -.Pp -There is some disagreement over whether it is legitimate for -.Dv RTF_LLINFO -to be set by any process other than -.Va rt_ifa->ifa_rtrequest . Index: head/sys/conf/NOTES =================================================================== --- head/sys/conf/NOTES (revision 186118) +++ head/sys/conf/NOTES (revision 186119) @@ -1,2692 +1,2692 @@ # $FreeBSD$ # # NOTES -- Lines that can be cut/pasted into kernel and hints configs. # # Lines that begin with 'device', 'options', 'machine', 'ident', 'maxusers', # 'makeoptions', 'hints', etc. go into the kernel configuration that you # run config(8) with. # # Lines that begin with 'hint.' are NOT for config(8), they go into your # hints file. See /boot/device.hints and/or the 'hints' config(8) directive. # # Please use ``make LINT'' to create an old-style LINT file if you want to # do kernel test-builds. # # This file contains machine independent kernel configuration notes. For # machine dependent notes, look in /sys//conf/NOTES. # # # NOTES conventions and style guide: # # Large block comments should begin and end with a line containing only a # comment character. # # To describe a particular object, a block comment (if it exists) should # come first. Next should come device, options, and hints lines in that # order. All device and option lines must be described by a comment that # doesn't just expand the device or option name. Use only a concise # comment on the same line if possible. Very detailed descriptions of # devices and subsystems belong in man pages. # # A space followed by a tab separates 'options' from an option name. Two # spaces followed by a tab separate 'device' from a device name. Comments # after an option or device should use one space after the comment character. 
# To comment out a negative option that disables code and thus should not be # enabled for LINT builds, precede 'options' with "#!". # # # This is the ``identification'' of the kernel. Usually this should # be the same as the name of your kernel. # ident LINT # # The `maxusers' parameter controls the static sizing of a number of # internal system tables by a formula defined in subr_param.c. # Omitting this parameter or setting it to 0 will cause the system to # auto-size based on physical memory. # maxusers 10 # # The `makeoptions' parameter allows variables to be passed to the # generated Makefile in the build area. # # CONF_CFLAGS gives some extra compiler flags that are added to ${CFLAGS} # after most other flags. Here we use it to inhibit use of non-optimal # gcc built-in functions (e.g., memcmp). # # DEBUG happens to be magic. # The following is equivalent to 'config -g KERNELNAME' and creates # 'kernel.debug' compiled with -g debugging as well as a normal # 'kernel'. Use 'make install.debug' to install the debug kernel # but that isn't normally necessary as the debug symbols are not loaded # by the kernel and are not useful there anyway. # # KERNEL can be overridden so that you can change the default name of your # kernel. # # MODULES_OVERRIDE can be used to limit modules built to a specific list. # makeoptions CONF_CFLAGS=-fno-builtin #Don't allow use of memcmp, etc. #makeoptions DEBUG=-g #Build kernel with gdb(1) debug symbols #makeoptions KERNEL=foo #Build kernel "foo" and install "/foo" # Only build ext2fs module plus those parts of the sound system I need. #makeoptions MODULES_OVERRIDE="ext2fs sound/sound sound/driver/maestro3" makeoptions DESTDIR=/tmp # # FreeBSD processes are subject to certain limits to their consumption # of system resources. See getrlimit(2) for more details. Each # resource limit has two values, a "soft" limit and a "hard" limit. # The soft limits can be modified during normal system operation, but # the hard limits are set at boot time. Their default values are # in sys//include/vmparam.h. There are two ways to change them: # # 1. Set the values at kernel build time. The options below are one # way to allow that limit to grow to 1GB. They can be increased # further by changing the parameters: # # 2. In /boot/loader.conf, set the tunables kern.maxswzone, # kern.maxbcache, kern.maxtsiz, kern.dfldsiz, kern.maxdsiz, # kern.dflssiz, kern.maxssiz and kern.sgrowsiz. # # The options in /boot/loader.conf override anything in the kernel # configuration file. See the function init_param1 in # sys/kern/subr_param.c for more details. # options MAXDSIZ=(1024UL*1024*1024) options MAXSSIZ=(128UL*1024*1024) options DFLDSIZ=(1024UL*1024*1024) # # BLKDEV_IOSIZE sets the default block size used in user block # device I/O. Note that this value will be overridden by the label # when specifying a block device from a label with a non-0 # partition blocksize. The default is PAGE_SIZE. # options BLKDEV_IOSIZE=8192 # # MAXPHYS and DFLTPHYS # # These are the max and default 'raw' I/O block device access sizes. # Reads and writes will be split into DFLTPHYS chunks. Some applications # have better performance with larger raw I/O access sizes. Typically # MAXPHYS should be twice the size of DFLTPHYS. Note that certain VM # parameters are derived from these values and making them too large # can make an an unbootable kernel. # # The defaults are 64K and 128K respectively. 
options DFLTPHYS=(64*1024) options MAXPHYS=(128*1024) # This allows you to actually store this configuration file into # the kernel binary itself. See config(8) for more details. # options INCLUDE_CONFIG_FILE # Include this file in kernel options GEOM_AES # Don't use, use GEOM_BDE options GEOM_BDE # Disk encryption. options GEOM_BSD # BSD disklabels options GEOM_CACHE # Disk cache. options GEOM_CONCAT # Disk concatenation. options GEOM_ELI # Disk encryption. options GEOM_FOX # Redundant path mitigation options GEOM_GATE # Userland services. options GEOM_JOURNAL # Journaling. options GEOM_LABEL # Providers labelization. options GEOM_LINUX_LVM # Linux LVM2 volumes options GEOM_MBR # DOS/MBR partitioning options GEOM_MIRROR # Disk mirroring. options GEOM_MULTIPATH # Disk multipath options GEOM_NOP # Test class. options GEOM_PART_APM # Apple partitioning options GEOM_PART_BSD # BSD disklabel options GEOM_PART_GPT # GPT partitioning options GEOM_PART_MBR # MBR partitioning options GEOM_PART_PC98 # PC-9800 disk partitioning options GEOM_PART_VTOC8 # SMI VTOC8 disk label options GEOM_PC98 # NEC PC9800 partitioning options GEOM_RAID3 # RAID3 functionality. options GEOM_SHSEC # Shared secret. options GEOM_STRIPE # Disk striping. options GEOM_SUNLABEL # Sun/Solaris partitioning options GEOM_UZIP # Read-only compressed disks options GEOM_VIRSTOR # Virtual storage. options GEOM_VOL # Volume names from UFS superblock options GEOM_ZERO # Performance testing helper. # # The root device and filesystem type can be compiled in; # this provides a fallback option if the root device cannot # be correctly guessed by the bootstrap code, or an override if # the RB_DFLTROOT flag (-r) is specified when booting the kernel. # options ROOTDEVNAME=\"ufs:da0s2e\" ##################################################################### # Scheduler options: # # Specifying one of SCHED_4BSD or SCHED_ULE is mandatory. These options # select which scheduler is compiled in. # # SCHED_4BSD is the historical, proven, BSD scheduler. It has a global run # queue and no CPU affinity which makes it suboptimal for SMP. It has very # good interactivity and priority selection. # # SCHED_ULE provides significant performance advantages over 4BSD on many # workloads on SMP machines. It supports cpu-affinity, per-cpu runqueues # and scheduler locks. It also has a stronger notion of interactivity # which leads to better responsiveness even on uniprocessor machines. This # will eventually become the default scheduler. # # SCHED_STATS is a debugging option which keeps some stats in the sysctl # tree at 'kern.sched.stats' and is useful for debugging scheduling decisions. # options SCHED_4BSD options SCHED_STATS #options SCHED_ULE ##################################################################### # SMP OPTIONS: # # SMP enables building of a Symmetric MultiProcessor Kernel. # Mandatory: options SMP # Symmetric MultiProcessor Kernel # ADAPTIVE_MUTEXES changes the behavior of blocking mutexes to spin # if the thread that currently owns the mutex is executing on another # CPU. This behaviour is enabled by default, so this option can be used # to disable it. options NO_ADAPTIVE_MUTEXES # ADAPTIVE_RWLOCKS changes the behavior of reader/writer locks to spin # if the thread that currently owns the rwlock is executing on another # CPU. This behaviour is enabled by default, so this option can be used # to disable it. 
options NO_ADAPTIVE_RWLOCKS # ADAPTIVE_SX changes the behavior of sx locks to spin if the thread # that currently owns the lock is executing on another CPU. Note that # in addition to enabling this option, individual sx locks must be # initialized with the SX_ADAPTIVESPIN flag. options ADAPTIVE_SX # MUTEX_NOINLINE forces mutex operations to call functions to perform each # operation rather than inlining the simple cases. This can be used to # shrink the size of the kernel text segment. Note that this behavior is # already implied by the INVARIANT_SUPPORT, INVARIANTS, KTR, LOCK_PROFILING, # and WITNESS options. options MUTEX_NOINLINE # RWLOCK_NOINLINE forces rwlock operations to call functions to perform each # operation rather than inlining the simple cases. This can be used to # shrink the size of the kernel text segment. Note that this behavior is # already implied by the INVARIANT_SUPPORT, INVARIANTS, KTR, LOCK_PROFILING, # and WITNESS options. options RWLOCK_NOINLINE # SX_NOINLINE forces sx lock operations to call functions to perform each # operation rather than inlining the simple cases. This can be used to # shrink the size of the kernel text segment. Note that this behavior is # already implied by the INVARIANT_SUPPORT, INVARIANTS, KTR, LOCK_PROFILING, # and WITNESS options. options SX_NOINLINE # SMP Debugging Options: # # PREEMPTION allows the threads that are in the kernel to be preempted by # higher priority [interrupt] threads. It helps with interactivity # and allows interrupt threads to run sooner rather than waiting. # WARNING! Only tested on amd64 and i386. # FULL_PREEMPTION instructs the kernel to preempt non-realtime kernel # threads. Its sole use is to expose race conditions and other # bugs during development. Enabling this option will reduce # performance and increase the frequency of kernel panics by # design. If you aren't sure that you need it then you don't. # Relies on the PREEMPTION option. DON'T TURN THIS ON. # MUTEX_DEBUG enables various extra assertions in the mutex code. # SLEEPQUEUE_PROFILING enables rudimentary profiling of the hash table # used to hold active sleep queues as well as sleep wait message # frequency. # TURNSTILE_PROFILING enables rudimentary profiling of the hash table # used to hold active lock queues. # WITNESS enables the witness code which detects deadlocks and cycles # during locking operations. # WITNESS_KDB causes the witness code to drop into the kernel debugger if # a lock hierarchy violation occurs or if locks are held when going to # sleep. # WITNESS_SKIPSPIN disables the witness checks on spin mutexes. options PREEMPTION options FULL_PREEMPTION options MUTEX_DEBUG options WITNESS options WITNESS_KDB options WITNESS_SKIPSPIN # LOCK_PROFILING - Profiling locks. See LOCK_PROFILING(9) for details. options LOCK_PROFILING # Set the number of buffers and the hash size. The hash size MUST be larger # than the number of buffers. Hash size should be prime. options MPROF_BUFFERS="1536" options MPROF_HASH_SIZE="1543" # Profiling for internal hash tables. options SLEEPQUEUE_PROFILING options TURNSTILE_PROFILING ##################################################################### # COMPATIBILITY OPTIONS # # Implement system calls compatible with 4.3BSD and older versions of # FreeBSD. You probably do NOT want to remove this as much current code # still relies on the 4.3 emulation. 
Note that some architectures that # are supported by FreeBSD do not include support for certain important # aspects of this compatibility option, namely those related to the # signal delivery mechanism. # options COMPAT_43 # Old tty interface. options COMPAT_43TTY # Enable FreeBSD4 compatibility syscalls options COMPAT_FREEBSD4 # Enable FreeBSD5 compatibility syscalls options COMPAT_FREEBSD5 # Enable FreeBSD6 compatibility syscalls options COMPAT_FREEBSD6 # Enable FreeBSD7 compatibility syscalls options COMPAT_FREEBSD7 # # These three options provide support for System V Interface # Definition-style interprocess communication, in the form of shared # memory, semaphores, and message queues, respectively. # options SYSVSHM options SYSVSEM options SYSVMSG ##################################################################### # DEBUGGING OPTIONS # # Compile with kernel debugger related code. # options KDB # # Print a stack trace of the current thread on the console for a panic. # options KDB_TRACE # # Don't enter the debugger for a panic. Intended for unattended operation # where you may want to enter the debugger from the console, but still want # the machine to recover from a panic. # options KDB_UNATTENDED # # Enable the ddb debugger backend. # options DDB # # Print the numerical value of symbols in addition to the symbolic # representation. # options DDB_NUMSYM # # Enable the remote gdb debugger backend. # options GDB # # Enable the kernel DTrace hooks which are required to load the DTrace # kernel modules. # options KDTRACE_HOOKS # # SYSCTL_DEBUG enables a 'sysctl' debug tree that can be used to dump the # contents of the registered sysctl nodes on the console. It is disabled by # default because it generates excessively verbose console output that can # interfere with serial console operation. # options SYSCTL_DEBUG # # DEBUG_MEMGUARD builds and enables memguard(9), a replacement allocator # for the kernel used to detect modify-after-free scenarios. See the # memguard(9) man page for more information on usage. # options DEBUG_MEMGUARD # # DEBUG_REDZONE enables buffer underflows and buffer overflows detection for # malloc(9). # options DEBUG_REDZONE # # KTRACE enables the system-call tracing facility ktrace(2). To be more # SMP-friendly, KTRACE uses a worker thread to process most trace events # asynchronously to the thread generating the event. This requires a # pre-allocated store of objects representing trace events. The # KTRACE_REQUEST_POOL option specifies the initial size of this store. # The size of the pool can be adjusted both at boottime and runtime via # the kern.ktrace_request_pool tunable and sysctl. # options KTRACE #kernel tracing options KTRACE_REQUEST_POOL=101 # # KTR is a kernel tracing mechanism imported from BSD/OS. Currently # it has no userland interface aside from a few sysctl's. It is # enabled with the KTR option. KTR_ENTRIES defines the number of # entries in the circular trace buffer; it must be a power of two. # KTR_COMPILE defines the mask of events to compile into the kernel as # defined by the KTR_* constants in . KTR_MASK defines the # initial value of the ktr_mask variable which determines at runtime # what events to trace. KTR_CPUMASK determines which CPU's log # events, with bit X corresponding to CPU X. KTR_VERBOSE enables # dumping of KTR events to the console by default. This functionality # can be toggled via the debug.ktr_verbose sysctl and defaults to off # if KTR_VERBOSE is not defined. 
# options KTR options KTR_ENTRIES=1024 options KTR_COMPILE=(KTR_INTR|KTR_PROC) options KTR_MASK=KTR_INTR options KTR_CPUMASK=0x3 options KTR_VERBOSE # # ALQ(9) is a facility for the asynchronous queuing of records from the kernel # to a vnode, and is employed by services such as KTR(4) to produce trace # files based on a kernel event stream. Records are written asynchronously # in a worker thread. # options ALQ options KTR_ALQ # # The INVARIANTS option is used in a number of source files to enable # extra sanity checking of internal structures. This support is not # enabled by default because of the extra time it would take to check # for these conditions, which can only occur as a result of # programming errors. # options INVARIANTS # # The INVARIANT_SUPPORT option makes us compile in support for # verifying some of the internal structures. It is a prerequisite for # 'INVARIANTS', as enabling 'INVARIANTS' will make these functions be # called. The intent is that you can set 'INVARIANTS' for single # source files (by changing the source file or specifying it on the # command line) if you have 'INVARIANT_SUPPORT' enabled. Also, if you # wish to build a kernel module with 'INVARIANTS', then adding # 'INVARIANT_SUPPORT' to your kernel will provide all the necessary # infrastructure without the added overhead. # options INVARIANT_SUPPORT # # The DIAGNOSTIC option is used to enable extra debugging information # from some parts of the kernel. As this makes everything more noisy, # it is disabled by default. # options DIAGNOSTIC # # REGRESSION causes optional kernel interfaces necessary only for regression # testing to be enabled. These interfaces may constitute security risks # when enabled, as they permit processes to easily modify aspects of the # run-time environment to reproduce unlikely or unusual (possibly normally # impossible) scenarios. # options REGRESSION # # RESTARTABLE_PANICS allows one to continue from a panic as if it were # a call to the debugger to continue from a panic as instead. It is only # useful if a kernel debugger is present. To restart from a panic, reset # the panicstr variable to NULL and continue execution. This option is # for development use only and should NOT be used in production systems # to "workaround" a panic. # #options RESTARTABLE_PANICS # # This option let some drivers co-exist that can't co-exist in a running # system. This is used to be able to compile all kernel code in one go for # quality assurance purposes (like this file, which the option takes it name # from.) # options COMPILING_LINT # # STACK enables the stack(9) facility, allowing the capture of kernel stack # for the purpose of procinfo(1), etc. stack(9) will also be compiled in # automatically if DDB(4) is compiled into the kernel. # options STACK ##################################################################### # PERFORMANCE MONITORING OPTIONS # # The hwpmc driver that allows the use of in-CPU performance monitoring # counters for performance monitoring. The base kernel needs to configured # with the 'options' line, while the hwpmc device can be either compiled # in or loaded as a loadable kernel module. # # Additional configuration options may be required on specific architectures, # please see hwpmc(4). 
device hwpmc # Driver (also a loadable module) options HWPMC_HOOKS # Other necessary kernel hooks ##################################################################### # NETWORKING OPTIONS # # Protocol families # options INET #Internet communications protocols options INET6 #IPv6 communications protocols options ROUTETABLES=2 # max 16. 1 is back compatible. # In order to enable IPSEC you MUST also add device crypto to # your kernel configuration options IPSEC #IP security (requires device crypto) #options IPSEC_DEBUG #debug for IP security # # Set IPSEC_FILTERTUNNEL to force packets coming through a tunnel # to be processed by any configured packet filtering twice. # The default is that packets coming out of a tunnel are _not_ processed; # they are assumed trusted. # # IPSEC history is preserved for such packets, and can be filtered # using ipfw(8)'s 'ipsec' keyword, when this option is enabled. # #options IPSEC_FILTERTUNNEL #filter ipsec packets from a tunnel options IPX #IPX/SPX communications protocols options NCP #NetWare Core protocol options NETATALK #Appletalk communications protocols options NETATALKDEBUG #Appletalk debugging # # SMB/CIFS requester # NETSMB enables support for SMB protocol, it requires LIBMCHAIN and LIBICONV # options. options NETSMB #SMB/CIFS requester # mchain library. It can be either loaded as KLD or compiled into kernel options LIBMCHAIN # libalias library, performing NAT options LIBALIAS # # SCTP is a NEW transport protocol defined by # RFC2960 updated by RFC3309 and RFC3758.. and # soon to have a new base RFC and many many more # extensions. This release supports all the extensions # including many drafts (most about to become RFC's). # It is the premeier SCTP implementation in the NET # and is quite well tested. # # Note YOU MUST have both INET and INET6 defined. # you don't have to enable V6, but SCTP is # dual stacked and so far we have not teased apart # the V6 and V4.. since an association can span # both a V6 and V4 address at the SAME time :-) # options SCTP # There are bunches of options: # this one turns on all sorts of # nastly printing that you can # do. Its all controled by a # bit mask (settable by socket opt and # by sysctl). Including will not cause # logging until you set the bits.. but it # can be quite verbose.. so without this # option we don't do any of the tests for # bits and prints.. which makes the code run # faster.. if you are not debugging don't use. options SCTP_DEBUG # # This option turns off the CRC32c checksum. Basically # You will not be able to talk to anyone else that # has not done this. Its more for expermentation to # see how much CPU the CRC32c really takes. Most new # cards for TCP support checksum offload.. so this # option gives you a "view" into what SCTP would be # like with such an offload (which only exists in # high in iSCSI boards so far). With the new # splitting 8's algorithm its not as bad as it used # to be.. but it does speed things up try only # for in a captured lab environment :-) options SCTP_WITH_NO_CSUM # # # All that options after that turn on specific types of # logging. You can monitor CWND growth, flight size # and all sorts of things. Go look at the code and # see. I have used this to produce interesting # charts and graphs as well :-> # # I have not yet commited the tools to get and print # the logs, I will do that eventually .. 
before then # if you want them send me an email rrs@freebsd.org # You basically must have KTR enabled for these # and you then set the sysctl to turn on/off various # logging bits. Use ktrdump to pull the log and run # it through a dispaly program.. and graphs and other # things too. # options SCTP_LOCK_LOGGING options SCTP_MBUF_LOGGING options SCTP_MBCNT_LOGGING options SCTP_PACKET_LOGGING options SCTP_LTRACE_CHUNKS options SCTP_LTRACE_ERRORS # altq(9). Enable the base part of the hooks with the ALTQ option. # Individual disciplines must be built into the base system and can not be # loaded as modules at this point. ALTQ requires a stable TSC so if yours is # broken or changes with CPU throttling then you must also have the ALTQ_NOPCC # option. options ALTQ options ALTQ_CBQ # Class Based Queueing options ALTQ_RED # Random Early Detection options ALTQ_RIO # RED In/Out options ALTQ_HFSC # Hierarchical Packet Scheduler options ALTQ_CDNR # Traffic conditioner options ALTQ_PRIQ # Priority Queueing options ALTQ_NOPCC # Required if the TSC is unusable options ALTQ_DEBUG # netgraph(4). Enable the base netgraph code with the NETGRAPH option. # Individual node types can be enabled with the corresponding option # listed below; however, this is not strictly necessary as netgraph # will automatically load the corresponding KLD module if the node type # is not already compiled into the kernel. Each type below has a # corresponding man page, e.g., ng_async(8). options NETGRAPH # netgraph(4) system options NETGRAPH_DEBUG # enable extra debugging, this # affects netgraph(4) and nodes # Node types options NETGRAPH_ASYNC options NETGRAPH_ATMLLC options NETGRAPH_ATM_ATMPIF options NETGRAPH_BLUETOOTH # ng_bluetooth(4) options NETGRAPH_BLUETOOTH_BT3C # ng_bt3c(4) options NETGRAPH_BLUETOOTH_HCI # ng_hci(4) options NETGRAPH_BLUETOOTH_L2CAP # ng_l2cap(4) options NETGRAPH_BLUETOOTH_SOCKET # ng_btsocket(4) options NETGRAPH_BLUETOOTH_UBT # ng_ubt(4) options NETGRAPH_BLUETOOTH_UBTBCMFW # ubtbcmfw(4) options NETGRAPH_BPF options NETGRAPH_BRIDGE options NETGRAPH_CAR options NETGRAPH_CISCO options NETGRAPH_DEFLATE options NETGRAPH_DEVICE options NETGRAPH_ECHO options NETGRAPH_EIFACE options NETGRAPH_ETHER options NETGRAPH_FEC options NETGRAPH_FRAME_RELAY options NETGRAPH_GIF options NETGRAPH_GIF_DEMUX options NETGRAPH_HOLE options NETGRAPH_IFACE options NETGRAPH_IP_INPUT options NETGRAPH_IPFW options NETGRAPH_KSOCKET options NETGRAPH_L2TP options NETGRAPH_LMI # MPPC compression requires proprietary files (not included) #options NETGRAPH_MPPC_COMPRESSION options NETGRAPH_MPPC_ENCRYPTION options NETGRAPH_NETFLOW options NETGRAPH_NAT options NETGRAPH_ONE2MANY options NETGRAPH_PPP options NETGRAPH_PPPOE options NETGRAPH_PPTPGRE options NETGRAPH_PRED1 options NETGRAPH_RFC1490 options NETGRAPH_SOCKET options NETGRAPH_SPLIT options NETGRAPH_SPPP options NETGRAPH_TAG options NETGRAPH_TCPMSS options NETGRAPH_TEE options NETGRAPH_UI options NETGRAPH_VJC # NgATM - Netgraph ATM options NGATM_ATM options NGATM_ATMBASE options NGATM_SSCOP options NGATM_SSCFU options NGATM_UNI options NGATM_CCATM device mn # Munich32x/Falc54 Nx64kbit/sec cards. # # Network interfaces: # The `loop' device is MANDATORY when networking is enabled. # The `ether' device provides generic code to handle # Ethernets; it is MANDATORY when an Ethernet device driver is # configured or token-ring is enabled. # The `vlan' device implements the VLAN tagging of Ethernet frames # according to IEEE 802.1Q. It requires `device miibus'. 
# The `wlan' device provides generic code to support 802.11 # drivers, including host AP mode; it is MANDATORY for the wi, # and ath drivers and will eventually be required by all 802.11 drivers. # The `wlan_wep', `wlan_tkip', and `wlan_ccmp' devices provide # support for WEP, TKIP, and AES-CCMP crypto protocols optionally # used with 802.11 devices that depend on the `wlan' module. # The `wlan_xauth' device provides support for external (i.e. user-mode) # authenticators for use with 802.11 drivers that use the `wlan' # module and support 802.1x and/or WPA security protocols. # The `wlan_acl' device provides a MAC-based access control mechanism # for use with 802.11 drivers operating in ap mode and using the # `wlan' module. # The `fddi' device provides generic code to support FDDI. # The `arcnet' device provides generic code to support Arcnet. # The `sppp' device serves a similar role for certain types # of synchronous PPP links (like `cx', `ar'). # The `sl' device implements the Serial Line IP (SLIP) service. # The `ppp' device implements the Point-to-Point Protocol. # The `bpf' device enables the Berkeley Packet Filter. Be # aware of the legal and administrative consequences of enabling this # option. The number of devices determines the maximum number of # simultaneous BPF clients programs runnable. DHCP requires bpf. # The `disc' device implements a minimal network interface, # which throws away all packets sent and never receives any. It is # included for testing and benchmarking purposes. # The `edsc' device implements a minimal Ethernet interface, # which discards all packets sent and receives none. # The `tap' device is a pty-like virtual Ethernet interface # The `tun' device implements (user-)ppp and nos-tun # The `gif' device implements IPv6 over IP4 tunneling, # IPv4 over IPv6 tunneling, IPv4 over IPv4 tunneling and # IPv6 over IPv6 tunneling. # The `gre' device implements two types of IP4 over IP4 tunneling: # GRE and MOBILE, as specified in the RFC1701 and RFC2004. # The XBONEHACK option allows the same pair of addresses to be configured on # multiple gif interfaces. # The `faith' device captures packets sent to it and diverts them # to the IPv4/IPv6 translation daemon. # The `stf' device implements 6to4 encapsulation. # The `ef' device provides support for multiple ethernet frame types # specified via ETHER_* options. See ef(4) for details. # # The pf packet filter consists of three devices: # The `pf' device provides /dev/pf and the firewall code itself. # The `pflog' device provides the pflog0 interface which logs packets. # The `pfsync' device provides the pfsync0 interface used for # synchronization of firewall state tables (over the net). # # The PPP_BSDCOMP option enables support for compress(1) style entire # packet compression, the PPP_DEFLATE is for zlib/gzip style compression. # PPP_FILTER enables code for filtering the ppp data stream and selecting # events for resetting the demand dial activity timer - requires bpf. # See pppd(8) for more details. 
# device ether #Generic Ethernet device vlan #VLAN support (needs miibus) device wlan #802.11 support options IEEE80211_DEBUG #enable debugging msgs options IEEE80211_AMPDU_AGE #age frames in AMPDU reorder q's device wlan_wep #802.11 WEP support device wlan_ccmp #802.11 CCMP support device wlan_tkip #802.11 TKIP support device wlan_xauth #802.11 external authenticator support device wlan_acl #802.11 MAC ACL support device wlan_amrr #AMRR transmit rate control algorithm device token #Generic TokenRing device fddi #Generic FDDI device arcnet #Generic Arcnet device sppp #Generic Synchronous PPP device loop #Network loopback device device bpf #Berkeley packet filter device disc #Discard device based on loopback device edsc #Ethernet discard device device tap #Virtual Ethernet driver device tun #Tunnel driver (ppp(8), nos-tun(8)) device gre #IP over IP tunneling device if_bridge #Bridge interface device pf #PF OpenBSD packet-filter firewall device pflog #logging support interface for PF device pfsync #synchronization interface for PF device carp #Common Address Redundancy Protocol device enc #IPsec interface device lagg #Link aggregation interface device ef # Multiple ethernet frames support options ETHER_II # enable Ethernet_II frame options ETHER_8023 # enable Ethernet_802.3 (Novell) frame options ETHER_8022 # enable Ethernet_802.2 frame options ETHER_SNAP # enable Ethernet_802.2/SNAP frame # for IPv6 device gif #IPv6 and IPv4 tunneling options XBONEHACK device faith #for IPv6 and IPv4 translation device stf #6to4 IPv6 over IPv4 encapsulation # # Internet family options: # # MROUTING enables the kernel multicast packet forwarder, which works # with mrouted and XORP. # # IPFIREWALL enables support for IP firewall construction, in # conjunction with the `ipfw' program. IPFIREWALL_VERBOSE sends # logged packets to the system logger. IPFIREWALL_VERBOSE_LIMIT # limits the number of times a matching entry can be logged. # # WARNING: IPFIREWALL defaults to a policy of "deny ip from any to any" # and if you do not add other rules during startup to allow access, # YOU WILL LOCK YOURSELF OUT. It is suggested that you set firewall_type=open # in /etc/rc.conf when first enabling this feature, then refining the # firewall rules in /etc/rc.firewall after you've tested that the new kernel # feature works properly. # # IPFIREWALL_DEFAULT_TO_ACCEPT causes the default rule (at boot) to # allow everything. Use with care, if a cracker can crash your # firewall machine, they can get to your protected machines. However, # if you are using it as an as-needed filter for specific problems as # they arise, then this may be for you. Changing the default to 'allow' # means that you won't get stuck if the kernel and /sbin/ipfw binary get # out of sync. # # IPDIVERT enables the divert IP sockets, used by ``ipfw divert''. It # depends on IPFIREWALL if compiled into the kernel. # # IPFIREWALL_FORWARD enables changing of the packet destination either # to do some sort of policy routing or transparent proxying. Used by # ``ipfw forward''. All redirections apply to locally generated # packets too. Because of this great care is required when # crafting the ruleset. # # IPFIREWALL_NAT adds support for in kernel nat in ipfw, and it requires # LIBALIAS. # # IPSTEALTH enables code to support stealth forwarding (i.e., forwarding # packets without touching the TTL). This can be useful to hide firewalls # from traceroute and similar tools. 
# # TCPDEBUG enables code which keeps traces of the TCP state machine # for sockets with the SO_DEBUG option set, which can then be examined # using the trpt(8) utility. # options MROUTING # Multicast routing options IPFIREWALL #firewall options IPFIREWALL_VERBOSE #enable logging to syslogd(8) options IPFIREWALL_VERBOSE_LIMIT=100 #limit verbosity options IPFIREWALL_DEFAULT_TO_ACCEPT #allow everything by default options IPFIREWALL_FORWARD #packet destination changes options IPFIREWALL_NAT #ipfw kernel nat support options IPDIVERT #divert sockets options IPFILTER #ipfilter support options IPFILTER_LOG #ipfilter logging options IPFILTER_LOOKUP #ipfilter pools options IPFILTER_DEFAULT_BLOCK #block all packets by default options IPSTEALTH #support for stealth forwarding options TCPDEBUG # The MBUF_STRESS_TEST option enables options which create # various random failures / extreme cases related to mbuf # functions. See mbuf(9) for a list of available test cases. # MBUF_PROFILING enables code to profile the mbuf chains # exiting the system (via participating interfaces) and # return a logarithmic histogram of monitored parameters # (e.g. packet size, wasted space, number of mbufs in chain). options MBUF_STRESS_TEST options MBUF_PROFILING # Statically Link in accept filters options ACCEPT_FILTER_DATA options ACCEPT_FILTER_DNS options ACCEPT_FILTER_HTTP # TCP_SIGNATURE adds support for RFC 2385 (TCP-MD5) digests. These are # carried in TCP option 19. This option is commonly used to protect # TCP sessions (e.g. BGP) where IPSEC is not available nor desirable. # This is enabled on a per-socket basis using the TCP_MD5SIG socket option. # This requires the use of 'device crypto', 'options IPSEC' # or 'device cryptodev'. options TCP_SIGNATURE #include support for RFC 2385 # DUMMYNET enables the "dummynet" bandwidth limiter. You need IPFIREWALL # as well. See dummynet(4) and ipfw(8) for more info. When you run # DUMMYNET it is advisable to also have at least "options HZ=1000" to achieve # a smooth scheduling of the traffic. options DUMMYNET # Zero copy sockets support. This enables "zero copy" for sending and # receiving data via a socket. The send side works for any type of NIC, # the receive side only works for NICs that support MTUs greater than the # page size of your architecture and that support header splitting. See # zero_copy(9) for more details. options ZERO_COPY_SOCKETS ##################################################################### # FILESYSTEM OPTIONS # # Only the root, /usr, and /tmp filesystems need be statically # compiled; everything else will be automatically loaded at mount # time. (Exception: the UFS family--- FFS --- cannot # currently be demand-loaded.) Some people still prefer to statically # compile other filesystems as well. # # NB: The PORTAL filesystem is known to be buggy, and WILL panic your # system if you attempt to do anything with it. It is included here # as an incentive for some enterprising soul to sit down and fix it. # The UNION filesystem was known to be buggy in the past. It is now # being actively maintained, although there are still some issues being # resolved. 
# # One of these is mandatory: options FFS #Fast filesystem options NFSCLIENT #Network File System client # The rest are optional: options CD9660 #ISO 9660 filesystem options FDESCFS #File descriptor filesystem options HPFS #OS/2 File system options MSDOSFS #MS DOS File System (FAT, FAT32) options NFSSERVER #Network File System server options NFSLOCKD #Network Lock Manager options NTFS #NT File System options NULLFS #NULL filesystem # Broken (depends on NCP): #options NWFS #NetWare filesystem options PORTALFS #Portal filesystem options PROCFS #Process filesystem (requires PSEUDOFS) options PSEUDOFS #Pseudo-filesystem framework options PSEUDOFS_TRACE #Debugging support for PSEUDOFS options SMBFS #SMB/CIFS filesystem options UDF #Universal Disk Format options UNIONFS #Union filesystem # The xFS_ROOT options REQUIRE the associated ``options xFS'' options NFS_ROOT #NFS usable as root device # Soft updates is a technique for improving filesystem speed and # making abrupt shutdown less risky. # options SOFTUPDATES # Extended attributes allow additional data to be associated with files, # and is used for ACLs, Capabilities, and MAC labels. # See src/sys/ufs/ufs/README.extattr for more information. options UFS_EXTATTR options UFS_EXTATTR_AUTOSTART # Access Control List support for UFS filesystems. The current ACL # implementation requires extended attribute support, UFS_EXTATTR, # for the underlying filesystem. # See src/sys/ufs/ufs/README.acls for more information. options UFS_ACL # Directory hashing improves the speed of operations on very large # directories at the expense of some memory. options UFS_DIRHASH # Gjournal-based UFS journaling support. options UFS_GJOURNAL # Make space in the kernel for a root filesystem on a md device. # Define to the number of kilobytes to reserve for the filesystem. options MD_ROOT_SIZE=10 # Make the md device a potential root device, either with preloaded # images of type mfs_root or md_root. options MD_ROOT # Disk quotas are supported when this option is enabled. options QUOTA #enable disk quotas # If you are running a machine just as a fileserver for PC and MAC # users, using SAMBA or Netatalk, you may consider setting this option # and keeping all those users' directories on a filesystem that is # mounted with the suiddir option. This gives new files the same # ownership as the directory (similar to group). It's a security hole # if you let these users run programs, so confine it to file-servers # (but it'll save you lots of headaches in those cases). Root owned # directories are exempt and X bits are cleared. The suid bit must be # set on the directory as well; see chmod(1) PC owners can't see/set # ownerships so they keep getting their toes trodden on. This saves # you all the support calls as the filesystem it's used on will act as # they expect: "It's my dir so it must be my file". # options SUIDDIR # NFS options: options NFS_MINATTRTIMO=3 # VREG attrib cache timeout in sec options NFS_MAXATTRTIMO=60 options NFS_MINDIRATTRTIMO=30 # VDIR attrib cache timeout in sec options NFS_MAXDIRATTRTIMO=60 options NFS_GATHERDELAY=10 # Default write gather delay (msec) options NFS_WDELAYHASHSIZ=16 # and with this options NFS_DEBUG # Enable NFS Debugging # Coda stuff: options CODA #CODA filesystem. device vcoda #coda minicache <-> venus comm. # Use the old Coda 5.x venus<->kernel interface instead of the new # realms-aware 6.x protocol. #options CODA_COMPAT_5 # # Add support for the EXT2FS filesystem of Linux fame. 
Be a bit # careful with this - the ext2fs code has a tendency to lag behind # changes and not be exercised very much, so mounting read/write could # be dangerous (and even mounting read only could result in panics.) # options EXT2FS # # Add support for the ReiserFS filesystem (used in Linux). Currently, # this is limited to read-only access. # options REISERFS # # Add support for the SGI XFS filesystem. Currently, # this is limited to read-only access. # options XFS # Use real implementations of the aio_* system calls. There are numerous # stability and security issues in the current aio code that make it # unsuitable for inclusion on machines with untrusted local users. options VFS_AIO # Cryptographically secure random number generator; /dev/random device random # The system memory devices; /dev/mem, /dev/kmem device mem # Optional character code conversion support with LIBICONV. # Each option requires their base file system and LIBICONV. options CD9660_ICONV options MSDOSFS_ICONV options NTFS_ICONV options UDF_ICONV ##################################################################### # POSIX P1003.1B # Real time extensions added in the 1993 POSIX # _KPOSIX_PRIORITY_SCHEDULING: Build in _POSIX_PRIORITY_SCHEDULING options _KPOSIX_PRIORITY_SCHEDULING # p1003_1b_semaphores are very experimental, # user should be ready to assist in debugging if problems arise. options P1003_1B_SEMAPHORES # POSIX message queue options P1003_1B_MQUEUE ##################################################################### # SECURITY POLICY PARAMETERS # Support for BSM audit options AUDIT # Support for Mandatory Access Control (MAC): options MAC options MAC_BIBA options MAC_BSDEXTENDED options MAC_IFOFF options MAC_LOMAC options MAC_MLS options MAC_NONE options MAC_PARTITION options MAC_PORTACL options MAC_SEEOTHERUIDS options MAC_STUB options MAC_TEST ##################################################################### # CLOCK OPTIONS # The granularity of operation is controlled by the kernel option HZ whose # default value (1000 on most architectures) means a granularity of 1ms # (1s/HZ). Historically, the default was 100, but finer granularity is # required for DUMMYNET and other systems on modern hardware. There are # reasonable arguments that HZ should, in fact, be 100 still; consider, # that reducing the granularity too much might cause excessive overhead in # clock interrupt processing, potentially causing ticks to be missed and thus # actually reducing the accuracy of operation. options HZ=100 # Enable support for the kernel PLL to use an external PPS signal, # under supervision of [x]ntpd(8) # More info in ntpd documentation: http://www.eecis.udel.edu/~ntp options PPS_SYNC ##################################################################### # SCSI DEVICES # SCSI DEVICE CONFIGURATION # The SCSI subsystem consists of the `base' SCSI code, a number of # high-level SCSI device `type' drivers, and the low-level host-adapter # device drivers. The host adapters are listed in the ISA and PCI # device configuration sections below. # # It is possible to wire down your SCSI devices so that a given bus, # target, and LUN always come on line as the same device unit. In # earlier versions the unit numbers were assigned in the order that # the devices were probed on the SCSI bus. This means that if you # removed a disk drive, you may have had to rewrite your /etc/fstab # file, and also that you had to be careful when adding a new disk # as it may have been probed earlier and moved your device configuration # around. 
(See also option GEOM_VOL for a different solution to this # problem.) # This old behavior is maintained as the default behavior. The unit # assignment begins with the first non-wired down unit for a device # type. For example, if you wire a disk as "da3" then the first # non-wired disk will be assigned da4. # The syntax for wiring down devices is: hint.scbus.0.at="ahc0" hint.scbus.1.at="ahc1" hint.scbus.1.bus="0" hint.scbus.3.at="ahc2" hint.scbus.3.bus="0" hint.scbus.2.at="ahc2" hint.scbus.2.bus="1" hint.da.0.at="scbus0" hint.da.0.target="0" hint.da.0.unit="0" hint.da.1.at="scbus3" hint.da.1.target="1" hint.da.2.at="scbus2" hint.da.2.target="3" hint.sa.1.at="scbus1" hint.sa.1.target="6" # "units" (SCSI logical unit number) that are not specified are # treated as if specified as LUN 0. # All SCSI devices allocate as many units as are required. # The ch driver drives SCSI Media Changer ("jukebox") devices. # # The da driver drives SCSI Direct Access ("disk") and Optical Media # ("WORM") devices. # # The sa driver drives SCSI Sequential Access ("tape") devices. # # The cd driver drives SCSI Read Only Direct Access ("cd") devices. # # The ses driver drives SCSI Environment Services ("ses") and # SAF-TE ("SCSI Accessible Fault-Tolerant Enclosure") devices. # # The pt driver drives SCSI Processor devices. # # The sg driver provides a passthrough API that is compatible with the # Linux SG driver. It will work in conjunction with the COMPAT_LINUX # option to run linux SG apps. It can also stand on its own and provide # source level API compatiblity for porting apps to FreeBSD. # # Target Mode support is provided here but also requires that a SIM # (SCSI Host Adapter Driver) provide support as well. # # The targ driver provides target mode support as a Processor type device. # It exists to give the minimal context necessary to respond to Inquiry # commands. There is a sample user application that shows how the rest # of the command support might be done in /usr/share/examples/scsi_target. # # The targbh driver provides target mode support and exists to respond # to incoming commands that do not otherwise have a logical unit assigned # to them. # # The "unknown" device (uk? in pre-2.0.5) is now part of the base SCSI # configuration as the "pass" driver. device scbus #base SCSI code device ch #SCSI media changers device da #SCSI direct access devices (aka disks) device sa #SCSI tapes device cd #SCSI CD-ROMs device ses #SCSI Environmental Services (and SAF-TE) device pt #SCSI processor device targ #SCSI Target Mode Code device targbh #SCSI Target Mode Blackhole Device device pass #CAM passthrough driver device sg #Linux SCSI passthrough # CAM OPTIONS: # debugging options: # -- NOTE -- If you specify one of the bus/target/lun options, you must # specify them all! # CAMDEBUG: When defined enables debugging macros # CAM_DEBUG_BUS: Debug the given bus. Use -1 to debug all busses. # CAM_DEBUG_TARGET: Debug the given target. Use -1 to debug all targets. # CAM_DEBUG_LUN: Debug the given lun. Use -1 to debug all luns. 
# CAM_DEBUG_FLAGS: OR together CAM_DEBUG_INFO, CAM_DEBUG_TRACE, # CAM_DEBUG_SUBTRACE, and CAM_DEBUG_CDB # # CAM_MAX_HIGHPOWER: Maximum number of concurrent high power (start unit) cmds # SCSI_NO_SENSE_STRINGS: When defined disables sense descriptions # SCSI_NO_OP_STRINGS: When defined disables opcode descriptions # SCSI_DELAY: The number of MILLISECONDS to freeze the SIM (scsi adapter) # queue after a bus reset, and the number of milliseconds to # freeze the device queue after a bus device reset. This # can be changed at boot and runtime with the # kern.cam.scsi_delay tunable/sysctl. options CAMDEBUG options CAM_DEBUG_BUS=-1 options CAM_DEBUG_TARGET=-1 options CAM_DEBUG_LUN=-1 options CAM_DEBUG_FLAGS=(CAM_DEBUG_INFO|CAM_DEBUG_TRACE|CAM_DEBUG_CDB) options CAM_MAX_HIGHPOWER=4 options SCSI_NO_SENSE_STRINGS options SCSI_NO_OP_STRINGS options SCSI_DELAY=5000 # Be pessimistic about Joe SCSI device # Options for the CAM CDROM driver: # CHANGER_MIN_BUSY_SECONDS: Guaranteed minimum time quantum for a changer LUN # CHANGER_MAX_BUSY_SECONDS: Maximum time quantum per changer LUN, only # enforced if there is I/O waiting for another LUN # The compiled in defaults for these variables are 2 and 10 seconds, # respectively. # # These can also be changed on the fly with the following sysctl variables: # kern.cam.cd.changer.min_busy_seconds # kern.cam.cd.changer.max_busy_seconds # options CHANGER_MIN_BUSY_SECONDS=2 options CHANGER_MAX_BUSY_SECONDS=10 # Options for the CAM sequential access driver: # SA_IO_TIMEOUT: Timeout for read/write/wfm operations, in minutes # SA_SPACE_TIMEOUT: Timeout for space operations, in minutes # SA_REWIND_TIMEOUT: Timeout for rewind operations, in minutes # SA_ERASE_TIMEOUT: Timeout for erase operations, in minutes # SA_1FM_AT_EOD: Default to model which only has a default one filemark at EOT. options SA_IO_TIMEOUT=4 options SA_SPACE_TIMEOUT=60 options SA_REWIND_TIMEOUT=(2*60) options SA_ERASE_TIMEOUT=(4*60) options SA_1FM_AT_EOD # Optional timeout for the CAM processor target (pt) device # This is specified in seconds. The default is 60 seconds. options SCSI_PT_DEFAULT_TIMEOUT=60 # Optional enable of doing SES passthrough on other devices (e.g., disks) # # Normally disabled because a lot of newer SCSI disks report themselves # as having SES capabilities, but this can then clot up attempts to build # build a topology with the SES device that's on the box these drives # are in.... options SES_ENABLE_PASSTHROUGH ##################################################################### # MISCELLANEOUS DEVICES AND OPTIONS device pty #BSD-style compatibility pseudo ttys device nmdm #back-to-back tty devices device md #Memory/malloc disk device snp #Snoop device - to look at pty/vty/etc.. device ccd #Concatenated disk driver device firmware #firmware(9) support # Kernel side iconv library options LIBICONV # Size of the kernel message buffer. Should be N * pagesize. options MSGBUF_SIZE=40960 ##################################################################### # HARDWARE DEVICE CONFIGURATION # For ISA the required hints are listed. # EISA, MCA, PCI, CardBus, SD/MMC and pccard are self identifying buses, so # no hints are needed. # # Mandatory devices: # # These options are valid for other keyboard drivers as well. options KBD_DISABLE_KEYMAP_LOAD # refuse to load a keymap options KBD_INSTALL_CDEV # install a CDEV entry in /dev options FB_DEBUG # Frame buffer debugging device splash # Splash screen and screen saver support # Various screen savers. 
device blank_saver device daemon_saver device dragon_saver device fade_saver device fire_saver device green_saver device logo_saver device rain_saver device snake_saver device star_saver device warp_saver # The syscons console driver (SCO color console compatible). device sc hint.sc.0.at="isa" options MAXCONS=16 # number of virtual consoles options SC_ALT_MOUSE_IMAGE # simplified mouse cursor in text mode options SC_DFLT_FONT # compile font in makeoptions SC_DFLT_FONT=cp850 options SC_DISABLE_KDBKEY # disable `debug' key options SC_DISABLE_REBOOT # disable reboot key sequence options SC_HISTORY_SIZE=200 # number of history buffer lines options SC_MOUSE_CHAR=0x3 # char code for text mode mouse cursor options SC_PIXEL_MODE # add support for the raster text mode # The following options will let you change the default colors of syscons. options SC_NORM_ATTR=(FG_GREEN|BG_BLACK) options SC_NORM_REV_ATTR=(FG_YELLOW|BG_GREEN) options SC_KERNEL_CONS_ATTR=(FG_RED|BG_BLACK) options SC_KERNEL_CONS_REV_ATTR=(FG_BLACK|BG_RED) # The following options will let you change the default behaviour of # cut-n-paste feature options SC_CUT_SPACES2TABS # convert leading spaces into tabs options SC_CUT_SEPCHARS=\"x09\" # set of characters that delimit words # (default is single space - \"x20\") # If you have a two button mouse, you may want to add the following option # to use the right button of the mouse to paste text. options SC_TWOBUTTON_MOUSE # You can selectively disable features in syscons. options SC_NO_CUTPASTE options SC_NO_FONT_LOADING options SC_NO_HISTORY options SC_NO_MODE_CHANGE options SC_NO_SYSMOUSE options SC_NO_SUSPEND_VTYSWITCH # `flags' for sc # 0x80 Put the video card in the VESA 800x600 dots, 16 color mode # 0x100 Probe for a keyboard device periodically if one is not present # # Optional devices: # # # SCSI host adapters: # # adv: All Narrow SCSI bus AdvanSys controllers. # adw: Second Generation AdvanSys controllers including the ADV940UW. # aha: Adaptec 154x/1535/1640 # ahb: Adaptec 174x EISA controllers # ahc: Adaptec 274x/284x/2910/293x/294x/394x/3950x/3960x/398X/4944/ # 19160x/29160x, aic7770/aic78xx # ahd: Adaptec 29320/39320 Controllers. # aic: Adaptec 6260/6360, APA-1460 (PC Card), NEC PC9801-100 (C-BUS) # amd: Support for the AMD 53C974 SCSI host adapter chip as found on devices # such as the Tekram DC-390(T). # bt: Most Buslogic controllers: including BT-445, BT-54x, BT-64x, BT-74x, # BT-75x, BT-946, BT-948, BT-956, BT-958, SDC3211B, SDC3211F, SDC3222F # esp: NCR53c9x. Only for SBUS hardware right now. # isp: Qlogic ISP 1020, 1040 and 1040B PCI SCSI host adapters, # ISP 1240 Dual Ultra SCSI, ISP 1080 and 1280 (Dual) Ultra2, # ISP 12160 Ultra3 SCSI, # Qlogic ISP 2100 and ISP 2200 1Gb Fibre Channel host adapters. # Qlogic ISP 2300 and ISP 2312 2Gb Fibre Channel host adapters. # Qlogic ISP 2322 and ISP 6322 2Gb Fibre Channel host adapters. # ispfw: Firmware module for Qlogic host adapters # mpt: LSI-Logic MPT/Fusion 53c1020 or 53c1030 Ultra4 # or FC9x9 Fibre Channel host adapters. # ncr: NCR 53C810, 53C825 self-contained SCSI host adapters. # sym: Symbios/Logic 53C8XX family of PCI-SCSI I/O processors: # 53C810, 53C810A, 53C815, 53C825, 53C825A, 53C860, 53C875, # 53C876, 53C885, 53C895, 53C895A, 53C896, 53C897, 53C1510D, # 53C1010-33, 53C1010-66. # trm: Tekram DC395U/UW/F DC315U adapters. # wds: WD7000 # # Note that the order is important in order for Buslogic ISA/EISA cards to be # probed correctly. 
# device bt hint.bt.0.at="isa" hint.bt.0.port="0x330" device adv hint.adv.0.at="isa" device adw device aha hint.aha.0.at="isa" device aic hint.aic.0.at="isa" device ahb device ahc device ahd device amd device esp device iscsi_initiator device isp hint.isp.0.disable="1" hint.isp.0.role="3" hint.isp.0.prefer_iomap="1" hint.isp.0.prefer_memmap="1" hint.isp.0.fwload_disable="1" hint.isp.0.ignore_nvram="1" hint.isp.0.fullduplex="1" hint.isp.0.topology="lport" hint.isp.0.topology="nport" hint.isp.0.topology="lport-only" hint.isp.0.topology="nport-only" # we can't get u_int64_t types, nor can we get strings if it's got # a leading 0x, hence this silly dodge. hint.isp.0.portwnn="w50000000aaaa0000" hint.isp.0.nodewnn="w50000000aaaa0001" device ispfw device mpt device ncr device sym device trm device wds hint.wds.0.at="isa" hint.wds.0.port="0x350" hint.wds.0.irq="11" hint.wds.0.drq="6" # The aic7xxx driver will attempt to use memory mapped I/O for all PCI # controllers that have it configured only if this option is set. Unfortunately, # this doesn't work on some motherboards, which prevents it from being the # default. options AHC_ALLOW_MEMIO # Dump the contents of the ahc controller configuration PROM. options AHC_DUMP_EEPROM # Bitmap of units to enable targetmode operations. options AHC_TMODE_ENABLE # Compile in Aic7xxx Debugging code. options AHC_DEBUG # Aic7xxx driver debugging options. See sys/dev/aic7xxx/aic7xxx.h options AHC_DEBUG_OPTS # Print register bitfields in debug output. Adds ~128k to driver # See ahc(4). options AHC_REG_PRETTY_PRINT # Compile in aic79xx debugging code. options AHD_DEBUG # Aic79xx driver debugging options. Adds ~215k to driver. See ahd(4). options AHD_DEBUG_OPTS=0xFFFFFFFF # Print human-readable register definitions when debugging options AHD_REG_PRETTY_PRINT # Bitmap of units to enable targetmode operations. options AHD_TMODE_ENABLE # The adw driver will attempt to use memory mapped I/O for all PCI # controllers that have it configured only if this option is set. options ADW_ALLOW_MEMIO # Options used in dev/iscsi (Software iSCSI stack) # options ISCSI_INITIATOR_DEBUG=9 # Options used in dev/isp/ (Qlogic SCSI/FC driver). # # ISP_TARGET_MODE - enable target mode operation # options ISP_TARGET_MODE=1 # # ISP_DEFAULT_ROLES - default role # none=0 # target=1 # initiator=2 # both=3 (not supported currently) # options ISP_DEFAULT_ROLES=2 # Options used in dev/sym/ (Symbios SCSI driver). #options SYM_SETUP_LP_PROBE_MAP #-Low Priority Probe Map (bits) # Allows the ncr to take precedence # 1 (1<<0) -> 810a, 860 # 2 (1<<1) -> 825a, 875, 885, 895 # 4 (1<<2) -> 895a, 896, 1510d #options SYM_SETUP_SCSI_DIFF #-HVD support for 825a, 875, 885 # disabled:0 (default), enabled:1 #options SYM_SETUP_PCI_PARITY #-PCI parity checking # disabled:0, enabled:1 (default) #options SYM_SETUP_MAX_LUN #-Number of LUNs supported # default:8, range:[1..64] # The 'dpt' driver provides support for old DPT controllers (http://www.dpt.com/). # These have hardware RAID-{0,1,5} support, and do multi-initiator I/O. # The DPT controllers are commonly re-licensed under other brand-names - # some controllers by Olivetti, Dec, HP, AT&T, SNI, AST, Alphatronic, NEC and # Compaq are actually DPT controllers. # # See src/sys/dev/dpt for debugging and other subtle options. # DPT_MEASURE_PERFORMANCE Enables a set of (semi)invasive metrics. Various # instruments are enabled. The tools in # /usr/sbin/dpt_* assume these to be enabled. # DPT_HANDLE_TIMEOUTS Normally device timeouts are handled by the DPT. 
# If you want the driver to handle timeouts, enable # this option. If your system is very busy, this # option will create more trouble than it solves. # DPT_TIMEOUT_FACTOR Used to compute the excessive amount of time to # wait when timing out with the above option. # DPT_DEBUG_xxxx These are controllable from sys/dev/dpt/dpt.h # DPT_LOST_IRQ When enabled, will try, once per second, to catch # any interrupt that got lost. Seems to help in some # DPT-firmware/Motherboard combinations. Minimal # cost, great benefit. # DPT_RESET_HBA Make "reset" actually reset the controller # instead of fudging it. Only enable this if you # are 100% certain you need it. device dpt # DPT options #!CAM# options DPT_MEASURE_PERFORMANCE #!CAM# options DPT_HANDLE_TIMEOUTS options DPT_TIMEOUT_FACTOR=4 options DPT_LOST_IRQ options DPT_RESET_HBA # # Compaq "CISS" RAID controllers (SmartRAID 5* series) # These controllers have a SCSI-like interface, and require the # CAM infrastructure. # device ciss # # Intel Integrated RAID controllers. # This driver was developed and is maintained by Intel. Contacts # at Intel for this driver are # "Kannanthanam, Boji T" and # "Leubner, Achim" . # device iir # # Mylex AcceleRAID and eXtremeRAID controllers with v6 and later # firmware. These controllers have a SCSI-like interface, and require # the CAM infrastructure. # device mly # # Compaq Smart RAID, Mylex DAC960 and AMI MegaRAID controllers. Only # one entry is needed; the code will find and configure all supported # controllers. # device ida # Compaq Smart RAID device mlx # Mylex DAC960 device amr # AMI MegaRAID device amrp # SCSI Passthrough interface (optional, CAM req.) device mfi # LSI MegaRAID SAS device mfip # LSI MegaRAID SAS passthrough, requires CAM options MFI_DEBUG # # 3ware ATA RAID # device twe # 3ware ATA RAID # # The 'ATA' driver supports all ATA and ATAPI devices, including PC Card # devices. You only need one "device ata" for it to find all # PCI and PC Card ATA/ATAPI devices on modern machines. device ata device atadisk # ATA disk drives device ataraid # ATA RAID drives device atapicd # ATAPI CDROM drives device atapifd # ATAPI floppy drives device atapist # ATAPI tape drives device atapicam # emulate ATAPI devices as SCSI ditto via CAM # needs CAM to be present (scbus & pass) # # For older non-PCI, non-PnPBIOS systems, these are the hints lines to add: hint.ata.0.at="isa" hint.ata.0.port="0x1f0" hint.ata.0.irq="14" hint.ata.1.at="isa" hint.ata.1.port="0x170" hint.ata.1.irq="15" # # The following options are valid on the ATA driver: # # ATA_STATIC_ID: controller numbering is static, i.e. it depends on location; # otherwise the device numbers are dynamically allocated. options ATA_STATIC_ID # # Standard floppy disk controllers and floppy tapes, supports # the Y-E DATA External FDD (PC Card) # device fdc hint.fdc.0.at="isa" hint.fdc.0.port="0x3F0" hint.fdc.0.irq="6" hint.fdc.0.drq="2" # # FDC_DEBUG enables floppy debugging. Since the debug output is huge, you # still have to turn it on explicitly by setting the variable fd_debug # with DDB. options FDC_DEBUG # # Activate this line if you happen to have an Insight floppy tape. # Probing them proved to be dangerous for people with floppy disks only, # so it's "hidden" behind a flag: #hint.fdc.0.flags="1" # Specify floppy devices hint.fd.0.at="fdc0" hint.fd.0.drive="0" hint.fd.1.at="fdc0" hint.fd.1.drive="1" # # uart: newbusified driver for serial interfaces. It consolidates the sio(4), # sab(4) and zs(4) drivers.
# device uart # Options for uart(4) options UART_PPS_ON_CTS # Do time pulse capturing using CTS # instead of DCD. # The following hint should only be used for pure ISA devices. It is not # needed otherwise. Use of hints is strongly discouraged. hint.uart.0.at="isa" # The following 3 hints are used when the UART is a system device (i.e., a # console or debug port), but only on platforms that don't have any other # means to pass the information to the kernel. The unit number of the hint # is only used to bundle the hints together. There is no relation to the # unit number of the probed UART. hint.uart.0.port="0x3f8" hint.uart.0.flags="0x10" hint.uart.0.baud="115200" # `flags' for serial drivers that support consoles like sio(4) and uart(4): # 0x10 enable console support for this unit. Other console flags # (if applicable) are ignored unless this is set. Enabling # console support does not make the unit the preferred console. # Boot with -h or set boot_serial=YES in the loader. For sio(4) # specifically, the 0x20 flag can also be set (see above). # Currently, at most one unit can have console support; the # first one (in config file order) with this flag set is # preferred. Setting this flag for sio0 gives the old behaviour. # 0x80 use this port for serial line gdb support in ddb. Also known # as debug port. # # Options for serial drivers that support consoles: options BREAK_TO_DEBUGGER # A BREAK on a serial console goes to # ddb, if available. # Solaris implements a new BREAK which is initiated by a character # sequence CR ~ ^b which is similar to a familiar pattern used on # Sun servers by the Remote Console. There are FreeBSD extensions: # CR ~ ^p requests force panic and CR ~ ^r requests a clean reboot. options ALT_BREAK_TO_DEBUGGER # Serial Communications Controller # Supports the Siemens SAB 82532 and Zilog Z8530 multi-channel # communications controllers. device scc # PCI Universal Communications driver # Supports various multi port PCI I/O cards. device puc # # Network interfaces: # # MII bus support is required for some PCI 10/100 ethernet NICs, # namely those which use MII-compliant transceivers or implement # transceiver control interfaces that operate like an MII. Adding # "device miibus" to the kernel config pulls in support for # the generic miibus API and all of the PHY drivers, including a # generic one for PHYs that aren't specifically handled by an # individual driver. device miibus # an: Aironet 4500/4800 802.11 wireless adapters. Supports the PCMCIA, # PCI and ISA varieties. # ae: Support for fast ethernet adapters based on the Attansic/Atheros # L2 PCI-Express FastEthernet controllers. # age: Support for gigabit ethernet adapters based on the Attansic/Atheros # L1 PCI express gigabit ethernet controllers. # ale: Support for Atheros AR8121/AR8113/AR8114 PCIe ethernet controllers. # bce: Broadcom NetXtreme II (BCM5706/BCM5708) PCI/PCIe Gigabit Ethernet # adapters. # bfe: Broadcom BCM4401 Ethernet adapter. # bge: Support for gigabit ethernet adapters based on the Broadcom # BCM570x family of controllers, including the 3Com 3c996-T, # the Netgear GA302T, the SysKonnect SK-9D21 and SK-9D41, and # the embedded gigE NICs on Dell PowerEdge 2550 servers. # cm: Arcnet SMC COM90c26 / SMC COM90c56 # (and SMC COM90c66 in '56 compatibility mode) adapters.
# dc: Support for PCI fast ethernet adapters based on the DEC/Intel 21143 # and various workalikes including: # the ADMtek AL981 Comet and AN985 Centaur, the ASIX Electronics # AX88140A and AX88141, the Davicom DM9100 and DM9102, the Lite-On # 82c168 and 82c169 PNIC, the Lite-On/Macronix LC82C115 PNIC II # and the Macronix 98713/98713A/98715/98715A/98725 PMAC. This driver # replaces the old al, ax, dm, pn and mx drivers. List of brands: # Digital DE500-BA, Kingston KNE100TX, D-Link DFE-570TX, SOHOware SFA110, # SVEC PN102-TX, CNet Pro110B, 120A, and 120B, Compex RL100-TX, # LinkSys LNE100TX, LNE100TX V2.0, Jaton XpressNet, Alfa Inc GFC2204, # KNE110TX. # de: Digital Equipment DC21040 # em: Intel Pro/1000 Gigabit Ethernet 82542, 82543, 82544 based adapters. # igb: Intel Pro/1000 PCI Express Gigabit Ethernet: 82575 and later adapters. # ep: 3Com 3C509, 3C529, 3C556, 3C562D, 3C563D, 3C572, 3C574X, 3C579, 3C589 # and PC Card devices using these chipsets. # ex: Intel EtherExpress Pro/10 and other i82595-based adapters, # Olicom Ethernet PC Card devices. # fe: Fujitsu MB86960A/MB86965A Ethernet # fea: DEC DEFEA EISA FDDI adapter # fpa: Support for the Digital DEFPA PCI FDDI. `device fddi' is also needed. # fxp: Intel EtherExpress Pro/100B # (hint of prefer_iomap can be done to prefer I/O instead of Mem mapping) # gem: Apple GMAC/Sun ERI/Sun GEM # hme: Sun HME (Happy Meal Ethernet) # jme: JMicron JMC260 Fast Ethernet/JMC250 Gigabit Ethernet based adapters. # le: AMD Am7900 LANCE and Am79C9xx PCnet # lge: Support for PCI gigabit ethernet adapters based on the Level 1 # LXT1001 NetCellerator chipset. This includes the D-Link DGE-500SX, # SMC TigerCard 1000 (SMC9462SX), and some Addtron cards. # msk: Support for gigabit ethernet adapters based on the Marvell/SysKonnect # Yukon II Gigabit controllers, including 88E8021, 88E8022, 88E8061, # 88E8062, 88E8035, 88E8036, 88E8038, 88E8050, 88E8052, 88E8053, # 88E8055, 88E8056 and D-Link 560T/550SX. # lmc: Support for the LMC/SBE wide-area network interface cards. # my: Myson Fast Ethernet (MTD80X, MTD89X) # nge: Support for PCI gigabit ethernet adapters based on the National # Semiconductor DP83820 and DP83821 chipset. This includes the # SMC EZ Card 1000 (SMC9462TX), D-Link DGE-500T, Asante FriendlyNet # GigaNIX 1000TA and 1000TPC, the Addtron AEG320T, the Surecom # EP-320G-TX and the Netgear GA622T. # pcn: Support for PCI fast ethernet adapters based on the AMD Am79c97x # PCnet-FAST, PCnet-FAST+, PCnet-FAST III, PCnet-PRO and PCnet-Home # chipsets. These can also be handled by the le(4) driver if the # pcn(4) driver is left out of the kernel. The le(4) driver does not # support the additional features like the MII bus and burst mode of # the PCnet-FAST and greater chipsets though. # re: RealTek 8139C+/8169/816xS/811xS/8101E PCI/PCIe Ethernet adapter # rl: Support for PCI fast ethernet adapters based on the RealTek 8129/8139 # chipset. Note that the RealTek driver defaults to using programmed # I/O to do register accesses because memory mapped mode seems to cause # severe lockups on SMP hardware. This driver also supports the # Accton EN1207D `Cheetah' adapter, which uses a chip called # the MPX 5030/5038, which is either a RealTek in disguise or a # RealTek workalike. Note that the D-Link DFE-530TX+ uses the RealTek # chipset and is supported by this driver, not the 'vr' driver. # sf: Support for Adaptec Duralink PCI fast ethernet adapters based on the # Adaptec AIC-6915 "starfire" controller. 
# This includes dual and quad port cards, as well as one 100baseFX card. # Most of these are 64-bit PCI devices, except for one single port # card which is 32-bit. # sis: Support for NICs based on the Silicon Integrated Systems SiS 900, # SiS 7016 and NS DP83815 PCI fast ethernet controller chips. # sk: Support for the SysKonnect SK-984x series PCI gigabit ethernet NICs. # This includes the SK-9841 and SK-9842 single port cards (single mode # and multimode fiber) and the SK-9843 and SK-9844 dual port cards # (also single mode and multimode). # The driver will autodetect the number of ports on the card and # attach each one as a separate network interface. # sn: Support for ISA and PC Card Ethernet devices using the # SMC91C90/92/94/95 chips. # ste: Sundance Technologies ST201 PCI fast ethernet controller, includes # the D-Link DFE-550TX. # stge: Support for gigabit ethernet adapters based on the Sundance/Tamarack # TC9021 family of controllers, including the Sundance ST2021/ST2023, # the Sundance/Tamarack TC9021, the D-Link DL-4000 and ASUS NX1101. # ti: Support for PCI gigabit ethernet NICs based on the Alteon Networks # Tigon 1 and Tigon 2 chipsets. This includes the Alteon AceNIC, the # 3Com 3c985, the Netgear GA620 and various others. Note that you will # probably want to bump up kern.ipc.nmbclusters a lot to use this driver. # tl: Support for the Texas Instruments TNETE100 series 'ThunderLAN' # cards and integrated ethernet controllers. This includes several # Compaq Netelligent 10/100 cards and the built-in ethernet controllers # in several Compaq Prosignia, Proliant and Deskpro systems. It also # supports several Olicom 10Mbps and 10/100 boards. # tx: SMC 9432 TX, BTX and FTX cards. (SMC EtherPower II series) # txp: Support for 3Com 3cR990 cards with the "Typhoon" chipset # vr: Support for various fast ethernet adapters based on the VIA # Technologies VT3043 `Rhine I' and VT86C100A `Rhine II' chips, # including the D-Link DFE530TX (see 'rl' for DFE530TX+), the Hawking # Technologies PN102TX, and the AOpen/Acer ALN-320. # vx: 3Com 3C590 and 3C595 # wb: Support for fast ethernet adapters based on the Winbond W89C840F chip. # Note: this is not the same as the Winbond W89C940F, which is a # NE2000 clone. # wi: Lucent WaveLAN/IEEE 802.11 PCMCIA adapters. Note: this supports both # the PCMCIA and ISA cards: the ISA card is really a PCMCIA to ISA # bridge with a PCMCIA adapter plugged into it. # xe: Xircom/Intel EtherExpress Pro100/16 PC Card ethernet controller, # Accton Fast EtherCard-16, Compaq Netelligent 10/100 PC Card, # Toshiba 10/100 Ethernet PC Card, Xircom 16-bit Ethernet + Modem 56 # xl: Support for the 3Com 3c900, 3c905, 3c905B and 3c905C (Fast) # Etherlink XL cards and integrated controllers. This includes the # integrated 3c905B-TX chips in certain Dell Optiplex and Dell # Precision desktop machines and the integrated 3c905-TX chips # in Dell Latitude laptop docking stations. # Also supported: 3Com 3c980(C)-TX, 3Com 3cSOHO100-TX, 3Com 3c450-TX # Order for ISA/EISA devices is important here device cm hint.cm.0.at="isa" hint.cm.0.port="0x2e0" hint.cm.0.irq="9" hint.cm.0.maddr="0xdc000" device ep device ex device fe hint.fe.0.at="isa" hint.fe.0.port="0x300" device fea device sn hint.sn.0.at="isa" hint.sn.0.port="0x300" hint.sn.0.irq="10" device an device wi device xe # PCI Ethernet NICs that use the common MII bus controller code. 
device ae # Attansic/Atheros L2 FastEthernet device age # Attansic/Atheros L1 Gigabit Ethernet device ale # Atheros AR8121/AR8113/AR8114 Ethernet device bce # Broadcom BCM5706/BCM5708 Gigabit Ethernet device bfe # Broadcom BCM440x 10/100 Ethernet device bge # Broadcom BCM570xx Gigabit Ethernet device cxgb # Chelsio T3 10 Gigabit Ethernet device cxgb_t3fw # Chelsio T3 10 Gigabit Ethernet firmware device dc # DEC/Intel 21143 and various workalikes device et # Agere ET1310 10/100/Gigabit Ethernet device fxp # Intel EtherExpress PRO/100B (82557, 82558) hint.fxp.0.prefer_iomap="0" device gem # Apple GMAC/Sun ERI/Sun GEM device hme # Sun HME (Happy Meal Ethernet) device jme # JMicron JMC250 Gigabit/JMC260 Fast Ethernet device lge # Level 1 LXT1001 gigabit Ethernet device msk # Marvell/SysKonnect Yukon II Gigabit Ethernet device my # Myson Fast Ethernet (MTD80X, MTD89X) device nge # NatSemi DP83820 gigabit Ethernet device re # RealTek 8139C+/8169/8169S/8110S device rl # RealTek 8129/8139 device pcn # AMD Am79C97x PCI 10/100 NICs device sf # Adaptec AIC-6915 (``Starfire'') device sis # Silicon Integrated Systems SiS 900/SiS 7016 device sk # SysKonnect SK-984x & SK-982x gigabit Ethernet device ste # Sundance ST201 (D-Link DFE-550TX) device stge # Sundance/Tamarack TC9021 gigabit Ethernet device tl # Texas Instruments ThunderLAN device tx # SMC EtherPower II (83c170 ``EPIC'') device vr # VIA Rhine, Rhine II device wb # Winbond W89C840F device xl # 3Com 3c90x (``Boomerang'', ``Cyclone'') # PCI Ethernet NICs. device de # DEC/Intel DC21x4x (``Tulip'') device em # Intel Pro/1000 Gigabit Ethernet device igb # Intel Pro/1000 PCIE Gigabit Ethernet #device ixgbe # Intel Pro/10Gbe PCIE Ethernet device le # AMD Am7900 LANCE and Am79C9xx PCnet device mxge # Myricom Myri-10G 10GbE NIC device nxge # Neterion Xframe 10GbE Server/Storage Adapter device ti # Alteon Networks Tigon I/II gigabit Ethernet device txp # 3Com 3cR990 (``Typhoon'') device vx # 3Com 3c590, 3c595 (``Vortex'') # PCI FDDI NICs. device fpa # PCI WAN adapters. device lmc # Use "private" jumbo buffers allocated exclusively for the ti(4) driver. # This option is incompatible with the TI_JUMBO_HDRSPLIT option below. #options TI_PRIVATE_JUMBOS # Turn on the header splitting option for the ti(4) driver firmware. This # only works for Tigon II chips, and has no effect for Tigon I chips. options TI_JUMBO_HDRSPLIT # These two options allow manipulating the mbuf cluster size and mbuf size, # respectively. Be very careful with NIC driver modules when changing # these from their default values, because that can potentially cause a # mismatch between the mbuf size assumed by the kernel and the mbuf size # assumed by a module. The only driver that currently has the ability to # detect a mismatch is ti(4). options MCLSHIFT=12 # mbuf cluster shift in bits, 12 == 4KB options MSIZE=512 # mbuf size in bytes # # ATM related options (Cranor version) # (note: this driver cannot be used with the HARP ATM stack) # # The `en' device provides support for Efficient Networks (ENI) # ENI-155 PCI midway cards, and the Adaptec 155Mbps PCI ATM cards (ANA-59x0). # # The `hatm' device provides support for Fore/Marconi HE155 and HE622 # ATM PCI cards. # # The `fatm' device provides support for Fore PCA200E ATM PCI cards. # # The `patm' device provides support for IDT77252 based cards like # ProSum's ProATM-155 and ProATM-25 and IDT's evaluation boards. # # atm device provides generic atm functions and is required for # atm devices. 
# NATM enables the netnatm protocol family that can be used to # bypass TCP/IP. # # utopia provides the access to the ATM PHY chips and is required for en, # hatm and fatm. # # the current driver supports only PVC operations (no atm-arp, no multicast). # for more details, please read the original documents at # http://www.ccrc.wustl.edu/pub/chuck/tech/bsdatm/bsdatm.html # device atm device en device fatm #Fore PCA200E device hatm #Fore/Marconi HE155/622 device patm #IDT77252 cards (ProATM and IDT) device utopia #ATM PHY driver -options NATM #native ATM +#options NATM #native ATM options LIBMBPOOL #needed by patm, iatm # # Sound drivers # # sound: The generic sound driver. # device sound # # snd_*: Device-specific drivers. # # The flags of the device give the driver a bit more info about the # device that is normally obtained through the PnP interface # (a worked example appears below, after the driver list). # bit 2..0 secondary DMA channel; # bit 4 set if the board uses two dma channels; # bit 15..8 board type, overrides autodetection; leave it # zero if you don't know what to put in (and you don't, # since this is unsupported at the moment...). # # snd_ad1816: Analog Devices AD1816 ISA PnP/non-PnP. # snd_als4000: Avance Logic ALS4000 PCI. # snd_atiixp: ATI IXP 200/300/400 PCI. # snd_au88x0 Aureal Vortex 1/2/Advantage PCI. This driver # lacks support for playback and recording. # snd_audiocs: Crystal Semiconductor CS4231 SBus/EBus. Only # for sparc64. # snd_cmi: CMedia CMI8338/CMI8738 PCI. # snd_cs4281: Crystal Semiconductor CS4281 PCI. # snd_csa: Crystal Semiconductor CS461x/428x PCI. (except # 4281) # snd_ds1: Yamaha DS-1 PCI. # snd_emu10k1: Creative EMU10K1 PCI and EMU10K2 (Audigy) PCI. # snd_emu10kx: Creative SoundBlaster Live! and Audigy # snd_envy24: VIA Envy24 and compatible, needs snd_spicds. # snd_envy24ht: VIA Envy24HT and compatible, needs snd_spicds. # snd_es137x: Ensoniq AudioPCI ES137x PCI. # snd_ess: Ensoniq ESS ISA PnP/non-PnP, to be used in # conjunction with snd_sbc. # snd_fm801: Forte Media FM801 PCI. # snd_gusc: Gravis UltraSound ISA PnP/non-PnP. # snd_hda: Intel High Definition Audio (Controller) and # compatible. # snd_ich: Intel ICH PCI and some more audio controllers # embedded in a chipset, for example nVidia # nForce controllers. # snd_maestro: ESS Technology Maestro-1/2x PCI. # snd_maestro3: ESS Technology Maestro-3/Allegro PCI. # snd_mss: Microsoft Sound System ISA PnP/non-PnP. # snd_neomagic: Neomagic 256 AV/ZX PCI. # snd_sb16: Creative SoundBlaster16, to be used in # conjunction with snd_sbc. # snd_sb8: Creative SoundBlaster (pre-16), to be used in # conjunction with snd_sbc. # snd_sbc: Creative SoundBlaster ISA PnP/non-PnP. # Supports ESS and Avance ISA chips as well. # snd_spicds: SPI codec driver, needed by Envy24/Envy24HT drivers. # snd_solo: ESS Solo-1x PCI. # snd_t4dwave: Trident 4DWave DX/NX PCI, Sis 7018 PCI and Acer Labs # M5451 PCI. # snd_via8233: VIA VT8233x PCI. # snd_via82c686: VIA VT82C686A PCI. # snd_vibes: S3 Sonicvibes PCI. # snd_uaudio: USB audio.
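# As a worked example of the flag layout described above (offered purely as
# an illustrative reading, not as a recommended setting for any particular
# card): the sample line hint.sbc.0.flags="0x15" shown further below decodes
# as bits 2..0 = 5 (secondary DMA channel 5), bit 4 = 1 (the board uses two
# DMA channels), and bits 15..8 = 0 (board type left to autodetection).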
device snd_ad1816 device snd_als4000 device snd_atiixp #device snd_au88x0 #device snd_audiocs device snd_cmi device snd_cs4281 device snd_csa device snd_ds1 device snd_emu10k1 device snd_emu10kx device snd_envy24 device snd_envy24ht device snd_es137x device snd_ess device snd_fm801 device snd_gusc device snd_hda device snd_ich device snd_maestro device snd_maestro3 device snd_mss device snd_neomagic device snd_sb16 device snd_sb8 device snd_sbc device snd_solo device snd_spicds device snd_t4dwave device snd_via8233 device snd_via82c686 device snd_vibes device snd_uaudio # For non-PnP sound cards: hint.pcm.0.at="isa" hint.pcm.0.irq="10" hint.pcm.0.drq="1" hint.pcm.0.flags="0x0" hint.sbc.0.at="isa" hint.sbc.0.port="0x220" hint.sbc.0.irq="5" hint.sbc.0.drq="1" hint.sbc.0.flags="0x15" hint.gusc.0.at="isa" hint.gusc.0.port="0x220" hint.gusc.0.irq="5" hint.gusc.0.drq="1" hint.gusc.0.flags="0x13" # # IEEE-488 hardware: # pcii: PCIIA cards (uPD7210 based isa cards) # tnt4882: National Instruments PCI-GPIB card. device pcii hint.pcii.0.at="isa" hint.pcii.0.port="0x2e1" hint.pcii.0.irq="5" hint.pcii.0.drq="1" device tnt4882 # # Miscellaneous hardware: # # scd: Sony CD-ROM using proprietary (non-ATAPI) interface # mcd: Mitsumi CD-ROM using proprietary (non-ATAPI) interface # bktr: Brooktree bt848/848a/849a/878/879 video capture and TV Tuner board # cy: Cyclades serial driver # joy: joystick (including IO DATA PCJOY PC Card joystick) # rc: RISCom/8 multiport card # rp: Comtrol Rocketport(ISA/PCI) - single card # si: Specialix SI/XIO 4-32 port terminal multiplexor # cmx: OmniKey CardMan 4040 pccard smartcard reader # Notes on the Comtrol Rocketport driver: # # The exact values used for rp0 depend on how many boards you have # in the system. The manufacturer's sample configs are listed as: # # device rp # core driver support # # Comtrol Rocketport ISA single card # hint.rp.0.at="isa" # hint.rp.0.port="0x280" # # If instead you have two ISA cards, one installed at 0x100 and the # second installed at 0x180, then you should add the following to # your kernel probe hints: # hint.rp.0.at="isa" # hint.rp.0.port="0x100" # hint.rp.1.at="isa" # hint.rp.1.port="0x180" # # For 4 ISA cards, it might be something like this: # hint.rp.0.at="isa" # hint.rp.0.port="0x180" # hint.rp.1.at="isa" # hint.rp.1.port="0x100" # hint.rp.2.at="isa" # hint.rp.2.port="0x340" # hint.rp.3.at="isa" # hint.rp.3.port="0x240" # # For PCI cards, you need no hints. # Mitsumi CD-ROM device mcd hint.mcd.0.at="isa" hint.mcd.0.port="0x300" # for the Sony CDU31/33A CDROM device scd hint.scd.0.at="isa" hint.scd.0.port="0x230" device joy # PnP aware, hints for non-PnP only hint.joy.0.at="isa" hint.joy.0.port="0x201" device cmx # # The 'bktr' device is a PCI video capture device using the Brooktree # bt848/bt848a/bt849a/bt878/bt879 chipset. When used with a TV Tuner it forms a # TV card, e.g. Miro PC/TV, Hauppauge WinCast/TV WinTV, VideoLogic Captivator, # Intel Smart Video III, AverMedia, IMS Turbo, FlyVideo. # # options OVERRIDE_CARD=xxx # options OVERRIDE_TUNER=xxx # options OVERRIDE_MSP=1 # options OVERRIDE_DBX=1 # These options can be used to override the auto detection # The current values for xxx are found in src/sys/dev/bktr/bktr_card.h # Using sysctl(8) run-time overrides on a per-card basis can be made # # options BROOKTREE_SYSTEM_DEFAULT=BROOKTREE_PAL # or # options BROOKTREE_SYSTEM_DEFAULT=BROOKTREE_NTSC # Specifies the default video capture mode. 
# This is required for Dual Crystal (28&35Mhz) boards where PAL is used # to prevent hangs during initialisation, e.g. VideoLogic Captivator PCI. # # options BKTR_USE_PLL # This is required for PAL or SECAM boards with a 28Mhz crystal and no 35Mhz # crystal, e.g. some new Bt878 cards. # # options BKTR_GPIO_ACCESS # This enables IOCTLs which give user level access to the GPIO port. # # options BKTR_NO_MSP_RESET # Prevents the MSP34xx reset. Good if you initialise the MSP in another OS first # # options BKTR_430_FX_MODE # Switch Bt878/879 cards into Intel 430FX chipset compatibility mode. # # options BKTR_SIS_VIA_MODE # Switch Bt878/879 cards into SIS/VIA chipset compatibility mode which is # needed for some old SiS and VIA chipset motherboards. # This also allows Bt878/879 chips to work on old OPTi (<1997) chipset # motherboards and motherboards with bad or incomplete PCI 2.1 support. # As a rough guess, old = before 1998 # # options BKTR_NEW_MSP34XX_DRIVER # Use new, more complete initialization scheme for the msp34* soundchip. # Should fix stereo autodetection if the old driver outputs only # mono sound. # # options BKTR_USE_FREEBSD_SMBUS # Compile with FreeBSD SMBus implementation # # Brooktree driver has been ported to the new I2C framework. Thus, # you'll need to have the following lines in the kernel config. # device smbus # device iicbus # device iicbb # device iicsmb # The iic and smb devices are only needed if you want to control other # I2C slaves connected to the external connector of some cards. # device bktr # # PC Card/PCMCIA and Cardbus # # cbb: pci/cardbus bridge implementing YENTA interface # pccard: pccard slots # cardbus: cardbus slots device cbb device pccard device cardbus # # MMC/SD # # mmc MMC/SD bus # mmcsd MMC/SD memory card # sdhci Generic PCI SD Host Controller # device mmc device mmcsd device sdhci # # SMB bus # # System Management Bus support is provided by the 'smbus' device. # Access to the SMBus device is via the 'smb' device (/dev/smb*), # which is a child of the 'smbus' device. # # Supported devices: # smb standard I/O through /dev/smb* # # Supported SMB interfaces: # iicsmb I2C to SMB bridge with any iicbus interface # bktr brooktree848 I2C hardware interface # intpm Intel PIIX4 (82371AB, 82443MX) Power Management Unit # alpm Acer Aladdin-IV/V/Pro2 Power Management Unit # ichsmb Intel ICH SMBus controller chips (82801AA, 82801AB, 82801BA) # viapm VIA VT82C586B/596B/686A and VT8233 Power Management Unit # amdpm AMD 756 Power Management Unit # amdsmb AMD 8111 SMBus 2.0 Controller # nfpm NVIDIA nForce Power Management Unit # nfsmb NVIDIA nForce2/3/4 MCP SMBus 2.0 Controller # device smbus # Bus support, required for smb below. device intpm device alpm device ichsmb device viapm device amdpm device amdsmb device nfpm device nfsmb device smb # # I2C Bus # # Philips i2c bus support is provided by the `iicbus' device. # # Supported devices: # ic i2c network interface # iic i2c standard io # iicsmb i2c to smb bridge. Allow i2c i/o with smb commands. # # Supported interfaces: # bktr brooktree848 I2C software interface # # Other: # iicbb generic I2C bit-banging code (needed by lpbb, bktr) # device iicbus # Bus support, required for ic/iic/iicsmb below. device iicbb device ic device iic device iicsmb # smb over i2c bridge # I2C peripheral devices # # ds133x Dallas Semiconductor DS1337, DS1338 and DS1339 RTC # ds1672 Dallas Semiconductor DS1672 RTC # device ds133x device ds1672 # Parallel-Port Bus # # Parallel port bus support is provided by the `ppbus' device.
# Multiple devices may be attached to the parallel port, devices # are automatically probed and attached when found. # # Supported devices: # vpo Iomega Zip Drive # Requires SCSI disk support ('scbus' and 'da'), best # performance is achieved with ports in EPP 1.9 mode. # lpt Parallel Printer # plip Parallel network interface # ppi General-purpose I/O ("Geek Port") + IEEE1284 I/O # pps Pulse per second Timing Interface # lpbb Philips official parallel port I2C bit-banging interface # # Supported interfaces: # ppc ISA-bus parallel port interfaces. # options PPC_PROBE_CHIPSET # Enable chipset specific detection # (see flags in ppc(4)) options DEBUG_1284 # IEEE1284 signaling protocol debug options PERIPH_1284 # Makes your computer act as an IEEE1284 # compliant peripheral options DONTPROBE_1284 # Avoid boot detection of PnP parallel devices options VP0_DEBUG # ZIP/ZIP+ debug options LPT_DEBUG # Printer driver debug options PPC_DEBUG # Parallel chipset level debug options PLIP_DEBUG # Parallel network IP interface debug options PCFCLOCK_VERBOSE # Verbose pcfclock driver options PCFCLOCK_MAX_RETRIES=5 # Maximum read tries (default 10) device ppc hint.ppc.0.at="isa" hint.ppc.0.irq="7" device ppbus device vpo device lpt device plip device ppi device pps device lpbb device pcfclock # Kernel BOOTP support options BOOTP # Use BOOTP to obtain IP address/hostname # Requires NFSCLIENT and NFS_ROOT options BOOTP_NFSROOT # NFS mount root filesystem using BOOTP info options BOOTP_NFSV3 # Use NFS v3 to NFS mount root options BOOTP_COMPAT # Workaround for broken bootp daemons. options BOOTP_WIRED_TO=fxp0 # Use interface fxp0 for BOOTP options BOOTP_BLOCKSIZE=8192 # Override NFS block size # # Add software watchdog routines. # options SW_WATCHDOG # # Disable swapping of stack pages. This option removes all # code which actually performs swapping, so it's not possible to turn # it back on at run-time. # # This is sometimes usable for systems which don't have any swap space # (see also sysctls "vm.defer_swapspace_pageouts" and # "vm.disable_swapspace_pageouts") # #options NO_SWAPPING # Set the number of sf_bufs to allocate. sf_bufs are virtual buffers # for sendfile(2) that are used to map file VM pages, and normally # default to a quantity that is roughly 16*MAXUSERS+512. You would # typically want about 4 of these for each simultaneous file send. # options NSFBUFS=1024 # # Enable extra debugging code for locks. This stores the filename and # line of whatever acquired the lock in the lock itself, and change a # number of function calls to pass around the relevant data. This is # not at all useful unless you are debugging lock code. Also note # that it is likely to break e.g. fstat(1) unless you recompile your # userland with -DDEBUG_LOCKS as well. 
# options DEBUG_LOCKS ##################################################################### # USB support # UHCI controller device uhci # OHCI controller device ohci # EHCI controller device ehci # SL811 Controller device slhci # General USB code (mandatory for USB) device usb # # USB Double Bulk Pipe devices device udbp # USB Fm Radio device ufm # Generic USB device driver device ugen # Human Interface Device (anything with buttons and dials) device uhid # USB keyboard device ukbd # USB printer device ulpt # USB Iomega Zip 100 Drive (Requires scbus and da) device umass # USB support for Belkin F5U109 and Magic Control Technology serial adapters device umct # USB modem support device umodem # USB mouse device ums # Diamond Rio 500 MP3 player device urio # USB scanners device uscanner # # USB serial support device ucom # USB support for 3G modem cards by Option, Novatel, Huawei and Sierra device u3g # USB support for Technologies ARK3116 based serial adapters device uark # USB support for Belkin F5U103 and compatible serial adapters device ubsa # USB support for serial adapters based on the FT8U100AX and FT8U232AM device uftdi # USB support for some Windows CE based serial communication. device uipaq # USB support for Prolific PL-2303 serial adapters device uplcom # USB support for Silicon Laboratories CP2101/CP2102 based USB serial adapters device uslcom # USB Visor and Palm devices device uvisor # USB serial support for DDI pocket's PHS device uvscom # # ADMtek USB ethernet. Supports the LinkSys USB100TX, # the Billionton USB100, the Melco LU-ATX, the D-Link DSB-650TX # and the SMC 2202USB. Also works with the ADMtek AN986 Pegasus # eval board. device aue # ASIX Electronics AX88172 USB 2.0 ethernet driver. Used in the # LinkSys USB200M and various other adapters. device axe # # Devices which communicate using Ethernet over USB, particularly # Communication Device Class (CDC) Ethernet specification. Supports # Sharp Zaurus PDAs, some DOCSIS cable modems and so on. device cdce # # CATC USB-EL1201A USB ethernet. Supports the CATC Netmate # and Netmate II, and the Belkin F5U111. device cue # # Kawasaki LSI ethernet. Supports the LinkSys USB10T, # Entrega USB-NET-E45, Peracom Ethernet Adapter, the # 3Com 3c19250, the ADS Technologies USB-10BT, the ATen UC10T, # the Netgear EA101, the D-Link DSB-650, the SMC 2102USB # and 2104USB, and the Corega USB-T. device kue # # RealTek RTL8150 USB to fast ethernet. Supports the Melco LUA-KTX # and the GREEN HOUSE GH-USB100B. device rue # # Davicom DM9601E USB to fast ethernet. Supports the Corega FEther USB-TXC. device udav # # ZyDas ZD1211/ZD1211B wireless ethernet driver device zyd # # Ralink Technology RT2500USB chipset driver device ural # # Ralink Technology RT2501USB/RT2601USB chipset driver device rum # # debugging options for the USB subsystem # options USB_DEBUG options U3G_DEBUG # options for ukbd: options UKBD_DFLT_KEYMAP # specify the built-in keymap makeoptions UKBD_DFLT_KEYMAP=it.iso # options for uplcom: options UPLCOM_INTR_INTERVAL=100 # interrupt pipe interval # in milliseconds # options for uvscom: options UVSCOM_DEFAULT_OPKTSIZE=8 # default output packet size options UVSCOM_INTR_INTERVAL=100 # interrupt pipe interval # in milliseconds ##################################################################### # FireWire support device firewire # FireWire bus code device sbp # SCSI over Firewire (Requires scbus and da) device sbp_targ # SBP-2 Target mode (Requires scbus and targ) device fwe # Ethernet over FireWire (non-standard!)
device fwip # IP over FireWire (RFC2734 and RFC3146) ##################################################################### # dcons support (Dumb Console Device) device dcons # dumb console driver device dcons_crom # FireWire attachment options DCONS_BUF_SIZE=16384 # buffer size options DCONS_POLL_HZ=100 # polling rate options DCONS_FORCE_CONSOLE=0 # force to be the primary console options DCONS_FORCE_GDB=1 # force to be the gdb device ##################################################################### # crypto subsystem # # This is a port of the OpenBSD crypto framework. Include this when # configuring IPSEC and when you have a h/w crypto device to accelerate # user applications that link to OpenSSL. # # Drivers are ports from OpenBSD with some simple enhancements that have # been fed back to OpenBSD. device crypto # core crypto support device cryptodev # /dev/crypto for access to h/w device rndtest # FIPS 140-2 entropy tester device hifn # Hifn 7951, 7781, etc. options HIFN_DEBUG # enable debugging support: hw.hifn.debug options HIFN_RNDTEST # enable rndtest support device ubsec # Broadcom 5501, 5601, 58xx options UBSEC_DEBUG # enable debugging support: hw.ubsec.debug options UBSEC_RNDTEST # enable rndtest support ##################################################################### # # Embedded system options: # # An embedded system might want to run something other than init. options INIT_PATH=/sbin/init:/stand/sysinstall # Debug options options BUS_DEBUG # enable newbus debugging options DEBUG_VFS_LOCKS # enable VFS lock debugging options SOCKBUF_DEBUG # enable sockbuf last record/mb tail checking # # Verbose SYSINIT # # Make the SYSINIT process performed by mi_startup() verbose. This is very # useful when porting to a new architecture. If DDB is also enabled, this # will print function names instead of addresses. options VERBOSE_SYSINIT ##################################################################### # SYSV IPC KERNEL PARAMETERS # # Maximum number of entries in a semaphore map. options SEMMAP=31 # Maximum number of System V semaphores that can be used on the system at # one time. options SEMMNI=11 # Total number of semaphores system wide options SEMMNS=61 # Total number of undo structures in system options SEMMNU=31 # Maximum number of System V semaphores that can be used by a single process # at one time. options SEMMSL=61 # Maximum number of operations that can be outstanding on a single System V # semaphore at one time. options SEMOPM=101 # Maximum number of undo operations that can be outstanding on a single # System V semaphore at one time. options SEMUME=11 # Maximum number of shared memory pages system wide. options SHMALL=1025 # Maximum size, in bytes, of a single System V shared memory region. options SHMMAX=(SHMMAXPGS*PAGE_SIZE+1) options SHMMAXPGS=1025 # Minimum size, in bytes, of a single System V shared memory region. options SHMMIN=2 # Maximum number of shared memory regions that can be used on the system # at one time. options SHMMNI=33 # Maximum number of System V shared memory regions that can be attached to # a single process at one time. options SHMSEG=9 # Set the amount of time (in seconds) the system will wait before # rebooting automatically when a kernel panic occurs. If set to (-1), # the system will wait indefinitely until a key is pressed on the # console. options PANIC_REBOOT_WAIT_TIME=16 # Attempt to bypass the buffer cache and put data directly into the # userland buffer for read operation when O_DIRECT flag is set on the # file. 
Both offset and length of the read operation must be # multiples of the physical media sector size. # options DIRECTIO # Specify a lower limit for the number of swap I/O buffers. They are # (among other things) used when bypassing the buffer cache due to # DIRECTIO kernel option enabled and O_DIRECT flag set on file. # options NSWBUF_MIN=120 ##################################################################### # More undocumented options for linting. # Note that documenting these are not considered an affront. options CAM_DEBUG_DELAY # VFS cluster debugging. options CLUSTERDEBUG options DEBUG # Kernel filelock debugging. options LOCKF_DEBUG # System V compatible message queues # Please note that the values provided here are used to test kernel # building. The defaults in the sources provide almost the same numbers. # MSGSSZ must be a power of 2 between 8 and 1024. options MSGMNB=2049 # Max number of chars in queue options MSGMNI=41 # Max number of message queue identifiers options MSGSEG=2049 # Max number of message segments options MSGSSZ=16 # Size of a message segment options MSGTQL=41 # Max number of messages in system options NBUF=512 # Number of buffer headers options SCSI_NCR_DEBUG options SCSI_NCR_MAX_SYNC=10000 options SCSI_NCR_MAX_WIDE=1 options SCSI_NCR_MYADDR=7 options SC_DEBUG_LEVEL=5 # Syscons debug level options SC_RENDER_DEBUG # syscons rendering debugging options SHOW_BUSYBUFS # List buffers that prevent root unmount options SLIP_IFF_OPTS options VFS_BIO_DEBUG # VFS buffer I/O debugging options KSTACK_MAX_PAGES=32 # Maximum pages to give the kernel stack # Adaptec Array Controller driver options options AAC_DEBUG # Debugging levels: # 0 - quiet, only emit warnings # 1 - noisy, emit major function # points and things done # 2 - extremely noisy, emit trace # items in loops, etc. # Yet more undocumented options for linting. # BKTR_ALLOC_PAGES has no effect except to cause warnings, and # BROOKTREE_ALLOC_PAGES hasn't actually been superseded by it, since the # driver still mostly spells this option BROOKTREE_ALLOC_PAGES. ##options BKTR_ALLOC_PAGES=(217*4+1) options BROOKTREE_ALLOC_PAGES=(217*4+1) options MAXFILES=999 Index: head/sys/conf/files =================================================================== --- head/sys/conf/files (revision 186118) +++ head/sys/conf/files (revision 186119) @@ -1,2777 +1,2778 @@ # $FreeBSD$ # # The long compile-with and dependency lines are required because of # limitations in config: backslash-newline doesn't work in strings, and # dependency lines other than the first are silently ignored. 
# acpi_quirks.h optional acpi \ dependency "$S/tools/acpi_quirks2h.awk $S/dev/acpica/acpi_quirks" \ compile-with "${AWK} -f $S/tools/acpi_quirks2h.awk $S/dev/acpica/acpi_quirks" \ no-obj no-implicit-rule before-depend \ clean "acpi_quirks.h" aicasm optional ahc | ahd \ dependency "$S/dev/aic7xxx/aicasm/*.[chyl]" \ compile-with "CC='${CC}' ${MAKE} -f $S/dev/aic7xxx/aicasm/Makefile MAKESRCPATH=$S/dev/aic7xxx/aicasm" \ no-obj no-implicit-rule \ clean "aicasm* y.tab.h" aic7xxx_seq.h optional ahc \ compile-with "./aicasm ${INCLUDES} -I$S/cam/scsi -I$S/dev/aic7xxx -o aic7xxx_seq.h -r aic7xxx_reg.h -p aic7xxx_reg_print.c -i $S/dev/aic7xxx/aic7xxx_osm.h $S/dev/aic7xxx/aic7xxx.seq" \ no-obj no-implicit-rule before-depend local \ clean "aic7xxx_seq.h" \ dependency "$S/dev/aic7xxx/aic7xxx.{reg,seq} $S/cam/scsi/scsi_message.h aicasm" aic7xxx_reg.h optional ahc \ compile-with "./aicasm ${INCLUDES} -I$S/cam/scsi -I$S/dev/aic7xxx -o aic7xxx_seq.h -r aic7xxx_reg.h -p aic7xxx_reg_print.c -i $S/dev/aic7xxx/aic7xxx_osm.h $S/dev/aic7xxx/aic7xxx.seq" \ no-obj no-implicit-rule before-depend local \ clean "aic7xxx_reg.h" \ dependency "$S/dev/aic7xxx/aic7xxx.{reg,seq} $S/cam/scsi/scsi_message.h aicasm" aic7xxx_reg_print.c optional ahc \ compile-with "./aicasm ${INCLUDES} -I$S/cam/scsi -I$S/dev/aic7xxx -o aic7xxx_seq.h -r aic7xxx_reg.h -p aic7xxx_reg_print.c -i $S/dev/aic7xxx/aic7xxx_osm.h $S/dev/aic7xxx/aic7xxx.seq" \ no-obj no-implicit-rule local \ clean "aic7xxx_reg_print.c" \ dependency "$S/dev/aic7xxx/aic7xxx.{reg,seq} $S/cam/scsi/scsi_message.h aicasm" aic7xxx_reg_print.o optional ahc ahc_reg_pretty_print \ compile-with "${NORMAL_C}" \ no-implicit-rule local aic79xx_seq.h optional ahd pci \ compile-with "./aicasm ${INCLUDES} -I$S/cam/scsi -I$S/dev/aic7xxx -o aic79xx_seq.h -r aic79xx_reg.h -p aic79xx_reg_print.c -i $S/dev/aic7xxx/aic79xx_osm.h $S/dev/aic7xxx/aic79xx.seq" \ no-obj no-implicit-rule before-depend local \ clean "aic79xx_seq.h" \ dependency "$S/dev/aic7xxx/aic79xx.{reg,seq} $S/cam/scsi/scsi_message.h aicasm" aic79xx_reg.h optional ahd pci \ compile-with "./aicasm ${INCLUDES} -I$S/cam/scsi -I$S/dev/aic7xxx -o aic79xx_seq.h -r aic79xx_reg.h -p aic79xx_reg_print.c -i $S/dev/aic7xxx/aic79xx_osm.h $S/dev/aic7xxx/aic79xx.seq" \ no-obj no-implicit-rule before-depend local \ clean "aic79xx_reg.h" \ dependency "$S/dev/aic7xxx/aic79xx.{reg,seq} $S/cam/scsi/scsi_message.h aicasm" aic79xx_reg_print.c optional ahd pci \ compile-with "./aicasm ${INCLUDES} -I$S/cam/scsi -I$S/dev/aic7xxx -o aic79xx_seq.h -r aic79xx_reg.h -p aic79xx_reg_print.c -i $S/dev/aic7xxx/aic79xx_osm.h $S/dev/aic7xxx/aic79xx.seq" \ no-obj no-implicit-rule local \ clean "aic79xx_reg_print.c" \ dependency "$S/dev/aic7xxx/aic79xx.{reg,seq} $S/cam/scsi/scsi_message.h aicasm" aic79xx_reg_print.o optional ahd pci ahd_reg_pretty_print \ compile-with "${NORMAL_C}" \ no-implicit-rule local emu10k1-alsa%diked.h optional snd_emu10k1 | snd_emu10kx \ dependency "$S/tools/emu10k1-mkalsa.sh $S/gnu/dev/sound/pci/emu10k1-alsa.h" \ compile-with "CC='${CC}' AWK=${AWK} sh $S/tools/emu10k1-mkalsa.sh $S/gnu/dev/sound/pci/emu10k1-alsa.h emu10k1-alsa%diked.h" \ no-obj no-implicit-rule before-depend \ clean "emu10k1-alsa%diked.h" p16v-alsa%diked.h optional snd_emu10kx pci \ dependency "$S/tools/emu10k1-mkalsa.sh $S/gnu/dev/sound/pci/p16v-alsa.h" \ compile-with "CC='${CC}' AWK=${AWK} sh $S/tools/emu10k1-mkalsa.sh $S/gnu/dev/sound/pci/p16v-alsa.h p16v-alsa%diked.h" \ no-obj no-implicit-rule before-depend \ clean "p16v-alsa%diked.h" p17v-alsa%diked.h optional 
snd_emu10kx pci \ dependency "$S/tools/emu10k1-mkalsa.sh $S/gnu/dev/sound/pci/p17v-alsa.h" \ compile-with "CC='${CC}' AWK=${AWK} sh $S/tools/emu10k1-mkalsa.sh $S/gnu/dev/sound/pci/p17v-alsa.h p17v-alsa%diked.h" \ no-obj no-implicit-rule before-depend \ clean "p17v-alsa%diked.h" miidevs.h optional miibus | mii \ dependency "$S/tools/miidevs2h.awk $S/dev/mii/miidevs" \ compile-with "${AWK} -f $S/tools/miidevs2h.awk $S/dev/mii/miidevs" \ no-obj no-implicit-rule before-depend \ clean "miidevs.h" pccarddevs.h standard \ dependency "$S/tools/pccarddevs2h.awk $S/dev/pccard/pccarddevs" \ compile-with "${AWK} -f $S/tools/pccarddevs2h.awk $S/dev/pccard/pccarddevs" \ no-obj no-implicit-rule before-depend \ clean "pccarddevs.h" usbdevs.h optional usb \ dependency "$S/tools/usbdevs2h.awk $S/dev/usb/usbdevs" \ compile-with "${AWK} -f $S/tools/usbdevs2h.awk $S/dev/usb/usbdevs -h" \ no-obj no-implicit-rule before-depend \ clean "usbdevs.h" usbdevs_data.h optional usb \ dependency "$S/tools/usbdevs2h.awk $S/dev/usb/usbdevs" \ compile-with "${AWK} -f $S/tools/usbdevs2h.awk $S/dev/usb/usbdevs -d" \ no-obj no-implicit-rule before-depend \ clean "usbdevs_data.h" cam/cam.c optional scbus cam/cam_periph.c optional scbus cam/cam_queue.c optional scbus cam/cam_sim.c optional scbus cam/cam_xpt.c optional scbus cam/scsi/scsi_all.c optional scbus cam/scsi/scsi_cd.c optional cd cam/scsi/scsi_ch.c optional ch cam/scsi/scsi_da.c optional da cam/scsi/scsi_low.c optional ct | ncv | nsp | stg cam/scsi/scsi_low_pisa.c optional ct | ncv | nsp | stg cam/scsi/scsi_pass.c optional pass cam/scsi/scsi_pt.c optional pt cam/scsi/scsi_sa.c optional sa cam/scsi/scsi_ses.c optional ses cam/scsi/scsi_sg.c optional sg cam/scsi/scsi_targ_bh.c optional targbh cam/scsi/scsi_target.c optional targ contrib/altq/altq/altq_cbq.c optional altq \ compile-with "${NORMAL_C} -I$S/contrib/pf" contrib/altq/altq/altq_cdnr.c optional altq contrib/altq/altq/altq_hfsc.c optional altq \ compile-with "${NORMAL_C} -I$S/contrib/pf" contrib/altq/altq/altq_priq.c optional altq \ compile-with "${NORMAL_C} -I$S/contrib/pf" contrib/altq/altq/altq_red.c optional altq \ compile-with "${NORMAL_C} -I$S/contrib/pf" contrib/altq/altq/altq_rio.c optional altq \ compile-with "${NORMAL_C} -I$S/contrib/pf" contrib/altq/altq/altq_rmclass.c optional altq contrib/altq/altq/altq_subr.c optional altq \ compile-with "${NORMAL_C} -I$S/contrib/pf" contrib/dev/acpica/dbcmds.c optional acpi acpi_debug contrib/dev/acpica/dbdisply.c optional acpi acpi_debug contrib/dev/acpica/dbexec.c optional acpi acpi_debug contrib/dev/acpica/dbfileio.c optional acpi acpi_debug contrib/dev/acpica/dbhistry.c optional acpi acpi_debug contrib/dev/acpica/dbinput.c optional acpi acpi_debug contrib/dev/acpica/dbstats.c optional acpi acpi_debug contrib/dev/acpica/dbutils.c optional acpi acpi_debug contrib/dev/acpica/dbxface.c optional acpi acpi_debug contrib/dev/acpica/dmbuffer.c optional acpi acpi_debug contrib/dev/acpica/dmnames.c optional acpi acpi_debug contrib/dev/acpica/dmopcode.c optional acpi acpi_debug contrib/dev/acpica/dmobject.c optional acpi acpi_debug contrib/dev/acpica/dmresrc.c optional acpi acpi_debug contrib/dev/acpica/dmresrcl.c optional acpi acpi_debug contrib/dev/acpica/dmresrcs.c optional acpi acpi_debug contrib/dev/acpica/dmutils.c optional acpi acpi_debug contrib/dev/acpica/dmwalk.c optional acpi acpi_debug contrib/dev/acpica/dsfield.c optional acpi contrib/dev/acpica/dsinit.c optional acpi contrib/dev/acpica/dsmethod.c optional acpi contrib/dev/acpica/dsmthdat.c optional acpi 
contrib/dev/acpica/dsobject.c optional acpi contrib/dev/acpica/dsopcode.c optional acpi contrib/dev/acpica/dsutils.c optional acpi contrib/dev/acpica/dswexec.c optional acpi contrib/dev/acpica/dswload.c optional acpi contrib/dev/acpica/dswscope.c optional acpi contrib/dev/acpica/dswstate.c optional acpi contrib/dev/acpica/evevent.c optional acpi contrib/dev/acpica/evgpe.c optional acpi contrib/dev/acpica/evgpeblk.c optional acpi contrib/dev/acpica/evmisc.c optional acpi contrib/dev/acpica/evregion.c optional acpi contrib/dev/acpica/evrgnini.c optional acpi contrib/dev/acpica/evsci.c optional acpi contrib/dev/acpica/evxface.c optional acpi contrib/dev/acpica/evxfevnt.c optional acpi contrib/dev/acpica/evxfregn.c optional acpi contrib/dev/acpica/exconfig.c optional acpi contrib/dev/acpica/exconvrt.c optional acpi contrib/dev/acpica/excreate.c optional acpi contrib/dev/acpica/exdump.c optional acpi contrib/dev/acpica/exfield.c optional acpi contrib/dev/acpica/exfldio.c optional acpi contrib/dev/acpica/exmisc.c optional acpi contrib/dev/acpica/exmutex.c optional acpi contrib/dev/acpica/exnames.c optional acpi contrib/dev/acpica/exoparg1.c optional acpi contrib/dev/acpica/exoparg2.c optional acpi contrib/dev/acpica/exoparg3.c optional acpi contrib/dev/acpica/exoparg6.c optional acpi contrib/dev/acpica/exprep.c optional acpi contrib/dev/acpica/exregion.c optional acpi contrib/dev/acpica/exresnte.c optional acpi contrib/dev/acpica/exresolv.c optional acpi contrib/dev/acpica/exresop.c optional acpi contrib/dev/acpica/exstore.c optional acpi contrib/dev/acpica/exstoren.c optional acpi contrib/dev/acpica/exstorob.c optional acpi contrib/dev/acpica/exsystem.c optional acpi contrib/dev/acpica/exutils.c optional acpi contrib/dev/acpica/hwacpi.c optional acpi contrib/dev/acpica/hwgpe.c optional acpi contrib/dev/acpica/hwregs.c optional acpi contrib/dev/acpica/hwsleep.c optional acpi contrib/dev/acpica/hwtimer.c optional acpi contrib/dev/acpica/nsaccess.c optional acpi contrib/dev/acpica/nsalloc.c optional acpi contrib/dev/acpica/nsdump.c optional acpi contrib/dev/acpica/nseval.c optional acpi contrib/dev/acpica/nsinit.c optional acpi contrib/dev/acpica/nsload.c optional acpi contrib/dev/acpica/nsnames.c optional acpi contrib/dev/acpica/nsobject.c optional acpi contrib/dev/acpica/nsparse.c optional acpi contrib/dev/acpica/nssearch.c optional acpi contrib/dev/acpica/nsutils.c optional acpi contrib/dev/acpica/nswalk.c optional acpi contrib/dev/acpica/nsxfeval.c optional acpi contrib/dev/acpica/nsxfname.c optional acpi contrib/dev/acpica/nsxfobj.c optional acpi contrib/dev/acpica/psargs.c optional acpi contrib/dev/acpica/psloop.c optional acpi contrib/dev/acpica/psopcode.c optional acpi contrib/dev/acpica/psparse.c optional acpi contrib/dev/acpica/psscope.c optional acpi contrib/dev/acpica/pstree.c optional acpi contrib/dev/acpica/psutils.c optional acpi contrib/dev/acpica/pswalk.c optional acpi contrib/dev/acpica/psxface.c optional acpi contrib/dev/acpica/rsaddr.c optional acpi contrib/dev/acpica/rscalc.c optional acpi contrib/dev/acpica/rscreate.c optional acpi contrib/dev/acpica/rsdump.c optional acpi contrib/dev/acpica/rsinfo.c optional acpi contrib/dev/acpica/rsio.c optional acpi contrib/dev/acpica/rsirq.c optional acpi contrib/dev/acpica/rslist.c optional acpi contrib/dev/acpica/rsmemory.c optional acpi contrib/dev/acpica/rsmisc.c optional acpi contrib/dev/acpica/rsutils.c optional acpi contrib/dev/acpica/rsxface.c optional acpi contrib/dev/acpica/tbfadt.c optional acpi contrib/dev/acpica/tbfind.c 
optional acpi contrib/dev/acpica/tbinstal.c optional acpi contrib/dev/acpica/tbutils.c optional acpi contrib/dev/acpica/tbxface.c optional acpi contrib/dev/acpica/tbxfroot.c optional acpi contrib/dev/acpica/utalloc.c optional acpi contrib/dev/acpica/utcache.c optional acpi contrib/dev/acpica/utclib.c optional acpi contrib/dev/acpica/utcopy.c optional acpi contrib/dev/acpica/utdebug.c optional acpi contrib/dev/acpica/utdelete.c optional acpi contrib/dev/acpica/uteval.c optional acpi contrib/dev/acpica/utglobal.c optional acpi contrib/dev/acpica/utinit.c optional acpi contrib/dev/acpica/utmath.c optional acpi contrib/dev/acpica/utmisc.c optional acpi contrib/dev/acpica/utmutex.c optional acpi contrib/dev/acpica/utobject.c optional acpi contrib/dev/acpica/utresrc.c optional acpi contrib/dev/acpica/utstate.c optional acpi contrib/dev/acpica/utxface.c optional acpi contrib/ipfilter/netinet/fil.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_auth.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_fil_freebsd.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_frag.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_log.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_nat.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_proxy.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_state.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_lookup.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-error -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_pool.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_htable.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_sync.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/mlfk_ipl.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ngatm/netnatm/api/cc_conn.c optional ngatm_ccatm \ compile-with "${NORMAL_C_NOWERROR} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/cc_data.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/cc_dump.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/cc_port.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/cc_sig.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/cc_user.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/unisap.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/misc/straddr.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/misc/unimsg_common.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/msg/traffic.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/msg/uni_ie.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/msg/uni_msg.c optional ngatm_atmbase \ 
compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/saal/saal_sscfu.c optional ngatm_sscfu \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/saal/saal_sscop.c optional ngatm_sscop \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_call.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_coord.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_party.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_print.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_reset.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_uni.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_unimsgcpy.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_verify.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/pf/net/if_pflog.c optional pflog \ compile-with "${NORMAL_C} -I$S/contrib/pf" contrib/pf/net/if_pfsync.c optional pfsync \ compile-with "${NORMAL_C} -I$S/contrib/pf" contrib/pf/net/pf.c optional pf \ compile-with "${NORMAL_C} -I$S/contrib/pf" contrib/pf/net/pf_if.c optional pf \ compile-with "${NORMAL_C} -I$S/contrib/pf" contrib/pf/net/pf_ioctl.c optional pf \ compile-with "${NORMAL_C} -I$S/contrib/pf" contrib/pf/net/pf_norm.c optional pf \ compile-with "${NORMAL_C} -I$S/contrib/pf" contrib/pf/net/pf_osfp.c optional pf \ compile-with "${NORMAL_C} -I$S/contrib/pf" contrib/pf/net/pf_ruleset.c optional pf \ compile-with "${NORMAL_C} -I$S/contrib/pf" contrib/pf/net/pf_subr.c optional pf \ compile-with "${NORMAL_C} -I$S/contrib/pf" contrib/pf/net/pf_table.c optional pf \ compile-with "${NORMAL_C} -I$S/contrib/pf" contrib/pf/netinet/in4_cksum.c optional pf inet crypto/blowfish/bf_ecb.c optional ipsec crypto/blowfish/bf_skey.c optional crypto | ipsec crypto/camellia/camellia.c optional crypto | ipsec crypto/camellia/camellia-api.c optional crypto | ipsec crypto/des/des_ecb.c optional crypto | ipsec | netsmb crypto/des/des_setkey.c optional crypto | ipsec | netsmb crypto/rc4/rc4.c optional netgraph_mppc_encryption | kgssapi crypto/rijndael/rijndael-alg-fst.c optional crypto | geom_bde | \ ipsec | random | wlan_ccmp crypto/rijndael/rijndael-api-fst.c optional geom_bde | random crypto/rijndael/rijndael-api.c optional crypto | ipsec | wlan_ccmp crypto/sha1.c optional carp | crypto | ipsec | \ netgraph_mppc_encryption | sctp crypto/sha2/sha2.c optional crypto | geom_bde | ipsec | random | \ sctp ddb/db_access.c optional ddb ddb/db_break.c optional ddb ddb/db_capture.c optional ddb ddb/db_command.c optional ddb ddb/db_examine.c optional ddb ddb/db_expr.c optional ddb ddb/db_input.c optional ddb ddb/db_lex.c optional ddb ddb/db_main.c optional ddb ddb/db_output.c optional ddb ddb/db_print.c optional ddb ddb/db_ps.c optional ddb ddb/db_run.c optional ddb ddb/db_script.c optional ddb ddb/db_sym.c optional ddb ddb/db_thread.c optional ddb ddb/db_textdump.c optional ddb ddb/db_variables.c optional ddb ddb/db_watch.c optional ddb ddb/db_write_cmd.c optional ddb #dev/dpt/dpt_control.c optional dpt dev/aac/aac.c optional aac dev/aac/aac_cam.c optional aacp aac dev/aac/aac_debug.c optional aac dev/aac/aac_disk.c optional aac dev/aac/aac_linux.c optional aac compat_linux dev/aac/aac_pci.c optional aac pci 
dev/acpi_support/acpi_aiboost.c optional acpi_aiboost acpi dev/acpi_support/acpi_asus.c optional acpi_asus acpi dev/acpi_support/acpi_fujitsu.c optional acpi_fujitsu acpi dev/acpi_support/acpi_ibm.c optional acpi_ibm acpi dev/acpi_support/acpi_panasonic.c optional acpi_panasonic acpi dev/acpi_support/acpi_sony.c optional acpi_sony acpi dev/acpi_support/acpi_toshiba.c optional acpi_toshiba acpi dev/acpica/Osd/OsdDebug.c optional acpi dev/acpica/Osd/OsdHardware.c optional acpi dev/acpica/Osd/OsdInterrupt.c optional acpi dev/acpica/Osd/OsdMemory.c optional acpi dev/acpica/Osd/OsdSchedule.c optional acpi dev/acpica/Osd/OsdStream.c optional acpi dev/acpica/Osd/OsdSynch.c optional acpi dev/acpica/Osd/OsdTable.c optional acpi dev/acpica/acpi.c optional acpi dev/acpica/acpi_acad.c optional acpi dev/acpica/acpi_battery.c optional acpi dev/acpica/acpi_button.c optional acpi dev/acpica/acpi_cmbat.c optional acpi dev/acpica/acpi_cpu.c optional acpi dev/acpica/acpi_ec.c optional acpi dev/acpica/acpi_hpet.c optional acpi dev/acpica/acpi_isab.c optional acpi isa dev/acpica/acpi_lid.c optional acpi dev/acpica/acpi_package.c optional acpi dev/acpica/acpi_pci.c optional acpi pci dev/acpica/acpi_pci_link.c optional acpi pci dev/acpica/acpi_pcib.c optional acpi pci dev/acpica/acpi_pcib_acpi.c optional acpi pci dev/acpica/acpi_pcib_pci.c optional acpi pci dev/acpica/acpi_perf.c optional acpi dev/acpica/acpi_powerres.c optional acpi dev/acpica/acpi_quirk.c optional acpi dev/acpica/acpi_resource.c optional acpi dev/acpica/acpi_smbat.c optional acpi dev/acpica/acpi_thermal.c optional acpi dev/acpica/acpi_throttle.c optional acpi dev/acpica/acpi_timer.c optional acpi dev/acpica/acpi_video.c optional acpi_video acpi dev/acpica/acpi_dock.c optional acpi_dock acpi dev/adlink/adlink.c optional adlink dev/advansys/adv_eisa.c optional adv eisa dev/advansys/adv_pci.c optional adv pci dev/advansys/advansys.c optional adv dev/advansys/advlib.c optional adv dev/advansys/advmcode.c optional adv dev/advansys/adw_pci.c optional adw pci dev/advansys/adwcam.c optional adw dev/advansys/adwlib.c optional adw dev/advansys/adwmcode.c optional adw dev/ae/if_ae.c optional ae pci dev/age/if_age.c optional age pci dev/agp/agp.c optional agp pci dev/agp/agp_if.m optional agp pci dev/aha/aha.c optional aha dev/aha/aha_isa.c optional aha isa dev/aha/aha_mca.c optional aha mca dev/ahb/ahb.c optional ahb eisa dev/aic/aic.c optional aic dev/aic/aic_pccard.c optional aic pccard dev/aic7xxx/ahc_eisa.c optional ahc eisa dev/aic7xxx/ahc_isa.c optional ahc isa dev/aic7xxx/ahc_pci.c optional ahc pci dev/aic7xxx/ahd_pci.c optional ahd pci dev/aic7xxx/aic7770.c optional ahc dev/aic7xxx/aic79xx.c optional ahd pci dev/aic7xxx/aic79xx_osm.c optional ahd pci dev/aic7xxx/aic79xx_pci.c optional ahd pci dev/aic7xxx/aic7xxx.c optional ahc dev/aic7xxx/aic7xxx_93cx6.c optional ahc dev/aic7xxx/aic7xxx_osm.c optional ahc dev/aic7xxx/aic7xxx_pci.c optional ahc pci dev/ale/if_ale.c optional ale pci dev/amd/amd.c optional amd dev/amr/amr.c optional amr dev/amr/amr_cam.c optional amrp amr dev/amr/amr_disk.c optional amr dev/amr/amr_linux.c optional amr compat_linux dev/amr/amr_pci.c optional amr pci dev/an/if_an.c optional an dev/an/if_an_isa.c optional an isa dev/an/if_an_pccard.c optional an pccard dev/an/if_an_pci.c optional an pci dev/asr/asr.c optional asr pci # dev/ata/ata_if.m optional ata | atacore dev/ata/ata-all.c optional ata | atacore dev/ata/ata-lowlevel.c optional ata | atacore dev/ata/ata-queue.c optional ata | atacore dev/ata/ata-card.c optional ata 
pccard | atapccard dev/ata/ata-cbus.c optional ata pc98 | atapc98 dev/ata/ata-isa.c optional ata isa | ataisa dev/ata/ata-pci.c optional ata pci | atapci dev/ata/ata-dma.c optional ata pci | atapci dev/ata/ata-sata.c optional ata pci | atapci dev/ata/chipsets/ata-ahci.c optional ata pci | ataahci | ataacerlabs | \ ataati | ataintel | atajmicron | atavia dev/ata/chipsets/ata-acard.c optional ata pci | ataacard dev/ata/chipsets/ata-acerlabs.c optional ata pci | ataacerlabs dev/ata/chipsets/ata-adaptec.c optional ata pci | ataadaptec dev/ata/chipsets/ata-amd.c optional ata pci | ataamd dev/ata/chipsets/ata-ati.c optional ata pci | ataati dev/ata/chipsets/ata-cenatek.c optional ata pci | atacenatek dev/ata/chipsets/ata-cypress.c optional ata pci | atacypress dev/ata/chipsets/ata-cyrix.c optional ata pci | atacyrix dev/ata/chipsets/ata-highpoint.c optional ata pci | atahighpoint dev/ata/chipsets/ata-intel.c optional ata pci | ataintel dev/ata/chipsets/ata-ite.c optional ata pci | ataite dev/ata/chipsets/ata-jmicron.c optional ata pci | atajmicron dev/ata/chipsets/ata-marvell.c optional ata pci | atamarvell dev/ata/chipsets/ata-micron.c optional ata pci | atamicron dev/ata/chipsets/ata-national.c optional ata pci | atanational dev/ata/chipsets/ata-netcell.c optional ata pci | atanetcell dev/ata/chipsets/ata-nvidia.c optional ata pci | atanvidia dev/ata/chipsets/ata-promise.c optional ata pci | atapromise dev/ata/chipsets/ata-serverworks.c optional ata pci | ataserverworks dev/ata/chipsets/ata-siliconimage.c optional ata pci | atasiliconimage dev/ata/chipsets/ata-sis.c optional ata pci | atasis dev/ata/chipsets/ata-via.c optional ata pci | atavia dev/ata/ata-disk.c optional atadisk dev/ata/ata-raid.c optional ataraid dev/ata/ata-usb.c optional atausb dev/ata/atapi-cd.c optional atapicd dev/ata/atapi-fd.c optional atapifd dev/ata/atapi-tape.c optional atapist dev/ata/atapi-cam.c optional atapicam # dev/ath/if_ath.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_pci.c optional ath pci \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ah_osdep.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah_eeprom_v1.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah_eeprom_v3.c optional ath_hal | ath_ar5211 | ath_ar5212 \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah_eeprom_v14.c optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah_regdomain.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ar5210/ar5210_attach.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_beacon.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_interrupts.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_keycache.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_misc.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_phy.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_power.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath 
-I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_recv.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_reset.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_xmit.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_attach.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_beacon.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_interrupts.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_keycache.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_misc.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_phy.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_power.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_recv.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_reset.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_xmit.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_ani.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_attach.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_beacon.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_eeprom.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_gpio.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_interrupts.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_keycache.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_misc.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_phy.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_power.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_recv.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_reset.c \ optional ath_hal | ath_ar5212 | 
ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_rfgain.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_xmit.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar2316.c optional ath_rf2316 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar2317.c optional ath_rf2317 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar2413.c optional ath_hal | ath_rf2413 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar2425.c optional ath_hal | ath_rf2425 | ath_rf2417 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5111.c optional ath_hal | ath_rf5111 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5112.c optional ath_hal | ath_rf5112 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5413.c optional ath_hal | ath_rf5413 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar2133.c optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_ani.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_attach.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_beacon.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_cal.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_cal_iq.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_cal_adcgain.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_cal_adcdc.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_eeprom.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_gpio.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_interrupts.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_keycache.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_misc.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_phy.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_power.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_recv.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ 
compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_reset.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_xmit.c \ optional ath_hal | ath_ar5416 | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar9160_attach.c optional ath_hal | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_rate/amrr/amrr.c optional ath_rate_amrr \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_rate/onoe/onoe.c optional ath_rate_onoe \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_rate/sample/sample.c optional ath_rate_sample \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/bce/if_bce.c optional bce dev/bfe/if_bfe.c optional bfe dev/bge/if_bge.c optional bge dev/bktr/bktr_audio.c optional bktr pci dev/bktr/bktr_card.c optional bktr pci dev/bktr/bktr_core.c optional bktr pci dev/bktr/bktr_i2c.c optional bktr pci smbus dev/bktr/bktr_os.c optional bktr pci dev/bktr/bktr_tuner.c optional bktr pci dev/bktr/msp34xx.c optional bktr pci dev/buslogic/bt.c optional bt dev/buslogic/bt_eisa.c optional bt eisa dev/buslogic/bt_isa.c optional bt isa dev/buslogic/bt_mca.c optional bt mca dev/buslogic/bt_pci.c optional bt pci dev/cardbus/cardbus.c optional cardbus dev/cardbus/cardbus_cis.c optional cardbus dev/cardbus/cardbus_device.c optional cardbus dev/cfi/cfi_core.c optional cfi dev/cfi/cfi_dev.c optional cfi dev/ciss/ciss.c optional ciss dev/cm/smc90cx6.c optional cm dev/cmx/cmx.c optional cmx dev/cmx/cmx_pccard.c optional cmx pccard dev/cpufreq/ichss.c optional cpufreq dev/cs/if_cs.c optional cs dev/cs/if_cs_isa.c optional cs isa dev/cs/if_cs_pccard.c optional cs pccard dev/cxgb/cxgb_main.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/cxgb_offload.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/cxgb_sge.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/cxgb_multiq.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_mc5.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_vsc7323.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_vsc8211.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_ael1002.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_mv88e1xxx.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_xgmac.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_t3_hw.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_tn1010.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/sys/uipc_mvec.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/sys/cxgb_support.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/cxgb_t3fw.c optional cxgb cxgb_t3fw \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cy/cy.c optional cy dev/cy/cy_isa.c optional cy isa dev/cy/cy_pci.c optional cy pci dev/dc/if_dc.c optional dc pci dev/dc/dcphy.c optional dc pci dev/dc/pnphy.c optional dc pci dev/dcons/dcons.c optional dcons dev/dcons/dcons_crom.c optional dcons_crom dev/dcons/dcons_os.c optional dcons dev/de/if_de.c optional de pci dev/digi/CX.c optional digi_CX dev/digi/CX_PCI.c optional digi_CX_PCI dev/digi/EPCX.c optional digi_EPCX dev/digi/EPCX_PCI.c 
optional digi_EPCX_PCI dev/digi/Xe.c optional digi_Xe dev/digi/Xem.c optional digi_Xem dev/digi/Xr.c optional digi_Xr dev/digi/digi.c optional digi dev/digi/digi_isa.c optional digi isa dev/digi/digi_pci.c optional digi pci dev/dpt/dpt_eisa.c optional dpt eisa dev/dpt/dpt_pci.c optional dpt pci dev/dpt/dpt_scsi.c optional dpt dev/drm/ati_pcigart.c optional drm dev/drm/drm_agpsupport.c optional drm dev/drm/drm_auth.c optional drm dev/drm/drm_bufs.c optional drm dev/drm/drm_context.c optional drm dev/drm/drm_dma.c optional drm dev/drm/drm_drawable.c optional drm dev/drm/drm_drv.c optional drm dev/drm/drm_fops.c optional drm dev/drm/drm_ioctl.c optional drm dev/drm/drm_irq.c optional drm dev/drm/drm_lock.c optional drm dev/drm/drm_memory.c optional drm dev/drm/drm_pci.c optional drm dev/drm/drm_scatter.c optional drm dev/drm/drm_sysctl.c optional drm dev/drm/drm_vm.c optional drm dev/drm/i915_dma.c optional i915drm dev/drm/i915_drv.c optional i915drm dev/drm/i915_irq.c optional i915drm dev/drm/i915_mem.c optional i915drm dev/drm/i915_suspend.c optional i915drm dev/drm/mach64_dma.c optional mach64drm dev/drm/mach64_drv.c optional mach64drm dev/drm/mach64_irq.c optional mach64drm dev/drm/mach64_state.c optional mach64drm dev/drm/mga_dma.c optional mgadrm dev/drm/mga_drv.c optional mgadrm dev/drm/mga_irq.c optional mgadrm dev/drm/mga_state.c optional mgadrm \ compile-with "${NORMAL_C} -finline-limit=13500" dev/drm/mga_warp.c optional mgadrm dev/drm/r128_cce.c optional r128drm dev/drm/r128_drv.c optional r128drm dev/drm/r128_irq.c optional r128drm dev/drm/r128_state.c optional r128drm \ compile-with "${NORMAL_C} -finline-limit=13500" dev/drm/r300_cmdbuf.c optional radeondrm dev/drm/radeon_cp.c optional radeondrm dev/drm/radeon_drv.c optional radeondrm dev/drm/radeon_irq.c optional radeondrm dev/drm/radeon_mem.c optional radeondrm dev/drm/radeon_state.c optional radeondrm dev/drm/savage_bci.c optional savagedrm dev/drm/savage_drv.c optional savagedrm dev/drm/savage_state.c optional savagedrm dev/drm/sis_drv.c optional sisdrm dev/drm/sis_ds.c optional sisdrm dev/drm/sis_mm.c optional sisdrm dev/drm/tdfx_drv.c optional tdfxdrm dev/ed/if_ed.c optional ed dev/ed/if_ed_novell.c optional ed dev/ed/if_ed_rtl80x9.c optional ed dev/ed/if_ed_pccard.c optional ed pccard dev/ed/if_ed_pci.c optional ed pci dev/eisa/eisa_if.m standard dev/eisa/eisaconf.c optional eisa dev/e1000/if_em.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/if_igb.c optional igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_80003es2lan.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82540.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82541.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82542.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82543.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82571.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82575.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/igb" dev/e1000/e1000_ich8lan.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_api.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_mac.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_manage.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_nvm.c optional em | igb \ compile-with 
"${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_phy.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_osdep.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/et/if_et.c optional et dev/en/if_en_pci.c optional en pci dev/en/midway.c optional en dev/ep/if_ep.c optional ep dev/ep/if_ep_eisa.c optional ep eisa dev/ep/if_ep_isa.c optional ep isa dev/ep/if_ep_mca.c optional ep mca dev/ep/if_ep_pccard.c optional ep pccard dev/esp/ncr53c9x.c optional esp dev/ex/if_ex.c optional ex dev/ex/if_ex_isa.c optional ex isa dev/ex/if_ex_pccard.c optional ex pccard dev/exca/exca.c optional cbb dev/fatm/if_fatm.c optional fatm pci dev/fb/splash.c optional splash dev/fe/if_fe.c optional fe dev/fe/if_fe_pccard.c optional fe pccard dev/firewire/firewire.c optional firewire dev/firewire/fwcrom.c optional firewire dev/firewire/fwdev.c optional firewire dev/firewire/fwdma.c optional firewire dev/firewire/fwmem.c optional firewire dev/firewire/fwohci.c optional firewire dev/firewire/fwohci_pci.c optional firewire pci dev/firewire/if_fwe.c optional fwe dev/firewire/if_fwip.c optional fwip dev/firewire/sbp.c optional sbp dev/firewire/sbp_targ.c optional sbp_targ dev/flash/at45d.c optional at45d dev/fxp/if_fxp.c optional fxp dev/gem/if_gem.c optional gem dev/gem/if_gem_pci.c optional gem pci dev/hatm/if_hatm.c optional hatm pci dev/hatm/if_hatm_intr.c optional hatm pci dev/hatm/if_hatm_ioctl.c optional hatm pci dev/hatm/if_hatm_rx.c optional hatm pci dev/hatm/if_hatm_tx.c optional hatm pci dev/hifn/hifn7751.c optional hifn dev/hme/if_hme.c optional hme dev/hme/if_hme_pci.c optional hme pci dev/hme/if_hme_sbus.c optional hme sbus dev/hptiop/hptiop.c optional hptiop scbus dev/hwpmc/hwpmc_logging.c optional hwpmc dev/hwpmc/hwpmc_mod.c optional hwpmc dev/ichsmb/ichsmb.c optional ichsmb dev/ichsmb/ichsmb_pci.c optional ichsmb pci dev/ida/ida.c optional ida dev/ida/ida_disk.c optional ida dev/ida/ida_eisa.c optional ida eisa dev/ida/ida_pci.c optional ida pci dev/ie/if_ie.c optional ie isa nowerror dev/ie/if_ie_isa.c optional ie isa dev/ieee488/ibfoo.c optional pcii | tnt4882 dev/ieee488/pcii.c optional pcii dev/ieee488/tnt4882.c optional tnt4882 dev/ieee488/upd7210.c optional pcii | tnt4882 dev/iicbus/ad7418.c optional ad7418 dev/iicbus/ds133x.c optional ds133x dev/iicbus/ds1672.c optional ds1672 dev/iicbus/icee.c optional icee dev/iicbus/if_ic.c optional ic dev/iicbus/iic.c optional iic dev/iicbus/iicbb.c optional iicbb dev/iicbus/iicbb_if.m optional iicbb dev/iicbus/iicbus.c optional iicbus dev/iicbus/iicbus_if.m optional iicbus dev/iicbus/iiconf.c optional iicbus dev/iicbus/iicsmb.c optional iicsmb \ dependency "iicbus_if.h" dev/iir/iir.c optional iir dev/iir/iir_ctrl.c optional iir dev/iir/iir_pci.c optional iir pci dev/ips/ips.c optional ips dev/ips/ips_commands.c optional ips dev/ips/ips_disk.c optional ips dev/ips/ips_ioctl.c optional ips dev/ips/ips_pci.c optional ips pci dev/ipw/if_ipw.c optional ipw ipwbssfw.c optional ipwbssfw | ipwfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk ipw_bss.fw:ipw_bss:130 -lintel_ipw -mipw_bss -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "ipwbssfw.c" ipw_bss.fwo optional ipwbssfw | ipwfw \ dependency "ipw_bss.fw" \ compile-with "${LD} -b binary -d -warn-common -r -d -o ${.TARGET} ipw_bss.fw" \ no-implicit-rule \ clean "ipw_bss.fwo" ipw_bss.fw optional ipwbssfw | ipwfw \ dependency ".PHONY" \ compile-with "uudecode -o ${.TARGET} $S/contrib/dev/ipw/ipw2100-1.3.fw.uu" \ no-obj no-implicit-rule \ clean "ipw_bss.fw" 
ipwibssfw.c optional ipwibssfw | ipwfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk ipw_ibss.fw:ipw_ibss:130 -lintel_ipw -mipw_ibss -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "ipwibssfw.c" ipw_ibss.fwo optional ipwibssfw | ipwfw \ dependency "ipw_ibss.fw" \ compile-with "${LD} -b binary -d -warn-common -r -d -o ${.TARGET} ipw_ibss.fw" \ no-implicit-rule \ clean "ipw_ibss.fwo" ipw_ibss.fw optional ipwibssfw | ipwfw \ dependency ".PHONY" \ compile-with "uudecode -o ${.TARGET} $S/contrib/dev/ipw/ipw2100-1.3-i.fw.uu" \ no-obj no-implicit-rule \ clean "ipw_ibss.fw" ipwmonitorfw.c optional ipwmonitorfw | ipwfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk ipw_monitor.fw:ipw_monitor:130 -lintel_ipw -mipw_monitor -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "ipwmonitorfw.c" ipw_monitor.fwo optional ipwmonitorfw | ipwfw \ dependency "ipw_monitor.fw" \ compile-with "${LD} -b binary -d -warn-common -r -d -o ${.TARGET} ipw_monitor.fw" \ no-implicit-rule \ clean "ipw_monitor.fwo" ipw_monitor.fw optional ipwmonitorfw | ipwfw \ dependency ".PHONY" \ compile-with "uudecode -o ${.TARGET} $S/contrib/dev/ipw/ipw2100-1.3-p.fw.uu" \ no-obj no-implicit-rule \ clean "ipw_monitor.fw" dev/iscsi/initiator/iscsi.c optional iscsi_initiator scbus dev/iscsi/initiator/iscsi_subr.c optional iscsi_initiator scbus dev/iscsi/initiator/isc_cam.c optional iscsi_initiator scbus dev/iscsi/initiator/isc_soc.c optional iscsi_initiator scbus dev/iscsi/initiator/isc_sm.c optional iscsi_initiator scbus dev/iscsi/initiator/isc_subr.c optional iscsi_initiator scbus dev/isp/isp.c optional isp dev/isp/isp_freebsd.c optional isp dev/isp/isp_library.c optional isp dev/isp/isp_pci.c optional isp pci dev/isp/isp_sbus.c optional isp sbus dev/isp/isp_target.c optional isp dev/ispfw/ispfw.c optional ispfw dev/iwi/if_iwi.c optional iwi iwibssfw.c optional iwibssfw | iwifw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwi_bss.fw:iwi_bss:300 -lintel_iwi -miwi_bss -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwibssfw.c" iwi_bss.fwo optional iwibssfw | iwifw \ dependency "iwi_bss.fw" \ compile-with "${LD} -b binary -d -warn-common -r -d -o ${.TARGET} iwi_bss.fw" \ no-implicit-rule \ clean "iwi_bss.fwo" iwi_bss.fw optional iwibssfw | iwifw \ dependency ".PHONY" \ compile-with "uudecode -o ${.TARGET} $S/contrib/dev/iwi/ipw2200-bss.fw.uu" \ no-obj no-implicit-rule \ clean "iwi_bss.fw" iwiibssfw.c optional iwiibssfw | iwifw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwi_ibss.fw:iwi_ibss:300 -lintel_iwi -miwi_ibss -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwiibssfw.c" iwi_ibss.fwo optional iwiibssfw | iwifw \ dependency "iwi_ibss.fw" \ compile-with "${LD} -b binary -d -warn-common -r -d -o ${.TARGET} iwi_ibss.fw" \ no-implicit-rule \ clean "iwi_ibss.fwo" iwi_ibss.fw optional iwiibssfw | iwifw \ dependency ".PHONY" \ compile-with "uudecode -o ${.TARGET} $S/contrib/dev/iwi/ipw2200-ibss.fw.uu" \ no-obj no-implicit-rule \ clean "iwi_ibss.fw" iwimonitorfw.c optional iwimonitorfw | iwifw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwi_monitor.fw:iwi_monitor:300 -lintel_iwi -miwi_monitor -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwimonitorfw.c" iwi_monitor.fwo optional iwimonitorfw | iwifw \ dependency "iwi_monitor.fw" \ compile-with "${LD} -b binary -d -warn-common -r -d -o ${.TARGET} iwi_monitor.fw" \ no-implicit-rule \ clean "iwi_monitor.fwo" iwi_monitor.fw optional iwimonitorfw | iwifw \ dependency ".PHONY" \ compile-with "uudecode -o ${.TARGET} 
$S/contrib/dev/iwi/ipw2200-sniffer.fw.uu" \ no-obj no-implicit-rule \ clean "iwi_monitor.fw" dev/iwn/if_iwn.c optional iwn iwnfw.c optional iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn.fw:iwnfw:44417 -lintel_iwn -miwn -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwnfw.c" iwnfw.fwo optional iwnfw \ dependency "iwn.fw" \ compile-with "${LD} -b binary -d -warn-common -r -d -o ${.TARGET} iwn.fw" \ no-implicit-rule \ clean "iwn.fwo" iwn.fw optional iwnfw \ dependency ".PHONY" \ compile-with "uudecode -o ${.TARGET} $S/contrib/dev/iwn/iwlwifi-4965-4.44.17.fw.uu" \ no-obj no-implicit-rule \ clean "iwn.fw" dev/ixgb/if_ixgb.c optional ixgb dev/ixgb/ixgb_ee.c optional ixgb dev/ixgb/ixgb_hw.c optional ixgb dev/ixgbe/ixgbe.c optional ixgbe \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_phy.c optional ixgbe \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_api.c optional ixgbe \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_common.c optional ixgbe \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_82598.c optional ixgbe \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/jme/if_jme.c optional jme pci dev/joy/joy.c optional joy dev/joy/joy_isa.c optional joy isa dev/joy/joy_pccard.c optional joy pccard dev/kbdmux/kbdmux.c optional kbdmux dev/le/am7990.c optional le dev/le/am79900.c optional le dev/le/if_le_pci.c optional le pci dev/le/lance.c optional le dev/led/led.c standard dev/lge/if_lge.c optional lge dev/lmc/if_lmc.c optional lmc dev/malo/if_malo.c optional malo dev/malo/if_malohal.c optional malo dev/malo/if_malo_pci.c optional malo pci dev/mc146818/mc146818.c optional mc146818 dev/mca/mca_bus.c optional mca dev/mcd/mcd.c optional mcd isa nowerror dev/mcd/mcd_isa.c optional mcd isa nowerror dev/md/md.c optional md dev/mem/memdev.c optional mem dev/mfi/mfi.c optional mfi dev/mfi/mfi_debug.c optional mfi dev/mfi/mfi_pci.c optional mfi pci dev/mfi/mfi_disk.c optional mfi dev/mfi/mfi_linux.c optional mfi compat_linux dev/mfi/mfi_cam.c optional mfip scbus dev/mii/acphy.c optional miibus | acphy dev/mii/amphy.c optional miibus | amphy dev/mii/atphy.c optional miibus | atphy dev/mii/bmtphy.c optional miibus | bmtphy dev/mii/brgphy.c optional miibus | brgphy dev/mii/ciphy.c optional miibus | ciphy dev/mii/e1000phy.c optional miibus | e1000phy # XXX only xl cards? dev/mii/exphy.c optional miibus | exphy dev/mii/gentbi.c optional miibus | gentbi dev/mii/icsphy.c optional miibus | icsphy # XXX only fxp cards? dev/mii/inphy.c optional miibus | inphy dev/mii/ip1000phy.c optional miibus | ip1000phy dev/mii/jmphy.c optional miibus | jmphy dev/mii/lxtphy.c optional miibus | lxtphy dev/mii/mii.c optional miibus | mii dev/mii/mii_physubr.c optional miibus | mii dev/mii/miibus_if.m optional miibus | mii dev/mii/mlphy.c optional miibus | mlphy dev/mii/nsgphy.c optional miibus | nsgphy dev/mii/nsphy.c optional miibus | nsphy dev/mii/nsphyter.c optional miibus | nsphyter dev/mii/pnaphy.c optional miibus | pnaphy dev/mii/qsphy.c optional miibus | qsphy dev/mii/rgephy.c optional miibus | rgephy dev/mii/rlphy.c optional miibus | rlphy dev/mii/rlswitch.c optional rlswitch # XXX rue only? 
dev/mii/ruephy.c optional miibus | ruephy dev/mii/smcphy.c optional miibus | smcphy dev/mii/tdkphy.c optional miibus | tdkphy dev/mii/tlphy.c optional miibus | tlphy dev/mii/truephy.c optional miibus | truephy dev/mii/ukphy.c optional miibus | mii dev/mii/ukphy_subr.c optional miibus | mii dev/mii/xmphy.c optional miibus | xmphy dev/mk48txx/mk48txx.c optional mk48txx dev/mlx/mlx.c optional mlx dev/mlx/mlx_disk.c optional mlx dev/mlx/mlx_pci.c optional mlx pci dev/mly/mly.c optional mly dev/mmc/mmc.c optional mmc dev/mmc/mmcbr_if.m standard dev/mmc/mmcbus_if.m standard dev/mmc/mmcsd.c optional mmcsd dev/mn/if_mn.c optional mn pci dev/mpt/mpt.c optional mpt dev/mpt/mpt_cam.c optional mpt dev/mpt/mpt_debug.c optional mpt dev/mpt/mpt_pci.c optional mpt pci dev/mpt/mpt_raid.c optional mpt dev/mpt/mpt_user.c optional mpt dev/msk/if_msk.c optional msk dev/mxge/if_mxge.c optional mxge pci dev/mxge/mxge_lro.c optional mxge pci dev/mxge/mxge_eth_z8e.c optional mxge pci dev/mxge/mxge_ethp_z8e.c optional mxge pci dev/mxge/mxge_rss_eth_z8e.c optional mxge pci dev/mxge/mxge_rss_ethp_z8e.c optional mxge pci dev/my/if_my.c optional my dev/ncv/ncr53c500.c optional ncv dev/ncv/ncr53c500_pccard.c optional ncv pccard dev/nge/if_nge.c optional nge dev/nxge/if_nxge.c optional nxge dev/nxge/xgehal/xgehal-device.c optional nxge dev/nxge/xgehal/xgehal-mm.c optional nxge dev/nxge/xgehal/xge-queue.c optional nxge dev/nxge/xgehal/xgehal-driver.c optional nxge dev/nxge/xgehal/xgehal-ring.c optional nxge dev/nxge/xgehal/xgehal-channel.c optional nxge dev/nxge/xgehal/xgehal-fifo.c optional nxge dev/nxge/xgehal/xgehal-stats.c optional nxge dev/nxge/xgehal/xgehal-config.c optional nxge dev/nxge/xgehal/xgehal-mgmt.c optional nxge dev/nmdm/nmdm.c optional nmdm dev/nsp/nsp.c optional nsp dev/nsp/nsp_pccard.c optional nsp pccard dev/null/null.c standard dev/patm/if_patm.c optional patm pci dev/patm/if_patm_attach.c optional patm pci dev/patm/if_patm_intr.c optional patm pci dev/patm/if_patm_ioctl.c optional patm pci dev/patm/if_patm_rtables.c optional patm pci dev/patm/if_patm_rx.c optional patm pci dev/patm/if_patm_tx.c optional patm pci dev/pbio/pbio.c optional pbio isa dev/pccard/card_if.m standard dev/pccard/pccard.c optional pccard dev/pccard/pccard_cis.c optional pccard dev/pccard/pccard_cis_quirks.c optional pccard dev/pccard/pccard_device.c optional pccard dev/pccard/power_if.m standard dev/pccbb/pccbb.c optional cbb dev/pccbb/pccbb_isa.c optional cbb isa dev/pccbb/pccbb_pci.c optional cbb pci dev/pcf/pcf.c optional pcf dev/pci/eisa_pci.c optional pci eisa dev/pci/fixup_pci.c optional pci dev/pci/hostb_pci.c optional pci dev/pci/ignore_pci.c optional pci dev/pci/isa_pci.c optional pci isa dev/pci/pci.c optional pci dev/pci/pci_if.m standard dev/pci/pci_pci.c optional pci dev/pci/pci_user.c optional pci dev/pci/pcib_if.m standard dev/pci/vga_pci.c optional pci dev/pcn/if_pcn.c optional pcn pci dev/pdq/if_fea.c optional fea eisa dev/pdq/if_fpa.c optional fpa pci dev/pdq/pdq.c optional nowerror fea eisa | fpa pci dev/pdq/pdq_ifsubr.c optional nowerror fea eisa | fpa pci dev/ppbus/if_plip.c optional plip dev/ppbus/immio.c optional vpo dev/ppbus/lpbb.c optional lpbb dev/ppbus/lpt.c optional lpt dev/ppbus/pcfclock.c optional pcfclock dev/ppbus/ppb_1284.c optional ppbus dev/ppbus/ppb_base.c optional ppbus dev/ppbus/ppb_msq.c optional ppbus dev/ppbus/ppbconf.c optional ppbus dev/ppbus/ppbus_if.m optional ppbus dev/ppbus/ppi.c optional ppi dev/ppbus/pps.c optional pps dev/ppbus/vpo.c optional vpo dev/ppbus/vpoio.c optional vpo 
dev/ppc/ppc.c optional ppc dev/ppc/ppc_acpi.c optional ppc acpi dev/ppc/ppc_isa.c optional ppc isa dev/ppc/ppc_pci.c optional ppc pci dev/ppc/ppc_puc.c optional ppc puc dev/pst/pst-iop.c optional pst dev/pst/pst-pci.c optional pst pci dev/pst/pst-raid.c optional pst dev/puc/puc.c optional puc dev/puc/puc_cfg.c optional puc dev/puc/puc_pccard.c optional puc pccard dev/puc/puc_pci.c optional puc pci dev/puc/pucdata.c optional puc pci dev/quicc/quicc_core.c optional quicc dev/ral/rt2560.c optional ral dev/ral/rt2661.c optional ral dev/ral/if_ral_pci.c optional ral pci rt2561fw.c optional rt2561fw | ralfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rt2561.fw:rt2561fw -mrt2561 -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rt2561fw.c" rt2561fw.fwo optional rt2561fw | ralfw \ dependency "rt2561.fw" \ compile-with "${LD} -b binary -d -warn-common -r -d -o ${.TARGET} rt2561.fw" \ no-implicit-rule \ clean "rt2561.fwo" rt2561.fw optional rt2561fw | ralfw \ dependency ".PHONY" \ compile-with "uudecode -o ${.TARGET} $S/contrib/dev/ral/rt2561.fw.uu" \ no-obj no-implicit-rule \ clean "rt2561.fw" rt2561sfw.c optional rt2561sfw | ralfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rt2561s.fw:rt2561sfw -mrt2561s -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rt2561sfw.c" rt2561sfw.fwo optional rt2561sfw | ralfw \ dependency "rt2561s.fw" \ compile-with "${LD} -b binary -d -warn-common -r -d -o ${.TARGET} rt2561s.fw" \ no-implicit-rule \ clean "rt2561s.fwo" rt2561s.fw optional rt2561sfw | ralfw \ dependency ".PHONY" \ compile-with "uudecode -o ${.TARGET} $S/contrib/dev/ral/rt2561s.fw.uu" \ no-obj no-implicit-rule \ clean "rt2561s.fw" rt2661fw.c optional rt2661fw | ralfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rt2661.fw:rt2661fw -mrt2661 -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rt2661fw.c" rt2661fw.fwo optional rt2661fw | ralfw \ dependency "rt2661.fw" \ compile-with "${LD} -b binary -d -warn-common -r -d -o ${.TARGET} rt2661.fw" \ no-implicit-rule \ clean "rt2661.fwo" rt2661.fw optional rt2661fw | ralfw \ dependency ".PHONY" \ compile-with "uudecode -o ${.TARGET} $S/contrib/dev/ral/rt2661.fw.uu" \ no-obj no-implicit-rule \ clean "rt2661.fw" rt2860fw.c optional rt2860fw | ralfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rt2860.fw:rt2860fw -mrt2860 -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rt2860fw.c" rt2860fw.fwo optional rt2860fw | ralfw \ dependency "rt2860.fw" \ compile-with "${LD} -b binary -d -warn-common -r -d -o ${.TARGET} rt2860.fw" \ no-implicit-rule \ clean "rt2860.fwo" rt2860.fw optional rt2860fw | ralfw \ dependency ".PHONY" \ compile-with "uudecode -o ${.TARGET} $S/contrib/dev/ral/rt2860.fw.uu" \ no-obj no-implicit-rule \ clean "rt2860.fw" dev/random/harvest.c standard dev/random/hash.c optional random dev/random/probe.c optional random dev/random/randomdev.c optional random dev/random/randomdev_soft.c optional random dev/random/yarrow.c optional random dev/ray/if_ray.c optional ray pccard dev/rc/rc.c optional rc dev/re/if_re.c optional re dev/rndtest/rndtest.c optional rndtest dev/rp/rp.c optional rp dev/rp/rp_isa.c optional rp isa dev/rp/rp_pci.c optional rp pci dev/safe/safe.c optional safe dev/scc/scc_if.m optional scc dev/scc/scc_bfe_ebus.c optional scc ebus dev/scc/scc_bfe_quicc.c optional scc quicc dev/scc/scc_bfe_sbus.c optional scc fhc | scc sbus dev/scc/scc_core.c optional scc dev/scc/scc_dev_quicc.c optional scc quicc dev/scc/scc_dev_sab82532.c optional scc dev/scc/scc_dev_z8530.c optional scc 
dev/scd/scd.c optional scd isa dev/scd/scd_isa.c optional scd isa dev/sdhci/sdhci.c optional sdhci pci dev/sf/if_sf.c optional sf pci dev/si/si.c optional si dev/si/si2_z280.c optional si dev/si/si3_t225.c optional si dev/si/si_eisa.c optional si eisa dev/si/si_isa.c optional si isa dev/si/si_pci.c optional si pci dev/sis/if_sis.c optional sis pci dev/sk/if_sk.c optional sk pci dev/smbus/smb.c optional smb dev/smbus/smbconf.c optional smbus dev/smbus/smbus.c optional smbus dev/smbus/smbus_if.m optional smbus dev/smc/if_smc.c optional smc dev/sn/if_sn.c optional sn dev/sn/if_sn_isa.c optional sn isa dev/sn/if_sn_pccard.c optional sn pccard dev/snp/snp.c optional snp dev/sound/clone.c optional sound dev/sound/unit.c optional sound dev/sound/isa/ad1816.c optional snd_ad1816 isa dev/sound/isa/ess.c optional snd_ess isa dev/sound/isa/gusc.c optional snd_gusc isa dev/sound/isa/mss.c optional snd_mss isa dev/sound/isa/sb16.c optional snd_sb16 isa dev/sound/isa/sb8.c optional snd_sb8 isa dev/sound/isa/sbc.c optional snd_sbc isa dev/sound/isa/sndbuf_dma.c optional sound isa dev/sound/pci/als4000.c optional snd_als4000 pci dev/sound/pci/atiixp.c optional snd_atiixp pci #dev/sound/pci/au88x0.c optional snd_au88x0 pci dev/sound/pci/cmi.c optional snd_cmi pci dev/sound/pci/cs4281.c optional snd_cs4281 pci dev/sound/pci/csa.c optional snd_csa pci \ warning "kernel contains GPL contaminated csaimg.h header" dev/sound/pci/csapcm.c optional snd_csa pci dev/sound/pci/ds1.c optional snd_ds1 pci dev/sound/pci/emu10k1.c optional snd_emu10k1 pci \ dependency "emu10k1-alsa%diked.h" \ warning "kernel contains GPL contaminated emu10k1 headers" dev/sound/pci/emu10kx.c optional snd_emu10kx pci \ dependency "emu10k1-alsa%diked.h" \ dependency "p16v-alsa%diked.h" \ dependency "p17v-alsa%diked.h" \ warning "kernel contains GPL contaminated emu10kx headers" dev/sound/pci/emu10kx-pcm.c optional snd_emu10kx pci \ dependency "emu10k1-alsa%diked.h" \ dependency "p16v-alsa%diked.h" \ dependency "p17v-alsa%diked.h" \ warning "kernel contains GPL contaminated emu10kx headers" dev/sound/pci/emu10kx-midi.c optional snd_emu10kx pci \ dependency "emu10k1-alsa%diked.h" \ warning "kernel contains GPL contaminated emu10kx headers" dev/sound/pci/envy24.c optional snd_envy24 pci dev/sound/pci/envy24ht.c optional snd_envy24ht pci dev/sound/pci/es137x.c optional snd_es137x pci dev/sound/pci/fm801.c optional snd_fm801 pci dev/sound/pci/ich.c optional snd_ich pci dev/sound/pci/maestro.c optional snd_maestro pci dev/sound/pci/maestro3.c optional snd_maestro3 pci \ warning "kernel contains GPL contaminated maestro3 headers" dev/sound/pci/neomagic.c optional snd_neomagic pci dev/sound/pci/solo.c optional snd_solo pci dev/sound/pci/spicds.c optional snd_spicds pci dev/sound/pci/t4dwave.c optional snd_t4dwave pci dev/sound/pci/via8233.c optional snd_via8233 pci dev/sound/pci/via82c686.c optional snd_via82c686 pci dev/sound/pci/vibes.c optional snd_vibes pci dev/sound/pci/hda/hdac.c optional snd_hda pci dev/sound/pcm/ac97.c optional sound dev/sound/pcm/ac97_if.m optional sound dev/sound/pcm/ac97_patch.c optional sound dev/sound/pcm/buffer.c optional sound dev/sound/pcm/channel.c optional sound dev/sound/pcm/channel_if.m optional sound dev/sound/pcm/dsp.c optional sound dev/sound/pcm/fake.c optional sound dev/sound/pcm/feeder.c optional sound dev/sound/pcm/feeder_fmt.c optional sound dev/sound/pcm/feeder_if.m optional sound dev/sound/pcm/feeder_rate.c optional sound dev/sound/pcm/feeder_volume.c optional sound dev/sound/pcm/mixer.c optional sound 
dev/sound/pcm/mixer_if.m optional sound dev/sound/pcm/sndstat.c optional sound dev/sound/pcm/sound.c optional sound dev/sound/pcm/vchan.c optional sound #dev/sound/usb/upcm.c optional snd_upcm usb dev/sound/usb/uaudio.c optional snd_uaudio usb dev/sound/usb/uaudio_pcm.c optional snd_uaudio usb dev/sound/midi/midi.c optional sound dev/sound/midi/mpu401.c optional sound dev/sound/midi/mpu_if.m optional sound dev/sound/midi/mpufoi_if.m optional sound dev/sound/midi/sequencer.c optional sound dev/sound/midi/synth_if.m optional sound dev/spibus/spibus.c optional spibus \ dependency "spibus_if.h" dev/spibus/spibus_if.m optional spibus dev/sr/if_sr.c optional sr dev/sr/if_sr_pci.c optional sr pci dev/ste/if_ste.c optional ste pci dev/stg/tmc18c30.c optional stg dev/stg/tmc18c30_isa.c optional stg isa dev/stg/tmc18c30_pccard.c optional stg pccard dev/stg/tmc18c30_pci.c optional stg pci dev/stg/tmc18c30_subr.c optional stg dev/stge/if_stge.c optional stge dev/streams/streams.c optional streams dev/sym/sym_hipd.c optional sym \ dependency "$S/dev/sym/sym_{conf,defs}.h" dev/syscons/blank/blank_saver.c optional blank_saver dev/syscons/daemon/daemon_saver.c optional daemon_saver dev/syscons/dragon/dragon_saver.c optional dragon_saver dev/syscons/fade/fade_saver.c optional fade_saver dev/syscons/fire/fire_saver.c optional fire_saver dev/syscons/green/green_saver.c optional green_saver dev/syscons/logo/logo.c optional logo_saver dev/syscons/logo/logo_saver.c optional logo_saver dev/syscons/rain/rain_saver.c optional rain_saver dev/syscons/schistory.c optional sc dev/syscons/scmouse.c optional sc dev/syscons/scterm-dumb.c optional sc dev/syscons/scterm.c optional sc dev/syscons/scvidctl.c optional sc dev/syscons/snake/snake_saver.c optional snake_saver dev/syscons/star/star_saver.c optional star_saver dev/syscons/syscons.c optional sc dev/syscons/sysmouse.c optional sc dev/syscons/warp/warp_saver.c optional warp_saver dev/tdfx/tdfx_linux.c optional tdfx_linux tdfx compat_linux dev/tdfx/tdfx_pci.c optional tdfx pci dev/ti/if_ti.c optional ti pci dev/tl/if_tl.c optional tl pci dev/trm/trm.c optional trm dev/twa/tw_cl_init.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twa/tw_cl_intr.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twa/tw_cl_io.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twa/tw_cl_misc.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twa/tw_osl_cam.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twa/tw_osl_freebsd.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twe/twe.c optional twe dev/twe/twe_freebsd.c optional twe dev/tx/if_tx.c optional tx dev/txp/if_txp.c optional txp dev/uart/uart_bus_acpi.c optional uart acpi #dev/uart/uart_bus_cbus.c optional uart cbus dev/uart/uart_bus_ebus.c optional uart ebus dev/uart/uart_bus_isa.c optional uart isa dev/uart/uart_bus_pccard.c optional uart pccard dev/uart/uart_bus_pci.c optional uart pci dev/uart/uart_bus_puc.c optional uart puc dev/uart/uart_bus_scc.c optional uart scc dev/uart/uart_core.c optional uart dev/uart/uart_dbg.c optional uart gdb dev/uart/uart_dev_ns8250.c optional uart uart_ns8250 dev/uart/uart_dev_quicc.c optional uart quicc dev/uart/uart_dev_sab82532.c optional uart uart_sab82532 dev/uart/uart_dev_sab82532.c optional uart scc dev/uart/uart_dev_z8530.c optional uart uart_z8530 dev/uart/uart_dev_z8530.c optional uart scc dev/uart/uart_if.m optional uart dev/uart/uart_subr.c optional uart dev/uart/uart_tty.c optional uart dev/ubsec/ubsec.c optional ubsec 
#
# USB support
dev/usb/ehci.c	optional ehci
dev/usb/ehci_pci.c	optional ehci pci
dev/usb/hid.c	optional usb
dev/usb/if_aue.c	optional aue
dev/usb/if_axe.c	optional axe
dev/usb/if_cdce.c	optional cdce
dev/usb/if_cue.c	optional cue
dev/usb/if_kue.c	optional kue
dev/usb/if_ural.c	optional ural
dev/usb/if_rue.c	optional rue
dev/usb/if_rum.c	optional rum
dev/usb/if_udav.c	optional udav
dev/usb/if_zyd.c	optional zyd
dev/usb/ohci.c	optional ohci
dev/usb/ohci_pci.c	optional ohci pci
dev/usb/sl811hs.c	optional slhci
dev/usb/slhci_pccard.c	optional slhci pccard
dev/usb/uark.c	optional uark
dev/usb/u3g.c	optional u3g
dev/usb/ubsa.c	optional ubsa
dev/usb/ubser.c	optional ubser
dev/usb/ucom.c	optional ucom
dev/usb/ucycom.c	optional ucycom
dev/usb/udbp.c	optional udbp
dev/usb/ufoma.c	optional ufoma
dev/usb/ufm.c	optional ufm
dev/usb/uftdi.c	optional uftdi
dev/usb/ugen.c	optional ugen
dev/usb/uhci.c	optional uhci
dev/usb/uhci_pci.c	optional uhci pci
dev/usb/uhid.c	optional uhid
dev/usb/uhub.c	optional usb
dev/usb/uipaq.c	optional uipaq
dev/usb/ukbd.c	optional ukbd
dev/usb/ulpt.c	optional ulpt
dev/usb/umass.c	optional umass
dev/usb/umct.c	optional umct
dev/usb/umodem.c	optional umodem
dev/usb/ums.c	optional ums
dev/usb/uplcom.c	optional uplcom
dev/usb/urio.c	optional urio
dev/usb/usb.c	optional usb
dev/usb/usb_ethersubr.c	optional usb
dev/usb/usb_if.m	optional usb
dev/usb/usb_mem.c	optional usb
dev/usb/usb_quirks.c	optional usb
dev/usb/usb_subr.c	optional usb
dev/usb/usbdi.c	optional usb
dev/usb/usbdi_util.c	optional usb
dev/usb/uscanner.c	optional uscanner
dev/usb/uslcom.c	optional uslcom
dev/usb/uvisor.c	optional uvisor
dev/usb/uvscom.c	optional uvscom
#
# USB2 controller drivers
#
dev/usb2/controller/at91dci.c	optional usb2_core usb2_controller usb2_controller_at91dci
dev/usb2/controller/at91dci_atmelarm.c	optional usb2_core usb2_controller usb2_controller_at91dci at91rm9200
dev/usb2/controller/musb2_otg.c	optional usb2_core usb2_controller usb2_controller_musb
dev/usb2/controller/musb2_otg_atmelarm.c	optional usb2_core usb2_controller usb2_controller_musb at91rm9200
dev/usb2/controller/ehci2.c	optional usb2_core usb2_controller usb2_controller_ehci
dev/usb2/controller/ehci2_pci.c	optional usb2_core usb2_controller usb2_controller_ehci pci
dev/usb2/controller/ohci2.c	optional usb2_core usb2_controller usb2_controller_ohci
dev/usb2/controller/ohci2_atmelarm.c	optional usb2_core usb2_controller usb2_controller_ohci at91rm9200
dev/usb2/controller/ohci2_pci.c	optional usb2_core usb2_controller usb2_controller_ohci pci
dev/usb2/controller/uhci2.c	optional usb2_core usb2_controller usb2_controller_uhci
dev/usb2/controller/uhci2_pci.c	optional usb2_core usb2_controller usb2_controller_uhci pci
dev/usb2/controller/uss820dci.c	optional usb2_core usb2_controller usb2_controller_uss820dci
dev/usb2/controller/uss820dci_atmelarm.c	optional usb2_core usb2_controller usb2_controller_uss820dci at91rm9200
dev/usb2/controller/usb2_controller.c	optional usb2_core usb2_controller
#
# USB2 storage drivers
#
dev/usb2/storage/ata-usb2.c	optional usb2_core usb2_storage usb2_storage_ata
dev/usb2/storage/umass2.c	optional usb2_core usb2_storage usb2_storage_mass
dev/usb2/storage/urio2.c	optional usb2_core usb2_storage usb2_storage_rio
dev/usb2/storage/usb2_storage.c	optional usb2_core usb2_storage
dev/usb2/storage/ustorage2_fs.c	optional usb2_core usb2_storage usb2_storage_fs
#
# USB2 NDIS driver
#
dev/usb2/ndis/if_ndis_usb2.c	optional usb2_core usb2_ndis
dev/usb2/ndis/usb2_ndis.c	optional usb2_core usb2_ndis
#
# USB2 core
#
dev/usb2/core/usb2_busdma.c	optional usb2_core
dev/usb2/core/usb2_compat_linux.c	optional usb2_core
dev/usb2/core/usb2_config_td.c	optional usb2_core
dev/usb2/core/usb2_core.c	optional usb2_core
dev/usb2/core/usb2_debug.c	optional usb2_core
dev/usb2/core/usb2_dev.c	optional usb2_core
dev/usb2/core/usb2_device.c	optional usb2_core
dev/usb2/core/usb2_dynamic.c	optional usb2_core
dev/usb2/core/usb2_error.c	optional usb2_core
dev/usb2/core/usb2_generic.c	optional usb2_core
dev/usb2/core/usb2_handle_request.c	optional usb2_core
dev/usb2/core/usb2_hid.c	optional usb2_core
dev/usb2/core/usb2_hub.c	optional usb2_core
dev/usb2/core/usb2_if.m	optional usb2_core
dev/usb2/core/usb2_lookup.c	optional usb2_core
dev/usb2/core/usb2_mbuf.c	optional usb2_core
dev/usb2/core/usb2_msctest.c	optional usb2_core
dev/usb2/core/usb2_parse.c	optional usb2_core
dev/usb2/core/usb2_process.c	optional usb2_core
dev/usb2/core/usb2_request.c	optional usb2_core
dev/usb2/core/usb2_sw_transfer.c	optional usb2_core
dev/usb2/core/usb2_transfer.c	optional usb2_core
dev/usb2/core/usb2_util.c	optional usb2_core
#
# USB2 ethernet drivers
#
dev/usb2/ethernet/if_aue2.c	optional usb2_core usb2_ethernet usb2_ethernet_aue
dev/usb2/ethernet/if_axe2.c	optional usb2_core usb2_ethernet usb2_ethernet_axe
dev/usb2/ethernet/if_cdce2.c	optional usb2_core usb2_ethernet usb2_ethernet_cdce
dev/usb2/ethernet/if_cue2.c	optional usb2_core usb2_ethernet usb2_ethernet_cue
dev/usb2/ethernet/if_kue2.c	optional usb2_core usb2_ethernet usb2_ethernet_kue
dev/usb2/ethernet/if_rue2.c	optional usb2_core usb2_ethernet usb2_ethernet_rue
dev/usb2/ethernet/if_udav2.c	optional usb2_core usb2_ethernet usb2_ethernet_udav
dev/usb2/ethernet/usb2_ethernet.c	optional usb2_core usb2_ethernet
#
# USB2 WLAN drivers
#
dev/usb2/wlan/if_rum2.c	optional usb2_core usb2_wlan usb2_wlan_rum
dev/usb2/wlan/if_ural2.c	optional usb2_core usb2_wlan usb2_wlan_ral
dev/usb2/wlan/if_zyd2.c	optional usb2_core usb2_wlan usb2_wlan_zyd
dev/usb2/wlan/usb2_wlan.c	optional usb2_core usb2_wlan
#
# USB2 serial and parallel port drivers
#
dev/usb2/serial/uark2.c	optional usb2_core usb2_serial usb2_serial_ark
dev/usb2/serial/ubsa2.c	optional usb2_core usb2_serial usb2_serial_bsa
dev/usb2/serial/ubser2.c	optional usb2_core usb2_serial usb2_serial_bser
dev/usb2/serial/uchcom2.c	optional usb2_core usb2_serial usb2_serial_chcom
dev/usb2/serial/ucycom2.c	optional usb2_core usb2_serial usb2_serial_cycom
dev/usb2/serial/ufoma2.c	optional usb2_core usb2_serial usb2_serial_foma
dev/usb2/serial/uftdi2.c	optional usb2_core usb2_serial usb2_serial_ftdi
dev/usb2/serial/ugensa2.c	optional usb2_core usb2_serial usb2_serial_gensa
dev/usb2/serial/uipaq2.c	optional usb2_core usb2_serial usb2_serial_ipaq
dev/usb2/serial/ulpt2.c	optional usb2_core usb2_serial usb2_serial_lpt
dev/usb2/serial/umct2.c	optional usb2_core usb2_serial usb2_serial_mct
dev/usb2/serial/umodem2.c	optional usb2_core usb2_serial usb2_serial_modem
dev/usb2/serial/umoscom2.c	optional usb2_core usb2_serial usb2_serial_moscom
dev/usb2/serial/uplcom2.c	optional usb2_core usb2_serial usb2_serial_plcom
dev/usb2/serial/usb2_serial.c	optional usb2_core usb2_serial
dev/usb2/serial/uvisor2.c	optional usb2_core usb2_serial usb2_serial_visor
dev/usb2/serial/uvscom2.c	optional usb2_core usb2_serial usb2_serial_vscom
#
# USB2 bluetooth drivers
#
dev/usb2/bluetooth/usb2_bluetooth.c	optional usb2_core usb2_bluetooth
dev/usb2/bluetooth/ng_ubt2.c	optional usb2_core usb2_bluetooth usb2_bluetooth_ng
dev/usb2/bluetooth/ubtbcmfw2.c	optional usb2_core usb2_bluetooth usb2_bluetooth_fw
#
# USB2 misc drivers
#
dev/usb2/misc/usb2_misc.c	optional usb2_core usb2_misc
dev/usb2/misc/ufm2.c	optional usb2_core usb2_misc usb2_misc_fm
dev/usb2/misc/udbp2.c	optional usb2_core usb2_misc usb2_misc_dbp
#
# USB2 input drivers
#
dev/usb2/input/uhid2.c	optional usb2_core usb2_input usb2_input_hid
dev/usb2/input/ukbd2.c	optional usb2_core usb2_input usb2_input_kbd
dev/usb2/input/ums2.c	optional usb2_core usb2_input usb2_input_ms
dev/usb2/input/usb2_input.c	optional usb2_core usb2_input
#
# USB2 quirks
#
dev/usb2/quirk/usb2_quirk.c	optional usb2_core usb2_quirk
#
# USB2 templates
#
dev/usb2/template/usb2_template.c	optional usb2_core usb2_template
dev/usb2/template/usb2_template_cdce.c	optional usb2_core usb2_template
dev/usb2/template/usb2_template_msc.c	optional usb2_core usb2_template
dev/usb2/template/usb2_template_mtp.c	optional usb2_core usb2_template
#
# USB2 image drivers
#
dev/usb2/image/usb2_image.c	optional usb2_core usb2_image
dev/usb2/image/uscanner2.c	optional usb2_core usb2_image usb2_scanner
#
# USB2 sound and MIDI drivers
#
dev/usb2/sound/usb2_sound.c	optional usb2_core usb2_sound
dev/usb2/sound/uaudio2.c	optional usb2_core usb2_sound
dev/usb2/sound/uaudio2_pcm.c	optional usb2_core usb2_sound
#
# USB2 END
#
dev/utopia/idtphy.c	optional utopia
dev/utopia/suni.c	optional utopia
dev/utopia/utopia.c	optional utopia
dev/vge/if_vge.c	optional vge
dev/vkbd/vkbd.c	optional vkbd
dev/vr/if_vr.c	optional vr pci
dev/vx/if_vx.c	optional vx
dev/vx/if_vx_eisa.c	optional vx eisa
dev/vx/if_vx_pci.c	optional vx pci
dev/watchdog/watchdog.c	standard
dev/wb/if_wb.c	optional wb pci
dev/wds/wd7000.c	optional wds isa
dev/wi/if_wi.c	optional wi
dev/wi/if_wi_pccard.c	optional wi pccard
dev/wi/if_wi_pci.c	optional wi pci
dev/wl/if_wl.c	optional wl isa
wpifw.c	optional wpifw \
	compile-with "${AWK} -f $S/tools/fw_stub.awk wpi.fw:wpifw:2144 -lintel_wpi -mwpi -c${.TARGET}" \
	no-implicit-rule before-depend local \
	clean "wpifw.c"
wpifw.fwo	optional wpifw \
	dependency "wpi.fw" \
	compile-with "${LD} -b binary -d -warn-common -r -d -o ${.TARGET} wpi.fw" \
	no-implicit-rule \
	clean "wpi.fwo"
wpi.fw	optional wpifw \
	dependency ".PHONY" \
	compile-with "uudecode -o ${.TARGET} $S/contrib/dev/wpi/iwlwifi-3945-2.14.4.fw.uu" \
	no-obj no-implicit-rule \
	clean "wpi.fw"
dev/xe/if_xe.c	optional xe
dev/xe/if_xe_pccard.c	optional xe pccard
dev/xl/if_xl.c	optional xl pci
fs/coda/coda_fbsd.c	optional vcoda
fs/coda/coda_psdev.c	optional vcoda
fs/coda/coda_subr.c	optional vcoda
fs/coda/coda_venus.c	optional vcoda
fs/coda/coda_vfsops.c	optional vcoda
fs/coda/coda_vnops.c	optional vcoda
fs/deadfs/dead_vnops.c	standard
fs/devfs/devfs_devs.c	standard
fs/devfs/devfs_rule.c	standard
fs/devfs/devfs_vfsops.c	standard
fs/devfs/devfs_vnops.c	standard
fs/fdescfs/fdesc_vfsops.c	optional fdescfs
fs/fdescfs/fdesc_vnops.c	optional fdescfs
fs/fifofs/fifo_vnops.c	standard
fs/hpfs/hpfs_alsubr.c	optional hpfs
fs/hpfs/hpfs_lookup.c	optional hpfs
fs/hpfs/hpfs_subr.c	optional hpfs
fs/hpfs/hpfs_vfsops.c	optional hpfs
fs/hpfs/hpfs_vnops.c	optional hpfs
fs/msdosfs/msdosfs_conv.c	optional msdosfs
fs/msdosfs/msdosfs_denode.c	optional msdosfs
fs/msdosfs/msdosfs_fat.c	optional msdosfs
fs/msdosfs/msdosfs_fileno.c	optional msdosfs
fs/msdosfs/msdosfs_iconv.c	optional msdosfs_iconv
fs/msdosfs/msdosfs_lookup.c	optional msdosfs
fs/msdosfs/msdosfs_vfsops.c	optional msdosfs
fs/msdosfs/msdosfs_vnops.c	optional msdosfs
fs/ntfs/ntfs_compr.c	optional ntfs
fs/ntfs/ntfs_iconv.c	optional ntfs_iconv
fs/ntfs/ntfs_ihash.c	optional ntfs
fs/ntfs/ntfs_subr.c optional ntfs fs/ntfs/ntfs_vfsops.c optional ntfs fs/ntfs/ntfs_vnops.c optional ntfs fs/nullfs/null_subr.c optional nullfs fs/nullfs/null_vfsops.c optional nullfs fs/nullfs/null_vnops.c optional nullfs fs/nwfs/nwfs_io.c optional nwfs fs/nwfs/nwfs_ioctl.c optional nwfs fs/nwfs/nwfs_node.c optional nwfs fs/nwfs/nwfs_subr.c optional nwfs fs/nwfs/nwfs_vfsops.c optional nwfs fs/nwfs/nwfs_vnops.c optional nwfs fs/portalfs/portal_vfsops.c optional portalfs fs/portalfs/portal_vnops.c optional portalfs fs/procfs/procfs.c optional procfs fs/procfs/procfs_ctl.c optional procfs fs/procfs/procfs_dbregs.c optional procfs fs/procfs/procfs_fpregs.c optional procfs fs/procfs/procfs_ioctl.c optional procfs fs/procfs/procfs_map.c optional procfs fs/procfs/procfs_mem.c optional procfs fs/procfs/procfs_note.c optional procfs fs/procfs/procfs_regs.c optional procfs fs/procfs/procfs_rlimit.c optional procfs fs/procfs/procfs_status.c optional procfs fs/procfs/procfs_type.c optional procfs fs/pseudofs/pseudofs.c optional pseudofs fs/pseudofs/pseudofs_fileno.c optional pseudofs fs/pseudofs/pseudofs_vncache.c optional pseudofs fs/pseudofs/pseudofs_vnops.c optional pseudofs fs/smbfs/smbfs_io.c optional smbfs fs/smbfs/smbfs_node.c optional smbfs fs/smbfs/smbfs_smb.c optional smbfs fs/smbfs/smbfs_subr.c optional smbfs fs/smbfs/smbfs_vfsops.c optional smbfs fs/smbfs/smbfs_vnops.c optional smbfs fs/udf/osta.c optional udf fs/udf/udf_iconv.c optional udf_iconv fs/udf/udf_vfsops.c optional udf fs/udf/udf_vnops.c optional udf fs/unionfs/union_subr.c optional unionfs fs/unionfs/union_vfsops.c optional unionfs fs/unionfs/union_vnops.c optional unionfs fs/tmpfs/tmpfs_vnops.c optional tmpfs fs/tmpfs/tmpfs_fifoops.c optional tmpfs fs/tmpfs/tmpfs_vfsops.c optional tmpfs fs/tmpfs/tmpfs_subr.c optional tmpfs gdb/gdb_cons.c optional gdb gdb/gdb_main.c optional gdb gdb/gdb_packet.c optional gdb geom/bde/g_bde.c optional geom_bde geom/bde/g_bde_crypt.c optional geom_bde geom/bde/g_bde_lock.c optional geom_bde geom/bde/g_bde_work.c optional geom_bde geom/cache/g_cache.c optional geom_cache geom/concat/g_concat.c optional geom_concat geom/eli/g_eli.c optional geom_eli geom/eli/g_eli_crypto.c optional geom_eli geom/eli/g_eli_ctl.c optional geom_eli geom/eli/g_eli_integrity.c optional geom_eli geom/eli/g_eli_key.c optional geom_eli geom/eli/g_eli_privacy.c optional geom_eli geom/eli/pkcs5v2.c optional geom_eli geom/gate/g_gate.c optional geom_gate geom/geom_aes.c optional geom_aes geom/geom_bsd.c optional geom_bsd geom/geom_bsd_enc.c optional geom_bsd geom/geom_ccd.c optional ccd | geom_ccd geom/geom_ctl.c standard geom/geom_dev.c standard geom/geom_disk.c standard geom/geom_dump.c standard geom/geom_event.c standard geom/geom_fox.c optional geom_fox geom/geom_io.c standard geom/geom_kern.c standard geom/geom_mbr.c optional geom_mbr geom/geom_mbr_enc.c optional geom_mbr geom/geom_pc98.c optional geom_pc98 geom/geom_pc98_enc.c optional geom_pc98 geom/geom_slice.c standard geom/geom_subr.c standard geom/geom_sunlabel.c optional geom_sunlabel geom/geom_sunlabel_enc.c optional geom_sunlabel geom/geom_vfs.c standard geom/geom_vol_ffs.c optional geom_vol geom/journal/g_journal.c optional geom_journal geom/journal/g_journal_ufs.c optional geom_journal geom/label/g_label.c optional geom_label geom/label/g_label_ext2fs.c optional geom_label geom/label/g_label_iso9660.c optional geom_label geom/label/g_label_msdosfs.c optional geom_label geom/label/g_label_ntfs.c optional geom_label geom/label/g_label_reiserfs.c optional 
geom_label geom/label/g_label_ufs.c optional geom_label geom/linux_lvm/g_linux_lvm.c optional geom_linux_lvm geom/mirror/g_mirror.c optional geom_mirror geom/mirror/g_mirror_ctl.c optional geom_mirror geom/multipath/g_multipath.c optional geom_multipath geom/nop/g_nop.c optional geom_nop geom/part/g_part.c standard geom/part/g_part_if.m standard geom/part/g_part_apm.c optional geom_part_apm geom/part/g_part_bsd.c optional geom_part_bsd geom/part/g_part_gpt.c optional geom_part_gpt geom/part/g_part_mbr.c optional geom_part_mbr geom/part/g_part_pc98.c optional geom_part_pc98 geom/part/g_part_vtoc8.c optional geom_part_vtoc8 geom/raid3/g_raid3.c optional geom_raid3 geom/raid3/g_raid3_ctl.c optional geom_raid3 geom/shsec/g_shsec.c optional geom_shsec geom/stripe/g_stripe.c optional geom_stripe geom/uzip/g_uzip.c optional geom_uzip geom/virstor/binstream.c optional geom_virstor geom/virstor/g_virstor.c optional geom_virstor geom/virstor/g_virstor_md.c optional geom_virstor geom/zero/g_zero.c optional geom_zero gnu/fs/ext2fs/ext2_alloc.c optional ext2fs \ warning "kernel contains GPL contaminated ext2fs filesystem" gnu/fs/ext2fs/ext2_balloc.c optional ext2fs gnu/fs/ext2fs/ext2_bmap.c optional ext2fs gnu/fs/ext2fs/ext2_inode.c optional ext2fs gnu/fs/ext2fs/ext2_inode_cnv.c optional ext2fs gnu/fs/ext2fs/ext2_linux_balloc.c optional ext2fs gnu/fs/ext2fs/ext2_linux_ialloc.c optional ext2fs gnu/fs/ext2fs/ext2_lookup.c optional ext2fs gnu/fs/ext2fs/ext2_subr.c optional ext2fs gnu/fs/ext2fs/ext2_vfsops.c optional ext2fs gnu/fs/ext2fs/ext2_vnops.c optional ext2fs gnu/fs/reiserfs/reiserfs_hashes.c optional reiserfs \ warning "kernel contains GPL contaminated ReiserFS filesystem" gnu/fs/reiserfs/reiserfs_inode.c optional reiserfs gnu/fs/reiserfs/reiserfs_item_ops.c optional reiserfs gnu/fs/reiserfs/reiserfs_namei.c optional reiserfs gnu/fs/reiserfs/reiserfs_prints.c optional reiserfs gnu/fs/reiserfs/reiserfs_stree.c optional reiserfs gnu/fs/reiserfs/reiserfs_vfsops.c optional reiserfs gnu/fs/reiserfs/reiserfs_vnops.c optional reiserfs # isa/isa_if.m standard isa/isa_common.c optional isa isa/isahint.c optional isa isa/orm.c optional isa isa/pnp.c optional isa isapnp isa/pnpparse.c optional isa isapnp fs/cd9660/cd9660_bmap.c optional cd9660 fs/cd9660/cd9660_lookup.c optional cd9660 fs/cd9660/cd9660_node.c optional cd9660 fs/cd9660/cd9660_rrip.c optional cd9660 fs/cd9660/cd9660_util.c optional cd9660 fs/cd9660/cd9660_vfsops.c optional cd9660 fs/cd9660/cd9660_vnops.c optional cd9660 fs/cd9660/cd9660_iconv.c optional cd9660_iconv kern/bus_if.m standard kern/clock_if.m standard kern/cpufreq_if.m standard kern/device_if.m standard kern/imgact_elf.c standard kern/imgact_shell.c standard kern/inflate.c optional gzip kern/init_main.c standard kern/init_sysent.c standard kern/ksched.c optional _kposix_priority_scheduling kern/kern_acct.c standard kern/kern_alq.c optional alq kern/kern_clock.c standard kern/kern_condvar.c standard kern/kern_conf.c standard kern/kern_cons.c standard kern/kern_cpu.c standard kern/kern_cpuset.c standard kern/kern_context.c standard kern/kern_descrip.c standard kern/kern_dtrace.c optional kdtrace_hooks kern/kern_environment.c standard kern/kern_event.c standard kern/kern_exec.c standard kern/kern_exit.c standard kern/kern_fork.c standard kern/kern_idle.c standard kern/kern_intr.c standard kern/kern_jail.c standard kern/kern_kthread.c standard kern/kern_ktr.c optional ktr kern/kern_ktrace.c standard kern/kern_linker.c standard kern/kern_lock.c standard kern/kern_lockf.c standard 
kern/kern_malloc.c standard kern/kern_mbuf.c standard kern/kern_mib.c standard kern/kern_module.c standard kern/kern_mtxpool.c standard kern/kern_mutex.c standard kern/kern_ntptime.c standard kern/kern_osd.c standard kern/kern_physio.c standard kern/kern_pmc.c standard kern/kern_poll.c optional device_polling kern/kern_priv.c standard kern/kern_proc.c standard kern/kern_prot.c standard kern/kern_resource.c standard kern/kern_rmlock.c standard kern/kern_rwlock.c standard kern/kern_sdt.c optional kdtrace_hooks kern/kern_sema.c standard kern/kern_shutdown.c standard kern/kern_sig.c standard kern/kern_subr.c standard kern/kern_switch.c standard kern/kern_sx.c standard kern/kern_synch.c standard kern/kern_syscalls.c standard kern/kern_sysctl.c standard kern/kern_tc.c standard kern/kern_thr.c standard kern/kern_thread.c standard kern/kern_time.c standard kern/kern_timeout.c standard kern/kern_umtx.c standard kern/kern_uuid.c standard kern/kern_xxx.c standard kern/kern_vimage.c standard kern/link_elf.c standard kern/linker_if.m standard kern/md4c.c optional netsmb kern/md5c.c standard kern/p1003_1b.c standard kern/posix4_mib.c standard kern/sched_4bsd.c optional sched_4bsd kern/sched_ule.c optional sched_ule kern/serdev_if.m standard kern/stack_protector.c standard \ compile-with "${NORMAL_C:N-fstack-protector*}" kern/subr_acl_posix1e.c standard kern/subr_autoconf.c standard kern/subr_blist.c standard kern/subr_bus.c standard kern/subr_bufring.c standard kern/subr_clist.c standard kern/subr_clock.c standard kern/subr_devstat.c standard kern/subr_disk.c standard kern/subr_eventhandler.c standard kern/subr_fattime.c standard kern/subr_firmware.c optional firmware kern/subr_hints.c standard kern/subr_kdb.c standard kern/subr_kobj.c standard kern/subr_lock.c standard kern/subr_log.c standard kern/subr_mbpool.c optional libmbpool kern/subr_mchain.c optional libmchain kern/subr_module.c standard kern/subr_msgbuf.c standard kern/subr_param.c standard kern/subr_pcpu.c standard kern/subr_power.c standard kern/subr_prf.c standard kern/subr_prof.c standard kern/subr_rman.c standard kern/subr_rtc.c standard kern/subr_sbuf.c standard kern/subr_scanf.c standard kern/subr_sleepqueue.c standard kern/subr_smp.c standard kern/subr_stack.c optional ddb | stack | ktr kern/subr_taskqueue.c standard kern/subr_trap.c standard kern/subr_turnstile.c standard kern/subr_unit.c standard kern/subr_witness.c optional witness kern/sys_generic.c standard kern/sys_pipe.c standard kern/sys_process.c standard kern/sys_socket.c standard kern/syscalls.c optional witness | invariants | kdtrace_hooks kern/sysv_ipc.c standard kern/sysv_msg.c optional sysvmsg kern/sysv_sem.c optional sysvsem kern/sysv_shm.c optional sysvshm kern/tty.c standard kern/tty_compat.c optional compat_43tty kern/tty_info.c standard kern/tty_inq.c standard kern/tty_outq.c standard kern/tty_pts.c standard kern/tty_pty.c optional pty kern/tty_tty.c standard kern/tty_ttydisc.c standard kern/uipc_accf.c optional inet kern/uipc_cow.c optional zero_copy_sockets kern/uipc_debug.c optional ddb kern/uipc_domain.c standard kern/uipc_mbuf.c standard kern/uipc_mbuf2.c standard kern/uipc_mqueue.c optional p1003_1b_mqueue kern/uipc_sem.c optional p1003_1b_semaphores kern/uipc_shm.c standard kern/uipc_sockbuf.c standard kern/uipc_socket.c standard kern/uipc_syscalls.c standard kern/uipc_usrreq.c standard kern/vfs_acl.c standard kern/vfs_aio.c optional vfs_aio kern/vfs_bio.c standard kern/vfs_cache.c standard kern/vfs_cluster.c standard kern/vfs_default.c standard 
kern/vfs_export.c		standard
kern/vfs_extattr.c		standard
kern/vfs_hash.c			standard
kern/vfs_init.c			standard
kern/vfs_lookup.c		standard
kern/vfs_mount.c		standard
kern/vfs_subr.c			standard
kern/vfs_syscalls.c		standard
kern/vfs_vnops.c		standard
#
# Kernel GSS-API
#
gssd.h				optional kgssapi		\
	dependency	"$S/kgssapi/gssd.x"			\
	compile-with	"rpcgen -hM $S/kgssapi/gssd.x | grep -v pthread.h > gssd.h" \
	no-obj no-implicit-rule before-depend local		\
	clean		"gssd.h"
gssd_xdr.c			optional kgssapi		\
	dependency	"$S/kgssapi/gssd.x gssd.h"		\
	compile-with	"rpcgen -c $S/kgssapi/gssd.x -o gssd_xdr.c" \
	no-implicit-rule before-depend local			\
	clean		"gssd_xdr.c"
gssd_clnt.c			optional kgssapi		\
	dependency	"$S/kgssapi/gssd.x gssd.h"		\
	compile-with	"rpcgen -lM $S/kgssapi/gssd.x | grep -v string.h > gssd_clnt.c" \
	no-implicit-rule before-depend local			\
	clean		"gssd_clnt.c"
kgssapi/gss_accept_sec_context.c	optional kgssapi
kgssapi/gss_add_oid_set_member.c	optional kgssapi
kgssapi/gss_acquire_cred.c		optional kgssapi
kgssapi/gss_canonicalize_name.c		optional kgssapi
kgssapi/gss_create_empty_oid_set.c	optional kgssapi
kgssapi/gss_delete_sec_context.c	optional kgssapi
kgssapi/gss_display_status.c		optional kgssapi
kgssapi/gss_export_name.c		optional kgssapi
kgssapi/gss_get_mic.c			optional kgssapi
kgssapi/gss_init_sec_context.c		optional kgssapi
kgssapi/gss_impl.c			optional kgssapi
kgssapi/gss_import_name.c		optional kgssapi
kgssapi/gss_names.c			optional kgssapi
kgssapi/gss_pname_to_uid.c		optional kgssapi
kgssapi/gss_release_buffer.c		optional kgssapi
kgssapi/gss_release_cred.c		optional kgssapi
kgssapi/gss_release_name.c		optional kgssapi
kgssapi/gss_release_oid_set.c		optional kgssapi
kgssapi/gss_set_cred_option.c		optional kgssapi
kgssapi/gss_test_oid_set_member.c	optional kgssapi
kgssapi/gss_unwrap.c			optional kgssapi
kgssapi/gss_verify_mic.c		optional kgssapi
kgssapi/gss_wrap.c			optional kgssapi
kgssapi/gss_wrap_size_limit.c		optional kgssapi
kgssapi/gssd_prot.c			optional kgssapi
kgssapi/krb5/krb5_mech.c		optional kgssapi
kgssapi/krb5/kcrypto.c			optional kgssapi
kgssapi/krb5/kcrypto_aes.c		optional kgssapi
kgssapi/krb5/kcrypto_arcfour.c		optional kgssapi
kgssapi/krb5/kcrypto_des.c		optional kgssapi
kgssapi/krb5/kcrypto_des3.c		optional kgssapi
kgssapi/kgss_if.m			optional kgssapi
kgssapi/gsstest.c			optional kgssapi_debug
# These files in libkern/ are those needed by all architectures.  Some
# of the files in libkern/ are only needed on some architectures, e.g.,
# libkern/divdi3.c is needed by i386 but not alpha.  Also, some of these
# routines may be optimized for a particular platform.  In either case,
# the file should be moved to conf/files.<arch> from here.
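# As a hedged illustration of the placement rule described above: a
# machine-dependent source such as libkern/divdi3.c would be listed in the
# per-architecture list (for instance conf/files.i386) rather than here,
# using the same one-entry-per-line syntax:
#
#	libkern/divdi3.c	standard
#
# A machine-independent file stays in this list, and may be gated on a
# kernel option and given extra build hooks; the file and option names in
# this sketch are hypothetical, only the syntax follows the entries above:
#
#	libkern/example.c	optional example_opt \
#		compile-with	"${NORMAL_C} -DEXAMPLE_OPT"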
# libkern/arc4random.c standard libkern/bcd.c standard libkern/bsearch.c standard libkern/crc32.c standard libkern/fnmatch.c standard libkern/gets.c standard libkern/iconv.c optional libiconv libkern/iconv_converter_if.m optional libiconv libkern/iconv_xlat.c optional libiconv libkern/iconv_xlat16.c optional libiconv libkern/index.c standard libkern/inet_ntoa.c standard libkern/mcount.c optional profiling-routine libkern/memcmp.c standard libkern/qsort.c standard libkern/qsort_r.c standard libkern/random.c standard libkern/rindex.c standard libkern/scanc.c standard libkern/skpc.c standard libkern/strcasecmp.c standard libkern/strcat.c standard libkern/strcmp.c standard libkern/strcpy.c standard libkern/strcspn.c standard libkern/strdup.c standard libkern/strlcat.c standard libkern/strlcpy.c standard libkern/strlen.c standard libkern/strncmp.c standard libkern/strncpy.c standard libkern/strsep.c standard libkern/strspn.c standard libkern/strstr.c standard libkern/strtol.c standard libkern/strtoq.c standard libkern/strtoul.c standard libkern/strtouq.c standard libkern/strvalid.c standard net/bpf.c standard net/bpf_buffer.c optional bpf net/bpf_jitter.c optional bpf_jitter net/bpf_filter.c optional bpf | netgraph_bpf net/bpf_zerocopy.c optional bpf net/bridgestp.c optional bridge | if_bridge net/bsd_comp.c optional ppp_bsdcomp net/ieee8023ad_lacp.c optional lagg net/if.c standard net/if_arcsubr.c optional arcnet net/if_atmsubr.c optional atm net/if_bridge.c optional bridge | if_bridge net/if_clone.c standard net/if_disc.c optional disc net/if_edsc.c optional edsc net/if_ef.c optional ef net/if_enc.c optional enc net/if_ethersubr.c optional ether \ compile-with "${NORMAL_C} -I$S/contrib/pf" net/if_faith.c optional faith net/if_fddisubr.c optional fddi net/if_fwsubr.c optional fwip net/if_gif.c optional gif net/if_gre.c optional gre net/if_iso88025subr.c optional token net/if_lagg.c optional lagg net/if_loop.c optional loop +net/if_llatbl.c standard net/if_media.c standard net/if_mib.c standard net/if_ppp.c optional ppp net/if_sl.c optional sl net/if_spppfr.c optional sppp | netgraph_sppp net/if_spppsubr.c optional sppp | netgraph_sppp net/if_stf.c optional stf net/if_tun.c optional tun net/if_tap.c optional tap net/if_vlan.c optional vlan net/mppcc.c optional netgraph_mppc_compression net/mppcd.c optional netgraph_mppc_compression net/netisr.c standard net/ppp_deflate.c optional ppp_deflate net/ppp_tty.c optional ppp net/pfil.c optional ether | inet net/radix.c standard net/radix_mpath.c standard net/raw_cb.c standard net/raw_usrreq.c standard net/route.c standard net/rtsock.c standard net/slcompress.c optional netgraph_vjc | ppp | sl | sppp | \ netgraph_sppp net/zlib.c optional crypto | geom_uzip | ipsec | \ mxge | ppp_deflate | netgraph_deflate | \ ddb_ctf net80211/ieee80211.c optional wlan net80211/ieee80211_acl.c optional wlan_acl net80211/ieee80211_adhoc.c optional wlan net80211/ieee80211_amrr.c optional wlan_amrr net80211/ieee80211_crypto.c optional wlan net80211/ieee80211_crypto_ccmp.c optional wlan_ccmp net80211/ieee80211_crypto_none.c optional wlan net80211/ieee80211_crypto_tkip.c optional wlan_tkip net80211/ieee80211_crypto_wep.c optional wlan_wep net80211/ieee80211_ddb.c optional wlan ddb net80211/ieee80211_dfs.c optional wlan net80211/ieee80211_freebsd.c optional wlan net80211/ieee80211_hostap.c optional wlan net80211/ieee80211_ht.c optional wlan net80211/ieee80211_input.c optional wlan net80211/ieee80211_ioctl.c optional wlan net80211/ieee80211_monitor.c optional wlan 
net80211/ieee80211_node.c optional wlan net80211/ieee80211_output.c optional wlan net80211/ieee80211_phy.c optional wlan net80211/ieee80211_power.c optional wlan net80211/ieee80211_proto.c optional wlan net80211/ieee80211_regdomain.c optional wlan net80211/ieee80211_rssadapt.c optional wlan_rssadapt net80211/ieee80211_scan.c optional wlan net80211/ieee80211_scan_sta.c optional wlan net80211/ieee80211_sta.c optional wlan net80211/ieee80211_wds.c optional wlan net80211/ieee80211_xauth.c optional wlan_xauth netatalk/aarp.c optional netatalk netatalk/at_control.c optional netatalk netatalk/at_proto.c optional netatalk netatalk/at_rmx.c optional netatalk netatalk/ddp_input.c optional netatalk netatalk/ddp_output.c optional netatalk netatalk/ddp_pcb.c optional netatalk netatalk/ddp_usrreq.c optional netatalk netgraph/atm/ccatm/ng_ccatm.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" netgraph/atm/ng_atm.c optional ngatm_atm netgraph/atm/ngatmbase.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" netgraph/atm/sscfu/ng_sscfu.c optional ngatm_sscfu \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" netgraph/atm/sscop/ng_sscop.c optional ngatm_sscop \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" netgraph/atm/uni/ng_uni.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" netgraph/bluetooth/common/ng_bluetooth.c optional netgraph_bluetooth netgraph/bluetooth/drivers/bt3c/ng_bt3c_pccard.c optional netgraph_bluetooth_bt3c netgraph/bluetooth/drivers/h4/ng_h4.c optional netgraph_bluetooth_h4 netgraph/bluetooth/drivers/ubt/ng_ubt.c optional netgraph_bluetooth_ubt netgraph/bluetooth/drivers/ubtbcmfw/ubtbcmfw.c optional netgraph_bluetooth_ubtbcmfw netgraph/bluetooth/hci/ng_hci_cmds.c optional netgraph_bluetooth_hci netgraph/bluetooth/hci/ng_hci_evnt.c optional netgraph_bluetooth_hci netgraph/bluetooth/hci/ng_hci_main.c optional netgraph_bluetooth_hci netgraph/bluetooth/hci/ng_hci_misc.c optional netgraph_bluetooth_hci netgraph/bluetooth/hci/ng_hci_ulpi.c optional netgraph_bluetooth_hci netgraph/bluetooth/l2cap/ng_l2cap_cmds.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/l2cap/ng_l2cap_evnt.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/l2cap/ng_l2cap_llpi.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/l2cap/ng_l2cap_main.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/l2cap/ng_l2cap_misc.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/l2cap/ng_l2cap_ulpi.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/socket/ng_btsocket.c optional netgraph_bluetooth_socket netgraph/bluetooth/socket/ng_btsocket_hci_raw.c optional netgraph_bluetooth_socket netgraph/bluetooth/socket/ng_btsocket_l2cap.c optional netgraph_bluetooth_socket netgraph/bluetooth/socket/ng_btsocket_l2cap_raw.c optional netgraph_bluetooth_socket netgraph/bluetooth/socket/ng_btsocket_rfcomm.c optional netgraph_bluetooth_socket netgraph/bluetooth/socket/ng_btsocket_sco.c optional netgraph_bluetooth_socket netgraph/netflow/netflow.c optional netgraph_netflow netgraph/netflow/ng_netflow.c optional netgraph_netflow netgraph/ng_UI.c optional netgraph_UI netgraph/ng_async.c optional netgraph_async netgraph/ng_atmllc.c optional netgraph_atmllc netgraph/ng_base.c optional netgraph netgraph/ng_bpf.c optional netgraph_bpf netgraph/ng_bridge.c optional netgraph_bridge netgraph/ng_car.c optional netgraph_car netgraph/ng_cisco.c optional netgraph_cisco netgraph/ng_deflate.c optional netgraph_deflate netgraph/ng_device.c optional netgraph_device netgraph/ng_echo.c 
optional netgraph_echo netgraph/ng_eiface.c optional netgraph_eiface netgraph/ng_ether.c optional netgraph_ether netgraph/ng_fec.c optional netgraph_fec netgraph/ng_frame_relay.c optional netgraph_frame_relay netgraph/ng_gif.c optional netgraph_gif netgraph/ng_gif_demux.c optional netgraph_gif_demux netgraph/ng_hole.c optional netgraph_hole netgraph/ng_iface.c optional netgraph_iface netgraph/ng_ip_input.c optional netgraph_ip_input netgraph/ng_ipfw.c optional netgraph_ipfw netgraph/ng_ksocket.c optional netgraph_ksocket netgraph/ng_l2tp.c optional netgraph_l2tp netgraph/ng_lmi.c optional netgraph_lmi netgraph/ng_mppc.c optional netgraph_mppc_compression | \ netgraph_mppc_encryption netgraph/ng_nat.c optional netgraph_nat netgraph/ng_one2many.c optional netgraph_one2many netgraph/ng_parse.c optional netgraph netgraph/ng_ppp.c optional netgraph_ppp netgraph/ng_pppoe.c optional netgraph_pppoe netgraph/ng_pptpgre.c optional netgraph_pptpgre netgraph/ng_pred1.c optional netgraph_pred1 netgraph/ng_rfc1490.c optional netgraph_rfc1490 netgraph/ng_socket.c optional netgraph_socket netgraph/ng_split.c optional netgraph_split netgraph/ng_sppp.c optional netgraph_sppp netgraph/ng_tag.c optional netgraph_tag netgraph/ng_tcpmss.c optional netgraph_tcpmss netgraph/ng_tee.c optional netgraph_tee netgraph/ng_tty.c optional netgraph_tty netgraph/ng_vjc.c optional netgraph_vjc netinet/accf_data.c optional accept_filter_data netinet/accf_dns.c optional accept_filter_dns netinet/accf_http.c optional accept_filter_http netinet/if_atm.c optional atm netinet/if_ether.c optional ether netinet/igmp.c optional inet netinet/in.c optional inet netinet/ip_carp.c optional carp netinet/in_gif.c optional gif inet netinet/ip_gre.c optional gre inet netinet/ip_id.c optional inet netinet/in_mcast.c optional inet netinet/in_pcb.c optional inet netinet/in_proto.c optional inet \ compile-with "${NORMAL_C} -I$S/contrib/pf" netinet/in_rmx.c optional inet netinet/ip_divert.c optional ipdivert netinet/ip_dummynet.c optional dummynet netinet/ip_ecn.c optional inet | inet6 netinet/ip_encap.c optional inet | inet6 netinet/ip_fastfwd.c optional inet netinet/ip_fw2.c optional ipfirewall \ compile-with "${NORMAL_C} -I$S/contrib/pf" netinet/ip_fw_pfil.c optional ipfirewall netinet/ip_fw_nat.c optional ipfirewall_nat netinet/ip_icmp.c optional inet netinet/ip_input.c optional inet netinet/ip_ipsec.c optional ipsec netinet/ip_mroute.c optional mrouting inet | mrouting inet6 netinet/ip_options.c optional inet netinet/ip_output.c optional inet netinet/raw_ip.c optional inet netinet/sctp_asconf.c optional inet sctp netinet/sctp_auth.c optional inet sctp netinet/sctp_bsd_addr.c optional inet sctp netinet/sctp_cc_functions.c optional inet sctp netinet/sctp_crc32.c optional inet sctp netinet/sctp_indata.c optional inet sctp netinet/sctp_input.c optional inet sctp netinet/sctp_output.c optional inet sctp netinet/sctp_pcb.c optional inet sctp netinet/sctp_peeloff.c optional inet sctp netinet/sctp_sysctl.c optional inet sctp netinet/sctp_timer.c optional inet sctp netinet/sctp_usrreq.c optional inet sctp netinet/sctputil.c optional inet sctp netinet/tcp_debug.c optional tcpdebug netinet/tcp_hostcache.c optional inet netinet/tcp_input.c optional inet netinet/tcp_lro.c optional inet netinet/tcp_output.c optional inet netinet/tcp_offload.c optional inet netinet/tcp_reass.c optional inet netinet/tcp_sack.c optional inet netinet/tcp_subr.c optional inet netinet/tcp_syncache.c optional inet netinet/tcp_timer.c optional inet netinet/tcp_timewait.c 
optional inet netinet/tcp_usrreq.c optional inet netinet/udp_usrreq.c optional inet netinet/libalias/alias.c optional libalias | netgraph_nat netinet/libalias/alias_db.c optional libalias | netgraph_nat netinet/libalias/alias_mod.c optional libalias | netgraph_nat netinet/libalias/alias_proxy.c optional libalias | netgraph_nat netinet/libalias/alias_util.c optional libalias | netgraph_nat netinet6/dest6.c optional inet6 netinet6/frag6.c optional inet6 netinet6/icmp6.c optional inet6 netinet6/in6.c optional inet6 netinet6/in6_cksum.c optional inet6 netinet6/in6_gif.c optional gif inet6 netinet6/in6_ifattach.c optional inet6 netinet6/in6_pcb.c optional inet6 netinet6/in6_proto.c optional inet6 netinet6/in6_rmx.c optional inet6 netinet6/in6_src.c optional inet6 netinet6/ip6_forward.c optional inet6 netinet6/ip6_id.c optional inet6 netinet6/ip6_input.c optional inet6 netinet6/ip6_mroute.c optional mrouting inet6 netinet6/ip6_output.c optional inet6 netinet6/ip6_ipsec.c optional inet6 ipsec netinet6/mld6.c optional inet6 netinet6/nd6.c optional inet6 netinet6/nd6_nbr.c optional inet6 netinet6/nd6_rtr.c optional inet6 netinet6/raw_ip6.c optional inet6 netinet6/route6.c optional inet6 netinet6/scope6.c optional inet6 netinet6/sctp6_usrreq.c optional inet6 sctp netinet6/udp6_usrreq.c optional inet6 netipsec/ipsec.c optional ipsec netipsec/ipsec_input.c optional ipsec netipsec/ipsec_mbuf.c optional ipsec netipsec/ipsec_output.c optional ipsec netipsec/key.c optional ipsec netipsec/key_debug.c optional ipsec netipsec/keysock.c optional ipsec netipsec/xform_ah.c optional ipsec netipsec/xform_esp.c optional ipsec netipsec/xform_ipcomp.c optional ipsec netipsec/xform_ipip.c optional ipsec netipsec/xform_tcp.c optional ipsec tcp_signature netipx/ipx.c optional ipx netipx/ipx_cksum.c optional ipx netipx/ipx_input.c optional ipx netipx/ipx_outputfl.c optional ipx netipx/ipx_pcb.c optional ipx netipx/ipx_proto.c optional ipx netipx/ipx_usrreq.c optional ipx netipx/spx_debug.c optional ipx netipx/spx_usrreq.c optional ipx netnatm/natm.c optional natm netnatm/natm_pcb.c optional natm netnatm/natm_proto.c optional natm netncp/ncp_conn.c optional ncp netncp/ncp_crypt.c optional ncp netncp/ncp_login.c optional ncp netncp/ncp_mod.c optional ncp netncp/ncp_ncp.c optional ncp netncp/ncp_nls.c optional ncp netncp/ncp_rq.c optional ncp netncp/ncp_sock.c optional ncp netncp/ncp_subr.c optional ncp netsmb/smb_conn.c optional netsmb netsmb/smb_crypt.c optional netsmb netsmb/smb_dev.c optional netsmb netsmb/smb_iod.c optional netsmb netsmb/smb_rq.c optional netsmb netsmb/smb_smb.c optional netsmb netsmb/smb_subr.c optional netsmb netsmb/smb_trantcp.c optional netsmb netsmb/smb_usr.c optional netsmb nfs/nfs_common.c optional nfsclient | nfsserver nfs4client/nfs4_dev.c optional nfsclient nfs4client/nfs4_idmap.c optional nfsclient nfs4client/nfs4_socket.c optional nfsclient nfs4client/nfs4_subs.c optional nfsclient nfs4client/nfs4_vfs_subs.c optional nfsclient nfs4client/nfs4_vfsops.c optional nfsclient nfs4client/nfs4_vn_subs.c optional nfsclient nfs4client/nfs4_vnops.c optional nfsclient nfsclient/bootp_subr.c optional bootp nfsclient nfsclient/krpc_subr.c optional bootp nfsclient nfsclient/nfs_bio.c optional nfsclient nfsclient/nfs_diskless.c optional nfsclient nfs_root nfsclient/nfs_node.c optional nfsclient nfsclient/nfs_socket.c optional nfsclient nfsclient/nfs_krpc.c optional nfsclient nfsclient/nfs_subs.c optional nfsclient nfsclient/nfs_nfsiod.c optional nfsclient nfsclient/nfs_vfsops.c optional nfsclient 
nfsclient/nfs_vnops.c optional nfsclient nfsclient/nfs_lock.c optional nfsclient nfsserver/nfs_fha.c optional nfsserver nfsserver/nfs_serv.c optional nfsserver nfsserver/nfs_srvkrpc.c optional nfsserver nfsserver/nfs_srvsock.c optional nfsserver nfsserver/nfs_srvcache.c optional nfsserver nfsserver/nfs_srvsubs.c optional nfsserver nfsserver/nfs_syscalls.c optional nfsserver nlm/nlm_advlock.c optional nfslockd nfsclient nlm/nlm_prot_clnt.c optional nfslockd nlm/nlm_prot_impl.c optional nfslockd nlm/nlm_prot_server.c optional nfslockd nlm/nlm_prot_svc.c optional nfslockd nlm/nlm_prot_xdr.c optional nfslockd nlm/sm_inter_xdr.c optional nfslockd # crypto support opencrypto/cast.c optional crypto | ipsec opencrypto/criov.c optional crypto opencrypto/crypto.c optional crypto opencrypto/cryptodev.c optional cryptodev opencrypto/cryptodev_if.m optional crypto opencrypto/cryptosoft.c optional crypto opencrypto/deflate.c optional crypto opencrypto/rmd160.c optional crypto | ipsec opencrypto/skipjack.c optional crypto opencrypto/xform.c optional crypto pci/alpm.c optional alpm pci pci/amdpm.c optional amdpm pci | nfpm pci pci/amdsmb.c optional amdsmb pci pci/if_rl.c optional rl pci pci/intpm.c optional intpm pci pci/ncr.c optional ncr pci pci/nfsmb.c optional nfsmb pci pci/viapm.c optional viapm pci rpc/auth_none.c optional krpc | nfslockd | nfsclient | nfsserver rpc/auth_unix.c optional krpc | nfslockd | nfsclient rpc/authunix_prot.c optional krpc | nfslockd | nfsclient | nfsserver rpc/clnt_dg.c optional krpc | nfslockd | nfsclient rpc/clnt_rc.c optional krpc | nfslockd | nfsclient rpc/clnt_vc.c optional krpc | nfslockd | nfsclient | nfsserver rpc/getnetconfig.c optional krpc | nfslockd | nfsclient | nfsserver rpc/inet_ntop.c optional krpc | nfslockd | nfsclient | nfsserver rpc/inet_pton.c optional krpc | nfslockd | nfsclient | nfsserver rpc/replay.c optional krpc | nfslockd | nfsserver rpc/rpc_callmsg.c optional krpc | nfslockd | nfsclient | nfsserver rpc/rpc_generic.c optional krpc | nfslockd | nfsclient | nfsserver rpc/rpc_prot.c optional krpc | nfslockd | nfsclient | nfsserver rpc/rpcb_clnt.c optional krpc | nfslockd | nfsclient | nfsserver rpc/rpcb_prot.c optional krpc | nfslockd | nfsclient | nfsserver rpc/rpcclnt.c optional nfsclient rpc/svc.c optional krpc | nfslockd | nfsserver rpc/svc_auth.c optional krpc | nfslockd | nfsserver rpc/svc_auth_unix.c optional krpc | nfslockd | nfsserver rpc/svc_dg.c optional krpc | nfslockd | nfsserver rpc/svc_generic.c optional krpc | nfslockd | nfsserver rpc/svc_vc.c optional krpc | nfslockd | nfsserver rpc/rpcsec_gss/rpcsec_gss.c optional krpc kgssapi | nfslockd kgssapi rpc/rpcsec_gss/rpcsec_gss_conf.c optional krpc kgssapi | nfslockd kgssapi rpc/rpcsec_gss/rpcsec_gss_misc.c optional krpc kgssapi | nfslockd kgssapi rpc/rpcsec_gss/rpcsec_gss_prot.c optional krpc kgssapi | nfslockd kgssapi rpc/rpcsec_gss/svc_rpcsec_gss.c optional krpc kgssapi | nfslockd kgssapi security/audit/audit.c optional audit security/audit/audit_arg.c optional audit security/audit/audit_bsm.c optional audit security/audit/audit_bsm_klib.c optional audit security/audit/audit_bsm_token.c optional audit security/audit/audit_pipe.c optional audit security/audit/audit_syscalls.c standard security/audit/audit_trigger.c optional audit security/audit/audit_worker.c optional audit security/mac/mac_atalk.c optional mac netatalk security/mac/mac_audit.c optional mac audit security/mac/mac_cred.c optional mac security/mac/mac_framework.c optional mac security/mac/mac_inet.c optional mac inet 
security/mac/mac_inet6.c optional mac inet6 security/mac/mac_label.c optional mac security/mac/mac_net.c optional mac security/mac/mac_pipe.c optional mac security/mac/mac_posix_sem.c optional mac security/mac/mac_posix_shm.c optional mac security/mac/mac_priv.c optional mac security/mac/mac_process.c optional mac security/mac/mac_socket.c optional mac security/mac/mac_syscalls.c standard security/mac/mac_system.c optional mac security/mac/mac_sysv_msg.c optional mac security/mac/mac_sysv_sem.c optional mac security/mac/mac_sysv_shm.c optional mac security/mac/mac_vfs.c optional mac security/mac_biba/mac_biba.c optional mac_biba security/mac_bsdextended/mac_bsdextended.c optional mac_bsdextended security/mac_bsdextended/ugidfw_system.c optional mac_bsdextended security/mac_bsdextended/ugidfw_vnode.c optional mac_bsdextended security/mac_ifoff/mac_ifoff.c optional mac_ifoff security/mac_lomac/mac_lomac.c optional mac_lomac security/mac_mls/mac_mls.c optional mac_mls security/mac_none/mac_none.c optional mac_none security/mac_partition/mac_partition.c optional mac_partition security/mac_portacl/mac_portacl.c optional mac_portacl security/mac_seeotheruids/mac_seeotheruids.c optional mac_seeotheruids security/mac_stub/mac_stub.c optional mac_stub security/mac_test/mac_test.c optional mac_test ufs/ffs/ffs_alloc.c optional ffs ufs/ffs/ffs_balloc.c optional ffs ufs/ffs/ffs_inode.c optional ffs ufs/ffs/ffs_snapshot.c optional ffs ufs/ffs/ffs_softdep.c optional ffs ufs/ffs/ffs_subr.c optional ffs ufs/ffs/ffs_tables.c optional ffs ufs/ffs/ffs_vfsops.c optional ffs ufs/ffs/ffs_vnops.c optional ffs ufs/ffs/ffs_rawread.c optional directio ufs/ufs/ufs_acl.c optional ffs ufs/ufs/ufs_bmap.c optional ffs ufs/ufs/ufs_dirhash.c optional ffs ufs/ufs/ufs_extattr.c optional ffs ufs/ufs/ufs_gjournal.c optional ffs ufs/ufs/ufs_inode.c optional ffs ufs/ufs/ufs_lookup.c optional ffs ufs/ufs/ufs_quota.c optional ffs ufs/ufs/ufs_vfsops.c optional ffs ufs/ufs/ufs_vnops.c optional ffs vm/default_pager.c standard vm/device_pager.c standard vm/phys_pager.c standard vm/redzone.c optional DEBUG_REDZONE vm/swap_pager.c standard vm/uma_core.c standard vm/uma_dbg.c standard vm/vm_contig.c standard vm/memguard.c optional DEBUG_MEMGUARD vm/vm_fault.c standard vm/vm_glue.c standard vm/vm_init.c standard vm/vm_kern.c standard vm/vm_map.c standard vm/vm_meter.c standard vm/vm_mmap.c standard vm/vm_object.c standard vm/vm_page.c standard vm/vm_pageout.c standard vm/vm_pager.c standard vm/vm_phys.c standard vm/vm_reserv.c standard vm/vm_unix.c standard vm/vm_zeroidle.c standard vm/vnode_pager.c standard xdr/xdr.c optional krpc | nfslockd | nfsclient | nfsserver xdr/xdr_array.c optional krpc | nfslockd | nfsclient | nfsserver xdr/xdr_mbuf.c optional krpc | nfslockd | nfsclient | nfsserver xdr/xdr_mem.c optional krpc | nfslockd | nfsclient | nfsserver xdr/xdr_reference.c optional krpc | nfslockd | nfsclient | nfsserver xdr/xdr_sizeof.c optional krpc | nfslockd | nfsclient | nfsserver # gnu/fs/xfs/xfs_alloc.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" \ warning "kernel contains GPL contaminated xfs filesystem" gnu/fs/xfs/xfs_alloc_btree.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_bit.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_bmap.c optional xfs \ compile-with "${NORMAL_C} 
-I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_bmap_btree.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_btree.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_buf_item.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_da_btree.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_dir.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_dir2.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_dir2_block.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_dir2_data.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_dir2_leaf.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_dir2_node.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_dir2_sf.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_dir2_trace.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_dir_leaf.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_error.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_extfree_item.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_fsops.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_ialloc.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_ialloc_btree.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_inode.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_inode_item.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_iocore.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_itable.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_dfrag.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_log.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_log_recover.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support 
-I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_mount.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_rename.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_trans.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_trans_ail.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_trans_buf.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_trans_extfree.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_trans_inode.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_trans_item.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_utils.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_vfsops.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_vnodeops.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_rw.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_attr_leaf.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_attr.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_dmops.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_qmops.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_iget.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/xfs_freebsd_iget.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/xfs_mountops.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/xfs_vnops.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/xfs_frw.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/xfs_buf.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/xfs_globals.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/xfs_dmistubs.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/xfs_super.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" 
gnu/fs/xfs/FreeBSD/xfs_stats.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/xfs_vfs.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/xfs_vnode.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/xfs_sysctl.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/xfs_fs_subr.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/xfs_ioctl.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/support/debug.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/support/ktrace.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/support/mrlock.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/support/uuid.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/FreeBSD/support/kmem.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_iomap.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" gnu/fs/xfs/xfs_behavior.c optional xfs \ compile-with "${NORMAL_C} -I$S/gnu/fs/xfs/FreeBSD -I$S/gnu/fs/xfs/FreeBSD/support -I$S/gnu/fs/xfs" xen/gnttab.c optional xen xen/features.c optional xen xen/evtchn/evtchn.c optional xen xen/evtchn/evtchn_dev.c optional xen xen/xenbus/xenbus_client.c optional xen xen/xenbus/xenbus_comms.c optional xen xen/xenbus/xenbus_dev.c optional xen xen/xenbus/xenbus_if.m optional xen xen/xenbus/xenbus_probe.c optional xen #xen/xenbus/xenbus_probe_backend.c optional xen xen/xenbus/xenbus_xs.c optional xen dev/xen/console/console.c optional xen dev/xen/console/xencons_ring.c optional xen dev/xen/blkfront/blkfront.c optional xen dev/xen/netfront/netfront.c optional xen #dev/xen/xenpci/xenpci.c optional xen #xen/xenbus/xenbus_newbus.c optional xenhvm Index: head/sys/contrib/pf/net/pf.c =================================================================== --- head/sys/contrib/pf/net/pf.c (revision 186118) +++ head/sys/contrib/pf/net/pf.c (revision 186119) @@ -1,7625 +1,7625 @@ /* $OpenBSD: pf.c,v 1.527 2007/02/22 15:23:23 pyr Exp $ */ /* add: $OpenBSD: pf.c,v 1.559 2007/09/18 18:45:59 markus Exp $ */ /* * Copyright (c) 2001 Daniel Hartmeier * Copyright (c) 2002,2003 Henning Brauer * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * - Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials provided * with the distribution. 
* * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE * COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. * * Effort sponsored in part by the Defense Advanced Research Projects * Agency (DARPA) and Air Force Research Laboratory, Air Force * Materiel Command, USAF, under agreement number F30602-01-2-0537. * */ #ifdef __FreeBSD__ #include "opt_inet.h" #include "opt_inet6.h" #include __FBSDID("$FreeBSD$"); #endif #ifdef __FreeBSD__ #include "opt_mac.h" #include "opt_bpf.h" #include "opt_pf.h" #ifdef DEV_BPF #define NBPFILTER DEV_BPF #else #define NBPFILTER 0 #endif #ifdef DEV_PFLOG #define NPFLOG DEV_PFLOG #else #define NPFLOG 0 #endif #ifdef DEV_PFSYNC #define NPFSYNC DEV_PFSYNC #else #define NPFSYNC 0 #endif #else #include "bpfilter.h" #include "pflog.h" #include "pfsync.h" #endif #include #include #include #include #include #include #include #include #ifdef __FreeBSD__ #include #include #else #include #endif #include #ifdef __FreeBSD__ #include #include #include #include #else #include #endif #include #include #include #include #ifndef __FreeBSD__ #include #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef __FreeBSD__ #include #endif #ifndef __FreeBSD__ #include #endif #include #include #if NPFSYNC > 0 #include #endif /* NPFSYNC > 0 */ #ifdef INET6 #include #include #include #include #ifdef __FreeBSD__ #include #include #include #endif #endif /* INET6 */ #ifdef __FreeBSD__ #include #include #include #include extern int ip_optcopy(struct ip *, struct ip *); extern int debug_pfugidhack; #endif #define DPFPRINTF(n, x) if (pf_status.debug >= (n)) printf x /* * Global variables */ struct pf_altqqueue pf_altqs[2]; struct pf_palist pf_pabuf; struct pf_altqqueue *pf_altqs_active; struct pf_altqqueue *pf_altqs_inactive; struct pf_status pf_status; u_int32_t ticket_altqs_active; u_int32_t ticket_altqs_inactive; int altqs_inactive_open; u_int32_t ticket_pabuf; struct pf_anchor_stackframe { struct pf_ruleset *rs; struct pf_rule *r; struct pf_anchor_node *parent; struct pf_anchor *child; } pf_anchor_stack[64]; #ifdef __FreeBSD__ uma_zone_t pf_src_tree_pl, pf_rule_pl; uma_zone_t pf_state_pl, pf_altq_pl, pf_pooladdr_pl; #else struct pool pf_src_tree_pl, pf_rule_pl; struct pool pf_state_pl, pf_altq_pl, pf_pooladdr_pl; #endif void pf_print_host(struct pf_addr *, u_int16_t, u_int8_t); void pf_init_threshold(struct pf_threshold *, u_int32_t, u_int32_t); void pf_add_threshold(struct pf_threshold *); int pf_check_threshold(struct pf_threshold *); void pf_change_ap(struct pf_addr *, u_int16_t *, u_int16_t *, u_int16_t *, struct pf_addr *, u_int16_t, u_int8_t, sa_family_t); int pf_modulate_sack(struct mbuf *, int, struct pf_pdesc *, struct tcphdr *, struct pf_state_peer *); #ifdef INET6 void pf_change_a6(struct pf_addr *, u_int16_t *, struct pf_addr *, 
u_int8_t); #endif /* INET6 */ void pf_change_icmp(struct pf_addr *, u_int16_t *, struct pf_addr *, struct pf_addr *, u_int16_t, u_int16_t *, u_int16_t *, u_int16_t *, u_int16_t *, u_int8_t, sa_family_t); #ifdef __FreeBSD__ void pf_send_tcp(struct mbuf *, const struct pf_rule *, sa_family_t, #else void pf_send_tcp(const struct pf_rule *, sa_family_t, #endif const struct pf_addr *, const struct pf_addr *, u_int16_t, u_int16_t, u_int32_t, u_int32_t, u_int8_t, u_int16_t, u_int16_t, u_int8_t, int, u_int16_t, struct ether_header *, struct ifnet *); void pf_send_icmp(struct mbuf *, u_int8_t, u_int8_t, sa_family_t, struct pf_rule *); struct pf_rule *pf_match_translation(struct pf_pdesc *, struct mbuf *, int, int, struct pfi_kif *, struct pf_addr *, u_int16_t, struct pf_addr *, u_int16_t, int); struct pf_rule *pf_get_translation(struct pf_pdesc *, struct mbuf *, int, int, struct pfi_kif *, struct pf_src_node **, struct pf_addr *, u_int16_t, struct pf_addr *, u_int16_t, struct pf_addr *, u_int16_t *); int pf_test_tcp(struct pf_rule **, struct pf_state **, int, struct pfi_kif *, struct mbuf *, int, void *, struct pf_pdesc *, struct pf_rule **, #ifdef __FreeBSD__ struct pf_ruleset **, struct ifqueue *, struct inpcb *); #else struct pf_ruleset **, struct ifqueue *); #endif int pf_test_udp(struct pf_rule **, struct pf_state **, int, struct pfi_kif *, struct mbuf *, int, void *, struct pf_pdesc *, struct pf_rule **, #ifdef __FreeBSD__ struct pf_ruleset **, struct ifqueue *, struct inpcb *); #else struct pf_ruleset **, struct ifqueue *); #endif int pf_test_icmp(struct pf_rule **, struct pf_state **, int, struct pfi_kif *, struct mbuf *, int, void *, struct pf_pdesc *, struct pf_rule **, struct pf_ruleset **, struct ifqueue *); int pf_test_other(struct pf_rule **, struct pf_state **, int, struct pfi_kif *, struct mbuf *, int, void *, struct pf_pdesc *, struct pf_rule **, struct pf_ruleset **, struct ifqueue *); int pf_test_fragment(struct pf_rule **, int, struct pfi_kif *, struct mbuf *, void *, struct pf_pdesc *, struct pf_rule **, struct pf_ruleset **); int pf_test_state_tcp(struct pf_state **, int, struct pfi_kif *, struct mbuf *, int, void *, struct pf_pdesc *, u_short *); int pf_test_state_udp(struct pf_state **, int, struct pfi_kif *, struct mbuf *, int, void *, struct pf_pdesc *); int pf_test_state_icmp(struct pf_state **, int, struct pfi_kif *, struct mbuf *, int, void *, struct pf_pdesc *, u_short *); int pf_test_state_other(struct pf_state **, int, struct pfi_kif *, struct pf_pdesc *); int pf_match_tag(struct mbuf *, struct pf_rule *, struct pf_mtag *, int *); int pf_step_out_of_anchor(int *, struct pf_ruleset **, int, struct pf_rule **, struct pf_rule **, int *); void pf_hash(struct pf_addr *, struct pf_addr *, struct pf_poolhashkey *, sa_family_t); int pf_map_addr(u_int8_t, struct pf_rule *, struct pf_addr *, struct pf_addr *, struct pf_addr *, struct pf_src_node **); int pf_get_sport(sa_family_t, u_int8_t, struct pf_rule *, struct pf_addr *, struct pf_addr *, u_int16_t, struct pf_addr *, u_int16_t*, u_int16_t, u_int16_t, struct pf_src_node **); void pf_route(struct mbuf **, struct pf_rule *, int, struct ifnet *, struct pf_state *, struct pf_pdesc *); void pf_route6(struct mbuf **, struct pf_rule *, int, struct ifnet *, struct pf_state *, struct pf_pdesc *); #ifdef __FreeBSD__ /* XXX: import */ #else int pf_socket_lookup(int, struct pf_pdesc *); #endif u_int8_t pf_get_wscale(struct mbuf *, int, u_int16_t, sa_family_t); u_int16_t pf_get_mss(struct mbuf *, int, u_int16_t, sa_family_t); u_int16_t 
pf_calc_mss(struct pf_addr *, sa_family_t, u_int16_t); void pf_set_rt_ifp(struct pf_state *, struct pf_addr *); int pf_check_proto_cksum(struct mbuf *, int, int, u_int8_t, sa_family_t); int pf_addr_wrap_neq(struct pf_addr_wrap *, struct pf_addr_wrap *); struct pf_state *pf_find_state_recurse(struct pfi_kif *, struct pf_state_cmp *, u_int8_t); int pf_src_connlimit(struct pf_state **); int pf_check_congestion(struct ifqueue *); #ifdef __FreeBSD__ int in4_cksum(struct mbuf *m, u_int8_t nxt, int off, int len); extern int pf_end_threads; struct pf_pool_limit pf_pool_limits[PF_LIMIT_MAX]; #else extern struct pool pfr_ktable_pl; extern struct pool pfr_kentry_pl; struct pf_pool_limit pf_pool_limits[PF_LIMIT_MAX] = { { &pf_state_pl, PFSTATE_HIWAT }, { &pf_src_tree_pl, PFSNODE_HIWAT }, { &pf_frent_pl, PFFRAG_FRENT_HIWAT }, { &pfr_ktable_pl, PFR_KTABLE_HIWAT }, { &pfr_kentry_pl, PFR_KENTRY_HIWAT } }; #endif #define STATE_LOOKUP() \ do { \ if (direction == PF_IN) \ *state = pf_find_state_recurse( \ kif, &key, PF_EXT_GWY); \ else \ *state = pf_find_state_recurse( \ kif, &key, PF_LAN_EXT); \ if (*state == NULL || (*state)->timeout == PFTM_PURGE) \ return (PF_DROP); \ if (direction == PF_OUT && \ (((*state)->rule.ptr->rt == PF_ROUTETO && \ (*state)->rule.ptr->direction == PF_OUT) || \ ((*state)->rule.ptr->rt == PF_REPLYTO && \ (*state)->rule.ptr->direction == PF_IN)) && \ (*state)->rt_kif != NULL && \ (*state)->rt_kif != kif) \ return (PF_PASS); \ } while (0) #define STATE_TRANSLATE(s) \ (s)->lan.addr.addr32[0] != (s)->gwy.addr.addr32[0] || \ ((s)->af == AF_INET6 && \ ((s)->lan.addr.addr32[1] != (s)->gwy.addr.addr32[1] || \ (s)->lan.addr.addr32[2] != (s)->gwy.addr.addr32[2] || \ (s)->lan.addr.addr32[3] != (s)->gwy.addr.addr32[3])) || \ (s)->lan.port != (s)->gwy.port #define BOUND_IFACE(r, k) \ ((r)->rule_flag & PFRULE_IFBOUND) ? 
(k) : pfi_all #define STATE_INC_COUNTERS(s) \ do { \ s->rule.ptr->states++; \ if (s->anchor.ptr != NULL) \ s->anchor.ptr->states++; \ if (s->nat_rule.ptr != NULL) \ s->nat_rule.ptr->states++; \ } while (0) #define STATE_DEC_COUNTERS(s) \ do { \ if (s->nat_rule.ptr != NULL) \ s->nat_rule.ptr->states--; \ if (s->anchor.ptr != NULL) \ s->anchor.ptr->states--; \ s->rule.ptr->states--; \ } while (0) struct pf_src_tree tree_src_tracking; struct pf_state_tree_id tree_id; struct pf_state_queue state_list; #ifdef __FreeBSD__ static int pf_src_compare(struct pf_src_node *, struct pf_src_node *); static int pf_state_compare_lan_ext(struct pf_state *, struct pf_state *); static int pf_state_compare_ext_gwy(struct pf_state *, struct pf_state *); static int pf_state_compare_id(struct pf_state *, struct pf_state *); #endif RB_GENERATE(pf_src_tree, pf_src_node, entry, pf_src_compare); RB_GENERATE(pf_state_tree_lan_ext, pf_state, u.s.entry_lan_ext, pf_state_compare_lan_ext); RB_GENERATE(pf_state_tree_ext_gwy, pf_state, u.s.entry_ext_gwy, pf_state_compare_ext_gwy); RB_GENERATE(pf_state_tree_id, pf_state, u.s.entry_id, pf_state_compare_id); #ifdef __FreeBSD__ static int #else static __inline int #endif pf_src_compare(struct pf_src_node *a, struct pf_src_node *b) { int diff; if (a->rule.ptr > b->rule.ptr) return (1); if (a->rule.ptr < b->rule.ptr) return (-1); if ((diff = a->af - b->af) != 0) return (diff); switch (a->af) { #ifdef INET case AF_INET: if (a->addr.addr32[0] > b->addr.addr32[0]) return (1); if (a->addr.addr32[0] < b->addr.addr32[0]) return (-1); break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (a->addr.addr32[3] > b->addr.addr32[3]) return (1); if (a->addr.addr32[3] < b->addr.addr32[3]) return (-1); if (a->addr.addr32[2] > b->addr.addr32[2]) return (1); if (a->addr.addr32[2] < b->addr.addr32[2]) return (-1); if (a->addr.addr32[1] > b->addr.addr32[1]) return (1); if (a->addr.addr32[1] < b->addr.addr32[1]) return (-1); if (a->addr.addr32[0] > b->addr.addr32[0]) return (1); if (a->addr.addr32[0] < b->addr.addr32[0]) return (-1); break; #endif /* INET6 */ } return (0); } #ifdef __FreeBSD__ static int #else static __inline int #endif pf_state_compare_lan_ext(struct pf_state *a, struct pf_state *b) { int diff; if ((diff = a->proto - b->proto) != 0) return (diff); if ((diff = a->af - b->af) != 0) return (diff); switch (a->af) { #ifdef INET case AF_INET: if (a->lan.addr.addr32[0] > b->lan.addr.addr32[0]) return (1); if (a->lan.addr.addr32[0] < b->lan.addr.addr32[0]) return (-1); if (a->ext.addr.addr32[0] > b->ext.addr.addr32[0]) return (1); if (a->ext.addr.addr32[0] < b->ext.addr.addr32[0]) return (-1); break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (a->lan.addr.addr32[3] > b->lan.addr.addr32[3]) return (1); if (a->lan.addr.addr32[3] < b->lan.addr.addr32[3]) return (-1); if (a->ext.addr.addr32[3] > b->ext.addr.addr32[3]) return (1); if (a->ext.addr.addr32[3] < b->ext.addr.addr32[3]) return (-1); if (a->lan.addr.addr32[2] > b->lan.addr.addr32[2]) return (1); if (a->lan.addr.addr32[2] < b->lan.addr.addr32[2]) return (-1); if (a->ext.addr.addr32[2] > b->ext.addr.addr32[2]) return (1); if (a->ext.addr.addr32[2] < b->ext.addr.addr32[2]) return (-1); if (a->lan.addr.addr32[1] > b->lan.addr.addr32[1]) return (1); if (a->lan.addr.addr32[1] < b->lan.addr.addr32[1]) return (-1); if (a->ext.addr.addr32[1] > b->ext.addr.addr32[1]) return (1); if (a->ext.addr.addr32[1] < b->ext.addr.addr32[1]) return (-1); if (a->lan.addr.addr32[0] > b->lan.addr.addr32[0]) return (1); if (a->lan.addr.addr32[0] < 
b->lan.addr.addr32[0]) return (-1); if (a->ext.addr.addr32[0] > b->ext.addr.addr32[0]) return (1); if (a->ext.addr.addr32[0] < b->ext.addr.addr32[0]) return (-1); break; #endif /* INET6 */ } if ((diff = a->lan.port - b->lan.port) != 0) return (diff); if ((diff = a->ext.port - b->ext.port) != 0) return (diff); return (0); } #ifdef __FreeBSD__ static int #else static __inline int #endif pf_state_compare_ext_gwy(struct pf_state *a, struct pf_state *b) { int diff; if ((diff = a->proto - b->proto) != 0) return (diff); if ((diff = a->af - b->af) != 0) return (diff); switch (a->af) { #ifdef INET case AF_INET: if (a->ext.addr.addr32[0] > b->ext.addr.addr32[0]) return (1); if (a->ext.addr.addr32[0] < b->ext.addr.addr32[0]) return (-1); if (a->gwy.addr.addr32[0] > b->gwy.addr.addr32[0]) return (1); if (a->gwy.addr.addr32[0] < b->gwy.addr.addr32[0]) return (-1); break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (a->ext.addr.addr32[3] > b->ext.addr.addr32[3]) return (1); if (a->ext.addr.addr32[3] < b->ext.addr.addr32[3]) return (-1); if (a->gwy.addr.addr32[3] > b->gwy.addr.addr32[3]) return (1); if (a->gwy.addr.addr32[3] < b->gwy.addr.addr32[3]) return (-1); if (a->ext.addr.addr32[2] > b->ext.addr.addr32[2]) return (1); if (a->ext.addr.addr32[2] < b->ext.addr.addr32[2]) return (-1); if (a->gwy.addr.addr32[2] > b->gwy.addr.addr32[2]) return (1); if (a->gwy.addr.addr32[2] < b->gwy.addr.addr32[2]) return (-1); if (a->ext.addr.addr32[1] > b->ext.addr.addr32[1]) return (1); if (a->ext.addr.addr32[1] < b->ext.addr.addr32[1]) return (-1); if (a->gwy.addr.addr32[1] > b->gwy.addr.addr32[1]) return (1); if (a->gwy.addr.addr32[1] < b->gwy.addr.addr32[1]) return (-1); if (a->ext.addr.addr32[0] > b->ext.addr.addr32[0]) return (1); if (a->ext.addr.addr32[0] < b->ext.addr.addr32[0]) return (-1); if (a->gwy.addr.addr32[0] > b->gwy.addr.addr32[0]) return (1); if (a->gwy.addr.addr32[0] < b->gwy.addr.addr32[0]) return (-1); break; #endif /* INET6 */ } if ((diff = a->ext.port - b->ext.port) != 0) return (diff); if ((diff = a->gwy.port - b->gwy.port) != 0) return (diff); return (0); } #ifdef __FreeBSD__ static int #else static __inline int #endif pf_state_compare_id(struct pf_state *a, struct pf_state *b) { if (a->id > b->id) return (1); if (a->id < b->id) return (-1); if (a->creatorid > b->creatorid) return (1); if (a->creatorid < b->creatorid) return (-1); return (0); } #ifdef INET6 void pf_addrcpy(struct pf_addr *dst, struct pf_addr *src, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: dst->addr32[0] = src->addr32[0]; break; #endif /* INET */ case AF_INET6: dst->addr32[0] = src->addr32[0]; dst->addr32[1] = src->addr32[1]; dst->addr32[2] = src->addr32[2]; dst->addr32[3] = src->addr32[3]; break; } } #endif /* INET6 */ struct pf_state * pf_find_state_byid(struct pf_state_cmp *key) { pf_status.fcounters[FCNT_STATE_SEARCH]++; return (RB_FIND(pf_state_tree_id, &tree_id, (struct pf_state *)key)); } struct pf_state * pf_find_state_recurse(struct pfi_kif *kif, struct pf_state_cmp *key, u_int8_t tree) { struct pf_state *s; pf_status.fcounters[FCNT_STATE_SEARCH]++; switch (tree) { case PF_LAN_EXT: if ((s = RB_FIND(pf_state_tree_lan_ext, &kif->pfik_lan_ext, (struct pf_state *)key)) != NULL) return (s); if ((s = RB_FIND(pf_state_tree_lan_ext, &pfi_all->pfik_lan_ext, (struct pf_state *)key)) != NULL) return (s); return (NULL); case PF_EXT_GWY: if ((s = RB_FIND(pf_state_tree_ext_gwy, &kif->pfik_ext_gwy, (struct pf_state *)key)) != NULL) return (s); if ((s = RB_FIND(pf_state_tree_ext_gwy, &pfi_all->pfik_ext_gwy, 
(struct pf_state *)key)) != NULL) return (s); return (NULL); default: panic("pf_find_state_recurse"); } } struct pf_state * pf_find_state_all(struct pf_state_cmp *key, u_int8_t tree, int *more) { struct pf_state *s, *ss = NULL; struct pfi_kif *kif; pf_status.fcounters[FCNT_STATE_SEARCH]++; switch (tree) { case PF_LAN_EXT: TAILQ_FOREACH(kif, &pfi_statehead, pfik_w_states) { s = RB_FIND(pf_state_tree_lan_ext, &kif->pfik_lan_ext, (struct pf_state *)key); if (s == NULL) continue; if (more == NULL) return (s); ss = s; (*more)++; } return (ss); case PF_EXT_GWY: TAILQ_FOREACH(kif, &pfi_statehead, pfik_w_states) { s = RB_FIND(pf_state_tree_ext_gwy, &kif->pfik_ext_gwy, (struct pf_state *)key); if (s == NULL) continue; if (more == NULL) return (s); ss = s; (*more)++; } return (ss); default: panic("pf_find_state_all"); } } void pf_init_threshold(struct pf_threshold *threshold, u_int32_t limit, u_int32_t seconds) { threshold->limit = limit * PF_THRESHOLD_MULT; threshold->seconds = seconds; threshold->count = 0; threshold->last = time_second; } void pf_add_threshold(struct pf_threshold *threshold) { u_int32_t t = time_second, diff = t - threshold->last; if (diff >= threshold->seconds) threshold->count = 0; else threshold->count -= threshold->count * diff / threshold->seconds; threshold->count += PF_THRESHOLD_MULT; threshold->last = t; } int pf_check_threshold(struct pf_threshold *threshold) { return (threshold->count > threshold->limit); } int pf_src_connlimit(struct pf_state **state) { struct pf_state *s; int bad = 0; (*state)->src_node->conn++; (*state)->src.tcp_est = 1; pf_add_threshold(&(*state)->src_node->conn_rate); if ((*state)->rule.ptr->max_src_conn && (*state)->rule.ptr->max_src_conn < (*state)->src_node->conn) { pf_status.lcounters[LCNT_SRCCONN]++; bad++; } if ((*state)->rule.ptr->max_src_conn_rate.limit && pf_check_threshold(&(*state)->src_node->conn_rate)) { pf_status.lcounters[LCNT_SRCCONNRATE]++; bad++; } if (!bad) return (0); if ((*state)->rule.ptr->overload_tbl) { struct pfr_addr p; u_int32_t killed = 0; pf_status.lcounters[LCNT_OVERLOAD_TABLE]++; if (pf_status.debug >= PF_DEBUG_MISC) { printf("pf_src_connlimit: blocking address "); pf_print_host(&(*state)->src_node->addr, 0, (*state)->af); } bzero(&p, sizeof(p)); p.pfra_af = (*state)->af; switch ((*state)->af) { #ifdef INET case AF_INET: p.pfra_net = 32; p.pfra_ip4addr = (*state)->src_node->addr.v4; break; #endif /* INET */ #ifdef INET6 case AF_INET6: p.pfra_net = 128; p.pfra_ip6addr = (*state)->src_node->addr.v6; break; #endif /* INET6 */ } pfr_insert_kentry((*state)->rule.ptr->overload_tbl, &p, time_second); /* kill existing states if that's required. */ if ((*state)->rule.ptr->flush) { pf_status.lcounters[LCNT_OVERLOAD_FLUSH]++; RB_FOREACH(s, pf_state_tree_id, &tree_id) { /* * Kill states from this source. 
(Only those * from the same rule if PF_FLUSH_GLOBAL is not * set) */ if (s->af == (*state)->af && (((*state)->direction == PF_OUT && PF_AEQ(&(*state)->src_node->addr, &s->lan.addr, s->af)) || ((*state)->direction == PF_IN && PF_AEQ(&(*state)->src_node->addr, &s->ext.addr, s->af))) && ((*state)->rule.ptr->flush & PF_FLUSH_GLOBAL || (*state)->rule.ptr == s->rule.ptr)) { s->timeout = PFTM_PURGE; s->src.state = s->dst.state = TCPS_CLOSED; killed++; } } if (pf_status.debug >= PF_DEBUG_MISC) printf(", %u states killed", killed); } if (pf_status.debug >= PF_DEBUG_MISC) printf("\n"); } /* kill this state */ (*state)->timeout = PFTM_PURGE; (*state)->src.state = (*state)->dst.state = TCPS_CLOSED; return (1); } int pf_insert_src_node(struct pf_src_node **sn, struct pf_rule *rule, struct pf_addr *src, sa_family_t af) { struct pf_src_node k; if (*sn == NULL) { k.af = af; PF_ACPY(&k.addr, src, af); if (rule->rule_flag & PFRULE_RULESRCTRACK || rule->rpool.opts & PF_POOL_STICKYADDR) k.rule.ptr = rule; else k.rule.ptr = NULL; pf_status.scounters[SCNT_SRC_NODE_SEARCH]++; *sn = RB_FIND(pf_src_tree, &tree_src_tracking, &k); } if (*sn == NULL) { if (!rule->max_src_nodes || rule->src_nodes < rule->max_src_nodes) (*sn) = pool_get(&pf_src_tree_pl, PR_NOWAIT); else pf_status.lcounters[LCNT_SRCNODES]++; if ((*sn) == NULL) return (-1); bzero(*sn, sizeof(struct pf_src_node)); pf_init_threshold(&(*sn)->conn_rate, rule->max_src_conn_rate.limit, rule->max_src_conn_rate.seconds); (*sn)->af = af; if (rule->rule_flag & PFRULE_RULESRCTRACK || rule->rpool.opts & PF_POOL_STICKYADDR) (*sn)->rule.ptr = rule; else (*sn)->rule.ptr = NULL; PF_ACPY(&(*sn)->addr, src, af); if (RB_INSERT(pf_src_tree, &tree_src_tracking, *sn) != NULL) { if (pf_status.debug >= PF_DEBUG_MISC) { printf("pf: src_tree insert failed: "); pf_print_host(&(*sn)->addr, 0, af); printf("\n"); } pool_put(&pf_src_tree_pl, *sn); return (-1); } (*sn)->creation = time_second; (*sn)->ruletype = rule->action; if ((*sn)->rule.ptr != NULL) (*sn)->rule.ptr->src_nodes++; pf_status.scounters[SCNT_SRC_NODE_INSERT]++; pf_status.src_nodes++; } else { if (rule->max_src_states && (*sn)->states >= rule->max_src_states) { pf_status.lcounters[LCNT_SRCSTATES]++; return (-1); } } return (0); } int pf_insert_state(struct pfi_kif *kif, struct pf_state *state) { /* Thou MUST NOT insert multiple duplicate keys */ state->u.s.kif = kif; if (RB_INSERT(pf_state_tree_lan_ext, &kif->pfik_lan_ext, state)) { if (pf_status.debug >= PF_DEBUG_MISC) { printf("pf: state insert failed: tree_lan_ext"); printf(" lan: "); pf_print_host(&state->lan.addr, state->lan.port, state->af); printf(" gwy: "); pf_print_host(&state->gwy.addr, state->gwy.port, state->af); printf(" ext: "); pf_print_host(&state->ext.addr, state->ext.port, state->af); if (state->sync_flags & PFSTATE_FROMSYNC) printf(" (from sync)"); printf("\n"); } return (-1); } if (RB_INSERT(pf_state_tree_ext_gwy, &kif->pfik_ext_gwy, state)) { if (pf_status.debug >= PF_DEBUG_MISC) { printf("pf: state insert failed: tree_ext_gwy"); printf(" lan: "); pf_print_host(&state->lan.addr, state->lan.port, state->af); printf(" gwy: "); pf_print_host(&state->gwy.addr, state->gwy.port, state->af); printf(" ext: "); pf_print_host(&state->ext.addr, state->ext.port, state->af); if (state->sync_flags & PFSTATE_FROMSYNC) printf(" (from sync)"); printf("\n"); } RB_REMOVE(pf_state_tree_lan_ext, &kif->pfik_lan_ext, state); return (-1); } if (state->id == 0 && state->creatorid == 0) { state->id = htobe64(pf_status.stateid++); state->creatorid = pf_status.hostid; } if 
(RB_INSERT(pf_state_tree_id, &tree_id, state) != NULL) { if (pf_status.debug >= PF_DEBUG_MISC) { #ifdef __FreeBSD__ printf("pf: state insert failed: " "id: %016llx creatorid: %08x", (long long)be64toh(state->id), ntohl(state->creatorid)); #else printf("pf: state insert failed: " "id: %016llx creatorid: %08x", betoh64(state->id), ntohl(state->creatorid)); #endif if (state->sync_flags & PFSTATE_FROMSYNC) printf(" (from sync)"); printf("\n"); } RB_REMOVE(pf_state_tree_lan_ext, &kif->pfik_lan_ext, state); RB_REMOVE(pf_state_tree_ext_gwy, &kif->pfik_ext_gwy, state); return (-1); } TAILQ_INSERT_TAIL(&state_list, state, u.s.entry_list); pf_status.fcounters[FCNT_STATE_INSERT]++; pf_status.states++; pfi_kif_ref(kif, PFI_KIF_REF_STATE); #if NPFSYNC pfsync_insert_state(state); #endif return (0); } void pf_purge_thread(void *v) { int nloops = 0, s; for (;;) { tsleep(pf_purge_thread, PWAIT, "pftm", 1 * hz); #ifdef __FreeBSD__ sx_slock(&pf_consistency_lock); PF_LOCK(); if (pf_end_threads) { pf_purge_expired_states(pf_status.states); pf_purge_expired_fragments(); pf_purge_expired_src_nodes(0); pf_end_threads++; sx_sunlock(&pf_consistency_lock); PF_UNLOCK(); wakeup(pf_purge_thread); kproc_exit(0); } #endif s = splsoftnet(); /* process a fraction of the state table every second */ pf_purge_expired_states(1 + (pf_status.states / pf_default_rule.timeout[PFTM_INTERVAL])); /* purge other expired types every PFTM_INTERVAL seconds */ if (++nloops >= pf_default_rule.timeout[PFTM_INTERVAL]) { pf_purge_expired_fragments(); pf_purge_expired_src_nodes(0); nloops = 0; } splx(s); #ifdef __FreeBSD__ PF_UNLOCK(); sx_sunlock(&pf_consistency_lock); #endif } } u_int32_t pf_state_expires(const struct pf_state *state) { u_int32_t timeout; u_int32_t start; u_int32_t end; u_int32_t states; /* handle all PFTM_* > PFTM_MAX here */ if (state->timeout == PFTM_PURGE) return (time_second); if (state->timeout == PFTM_UNTIL_PACKET) return (0); #ifdef __FreeBSD__ KASSERT(state->timeout != PFTM_UNLINKED, ("pf_state_expires: timeout == PFTM_UNLINKED")); KASSERT((state->timeout < PFTM_MAX), ("pf_state_expires: timeout > PFTM_MAX")); #else KASSERT(state->timeout != PFTM_UNLINKED); KASSERT(state->timeout < PFTM_MAX); #endif timeout = state->rule.ptr->timeout[state->timeout]; if (!timeout) timeout = pf_default_rule.timeout[state->timeout]; start = state->rule.ptr->timeout[PFTM_ADAPTIVE_START]; if (start) { end = state->rule.ptr->timeout[PFTM_ADAPTIVE_END]; states = state->rule.ptr->states; } else { start = pf_default_rule.timeout[PFTM_ADAPTIVE_START]; end = pf_default_rule.timeout[PFTM_ADAPTIVE_END]; states = pf_status.states; } if (end && states > start && start < end) { if (states < end) return (state->expire + timeout * (end - states) / (end - start)); else return (time_second); } return (state->expire + timeout); } void pf_purge_expired_src_nodes(int waslocked) { struct pf_src_node *cur, *next; int locked = waslocked; for (cur = RB_MIN(pf_src_tree, &tree_src_tracking); cur; cur = next) { next = RB_NEXT(pf_src_tree, &tree_src_tracking, cur); if (cur->states <= 0 && cur->expire <= time_second) { if (! 
locked) { #ifdef __FreeBSD__ if (!sx_try_upgrade(&pf_consistency_lock)) { PF_UNLOCK(); sx_sunlock(&pf_consistency_lock); sx_xlock(&pf_consistency_lock); PF_LOCK(); } #else rw_enter_write(&pf_consistency_lock); #endif next = RB_NEXT(pf_src_tree, &tree_src_tracking, cur); locked = 1; } if (cur->rule.ptr != NULL) { cur->rule.ptr->src_nodes--; if (cur->rule.ptr->states <= 0 && cur->rule.ptr->max_src_nodes <= 0) pf_rm_rule(NULL, cur->rule.ptr); } RB_REMOVE(pf_src_tree, &tree_src_tracking, cur); pf_status.scounters[SCNT_SRC_NODE_REMOVALS]++; pf_status.src_nodes--; pool_put(&pf_src_tree_pl, cur); } } if (locked && !waslocked) #ifdef __FreeBSD__ sx_downgrade(&pf_consistency_lock); #else rw_exit_write(&pf_consistency_lock); #endif } void pf_src_tree_remove_state(struct pf_state *s) { u_int32_t timeout; if (s->src_node != NULL) { if (s->proto == IPPROTO_TCP) { if (s->src.tcp_est) --s->src_node->conn; } if (--s->src_node->states <= 0) { timeout = s->rule.ptr->timeout[PFTM_SRC_NODE]; if (!timeout) timeout = pf_default_rule.timeout[PFTM_SRC_NODE]; s->src_node->expire = time_second + timeout; } } if (s->nat_src_node != s->src_node && s->nat_src_node != NULL) { if (--s->nat_src_node->states <= 0) { timeout = s->rule.ptr->timeout[PFTM_SRC_NODE]; if (!timeout) timeout = pf_default_rule.timeout[PFTM_SRC_NODE]; s->nat_src_node->expire = time_second + timeout; } } s->src_node = s->nat_src_node = NULL; } /* callers should be at splsoftnet */ void pf_unlink_state(struct pf_state *cur) { #ifdef __FreeBSD__ if (cur->local_flags & PFSTATE_EXPIRING) return; cur->local_flags |= PFSTATE_EXPIRING; #endif if (cur->src.state == PF_TCPS_PROXY_DST) { #ifdef __FreeBSD__ pf_send_tcp(NULL, cur->rule.ptr, cur->af, #else pf_send_tcp(cur->rule.ptr, cur->af, #endif &cur->ext.addr, &cur->lan.addr, cur->ext.port, cur->lan.port, cur->src.seqhi, cur->src.seqlo + 1, TH_RST|TH_ACK, 0, 0, 0, 1, cur->tag, NULL, NULL); } RB_REMOVE(pf_state_tree_ext_gwy, &cur->u.s.kif->pfik_ext_gwy, cur); RB_REMOVE(pf_state_tree_lan_ext, &cur->u.s.kif->pfik_lan_ext, cur); RB_REMOVE(pf_state_tree_id, &tree_id, cur); #if NPFSYNC if (cur->creatorid == pf_status.hostid) pfsync_delete_state(cur); #endif cur->timeout = PFTM_UNLINKED; pf_src_tree_remove_state(cur); } /* callers should be at splsoftnet and hold the * write_lock on pf_consistency_lock */ void pf_free_state(struct pf_state *cur) { #if NPFSYNC if (pfsyncif != NULL && (pfsyncif->sc_bulk_send_next == cur || pfsyncif->sc_bulk_terminator == cur)) return; #endif #ifdef __FreeBSD__ KASSERT(cur->timeout == PFTM_UNLINKED, ("pf_free_state: cur->timeout != PFTM_UNLINKED")); #else KASSERT(cur->timeout == PFTM_UNLINKED); #endif if (--cur->rule.ptr->states <= 0 && cur->rule.ptr->src_nodes <= 0) pf_rm_rule(NULL, cur->rule.ptr); if (cur->nat_rule.ptr != NULL) if (--cur->nat_rule.ptr->states <= 0 && cur->nat_rule.ptr->src_nodes <= 0) pf_rm_rule(NULL, cur->nat_rule.ptr); if (cur->anchor.ptr != NULL) if (--cur->anchor.ptr->states <= 0) pf_rm_rule(NULL, cur->anchor.ptr); pf_normalize_tcp_cleanup(cur); pfi_kif_unref(cur->u.s.kif, PFI_KIF_REF_STATE); TAILQ_REMOVE(&state_list, cur, u.s.entry_list); if (cur->tag) pf_tag_unref(cur->tag); pool_put(&pf_state_pl, cur); pf_status.fcounters[FCNT_STATE_REMOVALS]++; pf_status.states--; } void pf_purge_expired_states(u_int32_t maxcheck) { static struct pf_state *cur = NULL; struct pf_state *next; int locked = 0; while (maxcheck--) { /* wrap to start of list when we hit the end */ if (cur == NULL) { cur = TAILQ_FIRST(&state_list); if (cur == NULL) break; /* list empty */ } /* get 
next state, as cur may get deleted */ next = TAILQ_NEXT(cur, u.s.entry_list); if (cur->timeout == PFTM_UNLINKED) { /* free unlinked state */ if (! locked) { #ifdef __FreeBSD__ if (!sx_try_upgrade(&pf_consistency_lock)) { PF_UNLOCK(); sx_sunlock(&pf_consistency_lock); sx_xlock(&pf_consistency_lock); PF_LOCK(); } #else rw_enter_write(&pf_consistency_lock); #endif locked = 1; } pf_free_state(cur); } else if (pf_state_expires(cur) <= time_second) { /* unlink and free expired state */ pf_unlink_state(cur); if (! locked) { #ifdef __FreeBSD__ if (!sx_try_upgrade(&pf_consistency_lock)) { PF_UNLOCK(); sx_sunlock(&pf_consistency_lock); sx_xlock(&pf_consistency_lock); PF_LOCK(); } #else rw_enter_write(&pf_consistency_lock); #endif locked = 1; } pf_free_state(cur); } cur = next; } if (locked) #ifdef __FreeBSD__ sx_downgrade(&pf_consistency_lock); #else rw_exit_write(&pf_consistency_lock); #endif } int pf_tbladdr_setup(struct pf_ruleset *rs, struct pf_addr_wrap *aw) { if (aw->type != PF_ADDR_TABLE) return (0); if ((aw->p.tbl = pfr_attach_table(rs, aw->v.tblname)) == NULL) return (1); return (0); } void pf_tbladdr_remove(struct pf_addr_wrap *aw) { if (aw->type != PF_ADDR_TABLE || aw->p.tbl == NULL) return; pfr_detach_table(aw->p.tbl); aw->p.tbl = NULL; } void pf_tbladdr_copyout(struct pf_addr_wrap *aw) { struct pfr_ktable *kt = aw->p.tbl; if (aw->type != PF_ADDR_TABLE || kt == NULL) return; if (!(kt->pfrkt_flags & PFR_TFLAG_ACTIVE) && kt->pfrkt_root != NULL) kt = kt->pfrkt_root; aw->p.tbl = NULL; aw->p.tblcnt = (kt->pfrkt_flags & PFR_TFLAG_ACTIVE) ? kt->pfrkt_cnt : -1; } void pf_print_host(struct pf_addr *addr, u_int16_t p, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: { u_int32_t a = ntohl(addr->addr32[0]); printf("%u.%u.%u.%u", (a>>24)&255, (a>>16)&255, (a>>8)&255, a&255); if (p) { p = ntohs(p); printf(":%u", p); } break; } #endif /* INET */ #ifdef INET6 case AF_INET6: { u_int16_t b; u_int8_t i, curstart = 255, curend = 0, maxstart = 0, maxend = 0; for (i = 0; i < 8; i++) { if (!addr->addr16[i]) { if (curstart == 255) curstart = i; else curend = i; } else { if (curstart) { if ((curend - curstart) > (maxend - maxstart)) { maxstart = curstart; maxend = curend; curstart = 255; } } } } for (i = 0; i < 8; i++) { if (i >= maxstart && i <= maxend) { if (maxend != 7) { if (i == maxstart) printf(":"); } else { if (i == maxend) printf(":"); } } else { b = ntohs(addr->addr16[i]); printf("%x", b); if (i < 7) printf(":"); } } if (p) { p = ntohs(p); printf("[%u]", p); } break; } #endif /* INET6 */ } } void pf_print_state(struct pf_state *s) { switch (s->proto) { case IPPROTO_TCP: printf("TCP "); break; case IPPROTO_UDP: printf("UDP "); break; case IPPROTO_ICMP: printf("ICMP "); break; case IPPROTO_ICMPV6: printf("ICMPV6 "); break; default: printf("%u ", s->proto); break; } pf_print_host(&s->lan.addr, s->lan.port, s->af); printf(" "); pf_print_host(&s->gwy.addr, s->gwy.port, s->af); printf(" "); pf_print_host(&s->ext.addr, s->ext.port, s->af); printf(" [lo=%u high=%u win=%u modulator=%u", s->src.seqlo, s->src.seqhi, s->src.max_win, s->src.seqdiff); if (s->src.wscale && s->dst.wscale) printf(" wscale=%u", s->src.wscale & PF_WSCALE_MASK); printf("]"); printf(" [lo=%u high=%u win=%u modulator=%u", s->dst.seqlo, s->dst.seqhi, s->dst.max_win, s->dst.seqdiff); if (s->src.wscale && s->dst.wscale) printf(" wscale=%u", s->dst.wscale & PF_WSCALE_MASK); printf("]"); printf(" %u:%u", s->src.state, s->dst.state); } void pf_print_flags(u_int8_t f) { if (f) printf(" "); if (f & TH_FIN) printf("F"); if (f & TH_SYN) 
printf("S"); if (f & TH_RST) printf("R"); if (f & TH_PUSH) printf("P"); if (f & TH_ACK) printf("A"); if (f & TH_URG) printf("U"); if (f & TH_ECE) printf("E"); if (f & TH_CWR) printf("W"); } #define PF_SET_SKIP_STEPS(i) \ do { \ while (head[i] != cur) { \ head[i]->skip[i].ptr = cur; \ head[i] = TAILQ_NEXT(head[i], entries); \ } \ } while (0) void pf_calc_skip_steps(struct pf_rulequeue *rules) { struct pf_rule *cur, *prev, *head[PF_SKIP_COUNT]; int i; cur = TAILQ_FIRST(rules); prev = cur; for (i = 0; i < PF_SKIP_COUNT; ++i) head[i] = cur; while (cur != NULL) { if (cur->kif != prev->kif || cur->ifnot != prev->ifnot) PF_SET_SKIP_STEPS(PF_SKIP_IFP); if (cur->direction != prev->direction) PF_SET_SKIP_STEPS(PF_SKIP_DIR); if (cur->af != prev->af) PF_SET_SKIP_STEPS(PF_SKIP_AF); if (cur->proto != prev->proto) PF_SET_SKIP_STEPS(PF_SKIP_PROTO); if (cur->src.neg != prev->src.neg || pf_addr_wrap_neq(&cur->src.addr, &prev->src.addr)) PF_SET_SKIP_STEPS(PF_SKIP_SRC_ADDR); if (cur->src.port[0] != prev->src.port[0] || cur->src.port[1] != prev->src.port[1] || cur->src.port_op != prev->src.port_op) PF_SET_SKIP_STEPS(PF_SKIP_SRC_PORT); if (cur->dst.neg != prev->dst.neg || pf_addr_wrap_neq(&cur->dst.addr, &prev->dst.addr)) PF_SET_SKIP_STEPS(PF_SKIP_DST_ADDR); if (cur->dst.port[0] != prev->dst.port[0] || cur->dst.port[1] != prev->dst.port[1] || cur->dst.port_op != prev->dst.port_op) PF_SET_SKIP_STEPS(PF_SKIP_DST_PORT); prev = cur; cur = TAILQ_NEXT(cur, entries); } for (i = 0; i < PF_SKIP_COUNT; ++i) PF_SET_SKIP_STEPS(i); } int pf_addr_wrap_neq(struct pf_addr_wrap *aw1, struct pf_addr_wrap *aw2) { if (aw1->type != aw2->type) return (1); switch (aw1->type) { case PF_ADDR_ADDRMASK: if (PF_ANEQ(&aw1->v.a.addr, &aw2->v.a.addr, 0)) return (1); if (PF_ANEQ(&aw1->v.a.mask, &aw2->v.a.mask, 0)) return (1); return (0); case PF_ADDR_DYNIFTL: return (aw1->p.dyn->pfid_kt != aw2->p.dyn->pfid_kt); case PF_ADDR_NOROUTE: case PF_ADDR_URPFFAILED: return (0); case PF_ADDR_TABLE: return (aw1->p.tbl != aw2->p.tbl); case PF_ADDR_RTLABEL: return (aw1->v.rtlabel != aw2->v.rtlabel); default: printf("invalid address type: %d\n", aw1->type); return (1); } } u_int16_t pf_cksum_fixup(u_int16_t cksum, u_int16_t old, u_int16_t new, u_int8_t udp) { u_int32_t l; if (udp && !cksum) return (0x0000); l = cksum + old - new; l = (l >> 16) + (l & 65535); l = l & 65535; if (udp && !l) return (0xFFFF); return (l); } void pf_change_ap(struct pf_addr *a, u_int16_t *p, u_int16_t *ic, u_int16_t *pc, struct pf_addr *an, u_int16_t pn, u_int8_t u, sa_family_t af) { struct pf_addr ao; u_int16_t po = *p; PF_ACPY(&ao, a, af); PF_ACPY(a, an, af); *p = pn; switch (af) { #ifdef INET case AF_INET: *ic = pf_cksum_fixup(pf_cksum_fixup(*ic, ao.addr16[0], an->addr16[0], 0), ao.addr16[1], an->addr16[1], 0); *p = pn; *pc = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup(*pc, ao.addr16[0], an->addr16[0], u), ao.addr16[1], an->addr16[1], u), po, pn, u); break; #endif /* INET */ #ifdef INET6 case AF_INET6: *pc = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup(*pc, ao.addr16[0], an->addr16[0], u), ao.addr16[1], an->addr16[1], u), ao.addr16[2], an->addr16[2], u), ao.addr16[3], an->addr16[3], u), ao.addr16[4], an->addr16[4], u), ao.addr16[5], an->addr16[5], u), ao.addr16[6], an->addr16[6], u), ao.addr16[7], an->addr16[7], u), po, pn, u); break; #endif /* INET6 */ } } /* Changes a u_int32_t. 
Uses a void * so there are no align restrictions */ void pf_change_a(void *a, u_int16_t *c, u_int32_t an, u_int8_t u) { u_int32_t ao; memcpy(&ao, a, sizeof(ao)); memcpy(a, &an, sizeof(u_int32_t)); *c = pf_cksum_fixup(pf_cksum_fixup(*c, ao / 65536, an / 65536, u), ao % 65536, an % 65536, u); } #ifdef INET6 void pf_change_a6(struct pf_addr *a, u_int16_t *c, struct pf_addr *an, u_int8_t u) { struct pf_addr ao; PF_ACPY(&ao, a, AF_INET6); PF_ACPY(a, an, AF_INET6); *c = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(*c, ao.addr16[0], an->addr16[0], u), ao.addr16[1], an->addr16[1], u), ao.addr16[2], an->addr16[2], u), ao.addr16[3], an->addr16[3], u), ao.addr16[4], an->addr16[4], u), ao.addr16[5], an->addr16[5], u), ao.addr16[6], an->addr16[6], u), ao.addr16[7], an->addr16[7], u); } #endif /* INET6 */ void pf_change_icmp(struct pf_addr *ia, u_int16_t *ip, struct pf_addr *oa, struct pf_addr *na, u_int16_t np, u_int16_t *pc, u_int16_t *h2c, u_int16_t *ic, u_int16_t *hc, u_int8_t u, sa_family_t af) { struct pf_addr oia, ooa; PF_ACPY(&oia, ia, af); PF_ACPY(&ooa, oa, af); /* Change inner protocol port, fix inner protocol checksum. */ if (ip != NULL) { u_int16_t oip = *ip; u_int32_t opc = 0; /* make the compiler happy */ if (pc != NULL) opc = *pc; *ip = np; if (pc != NULL) *pc = pf_cksum_fixup(*pc, oip, *ip, u); *ic = pf_cksum_fixup(*ic, oip, *ip, 0); if (pc != NULL) *ic = pf_cksum_fixup(*ic, opc, *pc, 0); } /* Change inner ip address, fix inner ip and icmp checksums. */ PF_ACPY(ia, na, af); switch (af) { #ifdef INET case AF_INET: { u_int32_t oh2c = *h2c; *h2c = pf_cksum_fixup(pf_cksum_fixup(*h2c, oia.addr16[0], ia->addr16[0], 0), oia.addr16[1], ia->addr16[1], 0); *ic = pf_cksum_fixup(pf_cksum_fixup(*ic, oia.addr16[0], ia->addr16[0], 0), oia.addr16[1], ia->addr16[1], 0); *ic = pf_cksum_fixup(*ic, oh2c, *h2c, 0); break; } #endif /* INET */ #ifdef INET6 case AF_INET6: *ic = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(*ic, oia.addr16[0], ia->addr16[0], u), oia.addr16[1], ia->addr16[1], u), oia.addr16[2], ia->addr16[2], u), oia.addr16[3], ia->addr16[3], u), oia.addr16[4], ia->addr16[4], u), oia.addr16[5], ia->addr16[5], u), oia.addr16[6], ia->addr16[6], u), oia.addr16[7], ia->addr16[7], u); break; #endif /* INET6 */ } /* Change outer ip address, fix outer ip or icmpv6 checksum. 
*/ PF_ACPY(oa, na, af); switch (af) { #ifdef INET case AF_INET: *hc = pf_cksum_fixup(pf_cksum_fixup(*hc, ooa.addr16[0], oa->addr16[0], 0), ooa.addr16[1], oa->addr16[1], 0); break; #endif /* INET */ #ifdef INET6 case AF_INET6: *ic = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(*ic, ooa.addr16[0], oa->addr16[0], u), ooa.addr16[1], oa->addr16[1], u), ooa.addr16[2], oa->addr16[2], u), ooa.addr16[3], oa->addr16[3], u), ooa.addr16[4], oa->addr16[4], u), ooa.addr16[5], oa->addr16[5], u), ooa.addr16[6], oa->addr16[6], u), ooa.addr16[7], oa->addr16[7], u); break; #endif /* INET6 */ } } /* * Need to modulate the sequence numbers in the TCP SACK option * (credits to Krzysztof Pfaff for report and patch) */ int pf_modulate_sack(struct mbuf *m, int off, struct pf_pdesc *pd, struct tcphdr *th, struct pf_state_peer *dst) { int hlen = (th->th_off << 2) - sizeof(*th), thoptlen = hlen; #ifdef __FreeBSD__ u_int8_t opts[TCP_MAXOLEN], *opt = opts; #else u_int8_t opts[MAX_TCPOPTLEN], *opt = opts; #endif int copyback = 0, i, olen; struct sackblk sack; #define TCPOLEN_SACKLEN (TCPOLEN_SACK + 2) if (hlen < TCPOLEN_SACKLEN || !pf_pull_hdr(m, off + sizeof(*th), opts, hlen, NULL, NULL, pd->af)) return 0; while (hlen >= TCPOLEN_SACKLEN) { olen = opt[1]; switch (*opt) { case TCPOPT_EOL: /* FALLTHROUGH */ case TCPOPT_NOP: opt++; hlen--; break; case TCPOPT_SACK: if (olen > hlen) olen = hlen; if (olen >= TCPOLEN_SACKLEN) { for (i = 2; i + TCPOLEN_SACK <= olen; i += TCPOLEN_SACK) { memcpy(&sack, &opt[i], sizeof(sack)); pf_change_a(&sack.start, &th->th_sum, htonl(ntohl(sack.start) - dst->seqdiff), 0); pf_change_a(&sack.end, &th->th_sum, htonl(ntohl(sack.end) - dst->seqdiff), 0); memcpy(&opt[i], &sack, sizeof(sack)); } copyback = 1; } /* FALLTHROUGH */ default: if (olen < 2) olen = 2; hlen -= olen; opt += olen; } } if (copyback) #ifdef __FreeBSD__ m_copyback(m, off + sizeof(*th), thoptlen, (caddr_t)opts); #else m_copyback(m, off + sizeof(*th), thoptlen, opts); #endif return (copyback); } void #ifdef __FreeBSD__ pf_send_tcp(struct mbuf *replyto, const struct pf_rule *r, sa_family_t af, #else pf_send_tcp(const struct pf_rule *r, sa_family_t af, #endif const struct pf_addr *saddr, const struct pf_addr *daddr, u_int16_t sport, u_int16_t dport, u_int32_t seq, u_int32_t ack, u_int8_t flags, u_int16_t win, u_int16_t mss, u_int8_t ttl, int tag, u_int16_t rtag, struct ether_header *eh, struct ifnet *ifp) { INIT_VNET_INET(curvnet); struct mbuf *m; int len, tlen; #ifdef INET struct ip *h; #endif /* INET */ #ifdef INET6 struct ip6_hdr *h6; #endif /* INET6 */ struct tcphdr *th; char *opt; struct pf_mtag *pf_mtag; #ifdef __FreeBSD__ KASSERT( #ifdef INET af == AF_INET #else 0 #endif || #ifdef INET6 af == AF_INET6 #else 0 #endif , ("Unsupported AF %d", af)); len = 0; th = NULL; #ifdef INET h = NULL; #endif #ifdef INET6 h6 = NULL; #endif #endif /* maximum segment size tcp option */ tlen = sizeof(struct tcphdr); if (mss) tlen += 4; switch (af) { #ifdef INET case AF_INET: len = sizeof(struct ip) + tlen; break; #endif /* INET */ #ifdef INET6 case AF_INET6: len = sizeof(struct ip6_hdr) + tlen; break; #endif /* INET6 */ } /* create outgoing mbuf */ m = m_gethdr(M_DONTWAIT, MT_HEADER); if (m == NULL) return; #ifdef __FreeBSD__ #ifdef MAC if (replyto) mac_netinet_firewall_reply(replyto, m); else mac_netinet_firewall_send(m); #else (void)replyto; #endif #endif if ((pf_mtag = pf_get_mtag(m)) == NULL) { m_freem(m); return; } if (tag) #ifdef __FreeBSD__ m->m_flags |= 
M_SKIP_FIREWALL; #else pf_mtag->flags |= PF_TAG_GENERATED; #endif pf_mtag->tag = rtag; if (r != NULL && r->rtableid >= 0) #ifdef __FreeBSD__ { M_SETFIB(m, r->rtableid); #endif pf_mtag->rtableid = r->rtableid; #ifdef __FreeBSD__ } #endif #ifdef ALTQ if (r != NULL && r->qid) { pf_mtag->qid = r->qid; /* add hints for ecn */ pf_mtag->af = af; pf_mtag->hdr = mtod(m, struct ip *); } #endif /* ALTQ */ m->m_data += max_linkhdr; m->m_pkthdr.len = m->m_len = len; m->m_pkthdr.rcvif = NULL; bzero(m->m_data, len); switch (af) { #ifdef INET case AF_INET: h = mtod(m, struct ip *); /* IP header fields included in the TCP checksum */ h->ip_p = IPPROTO_TCP; h->ip_len = htons(tlen); h->ip_src.s_addr = saddr->v4.s_addr; h->ip_dst.s_addr = daddr->v4.s_addr; th = (struct tcphdr *)((caddr_t)h + sizeof(struct ip)); break; #endif /* INET */ #ifdef INET6 case AF_INET6: h6 = mtod(m, struct ip6_hdr *); /* IP header fields included in the TCP checksum */ h6->ip6_nxt = IPPROTO_TCP; h6->ip6_plen = htons(tlen); memcpy(&h6->ip6_src, &saddr->v6, sizeof(struct in6_addr)); memcpy(&h6->ip6_dst, &daddr->v6, sizeof(struct in6_addr)); th = (struct tcphdr *)((caddr_t)h6 + sizeof(struct ip6_hdr)); break; #endif /* INET6 */ } /* TCP header */ th->th_sport = sport; th->th_dport = dport; th->th_seq = htonl(seq); th->th_ack = htonl(ack); th->th_off = tlen >> 2; th->th_flags = flags; th->th_win = htons(win); if (mss) { opt = (char *)(th + 1); opt[0] = TCPOPT_MAXSEG; opt[1] = 4; HTONS(mss); bcopy((caddr_t)&mss, (caddr_t)(opt + 2), 2); } switch (af) { #ifdef INET case AF_INET: /* TCP checksum */ th->th_sum = in_cksum(m, len); /* Finish the IP header */ h->ip_v = 4; h->ip_hl = sizeof(*h) >> 2; h->ip_tos = IPTOS_LOWDELAY; #ifdef __FreeBSD__ h->ip_off = V_path_mtu_discovery ? IP_DF : 0; h->ip_len = len; #else h->ip_off = htons(ip_mtudisc ? IP_DF : 0); h->ip_len = htons(len); #endif h->ip_ttl = ttl ? ttl : V_ip_defttl; h->ip_sum = 0; if (eh == NULL) { #ifdef __FreeBSD__ PF_UNLOCK(); ip_output(m, (void *)NULL, (void *)NULL, 0, (void *)NULL, (void *)NULL); PF_LOCK(); #else /* ! __FreeBSD__ */ ip_output(m, (void *)NULL, (void *)NULL, 0, (void *)NULL, (void *)NULL); #endif } else { struct route ro; struct rtentry rt; struct ether_header *e = (void *)ro.ro_dst.sa_data; if (ifp == NULL) { m_freem(m); return; } rt.rt_ifp = ifp; ro.ro_rt = &rt; ro.ro_dst.sa_len = sizeof(ro.ro_dst); ro.ro_dst.sa_family = pseudo_AF_HDRCMPLT; bcopy(eh->ether_dhost, e->ether_shost, ETHER_ADDR_LEN); bcopy(eh->ether_shost, e->ether_dhost, ETHER_ADDR_LEN); e->ether_type = eh->ether_type; #ifdef __FreeBSD__ PF_UNLOCK(); /* XXX_IMPORT: later */ ip_output(m, (void *)NULL, &ro, 0, (void *)NULL, (void *)NULL); PF_LOCK(); #else /* ! 
__FreeBSD__ */ ip_output(m, (void *)NULL, &ro, IP_ROUTETOETHER, (void *)NULL, (void *)NULL); #endif } break; #endif /* INET */ #ifdef INET6 case AF_INET6: /* TCP checksum */ th->th_sum = in6_cksum(m, IPPROTO_TCP, sizeof(struct ip6_hdr), tlen); h6->ip6_vfc |= IPV6_VERSION; h6->ip6_hlim = IPV6_DEFHLIM; #ifdef __FreeBSD__ PF_UNLOCK(); ip6_output(m, NULL, NULL, 0, NULL, NULL, NULL); PF_LOCK(); #else ip6_output(m, NULL, NULL, 0, NULL, NULL); #endif break; #endif /* INET6 */ } } void pf_send_icmp(struct mbuf *m, u_int8_t type, u_int8_t code, sa_family_t af, struct pf_rule *r) { struct pf_mtag *pf_mtag; struct mbuf *m0; #ifdef __FreeBSD__ struct ip *ip; #endif #ifdef __FreeBSD__ m0 = m_copypacket(m, M_DONTWAIT); if (m0 == NULL) return; #else m0 = m_copy(m, 0, M_COPYALL); #endif if ((pf_mtag = pf_get_mtag(m0)) == NULL) return; #ifdef __FreeBSD__ /* XXX: revisit */ m0->m_flags |= M_SKIP_FIREWALL; #else pf_mtag->flags |= PF_TAG_GENERATED; #endif if (r->rtableid >= 0) #ifdef __FreeBSD__ { M_SETFIB(m0, r->rtableid); #endif pf_mtag->rtableid = r->rtableid; #ifdef __FreeBSD__ } #endif #ifdef ALTQ if (r->qid) { pf_mtag->qid = r->qid; /* add hints for ecn */ pf_mtag->af = af; pf_mtag->hdr = mtod(m0, struct ip *); } #endif /* ALTQ */ switch (af) { #ifdef INET case AF_INET: #ifdef __FreeBSD__ /* icmp_error() expects host byte ordering */ ip = mtod(m0, struct ip *); NTOHS(ip->ip_len); NTOHS(ip->ip_off); PF_UNLOCK(); icmp_error(m0, type, code, 0, 0); PF_LOCK(); #else icmp_error(m0, type, code, 0, 0); #endif break; #endif /* INET */ #ifdef INET6 case AF_INET6: #ifdef __FreeBSD__ PF_UNLOCK(); #endif icmp6_error(m0, type, code, 0); #ifdef __FreeBSD__ PF_LOCK(); #endif break; #endif /* INET6 */ } } /* * Return 1 if the addresses a and b match (with mask m), otherwise return 0. * If n is 0, they match if they are equal. If n is != 0, they match if they * are different. 
*/ int pf_match_addr(u_int8_t n, struct pf_addr *a, struct pf_addr *m, struct pf_addr *b, sa_family_t af) { int match = 0; switch (af) { #ifdef INET case AF_INET: if ((a->addr32[0] & m->addr32[0]) == (b->addr32[0] & m->addr32[0])) match++; break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (((a->addr32[0] & m->addr32[0]) == (b->addr32[0] & m->addr32[0])) && ((a->addr32[1] & m->addr32[1]) == (b->addr32[1] & m->addr32[1])) && ((a->addr32[2] & m->addr32[2]) == (b->addr32[2] & m->addr32[2])) && ((a->addr32[3] & m->addr32[3]) == (b->addr32[3] & m->addr32[3]))) match++; break; #endif /* INET6 */ } if (match) { if (n) return (0); else return (1); } else { if (n) return (1); else return (0); } } int pf_match(u_int8_t op, u_int32_t a1, u_int32_t a2, u_int32_t p) { switch (op) { case PF_OP_IRG: return ((p > a1) && (p < a2)); case PF_OP_XRG: return ((p < a1) || (p > a2)); case PF_OP_RRG: return ((p >= a1) && (p <= a2)); case PF_OP_EQ: return (p == a1); case PF_OP_NE: return (p != a1); case PF_OP_LT: return (p < a1); case PF_OP_LE: return (p <= a1); case PF_OP_GT: return (p > a1); case PF_OP_GE: return (p >= a1); } return (0); /* never reached */ } int pf_match_port(u_int8_t op, u_int16_t a1, u_int16_t a2, u_int16_t p) { NTOHS(a1); NTOHS(a2); NTOHS(p); return (pf_match(op, a1, a2, p)); } int pf_match_uid(u_int8_t op, uid_t a1, uid_t a2, uid_t u) { if (u == UID_MAX && op != PF_OP_EQ && op != PF_OP_NE) return (0); return (pf_match(op, a1, a2, u)); } int pf_match_gid(u_int8_t op, gid_t a1, gid_t a2, gid_t g) { if (g == GID_MAX && op != PF_OP_EQ && op != PF_OP_NE) return (0); return (pf_match(op, a1, a2, g)); } #ifndef __FreeBSD__ struct pf_mtag * pf_find_mtag(struct mbuf *m) { struct m_tag *mtag; if ((mtag = m_tag_find(m, PACKET_TAG_PF, NULL)) == NULL) return (NULL); return ((struct pf_mtag *)(mtag + 1)); } struct pf_mtag * pf_get_mtag(struct mbuf *m) { struct m_tag *mtag; if ((mtag = m_tag_find(m, PACKET_TAG_PF, NULL)) == NULL) { mtag = m_tag_get(PACKET_TAG_PF, sizeof(struct pf_mtag), M_NOWAIT); if (mtag == NULL) return (NULL); bzero(mtag + 1, sizeof(struct pf_mtag)); m_tag_prepend(m, mtag); } return ((struct pf_mtag *)(mtag + 1)); } #endif int pf_match_tag(struct mbuf *m, struct pf_rule *r, struct pf_mtag *pf_mtag, int *tag) { if (*tag == -1) *tag = pf_mtag->tag; return ((!r->match_tag_not && r->match_tag == *tag) || (r->match_tag_not && r->match_tag != *tag)); } int pf_tag_packet(struct mbuf *m, struct pf_mtag *pf_mtag, int tag, int rtableid) { if (tag <= 0 && rtableid < 0) return (0); if (pf_mtag == NULL) if ((pf_mtag = pf_get_mtag(m)) == NULL) return (1); if (tag > 0) pf_mtag->tag = tag; if (rtableid >= 0) #ifdef __FreeBSD__ { M_SETFIB(m, rtableid); #endif pf_mtag->rtableid = rtableid; #ifdef __FreeBSD__ } #endif return (0); } static void pf_step_into_anchor(int *depth, struct pf_ruleset **rs, int n, struct pf_rule **r, struct pf_rule **a, int *match) { struct pf_anchor_stackframe *f; (*r)->anchor->match = 0; if (match) *match = 0; if (*depth >= sizeof(pf_anchor_stack) / sizeof(pf_anchor_stack[0])) { printf("pf_step_into_anchor: stack overflow\n"); *r = TAILQ_NEXT(*r, entries); return; } else if (*depth == 0 && a != NULL) *a = *r; f = pf_anchor_stack + (*depth)++; f->rs = *rs; f->r = *r; if ((*r)->anchor_wildcard) { f->parent = &(*r)->anchor->children; if ((f->child = RB_MIN(pf_anchor_node, f->parent)) == NULL) { *r = NULL; return; } *rs = &f->child->ruleset; } else { f->parent = NULL; f->child = NULL; *rs = &(*r)->anchor->ruleset; } *r = TAILQ_FIRST((*rs)->rules[n].active.ptr); } int 
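/*
 * Anchor recursion, briefly (a summary of the two helpers here):
 * pf_step_into_anchor() pushes the current rule and ruleset onto
 * pf_anchor_stack and descends into the anchor's own ruleset (or into
 * its first child when anchor_wildcard is set).  pf_step_out_of_anchor()
 * below pops frames once a nested ruleset is exhausted, moves on to the
 * next wildcard child if one exists, propagates the child's match flag
 * up to the anchor rule, and returns that rule's "quick" flag so the
 * caller can stop evaluating further rules.
 */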
pf_step_out_of_anchor(int *depth, struct pf_ruleset **rs, int n, struct pf_rule **r, struct pf_rule **a, int *match) { struct pf_anchor_stackframe *f; int quick = 0; do { if (*depth <= 0) break; f = pf_anchor_stack + *depth - 1; if (f->parent != NULL && f->child != NULL) { if (f->child->match || (match != NULL && *match)) { f->r->anchor->match = 1; *match = 0; } f->child = RB_NEXT(pf_anchor_node, f->parent, f->child); if (f->child != NULL) { *rs = &f->child->ruleset; *r = TAILQ_FIRST((*rs)->rules[n].active.ptr); if (*r == NULL) continue; else break; } } (*depth)--; if (*depth == 0 && a != NULL) *a = NULL; *rs = f->rs; if (f->r->anchor->match || (match != NULL && *match)) quick = f->r->quick; *r = TAILQ_NEXT(f->r, entries); } while (*r == NULL); return (quick); } #ifdef INET6 void pf_poolmask(struct pf_addr *naddr, struct pf_addr *raddr, struct pf_addr *rmask, struct pf_addr *saddr, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: naddr->addr32[0] = (raddr->addr32[0] & rmask->addr32[0]) | ((rmask->addr32[0] ^ 0xffffffff ) & saddr->addr32[0]); break; #endif /* INET */ case AF_INET6: naddr->addr32[0] = (raddr->addr32[0] & rmask->addr32[0]) | ((rmask->addr32[0] ^ 0xffffffff ) & saddr->addr32[0]); naddr->addr32[1] = (raddr->addr32[1] & rmask->addr32[1]) | ((rmask->addr32[1] ^ 0xffffffff ) & saddr->addr32[1]); naddr->addr32[2] = (raddr->addr32[2] & rmask->addr32[2]) | ((rmask->addr32[2] ^ 0xffffffff ) & saddr->addr32[2]); naddr->addr32[3] = (raddr->addr32[3] & rmask->addr32[3]) | ((rmask->addr32[3] ^ 0xffffffff ) & saddr->addr32[3]); break; } } void pf_addr_inc(struct pf_addr *addr, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: addr->addr32[0] = htonl(ntohl(addr->addr32[0]) + 1); break; #endif /* INET */ case AF_INET6: if (addr->addr32[3] == 0xffffffff) { addr->addr32[3] = 0; if (addr->addr32[2] == 0xffffffff) { addr->addr32[2] = 0; if (addr->addr32[1] == 0xffffffff) { addr->addr32[1] = 0; addr->addr32[0] = htonl(ntohl(addr->addr32[0]) + 1); } else addr->addr32[1] = htonl(ntohl(addr->addr32[1]) + 1); } else addr->addr32[2] = htonl(ntohl(addr->addr32[2]) + 1); } else addr->addr32[3] = htonl(ntohl(addr->addr32[3]) + 1); break; } } #endif /* INET6 */ #define mix(a,b,c) \ do { \ a -= b; a -= c; a ^= (c >> 13); \ b -= c; b -= a; b ^= (a << 8); \ c -= a; c -= b; c ^= (b >> 13); \ a -= b; a -= c; a ^= (c >> 12); \ b -= c; b -= a; b ^= (a << 16); \ c -= a; c -= b; c ^= (b >> 5); \ a -= b; a -= c; a ^= (c >> 3); \ b -= c; b -= a; b ^= (a << 10); \ c -= a; c -= b; c ^= (b >> 15); \ } while (0) /* * hash function based on bridge_hash in if_bridge.c */ void pf_hash(struct pf_addr *inaddr, struct pf_addr *hash, struct pf_poolhashkey *key, sa_family_t af) { u_int32_t a = 0x9e3779b9, b = 0x9e3779b9, c = key->key32[0]; switch (af) { #ifdef INET case AF_INET: a += inaddr->addr32[0]; b += key->key32[1]; mix(a, b, c); hash->addr32[0] = c + key->key32[2]; break; #endif /* INET */ #ifdef INET6 case AF_INET6: a += inaddr->addr32[0]; b += inaddr->addr32[2]; mix(a, b, c); hash->addr32[0] = c; a += inaddr->addr32[1]; b += inaddr->addr32[3]; c += key->key32[1]; mix(a, b, c); hash->addr32[1] = c; a += inaddr->addr32[2]; b += inaddr->addr32[1]; c += key->key32[2]; mix(a, b, c); hash->addr32[2] = c; a += inaddr->addr32[3]; b += inaddr->addr32[0]; c += key->key32[3]; mix(a, b, c); hash->addr32[3] = c; break; #endif /* INET6 */ } } int pf_map_addr(sa_family_t af, struct pf_rule *r, struct pf_addr *saddr, struct pf_addr *naddr, struct pf_addr *init_addr, struct pf_src_node **sn) { unsigned char hash[16]; 
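/*
 * Address-pool selection in a nutshell (a summary of the code below):
 * pf_map_addr() fills in *naddr, the translation or route-to address,
 * according to the pool type:
 *
 *	PF_POOL_NONE		take the pool address unchanged
 *	PF_POOL_BITMASK		naddr = (raddr & rmask) | (saddr & ~rmask)
 *	PF_POOL_RANDOM		random host part within raddr/rmask
 *	PF_POOL_SRCHASH		hash of saddr, keyed by rpool->key
 *	PF_POOL_ROUNDROBIN	walk the pool entries in turn; the only
 *				type that works with address tables
 *
 * Sticky-address source tracking, when enabled, short-circuits all of
 * the above and reuses the address already recorded for the source
 * node.  As an example, a bitmask pool such as
 *	nat on $ext_if from 10.0.0.0/24 -> 192.0.2.0/24 bitmask
 * maps source 10.0.0.7 to 192.0.2.7, preserving the host bits.
 */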
struct pf_pool *rpool = &r->rpool; struct pf_addr *raddr = &rpool->cur->addr.v.a.addr; struct pf_addr *rmask = &rpool->cur->addr.v.a.mask; struct pf_pooladdr *acur = rpool->cur; struct pf_src_node k; if (*sn == NULL && r->rpool.opts & PF_POOL_STICKYADDR && (r->rpool.opts & PF_POOL_TYPEMASK) != PF_POOL_NONE) { k.af = af; PF_ACPY(&k.addr, saddr, af); if (r->rule_flag & PFRULE_RULESRCTRACK || r->rpool.opts & PF_POOL_STICKYADDR) k.rule.ptr = r; else k.rule.ptr = NULL; pf_status.scounters[SCNT_SRC_NODE_SEARCH]++; *sn = RB_FIND(pf_src_tree, &tree_src_tracking, &k); if (*sn != NULL && !PF_AZERO(&(*sn)->raddr, af)) { PF_ACPY(naddr, &(*sn)->raddr, af); if (pf_status.debug >= PF_DEBUG_MISC) { printf("pf_map_addr: src tracking maps "); pf_print_host(&k.addr, 0, af); printf(" to "); pf_print_host(naddr, 0, af); printf("\n"); } return (0); } } if (rpool->cur->addr.type == PF_ADDR_NOROUTE) return (1); if (rpool->cur->addr.type == PF_ADDR_DYNIFTL) { switch (af) { #ifdef INET case AF_INET: if (rpool->cur->addr.p.dyn->pfid_acnt4 < 1 && (rpool->opts & PF_POOL_TYPEMASK) != PF_POOL_ROUNDROBIN) return (1); raddr = &rpool->cur->addr.p.dyn->pfid_addr4; rmask = &rpool->cur->addr.p.dyn->pfid_mask4; break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (rpool->cur->addr.p.dyn->pfid_acnt6 < 1 && (rpool->opts & PF_POOL_TYPEMASK) != PF_POOL_ROUNDROBIN) return (1); raddr = &rpool->cur->addr.p.dyn->pfid_addr6; rmask = &rpool->cur->addr.p.dyn->pfid_mask6; break; #endif /* INET6 */ } } else if (rpool->cur->addr.type == PF_ADDR_TABLE) { if ((rpool->opts & PF_POOL_TYPEMASK) != PF_POOL_ROUNDROBIN) return (1); /* unsupported */ } else { raddr = &rpool->cur->addr.v.a.addr; rmask = &rpool->cur->addr.v.a.mask; } switch (rpool->opts & PF_POOL_TYPEMASK) { case PF_POOL_NONE: PF_ACPY(naddr, raddr, af); break; case PF_POOL_BITMASK: PF_POOLMASK(naddr, raddr, rmask, saddr, af); break; case PF_POOL_RANDOM: if (init_addr != NULL && PF_AZERO(init_addr, af)) { switch (af) { #ifdef INET case AF_INET: rpool->counter.addr32[0] = htonl(arc4random()); break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (rmask->addr32[3] != 0xffffffff) rpool->counter.addr32[3] = htonl(arc4random()); else break; if (rmask->addr32[2] != 0xffffffff) rpool->counter.addr32[2] = htonl(arc4random()); else break; if (rmask->addr32[1] != 0xffffffff) rpool->counter.addr32[1] = htonl(arc4random()); else break; if (rmask->addr32[0] != 0xffffffff) rpool->counter.addr32[0] = htonl(arc4random()); break; #endif /* INET6 */ } PF_POOLMASK(naddr, raddr, rmask, &rpool->counter, af); PF_ACPY(init_addr, naddr, af); } else { PF_AINC(&rpool->counter, af); PF_POOLMASK(naddr, raddr, rmask, &rpool->counter, af); } break; case PF_POOL_SRCHASH: pf_hash(saddr, (struct pf_addr *)&hash, &rpool->key, af); PF_POOLMASK(naddr, raddr, rmask, (struct pf_addr *)&hash, af); break; case PF_POOL_ROUNDROBIN: if (rpool->cur->addr.type == PF_ADDR_TABLE) { if (!pfr_pool_get(rpool->cur->addr.p.tbl, &rpool->tblidx, &rpool->counter, &raddr, &rmask, af)) goto get_addr; } else if (rpool->cur->addr.type == PF_ADDR_DYNIFTL) { if (!pfr_pool_get(rpool->cur->addr.p.dyn->pfid_kt, &rpool->tblidx, &rpool->counter, &raddr, &rmask, af)) goto get_addr; } else if (pf_match_addr(0, raddr, rmask, &rpool->counter, af)) goto get_addr; try_next: if ((rpool->cur = TAILQ_NEXT(rpool->cur, entries)) == NULL) rpool->cur = TAILQ_FIRST(&rpool->list); if (rpool->cur->addr.type == PF_ADDR_TABLE) { rpool->tblidx = -1; if (pfr_pool_get(rpool->cur->addr.p.tbl, &rpool->tblidx, &rpool->counter, &raddr, &rmask, af)) { /* table contains 
no address of type 'af' */ if (rpool->cur != acur) goto try_next; return (1); } } else if (rpool->cur->addr.type == PF_ADDR_DYNIFTL) { rpool->tblidx = -1; if (pfr_pool_get(rpool->cur->addr.p.dyn->pfid_kt, &rpool->tblidx, &rpool->counter, &raddr, &rmask, af)) { /* table contains no address of type 'af' */ if (rpool->cur != acur) goto try_next; return (1); } } else { raddr = &rpool->cur->addr.v.a.addr; rmask = &rpool->cur->addr.v.a.mask; PF_ACPY(&rpool->counter, raddr, af); } get_addr: PF_ACPY(naddr, &rpool->counter, af); if (init_addr != NULL && PF_AZERO(init_addr, af)) PF_ACPY(init_addr, naddr, af); PF_AINC(&rpool->counter, af); break; } if (*sn != NULL) PF_ACPY(&(*sn)->raddr, naddr, af); if (pf_status.debug >= PF_DEBUG_MISC && (rpool->opts & PF_POOL_TYPEMASK) != PF_POOL_NONE) { printf("pf_map_addr: selected address "); pf_print_host(naddr, 0, af); printf("\n"); } return (0); } int pf_get_sport(sa_family_t af, u_int8_t proto, struct pf_rule *r, struct pf_addr *saddr, struct pf_addr *daddr, u_int16_t dport, struct pf_addr *naddr, u_int16_t *nport, u_int16_t low, u_int16_t high, struct pf_src_node **sn) { struct pf_state_cmp key; struct pf_addr init_addr; u_int16_t cut; bzero(&init_addr, sizeof(init_addr)); if (pf_map_addr(af, r, saddr, naddr, &init_addr, sn)) return (1); if (proto == IPPROTO_ICMP) { low = 1; high = 65535; } do { key.af = af; key.proto = proto; PF_ACPY(&key.ext.addr, daddr, key.af); PF_ACPY(&key.gwy.addr, naddr, key.af); key.ext.port = dport; /* * port search; start random, step; * similar 2 portloop in in_pcbbind */ if (!(proto == IPPROTO_TCP || proto == IPPROTO_UDP || proto == IPPROTO_ICMP)) { key.gwy.port = dport; if (pf_find_state_all(&key, PF_EXT_GWY, NULL) == NULL) return (0); } else if (low == 0 && high == 0) { key.gwy.port = *nport; if (pf_find_state_all(&key, PF_EXT_GWY, NULL) == NULL) return (0); } else if (low == high) { key.gwy.port = htons(low); if (pf_find_state_all(&key, PF_EXT_GWY, NULL) == NULL) { *nport = htons(low); return (0); } } else { u_int16_t tmp; if (low > high) { tmp = low; low = high; high = tmp; } /* low < high */ cut = htonl(arc4random()) % (1 + high - low) + low; /* low <= cut <= high */ for (tmp = cut; tmp <= high; ++(tmp)) { key.gwy.port = htons(tmp); if (pf_find_state_all(&key, PF_EXT_GWY, NULL) == NULL) { *nport = htons(tmp); return (0); } } for (tmp = cut - 1; tmp >= low; --(tmp)) { key.gwy.port = htons(tmp); if (pf_find_state_all(&key, PF_EXT_GWY, NULL) == NULL) { *nport = htons(tmp); return (0); } } } switch (r->rpool.opts & PF_POOL_TYPEMASK) { case PF_POOL_RANDOM: case PF_POOL_ROUNDROBIN: if (pf_map_addr(af, r, saddr, naddr, &init_addr, sn)) return (1); break; case PF_POOL_NONE: case PF_POOL_SRCHASH: case PF_POOL_BITMASK: default: return (1); } } while (! 
PF_AEQ(&init_addr, naddr, af) ); return (1); /* none available */ } struct pf_rule * pf_match_translation(struct pf_pdesc *pd, struct mbuf *m, int off, int direction, struct pfi_kif *kif, struct pf_addr *saddr, u_int16_t sport, struct pf_addr *daddr, u_int16_t dport, int rs_num) { struct pf_rule *r, *rm = NULL; struct pf_ruleset *ruleset = NULL; int tag = -1; int rtableid = -1; int asd = 0; r = TAILQ_FIRST(pf_main_ruleset.rules[rs_num].active.ptr); while (r && rm == NULL) { struct pf_rule_addr *src = NULL, *dst = NULL; struct pf_addr_wrap *xdst = NULL; if (r->action == PF_BINAT && direction == PF_IN) { src = &r->dst; if (r->rpool.cur != NULL) xdst = &r->rpool.cur->addr; } else { src = &r->src; dst = &r->dst; } r->evaluations++; if (pfi_kif_match(r->kif, kif) == r->ifnot) r = r->skip[PF_SKIP_IFP].ptr; else if (r->direction && r->direction != direction) r = r->skip[PF_SKIP_DIR].ptr; else if (r->af && r->af != pd->af) r = r->skip[PF_SKIP_AF].ptr; else if (r->proto && r->proto != pd->proto) r = r->skip[PF_SKIP_PROTO].ptr; else if (PF_MISMATCHAW(&src->addr, saddr, pd->af, src->neg, kif)) r = r->skip[src == &r->src ? PF_SKIP_SRC_ADDR : PF_SKIP_DST_ADDR].ptr; else if (src->port_op && !pf_match_port(src->port_op, src->port[0], src->port[1], sport)) r = r->skip[src == &r->src ? PF_SKIP_SRC_PORT : PF_SKIP_DST_PORT].ptr; else if (dst != NULL && PF_MISMATCHAW(&dst->addr, daddr, pd->af, dst->neg, NULL)) r = r->skip[PF_SKIP_DST_ADDR].ptr; else if (xdst != NULL && PF_MISMATCHAW(xdst, daddr, pd->af, 0, NULL)) r = TAILQ_NEXT(r, entries); else if (dst != NULL && dst->port_op && !pf_match_port(dst->port_op, dst->port[0], dst->port[1], dport)) r = r->skip[PF_SKIP_DST_PORT].ptr; else if (r->match_tag && !pf_match_tag(m, r, pd->pf_mtag, &tag)) r = TAILQ_NEXT(r, entries); else if (r->os_fingerprint != PF_OSFP_ANY && (pd->proto != IPPROTO_TCP || !pf_osfp_match(pf_osfp_fingerprint(pd, m, off, pd->hdr.tcp), r->os_fingerprint))) r = TAILQ_NEXT(r, entries); else { if (r->tag) tag = r->tag; if (r->rtableid >= 0) rtableid = r->rtableid; if (r->anchor == NULL) { rm = r; } else pf_step_into_anchor(&asd, &ruleset, rs_num, &r, NULL, NULL); } if (r == NULL) pf_step_out_of_anchor(&asd, &ruleset, rs_num, &r, NULL, NULL); } if (pf_tag_packet(m, pd->pf_mtag, tag, rtableid)) return (NULL); if (rm != NULL && (rm->action == PF_NONAT || rm->action == PF_NORDR || rm->action == PF_NOBINAT)) return (NULL); return (rm); } struct pf_rule * pf_get_translation(struct pf_pdesc *pd, struct mbuf *m, int off, int direction, struct pfi_kif *kif, struct pf_src_node **sn, struct pf_addr *saddr, u_int16_t sport, struct pf_addr *daddr, u_int16_t dport, struct pf_addr *naddr, u_int16_t *nport) { struct pf_rule *r = NULL; if (direction == PF_OUT) { r = pf_match_translation(pd, m, off, direction, kif, saddr, sport, daddr, dport, PF_RULESET_BINAT); if (r == NULL) r = pf_match_translation(pd, m, off, direction, kif, saddr, sport, daddr, dport, PF_RULESET_NAT); } else { r = pf_match_translation(pd, m, off, direction, kif, saddr, sport, daddr, dport, PF_RULESET_RDR); if (r == NULL) r = pf_match_translation(pd, m, off, direction, kif, saddr, sport, daddr, dport, PF_RULESET_BINAT); } if (r != NULL) { switch (r->action) { case PF_NONAT: case PF_NOBINAT: case PF_NORDR: return (NULL); case PF_NAT: if (pf_get_sport(pd->af, pd->proto, r, saddr, daddr, dport, naddr, nport, r->rpool.proxy_port[0], r->rpool.proxy_port[1], sn)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: NAT proxy port allocation " "(%u-%u) failed\n", r->rpool.proxy_port[0], r->rpool.proxy_port[1])); return 
(NULL); } break; case PF_BINAT: switch (direction) { case PF_OUT: if (r->rpool.cur->addr.type == PF_ADDR_DYNIFTL){ switch (pd->af) { #ifdef INET case AF_INET: if (r->rpool.cur->addr.p.dyn-> pfid_acnt4 < 1) return (NULL); PF_POOLMASK(naddr, &r->rpool.cur->addr.p.dyn-> pfid_addr4, &r->rpool.cur->addr.p.dyn-> pfid_mask4, saddr, AF_INET); break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (r->rpool.cur->addr.p.dyn-> pfid_acnt6 < 1) return (NULL); PF_POOLMASK(naddr, &r->rpool.cur->addr.p.dyn-> pfid_addr6, &r->rpool.cur->addr.p.dyn-> pfid_mask6, saddr, AF_INET6); break; #endif /* INET6 */ } } else PF_POOLMASK(naddr, &r->rpool.cur->addr.v.a.addr, &r->rpool.cur->addr.v.a.mask, saddr, pd->af); break; case PF_IN: if (r->src.addr.type == PF_ADDR_DYNIFTL) { switch (pd->af) { #ifdef INET case AF_INET: if (r->src.addr.p.dyn-> pfid_acnt4 < 1) return (NULL); PF_POOLMASK(naddr, &r->src.addr.p.dyn-> pfid_addr4, &r->src.addr.p.dyn-> pfid_mask4, daddr, AF_INET); break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (r->src.addr.p.dyn-> pfid_acnt6 < 1) return (NULL); PF_POOLMASK(naddr, &r->src.addr.p.dyn-> pfid_addr6, &r->src.addr.p.dyn-> pfid_mask6, daddr, AF_INET6); break; #endif /* INET6 */ } } else PF_POOLMASK(naddr, &r->src.addr.v.a.addr, &r->src.addr.v.a.mask, daddr, pd->af); break; } break; case PF_RDR: { if (pf_map_addr(pd->af, r, saddr, naddr, NULL, sn)) return (NULL); if ((r->rpool.opts & PF_POOL_TYPEMASK) == PF_POOL_BITMASK) PF_POOLMASK(naddr, naddr, &r->rpool.cur->addr.v.a.mask, daddr, pd->af); if (r->rpool.proxy_port[1]) { u_int32_t tmp_nport; tmp_nport = ((ntohs(dport) - ntohs(r->dst.port[0])) % (r->rpool.proxy_port[1] - r->rpool.proxy_port[0] + 1)) + r->rpool.proxy_port[0]; /* wrap around if necessary */ if (tmp_nport > 65535) tmp_nport -= 65535; *nport = htons((u_int16_t)tmp_nport); } else if (r->rpool.proxy_port[0]) *nport = htons(r->rpool.proxy_port[0]); break; } default: return (NULL); } } return (r); } int #ifdef __FreeBSD__ pf_socket_lookup(int direction, struct pf_pdesc *pd, struct inpcb *inp_arg) #else pf_socket_lookup(int direction, struct pf_pdesc *pd) #endif { INIT_VNET_INET(curvnet); struct pf_addr *saddr, *daddr; u_int16_t sport, dport; #ifdef __FreeBSD__ struct inpcbinfo *pi; #else struct inpcbtable *tb; #endif struct inpcb *inp; if (pd == NULL) return (-1); pd->lookup.uid = UID_MAX; pd->lookup.gid = GID_MAX; pd->lookup.pid = NO_PID; /* XXX: revisit */ #ifdef __FreeBSD__ if (inp_arg != NULL) { INP_LOCK_ASSERT(inp_arg); pd->lookup.uid = inp_arg->inp_cred->cr_uid; pd->lookup.gid = inp_arg->inp_cred->cr_groups[0]; return (1); } #endif switch (pd->proto) { case IPPROTO_TCP: if (pd->hdr.tcp == NULL) return (-1); sport = pd->hdr.tcp->th_sport; dport = pd->hdr.tcp->th_dport; #ifdef __FreeBSD__ pi = &V_tcbinfo; #else tb = &tcbtable; #endif break; case IPPROTO_UDP: if (pd->hdr.udp == NULL) return (-1); sport = pd->hdr.udp->uh_sport; dport = pd->hdr.udp->uh_dport; #ifdef __FreeBSD__ pi = &V_udbinfo; #else tb = &udbtable; #endif break; default: return (-1); } if (direction == PF_IN) { saddr = pd->src; daddr = pd->dst; } else { u_int16_t p; p = sport; sport = dport; dport = p; saddr = pd->dst; daddr = pd->src; } switch (pd->af) { #ifdef INET case AF_INET: #ifdef __FreeBSD__ INP_INFO_RLOCK(pi); /* XXX LOR */ inp = in_pcblookup_hash(pi, saddr->v4, sport, daddr->v4, dport, 0, NULL); if (inp == NULL) { inp = in_pcblookup_hash(pi, saddr->v4, sport, daddr->v4, dport, INPLOOKUP_WILDCARD, NULL); if(inp == NULL) { INP_INFO_RUNLOCK(pi); return (-1); } } #else inp = in_pcbhashlookup(tb, saddr->v4, 
sport, daddr->v4, dport); if (inp == NULL) { inp = in_pcblookup_listen(tb, daddr->v4, dport, 0); if (inp == NULL) return (-1); } #endif break; #endif /* INET */ #ifdef INET6 case AF_INET6: #ifdef __FreeBSD__ INP_INFO_RLOCK(pi); inp = in6_pcblookup_hash(pi, &saddr->v6, sport, &daddr->v6, dport, 0, NULL); if (inp == NULL) { inp = in6_pcblookup_hash(pi, &saddr->v6, sport, &daddr->v6, dport, INPLOOKUP_WILDCARD, NULL); if (inp == NULL) { INP_INFO_RUNLOCK(pi); return (-1); } } #else inp = in6_pcbhashlookup(tb, &saddr->v6, sport, &daddr->v6, dport); if (inp == NULL) { inp = in6_pcblookup_listen(tb, &daddr->v6, dport, 0); if (inp == NULL) return (-1); } #endif break; #endif /* INET6 */ default: return (-1); } #ifdef __FreeBSD__ pd->lookup.uid = inp->inp_cred->cr_uid; pd->lookup.gid = inp->inp_cred->cr_groups[0]; INP_INFO_RUNLOCK(pi); #else pd->lookup.uid = inp->inp_socket->so_euid; pd->lookup.gid = inp->inp_socket->so_egid; pd->lookup.pid = inp->inp_socket->so_cpid; #endif return (1); } u_int8_t pf_get_wscale(struct mbuf *m, int off, u_int16_t th_off, sa_family_t af) { int hlen; u_int8_t hdr[60]; u_int8_t *opt, optlen; u_int8_t wscale = 0; hlen = th_off << 2; /* hlen <= sizeof(hdr) */ if (hlen <= sizeof(struct tcphdr)) return (0); if (!pf_pull_hdr(m, off, hdr, hlen, NULL, NULL, af)) return (0); opt = hdr + sizeof(struct tcphdr); hlen -= sizeof(struct tcphdr); while (hlen >= 3) { switch (*opt) { case TCPOPT_EOL: case TCPOPT_NOP: ++opt; --hlen; break; case TCPOPT_WINDOW: wscale = opt[2]; if (wscale > TCP_MAX_WINSHIFT) wscale = TCP_MAX_WINSHIFT; wscale |= PF_WSCALE_FLAG; /* FALLTHROUGH */ default: optlen = opt[1]; if (optlen < 2) optlen = 2; hlen -= optlen; opt += optlen; break; } } return (wscale); } u_int16_t pf_get_mss(struct mbuf *m, int off, u_int16_t th_off, sa_family_t af) { INIT_VNET_INET(curvnet); int hlen; u_int8_t hdr[60]; u_int8_t *opt, optlen; u_int16_t mss = V_tcp_mssdflt; hlen = th_off << 2; /* hlen <= sizeof(hdr) */ if (hlen <= sizeof(struct tcphdr)) return (0); if (!pf_pull_hdr(m, off, hdr, hlen, NULL, NULL, af)) return (0); opt = hdr + sizeof(struct tcphdr); hlen -= sizeof(struct tcphdr); while (hlen >= TCPOLEN_MAXSEG) { switch (*opt) { case TCPOPT_EOL: case TCPOPT_NOP: ++opt; --hlen; break; case TCPOPT_MAXSEG: bcopy((caddr_t)(opt + 2), (caddr_t)&mss, 2); NTOHS(mss); /* FALLTHROUGH */ default: optlen = opt[1]; if (optlen < 2) optlen = 2; hlen -= optlen; opt += optlen; break; } } return (mss); } u_int16_t pf_calc_mss(struct pf_addr *addr, sa_family_t af, u_int16_t offer) { #ifdef INET INIT_VNET_INET(curvnet); struct sockaddr_in *dst; struct route ro; #endif /* INET */ #ifdef INET6 struct sockaddr_in6 *dst6; struct route_in6 ro6; #endif /* INET6 */ struct rtentry *rt = NULL; int hlen = 0; /* make the compiler happy */ u_int16_t mss = V_tcp_mssdflt; switch (af) { #ifdef INET case AF_INET: hlen = sizeof(struct ip); bzero(&ro, sizeof(ro)); dst = (struct sockaddr_in *)&ro.ro_dst; dst->sin_family = AF_INET; dst->sin_len = sizeof(*dst); dst->sin_addr = addr->v4; #ifdef __FreeBSD__ #ifdef RTF_PRCLONING rtalloc_ign(&ro, (RTF_CLONING | RTF_PRCLONING)); #else /* !RTF_PRCLONING */ - in_rtalloc_ign(&ro, RTF_CLONING, 0); + in_rtalloc_ign(&ro, 0, 0); #endif #else /* ! 
__FreeBSD__ */ rtalloc_noclone(&ro, NO_CLONING); #endif rt = ro.ro_rt; break; #endif /* INET */ #ifdef INET6 case AF_INET6: hlen = sizeof(struct ip6_hdr); bzero(&ro6, sizeof(ro6)); dst6 = (struct sockaddr_in6 *)&ro6.ro_dst; dst6->sin6_family = AF_INET6; dst6->sin6_len = sizeof(*dst6); dst6->sin6_addr = addr->v6; #ifdef __FreeBSD__ #ifdef RTF_PRCLONING rtalloc_ign((struct route *)&ro6, (RTF_CLONING | RTF_PRCLONING)); #else /* !RTF_PRCLONING */ - rtalloc_ign((struct route *)&ro6, RTF_CLONING); + rtalloc_ign((struct route *)&ro6, 0); #endif #else /* ! __FreeBSD__ */ rtalloc_noclone((struct route *)&ro6, NO_CLONING); #endif rt = ro6.ro_rt; break; #endif /* INET6 */ } if (rt && rt->rt_ifp) { mss = rt->rt_ifp->if_mtu - hlen - sizeof(struct tcphdr); mss = max(V_tcp_mssdflt, mss); RTFREE(rt); } mss = min(mss, offer); mss = max(mss, 64); /* sanity - at least max opt space */ return (mss); } void pf_set_rt_ifp(struct pf_state *s, struct pf_addr *saddr) { struct pf_rule *r = s->rule.ptr; s->rt_kif = NULL; if (!r->rt || r->rt == PF_FASTROUTE) return; switch (s->af) { #ifdef INET case AF_INET: pf_map_addr(AF_INET, r, saddr, &s->rt_addr, NULL, &s->nat_src_node); s->rt_kif = r->rpool.cur->kif; break; #endif /* INET */ #ifdef INET6 case AF_INET6: pf_map_addr(AF_INET6, r, saddr, &s->rt_addr, NULL, &s->nat_src_node); s->rt_kif = r->rpool.cur->kif; break; #endif /* INET6 */ } } int pf_test_tcp(struct pf_rule **rm, struct pf_state **sm, int direction, struct pfi_kif *kif, struct mbuf *m, int off, void *h, #ifdef __FreeBSD__ struct pf_pdesc *pd, struct pf_rule **am, struct pf_ruleset **rsm, struct ifqueue *ifq, struct inpcb *inp) #else struct pf_pdesc *pd, struct pf_rule **am, struct pf_ruleset **rsm, struct ifqueue *ifq) #endif { INIT_VNET_INET(curvnet); struct pf_rule *nr = NULL; struct pf_addr *saddr = pd->src, *daddr = pd->dst; struct tcphdr *th = pd->hdr.tcp; u_int16_t bport, nport = 0; sa_family_t af = pd->af; struct pf_rule *r, *a = NULL; struct pf_ruleset *ruleset = NULL; struct pf_src_node *nsn = NULL; u_short reason; int rewrite = 0; int tag = -1, rtableid = -1; u_int16_t mss = V_tcp_mssdflt; int asd = 0; int match = 0; if (pf_check_congestion(ifq)) { REASON_SET(&reason, PFRES_CONGEST); return (PF_DROP); } #ifdef __FreeBSD__ if (inp != NULL) pd->lookup.done = pf_socket_lookup(direction, pd, inp); else if (debug_pfugidhack) { PF_UNLOCK(); DPFPRINTF(PF_DEBUG_MISC, ("pf: unlocked lookup\n")); pd->lookup.done = pf_socket_lookup(direction, pd, inp); PF_LOCK(); } #endif r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr); if (direction == PF_OUT) { bport = nport = th->th_sport; /* check outgoing packet for BINAT/NAT */ if ((nr = pf_get_translation(pd, m, off, PF_OUT, kif, &nsn, saddr, th->th_sport, daddr, th->th_dport, &pd->naddr, &nport)) != NULL) { PF_ACPY(&pd->baddr, saddr, af); pf_change_ap(saddr, &th->th_sport, pd->ip_sum, &th->th_sum, &pd->naddr, nport, 0, af); rewrite++; if (nr->natpass) r = NULL; pd->nat_rule = nr; } } else { bport = nport = th->th_dport; /* check incoming packet for BINAT/RDR */ if ((nr = pf_get_translation(pd, m, off, PF_IN, kif, &nsn, saddr, th->th_sport, daddr, th->th_dport, &pd->naddr, &nport)) != NULL) { PF_ACPY(&pd->baddr, daddr, af); pf_change_ap(daddr, &th->th_dport, pd->ip_sum, &th->th_sum, &pd->naddr, nport, 0, af); rewrite++; if (nr->natpass) r = NULL; pd->nat_rule = nr; } } while (r != NULL) { r->evaluations++; if (pfi_kif_match(r->kif, kif) == r->ifnot) r = r->skip[PF_SKIP_IFP].ptr; else if (r->direction && r->direction != direction) r = 
r->skip[PF_SKIP_DIR].ptr; else if (r->af && r->af != af) r = r->skip[PF_SKIP_AF].ptr; else if (r->proto && r->proto != IPPROTO_TCP) r = r->skip[PF_SKIP_PROTO].ptr; else if (PF_MISMATCHAW(&r->src.addr, saddr, af, r->src.neg, kif)) r = r->skip[PF_SKIP_SRC_ADDR].ptr; else if (r->src.port_op && !pf_match_port(r->src.port_op, r->src.port[0], r->src.port[1], th->th_sport)) r = r->skip[PF_SKIP_SRC_PORT].ptr; else if (PF_MISMATCHAW(&r->dst.addr, daddr, af, r->dst.neg, NULL)) r = r->skip[PF_SKIP_DST_ADDR].ptr; else if (r->dst.port_op && !pf_match_port(r->dst.port_op, r->dst.port[0], r->dst.port[1], th->th_dport)) r = r->skip[PF_SKIP_DST_PORT].ptr; else if (r->tos && !(r->tos == pd->tos)) r = TAILQ_NEXT(r, entries); else if (r->rule_flag & PFRULE_FRAGMENT) r = TAILQ_NEXT(r, entries); else if ((r->flagset & th->th_flags) != r->flags) r = TAILQ_NEXT(r, entries); else if (r->uid.op && (pd->lookup.done || (pd->lookup.done = #ifdef __FreeBSD__ pf_socket_lookup(direction, pd, inp), 1)) && #else pf_socket_lookup(direction, pd), 1)) && #endif !pf_match_uid(r->uid.op, r->uid.uid[0], r->uid.uid[1], pd->lookup.uid)) r = TAILQ_NEXT(r, entries); else if (r->gid.op && (pd->lookup.done || (pd->lookup.done = #ifdef __FreeBSD__ pf_socket_lookup(direction, pd, inp), 1)) && #else pf_socket_lookup(direction, pd), 1)) && #endif !pf_match_gid(r->gid.op, r->gid.gid[0], r->gid.gid[1], pd->lookup.gid)) r = TAILQ_NEXT(r, entries); else if (r->prob && r->prob <= arc4random()) r = TAILQ_NEXT(r, entries); else if (r->match_tag && !pf_match_tag(m, r, pd->pf_mtag, &tag)) r = TAILQ_NEXT(r, entries); else if (r->os_fingerprint != PF_OSFP_ANY && !pf_osfp_match( pf_osfp_fingerprint(pd, m, off, th), r->os_fingerprint)) r = TAILQ_NEXT(r, entries); else { if (r->tag) tag = r->tag; if (r->rtableid >= 0) rtableid = r->rtableid; if (r->anchor == NULL) { match = 1; *rm = r; *am = a; *rsm = ruleset; if ((*rm)->quick) break; r = TAILQ_NEXT(r, entries); } else pf_step_into_anchor(&asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match); } if (r == NULL && pf_step_out_of_anchor(&asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match)) break; } r = *rm; a = *am; ruleset = *rsm; REASON_SET(&reason, PFRES_MATCH); if (r->log || (nr != NULL && nr->natpass && nr->log)) { if (rewrite) #ifdef __FreeBSD__ m_copyback(m, off, sizeof(*th), (caddr_t)th); #else m_copyback(m, off, sizeof(*th), th); #endif PFLOG_PACKET(kif, h, m, af, direction, reason, r->log ? 
r : nr, a, ruleset, pd); } if ((r->action == PF_DROP) && ((r->rule_flag & PFRULE_RETURNRST) || (r->rule_flag & PFRULE_RETURNICMP) || (r->rule_flag & PFRULE_RETURN))) { /* undo NAT changes, if they have taken place */ if (nr != NULL) { if (direction == PF_OUT) { pf_change_ap(saddr, &th->th_sport, pd->ip_sum, &th->th_sum, &pd->baddr, bport, 0, af); rewrite++; } else { pf_change_ap(daddr, &th->th_dport, pd->ip_sum, &th->th_sum, &pd->baddr, bport, 0, af); rewrite++; } } if (((r->rule_flag & PFRULE_RETURNRST) || (r->rule_flag & PFRULE_RETURN)) && !(th->th_flags & TH_RST)) { u_int32_t ack = ntohl(th->th_seq) + pd->p_len; if (th->th_flags & TH_SYN) ack++; if (th->th_flags & TH_FIN) ack++; #ifdef __FreeBSD__ pf_send_tcp(m, r, af, pd->dst, #else pf_send_tcp(r, af, pd->dst, #endif pd->src, th->th_dport, th->th_sport, ntohl(th->th_ack), ack, TH_RST|TH_ACK, 0, 0, r->return_ttl, 1, 0, pd->eh, kif->pfik_ifp); } else if ((af == AF_INET) && r->return_icmp) pf_send_icmp(m, r->return_icmp >> 8, r->return_icmp & 255, af, r); else if ((af == AF_INET6) && r->return_icmp6) pf_send_icmp(m, r->return_icmp6 >> 8, r->return_icmp6 & 255, af, r); } if (r->action == PF_DROP) return (PF_DROP); if (pf_tag_packet(m, pd->pf_mtag, tag, rtableid)) { REASON_SET(&reason, PFRES_MEMORY); return (PF_DROP); } if (r->keep_state || nr != NULL || (pd->flags & PFDESC_TCP_NORM)) { /* create new state */ u_int16_t len; struct pf_state *s = NULL; struct pf_src_node *sn = NULL; len = pd->tot_len - off - (th->th_off << 2); /* check maximums */ if (r->max_states && (r->states >= r->max_states)) { pf_status.lcounters[LCNT_STATES]++; REASON_SET(&reason, PFRES_MAXSTATES); goto cleanup; } /* src node for filter rule */ if ((r->rule_flag & PFRULE_SRCTRACK || r->rpool.opts & PF_POOL_STICKYADDR) && pf_insert_src_node(&sn, r, saddr, af) != 0) { REASON_SET(&reason, PFRES_SRCLIMIT); goto cleanup; } /* src node for translation rule */ if (nr != NULL && (nr->rpool.opts & PF_POOL_STICKYADDR) && ((direction == PF_OUT && pf_insert_src_node(&nsn, nr, &pd->baddr, af) != 0) || (pf_insert_src_node(&nsn, nr, saddr, af) != 0))) { REASON_SET(&reason, PFRES_SRCLIMIT); goto cleanup; } s = pool_get(&pf_state_pl, PR_NOWAIT); if (s == NULL) { REASON_SET(&reason, PFRES_MEMORY); cleanup: if (sn != NULL && sn->states == 0 && sn->expire == 0) { RB_REMOVE(pf_src_tree, &tree_src_tracking, sn); pf_status.scounters[SCNT_SRC_NODE_REMOVALS]++; pf_status.src_nodes--; pool_put(&pf_src_tree_pl, sn); } if (nsn != sn && nsn != NULL && nsn->states == 0 && nsn->expire == 0) { RB_REMOVE(pf_src_tree, &tree_src_tracking, nsn); pf_status.scounters[SCNT_SRC_NODE_REMOVALS]++; pf_status.src_nodes--; pool_put(&pf_src_tree_pl, nsn); } return (PF_DROP); } bzero(s, sizeof(*s)); s->rule.ptr = r; s->nat_rule.ptr = nr; s->anchor.ptr = a; STATE_INC_COUNTERS(s); s->allow_opts = r->allow_opts; s->log = r->log & PF_LOG_ALL; if (nr != NULL) s->log |= nr->log & PF_LOG_ALL; s->proto = IPPROTO_TCP; s->direction = direction; s->af = af; if (direction == PF_OUT) { PF_ACPY(&s->gwy.addr, saddr, af); s->gwy.port = th->th_sport; /* sport */ PF_ACPY(&s->ext.addr, daddr, af); s->ext.port = th->th_dport; if (nr != NULL) { PF_ACPY(&s->lan.addr, &pd->baddr, af); s->lan.port = bport; } else { PF_ACPY(&s->lan.addr, &s->gwy.addr, af); s->lan.port = s->gwy.port; } } else { PF_ACPY(&s->lan.addr, daddr, af); s->lan.port = th->th_dport; PF_ACPY(&s->ext.addr, saddr, af); s->ext.port = th->th_sport; if (nr != NULL) { PF_ACPY(&s->gwy.addr, &pd->baddr, af); s->gwy.port = bport; } else { PF_ACPY(&s->gwy.addr, &s->lan.addr, 
af); s->gwy.port = s->lan.port; } } s->src.seqlo = ntohl(th->th_seq); s->src.seqhi = s->src.seqlo + len + 1; if ((th->th_flags & (TH_SYN|TH_ACK)) == TH_SYN && r->keep_state == PF_STATE_MODULATE) { /* Generate sequence number modulator */ #ifdef __FreeBSD__ while ((s->src.seqdiff = pf_new_isn(s) - s->src.seqlo) == 0) ; #else while ((s->src.seqdiff = tcp_rndiss_next() - s->src.seqlo) == 0) ; #endif pf_change_a(&th->th_seq, &th->th_sum, htonl(s->src.seqlo + s->src.seqdiff), 0); rewrite = 1; } else s->src.seqdiff = 0; if (th->th_flags & TH_SYN) { s->src.seqhi++; s->src.wscale = pf_get_wscale(m, off, th->th_off, af); } s->src.max_win = MAX(ntohs(th->th_win), 1); if (s->src.wscale & PF_WSCALE_MASK) { /* Remove scale factor from initial window */ int win = s->src.max_win; win += 1 << (s->src.wscale & PF_WSCALE_MASK); s->src.max_win = (win - 1) >> (s->src.wscale & PF_WSCALE_MASK); } if (th->th_flags & TH_FIN) s->src.seqhi++; s->dst.seqhi = 1; s->dst.max_win = 1; s->src.state = TCPS_SYN_SENT; s->dst.state = TCPS_CLOSED; s->creation = time_second; s->expire = time_second; s->timeout = PFTM_TCP_FIRST_PACKET; pf_set_rt_ifp(s, saddr); if (sn != NULL) { s->src_node = sn; s->src_node->states++; } if (nsn != NULL) { PF_ACPY(&nsn->raddr, &pd->naddr, af); s->nat_src_node = nsn; s->nat_src_node->states++; } if ((pd->flags & PFDESC_TCP_NORM) && pf_normalize_tcp_init(m, off, pd, th, &s->src, &s->dst)) { REASON_SET(&reason, PFRES_MEMORY); pf_src_tree_remove_state(s); STATE_DEC_COUNTERS(s); pool_put(&pf_state_pl, s); return (PF_DROP); } if ((pd->flags & PFDESC_TCP_NORM) && s->src.scrub && pf_normalize_tcp_stateful(m, off, pd, &reason, th, s, &s->src, &s->dst, &rewrite)) { /* This really shouldn't happen!!! */ DPFPRINTF(PF_DEBUG_URGENT, ("pf_normalize_tcp_stateful failed on first pkt")); pf_normalize_tcp_cleanup(s); pf_src_tree_remove_state(s); STATE_DEC_COUNTERS(s); pool_put(&pf_state_pl, s); return (PF_DROP); } if (pf_insert_state(BOUND_IFACE(r, kif), s)) { pf_normalize_tcp_cleanup(s); REASON_SET(&reason, PFRES_STATEINS); pf_src_tree_remove_state(s); STATE_DEC_COUNTERS(s); pool_put(&pf_state_pl, s); return (PF_DROP); } else *sm = s; if (tag > 0) { pf_tag_ref(tag); s->tag = tag; } if ((th->th_flags & (TH_SYN|TH_ACK)) == TH_SYN && r->keep_state == PF_STATE_SYNPROXY) { s->src.state = PF_TCPS_PROXY_SRC; if (nr != NULL) { if (direction == PF_OUT) { pf_change_ap(saddr, &th->th_sport, pd->ip_sum, &th->th_sum, &pd->baddr, bport, 0, af); } else { pf_change_ap(daddr, &th->th_dport, pd->ip_sum, &th->th_sum, &pd->baddr, bport, 0, af); } } s->src.seqhi = htonl(arc4random()); /* Find mss option */ mss = pf_get_mss(m, off, th->th_off, af); mss = pf_calc_mss(saddr, af, mss); mss = pf_calc_mss(daddr, af, mss); s->src.mss = mss; #ifdef __FreeBSD__ pf_send_tcp(NULL, r, af, daddr, saddr, th->th_dport, #else pf_send_tcp(r, af, daddr, saddr, th->th_dport, #endif th->th_sport, s->src.seqhi, ntohl(th->th_seq) + 1, TH_SYN|TH_ACK, 0, s->src.mss, 0, 1, 0, NULL, NULL); REASON_SET(&reason, PFRES_SYNPROXY); return (PF_SYNPROXY_DROP); } } /* copy back packet headers if we performed NAT operations */ if (rewrite) m_copyback(m, off, sizeof(*th), (caddr_t)th); return (PF_PASS); } int pf_test_udp(struct pf_rule **rm, struct pf_state **sm, int direction, struct pfi_kif *kif, struct mbuf *m, int off, void *h, #ifdef __FreeBSD__ struct pf_pdesc *pd, struct pf_rule **am, struct pf_ruleset **rsm, struct ifqueue *ifq, struct inpcb *inp) #else struct pf_pdesc *pd, struct pf_rule **am, struct pf_ruleset **rsm, struct ifqueue *ifq) #endif { struct 
pf_rule *nr = NULL; struct pf_addr *saddr = pd->src, *daddr = pd->dst; struct udphdr *uh = pd->hdr.udp; u_int16_t bport, nport = 0; sa_family_t af = pd->af; struct pf_rule *r, *a = NULL; struct pf_ruleset *ruleset = NULL; struct pf_src_node *nsn = NULL; u_short reason; int rewrite = 0; int tag = -1, rtableid = -1; int asd = 0; int match = 0; if (pf_check_congestion(ifq)) { REASON_SET(&reason, PFRES_CONGEST); return (PF_DROP); } #ifdef __FreeBSD__ if (inp != NULL) pd->lookup.done = pf_socket_lookup(direction, pd, inp); else if (debug_pfugidhack) { PF_UNLOCK(); DPFPRINTF(PF_DEBUG_MISC, ("pf: unlocked lookup\n")); pd->lookup.done = pf_socket_lookup(direction, pd, inp); PF_LOCK(); } #endif r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr); if (direction == PF_OUT) { bport = nport = uh->uh_sport; /* check outgoing packet for BINAT/NAT */ if ((nr = pf_get_translation(pd, m, off, PF_OUT, kif, &nsn, saddr, uh->uh_sport, daddr, uh->uh_dport, &pd->naddr, &nport)) != NULL) { PF_ACPY(&pd->baddr, saddr, af); pf_change_ap(saddr, &uh->uh_sport, pd->ip_sum, &uh->uh_sum, &pd->naddr, nport, 1, af); rewrite++; if (nr->natpass) r = NULL; pd->nat_rule = nr; } } else { bport = nport = uh->uh_dport; /* check incoming packet for BINAT/RDR */ if ((nr = pf_get_translation(pd, m, off, PF_IN, kif, &nsn, saddr, uh->uh_sport, daddr, uh->uh_dport, &pd->naddr, &nport)) != NULL) { PF_ACPY(&pd->baddr, daddr, af); pf_change_ap(daddr, &uh->uh_dport, pd->ip_sum, &uh->uh_sum, &pd->naddr, nport, 1, af); rewrite++; if (nr->natpass) r = NULL; pd->nat_rule = nr; } } while (r != NULL) { r->evaluations++; if (pfi_kif_match(r->kif, kif) == r->ifnot) r = r->skip[PF_SKIP_IFP].ptr; else if (r->direction && r->direction != direction) r = r->skip[PF_SKIP_DIR].ptr; else if (r->af && r->af != af) r = r->skip[PF_SKIP_AF].ptr; else if (r->proto && r->proto != IPPROTO_UDP) r = r->skip[PF_SKIP_PROTO].ptr; else if (PF_MISMATCHAW(&r->src.addr, saddr, af, r->src.neg, kif)) r = r->skip[PF_SKIP_SRC_ADDR].ptr; else if (r->src.port_op && !pf_match_port(r->src.port_op, r->src.port[0], r->src.port[1], uh->uh_sport)) r = r->skip[PF_SKIP_SRC_PORT].ptr; else if (PF_MISMATCHAW(&r->dst.addr, daddr, af, r->dst.neg, NULL)) r = r->skip[PF_SKIP_DST_ADDR].ptr; else if (r->dst.port_op && !pf_match_port(r->dst.port_op, r->dst.port[0], r->dst.port[1], uh->uh_dport)) r = r->skip[PF_SKIP_DST_PORT].ptr; else if (r->tos && !(r->tos == pd->tos)) r = TAILQ_NEXT(r, entries); else if (r->rule_flag & PFRULE_FRAGMENT) r = TAILQ_NEXT(r, entries); else if (r->uid.op && (pd->lookup.done || (pd->lookup.done = #ifdef __FreeBSD__ pf_socket_lookup(direction, pd, inp), 1)) && #else pf_socket_lookup(direction, pd), 1)) && #endif !pf_match_uid(r->uid.op, r->uid.uid[0], r->uid.uid[1], pd->lookup.uid)) r = TAILQ_NEXT(r, entries); else if (r->gid.op && (pd->lookup.done || (pd->lookup.done = #ifdef __FreeBSD__ pf_socket_lookup(direction, pd, inp), 1)) && #else pf_socket_lookup(direction, pd), 1)) && #endif !pf_match_gid(r->gid.op, r->gid.gid[0], r->gid.gid[1], pd->lookup.gid)) r = TAILQ_NEXT(r, entries); else if (r->prob && r->prob <= arc4random()) r = TAILQ_NEXT(r, entries); else if (r->match_tag && !pf_match_tag(m, r, pd->pf_mtag, &tag)) r = TAILQ_NEXT(r, entries); else if (r->os_fingerprint != PF_OSFP_ANY) r = TAILQ_NEXT(r, entries); else { if (r->tag) tag = r->tag; if (r->rtableid >= 0) rtableid = r->rtableid; if (r->anchor == NULL) { match = 1; *rm = r; *am = a; *rsm = ruleset; if ((*rm)->quick) break; r = TAILQ_NEXT(r, entries); } else pf_step_into_anchor(&asd, 
&ruleset, PF_RULESET_FILTER, &r, &a, &match); } if (r == NULL && pf_step_out_of_anchor(&asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match)) break; } r = *rm; a = *am; ruleset = *rsm; REASON_SET(&reason, PFRES_MATCH); if (r->log || (nr != NULL && nr->natpass && nr->log)) { if (rewrite) #ifdef __FreeBSD__ m_copyback(m, off, sizeof(*uh), (caddr_t)uh); #else m_copyback(m, off, sizeof(*uh), uh); #endif PFLOG_PACKET(kif, h, m, af, direction, reason, r->log ? r : nr, a, ruleset, pd); } if ((r->action == PF_DROP) && ((r->rule_flag & PFRULE_RETURNICMP) || (r->rule_flag & PFRULE_RETURN))) { /* undo NAT changes, if they have taken place */ if (nr != NULL) { if (direction == PF_OUT) { pf_change_ap(saddr, &uh->uh_sport, pd->ip_sum, &uh->uh_sum, &pd->baddr, bport, 1, af); rewrite++; } else { pf_change_ap(daddr, &uh->uh_dport, pd->ip_sum, &uh->uh_sum, &pd->baddr, bport, 1, af); rewrite++; } } if ((af == AF_INET) && r->return_icmp) pf_send_icmp(m, r->return_icmp >> 8, r->return_icmp & 255, af, r); else if ((af == AF_INET6) && r->return_icmp6) pf_send_icmp(m, r->return_icmp6 >> 8, r->return_icmp6 & 255, af, r); } if (r->action == PF_DROP) return (PF_DROP); if (pf_tag_packet(m, pd->pf_mtag, tag, rtableid)) { REASON_SET(&reason, PFRES_MEMORY); return (PF_DROP); } if (r->keep_state || nr != NULL) { /* create new state */ struct pf_state *s = NULL; struct pf_src_node *sn = NULL; /* check maximums */ if (r->max_states && (r->states >= r->max_states)) { pf_status.lcounters[LCNT_STATES]++; REASON_SET(&reason, PFRES_MAXSTATES); goto cleanup; } /* src node for filter rule */ if ((r->rule_flag & PFRULE_SRCTRACK || r->rpool.opts & PF_POOL_STICKYADDR) && pf_insert_src_node(&sn, r, saddr, af) != 0) { REASON_SET(&reason, PFRES_SRCLIMIT); goto cleanup; } /* src node for translation rule */ if (nr != NULL && (nr->rpool.opts & PF_POOL_STICKYADDR) && ((direction == PF_OUT && pf_insert_src_node(&nsn, nr, &pd->baddr, af) != 0) || (pf_insert_src_node(&nsn, nr, saddr, af) != 0))) { REASON_SET(&reason, PFRES_SRCLIMIT); goto cleanup; } s = pool_get(&pf_state_pl, PR_NOWAIT); if (s == NULL) { REASON_SET(&reason, PFRES_MEMORY); cleanup: if (sn != NULL && sn->states == 0 && sn->expire == 0) { RB_REMOVE(pf_src_tree, &tree_src_tracking, sn); pf_status.scounters[SCNT_SRC_NODE_REMOVALS]++; pf_status.src_nodes--; pool_put(&pf_src_tree_pl, sn); } if (nsn != sn && nsn != NULL && nsn->states == 0 && nsn->expire == 0) { RB_REMOVE(pf_src_tree, &tree_src_tracking, nsn); pf_status.scounters[SCNT_SRC_NODE_REMOVALS]++; pf_status.src_nodes--; pool_put(&pf_src_tree_pl, nsn); } return (PF_DROP); } bzero(s, sizeof(*s)); s->rule.ptr = r; s->nat_rule.ptr = nr; s->anchor.ptr = a; STATE_INC_COUNTERS(s); s->allow_opts = r->allow_opts; s->log = r->log & PF_LOG_ALL; if (nr != NULL) s->log |= nr->log & PF_LOG_ALL; s->proto = IPPROTO_UDP; s->direction = direction; s->af = af; if (direction == PF_OUT) { PF_ACPY(&s->gwy.addr, saddr, af); s->gwy.port = uh->uh_sport; PF_ACPY(&s->ext.addr, daddr, af); s->ext.port = uh->uh_dport; if (nr != NULL) { PF_ACPY(&s->lan.addr, &pd->baddr, af); s->lan.port = bport; } else { PF_ACPY(&s->lan.addr, &s->gwy.addr, af); s->lan.port = s->gwy.port; } } else { PF_ACPY(&s->lan.addr, daddr, af); s->lan.port = uh->uh_dport; PF_ACPY(&s->ext.addr, saddr, af); s->ext.port = uh->uh_sport; if (nr != NULL) { PF_ACPY(&s->gwy.addr, &pd->baddr, af); s->gwy.port = bport; } else { PF_ACPY(&s->gwy.addr, &s->lan.addr, af); s->gwy.port = s->lan.port; } } s->src.state = PFUDPS_SINGLE; s->dst.state = PFUDPS_NO_TRAFFIC; s->creation = time_second; 
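		/*
		 * A fresh UDP state starts out with the
		 * PFTM_UDP_FIRST_PACKET timeout; pf_test_state_udp() below
		 * refreshes expire and moves the state to PFTM_UDP_SINGLE or
		 * PFTM_UDP_MULTIPLE depending on whether traffic has been
		 * seen in both directions.
		 */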
s->expire = time_second; s->timeout = PFTM_UDP_FIRST_PACKET; pf_set_rt_ifp(s, saddr); if (sn != NULL) { s->src_node = sn; s->src_node->states++; } if (nsn != NULL) { PF_ACPY(&nsn->raddr, &pd->naddr, af); s->nat_src_node = nsn; s->nat_src_node->states++; } if (pf_insert_state(BOUND_IFACE(r, kif), s)) { REASON_SET(&reason, PFRES_STATEINS); pf_src_tree_remove_state(s); STATE_DEC_COUNTERS(s); pool_put(&pf_state_pl, s); return (PF_DROP); } else *sm = s; if (tag > 0) { pf_tag_ref(tag); s->tag = tag; } } /* copy back packet headers if we performed NAT operations */ if (rewrite) m_copyback(m, off, sizeof(*uh), (caddr_t)uh); return (PF_PASS); } int pf_test_icmp(struct pf_rule **rm, struct pf_state **sm, int direction, struct pfi_kif *kif, struct mbuf *m, int off, void *h, struct pf_pdesc *pd, struct pf_rule **am, struct pf_ruleset **rsm, struct ifqueue *ifq) { struct pf_rule *nr = NULL; struct pf_addr *saddr = pd->src, *daddr = pd->dst; struct pf_rule *r, *a = NULL; struct pf_ruleset *ruleset = NULL; struct pf_src_node *nsn = NULL; u_short reason; u_int16_t icmpid = 0, bport, nport = 0; sa_family_t af = pd->af; u_int8_t icmptype = 0; /* make the compiler happy */ u_int8_t icmpcode = 0; /* make the compiler happy */ int state_icmp = 0; int tag = -1, rtableid = -1; #ifdef INET6 int rewrite = 0; #endif /* INET6 */ int asd = 0; int match = 0; if (pf_check_congestion(ifq)) { REASON_SET(&reason, PFRES_CONGEST); return (PF_DROP); } switch (pd->proto) { #ifdef INET case IPPROTO_ICMP: icmptype = pd->hdr.icmp->icmp_type; icmpcode = pd->hdr.icmp->icmp_code; icmpid = pd->hdr.icmp->icmp_id; if (icmptype == ICMP_UNREACH || icmptype == ICMP_SOURCEQUENCH || icmptype == ICMP_REDIRECT || icmptype == ICMP_TIMXCEED || icmptype == ICMP_PARAMPROB) state_icmp++; break; #endif /* INET */ #ifdef INET6 case IPPROTO_ICMPV6: icmptype = pd->hdr.icmp6->icmp6_type; icmpcode = pd->hdr.icmp6->icmp6_code; icmpid = pd->hdr.icmp6->icmp6_id; if (icmptype == ICMP6_DST_UNREACH || icmptype == ICMP6_PACKET_TOO_BIG || icmptype == ICMP6_TIME_EXCEEDED || icmptype == ICMP6_PARAM_PROB) state_icmp++; break; #endif /* INET6 */ } r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr); if (direction == PF_OUT) { bport = nport = icmpid; /* check outgoing packet for BINAT/NAT */ if ((nr = pf_get_translation(pd, m, off, PF_OUT, kif, &nsn, saddr, icmpid, daddr, icmpid, &pd->naddr, &nport)) != NULL) { PF_ACPY(&pd->baddr, saddr, af); switch (af) { #ifdef INET case AF_INET: pf_change_a(&saddr->v4.s_addr, pd->ip_sum, pd->naddr.v4.s_addr, 0); pd->hdr.icmp->icmp_cksum = pf_cksum_fixup( pd->hdr.icmp->icmp_cksum, icmpid, nport, 0); pd->hdr.icmp->icmp_id = nport; m_copyback(m, off, ICMP_MINLEN, (caddr_t)pd->hdr.icmp); break; #endif /* INET */ #ifdef INET6 case AF_INET6: pf_change_a6(saddr, &pd->hdr.icmp6->icmp6_cksum, &pd->naddr, 0); rewrite++; break; #endif /* INET6 */ } if (nr->natpass) r = NULL; pd->nat_rule = nr; } } else { bport = nport = icmpid; /* check incoming packet for BINAT/RDR */ if ((nr = pf_get_translation(pd, m, off, PF_IN, kif, &nsn, saddr, icmpid, daddr, icmpid, &pd->naddr, &nport)) != NULL) { PF_ACPY(&pd->baddr, daddr, af); switch (af) { #ifdef INET case AF_INET: pf_change_a(&daddr->v4.s_addr, pd->ip_sum, pd->naddr.v4.s_addr, 0); break; #endif /* INET */ #ifdef INET6 case AF_INET6: pf_change_a6(daddr, &pd->hdr.icmp6->icmp6_cksum, &pd->naddr, 0); rewrite++; break; #endif /* INET6 */ } if (nr->natpass) r = NULL; pd->nat_rule = nr; } } while (r != NULL) { r->evaluations++; if (pfi_kif_match(r->kif, kif) == r->ifnot) r = 
r->skip[PF_SKIP_IFP].ptr; else if (r->direction && r->direction != direction) r = r->skip[PF_SKIP_DIR].ptr; else if (r->af && r->af != af) r = r->skip[PF_SKIP_AF].ptr; else if (r->proto && r->proto != pd->proto) r = r->skip[PF_SKIP_PROTO].ptr; else if (PF_MISMATCHAW(&r->src.addr, saddr, af, r->src.neg, kif)) r = r->skip[PF_SKIP_SRC_ADDR].ptr; else if (PF_MISMATCHAW(&r->dst.addr, daddr, af, r->dst.neg, NULL)) r = r->skip[PF_SKIP_DST_ADDR].ptr; else if (r->type && r->type != icmptype + 1) r = TAILQ_NEXT(r, entries); else if (r->code && r->code != icmpcode + 1) r = TAILQ_NEXT(r, entries); else if (r->tos && !(r->tos == pd->tos)) r = TAILQ_NEXT(r, entries); else if (r->rule_flag & PFRULE_FRAGMENT) r = TAILQ_NEXT(r, entries); else if (r->prob && r->prob <= arc4random()) r = TAILQ_NEXT(r, entries); else if (r->match_tag && !pf_match_tag(m, r, pd->pf_mtag, &tag)) r = TAILQ_NEXT(r, entries); else if (r->os_fingerprint != PF_OSFP_ANY) r = TAILQ_NEXT(r, entries); else { if (r->tag) tag = r->tag; if (r->rtableid >= 0) rtableid = r->rtableid; if (r->anchor == NULL) { match = 1; *rm = r; *am = a; *rsm = ruleset; if ((*rm)->quick) break; r = TAILQ_NEXT(r, entries); } else pf_step_into_anchor(&asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match); } if (r == NULL && pf_step_out_of_anchor(&asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match)) break; } r = *rm; a = *am; ruleset = *rsm; REASON_SET(&reason, PFRES_MATCH); if (r->log || (nr != NULL && nr->natpass && nr->log)) { #ifdef INET6 if (rewrite) m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t)pd->hdr.icmp6); #endif /* INET6 */ PFLOG_PACKET(kif, h, m, af, direction, reason, r->log ? r : nr, a, ruleset, pd); } if (r->action != PF_PASS) return (PF_DROP); if (pf_tag_packet(m, pd->pf_mtag, tag, rtableid)) { REASON_SET(&reason, PFRES_MEMORY); return (PF_DROP); } if (!state_icmp && (r->keep_state || nr != NULL)) { /* create new state */ struct pf_state *s = NULL; struct pf_src_node *sn = NULL; /* check maximums */ if (r->max_states && (r->states >= r->max_states)) { pf_status.lcounters[LCNT_STATES]++; REASON_SET(&reason, PFRES_MAXSTATES); goto cleanup; } /* src node for filter rule */ if ((r->rule_flag & PFRULE_SRCTRACK || r->rpool.opts & PF_POOL_STICKYADDR) && pf_insert_src_node(&sn, r, saddr, af) != 0) { REASON_SET(&reason, PFRES_SRCLIMIT); goto cleanup; } /* src node for translation rule */ if (nr != NULL && (nr->rpool.opts & PF_POOL_STICKYADDR) && ((direction == PF_OUT && pf_insert_src_node(&nsn, nr, &pd->baddr, af) != 0) || (pf_insert_src_node(&nsn, nr, saddr, af) != 0))) { REASON_SET(&reason, PFRES_SRCLIMIT); goto cleanup; } s = pool_get(&pf_state_pl, PR_NOWAIT); if (s == NULL) { REASON_SET(&reason, PFRES_MEMORY); cleanup: if (sn != NULL && sn->states == 0 && sn->expire == 0) { RB_REMOVE(pf_src_tree, &tree_src_tracking, sn); pf_status.scounters[SCNT_SRC_NODE_REMOVALS]++; pf_status.src_nodes--; pool_put(&pf_src_tree_pl, sn); } if (nsn != sn && nsn != NULL && nsn->states == 0 && nsn->expire == 0) { RB_REMOVE(pf_src_tree, &tree_src_tracking, nsn); pf_status.scounters[SCNT_SRC_NODE_REMOVALS]++; pf_status.src_nodes--; pool_put(&pf_src_tree_pl, nsn); } return (PF_DROP); } bzero(s, sizeof(*s)); s->rule.ptr = r; s->nat_rule.ptr = nr; s->anchor.ptr = a; STATE_INC_COUNTERS(s); s->allow_opts = r->allow_opts; s->log = r->log & PF_LOG_ALL; if (nr != NULL) s->log |= nr->log & PF_LOG_ALL; s->proto = pd->proto; s->direction = direction; s->af = af; if (direction == PF_OUT) { PF_ACPY(&s->gwy.addr, saddr, af); s->gwy.port = nport; PF_ACPY(&s->ext.addr, daddr, af); 
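			/*
			 * ICMP states carry no transport ports: the ICMP id
			 * (possibly rewritten by NAT above) is kept in the
			 * gwy/lan port slots and the external port stays 0.
			 */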
s->ext.port = 0; if (nr != NULL) { PF_ACPY(&s->lan.addr, &pd->baddr, af); s->lan.port = bport; } else { PF_ACPY(&s->lan.addr, &s->gwy.addr, af); s->lan.port = s->gwy.port; } } else { PF_ACPY(&s->lan.addr, daddr, af); s->lan.port = nport; PF_ACPY(&s->ext.addr, saddr, af); s->ext.port = 0; if (nr != NULL) { PF_ACPY(&s->gwy.addr, &pd->baddr, af); s->gwy.port = bport; } else { PF_ACPY(&s->gwy.addr, &s->lan.addr, af); s->gwy.port = s->lan.port; } } s->creation = time_second; s->expire = time_second; s->timeout = PFTM_ICMP_FIRST_PACKET; pf_set_rt_ifp(s, saddr); if (sn != NULL) { s->src_node = sn; s->src_node->states++; } if (nsn != NULL) { PF_ACPY(&nsn->raddr, &pd->naddr, af); s->nat_src_node = nsn; s->nat_src_node->states++; } if (pf_insert_state(BOUND_IFACE(r, kif), s)) { REASON_SET(&reason, PFRES_STATEINS); pf_src_tree_remove_state(s); STATE_DEC_COUNTERS(s); pool_put(&pf_state_pl, s); return (PF_DROP); } else *sm = s; if (tag > 0) { pf_tag_ref(tag); s->tag = tag; } } #ifdef INET6 /* copy back packet headers if we performed IPv6 NAT operations */ if (rewrite) m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t)pd->hdr.icmp6); #endif /* INET6 */ return (PF_PASS); } int pf_test_other(struct pf_rule **rm, struct pf_state **sm, int direction, struct pfi_kif *kif, struct mbuf *m, int off, void *h, struct pf_pdesc *pd, struct pf_rule **am, struct pf_ruleset **rsm, struct ifqueue *ifq) { struct pf_rule *nr = NULL; struct pf_rule *r, *a = NULL; struct pf_ruleset *ruleset = NULL; struct pf_src_node *nsn = NULL; struct pf_addr *saddr = pd->src, *daddr = pd->dst; sa_family_t af = pd->af; u_short reason; int tag = -1, rtableid = -1; int asd = 0; int match = 0; if (pf_check_congestion(ifq)) { REASON_SET(&reason, PFRES_CONGEST); return (PF_DROP); } r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr); if (direction == PF_OUT) { /* check outgoing packet for BINAT/NAT */ if ((nr = pf_get_translation(pd, m, off, PF_OUT, kif, &nsn, saddr, 0, daddr, 0, &pd->naddr, NULL)) != NULL) { PF_ACPY(&pd->baddr, saddr, af); switch (af) { #ifdef INET case AF_INET: pf_change_a(&saddr->v4.s_addr, pd->ip_sum, pd->naddr.v4.s_addr, 0); break; #endif /* INET */ #ifdef INET6 case AF_INET6: PF_ACPY(saddr, &pd->naddr, af); break; #endif /* INET6 */ } if (nr->natpass) r = NULL; pd->nat_rule = nr; } } else { /* check incoming packet for BINAT/RDR */ if ((nr = pf_get_translation(pd, m, off, PF_IN, kif, &nsn, saddr, 0, daddr, 0, &pd->naddr, NULL)) != NULL) { PF_ACPY(&pd->baddr, daddr, af); switch (af) { #ifdef INET case AF_INET: pf_change_a(&daddr->v4.s_addr, pd->ip_sum, pd->naddr.v4.s_addr, 0); break; #endif /* INET */ #ifdef INET6 case AF_INET6: PF_ACPY(daddr, &pd->naddr, af); break; #endif /* INET6 */ } if (nr->natpass) r = NULL; pd->nat_rule = nr; } } while (r != NULL) { r->evaluations++; if (pfi_kif_match(r->kif, kif) == r->ifnot) r = r->skip[PF_SKIP_IFP].ptr; else if (r->direction && r->direction != direction) r = r->skip[PF_SKIP_DIR].ptr; else if (r->af && r->af != af) r = r->skip[PF_SKIP_AF].ptr; else if (r->proto && r->proto != pd->proto) r = r->skip[PF_SKIP_PROTO].ptr; else if (PF_MISMATCHAW(&r->src.addr, pd->src, af, r->src.neg, kif)) r = r->skip[PF_SKIP_SRC_ADDR].ptr; else if (PF_MISMATCHAW(&r->dst.addr, pd->dst, af, r->dst.neg, NULL)) r = r->skip[PF_SKIP_DST_ADDR].ptr; else if (r->tos && !(r->tos == pd->tos)) r = TAILQ_NEXT(r, entries); else if (r->rule_flag & PFRULE_FRAGMENT) r = TAILQ_NEXT(r, entries); else if (r->prob && r->prob <= arc4random()) r = TAILQ_NEXT(r, entries); else if (r->match_tag && 
!pf_match_tag(m, r, pd->pf_mtag, &tag)) r = TAILQ_NEXT(r, entries); else if (r->os_fingerprint != PF_OSFP_ANY) r = TAILQ_NEXT(r, entries); else { if (r->tag) tag = r->tag; if (r->rtableid >= 0) rtableid = r->rtableid; if (r->anchor == NULL) { match = 1; *rm = r; *am = a; *rsm = ruleset; if ((*rm)->quick) break; r = TAILQ_NEXT(r, entries); } else pf_step_into_anchor(&asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match); } if (r == NULL && pf_step_out_of_anchor(&asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match)) break; } r = *rm; a = *am; ruleset = *rsm; REASON_SET(&reason, PFRES_MATCH); if (r->log || (nr != NULL && nr->natpass && nr->log)) PFLOG_PACKET(kif, h, m, af, direction, reason, r->log ? r : nr, a, ruleset, pd); if ((r->action == PF_DROP) && ((r->rule_flag & PFRULE_RETURNICMP) || (r->rule_flag & PFRULE_RETURN))) { struct pf_addr *a = NULL; if (nr != NULL) { if (direction == PF_OUT) a = saddr; else a = daddr; } if (a != NULL) { switch (af) { #ifdef INET case AF_INET: pf_change_a(&a->v4.s_addr, pd->ip_sum, pd->baddr.v4.s_addr, 0); break; #endif /* INET */ #ifdef INET6 case AF_INET6: PF_ACPY(a, &pd->baddr, af); break; #endif /* INET6 */ } } if ((af == AF_INET) && r->return_icmp) pf_send_icmp(m, r->return_icmp >> 8, r->return_icmp & 255, af, r); else if ((af == AF_INET6) && r->return_icmp6) pf_send_icmp(m, r->return_icmp6 >> 8, r->return_icmp6 & 255, af, r); } if (r->action != PF_PASS) return (PF_DROP); if (pf_tag_packet(m, pd->pf_mtag, tag, rtableid)) { REASON_SET(&reason, PFRES_MEMORY); return (PF_DROP); } if (r->keep_state || nr != NULL) { /* create new state */ struct pf_state *s = NULL; struct pf_src_node *sn = NULL; /* check maximums */ if (r->max_states && (r->states >= r->max_states)) { pf_status.lcounters[LCNT_STATES]++; REASON_SET(&reason, PFRES_MAXSTATES); goto cleanup; } /* src node for filter rule */ if ((r->rule_flag & PFRULE_SRCTRACK || r->rpool.opts & PF_POOL_STICKYADDR) && pf_insert_src_node(&sn, r, saddr, af) != 0) { REASON_SET(&reason, PFRES_SRCLIMIT); goto cleanup; } /* src node for translation rule */ if (nr != NULL && (nr->rpool.opts & PF_POOL_STICKYADDR) && ((direction == PF_OUT && pf_insert_src_node(&nsn, nr, &pd->baddr, af) != 0) || (pf_insert_src_node(&nsn, nr, saddr, af) != 0))) { REASON_SET(&reason, PFRES_SRCLIMIT); goto cleanup; } s = pool_get(&pf_state_pl, PR_NOWAIT); if (s == NULL) { REASON_SET(&reason, PFRES_MEMORY); cleanup: if (sn != NULL && sn->states == 0 && sn->expire == 0) { RB_REMOVE(pf_src_tree, &tree_src_tracking, sn); pf_status.scounters[SCNT_SRC_NODE_REMOVALS]++; pf_status.src_nodes--; pool_put(&pf_src_tree_pl, sn); } if (nsn != sn && nsn != NULL && nsn->states == 0 && nsn->expire == 0) { RB_REMOVE(pf_src_tree, &tree_src_tracking, nsn); pf_status.scounters[SCNT_SRC_NODE_REMOVALS]++; pf_status.src_nodes--; pool_put(&pf_src_tree_pl, nsn); } return (PF_DROP); } bzero(s, sizeof(*s)); s->rule.ptr = r; s->nat_rule.ptr = nr; s->anchor.ptr = a; STATE_INC_COUNTERS(s); s->allow_opts = r->allow_opts; s->log = r->log & PF_LOG_ALL; if (nr != NULL) s->log |= nr->log & PF_LOG_ALL; s->proto = pd->proto; s->direction = direction; s->af = af; if (direction == PF_OUT) { PF_ACPY(&s->gwy.addr, saddr, af); PF_ACPY(&s->ext.addr, daddr, af); if (nr != NULL) PF_ACPY(&s->lan.addr, &pd->baddr, af); else PF_ACPY(&s->lan.addr, &s->gwy.addr, af); } else { PF_ACPY(&s->lan.addr, daddr, af); PF_ACPY(&s->ext.addr, saddr, af); if (nr != NULL) PF_ACPY(&s->gwy.addr, &pd->baddr, af); else PF_ACPY(&s->gwy.addr, &s->lan.addr, af); } s->src.state = PFOTHERS_SINGLE; s->dst.state = 
PFOTHERS_NO_TRAFFIC; s->creation = time_second; s->expire = time_second; s->timeout = PFTM_OTHER_FIRST_PACKET; pf_set_rt_ifp(s, saddr); if (sn != NULL) { s->src_node = sn; s->src_node->states++; } if (nsn != NULL) { PF_ACPY(&nsn->raddr, &pd->naddr, af); s->nat_src_node = nsn; s->nat_src_node->states++; } if (pf_insert_state(BOUND_IFACE(r, kif), s)) { REASON_SET(&reason, PFRES_STATEINS); pf_src_tree_remove_state(s); STATE_DEC_COUNTERS(s); pool_put(&pf_state_pl, s); return (PF_DROP); } else *sm = s; if (tag > 0) { pf_tag_ref(tag); s->tag = tag; } } return (PF_PASS); } int pf_test_fragment(struct pf_rule **rm, int direction, struct pfi_kif *kif, struct mbuf *m, void *h, struct pf_pdesc *pd, struct pf_rule **am, struct pf_ruleset **rsm) { struct pf_rule *r, *a = NULL; struct pf_ruleset *ruleset = NULL; sa_family_t af = pd->af; u_short reason; int tag = -1; int asd = 0; int match = 0; r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr); while (r != NULL) { r->evaluations++; if (pfi_kif_match(r->kif, kif) == r->ifnot) r = r->skip[PF_SKIP_IFP].ptr; else if (r->direction && r->direction != direction) r = r->skip[PF_SKIP_DIR].ptr; else if (r->af && r->af != af) r = r->skip[PF_SKIP_AF].ptr; else if (r->proto && r->proto != pd->proto) r = r->skip[PF_SKIP_PROTO].ptr; else if (PF_MISMATCHAW(&r->src.addr, pd->src, af, r->src.neg, kif)) r = r->skip[PF_SKIP_SRC_ADDR].ptr; else if (PF_MISMATCHAW(&r->dst.addr, pd->dst, af, r->dst.neg, NULL)) r = r->skip[PF_SKIP_DST_ADDR].ptr; else if (r->tos && !(r->tos == pd->tos)) r = TAILQ_NEXT(r, entries); else if (r->os_fingerprint != PF_OSFP_ANY) r = TAILQ_NEXT(r, entries); else if (pd->proto == IPPROTO_UDP && (r->src.port_op || r->dst.port_op)) r = TAILQ_NEXT(r, entries); else if (pd->proto == IPPROTO_TCP && (r->src.port_op || r->dst.port_op || r->flagset)) r = TAILQ_NEXT(r, entries); else if ((pd->proto == IPPROTO_ICMP || pd->proto == IPPROTO_ICMPV6) && (r->type || r->code)) r = TAILQ_NEXT(r, entries); else if (r->prob && r->prob <= arc4random()) r = TAILQ_NEXT(r, entries); else if (r->match_tag && !pf_match_tag(m, r, pd->pf_mtag, &tag)) r = TAILQ_NEXT(r, entries); else { if (r->anchor == NULL) { match = 1; *rm = r; *am = a; *rsm = ruleset; if ((*rm)->quick) break; r = TAILQ_NEXT(r, entries); } else pf_step_into_anchor(&asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match); } if (r == NULL && pf_step_out_of_anchor(&asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match)) break; } r = *rm; a = *am; ruleset = *rsm; REASON_SET(&reason, PFRES_MATCH); if (r->log) PFLOG_PACKET(kif, h, m, af, direction, reason, r, a, ruleset, pd); if (r->action != PF_PASS) return (PF_DROP); if (pf_tag_packet(m, pd->pf_mtag, tag, -1)) { REASON_SET(&reason, PFRES_MEMORY); return (PF_DROP); } return (PF_PASS); } int pf_test_state_tcp(struct pf_state **state, int direction, struct pfi_kif *kif, struct mbuf *m, int off, void *h, struct pf_pdesc *pd, u_short *reason) { struct pf_state_cmp key; struct tcphdr *th = pd->hdr.tcp; u_int16_t win = ntohs(th->th_win); u_int32_t ack, end, seq, orig_seq; u_int8_t sws, dws; int ackskew; int copyback = 0; struct pf_state_peer *src, *dst; key.af = pd->af; key.proto = IPPROTO_TCP; if (direction == PF_IN) { PF_ACPY(&key.ext.addr, pd->src, key.af); PF_ACPY(&key.gwy.addr, pd->dst, key.af); key.ext.port = th->th_sport; key.gwy.port = th->th_dport; } else { PF_ACPY(&key.lan.addr, pd->src, key.af); PF_ACPY(&key.ext.addr, pd->dst, key.af); key.lan.port = th->th_sport; key.ext.port = th->th_dport; } STATE_LOOKUP(); if (direction == (*state)->direction) { src = 
&(*state)->src; dst = &(*state)->dst; } else { src = &(*state)->dst; dst = &(*state)->src; } if ((*state)->src.state == PF_TCPS_PROXY_SRC) { if (direction != (*state)->direction) { REASON_SET(reason, PFRES_SYNPROXY); return (PF_SYNPROXY_DROP); } if (th->th_flags & TH_SYN) { if (ntohl(th->th_seq) != (*state)->src.seqlo) { REASON_SET(reason, PFRES_SYNPROXY); return (PF_DROP); } #ifdef __FreeBSD__ pf_send_tcp(NULL, (*state)->rule.ptr, pd->af, pd->dst, #else pf_send_tcp((*state)->rule.ptr, pd->af, pd->dst, #endif pd->src, th->th_dport, th->th_sport, (*state)->src.seqhi, ntohl(th->th_seq) + 1, TH_SYN|TH_ACK, 0, (*state)->src.mss, 0, 1, 0, NULL, NULL); REASON_SET(reason, PFRES_SYNPROXY); return (PF_SYNPROXY_DROP); } else if (!(th->th_flags & TH_ACK) || (ntohl(th->th_ack) != (*state)->src.seqhi + 1) || (ntohl(th->th_seq) != (*state)->src.seqlo + 1)) { REASON_SET(reason, PFRES_SYNPROXY); return (PF_DROP); } else if ((*state)->src_node != NULL && pf_src_connlimit(state)) { REASON_SET(reason, PFRES_SRCLIMIT); return (PF_DROP); } else (*state)->src.state = PF_TCPS_PROXY_DST; } if ((*state)->src.state == PF_TCPS_PROXY_DST) { struct pf_state_host *src, *dst; if (direction == PF_OUT) { src = &(*state)->gwy; dst = &(*state)->ext; } else { src = &(*state)->ext; dst = &(*state)->lan; } if (direction == (*state)->direction) { if (((th->th_flags & (TH_SYN|TH_ACK)) != TH_ACK) || (ntohl(th->th_ack) != (*state)->src.seqhi + 1) || (ntohl(th->th_seq) != (*state)->src.seqlo + 1)) { REASON_SET(reason, PFRES_SYNPROXY); return (PF_DROP); } (*state)->src.max_win = MAX(ntohs(th->th_win), 1); if ((*state)->dst.seqhi == 1) (*state)->dst.seqhi = htonl(arc4random()); #ifdef __FreeBSD__ pf_send_tcp(NULL, (*state)->rule.ptr, pd->af, &src->addr, #else pf_send_tcp((*state)->rule.ptr, pd->af, &src->addr, #endif &dst->addr, src->port, dst->port, (*state)->dst.seqhi, 0, TH_SYN, 0, (*state)->src.mss, 0, 0, (*state)->tag, NULL, NULL); REASON_SET(reason, PFRES_SYNPROXY); return (PF_SYNPROXY_DROP); } else if (((th->th_flags & (TH_SYN|TH_ACK)) != (TH_SYN|TH_ACK)) || (ntohl(th->th_ack) != (*state)->dst.seqhi + 1)) { REASON_SET(reason, PFRES_SYNPROXY); return (PF_DROP); } else { (*state)->dst.max_win = MAX(ntohs(th->th_win), 1); (*state)->dst.seqlo = ntohl(th->th_seq); #ifdef __FreeBSD__ pf_send_tcp(NULL, (*state)->rule.ptr, pd->af, pd->dst, #else pf_send_tcp((*state)->rule.ptr, pd->af, pd->dst, #endif pd->src, th->th_dport, th->th_sport, ntohl(th->th_ack), ntohl(th->th_seq) + 1, TH_ACK, (*state)->src.max_win, 0, 0, 0, (*state)->tag, NULL, NULL); #ifdef __FreeBSD__ pf_send_tcp(NULL, (*state)->rule.ptr, pd->af, &src->addr, #else pf_send_tcp((*state)->rule.ptr, pd->af, &src->addr, #endif &dst->addr, src->port, dst->port, (*state)->src.seqhi + 1, (*state)->src.seqlo + 1, TH_ACK, (*state)->dst.max_win, 0, 0, 1, 0, NULL, NULL); (*state)->src.seqdiff = (*state)->dst.seqhi - (*state)->src.seqlo; (*state)->dst.seqdiff = (*state)->src.seqhi - (*state)->dst.seqlo; (*state)->src.seqhi = (*state)->src.seqlo + (*state)->dst.max_win; (*state)->dst.seqhi = (*state)->dst.seqlo + (*state)->src.max_win; (*state)->src.wscale = (*state)->dst.wscale = 0; (*state)->src.state = (*state)->dst.state = TCPS_ESTABLISHED; REASON_SET(reason, PFRES_SYNPROXY); return (PF_SYNPROXY_DROP); } } if (((th->th_flags & (TH_SYN|TH_ACK)) == TH_SYN) && dst->state >= TCPS_FIN_WAIT_2 && src->state >= TCPS_FIN_WAIT_2) { if (pf_status.debug >= PF_DEBUG_MISC) { printf("pf: state reuse "); pf_print_state(*state); pf_print_flags(th->th_flags); printf("\n"); } /* XXX make sure it's the 
same direction ?? */ (*state)->src.state = (*state)->dst.state = TCPS_CLOSED; pf_unlink_state(*state); *state = NULL; return (PF_DROP); } if (src->wscale && dst->wscale && !(th->th_flags & TH_SYN)) { sws = src->wscale & PF_WSCALE_MASK; dws = dst->wscale & PF_WSCALE_MASK; } else sws = dws = 0; /* * Sequence tracking algorithm from Guido van Rooij's paper: * http://www.madison-gurkha.com/publications/tcp_filtering/ * tcp_filtering.ps */ orig_seq = seq = ntohl(th->th_seq); if (src->seqlo == 0) { /* First packet from this end. Set its state */ if ((pd->flags & PFDESC_TCP_NORM || dst->scrub) && src->scrub == NULL) { if (pf_normalize_tcp_init(m, off, pd, th, src, dst)) { REASON_SET(reason, PFRES_MEMORY); return (PF_DROP); } } /* Deferred generation of sequence number modulator */ if (dst->seqdiff && !src->seqdiff) { #ifdef __FreeBSD__ while ((src->seqdiff = pf_new_isn(*state) - seq) == 0) ; #else while ((src->seqdiff = tcp_rndiss_next() - seq) == 0) ; #endif ack = ntohl(th->th_ack) - dst->seqdiff; pf_change_a(&th->th_seq, &th->th_sum, htonl(seq + src->seqdiff), 0); pf_change_a(&th->th_ack, &th->th_sum, htonl(ack), 0); copyback = 1; } else { ack = ntohl(th->th_ack); } end = seq + pd->p_len; if (th->th_flags & TH_SYN) { end++; if (dst->wscale & PF_WSCALE_FLAG) { src->wscale = pf_get_wscale(m, off, th->th_off, pd->af); if (src->wscale & PF_WSCALE_FLAG) { /* Remove scale factor from initial * window */ sws = src->wscale & PF_WSCALE_MASK; win = ((u_int32_t)win + (1 << sws) - 1) >> sws; dws = dst->wscale & PF_WSCALE_MASK; } else { /* fixup other window */ dst->max_win <<= dst->wscale & PF_WSCALE_MASK; /* in case of a retrans SYN|ACK */ dst->wscale = 0; } } } if (th->th_flags & TH_FIN) end++; src->seqlo = seq; if (src->state < TCPS_SYN_SENT) src->state = TCPS_SYN_SENT; /* * May need to slide the window (seqhi may have been set by * the crappy stack check or if we picked up the connection * after establishment) */ if (src->seqhi == 1 || SEQ_GEQ(end + MAX(1, dst->max_win << dws), src->seqhi)) src->seqhi = end + MAX(1, dst->max_win << dws); if (win > src->max_win) src->max_win = win; } else { ack = ntohl(th->th_ack) - dst->seqdiff; if (src->seqdiff) { /* Modulate sequence numbers */ pf_change_a(&th->th_seq, &th->th_sum, htonl(seq + src->seqdiff), 0); pf_change_a(&th->th_ack, &th->th_sum, htonl(ack), 0); copyback = 1; } end = seq + pd->p_len; if (th->th_flags & TH_SYN) end++; if (th->th_flags & TH_FIN) end++; } if ((th->th_flags & TH_ACK) == 0) { /* Let it pass through the ack skew check */ ack = dst->seqlo; } else if ((ack == 0 && (th->th_flags & (TH_ACK|TH_RST)) == (TH_ACK|TH_RST)) || /* broken tcp stacks do not set ack */ (dst->state < TCPS_SYN_SENT)) { /* * Many stacks (ours included) will set the ACK number in an * FIN|ACK if the SYN times out -- no sequence to ACK. */ ack = dst->seqlo; } if (seq == end) { /* Ease sequencing restrictions on no data packets */ seq = src->seqlo; end = seq; } ackskew = dst->seqlo - ack; /* * Need to demodulate the sequence numbers in any TCP SACK options * (Selective ACK). We could optionally validate the SACK values * against the current ACK window, either forwards or backwards, but * I'm not confident that SACK has been implemented properly * everywhere. It wouldn't surprise me if several stacks accidently * SACK too far backwards of previously ACKed data. There really aren't * any security implications of bad SACKing unless the target stack * doesn't validate the option length correctly. 
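	 * (The code below only attempts this when dst->seqdiff is set and
	 * the TCP header is long enough to carry options.)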
Someone trying to * spoof into a TCP connection won't bother blindly sending SACK * options anyway. */ if (dst->seqdiff && (th->th_off << 2) > sizeof(struct tcphdr)) { if (pf_modulate_sack(m, off, pd, th, dst)) copyback = 1; } #define MAXACKWINDOW (0xffff + 1500) /* 1500 is an arbitrary fudge factor */ if (SEQ_GEQ(src->seqhi, end) && /* Last octet inside other's window space */ SEQ_GEQ(seq, src->seqlo - (dst->max_win << dws)) && /* Retrans: not more than one window back */ (ackskew >= -MAXACKWINDOW) && /* Acking not more than one reassembled fragment backwards */ (ackskew <= (MAXACKWINDOW << sws)) && /* Acking not more than one window forward */ ((th->th_flags & TH_RST) == 0 || orig_seq == src->seqlo || (orig_seq == src->seqlo + 1) || (pd->flags & PFDESC_IP_REAS) == 0)) { /* Require an exact/+1 sequence match on resets when possible */ if (dst->scrub || src->scrub) { if (pf_normalize_tcp_stateful(m, off, pd, reason, th, *state, src, dst, ©back)) return (PF_DROP); } /* update max window */ if (src->max_win < win) src->max_win = win; /* synchronize sequencing */ if (SEQ_GT(end, src->seqlo)) src->seqlo = end; /* slide the window of what the other end can send */ if (SEQ_GEQ(ack + (win << sws), dst->seqhi)) dst->seqhi = ack + MAX((win << sws), 1); /* update states */ if (th->th_flags & TH_SYN) if (src->state < TCPS_SYN_SENT) src->state = TCPS_SYN_SENT; if (th->th_flags & TH_FIN) if (src->state < TCPS_CLOSING) src->state = TCPS_CLOSING; if (th->th_flags & TH_ACK) { if (dst->state == TCPS_SYN_SENT) { dst->state = TCPS_ESTABLISHED; if (src->state == TCPS_ESTABLISHED && (*state)->src_node != NULL && pf_src_connlimit(state)) { REASON_SET(reason, PFRES_SRCLIMIT); return (PF_DROP); } } else if (dst->state == TCPS_CLOSING) dst->state = TCPS_FIN_WAIT_2; } if (th->th_flags & TH_RST) src->state = dst->state = TCPS_TIME_WAIT; /* update expire time */ (*state)->expire = time_second; if (src->state >= TCPS_FIN_WAIT_2 && dst->state >= TCPS_FIN_WAIT_2) (*state)->timeout = PFTM_TCP_CLOSED; else if (src->state >= TCPS_CLOSING && dst->state >= TCPS_CLOSING) (*state)->timeout = PFTM_TCP_FIN_WAIT; else if (src->state < TCPS_ESTABLISHED || dst->state < TCPS_ESTABLISHED) (*state)->timeout = PFTM_TCP_OPENING; else if (src->state >= TCPS_CLOSING || dst->state >= TCPS_CLOSING) (*state)->timeout = PFTM_TCP_CLOSING; else (*state)->timeout = PFTM_TCP_ESTABLISHED; /* Fall through to PASS packet */ } else if ((dst->state < TCPS_SYN_SENT || dst->state >= TCPS_FIN_WAIT_2 || src->state >= TCPS_FIN_WAIT_2) && SEQ_GEQ(src->seqhi + MAXACKWINDOW, end) && /* Within a window forward of the originating packet */ SEQ_GEQ(seq, src->seqlo - MAXACKWINDOW)) { /* Within a window backward of the originating packet */ /* * This currently handles three situations: * 1) Stupid stacks will shotgun SYNs before their peer * replies. * 2) When PF catches an already established stream (the * firewall rebooted, the state table was flushed, routes * changed...) * 3) Packets get funky immediately after the connection * closes (this should catch Solaris spurious ACK|FINs * that web servers like to spew after a close) * * This must be a little more careful than the above code * since packet floods will also be caught here. We don't * update the TTL here to mitigate the damage of a packet * flood and so the same code can handle awkward establishment * and a loosened connection close. * In the establishment case, a correct peer response will * validate the connection, go through the normal state code * and keep updating the state TTL. 
*/ if (pf_status.debug >= PF_DEBUG_MISC) { printf("pf: loose state match: "); pf_print_state(*state); pf_print_flags(th->th_flags); printf(" seq=%u (%u) ack=%u len=%u ackskew=%d " "pkts=%llu:%llu\n", seq, orig_seq, ack, pd->p_len, #ifdef __FreeBSD__ ackskew, (unsigned long long)(*state)->packets[0], (unsigned long long)(*state)->packets[1]); #else ackskew, (*state)->packets[0], (*state)->packets[1]); #endif } if (dst->scrub || src->scrub) { if (pf_normalize_tcp_stateful(m, off, pd, reason, th, *state, src, dst, ©back)) return (PF_DROP); } /* update max window */ if (src->max_win < win) src->max_win = win; /* synchronize sequencing */ if (SEQ_GT(end, src->seqlo)) src->seqlo = end; /* slide the window of what the other end can send */ if (SEQ_GEQ(ack + (win << sws), dst->seqhi)) dst->seqhi = ack + MAX((win << sws), 1); /* * Cannot set dst->seqhi here since this could be a shotgunned * SYN and not an already established connection. */ if (th->th_flags & TH_FIN) if (src->state < TCPS_CLOSING) src->state = TCPS_CLOSING; if (th->th_flags & TH_RST) src->state = dst->state = TCPS_TIME_WAIT; /* Fall through to PASS packet */ } else { if ((*state)->dst.state == TCPS_SYN_SENT && (*state)->src.state == TCPS_SYN_SENT) { /* Send RST for state mismatches during handshake */ if (!(th->th_flags & TH_RST)) #ifdef __FreeBSD__ pf_send_tcp(m, (*state)->rule.ptr, pd->af, #else pf_send_tcp((*state)->rule.ptr, pd->af, #endif pd->dst, pd->src, th->th_dport, th->th_sport, ntohl(th->th_ack), 0, TH_RST, 0, 0, (*state)->rule.ptr->return_ttl, 1, 0, pd->eh, kif->pfik_ifp); src->seqlo = 0; src->seqhi = 1; src->max_win = 1; } else if (pf_status.debug >= PF_DEBUG_MISC) { printf("pf: BAD state: "); pf_print_state(*state); pf_print_flags(th->th_flags); printf(" seq=%u (%u) ack=%u len=%u ackskew=%d " "pkts=%llu:%llu dir=%s,%s\n", seq, orig_seq, ack, pd->p_len, ackskew, #ifdef __FreeBSD__ (unsigned long long)(*state)->packets[0], (unsigned long long)(*state)->packets[1], #else (*state)->packets[0], (*state)->packets[1], #endif direction == PF_IN ? "in" : "out", direction == (*state)->direction ? "fwd" : "rev"); printf("pf: State failure on: %c %c %c %c | %c %c\n", SEQ_GEQ(src->seqhi, end) ? ' ' : '1', SEQ_GEQ(seq, src->seqlo - (dst->max_win << dws)) ? ' ': '2', (ackskew >= -MAXACKWINDOW) ? ' ' : '3', (ackskew <= (MAXACKWINDOW << sws)) ? ' ' : '4', SEQ_GEQ(src->seqhi + MAXACKWINDOW, end) ?' ' :'5', SEQ_GEQ(seq, src->seqlo - MAXACKWINDOW) ?' 
' :'6'); } REASON_SET(reason, PFRES_BADSTATE); return (PF_DROP); } /* Any packets which have gotten here are to be passed */ /* translate source/destination address, if necessary */ if (STATE_TRANSLATE(*state)) { if (direction == PF_OUT) pf_change_ap(pd->src, &th->th_sport, pd->ip_sum, &th->th_sum, &(*state)->gwy.addr, (*state)->gwy.port, 0, pd->af); else pf_change_ap(pd->dst, &th->th_dport, pd->ip_sum, &th->th_sum, &(*state)->lan.addr, (*state)->lan.port, 0, pd->af); m_copyback(m, off, sizeof(*th), (caddr_t)th); } else if (copyback) { /* Copyback sequence modulation or stateful scrub changes */ m_copyback(m, off, sizeof(*th), (caddr_t)th); } return (PF_PASS); } int pf_test_state_udp(struct pf_state **state, int direction, struct pfi_kif *kif, struct mbuf *m, int off, void *h, struct pf_pdesc *pd) { struct pf_state_peer *src, *dst; struct pf_state_cmp key; struct udphdr *uh = pd->hdr.udp; key.af = pd->af; key.proto = IPPROTO_UDP; if (direction == PF_IN) { PF_ACPY(&key.ext.addr, pd->src, key.af); PF_ACPY(&key.gwy.addr, pd->dst, key.af); key.ext.port = uh->uh_sport; key.gwy.port = uh->uh_dport; } else { PF_ACPY(&key.lan.addr, pd->src, key.af); PF_ACPY(&key.ext.addr, pd->dst, key.af); key.lan.port = uh->uh_sport; key.ext.port = uh->uh_dport; } STATE_LOOKUP(); if (direction == (*state)->direction) { src = &(*state)->src; dst = &(*state)->dst; } else { src = &(*state)->dst; dst = &(*state)->src; } /* update states */ if (src->state < PFUDPS_SINGLE) src->state = PFUDPS_SINGLE; if (dst->state == PFUDPS_SINGLE) dst->state = PFUDPS_MULTIPLE; /* update expire time */ (*state)->expire = time_second; if (src->state == PFUDPS_MULTIPLE && dst->state == PFUDPS_MULTIPLE) (*state)->timeout = PFTM_UDP_MULTIPLE; else (*state)->timeout = PFTM_UDP_SINGLE; /* translate source/destination address, if necessary */ if (STATE_TRANSLATE(*state)) { if (direction == PF_OUT) pf_change_ap(pd->src, &uh->uh_sport, pd->ip_sum, &uh->uh_sum, &(*state)->gwy.addr, (*state)->gwy.port, 1, pd->af); else pf_change_ap(pd->dst, &uh->uh_dport, pd->ip_sum, &uh->uh_sum, &(*state)->lan.addr, (*state)->lan.port, 1, pd->af); m_copyback(m, off, sizeof(*uh), (caddr_t)uh); } return (PF_PASS); } int pf_test_state_icmp(struct pf_state **state, int direction, struct pfi_kif *kif, struct mbuf *m, int off, void *h, struct pf_pdesc *pd, u_short *reason) { struct pf_addr *saddr = pd->src, *daddr = pd->dst; u_int16_t icmpid = 0; /* make the compiler happy */ u_int16_t *icmpsum = NULL; /* make the compiler happy */ u_int8_t icmptype = 0; /* make the compiler happy */ int state_icmp = 0; struct pf_state_cmp key; switch (pd->proto) { #ifdef INET case IPPROTO_ICMP: icmptype = pd->hdr.icmp->icmp_type; icmpid = pd->hdr.icmp->icmp_id; icmpsum = &pd->hdr.icmp->icmp_cksum; if (icmptype == ICMP_UNREACH || icmptype == ICMP_SOURCEQUENCH || icmptype == ICMP_REDIRECT || icmptype == ICMP_TIMXCEED || icmptype == ICMP_PARAMPROB) state_icmp++; break; #endif /* INET */ #ifdef INET6 case IPPROTO_ICMPV6: icmptype = pd->hdr.icmp6->icmp6_type; icmpid = pd->hdr.icmp6->icmp6_id; icmpsum = &pd->hdr.icmp6->icmp6_cksum; if (icmptype == ICMP6_DST_UNREACH || icmptype == ICMP6_PACKET_TOO_BIG || icmptype == ICMP6_TIME_EXCEEDED || icmptype == ICMP6_PARAM_PROB) state_icmp++; break; #endif /* INET6 */ } if (!state_icmp) { /* * ICMP query/reply message not related to a TCP/UDP packet. * Search for an ICMP state. 
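		 * The ICMP id stands in for a port number in the state key;
		 * the other half of the key carries port 0.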
*/ key.af = pd->af; key.proto = pd->proto; if (direction == PF_IN) { PF_ACPY(&key.ext.addr, pd->src, key.af); PF_ACPY(&key.gwy.addr, pd->dst, key.af); key.ext.port = 0; key.gwy.port = icmpid; } else { PF_ACPY(&key.lan.addr, pd->src, key.af); PF_ACPY(&key.ext.addr, pd->dst, key.af); key.lan.port = icmpid; key.ext.port = 0; } STATE_LOOKUP(); (*state)->expire = time_second; (*state)->timeout = PFTM_ICMP_ERROR_REPLY; /* translate source/destination address, if necessary */ if (STATE_TRANSLATE(*state)) { if (direction == PF_OUT) { switch (pd->af) { #ifdef INET case AF_INET: pf_change_a(&saddr->v4.s_addr, pd->ip_sum, (*state)->gwy.addr.v4.s_addr, 0); pd->hdr.icmp->icmp_cksum = pf_cksum_fixup( pd->hdr.icmp->icmp_cksum, icmpid, (*state)->gwy.port, 0); pd->hdr.icmp->icmp_id = (*state)->gwy.port; m_copyback(m, off, ICMP_MINLEN, (caddr_t)pd->hdr.icmp); break; #endif /* INET */ #ifdef INET6 case AF_INET6: pf_change_a6(saddr, &pd->hdr.icmp6->icmp6_cksum, &(*state)->gwy.addr, 0); m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t)pd->hdr.icmp6); break; #endif /* INET6 */ } } else { switch (pd->af) { #ifdef INET case AF_INET: pf_change_a(&daddr->v4.s_addr, pd->ip_sum, (*state)->lan.addr.v4.s_addr, 0); pd->hdr.icmp->icmp_cksum = pf_cksum_fixup( pd->hdr.icmp->icmp_cksum, icmpid, (*state)->lan.port, 0); pd->hdr.icmp->icmp_id = (*state)->lan.port; m_copyback(m, off, ICMP_MINLEN, (caddr_t)pd->hdr.icmp); break; #endif /* INET */ #ifdef INET6 case AF_INET6: pf_change_a6(daddr, &pd->hdr.icmp6->icmp6_cksum, &(*state)->lan.addr, 0); m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t)pd->hdr.icmp6); break; #endif /* INET6 */ } } } return (PF_PASS); } else { /* * ICMP error message in response to a TCP/UDP packet. * Extract the inner TCP/UDP header and search for that state. 
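/*
 * Illustrative sketch, standalone and not part of this diff: the offset
 * arithmetic used below to reach the headers quoted inside an ICMP error.
 * ipoff2 points at the embedded IP header (right behind the 8-byte ICMP
 * header), off2 at the embedded TCP/UDP/ICMP header.  Values in main()
 * are made-up examples.
 */
#include <stddef.h>

#define ICMP_MINLEN	8

struct quoted_offsets {
	size_t	ipoff2;		/* embedded IP header */
	size_t	off2;		/* embedded protocol header */
};

/*
 * 'off' is the offset of the outer ICMP header, 'ihl_words' the IHL field
 * of the quoted IP header (in 32-bit words, 5 when there are no options).
 */
static struct quoted_offsets
icmp_quoted_offsets(size_t off, unsigned ihl_words)
{
	struct quoted_offsets q;

	q.ipoff2 = off + ICMP_MINLEN;
	q.off2 = q.ipoff2 + ((size_t)ihl_words << 2);
	return (q);
}

int
main(void)
{
	struct quoted_offsets q = icmp_quoted_offsets(20, 5);

	return (q.off2 == 48 ? 0 : 1);	/* 20 + 8 + 20 */
}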
*/ struct pf_pdesc pd2; #ifdef INET struct ip h2; #endif /* INET */ #ifdef INET6 struct ip6_hdr h2_6; int terminal = 0; #endif /* INET6 */ int ipoff2 = 0; /* make the compiler happy */ int off2 = 0; /* make the compiler happy */ pd2.af = pd->af; switch (pd->af) { #ifdef INET case AF_INET: /* offset of h2 in mbuf chain */ ipoff2 = off + ICMP_MINLEN; if (!pf_pull_hdr(m, ipoff2, &h2, sizeof(h2), NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short " "(ip)\n")); return (PF_DROP); } /* * ICMP error messages don't refer to non-first * fragments */ if (h2.ip_off & htons(IP_OFFMASK)) { REASON_SET(reason, PFRES_FRAG); return (PF_DROP); } /* offset of protocol header that follows h2 */ off2 = ipoff2 + (h2.ip_hl << 2); pd2.proto = h2.ip_p; pd2.src = (struct pf_addr *)&h2.ip_src; pd2.dst = (struct pf_addr *)&h2.ip_dst; pd2.ip_sum = &h2.ip_sum; break; #endif /* INET */ #ifdef INET6 case AF_INET6: ipoff2 = off + sizeof(struct icmp6_hdr); if (!pf_pull_hdr(m, ipoff2, &h2_6, sizeof(h2_6), NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short " "(ip6)\n")); return (PF_DROP); } pd2.proto = h2_6.ip6_nxt; pd2.src = (struct pf_addr *)&h2_6.ip6_src; pd2.dst = (struct pf_addr *)&h2_6.ip6_dst; pd2.ip_sum = NULL; off2 = ipoff2 + sizeof(h2_6); do { switch (pd2.proto) { case IPPROTO_FRAGMENT: /* * ICMPv6 error messages for * non-first fragments */ REASON_SET(reason, PFRES_FRAG); return (PF_DROP); case IPPROTO_AH: case IPPROTO_HOPOPTS: case IPPROTO_ROUTING: case IPPROTO_DSTOPTS: { /* get next header and header length */ struct ip6_ext opt6; if (!pf_pull_hdr(m, off2, &opt6, sizeof(opt6), NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMPv6 short opt\n")); return (PF_DROP); } if (pd2.proto == IPPROTO_AH) off2 += (opt6.ip6e_len + 2) * 4; else off2 += (opt6.ip6e_len + 1) * 8; pd2.proto = opt6.ip6e_nxt; /* goto the next header */ break; } default: terminal++; break; } } while (!terminal); break; #endif /* INET6 */ #ifdef __FreeBSD__ default: panic("AF not supported: %d", pd->af); #endif } switch (pd2.proto) { case IPPROTO_TCP: { struct tcphdr th; u_int32_t seq; struct pf_state_peer *src, *dst; u_int8_t dws; int copyback = 0; /* * Only the first 8 bytes of the TCP header can be * expected. Don't access any TCP header fields after * th_seq, an ackskew test is not possible. 
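/*
 * Illustrative sketch, standalone and not part of this diff: the IPv6
 * extension-header walk above advances the offset by (len + 2) * 4 for AH
 * and (len + 1) * 8 for the other options.  Protocol numbers are the
 * standard IANA values; the header in main() is a made-up example.
 */
#include <stddef.h>
#include <stdint.h>

#define IPPROTO_HOPOPTS		0
#define IPPROTO_ROUTING		43
#define IPPROTO_AH		51
#define IPPROTO_DSTOPTS		60

struct ext_hdr {		/* layout matches struct ip6_ext */
	uint8_t	nxt;
	uint8_t	len;
};

static int
is_ext_hdr(uint8_t proto)
{
	return (proto == IPPROTO_AH || proto == IPPROTO_HOPOPTS ||
	    proto == IPPROTO_ROUTING || proto == IPPROTO_DSTOPTS);
}

/* Advance past one extension header; returns the new offset. */
static size_t
skip_ext_hdr(uint8_t proto, const struct ext_hdr *eh, size_t off)
{
	if (proto == IPPROTO_AH)
		return (off + ((size_t)eh->len + 2) * 4);
	return (off + ((size_t)eh->len + 1) * 8);
}

int
main(void)
{
	struct ext_hdr hop = { 6 /* TCP next */, 0 /* minimal, 8 bytes */ };

	/* A minimal hop-by-hop header advances the offset by 8 bytes. */
	return (is_ext_hdr(IPPROTO_HOPOPTS) &&
	    skip_ext_hdr(IPPROTO_HOPOPTS, &hop, 40) == 48 ? 0 : 1);
}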
*/ if (!pf_pull_hdr(m, off2, &th, 8, NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short " "(tcp)\n")); return (PF_DROP); } key.af = pd2.af; key.proto = IPPROTO_TCP; if (direction == PF_IN) { PF_ACPY(&key.ext.addr, pd2.dst, key.af); PF_ACPY(&key.gwy.addr, pd2.src, key.af); key.ext.port = th.th_dport; key.gwy.port = th.th_sport; } else { PF_ACPY(&key.lan.addr, pd2.dst, key.af); PF_ACPY(&key.ext.addr, pd2.src, key.af); key.lan.port = th.th_dport; key.ext.port = th.th_sport; } STATE_LOOKUP(); if (direction == (*state)->direction) { src = &(*state)->dst; dst = &(*state)->src; } else { src = &(*state)->src; dst = &(*state)->dst; } if (src->wscale && dst->wscale) dws = dst->wscale & PF_WSCALE_MASK; else dws = 0; /* Demodulate sequence number */ seq = ntohl(th.th_seq) - src->seqdiff; if (src->seqdiff) { pf_change_a(&th.th_seq, icmpsum, htonl(seq), 0); copyback = 1; } if (!SEQ_GEQ(src->seqhi, seq) || !SEQ_GEQ(seq, src->seqlo - (dst->max_win << dws))) { if (pf_status.debug >= PF_DEBUG_MISC) { printf("pf: BAD ICMP %d:%d ", icmptype, pd->hdr.icmp->icmp_code); pf_print_host(pd->src, 0, pd->af); printf(" -> "); pf_print_host(pd->dst, 0, pd->af); printf(" state: "); pf_print_state(*state); printf(" seq=%u\n", seq); } REASON_SET(reason, PFRES_BADSTATE); return (PF_DROP); } if (STATE_TRANSLATE(*state)) { if (direction == PF_IN) { pf_change_icmp(pd2.src, &th.th_sport, daddr, &(*state)->lan.addr, (*state)->lan.port, NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, pd2.af); } else { pf_change_icmp(pd2.dst, &th.th_dport, saddr, &(*state)->gwy.addr, (*state)->gwy.port, NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, pd2.af); } copyback = 1; } if (copyback) { switch (pd2.af) { #ifdef INET case AF_INET: m_copyback(m, off, ICMP_MINLEN, (caddr_t)pd->hdr.icmp); m_copyback(m, ipoff2, sizeof(h2), (caddr_t)&h2); break; #endif /* INET */ #ifdef INET6 case AF_INET6: m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t)pd->hdr.icmp6); m_copyback(m, ipoff2, sizeof(h2_6), (caddr_t)&h2_6); break; #endif /* INET6 */ } m_copyback(m, off2, 8, (caddr_t)&th); } return (PF_PASS); break; } case IPPROTO_UDP: { struct udphdr uh; if (!pf_pull_hdr(m, off2, &uh, sizeof(uh), NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short " "(udp)\n")); return (PF_DROP); } key.af = pd2.af; key.proto = IPPROTO_UDP; if (direction == PF_IN) { PF_ACPY(&key.ext.addr, pd2.dst, key.af); PF_ACPY(&key.gwy.addr, pd2.src, key.af); key.ext.port = uh.uh_dport; key.gwy.port = uh.uh_sport; } else { PF_ACPY(&key.lan.addr, pd2.dst, key.af); PF_ACPY(&key.ext.addr, pd2.src, key.af); key.lan.port = uh.uh_dport; key.ext.port = uh.uh_sport; } STATE_LOOKUP(); if (STATE_TRANSLATE(*state)) { if (direction == PF_IN) { pf_change_icmp(pd2.src, &uh.uh_sport, daddr, &(*state)->lan.addr, (*state)->lan.port, &uh.uh_sum, pd2.ip_sum, icmpsum, pd->ip_sum, 1, pd2.af); } else { pf_change_icmp(pd2.dst, &uh.uh_dport, saddr, &(*state)->gwy.addr, (*state)->gwy.port, &uh.uh_sum, pd2.ip_sum, icmpsum, pd->ip_sum, 1, pd2.af); } switch (pd2.af) { #ifdef INET case AF_INET: m_copyback(m, off, ICMP_MINLEN, (caddr_t)pd->hdr.icmp); m_copyback(m, ipoff2, sizeof(h2), (caddr_t)&h2); break; #endif /* INET */ #ifdef INET6 case AF_INET6: m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t)pd->hdr.icmp6); m_copyback(m, ipoff2, sizeof(h2_6), (caddr_t)&h2_6); break; #endif /* INET6 */ } m_copyback(m, off2, sizeof(uh), (caddr_t)&uh); } return (PF_PASS); break; } #ifdef INET case IPPROTO_ICMP: { struct icmp iih; if (!pf_pull_hdr(m, off2, &iih, 
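/*
 * Illustrative sketch, standalone and not part of this diff: pf_change_a()
 * and pf_change_icmp() above patch checksums incrementally instead of
 * recomputing them over the whole packet; the core update is the
 * pf_cksum_fixup() style fold below (RFC 1624).  Values in main() are
 * arbitrary.
 */
#include <stdint.h>

static uint16_t
cksum_fixup(uint16_t cksum, uint16_t old, uint16_t new, int udp)
{
	uint32_t l;

	if (udp && cksum == 0)
		return (0);		/* UDP: 0 means "no checksum" */
	l = cksum + old - new;
	l = (l >> 16) + (l & 0xffff);
	l = l & 0xffff;
	if (udp && l == 0)
		l = 0xffff;		/* UDP may not transmit 0 */
	return ((uint16_t)l);
}

int
main(void)
{
	/* Patch a checksum after changing a 16-bit field 0x0035 -> 0x1f90. */
	return (cksum_fixup(0x1234, 0x0035, 0x1f90, 0) == 0xf2d8 ? 0 : 1);
}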
ICMP_MINLEN, NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short i" "(icmp)\n")); return (PF_DROP); } key.af = pd2.af; key.proto = IPPROTO_ICMP; if (direction == PF_IN) { PF_ACPY(&key.ext.addr, pd2.dst, key.af); PF_ACPY(&key.gwy.addr, pd2.src, key.af); key.ext.port = 0; key.gwy.port = iih.icmp_id; } else { PF_ACPY(&key.lan.addr, pd2.dst, key.af); PF_ACPY(&key.ext.addr, pd2.src, key.af); key.lan.port = iih.icmp_id; key.ext.port = 0; } STATE_LOOKUP(); if (STATE_TRANSLATE(*state)) { if (direction == PF_IN) { pf_change_icmp(pd2.src, &iih.icmp_id, daddr, &(*state)->lan.addr, (*state)->lan.port, NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, AF_INET); } else { pf_change_icmp(pd2.dst, &iih.icmp_id, saddr, &(*state)->gwy.addr, (*state)->gwy.port, NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, AF_INET); } m_copyback(m, off, ICMP_MINLEN, (caddr_t)pd->hdr.icmp); m_copyback(m, ipoff2, sizeof(h2), (caddr_t)&h2); m_copyback(m, off2, ICMP_MINLEN, (caddr_t)&iih); } return (PF_PASS); break; } #endif /* INET */ #ifdef INET6 case IPPROTO_ICMPV6: { struct icmp6_hdr iih; if (!pf_pull_hdr(m, off2, &iih, sizeof(struct icmp6_hdr), NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short " "(icmp6)\n")); return (PF_DROP); } key.af = pd2.af; key.proto = IPPROTO_ICMPV6; if (direction == PF_IN) { PF_ACPY(&key.ext.addr, pd2.dst, key.af); PF_ACPY(&key.gwy.addr, pd2.src, key.af); key.ext.port = 0; key.gwy.port = iih.icmp6_id; } else { PF_ACPY(&key.lan.addr, pd2.dst, key.af); PF_ACPY(&key.ext.addr, pd2.src, key.af); key.lan.port = iih.icmp6_id; key.ext.port = 0; } STATE_LOOKUP(); if (STATE_TRANSLATE(*state)) { if (direction == PF_IN) { pf_change_icmp(pd2.src, &iih.icmp6_id, daddr, &(*state)->lan.addr, (*state)->lan.port, NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, AF_INET6); } else { pf_change_icmp(pd2.dst, &iih.icmp6_id, saddr, &(*state)->gwy.addr, (*state)->gwy.port, NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, AF_INET6); } m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t)pd->hdr.icmp6); m_copyback(m, ipoff2, sizeof(h2_6), (caddr_t)&h2_6); m_copyback(m, off2, sizeof(struct icmp6_hdr), (caddr_t)&iih); } return (PF_PASS); break; } #endif /* INET6 */ default: { key.af = pd2.af; key.proto = pd2.proto; if (direction == PF_IN) { PF_ACPY(&key.ext.addr, pd2.dst, key.af); PF_ACPY(&key.gwy.addr, pd2.src, key.af); key.ext.port = 0; key.gwy.port = 0; } else { PF_ACPY(&key.lan.addr, pd2.dst, key.af); PF_ACPY(&key.ext.addr, pd2.src, key.af); key.lan.port = 0; key.ext.port = 0; } STATE_LOOKUP(); if (STATE_TRANSLATE(*state)) { if (direction == PF_IN) { pf_change_icmp(pd2.src, NULL, daddr, &(*state)->lan.addr, 0, NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, pd2.af); } else { pf_change_icmp(pd2.dst, NULL, saddr, &(*state)->gwy.addr, 0, NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, pd2.af); } switch (pd2.af) { #ifdef INET case AF_INET: m_copyback(m, off, ICMP_MINLEN, (caddr_t)pd->hdr.icmp); m_copyback(m, ipoff2, sizeof(h2), (caddr_t)&h2); break; #endif /* INET */ #ifdef INET6 case AF_INET6: m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t)pd->hdr.icmp6); m_copyback(m, ipoff2, sizeof(h2_6), (caddr_t)&h2_6); break; #endif /* INET6 */ } } return (PF_PASS); break; } } } } int pf_test_state_other(struct pf_state **state, int direction, struct pfi_kif *kif, struct pf_pdesc *pd) { struct pf_state_peer *src, *dst; struct pf_state_cmp key; key.af = pd->af; key.proto = pd->proto; if (direction == PF_IN) { PF_ACPY(&key.ext.addr, pd->src, key.af); PF_ACPY(&key.gwy.addr, pd->dst, key.af); key.ext.port 
= 0; key.gwy.port = 0; } else { PF_ACPY(&key.lan.addr, pd->src, key.af); PF_ACPY(&key.ext.addr, pd->dst, key.af); key.lan.port = 0; key.ext.port = 0; } STATE_LOOKUP(); if (direction == (*state)->direction) { src = &(*state)->src; dst = &(*state)->dst; } else { src = &(*state)->dst; dst = &(*state)->src; } /* update states */ if (src->state < PFOTHERS_SINGLE) src->state = PFOTHERS_SINGLE; if (dst->state == PFOTHERS_SINGLE) dst->state = PFOTHERS_MULTIPLE; /* update expire time */ (*state)->expire = time_second; if (src->state == PFOTHERS_MULTIPLE && dst->state == PFOTHERS_MULTIPLE) (*state)->timeout = PFTM_OTHER_MULTIPLE; else (*state)->timeout = PFTM_OTHER_SINGLE; /* translate source/destination address, if necessary */ if (STATE_TRANSLATE(*state)) { if (direction == PF_OUT) switch (pd->af) { #ifdef INET case AF_INET: pf_change_a(&pd->src->v4.s_addr, pd->ip_sum, (*state)->gwy.addr.v4.s_addr, 0); break; #endif /* INET */ #ifdef INET6 case AF_INET6: PF_ACPY(pd->src, &(*state)->gwy.addr, pd->af); break; #endif /* INET6 */ } else switch (pd->af) { #ifdef INET case AF_INET: pf_change_a(&pd->dst->v4.s_addr, pd->ip_sum, (*state)->lan.addr.v4.s_addr, 0); break; #endif /* INET */ #ifdef INET6 case AF_INET6: PF_ACPY(pd->dst, &(*state)->lan.addr, pd->af); break; #endif /* INET6 */ } } return (PF_PASS); } /* * ipoff and off are measured from the start of the mbuf chain. * h must be at "ipoff" on the mbuf chain. */ void * pf_pull_hdr(struct mbuf *m, int off, void *p, int len, u_short *actionp, u_short *reasonp, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: { struct ip *h = mtod(m, struct ip *); u_int16_t fragoff = (ntohs(h->ip_off) & IP_OFFMASK) << 3; if (fragoff) { if (fragoff >= len) ACTION_SET(actionp, PF_PASS); else { ACTION_SET(actionp, PF_DROP); REASON_SET(reasonp, PFRES_FRAG); } return (NULL); } if (m->m_pkthdr.len < off + len || ntohs(h->ip_len) < off + len) { ACTION_SET(actionp, PF_DROP); REASON_SET(reasonp, PFRES_SHORT); return (NULL); } break; } #endif /* INET */ #ifdef INET6 case AF_INET6: { struct ip6_hdr *h = mtod(m, struct ip6_hdr *); if (m->m_pkthdr.len < off + len || (ntohs(h->ip6_plen) + sizeof(struct ip6_hdr)) < (unsigned)(off + len)) { ACTION_SET(actionp, PF_DROP); REASON_SET(reasonp, PFRES_SHORT); return (NULL); } break; } #endif /* INET6 */ } m_copydata(m, off, len, p); return (p); } int pf_routable(struct pf_addr *addr, sa_family_t af, struct pfi_kif *kif) { struct sockaddr_in *dst; int ret = 1; int check_mpath; #ifndef __FreeBSD__ extern int ipmultipath; #endif #ifdef INET6 #ifndef __FreeBSD__ extern int ip6_multipath; #endif struct sockaddr_in6 *dst6; struct route_in6 ro; #else struct route ro; #endif struct radix_node *rn; struct rtentry *rt; struct ifnet *ifp; check_mpath = 0; bzero(&ro, sizeof(ro)); switch (af) { case AF_INET: dst = satosin(&ro.ro_dst); dst->sin_family = AF_INET; dst->sin_len = sizeof(*dst); dst->sin_addr = addr->v4; #ifndef __FreeBSD__ /* MULTIPATH_ROUTING */ if (ipmultipath) check_mpath = 1; #endif break; #ifdef INET6 case AF_INET6: dst6 = (struct sockaddr_in6 *)&ro.ro_dst; dst6->sin6_family = AF_INET6; dst6->sin6_len = sizeof(*dst6); dst6->sin6_addr = addr->v6; #ifndef __FreeBSD__ /* MULTIPATH_ROUTING */ if (ip6_multipath) check_mpath = 1; #endif break; #endif /* INET6 */ default: return (0); } /* Skip checks for ipsec interfaces */ if (kif != NULL && kif->pfik_ifp->if_type == IFT_ENC) goto out; #ifdef __FreeBSD__ /* XXX MRT not always INET */ /* stick with table 0 though */ if (af == AF_INET) - in_rtalloc_ign((struct route *)&ro, RTF_CLONING, 
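/*
 * Illustrative sketch, standalone and not part of this diff: the two guards
 * pf_pull_hdr() above applies before copying header bytes out of the mbuf
 * chain -- reject non-first fragments and reject packets too short for the
 * requested length.  A flat byte buffer stands in for the mbuf chain here.
 */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IP_OFFMASK	0x1fff		/* fragment offset, in 8-byte units */

/* Returns 0 on success, -1 if the header cannot be pulled safely. */
static int
pull_hdr(const uint8_t *pkt, size_t pktlen, uint16_t ip_off_host,
    size_t off, void *dst, size_t len)
{
	if ((ip_off_host & IP_OFFMASK) != 0)	/* not the first fragment */
		return (-1);
	if (pktlen < off + len)			/* would read past the end */
		return (-1);
	memcpy(dst, pkt + off, len);
	return (0);
}

int
main(void)
{
	uint8_t pkt[64] = { 0 }, hdr[20];

	/* Succeeds: first fragment, 20 bytes available at offset 0. */
	return (pull_hdr(pkt, sizeof(pkt), 0, 0, hdr, sizeof(hdr)) == 0 ? 0 : 1);
}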
0); + in_rtalloc_ign((struct route *)&ro, 0, 0); else - rtalloc_ign((struct route *)&ro, RTF_CLONING); + rtalloc_ign((struct route *)&ro, 0); #else /* ! __FreeBSD__ */ rtalloc_noclone((struct route *)&ro, NO_CLONING); #endif if (ro.ro_rt != NULL) { /* No interface given, this is a no-route check */ if (kif == NULL) goto out; if (kif->pfik_ifp == NULL) { ret = 0; goto out; } /* Perform uRPF check if passed input interface */ ret = 0; rn = (struct radix_node *)ro.ro_rt; do { rt = (struct rtentry *)rn; #ifndef __FreeBSD__ /* CARPDEV */ if (rt->rt_ifp->if_type == IFT_CARP) ifp = rt->rt_ifp->if_carpdev; else #endif ifp = rt->rt_ifp; if (kif->pfik_ifp == ifp) ret = 1; #ifdef __FreeBSD__ /* MULTIPATH_ROUTING */ rn = NULL; #else rn = rn_mpath_next(rn); #endif } while (check_mpath == 1 && rn != NULL && ret == 0); } else ret = 0; out: if (ro.ro_rt != NULL) RTFREE(ro.ro_rt); return (ret); } int pf_rtlabel_match(struct pf_addr *addr, sa_family_t af, struct pf_addr_wrap *aw) { struct sockaddr_in *dst; #ifdef INET6 struct sockaddr_in6 *dst6; struct route_in6 ro; #else struct route ro; #endif int ret = 0; bzero(&ro, sizeof(ro)); switch (af) { case AF_INET: dst = satosin(&ro.ro_dst); dst->sin_family = AF_INET; dst->sin_len = sizeof(*dst); dst->sin_addr = addr->v4; break; #ifdef INET6 case AF_INET6: dst6 = (struct sockaddr_in6 *)&ro.ro_dst; dst6->sin6_family = AF_INET6; dst6->sin6_len = sizeof(*dst6); dst6->sin6_addr = addr->v6; break; #endif /* INET6 */ default: return (0); } #ifdef __FreeBSD__ # ifdef RTF_PRCLONING rtalloc_ign((struct route *)&ro, (RTF_CLONING|RTF_PRCLONING)); # else /* !RTF_PRCLONING */ if (af == AF_INET) - in_rtalloc_ign((struct route *)&ro, RTF_CLONING, 0); + in_rtalloc_ign((struct route *)&ro, 0, 0); else - rtalloc_ign((struct route *)&ro, RTF_CLONING); + rtalloc_ign((struct route *)&ro, 0); # endif #else /* ! 
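/*
 * Illustrative sketch, standalone and not part of this diff: the route
 * lookups above now pass 0 instead of RTF_CLONING, and the remainder of
 * pf_routable() is a reverse-path check -- the route back to the packet's
 * source must point at the interface the packet arrived on.  The routing
 * table below is a made-up array (most-specific entry first); the real
 * code walks the kernel radix tree.
 */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct rt_entry {
	uint32_t	prefix;		/* network, host order */
	uint32_t	mask;
	const char	*ifname;	/* outgoing interface */
};

/* Hypothetical routing table, for the example only. */
static const struct rt_entry rt_table[] = {
	{ 0xc0a80100, 0xffffff00, "em0" },	/* 192.168.1.0/24 -> em0 */
	{ 0x00000000, 0x00000000, "em1" },	/* default -> em1 */
};

static int
urpf_ok(uint32_t src_addr, const char *rcvif)
{
	size_t i;

	for (i = 0; i < sizeof(rt_table) / sizeof(rt_table[0]); i++)
		if ((src_addr & rt_table[i].mask) == rt_table[i].prefix)
			return (strcmp(rt_table[i].ifname, rcvif) == 0);
	return (0);			/* no route back: fail the check */
}

int
main(void)
{
	/* 192.168.1.7 arriving on em0 passes, arriving on em1 fails. */
	return (urpf_ok(0xc0a80107u, "em0") &&
	    !urpf_ok(0xc0a80107u, "em1") ? 0 : 1);
}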
__FreeBSD__ */ rtalloc_noclone((struct route *)&ro, NO_CLONING); #endif if (ro.ro_rt != NULL) { #ifdef __FreeBSD__ /* XXX_IMPORT: later */ #else if (ro.ro_rt->rt_labelid == aw->v.rtlabel) ret = 1; #endif RTFREE(ro.ro_rt); } return (ret); } #ifdef INET void pf_route(struct mbuf **m, struct pf_rule *r, int dir, struct ifnet *oifp, struct pf_state *s, struct pf_pdesc *pd) { INIT_VNET_INET(curvnet); struct mbuf *m0, *m1; struct route iproute; struct route *ro = NULL; struct sockaddr_in *dst; struct ip *ip; struct ifnet *ifp = NULL; struct pf_addr naddr; struct pf_src_node *sn = NULL; int error = 0; #ifdef __FreeBSD__ int sw_csum; #endif #ifdef IPSEC struct m_tag *mtag; #endif /* IPSEC */ if (m == NULL || *m == NULL || r == NULL || (dir != PF_IN && dir != PF_OUT) || oifp == NULL) panic("pf_route: invalid parameters"); if (pd->pf_mtag->routed++ > 3) { m0 = *m; *m = NULL; goto bad; } if (r->rt == PF_DUPTO) { #ifdef __FreeBSD__ if ((m0 = m_dup(*m, M_DONTWAIT)) == NULL) #else if ((m0 = m_copym2(*m, 0, M_COPYALL, M_NOWAIT)) == NULL) #endif return; } else { if ((r->rt == PF_REPLYTO) == (r->direction == dir)) return; m0 = *m; } if (m0->m_len < sizeof(struct ip)) { DPFPRINTF(PF_DEBUG_URGENT, ("pf_route: m0->m_len < sizeof(struct ip)\n")); goto bad; } ip = mtod(m0, struct ip *); ro = &iproute; bzero((caddr_t)ro, sizeof(*ro)); dst = satosin(&ro->ro_dst); dst->sin_family = AF_INET; dst->sin_len = sizeof(*dst); dst->sin_addr = ip->ip_dst; if (r->rt == PF_FASTROUTE) { in_rtalloc(ro, 0); if (ro->ro_rt == 0) { V_ipstat.ips_noroute++; goto bad; } ifp = ro->ro_rt->rt_ifp; ro->ro_rt->rt_use++; if (ro->ro_rt->rt_flags & RTF_GATEWAY) dst = satosin(ro->ro_rt->rt_gateway); } else { if (TAILQ_EMPTY(&r->rpool.list)) { DPFPRINTF(PF_DEBUG_URGENT, ("pf_route: TAILQ_EMPTY(&r->rpool.list)\n")); goto bad; } if (s == NULL) { pf_map_addr(AF_INET, r, (struct pf_addr *)&ip->ip_src, &naddr, NULL, &sn); if (!PF_AZERO(&naddr, AF_INET)) dst->sin_addr.s_addr = naddr.v4.s_addr; ifp = r->rpool.cur->kif ? r->rpool.cur->kif->pfik_ifp : NULL; } else { if (!PF_AZERO(&s->rt_addr, AF_INET)) dst->sin_addr.s_addr = s->rt_addr.v4.s_addr; ifp = s->rt_kif ? s->rt_kif->pfik_ifp : NULL; } } if (ifp == NULL) goto bad; if (oifp != ifp) { #ifdef __FreeBSD__ PF_UNLOCK(); if (pf_test(PF_OUT, ifp, &m0, NULL, NULL) != PF_PASS) { PF_LOCK(); goto bad; } else if (m0 == NULL) { PF_LOCK(); goto done; } PF_LOCK(); #else if (pf_test(PF_OUT, ifp, &m0, NULL) != PF_PASS) goto bad; else if (m0 == NULL) goto done; #endif if (m0->m_len < sizeof(struct ip)) { DPFPRINTF(PF_DEBUG_URGENT, ("pf_route: m0->m_len < sizeof(struct ip)\n")); goto bad; } ip = mtod(m0, struct ip *); } #ifdef __FreeBSD__ /* Copied from FreeBSD 5.1-CURRENT ip_output. */ m0->m_pkthdr.csum_flags |= CSUM_IP; sw_csum = m0->m_pkthdr.csum_flags & ~ifp->if_hwassist; if (sw_csum & CSUM_DELAY_DATA) { /* * XXX: in_delayed_cksum assumes HBO for ip->ip_len (at least) */ NTOHS(ip->ip_len); NTOHS(ip->ip_off); /* XXX: needed? 
*/ in_delayed_cksum(m0); HTONS(ip->ip_len); HTONS(ip->ip_off); sw_csum &= ~CSUM_DELAY_DATA; } m0->m_pkthdr.csum_flags &= ifp->if_hwassist; if (ntohs(ip->ip_len) <= ifp->if_mtu || (ifp->if_hwassist & CSUM_FRAGMENT && ((ip->ip_off & htons(IP_DF)) == 0))) { /* * ip->ip_len = htons(ip->ip_len); * ip->ip_off = htons(ip->ip_off); */ ip->ip_sum = 0; if (sw_csum & CSUM_DELAY_IP) { /* From KAME */ if (ip->ip_v == IPVERSION && (ip->ip_hl << 2) == sizeof(*ip)) { ip->ip_sum = in_cksum_hdr(ip); } else { ip->ip_sum = in_cksum(m0, ip->ip_hl << 2); } } PF_UNLOCK(); error = (*ifp->if_output)(ifp, m0, sintosa(dst), ro->ro_rt); PF_LOCK(); goto done; } #else /* Copied from ip_output. */ #ifdef IPSEC /* * If deferred crypto processing is needed, check that the * interface supports it. */ if ((mtag = m_tag_find(m0, PACKET_TAG_IPSEC_OUT_CRYPTO_NEEDED, NULL)) != NULL && (ifp->if_capabilities & IFCAP_IPSEC) == 0) { /* Notify IPsec to do its own crypto. */ ipsp_skipcrypto_unmark((struct tdb_ident *)(mtag + 1)); goto bad; } #endif /* IPSEC */ /* Catch routing changes wrt. hardware checksumming for TCP or UDP. */ if (m0->m_pkthdr.csum_flags & M_TCPV4_CSUM_OUT) { if (!(ifp->if_capabilities & IFCAP_CSUM_TCPv4) || ifp->if_bridge != NULL) { in_delayed_cksum(m0); m0->m_pkthdr.csum_flags &= ~M_TCPV4_CSUM_OUT; /* Clear */ } } else if (m0->m_pkthdr.csum_flags & M_UDPV4_CSUM_OUT) { if (!(ifp->if_capabilities & IFCAP_CSUM_UDPv4) || ifp->if_bridge != NULL) { in_delayed_cksum(m0); m0->m_pkthdr.csum_flags &= ~M_UDPV4_CSUM_OUT; /* Clear */ } } if (ntohs(ip->ip_len) <= ifp->if_mtu) { if ((ifp->if_capabilities & IFCAP_CSUM_IPv4) && ifp->if_bridge == NULL) { m0->m_pkthdr.csum_flags |= M_IPV4_CSUM_OUT; V_ipstat.ips_outhwcsum++; } else { ip->ip_sum = 0; ip->ip_sum = in_cksum(m0, ip->ip_hl << 2); } /* Update relevant hardware checksum stats for TCP/UDP */ if (m0->m_pkthdr.csum_flags & M_TCPV4_CSUM_OUT) V_tcpstat.tcps_outhwcsum++; else if (m0->m_pkthdr.csum_flags & M_UDPV4_CSUM_OUT) V_udpstat.udps_outhwcsum++; error = (*ifp->if_output)(ifp, m0, sintosa(dst), NULL); goto done; } #endif /* * Too large for interface; fragment if possible. * Must be able to put at least 8 bytes per fragment. 
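/*
 * Illustrative sketch, standalone and not part of this diff: the decision
 * described by the comment above -- a packet larger than the outgoing MTU
 * is fragmented unless DF is set, in which case an ICMP "fragmentation
 * needed" error is generated instead.  Values in main() are examples.
 */
#include <stdint.h>

#define IP_DF	0x4000			/* don't-fragment flag in ip_off */

enum fwd_action { FWD_SEND, FWD_FRAGMENT, FWD_ICMP_NEEDFRAG };

static enum fwd_action
mtu_decision(uint16_t ip_len_host, uint16_t ip_off_host, uint16_t mtu)
{
	if (ip_len_host <= mtu)
		return (FWD_SEND);
	if (ip_off_host & IP_DF)
		return (FWD_ICMP_NEEDFRAG);	/* cannot fragment */
	return (FWD_FRAGMENT);
}

int
main(void)
{
	/* 1500-byte MTU: 1400 bytes pass, 3000 bytes with DF need an error. */
	return (mtu_decision(1400, 0, 1500) == FWD_SEND &&
	    mtu_decision(3000, IP_DF, 1500) == FWD_ICMP_NEEDFRAG ? 0 : 1);
}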
*/ if (ip->ip_off & htons(IP_DF)) { V_ipstat.ips_cantfrag++; if (r->rt != PF_DUPTO) { #ifdef __FreeBSD__ /* icmp_error() expects host byte ordering */ NTOHS(ip->ip_len); NTOHS(ip->ip_off); PF_UNLOCK(); icmp_error(m0, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG, 0, ifp->if_mtu); PF_LOCK(); #else icmp_error(m0, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG, 0, ifp->if_mtu); #endif goto done; } else goto bad; } m1 = m0; #ifdef __FreeBSD__ /* * XXX: is cheaper + less error prone than own function */ NTOHS(ip->ip_len); NTOHS(ip->ip_off); error = ip_fragment(ip, &m0, ifp->if_mtu, ifp->if_hwassist, sw_csum); #else error = ip_fragment(m0, ifp, ifp->if_mtu); #endif if (error) { #ifndef __FreeBSD__ /* ip_fragment does not do m_freem() on FreeBSD */ m0 = NULL; #endif goto bad; } for (m0 = m1; m0; m0 = m1) { m1 = m0->m_nextpkt; m0->m_nextpkt = 0; #ifdef __FreeBSD__ if (error == 0) { PF_UNLOCK(); error = (*ifp->if_output)(ifp, m0, sintosa(dst), NULL); PF_LOCK(); } else #else if (error == 0) error = (*ifp->if_output)(ifp, m0, sintosa(dst), NULL); else #endif m_freem(m0); } if (error == 0) V_ipstat.ips_fragmented++; done: if (r->rt != PF_DUPTO) *m = NULL; if (ro == &iproute && ro->ro_rt) RTFREE(ro->ro_rt); return; bad: m_freem(m0); goto done; } #endif /* INET */ #ifdef INET6 void pf_route6(struct mbuf **m, struct pf_rule *r, int dir, struct ifnet *oifp, struct pf_state *s, struct pf_pdesc *pd) { struct mbuf *m0; struct route_in6 ip6route; struct route_in6 *ro; struct sockaddr_in6 *dst; struct ip6_hdr *ip6; struct ifnet *ifp = NULL; struct pf_addr naddr; struct pf_src_node *sn = NULL; int error = 0; if (m == NULL || *m == NULL || r == NULL || (dir != PF_IN && dir != PF_OUT) || oifp == NULL) panic("pf_route6: invalid parameters"); if (pd->pf_mtag->routed++ > 3) { m0 = *m; *m = NULL; goto bad; } if (r->rt == PF_DUPTO) { #ifdef __FreeBSD__ if ((m0 = m_dup(*m, M_DONTWAIT)) == NULL) #else if ((m0 = m_copym2(*m, 0, M_COPYALL, M_NOWAIT)) == NULL) #endif return; } else { if ((r->rt == PF_REPLYTO) == (r->direction == dir)) return; m0 = *m; } if (m0->m_len < sizeof(struct ip6_hdr)) { DPFPRINTF(PF_DEBUG_URGENT, ("pf_route6: m0->m_len < sizeof(struct ip6_hdr)\n")); goto bad; } ip6 = mtod(m0, struct ip6_hdr *); ro = &ip6route; bzero((caddr_t)ro, sizeof(*ro)); dst = (struct sockaddr_in6 *)&ro->ro_dst; dst->sin6_family = AF_INET6; dst->sin6_len = sizeof(*dst); dst->sin6_addr = ip6->ip6_dst; /* Cheat. XXX why only in the v6 case??? */ if (r->rt == PF_FASTROUTE) { #ifdef __FreeBSD__ m0->m_flags |= M_SKIP_FIREWALL; PF_UNLOCK(); ip6_output(m0, NULL, NULL, 0, NULL, NULL, NULL); PF_LOCK(); #else mtag = m_tag_get(PACKET_TAG_PF_GENERATED, 0, M_NOWAIT); if (mtag == NULL) goto bad; m_tag_prepend(m0, mtag); pd->pf_mtag->flags |= PF_TAG_GENERATED; ip6_output(m0, NULL, NULL, 0, NULL, NULL); #endif return; } if (TAILQ_EMPTY(&r->rpool.list)) { DPFPRINTF(PF_DEBUG_URGENT, ("pf_route6: TAILQ_EMPTY(&r->rpool.list)\n")); goto bad; } if (s == NULL) { pf_map_addr(AF_INET6, r, (struct pf_addr *)&ip6->ip6_src, &naddr, NULL, &sn); if (!PF_AZERO(&naddr, AF_INET6)) PF_ACPY((struct pf_addr *)&dst->sin6_addr, &naddr, AF_INET6); ifp = r->rpool.cur->kif ? r->rpool.cur->kif->pfik_ifp : NULL; } else { if (!PF_AZERO(&s->rt_addr, AF_INET6)) PF_ACPY((struct pf_addr *)&dst->sin6_addr, &s->rt_addr, AF_INET6); ifp = s->rt_kif ? 
s->rt_kif->pfik_ifp : NULL; } if (ifp == NULL) goto bad; if (oifp != ifp) { #ifdef __FreeBSD__ PF_UNLOCK(); if (pf_test6(PF_OUT, ifp, &m0, NULL, NULL) != PF_PASS) { PF_LOCK(); goto bad; } else if (m0 == NULL) { PF_LOCK(); goto done; } PF_LOCK(); #else if (pf_test6(PF_OUT, ifp, &m0, NULL) != PF_PASS) goto bad; else if (m0 == NULL) goto done; #endif if (m0->m_len < sizeof(struct ip6_hdr)) { DPFPRINTF(PF_DEBUG_URGENT, ("pf_route6: m0->m_len < sizeof(struct ip6_hdr)\n")); goto bad; } ip6 = mtod(m0, struct ip6_hdr *); } /* * If the packet is too large for the outgoing interface, * send back an icmp6 error. */ if (IN6_IS_SCOPE_EMBED(&dst->sin6_addr)) dst->sin6_addr.s6_addr16[1] = htons(ifp->if_index); if ((u_long)m0->m_pkthdr.len <= ifp->if_mtu) { #ifdef __FreeBSD__ PF_UNLOCK(); #endif error = nd6_output(ifp, ifp, m0, dst, NULL); #ifdef __FreeBSD__ PF_LOCK(); #endif } else { in6_ifstat_inc(ifp, ifs6_in_toobig); #ifdef __FreeBSD__ if (r->rt != PF_DUPTO) { PF_UNLOCK(); icmp6_error(m0, ICMP6_PACKET_TOO_BIG, 0, ifp->if_mtu); PF_LOCK(); } else #else if (r->rt != PF_DUPTO) icmp6_error(m0, ICMP6_PACKET_TOO_BIG, 0, ifp->if_mtu); else #endif goto bad; } done: if (r->rt != PF_DUPTO) *m = NULL; return; bad: m_freem(m0); goto done; } #endif /* INET6 */ #ifdef __FreeBSD__ /* * FreeBSD supports cksum offloads for the following drivers. * em(4), fxp(4), ixgb(4), lge(4), ndis(4), nge(4), re(4), * ti(4), txp(4), xl(4) * * CSUM_DATA_VALID | CSUM_PSEUDO_HDR : * network driver performed cksum including pseudo header, need to verify * csum_data * CSUM_DATA_VALID : * network driver performed cksum, needs to additional pseudo header * cksum computation with partial csum_data(i.e. lack of H/W support for * pseudo header, for instance hme(4), sk(4) and possibly gem(4)) * * After validating the cksum of packet, set both flag CSUM_DATA_VALID and * CSUM_PSEUDO_HDR in order to avoid recomputation of the cksum in upper * TCP/UDP layer. * Also, set csum_data to 0xffff to force cksum validation. 
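/*
 * Illustrative sketch, standalone and not part of this diff: the software
 * fallback of the validation described above -- sum the IPv4 pseudo header
 * together with the transport header and payload and require the
 * one's-complement total to be 0xffff.  Addresses are passed in host
 * order; the UDP header in main() carries a precomputed valid checksum.
 */
#include <stddef.h>
#include <stdint.h>

static uint32_t
sum16(const uint8_t *p, size_t len, uint32_t sum)
{
	while (len > 1) {
		sum += (uint32_t)p[0] << 8 | p[1];
		p += 2;
		len -= 2;
	}
	if (len)
		sum += (uint32_t)p[0] << 8;
	return (sum);
}

/* Returns 0 when the transport checksum is valid. */
static int
check_l4_cksum(uint32_t src, uint32_t dst, uint8_t proto,
    const uint8_t *l4, size_t l4len)
{
	uint32_t sum = 0;

	/* pseudo header: source, destination, zero/protocol, length */
	sum += (src >> 16) + (src & 0xffff);
	sum += (dst >> 16) + (dst & 0xffff);
	sum += proto;
	sum += (uint32_t)l4len;
	sum = sum16(l4, l4len, sum);
	while (sum >> 16)
		sum = (sum >> 16) + (sum & 0xffff);
	return (sum == 0xffff ? 0 : 1);
}

int
main(void)
{
	/* Minimal UDP header whose checksum field (0xdd61) is valid. */
	const uint8_t udp[8] = { 0x12, 0x34, 0x00, 0x35, 0x00, 0x08, 0xdd, 0x61 };

	return (check_l4_cksum(0x01020304u, 0x05060708u, 17, udp, sizeof(udp)));
}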
*/ int pf_check_proto_cksum(struct mbuf *m, int off, int len, u_int8_t p, sa_family_t af) { u_int16_t sum = 0; int hw_assist = 0; struct ip *ip; if (off < sizeof(struct ip) || len < sizeof(struct udphdr)) return (1); if (m->m_pkthdr.len < off + len) return (1); switch (p) { case IPPROTO_TCP: if (m->m_pkthdr.csum_flags & CSUM_DATA_VALID) { if (m->m_pkthdr.csum_flags & CSUM_PSEUDO_HDR) { sum = m->m_pkthdr.csum_data; } else { ip = mtod(m, struct ip *); sum = in_pseudo(ip->ip_src.s_addr, ip->ip_dst.s_addr, htonl((u_short)len + m->m_pkthdr.csum_data + IPPROTO_TCP)); } sum ^= 0xffff; ++hw_assist; } break; case IPPROTO_UDP: if (m->m_pkthdr.csum_flags & CSUM_DATA_VALID) { if (m->m_pkthdr.csum_flags & CSUM_PSEUDO_HDR) { sum = m->m_pkthdr.csum_data; } else { ip = mtod(m, struct ip *); sum = in_pseudo(ip->ip_src.s_addr, ip->ip_dst.s_addr, htonl((u_short)len + m->m_pkthdr.csum_data + IPPROTO_UDP)); } sum ^= 0xffff; ++hw_assist; } break; case IPPROTO_ICMP: #ifdef INET6 case IPPROTO_ICMPV6: #endif /* INET6 */ break; default: return (1); } if (!hw_assist) { switch (af) { case AF_INET: if (p == IPPROTO_ICMP) { if (m->m_len < off) return (1); m->m_data += off; m->m_len -= off; sum = in_cksum(m, len); m->m_data -= off; m->m_len += off; } else { if (m->m_len < sizeof(struct ip)) return (1); sum = in4_cksum(m, p, off, len); } break; #ifdef INET6 case AF_INET6: if (m->m_len < sizeof(struct ip6_hdr)) return (1); sum = in6_cksum(m, p, off, len); break; #endif /* INET6 */ default: return (1); } } if (sum) { switch (p) { case IPPROTO_TCP: { INIT_VNET_INET(curvnet); V_tcpstat.tcps_rcvbadsum++; break; } case IPPROTO_UDP: { INIT_VNET_INET(curvnet); V_udpstat.udps_badsum++; break; } case IPPROTO_ICMP: { INIT_VNET_INET(curvnet); V_icmpstat.icps_checksum++; break; } #ifdef INET6 case IPPROTO_ICMPV6: { INIT_VNET_INET6(curvnet); V_icmp6stat.icp6s_checksum++; break; } #endif /* INET6 */ } return (1); } else { if (p == IPPROTO_TCP || p == IPPROTO_UDP) { m->m_pkthdr.csum_flags |= (CSUM_DATA_VALID | CSUM_PSEUDO_HDR); m->m_pkthdr.csum_data = 0xffff; } } return (0); } #else /* !__FreeBSD__ */ /* * check protocol (tcp/udp/icmp/icmp6) checksum and set mbuf flag * off is the offset where the protocol header starts * len is the total length of protocol header plus payload * returns 0 when the checksum is valid, otherwise returns 1. 
*/ int pf_check_proto_cksum(struct mbuf *m, int off, int len, u_int8_t p, sa_family_t af) { u_int16_t flag_ok, flag_bad; u_int16_t sum; switch (p) { case IPPROTO_TCP: flag_ok = M_TCP_CSUM_IN_OK; flag_bad = M_TCP_CSUM_IN_BAD; break; case IPPROTO_UDP: flag_ok = M_UDP_CSUM_IN_OK; flag_bad = M_UDP_CSUM_IN_BAD; break; case IPPROTO_ICMP: #ifdef INET6 case IPPROTO_ICMPV6: #endif /* INET6 */ flag_ok = flag_bad = 0; break; default: return (1); } if (m->m_pkthdr.csum_flags & flag_ok) return (0); if (m->m_pkthdr.csum_flags & flag_bad) return (1); if (off < sizeof(struct ip) || len < sizeof(struct udphdr)) return (1); if (m->m_pkthdr.len < off + len) return (1); switch (af) { #ifdef INET case AF_INET: if (p == IPPROTO_ICMP) { if (m->m_len < off) return (1); m->m_data += off; m->m_len -= off; sum = in_cksum(m, len); m->m_data -= off; m->m_len += off; } else { if (m->m_len < sizeof(struct ip)) return (1); sum = in4_cksum(m, p, off, len); } break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (m->m_len < sizeof(struct ip6_hdr)) return (1); sum = in6_cksum(m, p, off, len); break; #endif /* INET6 */ default: return (1); } if (sum) { m->m_pkthdr.csum_flags |= flag_bad; switch (p) { case IPPROTO_TCP: V_tcpstat.tcps_rcvbadsum++; break; case IPPROTO_UDP: V_udpstat.udps_badsum++; break; case IPPROTO_ICMP: V_icmpstat.icps_checksum++; break; #ifdef INET6 case IPPROTO_ICMPV6: V_icmp6stat.icp6s_checksum++; break; #endif /* INET6 */ } return (1); } m->m_pkthdr.csum_flags |= flag_ok; return (0); } #endif /* __FreeBSD__ */ #ifdef INET int #ifdef __FreeBSD__ pf_test(int dir, struct ifnet *ifp, struct mbuf **m0, struct ether_header *eh, struct inpcb *inp) #else pf_test(int dir, struct ifnet *ifp, struct mbuf **m0, struct ether_header *eh) #endif { struct pfi_kif *kif; u_short action, reason = 0, log = 0; struct mbuf *m = *m0; struct ip *h = NULL; /* make the compiler happy */ struct pf_rule *a = NULL, *r = &pf_default_rule, *tr, *nr; struct pf_state *s = NULL; struct pf_ruleset *ruleset = NULL; struct pf_pdesc pd; int off, dirndx, pqid = 0; #ifdef __FreeBSD__ PF_LOCK(); #endif if (!pf_status.running) #ifdef __FreeBSD__ { PF_UNLOCK(); #endif return (PF_PASS); #ifdef __FreeBSD__ } #endif memset(&pd, 0, sizeof(pd)); if ((pd.pf_mtag = pf_get_mtag(m)) == NULL) { #ifdef __FreeBSD__ PF_UNLOCK(); #endif DPFPRINTF(PF_DEBUG_URGENT, ("pf_test: pf_get_mtag returned NULL\n")); return (PF_DROP); } #ifdef __FreeBSD__ if (m->m_flags & M_SKIP_FIREWALL) { PF_UNLOCK(); return (PF_PASS); } #else if (pd.pf_mtag->flags & PF_TAG_GENERATED) return (PF_PASS); #endif #ifdef __FreeBSD__ /* XXX_IMPORT: later */ #else if (ifp->if_type == IFT_CARP && ifp->if_carpdev) ifp = ifp->if_carpdev; #endif kif = (struct pfi_kif *)ifp->if_pf_kif; if (kif == NULL) { #ifdef __FreeBSD__ PF_UNLOCK(); #endif DPFPRINTF(PF_DEBUG_URGENT, ("pf_test: kif == NULL, if_xname %s\n", ifp->if_xname)); return (PF_DROP); } if (kif->pfik_flags & PFI_IFLAG_SKIP) { #ifdef __FreeBSD__ PF_UNLOCK(); #endif return (PF_PASS); } #ifdef __FreeBSD__ M_ASSERTPKTHDR(m); #else #ifdef DIAGNOSTIC if ((m->m_flags & M_PKTHDR) == 0) panic("non-M_PKTHDR is passed to pf_test"); #endif /* DIAGNOSTIC */ #endif /* __FreeBSD__ */ if (m->m_pkthdr.len < (int)sizeof(*h)) { action = PF_DROP; REASON_SET(&reason, PFRES_SHORT); log = 1; goto done; } /* We do IP header normalization and packet reassembly here */ if (pf_normalize_ip(m0, dir, kif, &reason, &pd) != PF_PASS) { action = PF_DROP; goto done; } m = *m0; h = mtod(m, struct ip *); off = h->ip_hl << 2; if (off < (int)sizeof(*h)) { action = PF_DROP; 
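/*
 * Illustrative sketch, standalone and not part of this diff: the IPv4
 * header sanity checks made above before any field is trusted -- enough
 * captured bytes for a minimal header, a sane IHL, and a total length
 * that fits in what was actually received.
 */
#include <stddef.h>
#include <stdint.h>

/* Returns the header length in bytes, or 0 if the packet must be dropped. */
static size_t
ipv4_hdr_ok(const uint8_t *pkt, size_t caplen)
{
	size_t hlen, tot_len;

	if (caplen < 20)			/* shorter than a minimal header */
		return (0);
	if ((pkt[0] >> 4) != 4)			/* not IPv4 */
		return (0);
	hlen = (size_t)(pkt[0] & 0x0f) << 2;	/* IHL is in 32-bit words */
	if (hlen < 20 || hlen > caplen)
		return (0);
	tot_len = (size_t)pkt[2] << 8 | pkt[3];
	if (tot_len < hlen || tot_len > caplen)
		return (0);
	return (hlen);
}

int
main(void)
{
	/* 20-byte header, version 4, total length 20. */
	uint8_t pkt[20] = { 0x45, 0x00, 0x00, 0x14 };

	return (ipv4_hdr_ok(pkt, sizeof(pkt)) == 20 ? 0 : 1);
}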
REASON_SET(&reason, PFRES_SHORT); log = 1; goto done; } pd.src = (struct pf_addr *)&h->ip_src; pd.dst = (struct pf_addr *)&h->ip_dst; PF_ACPY(&pd.baddr, dir == PF_OUT ? pd.src : pd.dst, AF_INET); pd.ip_sum = &h->ip_sum; pd.proto = h->ip_p; pd.af = AF_INET; pd.tos = h->ip_tos; pd.tot_len = ntohs(h->ip_len); pd.eh = eh; /* handle fragments that didn't get reassembled by normalization */ if (h->ip_off & htons(IP_MF | IP_OFFMASK)) { action = pf_test_fragment(&r, dir, kif, m, h, &pd, &a, &ruleset); goto done; } switch (h->ip_p) { case IPPROTO_TCP: { struct tcphdr th; pd.hdr.tcp = &th; if (!pf_pull_hdr(m, off, &th, sizeof(th), &action, &reason, AF_INET)) { log = action != PF_PASS; goto done; } if (dir == PF_IN && pf_check_proto_cksum(m, off, ntohs(h->ip_len) - off, IPPROTO_TCP, AF_INET)) { REASON_SET(&reason, PFRES_PROTCKSUM); action = PF_DROP; goto done; } pd.p_len = pd.tot_len - off - (th.th_off << 2); if ((th.th_flags & TH_ACK) && pd.p_len == 0) pqid = 1; action = pf_normalize_tcp(dir, kif, m, 0, off, h, &pd); if (action == PF_DROP) goto done; action = pf_test_state_tcp(&s, dir, kif, m, off, h, &pd, &reason); if (action == PF_PASS) { #if NPFSYNC pfsync_update_state(s); #endif /* NPFSYNC */ r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) #ifdef __FreeBSD__ action = pf_test_tcp(&r, &s, dir, kif, m, off, h, &pd, &a, &ruleset, NULL, inp); #else action = pf_test_tcp(&r, &s, dir, kif, m, off, h, &pd, &a, &ruleset, &ipintrq); #endif break; } case IPPROTO_UDP: { struct udphdr uh; pd.hdr.udp = &uh; if (!pf_pull_hdr(m, off, &uh, sizeof(uh), &action, &reason, AF_INET)) { log = action != PF_PASS; goto done; } if (dir == PF_IN && uh.uh_sum && pf_check_proto_cksum(m, off, ntohs(h->ip_len) - off, IPPROTO_UDP, AF_INET)) { action = PF_DROP; REASON_SET(&reason, PFRES_PROTCKSUM); goto done; } if (uh.uh_dport == 0 || ntohs(uh.uh_ulen) > m->m_pkthdr.len - off || ntohs(uh.uh_ulen) < sizeof(struct udphdr)) { action = PF_DROP; REASON_SET(&reason, PFRES_SHORT); goto done; } action = pf_test_state_udp(&s, dir, kif, m, off, h, &pd); if (action == PF_PASS) { #if NPFSYNC pfsync_update_state(s); #endif /* NPFSYNC */ r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) #ifdef __FreeBSD__ action = pf_test_udp(&r, &s, dir, kif, m, off, h, &pd, &a, &ruleset, NULL, inp); #else action = pf_test_udp(&r, &s, dir, kif, m, off, h, &pd, &a, &ruleset, &ipintrq); #endif break; } case IPPROTO_ICMP: { struct icmp ih; pd.hdr.icmp = &ih; if (!pf_pull_hdr(m, off, &ih, ICMP_MINLEN, &action, &reason, AF_INET)) { log = action != PF_PASS; goto done; } if (dir == PF_IN && pf_check_proto_cksum(m, off, ntohs(h->ip_len) - off, IPPROTO_ICMP, AF_INET)) { action = PF_DROP; REASON_SET(&reason, PFRES_PROTCKSUM); goto done; } action = pf_test_state_icmp(&s, dir, kif, m, off, h, &pd, &reason); if (action == PF_PASS) { #if NPFSYNC pfsync_update_state(s); #endif /* NPFSYNC */ r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) #ifdef __FreeBSD__ action = pf_test_icmp(&r, &s, dir, kif, m, off, h, &pd, &a, &ruleset, NULL); #else action = pf_test_icmp(&r, &s, dir, kif, m, off, h, &pd, &a, &ruleset, &ipintrq); #endif break; } default: action = pf_test_state_other(&s, dir, kif, &pd); if (action == PF_PASS) { #if NPFSYNC pfsync_update_state(s); #endif /* NPFSYNC */ r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) #ifdef __FreeBSD__ action = pf_test_other(&r, &s, dir, kif, m, off, h, &pd, &a, &ruleset, NULL); #else action = pf_test_other(&r, &s, dir, kif, m, off, h, &pd, &a, 
&ruleset, &ipintrq); #endif break; } done: if (action == PF_PASS && h->ip_hl > 5 && !((s && s->allow_opts) || r->allow_opts)) { action = PF_DROP; REASON_SET(&reason, PFRES_IPOPTIONS); log = 1; DPFPRINTF(PF_DEBUG_MISC, ("pf: dropping packet with ip options\n")); } if ((s && s->tag) || r->rtableid) pf_tag_packet(m, pd.pf_mtag, s ? s->tag : 0, r->rtableid); #ifdef ALTQ if (action == PF_PASS && r->qid) { if (pqid || (pd.tos & IPTOS_LOWDELAY)) pd.pf_mtag->qid = r->pqid; else pd.pf_mtag->qid = r->qid; /* add hints for ecn */ pd.pf_mtag->af = AF_INET; pd.pf_mtag->hdr = h; } #endif /* ALTQ */ /* * connections redirected to loopback should not match sockets * bound specifically to loopback due to security implications, * see tcp_input() and in_pcblookup_listen(). */ if (dir == PF_IN && action == PF_PASS && (pd.proto == IPPROTO_TCP || pd.proto == IPPROTO_UDP) && s != NULL && s->nat_rule.ptr != NULL && (s->nat_rule.ptr->action == PF_RDR || s->nat_rule.ptr->action == PF_BINAT) && (ntohl(pd.dst->v4.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET) pd.pf_mtag->flags |= PF_TAG_TRANSLATE_LOCALHOST; if (log) { struct pf_rule *lr; if (s != NULL && s->nat_rule.ptr != NULL && s->nat_rule.ptr->log & PF_LOG_ALL) lr = s->nat_rule.ptr; else lr = r; PFLOG_PACKET(kif, h, m, AF_INET, dir, reason, lr, a, ruleset, &pd); } kif->pfik_bytes[0][dir == PF_OUT][action != PF_PASS] += pd.tot_len; kif->pfik_packets[0][dir == PF_OUT][action != PF_PASS]++; if (action == PF_PASS || r->action == PF_DROP) { dirndx = (dir == PF_OUT); r->packets[dirndx]++; r->bytes[dirndx] += pd.tot_len; if (a != NULL) { a->packets[dirndx]++; a->bytes[dirndx] += pd.tot_len; } if (s != NULL) { if (s->nat_rule.ptr != NULL) { s->nat_rule.ptr->packets[dirndx]++; s->nat_rule.ptr->bytes[dirndx] += pd.tot_len; } if (s->src_node != NULL) { s->src_node->packets[dirndx]++; s->src_node->bytes[dirndx] += pd.tot_len; } if (s->nat_src_node != NULL) { s->nat_src_node->packets[dirndx]++; s->nat_src_node->bytes[dirndx] += pd.tot_len; } dirndx = (dir == s->direction) ? 0 : 1; s->packets[dirndx]++; s->bytes[dirndx] += pd.tot_len; } tr = r; nr = (s != NULL) ? s->nat_rule.ptr : pd.nat_rule; if (nr != NULL) { struct pf_addr *x; /* * XXX: we need to make sure that the addresses * passed to pfr_update_stats() are the same than * the addresses used during matching (pfr_match) */ if (r == &pf_default_rule) { tr = nr; x = (s == NULL || s->direction == dir) ? &pd.baddr : &pd.naddr; } else x = (s == NULL || s->direction == dir) ? &pd.naddr : &pd.baddr; if (x == &pd.baddr || s == NULL) { /* we need to change the address */ if (dir == PF_OUT) pd.src = x; else pd.dst = x; } } if (tr->src.addr.type == PF_ADDR_TABLE) pfr_update_stats(tr->src.addr.p.tbl, (s == NULL || s->direction == dir) ? pd.src : pd.dst, pd.af, pd.tot_len, dir == PF_OUT, r->action == PF_PASS, tr->src.neg); if (tr->dst.addr.type == PF_ADDR_TABLE) pfr_update_stats(tr->dst.addr.p.tbl, (s == NULL || s->direction == dir) ? 
pd.dst : pd.src, pd.af, pd.tot_len, dir == PF_OUT, r->action == PF_PASS, tr->dst.neg); } if (action == PF_SYNPROXY_DROP) { m_freem(*m0); *m0 = NULL; action = PF_PASS; } else if (r->rt) /* pf_route can free the mbuf causing *m0 to become NULL */ pf_route(m0, r, dir, ifp, s, &pd); #ifdef __FreeBSD__ PF_UNLOCK(); #endif return (action); } #endif /* INET */ #ifdef INET6 int #ifdef __FreeBSD__ pf_test6(int dir, struct ifnet *ifp, struct mbuf **m0, struct ether_header *eh, struct inpcb *inp) #else pf_test6(int dir, struct ifnet *ifp, struct mbuf **m0, struct ether_header *eh) #endif { struct pfi_kif *kif; u_short action, reason = 0, log = 0; struct mbuf *m = *m0, *n = NULL; struct ip6_hdr *h; struct pf_rule *a = NULL, *r = &pf_default_rule, *tr, *nr; struct pf_state *s = NULL; struct pf_ruleset *ruleset = NULL; struct pf_pdesc pd; int off, terminal = 0, dirndx, rh_cnt = 0; #ifdef __FreeBSD__ PF_LOCK(); #endif if (!pf_status.running) #ifdef __FreeBSD__ { PF_UNLOCK(); #endif return (PF_PASS); #ifdef __FreeBSD__ } #endif memset(&pd, 0, sizeof(pd)); if ((pd.pf_mtag = pf_get_mtag(m)) == NULL) { #ifdef __FreeBSD__ PF_UNLOCK(); #endif DPFPRINTF(PF_DEBUG_URGENT, ("pf_test6: pf_get_mtag returned NULL\n")); return (PF_DROP); } if (pd.pf_mtag->flags & PF_TAG_GENERATED) return (PF_PASS); #ifdef __FreeBSD__ /* XXX_IMPORT: later */ #else if (ifp->if_type == IFT_CARP && ifp->if_carpdev) ifp = ifp->if_carpdev; #endif kif = (struct pfi_kif *)ifp->if_pf_kif; if (kif == NULL) { #ifdef __FreeBSD__ PF_UNLOCK(); #endif DPFPRINTF(PF_DEBUG_URGENT, ("pf_test6: kif == NULL, if_xname %s\n", ifp->if_xname)); return (PF_DROP); } if (kif->pfik_flags & PFI_IFLAG_SKIP) { #ifdef __FreeBSD__ PF_UNLOCK(); #endif return (PF_PASS); } #ifdef __FreeBSD__ M_ASSERTPKTHDR(m); #else #ifdef DIAGNOSTIC if ((m->m_flags & M_PKTHDR) == 0) panic("non-M_PKTHDR is passed to pf_test6"); #endif /* DIAGNOSTIC */ #endif #ifdef __FreeBSD__ h = NULL; /* make the compiler happy */ #endif if (m->m_pkthdr.len < (int)sizeof(*h)) { action = PF_DROP; REASON_SET(&reason, PFRES_SHORT); log = 1; goto done; } /* We do IP header normalization and packet reassembly here */ if (pf_normalize_ip6(m0, dir, kif, &reason, &pd) != PF_PASS) { action = PF_DROP; goto done; } m = *m0; h = mtod(m, struct ip6_hdr *); #if 1 /* * we do not support jumbogram yet. if we keep going, zero ip6_plen * will do something bad, so drop the packet for now. */ if (htons(h->ip6_plen) == 0) { action = PF_DROP; REASON_SET(&reason, PFRES_NORM); /*XXX*/ goto done; } #endif pd.src = (struct pf_addr *)&h->ip6_src; pd.dst = (struct pf_addr *)&h->ip6_dst; PF_ACPY(&pd.baddr, dir == PF_OUT ? 
pd.src : pd.dst, AF_INET6); pd.ip_sum = NULL; pd.af = AF_INET6; pd.tos = 0; pd.tot_len = ntohs(h->ip6_plen) + sizeof(struct ip6_hdr); pd.eh = eh; off = ((caddr_t)h - m->m_data) + sizeof(struct ip6_hdr); pd.proto = h->ip6_nxt; do { switch (pd.proto) { case IPPROTO_FRAGMENT: action = pf_test_fragment(&r, dir, kif, m, h, &pd, &a, &ruleset); if (action == PF_DROP) REASON_SET(&reason, PFRES_FRAG); goto done; case IPPROTO_ROUTING: { struct ip6_rthdr rthdr; if (rh_cnt++) { DPFPRINTF(PF_DEBUG_MISC, ("pf: IPv6 more than one rthdr\n")); action = PF_DROP; REASON_SET(&reason, PFRES_IPOPTIONS); log = 1; goto done; } if (!pf_pull_hdr(m, off, &rthdr, sizeof(rthdr), NULL, &reason, pd.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: IPv6 short rthdr\n")); action = PF_DROP; REASON_SET(&reason, PFRES_SHORT); log = 1; goto done; } if (rthdr.ip6r_type == IPV6_RTHDR_TYPE_0) { DPFPRINTF(PF_DEBUG_MISC, ("pf: IPv6 rthdr0\n")); action = PF_DROP; REASON_SET(&reason, PFRES_IPOPTIONS); log = 1; goto done; } /* fallthrough */ } case IPPROTO_AH: case IPPROTO_HOPOPTS: case IPPROTO_DSTOPTS: { /* get next header and header length */ struct ip6_ext opt6; if (!pf_pull_hdr(m, off, &opt6, sizeof(opt6), NULL, &reason, pd.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: IPv6 short opt\n")); action = PF_DROP; log = 1; goto done; } if (pd.proto == IPPROTO_AH) off += (opt6.ip6e_len + 2) * 4; else off += (opt6.ip6e_len + 1) * 8; pd.proto = opt6.ip6e_nxt; /* goto the next header */ break; } default: terminal++; break; } } while (!terminal); /* if there's no routing header, use unmodified mbuf for checksumming */ if (!n) n = m; switch (pd.proto) { case IPPROTO_TCP: { struct tcphdr th; pd.hdr.tcp = &th; if (!pf_pull_hdr(m, off, &th, sizeof(th), &action, &reason, AF_INET6)) { log = action != PF_PASS; goto done; } if (dir == PF_IN && pf_check_proto_cksum(n, off, ntohs(h->ip6_plen) - (off - sizeof(struct ip6_hdr)), IPPROTO_TCP, AF_INET6)) { action = PF_DROP; REASON_SET(&reason, PFRES_PROTCKSUM); goto done; } pd.p_len = pd.tot_len - off - (th.th_off << 2); action = pf_normalize_tcp(dir, kif, m, 0, off, h, &pd); if (action == PF_DROP) goto done; action = pf_test_state_tcp(&s, dir, kif, m, off, h, &pd, &reason); if (action == PF_PASS) { #if NPFSYNC pfsync_update_state(s); #endif /* NPFSYNC */ r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) #ifdef __FreeBSD__ action = pf_test_tcp(&r, &s, dir, kif, m, off, h, &pd, &a, &ruleset, NULL, inp); #else action = pf_test_tcp(&r, &s, dir, kif, m, off, h, &pd, &a, &ruleset, &ip6intrq); #endif break; } case IPPROTO_UDP: { struct udphdr uh; pd.hdr.udp = &uh; if (!pf_pull_hdr(m, off, &uh, sizeof(uh), &action, &reason, AF_INET6)) { log = action != PF_PASS; goto done; } if (dir == PF_IN && uh.uh_sum && pf_check_proto_cksum(n, off, ntohs(h->ip6_plen) - (off - sizeof(struct ip6_hdr)), IPPROTO_UDP, AF_INET6)) { action = PF_DROP; REASON_SET(&reason, PFRES_PROTCKSUM); goto done; } if (uh.uh_dport == 0 || ntohs(uh.uh_ulen) > m->m_pkthdr.len - off || ntohs(uh.uh_ulen) < sizeof(struct udphdr)) { action = PF_DROP; REASON_SET(&reason, PFRES_SHORT); goto done; } action = pf_test_state_udp(&s, dir, kif, m, off, h, &pd); if (action == PF_PASS) { #if NPFSYNC pfsync_update_state(s); #endif /* NPFSYNC */ r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) #ifdef __FreeBSD__ action = pf_test_udp(&r, &s, dir, kif, m, off, h, &pd, &a, &ruleset, NULL, inp); #else action = pf_test_udp(&r, &s, dir, kif, m, off, h, &pd, &a, &ruleset, &ip6intrq); #endif break; } case IPPROTO_ICMPV6: { struct icmp6_hdr ih; 
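/*
 * Illustrative sketch, standalone and not part of this diff: the two
 * routing-header rules enforced above -- at most one routing header per
 * packet and no deprecated Type 0 (source routing) headers at all
 * (RFC 5095).
 */
#include <stdint.h>

struct rthdr {			/* layout matches struct ip6_rthdr */
	uint8_t	nxt;
	uint8_t	len;
	uint8_t	type;
	uint8_t	segleft;
};

/* Returns 0 to accept, nonzero to drop; *rh_cnt persists across headers. */
static int
check_rthdr(const struct rthdr *rh, int *rh_cnt)
{
	if ((*rh_cnt)++ > 0)
		return (1);		/* more than one routing header */
	if (rh->type == 0)
		return (2);		/* drop RH0 outright */
	return (0);
}

int
main(void)
{
	struct rthdr rh2 = { 6, 2, 2, 1 };	/* a type 2 routing header */
	int cnt = 0;

	/* The first type 2 header is accepted, a second one is rejected. */
	return (check_rthdr(&rh2, &cnt) == 0 &&
	    check_rthdr(&rh2, &cnt) != 0 ? 0 : 1);
}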
pd.hdr.icmp6 = &ih; if (!pf_pull_hdr(m, off, &ih, sizeof(ih), &action, &reason, AF_INET6)) { log = action != PF_PASS; goto done; } if (dir == PF_IN && pf_check_proto_cksum(n, off, ntohs(h->ip6_plen) - (off - sizeof(struct ip6_hdr)), IPPROTO_ICMPV6, AF_INET6)) { action = PF_DROP; REASON_SET(&reason, PFRES_PROTCKSUM); goto done; } action = pf_test_state_icmp(&s, dir, kif, m, off, h, &pd, &reason); if (action == PF_PASS) { #if NPFSYNC pfsync_update_state(s); #endif /* NPFSYNC */ r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) #ifdef __FreeBSD__ action = pf_test_icmp(&r, &s, dir, kif, m, off, h, &pd, &a, &ruleset, NULL); #else action = pf_test_icmp(&r, &s, dir, kif, m, off, h, &pd, &a, &ruleset, &ip6intrq); #endif break; } default: action = pf_test_state_other(&s, dir, kif, &pd); if (action == PF_PASS) { #if NPFSYNC pfsync_update_state(s); #endif /* NPFSYNC */ r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) #ifdef __FreeBSD__ action = pf_test_other(&r, &s, dir, kif, m, off, h, &pd, &a, &ruleset, NULL); #else action = pf_test_other(&r, &s, dir, kif, m, off, h, &pd, &a, &ruleset, &ip6intrq); #endif break; } done: /* handle dangerous IPv6 extension headers. */ if (action == PF_PASS && rh_cnt && !((s && s->allow_opts) || r->allow_opts)) { action = PF_DROP; REASON_SET(&reason, PFRES_IPOPTIONS); log = 1; DPFPRINTF(PF_DEBUG_MISC, ("pf: dropping packet with dangerous v6 headers\n")); } if ((s && s->tag) || r->rtableid) pf_tag_packet(m, pd.pf_mtag, s ? s->tag : 0, r->rtableid); #ifdef ALTQ if (action == PF_PASS && r->qid) { if (pd.tos & IPTOS_LOWDELAY) pd.pf_mtag->qid = r->pqid; else pd.pf_mtag->qid = r->qid; /* add hints for ecn */ pd.pf_mtag->af = AF_INET6; pd.pf_mtag->hdr = h; } #endif /* ALTQ */ if (dir == PF_IN && action == PF_PASS && (pd.proto == IPPROTO_TCP || pd.proto == IPPROTO_UDP) && s != NULL && s->nat_rule.ptr != NULL && (s->nat_rule.ptr->action == PF_RDR || s->nat_rule.ptr->action == PF_BINAT) && IN6_IS_ADDR_LOOPBACK(&pd.dst->v6)) pd.pf_mtag->flags |= PF_TAG_TRANSLATE_LOCALHOST; if (log) { struct pf_rule *lr; if (s != NULL && s->nat_rule.ptr != NULL && s->nat_rule.ptr->log & PF_LOG_ALL) lr = s->nat_rule.ptr; else lr = r; PFLOG_PACKET(kif, h, m, AF_INET6, dir, reason, lr, a, ruleset, &pd); } kif->pfik_bytes[1][dir == PF_OUT][action != PF_PASS] += pd.tot_len; kif->pfik_packets[1][dir == PF_OUT][action != PF_PASS]++; if (action == PF_PASS || r->action == PF_DROP) { dirndx = (dir == PF_OUT); r->packets[dirndx]++; r->bytes[dirndx] += pd.tot_len; if (a != NULL) { a->packets[dirndx]++; a->bytes[dirndx] += pd.tot_len; } if (s != NULL) { if (s->nat_rule.ptr != NULL) { s->nat_rule.ptr->packets[dirndx]++; s->nat_rule.ptr->bytes[dirndx] += pd.tot_len; } if (s->src_node != NULL) { s->src_node->packets[dirndx]++; s->src_node->bytes[dirndx] += pd.tot_len; } if (s->nat_src_node != NULL) { s->nat_src_node->packets[dirndx]++; s->nat_src_node->bytes[dirndx] += pd.tot_len; } dirndx = (dir == s->direction) ? 0 : 1; s->packets[dirndx]++; s->bytes[dirndx] += pd.tot_len; } tr = r; nr = (s != NULL) ? s->nat_rule.ptr : pd.nat_rule; if (nr != NULL) { struct pf_addr *x; /* * XXX: we need to make sure that the addresses * passed to pfr_update_stats() are the same than * the addresses used during matching (pfr_match) */ if (r == &pf_default_rule) { tr = nr; x = (s == NULL || s->direction == dir) ? &pd.baddr : &pd.naddr; } else { x = (s == NULL || s->direction == dir) ? 
&pd.naddr : &pd.baddr; } if (x == &pd.baddr || s == NULL) { if (dir == PF_OUT) pd.src = x; else pd.dst = x; } } if (tr->src.addr.type == PF_ADDR_TABLE) pfr_update_stats(tr->src.addr.p.tbl, (s == NULL || s->direction == dir) ? pd.src : pd.dst, pd.af, pd.tot_len, dir == PF_OUT, r->action == PF_PASS, tr->src.neg); if (tr->dst.addr.type == PF_ADDR_TABLE) pfr_update_stats(tr->dst.addr.p.tbl, (s == NULL || s->direction == dir) ? pd.dst : pd.src, pd.af, pd.tot_len, dir == PF_OUT, r->action == PF_PASS, tr->dst.neg); } if (action == PF_SYNPROXY_DROP) { m_freem(*m0); *m0 = NULL; action = PF_PASS; } else if (r->rt) /* pf_route6 can free the mbuf causing *m0 to become NULL */ pf_route6(m0, r, dir, ifp, s, &pd); #ifdef __FreeBSD__ PF_UNLOCK(); #endif return (action); } #endif /* INET6 */ int pf_check_congestion(struct ifqueue *ifq) { #ifdef __FreeBSD__ /* XXX_IMPORT: later */ return (0); #else if (ifq->ifq_congestion) return (1); else return (0); #endif } Index: head/sys/contrib/rdma/rdma_addr.c =================================================================== --- head/sys/contrib/rdma/rdma_addr.c (revision 186118) +++ head/sys/contrib/rdma/rdma_addr.c (revision 186119) @@ -1,408 +1,410 @@ /* * Copyright (c) 2005 Voltaire Inc. All rights reserved. * Copyright (c) 2002-2005, Network Appliance, Inc. All rights reserved. * Copyright (c) 1999-2005, Mellanox Technologies, Inc. All rights reserved. * Copyright (c) 2005 Intel Corporation. All rights reserved. * * This Software is licensed under one of the following licenses: * * 1) under the terms of the "Common Public License 1.0" a copy of which is * available from the Open Source Initiative, see * http://www.opensource.org/licenses/cpl.php. * * 2) under the terms of the "The BSD License" a copy of which is * available from the Open Source Initiative, see * http://www.opensource.org/licenses/bsd-license.php. * * 3) under the terms of the "GNU General Public License (GPL) Version 2" a * copy of which is available from the Open Source Initiative, see * http://www.opensource.org/licenses/gpl-license.php. * * Licensee has the right to choose one of the above licenses. * * Redistributions of source code must retain the above copyright * notice and one of the license notices. * * Redistributions in binary form must reproduce both the above copyright * notice, one of the license notices in the documentation * and/or other materials provided with the distribution. 
* */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include struct addr_req { TAILQ_ENTRY(addr_req) entry; struct sockaddr src_addr; struct sockaddr dst_addr; struct rdma_dev_addr *addr; struct rdma_addr_client *client; void *context; void (*callback)(int status, struct sockaddr *src_addr, struct rdma_dev_addr *addr, void *context); unsigned long timeout; int status; }; static void process_req(void *ctx, int pending); static struct mtx lock; static TAILQ_HEAD(addr_req_list, addr_req) req_list; static struct task addr_task; static struct taskqueue *addr_taskq; static struct callout addr_ch; static eventhandler_tag route_event_tag; static void addr_timeout(void *arg) { taskqueue_enqueue(addr_taskq, &addr_task); } void rdma_addr_register_client(struct rdma_addr_client *client) { mtx_init(&client->lock, "rdma_addr client lock", NULL, MTX_DUPOK|MTX_DEF); cv_init(&client->comp, "rdma_addr cv"); client->refcount = 1; } static inline void put_client(struct rdma_addr_client *client) { mtx_lock(&client->lock); if (--client->refcount == 0) { cv_broadcast(&client->comp); } mtx_unlock(&client->lock); } void rdma_addr_unregister_client(struct rdma_addr_client *client) { put_client(client); mtx_lock(&client->lock); if (client->refcount) { cv_wait(&client->comp, &client->lock); } mtx_unlock(&client->lock); } int rdma_copy_addr(struct rdma_dev_addr *dev_addr, struct ifnet *dev, const unsigned char *dst_dev_addr) { dev_addr->dev_type = RDMA_NODE_RNIC; memcpy(dev_addr->src_dev_addr, IF_LLADDR(dev), MAX_ADDR_LEN); memcpy(dev_addr->broadcast, dev->if_broadcastaddr, MAX_ADDR_LEN); if (dst_dev_addr) memcpy(dev_addr->dst_dev_addr, dst_dev_addr, MAX_ADDR_LEN); return 0; } int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr) { struct ifaddr *ifa; struct sockaddr_in *sin = (struct sockaddr_in *)addr; uint16_t port = sin->sin_port; sin->sin_port = 0; ifa = ifa_ifwithaddr(addr); sin->sin_port = port; if (!ifa) return (EADDRNOTAVAIL); return rdma_copy_addr(dev_addr, ifa->ifa_ifp, NULL); } static void queue_req(struct addr_req *req) { struct addr_req *tmp_req = NULL; mtx_lock(&lock); TAILQ_FOREACH_REVERSE(tmp_req, &req_list, addr_req_list, entry) if (time_after_eq(req->timeout, tmp_req->timeout)) break; if (tmp_req) TAILQ_INSERT_AFTER(&req_list, tmp_req, req, entry); else TAILQ_INSERT_TAIL(&req_list, req, entry); if (TAILQ_FIRST(&req_list) == req) callout_reset(&addr_ch, req->timeout - ticks, addr_timeout, NULL); mtx_unlock(&lock); } #ifdef needed static void addr_send_arp(struct sockaddr_in *dst_in) { struct route iproute; struct sockaddr_in *dst = (struct sockaddr_in *)&iproute.ro_dst; char dmac[ETHER_ADDR_LEN]; + struct llentry *lle; bzero(&iproute, sizeof iproute); *dst = *dst_in; rtalloc(&iproute); if (iproute.ro_rt == NULL); return; arpresolve(iproute.ro_rt->rt_ifp, iproute.ro_rt, NULL, - rt_key(iproute.ro_rt), dmac); + rt_key(iproute.ro_rt), dmac, &lle); RTFREE(iproute.ro_rt); } #endif static int addr_resolve_remote(struct sockaddr_in *src_in, struct sockaddr_in *dst_in, struct rdma_dev_addr *addr) { int ret = 0; struct route iproute; struct sockaddr_in *dst = (struct sockaddr_in *)&iproute.ro_dst; char dmac[ETHER_ADDR_LEN]; + struct llentry *lle; bzero(&iproute, sizeof iproute); *dst = *dst_in; rtalloc(&iproute); if (iproute.ro_rt == NULL) { ret = EHOSTUNREACH; goto out; } /* If the device does ARP internally, return 'done' 
*/ if (iproute.ro_rt->rt_ifp->if_flags & IFF_NOARP) { rdma_copy_addr(addr, iproute.ro_rt->rt_ifp, NULL); goto put; } ret = arpresolve(iproute.ro_rt->rt_ifp, iproute.ro_rt, NULL, - rt_key(iproute.ro_rt), dmac); + rt_key(iproute.ro_rt), dmac, &lle); if (ret) { goto put; } if (!src_in->sin_addr.s_addr) { src_in->sin_len = sizeof *src_in; src_in->sin_family = dst_in->sin_family; src_in->sin_addr.s_addr = ((struct sockaddr_in *)iproute.ro_rt->rt_ifa->ifa_addr)->sin_addr.s_addr; } ret = rdma_copy_addr(addr, iproute.ro_rt->rt_ifp, dmac); put: RTFREE(iproute.ro_rt); out: return ret; } static void process_req(void *ctx, int pending) { struct addr_req *req, *tmp_req; struct sockaddr_in *src_in, *dst_in; TAILQ_HEAD(, addr_req) done_list; TAILQ_INIT(&done_list); mtx_lock(&lock); TAILQ_FOREACH_SAFE(req, &req_list, entry, tmp_req) { if (req->status == EWOULDBLOCK) { src_in = (struct sockaddr_in *) &req->src_addr; dst_in = (struct sockaddr_in *) &req->dst_addr; req->status = addr_resolve_remote(src_in, dst_in, req->addr); if (req->status && time_after_eq(ticks, req->timeout)) req->status = ETIMEDOUT; else if (req->status == EWOULDBLOCK) continue; } TAILQ_REMOVE(&req_list, req, entry); TAILQ_INSERT_TAIL(&done_list, req, entry); } if (!TAILQ_EMPTY(&req_list)) { req = TAILQ_FIRST(&req_list); callout_reset(&addr_ch, req->timeout - ticks, addr_timeout, NULL); } mtx_unlock(&lock); TAILQ_FOREACH_SAFE(req, &done_list, entry, tmp_req) { TAILQ_REMOVE(&done_list, req, entry); req->callback(req->status, &req->src_addr, req->addr, req->context); put_client(req->client); free(req, M_DEVBUF); } } int rdma_resolve_ip(struct rdma_addr_client *client, struct sockaddr *src_addr, struct sockaddr *dst_addr, struct rdma_dev_addr *addr, int timeout_ms, void (*callback)(int status, struct sockaddr *src_addr, struct rdma_dev_addr *addr, void *context), void *context) { struct sockaddr_in *src_in, *dst_in; struct addr_req *req; int ret = 0; req = malloc(sizeof *req, M_DEVBUF, M_NOWAIT); if (!req) return (ENOMEM); memset(req, 0, sizeof *req); if (src_addr) memcpy(&req->src_addr, src_addr, ip_addr_size(src_addr)); memcpy(&req->dst_addr, dst_addr, ip_addr_size(dst_addr)); req->addr = addr; req->callback = callback; req->context = context; req->client = client; mtx_lock(&client->lock); client->refcount++; mtx_unlock(&client->lock); src_in = (struct sockaddr_in *) &req->src_addr; dst_in = (struct sockaddr_in *) &req->dst_addr; req->status = addr_resolve_remote(src_in, dst_in, addr); switch (req->status) { case 0: req->timeout = ticks; queue_req(req); break; case EWOULDBLOCK: req->timeout = msecs_to_ticks(timeout_ms) + ticks; queue_req(req); #ifdef needed addr_send_arp(dst_in); #endif break; default: ret = req->status; mtx_lock(&client->lock); client->refcount--; mtx_unlock(&client->lock); free(req, M_DEVBUF); break; } return ret; } void rdma_addr_cancel(struct rdma_dev_addr *addr) { struct addr_req *req, *tmp_req; mtx_lock(&lock); TAILQ_FOREACH_SAFE(req, &req_list, entry, tmp_req) { if (req->addr == addr) { req->status = ECANCELED; req->timeout = ticks; TAILQ_REMOVE(&req_list, req, entry); TAILQ_INSERT_HEAD(&req_list, req, entry); callout_reset(&addr_ch, req->timeout - ticks, addr_timeout, NULL); break; } } mtx_unlock(&lock); } static void route_event_arp_update(void *unused, struct rtentry *rt0, uint8_t *enaddr, struct sockaddr *sa) { callout_stop(&addr_ch); taskqueue_enqueue(addr_taskq, &addr_task); } static int addr_init(void) { TAILQ_INIT(&req_list); mtx_init(&lock, "rdma_addr req_list lock", NULL, MTX_DEF); addr_taskq = 
taskqueue_create("rdma_addr_taskq", M_NOWAIT, taskqueue_thread_enqueue, &addr_taskq); if (addr_taskq == NULL) { printf("failed to allocate rdma_addr taskqueue\n"); return (ENOMEM); } taskqueue_start_threads(&addr_taskq, 1, PI_NET, "rdma_addr taskq"); TASK_INIT(&addr_task, 0, process_req, NULL); callout_init(&addr_ch, TRUE); route_event_tag = EVENTHANDLER_REGISTER(route_arp_update_event, route_event_arp_update, NULL, EVENTHANDLER_PRI_ANY); return 0; } static void addr_cleanup(void) { EVENTHANDLER_DEREGISTER(route_event_arp_update, route_event_tag); callout_stop(&addr_ch); taskqueue_drain(addr_taskq, &addr_task); taskqueue_free(addr_taskq); } static int addr_load(module_t mod, int cmd, void *arg) { int err = 0; switch (cmd) { case MOD_LOAD: printf("Loading rdma_addr.\n"); addr_init(); break; case MOD_QUIESCE: break; case MOD_UNLOAD: printf("Unloading rdma_addr.\n"); addr_cleanup(); break; case MOD_SHUTDOWN: break; default: err = EOPNOTSUPP; break; } return (err); } static moduledata_t mod_data = { "rdma_addr", addr_load, 0 }; MODULE_VERSION(rdma_addr, 1); DECLARE_MODULE(rdma_addr, mod_data, SI_SUB_EXEC, SI_ORDER_ANY); Index: head/sys/dev/cxgb/ulp/tom/cxgb_l2t.c =================================================================== --- head/sys/dev/cxgb/ulp/tom/cxgb_l2t.c (revision 186118) +++ head/sys/dev/cxgb/ulp/tom/cxgb_l2t.c (revision 186119) @@ -1,538 +1,534 @@ /************************************************************************** Copyright (c) 2007, Chelsio Inc. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Neither the name of the Chelsio Corporation nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ***************************************************************************/ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #if __FreeBSD_version > 700000 #include #endif #include #include #include #include #include #include #include #include #include #include #define VLAN_NONE 0xfff #define SDL(s) ((struct sockaddr_dl *)s) #define RT_ENADDR(sa) ((u_char *)LLADDR(SDL((sa)))) #define rt_expire rt_rmx.rmx_expire struct llinfo_arp { struct callout la_timer; struct rtentry *la_rt; struct mbuf *la_hold; /* last packet until resolved/timeout */ u_short la_preempt; /* countdown for pre-expiry arps */ u_short la_asked; /* # requests sent */ }; /* * Module locking notes: There is a RW lock protecting the L2 table as a * whole plus a spinlock per L2T entry. 
Entry lookups and allocations happen * under the protection of the table lock, individual entry changes happen * while holding that entry's spinlock. The table lock nests outside the * entry locks. Allocations of new entries take the table lock as writers so * no other lookups can happen while allocating new entries. Entry updates * take the table lock as readers so multiple entries can be updated in * parallel. An L2T entry can be dropped by decrementing its reference count * and therefore can happen in parallel with entry allocation but no entry * can change state or increment its ref count during allocation as both of * these perform lookups. */ static inline unsigned int vlan_prio(const struct l2t_entry *e) { return e->vlan >> 13; } static inline unsigned int arp_hash(u32 key, int ifindex, const struct l2t_data *d) { return jhash_2words(key, ifindex, 0) & (d->nentries - 1); } static inline void -neigh_replace(struct l2t_entry *e, struct rtentry *rt) +neigh_replace(struct l2t_entry *e, struct llentry *neigh) { - RT_LOCK(rt); - RT_ADDREF(rt); - RT_UNLOCK(rt); + LLE_WLOCK(neigh); + LLE_ADDREF(neigh); + LLE_WUNLOCK(neigh); if (e->neigh) - RTFREE(e->neigh); - e->neigh = rt; + LLE_FREE(e->neigh); + e->neigh = neigh; } /* * Set up an L2T entry and send any packets waiting in the arp queue. The * supplied mbuf is used for the CPL_L2T_WRITE_REQ. Must be called with the * entry locked. */ static int setup_l2e_send_pending(struct t3cdev *dev, struct mbuf *m, struct l2t_entry *e) { struct cpl_l2t_write_req *req; if (!m) { if ((m = m_gethdr(M_NOWAIT, MT_DATA)) == NULL) return (ENOMEM); } /* * XXX MH_ALIGN */ req = mtod(m, struct cpl_l2t_write_req *); m->m_pkthdr.len = m->m_len = sizeof(*req); req->wr.wr_hi = htonl(V_WR_OP(FW_WROPCODE_FORWARD)); OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_L2T_WRITE_REQ, e->idx)); req->params = htonl(V_L2T_W_IDX(e->idx) | V_L2T_W_IFF(e->smt_idx) | V_L2T_W_VLAN(e->vlan & EVL_VLID_MASK) | V_L2T_W_PRIO(vlan_prio(e))); memcpy(req->dst_mac, e->dmac, sizeof(req->dst_mac)); m_set_priority(m, CPL_PRIORITY_CONTROL); cxgb_ofld_send(dev, m); while (e->arpq_head) { m = e->arpq_head; e->arpq_head = m->m_next; m->m_next = NULL; cxgb_ofld_send(dev, m); } e->arpq_tail = NULL; e->state = L2T_STATE_VALID; return 0; } /* * Add a packet to the an L2T entry's queue of packets awaiting resolution. * Must be called with the entry's lock held. 
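 *
 * Illustrative use, mirroring t3_l2t_send_slow() below:
 *
 *	mtx_lock(&e->lock);
 *	if (e->state == L2T_STATE_RESOLVING)
 *		arpq_enqueue(e, m);
 *	mtx_unlock(&e->lock);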
*/ static inline void arpq_enqueue(struct l2t_entry *e, struct mbuf *m) { m->m_next = NULL; if (e->arpq_head) e->arpq_tail->m_next = m; else e->arpq_head = m; e->arpq_tail = m; } int t3_l2t_send_slow(struct t3cdev *dev, struct mbuf *m, struct l2t_entry *e) { - struct rtentry *rt = e->neigh; + struct llentry *lle = e->neigh; struct sockaddr_in sin; bzero(&sin, sizeof(struct sockaddr_in)); sin.sin_family = AF_INET; sin.sin_len = sizeof(struct sockaddr_in); sin.sin_addr.s_addr = e->addr; CTR2(KTR_CXGB, "send slow on rt=%p eaddr=0x%08x\n", rt, e->addr); again: switch (e->state) { case L2T_STATE_STALE: /* entry is stale, kick off revalidation */ arpresolve(rt->rt_ifp, rt, NULL, - (struct sockaddr *)&sin, e->dmac); + (struct sockaddr *)&sin, e->dmac, &lle); mtx_lock(&e->lock); if (e->state == L2T_STATE_STALE) e->state = L2T_STATE_VALID; mtx_unlock(&e->lock); case L2T_STATE_VALID: /* fast-path, send the packet on */ return cxgb_ofld_send(dev, m); case L2T_STATE_RESOLVING: mtx_lock(&e->lock); if (e->state != L2T_STATE_RESOLVING) { // ARP already completed mtx_unlock(&e->lock); goto again; } arpq_enqueue(e, m); mtx_unlock(&e->lock); /* * Only the first packet added to the arpq should kick off * resolution. However, because the m_gethdr below can fail, * we allow each packet added to the arpq to retry resolution * as a way of recovering from transient memory exhaustion. * A better way would be to use a work request to retry L2T * entries when there's no memory. */ if (arpresolve(rt->rt_ifp, rt, NULL, - (struct sockaddr *)&sin, e->dmac) == 0) { + (struct sockaddr *)&sin, e->dmac, &lle) == 0) { CTR6(KTR_CXGB, "mac=%x:%x:%x:%x:%x:%x\n", e->dmac[0], e->dmac[1], e->dmac[2], e->dmac[3], e->dmac[4], e->dmac[5]); if ((m = m_gethdr(M_NOWAIT, MT_DATA)) == NULL) return (ENOMEM); mtx_lock(&e->lock); if (e->arpq_head) setup_l2e_send_pending(dev, m, e); else m_freem(m); mtx_unlock(&e->lock); } } return 0; } void t3_l2t_send_event(struct t3cdev *dev, struct l2t_entry *e) { - struct rtentry *rt; struct mbuf *m0; struct sockaddr_in sin; sin.sin_family = AF_INET; sin.sin_len = sizeof(struct sockaddr_in); sin.sin_addr.s_addr = e->addr; + struct llentry *lle; if ((m0 = m_gethdr(M_NOWAIT, MT_DATA)) == NULL) return; rt = e->neigh; again: switch (e->state) { case L2T_STATE_STALE: /* entry is stale, kick off revalidation */ arpresolve(rt->rt_ifp, rt, NULL, - (struct sockaddr *)&sin, e->dmac); + (struct sockaddr *)&sin, e->dmac, &lle); mtx_lock(&e->lock); if (e->state == L2T_STATE_STALE) { e->state = L2T_STATE_VALID; } mtx_unlock(&e->lock); return; case L2T_STATE_VALID: /* fast-path, send the packet on */ return; case L2T_STATE_RESOLVING: mtx_lock(&e->lock); if (e->state != L2T_STATE_RESOLVING) { // ARP already completed mtx_unlock(&e->lock); goto again; } mtx_unlock(&e->lock); /* * Only the first packet added to the arpq should kick off * resolution. However, because the alloc_skb below can fail, * we allow each packet added to the arpq to retry resolution * as a way of recovering from transient memory exhaustion. * A better way would be to use a work request to retry L2T * entries when there's no memory. */ arpresolve(rt->rt_ifp, rt, NULL, - (struct sockaddr *)&sin, e->dmac); + (struct sockaddr *)&sin, e->dmac, &lle); } return; } /* * Allocate a free L2T entry. Must be called with l2t_data.lock held. 
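 *
 * Per the locking notes above, the caller (t3_l2t_get() below) holds
 * d->lock as a writer around this call, so the linear scan of l2tab[]
 * and the hash-chain unlink cannot race with concurrent lookups.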
*/ static struct l2t_entry * alloc_l2e(struct l2t_data *d) { struct l2t_entry *end, *e, **p; if (!atomic_load_acq_int(&d->nfree)) return NULL; /* there's definitely a free entry */ for (e = d->rover, end = &d->l2tab[d->nentries]; e != end; ++e) if (atomic_load_acq_int(&e->refcnt) == 0) goto found; for (e = &d->l2tab[1]; atomic_load_acq_int(&e->refcnt); ++e) ; found: d->rover = e + 1; atomic_add_int(&d->nfree, -1); /* * The entry we found may be an inactive entry that is * presently in the hash table. We need to remove it. */ if (e->state != L2T_STATE_UNUSED) { int hash = arp_hash(e->addr, e->ifindex, d); for (p = &d->l2tab[hash].first; *p; p = &(*p)->next) if (*p == e) { *p = e->next; break; } e->state = L2T_STATE_UNUSED; } return e; } /* * Called when an L2T entry has no more users. The entry is left in the hash * table since it is likely to be reused but we also bump nfree to indicate * that the entry can be reallocated for a different neighbor. We also drop * the existing neighbor reference in case the neighbor is going away and is * waiting on our reference. * * Because entries can be reallocated to other neighbors once their ref count * drops to 0 we need to take the entry's lock to avoid races with a new * incarnation. */ void t3_l2e_free(struct l2t_data *d, struct l2t_entry *e) { - struct rtentry *rt = NULL; - + struct llentry *lle; + mtx_lock(&e->lock); if (atomic_load_acq_int(&e->refcnt) == 0) { /* hasn't been recycled */ - rt = e->neigh; + lle = e->neigh; e->neigh = NULL; } mtx_unlock(&e->lock); atomic_add_int(&d->nfree, 1); - if (rt) - RTFREE(rt); + if (lle) + LLE_FREE(lle); } /* * Update an L2T entry that was previously used for the same next hop as neigh. * Must be called with softirqs disabled. */ static inline void -reuse_entry(struct l2t_entry *e, struct rtentry *neigh) +reuse_entry(struct l2t_entry *e, struct llentry *neigh) { - struct llinfo_arp *la; - la = (struct llinfo_arp *)neigh->rt_llinfo; - mtx_lock(&e->lock); /* avoid race with t3_l2t_free */ if (neigh != e->neigh) neigh_replace(e, neigh); if (memcmp(e->dmac, RT_ENADDR(neigh->rt_gateway), sizeof(e->dmac)) || (neigh->rt_expire > time_uptime)) e->state = L2T_STATE_RESOLVING; else if (la->la_hold == NULL) e->state = L2T_STATE_VALID; else e->state = L2T_STATE_STALE; mtx_unlock(&e->lock); } struct l2t_entry * -t3_l2t_get(struct t3cdev *dev, struct rtentry *neigh, struct ifnet *ifp, +t3_l2t_get(struct t3cdev *dev, struct llentry *neigh, struct ifnet *ifp, struct sockaddr *sa) { struct l2t_entry *e; struct l2t_data *d = L2DATA(dev); u32 addr = ((struct sockaddr_in *)sa)->sin_addr.s_addr; - int ifidx = neigh->rt_ifp->if_index; + int ifidx = ifp->if_index; int hash = arp_hash(addr, ifidx, d); unsigned int smt_idx = ((struct port_info *)ifp->if_softc)->port_id; rw_wlock(&d->lock); for (e = d->l2tab[hash].first; e; e = e->next) if (e->addr == addr && e->ifindex == ifidx && e->smt_idx == smt_idx) { l2t_hold(d, e); if (atomic_load_acq_int(&e->refcnt) == 1) reuse_entry(e, neigh); goto done; } /* Need to allocate a new entry */ e = alloc_l2e(d); if (e) { mtx_lock(&e->lock); /* avoid race with t3_l2t_free */ e->next = d->l2tab[hash].first; d->l2tab[hash].first = e; rw_wunlock(&d->lock); e->state = L2T_STATE_RESOLVING; e->addr = addr; e->ifindex = ifidx; e->smt_idx = smt_idx; atomic_store_rel_int(&e->refcnt, 1); e->neigh = NULL; neigh_replace(e, neigh); #ifdef notyet /* * XXX need to add accessor function for vlan tag */ if (neigh->rt_ifp->if_vlantrunk) e->vlan = VLAN_DEV_INFO(neigh->dev)->vlan_id; else #endif e->vlan = VLAN_NONE; 
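	/*
	 * At this point the new entry is published in L2T_STATE_RESOLVING;
	 * a later t3_l2t_update() call (typically driven by an ARP update
	 * notification) copies the resolved MAC into e->dmac, marks the
	 * entry L2T_STATE_VALID and pushes any mbufs queued on its arpq.
	 */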
mtx_unlock(&e->lock); return (e); } done: rw_wunlock(&d->lock); return e; } /* * Called when address resolution fails for an L2T entry to handle packets * on the arpq head. If a packet specifies a failure handler it is invoked, * otherwise the packets is sent to the TOE. * * XXX: maybe we should abandon the latter behavior and just require a failure * handler. */ static void handle_failed_resolution(struct t3cdev *dev, struct mbuf *arpq) { while (arpq) { struct mbuf *m = arpq; #ifdef notyet struct l2t_mbuf_cb *cb = L2T_MBUF_CB(m); #endif arpq = m->m_next; m->m_next = NULL; #ifdef notyet if (cb->arp_failure_handler) cb->arp_failure_handler(dev, m); else #endif cxgb_ofld_send(dev, m); } } void -t3_l2t_update(struct t3cdev *dev, struct rtentry *neigh, +t3_l2t_update(struct t3cdev *dev, struct llentry *neigh, uint8_t *enaddr, struct sockaddr *sa) { struct l2t_entry *e; struct mbuf *arpq = NULL; struct l2t_data *d = L2DATA(dev); u32 addr = *(u32 *) &((struct sockaddr_in *)sa)->sin_addr; - int ifidx = neigh->rt_ifp->if_index; int hash = arp_hash(addr, ifidx, d); struct llinfo_arp *la; rw_rlock(&d->lock); for (e = d->l2tab[hash].first; e; e = e->next) - if (e->addr == addr && e->ifindex == ifidx) { + if (e->addr == addr) { mtx_lock(&e->lock); goto found; } rw_runlock(&d->lock); CTR1(KTR_CXGB, "t3_l2t_update: addr=0x%08x not found", addr); return; found: printf("found 0x%08x\n", addr); rw_runlock(&d->lock); memcpy(e->dmac, enaddr, ETHER_ADDR_LEN); printf("mac=%x:%x:%x:%x:%x:%x\n", e->dmac[0], e->dmac[1], e->dmac[2], e->dmac[3], e->dmac[4], e->dmac[5]); if (atomic_load_acq_int(&e->refcnt)) { if (neigh != e->neigh) neigh_replace(e, neigh); la = (struct llinfo_arp *)neigh->rt_llinfo; if (e->state == L2T_STATE_RESOLVING) { if (la->la_asked >= 5 /* arp_maxtries */) { arpq = e->arpq_head; e->arpq_head = e->arpq_tail = NULL; } else setup_l2e_send_pending(dev, NULL, e); } else { e->state = L2T_STATE_VALID; if (memcmp(e->dmac, RT_ENADDR(neigh->rt_gateway), 6)) setup_l2e_send_pending(dev, NULL, e); } } mtx_unlock(&e->lock); if (arpq) handle_failed_resolution(dev, arpq); } struct l2t_data * t3_init_l2t(unsigned int l2t_capacity) { struct l2t_data *d; int i, size = sizeof(*d) + l2t_capacity * sizeof(struct l2t_entry); d = cxgb_alloc_mem(size); if (!d) return NULL; d->nentries = l2t_capacity; d->rover = &d->l2tab[1]; /* entry 0 is not used */ atomic_store_rel_int(&d->nfree, l2t_capacity - 1); rw_init(&d->lock, "L2T"); for (i = 0; i < l2t_capacity; ++i) { d->l2tab[i].idx = i; d->l2tab[i].state = L2T_STATE_UNUSED; mtx_init(&d->l2tab[i].lock, "L2TAB", NULL, MTX_DEF); atomic_store_rel_int(&d->l2tab[i].refcnt, 0); } return d; } void t3_free_l2t(struct l2t_data *d) { int i; rw_destroy(&d->lock); for (i = 0; i < d->nentries; ++i) mtx_destroy(&d->l2tab[i].lock); cxgb_free_mem(d); } Index: head/sys/dev/cxgb/ulp/tom/cxgb_l2t.h =================================================================== --- head/sys/dev/cxgb/ulp/tom/cxgb_l2t.h (revision 186118) +++ head/sys/dev/cxgb/ulp/tom/cxgb_l2t.h (revision 186119) @@ -1,161 +1,161 @@ /************************************************************************** Copyright (c) 2007-2008, Chelsio Inc. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. 
Neither the name of the Chelsio Corporation nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. $FreeBSD$ ***************************************************************************/ #ifndef _CHELSIO_L2T_H #define _CHELSIO_L2T_H #include #include #if __FreeBSD_version > 700000 #include #else #define rwlock mtx #define rw_wlock(x) mtx_lock((x)) #define rw_wunlock(x) mtx_unlock((x)) #define rw_rlock(x) mtx_lock((x)) #define rw_runlock(x) mtx_unlock((x)) #define rw_init(x, str) mtx_init((x), (str), NULL, MTX_DEF) #define rw_destroy(x) mtx_destroy((x)) #endif enum { L2T_STATE_VALID, /* entry is up to date */ L2T_STATE_STALE, /* entry may be used but needs revalidation */ L2T_STATE_RESOLVING, /* entry needs address resolution */ L2T_STATE_UNUSED /* entry not in use */ }; /* * Each L2T entry plays multiple roles. First of all, it keeps state for the * corresponding entry of the HW L2 table and maintains a queue of offload * packets awaiting address resolution. Second, it is a node of a hash table * chain, where the nodes of the chain are linked together through their next * pointer. Finally, each node is a bucket of a hash table, pointing to the * first element in its chain through its first pointer. */ struct l2t_entry { uint16_t state; /* entry state */ uint16_t idx; /* entry index */ uint32_t addr; /* dest IP address */ int ifindex; /* neighbor's net_device's ifindex */ uint16_t smt_idx; /* SMT index */ uint16_t vlan; /* VLAN TCI (id: bits 0-11, prio: 13-15 */ - struct rtentry *neigh; /* associated neighbour */ + struct llentry *neigh; /* associated neighbour */ struct l2t_entry *first; /* start of hash chain */ struct l2t_entry *next; /* next l2t_entry on chain */ struct mbuf *arpq_head; /* queue of packets awaiting resolution */ struct mbuf *arpq_tail; struct mtx lock; volatile uint32_t refcnt; /* entry reference count */ uint8_t dmac[6]; /* neighbour's MAC address */ }; struct l2t_data { unsigned int nentries; /* number of entries */ struct l2t_entry *rover; /* starting point for next allocation */ volatile uint32_t nfree; /* number of free entries */ struct rwlock lock; struct l2t_entry l2tab[0]; }; typedef void (*arp_failure_handler_func)(struct t3cdev *dev, struct mbuf *m); typedef void (*opaque_arp_failure_handler_func)(void *dev, struct mbuf *m); /* * Callback stored in an skb to handle address resolution failure. 
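 *
 * Intended use, sketched with an illustrative handler name (note the
 * dispatch path in handle_failed_resolution() in cxgb_l2t.c is still
 * under "#ifdef notyet"):
 *
 *	static void my_arp_failure(struct t3cdev *dev, struct mbuf *m)
 *	{
 *		m_freem(m);
 *	}
 *	...
 *	set_arp_failure_handler(m, my_arp_failure);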
*/ struct l2t_mbuf_cb { arp_failure_handler_func arp_failure_handler; }; /* * XXX */ #define L2T_MBUF_CB(skb) ((struct l2t_mbuf_cb *)(skb)->cb) static __inline void set_arp_failure_handler(struct mbuf *m, arp_failure_handler_func hnd) { m->m_pkthdr.header = (opaque_arp_failure_handler_func)hnd; } /* * Getting to the L2 data from an offload device. */ #define L2DATA(dev) ((dev)->l2opt) void t3_l2e_free(struct l2t_data *d, struct l2t_entry *e); void t3_l2t_update(struct t3cdev *dev, struct rtentry *rt, uint8_t *enaddr, struct sockaddr *sa); struct l2t_entry *t3_l2t_get(struct t3cdev *dev, struct rtentry *neigh, struct ifnet *ifp, struct sockaddr *sa); int t3_l2t_send_slow(struct t3cdev *dev, struct mbuf *m, struct l2t_entry *e); void t3_l2t_send_event(struct t3cdev *dev, struct l2t_entry *e); struct l2t_data *t3_init_l2t(unsigned int l2t_capacity); void t3_free_l2t(struct l2t_data *d); #ifdef CONFIG_PROC_FS int t3_l2t_proc_setup(struct proc_dir_entry *dir, struct l2t_data *d); void t3_l2t_proc_free(struct proc_dir_entry *dir); #else #define l2t_proc_setup(dir, d) 0 #define l2t_proc_free(dir) #endif int cxgb_ofld_send(struct t3cdev *dev, struct mbuf *m); static inline int l2t_send(struct t3cdev *dev, struct mbuf *m, struct l2t_entry *e) { if (__predict_true(e->state == L2T_STATE_VALID)) { return cxgb_ofld_send(dev, (struct mbuf *)m); } return t3_l2t_send_slow(dev, (struct mbuf *)m, e); } static inline void l2t_release(struct l2t_data *d, struct l2t_entry *e) { if (atomic_fetchadd_int(&e->refcnt, -1) == 1) t3_l2e_free(d, e); } static inline void l2t_hold(struct l2t_data *d, struct l2t_entry *e) { if (atomic_fetchadd_int(&e->refcnt, 1) == 1) /* 0 -> 1 transition */ atomic_add_int(&d->nfree, 1); } #endif Index: head/sys/modules/cxgb/Makefile =================================================================== --- head/sys/modules/cxgb/Makefile (revision 186118) +++ head/sys/modules/cxgb/Makefile (revision 186119) @@ -1,39 +1,39 @@ # $FreeBSD$ SUBDIR= cxgb SUBDIR+= ${_toecore} SUBDIR+= ${_tom} SUBDIR+= ${_iw_cxgb} SUBDIR+= cxgb_t3fw .if defined(SYSDIR) _sysdir = ${SYSDIR} .endif # Based on bsd.kmod.mk but we don't modify SYSDIR in this one. .for _dir in ${.CURDIR}/../.. ${.CURDIR}/../../.. ${.CURDIR}/../../../.. \ /sys /usr/src/sys .if !defined(_sysdir) && exists(${_dir}/kern/) && exists(${_dir}/conf/kmod.mk) _sysdir = ${_dir} .endif .endfor .if !defined(_sysdir) || !exists(${_sysdir}/kern/) || \ !exists(${_sysdir}/conf/kmod.mk) .error "can't find kernel source tree" .endif _toe_header = ${_sysdir}/netinet/toedev.h .if exists(${_toe_header}) _toecore = toecore -_tom = tom +#_tom = tom .endif .if ${MACHINE_ARCH} == "i386" && exists(${_toe_header}) _iw_cxgb = iw_cxgb .endif .if ${MACHINE_ARCH} == "amd64" && exists(${_toe_header}) _iw_cxgb = iw_cxgb .endif .include Index: head/sys/net/if.c =================================================================== --- head/sys/net/if.c (revision 186118) +++ head/sys/net/if.c (revision 186119) @@ -1,2903 +1,2907 @@ /*- * Copyright (c) 1980, 1986, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)if.c 8.5 (Berkeley) 1/9/95 * $FreeBSD$ */ #include "opt_compat.h" #include "opt_inet6.h" #include "opt_inet.h" #include "opt_mac.h" #include "opt_carp.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #include #include #include #include #include #if defined(INET) || defined(INET6) /*XXX*/ #include #include #ifdef INET6 #include #include #endif #endif #ifdef INET #include #include #endif #ifdef DEV_CARP #include #endif #include #ifndef VIMAGE #ifndef VIMAGE_GLOBALS struct vnet_net vnet_net_0; #endif #endif SYSCTL_NODE(_net, PF_LINK, link, CTLFLAG_RW, 0, "Link layers"); SYSCTL_NODE(_net_link, 0, generic, CTLFLAG_RW, 0, "Generic link-management"); /* Log link state change events */ static int log_link_state_change = 1; SYSCTL_INT(_net_link, OID_AUTO, log_link_state_change, CTLFLAG_RW, &log_link_state_change, 0, "log interface link state change events"); void (*bstp_linkstate_p)(struct ifnet *ifp, int state); void (*ng_ether_link_state_p)(struct ifnet *ifp, int state); void (*lagg_linkstate_p)(struct ifnet *ifp, int state); struct mbuf *(*tbr_dequeue_ptr)(struct ifaltq *, int) = NULL; /* * XXX: Style; these should be sorted alphabetically, and unprototyped * static functions should be prototyped. Currently they are sorted by * declaration order. 
*/ static void if_attachdomain(void *); static void if_attachdomain1(struct ifnet *); static int ifconf(u_long, caddr_t); static void if_freemulti(struct ifmultiaddr *); static void if_grow(void); static void if_init(void *); static void if_qflush(struct ifnet *); static void if_route(struct ifnet *, int flag, int fam); static int if_setflag(struct ifnet *, int, int, int *, int); static void if_slowtimo(void *); static int if_transmit(struct ifnet *ifp, struct mbuf *m); static void if_unroute(struct ifnet *, int flag, int fam); static void link_rtrequest(int, struct rtentry *, struct rt_addrinfo *); static int if_rtdel(struct radix_node *, void *); static int ifhwioctl(u_long, struct ifnet *, caddr_t, struct thread *); static int if_delmulti_locked(struct ifnet *, struct ifmultiaddr *, int); static void if_start_deferred(void *context, int pending); static void do_link_state_change(void *, int); static int if_getgroup(struct ifgroupreq *, struct ifnet *); static int if_getgroupmembers(struct ifgroupreq *); #ifdef INET6 /* * XXX: declare here to avoid to include many inet6 related files.. * should be more generalized? */ extern void nd6_setmtu(struct ifnet *); #endif #ifdef VIMAGE_GLOBALS struct ifnethead ifnet; /* depend on static init XXX */ struct ifgrouphead ifg_head; int if_index; static int if_indexlim; /* Table of ifnet/cdev by index. Locked with ifnet_lock. */ static struct ifindex_entry *ifindex_table; static struct knlist ifklist; #endif int ifqmaxlen = IFQ_MAXLEN; struct mtx ifnet_lock; static if_com_alloc_t *if_com_alloc[256]; static if_com_free_t *if_com_free[256]; static void filt_netdetach(struct knote *kn); static int filt_netdev(struct knote *kn, long hint); static struct filterops netdev_filtops = { 1, NULL, filt_netdetach, filt_netdev }; #ifndef VIMAGE_GLOBALS static struct vnet_symmap vnet_net_symmap[] = { VNET_SYMMAP(net, ifnet), VNET_SYMMAP(net, rt_tables), VNET_SYMMAP(net, rtstat), VNET_SYMMAP(net, rttrash), VNET_SYMMAP_END }; VNET_MOD_DECLARE(NET, net, vnet_net_iattach, vnet_net_idetach, NONE, vnet_net_symmap) #endif /* * System initialization */ SYSINIT(interfaces, SI_SUB_INIT_IF, SI_ORDER_FIRST, if_init, NULL); SYSINIT(interface_check, SI_SUB_PROTO_IF, SI_ORDER_FIRST, if_slowtimo, NULL); MALLOC_DEFINE(M_IFNET, "ifnet", "interface internals"); MALLOC_DEFINE(M_IFADDR, "ifaddr", "interface address"); MALLOC_DEFINE(M_IFMADDR, "ether_multi", "link-level multicast address"); struct ifnet * ifnet_byindex(u_short idx) { INIT_VNET_NET(curvnet); struct ifnet *ifp; IFNET_RLOCK(); ifp = V_ifindex_table[idx].ife_ifnet; IFNET_RUNLOCK(); return (ifp); } static void ifnet_setbyindex(u_short idx, struct ifnet *ifp) { INIT_VNET_NET(curvnet); IFNET_WLOCK_ASSERT(); V_ifindex_table[idx].ife_ifnet = ifp; } struct ifaddr * ifaddr_byindex(u_short idx) { struct ifaddr *ifa; IFNET_RLOCK(); ifa = ifnet_byindex(idx)->if_addr; IFNET_RUNLOCK(); return (ifa); } struct cdev * ifdev_byindex(u_short idx) { INIT_VNET_NET(curvnet); struct cdev *cdev; IFNET_RLOCK(); cdev = V_ifindex_table[idx].ife_dev; IFNET_RUNLOCK(); return (cdev); } static void ifdev_setbyindex(u_short idx, struct cdev *cdev) { INIT_VNET_NET(curvnet); IFNET_WLOCK(); V_ifindex_table[idx].ife_dev = cdev; IFNET_WUNLOCK(); } static d_open_t netopen; static d_close_t netclose; static d_ioctl_t netioctl; static d_kqfilter_t netkqfilter; static struct cdevsw net_cdevsw = { .d_version = D_VERSION, .d_flags = D_NEEDGIANT, .d_open = netopen, .d_close = netclose, .d_ioctl = netioctl, .d_name = "net", .d_kqfilter = netkqfilter, }; static int 
netopen(struct cdev *dev, int flag, int mode, struct thread *td) { return (0); } static int netclose(struct cdev *dev, int flags, int fmt, struct thread *td) { return (0); } static int netioctl(struct cdev *dev, u_long cmd, caddr_t data, int flag, struct thread *td) { struct ifnet *ifp; int error, idx; /* only support interface specific ioctls */ if (IOCGROUP(cmd) != 'i') return (EOPNOTSUPP); idx = dev2unit(dev); if (idx == 0) { /* * special network device, not interface. */ if (cmd == SIOCGIFCONF) return (ifconf(cmd, data)); /* XXX remove cmd */ #ifdef __amd64__ if (cmd == SIOCGIFCONF32) return (ifconf(cmd, data)); /* XXX remove cmd */ #endif return (EOPNOTSUPP); } ifp = ifnet_byindex(idx); if (ifp == NULL) return (ENXIO); error = ifhwioctl(cmd, ifp, data, td); if (error == ENOIOCTL) error = EOPNOTSUPP; return (error); } static int netkqfilter(struct cdev *dev, struct knote *kn) { INIT_VNET_NET(curvnet); struct knlist *klist; struct ifnet *ifp; int idx; switch (kn->kn_filter) { case EVFILT_NETDEV: kn->kn_fop = &netdev_filtops; break; default: return (EINVAL); } idx = dev2unit(dev); if (idx == 0) { klist = &V_ifklist; } else { ifp = ifnet_byindex(idx); if (ifp == NULL) return (1); klist = &ifp->if_klist; } kn->kn_hook = (caddr_t)klist; knlist_add(klist, kn, 0); return (0); } static void filt_netdetach(struct knote *kn) { struct knlist *klist = (struct knlist *)kn->kn_hook; knlist_remove(klist, kn, 0); } static int filt_netdev(struct knote *kn, long hint) { struct knlist *klist = (struct knlist *)kn->kn_hook; /* * Currently NOTE_EXIT is abused to indicate device detach. */ if (hint == NOTE_EXIT) { kn->kn_data = NOTE_LINKINV; kn->kn_flags |= (EV_EOF | EV_ONESHOT); knlist_remove_inevent(klist, kn); return (1); } if (hint != 0) kn->kn_data = hint; /* current status */ if (kn->kn_sfflags & hint) kn->kn_fflags |= hint; return (kn->kn_fflags != 0); } /* * Network interface utility routines. * * Routines with ifa_ifwith* names take sockaddr *'s as * parameters. */ /* ARGSUSED*/ static void if_init(void *dummy __unused) { INIT_VNET_NET(curvnet); #ifndef VIMAGE_GLOBALS vnet_mod_register(&vnet_net_modinfo); #endif V_if_index = 0; V_ifindex_table = NULL; V_if_indexlim = 8; IFNET_LOCK_INIT(); TAILQ_INIT(&V_ifnet); TAILQ_INIT(&V_ifg_head); knlist_init(&V_ifklist, NULL, NULL, NULL, NULL); if_grow(); /* create initial table */ ifdev_setbyindex(0, make_dev(&net_cdevsw, 0, UID_ROOT, GID_WHEEL, 0600, "network")); if_clone_init(); } static void if_grow(void) { INIT_VNET_NET(curvnet); u_int n; struct ifindex_entry *e; V_if_indexlim <<= 1; n = V_if_indexlim * sizeof(*e); e = malloc(n, M_IFNET, M_WAITOK | M_ZERO); if (V_ifindex_table != NULL) { memcpy((caddr_t)e, (caddr_t)V_ifindex_table, n/2); free((caddr_t)V_ifindex_table, M_IFNET); } V_ifindex_table = e; } /* * Allocate a struct ifnet and an index for an interface. A layer 2 * common structure will also be allocated if an allocation routine is * registered for the passed type. */ struct ifnet* if_alloc(u_char type) { INIT_VNET_NET(curvnet); struct ifnet *ifp; ifp = malloc(sizeof(struct ifnet), M_IFNET, M_WAITOK|M_ZERO); /* * Try to find an empty slot below if_index. If we fail, take * the next slot. * * XXX: should be locked! */ for (ifp->if_index = 1; ifp->if_index <= V_if_index; ifp->if_index++) { if (ifnet_byindex(ifp->if_index) == NULL) break; } /* Catch if_index overflow. 
*/ if (ifp->if_index < 1) { free(ifp, M_IFNET); return (NULL); } if (ifp->if_index > V_if_index) V_if_index = ifp->if_index; if (V_if_index >= V_if_indexlim) if_grow(); ifp->if_type = type; if (if_com_alloc[type] != NULL) { ifp->if_l2com = if_com_alloc[type](type, ifp); if (ifp->if_l2com == NULL) { free(ifp, M_IFNET); return (NULL); } } IFNET_WLOCK(); ifnet_setbyindex(ifp->if_index, ifp); IFNET_WUNLOCK(); IF_ADDR_LOCK_INIT(ifp); return (ifp); } /* * Free the struct ifnet, the associated index, and the layer 2 common * structure if needed. All the work is done in if_free_type(). * * Do not add code to this function! Add it to if_free_type(). */ void if_free(struct ifnet *ifp) { if_free_type(ifp, ifp->if_type); } /* * Do the actual work of freeing a struct ifnet, associated index, and * layer 2 common structure. This version should only be called by * intefaces that switch their type after calling if_alloc(). */ void if_free_type(struct ifnet *ifp, u_char type) { INIT_VNET_NET(curvnet); /* ifp->if_vnet can be NULL here ! */ if (ifp != ifnet_byindex(ifp->if_index)) { if_printf(ifp, "%s: value was not if_alloced, skipping\n", __func__); return; } IFNET_WLOCK(); ifnet_setbyindex(ifp->if_index, NULL); /* XXX: should be locked with if_findindex() */ while (V_if_index > 0 && ifnet_byindex(V_if_index) == NULL) V_if_index--; IFNET_WUNLOCK(); if (if_com_free[type] != NULL) if_com_free[type](ifp->if_l2com, type); IF_ADDR_LOCK_DESTROY(ifp); free(ifp, M_IFNET); }; void ifq_attach(struct ifaltq *ifq, struct ifnet *ifp) { mtx_init(&ifq->ifq_mtx, ifp->if_xname, "if send queue", MTX_DEF); if (ifq->ifq_maxlen == 0) ifq->ifq_maxlen = ifqmaxlen; ifq->altq_type = 0; ifq->altq_disc = NULL; ifq->altq_flags &= ALTQF_CANTCHANGE; ifq->altq_tbr = NULL; ifq->altq_ifp = ifp; } void ifq_detach(struct ifaltq *ifq) { mtx_destroy(&ifq->ifq_mtx); } /* * Perform generic interface initalization tasks and attach the interface * to the list of "active" interfaces. * * XXX: * - The decision to return void and thus require this function to * succeed is questionable. * - We do more initialization here then is probably a good idea. * Some of this should probably move to if_alloc(). * - We should probably do more sanity checking. For instance we don't * do anything to insure if_xname is unique or non-empty. 
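 *
 * For reference, the usual driver-side ordering around this function
 * (names are illustrative; Ethernet drivers normally reach if_attach()
 * via ether_ifattach()):
 *
 *	ifp = if_alloc(IFT_ETHER);
 *	if_initname(ifp, "foo", unit);
 *	ifp->if_ioctl = foo_ioctl;
 *	ifp->if_init = foo_init;
 *	ether_ifattach(ifp, lladdr);
 *	...
 *	ether_ifdetach(ifp);
 *	if_free(ifp);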
*/ void if_attach(struct ifnet *ifp) { INIT_VNET_NET(curvnet); unsigned socksize, ifasize; int namelen, masklen; struct sockaddr_dl *sdl; struct ifaddr *ifa; if (ifp->if_index == 0 || ifp != ifnet_byindex(ifp->if_index)) panic ("%s: BUG: if_attach called without if_alloc'd input()\n", ifp->if_xname); TASK_INIT(&ifp->if_starttask, 0, if_start_deferred, ifp); TASK_INIT(&ifp->if_linktask, 0, do_link_state_change, ifp); IF_AFDATA_LOCK_INIT(ifp); ifp->if_afdata_initialized = 0; TAILQ_INIT(&ifp->if_addrhead); TAILQ_INIT(&ifp->if_prefixhead); TAILQ_INIT(&ifp->if_multiaddrs); TAILQ_INIT(&ifp->if_groups); if_addgroup(ifp, IFG_ALL); knlist_init(&ifp->if_klist, NULL, NULL, NULL, NULL); getmicrotime(&ifp->if_lastchange); ifp->if_data.ifi_epoch = time_uptime; ifp->if_data.ifi_datalen = sizeof(struct if_data); ifp->if_transmit = if_transmit; ifp->if_qflush = if_qflush; #ifdef MAC mac_ifnet_init(ifp); mac_ifnet_create(ifp); #endif ifdev_setbyindex(ifp->if_index, make_dev(&net_cdevsw, ifp->if_index, UID_ROOT, GID_WHEEL, 0600, "%s/%s", net_cdevsw.d_name, ifp->if_xname)); make_dev_alias(ifdev_byindex(ifp->if_index), "%s%d", net_cdevsw.d_name, ifp->if_index); ifq_attach(&ifp->if_snd, ifp); /* * create a Link Level name for this device */ namelen = strlen(ifp->if_xname); /* * Always save enough space for any possiable name so we can do * a rename in place later. */ masklen = offsetof(struct sockaddr_dl, sdl_data[0]) + IFNAMSIZ; socksize = masklen + ifp->if_addrlen; if (socksize < sizeof(*sdl)) socksize = sizeof(*sdl); socksize = roundup2(socksize, sizeof(long)); ifasize = sizeof(*ifa) + 2 * socksize; ifa = malloc(ifasize, M_IFADDR, M_WAITOK | M_ZERO); IFA_LOCK_INIT(ifa); sdl = (struct sockaddr_dl *)(ifa + 1); sdl->sdl_len = socksize; sdl->sdl_family = AF_LINK; bcopy(ifp->if_xname, sdl->sdl_data, namelen); sdl->sdl_nlen = namelen; sdl->sdl_index = ifp->if_index; sdl->sdl_type = ifp->if_type; ifp->if_addr = ifa; ifa->ifa_ifp = ifp; ifa->ifa_rtrequest = link_rtrequest; ifa->ifa_addr = (struct sockaddr *)sdl; sdl = (struct sockaddr_dl *)(socksize + (caddr_t)sdl); ifa->ifa_netmask = (struct sockaddr *)sdl; sdl->sdl_len = masklen; while (namelen != 0) sdl->sdl_data[--namelen] = 0xff; ifa->ifa_refcnt = 1; TAILQ_INSERT_HEAD(&ifp->if_addrhead, ifa, ifa_link); ifp->if_broadcastaddr = NULL; /* reliably crash if used uninitialized */ IFNET_WLOCK(); TAILQ_INSERT_TAIL(&V_ifnet, ifp, if_link); IFNET_WUNLOCK(); if (domain_init_status >= 2) if_attachdomain1(ifp); EVENTHANDLER_INVOKE(ifnet_arrival_event, ifp); devctl_notify("IFNET", ifp->if_xname, "ATTACH", NULL); /* Announce the interface. */ rt_ifannouncemsg(ifp, IFAN_ARRIVAL); if (ifp->if_watchdog != NULL) if_printf(ifp, "WARNING: using obsoleted if_watchdog interface\n"); if (ifp->if_flags & IFF_NEEDSGIANT) if_printf(ifp, "WARNING: using obsoleted IFF_NEEDSGIANT flag\n"); } static void if_attachdomain(void *dummy) { INIT_VNET_NET(curvnet); struct ifnet *ifp; int s; s = splnet(); TAILQ_FOREACH(ifp, &V_ifnet, if_link) if_attachdomain1(ifp); splx(s); } SYSINIT(domainifattach, SI_SUB_PROTO_IFATTACHDOMAIN, SI_ORDER_SECOND, if_attachdomain, NULL); static void if_attachdomain1(struct ifnet *ifp) { struct domain *dp; int s; s = splnet(); /* * Since dp->dom_ifattach calls malloc() with M_WAITOK, we * cannot lock ifp->if_afdata initialization, entirely. 
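 *
 * Hence the trylock just below: rather than sleeping on the AF data
 * lock, the function simply backs out if the lock is currently busy.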
*/ if (IF_AFDATA_TRYLOCK(ifp) == 0) { splx(s); return; } if (ifp->if_afdata_initialized >= domain_init_status) { IF_AFDATA_UNLOCK(ifp); splx(s); printf("if_attachdomain called more than once on %s\n", ifp->if_xname); return; } ifp->if_afdata_initialized = domain_init_status; IF_AFDATA_UNLOCK(ifp); /* address family dependent data region */ bzero(ifp->if_afdata, sizeof(ifp->if_afdata)); for (dp = domains; dp; dp = dp->dom_next) { if (dp->dom_ifattach) ifp->if_afdata[dp->dom_family] = (*dp->dom_ifattach)(ifp); } splx(s); } /* * Remove any unicast or broadcast network addresses from an interface. */ void if_purgeaddrs(struct ifnet *ifp) { struct ifaddr *ifa, *next; TAILQ_FOREACH_SAFE(ifa, &ifp->if_addrhead, ifa_link, next) { if (ifa->ifa_addr->sa_family == AF_LINK) continue; #ifdef INET /* XXX: Ugly!! ad hoc just for INET */ if (ifa->ifa_addr->sa_family == AF_INET) { struct ifaliasreq ifr; bzero(&ifr, sizeof(ifr)); ifr.ifra_addr = *ifa->ifa_addr; if (ifa->ifa_dstaddr) ifr.ifra_broadaddr = *ifa->ifa_dstaddr; if (in_control(NULL, SIOCDIFADDR, (caddr_t)&ifr, ifp, NULL) == 0) continue; } #endif /* INET */ #ifdef INET6 if (ifa->ifa_addr->sa_family == AF_INET6) { in6_purgeaddr(ifa); /* ifp_addrhead is already updated */ continue; } #endif /* INET6 */ TAILQ_REMOVE(&ifp->if_addrhead, ifa, ifa_link); IFAFREE(ifa); } } /* * Remove any multicast network addresses from an interface. */ void if_purgemaddrs(struct ifnet *ifp) { struct ifmultiaddr *ifma; struct ifmultiaddr *next; IF_ADDR_LOCK(ifp); TAILQ_FOREACH_SAFE(ifma, &ifp->if_multiaddrs, ifma_link, next) if_delmulti_locked(ifp, ifma, 1); IF_ADDR_UNLOCK(ifp); } /* * Detach an interface, removing it from the * list of "active" interfaces. * * XXXRW: There are some significant questions about event ordering, and * how to prevent things from starting to use the interface during detach. */ void if_detach(struct ifnet *ifp) { INIT_VNET_NET(ifp->if_vnet); struct ifaddr *ifa; struct radix_node_head *rnh; int s, i, j; struct domain *dp; struct ifnet *iter; int found = 0; IFNET_WLOCK(); TAILQ_FOREACH(iter, &V_ifnet, if_link) if (iter == ifp) { TAILQ_REMOVE(&V_ifnet, ifp, if_link); found = 1; break; } IFNET_WUNLOCK(); if (!found) return; /* * Remove/wait for pending events. */ taskqueue_drain(taskqueue_swi, &ifp->if_linktask); /* * Remove routes and flush queues. */ s = splnet(); if_down(ifp); #ifdef ALTQ if (ALTQ_IS_ENABLED(&ifp->if_snd)) altq_disable(&ifp->if_snd); if (ALTQ_IS_ATTACHED(&ifp->if_snd)) altq_detach(&ifp->if_snd); #endif if_purgeaddrs(ifp); #ifdef INET in_ifdetach(ifp); #endif #ifdef INET6 /* * Remove all IPv6 kernel structs related to ifp. This should be done * before removing routing entries below, since IPv6 interface direct * routes are expected to be removed by the IPv6-specific kernel API. * Otherwise, the kernel will detect some inconsistency and bark it. */ in6_ifdetach(ifp); #endif if_purgemaddrs(ifp); /* * Remove link ifaddr pointer and maybe decrement if_index. * Clean up all addresses. */ ifp->if_addr = NULL; destroy_dev(ifdev_byindex(ifp->if_index)); ifdev_setbyindex(ifp->if_index, NULL); /* We can now free link ifaddr. */ if (!TAILQ_EMPTY(&ifp->if_addrhead)) { ifa = TAILQ_FIRST(&ifp->if_addrhead); TAILQ_REMOVE(&ifp->if_addrhead, ifa, ifa_link); IFAFREE(ifa); } /* * Delete all remaining routes using this interface * Unfortuneatly the only way to do this is to slog through * the entire routing table looking for routes which point * to this interface...oh well... 
*/ for (i = 1; i <= AF_MAX; i++) { for (j = 0; j < rt_numfibs; j++) { if ((rnh = V_rt_tables[j][i]) == NULL) continue; RADIX_NODE_HEAD_LOCK(rnh); (void) rnh->rnh_walktree(rnh, if_rtdel, ifp); RADIX_NODE_HEAD_UNLOCK(rnh); } } /* Announce that the interface is gone. */ rt_ifannouncemsg(ifp, IFAN_DEPARTURE); EVENTHANDLER_INVOKE(ifnet_departure_event, ifp); devctl_notify("IFNET", ifp->if_xname, "DETACH", NULL); IF_AFDATA_LOCK(ifp); for (dp = domains; dp; dp = dp->dom_next) { if (dp->dom_ifdetach && ifp->if_afdata[dp->dom_family]) (*dp->dom_ifdetach)(ifp, ifp->if_afdata[dp->dom_family]); } IF_AFDATA_UNLOCK(ifp); #ifdef MAC mac_ifnet_destroy(ifp); #endif /* MAC */ KNOTE_UNLOCKED(&ifp->if_klist, NOTE_EXIT); knlist_clear(&ifp->if_klist, 0); knlist_destroy(&ifp->if_klist); ifq_detach(&ifp->if_snd); IF_AFDATA_DESTROY(ifp); splx(s); } /* * Add a group to an interface */ int if_addgroup(struct ifnet *ifp, const char *groupname) { INIT_VNET_NET(ifp->if_vnet); struct ifg_list *ifgl; struct ifg_group *ifg = NULL; struct ifg_member *ifgm; if (groupname[0] && groupname[strlen(groupname) - 1] >= '0' && groupname[strlen(groupname) - 1] <= '9') return (EINVAL); IFNET_WLOCK(); TAILQ_FOREACH(ifgl, &ifp->if_groups, ifgl_next) if (!strcmp(ifgl->ifgl_group->ifg_group, groupname)) { IFNET_WUNLOCK(); return (EEXIST); } if ((ifgl = (struct ifg_list *)malloc(sizeof(struct ifg_list), M_TEMP, M_NOWAIT)) == NULL) { IFNET_WUNLOCK(); return (ENOMEM); } if ((ifgm = (struct ifg_member *)malloc(sizeof(struct ifg_member), M_TEMP, M_NOWAIT)) == NULL) { free(ifgl, M_TEMP); IFNET_WUNLOCK(); return (ENOMEM); } TAILQ_FOREACH(ifg, &V_ifg_head, ifg_next) if (!strcmp(ifg->ifg_group, groupname)) break; if (ifg == NULL) { if ((ifg = (struct ifg_group *)malloc(sizeof(struct ifg_group), M_TEMP, M_NOWAIT)) == NULL) { free(ifgl, M_TEMP); free(ifgm, M_TEMP); IFNET_WUNLOCK(); return (ENOMEM); } strlcpy(ifg->ifg_group, groupname, sizeof(ifg->ifg_group)); ifg->ifg_refcnt = 0; TAILQ_INIT(&ifg->ifg_members); EVENTHANDLER_INVOKE(group_attach_event, ifg); TAILQ_INSERT_TAIL(&V_ifg_head, ifg, ifg_next); } ifg->ifg_refcnt++; ifgl->ifgl_group = ifg; ifgm->ifgm_ifp = ifp; IF_ADDR_LOCK(ifp); TAILQ_INSERT_TAIL(&ifg->ifg_members, ifgm, ifgm_next); TAILQ_INSERT_TAIL(&ifp->if_groups, ifgl, ifgl_next); IF_ADDR_UNLOCK(ifp); IFNET_WUNLOCK(); EVENTHANDLER_INVOKE(group_change_event, groupname); return (0); } /* * Remove a group from an interface */ int if_delgroup(struct ifnet *ifp, const char *groupname) { INIT_VNET_NET(ifp->if_vnet); struct ifg_list *ifgl; struct ifg_member *ifgm; IFNET_WLOCK(); TAILQ_FOREACH(ifgl, &ifp->if_groups, ifgl_next) if (!strcmp(ifgl->ifgl_group->ifg_group, groupname)) break; if (ifgl == NULL) { IFNET_WUNLOCK(); return (ENOENT); } IF_ADDR_LOCK(ifp); TAILQ_REMOVE(&ifp->if_groups, ifgl, ifgl_next); IF_ADDR_UNLOCK(ifp); TAILQ_FOREACH(ifgm, &ifgl->ifgl_group->ifg_members, ifgm_next) if (ifgm->ifgm_ifp == ifp) break; if (ifgm != NULL) { TAILQ_REMOVE(&ifgl->ifgl_group->ifg_members, ifgm, ifgm_next); free(ifgm, M_TEMP); } if (--ifgl->ifgl_group->ifg_refcnt == 0) { TAILQ_REMOVE(&V_ifg_head, ifgl->ifgl_group, ifg_next); EVENTHANDLER_INVOKE(group_detach_event, ifgl->ifgl_group); free(ifgl->ifgl_group, M_TEMP); } IFNET_WUNLOCK(); free(ifgl, M_TEMP); EVENTHANDLER_INVOKE(group_change_event, groupname); return (0); } /* * Stores all groups from an interface in memory pointed * to by data */ static int if_getgroup(struct ifgroupreq *data, struct ifnet *ifp) { int len, error; struct ifg_list *ifgl; struct ifg_req ifgrq, *ifgp; struct ifgroupreq *ifgr 
= data; if (ifgr->ifgr_len == 0) { IF_ADDR_LOCK(ifp); TAILQ_FOREACH(ifgl, &ifp->if_groups, ifgl_next) ifgr->ifgr_len += sizeof(struct ifg_req); IF_ADDR_UNLOCK(ifp); return (0); } len = ifgr->ifgr_len; ifgp = ifgr->ifgr_groups; /* XXX: wire */ IF_ADDR_LOCK(ifp); TAILQ_FOREACH(ifgl, &ifp->if_groups, ifgl_next) { if (len < sizeof(ifgrq)) { IF_ADDR_UNLOCK(ifp); return (EINVAL); } bzero(&ifgrq, sizeof ifgrq); strlcpy(ifgrq.ifgrq_group, ifgl->ifgl_group->ifg_group, sizeof(ifgrq.ifgrq_group)); if ((error = copyout(&ifgrq, ifgp, sizeof(struct ifg_req)))) { IF_ADDR_UNLOCK(ifp); return (error); } len -= sizeof(ifgrq); ifgp++; } IF_ADDR_UNLOCK(ifp); return (0); } /* * Stores all members of a group in memory pointed to by data */ static int if_getgroupmembers(struct ifgroupreq *data) { INIT_VNET_NET(curvnet); struct ifgroupreq *ifgr = data; struct ifg_group *ifg; struct ifg_member *ifgm; struct ifg_req ifgrq, *ifgp; int len, error; IFNET_RLOCK(); TAILQ_FOREACH(ifg, &V_ifg_head, ifg_next) if (!strcmp(ifg->ifg_group, ifgr->ifgr_name)) break; if (ifg == NULL) { IFNET_RUNLOCK(); return (ENOENT); } if (ifgr->ifgr_len == 0) { TAILQ_FOREACH(ifgm, &ifg->ifg_members, ifgm_next) ifgr->ifgr_len += sizeof(ifgrq); IFNET_RUNLOCK(); return (0); } len = ifgr->ifgr_len; ifgp = ifgr->ifgr_groups; TAILQ_FOREACH(ifgm, &ifg->ifg_members, ifgm_next) { if (len < sizeof(ifgrq)) { IFNET_RUNLOCK(); return (EINVAL); } bzero(&ifgrq, sizeof ifgrq); strlcpy(ifgrq.ifgrq_member, ifgm->ifgm_ifp->if_xname, sizeof(ifgrq.ifgrq_member)); if ((error = copyout(&ifgrq, ifgp, sizeof(struct ifg_req)))) { IFNET_RUNLOCK(); return (error); } len -= sizeof(ifgrq); ifgp++; } IFNET_RUNLOCK(); return (0); } /* * Delete Routes for a Network Interface * * Called for each routing entry via the rnh->rnh_walktree() call above * to delete all route entries referencing a detaching network interface. * * Arguments: * rn pointer to node in the routing table * arg argument passed to rnh->rnh_walktree() - detaching interface * * Returns: * 0 successful * errno failed - reason indicated * */ static int if_rtdel(struct radix_node *rn, void *arg) { struct rtentry *rt = (struct rtentry *)rn; struct ifnet *ifp = arg; int err; if (rt->rt_ifp == ifp) { /* * Protect (sorta) against walktree recursion problems * with cloned routes */ if ((rt->rt_flags & RTF_UP) == 0) return (0); err = rtrequest_fib(RTM_DELETE, rt_key(rt), rt->rt_gateway, rt_mask(rt), rt->rt_flags, (struct rtentry **) NULL, rt->rt_fibnum); if (err) { log(LOG_WARNING, "if_rtdel: error %d\n", err); } } return (0); } /* * XXX: Because sockaddr_dl has deeper structure than the sockaddr * structs used to represent other address families, it is necessary * to perform a different comparison. */ #define sa_equal(a1, a2) \ (bcmp((a1), (a2), ((a1))->sa_len) == 0) #define sa_dl_equal(a1, a2) \ ((((struct sockaddr_dl *)(a1))->sdl_len == \ ((struct sockaddr_dl *)(a2))->sdl_len) && \ (bcmp(LLADDR((struct sockaddr_dl *)(a1)), \ LLADDR((struct sockaddr_dl *)(a2)), \ ((struct sockaddr_dl *)(a1))->sdl_alen) == 0)) /* * Locate an interface based on a complete address. 
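 *
 * The match is a full sa_equal() comparison over sa_len bytes, so
 * callers normally clear any non-address fields first; for example,
 * rdma_translate_ip() in this change set zeroes sin_port:
 *
 *	sin->sin_port = 0;
 *	ifa = ifa_ifwithaddr((struct sockaddr *)sin);
 *	sin->sin_port = port;
 *	if (ifa == NULL)
 *		return (EADDRNOTAVAIL);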
*/ /*ARGSUSED*/ struct ifaddr * ifa_ifwithaddr(struct sockaddr *addr) { INIT_VNET_NET(curvnet); struct ifnet *ifp; struct ifaddr *ifa; IFNET_RLOCK(); TAILQ_FOREACH(ifp, &V_ifnet, if_link) TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family != addr->sa_family) continue; if (sa_equal(addr, ifa->ifa_addr)) goto done; /* IP6 doesn't have broadcast */ if ((ifp->if_flags & IFF_BROADCAST) && ifa->ifa_broadaddr && ifa->ifa_broadaddr->sa_len != 0 && sa_equal(ifa->ifa_broadaddr, addr)) goto done; } ifa = NULL; done: IFNET_RUNLOCK(); return (ifa); } /* * Locate an interface based on the broadcast address. */ /* ARGSUSED */ struct ifaddr * ifa_ifwithbroadaddr(struct sockaddr *addr) { INIT_VNET_NET(curvnet); struct ifnet *ifp; struct ifaddr *ifa; IFNET_RLOCK(); TAILQ_FOREACH(ifp, &V_ifnet, if_link) TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family != addr->sa_family) continue; if ((ifp->if_flags & IFF_BROADCAST) && ifa->ifa_broadaddr && ifa->ifa_broadaddr->sa_len != 0 && sa_equal(ifa->ifa_broadaddr, addr)) goto done; } ifa = NULL; done: IFNET_RUNLOCK(); return (ifa); } /* * Locate the point to point interface with a given destination address. */ /*ARGSUSED*/ struct ifaddr * ifa_ifwithdstaddr(struct sockaddr *addr) { INIT_VNET_NET(curvnet); struct ifnet *ifp; struct ifaddr *ifa; IFNET_RLOCK(); TAILQ_FOREACH(ifp, &V_ifnet, if_link) { if ((ifp->if_flags & IFF_POINTOPOINT) == 0) continue; TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family != addr->sa_family) continue; if (ifa->ifa_dstaddr != NULL && sa_equal(addr, ifa->ifa_dstaddr)) goto done; } } ifa = NULL; done: IFNET_RUNLOCK(); return (ifa); } /* * Find an interface on a specific network. If many, choice * is most specific found. */ struct ifaddr * ifa_ifwithnet(struct sockaddr *addr) { INIT_VNET_NET(curvnet); struct ifnet *ifp; struct ifaddr *ifa; struct ifaddr *ifa_maybe = (struct ifaddr *) 0; u_int af = addr->sa_family; char *addr_data = addr->sa_data, *cplim; /* * AF_LINK addresses can be looked up directly by their index number, * so do that if we can. */ if (af == AF_LINK) { struct sockaddr_dl *sdl = (struct sockaddr_dl *)addr; if (sdl->sdl_index && sdl->sdl_index <= V_if_index) return (ifaddr_byindex(sdl->sdl_index)); } /* * Scan though each interface, looking for ones that have * addresses in this address family. */ IFNET_RLOCK(); TAILQ_FOREACH(ifp, &V_ifnet, if_link) { TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { char *cp, *cp2, *cp3; if (ifa->ifa_addr->sa_family != af) next: continue; if (af == AF_INET && ifp->if_flags & IFF_POINTOPOINT) { /* * This is a bit broken as it doesn't * take into account that the remote end may * be a single node in the network we are * looking for. * The trouble is that we don't know the * netmask for the remote end. */ if (ifa->ifa_dstaddr != NULL && sa_equal(addr, ifa->ifa_dstaddr)) goto done; } else { /* * if we have a special address handler, * then use it instead of the generic one. */ if (ifa->ifa_claim_addr) { if ((*ifa->ifa_claim_addr)(ifa, addr)) goto done; continue; } /* * Scan all the bits in the ifa's address. * If a bit dissagrees with what we are * looking for, mask it with the netmask * to see if it really matters. * (A byte at a time) */ if (ifa->ifa_netmask == 0) continue; cp = addr_data; cp2 = ifa->ifa_addr->sa_data; cp3 = ifa->ifa_netmask->sa_data; cplim = ifa->ifa_netmask->sa_len + (char *)ifa->ifa_netmask; while (cp3 < cplim) if ((*cp++ ^ *cp2++) & *cp3++) goto next; /* next address! 
*/ /* * If the netmask of what we just found * is more specific than what we had before * (if we had one) then remember the new one * before continuing to search * for an even better one. */ if (ifa_maybe == 0 || rn_refines((caddr_t)ifa->ifa_netmask, (caddr_t)ifa_maybe->ifa_netmask)) ifa_maybe = ifa; } } } ifa = ifa_maybe; done: IFNET_RUNLOCK(); return (ifa); } /* * Find an interface address specific to an interface best matching * a given address. */ struct ifaddr * ifaof_ifpforaddr(struct sockaddr *addr, struct ifnet *ifp) { struct ifaddr *ifa; char *cp, *cp2, *cp3; char *cplim; struct ifaddr *ifa_maybe = 0; u_int af = addr->sa_family; if (af >= AF_MAX) return (0); TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family != af) continue; if (ifa_maybe == 0) ifa_maybe = ifa; if (ifa->ifa_netmask == 0) { if (sa_equal(addr, ifa->ifa_addr) || (ifa->ifa_dstaddr && sa_equal(addr, ifa->ifa_dstaddr))) goto done; continue; } if (ifp->if_flags & IFF_POINTOPOINT) { if (sa_equal(addr, ifa->ifa_dstaddr)) goto done; } else { cp = addr->sa_data; cp2 = ifa->ifa_addr->sa_data; cp3 = ifa->ifa_netmask->sa_data; cplim = ifa->ifa_netmask->sa_len + (char *)ifa->ifa_netmask; for (; cp3 < cplim; cp3++) if ((*cp++ ^ *cp2++) & *cp3) break; if (cp3 == cplim) goto done; } } ifa = ifa_maybe; done: return (ifa); } + +#include +#include /* * Default action when installing a route with a Link Level gateway. * Lookup an appropriate real ifa to point to. * This should be moved to /sys/net/link.c eventually. */ static void link_rtrequest(int cmd, struct rtentry *rt, struct rt_addrinfo *info) { struct ifaddr *ifa, *oifa; struct sockaddr *dst; struct ifnet *ifp; RT_LOCK_ASSERT(rt); if (cmd != RTM_ADD || ((ifa = rt->rt_ifa) == 0) || ((ifp = ifa->ifa_ifp) == 0) || ((dst = rt_key(rt)) == 0)) return; ifa = ifaof_ifpforaddr(dst, ifp); if (ifa) { IFAREF(ifa); /* XXX */ oifa = rt->rt_ifa; rt->rt_ifa = ifa; IFAFREE(oifa); if (ifa->ifa_rtrequest && ifa->ifa_rtrequest != link_rtrequest) ifa->ifa_rtrequest(cmd, rt, info); } } /* * Mark an interface down and notify protocols of * the transition. * NOTE: must be called at splnet or eqivalent. */ static void if_unroute(struct ifnet *ifp, int flag, int fam) { struct ifaddr *ifa; KASSERT(flag == IFF_UP, ("if_unroute: flag != IFF_UP")); ifp->if_flags &= ~flag; getmicrotime(&ifp->if_lastchange); TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) if (fam == PF_UNSPEC || (fam == ifa->ifa_addr->sa_family)) pfctlinput(PRC_IFDOWN, ifa->ifa_addr); ifp->if_qflush(ifp); #ifdef DEV_CARP if (ifp->if_carp) carp_carpdev_state(ifp->if_carp); #endif rt_ifmsg(ifp); } /* * Mark an interface up and notify protocols of * the transition. * NOTE: must be called at splnet or eqivalent. */ static void if_route(struct ifnet *ifp, int flag, int fam) { struct ifaddr *ifa; KASSERT(flag == IFF_UP, ("if_route: flag != IFF_UP")); ifp->if_flags |= flag; getmicrotime(&ifp->if_lastchange); TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) if (fam == PF_UNSPEC || (fam == ifa->ifa_addr->sa_family)) pfctlinput(PRC_IFUP, ifa->ifa_addr); #ifdef DEV_CARP if (ifp->if_carp) carp_carpdev_state(ifp->if_carp); #endif rt_ifmsg(ifp); #ifdef INET6 in6_if_up(ifp); #endif } void (*vlan_link_state_p)(struct ifnet *, int); /* XXX: private from if_vlan */ void (*vlan_trunk_cap_p)(struct ifnet *); /* XXX: private from if_vlan */ /* * Handle a change in the interface link state. 
To avoid LORs * between driver lock and upper layer locks, as well as possible * recursions, we post event to taskqueue, and all job * is done in static do_link_state_change(). */ void if_link_state_change(struct ifnet *ifp, int link_state) { /* Return if state hasn't changed. */ if (ifp->if_link_state == link_state) return; ifp->if_link_state = link_state; taskqueue_enqueue(taskqueue_swi, &ifp->if_linktask); } static void do_link_state_change(void *arg, int pending) { struct ifnet *ifp = (struct ifnet *)arg; int link_state = ifp->if_link_state; int link; CURVNET_SET(ifp->if_vnet); /* Notify that the link state has changed. */ rt_ifmsg(ifp); if (link_state == LINK_STATE_UP) link = NOTE_LINKUP; else if (link_state == LINK_STATE_DOWN) link = NOTE_LINKDOWN; else link = NOTE_LINKINV; KNOTE_UNLOCKED(&ifp->if_klist, link); if (ifp->if_vlantrunk != NULL) (*vlan_link_state_p)(ifp, link); if ((ifp->if_type == IFT_ETHER || ifp->if_type == IFT_L2VLAN) && IFP2AC(ifp)->ac_netgraph != NULL) (*ng_ether_link_state_p)(ifp, link_state); #ifdef DEV_CARP if (ifp->if_carp) carp_carpdev_state(ifp->if_carp); #endif if (ifp->if_bridge) { KASSERT(bstp_linkstate_p != NULL,("if_bridge bstp not loaded!")); (*bstp_linkstate_p)(ifp, link_state); } if (ifp->if_lagg) { KASSERT(lagg_linkstate_p != NULL,("if_lagg not loaded!")); (*lagg_linkstate_p)(ifp, link_state); } devctl_notify("IFNET", ifp->if_xname, (link_state == LINK_STATE_UP) ? "LINK_UP" : "LINK_DOWN", NULL); if (pending > 1) if_printf(ifp, "%d link states coalesced\n", pending); if (log_link_state_change) log(LOG_NOTICE, "%s: link state changed to %s\n", ifp->if_xname, (link_state == LINK_STATE_UP) ? "UP" : "DOWN" ); CURVNET_RESTORE(); } /* * Mark an interface down and notify protocols of * the transition. * NOTE: must be called at splnet or eqivalent. */ void if_down(struct ifnet *ifp) { if_unroute(ifp, IFF_UP, AF_UNSPEC); } /* * Mark an interface up and notify protocols of * the transition. * NOTE: must be called at splnet or eqivalent. */ void if_up(struct ifnet *ifp) { if_route(ifp, IFF_UP, AF_UNSPEC); } /* * Flush an interface queue. */ static void if_qflush(struct ifnet *ifp) { struct mbuf *m, *n; struct ifaltq *ifq; ifq = &ifp->if_snd; IFQ_LOCK(ifq); #ifdef ALTQ if (ALTQ_IS_ENABLED(ifq)) ALTQ_PURGE(ifq); #endif n = ifq->ifq_head; while ((m = n) != 0) { n = m->m_act; m_freem(m); } ifq->ifq_head = 0; ifq->ifq_tail = 0; ifq->ifq_len = 0; IFQ_UNLOCK(ifq); } /* * Handle interface watchdog timer routines. Called * from softclock, we decrement timers (if set) and * call the appropriate interface routine on expiration. * * XXXRW: Note that because timeouts run with Giant, if_watchdog() is called * holding Giant. If we switch to an MPSAFE callout, we likely need to grab * Giant before entering if_watchdog() on an IFF_NEEDSGIANT interface. */ static void if_slowtimo(void *arg) { VNET_ITERATOR_DECL(vnet_iter); struct ifnet *ifp; int s = splimp(); IFNET_RLOCK(); VNET_LIST_RLOCK(); VNET_FOREACH(vnet_iter) { CURVNET_SET(vnet_iter); INIT_VNET_NET(vnet_iter); TAILQ_FOREACH(ifp, &V_ifnet, if_link) { if (ifp->if_timer == 0 || --ifp->if_timer) continue; if (ifp->if_watchdog) (*ifp->if_watchdog)(ifp); } CURVNET_RESTORE(); } VNET_LIST_RUNLOCK(); IFNET_RUNLOCK(); splx(s); timeout(if_slowtimo, (void *)0, hz / IFNET_SLOWHZ); } /* * Map interface name to * interface structure pointer. 
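 *
 * Illustrative use ("em0" is just an example name):
 *
 *	ifp = ifunit("em0");
 *	if (ifp == NULL)
 *		return (ENXIO);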
*/ struct ifnet * ifunit(const char *name) { INIT_VNET_NET(curvnet); struct ifnet *ifp; IFNET_RLOCK(); TAILQ_FOREACH(ifp, &V_ifnet, if_link) { if (strncmp(name, ifp->if_xname, IFNAMSIZ) == 0) break; } IFNET_RUNLOCK(); return (ifp); } /* * Hardware specific interface ioctls. */ static int ifhwioctl(u_long cmd, struct ifnet *ifp, caddr_t data, struct thread *td) { struct ifreq *ifr; struct ifstat *ifs; int error = 0; int new_flags, temp_flags; size_t namelen, onamelen; char new_name[IFNAMSIZ]; struct ifaddr *ifa; struct sockaddr_dl *sdl; ifr = (struct ifreq *)data; switch (cmd) { case SIOCGIFINDEX: ifr->ifr_index = ifp->if_index; break; case SIOCGIFFLAGS: temp_flags = ifp->if_flags | ifp->if_drv_flags; ifr->ifr_flags = temp_flags & 0xffff; ifr->ifr_flagshigh = temp_flags >> 16; break; case SIOCGIFCAP: ifr->ifr_reqcap = ifp->if_capabilities; ifr->ifr_curcap = ifp->if_capenable; break; #ifdef MAC case SIOCGIFMAC: error = mac_ifnet_ioctl_get(td->td_ucred, ifr, ifp); break; #endif case SIOCGIFMETRIC: ifr->ifr_metric = ifp->if_metric; break; case SIOCGIFMTU: ifr->ifr_mtu = ifp->if_mtu; break; case SIOCGIFPHYS: ifr->ifr_phys = ifp->if_physical; break; case SIOCSIFFLAGS: error = priv_check(td, PRIV_NET_SETIFFLAGS); if (error) return (error); /* * Currently, no driver owned flags pass the IFF_CANTCHANGE * check, so we don't need special handling here yet. */ new_flags = (ifr->ifr_flags & 0xffff) | (ifr->ifr_flagshigh << 16); if (ifp->if_flags & IFF_SMART) { /* Smart drivers twiddle their own routes */ } else if (ifp->if_flags & IFF_UP && (new_flags & IFF_UP) == 0) { int s = splimp(); if_down(ifp); splx(s); } else if (new_flags & IFF_UP && (ifp->if_flags & IFF_UP) == 0) { int s = splimp(); if_up(ifp); splx(s); } /* See if permanently promiscuous mode bit is about to flip */ if ((ifp->if_flags ^ new_flags) & IFF_PPROMISC) { if (new_flags & IFF_PPROMISC) ifp->if_flags |= IFF_PROMISC; else if (ifp->if_pcount == 0) ifp->if_flags &= ~IFF_PROMISC; log(LOG_INFO, "%s: permanently promiscuous mode %s\n", ifp->if_xname, (new_flags & IFF_PPROMISC) ? "enabled" : "disabled"); } ifp->if_flags = (ifp->if_flags & IFF_CANTCHANGE) | (new_flags &~ IFF_CANTCHANGE); if (ifp->if_ioctl) { IFF_LOCKGIANT(ifp); (void) (*ifp->if_ioctl)(ifp, cmd, data); IFF_UNLOCKGIANT(ifp); } getmicrotime(&ifp->if_lastchange); break; case SIOCSIFCAP: error = priv_check(td, PRIV_NET_SETIFCAP); if (error) return (error); if (ifp->if_ioctl == NULL) return (EOPNOTSUPP); if (ifr->ifr_reqcap & ~ifp->if_capabilities) return (EINVAL); IFF_LOCKGIANT(ifp); error = (*ifp->if_ioctl)(ifp, cmd, data); IFF_UNLOCKGIANT(ifp); if (error == 0) getmicrotime(&ifp->if_lastchange); break; #ifdef MAC case SIOCSIFMAC: error = mac_ifnet_ioctl_set(td->td_ucred, ifr, ifp); break; #endif case SIOCSIFNAME: error = priv_check(td, PRIV_NET_SETIFNAME); if (error) return (error); error = copyinstr(ifr->ifr_data, new_name, IFNAMSIZ, NULL); if (error != 0) return (error); if (new_name[0] == '\0') return (EINVAL); if (ifunit(new_name) != NULL) return (EEXIST); /* Announce the departure of the interface. */ rt_ifannouncemsg(ifp, IFAN_DEPARTURE); EVENTHANDLER_INVOKE(ifnet_departure_event, ifp); log(LOG_INFO, "%s: changing name to '%s'\n", ifp->if_xname, new_name); strlcpy(ifp->if_xname, new_name, sizeof(ifp->if_xname)); ifa = ifp->if_addr; IFA_LOCK(ifa); sdl = (struct sockaddr_dl *)ifa->ifa_addr; namelen = strlen(new_name); onamelen = sdl->sdl_nlen; /* * Move the address if needed. 
This is safe because we * allocate space for a name of length IFNAMSIZ when we * create this in if_attach(). */ if (namelen != onamelen) { bcopy(sdl->sdl_data + onamelen, sdl->sdl_data + namelen, sdl->sdl_alen); } bcopy(new_name, sdl->sdl_data, namelen); sdl->sdl_nlen = namelen; sdl = (struct sockaddr_dl *)ifa->ifa_netmask; bzero(sdl->sdl_data, onamelen); while (namelen != 0) sdl->sdl_data[--namelen] = 0xff; IFA_UNLOCK(ifa); EVENTHANDLER_INVOKE(ifnet_arrival_event, ifp); /* Announce the return of the interface. */ rt_ifannouncemsg(ifp, IFAN_ARRIVAL); break; case SIOCSIFMETRIC: error = priv_check(td, PRIV_NET_SETIFMETRIC); if (error) return (error); ifp->if_metric = ifr->ifr_metric; getmicrotime(&ifp->if_lastchange); break; case SIOCSIFPHYS: error = priv_check(td, PRIV_NET_SETIFPHYS); if (error) return (error); if (ifp->if_ioctl == NULL) return (EOPNOTSUPP); IFF_LOCKGIANT(ifp); error = (*ifp->if_ioctl)(ifp, cmd, data); IFF_UNLOCKGIANT(ifp); if (error == 0) getmicrotime(&ifp->if_lastchange); break; case SIOCSIFMTU: { u_long oldmtu = ifp->if_mtu; error = priv_check(td, PRIV_NET_SETIFMTU); if (error) return (error); if (ifr->ifr_mtu < IF_MINMTU || ifr->ifr_mtu > IF_MAXMTU) return (EINVAL); if (ifp->if_ioctl == NULL) return (EOPNOTSUPP); IFF_LOCKGIANT(ifp); error = (*ifp->if_ioctl)(ifp, cmd, data); IFF_UNLOCKGIANT(ifp); if (error == 0) { getmicrotime(&ifp->if_lastchange); rt_ifmsg(ifp); } /* * If the link MTU changed, do network layer specific procedure. */ if (ifp->if_mtu != oldmtu) { #ifdef INET6 nd6_setmtu(ifp); #endif } break; } case SIOCADDMULTI: case SIOCDELMULTI: if (cmd == SIOCADDMULTI) error = priv_check(td, PRIV_NET_ADDMULTI); else error = priv_check(td, PRIV_NET_DELMULTI); if (error) return (error); /* Don't allow group membership on non-multicast interfaces. */ if ((ifp->if_flags & IFF_MULTICAST) == 0) return (EOPNOTSUPP); /* Don't let users screw up protocols' entries. */ if (ifr->ifr_addr.sa_family != AF_LINK) return (EINVAL); if (cmd == SIOCADDMULTI) { struct ifmultiaddr *ifma; /* * Userland is only permitted to join groups once * via the if_addmulti() KPI, because it cannot hold * struct ifmultiaddr * between calls. It may also * lose a race while we check if the membership * already exists. 
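/*
 * Illustrative userland sketch (not from this revision): the SIOCGIFMTU /
 * SIOCSIFMTU cases above are reached from an ordinary socket ioctl.  The
 * program below reads an interface MTU; the name "em0" is an assumption,
 * and setting the MTU would use SIOCSIFMTU the same way but needs privilege.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <sys/sockio.h>
#include <net/if.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <err.h>

int
main(void)
{
        struct ifreq ifr;
        int s;

        s = socket(AF_INET, SOCK_DGRAM, 0);     /* any socket reaches ifioctl */
        if (s == -1)
                err(1, "socket");

        memset(&ifr, 0, sizeof(ifr));
        strlcpy(ifr.ifr_name, "em0", sizeof(ifr.ifr_name));     /* assumed name */
        if (ioctl(s, SIOCGIFMTU, &ifr) == -1)
                err(1, "SIOCGIFMTU");
        printf("%s mtu %d\n", ifr.ifr_name, ifr.ifr_mtu);

        close(s);
        return (0);
}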
*/ IF_ADDR_LOCK(ifp); ifma = if_findmulti(ifp, &ifr->ifr_addr); IF_ADDR_UNLOCK(ifp); if (ifma != NULL) error = EADDRINUSE; else error = if_addmulti(ifp, &ifr->ifr_addr, &ifma); } else { error = if_delmulti(ifp, &ifr->ifr_addr); } if (error == 0) getmicrotime(&ifp->if_lastchange); break; case SIOCSIFPHYADDR: case SIOCDIFPHYADDR: #ifdef INET6 case SIOCSIFPHYADDR_IN6: #endif case SIOCSLIFPHYADDR: case SIOCSIFMEDIA: case SIOCSIFGENERIC: error = priv_check(td, PRIV_NET_HWIOCTL); if (error) return (error); if (ifp->if_ioctl == NULL) return (EOPNOTSUPP); IFF_LOCKGIANT(ifp); error = (*ifp->if_ioctl)(ifp, cmd, data); IFF_UNLOCKGIANT(ifp); if (error == 0) getmicrotime(&ifp->if_lastchange); break; case SIOCGIFSTATUS: ifs = (struct ifstat *)data; ifs->ascii[0] = '\0'; case SIOCGIFPSRCADDR: case SIOCGIFPDSTADDR: case SIOCGLIFPHYADDR: case SIOCGIFMEDIA: case SIOCGIFGENERIC: if (ifp->if_ioctl == NULL) return (EOPNOTSUPP); IFF_LOCKGIANT(ifp); error = (*ifp->if_ioctl)(ifp, cmd, data); IFF_UNLOCKGIANT(ifp); break; case SIOCSIFLLADDR: error = priv_check(td, PRIV_NET_SETLLADDR); if (error) return (error); error = if_setlladdr(ifp, ifr->ifr_addr.sa_data, ifr->ifr_addr.sa_len); break; case SIOCAIFGROUP: { struct ifgroupreq *ifgr = (struct ifgroupreq *)ifr; error = priv_check(td, PRIV_NET_ADDIFGROUP); if (error) return (error); if ((error = if_addgroup(ifp, ifgr->ifgr_group))) return (error); break; } case SIOCGIFGROUP: if ((error = if_getgroup((struct ifgroupreq *)ifr, ifp))) return (error); break; case SIOCDIFGROUP: { struct ifgroupreq *ifgr = (struct ifgroupreq *)ifr; error = priv_check(td, PRIV_NET_DELIFGROUP); if (error) return (error); if ((error = if_delgroup(ifp, ifgr->ifgr_group))) return (error); break; } default: error = ENOIOCTL; break; } return (error); } /* * Interface ioctls. */ int ifioctl(struct socket *so, u_long cmd, caddr_t data, struct thread *td) { struct ifnet *ifp; struct ifreq *ifr; int error; int oif_flags; switch (cmd) { case SIOCGIFCONF: case OSIOCGIFCONF: #ifdef __amd64__ case SIOCGIFCONF32: #endif return (ifconf(cmd, data)); } ifr = (struct ifreq *)data; switch (cmd) { case SIOCIFCREATE: case SIOCIFCREATE2: error = priv_check(td, PRIV_NET_IFCREATE); if (error) return (error); return (if_clone_create(ifr->ifr_name, sizeof(ifr->ifr_name), cmd == SIOCIFCREATE2 ? 
ifr->ifr_data : NULL)); case SIOCIFDESTROY: error = priv_check(td, PRIV_NET_IFDESTROY); if (error) return (error); return if_clone_destroy(ifr->ifr_name); case SIOCIFGCLONERS: return (if_clone_list((struct if_clonereq *)data)); case SIOCGIFGMEMB: return (if_getgroupmembers((struct ifgroupreq *)data)); } ifp = ifunit(ifr->ifr_name); if (ifp == 0) return (ENXIO); error = ifhwioctl(cmd, ifp, data, td); if (error != ENOIOCTL) return (error); oif_flags = ifp->if_flags; if (so->so_proto == 0) return (EOPNOTSUPP); #ifndef COMPAT_43 error = ((*so->so_proto->pr_usrreqs->pru_control)(so, cmd, data, ifp, td)); #else { int ocmd = cmd; switch (cmd) { case SIOCSIFDSTADDR: case SIOCSIFADDR: case SIOCSIFBRDADDR: case SIOCSIFNETMASK: #if BYTE_ORDER != BIG_ENDIAN if (ifr->ifr_addr.sa_family == 0 && ifr->ifr_addr.sa_len < 16) { ifr->ifr_addr.sa_family = ifr->ifr_addr.sa_len; ifr->ifr_addr.sa_len = 16; } #else if (ifr->ifr_addr.sa_len == 0) ifr->ifr_addr.sa_len = 16; #endif break; case OSIOCGIFADDR: cmd = SIOCGIFADDR; break; case OSIOCGIFDSTADDR: cmd = SIOCGIFDSTADDR; break; case OSIOCGIFBRDADDR: cmd = SIOCGIFBRDADDR; break; case OSIOCGIFNETMASK: cmd = SIOCGIFNETMASK; } error = ((*so->so_proto->pr_usrreqs->pru_control)(so, cmd, data, ifp, td)); switch (ocmd) { case OSIOCGIFADDR: case OSIOCGIFDSTADDR: case OSIOCGIFBRDADDR: case OSIOCGIFNETMASK: *(u_short *)&ifr->ifr_addr = ifr->ifr_addr.sa_family; } } #endif /* COMPAT_43 */ if ((oif_flags ^ ifp->if_flags) & IFF_UP) { #ifdef INET6 DELAY(100);/* XXX: temporary workaround for fxp issue*/ if (ifp->if_flags & IFF_UP) { int s = splimp(); in6_if_up(ifp); splx(s); } #endif } return (error); } /* * The code common to handling reference counted flags, * e.g., in ifpromisc() and if_allmulti(). * The "pflag" argument can specify a permanent mode flag to check, * such as IFF_PPROMISC for promiscuous mode; should be 0 if none. * * Only to be used on stack-owned flags, not driver-owned flags. */ static int if_setflag(struct ifnet *ifp, int flag, int pflag, int *refcount, int onswitch) { struct ifreq ifr; int error; int oldflags, oldcount; /* Sanity checks to catch programming errors */ KASSERT((flag & (IFF_DRV_OACTIVE|IFF_DRV_RUNNING)) == 0, ("%s: setting driver-owned flag %d", __func__, flag)); if (onswitch) KASSERT(*refcount >= 0, ("%s: increment negative refcount %d for flag %d", __func__, *refcount, flag)); else KASSERT(*refcount > 0, ("%s: decrement non-positive refcount %d for flag %d", __func__, *refcount, flag)); /* In case this mode is permanent, just touch refcount */ if (ifp->if_flags & pflag) { *refcount += onswitch ? 1 : -1; return (0); } /* Save ifnet parameters for if_ioctl() may fail */ oldcount = *refcount; oldflags = ifp->if_flags; /* * See if we aren't the only and touching refcount is enough. * Actually toggle interface flag if we are the first or last. 
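/*
 * Illustrative sketch (not part of the diff): if_setflag(), continued
 * below, asks the driver to change a stack-owned flag only on the 0->1 and
 * 1->0 transitions of the reference count, and rolls both the count and
 * the flags back if the driver ioctl fails.  The userland model below
 * mirrors that contract; struct fake_ifnet, driver_ioctl() and the value
 * used for IFF_PROMISC are stand-ins.
 */
#include <errno.h>
#include <stdio.h>

#define IFF_PROMISC     0x100   /* value is irrelevant for the model */

struct fake_ifnet {
        int     if_flags;
        int     if_pcount;      /* promiscuous references, as in the kernel */
};

static int driver_should_fail;

/* Stand-in for (*ifp->if_ioctl)(ifp, SIOCSIFFLAGS, ...). */
static int
driver_ioctl(struct fake_ifnet *ifp)
{
        (void)ifp;
        return (driver_should_fail ? EIO : 0);
}

static int
set_promisc(struct fake_ifnet *ifp, int onswitch)
{
        int oldflags = ifp->if_flags;
        int oldcount = ifp->if_pcount;
        int error;

        /* Only the first "on" and the last "off" touch the hardware. */
        if (onswitch) {
                if (ifp->if_pcount++)
                        return (0);
                ifp->if_flags |= IFF_PROMISC;
        } else {
                if (--ifp->if_pcount)
                        return (0);
                ifp->if_flags &= ~IFF_PROMISC;
        }

        error = driver_ioctl(ifp);
        if (error) {
                /* Recover: undo both the flag and the refcount change. */
                ifp->if_flags = oldflags;
                ifp->if_pcount = oldcount;
        }
        return (error);
}

int
main(void)
{
        struct fake_ifnet ifp = { 0, 0 };

        set_promisc(&ifp, 1);   /* driver sees the change */
        set_promisc(&ifp, 1);   /* refcount only */
        set_promisc(&ifp, 0);   /* refcount only */
        set_promisc(&ifp, 0);   /* driver sees the change again */
        printf("flags 0x%x, pcount %d\n", ifp.if_flags, ifp.if_pcount);
        return (0);
}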
*/ if (onswitch) { if ((*refcount)++) return (0); ifp->if_flags |= flag; } else { if (--(*refcount)) return (0); ifp->if_flags &= ~flag; } /* Call down the driver since we've changed interface flags */ if (ifp->if_ioctl == NULL) { error = EOPNOTSUPP; goto recover; } ifr.ifr_flags = ifp->if_flags & 0xffff; ifr.ifr_flagshigh = ifp->if_flags >> 16; IFF_LOCKGIANT(ifp); error = (*ifp->if_ioctl)(ifp, SIOCSIFFLAGS, (caddr_t)&ifr); IFF_UNLOCKGIANT(ifp); if (error) goto recover; /* Notify userland that interface flags have changed */ rt_ifmsg(ifp); return (0); recover: /* Recover after driver error */ *refcount = oldcount; ifp->if_flags = oldflags; return (error); } /* * Set/clear promiscuous mode on interface ifp based on the truth value * of pswitch. The calls are reference counted so that only the first * "on" request actually has an effect, as does the final "off" request. * Results are undefined if the "off" and "on" requests are not matched. */ int ifpromisc(struct ifnet *ifp, int pswitch) { int error; int oldflags = ifp->if_flags; error = if_setflag(ifp, IFF_PROMISC, IFF_PPROMISC, &ifp->if_pcount, pswitch); /* If promiscuous mode status has changed, log a message */ if (error == 0 && ((ifp->if_flags ^ oldflags) & IFF_PROMISC)) log(LOG_INFO, "%s: promiscuous mode %s\n", ifp->if_xname, (ifp->if_flags & IFF_PROMISC) ? "enabled" : "disabled"); return (error); } /* * Return interface configuration * of system. List may be used * in later ioctl's (above) to get * other information. */ /*ARGSUSED*/ static int ifconf(u_long cmd, caddr_t data) { INIT_VNET_NET(curvnet); struct ifconf *ifc = (struct ifconf *)data; #ifdef __amd64__ struct ifconf32 *ifc32 = (struct ifconf32 *)data; struct ifconf ifc_swab; #endif struct ifnet *ifp; struct ifaddr *ifa; struct ifreq ifr; struct sbuf *sb; int error, full = 0, valid_len, max_len; #ifdef __amd64__ if (cmd == SIOCGIFCONF32) { ifc_swab.ifc_len = ifc32->ifc_len; ifc_swab.ifc_buf = (caddr_t)(uintptr_t)ifc32->ifc_buf; ifc = &ifc_swab; } #endif /* Limit initial buffer size to MAXPHYS to avoid DoS from userspace. */ max_len = MAXPHYS - 1; /* Prevent hostile input from being able to crash the system */ if (ifc->ifc_len <= 0) return (EINVAL); again: if (ifc->ifc_len <= max_len) { max_len = ifc->ifc_len; full = 1; } sb = sbuf_new(NULL, NULL, max_len + 1, SBUF_FIXEDLEN); max_len = 0; valid_len = 0; IFNET_RLOCK(); /* could sleep XXX */ TAILQ_FOREACH(ifp, &V_ifnet, if_link) { int addrs; /* * Zero the ifr_name buffer to make sure we don't * disclose the contents of the stack. 
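/*
 * Illustrative userland sketch (not from this revision): ifconf(), being
 * assembled here, truncates its answer to the caller's buffer, so the
 * classic consumer grows the buffer and repeats SIOCGIFCONF until the
 * reported length stops changing, then steps through the records using
 * sa_len exactly as they are laid out above.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <sys/sockio.h>
#include <net/if.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <err.h>

int
main(void)
{
        struct ifconf ifc;
        char *buf, *cp, *lim;
        size_t len = 8192;
        int lastlen = 0, s;

        s = socket(AF_INET, SOCK_DGRAM, 0);
        if (s == -1)
                err(1, "socket");

        /* Grow the buffer until the reported length stops changing. */
        for (;;) {
                buf = malloc(len);
                if (buf == NULL)
                        err(1, "malloc");
                ifc.ifc_len = (int)len;
                ifc.ifc_buf = buf;
                if (ioctl(s, SIOCGIFCONF, &ifc) == -1)
                        err(1, "SIOCGIFCONF");
                if (ifc.ifc_len == lastlen)
                        break;          /* result is stable, assume complete */
                lastlen = ifc.ifc_len;
                free(buf);
                len *= 2;
        }

        /* Walk the records the same way ifconf() above laid them out. */
        lim = buf + ifc.ifc_len;
        for (cp = buf; cp < lim; ) {
                struct ifreq *ifr = (struct ifreq *)cp;
                struct sockaddr *sa = &ifr->ifr_addr;

                printf("%s (af %d)\n", ifr->ifr_name, sa->sa_family);
                if (sa->sa_len <= sizeof(struct sockaddr))
                        cp += sizeof(struct ifreq);
                else
                        cp += offsetof(struct ifreq, ifr_addr) + sa->sa_len;
        }

        free(buf);
        close(s);
        return (0);
}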
*/ memset(ifr.ifr_name, 0, sizeof(ifr.ifr_name)); if (strlcpy(ifr.ifr_name, ifp->if_xname, sizeof(ifr.ifr_name)) >= sizeof(ifr.ifr_name)) { sbuf_delete(sb); IFNET_RUNLOCK(); return (ENAMETOOLONG); } addrs = 0; TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { struct sockaddr *sa = ifa->ifa_addr; if (jailed(curthread->td_ucred) && !prison_if(curthread->td_ucred, sa)) continue; addrs++; #ifdef COMPAT_43 if (cmd == OSIOCGIFCONF) { struct osockaddr *osa = (struct osockaddr *)&ifr.ifr_addr; ifr.ifr_addr = *sa; osa->sa_family = sa->sa_family; sbuf_bcat(sb, &ifr, sizeof(ifr)); max_len += sizeof(ifr); } else #endif if (sa->sa_len <= sizeof(*sa)) { ifr.ifr_addr = *sa; sbuf_bcat(sb, &ifr, sizeof(ifr)); max_len += sizeof(ifr); } else { sbuf_bcat(sb, &ifr, offsetof(struct ifreq, ifr_addr)); max_len += offsetof(struct ifreq, ifr_addr); sbuf_bcat(sb, sa, sa->sa_len); max_len += sa->sa_len; } if (!sbuf_overflowed(sb)) valid_len = sbuf_len(sb); } if (addrs == 0) { bzero((caddr_t)&ifr.ifr_addr, sizeof(ifr.ifr_addr)); sbuf_bcat(sb, &ifr, sizeof(ifr)); max_len += sizeof(ifr); if (!sbuf_overflowed(sb)) valid_len = sbuf_len(sb); } } IFNET_RUNLOCK(); /* * If we didn't allocate enough space (uncommon), try again. If * we have already allocated as much space as we are allowed, * return what we've got. */ if (valid_len != max_len && !full) { sbuf_delete(sb); goto again; } ifc->ifc_len = valid_len; #ifdef __amd64__ if (cmd == SIOCGIFCONF32) ifc32->ifc_len = valid_len; #endif sbuf_finish(sb); error = copyout(sbuf_data(sb), ifc->ifc_req, ifc->ifc_len); sbuf_delete(sb); return (error); } /* * Just like ifpromisc(), but for all-multicast-reception mode. */ int if_allmulti(struct ifnet *ifp, int onswitch) { return (if_setflag(ifp, IFF_ALLMULTI, 0, &ifp->if_amcount, onswitch)); } struct ifmultiaddr * if_findmulti(struct ifnet *ifp, struct sockaddr *sa) { struct ifmultiaddr *ifma; IF_ADDR_LOCK_ASSERT(ifp); TAILQ_FOREACH(ifma, &ifp->if_multiaddrs, ifma_link) { if (sa->sa_family == AF_LINK) { if (sa_dl_equal(ifma->ifma_addr, sa)) break; } else { if (sa_equal(ifma->ifma_addr, sa)) break; } } return ifma; } /* * Allocate a new ifmultiaddr and initialize based on passed arguments. We * make copies of passed sockaddrs. The ifmultiaddr will not be added to * the ifnet multicast address list here, so the caller must do that and * other setup work (such as notifying the device driver). The reference * count is initialized to 1. */ static struct ifmultiaddr * if_allocmulti(struct ifnet *ifp, struct sockaddr *sa, struct sockaddr *llsa, int mflags) { struct ifmultiaddr *ifma; struct sockaddr *dupsa; ifma = malloc(sizeof *ifma, M_IFMADDR, mflags | M_ZERO); if (ifma == NULL) return (NULL); dupsa = malloc(sa->sa_len, M_IFMADDR, mflags); if (dupsa == NULL) { free(ifma, M_IFMADDR); return (NULL); } bcopy(sa, dupsa, sa->sa_len); ifma->ifma_addr = dupsa; ifma->ifma_ifp = ifp; ifma->ifma_refcount = 1; ifma->ifma_protospec = NULL; if (llsa == NULL) { ifma->ifma_lladdr = NULL; return (ifma); } dupsa = malloc(llsa->sa_len, M_IFMADDR, mflags); if (dupsa == NULL) { free(ifma->ifma_addr, M_IFMADDR); free(ifma, M_IFMADDR); return (NULL); } bcopy(llsa, dupsa, llsa->sa_len); ifma->ifma_lladdr = dupsa; return (ifma); } /* * if_freemulti: free ifmultiaddr structure and possibly attached related * addresses. The caller is responsible for implementing reference * counting, notifying the driver, handling routing messages, and releasing * any dependent link layer state. 
*/ static void if_freemulti(struct ifmultiaddr *ifma) { KASSERT(ifma->ifma_refcount == 0, ("if_freemulti: refcount %d", ifma->ifma_refcount)); KASSERT(ifma->ifma_protospec == NULL, ("if_freemulti: protospec not NULL")); if (ifma->ifma_lladdr != NULL) free(ifma->ifma_lladdr, M_IFMADDR); free(ifma->ifma_addr, M_IFMADDR); free(ifma, M_IFMADDR); } /* * Register an additional multicast address with a network interface. * * - If the address is already present, bump the reference count on the * address and return. * - If the address is not link-layer, look up a link layer address. * - Allocate address structures for one or both addresses, and attach to the * multicast address list on the interface. If automatically adding a link * layer address, the protocol address will own a reference to the link * layer address, to be freed when it is freed. * - Notify the network device driver of an addition to the multicast address * list. * * 'sa' points to caller-owned memory with the desired multicast address. * * 'retifma' will be used to return a pointer to the resulting multicast * address reference, if desired. */ int if_addmulti(struct ifnet *ifp, struct sockaddr *sa, struct ifmultiaddr **retifma) { struct ifmultiaddr *ifma, *ll_ifma; struct sockaddr *llsa; int error; /* * If the address is already present, return a new reference to it; * otherwise, allocate storage and set up a new address. */ IF_ADDR_LOCK(ifp); ifma = if_findmulti(ifp, sa); if (ifma != NULL) { ifma->ifma_refcount++; if (retifma != NULL) *retifma = ifma; IF_ADDR_UNLOCK(ifp); return (0); } /* * The address isn't already present; resolve the protocol address * into a link layer address, and then look that up, bump its * refcount or allocate an ifma for that also. If 'llsa' was * returned, we will need to free it later. */ llsa = NULL; ll_ifma = NULL; if (ifp->if_resolvemulti != NULL) { error = ifp->if_resolvemulti(ifp, &llsa, sa); if (error) goto unlock_out; } /* * Allocate the new address. Don't hook it up yet, as we may also * need to allocate a link layer multicast address. */ ifma = if_allocmulti(ifp, sa, llsa, M_NOWAIT); if (ifma == NULL) { error = ENOMEM; goto free_llsa_out; } /* * If a link layer address is found, we'll need to see if it's * already present in the address list, or allocate is as well. * When this block finishes, the link layer address will be on the * list. */ if (llsa != NULL) { ll_ifma = if_findmulti(ifp, llsa); if (ll_ifma == NULL) { ll_ifma = if_allocmulti(ifp, llsa, NULL, M_NOWAIT); if (ll_ifma == NULL) { --ifma->ifma_refcount; if_freemulti(ifma); error = ENOMEM; goto free_llsa_out; } TAILQ_INSERT_HEAD(&ifp->if_multiaddrs, ll_ifma, ifma_link); } else ll_ifma->ifma_refcount++; ifma->ifma_llifma = ll_ifma; } /* * We now have a new multicast address, ifma, and possibly a new or * referenced link layer address. Add the primary address to the * ifnet address list. */ TAILQ_INSERT_HEAD(&ifp->if_multiaddrs, ifma, ifma_link); if (retifma != NULL) *retifma = ifma; /* * Must generate the message while holding the lock so that 'ifma' * pointer is still valid. */ rt_newmaddrmsg(RTM_NEWMADDR, ifma); IF_ADDR_UNLOCK(ifp); /* * We are certain we have added something, so call down to the * interface to let them know about it. 
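/*
 * Illustrative userland sketch (not part of this change): if_addmulti()
 * above is what eventually runs when a process joins a group; the usual
 * userland entry point is a setsockopt() such as IP_ADD_MEMBERSHIP, which
 * the inet code turns into an if_addmulti() call on the chosen interface.
 * The group 239.1.1.1 and the INADDR_ANY interface selection are sample
 * values.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <err.h>

int
main(void)
{
        struct ip_mreq mreq;
        int s;

        s = socket(AF_INET, SOCK_DGRAM, 0);
        if (s == -1)
                err(1, "socket");

        memset(&mreq, 0, sizeof(mreq));
        inet_pton(AF_INET, "239.1.1.1", &mreq.imr_multiaddr);   /* sample group */
        mreq.imr_interface.s_addr = htonl(INADDR_ANY);  /* let the kernel pick */

        /* This ends up in if_addmulti() for the selected interface. */
        if (setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq,
            sizeof(mreq)) == -1)
                err(1, "IP_ADD_MEMBERSHIP");
        printf("joined 239.1.1.1\n");

        /* Dropping the membership goes through if_delmulti() further below. */
        if (setsockopt(s, IPPROTO_IP, IP_DROP_MEMBERSHIP, &mreq,
            sizeof(mreq)) == -1)
                err(1, "IP_DROP_MEMBERSHIP");

        close(s);
        return (0);
}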
*/ if (ifp->if_ioctl != NULL) { IFF_LOCKGIANT(ifp); (void) (*ifp->if_ioctl)(ifp, SIOCADDMULTI, 0); IFF_UNLOCKGIANT(ifp); } if (llsa != NULL) free(llsa, M_IFMADDR); return (0); free_llsa_out: if (llsa != NULL) free(llsa, M_IFMADDR); unlock_out: IF_ADDR_UNLOCK(ifp); return (error); } /* * Delete a multicast group membership by network-layer group address. * * Returns ENOENT if the entry could not be found. If ifp no longer * exists, results are undefined. This entry point should only be used * from subsystems which do appropriate locking to hold ifp for the * duration of the call. * Network-layer protocol domains must use if_delmulti_ifma(). */ int if_delmulti(struct ifnet *ifp, struct sockaddr *sa) { struct ifmultiaddr *ifma; int lastref; #ifdef INVARIANTS struct ifnet *oifp; INIT_VNET_NET(ifp->if_vnet); IFNET_RLOCK(); TAILQ_FOREACH(oifp, &V_ifnet, if_link) if (ifp == oifp) break; if (ifp != oifp) ifp = NULL; IFNET_RUNLOCK(); KASSERT(ifp != NULL, ("%s: ifnet went away", __func__)); #endif if (ifp == NULL) return (ENOENT); IF_ADDR_LOCK(ifp); lastref = 0; ifma = if_findmulti(ifp, sa); if (ifma != NULL) lastref = if_delmulti_locked(ifp, ifma, 0); IF_ADDR_UNLOCK(ifp); if (ifma == NULL) return (ENOENT); if (lastref && ifp->if_ioctl != NULL) { IFF_LOCKGIANT(ifp); (void)(*ifp->if_ioctl)(ifp, SIOCDELMULTI, 0); IFF_UNLOCKGIANT(ifp); } return (0); } /* * Delete a multicast group membership by group membership pointer. * Network-layer protocol domains must use this routine. * * It is safe to call this routine if the ifp disappeared. Callers should * hold IFF_LOCKGIANT() to avoid a LOR in case the hardware needs to be * reconfigured. */ void if_delmulti_ifma(struct ifmultiaddr *ifma) { #ifdef DIAGNOSTIC INIT_VNET_NET(curvnet); #endif struct ifnet *ifp; int lastref; ifp = ifma->ifma_ifp; #ifdef DIAGNOSTIC if (ifp == NULL) { printf("%s: ifma_ifp seems to be detached\n", __func__); } else { struct ifnet *oifp; IFNET_RLOCK(); TAILQ_FOREACH(oifp, &V_ifnet, if_link) if (ifp == oifp) break; if (ifp != oifp) { printf("%s: ifnet %p disappeared\n", __func__, ifp); ifp = NULL; } IFNET_RUNLOCK(); } #endif /* * If and only if the ifnet instance exists: Acquire the address lock. */ if (ifp != NULL) IF_ADDR_LOCK(ifp); lastref = if_delmulti_locked(ifp, ifma, 0); if (ifp != NULL) { /* * If and only if the ifnet instance exists: * Release the address lock. * If the group was left: update the hardware hash filter. */ IF_ADDR_UNLOCK(ifp); if (lastref && ifp->if_ioctl != NULL) { IFF_LOCKGIANT(ifp); (void)(*ifp->if_ioctl)(ifp, SIOCDELMULTI, 0); IFF_UNLOCKGIANT(ifp); } } } /* * Perform deletion of network-layer and/or link-layer multicast address. * * Return 0 if the reference count was decremented. * Return 1 if the final reference was released, indicating that the * hardware hash filter should be reprogrammed. */ static int if_delmulti_locked(struct ifnet *ifp, struct ifmultiaddr *ifma, int detaching) { struct ifmultiaddr *ll_ifma; if (ifp != NULL && ifma->ifma_ifp != NULL) { KASSERT(ifma->ifma_ifp == ifp, ("%s: inconsistent ifp %p", __func__, ifp)); IF_ADDR_LOCK_ASSERT(ifp); } ifp = ifma->ifma_ifp; /* * If the ifnet is detaching, null out references to ifnet, * so that upper protocol layers will notice, and not attempt * to obtain locks for an ifnet which no longer exists. The * routing socket announcement must happen before the ifnet * instance is detached from the system. 
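/*
 * Illustrative sketch (not from this revision): if_delmulti_locked(),
 * below, releases a network-layer ifmultiaddr and, when that was its last
 * reference, also drops the reference it held on its link-layer companion;
 * only a final release means the hardware filter must be reprogrammed.
 * The model below uses invented names (struct maddr, maddr_release()).
 */
#include <stdio.h>
#include <stdlib.h>

struct maddr {
        int             refcount;
        struct maddr    *ll;            /* like ifma_llifma */
        const char      *name;
};

/* Returns 1 when the entry itself was freed (filter needs reprogramming). */
static int
maddr_release(struct maddr *ma)
{
        if (--ma->refcount > 0)
                return (0);
        if (ma->ll != NULL)
                (void)maddr_release(ma->ll);    /* drop the companion ref */
        printf("freeing %s\n", ma->name);
        free(ma);
        return (1);
}

static struct maddr *
maddr_new(const char *name, int refs, struct maddr *ll)
{
        struct maddr *ma = malloc(sizeof(*ma));

        if (ma == NULL)
                abort();
        ma->refcount = refs;
        ma->ll = ll;
        ma->name = name;
        return (ma);
}

int
main(void)
{
        struct maddr *ll = maddr_new("link-layer", 1, NULL);
        struct maddr *inma = maddr_new("inet group", 2, ll);

        printf("reprogram: %d\n", maddr_release(inma)); /* drops a ref: 0 */
        printf("reprogram: %d\n", maddr_release(inma)); /* frees both: 1 */
        return (0);
}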
*/ if (detaching) { #ifdef DIAGNOSTIC printf("%s: detaching ifnet instance %p\n", __func__, ifp); #endif /* * ifp may already be nulled out if we are being reentered * to delete the ll_ifma. */ if (ifp != NULL) { rt_newmaddrmsg(RTM_DELMADDR, ifma); ifma->ifma_ifp = NULL; } } if (--ifma->ifma_refcount > 0) return 0; /* * If this ifma is a network-layer ifma, a link-layer ifma may * have been associated with it. Release it first if so. */ ll_ifma = ifma->ifma_llifma; if (ll_ifma != NULL) { KASSERT(ifma->ifma_lladdr != NULL, ("%s: llifma w/o lladdr", __func__)); if (detaching) ll_ifma->ifma_ifp = NULL; /* XXX */ if (--ll_ifma->ifma_refcount == 0) { if (ifp != NULL) { TAILQ_REMOVE(&ifp->if_multiaddrs, ll_ifma, ifma_link); } if_freemulti(ll_ifma); } } if (ifp != NULL) TAILQ_REMOVE(&ifp->if_multiaddrs, ifma, ifma_link); if_freemulti(ifma); /* * The last reference to this instance of struct ifmultiaddr * was released; the hardware should be notified of this change. */ return 1; } /* * Set the link layer address on an interface. * * At this time we only support certain types of interfaces, * and we don't allow the length of the address to change. */ int if_setlladdr(struct ifnet *ifp, const u_char *lladdr, int len) { struct sockaddr_dl *sdl; struct ifaddr *ifa; struct ifreq ifr; ifa = ifp->if_addr; if (ifa == NULL) return (EINVAL); sdl = (struct sockaddr_dl *)ifa->ifa_addr; if (sdl == NULL) return (EINVAL); if (len != sdl->sdl_alen) /* don't allow length to change */ return (EINVAL); switch (ifp->if_type) { case IFT_ETHER: case IFT_FDDI: case IFT_XETHER: case IFT_ISO88025: case IFT_L2VLAN: case IFT_BRIDGE: case IFT_ARCNET: case IFT_IEEE8023ADLAG: bcopy(lladdr, LLADDR(sdl), len); break; default: return (ENODEV); } /* * If the interface is already up, we need * to re-init it in order to reprogram its * address filter. */ if ((ifp->if_flags & IFF_UP) != 0) { if (ifp->if_ioctl) { IFF_LOCKGIANT(ifp); ifp->if_flags &= ~IFF_UP; ifr.ifr_flags = ifp->if_flags & 0xffff; ifr.ifr_flagshigh = ifp->if_flags >> 16; (*ifp->if_ioctl)(ifp, SIOCSIFFLAGS, (caddr_t)&ifr); ifp->if_flags |= IFF_UP; ifr.ifr_flags = ifp->if_flags & 0xffff; ifr.ifr_flagshigh = ifp->if_flags >> 16; (*ifp->if_ioctl)(ifp, SIOCSIFFLAGS, (caddr_t)&ifr); IFF_UNLOCKGIANT(ifp); } #ifdef INET /* * Also send gratuitous ARPs to notify other nodes about * the address change. */ TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family == AF_INET) arp_ifinit(ifp, ifa); } #endif } return (0); } /* * The name argument must be a pointer to storage which will last as * long as the interface does. For physical devices, the result of * device_get_name(dev) is a good choice and for pseudo-devices a * static string works well. */ void if_initname(struct ifnet *ifp, const char *name, int unit) { ifp->if_dname = name; ifp->if_dunit = unit; if (unit != IF_DUNIT_NONE) snprintf(ifp->if_xname, IFNAMSIZ, "%s%d", name, unit); else strlcpy(ifp->if_xname, name, IFNAMSIZ); } int if_printf(struct ifnet *ifp, const char * fmt, ...) { va_list ap; int retval; retval = printf("%s: ", ifp->if_xname); va_start(ap, fmt); retval += vprintf(fmt, ap); va_end(ap); return (retval); } /* * When an interface is marked IFF_NEEDSGIANT, its if_start() routine cannot * be called without Giant. However, we often can't acquire the Giant lock * at those points; instead, we run it via a task queue that holds Giant via * if_start_deferred. 
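/*
 * Illustrative userland sketch (not part of the diff): if_setlladdr()
 * above reprograms the receive filter by clearing and re-setting IFF_UP
 * through the driver's SIOCSIFFLAGS path.  The program below only reads
 * and prints the flags, reassembling them from the two 16-bit ifreq fields
 * the way ifhwioctl() splits them; "em0" is an assumed interface name, and
 * actually bouncing IFF_UP needs privilege and interrupts traffic.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <sys/sockio.h>
#include <net/if.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <err.h>

int
main(void)
{
        struct ifreq ifr;
        int flags, s;

        s = socket(AF_INET, SOCK_DGRAM, 0);
        if (s == -1)
                err(1, "socket");

        memset(&ifr, 0, sizeof(ifr));
        strlcpy(ifr.ifr_name, "em0", sizeof(ifr.ifr_name));     /* assumed */
        if (ioctl(s, SIOCGIFFLAGS, &ifr) == -1)
                err(1, "SIOCGIFFLAGS");

        /* Flags are split across two 16-bit fields, as in ifhwioctl() above. */
        flags = (ifr.ifr_flags & 0xffff) | (ifr.ifr_flagshigh << 16);
        printf("%s is %s\n", ifr.ifr_name, (flags & IFF_UP) ? "up" : "down");

        /*
         * To bounce the interface (as if_setlladdr() does internally) one
         * would clear IFF_UP, issue SIOCSIFFLAGS, set IFF_UP and issue it
         * again; that step is omitted here to keep the sketch harmless.
         */
        close(s);
        return (0);
}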
* * XXXRW: We need to make sure that the ifnet isn't fully detached until any * outstanding if_start_deferred() tasks that will run after the free. This * probably means waiting in if_detach(). */ void if_start(struct ifnet *ifp) { if (ifp->if_flags & IFF_NEEDSGIANT) { if (mtx_owned(&Giant)) (*(ifp)->if_start)(ifp); else taskqueue_enqueue(taskqueue_swi_giant, &ifp->if_starttask); } else (*(ifp)->if_start)(ifp); } static void if_start_deferred(void *context, int pending) { struct ifnet *ifp; GIANT_REQUIRED; ifp = context; (ifp->if_start)(ifp); } /* * Backwards compatibility interface for drivers * that have not implemented it */ static int if_transmit(struct ifnet *ifp, struct mbuf *m) { int error; IFQ_HANDOFF(ifp, m, error); return (error); } int if_handoff(struct ifqueue *ifq, struct mbuf *m, struct ifnet *ifp, int adjust) { int active = 0; IF_LOCK(ifq); if (_IF_QFULL(ifq)) { _IF_DROP(ifq); IF_UNLOCK(ifq); m_freem(m); return (0); } if (ifp != NULL) { ifp->if_obytes += m->m_pkthdr.len + adjust; if (m->m_flags & (M_BCAST|M_MCAST)) ifp->if_omcasts++; active = ifp->if_drv_flags & IFF_DRV_OACTIVE; } _IF_ENQUEUE(ifq, m); IF_UNLOCK(ifq); if (ifp != NULL && !active) if_start(ifp); return (1); } void if_register_com_alloc(u_char type, if_com_alloc_t *a, if_com_free_t *f) { KASSERT(if_com_alloc[type] == NULL, ("if_register_com_alloc: %d already registered", type)); KASSERT(if_com_free[type] == NULL, ("if_register_com_alloc: %d free already registered", type)); if_com_alloc[type] = a; if_com_free[type] = f; } void if_deregister_com_alloc(u_char type) { KASSERT(if_com_alloc[type] != NULL, ("if_deregister_com_alloc: %d not registered", type)); KASSERT(if_com_free[type] != NULL, ("if_deregister_com_alloc: %d free not registered", type)); if_com_alloc[type] = NULL; if_com_free[type] = NULL; } Index: head/sys/net/if_arcsubr.c =================================================================== --- head/sys/net/if_arcsubr.c (revision 186118) +++ head/sys/net/if_arcsubr.c (revision 186119) @@ -1,879 +1,881 @@ /* $NetBSD: if_arcsubr.c,v 1.36 2001/06/14 05:44:23 itojun Exp $ */ /* $FreeBSD$ */ /*- * Copyright (c) 1994, 1995 Ignatios Souvatzis * Copyright (c) 1982, 1989, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: NetBSD: if_ethersubr.c,v 1.9 1994/06/29 06:36:11 cgd Exp * @(#)if_ethersubr.c 8.1 (Berkeley) 6/10/93 * */ #include "opt_inet.h" #include "opt_inet6.h" #include "opt_ipx.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #if defined(INET) || defined(INET6) #include #include #include #endif #ifdef INET6 #include #endif #ifdef IPX #include #include #endif #define ARCNET_ALLOW_BROKEN_ARP static struct mbuf *arc_defrag(struct ifnet *, struct mbuf *); static int arc_resolvemulti(struct ifnet *, struct sockaddr **, struct sockaddr *); u_int8_t arcbroadcastaddr = 0; #define ARC_LLADDR(ifp) (*(u_int8_t *)IF_LLADDR(ifp)) #define senderr(e) { error = (e); goto bad;} #define SIN(s) ((struct sockaddr_in *)s) #define SIPX(s) ((struct sockaddr_ipx *)s) /* * ARCnet output routine. * Encapsulate a packet of type family for the local net. * Assumes that ifp is actually pointer to arccom structure. */ int arc_output(struct ifnet *ifp, struct mbuf *m, struct sockaddr *dst, struct rtentry *rt0) { struct arc_header *ah; int error; u_int8_t atype, adst; int loop_copy = 0; int isphds; + struct llentry *lle; if (!((ifp->if_flags & IFF_UP) && (ifp->if_drv_flags & IFF_DRV_RUNNING))) return(ENETDOWN); /* m, m1 aren't initialized yet */ error = 0; switch (dst->sa_family) { #ifdef INET case AF_INET: /* * For now, use the simple IP addr -> ARCnet addr mapping */ if (m->m_flags & (M_BCAST|M_MCAST)) adst = arcbroadcastaddr; /* ARCnet broadcast address */ else if (ifp->if_flags & IFF_NOARP) adst = ntohl(SIN(dst)->sin_addr.s_addr) & 0xFF; else { - error = arpresolve(ifp, rt0, m, dst, &adst); + error = arpresolve(ifp, rt0, m, dst, &adst, &lle); if (error) return (error == EWOULDBLOCK ? 0 : error); } atype = (ifp->if_flags & IFF_LINK0) ? ARCTYPE_IP_OLD : ARCTYPE_IP; break; case AF_ARP: { struct arphdr *ah; ah = mtod(m, struct arphdr *); ah->ar_hrd = htons(ARPHRD_ARCNET); loop_copy = -1; /* if this is for us, don't do it */ switch(ntohs(ah->ar_op)) { case ARPOP_REVREQUEST: case ARPOP_REVREPLY: atype = ARCTYPE_REVARP; break; case ARPOP_REQUEST: case ARPOP_REPLY: default: atype = ARCTYPE_ARP; break; } if (m->m_flags & M_BCAST) bcopy(ifp->if_broadcastaddr, &adst, ARC_ADDR_LEN); else bcopy(ar_tha(ah), &adst, ARC_ADDR_LEN); } break; #endif #ifdef INET6 case AF_INET6: - error = nd6_storelladdr(ifp, rt0, m, dst, (u_char *)&adst); + error = nd6_storelladdr(ifp, rt0, m, dst, (u_char *)&adst, &lle); if (error) return (error); atype = ARCTYPE_INET6; break; #endif #ifdef IPX case AF_IPX: adst = SIPX(dst)->sipx_addr.x_host.c_host[5]; atype = ARCTYPE_IPX; if (adst == 0xff) adst = arcbroadcastaddr; break; #endif case AF_UNSPEC: loop_copy = -1; ah = (struct arc_header *)dst->sa_data; adst = ah->arc_dhost; atype = ah->arc_type; if (atype == ARCTYPE_ARP) { atype = (ifp->if_flags & IFF_LINK0) ? 
ARCTYPE_ARP_OLD: ARCTYPE_ARP; #ifdef ARCNET_ALLOW_BROKEN_ARP /* * XXX It's not clear per RFC826 if this is needed, but * "assigned numbers" say this is wrong. * However, e.g., AmiTCP 3.0Beta used it... we make this * switchable for emergency cases. Not perfect, but... */ if (ifp->if_flags & IFF_LINK2) mtod(m, struct arphdr *)->ar_pro = atype - 1; #endif } break; default: if_printf(ifp, "can't handle af%d\n", dst->sa_family); senderr(EAFNOSUPPORT); } isphds = arc_isphds(atype); M_PREPEND(m, isphds ? ARC_HDRNEWLEN : ARC_HDRLEN, M_DONTWAIT); if (m == 0) senderr(ENOBUFS); ah = mtod(m, struct arc_header *); ah->arc_type = atype; ah->arc_dhost = adst; ah->arc_shost = ARC_LLADDR(ifp); if (isphds) { ah->arc_flag = 0; ah->arc_seqid = 0; } if ((ifp->if_flags & IFF_SIMPLEX) && (loop_copy != -1)) { if ((m->m_flags & M_BCAST) || (loop_copy > 0)) { struct mbuf *n = m_copy(m, 0, (int)M_COPYALL); (void) if_simloop(ifp, n, dst->sa_family, ARC_HDRLEN); } else if (ah->arc_dhost == ah->arc_shost) { (void) if_simloop(ifp, m, dst->sa_family, ARC_HDRLEN); return (0); /* XXX */ } } BPF_MTAP(ifp, m); IFQ_HANDOFF(ifp, m, error); return (error); bad: if (m) m_freem(m); return (error); } void arc_frag_init(struct ifnet *ifp) { struct arccom *ac; ac = (struct arccom *)ifp->if_l2com; ac->curr_frag = 0; } struct mbuf * arc_frag_next(struct ifnet *ifp) { struct arccom *ac; struct mbuf *m; struct arc_header *ah; ac = (struct arccom *)ifp->if_l2com; if ((m = ac->curr_frag) == 0) { int tfrags; /* dequeue new packet */ IF_DEQUEUE(&ifp->if_snd, m); if (m == 0) return 0; ah = mtod(m, struct arc_header *); if (!arc_isphds(ah->arc_type)) return m; ++ac->ac_seqid; /* make the seqid unique */ tfrags = (m->m_pkthdr.len + ARC_MAX_DATA - 1) / ARC_MAX_DATA; ac->fsflag = 2 * tfrags - 3; ac->sflag = 0; ac->rsflag = ac->fsflag; ac->arc_dhost = ah->arc_dhost; ac->arc_shost = ah->arc_shost; ac->arc_type = ah->arc_type; m_adj(m, ARC_HDRNEWLEN); ac->curr_frag = m; } /* split out next fragment and return it */ if (ac->sflag < ac->fsflag) { /* we CAN'T have short packets here */ ac->curr_frag = m_split(m, ARC_MAX_DATA, M_DONTWAIT); if (ac->curr_frag == 0) { m_freem(m); return 0; } M_PREPEND(m, ARC_HDRNEWLEN, M_DONTWAIT); if (m == 0) { m_freem(ac->curr_frag); ac->curr_frag = 0; return 0; } ah = mtod(m, struct arc_header *); ah->arc_flag = ac->rsflag; ah->arc_seqid = ac->ac_seqid; ac->sflag += 2; ac->rsflag = ac->sflag; } else if ((m->m_pkthdr.len >= ARC_MIN_FORBID_LEN - ARC_HDRNEWLEN + 2) && (m->m_pkthdr.len <= ARC_MAX_FORBID_LEN - ARC_HDRNEWLEN + 2)) { ac->curr_frag = 0; M_PREPEND(m, ARC_HDRNEWLEN_EXC, M_DONTWAIT); if (m == 0) return 0; ah = mtod(m, struct arc_header *); ah->arc_flag = 0xFF; ah->arc_seqid = 0xFFFF; ah->arc_type2 = ac->arc_type; ah->arc_flag2 = ac->sflag; ah->arc_seqid2 = ac->ac_seqid; } else { ac->curr_frag = 0; M_PREPEND(m, ARC_HDRNEWLEN, M_DONTWAIT); if (m == 0) return 0; ah = mtod(m, struct arc_header *); ah->arc_flag = ac->sflag; ah->arc_seqid = ac->ac_seqid; } ah->arc_dhost = ac->arc_dhost; ah->arc_shost = ac->arc_shost; ah->arc_type = ac->arc_type; return m; } /* * Defragmenter. Returns mbuf if last packet found, else * NULL. frees imcoming mbuf as necessary. 
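/*
 * Illustrative sketch (not from this change): arc_frag_next() above derives
 * the per-fragment "split flag" from the fragment count: the first fragment
 * carries the odd value 2 * tfrags - 3 and every later fragment carries the
 * even running sflag, which is what arc_defrag() below checks against
 * af_lastseen and af_maxflag.  The helper below prints that sequence for a
 * payload length; 512 stands in for ARC_MAX_DATA from net/if_arc.h and is
 * used only as an example value.
 */
#include <stdio.h>

static void
print_split_flags(int len, int max_data)
{
        int tfrags = (len + max_data - 1) / max_data;   /* as in arc_frag_next() */
        int fsflag = 2 * tfrags - 3;
        int sflag = 0;
        int frag = 1;

        if (tfrags <= 1) {
                printf("%d bytes: single packet, flag 0\n", len);
                return;
        }
        /* All but the last fragment: emit the running flag, then advance. */
        while (sflag < fsflag) {
                int flag = (frag == 1) ? fsflag : sflag;

                printf("fragment %d: flag %d\n", frag++, flag);
                sflag += 2;
        }
        /* The last fragment carries the final (even) sflag value. */
        printf("fragment %d: flag %d (last)\n", frag, sflag);
}

int
main(void)
{
        print_split_flags(1200, 512);   /* 3 fragments: flags 3, 2, 4 */
        return (0);
}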
*/ static __inline struct mbuf * arc_defrag(struct ifnet *ifp, struct mbuf *m) { struct arc_header *ah, *ah1; struct arccom *ac; struct ac_frag *af; struct mbuf *m1; char *s; int newflen; u_char src,dst,typ; ac = (struct arccom *)ifp->if_l2com; if (m->m_len < ARC_HDRNEWLEN) { m = m_pullup(m, ARC_HDRNEWLEN); if (m == NULL) { ++ifp->if_ierrors; return NULL; } } ah = mtod(m, struct arc_header *); typ = ah->arc_type; if (!arc_isphds(typ)) return m; src = ah->arc_shost; dst = ah->arc_dhost; if (ah->arc_flag == 0xff) { m_adj(m, 4); if (m->m_len < ARC_HDRNEWLEN) { m = m_pullup(m, ARC_HDRNEWLEN); if (m == NULL) { ++ifp->if_ierrors; return NULL; } } ah = mtod(m, struct arc_header *); } af = &ac->ac_fragtab[src]; m1 = af->af_packet; s = "debug code error"; if (ah->arc_flag & 1) { /* * first fragment. We always initialize, which is * about the right thing to do, as we only want to * accept one fragmented packet per src at a time. */ if (m1 != NULL) m_freem(m1); af->af_packet = m; m1 = m; af->af_maxflag = ah->arc_flag; af->af_lastseen = 0; af->af_seqid = ah->arc_seqid; return NULL; /* notreached */ } else { /* check for unfragmented packet */ if (ah->arc_flag == 0) return m; /* do we have a first packet from that src? */ if (m1 == NULL) { s = "no first frag"; goto outofseq; } ah1 = mtod(m1, struct arc_header *); if (ah->arc_seqid != ah1->arc_seqid) { s = "seqid differs"; goto outofseq; } if (typ != ah1->arc_type) { s = "type differs"; goto outofseq; } if (dst != ah1->arc_dhost) { s = "dest host differs"; goto outofseq; } /* typ, seqid and dst are ok here. */ if (ah->arc_flag == af->af_lastseen) { m_freem(m); return NULL; } if (ah->arc_flag == af->af_lastseen + 2) { /* ok, this is next fragment */ af->af_lastseen = ah->arc_flag; m_adj(m,ARC_HDRNEWLEN); /* * m_cat might free the first mbuf (with pkthdr) * in 2nd chain; therefore: */ newflen = m->m_pkthdr.len; m_cat(m1,m); m1->m_pkthdr.len += newflen; /* is it the last one? */ if (af->af_lastseen > af->af_maxflag) { af->af_packet = NULL; return(m1); } else return NULL; } s = "other reason"; /* if all else fails, it is out of sequence, too */ } outofseq: if (m1) { m_freem(m1); af->af_packet = NULL; } if (m) m_freem(m); log(LOG_INFO,"%s: got out of seq. packet: %s\n", ifp->if_xname, s); return NULL; } /* * return 1 if Packet Header Definition Standard, else 0. * For now: old IP, old ARP aren't obviously. Lacking correct information, * we guess that besides new IP and new ARP also IPX and APPLETALK are PHDS. * (Apple and Novell corporations were involved, among others, in PHDS work). * Easiest is to assume that everybody else uses that, too. */ int arc_isphds(u_int8_t type) { return (type != ARCTYPE_IP_OLD && type != ARCTYPE_ARP_OLD && type != ARCTYPE_DIAGNOSE); } /* * Process a received Arcnet packet; * the packet is in the mbuf chain m with * the ARCnet header. */ void arc_input(struct ifnet *ifp, struct mbuf *m) { struct arc_header *ah; int isr; u_int8_t atype; if ((ifp->if_flags & IFF_UP) == 0) { m_freem(m); return; } /* possibly defragment: */ m = arc_defrag(ifp, m); if (m == NULL) return; BPF_MTAP(ifp, m); ah = mtod(m, struct arc_header *); /* does this belong to us? 
*/ if ((ifp->if_flags & IFF_PROMISC) == 0 && ah->arc_dhost != arcbroadcastaddr && ah->arc_dhost != ARC_LLADDR(ifp)) { m_freem(m); return; } ifp->if_ibytes += m->m_pkthdr.len; if (ah->arc_dhost == arcbroadcastaddr) { m->m_flags |= M_BCAST|M_MCAST; ifp->if_imcasts++; } atype = ah->arc_type; switch (atype) { #ifdef INET case ARCTYPE_IP: m_adj(m, ARC_HDRNEWLEN); if ((m = ip_fastforward(m)) == NULL) return; isr = NETISR_IP; break; case ARCTYPE_IP_OLD: m_adj(m, ARC_HDRLEN); if ((m = ip_fastforward(m)) == NULL) return; isr = NETISR_IP; break; case ARCTYPE_ARP: if (ifp->if_flags & IFF_NOARP) { /* Discard packet if ARP is disabled on interface */ m_freem(m); return; } m_adj(m, ARC_HDRNEWLEN); isr = NETISR_ARP; #ifdef ARCNET_ALLOW_BROKEN_ARP mtod(m, struct arphdr *)->ar_pro = htons(ETHERTYPE_IP); #endif break; case ARCTYPE_ARP_OLD: if (ifp->if_flags & IFF_NOARP) { /* Discard packet if ARP is disabled on interface */ m_freem(m); return; } m_adj(m, ARC_HDRLEN); isr = NETISR_ARP; #ifdef ARCNET_ALLOW_BROKEN_ARP mtod(m, struct arphdr *)->ar_pro = htons(ETHERTYPE_IP); #endif break; #endif #ifdef INET6 case ARCTYPE_INET6: m_adj(m, ARC_HDRNEWLEN); isr = NETISR_IPV6; break; #endif #ifdef IPX case ARCTYPE_IPX: m_adj(m, ARC_HDRNEWLEN); isr = NETISR_IPX; break; #endif default: m_freem(m); return; } netisr_dispatch(isr, m); } /* * Register (new) link level address. */ void arc_storelladdr(struct ifnet *ifp, u_int8_t lla) { ARC_LLADDR(ifp) = lla; } /* * Perform common duties while attaching to interface list */ void arc_ifattach(struct ifnet *ifp, u_int8_t lla) { struct ifaddr *ifa; struct sockaddr_dl *sdl; struct arccom *ac; if_attach(ifp); ifp->if_addrlen = 1; ifp->if_hdrlen = ARC_HDRLEN; ifp->if_mtu = 1500; ifp->if_resolvemulti = arc_resolvemulti; if (ifp->if_baudrate == 0) ifp->if_baudrate = 2500000; #if __FreeBSD_version < 500000 ifa = ifnet_addrs[ifp->if_index - 1]; #else ifa = ifp->if_addr; #endif KASSERT(ifa != NULL, ("%s: no lladdr!\n", __func__)); sdl = (struct sockaddr_dl *)ifa->ifa_addr; sdl->sdl_type = IFT_ARCNET; sdl->sdl_alen = ifp->if_addrlen; if (ifp->if_flags & IFF_BROADCAST) ifp->if_flags |= IFF_MULTICAST|IFF_ALLMULTI; ac = (struct arccom *)ifp->if_l2com; ac->ac_seqid = (time_second) & 0xFFFF; /* try to make seqid unique */ if (lla == 0) { /* XXX this message isn't entirely clear, to me -- cgd */ log(LOG_ERR,"%s: link address 0 reserved for broadcasts. 
Please change it and ifconfig %s down up\n", ifp->if_xname, ifp->if_xname); } arc_storelladdr(ifp, lla); ifp->if_broadcastaddr = &arcbroadcastaddr; bpfattach(ifp, DLT_ARCNET, ARC_HDRLEN); } void arc_ifdetach(struct ifnet *ifp) { bpfdetach(ifp); if_detach(ifp); } int arc_ioctl(struct ifnet *ifp, int command, caddr_t data) { struct ifaddr *ifa = (struct ifaddr *) data; struct ifreq *ifr = (struct ifreq *) data; int error = 0; switch (command) { case SIOCSIFADDR: ifp->if_flags |= IFF_UP; switch (ifa->ifa_addr->sa_family) { #ifdef INET case AF_INET: ifp->if_init(ifp->if_softc); /* before arpwhohas */ arp_ifinit(ifp, ifa); break; #endif #ifdef IPX /* * XXX This code is probably wrong */ case AF_IPX: { struct ipx_addr *ina = &(IA_SIPX(ifa)->sipx_addr); if (ipx_nullhost(*ina)) ina->x_host.c_host[5] = ARC_LLADDR(ifp); else arc_storelladdr(ifp, ina->x_host.c_host[5]); /* * Set new address */ ifp->if_init(ifp->if_softc); break; } #endif default: ifp->if_init(ifp->if_softc); break; } break; case SIOCGIFADDR: { struct sockaddr *sa; sa = (struct sockaddr *) &ifr->ifr_data; *(u_int8_t *)sa->sa_data = ARC_LLADDR(ifp); } break; case SIOCADDMULTI: case SIOCDELMULTI: if (ifr == NULL) error = EAFNOSUPPORT; else { switch (ifr->ifr_addr.sa_family) { case AF_INET: case AF_INET6: error = 0; break; default: error = EAFNOSUPPORT; break; } } break; case SIOCSIFMTU: /* * Set the interface MTU. * mtu can't be larger than ARCMTU for RFC1051 * and can't be larger than ARC_PHDS_MTU */ if (((ifp->if_flags & IFF_LINK0) && ifr->ifr_mtu > ARCMTU) || ifr->ifr_mtu > ARC_PHDS_MAXMTU) error = EINVAL; else ifp->if_mtu = ifr->ifr_mtu; break; } return (error); } /* based on ether_resolvemulti() */ int arc_resolvemulti(struct ifnet *ifp, struct sockaddr **llsa, struct sockaddr *sa) { struct sockaddr_dl *sdl; #ifdef INET struct sockaddr_in *sin; #endif #ifdef INET6 struct sockaddr_in6 *sin6; #endif switch(sa->sa_family) { case AF_LINK: /* * No mapping needed. Just check that it's a valid MC address. */ sdl = (struct sockaddr_dl *)sa; if (*LLADDR(sdl) != arcbroadcastaddr) return EADDRNOTAVAIL; *llsa = 0; return 0; #ifdef INET case AF_INET: sin = (struct sockaddr_in *)sa; if (!IN_MULTICAST(ntohl(sin->sin_addr.s_addr))) return EADDRNOTAVAIL; sdl = malloc(sizeof *sdl, M_IFMADDR, M_NOWAIT | M_ZERO); if (sdl == NULL) return ENOMEM; sdl->sdl_len = sizeof *sdl; sdl->sdl_family = AF_LINK; sdl->sdl_index = ifp->if_index; sdl->sdl_type = IFT_ARCNET; sdl->sdl_alen = ARC_ADDR_LEN; *LLADDR(sdl) = 0; *llsa = (struct sockaddr *)sdl; return 0; #endif #ifdef INET6 case AF_INET6: sin6 = (struct sockaddr_in6 *)sa; if (IN6_IS_ADDR_UNSPECIFIED(&sin6->sin6_addr)) { /* * An IP6 address of 0 means listen to all * of the Ethernet multicast address used for IP6. * (This is used for multicast routers.) */ ifp->if_flags |= IFF_ALLMULTI; *llsa = 0; return 0; } if (!IN6_IS_ADDR_MULTICAST(&sin6->sin6_addr)) return EADDRNOTAVAIL; sdl = malloc(sizeof *sdl, M_IFMADDR, M_NOWAIT | M_ZERO); if (sdl == NULL) return ENOMEM; sdl->sdl_len = sizeof *sdl; sdl->sdl_family = AF_LINK; sdl->sdl_index = ifp->if_index; sdl->sdl_type = IFT_ARCNET; sdl->sdl_alen = ARC_ADDR_LEN; *LLADDR(sdl) = 0; *llsa = (struct sockaddr *)sdl; return 0; #endif default: /* * Well, the text isn't quite right, but it's the name * that counts... 
*/ return EAFNOSUPPORT; } } MALLOC_DEFINE(M_ARCCOM, "arccom", "ARCNET interface internals"); static void* arc_alloc(u_char type, struct ifnet *ifp) { struct arccom *ac; ac = malloc(sizeof(struct arccom), M_ARCCOM, M_WAITOK | M_ZERO); ac->ac_ifp = ifp; return (ac); } static void arc_free(void *com, u_char type) { free(com, M_ARCCOM); } static int arc_modevent(module_t mod, int type, void *data) { switch (type) { case MOD_LOAD: if_register_com_alloc(IFT_ARCNET, arc_alloc, arc_free); break; case MOD_UNLOAD: if_deregister_com_alloc(IFT_ARCNET); break; default: return EOPNOTSUPP; } return (0); } static moduledata_t arc_mod = { "arcnet", arc_modevent, 0 }; DECLARE_MODULE(arcnet, arc_mod, SI_SUB_INIT_IF, SI_ORDER_ANY); MODULE_VERSION(arcnet, 1); Index: head/sys/net/if_atmsubr.c =================================================================== --- head/sys/net/if_atmsubr.c (revision 186118) +++ head/sys/net/if_atmsubr.c (revision 186119) @@ -1,514 +1,503 @@ /* $NetBSD: if_atmsubr.c,v 1.10 1997/03/11 23:19:51 chuck Exp $ */ /*- * * Copyright (c) 1996 Charles D. Cranor and Washington University. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by Charles D. Cranor and * Washington University. * 4. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * if_atmsubr.c */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include "opt_mac.h" #include "opt_natm.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* XXX: for ETHERTYPE_* */ #if defined(INET) || defined(INET6) #include #endif #ifdef NATM #include #endif #include /* * Netgraph interface functions. * These need not be protected by a lock, because ng_atm nodes are persitent. * The ng_atm module can be unloaded only if all ATM interfaces have been * unloaded, so nobody should be in the code paths accessing these function * pointers. 
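/*
 * Illustrative sketch (not part of this revision): the hooks described
 * above are bare function pointers that stay NULL until the optional
 * module loads and are tested at every call site.  The userland model
 * below uses invented names (input_hook_p, register_input_hook()).
 */
#include <stdio.h>
#include <stddef.h>

/* Optional hook, NULL until some "module" registers itself. */
static void (*input_hook_p)(int);

static void
register_input_hook(void (*fn)(int))
{
        input_hook_p = fn;      /* module load */
}

static void
unregister_input_hook(void)
{
        input_hook_p = NULL;    /* module unload */
}

/* Call site: the base code never assumes the module is present. */
static void
process_packet(int id)
{
        printf("base processing of packet %d\n", id);
        if (input_hook_p != NULL)
                (*input_hook_p)(id);
}

static void
demo_hook(int id)
{
        printf("  optional module saw packet %d\n", id);
}

int
main(void)
{
        process_packet(1);              /* module not loaded */
        register_input_hook(demo_hook);
        process_packet(2);              /* module loaded */
        unregister_input_hook();
        process_packet(3);
        return (0);
}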
*/ void (*ng_atm_attach_p)(struct ifnet *); void (*ng_atm_detach_p)(struct ifnet *); int (*ng_atm_output_p)(struct ifnet *, struct mbuf **); void (*ng_atm_input_p)(struct ifnet *, struct mbuf **, struct atm_pseudohdr *, void *); void (*ng_atm_input_orphan_p)(struct ifnet *, struct mbuf *, struct atm_pseudohdr *, void *); void (*ng_atm_event_p)(struct ifnet *, uint32_t, void *); /* * Harp pseudo interface hooks */ void (*atm_harp_input_p)(struct ifnet *ifp, struct mbuf **m, struct atm_pseudohdr *ah, void *rxhand); void (*atm_harp_attach_p)(struct ifnet *); void (*atm_harp_detach_p)(struct ifnet *); void (*atm_harp_event_p)(struct ifnet *, uint32_t, void *); SYSCTL_NODE(_hw, OID_AUTO, atm, CTLFLAG_RW, 0, "ATM hardware"); MALLOC_DEFINE(M_IFATM, "ifatm", "atm interface internals"); #ifndef ETHERTYPE_IPV6 #define ETHERTYPE_IPV6 0x86dd #endif #define senderr(e) do { error = (e); goto bad; } while (0) /* * atm_output: ATM output routine * inputs: * "ifp" = ATM interface to output to * "m0" = the packet to output * "dst" = the sockaddr to send to (either IP addr, or raw VPI/VCI) * "rt0" = the route to use * returns: error code [0 == ok] * * note: special semantic: if (dst == NULL) then we assume "m" already * has an atm_pseudohdr on it and just send it directly. * [for native mode ATM output] if dst is null, then * rt0 must also be NULL. */ int atm_output(struct ifnet *ifp, struct mbuf *m0, struct sockaddr *dst, struct rtentry *rt0) { u_int16_t etype = 0; /* if using LLC/SNAP */ int error = 0, sz; struct atm_pseudohdr atmdst, *ad; struct mbuf *m = m0; struct atmllc *atmllc; struct atmllc *llc_hdr = NULL; u_int32_t atm_flags; #ifdef MAC error = mac_ifnet_check_transmit(ifp, m); if (error) senderr(error); #endif if (!((ifp->if_flags & IFF_UP) && (ifp->if_drv_flags & IFF_DRV_RUNNING))) senderr(ENETDOWN); /* * check for non-native ATM traffic (dst != NULL) */ if (dst) { switch (dst->sa_family) { #if defined(INET) || defined(INET6) case AF_INET: case AF_INET6: { - struct rtentry *rt = NULL; - /* - * check route - */ - if (rt0 != NULL) { - error = rt_check(&rt, &rt0, dst); - if (error) - goto bad; - RT_UNLOCK(rt); - } - if (dst->sa_family == AF_INET6) etype = ETHERTYPE_IPV6; else etype = ETHERTYPE_IP; - if (!atmresolve(rt, m, dst, &atmdst)) { + if (!atmresolve(rt0, m, dst, &atmdst)) { m = NULL; /* XXX: atmresolve already free'd it */ senderr(EHOSTUNREACH); /* XXX: put ATMARP stuff here */ /* XXX: watch who frees m on failure */ } } break; #endif /* INET || INET6 */ case AF_UNSPEC: /* * XXX: bpfwrite. 
assuming dst contains 12 bytes * (atm pseudo header (4) + LLC/SNAP (8)) */ bcopy(dst->sa_data, &atmdst, sizeof(atmdst)); llc_hdr = (struct atmllc *)(dst->sa_data + sizeof(atmdst)); break; default: printf("%s: can't handle af%d\n", ifp->if_xname, dst->sa_family); senderr(EAFNOSUPPORT); } /* * must add atm_pseudohdr to data */ sz = sizeof(atmdst); atm_flags = ATM_PH_FLAGS(&atmdst); if (atm_flags & ATM_PH_LLCSNAP) sz += 8; /* sizeof snap == 8 */ M_PREPEND(m, sz, M_DONTWAIT); if (m == 0) senderr(ENOBUFS); ad = mtod(m, struct atm_pseudohdr *); *ad = atmdst; if (atm_flags & ATM_PH_LLCSNAP) { atmllc = (struct atmllc *)(ad + 1); if (llc_hdr == NULL) { bcopy(ATMLLC_HDR, atmllc->llchdr, sizeof(atmllc->llchdr)); /* note: in host order */ ATM_LLC_SETTYPE(atmllc, etype); } else bcopy(llc_hdr, atmllc, sizeof(struct atmllc)); } } if (ng_atm_output_p != NULL) { if ((error = (*ng_atm_output_p)(ifp, &m)) != 0) { if (m != NULL) m_freem(m); return (error); } if (m == NULL) return (0); } /* * Queue message on interface, and start output if interface * not yet active. */ if (!IF_HANDOFF_ADJ(&ifp->if_snd, m, ifp, -(int)sizeof(struct atm_pseudohdr))) return (ENOBUFS); return (error); bad: if (m) m_freem(m); return (error); } /* * Process a received ATM packet; * the packet is in the mbuf chain m. */ void atm_input(struct ifnet *ifp, struct atm_pseudohdr *ah, struct mbuf *m, void *rxhand) { int isr; u_int16_t etype = ETHERTYPE_IP; /* default */ if ((ifp->if_flags & IFF_UP) == 0) { m_freem(m); return; } #ifdef MAC mac_ifnet_create_mbuf(ifp, m); #endif ifp->if_ibytes += m->m_pkthdr.len; if (ng_atm_input_p != NULL) { (*ng_atm_input_p)(ifp, &m, ah, rxhand); if (m == NULL) return; } /* not eaten by ng_atm. Maybe it's a pseudo-harp PDU? */ if (atm_harp_input_p != NULL) { (*atm_harp_input_p)(ifp, &m, ah, rxhand); if (m == NULL) return; } if (rxhand) { #ifdef NATM struct natmpcb *npcb; /* * XXXRW: this use of 'rxhand' is not a very good idea, and * was subject to races even before SMPng due to the release * of spl here. */ NATM_LOCK(); npcb = rxhand; npcb->npcb_inq++; /* count # in queue */ isr = NETISR_NATM; m->m_pkthdr.rcvif = rxhand; /* XXX: overload */ NATM_UNLOCK(); #else printf("atm_input: NATM detected but not " "configured in kernel\n"); goto dropit; #endif } else { /* * handle LLC/SNAP header, if present */ if (ATM_PH_FLAGS(ah) & ATM_PH_LLCSNAP) { struct atmllc *alc; if (m->m_len < sizeof(*alc) && (m = m_pullup(m, sizeof(*alc))) == 0) return; /* failed */ alc = mtod(m, struct atmllc *); if (bcmp(alc, ATMLLC_HDR, 6)) { printf("%s: recv'd invalid LLC/SNAP frame " "[vp=%d,vc=%d]\n", ifp->if_xname, ATM_PH_VPI(ah), ATM_PH_VCI(ah)); m_freem(m); return; } etype = ATM_LLC_TYPE(alc); m_adj(m, sizeof(*alc)); } switch (etype) { #ifdef INET case ETHERTYPE_IP: isr = NETISR_IP; break; #endif #ifdef INET6 case ETHERTYPE_IPV6: isr = NETISR_IPV6; break; #endif default: #ifndef NATM dropit: #endif if (ng_atm_input_orphan_p != NULL) (*ng_atm_input_orphan_p)(ifp, m, ah, rxhand); else m_freem(m); return; } } netisr_dispatch(isr, m); } /* * Perform common duties while attaching to interface list. 
*/ void atm_ifattach(struct ifnet *ifp) { struct ifaddr *ifa; struct sockaddr_dl *sdl; struct ifatm *ifatm = ifp->if_l2com; ifp->if_addrlen = 0; ifp->if_hdrlen = 0; if_attach(ifp); ifp->if_mtu = ATMMTU; ifp->if_output = atm_output; #if 0 ifp->if_input = atm_input; #endif ifp->if_snd.ifq_maxlen = 50; /* dummy */ TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) if (ifa->ifa_addr->sa_family == AF_LINK) { sdl = (struct sockaddr_dl *)ifa->ifa_addr; sdl->sdl_type = IFT_ATM; sdl->sdl_alen = ifp->if_addrlen; #ifdef notyet /* if using ATMARP, store hardware address using the next line */ bcopy(ifp->hw_addr, LLADDR(sdl), ifp->if_addrlen); #endif break; } ifp->if_linkmib = &ifatm->mib; ifp->if_linkmiblen = sizeof(ifatm->mib); if(ng_atm_attach_p) (*ng_atm_attach_p)(ifp); if (atm_harp_attach_p) (*atm_harp_attach_p)(ifp); } /* * Common stuff for detaching an ATM interface */ void atm_ifdetach(struct ifnet *ifp) { if (atm_harp_detach_p) (*atm_harp_detach_p)(ifp); if(ng_atm_detach_p) (*ng_atm_detach_p)(ifp); if_detach(ifp); } /* * Support routine for the SIOCATMGVCCS ioctl(). * * This routine assumes, that the private VCC structures used by the driver * begin with a struct atmio_vcc. * * Return a table of VCCs in a freshly allocated memory area. * Here we have a problem: we first count, how many vccs we need * to return. The we allocate the memory and finally fill it in. * Because we cannot lock while calling malloc, the number of active * vccs may change while we're in malloc. So we allocate a couple of * vccs more and if space anyway is not enough re-iterate. * * We could use an sx lock for the vcc tables. */ struct atmio_vcctable * atm_getvccs(struct atmio_vcc **table, u_int size, u_int start, struct mtx *lock, int waitok) { u_int cid, alloc; size_t len; struct atmio_vcctable *vccs; struct atmio_vcc *v; alloc = start + 10; vccs = NULL; for (;;) { len = sizeof(*vccs) + alloc * sizeof(vccs->vccs[0]); vccs = reallocf(vccs, len, M_TEMP, waitok ? M_WAITOK : M_NOWAIT); if (vccs == NULL) return (NULL); bzero(vccs, len); vccs->count = 0; v = vccs->vccs; mtx_lock(lock); for (cid = 0; cid < size; cid++) if (table[cid] != NULL) { if (++vccs->count == alloc) /* too many - try again */ break; *v++ = *table[cid]; } mtx_unlock(lock); if (cid == size) break; alloc *= 2; } return (vccs); } /* * Driver or channel state has changed. Inform whoever is interested * in these events. */ void atm_event(struct ifnet *ifp, u_int event, void *arg) { if (ng_atm_event_p != NULL) (*ng_atm_event_p)(ifp, event, arg); if (atm_harp_event_p != NULL) (*atm_harp_event_p)(ifp, event, arg); } static void * atm_alloc(u_char type, struct ifnet *ifp) { struct ifatm *ifatm; ifatm = malloc(sizeof(struct ifatm), M_IFATM, M_WAITOK | M_ZERO); ifatm->ifp = ifp; return (ifatm); } static void atm_free(void *com, u_char type) { free(com, M_IFATM); } static int atm_modevent(module_t mod, int type, void *data) { switch (type) { case MOD_LOAD: if_register_com_alloc(IFT_ATM, atm_alloc, atm_free); break; case MOD_UNLOAD: if_deregister_com_alloc(IFT_ATM); break; default: return (EOPNOTSUPP); } return (0); } static moduledata_t atm_mod = { "atm", atm_modevent, 0 }; DECLARE_MODULE(atm, atm_mod, SI_SUB_INIT_IF, SI_ORDER_ANY); MODULE_VERSION(atm, 1); Index: head/sys/net/if_ethersubr.c =================================================================== --- head/sys/net/if_ethersubr.c (revision 186118) +++ head/sys/net/if_ethersubr.c (revision 186119) @@ -1,1295 +1,1310 @@ /*- * Copyright (c) 1982, 1989, 1993 * The Regents of the University of California. 
All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)if_ethersubr.c 8.1 (Berkeley) 6/10/93 * $FreeBSD$ */ #include "opt_atalk.h" #include "opt_inet.h" #include "opt_inet6.h" #include "opt_ipx.h" #include "opt_mac.h" #include "opt_netgraph.h" #include "opt_carp.h" #include "opt_mbuf_profiling.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #include #include #if defined(INET) || defined(INET6) #include #include #include #include #include #include #endif #ifdef INET6 #include #endif #ifdef DEV_CARP #include #endif #ifdef IPX #include #include #endif + int (*ef_inputp)(struct ifnet*, struct ether_header *eh, struct mbuf *m); int (*ef_outputp)(struct ifnet *ifp, struct mbuf **mp, struct sockaddr *dst, short *tp, int *hlen); #ifdef NETATALK #include #include #include #define llc_snap_org_code llc_un.type_snap.org_code #define llc_snap_ether_type llc_un.type_snap.ether_type extern u_char at_org_code[3]; extern u_char aarp_org_code[3]; #endif /* NETATALK */ #include #ifdef CTASSERT CTASSERT(sizeof (struct ether_header) == ETHER_ADDR_LEN * 2 + 2); CTASSERT(sizeof (struct ether_addr) == ETHER_ADDR_LEN); #endif /* netgraph node hooks for ng_ether(4) */ void (*ng_ether_input_p)(struct ifnet *ifp, struct mbuf **mp); void (*ng_ether_input_orphan_p)(struct ifnet *ifp, struct mbuf *m); int (*ng_ether_output_p)(struct ifnet *ifp, struct mbuf **mp); void (*ng_ether_attach_p)(struct ifnet *ifp); void (*ng_ether_detach_p)(struct ifnet *ifp); void (*vlan_input_p)(struct ifnet *, struct mbuf *); /* if_bridge(4) support */ struct mbuf *(*bridge_input_p)(struct ifnet *, struct mbuf *); int (*bridge_output_p)(struct ifnet *, struct mbuf *, struct sockaddr *, struct rtentry *); void (*bridge_dn_p)(struct mbuf *, struct ifnet *); /* if_lagg(4) support */ struct mbuf *(*lagg_input_p)(struct ifnet *, struct mbuf *); static const u_char etherbroadcastaddr[ETHER_ADDR_LEN] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; static int 
ether_resolvemulti(struct ifnet *, struct sockaddr **, struct sockaddr *); /* XXX: should be in an arp support file, not here */ MALLOC_DEFINE(M_ARPCOM, "arpcom", "802.* interface internals"); #define ETHER_IS_BROADCAST(addr) \ (bcmp(etherbroadcastaddr, (addr), ETHER_ADDR_LEN) == 0) #define senderr(e) do { error = (e); goto bad;} while (0) #if defined(INET) || defined(INET6) int ether_ipfw_chk(struct mbuf **m0, struct ifnet *dst, struct ip_fw **rule, int shared); #ifdef VIMAGE_GLOBALS static int ether_ipfw; #endif #endif + /* * Ethernet output routine. * Encapsulate a packet of type family for the local net. * Use trailer local net encapsulation if enough data in first * packet leaves a multiple of 512 bytes of data in remainder. */ int ether_output(struct ifnet *ifp, struct mbuf *m, struct sockaddr *dst, struct rtentry *rt0) { short type; int error, hdrcmplt = 0; u_char esrc[ETHER_ADDR_LEN], edst[ETHER_ADDR_LEN]; + struct llentry *lle = NULL; struct ether_header *eh; struct pf_mtag *t; int loop_copy = 1; int hlen; /* link layer header length */ #ifdef MAC error = mac_ifnet_check_transmit(ifp, m); if (error) senderr(error); #endif M_PROFILE(m); if (ifp->if_flags & IFF_MONITOR) senderr(ENETDOWN); if (!((ifp->if_flags & IFF_UP) && (ifp->if_drv_flags & IFF_DRV_RUNNING))) senderr(ENETDOWN); hlen = ETHER_HDR_LEN; switch (dst->sa_family) { #ifdef INET case AF_INET: - error = arpresolve(ifp, rt0, m, dst, edst); + error = arpresolve(ifp, rt0, m, dst, edst, &lle); if (error) return (error == EWOULDBLOCK ? 0 : error); type = htons(ETHERTYPE_IP); break; case AF_ARP: { struct arphdr *ah; ah = mtod(m, struct arphdr *); ah->ar_hrd = htons(ARPHRD_ETHER); loop_copy = 0; /* if this is for us, don't do it */ switch(ntohs(ah->ar_op)) { case ARPOP_REVREQUEST: case ARPOP_REVREPLY: type = htons(ETHERTYPE_REVARP); break; case ARPOP_REQUEST: case ARPOP_REPLY: default: type = htons(ETHERTYPE_ARP); break; } if (m->m_flags & M_BCAST) bcopy(ifp->if_broadcastaddr, edst, ETHER_ADDR_LEN); else bcopy(ar_tha(ah), edst, ETHER_ADDR_LEN); } break; #endif #ifdef INET6 case AF_INET6: - error = nd6_storelladdr(ifp, rt0, m, dst, (u_char *)edst); + error = nd6_storelladdr(ifp, rt0, m, dst, (u_char *)edst, &lle); if (error) return error; type = htons(ETHERTYPE_IPV6); break; #endif #ifdef IPX case AF_IPX: if (ef_outputp) { error = ef_outputp(ifp, &m, dst, &type, &hlen); if (error) goto bad; } else type = htons(ETHERTYPE_IPX); bcopy((caddr_t)&(((struct sockaddr_ipx *)dst)->sipx_addr.x_host), (caddr_t)edst, sizeof (edst)); break; #endif #ifdef NETATALK case AF_APPLETALK: { struct at_ifaddr *aa; if ((aa = at_ifawithnet((struct sockaddr_at *)dst)) == NULL) senderr(EHOSTUNREACH); /* XXX */ if (!aarpresolve(ifp, m, (struct sockaddr_at *)dst, edst)) return (0); /* * In the phase 2 case, need to prepend an mbuf for the llc header. 
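For reference while reading the AF_ARP case above, which stamps ar_hrd with ARPHRD_ETHER and picks ETHERTYPE_ARP or ETHERTYPE_REVARP from the opcode, this is the on-wire layout of an Ethernet/IPv4 ARP packet per RFC 826. The flat struct is purely illustrative; the kernel's struct arphdr is variable-length and accessed through the ar_tha()/ar_tpa() accessors seen in the code.

#include <stdint.h>

struct arp_ether_ipv4 {
	uint16_t ar_hrd;	/* hardware type: 1 = Ethernet */
	uint16_t ar_pro;	/* protocol type: 0x0800 = IPv4 */
	uint8_t	 ar_hln;	/* hardware address length: 6 */
	uint8_t	 ar_pln;	/* protocol address length: 4 */
	uint16_t ar_op;		/* 1 = request, 2 = reply */
	uint8_t	 ar_sha[6];	/* sender hardware address */
	uint8_t	 ar_spa[4];	/* sender protocol address */
	uint8_t	 ar_tha[6];	/* target hardware address */
	uint8_t	 ar_tpa[4];	/* target protocol address */
} __attribute__((packed));	/* all multi-byte fields big-endian on the wire */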
*/ if ( aa->aa_flags & AFA_PHASE2 ) { struct llc llc; M_PREPEND(m, LLC_SNAPFRAMELEN, M_DONTWAIT); if (m == NULL) senderr(ENOBUFS); llc.llc_dsap = llc.llc_ssap = LLC_SNAP_LSAP; llc.llc_control = LLC_UI; bcopy(at_org_code, llc.llc_snap_org_code, sizeof(at_org_code)); llc.llc_snap_ether_type = htons( ETHERTYPE_AT ); bcopy(&llc, mtod(m, caddr_t), LLC_SNAPFRAMELEN); type = htons(m->m_pkthdr.len); hlen = LLC_SNAPFRAMELEN + ETHER_HDR_LEN; } else { type = htons(ETHERTYPE_AT); } break; } #endif /* NETATALK */ case pseudo_AF_HDRCMPLT: hdrcmplt = 1; eh = (struct ether_header *)dst->sa_data; (void)memcpy(esrc, eh->ether_shost, sizeof (esrc)); /* FALLTHROUGH */ case AF_UNSPEC: loop_copy = 0; /* if this is for us, don't do it */ eh = (struct ether_header *)dst->sa_data; (void)memcpy(edst, eh->ether_dhost, sizeof (edst)); type = eh->ether_type; break; default: if_printf(ifp, "can't handle af%d\n", dst->sa_family); senderr(EAFNOSUPPORT); + } + + if (lle != NULL && (lle->la_flags & LLE_IFADDR)) { + int csum_flags = 0; + if (m->m_pkthdr.csum_flags & CSUM_IP) + csum_flags |= (CSUM_IP_CHECKED|CSUM_IP_VALID); + if (m->m_pkthdr.csum_flags & CSUM_DELAY_DATA) + csum_flags |= (CSUM_DATA_VALID|CSUM_PSEUDO_HDR); + m->m_pkthdr.csum_flags |= csum_flags; + m->m_pkthdr.csum_data = 0xffff; + return (if_simloop(ifp, m, dst->sa_family, 0)); } /* * Add local net header. If no space in first mbuf, * allocate another. */ M_PREPEND(m, ETHER_HDR_LEN, M_DONTWAIT); if (m == NULL) senderr(ENOBUFS); eh = mtod(m, struct ether_header *); (void)memcpy(&eh->ether_type, &type, sizeof(eh->ether_type)); (void)memcpy(eh->ether_dhost, edst, sizeof (edst)); if (hdrcmplt) (void)memcpy(eh->ether_shost, esrc, sizeof(eh->ether_shost)); else (void)memcpy(eh->ether_shost, IF_LLADDR(ifp), sizeof(eh->ether_shost)); /* * If a simplex interface, and the packet is being sent to our * Ethernet address or a broadcast address, loopback a copy. * XXX To make a simplex device behave exactly like a duplex * device, we should copy in the case of sending to our own * ethernet address (thus letting the original actually appear * on the wire). However, we don't do that here for security * reasons and compatibility with the original behavior. */ if ((ifp->if_flags & IFF_SIMPLEX) && loop_copy && ((t = pf_find_mtag(m)) == NULL || !t->routed)) { int csum_flags = 0; if (m->m_pkthdr.csum_flags & CSUM_IP) csum_flags |= (CSUM_IP_CHECKED|CSUM_IP_VALID); if (m->m_pkthdr.csum_flags & CSUM_DELAY_DATA) csum_flags |= (CSUM_DATA_VALID|CSUM_PSEUDO_HDR); if (m->m_flags & M_BCAST) { struct mbuf *n; /* * Because if_simloop() modifies the packet, we need a * writable copy through m_dup() instead of a readonly * one as m_copy[m] would give us. The alternative would * be to modify if_simloop() to handle the readonly mbuf, * but performancewise it is mostly equivalent (trading * extra data copying vs. extra locking). * * XXX This is a local workaround. A number of less * often used kernel parts suffer from the same bug. * See PR kern/105943 for a proposed general solution. */ if ((n = m_dup(m, M_DONTWAIT)) != NULL) { n->m_pkthdr.csum_flags |= csum_flags; if (csum_flags & CSUM_DATA_VALID) n->m_pkthdr.csum_data = 0xffff; (void)if_simloop(ifp, n, dst->sa_family, hlen); } else ifp->if_iqdrops++; } else if (bcmp(eh->ether_dhost, eh->ether_shost, ETHER_ADDR_LEN) == 0) { m->m_pkthdr.csum_flags |= csum_flags; if (csum_flags & CSUM_DATA_VALID) m->m_pkthdr.csum_data = 0xffff; (void) if_simloop(ifp, m, dst->sa_family, hlen); return (0); /* XXX */ } } /* * Bridges require special output handling. 
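The new LLE_IFADDR branch above is part of this commit's arp-v2 change: packets whose link-layer entry is the interface's own address are looped back through if_simloop(), with any checksums the sender asked to offload marked as already verified so the receive path does not recompute them. The helper below restates that flag translation in isolation; the numeric flag values are invented for the sketch (the real CSUM_* constants come from the mbuf packet-header definitions).

#include <stdint.h>

#define CSUM_IP		0x0001	/* illustrative values, not the kernel's */
#define CSUM_DELAY_DATA	0x0002
#define CSUM_IP_CHECKED	0x0100
#define CSUM_IP_VALID	0x0200
#define CSUM_DATA_VALID	0x0400
#define CSUM_PSEUDO_HDR	0x0800

/*
 * For a packet that never leaves the host, pretend the hardware already
 * verified whatever checksums the sender wanted offloaded.
 */
static uint32_t
loopback_csum_flags(uint32_t tx_flags)
{
	uint32_t rx_flags = 0;

	if (tx_flags & CSUM_IP)
		rx_flags |= CSUM_IP_CHECKED | CSUM_IP_VALID;
	if (tx_flags & CSUM_DELAY_DATA)
		rx_flags |= CSUM_DATA_VALID | CSUM_PSEUDO_HDR;
	return (rx_flags);
}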
*/ if (ifp->if_bridge) { BRIDGE_OUTPUT(ifp, m, error); return (error); } #ifdef DEV_CARP if (ifp->if_carp && (error = carp_output(ifp, m, dst, NULL))) goto bad; #endif /* Handle ng_ether(4) processing, if any */ if (IFP2AC(ifp)->ac_netgraph != NULL) { KASSERT(ng_ether_output_p != NULL, ("ng_ether_output_p is NULL")); if ((error = (*ng_ether_output_p)(ifp, &m)) != 0) { bad: if (m != NULL) m_freem(m); return (error); } if (m == NULL) return (0); } /* Continue with link-layer output */ return ether_output_frame(ifp, m); } /* * Ethernet link layer output routine to send a raw frame to the device. * * This assumes that the 14 byte Ethernet header is present and contiguous * in the first mbuf (if BRIDGE'ing). */ int ether_output_frame(struct ifnet *ifp, struct mbuf *m) { #if defined(INET) || defined(INET6) INIT_VNET_NET(ifp->if_vnet); struct ip_fw *rule = ip_dn_claim_rule(m); if (IPFW_LOADED && V_ether_ipfw != 0) { if (ether_ipfw_chk(&m, ifp, &rule, 0) == 0) { if (m) { m_freem(m); return EACCES; /* pkt dropped */ } else return 0; /* consumed e.g. in a pipe */ } } #endif /* * Queue message on interface, update output statistics if * successful, and start output if interface not yet active. */ return ((ifp->if_transmit)(ifp, m)); } #if defined(INET) || defined(INET6) /* * ipfw processing for ethernet packets (in and out). * The second parameter is NULL from ether_demux, and ifp from * ether_output_frame. */ int ether_ipfw_chk(struct mbuf **m0, struct ifnet *dst, struct ip_fw **rule, int shared) { INIT_VNET_INET(dst->if_vnet); struct ether_header *eh; struct ether_header save_eh; struct mbuf *m; int i; struct ip_fw_args args; if (*rule != NULL && V_fw_one_pass) return 1; /* dummynet packet, already partially processed */ /* * I need some amt of data to be contiguous, and in case others need * the packet (shared==1) also better be in the first mbuf. */ m = *m0; i = min( m->m_pkthdr.len, max_protohdr); if ( shared || m->m_len < i) { m = m_pullup(m, i); if (m == NULL) { *m0 = m; return 0; } } eh = mtod(m, struct ether_header *); save_eh = *eh; /* save copy for restore below */ m_adj(m, ETHER_HDR_LEN); /* strip ethernet header */ args.m = m; /* the packet we are looking at */ args.oif = dst; /* destination, if any */ args.rule = *rule; /* matching rule to restart */ args.next_hop = NULL; /* we do not support forward yet */ args.eh = &save_eh; /* MAC header for bridged/MAC packets */ args.inp = NULL; /* used by ipfw uid/gid/jail rules */ i = ip_fw_chk_ptr(&args); m = args.m; if (m != NULL) { /* * Restore Ethernet header, as needed, in case the * mbuf chain was replaced by ipfw. */ M_PREPEND(m, ETHER_HDR_LEN, M_DONTWAIT); if (m == NULL) { *m0 = m; return 0; } if (eh != mtod(m, struct ether_header *)) bcopy(&save_eh, mtod(m, struct ether_header *), ETHER_HDR_LEN); } *m0 = m; *rule = args.rule; if (i == IP_FW_DENY) /* drop */ return 0; KASSERT(m != NULL, ("ether_ipfw_chk: m is NULL")); if (i == IP_FW_PASS) /* a PASS rule. */ return 1; if (DUMMYNET_LOADED && (i == IP_FW_DUMMYNET)) { /* * Pass the pkt to dummynet, which consumes it. * If shared, make a copy and keep the original. */ if (shared) { m = m_copypacket(m, M_DONTWAIT); if (m == NULL) return 0; } else { /* * Pass the original to dummynet and * nothing back to the caller */ *m0 = NULL ; } ip_dn_io_ptr(&m, dst ? DN_TO_ETH_OUT: DN_TO_ETH_DEMUX, &args); return 0; } /* * XXX at some point add support for divert/forward actions. * If none of the above matches, we have to drop the pkt. 
*/ return 0; } #endif /* * Process a received Ethernet packet; the packet is in the * mbuf chain m with the ethernet header at the front. */ static void ether_input(struct ifnet *ifp, struct mbuf *m) { struct ether_header *eh; u_short etype; if ((ifp->if_flags & IFF_UP) == 0) { m_freem(m); return; } #ifdef DIAGNOSTIC if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0) { if_printf(ifp, "discard frame at !IFF_DRV_RUNNING\n"); m_freem(m); return; } #endif /* * Do consistency checks to verify assumptions * made by code past this point. */ if ((m->m_flags & M_PKTHDR) == 0) { if_printf(ifp, "discard frame w/o packet header\n"); ifp->if_ierrors++; m_freem(m); return; } if (m->m_len < ETHER_HDR_LEN) { /* XXX maybe should pullup? */ if_printf(ifp, "discard frame w/o leading ethernet " "header (len %u pkt len %u)\n", m->m_len, m->m_pkthdr.len); ifp->if_ierrors++; m_freem(m); return; } eh = mtod(m, struct ether_header *); etype = ntohs(eh->ether_type); if (m->m_pkthdr.rcvif == NULL) { if_printf(ifp, "discard frame w/o interface pointer\n"); ifp->if_ierrors++; m_freem(m); return; } #ifdef DIAGNOSTIC if (m->m_pkthdr.rcvif != ifp) { if_printf(ifp, "Warning, frame marked as received on %s\n", m->m_pkthdr.rcvif->if_xname); } #endif if (ETHER_IS_MULTICAST(eh->ether_dhost)) { if (ETHER_IS_BROADCAST(eh->ether_dhost)) m->m_flags |= M_BCAST; else m->m_flags |= M_MCAST; ifp->if_imcasts++; } #ifdef MAC /* * Tag the mbuf with an appropriate MAC label before any other * consumers can get to it. */ mac_ifnet_create_mbuf(ifp, m); #endif /* * Give bpf a chance at the packet. */ ETHER_BPF_MTAP(ifp, m); /* * If the CRC is still on the packet, trim it off. We do this once * and once only in case we are re-entered. Nothing else on the * Ethernet receive path expects to see the FCS. */ if (m->m_flags & M_HASFCS) { m_adj(m, -ETHER_CRC_LEN); m->m_flags &= ~M_HASFCS; } ifp->if_ibytes += m->m_pkthdr.len; /* Allow monitor mode to claim this frame, after stats are updated. */ if (ifp->if_flags & IFF_MONITOR) { m_freem(m); return; } /* Handle input from a lagg(4) port */ if (ifp->if_type == IFT_IEEE8023ADLAG) { KASSERT(lagg_input_p != NULL, ("%s: if_lagg not loaded!", __func__)); m = (*lagg_input_p)(ifp, m); if (m != NULL) ifp = m->m_pkthdr.rcvif; else return; } /* * If the hardware did not process an 802.1Q tag, do this now, * to allow 802.1P priority frames to be passed to the main input * path correctly. * TODO: Deal with Q-in-Q frames, but not arbitrary nesting levels. */ if ((m->m_flags & M_VLANTAG) == 0 && etype == ETHERTYPE_VLAN) { struct ether_vlan_header *evl; if (m->m_len < sizeof(*evl) && (m = m_pullup(m, sizeof(*evl))) == NULL) { #ifdef DIAGNOSTIC if_printf(ifp, "cannot pullup VLAN header\n"); #endif ifp->if_ierrors++; m_freem(m); return; } evl = mtod(m, struct ether_vlan_header *); m->m_pkthdr.ether_vtag = ntohs(evl->evl_tag); m->m_flags |= M_VLANTAG; bcopy((char *)evl, (char *)evl + ETHER_VLAN_ENCAP_LEN, ETHER_HDR_LEN - ETHER_TYPE_LEN); m_adj(m, ETHER_VLAN_ENCAP_LEN); } /* Allow ng_ether(4) to claim this frame. */ if (IFP2AC(ifp)->ac_netgraph != NULL) { KASSERT(ng_ether_input_p != NULL, ("%s: ng_ether_input_p is NULL", __func__)); m->m_flags &= ~M_PROMISC; (*ng_ether_input_p)(ifp, &m); if (m == NULL) return; } /* * Allow if_bridge(4) to claim this frame. * The BRIDGE_INPUT() macro will update ifp if the bridge changed it * and the frame should be delivered locally. 
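The block above does 802.1Q decapsulation in software when the hardware did not: it pulls up an ether_vlan_header, stores the tag in m_pkthdr.ether_vtag, sets M_VLANTAG, and strips the 4-byte tag. The flat-buffer parser below sketches the same field extraction; it is illustrative only, since the kernel version operates on an mbuf chain.

#include <stdint.h>
#include <stddef.h>
#include <string.h>
#include <arpa/inet.h>

struct vlan_info {
	uint16_t vid;		/* 12-bit VLAN ID */
	uint8_t	 pcp;		/* 3-bit priority code point */
	uint16_t inner_type;	/* encapsulated EtherType, host order */
};

static int
decap_dot1q(const uint8_t *frame, size_t len, struct vlan_info *vi)
{
	uint16_t tpid, tci;

	if (len < 18)			/* 14-byte Ethernet header + 4-byte tag */
		return (-1);
	memcpy(&tpid, frame + 12, sizeof(tpid));
	if (ntohs(tpid) != 0x8100)
		return (-1);		/* not tagged */
	memcpy(&tci, frame + 14, sizeof(tci));
	tci = ntohs(tci);
	vi->vid = tci & 0x0fff;
	vi->pcp = tci >> 13;
	memcpy(&vi->inner_type, frame + 16, sizeof(vi->inner_type));
	vi->inner_type = ntohs(vi->inner_type);
	return (0);
}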
*/ if (ifp->if_bridge != NULL) { m->m_flags &= ~M_PROMISC; BRIDGE_INPUT(ifp, m); if (m == NULL) return; } #ifdef DEV_CARP /* * Clear M_PROMISC on frame so that carp(4) will see it when the * mbuf flows up to Layer 3. * FreeBSD's implementation of carp(4) uses the inprotosw * to dispatch IPPROTO_CARP. carp(4) also allocates its own * Ethernet addresses of the form 00:00:5e:00:01:xx, which * is outside the scope of the M_PROMISC test below. * TODO: Maintain a hash table of ethernet addresses other than * ether_dhost which may be active on this ifp. */ if (ifp->if_carp && carp_forus(ifp->if_carp, eh->ether_dhost)) { m->m_flags &= ~M_PROMISC; } else #endif { /* * If the frame received was not for our MAC address, set the * M_PROMISC flag on the mbuf chain. The frame may need to * be seen by the rest of the Ethernet input path in case of * re-entry (e.g. bridge, vlan, netgraph) but should not be * seen by upper protocol layers. */ if (!ETHER_IS_MULTICAST(eh->ether_dhost) && bcmp(IF_LLADDR(ifp), eh->ether_dhost, ETHER_ADDR_LEN) != 0) m->m_flags |= M_PROMISC; } /* First chunk of an mbuf contains good entropy */ if (harvest.ethernet) random_harvest(m, 16, 3, 0, RANDOM_NET); ether_demux(ifp, m); } /* * Upper layer processing for a received Ethernet packet. */ void ether_demux(struct ifnet *ifp, struct mbuf *m) { struct ether_header *eh; int isr; u_short ether_type; #if defined(NETATALK) struct llc *l; #endif KASSERT(ifp != NULL, ("%s: NULL interface pointer", __func__)); #if defined(INET) || defined(INET6) INIT_VNET_NET(ifp->if_vnet); /* * Allow dummynet and/or ipfw to claim the frame. * Do not do this for PROMISC frames in case we are re-entered. */ if (IPFW_LOADED && V_ether_ipfw != 0 && !(m->m_flags & M_PROMISC)) { struct ip_fw *rule = ip_dn_claim_rule(m); if (ether_ipfw_chk(&m, NULL, &rule, 0) == 0) { if (m) m_freem(m); /* dropped; free mbuf chain */ return; /* consumed */ } } #endif eh = mtod(m, struct ether_header *); ether_type = ntohs(eh->ether_type); /* * If this frame has a VLAN tag other than 0, call vlan_input() * if its module is loaded. Otherwise, drop. */ if ((m->m_flags & M_VLANTAG) && EVL_VLANOFTAG(m->m_pkthdr.ether_vtag) != 0) { if (ifp->if_vlantrunk == NULL) { ifp->if_noproto++; m_freem(m); return; } KASSERT(vlan_input_p != NULL,("%s: VLAN not loaded!", __func__)); /* Clear before possibly re-entering ether_input(). */ m->m_flags &= ~M_PROMISC; (*vlan_input_p)(ifp, m); return; } /* * Pass promiscuously received frames to the upper layer if the user * requested this by setting IFF_PPROMISC. Otherwise, drop them. */ if ((ifp->if_flags & IFF_PPROMISC) == 0 && (m->m_flags & M_PROMISC)) { m_freem(m); return; } /* * Reset layer specific mbuf flags to avoid confusing upper layers. * Strip off Ethernet header. */ m->m_flags &= ~M_VLANTAG; m->m_flags &= ~(M_PROTOFLAGS); m_adj(m, ETHER_HDR_LEN); /* * Dispatch frame to upper layer. 
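The M_PROMISC handling above reduces to a single test: a frame is marked as promiscuously received when it is unicast and its destination address is not the interface's own. A stand-alone restatement of that predicate, assuming plain 6-byte addresses:

#include <stdint.h>
#include <stdbool.h>
#include <string.h>

static bool
frame_is_promisc(const uint8_t dhost[6], const uint8_t ifmac[6])
{
	if (dhost[0] & 0x01)			/* multicast or broadcast */
		return (false);
	return (memcmp(dhost, ifmac, 6) != 0);	/* unicast, but not ours */
}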
*/ switch (ether_type) { #ifdef INET case ETHERTYPE_IP: if ((m = ip_fastforward(m)) == NULL) return; isr = NETISR_IP; break; case ETHERTYPE_ARP: if (ifp->if_flags & IFF_NOARP) { /* Discard packet if ARP is disabled on interface */ m_freem(m); return; } isr = NETISR_ARP; break; #endif #ifdef IPX case ETHERTYPE_IPX: if (ef_inputp && ef_inputp(ifp, eh, m) == 0) return; isr = NETISR_IPX; break; #endif #ifdef INET6 case ETHERTYPE_IPV6: isr = NETISR_IPV6; break; #endif #ifdef NETATALK case ETHERTYPE_AT: isr = NETISR_ATALK1; break; case ETHERTYPE_AARP: isr = NETISR_AARP; break; #endif /* NETATALK */ default: #ifdef IPX if (ef_inputp && ef_inputp(ifp, eh, m) == 0) return; #endif /* IPX */ #if defined(NETATALK) if (ether_type > ETHERMTU) goto discard; l = mtod(m, struct llc *); if (l->llc_dsap == LLC_SNAP_LSAP && l->llc_ssap == LLC_SNAP_LSAP && l->llc_control == LLC_UI) { if (bcmp(&(l->llc_snap_org_code)[0], at_org_code, sizeof(at_org_code)) == 0 && ntohs(l->llc_snap_ether_type) == ETHERTYPE_AT) { m_adj(m, LLC_SNAPFRAMELEN); isr = NETISR_ATALK2; break; } if (bcmp(&(l->llc_snap_org_code)[0], aarp_org_code, sizeof(aarp_org_code)) == 0 && ntohs(l->llc_snap_ether_type) == ETHERTYPE_AARP) { m_adj(m, LLC_SNAPFRAMELEN); isr = NETISR_AARP; break; } } #endif /* NETATALK */ goto discard; } netisr_dispatch(isr, m); return; discard: /* * Packet is to be discarded. If netgraph is present, * hand the packet to it for last chance processing; * otherwise dispose of it. */ if (IFP2AC(ifp)->ac_netgraph != NULL) { KASSERT(ng_ether_input_orphan_p != NULL, ("ng_ether_input_orphan_p is NULL")); /* * Put back the ethernet header so netgraph has a * consistent view of inbound packets. */ M_PREPEND(m, ETHER_HDR_LEN, M_DONTWAIT); (*ng_ether_input_orphan_p)(ifp, m); return; } m_freem(m); } /* * Convert Ethernet address to printable (loggable) representation. * This routine is for compatibility; it's better to just use * * printf("%6D", , ":"); * * since there's no static buffer involved. */ char * ether_sprintf(const u_char *ap) { static char etherbuf[18]; snprintf(etherbuf, sizeof (etherbuf), "%6D", ap, ":"); return (etherbuf); } /* * Perform common duties while attaching to interface list */ void ether_ifattach(struct ifnet *ifp, const u_int8_t *lla) { int i; struct ifaddr *ifa; struct sockaddr_dl *sdl; ifp->if_addrlen = ETHER_ADDR_LEN; ifp->if_hdrlen = ETHER_HDR_LEN; if_attach(ifp); ifp->if_mtu = ETHERMTU; ifp->if_output = ether_output; ifp->if_input = ether_input; ifp->if_resolvemulti = ether_resolvemulti; if (ifp->if_baudrate == 0) ifp->if_baudrate = IF_Mbps(10); /* just a default */ ifp->if_broadcastaddr = etherbroadcastaddr; ifa = ifp->if_addr; KASSERT(ifa != NULL, ("%s: no lladdr!\n", __func__)); sdl = (struct sockaddr_dl *)ifa->ifa_addr; sdl->sdl_type = IFT_ETHER; sdl->sdl_alen = ifp->if_addrlen; bcopy(lla, LLADDR(sdl), ifp->if_addrlen); bpfattach(ifp, DLT_EN10MB, ETHER_HDR_LEN); if (ng_ether_attach_p != NULL) (*ng_ether_attach_p)(ifp); /* Announce Ethernet MAC address if non-zero. 
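ether_sprintf() above formats an address through the kernel-only %6D printf extension into a static buffer. For readers unfamiliar with %6D, a plain userland equivalent might look like this (names are illustrative):

#include <stdio.h>
#include <stdint.h>

static const char *
mac_ntoa(const uint8_t mac[6], char buf[18])
{
	snprintf(buf, 18, "%02x:%02x:%02x:%02x:%02x:%02x",
	    mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
	return (buf);
}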
*/ for (i = 0; i < ifp->if_addrlen; i++) if (lla[i] != 0) break; if (i != ifp->if_addrlen) if_printf(ifp, "Ethernet address: %6D\n", lla, ":"); } /* * Perform common duties while detaching an Ethernet interface */ void ether_ifdetach(struct ifnet *ifp) { if (IFP2AC(ifp)->ac_netgraph != NULL) { KASSERT(ng_ether_detach_p != NULL, ("ng_ether_detach_p is NULL")); (*ng_ether_detach_p)(ifp); } bpfdetach(ifp); if_detach(ifp); } SYSCTL_DECL(_net_link); SYSCTL_NODE(_net_link, IFT_ETHER, ether, CTLFLAG_RW, 0, "Ethernet"); #if defined(INET) || defined(INET6) SYSCTL_V_INT(V_NET, vnet_net, _net_link_ether, OID_AUTO, ipfw, CTLFLAG_RW, ether_ipfw, 0, "Pass ether pkts through firewall"); #endif #if 0 /* * This is for reference. We have a table-driven version * of the little-endian crc32 generator, which is faster * than the double-loop. */ uint32_t ether_crc32_le(const uint8_t *buf, size_t len) { size_t i; uint32_t crc; int bit; uint8_t data; crc = 0xffffffff; /* initial value */ for (i = 0; i < len; i++) { for (data = *buf++, bit = 0; bit < 8; bit++, data >>= 1) { carry = (crc ^ data) & 1; crc >>= 1; if (carry) crc = (crc ^ ETHER_CRC_POLY_LE); } } return (crc); } #else uint32_t ether_crc32_le(const uint8_t *buf, size_t len) { static const uint32_t crctab[] = { 0x00000000, 0x1db71064, 0x3b6e20c8, 0x26d930ac, 0x76dc4190, 0x6b6b51f4, 0x4db26158, 0x5005713c, 0xedb88320, 0xf00f9344, 0xd6d6a3e8, 0xcb61b38c, 0x9b64c2b0, 0x86d3d2d4, 0xa00ae278, 0xbdbdf21c }; size_t i; uint32_t crc; crc = 0xffffffff; /* initial value */ for (i = 0; i < len; i++) { crc ^= buf[i]; crc = (crc >> 4) ^ crctab[crc & 0xf]; crc = (crc >> 4) ^ crctab[crc & 0xf]; } return (crc); } #endif uint32_t ether_crc32_be(const uint8_t *buf, size_t len) { size_t i; uint32_t crc, carry; int bit; uint8_t data; crc = 0xffffffff; /* initial value */ for (i = 0; i < len; i++) { for (data = *buf++, bit = 0; bit < 8; bit++, data >>= 1) { carry = ((crc & 0x80000000) ? 1 : 0) ^ (data & 0x01); crc <<= 1; if (carry) crc = (crc ^ ETHER_CRC_POLY_BE) | carry; } } return (crc); } int ether_ioctl(struct ifnet *ifp, u_long command, caddr_t data) { struct ifaddr *ifa = (struct ifaddr *) data; struct ifreq *ifr = (struct ifreq *) data; int error = 0; switch (command) { case SIOCSIFADDR: ifp->if_flags |= IFF_UP; switch (ifa->ifa_addr->sa_family) { #ifdef INET case AF_INET: ifp->if_init(ifp->if_softc); /* before arpwhohas */ arp_ifinit(ifp, ifa); break; #endif #ifdef IPX /* * XXX - This code is probably wrong */ case AF_IPX: { struct ipx_addr *ina = &(IA_SIPX(ifa)->sipx_addr); if (ipx_nullhost(*ina)) ina->x_host = *(union ipx_host *) IF_LLADDR(ifp); else { bcopy((caddr_t) ina->x_host.c_host, (caddr_t) IF_LLADDR(ifp), ETHER_ADDR_LEN); } /* * Set new address */ ifp->if_init(ifp->if_softc); break; } #endif default: ifp->if_init(ifp->if_softc); break; } break; case SIOCGIFADDR: { struct sockaddr *sa; sa = (struct sockaddr *) & ifr->ifr_data; bcopy(IF_LLADDR(ifp), (caddr_t) sa->sa_data, ETHER_ADDR_LEN); } break; case SIOCSIFMTU: /* * Set the interface MTU. */ if (ifr->ifr_mtu > ETHERMTU) { error = EINVAL; } else { ifp->if_mtu = ifr->ifr_mtu; } break; default: error = EINVAL; /* XXX netbsd has ENOTTY??? */ break; } return (error); } static int ether_resolvemulti(struct ifnet *ifp, struct sockaddr **llsa, struct sockaddr *sa) { struct sockaddr_dl *sdl; #ifdef INET struct sockaddr_in *sin; #endif #ifdef INET6 struct sockaddr_in6 *sin6; #endif u_char *e_addr; switch(sa->sa_family) { case AF_LINK: /* * No mapping needed. Just check that it's a valid MC address. 
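The two ether_crc32_le() implementations above compute the same reflected CRC-32 over polynomial 0xedb88320: the #if 0 reference shifts one bit at a time, while the live version consumes each byte with two 4-bit table lookups. The self-contained program below re-derives both under those assumptions (declaring the carry variable the reference version leaves implicit) and prints the results side by side; it is an illustration, not kernel code.

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

#define CRC_POLY_LE 0xedb88320u

static uint32_t
crc32_le_bitwise(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xffffffff;

	for (size_t i = 0; i < len; i++) {
		uint8_t data = buf[i];
		for (int bit = 0; bit < 8; bit++, data >>= 1) {
			uint32_t carry = (crc ^ data) & 1;
			crc >>= 1;
			if (carry)
				crc ^= CRC_POLY_LE;
		}
	}
	return (crc);
}

static uint32_t
crc32_le_nibble(const uint8_t *buf, size_t len)
{
	/* Same 16-entry table as ether_crc32_le() above. */
	static const uint32_t crctab[16] = {
		0x00000000, 0x1db71064, 0x3b6e20c8, 0x26d930ac,
		0x76dc4190, 0x6b6b51f4, 0x4db26158, 0x5005713c,
		0xedb88320, 0xf00f9344, 0xd6d6a3e8, 0xcb61b38c,
		0x9b64c2b0, 0x86d3d2d4, 0xa00ae278, 0xbdbdf21c
	};
	uint32_t crc = 0xffffffff;

	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		crc = (crc >> 4) ^ crctab[crc & 0xf];
		crc = (crc >> 4) ^ crctab[crc & 0xf];
	}
	return (crc);
}

int
main(void)
{
	const uint8_t mac[6] = { 0x01, 0x00, 0x5e, 0x00, 0x00, 0x01 };

	printf("bitwise 0x%08x, table 0x%08x\n",
	    crc32_le_bitwise(mac, sizeof(mac)),
	    crc32_le_nibble(mac, sizeof(mac)));
	return (0);
}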
*/ sdl = (struct sockaddr_dl *)sa; e_addr = LLADDR(sdl); if (!ETHER_IS_MULTICAST(e_addr)) return EADDRNOTAVAIL; *llsa = 0; return 0; #ifdef INET case AF_INET: sin = (struct sockaddr_in *)sa; if (!IN_MULTICAST(ntohl(sin->sin_addr.s_addr))) return EADDRNOTAVAIL; sdl = malloc(sizeof *sdl, M_IFMADDR, M_NOWAIT|M_ZERO); if (sdl == NULL) return ENOMEM; sdl->sdl_len = sizeof *sdl; sdl->sdl_family = AF_LINK; sdl->sdl_index = ifp->if_index; sdl->sdl_type = IFT_ETHER; sdl->sdl_alen = ETHER_ADDR_LEN; e_addr = LLADDR(sdl); ETHER_MAP_IP_MULTICAST(&sin->sin_addr, e_addr); *llsa = (struct sockaddr *)sdl; return 0; #endif #ifdef INET6 case AF_INET6: sin6 = (struct sockaddr_in6 *)sa; if (IN6_IS_ADDR_UNSPECIFIED(&sin6->sin6_addr)) { /* * An IP6 address of 0 means listen to all * of the Ethernet multicast address used for IP6. * (This is used for multicast routers.) */ ifp->if_flags |= IFF_ALLMULTI; *llsa = 0; return 0; } if (!IN6_IS_ADDR_MULTICAST(&sin6->sin6_addr)) return EADDRNOTAVAIL; sdl = malloc(sizeof *sdl, M_IFMADDR, M_NOWAIT|M_ZERO); if (sdl == NULL) return (ENOMEM); sdl->sdl_len = sizeof *sdl; sdl->sdl_family = AF_LINK; sdl->sdl_index = ifp->if_index; sdl->sdl_type = IFT_ETHER; sdl->sdl_alen = ETHER_ADDR_LEN; e_addr = LLADDR(sdl); ETHER_MAP_IPV6_MULTICAST(&sin6->sin6_addr, e_addr); *llsa = (struct sockaddr *)sdl; return 0; #endif default: /* * Well, the text isn't quite right, but it's the name * that counts... */ return EAFNOSUPPORT; } } static void* ether_alloc(u_char type, struct ifnet *ifp) { struct arpcom *ac; ac = malloc(sizeof(struct arpcom), M_ARPCOM, M_WAITOK | M_ZERO); ac->ac_ifp = ifp; return (ac); } static void ether_free(void *com, u_char type) { free(com, M_ARPCOM); } static int ether_modevent(module_t mod, int type, void *data) { switch (type) { case MOD_LOAD: if_register_com_alloc(IFT_ETHER, ether_alloc, ether_free); break; case MOD_UNLOAD: if_deregister_com_alloc(IFT_ETHER); break; default: return EOPNOTSUPP; } return (0); } static moduledata_t ether_mod = { "ether", ether_modevent, 0 }; void ether_vlan_mtap(struct bpf_if *bp, struct mbuf *m, void *data, u_int dlen) { struct ether_vlan_header vlan; struct mbuf mv, mb; KASSERT((m->m_flags & M_VLANTAG) != 0, ("%s: vlan information not present", __func__)); KASSERT(m->m_len >= sizeof(struct ether_header), ("%s: mbuf not large enough for header", __func__)); bcopy(mtod(m, char *), &vlan, sizeof(struct ether_header)); vlan.evl_proto = vlan.evl_encap_proto; vlan.evl_encap_proto = htons(ETHERTYPE_VLAN); vlan.evl_tag = htons(m->m_pkthdr.ether_vtag); m->m_len -= sizeof(struct ether_header); m->m_data += sizeof(struct ether_header); /* * If a data link has been supplied by the caller, then we will need to * re-create a stack allocated mbuf chain with the following structure: * * (1) mbuf #1 will contain the supplied data link * (2) mbuf #2 will contain the vlan header * (3) mbuf #3 will contain the original mbuf's packet data * * Otherwise, submit the packet and vlan header via bpf_mtap2(). 
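ether_resolvemulti() above delegates the actual address mapping to ETHER_MAP_IP_MULTICAST() and ETHER_MAP_IPV6_MULTICAST(). The helpers below sketch the standard group mappings those macros implement (RFC 1112: 01:00:5e plus the low 23 bits of the IPv4 group; RFC 2464: 33:33 plus the low 32 bits of the IPv6 group); they are illustrative stand-ins, not the kernel macros themselves.

#include <stdint.h>
#include <netinet/in.h>
#include <arpa/inet.h>

static void
map_ipv4_multicast(struct in_addr grp, uint8_t mac[6])
{
	uint32_t a = ntohl(grp.s_addr);

	mac[0] = 0x01; mac[1] = 0x00; mac[2] = 0x5e;
	mac[3] = (a >> 16) & 0x7f;	/* only 23 group bits survive */
	mac[4] = (a >> 8) & 0xff;
	mac[5] = a & 0xff;
}

static void
map_ipv6_multicast(const struct in6_addr *grp, uint8_t mac[6])
{
	mac[0] = 0x33; mac[1] = 0x33;
	mac[2] = grp->s6_addr[12];	/* low 32 bits of the group address */
	mac[3] = grp->s6_addr[13];
	mac[4] = grp->s6_addr[14];
	mac[5] = grp->s6_addr[15];
}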
*/ if (data != NULL) { mv.m_next = m; mv.m_data = (caddr_t)&vlan; mv.m_len = sizeof(vlan); mb.m_next = &mv; mb.m_data = data; mb.m_len = dlen; bpf_mtap(bp, &mb); } else bpf_mtap2(bp, &vlan, sizeof(vlan), m); m->m_len += sizeof(struct ether_header); m->m_data -= sizeof(struct ether_header); } struct mbuf * ether_vlanencap(struct mbuf *m, uint16_t tag) { struct ether_vlan_header *evl; M_PREPEND(m, ETHER_VLAN_ENCAP_LEN, M_DONTWAIT); if (m == NULL) return (NULL); /* M_PREPEND takes care of m_len, m_pkthdr.len for us */ if (m->m_len < sizeof(*evl)) { m = m_pullup(m, sizeof(*evl)); if (m == NULL) return (NULL); } /* * Transform the Ethernet header into an Ethernet header * with 802.1Q encapsulation. */ evl = mtod(m, struct ether_vlan_header *); bcopy((char *)evl + ETHER_VLAN_ENCAP_LEN, (char *)evl, ETHER_HDR_LEN - ETHER_TYPE_LEN); evl->evl_encap_proto = htons(ETHERTYPE_VLAN); evl->evl_tag = htons(tag); return (m); } DECLARE_MODULE(ether, ether_mod, SI_SUB_INIT_IF, SI_ORDER_ANY); MODULE_VERSION(ether, 1); Index: head/sys/net/if_fddisubr.c =================================================================== --- head/sys/net/if_fddisubr.c (revision 186118) +++ head/sys/net/if_fddisubr.c (revision 186119) @@ -1,790 +1,792 @@ /*- * Copyright (c) 1995, 1996 * Matt Thomas . All rights reserved. * Copyright (c) 1982, 1989, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* * from: if_ethersubr.c,v 1.5 1994/12/13 22:31:45 wollman Exp * $FreeBSD$ */ #include "opt_atalk.h" #include "opt_inet.h" #include "opt_inet6.h" #include "opt_ipx.h" #include "opt_mac.h" #include #include #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #include #if defined(INET) || defined(INET6) #include #include #include #endif #ifdef INET6 #include #endif #ifdef IPX #include #include #endif #ifdef DECNET #include #endif #ifdef NETATALK #include #include #include extern u_char at_org_code[ 3 ]; extern u_char aarp_org_code[ 3 ]; #endif /* NETATALK */ #include static const u_char fddibroadcastaddr[FDDI_ADDR_LEN] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; static int fddi_resolvemulti(struct ifnet *, struct sockaddr **, struct sockaddr *); static int fddi_output(struct ifnet *, struct mbuf *, struct sockaddr *, struct rtentry *); static void fddi_input(struct ifnet *ifp, struct mbuf *m); #define senderr(e) do { error = (e); goto bad; } while (0) /* * FDDI output routine. * Encapsulate a packet of type family for the local net. * Use trailer local net encapsulation if enough data in first * packet leaves a multiple of 512 bytes of data in remainder. * Assumes that ifp is actually pointer to arpcom structure. */ static int fddi_output(ifp, m, dst, rt0) struct ifnet *ifp; struct mbuf *m; struct sockaddr *dst; struct rtentry *rt0; { u_int16_t type; int loop_copy = 0, error = 0, hdrcmplt = 0; u_char esrc[FDDI_ADDR_LEN], edst[FDDI_ADDR_LEN]; struct fddi_header *fh; + struct llentry *lle; #ifdef MAC error = mac_ifnet_check_transmit(ifp, m); if (error) senderr(error); #endif if (ifp->if_flags & IFF_MONITOR) senderr(ENETDOWN); if (!((ifp->if_flags & IFF_UP) && (ifp->if_drv_flags & IFF_DRV_RUNNING))) senderr(ENETDOWN); getmicrotime(&ifp->if_lastchange); switch (dst->sa_family) { #ifdef INET case AF_INET: { - error = arpresolve(ifp, rt0, m, dst, edst); + error = arpresolve(ifp, rt0, m, dst, edst, &lle); if (error) return (error == EWOULDBLOCK ? 0 : error); type = htons(ETHERTYPE_IP); break; } case AF_ARP: { struct arphdr *ah; ah = mtod(m, struct arphdr *); ah->ar_hrd = htons(ARPHRD_ETHER); loop_copy = -1; /* if this is for us, don't do it */ switch (ntohs(ah->ar_op)) { case ARPOP_REVREQUEST: case ARPOP_REVREPLY: type = htons(ETHERTYPE_REVARP); break; case ARPOP_REQUEST: case ARPOP_REPLY: default: type = htons(ETHERTYPE_ARP); break; } if (m->m_flags & M_BCAST) bcopy(ifp->if_broadcastaddr, edst, FDDI_ADDR_LEN); else bcopy(ar_tha(ah), edst, FDDI_ADDR_LEN); } break; #endif /* INET */ #ifdef INET6 case AF_INET6: - error = nd6_storelladdr(ifp, rt0, m, dst, (u_char *)edst); + error = nd6_storelladdr(ifp, rt0, m, dst, (u_char *)edst, &lle); if (error) return (error); /* Something bad happened */ type = htons(ETHERTYPE_IPV6); break; #endif /* INET6 */ #ifdef IPX case AF_IPX: type = htons(ETHERTYPE_IPX); bcopy((caddr_t)&(((struct sockaddr_ipx *)dst)->sipx_addr.x_host), (caddr_t)edst, FDDI_ADDR_LEN); break; #endif /* IPX */ #ifdef NETATALK case AF_APPLETALK: { struct at_ifaddr *aa; if (!aarpresolve(ifp, m, (struct sockaddr_at *)dst, edst)) return (0); /* * ifaddr is the first thing in at_ifaddr */ if ((aa = at_ifawithnet( (struct sockaddr_at *)dst)) == 0) goto bad; /* * In the phase 2 case, we need to prepend an mbuf for the llc header. * Since we must preserve the value of m, which is passed to us by * value, we m_copy() the first mbuf, and use it for our llc header. 
*/ if (aa->aa_flags & AFA_PHASE2) { struct llc llc; M_PREPEND(m, LLC_SNAPFRAMELEN, M_WAIT); llc.llc_dsap = llc.llc_ssap = LLC_SNAP_LSAP; llc.llc_control = LLC_UI; bcopy(at_org_code, llc.llc_snap.org_code, sizeof(at_org_code)); llc.llc_snap.ether_type = htons(ETHERTYPE_AT); bcopy(&llc, mtod(m, caddr_t), LLC_SNAPFRAMELEN); type = 0; } else { type = htons(ETHERTYPE_AT); } break; } #endif /* NETATALK */ case pseudo_AF_HDRCMPLT: { struct ether_header *eh; hdrcmplt = 1; eh = (struct ether_header *)dst->sa_data; bcopy((caddr_t)eh->ether_shost, (caddr_t)esrc, FDDI_ADDR_LEN); /* FALLTHROUGH */ } case AF_UNSPEC: { struct ether_header *eh; loop_copy = -1; eh = (struct ether_header *)dst->sa_data; bcopy((caddr_t)eh->ether_dhost, (caddr_t)edst, FDDI_ADDR_LEN); if (*edst & 1) m->m_flags |= (M_BCAST|M_MCAST); type = eh->ether_type; break; } case AF_IMPLINK: { fh = mtod(m, struct fddi_header *); error = EPROTONOSUPPORT; switch (fh->fddi_fc & (FDDIFC_C|FDDIFC_L|FDDIFC_F)) { case FDDIFC_LLC_ASYNC: { /* legal priorities are 0 through 7 */ if ((fh->fddi_fc & FDDIFC_Z) > 7) goto bad; break; } case FDDIFC_LLC_SYNC: { /* FDDIFC_Z bits reserved, must be zero */ if (fh->fddi_fc & FDDIFC_Z) goto bad; break; } case FDDIFC_SMT: { /* FDDIFC_Z bits must be non zero */ if ((fh->fddi_fc & FDDIFC_Z) == 0) goto bad; break; } default: { /* anything else is too dangerous */ goto bad; } } error = 0; if (fh->fddi_dhost[0] & 1) m->m_flags |= (M_BCAST|M_MCAST); goto queue_it; } default: if_printf(ifp, "can't handle af%d\n", dst->sa_family); senderr(EAFNOSUPPORT); } /* * Add LLC header. */ if (type != 0) { struct llc *l; M_PREPEND(m, LLC_SNAPFRAMELEN, M_DONTWAIT); if (m == 0) senderr(ENOBUFS); l = mtod(m, struct llc *); l->llc_control = LLC_UI; l->llc_dsap = l->llc_ssap = LLC_SNAP_LSAP; l->llc_snap.org_code[0] = l->llc_snap.org_code[1] = l->llc_snap.org_code[2] = 0; l->llc_snap.ether_type = htons(type); } /* * Add local net header. If no space in first mbuf, * allocate another. */ M_PREPEND(m, FDDI_HDR_LEN, M_DONTWAIT); if (m == 0) senderr(ENOBUFS); fh = mtod(m, struct fddi_header *); fh->fddi_fc = FDDIFC_LLC_ASYNC|FDDIFC_LLC_PRIO4; bcopy((caddr_t)edst, (caddr_t)fh->fddi_dhost, FDDI_ADDR_LEN); queue_it: if (hdrcmplt) bcopy((caddr_t)esrc, (caddr_t)fh->fddi_shost, FDDI_ADDR_LEN); else bcopy(IF_LLADDR(ifp), (caddr_t)fh->fddi_shost, FDDI_ADDR_LEN); /* * If a simplex interface, and the packet is being sent to our * Ethernet address or a broadcast address, loopback a copy. * XXX To make a simplex device behave exactly like a duplex * device, we should copy in the case of sending to our own * ethernet address (thus letting the original actually appear * on the wire). However, we don't do that here for security * reasons and compatibility with the original behavior. */ if ((ifp->if_flags & IFF_SIMPLEX) && (loop_copy != -1)) { if ((m->m_flags & M_BCAST) || (loop_copy > 0)) { struct mbuf *n; n = m_copy(m, 0, (int)M_COPYALL); (void) if_simloop(ifp, n, dst->sa_family, FDDI_HDR_LEN); } else if (bcmp(fh->fddi_dhost, fh->fddi_shost, FDDI_ADDR_LEN) == 0) { (void) if_simloop(ifp, m, dst->sa_family, FDDI_HDR_LEN); return (0); /* XXX */ } } error = (ifp->if_transmit)(ifp, m); if (error) ifp->if_oerrors++; return (error); bad: ifp->if_oerrors++; if (m) m_freem(m); return (error); } /* * Process a received FDDI packet. */ static void fddi_input(ifp, m) struct ifnet *ifp; struct mbuf *m; { int isr; struct llc *l; struct fddi_header *fh; /* * Do consistency checks to verify assumptions * made by code past this point. 
*/ if ((m->m_flags & M_PKTHDR) == 0) { if_printf(ifp, "discard frame w/o packet header\n"); ifp->if_ierrors++; m_freem(m); return; } if (m->m_pkthdr.rcvif == NULL) { if_printf(ifp, "discard frame w/o interface pointer\n"); ifp->if_ierrors++; m_freem(m); return; } m = m_pullup(m, FDDI_HDR_LEN); if (m == NULL) { ifp->if_ierrors++; goto dropanyway; } fh = mtod(m, struct fddi_header *); m->m_pkthdr.header = (void *)fh; /* * Discard packet if interface is not up. */ if (!((ifp->if_flags & IFF_UP) && (ifp->if_drv_flags & IFF_DRV_RUNNING))) goto dropanyway; /* * Give bpf a chance at the packet. */ BPF_MTAP(ifp, m); /* * Interface marked for monitoring; discard packet. */ if (ifp->if_flags & IFF_MONITOR) { m_freem(m); return; } #ifdef MAC mac_ifnet_create_mbuf(ifp, m); #endif /* * Update interface statistics. */ ifp->if_ibytes += m->m_pkthdr.len; getmicrotime(&ifp->if_lastchange); /* * Discard non local unicast packets when interface * is in promiscuous mode. */ if ((ifp->if_flags & IFF_PROMISC) && ((fh->fddi_dhost[0] & 1) == 0) && (bcmp(IF_LLADDR(ifp), (caddr_t)fh->fddi_dhost, FDDI_ADDR_LEN) != 0)) goto dropanyway; /* * Set mbuf flags for bcast/mcast. */ if (fh->fddi_dhost[0] & 1) { if (bcmp(ifp->if_broadcastaddr, fh->fddi_dhost, FDDI_ADDR_LEN) == 0) m->m_flags |= M_BCAST; else m->m_flags |= M_MCAST; ifp->if_imcasts++; } #ifdef M_LINK0 /* * If this has a LLC priority of 0, then mark it so upper * layers have a hint that it really came via a FDDI/Ethernet * bridge. */ if ((fh->fddi_fc & FDDIFC_LLC_PRIO7) == FDDIFC_LLC_PRIO0) m->m_flags |= M_LINK0; #endif /* Strip off FDDI header. */ m_adj(m, FDDI_HDR_LEN); m = m_pullup(m, LLC_SNAPFRAMELEN); if (m == 0) { ifp->if_ierrors++; goto dropanyway; } l = mtod(m, struct llc *); switch (l->llc_dsap) { case LLC_SNAP_LSAP: { u_int16_t type; if ((l->llc_control != LLC_UI) || (l->llc_ssap != LLC_SNAP_LSAP)) { ifp->if_noproto++; goto dropanyway; } #ifdef NETATALK if (bcmp(&(l->llc_snap.org_code)[0], at_org_code, sizeof(at_org_code)) == 0 && ntohs(l->llc_snap.ether_type) == ETHERTYPE_AT) { isr = NETISR_ATALK2; m_adj(m, LLC_SNAPFRAMELEN); break; } if (bcmp(&(l->llc_snap.org_code)[0], aarp_org_code, sizeof(aarp_org_code)) == 0 && ntohs(l->llc_snap.ether_type) == ETHERTYPE_AARP) { m_adj(m, LLC_SNAPFRAMELEN); isr = NETISR_AARP; break; } #endif /* NETATALK */ if (l->llc_snap.org_code[0] != 0 || l->llc_snap.org_code[1] != 0 || l->llc_snap.org_code[2] != 0) { ifp->if_noproto++; goto dropanyway; } type = ntohs(l->llc_snap.ether_type); m_adj(m, LLC_SNAPFRAMELEN); switch (type) { #ifdef INET case ETHERTYPE_IP: if ((m = ip_fastforward(m)) == NULL) return; isr = NETISR_IP; break; case ETHERTYPE_ARP: if (ifp->if_flags & IFF_NOARP) goto dropanyway; isr = NETISR_ARP; break; #endif #ifdef INET6 case ETHERTYPE_IPV6: isr = NETISR_IPV6; break; #endif #ifdef IPX case ETHERTYPE_IPX: isr = NETISR_IPX; break; #endif #ifdef DECNET case ETHERTYPE_DECNET: isr = NETISR_DECNET; break; #endif #ifdef NETATALK case ETHERTYPE_AT: isr = NETISR_ATALK1; break; case ETHERTYPE_AARP: isr = NETISR_AARP; break; #endif /* NETATALK */ default: /* printf("fddi_input: unknown protocol 0x%x\n", type); */ ifp->if_noproto++; goto dropanyway; } break; } default: /* printf("fddi_input: unknown dsap 0x%x\n", l->llc_dsap); */ ifp->if_noproto++; goto dropanyway; } netisr_dispatch(isr, m); return; dropanyway: ifp->if_iqdrops++; if (m) m_freem(m); return; } /* * Perform common duties while attaching to interface list */ void fddi_ifattach(ifp, lla, bpf) struct ifnet *ifp; const u_int8_t *lla; int bpf; { struct ifaddr 
*ifa; struct sockaddr_dl *sdl; ifp->if_type = IFT_FDDI; ifp->if_addrlen = FDDI_ADDR_LEN; ifp->if_hdrlen = 21; if_attach(ifp); /* Must be called before additional assignments */ ifp->if_mtu = FDDIMTU; ifp->if_output = fddi_output; ifp->if_input = fddi_input; ifp->if_resolvemulti = fddi_resolvemulti; ifp->if_broadcastaddr = fddibroadcastaddr; ifp->if_baudrate = 100000000; #ifdef IFF_NOTRAILERS ifp->if_flags |= IFF_NOTRAILERS; #endif ifa = ifp->if_addr; KASSERT(ifa != NULL, ("%s: no lladdr!\n", __func__)); sdl = (struct sockaddr_dl *)ifa->ifa_addr; sdl->sdl_type = IFT_FDDI; sdl->sdl_alen = ifp->if_addrlen; bcopy(lla, LLADDR(sdl), ifp->if_addrlen); if (bpf) bpfattach(ifp, DLT_FDDI, FDDI_HDR_LEN); return; } void fddi_ifdetach(ifp, bpf) struct ifnet *ifp; int bpf; { if (bpf) bpfdetach(ifp); if_detach(ifp); return; } int fddi_ioctl (ifp, command, data) struct ifnet *ifp; int command; caddr_t data; { struct ifaddr *ifa; struct ifreq *ifr; int error; ifa = (struct ifaddr *) data; ifr = (struct ifreq *) data; error = 0; switch (command) { case SIOCSIFADDR: ifp->if_flags |= IFF_UP; switch (ifa->ifa_addr->sa_family) { #ifdef INET case AF_INET: /* before arpwhohas */ ifp->if_init(ifp->if_softc); arp_ifinit(ifp, ifa); break; #endif #ifdef IPX /* * XXX - This code is probably wrong */ case AF_IPX: { struct ipx_addr *ina; ina = &(IA_SIPX(ifa)->sipx_addr); if (ipx_nullhost(*ina)) { ina->x_host = *(union ipx_host *) IF_LLADDR(ifp); } else { bcopy((caddr_t) ina->x_host.c_host, (caddr_t) IF_LLADDR(ifp), ETHER_ADDR_LEN); } /* * Set new address */ ifp->if_init(ifp->if_softc); } break; #endif default: ifp->if_init(ifp->if_softc); break; } break; case SIOCGIFADDR: { struct sockaddr *sa; sa = (struct sockaddr *) & ifr->ifr_data; bcopy(IF_LLADDR(ifp), (caddr_t) sa->sa_data, FDDI_ADDR_LEN); } break; case SIOCSIFMTU: /* * Set the interface MTU. */ if (ifr->ifr_mtu > FDDIMTU) { error = EINVAL; } else { ifp->if_mtu = ifr->ifr_mtu; } break; default: error = EINVAL; break; } return (error); } static int fddi_resolvemulti(ifp, llsa, sa) struct ifnet *ifp; struct sockaddr **llsa; struct sockaddr *sa; { struct sockaddr_dl *sdl; #ifdef INET struct sockaddr_in *sin; #endif #ifdef INET6 struct sockaddr_in6 *sin6; #endif u_char *e_addr; switch(sa->sa_family) { case AF_LINK: /* * No mapping needed. Just check that it's a valid MC address. */ sdl = (struct sockaddr_dl *)sa; e_addr = LLADDR(sdl); if ((e_addr[0] & 1) != 1) return (EADDRNOTAVAIL); *llsa = 0; return (0); #ifdef INET case AF_INET: sin = (struct sockaddr_in *)sa; if (!IN_MULTICAST(ntohl(sin->sin_addr.s_addr))) return (EADDRNOTAVAIL); sdl = malloc(sizeof *sdl, M_IFMADDR, M_NOWAIT | M_ZERO); if (sdl == NULL) return (ENOMEM); sdl->sdl_len = sizeof *sdl; sdl->sdl_family = AF_LINK; sdl->sdl_index = ifp->if_index; sdl->sdl_type = IFT_FDDI; sdl->sdl_nlen = 0; sdl->sdl_alen = FDDI_ADDR_LEN; sdl->sdl_slen = 0; e_addr = LLADDR(sdl); ETHER_MAP_IP_MULTICAST(&sin->sin_addr, e_addr); *llsa = (struct sockaddr *)sdl; return (0); #endif #ifdef INET6 case AF_INET6: sin6 = (struct sockaddr_in6 *)sa; if (IN6_IS_ADDR_UNSPECIFIED(&sin6->sin6_addr)) { /* * An IP6 address of 0 means listen to all * of the Ethernet multicast address used for IP6. * (This is used for multicast routers.) 
*/ ifp->if_flags |= IFF_ALLMULTI; *llsa = 0; return (0); } if (!IN6_IS_ADDR_MULTICAST(&sin6->sin6_addr)) return (EADDRNOTAVAIL); sdl = malloc(sizeof *sdl, M_IFMADDR, M_NOWAIT | M_ZERO); if (sdl == NULL) return (ENOMEM); sdl->sdl_len = sizeof *sdl; sdl->sdl_family = AF_LINK; sdl->sdl_index = ifp->if_index; sdl->sdl_type = IFT_FDDI; sdl->sdl_nlen = 0; sdl->sdl_alen = FDDI_ADDR_LEN; sdl->sdl_slen = 0; e_addr = LLADDR(sdl); ETHER_MAP_IPV6_MULTICAST(&sin6->sin6_addr, e_addr); *llsa = (struct sockaddr *)sdl; return (0); #endif default: /* * Well, the text isn't quite right, but it's the name * that counts... */ return (EAFNOSUPPORT); } return (0); } static moduledata_t fddi_mod = { "fddi", /* module name */ NULL, /* event handler */ 0 /* extra data */ }; DECLARE_MODULE(fddi, fddi_mod, SI_SUB_PSEUDO, SI_ORDER_ANY); MODULE_VERSION(fddi, 1); Index: head/sys/net/if_fwsubr.c =================================================================== --- head/sys/net/if_fwsubr.c (revision 186118) +++ head/sys/net/if_fwsubr.c (revision 186119) @@ -1,856 +1,850 @@ /*- * Copyright (c) 2004 Doug Rabson * Copyright (c) 1982, 1989, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* * $FreeBSD$ */ #include "opt_inet.h" #include "opt_inet6.h" #include "opt_mac.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #if defined(INET) || defined(INET6) #include #include #include #endif #ifdef INET6 #include #endif #include MALLOC_DEFINE(M_FWCOM, "fw_com", "firewire interface internals"); struct fw_hwaddr firewire_broadcastaddr = { 0xffffffff, 0xffffffff, 0xff, 0xff, 0xffff, 0xffffffff }; static int firewire_output(struct ifnet *ifp, struct mbuf *m, struct sockaddr *dst, struct rtentry *rt0) { struct fw_com *fc = IFP2FWC(ifp); int error, type; - struct rtentry *rt = NULL; struct m_tag *mtag; union fw_encap *enc; struct fw_hwaddr *destfw; uint8_t speed; uint16_t psize, fsize, dsize; struct mbuf *mtail; int unicast, dgl, foff; static int next_dgl; + struct llentry *lle; #ifdef MAC error = mac_ifnet_check_transmit(ifp, m); if (error) goto bad; #endif if (!((ifp->if_flags & IFF_UP) && (ifp->if_drv_flags & IFF_DRV_RUNNING))) { error = ENETDOWN; goto bad; } - if (rt0 != NULL) { - error = rt_check(&rt, &rt0, dst); - if (error) - goto bad; - RT_UNLOCK(rt); - } - /* * For unicast, we make a tag to store the lladdr of the * destination. This might not be the first time we have seen * the packet (for instance, the arp code might be trying to * re-send it after receiving an arp reply) so we only * allocate a tag if there isn't one there already. For * multicast, we will eventually use a different tag to store * the channel number. */ unicast = !(m->m_flags & (M_BCAST | M_MCAST)); if (unicast) { mtag = m_tag_locate(m, MTAG_FIREWIRE, MTAG_FIREWIRE_HWADDR, NULL); if (!mtag) { mtag = m_tag_alloc(MTAG_FIREWIRE, MTAG_FIREWIRE_HWADDR, sizeof (struct fw_hwaddr), M_NOWAIT); if (!mtag) { error = ENOMEM; goto bad; } m_tag_prepend(m, mtag); } destfw = (struct fw_hwaddr *)(mtag + 1); } else { destfw = 0; } switch (dst->sa_family) { #ifdef INET case AF_INET: /* * Only bother with arp for unicast. Allocation of * channels etc. for firewire is quite different and * doesn't fit into the arp model. */ if (unicast) { - error = arpresolve(ifp, rt, m, dst, (u_char *) destfw); + error = arpresolve(ifp, rt0, m, dst, (u_char *) destfw, &lle); if (error) return (error == EWOULDBLOCK ? 0 : error); } type = ETHERTYPE_IP; break; case AF_ARP: { struct arphdr *ah; ah = mtod(m, struct arphdr *); ah->ar_hrd = htons(ARPHRD_IEEE1394); type = ETHERTYPE_ARP; if (unicast) *destfw = *(struct fw_hwaddr *) ar_tha(ah); /* * The standard arp code leaves a hole for the target * hardware address which we need to close up. */ bcopy(ar_tpa(ah), ar_tha(ah), ah->ar_pln); m_adj(m, -ah->ar_hln); break; } #endif #ifdef INET6 case AF_INET6: if (unicast) { - error = nd6_storelladdr(fc->fc_ifp, rt, m, dst, - (u_char *) destfw); + error = nd6_storelladdr(fc->fc_ifp, rt0, m, dst, + (u_char *) destfw, &lle); if (error) return (error); } type = ETHERTYPE_IPV6; break; #endif default: if_printf(ifp, "can't handle af%d\n", dst->sa_family); error = EAFNOSUPPORT; goto bad; } /* * Let BPF tap off a copy before we encapsulate. */ if (bpf_peers_present(ifp->if_bpf)) { struct fw_bpfhdr h; if (unicast) bcopy(destfw, h.firewire_dhost, 8); else bcopy(&firewire_broadcastaddr, h.firewire_dhost, 8); bcopy(&fc->fc_hwaddr, h.firewire_shost, 8); h.firewire_type = htons(type); bpf_mtap2(ifp->if_bpf, &h, sizeof(h), m); } /* * Punt on MCAP for now and send all multicast packets on the * broadcast channel. 
*/ if (m->m_flags & M_MCAST) m->m_flags |= M_BCAST; /* * Figure out what speed to use and what the largest supported * packet size is. For unicast, this is the minimum of what we * can speak and what they can hear. For broadcast, lets be * conservative and use S100. We could possibly improve that * by examining the bus manager's speed map or similar. We * also reduce the packet size for broadcast to account for * the GASP header. */ if (unicast) { speed = min(fc->fc_speed, destfw->sspd); psize = min(512 << speed, 2 << destfw->sender_max_rec); } else { speed = 0; psize = 512 - 2*sizeof(uint32_t); } /* * Next, we encapsulate, possibly fragmenting the original * datagram if it won't fit into a single packet. */ if (m->m_pkthdr.len <= psize - sizeof(uint32_t)) { /* * No fragmentation is necessary. */ M_PREPEND(m, sizeof(uint32_t), M_DONTWAIT); if (!m) { error = ENOBUFS; goto bad; } enc = mtod(m, union fw_encap *); enc->unfrag.ether_type = type; enc->unfrag.lf = FW_ENCAP_UNFRAG; enc->unfrag.reserved = 0; /* * Byte swap the encapsulation header manually. */ enc->ul[0] = htonl(enc->ul[0]); error = (ifp->if_transmit)(ifp, m); return (error); } else { /* * Fragment the datagram, making sure to leave enough * space for the encapsulation header in each packet. */ fsize = psize - 2*sizeof(uint32_t); dgl = next_dgl++; dsize = m->m_pkthdr.len; foff = 0; while (m) { if (m->m_pkthdr.len > fsize) { /* * Split off the tail segment from the * datagram, copying our tags over. */ mtail = m_split(m, fsize, M_DONTWAIT); m_tag_copy_chain(mtail, m, M_NOWAIT); } else { mtail = 0; } /* * Add our encapsulation header to this * fragment and hand it off to the link. */ M_PREPEND(m, 2*sizeof(uint32_t), M_DONTWAIT); if (!m) { error = ENOBUFS; goto bad; } enc = mtod(m, union fw_encap *); if (foff == 0) { enc->firstfrag.lf = FW_ENCAP_FIRST; enc->firstfrag.reserved1 = 0; enc->firstfrag.reserved2 = 0; enc->firstfrag.datagram_size = dsize - 1; enc->firstfrag.ether_type = type; enc->firstfrag.dgl = dgl; } else { if (mtail) enc->nextfrag.lf = FW_ENCAP_NEXT; else enc->nextfrag.lf = FW_ENCAP_LAST; enc->nextfrag.reserved1 = 0; enc->nextfrag.reserved2 = 0; enc->nextfrag.reserved3 = 0; enc->nextfrag.datagram_size = dsize - 1; enc->nextfrag.fragment_offset = foff; enc->nextfrag.dgl = dgl; } foff += m->m_pkthdr.len - 2*sizeof(uint32_t); /* * Byte swap the encapsulation header manually. */ enc->ul[0] = htonl(enc->ul[0]); enc->ul[1] = htonl(enc->ul[1]); error = (ifp->if_transmit)(ifp, m); if (error) { if (mtail) m_freem(mtail); return (ENOBUFS); } m = mtail; } return (0); } bad: if (m) m_freem(m); return (error); } static struct mbuf * firewire_input_fragment(struct fw_com *fc, struct mbuf *m, int src) { union fw_encap *enc; struct fw_reass *r; struct mbuf *mf, *mprev; int dsize; int fstart, fend, start, end, islast; uint32_t id; /* * Find an existing reassembly buffer or create a new one. */ enc = mtod(m, union fw_encap *); id = enc->firstfrag.dgl | (src << 16); STAILQ_FOREACH(r, &fc->fc_frags, fr_link) if (r->fr_id == id) break; if (!r) { r = malloc(sizeof(struct fw_reass), M_TEMP, M_NOWAIT); if (!r) { m_freem(m); return 0; } r->fr_id = id; r->fr_frags = 0; STAILQ_INSERT_HEAD(&fc->fc_frags, r, fr_link); } /* * If this fragment overlaps any other fragment, we must discard * the partial reassembly and start again. 
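In the firewire_output() fragmentation path above, an unfragmented packet carries a 4-byte encapsulation header while every fragment carries an 8-byte one, so whether and how much to fragment is simple arithmetic on the chosen packet size. A small sketch of that arithmetic, with psize taken as a parameter rather than derived from the speed and max_rec fields as the driver does:

#include <stdint.h>
#include <stddef.h>

static unsigned
fw_fragment_count(size_t dlen, size_t psize)
{
	size_t fsize;

	if (psize <= 2 * sizeof(uint32_t))
		return (0);			/* caller picked a nonsensical psize */
	if (dlen <= psize - sizeof(uint32_t))
		return (1);			/* fits unfragmented (FW_ENCAP_UNFRAG) */
	fsize = psize - 2 * sizeof(uint32_t);	/* payload bytes per fragment */
	return ((dlen + fsize - 1) / fsize);
}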
*/ if (enc->firstfrag.lf == FW_ENCAP_FIRST) fstart = 0; else fstart = enc->nextfrag.fragment_offset; fend = fstart + m->m_pkthdr.len - 2*sizeof(uint32_t); dsize = enc->nextfrag.datagram_size; islast = (enc->nextfrag.lf == FW_ENCAP_LAST); for (mf = r->fr_frags; mf; mf = mf->m_nextpkt) { enc = mtod(mf, union fw_encap *); if (enc->nextfrag.datagram_size != dsize) { /* * This fragment must be from a different * packet. */ goto bad; } if (enc->firstfrag.lf == FW_ENCAP_FIRST) start = 0; else start = enc->nextfrag.fragment_offset; end = start + mf->m_pkthdr.len - 2*sizeof(uint32_t); if ((fstart < end && fend > start) || (islast && enc->nextfrag.lf == FW_ENCAP_LAST)) { /* * Overlap - discard reassembly buffer and start * again with this fragment. */ goto bad; } } /* * Find where to put this fragment in the list. */ for (mf = r->fr_frags, mprev = NULL; mf; mprev = mf, mf = mf->m_nextpkt) { enc = mtod(mf, union fw_encap *); if (enc->firstfrag.lf == FW_ENCAP_FIRST) start = 0; else start = enc->nextfrag.fragment_offset; if (start >= fend) break; } /* * If this is a last fragment and we are not adding at the end * of the list, discard the buffer. */ if (islast && mprev && mprev->m_nextpkt) goto bad; if (mprev) { m->m_nextpkt = mprev->m_nextpkt; mprev->m_nextpkt = m; /* * Coalesce forwards and see if we can make a whole * datagram. */ enc = mtod(mprev, union fw_encap *); if (enc->firstfrag.lf == FW_ENCAP_FIRST) start = 0; else start = enc->nextfrag.fragment_offset; end = start + mprev->m_pkthdr.len - 2*sizeof(uint32_t); while (end == fstart) { /* * Strip off the encap header from m and * append it to mprev, freeing m. */ m_adj(m, 2*sizeof(uint32_t)); mprev->m_nextpkt = m->m_nextpkt; mprev->m_pkthdr.len += m->m_pkthdr.len; m_cat(mprev, m); if (mprev->m_pkthdr.len == dsize + 1 + 2*sizeof(uint32_t)) { /* * We have assembled a complete packet * we must be finished. Make sure we have * merged the whole chain. */ STAILQ_REMOVE(&fc->fc_frags, r, fw_reass, fr_link); free(r, M_TEMP); m = mprev->m_nextpkt; while (m) { mf = m->m_nextpkt; m_freem(m); m = mf; } mprev->m_nextpkt = NULL; return (mprev); } /* * See if we can continue merging forwards. */ end = fend; m = mprev->m_nextpkt; if (m) { enc = mtod(m, union fw_encap *); if (enc->firstfrag.lf == FW_ENCAP_FIRST) fstart = 0; else fstart = enc->nextfrag.fragment_offset; fend = fstart + m->m_pkthdr.len - 2*sizeof(uint32_t); } else { break; } } } else { m->m_nextpkt = 0; r->fr_frags = m; } return (0); bad: while (r->fr_frags) { mf = r->fr_frags; r->fr_frags = mf->m_nextpkt; m_freem(mf); } m->m_nextpkt = 0; r->fr_frags = m; return (0); } void firewire_input(struct ifnet *ifp, struct mbuf *m, uint16_t src) { struct fw_com *fc = IFP2FWC(ifp); union fw_encap *enc; int type, isr; /* * The caller has already stripped off the packet header * (stream or wreqb) and marked the mbuf's M_BCAST flag * appropriately. We de-encapsulate the IP packet and pass it * up the line after handling link-level fragmentation. */ if (m->m_pkthdr.len < sizeof(uint32_t)) { if_printf(ifp, "discarding frame without " "encapsulation header (len %u pkt len %u)\n", m->m_len, m->m_pkthdr.len); } m = m_pullup(m, sizeof(uint32_t)); if (m == NULL) return; enc = mtod(m, union fw_encap *); /* * Byte swap the encapsulation header manually. 
*/ enc->ul[0] = ntohl(enc->ul[0]); if (enc->unfrag.lf != 0) { m = m_pullup(m, 2*sizeof(uint32_t)); if (!m) return; enc = mtod(m, union fw_encap *); enc->ul[1] = ntohl(enc->ul[1]); m = firewire_input_fragment(fc, m, src); if (!m) return; enc = mtod(m, union fw_encap *); type = enc->firstfrag.ether_type; m_adj(m, 2*sizeof(uint32_t)); } else { type = enc->unfrag.ether_type; m_adj(m, sizeof(uint32_t)); } if (m->m_pkthdr.rcvif == NULL) { if_printf(ifp, "discard frame w/o interface pointer\n"); ifp->if_ierrors++; m_freem(m); return; } #ifdef DIAGNOSTIC if (m->m_pkthdr.rcvif != ifp) { if_printf(ifp, "Warning, frame marked as received on %s\n", m->m_pkthdr.rcvif->if_xname); } #endif #ifdef MAC /* * Tag the mbuf with an appropriate MAC label before any other * consumers can get to it. */ mac_ifnet_create_mbuf(ifp, m); #endif /* * Give bpf a chance at the packet. The link-level driver * should have left us a tag with the EUID of the sender. */ if (bpf_peers_present(ifp->if_bpf)) { struct fw_bpfhdr h; struct m_tag *mtag; mtag = m_tag_locate(m, MTAG_FIREWIRE, MTAG_FIREWIRE_SENDER_EUID, 0); if (mtag) bcopy(mtag + 1, h.firewire_shost, 8); else bcopy(&firewire_broadcastaddr, h.firewire_dhost, 8); bcopy(&fc->fc_hwaddr, h.firewire_dhost, 8); h.firewire_type = htons(type); bpf_mtap2(ifp->if_bpf, &h, sizeof(h), m); } if (ifp->if_flags & IFF_MONITOR) { /* * Interface marked for monitoring; discard packet. */ m_freem(m); return; } ifp->if_ibytes += m->m_pkthdr.len; /* Discard packet if interface is not up */ if ((ifp->if_flags & IFF_UP) == 0) { m_freem(m); return; } if (m->m_flags & (M_BCAST|M_MCAST)) ifp->if_imcasts++; switch (type) { #ifdef INET case ETHERTYPE_IP: if ((m = ip_fastforward(m)) == NULL) return; isr = NETISR_IP; break; case ETHERTYPE_ARP: { struct arphdr *ah; ah = mtod(m, struct arphdr *); /* * Adjust the arp packet to insert an empty tha slot. */ m->m_len += ah->ar_hln; m->m_pkthdr.len += ah->ar_hln; bcopy(ar_tha(ah), ar_tpa(ah), ah->ar_pln); isr = NETISR_ARP; break; } #endif #ifdef INET6 case ETHERTYPE_IPV6: isr = NETISR_IPV6; break; #endif default: m_freem(m); return; } netisr_dispatch(isr, m); } int firewire_ioctl(struct ifnet *ifp, int command, caddr_t data) { struct ifaddr *ifa = (struct ifaddr *) data; struct ifreq *ifr = (struct ifreq *) data; int error = 0; switch (command) { case SIOCSIFADDR: ifp->if_flags |= IFF_UP; switch (ifa->ifa_addr->sa_family) { #ifdef INET case AF_INET: ifp->if_init(ifp->if_softc); /* before arpwhohas */ arp_ifinit(ifp, ifa); break; #endif default: ifp->if_init(ifp->if_softc); break; } break; case SIOCGIFADDR: { struct sockaddr *sa; sa = (struct sockaddr *) & ifr->ifr_data; bcopy(&IFP2FWC(ifp)->fc_hwaddr, (caddr_t) sa->sa_data, sizeof(struct fw_hwaddr)); } break; case SIOCSIFMTU: /* * Set the interface MTU. */ if (ifr->ifr_mtu > 1500) { error = EINVAL; } else { ifp->if_mtu = ifr->ifr_mtu; } break; default: error = EINVAL; /* XXX netbsd has ENOTTY??? */ break; } return (error); } static int firewire_resolvemulti(struct ifnet *ifp, struct sockaddr **llsa, struct sockaddr *sa) { #ifdef INET struct sockaddr_in *sin; #endif #ifdef INET6 struct sockaddr_in6 *sin6; #endif switch(sa->sa_family) { case AF_LINK: /* * No mapping needed. 
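 * (Illustrative aside, not part of this commit: firewire_input() above
 * strips a 4-byte header from unfragmented datagrams and an 8-byte header
 * from fragments before dispatching on the encapsulated EtherType.)
 */

#if 0	/* illustration only; never compiled */
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper restating the two header sizes keyed off the
 * encapsulation "lf" field (zero means unfragmented). */
static size_t
fw_encap_hdrlen(unsigned int lf_field)
{
	return (lf_field == 0 ? sizeof(uint32_t) : 2 * sizeof(uint32_t));
}
#endif

/*
 * Link-level addresses need no translation; report success without
 * producing a link-level sockaddr.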
*/ *llsa = 0; return 0; #ifdef INET case AF_INET: sin = (struct sockaddr_in *)sa; if (!IN_MULTICAST(ntohl(sin->sin_addr.s_addr))) return EADDRNOTAVAIL; *llsa = 0; return 0; #endif #ifdef INET6 case AF_INET6: sin6 = (struct sockaddr_in6 *)sa; if (IN6_IS_ADDR_UNSPECIFIED(&sin6->sin6_addr)) { /* * An IP6 address of 0 means listen to all * of the Ethernet multicast address used for IP6. * (This is used for multicast routers.) */ ifp->if_flags |= IFF_ALLMULTI; *llsa = 0; return 0; } if (!IN6_IS_ADDR_MULTICAST(&sin6->sin6_addr)) return EADDRNOTAVAIL; *llsa = 0; return 0; #endif default: /* * Well, the text isn't quite right, but it's the name * that counts... */ return EAFNOSUPPORT; } } void firewire_ifattach(struct ifnet *ifp, struct fw_hwaddr *llc) { struct fw_com *fc = IFP2FWC(ifp); struct ifaddr *ifa; struct sockaddr_dl *sdl; static const char* speeds[] = { "S100", "S200", "S400", "S800", "S1600", "S3200" }; fc->fc_speed = llc->sspd; STAILQ_INIT(&fc->fc_frags); ifp->if_addrlen = sizeof(struct fw_hwaddr); ifp->if_hdrlen = 0; if_attach(ifp); ifp->if_mtu = 1500; /* XXX */ ifp->if_output = firewire_output; ifp->if_resolvemulti = firewire_resolvemulti; ifp->if_broadcastaddr = (u_char *) &firewire_broadcastaddr; ifa = ifp->if_addr; KASSERT(ifa != NULL, ("%s: no lladdr!\n", __func__)); sdl = (struct sockaddr_dl *)ifa->ifa_addr; sdl->sdl_type = IFT_IEEE1394; sdl->sdl_alen = ifp->if_addrlen; bcopy(llc, LLADDR(sdl), ifp->if_addrlen); bpfattach(ifp, DLT_APPLE_IP_OVER_IEEE1394, sizeof(struct fw_hwaddr)); if_printf(ifp, "Firewire address: %8D @ 0x%04x%08x, %s, maxrec %d\n", (uint8_t *) &llc->sender_unique_ID_hi, ":", ntohs(llc->sender_unicast_FIFO_hi), ntohl(llc->sender_unicast_FIFO_lo), speeds[llc->sspd], (2 << llc->sender_max_rec)); } void firewire_ifdetach(struct ifnet *ifp) { bpfdetach(ifp); if_detach(ifp); } void firewire_busreset(struct ifnet *ifp) { struct fw_com *fc = IFP2FWC(ifp); struct fw_reass *r; struct mbuf *m; /* * Discard any partial datagrams since the host ids may have changed. */ while ((r = STAILQ_FIRST(&fc->fc_frags))) { STAILQ_REMOVE_HEAD(&fc->fc_frags, fr_link); while (r->fr_frags) { m = r->fr_frags; r->fr_frags = m->m_nextpkt; m_freem(m); } free(r, M_TEMP); } } static void * firewire_alloc(u_char type, struct ifnet *ifp) { struct fw_com *fc; fc = malloc(sizeof(struct fw_com), M_FWCOM, M_WAITOK | M_ZERO); fc->fc_ifp = ifp; return (fc); } static void firewire_free(void *com, u_char type) { free(com, M_FWCOM); } static int firewire_modevent(module_t mod, int type, void *data) { switch (type) { case MOD_LOAD: if_register_com_alloc(IFT_IEEE1394, firewire_alloc, firewire_free); break; case MOD_UNLOAD: if_deregister_com_alloc(IFT_IEEE1394); break; default: return (EOPNOTSUPP); } return (0); } static moduledata_t firewire_mod = { "if_firewire", firewire_modevent, 0 }; DECLARE_MODULE(if_firewire, firewire_mod, SI_SUB_INIT_IF, SI_ORDER_ANY); MODULE_VERSION(if_firewire, 1); Index: head/sys/net/if_iso88025subr.c =================================================================== --- head/sys/net/if_iso88025subr.c (revision 186118) +++ head/sys/net/if_iso88025subr.c (revision 186119) @@ -1,829 +1,824 @@ /*- * Copyright (c) 1998, Larry Lile * All rights reserved. * * For latest sources and information on this driver, please * go to http://anarchy.stdio.com. * * Questions, comments or suggestions should be directed to * Larry Lile . * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. 
Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ * */ /* * * General ISO 802.5 (Token Ring) support routines * */ #include "opt_inet.h" #include "opt_inet6.h" #include "opt_ipx.h" #include "opt_mac.h" #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #include #if defined(INET) || defined(INET6) #include #include #include #endif #ifdef INET6 #include #endif #ifdef IPX #include #include #endif #include static const u_char iso88025_broadcastaddr[ISO88025_ADDR_LEN] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; static int iso88025_resolvemulti (struct ifnet *, struct sockaddr **, struct sockaddr *); #define senderr(e) do { error = (e); goto bad; } while (0) /* * Perform common duties while attaching to interface list */ void iso88025_ifattach(struct ifnet *ifp, const u_int8_t *lla, int bpf) { struct ifaddr *ifa; struct sockaddr_dl *sdl; ifa = NULL; ifp->if_type = IFT_ISO88025; ifp->if_addrlen = ISO88025_ADDR_LEN; ifp->if_hdrlen = ISO88025_HDR_LEN; if_attach(ifp); /* Must be called before additional assignments */ ifp->if_output = iso88025_output; ifp->if_input = iso88025_input; ifp->if_resolvemulti = iso88025_resolvemulti; ifp->if_broadcastaddr = iso88025_broadcastaddr; if (ifp->if_baudrate == 0) ifp->if_baudrate = TR_16MBPS; /* 16Mbit should be a safe default */ if (ifp->if_mtu == 0) ifp->if_mtu = ISO88025_DEFAULT_MTU; ifa = ifp->if_addr; KASSERT(ifa != NULL, ("%s: no lladdr!\n", __func__)); sdl = (struct sockaddr_dl *)ifa->ifa_addr; sdl->sdl_type = IFT_ISO88025; sdl->sdl_alen = ifp->if_addrlen; bcopy(lla, LLADDR(sdl), ifp->if_addrlen); if (bpf) bpfattach(ifp, DLT_IEEE802, ISO88025_HDR_LEN); return; } /* * Perform common duties while detaching a Token Ring interface */ void iso88025_ifdetach(ifp, bpf) struct ifnet *ifp; int bpf; { if (bpf) bpfdetach(ifp); if_detach(ifp); return; } int iso88025_ioctl(struct ifnet *ifp, int command, caddr_t data) { struct ifaddr *ifa; struct ifreq *ifr; int error; ifa = (struct ifaddr *) data; ifr = (struct ifreq *) data; error = 0; switch (command) { case SIOCSIFADDR: ifp->if_flags |= IFF_UP; switch (ifa->ifa_addr->sa_family) { #ifdef INET case AF_INET: ifp->if_init(ifp->if_softc); /* before arpwhohas */ arp_ifinit(ifp, ifa); break; #endif /* INET */ #ifdef IPX /* * XXX - This code is probably wrong */ case AF_IPX: { struct ipx_addr *ina; ina = &(IA_SIPX(ifa)->sipx_addr); if (ipx_nullhost(*ina)) 
ina->x_host = *(union ipx_host *) IF_LLADDR(ifp); else bcopy((caddr_t) ina->x_host.c_host, (caddr_t) IF_LLADDR(ifp), ISO88025_ADDR_LEN); /* * Set new address */ ifp->if_init(ifp->if_softc); } break; #endif /* IPX */ default: ifp->if_init(ifp->if_softc); break; } break; case SIOCGIFADDR: { struct sockaddr *sa; sa = (struct sockaddr *) & ifr->ifr_data; bcopy(IF_LLADDR(ifp), (caddr_t) sa->sa_data, ISO88025_ADDR_LEN); } break; case SIOCSIFMTU: /* * Set the interface MTU. */ if (ifr->ifr_mtu > ISO88025_MAX_MTU) { error = EINVAL; } else { ifp->if_mtu = ifr->ifr_mtu; } break; default: error = EINVAL; /* XXX netbsd has ENOTTY??? */ break; } return (error); } /* * ISO88025 encapsulation */ int iso88025_output(ifp, m, dst, rt0) struct ifnet *ifp; struct mbuf *m; struct sockaddr *dst; struct rtentry *rt0; { u_int16_t snap_type = 0; int loop_copy = 0, error = 0, rif_len = 0; u_char edst[ISO88025_ADDR_LEN]; struct iso88025_header *th; struct iso88025_header gen_th; struct sockaddr_dl *sdl = NULL; - struct rtentry *rt = NULL; + struct llentry *lle; #ifdef MAC error = mac_ifnet_check_transmit(ifp, m); if (error) senderr(error); #endif if (ifp->if_flags & IFF_MONITOR) senderr(ENETDOWN); if (!((ifp->if_flags & IFF_UP) && (ifp->if_drv_flags & IFF_DRV_RUNNING))) senderr(ENETDOWN); getmicrotime(&ifp->if_lastchange); /* Calculate routing info length based on arp table entry */ /* XXX any better way to do this ? */ - if (rt0 != NULL) { - error = rt_check(&rt, &rt0, dst); - if (error) - goto bad; - RT_UNLOCK(rt); - } - if (rt && (sdl = (struct sockaddr_dl *)rt->rt_gateway)) + if (rt0 && (sdl = (struct sockaddr_dl *)rt0->rt_gateway)) if (SDL_ISO88025(sdl)->trld_rcf != 0) rif_len = TR_RCF_RIFLEN(SDL_ISO88025(sdl)->trld_rcf); /* Generate a generic 802.5 header for the packet */ gen_th.ac = TR_AC; gen_th.fc = TR_LLC_FRAME; (void)memcpy((caddr_t)gen_th.iso88025_shost, IF_LLADDR(ifp), ISO88025_ADDR_LEN); if (rif_len) { gen_th.iso88025_shost[0] |= TR_RII; if (rif_len > 2) { gen_th.rcf = SDL_ISO88025(sdl)->trld_rcf; (void)memcpy((caddr_t)gen_th.rd, (caddr_t)SDL_ISO88025(sdl)->trld_route, rif_len - 2); } } switch (dst->sa_family) { #ifdef INET case AF_INET: - error = arpresolve(ifp, rt0, m, dst, edst); + error = arpresolve(ifp, rt0, m, dst, edst, &lle); if (error) return (error == EWOULDBLOCK ? 
0 : error); snap_type = ETHERTYPE_IP; break; case AF_ARP: { struct arphdr *ah; ah = mtod(m, struct arphdr *); ah->ar_hrd = htons(ARPHRD_IEEE802); loop_copy = -1; /* if this is for us, don't do it */ switch(ntohs(ah->ar_op)) { case ARPOP_REVREQUEST: case ARPOP_REVREPLY: snap_type = ETHERTYPE_REVARP; break; case ARPOP_REQUEST: case ARPOP_REPLY: default: snap_type = ETHERTYPE_ARP; break; } if (m->m_flags & M_BCAST) bcopy(ifp->if_broadcastaddr, edst, ISO88025_ADDR_LEN); else bcopy(ar_tha(ah), edst, ISO88025_ADDR_LEN); } break; #endif /* INET */ #ifdef INET6 case AF_INET6: - error = nd6_storelladdr(ifp, rt0, m, dst, (u_char *)edst); + error = nd6_storelladdr(ifp, rt0, m, dst, (u_char *)edst, &lle); if (error) return (error); snap_type = ETHERTYPE_IPV6; break; #endif /* INET6 */ #ifdef IPX case AF_IPX: { u_int8_t *cp; bcopy((caddr_t)&(satoipx_addr(dst).x_host), (caddr_t)edst, ISO88025_ADDR_LEN); M_PREPEND(m, 3, M_WAIT); m = m_pullup(m, 3); if (m == 0) senderr(ENOBUFS); cp = mtod(m, u_int8_t *); *cp++ = ETHERTYPE_IPX_8022; *cp++ = ETHERTYPE_IPX_8022; *cp++ = LLC_UI; } break; #endif /* IPX */ case AF_UNSPEC: { struct iso88025_sockaddr_data *sd; /* * For AF_UNSPEC sockaddr.sa_data must contain all of the * mac information needed to send the packet. This allows * full mac, llc, and source routing function to be controlled. * llc and source routing information must already be in the * mbuf provided, ac/fc are set in sa_data. sockaddr.sa_data * should be an iso88025_sockaddr_data structure see iso88025.h */ loop_copy = -1; sd = (struct iso88025_sockaddr_data *)dst->sa_data; gen_th.ac = sd->ac; gen_th.fc = sd->fc; (void)memcpy((caddr_t)edst, (caddr_t)sd->ether_dhost, ISO88025_ADDR_LEN); (void)memcpy((caddr_t)gen_th.iso88025_shost, (caddr_t)sd->ether_shost, ISO88025_ADDR_LEN); rif_len = 0; break; } default: if_printf(ifp, "can't handle af%d\n", dst->sa_family); senderr(EAFNOSUPPORT); break; } /* * Add LLC header. */ if (snap_type != 0) { struct llc *l; M_PREPEND(m, LLC_SNAPFRAMELEN, M_DONTWAIT); if (m == 0) senderr(ENOBUFS); l = mtod(m, struct llc *); l->llc_control = LLC_UI; l->llc_dsap = l->llc_ssap = LLC_SNAP_LSAP; l->llc_snap.org_code[0] = l->llc_snap.org_code[1] = l->llc_snap.org_code[2] = 0; l->llc_snap.ether_type = htons(snap_type); } /* * Add local net header. If no space in first mbuf, * allocate another. */ M_PREPEND(m, ISO88025_HDR_LEN + rif_len, M_DONTWAIT); if (m == 0) senderr(ENOBUFS); th = mtod(m, struct iso88025_header *); bcopy((caddr_t)edst, (caddr_t)&gen_th.iso88025_dhost, ISO88025_ADDR_LEN); /* Copy as much of the generic header as is needed into the mbuf */ memcpy(th, &gen_th, ISO88025_HDR_LEN + rif_len); /* * If a simplex interface, and the packet is being sent to our * Ethernet address or a broadcast address, loopback a copy. * XXX To make a simplex device behave exactly like a duplex * device, we should copy in the case of sending to our own * ethernet address (thus letting the original actually appear * on the wire). However, we don't do that here for security * reasons and compatibility with the original behavior. 
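 * (Illustrative aside, not part of this commit: note how the AF_INET and
 * AF_INET6 cases above use the new arp-v2 resolver interfaces.)
 */

#if 0	/* illustration only; never compiled */
	/*
	 * Caller pattern for the new arpresolve()/nd6_storelladdr()
	 * signatures introduced by this change: both now hand back the
	 * struct llentry they used via an output parameter, and may
	 * return EWOULDBLOCK when the mbuf has been queued on that entry
	 * until resolution completes.  EWOULDBLOCK is therefore treated
	 * as "sent later", not as an error, and the caller does not touch
	 * the mbuf again in either case.
	 */
	error = arpresolve(ifp, rt0, m, dst, edst, &lle);
	if (error == EWOULDBLOCK)
		return (0);		/* queued awaiting resolution */
	if (error)
		return (error);		/* resolution failed */
#endif

/*
 * Simplex loopback handling, as described above: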
*/ if ((ifp->if_flags & IFF_SIMPLEX) && (loop_copy != -1)) { if ((m->m_flags & M_BCAST) || (loop_copy > 0)) { struct mbuf *n; n = m_copy(m, 0, (int)M_COPYALL); (void) if_simloop(ifp, n, dst->sa_family, ISO88025_HDR_LEN); } else if (bcmp(th->iso88025_dhost, th->iso88025_shost, ETHER_ADDR_LEN) == 0) { (void) if_simloop(ifp, m, dst->sa_family, ISO88025_HDR_LEN); return(0); /* XXX */ } } IFQ_HANDOFF_ADJ(ifp, m, ISO88025_HDR_LEN + LLC_SNAPFRAMELEN, error); if (error) { printf("iso88025_output: packet dropped QFULL.\n"); ifp->if_oerrors++; } return (error); bad: ifp->if_oerrors++; if (m) m_freem(m); return (error); } /* * ISO 88025 de-encapsulation */ void iso88025_input(ifp, m) struct ifnet *ifp; struct mbuf *m; { struct iso88025_header *th; struct llc *l; int isr; int mac_hdr_len; /* * Do consistency checks to verify assumptions * made by code past this point. */ if ((m->m_flags & M_PKTHDR) == 0) { if_printf(ifp, "discard frame w/o packet header\n"); ifp->if_ierrors++; m_freem(m); return; } if (m->m_pkthdr.rcvif == NULL) { if_printf(ifp, "discard frame w/o interface pointer\n"); ifp->if_ierrors++; m_freem(m); return; } m = m_pullup(m, ISO88025_HDR_LEN); if (m == NULL) { ifp->if_ierrors++; goto dropanyway; } th = mtod(m, struct iso88025_header *); m->m_pkthdr.header = (void *)th; /* * Discard packet if interface is not up. */ if (!((ifp->if_flags & IFF_UP) && (ifp->if_drv_flags & IFF_DRV_RUNNING))) goto dropanyway; /* * Give bpf a chance at the packet. */ BPF_MTAP(ifp, m); /* * Interface marked for monitoring; discard packet. */ if (ifp->if_flags & IFF_MONITOR) { m_freem(m); return; } #ifdef MAC mac_ifnet_create_mbuf(ifp, m); #endif /* * Update interface statistics. */ ifp->if_ibytes += m->m_pkthdr.len; getmicrotime(&ifp->if_lastchange); /* * Discard non local unicast packets when interface * is in promiscuous mode. */ if ((ifp->if_flags & IFF_PROMISC) && ((th->iso88025_dhost[0] & 1) == 0) && (bcmp(IF_LLADDR(ifp), (caddr_t) th->iso88025_dhost, ISO88025_ADDR_LEN) != 0)) goto dropanyway; /* * Set mbuf flags for bcast/mcast. */ if (th->iso88025_dhost[0] & 1) { if (bcmp(iso88025_broadcastaddr, th->iso88025_dhost, ISO88025_ADDR_LEN) == 0) m->m_flags |= M_BCAST; else m->m_flags |= M_MCAST; ifp->if_imcasts++; } mac_hdr_len = ISO88025_HDR_LEN; /* Check for source routing info */ if (th->iso88025_shost[0] & TR_RII) mac_hdr_len += TR_RCF_RIFLEN(th->rcf); /* Strip off ISO88025 header. */ m_adj(m, mac_hdr_len); m = m_pullup(m, LLC_SNAPFRAMELEN); if (m == 0) { ifp->if_ierrors++; goto dropanyway; } l = mtod(m, struct llc *); switch (l->llc_dsap) { #ifdef IPX case ETHERTYPE_IPX_8022: /* Thanks a bunch Novell */ if ((l->llc_control != LLC_UI) || (l->llc_ssap != ETHERTYPE_IPX_8022)) { ifp->if_noproto++; goto dropanyway; } th->iso88025_shost[0] &= ~(TR_RII); m_adj(m, 3); isr = NETISR_IPX; break; #endif /* IPX */ case LLC_SNAP_LSAP: { u_int16_t type; if ((l->llc_control != LLC_UI) || (l->llc_ssap != LLC_SNAP_LSAP)) { ifp->if_noproto++; goto dropanyway; } if (l->llc_snap.org_code[0] != 0 || l->llc_snap.org_code[1] != 0 || l->llc_snap.org_code[2] != 0) { ifp->if_noproto++; goto dropanyway; } type = ntohs(l->llc_snap.ether_type); m_adj(m, LLC_SNAPFRAMELEN); switch (type) { #ifdef INET case ETHERTYPE_IP: th->iso88025_shost[0] &= ~(TR_RII); if ((m = ip_fastforward(m)) == NULL) return; isr = NETISR_IP; break; case ETHERTYPE_ARP: if (ifp->if_flags & IFF_NOARP) goto dropanyway; isr = NETISR_ARP; break; #endif /* INET */ #ifdef IPX_SNAP /* XXX: Not supported! 
*/ case ETHERTYPE_IPX: th->iso88025_shost[0] &= ~(TR_RII); isr = NETISR_IPX; break; #endif /* IPX_SNAP */ #ifdef INET6 case ETHERTYPE_IPV6: th->iso88025_shost[0] &= ~(TR_RII); isr = NETISR_IPV6; break; #endif /* INET6 */ default: printf("iso88025_input: unexpected llc_snap ether_type 0x%02x\n", type); ifp->if_noproto++; goto dropanyway; } break; } #ifdef ISO case LLC_ISO_LSAP: switch (l->llc_control) { case LLC_UI: ifp->if_noproto++; goto dropanyway; break; case LLC_XID: case LLC_XID_P: if(m->m_len < ISO88025_ADDR_LEN) goto dropanyway; l->llc_window = 0; l->llc_fid = 9; l->llc_class = 1; l->llc_dsap = l->llc_ssap = 0; /* Fall through to */ case LLC_TEST: case LLC_TEST_P: { struct sockaddr sa; struct arpcom *ac; struct iso88025_sockaddr_data *th2; int i; u_char c; c = l->llc_dsap; if (th->iso88025_shost[0] & TR_RII) { /* XXX */ printf("iso88025_input: dropping source routed LLC_TEST\n"); goto dropanyway; } l->llc_dsap = l->llc_ssap; l->llc_ssap = c; if (m->m_flags & (M_BCAST | M_MCAST)) bcopy((caddr_t)IF_LLADDR(ifp), (caddr_t)th->iso88025_dhost, ISO88025_ADDR_LEN); sa.sa_family = AF_UNSPEC; sa.sa_len = sizeof(sa); th2 = (struct iso88025_sockaddr_data *)sa.sa_data; for (i = 0; i < ISO88025_ADDR_LEN; i++) { th2->ether_shost[i] = c = th->iso88025_dhost[i]; th2->ether_dhost[i] = th->iso88025_dhost[i] = th->iso88025_shost[i]; th->iso88025_shost[i] = c; } th2->ac = TR_AC; th2->fc = TR_LLC_FRAME; ifp->if_output(ifp, m, &sa, NULL); return; } default: printf("iso88025_input: unexpected llc control 0x%02x\n", l->llc_control); ifp->if_noproto++; goto dropanyway; break; } break; #endif /* ISO */ default: printf("iso88025_input: unknown dsap 0x%x\n", l->llc_dsap); ifp->if_noproto++; goto dropanyway; break; } netisr_dispatch(isr, m); return; dropanyway: ifp->if_iqdrops++; if (m) m_freem(m); return; } static int iso88025_resolvemulti (ifp, llsa, sa) struct ifnet *ifp; struct sockaddr **llsa; struct sockaddr *sa; { struct sockaddr_dl *sdl; #ifdef INET struct sockaddr_in *sin; #endif #ifdef INET6 struct sockaddr_in6 *sin6; #endif u_char *e_addr; switch(sa->sa_family) { case AF_LINK: /* * No mapping needed. Just check that it's a valid MC address. */ sdl = (struct sockaddr_dl *)sa; e_addr = LLADDR(sdl); if ((e_addr[0] & 1) != 1) { return (EADDRNOTAVAIL); } *llsa = 0; return (0); #ifdef INET case AF_INET: sin = (struct sockaddr_in *)sa; if (!IN_MULTICAST(ntohl(sin->sin_addr.s_addr))) { return (EADDRNOTAVAIL); } sdl = malloc(sizeof *sdl, M_IFMADDR, M_NOWAIT|M_ZERO); if (sdl == NULL) return (ENOMEM); sdl->sdl_len = sizeof *sdl; sdl->sdl_family = AF_LINK; sdl->sdl_index = ifp->if_index; sdl->sdl_type = IFT_ISO88025; sdl->sdl_alen = ISO88025_ADDR_LEN; e_addr = LLADDR(sdl); ETHER_MAP_IP_MULTICAST(&sin->sin_addr, e_addr); *llsa = (struct sockaddr *)sdl; return (0); #endif #ifdef INET6 case AF_INET6: sin6 = (struct sockaddr_in6 *)sa; if (IN6_IS_ADDR_UNSPECIFIED(&sin6->sin6_addr)) { /* * An IP6 address of 0 means listen to all * of the Ethernet multicast address used for IP6. * (This is used for multicast routers.) 
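 * (Illustrative aside, not part of this commit: the AF_INET case above maps
 * the group through ETHER_MAP_IP_MULTICAST(), i.e. onto the 01:00:5e MAC
 * prefix with the low 23 bits of the IPv4 group address copied in.)
 */

#if 0	/* illustration only; never compiled */
#include <stdint.h>

/* Hypothetical standalone restatement of that mapping; the group
 * address is taken in host byte order here. */
static void
map_ipv4_multicast(uint32_t group, uint8_t mac[6])
{
	mac[0] = 0x01;
	mac[1] = 0x00;
	mac[2] = 0x5e;
	mac[3] = (group >> 16) & 0x7f;	/* only the low 23 bits are used */
	mac[4] = (group >> 8) & 0xff;
	mac[5] = group & 0xff;
}
#endif

/*
 * For the unspecified group, just turn on all-multicast reception and
 * hand back no link-level address.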
*/ ifp->if_flags |= IFF_ALLMULTI; *llsa = 0; return (0); } if (!IN6_IS_ADDR_MULTICAST(&sin6->sin6_addr)) { return (EADDRNOTAVAIL); } sdl = malloc(sizeof *sdl, M_IFMADDR, M_NOWAIT|M_ZERO); if (sdl == NULL) return (ENOMEM); sdl->sdl_len = sizeof *sdl; sdl->sdl_family = AF_LINK; sdl->sdl_index = ifp->if_index; sdl->sdl_type = IFT_ISO88025; sdl->sdl_alen = ISO88025_ADDR_LEN; e_addr = LLADDR(sdl); ETHER_MAP_IPV6_MULTICAST(&sin6->sin6_addr, e_addr); *llsa = (struct sockaddr *)sdl; return (0); #endif default: /* * Well, the text isn't quite right, but it's the name * that counts... */ return (EAFNOSUPPORT); } return (0); } MALLOC_DEFINE(M_ISO88025, "arpcom", "802.5 interface internals"); static void* iso88025_alloc(u_char type, struct ifnet *ifp) { struct arpcom *ac; ac = malloc(sizeof(struct arpcom), M_ISO88025, M_WAITOK | M_ZERO); ac->ac_ifp = ifp; return (ac); } static void iso88025_free(void *com, u_char type) { free(com, M_ISO88025); } static int iso88025_modevent(module_t mod, int type, void *data) { switch (type) { case MOD_LOAD: if_register_com_alloc(IFT_ISO88025, iso88025_alloc, iso88025_free); break; case MOD_UNLOAD: if_deregister_com_alloc(IFT_ISO88025); break; default: return EOPNOTSUPP; } return (0); } static moduledata_t iso88025_mod = { "iso88025", iso88025_modevent, 0 }; DECLARE_MODULE(iso88025, iso88025_mod, SI_SUB_PSEUDO, SI_ORDER_ANY); MODULE_VERSION(iso88025, 1); Index: head/sys/net/if_var.h =================================================================== --- head/sys/net/if_var.h (revision 186118) +++ head/sys/net/if_var.h (revision 186119) @@ -1,727 +1,729 @@ /*- * Copyright (c) 1982, 1986, 1989, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * From: @(#)if.h 8.1 (Berkeley) 6/10/93 * $FreeBSD$ */ #ifndef _NET_IF_VAR_H_ #define _NET_IF_VAR_H_ /* * Structures defining a network interface, providing a packet * transport mechanism (ala level 0 of the PUP protocols). * * Each interface accepts output datagrams of a specified maximum * length, and provides higher level routines with input datagrams * received from its medium. 
* * Output occurs when the routine if_output is called, with three parameters: * (*ifp->if_output)(ifp, m, dst, rt) * Here m is the mbuf chain to be sent and dst is the destination address. * The output routine encapsulates the supplied datagram if necessary, * and then transmits it on its medium. * * On input, each interface unwraps the data received by it, and either * places it on the input queue of an internetwork datagram routine * and posts the associated software interrupt, or passes the datagram to a raw * packet input routine. * * Routines exist for locating interfaces by their addresses * or for locating an interface on a certain network, as well as more general * routing and gateway routines maintaining information used to locate * interfaces. These routines live in the files if.c and route.c */ #ifdef __STDC__ /* * Forward structure declarations for function prototypes [sic]. */ struct mbuf; struct thread; struct rtentry; struct rt_addrinfo; struct socket; struct ether_header; struct carp_if; struct ifvlantrunk; #endif #include /* get TAILQ macros */ #ifdef _KERNEL #include #include #endif /* _KERNEL */ #include /* XXX */ #include /* XXX */ #include /* XXX */ #include #define IF_DUNIT_NONE -1 #include TAILQ_HEAD(ifnethead, ifnet); /* we use TAILQs so that the order of */ TAILQ_HEAD(ifaddrhead, ifaddr); /* instantiation is preserved in the list */ TAILQ_HEAD(ifprefixhead, ifprefix); TAILQ_HEAD(ifmultihead, ifmultiaddr); TAILQ_HEAD(ifgrouphead, ifg_group); /* * Structure defining a queue for a network interface. */ struct ifqueue { struct mbuf *ifq_head; struct mbuf *ifq_tail; int ifq_len; int ifq_maxlen; int ifq_drops; struct mtx ifq_mtx; }; /* * Structure defining a network interface. * * (Would like to call this struct ``if'', but C isn't PL/1.) */ struct ifnet { void *if_softc; /* pointer to driver state */ void *if_l2com; /* pointer to protocol bits */ TAILQ_ENTRY(ifnet) if_link; /* all struct ifnets are chained */ char if_xname[IFNAMSIZ]; /* external name (name + unit) */ const char *if_dname; /* driver name */ int if_dunit; /* unit or IF_DUNIT_NONE */ struct ifaddrhead if_addrhead; /* linked list of addresses per if */ /* * if_addrhead is the list of all addresses associated to * an interface. * Some code in the kernel assumes that first element * of the list has type AF_LINK, and contains sockaddr_dl * addresses which store the link-level address and the name * of the interface. * However, access to the AF_LINK address through this * field is deprecated. Use if_addr or ifaddr_byindex() instead. */ struct knlist if_klist; /* events attached to this if */ int if_pcount; /* number of promiscuous listeners */ struct carp_if *if_carp; /* carp interface structure */ struct bpf_if *if_bpf; /* packet filter structure */ u_short if_index; /* numeric abbreviation for this if */ short if_timer; /* time 'til if_watchdog called */ struct ifvlantrunk *if_vlantrunk; /* pointer to 802.1q data */ int if_flags; /* up/down, broadcast, etc. 
*/ int if_capabilities; /* interface features & capabilities */ int if_capenable; /* enabled features & capabilities */ void *if_linkmib; /* link-type-specific MIB data */ size_t if_linkmiblen; /* length of above data */ struct if_data if_data; struct ifmultihead if_multiaddrs; /* multicast addresses configured */ int if_amcount; /* number of all-multicast requests */ /* procedure handles */ int (*if_output) /* output routine (enqueue) */ (struct ifnet *, struct mbuf *, struct sockaddr *, struct rtentry *); void (*if_input) /* input routine (from h/w driver) */ (struct ifnet *, struct mbuf *); void (*if_start) /* initiate output routine */ (struct ifnet *); int (*if_ioctl) /* ioctl routine */ (struct ifnet *, u_long, caddr_t); void (*if_watchdog) /* timer routine */ (struct ifnet *); void (*if_init) /* Init routine */ (void *); int (*if_resolvemulti) /* validate/resolve multicast */ (struct ifnet *, struct sockaddr **, struct sockaddr *); struct ifaddr *if_addr; /* pointer to link-level address */ void *if_llsoftc; /* link layer softc */ int if_drv_flags; /* driver-managed status flags */ u_int if_spare_flags2; /* spare flags 2 */ struct ifaltq if_snd; /* output queue (includes altq) */ const u_int8_t *if_broadcastaddr; /* linklevel broadcast bytestring */ void *if_bridge; /* bridge glue */ - struct lltable *lltables; /* list of L3-L2 resolution tables */ - struct label *if_label; /* interface MAC label */ /* these are only used by IPv6 */ struct ifprefixhead if_prefixhead; /* list of prefixes per if */ void *if_afdata[AF_MAX]; int if_afdata_initialized; struct mtx if_afdata_mtx; struct task if_starttask; /* task for IFF_NEEDSGIANT */ struct task if_linktask; /* task for link change events */ struct mtx if_addr_mtx; /* mutex to protect address lists */ + LIST_ENTRY(ifnet) if_clones; /* interfaces of a cloner */ TAILQ_HEAD(, ifg_list) if_groups; /* linked list of groups per if */ /* protected by if_addr_mtx */ void *if_pf_kif; void *if_lagg; /* lagg glue */ void *if_pspare[8]; /* multiq/TOE 3; vimage 3; general use 4 */ void (*if_qflush) /* flush any queues */ (struct ifnet *); int (*if_transmit) /* initiate output routine */ (struct ifnet *, struct mbuf *); int if_ispare[2]; /* general use 2 */ }; typedef void if_init_f_t(void *); /* * XXX These aliases are terribly dangerous because they could apply * to anything. */ #define if_mtu if_data.ifi_mtu #define if_type if_data.ifi_type #define if_physical if_data.ifi_physical #define if_addrlen if_data.ifi_addrlen #define if_hdrlen if_data.ifi_hdrlen #define if_metric if_data.ifi_metric #define if_link_state if_data.ifi_link_state #define if_baudrate if_data.ifi_baudrate #define if_hwassist if_data.ifi_hwassist #define if_ipackets if_data.ifi_ipackets #define if_ierrors if_data.ifi_ierrors #define if_opackets if_data.ifi_opackets #define if_oerrors if_data.ifi_oerrors #define if_collisions if_data.ifi_collisions #define if_ibytes if_data.ifi_ibytes #define if_obytes if_data.ifi_obytes #define if_imcasts if_data.ifi_imcasts #define if_omcasts if_data.ifi_omcasts #define if_iqdrops if_data.ifi_iqdrops #define if_noproto if_data.ifi_noproto #define if_lastchange if_data.ifi_lastchange #define if_rawoutput(if, m, sa) if_output(if, m, sa, (struct rtentry *)NULL) /* for compatibility with other BSDs */ #define if_addrlist if_addrhead #define if_list if_link #define if_name(ifp) ((ifp)->if_xname) /* * Locks for address lists on the network interface. 
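 * (Illustrative aside, not part of this commit: a sketch of how these
 * macros are typically used around the per-interface address lists.)
 */

#if 0	/* illustration only; never compiled */
/* Hypothetical walker: count configured multicast memberships while
 * holding the address-list mutex. */
static int
example_count_multiaddrs(struct ifnet *ifp)
{
	struct ifmultiaddr *ifma;
	int n = 0;

	IF_ADDR_LOCK(ifp);
	TAILQ_FOREACH(ifma, &ifp->if_multiaddrs, ifma_link)
		n++;
	IF_ADDR_UNLOCK(ifp);
	return (n);
}
#endif

/*
 * Lock macros for the per-interface address lists: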
*/ #define IF_ADDR_LOCK_INIT(if) mtx_init(&(if)->if_addr_mtx, \ "if_addr_mtx", NULL, MTX_DEF) #define IF_ADDR_LOCK_DESTROY(if) mtx_destroy(&(if)->if_addr_mtx) #define IF_ADDR_LOCK(if) mtx_lock(&(if)->if_addr_mtx) #define IF_ADDR_UNLOCK(if) mtx_unlock(&(if)->if_addr_mtx) #define IF_ADDR_LOCK_ASSERT(if) mtx_assert(&(if)->if_addr_mtx, MA_OWNED) /* * Output queues (ifp->if_snd) and slow device input queues (*ifp->if_slowq) * are queues of messages stored on ifqueue structures * (defined above). Entries are added to and deleted from these structures * by these macros, which should be called with ipl raised to splimp(). */ #define IF_LOCK(ifq) mtx_lock(&(ifq)->ifq_mtx) #define IF_UNLOCK(ifq) mtx_unlock(&(ifq)->ifq_mtx) #define IF_LOCK_ASSERT(ifq) mtx_assert(&(ifq)->ifq_mtx, MA_OWNED) #define _IF_QFULL(ifq) ((ifq)->ifq_len >= (ifq)->ifq_maxlen) #define _IF_DROP(ifq) ((ifq)->ifq_drops++) #define _IF_QLEN(ifq) ((ifq)->ifq_len) #define _IF_ENQUEUE(ifq, m) do { \ (m)->m_nextpkt = NULL; \ if ((ifq)->ifq_tail == NULL) \ (ifq)->ifq_head = m; \ else \ (ifq)->ifq_tail->m_nextpkt = m; \ (ifq)->ifq_tail = m; \ (ifq)->ifq_len++; \ } while (0) #define IF_ENQUEUE(ifq, m) do { \ IF_LOCK(ifq); \ _IF_ENQUEUE(ifq, m); \ IF_UNLOCK(ifq); \ } while (0) #define _IF_PREPEND(ifq, m) do { \ (m)->m_nextpkt = (ifq)->ifq_head; \ if ((ifq)->ifq_tail == NULL) \ (ifq)->ifq_tail = (m); \ (ifq)->ifq_head = (m); \ (ifq)->ifq_len++; \ } while (0) #define IF_PREPEND(ifq, m) do { \ IF_LOCK(ifq); \ _IF_PREPEND(ifq, m); \ IF_UNLOCK(ifq); \ } while (0) #define _IF_DEQUEUE(ifq, m) do { \ (m) = (ifq)->ifq_head; \ if (m) { \ if (((ifq)->ifq_head = (m)->m_nextpkt) == NULL) \ (ifq)->ifq_tail = NULL; \ (m)->m_nextpkt = NULL; \ (ifq)->ifq_len--; \ } \ } while (0) #define IF_DEQUEUE(ifq, m) do { \ IF_LOCK(ifq); \ _IF_DEQUEUE(ifq, m); \ IF_UNLOCK(ifq); \ } while (0) #define _IF_POLL(ifq, m) ((m) = (ifq)->ifq_head) #define IF_POLL(ifq, m) _IF_POLL(ifq, m) #define _IF_DRAIN(ifq) do { \ struct mbuf *m; \ for (;;) { \ _IF_DEQUEUE(ifq, m); \ if (m == NULL) \ break; \ m_freem(m); \ } \ } while (0) #define IF_DRAIN(ifq) do { \ IF_LOCK(ifq); \ _IF_DRAIN(ifq); \ IF_UNLOCK(ifq); \ } while(0) #ifdef _KERNEL /* interface address change event */ typedef void (*ifaddr_event_handler_t)(void *, struct ifnet *); EVENTHANDLER_DECLARE(ifaddr_event, ifaddr_event_handler_t); /* new interface arrival event */ typedef void (*ifnet_arrival_event_handler_t)(void *, struct ifnet *); EVENTHANDLER_DECLARE(ifnet_arrival_event, ifnet_arrival_event_handler_t); /* interface departure event */ typedef void (*ifnet_departure_event_handler_t)(void *, struct ifnet *); EVENTHANDLER_DECLARE(ifnet_departure_event, ifnet_departure_event_handler_t); /* * interface groups */ struct ifg_group { char ifg_group[IFNAMSIZ]; u_int ifg_refcnt; void *ifg_pf_kif; TAILQ_HEAD(, ifg_member) ifg_members; TAILQ_ENTRY(ifg_group) ifg_next; }; struct ifg_member { TAILQ_ENTRY(ifg_member) ifgm_next; struct ifnet *ifgm_ifp; }; struct ifg_list { struct ifg_group *ifgl_group; TAILQ_ENTRY(ifg_list) ifgl_next; }; /* group attach event */ typedef void (*group_attach_event_handler_t)(void *, struct ifg_group *); EVENTHANDLER_DECLARE(group_attach_event, group_attach_event_handler_t); /* group detach event */ typedef void (*group_detach_event_handler_t)(void *, struct ifg_group *); EVENTHANDLER_DECLARE(group_detach_event, group_detach_event_handler_t); /* group change event */ typedef void (*group_change_event_handler_t)(void *, const char *); EVENTHANDLER_DECLARE(group_change_event, 
group_change_event_handler_t); #define IF_AFDATA_LOCK_INIT(ifp) \ mtx_init(&(ifp)->if_afdata_mtx, "if_afdata", NULL, MTX_DEF) #define IF_AFDATA_LOCK(ifp) mtx_lock(&(ifp)->if_afdata_mtx) #define IF_AFDATA_TRYLOCK(ifp) mtx_trylock(&(ifp)->if_afdata_mtx) #define IF_AFDATA_UNLOCK(ifp) mtx_unlock(&(ifp)->if_afdata_mtx) #define IF_AFDATA_DESTROY(ifp) mtx_destroy(&(ifp)->if_afdata_mtx) + +#define IF_AFDATA_LOCK_ASSERT(ifp) mtx_assert(&(ifp)->if_afdata_mtx, MA_OWNED) +#define IF_AFDATA_UNLOCK_ASSERT(ifp) mtx_assert(&(ifp)->if_afdata_mtx, MA_NOTOWNED) #define IFF_LOCKGIANT(ifp) do { \ if ((ifp)->if_flags & IFF_NEEDSGIANT) \ mtx_lock(&Giant); \ } while (0) #define IFF_UNLOCKGIANT(ifp) do { \ if ((ifp)->if_flags & IFF_NEEDSGIANT) \ mtx_unlock(&Giant); \ } while (0) int if_handoff(struct ifqueue *ifq, struct mbuf *m, struct ifnet *ifp, int adjust); #define IF_HANDOFF(ifq, m, ifp) \ if_handoff((struct ifqueue *)ifq, m, ifp, 0) #define IF_HANDOFF_ADJ(ifq, m, ifp, adj) \ if_handoff((struct ifqueue *)ifq, m, ifp, adj) void if_start(struct ifnet *); #define IFQ_ENQUEUE(ifq, m, err) \ do { \ IF_LOCK(ifq); \ if (ALTQ_IS_ENABLED(ifq)) \ ALTQ_ENQUEUE(ifq, m, NULL, err); \ else { \ if (_IF_QFULL(ifq)) { \ m_freem(m); \ (err) = ENOBUFS; \ } else { \ _IF_ENQUEUE(ifq, m); \ (err) = 0; \ } \ } \ if (err) \ (ifq)->ifq_drops++; \ IF_UNLOCK(ifq); \ } while (0) #define IFQ_DEQUEUE_NOLOCK(ifq, m) \ do { \ if (TBR_IS_ENABLED(ifq)) \ (m) = tbr_dequeue_ptr(ifq, ALTDQ_REMOVE); \ else if (ALTQ_IS_ENABLED(ifq)) \ ALTQ_DEQUEUE(ifq, m); \ else \ _IF_DEQUEUE(ifq, m); \ } while (0) #define IFQ_DEQUEUE(ifq, m) \ do { \ IF_LOCK(ifq); \ IFQ_DEQUEUE_NOLOCK(ifq, m); \ IF_UNLOCK(ifq); \ } while (0) #define IFQ_POLL_NOLOCK(ifq, m) \ do { \ if (TBR_IS_ENABLED(ifq)) \ (m) = tbr_dequeue_ptr(ifq, ALTDQ_POLL); \ else if (ALTQ_IS_ENABLED(ifq)) \ ALTQ_POLL(ifq, m); \ else \ _IF_POLL(ifq, m); \ } while (0) #define IFQ_POLL(ifq, m) \ do { \ IF_LOCK(ifq); \ IFQ_POLL_NOLOCK(ifq, m); \ IF_UNLOCK(ifq); \ } while (0) #define IFQ_PURGE_NOLOCK(ifq) \ do { \ if (ALTQ_IS_ENABLED(ifq)) { \ ALTQ_PURGE(ifq); \ } else \ _IF_DRAIN(ifq); \ } while (0) #define IFQ_PURGE(ifq) \ do { \ IF_LOCK(ifq); \ IFQ_PURGE_NOLOCK(ifq); \ IF_UNLOCK(ifq); \ } while (0) #define IFQ_SET_READY(ifq) \ do { ((ifq)->altq_flags |= ALTQF_READY); } while (0) #define IFQ_LOCK(ifq) IF_LOCK(ifq) #define IFQ_UNLOCK(ifq) IF_UNLOCK(ifq) #define IFQ_LOCK_ASSERT(ifq) IF_LOCK_ASSERT(ifq) #define IFQ_IS_EMPTY(ifq) ((ifq)->ifq_len == 0) #define IFQ_INC_LEN(ifq) ((ifq)->ifq_len++) #define IFQ_DEC_LEN(ifq) (--(ifq)->ifq_len) #define IFQ_INC_DROPS(ifq) ((ifq)->ifq_drops++) #define IFQ_SET_MAXLEN(ifq, len) ((ifq)->ifq_maxlen = (len)) /* * The IFF_DRV_OACTIVE test should really occur in the device driver, not in * the handoff logic, as that flag is locked by the device driver. 
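 * (Illustrative aside, not part of this commit: a sketch of the consumer
 * side of these queues, i.e. a driver if_start routine draining if_snd
 * with the IFQ_* macros defined above.)
 */

#if 0	/* illustration only; never compiled */
/* "xx_encap" is a hypothetical driver routine that consumes the mbuf
 * on success and leaves it untouched on failure. */
static void
xx_start(struct ifnet *ifp)
{
	struct mbuf *m;

	for (;;) {
		IFQ_DEQUEUE(&ifp->if_snd, m);
		if (m == NULL)
			break;
		if (xx_encap(ifp->if_softc, m) != 0) {
			m_freem(m);
			ifp->if_oerrors++;
			break;
		}
	}
}
#endif

/*
 * Producer side: enqueue a packet, update the statistics and kick
 * if_start() unless the driver has marked itself busy.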
*/ #define IFQ_HANDOFF_ADJ(ifp, m, adj, err) \ do { \ int len; \ short mflags; \ \ len = (m)->m_pkthdr.len; \ mflags = (m)->m_flags; \ IFQ_ENQUEUE(&(ifp)->if_snd, m, err); \ if ((err) == 0) { \ (ifp)->if_obytes += len + (adj); \ if (mflags & M_MCAST) \ (ifp)->if_omcasts++; \ if (((ifp)->if_drv_flags & IFF_DRV_OACTIVE) == 0) \ if_start(ifp); \ } \ } while (0) #define IFQ_HANDOFF(ifp, m, err) \ IFQ_HANDOFF_ADJ(ifp, m, 0, err) #define IFQ_DRV_DEQUEUE(ifq, m) \ do { \ (m) = (ifq)->ifq_drv_head; \ if (m) { \ if (((ifq)->ifq_drv_head = (m)->m_nextpkt) == NULL) \ (ifq)->ifq_drv_tail = NULL; \ (m)->m_nextpkt = NULL; \ (ifq)->ifq_drv_len--; \ } else { \ IFQ_LOCK(ifq); \ IFQ_DEQUEUE_NOLOCK(ifq, m); \ while ((ifq)->ifq_drv_len < (ifq)->ifq_drv_maxlen) { \ struct mbuf *m0; \ IFQ_DEQUEUE_NOLOCK(ifq, m0); \ if (m0 == NULL) \ break; \ m0->m_nextpkt = NULL; \ if ((ifq)->ifq_drv_tail == NULL) \ (ifq)->ifq_drv_head = m0; \ else \ (ifq)->ifq_drv_tail->m_nextpkt = m0; \ (ifq)->ifq_drv_tail = m0; \ (ifq)->ifq_drv_len++; \ } \ IFQ_UNLOCK(ifq); \ } \ } while (0) #define IFQ_DRV_PREPEND(ifq, m) \ do { \ (m)->m_nextpkt = (ifq)->ifq_drv_head; \ if ((ifq)->ifq_drv_tail == NULL) \ (ifq)->ifq_drv_tail = (m); \ (ifq)->ifq_drv_head = (m); \ (ifq)->ifq_drv_len++; \ } while (0) #define IFQ_DRV_IS_EMPTY(ifq) \ (((ifq)->ifq_drv_len == 0) && ((ifq)->ifq_len == 0)) #define IFQ_DRV_PURGE(ifq) \ do { \ struct mbuf *m, *n = (ifq)->ifq_drv_head; \ while((m = n) != NULL) { \ n = m->m_nextpkt; \ m_freem(m); \ } \ (ifq)->ifq_drv_head = (ifq)->ifq_drv_tail = NULL; \ (ifq)->ifq_drv_len = 0; \ IFQ_PURGE(ifq); \ } while (0) /* * 72 was chosen below because it is the size of a TCP/IP * header (40) + the minimum mss (32). */ #define IF_MINMTU 72 #define IF_MAXMTU 65535 #endif /* _KERNEL */ /* * The ifaddr structure contains information about one address * of an interface. They are maintained by the different address families, * are allocated and attached when an address is set, and are linked * together so all addresses for an interface can be located. * * NOTE: a 'struct ifaddr' is always at the beginning of a larger * chunk of malloc'ed memory, where we store the three addresses * (ifa_addr, ifa_dstaddr and ifa_netmask) referenced here. */ struct ifaddr { struct sockaddr *ifa_addr; /* address of interface */ struct sockaddr *ifa_dstaddr; /* other end of p-to-p link */ #define ifa_broadaddr ifa_dstaddr /* broadcast address interface */ struct sockaddr *ifa_netmask; /* used to determine subnet */ struct if_data if_data; /* not all members are meaningful */ struct ifnet *ifa_ifp; /* back-pointer to interface */ TAILQ_ENTRY(ifaddr) ifa_link; /* queue macro glue */ void (*ifa_rtrequest) /* check or clean routes (+ or -)'d */ (int, struct rtentry *, struct rt_addrinfo *); u_short ifa_flags; /* mostly rt_flags for cloning */ u_int ifa_refcnt; /* references to this structure */ int ifa_metric; /* cost of going out this interface */ int (*ifa_claim_addr) /* check if an addr goes to this if */ (struct ifaddr *, struct sockaddr *); struct mtx ifa_mtx; }; #define IFA_ROUTE RTF_UP /* route installed */ /* for compatibility with other BSDs */ #define ifa_list ifa_link #define IFA_LOCK_INIT(ifa) \ mtx_init(&(ifa)->ifa_mtx, "ifaddr", NULL, MTX_DEF) #define IFA_LOCK(ifa) mtx_lock(&(ifa)->ifa_mtx) #define IFA_UNLOCK(ifa) mtx_unlock(&(ifa)->ifa_mtx) #define IFA_DESTROY(ifa) mtx_destroy(&(ifa)->ifa_mtx) /* * The prefix structure contains information about one prefix * of an interface. 
They are maintained by the different address families, * are allocated and attached when a prefix or an address is set, * and are linked together so all prefixes for an interface can be located. */ struct ifprefix { struct sockaddr *ifpr_prefix; /* prefix of interface */ struct ifnet *ifpr_ifp; /* back-pointer to interface */ TAILQ_ENTRY(ifprefix) ifpr_list; /* queue macro glue */ u_char ifpr_plen; /* prefix length in bits */ u_char ifpr_type; /* protocol dependent prefix type */ }; /* * Multicast address structure. This is analogous to the ifaddr * structure except that it keeps track of multicast addresses. */ struct ifmultiaddr { TAILQ_ENTRY(ifmultiaddr) ifma_link; /* queue macro glue */ struct sockaddr *ifma_addr; /* address this membership is for */ struct sockaddr *ifma_lladdr; /* link-layer translation, if any */ struct ifnet *ifma_ifp; /* back-pointer to interface */ u_int ifma_refcount; /* reference count */ void *ifma_protospec; /* protocol-specific state, if any */ struct ifmultiaddr *ifma_llifma; /* pointer to ifma for ifma_lladdr */ }; #ifdef _KERNEL #define IFAFREE(ifa) \ do { \ IFA_LOCK(ifa); \ KASSERT((ifa)->ifa_refcnt > 0, \ ("ifa %p !(ifa_refcnt > 0)", ifa)); \ if (--(ifa)->ifa_refcnt == 0) { \ IFA_DESTROY(ifa); \ free(ifa, M_IFADDR); \ } else \ IFA_UNLOCK(ifa); \ } while (0) #define IFAREF(ifa) \ do { \ IFA_LOCK(ifa); \ ++(ifa)->ifa_refcnt; \ IFA_UNLOCK(ifa); \ } while (0) extern struct mtx ifnet_lock; #define IFNET_LOCK_INIT() \ mtx_init(&ifnet_lock, "ifnet", NULL, MTX_DEF | MTX_RECURSE) #define IFNET_WLOCK() mtx_lock(&ifnet_lock) #define IFNET_WUNLOCK() mtx_unlock(&ifnet_lock) #define IFNET_WLOCK_ASSERT() mtx_assert(&ifnet_lock, MA_OWNED) #define IFNET_RLOCK() IFNET_WLOCK() #define IFNET_RUNLOCK() IFNET_WUNLOCK() struct ifindex_entry { struct ifnet *ife_ifnet; struct cdev *ife_dev; }; struct ifnet *ifnet_byindex(u_short idx); /* * Given the index, ifaddr_byindex() returns the one and only * link-level ifaddr for the interface. You are not supposed to use * it to traverse the list of addresses associated to the interface. */ struct ifaddr *ifaddr_byindex(u_short idx); struct cdev *ifdev_byindex(u_short idx); #ifdef VIMAGE_GLOBALS extern struct ifnethead ifnet; extern struct ifnet *loif; /* first loopback interface */ extern int if_index; #endif extern int ifqmaxlen; int if_addgroup(struct ifnet *, const char *); int if_delgroup(struct ifnet *, const char *); int if_addmulti(struct ifnet *, struct sockaddr *, struct ifmultiaddr **); int if_allmulti(struct ifnet *, int); struct ifnet* if_alloc(u_char); void if_attach(struct ifnet *); int if_delmulti(struct ifnet *, struct sockaddr *); void if_delmulti_ifma(struct ifmultiaddr *); void if_detach(struct ifnet *); void if_purgeaddrs(struct ifnet *); void if_purgemaddrs(struct ifnet *); void if_down(struct ifnet *); struct ifmultiaddr * if_findmulti(struct ifnet *, struct sockaddr *); void if_free(struct ifnet *); void if_free_type(struct ifnet *, u_char); void if_initname(struct ifnet *, const char *, int); void if_link_state_change(struct ifnet *, int); int if_printf(struct ifnet *, const char *, ...) 
__printflike(2, 3); int if_setlladdr(struct ifnet *, const u_char *, int); void if_up(struct ifnet *); /*void ifinit(void);*/ /* declared in systm.h for main() */ int ifioctl(struct socket *, u_long, caddr_t, struct thread *); int ifpromisc(struct ifnet *, int); struct ifnet *ifunit(const char *); void ifq_attach(struct ifaltq *, struct ifnet *ifp); void ifq_detach(struct ifaltq *); struct ifaddr *ifa_ifwithaddr(struct sockaddr *); struct ifaddr *ifa_ifwithbroadaddr(struct sockaddr *); struct ifaddr *ifa_ifwithdstaddr(struct sockaddr *); struct ifaddr *ifa_ifwithnet(struct sockaddr *); struct ifaddr *ifa_ifwithroute(int, struct sockaddr *, struct sockaddr *); struct ifaddr *ifa_ifwithroute_fib(int, struct sockaddr *, struct sockaddr *, u_int); struct ifaddr *ifaof_ifpforaddr(struct sockaddr *, struct ifnet *); int if_simloop(struct ifnet *ifp, struct mbuf *m, int af, int hlen); typedef void *if_com_alloc_t(u_char type, struct ifnet *ifp); typedef void if_com_free_t(void *com, u_char type); void if_register_com_alloc(u_char type, if_com_alloc_t *a, if_com_free_t *f); void if_deregister_com_alloc(u_char type); #define IF_LLADDR(ifp) \ LLADDR((struct sockaddr_dl *)((ifp)->if_addr->ifa_addr)) #ifdef DEVICE_POLLING enum poll_cmd { POLL_ONLY, POLL_AND_CHECK_STATUS }; typedef void poll_handler_t(struct ifnet *ifp, enum poll_cmd cmd, int count); int ether_poll_register(poll_handler_t *h, struct ifnet *ifp); int ether_poll_deregister(struct ifnet *ifp); #endif /* DEVICE_POLLING */ #endif /* _KERNEL */ #endif /* !_NET_IF_VAR_H_ */ Index: head/sys/net/radix_mpath.c =================================================================== --- head/sys/net/radix_mpath.c (revision 186118) +++ head/sys/net/radix_mpath.c (revision 186119) @@ -1,343 +1,343 @@ /* $KAME: radix_mpath.c,v 1.17 2004/11/08 10:29:39 itojun Exp $ */ /* * Copyright (C) 2001 WIDE Project. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the project nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * THE AUTHORS DO NOT GUARANTEE THAT THIS SOFTWARE DOES NOT INFRINGE * ANY OTHERS' INTELLECTUAL PROPERTIES. IN NO EVENT SHALL THE AUTHORS * BE LIABLE FOR ANY INFRINGEMENT OF ANY OTHERS' INTELLECTUAL * PROPERTIES. 
*/ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include #include #include #include #include #include #include #include #include #include #include /* * give some jitter to hash, to avoid synchronization between routers */ static uint32_t hashjitter; int rn_mpath_capable(struct radix_node_head *rnh) { return rnh->rnh_multipath; } struct radix_node * rn_mpath_next(struct radix_node *rn) { struct radix_node *next; if (!rn->rn_dupedkey) return NULL; next = rn->rn_dupedkey; if (rn->rn_mask == next->rn_mask) return next; else return NULL; } u_int32_t rn_mpath_count(struct radix_node *rn) { u_int32_t i; i = 1; while ((rn = rn_mpath_next(rn)) != NULL) i++; return i; } struct rtentry * rt_mpath_matchgate(struct rtentry *rt, struct sockaddr *gate) { struct radix_node *rn; if (!rn_mpath_next((struct radix_node *)rt)) return rt; if (!gate) return NULL; /* beyond here, we use rn as the master copy */ rn = (struct radix_node *)rt; do { rt = (struct rtentry *)rn; /* * we are removing an address alias that has * the same prefix as another address * we need to compare the interface address because * rt_gateway is a special sockadd_dl structure */ if (rt->rt_gateway->sa_family == AF_LINK) { if (!memcmp(rt->rt_ifa->ifa_addr, gate, gate->sa_len)) break; } else { if (rt->rt_gateway->sa_len == gate->sa_len && !memcmp(rt->rt_gateway, gate, gate->sa_len)) break; } } while ((rn = rn_mpath_next(rn)) != NULL); return (struct rtentry *)rn; } /* * go through the chain and unlink "rt" from the list * the caller will free "rt" */ int rt_mpath_deldup(struct rtentry *headrt, struct rtentry *rt) { struct radix_node *t, *tt; if (!headrt || !rt) return (0); t = (struct radix_node *)headrt; tt = rn_mpath_next(t); while (tt) { if (tt == (struct radix_node *)rt) { t->rn_dupedkey = tt->rn_dupedkey; tt->rn_dupedkey = NULL; tt->rn_flags &= ~RNF_ACTIVE; tt[1].rn_flags &= ~RNF_ACTIVE; return (1); } t = tt; tt = rn_mpath_next((struct radix_node *)t); } return (0); } /* * check if we have the same key/mask/gateway on the table already. */ int rt_mpath_conflict(struct radix_node_head *rnh, struct rtentry *rt, struct sockaddr *netmask) { struct radix_node *rn, *rn1; struct rtentry *rt1; char *p, *q, *eq; int same, l, skip; rn = (struct radix_node *)rt; rn1 = rnh->rnh_lookup(rt_key(rt), netmask, rnh); if (!rn1 || rn1->rn_flags & RNF_ROOT) return 0; /* * unlike other functions we have in this file, we have to check * all key/mask/gateway as rnh_lookup can match less specific entry. */ rt1 = (struct rtentry *)rn1; /* compare key. */ if (rt_key(rt1)->sa_len != rt_key(rt)->sa_len || bcmp(rt_key(rt1), rt_key(rt), rt_key(rt1)->sa_len)) goto different; /* key was the same. compare netmask. hairy... */ if (rt_mask(rt1) && netmask) { skip = rnh->rnh_treetop->rn_offset; if (rt_mask(rt1)->sa_len > netmask->sa_len) { /* * as rt_mask(rt1) is made optimal by radix.c, * there must be some 1-bits on rt_mask(rt1) * after netmask->sa_len. therefore, in * this case, the entries are different. */ if (rt_mask(rt1)->sa_len > skip) goto different; else { /* no bits to compare, i.e. same*/ goto maskmatched; } } l = rt_mask(rt1)->sa_len; if (skip > l) { /* no bits to compare, i.e. 
same */ goto maskmatched; } p = (char *)rt_mask(rt1); q = (char *)netmask; if (bcmp(p + skip, q + skip, l - skip)) goto different; /* * need to go through all the bit, as netmask is not * optimal and can contain trailing 0s */ eq = (char *)netmask + netmask->sa_len; q += l; same = 1; while (eq > q) if (*q++) { same = 0; break; } if (!same) goto different; } else if (!rt_mask(rt1) && !netmask) ; /* no mask to compare, i.e. same */ else { /* one has mask and the other does not, different */ goto different; } maskmatched: /* key/mask were the same. compare gateway for all multipaths */ do { rt1 = (struct rtentry *)rn1; /* sanity: no use in comparing the same thing */ if (rn1 == rn) continue; if (rt1->rt_gateway->sa_family == AF_LINK) { if (rt1->rt_ifa->ifa_addr->sa_len != rt->rt_ifa->ifa_addr->sa_len || bcmp(rt1->rt_ifa->ifa_addr, rt->rt_ifa->ifa_addr, rt1->rt_ifa->ifa_addr->sa_len)) continue; } else { if (rt1->rt_gateway->sa_len != rt->rt_gateway->sa_len || bcmp(rt1->rt_gateway, rt->rt_gateway, rt1->rt_gateway->sa_len)) continue; } /* all key/mask/gateway are the same. conflicting entry. */ return EEXIST; } while ((rn1 = rn_mpath_next(rn1)) != NULL); different: return 0; } void rtalloc_mpath_fib(struct route *ro, u_int32_t hash, u_int fibnum) { struct radix_node *rn0, *rn; u_int32_t n; /* * XXX we don't attempt to lookup cached route again; what should * be done for sendto(3) case? */ if (ro->ro_rt && ro->ro_rt->rt_ifp && (ro->ro_rt->rt_flags & RTF_UP)) - return; /* XXX */ - ro->ro_rt = rtalloc1_fib(&ro->ro_dst, 1, RTF_CLONING, fibnum); + return; + ro->ro_rt = rtalloc1_fib(&ro->ro_dst, 1, 0, fibnum); /* if the route does not exist or it is not multipath, don't care */ if (ro->ro_rt == NULL) return; if (rn_mpath_next((struct radix_node *)ro->ro_rt) == NULL) { RT_UNLOCK(ro->ro_rt); return; } /* beyond here, we use rn as the master copy */ rn0 = rn = (struct radix_node *)ro->ro_rt; n = rn_mpath_count(rn0); /* gw selection by Modulo-N Hash (RFC2991) XXX need improvement? */ hash += hashjitter; hash %= n; while (hash-- > 0 && rn) { /* stay within the multipath routes */ if (rn->rn_dupedkey && rn->rn_mask != rn->rn_dupedkey->rn_mask) break; rn = rn->rn_dupedkey; } /* XXX try filling rt_gwroute and avoid unreachable gw */ /* if gw selection fails, use the first match (default) */ if (!rn) { RT_UNLOCK(ro->ro_rt); return; } RTFREE_LOCKED(ro->ro_rt); ro->ro_rt = (struct rtentry *)rn; RT_LOCK(ro->ro_rt); RT_ADDREF(ro->ro_rt); RT_UNLOCK(ro->ro_rt); } extern int in6_inithead(void **head, int off); extern int in_inithead(void **head, int off); #ifdef INET int rn4_mpath_inithead(void **head, int off) { struct radix_node_head *rnh; hashjitter = arc4random(); if (in_inithead(head, off) == 1) { rnh = (struct radix_node_head *)*head; rnh->rnh_multipath = 1; return 1; } else return 0; } #endif #ifdef INET6 int rn6_mpath_inithead(void **head, int off) { struct radix_node_head *rnh; hashjitter = arc4random(); if (in6_inithead(head, off) == 1) { rnh = (struct radix_node_head *)*head; rnh->rnh_multipath = 1; return 1; } else return 0; } #endif Index: head/sys/net/route.c =================================================================== --- head/sys/net/route.c (revision 186118) +++ head/sys/net/route.c (revision 186119) @@ -1,1842 +1,1362 @@ /*- * Copyright (c) 1980, 1986, 1991, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)route.c 8.3.1.1 (Berkeley) 2/23/95 * $FreeBSD$ */ /************************************************************************ * Note: In this file a 'fib' is a "forwarding information base" * * Which is the new name for an in kernel routing (next hop) table. * ***********************************************************************/ #include "opt_inet.h" #include "opt_route.h" #include "opt_mrouting.h" #include "opt_mpath.h" #include #include +#include #include #include #include #include #include #include #include #include #include #include #include +#include #include #ifdef RADIX_MPATH #include #endif #include #include #include #include #include u_int rt_numfibs = RT_NUMFIBS; SYSCTL_INT(_net, OID_AUTO, fibs, CTLFLAG_RD, &rt_numfibs, 0, ""); /* * Allow the boot code to allow LESS than RT_MAXFIBS to be used. * We can't do more because storage is statically allocated for now. * (for compatibility reasons.. this will change). */ TUNABLE_INT("net.fibs", &rt_numfibs); /* * By default add routes to all fibs for new interfaces. * Once this is set to 0 then only allocate routes on interface * changes for the FIB of the caller when adding a new set of addresses * to an interface. XXX this is a shotgun aproach to a problem that needs * a more fine grained solution.. that will come. */ u_int rt_add_addr_allfibs = 1; SYSCTL_INT(_net, OID_AUTO, add_addr_allfibs, CTLFLAG_RW, &rt_add_addr_allfibs, 0, ""); TUNABLE_INT("net.add_addr_allfibs", &rt_add_addr_allfibs); #ifdef VIMAGE_GLOBALS static struct rtstat rtstat; /* by default only the first 'row' of tables will be accessed. */ /* * XXXMRT When we fix netstat, and do this differnetly, * we can allocate this dynamically. As long as we are keeping * things backwards compaitble we need to allocate this * statically. */ struct radix_node_head *rt_tables[RT_MAXFIBS][AF_MAX+1]; static int rttrash; /* routes not in table but not freed */ #endif static void rt_maskedcopy(struct sockaddr *, struct sockaddr *, struct sockaddr *); /* compare two sockaddr structures */ #define sa_equal(a1, a2) (bcmp((a1), (a2), (a1)->sa_len) == 0) /* * Convert a 'struct radix_node *' to a 'struct rtentry *'. 
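 * (Illustrative aside, not part of this commit: a sketch of how the per-FIB
 * tables declared above are indexed; compare rtalloc1_fib() later in this
 * file.)
 */

#if 0	/* illustration only; never compiled */
/* Hypothetical helper: one radix head per (fib, address family) pair,
 * with only AF_INET populated beyond fib 0 at this point.  (The real
 * code also sets up the vnet context before using V_rt_tables.) */
static struct radix_node_head *
example_fib_head(u_int fibnum, int family)
{
	if (family != AF_INET)		/* only INET has > 1 fib for now */
		fibnum = 0;
	return (V_rt_tables[fibnum][family]);
}
#endif

/*
 * On the RNTORT() conversion below: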
* The operation can be done safely (in this code) because a * 'struct rtentry' starts with two 'struct radix_node''s, the first * one representing leaf nodes in the routing tree, which is * what the code in radix.c passes us as a 'struct radix_node'. * * But because there are a lot of assumptions in this conversion, * do not cast explicitly, but always use the macro below. */ #define RNTORT(p) ((struct rtentry *)(p)) static uma_zone_t rtzone; /* Routing table UMA zone. */ #if 0 /* default fib for tunnels to use */ u_int tunnel_fib = 0; SYSCTL_INT(_net, OID_AUTO, tunnelfib, CTLFLAG_RD, &tunnel_fib, 0, ""); #endif /* * handler for net.my_fibnum */ static int sysctl_my_fibnum(SYSCTL_HANDLER_ARGS) { int fibnum; int error; fibnum = curthread->td_proc->p_fibnum; error = sysctl_handle_int(oidp, &fibnum, 0, req); return (error); } SYSCTL_PROC(_net, OID_AUTO, my_fibnum, CTLTYPE_INT|CTLFLAG_RD, NULL, 0, &sysctl_my_fibnum, "I", "default FIB of caller"); static void route_init(void) { INIT_VNET_INET(curvnet); int table; struct domain *dom; int fam; /* whack the tunable ints into line. */ if (rt_numfibs > RT_MAXFIBS) rt_numfibs = RT_MAXFIBS; if (rt_numfibs == 0) rt_numfibs = 1; rtzone = uma_zcreate("rtentry", sizeof(struct rtentry), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); rn_init(); /* initialize all zeroes, all ones, mask table */ for (dom = domains; dom; dom = dom->dom_next) { if (dom->dom_rtattach) { for (table = 0; table < rt_numfibs; table++) { if ( (fam = dom->dom_family) == AF_INET || table == 0) { /* for now only AF_INET has > 1 table */ /* XXX MRT * rtattach will be also called * from vfs_export.c but the * offset will be 0 * (only for AF_INET and AF_INET6 * which don't need it anyhow) */ dom->dom_rtattach( (void **)&V_rt_tables[table][fam], dom->dom_rtoffset); } else { break; } } } } } #ifndef _SYS_SYSPROTO_H_ struct setfib_args { int fibnum; }; #endif int setfib(struct thread *td, struct setfib_args *uap) { if (uap->fibnum < 0 || uap->fibnum >= rt_numfibs) return EINVAL; td->td_proc->p_fibnum = uap->fibnum; return (0); } /* * Packet routing routines. */ void rtalloc(struct route *ro) { rtalloc_ign_fib(ro, 0UL, 0); } void rtalloc_fib(struct route *ro, u_int fibnum) { rtalloc_ign_fib(ro, 0UL, fibnum); } void rtalloc_ign(struct route *ro, u_long ignore) { struct rtentry *rt; if ((rt = ro->ro_rt) != NULL) { if (rt->rt_ifp != NULL && rt->rt_flags & RTF_UP) return; RTFREE(rt); ro->ro_rt = NULL; } ro->ro_rt = rtalloc1_fib(&ro->ro_dst, 1, ignore, 0); if (ro->ro_rt) RT_UNLOCK(ro->ro_rt); } void rtalloc_ign_fib(struct route *ro, u_long ignore, u_int fibnum) { struct rtentry *rt; if ((rt = ro->ro_rt) != NULL) { if (rt->rt_ifp != NULL && rt->rt_flags & RTF_UP) return; RTFREE(rt); ro->ro_rt = NULL; } ro->ro_rt = rtalloc1_fib(&ro->ro_dst, 1, ignore, fibnum); if (ro->ro_rt) RT_UNLOCK(ro->ro_rt); } /* * Look up the route that matches the address given * Or, at least try.. Create a cloned route if needed. * * The returned route, if any, is locked. 
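The setfib() handler above only records the FIB number in the calling process and rejects values outside 0..rt_numfibs-1 with EINVAL. A hedged userland sketch of selecting an alternate FIB before opening a socket is shown below; the prototype is repeated locally in case the installed headers predate the system call, and the program only succeeds on a kernel built with more than one routing table (options ROUTETABLES).

#include <sys/socket.h>
#include <err.h>
#include <stdio.h>
#include <unistd.h>

int setfib(int fib);	/* system call implemented above; normally declared by the headers */

int
main(void)
{
	int s;

	/*
	 * Ask for FIB 1; this fails with EINVAL unless the kernel was
	 * built with more than one routing table.
	 */
	if (setfib(1) != 0)
		err(1, "setfib");

	/* Sockets opened from now on resolve routes in FIB 1. */
	s = socket(AF_INET, SOCK_DGRAM, 0);
	if (s < 0)
		err(1, "socket");
	printf("socket %d using FIB 1\n", s);
	close(s);
	return (0);
}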
*/ struct rtentry * rtalloc1(struct sockaddr *dst, int report, u_long ignflags) { return (rtalloc1_fib(dst, report, ignflags, 0)); } struct rtentry * rtalloc1_fib(struct sockaddr *dst, int report, u_long ignflags, u_int fibnum) { INIT_VNET_NET(curvnet); struct radix_node_head *rnh; struct rtentry *rt; struct radix_node *rn; struct rtentry *newrt; struct rt_addrinfo info; - u_long nflags; - int needresolve = 0, err = 0, msgtype = RTM_MISS; + int err = 0, msgtype = RTM_MISS; int needlock; KASSERT((fibnum < rt_numfibs), ("rtalloc1_fib: bad fibnum")); if (dst->sa_family != AF_INET) /* Only INET supports > 1 fib now */ fibnum = 0; rnh = V_rt_tables[fibnum][dst->sa_family]; newrt = NULL; /* * Look up the address in the table for that Address Family */ if (rnh == NULL) { V_rtstat.rts_unreach++; - goto miss2; + goto miss; } needlock = !(ignflags & RTF_RNH_LOCKED); -retry: if (needlock) RADIX_NODE_HEAD_RLOCK(rnh); #ifdef INVARIANTS else RADIX_NODE_HEAD_LOCK_ASSERT(rnh); #endif rn = rnh->rnh_matchaddr(dst, rnh); if (rn && ((rn->rn_flags & RNF_ROOT) == 0)) { - newrt = rt = RNTORT(rn); - nflags = rt->rt_flags & ~ignflags; - if (report && (nflags & RTF_CLONING)) { - if (needlock && !RADIX_NODE_HEAD_LOCK_TRY_UPGRADE(rnh)) { - RADIX_NODE_HEAD_RUNLOCK(rnh); - RADIX_NODE_HEAD_LOCK(rnh); - /* - * lookup again to make sure it wasn't changed - */ - rn = rnh->rnh_matchaddr(dst, rnh); - if (!(rn && ((rn->rn_flags & RNF_ROOT) == 0))) { - RADIX_NODE_HEAD_UNLOCK(rnh); - needresolve = 0; - log(LOG_INFO, "retrying route lookup ...\n"); - goto retry; - } - } - needresolve = 1; - } else { - RT_LOCK(newrt); - RT_ADDREF(newrt); - if (needlock) - RADIX_NODE_HEAD_RUNLOCK(rnh); - goto done; - } - } + RT_LOCK(newrt); + RT_ADDREF(newrt); + if (needlock) + RADIX_NODE_HEAD_RUNLOCK(rnh); + goto done; + + } else if (needlock) + RADIX_NODE_HEAD_RUNLOCK(rnh); + /* - * if needresolve is set then we have the exclusive lock - * and we need to keep it held for the benefit of rtrequest_fib + * Either we hit the root or couldn't find any match, + * Which basically means + * "caint get there frm here" */ - if (!needresolve && needlock) - RADIX_NODE_HEAD_RUNLOCK(rnh); - - if (needresolve) { - RADIX_NODE_HEAD_WLOCK_ASSERT(rnh); + V_rtstat.rts_unreach++; +miss: + if (report) { /* - * We are apparently adding (report = 0 in delete). - * If it requires that it be cloned, do so. - * (This implies it wasn't a HOST route.) + * If required, report the failure to the supervising + * Authorities. + * For a delete, this is not an error. (report == 0) */ - err = rtrequest_fib(RTM_RESOLVE, dst, NULL, - NULL, RTF_RNH_LOCKED, &newrt, fibnum); - if (err) { - /* - * If the cloning didn't succeed, maybe - * what we have will do. Return that. - */ - newrt = rt; /* existing route */ - RT_LOCK(newrt); - RT_ADDREF(newrt); - goto miss; - } - KASSERT(newrt, ("no route and no error")); - RT_LOCK(newrt); - if (newrt->rt_flags & RTF_XRESOLVE) { - /* - * If the new route specifies it be - * externally resolved, then go do that. - */ - msgtype = RTM_RESOLVE; - goto miss; - } - /* Inform listeners of the new route. 
*/ bzero(&info, sizeof(info)); - info.rti_info[RTAX_DST] = rt_key(newrt); - info.rti_info[RTAX_NETMASK] = rt_mask(newrt); - info.rti_info[RTAX_GATEWAY] = newrt->rt_gateway; - if (newrt->rt_ifp != NULL) { - info.rti_info[RTAX_IFP] = - newrt->rt_ifp->if_addr->ifa_addr; - info.rti_info[RTAX_IFA] = newrt->rt_ifa->ifa_addr; - } - rt_missmsg(RTM_ADD, &info, newrt->rt_flags, 0); - if (needlock) - RADIX_NODE_HEAD_UNLOCK(rnh); - } else { - /* - * Either we hit the root or couldn't find any match, - * Which basically means - * "caint get there frm here" - */ - V_rtstat.rts_unreach++; - miss: - if (needlock && needresolve) - RADIX_NODE_HEAD_UNLOCK(rnh); - miss2: if (report) { - /* - * If required, report the failure to the supervising - * Authorities. - * For a delete, this is not an error. (report == 0) - */ - bzero(&info, sizeof(info)); - info.rti_info[RTAX_DST] = dst; - rt_missmsg(msgtype, &info, 0, err); - } - } + info.rti_info[RTAX_DST] = dst; + rt_missmsg(msgtype, &info, 0, err); + } done: if (newrt) RT_LOCK_ASSERT(newrt); return (newrt); } /* * Remove a reference count from an rtentry. * If the count gets low enough, take it out of the routing table */ void rtfree(struct rtentry *rt) { INIT_VNET_NET(curvnet); struct radix_node_head *rnh; KASSERT(rt != NULL,("%s: NULL rt", __func__)); rnh = V_rt_tables[rt->rt_fibnum][rt_key(rt)->sa_family]; KASSERT(rnh != NULL,("%s: NULL rnh", __func__)); RT_LOCK_ASSERT(rt); /* * The callers should use RTFREE_LOCKED() or RTFREE(), so * we should come here exactly with the last reference. */ RT_REMREF(rt); if (rt->rt_refcnt > 0) { - printf("%s: %p has %lu refs\n", __func__, rt, rt->rt_refcnt); + log(LOG_DEBUG, "%s: %p has %d refs\t", __func__, rt, rt->rt_refcnt); goto done; } /* * On last reference give the "close method" a chance * to cleanup private state. This also permits (for * IPv4 and IPv6) a chance to decide if the routing table * entry should be purged immediately or at a later time. * When an immediate purge is to happen the close routine * typically calls rtexpunge which clears the RTF_UP flag * on the entry so that the code below reclaims the storage. */ if (rt->rt_refcnt == 0 && rnh->rnh_close) rnh->rnh_close((struct radix_node *)rt, rnh); /* * If we are no longer "up" (and ref == 0) * then we can free the resources associated * with the route. */ if ((rt->rt_flags & RTF_UP) == 0) { if (rt->rt_nodes->rn_flags & (RNF_ACTIVE | RNF_ROOT)) panic("rtfree 2"); /* * the rtentry must have been removed from the routing table * so it is represented in rttrash.. remove that now. */ V_rttrash--; #ifdef DIAGNOSTIC if (rt->rt_refcnt < 0) { printf("rtfree: %p not freed (neg refs)\n", rt); goto done; } #endif /* * release references on items we hold them on.. * e.g other routes and ifaddrs. */ if (rt->rt_ifa) IFAFREE(rt->rt_ifa); - rt->rt_parent = NULL; /* NB: no refcnt on parent */ - /* * The key is separatly alloc'd so free it (see rt_setgate()). * This also frees the gateway, as they are always malloc'd * together. */ Free(rt_key(rt)); /* * and the rtentry itself of course */ RT_LOCK_DESTROY(rt); uma_zfree(rtzone, rt); return; } done: RT_UNLOCK(rt); } /* * Force a routing table entry to the specified * destination to go through the given gateway. * Normally called as a result of a routing redirect * message from the network layer. 
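rtfree() above implements the tail end of the reference-count protocol: callers arrive through RTFREE_LOCKED()/RTFREE() holding the lock and one reference, the count is dropped, and only when it reaches zero (and the entry is no longer RTF_UP) is the storage reclaimed. The toy program below mirrors that protocol with a pthread mutex; it is an analogy for illustration, not the kernel API, and the struct obj type is invented for the example. Build with -lpthread.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy analogue of the rtentry refcount protocol behind rtfree()/RTFREE_LOCKED(). */
struct obj {
	pthread_mutex_t	lock;
	int		refcnt;
};

static void
obj_free_locked(struct obj *o)
{
	/* Caller holds the lock and owns one reference, like RTFREE_LOCKED(). */
	if (o->refcnt <= 1) {
		pthread_mutex_unlock(&o->lock);
		pthread_mutex_destroy(&o->lock);
		free(o);
		return;
	}
	o->refcnt--;			/* RT_REMREF */
	pthread_mutex_unlock(&o->lock);	/* RT_UNLOCK */
}

int
main(void)
{
	struct obj *o = malloc(sizeof(*o));

	pthread_mutex_init(&o->lock, NULL);
	o->refcnt = 2;

	pthread_mutex_lock(&o->lock);
	obj_free_locked(o);		/* drops to 1, object survives */

	pthread_mutex_lock(&o->lock);
	obj_free_locked(o);		/* last reference, object is freed */
	printf("done\n");
	return (0);
}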
*/ void rtredirect(struct sockaddr *dst, struct sockaddr *gateway, struct sockaddr *netmask, int flags, struct sockaddr *src) { rtredirect_fib(dst, gateway, netmask, flags, src, 0); } void rtredirect_fib(struct sockaddr *dst, struct sockaddr *gateway, struct sockaddr *netmask, int flags, struct sockaddr *src, u_int fibnum) { INIT_VNET_NET(curvnet); struct rtentry *rt, *rt0 = NULL; int error = 0; short *stat = NULL; struct rt_addrinfo info; struct ifaddr *ifa; struct radix_node_head *rnh = V_rt_tables[fibnum][dst->sa_family]; /* verify the gateway is directly reachable */ if ((ifa = ifa_ifwithnet(gateway)) == NULL) { error = ENETUNREACH; goto out; } rt = rtalloc1_fib(dst, 0, 0UL, fibnum); /* NB: rt is locked */ /* * If the redirect isn't from our current router for this dst, * it's either old or wrong. If it redirects us to ourselves, * we have a routing loop, perhaps as a result of an interface * going down recently. */ if (!(flags & RTF_DONE) && rt && (!sa_equal(src, rt->rt_gateway) || rt->rt_ifa != ifa)) error = EINVAL; else if (ifa_ifwithaddr(gateway)) error = EHOSTUNREACH; if (error) goto done; /* * Create a new entry if we just got back a wildcard entry * or the the lookup failed. This is necessary for hosts * which use routing redirects generated by smart gateways * to dynamically build the routing tables. */ if (rt == NULL || (rt_mask(rt) && rt_mask(rt)->sa_len < 2)) goto create; /* * Don't listen to the redirect if it's * for a route to an interface. */ if (rt->rt_flags & RTF_GATEWAY) { if (((rt->rt_flags & RTF_HOST) == 0) && (flags & RTF_HOST)) { /* * Changing from route to net => route to host. * Create new route, rather than smashing route to net. */ create: rt0 = rt; rt = NULL; flags |= RTF_GATEWAY | RTF_DYNAMIC; bzero((caddr_t)&info, sizeof(info)); info.rti_info[RTAX_DST] = dst; info.rti_info[RTAX_GATEWAY] = gateway; info.rti_info[RTAX_NETMASK] = netmask; info.rti_ifa = ifa; info.rti_flags = flags; if (rt0 != NULL) RT_UNLOCK(rt0); /* drop lock to avoid LOR with RNH */ error = rtrequest1_fib(RTM_ADD, &info, &rt, fibnum); if (rt != NULL) { RT_LOCK(rt); if (rt0 != NULL) EVENTHANDLER_INVOKE(route_redirect_event, rt0, rt, dst); flags = rt->rt_flags; } if (rt0 != NULL) RTFREE(rt0); stat = &V_rtstat.rts_dynamic; } else { struct rtentry *gwrt; /* * Smash the current notion of the gateway to * this destination. Should check about netmask!!! */ rt->rt_flags |= RTF_MODIFIED; flags |= RTF_MODIFIED; stat = &V_rtstat.rts_newgateway; /* * add the key and gateway (in one malloc'd chunk). */ RT_UNLOCK(rt); RADIX_NODE_HEAD_LOCK(rnh); RT_LOCK(rt); rt_setgate(rt, rt_key(rt), gateway); gwrt = rtalloc1(gateway, 1, RTF_RNH_LOCKED); RADIX_NODE_HEAD_UNLOCK(rnh); EVENTHANDLER_INVOKE(route_redirect_event, rt, gwrt, dst); RTFREE_LOCKED(gwrt); } } else error = EHOSTUNREACH; done: if (rt) RTFREE_LOCKED(rt); out: if (error) V_rtstat.rts_badredirect++; else if (stat != NULL) (*stat)++; bzero((caddr_t)&info, sizeof(info)); info.rti_info[RTAX_DST] = dst; info.rti_info[RTAX_GATEWAY] = gateway; info.rti_info[RTAX_NETMASK] = netmask; info.rti_info[RTAX_AUTHOR] = src; rt_missmsg(RTM_REDIRECT, &info, flags, error); } int rtioctl(u_long req, caddr_t data) { return (rtioctl_fib(req, data, 0)); } /* * Routing table ioctl interface. */ int rtioctl_fib(u_long req, caddr_t data, u_int fibnum) { /* * If more ioctl commands are added here, make sure the proper * super-user checks are being performed because it is possible for * prison-root to make it this far if raw sockets have been enabled * in jails. 
*/ #ifdef INET /* Multicast goop, grrr... */ return mrt_ioctl ? mrt_ioctl(req, data, fibnum) : EOPNOTSUPP; #else /* INET */ return ENXIO; #endif /* INET */ } struct ifaddr * ifa_ifwithroute(int flags, struct sockaddr *dst, struct sockaddr *gateway) { return (ifa_ifwithroute_fib(flags, dst, gateway, 0)); } struct ifaddr * ifa_ifwithroute_fib(int flags, struct sockaddr *dst, struct sockaddr *gateway, u_int fibnum) { register struct ifaddr *ifa; int not_found = 0; if ((flags & RTF_GATEWAY) == 0) { /* * If we are adding a route to an interface, * and the interface is a pt to pt link * we should search for the destination * as our clue to the interface. Otherwise * we can use the local address. */ ifa = NULL; if (flags & RTF_HOST) ifa = ifa_ifwithdstaddr(dst); if (ifa == NULL) ifa = ifa_ifwithaddr(gateway); } else { /* * If we are adding a route to a remote net * or host, the gateway may still be on the * other end of a pt to pt link. */ ifa = ifa_ifwithdstaddr(gateway); } if (ifa == NULL) ifa = ifa_ifwithnet(gateway); if (ifa == NULL) { struct rtentry *rt = rtalloc1_fib(gateway, 0, RTF_RNH_LOCKED, fibnum); if (rt == NULL) return (NULL); /* * dismiss a gateway that is reachable only * through the default router */ switch (gateway->sa_family) { case AF_INET: if (satosin(rt_key(rt))->sin_addr.s_addr == INADDR_ANY) not_found = 1; break; case AF_INET6: if (IN6_IS_ADDR_UNSPECIFIED(&satosin6(rt_key(rt))->sin6_addr)) not_found = 1; break; default: break; } RT_REMREF(rt); RT_UNLOCK(rt); if (not_found) return (NULL); if ((ifa = rt->rt_ifa) == NULL) return (NULL); } if (ifa->ifa_addr->sa_family != dst->sa_family) { struct ifaddr *oifa = ifa; ifa = ifaof_ifpforaddr(dst, ifa->ifa_ifp); if (ifa == NULL) ifa = oifa; } return (ifa); } -static walktree_f_t rt_fixdelete; -static walktree_f_t rt_fixchange; - -struct rtfc_arg { - struct rtentry *rt0; - struct radix_node_head *rnh; -}; - /* * Do appropriate manipulations of a routing tree given * all the bits of info needed */ int rtrequest(int req, struct sockaddr *dst, struct sockaddr *gateway, struct sockaddr *netmask, int flags, struct rtentry **ret_nrt) { return (rtrequest_fib(req, dst, gateway, netmask, flags, ret_nrt, 0)); } int rtrequest_fib(int req, struct sockaddr *dst, struct sockaddr *gateway, struct sockaddr *netmask, int flags, struct rtentry **ret_nrt, u_int fibnum) { struct rt_addrinfo info; if (dst->sa_len == 0) return(EINVAL); bzero((caddr_t)&info, sizeof(info)); info.rti_flags = flags; info.rti_info[RTAX_DST] = dst; info.rti_info[RTAX_GATEWAY] = gateway; info.rti_info[RTAX_NETMASK] = netmask; return rtrequest1_fib(req, &info, ret_nrt, fibnum); } /* * These (questionable) definitions of apparent local variables apply * to the next two functions. XXXXXX!!! */ #define dst info->rti_info[RTAX_DST] #define gateway info->rti_info[RTAX_GATEWAY] #define netmask info->rti_info[RTAX_NETMASK] #define ifaaddr info->rti_info[RTAX_IFA] #define ifpaddr info->rti_info[RTAX_IFP] #define flags info->rti_flags int rt_getifa(struct rt_addrinfo *info) { return (rt_getifa_fib(info, 0)); } int rt_getifa_fib(struct rt_addrinfo *info, u_int fibnum) { struct ifaddr *ifa; int error = 0; /* * ifp may be specified by sockaddr_dl * when protocol address is ambiguous. */ if (info->rti_ifp == NULL && ifpaddr != NULL && ifpaddr->sa_family == AF_LINK && (ifa = ifa_ifwithnet(ifpaddr)) != NULL) info->rti_ifp = ifa->ifa_ifp; if (info->rti_ifa == NULL && ifaaddr != NULL) info->rti_ifa = ifa_ifwithaddr(ifaaddr); if (info->rti_ifa == NULL) { struct sockaddr *sa; sa = ifaaddr != NULL ? 
ifaaddr : (gateway != NULL ? gateway : dst); if (sa != NULL && info->rti_ifp != NULL) info->rti_ifa = ifaof_ifpforaddr(sa, info->rti_ifp); else if (dst != NULL && gateway != NULL) info->rti_ifa = ifa_ifwithroute_fib(flags, dst, gateway, fibnum); else if (sa != NULL) info->rti_ifa = ifa_ifwithroute_fib(flags, sa, sa, fibnum); } if ((ifa = info->rti_ifa) != NULL) { if (info->rti_ifp == NULL) info->rti_ifp = ifa->ifa_ifp; } else error = ENETUNREACH; return (error); } /* * Expunges references to a route that's about to be reclaimed. * The route must be locked. */ int rtexpunge(struct rtentry *rt) { INIT_VNET_NET(curvnet); struct radix_node *rn; struct radix_node_head *rnh; struct ifaddr *ifa; int error = 0; + /* + * Find the correct routing tree to use for this Address Family + */ rnh = V_rt_tables[rt->rt_fibnum][rt_key(rt)->sa_family]; RT_LOCK_ASSERT(rt); + if (rnh == NULL) + return (EAFNOSUPPORT); RADIX_NODE_HEAD_LOCK_ASSERT(rnh); #if 0 /* * We cannot assume anything about the reference count * because protocols call us in many situations; often * before unwinding references to the table entry. */ KASSERT(rt->rt_refcnt <= 1, ("bogus refcnt %ld", rt->rt_refcnt)); #endif /* - * Find the correct routing tree to use for this Address Family - */ - rnh = V_rt_tables[rt->rt_fibnum][rt_key(rt)->sa_family]; - if (rnh == NULL) - return (EAFNOSUPPORT); - - /* * Remove the item from the tree; it should be there, * but when callers invoke us blindly it may not (sigh). */ rn = rnh->rnh_deladdr(rt_key(rt), rt_mask(rt), rnh); if (rn == NULL) { error = ESRCH; goto bad; } KASSERT((rn->rn_flags & (RNF_ACTIVE | RNF_ROOT)) == 0, ("unexpected flags 0x%x", rn->rn_flags)); KASSERT(rt == RNTORT(rn), ("lookup mismatch, rt %p rn %p", rt, rn)); rt->rt_flags &= ~RTF_UP; /* - * Now search what's left of the subtree for any cloned - * routes which might have been formed from this node. - */ - if ((rt->rt_flags & RTF_CLONING) && rt_mask(rt)) - rnh->rnh_walktree_from(rnh, rt_key(rt), rt_mask(rt), - rt_fixdelete, rt); - - /* - * Remove any external references we may have. - * This might result in another rtentry being freed if - * we held its last reference. - */ - if (rt->rt_gwroute) { - RTFREE(rt->rt_gwroute); - rt->rt_gwroute = NULL; - } - - /* * Give the protocol a chance to keep things in sync. */ if ((ifa = rt->rt_ifa) && ifa->ifa_rtrequest) { struct rt_addrinfo info; bzero((caddr_t)&info, sizeof(info)); info.rti_flags = rt->rt_flags; info.rti_info[RTAX_DST] = rt_key(rt); info.rti_info[RTAX_GATEWAY] = rt->rt_gateway; info.rti_info[RTAX_NETMASK] = rt_mask(rt); ifa->ifa_rtrequest(RTM_DELETE, rt, &info); } /* * one more rtentry floating around that is not * linked to the routing table. 
*/ V_rttrash++; bad: return (error); } int -rtrequest1(int req, struct rt_addrinfo *info, struct rtentry **ret_nrt) -{ - return (rtrequest1_fib(req, info, ret_nrt, 0)); -} - -int rtrequest1_fib(int req, struct rt_addrinfo *info, struct rtentry **ret_nrt, u_int fibnum) { INIT_VNET_NET(curvnet); int error = 0, needlock = 0; register struct rtentry *rt; register struct radix_node *rn; register struct radix_node_head *rnh; struct ifaddr *ifa; struct sockaddr *ndst; #define senderr(x) { error = x ; goto bad; } KASSERT((fibnum < rt_numfibs), ("rtrequest1_fib: bad fibnum")); if (dst->sa_family != AF_INET) /* Only INET supports > 1 fib now */ fibnum = 0; /* * Find the correct routing tree to use for this Address Family */ rnh = V_rt_tables[fibnum][dst->sa_family]; if (rnh == NULL) return (EAFNOSUPPORT); needlock = ((flags & RTF_RNH_LOCKED) == 0); flags &= ~RTF_RNH_LOCKED; if (needlock) RADIX_NODE_HEAD_LOCK(rnh); else RADIX_NODE_HEAD_LOCK_ASSERT(rnh); /* * If we are adding a host route then we don't want to put * a netmask in the tree, nor do we want to clone it. */ - if (flags & RTF_HOST) { + if (flags & RTF_HOST) netmask = NULL; - flags &= ~RTF_CLONING; - } + switch (req) { case RTM_DELETE: #ifdef RADIX_MPATH /* * if we got multipath routes, we require users to specify * a matching RTAX_GATEWAY. */ if (rn_mpath_capable(rnh)) { struct rtentry *rto = NULL; rn = rnh->rnh_matchaddr(dst, rnh); if (rn == NULL) senderr(ESRCH); rto = rt = RNTORT(rn); rt = rt_mpath_matchgate(rt, gateway); if (!rt) senderr(ESRCH); /* * this is the first entry in the chain */ if (rto == rt) { rn = rn_mpath_next((struct radix_node *)rt); /* * there is another entry, now it's active */ if (rn) { rto = RNTORT(rn); RT_LOCK(rto); rto->rt_flags |= RTF_UP; RT_UNLOCK(rto); } else if (rt->rt_flags & RTF_GATEWAY) { /* * For gateway routes, we need to * make sure that we we are deleting * the correct gateway. * rt_mpath_matchgate() does not * check the case when there is only * one route in the chain. */ if (gateway && (rt->rt_gateway->sa_len != gateway->sa_len || memcmp(rt->rt_gateway, gateway, gateway->sa_len))) senderr(ESRCH); } /* * use the normal delete code to remove * the first entry */ goto normal_rtdel; } /* * if the entry is 2nd and on up */ if (!rt_mpath_deldup(rto, rt)) panic ("rtrequest1: rt_mpath_deldup"); RT_LOCK(rt); RT_ADDREF(rt); rt->rt_flags &= ~RTF_UP; goto deldone; /* done with the RTM_DELETE command */ } normal_rtdel: #endif /* * Remove the item from the tree and return it. * Complain if it is not there and do no more processing. */ rn = rnh->rnh_deladdr(dst, netmask, rnh); if (rn == NULL) senderr(ESRCH); if (rn->rn_flags & (RNF_ACTIVE | RNF_ROOT)) panic ("rtrequest delete"); rt = RNTORT(rn); RT_LOCK(rt); RT_ADDREF(rt); rt->rt_flags &= ~RTF_UP; /* - * Now search what's left of the subtree for any cloned - * routes which might have been formed from this node. - */ - if ((rt->rt_flags & RTF_CLONING) && - rt_mask(rt)) { - rnh->rnh_walktree_from(rnh, dst, rt_mask(rt), - rt_fixdelete, rt); - } - - /* - * Remove any external references we may have. - * This might result in another rtentry being freed if - * we held its last reference. - */ - if (rt->rt_gwroute) { - RTFREE(rt->rt_gwroute); - rt->rt_gwroute = NULL; - } - - /* * give the protocol a chance to keep things in sync. */ if ((ifa = rt->rt_ifa) && ifa->ifa_rtrequest) ifa->ifa_rtrequest(RTM_DELETE, rt, info); #ifdef RADIX_MPATH deldone: #endif /* * One more rtentry floating around that is not * linked to the routing table. 
rttrash will be decremented * when RTFREE(rt) is eventually called. */ V_rttrash++; /* * If the caller wants it, then it can have it, * but it's up to it to free the rtentry as we won't be * doing it. */ if (ret_nrt) { *ret_nrt = rt; RT_UNLOCK(rt); } else RTFREE_LOCKED(rt); break; - case RTM_RESOLVE: - if (ret_nrt == NULL || (rt = *ret_nrt) == NULL) - senderr(EINVAL); - ifa = rt->rt_ifa; - /* XXX locking? */ - flags = rt->rt_flags & - ~(RTF_CLONING | RTF_STATIC); - flags |= RTF_WASCLONED; - gateway = rt->rt_gateway; - if ((netmask = rt->rt_genmask) == NULL) - flags |= RTF_HOST; - goto makeroute; - + /* + * resolve was only used for route cloning + * here for compat + */ + break; case RTM_ADD: if ((flags & RTF_GATEWAY) && !gateway) senderr(EINVAL); if (dst && gateway && (dst->sa_family != gateway->sa_family) && (gateway->sa_family != AF_UNSPEC) && (gateway->sa_family != AF_LINK)) senderr(EINVAL); if (info->rti_ifa == NULL && (error = rt_getifa_fib(info, fibnum))) senderr(error); ifa = info->rti_ifa; - - makeroute: rt = uma_zalloc(rtzone, M_NOWAIT | M_ZERO); if (rt == NULL) senderr(ENOBUFS); RT_LOCK_INIT(rt); rt->rt_flags = RTF_UP | flags; rt->rt_fibnum = fibnum; /* * Add the gateway. Possibly re-malloc-ing the storage for it - * also add the rt_gwroute if possible. + * */ RT_LOCK(rt); if ((error = rt_setgate(rt, dst, gateway)) != 0) { RT_LOCK_DESTROY(rt); uma_zfree(rtzone, rt); senderr(error); } /* * point to the (possibly newly malloc'd) dest address. */ ndst = (struct sockaddr *)rt_key(rt); /* * make sure it contains the value we want (masked if needed). */ if (netmask) { rt_maskedcopy(dst, ndst, netmask); } else bcopy(dst, ndst, dst->sa_len); /* * Note that we now have a reference to the ifa. * This moved from below so that rnh->rnh_addaddr() can * examine the ifa and ifa->ifa_ifp if it so desires. */ IFAREF(ifa); rt->rt_ifa = ifa; rt->rt_ifp = ifa->ifa_ifp; #ifdef RADIX_MPATH /* do not permit exactly the same dst/mask/gw pair */ if (rn_mpath_capable(rnh) && rt_mpath_conflict(rnh, rt, netmask)) { - if (rt->rt_gwroute) - RTFREE(rt->rt_gwroute); if (rt->rt_ifa) { IFAFREE(rt->rt_ifa); } Free(rt_key(rt)); RT_LOCK_DESTROY(rt); uma_zfree(rtzone, rt); senderr(EEXIST); } #endif /* XXX mtu manipulation will be done in rnh_addaddr -- itojun */ rn = rnh->rnh_addaddr(ndst, netmask, rnh, rt->rt_nodes); - if (rn == NULL) { - struct rtentry *rt2; - /* - * Uh-oh, we already have one of these in the tree. - * We do a special hack: if the route that's already - * there was generated by the cloning mechanism - * then we just blow it away and retry the insertion - * of the new one. 
- */ - rt2 = rtalloc1_fib(dst, 0, RTF_RNH_LOCKED, fibnum); - if (rt2 && rt2->rt_parent) { - rtexpunge(rt2); - RT_UNLOCK(rt2); - rn = rnh->rnh_addaddr(ndst, netmask, - rnh, rt->rt_nodes); - } else if (rt2) { - /* undo the extra ref we got */ - RTFREE_LOCKED(rt2); - } - } - /* * If it still failed to go into the tree, * then un-make it (this should be a function) */ if (rn == NULL) { - if (rt->rt_gwroute) - RTFREE(rt->rt_gwroute); if (rt->rt_ifa) IFAFREE(rt->rt_ifa); Free(rt_key(rt)); RT_LOCK_DESTROY(rt); uma_zfree(rtzone, rt); senderr(EEXIST); } - rt->rt_parent = NULL; - /* - * If we got here from RESOLVE, then we are cloning - * so clone the rest, and note that we - * are a clone (and increment the parent's references) - */ - if (req == RTM_RESOLVE) { - KASSERT(ret_nrt && *ret_nrt, - ("no route to clone from")); - rt->rt_rmx = (*ret_nrt)->rt_rmx; /* copy metrics */ - rt->rt_rmx.rmx_pksent = 0; /* reset packet counter */ - if ((*ret_nrt)->rt_flags & RTF_CLONING) { - /* - * NB: We do not bump the refcnt on the parent - * entry under the assumption that it will - * remain so long as we do. This is - * important when deleting the parent route - * as this operation requires traversing - * the tree to delete all clones and futzing - * with refcnts requires us to double-lock - * parent through this back reference. - */ - rt->rt_parent = *ret_nrt; - } - } - - /* * If this protocol has something to add to this then * allow it to do that as well. */ if (ifa->ifa_rtrequest) ifa->ifa_rtrequest(req, rt, info); /* - * We repeat the same procedure from rt_setgate() here because - * it doesn't fire when we call it there because the node - * hasn't been added to the tree yet. - */ - if (req == RTM_ADD && - !(rt->rt_flags & RTF_HOST) && rt_mask(rt) != NULL) { - struct rtfc_arg arg; - arg.rnh = rnh; - arg.rt0 = rt; - rnh->rnh_walktree_from(rnh, rt_key(rt), rt_mask(rt), - rt_fixchange, &arg); - } - - /* * actually return a resultant rtentry and * give the caller a single reference. */ if (ret_nrt) { *ret_nrt = rt; RT_ADDREF(rt); } RT_UNLOCK(rt); break; default: error = EOPNOTSUPP; } bad: if (needlock) RADIX_NODE_HEAD_UNLOCK(rnh); return (error); #undef senderr } #undef dst #undef gateway #undef netmask #undef ifaaddr #undef ifpaddr #undef flags -/* - * Called from rtrequest(RTM_DELETE, ...) to fix up the route's ``family'' - * (i.e., the routes related to it by the operation of cloning). This - * routine is iterated over all potential former-child-routes by way of - * rnh->rnh_walktree_from() above, and those that actually are children of - * the late parent (passed in as VP here) are themselves deleted. - */ -static int -rt_fixdelete(struct radix_node *rn, void *vp) -{ - struct rtentry *rt = RNTORT(rn); - struct rtentry *rt0 = vp; - - if (rt->rt_parent == rt0 && - !(rt->rt_flags & (RTF_PINNED | RTF_CLONING))) { - return rtrequest_fib(RTM_DELETE, rt_key(rt), NULL, rt_mask(rt), - rt->rt_flags|RTF_RNH_LOCKED, NULL, rt->rt_fibnum); - } - return 0; -} - -/* - * This routine is called from rt_setgate() to do the analogous thing for - * adds and changes. There is the added complication in this case of a - * middle insert; i.e., insertion of a new network route between an older - * network route and (cloned) host routes. For this reason, a simple check - * of rt->rt_parent is insufficient; each candidate route must be tested - * against the (mask, value) of the new route (passed as before in vp) - * to see if the new route matches it. 
- * - * XXX - it may be possible to do fixdelete() for changes and reserve this - * routine just for adds. I'm not sure why I thought it was necessary to do - * changes this way. - */ - -static int -rt_fixchange(struct radix_node *rn, void *vp) -{ - struct rtentry *rt = RNTORT(rn); - struct rtfc_arg *ap = vp; - struct rtentry *rt0 = ap->rt0; - struct radix_node_head *rnh = ap->rnh; - u_char *xk1, *xm1, *xk2, *xmp; - int i, len, mlen; - - /* make sure we have a parent, and route is not pinned or cloning */ - if (!rt->rt_parent || - (rt->rt_flags & (RTF_PINNED | RTF_CLONING))) - return 0; - - if (rt->rt_parent == rt0) /* parent match */ - goto delete_rt; - /* - * There probably is a function somewhere which does this... - * if not, there should be. - */ - len = imin(rt_key(rt0)->sa_len, rt_key(rt)->sa_len); - - xk1 = (u_char *)rt_key(rt0); - xm1 = (u_char *)rt_mask(rt0); - xk2 = (u_char *)rt_key(rt); - - /* avoid applying a less specific route */ - xmp = (u_char *)rt_mask(rt->rt_parent); - mlen = rt_key(rt->rt_parent)->sa_len; - if (mlen > rt_key(rt0)->sa_len) /* less specific route */ - return 0; - for (i = rnh->rnh_treetop->rn_offset; i < mlen; i++) - if ((xmp[i] & ~(xmp[i] ^ xm1[i])) != xmp[i]) - return 0; /* less specific route */ - - for (i = rnh->rnh_treetop->rn_offset; i < len; i++) - if ((xk2[i] & xm1[i]) != xk1[i]) - return 0; /* no match */ - - /* - * OK, this node is a clone, and matches the node currently being - * changed/added under the node's mask. So, get rid of it. - */ -delete_rt: - return rtrequest_fib(RTM_DELETE, rt_key(rt), NULL, - rt_mask(rt), rt->rt_flags, NULL, rt->rt_fibnum); -} - int rt_setgate(struct rtentry *rt, struct sockaddr *dst, struct sockaddr *gate) { INIT_VNET_NET(curvnet); /* XXX dst may be overwritten, can we move this to below */ + int dlen = SA_SIZE(dst), glen = SA_SIZE(gate); +#ifdef INVARIANTS struct radix_node_head *rnh = V_rt_tables[rt->rt_fibnum][dst->sa_family]; - int dlen = SA_SIZE(dst), glen = SA_SIZE(gate); +#endif -again: RT_LOCK_ASSERT(rt); RADIX_NODE_HEAD_LOCK_ASSERT(rnh); /* - * A host route with the destination equal to the gateway - * will interfere with keeping LLINFO in the routing - * table, so disallow it. - */ - if (((rt->rt_flags & (RTF_HOST|RTF_GATEWAY|RTF_LLINFO)) == - (RTF_HOST|RTF_GATEWAY)) && - dst->sa_len == gate->sa_len && - bcmp(dst, gate, dst->sa_len) == 0) { - /* - * The route might already exist if this is an RTM_CHANGE - * or a routing redirect, so try to delete it. - */ - if (rt_key(rt)) - rtexpunge(rt); - return EADDRNOTAVAIL; - } - - /* - * Cloning loop avoidance in case of bad configuration. - */ - if (rt->rt_flags & RTF_GATEWAY) { - struct rtentry *gwrt; - - RT_UNLOCK(rt); /* XXX workaround LOR */ - gwrt = rtalloc1_fib(gate, 1, RTF_RNH_LOCKED, rt->rt_fibnum); - if (gwrt == rt) { - RT_REMREF(rt); - return (EADDRINUSE); /* failure */ - } - /* - * Try to reacquire the lock on rt, and if it fails, - * clean state and restart from scratch. - */ - if (!RT_TRYLOCK(rt)) { - RTFREE_LOCKED(gwrt); - RT_LOCK(rt); - goto again; - } - /* - * If there is already a gwroute, then drop it. If we - * are asked to replace route with itself, then do - * not leak its refcounter. - */ - if (rt->rt_gwroute != NULL) { - if (rt->rt_gwroute == gwrt) { - RT_REMREF(rt->rt_gwroute); - } else - RTFREE(rt->rt_gwroute); - } - - if ((rt->rt_gwroute = gwrt) != NULL) - RT_UNLOCK(rt->rt_gwroute); - } - - /* * Prepare to store the gateway in rt->rt_gateway. * Both dst and gateway are stored one after the other in the same * malloc'd chunk. 
If we have room, we can reuse the old buffer, * rt_gateway already points to the right place. * Otherwise, malloc a new block and update the 'dst' address. */ if (rt->rt_gateway == NULL || glen > SA_SIZE(rt->rt_gateway)) { caddr_t new; R_Malloc(new, caddr_t, dlen + glen); if (new == NULL) return ENOBUFS; /* * XXX note, we copy from *dst and not *rt_key(rt) because * rt_setgate() can be called to initialize a newly * allocated route entry, in which case rt_key(rt) == NULL * (and also rt->rt_gateway == NULL). * Free()/free() handle a NULL argument just fine. */ bcopy(dst, new, dlen); Free(rt_key(rt)); /* free old block, if any */ rt_key(rt) = (struct sockaddr *)new; rt->rt_gateway = (struct sockaddr *)(new + dlen); } /* * Copy the new gateway value into the memory chunk. */ bcopy(gate, rt->rt_gateway, glen); - /* - * This isn't going to do anything useful for host routes, so - * don't bother. Also make sure we have a reasonable mask - * (we don't yet have one during adds). - */ - if (!(rt->rt_flags & RTF_HOST) && rt_mask(rt) != 0) { - struct rtfc_arg arg; - - arg.rnh = rnh; - arg.rt0 = rt; - rnh->rnh_walktree_from(rnh, rt_key(rt), rt_mask(rt), - rt_fixchange, &arg); - } - - return 0; + return (0); } static void rt_maskedcopy(struct sockaddr *src, struct sockaddr *dst, struct sockaddr *netmask) { register u_char *cp1 = (u_char *)src; register u_char *cp2 = (u_char *)dst; register u_char *cp3 = (u_char *)netmask; u_char *cplim = cp2 + *cp3; u_char *cplim2 = cp2 + *cp1; *cp2++ = *cp1++; *cp2++ = *cp1++; /* copies sa_len & sa_family */ cp3 += 2; if (cplim > cplim2) cplim = cplim2; while (cp2 < cplim) *cp2++ = *cp1++ & *cp3++; if (cp2 < cplim2) bzero((caddr_t)cp2, (unsigned)(cplim2 - cp2)); } /* * Set up a routing table entry, normally * for an interface. */ #define _SOCKADDR_TMPSIZE 128 /* Not too big.. kernel stack size is limited */ static inline int rtinit1(struct ifaddr *ifa, int cmd, int flags, int fibnum) { INIT_VNET_NET(curvnet); struct sockaddr *dst; struct sockaddr *netmask; struct rtentry *rt = NULL; struct rt_addrinfo info; int error = 0; int startfib, endfib; char tempbuf[_SOCKADDR_TMPSIZE]; int didwork = 0; int a_failure = 0; + static struct sockaddr_dl null_sdl = {sizeof(null_sdl), AF_LINK}; if (flags & RTF_HOST) { dst = ifa->ifa_dstaddr; netmask = NULL; } else { dst = ifa->ifa_addr; netmask = ifa->ifa_netmask; } if ( dst->sa_family != AF_INET) fibnum = 0; if (fibnum == -1) { if (rt_add_addr_allfibs == 0 && cmd == (int)RTM_ADD) { startfib = endfib = curthread->td_proc->p_fibnum; } else { startfib = 0; endfib = rt_numfibs - 1; } } else { KASSERT((fibnum < rt_numfibs), ("rtinit1: bad fibnum")); startfib = fibnum; endfib = fibnum; } if (dst->sa_len == 0) return(EINVAL); /* * If it's a delete, check that if it exists, * it's on the correct interface or we might scrub * a route to another ifa which would * be confusing at best and possibly worse. */ if (cmd == RTM_DELETE) { /* * It's a delete, so it should already exist.. * If it's a net, mask off the host bits * (Assuming we have a mask) * XXX this is kinda inet specific.. */ if (netmask != NULL) { rt_maskedcopy(dst, (struct sockaddr *)tempbuf, netmask); dst = (struct sockaddr *)tempbuf; } } /* * Now go through all the requested tables (fibs) and do the * requested action. Realistically, this will either be fib 0 * for protocols that don't do multiple tables or all the * tables for those that do. XXX For this version only AF_INET. 
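rt_maskedcopy() above copies sa_len and sa_family verbatim, ANDs every following byte of the destination address with the corresponding netmask byte, and zero-fills whatever the (possibly shorter) netmask did not cover. The standalone program below repeats that logic in userland to derive the network key 192.0.2.0 from a host address and a /24 mask; the addresses are documentation-prefix examples, not anything taken from a real table.

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <stdio.h>

/* Userland copy of the rt_maskedcopy() logic, for demonstration only. */
static void
masked_copy(const struct sockaddr *src, struct sockaddr *dst,
    const struct sockaddr *netmask)
{
	const u_char *cp1 = (const u_char *)src;
	u_char *cp2 = (u_char *)dst;
	const u_char *cp3 = (const u_char *)netmask;
	u_char *cplim = (u_char *)dst + netmask->sa_len;
	u_char *cplim2 = (u_char *)dst + src->sa_len;

	*cp2++ = *cp1++; *cp2++ = *cp1++;	/* sa_len and sa_family */
	cp3 += 2;
	if (cplim > cplim2)
		cplim = cplim2;
	while (cp2 < cplim)
		*cp2++ = *cp1++ & *cp3++;	/* mask each address byte */
	if (cp2 < cplim2)
		memset(cp2, 0, (size_t)(cplim2 - cp2));
}

int
main(void)
{
	struct sockaddr_in host, mask, net;
	char buf[INET_ADDRSTRLEN];

	memset(&host, 0, sizeof(host));
	host.sin_len = sizeof(host);
	host.sin_family = AF_INET;
	inet_pton(AF_INET, "192.0.2.57", &host.sin_addr);

	memset(&mask, 0, sizeof(mask));
	mask.sin_len = sizeof(mask);
	inet_pton(AF_INET, "255.255.255.0", &mask.sin_addr);

	masked_copy((struct sockaddr *)&host, (struct sockaddr *)&net,
	    (struct sockaddr *)&mask);
	printf("network key: %s\n",
	    inet_ntop(AF_INET, &net.sin_addr, buf, sizeof(buf)));
	return (0);
}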
* When that changes code should be refactored to protocol * independent parts and protocol dependent parts. */ for ( fibnum = startfib; fibnum <= endfib; fibnum++) { if (cmd == RTM_DELETE) { struct radix_node_head *rnh; struct radix_node *rn; /* * Look up an rtentry that is in the routing tree and * contains the correct info. */ if ((rnh = V_rt_tables[fibnum][dst->sa_family]) == NULL) /* this table doesn't exist but others might */ continue; RADIX_NODE_HEAD_LOCK(rnh); #ifdef RADIX_MPATH if (rn_mpath_capable(rnh)) { rn = rnh->rnh_matchaddr(dst, rnh); if (rn == NULL) error = ESRCH; else { rt = RNTORT(rn); /* * for interface route the * rt->rt_gateway is sockaddr_intf * for cloning ARP entries, so * rt_mpath_matchgate must use the * interface address */ rt = rt_mpath_matchgate(rt, ifa->ifa_addr); if (!rt) error = ESRCH; } } else #endif rn = rnh->rnh_lookup(dst, netmask, rnh); error = (rn == NULL || (rn->rn_flags & RNF_ROOT) || RNTORT(rn)->rt_ifa != ifa || !sa_equal((struct sockaddr *)rn->rn_key, dst)); RADIX_NODE_HEAD_UNLOCK(rnh); if (error) { /* this is only an error if bad on ALL tables */ continue; } } /* * Do the actual request */ bzero((caddr_t)&info, sizeof(info)); info.rti_ifa = ifa; info.rti_flags = flags | ifa->ifa_flags; info.rti_info[RTAX_DST] = dst; - info.rti_info[RTAX_GATEWAY] = ifa->ifa_addr; + /* + * doing this for compatibility reasons + */ + if (cmd == RTM_ADD) + info.rti_info[RTAX_GATEWAY] = + (struct sockaddr *)&null_sdl; + else + info.rti_info[RTAX_GATEWAY] = ifa->ifa_addr; info.rti_info[RTAX_NETMASK] = netmask; error = rtrequest1_fib(cmd, &info, &rt, fibnum); if (error == 0 && rt != NULL) { /* * notify any listening routing agents of the change */ RT_LOCK(rt); #ifdef RADIX_MPATH /* * in case address alias finds the first address * e.g. ifconfig bge0 192.103.54.246/24 * e.g. ifconfig bge0 192.103.54.247/24 * the address set in the route is 192.103.54.246 * so we need to replace it with 192.103.54.247 */ if (memcmp(rt->rt_ifa->ifa_addr, ifa->ifa_addr, ifa->ifa_addr->sa_len)) { IFAFREE(rt->rt_ifa); IFAREF(ifa); rt->rt_ifp = ifa->ifa_ifp; rt->rt_ifa = ifa; } #endif + /* + * doing this for compatibility reasons + */ + if (cmd == RTM_ADD) { + ((struct sockaddr_dl *)rt->rt_gateway)->sdl_type = + rt->rt_ifp->if_type; + ((struct sockaddr_dl *)rt->rt_gateway)->sdl_index = + rt->rt_ifp->if_index; + } rt_newaddrmsg(cmd, ifa, error, rt); if (cmd == RTM_DELETE) { /* * If we are deleting, and we found an entry, * then it's been removed from the tree.. * now throw it away. */ RTFREE_LOCKED(rt); } else { if (cmd == RTM_ADD) { /* * We just wanted to add it.. * we don't actually need a reference. */ RT_REMREF(rt); } RT_UNLOCK(rt); } didwork = 1; } if (error) a_failure = error; } if (cmd == RTM_DELETE) { if (didwork) { error = 0; } else { /* we only give an error if it wasn't in any table */ error = ((flags & RTF_HOST) ? EHOSTUNREACH : ENETUNREACH); } } else { if (a_failure) { /* return an error if any of them failed */ error = a_failure; } } return (error); } /* special one for inet internal use. may not use. */ int rtinit_fib(struct ifaddr *ifa, int cmd, int flags) { return (rtinit1(ifa, cmd, flags, -1)); } /* * Set up a routing table entry, normally * for an interface. 
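The compatibility code in rtinit1() above now installs an AF_LINK gateway for interface routes: a zeroed sockaddr_dl whose sdl_type and sdl_index are filled in from the route's ifnet after the entry is added. The short program below only shows the shape of that gateway sockaddr; the IFT_ETHER type and the index value are made-up placeholders for what the kernel copies out of rt->rt_ifp.

#include <sys/types.h>
#include <sys/socket.h>
#include <net/if_dl.h>
#include <net/if_types.h>
#include <stdio.h>

int
main(void)
{
	/* Shape of the AF_LINK gateway rtinit1() installs for interface routes. */
	struct sockaddr_dl sdl = { .sdl_len = sizeof(sdl), .sdl_family = AF_LINK };

	/* In the kernel these come from rt->rt_ifp; the values here are invented. */
	sdl.sdl_type = IFT_ETHER;
	sdl.sdl_index = 1;

	printf("gateway: AF_LINK, iftype %u, ifindex %u, len %u\n",
	    (unsigned)sdl.sdl_type, (unsigned)sdl.sdl_index,
	    (unsigned)sdl.sdl_len);
	return (0);
}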
*/ int rtinit(struct ifaddr *ifa, int cmd, int flags) { struct sockaddr *dst; int fib = 0; if (flags & RTF_HOST) { dst = ifa->ifa_dstaddr; } else { dst = ifa->ifa_addr; } if (dst->sa_family == AF_INET) fib = -1; return (rtinit1(ifa, cmd, flags, fib)); -} - -/* - * rt_check() is invoked on each layer 2 output path, prior to - * encapsulating outbound packets. - * - * The function is mostly used to find a routing entry for the gateway, - * which in some protocol families could also point to the link-level - * address for the gateway itself (the side effect of revalidating the - * route to the destination is rather pointless at this stage, we did it - * already a moment before in the pr_output() routine to locate the ifp - * and gateway to use). - * - * When we remove the layer-3 to layer-2 mapping tables from the - * routing table, this function can be removed. - * - * === On input === - * *dst is the address of the NEXT HOP (which coincides with the - * final destination if directly reachable); - * *lrt0 points to the cached route to the final destination; - * *lrt is not meaningful; - * (*lrt0 has no ref held on it by us so REMREF is not needed. - * Refs only account for major structural references and not usages, - * which is actually a bit of a problem.) - * - * === Operation === - * If the route is marked down try to find a new route. If the route - * to the gateway is gone, try to setup a new route. Otherwise, - * if the route is marked for packets to be rejected, enforce that. - * Note that rtalloc returns an rtentry with an extra REF that we may - * need to lose. - * - * === On return === - * *dst is unchanged; - * *lrt0 points to the (possibly new) route to the final destination - * *lrt points to the route to the next hop [LOCKED] - * - * Their values are meaningful ONLY if no error is returned. - * - * To follow this you have to remember that: - * RT_REMREF reduces the reference count by 1 but doesn't check it for 0 (!) - * RTFREE_LOCKED includes an RT_REMREF (or an rtfree if refs == 1) - * and an RT_UNLOCK - * RTFREE does an RT_LOCK and an RTFREE_LOCKED - * The gwroute pointer counts as a reference on the rtentry to which it points. - * so when we add it we use the ref that rtalloc gives us and when we lose it - * we need to remove the reference. - * RT_TEMP_UNLOCK does an RT_ADDREF before freeing the lock, and - * RT_RELOCK locks it (it can't have gone away due to the ref) and - * drops the ref, possibly freeing it and zeroing the pointer if - * the ref goes to 0 (unlocking in the process). - */ -int -rt_check(struct rtentry **lrt, struct rtentry **lrt0, struct sockaddr *dst) -{ - struct rtentry *rt; - struct rtentry *rt0; - u_int fibnum; - - KASSERT(*lrt0 != NULL, ("rt_check")); - rt0 = *lrt0; - rt = NULL; - fibnum = rt0->rt_fibnum; - - /* NB: the locking here is tortuous... */ - RT_LOCK(rt0); -retry: - if (rt0 && (rt0->rt_flags & RTF_UP) == 0) { - /* Current rt0 is useless, try get a replacement. */ - RT_UNLOCK(rt0); - rt0 = NULL; - } - if (rt0 == NULL) { - rt0 = rtalloc1_fib(dst, 1, 0UL, fibnum); - if (rt0 == NULL) { - return (EHOSTUNREACH); - } - RT_REMREF(rt0); /* don't need the reference. */ - } - - if (rt0->rt_flags & RTF_GATEWAY) { - if ((rt = rt0->rt_gwroute) != NULL) { - RT_LOCK(rt); /* NB: gwroute */ - if ((rt->rt_flags & RTF_UP) == 0) { - /* gw route is dud. 
ignore/lose it */ - RTFREE_LOCKED(rt); /* unref (&unlock) gwroute */ - rt = rt0->rt_gwroute = NULL; - } - } - - if (rt == NULL) { /* NOT AN ELSE CLAUSE */ - RT_TEMP_UNLOCK(rt0); /* MUST return to undo this */ - rt = rtalloc1_fib(rt0->rt_gateway, 1, 0UL, fibnum); - if ((rt == rt0) || (rt == NULL)) { - /* the best we can do is not good enough */ - if (rt) { - RT_REMREF(rt); /* assumes ref > 0 */ - RT_UNLOCK(rt); - } - RTFREE(rt0); /* lock, unref, (unlock) */ - return (ENETUNREACH); - } - /* - * Relock it and lose the added reference. - * All sorts of things could have happenned while we - * had no lock on it, so check for them. - */ - RT_RELOCK(rt0); - if (rt0 == NULL || ((rt0->rt_flags & RTF_UP) == 0)) - /* Ru-roh.. what we had is no longer any good */ - goto retry; - /* - * While we were away, someone replaced the gateway. - * Since a reference count is involved we can't just - * overwrite it. - */ - if (rt0->rt_gwroute) { - if (rt0->rt_gwroute != rt) { - RTFREE_LOCKED(rt); - goto retry; - } - } else { - rt0->rt_gwroute = rt; - } - } - RT_LOCK_ASSERT(rt); - RT_UNLOCK(rt0); - } else { - /* think of rt as having the lock from now on.. */ - rt = rt0; - } - /* XXX why are we inspecting rmx_expire? */ - if ((rt->rt_flags & RTF_REJECT) && - (rt->rt_rmx.rmx_expire == 0 || - time_uptime < rt->rt_rmx.rmx_expire)) { - RT_UNLOCK(rt); - return (rt == rt0 ? EHOSTDOWN : EHOSTUNREACH); - } - - *lrt = rt; - *lrt0 = rt0; - return (0); } /* This must be before ip6_init2(), which is now SI_ORDER_MIDDLE */ SYSINIT(route, SI_SUB_PROTO_DOMAIN, SI_ORDER_THIRD, route_init, 0); Index: head/sys/net/route.h =================================================================== --- head/sys/net/route.h (revision 186118) +++ head/sys/net/route.h (revision 186119) @@ -1,439 +1,434 @@ /*- * Copyright (c) 1980, 1986, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)route.h 8.4 (Berkeley) 1/9/95 * $FreeBSD$ */ #ifndef _NET_ROUTE_H_ #define _NET_ROUTE_H_ /* * Kernel resident routing tables. 
* * The routing tables are initialized when interface addresses * are set by making entries for all directly connected interfaces. */ /* * A route consists of a destination address and a reference * to a routing entry. These are often held by protocols * in their control blocks, e.g. inpcb. */ struct route { struct rtentry *ro_rt; struct sockaddr ro_dst; }; /* * These numbers are used by reliable protocols for determining * retransmission behavior and are included in the routing structure. */ struct rt_metrics_lite { u_long rmx_mtu; /* MTU for this path */ u_long rmx_expire; /* lifetime for route, e.g. redirect */ u_long rmx_pksent; /* packets sent using this route */ }; struct rt_metrics { u_long rmx_locks; /* Kernel must leave these values alone */ u_long rmx_mtu; /* MTU for this path */ u_long rmx_hopcount; /* max hops expected */ u_long rmx_expire; /* lifetime for route, e.g. redirect */ u_long rmx_recvpipe; /* inbound delay-bandwidth product */ u_long rmx_sendpipe; /* outbound delay-bandwidth product */ u_long rmx_ssthresh; /* outbound gateway buffer limit */ u_long rmx_rtt; /* estimated round trip time */ u_long rmx_rttvar; /* estimated rtt variance */ u_long rmx_pksent; /* packets sent using this route */ u_long rmx_filler[4]; /* will be used for T/TCP later */ }; /* * rmx_rtt and rmx_rttvar are stored as microseconds; * RTTTOPRHZ(rtt) converts to a value suitable for use * by a protocol slowtimo counter. */ #define RTM_RTTUNIT 1000000 /* units for rtt, rttvar, as units per sec */ #define RTTTOPRHZ(r) ((r) / (RTM_RTTUNIT / PR_SLOWHZ)) /* MRT compile-time constants */ #ifdef _KERNEL #ifndef ROUTETABLES #define RT_NUMFIBS 1 #define RT_MAXFIBS 1 #else /* while we use 4 bits in the mbuf flags, we are limited to 16 */ #define RT_MAXFIBS 16 #if ROUTETABLES > RT_MAXFIBS #define RT_NUMFIBS RT_MAXFIBS #error "ROUTETABLES defined too big" #else #if ROUTETABLES == 0 #define RT_NUMFIBS 1 #else #define RT_NUMFIBS ROUTETABLES #endif #endif #endif #endif extern u_int rt_numfibs; /* number fo usable routing tables */ extern u_int tunnel_fib; /* tunnels use these */ extern u_int fwd_fib; /* packets being forwarded use these routes */ /* * XXX kernel function pointer `rt_output' is visible to applications. */ struct mbuf; /* * We distinguish between routes to hosts and routes to networks, * preferring the former if available. For each route we infer * the interface to use from the gateway address supplied when * the route was entered. Routes that forward packets through * gateways are marked so that the output routines know to address the * gateway rather than the ultimate destination. */ #ifndef RNF_NORMAL #include #ifdef RADIX_MPATH #include #endif #endif struct rtentry { struct radix_node rt_nodes[2]; /* tree glue, and other values */ /* * XXX struct rtentry must begin with a struct radix_node (or two!) 
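Per the comment above, rmx_rtt and rmx_rttvar are kept in the units defined by RTM_RTTUNIT (microseconds) and RTTTOPRHZ() scales them to protocol slowtimo ticks. The arithmetic is easy to get backwards, so here is a small check program; PR_SLOWHZ normally comes from <sys/protosw.h> and is assumed here to be the historical 2 Hz.

#include <stdio.h>

#define RTM_RTTUNIT	1000000		/* rmx_rtt is kept in microseconds */
#define PR_SLOWHZ	2		/* slowtimo rate; 2 Hz assumed */
#define RTTTOPRHZ(r)	((r) / (RTM_RTTUNIT / PR_SLOWHZ))

int
main(void)
{
	unsigned long rtt_us[] = { 150000UL, 750000UL, 1250000UL };
	size_t i;

	for (i = 0; i < sizeof(rtt_us) / sizeof(rtt_us[0]); i++)
		printf("rtt %7lu us -> %lu slowtimo ticks\n",
		    rtt_us[i], RTTTOPRHZ(rtt_us[i]));
	return (0);
}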
* because the code does some casts of a 'struct radix_node *' * to a 'struct rtentry *' */ #define rt_key(r) (*((struct sockaddr **)(&(r)->rt_nodes->rn_key))) #define rt_mask(r) (*((struct sockaddr **)(&(r)->rt_nodes->rn_mask))) struct sockaddr *rt_gateway; /* value */ - u_long rt_flags; /* up/down?, host/net */ + int rt_flags; /* up/down?, host/net */ + int rt_refcnt; /* # held references */ struct ifnet *rt_ifp; /* the answer: interface to use */ struct ifaddr *rt_ifa; /* the answer: interface address to use */ struct rt_metrics_lite rt_rmx; /* metrics used by rx'ing protocols */ - long rt_refcnt; /* # held references */ - struct sockaddr *rt_genmask; /* for generation of cloned routes */ - caddr_t rt_llinfo; /* pointer to link level info cache */ - struct rtentry *rt_gwroute; /* implied entry for gatewayed routes */ - struct rtentry *rt_parent; /* cloning parent of this route */ u_int rt_fibnum; /* which FIB */ #ifdef _KERNEL /* XXX ugly, user apps use this definition but don't have a mtx def */ struct mtx rt_mtx; /* mutex for routing entry */ #endif }; /* * Following structure necessary for 4.3 compatibility; * We should eventually move it to a compat file. */ struct ortentry { u_long rt_hash; /* to speed lookups */ struct sockaddr rt_dst; /* key */ struct sockaddr rt_gateway; /* value */ short rt_flags; /* up/down?, host/net */ short rt_refcnt; /* # held references */ u_long rt_use; /* raw # packets forwarded */ struct ifnet *rt_ifp; /* the answer: interface to use */ }; #define rt_use rt_rmx.rmx_pksent #define RTF_UP 0x1 /* route usable */ #define RTF_GATEWAY 0x2 /* destination is a gateway */ #define RTF_HOST 0x4 /* host entry (net otherwise) */ #define RTF_REJECT 0x8 /* host or net unreachable */ #define RTF_DYNAMIC 0x10 /* created dynamically (by redirect) */ #define RTF_MODIFIED 0x20 /* modified dynamically (by redirect) */ #define RTF_DONE 0x40 /* message confirmed */ /* 0x80 unused, was RTF_DELCLONE */ -#define RTF_CLONING 0x100 /* generate new routes on use */ +/* 0x100 unused, was RTF_CLONING */ #define RTF_XRESOLVE 0x200 /* external daemon resolves name */ -#define RTF_LLINFO 0x400 /* generated by link layer (e.g. ARP) */ +/* 0x400 unused, was RTF_LLINFO */ #define RTF_STATIC 0x800 /* manually added */ #define RTF_BLACKHOLE 0x1000 /* just discard pkts (during updates) */ #define RTF_PROTO2 0x4000 /* protocol specific routing flag */ #define RTF_PROTO1 0x8000 /* protocol specific routing flag */ /* XXX: temporary to stay API/ABI compatible with userland */ #ifndef _KERNEL #define RTF_PRCLONING 0x10000 /* unused, for compatibility */ #endif -#define RTF_WASCLONED 0x20000 /* route generated through cloning */ +/* 0x20000 unused, was RTF_WASCLONED */ #define RTF_PROTO3 0x40000 /* protocol specific routing flag */ /* 0x80000 unused */ #define RTF_PINNED 0x100000 /* future use */ #define RTF_LOCAL 0x200000 /* route represents a local address */ #define RTF_BROADCAST 0x400000 /* route represents a bcast address */ #define RTF_MULTICAST 0x800000 /* route represents a mcast address */ /* 0x1000000 and up unassigned */ #define RTF_RNH_LOCKED 0x40000000 /* radix node head locked by caller */ /* Mask of RTF flags that are allowed to be modified by RTM_CHANGE. */ #define RTF_FMASK \ (RTF_PROTO1 | RTF_PROTO2 | RTF_PROTO3 | RTF_BLACKHOLE | \ RTF_REJECT | RTF_STATIC) /* * Routing statistics. 
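The flag block above retires RTF_CLONING (0x100), RTF_LLINFO (0x400) and RTF_WASCLONED (0x20000); their bit positions are now documented as unused. The snippet below, using the values from this header, shows an illustrative scrub of those retired bits from a flag word handed in by an older binary; the scrub itself is an example, not a claim about what the kernel does with such input.

#include <stdio.h>

/* Values from route.h; the three retired arp-v1 bits are kept only for scrubbing. */
#define RTF_UP		0x1
#define RTF_GATEWAY	0x2
#define RTF_HOST	0x4
#define RTF_CLONING	0x100		/* retired by arp-v2 */
#define RTF_LLINFO	0x400		/* retired by arp-v2 */
#define RTF_STATIC	0x800
#define RTF_WASCLONED	0x20000		/* retired by arp-v2 */

int
main(void)
{
	/* Flags as an old, pre-arp-v2 binary might hand them in. */
	int flags = RTF_UP | RTF_HOST | RTF_STATIC | RTF_CLONING | RTF_LLINFO;

	/* Illustrative scrub: drop bits the new kernel no longer understands. */
	flags &= ~(RTF_CLONING | RTF_LLINFO | RTF_WASCLONED);

	printf("flags 0x%x, host=%d gateway=%d\n", flags,
	    (flags & RTF_HOST) != 0, (flags & RTF_GATEWAY) != 0);
	return (0);
}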
*/ struct rtstat { short rts_badredirect; /* bogus redirect calls */ short rts_dynamic; /* routes created by redirects */ short rts_newgateway; /* routes modified by redirects */ short rts_unreach; /* lookups which failed */ short rts_wildcard; /* lookups satisfied by a wildcard */ }; /* * Structures for routing messages. */ struct rt_msghdr { u_short rtm_msglen; /* to skip over non-understood messages */ u_char rtm_version; /* future binary compatibility */ u_char rtm_type; /* message type */ u_short rtm_index; /* index for associated ifp */ int rtm_flags; /* flags, incl. kern & message, e.g. DONE */ int rtm_addrs; /* bitmask identifying sockaddrs in msg */ pid_t rtm_pid; /* identify sender */ int rtm_seq; /* for sender to identify action */ int rtm_errno; /* why failed */ int rtm_fmask; /* bitmask used in RTM_CHANGE message */ #define rtm_use rtm_fmask /* deprecated, use rtm_rmx->rmx_pksent */ u_long rtm_inits; /* which metrics we are initializing */ struct rt_metrics rtm_rmx; /* metrics themselves */ }; #define RTM_VERSION 5 /* Up the ante and ignore older versions */ /* * Message types. */ #define RTM_ADD 0x1 /* Add Route */ #define RTM_DELETE 0x2 /* Delete Route */ #define RTM_CHANGE 0x3 /* Change Metrics or flags */ #define RTM_GET 0x4 /* Report Metrics */ #define RTM_LOSING 0x5 /* Kernel Suspects Partitioning */ #define RTM_REDIRECT 0x6 /* Told to use different route */ #define RTM_MISS 0x7 /* Lookup failed on this address */ #define RTM_LOCK 0x8 /* fix specified metrics */ #define RTM_OLDADD 0x9 /* caused by SIOCADDRT */ #define RTM_OLDDEL 0xa /* caused by SIOCDELRT */ #define RTM_RESOLVE 0xb /* req to resolve dst to LL addr */ #define RTM_NEWADDR 0xc /* address being added to iface */ #define RTM_DELADDR 0xd /* address being removed from iface */ #define RTM_IFINFO 0xe /* iface going up/down etc. */ #define RTM_NEWMADDR 0xf /* mcast group membership being added to if */ #define RTM_DELMADDR 0x10 /* mcast group membership being deleted */ #define RTM_IFANNOUNCE 0x11 /* iface arrival/departure */ #define RTM_IEEE80211 0x12 /* IEEE80211 wireless event */ /* * Bitmask values for rtm_inits and rmx_locks. */ #define RTV_MTU 0x1 /* init or lock _mtu */ #define RTV_HOPCOUNT 0x2 /* init or lock _hopcount */ #define RTV_EXPIRE 0x4 /* init or lock _expire */ #define RTV_RPIPE 0x8 /* init or lock _recvpipe */ #define RTV_SPIPE 0x10 /* init or lock _sendpipe */ #define RTV_SSTHRESH 0x20 /* init or lock _ssthresh */ #define RTV_RTT 0x40 /* init or lock _rtt */ #define RTV_RTTVAR 0x80 /* init or lock _rttvar */ /* * Bitmask values for rtm_addrs. */ #define RTA_DST 0x1 /* destination sockaddr present */ #define RTA_GATEWAY 0x2 /* gateway sockaddr present */ #define RTA_NETMASK 0x4 /* netmask sockaddr present */ #define RTA_GENMASK 0x8 /* cloning mask sockaddr present */ #define RTA_IFP 0x10 /* interface name sockaddr present */ #define RTA_IFA 0x20 /* interface addr sockaddr present */ #define RTA_AUTHOR 0x40 /* sockaddr for author of redirect */ #define RTA_BRD 0x80 /* for NEWADDR, broadcast or p-p dest addr */ /* * Index offsets for sockaddr array for alternate internal encoding. 
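The rt_msghdr layout and the RTM_*/RTA_* values above are what userland exchanges with the kernel over a routing socket: a fixed header whose rtm_addrs bitmask announces which sockaddrs follow, in RTAX order. The sketch below fills in an RTM_GET request for a single destination and just prints its size; actually writing it to a socket(PF_ROUTE, SOCK_RAW, 0) descriptor is left out, and the 198.51.100.1 address is an arbitrary documentation example.

#include <sys/types.h>
#include <sys/socket.h>
#include <net/route.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	/* An rt_msghdr followed by the RTAX-ordered sockaddrs it announces. */
	struct {
		struct rt_msghdr	hdr;
		struct sockaddr_in	dst;
	} msg;

	memset(&msg, 0, sizeof(msg));
	msg.hdr.rtm_msglen = sizeof(msg);
	msg.hdr.rtm_version = RTM_VERSION;
	msg.hdr.rtm_type = RTM_GET;
	msg.hdr.rtm_addrs = RTA_DST;		/* only a destination follows */
	msg.hdr.rtm_pid = getpid();
	msg.hdr.rtm_seq = 1;

	msg.dst.sin_len = sizeof(msg.dst);
	msg.dst.sin_family = AF_INET;
	inet_pton(AF_INET, "198.51.100.1", &msg.dst.sin_addr);

	/* A real query would write() this to a socket(PF_ROUTE, SOCK_RAW, 0). */
	printf("RTM_GET message, %u bytes\n", (unsigned)msg.hdr.rtm_msglen);
	return (0);
}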
*/ #define RTAX_DST 0 /* destination sockaddr present */ #define RTAX_GATEWAY 1 /* gateway sockaddr present */ #define RTAX_NETMASK 2 /* netmask sockaddr present */ #define RTAX_GENMASK 3 /* cloning mask sockaddr present */ #define RTAX_IFP 4 /* interface name sockaddr present */ #define RTAX_IFA 5 /* interface addr sockaddr present */ #define RTAX_AUTHOR 6 /* sockaddr for author of redirect */ #define RTAX_BRD 7 /* for NEWADDR, broadcast or p-p dest addr */ #define RTAX_MAX 8 /* size of array to allocate */ struct rt_addrinfo { int rti_addrs; struct sockaddr *rti_info[RTAX_MAX]; int rti_flags; struct ifaddr *rti_ifa; struct ifnet *rti_ifp; }; /* * This macro returns the size of a struct sockaddr when passed * through a routing socket. Basically we round up sa_len to * a multiple of sizeof(long), with a minimum of sizeof(long). * The check for a NULL pointer is just a convenience, probably never used. * The case sa_len == 0 should only apply to empty structures. */ #define SA_SIZE(sa) \ ( (!(sa) || ((struct sockaddr *)(sa))->sa_len == 0) ? \ sizeof(long) : \ 1 + ( (((struct sockaddr *)(sa))->sa_len - 1) | (sizeof(long) - 1) ) ) #ifdef _KERNEL #define RT_LOCK_INIT(_rt) \ mtx_init(&(_rt)->rt_mtx, "rtentry", NULL, MTX_DEF | MTX_DUPOK) #define RT_LOCK(_rt) mtx_lock(&(_rt)->rt_mtx) #define RT_TRYLOCK(_rt) mtx_trylock(&(_rt)->rt_mtx) #define RT_UNLOCK(_rt) mtx_unlock(&(_rt)->rt_mtx) #define RT_LOCK_DESTROY(_rt) mtx_destroy(&(_rt)->rt_mtx) #define RT_LOCK_ASSERT(_rt) mtx_assert(&(_rt)->rt_mtx, MA_OWNED) #define RT_ADDREF(_rt) do { \ RT_LOCK_ASSERT(_rt); \ KASSERT((_rt)->rt_refcnt >= 0, \ - ("negative refcnt %ld", (_rt)->rt_refcnt)); \ + ("negative refcnt %d", (_rt)->rt_refcnt)); \ (_rt)->rt_refcnt++; \ } while (0) #define RT_REMREF(_rt) do { \ RT_LOCK_ASSERT(_rt); \ KASSERT((_rt)->rt_refcnt > 0, \ - ("bogus refcnt %ld", (_rt)->rt_refcnt)); \ + ("bogus refcnt %d", (_rt)->rt_refcnt)); \ (_rt)->rt_refcnt--; \ } while (0) #define RTFREE_LOCKED(_rt) do { \ if ((_rt)->rt_refcnt <= 1) \ rtfree(_rt); \ else { \ RT_REMREF(_rt); \ RT_UNLOCK(_rt); \ } \ /* guard against invalid refs */ \ _rt = 0; \ } while (0) #define RTFREE(_rt) do { \ RT_LOCK(_rt); \ RTFREE_LOCKED(_rt); \ } while (0) #define RT_TEMP_UNLOCK(_rt) do { \ RT_ADDREF(_rt); \ RT_UNLOCK(_rt); \ } while (0) #define RT_RELOCK(_rt) do { \ RT_LOCK(_rt); \ if ((_rt)->rt_refcnt <= 1) { \ rtfree(_rt); \ _rt = 0; /* signal that it went away */ \ } else { \ RT_REMREF(_rt); \ /* note that _rt is still valid */ \ } \ } while (0) extern struct radix_node_head *rt_tables[][AF_MAX+1]; struct ifmultiaddr; void rt_ieee80211msg(struct ifnet *, int, void *, size_t); void rt_ifannouncemsg(struct ifnet *, int); void rt_ifmsg(struct ifnet *); void rt_missmsg(int, struct rt_addrinfo *, int, int); void rt_newaddrmsg(int, struct ifaddr *, int, struct rtentry *); void rt_newmaddrmsg(int, struct ifmultiaddr *); int rt_setgate(struct rtentry *, struct sockaddr *, struct sockaddr *); /* * Note the following locking behavior: * * rtalloc_ign() and rtalloc() return ro->ro_rt unlocked * * rtalloc1() returns a locked rtentry * * rtfree() and RTFREE_LOCKED() require a locked rtentry * * RTFREE() uses an unlocked entry. 
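SA_SIZE() above rounds a sockaddr's sa_len up to a multiple of sizeof(long), with sizeof(long) as the floor for NULL or zero-length sockaddrs; that is how addresses are padded inside routing-socket messages. The helper below reproduces the same rounding on bare lengths so the behaviour can be checked quickly for either a 32-bit or a 64-bit long.

#include <stdio.h>

/* Same rounding rule as the SA_SIZE() macro in route.h, applied to sa_len alone. */
static unsigned long
sa_size(unsigned int sa_len)
{
	if (sa_len == 0)
		return (sizeof(long));
	return (1 + ((sa_len - 1) | (sizeof(long) - 1)));
}

int
main(void)
{
	unsigned int lens[] = { 0, 1, 7, 8, 16, 28 };
	size_t i;

	for (i = 0; i < sizeof(lens) / sizeof(lens[0]); i++)
		printf("sa_len %2u -> %lu bytes in a routing message\n",
		    lens[i], sa_size(lens[i]));
	return (0);
}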
*/ int rtexpunge(struct rtentry *); void rtfree(struct rtentry *); int rt_check(struct rtentry **, struct rtentry **, struct sockaddr *); /* XXX MRT COMPAT VERSIONS THAT SET UNIVERSE to 0 */ /* Thes are used by old code not yet converted to use multiple FIBS */ int rt_getifa(struct rt_addrinfo *); void rtalloc_ign(struct route *ro, u_long ignflags); void rtalloc(struct route *ro); /* XXX deprecated, use rtalloc_ign(ro, 0) */ struct rtentry *rtalloc1(struct sockaddr *, int, u_long); int rtinit(struct ifaddr *, int, int); int rtioctl(u_long, caddr_t); void rtredirect(struct sockaddr *, struct sockaddr *, struct sockaddr *, int, struct sockaddr *); int rtrequest(int, struct sockaddr *, struct sockaddr *, struct sockaddr *, int, struct rtentry **); -int rtrequest1(int, struct rt_addrinfo *, struct rtentry **); /* defaults to "all" FIBs */ int rtinit_fib(struct ifaddr *, int, int); /* XXX MRT NEW VERSIONS THAT USE FIBs * For now the protocol indepedent versions are the same as the AF_INET ones * but this will change.. */ int rt_getifa_fib(struct rt_addrinfo *, u_int fibnum); void rtalloc_ign_fib(struct route *ro, u_long ignflags, u_int fibnum); void rtalloc_fib(struct route *ro, u_int fibnum); struct rtentry *rtalloc1_fib(struct sockaddr *, int, u_long, u_int); int rtioctl_fib(u_long, caddr_t, u_int); void rtredirect_fib(struct sockaddr *, struct sockaddr *, struct sockaddr *, int, struct sockaddr *, u_int); int rtrequest_fib(int, struct sockaddr *, struct sockaddr *, struct sockaddr *, int, struct rtentry **, u_int); int rtrequest1_fib(int, struct rt_addrinfo *, struct rtentry **, u_int); #include typedef void (*rtevent_arp_update_fn)(void *, struct rtentry *, uint8_t *, struct sockaddr *); typedef void (*rtevent_redirect_fn)(void *, struct rtentry *, struct rtentry *, struct sockaddr *); EVENTHANDLER_DECLARE(route_arp_update_event, rtevent_arp_update_fn); EVENTHANDLER_DECLARE(route_redirect_event, rtevent_redirect_fn); #endif #endif Index: head/sys/net/rtsock.c =================================================================== --- head/sys/net/rtsock.c (revision 186118) +++ head/sys/net/rtsock.c (revision 186119) @@ -1,1485 +1,1485 @@ /*- * Copyright (c) 1988, 1991, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)rtsock.c 8.7 (Berkeley) 10/12/95 * $FreeBSD$ */ #include "opt_sctp.h" #include "opt_mpath.h" #include "opt_inet.h" #include "opt_inet6.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #include #ifdef INET6 #include #endif #ifdef SCTP extern void sctp_addr_change(struct ifaddr *ifa, int cmd); #endif /* SCTP */ MALLOC_DEFINE(M_RTABLE, "routetbl", "routing tables"); /* NB: these are not modified */ static struct sockaddr route_src = { 2, PF_ROUTE, }; static struct sockaddr sa_zero = { sizeof(sa_zero), AF_INET, }; static struct { int ip_count; /* attached w/ AF_INET */ int ip6_count; /* attached w/ AF_INET6 */ int ipx_count; /* attached w/ AF_IPX */ int any_count; /* total attached */ } route_cb; struct mtx rtsock_mtx; MTX_SYSINIT(rtsock, &rtsock_mtx, "rtsock route_cb lock", MTX_DEF); #define RTSOCK_LOCK() mtx_lock(&rtsock_mtx) #define RTSOCK_UNLOCK() mtx_unlock(&rtsock_mtx) #define RTSOCK_LOCK_ASSERT() mtx_assert(&rtsock_mtx, MA_OWNED) static struct ifqueue rtsintrq; SYSCTL_NODE(_net, OID_AUTO, route, CTLFLAG_RD, 0, ""); SYSCTL_INT(_net_route, OID_AUTO, netisr_maxqlen, CTLFLAG_RW, &rtsintrq.ifq_maxlen, 0, "maximum routing socket dispatch queue length"); struct walkarg { int w_tmemsize; int w_op, w_arg; caddr_t w_tmem; struct sysctl_req *w_req; }; static void rts_input(struct mbuf *m); static struct mbuf *rt_msg1(int type, struct rt_addrinfo *rtinfo); static int rt_msg2(int type, struct rt_addrinfo *rtinfo, caddr_t cp, struct walkarg *w); static int rt_xaddrs(caddr_t cp, caddr_t cplim, struct rt_addrinfo *rtinfo); static int sysctl_dumpentry(struct radix_node *rn, void *vw); static int sysctl_iflist(int af, struct walkarg *w); static int sysctl_ifmalist(int af, struct walkarg *w); static int route_output(struct mbuf *m, struct socket *so); static void rt_setmetrics(u_long which, const struct rt_metrics *in, struct rt_metrics_lite *out); static void rt_getmetrics(const struct rt_metrics_lite *in, struct rt_metrics *out); static void rt_dispatch(struct mbuf *, const struct sockaddr *); static void rts_init(void) { int tmp; rtsintrq.ifq_maxlen = 256; if (TUNABLE_INT_FETCH("net.route.netisr_maxqlen", &tmp)) rtsintrq.ifq_maxlen = tmp; mtx_init(&rtsintrq.ifq_mtx, "rts_inq", NULL, MTX_DEF); netisr_register(NETISR_ROUTE, rts_input, &rtsintrq, 0); } SYSINIT(rtsock, SI_SUB_PROTO_DOMAIN, SI_ORDER_THIRD, rts_init, 0); static void rts_input(struct mbuf *m) { struct sockproto route_proto; unsigned short *family; struct m_tag *tag; route_proto.sp_family = PF_ROUTE; tag = m_tag_find(m, PACKET_TAG_RTSOCKFAM, NULL); if (tag != NULL) { family = (unsigned short *)(tag + 1); route_proto.sp_protocol = *family; m_tag_delete(m, tag); } else route_proto.sp_protocol = 0; raw_input(m, &route_proto, &route_src); } /* * It really doesn't make any sense at all for this code to share much * with raw_usrreq.c, since its functionality is so 
restricted. XXX */ static void rts_abort(struct socket *so) { raw_usrreqs.pru_abort(so); } static void rts_close(struct socket *so) { raw_usrreqs.pru_close(so); } /* pru_accept is EOPNOTSUPP */ static int rts_attach(struct socket *so, int proto, struct thread *td) { struct rawcb *rp; int s, error; KASSERT(so->so_pcb == NULL, ("rts_attach: so_pcb != NULL")); /* XXX */ rp = malloc(sizeof *rp, M_PCB, M_WAITOK | M_ZERO); if (rp == NULL) return ENOBUFS; /* * The splnet() is necessary to block protocols from sending * error notifications (like RTM_REDIRECT or RTM_LOSING) while * this PCB is extant but incompletely initialized. * Probably we should try to do more of this work beforehand and * eliminate the spl. */ s = splnet(); so->so_pcb = (caddr_t)rp; so->so_fibnum = td->td_proc->p_fibnum; error = raw_attach(so, proto); rp = sotorawcb(so); if (error) { splx(s); so->so_pcb = NULL; free(rp, M_PCB); return error; } RTSOCK_LOCK(); switch(rp->rcb_proto.sp_protocol) { case AF_INET: route_cb.ip_count++; break; case AF_INET6: route_cb.ip6_count++; break; case AF_IPX: route_cb.ipx_count++; break; } route_cb.any_count++; RTSOCK_UNLOCK(); soisconnected(so); so->so_options |= SO_USELOOPBACK; splx(s); return 0; } static int rts_bind(struct socket *so, struct sockaddr *nam, struct thread *td) { return (raw_usrreqs.pru_bind(so, nam, td)); /* xxx just EINVAL */ } static int rts_connect(struct socket *so, struct sockaddr *nam, struct thread *td) { return (raw_usrreqs.pru_connect(so, nam, td)); /* XXX just EINVAL */ } /* pru_connect2 is EOPNOTSUPP */ /* pru_control is EOPNOTSUPP */ static void rts_detach(struct socket *so) { struct rawcb *rp = sotorawcb(so); KASSERT(rp != NULL, ("rts_detach: rp == NULL")); RTSOCK_LOCK(); switch(rp->rcb_proto.sp_protocol) { case AF_INET: route_cb.ip_count--; break; case AF_INET6: route_cb.ip6_count--; break; case AF_IPX: route_cb.ipx_count--; break; } route_cb.any_count--; RTSOCK_UNLOCK(); raw_usrreqs.pru_detach(so); } static int rts_disconnect(struct socket *so) { return (raw_usrreqs.pru_disconnect(so)); } /* pru_listen is EOPNOTSUPP */ static int rts_peeraddr(struct socket *so, struct sockaddr **nam) { return (raw_usrreqs.pru_peeraddr(so, nam)); } /* pru_rcvd is EOPNOTSUPP */ /* pru_rcvoob is EOPNOTSUPP */ static int rts_send(struct socket *so, int flags, struct mbuf *m, struct sockaddr *nam, struct mbuf *control, struct thread *td) { return (raw_usrreqs.pru_send(so, flags, m, nam, control, td)); } /* pru_sense is null */ static int rts_shutdown(struct socket *so) { return (raw_usrreqs.pru_shutdown(so)); } static int rts_sockaddr(struct socket *so, struct sockaddr **nam) { return (raw_usrreqs.pru_sockaddr(so, nam)); } static struct pr_usrreqs route_usrreqs = { .pru_abort = rts_abort, .pru_attach = rts_attach, .pru_bind = rts_bind, .pru_connect = rts_connect, .pru_detach = rts_detach, .pru_disconnect = rts_disconnect, .pru_peeraddr = rts_peeraddr, .pru_send = rts_send, .pru_shutdown = rts_shutdown, .pru_sockaddr = rts_sockaddr, .pru_close = rts_close, }; #ifndef _SOCKADDR_UNION_DEFINED #define _SOCKADDR_UNION_DEFINED /* * The union of all possible address formats we handle. */ union sockaddr_union { struct sockaddr sa; struct sockaddr_in sin; struct sockaddr_in6 sin6; }; #endif /* _SOCKADDR_UNION_DEFINED */ static int rtm_get_jailed(struct rt_addrinfo *info, struct ifnet *ifp, struct rtentry *rt, union sockaddr_union *saun, struct ucred *cred) { switch (info->rti_info[RTAX_DST]->sa_family) { #ifdef INET case AF_INET: { struct in_addr ia; /* * 1. 
Check if the returned address is part of the jail. */ ia = ((struct sockaddr_in *)rt->rt_ifa->ifa_addr)->sin_addr; if (prison_check_ip4(cred, &ia) != 0) { info->rti_info[RTAX_IFA] = rt->rt_ifa->ifa_addr; } else { struct ifaddr *ifa; int found; found = 0; /* * 2. Try to find an address on the given outgoing * interface that belongs to the jail. */ TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { struct sockaddr *sa; sa = ifa->ifa_addr; if (sa->sa_family != AF_INET) continue; ia = ((struct sockaddr_in *)sa)->sin_addr; if (prison_check_ip4(cred, &ia) != 0) { found = 1; break; } } if (!found) { /* * 3. As a last resort return the 'default' * jail address. */ if (prison_getip4(cred, &ia) != 0) return (ESRCH); } bzero(&saun->sin, sizeof(struct sockaddr_in)); saun->sin.sin_len = sizeof(struct sockaddr_in); saun->sin.sin_family = AF_INET; saun->sin.sin_addr.s_addr = ia.s_addr; info->rti_info[RTAX_IFA] = (struct sockaddr *)&saun->sin; } break; } #endif #ifdef INET6 case AF_INET6: { struct in6_addr ia6; /* * 1. Check if the returned address is part of the jail. */ bcopy(&((struct sockaddr_in6 *)rt->rt_ifa->ifa_addr)->sin6_addr, &ia6, sizeof(struct in6_addr)); if (prison_check_ip6(cred, &ia6) != 0) { info->rti_info[RTAX_IFA] = rt->rt_ifa->ifa_addr; } else { struct ifaddr *ifa; int found; found = 0; /* * 2. Try to find an address on the given outgoing * interface that belongs to the jail. */ TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { struct sockaddr *sa; sa = ifa->ifa_addr; if (sa->sa_family != AF_INET6) continue; bcopy(&((struct sockaddr_in6 *)sa)->sin6_addr, &ia6, sizeof(struct in6_addr)); if (prison_check_ip6(cred, &ia6) != 0) { found = 1; break; } } if (!found) { /* * 3. As a last resort return the 'default' * jail address. */ if (prison_getip6(cred, &ia6) != 0) return (ESRCH); } bzero(&saun->sin6, sizeof(struct sockaddr_in6)); saun->sin6.sin6_len = sizeof(struct sockaddr_in6); saun->sin6.sin6_family = AF_INET6; bcopy(&ia6, &saun->sin6.sin6_addr, sizeof(struct in6_addr)); if (sa6_recoverscope(&saun->sin6) != 0) return (ESRCH); info->rti_info[RTAX_IFA] = (struct sockaddr *)&saun->sin6; } break; } #endif default: return (ESRCH); } return (0); } /*ARGSUSED*/ static int route_output(struct mbuf *m, struct socket *so) { #define sa_equal(a1, a2) (bcmp((a1), (a2), (a1)->sa_len) == 0) INIT_VNET_NET(so->so_vnet); struct rt_msghdr *rtm = NULL; struct rtentry *rt = NULL; struct radix_node_head *rnh; struct rt_addrinfo info; int len, error = 0; struct ifnet *ifp = NULL; union sockaddr_union saun; #define senderr(e) { error = e; goto flush;} if (m == NULL || ((m->m_len < sizeof(long)) && (m = m_pullup(m, sizeof(long))) == NULL)) return (ENOBUFS); if ((m->m_flags & M_PKTHDR) == 0) panic("route_output"); len = m->m_pkthdr.len; if (len < sizeof(*rtm) || len != mtod(m, struct rt_msghdr *)->rtm_msglen) { info.rti_info[RTAX_DST] = NULL; senderr(EINVAL); } R_Malloc(rtm, struct rt_msghdr *, len); if (rtm == NULL) { info.rti_info[RTAX_DST] = NULL; senderr(ENOBUFS); } m_copydata(m, 0, len, (caddr_t)rtm); if (rtm->rtm_version != RTM_VERSION) { info.rti_info[RTAX_DST] = NULL; senderr(EPROTONOSUPPORT); } rtm->rtm_pid = curproc->p_pid; bzero(&info, sizeof(info)); info.rti_addrs = rtm->rtm_addrs; if (rt_xaddrs((caddr_t)(rtm + 1), len + (caddr_t)rtm, &info)) { info.rti_info[RTAX_DST] = NULL; senderr(EINVAL); } info.rti_flags = rtm->rtm_flags; if (info.rti_info[RTAX_DST] == NULL || info.rti_info[RTAX_DST]->sa_family >= AF_MAX || (info.rti_info[RTAX_GATEWAY] != NULL && info.rti_info[RTAX_GATEWAY]->sa_family >= AF_MAX)) 
senderr(EINVAL); - if (info.rti_info[RTAX_GENMASK]) { - struct radix_node *t; - t = rn_addmask((caddr_t) info.rti_info[RTAX_GENMASK], 0, 1); - if (t != NULL && - bcmp((char *)(void *)info.rti_info[RTAX_GENMASK] + 1, - (char *)(void *)t->rn_key + 1, - ((struct sockaddr *)t->rn_key)->sa_len - 1) == 0) - info.rti_info[RTAX_GENMASK] = - (struct sockaddr *)t->rn_key; - else - senderr(ENOBUFS); - } - /* * Verify that the caller has the appropriate privilege; RTM_GET * is the only operation the non-superuser is allowed. */ if (rtm->rtm_type != RTM_GET) { error = priv_check(curthread, PRIV_NET_ROUTE); if (error) senderr(error); } switch (rtm->rtm_type) { struct rtentry *saved_nrt; case RTM_ADD: if (info.rti_info[RTAX_GATEWAY] == NULL) senderr(EINVAL); saved_nrt = NULL; + /* support for new ARP code */ + if (info.rti_info[RTAX_GATEWAY]->sa_family == AF_LINK) { + error = lla_rt_output(rtm, &info); + break; + } error = rtrequest1_fib(RTM_ADD, &info, &saved_nrt, so->so_fibnum); if (error == 0 && saved_nrt) { RT_LOCK(saved_nrt); rt_setmetrics(rtm->rtm_inits, &rtm->rtm_rmx, &saved_nrt->rt_rmx); rtm->rtm_index = saved_nrt->rt_ifp->if_index; RT_REMREF(saved_nrt); - saved_nrt->rt_genmask = info.rti_info[RTAX_GENMASK]; RT_UNLOCK(saved_nrt); } break; case RTM_DELETE: saved_nrt = NULL; + /* support for new ARP code */ + if (info.rti_info[RTAX_GATEWAY] && + (info.rti_info[RTAX_GATEWAY]->sa_family == AF_LINK)) { + error = lla_rt_output(rtm, &info); + break; + } error = rtrequest1_fib(RTM_DELETE, &info, &saved_nrt, so->so_fibnum); if (error == 0) { RT_LOCK(saved_nrt); rt = saved_nrt; goto report; } break; case RTM_GET: case RTM_CHANGE: case RTM_LOCK: rnh = V_rt_tables[so->so_fibnum][info.rti_info[RTAX_DST]->sa_family]; if (rnh == NULL) senderr(EAFNOSUPPORT); RADIX_NODE_HEAD_RLOCK(rnh); rt = (struct rtentry *) rnh->rnh_lookup(info.rti_info[RTAX_DST], info.rti_info[RTAX_NETMASK], rnh); if (rt == NULL) { /* XXX looks bogus */ RADIX_NODE_HEAD_RUNLOCK(rnh); senderr(ESRCH); } #ifdef RADIX_MPATH /* * for RTM_CHANGE/LOCK, if we got multipath routes, * we require users to specify a matching RTAX_GATEWAY. * * for RTM_GET, gate is optional even with multipath. * if gate == NULL the first match is returned. * (no need to call rt_mpath_matchgate if gate == NULL) */ if (rn_mpath_capable(rnh) && (rtm->rtm_type != RTM_GET || info.rti_info[RTAX_GATEWAY])) { rt = rt_mpath_matchgate(rt, info.rti_info[RTAX_GATEWAY]); if (!rt) { RADIX_NODE_HEAD_RUNLOCK(rnh); senderr(ESRCH); } } #endif RT_LOCK(rt); RT_ADDREF(rt); RADIX_NODE_HEAD_RUNLOCK(rnh); /* * Fix for PR: 82974 * * RTM_CHANGE/LOCK need a perfect match, rn_lookup() * returns a perfect match in case a netmask is * specified. For host routes only a longest prefix * match is returned so it is necessary to compare the * existence of the netmask. If both have a netmask * rnh_lookup() did a perfect match and if none of them * have a netmask both are host routes which is also a * perfect match. 
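 *
 * [Editor's note, not part of this commit: the check below is a boolean
 *  XOR on netmask presence, i.e. the request is rejected only when
 *  exactly one side has a netmask:
 *
 *	rt_mask(rt)	RTAX_NETMASK	!a != !b	result
 *	present		present		false		exact net match, proceed
 *	NULL		NULL		false		both host routes, proceed
 *	present		NULL		true		senderr(ESRCH)
 *	NULL		present		true		senderr(ESRCH)
 *  ]
 *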
*/ if (rtm->rtm_type != RTM_GET && (!rt_mask(rt) != !info.rti_info[RTAX_NETMASK])) { RT_UNLOCK(rt); senderr(ESRCH); } switch(rtm->rtm_type) { case RTM_GET: report: RT_LOCK_ASSERT(rt); info.rti_info[RTAX_DST] = rt_key(rt); info.rti_info[RTAX_GATEWAY] = rt->rt_gateway; info.rti_info[RTAX_NETMASK] = rt_mask(rt); - info.rti_info[RTAX_GENMASK] = rt->rt_genmask; + info.rti_info[RTAX_GENMASK] = 0; if (rtm->rtm_addrs & (RTA_IFP | RTA_IFA)) { ifp = rt->rt_ifp; if (ifp) { info.rti_info[RTAX_IFP] = ifp->if_addr->ifa_addr; if (jailed(so->so_cred)) { error = rtm_get_jailed( &info, ifp, rt, &saun, so->so_cred); if (error != 0) { RT_UNLOCK(rt); senderr(ESRCH); } } else { info.rti_info[RTAX_IFA] = rt->rt_ifa->ifa_addr; } if (ifp->if_flags & IFF_POINTOPOINT) info.rti_info[RTAX_BRD] = rt->rt_ifa->ifa_dstaddr; rtm->rtm_index = ifp->if_index; } else { info.rti_info[RTAX_IFP] = NULL; info.rti_info[RTAX_IFA] = NULL; } } else if ((ifp = rt->rt_ifp) != NULL) { rtm->rtm_index = ifp->if_index; } len = rt_msg2(rtm->rtm_type, &info, NULL, NULL); if (len > rtm->rtm_msglen) { struct rt_msghdr *new_rtm; R_Malloc(new_rtm, struct rt_msghdr *, len); if (new_rtm == NULL) { RT_UNLOCK(rt); senderr(ENOBUFS); } bcopy(rtm, new_rtm, rtm->rtm_msglen); Free(rtm); rtm = new_rtm; } (void)rt_msg2(rtm->rtm_type, &info, (caddr_t)rtm, NULL); rtm->rtm_flags = rt->rt_flags; rtm->rtm_use = 0; rt_getmetrics(&rt->rt_rmx, &rtm->rtm_rmx); rtm->rtm_addrs = info.rti_addrs; break; case RTM_CHANGE: /* * New gateway could require new ifaddr, ifp; * flags may also be different; ifp may be specified * by ll sockaddr when protocol address is ambiguous */ if (((rt->rt_flags & RTF_GATEWAY) && info.rti_info[RTAX_GATEWAY] != NULL) || info.rti_info[RTAX_IFP] != NULL || (info.rti_info[RTAX_IFA] != NULL && !sa_equal(info.rti_info[RTAX_IFA], rt->rt_ifa->ifa_addr))) { RT_UNLOCK(rt); RADIX_NODE_HEAD_LOCK(rnh); error = rt_getifa_fib(&info, rt->rt_fibnum); RADIX_NODE_HEAD_UNLOCK(rnh); if (error != 0) senderr(error); RT_LOCK(rt); } if (info.rti_ifa != NULL && info.rti_ifa != rt->rt_ifa && rt->rt_ifa != NULL && rt->rt_ifa->ifa_rtrequest != NULL) { rt->rt_ifa->ifa_rtrequest(RTM_DELETE, rt, &info); IFAFREE(rt->rt_ifa); } if (info.rti_info[RTAX_GATEWAY] != NULL) { RT_UNLOCK(rt); RADIX_NODE_HEAD_LOCK(rnh); RT_LOCK(rt); error = rt_setgate(rt, rt_key(rt), info.rti_info[RTAX_GATEWAY]); RADIX_NODE_HEAD_UNLOCK(rnh); if (error != 0) { RT_UNLOCK(rt); senderr(error); } - if (!(rt->rt_flags & RTF_LLINFO)) - rt->rt_flags |= RTF_GATEWAY; + rt->rt_flags |= RTF_GATEWAY; } if (info.rti_ifa != NULL && info.rti_ifa != rt->rt_ifa) { IFAREF(info.rti_ifa); rt->rt_ifa = info.rti_ifa; rt->rt_ifp = info.rti_ifp; } /* Allow some flags to be toggled on change. */ if (rtm->rtm_fmask & RTF_FMASK) rt->rt_flags = (rt->rt_flags & ~rtm->rtm_fmask) | (rtm->rtm_flags & rtm->rtm_fmask); rt_setmetrics(rtm->rtm_inits, &rtm->rtm_rmx, &rt->rt_rmx); rtm->rtm_index = rt->rt_ifp->if_index; if (rt->rt_ifa && rt->rt_ifa->ifa_rtrequest) rt->rt_ifa->ifa_rtrequest(RTM_ADD, rt, &info); - if (info.rti_info[RTAX_GENMASK]) - rt->rt_genmask = info.rti_info[RTAX_GENMASK]; /* FALLTHROUGH */ case RTM_LOCK: /* We don't support locks anymore */ break; } RT_UNLOCK(rt); break; default: senderr(EOPNOTSUPP); } flush: if (rtm) { if (error) rtm->rtm_errno = error; else rtm->rtm_flags |= RTF_DONE; } if (rt) /* XXX can this be true? */ RTFREE(rt); { struct rawcb *rp = NULL; /* * Check to see if we don't want our own messages. 
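 *
 * [Editor's sketch, not part of this commit: the userland side of this
 *  check.  A process that does not want copies of its own requests can
 *  clear SO_USELOOPBACK (on by default, see rts_attach() above) on its
 *  routing socket; <sys/socket.h> is assumed.]
 *
 *	int s = socket(PF_ROUTE, SOCK_RAW, AF_UNSPEC);
 *	int off = 0;
 *
 *	if (s != -1)
 *		setsockopt(s, SOL_SOCKET, SO_USELOOPBACK, &off, sizeof(off));
 *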
*/ if ((so->so_options & SO_USELOOPBACK) == 0) { if (route_cb.any_count <= 1) { if (rtm) Free(rtm); m_freem(m); return (error); } /* There is another listener, so construct message */ rp = sotorawcb(so); } if (rtm) { m_copyback(m, 0, rtm->rtm_msglen, (caddr_t)rtm); if (m->m_pkthdr.len < rtm->rtm_msglen) { m_freem(m); m = NULL; } else if (m->m_pkthdr.len > rtm->rtm_msglen) m_adj(m, rtm->rtm_msglen - m->m_pkthdr.len); Free(rtm); } if (m) { if (rp) { /* * XXX insure we don't get a copy by * invalidating our protocol */ unsigned short family = rp->rcb_proto.sp_family; rp->rcb_proto.sp_family = 0; rt_dispatch(m, info.rti_info[RTAX_DST]); rp->rcb_proto.sp_family = family; } else rt_dispatch(m, info.rti_info[RTAX_DST]); } } return (error); #undef sa_equal } static void rt_setmetrics(u_long which, const struct rt_metrics *in, struct rt_metrics_lite *out) { #define metric(f, e) if (which & (f)) out->e = in->e; /* * Only these are stored in the routing entry since introduction * of tcp hostcache. The rest is ignored. */ metric(RTV_MTU, rmx_mtu); /* Userland -> kernel timebase conversion. */ if (which & RTV_EXPIRE) out->rmx_expire = in->rmx_expire ? in->rmx_expire - time_second + time_uptime : 0; #undef metric } static void rt_getmetrics(const struct rt_metrics_lite *in, struct rt_metrics *out) { #define metric(e) out->e = in->e; bzero(out, sizeof(*out)); metric(rmx_mtu); /* Kernel -> userland timebase conversion. */ out->rmx_expire = in->rmx_expire ? in->rmx_expire - time_uptime + time_second : 0; #undef metric } /* * Extract the addresses of the passed sockaddrs. * Do a little sanity checking so as to avoid bad memory references. * This data is derived straight from userland. */ static int rt_xaddrs(caddr_t cp, caddr_t cplim, struct rt_addrinfo *rtinfo) { struct sockaddr *sa; int i; for (i = 0; i < RTAX_MAX && cp < cplim; i++) { if ((rtinfo->rti_addrs & (1 << i)) == 0) continue; sa = (struct sockaddr *)cp; /* * It won't fit. */ if (cp + sa->sa_len > cplim) return (EINVAL); /* * there are no more.. quit now * If there are more bits, they are in error. * I've seen this. route(1) can evidently generate these. * This causes kernel to core dump. * for compatibility, If we see this, point to a safe address. 
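 *
 * [Editor's sketch, not part of this commit: the well-formed input this
 *  parser expects -- each sockaddr is rounded as SA_SIZE() assumes and
 *  announced in rtm_addrs.  "s" is assumed to be a PF_ROUTE socket and
 *  the address is only an example.]
 *
 *	struct {
 *		struct rt_msghdr hdr;
 *		struct sockaddr_in dst;	/* 16 bytes, already rounded */
 *	} msg;
 *
 *	memset(&msg, 0, sizeof(msg));
 *	msg.hdr.rtm_msglen = sizeof(msg);
 *	msg.hdr.rtm_version = RTM_VERSION;
 *	msg.hdr.rtm_type = RTM_GET;
 *	msg.hdr.rtm_seq = 1;
 *	msg.hdr.rtm_addrs = RTA_DST;
 *	msg.dst.sin_len = sizeof(msg.dst);
 *	msg.dst.sin_family = AF_INET;
 *	msg.dst.sin_addr.s_addr = inet_addr("198.51.100.1");
 *	write(s, &msg, sizeof(msg));
 *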
*/ if (sa->sa_len == 0) { rtinfo->rti_info[i] = &sa_zero; return (0); /* should be EINVAL but for compat */ } /* accept it */ rtinfo->rti_info[i] = sa; cp += SA_SIZE(sa); } return (0); } static struct mbuf * rt_msg1(int type, struct rt_addrinfo *rtinfo) { struct rt_msghdr *rtm; struct mbuf *m; int i; struct sockaddr *sa; int len, dlen; switch (type) { case RTM_DELADDR: case RTM_NEWADDR: len = sizeof(struct ifa_msghdr); break; case RTM_DELMADDR: case RTM_NEWMADDR: len = sizeof(struct ifma_msghdr); break; case RTM_IFINFO: len = sizeof(struct if_msghdr); break; case RTM_IFANNOUNCE: case RTM_IEEE80211: len = sizeof(struct if_announcemsghdr); break; default: len = sizeof(struct rt_msghdr); } if (len > MCLBYTES) panic("rt_msg1"); m = m_gethdr(M_DONTWAIT, MT_DATA); if (m && len > MHLEN) { MCLGET(m, M_DONTWAIT); if ((m->m_flags & M_EXT) == 0) { m_free(m); m = NULL; } } if (m == NULL) return (m); m->m_pkthdr.len = m->m_len = len; m->m_pkthdr.rcvif = NULL; rtm = mtod(m, struct rt_msghdr *); bzero((caddr_t)rtm, len); for (i = 0; i < RTAX_MAX; i++) { if ((sa = rtinfo->rti_info[i]) == NULL) continue; rtinfo->rti_addrs |= (1 << i); dlen = SA_SIZE(sa); m_copyback(m, len, dlen, (caddr_t)sa); len += dlen; } if (m->m_pkthdr.len != len) { m_freem(m); return (NULL); } rtm->rtm_msglen = len; rtm->rtm_version = RTM_VERSION; rtm->rtm_type = type; return (m); } static int rt_msg2(int type, struct rt_addrinfo *rtinfo, caddr_t cp, struct walkarg *w) { int i; int len, dlen, second_time = 0; caddr_t cp0; rtinfo->rti_addrs = 0; again: switch (type) { case RTM_DELADDR: case RTM_NEWADDR: len = sizeof(struct ifa_msghdr); break; case RTM_IFINFO: len = sizeof(struct if_msghdr); break; case RTM_NEWMADDR: len = sizeof(struct ifma_msghdr); break; default: len = sizeof(struct rt_msghdr); } cp0 = cp; if (cp0) cp += len; for (i = 0; i < RTAX_MAX; i++) { struct sockaddr *sa; if ((sa = rtinfo->rti_info[i]) == NULL) continue; rtinfo->rti_addrs |= (1 << i); dlen = SA_SIZE(sa); if (cp) { bcopy((caddr_t)sa, cp, (unsigned)dlen); cp += dlen; } len += dlen; } len = ALIGN(len); if (cp == NULL && w != NULL && !second_time) { struct walkarg *rw = w; if (rw->w_req) { if (rw->w_tmemsize < len) { if (rw->w_tmem) free(rw->w_tmem, M_RTABLE); rw->w_tmem = (caddr_t) malloc(len, M_RTABLE, M_NOWAIT); if (rw->w_tmem) rw->w_tmemsize = len; } if (rw->w_tmem) { cp = rw->w_tmem; second_time = 1; goto again; } } } if (cp) { struct rt_msghdr *rtm = (struct rt_msghdr *)cp0; rtm->rtm_version = RTM_VERSION; rtm->rtm_type = type; rtm->rtm_msglen = len; } return (len); } /* * This routine is called to generate a message from the routing * socket indicating that a redirect has occured, a routing lookup * has failed, or that a protocol has detected timeouts to a particular * destination. */ void rt_missmsg(int type, struct rt_addrinfo *rtinfo, int flags, int error) { struct rt_msghdr *rtm; struct mbuf *m; struct sockaddr *sa = rtinfo->rti_info[RTAX_DST]; if (route_cb.any_count == 0) return; m = rt_msg1(type, rtinfo); if (m == NULL) return; rtm = mtod(m, struct rt_msghdr *); rtm->rtm_flags = RTF_DONE | flags; rtm->rtm_errno = error; rtm->rtm_addrs = rtinfo->rti_addrs; rt_dispatch(m, sa); } /* * This routine is called to generate a message from the routing * socket indicating that the status of a network interface has changed. 
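 *
 * [Editor's sketch, not part of this commit: how a routing-socket
 *  listener might consume the RTM_IFINFO messages generated below;
 *  "s" is assumed to be a PF_ROUTE socket and IFF_UP comes from
 *  <net/if.h>.]
 *
 *	union {
 *		struct if_msghdr ifm;
 *		char data[2048];
 *	} buf;
 *	ssize_t n = read(s, &buf, sizeof(buf));
 *
 *	if (n >= (ssize_t)sizeof(buf.ifm) &&
 *	    buf.ifm.ifm_version == RTM_VERSION &&
 *	    buf.ifm.ifm_type == RTM_IFINFO)
 *		printf("if_index %u is %s\n", (unsigned)buf.ifm.ifm_index,
 *		    (buf.ifm.ifm_flags & IFF_UP) ? "up" : "down");
 *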
*/ void rt_ifmsg(struct ifnet *ifp) { struct if_msghdr *ifm; struct mbuf *m; struct rt_addrinfo info; if (route_cb.any_count == 0) return; bzero((caddr_t)&info, sizeof(info)); m = rt_msg1(RTM_IFINFO, &info); if (m == NULL) return; ifm = mtod(m, struct if_msghdr *); ifm->ifm_index = ifp->if_index; ifm->ifm_flags = ifp->if_flags | ifp->if_drv_flags; ifm->ifm_data = ifp->if_data; ifm->ifm_addrs = 0; rt_dispatch(m, NULL); } /* * This is called to generate messages from the routing socket * indicating a network interface has had addresses associated with it. * if we ever reverse the logic and replace messages TO the routing * socket indicate a request to configure interfaces, then it will * be unnecessary as the routing socket will automatically generate * copies of it. */ void rt_newaddrmsg(int cmd, struct ifaddr *ifa, int error, struct rtentry *rt) { struct rt_addrinfo info; struct sockaddr *sa = NULL; int pass; struct mbuf *m = NULL; struct ifnet *ifp = ifa->ifa_ifp; KASSERT(cmd == RTM_ADD || cmd == RTM_DELETE, ("unexpected cmd %u", cmd)); #ifdef SCTP /* * notify the SCTP stack * this will only get called when an address is added/deleted * XXX pass the ifaddr struct instead if ifa->ifa_addr... */ sctp_addr_change(ifa, cmd); #endif /* SCTP */ if (route_cb.any_count == 0) return; for (pass = 1; pass < 3; pass++) { bzero((caddr_t)&info, sizeof(info)); if ((cmd == RTM_ADD && pass == 1) || (cmd == RTM_DELETE && pass == 2)) { struct ifa_msghdr *ifam; int ncmd = cmd == RTM_ADD ? RTM_NEWADDR : RTM_DELADDR; info.rti_info[RTAX_IFA] = sa = ifa->ifa_addr; info.rti_info[RTAX_IFP] = ifp->if_addr->ifa_addr; info.rti_info[RTAX_NETMASK] = ifa->ifa_netmask; info.rti_info[RTAX_BRD] = ifa->ifa_dstaddr; if ((m = rt_msg1(ncmd, &info)) == NULL) continue; ifam = mtod(m, struct ifa_msghdr *); ifam->ifam_index = ifp->if_index; ifam->ifam_metric = ifa->ifa_metric; ifam->ifam_flags = ifa->ifa_flags; ifam->ifam_addrs = info.rti_addrs; } if ((cmd == RTM_ADD && pass == 2) || (cmd == RTM_DELETE && pass == 1)) { struct rt_msghdr *rtm; if (rt == NULL) continue; info.rti_info[RTAX_NETMASK] = rt_mask(rt); info.rti_info[RTAX_DST] = sa = rt_key(rt); info.rti_info[RTAX_GATEWAY] = rt->rt_gateway; if ((m = rt_msg1(cmd, &info)) == NULL) continue; rtm = mtod(m, struct rt_msghdr *); rtm->rtm_index = ifp->if_index; rtm->rtm_flags |= rt->rt_flags; rtm->rtm_errno = error; rtm->rtm_addrs = info.rti_addrs; } rt_dispatch(m, sa); } } /* * This is the analogue to the rt_newaddrmsg which performs the same * function but for multicast group memberhips. This is easier since * there is no route state to worry about. */ void rt_newmaddrmsg(int cmd, struct ifmultiaddr *ifma) { struct rt_addrinfo info; struct mbuf *m = NULL; struct ifnet *ifp = ifma->ifma_ifp; struct ifma_msghdr *ifmam; if (route_cb.any_count == 0) return; bzero((caddr_t)&info, sizeof(info)); info.rti_info[RTAX_IFA] = ifma->ifma_addr; info.rti_info[RTAX_IFP] = ifp ? ifp->if_addr->ifa_addr : NULL; /* * If a link-layer address is present, present it as a ``gateway'' * (similarly to how ARP entries, e.g., are presented). 
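 *
 * [Editor's sketch, not part of this commit: a consumer that wants the
 *  link-layer address back out of such a ``gateway'' treats it as a
 *  sockaddr_dl from <net/if_dl.h>; "sa", "mac" and the Ethernet length
 *  check are assumptions.]
 *
 *	struct sockaddr_dl *sdl;
 *
 *	if (sa != NULL && sa->sa_family == AF_LINK) {
 *		sdl = (struct sockaddr_dl *)sa;
 *		if (sdl->sdl_alen == ETHER_ADDR_LEN)
 *			memcpy(mac, LLADDR(sdl), ETHER_ADDR_LEN);
 *	}
 *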
*/ info.rti_info[RTAX_GATEWAY] = ifma->ifma_lladdr; m = rt_msg1(cmd, &info); if (m == NULL) return; ifmam = mtod(m, struct ifma_msghdr *); KASSERT(ifp != NULL, ("%s: link-layer multicast address w/o ifp\n", __func__)); ifmam->ifmam_index = ifp->if_index; ifmam->ifmam_addrs = info.rti_addrs; rt_dispatch(m, ifma->ifma_addr); } static struct mbuf * rt_makeifannouncemsg(struct ifnet *ifp, int type, int what, struct rt_addrinfo *info) { struct if_announcemsghdr *ifan; struct mbuf *m; if (route_cb.any_count == 0) return NULL; bzero((caddr_t)info, sizeof(*info)); m = rt_msg1(type, info); if (m != NULL) { ifan = mtod(m, struct if_announcemsghdr *); ifan->ifan_index = ifp->if_index; strlcpy(ifan->ifan_name, ifp->if_xname, sizeof(ifan->ifan_name)); ifan->ifan_what = what; } return m; } /* * This is called to generate routing socket messages indicating * IEEE80211 wireless events. * XXX we piggyback on the RTM_IFANNOUNCE msg format in a clumsy way. */ void rt_ieee80211msg(struct ifnet *ifp, int what, void *data, size_t data_len) { struct mbuf *m; struct rt_addrinfo info; m = rt_makeifannouncemsg(ifp, RTM_IEEE80211, what, &info); if (m != NULL) { /* * Append the ieee80211 data. Try to stick it in the * mbuf containing the ifannounce msg; otherwise allocate * a new mbuf and append. * * NB: we assume m is a single mbuf. */ if (data_len > M_TRAILINGSPACE(m)) { struct mbuf *n = m_get(M_NOWAIT, MT_DATA); if (n == NULL) { m_freem(m); return; } bcopy(data, mtod(n, void *), data_len); n->m_len = data_len; m->m_next = n; } else if (data_len > 0) { bcopy(data, mtod(m, u_int8_t *) + m->m_len, data_len); m->m_len += data_len; } if (m->m_flags & M_PKTHDR) m->m_pkthdr.len += data_len; mtod(m, struct if_announcemsghdr *)->ifan_msglen += data_len; rt_dispatch(m, NULL); } } /* * This is called to generate routing socket messages indicating * network interface arrival and departure. */ void rt_ifannouncemsg(struct ifnet *ifp, int what) { struct mbuf *m; struct rt_addrinfo info; m = rt_makeifannouncemsg(ifp, RTM_IFANNOUNCE, what, &info); if (m != NULL) rt_dispatch(m, NULL); } static void rt_dispatch(struct mbuf *m, const struct sockaddr *sa) { INIT_VNET_NET(curvnet); struct m_tag *tag; /* * Preserve the family from the sockaddr, if any, in an m_tag for * use when injecting the mbuf into the routing socket buffer from * the netisr. */ if (sa != NULL) { tag = m_tag_get(PACKET_TAG_RTSOCKFAM, sizeof(unsigned short), M_NOWAIT); if (tag == NULL) { m_freem(m); return; } *(unsigned short *)(tag + 1) = sa->sa_family; m_tag_prepend(m, tag); } netisr_queue(NETISR_ROUTE, m); /* mbuf is free'd on failure. */ } /* * This is used in dumping the kernel table via sysctl(). 
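 *
 * [Editor's sketch, not part of this commit: the usual userland entry
 *  point into this dump, in the style of netstat(1)-like tools.]
 *
 *	int mib[6] = { CTL_NET, PF_ROUTE, 0, AF_INET, NET_RT_DUMP, 0 };
 *	size_t needed;
 *	char *buf;
 *
 *	if (sysctl(mib, 6, NULL, &needed, NULL, 0) == 0 &&
 *	    (buf = malloc(needed)) != NULL) {
 *		if (sysctl(mib, 6, buf, &needed, NULL, 0) == 0) {
 *			/* buf now holds the rt_msghdr records built here */
 *		}
 *		free(buf);
 *	}
 *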
*/ static int sysctl_dumpentry(struct radix_node *rn, void *vw) { struct walkarg *w = vw; struct rtentry *rt = (struct rtentry *)rn; int error = 0, size; struct rt_addrinfo info; if (w->w_op == NET_RT_FLAGS && !(rt->rt_flags & w->w_arg)) return 0; bzero((caddr_t)&info, sizeof(info)); info.rti_info[RTAX_DST] = rt_key(rt); info.rti_info[RTAX_GATEWAY] = rt->rt_gateway; info.rti_info[RTAX_NETMASK] = rt_mask(rt); - info.rti_info[RTAX_GENMASK] = rt->rt_genmask; + info.rti_info[RTAX_GENMASK] = 0; if (rt->rt_ifp) { info.rti_info[RTAX_IFP] = rt->rt_ifp->if_addr->ifa_addr; info.rti_info[RTAX_IFA] = rt->rt_ifa->ifa_addr; if (rt->rt_ifp->if_flags & IFF_POINTOPOINT) info.rti_info[RTAX_BRD] = rt->rt_ifa->ifa_dstaddr; } size = rt_msg2(RTM_GET, &info, NULL, w); if (w->w_req && w->w_tmem) { struct rt_msghdr *rtm = (struct rt_msghdr *)w->w_tmem; rtm->rtm_flags = rt->rt_flags; rtm->rtm_use = rt->rt_rmx.rmx_pksent; rt_getmetrics(&rt->rt_rmx, &rtm->rtm_rmx); rtm->rtm_index = rt->rt_ifp->if_index; rtm->rtm_errno = rtm->rtm_pid = rtm->rtm_seq = 0; rtm->rtm_addrs = info.rti_addrs; error = SYSCTL_OUT(w->w_req, (caddr_t)rtm, size); return (error); } return (error); } static int sysctl_iflist(int af, struct walkarg *w) { INIT_VNET_NET(curvnet); struct ifnet *ifp; struct ifaddr *ifa; struct rt_addrinfo info; int len, error = 0; bzero((caddr_t)&info, sizeof(info)); IFNET_RLOCK(); TAILQ_FOREACH(ifp, &V_ifnet, if_link) { if (w->w_arg && w->w_arg != ifp->if_index) continue; ifa = ifp->if_addr; info.rti_info[RTAX_IFP] = ifa->ifa_addr; len = rt_msg2(RTM_IFINFO, &info, NULL, w); info.rti_info[RTAX_IFP] = NULL; if (w->w_req && w->w_tmem) { struct if_msghdr *ifm; ifm = (struct if_msghdr *)w->w_tmem; ifm->ifm_index = ifp->if_index; ifm->ifm_flags = ifp->if_flags | ifp->if_drv_flags; ifm->ifm_data = ifp->if_data; ifm->ifm_addrs = info.rti_addrs; error = SYSCTL_OUT(w->w_req,(caddr_t)ifm, len); if (error) goto done; } while ((ifa = TAILQ_NEXT(ifa, ifa_link)) != NULL) { if (af && af != ifa->ifa_addr->sa_family) continue; if (jailed(curthread->td_ucred) && !prison_if(curthread->td_ucred, ifa->ifa_addr)) continue; info.rti_info[RTAX_IFA] = ifa->ifa_addr; info.rti_info[RTAX_NETMASK] = ifa->ifa_netmask; info.rti_info[RTAX_BRD] = ifa->ifa_dstaddr; len = rt_msg2(RTM_NEWADDR, &info, NULL, w); if (w->w_req && w->w_tmem) { struct ifa_msghdr *ifam; ifam = (struct ifa_msghdr *)w->w_tmem; ifam->ifam_index = ifa->ifa_ifp->if_index; ifam->ifam_flags = ifa->ifa_flags; ifam->ifam_metric = ifa->ifa_metric; ifam->ifam_addrs = info.rti_addrs; error = SYSCTL_OUT(w->w_req, w->w_tmem, len); if (error) goto done; } } info.rti_info[RTAX_IFA] = info.rti_info[RTAX_NETMASK] = info.rti_info[RTAX_BRD] = NULL; } done: IFNET_RUNLOCK(); return (error); } int sysctl_ifmalist(int af, struct walkarg *w) { INIT_VNET_NET(curvnet); struct ifnet *ifp; struct ifmultiaddr *ifma; struct rt_addrinfo info; int len, error = 0; struct ifaddr *ifa; bzero((caddr_t)&info, sizeof(info)); IFNET_RLOCK(); TAILQ_FOREACH(ifp, &V_ifnet, if_link) { if (w->w_arg && w->w_arg != ifp->if_index) continue; ifa = ifp->if_addr; info.rti_info[RTAX_IFP] = ifa ? ifa->ifa_addr : NULL; IF_ADDR_LOCK(ifp); TAILQ_FOREACH(ifma, &ifp->if_multiaddrs, ifma_link) { if (af && af != ifma->ifma_addr->sa_family) continue; if (jailed(curproc->p_ucred) && !prison_if(curproc->p_ucred, ifma->ifma_addr)) continue; info.rti_info[RTAX_IFA] = ifma->ifma_addr; info.rti_info[RTAX_GATEWAY] = (ifma->ifma_addr->sa_family != AF_LINK) ? 
ifma->ifma_lladdr : NULL; len = rt_msg2(RTM_NEWMADDR, &info, NULL, w); if (w->w_req && w->w_tmem) { struct ifma_msghdr *ifmam; ifmam = (struct ifma_msghdr *)w->w_tmem; ifmam->ifmam_index = ifma->ifma_ifp->if_index; ifmam->ifmam_flags = 0; ifmam->ifmam_addrs = info.rti_addrs; error = SYSCTL_OUT(w->w_req, w->w_tmem, len); if (error) { IF_ADDR_UNLOCK(ifp); goto done; } } } IF_ADDR_UNLOCK(ifp); } done: IFNET_RUNLOCK(); return (error); } static int sysctl_rtsock(SYSCTL_HANDLER_ARGS) { INIT_VNET_NET(curvnet); int *name = (int *)arg1; u_int namelen = arg2; struct radix_node_head *rnh; int i, lim, error = EINVAL; u_char af; struct walkarg w; name ++; namelen--; if (req->newptr) return (EPERM); if (namelen != 3) return ((namelen < 3) ? EISDIR : ENOTDIR); af = name[0]; if (af > AF_MAX) return (EINVAL); bzero(&w, sizeof(w)); w.w_op = name[1]; w.w_arg = name[2]; w.w_req = req; error = sysctl_wire_old_buffer(req, 0); if (error) return (error); switch (w.w_op) { case NET_RT_DUMP: case NET_RT_FLAGS: if (af == 0) { /* dump all tables */ i = 1; lim = AF_MAX; } else /* dump only one table */ i = lim = af; for (error = 0; error == 0 && i <= lim; i++) if ((rnh = V_rt_tables[curthread->td_proc->p_fibnum][i]) != NULL) { RADIX_NODE_HEAD_LOCK(rnh); error = rnh->rnh_walktree(rnh, sysctl_dumpentry, &w); RADIX_NODE_HEAD_UNLOCK(rnh); } else if (af != 0) error = EAFNOSUPPORT; + /* + * take care of llinfo entries + */ + if (w.w_op == NET_RT_FLAGS) + error = lltable_sysctl_dumparp(af, w.w_req); break; case NET_RT_IFLIST: error = sysctl_iflist(af, &w); break; case NET_RT_IFMALIST: error = sysctl_ifmalist(af, &w); break; } if (w.w_tmem) free(w.w_tmem, M_RTABLE); return (error); } SYSCTL_NODE(_net, PF_ROUTE, routetable, CTLFLAG_RD, sysctl_rtsock, ""); /* * Definitions of protocols supported in the ROUTE domain. */ static struct domain routedomain; /* or at least forward */ static struct protosw routesw[] = { { .pr_type = SOCK_RAW, .pr_domain = &routedomain, .pr_flags = PR_ATOMIC|PR_ADDR, .pr_output = route_output, .pr_ctlinput = raw_ctlinput, .pr_init = raw_init, .pr_usrreqs = &route_usrreqs } }; static struct domain routedomain = { .dom_family = PF_ROUTE, .dom_name = "route", .dom_protosw = routesw, .dom_protoswNPROTOSW = &routesw[sizeof(routesw)/sizeof(routesw[0])] }; DOMAIN_SET(route); Index: head/sys/netgraph/netflow/netflow.c =================================================================== --- head/sys/netgraph/netflow/netflow.c (revision 186118) +++ head/sys/netgraph/netflow/netflow.c (revision 186119) @@ -1,711 +1,711 @@ /*- * Copyright (c) 2004-2005 Gleb Smirnoff * Copyright (c) 2001-2003 Roman V. Palagin * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $SourceForge: netflow.c,v 1.41 2004/09/05 11:41:10 glebius Exp $ */ static const char rcs_id[] = "@(#) $FreeBSD$"; #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #define NBUCKETS (65536) /* must be power of 2 */ /* This hash is for TCP or UDP packets. */ #define FULL_HASH(addr1, addr2, port1, port2) \ (((addr1 ^ (addr1 >> 16) ^ \ htons(addr2 ^ (addr2 >> 16))) ^ \ port1 ^ htons(port2)) & \ (NBUCKETS - 1)) /* This hash is for all other IP packets. */ #define ADDR_HASH(addr1, addr2) \ ((addr1 ^ (addr1 >> 16) ^ \ htons(addr2 ^ (addr2 >> 16))) & \ (NBUCKETS - 1)) /* Macros to shorten logical constructions */ /* XXX: priv must exist in namespace */ #define INACTIVE(fle) (time_uptime - fle->f.last > priv->info.nfinfo_inact_t) #define AGED(fle) (time_uptime - fle->f.first > priv->info.nfinfo_act_t) #define ISFREE(fle) (fle->f.packets == 0) /* * 4 is a magical number: statistically number of 4-packet flows is * bigger than 5,6,7...-packet flows by an order of magnitude. Most UDP/ICMP * scans are 1 packet (~ 90% of flow cache). TCP scans are 2-packet in case * of reachable host and 4-packet otherwise. */ #define SMALL(fle) (fle->f.packets <= 4) /* * Cisco uses milliseconds for uptime. Bad idea, since it overflows * every 48+ days. But we will do same to keep compatibility. This macro * does overflowable multiplication to 1000. */ #define MILLIUPTIME(t) (((t) << 9) + /* 512 */ \ ((t) << 8) + /* 256 */ \ ((t) << 7) + /* 128 */ \ ((t) << 6) + /* 64 */ \ ((t) << 5) + /* 32 */ \ ((t) << 3)) /* 8 */ MALLOC_DECLARE(M_NETFLOW_HASH); MALLOC_DEFINE(M_NETFLOW_HASH, "netflow_hash", "NetFlow hash"); static int export_add(item_p, struct flow_entry *); static int export_send(priv_p, item_p, int flags); /* Generate hash for a given flow record. */ static __inline uint32_t ip_hash(struct flow_rec *r) { switch (r->r_ip_p) { case IPPROTO_TCP: case IPPROTO_UDP: return FULL_HASH(r->r_src.s_addr, r->r_dst.s_addr, r->r_sport, r->r_dport); default: return ADDR_HASH(r->r_src.s_addr, r->r_dst.s_addr); } } /* This is callback from uma(9), called on alloc. */ static int uma_ctor_flow(void *mem, int size, void *arg, int how) { priv_p priv = (priv_p )arg; if (atomic_load_acq_32(&priv->info.nfinfo_used) >= CACHESIZE) return (ENOMEM); atomic_add_32(&priv->info.nfinfo_used, 1); return (0); } /* This is callback from uma(9), called on free. */ static void uma_dtor_flow(void *mem, int size, void *arg) { priv_p priv = (priv_p )arg; atomic_subtract_32(&priv->info.nfinfo_used, 1); } /* * Detach export datagram from priv, if there is any. * If there is no, allocate a new one. 
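 *
 * [Editor's note, not part of this commit: a quick check of the
 *  MILLIUPTIME() macro above -- the sum of shifts is an exact
 *  multiplication by 1000,
 *
 *	512 + 256 + 128 + 64 + 32 + 8 = 1000
 *
 *  so the only imprecision is the 32-bit wrap its comment already
 *  mentions.]
 *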
*/ static item_p get_export_dgram(priv_p priv) { item_p item = NULL; mtx_lock(&priv->export_mtx); if (priv->export_item != NULL) { item = priv->export_item; priv->export_item = NULL; } mtx_unlock(&priv->export_mtx); if (item == NULL) { struct netflow_v5_export_dgram *dgram; struct mbuf *m; m = m_getcl(M_DONTWAIT, MT_DATA, M_PKTHDR); if (m == NULL) return (NULL); item = ng_package_data(m, NG_NOFLAGS); if (item == NULL) return (NULL); dgram = mtod(m, struct netflow_v5_export_dgram *); dgram->header.count = 0; dgram->header.version = htons(NETFLOW_V5); } return (item); } /* * Re-attach incomplete datagram back to priv. * If there is already another one, then send incomplete. */ static void return_export_dgram(priv_p priv, item_p item, int flags) { /* * It may happen on SMP, that some thread has already * put its item there, in this case we bail out and * send what we have to collector. */ mtx_lock(&priv->export_mtx); if (priv->export_item == NULL) { priv->export_item = item; mtx_unlock(&priv->export_mtx); } else { mtx_unlock(&priv->export_mtx); export_send(priv, item, flags); } } /* * The flow is over. Call export_add() and free it. If datagram is * full, then call export_send(). */ static __inline void expire_flow(priv_p priv, item_p *item, struct flow_entry *fle, int flags) { if (*item == NULL) *item = get_export_dgram(priv); if (*item == NULL) { atomic_add_32(&priv->info.nfinfo_export_failed, 1); uma_zfree_arg(priv->zone, fle, priv); return; } if (export_add(*item, fle) > 0) { export_send(priv, *item, flags); *item = NULL; } uma_zfree_arg(priv->zone, fle, priv); } /* Get a snapshot of node statistics */ void ng_netflow_copyinfo(priv_p priv, struct ng_netflow_info *i) { /* XXX: atomic */ memcpy((void *)i, (void *)&priv->info, sizeof(priv->info)); } /* * Insert a record into defined slot. * * First we get for us a free flow entry, then fill in all * possible fields in it. * * TODO: consider dropping hash mutex while filling in datagram, * as this was done in previous version. Need to test & profile * to be sure. */ static __inline int hash_insert(priv_p priv, struct flow_hash_entry *hsh, struct flow_rec *r, int plen, uint8_t tcp_flags) { struct flow_entry *fle; struct sockaddr_in sin; struct rtentry *rt; mtx_assert(&hsh->mtx, MA_OWNED); fle = uma_zalloc_arg(priv->zone, priv, M_NOWAIT); if (fle == NULL) { atomic_add_32(&priv->info.nfinfo_alloc_failed, 1); return (ENOMEM); } /* * Now fle is totally ours. It is detached from all lists, * we can safely edit it. */ bcopy(r, &fle->f.r, sizeof(struct flow_rec)); fle->f.bytes = plen; fle->f.packets = 1; fle->f.tcp_flags = tcp_flags; fle->f.first = fle->f.last = time_uptime; /* * First we do route table lookup on destination address. So we can * fill in out_ifx, dst_mask, nexthop, and dst_as in future releases. */ bzero(&sin, sizeof(sin)); sin.sin_len = sizeof(struct sockaddr_in); sin.sin_family = AF_INET; sin.sin_addr = fle->f.r.r_dst; /* XXX MRT 0 as a default.. need the m here to get fib */ - rt = rtalloc1_fib((struct sockaddr *)&sin, 0, RTF_CLONING, 0); + rt = rtalloc1_fib((struct sockaddr *)&sin, 0, 0, 0); if (rt != NULL) { fle->f.fle_o_ifx = rt->rt_ifp->if_index; if (rt->rt_flags & RTF_GATEWAY && rt->rt_gateway->sa_family == AF_INET) fle->f.next_hop = ((struct sockaddr_in *)(rt->rt_gateway))->sin_addr; if (rt_mask(rt)) fle->f.dst_mask = bitcount32(((struct sockaddr_in *) rt_mask(rt))->sin_addr.s_addr); else if (rt->rt_flags & RTF_HOST) /* Give up. 
We can't determine mask :( */ fle->f.dst_mask = 32; RTFREE_LOCKED(rt); } /* Do route lookup on source address, to fill in src_mask. */ bzero(&sin, sizeof(sin)); sin.sin_len = sizeof(struct sockaddr_in); sin.sin_family = AF_INET; sin.sin_addr = fle->f.r.r_src; /* XXX MRT 0 as a default revisit. need the mbuf for fib*/ - rt = rtalloc1_fib((struct sockaddr *)&sin, 0, RTF_CLONING, 0); + rt = rtalloc1_fib((struct sockaddr *)&sin, 0, 0, 0); if (rt != NULL) { if (rt_mask(rt)) fle->f.src_mask = bitcount32(((struct sockaddr_in *) rt_mask(rt))->sin_addr.s_addr); else if (rt->rt_flags & RTF_HOST) /* Give up. We can't determine mask :( */ fle->f.src_mask = 32; RTFREE_LOCKED(rt); } /* Push new flow at the and of hash. */ TAILQ_INSERT_TAIL(&hsh->head, fle, fle_hash); return (0); } /* * Non-static functions called from ng_netflow.c */ /* Allocate memory and set up flow cache */ int ng_netflow_cache_init(priv_p priv) { struct flow_hash_entry *hsh; int i; /* Initialize cache UMA zone. */ priv->zone = uma_zcreate("NetFlow cache", sizeof(struct flow_entry), uma_ctor_flow, uma_dtor_flow, NULL, NULL, UMA_ALIGN_CACHE, 0); uma_zone_set_max(priv->zone, CACHESIZE); /* Allocate hash. */ priv->hash = malloc(NBUCKETS * sizeof(struct flow_hash_entry), M_NETFLOW_HASH, M_WAITOK | M_ZERO); if (priv->hash == NULL) { uma_zdestroy(priv->zone); return (ENOMEM); } /* Initialize hash. */ for (i = 0, hsh = priv->hash; i < NBUCKETS; i++, hsh++) { mtx_init(&hsh->mtx, "hash mutex", NULL, MTX_DEF); TAILQ_INIT(&hsh->head); } mtx_init(&priv->export_mtx, "export dgram lock", NULL, MTX_DEF); return (0); } /* Free all flow cache memory. Called from node close method. */ void ng_netflow_cache_flush(priv_p priv) { struct flow_entry *fle, *fle1; struct flow_hash_entry *hsh; item_p item = NULL; int i; /* * We are going to free probably billable data. * Expire everything before freeing it. * No locking is required since callout is already drained. */ for (hsh = priv->hash, i = 0; i < NBUCKETS; hsh++, i++) TAILQ_FOREACH_SAFE(fle, &hsh->head, fle_hash, fle1) { TAILQ_REMOVE(&hsh->head, fle, fle_hash); expire_flow(priv, &item, fle, NG_QUEUE); } if (item != NULL) export_send(priv, item, NG_QUEUE); uma_zdestroy(priv->zone); /* Destroy hash mutexes. */ for (i = 0, hsh = priv->hash; i < NBUCKETS; i++, hsh++) mtx_destroy(&hsh->mtx); /* Free hash memory. */ if (priv->hash) free(priv->hash, M_NETFLOW_HASH); mtx_destroy(&priv->export_mtx); } /* Insert packet from into flow cache. */ int ng_netflow_flow_add(priv_p priv, struct ip *ip, unsigned int src_if_index) { register struct flow_entry *fle, *fle1; struct flow_hash_entry *hsh; struct flow_rec r; item_p item = NULL; int hlen, plen; int error = 0; uint8_t tcp_flags = 0; /* Try to fill flow_rec r */ bzero(&r, sizeof(r)); /* check version */ if (ip->ip_v != IPVERSION) return (EINVAL); /* verify min header length */ hlen = ip->ip_hl << 2; if (hlen < sizeof(struct ip)) return (EINVAL); r.r_src = ip->ip_src; r.r_dst = ip->ip_dst; /* save packet length */ plen = ntohs(ip->ip_len); r.r_ip_p = ip->ip_p; r.r_tos = ip->ip_tos; r.r_i_ifx = src_if_index; /* * XXX NOTE: only first fragment of fragmented TCP, UDP and * ICMP packet will be recorded with proper s_port and d_port. * Following fragments will be recorded simply as IP packet with * ip_proto = ip->ip_p and s_port, d_port set to zero. * I know, it looks like bug. But I don't want to re-implement * ip packet assebmling here. 
Anyway, (in)famous trafd works this way - * and nobody complains yet :) */ if ((ip->ip_off & htons(IP_OFFMASK)) == 0) switch(r.r_ip_p) { case IPPROTO_TCP: { register struct tcphdr *tcp; tcp = (struct tcphdr *)((caddr_t )ip + hlen); r.r_sport = tcp->th_sport; r.r_dport = tcp->th_dport; tcp_flags = tcp->th_flags; break; } case IPPROTO_UDP: r.r_ports = *(uint32_t *)((caddr_t )ip + hlen); break; } /* Update node statistics. XXX: race... */ priv->info.nfinfo_packets ++; priv->info.nfinfo_bytes += plen; /* Find hash slot. */ hsh = &priv->hash[ip_hash(&r)]; mtx_lock(&hsh->mtx); /* * Go through hash and find our entry. If we encounter an * entry, that should be expired, purge it. We do a reverse * search since most active entries are first, and most * searches are done on most active entries. */ TAILQ_FOREACH_REVERSE_SAFE(fle, &hsh->head, fhead, fle_hash, fle1) { if (bcmp(&r, &fle->f.r, sizeof(struct flow_rec)) == 0) break; if ((INACTIVE(fle) && SMALL(fle)) || AGED(fle)) { TAILQ_REMOVE(&hsh->head, fle, fle_hash); expire_flow(priv, &item, fle, NG_QUEUE); atomic_add_32(&priv->info.nfinfo_act_exp, 1); } } if (fle) { /* An existent entry. */ fle->f.bytes += plen; fle->f.packets ++; fle->f.tcp_flags |= tcp_flags; fle->f.last = time_uptime; /* * We have the following reasons to expire flow in active way: * - it hit active timeout * - a TCP connection closed * - it is going to overflow counter */ if (tcp_flags & TH_FIN || tcp_flags & TH_RST || AGED(fle) || (fle->f.bytes >= (UINT_MAX - IF_MAXMTU)) ) { TAILQ_REMOVE(&hsh->head, fle, fle_hash); expire_flow(priv, &item, fle, NG_QUEUE); atomic_add_32(&priv->info.nfinfo_act_exp, 1); } else { /* * It is the newest, move it to the tail, * if it isn't there already. Next search will * locate it quicker. */ if (fle != TAILQ_LAST(&hsh->head, fhead)) { TAILQ_REMOVE(&hsh->head, fle, fle_hash); TAILQ_INSERT_TAIL(&hsh->head, fle, fle_hash); } } } else /* A new flow entry. */ error = hash_insert(priv, hsh, &r, plen, tcp_flags); mtx_unlock(&hsh->mtx); if (item != NULL) return_export_dgram(priv, item, NG_QUEUE); return (error); } /* * Return records from cache to userland. * * TODO: matching particular IP should be done in kernel, here. */ int ng_netflow_flow_show(priv_p priv, uint32_t last, struct ng_mesg *resp) { struct flow_hash_entry *hsh; struct flow_entry *fle; struct ngnf_flows *data; int i; data = (struct ngnf_flows *)resp->data; data->last = 0; data->nentries = 0; /* Check if this is a first run */ if (last == 0) { hsh = priv->hash; i = 0; } else { if (last > NBUCKETS-1) return (EINVAL); hsh = priv->hash + last; i = last; } /* * We will transfer not more than NREC_AT_ONCE. More data * will come in next message. * We send current hash index to userland, and userland should * return it back to us. Then, we will restart with new entry. * * The resulting cache snapshot is inaccurate for the * following reasons: * - we skip locked hash entries * - we bail out, if someone wants our entry * - we skip rest of entry, when hit NREC_AT_ONCE */ for (; i < NBUCKETS; hsh++, i++) { if (mtx_trylock(&hsh->mtx) == 0) continue; TAILQ_FOREACH(fle, &hsh->head, fle_hash) { if (hsh->mtx.mtx_lock & MTX_CONTESTED) break; bcopy(&fle->f, &(data->entries[data->nentries]), sizeof(fle->f)); data->nentries++; if (data->nentries == NREC_AT_ONCE) { mtx_unlock(&hsh->mtx); if (++i < NBUCKETS) data->last = i; return (0); } } mtx_unlock(&hsh->mtx); } return (0); } /* We have full datagram in privdata. Send it to export hook. 
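 *
 * [Editor's note, not part of this commit: with the standard NetFlow v5
 *  wire sizes (24-byte header, 48-byte records, at most 30 records per
 *  datagram) the mbuf length computed below tops out at
 *
 *	24 + 30 * 48 = 1464 bytes
 *
 *  which still fits a 1500-byte Ethernet payload after 28 bytes of
 *  IPv4 + UDP headers.]
 *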
*/ static int export_send(priv_p priv, item_p item, int flags) { struct mbuf *m = NGI_M(item); struct netflow_v5_export_dgram *dgram = mtod(m, struct netflow_v5_export_dgram *); struct netflow_v5_header *header = &dgram->header; struct timespec ts; int error = 0; /* Fill mbuf header. */ m->m_len = m->m_pkthdr.len = sizeof(struct netflow_v5_record) * header->count + sizeof(struct netflow_v5_header); /* Fill export header. */ header->sys_uptime = htonl(MILLIUPTIME(time_uptime)); getnanotime(&ts); header->unix_secs = htonl(ts.tv_sec); header->unix_nsecs = htonl(ts.tv_nsec); header->engine_type = 0; header->engine_id = 0; header->pad = 0; header->flow_seq = htonl(atomic_fetchadd_32(&priv->flow_seq, header->count)); header->count = htons(header->count); if (priv->export != NULL) NG_FWD_ITEM_HOOK_FLAGS(error, item, priv->export, flags); else NG_FREE_ITEM(item); return (error); } /* Add export record to dgram. */ static int export_add(item_p item, struct flow_entry *fle) { struct netflow_v5_export_dgram *dgram = mtod(NGI_M(item), struct netflow_v5_export_dgram *); struct netflow_v5_header *header = &dgram->header; struct netflow_v5_record *rec; rec = &dgram->r[header->count]; header->count ++; KASSERT(header->count <= NETFLOW_V5_MAX_RECORDS, ("ng_netflow: export too big")); /* Fill in export record. */ rec->src_addr = fle->f.r.r_src.s_addr; rec->dst_addr = fle->f.r.r_dst.s_addr; rec->next_hop = fle->f.next_hop.s_addr; rec->i_ifx = htons(fle->f.fle_i_ifx); rec->o_ifx = htons(fle->f.fle_o_ifx); rec->packets = htonl(fle->f.packets); rec->octets = htonl(fle->f.bytes); rec->first = htonl(MILLIUPTIME(fle->f.first)); rec->last = htonl(MILLIUPTIME(fle->f.last)); rec->s_port = fle->f.r.r_sport; rec->d_port = fle->f.r.r_dport; rec->flags = fle->f.tcp_flags; rec->prot = fle->f.r.r_ip_p; rec->tos = fle->f.r.r_tos; rec->dst_mask = fle->f.dst_mask; rec->src_mask = fle->f.src_mask; /* Not supported fields. */ rec->src_as = rec->dst_as = 0; if (header->count == NETFLOW_V5_MAX_RECORDS) return (1); /* end of datagram */ else return (0); } /* Periodic flow expiry run. */ void ng_netflow_expire(void *arg) { struct flow_entry *fle, *fle1; struct flow_hash_entry *hsh; priv_p priv = (priv_p )arg; item_p item = NULL; uint32_t used; int i; /* * Going through all the cache. */ for (hsh = priv->hash, i = 0; i < NBUCKETS; hsh++, i++) { /* * Skip entries, that are already being worked on. */ if (mtx_trylock(&hsh->mtx) == 0) continue; used = atomic_load_acq_32(&priv->info.nfinfo_used); TAILQ_FOREACH_SAFE(fle, &hsh->head, fle_hash, fle1) { /* * Interrupt thread wants this entry! * Quick! Quick! Bail out! */ if (hsh->mtx.mtx_lock & MTX_CONTESTED) break; /* * Don't expire aggressively while hash collision * ratio is predicted small. */ if (used <= (NBUCKETS*2) && !INACTIVE(fle)) break; if ((INACTIVE(fle) && (SMALL(fle) || (used > (NBUCKETS*2)))) || AGED(fle)) { TAILQ_REMOVE(&hsh->head, fle, fle_hash); expire_flow(priv, &item, fle, NG_NOFLAGS); used--; atomic_add_32(&priv->info.nfinfo_inact_exp, 1); } } mtx_unlock(&hsh->mtx); } if (item != NULL) return_export_dgram(priv, item, NG_NOFLAGS); /* Schedule next expire. */ callout_reset(&priv->exp_callout, (1*hz), &ng_netflow_expire, (void *)priv); } Index: head/sys/netinet/if_atm.c =================================================================== --- head/sys/netinet/if_atm.c (revision 186118) +++ head/sys/netinet/if_atm.c (revision 186119) @@ -1,370 +1,364 @@ /* $NetBSD: if_atm.c,v 1.6 1996/10/13 02:03:01 christos Exp $ */ /*- * * Copyright (c) 1996 Charles D. 
Cranor and Washington University. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by Charles D. Cranor and * Washington University. * 4. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); /* * IP <=> ATM address resolution. */ #include "opt_inet.h" #include "opt_inet6.h" #include "opt_natm.h" #if defined(INET) || defined(INET6) #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef NATM #include #endif #define SDL(s) ((struct sockaddr_dl *)s) #define GET3BYTE(V, A, L) do { \ (V) = ((A)[0] << 16) | ((A)[1] << 8) | (A)[2]; \ (A) += 3; \ (L) -= 3; \ } while (0) #define GET2BYTE(V, A, L) do { \ (V) = ((A)[0] << 8) | (A)[1]; \ (A) += 2; \ (L) -= 2; \ } while (0) #define GET1BYTE(V, A, L) do { \ (V) = *(A)++; \ (L)--; \ } while (0) /* * atm_rtrequest: handle ATM rt request (in support of generic code) * inputs: "req" = request code * "rt" = route entry * "info" = rt_addrinfo */ void atm_rtrequest(int req, struct rtentry *rt, struct rt_addrinfo *info) { struct sockaddr *gate = rt->rt_gateway; struct atmio_openvcc op; struct atmio_closevcc cl; u_char *addr; u_int alen; #ifdef NATM struct sockaddr_in *sin; struct natmpcb *npcb = NULL; #endif static struct sockaddr_dl null_sdl = {sizeof(null_sdl), AF_LINK}; if (rt->rt_flags & RTF_GATEWAY) /* link level requests only */ return; switch (req) { case RTM_RESOLVE: /* resolve: only happens when cloning */ printf("atm_rtrequest: RTM_RESOLVE request detected?\n"); break; case RTM_ADD: /* * route added by a command (e.g. ifconfig, route, arp...). * * first check to see if this is not a host route, in which * case we are being called via "ifconfig" to set the address. 
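 *
 * [Editor's note, not part of this commit: the 4-byte ``old type''
 *  link-level address parsed below packs the PVC parameters as
 *
 *	byte 0:    flags (ATM_PH_AAL5, ATM_PH_LLCSNAP, ...)
 *	byte 1:    VPI
 *	bytes 2-3: VCI, most significant byte first
 *
 *  so, for example, { ATM_PH_AAL5, 0, 0, 100 } names an AAL5 PVC on
 *  VPI 0, VCI 100.]
 *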
*/ if ((rt->rt_flags & RTF_HOST) == 0) { rt_setgate(rt,rt_key(rt),(struct sockaddr *)&null_sdl); gate = rt->rt_gateway; SDL(gate)->sdl_type = rt->rt_ifp->if_type; SDL(gate)->sdl_index = rt->rt_ifp->if_index; break; } - if ((rt->rt_flags & RTF_CLONING) != 0) { - printf("atm_rtrequest: cloning route detected?\n"); - break; - } if (gate->sa_family != AF_LINK || gate->sa_len < sizeof(null_sdl)) { log(LOG_DEBUG, "atm_rtrequest: bad gateway value"); break; } KASSERT(rt->rt_ifp->if_ioctl != NULL, ("atm_rtrequest: null ioctl")); /* * Parse and verify the link level address as * an open request */ #ifdef NATM NATM_LOCK(); #endif bzero(&op, sizeof(op)); addr = LLADDR(SDL(gate)); alen = SDL(gate)->sdl_alen; if (alen < 4) { printf("%s: bad link-level address\n", __func__); goto failed; } if (alen == 4) { /* old type address */ GET1BYTE(op.param.flags, addr, alen); GET1BYTE(op.param.vpi, addr, alen); GET2BYTE(op.param.vci, addr, alen); op.param.traffic = ATMIO_TRAFFIC_UBR; op.param.aal = (op.param.flags & ATM_PH_AAL5) ? ATMIO_AAL_5 : ATMIO_AAL_0; } else { /* new address */ op.param.aal = ATMIO_AAL_5; GET1BYTE(op.param.flags, addr, alen); op.param.flags &= ATM_PH_LLCSNAP; GET1BYTE(op.param.vpi, addr, alen); GET2BYTE(op.param.vci, addr, alen); GET1BYTE(op.param.traffic, addr, alen); switch (op.param.traffic) { case ATMIO_TRAFFIC_UBR: if (alen >= 3) GET3BYTE(op.param.tparam.pcr, addr, alen); break; case ATMIO_TRAFFIC_CBR: if (alen < 3) goto bad_param; GET3BYTE(op.param.tparam.pcr, addr, alen); break; case ATMIO_TRAFFIC_VBR: if (alen < 3 * 3) goto bad_param; GET3BYTE(op.param.tparam.pcr, addr, alen); GET3BYTE(op.param.tparam.scr, addr, alen); GET3BYTE(op.param.tparam.mbs, addr, alen); break; case ATMIO_TRAFFIC_ABR: if (alen < 4 * 3 + 2 + 1 * 2 + 3) goto bad_param; GET3BYTE(op.param.tparam.pcr, addr, alen); GET3BYTE(op.param.tparam.mcr, addr, alen); GET3BYTE(op.param.tparam.icr, addr, alen); GET3BYTE(op.param.tparam.tbe, addr, alen); GET1BYTE(op.param.tparam.nrm, addr, alen); GET1BYTE(op.param.tparam.trm, addr, alen); GET2BYTE(op.param.tparam.adtf, addr, alen); GET1BYTE(op.param.tparam.rif, addr, alen); GET1BYTE(op.param.tparam.rdf, addr, alen); GET1BYTE(op.param.tparam.cdf, addr, alen); break; default: bad_param: printf("%s: bad traffic params\n", __func__); goto failed; } } op.param.rmtu = op.param.tmtu = rt->rt_ifp->if_mtu; #ifdef NATM /* * let native ATM know we are using this VCI/VPI * (i.e. reserve it) */ sin = (struct sockaddr_in *) rt_key(rt); if (sin->sin_family != AF_INET) goto failed; npcb = npcb_add(NULL, rt->rt_ifp, op.param.vci, op.param.vpi); if (npcb == NULL) goto failed; npcb->npcb_flags |= NPCB_IP; npcb->ipaddr.s_addr = sin->sin_addr.s_addr; /* XXX: move npcb to llinfo when ATM ARP is ready */ rt->rt_llinfo = (caddr_t) npcb; rt->rt_flags |= RTF_LLINFO; #endif /* * let the lower level know this circuit is active */ op.rxhand = NULL; op.param.flags |= ATMIO_FLAG_ASYNC; if (rt->rt_ifp->if_ioctl(rt->rt_ifp, SIOCATMOPENVCC, (caddr_t)&op) != 0) { printf("atm: couldn't add VC\n"); goto failed; } SDL(gate)->sdl_type = rt->rt_ifp->if_type; SDL(gate)->sdl_index = rt->rt_ifp->if_index; #ifdef NATM NATM_UNLOCK(); #endif break; failed: #ifdef NATM if (npcb) { npcb_free(npcb, NPCB_DESTROY); rt->rt_llinfo = NULL; rt->rt_flags &= ~RTF_LLINFO; } NATM_UNLOCK(); #endif /* mark as invalid. We cannot RTM_DELETE the route from * here, because the recursive call to rtrequest1 does * not really work. 
*/ rt->rt_flags |= RTF_REJECT; break; case RTM_DELETE: #ifdef NATM /* * tell native ATM we are done with this VC */ if (rt->rt_flags & RTF_LLINFO) { NATM_LOCK(); npcb_free((struct natmpcb *)rt->rt_llinfo, NPCB_DESTROY); rt->rt_llinfo = NULL; rt->rt_flags &= ~RTF_LLINFO; NATM_UNLOCK(); } #endif /* * tell the lower layer to disable this circuit */ bzero(&op, sizeof(op)); addr = LLADDR(SDL(gate)); addr++; cl.vpi = *addr++; cl.vci = *addr++ << 8; cl.vci |= *addr++; (void)rt->rt_ifp->if_ioctl(rt->rt_ifp, SIOCATMCLOSEVCC, (caddr_t)&cl); break; } } /* * atmresolve: * inputs: * [1] "rt" = the link level route to use (or null if need to look one up) * [2] "m" = mbuf containing the data to be sent * [3] "dst" = sockaddr_in (IP) address of dest. * output: * [4] "desten" = ATM pseudo header which we will fill in VPI/VCI info * return: * 0 == resolve FAILED; note that "m" gets m_freem'd in this case * 1 == resolve OK; desten contains result * * XXX: will need more work if we wish to support ATMARP in the kernel, * but this is enough for PVCs entered via the "route" command. */ int atmresolve(struct rtentry *rt, struct mbuf *m, struct sockaddr *dst, struct atm_pseudohdr *desten) { struct sockaddr_dl *sdl; if (m->m_flags & (M_BCAST | M_MCAST)) { log(LOG_INFO, "atmresolve: BCAST/MCAST packet detected/dumped\n"); goto bad; } if (rt == NULL) { rt = RTALLOC1(dst, 0); /* link level on table 0 XXX MRT */ if (rt == NULL) goto bad; /* failed */ RT_REMREF(rt); /* don't keep LL references */ if ((rt->rt_flags & RTF_GATEWAY) != 0 || - (rt->rt_flags & RTF_LLINFO) == 0 || - /* XXX: are we using LLINFO? */ rt->rt_gateway->sa_family != AF_LINK) { RT_UNLOCK(rt); goto bad; } RT_UNLOCK(rt); } /* * note that rt_gateway is a sockaddr_dl which contains the * atm_pseudohdr data structure for this route. we currently * don't need any rt_llinfo info (but will if we want to support * ATM ARP [c.f. if_ether.c]). */ sdl = SDL(rt->rt_gateway); /* * Check the address family and length is valid, the address * is resolved; otherwise, try to resolve. */ if (sdl->sdl_family == AF_LINK && sdl->sdl_alen >= sizeof(*desten)) { bcopy(LLADDR(sdl), desten, sizeof(*desten)); return (1); /* ok, go for it! */ } /* * we got an entry, but it doesn't have valid link address * info in it (it is prob. the interface route, which has * sdl_alen == 0). dump packet. (fall through to "bad"). */ bad: m_freem(m); return (0); } #endif /* INET */ Index: head/sys/netinet/if_ether.c =================================================================== --- head/sys/netinet/if_ether.c (revision 186118) +++ head/sys/netinet/if_ether.c (revision 186119) @@ -1,1093 +1,807 @@ /*- * Copyright (c) 1982, 1986, 1988, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. 
* * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)if_ether.c 8.1 (Berkeley) 6/10/93 */ /* * Ethernet address resolution protocol. * TODO: * add "inuse/lock" bit (or ref. count) along with valid bit */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_mac.h" #include "opt_carp.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #ifdef DEV_CARP #include #endif #include #define SIN(s) ((struct sockaddr_in *)s) #define SDL(s) ((struct sockaddr_dl *)s) +#define LLTABLE(ifp) ((struct lltable *)(ifp)->if_afdata[AF_INET]) SYSCTL_DECL(_net_link_ether); SYSCTL_NODE(_net_link_ether, PF_INET, inet, CTLFLAG_RW, 0, ""); /* timer values */ #ifdef VIMAGE_GLOBALS static int arpt_keep; /* once resolved, good for 20 more minutes */ static int arp_maxtries; -static int useloopback; /* use loopback interface for local traffic */ +int useloopback; /* use loopback interface for local traffic */ static int arp_proxyall; #endif SYSCTL_V_INT(V_NET, vnet_inet, _net_link_ether_inet, OID_AUTO, max_age, CTLFLAG_RW, arpt_keep, 0, "ARP entry lifetime in seconds"); -#define rt_expire rt_rmx.rmx_expire - -struct llinfo_arp { - struct callout la_timer; - struct rtentry *la_rt; - struct mbuf *la_hold; /* last packet until resolved/timeout */ - u_short la_preempt; /* countdown for pre-expiry arps */ - u_short la_asked; /* # requests sent */ -}; - static struct ifqueue arpintrq; SYSCTL_V_INT(V_NET, vnet_inet, _net_link_ether_inet, OID_AUTO, maxtries, CTLFLAG_RW, arp_maxtries, 0, "ARP resolution attempts before returning error"); SYSCTL_V_INT(V_NET, vnet_inet, _net_link_ether_inet, OID_AUTO, useloopback, CTLFLAG_RW, useloopback, 0, "Use the loopback interface for local traffic"); SYSCTL_V_INT(V_NET, vnet_inet, _net_link_ether_inet, OID_AUTO, proxyall, CTLFLAG_RW, arp_proxyall, 0, "Enable proxy ARP for all suitable requests"); static void arp_init(void); -static void arp_rtrequest(int, struct rtentry *, struct rt_addrinfo *); -static void arprequest(struct ifnet *, +void arprequest(struct ifnet *, struct in_addr *, struct in_addr *, u_char *); static void arpintr(struct mbuf *); static void arptimer(void *); -static struct rtentry - *arplookup(u_long, int, int, int); #ifdef INET static void in_arpinput(struct mbuf *); #endif +#ifdef AF_INET +void arp_ifscrub(struct ifnet *ifp, uint32_t addr); + /* - * Timeout routine. 
+ * called by in_ifscrub to remove entry from the table when + * the interface goes away */ -static void -arptimer(void *arg) +void +arp_ifscrub(struct ifnet *ifp, uint32_t addr) { - struct rtentry *rt = (struct rtentry *)arg; + struct sockaddr_in addr4; - RT_LOCK_ASSERT(rt); - /* - * The lock is needed to close a theoretical race - * between spontaneous expiry and intentional removal. - * We still got an extra reference on rtentry, so can - * safely pass pointers to its contents. - */ - RT_UNLOCK(rt); - - in_rtrequest(RTM_DELETE, rt_key(rt), NULL, rt_mask(rt), 0, NULL, - rt->rt_fibnum); + bzero((void *)&addr4, sizeof(addr4)); + addr4.sin_len = sizeof(addr4); + addr4.sin_family = AF_INET; + addr4.sin_addr.s_addr = addr; + IF_AFDATA_LOCK(ifp); + lla_lookup(LLTABLE(ifp), (LLE_DELETE | LLE_IFADDR), + (struct sockaddr *)&addr4); + IF_AFDATA_UNLOCK(ifp); } +#endif /* - * Parallel to llc_rtrequest. + * Timeout routine. Age arp_tab entries periodically. */ static void -arp_rtrequest(int req, struct rtentry *rt, struct rt_addrinfo *info) +arptimer(void *arg) { - INIT_VNET_NET(curvnet); - INIT_VNET_INET(curvnet); - struct sockaddr *gate; - struct llinfo_arp *la; - static struct sockaddr_dl null_sdl = {sizeof(null_sdl), AF_LINK}; - struct in_ifaddr *ia; - struct ifaddr *ifa; + struct ifnet *ifp; + struct llentry *lle = (struct llentry *)arg; - RT_LOCK_ASSERT(rt); - - if (rt->rt_flags & RTF_GATEWAY) + if (lle == NULL) { + panic("%s: NULL entry!\n", __func__); return; - gate = rt->rt_gateway; - la = (struct llinfo_arp *)rt->rt_llinfo; - switch (req) { - - case RTM_ADD: + } + ifp = lle->lle_tbl->llt_ifp; + IF_AFDATA_LOCK(ifp); + LLE_WLOCK(lle); + if ((lle->la_flags & LLE_DELETED) || + (time_second >= lle->la_expire)) { + if (!callout_pending(&lle->la_timer) && + callout_active(&lle->la_timer)) + (void) llentry_free(lle); + } else { /* - * XXX: If this is a manually added route to interface - * such as older version of routed or gated might provide, - * restore cloning bit. + * Still valid, just drop our reference */ - if ((rt->rt_flags & RTF_HOST) == 0 && - rt_mask(rt) != NULL && - SIN(rt_mask(rt))->sin_addr.s_addr != 0xffffffff) - rt->rt_flags |= RTF_CLONING; - if (rt->rt_flags & RTF_CLONING) { - /* - * Case 1: This route should come from a route to iface. - */ - rt_setgate(rt, rt_key(rt), - (struct sockaddr *)&null_sdl); - gate = rt->rt_gateway; - SDL(gate)->sdl_type = rt->rt_ifp->if_type; - SDL(gate)->sdl_index = rt->rt_ifp->if_index; - rt->rt_expire = time_uptime; - break; - } - /* Announce a new entry if requested. */ - if (rt->rt_flags & RTF_ANNOUNCE) - arprequest(rt->rt_ifp, - &SIN(rt_key(rt))->sin_addr, - &SIN(rt_key(rt))->sin_addr, - (u_char *)LLADDR(SDL(gate))); - /*FALLTHROUGH*/ - case RTM_RESOLVE: - if (gate->sa_family != AF_LINK || - gate->sa_len < sizeof(null_sdl)) { - log(LOG_DEBUG, "%s: bad gateway %s%s\n", __func__, - inet_ntoa(SIN(rt_key(rt))->sin_addr), - (gate->sa_family != AF_LINK) ? - " (!AF_LINK)": ""); - break; - } - SDL(gate)->sdl_type = rt->rt_ifp->if_type; - SDL(gate)->sdl_index = rt->rt_ifp->if_index; - if (la != 0) - break; /* This happens on a route change */ - /* - * Case 2: This route may come from cloning, or a manual route - * add with a LL address. - */ - R_Zalloc(la, struct llinfo_arp *, sizeof(*la)); - rt->rt_llinfo = (caddr_t)la; - if (la == 0) { - log(LOG_DEBUG, "%s: malloc failed\n", __func__); - break; - } - /* - * We are storing a route entry outside of radix tree. So, - * it can be found and accessed by other means than radix - * lookup. 
The routing code assumes that any rtentry detached - * from radix can be destroyed safely. To prevent this, we - * add an additional reference. - */ - RT_ADDREF(rt); - la->la_rt = rt; - rt->rt_flags |= RTF_LLINFO; - callout_init_mtx(&la->la_timer, &rt->rt_mtx, - CALLOUT_RETURNUNLOCKED); - -#ifdef INET - /* - * This keeps the multicast addresses from showing up - * in `arp -a' listings as unresolved. It's not actually - * functional. Then the same for broadcast. - */ - if (IN_MULTICAST(ntohl(SIN(rt_key(rt))->sin_addr.s_addr)) && - rt->rt_ifp->if_type != IFT_ARCNET) { - ETHER_MAP_IP_MULTICAST(&SIN(rt_key(rt))->sin_addr, - LLADDR(SDL(gate))); - SDL(gate)->sdl_alen = 6; - rt->rt_expire = 0; - } - if (in_broadcast(SIN(rt_key(rt))->sin_addr, rt->rt_ifp)) { - memcpy(LLADDR(SDL(gate)), rt->rt_ifp->if_broadcastaddr, - rt->rt_ifp->if_addrlen); - SDL(gate)->sdl_alen = rt->rt_ifp->if_addrlen; - rt->rt_expire = 0; - } -#endif - - TAILQ_FOREACH(ia, &V_in_ifaddrhead, ia_link) { - if (ia->ia_ifp == rt->rt_ifp && - SIN(rt_key(rt))->sin_addr.s_addr == - (IA_SIN(ia))->sin_addr.s_addr) - break; - } - if (ia) { - /* - * This test used to be - * if (loif.if_flags & IFF_UP) - * It allowed local traffic to be forced - * through the hardware by configuring the loopback down. - * However, it causes problems during network configuration - * for boards that can't receive packets they send. - * It is now necessary to clear "useloopback" and remove - * the route to force traffic out to the hardware. - */ - rt->rt_expire = 0; - bcopy(IF_LLADDR(rt->rt_ifp), LLADDR(SDL(gate)), - SDL(gate)->sdl_alen = rt->rt_ifp->if_addrlen); - if (V_useloopback) { - rt->rt_ifp = V_loif; - rt->rt_rmx.rmx_mtu = V_loif->if_mtu; - } - - /* - * make sure to set rt->rt_ifa to the interface - * address we are using, otherwise we will have trouble - * with source address selection. - */ - ifa = &ia->ia_ifa; - if (ifa != rt->rt_ifa) { - IFAFREE(rt->rt_ifa); - IFAREF(ifa); - rt->rt_ifa = ifa; - } - } - break; - - case RTM_DELETE: - if (la == NULL) /* XXX: at least CARP does this. */ - break; - callout_stop(&la->la_timer); - rt->rt_llinfo = NULL; - rt->rt_flags &= ~RTF_LLINFO; - RT_REMREF(rt); - if (la->la_hold) - m_freem(la->la_hold); - Free((caddr_t)la); + LLE_FREE_LOCKED(lle); } + IF_AFDATA_UNLOCK(ifp); } /* * Broadcast an ARP request. Caller specifies: * - arp header source ip address * - arp header target ip address * - arp header source ethernet address */ -static void -arprequest(struct ifnet *ifp, struct in_addr *sip, struct in_addr *tip, +void +arprequest(struct ifnet *ifp, struct in_addr *sip, struct in_addr *tip, u_char *enaddr) { struct mbuf *m; struct arphdr *ah; struct sockaddr sa; + if (sip == NULL) { + /* XXX don't believe this can happen (or explain why) */ + /* + * The caller did not supply a source address, try to find + * a compatible one among those assigned to this interface. + */ + struct ifaddr *ifa; + + TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { + if (!ifa->ifa_addr || + ifa->ifa_addr->sa_family != AF_INET) + continue; + sip = &SIN(ifa->ifa_addr)->sin_addr; + if (0 == ((sip->s_addr ^ tip->s_addr) & + SIN(ifa->ifa_netmask)->sin_addr.s_addr) ) + break; /* found it. 
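 * Worked example of the test above (illustrative addresses only): with
 * an interface address of 192.0.2.1 netmask 255.255.255.0 and a target
 * of 192.0.2.77, the two addresses differ only in the host bits, so
 * (sip->s_addr ^ tip->s_addr) & netmask == 0 and 192.0.2.1 is chosen
 * as the ARP source address.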
*/ + } + if (sip == NULL) { + printf("%s: cannot find matching address\n", __func__); + return; + } + } + if ((m = m_gethdr(M_DONTWAIT, MT_DATA)) == NULL) return; m->m_len = sizeof(*ah) + 2*sizeof(struct in_addr) + 2*ifp->if_data.ifi_addrlen; m->m_pkthdr.len = m->m_len; MH_ALIGN(m, m->m_len); ah = mtod(m, struct arphdr *); bzero((caddr_t)ah, m->m_len); #ifdef MAC mac_netinet_arp_send(ifp, m); #endif ah->ar_pro = htons(ETHERTYPE_IP); ah->ar_hln = ifp->if_addrlen; /* hardware address length */ ah->ar_pln = sizeof(struct in_addr); /* protocol address length */ ah->ar_op = htons(ARPOP_REQUEST); bcopy((caddr_t)enaddr, (caddr_t)ar_sha(ah), ah->ar_hln); bcopy((caddr_t)sip, (caddr_t)ar_spa(ah), ah->ar_pln); bcopy((caddr_t)tip, (caddr_t)ar_tpa(ah), ah->ar_pln); sa.sa_family = AF_ARP; sa.sa_len = 2; m->m_flags |= M_BCAST; (*ifp->if_output)(ifp, m, &sa, (struct rtentry *)0); - - return; } /* * Resolve an IP address into an ethernet address. * On input: * ifp is the interface we use * rt0 is the route to the final destination (possibly useless) * m is the mbuf. May be NULL if we don't have a packet. * dst is the next hop, * desten is where we want the address. * * On success, desten is filled in and the function returns 0; * If the packet must be held pending resolution, we return EWOULDBLOCK * On other errors, we return the corresponding error code. * Note that m_freem() handles NULL. */ int arpresolve(struct ifnet *ifp, struct rtentry *rt0, struct mbuf *m, - struct sockaddr *dst, u_char *desten) + struct sockaddr *dst, u_char *desten, struct llentry **lle) { INIT_VNET_INET(ifp->if_vnet); - struct llinfo_arp *la = NULL; - struct rtentry *rt = NULL; - struct sockaddr_dl *sdl; - int error; - int fibnum = -1; + struct llentry *la = 0; + u_int flags; + int error, renew; - if (m) { + *lle = NULL; + if (m != NULL) { if (m->m_flags & M_BCAST) { /* broadcast */ (void)memcpy(desten, ifp->if_broadcastaddr, ifp->if_addrlen); return (0); } if (m->m_flags & M_MCAST && ifp->if_type != IFT_ARCNET) { /* multicast */ ETHER_MAP_IP_MULTICAST(&SIN(dst)->sin_addr, desten); return (0); } - fibnum = M_GETFIB(m); } - if (rt0 != NULL) { - /* Look for a cached arp (ll) entry. */ - if (m == NULL) - fibnum = rt0->rt_fibnum; - error = rt_check(&rt, &rt0, dst); - if (error) { - m_freem(m); - return error; - } - la = (struct llinfo_arp *)rt->rt_llinfo; - if (la == NULL) - RT_UNLOCK(rt); - } + flags = (ifp->if_flags & (IFF_NOARP | IFF_STATICARP)) ? 0 : LLE_CREATE; - /* - * If we had no mbuf and no route, then hope the caller - * has a fib in mind because we are running out of ideas. - * I think this should not happen in current code. - * (kmacy would know). + /* XXXXX + * Since this function returns an llentry, the + * lock is held by the caller. + * XXX if caller is required to hold lock, assert it */ - if (fibnum == -1) - fibnum = curthread->td_proc->p_fibnum; /* last gasp */ - +retry: + IF_AFDATA_LOCK(ifp); + la = lla_lookup(LLTABLE(ifp), flags, dst); + IF_AFDATA_UNLOCK(ifp); if (la == NULL) { - /* - * We enter this block if rt0 was NULL, - * or if rt found by rt_check() didn't have llinfo. - * we should get a cloned route, which since it should - * come from the local interface should have a ll entry. - * It may be incomplete but that's ok. 
- */ - rt = arplookup(SIN(dst)->sin_addr.s_addr, 1, 0, fibnum); - if (rt == NULL) { + if (flags & LLE_CREATE) log(LOG_DEBUG, - "arpresolve: can't allocate route for %s\n", - inet_ntoa(SIN(dst)->sin_addr)); - m_freem(m); - return (EINVAL); /* XXX */ - } - la = (struct llinfo_arp *)rt->rt_llinfo; - if (la == NULL) { - RT_UNLOCK(rt); - log(LOG_DEBUG, "arpresolve: can't allocate llinfo for %s\n", inet_ntoa(SIN(dst)->sin_addr)); - m_freem(m); - return (EINVAL); /* XXX */ - } - } - sdl = SDL(rt->rt_gateway); - /* - * Check the address family and length is valid, the address - * is resolved; otherwise, try to resolve. - */ - if ((rt->rt_expire == 0 || rt->rt_expire > time_uptime) && - sdl->sdl_family == AF_LINK && sdl->sdl_alen != 0) { + m_freem(m); + return (EINVAL); + } - bcopy(LLADDR(sdl), desten, sdl->sdl_alen); - + if ((la->la_flags & LLE_VALID) && + ((la->la_flags & LLE_STATIC) || la->la_expire > time_uptime)) { + bcopy(&la->ll_addr, desten, ifp->if_addrlen); /* * If entry has an expiry time and it is approaching, - * send an ARP request. + * see if we need to send an ARP request within this + * arpt_down interval. */ - if ((rt->rt_expire != 0) && - (time_uptime + la->la_preempt > rt->rt_expire)) { - struct in_addr sin = - SIN(rt->rt_ifa->ifa_addr)->sin_addr; + if (!(la->la_flags & LLE_STATIC) && + time_uptime + la->la_preempt > la->la_expire) { + arprequest(ifp, NULL, + &SIN(dst)->sin_addr, IF_LLADDR(ifp)); la->la_preempt--; - RT_UNLOCK(rt); - arprequest(ifp, &sin, &SIN(dst)->sin_addr, - IF_LLADDR(ifp)); - return (0); - } + } + + *lle = la; + error = 0; + goto done; + } + + if (la->la_flags & LLE_STATIC) { /* should not happen! */ + log(LOG_DEBUG, "arpresolve: ouch, empty static llinfo for %s\n", + inet_ntoa(SIN(dst)->sin_addr)); + m_freem(m); + error = EINVAL; + goto done; + } - RT_UNLOCK(rt); - return (0); + renew = (la->la_asked == 0 || la->la_expire != time_uptime); + if ((renew || m != NULL) && (flags & LLE_EXCLUSIVE) == 0) { + flags |= LLE_EXCLUSIVE; + LLE_RUNLOCK(la); + goto retry; } /* - * If ARP is disabled or static on this interface, stop. - * XXX - * Probably should not allocate empty llinfo struct if we are - * not going to be sending out an arp request. - */ - if (ifp->if_flags & (IFF_NOARP | IFF_STATICARP)) { - RT_UNLOCK(rt); - m_freem(m); - return (EINVAL); - } - /* * There is an arptab entry, but no ethernet address * response yet. Replace the held mbuf with this * latest one. */ - if (m) { - if (la->la_hold) + if (m != NULL) { + if (la->la_hold != NULL) m_freem(la->la_hold); la->la_hold = m; + if (renew == 0 && (flags & LLE_EXCLUSIVE)) { + flags &= ~LLE_EXCLUSIVE; + LLE_DOWNGRADE(la); + } + } - KASSERT(rt->rt_expire > 0, ("sending ARP request for static entry")); - /* * Return EWOULDBLOCK if we have tried less than arp_maxtries. It * will be masked by ether_output(). Return EHOSTDOWN/EHOSTUNREACH * if we have already sent arp_maxtries ARP requests. Retransmit the * ARP request, but not faster than one request per second. */ if (la->la_asked < V_arp_maxtries) error = EWOULDBLOCK; /* First request. */ else - error = (rt == rt0) ? EHOSTDOWN : EHOSTUNREACH; + error = + (rt0->rt_flags & RTF_GATEWAY) ? 
EHOSTDOWN : EHOSTUNREACH; - if (la->la_asked == 0 || rt->rt_expire != time_uptime) { - struct in_addr sin = - SIN(rt->rt_ifa->ifa_addr)->sin_addr; - - rt->rt_expire = time_uptime; - callout_reset(&la->la_timer, hz, arptimer, rt); + if (renew) { + LLE_ADDREF(la); + la->la_expire = time_uptime; + callout_reset(&la->la_timer, hz, arptimer, la); la->la_asked++; - RT_UNLOCK(rt); - - arprequest(ifp, &sin, &SIN(dst)->sin_addr, + LLE_WUNLOCK(la); + arprequest(ifp, NULL, &SIN(dst)->sin_addr, IF_LLADDR(ifp)); - } else - RT_UNLOCK(rt); - + return (error); + } +done: + if (flags & LLE_EXCLUSIVE) + LLE_WUNLOCK(la); + else + LLE_RUNLOCK(la); return (error); } /* * Common length and type checks are done here, * then the protocol-specific routine is called. */ static void arpintr(struct mbuf *m) { struct arphdr *ar; if (m->m_len < sizeof(struct arphdr) && ((m = m_pullup(m, sizeof(struct arphdr))) == NULL)) { log(LOG_ERR, "arp: runt packet -- m_pullup failed\n"); return; } ar = mtod(m, struct arphdr *); if (ntohs(ar->ar_hrd) != ARPHRD_ETHER && ntohs(ar->ar_hrd) != ARPHRD_IEEE802 && ntohs(ar->ar_hrd) != ARPHRD_ARCNET && ntohs(ar->ar_hrd) != ARPHRD_IEEE1394) { log(LOG_ERR, "arp: unknown hardware address format (0x%2D)\n", (unsigned char *)&ar->ar_hrd, ""); m_freem(m); return; } if (m->m_len < arphdr_len(ar)) { if ((m = m_pullup(m, arphdr_len(ar))) == NULL) { log(LOG_ERR, "arp: runt packet\n"); m_freem(m); return; } ar = mtod(m, struct arphdr *); } switch (ntohs(ar->ar_pro)) { #ifdef INET case ETHERTYPE_IP: in_arpinput(m); return; #endif } m_freem(m); } #ifdef INET /* * ARP for Internet protocols on 10 Mb/s Ethernet. * Algorithm is that given in RFC 826. * In addition, a sanity check is performed on the sender * protocol address, to catch impersonators. * We no longer handle negotiations for use of trailer protocol: * Formerly, ARP replied for protocol type ETHERTYPE_TRAIL sent * along with IP replies if we wanted trailers sent to us, * and also sent them in response to IP replies. * This allowed either end to announce the desire to receive * trailer packets. * We no longer reply to requests for ETHERTYPE_TRAIL protocol either, * but formerly didn't normally send requests. 
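 *
 * With the arp-v2 link-layer tables, in_arpinput() no longer walks
 * per-FIB routing entries; it updates the per-interface lltable
 * directly.  A simplified sketch of the update path implemented below
 * (locking, logging and error handling omitted):
 *
 *	sin.sin_addr = isaddr;
 *	flags = (itaddr.s_addr == myaddr.s_addr) ? LLE_CREATE : 0;
 *	la = lla_lookup(LLTABLE(ifp), flags | LLE_EXCLUSIVE,
 *	    (struct sockaddr *)&sin);
 *	if (la != NULL) {
 *		memcpy(&la->ll_addr, ar_sha(ah), ifp->if_addrlen);
 *		la->la_flags |= LLE_VALID;
 *	}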
*/ static int log_arp_wrong_iface = 1; static int log_arp_movements = 1; static int log_arp_permanent_modify = 1; SYSCTL_INT(_net_link_ether_inet, OID_AUTO, log_arp_wrong_iface, CTLFLAG_RW, &log_arp_wrong_iface, 0, "log arp packets arriving on the wrong interface"); SYSCTL_INT(_net_link_ether_inet, OID_AUTO, log_arp_movements, CTLFLAG_RW, &log_arp_movements, 0, "log arp replies from MACs different than the one in the cache"); SYSCTL_INT(_net_link_ether_inet, OID_AUTO, log_arp_permanent_modify, CTLFLAG_RW, &log_arp_permanent_modify, 0, "log arp replies from MACs different than the one in the permanent arp entry"); static void in_arpinput(struct mbuf *m) { struct arphdr *ah; struct ifnet *ifp = m->m_pkthdr.rcvif; - struct llinfo_arp *la; + struct llentry *la = NULL; struct rtentry *rt; struct ifaddr *ifa; struct in_ifaddr *ia; - struct sockaddr_dl *sdl; struct sockaddr sa; struct in_addr isaddr, itaddr, myaddr; - struct mbuf *hold; u_int8_t *enaddr = NULL; - int op, rif_len; + int op, flags; + struct mbuf *m0; int req_len; int bridged = 0, is_bridge = 0; - u_int fibnum; - u_int goodfib = 0; - int firstpass = 1; #ifdef DEV_CARP int carp_match = 0; #endif struct sockaddr_in sin; sin.sin_len = sizeof(struct sockaddr_in); sin.sin_family = AF_INET; sin.sin_addr.s_addr = 0; INIT_VNET_INET(ifp->if_vnet); if (ifp->if_bridge) bridged = 1; if (ifp->if_type == IFT_BRIDGE) is_bridge = 1; req_len = arphdr_len2(ifp->if_addrlen, sizeof(struct in_addr)); if (m->m_len < req_len && (m = m_pullup(m, req_len)) == NULL) { log(LOG_ERR, "in_arp: runt packet -- m_pullup failed\n"); return; } ah = mtod(m, struct arphdr *); op = ntohs(ah->ar_op); (void)memcpy(&isaddr, ar_spa(ah), sizeof (isaddr)); (void)memcpy(&itaddr, ar_tpa(ah), sizeof (itaddr)); /* * For a bridge, we want to check the address irrespective * of the receive interface. (This will change slightly * when we have clusters of interfaces). * If the interface does not match, but the recieving interface * is part of carp, we call carp_iamatch to see if this is a * request for the virtual host ip. * XXX: This is really ugly! */ LIST_FOREACH(ia, INADDR_HASH(itaddr.s_addr), ia_hash) { if (((bridged && ia->ia_ifp->if_bridge != NULL) || - (ia->ia_ifp == ifp)) && + ia->ia_ifp == ifp) && itaddr.s_addr == ia->ia_addr.sin_addr.s_addr) goto match; #ifdef DEV_CARP if (ifp->if_carp != NULL && carp_iamatch(ifp->if_carp, ia, &isaddr, &enaddr) && itaddr.s_addr == ia->ia_addr.sin_addr.s_addr) { carp_match = 1; goto match; } #endif } LIST_FOREACH(ia, INADDR_HASH(isaddr.s_addr), ia_hash) if (((bridged && ia->ia_ifp->if_bridge != NULL) || - (ia->ia_ifp == ifp)) && + ia->ia_ifp == ifp) && isaddr.s_addr == ia->ia_addr.sin_addr.s_addr) goto match; #define BDG_MEMBER_MATCHES_ARP(addr, ifp, ia) \ (ia->ia_ifp->if_bridge == ifp->if_softc && \ !bcmp(IF_LLADDR(ia->ia_ifp), IF_LLADDR(ifp), ifp->if_addrlen) && \ addr == ia->ia_addr.sin_addr.s_addr) /* * Check the case when bridge shares its MAC address with * some of its children, so packets are claimed by bridge * itself (bridge_input() does it first), but they are really * meant to be destined to the bridge member. */ if (is_bridge) { LIST_FOREACH(ia, INADDR_HASH(itaddr.s_addr), ia_hash) { if (BDG_MEMBER_MATCHES_ARP(itaddr.s_addr, ifp, ia)) { ifp = ia->ia_ifp; goto match; } } } #undef BDG_MEMBER_MATCHES_ARP /* * No match, use the first inet address on the receive interface * as a dummy address for the rest of the function. 
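 * (Clarification: the dummy address essentially only supplies "myaddr"
 * for the sanity checks and for sourcing a possible proxy reply; no
 * lltable entry is created for it in this path.)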
*/ TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) if (ifa->ifa_addr->sa_family == AF_INET) { ia = ifatoia(ifa); goto match; } /* * If bridging, fall back to using any inet address. */ if (!bridged || (ia = TAILQ_FIRST(&V_in_ifaddrhead)) == NULL) goto drop; match: if (!enaddr) enaddr = (u_int8_t *)IF_LLADDR(ifp); myaddr = ia->ia_addr.sin_addr; if (!bcmp(ar_sha(ah), enaddr, ifp->if_addrlen)) goto drop; /* it's from me, ignore it. */ if (!bcmp(ar_sha(ah), ifp->if_broadcastaddr, ifp->if_addrlen)) { log(LOG_ERR, "arp: link address is broadcast for IP address %s!\n", inet_ntoa(isaddr)); goto drop; } /* * Warn if another host is using the same IP address, but only if the * IP address isn't 0.0.0.0, which is used for DHCP only, in which * case we suppress the warning to avoid false positive complaints of * potential misconfiguration. */ if (!bridged && isaddr.s_addr == myaddr.s_addr && myaddr.s_addr != 0) { log(LOG_ERR, "arp: %*D is using my IP address %s on %s!\n", ifp->if_addrlen, (u_char *)ar_sha(ah), ":", inet_ntoa(isaddr), ifp->if_xname); itaddr = myaddr; goto reply; } if (ifp->if_flags & IFF_STATICARP) goto reply; - /* - * We look for any FIB that has this address to find - * the interface etc. - * For sanity checks that are FIB independent we abort the loop. - */ - for (fibnum = 0; fibnum < rt_numfibs; fibnum++) { - rt = arplookup(isaddr.s_addr, - itaddr.s_addr == myaddr.s_addr, 0, fibnum); - if (rt == NULL) - continue; - - sdl = SDL(rt->rt_gateway); - /* Only call this once */ - if (firstpass) { - sin.sin_addr.s_addr = isaddr.s_addr; - EVENTHANDLER_INVOKE(route_arp_update_event, rt, - ar_sha(ah), (struct sockaddr *)&sin); - } - - la = (struct llinfo_arp *)rt->rt_llinfo; - if (la == NULL) { - RT_UNLOCK(rt); - continue; - } - if (firstpass) { - /* The following is not an error when doing bridging. */ - if (!bridged && rt->rt_ifp != ifp + bzero(&sin, sizeof(sin)); + sin.sin_len = sizeof(struct sockaddr_in); + sin.sin_family = AF_INET; + sin.sin_addr = isaddr; + flags = (itaddr.s_addr == myaddr.s_addr) ? 
LLE_CREATE : 0; + flags |= LLE_EXCLUSIVE; + IF_AFDATA_LOCK(ifp); + la = lla_lookup(LLTABLE(ifp), flags, (struct sockaddr *)&sin); + IF_AFDATA_UNLOCK(ifp); + if (la != NULL) { + /* the following is not an error when doing bridging */ + if (!bridged && la->lle_tbl->llt_ifp != ifp #ifdef DEV_CARP - && (ifp->if_type != IFT_CARP || !carp_match) + && (ifp->if_type != IFT_CARP || !carp_match) #endif - ) { - if (log_arp_wrong_iface) - log(LOG_ERR, "arp: %s is on %s " - "but got reply from %*D " - "on %s\n", - inet_ntoa(isaddr), - rt->rt_ifp->if_xname, - ifp->if_addrlen, - (u_char *)ar_sha(ah), ":", - ifp->if_xname); - RT_UNLOCK(rt); - break; + ) { + if (log_arp_wrong_iface) + log(LOG_ERR, "arp: %s is on %s " + "but got reply from %*D on %s\n", + inet_ntoa(isaddr), + la->lle_tbl->llt_ifp->if_xname, + ifp->if_addrlen, (u_char *)ar_sha(ah), ":", + ifp->if_xname); + goto reply; + } + if ((la->la_flags & LLE_VALID) && + bcmp(ar_sha(ah), &la->ll_addr, ifp->if_addrlen)) { + if (la->la_flags & LLE_STATIC) { + log(LOG_ERR, + "arp: %*D attempts to modify permanent " + "entry for %s on %s\n", + ifp->if_addrlen, (u_char *)ar_sha(ah), ":", + inet_ntoa(isaddr), ifp->if_xname); + goto reply; } - if (sdl->sdl_alen && - bcmp(ar_sha(ah), LLADDR(sdl), sdl->sdl_alen)) { - if (rt->rt_expire) { - if (log_arp_movements) - log(LOG_INFO, - "arp: %s moved from %*D to %*D " - "on %s\n", - inet_ntoa(isaddr), - ifp->if_addrlen, - (u_char *)LLADDR(sdl), ":", - ifp->if_addrlen, - (u_char *)ar_sha(ah), ":", - ifp->if_xname); - } else { - RT_UNLOCK(rt); - if (log_arp_permanent_modify) - log(LOG_ERR, - "arp: %*D attempts to " - "modify permanent entry " - "for %s on %s\n", - ifp->if_addrlen, - (u_char *)ar_sha(ah), ":", - inet_ntoa(isaddr), - ifp->if_xname); - break; - } + if (log_arp_movements) { + log(LOG_INFO, "arp: %s moved from %*D " + "to %*D on %s\n", + inet_ntoa(isaddr), + ifp->if_addrlen, + (u_char *)&la->ll_addr, ":", + ifp->if_addrlen, (u_char *)ar_sha(ah), ":", + ifp->if_xname); } - /* - * sanity check for the address length. - * XXX this does not work for protocols - * with variable address length. -is - */ - if (sdl->sdl_alen && - sdl->sdl_alen != ah->ar_hln) { - log(LOG_WARNING, - "arp from %*D: new addr len %d, was %d", - ifp->if_addrlen, (u_char *) ar_sha(ah), - ":", ah->ar_hln, sdl->sdl_alen); - } - if (ifp->if_addrlen != ah->ar_hln) { - log(LOG_WARNING, - "arp from %*D: addr len: " - "new %d, i/f %d (ignored)", - ifp->if_addrlen, (u_char *) ar_sha(ah), - ":", ah->ar_hln, ifp->if_addrlen); - RT_UNLOCK(rt); - break; - } - firstpass = 0; - goodfib = fibnum; } - - /* Copy in the information received. */ - (void)memcpy(LLADDR(sdl), ar_sha(ah), - sdl->sdl_alen = ah->ar_hln); - /* - * If we receive an arp from a token-ring station over - * a token-ring nic then try to save the source routing info. - * XXXMRT Only minimal Token Ring support for MRT. - * Only do this on the first pass as if modifies the mbuf. 
- */ - if (ifp->if_type == IFT_ISO88025) { - struct iso88025_header *th = NULL; - struct iso88025_sockaddr_dl_data *trld; - - /* force the fib loop to end after this pass */ - fibnum = rt_numfibs - 1; - - th = (struct iso88025_header *)m->m_pkthdr.header; - trld = SDL_ISO88025(sdl); - rif_len = TR_RCF_RIFLEN(th->rcf); - if ((th->iso88025_shost[0] & TR_RII) && - (rif_len > 2)) { - trld->trld_rcf = th->rcf; - trld->trld_rcf ^= htons(TR_RCF_DIR); - memcpy(trld->trld_route, th->rd, rif_len - 2); - trld->trld_rcf &= ~htons(TR_RCF_BCST_MASK); - /* - * Set up source routing information for - * reply packet (XXX) - */ - m->m_data -= rif_len; - m->m_len += rif_len; - m->m_pkthdr.len += rif_len; - } else { - th->iso88025_shost[0] &= ~TR_RII; - trld->trld_rcf = 0; - } - m->m_data -= 8; - m->m_len += 8; - m->m_pkthdr.len += 8; - th->rcf = trld->trld_rcf; + + if (ifp->if_addrlen != ah->ar_hln) { + log(LOG_WARNING, + "arp from %*D: addr len: new %d, i/f %d (ignored)", + ifp->if_addrlen, (u_char *) ar_sha(ah), ":", + ah->ar_hln, ifp->if_addrlen); + goto reply; } + (void)memcpy(&la->ll_addr, ar_sha(ah), ifp->if_addrlen); + la->la_flags |= LLE_VALID; - if (rt->rt_expire) { - rt->rt_expire = time_uptime + V_arpt_keep; + if (!(la->la_flags & LLE_STATIC)) { + la->la_expire = time_uptime + V_arpt_keep; callout_reset(&la->la_timer, hz * V_arpt_keep, - arptimer, rt); + arptimer, la); } la->la_asked = 0; la->la_preempt = V_arp_maxtries; - hold = la->la_hold; - la->la_hold = NULL; - RT_UNLOCK(rt); - if (hold != NULL) - (*ifp->if_output)(ifp, hold, rt_key(rt), rt); - } /* end of FIB loop */ + if (la->la_hold != NULL) { + m0 = la->la_hold; + la->la_hold = 0; + memcpy(&sa, L3_ADDR(la), sizeof(sa)); + LLE_WUNLOCK(la); + + (*ifp->if_output)(ifp, m0, &sa, NULL); + return; + } + } reply: - - /* - * Decide if we have to respond to something. - */ if (op != ARPOP_REQUEST) goto drop; + if (itaddr.s_addr == myaddr.s_addr) { /* Shortcut.. the receiving interface is the target. */ (void)memcpy(ar_tha(ah), ar_sha(ah), ah->ar_hln); (void)memcpy(ar_sha(ah), enaddr, ah->ar_hln); } else { - /* It's not asking for our address. But it still may - * be something we should answer. - * - * XXX MRT - * We assume that link level info is independent of - * the table used and so we use whichever we can and don't - * have a better option. - */ - /* Have we been asked to proxy for the target. */ - rt = arplookup(itaddr.s_addr, 0, SIN_PROXY, goodfib); - if (rt == NULL) { - /* Nope, only intersted now if proxying everything. */ - struct sockaddr_in sin; - + if (la == NULL) { if (!V_arp_proxyall) goto drop; - bzero(&sin, sizeof sin); - sin.sin_family = AF_INET; - sin.sin_len = sizeof sin; sin.sin_addr = itaddr; - /* XXX MRT use table 0 for arp reply */ rt = in_rtalloc1((struct sockaddr *)&sin, 0, 0UL, 0); if (!rt) goto drop; /* * Don't send proxies for nodes on the same interface * as this one came out of, or we'll get into a fight * over who claims what Ether address. */ if (rt->rt_ifp == ifp) { RTFREE_LOCKED(rt); goto drop; } (void)memcpy(ar_tha(ah), ar_sha(ah), ah->ar_hln); (void)memcpy(ar_sha(ah), enaddr, ah->ar_hln); RTFREE_LOCKED(rt); /* * Also check that the node which sent the ARP packet * is on the the interface we expect it to be on. This * avoids ARP chaos if an interface is connected to the * wrong network. 
*/ sin.sin_addr = isaddr; /* XXX MRT use table 0 for arp checks */ rt = in_rtalloc1((struct sockaddr *)&sin, 0, 0UL, 0); if (!rt) goto drop; if (rt->rt_ifp != ifp) { log(LOG_INFO, "arp_proxy: ignoring request" " from %s via %s, expecting %s\n", inet_ntoa(isaddr), ifp->if_xname, rt->rt_ifp->if_xname); RTFREE_LOCKED(rt); goto drop; } RTFREE_LOCKED(rt); #ifdef DEBUG_PROXY printf("arp: proxying for %s\n", inet_ntoa(itaddr)); #endif } else { /* * Return proxied ARP replies only on the interface * or bridge cluster where this network resides. * Otherwise we may conflict with the host we are * proxying for. */ - if (rt->rt_ifp != ifp && - (rt->rt_ifp->if_bridge != ifp->if_bridge || + if (la->lle_tbl->llt_ifp != ifp && + (la->lle_tbl->llt_ifp->if_bridge != ifp->if_bridge || ifp->if_bridge == NULL)) { - RT_UNLOCK(rt); goto drop; } - sdl = SDL(rt->rt_gateway); (void)memcpy(ar_tha(ah), ar_sha(ah), ah->ar_hln); - (void)memcpy(ar_sha(ah), LLADDR(sdl), ah->ar_hln); - RT_UNLOCK(rt); + (void)memcpy(ar_sha(ah), &la->ll_addr, ah->ar_hln); } } + if (la != NULL) + LLE_WUNLOCK(la); if (itaddr.s_addr == myaddr.s_addr && IN_LINKLOCAL(ntohl(itaddr.s_addr))) { /* RFC 3927 link-local IPv4; always reply by broadcast. */ #ifdef DEBUG_LINKLOCAL printf("arp: sending reply for link-local addr %s\n", inet_ntoa(itaddr)); #endif m->m_flags |= M_BCAST; m->m_flags &= ~M_MCAST; } else { /* default behaviour; never reply by broadcast. */ m->m_flags &= ~(M_BCAST|M_MCAST); } (void)memcpy(ar_tpa(ah), ar_spa(ah), ah->ar_pln); (void)memcpy(ar_spa(ah), &itaddr, ah->ar_pln); ah->ar_op = htons(ARPOP_REPLY); ah->ar_pro = htons(ETHERTYPE_IP); /* let's be sure! */ m->m_len = sizeof(*ah) + (2 * ah->ar_pln) + (2 * ah->ar_hln); m->m_pkthdr.len = m->m_len; sa.sa_family = AF_ARP; sa.sa_len = 2; (*ifp->if_output)(ifp, m, &sa, (struct rtentry *)0); return; drop: + if (la != NULL) + LLE_WUNLOCK(la); m_freem(m); } #endif -/* - * Lookup or enter a new address in arptab. - */ -static struct rtentry * -arplookup(u_long addr, int create, int proxy, int fibnum) -{ - struct rtentry *rt; - struct sockaddr_inarp sin; - const char *why = 0; - - bzero(&sin, sizeof(sin)); - sin.sin_len = sizeof(sin); - sin.sin_family = AF_INET; - sin.sin_addr.s_addr = addr; - if (proxy) - sin.sin_other = SIN_PROXY; - rt = in_rtalloc1((struct sockaddr *)&sin, create, 0UL, fibnum); - if (rt == 0) - return (0); - - if (rt->rt_flags & RTF_GATEWAY) - why = "host is not on local network"; - else if ((rt->rt_flags & RTF_LLINFO) == 0) - why = "could not allocate llinfo"; - else if (rt->rt_gateway->sa_family != AF_LINK) - why = "gateway route is not ours"; - - if (why) { -#define ISDYNCLONE(_rt) \ - (((_rt)->rt_flags & (RTF_STATIC | RTF_WASCLONED)) == RTF_WASCLONED) - if (create) - log(LOG_DEBUG, "arplookup %s failed: %s\n", - inet_ntoa(sin.sin_addr), why); - /* - * If there are no references to this Layer 2 route, - * and it is a cloned route, and not static, and - * arplookup() is creating the route, then purge - * it from the routing table as it is probably bogus. 
- */ - if (rt->rt_refcnt == 1 && ISDYNCLONE(rt)) - rtexpunge(rt); - RTFREE_LOCKED(rt); - return (0); -#undef ISDYNCLONE - } else { - RT_REMREF(rt); - return (rt); - } -} - void arp_ifinit(struct ifnet *ifp, struct ifaddr *ifa) { + struct llentry *lle; + if (ntohl(IA_SIN(ifa)->sin_addr.s_addr) != INADDR_ANY) arprequest(ifp, &IA_SIN(ifa)->sin_addr, &IA_SIN(ifa)->sin_addr, IF_LLADDR(ifp)); - ifa->ifa_rtrequest = arp_rtrequest; - ifa->ifa_flags |= RTF_CLONING; + /* + * interface address is considered static entry + * because the output of the arp utility shows + * that L2 entry as permanent + */ + IF_AFDATA_LOCK(ifp); + lle = lla_lookup(LLTABLE(ifp), (LLE_CREATE | LLE_IFADDR | LLE_STATIC), + (struct sockaddr *)IA_SIN(ifa)); + IF_AFDATA_UNLOCK(ifp); + if (lle == NULL) + log(LOG_INFO, "arp_ifinit: cannot create arp " + "entry for interface address\n"); + LLE_RUNLOCK(lle); + ifa->ifa_rtrequest = NULL; } void arp_ifinit2(struct ifnet *ifp, struct ifaddr *ifa, u_char *enaddr) { if (ntohl(IA_SIN(ifa)->sin_addr.s_addr) != INADDR_ANY) arprequest(ifp, &IA_SIN(ifa)->sin_addr, &IA_SIN(ifa)->sin_addr, enaddr); - ifa->ifa_rtrequest = arp_rtrequest; - ifa->ifa_flags |= RTF_CLONING; + ifa->ifa_rtrequest = NULL; } static void arp_init(void) { INIT_VNET_INET(curvnet); V_arpt_keep = (20*60); /* once resolved, good for 20 more minutes */ V_arp_maxtries = 5; V_useloopback = 1; /* use loopback interface for local traffic */ V_arp_proxyall = 0; arpintrq.ifq_maxlen = 50; mtx_init(&arpintrq.ifq_mtx, "arp_inq", NULL, MTX_DEF); netisr_register(NETISR_ARP, arpintr, &arpintrq, 0); } SYSINIT(arp, SI_SUB_PROTO_DOMAIN, SI_ORDER_ANY, arp_init, 0); Index: head/sys/netinet/if_ether.h =================================================================== --- head/sys/netinet/if_ether.h (revision 186118) +++ head/sys/netinet/if_ether.h (revision 186119) @@ -1,118 +1,121 @@ /*- * Copyright (c) 1982, 1986, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* * @(#)if_ether.h 8.3 (Berkeley) 5/2/95 * $FreeBSD$ */ #ifndef _NETINET_IF_ETHER_H_ #define _NETINET_IF_ETHER_H_ #include #include /* * Macro to map an IP multicast address to an Ethernet multicast address. * The high-order 25 bits of the Ethernet address are statically assigned, * and the low-order 23 bits are taken from the low end of the IP address. */ #define ETHER_MAP_IP_MULTICAST(ipaddr, enaddr) \ /* struct in_addr *ipaddr; */ \ /* u_char enaddr[ETHER_ADDR_LEN]; */ \ { \ (enaddr)[0] = 0x01; \ (enaddr)[1] = 0x00; \ (enaddr)[2] = 0x5e; \ (enaddr)[3] = ((u_char *)ipaddr)[1] & 0x7f; \ (enaddr)[4] = ((u_char *)ipaddr)[2]; \ (enaddr)[5] = ((u_char *)ipaddr)[3]; \ } /* * Macro to map an IP6 multicast address to an Ethernet multicast address. * The high-order 16 bits of the Ethernet address are statically assigned, * and the low-order 32 bits are taken from the low end of the IP6 address. */ #define ETHER_MAP_IPV6_MULTICAST(ip6addr, enaddr) \ /* struct in6_addr *ip6addr; */ \ /* u_char enaddr[ETHER_ADDR_LEN]; */ \ { \ (enaddr)[0] = 0x33; \ (enaddr)[1] = 0x33; \ (enaddr)[2] = ((u_char *)ip6addr)[12]; \ (enaddr)[3] = ((u_char *)ip6addr)[13]; \ (enaddr)[4] = ((u_char *)ip6addr)[14]; \ (enaddr)[5] = ((u_char *)ip6addr)[15]; \ } /* * Ethernet Address Resolution Protocol. * * See RFC 826 for protocol description. Structure below is adapted * to resolving internet addresses. Field names used correspond to * RFC 826. */ struct ether_arp { struct arphdr ea_hdr; /* fixed-size header */ u_char arp_sha[ETHER_ADDR_LEN]; /* sender hardware address */ u_char arp_spa[4]; /* sender protocol address */ u_char arp_tha[ETHER_ADDR_LEN]; /* target hardware address */ u_char arp_tpa[4]; /* target protocol address */ }; #define arp_hrd ea_hdr.ar_hrd #define arp_pro ea_hdr.ar_pro #define arp_hln ea_hdr.ar_hln #define arp_pln ea_hdr.ar_pln #define arp_op ea_hdr.ar_op struct sockaddr_inarp { u_char sin_len; u_char sin_family; u_short sin_port; struct in_addr sin_addr; struct in_addr sin_srcaddr; u_short sin_tos; u_short sin_other; #define SIN_PROXY 1 }; /* * IP and ethernet specific routing flags */ #define RTF_USETRAILERS RTF_PROTO1 /* use trailers */ #define RTF_ANNOUNCE RTF_PROTO2 /* announce new arp entry */ #ifdef _KERNEL extern u_char ether_ipmulticast_min[ETHER_ADDR_LEN]; extern u_char ether_ipmulticast_max[ETHER_ADDR_LEN]; +struct llentry; + int arpresolve(struct ifnet *ifp, struct rtentry *rt, - struct mbuf *m, struct sockaddr *dst, u_char *desten); + struct mbuf *m, struct sockaddr *dst, u_char *desten, + struct llentry **lle); void arp_ifinit(struct ifnet *, struct ifaddr *); void arp_ifinit2(struct ifnet *, struct ifaddr *, u_char *); #endif #endif Index: head/sys/netinet/in.c =================================================================== --- head/sys/netinet/in.c (revision 186118) +++ head/sys/netinet/in.c (revision 186119) @@ -1,1017 +1,1257 @@ /*- * Copyright (c) 1982, 1986, 1991, 1993 * The Regents of the University of California. All rights reserved. * Copyright (C) 2001 WIDE Project. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. 
Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)in.c 8.4 (Berkeley) 1/9/95 */ #include __FBSDID("$FreeBSD$"); #include "opt_carp.h" #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #include #include #include static int in_mask2len(struct in_addr *); static void in_len2mask(struct in_addr *, int); static int in_lifaddr_ioctl(struct socket *, u_long, caddr_t, struct ifnet *, struct thread *); static int in_addprefix(struct in_ifaddr *, int); static int in_scrubprefix(struct in_ifaddr *); static void in_socktrim(struct sockaddr_in *); static int in_ifinit(struct ifnet *, struct in_ifaddr *, struct sockaddr_in *, int); static void in_purgemaddrs(struct ifnet *); #ifdef VIMAGE_GLOBALS static int subnetsarelocal; static int sameprefixcarponly; extern struct inpcbinfo ripcbinfo; #endif SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip, OID_AUTO, subnets_are_local, CTLFLAG_RW, subnetsarelocal, 0, "Treat all subnets as directly connected"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip, OID_AUTO, same_prefix_carp_only, CTLFLAG_RW, sameprefixcarponly, 0, "Refuse to create same prefixes on different interfaces"); /* * Return 1 if an internet address is for a ``local'' host * (one to which we have a connection). If subnetsarelocal * is true, this includes other subnets of the local net. * Otherwise, it includes only the directly-connected (sub)nets. */ int in_localaddr(struct in_addr in) { INIT_VNET_INET(curvnet); register u_long i = ntohl(in.s_addr); register struct in_ifaddr *ia; if (V_subnetsarelocal) { TAILQ_FOREACH(ia, &V_in_ifaddrhead, ia_link) if ((i & ia->ia_netmask) == ia->ia_net) return (1); } else { TAILQ_FOREACH(ia, &V_in_ifaddrhead, ia_link) if ((i & ia->ia_subnetmask) == ia->ia_subnet) return (1); } return (0); } /* * Return 1 if an internet address is for the local host and configured * on one of its interfaces. */ int in_localip(struct in_addr in) { INIT_VNET_INET(curvnet); struct in_ifaddr *ia; LIST_FOREACH(ia, INADDR_HASH(in.s_addr), ia_hash) { if (IA_SIN(ia)->sin_addr.s_addr == in.s_addr) return (1); } return (0); } /* * Determine whether an IP address is in a reserved set of addresses * that may not be forwarded, or whether datagrams to that destination * may be forwarded. 
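 * For example, 240.1.2.3 (experimental), 224.0.0.1 (multicast),
 * 169.254.1.1 (link-local), 127.0.0.1 and anything else on the
 * loopback net, and addresses on net 0 are all rejected here.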
*/ int in_canforward(struct in_addr in) { register u_long i = ntohl(in.s_addr); register u_long net; if (IN_EXPERIMENTAL(i) || IN_MULTICAST(i) || IN_LINKLOCAL(i)) return (0); if (IN_CLASSA(i)) { net = i & IN_CLASSA_NET; if (net == 0 || net == (IN_LOOPBACKNET << IN_CLASSA_NSHIFT)) return (0); } return (1); } /* * Trim a mask in a sockaddr */ static void in_socktrim(struct sockaddr_in *ap) { register char *cplim = (char *) &ap->sin_addr; register char *cp = (char *) (&ap->sin_addr + 1); ap->sin_len = 0; while (--cp >= cplim) if (*cp) { (ap)->sin_len = cp - (char *) (ap) + 1; break; } } static int in_mask2len(mask) struct in_addr *mask; { int x, y; u_char *p; p = (u_char *)mask; for (x = 0; x < sizeof(*mask); x++) { if (p[x] != 0xff) break; } y = 0; if (x < sizeof(*mask)) { for (y = 0; y < 8; y++) { if ((p[x] & (0x80 >> y)) == 0) break; } } return (x * 8 + y); } static void in_len2mask(struct in_addr *mask, int len) { int i; u_char *p; p = (u_char *)mask; bzero(mask, sizeof(*mask)); for (i = 0; i < len / 8; i++) p[i] = 0xff; if (len % 8) p[i] = (0xff00 >> (len % 8)) & 0xff; } /* * Generic internet control operations (ioctl's). * Ifp is 0 if not an interface-specific ioctl. */ /* ARGSUSED */ int in_control(struct socket *so, u_long cmd, caddr_t data, struct ifnet *ifp, struct thread *td) { INIT_VNET_INET(curvnet); /* both so and ifp can be NULL here! */ register struct ifreq *ifr = (struct ifreq *)data; register struct in_ifaddr *ia, *iap; register struct ifaddr *ifa; struct in_addr allhosts_addr; struct in_addr dst; struct in_ifaddr *oia; struct in_aliasreq *ifra = (struct in_aliasreq *)data; struct sockaddr_in oldaddr; int error, hostIsNew, iaIsNew, maskIsNew, s; int iaIsFirst; ia = NULL; iaIsFirst = 0; iaIsNew = 0; allhosts_addr.s_addr = htonl(INADDR_ALLHOSTS_GROUP); switch (cmd) { case SIOCALIFADDR: if (td != NULL) { error = priv_check(td, PRIV_NET_ADDIFADDR); if (error) return (error); } if (ifp == NULL) return (EINVAL); return in_lifaddr_ioctl(so, cmd, data, ifp, td); case SIOCDLIFADDR: if (td != NULL) { error = priv_check(td, PRIV_NET_DELIFADDR); if (error) return (error); } if (ifp == NULL) return (EINVAL); return in_lifaddr_ioctl(so, cmd, data, ifp, td); case SIOCGLIFADDR: if (ifp == NULL) return (EINVAL); return in_lifaddr_ioctl(so, cmd, data, ifp, td); } /* * Find address for this interface, if it exists. * * If an alias address was specified, find that one instead of * the first one on the interface, if possible. */ if (ifp != NULL) { dst = ((struct sockaddr_in *)&ifr->ifr_addr)->sin_addr; LIST_FOREACH(iap, INADDR_HASH(dst.s_addr), ia_hash) if (iap->ia_ifp == ifp && iap->ia_addr.sin_addr.s_addr == dst.s_addr) { ia = iap; break; } if (ia == NULL) TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { iap = ifatoia(ifa); if (iap->ia_addr.sin_family == AF_INET) { ia = iap; break; } } if (ia == NULL) iaIsFirst = 1; } switch (cmd) { case SIOCAIFADDR: case SIOCDIFADDR: if (ifp == NULL) return (EADDRNOTAVAIL); if (ifra->ifra_addr.sin_family == AF_INET) { for (oia = ia; ia; ia = TAILQ_NEXT(ia, ia_link)) { if (ia->ia_ifp == ifp && ia->ia_addr.sin_addr.s_addr == ifra->ifra_addr.sin_addr.s_addr) break; } if ((ifp->if_flags & IFF_POINTOPOINT) && (cmd == SIOCAIFADDR) && (ifra->ifra_dstaddr.sin_addr.s_addr == INADDR_ANY)) { return (EDESTADDRREQ); } } if (cmd == SIOCDIFADDR && ia == NULL) return (EADDRNOTAVAIL); /* FALLTHROUGH */ case SIOCSIFADDR: case SIOCSIFNETMASK: case SIOCSIFDSTADDR: if (td != NULL) { error = priv_check(td, (cmd == SIOCDIFADDR) ? 
PRIV_NET_DELIFADDR : PRIV_NET_ADDIFADDR); if (error) return (error); } if (ifp == NULL) return (EADDRNOTAVAIL); if (ia == NULL) { ia = (struct in_ifaddr *) malloc(sizeof *ia, M_IFADDR, M_WAITOK | M_ZERO); if (ia == NULL) return (ENOBUFS); /* * Protect from ipintr() traversing address list * while we're modifying it. */ s = splnet(); ifa = &ia->ia_ifa; IFA_LOCK_INIT(ifa); ifa->ifa_addr = (struct sockaddr *)&ia->ia_addr; ifa->ifa_dstaddr = (struct sockaddr *)&ia->ia_dstaddr; ifa->ifa_netmask = (struct sockaddr *)&ia->ia_sockmask; ifa->ifa_refcnt = 1; TAILQ_INSERT_TAIL(&ifp->if_addrhead, ifa, ifa_link); ia->ia_sockmask.sin_len = 8; ia->ia_sockmask.sin_family = AF_INET; if (ifp->if_flags & IFF_BROADCAST) { ia->ia_broadaddr.sin_len = sizeof(ia->ia_addr); ia->ia_broadaddr.sin_family = AF_INET; } ia->ia_ifp = ifp; TAILQ_INSERT_TAIL(&V_in_ifaddrhead, ia, ia_link); splx(s); iaIsNew = 1; } break; case SIOCSIFBRDADDR: if (td != NULL) { error = priv_check(td, PRIV_NET_ADDIFADDR); if (error) return (error); } /* FALLTHROUGH */ case SIOCGIFADDR: case SIOCGIFNETMASK: case SIOCGIFDSTADDR: case SIOCGIFBRDADDR: if (ia == NULL) return (EADDRNOTAVAIL); break; } switch (cmd) { case SIOCGIFADDR: *((struct sockaddr_in *)&ifr->ifr_addr) = ia->ia_addr; return (0); case SIOCGIFBRDADDR: if ((ifp->if_flags & IFF_BROADCAST) == 0) return (EINVAL); *((struct sockaddr_in *)&ifr->ifr_dstaddr) = ia->ia_broadaddr; return (0); case SIOCGIFDSTADDR: if ((ifp->if_flags & IFF_POINTOPOINT) == 0) return (EINVAL); *((struct sockaddr_in *)&ifr->ifr_dstaddr) = ia->ia_dstaddr; return (0); case SIOCGIFNETMASK: *((struct sockaddr_in *)&ifr->ifr_addr) = ia->ia_sockmask; return (0); case SIOCSIFDSTADDR: if ((ifp->if_flags & IFF_POINTOPOINT) == 0) return (EINVAL); oldaddr = ia->ia_dstaddr; ia->ia_dstaddr = *(struct sockaddr_in *)&ifr->ifr_dstaddr; if (ifp->if_ioctl != NULL) { IFF_LOCKGIANT(ifp); error = (*ifp->if_ioctl)(ifp, SIOCSIFDSTADDR, (caddr_t)ia); IFF_UNLOCKGIANT(ifp); if (error) { ia->ia_dstaddr = oldaddr; return (error); } } if (ia->ia_flags & IFA_ROUTE) { ia->ia_ifa.ifa_dstaddr = (struct sockaddr *)&oldaddr; rtinit(&(ia->ia_ifa), (int)RTM_DELETE, RTF_HOST); ia->ia_ifa.ifa_dstaddr = (struct sockaddr *)&ia->ia_dstaddr; rtinit(&(ia->ia_ifa), (int)RTM_ADD, RTF_HOST|RTF_UP); } return (0); case SIOCSIFBRDADDR: if ((ifp->if_flags & IFF_BROADCAST) == 0) return (EINVAL); ia->ia_broadaddr = *(struct sockaddr_in *)&ifr->ifr_broadaddr; return (0); case SIOCSIFADDR: error = in_ifinit(ifp, ia, (struct sockaddr_in *) &ifr->ifr_addr, 1); if (error != 0 && iaIsNew) break; if (error == 0) { if (iaIsFirst && (ifp->if_flags & IFF_MULTICAST) != 0) in_addmulti(&allhosts_addr, ifp); EVENTHANDLER_INVOKE(ifaddr_event, ifp); } return (0); case SIOCSIFNETMASK: ia->ia_sockmask.sin_addr = ifra->ifra_addr.sin_addr; ia->ia_subnetmask = ntohl(ia->ia_sockmask.sin_addr.s_addr); return (0); case SIOCAIFADDR: maskIsNew = 0; hostIsNew = 1; error = 0; if (ia->ia_addr.sin_family == AF_INET) { if (ifra->ifra_addr.sin_len == 0) { ifra->ifra_addr = ia->ia_addr; hostIsNew = 0; } else if (ifra->ifra_addr.sin_addr.s_addr == ia->ia_addr.sin_addr.s_addr) hostIsNew = 0; } if (ifra->ifra_mask.sin_len) { in_ifscrub(ifp, ia); ia->ia_sockmask = ifra->ifra_mask; ia->ia_sockmask.sin_family = AF_INET; ia->ia_subnetmask = ntohl(ia->ia_sockmask.sin_addr.s_addr); maskIsNew = 1; } if ((ifp->if_flags & IFF_POINTOPOINT) && (ifra->ifra_dstaddr.sin_family == AF_INET)) { in_ifscrub(ifp, ia); ia->ia_dstaddr = ifra->ifra_dstaddr; maskIsNew = 1; /* We lie; but the effect's the same */ } if 
(ifra->ifra_addr.sin_family == AF_INET && (hostIsNew || maskIsNew)) error = in_ifinit(ifp, ia, &ifra->ifra_addr, 0); if (error != 0 && iaIsNew) break; if ((ifp->if_flags & IFF_BROADCAST) && (ifra->ifra_broadaddr.sin_family == AF_INET)) ia->ia_broadaddr = ifra->ifra_broadaddr; if (error == 0) { if (iaIsFirst && (ifp->if_flags & IFF_MULTICAST) != 0) in_addmulti(&allhosts_addr, ifp); EVENTHANDLER_INVOKE(ifaddr_event, ifp); } return (error); case SIOCDIFADDR: /* * in_ifscrub kills the interface route. */ in_ifscrub(ifp, ia); /* * in_ifadown gets rid of all the rest of * the routes. This is not quite the right * thing to do, but at least if we are running * a routing process they will come back. */ in_ifadown(&ia->ia_ifa, 1); EVENTHANDLER_INVOKE(ifaddr_event, ifp); error = 0; break; default: if (ifp == NULL || ifp->if_ioctl == NULL) return (EOPNOTSUPP); IFF_LOCKGIANT(ifp); error = (*ifp->if_ioctl)(ifp, cmd, data); IFF_UNLOCKGIANT(ifp); return (error); } /* * Protect from ipintr() traversing address list while we're modifying * it. */ s = splnet(); TAILQ_REMOVE(&ifp->if_addrhead, &ia->ia_ifa, ifa_link); TAILQ_REMOVE(&V_in_ifaddrhead, ia, ia_link); if (ia->ia_addr.sin_family == AF_INET) { LIST_REMOVE(ia, ia_hash); /* * If this is the last IPv4 address configured on this * interface, leave the all-hosts group. * XXX: This is quite ugly because of locking and structure. */ oia = NULL; IFP_TO_IA(ifp, oia); if (oia == NULL) { struct in_multi *inm; IFF_LOCKGIANT(ifp); IN_MULTI_LOCK(); IN_LOOKUP_MULTI(allhosts_addr, ifp, inm); if (inm != NULL) in_delmulti_locked(inm); IN_MULTI_UNLOCK(); IFF_UNLOCKGIANT(ifp); } } IFAFREE(&ia->ia_ifa); splx(s); return (error); } /* * SIOC[GAD]LIFADDR. * SIOCGLIFADDR: get first address. (?!?) * SIOCGLIFADDR with IFLR_PREFIX: * get first address that matches the specified prefix. * SIOCALIFADDR: add the specified address. * SIOCALIFADDR with IFLR_PREFIX: * EINVAL since we can't deduce hostid part of the address. * SIOCDLIFADDR: delete the specified address. * SIOCDLIFADDR with IFLR_PREFIX: * delete the first address that matches the specified prefix. * return values: * EINVAL on invalid parameters * EADDRNOTAVAIL on prefix match failed/specified address not found * other values may be returned from in_ioctl() */ static int in_lifaddr_ioctl(struct socket *so, u_long cmd, caddr_t data, struct ifnet *ifp, struct thread *td) { struct if_laddrreq *iflr = (struct if_laddrreq *)data; struct ifaddr *ifa; /* sanity checks */ if (data == NULL || ifp == NULL) { panic("invalid argument to in_lifaddr_ioctl"); /*NOTRECHED*/ } switch (cmd) { case SIOCGLIFADDR: /* address must be specified on GET with IFLR_PREFIX */ if ((iflr->flags & IFLR_PREFIX) == 0) break; /*FALLTHROUGH*/ case SIOCALIFADDR: case SIOCDLIFADDR: /* address must be specified on ADD and DELETE */ if (iflr->addr.ss_family != AF_INET) return (EINVAL); if (iflr->addr.ss_len != sizeof(struct sockaddr_in)) return (EINVAL); /* XXX need improvement */ if (iflr->dstaddr.ss_family && iflr->dstaddr.ss_family != AF_INET) return (EINVAL); if (iflr->dstaddr.ss_family && iflr->dstaddr.ss_len != sizeof(struct sockaddr_in)) return (EINVAL); break; default: /*shouldn't happen*/ return (EOPNOTSUPP); } if (sizeof(struct in_addr) * 8 < iflr->prefixlen) return (EINVAL); switch (cmd) { case SIOCALIFADDR: { struct in_aliasreq ifra; if (iflr->flags & IFLR_PREFIX) return (EINVAL); /* copy args to in_aliasreq, perform ioctl(SIOCAIFADDR_IN6). 
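 * (In this IPv4 file the request actually issued below is SIOCAIFADDR via
 * in_control(); the _IN6 name above is a leftover from the IPv6 code this
 * function was derived from.)
 *
 * Editor's sketch, not part of the original source: userland reaches this
 * case through ioctl(SIOCALIFADDR) with a struct if_laddrreq.  A hedged
 * example, assuming a socket "s", an interface name "ifname" and the same
 * includes as the SIOCAIFADDR sketch earlier:
 *
 *	struct if_laddrreq iflr;
 *	struct sockaddr_in *sin = (struct sockaddr_in *)&iflr.addr;
 *
 *	memset(&iflr, 0, sizeof(iflr));
 *	strlcpy(iflr.iflr_name, ifname, sizeof(iflr.iflr_name));
 *	iflr.prefixlen = 24;
 *	sin->sin_family = AF_INET;
 *	sin->sin_len = sizeof(*sin);
 *	inet_pton(AF_INET, "192.0.2.10", &sin->sin_addr);
 *	(void)ioctl(s, SIOCALIFADDR, &iflr);
 *
 * IFLR_PREFIX must not be set for an add, as the check above enforces.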
*/ bzero(&ifra, sizeof(ifra)); bcopy(iflr->iflr_name, ifra.ifra_name, sizeof(ifra.ifra_name)); bcopy(&iflr->addr, &ifra.ifra_addr, iflr->addr.ss_len); if (iflr->dstaddr.ss_family) { /*XXX*/ bcopy(&iflr->dstaddr, &ifra.ifra_dstaddr, iflr->dstaddr.ss_len); } ifra.ifra_mask.sin_family = AF_INET; ifra.ifra_mask.sin_len = sizeof(struct sockaddr_in); in_len2mask(&ifra.ifra_mask.sin_addr, iflr->prefixlen); return (in_control(so, SIOCAIFADDR, (caddr_t)&ifra, ifp, td)); } case SIOCGLIFADDR: case SIOCDLIFADDR: { struct in_ifaddr *ia; struct in_addr mask, candidate, match; struct sockaddr_in *sin; bzero(&mask, sizeof(mask)); bzero(&match, sizeof(match)); if (iflr->flags & IFLR_PREFIX) { /* lookup a prefix rather than address. */ in_len2mask(&mask, iflr->prefixlen); sin = (struct sockaddr_in *)&iflr->addr; match.s_addr = sin->sin_addr.s_addr; match.s_addr &= mask.s_addr; /* if you set extra bits, that's wrong */ if (match.s_addr != sin->sin_addr.s_addr) return (EINVAL); } else { /* on getting an address, take the 1st match */ /* on deleting an address, do exact match */ if (cmd != SIOCGLIFADDR) { in_len2mask(&mask, 32); sin = (struct sockaddr_in *)&iflr->addr; match.s_addr = sin->sin_addr.s_addr; } } TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; if (match.s_addr == 0) break; candidate.s_addr = ((struct sockaddr_in *)&ifa->ifa_addr)->sin_addr.s_addr; candidate.s_addr &= mask.s_addr; if (candidate.s_addr == match.s_addr) break; } if (ifa == NULL) return (EADDRNOTAVAIL); ia = (struct in_ifaddr *)ifa; if (cmd == SIOCGLIFADDR) { /* fill in the if_laddrreq structure */ bcopy(&ia->ia_addr, &iflr->addr, ia->ia_addr.sin_len); if ((ifp->if_flags & IFF_POINTOPOINT) != 0) { bcopy(&ia->ia_dstaddr, &iflr->dstaddr, ia->ia_dstaddr.sin_len); } else bzero(&iflr->dstaddr, sizeof(iflr->dstaddr)); iflr->prefixlen = in_mask2len(&ia->ia_sockmask.sin_addr); iflr->flags = 0; /*XXX*/ return (0); } else { struct in_aliasreq ifra; /* fill in_aliasreq and do ioctl(SIOCDIFADDR_IN6) */ bzero(&ifra, sizeof(ifra)); bcopy(iflr->iflr_name, ifra.ifra_name, sizeof(ifra.ifra_name)); bcopy(&ia->ia_addr, &ifra.ifra_addr, ia->ia_addr.sin_len); if ((ifp->if_flags & IFF_POINTOPOINT) != 0) { bcopy(&ia->ia_dstaddr, &ifra.ifra_dstaddr, ia->ia_dstaddr.sin_len); } bcopy(&ia->ia_sockmask, &ifra.ifra_dstaddr, ia->ia_sockmask.sin_len); return (in_control(so, SIOCDIFADDR, (caddr_t)&ifra, ifp, td)); } } } return (EOPNOTSUPP); /*just for safety*/ } /* * Delete any existing route for an interface. */ void in_ifscrub(struct ifnet *ifp, struct in_ifaddr *ia) { in_scrubprefix(ia); } /* * Initialize an interface's internet address * and routing table entry. */ static int in_ifinit(struct ifnet *ifp, struct in_ifaddr *ia, struct sockaddr_in *sin, int scrub) { INIT_VNET_INET(ifp->if_vnet); register u_long i = ntohl(sin->sin_addr.s_addr); struct sockaddr_in oldaddr; int s = splimp(), flags = RTF_UP, error = 0; oldaddr = ia->ia_addr; if (oldaddr.sin_family == AF_INET) LIST_REMOVE(ia, ia_hash); ia->ia_addr = *sin; if (ia->ia_addr.sin_family == AF_INET) LIST_INSERT_HEAD(INADDR_HASH(ia->ia_addr.sin_addr.s_addr), ia, ia_hash); /* * Give the interface a chance to initialize * if this is its first address, * and to validate the address if necessary. 
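 *
 * Editor's note, not part of the original source: further down in_ifinit()
 * falls back to the classful default mask when no subnet mask has been
 * configured.  A hedged worked example for the address 10.1.2.3 (class A,
 * so ia_netmask becomes 255.0.0.0):
 *
 *	with no configured mask:   ia_subnetmask = 255.0.0.0
 *				   ia_net = ia_subnet = 10.0.0.0
 *				   broadcast = 10.255.255.255
 *	with an explicit /24:	   ia_subnetmask = 255.255.255.0
 *				   ia_subnet = 10.1.2.0
 *				   broadcast = 10.1.2.255
 *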
*/ if (ifp->if_ioctl != NULL) { IFF_LOCKGIANT(ifp); error = (*ifp->if_ioctl)(ifp, SIOCSIFADDR, (caddr_t)ia); IFF_UNLOCKGIANT(ifp); if (error) { splx(s); /* LIST_REMOVE(ia, ia_hash) is done in in_control */ ia->ia_addr = oldaddr; if (ia->ia_addr.sin_family == AF_INET) LIST_INSERT_HEAD(INADDR_HASH( ia->ia_addr.sin_addr.s_addr), ia, ia_hash); else /* * If oldaddr family is not AF_INET (e.g. * interface has been just created) in_control * does not call LIST_REMOVE, and we end up * with bogus ia entries in hash */ LIST_REMOVE(ia, ia_hash); return (error); } } splx(s); if (scrub) { ia->ia_ifa.ifa_addr = (struct sockaddr *)&oldaddr; in_ifscrub(ifp, ia); ia->ia_ifa.ifa_addr = (struct sockaddr *)&ia->ia_addr; } if (IN_CLASSA(i)) ia->ia_netmask = IN_CLASSA_NET; else if (IN_CLASSB(i)) ia->ia_netmask = IN_CLASSB_NET; else ia->ia_netmask = IN_CLASSC_NET; /* * The subnet mask usually includes at least the standard network part, * but may may be smaller in the case of supernetting. * If it is set, we believe it. */ if (ia->ia_subnetmask == 0) { ia->ia_subnetmask = ia->ia_netmask; ia->ia_sockmask.sin_addr.s_addr = htonl(ia->ia_subnetmask); } else ia->ia_netmask &= ia->ia_subnetmask; ia->ia_net = i & ia->ia_netmask; ia->ia_subnet = i & ia->ia_subnetmask; in_socktrim(&ia->ia_sockmask); #ifdef DEV_CARP /* * XXX: carp(4) does not have interface route */ if (ifp->if_type == IFT_CARP) return (0); #endif /* * Add route for the network. */ ia->ia_ifa.ifa_metric = ifp->if_metric; if (ifp->if_flags & IFF_BROADCAST) { ia->ia_broadaddr.sin_addr.s_addr = htonl(ia->ia_subnet | ~ia->ia_subnetmask); ia->ia_netbroadcast.s_addr = htonl(ia->ia_net | ~ ia->ia_netmask); } else if (ifp->if_flags & IFF_LOOPBACK) { ia->ia_dstaddr = ia->ia_addr; flags |= RTF_HOST; } else if (ifp->if_flags & IFF_POINTOPOINT) { if (ia->ia_dstaddr.sin_family != AF_INET) return (0); flags |= RTF_HOST; } if ((error = in_addprefix(ia, flags)) != 0) return (error); return (error); } #define rtinitflags(x) \ ((((x)->ia_ifp->if_flags & (IFF_LOOPBACK | IFF_POINTOPOINT)) != 0) \ ? RTF_HOST : 0) /* * Check if we have a route for the given prefix already or add one accordingly. */ static int in_addprefix(struct in_ifaddr *target, int flags) { INIT_VNET_INET(curvnet); struct in_ifaddr *ia; struct in_addr prefix, mask, p, m; int error; if ((flags & RTF_HOST) != 0) { prefix = target->ia_dstaddr.sin_addr; mask.s_addr = 0; } else { prefix = target->ia_addr.sin_addr; mask = target->ia_sockmask.sin_addr; prefix.s_addr &= mask.s_addr; } TAILQ_FOREACH(ia, &V_in_ifaddrhead, ia_link) { if (rtinitflags(ia)) { p = ia->ia_addr.sin_addr; if (prefix.s_addr != p.s_addr) continue; } else { p = ia->ia_addr.sin_addr; m = ia->ia_sockmask.sin_addr; p.s_addr &= m.s_addr; if (prefix.s_addr != p.s_addr || mask.s_addr != m.s_addr) continue; } /* * If we got a matching prefix route inserted by other * interface address, we are done here. */ if (ia->ia_flags & IFA_ROUTE) { if (V_sameprefixcarponly && target->ia_ifp->if_type != IFT_CARP && ia->ia_ifp->if_type != IFT_CARP) return (EEXIST); else return (0); } } /* * No-one seem to have this prefix route, so we try to insert it. */ error = rtinit(&target->ia_ifa, (int)RTM_ADD, flags); if (!error) target->ia_flags |= IFA_ROUTE; return (error); } +extern void arp_ifscrub(struct ifnet *ifp, uint32_t addr); + /* * If there is no other address in the system that can serve a route to the * same prefix, remove the route. Hand over the route to the new address * otherwise. 
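 *
 * Editor's note, not part of the original source: the arp_ifscrub() call
 * added below removes the interface's ARP (link-layer) entry for the
 * address whose prefix is being scrubbed; with the arp-v2 code those
 * entries live in the per-ifnet lltable (see in_domifattach() further down)
 * rather than in cloned routes.
 *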
*/ static int in_scrubprefix(struct in_ifaddr *target) { INIT_VNET_INET(curvnet); struct in_ifaddr *ia; struct in_addr prefix, mask, p; int error; if ((target->ia_flags & IFA_ROUTE) == 0) return (0); if (rtinitflags(target)) prefix = target->ia_dstaddr.sin_addr; else { prefix = target->ia_addr.sin_addr; mask = target->ia_sockmask.sin_addr; prefix.s_addr &= mask.s_addr; + /* remove arp cache */ + arp_ifscrub(target->ia_ifp, IA_SIN(target)->sin_addr.s_addr); } TAILQ_FOREACH(ia, &V_in_ifaddrhead, ia_link) { if (rtinitflags(ia)) p = ia->ia_dstaddr.sin_addr; else { p = ia->ia_addr.sin_addr; p.s_addr &= ia->ia_sockmask.sin_addr.s_addr; } if (prefix.s_addr != p.s_addr) continue; /* * If we got a matching prefix address, move IFA_ROUTE and * the route itself to it. Make sure that routing daemons * get a heads-up. * * XXX: a special case for carp(4) interface */ if ((ia->ia_flags & IFA_ROUTE) == 0 #ifdef DEV_CARP && (ia->ia_ifp->if_type != IFT_CARP) #endif ) { rtinit(&(target->ia_ifa), (int)RTM_DELETE, rtinitflags(target)); target->ia_flags &= ~IFA_ROUTE; error = rtinit(&ia->ia_ifa, (int)RTM_ADD, rtinitflags(ia) | RTF_UP); if (error == 0) ia->ia_flags |= IFA_ROUTE; return (error); } } /* * As no-one seem to have this prefix, we can remove the route. */ rtinit(&(target->ia_ifa), (int)RTM_DELETE, rtinitflags(target)); target->ia_flags &= ~IFA_ROUTE; return (0); } #undef rtinitflags /* * Return 1 if the address might be a local broadcast address. */ int in_broadcast(struct in_addr in, struct ifnet *ifp) { register struct ifaddr *ifa; u_long t; if (in.s_addr == INADDR_BROADCAST || in.s_addr == INADDR_ANY) return (1); if ((ifp->if_flags & IFF_BROADCAST) == 0) return (0); t = ntohl(in.s_addr); /* * Look through the list of addresses for a match * with a broadcast address. */ #define ia ((struct in_ifaddr *)ifa) TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) if (ifa->ifa_addr->sa_family == AF_INET && (in.s_addr == ia->ia_broadaddr.sin_addr.s_addr || in.s_addr == ia->ia_netbroadcast.s_addr || /* * Check for old-style (host 0) broadcast. */ t == ia->ia_subnet || t == ia->ia_net) && /* * Check for an all one subnetmask. These * only exist when an interface gets a secondary * address. */ ia->ia_subnetmask != (u_long)0xffffffff) return (1); return (0); #undef ia } /* * Delete all IPv4 multicast address records, and associated link-layer * multicast address records, associated with ifp. */ static void in_purgemaddrs(struct ifnet *ifp) { INIT_VNET_INET(ifp->if_vnet); struct in_multi *inm; struct in_multi *oinm; #ifdef DIAGNOSTIC printf("%s: purging ifp %p\n", __func__, ifp); #endif IFF_LOCKGIANT(ifp); IN_MULTI_LOCK(); LIST_FOREACH_SAFE(inm, &V_in_multihead, inm_link, oinm) { if (inm->inm_ifp == ifp) in_delmulti_locked(inm); } IN_MULTI_UNLOCK(); IFF_UNLOCKGIANT(ifp); } /* * On interface removal, clean up IPv4 data structures hung off of the ifnet. */ void in_ifdetach(struct ifnet *ifp) { INIT_VNET_INET(ifp->if_vnet); in_pcbpurgeif0(&V_ripcbinfo, ifp); in_pcbpurgeif0(&V_udbinfo, ifp); in_purgemaddrs(ifp); +} + +#include +#include +#include + +struct in_llentry { + struct llentry base; + struct sockaddr_in l3_addr4; +}; + +static struct llentry * +in_lltable_new(const struct sockaddr *l3addr, u_int flags) +{ + struct in_llentry *lle; + + lle = malloc(sizeof(struct in_llentry), M_LLTABLE, M_DONTWAIT | M_ZERO); + if (lle == NULL) /* NB: caller generates msg */ + return NULL; + + callout_init(&lle->base.la_timer, CALLOUT_MPSAFE); + /* + * For IPv4 this will trigger "arpresolve" to generate + * an ARP request. 
+ */ + lle->base.la_expire = time_second; /* mark expired */ + lle->l3_addr4 = *(const struct sockaddr_in *)l3addr; + lle->base.lle_refcnt = 1; + LLE_LOCK_INIT(&lle->base); + return &lle->base; +} + +/* + * Deletes an address from the address table. + * This function is called by the timer functions + * such as arptimer() and nd6_llinfo_timer(), and + * the caller does the locking. + */ +static void +in_lltable_free(struct lltable *llt, struct llentry *lle) +{ + free(lle, M_LLTABLE); +} + +static int +in_lltable_rtcheck(struct ifnet *ifp, const struct sockaddr *l3addr) +{ + struct rtentry *rt; + + KASSERT(l3addr->sa_family == AF_INET, + ("sin_family %d", l3addr->sa_family)); + + /* XXX rtalloc1 should take a const param */ + rt = rtalloc1(__DECONST(struct sockaddr *, l3addr), 0, 0); + if (rt == NULL || (rt->rt_flags & RTF_GATEWAY) || rt->rt_ifp != ifp) { + log(LOG_INFO, "IPv4 address: \"%s\" is not on the network\n", + inet_ntoa(((const struct sockaddr_in *)l3addr)->sin_addr)); + if (rt != NULL) + RTFREE_LOCKED(rt); + return (EINVAL); + } + RTFREE_LOCKED(rt); + return 0; +} + +/* + * Return NULL if not found or marked for deletion. + * If found return lle read locked. + */ +static struct llentry * +in_lltable_lookup(struct lltable *llt, u_int flags, const struct sockaddr *l3addr) +{ + const struct sockaddr_in *sin = (const struct sockaddr_in *)l3addr; + struct ifnet *ifp = llt->llt_ifp; + struct llentry *lle; + struct llentries *lleh; + u_int hashkey; + + IF_AFDATA_LOCK_ASSERT(ifp); + KASSERT(l3addr->sa_family == AF_INET, + ("sin_family %d", l3addr->sa_family)); + + hashkey = sin->sin_addr.s_addr; + lleh = &llt->lle_head[LLATBL_HASH(hashkey, LLTBL_HASHMASK)]; + LIST_FOREACH(lle, lleh, lle_next) { + if (lle->la_flags & LLE_DELETED) + continue; + if (bcmp(L3_ADDR(lle), l3addr, sizeof(struct sockaddr_in)) == 0) + break; + } + if (lle == NULL) { +#ifdef DIAGNOSTICS + if (flags & LLE_DELETE) + log(LOG_INFO, "interface address is missing from cache = %p in delete\n", lle); +#endif + if (!(flags & LLE_CREATE)) + return (NULL); + /* + * A route that covers the given address must have + * been installed 1st because we are doing a resolution, + * verify this. 
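 *
 * Editor's sketch, not part of the original source: a resolver-style caller
 * is expected to hold the ifnet's afdata lock and can look up and create in
 * one step.  Roughly, with "llt" being the interface's AF_INET lltable and
 * "sin" a filled-in sockaddr_in for the next hop (both assumed here):
 *
 *	struct llentry *la;
 *
 *	IF_AFDATA_LOCK(ifp);
 *	la = llt->llt_lookup(llt, LLE_EXCLUSIVE | LLE_CREATE,
 *	    (const struct sockaddr *)&sin);
 *	IF_AFDATA_UNLOCK(ifp);
 *	if (la != NULL)
 *		LLE_WUNLOCK(la);
 *
 * A lookup with LLE_DELETE instead returns the sentinel (void *)-1 once the
 * entry has been marked LLE_DELETED, as the code below shows.
 *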
+ */ + if (!(flags & LLE_IFADDR) && + in_lltable_rtcheck(ifp, l3addr) != 0) + goto done; + + lle = in_lltable_new(l3addr, flags); + if (lle == NULL) { + log(LOG_INFO, "lla_lookup: new lle malloc failed\n"); + goto done; + } + lle->la_flags = flags & ~LLE_CREATE; + if ((flags & (LLE_CREATE | LLE_IFADDR)) == (LLE_CREATE | LLE_IFADDR)) { + bcopy(IF_LLADDR(ifp), &lle->ll_addr, ifp->if_addrlen); + lle->la_flags |= (LLE_VALID | LLE_STATIC); + } + + lle->lle_tbl = llt; + lle->lle_head = lleh; + LIST_INSERT_HEAD(lleh, lle, lle_next); + } else if (flags & LLE_DELETE) { + if (!(lle->la_flags & LLE_IFADDR) || (flags & LLE_IFADDR)) { + LLE_WLOCK(lle); + lle->la_flags = LLE_DELETED; + LLE_WUNLOCK(lle); +#ifdef DIAGNOSTICS + log(LOG_INFO, "ifaddr cache = %p is deleted\n", lle); +#endif + } + lle = (void *)-1; + + } + if (lle != NULL && lle != (void *)-1) { + if (flags & LLE_EXCLUSIVE) + LLE_WLOCK(lle); + else + LLE_RLOCK(lle); + } +done: + return (lle); +} + +static int +in_lltable_dump(struct lltable *llt, struct sysctl_req *wr) +{ +#define SIN(lle) ((struct sockaddr_in *) L3_ADDR(lle)) + struct ifnet *ifp = llt->llt_ifp; + struct llentry *lle; + /* XXX stack use */ + struct { + struct rt_msghdr rtm; + struct sockaddr_inarp sin; + struct sockaddr_dl sdl; + } arpc; + int error, i; + + /* XXXXX + * current IFNET_RLOCK() is mapped to IFNET_WLOCK() + * so it is okay to use this ASSERT, change it when + * IFNET lock is finalized + */ + IFNET_WLOCK_ASSERT(); + + error = 0; + for (i = 0; i < LLTBL_HASHTBL_SIZE; i++) { + LIST_FOREACH(lle, &llt->lle_head[i], lle_next) { + struct sockaddr_dl *sdl; + + /* skip deleted entries */ + if ((lle->la_flags & (LLE_DELETED|LLE_VALID)) != LLE_VALID) + continue; + /* + * produce a msg made of: + * struct rt_msghdr; + * struct sockaddr_inarp; (IPv4) + * struct sockaddr_dl; + */ + bzero(&arpc, sizeof(arpc)); + arpc.rtm.rtm_msglen = sizeof(arpc); + arpc.sin.sin_family = AF_INET; + arpc.sin.sin_len = sizeof(arpc.sin); + arpc.sin.sin_addr.s_addr = SIN(lle)->sin_addr.s_addr; + + /* publish */ + if (lle->la_flags & LLE_PUB) { + arpc.rtm.rtm_flags |= RTF_ANNOUNCE; + /* proxy only */ + if (lle->la_flags & LLE_PROXY) + arpc.sin.sin_other = SIN_PROXY; + } + + sdl = &arpc.sdl; + sdl->sdl_family = AF_LINK; + sdl->sdl_len = sizeof(*sdl); + sdl->sdl_alen = ifp->if_addrlen; + sdl->sdl_index = ifp->if_index; + sdl->sdl_type = ifp->if_type; + bcopy(&lle->ll_addr, LLADDR(sdl), ifp->if_addrlen); + + arpc.rtm.rtm_rmx.rmx_expire = + lle->la_flags & LLE_STATIC ? 0 : lle->la_expire; + arpc.rtm.rtm_flags |= RTF_HOST; + if (lle->la_flags & LLE_STATIC) + arpc.rtm.rtm_flags |= RTF_STATIC; + arpc.rtm.rtm_index = ifp->if_index; + error = SYSCTL_OUT(wr, &arpc, sizeof(arpc)); + if (error) + break; + } + } + return error; +#undef SIN +} + +void * +in_domifattach(struct ifnet *ifp) +{ + struct lltable *llt = lltable_init(ifp, AF_INET); + + if (llt != NULL) { + llt->llt_new = in_lltable_new; + llt->llt_free = in_lltable_free; + llt->llt_rtcheck = in_lltable_rtcheck; + llt->llt_lookup = in_lltable_lookup; + llt->llt_dump = in_lltable_dump; + } + return (llt); +} + +void +in_domifdetach(struct ifnet *ifp __unused, void *aux) +{ + struct lltable *llt = (struct lltable *)aux; + + lltable_free(llt); } Index: head/sys/netinet/in_mcast.c =================================================================== --- head/sys/netinet/in_mcast.c (revision 186118) +++ head/sys/netinet/in_mcast.c (revision 186119) @@ -1,1835 +1,1835 @@ /*- * Copyright (c) 2007 Bruce M. Simpson. * Copyright (c) 2005 Robert N. M. Watson. 
* All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote * products derived from this software without specific prior written * permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* * IPv4 multicast socket, group, and socket option processing module. * Until further notice, this file requires INET to compile. * TODO: Make this infrastructure independent of address family. * TODO: Teach netinet6 to use this code. * TODO: Hook up SSM logic to IGMPv3/MLDv2. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifndef __SOCKUNION_DECLARED union sockunion { struct sockaddr_storage ss; struct sockaddr sa; struct sockaddr_dl sdl; struct sockaddr_in sin; #ifdef INET6 struct sockaddr_in6 sin6; #endif }; typedef union sockunion sockunion_t; #define __SOCKUNION_DECLARED #endif /* __SOCKUNION_DECLARED */ static MALLOC_DEFINE(M_IPMADDR, "in_multi", "IPv4 multicast group"); static MALLOC_DEFINE(M_IPMOPTS, "ip_moptions", "IPv4 multicast options"); static MALLOC_DEFINE(M_IPMSOURCE, "in_msource", "IPv4 multicast source filter"); /* * The IPv4 multicast list (in_multihead and associated structures) are * protected by the global in_multi_mtx. See in_var.h for more details. For * now, in_multi_mtx is marked as recursible due to IGMP's calling back into * ip_output() to send IGMP packets while holding the lock; this probably is * not quite desirable. 
*/ #ifdef VIMAGE_GLOBALS struct in_multihead in_multihead; /* XXX BSS initialization */ #endif struct mtx in_multi_mtx; MTX_SYSINIT(in_multi_mtx, &in_multi_mtx, "in_multi_mtx", MTX_DEF | MTX_RECURSE); /* * Functions with non-static linkage defined in this file should be * declared in in_var.h: * imo_match_group() * imo_match_source() * in_addmulti() * in_delmulti() * in_delmulti_locked() * and ip_var.h: * inp_freemoptions() * inp_getmoptions() * inp_setmoptions() */ static int imo_grow(struct ip_moptions *); static int imo_join_source(struct ip_moptions *, size_t, sockunion_t *); static int imo_leave_source(struct ip_moptions *, size_t, sockunion_t *); static int inp_change_source_filter(struct inpcb *, struct sockopt *); static struct ip_moptions * inp_findmoptions(struct inpcb *); static int inp_get_source_filters(struct inpcb *, struct sockopt *); static int inp_join_group(struct inpcb *, struct sockopt *); static int inp_leave_group(struct inpcb *, struct sockopt *); static int inp_set_multicast_if(struct inpcb *, struct sockopt *); static int inp_set_source_filters(struct inpcb *, struct sockopt *); /* * Resize the ip_moptions vector to the next power-of-two minus 1. * May be called with locks held; do not sleep. */ static int imo_grow(struct ip_moptions *imo) { struct in_multi **nmships; struct in_multi **omships; struct in_mfilter *nmfilters; struct in_mfilter *omfilters; size_t idx; size_t newmax; size_t oldmax; nmships = NULL; nmfilters = NULL; omships = imo->imo_membership; omfilters = imo->imo_mfilters; oldmax = imo->imo_max_memberships; newmax = ((oldmax + 1) * 2) - 1; if (newmax <= IP_MAX_MEMBERSHIPS) { nmships = (struct in_multi **)realloc(omships, sizeof(struct in_multi *) * newmax, M_IPMOPTS, M_NOWAIT); nmfilters = (struct in_mfilter *)realloc(omfilters, sizeof(struct in_mfilter) * newmax, M_IPMSOURCE, M_NOWAIT); if (nmships != NULL && nmfilters != NULL) { /* Initialize newly allocated source filter heads. */ for (idx = oldmax; idx < newmax; idx++) { nmfilters[idx].imf_fmode = MCAST_EXCLUDE; nmfilters[idx].imf_nsources = 0; TAILQ_INIT(&nmfilters[idx].imf_sources); } imo->imo_max_memberships = newmax; imo->imo_membership = nmships; imo->imo_mfilters = nmfilters; } } if (nmships == NULL || nmfilters == NULL) { if (nmships != NULL) free(nmships, M_IPMOPTS); if (nmfilters != NULL) free(nmfilters, M_IPMSOURCE); return (ETOOMANYREFS); } return (0); } /* * Add a source to a multicast filter list. * Assumes the associated inpcb is locked. */ static int imo_join_source(struct ip_moptions *imo, size_t gidx, sockunion_t *src) { struct in_msource *ims, *nims; struct in_mfilter *imf; KASSERT(src->ss.ss_family == AF_INET, ("%s: !AF_INET", __func__)); KASSERT(imo->imo_mfilters != NULL, ("%s: imo_mfilters vector not allocated", __func__)); imf = &imo->imo_mfilters[gidx]; if (imf->imf_nsources == IP_MAX_SOURCE_FILTER) return (ENOBUFS); ims = imo_match_source(imo, gidx, &src->sa); if (ims != NULL) return (EADDRNOTAVAIL); /* Do not sleep with inp lock held. 
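 *
 * Editor's note, not part of the original source: the M_NOWAIT allocations
 * here and in imo_grow() above exist for the same reason.  imo_grow() keeps
 * the vector size of the form 2^n - 1, since newmax = ((oldmax + 1) * 2) - 1
 * (e.g. 15 grows to 31, then 63), and it returns ETOOMANYREFS once the next
 * step would exceed IP_MAX_MEMBERSHIPS or the reallocation fails.
 *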
*/ nims = malloc(sizeof(struct in_msource), M_IPMSOURCE, M_NOWAIT | M_ZERO); if (nims == NULL) return (ENOBUFS); nims->ims_addr = src->ss; TAILQ_INSERT_TAIL(&imf->imf_sources, nims, ims_next); imf->imf_nsources++; return (0); } static int imo_leave_source(struct ip_moptions *imo, size_t gidx, sockunion_t *src) { struct in_msource *ims; struct in_mfilter *imf; KASSERT(src->ss.ss_family == AF_INET, ("%s: !AF_INET", __func__)); KASSERT(imo->imo_mfilters != NULL, ("%s: imo_mfilters vector not allocated", __func__)); imf = &imo->imo_mfilters[gidx]; if (imf->imf_nsources == IP_MAX_SOURCE_FILTER) return (ENOBUFS); ims = imo_match_source(imo, gidx, &src->sa); if (ims == NULL) return (EADDRNOTAVAIL); TAILQ_REMOVE(&imf->imf_sources, ims, ims_next); free(ims, M_IPMSOURCE); imf->imf_nsources--; return (0); } /* * Find an IPv4 multicast group entry for this ip_moptions instance * which matches the specified group, and optionally an interface. * Return its index into the array, or -1 if not found. */ size_t imo_match_group(struct ip_moptions *imo, struct ifnet *ifp, struct sockaddr *group) { sockunion_t *gsa; struct in_multi **pinm; int idx; int nmships; gsa = (sockunion_t *)group; /* The imo_membership array may be lazy allocated. */ if (imo->imo_membership == NULL || imo->imo_num_memberships == 0) return (-1); nmships = imo->imo_num_memberships; pinm = &imo->imo_membership[0]; for (idx = 0; idx < nmships; idx++, pinm++) { if (*pinm == NULL) continue; #if 0 printf("%s: trying ifp = %p, inaddr = %s ", __func__, ifp, inet_ntoa(gsa->sin.sin_addr)); printf("against %p, %s\n", (*pinm)->inm_ifp, inet_ntoa((*pinm)->inm_addr)); #endif if ((ifp == NULL || ((*pinm)->inm_ifp == ifp)) && (*pinm)->inm_addr.s_addr == gsa->sin.sin_addr.s_addr) { break; } } if (idx >= nmships) idx = -1; return (idx); } /* * Find a multicast source entry for this imo which matches * the given group index for this socket, and source address. */ struct in_msource * imo_match_source(struct ip_moptions *imo, size_t gidx, struct sockaddr *src) { struct in_mfilter *imf; struct in_msource *ims, *pims; KASSERT(src->sa_family == AF_INET, ("%s: !AF_INET", __func__)); KASSERT(gidx != -1 && gidx < imo->imo_num_memberships, ("%s: invalid index %d\n", __func__, (int)gidx)); /* The imo_mfilters array may be lazy allocated. */ if (imo->imo_mfilters == NULL) return (NULL); pims = NULL; imf = &imo->imo_mfilters[gidx]; TAILQ_FOREACH(ims, &imf->imf_sources, ims_next) { /* * Perform bitwise comparison of two IPv4 addresses. * TODO: Do the same for IPv6. * Do not use sa_equal() for this as it is not aware of * deeper structure in sockaddr_in or sockaddr_in6. */ if (((struct sockaddr_in *)&ims->ims_addr)->sin_addr.s_addr == ((struct sockaddr_in *)src)->sin_addr.s_addr) { pims = ims; break; } } return (pims); } /* * Join an IPv4 multicast group. */ struct in_multi * in_addmulti(struct in_addr *ap, struct ifnet *ifp) { INIT_VNET_INET(ifp->if_vnet); struct in_multi *inm; inm = NULL; IFF_LOCKGIANT(ifp); IN_MULTI_LOCK(); IN_LOOKUP_MULTI(*ap, ifp, inm); if (inm != NULL) { /* * If we already joined this group, just bump the * refcount and return it. 
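 *
 * Editor's sketch, not part of the original source: in-kernel consumers
 * pair in_addmulti() with in_delmulti(), as in_control() does for the
 * all-hosts group when the first address is configured.  Locking aside, a
 * hedged sketch for a hypothetical consumer with an ifnet pointer "ifp":
 *
 *	struct in_addr grp;
 *	struct in_multi *inm;
 *
 *	grp.s_addr = htonl(INADDR_ALLHOSTS_GROUP);
 *	inm = in_addmulti(&grp, ifp);		(NULL on failure)
 *	...
 *	if (inm != NULL)
 *		in_delmulti(inm);
 *
 * in_delmulti() drops the reference and performs the IGMP leave when the
 * last reference goes away.
 *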
*/ KASSERT(inm->inm_refcount >= 1, ("%s: bad refcount %d", __func__, inm->inm_refcount)); ++inm->inm_refcount; } else do { sockunion_t gsa; struct ifmultiaddr *ifma; struct in_multi *ninm; int error; memset(&gsa, 0, sizeof(gsa)); gsa.sin.sin_family = AF_INET; gsa.sin.sin_len = sizeof(struct sockaddr_in); gsa.sin.sin_addr = *ap; /* * Check if a link-layer group is already associated * with this network-layer group on the given ifnet. * If so, bump the refcount on the existing network-layer * group association and return it. */ error = if_addmulti(ifp, &gsa.sa, &ifma); if (error) break; if (ifma->ifma_protospec != NULL) { inm = (struct in_multi *)ifma->ifma_protospec; #ifdef INVARIANTS if (inm->inm_ifma != ifma || inm->inm_ifp != ifp || inm->inm_addr.s_addr != ap->s_addr) panic("%s: ifma is inconsistent", __func__); #endif ++inm->inm_refcount; break; } /* * A new membership is needed; construct it and * perform the IGMP join. */ ninm = malloc(sizeof(*ninm), M_IPMADDR, M_NOWAIT | M_ZERO); if (ninm == NULL) { if_delmulti_ifma(ifma); break; } ninm->inm_addr = *ap; ninm->inm_ifp = ifp; ninm->inm_ifma = ifma; ninm->inm_refcount = 1; ifma->ifma_protospec = ninm; LIST_INSERT_HEAD(&V_in_multihead, ninm, inm_link); igmp_joingroup(ninm); inm = ninm; } while (0); IN_MULTI_UNLOCK(); IFF_UNLOCKGIANT(ifp); return (inm); } /* * Leave an IPv4 multicast group. * It is OK to call this routine if the underlying ifnet went away. * * XXX: To deal with the ifp going away, we cheat; the link-layer code in net * will set ifma_ifp to NULL when the associated ifnet instance is detached * from the system. * * The only reason we need to violate layers and check ifma_ifp here at all * is because certain hardware drivers still require Giant to be held, * and it must always be taken before other locks. */ void in_delmulti(struct in_multi *inm) { struct ifnet *ifp; KASSERT(inm != NULL, ("%s: inm is NULL", __func__)); KASSERT(inm->inm_ifma != NULL, ("%s: no ifma", __func__)); ifp = inm->inm_ifma->ifma_ifp; if (ifp != NULL) { /* * Sanity check that netinet's notion of ifp is the * same as net's. */ KASSERT(inm->inm_ifp == ifp, ("%s: bad ifp", __func__)); IFF_LOCKGIANT(ifp); } IN_MULTI_LOCK(); in_delmulti_locked(inm); IN_MULTI_UNLOCK(); if (ifp != NULL) IFF_UNLOCKGIANT(ifp); } /* * Delete a multicast address record, with locks held. * * It is OK to call this routine if the ifp went away. * Assumes that caller holds the IN_MULTI lock, and that * Giant was taken before other locks if required by the hardware. */ void in_delmulti_locked(struct in_multi *inm) { struct ifmultiaddr *ifma; IN_MULTI_LOCK_ASSERT(); KASSERT(inm->inm_refcount >= 1, ("%s: freeing freed inm", __func__)); if (--inm->inm_refcount == 0) { igmp_leavegroup(inm); ifma = inm->inm_ifma; #ifdef DIAGNOSTIC if (bootverbose) printf("%s: purging ifma %p\n", __func__, ifma); #endif KASSERT(ifma->ifma_protospec == inm, ("%s: ifma_protospec != inm", __func__)); ifma->ifma_protospec = NULL; LIST_REMOVE(inm, inm_link); free(inm, M_IPMADDR); if_delmulti_ifma(ifma); } } /* * Block or unblock an ASM/SSM multicast source on an inpcb. 
*/ static int inp_change_source_filter(struct inpcb *inp, struct sockopt *sopt) { INIT_VNET_NET(curvnet); INIT_VNET_INET(curvnet); struct group_source_req gsr; sockunion_t *gsa, *ssa; struct ifnet *ifp; struct in_mfilter *imf; struct ip_moptions *imo; struct in_msource *ims; size_t idx; int error; int block; ifp = NULL; error = 0; block = 0; memset(&gsr, 0, sizeof(struct group_source_req)); gsa = (sockunion_t *)&gsr.gsr_group; ssa = (sockunion_t *)&gsr.gsr_source; switch (sopt->sopt_name) { case IP_BLOCK_SOURCE: case IP_UNBLOCK_SOURCE: { struct ip_mreq_source mreqs; error = sooptcopyin(sopt, &mreqs, sizeof(struct ip_mreq_source), sizeof(struct ip_mreq_source)); if (error) return (error); gsa->sin.sin_family = AF_INET; gsa->sin.sin_len = sizeof(struct sockaddr_in); gsa->sin.sin_addr = mreqs.imr_multiaddr; ssa->sin.sin_family = AF_INET; ssa->sin.sin_len = sizeof(struct sockaddr_in); ssa->sin.sin_addr = mreqs.imr_sourceaddr; if (mreqs.imr_interface.s_addr != INADDR_ANY) INADDR_TO_IFP(mreqs.imr_interface, ifp); if (sopt->sopt_name == IP_BLOCK_SOURCE) block = 1; #ifdef DIAGNOSTIC if (bootverbose) { printf("%s: imr_interface = %s, ifp = %p\n", __func__, inet_ntoa(mreqs.imr_interface), ifp); } #endif break; } case MCAST_BLOCK_SOURCE: case MCAST_UNBLOCK_SOURCE: error = sooptcopyin(sopt, &gsr, sizeof(struct group_source_req), sizeof(struct group_source_req)); if (error) return (error); if (gsa->sin.sin_family != AF_INET || gsa->sin.sin_len != sizeof(struct sockaddr_in)) return (EINVAL); if (ssa->sin.sin_family != AF_INET || ssa->sin.sin_len != sizeof(struct sockaddr_in)) return (EINVAL); if (gsr.gsr_interface == 0 || V_if_index < gsr.gsr_interface) return (EADDRNOTAVAIL); ifp = ifnet_byindex(gsr.gsr_interface); if (sopt->sopt_name == MCAST_BLOCK_SOURCE) block = 1; break; default: #ifdef DIAGNOSTIC if (bootverbose) { printf("%s: unknown sopt_name %d\n", __func__, sopt->sopt_name); } #endif return (EOPNOTSUPP); break; } /* XXX INET6 */ if (!IN_MULTICAST(ntohl(gsa->sin.sin_addr.s_addr))) return (EINVAL); /* * Check if we are actually a member of this group. */ imo = inp_findmoptions(inp); idx = imo_match_group(imo, ifp, &gsa->sa); if (idx == -1 || imo->imo_mfilters == NULL) { error = EADDRNOTAVAIL; goto out_locked; } KASSERT(imo->imo_mfilters != NULL, ("%s: imo_mfilters not allocated", __func__)); imf = &imo->imo_mfilters[idx]; /* * SSM multicast truth table for block/unblock operations. * * Operation Filter Mode Entry exists? Action * * block exclude no add source to filter * unblock include no add source to filter * block include no EINVAL * unblock exclude no EINVAL * block exclude yes EADDRNOTAVAIL * unblock include yes EADDRNOTAVAIL * block include yes remove source from filter * unblock exclude yes remove source from filter * * FreeBSD does not explicitly distinguish between ASM and SSM * mode sockets; all sockets are assumed to have a filter list. */ #ifdef DIAGNOSTIC if (bootverbose) { printf("%s: imf_fmode is %s\n", __func__, imf->imf_fmode == MCAST_INCLUDE ? 
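/*
 * Editor's sketch, not part of the original source: the block/unblock truth
 * table above is driven from userland with setsockopt(2).  A hedged example
 * that blocks one source on an already-joined ASM group; the socket "s",
 * the earlier join, and <sys/socket.h>, <netinet/in.h>, <arpa/inet.h>,
 * <string.h> are assumed:
 *
 *	struct ip_mreq_source mreqs;
 *
 *	memset(&mreqs, 0, sizeof(mreqs));
 *	inet_pton(AF_INET, "233.252.0.1", &mreqs.imr_multiaddr);
 *	inet_pton(AF_INET, "192.0.2.7", &mreqs.imr_sourceaddr);
 *	mreqs.imr_interface.s_addr = INADDR_ANY;
 *	(void)setsockopt(s, IPPROTO_IP, IP_BLOCK_SOURCE,
 *	    &mreqs, sizeof(mreqs));
 *
 * IP_UNBLOCK_SOURCE with the same arguments removes the entry again, per
 * the table above.
 */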
"include" : "exclude"); } #endif ims = imo_match_source(imo, idx, &ssa->sa); if (ims == NULL) { if ((block == 1 && imf->imf_fmode == MCAST_EXCLUDE) || (block == 0 && imf->imf_fmode == MCAST_INCLUDE)) { #ifdef DIAGNOSTIC if (bootverbose) { printf("%s: adding %s to filter list\n", __func__, inet_ntoa(ssa->sin.sin_addr)); } #endif error = imo_join_source(imo, idx, ssa); } if ((block == 1 && imf->imf_fmode == MCAST_INCLUDE) || (block == 0 && imf->imf_fmode == MCAST_EXCLUDE)) { /* * If the socket is in inclusive mode: * the source is already blocked as it has no entry. * If the socket is in exclusive mode: * the source is already unblocked as it has no entry. */ #ifdef DIAGNOSTIC if (bootverbose) { printf("%s: ims %p; %s already [un]blocked\n", __func__, ims, inet_ntoa(ssa->sin.sin_addr)); } #endif error = EINVAL; } } else { if ((block == 1 && imf->imf_fmode == MCAST_EXCLUDE) || (block == 0 && imf->imf_fmode == MCAST_INCLUDE)) { /* * If the socket is in exclusive mode: * the source is already blocked as it has an entry. * If the socket is in inclusive mode: * the source is already unblocked as it has an entry. */ #ifdef DIAGNOSTIC if (bootverbose) { printf("%s: ims %p; %s already [un]blocked\n", __func__, ims, inet_ntoa(ssa->sin.sin_addr)); } #endif error = EADDRNOTAVAIL; } if ((block == 1 && imf->imf_fmode == MCAST_INCLUDE) || (block == 0 && imf->imf_fmode == MCAST_EXCLUDE)) { #ifdef DIAGNOSTIC if (bootverbose) { printf("%s: removing %s from filter list\n", __func__, inet_ntoa(ssa->sin.sin_addr)); } #endif error = imo_leave_source(imo, idx, ssa); } } out_locked: INP_WUNLOCK(inp); return (error); } /* * Given an inpcb, return its multicast options structure pointer. Accepts * an unlocked inpcb pointer, but will return it locked. May sleep. */ static struct ip_moptions * inp_findmoptions(struct inpcb *inp) { struct ip_moptions *imo; struct in_multi **immp; struct in_mfilter *imfp; size_t idx; INP_WLOCK(inp); if (inp->inp_moptions != NULL) return (inp->inp_moptions); INP_WUNLOCK(inp); imo = (struct ip_moptions *)malloc(sizeof(*imo), M_IPMOPTS, M_WAITOK); immp = (struct in_multi **)malloc(sizeof(*immp) * IP_MIN_MEMBERSHIPS, M_IPMOPTS, M_WAITOK | M_ZERO); imfp = (struct in_mfilter *)malloc( sizeof(struct in_mfilter) * IP_MIN_MEMBERSHIPS, M_IPMSOURCE, M_WAITOK); imo->imo_multicast_ifp = NULL; imo->imo_multicast_addr.s_addr = INADDR_ANY; imo->imo_multicast_vif = -1; imo->imo_multicast_ttl = IP_DEFAULT_MULTICAST_TTL; imo->imo_multicast_loop = IP_DEFAULT_MULTICAST_LOOP; imo->imo_num_memberships = 0; imo->imo_max_memberships = IP_MIN_MEMBERSHIPS; imo->imo_membership = immp; /* Initialize per-group source filters. */ for (idx = 0; idx < IP_MIN_MEMBERSHIPS; idx++) { imfp[idx].imf_fmode = MCAST_EXCLUDE; imfp[idx].imf_nsources = 0; TAILQ_INIT(&imfp[idx].imf_sources); } imo->imo_mfilters = imfp; INP_WLOCK(inp); if (inp->inp_moptions != NULL) { free(imfp, M_IPMSOURCE); free(immp, M_IPMOPTS); free(imo, M_IPMOPTS); return (inp->inp_moptions); } inp->inp_moptions = imo; return (imo); } /* * Discard the IP multicast options (and source filters). 
*/ void inp_freemoptions(struct ip_moptions *imo) { struct in_mfilter *imf; struct in_msource *ims, *tims; size_t idx, nmships; KASSERT(imo != NULL, ("%s: ip_moptions is NULL", __func__)); nmships = imo->imo_num_memberships; for (idx = 0; idx < nmships; ++idx) { in_delmulti(imo->imo_membership[idx]); if (imo->imo_mfilters != NULL) { imf = &imo->imo_mfilters[idx]; TAILQ_FOREACH_SAFE(ims, &imf->imf_sources, ims_next, tims) { TAILQ_REMOVE(&imf->imf_sources, ims, ims_next); free(ims, M_IPMSOURCE); imf->imf_nsources--; } KASSERT(imf->imf_nsources == 0, ("%s: did not free all imf_nsources", __func__)); } } if (imo->imo_mfilters != NULL) free(imo->imo_mfilters, M_IPMSOURCE); free(imo->imo_membership, M_IPMOPTS); free(imo, M_IPMOPTS); } /* * Atomically get source filters on a socket for an IPv4 multicast group. * Called with INP lock held; returns with lock released. */ static int inp_get_source_filters(struct inpcb *inp, struct sockopt *sopt) { INIT_VNET_NET(curvnet); struct __msfilterreq msfr; sockunion_t *gsa; struct ifnet *ifp; struct ip_moptions *imo; struct in_mfilter *imf; struct in_msource *ims; struct sockaddr_storage *ptss; struct sockaddr_storage *tss; int error; size_t idx; INP_WLOCK_ASSERT(inp); imo = inp->inp_moptions; KASSERT(imo != NULL, ("%s: null ip_moptions", __func__)); INP_WUNLOCK(inp); error = sooptcopyin(sopt, &msfr, sizeof(struct __msfilterreq), sizeof(struct __msfilterreq)); if (error) return (error); if (msfr.msfr_ifindex == 0 || V_if_index < msfr.msfr_ifindex) return (EINVAL); ifp = ifnet_byindex(msfr.msfr_ifindex); if (ifp == NULL) return (EINVAL); INP_WLOCK(inp); /* * Lookup group on the socket. */ gsa = (sockunion_t *)&msfr.msfr_group; idx = imo_match_group(imo, ifp, &gsa->sa); if (idx == -1 || imo->imo_mfilters == NULL) { INP_WUNLOCK(inp); return (EADDRNOTAVAIL); } imf = &imo->imo_mfilters[idx]; msfr.msfr_fmode = imf->imf_fmode; msfr.msfr_nsrcs = imf->imf_nsources; /* * If the user specified a buffer, copy out the source filter * entries to userland gracefully. * msfr.msfr_nsrcs is always set to the total number of filter * entries which the kernel currently has for this group. */ tss = NULL; if (msfr.msfr_srcs != NULL && msfr.msfr_nsrcs > 0) { /* * Make a copy of the source vector so that we do not * thrash the inpcb lock whilst copying it out. * We only copy out the number of entries which userland * has asked for, but we always tell userland how big the * buffer really needs to be. */ tss = malloc(sizeof(struct sockaddr_storage) * msfr.msfr_nsrcs, M_TEMP, M_NOWAIT); if (tss == NULL) { error = ENOBUFS; } else { ptss = tss; TAILQ_FOREACH(ims, &imf->imf_sources, ims_next) { memcpy(ptss++, &ims->ims_addr, sizeof(struct sockaddr_storage)); } } } INP_WUNLOCK(inp); if (tss != NULL) { error = copyout(tss, msfr.msfr_srcs, sizeof(struct sockaddr_storage) * msfr.msfr_nsrcs); free(tss, M_TEMP); } if (error) return (error); error = sooptcopyout(sopt, &msfr, sizeof(struct __msfilterreq)); return (error); } /* * Return the IP multicast options in response to user getsockopt(). */ int inp_getmoptions(struct inpcb *inp, struct sockopt *sopt) { INIT_VNET_INET(curvnet); struct ip_mreqn mreqn; struct ip_moptions *imo; struct ifnet *ifp; struct in_ifaddr *ia; int error, optval; u_char coptval; INP_WLOCK(inp); imo = inp->inp_moptions; /* * If socket is neither of type SOCK_RAW or SOCK_DGRAM, * or is a divert socket, reject it. 
*/ if (inp->inp_socket->so_proto->pr_protocol == IPPROTO_DIVERT || (inp->inp_socket->so_proto->pr_type != SOCK_RAW && inp->inp_socket->so_proto->pr_type != SOCK_DGRAM)) { INP_WUNLOCK(inp); return (EOPNOTSUPP); } error = 0; switch (sopt->sopt_name) { case IP_MULTICAST_VIF: if (imo != NULL) optval = imo->imo_multicast_vif; else optval = -1; INP_WUNLOCK(inp); error = sooptcopyout(sopt, &optval, sizeof(int)); break; case IP_MULTICAST_IF: memset(&mreqn, 0, sizeof(struct ip_mreqn)); if (imo != NULL) { ifp = imo->imo_multicast_ifp; if (imo->imo_multicast_addr.s_addr != INADDR_ANY) { mreqn.imr_address = imo->imo_multicast_addr; } else if (ifp != NULL) { mreqn.imr_ifindex = ifp->if_index; IFP_TO_IA(ifp, ia); if (ia != NULL) { mreqn.imr_address = IA_SIN(ia)->sin_addr; } } } INP_WUNLOCK(inp); if (sopt->sopt_valsize == sizeof(struct ip_mreqn)) { error = sooptcopyout(sopt, &mreqn, sizeof(struct ip_mreqn)); } else { error = sooptcopyout(sopt, &mreqn.imr_address, sizeof(struct in_addr)); } break; case IP_MULTICAST_TTL: if (imo == 0) optval = coptval = IP_DEFAULT_MULTICAST_TTL; else optval = coptval = imo->imo_multicast_ttl; INP_WUNLOCK(inp); if (sopt->sopt_valsize == sizeof(u_char)) error = sooptcopyout(sopt, &coptval, sizeof(u_char)); else error = sooptcopyout(sopt, &optval, sizeof(int)); break; case IP_MULTICAST_LOOP: if (imo == 0) optval = coptval = IP_DEFAULT_MULTICAST_LOOP; else optval = coptval = imo->imo_multicast_loop; INP_WUNLOCK(inp); if (sopt->sopt_valsize == sizeof(u_char)) error = sooptcopyout(sopt, &coptval, sizeof(u_char)); else error = sooptcopyout(sopt, &optval, sizeof(int)); break; case IP_MSFILTER: if (imo == NULL) { error = EADDRNOTAVAIL; INP_WUNLOCK(inp); } else { error = inp_get_source_filters(inp, sopt); } break; default: INP_WUNLOCK(inp); error = ENOPROTOOPT; break; } INP_UNLOCK_ASSERT(inp); return (error); } /* * Join an IPv4 multicast group, possibly with a source. */ static int inp_join_group(struct inpcb *inp, struct sockopt *sopt) { INIT_VNET_NET(curvnet); INIT_VNET_INET(curvnet); struct group_source_req gsr; sockunion_t *gsa, *ssa; struct ifnet *ifp; struct in_mfilter *imf; struct ip_moptions *imo; struct in_multi *inm; size_t idx; int error; ifp = NULL; error = 0; memset(&gsr, 0, sizeof(struct group_source_req)); gsa = (sockunion_t *)&gsr.gsr_group; gsa->ss.ss_family = AF_UNSPEC; ssa = (sockunion_t *)&gsr.gsr_source; ssa->ss.ss_family = AF_UNSPEC; switch (sopt->sopt_name) { case IP_ADD_MEMBERSHIP: case IP_ADD_SOURCE_MEMBERSHIP: { struct ip_mreq_source mreqs; if (sopt->sopt_name == IP_ADD_MEMBERSHIP) { error = sooptcopyin(sopt, &mreqs, sizeof(struct ip_mreq), sizeof(struct ip_mreq)); /* * Do argument switcharoo from ip_mreq into * ip_mreq_source to avoid using two instances. */ mreqs.imr_interface = mreqs.imr_sourceaddr; mreqs.imr_sourceaddr.s_addr = INADDR_ANY; } else if (sopt->sopt_name == IP_ADD_SOURCE_MEMBERSHIP) { error = sooptcopyin(sopt, &mreqs, sizeof(struct ip_mreq_source), sizeof(struct ip_mreq_source)); } if (error) return (error); gsa->sin.sin_family = AF_INET; gsa->sin.sin_len = sizeof(struct sockaddr_in); gsa->sin.sin_addr = mreqs.imr_multiaddr; if (sopt->sopt_name == IP_ADD_SOURCE_MEMBERSHIP) { ssa->sin.sin_family = AF_INET; ssa->sin.sin_len = sizeof(struct sockaddr_in); ssa->sin.sin_addr = mreqs.imr_sourceaddr; } /* * Obtain ifp. If no interface address was provided, * use the interface of the route in the unicast FIB for * the given multicast destination; usually, this is the * default route. 
* If this lookup fails, attempt to use the first non-loopback * interface with multicast capability in the system as a * last resort. The legacy IPv4 ASM API requires that we do * this in order to allow groups to be joined when the routing * table has not yet been populated during boot. * If all of these conditions fail, return EADDRNOTAVAIL, and * reject the IPv4 multicast join. */ if (mreqs.imr_interface.s_addr != INADDR_ANY) { INADDR_TO_IFP(mreqs.imr_interface, ifp); } else { struct route ro; ro.ro_rt = NULL; *(struct sockaddr_in *)&ro.ro_dst = gsa->sin; - in_rtalloc_ign(&ro, RTF_CLONING, + in_rtalloc_ign(&ro, 0, inp->inp_inc.inc_fibnum); if (ro.ro_rt != NULL) { ifp = ro.ro_rt->rt_ifp; KASSERT(ifp != NULL, ("%s: null ifp", __func__)); RTFREE(ro.ro_rt); } else { struct in_ifaddr *ia; struct ifnet *mfp = NULL; TAILQ_FOREACH(ia, &V_in_ifaddrhead, ia_link) { mfp = ia->ia_ifp; if (!(mfp->if_flags & IFF_LOOPBACK) && (mfp->if_flags & IFF_MULTICAST)) { ifp = mfp; break; } } } } #ifdef DIAGNOSTIC if (bootverbose) { printf("%s: imr_interface = %s, ifp = %p\n", __func__, inet_ntoa(mreqs.imr_interface), ifp); } #endif break; } case MCAST_JOIN_GROUP: case MCAST_JOIN_SOURCE_GROUP: if (sopt->sopt_name == MCAST_JOIN_GROUP) { error = sooptcopyin(sopt, &gsr, sizeof(struct group_req), sizeof(struct group_req)); } else if (sopt->sopt_name == MCAST_JOIN_SOURCE_GROUP) { error = sooptcopyin(sopt, &gsr, sizeof(struct group_source_req), sizeof(struct group_source_req)); } if (error) return (error); if (gsa->sin.sin_family != AF_INET || gsa->sin.sin_len != sizeof(struct sockaddr_in)) return (EINVAL); /* * Overwrite the port field if present, as the sockaddr * being copied in may be matched with a binary comparison. * XXX INET6 */ gsa->sin.sin_port = 0; if (sopt->sopt_name == MCAST_JOIN_SOURCE_GROUP) { if (ssa->sin.sin_family != AF_INET || ssa->sin.sin_len != sizeof(struct sockaddr_in)) return (EINVAL); ssa->sin.sin_port = 0; } /* * Obtain the ifp. */ if (gsr.gsr_interface == 0 || V_if_index < gsr.gsr_interface) return (EADDRNOTAVAIL); ifp = ifnet_byindex(gsr.gsr_interface); break; default: #ifdef DIAGNOSTIC if (bootverbose) { printf("%s: unknown sopt_name %d\n", __func__, sopt->sopt_name); } #endif return (EOPNOTSUPP); break; } if (!IN_MULTICAST(ntohl(gsa->sin.sin_addr.s_addr))) return (EINVAL); if (ifp == NULL || (ifp->if_flags & IFF_MULTICAST) == 0) return (EADDRNOTAVAIL); /* * Check if we already hold membership of this group for this inpcb. * If so, we do not need to perform the initial join. */ imo = inp_findmoptions(inp); idx = imo_match_group(imo, ifp, &gsa->sa); if (idx != -1) { if (ssa->ss.ss_family != AF_UNSPEC) { /* * Attempting to join an ASM group (when already * an ASM or SSM member) is an error. */ error = EADDRNOTAVAIL; } else { imf = &imo->imo_mfilters[idx]; if (imf->imf_nsources == 0) { /* * Attempting to join an SSM group (when * already an ASM member) is an error. */ error = EINVAL; } else { /* * Attempting to join an SSM group (when * already an SSM member) means "add this * source to the inclusive filter list". */ error = imo_join_source(imo, idx, ssa); } } goto out_locked; } /* * Call imo_grow() to reallocate the membership and source filter * vectors if they are full. If the size would exceed the hard limit, * then we know we've really run out of entries. We keep the INP * lock held to avoid introducing a race condition. 
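 *
 * Editor's sketch, not part of the original source: the protocol-independent
 * join handled here is driven by MCAST_JOIN_GROUP and
 * MCAST_JOIN_SOURCE_GROUP.  A hedged SSM join; the socket "s", a valid
 * interface index "ifindex" and the usual socket headers are assumed:
 *
 *	struct group_source_req gsr;
 *	struct sockaddr_in *grp = (struct sockaddr_in *)&gsr.gsr_group;
 *	struct sockaddr_in *src = (struct sockaddr_in *)&gsr.gsr_source;
 *
 *	memset(&gsr, 0, sizeof(gsr));
 *	gsr.gsr_interface = ifindex;
 *	grp->sin_family = AF_INET;
 *	grp->sin_len = sizeof(*grp);
 *	inet_pton(AF_INET, "232.1.2.3", &grp->sin_addr);
 *	src->sin_family = AF_INET;
 *	src->sin_len = sizeof(*src);
 *	inet_pton(AF_INET, "192.0.2.7", &src->sin_addr);
 *	(void)setsockopt(s, IPPROTO_IP, MCAST_JOIN_SOURCE_GROUP,
 *	    &gsr, sizeof(gsr));
 *
 * If the socket already holds a membership for this group, the idx != -1
 * branch above decides whether the request is folded into the existing
 * source filter or rejected.
 *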
*/ if (imo->imo_num_memberships == imo->imo_max_memberships) { error = imo_grow(imo); if (error) goto out_locked; } /* * So far, so good: perform the layer 3 join, layer 2 join, * and make an IGMP announcement if needed. */ inm = in_addmulti(&gsa->sin.sin_addr, ifp); if (inm == NULL) { error = ENOBUFS; goto out_locked; } idx = imo->imo_num_memberships; imo->imo_membership[idx] = inm; imo->imo_num_memberships++; KASSERT(imo->imo_mfilters != NULL, ("%s: imf_mfilters vector was not allocated", __func__)); imf = &imo->imo_mfilters[idx]; KASSERT(TAILQ_EMPTY(&imf->imf_sources), ("%s: imf_sources not empty", __func__)); /* * If this is a new SSM group join (i.e. a source was specified * with this group), add this source to the filter list. */ if (ssa->ss.ss_family != AF_UNSPEC) { /* * An initial SSM join implies that this socket's membership * of the multicast group is now in inclusive mode. */ imf->imf_fmode = MCAST_INCLUDE; error = imo_join_source(imo, idx, ssa); if (error) { /* * Drop inp lock before calling in_delmulti(), * to prevent a lock order reversal. */ --imo->imo_num_memberships; INP_WUNLOCK(inp); in_delmulti(inm); return (error); } } out_locked: INP_WUNLOCK(inp); return (error); } /* * Leave an IPv4 multicast group on an inpcb, possibly with a source. */ static int inp_leave_group(struct inpcb *inp, struct sockopt *sopt) { INIT_VNET_NET(curvnet); INIT_VNET_INET(curvnet); struct group_source_req gsr; struct ip_mreq_source mreqs; sockunion_t *gsa, *ssa; struct ifnet *ifp; struct in_mfilter *imf; struct ip_moptions *imo; struct in_msource *ims, *tims; struct in_multi *inm; size_t idx; int error; ifp = NULL; error = 0; memset(&gsr, 0, sizeof(struct group_source_req)); gsa = (sockunion_t *)&gsr.gsr_group; gsa->ss.ss_family = AF_UNSPEC; ssa = (sockunion_t *)&gsr.gsr_source; ssa->ss.ss_family = AF_UNSPEC; switch (sopt->sopt_name) { case IP_DROP_MEMBERSHIP: case IP_DROP_SOURCE_MEMBERSHIP: if (sopt->sopt_name == IP_DROP_MEMBERSHIP) { error = sooptcopyin(sopt, &mreqs, sizeof(struct ip_mreq), sizeof(struct ip_mreq)); /* * Swap interface and sourceaddr arguments, * as ip_mreq and ip_mreq_source are laid * out differently. 
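 *
 * Editor's note, not part of the original source: struct ip_mreq carries
 * only { imr_multiaddr, imr_interface }, while struct ip_mreq_source is
 * { imr_multiaddr, imr_sourceaddr, imr_interface }, so after the 8-byte
 * copyin the interface address sits in imr_sourceaddr and has to be moved;
 * that is what the two assignments below do.
 *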
*/ mreqs.imr_interface = mreqs.imr_sourceaddr; mreqs.imr_sourceaddr.s_addr = INADDR_ANY; } else if (sopt->sopt_name == IP_DROP_SOURCE_MEMBERSHIP) { error = sooptcopyin(sopt, &mreqs, sizeof(struct ip_mreq_source), sizeof(struct ip_mreq_source)); } if (error) return (error); gsa->sin.sin_family = AF_INET; gsa->sin.sin_len = sizeof(struct sockaddr_in); gsa->sin.sin_addr = mreqs.imr_multiaddr; if (sopt->sopt_name == IP_DROP_SOURCE_MEMBERSHIP) { ssa->sin.sin_family = AF_INET; ssa->sin.sin_len = sizeof(struct sockaddr_in); ssa->sin.sin_addr = mreqs.imr_sourceaddr; } if (gsa->sin.sin_addr.s_addr != INADDR_ANY) INADDR_TO_IFP(mreqs.imr_interface, ifp); #ifdef DIAGNOSTIC if (bootverbose) { printf("%s: imr_interface = %s, ifp = %p\n", __func__, inet_ntoa(mreqs.imr_interface), ifp); } #endif break; case MCAST_LEAVE_GROUP: case MCAST_LEAVE_SOURCE_GROUP: if (sopt->sopt_name == MCAST_LEAVE_GROUP) { error = sooptcopyin(sopt, &gsr, sizeof(struct group_req), sizeof(struct group_req)); } else if (sopt->sopt_name == MCAST_LEAVE_SOURCE_GROUP) { error = sooptcopyin(sopt, &gsr, sizeof(struct group_source_req), sizeof(struct group_source_req)); } if (error) return (error); if (gsa->sin.sin_family != AF_INET || gsa->sin.sin_len != sizeof(struct sockaddr_in)) return (EINVAL); if (sopt->sopt_name == MCAST_LEAVE_SOURCE_GROUP) { if (ssa->sin.sin_family != AF_INET || ssa->sin.sin_len != sizeof(struct sockaddr_in)) return (EINVAL); } if (gsr.gsr_interface == 0 || V_if_index < gsr.gsr_interface) return (EADDRNOTAVAIL); ifp = ifnet_byindex(gsr.gsr_interface); break; default: #ifdef DIAGNOSTIC if (bootverbose) { printf("%s: unknown sopt_name %d\n", __func__, sopt->sopt_name); } #endif return (EOPNOTSUPP); break; } if (!IN_MULTICAST(ntohl(gsa->sin.sin_addr.s_addr))) return (EINVAL); /* * Find the membership in the membership array. */ imo = inp_findmoptions(inp); idx = imo_match_group(imo, ifp, &gsa->sa); if (idx == -1) { error = EADDRNOTAVAIL; goto out_locked; } imf = &imo->imo_mfilters[idx]; /* * If we were instructed only to leave a given source, do so. */ if (ssa->ss.ss_family != AF_UNSPEC) { if (imf->imf_nsources == 0 || imf->imf_fmode == MCAST_EXCLUDE) { /* * Attempting to SSM leave an ASM group * is an error; should use *_BLOCK_SOURCE instead. * Attempting to SSM leave a source in a group when * the socket is in 'exclude mode' is also an error. */ error = EINVAL; } else { error = imo_leave_source(imo, idx, ssa); } /* * If an error occurred, or this source is not the last * source in the group, do not leave the whole group. */ if (error || imf->imf_nsources > 0) goto out_locked; } /* * Give up the multicast address record to which the membership points. */ inm = imo->imo_membership[idx]; in_delmulti(inm); /* * Free any source filters for this group if they exist. * Revert inpcb to the default MCAST_EXCLUDE state. */ if (imo->imo_mfilters != NULL) { TAILQ_FOREACH_SAFE(ims, &imf->imf_sources, ims_next, tims) { TAILQ_REMOVE(&imf->imf_sources, ims, ims_next); free(ims, M_IPMSOURCE); imf->imf_nsources--; } KASSERT(imf->imf_nsources == 0, ("%s: imf_nsources not 0", __func__)); KASSERT(TAILQ_EMPTY(&imf->imf_sources), ("%s: imf_sources not empty", __func__)); imf->imf_fmode = MCAST_EXCLUDE; } /* * Remove the gap in the membership array. */ for (++idx; idx < imo->imo_num_memberships; ++idx) imo->imo_membership[idx-1] = imo->imo_membership[idx]; imo->imo_num_memberships--; out_locked: INP_WUNLOCK(inp); return (error); } /* * Select the interface for transmitting IPv4 multicast datagrams. 
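 *
 * Editor's sketch, not part of the original source: one hedged way to
 * select the interface by index, using the Linux-derived struct ip_mreqn
 * described below (the socket "s" and a valid "ifindex" are assumed):
 *
 *	struct ip_mreqn mreqn;
 *
 *	memset(&mreqn, 0, sizeof(mreqn));
 *	mreqn.imr_ifindex = ifindex;
 *	(void)setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF,
 *	    &mreqn, sizeof(mreqn));
 *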
* * Either an instance of struct in_addr or an instance of struct ip_mreqn * may be passed to this socket option. An address of INADDR_ANY or an * interface index of 0 is used to remove a previous selection. * When no interface is selected, one is chosen for every send. */ static int inp_set_multicast_if(struct inpcb *inp, struct sockopt *sopt) { INIT_VNET_NET(curvnet); struct in_addr addr; struct ip_mreqn mreqn; struct ifnet *ifp; struct ip_moptions *imo; int error; if (sopt->sopt_valsize == sizeof(struct ip_mreqn)) { /* * An interface index was specified using the * Linux-derived ip_mreqn structure. */ error = sooptcopyin(sopt, &mreqn, sizeof(struct ip_mreqn), sizeof(struct ip_mreqn)); if (error) return (error); if (mreqn.imr_ifindex < 0 || V_if_index < mreqn.imr_ifindex) return (EINVAL); if (mreqn.imr_ifindex == 0) { ifp = NULL; } else { ifp = ifnet_byindex(mreqn.imr_ifindex); if (ifp == NULL) return (EADDRNOTAVAIL); } } else { /* * An interface was specified by IPv4 address. * This is the traditional BSD usage. */ error = sooptcopyin(sopt, &addr, sizeof(struct in_addr), sizeof(struct in_addr)); if (error) return (error); if (addr.s_addr == INADDR_ANY) { ifp = NULL; } else { INADDR_TO_IFP(addr, ifp); if (ifp == NULL) return (EADDRNOTAVAIL); } #ifdef DIAGNOSTIC if (bootverbose) { printf("%s: ifp = %p, addr = %s\n", __func__, ifp, inet_ntoa(addr)); /* XXX INET6 */ } #endif } /* Reject interfaces which do not support multicast. */ if (ifp != NULL && (ifp->if_flags & IFF_MULTICAST) == 0) return (EOPNOTSUPP); imo = inp_findmoptions(inp); imo->imo_multicast_ifp = ifp; imo->imo_multicast_addr.s_addr = INADDR_ANY; INP_WUNLOCK(inp); return (0); } /* * Atomically set source filters on a socket for an IPv4 multicast group. */ static int inp_set_source_filters(struct inpcb *inp, struct sockopt *sopt) { INIT_VNET_NET(curvnet); struct __msfilterreq msfr; sockunion_t *gsa; struct ifnet *ifp; struct in_mfilter *imf; struct ip_moptions *imo; struct in_msource *ims, *tims; size_t idx; int error; error = sooptcopyin(sopt, &msfr, sizeof(struct __msfilterreq), sizeof(struct __msfilterreq)); if (error) return (error); if (msfr.msfr_nsrcs > IP_MAX_SOURCE_FILTER || (msfr.msfr_fmode != MCAST_EXCLUDE && msfr.msfr_fmode != MCAST_INCLUDE)) return (EINVAL); if (msfr.msfr_group.ss_family != AF_INET || msfr.msfr_group.ss_len != sizeof(struct sockaddr_in)) return (EINVAL); gsa = (sockunion_t *)&msfr.msfr_group; if (!IN_MULTICAST(ntohl(gsa->sin.sin_addr.s_addr))) return (EINVAL); gsa->sin.sin_port = 0; /* ignore port */ if (msfr.msfr_ifindex == 0 || V_if_index < msfr.msfr_ifindex) return (EADDRNOTAVAIL); ifp = ifnet_byindex(msfr.msfr_ifindex); if (ifp == NULL) return (EADDRNOTAVAIL); /* * Take the INP lock. * Check if this socket is a member of this group. */ imo = inp_findmoptions(inp); idx = imo_match_group(imo, ifp, &gsa->sa); if (idx == -1 || imo->imo_mfilters == NULL) { error = EADDRNOTAVAIL; goto out_locked; } imf = &imo->imo_mfilters[idx]; #ifdef DIAGNOSTIC if (bootverbose) printf("%s: clearing source list\n", __func__); #endif /* * Remove any existing source filters. */ TAILQ_FOREACH_SAFE(ims, &imf->imf_sources, ims_next, tims) { TAILQ_REMOVE(&imf->imf_sources, ims, ims_next); free(ims, M_IPMSOURCE); imf->imf_nsources--; } KASSERT(imf->imf_nsources == 0, ("%s: source list not cleared", __func__)); /* * Apply any new source filters, if present. 
*/ if (msfr.msfr_nsrcs > 0) { struct in_msource **pnims; struct in_msource *nims; struct sockaddr_storage *kss; struct sockaddr_storage *pkss; sockunion_t *psu; int i, j; /* * Drop the inp lock so we may sleep if we need to * in order to satisfy a malloc request. * We will re-take it before changing socket state. */ INP_WUNLOCK(inp); #ifdef DIAGNOSTIC if (bootverbose) { printf("%s: loading %lu source list entries\n", __func__, (unsigned long)msfr.msfr_nsrcs); } #endif /* * Make a copy of the user-space source vector so * that we may copy them with a single copyin. This * allows us to deal with page faults up-front. */ kss = malloc(sizeof(struct sockaddr_storage) * msfr.msfr_nsrcs, M_TEMP, M_WAITOK); error = copyin(msfr.msfr_srcs, kss, sizeof(struct sockaddr_storage) * msfr.msfr_nsrcs); if (error) { free(kss, M_TEMP); return (error); } /* * Perform argument checking on every sockaddr_storage * structure in the vector provided to us. Overwrite * fields which should not apply to source entries. * TODO: Check for duplicate sources on this pass. */ psu = (sockunion_t *)kss; for (i = 0; i < msfr.msfr_nsrcs; i++, psu++) { switch (psu->ss.ss_family) { case AF_INET: if (psu->sin.sin_len != sizeof(struct sockaddr_in)) { error = EINVAL; } else { psu->sin.sin_port = 0; } break; #ifdef notyet case AF_INET6; if (psu->sin6.sin6_len != sizeof(struct sockaddr_in6)) { error = EINVAL; } else { psu->sin6.sin6_port = 0; psu->sin6.sin6_flowinfo = 0; } break; #endif default: error = EAFNOSUPPORT; break; } if (error) break; } if (error) { free(kss, M_TEMP); return (error); } /* * Allocate a block to track all the in_msource * entries we are about to allocate, in case we * abruptly need to free them. */ pnims = malloc(sizeof(struct in_msource *) * msfr.msfr_nsrcs, M_TEMP, M_WAITOK | M_ZERO); /* * Allocate up to nsrcs individual chunks. * If we encounter an error, backtrack out of * all allocations cleanly; updates must be atomic. */ pkss = kss; nims = NULL; for (i = 0; i < msfr.msfr_nsrcs; i++, pkss++) { nims = malloc(sizeof(struct in_msource) * msfr.msfr_nsrcs, M_IPMSOURCE, M_WAITOK | M_ZERO); pnims[i] = nims; } if (i < msfr.msfr_nsrcs) { for (j = 0; j < i; j++) { if (pnims[j] != NULL) free(pnims[j], M_IPMSOURCE); } free(pnims, M_TEMP); free(kss, M_TEMP); return (ENOBUFS); } INP_UNLOCK_ASSERT(inp); /* * Finally, apply the filters to the socket. * Re-take the inp lock; we are changing socket state. */ pkss = kss; INP_WLOCK(inp); for (i = 0; i < msfr.msfr_nsrcs; i++, pkss++) { memcpy(&(pnims[i]->ims_addr), pkss, sizeof(struct sockaddr_storage)); TAILQ_INSERT_TAIL(&imf->imf_sources, pnims[i], ims_next); imf->imf_nsources++; } free(pnims, M_TEMP); free(kss, M_TEMP); } /* * Update the filter mode on the socket before releasing the inpcb. */ INP_WLOCK_ASSERT(inp); imf->imf_fmode = msfr.msfr_fmode; out_locked: INP_WUNLOCK(inp); return (error); } /* * Set the IP multicast options in response to user setsockopt(). * * Many of the socket options handled in this function duplicate the * functionality of socket options in the regular unicast API. However, * it is not possible to merge the duplicate code, because the idempotence * of the IPv4 multicast part of the BSD Sockets API must be preserved; * the effects of these options must be treated as separate and distinct. */ int inp_setmoptions(struct inpcb *inp, struct sockopt *sopt) { struct ip_moptions *imo; int error; error = 0; /* * If socket is neither of type SOCK_RAW or SOCK_DGRAM, * or is a divert socket, reject it. * XXX Unlocked read of inp_socket believed OK. 
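inp_set_multicast_if() earlier in this file distinguishes its two accepted encodings purely by option length: a Linux-style struct ip_mreqn selecting the outgoing interface by index, or a traditional struct in_addr naming one of its IPv4 addresses. A small sketch of a caller that prefers the index form and falls back to the address form; the interface name and address are placeholders:

/* Illustrative only: select the outgoing multicast interface. */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <string.h>

static int
select_mcast_if(int s, const char *ifname, const char *ifaddr)
{
	struct ip_mreqn mreqn;
	struct in_addr ina;

	/* Preferred: by interface index (sopt_valsize == sizeof(ip_mreqn)). */
	memset(&mreqn, 0, sizeof(mreqn));
	mreqn.imr_ifindex = if_nametoindex(ifname);
	if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF,
	    &mreqn, sizeof(mreqn)) == 0)
		return (0);

	/* Traditional BSD form: by the interface's IPv4 address. */
	inet_pton(AF_INET, ifaddr, &ina);
	return (setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF,
	    &ina, sizeof(ina)));
}

Passing INADDR_ANY (or an ifindex of 0) clears the selection, after which inp_setmoptions() below leaves the choice of interface to each send, as the comment above notes.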
*/ if (inp->inp_socket->so_proto->pr_protocol == IPPROTO_DIVERT || (inp->inp_socket->so_proto->pr_type != SOCK_RAW && inp->inp_socket->so_proto->pr_type != SOCK_DGRAM)) return (EOPNOTSUPP); switch (sopt->sopt_name) { case IP_MULTICAST_VIF: { int vifi; /* * Select a multicast VIF for transmission. * Only useful if multicast forwarding is active. */ if (legal_vif_num == NULL) { error = EOPNOTSUPP; break; } error = sooptcopyin(sopt, &vifi, sizeof(int), sizeof(int)); if (error) break; if (!legal_vif_num(vifi) && (vifi != -1)) { error = EINVAL; break; } imo = inp_findmoptions(inp); imo->imo_multicast_vif = vifi; INP_WUNLOCK(inp); break; } case IP_MULTICAST_IF: error = inp_set_multicast_if(inp, sopt); break; case IP_MULTICAST_TTL: { u_char ttl; /* * Set the IP time-to-live for outgoing multicast packets. * The original multicast API required a char argument, * which is inconsistent with the rest of the socket API. * We allow either a char or an int. */ if (sopt->sopt_valsize == sizeof(u_char)) { error = sooptcopyin(sopt, &ttl, sizeof(u_char), sizeof(u_char)); if (error) break; } else { u_int ittl; error = sooptcopyin(sopt, &ittl, sizeof(u_int), sizeof(u_int)); if (error) break; if (ittl > 255) { error = EINVAL; break; } ttl = (u_char)ittl; } imo = inp_findmoptions(inp); imo->imo_multicast_ttl = ttl; INP_WUNLOCK(inp); break; } case IP_MULTICAST_LOOP: { u_char loop; /* * Set the loopback flag for outgoing multicast packets. * Must be zero or one. The original multicast API required a * char argument, which is inconsistent with the rest * of the socket API. We allow either a char or an int. */ if (sopt->sopt_valsize == sizeof(u_char)) { error = sooptcopyin(sopt, &loop, sizeof(u_char), sizeof(u_char)); if (error) break; } else { u_int iloop; error = sooptcopyin(sopt, &iloop, sizeof(u_int), sizeof(u_int)); if (error) break; loop = (u_char)iloop; } imo = inp_findmoptions(inp); imo->imo_multicast_loop = !!loop; INP_WUNLOCK(inp); break; } case IP_ADD_MEMBERSHIP: case IP_ADD_SOURCE_MEMBERSHIP: case MCAST_JOIN_GROUP: case MCAST_JOIN_SOURCE_GROUP: error = inp_join_group(inp, sopt); break; case IP_DROP_MEMBERSHIP: case IP_DROP_SOURCE_MEMBERSHIP: case MCAST_LEAVE_GROUP: case MCAST_LEAVE_SOURCE_GROUP: error = inp_leave_group(inp, sopt); break; case IP_BLOCK_SOURCE: case IP_UNBLOCK_SOURCE: case MCAST_BLOCK_SOURCE: case MCAST_UNBLOCK_SOURCE: error = inp_change_source_filter(inp, sopt); break; case IP_MSFILTER: error = inp_set_source_filters(inp, sopt); break; default: error = EOPNOTSUPP; break; } INP_UNLOCK_ASSERT(inp); return (error); } Index: head/sys/netinet/in_pcb.c =================================================================== --- head/sys/netinet/in_pcb.c (revision 186118) +++ head/sys/netinet/in_pcb.c (revision 186119) @@ -1,1927 +1,1927 @@ /*- * Copyright (c) 1982, 1986, 1991, 1993, 1995 * The Regents of the University of California. * Copyright (c) 2007-2008 Robert N. M. Watson * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. 
Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)in_pcb.c 8.4 (Berkeley) 5/24/95 */ #include __FBSDID("$FreeBSD$"); #include "opt_ddb.h" #include "opt_ipsec.h" #include "opt_inet6.h" #include "opt_mac.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef DDB #include #endif #include #include #include #include #include #include #include #include #include #include #include #include #ifdef INET6 #include #include #include #endif /* INET6 */ #ifdef IPSEC #include #include #endif /* IPSEC */ #include #ifdef VIMAGE_GLOBALS /* * These configure the range of local port addresses assigned to * "unspecified" outgoing connections/packets/whatever. */ int ipport_lowfirstauto; int ipport_lowlastauto; int ipport_firstauto; int ipport_lastauto; int ipport_hifirstauto; int ipport_hilastauto; /* * Reserved ports accessible only to root. There are significant * security considerations that must be accounted for when changing these, * but the security benefits can be great. Please be careful. */ int ipport_reservedhigh; int ipport_reservedlow; /* Variables dealing with random ephemeral port allocation. 
*/ int ipport_randomized; int ipport_randomcps; int ipport_randomtime; int ipport_stoprandom; int ipport_tcpallocs; int ipport_tcplastcount; #endif #define RANGECHK(var, min, max) \ if ((var) < (min)) { (var) = (min); } \ else if ((var) > (max)) { (var) = (max); } static int sysctl_net_ipport_check(SYSCTL_HANDLER_ARGS) { INIT_VNET_INET(curvnet); int error; error = sysctl_handle_int(oidp, oidp->oid_arg1, oidp->oid_arg2, req); if (error == 0) { RANGECHK(V_ipport_lowfirstauto, 1, IPPORT_RESERVED - 1); RANGECHK(V_ipport_lowlastauto, 1, IPPORT_RESERVED - 1); RANGECHK(V_ipport_firstauto, IPPORT_RESERVED, IPPORT_MAX); RANGECHK(V_ipport_lastauto, IPPORT_RESERVED, IPPORT_MAX); RANGECHK(V_ipport_hifirstauto, IPPORT_RESERVED, IPPORT_MAX); RANGECHK(V_ipport_hilastauto, IPPORT_RESERVED, IPPORT_MAX); } return (error); } #undef RANGECHK SYSCTL_NODE(_net_inet_ip, IPPROTO_IP, portrange, CTLFLAG_RW, 0, "IP Ports"); SYSCTL_V_PROC(V_NET, vnet_inet, _net_inet_ip_portrange, OID_AUTO, lowfirst, CTLTYPE_INT|CTLFLAG_RW, ipport_lowfirstauto, 0, &sysctl_net_ipport_check, "I", ""); SYSCTL_V_PROC(V_NET, vnet_inet, _net_inet_ip_portrange, OID_AUTO, lowlast, CTLTYPE_INT|CTLFLAG_RW, ipport_lowlastauto, 0, &sysctl_net_ipport_check, "I", ""); SYSCTL_V_PROC(V_NET, vnet_inet, _net_inet_ip_portrange, OID_AUTO, first, CTLTYPE_INT|CTLFLAG_RW, ipport_firstauto, 0, &sysctl_net_ipport_check, "I", ""); SYSCTL_V_PROC(V_NET, vnet_inet, _net_inet_ip_portrange, OID_AUTO, last, CTLTYPE_INT|CTLFLAG_RW, ipport_lastauto, 0, &sysctl_net_ipport_check, "I", ""); SYSCTL_V_PROC(V_NET, vnet_inet, _net_inet_ip_portrange, OID_AUTO, hifirst, CTLTYPE_INT|CTLFLAG_RW, ipport_hifirstauto, 0, &sysctl_net_ipport_check, "I", ""); SYSCTL_V_PROC(V_NET, vnet_inet, _net_inet_ip_portrange, OID_AUTO, hilast, CTLTYPE_INT|CTLFLAG_RW, ipport_hilastauto, 0, &sysctl_net_ipport_check, "I", ""); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip_portrange, OID_AUTO, reservedhigh, CTLFLAG_RW|CTLFLAG_SECURE, ipport_reservedhigh, 0, ""); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip_portrange, OID_AUTO, reservedlow, CTLFLAG_RW|CTLFLAG_SECURE, ipport_reservedlow, 0, ""); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip_portrange, OID_AUTO, randomized, CTLFLAG_RW, ipport_randomized, 0, "Enable random port allocation"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip_portrange, OID_AUTO, randomcps, CTLFLAG_RW, ipport_randomcps, 0, "Maximum number of random port " "allocations before switching to a sequental one"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip_portrange, OID_AUTO, randomtime, CTLFLAG_RW, ipport_randomtime, 0, "Minimum time to keep sequental port " "allocation before switching to a random one"); /* * in_pcb.c: manage the Protocol Control Blocks. * * NOTE: It is assumed that most of these functions will be called with * the pcbinfo lock held, and often, the inpcb lock held, as these utility * functions often modify hash chains or addresses in pcbs. */ /* * Allocate a PCB and associate it with the socket. * On success return with the PCB locked. 
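The SYSCTL_V_PROC entries above expose these variables under net.inet.ip.portrange.*, with writes funnelled through sysctl_net_ipport_check() so that RANGECHK clamps them into sane bounds. A short sketch of reading and adjusting one knob from userland with sysctlbyname(3); the new value is only an example:

/* Illustrative only: inspect and raise net.inet.ip.portrange.first. */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>

int
main(void)
{
	int first, newfirst = 10000;	/* example value */
	size_t len = sizeof(first);

	if (sysctlbyname("net.inet.ip.portrange.first", &first, &len,
	    NULL, 0) == -1)
		return (1);
	printf("ephemeral range starts at %d\n", first);

	/* The write passes through sysctl_net_ipport_check(), which
	   clamps the value to [IPPORT_RESERVED, IPPORT_MAX]. */
	if (sysctlbyname("net.inet.ip.portrange.first", NULL, NULL,
	    &newfirst, sizeof(newfirst)) == -1)
		return (1);
	return (0);
}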
*/ int in_pcballoc(struct socket *so, struct inpcbinfo *pcbinfo) { #ifdef INET6 INIT_VNET_INET6(curvnet); #endif struct inpcb *inp; int error; INP_INFO_WLOCK_ASSERT(pcbinfo); error = 0; inp = uma_zalloc(pcbinfo->ipi_zone, M_NOWAIT); if (inp == NULL) return (ENOBUFS); bzero(inp, inp_zero_size); inp->inp_pcbinfo = pcbinfo; inp->inp_socket = so; inp->inp_cred = crhold(so->so_cred); inp->inp_inc.inc_fibnum = so->so_fibnum; #ifdef MAC error = mac_inpcb_init(inp, M_NOWAIT); if (error != 0) goto out; SOCK_LOCK(so); mac_inpcb_create(so, inp); SOCK_UNLOCK(so); #endif #ifdef IPSEC error = ipsec_init_policy(so, &inp->inp_sp); if (error != 0) { #ifdef MAC mac_inpcb_destroy(inp); #endif goto out; } #endif /*IPSEC*/ #ifdef INET6 if (INP_SOCKAF(so) == AF_INET6) { inp->inp_vflag |= INP_IPV6PROTO; if (V_ip6_v6only) inp->inp_flags |= IN6P_IPV6_V6ONLY; } #endif LIST_INSERT_HEAD(pcbinfo->ipi_listhead, inp, inp_list); pcbinfo->ipi_count++; so->so_pcb = (caddr_t)inp; #ifdef INET6 if (V_ip6_auto_flowlabel) inp->inp_flags |= IN6P_AUTOFLOWLABEL; #endif INP_WLOCK(inp); inp->inp_gencnt = ++pcbinfo->ipi_gencnt; inp->inp_refcount = 1; /* Reference from the inpcbinfo */ #if defined(IPSEC) || defined(MAC) out: if (error != 0) { crfree(inp->inp_cred); uma_zfree(pcbinfo->ipi_zone, inp); } #endif return (error); } int in_pcbbind(struct inpcb *inp, struct sockaddr *nam, struct ucred *cred) { int anonport, error; INP_INFO_WLOCK_ASSERT(inp->inp_pcbinfo); INP_WLOCK_ASSERT(inp); if (inp->inp_lport != 0 || inp->inp_laddr.s_addr != INADDR_ANY) return (EINVAL); anonport = inp->inp_lport == 0 && (nam == NULL || ((struct sockaddr_in *)nam)->sin_port == 0); error = in_pcbbind_setup(inp, nam, &inp->inp_laddr.s_addr, &inp->inp_lport, cred); if (error) return (error); if (in_pcbinshash(inp) != 0) { inp->inp_laddr.s_addr = INADDR_ANY; inp->inp_lport = 0; return (EAGAIN); } if (anonport) inp->inp_flags |= INP_ANONPORT; return (0); } /* * Set up a bind operation on a PCB, performing port allocation * as required, but do not actually modify the PCB. Callers can * either complete the bind by setting inp_laddr/inp_lport and * calling in_pcbinshash(), or they can just use the resulting * port and address to authorise the sending of a once-off packet. * * On error, the values of *laddrp and *lportp are not changed. */ int in_pcbbind_setup(struct inpcb *inp, struct sockaddr *nam, in_addr_t *laddrp, u_short *lportp, struct ucred *cred) { INIT_VNET_INET(inp->inp_vnet); struct socket *so = inp->inp_socket; unsigned short *lastport; struct sockaddr_in *sin; struct inpcbinfo *pcbinfo = inp->inp_pcbinfo; struct in_addr laddr; u_short lport = 0; int wild = 0, reuseport = (so->so_options & SO_REUSEPORT); int error; int dorandom; /* * Because no actual state changes occur here, a global write lock on * the pcbinfo isn't required. */ INP_INFO_LOCK_ASSERT(pcbinfo); INP_LOCK_ASSERT(inp); if (TAILQ_EMPTY(&V_in_ifaddrhead)) /* XXX broken! */ return (EADDRNOTAVAIL); laddr.s_addr = *laddrp; if (nam != NULL && laddr.s_addr != INADDR_ANY) return (EINVAL); if ((so->so_options & (SO_REUSEADDR|SO_REUSEPORT)) == 0) wild = INPLOOKUP_WILDCARD; if (nam) { sin = (struct sockaddr_in *)nam; if (nam->sa_len != sizeof (*sin)) return (EINVAL); #ifdef notdef /* * We should check the family, but old programs * incorrectly fail to initialize it. */ if (sin->sin_family != AF_INET) return (EAFNOSUPPORT); #endif if (prison_local_ip4(cred, &sin->sin_addr)) return (EINVAL); if (sin->sin_port != *lportp) { /* Don't allow the port to change. 
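in_pcbbind() above treats a NULL sockaddr or a zero sin_port as a request for an anonymous port and sets INP_ANONPORT once in_pcbbind_setup() has chosen one. The result is visible from userland via getsockname(2); a minimal sketch:

/* Illustrative only: let the kernel pick an ephemeral UDP port. */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>

int
main(void)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	int s;

	if ((s = socket(AF_INET, SOCK_DGRAM, 0)) == -1)
		return (1);

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_len = sizeof(sin);
	sin.sin_port = 0;			/* anonymous port request */
	sin.sin_addr.s_addr = INADDR_ANY;

	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1)
		return (1);
	if (getsockname(s, (struct sockaddr *)&sin, &len) == -1)
		return (1);
	printf("bound to port %u\n", ntohs(sin.sin_port));
	return (0);
}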
*/ if (*lportp != 0) return (EINVAL); lport = sin->sin_port; } /* NB: lport is left as 0 if the port isn't being changed. */ if (IN_MULTICAST(ntohl(sin->sin_addr.s_addr))) { /* * Treat SO_REUSEADDR as SO_REUSEPORT for multicast; * allow complete duplication of binding if * SO_REUSEPORT is set, or if SO_REUSEADDR is set * and a multicast address is bound on both * new and duplicated sockets. */ if (so->so_options & SO_REUSEADDR) reuseport = SO_REUSEADDR|SO_REUSEPORT; } else if (sin->sin_addr.s_addr != INADDR_ANY) { sin->sin_port = 0; /* yech... */ bzero(&sin->sin_zero, sizeof(sin->sin_zero)); if (ifa_ifwithaddr((struct sockaddr *)sin) == 0) return (EADDRNOTAVAIL); } laddr = sin->sin_addr; if (lport) { struct inpcb *t; struct tcptw *tw; /* GROSS */ if (ntohs(lport) <= V_ipport_reservedhigh && ntohs(lport) >= V_ipport_reservedlow && priv_check_cred(cred, PRIV_NETINET_RESERVEDPORT, 0)) return (EACCES); if (!IN_MULTICAST(ntohl(sin->sin_addr.s_addr)) && priv_check_cred(inp->inp_cred, PRIV_NETINET_REUSEPORT, 0) != 0) { t = in_pcblookup_local(pcbinfo, sin->sin_addr, lport, INPLOOKUP_WILDCARD, cred); /* * XXX * This entire block sorely needs a rewrite. */ if (t && ((t->inp_vflag & INP_TIMEWAIT) == 0) && (so->so_type != SOCK_STREAM || ntohl(t->inp_faddr.s_addr) == INADDR_ANY) && (ntohl(sin->sin_addr.s_addr) != INADDR_ANY || ntohl(t->inp_laddr.s_addr) != INADDR_ANY || (t->inp_socket->so_options & SO_REUSEPORT) == 0) && (inp->inp_cred->cr_uid != t->inp_cred->cr_uid)) return (EADDRINUSE); } if (prison_local_ip4(cred, &sin->sin_addr)) return (EADDRNOTAVAIL); t = in_pcblookup_local(pcbinfo, sin->sin_addr, lport, wild, cred); if (t && (t->inp_vflag & INP_TIMEWAIT)) { /* * XXXRW: If an incpb has had its timewait * state recycled, we treat the address as * being in use (for now). This is better * than a panic, but not desirable. */ tw = intotw(inp); if (tw == NULL || (reuseport & tw->tw_so_options) == 0) return (EADDRINUSE); } else if (t && (reuseport & t->inp_socket->so_options) == 0) { #ifdef INET6 if (ntohl(sin->sin_addr.s_addr) != INADDR_ANY || ntohl(t->inp_laddr.s_addr) != INADDR_ANY || INP_SOCKAF(so) == INP_SOCKAF(t->inp_socket)) #endif return (EADDRINUSE); } } } if (*lportp != 0) lport = *lportp; if (lport == 0) { u_short first, last, aux; int count; if (prison_local_ip4(cred, &laddr)) return (EINVAL); if (inp->inp_flags & INP_HIGHPORT) { first = V_ipport_hifirstauto; /* sysctl */ last = V_ipport_hilastauto; lastport = &pcbinfo->ipi_lasthi; } else if (inp->inp_flags & INP_LOWPORT) { error = priv_check_cred(cred, PRIV_NETINET_RESERVEDPORT, 0); if (error) return error; first = V_ipport_lowfirstauto; /* 1023 */ last = V_ipport_lowlastauto; /* 600 */ lastport = &pcbinfo->ipi_lastlow; } else { first = V_ipport_firstauto; /* sysctl */ last = V_ipport_lastauto; lastport = &pcbinfo->ipi_lastport; } /* * For UDP, use random port allocation as long as the user * allows it. For TCP (and as of yet unknown) connections, * use random port allocation only if the user allows it AND * ipport_tick() allows it. */ if (V_ipport_randomized && (!V_ipport_stoprandom || pcbinfo == &V_udbinfo)) dorandom = 1; else dorandom = 0; /* * It makes no sense to do random port allocation if * we have the only port available. */ if (first == last) dorandom = 0; /* Make sure to not include UDP packets in the count. */ if (pcbinfo != &V_udbinfo) V_ipport_tcpallocs++; /* * Instead of having two loops further down counting up or down * make sure that first is always <= last and go with only one * code path implementing all logic. 
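The comment above describes normalizing the range so that first <= last and then running a single wrap-around search, optionally starting at a random offset. A stand-alone sketch of that search, with a stub standing in for in_pcblookup_local(), to make the control flow easier to follow; it mirrors the logic below rather than reproducing it:

/* Illustrative only: wrap-around search over a normalized port range. */
#include <stdbool.h>
#include <stdlib.h>

static bool
port_in_use(unsigned short port)
{
	(void)port;
	return (false);		/* stub standing in for in_pcblookup_local() */
}

/* Returns a free port in [first, last], or 0 if the range is exhausted. */
static unsigned short
pick_port(unsigned short first, unsigned short last, bool dorandom)
{
	unsigned short lastport, port;
	int count;

	if (first > last) {		/* normalize so first <= last */
		unsigned short aux = first;
		first = last;
		last = aux;
	}
	lastport = dorandom ?
	    first + (arc4random() % (last - first + 1)) : first;
	count = last - first;
	do {
		if (count-- < 0)
			return (0);	/* completely used */
		if (++lastport < first || lastport > last)
			lastport = first;
		port = lastport;
	} while (port_in_use(port));
	return (port);
}

(The kernel instead simply disables the random start when first == last, as the code above shows, which serves the same purpose of keeping the modulus nonzero.)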
*/ if (first > last) { aux = first; first = last; last = aux; } if (dorandom) *lastport = first + (arc4random() % (last - first)); count = last - first; do { if (count-- < 0) /* completely used? */ return (EADDRNOTAVAIL); ++*lastport; if (*lastport < first || *lastport > last) *lastport = first; lport = htons(*lastport); } while (in_pcblookup_local(pcbinfo, laddr, lport, wild, cred)); } if (prison_local_ip4(cred, &laddr)) return (EINVAL); *laddrp = laddr.s_addr; *lportp = lport; return (0); } /* * Connect from a socket to a specified address. * Both address and port must be specified in argument sin. * If don't have a local address for this socket yet, * then pick one. */ int in_pcbconnect(struct inpcb *inp, struct sockaddr *nam, struct ucred *cred) { u_short lport, fport; in_addr_t laddr, faddr; int anonport, error; INP_INFO_WLOCK_ASSERT(inp->inp_pcbinfo); INP_WLOCK_ASSERT(inp); lport = inp->inp_lport; laddr = inp->inp_laddr.s_addr; anonport = (lport == 0); error = in_pcbconnect_setup(inp, nam, &laddr, &lport, &faddr, &fport, NULL, cred); if (error) return (error); /* Do the initial binding of the local address if required. */ if (inp->inp_laddr.s_addr == INADDR_ANY && inp->inp_lport == 0) { inp->inp_lport = lport; inp->inp_laddr.s_addr = laddr; if (in_pcbinshash(inp) != 0) { inp->inp_laddr.s_addr = INADDR_ANY; inp->inp_lport = 0; return (EAGAIN); } } /* Commit the remaining changes. */ inp->inp_lport = lport; inp->inp_laddr.s_addr = laddr; inp->inp_faddr.s_addr = faddr; inp->inp_fport = fport; in_pcbrehash(inp); if (anonport) inp->inp_flags |= INP_ANONPORT; return (0); } /* * Do proper source address selection on an unbound socket in case * of connect. Take jails into account as well. */ static int in_pcbladdr(struct inpcb *inp, struct in_addr *faddr, struct in_addr *laddr, struct ucred *cred) { struct in_ifaddr *ia; struct ifaddr *ifa; struct sockaddr *sa; struct sockaddr_in *sin; struct route sro; int error; KASSERT(laddr != NULL, ("%s: laddr NULL", __func__)); error = 0; ia = NULL; bzero(&sro, sizeof(sro)); sin = (struct sockaddr_in *)&sro.ro_dst; sin->sin_family = AF_INET; sin->sin_len = sizeof(struct sockaddr_in); sin->sin_addr.s_addr = faddr->s_addr; /* * If route is known our src addr is taken from the i/f, * else punt. * * Find out route to destination. */ if ((inp->inp_socket->so_options & SO_DONTROUTE) == 0) - in_rtalloc_ign(&sro, RTF_CLONING, inp->inp_inc.inc_fibnum); + in_rtalloc_ign(&sro, 0, inp->inp_inc.inc_fibnum); /* * If we found a route, use the address corresponding to * the outgoing interface. * * Otherwise assume faddr is reachable on a directly connected * network and try to find a corresponding interface to take * the source address from. */ if (sro.ro_rt == NULL || sro.ro_rt->rt_ifp == NULL) { struct ifnet *ifp; ia = ifatoia(ifa_ifwithdstaddr((struct sockaddr *)sin)); if (ia == NULL) ia = ifatoia(ifa_ifwithnet((struct sockaddr *)sin)); if (ia == NULL) { error = ENETUNREACH; goto done; } if (cred == NULL || !jailed(cred)) { laddr->s_addr = ia->ia_addr.sin_addr.s_addr; goto done; } ifp = ia->ia_ifp; ia = NULL; TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { sa = ifa->ifa_addr; if (sa->sa_family != AF_INET) continue; sin = (struct sockaddr_in *)sa; if (prison_check_ip4(cred, &sin->sin_addr)) { ia = (struct in_ifaddr *)ifa; break; } } if (ia != NULL) { laddr->s_addr = ia->ia_addr.sin_addr.s_addr; goto done; } /* 3. As a last resort return the 'default' jail address. 
*/ if (prison_getip4(cred, laddr) != 0) error = EADDRNOTAVAIL; goto done; } /* * If the outgoing interface on the route found is not * a loopback interface, use the address from that interface. * In case of jails do those three steps: * 1. check if the interface address belongs to the jail. If so use it. * 2. check if we have any address on the outgoing interface * belonging to this jail. If so use it. * 3. as a last resort return the 'default' jail address. */ if ((sro.ro_rt->rt_ifp->if_flags & IFF_LOOPBACK) == 0) { /* If not jailed, use the default returned. */ if (cred == NULL || !jailed(cred)) { ia = (struct in_ifaddr *)sro.ro_rt->rt_ifa; laddr->s_addr = ia->ia_addr.sin_addr.s_addr; goto done; } /* Jailed. */ /* 1. Check if the iface address belongs to the jail. */ sin = (struct sockaddr_in *)sro.ro_rt->rt_ifa->ifa_addr; if (prison_check_ip4(cred, &sin->sin_addr)) { ia = (struct in_ifaddr *)sro.ro_rt->rt_ifa; laddr->s_addr = ia->ia_addr.sin_addr.s_addr; goto done; } /* * 2. Check if we have any address on the outgoing interface * belonging to this jail. */ TAILQ_FOREACH(ifa, &sro.ro_rt->rt_ifp->if_addrhead, ifa_link) { sa = ifa->ifa_addr; if (sa->sa_family != AF_INET) continue; sin = (struct sockaddr_in *)sa; if (prison_check_ip4(cred, &sin->sin_addr)) { ia = (struct in_ifaddr *)ifa; break; } } if (ia != NULL) { laddr->s_addr = ia->ia_addr.sin_addr.s_addr; goto done; } /* 3. As a last resort return the 'default' jail address. */ if (prison_getip4(cred, laddr) != 0) error = EADDRNOTAVAIL; goto done; } /* * The outgoing interface is marked with 'loopback net', so a route * to ourselves is here. * Try to find the interface of the destination address and then * take the address from there. That interface is not necessarily * a loopback interface. * In case of jails, check that it is an address of the jail * and if we cannot find, fall back to the 'default' jail address. */ if ((sro.ro_rt->rt_ifp->if_flags & IFF_LOOPBACK) != 0) { struct sockaddr_in sain; bzero(&sain, sizeof(struct sockaddr_in)); sain.sin_family = AF_INET; sain.sin_len = sizeof(struct sockaddr_in); sain.sin_addr.s_addr = faddr->s_addr; ia = ifatoia(ifa_ifwithdstaddr(sintosa(&sain))); if (ia == NULL) ia = ifatoia(ifa_ifwithnet(sintosa(&sain))); if (cred == NULL || !jailed(cred)) { #if __FreeBSD_version < 800000 if (ia == NULL) ia = (struct in_ifaddr *)sro.ro_rt->rt_ifa; #endif if (ia == NULL) { error = ENETUNREACH; goto done; } laddr->s_addr = ia->ia_addr.sin_addr.s_addr; goto done; } /* Jailed. */ if (ia != NULL) { struct ifnet *ifp; ifp = ia->ia_ifp; ia = NULL; TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { sa = ifa->ifa_addr; if (sa->sa_family != AF_INET) continue; sin = (struct sockaddr_in *)sa; if (prison_check_ip4(cred, &sin->sin_addr)) { ia = (struct in_ifaddr *)ifa; break; } } if (ia != NULL) { laddr->s_addr = ia->ia_addr.sin_addr.s_addr; goto done; } } /* 3. As a last resort return the 'default' jail address. */ if (prison_getip4(cred, laddr) != 0) error = EADDRNOTAVAIL; goto done; } done: if (sro.ro_rt != NULL) RTFREE(sro.ro_rt); return (error); } /* * Set up for a connect from a socket to the specified address. * On entry, *laddrp and *lportp should contain the current local * address and port for the PCB; these are updated to the values * that should be placed in inp_laddr and inp_lport to complete * the connect. * * On success, *faddrp and *fportp will be set to the remote address * and port. These are not updated in the error case. 
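in_pcbladdr() above is what ultimately picks the source address for a connect(2) on an unbound socket, either from the route to the destination or from the jail rules spelled out in the comments. The effect is easy to observe on a datagram socket; a minimal sketch using a documentation address as the peer:

/* Illustrative only: observe implicit source-address selection. */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>

int
main(void)
{
	struct sockaddr_in peer, local;
	socklen_t len = sizeof(local);
	char buf[INET_ADDRSTRLEN];
	int s;

	if ((s = socket(AF_INET, SOCK_DGRAM, 0)) == -1)
		return (1);

	memset(&peer, 0, sizeof(peer));
	peer.sin_family = AF_INET;
	peer.sin_len = sizeof(peer);
	peer.sin_port = htons(9);		/* example port only */
	inet_pton(AF_INET, "198.51.100.1", &peer.sin_addr);

	/* No bind(): in_pcbconnect() fills in laddr/lport for us. */
	if (connect(s, (struct sockaddr *)&peer, sizeof(peer)) == -1)
		return (1);
	if (getsockname(s, (struct sockaddr *)&local, &len) == -1)
		return (1);
	printf("kernel chose %s:%u\n",
	    inet_ntop(AF_INET, &local.sin_addr, buf, sizeof(buf)),
	    ntohs(local.sin_port));
	return (0);
}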
* * If the operation fails because the connection already exists, * *oinpp will be set to the PCB of that connection so that the * caller can decide to override it. In all other cases, *oinpp * is set to NULL. */ int in_pcbconnect_setup(struct inpcb *inp, struct sockaddr *nam, in_addr_t *laddrp, u_short *lportp, in_addr_t *faddrp, u_short *fportp, struct inpcb **oinpp, struct ucred *cred) { INIT_VNET_INET(inp->inp_vnet); struct sockaddr_in *sin = (struct sockaddr_in *)nam; struct in_ifaddr *ia; struct inpcb *oinp; struct in_addr laddr, faddr, jailia; u_short lport, fport; int error; /* * Because a global state change doesn't actually occur here, a read * lock is sufficient. */ INP_INFO_LOCK_ASSERT(inp->inp_pcbinfo); INP_LOCK_ASSERT(inp); if (oinpp != NULL) *oinpp = NULL; if (nam->sa_len != sizeof (*sin)) return (EINVAL); if (sin->sin_family != AF_INET) return (EAFNOSUPPORT); if (sin->sin_port == 0) return (EADDRNOTAVAIL); laddr.s_addr = *laddrp; lport = *lportp; faddr = sin->sin_addr; fport = sin->sin_port; if (!TAILQ_EMPTY(&V_in_ifaddrhead)) { /* * If the destination address is INADDR_ANY, * use the primary local address. * If the supplied address is INADDR_BROADCAST, * and the primary interface supports broadcast, * choose the broadcast address for that interface. */ if (faddr.s_addr == INADDR_ANY) { if (cred != NULL && jailed(cred)) { if (prison_getip4(cred, &jailia) != 0) return (EADDRNOTAVAIL); faddr.s_addr = jailia.s_addr; } else { faddr = IA_SIN(TAILQ_FIRST(&V_in_ifaddrhead))-> sin_addr; } } else if (faddr.s_addr == (u_long)INADDR_BROADCAST && (TAILQ_FIRST(&V_in_ifaddrhead)->ia_ifp->if_flags & IFF_BROADCAST)) faddr = satosin(&TAILQ_FIRST( &V_in_ifaddrhead)->ia_broadaddr)->sin_addr; } if (laddr.s_addr == INADDR_ANY) { error = in_pcbladdr(inp, &faddr, &laddr, cred); if (error) return (error); /* * If the destination address is multicast and an outgoing * interface has been set as a multicast option, use the * address of that interface as our source address. */ if (IN_MULTICAST(ntohl(faddr.s_addr)) && inp->inp_moptions != NULL) { struct ip_moptions *imo; struct ifnet *ifp; imo = inp->inp_moptions; if (imo->imo_multicast_ifp != NULL) { ifp = imo->imo_multicast_ifp; TAILQ_FOREACH(ia, &V_in_ifaddrhead, ia_link) if (ia->ia_ifp == ifp) break; if (ia == NULL) return (EADDRNOTAVAIL); laddr = ia->ia_addr.sin_addr; } } } oinp = in_pcblookup_hash(inp->inp_pcbinfo, faddr, fport, laddr, lport, 0, NULL); if (oinp != NULL) { if (oinpp != NULL) *oinpp = oinp; return (EADDRINUSE); } if (lport == 0) { error = in_pcbbind_setup(inp, NULL, &laddr.s_addr, &lport, cred); if (error) return (error); } *laddrp = laddr.s_addr; *lportp = lport; *faddrp = faddr.s_addr; *fportp = fport; return (0); } void in_pcbdisconnect(struct inpcb *inp) { INP_INFO_WLOCK_ASSERT(inp->inp_pcbinfo); INP_WLOCK_ASSERT(inp); inp->inp_faddr.s_addr = INADDR_ANY; inp->inp_fport = 0; in_pcbrehash(inp); } /* * in_pcbdetach() is responsibe for disassociating a socket from an inpcb. * For most protocols, this will be invoked immediately prior to calling * in_pcbfree(). However, with TCP the inpcb may significantly outlive the * socket, in which case in_pcbfree() is deferred. */ void in_pcbdetach(struct inpcb *inp) { KASSERT(inp->inp_socket != NULL, ("%s: inp_socket == NULL", __func__)); inp->inp_socket->so_pcb = NULL; inp->inp_socket = NULL; } /* * in_pcbfree_internal() frees an inpcb that has been detached from its * socket, and whose reference count has reached 0. It will also remove the * inpcb from any global lists it might remain on. 
*/ static void in_pcbfree_internal(struct inpcb *inp) { struct inpcbinfo *ipi = inp->inp_pcbinfo; KASSERT(inp->inp_socket == NULL, ("%s: inp_socket != NULL", __func__)); KASSERT(inp->inp_refcount == 0, ("%s: refcount !0", __func__)); INP_INFO_WLOCK_ASSERT(ipi); INP_WLOCK_ASSERT(inp); #ifdef IPSEC if (inp->inp_sp != NULL) ipsec_delete_pcbpolicy(inp); #endif /* IPSEC */ inp->inp_gencnt = ++ipi->ipi_gencnt; in_pcbremlists(inp); #ifdef INET6 if (inp->inp_vflag & INP_IPV6PROTO) { ip6_freepcbopts(inp->in6p_outputopts); ip6_freemoptions(inp->in6p_moptions); } #endif if (inp->inp_options) (void)m_free(inp->inp_options); if (inp->inp_moptions != NULL) inp_freemoptions(inp->inp_moptions); inp->inp_vflag = 0; crfree(inp->inp_cred); #ifdef MAC mac_inpcb_destroy(inp); #endif INP_WUNLOCK(inp); uma_zfree(ipi->ipi_zone, inp); } /* * in_pcbref() bumps the reference count on an inpcb in order to maintain * stability of an inpcb pointer despite the inpcb lock being released. This * is used in TCP when the inpcbinfo lock needs to be acquired or upgraded, * but where the inpcb lock is already held. * * While the inpcb will not be freed, releasing the inpcb lock means that the * connection's state may change, so the caller should be careful to * revalidate any cached state on reacquiring the lock. Drop the reference * using in_pcbrele(). */ void in_pcbref(struct inpcb *inp) { INP_WLOCK_ASSERT(inp); KASSERT(inp->inp_refcount > 0, ("%s: refcount 0", __func__)); inp->inp_refcount++; } /* * Drop a refcount on an inpcb elevated using in_pcbref(); because a call to * in_pcbfree() may have been made between in_pcbref() and in_pcbrele(), we * return a flag indicating whether or not the inpcb remains valid. If it is * valid, we return with the inpcb lock held. */ int in_pcbrele(struct inpcb *inp) { #ifdef INVARIANTS struct inpcbinfo *ipi = inp->inp_pcbinfo; #endif KASSERT(inp->inp_refcount > 0, ("%s: refcount 0", __func__)); INP_INFO_WLOCK_ASSERT(ipi); INP_WLOCK_ASSERT(inp); inp->inp_refcount--; if (inp->inp_refcount > 0) return (0); in_pcbfree_internal(inp); return (1); } /* * Unconditionally schedule an inpcb to be freed by decrementing its * reference count, which should occur only after the inpcb has been detached * from its socket. If another thread holds a temporary reference (acquired * using in_pcbref()) then the free is deferred until that reference is * released using in_pcbrele(), but the inpcb is still unlocked. */ void in_pcbfree(struct inpcb *inp) { #ifdef INVARIANTS struct inpcbinfo *ipi = inp->inp_pcbinfo; #endif KASSERT(inp->inp_socket == NULL, ("%s: inp_socket != NULL", __func__)); INP_INFO_WLOCK_ASSERT(ipi); INP_WLOCK_ASSERT(inp); if (!in_pcbrele(inp)) INP_WUNLOCK(inp); } /* * in_pcbdrop() removes an inpcb from hashed lists, releasing its address and * port reservation, and preventing it from being returned by inpcb lookups. * * It is used by TCP to mark an inpcb as unused and avoid future packet * delivery or event notification when a socket remains open but TCP has * closed. This might occur as a result of a shutdown()-initiated TCP close * or a RST on the wire, and allows the port binding to be reused while still * maintaining the invariant that so_pcb always points to a valid inpcb until * in_pcbdetach(). * * XXXRW: An inp_lport of 0 is used to indicate that the inpcb is not on hash * lists, but can lead to confusing netstat output, as open sockets with * closed TCP connections will no longer appear to have their bound port * number. 
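The comments above describe the in_pcbref()/in_pcbrele() contract but never show it end to end: the inpcb stays valid while a temporary reference is held across a lock drop, and the final release frees it. A tiny stand-alone model of that contract, with hypothetical names, purely to illustrate the counting:

/* Illustrative only: a userland model of the in_pcbref()/in_pcbrele()
 * contract described above; names and types are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct pcb {
	int refcount;		/* one reference held by the "pcbinfo" list */
	bool freed;
};

static void
pcb_ref(struct pcb *p)
{
	assert(p->refcount > 0);
	p->refcount++;
}

/* Returns true if this was the last reference and the pcb is now gone. */
static bool
pcb_rele(struct pcb *p)
{
	assert(p->refcount > 0);
	if (--p->refcount > 0)
		return (false);
	p->freed = true;	/* stands in for in_pcbfree_internal() */
	return (true);
}

int
main(void)
{
	struct pcb *p = calloc(1, sizeof(*p));

	p->refcount = 1;	/* as set in in_pcballoc() */
	pcb_ref(p);		/* temporary reference across a lock drop */
	assert(!pcb_rele(p));	/* list reference still outstanding */
	assert(pcb_rele(p));	/* last reference: object is freed */
	free(p);
	return (0);
}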
An explicit flag would be better, as it would allow us to leave * the port number intact after the connection is dropped. * * XXXRW: Possibly in_pcbdrop() should also prevent future notifications by * in_pcbnotifyall() and in_pcbpurgeif0()? */ void in_pcbdrop(struct inpcb *inp) { INP_INFO_WLOCK_ASSERT(inp->inp_pcbinfo); INP_WLOCK_ASSERT(inp); inp->inp_vflag |= INP_DROPPED; if (inp->inp_lport) { struct inpcbport *phd = inp->inp_phd; LIST_REMOVE(inp, inp_hash); LIST_REMOVE(inp, inp_portlist); if (LIST_FIRST(&phd->phd_pcblist) == NULL) { LIST_REMOVE(phd, phd_hash); free(phd, M_PCB); } inp->inp_lport = 0; } } /* * Common routines to return the socket addresses associated with inpcbs. */ struct sockaddr * in_sockaddr(in_port_t port, struct in_addr *addr_p) { struct sockaddr_in *sin; sin = malloc(sizeof *sin, M_SONAME, M_WAITOK | M_ZERO); sin->sin_family = AF_INET; sin->sin_len = sizeof(*sin); sin->sin_addr = *addr_p; sin->sin_port = port; return (struct sockaddr *)sin; } int in_getsockaddr(struct socket *so, struct sockaddr **nam) { struct inpcb *inp; struct in_addr addr; in_port_t port; inp = sotoinpcb(so); KASSERT(inp != NULL, ("in_getsockaddr: inp == NULL")); INP_RLOCK(inp); port = inp->inp_lport; addr = inp->inp_laddr; INP_RUNLOCK(inp); *nam = in_sockaddr(port, &addr); return 0; } int in_getpeeraddr(struct socket *so, struct sockaddr **nam) { struct inpcb *inp; struct in_addr addr; in_port_t port; inp = sotoinpcb(so); KASSERT(inp != NULL, ("in_getpeeraddr: inp == NULL")); INP_RLOCK(inp); port = inp->inp_fport; addr = inp->inp_faddr; INP_RUNLOCK(inp); *nam = in_sockaddr(port, &addr); return 0; } void in_pcbnotifyall(struct inpcbinfo *pcbinfo, struct in_addr faddr, int errno, struct inpcb *(*notify)(struct inpcb *, int)) { struct inpcb *inp, *inp_temp; INP_INFO_WLOCK(pcbinfo); LIST_FOREACH_SAFE(inp, pcbinfo->ipi_listhead, inp_list, inp_temp) { INP_WLOCK(inp); #ifdef INET6 if ((inp->inp_vflag & INP_IPV4) == 0) { INP_WUNLOCK(inp); continue; } #endif if (inp->inp_faddr.s_addr != faddr.s_addr || inp->inp_socket == NULL) { INP_WUNLOCK(inp); continue; } if ((*notify)(inp, errno)) INP_WUNLOCK(inp); } INP_INFO_WUNLOCK(pcbinfo); } void in_pcbpurgeif0(struct inpcbinfo *pcbinfo, struct ifnet *ifp) { struct inpcb *inp; struct ip_moptions *imo; int i, gap; INP_INFO_RLOCK(pcbinfo); LIST_FOREACH(inp, pcbinfo->ipi_listhead, inp_list) { INP_WLOCK(inp); imo = inp->inp_moptions; if ((inp->inp_vflag & INP_IPV4) && imo != NULL) { /* * Unselect the outgoing interface if it is being * detached. */ if (imo->imo_multicast_ifp == ifp) imo->imo_multicast_ifp = NULL; /* * Drop multicast group membership if we joined * through the interface being detached. */ for (i = 0, gap = 0; i < imo->imo_num_memberships; i++) { if (imo->imo_membership[i]->inm_ifp == ifp) { in_delmulti(imo->imo_membership[i]); gap++; } else if (gap != 0) imo->imo_membership[i - gap] = imo->imo_membership[i]; } imo->imo_num_memberships -= gap; } INP_WUNLOCK(inp); } INP_INFO_RUNLOCK(pcbinfo); } /* * Lookup a PCB based on the local address and port. */ #define INP_LOOKUP_MAPPED_PCB_COST 3 struct inpcb * in_pcblookup_local(struct inpcbinfo *pcbinfo, struct in_addr laddr, u_short lport, int wild_okay, struct ucred *cred) { struct inpcb *inp; #ifdef INET6 int matchwild = 3 + INP_LOOKUP_MAPPED_PCB_COST; #else int matchwild = 3; #endif int wildcard; INP_INFO_LOCK_ASSERT(pcbinfo); if (!wild_okay) { struct inpcbhead *head; /* * Look for an unconnected (wildcard foreign addr) PCB that * matches the local address and port we're looking for. 
*/ head = &pcbinfo->ipi_hashbase[INP_PCBHASH(INADDR_ANY, lport, 0, pcbinfo->ipi_hashmask)]; LIST_FOREACH(inp, head, inp_hash) { #ifdef INET6 /* XXX inp locking */ if ((inp->inp_vflag & INP_IPV4) == 0) continue; #endif if (inp->inp_faddr.s_addr == INADDR_ANY && inp->inp_laddr.s_addr == laddr.s_addr && inp->inp_lport == lport) { /* * Found? */ if (cred == NULL || inp->inp_cred->cr_prison == cred->cr_prison) return (inp); } } /* * Not found. */ return (NULL); } else { struct inpcbporthead *porthash; struct inpcbport *phd; struct inpcb *match = NULL; /* * Best fit PCB lookup. * * First see if this local port is in use by looking on the * port hash list. */ porthash = &pcbinfo->ipi_porthashbase[INP_PCBPORTHASH(lport, pcbinfo->ipi_porthashmask)]; LIST_FOREACH(phd, porthash, phd_hash) { if (phd->phd_port == lport) break; } if (phd != NULL) { /* * Port is in use by one or more PCBs. Look for best * fit. */ LIST_FOREACH(inp, &phd->phd_pcblist, inp_portlist) { wildcard = 0; if (cred != NULL && inp->inp_cred->cr_prison != cred->cr_prison) continue; #ifdef INET6 /* XXX inp locking */ if ((inp->inp_vflag & INP_IPV4) == 0) continue; /* * We never select the PCB that has * INP_IPV6 flag and is bound to :: if * we have another PCB which is bound * to 0.0.0.0. If a PCB has the * INP_IPV6 flag, then we set its cost * higher than IPv4 only PCBs. * * Note that the case only happens * when a socket is bound to ::, under * the condition that the use of the * mapped address is allowed. */ if ((inp->inp_vflag & INP_IPV6) != 0) wildcard += INP_LOOKUP_MAPPED_PCB_COST; #endif if (inp->inp_faddr.s_addr != INADDR_ANY) wildcard++; if (inp->inp_laddr.s_addr != INADDR_ANY) { if (laddr.s_addr == INADDR_ANY) wildcard++; else if (inp->inp_laddr.s_addr != laddr.s_addr) continue; } else { if (laddr.s_addr != INADDR_ANY) wildcard++; } if (wildcard < matchwild) { match = inp; matchwild = wildcard; if (matchwild == 0) break; } } } return (match); } } #undef INP_LOOKUP_MAPPED_PCB_COST /* * Lookup PCB in hash list. */ struct inpcb * in_pcblookup_hash(struct inpcbinfo *pcbinfo, struct in_addr faddr, u_int fport_arg, struct in_addr laddr, u_int lport_arg, int wildcard, struct ifnet *ifp) { struct inpcbhead *head; struct inpcb *inp, *tmpinp; u_short fport = fport_arg, lport = lport_arg; INP_INFO_LOCK_ASSERT(pcbinfo); /* * First look for an exact match. */ tmpinp = NULL; head = &pcbinfo->ipi_hashbase[INP_PCBHASH(faddr.s_addr, lport, fport, pcbinfo->ipi_hashmask)]; LIST_FOREACH(inp, head, inp_hash) { #ifdef INET6 /* XXX inp locking */ if ((inp->inp_vflag & INP_IPV4) == 0) continue; #endif if (inp->inp_faddr.s_addr == faddr.s_addr && inp->inp_laddr.s_addr == laddr.s_addr && inp->inp_fport == fport && inp->inp_lport == lport) { /* * XXX We should be able to directly return * the inp here, without any checks. * Well unless both bound with SO_REUSEPORT? */ if (jailed(inp->inp_cred)) return (inp); if (tmpinp == NULL) tmpinp = inp; } } if (tmpinp != NULL) return (tmpinp); /* * Then look for a wildcard match, if requested. */ if (wildcard == INPLOOKUP_WILDCARD) { struct inpcb *local_wild = NULL, *local_exact = NULL; #ifdef INET6 struct inpcb *local_wild_mapped = NULL; #endif struct inpcb *jail_wild = NULL; int injail; /* * Order of socket selection - we always prefer jails. * 1. jailed, non-wild. * 2. jailed, wild. * 3. non-jailed, non-wild. * 4. non-jailed, wild. 
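The selection order above (jailed before non-jailed, exact local address before wildcard) decides which of several candidate sockets receives an inbound packet. A hedged userland illustration: with SO_REUSEADDR, a socket bound to a specific local address can coexist with one bound to INADDR_ANY on the same port, and the exact binding is the one the lookup prefers. The port number is arbitrary and error handling is elided:

/* Illustrative only: exact local binding beats a wildcard binding. */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

#define EXAMPLE_PORT 4444	/* arbitrary example port */

static int
bind_udp(const char *addr, int reuseaddr)
{
	struct sockaddr_in sin;
	int s, on = 1;

	s = socket(AF_INET, SOCK_DGRAM, 0);
	if (reuseaddr)
		setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_len = sizeof(sin);
	sin.sin_port = htons(EXAMPLE_PORT);
	inet_pton(AF_INET, addr, &sin.sin_addr);
	bind(s, (struct sockaddr *)&sin, sizeof(sin));
	return (s);
}

int
main(void)
{
	int wildcard, exact;

	wildcard = bind_udp("0.0.0.0", 0);
	/* SO_REUSEADDR lets the specific bind coexist with the wildcard. */
	exact = bind_udp("127.0.0.1", 1);

	/*
	 * A datagram sent to 127.0.0.1 on the example port is delivered
	 * to 'exact' (local_exact in the lookup above); traffic to any
	 * other local address on the same port falls through to
	 * 'wildcard' (local_wild).
	 */
	(void)wildcard;
	(void)exact;
	return (0);
}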
*/ head = &pcbinfo->ipi_hashbase[INP_PCBHASH(INADDR_ANY, lport, 0, pcbinfo->ipi_hashmask)]; LIST_FOREACH(inp, head, inp_hash) { #ifdef INET6 /* XXX inp locking */ if ((inp->inp_vflag & INP_IPV4) == 0) continue; #endif if (inp->inp_faddr.s_addr != INADDR_ANY || inp->inp_lport != lport) continue; /* XXX inp locking */ if (ifp && ifp->if_type == IFT_FAITH && (inp->inp_flags & INP_FAITH) == 0) continue; injail = jailed(inp->inp_cred); if (injail) { if (!prison_check_ip4(inp->inp_cred, &laddr)) continue; } else { if (local_exact != NULL) continue; } if (inp->inp_laddr.s_addr == laddr.s_addr) { if (injail) return (inp); else local_exact = inp; } else if (inp->inp_laddr.s_addr == INADDR_ANY) { #ifdef INET6 /* XXX inp locking, NULL check */ if (inp->inp_vflag & INP_IPV6PROTO) local_wild_mapped = inp; else #endif /* INET6 */ if (injail) jail_wild = inp; else local_wild = inp; } } /* LIST_FOREACH */ if (jail_wild != NULL) return (jail_wild); if (local_exact != NULL) return (local_exact); if (local_wild != NULL) return (local_wild); #ifdef INET6 if (local_wild_mapped != NULL) return (local_wild_mapped); #endif /* defined(INET6) */ } /* if (wildcard == INPLOOKUP_WILDCARD) */ return (NULL); } /* * Insert PCB onto various hash lists. */ int in_pcbinshash(struct inpcb *inp) { struct inpcbhead *pcbhash; struct inpcbporthead *pcbporthash; struct inpcbinfo *pcbinfo = inp->inp_pcbinfo; struct inpcbport *phd; u_int32_t hashkey_faddr; INP_INFO_WLOCK_ASSERT(pcbinfo); INP_WLOCK_ASSERT(inp); #ifdef INET6 if (inp->inp_vflag & INP_IPV6) hashkey_faddr = inp->in6p_faddr.s6_addr32[3] /* XXX */; else #endif /* INET6 */ hashkey_faddr = inp->inp_faddr.s_addr; pcbhash = &pcbinfo->ipi_hashbase[INP_PCBHASH(hashkey_faddr, inp->inp_lport, inp->inp_fport, pcbinfo->ipi_hashmask)]; pcbporthash = &pcbinfo->ipi_porthashbase[ INP_PCBPORTHASH(inp->inp_lport, pcbinfo->ipi_porthashmask)]; /* * Go through port list and look for a head for this lport. */ LIST_FOREACH(phd, pcbporthash, phd_hash) { if (phd->phd_port == inp->inp_lport) break; } /* * If none exists, malloc one and tack it on. */ if (phd == NULL) { phd = malloc(sizeof(struct inpcbport), M_PCB, M_NOWAIT); if (phd == NULL) { return (ENOBUFS); /* XXX */ } phd->phd_port = inp->inp_lport; LIST_INIT(&phd->phd_pcblist); LIST_INSERT_HEAD(pcbporthash, phd, phd_hash); } inp->inp_phd = phd; LIST_INSERT_HEAD(&phd->phd_pcblist, inp, inp_portlist); LIST_INSERT_HEAD(pcbhash, inp, inp_hash); return (0); } /* * Move PCB to the proper hash bucket when { faddr, fport } have been * changed. NOTE: This does not handle the case of the lport changing (the * hashed port list would have to be updated as well), so the lport must * not change after in_pcbinshash() has been called. */ void in_pcbrehash(struct inpcb *inp) { struct inpcbinfo *pcbinfo = inp->inp_pcbinfo; struct inpcbhead *head; u_int32_t hashkey_faddr; INP_INFO_WLOCK_ASSERT(pcbinfo); INP_WLOCK_ASSERT(inp); #ifdef INET6 if (inp->inp_vflag & INP_IPV6) hashkey_faddr = inp->in6p_faddr.s6_addr32[3] /* XXX */; else #endif /* INET6 */ hashkey_faddr = inp->inp_faddr.s_addr; head = &pcbinfo->ipi_hashbase[INP_PCBHASH(hashkey_faddr, inp->inp_lport, inp->inp_fport, pcbinfo->ipi_hashmask)]; LIST_REMOVE(inp, inp_hash); LIST_INSERT_HEAD(head, inp, inp_hash); } /* * Remove PCB from various lists. 
*/ void in_pcbremlists(struct inpcb *inp) { struct inpcbinfo *pcbinfo = inp->inp_pcbinfo; INP_INFO_WLOCK_ASSERT(pcbinfo); INP_WLOCK_ASSERT(inp); inp->inp_gencnt = ++pcbinfo->ipi_gencnt; if (inp->inp_lport) { struct inpcbport *phd = inp->inp_phd; LIST_REMOVE(inp, inp_hash); LIST_REMOVE(inp, inp_portlist); if (LIST_FIRST(&phd->phd_pcblist) == NULL) { LIST_REMOVE(phd, phd_hash); free(phd, M_PCB); } } LIST_REMOVE(inp, inp_list); pcbinfo->ipi_count--; } /* * A set label operation has occurred at the socket layer, propagate the * label change into the in_pcb for the socket. */ void in_pcbsosetlabel(struct socket *so) { #ifdef MAC struct inpcb *inp; inp = sotoinpcb(so); KASSERT(inp != NULL, ("in_pcbsosetlabel: so->so_pcb == NULL")); INP_WLOCK(inp); SOCK_LOCK(so); mac_inpcb_sosetlabel(so, inp); SOCK_UNLOCK(so); INP_WUNLOCK(inp); #endif } /* * ipport_tick runs once per second, determining if random port allocation * should be continued. If more than ipport_randomcps ports have been * allocated in the last second, then we return to sequential port * allocation. We return to random allocation only once we drop below * ipport_randomcps for at least ipport_randomtime seconds. */ void ipport_tick(void *xtp) { VNET_ITERATOR_DECL(vnet_iter); VNET_LIST_RLOCK(); VNET_FOREACH(vnet_iter) { CURVNET_SET(vnet_iter); /* XXX appease INVARIANTS here */ INIT_VNET_INET(vnet_iter); if (V_ipport_tcpallocs <= V_ipport_tcplastcount + V_ipport_randomcps) { if (V_ipport_stoprandom > 0) V_ipport_stoprandom--; } else V_ipport_stoprandom = V_ipport_randomtime; V_ipport_tcplastcount = V_ipport_tcpallocs; CURVNET_RESTORE(); } VNET_LIST_RUNLOCK(); callout_reset(&ipport_tick_callout, hz, ipport_tick, NULL); } void inp_wlock(struct inpcb *inp) { INP_WLOCK(inp); } void inp_wunlock(struct inpcb *inp) { INP_WUNLOCK(inp); } void inp_rlock(struct inpcb *inp) { INP_RLOCK(inp); } void inp_runlock(struct inpcb *inp) { INP_RUNLOCK(inp); } #ifdef INVARIANTS void inp_lock_assert(struct inpcb *inp) { INP_WLOCK_ASSERT(inp); } void inp_unlock_assert(struct inpcb *inp) { INP_UNLOCK_ASSERT(inp); } #endif void inp_apply_all(void (*func)(struct inpcb *, void *), void *arg) { INIT_VNET_INET(curvnet); struct inpcb *inp; INP_INFO_RLOCK(&V_tcbinfo); LIST_FOREACH(inp, V_tcbinfo.ipi_listhead, inp_list) { INP_WLOCK(inp); func(inp, arg); INP_WUNLOCK(inp); } INP_INFO_RUNLOCK(&V_tcbinfo); } struct socket * inp_inpcbtosocket(struct inpcb *inp) { INP_WLOCK_ASSERT(inp); return (inp->inp_socket); } struct tcpcb * inp_inpcbtotcpcb(struct inpcb *inp) { INP_WLOCK_ASSERT(inp); return ((struct tcpcb *)inp->inp_ppcb); } int inp_ip_tos_get(const struct inpcb *inp) { return (inp->inp_ip_tos); } void inp_ip_tos_set(struct inpcb *inp, int val) { inp->inp_ip_tos = val; } void inp_4tuple_get(struct inpcb *inp, uint32_t *laddr, uint16_t *lp, uint32_t *faddr, uint16_t *fp) { INP_LOCK_ASSERT(inp); *laddr = inp->inp_laddr.s_addr; *faddr = inp->inp_faddr.s_addr; *lp = inp->inp_lport; *fp = inp->inp_fport; } struct inpcb * so_sotoinpcb(struct socket *so) { return (sotoinpcb(so)); } struct tcpcb * so_sototcpcb(struct socket *so) { return (sototcpcb(so)); } #ifdef DDB static void db_print_indent(int indent) { int i; for (i = 0; i < indent; i++) db_printf(" "); } static void db_print_inconninfo(struct in_conninfo *inc, const char *name, int indent) { char faddr_str[48], laddr_str[48]; db_print_indent(indent); db_printf("%s at %p\n", name, inc); indent += 2; #ifdef INET6 if (inc->inc_flags == 1) { /* IPv6. 
*/ ip6_sprintf(laddr_str, &inc->inc6_laddr); ip6_sprintf(faddr_str, &inc->inc6_faddr); } else { #endif /* IPv4. */ inet_ntoa_r(inc->inc_laddr, laddr_str); inet_ntoa_r(inc->inc_faddr, faddr_str); #ifdef INET6 } #endif db_print_indent(indent); db_printf("inc_laddr %s inc_lport %u\n", laddr_str, ntohs(inc->inc_lport)); db_print_indent(indent); db_printf("inc_faddr %s inc_fport %u\n", faddr_str, ntohs(inc->inc_fport)); } static void db_print_inpflags(int inp_flags) { int comma; comma = 0; if (inp_flags & INP_RECVOPTS) { db_printf("%sINP_RECVOPTS", comma ? ", " : ""); comma = 1; } if (inp_flags & INP_RECVRETOPTS) { db_printf("%sINP_RECVRETOPTS", comma ? ", " : ""); comma = 1; } if (inp_flags & INP_RECVDSTADDR) { db_printf("%sINP_RECVDSTADDR", comma ? ", " : ""); comma = 1; } if (inp_flags & INP_HDRINCL) { db_printf("%sINP_HDRINCL", comma ? ", " : ""); comma = 1; } if (inp_flags & INP_HIGHPORT) { db_printf("%sINP_HIGHPORT", comma ? ", " : ""); comma = 1; } if (inp_flags & INP_LOWPORT) { db_printf("%sINP_LOWPORT", comma ? ", " : ""); comma = 1; } if (inp_flags & INP_ANONPORT) { db_printf("%sINP_ANONPORT", comma ? ", " : ""); comma = 1; } if (inp_flags & INP_RECVIF) { db_printf("%sINP_RECVIF", comma ? ", " : ""); comma = 1; } if (inp_flags & INP_MTUDISC) { db_printf("%sINP_MTUDISC", comma ? ", " : ""); comma = 1; } if (inp_flags & INP_FAITH) { db_printf("%sINP_FAITH", comma ? ", " : ""); comma = 1; } if (inp_flags & INP_RECVTTL) { db_printf("%sINP_RECVTTL", comma ? ", " : ""); comma = 1; } if (inp_flags & INP_DONTFRAG) { db_printf("%sINP_DONTFRAG", comma ? ", " : ""); comma = 1; } if (inp_flags & IN6P_IPV6_V6ONLY) { db_printf("%sIN6P_IPV6_V6ONLY", comma ? ", " : ""); comma = 1; } if (inp_flags & IN6P_PKTINFO) { db_printf("%sIN6P_PKTINFO", comma ? ", " : ""); comma = 1; } if (inp_flags & IN6P_HOPLIMIT) { db_printf("%sIN6P_HOPLIMIT", comma ? ", " : ""); comma = 1; } if (inp_flags & IN6P_HOPOPTS) { db_printf("%sIN6P_HOPOPTS", comma ? ", " : ""); comma = 1; } if (inp_flags & IN6P_DSTOPTS) { db_printf("%sIN6P_DSTOPTS", comma ? ", " : ""); comma = 1; } if (inp_flags & IN6P_RTHDR) { db_printf("%sIN6P_RTHDR", comma ? ", " : ""); comma = 1; } if (inp_flags & IN6P_RTHDRDSTOPTS) { db_printf("%sIN6P_RTHDRDSTOPTS", comma ? ", " : ""); comma = 1; } if (inp_flags & IN6P_TCLASS) { db_printf("%sIN6P_TCLASS", comma ? ", " : ""); comma = 1; } if (inp_flags & IN6P_AUTOFLOWLABEL) { db_printf("%sIN6P_AUTOFLOWLABEL", comma ? ", " : ""); comma = 1; } if (inp_flags & IN6P_RFC2292) { db_printf("%sIN6P_RFC2292", comma ? ", " : ""); comma = 1; } if (inp_flags & IN6P_MTU) { db_printf("IN6P_MTU%s", comma ? ", " : ""); comma = 1; } } static void db_print_inpvflag(u_char inp_vflag) { int comma; comma = 0; if (inp_vflag & INP_IPV4) { db_printf("%sINP_IPV4", comma ? ", " : ""); comma = 1; } if (inp_vflag & INP_IPV6) { db_printf("%sINP_IPV6", comma ? ", " : ""); comma = 1; } if (inp_vflag & INP_IPV6PROTO) { db_printf("%sINP_IPV6PROTO", comma ? ", " : ""); comma = 1; } if (inp_vflag & INP_TIMEWAIT) { db_printf("%sINP_TIMEWAIT", comma ? ", " : ""); comma = 1; } if (inp_vflag & INP_ONESBCAST) { db_printf("%sINP_ONESBCAST", comma ? ", " : ""); comma = 1; } if (inp_vflag & INP_DROPPED) { db_printf("%sINP_DROPPED", comma ? ", " : ""); comma = 1; } if (inp_vflag & INP_SOCKREF) { db_printf("%sINP_SOCKREF", comma ? 
", " : ""); comma = 1; } } void db_print_inpcb(struct inpcb *inp, const char *name, int indent) { db_print_indent(indent); db_printf("%s at %p\n", name, inp); indent += 2; db_print_indent(indent); db_printf("inp_flow: 0x%x\n", inp->inp_flow); db_print_inconninfo(&inp->inp_inc, "inp_conninfo", indent); db_print_indent(indent); db_printf("inp_ppcb: %p inp_pcbinfo: %p inp_socket: %p\n", inp->inp_ppcb, inp->inp_pcbinfo, inp->inp_socket); db_print_indent(indent); db_printf("inp_label: %p inp_flags: 0x%x (", inp->inp_label, inp->inp_flags); db_print_inpflags(inp->inp_flags); db_printf(")\n"); db_print_indent(indent); db_printf("inp_sp: %p inp_vflag: 0x%x (", inp->inp_sp, inp->inp_vflag); db_print_inpvflag(inp->inp_vflag); db_printf(")\n"); db_print_indent(indent); db_printf("inp_ip_ttl: %d inp_ip_p: %d inp_ip_minttl: %d\n", inp->inp_ip_ttl, inp->inp_ip_p, inp->inp_ip_minttl); db_print_indent(indent); #ifdef INET6 if (inp->inp_vflag & INP_IPV6) { db_printf("in6p_options: %p in6p_outputopts: %p " "in6p_moptions: %p\n", inp->in6p_options, inp->in6p_outputopts, inp->in6p_moptions); db_printf("in6p_icmp6filt: %p in6p_cksum %d " "in6p_hops %u\n", inp->in6p_icmp6filt, inp->in6p_cksum, inp->in6p_hops); } else #endif { db_printf("inp_ip_tos: %d inp_ip_options: %p " "inp_ip_moptions: %p\n", inp->inp_ip_tos, inp->inp_options, inp->inp_moptions); } db_print_indent(indent); db_printf("inp_phd: %p inp_gencnt: %ju\n", inp->inp_phd, (uintmax_t)inp->inp_gencnt); } DB_SHOW_COMMAND(inpcb, db_show_inpcb) { struct inpcb *inp; if (!have_addr) { db_printf("usage: show inpcb \n"); return; } inp = (struct inpcb *)addr; db_print_inpcb(inp, "inpcb", 0); } #endif Index: head/sys/netinet/in_proto.c =================================================================== --- head/sys/netinet/in_proto.c (revision 186118) +++ head/sys/netinet/in_proto.c (revision 186119) @@ -1,397 +1,400 @@ /*- * Copyright (c) 1982, 1986, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* * @(#)in_proto.c 8.2 (Berkeley) 2/9/95 */ #include __FBSDID("$FreeBSD$"); #include "opt_ipx.h" #include "opt_mrouting.h" #include "opt_ipsec.h" #include "opt_inet6.h" #include "opt_pf.h" #include "opt_carp.h" #include "opt_sctp.h" #include "opt_mpath.h" #include #include #include #include #include #include #include #include #include #include #include #ifdef RADIX_MPATH #include #endif #include #include +#include #include #include #include #include #include #include #include #include #include #include /* * TCP/IP protocol family: IP, ICMP, UDP, TCP. */ static struct pr_usrreqs nousrreqs; #ifdef IPSEC #include #endif /* IPSEC */ #ifdef SCTP #include #include #include #include #endif /* SCTP */ #ifdef DEV_PFSYNC #include #include #endif #ifdef DEV_CARP #include #endif extern struct domain inetdomain; /* Spacer for loadable protocols. */ #define IPPROTOSPACER \ { \ .pr_domain = &inetdomain, \ .pr_protocol = PROTO_SPACER, \ .pr_usrreqs = &nousrreqs \ } struct protosw inetsw[] = { { .pr_type = 0, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_IP, .pr_init = ip_init, .pr_slowtimo = ip_slowtimo, .pr_drain = ip_drain, .pr_usrreqs = &nousrreqs }, { .pr_type = SOCK_DGRAM, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_UDP, .pr_flags = PR_ATOMIC|PR_ADDR, .pr_input = udp_input, .pr_ctlinput = udp_ctlinput, .pr_ctloutput = ip_ctloutput, .pr_init = udp_init, .pr_usrreqs = &udp_usrreqs }, { .pr_type = SOCK_STREAM, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_TCP, .pr_flags = PR_CONNREQUIRED|PR_IMPLOPCL|PR_WANTRCVD, .pr_input = tcp_input, .pr_ctlinput = tcp_ctlinput, .pr_ctloutput = tcp_ctloutput, .pr_init = tcp_init, .pr_slowtimo = tcp_slowtimo, .pr_drain = tcp_drain, .pr_usrreqs = &tcp_usrreqs }, #ifdef SCTP { .pr_type = SOCK_DGRAM, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_SCTP, .pr_flags = PR_WANTRCVD, .pr_input = sctp_input, .pr_ctlinput = sctp_ctlinput, .pr_ctloutput = sctp_ctloutput, .pr_init = sctp_init, .pr_drain = sctp_drain, .pr_usrreqs = &sctp_usrreqs }, { .pr_type = SOCK_SEQPACKET, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_SCTP, .pr_flags = PR_WANTRCVD, .pr_input = sctp_input, .pr_ctlinput = sctp_ctlinput, .pr_ctloutput = sctp_ctloutput, .pr_drain = sctp_drain, .pr_usrreqs = &sctp_usrreqs }, { .pr_type = SOCK_STREAM, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_SCTP, .pr_flags = PR_WANTRCVD, .pr_input = sctp_input, .pr_ctlinput = sctp_ctlinput, .pr_ctloutput = sctp_ctloutput, .pr_drain = sctp_drain, .pr_usrreqs = &sctp_usrreqs }, #endif /* SCTP */ { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_RAW, .pr_flags = PR_ATOMIC|PR_ADDR, .pr_input = rip_input, .pr_ctlinput = rip_ctlinput, .pr_ctloutput = rip_ctloutput, .pr_usrreqs = &rip_usrreqs }, { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_ICMP, .pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR, .pr_input = icmp_input, .pr_ctloutput = rip_ctloutput, .pr_init = icmp_init, .pr_usrreqs = &rip_usrreqs }, { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_IGMP, .pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR, .pr_input = igmp_input, .pr_ctloutput = rip_ctloutput, .pr_init = igmp_init, .pr_fasttimo = igmp_fasttimo, .pr_slowtimo = igmp_slowtimo, .pr_usrreqs = &rip_usrreqs }, { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_RSVP, .pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR, .pr_input = rsvp_input, .pr_ctloutput = rip_ctloutput, .pr_usrreqs = &rip_usrreqs }, #ifdef IPSEC { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_AH, .pr_flags 
= PR_ATOMIC|PR_ADDR, .pr_input = ah4_input, .pr_ctlinput = ah4_ctlinput, .pr_usrreqs = &nousrreqs }, { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_ESP, .pr_flags = PR_ATOMIC|PR_ADDR, .pr_input = esp4_input, .pr_ctlinput = esp4_ctlinput, .pr_usrreqs = &nousrreqs }, { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_IPCOMP, .pr_flags = PR_ATOMIC|PR_ADDR, .pr_input = ipcomp4_input, .pr_usrreqs = &nousrreqs }, #endif /* IPSEC */ { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_IPV4, .pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR, .pr_input = encap4_input, .pr_ctloutput = rip_ctloutput, .pr_init = encap_init, .pr_usrreqs = &rip_usrreqs }, { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_MOBILE, .pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR, .pr_input = encap4_input, .pr_ctloutput = rip_ctloutput, .pr_init = encap_init, .pr_usrreqs = &rip_usrreqs }, { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_ETHERIP, .pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR, .pr_input = encap4_input, .pr_ctloutput = rip_ctloutput, .pr_init = encap_init, .pr_usrreqs = &rip_usrreqs }, { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_GRE, .pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR, .pr_input = encap4_input, .pr_ctloutput = rip_ctloutput, .pr_init = encap_init, .pr_usrreqs = &rip_usrreqs }, # ifdef INET6 { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_IPV6, .pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR, .pr_input = encap4_input, .pr_ctloutput = rip_ctloutput, .pr_init = encap_init, .pr_usrreqs = &rip_usrreqs }, #endif { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_PIM, .pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR, .pr_input = encap4_input, .pr_ctloutput = rip_ctloutput, .pr_usrreqs = &rip_usrreqs }, #ifdef DEV_PFSYNC { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_PFSYNC, .pr_flags = PR_ATOMIC|PR_ADDR, .pr_input = pfsync_input, .pr_ctloutput = rip_ctloutput, .pr_usrreqs = &rip_usrreqs }, #endif /* DEV_PFSYNC */ #ifdef DEV_CARP { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_CARP, .pr_flags = PR_ATOMIC|PR_ADDR, .pr_input = carp_input, .pr_output = (pr_output_t*)rip_output, .pr_ctloutput = rip_ctloutput, .pr_usrreqs = &rip_usrreqs }, #endif /* DEV_CARP */ /* Spacer n-times for loadable protocols. 
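 * (Each IPPROTOSPACER below reserves a PROTO_SPACER slot in inetsw[];
 * a loadable protocol module can claim one of these slots at run time,
 * typically through pf_proto_register().)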
*/ IPPROTOSPACER, IPPROTOSPACER, IPPROTOSPACER, IPPROTOSPACER, IPPROTOSPACER, IPPROTOSPACER, IPPROTOSPACER, IPPROTOSPACER, /* raw wildcard */ { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_flags = PR_ATOMIC|PR_ADDR, .pr_input = rip_input, .pr_ctloutput = rip_ctloutput, .pr_init = rip_init, .pr_usrreqs = &rip_usrreqs }, }; extern int in_inithead(void **, int); struct domain inetdomain = { .dom_family = AF_INET, .dom_name = "internet", .dom_protosw = inetsw, .dom_protoswNPROTOSW = &inetsw[sizeof(inetsw)/sizeof(inetsw[0])], #ifdef RADIX_MPATH .dom_rtattach = rn4_mpath_inithead, #else .dom_rtattach = in_inithead, #endif .dom_rtoffset = 32, - .dom_maxrtkey = sizeof(struct sockaddr_in) + .dom_maxrtkey = sizeof(struct sockaddr_in), + .dom_ifattach = in_domifattach, + .dom_ifdetach = in_domifdetach }; DOMAIN_SET(inet); SYSCTL_NODE(_net, PF_INET, inet, CTLFLAG_RW, 0, "Internet Family"); SYSCTL_NODE(_net_inet, IPPROTO_IP, ip, CTLFLAG_RW, 0, "IP"); SYSCTL_NODE(_net_inet, IPPROTO_ICMP, icmp, CTLFLAG_RW, 0, "ICMP"); SYSCTL_NODE(_net_inet, IPPROTO_UDP, udp, CTLFLAG_RW, 0, "UDP"); SYSCTL_NODE(_net_inet, IPPROTO_TCP, tcp, CTLFLAG_RW, 0, "TCP"); #ifdef SCTP SYSCTL_NODE(_net_inet, IPPROTO_SCTP, sctp, CTLFLAG_RW, 0, "SCTP"); #endif SYSCTL_NODE(_net_inet, IPPROTO_IGMP, igmp, CTLFLAG_RW, 0, "IGMP"); #ifdef IPSEC /* XXX no protocol # to use, pick something "reserved" */ SYSCTL_NODE(_net_inet, 253, ipsec, CTLFLAG_RW, 0, "IPSEC"); SYSCTL_NODE(_net_inet, IPPROTO_AH, ah, CTLFLAG_RW, 0, "AH"); SYSCTL_NODE(_net_inet, IPPROTO_ESP, esp, CTLFLAG_RW, 0, "ESP"); SYSCTL_NODE(_net_inet, IPPROTO_IPCOMP, ipcomp, CTLFLAG_RW, 0, "IPCOMP"); SYSCTL_NODE(_net_inet, IPPROTO_IPIP, ipip, CTLFLAG_RW, 0, "IPIP"); #endif /* IPSEC */ SYSCTL_NODE(_net_inet, IPPROTO_RAW, raw, CTLFLAG_RW, 0, "RAW"); #ifdef DEV_PFSYNC SYSCTL_NODE(_net_inet, IPPROTO_PFSYNC, pfsync, CTLFLAG_RW, 0, "PFSYNC"); #endif #ifdef DEV_CARP SYSCTL_NODE(_net_inet, IPPROTO_CARP, carp, CTLFLAG_RW, 0, "CARP"); #endif Index: head/sys/netinet/in_rmx.c =================================================================== --- head/sys/netinet/in_rmx.c (revision 186118) +++ head/sys/netinet/in_rmx.c (revision 186119) @@ -1,520 +1,492 @@ /*- * Copyright 1994, 1995 Massachusetts Institute of Technology * * Permission to use, copy, modify, and distribute this software and * its documentation for any purpose and without fee is hereby * granted, provided that both the above copyright notice and this * permission notice appear in all copies, that both the above * copyright notice and this permission notice appear in all * supporting documentation, and that the name of M.I.T. not be used * in advertising or publicity pertaining to distribution of the * software without specific, written prior permission. M.I.T. makes * no representations about the suitability of this software for any * purpose. It is provided "as is" without express or implied * warranty. * * THIS SOFTWARE IS PROVIDED BY M.I.T. ``AS IS''. M.I.T. DISCLAIMS * ALL EXPRESS OR IMPLIED WARRANTIES WITH REGARD TO THIS SOFTWARE, * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT * SHALL M.I.T. 
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* * This code does two things necessary for the enhanced TCP metrics to * function in a useful manner: * 1) It marks all non-host routes as `cloning', thus ensuring that * every actual reference to such a route actually gets turned * into a reference to a host route to the specific destination * requested. * 2) When such routes lose all their references, it arranges for them * to be deleted in some random collection of circumstances, so that * a large quantity of stale routing data is not kept in kernel memory * indefinitely. See in_rtqtimo() below for the exact mechanism. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include extern int in_inithead(void **head, int off); #define RTPRF_OURS RTF_PROTO3 /* set on routes we manage */ /* * Do what we need to do when inserting a route. */ static struct radix_node * in_addroute(void *v_arg, void *n_arg, struct radix_node_head *head, struct radix_node *treenodes) { struct rtentry *rt = (struct rtentry *)treenodes; struct sockaddr_in *sin = (struct sockaddr_in *)rt_key(rt); - struct radix_node *ret; + RADIX_NODE_HEAD_WLOCK_ASSERT(head); /* * A little bit of help for both IP output and input: * For host routes, we make sure that RTF_BROADCAST * is set for anything that looks like a broadcast address. * This way, we can avoid an expensive call to in_broadcast() * in ip_output() most of the time (because the route passed * to ip_output() is almost always a host route). * * We also do the same for local addresses, with the thought * that this might one day be used to speed up ip_input(). * * We also mark routes to multicast addresses as such, because * it's easy to do and might be useful (but this is much more * dubious since it's so easy to inspect the address). */ if (rt->rt_flags & RTF_HOST) { if (in_broadcast(sin->sin_addr, rt->rt_ifp)) { rt->rt_flags |= RTF_BROADCAST; } else if (satosin(rt->rt_ifa->ifa_addr)->sin_addr.s_addr == sin->sin_addr.s_addr) { rt->rt_flags |= RTF_LOCAL; } } if (IN_MULTICAST(ntohl(sin->sin_addr.s_addr))) rt->rt_flags |= RTF_MULTICAST; if (!rt->rt_rmx.rmx_mtu && rt->rt_ifp) rt->rt_rmx.rmx_mtu = rt->rt_ifp->if_mtu; - ret = rn_addroute(v_arg, n_arg, head, treenodes); - if (ret == NULL && rt->rt_flags & RTF_HOST) { - struct rtentry *rt2; - /* - * We are trying to add a host route, but can't. - * Find out if it is because of an - * ARP entry and delete it if so. 
- */ - rt2 = in_rtalloc1((struct sockaddr *)sin, 0, - RTF_CLONING|RTF_RNH_LOCKED, rt->rt_fibnum); - if (rt2) { - if (rt2->rt_flags & RTF_LLINFO && - rt2->rt_flags & RTF_HOST && - rt2->rt_gateway && - rt2->rt_gateway->sa_family == AF_LINK) { - rtexpunge(rt2); - RTFREE_LOCKED(rt2); - ret = rn_addroute(v_arg, n_arg, head, - treenodes); - } else - RTFREE_LOCKED(rt2); - } - } - - return ret; + return (rn_addroute(v_arg, n_arg, head, treenodes)); } /* * This code is the inverse of in_clsroute: on first reference, if we * were managing the route, stop doing so and set the expiration timer * back off again. */ static struct radix_node * in_matroute(void *v_arg, struct radix_node_head *head) { struct radix_node *rn = rn_match(v_arg, head); struct rtentry *rt = (struct rtentry *)rn; /*XXX locking? */ if (rt && rt->rt_refcnt == 0) { /* this is first reference */ if (rt->rt_flags & RTPRF_OURS) { rt->rt_flags &= ~RTPRF_OURS; rt->rt_rmx.rmx_expire = 0; } } return rn; } #ifdef VIMAGE_GLOBALS static int rtq_reallyold; static int rtq_minreallyold; static int rtq_toomany; #endif SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip, IPCTL_RTEXPIRE, rtexpire, CTLFLAG_RW, rtq_reallyold, 0, "Default expiration time on dynamically learned routes"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip, IPCTL_RTMINEXPIRE, rtminexpire, CTLFLAG_RW, rtq_minreallyold, 0, "Minimum time to attempt to hold onto dynamically learned routes"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip, IPCTL_RTMAXCACHE, rtmaxcache, CTLFLAG_RW, rtq_toomany, 0, "Upper limit on dynamically learned routes"); /* * On last reference drop, mark the route as belong to us so that it can be * timed out. */ static void in_clsroute(struct radix_node *rn, struct radix_node_head *head) { INIT_VNET_INET(curvnet); struct rtentry *rt = (struct rtentry *)rn; RT_LOCK_ASSERT(rt); if (!(rt->rt_flags & RTF_UP)) return; /* prophylactic measures */ - if ((rt->rt_flags & (RTF_LLINFO | RTF_HOST)) != RTF_HOST) - return; - if (rt->rt_flags & RTPRF_OURS) return; - if (!(rt->rt_flags & (RTF_WASCLONED | RTF_DYNAMIC))) + if (!(rt->rt_flags & RTF_DYNAMIC)) return; /* * If rtq_reallyold is 0, just delete the route without * waiting for a timeout cycle to kill it. */ if (V_rtq_reallyold != 0) { rt->rt_flags |= RTPRF_OURS; rt->rt_rmx.rmx_expire = time_uptime + V_rtq_reallyold; } else { rtexpunge(rt); } } struct rtqk_arg { struct radix_node_head *rnh; int draining; int killed; int found; int updating; time_t nextstop; }; /* * Get rid of old routes. When draining, this deletes everything, even when * the timeout is not expired yet. When updating, this makes sure that * nothing has a timeout longer than the current value of rtq_reallyold. 
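 * For illustration, with the defaults set in in_inithead() below: a route
 * marked RTPRF_OURS expires rtq_reallyold (3600) seconds after its last
 * reference is dropped.  If more than rtq_toomany (128) of them are still
 * alive after a cleanup pass, rtq_reallyold is cut to two thirds of its
 * value (never below rtq_minreallyold, 10 seconds), at most once per
 * rtq_timeout (600 second) interval.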
*/ static int in_rtqkill(struct radix_node *rn, void *rock) { INIT_VNET_INET(curvnet); struct rtqk_arg *ap = rock; struct rtentry *rt = (struct rtentry *)rn; int err; if (rt->rt_flags & RTPRF_OURS) { ap->found++; if (ap->draining || rt->rt_rmx.rmx_expire <= time_uptime) { if (rt->rt_refcnt > 0) panic("rtqkill route really not free"); err = in_rtrequest(RTM_DELETE, (struct sockaddr *)rt_key(rt), rt->rt_gateway, rt_mask(rt), rt->rt_flags, 0, rt->rt_fibnum); if (err) { log(LOG_WARNING, "in_rtqkill: error %d\n", err); } else { ap->killed++; } } else { if (ap->updating && (rt->rt_rmx.rmx_expire - time_uptime > V_rtq_reallyold)) { rt->rt_rmx.rmx_expire = time_uptime + V_rtq_reallyold; } ap->nextstop = lmin(ap->nextstop, rt->rt_rmx.rmx_expire); } } return 0; } #define RTQ_TIMEOUT 60*10 /* run no less than once every ten minutes */ #ifdef VIMAGE_GLOBALS static int rtq_timeout; static struct callout rtq_timer; #endif static void in_rtqtimo_one(void *rock); static void in_rtqtimo(void *rock) { int fibnum; void *newrock; struct timeval atv; KASSERT((rock == (void *)V_rt_tables[0][AF_INET]), ("in_rtqtimo: unexpected arg")); for (fibnum = 0; fibnum < rt_numfibs; fibnum++) { if ((newrock = V_rt_tables[fibnum][AF_INET]) != NULL) in_rtqtimo_one(newrock); } atv.tv_usec = 0; atv.tv_sec = V_rtq_timeout; callout_reset(&V_rtq_timer, tvtohz(&atv), in_rtqtimo, rock); } static void in_rtqtimo_one(void *rock) { INIT_VNET_INET(curvnet); struct radix_node_head *rnh = rock; struct rtqk_arg arg; static time_t last_adjusted_timeout = 0; arg.found = arg.killed = 0; arg.rnh = rnh; arg.nextstop = time_uptime + V_rtq_timeout; arg.draining = arg.updating = 0; RADIX_NODE_HEAD_LOCK(rnh); rnh->rnh_walktree(rnh, in_rtqkill, &arg); RADIX_NODE_HEAD_UNLOCK(rnh); /* * Attempt to be somewhat dynamic about this: * If there are ``too many'' routes sitting around taking up space, * then crank down the timeout, and see if we can't make some more * go away. However, we make sure that we will never adjust more * than once in rtq_timeout seconds, to keep from cranking down too * hard. */ if ((arg.found - arg.killed > V_rtq_toomany) && (time_uptime - last_adjusted_timeout >= V_rtq_timeout) && V_rtq_reallyold > V_rtq_minreallyold) { V_rtq_reallyold = 2 * V_rtq_reallyold / 3; if (V_rtq_reallyold < V_rtq_minreallyold) { V_rtq_reallyold = V_rtq_minreallyold; } last_adjusted_timeout = time_uptime; #ifdef DIAGNOSTIC log(LOG_DEBUG, "in_rtqtimo: adjusted rtq_reallyold to %d\n", V_rtq_reallyold); #endif arg.found = arg.killed = 0; arg.updating = 1; RADIX_NODE_HEAD_LOCK(rnh); rnh->rnh_walktree(rnh, in_rtqkill, &arg); RADIX_NODE_HEAD_UNLOCK(rnh); } } void in_rtqdrain(void) { VNET_ITERATOR_DECL(vnet_iter); struct radix_node_head *rnh; struct rtqk_arg arg; int fibnum; VNET_LIST_RLOCK(); VNET_FOREACH(vnet_iter) { CURVNET_SET(vnet_iter); INIT_VNET_NET(vnet_iter); for ( fibnum = 0; fibnum < rt_numfibs; fibnum++) { rnh = V_rt_tables[fibnum][AF_INET]; arg.found = arg.killed = 0; arg.rnh = rnh; arg.nextstop = 0; arg.draining = 1; arg.updating = 0; RADIX_NODE_HEAD_LOCK(rnh); rnh->rnh_walktree(rnh, in_rtqkill, &arg); RADIX_NODE_HEAD_UNLOCK(rnh); } CURVNET_RESTORE(); } VNET_LIST_RUNLOCK(); } static int _in_rt_was_here; /* * Initialize our routing tree. */ int in_inithead(void **head, int off) { INIT_VNET_INET(curvnet); struct radix_node_head *rnh; /* XXX MRT * This can be called from vfs_export.c too in which case 'off' * will be 0. We know the correct value so just use that and * return directly if it was 0. 
* This is a hack that replaces an even worse hack on a bad hack * on a bad design. After RELENG_7 this should be fixed but that * will change the ABI, so for now do it this way. */ if (!rn_inithead(head, 32)) return 0; if (off == 0) /* XXX MRT see above */ return 1; /* only do the rest for a real routing table */ V_rtq_reallyold = 60*60; /* one hour is "really old" */ V_rtq_minreallyold = 10; /* never automatically crank down to less */ V_rtq_toomany = 128; /* 128 cached routes is "too many" */ V_rtq_timeout = RTQ_TIMEOUT; rnh = *head; rnh->rnh_addaddr = in_addroute; rnh->rnh_matchaddr = in_matroute; rnh->rnh_close = in_clsroute; if (_in_rt_was_here == 0 ) { callout_init(&V_rtq_timer, CALLOUT_MPSAFE); in_rtqtimo(rnh); /* kick off timeout first time */ _in_rt_was_here = 1; } return 1; } /* * This zaps old routes when the interface goes down or interface * address is deleted. In the latter case, it deletes static routes * that point to this address. If we don't do this, we may end up * using the old address in the future. The ones we always want to * get rid of are things like ARP entries, since the user might down * the interface, walk over to a completely different network, and * plug back in. */ struct in_ifadown_arg { struct ifaddr *ifa; int del; }; static int in_ifadownkill(struct radix_node *rn, void *xap) { struct in_ifadown_arg *ap = xap; struct rtentry *rt = (struct rtentry *)rn; RT_LOCK(rt); if (rt->rt_ifa == ap->ifa && (ap->del || !(rt->rt_flags & RTF_STATIC))) { /* * We need to disable the automatic prune that happens * in this case in rtrequest() because it will blow * away the pointers that rn_walktree() needs in order * continue our descent. We will end up deleting all * the routes that rtrequest() would have in any case, * so that behavior is not needed there. */ - rt->rt_flags &= ~RTF_CLONING; rtexpunge(rt); } RT_UNLOCK(rt); return 0; } int in_ifadown(struct ifaddr *ifa, int delete) { INIT_VNET_NET(curvnet); struct in_ifadown_arg arg; struct radix_node_head *rnh; int fibnum; if (ifa->ifa_addr->sa_family != AF_INET) return 1; for ( fibnum = 0; fibnum < rt_numfibs; fibnum++) { rnh = V_rt_tables[fibnum][AF_INET]; arg.ifa = ifa; arg.del = delete; RADIX_NODE_HEAD_LOCK(rnh); rnh->rnh_walktree(rnh, in_ifadownkill, &arg); RADIX_NODE_HEAD_UNLOCK(rnh); ifa->ifa_flags &= ~IFA_ROUTE; /* XXXlocking? */ } return 0; } /* * inet versions of rt functions. These have fib extensions and * for now will just reference the _fib variants. 
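 * (For example, in_rtalloc1(dst, report, flags, fibnum) simply forwards to
 * rtalloc1_fib(dst, report, flags, fibnum) on the requested FIB.)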
* eventually this order will be reversed, */ void in_rtalloc_ign(struct route *ro, u_long ignflags, u_int fibnum) { rtalloc_ign_fib(ro, ignflags, fibnum); } int in_rtrequest( int req, struct sockaddr *dst, struct sockaddr *gateway, struct sockaddr *netmask, int flags, struct rtentry **ret_nrt, u_int fibnum) { return (rtrequest_fib(req, dst, gateway, netmask, flags, ret_nrt, fibnum)); } struct rtentry * in_rtalloc1(struct sockaddr *dst, int report, u_long ignflags, u_int fibnum) { return (rtalloc1_fib(dst, report, ignflags, fibnum)); } void in_rtredirect(struct sockaddr *dst, struct sockaddr *gateway, struct sockaddr *netmask, int flags, struct sockaddr *src, u_int fibnum) { rtredirect_fib(dst, gateway, netmask, flags, src, fibnum); } void in_rtalloc(struct route *ro, u_int fibnum) { rtalloc_ign_fib(ro, 0UL, fibnum); } #if 0 int in_rt_getifa(struct rt_addrinfo *, u_int fibnum); int in_rtioctl(u_long, caddr_t, u_int); int in_rtrequest1(int, struct rt_addrinfo *, struct rtentry **, u_int); #endif Index: head/sys/netinet/in_var.h =================================================================== --- head/sys/netinet/in_var.h (revision 186118) +++ head/sys/netinet/in_var.h (revision 186119) @@ -1,344 +1,347 @@ /*- * Copyright (c) 1985, 1986, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)in_var.h 8.2 (Berkeley) 1/9/95 * $FreeBSD$ */ #ifndef _NETINET_IN_VAR_H_ #define _NETINET_IN_VAR_H_ #include #include /* * Interface address, Internet version. One of these structures * is allocated for each Internet address on an interface. * The ifaddr structure contains the protocol-independent part * of the structure and is assumed to be first. 
*/ struct in_ifaddr { struct ifaddr ia_ifa; /* protocol-independent info */ #define ia_ifp ia_ifa.ifa_ifp #define ia_flags ia_ifa.ifa_flags /* ia_{,sub}net{,mask} in host order */ u_long ia_net; /* network number of interface */ u_long ia_netmask; /* mask of net part */ u_long ia_subnet; /* subnet number, including net */ u_long ia_subnetmask; /* mask of subnet part */ struct in_addr ia_netbroadcast; /* to recognize net broadcasts */ LIST_ENTRY(in_ifaddr) ia_hash; /* entry in bucket of inet addresses */ TAILQ_ENTRY(in_ifaddr) ia_link; /* list of internet addresses */ struct sockaddr_in ia_addr; /* reserve space for interface name */ struct sockaddr_in ia_dstaddr; /* reserve space for broadcast addr */ #define ia_broadaddr ia_dstaddr struct sockaddr_in ia_sockmask; /* reserve space for general netmask */ }; struct in_aliasreq { char ifra_name[IFNAMSIZ]; /* if name, e.g. "en0" */ struct sockaddr_in ifra_addr; struct sockaddr_in ifra_broadaddr; #define ifra_dstaddr ifra_broadaddr struct sockaddr_in ifra_mask; }; /* * Given a pointer to an in_ifaddr (ifaddr), * return a pointer to the addr as a sockaddr_in. */ #define IA_SIN(ia) (&(((struct in_ifaddr *)(ia))->ia_addr)) #define IA_DSTSIN(ia) (&(((struct in_ifaddr *)(ia))->ia_dstaddr)) #define IN_LNAOF(in, ifa) \ ((ntohl((in).s_addr) & ~((struct in_ifaddr *)(ifa)->ia_subnetmask)) #ifdef _KERNEL extern u_char inetctlerrmap[]; /* * Hash table for IP addresses. */ TAILQ_HEAD(in_ifaddrhead, in_ifaddr); LIST_HEAD(in_ifaddrhashhead, in_ifaddr); #ifdef VIMAGE_GLOBALS extern struct in_ifaddrhashhead *in_ifaddrhashtbl; extern struct in_ifaddrhead in_ifaddrhead; extern u_long in_ifaddrhmask; /* mask for hash table */ #endif #define INADDR_NHASH_LOG2 9 #define INADDR_NHASH (1 << INADDR_NHASH_LOG2) #define INADDR_HASHVAL(x) fnv_32_buf((&(x)), sizeof(x), FNV1_32_INIT) #define INADDR_HASH(x) \ (&V_in_ifaddrhashtbl[INADDR_HASHVAL(x) & V_in_ifaddrhmask]) /* * Macro for finding the internet address structure (in_ifaddr) * corresponding to one of our IP addresses (in_addr). */ #define INADDR_TO_IFADDR(addr, ia) \ /* struct in_addr addr; */ \ /* struct in_ifaddr *ia; */ \ do { \ \ LIST_FOREACH(ia, INADDR_HASH((addr).s_addr), ia_hash) \ if (IA_SIN(ia)->sin_addr.s_addr == (addr).s_addr) \ break; \ } while (0) /* * Macro for finding the interface (ifnet structure) corresponding to one * of our IP addresses. */ #define INADDR_TO_IFP(addr, ifp) \ /* struct in_addr addr; */ \ /* struct ifnet *ifp; */ \ { \ struct in_ifaddr *ia; \ \ INADDR_TO_IFADDR(addr, ia); \ (ifp) = (ia == NULL) ? NULL : ia->ia_ifp; \ } /* * Macro for finding the internet address structure (in_ifaddr) corresponding * to a given interface (ifnet structure). */ #define IFP_TO_IA(ifp, ia) \ /* struct ifnet *ifp; */ \ /* struct in_ifaddr *ia; */ \ { \ for ((ia) = TAILQ_FIRST(&V_in_ifaddrhead); \ (ia) != NULL && (ia)->ia_ifp != (ifp); \ (ia) = TAILQ_NEXT((ia), ia_link)) \ continue; \ } #endif /* * IP datagram reassembly. 
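 * The reassembly hash below (as used by ip_input.c) mixes two nibbles of the
 * datagram's source address with its IP ID and masks the result down to one
 * of IPREASS_NHASH (64) buckets.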
*/ #define IPREASS_NHASH_LOG2 6 #define IPREASS_NHASH (1 << IPREASS_NHASH_LOG2) #define IPREASS_HMASK (IPREASS_NHASH - 1) #define IPREASS_HASH(x,y) \ (((((x) & 0xF) | ((((x) >> 8) & 0xF) << 4)) ^ (y)) & IPREASS_HMASK) /* * This information should be part of the ifnet structure but we don't wish * to change that - as it might break a number of things */ struct router_info { struct ifnet *rti_ifp; int rti_type; /* type of router which is querier on this interface */ int rti_time; /* # of slow timeouts since last old query */ SLIST_ENTRY(router_info) rti_list; #ifdef notyet int rti_timev1; /* IGMPv1 querier present */ int rti_timev2; /* IGMPv2 querier present */ int rti_timer; /* report to general query */ int rti_qrv; /* querier robustness */ #endif }; /* * Internet multicast address structure. There is one of these for each IP * multicast group to which this host belongs on a given network interface. * For every entry on the interface's if_multiaddrs list which represents * an IP multicast group, there is one of these structures. They are also * kept on a system-wide list to make it easier to keep our legacy IGMP code * compatible with the rest of the world (see IN_FIRST_MULTI et al, below). */ struct in_multi { LIST_ENTRY(in_multi) inm_link; /* queue macro glue */ struct in_addr inm_addr; /* IP multicast address, convenience */ struct ifnet *inm_ifp; /* back pointer to ifnet */ struct ifmultiaddr *inm_ifma; /* back pointer to ifmultiaddr */ u_int inm_timer; /* IGMP membership report timer */ u_int inm_state; /* state of the membership */ struct router_info *inm_rti; /* router info*/ u_int inm_refcount; /* reference count */ #ifdef notyet /* IGMPv3 source-specific multicast fields */ TAILQ_HEAD(, in_msfentry) inm_msf; /* all active source filters */ TAILQ_HEAD(, in_msfentry) inm_msf_record; /* recorded sources */ TAILQ_HEAD(, in_msfentry) inm_msf_exclude; /* exclude sources */ TAILQ_HEAD(, in_msfentry) inm_msf_include; /* include sources */ /* XXX: should this lot go to the router_info structure? */ /* XXX: can/should these be callouts? */ /* IGMP protocol timers */ int32_t inm_ti_curstate; /* current state timer */ int32_t inm_ti_statechg; /* state change timer */ /* IGMP report timers */ uint16_t inm_rpt_statechg; /* state change report timer */ uint16_t inm_rpt_toxx; /* fmode change report timer */ /* IGMP protocol state */ uint16_t inm_fmode; /* filter mode */ uint32_t inm_recsrc_count; /* # of recorded sources */ uint16_t inm_exclude_sock_count; /* # of exclude-mode sockets */ uint16_t inm_gass_count; /* # of g-a-s queries */ #endif }; #ifdef notyet /* * Internet multicast source filter list. This list is used to store * IP multicast source addresses for each membership on an interface. * TODO: Allocate these structures using UMA. * TODO: Find an easier way of linking the struct into two lists at once. */ struct in_msfentry { TAILQ_ENTRY(in_msfentry) isf_link; /* next filter in all-list */ TAILQ_ENTRY(in_msfentry) isf_next; /* next filter in queue */ struct in_addr isf_addr; /* the address of this source */ uint16_t isf_refcount; /* reference count */ uint16_t isf_reporttag; /* what to report to the IGMP router */ uint16_t isf_rexmit; /* retransmission state/count */ }; #endif #ifdef _KERNEL #ifdef SYSCTL_DECL SYSCTL_DECL(_net_inet); SYSCTL_DECL(_net_inet_ip); SYSCTL_DECL(_net_inet_raw); #endif LIST_HEAD(in_multihead, in_multi); #ifdef VIMAGE_GLOBALS extern struct in_multihead in_multihead; #endif /* * Lock macros for IPv4 layer multicast address lists. 
IPv4 lock goes * before link layer multicast locks in the lock order. In most cases, * consumers of IN_*_MULTI() macros should acquire the locks before * calling them; users of the in_{add,del}multi() functions should not. */ extern struct mtx in_multi_mtx; #define IN_MULTI_LOCK() mtx_lock(&in_multi_mtx) #define IN_MULTI_UNLOCK() mtx_unlock(&in_multi_mtx) #define IN_MULTI_LOCK_ASSERT() mtx_assert(&in_multi_mtx, MA_OWNED) /* * Structure used by macros below to remember position when stepping through * all of the in_multi records. */ struct in_multistep { struct in_multi *i_inm; }; /* * Macro for looking up the in_multi record for a given IP multicast address * on a given interface. If no matching record is found, "inm" is set null. */ #define IN_LOOKUP_MULTI(addr, ifp, inm) \ /* struct in_addr addr; */ \ /* struct ifnet *ifp; */ \ /* struct in_multi *inm; */ \ do { \ struct ifmultiaddr *ifma; \ \ IN_MULTI_LOCK_ASSERT(); \ IF_ADDR_LOCK(ifp); \ TAILQ_FOREACH(ifma, &((ifp)->if_multiaddrs), ifma_link) { \ if (ifma->ifma_addr->sa_family == AF_INET \ && ((struct sockaddr_in *)ifma->ifma_addr)->sin_addr.s_addr == \ (addr).s_addr) \ break; \ } \ (inm) = ifma ? ifma->ifma_protospec : 0; \ IF_ADDR_UNLOCK(ifp); \ } while(0) /* * Macro to step through all of the in_multi records, one at a time. * The current position is remembered in "step", which the caller must * provide. IN_FIRST_MULTI(), below, must be called to initialize "step" * and get the first record. Both macros return a NULL "inm" when there * are no remaining records. */ #define IN_NEXT_MULTI(step, inm) \ /* struct in_multistep step; */ \ /* struct in_multi *inm; */ \ do { \ IN_MULTI_LOCK_ASSERT(); \ if (((inm) = (step).i_inm) != NULL) \ (step).i_inm = LIST_NEXT((step).i_inm, inm_link); \ } while(0) #define IN_FIRST_MULTI(step, inm) \ /* struct in_multistep step; */ \ /* struct in_multi *inm; */ \ do { \ IN_MULTI_LOCK_ASSERT(); \ (step).i_inm = LIST_FIRST(&V_in_multihead); \ IN_NEXT_MULTI((step), (inm)); \ } while(0) struct rtentry; struct route; struct ip_moptions; size_t imo_match_group(struct ip_moptions *, struct ifnet *, struct sockaddr *); struct in_msource *imo_match_source(struct ip_moptions *, size_t, struct sockaddr *); struct in_multi *in_addmulti(struct in_addr *, struct ifnet *); void in_delmulti(struct in_multi *); void in_delmulti_locked(struct in_multi *); int in_control(struct socket *, u_long, caddr_t, struct ifnet *, struct thread *); void in_rtqdrain(void); void ip_input(struct mbuf *); int in_ifadown(struct ifaddr *ifa, int); void in_ifscrub(struct ifnet *, struct in_ifaddr *); struct mbuf *ip_fastforward(struct mbuf *); +void *in_domifattach(struct ifnet *); +void in_domifdetach(struct ifnet *, void *); + /* XXX */ void in_rtalloc_ign(struct route *ro, u_long ignflags, u_int fibnum); void in_rtalloc(struct route *ro, u_int fibnum); struct rtentry *in_rtalloc1(struct sockaddr *, int, u_long, u_int); void in_rtredirect(struct sockaddr *, struct sockaddr *, struct sockaddr *, int, struct sockaddr *, u_int); int in_rtrequest(int, struct sockaddr *, struct sockaddr *, struct sockaddr *, int, struct rtentry **, u_int); #if 0 int in_rt_getifa(struct rt_addrinfo *, u_int fibnum); int in_rtioctl(u_long, caddr_t, u_int); int in_rtrequest1(int, struct rt_addrinfo *, struct rtentry **, u_int); #endif #endif /* _KERNEL */ /* INET6 stuff */ #include #endif /* _NETINET_IN_VAR_H_ */ Index: head/sys/netinet/ip_carp.c =================================================================== --- head/sys/netinet/ip_carp.c (revision 186118) 
+++ head/sys/netinet/ip_carp.c (revision 186119) @@ -1,2262 +1,2254 @@ /* * Copyright (c) 2002 Michael Shalayeff. All rights reserved. * Copyright (c) 2003 Ryan McBride. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR OR HIS RELATIVES BE LIABLE FOR ANY DIRECT, * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR * SERVICES; LOSS OF MIND, USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF * THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_carp.h" #include "opt_bpf.h" #include "opt_inet.h" #include "opt_inet6.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef INET #include #include #include #include #include #include #include #include #endif #ifdef INET6 #include #include #include #include #include #include #endif #include #include #define CARP_IFNAME "carp" static MALLOC_DEFINE(M_CARP, "CARP", "CARP interfaces"); SYSCTL_DECL(_net_inet_carp); struct carp_softc { struct ifnet *sc_ifp; /* Interface clue */ struct ifnet *sc_carpdev; /* Pointer to parent interface */ struct in_ifaddr *sc_ia; /* primary iface address */ struct ip_moptions sc_imo; #ifdef INET6 struct in6_ifaddr *sc_ia6; /* primary iface address v6 */ struct ip6_moptions sc_im6o; #endif /* INET6 */ TAILQ_ENTRY(carp_softc) sc_list; enum { INIT = 0, BACKUP, MASTER } sc_state; int sc_flags_backup; int sc_suppress; int sc_sendad_errors; #define CARP_SENDAD_MAX_ERRORS 3 int sc_sendad_success; #define CARP_SENDAD_MIN_SUCCESS 3 int sc_vhid; int sc_advskew; int sc_naddrs; int sc_naddrs6; int sc_advbase; /* seconds */ int sc_init_counter; u_int64_t sc_counter; /* authentication */ #define CARP_HMAC_PAD 64 unsigned char sc_key[CARP_KEY_LEN]; unsigned char sc_pad[CARP_HMAC_PAD]; SHA1_CTX sc_sha1; struct callout sc_ad_tmo; /* advertisement timeout */ struct callout sc_md_tmo; /* master down timeout */ struct callout sc_md6_tmo; /* master down timeout */ LIST_ENTRY(carp_softc) sc_next; /* Interface clue */ }; #define SC2IFP(sc) ((sc)->sc_ifp) int carp_suppress_preempt = 0; int carp_opts[CARPCTL_MAXID] = { 0, 1, 0, 1, 0, 0 }; /* XXX for now */ SYSCTL_INT(_net_inet_carp, CARPCTL_ALLOW, allow, CTLFLAG_RW, &carp_opts[CARPCTL_ALLOW], 0, "Accept incoming CARP packets"); SYSCTL_INT(_net_inet_carp, CARPCTL_PREEMPT, preempt, CTLFLAG_RW, &carp_opts[CARPCTL_PREEMPT], 0, "high-priority backup preemption 
mode"); SYSCTL_INT(_net_inet_carp, CARPCTL_LOG, log, CTLFLAG_RW, &carp_opts[CARPCTL_LOG], 0, "log bad carp packets"); SYSCTL_INT(_net_inet_carp, CARPCTL_ARPBALANCE, arpbalance, CTLFLAG_RW, &carp_opts[CARPCTL_ARPBALANCE], 0, "balance arp responses"); SYSCTL_INT(_net_inet_carp, OID_AUTO, suppress_preempt, CTLFLAG_RD, &carp_suppress_preempt, 0, "Preemption is suppressed"); struct carpstats carpstats; SYSCTL_STRUCT(_net_inet_carp, CARPCTL_STATS, stats, CTLFLAG_RW, &carpstats, carpstats, "CARP statistics (struct carpstats, netinet/ip_carp.h)"); struct carp_if { TAILQ_HEAD(, carp_softc) vhif_vrs; int vhif_nvrs; struct ifnet *vhif_ifp; struct mtx vhif_mtx; }; /* Get carp_if from softc. Valid after carp_set_addr{,6}. */ #define SC2CIF(sc) ((struct carp_if *)(sc)->sc_carpdev->if_carp) /* lock per carp_if queue */ #define CARP_LOCK_INIT(cif) mtx_init(&(cif)->vhif_mtx, "carp_if", \ NULL, MTX_DEF) #define CARP_LOCK_DESTROY(cif) mtx_destroy(&(cif)->vhif_mtx) #define CARP_LOCK_ASSERT(cif) mtx_assert(&(cif)->vhif_mtx, MA_OWNED) #define CARP_LOCK(cif) mtx_lock(&(cif)->vhif_mtx) #define CARP_UNLOCK(cif) mtx_unlock(&(cif)->vhif_mtx) #define CARP_SCLOCK(sc) mtx_lock(&SC2CIF(sc)->vhif_mtx) #define CARP_SCUNLOCK(sc) mtx_unlock(&SC2CIF(sc)->vhif_mtx) #define CARP_SCLOCK_ASSERT(sc) mtx_assert(&SC2CIF(sc)->vhif_mtx, MA_OWNED) #define CARP_LOG(...) do { \ if (carp_opts[CARPCTL_LOG] > 0) \ log(LOG_INFO, __VA_ARGS__); \ } while (0) #define CARP_DEBUG(...) do { \ if (carp_opts[CARPCTL_LOG] > 1) \ log(LOG_DEBUG, __VA_ARGS__); \ } while (0) static void carp_hmac_prepare(struct carp_softc *); static void carp_hmac_generate(struct carp_softc *, u_int32_t *, unsigned char *); static int carp_hmac_verify(struct carp_softc *, u_int32_t *, unsigned char *); static void carp_setroute(struct carp_softc *, int); static void carp_input_c(struct mbuf *, struct carp_header *, sa_family_t); static int carp_clone_create(struct if_clone *, int, caddr_t); static void carp_clone_destroy(struct ifnet *); static void carpdetach(struct carp_softc *, int); static int carp_prepare_ad(struct mbuf *, struct carp_softc *, struct carp_header *); static void carp_send_ad_all(void); static void carp_send_ad(void *); static void carp_send_ad_locked(struct carp_softc *); static void carp_send_arp(struct carp_softc *); static void carp_master_down(void *); static void carp_master_down_locked(struct carp_softc *); static int carp_ioctl(struct ifnet *, u_long, caddr_t); static int carp_looutput(struct ifnet *, struct mbuf *, struct sockaddr *, struct rtentry *); static void carp_start(struct ifnet *); static void carp_setrun(struct carp_softc *, sa_family_t); static void carp_set_state(struct carp_softc *, int); static int carp_addrcount(struct carp_if *, struct in_ifaddr *, int); enum { CARP_COUNT_MASTER, CARP_COUNT_RUNNING }; static void carp_multicast_cleanup(struct carp_softc *); static int carp_set_addr(struct carp_softc *, struct sockaddr_in *); static int carp_del_addr(struct carp_softc *, struct sockaddr_in *); static void carp_carpdev_state_locked(struct carp_if *); static void carp_sc_state_locked(struct carp_softc *); #ifdef INET6 static void carp_send_na(struct carp_softc *); static int carp_set_addr6(struct carp_softc *, struct sockaddr_in6 *); static int carp_del_addr6(struct carp_softc *, struct sockaddr_in6 *); static void carp_multicast6_cleanup(struct carp_softc *); #endif static LIST_HEAD(, carp_softc) carpif_list; static struct mtx carp_mtx; IFC_SIMPLE_DECLARE(carp, 0); static eventhandler_tag if_detach_event_tag; static __inline 
u_int16_t carp_cksum(struct mbuf *m, int len) { return (in_cksum(m, len)); } static void carp_hmac_prepare(struct carp_softc *sc) { u_int8_t version = CARP_VERSION, type = CARP_ADVERTISEMENT; u_int8_t vhid = sc->sc_vhid & 0xff; struct ifaddr *ifa; int i, found; #ifdef INET struct in_addr last, cur, in; #endif #ifdef INET6 struct in6_addr last6, cur6, in6; #endif if (sc->sc_carpdev) CARP_SCLOCK(sc); /* XXX: possible race here */ /* compute ipad from key */ bzero(sc->sc_pad, sizeof(sc->sc_pad)); bcopy(sc->sc_key, sc->sc_pad, sizeof(sc->sc_key)); for (i = 0; i < sizeof(sc->sc_pad); i++) sc->sc_pad[i] ^= 0x36; /* precompute first part of inner hash */ SHA1Init(&sc->sc_sha1); SHA1Update(&sc->sc_sha1, sc->sc_pad, sizeof(sc->sc_pad)); SHA1Update(&sc->sc_sha1, (void *)&version, sizeof(version)); SHA1Update(&sc->sc_sha1, (void *)&type, sizeof(type)); SHA1Update(&sc->sc_sha1, (void *)&vhid, sizeof(vhid)); #ifdef INET cur.s_addr = 0; do { found = 0; last = cur; cur.s_addr = 0xffffffff; TAILQ_FOREACH(ifa, &SC2IFP(sc)->if_addrlist, ifa_list) { in.s_addr = ifatoia(ifa)->ia_addr.sin_addr.s_addr; if (ifa->ifa_addr->sa_family == AF_INET && ntohl(in.s_addr) > ntohl(last.s_addr) && ntohl(in.s_addr) < ntohl(cur.s_addr)) { cur.s_addr = in.s_addr; found++; } } if (found) SHA1Update(&sc->sc_sha1, (void *)&cur, sizeof(cur)); } while (found); #endif /* INET */ #ifdef INET6 memset(&cur6, 0, sizeof(cur6)); do { found = 0; last6 = cur6; memset(&cur6, 0xff, sizeof(cur6)); TAILQ_FOREACH(ifa, &SC2IFP(sc)->if_addrlist, ifa_list) { in6 = ifatoia6(ifa)->ia_addr.sin6_addr; if (IN6_IS_SCOPE_EMBED(&in6)) in6.s6_addr16[1] = 0; if (ifa->ifa_addr->sa_family == AF_INET6 && memcmp(&in6, &last6, sizeof(in6)) > 0 && memcmp(&in6, &cur6, sizeof(in6)) < 0) { cur6 = in6; found++; } } if (found) SHA1Update(&sc->sc_sha1, (void *)&cur6, sizeof(cur6)); } while (found); #endif /* INET6 */ /* convert ipad to opad */ for (i = 0; i < sizeof(sc->sc_pad); i++) sc->sc_pad[i] ^= 0x36 ^ 0x5c; if (sc->sc_carpdev) CARP_SCUNLOCK(sc); } static void carp_hmac_generate(struct carp_softc *sc, u_int32_t counter[2], unsigned char md[20]) { SHA1_CTX sha1ctx; /* fetch first half of inner hash */ bcopy(&sc->sc_sha1, &sha1ctx, sizeof(sha1ctx)); SHA1Update(&sha1ctx, (void *)counter, sizeof(sc->sc_counter)); SHA1Final(md, &sha1ctx); /* outer hash */ SHA1Init(&sha1ctx); SHA1Update(&sha1ctx, sc->sc_pad, sizeof(sc->sc_pad)); SHA1Update(&sha1ctx, md, 20); SHA1Final(md, &sha1ctx); } static int carp_hmac_verify(struct carp_softc *sc, u_int32_t counter[2], unsigned char md[20]) { unsigned char md2[20]; CARP_SCLOCK_ASSERT(sc); carp_hmac_generate(sc, counter, md2); return (bcmp(md, md2, sizeof(md2))); } static void carp_setroute(struct carp_softc *sc, int cmd) { struct ifaddr *ifa; int s; if (sc->sc_carpdev) CARP_SCLOCK_ASSERT(sc); s = splnet(); TAILQ_FOREACH(ifa, &SC2IFP(sc)->if_addrlist, ifa_list) { if (ifa->ifa_addr->sa_family == AF_INET && sc->sc_carpdev != NULL) { int count = carp_addrcount( (struct carp_if *)sc->sc_carpdev->if_carp, ifatoia(ifa), CARP_COUNT_MASTER); if ((cmd == RTM_ADD && count == 1) || (cmd == RTM_DELETE && count == 0)) rtinit(ifa, cmd, RTF_UP | RTF_HOST); } -#ifdef INET6 - if (ifa->ifa_addr->sa_family == AF_INET6) { - if (cmd == RTM_ADD) - in6_ifaddloop(ifa); - else - in6_ifremloop(ifa); - } -#endif /* INET6 */ } splx(s); } static int carp_clone_create(struct if_clone *ifc, int unit, caddr_t params) { struct carp_softc *sc; struct ifnet *ifp; sc = malloc(sizeof(*sc), M_CARP, M_WAITOK|M_ZERO); ifp = SC2IFP(sc) = if_alloc(IFT_ETHER); if (ifp == NULL) 
{ free(sc, M_CARP); return (ENOSPC); } sc->sc_flags_backup = 0; sc->sc_suppress = 0; sc->sc_advbase = CARP_DFLTINTV; sc->sc_vhid = -1; /* required setting */ sc->sc_advskew = 0; sc->sc_init_counter = 1; sc->sc_naddrs = sc->sc_naddrs6 = 0; /* M_ZERO? */ #ifdef INET6 sc->sc_im6o.im6o_multicast_hlim = CARP_DFLTTL; #endif sc->sc_imo.imo_membership = (struct in_multi **)malloc( (sizeof(struct in_multi *) * IP_MIN_MEMBERSHIPS), M_CARP, M_WAITOK); sc->sc_imo.imo_mfilters = NULL; sc->sc_imo.imo_max_memberships = IP_MIN_MEMBERSHIPS; sc->sc_imo.imo_multicast_vif = -1; callout_init(&sc->sc_ad_tmo, CALLOUT_MPSAFE); callout_init(&sc->sc_md_tmo, CALLOUT_MPSAFE); callout_init(&sc->sc_md6_tmo, CALLOUT_MPSAFE); ifp->if_softc = sc; if_initname(ifp, CARP_IFNAME, unit); ifp->if_mtu = ETHERMTU; ifp->if_flags = IFF_LOOPBACK; ifp->if_ioctl = carp_ioctl; ifp->if_output = carp_looutput; ifp->if_start = carp_start; ifp->if_type = IFT_CARP; ifp->if_snd.ifq_maxlen = ifqmaxlen; ifp->if_hdrlen = 0; if_attach(ifp); bpfattach(SC2IFP(sc), DLT_NULL, sizeof(u_int32_t)); mtx_lock(&carp_mtx); LIST_INSERT_HEAD(&carpif_list, sc, sc_next); mtx_unlock(&carp_mtx); return (0); } static void carp_clone_destroy(struct ifnet *ifp) { struct carp_softc *sc = ifp->if_softc; if (sc->sc_carpdev) CARP_SCLOCK(sc); carpdetach(sc, 1); /* Returns unlocked. */ mtx_lock(&carp_mtx); LIST_REMOVE(sc, sc_next); mtx_unlock(&carp_mtx); bpfdetach(ifp); if_detach(ifp); if_free_type(ifp, IFT_ETHER); free(sc->sc_imo.imo_membership, M_CARP); free(sc, M_CARP); } /* * This function can be called on CARP interface destroy path, * and in case of the removal of the underlying interface as * well. We differentiate these two cases. In the latter case * we do not cleanup our multicast memberships, since they * are already freed. Also, in the latter case we do not * release the lock on return, because the function will be * called once more, for another CARP instance on the same * interface. */ static void carpdetach(struct carp_softc *sc, int unlock) { struct carp_if *cif; callout_stop(&sc->sc_ad_tmo); callout_stop(&sc->sc_md_tmo); callout_stop(&sc->sc_md6_tmo); if (sc->sc_suppress) carp_suppress_preempt--; sc->sc_suppress = 0; if (sc->sc_sendad_errors >= CARP_SENDAD_MAX_ERRORS) carp_suppress_preempt--; sc->sc_sendad_errors = 0; carp_set_state(sc, INIT); SC2IFP(sc)->if_flags &= ~IFF_UP; carp_setrun(sc, 0); if (unlock) carp_multicast_cleanup(sc); #ifdef INET6 carp_multicast6_cleanup(sc); #endif if (sc->sc_carpdev != NULL) { cif = (struct carp_if *)sc->sc_carpdev->if_carp; CARP_LOCK_ASSERT(cif); TAILQ_REMOVE(&cif->vhif_vrs, sc, sc_list); if (!--cif->vhif_nvrs) { ifpromisc(sc->sc_carpdev, 0); sc->sc_carpdev->if_carp = NULL; CARP_LOCK_DESTROY(cif); free(cif, M_IFADDR); } else if (unlock) CARP_UNLOCK(cif); sc->sc_carpdev = NULL; } } /* Detach an interface from the carp. */ static void carp_ifdetach(void *arg __unused, struct ifnet *ifp) { struct carp_if *cif = (struct carp_if *)ifp->if_carp; struct carp_softc *sc, *nextsc; if (cif == NULL) return; /* * XXX: At the end of for() cycle the lock will be destroyed. */ CARP_LOCK(cif); for (sc = TAILQ_FIRST(&cif->vhif_vrs); sc; sc = nextsc) { nextsc = TAILQ_NEXT(sc, sc_list); carpdetach(sc, 0); } } /* * process input packet. * we have rearranged checks order compared to the rfc, * but it seems more efficient this way or not possible otherwise. 
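 * In outline: drop everything unless net.inet.carp.allow is set, require the
 * receiving interface to carry a carp_if, require an IP TTL of exactly 255
 * (CARP_DFLTTL), validate the length and the CARP checksum, and only then
 * hand the packet to carp_input_c() for VHID lookup and state handling.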
*/ void carp_input(struct mbuf *m, int hlen) { struct ip *ip = mtod(m, struct ip *); struct carp_header *ch; int iplen, len; carpstats.carps_ipackets++; if (!carp_opts[CARPCTL_ALLOW]) { m_freem(m); return; } /* check if received on a valid carp interface */ if (m->m_pkthdr.rcvif->if_carp == NULL) { carpstats.carps_badif++; CARP_LOG("carp_input: packet received on non-carp " "interface: %s\n", m->m_pkthdr.rcvif->if_xname); m_freem(m); return; } /* verify that the IP TTL is 255. */ if (ip->ip_ttl != CARP_DFLTTL) { carpstats.carps_badttl++; CARP_LOG("carp_input: received ttl %d != 255i on %s\n", ip->ip_ttl, m->m_pkthdr.rcvif->if_xname); m_freem(m); return; } iplen = ip->ip_hl << 2; if (m->m_pkthdr.len < iplen + sizeof(*ch)) { carpstats.carps_badlen++; CARP_LOG("carp_input: received len %zd < " "sizeof(struct carp_header)\n", m->m_len - sizeof(struct ip)); m_freem(m); return; } if (iplen + sizeof(*ch) < m->m_len) { if ((m = m_pullup(m, iplen + sizeof(*ch))) == NULL) { carpstats.carps_hdrops++; CARP_LOG("carp_input: pullup failed\n"); return; } ip = mtod(m, struct ip *); } ch = (struct carp_header *)((char *)ip + iplen); /* * verify that the received packet length is * equal to the CARP header */ len = iplen + sizeof(*ch); if (len > m->m_pkthdr.len) { carpstats.carps_badlen++; CARP_LOG("carp_input: packet too short %d on %s\n", m->m_pkthdr.len, m->m_pkthdr.rcvif->if_xname); m_freem(m); return; } if ((m = m_pullup(m, len)) == NULL) { carpstats.carps_hdrops++; return; } ip = mtod(m, struct ip *); ch = (struct carp_header *)((char *)ip + iplen); /* verify the CARP checksum */ m->m_data += iplen; if (carp_cksum(m, len - iplen)) { carpstats.carps_badsum++; CARP_LOG("carp_input: checksum failed on %s\n", m->m_pkthdr.rcvif->if_xname); m_freem(m); return; } m->m_data -= iplen; carp_input_c(m, ch, AF_INET); } #ifdef INET6 int carp6_input(struct mbuf **mp, int *offp, int proto) { struct mbuf *m = *mp; struct ip6_hdr *ip6 = mtod(m, struct ip6_hdr *); struct carp_header *ch; u_int len; carpstats.carps_ipackets6++; if (!carp_opts[CARPCTL_ALLOW]) { m_freem(m); return (IPPROTO_DONE); } /* check if received on a valid carp interface */ if (m->m_pkthdr.rcvif->if_carp == NULL) { carpstats.carps_badif++; CARP_LOG("carp6_input: packet received on non-carp " "interface: %s\n", m->m_pkthdr.rcvif->if_xname); m_freem(m); return (IPPROTO_DONE); } /* verify that the IP TTL is 255 */ if (ip6->ip6_hlim != CARP_DFLTTL) { carpstats.carps_badttl++; CARP_LOG("carp6_input: received ttl %d != 255 on %s\n", ip6->ip6_hlim, m->m_pkthdr.rcvif->if_xname); m_freem(m); return (IPPROTO_DONE); } /* verify that we have a complete carp packet */ len = m->m_len; IP6_EXTHDR_GET(ch, struct carp_header *, m, *offp, sizeof(*ch)); if (ch == NULL) { carpstats.carps_badlen++; CARP_LOG("carp6_input: packet size %u too small\n", len); return (IPPROTO_DONE); } /* verify the CARP checksum */ m->m_data += *offp; if (carp_cksum(m, sizeof(*ch))) { carpstats.carps_badsum++; CARP_LOG("carp6_input: checksum failed, on %s\n", m->m_pkthdr.rcvif->if_xname); m_freem(m); return (IPPROTO_DONE); } m->m_data -= *offp; carp_input_c(m, ch, AF_INET6); return (IPPROTO_DONE); } #endif /* INET6 */ static void carp_input_c(struct mbuf *m, struct carp_header *ch, sa_family_t af) { struct ifnet *ifp = m->m_pkthdr.rcvif; struct carp_softc *sc; u_int64_t tmp_counter; struct timeval sc_tv, ch_tv; /* verify that the VHID is valid on the receiving interface */ CARP_LOCK(ifp->if_carp); TAILQ_FOREACH(sc, &((struct carp_if *)ifp->if_carp)->vhif_vrs, sc_list) if (sc->sc_vhid == 
ch->carp_vhid) break; if (!sc || !((SC2IFP(sc)->if_flags & IFF_UP) && (SC2IFP(sc)->if_drv_flags & IFF_DRV_RUNNING))) { carpstats.carps_badvhid++; CARP_UNLOCK(ifp->if_carp); m_freem(m); return; } getmicrotime(&SC2IFP(sc)->if_lastchange); SC2IFP(sc)->if_ipackets++; SC2IFP(sc)->if_ibytes += m->m_pkthdr.len; if (bpf_peers_present(SC2IFP(sc)->if_bpf)) { struct ip *ip = mtod(m, struct ip *); uint32_t af1 = af; /* BPF wants net byte order */ ip->ip_len = htons(ip->ip_len + (ip->ip_hl << 2)); ip->ip_off = htons(ip->ip_off); bpf_mtap2(SC2IFP(sc)->if_bpf, &af1, sizeof(af1), m); } /* verify the CARP version. */ if (ch->carp_version != CARP_VERSION) { carpstats.carps_badver++; SC2IFP(sc)->if_ierrors++; CARP_UNLOCK(ifp->if_carp); CARP_LOG("%s; invalid version %d\n", SC2IFP(sc)->if_xname, ch->carp_version); m_freem(m); return; } /* verify the hash */ if (carp_hmac_verify(sc, ch->carp_counter, ch->carp_md)) { carpstats.carps_badauth++; SC2IFP(sc)->if_ierrors++; CARP_UNLOCK(ifp->if_carp); CARP_LOG("%s: incorrect hash\n", SC2IFP(sc)->if_xname); m_freem(m); return; } tmp_counter = ntohl(ch->carp_counter[0]); tmp_counter = tmp_counter<<32; tmp_counter += ntohl(ch->carp_counter[1]); /* XXX Replay protection goes here */ sc->sc_init_counter = 0; sc->sc_counter = tmp_counter; sc_tv.tv_sec = sc->sc_advbase; if (carp_suppress_preempt && sc->sc_advskew < 240) sc_tv.tv_usec = 240 * 1000000 / 256; else sc_tv.tv_usec = sc->sc_advskew * 1000000 / 256; ch_tv.tv_sec = ch->carp_advbase; ch_tv.tv_usec = ch->carp_advskew * 1000000 / 256; switch (sc->sc_state) { case INIT: break; case MASTER: /* * If we receive an advertisement from a master who's going to * be more frequent than us, go into BACKUP state. */ if (timevalcmp(&sc_tv, &ch_tv, >) || timevalcmp(&sc_tv, &ch_tv, ==)) { callout_stop(&sc->sc_ad_tmo); CARP_DEBUG("%s: MASTER -> BACKUP " "(more frequent advertisement received)\n", SC2IFP(sc)->if_xname); carp_set_state(sc, BACKUP); carp_setrun(sc, 0); carp_setroute(sc, RTM_DELETE); } break; case BACKUP: /* * If we're pre-empting masters who advertise slower than us, * and this one claims to be slower, treat him as down. */ if (carp_opts[CARPCTL_PREEMPT] && timevalcmp(&sc_tv, &ch_tv, <)) { CARP_DEBUG("%s: BACKUP -> MASTER " "(preempting a slower master)\n", SC2IFP(sc)->if_xname); carp_master_down_locked(sc); break; } /* * If the master is going to advertise at such a low frequency * that he's guaranteed to time out, we'd might as well just * treat him as timed out now. */ sc_tv.tv_sec = sc->sc_advbase * 3; if (timevalcmp(&sc_tv, &ch_tv, <)) { CARP_DEBUG("%s: BACKUP -> MASTER " "(master timed out)\n", SC2IFP(sc)->if_xname); carp_master_down_locked(sc); break; } /* * Otherwise, we reset the counter and wait for the next * advertisement. 
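 * As a worked example: an advertisement interval is advbase + advskew/256
 * seconds, so a master running with advbase 1 and advskew 100 advertises
 * roughly every 1.39 seconds; with net.inet.carp.preempt enabled a backup
 * whose own interval is shorter will take over, and a master whose
 * advertised interval exceeds roughly 3 * advbase seconds is treated as
 * already timed out.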
*/ carp_setrun(sc, af); break; } CARP_UNLOCK(ifp->if_carp); m_freem(m); return; } static int carp_prepare_ad(struct mbuf *m, struct carp_softc *sc, struct carp_header *ch) { struct m_tag *mtag; struct ifnet *ifp = SC2IFP(sc); if (sc->sc_init_counter) { /* this could also be seconds since unix epoch */ sc->sc_counter = arc4random(); sc->sc_counter = sc->sc_counter << 32; sc->sc_counter += arc4random(); } else sc->sc_counter++; ch->carp_counter[0] = htonl((sc->sc_counter>>32)&0xffffffff); ch->carp_counter[1] = htonl(sc->sc_counter&0xffffffff); carp_hmac_generate(sc, ch->carp_counter, ch->carp_md); /* Tag packet for carp_output */ mtag = m_tag_get(PACKET_TAG_CARP, sizeof(struct ifnet *), M_NOWAIT); if (mtag == NULL) { m_freem(m); SC2IFP(sc)->if_oerrors++; return (ENOMEM); } bcopy(&ifp, (caddr_t)(mtag + 1), sizeof(struct ifnet *)); m_tag_prepend(m, mtag); return (0); } static void carp_send_ad_all(void) { struct carp_softc *sc; mtx_lock(&carp_mtx); LIST_FOREACH(sc, &carpif_list, sc_next) { if (sc->sc_carpdev == NULL) continue; CARP_SCLOCK(sc); if ((SC2IFP(sc)->if_flags & IFF_UP) && (SC2IFP(sc)->if_drv_flags & IFF_DRV_RUNNING) && sc->sc_state == MASTER) carp_send_ad_locked(sc); CARP_SCUNLOCK(sc); } mtx_unlock(&carp_mtx); } static void carp_send_ad(void *v) { struct carp_softc *sc = v; CARP_SCLOCK(sc); carp_send_ad_locked(sc); CARP_SCUNLOCK(sc); } static void carp_send_ad_locked(struct carp_softc *sc) { struct carp_header ch; struct timeval tv; struct carp_header *ch_ptr; struct mbuf *m; int len, advbase, advskew; CARP_SCLOCK_ASSERT(sc); /* bow out if we've lost our UPness or RUNNINGuiness */ if (!((SC2IFP(sc)->if_flags & IFF_UP) && (SC2IFP(sc)->if_drv_flags & IFF_DRV_RUNNING))) { advbase = 255; advskew = 255; } else { advbase = sc->sc_advbase; if (!carp_suppress_preempt || sc->sc_advskew > 240) advskew = sc->sc_advskew; else advskew = 240; tv.tv_sec = advbase; tv.tv_usec = advskew * 1000000 / 256; } ch.carp_version = CARP_VERSION; ch.carp_type = CARP_ADVERTISEMENT; ch.carp_vhid = sc->sc_vhid; ch.carp_advbase = advbase; ch.carp_advskew = advskew; ch.carp_authlen = 7; /* XXX DEFINE */ ch.carp_pad1 = 0; /* must be zero */ ch.carp_cksum = 0; #ifdef INET INIT_VNET_INET(curvnet); if (sc->sc_ia) { struct ip *ip; MGETHDR(m, M_DONTWAIT, MT_HEADER); if (m == NULL) { SC2IFP(sc)->if_oerrors++; carpstats.carps_onomem++; /* XXX maybe less ? 
*/ if (advbase != 255 || advskew != 255) callout_reset(&sc->sc_ad_tmo, tvtohz(&tv), carp_send_ad, sc); return; } len = sizeof(*ip) + sizeof(ch); m->m_pkthdr.len = len; m->m_pkthdr.rcvif = NULL; m->m_len = len; MH_ALIGN(m, m->m_len); m->m_flags |= M_MCAST; ip = mtod(m, struct ip *); ip->ip_v = IPVERSION; ip->ip_hl = sizeof(*ip) >> 2; ip->ip_tos = IPTOS_LOWDELAY; ip->ip_len = len; ip->ip_id = ip_newid(); ip->ip_off = IP_DF; ip->ip_ttl = CARP_DFLTTL; ip->ip_p = IPPROTO_CARP; ip->ip_sum = 0; ip->ip_src.s_addr = sc->sc_ia->ia_addr.sin_addr.s_addr; ip->ip_dst.s_addr = htonl(INADDR_CARP_GROUP); ch_ptr = (struct carp_header *)(&ip[1]); bcopy(&ch, ch_ptr, sizeof(ch)); if (carp_prepare_ad(m, sc, ch_ptr)) return; m->m_data += sizeof(*ip); ch_ptr->carp_cksum = carp_cksum(m, len - sizeof(*ip)); m->m_data -= sizeof(*ip); getmicrotime(&SC2IFP(sc)->if_lastchange); SC2IFP(sc)->if_opackets++; SC2IFP(sc)->if_obytes += len; carpstats.carps_opackets++; if (ip_output(m, NULL, NULL, IP_RAWOUTPUT, &sc->sc_imo, NULL)) { SC2IFP(sc)->if_oerrors++; if (sc->sc_sendad_errors < INT_MAX) sc->sc_sendad_errors++; if (sc->sc_sendad_errors == CARP_SENDAD_MAX_ERRORS) { carp_suppress_preempt++; if (carp_suppress_preempt == 1) { CARP_SCUNLOCK(sc); carp_send_ad_all(); CARP_SCLOCK(sc); } } sc->sc_sendad_success = 0; } else { if (sc->sc_sendad_errors >= CARP_SENDAD_MAX_ERRORS) { if (++sc->sc_sendad_success >= CARP_SENDAD_MIN_SUCCESS) { carp_suppress_preempt--; sc->sc_sendad_errors = 0; } } else sc->sc_sendad_errors = 0; } } #endif /* INET */ #ifdef INET6 if (sc->sc_ia6) { struct ip6_hdr *ip6; MGETHDR(m, M_DONTWAIT, MT_HEADER); if (m == NULL) { SC2IFP(sc)->if_oerrors++; carpstats.carps_onomem++; /* XXX maybe less ? */ if (advbase != 255 || advskew != 255) callout_reset(&sc->sc_ad_tmo, tvtohz(&tv), carp_send_ad, sc); return; } len = sizeof(*ip6) + sizeof(ch); m->m_pkthdr.len = len; m->m_pkthdr.rcvif = NULL; m->m_len = len; MH_ALIGN(m, m->m_len); m->m_flags |= M_MCAST; ip6 = mtod(m, struct ip6_hdr *); bzero(ip6, sizeof(*ip6)); ip6->ip6_vfc |= IPV6_VERSION; ip6->ip6_hlim = CARP_DFLTTL; ip6->ip6_nxt = IPPROTO_CARP; bcopy(&sc->sc_ia6->ia_addr.sin6_addr, &ip6->ip6_src, sizeof(struct in6_addr)); /* set the multicast destination */ ip6->ip6_dst.s6_addr16[0] = htons(0xff02); ip6->ip6_dst.s6_addr8[15] = 0x12; if (in6_setscope(&ip6->ip6_dst, sc->sc_carpdev, NULL) != 0) { SC2IFP(sc)->if_oerrors++; m_freem(m); CARP_LOG("%s: in6_setscope failed\n", __func__); return; } ch_ptr = (struct carp_header *)(&ip6[1]); bcopy(&ch, ch_ptr, sizeof(ch)); if (carp_prepare_ad(m, sc, ch_ptr)) return; m->m_data += sizeof(*ip6); ch_ptr->carp_cksum = carp_cksum(m, len - sizeof(*ip6)); m->m_data -= sizeof(*ip6); getmicrotime(&SC2IFP(sc)->if_lastchange); SC2IFP(sc)->if_opackets++; SC2IFP(sc)->if_obytes += len; carpstats.carps_opackets6++; if (ip6_output(m, NULL, NULL, 0, &sc->sc_im6o, NULL, NULL)) { SC2IFP(sc)->if_oerrors++; if (sc->sc_sendad_errors < INT_MAX) sc->sc_sendad_errors++; if (sc->sc_sendad_errors == CARP_SENDAD_MAX_ERRORS) { carp_suppress_preempt++; if (carp_suppress_preempt == 1) { CARP_SCUNLOCK(sc); carp_send_ad_all(); CARP_SCLOCK(sc); } } sc->sc_sendad_success = 0; } else { if (sc->sc_sendad_errors >= CARP_SENDAD_MAX_ERRORS) { if (++sc->sc_sendad_success >= CARP_SENDAD_MIN_SUCCESS) { carp_suppress_preempt--; sc->sc_sendad_errors = 0; } } else sc->sc_sendad_errors = 0; } } #endif /* INET6 */ if (advbase != 255 || advskew != 255) callout_reset(&sc->sc_ad_tmo, tvtohz(&tv), carp_send_ad, sc); } /* * Broadcast a gratuitous ARP request containing * the 
virtual router MAC address for each IP address * associated with the virtual router. */ static void carp_send_arp(struct carp_softc *sc) { struct ifaddr *ifa; TAILQ_FOREACH(ifa, &SC2IFP(sc)->if_addrlist, ifa_list) { if (ifa->ifa_addr->sa_family != AF_INET) continue; /* arprequest(sc->sc_carpdev, &in, &in, IF_LLADDR(sc->sc_ifp)); */ arp_ifinit2(sc->sc_carpdev, ifa, IF_LLADDR(sc->sc_ifp)); DELAY(1000); /* XXX */ } } #ifdef INET6 static void carp_send_na(struct carp_softc *sc) { struct ifaddr *ifa; struct in6_addr *in6; static struct in6_addr mcast = IN6ADDR_LINKLOCAL_ALLNODES_INIT; TAILQ_FOREACH(ifa, &SC2IFP(sc)->if_addrlist, ifa_list) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; in6 = &ifatoia6(ifa)->ia_addr.sin6_addr; nd6_na_output(sc->sc_carpdev, &mcast, in6, ND_NA_FLAG_OVERRIDE, 1, NULL); DELAY(1000); /* XXX */ } } #endif /* INET6 */ static int carp_addrcount(struct carp_if *cif, struct in_ifaddr *ia, int type) { struct carp_softc *vh; struct ifaddr *ifa; int count = 0; CARP_LOCK_ASSERT(cif); TAILQ_FOREACH(vh, &cif->vhif_vrs, sc_list) { if ((type == CARP_COUNT_RUNNING && (SC2IFP(vh)->if_flags & IFF_UP) && (SC2IFP(vh)->if_drv_flags & IFF_DRV_RUNNING)) || (type == CARP_COUNT_MASTER && vh->sc_state == MASTER)) { TAILQ_FOREACH(ifa, &SC2IFP(vh)->if_addrlist, ifa_list) { if (ifa->ifa_addr->sa_family == AF_INET && ia->ia_addr.sin_addr.s_addr == ifatoia(ifa)->ia_addr.sin_addr.s_addr) count++; } } } return (count); } int carp_iamatch(void *v, struct in_ifaddr *ia, struct in_addr *isaddr, u_int8_t **enaddr) { struct carp_if *cif = v; struct carp_softc *vh; int index, count = 0; struct ifaddr *ifa; CARP_LOCK(cif); if (carp_opts[CARPCTL_ARPBALANCE]) { /* * XXX proof of concept implementation. * We use the source ip to decide which virtual host should * handle the request. If we're master of that virtual host, * then we respond, otherwise, just drop the arp packet on * the floor. 
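 * Example: if three running vhosts carry the queried address (count == 3)
 * and the ARP request's source address is 192.0.2.10 (3221225994 as a
 * host-order integer), then index = 3221225994 % 3 == 0, so the first
 * matching address in iteration order is selected and we answer only if
 * the vhost owning it is currently MASTER.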
*/ count = carp_addrcount(cif, ia, CARP_COUNT_RUNNING); if (count == 0) { /* should never reach this */ CARP_UNLOCK(cif); return (0); } /* this should be a hash, like pf_hash() */ index = ntohl(isaddr->s_addr) % count; count = 0; TAILQ_FOREACH(vh, &cif->vhif_vrs, sc_list) { if ((SC2IFP(vh)->if_flags & IFF_UP) && (SC2IFP(vh)->if_drv_flags & IFF_DRV_RUNNING)) { TAILQ_FOREACH(ifa, &SC2IFP(vh)->if_addrlist, ifa_list) { if (ifa->ifa_addr->sa_family == AF_INET && ia->ia_addr.sin_addr.s_addr == ifatoia(ifa)->ia_addr.sin_addr.s_addr) { if (count == index) { if (vh->sc_state == MASTER) { *enaddr = IF_LLADDR(vh->sc_ifp); CARP_UNLOCK(cif); return (1); } else { CARP_UNLOCK(cif); return (0); } } count++; } } } } } else { TAILQ_FOREACH(vh, &cif->vhif_vrs, sc_list) { if ((SC2IFP(vh)->if_flags & IFF_UP) && (SC2IFP(vh)->if_drv_flags & IFF_DRV_RUNNING) && ia->ia_ifp == SC2IFP(vh) && vh->sc_state == MASTER) { *enaddr = IF_LLADDR(vh->sc_ifp); CARP_UNLOCK(cif); return (1); } } } CARP_UNLOCK(cif); return (0); } #ifdef INET6 struct ifaddr * carp_iamatch6(void *v, struct in6_addr *taddr) { struct carp_if *cif = v; struct carp_softc *vh; struct ifaddr *ifa; CARP_LOCK(cif); TAILQ_FOREACH(vh, &cif->vhif_vrs, sc_list) { TAILQ_FOREACH(ifa, &SC2IFP(vh)->if_addrlist, ifa_list) { if (IN6_ARE_ADDR_EQUAL(taddr, &ifatoia6(ifa)->ia_addr.sin6_addr) && (SC2IFP(vh)->if_flags & IFF_UP) && (SC2IFP(vh)->if_drv_flags & IFF_DRV_RUNNING) && vh->sc_state == MASTER) { CARP_UNLOCK(cif); return (ifa); } } } CARP_UNLOCK(cif); return (NULL); } void * carp_macmatch6(void *v, struct mbuf *m, const struct in6_addr *taddr) { struct m_tag *mtag; struct carp_if *cif = v; struct carp_softc *sc; struct ifaddr *ifa; CARP_LOCK(cif); TAILQ_FOREACH(sc, &cif->vhif_vrs, sc_list) { TAILQ_FOREACH(ifa, &SC2IFP(sc)->if_addrlist, ifa_list) { if (IN6_ARE_ADDR_EQUAL(taddr, &ifatoia6(ifa)->ia_addr.sin6_addr) && (SC2IFP(sc)->if_flags & IFF_UP) && (SC2IFP(sc)->if_drv_flags & IFF_DRV_RUNNING)) { struct ifnet *ifp = SC2IFP(sc); mtag = m_tag_get(PACKET_TAG_CARP, sizeof(struct ifnet *), M_NOWAIT); if (mtag == NULL) { /* better a bit than nothing */ CARP_UNLOCK(cif); return (IF_LLADDR(sc->sc_ifp)); } bcopy(&ifp, (caddr_t)(mtag + 1), sizeof(struct ifnet *)); m_tag_prepend(m, mtag); CARP_UNLOCK(cif); return (IF_LLADDR(sc->sc_ifp)); } } } CARP_UNLOCK(cif); return (NULL); } #endif struct ifnet * carp_forus(void *v, void *dhost) { struct carp_if *cif = v; struct carp_softc *vh; u_int8_t *ena = dhost; if (ena[0] || ena[1] || ena[2] != 0x5e || ena[3] || ena[4] != 1) return (NULL); CARP_LOCK(cif); TAILQ_FOREACH(vh, &cif->vhif_vrs, sc_list) if ((SC2IFP(vh)->if_flags & IFF_UP) && (SC2IFP(vh)->if_drv_flags & IFF_DRV_RUNNING) && vh->sc_state == MASTER && !bcmp(dhost, IF_LLADDR(vh->sc_ifp), ETHER_ADDR_LEN)) { CARP_UNLOCK(cif); return (SC2IFP(vh)); } CARP_UNLOCK(cif); return (NULL); } static void carp_master_down(void *v) { struct carp_softc *sc = v; CARP_SCLOCK(sc); carp_master_down_locked(sc); CARP_SCUNLOCK(sc); } static void carp_master_down_locked(struct carp_softc *sc) { if (sc->sc_carpdev) CARP_SCLOCK_ASSERT(sc); switch (sc->sc_state) { case INIT: printf("%s: master_down event in INIT state\n", SC2IFP(sc)->if_xname); break; case MASTER: break; case BACKUP: carp_set_state(sc, MASTER); carp_send_ad_locked(sc); carp_send_arp(sc); #ifdef INET6 carp_send_na(sc); #endif /* INET6 */ carp_setrun(sc, 0); carp_setroute(sc, RTM_ADD); break; } } /* * When in backup state, af indicates whether to reset the master down timer * for v4 or v6. 
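carp_setrun() below derives two different timers from the same advbase/advskew pair: the advertisement interval while MASTER, and a three-times-longer master-down timeout while BACKUP, with advskew contributing advskew/256 of a second. A small user-space sketch of that arithmetic (struct timeval only; the advbase=1, advskew=100 values are just an example):

    #include <stdio.h>
    #include <sys/time.h>

    /* Advertisement interval used in the MASTER state. */
    static void
    carp_ad_interval(int advbase, int advskew, struct timeval *tv)
    {
        tv->tv_sec = advbase;
        tv->tv_usec = advskew * 1000000 / 256;
    }

    /* Master-down timeout used in the BACKUP state: three advertisements. */
    static void
    carp_md_timeout(int advbase, int advskew, struct timeval *tv)
    {
        tv->tv_sec = 3 * advbase;
        tv->tv_usec = advskew * 1000000 / 256;
    }

    int
    main(void)
    {
        struct timeval ad, md;

        carp_ad_interval(1, 100, &ad);
        carp_md_timeout(1, 100, &md);
        printf("advertise every %ld.%06ld s, master down after %ld.%06ld s\n",
            (long)ad.tv_sec, (long)ad.tv_usec,
            (long)md.tv_sec, (long)md.tv_usec);
        return (0);
    }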
If it's set to zero, reset the ones which are already pending. */ static void carp_setrun(struct carp_softc *sc, sa_family_t af) { struct timeval tv; if (sc->sc_carpdev == NULL) { SC2IFP(sc)->if_drv_flags &= ~IFF_DRV_RUNNING; carp_set_state(sc, INIT); return; } else CARP_SCLOCK_ASSERT(sc); if (SC2IFP(sc)->if_flags & IFF_UP && sc->sc_vhid > 0 && (sc->sc_naddrs || sc->sc_naddrs6)) SC2IFP(sc)->if_drv_flags |= IFF_DRV_RUNNING; else { SC2IFP(sc)->if_drv_flags &= ~IFF_DRV_RUNNING; carp_setroute(sc, RTM_DELETE); return; } switch (sc->sc_state) { case INIT: if (carp_opts[CARPCTL_PREEMPT] && !carp_suppress_preempt) { carp_send_ad_locked(sc); carp_send_arp(sc); #ifdef INET6 carp_send_na(sc); #endif /* INET6 */ CARP_DEBUG("%s: INIT -> MASTER (preempting)\n", SC2IFP(sc)->if_xname); carp_set_state(sc, MASTER); carp_setroute(sc, RTM_ADD); } else { CARP_DEBUG("%s: INIT -> BACKUP\n", SC2IFP(sc)->if_xname); carp_set_state(sc, BACKUP); carp_setroute(sc, RTM_DELETE); carp_setrun(sc, 0); } break; case BACKUP: callout_stop(&sc->sc_ad_tmo); tv.tv_sec = 3 * sc->sc_advbase; tv.tv_usec = sc->sc_advskew * 1000000 / 256; switch (af) { #ifdef INET case AF_INET: callout_reset(&sc->sc_md_tmo, tvtohz(&tv), carp_master_down, sc); break; #endif /* INET */ #ifdef INET6 case AF_INET6: callout_reset(&sc->sc_md6_tmo, tvtohz(&tv), carp_master_down, sc); break; #endif /* INET6 */ default: if (sc->sc_naddrs) callout_reset(&sc->sc_md_tmo, tvtohz(&tv), carp_master_down, sc); if (sc->sc_naddrs6) callout_reset(&sc->sc_md6_tmo, tvtohz(&tv), carp_master_down, sc); break; } break; case MASTER: tv.tv_sec = sc->sc_advbase; tv.tv_usec = sc->sc_advskew * 1000000 / 256; callout_reset(&sc->sc_ad_tmo, tvtohz(&tv), carp_send_ad, sc); break; } } static void carp_multicast_cleanup(struct carp_softc *sc) { struct ip_moptions *imo = &sc->sc_imo; u_int16_t n = imo->imo_num_memberships; /* Clean up our own multicast memberships */ while (n-- > 0) { if (imo->imo_membership[n] != NULL) { in_delmulti(imo->imo_membership[n]); imo->imo_membership[n] = NULL; } } KASSERT(imo->imo_mfilters == NULL, ("%s: imo_mfilters != NULL", __func__)); imo->imo_num_memberships = 0; imo->imo_multicast_ifp = NULL; } #ifdef INET6 static void carp_multicast6_cleanup(struct carp_softc *sc) { struct ip6_moptions *im6o = &sc->sc_im6o; while (!LIST_EMPTY(&im6o->im6o_memberships)) { struct in6_multi_mship *imm = LIST_FIRST(&im6o->im6o_memberships); LIST_REMOVE(imm, i6mm_chain); in6_leavegroup(imm); } im6o->im6o_multicast_ifp = NULL; } #endif static int carp_set_addr(struct carp_softc *sc, struct sockaddr_in *sin) { INIT_VNET_INET(curvnet); struct ifnet *ifp; struct carp_if *cif; struct in_ifaddr *ia, *ia_if; struct ip_moptions *imo = &sc->sc_imo; struct in_addr addr; u_long iaddr = htonl(sin->sin_addr.s_addr); int own, error; if (sin->sin_addr.s_addr == 0) { if (!(SC2IFP(sc)->if_flags & IFF_UP)) carp_set_state(sc, INIT); if (sc->sc_naddrs) SC2IFP(sc)->if_flags |= IFF_UP; if (sc->sc_carpdev) CARP_SCLOCK(sc); carp_setrun(sc, 0); if (sc->sc_carpdev) CARP_SCUNLOCK(sc); return (0); } /* we have to do it by hands to check we won't match on us */ ia_if = NULL; own = 0; TAILQ_FOREACH(ia, &V_in_ifaddrhead, ia_link) { /* and, yeah, we need a multicast-capable iface too */ if (ia->ia_ifp != SC2IFP(sc) && (ia->ia_ifp->if_flags & IFF_MULTICAST) && (iaddr & ia->ia_subnetmask) == ia->ia_subnet) { if (!ia_if) ia_if = ia; if (sin->sin_addr.s_addr == ia->ia_addr.sin_addr.s_addr) own++; } } if (!ia_if) return (EADDRNOTAVAIL); ia = ia_if; ifp = ia->ia_ifp; if (ifp == NULL || (ifp->if_flags & 
IFF_MULTICAST) == 0 || (imo->imo_multicast_ifp && imo->imo_multicast_ifp != ifp)) return (EADDRNOTAVAIL); if (imo->imo_num_memberships == 0) { addr.s_addr = htonl(INADDR_CARP_GROUP); if ((imo->imo_membership[0] = in_addmulti(&addr, ifp)) == NULL) return (ENOBUFS); imo->imo_num_memberships++; imo->imo_multicast_ifp = ifp; imo->imo_multicast_ttl = CARP_DFLTTL; imo->imo_multicast_loop = 0; } if (!ifp->if_carp) { cif = malloc(sizeof(*cif), M_CARP, M_WAITOK|M_ZERO); if (!cif) { error = ENOBUFS; goto cleanup; } if ((error = ifpromisc(ifp, 1))) { free(cif, M_CARP); goto cleanup; } CARP_LOCK_INIT(cif); CARP_LOCK(cif); cif->vhif_ifp = ifp; TAILQ_INIT(&cif->vhif_vrs); ifp->if_carp = cif; } else { struct carp_softc *vr; cif = (struct carp_if *)ifp->if_carp; CARP_LOCK(cif); TAILQ_FOREACH(vr, &cif->vhif_vrs, sc_list) if (vr != sc && vr->sc_vhid == sc->sc_vhid) { CARP_UNLOCK(cif); error = EEXIST; goto cleanup; } } sc->sc_ia = ia; sc->sc_carpdev = ifp; { /* XXX prevent endless loop if already in queue */ struct carp_softc *vr, *after = NULL; int myself = 0; cif = (struct carp_if *)ifp->if_carp; /* XXX: cif should not change, right? So we still hold the lock */ CARP_LOCK_ASSERT(cif); TAILQ_FOREACH(vr, &cif->vhif_vrs, sc_list) { if (vr == sc) myself = 1; if (vr->sc_vhid < sc->sc_vhid) after = vr; } if (!myself) { /* We're trying to keep things in order */ if (after == NULL) { TAILQ_INSERT_TAIL(&cif->vhif_vrs, sc, sc_list); } else { TAILQ_INSERT_AFTER(&cif->vhif_vrs, after, sc, sc_list); } cif->vhif_nvrs++; } } sc->sc_naddrs++; SC2IFP(sc)->if_flags |= IFF_UP; if (own) sc->sc_advskew = 0; carp_sc_state_locked(sc); carp_setrun(sc, 0); CARP_UNLOCK(cif); return (0); cleanup: in_delmulti(imo->imo_membership[--imo->imo_num_memberships]); return (error); } static int carp_del_addr(struct carp_softc *sc, struct sockaddr_in *sin) { int error = 0; if (!--sc->sc_naddrs) { struct carp_if *cif = (struct carp_if *)sc->sc_carpdev->if_carp; struct ip_moptions *imo = &sc->sc_imo; CARP_LOCK(cif); callout_stop(&sc->sc_ad_tmo); SC2IFP(sc)->if_flags &= ~IFF_UP; SC2IFP(sc)->if_drv_flags &= ~IFF_DRV_RUNNING; sc->sc_vhid = -1; in_delmulti(imo->imo_membership[--imo->imo_num_memberships]); imo->imo_multicast_ifp = NULL; TAILQ_REMOVE(&cif->vhif_vrs, sc, sc_list); if (!--cif->vhif_nvrs) { sc->sc_carpdev->if_carp = NULL; CARP_LOCK_DESTROY(cif); free(cif, M_IFADDR); } else { CARP_UNLOCK(cif); } } return (error); } #ifdef INET6 static int carp_set_addr6(struct carp_softc *sc, struct sockaddr_in6 *sin6) { INIT_VNET_INET6(curvnet); struct ifnet *ifp; struct carp_if *cif; struct in6_ifaddr *ia, *ia_if; struct ip6_moptions *im6o = &sc->sc_im6o; struct in6_multi_mship *imm; struct in6_addr in6; int own, error; if (IN6_IS_ADDR_UNSPECIFIED(&sin6->sin6_addr)) { if (!(SC2IFP(sc)->if_flags & IFF_UP)) carp_set_state(sc, INIT); if (sc->sc_naddrs6) SC2IFP(sc)->if_flags |= IFF_UP; if (sc->sc_carpdev) CARP_SCLOCK(sc); carp_setrun(sc, 0); if (sc->sc_carpdev) CARP_SCUNLOCK(sc); return (0); } /* we have to do it by hands to check we won't match on us */ ia_if = NULL; own = 0; for (ia = V_in6_ifaddr; ia; ia = ia->ia_next) { int i; for (i = 0; i < 4; i++) { if ((sin6->sin6_addr.s6_addr32[i] & ia->ia_prefixmask.sin6_addr.s6_addr32[i]) != (ia->ia_addr.sin6_addr.s6_addr32[i] & ia->ia_prefixmask.sin6_addr.s6_addr32[i])) break; } /* and, yeah, we need a multicast-capable iface too */ if (ia->ia_ifp != SC2IFP(sc) && (ia->ia_ifp->if_flags & IFF_MULTICAST) && (i == 4)) { if (!ia_if) ia_if = ia; if (IN6_ARE_ADDR_EQUAL(&sin6->sin6_addr, &ia->ia_addr.sin6_addr)) 
own++; } } if (!ia_if) return (EADDRNOTAVAIL); ia = ia_if; ifp = ia->ia_ifp; if (ifp == NULL || (ifp->if_flags & IFF_MULTICAST) == 0 || (im6o->im6o_multicast_ifp && im6o->im6o_multicast_ifp != ifp)) return (EADDRNOTAVAIL); if (!sc->sc_naddrs6) { im6o->im6o_multicast_ifp = ifp; /* join CARP multicast address */ bzero(&in6, sizeof(in6)); in6.s6_addr16[0] = htons(0xff02); in6.s6_addr8[15] = 0x12; if (in6_setscope(&in6, ifp, NULL) != 0) goto cleanup; if ((imm = in6_joingroup(ifp, &in6, &error, 0)) == NULL) goto cleanup; LIST_INSERT_HEAD(&im6o->im6o_memberships, imm, i6mm_chain); /* join solicited multicast address */ bzero(&in6, sizeof(in6)); in6.s6_addr16[0] = htons(0xff02); in6.s6_addr32[1] = 0; in6.s6_addr32[2] = htonl(1); in6.s6_addr32[3] = sin6->sin6_addr.s6_addr32[3]; in6.s6_addr8[12] = 0xff; if (in6_setscope(&in6, ifp, NULL) != 0) goto cleanup; if ((imm = in6_joingroup(ifp, &in6, &error, 0)) == NULL) goto cleanup; LIST_INSERT_HEAD(&im6o->im6o_memberships, imm, i6mm_chain); } if (!ifp->if_carp) { cif = malloc(sizeof(*cif), M_CARP, M_WAITOK|M_ZERO); if (!cif) { error = ENOBUFS; goto cleanup; } if ((error = ifpromisc(ifp, 1))) { free(cif, M_CARP); goto cleanup; } CARP_LOCK_INIT(cif); CARP_LOCK(cif); cif->vhif_ifp = ifp; TAILQ_INIT(&cif->vhif_vrs); ifp->if_carp = cif; } else { struct carp_softc *vr; cif = (struct carp_if *)ifp->if_carp; CARP_LOCK(cif); TAILQ_FOREACH(vr, &cif->vhif_vrs, sc_list) if (vr != sc && vr->sc_vhid == sc->sc_vhid) { CARP_UNLOCK(cif); error = EINVAL; goto cleanup; } } sc->sc_ia6 = ia; sc->sc_carpdev = ifp; { /* XXX prevent endless loop if already in queue */ struct carp_softc *vr, *after = NULL; int myself = 0; cif = (struct carp_if *)ifp->if_carp; CARP_LOCK_ASSERT(cif); TAILQ_FOREACH(vr, &cif->vhif_vrs, sc_list) { if (vr == sc) myself = 1; if (vr->sc_vhid < sc->sc_vhid) after = vr; } if (!myself) { /* We're trying to keep things in order */ if (after == NULL) { TAILQ_INSERT_TAIL(&cif->vhif_vrs, sc, sc_list); } else { TAILQ_INSERT_AFTER(&cif->vhif_vrs, after, sc, sc_list); } cif->vhif_nvrs++; } } sc->sc_naddrs6++; SC2IFP(sc)->if_flags |= IFF_UP; if (own) sc->sc_advskew = 0; carp_sc_state_locked(sc); carp_setrun(sc, 0); CARP_UNLOCK(cif); return (0); cleanup: /* clean up multicast memberships */ if (!sc->sc_naddrs6) { while (!LIST_EMPTY(&im6o->im6o_memberships)) { imm = LIST_FIRST(&im6o->im6o_memberships); LIST_REMOVE(imm, i6mm_chain); in6_leavegroup(imm); } } return (error); } static int carp_del_addr6(struct carp_softc *sc, struct sockaddr_in6 *sin6) { int error = 0; if (!--sc->sc_naddrs6) { struct carp_if *cif = (struct carp_if *)sc->sc_carpdev->if_carp; struct ip6_moptions *im6o = &sc->sc_im6o; CARP_LOCK(cif); callout_stop(&sc->sc_ad_tmo); SC2IFP(sc)->if_flags &= ~IFF_UP; SC2IFP(sc)->if_drv_flags &= ~IFF_DRV_RUNNING; sc->sc_vhid = -1; while (!LIST_EMPTY(&im6o->im6o_memberships)) { struct in6_multi_mship *imm = LIST_FIRST(&im6o->im6o_memberships); LIST_REMOVE(imm, i6mm_chain); in6_leavegroup(imm); } im6o->im6o_multicast_ifp = NULL; TAILQ_REMOVE(&cif->vhif_vrs, sc, sc_list); if (!--cif->vhif_nvrs) { CARP_LOCK_DESTROY(cif); sc->sc_carpdev->if_carp = NULL; free(cif, M_IFADDR); } else CARP_UNLOCK(cif); } return (error); } #endif /* INET6 */ static int carp_ioctl(struct ifnet *ifp, u_long cmd, caddr_t addr) { struct carp_softc *sc = ifp->if_softc, *vr; struct carpreq carpr; struct ifaddr *ifa; struct ifreq *ifr; struct ifaliasreq *ifra; int locked = 0, error = 0; ifa = (struct ifaddr *)addr; ifra = (struct ifaliasreq *)addr; ifr = (struct ifreq *)addr; switch (cmd) { case 
SIOCSIFADDR: switch (ifa->ifa_addr->sa_family) { #ifdef INET case AF_INET: SC2IFP(sc)->if_flags |= IFF_UP; bcopy(ifa->ifa_addr, ifa->ifa_dstaddr, sizeof(struct sockaddr)); error = carp_set_addr(sc, satosin(ifa->ifa_addr)); break; #endif /* INET */ #ifdef INET6 case AF_INET6: SC2IFP(sc)->if_flags |= IFF_UP; error = carp_set_addr6(sc, satosin6(ifa->ifa_addr)); break; #endif /* INET6 */ default: error = EAFNOSUPPORT; break; } break; case SIOCAIFADDR: switch (ifa->ifa_addr->sa_family) { #ifdef INET case AF_INET: SC2IFP(sc)->if_flags |= IFF_UP; bcopy(ifa->ifa_addr, ifa->ifa_dstaddr, sizeof(struct sockaddr)); error = carp_set_addr(sc, satosin(&ifra->ifra_addr)); break; #endif /* INET */ #ifdef INET6 case AF_INET6: SC2IFP(sc)->if_flags |= IFF_UP; error = carp_set_addr6(sc, satosin6(&ifra->ifra_addr)); break; #endif /* INET6 */ default: error = EAFNOSUPPORT; break; } break; case SIOCDIFADDR: switch (ifa->ifa_addr->sa_family) { #ifdef INET case AF_INET: error = carp_del_addr(sc, satosin(&ifra->ifra_addr)); break; #endif /* INET */ #ifdef INET6 case AF_INET6: error = carp_del_addr6(sc, satosin6(&ifra->ifra_addr)); break; #endif /* INET6 */ default: error = EAFNOSUPPORT; break; } break; case SIOCSIFFLAGS: if (sc->sc_carpdev) { locked = 1; CARP_SCLOCK(sc); } if (sc->sc_state != INIT && !(ifr->ifr_flags & IFF_UP)) { callout_stop(&sc->sc_ad_tmo); callout_stop(&sc->sc_md_tmo); callout_stop(&sc->sc_md6_tmo); if (sc->sc_state == MASTER) carp_send_ad_locked(sc); carp_set_state(sc, INIT); carp_setrun(sc, 0); } else if (sc->sc_state == INIT && (ifr->ifr_flags & IFF_UP)) { SC2IFP(sc)->if_flags |= IFF_UP; carp_setrun(sc, 0); } break; case SIOCSVH: error = priv_check(curthread, PRIV_NETINET_CARP); if (error) break; if ((error = copyin(ifr->ifr_data, &carpr, sizeof carpr))) break; error = 1; if (sc->sc_carpdev) { locked = 1; CARP_SCLOCK(sc); } if (sc->sc_state != INIT && carpr.carpr_state != sc->sc_state) { switch (carpr.carpr_state) { case BACKUP: callout_stop(&sc->sc_ad_tmo); carp_set_state(sc, BACKUP); carp_setrun(sc, 0); carp_setroute(sc, RTM_DELETE); break; case MASTER: carp_master_down_locked(sc); break; default: break; } } if (carpr.carpr_vhid > 0) { if (carpr.carpr_vhid > 255) { error = EINVAL; break; } if (sc->sc_carpdev) { struct carp_if *cif; cif = (struct carp_if *)sc->sc_carpdev->if_carp; TAILQ_FOREACH(vr, &cif->vhif_vrs, sc_list) if (vr != sc && vr->sc_vhid == carpr.carpr_vhid) { error = EEXIST; break; } if (error == EEXIST) break; } sc->sc_vhid = carpr.carpr_vhid; IF_LLADDR(sc->sc_ifp)[0] = 0; IF_LLADDR(sc->sc_ifp)[1] = 0; IF_LLADDR(sc->sc_ifp)[2] = 0x5e; IF_LLADDR(sc->sc_ifp)[3] = 0; IF_LLADDR(sc->sc_ifp)[4] = 1; IF_LLADDR(sc->sc_ifp)[5] = sc->sc_vhid; error--; } if (carpr.carpr_advbase > 0 || carpr.carpr_advskew > 0) { if (carpr.carpr_advskew >= 255) { error = EINVAL; break; } if (carpr.carpr_advbase > 255) { error = EINVAL; break; } sc->sc_advbase = carpr.carpr_advbase; sc->sc_advskew = carpr.carpr_advskew; error--; } bcopy(carpr.carpr_key, sc->sc_key, sizeof(sc->sc_key)); if (error > 0) error = EINVAL; else { error = 0; carp_setrun(sc, 0); } break; case SIOCGVH: /* XXX: lockless read */ bzero(&carpr, sizeof(carpr)); carpr.carpr_state = sc->sc_state; carpr.carpr_vhid = sc->sc_vhid; carpr.carpr_advbase = sc->sc_advbase; carpr.carpr_advskew = sc->sc_advskew; error = priv_check(curthread, PRIV_NETINET_CARP); if (error == 0) bcopy(sc->sc_key, carpr.carpr_key, sizeof(carpr.carpr_key)); error = copyout(&carpr, ifr->ifr_data, sizeof(carpr)); break; default: error = EINVAL; } if (locked) 
CARP_SCUNLOCK(sc); carp_hmac_prepare(sc); return (error); } /* * XXX: this is looutput. We should eventually use it from there. */ static int carp_looutput(struct ifnet *ifp, struct mbuf *m, struct sockaddr *dst, struct rtentry *rt) { u_int32_t af; M_ASSERTPKTHDR(m); /* check if we have the packet header */ if (rt && rt->rt_flags & (RTF_REJECT|RTF_BLACKHOLE)) { m_freem(m); return (rt->rt_flags & RTF_BLACKHOLE ? 0 : rt->rt_flags & RTF_HOST ? EHOSTUNREACH : ENETUNREACH); } ifp->if_opackets++; ifp->if_obytes += m->m_pkthdr.len; /* BPF writes need to be handled specially. */ if (dst->sa_family == AF_UNSPEC) { bcopy(dst->sa_data, &af, sizeof(af)); dst->sa_family = af; } #if 1 /* XXX */ switch (dst->sa_family) { case AF_INET: case AF_INET6: case AF_IPX: case AF_APPLETALK: break; default: printf("carp_looutput: af=%d unexpected\n", dst->sa_family); m_freem(m); return (EAFNOSUPPORT); } #endif return(if_simloop(ifp, m, dst->sa_family, 0)); } /* * Start output on carp interface. This function should never be called. */ static void carp_start(struct ifnet *ifp) { #ifdef DEBUG printf("%s: start called\n", ifp->if_xname); #endif } int carp_output(struct ifnet *ifp, struct mbuf *m, struct sockaddr *sa, struct rtentry *rt) { struct m_tag *mtag; struct carp_softc *sc; struct ifnet *carp_ifp; if (!sa) return (0); switch (sa->sa_family) { #ifdef INET case AF_INET: break; #endif /* INET */ #ifdef INET6 case AF_INET6: break; #endif /* INET6 */ default: return (0); } mtag = m_tag_find(m, PACKET_TAG_CARP, NULL); if (mtag == NULL) return (0); bcopy(mtag + 1, &carp_ifp, sizeof(struct ifnet *)); sc = carp_ifp->if_softc; /* Set the source MAC address to Virtual Router MAC Address */ switch (ifp->if_type) { case IFT_ETHER: case IFT_L2VLAN: { struct ether_header *eh; eh = mtod(m, struct ether_header *); eh->ether_shost[0] = 0; eh->ether_shost[1] = 0; eh->ether_shost[2] = 0x5e; eh->ether_shost[3] = 0; eh->ether_shost[4] = 1; eh->ether_shost[5] = sc->sc_vhid; } break; case IFT_FDDI: { struct fddi_header *fh; fh = mtod(m, struct fddi_header *); fh->fddi_shost[0] = 0; fh->fddi_shost[1] = 0; fh->fddi_shost[2] = 0x5e; fh->fddi_shost[3] = 0; fh->fddi_shost[4] = 1; fh->fddi_shost[5] = sc->sc_vhid; } break; case IFT_ISO88025: { struct iso88025_header *th; th = mtod(m, struct iso88025_header *); th->iso88025_shost[0] = 3; th->iso88025_shost[1] = 0; th->iso88025_shost[2] = 0x40 >> (sc->sc_vhid - 1); th->iso88025_shost[3] = 0x40000 >> (sc->sc_vhid - 1); th->iso88025_shost[4] = 0; th->iso88025_shost[5] = 0; } break; default: printf("%s: carp is not supported for this interface type\n", ifp->if_xname); return (EOPNOTSUPP); } return (0); } static void carp_set_state(struct carp_softc *sc, int state) { int link_state; if (sc->sc_carpdev) CARP_SCLOCK_ASSERT(sc); if (sc->sc_state == state) return; sc->sc_state = state; switch (state) { case BACKUP: link_state = LINK_STATE_DOWN; break; case MASTER: link_state = LINK_STATE_UP; break; default: link_state = LINK_STATE_UNKNOWN; break; } if_link_state_change(SC2IFP(sc), link_state); } void carp_carpdev_state(void *v) { struct carp_if *cif = v; CARP_LOCK(cif); carp_carpdev_state_locked(cif); CARP_UNLOCK(cif); } static void carp_carpdev_state_locked(struct carp_if *cif) { struct carp_softc *sc; TAILQ_FOREACH(sc, &cif->vhif_vrs, sc_list) carp_sc_state_locked(sc); } static void carp_sc_state_locked(struct carp_softc *sc) { CARP_SCLOCK_ASSERT(sc); if (sc->sc_carpdev->if_link_state != LINK_STATE_UP || !(sc->sc_carpdev->if_flags & IFF_UP)) { sc->sc_flags_backup = SC2IFP(sc)->if_flags; 
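carp_output() above rewrites the frame's source MAC to the CARP virtual router address 00:00:5e:00:01:<vhid>, and carp_forus() accepts a destination MAC only if it has that shape. A self-contained sketch of building and testing such an address; the vhid value is an example, and the helper names are made up:

    #include <stdint.h>
    #include <stdio.h>

    #define ETHER_ADDR_LEN  6

    /* Fill in the CARP virtual router MAC for a given vhid. */
    static void
    carp_vmac(uint8_t vhid, uint8_t mac[ETHER_ADDR_LEN])
    {
        mac[0] = 0;
        mac[1] = 0;
        mac[2] = 0x5e;
        mac[3] = 0;
        mac[4] = 1;
        mac[5] = vhid;
    }

    /* Rough equivalent of the prefix test at the top of carp_forus(). */
    static int
    looks_like_carp_vmac(const uint8_t *ena)
    {
        return (ena[0] == 0 && ena[1] == 0 && ena[2] == 0x5e &&
            ena[3] == 0 && ena[4] == 1);
    }

    int
    main(void)
    {
        uint8_t mac[ETHER_ADDR_LEN];

        carp_vmac(1, mac);              /* vhid 1 */
        printf("%02x:%02x:%02x:%02x:%02x:%02x carp-looking=%d\n",
            mac[0], mac[1], mac[2], mac[3], mac[4], mac[5],
            looks_like_carp_vmac(mac));
        return (0);
    }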
SC2IFP(sc)->if_flags &= ~IFF_UP; SC2IFP(sc)->if_drv_flags &= ~IFF_DRV_RUNNING; callout_stop(&sc->sc_ad_tmo); callout_stop(&sc->sc_md_tmo); callout_stop(&sc->sc_md6_tmo); carp_set_state(sc, INIT); carp_setrun(sc, 0); if (!sc->sc_suppress) { carp_suppress_preempt++; if (carp_suppress_preempt == 1) { CARP_SCUNLOCK(sc); carp_send_ad_all(); CARP_SCLOCK(sc); } } sc->sc_suppress = 1; } else { SC2IFP(sc)->if_flags |= sc->sc_flags_backup; carp_set_state(sc, INIT); carp_setrun(sc, 0); if (sc->sc_suppress) carp_suppress_preempt--; sc->sc_suppress = 0; } return; } static int carp_modevent(module_t mod, int type, void *data) { switch (type) { case MOD_LOAD: if_detach_event_tag = EVENTHANDLER_REGISTER(ifnet_departure_event, carp_ifdetach, NULL, EVENTHANDLER_PRI_ANY); if (if_detach_event_tag == NULL) return (ENOMEM); mtx_init(&carp_mtx, "carp_mtx", NULL, MTX_DEF); LIST_INIT(&carpif_list); if_clone_attach(&carp_cloner); break; case MOD_UNLOAD: EVENTHANDLER_DEREGISTER(ifnet_departure_event, if_detach_event_tag); if_clone_detach(&carp_cloner); mtx_destroy(&carp_mtx); break; default: return (EINVAL); } return (0); } static moduledata_t carp_mod = { "carp", carp_modevent, 0 }; DECLARE_MODULE(carp, carp_mod, SI_SUB_PSEUDO, SI_ORDER_ANY); Index: head/sys/netinet/ip_fastfwd.c =================================================================== --- head/sys/netinet/ip_fastfwd.c (revision 186118) +++ head/sys/netinet/ip_fastfwd.c (revision 186119) @@ -1,619 +1,619 @@ /*- * Copyright (c) 2003 Andre Oppermann, Internet Business Solutions AG * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote * products derived from this software without specific prior written * permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* * ip_fastforward gets its speed from processing the forwarded packet to * completion (if_output on the other side) without any queues or netisr's. * The receiving interface DMAs the packet into memory, the upper half of * driver calls ip_fastforward, we do our routing table lookup and directly * send it off to the outgoing interface, which DMAs the packet to the * network card. The only part of the packet we touch with the CPU is the * IP header (unless there are complex firewall rules touching other parts * of the packet, but that is up to you). 
We are essentially limited by bus * bandwidth and how fast the network card/driver can set up receives and * transmits. * * We handle basic errors, IP header errors, checksum errors, * destination unreachable, fragmentation and fragmentation needed and * report them via ICMP to the sender. * * Else if something is not pure IPv4 unicast forwarding we fall back to * the normal ip_input processing path. We should only be called from * interfaces connected to the outside world. * * Firewalling is fully supported including divert, ipfw fwd and ipfilter * ipnat and address rewrite. * * IPSEC is not supported if this host is a tunnel broker. IPSEC is * supported for connections to/from local host. * * We try to do the least expensive (in CPU ops) checks and operations * first to catch junk with as little overhead as possible. * * We take full advantage of hardware support for IP checksum and * fragmentation offloading. * * We don't do ICMP redirect in the fast forwarding path. I have had my own * cases where two core routers with Zebra routing suite would send millions * ICMP redirects to connected hosts if the destination router was not the * default gateway. In one case it was filling the routing table of a host * with approximately 300.000 cloned redirect entries until it ran out of * kernel memory. However the networking code proved very robust and it didn't * crash or fail in other ways. */ /* * Many thanks to Matt Thomas of NetBSD for basic structure of ip_flow.c which * is being followed here. */ #include __FBSDID("$FreeBSD$"); #include "opt_ipfw.h" #include "opt_ipstealth.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef VIMAGE_GLOBALS static int ipfastforward_active; #endif SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip, OID_AUTO, fastforwarding, CTLFLAG_RW, ipfastforward_active, 0, "Enable fast IP forwarding"); static struct sockaddr_in * ip_findroute(struct route *ro, struct in_addr dest, struct mbuf *m) { INIT_VNET_INET(curvnet); struct sockaddr_in *dst; struct rtentry *rt; /* * Find route to destination. */ bzero(ro, sizeof(*ro)); dst = (struct sockaddr_in *)&ro->ro_dst; dst->sin_family = AF_INET; dst->sin_len = sizeof(*dst); dst->sin_addr.s_addr = dest.s_addr; - in_rtalloc_ign(ro, RTF_CLONING, M_GETFIB(m)); + in_rtalloc_ign(ro, 0, M_GETFIB(m)); /* * Route there and interface still up? */ rt = ro->ro_rt; if (rt && (rt->rt_flags & RTF_UP) && (rt->rt_ifp->if_flags & IFF_UP) && (rt->rt_ifp->if_drv_flags & IFF_DRV_RUNNING)) { if (rt->rt_flags & RTF_GATEWAY) dst = (struct sockaddr_in *)rt->rt_gateway; } else { V_ipstat.ips_noroute++; V_ipstat.ips_cantforward++; if (rt) RTFREE(rt); icmp_error(m, ICMP_UNREACH, ICMP_UNREACH_HOST, 0, 0); return NULL; } return dst; } /* * Try to forward a packet based on the destination address. * This is a fast path optimized for the plain forwarding case. * If the packet is handled (and consumed) here then we return 1; * otherwise 0 is returned and the packet should be delivered * to ip_input for full processing. */ struct mbuf * ip_fastforward(struct mbuf *m) { INIT_VNET_INET(curvnet); struct ip *ip; struct mbuf *m0 = NULL; struct route ro; struct sockaddr_in *dst = NULL; struct ifnet *ifp; struct in_addr odest, dest; u_short sum, ip_len; int error = 0; int hlen, mtu; #ifdef IPFIREWALL_FORWARD struct m_tag *fwd_tag; #endif /* * Are we active and forwarding packets? 
*/ if (!V_ipfastforward_active || !V_ipforwarding) return m; M_ASSERTVALID(m); M_ASSERTPKTHDR(m); ro.ro_rt = NULL; /* * Step 1: check for packet drop conditions (and sanity checks) */ /* * Is entire packet big enough? */ if (m->m_pkthdr.len < sizeof(struct ip)) { V_ipstat.ips_tooshort++; goto drop; } /* * Is first mbuf large enough for ip header and is header present? */ if (m->m_len < sizeof (struct ip) && (m = m_pullup(m, sizeof (struct ip))) == NULL) { V_ipstat.ips_toosmall++; return NULL; /* mbuf already free'd */ } ip = mtod(m, struct ip *); /* * Is it IPv4? */ if (ip->ip_v != IPVERSION) { V_ipstat.ips_badvers++; goto drop; } /* * Is IP header length correct and is it in first mbuf? */ hlen = ip->ip_hl << 2; if (hlen < sizeof(struct ip)) { /* minimum header length */ V_ipstat.ips_badlen++; goto drop; } if (hlen > m->m_len) { if ((m = m_pullup(m, hlen)) == NULL) { V_ipstat.ips_badhlen++; return NULL; /* mbuf already free'd */ } ip = mtod(m, struct ip *); } /* * Checksum correct? */ if (m->m_pkthdr.csum_flags & CSUM_IP_CHECKED) sum = !(m->m_pkthdr.csum_flags & CSUM_IP_VALID); else { if (hlen == sizeof(struct ip)) sum = in_cksum_hdr(ip); else sum = in_cksum(m, hlen); } if (sum) { V_ipstat.ips_badsum++; goto drop; } /* * Remember that we have checked the IP header and found it valid. */ m->m_pkthdr.csum_flags |= (CSUM_IP_CHECKED | CSUM_IP_VALID); ip_len = ntohs(ip->ip_len); /* * Is IP length longer than packet we have got? */ if (m->m_pkthdr.len < ip_len) { V_ipstat.ips_tooshort++; goto drop; } /* * Is packet longer than IP header tells us? If yes, truncate packet. */ if (m->m_pkthdr.len > ip_len) { if (m->m_len == m->m_pkthdr.len) { m->m_len = ip_len; m->m_pkthdr.len = ip_len; } else m_adj(m, ip_len - m->m_pkthdr.len); } /* * Is packet from or to 127/8? */ if ((ntohl(ip->ip_dst.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET || (ntohl(ip->ip_src.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET) { V_ipstat.ips_badaddr++; goto drop; } #ifdef ALTQ /* * Is packet dropped by traffic conditioner? */ if (altq_input != NULL && (*altq_input)(m, AF_INET) == 0) goto drop; #endif /* * Step 2: fallback conditions to normal ip_input path processing */ /* * Only IP packets without options */ if (ip->ip_hl != (sizeof(struct ip) >> 2)) { if (ip_doopts == 1) return m; else if (ip_doopts == 2) { icmp_error(m, ICMP_UNREACH, ICMP_UNREACH_FILTER_PROHIB, 0, 0); return NULL; /* mbuf already free'd */ } /* else ignore IP options and continue */ } /* * Only unicast IP, not from loopback, no L2 or IP broadcast, * no multicast, no INADDR_ANY * * XXX: Probably some of these checks could be direct drop * conditions. However it is not clear whether there are some * hacks or obscure behaviours which make it neccessary to * let ip_input handle it. We play safe here and let ip_input * deal with it until it is proven that we can directly drop it. */ if ((m->m_flags & (M_BCAST|M_MCAST)) || (m->m_pkthdr.rcvif->if_flags & IFF_LOOPBACK) || ntohl(ip->ip_src.s_addr) == (u_long)INADDR_BROADCAST || ntohl(ip->ip_dst.s_addr) == (u_long)INADDR_BROADCAST || IN_MULTICAST(ntohl(ip->ip_src.s_addr)) || IN_MULTICAST(ntohl(ip->ip_dst.s_addr)) || IN_LINKLOCAL(ntohl(ip->ip_src.s_addr)) || IN_LINKLOCAL(ntohl(ip->ip_dst.s_addr)) || ip->ip_src.s_addr == INADDR_ANY || ip->ip_dst.s_addr == INADDR_ANY ) return m; /* * Is it for a local address on this host? 
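When the driver has not already flagged the header as checked (CSUM_IP_CHECKED), the sanity-check step above falls back to in_cksum_hdr()/in_cksum() and drops the packet on any non-zero result. The following is a user-space sketch of the standard 16-bit one's-complement sum over a 20-byte header, for illustration only (the sample header bytes are made up):

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* RFC 1071 style one's-complement sum over a buffer of even length. */
    static uint16_t
    ip_cksum(const void *buf, int len)
    {
        const uint16_t *p = buf;
        uint32_t sum = 0;

        while (len > 1) {
            sum += *p++;
            len -= 2;
        }
        while (sum >> 16)
            sum = (sum & 0xffff) + (sum >> 16);
        return (~sum & 0xffff);
    }

    int
    main(void)
    {
        /* 20-byte IPv4 header with the checksum field (bytes 10-11) zeroed. */
        uint8_t hdr[20] = {
            0x45, 0x00, 0x00, 0x54, 0x00, 0x00, 0x40, 0x00,
            0x40, 0x01, 0x00, 0x00, 0xc0, 0x00, 0x02, 0x01,
            0xc0, 0x00, 0x02, 0x02
        };
        uint16_t ck = ip_cksum(hdr, sizeof(hdr));

        memcpy(&hdr[10], &ck, sizeof(ck));
        /* A good header now sums to zero, the condition tested above. */
        printf("residual sum: 0x%04x\n", ip_cksum(hdr, sizeof(hdr)));
        return (0);
    }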
*/ if (in_localip(ip->ip_dst)) return m; V_ipstat.ips_total++; /* * Step 3: incoming packet firewall processing */ /* * Convert to host representation */ ip->ip_len = ntohs(ip->ip_len); ip->ip_off = ntohs(ip->ip_off); odest.s_addr = dest.s_addr = ip->ip_dst.s_addr; /* * Run through list of ipfilter hooks for input packets */ if (!PFIL_HOOKED(&inet_pfil_hook)) goto passin; if (pfil_run_hooks(&inet_pfil_hook, &m, m->m_pkthdr.rcvif, PFIL_IN, NULL) || m == NULL) goto drop; M_ASSERTVALID(m); M_ASSERTPKTHDR(m); ip = mtod(m, struct ip *); /* m may have changed by pfil hook */ dest.s_addr = ip->ip_dst.s_addr; /* * Destination address changed? */ if (odest.s_addr != dest.s_addr) { /* * Is it now for a local address on this host? */ if (in_localip(dest)) goto forwardlocal; /* * Go on with new destination address */ } #ifdef IPFIREWALL_FORWARD if (m->m_flags & M_FASTFWD_OURS) { /* * ipfw changed it for a local address on this host. */ goto forwardlocal; } #endif /* IPFIREWALL_FORWARD */ passin: /* * Step 4: decrement TTL and look up route */ /* * Check TTL */ #ifdef IPSTEALTH if (!V_ipstealth) { #endif if (ip->ip_ttl <= IPTTLDEC) { icmp_error(m, ICMP_TIMXCEED, ICMP_TIMXCEED_INTRANS, 0, 0); return NULL; /* mbuf already free'd */ } /* * Decrement the TTL and incrementally change the IP header checksum. * Don't bother doing this with hw checksum offloading, it's faster * doing it right here. */ ip->ip_ttl -= IPTTLDEC; if (ip->ip_sum >= (u_int16_t) ~htons(IPTTLDEC << 8)) ip->ip_sum -= ~htons(IPTTLDEC << 8); else ip->ip_sum += htons(IPTTLDEC << 8); #ifdef IPSTEALTH } #endif /* * Find route to destination. */ if ((dst = ip_findroute(&ro, dest, m)) == NULL) return NULL; /* icmp unreach already sent */ ifp = ro.ro_rt->rt_ifp; /* * Immediately drop blackholed traffic, and directed broadcasts * for either the all-ones or all-zero subnet addresses on * locally attached networks. */ if ((ro.ro_rt->rt_flags & (RTF_BLACKHOLE|RTF_BROADCAST)) != 0) goto drop; /* * Step 5: outgoing firewall packet processing */ /* * Run through list of hooks for output packets. */ if (!PFIL_HOOKED(&inet_pfil_hook)) goto passout; if (pfil_run_hooks(&inet_pfil_hook, &m, ifp, PFIL_OUT, NULL) || m == NULL) { goto drop; } M_ASSERTVALID(m); M_ASSERTPKTHDR(m); ip = mtod(m, struct ip *); dest.s_addr = ip->ip_dst.s_addr; /* * Destination address changed? */ #ifndef IPFIREWALL_FORWARD if (odest.s_addr != dest.s_addr) { #else fwd_tag = m_tag_find(m, PACKET_TAG_IPFORWARD, NULL); if (odest.s_addr != dest.s_addr || fwd_tag != NULL) { #endif /* IPFIREWALL_FORWARD */ /* * Is it now for a local address on this host? */ #ifndef IPFIREWALL_FORWARD if (in_localip(dest)) { #else if (m->m_flags & M_FASTFWD_OURS || in_localip(dest)) { #endif /* IPFIREWALL_FORWARD */ forwardlocal: /* * Return packet for processing by ip_input(). * Keep host byte order as expected at ip_input's * "ours"-label. 
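The TTL handling above avoids recomputing the whole header checksum: it subtracts IPTTLDEC from ip_ttl and patches ip_sum incrementally, with a one's-complement carry fix-up. Below is a user-space sketch that applies the same update to a stored header and then re-verifies it with a full recomputation; the header bytes are made up and IPTTLDEC is defined locally for the sketch:

    #include <arpa/inet.h>      /* htons */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define IPTTLDEC 1

    static uint16_t
    ip_cksum(const void *buf, int len)
    {
        const uint16_t *p = buf;
        uint32_t sum = 0;

        while (len > 1) {
            sum += *p++;
            len -= 2;
        }
        while (sum >> 16)
            sum = (sum & 0xffff) + (sum >> 16);
        return (~sum & 0xffff);
    }

    int
    main(void)
    {
        uint8_t hdr[20] = {
            0x45, 0x00, 0x00, 0x54, 0x00, 0x00, 0x40, 0x00,
            0x40, 0x01, 0x00, 0x00, 0xc0, 0x00, 0x02, 0x01,
            0xc0, 0x00, 0x02, 0x02
        };
        uint16_t sum;

        /* Seed a valid checksum first (field lives at bytes 10-11). */
        sum = ip_cksum(hdr, sizeof(hdr));
        memcpy(&hdr[10], &sum, sizeof(sum));

        /*
         * Decrement TTL (byte 8) and patch the checksum incrementally,
         * mirroring the arithmetic used in ip_fastforward() above.
         */
        hdr[8] -= IPTTLDEC;
        memcpy(&sum, &hdr[10], sizeof(sum));
        if (sum >= (uint16_t)~htons(IPTTLDEC << 8))
            sum -= ~htons(IPTTLDEC << 8);
        else
            sum += htons(IPTTLDEC << 8);
        memcpy(&hdr[10], &sum, sizeof(sum));

        /* A still-valid header sums to zero again. */
        printf("residual after incremental update: 0x%04x\n",
            ip_cksum(hdr, sizeof(hdr)));
        return (0);
    }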
*/ m->m_flags |= M_FASTFWD_OURS; if (ro.ro_rt) RTFREE(ro.ro_rt); return m; } /* * Redo route lookup with new destination address */ #ifdef IPFIREWALL_FORWARD if (fwd_tag) { dest.s_addr = ((struct sockaddr_in *) (fwd_tag + 1))->sin_addr.s_addr; m_tag_delete(m, fwd_tag); } #endif /* IPFIREWALL_FORWARD */ RTFREE(ro.ro_rt); if ((dst = ip_findroute(&ro, dest, m)) == NULL) return NULL; /* icmp unreach already sent */ ifp = ro.ro_rt->rt_ifp; } passout: /* * Step 6: send off the packet */ /* * Check if route is dampned (when ARP is unable to resolve) */ if ((ro.ro_rt->rt_flags & RTF_REJECT) && (ro.ro_rt->rt_rmx.rmx_expire == 0 || time_uptime < ro.ro_rt->rt_rmx.rmx_expire)) { icmp_error(m, ICMP_UNREACH, ICMP_UNREACH_HOST, 0, 0); goto consumed; } #ifndef ALTQ /* * Check if there is enough space in the interface queue */ if ((ifp->if_snd.ifq_len + ip->ip_len / ifp->if_mtu + 1) >= ifp->if_snd.ifq_maxlen) { V_ipstat.ips_odropped++; /* would send source quench here but that is depreciated */ goto drop; } #endif /* * Check if media link state of interface is not down */ if (ifp->if_link_state == LINK_STATE_DOWN) { icmp_error(m, ICMP_UNREACH, ICMP_UNREACH_HOST, 0, 0); goto consumed; } /* * Check if packet fits MTU or if hardware will fragment for us */ if (ro.ro_rt->rt_rmx.rmx_mtu) mtu = min(ro.ro_rt->rt_rmx.rmx_mtu, ifp->if_mtu); else mtu = ifp->if_mtu; if (ip->ip_len <= mtu || (ifp->if_hwassist & CSUM_FRAGMENT && (ip->ip_off & IP_DF) == 0)) { /* * Restore packet header fields to original values */ ip->ip_len = htons(ip->ip_len); ip->ip_off = htons(ip->ip_off); /* * Send off the packet via outgoing interface */ error = (*ifp->if_output)(ifp, m, (struct sockaddr *)dst, ro.ro_rt); } else { /* * Handle EMSGSIZE with icmp reply needfrag for TCP MTU discovery */ if (ip->ip_off & IP_DF) { V_ipstat.ips_cantfrag++; icmp_error(m, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG, 0, mtu); goto consumed; } else { /* * We have to fragment the packet */ m->m_pkthdr.csum_flags |= CSUM_IP; /* * ip_fragment expects ip_len and ip_off in host byte * order but returns all packets in network byte order */ if (ip_fragment(ip, &m, mtu, ifp->if_hwassist, (~ifp->if_hwassist & CSUM_DELAY_IP))) { goto drop; } KASSERT(m != NULL, ("null mbuf and no error")); /* * Send off the fragments via outgoing interface */ error = 0; do { m0 = m->m_nextpkt; m->m_nextpkt = NULL; error = (*ifp->if_output)(ifp, m, (struct sockaddr *)dst, ro.ro_rt); if (error) break; } while ((m = m0) != NULL); if (error) { /* Reclaim remaining fragments */ for (m = m0; m; m = m0) { m0 = m->m_nextpkt; m_freem(m); } } else V_ipstat.ips_fragmented++; } } if (error != 0) V_ipstat.ips_odropped++; else { ro.ro_rt->rt_rmx.rmx_pksent++; V_ipstat.ips_forward++; V_ipstat.ips_fastforward++; } consumed: RTFREE(ro.ro_rt); return NULL; drop: if (m) m_freem(m); if (ro.ro_rt) RTFREE(ro.ro_rt); return NULL; } Index: head/sys/netinet/ip_fw2.c =================================================================== --- head/sys/netinet/ip_fw2.c (revision 186118) +++ head/sys/netinet/ip_fw2.c (revision 186119) @@ -1,4686 +1,4686 @@ /*- * Copyright (c) 2002 Luigi Rizzo, Universita` di Pisa * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #define DEB(x) #define DDB(x) x /* * Implement IP packet firewall (new version) */ #if !defined(KLD_MODULE) #include "opt_ipfw.h" #include "opt_ipdivert.h" #include "opt_ipdn.h" #include "opt_inet.h" #ifndef INET #error IPFIREWALL requires INET. #endif /* INET */ #endif #include "opt_inet6.h" #include "opt_ipsec.h" #include "opt_mac.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #define IPFW_INTERNAL /* Access to protected data structures in ip_fw.h. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef INET6 #include #endif #include /* XXX for ETHERTYPE_IP */ #include /* XXX for in_cksum */ #include #ifndef VIMAGE #ifndef VIMAGE_GLOBALS struct vnet_ipfw vnet_ipfw_0; #endif #endif /* * set_disable contains one bit per set value (0..31). * If the bit is set, all rules with the corresponding set * are disabled. Set RESVD_SET(31) is reserved for the default rule * and rules that are not deleted by the flush command, * and CANNOT be disabled. * Rules in set RESVD_SET can only be deleted explicitly. */ #ifdef VIMAGE_GLOBALS static u_int32_t set_disable; static int fw_verbose; static struct callout ipfw_timeout; static int verbose_limit; #endif static uma_zone_t ipfw_dyn_rule_zone; /* * Data structure to cache our ucred related * information. This structure only gets used if * the user specified UID/GID based constraints in * a firewall rule. 
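The set_disable word described above packs the disabled/enabled state of rule sets 0..31 into one 32-bit value, with set 31 (RESVD_SET) never disabled. A tiny sketch of the bit operations that implies; the helper functions are hypothetical, ipfw manipulates the word directly:

    #include <stdint.h>
    #include <stdio.h>

    #define RESVD_SET   31

    /* Is a given rule set currently disabled? */
    static int
    set_is_disabled(uint32_t set_disable, unsigned set)
    {
        return ((set_disable & (1u << set)) != 0);
    }

    /* Disable a set; the reserved set 31 cannot be disabled. */
    static uint32_t
    disable_set(uint32_t set_disable, unsigned set)
    {
        if (set == RESVD_SET)
            return (set_disable);
        return (set_disable | (1u << set));
    }

    int
    main(void)
    {
        uint32_t sd = 0;

        sd = disable_set(sd, 5);
        sd = disable_set(sd, RESVD_SET);    /* silently refused */
        printf("set 5 disabled: %d, set 31 disabled: %d\n",
            set_is_disabled(sd, 5), set_is_disabled(sd, RESVD_SET));
        return (0);
    }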
*/ struct ip_fw_ugid { gid_t fw_groups[NGROUPS]; int fw_ngroups; uid_t fw_uid; int fw_prid; }; /* * list of rules for layer 3 */ #ifdef VIMAGE_GLOBALS struct ip_fw_chain layer3_chain; #endif MALLOC_DEFINE(M_IPFW, "IpFw/IpAcct", "IpFw/IpAcct chain's"); MALLOC_DEFINE(M_IPFW_TBL, "ipfw_tbl", "IpFw tables"); #define IPFW_NAT_LOADED (ipfw_nat_ptr != NULL) ipfw_nat_t *ipfw_nat_ptr = NULL; ipfw_nat_cfg_t *ipfw_nat_cfg_ptr; ipfw_nat_cfg_t *ipfw_nat_del_ptr; ipfw_nat_cfg_t *ipfw_nat_get_cfg_ptr; ipfw_nat_cfg_t *ipfw_nat_get_log_ptr; struct table_entry { struct radix_node rn[2]; struct sockaddr_in addr, mask; u_int32_t value; }; #ifdef VIMAGE_GLOBALS static int fw_debug; static int autoinc_step; #endif extern int ipfw_chg_hook(SYSCTL_HANDLER_ARGS); #ifdef SYSCTL_NODE SYSCTL_NODE(_net_inet_ip, OID_AUTO, fw, CTLFLAG_RW, 0, "Firewall"); SYSCTL_V_PROC(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, enable, CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_SECURE3, fw_enable, 0, ipfw_chg_hook, "I", "Enable ipfw"); SYSCTL_V_INT(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, autoinc_step, CTLFLAG_RW, autoinc_step, 0, "Rule number autincrement step"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip_fw, OID_AUTO, one_pass, CTLFLAG_RW | CTLFLAG_SECURE3, fw_one_pass, 0, "Only do a single pass through ipfw when using dummynet(4)"); SYSCTL_V_INT(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, debug, CTLFLAG_RW, fw_debug, 0, "Enable printing of debug ip_fw statements"); SYSCTL_V_INT(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, verbose, CTLFLAG_RW | CTLFLAG_SECURE3, fw_verbose, 0, "Log matches to ipfw rules"); SYSCTL_V_INT(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, verbose_limit, CTLFLAG_RW, verbose_limit, 0, "Set upper limit of matches of ipfw rules logged"); SYSCTL_UINT(_net_inet_ip_fw, OID_AUTO, default_rule, CTLFLAG_RD, NULL, IPFW_DEFAULT_RULE, "The default/max possible rule number."); SYSCTL_UINT(_net_inet_ip_fw, OID_AUTO, tables_max, CTLFLAG_RD, NULL, IPFW_TABLES_MAX, "The maximum number of tables."); /* * Description of dynamic rules. * * Dynamic rules are stored in lists accessed through a hash table * (ipfw_dyn_v) whose size is curr_dyn_buckets. This value can * be modified through the sysctl variable dyn_buckets which is * updated when the table becomes empty. * * XXX currently there is only one list, ipfw_dyn. * * When a packet is received, its address fields are first masked * with the mask defined for the rule, then hashed, then matched * against the entries in the corresponding list. * Dynamic rules can be used for different purposes: * + stateful rules; * + enforcing limits on the number of sessions; * + in-kernel NAT (not implemented yet) * * The lifetime of dynamic rules is regulated by dyn_*_lifetime, * measured in seconds and depending on the flags. * * The total number of dynamic rules is stored in dyn_count. * The max number of dynamic rules is dyn_max. When we reach * the maximum number of rules we do not create anymore. This is * done to avoid consuming too much memory, but also too much * time when searching on each packet (ideally, we should try instead * to put a limit on the length of the list on each bucket...). * * Each dynamic rule holds a pointer to the parent ipfw rule so * we know what action to perform. Dynamic rules are removed when * the parent rule is deleted. XXX we should make them survive. * * There are some limitations with dynamic rules -- we do not * obey the 'randomized match', and we do not do multiple * passes through the firewall. XXX check the latter!!! 
*/ #ifdef VIMAGE_GLOBALS static ipfw_dyn_rule **ipfw_dyn_v; static u_int32_t dyn_buckets; static u_int32_t curr_dyn_buckets; #endif static struct mtx ipfw_dyn_mtx; /* mutex guarding dynamic rules */ #define IPFW_DYN_LOCK_INIT() \ mtx_init(&ipfw_dyn_mtx, "IPFW dynamic rules", NULL, MTX_DEF) #define IPFW_DYN_LOCK_DESTROY() mtx_destroy(&ipfw_dyn_mtx) #define IPFW_DYN_LOCK() mtx_lock(&ipfw_dyn_mtx) #define IPFW_DYN_UNLOCK() mtx_unlock(&ipfw_dyn_mtx) #define IPFW_DYN_LOCK_ASSERT() mtx_assert(&ipfw_dyn_mtx, MA_OWNED) /* * Timeouts for various events in handing dynamic rules. */ #ifdef VIMAGE_GLOBALS static u_int32_t dyn_ack_lifetime; static u_int32_t dyn_syn_lifetime; static u_int32_t dyn_fin_lifetime; static u_int32_t dyn_rst_lifetime; static u_int32_t dyn_udp_lifetime; static u_int32_t dyn_short_lifetime; /* * Keepalives are sent if dyn_keepalive is set. They are sent every * dyn_keepalive_period seconds, in the last dyn_keepalive_interval * seconds of lifetime of a rule. * dyn_rst_lifetime and dyn_fin_lifetime should be strictly lower * than dyn_keepalive_period. */ static u_int32_t dyn_keepalive_interval; static u_int32_t dyn_keepalive_period; static u_int32_t dyn_keepalive; static u_int32_t static_count; /* # of static rules */ static u_int32_t static_len; /* size in bytes of static rules */ static u_int32_t dyn_count; /* # of dynamic rules */ static u_int32_t dyn_max; /* max # of dynamic rules */ #endif /* VIMAGE_GLOBALS */ SYSCTL_V_INT(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, dyn_buckets, CTLFLAG_RW, dyn_buckets, 0, "Number of dyn. buckets"); SYSCTL_V_INT(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, curr_dyn_buckets, CTLFLAG_RD, curr_dyn_buckets, 0, "Current Number of dyn. buckets"); SYSCTL_V_INT(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, dyn_count, CTLFLAG_RD, dyn_count, 0, "Number of dyn. rules"); SYSCTL_V_INT(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, dyn_max, CTLFLAG_RW, dyn_max, 0, "Max number of dyn. rules"); SYSCTL_V_INT(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, static_count, CTLFLAG_RD, static_count, 0, "Number of static rules"); SYSCTL_V_INT(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, dyn_ack_lifetime, CTLFLAG_RW, dyn_ack_lifetime, 0, "Lifetime of dyn. rules for acks"); SYSCTL_V_INT(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, dyn_syn_lifetime, CTLFLAG_RW, dyn_syn_lifetime, 0, "Lifetime of dyn. rules for syn"); SYSCTL_V_INT(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, dyn_fin_lifetime, CTLFLAG_RW, dyn_fin_lifetime, 0, "Lifetime of dyn. rules for fin"); SYSCTL_V_INT(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, dyn_rst_lifetime, CTLFLAG_RW, dyn_rst_lifetime, 0, "Lifetime of dyn. rules for rst"); SYSCTL_V_INT(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, dyn_udp_lifetime, CTLFLAG_RW, dyn_udp_lifetime, 0, "Lifetime of dyn. rules for UDP"); SYSCTL_V_INT(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, dyn_short_lifetime, CTLFLAG_RW, dyn_short_lifetime, 0, "Lifetime of dyn. rules for other situations"); SYSCTL_V_INT(V_NET, vnet_ipfw, _net_inet_ip_fw, OID_AUTO, dyn_keepalive, CTLFLAG_RW, dyn_keepalive, 0, "Enable keepalives for dyn. 
rules"); #ifdef INET6 /* * IPv6 specific variables */ SYSCTL_DECL(_net_inet6_ip6); static struct sysctl_ctx_list ip6_fw_sysctl_ctx; static struct sysctl_oid *ip6_fw_sysctl_tree; #endif /* INET6 */ #endif /* SYSCTL_NODE */ #ifdef VIMAGE_GLOBALS static int fw_deny_unknown_exthdrs; #endif /* * L3HDR maps an ipv4 pointer into a layer3 header pointer of type T * Other macros just cast void * into the appropriate type */ #define L3HDR(T, ip) ((T *)((u_int32_t *)(ip) + (ip)->ip_hl)) #define TCP(p) ((struct tcphdr *)(p)) #define SCTP(p) ((struct sctphdr *)(p)) #define UDP(p) ((struct udphdr *)(p)) #define ICMP(p) ((struct icmphdr *)(p)) #define ICMP6(p) ((struct icmp6_hdr *)(p)) static __inline int icmptype_match(struct icmphdr *icmp, ipfw_insn_u32 *cmd) { int type = icmp->icmp_type; return (type <= ICMP_MAXTYPE && (cmd->d[0] & (1<icmp_type; return (type <= ICMP_MAXTYPE && (TT & (1<arg1 or cmd->d[0]. * * We scan options and store the bits we find set. We succeed if * * (want_set & ~bits) == 0 && (want_clear & ~bits) == want_clear * * The code is sometimes optimized not to store additional variables. */ static int flags_match(ipfw_insn *cmd, u_int8_t bits) { u_char want_clear; bits = ~bits; if ( ((cmd->arg1 & 0xff) & bits) != 0) return 0; /* some bits we want set were clear */ want_clear = (cmd->arg1 >> 8) & 0xff; if ( (want_clear & bits) != want_clear) return 0; /* some bits we want clear were set */ return 1; } static int ipopts_match(struct ip *ip, ipfw_insn *cmd) { int optlen, bits = 0; u_char *cp = (u_char *)(ip + 1); int x = (ip->ip_hl << 2) - sizeof (struct ip); for (; x > 0; x -= optlen, cp += optlen) { int opt = cp[IPOPT_OPTVAL]; if (opt == IPOPT_EOL) break; if (opt == IPOPT_NOP) optlen = 1; else { optlen = cp[IPOPT_OLEN]; if (optlen <= 0 || optlen > x) return 0; /* invalid or truncated */ } switch (opt) { default: break; case IPOPT_LSRR: bits |= IP_FW_IPOPT_LSRR; break; case IPOPT_SSRR: bits |= IP_FW_IPOPT_SSRR; break; case IPOPT_RR: bits |= IP_FW_IPOPT_RR; break; case IPOPT_TS: bits |= IP_FW_IPOPT_TS; break; } } return (flags_match(cmd, bits)); } static int tcpopts_match(struct tcphdr *tcp, ipfw_insn *cmd) { int optlen, bits = 0; u_char *cp = (u_char *)(tcp + 1); int x = (tcp->th_off << 2) - sizeof(struct tcphdr); for (; x > 0; x -= optlen, cp += optlen) { int opt = cp[0]; if (opt == TCPOPT_EOL) break; if (opt == TCPOPT_NOP) optlen = 1; else { optlen = cp[1]; if (optlen <= 0) break; } switch (opt) { default: break; case TCPOPT_MAXSEG: bits |= IP_FW_TCPOPT_MSS; break; case TCPOPT_WINDOW: bits |= IP_FW_TCPOPT_WINDOW; break; case TCPOPT_SACK_PERMITTED: case TCPOPT_SACK: bits |= IP_FW_TCPOPT_SACK; break; case TCPOPT_TIMESTAMP: bits |= IP_FW_TCPOPT_TS; break; } } return (flags_match(cmd, bits)); } static int iface_match(struct ifnet *ifp, ipfw_insn_if *cmd) { if (ifp == NULL) /* no iface with this packet, match fails */ return 0; /* Check by name or by IP address */ if (cmd->name[0] != '\0') { /* match by name */ /* Check name */ if (cmd->p.glob) { if (fnmatch(cmd->name, ifp->if_xname, 0) == 0) return(1); } else { if (strncmp(ifp->if_xname, cmd->name, IFNAMSIZ) == 0) return(1); } } else { struct ifaddr *ia; /* XXX lock? */ TAILQ_FOREACH(ia, &ifp->if_addrhead, ifa_link) { if (ia->ifa_addr->sa_family != AF_INET) continue; if (cmd->p.ip.s_addr == ((struct sockaddr_in *) (ia->ifa_addr))->sin_addr.s_addr) return(1); /* match */ } } return(0); /* no match, fail ... */ } /* * The verify_path function checks if a route to the src exists and * if it is reachable via ifp (when provided). 
* * The 'verrevpath' option checks that the interface that an IP packet * arrives on is the same interface that traffic destined for the * packet's source address would be routed out of. The 'versrcreach' * option just checks that the source address is reachable via any route * (except default) in the routing table. These two are a measure to block * forged packets. This is also commonly known as "anti-spoofing" or Unicast * Reverse Path Forwarding (Unicast RFP) in Cisco-ese. The name of the knobs * is purposely reminiscent of the Cisco IOS command, * * ip verify unicast reverse-path * ip verify unicast source reachable-via any * * which implements the same functionality. But note that syntax is * misleading. The check may be performed on all IP packets whether unicast, * multicast, or broadcast. */ static int verify_path(struct in_addr src, struct ifnet *ifp, u_int fib) { struct route ro; struct sockaddr_in *dst; bzero(&ro, sizeof(ro)); dst = (struct sockaddr_in *)&(ro.ro_dst); dst->sin_family = AF_INET; dst->sin_len = sizeof(*dst); dst->sin_addr = src; - in_rtalloc_ign(&ro, RTF_CLONING, fib); + in_rtalloc_ign(&ro, 0, fib); if (ro.ro_rt == NULL) return 0; /* * If ifp is provided, check for equality with rtentry. * We should use rt->rt_ifa->ifa_ifp, instead of rt->rt_ifp, * in order to pass packets injected back by if_simloop(): * if useloopback == 1 routing entry (via lo0) for our own address * may exist, so we need to handle routing assymetry. */ if (ifp != NULL && ro.ro_rt->rt_ifa->ifa_ifp != ifp) { RTFREE(ro.ro_rt); return 0; } /* if no ifp provided, check if rtentry is not default route */ if (ifp == NULL && satosin(rt_key(ro.ro_rt))->sin_addr.s_addr == INADDR_ANY) { RTFREE(ro.ro_rt); return 0; } /* or if this is a blackhole/reject route */ if (ifp == NULL && ro.ro_rt->rt_flags & (RTF_REJECT|RTF_BLACKHOLE)) { RTFREE(ro.ro_rt); return 0; } /* found valid route */ RTFREE(ro.ro_rt); return 1; } #ifdef INET6 /* * ipv6 specific rules here... */ static __inline int icmp6type_match (int type, ipfw_insn_u32 *cmd) { return (type <= ICMP6_MAXTYPE && (cmd->d[type/32] & (1<<(type%32)) ) ); } static int flow6id_match( int curr_flow, ipfw_insn_u32 *cmd ) { int i; for (i=0; i <= cmd->o.arg1; ++i ) if (curr_flow == cmd->d[i] ) return 1; return 0; } /* support for IP6_*_ME opcodes */ static int search_ip6_addr_net (struct in6_addr * ip6_addr) { INIT_VNET_NET(curvnet); struct ifnet *mdc; struct ifaddr *mdc2; struct in6_ifaddr *fdm; struct in6_addr copia; TAILQ_FOREACH(mdc, &V_ifnet, if_link) TAILQ_FOREACH(mdc2, &mdc->if_addrlist, ifa_list) { if (mdc2->ifa_addr->sa_family == AF_INET6) { fdm = (struct in6_ifaddr *)mdc2; copia = fdm->ia_addr.sin6_addr; /* need for leaving scope_id in the sock_addr */ in6_clearscope(&copia); if (IN6_ARE_ADDR_EQUAL(ip6_addr, &copia)) return 1; } } return 0; } static int verify_path6(struct in6_addr *src, struct ifnet *ifp) { struct route_in6 ro; struct sockaddr_in6 *dst; bzero(&ro, sizeof(ro)); dst = (struct sockaddr_in6 * )&(ro.ro_dst); dst->sin6_family = AF_INET6; dst->sin6_len = sizeof(*dst); dst->sin6_addr = *src; /* XXX MRT 0 for ipv6 at this time */ - rtalloc_ign((struct route *)&ro, RTF_CLONING); + rtalloc_ign((struct route *)&ro, 0); if (ro.ro_rt == NULL) return 0; /* * if ifp is provided, check for equality with rtentry * We should use rt->rt_ifa->ifa_ifp, instead of rt->rt_ifp, * to support the case of sending packets to an address of our own. 
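icmp6type_match() above tests membership of an ICMPv6 type in a bitmap spread across 32-bit words, d[type/32] & (1 << (type%32)). A short sketch of building and querying such a bitmap; the ICMP6_* constants are defined locally here (in the kernel they come from netinet/icmp6.h) and the example type is echo request:

    #include <stdint.h>
    #include <stdio.h>

    #define ICMP6_MAXTYPE       201     /* local value for the sketch */
    #define ICMP6_ECHO_REQUEST  128

    /* One bit per ICMPv6 type. */
    static uint32_t typemap[(ICMP6_MAXTYPE + 31) / 32 + 1];

    static void
    type_allow(int type)
    {
        typemap[type / 32] |= 1u << (type % 32);
    }

    /* Same membership test as icmp6type_match(). */
    static int
    type_match(int type)
    {
        return (type <= ICMP6_MAXTYPE &&
            (typemap[type / 32] & (1u << (type % 32))) != 0);
    }

    int
    main(void)
    {
        type_allow(ICMP6_ECHO_REQUEST);
        printf("echo request: %d, type 1: %d\n",
            type_match(ICMP6_ECHO_REQUEST), type_match(1));
        return (0);
    }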
* (where the former interface is the first argument of if_simloop() * (=ifp), the latter is lo0) */ if (ifp != NULL && ro.ro_rt->rt_ifa->ifa_ifp != ifp) { RTFREE(ro.ro_rt); return 0; } /* if no ifp provided, check if rtentry is not default route */ if (ifp == NULL && IN6_IS_ADDR_UNSPECIFIED(&satosin6(rt_key(ro.ro_rt))->sin6_addr)) { RTFREE(ro.ro_rt); return 0; } /* or if this is a blackhole/reject route */ if (ifp == NULL && ro.ro_rt->rt_flags & (RTF_REJECT|RTF_BLACKHOLE)) { RTFREE(ro.ro_rt); return 0; } /* found valid route */ RTFREE(ro.ro_rt); return 1; } static __inline int hash_packet6(struct ipfw_flow_id *id) { u_int32_t i; i = (id->dst_ip6.__u6_addr.__u6_addr32[2]) ^ (id->dst_ip6.__u6_addr.__u6_addr32[3]) ^ (id->src_ip6.__u6_addr.__u6_addr32[2]) ^ (id->src_ip6.__u6_addr.__u6_addr32[3]) ^ (id->dst_port) ^ (id->src_port); return i; } static int is_icmp6_query(int icmp6_type) { if ((icmp6_type <= ICMP6_MAXTYPE) && (icmp6_type == ICMP6_ECHO_REQUEST || icmp6_type == ICMP6_MEMBERSHIP_QUERY || icmp6_type == ICMP6_WRUREQUEST || icmp6_type == ICMP6_FQDN_QUERY || icmp6_type == ICMP6_NI_QUERY)) return (1); return (0); } static void send_reject6(struct ip_fw_args *args, int code, u_int hlen, struct ip6_hdr *ip6) { struct mbuf *m; m = args->m; if (code == ICMP6_UNREACH_RST && args->f_id.proto == IPPROTO_TCP) { struct tcphdr *tcp; tcp_seq ack, seq; int flags; struct { struct ip6_hdr ip6; struct tcphdr th; } ti; tcp = (struct tcphdr *)((char *)ip6 + hlen); if ((tcp->th_flags & TH_RST) != 0) { m_freem(m); args->m = NULL; return; } ti.ip6 = *ip6; ti.th = *tcp; ti.th.th_seq = ntohl(ti.th.th_seq); ti.th.th_ack = ntohl(ti.th.th_ack); ti.ip6.ip6_nxt = IPPROTO_TCP; if (ti.th.th_flags & TH_ACK) { ack = 0; seq = ti.th.th_ack; flags = TH_RST; } else { ack = ti.th.th_seq; if ((m->m_flags & M_PKTHDR) != 0) { /* * total new data to ACK is: * total packet length, * minus the header length, * minus the tcp header length. */ ack += m->m_pkthdr.len - hlen - (ti.th.th_off << 2); } else if (ip6->ip6_plen) { ack += ntohs(ip6->ip6_plen) + sizeof(*ip6) - hlen - (ti.th.th_off << 2); } else { m_freem(m); return; } if (tcp->th_flags & TH_SYN) ack++; seq = 0; flags = TH_RST|TH_ACK; } bcopy(&ti, ip6, sizeof(ti)); /* * m is only used to recycle the mbuf * The data in it is never read so we don't need * to correct the offsets or anything */ tcp_respond(NULL, ip6, tcp, m, ack, seq, flags); } else if (code != ICMP6_UNREACH_RST) { /* Send an ICMPv6 unreach. */ #if 0 /* * Unlike above, the mbufs need to line up with the ip6 hdr, * as the contents are read. We need to m_adj() the * needed amount. * The mbuf will however be thrown away so we can adjust it. * Remember we did an m_pullup on it already so we * can make some assumptions about contiguousness. */ if (args->L3offset) m_adj(m, args->L3offset); #endif icmp6_error(m, ICMP6_DST_UNREACH, code, 0); } else m_freem(m); args->m = NULL; } #endif /* INET6 */ #ifdef VIMAGE_GLOBALS static u_int64_t norule_counter; /* counter for ipfw_log(NULL...) */ #endif #define SNPARGS(buf, len) buf + len, sizeof(buf) > len ? sizeof(buf) - len : 0 #define SNP(buf) buf, sizeof(buf) /* * We enter here when we have a rule with O_LOG. * XXX this function alone takes about 2Kbytes of code! 
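The SNPARGS() macro defined just below lets ipfw_log() append successive snprintf() pieces into one fixed buffer without overrunning it: each call receives the buffer tail plus the space remaining. A minimal user-space use of the same pattern; the addresses and ports are made-up values analogous to the TCP case in ipfw_log():

    #include <stdio.h>

    /* Same idea as the SNPARGS() macro: buffer tail plus space left. */
    #define SNPARGS(buf, len) \
        (buf) + (len), sizeof(buf) > (len) ? sizeof(buf) - (len) : 0

    int
    main(void)
    {
        char proto[128];
        int len;

        len = snprintf(SNPARGS(proto, 0), "TCP %s", "192.0.2.1");
        snprintf(SNPARGS(proto, len), ":%d %s:%d", 1234, "198.51.100.2", 80);
        printf("%s\n", proto);
        return (0);
    }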
*/ static void ipfw_log(struct ip_fw *f, u_int hlen, struct ip_fw_args *args, struct mbuf *m, struct ifnet *oif, u_short offset, uint32_t tablearg, struct ip *ip) { INIT_VNET_IPFW(curvnet); struct ether_header *eh = args->eh; char *action; int limit_reached = 0; char action2[40], proto[128], fragment[32]; fragment[0] = '\0'; proto[0] = '\0'; if (f == NULL) { /* bogus pkt */ if (V_verbose_limit != 0 && V_norule_counter >= V_verbose_limit) return; V_norule_counter++; if (V_norule_counter == V_verbose_limit) limit_reached = V_verbose_limit; action = "Refuse"; } else { /* O_LOG is the first action, find the real one */ ipfw_insn *cmd = ACTION_PTR(f); ipfw_insn_log *l = (ipfw_insn_log *)cmd; if (l->max_log != 0 && l->log_left == 0) return; l->log_left--; if (l->log_left == 0) limit_reached = l->max_log; cmd += F_LEN(cmd); /* point to first action */ if (cmd->opcode == O_ALTQ) { ipfw_insn_altq *altq = (ipfw_insn_altq *)cmd; snprintf(SNPARGS(action2, 0), "Altq %d", altq->qid); cmd += F_LEN(cmd); } if (cmd->opcode == O_PROB) cmd += F_LEN(cmd); if (cmd->opcode == O_TAG) cmd += F_LEN(cmd); action = action2; switch (cmd->opcode) { case O_DENY: action = "Deny"; break; case O_REJECT: if (cmd->arg1==ICMP_REJECT_RST) action = "Reset"; else if (cmd->arg1==ICMP_UNREACH_HOST) action = "Reject"; else snprintf(SNPARGS(action2, 0), "Unreach %d", cmd->arg1); break; case O_UNREACH6: if (cmd->arg1==ICMP6_UNREACH_RST) action = "Reset"; else snprintf(SNPARGS(action2, 0), "Unreach %d", cmd->arg1); break; case O_ACCEPT: action = "Accept"; break; case O_COUNT: action = "Count"; break; case O_DIVERT: snprintf(SNPARGS(action2, 0), "Divert %d", cmd->arg1); break; case O_TEE: snprintf(SNPARGS(action2, 0), "Tee %d", cmd->arg1); break; case O_SETFIB: snprintf(SNPARGS(action2, 0), "SetFib %d", cmd->arg1); break; case O_SKIPTO: snprintf(SNPARGS(action2, 0), "SkipTo %d", cmd->arg1); break; case O_PIPE: snprintf(SNPARGS(action2, 0), "Pipe %d", cmd->arg1); break; case O_QUEUE: snprintf(SNPARGS(action2, 0), "Queue %d", cmd->arg1); break; case O_FORWARD_IP: { ipfw_insn_sa *sa = (ipfw_insn_sa *)cmd; int len; struct in_addr dummyaddr; if (sa->sa.sin_addr.s_addr == INADDR_ANY) dummyaddr.s_addr = htonl(tablearg); else dummyaddr.s_addr = sa->sa.sin_addr.s_addr; len = snprintf(SNPARGS(action2, 0), "Forward to %s", inet_ntoa(dummyaddr)); if (sa->sa.sin_port) snprintf(SNPARGS(action2, len), ":%d", sa->sa.sin_port); } break; case O_NETGRAPH: snprintf(SNPARGS(action2, 0), "Netgraph %d", cmd->arg1); break; case O_NGTEE: snprintf(SNPARGS(action2, 0), "Ngtee %d", cmd->arg1); break; case O_NAT: action = "Nat"; break; default: action = "UNKNOWN"; break; } } if (hlen == 0) { /* non-ip */ snprintf(SNPARGS(proto, 0), "MAC"); } else { int len; char src[48], dst[48]; struct icmphdr *icmp; struct tcphdr *tcp; struct udphdr *udp; #ifdef INET6 struct ip6_hdr *ip6 = NULL; struct icmp6_hdr *icmp6; #endif src[0] = '\0'; dst[0] = '\0'; #ifdef INET6 if (IS_IP6_FLOW_ID(&(args->f_id))) { char ip6buf[INET6_ADDRSTRLEN]; snprintf(src, sizeof(src), "[%s]", ip6_sprintf(ip6buf, &args->f_id.src_ip6)); snprintf(dst, sizeof(dst), "[%s]", ip6_sprintf(ip6buf, &args->f_id.dst_ip6)); ip6 = (struct ip6_hdr *)ip; tcp = (struct tcphdr *)(((char *)ip) + hlen); udp = (struct udphdr *)(((char *)ip) + hlen); } else #endif { tcp = L3HDR(struct tcphdr, ip); udp = L3HDR(struct udphdr, ip); inet_ntoa_r(ip->ip_src, src); inet_ntoa_r(ip->ip_dst, dst); } switch (args->f_id.proto) { case IPPROTO_TCP: len = snprintf(SNPARGS(proto, 0), "TCP %s", src); if (offset == 0) 
snprintf(SNPARGS(proto, len), ":%d %s:%d", ntohs(tcp->th_sport), dst, ntohs(tcp->th_dport)); else snprintf(SNPARGS(proto, len), " %s", dst); break; case IPPROTO_UDP: len = snprintf(SNPARGS(proto, 0), "UDP %s", src); if (offset == 0) snprintf(SNPARGS(proto, len), ":%d %s:%d", ntohs(udp->uh_sport), dst, ntohs(udp->uh_dport)); else snprintf(SNPARGS(proto, len), " %s", dst); break; case IPPROTO_ICMP: icmp = L3HDR(struct icmphdr, ip); if (offset == 0) len = snprintf(SNPARGS(proto, 0), "ICMP:%u.%u ", icmp->icmp_type, icmp->icmp_code); else len = snprintf(SNPARGS(proto, 0), "ICMP "); len += snprintf(SNPARGS(proto, len), "%s", src); snprintf(SNPARGS(proto, len), " %s", dst); break; #ifdef INET6 case IPPROTO_ICMPV6: icmp6 = (struct icmp6_hdr *)(((char *)ip) + hlen); if (offset == 0) len = snprintf(SNPARGS(proto, 0), "ICMPv6:%u.%u ", icmp6->icmp6_type, icmp6->icmp6_code); else len = snprintf(SNPARGS(proto, 0), "ICMPv6 "); len += snprintf(SNPARGS(proto, len), "%s", src); snprintf(SNPARGS(proto, len), " %s", dst); break; #endif default: len = snprintf(SNPARGS(proto, 0), "P:%d %s", args->f_id.proto, src); snprintf(SNPARGS(proto, len), " %s", dst); break; } #ifdef INET6 if (IS_IP6_FLOW_ID(&(args->f_id))) { if (offset & (IP6F_OFF_MASK | IP6F_MORE_FRAG)) snprintf(SNPARGS(fragment, 0), " (frag %08x:%d@%d%s)", args->f_id.frag_id6, ntohs(ip6->ip6_plen) - hlen, ntohs(offset & IP6F_OFF_MASK) << 3, (offset & IP6F_MORE_FRAG) ? "+" : ""); } else #endif { int ip_off, ip_len; if (eh != NULL) { /* layer 2 packets are as on the wire */ ip_off = ntohs(ip->ip_off); ip_len = ntohs(ip->ip_len); } else { ip_off = ip->ip_off; ip_len = ip->ip_len; } if (ip_off & (IP_MF | IP_OFFMASK)) snprintf(SNPARGS(fragment, 0), " (frag %d:%d@%d%s)", ntohs(ip->ip_id), ip_len - (ip->ip_hl << 2), offset << 3, (ip_off & IP_MF) ? "+" : ""); } } if (oif || m->m_pkthdr.rcvif) log(LOG_SECURITY | LOG_INFO, "ipfw: %d %s %s %s via %s%s\n", f ? f->rulenum : -1, action, proto, oif ? "out" : "in", oif ? oif->if_xname : m->m_pkthdr.rcvif->if_xname, fragment); else log(LOG_SECURITY | LOG_INFO, "ipfw: %d %s %s [no if info]%s\n", f ? f->rulenum : -1, action, proto, fragment); if (limit_reached) log(LOG_SECURITY | LOG_NOTICE, "ipfw: limit %d reached on entry %d\n", limit_reached, f ? f->rulenum : -1); } /* * IMPORTANT: the hash function for dynamic rules must be commutative * in source and destination (ip,port), because rules are bidirectional * and we want to find both in the same bucket. */ static __inline int hash_packet(struct ipfw_flow_id *id) { INIT_VNET_IPFW(curvnet); u_int32_t i; #ifdef INET6 if (IS_IP6_FLOW_ID(id)) i = hash_packet6(id); else #endif /* INET6 */ i = (id->dst_ip) ^ (id->src_ip) ^ (id->dst_port) ^ (id->src_port); i &= (V_curr_dyn_buckets - 1); return i; } /** * unlink a dynamic rule from a chain. prev is a pointer to * the previous one, q is a pointer to the rule to delete, * head is a pointer to the head of the queue. * Modifies q and potentially also head. */ #define UNLINK_DYN_RULE(prev, head, q) { \ ipfw_dyn_rule *old_q = q; \ \ /* remove a refcount to the parent */ \ if (q->dyn_type == O_LIMIT) \ q->parent->count--; \ DEB(printf("ipfw: unlink entry 0x%08x %d -> 0x%08x %d, %d left\n",\ (q->id.src_ip), (q->id.src_port), \ (q->id.dst_ip), (q->id.dst_port), V_dyn_count-1 ); ) \ if (prev != NULL) \ prev->next = q = q->next; \ else \ head = q = q->next; \ V_dyn_count--; \ uma_zfree(ipfw_dyn_rule_zone, old_q); } #define TIME_LEQ(a,b) ((int)((a)-(b)) <= 0) /** * Remove dynamic rules pointing to "rule", or all of them if rule == NULL. 
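 *
 * Removal is done in two passes: O_LIMIT states are unlinked first,
 * each one dropping the refcount of its O_LIMIT_PARENT, and the
 * parent entries themselves are only reclaimed in a second pass once
 * their count has returned to zero.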
* * If keep_me == NULL, rules are deleted even if not expired, * otherwise only expired rules are removed. * * The value of the second parameter is also used to point to identify * a rule we absolutely do not want to remove (e.g. because we are * holding a reference to it -- this is the case with O_LIMIT_PARENT * rules). The pointer is only used for comparison, so any non-null * value will do. */ static void remove_dyn_rule(struct ip_fw *rule, ipfw_dyn_rule *keep_me) { INIT_VNET_IPFW(curvnet); static u_int32_t last_remove = 0; #define FORCE (keep_me == NULL) ipfw_dyn_rule *prev, *q; int i, pass = 0, max_pass = 0; IPFW_DYN_LOCK_ASSERT(); if (V_ipfw_dyn_v == NULL || V_dyn_count == 0) return; /* do not expire more than once per second, it is useless */ if (!FORCE && last_remove == time_uptime) return; last_remove = time_uptime; /* * because O_LIMIT refer to parent rules, during the first pass only * remove child and mark any pending LIMIT_PARENT, and remove * them in a second pass. */ next_pass: for (i = 0 ; i < V_curr_dyn_buckets ; i++) { for (prev=NULL, q = V_ipfw_dyn_v[i] ; q ; ) { /* * Logic can become complex here, so we split tests. */ if (q == keep_me) goto next; if (rule != NULL && rule != q->rule) goto next; /* not the one we are looking for */ if (q->dyn_type == O_LIMIT_PARENT) { /* * handle parent in the second pass, * record we need one. */ max_pass = 1; if (pass == 0) goto next; if (FORCE && q->count != 0 ) { /* XXX should not happen! */ printf("ipfw: OUCH! cannot remove rule," " count %d\n", q->count); } } else { if (!FORCE && !TIME_LEQ( q->expire, time_uptime )) goto next; } if (q->dyn_type != O_LIMIT_PARENT || !q->count) { UNLINK_DYN_RULE(prev, V_ipfw_dyn_v[i], q); continue; } next: prev=q; q=q->next; } } if (pass++ < max_pass) goto next_pass; } /** * lookup a dynamic rule. */ static ipfw_dyn_rule * lookup_dyn_rule_locked(struct ipfw_flow_id *pkt, int *match_direction, struct tcphdr *tcp) { INIT_VNET_IPFW(curvnet); /* * stateful ipfw extensions. 
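 * A stored flow can be matched either in the direction in which its
 * state was installed (MATCH_FORWARD) or with source and destination
 * swapped (MATCH_REVERSE); both cases land in the same bucket because
 * hash_packet() is commutative in the address/port pairs.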
* Lookup into dynamic session queue */ #define MATCH_REVERSE 0 #define MATCH_FORWARD 1 #define MATCH_NONE 2 #define MATCH_UNKNOWN 3 int i, dir = MATCH_NONE; ipfw_dyn_rule *prev, *q=NULL; IPFW_DYN_LOCK_ASSERT(); if (V_ipfw_dyn_v == NULL) goto done; /* not found */ i = hash_packet( pkt ); for (prev=NULL, q = V_ipfw_dyn_v[i] ; q != NULL ; ) { if (q->dyn_type == O_LIMIT_PARENT && q->count) goto next; if (TIME_LEQ( q->expire, time_uptime)) { /* expire entry */ UNLINK_DYN_RULE(prev, V_ipfw_dyn_v[i], q); continue; } if (pkt->proto == q->id.proto && q->dyn_type != O_LIMIT_PARENT) { if (IS_IP6_FLOW_ID(pkt)) { if (IN6_ARE_ADDR_EQUAL(&(pkt->src_ip6), &(q->id.src_ip6)) && IN6_ARE_ADDR_EQUAL(&(pkt->dst_ip6), &(q->id.dst_ip6)) && pkt->src_port == q->id.src_port && pkt->dst_port == q->id.dst_port ) { dir = MATCH_FORWARD; break; } if (IN6_ARE_ADDR_EQUAL(&(pkt->src_ip6), &(q->id.dst_ip6)) && IN6_ARE_ADDR_EQUAL(&(pkt->dst_ip6), &(q->id.src_ip6)) && pkt->src_port == q->id.dst_port && pkt->dst_port == q->id.src_port ) { dir = MATCH_REVERSE; break; } } else { if (pkt->src_ip == q->id.src_ip && pkt->dst_ip == q->id.dst_ip && pkt->src_port == q->id.src_port && pkt->dst_port == q->id.dst_port ) { dir = MATCH_FORWARD; break; } if (pkt->src_ip == q->id.dst_ip && pkt->dst_ip == q->id.src_ip && pkt->src_port == q->id.dst_port && pkt->dst_port == q->id.src_port ) { dir = MATCH_REVERSE; break; } } } next: prev = q; q = q->next; } if (q == NULL) goto done; /* q = NULL, not found */ if ( prev != NULL) { /* found and not in front */ prev->next = q->next; q->next = V_ipfw_dyn_v[i]; V_ipfw_dyn_v[i] = q; } if (pkt->proto == IPPROTO_TCP) { /* update state according to flags */ u_char flags = pkt->flags & (TH_FIN|TH_SYN|TH_RST); #define BOTH_SYN (TH_SYN | (TH_SYN << 8)) #define BOTH_FIN (TH_FIN | (TH_FIN << 8)) q->state |= (dir == MATCH_FORWARD ) ? flags : (flags << 8); switch (q->state) { case TH_SYN: /* opening */ q->expire = time_uptime + V_dyn_syn_lifetime; break; case BOTH_SYN: /* move to established */ case BOTH_SYN | TH_FIN : /* one side tries to close */ case BOTH_SYN | (TH_FIN << 8) : if (tcp) { #define _SEQ_GE(a,b) ((int)(a) - (int)(b) >= 0) u_int32_t ack = ntohl(tcp->th_ack); if (dir == MATCH_FORWARD) { if (q->ack_fwd == 0 || _SEQ_GE(ack, q->ack_fwd)) q->ack_fwd = ack; else { /* ignore out-of-sequence */ break; } } else { if (q->ack_rev == 0 || _SEQ_GE(ack, q->ack_rev)) q->ack_rev = ack; else { /* ignore out-of-sequence */ break; } } } q->expire = time_uptime + V_dyn_ack_lifetime; break; case BOTH_SYN | BOTH_FIN: /* both sides closed */ if (V_dyn_fin_lifetime >= V_dyn_keepalive_period) V_dyn_fin_lifetime = V_dyn_keepalive_period - 1; q->expire = time_uptime + V_dyn_fin_lifetime; break; default: #if 0 /* * reset or some invalid combination, but can also * occur if we use keep-state the wrong way. 
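 * Flags seen in the forward direction are ORed into the low byte of
 * q->state and reverse-direction flags into the high byte, which is
 * why the test below looks at both TH_RST and TH_RST << 8.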
*/ if ( (q->state & ((TH_RST << 8)|TH_RST)) == 0) printf("invalid state: 0x%x\n", q->state); #endif if (V_dyn_rst_lifetime >= V_dyn_keepalive_period) V_dyn_rst_lifetime = V_dyn_keepalive_period - 1; q->expire = time_uptime + V_dyn_rst_lifetime; break; } } else if (pkt->proto == IPPROTO_UDP) { q->expire = time_uptime + V_dyn_udp_lifetime; } else { /* other protocols */ q->expire = time_uptime + V_dyn_short_lifetime; } done: if (match_direction) *match_direction = dir; return q; } static ipfw_dyn_rule * lookup_dyn_rule(struct ipfw_flow_id *pkt, int *match_direction, struct tcphdr *tcp) { ipfw_dyn_rule *q; IPFW_DYN_LOCK(); q = lookup_dyn_rule_locked(pkt, match_direction, tcp); if (q == NULL) IPFW_DYN_UNLOCK(); /* NB: return table locked when q is not NULL */ return q; } static void realloc_dynamic_table(void) { INIT_VNET_IPFW(curvnet); IPFW_DYN_LOCK_ASSERT(); /* * Try reallocation, make sure we have a power of 2 and do * not allow more than 64k entries. In case of overflow, * default to 1024. */ if (V_dyn_buckets > 65536) V_dyn_buckets = 1024; if ((V_dyn_buckets & (V_dyn_buckets-1)) != 0) { /* not a power of 2 */ V_dyn_buckets = V_curr_dyn_buckets; /* reset */ return; } V_curr_dyn_buckets = V_dyn_buckets; if (V_ipfw_dyn_v != NULL) free(V_ipfw_dyn_v, M_IPFW); for (;;) { V_ipfw_dyn_v = malloc(V_curr_dyn_buckets * sizeof(ipfw_dyn_rule *), M_IPFW, M_NOWAIT | M_ZERO); if (V_ipfw_dyn_v != NULL || V_curr_dyn_buckets <= 2) break; V_curr_dyn_buckets /= 2; } } /** * Install state of type 'type' for a dynamic session. * The hash table contains two type of rules: * - regular rules (O_KEEP_STATE) * - rules for sessions with limited number of sess per user * (O_LIMIT). When they are created, the parent is * increased by 1, and decreased on delete. In this case, * the third parameter is the parent rule and not the chain. * - "parent" rules for the above (O_LIMIT_PARENT). */ static ipfw_dyn_rule * add_dyn_rule(struct ipfw_flow_id *id, u_int8_t dyn_type, struct ip_fw *rule) { INIT_VNET_IPFW(curvnet); ipfw_dyn_rule *r; int i; IPFW_DYN_LOCK_ASSERT(); if (V_ipfw_dyn_v == NULL || (V_dyn_count == 0 && V_dyn_buckets != V_curr_dyn_buckets)) { realloc_dynamic_table(); if (V_ipfw_dyn_v == NULL) return NULL; /* failed ! */ } i = hash_packet(id); r = uma_zalloc(ipfw_dyn_rule_zone, M_NOWAIT | M_ZERO); if (r == NULL) { printf ("ipfw: sorry cannot allocate state\n"); return NULL; } /* increase refcount on parent, and set pointer */ if (dyn_type == O_LIMIT) { ipfw_dyn_rule *parent = (ipfw_dyn_rule *)rule; if ( parent->dyn_type != O_LIMIT_PARENT) panic("invalid parent"); parent->count++; r->parent = parent; rule = parent->rule; } r->id = *id; r->expire = time_uptime + V_dyn_syn_lifetime; r->rule = rule; r->dyn_type = dyn_type; r->pcnt = r->bcnt = 0; r->count = 0; r->bucket = i; r->next = V_ipfw_dyn_v[i]; V_ipfw_dyn_v[i] = r; V_dyn_count++; DEB(printf("ipfw: add dyn entry ty %d 0x%08x %d -> 0x%08x %d, total %d\n", dyn_type, (r->id.src_ip), (r->id.src_port), (r->id.dst_ip), (r->id.dst_port), V_dyn_count ); ) return r; } /** * lookup dynamic parent rule using pkt and rule as search keys. * If the lookup fails, then install one. 
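 * The parent entry (dyn_type O_LIMIT_PARENT) holds the masked flow id
 * built from the rule's limit_mask and a count of the O_LIMIT child
 * states currently derived from it; install_state() compares that
 * count against the configured connection limit before adding a new
 * child state.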
*/ static ipfw_dyn_rule * lookup_dyn_parent(struct ipfw_flow_id *pkt, struct ip_fw *rule) { INIT_VNET_IPFW(curvnet); ipfw_dyn_rule *q; int i; IPFW_DYN_LOCK_ASSERT(); if (V_ipfw_dyn_v) { int is_v6 = IS_IP6_FLOW_ID(pkt); i = hash_packet( pkt ); for (q = V_ipfw_dyn_v[i] ; q != NULL ; q=q->next) if (q->dyn_type == O_LIMIT_PARENT && rule== q->rule && pkt->proto == q->id.proto && pkt->src_port == q->id.src_port && pkt->dst_port == q->id.dst_port && ( (is_v6 && IN6_ARE_ADDR_EQUAL(&(pkt->src_ip6), &(q->id.src_ip6)) && IN6_ARE_ADDR_EQUAL(&(pkt->dst_ip6), &(q->id.dst_ip6))) || (!is_v6 && pkt->src_ip == q->id.src_ip && pkt->dst_ip == q->id.dst_ip) ) ) { q->expire = time_uptime + V_dyn_short_lifetime; DEB(printf("ipfw: lookup_dyn_parent found 0x%p\n",q);) return q; } } return add_dyn_rule(pkt, O_LIMIT_PARENT, rule); } /** * Install dynamic state for rule type cmd->o.opcode * * Returns 1 (failure) if state is not installed because of errors or because * session limitations are enforced. */ static int install_state(struct ip_fw *rule, ipfw_insn_limit *cmd, struct ip_fw_args *args, uint32_t tablearg) { INIT_VNET_IPFW(curvnet); static int last_log; ipfw_dyn_rule *q; struct in_addr da; char src[48], dst[48]; src[0] = '\0'; dst[0] = '\0'; DEB( printf("ipfw: %s: type %d 0x%08x %u -> 0x%08x %u\n", __func__, cmd->o.opcode, (args->f_id.src_ip), (args->f_id.src_port), (args->f_id.dst_ip), (args->f_id.dst_port)); ) IPFW_DYN_LOCK(); q = lookup_dyn_rule_locked(&args->f_id, NULL, NULL); if (q != NULL) { /* should never occur */ if (last_log != time_uptime) { last_log = time_uptime; printf("ipfw: %s: entry already present, done\n", __func__); } IPFW_DYN_UNLOCK(); return (0); } if (V_dyn_count >= V_dyn_max) /* Run out of slots, try to remove any expired rule. */ remove_dyn_rule(NULL, (ipfw_dyn_rule *)1); if (V_dyn_count >= V_dyn_max) { if (last_log != time_uptime) { last_log = time_uptime; printf("ipfw: %s: Too many dynamic rules\n", __func__); } IPFW_DYN_UNLOCK(); return (1); /* cannot install, notify caller */ } switch (cmd->o.opcode) { case O_KEEP_STATE: /* bidir rule */ add_dyn_rule(&args->f_id, O_KEEP_STATE, rule); break; case O_LIMIT: { /* limit number of sessions */ struct ipfw_flow_id id; ipfw_dyn_rule *parent; uint32_t conn_limit; uint16_t limit_mask = cmd->limit_mask; conn_limit = (cmd->conn_limit == IP_FW_TABLEARG) ? tablearg : cmd->conn_limit; DEB( if (cmd->conn_limit == IP_FW_TABLEARG) printf("ipfw: %s: O_LIMIT rule, conn_limit: %u " "(tablearg)\n", __func__, conn_limit); else printf("ipfw: %s: O_LIMIT rule, conn_limit: %u\n", __func__, conn_limit); ) id.dst_ip = id.src_ip = id.dst_port = id.src_port = 0; id.proto = args->f_id.proto; id.addr_type = args->f_id.addr_type; id.fib = M_GETFIB(args->m); if (IS_IP6_FLOW_ID (&(args->f_id))) { if (limit_mask & DYN_SRC_ADDR) id.src_ip6 = args->f_id.src_ip6; if (limit_mask & DYN_DST_ADDR) id.dst_ip6 = args->f_id.dst_ip6; } else { if (limit_mask & DYN_SRC_ADDR) id.src_ip = args->f_id.src_ip; if (limit_mask & DYN_DST_ADDR) id.dst_ip = args->f_id.dst_ip; } if (limit_mask & DYN_SRC_PORT) id.src_port = args->f_id.src_port; if (limit_mask & DYN_DST_PORT) id.dst_port = args->f_id.dst_port; if ((parent = lookup_dyn_parent(&id, rule)) == NULL) { printf("ipfw: %s: add parent failed\n", __func__); IPFW_DYN_UNLOCK(); return (1); } if (parent->count >= conn_limit) { /* See if we can remove some expired rule. 
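 * Passing "parent" as the keep_me argument means that only
 * expired children of this rule are pruned and the parent
 * entry itself is preserved.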
*/ remove_dyn_rule(rule, parent); if (parent->count >= conn_limit) { if (V_fw_verbose && last_log != time_uptime) { last_log = time_uptime; #ifdef INET6 /* * XXX IPv6 flows are not * supported yet. */ if (IS_IP6_FLOW_ID(&(args->f_id))) { char ip6buf[INET6_ADDRSTRLEN]; snprintf(src, sizeof(src), "[%s]", ip6_sprintf(ip6buf, &args->f_id.src_ip6)); snprintf(dst, sizeof(dst), "[%s]", ip6_sprintf(ip6buf, &args->f_id.dst_ip6)); } else #endif { da.s_addr = htonl(args->f_id.src_ip); inet_ntoa_r(da, src); da.s_addr = htonl(args->f_id.dst_ip); inet_ntoa_r(da, dst); } log(LOG_SECURITY | LOG_DEBUG, "ipfw: %d %s %s:%u -> %s:%u, %s\n", parent->rule->rulenum, "drop session", src, (args->f_id.src_port), dst, (args->f_id.dst_port), "too many entries"); } IPFW_DYN_UNLOCK(); return (1); } } add_dyn_rule(&args->f_id, O_LIMIT, (struct ip_fw *)parent); break; } default: printf("ipfw: %s: unknown dynamic rule type %u\n", __func__, cmd->o.opcode); IPFW_DYN_UNLOCK(); return (1); } /* XXX just set lifetime */ lookup_dyn_rule_locked(&args->f_id, NULL, NULL); IPFW_DYN_UNLOCK(); return (0); } /* * Generate a TCP packet, containing either a RST or a keepalive. * When flags & TH_RST, we are sending a RST packet, because of a * "reset" action matched the packet. * Otherwise we are sending a keepalive, and flags & TH_ * The 'replyto' mbuf is the mbuf being replied to, if any, and is required * so that MAC can label the reply appropriately. */ static struct mbuf * send_pkt(struct mbuf *replyto, struct ipfw_flow_id *id, u_int32_t seq, u_int32_t ack, int flags) { INIT_VNET_INET(curvnet); struct mbuf *m; struct ip *ip; struct tcphdr *tcp; MGETHDR(m, M_DONTWAIT, MT_DATA); if (m == 0) return (NULL); m->m_pkthdr.rcvif = (struct ifnet *)0; M_SETFIB(m, id->fib); #ifdef MAC if (replyto != NULL) mac_netinet_firewall_reply(replyto, m); else mac_netinet_firewall_send(m); #else (void)replyto; /* don't warn about unused arg */ #endif m->m_pkthdr.len = m->m_len = sizeof(struct ip) + sizeof(struct tcphdr); m->m_data += max_linkhdr; ip = mtod(m, struct ip *); bzero(ip, m->m_len); tcp = (struct tcphdr *)(ip + 1); /* no IP options */ ip->ip_p = IPPROTO_TCP; tcp->th_off = 5; /* * Assume we are sending a RST (or a keepalive in the reverse * direction), swap src and destination addresses and ports. */ ip->ip_src.s_addr = htonl(id->dst_ip); ip->ip_dst.s_addr = htonl(id->src_ip); tcp->th_sport = htons(id->dst_port); tcp->th_dport = htons(id->src_port); if (flags & TH_RST) { /* we are sending a RST */ if (flags & TH_ACK) { tcp->th_seq = htonl(ack); tcp->th_ack = htonl(0); tcp->th_flags = TH_RST; } else { if (flags & TH_SYN) seq++; tcp->th_seq = htonl(0); tcp->th_ack = htonl(seq); tcp->th_flags = TH_RST | TH_ACK; } } else { /* * We are sending a keepalive. flags & TH_SYN determines * the direction, forward if set, reverse if clear. * NOTE: seq and ack are always assumed to be correct * as set by the caller. This may be confusing... */ if (flags & TH_SYN) { /* * we have to rewrite the correct addresses! */ ip->ip_dst.s_addr = htonl(id->dst_ip); ip->ip_src.s_addr = htonl(id->src_ip); tcp->th_dport = htons(id->dst_port); tcp->th_sport = htons(id->src_port); } tcp->th_seq = htonl(seq); tcp->th_ack = htonl(ack); tcp->th_flags = TH_ACK; } /* * set ip_len to the payload size so we can compute * the tcp checksum on the pseudoheader * XXX check this, could save a couple of words ? 
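 * in_cksum() below is run while ip_len temporarily holds just the
 * TCP length (header only, the packet carries no payload), which is
 * what the pseudo-header computation expects; the real total length
 * and the TTL are filled in right afterwards.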
*/ ip->ip_len = htons(sizeof(struct tcphdr)); tcp->th_sum = in_cksum(m, m->m_pkthdr.len); /* * now fill fields left out earlier */ ip->ip_ttl = V_ip_defttl; ip->ip_len = m->m_pkthdr.len; m->m_flags |= M_SKIP_FIREWALL; return (m); } /* * sends a reject message, consuming the mbuf passed as an argument. */ static void send_reject(struct ip_fw_args *args, int code, int ip_len, struct ip *ip) { #if 0 /* XXX When ip is not guaranteed to be at mtod() we will * need to account for this */ * The mbuf will however be thrown away so we can adjust it. * Remember we did an m_pullup on it already so we * can make some assumptions about contiguousness. */ if (args->L3offset) m_adj(m, args->L3offset); #endif if (code != ICMP_REJECT_RST) { /* Send an ICMP unreach */ /* We need the IP header in host order for icmp_error(). */ if (args->eh != NULL) { ip->ip_len = ntohs(ip->ip_len); ip->ip_off = ntohs(ip->ip_off); } icmp_error(args->m, ICMP_UNREACH, code, 0L, 0); } else if (args->f_id.proto == IPPROTO_TCP) { struct tcphdr *const tcp = L3HDR(struct tcphdr, mtod(args->m, struct ip *)); if ( (tcp->th_flags & TH_RST) == 0) { struct mbuf *m; m = send_pkt(args->m, &(args->f_id), ntohl(tcp->th_seq), ntohl(tcp->th_ack), tcp->th_flags | TH_RST); if (m != NULL) ip_output(m, NULL, NULL, 0, NULL, NULL); } m_freem(args->m); } else m_freem(args->m); args->m = NULL; } /** * * Given an ip_fw *, lookup_next_rule will return a pointer * to the next rule, which can be either the jump * target (for skipto instructions) or the next one in the list (in * all other cases including a missing jump target). * The result is also written in the "next_rule" field of the rule. * Backward jumps are not allowed, so start looking from the next * rule... * * This never returns NULL -- in case we do not have an exact match, * the next rule is returned. When the ruleset is changed, * pointers are flushed so we are always correct. */ static struct ip_fw * lookup_next_rule(struct ip_fw *me, u_int32_t tablearg) { struct ip_fw *rule = NULL; ipfw_insn *cmd; u_int16_t rulenum; /* look for action, in case it is a skipto */ cmd = ACTION_PTR(me); if (cmd->opcode == O_LOG) cmd += F_LEN(cmd); if (cmd->opcode == O_ALTQ) cmd += F_LEN(cmd); if (cmd->opcode == O_TAG) cmd += F_LEN(cmd); if (cmd->opcode == O_SKIPTO ) { if (tablearg != 0) { rulenum = (u_int16_t)tablearg; } else { rulenum = cmd->arg1; } for (rule = me->next; rule ; rule = rule->next) { if (rule->rulenum >= rulenum) { break; } } } if (rule == NULL) /* failure or not a skipto */ rule = me->next; me->next_rule = rule; return rule; } static int add_table_entry(struct ip_fw_chain *ch, uint16_t tbl, in_addr_t addr, uint8_t mlen, uint32_t value) { INIT_VNET_IPFW(curvnet); struct radix_node_head *rnh; struct table_entry *ent; struct radix_node *rn; if (tbl >= IPFW_TABLES_MAX) return (EINVAL); rnh = ch->tables[tbl]; ent = malloc(sizeof(*ent), M_IPFW_TBL, M_NOWAIT | M_ZERO); if (ent == NULL) return (ENOMEM); ent->value = value; ent->addr.sin_len = ent->mask.sin_len = 8; ent->mask.sin_addr.s_addr = htonl(mlen ? 
~((1 << (32 - mlen)) - 1) : 0); ent->addr.sin_addr.s_addr = addr & ent->mask.sin_addr.s_addr; IPFW_WLOCK(ch); RADIX_NODE_HEAD_LOCK(rnh); rn = rnh->rnh_addaddr(&ent->addr, &ent->mask, rnh, (void *)ent); RADIX_NODE_HEAD_UNLOCK(rnh); if (rn == NULL) { IPFW_WUNLOCK(ch); free(ent, M_IPFW_TBL); return (EEXIST); } IPFW_WUNLOCK(ch); return (0); } static int del_table_entry(struct ip_fw_chain *ch, uint16_t tbl, in_addr_t addr, uint8_t mlen) { struct radix_node_head *rnh; struct table_entry *ent; struct sockaddr_in sa, mask; if (tbl >= IPFW_TABLES_MAX) return (EINVAL); rnh = ch->tables[tbl]; sa.sin_len = mask.sin_len = 8; mask.sin_addr.s_addr = htonl(mlen ? ~((1 << (32 - mlen)) - 1) : 0); sa.sin_addr.s_addr = addr & mask.sin_addr.s_addr; IPFW_WLOCK(ch); ent = (struct table_entry *)rnh->rnh_deladdr(&sa, &mask, rnh); if (ent == NULL) { IPFW_WUNLOCK(ch); return (ESRCH); } IPFW_WUNLOCK(ch); free(ent, M_IPFW_TBL); return (0); } static int flush_table_entry(struct radix_node *rn, void *arg) { struct radix_node_head * const rnh = arg; struct table_entry *ent; ent = (struct table_entry *) rnh->rnh_deladdr(rn->rn_key, rn->rn_mask, rnh); if (ent != NULL) free(ent, M_IPFW_TBL); return (0); } static int flush_table(struct ip_fw_chain *ch, uint16_t tbl) { struct radix_node_head *rnh; IPFW_WLOCK_ASSERT(ch); if (tbl >= IPFW_TABLES_MAX) return (EINVAL); rnh = ch->tables[tbl]; KASSERT(rnh != NULL, ("NULL IPFW table")); rnh->rnh_walktree(rnh, flush_table_entry, rnh); return (0); } static void flush_tables(struct ip_fw_chain *ch) { uint16_t tbl; IPFW_WLOCK_ASSERT(ch); for (tbl = 0; tbl < IPFW_TABLES_MAX; tbl++) flush_table(ch, tbl); } static int init_tables(struct ip_fw_chain *ch) { int i; uint16_t j; for (i = 0; i < IPFW_TABLES_MAX; i++) { if (!rn_inithead((void **)&ch->tables[i], 32)) { for (j = 0; j < i; j++) { (void) flush_table(ch, j); } return (ENOMEM); } } return (0); } static int lookup_table(struct ip_fw_chain *ch, uint16_t tbl, in_addr_t addr, uint32_t *val) { struct radix_node_head *rnh; struct table_entry *ent; struct sockaddr_in sa; if (tbl >= IPFW_TABLES_MAX) return (0); rnh = ch->tables[tbl]; sa.sin_len = 8; sa.sin_addr.s_addr = addr; ent = (struct table_entry *)(rnh->rnh_lookup(&sa, NULL, rnh)); if (ent != NULL) { *val = ent->value; return (1); } return (0); } static int count_table_entry(struct radix_node *rn, void *arg) { u_int32_t * const cnt = arg; (*cnt)++; return (0); } static int count_table(struct ip_fw_chain *ch, uint32_t tbl, uint32_t *cnt) { struct radix_node_head *rnh; if (tbl >= IPFW_TABLES_MAX) return (EINVAL); rnh = ch->tables[tbl]; *cnt = 0; rnh->rnh_walktree(rnh, count_table_entry, cnt); return (0); } static int dump_table_entry(struct radix_node *rn, void *arg) { struct table_entry * const n = (struct table_entry *)rn; ipfw_table * const tbl = arg; ipfw_table_entry *ent; if (tbl->cnt == tbl->size) return (1); ent = &tbl->ent[tbl->cnt]; ent->tbl = tbl->tbl; if (in_nullhost(n->mask.sin_addr)) ent->masklen = 0; else ent->masklen = 33 - ffs(ntohl(n->mask.sin_addr.s_addr)); ent->addr = n->addr.sin_addr.s_addr; ent->value = n->value; tbl->cnt++; return (0); } static int dump_table(struct ip_fw_chain *ch, ipfw_table *tbl) { struct radix_node_head *rnh; if (tbl->tbl >= IPFW_TABLES_MAX) return (EINVAL); rnh = ch->tables[tbl->tbl]; tbl->cnt = 0; rnh->rnh_walktree(rnh, dump_table_entry, tbl); return (0); } static void fill_ugid_cache(struct inpcb *inp, struct ip_fw_ugid *ugp) { struct ucred *cr; cr = inp->inp_cred; ugp->fw_prid = jailed(cr) ? 
cr->cr_prison->pr_id : -1; ugp->fw_uid = cr->cr_uid; ugp->fw_ngroups = cr->cr_ngroups; bcopy(cr->cr_groups, ugp->fw_groups, sizeof(ugp->fw_groups)); } static int check_uidgid(ipfw_insn_u32 *insn, int proto, struct ifnet *oif, struct in_addr dst_ip, u_int16_t dst_port, struct in_addr src_ip, u_int16_t src_port, struct ip_fw_ugid *ugp, int *ugid_lookupp, struct inpcb *inp) { INIT_VNET_INET(curvnet); struct inpcbinfo *pi; int wildcard; struct inpcb *pcb; int match; gid_t *gp; /* * Check to see if the UDP or TCP stack supplied us with * the PCB. If so, rather then holding a lock and looking * up the PCB, we can use the one that was supplied. */ if (inp && *ugid_lookupp == 0) { INP_LOCK_ASSERT(inp); if (inp->inp_socket != NULL) { fill_ugid_cache(inp, ugp); *ugid_lookupp = 1; } else *ugid_lookupp = -1; } /* * If we have already been here and the packet has no * PCB entry associated with it, then we can safely * assume that this is a no match. */ if (*ugid_lookupp == -1) return (0); if (proto == IPPROTO_TCP) { wildcard = 0; pi = &V_tcbinfo; } else if (proto == IPPROTO_UDP) { wildcard = INPLOOKUP_WILDCARD; pi = &V_udbinfo; } else return 0; match = 0; if (*ugid_lookupp == 0) { INP_INFO_RLOCK(pi); pcb = (oif) ? in_pcblookup_hash(pi, dst_ip, htons(dst_port), src_ip, htons(src_port), wildcard, oif) : in_pcblookup_hash(pi, src_ip, htons(src_port), dst_ip, htons(dst_port), wildcard, NULL); if (pcb != NULL) { fill_ugid_cache(pcb, ugp); *ugid_lookupp = 1; } INP_INFO_RUNLOCK(pi); if (*ugid_lookupp == 0) { /* * If the lookup did not yield any results, there * is no sense in coming back and trying again. So * we can set lookup to -1 and ensure that we wont * bother the pcb system again. */ *ugid_lookupp = -1; return (0); } } if (insn->o.opcode == O_UID) match = (ugp->fw_uid == (uid_t)insn->d[0]); else if (insn->o.opcode == O_GID) { for (gp = ugp->fw_groups; gp < &ugp->fw_groups[ugp->fw_ngroups]; gp++) if (*gp == (gid_t)insn->d[0]) { match = 1; break; } } else if (insn->o.opcode == O_JAIL) match = (ugp->fw_prid == (int)insn->d[0]); return match; } /* * The main check routine for the firewall. * * All arguments are in args so we can modify them and return them * back to the caller. * * Parameters: * * args->m (in/out) The packet; we set to NULL when/if we nuke it. * Starts with the IP header. * args->eh (in) Mac header if present, or NULL for layer3 packet. * args->L3offset Number of bytes bypassed if we came from L2. * e.g. often sizeof(eh) ** NOTYET ** * args->oif Outgoing interface, or NULL if packet is incoming. * The incoming interface is in the mbuf. (in) * args->divert_rule (in/out) * Skip up to the first rule past this rule number; * upon return, non-zero port number for divert or tee. * * args->rule Pointer to the last matching rule (in/out) * args->next_hop Socket we are forwarding to (out). * args->f_id Addresses grabbed from the packet (out) * args->cookie a cookie depending on rule action * * Return value: * * IP_FW_PASS the packet must be accepted * IP_FW_DENY the packet must be dropped * IP_FW_DIVERT divert packet, port in m_tag * IP_FW_TEE tee packet, port in m_tag * IP_FW_DUMMYNET to dummynet, pipe in args->cookie * IP_FW_NETGRAPH into netgraph, cookie args->cookie * */ int ipfw_chk(struct ip_fw_args *args) { INIT_VNET_INET(curvnet); INIT_VNET_IPFW(curvnet); /* * Local variables holding state during the processing of a packet: * * IMPORTANT NOTE: to speed up the processing of rules, there * are some assumption on the values of the variables, which * are documented here. 
Should you change them, please check * the implementation of the various instructions to make sure * that they still work. * * args->eh The MAC header. It is non-null for a layer2 * packet, it is NULL for a layer-3 packet. * **notyet** * args->L3offset Offset in the packet to the L3 (IP or equiv.) header. * * m | args->m Pointer to the mbuf, as received from the caller. * It may change if ipfw_chk() does an m_pullup, or if it * consumes the packet because it calls send_reject(). * XXX This has to change, so that ipfw_chk() never modifies * or consumes the buffer. * ip is the beginning of the ip(4 or 6) header. * Calculated by adding the L3offset to the start of data. * (Until we start using L3offset, the packet is * supposed to start with the ip header). */ struct mbuf *m = args->m; struct ip *ip = mtod(m, struct ip *); /* * For rules which contain uid/gid or jail constraints, cache * a copy of the users credentials after the pcb lookup has been * executed. This will speed up the processing of rules with * these types of constraints, as well as decrease contention * on pcb related locks. */ struct ip_fw_ugid fw_ugid_cache; int ugid_lookup = 0; /* * divinput_flags If non-zero, set to the IP_FW_DIVERT_*_FLAG * associated with a packet input on a divert socket. This * will allow to distinguish traffic and its direction when * it originates from a divert socket. */ u_int divinput_flags = 0; /* * oif | args->oif If NULL, ipfw_chk has been called on the * inbound path (ether_input, ip_input). * If non-NULL, ipfw_chk has been called on the outbound path * (ether_output, ip_output). */ struct ifnet *oif = args->oif; struct ip_fw *f = NULL; /* matching rule */ int retval = 0; /* * hlen The length of the IP header. */ u_int hlen = 0; /* hlen >0 means we have an IP pkt */ /* * offset The offset of a fragment. offset != 0 means that * we have a fragment at this offset of an IPv4 packet. * offset == 0 means that (if this is an IPv4 packet) * this is the first or only fragment. * For IPv6 offset == 0 means there is no Fragment Header. * If offset != 0 for IPv6 always use correct mask to * get the correct offset because we add IP6F_MORE_FRAG * to be able to dectect the first fragment which would * otherwise have offset = 0. */ u_short offset = 0; /* * Local copies of addresses. They are only valid if we have * an IP packet. * * proto The protocol. Set to 0 for non-ip packets, * or to the protocol read from the packet otherwise. * proto != 0 means that we have an IPv4 packet. * * src_port, dst_port port numbers, in HOST format. Only * valid for TCP and UDP packets. * * src_ip, dst_ip ip addresses, in NETWORK format. * Only valid for IPv4 packets. */ u_int8_t proto; u_int16_t src_port = 0, dst_port = 0; /* NOTE: host format */ struct in_addr src_ip, dst_ip; /* NOTE: network format */ u_int16_t ip_len=0; int pktlen; u_int16_t etype = 0; /* Host order stored ether type */ /* * dyn_dir = MATCH_UNKNOWN when rules unchecked, * MATCH_NONE when checked and not matched (q = NULL), * MATCH_FORWARD or MATCH_REVERSE otherwise (q != NULL) */ int dyn_dir = MATCH_UNKNOWN; ipfw_dyn_rule *q = NULL; struct ip_fw_chain *chain = &V_layer3_chain; struct m_tag *mtag; /* * We store in ulp a pointer to the upper layer protocol header. * In the ipv4 case this is easy to determine from the header, * but for ipv6 we might have some additional headers in the middle. * ulp is NULL if not found. */ void *ulp = NULL; /* upper layer protocol pointer. 
*/ /* XXX ipv6 variables */ int is_ipv6 = 0; u_int16_t ext_hd = 0; /* bits vector for extension header filtering */ /* end of ipv6 variables */ int is_ipv4 = 0; if (m->m_flags & M_SKIP_FIREWALL) return (IP_FW_PASS); /* accept */ pktlen = m->m_pkthdr.len; args->f_id.fib = M_GETFIB(m); /* note mbuf not altered) */ proto = args->f_id.proto = 0; /* mark f_id invalid */ /* XXX 0 is a valid proto: IP/IPv6 Hop-by-Hop Option */ /* * PULLUP_TO(len, p, T) makes sure that len + sizeof(T) is contiguous, * then it sets p to point at the offset "len" in the mbuf. WARNING: the * pointer might become stale after other pullups (but we never use it * this way). */ #define PULLUP_TO(len, p, T) \ do { \ int x = (len) + sizeof(T); \ if ((m)->m_len < x) { \ args->m = m = m_pullup(m, x); \ if (m == NULL) \ goto pullup_failed; \ } \ p = (mtod(m, char *) + (len)); \ } while (0) /* * if we have an ether header, */ if (args->eh) etype = ntohs(args->eh->ether_type); /* Identify IP packets and fill up variables. */ if (pktlen >= sizeof(struct ip6_hdr) && (args->eh == NULL || etype == ETHERTYPE_IPV6) && ip->ip_v == 6) { struct ip6_hdr *ip6 = (struct ip6_hdr *)ip; is_ipv6 = 1; args->f_id.addr_type = 6; hlen = sizeof(struct ip6_hdr); proto = ip6->ip6_nxt; /* Search extension headers to find upper layer protocols */ while (ulp == NULL) { switch (proto) { case IPPROTO_ICMPV6: PULLUP_TO(hlen, ulp, struct icmp6_hdr); args->f_id.flags = ICMP6(ulp)->icmp6_type; break; case IPPROTO_TCP: PULLUP_TO(hlen, ulp, struct tcphdr); dst_port = TCP(ulp)->th_dport; src_port = TCP(ulp)->th_sport; args->f_id.flags = TCP(ulp)->th_flags; break; case IPPROTO_SCTP: PULLUP_TO(hlen, ulp, struct sctphdr); src_port = SCTP(ulp)->src_port; dst_port = SCTP(ulp)->dest_port; break; case IPPROTO_UDP: PULLUP_TO(hlen, ulp, struct udphdr); dst_port = UDP(ulp)->uh_dport; src_port = UDP(ulp)->uh_sport; break; case IPPROTO_HOPOPTS: /* RFC 2460 */ PULLUP_TO(hlen, ulp, struct ip6_hbh); ext_hd |= EXT_HOPOPTS; hlen += (((struct ip6_hbh *)ulp)->ip6h_len + 1) << 3; proto = ((struct ip6_hbh *)ulp)->ip6h_nxt; ulp = NULL; break; case IPPROTO_ROUTING: /* RFC 2460 */ PULLUP_TO(hlen, ulp, struct ip6_rthdr); switch (((struct ip6_rthdr *)ulp)->ip6r_type) { case 0: ext_hd |= EXT_RTHDR0; break; case 2: ext_hd |= EXT_RTHDR2; break; default: printf("IPFW2: IPV6 - Unknown Routing " "Header type(%d)\n", ((struct ip6_rthdr *)ulp)->ip6r_type); if (V_fw_deny_unknown_exthdrs) return (IP_FW_DENY); break; } ext_hd |= EXT_ROUTING; hlen += (((struct ip6_rthdr *)ulp)->ip6r_len + 1) << 3; proto = ((struct ip6_rthdr *)ulp)->ip6r_nxt; ulp = NULL; break; case IPPROTO_FRAGMENT: /* RFC 2460 */ PULLUP_TO(hlen, ulp, struct ip6_frag); ext_hd |= EXT_FRAGMENT; hlen += sizeof (struct ip6_frag); proto = ((struct ip6_frag *)ulp)->ip6f_nxt; offset = ((struct ip6_frag *)ulp)->ip6f_offlg & IP6F_OFF_MASK; /* Add IP6F_MORE_FRAG for offset of first * fragment to be != 0. 
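 * This keeps "offset == 0" meaning "not a fragment" for IPv6 as
 * well; code that needs the real fragment offset masks the value
 * with IP6F_OFF_MASK first.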
*/ offset |= ((struct ip6_frag *)ulp)->ip6f_offlg & IP6F_MORE_FRAG; if (offset == 0) { printf("IPFW2: IPV6 - Invalid Fragment " "Header\n"); if (V_fw_deny_unknown_exthdrs) return (IP_FW_DENY); break; } args->f_id.frag_id6 = ntohl(((struct ip6_frag *)ulp)->ip6f_ident); ulp = NULL; break; case IPPROTO_DSTOPTS: /* RFC 2460 */ PULLUP_TO(hlen, ulp, struct ip6_hbh); ext_hd |= EXT_DSTOPTS; hlen += (((struct ip6_hbh *)ulp)->ip6h_len + 1) << 3; proto = ((struct ip6_hbh *)ulp)->ip6h_nxt; ulp = NULL; break; case IPPROTO_AH: /* RFC 2402 */ PULLUP_TO(hlen, ulp, struct ip6_ext); ext_hd |= EXT_AH; hlen += (((struct ip6_ext *)ulp)->ip6e_len + 2) << 2; proto = ((struct ip6_ext *)ulp)->ip6e_nxt; ulp = NULL; break; case IPPROTO_ESP: /* RFC 2406 */ PULLUP_TO(hlen, ulp, uint32_t); /* SPI, Seq# */ /* Anything past Seq# is variable length and * data past this ext. header is encrypted. */ ext_hd |= EXT_ESP; break; case IPPROTO_NONE: /* RFC 2460 */ /* * Packet ends here, and IPv6 header has * already been pulled up. If ip6e_len!=0 * then octets must be ignored. */ ulp = ip; /* non-NULL to get out of loop. */ break; case IPPROTO_OSPFIGP: /* XXX OSPF header check? */ PULLUP_TO(hlen, ulp, struct ip6_ext); break; case IPPROTO_PIM: /* XXX PIM header check? */ PULLUP_TO(hlen, ulp, struct pim); break; case IPPROTO_CARP: PULLUP_TO(hlen, ulp, struct carp_header); if (((struct carp_header *)ulp)->carp_version != CARP_VERSION) return (IP_FW_DENY); if (((struct carp_header *)ulp)->carp_type != CARP_ADVERTISEMENT) return (IP_FW_DENY); break; case IPPROTO_IPV6: /* RFC 2893 */ PULLUP_TO(hlen, ulp, struct ip6_hdr); break; case IPPROTO_IPV4: /* RFC 2893 */ PULLUP_TO(hlen, ulp, struct ip); break; default: printf("IPFW2: IPV6 - Unknown Extension " "Header(%d), ext_hd=%x\n", proto, ext_hd); if (V_fw_deny_unknown_exthdrs) return (IP_FW_DENY); PULLUP_TO(hlen, ulp, struct ip6_ext); break; } /*switch */ } ip = mtod(m, struct ip *); ip6 = (struct ip6_hdr *)ip; args->f_id.src_ip6 = ip6->ip6_src; args->f_id.dst_ip6 = ip6->ip6_dst; args->f_id.src_ip = 0; args->f_id.dst_ip = 0; args->f_id.flow_id6 = ntohl(ip6->ip6_flow); } else if (pktlen >= sizeof(struct ip) && (args->eh == NULL || etype == ETHERTYPE_IP) && ip->ip_v == 4) { is_ipv4 = 1; hlen = ip->ip_hl << 2; args->f_id.addr_type = 4; /* * Collect parameters into local variables for faster matching. */ proto = ip->ip_p; src_ip = ip->ip_src; dst_ip = ip->ip_dst; if (args->eh != NULL) { /* layer 2 packets are as on the wire */ offset = ntohs(ip->ip_off) & IP_OFFMASK; ip_len = ntohs(ip->ip_len); } else { offset = ip->ip_off & IP_OFFMASK; ip_len = ip->ip_len; } pktlen = ip_len < pktlen ? ip_len : pktlen; if (offset == 0) { switch (proto) { case IPPROTO_TCP: PULLUP_TO(hlen, ulp, struct tcphdr); dst_port = TCP(ulp)->th_dport; src_port = TCP(ulp)->th_sport; args->f_id.flags = TCP(ulp)->th_flags; break; case IPPROTO_UDP: PULLUP_TO(hlen, ulp, struct udphdr); dst_port = UDP(ulp)->uh_dport; src_port = UDP(ulp)->uh_sport; break; case IPPROTO_ICMP: PULLUP_TO(hlen, ulp, struct icmphdr); args->f_id.flags = ICMP(ulp)->icmp_type; break; default: break; } } ip = mtod(m, struct ip *); args->f_id.src_ip = ntohl(src_ip.s_addr); args->f_id.dst_ip = ntohl(dst_ip.s_addr); } #undef PULLUP_TO if (proto) { /* we may have port numbers, store them */ args->f_id.proto = proto; args->f_id.src_port = src_port = ntohs(src_port); args->f_id.dst_port = dst_port = ntohs(dst_port); } IPFW_RLOCK(chain); mtag = m_tag_find(m, PACKET_TAG_DIVERT, NULL); if (args->rule) { /* * Packet has already been tagged. 
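 * (args->rule is set when the packet was handed to dummynet,
 * netgraph or the NAT code and has now been reinjected into the
 * firewall.)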
Look for the next rule * to restart processing. * * If fw_one_pass != 0 then just accept it. * XXX should not happen here, but optimized out in * the caller. */ if (V_fw_one_pass) { IPFW_RUNLOCK(chain); return (IP_FW_PASS); } f = args->rule->next_rule; if (f == NULL) f = lookup_next_rule(args->rule, 0); } else { /* * Find the starting rule. It can be either the first * one, or the one after divert_rule if asked so. */ int skipto = mtag ? divert_cookie(mtag) : 0; f = chain->rules; if (args->eh == NULL && skipto != 0) { if (skipto >= IPFW_DEFAULT_RULE) { IPFW_RUNLOCK(chain); return (IP_FW_DENY); /* invalid */ } while (f && f->rulenum <= skipto) f = f->next; if (f == NULL) { /* drop packet */ IPFW_RUNLOCK(chain); return (IP_FW_DENY); } } } /* reset divert rule to avoid confusion later */ if (mtag) { divinput_flags = divert_info(mtag) & (IP_FW_DIVERT_OUTPUT_FLAG | IP_FW_DIVERT_LOOPBACK_FLAG); m_tag_delete(m, mtag); } /* * Now scan the rules, and parse microinstructions for each rule. */ for (; f; f = f->next) { ipfw_insn *cmd; uint32_t tablearg = 0; int l, cmdlen, skip_or; /* skip rest of OR block */ again: if (V_set_disable & (1 << f->set) ) continue; skip_or = 0; for (l = f->cmd_len, cmd = f->cmd ; l > 0 ; l -= cmdlen, cmd += cmdlen) { int match; /* * check_body is a jump target used when we find a * CHECK_STATE, and need to jump to the body of * the target rule. */ check_body: cmdlen = F_LEN(cmd); /* * An OR block (insn_1 || .. || insn_n) has the * F_OR bit set in all but the last instruction. * The first match will set "skip_or", and cause * the following instructions to be skipped until * past the one with the F_OR bit clear. */ if (skip_or) { /* skip this instruction */ if ((cmd->len & F_OR) == 0) skip_or = 0; /* next one is good */ continue; } match = 0; /* set to 1 if we succeed */ switch (cmd->opcode) { /* * The first set of opcodes compares the packet's * fields with some pattern, setting 'match' if a * match is found. At the end of the loop there is * logic to deal with F_NOT and F_OR flags associated * with the opcode. */ case O_NOP: match = 1; break; case O_FORWARD_MAC: printf("ipfw: opcode %d unimplemented\n", cmd->opcode); break; case O_GID: case O_UID: case O_JAIL: /* * We only check offset == 0 && proto != 0, * as this ensures that we have a * packet with the ports info. */ if (offset!=0) break; if (is_ipv6) /* XXX to be fixed later */ break; if (proto == IPPROTO_TCP || proto == IPPROTO_UDP) match = check_uidgid( (ipfw_insn_u32 *)cmd, proto, oif, dst_ip, dst_port, src_ip, src_port, &fw_ugid_cache, &ugid_lookup, args->inp); break; case O_RECV: match = iface_match(m->m_pkthdr.rcvif, (ipfw_insn_if *)cmd); break; case O_XMIT: match = iface_match(oif, (ipfw_insn_if *)cmd); break; case O_VIA: match = iface_match(oif ? 
oif : m->m_pkthdr.rcvif, (ipfw_insn_if *)cmd); break; case O_MACADDR2: if (args->eh != NULL) { /* have MAC header */ u_int32_t *want = (u_int32_t *) ((ipfw_insn_mac *)cmd)->addr; u_int32_t *mask = (u_int32_t *) ((ipfw_insn_mac *)cmd)->mask; u_int32_t *hdr = (u_int32_t *)args->eh; match = ( want[0] == (hdr[0] & mask[0]) && want[1] == (hdr[1] & mask[1]) && want[2] == (hdr[2] & mask[2]) ); } break; case O_MAC_TYPE: if (args->eh != NULL) { u_int16_t *p = ((ipfw_insn_u16 *)cmd)->ports; int i; for (i = cmdlen - 1; !match && i>0; i--, p += 2) match = (etype >= p[0] && etype <= p[1]); } break; case O_FRAG: match = (offset != 0); break; case O_IN: /* "out" is "not in" */ match = (oif == NULL); break; case O_LAYER2: match = (args->eh != NULL); break; case O_DIVERTED: match = (cmd->arg1 & 1 && divinput_flags & IP_FW_DIVERT_LOOPBACK_FLAG) || (cmd->arg1 & 2 && divinput_flags & IP_FW_DIVERT_OUTPUT_FLAG); break; case O_PROTO: /* * We do not allow an arg of 0 so the * check of "proto" only suffices. */ match = (proto == cmd->arg1); break; case O_IP_SRC: match = is_ipv4 && (((ipfw_insn_ip *)cmd)->addr.s_addr == src_ip.s_addr); break; case O_IP_SRC_LOOKUP: case O_IP_DST_LOOKUP: if (is_ipv4) { uint32_t a = (cmd->opcode == O_IP_DST_LOOKUP) ? dst_ip.s_addr : src_ip.s_addr; uint32_t v; match = lookup_table(chain, cmd->arg1, a, &v); if (!match) break; if (cmdlen == F_INSN_SIZE(ipfw_insn_u32)) match = ((ipfw_insn_u32 *)cmd)->d[0] == v; else tablearg = v; } break; case O_IP_SRC_MASK: case O_IP_DST_MASK: if (is_ipv4) { uint32_t a = (cmd->opcode == O_IP_DST_MASK) ? dst_ip.s_addr : src_ip.s_addr; uint32_t *p = ((ipfw_insn_u32 *)cmd)->d; int i = cmdlen-1; for (; !match && i>0; i-= 2, p+= 2) match = (p[0] == (a & p[1])); } break; case O_IP_SRC_ME: if (is_ipv4) { struct ifnet *tif; INADDR_TO_IFP(src_ip, tif); match = (tif != NULL); } break; case O_IP_DST_SET: case O_IP_SRC_SET: if (is_ipv4) { u_int32_t *d = (u_int32_t *)(cmd+1); u_int32_t addr = cmd->opcode == O_IP_DST_SET ? args->f_id.dst_ip : args->f_id.src_ip; if (addr < d[0]) break; addr -= d[0]; /* subtract base */ match = (addr < cmd->arg1) && ( d[ 1 + (addr>>5)] & (1<<(addr & 0x1f)) ); } break; case O_IP_DST: match = is_ipv4 && (((ipfw_insn_ip *)cmd)->addr.s_addr == dst_ip.s_addr); break; case O_IP_DST_ME: if (is_ipv4) { struct ifnet *tif; INADDR_TO_IFP(dst_ip, tif); match = (tif != NULL); } break; case O_IP_SRCPORT: case O_IP_DSTPORT: /* * offset == 0 && proto != 0 is enough * to guarantee that we have a * packet with port info. */ if ((proto==IPPROTO_UDP || proto==IPPROTO_TCP) && offset == 0) { u_int16_t x = (cmd->opcode == O_IP_SRCPORT) ? 
src_port : dst_port ; u_int16_t *p = ((ipfw_insn_u16 *)cmd)->ports; int i; for (i = cmdlen - 1; !match && i>0; i--, p += 2) match = (x>=p[0] && x<=p[1]); } break; case O_ICMPTYPE: match = (offset == 0 && proto==IPPROTO_ICMP && icmptype_match(ICMP(ulp), (ipfw_insn_u32 *)cmd) ); break; #ifdef INET6 case O_ICMP6TYPE: match = is_ipv6 && offset == 0 && proto==IPPROTO_ICMPV6 && icmp6type_match( ICMP6(ulp)->icmp6_type, (ipfw_insn_u32 *)cmd); break; #endif /* INET6 */ case O_IPOPT: match = (is_ipv4 && ipopts_match(ip, cmd) ); break; case O_IPVER: match = (is_ipv4 && cmd->arg1 == ip->ip_v); break; case O_IPID: case O_IPLEN: case O_IPTTL: if (is_ipv4) { /* only for IP packets */ uint16_t x; uint16_t *p; int i; if (cmd->opcode == O_IPLEN) x = ip_len; else if (cmd->opcode == O_IPTTL) x = ip->ip_ttl; else /* must be IPID */ x = ntohs(ip->ip_id); if (cmdlen == 1) { match = (cmd->arg1 == x); break; } /* otherwise we have ranges */ p = ((ipfw_insn_u16 *)cmd)->ports; i = cmdlen - 1; for (; !match && i>0; i--, p += 2) match = (x >= p[0] && x <= p[1]); } break; case O_IPPRECEDENCE: match = (is_ipv4 && (cmd->arg1 == (ip->ip_tos & 0xe0)) ); break; case O_IPTOS: match = (is_ipv4 && flags_match(cmd, ip->ip_tos)); break; case O_TCPDATALEN: if (proto == IPPROTO_TCP && offset == 0) { struct tcphdr *tcp; uint16_t x; uint16_t *p; int i; tcp = TCP(ulp); x = ip_len - ((ip->ip_hl + tcp->th_off) << 2); if (cmdlen == 1) { match = (cmd->arg1 == x); break; } /* otherwise we have ranges */ p = ((ipfw_insn_u16 *)cmd)->ports; i = cmdlen - 1; for (; !match && i>0; i--, p += 2) match = (x >= p[0] && x <= p[1]); } break; case O_TCPFLAGS: match = (proto == IPPROTO_TCP && offset == 0 && flags_match(cmd, TCP(ulp)->th_flags)); break; case O_TCPOPTS: match = (proto == IPPROTO_TCP && offset == 0 && tcpopts_match(TCP(ulp), cmd)); break; case O_TCPSEQ: match = (proto == IPPROTO_TCP && offset == 0 && ((ipfw_insn_u32 *)cmd)->d[0] == TCP(ulp)->th_seq); break; case O_TCPACK: match = (proto == IPPROTO_TCP && offset == 0 && ((ipfw_insn_u32 *)cmd)->d[0] == TCP(ulp)->th_ack); break; case O_TCPWIN: match = (proto == IPPROTO_TCP && offset == 0 && cmd->arg1 == TCP(ulp)->th_win); break; case O_ESTAB: /* reject packets which have SYN only */ /* XXX should i also check for TH_ACK ? */ match = (proto == IPPROTO_TCP && offset == 0 && (TCP(ulp)->th_flags & (TH_RST | TH_ACK | TH_SYN)) != TH_SYN); break; case O_ALTQ: { struct pf_mtag *at; ipfw_insn_altq *altq = (ipfw_insn_altq *)cmd; match = 1; at = pf_find_mtag(m); if (at != NULL && at->qid != 0) break; at = pf_get_mtag(m); if (at == NULL) { /* * Let the packet fall back to the * default ALTQ. */ break; } at->qid = altq->qid; if (is_ipv4) at->af = AF_INET; else at->af = AF_LINK; at->hdr = ip; break; } case O_LOG: if (V_fw_verbose) ipfw_log(f, hlen, args, m, oif, offset, tablearg, ip); match = 1; break; case O_PROB: match = (random()<((ipfw_insn_u32 *)cmd)->d[0]); break; case O_VERREVPATH: /* Outgoing packets automatically pass/match */ match = ((oif != NULL) || (m->m_pkthdr.rcvif == NULL) || ( #ifdef INET6 is_ipv6 ? verify_path6(&(args->f_id.src_ip6), m->m_pkthdr.rcvif) : #endif verify_path(src_ip, m->m_pkthdr.rcvif, args->f_id.fib))); break; case O_VERSRCREACH: /* Outgoing packets automatically pass/match */ match = (hlen > 0 && ((oif != NULL) || #ifdef INET6 is_ipv6 ? 
verify_path6(&(args->f_id.src_ip6), NULL) : #endif verify_path(src_ip, NULL, args->f_id.fib))); break; case O_ANTISPOOF: /* Outgoing packets automatically pass/match */ if (oif == NULL && hlen > 0 && ( (is_ipv4 && in_localaddr(src_ip)) #ifdef INET6 || (is_ipv6 && in6_localaddr(&(args->f_id.src_ip6))) #endif )) match = #ifdef INET6 is_ipv6 ? verify_path6( &(args->f_id.src_ip6), m->m_pkthdr.rcvif) : #endif verify_path(src_ip, m->m_pkthdr.rcvif, args->f_id.fib); else match = 1; break; case O_IPSEC: #ifdef IPSEC match = (m_tag_find(m, PACKET_TAG_IPSEC_IN_DONE, NULL) != NULL); #endif /* otherwise no match */ break; #ifdef INET6 case O_IP6_SRC: match = is_ipv6 && IN6_ARE_ADDR_EQUAL(&args->f_id.src_ip6, &((ipfw_insn_ip6 *)cmd)->addr6); break; case O_IP6_DST: match = is_ipv6 && IN6_ARE_ADDR_EQUAL(&args->f_id.dst_ip6, &((ipfw_insn_ip6 *)cmd)->addr6); break; case O_IP6_SRC_MASK: case O_IP6_DST_MASK: if (is_ipv6) { int i = cmdlen - 1; struct in6_addr p; struct in6_addr *d = &((ipfw_insn_ip6 *)cmd)->addr6; for (; !match && i > 0; d += 2, i -= F_INSN_SIZE(struct in6_addr) * 2) { p = (cmd->opcode == O_IP6_SRC_MASK) ? args->f_id.src_ip6: args->f_id.dst_ip6; APPLY_MASK(&p, &d[1]); match = IN6_ARE_ADDR_EQUAL(&d[0], &p); } } break; case O_IP6_SRC_ME: match= is_ipv6 && search_ip6_addr_net(&args->f_id.src_ip6); break; case O_IP6_DST_ME: match= is_ipv6 && search_ip6_addr_net(&args->f_id.dst_ip6); break; case O_FLOW6ID: match = is_ipv6 && flow6id_match(args->f_id.flow_id6, (ipfw_insn_u32 *) cmd); break; case O_EXT_HDR: match = is_ipv6 && (ext_hd & ((ipfw_insn *) cmd)->arg1); break; case O_IP6: match = is_ipv6; break; #endif case O_IP4: match = is_ipv4; break; case O_TAG: { uint32_t tag = (cmd->arg1 == IP_FW_TABLEARG) ? tablearg : cmd->arg1; /* Packet is already tagged with this tag? */ mtag = m_tag_locate(m, MTAG_IPFW, tag, NULL); /* We have `untag' action when F_NOT flag is * present. And we must remove this mtag from * mbuf and reset `match' to zero (`match' will * be inversed later). * Otherwise we should allocate new mtag and * push it into mbuf. */ if (cmd->len & F_NOT) { /* `untag' action */ if (mtag != NULL) m_tag_delete(m, mtag); } else if (mtag == NULL) { if ((mtag = m_tag_alloc(MTAG_IPFW, tag, 0, M_NOWAIT)) != NULL) m_tag_prepend(m, mtag); } match = (cmd->len & F_NOT) ? 0: 1; break; } case O_FIB: /* try match the specified fib */ if (args->f_id.fib == cmd->arg1) match = 1; break; case O_TAGGED: { uint32_t tag = (cmd->arg1 == IP_FW_TABLEARG) ? tablearg : cmd->arg1; if (cmdlen == 1) { match = m_tag_locate(m, MTAG_IPFW, tag, NULL) != NULL; break; } /* we have ranges */ for (mtag = m_tag_first(m); mtag != NULL && !match; mtag = m_tag_next(m, mtag)) { uint16_t *p; int i; if (mtag->m_tag_cookie != MTAG_IPFW) continue; p = ((ipfw_insn_u16 *)cmd)->ports; i = cmdlen - 1; for(; !match && i > 0; i--, p += 2) match = mtag->m_tag_id >= p[0] && mtag->m_tag_id <= p[1]; } break; } /* * The second set of opcodes represents 'actions', * i.e. the terminal part of a rule once the packet * matches all previous patterns. * Typically there is only one action for each rule, * and the opcode is stored at the end of the rule * (but there are exceptions -- see below). * * In general, here we set retval and terminate the * outer loop (would be a 'break 3' in some language, * but we need to do a 'goto done'). 
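 * At the 'done' label the matching rule's pcnt/bcnt counters and
 * its timestamp are updated before the chain lock is released.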
* * Exceptions: * O_COUNT and O_SKIPTO actions: * instead of terminating, we jump to the next rule * ('goto next_rule', equivalent to a 'break 2'), * or to the SKIPTO target ('goto again' after * having set f, cmd and l), respectively. * * O_TAG, O_LOG and O_ALTQ action parameters: * perform some action and set match = 1; * * O_LIMIT and O_KEEP_STATE: these opcodes are * not real 'actions', and are stored right * before the 'action' part of the rule. * These opcodes try to install an entry in the * state tables; if successful, we continue with * the next opcode (match=1; break;), otherwise * the packet * must be dropped * ('goto done' after setting retval); * * O_PROBE_STATE and O_CHECK_STATE: these opcodes * cause a lookup of the state table, and a jump * to the 'action' part of the parent rule * ('goto check_body') if an entry is found, or * (CHECK_STATE only) a jump to the next rule if * the entry is not found ('goto next_rule'). * The result of the lookup is cached to make * further instances of these opcodes are * effectively NOPs. */ case O_LIMIT: case O_KEEP_STATE: if (install_state(f, (ipfw_insn_limit *)cmd, args, tablearg)) { retval = IP_FW_DENY; goto done; /* error/limit violation */ } match = 1; break; case O_PROBE_STATE: case O_CHECK_STATE: /* * dynamic rules are checked at the first * keep-state or check-state occurrence, * with the result being stored in dyn_dir. * The compiler introduces a PROBE_STATE * instruction for us when we have a * KEEP_STATE (because PROBE_STATE needs * to be run first). */ if (dyn_dir == MATCH_UNKNOWN && (q = lookup_dyn_rule(&args->f_id, &dyn_dir, proto == IPPROTO_TCP ? TCP(ulp) : NULL)) != NULL) { /* * Found dynamic entry, update stats * and jump to the 'action' part of * the parent rule. */ q->pcnt++; q->bcnt += pktlen; f = q->rule; cmd = ACTION_PTR(f); l = f->cmd_len - f->act_ofs; IPFW_DYN_UNLOCK(); goto check_body; } /* * Dynamic entry not found. If CHECK_STATE, * skip to next rule, if PROBE_STATE just * ignore and continue with next opcode. */ if (cmd->opcode == O_CHECK_STATE) goto next_rule; match = 1; break; case O_ACCEPT: retval = 0; /* accept */ goto done; case O_PIPE: case O_QUEUE: args->rule = f; /* report matching rule */ if (cmd->arg1 == IP_FW_TABLEARG) args->cookie = tablearg; else args->cookie = cmd->arg1; retval = IP_FW_DUMMYNET; goto done; case O_DIVERT: case O_TEE: { struct divert_tag *dt; if (args->eh) /* not on layer 2 */ break; mtag = m_tag_get(PACKET_TAG_DIVERT, sizeof(struct divert_tag), M_NOWAIT); if (mtag == NULL) { /* XXX statistic */ /* drop packet */ IPFW_RUNLOCK(chain); return (IP_FW_DENY); } dt = (struct divert_tag *)(mtag+1); dt->cookie = f->rulenum; if (cmd->arg1 == IP_FW_TABLEARG) dt->info = tablearg; else dt->info = cmd->arg1; m_tag_prepend(m, mtag); retval = (cmd->opcode == O_DIVERT) ? IP_FW_DIVERT : IP_FW_TEE; goto done; } case O_COUNT: case O_SKIPTO: f->pcnt++; /* update stats */ f->bcnt += pktlen; f->timestamp = time_uptime; if (cmd->opcode == O_COUNT) goto next_rule; /* handle skipto */ if (cmd->arg1 == IP_FW_TABLEARG) { f = lookup_next_rule(f, tablearg); } else { if (f->next_rule == NULL) lookup_next_rule(f, 0); f = f->next_rule; } goto again; case O_REJECT: /* * Drop the packet and send a reject notice * if the packet is not ICMP (or is an ICMP * query), and it is not multicast/broadcast. 
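 * For example, a rule whose argument is ICMP_UNREACH_HOST answers
 * with an ICMP host unreachable, while ICMP_REJECT_RST makes
 * send_reject() reply to a TCP segment with a RST instead of an
 * ICMP message.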
*/ if (hlen > 0 && is_ipv4 && offset == 0 && (proto != IPPROTO_ICMP || is_icmp_query(ICMP(ulp))) && !(m->m_flags & (M_BCAST|M_MCAST)) && !IN_MULTICAST(ntohl(dst_ip.s_addr))) { send_reject(args, cmd->arg1, ip_len, ip); m = args->m; } /* FALLTHROUGH */ #ifdef INET6 case O_UNREACH6: if (hlen > 0 && is_ipv6 && ((offset & IP6F_OFF_MASK) == 0) && (proto != IPPROTO_ICMPV6 || (is_icmp6_query(args->f_id.flags) == 1)) && !(m->m_flags & (M_BCAST|M_MCAST)) && !IN6_IS_ADDR_MULTICAST(&args->f_id.dst_ip6)) { send_reject6( args, cmd->arg1, hlen, (struct ip6_hdr *)ip); m = args->m; } /* FALLTHROUGH */ #endif case O_DENY: retval = IP_FW_DENY; goto done; case O_FORWARD_IP: { struct sockaddr_in *sa; sa = &(((ipfw_insn_sa *)cmd)->sa); if (args->eh) /* not valid on layer2 pkts */ break; if (!q || dyn_dir == MATCH_FORWARD) { if (sa->sin_addr.s_addr == INADDR_ANY) { bcopy(sa, &args->hopstore, sizeof(*sa)); args->hopstore.sin_addr.s_addr = htonl(tablearg); args->next_hop = &args->hopstore; } else { args->next_hop = sa; } } retval = IP_FW_PASS; } goto done; case O_NETGRAPH: case O_NGTEE: args->rule = f; /* report matching rule */ if (cmd->arg1 == IP_FW_TABLEARG) args->cookie = tablearg; else args->cookie = cmd->arg1; retval = (cmd->opcode == O_NETGRAPH) ? IP_FW_NETGRAPH : IP_FW_NGTEE; goto done; case O_SETFIB: f->pcnt++; /* update stats */ f->bcnt += pktlen; f->timestamp = time_uptime; M_SETFIB(m, cmd->arg1); args->f_id.fib = cmd->arg1; goto next_rule; case O_NAT: { struct cfg_nat *t; int nat_id; if (IPFW_NAT_LOADED) { args->rule = f; /* Report matching rule. */ t = ((ipfw_insn_nat *)cmd)->nat; if (t == NULL) { nat_id = (cmd->arg1 == IP_FW_TABLEARG) ? tablearg : cmd->arg1; LOOKUP_NAT(V_layer3_chain, nat_id, t); if (t == NULL) { retval = IP_FW_DENY; goto done; } if (cmd->arg1 != IP_FW_TABLEARG) ((ipfw_insn_nat *)cmd)->nat = t; } retval = ipfw_nat_ptr(args, t, m); } else retval = IP_FW_DENY; goto done; } default: panic("-- unknown opcode %d\n", cmd->opcode); } /* end of switch() on opcodes */ if (cmd->len & F_NOT) match = !match; if (match) { if (cmd->len & F_OR) skip_or = 1; } else { if (!(cmd->len & F_OR)) /* not an OR block, */ break; /* try next rule */ } } /* end of inner for, scan opcodes */ next_rule:; /* try next rule */ } /* end of outer for, scan rules */ printf("ipfw: ouch!, skip past end of rules, denying packet\n"); IPFW_RUNLOCK(chain); return (IP_FW_DENY); done: /* Update statistics */ f->pcnt++; f->bcnt += pktlen; f->timestamp = time_uptime; IPFW_RUNLOCK(chain); return (retval); pullup_failed: if (V_fw_verbose) printf("ipfw: pullup failed\n"); return (IP_FW_DENY); } /* * When a rule is added/deleted, clear the next_rule pointers in all rules. * These will be reconstructed on the fly as packets are matched. */ static void flush_rule_ptrs(struct ip_fw_chain *chain) { struct ip_fw *rule; IPFW_WLOCK_ASSERT(chain); for (rule = chain->rules; rule; rule = rule->next) rule->next_rule = NULL; } /* * Add a new rule to the list. Copy the rule into a malloc'ed area, then * possibly create a rule number and add the rule to the list. * Update the rule_number in the input struct so the caller knows it as well. 
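 *
 * As an illustrative sketch of the auto-numbering performed below: if the
 * caller passes rulenum == 0 and the highest non-default rule is 200, the
 * new rule is numbered 200 + autoinc_step (100 by default), i.e. 300,
 * provided that stays below IPFW_DEFAULT_RULE - autoinc_step; the result
 * is written back into input_rule->rulenum for the caller.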
*/ static int add_rule(struct ip_fw_chain *chain, struct ip_fw *input_rule) { INIT_VNET_IPFW(curvnet); struct ip_fw *rule, *f, *prev; int l = RULESIZE(input_rule); if (chain->rules == NULL && input_rule->rulenum != IPFW_DEFAULT_RULE) return (EINVAL); rule = malloc(l, M_IPFW, M_NOWAIT | M_ZERO); if (rule == NULL) return (ENOSPC); bcopy(input_rule, rule, l); rule->next = NULL; rule->next_rule = NULL; rule->pcnt = 0; rule->bcnt = 0; rule->timestamp = 0; IPFW_WLOCK(chain); if (chain->rules == NULL) { /* default rule */ chain->rules = rule; goto done; } /* * If rulenum is 0, find highest numbered rule before the * default rule, and add autoinc_step */ if (V_autoinc_step < 1) V_autoinc_step = 1; else if (V_autoinc_step > 1000) V_autoinc_step = 1000; if (rule->rulenum == 0) { /* * locate the highest numbered rule before default */ for (f = chain->rules; f; f = f->next) { if (f->rulenum == IPFW_DEFAULT_RULE) break; rule->rulenum = f->rulenum; } if (rule->rulenum < IPFW_DEFAULT_RULE - V_autoinc_step) rule->rulenum += V_autoinc_step; input_rule->rulenum = rule->rulenum; } /* * Now insert the new rule in the right place in the sorted list. */ for (prev = NULL, f = chain->rules; f; prev = f, f = f->next) { if (f->rulenum > rule->rulenum) { /* found the location */ if (prev) { rule->next = f; prev->next = rule; } else { /* head insert */ rule->next = chain->rules; chain->rules = rule; } break; } } flush_rule_ptrs(chain); done: V_static_count++; V_static_len += l; IPFW_WUNLOCK(chain); DEB(printf("ipfw: installed rule %d, static count now %d\n", rule->rulenum, V_static_count);) return (0); } /** * Remove a static rule (including derived * dynamic rules) * and place it on the ``reap list'' for later reclamation. * The caller is in charge of clearing rule pointers to avoid * dangling pointers. * @return a pointer to the next entry. * Arguments are not checked, so they better be correct. */ static struct ip_fw * remove_rule(struct ip_fw_chain *chain, struct ip_fw *rule, struct ip_fw *prev) { INIT_VNET_IPFW(curvnet); struct ip_fw *n; int l = RULESIZE(rule); IPFW_WLOCK_ASSERT(chain); n = rule->next; IPFW_DYN_LOCK(); remove_dyn_rule(rule, NULL /* force removal */); IPFW_DYN_UNLOCK(); if (prev == NULL) chain->rules = n; else prev->next = n; V_static_count--; V_static_len -= l; rule->next = chain->reap; chain->reap = rule; return n; } /** * Reclaim storage associated with a list of rules. This is * typically the list created using remove_rule. */ static void reap_rules(struct ip_fw *head) { struct ip_fw *rule; while ((rule = head) != NULL) { head = head->next; if (DUMMYNET_LOADED) ip_dn_ruledel_ptr(rule); free(rule, M_IPFW); } } /* * Remove all rules from a chain (except rules in set RESVD_SET * unless kill_default = 1). The caller is responsible for * reclaiming storage for the rules left in chain->reap. */ static void free_chain(struct ip_fw_chain *chain, int kill_default) { struct ip_fw *prev, *rule; IPFW_WLOCK_ASSERT(chain); flush_rule_ptrs(chain); /* more efficient to do outside the loop */ for (prev = NULL, rule = chain->rules; rule ; ) if (kill_default || rule->set != RESVD_SET) rule = remove_rule(chain, rule, prev); else { prev = rule; rule = rule->next; } } /** * Remove all rules with given number, and also do set manipulation. * Assumes chain != NULL && *chain != NULL. * * The argument is an u_int32_t. 
The low 16 bit are the rule or set number, * the next 8 bits are the new set, the top 8 bits are the command: * * 0 delete rules with given number * 1 delete rules with given set number * 2 move rules with given number to new set * 3 move rules with given set number to new set * 4 swap sets with given numbers * 5 delete rules with given number and with given set number */ static int del_entry(struct ip_fw_chain *chain, u_int32_t arg) { struct ip_fw *prev = NULL, *rule; u_int16_t rulenum; /* rule or old_set */ u_int8_t cmd, new_set; rulenum = arg & 0xffff; cmd = (arg >> 24) & 0xff; new_set = (arg >> 16) & 0xff; if (cmd > 5 || new_set > RESVD_SET) return EINVAL; if (cmd == 0 || cmd == 2 || cmd == 5) { if (rulenum >= IPFW_DEFAULT_RULE) return EINVAL; } else { if (rulenum > RESVD_SET) /* old_set */ return EINVAL; } IPFW_WLOCK(chain); rule = chain->rules; chain->reap = NULL; switch (cmd) { case 0: /* delete rules with given number */ /* * locate first rule to delete */ for (; rule->rulenum < rulenum; prev = rule, rule = rule->next) ; if (rule->rulenum != rulenum) { IPFW_WUNLOCK(chain); return EINVAL; } /* * flush pointers outside the loop, then delete all matching * rules. prev remains the same throughout the cycle. */ flush_rule_ptrs(chain); while (rule->rulenum == rulenum) rule = remove_rule(chain, rule, prev); break; case 1: /* delete all rules with given set number */ flush_rule_ptrs(chain); rule = chain->rules; while (rule->rulenum < IPFW_DEFAULT_RULE) if (rule->set == rulenum) rule = remove_rule(chain, rule, prev); else { prev = rule; rule = rule->next; } break; case 2: /* move rules with given number to new set */ rule = chain->rules; for (; rule->rulenum < IPFW_DEFAULT_RULE; rule = rule->next) if (rule->rulenum == rulenum) rule->set = new_set; break; case 3: /* move rules with given set number to new set */ for (; rule->rulenum < IPFW_DEFAULT_RULE; rule = rule->next) if (rule->set == rulenum) rule->set = new_set; break; case 4: /* swap two sets */ for (; rule->rulenum < IPFW_DEFAULT_RULE; rule = rule->next) if (rule->set == rulenum) rule->set = new_set; else if (rule->set == new_set) rule->set = rulenum; break; case 5: /* delete rules with given number and with given set number. * rulenum - given rule number; * new_set - given set number. */ for (; rule->rulenum < rulenum; prev = rule, rule = rule->next) ; if (rule->rulenum != rulenum) { IPFW_WUNLOCK(chain); return (EINVAL); } flush_rule_ptrs(chain); while (rule->rulenum == rulenum) { if (rule->set == new_set) rule = remove_rule(chain, rule, prev); else { prev = rule; rule = rule->next; } } } /* * Look for rules to reclaim. We grab the list before * releasing the lock then reclaim them w/o the lock to * avoid a LOR with dummynet. */ rule = chain->reap; chain->reap = NULL; IPFW_WUNLOCK(chain); if (rule) reap_rules(rule); return 0; } /* * Clear counters for a specific rule. * The enclosing "table" is assumed locked. */ static void clear_counters(struct ip_fw *rule, int log_only) { ipfw_insn_log *l = (ipfw_insn_log *)ACTION_PTR(rule); if (log_only == 0) { rule->bcnt = rule->pcnt = 0; rule->timestamp = 0; } if (l->o.opcode == O_LOG) l->log_left = l->max_log; } /** * Reset some or all counters on firewall rules. * The argument `arg' is an u_int32_t. The low 16 bit are the rule number, * the next 8 bits are the set number, the top 8 bits are the command: * 0 work with rules from all set's; * 1 work with rules only from specified set. * Specified rule number is zero if we want to clear all entries. 
* log_only is 1 if we only want to reset logs, zero otherwise. */ static int zero_entry(struct ip_fw_chain *chain, u_int32_t arg, int log_only) { INIT_VNET_IPFW(curvnet); struct ip_fw *rule; char *msg; uint16_t rulenum = arg & 0xffff; uint8_t set = (arg >> 16) & 0xff; uint8_t cmd = (arg >> 24) & 0xff; if (cmd > 1) return (EINVAL); if (cmd == 1 && set > RESVD_SET) return (EINVAL); IPFW_WLOCK(chain); if (rulenum == 0) { V_norule_counter = 0; for (rule = chain->rules; rule; rule = rule->next) { /* Skip rules from another set. */ if (cmd == 1 && rule->set != set) continue; clear_counters(rule, log_only); } msg = log_only ? "ipfw: All logging counts reset.\n" : "ipfw: Accounting cleared.\n"; } else { int cleared = 0; /* * We can have multiple rules with the same number, so we * need to clear them all. */ for (rule = chain->rules; rule; rule = rule->next) if (rule->rulenum == rulenum) { while (rule && rule->rulenum == rulenum) { if (cmd == 0 || rule->set == set) clear_counters(rule, log_only); rule = rule->next; } cleared = 1; break; } if (!cleared) { /* we did not find any matching rules */ IPFW_WUNLOCK(chain); return (EINVAL); } msg = log_only ? "ipfw: Entry %d logging count reset.\n" : "ipfw: Entry %d cleared.\n"; } IPFW_WUNLOCK(chain); if (V_fw_verbose) log(LOG_SECURITY | LOG_NOTICE, msg, rulenum); return (0); } /* * Check validity of the structure before insert. * Fortunately rules are simple, so this mostly need to check rule sizes. */ static int check_ipfw_struct(struct ip_fw *rule, int size) { int l, cmdlen = 0; int have_action=0; ipfw_insn *cmd; if (size < sizeof(*rule)) { printf("ipfw: rule too short\n"); return (EINVAL); } /* first, check for valid size */ l = RULESIZE(rule); if (l != size) { printf("ipfw: size mismatch (have %d want %d)\n", size, l); return (EINVAL); } if (rule->act_ofs >= rule->cmd_len) { printf("ipfw: bogus action offset (%u > %u)\n", rule->act_ofs, rule->cmd_len - 1); return (EINVAL); } /* * Now go for the individual checks. Very simple ones, basically only * instruction sizes. 
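 *
 * For example (derived from the checks below): an action such as O_ACCEPT
 * must be exactly F_INSN_SIZE(ipfw_insn) words long and must be the last
 * opcode of the rule, while a match like O_IP_SRC_MASK must have an odd
 * length of at most 31 words (one opcode word followed by addr/mask pairs).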
*/ for (l = rule->cmd_len, cmd = rule->cmd ; l > 0 ; l -= cmdlen, cmd += cmdlen) { cmdlen = F_LEN(cmd); if (cmdlen > l) { printf("ipfw: opcode %d size truncated\n", cmd->opcode); return EINVAL; } DEB(printf("ipfw: opcode %d\n", cmd->opcode);) switch (cmd->opcode) { case O_PROBE_STATE: case O_KEEP_STATE: case O_PROTO: case O_IP_SRC_ME: case O_IP_DST_ME: case O_LAYER2: case O_IN: case O_FRAG: case O_DIVERTED: case O_IPOPT: case O_IPTOS: case O_IPPRECEDENCE: case O_IPVER: case O_TCPWIN: case O_TCPFLAGS: case O_TCPOPTS: case O_ESTAB: case O_VERREVPATH: case O_VERSRCREACH: case O_ANTISPOOF: case O_IPSEC: #ifdef INET6 case O_IP6_SRC_ME: case O_IP6_DST_ME: case O_EXT_HDR: case O_IP6: #endif case O_IP4: case O_TAG: if (cmdlen != F_INSN_SIZE(ipfw_insn)) goto bad_size; break; case O_FIB: if (cmdlen != F_INSN_SIZE(ipfw_insn)) goto bad_size; if (cmd->arg1 >= rt_numfibs) { printf("ipfw: invalid fib number %d\n", cmd->arg1); return EINVAL; } break; case O_SETFIB: if (cmdlen != F_INSN_SIZE(ipfw_insn)) goto bad_size; if (cmd->arg1 >= rt_numfibs) { printf("ipfw: invalid fib number %d\n", cmd->arg1); return EINVAL; } goto check_action; case O_UID: case O_GID: case O_JAIL: case O_IP_SRC: case O_IP_DST: case O_TCPSEQ: case O_TCPACK: case O_PROB: case O_ICMPTYPE: if (cmdlen != F_INSN_SIZE(ipfw_insn_u32)) goto bad_size; break; case O_LIMIT: if (cmdlen != F_INSN_SIZE(ipfw_insn_limit)) goto bad_size; break; case O_LOG: if (cmdlen != F_INSN_SIZE(ipfw_insn_log)) goto bad_size; ((ipfw_insn_log *)cmd)->log_left = ((ipfw_insn_log *)cmd)->max_log; break; case O_IP_SRC_MASK: case O_IP_DST_MASK: /* only odd command lengths */ if ( !(cmdlen & 1) || cmdlen > 31) goto bad_size; break; case O_IP_SRC_SET: case O_IP_DST_SET: if (cmd->arg1 == 0 || cmd->arg1 > 256) { printf("ipfw: invalid set size %d\n", cmd->arg1); return EINVAL; } if (cmdlen != F_INSN_SIZE(ipfw_insn_u32) + (cmd->arg1+31)/32 ) goto bad_size; break; case O_IP_SRC_LOOKUP: case O_IP_DST_LOOKUP: if (cmd->arg1 >= IPFW_TABLES_MAX) { printf("ipfw: invalid table number %d\n", cmd->arg1); return (EINVAL); } if (cmdlen != F_INSN_SIZE(ipfw_insn) && cmdlen != F_INSN_SIZE(ipfw_insn_u32)) goto bad_size; break; case O_MACADDR2: if (cmdlen != F_INSN_SIZE(ipfw_insn_mac)) goto bad_size; break; case O_NOP: case O_IPID: case O_IPTTL: case O_IPLEN: case O_TCPDATALEN: case O_TAGGED: if (cmdlen < 1 || cmdlen > 31) goto bad_size; break; case O_MAC_TYPE: case O_IP_SRCPORT: case O_IP_DSTPORT: /* XXX artificial limit, 30 port pairs */ if (cmdlen < 2 || cmdlen > 31) goto bad_size; break; case O_RECV: case O_XMIT: case O_VIA: if (cmdlen != F_INSN_SIZE(ipfw_insn_if)) goto bad_size; break; case O_ALTQ: if (cmdlen != F_INSN_SIZE(ipfw_insn_altq)) goto bad_size; break; case O_PIPE: case O_QUEUE: if (cmdlen != F_INSN_SIZE(ipfw_insn)) goto bad_size; goto check_action; case O_FORWARD_IP: #ifdef IPFIREWALL_FORWARD if (cmdlen != F_INSN_SIZE(ipfw_insn_sa)) goto bad_size; goto check_action; #else return EINVAL; #endif case O_DIVERT: case O_TEE: if (ip_divert_ptr == NULL) return EINVAL; else goto check_size; case O_NETGRAPH: case O_NGTEE: if (!NG_IPFW_LOADED) return EINVAL; else goto check_size; case O_NAT: if (!IPFW_NAT_LOADED) return EINVAL; if (cmdlen != F_INSN_SIZE(ipfw_insn_nat)) goto bad_size; goto check_action; case O_FORWARD_MAC: /* XXX not implemented yet */ case O_CHECK_STATE: case O_COUNT: case O_ACCEPT: case O_DENY: case O_REJECT: #ifdef INET6 case O_UNREACH6: #endif case O_SKIPTO: check_size: if (cmdlen != F_INSN_SIZE(ipfw_insn)) goto bad_size; check_action: if (have_action) { 
printf("ipfw: opcode %d, multiple actions" " not allowed\n", cmd->opcode); return EINVAL; } have_action = 1; if (l != cmdlen) { printf("ipfw: opcode %d, action must be" " last opcode\n", cmd->opcode); return EINVAL; } break; #ifdef INET6 case O_IP6_SRC: case O_IP6_DST: if (cmdlen != F_INSN_SIZE(struct in6_addr) + F_INSN_SIZE(ipfw_insn)) goto bad_size; break; case O_FLOW6ID: if (cmdlen != F_INSN_SIZE(ipfw_insn_u32) + ((ipfw_insn_u32 *)cmd)->o.arg1) goto bad_size; break; case O_IP6_SRC_MASK: case O_IP6_DST_MASK: if ( !(cmdlen & 1) || cmdlen > 127) goto bad_size; break; case O_ICMP6TYPE: if( cmdlen != F_INSN_SIZE( ipfw_insn_icmp6 ) ) goto bad_size; break; #endif default: switch (cmd->opcode) { #ifndef INET6 case O_IP6_SRC_ME: case O_IP6_DST_ME: case O_EXT_HDR: case O_IP6: case O_UNREACH6: case O_IP6_SRC: case O_IP6_DST: case O_FLOW6ID: case O_IP6_SRC_MASK: case O_IP6_DST_MASK: case O_ICMP6TYPE: printf("ipfw: no IPv6 support in kernel\n"); return EPROTONOSUPPORT; #endif default: printf("ipfw: opcode %d, unknown opcode\n", cmd->opcode); return EINVAL; } } } if (have_action == 0) { printf("ipfw: missing action\n"); return EINVAL; } return 0; bad_size: printf("ipfw: opcode %d size %d wrong\n", cmd->opcode, cmdlen); return EINVAL; } /* * Copy the static and dynamic rules to the supplied buffer * and return the amount of space actually used. */ static size_t ipfw_getrules(struct ip_fw_chain *chain, void *buf, size_t space) { INIT_VNET_IPFW(curvnet); char *bp = buf; char *ep = bp + space; struct ip_fw *rule; int i; time_t boot_seconds; boot_seconds = boottime.tv_sec; /* XXX this can take a long time and locking will block packet flow */ IPFW_RLOCK(chain); for (rule = chain->rules; rule ; rule = rule->next) { /* * Verify the entry fits in the buffer in case the * rules changed between calculating buffer space and * now. This would be better done using a generation * number but should suffice for now. */ i = RULESIZE(rule); if (bp + i <= ep) { bcopy(rule, bp, i); /* * XXX HACK. Store the disable mask in the "next" * pointer in a wild attempt to keep the ABI the same. * Why do we do this on EVERY rule? */ bcopy(&V_set_disable, &(((struct ip_fw *)bp)->next_rule), sizeof(V_set_disable)); if (((struct ip_fw *)bp)->timestamp) ((struct ip_fw *)bp)->timestamp += boot_seconds; bp += i; } } IPFW_RUNLOCK(chain); if (V_ipfw_dyn_v) { ipfw_dyn_rule *p, *last = NULL; IPFW_DYN_LOCK(); for (i = 0 ; i < V_curr_dyn_buckets; i++) for (p = V_ipfw_dyn_v[i] ; p != NULL; p = p->next) { if (bp + sizeof *p <= ep) { ipfw_dyn_rule *dst = (ipfw_dyn_rule *)bp; bcopy(p, dst, sizeof *p); bcopy(&(p->rule->rulenum), &(dst->rule), sizeof(p->rule->rulenum)); /* * store set number into high word of * dst->rule pointer. */ bcopy(&(p->rule->set), (char *)&dst->rule + sizeof(p->rule->rulenum), sizeof(p->rule->set)); /* * store a non-null value in "next". * The userland code will interpret a * NULL here as a marker * for the last dynamic rule. */ bcopy(&dst, &dst->next, sizeof(dst)); last = dst; dst->expire = TIME_LEQ(dst->expire, time_uptime) ? 0 : dst->expire - time_uptime ; bp += sizeof(ipfw_dyn_rule); } } IPFW_DYN_UNLOCK(); if (last != NULL) /* mark last dynamic rule */ bzero(&last->next, sizeof(last)); } return (bp - (char *)buf); } /** * {set|get}sockopt parser. 
*/ static int ipfw_ctl(struct sockopt *sopt) { #define RULE_MAXSIZE (256*sizeof(u_int32_t)) INIT_VNET_IPFW(curvnet); int error; size_t size; struct ip_fw *buf, *rule; u_int32_t rulenum[2]; error = priv_check(sopt->sopt_td, PRIV_NETINET_IPFW); if (error) return (error); /* * Disallow modifications in really-really secure mode, but still allow * the logging counters to be reset. */ if (sopt->sopt_name == IP_FW_ADD || (sopt->sopt_dir == SOPT_SET && sopt->sopt_name != IP_FW_RESETLOG)) { error = securelevel_ge(sopt->sopt_td->td_ucred, 3); if (error) return (error); } error = 0; switch (sopt->sopt_name) { case IP_FW_GET: /* * pass up a copy of the current rules. Static rules * come first (the last of which has number IPFW_DEFAULT_RULE), * followed by a possibly empty list of dynamic rule. * The last dynamic rule has NULL in the "next" field. * * Note that the calculated size is used to bound the * amount of data returned to the user. The rule set may * change between calculating the size and returning the * data in which case we'll just return what fits. */ size = V_static_len; /* size of static rules */ if (V_ipfw_dyn_v) /* add size of dyn.rules */ size += (V_dyn_count * sizeof(ipfw_dyn_rule)); /* * XXX todo: if the user passes a short length just to know * how much room is needed, do not bother filling up the * buffer, just jump to the sooptcopyout. */ buf = malloc(size, M_TEMP, M_WAITOK); error = sooptcopyout(sopt, buf, ipfw_getrules(&V_layer3_chain, buf, size)); free(buf, M_TEMP); break; case IP_FW_FLUSH: /* * Normally we cannot release the lock on each iteration. * We could do it here only because we start from the head all * the times so there is no risk of missing some entries. * On the other hand, the risk is that we end up with * a very inconsistent ruleset, so better keep the lock * around the whole cycle. * * XXX this code can be improved by resetting the head of * the list to point to the default rule, and then freeing * the old list without the need for a lock. */ IPFW_WLOCK(&V_layer3_chain); V_layer3_chain.reap = NULL; free_chain(&V_layer3_chain, 0 /* keep default rule */); rule = V_layer3_chain.reap; V_layer3_chain.reap = NULL; IPFW_WUNLOCK(&V_layer3_chain); if (rule != NULL) reap_rules(rule); break; case IP_FW_ADD: rule = malloc(RULE_MAXSIZE, M_TEMP, M_WAITOK); error = sooptcopyin(sopt, rule, RULE_MAXSIZE, sizeof(struct ip_fw) ); if (error == 0) error = check_ipfw_struct(rule, sopt->sopt_valsize); if (error == 0) { error = add_rule(&V_layer3_chain, rule); size = RULESIZE(rule); if (!error && sopt->sopt_dir == SOPT_GET) error = sooptcopyout(sopt, rule, size); } free(rule, M_TEMP); break; case IP_FW_DEL: /* * IP_FW_DEL is used for deleting single rules or sets, * and (ab)used to atomically manipulate sets. Argument size * is used to distinguish between the two: * sizeof(u_int32_t) * delete single rule or set of rules, * or reassign rules (or sets) to a different set. * 2*sizeof(u_int32_t) * atomic disable/enable sets. * first u_int32_t contains sets to be disabled, * second u_int32_t contains sets to be enabled. 
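 *
 * Worked examples (illustrative): a single word of 0x02020064 requests
 * command 2 (move) for rule 100 (0x64) into set 2, while the pair
 * { 0x02, 0x04 } atomically disables set 1 and enables set 2.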
 */
		error = sooptcopyin(sopt, rulenum, 2*sizeof(u_int32_t),
			sizeof(u_int32_t));
		if (error)
			break;
		size = sopt->sopt_valsize;
		if (size == sizeof(u_int32_t))	/* delete or reassign */
			error = del_entry(&V_layer3_chain, rulenum[0]);
		else if (size == 2*sizeof(u_int32_t)) /* set enable/disable */
			V_set_disable =
			    (V_set_disable | rulenum[0]) & ~rulenum[1] &
			    ~(1<<RESVD_SET); /* set RESVD_SET always enabled */
		else
			error = EINVAL;
		break;

	case IP_FW_ZERO:
	case IP_FW_RESETLOG: /* argument is an u_int32_t, the rule number */
		rulenum[0] = 0;
		if (sopt->sopt_val != 0) {
			error = sooptcopyin(sopt, rulenum,
				sizeof(u_int32_t), sizeof(u_int32_t));
			if (error)
				break;
		}
		error = zero_entry(&V_layer3_chain, rulenum[0],
			sopt->sopt_name == IP_FW_RESETLOG);
		break;

	case IP_FW_TABLE_ADD:
		{
			ipfw_table_entry ent;

			error = sooptcopyin(sopt, &ent,
			    sizeof(ent), sizeof(ent));
			if (error)
				break;
			error = add_table_entry(&V_layer3_chain, ent.tbl,
			    ent.addr, ent.masklen, ent.value);
		}
		break;

	case IP_FW_TABLE_DEL:
		{
			ipfw_table_entry ent;

			error = sooptcopyin(sopt, &ent,
			    sizeof(ent), sizeof(ent));
			if (error)
				break;
			error = del_table_entry(&V_layer3_chain, ent.tbl,
			    ent.addr, ent.masklen);
		}
		break;

	case IP_FW_TABLE_FLUSH:
		{
			u_int16_t tbl;

			error = sooptcopyin(sopt, &tbl,
			    sizeof(tbl), sizeof(tbl));
			if (error)
				break;
			IPFW_WLOCK(&V_layer3_chain);
			error = flush_table(&V_layer3_chain, tbl);
			IPFW_WUNLOCK(&V_layer3_chain);
		}
		break;

	case IP_FW_TABLE_GETSIZE:
		{
			u_int32_t tbl, cnt;

			if ((error = sooptcopyin(sopt, &tbl, sizeof(tbl),
			    sizeof(tbl))))
				break;
			IPFW_RLOCK(&V_layer3_chain);
			error = count_table(&V_layer3_chain, tbl, &cnt);
			IPFW_RUNLOCK(&V_layer3_chain);
			if (error)
				break;
			error = sooptcopyout(sopt, &cnt, sizeof(cnt));
		}
		break;

	case IP_FW_TABLE_LIST:
		{
			ipfw_table *tbl;

			if (sopt->sopt_valsize < sizeof(*tbl)) {
				error = EINVAL;
				break;
			}
			size = sopt->sopt_valsize;
			tbl = malloc(size, M_TEMP, M_WAITOK);
			error = sooptcopyin(sopt, tbl, size, sizeof(*tbl));
			if (error) {
				free(tbl, M_TEMP);
				break;
			}
			tbl->size = (size - sizeof(*tbl)) /
			    sizeof(ipfw_table_entry);
			IPFW_RLOCK(&V_layer3_chain);
			error = dump_table(&V_layer3_chain, tbl);
			IPFW_RUNLOCK(&V_layer3_chain);
			if (error) {
				free(tbl, M_TEMP);
				break;
			}
			error = sooptcopyout(sopt, tbl, size);
			free(tbl, M_TEMP);
		}
		break;

	case IP_FW_NAT_CFG:
		if (IPFW_NAT_LOADED)
			error = ipfw_nat_cfg_ptr(sopt);
		else {
			printf("IP_FW_NAT_CFG: %s\n",
			    "ipfw_nat not present, please load it");
			error = EINVAL;
		}
		break;

	case IP_FW_NAT_DEL:
		if (IPFW_NAT_LOADED)
			error = ipfw_nat_del_ptr(sopt);
		else {
			printf("IP_FW_NAT_DEL: %s\n",
			    "ipfw_nat not present, please load it");
			error = EINVAL;
		}
		break;

	case IP_FW_NAT_GET_CONFIG:
		if (IPFW_NAT_LOADED)
			error = ipfw_nat_get_cfg_ptr(sopt);
		else {
			printf("IP_FW_NAT_GET_CFG: %s\n",
			    "ipfw_nat not present, please load it");
			error = EINVAL;
		}
		break;

	case IP_FW_NAT_GET_LOG:
		if (IPFW_NAT_LOADED)
			error = ipfw_nat_get_log_ptr(sopt);
		else {
			printf("IP_FW_NAT_GET_LOG: %s\n",
			    "ipfw_nat not present, please load it");
			error = EINVAL;
		}
		break;

	default:
		printf("ipfw: ipfw_ctl invalid option %d\n", sopt->sopt_name);
		error = EINVAL;
	}

	return (error);
#undef RULE_MAXSIZE
}

/**
 * dummynet needs a reference to the default rule, because rules can be
 * deleted while packets hold a reference to them. When this happens,
 * dummynet changes the reference to the default rule (it could well be a
 * NULL pointer, but this way we do not need to check for the special
 * case, plus here we have info on the default behaviour).
 */
struct ip_fw *ip_fw_default_rule;

/*
 * This procedure is only used to handle keepalives.
It is invoked * every dyn_keepalive_period */ static void ipfw_tick(void * __unused unused) { struct mbuf *m0, *m, *mnext, **mtailp; int i; ipfw_dyn_rule *q; if (V_dyn_keepalive == 0 || V_ipfw_dyn_v == NULL || V_dyn_count == 0) goto done; /* * We make a chain of packets to go out here -- not deferring * until after we drop the IPFW dynamic rule lock would result * in a lock order reversal with the normal packet input -> ipfw * call stack. */ m0 = NULL; mtailp = &m0; IPFW_DYN_LOCK(); for (i = 0 ; i < V_curr_dyn_buckets ; i++) { for (q = V_ipfw_dyn_v[i] ; q ; q = q->next ) { if (q->dyn_type == O_LIMIT_PARENT) continue; if (q->id.proto != IPPROTO_TCP) continue; if ( (q->state & BOTH_SYN) != BOTH_SYN) continue; if (TIME_LEQ(time_uptime + V_dyn_keepalive_interval, q->expire)) continue; /* too early */ if (TIME_LEQ(q->expire, time_uptime)) continue; /* too late, rule expired */ *mtailp = send_pkt(NULL, &(q->id), q->ack_rev - 1, q->ack_fwd, TH_SYN); if (*mtailp != NULL) mtailp = &(*mtailp)->m_nextpkt; *mtailp = send_pkt(NULL, &(q->id), q->ack_fwd - 1, q->ack_rev, 0); if (*mtailp != NULL) mtailp = &(*mtailp)->m_nextpkt; } } IPFW_DYN_UNLOCK(); for (m = mnext = m0; m != NULL; m = mnext) { mnext = m->m_nextpkt; m->m_nextpkt = NULL; ip_output(m, NULL, NULL, 0, NULL, NULL); } done: callout_reset(&V_ipfw_timeout, V_dyn_keepalive_period * hz, ipfw_tick, NULL); } int ipfw_init(void) { INIT_VNET_IPFW(curvnet); struct ip_fw default_rule; int error; V_fw_debug = 1; V_autoinc_step = 100; /* bounded to 1..1000 in add_rule() */ V_ipfw_dyn_v = NULL; V_dyn_buckets = 256; /* must be power of 2 */ V_curr_dyn_buckets = 256; /* must be power of 2 */ V_dyn_ack_lifetime = 300; V_dyn_syn_lifetime = 20; V_dyn_fin_lifetime = 1; V_dyn_rst_lifetime = 1; V_dyn_udp_lifetime = 10; V_dyn_short_lifetime = 5; V_dyn_keepalive_interval = 20; V_dyn_keepalive_period = 5; V_dyn_keepalive = 1; /* do send keepalives */ V_dyn_max = 4096; /* max # of dynamic rules */ V_fw_deny_unknown_exthdrs = 1; #ifdef INET6 /* Setup IPv6 fw sysctl tree. */ sysctl_ctx_init(&ip6_fw_sysctl_ctx); ip6_fw_sysctl_tree = SYSCTL_ADD_NODE(&ip6_fw_sysctl_ctx, SYSCTL_STATIC_CHILDREN(_net_inet6_ip6), OID_AUTO, "fw", CTLFLAG_RW | CTLFLAG_SECURE, 0, "Firewall"); SYSCTL_ADD_PROC(&ip6_fw_sysctl_ctx, SYSCTL_CHILDREN(ip6_fw_sysctl_tree), OID_AUTO, "enable", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_SECURE3, &V_fw6_enable, 0, ipfw_chg_hook, "I", "Enable ipfw+6"); SYSCTL_ADD_INT(&ip6_fw_sysctl_ctx, SYSCTL_CHILDREN(ip6_fw_sysctl_tree), OID_AUTO, "deny_unknown_exthdrs", CTLFLAG_RW | CTLFLAG_SECURE, &V_fw_deny_unknown_exthdrs, 0, "Deny packets with unknown IPv6 Extension Headers"); #endif V_layer3_chain.rules = NULL; IPFW_LOCK_INIT(&V_layer3_chain); ipfw_dyn_rule_zone = uma_zcreate("IPFW dynamic rule", sizeof(ipfw_dyn_rule), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); IPFW_DYN_LOCK_INIT(); callout_init(&V_ipfw_timeout, CALLOUT_MPSAFE); bzero(&default_rule, sizeof default_rule); default_rule.act_ofs = 0; default_rule.rulenum = IPFW_DEFAULT_RULE; default_rule.cmd_len = 1; default_rule.set = RESVD_SET; default_rule.cmd[0].len = 1; default_rule.cmd[0].opcode = #ifdef IPFIREWALL_DEFAULT_TO_ACCEPT 1 ? 
O_ACCEPT : #endif O_DENY; error = add_rule(&V_layer3_chain, &default_rule); if (error != 0) { printf("ipfw2: error %u initializing default rule " "(support disabled)\n", error); IPFW_DYN_LOCK_DESTROY(); IPFW_LOCK_DESTROY(&V_layer3_chain); uma_zdestroy(ipfw_dyn_rule_zone); return (error); } ip_fw_default_rule = V_layer3_chain.rules; printf("ipfw2 " #ifdef INET6 "(+ipv6) " #endif "initialized, divert %s, nat %s, " "rule-based forwarding " #ifdef IPFIREWALL_FORWARD "enabled, " #else "disabled, " #endif "default to %s, logging ", #ifdef IPDIVERT "enabled", #else "loadable", #endif #ifdef IPFIREWALL_NAT "enabled", #else "loadable", #endif default_rule.cmd[0].opcode == O_ACCEPT ? "accept" : "deny"); #ifdef IPFIREWALL_VERBOSE V_fw_verbose = 1; #endif #ifdef IPFIREWALL_VERBOSE_LIMIT V_verbose_limit = IPFIREWALL_VERBOSE_LIMIT; #endif if (V_fw_verbose == 0) printf("disabled\n"); else if (V_verbose_limit == 0) printf("unlimited\n"); else printf("limited to %d packets/entry by default\n", V_verbose_limit); error = init_tables(&V_layer3_chain); if (error) { IPFW_DYN_LOCK_DESTROY(); IPFW_LOCK_DESTROY(&V_layer3_chain); uma_zdestroy(ipfw_dyn_rule_zone); return (error); } ip_fw_ctl_ptr = ipfw_ctl; ip_fw_chk_ptr = ipfw_chk; callout_reset(&V_ipfw_timeout, hz, ipfw_tick, NULL); LIST_INIT(&V_layer3_chain.nat); return (0); } void ipfw_destroy(void) { struct ip_fw *reap; ip_fw_chk_ptr = NULL; ip_fw_ctl_ptr = NULL; callout_drain(&V_ipfw_timeout); IPFW_WLOCK(&V_layer3_chain); flush_tables(&V_layer3_chain); V_layer3_chain.reap = NULL; free_chain(&V_layer3_chain, 1 /* kill default rule */); reap = V_layer3_chain.reap, V_layer3_chain.reap = NULL; IPFW_WUNLOCK(&V_layer3_chain); if (reap != NULL) reap_rules(reap); IPFW_DYN_LOCK_DESTROY(); uma_zdestroy(ipfw_dyn_rule_zone); if (V_ipfw_dyn_v != NULL) free(V_ipfw_dyn_v, M_IPFW); IPFW_LOCK_DESTROY(&V_layer3_chain); #ifdef INET6 /* Free IPv6 fw sysctl tree. */ sysctl_ctx_free(&ip6_fw_sysctl_ctx); #endif printf("IP firewall unloaded\n"); } Index: head/sys/netinet/ip_input.c =================================================================== --- head/sys/netinet/ip_input.c (revision 186118) +++ head/sys/netinet/ip_input.c (revision 186119) @@ -1,1704 +1,1704 @@ /*- * Copyright (c) 1982, 1986, 1988, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)ip_input.c 8.2 (Berkeley) 1/4/94 */ #include __FBSDID("$FreeBSD$"); #include "opt_bootp.h" #include "opt_ipfw.h" #include "opt_ipstealth.h" #include "opt_ipsec.h" #include "opt_mac.h" #include "opt_carp.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef DEV_CARP #include #endif #ifdef IPSEC #include #endif /* IPSEC */ #include /* XXX: Temporary until ipfw_ether and ipfw_bridge are converted. */ #include #include #include #ifdef CTASSERT CTASSERT(sizeof(struct ip) == 20); #endif #ifndef VIMAGE #ifndef VIMAGE_GLOBALS struct vnet_inet vnet_inet_0; #endif #endif #ifdef VIMAGE_GLOBALS static int ipsendredirects; static int ip_checkinterface; static int ip_keepfaith; static int ip_sendsourcequench; int ip_defttl; int ip_do_randomid; int ipforwarding; struct in_ifaddrhead in_ifaddrhead; /* first inet address */ struct in_ifaddrhashhead *in_ifaddrhashtbl; /* inet addr hash table */ u_long in_ifaddrhmask; /* mask for hash table */ struct ipstat ipstat; static int ip_rsvp_on; struct socket *ip_rsvpd; int rsvp_on; static struct ipqhead ipq[IPREASS_NHASH]; static int maxnipq; /* Administrative limit on # reass queues. */ static int maxfragsperpacket; int ipstealth; static int nipq; /* Total # of reass queues */ #endif SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip, IPCTL_FORWARDING, forwarding, CTLFLAG_RW, ipforwarding, 0, "Enable IP forwarding between interfaces"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip, IPCTL_SENDREDIRECTS, redirect, CTLFLAG_RW, ipsendredirects, 0, "Enable sending IP redirects"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip, IPCTL_DEFTTL, ttl, CTLFLAG_RW, ip_defttl, 0, "Maximum TTL on IP packets"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip, IPCTL_KEEPFAITH, keepfaith, CTLFLAG_RW, ip_keepfaith, 0, "Enable packet capture for FAITH IPv4->IPv6 translater daemon"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip, OID_AUTO, sendsourcequench, CTLFLAG_RW, ip_sendsourcequench, 0, "Enable the transmission of source quench packets"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip, OID_AUTO, random_id, CTLFLAG_RW, ip_do_randomid, 0, "Assign random ip_id values"); /* * XXX - Setting ip_checkinterface mostly implements the receive side of * the Strong ES model described in RFC 1122, but since the routing table * and transmit implementation do not implement the Strong ES model, * setting this to 1 results in an odd hybrid. * * XXX - ip_checkinterface currently must be disabled if you use ipnat * to translate the destination address to another local interface. * * XXX - ip_checkinterface must be disabled if you add IP aliases * to the loopback interface instead of the interface where the * packets for those addresses are received. 
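 *
 * Example of the effect (illustrative): with net.inet.ip.check_interface
 * set to 1 and forwarding disabled, a unicast packet addressed to one
 * interface but received on another fails the address match in ip_input()
 * and ends up in the "not for us" path, where it is dropped.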
*/ SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip, OID_AUTO, check_interface, CTLFLAG_RW, ip_checkinterface, 0, "Verify packet arrives on correct interface"); struct pfil_head inet_pfil_hook; /* Packet filter hooks */ static struct ifqueue ipintrq; static int ipqmaxlen = IFQ_MAXLEN; extern struct domain inetdomain; extern struct protosw inetsw[]; u_char ip_protox[IPPROTO_MAX]; SYSCTL_INT(_net_inet_ip, IPCTL_INTRQMAXLEN, intr_queue_maxlen, CTLFLAG_RW, &ipintrq.ifq_maxlen, 0, "Maximum size of the IP input queue"); SYSCTL_INT(_net_inet_ip, IPCTL_INTRQDROPS, intr_queue_drops, CTLFLAG_RD, &ipintrq.ifq_drops, 0, "Number of packets dropped from the IP input queue"); SYSCTL_V_STRUCT(V_NET, vnet_inet, _net_inet_ip, IPCTL_STATS, stats, CTLFLAG_RW, ipstat, ipstat, "IP statistics (struct ipstat, netinet/ip_var.h)"); #ifdef VIMAGE_GLOBALS static uma_zone_t ipq_zone; #endif static struct mtx ipqlock; #define IPQ_LOCK() mtx_lock(&ipqlock) #define IPQ_UNLOCK() mtx_unlock(&ipqlock) #define IPQ_LOCK_INIT() mtx_init(&ipqlock, "ipqlock", NULL, MTX_DEF) #define IPQ_LOCK_ASSERT() mtx_assert(&ipqlock, MA_OWNED) static void maxnipq_update(void); static void ipq_zone_change(void *); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip, OID_AUTO, fragpackets, CTLFLAG_RD, nipq, 0, "Current number of IPv4 fragment reassembly queue entries"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip, OID_AUTO, maxfragsperpacket, CTLFLAG_RW, maxfragsperpacket, 0, "Maximum number of IPv4 fragments allowed per packet"); struct callout ipport_tick_callout; #ifdef IPCTL_DEFMTU SYSCTL_INT(_net_inet_ip, IPCTL_DEFMTU, mtu, CTLFLAG_RW, &ip_mtu, 0, "Default MTU"); #endif #ifdef IPSTEALTH SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_ip, OID_AUTO, stealth, CTLFLAG_RW, ipstealth, 0, "IP stealth mode, no TTL decrementation on forwarding"); #endif /* * ipfw_ether and ipfw_bridge hooks. * XXX: Temporary until those are converted to pfil_hooks as well. */ ip_fw_chk_t *ip_fw_chk_ptr = NULL; ip_dn_io_t *ip_dn_io_ptr = NULL; #ifdef VIMAGE_GLOBALS int fw_one_pass; #endif static void ip_freef(struct ipqhead *, struct ipq *); /* * IP initialization: fill in IP protocol switch table. * All protocols not implemented in kernel go to raw IP protocol handler. 
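 *
 * In effect ip_protox[] maps a protocol number to an index into inetsw[]:
 * after initialization ip_protox[IPPROTO_TCP], for instance, selects the
 * TCP protosw entry, while any protocol without its own handler keeps the
 * raw IP slot, so ip_input() can always dispatch via
 * (*inetsw[ip_protox[ip->ip_p]].pr_input)(m, hlen).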
*/ void ip_init(void) { INIT_VNET_INET(curvnet); struct protosw *pr; int i; V_ipsendredirects = 1; /* XXX */ V_ip_checkinterface = 0; V_ip_keepfaith = 0; V_ip_sendsourcequench = 0; V_rsvp_on = 0; V_ip_defttl = IPDEFTTL; V_ip_do_randomid = 0; V_ipforwarding = 0; V_ipstealth = 0; V_nipq = 0; /* Total # of reass queues */ V_ipport_lowfirstauto = IPPORT_RESERVED - 1; /* 1023 */ V_ipport_lowlastauto = IPPORT_RESERVEDSTART; /* 600 */ V_ipport_firstauto = IPPORT_EPHEMERALFIRST; /* 10000 */ V_ipport_lastauto = IPPORT_EPHEMERALLAST; /* 65535 */ V_ipport_hifirstauto = IPPORT_HIFIRSTAUTO; /* 49152 */ V_ipport_hilastauto = IPPORT_HILASTAUTO; /* 65535 */ V_ipport_reservedhigh = IPPORT_RESERVED - 1; /* 1023 */ V_ipport_reservedlow = 0; V_ipport_randomized = 1; /* user controlled via sysctl */ V_ipport_randomcps = 10; /* user controlled via sysctl */ V_ipport_randomtime = 45; /* user controlled via sysctl */ V_ipport_stoprandom = 0; /* toggled by ipport_tick */ V_fw_one_pass = 1; #ifdef NOTYET /* XXX global static but not instantiated in this file */ V_ipfastforward_active = 0; V_subnetsarelocal = 0; V_sameprefixcarponly = 0; #endif TAILQ_INIT(&V_in_ifaddrhead); V_in_ifaddrhashtbl = hashinit(INADDR_NHASH, M_IFADDR, &V_in_ifaddrhmask); pr = pffindproto(PF_INET, IPPROTO_RAW, SOCK_RAW); if (pr == NULL) panic("ip_init: PF_INET not found"); /* Initialize the entire ip_protox[] array to IPPROTO_RAW. */ for (i = 0; i < IPPROTO_MAX; i++) ip_protox[i] = pr - inetsw; /* * Cycle through IP protocols and put them into the appropriate place * in ip_protox[]. */ for (pr = inetdomain.dom_protosw; pr < inetdomain.dom_protoswNPROTOSW; pr++) if (pr->pr_domain->dom_family == PF_INET && pr->pr_protocol && pr->pr_protocol != IPPROTO_RAW) { /* Be careful to only index valid IP protocols. */ if (pr->pr_protocol < IPPROTO_MAX) ip_protox[pr->pr_protocol] = pr - inetsw; } /* Initialize packet filter hooks. */ inet_pfil_hook.ph_type = PFIL_TYPE_AF; inet_pfil_hook.ph_af = AF_INET; if ((i = pfil_head_register(&inet_pfil_hook)) != 0) printf("%s: WARNING: unable to register pfil hook, " "error %d\n", __func__, i); /* Initialize IP reassembly queue. */ IPQ_LOCK_INIT(); for (i = 0; i < IPREASS_NHASH; i++) TAILQ_INIT(&V_ipq[i]); V_maxnipq = nmbclusters / 32; V_maxfragsperpacket = 16; V_ipq_zone = uma_zcreate("ipq", sizeof(struct ipq), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); maxnipq_update(); /* Start ipport_tick. */ callout_init(&ipport_tick_callout, CALLOUT_MPSAFE); ipport_tick(NULL); EVENTHANDLER_REGISTER(shutdown_pre_sync, ip_fini, NULL, SHUTDOWN_PRI_DEFAULT); EVENTHANDLER_REGISTER(nmbclusters_change, ipq_zone_change, NULL, EVENTHANDLER_PRI_ANY); /* Initialize various other remaining things. */ V_ip_id = time_second & 0xffff; ipintrq.ifq_maxlen = ipqmaxlen; mtx_init(&ipintrq.ifq_mtx, "ip_inq", NULL, MTX_DEF); netisr_register(NETISR_IP, ip_input, &ipintrq, 0); } void ip_fini(void *xtp) { callout_stop(&ipport_tick_callout); } /* * Ip input routine. Checksum and byte swap header. If fragmented * try to reassemble. Process options. Pass to next level. */ void ip_input(struct mbuf *m) { INIT_VNET_INET(curvnet); struct ip *ip = NULL; struct in_ifaddr *ia = NULL; struct ifaddr *ifa; int checkif, hlen = 0; u_short sum; int dchg = 0; /* dest changed after fw */ struct in_addr odst; /* original dst address */ M_ASSERTPKTHDR(m); if (m->m_flags & M_FASTFWD_OURS) { /* * Firewall or NAT changed destination to local. * We expect ip_len and ip_off to be in host byte order. 
*/ m->m_flags &= ~M_FASTFWD_OURS; /* Set up some basics that will be used later. */ ip = mtod(m, struct ip *); hlen = ip->ip_hl << 2; goto ours; } V_ipstat.ips_total++; if (m->m_pkthdr.len < sizeof(struct ip)) goto tooshort; if (m->m_len < sizeof (struct ip) && (m = m_pullup(m, sizeof (struct ip))) == NULL) { V_ipstat.ips_toosmall++; return; } ip = mtod(m, struct ip *); if (ip->ip_v != IPVERSION) { V_ipstat.ips_badvers++; goto bad; } hlen = ip->ip_hl << 2; if (hlen < sizeof(struct ip)) { /* minimum header length */ V_ipstat.ips_badhlen++; goto bad; } if (hlen > m->m_len) { if ((m = m_pullup(m, hlen)) == NULL) { V_ipstat.ips_badhlen++; return; } ip = mtod(m, struct ip *); } /* 127/8 must not appear on wire - RFC1122 */ if ((ntohl(ip->ip_dst.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET || (ntohl(ip->ip_src.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET) { if ((m->m_pkthdr.rcvif->if_flags & IFF_LOOPBACK) == 0) { V_ipstat.ips_badaddr++; goto bad; } } if (m->m_pkthdr.csum_flags & CSUM_IP_CHECKED) { sum = !(m->m_pkthdr.csum_flags & CSUM_IP_VALID); } else { if (hlen == sizeof(struct ip)) { sum = in_cksum_hdr(ip); } else { sum = in_cksum(m, hlen); } } if (sum) { V_ipstat.ips_badsum++; goto bad; } #ifdef ALTQ if (altq_input != NULL && (*altq_input)(m, AF_INET) == 0) /* packet is dropped by traffic conditioner */ return; #endif /* * Convert fields to host representation. */ ip->ip_len = ntohs(ip->ip_len); if (ip->ip_len < hlen) { V_ipstat.ips_badlen++; goto bad; } ip->ip_off = ntohs(ip->ip_off); /* * Check that the amount of data in the buffers * is as at least much as the IP header would have us expect. * Trim mbufs if longer than we expect. * Drop packet if shorter than we expect. */ if (m->m_pkthdr.len < ip->ip_len) { tooshort: V_ipstat.ips_tooshort++; goto bad; } if (m->m_pkthdr.len > ip->ip_len) { if (m->m_len == m->m_pkthdr.len) { m->m_len = ip->ip_len; m->m_pkthdr.len = ip->ip_len; } else m_adj(m, ip->ip_len - m->m_pkthdr.len); } #ifdef IPSEC /* * Bypass packet filtering for packets from a tunnel (gif). */ if (ip_ipsec_filtertunnel(m)) goto passin; #endif /* IPSEC */ /* * Run through list of hooks for input packets. * * NB: Beware of the destination address changing (e.g. * by NAT rewriting). When this happens, tell * ip_forward to do the right thing. */ /* Jump over all PFIL processing if hooks are not active. */ if (!PFIL_HOOKED(&inet_pfil_hook)) goto passin; odst = ip->ip_dst; if (pfil_run_hooks(&inet_pfil_hook, &m, m->m_pkthdr.rcvif, PFIL_IN, NULL) != 0) return; if (m == NULL) /* consumed by filter */ return; ip = mtod(m, struct ip *); dchg = (odst.s_addr != ip->ip_dst.s_addr); #ifdef IPFIREWALL_FORWARD if (m->m_flags & M_FASTFWD_OURS) { m->m_flags &= ~M_FASTFWD_OURS; goto ours; } if ((dchg = (m_tag_find(m, PACKET_TAG_IPFORWARD, NULL) != NULL)) != 0) { /* * Directly ship on the packet. This allows to forward packets * that were destined for us to some other directly connected * host. */ ip_forward(m, dchg); return; } #endif /* IPFIREWALL_FORWARD */ passin: /* * Process options and, if not destined for us, * ship it on. ip_dooptions returns 1 when an * error was detected (causing an icmp message * to be sent and the original packet to be freed). */ if (hlen > sizeof (struct ip) && ip_dooptions(m, 0)) return; /* greedy RSVP, snatches any PATH packet of the RSVP protocol and no * matter if it is destined to another node, or whether it is * a multicast one, RSVP wants it! and prevents it from being forwarded * anywhere else. 
Also checks if the rsvp daemon is running before * grabbing the packet. */ if (V_rsvp_on && ip->ip_p==IPPROTO_RSVP) goto ours; /* * Check our list of addresses, to see if the packet is for us. * If we don't have any addresses, assume any unicast packet * we receive might be for us (and let the upper layers deal * with it). */ if (TAILQ_EMPTY(&V_in_ifaddrhead) && (m->m_flags & (M_MCAST|M_BCAST)) == 0) goto ours; /* * Enable a consistency check between the destination address * and the arrival interface for a unicast packet (the RFC 1122 * strong ES model) if IP forwarding is disabled and the packet * is not locally generated and the packet is not subject to * 'ipfw fwd'. * * XXX - Checking also should be disabled if the destination * address is ipnat'ed to a different interface. * * XXX - Checking is incompatible with IP aliases added * to the loopback interface instead of the interface where * the packets are received. * * XXX - This is the case for carp vhost IPs as well so we * insert a workaround. If the packet got here, we already * checked with carp_iamatch() and carp_forus(). */ checkif = V_ip_checkinterface && (V_ipforwarding == 0) && m->m_pkthdr.rcvif != NULL && ((m->m_pkthdr.rcvif->if_flags & IFF_LOOPBACK) == 0) && #ifdef DEV_CARP !m->m_pkthdr.rcvif->if_carp && #endif (dchg == 0); /* * Check for exact addresses in the hash bucket. */ LIST_FOREACH(ia, INADDR_HASH(ip->ip_dst.s_addr), ia_hash) { /* * If the address matches, verify that the packet * arrived via the correct interface if checking is * enabled. */ if (IA_SIN(ia)->sin_addr.s_addr == ip->ip_dst.s_addr && (!checkif || ia->ia_ifp == m->m_pkthdr.rcvif)) goto ours; } /* * Check for broadcast addresses. * * Only accept broadcast packets that arrive via the matching * interface. Reception of forwarded directed broadcasts would * be handled via ip_forward() and ether_output() with the loopback * into the stack for SIMPLEX interfaces handled by ether_output(). */ if (m->m_pkthdr.rcvif != NULL && m->m_pkthdr.rcvif->if_flags & IFF_BROADCAST) { TAILQ_FOREACH(ifa, &m->m_pkthdr.rcvif->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family != AF_INET) continue; ia = ifatoia(ifa); if (satosin(&ia->ia_broadaddr)->sin_addr.s_addr == ip->ip_dst.s_addr) goto ours; if (ia->ia_netbroadcast.s_addr == ip->ip_dst.s_addr) goto ours; #ifdef BOOTP_COMPAT if (IA_SIN(ia)->sin_addr.s_addr == INADDR_ANY) goto ours; #endif } } /* RFC 3927 2.7: Do not forward datagrams for 169.254.0.0/16. */ if (IN_LINKLOCAL(ntohl(ip->ip_dst.s_addr))) { V_ipstat.ips_cantforward++; m_freem(m); return; } if (IN_MULTICAST(ntohl(ip->ip_dst.s_addr))) { struct in_multi *inm; if (V_ip_mrouter) { /* * If we are acting as a multicast router, all * incoming multicast packets are passed to the * kernel-level multicast forwarding function. * The packet is returned (relatively) intact; if * ip_mforward() returns a non-zero value, the packet * must be discarded, else it may be accepted below. */ if (ip_mforward && ip_mforward(ip, m->m_pkthdr.rcvif, m, 0) != 0) { V_ipstat.ips_cantforward++; m_freem(m); return; } /* * The process-level routing daemon needs to receive * all multicast IGMP packets, whether or not this * host belongs to their destination groups. */ if (ip->ip_p == IPPROTO_IGMP) goto ours; V_ipstat.ips_forward++; } /* * See if we belong to the destination multicast group on the * arrival interface. 
*/ IN_MULTI_LOCK(); IN_LOOKUP_MULTI(ip->ip_dst, m->m_pkthdr.rcvif, inm); IN_MULTI_UNLOCK(); if (inm == NULL) { V_ipstat.ips_notmember++; m_freem(m); return; } goto ours; } if (ip->ip_dst.s_addr == (u_long)INADDR_BROADCAST) goto ours; if (ip->ip_dst.s_addr == INADDR_ANY) goto ours; /* * FAITH(Firewall Aided Internet Translator) */ if (m->m_pkthdr.rcvif && m->m_pkthdr.rcvif->if_type == IFT_FAITH) { if (V_ip_keepfaith) { if (ip->ip_p == IPPROTO_TCP || ip->ip_p == IPPROTO_ICMP) goto ours; } m_freem(m); return; } /* * Not for us; forward if possible and desirable. */ if (V_ipforwarding == 0) { V_ipstat.ips_cantforward++; m_freem(m); } else { #ifdef IPSEC if (ip_ipsec_fwd(m)) goto bad; #endif /* IPSEC */ ip_forward(m, dchg); } return; ours: #ifdef IPSTEALTH /* * IPSTEALTH: Process non-routing options only * if the packet is destined for us. */ if (V_ipstealth && hlen > sizeof (struct ip) && ip_dooptions(m, 1)) return; #endif /* IPSTEALTH */ /* Count the packet in the ip address stats */ if (ia != NULL) { ia->ia_ifa.if_ipackets++; ia->ia_ifa.if_ibytes += m->m_pkthdr.len; } /* * Attempt reassembly; if it succeeds, proceed. * ip_reass() will return a different mbuf. */ if (ip->ip_off & (IP_MF | IP_OFFMASK)) { m = ip_reass(m); if (m == NULL) return; ip = mtod(m, struct ip *); /* Get the header length of the reassembled packet */ hlen = ip->ip_hl << 2; } /* * Further protocols expect the packet length to be w/o the * IP header. */ ip->ip_len -= hlen; #ifdef IPSEC /* * enforce IPsec policy checking if we are seeing last header. * note that we do not visit this with protocols with pcb layer * code - like udp/tcp/raw ip. */ if (ip_ipsec_input(m)) goto bad; #endif /* IPSEC */ /* * Switch out to protocol's input routine. */ V_ipstat.ips_delivered++; (*inetsw[ip_protox[ip->ip_p]].pr_input)(m, hlen); return; bad: m_freem(m); } /* * After maxnipq has been updated, propagate the change to UMA. The UMA zone * max has slightly different semantics than the sysctl, for historical * reasons. */ static void maxnipq_update(void) { INIT_VNET_INET(curvnet); /* * -1 for unlimited allocation. */ if (V_maxnipq < 0) uma_zone_set_max(V_ipq_zone, 0); /* * Positive number for specific bound. */ if (V_maxnipq > 0) uma_zone_set_max(V_ipq_zone, V_maxnipq); /* * Zero specifies no further fragment queue allocation -- set the * bound very low, but rely on implementation elsewhere to actually * prevent allocation and reclaim current queues. */ if (V_maxnipq == 0) uma_zone_set_max(V_ipq_zone, 1); } static void ipq_zone_change(void *tag) { INIT_VNET_INET(curvnet); if (V_maxnipq > 0 && V_maxnipq < (nmbclusters / 32)) { V_maxnipq = nmbclusters / 32; maxnipq_update(); } } static int sysctl_maxnipq(SYSCTL_HANDLER_ARGS) { INIT_VNET_INET(curvnet); int error, i; i = V_maxnipq; error = sysctl_handle_int(oidp, &i, 0, req); if (error || !req->newptr) return (error); /* * XXXRW: Might be a good idea to sanity check the argument and place * an extreme upper bound. */ if (i < -1) return (EINVAL); V_maxnipq = i; maxnipq_update(); return (0); } SYSCTL_PROC(_net_inet_ip, OID_AUTO, maxfragpackets, CTLTYPE_INT|CTLFLAG_RW, NULL, 0, sysctl_maxnipq, "I", "Maximum number of IPv4 fragment reassembly queue entries"); /* * Take incoming datagram fragment and try to reassemble it into * whole datagram. If the argument is the first fragment or one * in between the function will return NULL and store the mbuf * in the fragment chain. 
If the argument is the last fragment * the packet will be reassembled and the pointer to the new * mbuf returned for further processing. Only m_tags attached * to the first packet/fragment are preserved. * The IP header is *NOT* adjusted out of iplen. */ struct mbuf * ip_reass(struct mbuf *m) { INIT_VNET_INET(curvnet); struct ip *ip; struct mbuf *p, *q, *nq, *t; struct ipq *fp = NULL; struct ipqhead *head; int i, hlen, next; u_int8_t ecn, ecn0; u_short hash; /* If maxnipq or maxfragsperpacket are 0, never accept fragments. */ if (V_maxnipq == 0 || V_maxfragsperpacket == 0) { V_ipstat.ips_fragments++; V_ipstat.ips_fragdropped++; m_freem(m); return (NULL); } ip = mtod(m, struct ip *); hlen = ip->ip_hl << 2; hash = IPREASS_HASH(ip->ip_src.s_addr, ip->ip_id); head = &V_ipq[hash]; IPQ_LOCK(); /* * Look for queue of fragments * of this datagram. */ TAILQ_FOREACH(fp, head, ipq_list) if (ip->ip_id == fp->ipq_id && ip->ip_src.s_addr == fp->ipq_src.s_addr && ip->ip_dst.s_addr == fp->ipq_dst.s_addr && #ifdef MAC mac_ipq_match(m, fp) && #endif ip->ip_p == fp->ipq_p) goto found; fp = NULL; /* * Attempt to trim the number of allocated fragment queues if it * exceeds the administrative limit. */ if ((V_nipq > V_maxnipq) && (V_maxnipq > 0)) { /* * drop something from the tail of the current queue * before proceeding further */ struct ipq *q = TAILQ_LAST(head, ipqhead); if (q == NULL) { /* gak */ for (i = 0; i < IPREASS_NHASH; i++) { struct ipq *r = TAILQ_LAST(&V_ipq[i], ipqhead); if (r) { V_ipstat.ips_fragtimeout += r->ipq_nfrags; ip_freef(&V_ipq[i], r); break; } } } else { V_ipstat.ips_fragtimeout += q->ipq_nfrags; ip_freef(head, q); } } found: /* * Adjust ip_len to not reflect header, * convert offset of this to bytes. */ ip->ip_len -= hlen; if (ip->ip_off & IP_MF) { /* * Make sure that fragments have a data length * that's a non-zero multiple of 8 bytes. */ if (ip->ip_len == 0 || (ip->ip_len & 0x7) != 0) { V_ipstat.ips_toosmall++; /* XXX */ goto dropfrag; } m->m_flags |= M_FRAG; } else m->m_flags &= ~M_FRAG; ip->ip_off <<= 3; /* * Attempt reassembly; if it succeeds, proceed. * ip_reass() will return a different mbuf. */ V_ipstat.ips_fragments++; m->m_pkthdr.header = ip; /* Previous ip_reass() started here. */ /* * Presence of header sizes in mbufs * would confuse code below. */ m->m_data += hlen; m->m_len -= hlen; /* * If first fragment to arrive, create a reassembly queue. */ if (fp == NULL) { fp = uma_zalloc(V_ipq_zone, M_NOWAIT); if (fp == NULL) goto dropfrag; #ifdef MAC if (mac_ipq_init(fp, M_NOWAIT) != 0) { uma_zfree(V_ipq_zone, fp); fp = NULL; goto dropfrag; } mac_ipq_create(m, fp); #endif TAILQ_INSERT_HEAD(head, fp, ipq_list); V_nipq++; fp->ipq_nfrags = 1; fp->ipq_ttl = IPFRAGTTL; fp->ipq_p = ip->ip_p; fp->ipq_id = ip->ip_id; fp->ipq_src = ip->ip_src; fp->ipq_dst = ip->ip_dst; fp->ipq_frags = m; m->m_nextpkt = NULL; goto done; } else { fp->ipq_nfrags++; #ifdef MAC mac_ipq_update(m, fp); #endif } #define GETIP(m) ((struct ip*)((m)->m_pkthdr.header)) /* * Handle ECN by comparing this segment with the first one; * if CE is set, do not lose CE. * drop if CE and not-ECT are mixed for the same packet. */ ecn = ip->ip_tos & IPTOS_ECN_MASK; ecn0 = GETIP(fp->ipq_frags)->ip_tos & IPTOS_ECN_MASK; if (ecn == IPTOS_ECN_CE) { if (ecn0 == IPTOS_ECN_NOTECT) goto dropfrag; if (ecn0 != IPTOS_ECN_CE) GETIP(fp->ipq_frags)->ip_tos |= IPTOS_ECN_CE; } if (ecn == IPTOS_ECN_NOTECT && ecn0 != IPTOS_ECN_NOTECT) goto dropfrag; /* * Find a segment which begins after this one does. 
*/ for (p = NULL, q = fp->ipq_frags; q; p = q, q = q->m_nextpkt) if (GETIP(q)->ip_off > ip->ip_off) break; /* * If there is a preceding segment, it may provide some of * our data already. If so, drop the data from the incoming * segment. If it provides all of our data, drop us, otherwise * stick new segment in the proper place. * * If some of the data is dropped from the the preceding * segment, then it's checksum is invalidated. */ if (p) { i = GETIP(p)->ip_off + GETIP(p)->ip_len - ip->ip_off; if (i > 0) { if (i >= ip->ip_len) goto dropfrag; m_adj(m, i); m->m_pkthdr.csum_flags = 0; ip->ip_off += i; ip->ip_len -= i; } m->m_nextpkt = p->m_nextpkt; p->m_nextpkt = m; } else { m->m_nextpkt = fp->ipq_frags; fp->ipq_frags = m; } /* * While we overlap succeeding segments trim them or, * if they are completely covered, dequeue them. */ for (; q != NULL && ip->ip_off + ip->ip_len > GETIP(q)->ip_off; q = nq) { i = (ip->ip_off + ip->ip_len) - GETIP(q)->ip_off; if (i < GETIP(q)->ip_len) { GETIP(q)->ip_len -= i; GETIP(q)->ip_off += i; m_adj(q, i); q->m_pkthdr.csum_flags = 0; break; } nq = q->m_nextpkt; m->m_nextpkt = nq; V_ipstat.ips_fragdropped++; fp->ipq_nfrags--; m_freem(q); } /* * Check for complete reassembly and perform frag per packet * limiting. * * Frag limiting is performed here so that the nth frag has * a chance to complete the packet before we drop the packet. * As a result, n+1 frags are actually allowed per packet, but * only n will ever be stored. (n = maxfragsperpacket.) * */ next = 0; for (p = NULL, q = fp->ipq_frags; q; p = q, q = q->m_nextpkt) { if (GETIP(q)->ip_off != next) { if (fp->ipq_nfrags > V_maxfragsperpacket) { V_ipstat.ips_fragdropped += fp->ipq_nfrags; ip_freef(head, fp); } goto done; } next += GETIP(q)->ip_len; } /* Make sure the last packet didn't have the IP_MF flag */ if (p->m_flags & M_FRAG) { if (fp->ipq_nfrags > V_maxfragsperpacket) { V_ipstat.ips_fragdropped += fp->ipq_nfrags; ip_freef(head, fp); } goto done; } /* * Reassembly is complete. Make sure the packet is a sane size. */ q = fp->ipq_frags; ip = GETIP(q); if (next + (ip->ip_hl << 2) > IP_MAXPACKET) { V_ipstat.ips_toolong++; V_ipstat.ips_fragdropped += fp->ipq_nfrags; ip_freef(head, fp); goto done; } /* * Concatenate fragments. */ m = q; t = m->m_next; m->m_next = NULL; m_cat(m, t); nq = q->m_nextpkt; q->m_nextpkt = NULL; for (q = nq; q != NULL; q = nq) { nq = q->m_nextpkt; q->m_nextpkt = NULL; m->m_pkthdr.csum_flags &= q->m_pkthdr.csum_flags; m->m_pkthdr.csum_data += q->m_pkthdr.csum_data; m_cat(m, q); } /* * In order to do checksumming faster we do 'end-around carry' here * (and not in for{} loop), though it implies we are not going to * reassemble more than 64k fragments. */ m->m_pkthdr.csum_data = (m->m_pkthdr.csum_data & 0xffff) + (m->m_pkthdr.csum_data >> 16); #ifdef MAC mac_ipq_reassemble(fp, m); mac_ipq_destroy(fp); #endif /* * Create header for new ip packet by modifying header of first * packet; dequeue and discard fragment reassembly header. * Make header visible. 
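 *
 * At this point "next" holds the total payload length of the reassembled
 * datagram, so ip_len is rebuilt below as header length plus payload,
 * which is the form the rest of ip_input() expects before it strips the
 * header again.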
*/ ip->ip_len = (ip->ip_hl << 2) + next; ip->ip_src = fp->ipq_src; ip->ip_dst = fp->ipq_dst; TAILQ_REMOVE(head, fp, ipq_list); V_nipq--; uma_zfree(V_ipq_zone, fp); m->m_len += (ip->ip_hl << 2); m->m_data -= (ip->ip_hl << 2); /* some debugging cruft by sklower, below, will go away soon */ if (m->m_flags & M_PKTHDR) /* XXX this should be done elsewhere */ m_fixhdr(m); V_ipstat.ips_reassembled++; IPQ_UNLOCK(); return (m); dropfrag: V_ipstat.ips_fragdropped++; if (fp != NULL) fp->ipq_nfrags--; m_freem(m); done: IPQ_UNLOCK(); return (NULL); #undef GETIP } /* * Free a fragment reassembly header and all * associated datagrams. */ static void ip_freef(struct ipqhead *fhp, struct ipq *fp) { INIT_VNET_INET(curvnet); struct mbuf *q; IPQ_LOCK_ASSERT(); while (fp->ipq_frags) { q = fp->ipq_frags; fp->ipq_frags = q->m_nextpkt; m_freem(q); } TAILQ_REMOVE(fhp, fp, ipq_list); uma_zfree(V_ipq_zone, fp); V_nipq--; } /* * IP timer processing; * if a timer expires on a reassembly * queue, discard it. */ void ip_slowtimo(void) { VNET_ITERATOR_DECL(vnet_iter); struct ipq *fp; int i; IPQ_LOCK(); VNET_LIST_RLOCK(); VNET_FOREACH(vnet_iter) { CURVNET_SET(vnet_iter); INIT_VNET_INET(vnet_iter); for (i = 0; i < IPREASS_NHASH; i++) { for(fp = TAILQ_FIRST(&V_ipq[i]); fp;) { struct ipq *fpp; fpp = fp; fp = TAILQ_NEXT(fp, ipq_list); if(--fpp->ipq_ttl == 0) { V_ipstat.ips_fragtimeout += fpp->ipq_nfrags; ip_freef(&V_ipq[i], fpp); } } } /* * If we are over the maximum number of fragments * (due to the limit being lowered), drain off * enough to get down to the new limit. */ if (V_maxnipq >= 0 && V_nipq > V_maxnipq) { for (i = 0; i < IPREASS_NHASH; i++) { while (V_nipq > V_maxnipq && !TAILQ_EMPTY(&V_ipq[i])) { V_ipstat.ips_fragdropped += TAILQ_FIRST(&V_ipq[i])->ipq_nfrags; ip_freef(&V_ipq[i], TAILQ_FIRST(&V_ipq[i])); } } } CURVNET_RESTORE(); } VNET_LIST_RUNLOCK(); IPQ_UNLOCK(); } /* * Drain off all datagram fragments. */ void ip_drain(void) { VNET_ITERATOR_DECL(vnet_iter); int i; IPQ_LOCK(); VNET_LIST_RLOCK(); VNET_FOREACH(vnet_iter) { CURVNET_SET(vnet_iter); INIT_VNET_INET(vnet_iter); for (i = 0; i < IPREASS_NHASH; i++) { while(!TAILQ_EMPTY(&V_ipq[i])) { V_ipstat.ips_fragdropped += TAILQ_FIRST(&V_ipq[i])->ipq_nfrags; ip_freef(&V_ipq[i], TAILQ_FIRST(&V_ipq[i])); } } CURVNET_RESTORE(); } VNET_LIST_RUNLOCK(); IPQ_UNLOCK(); in_rtqdrain(); } /* * The protocol to be inserted into ip_protox[] must be already registered * in inetsw[], either statically or through pf_proto_register(). */ int ipproto_register(u_char ipproto) { struct protosw *pr; /* Sanity checks. */ if (ipproto == 0) return (EPROTONOSUPPORT); /* * The protocol slot must not be occupied by another protocol * already. An index pointing to IPPROTO_RAW is unused. */ pr = pffindproto(PF_INET, IPPROTO_RAW, SOCK_RAW); if (pr == NULL) return (EPFNOSUPPORT); if (ip_protox[ipproto] != pr - inetsw) /* IPPROTO_RAW */ return (EEXIST); /* Find the protocol position in inetsw[] and set the index. */ for (pr = inetdomain.dom_protosw; pr < inetdomain.dom_protoswNPROTOSW; pr++) { if (pr->pr_domain->dom_family == PF_INET && pr->pr_protocol && pr->pr_protocol == ipproto) { /* Be careful to only index valid IP protocols. */ if (pr->pr_protocol < IPPROTO_MAX) { ip_protox[pr->pr_protocol] = pr - inetsw; return (0); } else return (EINVAL); } } return (EPROTONOSUPPORT); } int ipproto_unregister(u_char ipproto) { struct protosw *pr; /* Sanity checks. */ if (ipproto == 0) return (EPROTONOSUPPORT); /* Check if the protocol was indeed registered. 
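 *
 * (Hypothetical usage sketch: a module handling protocol number
 * IPPROTO_FOO, already present in inetsw[] via pf_proto_register(),
 * would claim and later release its ip_protox[] slot with
 *
 *	error = ipproto_register(IPPROTO_FOO);
 * and
 *	error = ipproto_unregister(IPPROTO_FOO);
 *
 * IPPROTO_FOO is only a placeholder here.)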
*/ pr = pffindproto(PF_INET, IPPROTO_RAW, SOCK_RAW); if (pr == NULL) return (EPFNOSUPPORT); if (ip_protox[ipproto] == pr - inetsw) /* IPPROTO_RAW */ return (ENOENT); /* Reset the protocol slot to IPPROTO_RAW. */ ip_protox[ipproto] = pr - inetsw; return (0); } /* * Given address of next destination (final or next hop), * return internet address info of interface to be used to get there. */ struct in_ifaddr * ip_rtaddr(struct in_addr dst, u_int fibnum) { struct route sro; struct sockaddr_in *sin; struct in_ifaddr *ifa; bzero(&sro, sizeof(sro)); sin = (struct sockaddr_in *)&sro.ro_dst; sin->sin_family = AF_INET; sin->sin_len = sizeof(*sin); sin->sin_addr = dst; - in_rtalloc_ign(&sro, RTF_CLONING, fibnum); + in_rtalloc_ign(&sro, 0, fibnum); if (sro.ro_rt == NULL) return (NULL); ifa = ifatoia(sro.ro_rt->rt_ifa); RTFREE(sro.ro_rt); return (ifa); } u_char inetctlerrmap[PRC_NCMDS] = { 0, 0, 0, 0, 0, EMSGSIZE, EHOSTDOWN, EHOSTUNREACH, EHOSTUNREACH, EHOSTUNREACH, ECONNREFUSED, ECONNREFUSED, EMSGSIZE, EHOSTUNREACH, 0, 0, 0, 0, EHOSTUNREACH, 0, ENOPROTOOPT, ECONNREFUSED }; /* * Forward a packet. If some error occurs return the sender * an icmp packet. Note we can't always generate a meaningful * icmp message because icmp doesn't have a large enough repertoire * of codes and types. * * If not forwarding, just drop the packet. This could be confusing * if ipforwarding was zero but some routing protocol was advancing * us as a gateway to somewhere. However, we must let the routing * protocol deal with that. * * The srcrt parameter indicates whether the packet is being forwarded * via a source route. */ void ip_forward(struct mbuf *m, int srcrt) { INIT_VNET_INET(curvnet); struct ip *ip = mtod(m, struct ip *); struct in_ifaddr *ia = NULL; struct mbuf *mcopy; struct in_addr dest; struct route ro; int error, type = 0, code = 0, mtu = 0; if (m->m_flags & (M_BCAST|M_MCAST) || in_canforward(ip->ip_dst) == 0) { V_ipstat.ips_cantforward++; m_freem(m); return; } #ifdef IPSTEALTH if (!V_ipstealth) { #endif if (ip->ip_ttl <= IPTTLDEC) { icmp_error(m, ICMP_TIMXCEED, ICMP_TIMXCEED_INTRANS, 0, 0); return; } #ifdef IPSTEALTH } #endif ia = ip_rtaddr(ip->ip_dst, M_GETFIB(m)); if (!srcrt && ia == NULL) { icmp_error(m, ICMP_UNREACH, ICMP_UNREACH_HOST, 0, 0); return; } /* * Save the IP header and at most 8 bytes of the payload, * in case we need to generate an ICMP message to the src. * * XXX this can be optimized a lot by saving the data in a local * buffer on the stack (72 bytes at most), and only allocating the * mbuf if really necessary. The vast majority of the packets * are forwarded without having to send an ICMP back (either * because unnecessary, or because rate limited), so we are * really we are wasting a lot of work here. * * We don't use m_copy() because it might return a reference * to a shared cluster. Both this function and ip_output() * assume exclusive access to the IP header in `m', so any * data in a cluster may change before we reach icmp_error(). */ MGETHDR(mcopy, M_DONTWAIT, m->m_type); if (mcopy != NULL && !m_dup_pkthdr(mcopy, m, M_DONTWAIT)) { /* * It's probably ok if the pkthdr dup fails (because * the deep copy of the tag chain failed), but for now * be conservative and just discard the copy since * code below may some day want the tags. 
*/ m_free(mcopy); mcopy = NULL; } if (mcopy != NULL) { mcopy->m_len = min(ip->ip_len, M_TRAILINGSPACE(mcopy)); mcopy->m_pkthdr.len = mcopy->m_len; m_copydata(m, 0, mcopy->m_len, mtod(mcopy, caddr_t)); } #ifdef IPSTEALTH if (!V_ipstealth) { #endif ip->ip_ttl -= IPTTLDEC; #ifdef IPSTEALTH } #endif /* * If forwarding packet using same interface that it came in on, * perhaps should send a redirect to sender to shortcut a hop. * Only send redirect if source is sending directly to us, * and if packet was not source routed (or has any options). * Also, don't send redirect if forwarding using a default route * or a route modified by a redirect. */ dest.s_addr = 0; if (!srcrt && V_ipsendredirects && ia->ia_ifp == m->m_pkthdr.rcvif) { struct sockaddr_in *sin; struct rtentry *rt; bzero(&ro, sizeof(ro)); sin = (struct sockaddr_in *)&ro.ro_dst; sin->sin_family = AF_INET; sin->sin_len = sizeof(*sin); sin->sin_addr = ip->ip_dst; - in_rtalloc_ign(&ro, RTF_CLONING, M_GETFIB(m)); + in_rtalloc_ign(&ro, 0, M_GETFIB(m)); rt = ro.ro_rt; if (rt && (rt->rt_flags & (RTF_DYNAMIC|RTF_MODIFIED)) == 0 && satosin(rt_key(rt))->sin_addr.s_addr != 0) { #define RTA(rt) ((struct in_ifaddr *)(rt->rt_ifa)) u_long src = ntohl(ip->ip_src.s_addr); if (RTA(rt) && (src & RTA(rt)->ia_subnetmask) == RTA(rt)->ia_subnet) { if (rt->rt_flags & RTF_GATEWAY) dest.s_addr = satosin(rt->rt_gateway)->sin_addr.s_addr; else dest.s_addr = ip->ip_dst.s_addr; /* Router requirements says to only send host redirects */ type = ICMP_REDIRECT; code = ICMP_REDIRECT_HOST; } } if (rt) RTFREE(rt); } /* * Try to cache the route MTU from ip_output so we can consider it for * the ICMP_UNREACH_NEEDFRAG "Next-Hop MTU" field described in RFC1191. */ bzero(&ro, sizeof(ro)); error = ip_output(m, NULL, &ro, IP_FORWARDING, NULL, NULL); if (error == EMSGSIZE && ro.ro_rt) mtu = ro.ro_rt->rt_rmx.rmx_mtu; if (ro.ro_rt) RTFREE(ro.ro_rt); if (error) V_ipstat.ips_cantforward++; else { V_ipstat.ips_forward++; if (type) V_ipstat.ips_redirectsent++; else { if (mcopy) m_freem(mcopy); return; } } if (mcopy == NULL) return; switch (error) { case 0: /* forwarded, but need redirect */ /* type, code set above */ break; case ENETUNREACH: /* shouldn't happen, checked above */ case EHOSTUNREACH: case ENETDOWN: case EHOSTDOWN: default: type = ICMP_UNREACH; code = ICMP_UNREACH_HOST; break; case EMSGSIZE: type = ICMP_UNREACH; code = ICMP_UNREACH_NEEDFRAG; #ifdef IPSEC /* * If IPsec is configured for this path, * override any possibly mtu value set by ip_output. */ mtu = ip_ipsec_mtu(m, mtu); #endif /* IPSEC */ /* * If the MTU was set before make sure we are below the * interface MTU. * If the MTU wasn't set before use the interface mtu or * fall back to the next smaller mtu step compared to the * current packet size. */ if (mtu != 0) { if (ia != NULL) mtu = min(mtu, ia->ia_ifp->if_mtu); } else { if (ia != NULL) mtu = ia->ia_ifp->if_mtu; else mtu = ip_next_mtu(ip->ip_len, 0); } V_ipstat.ips_cantfrag++; break; case ENOBUFS: /* * A router should not generate ICMP_SOURCEQUENCH as * required in RFC1812 Requirements for IP Version 4 Routers. * Source quench could be a big problem under DoS attacks, * or if the underlying interface is rate-limited. * Those who need source quench packets may re-enable them * via the net.inet.ip.sendsourcequench sysctl. 
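 * (For instance, "sysctl net.inet.ip.sendsourcequench=1" turns them
 * back on.)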
*/ if (V_ip_sendsourcequench == 0) { m_freem(mcopy); return; } else { type = ICMP_SOURCEQUENCH; code = 0; } break; case EACCES: /* ipfw denied packet */ m_freem(mcopy); return; } icmp_error(mcopy, type, code, dest.s_addr, mtu); } void ip_savecontrol(struct inpcb *inp, struct mbuf **mp, struct ip *ip, struct mbuf *m) { INIT_VNET_NET(inp->inp_vnet); if (inp->inp_socket->so_options & (SO_BINTIME | SO_TIMESTAMP)) { struct bintime bt; bintime(&bt); if (inp->inp_socket->so_options & SO_BINTIME) { *mp = sbcreatecontrol((caddr_t) &bt, sizeof(bt), SCM_BINTIME, SOL_SOCKET); if (*mp) mp = &(*mp)->m_next; } if (inp->inp_socket->so_options & SO_TIMESTAMP) { struct timeval tv; bintime2timeval(&bt, &tv); *mp = sbcreatecontrol((caddr_t) &tv, sizeof(tv), SCM_TIMESTAMP, SOL_SOCKET); if (*mp) mp = &(*mp)->m_next; } } if (inp->inp_flags & INP_RECVDSTADDR) { *mp = sbcreatecontrol((caddr_t) &ip->ip_dst, sizeof(struct in_addr), IP_RECVDSTADDR, IPPROTO_IP); if (*mp) mp = &(*mp)->m_next; } if (inp->inp_flags & INP_RECVTTL) { *mp = sbcreatecontrol((caddr_t) &ip->ip_ttl, sizeof(u_char), IP_RECVTTL, IPPROTO_IP); if (*mp) mp = &(*mp)->m_next; } #ifdef notyet /* XXX * Moving these out of udp_input() made them even more broken * than they already were. */ /* options were tossed already */ if (inp->inp_flags & INP_RECVOPTS) { *mp = sbcreatecontrol((caddr_t) opts_deleted_above, sizeof(struct in_addr), IP_RECVOPTS, IPPROTO_IP); if (*mp) mp = &(*mp)->m_next; } /* ip_srcroute doesn't do what we want here, need to fix */ if (inp->inp_flags & INP_RECVRETOPTS) { *mp = sbcreatecontrol((caddr_t) ip_srcroute(m), sizeof(struct in_addr), IP_RECVRETOPTS, IPPROTO_IP); if (*mp) mp = &(*mp)->m_next; } #endif if (inp->inp_flags & INP_RECVIF) { struct ifnet *ifp; struct sdlbuf { struct sockaddr_dl sdl; u_char pad[32]; } sdlbuf; struct sockaddr_dl *sdp; struct sockaddr_dl *sdl2 = &sdlbuf.sdl; if (((ifp = m->m_pkthdr.rcvif)) && ( ifp->if_index && (ifp->if_index <= V_if_index))) { sdp = (struct sockaddr_dl *)ifp->if_addr->ifa_addr; /* * Change our mind and don't try copy. */ if ((sdp->sdl_family != AF_LINK) || (sdp->sdl_len > sizeof(sdlbuf))) { goto makedummy; } bcopy(sdp, sdl2, sdp->sdl_len); } else { makedummy: sdl2->sdl_len = offsetof(struct sockaddr_dl, sdl_data[0]); sdl2->sdl_family = AF_LINK; sdl2->sdl_index = 0; sdl2->sdl_nlen = sdl2->sdl_alen = sdl2->sdl_slen = 0; } *mp = sbcreatecontrol((caddr_t) sdl2, sdl2->sdl_len, IP_RECVIF, IPPROTO_IP); if (*mp) mp = &(*mp)->m_next; } } /* * XXXRW: Multicast routing code in ip_mroute.c is generally MPSAFE, but the * ip_rsvp and ip_rsvp_on variables need to be interlocked with rsvp_on * locking. This code remains in ip_input.c as ip_mroute.c is optionally * compiled. */ int ip_rsvp_init(struct socket *so) { INIT_VNET_INET(so->so_vnet); if (so->so_type != SOCK_RAW || so->so_proto->pr_protocol != IPPROTO_RSVP) return EOPNOTSUPP; if (V_ip_rsvpd != NULL) return EADDRINUSE; V_ip_rsvpd = so; /* * This may seem silly, but we need to be sure we don't over-increment * the RSVP counter, in case something slips up. */ if (!V_ip_rsvp_on) { V_ip_rsvp_on = 1; V_rsvp_on++; } return 0; } int ip_rsvp_done(void) { INIT_VNET_INET(curvnet); V_ip_rsvpd = NULL; /* * This may seem silly, but we need to be sure we don't over-decrement * the RSVP counter, in case something slips up. 
*/ if (V_ip_rsvp_on) { V_ip_rsvp_on = 0; V_rsvp_on--; } return 0; } void rsvp_input(struct mbuf *m, int off) /* XXX must fixup manually */ { INIT_VNET_INET(curvnet); if (rsvp_input_p) { /* call the real one if loaded */ rsvp_input_p(m, off); return; } /* Can still get packets with rsvp_on = 0 if there is a local member * of the group to which the RSVP packet is addressed. But in this * case we want to throw the packet away. */ if (!V_rsvp_on) { m_freem(m); return; } if (V_ip_rsvpd != NULL) { rip_input(m, off); return; } /* Drop the packet */ m_freem(m); } Index: head/sys/netinet/ip_output.c =================================================================== --- head/sys/netinet/ip_output.c (revision 186118) +++ head/sys/netinet/ip_output.c (revision 186119) @@ -1,1197 +1,1196 @@ /*- * Copyright (c) 1982, 1986, 1988, 1990, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)ip_output.c 8.3 (Berkeley) 1/21/94 */ #include __FBSDID("$FreeBSD$"); #include "opt_ipfw.h" #include "opt_ipsec.h" #include "opt_mac.h" #include "opt_mbuf_stress_test.h" #include "opt_mpath.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef RADIX_MPATH #include #endif #include #include #include #include #include #include #include #include #include #ifdef IPSEC #include #include #endif /* IPSEC*/ #include #include #define print_ip(x, a, y) printf("%s %d.%d.%d.%d%s",\ x, (ntohl(a.s_addr)>>24)&0xFF,\ (ntohl(a.s_addr)>>16)&0xFF,\ (ntohl(a.s_addr)>>8)&0xFF,\ (ntohl(a.s_addr))&0xFF, y); #ifdef VIMAGE_GLOBALS u_short ip_id; #endif #ifdef MBUF_STRESS_TEST int mbuf_frag_size = 0; SYSCTL_INT(_net_inet_ip, OID_AUTO, mbuf_frag_size, CTLFLAG_RW, &mbuf_frag_size, 0, "Fragment outgoing mbufs to this size"); #endif static void ip_mloopback (struct ifnet *, struct mbuf *, struct sockaddr_in *, int); extern struct protosw inetsw[]; /* * IP output. The packet in mbuf chain m contains a skeletal IP * header (with len, off, ttl, proto, tos, src, dst). 
* The mbuf chain containing the packet will be freed. * The mbuf opt, if present, will not be freed. * In the IP forwarding case, the packet will arrive with options already * inserted, so must have a NULL opt pointer. */ int ip_output(struct mbuf *m, struct mbuf *opt, struct route *ro, int flags, struct ip_moptions *imo, struct inpcb *inp) { INIT_VNET_NET(curvnet); INIT_VNET_INET(curvnet); struct ip *ip; struct ifnet *ifp = NULL; /* keep compiler happy */ struct mbuf *m0; int hlen = sizeof (struct ip); int mtu; int len, error = 0; struct sockaddr_in *dst = NULL; /* keep compiler happy */ struct in_ifaddr *ia = NULL; int isbroadcast, sw_csum; struct route iproute; struct in_addr odst; #ifdef IPFIREWALL_FORWARD struct m_tag *fwd_tag = NULL; #endif M_ASSERTPKTHDR(m); if (ro == NULL) { ro = &iproute; bzero(ro, sizeof (*ro)); } if (inp != NULL) { M_SETFIB(m, inp->inp_inc.inc_fibnum); INP_LOCK_ASSERT(inp); } if (opt) { len = 0; m = ip_insertoptions(m, opt, &len); if (len != 0) hlen = len; } ip = mtod(m, struct ip *); /* * Fill in IP header. If we are not allowing fragmentation, * then the ip_id field is meaningless, but we don't set it * to zero. Doing so causes various problems when devices along * the path (routers, load balancers, firewalls, etc.) illegally * disable DF on our packet. Note that a 16-bit counter * will wrap around in less than 10 seconds at 100 Mbit/s on a * medium with MTU 1500. See Steven M. Bellovin, "A Technique * for Counting NATted Hosts", Proc. IMW'02, available at * . */ if ((flags & (IP_FORWARDING|IP_RAWOUTPUT)) == 0) { ip->ip_v = IPVERSION; ip->ip_hl = hlen >> 2; ip->ip_id = ip_newid(); V_ipstat.ips_localout++; } else { hlen = ip->ip_hl << 2; } dst = (struct sockaddr_in *)&ro->ro_dst; again: /* * If there is a cached route, * check that it is to the same destination * and is still up. If not, free it and try again. * The address family should also be checked in case of sharing the * cache with IPv6. */ if (ro->ro_rt && ((ro->ro_rt->rt_flags & RTF_UP) == 0 || dst->sin_family != AF_INET || dst->sin_addr.s_addr != ip->ip_dst.s_addr)) { RTFREE(ro->ro_rt); ro->ro_rt = (struct rtentry *)NULL; } #ifdef IPFIREWALL_FORWARD if (ro->ro_rt == NULL && fwd_tag == NULL) { #else if (ro->ro_rt == NULL) { #endif bzero(dst, sizeof(*dst)); dst->sin_family = AF_INET; dst->sin_len = sizeof(*dst); dst->sin_addr = ip->ip_dst; } /* * If routing to interface only, short circuit routing lookup. * The use of an all-ones broadcast address implies this; an * interface is specified by the broadcast address of an interface, * or the destination address of a ptp interface. */ if (flags & IP_SENDONES) { if ((ia = ifatoia(ifa_ifwithbroadaddr(sintosa(dst)))) == NULL && (ia = ifatoia(ifa_ifwithdstaddr(sintosa(dst)))) == NULL) { V_ipstat.ips_noroute++; error = ENETUNREACH; goto bad; } ip->ip_dst.s_addr = INADDR_BROADCAST; dst->sin_addr = ip->ip_dst; ifp = ia->ia_ifp; ip->ip_ttl = 1; isbroadcast = 1; } else if (flags & IP_ROUTETOIF) { if ((ia = ifatoia(ifa_ifwithdstaddr(sintosa(dst)))) == NULL && (ia = ifatoia(ifa_ifwithnet(sintosa(dst)))) == NULL) { V_ipstat.ips_noroute++; error = ENETUNREACH; goto bad; } ifp = ia->ia_ifp; ip->ip_ttl = 1; isbroadcast = in_broadcast(dst->sin_addr, ifp); } else if (IN_MULTICAST(ntohl(ip->ip_dst.s_addr)) && imo != NULL && imo->imo_multicast_ifp != NULL) { /* * Bypass the normal routing lookup for multicast * packets if the interface is specified. 
*/ ifp = imo->imo_multicast_ifp; IFP_TO_IA(ifp, ia); isbroadcast = 0; /* fool gcc */ } else { /* * We want to do any cloning requested by the link layer, * as this is probably required in all cases for correct * operation (as it is for ARP). */ if (ro->ro_rt == NULL) #ifdef RADIX_MPATH rtalloc_mpath_fib(ro, ntohl(ip->ip_src.s_addr ^ ip->ip_dst.s_addr), inp ? inp->inp_inc.inc_fibnum : M_GETFIB(m)); #else in_rtalloc_ign(ro, 0, inp ? inp->inp_inc.inc_fibnum : M_GETFIB(m)); #endif if (ro->ro_rt == NULL) { V_ipstat.ips_noroute++; error = EHOSTUNREACH; goto bad; } ia = ifatoia(ro->ro_rt->rt_ifa); ifp = ro->ro_rt->rt_ifp; ro->ro_rt->rt_rmx.rmx_pksent++; if (ro->ro_rt->rt_flags & RTF_GATEWAY) dst = (struct sockaddr_in *)ro->ro_rt->rt_gateway; if (ro->ro_rt->rt_flags & RTF_HOST) isbroadcast = (ro->ro_rt->rt_flags & RTF_BROADCAST); else isbroadcast = in_broadcast(dst->sin_addr, ifp); } /* * Calculate MTU. If we have a route that is up, use that, * otherwise use the interface's MTU. */ if (ro->ro_rt != NULL && (ro->ro_rt->rt_flags & (RTF_UP|RTF_HOST))) { /* * This case can happen if the user changed the MTU * of an interface after enabling IP on it. Because * most netifs don't keep track of routes pointing to * them, there is no way for one to update all its * routes when the MTU is changed. */ if (ro->ro_rt->rt_rmx.rmx_mtu > ifp->if_mtu) ro->ro_rt->rt_rmx.rmx_mtu = ifp->if_mtu; mtu = ro->ro_rt->rt_rmx.rmx_mtu; } else { mtu = ifp->if_mtu; } if (IN_MULTICAST(ntohl(ip->ip_dst.s_addr))) { struct in_multi *inm; m->m_flags |= M_MCAST; /* * IP destination address is multicast. Make sure "dst" * still points to the address in "ro". (It may have been * changed to point to a gateway address, above.) */ dst = (struct sockaddr_in *)&ro->ro_dst; /* * See if the caller provided any multicast options */ if (imo != NULL) { ip->ip_ttl = imo->imo_multicast_ttl; if (imo->imo_multicast_vif != -1) ip->ip_src.s_addr = ip_mcast_src ? ip_mcast_src(imo->imo_multicast_vif) : INADDR_ANY; } else ip->ip_ttl = IP_DEFAULT_MULTICAST_TTL; /* * Confirm that the outgoing interface supports multicast. */ if ((imo == NULL) || (imo->imo_multicast_vif == -1)) { if ((ifp->if_flags & IFF_MULTICAST) == 0) { V_ipstat.ips_noroute++; error = ENETUNREACH; goto bad; } } /* * If source address not specified yet, use address * of outgoing interface. */ if (ip->ip_src.s_addr == INADDR_ANY) { /* Interface may have no addresses. */ if (ia != NULL) ip->ip_src = IA_SIN(ia)->sin_addr; } IN_MULTI_LOCK(); IN_LOOKUP_MULTI(ip->ip_dst, ifp, inm); if (inm != NULL && (imo == NULL || imo->imo_multicast_loop)) { IN_MULTI_UNLOCK(); /* * If we belong to the destination multicast group * on the outgoing interface, and the caller did not * forbid loopback, loop back a copy. */ ip_mloopback(ifp, m, dst, hlen); } else { IN_MULTI_UNLOCK(); /* * If we are acting as a multicast router, perform * multicast forwarding as if the packet had just * arrived on the interface to which we are about * to send. The multicast forwarding function * recursively calls this function, using the * IP_FORWARDING flag to prevent infinite recursion. * * Multicasts that are looped back by ip_mloopback(), * above, will be forwarded by the ip_input() routine, * if necessary. */ if (V_ip_mrouter && (flags & IP_FORWARDING) == 0) { /* * If rsvp daemon is not running, do not * set ip_moptions. This ensures that the packet * is multicast and not just sent down one link * as prescribed by rsvpd. 
*/ if (!V_rsvp_on) imo = NULL; if (ip_mforward && ip_mforward(ip, ifp, m, imo) != 0) { m_freem(m); goto done; } } } /* * Multicasts with a time-to-live of zero may be looped- * back, above, but must not be transmitted on a network. * Also, multicasts addressed to the loopback interface * are not sent -- the above call to ip_mloopback() will * loop back a copy if this host actually belongs to the * destination group on the loopback interface. */ if (ip->ip_ttl == 0 || ifp->if_flags & IFF_LOOPBACK) { m_freem(m); goto done; } goto sendit; } /* * If the source address is not specified yet, use the address * of the outoing interface. */ if (ip->ip_src.s_addr == INADDR_ANY) { /* Interface may have no addresses. */ if (ia != NULL) { ip->ip_src = IA_SIN(ia)->sin_addr; } } /* * Verify that we have any chance at all of being able to queue the * packet or packet fragments, unless ALTQ is enabled on the given * interface in which case packetdrop should be done by queueing. */ #ifdef ALTQ if ((!ALTQ_IS_ENABLED(&ifp->if_snd)) && ((ifp->if_snd.ifq_len + ip->ip_len / mtu + 1) >= ifp->if_snd.ifq_maxlen)) #else if ((ifp->if_snd.ifq_len + ip->ip_len / mtu + 1) >= ifp->if_snd.ifq_maxlen) #endif /* ALTQ */ { error = ENOBUFS; V_ipstat.ips_odropped++; ifp->if_snd.ifq_drops += (ip->ip_len / ifp->if_mtu + 1); goto bad; } /* * Look for broadcast address and * verify user is allowed to send * such a packet. */ if (isbroadcast) { if ((ifp->if_flags & IFF_BROADCAST) == 0) { error = EADDRNOTAVAIL; goto bad; } if ((flags & IP_ALLOWBROADCAST) == 0) { error = EACCES; goto bad; } /* don't allow broadcast messages to be fragmented */ if (ip->ip_len > mtu) { error = EMSGSIZE; goto bad; } m->m_flags |= M_BCAST; } else { m->m_flags &= ~M_BCAST; } sendit: #ifdef IPSEC switch(ip_ipsec_output(&m, inp, &flags, &error, &ro, &iproute, &dst, &ia, &ifp)) { case 1: goto bad; case -1: goto done; case 0: default: break; /* Continue with packet processing. */ } /* Update variables that are affected by ipsec4_output(). */ ip = mtod(m, struct ip *); hlen = ip->ip_hl << 2; #endif /* IPSEC */ /* Jump over all PFIL processing if hooks are not active. */ if (!PFIL_HOOKED(&inet_pfil_hook)) goto passout; /* Run through list of hooks for output packets. */ odst.s_addr = ip->ip_dst.s_addr; error = pfil_run_hooks(&inet_pfil_hook, &m, ifp, PFIL_OUT, inp); if (error != 0 || m == NULL) goto done; ip = mtod(m, struct ip *); /* See if destination IP address was changed by packet filter. */ if (odst.s_addr != ip->ip_dst.s_addr) { m->m_flags |= M_SKIP_FIREWALL; /* If destination is now ourself drop to ip_input(). */ if (in_localip(ip->ip_dst)) { m->m_flags |= M_FASTFWD_OURS; if (m->m_pkthdr.rcvif == NULL) m->m_pkthdr.rcvif = V_loif; if (m->m_pkthdr.csum_flags & CSUM_DELAY_DATA) { m->m_pkthdr.csum_flags |= CSUM_DATA_VALID | CSUM_PSEUDO_HDR; m->m_pkthdr.csum_data = 0xffff; } m->m_pkthdr.csum_flags |= CSUM_IP_CHECKED | CSUM_IP_VALID; error = netisr_queue(NETISR_IP, m); goto done; } else goto again; /* Redo the routing table lookup. */ } #ifdef IPFIREWALL_FORWARD /* See if local, if yes, send it to netisr with IP_FASTFWD_OURS. */ if (m->m_flags & M_FASTFWD_OURS) { if (m->m_pkthdr.rcvif == NULL) m->m_pkthdr.rcvif = V_loif; if (m->m_pkthdr.csum_flags & CSUM_DELAY_DATA) { m->m_pkthdr.csum_flags |= CSUM_DATA_VALID | CSUM_PSEUDO_HDR; m->m_pkthdr.csum_data = 0xffff; } m->m_pkthdr.csum_flags |= CSUM_IP_CHECKED | CSUM_IP_VALID; error = netisr_queue(NETISR_IP, m); goto done; } /* Or forward to some other address? 
*/ fwd_tag = m_tag_find(m, PACKET_TAG_IPFORWARD, NULL); if (fwd_tag) { dst = (struct sockaddr_in *)&ro->ro_dst; bcopy((fwd_tag+1), dst, sizeof(struct sockaddr_in)); m->m_flags |= M_SKIP_FIREWALL; m_tag_delete(m, fwd_tag); goto again; } #endif /* IPFIREWALL_FORWARD */ passout: /* 127/8 must not appear on wire - RFC1122. */ if ((ntohl(ip->ip_dst.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET || (ntohl(ip->ip_src.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET) { if ((ifp->if_flags & IFF_LOOPBACK) == 0) { V_ipstat.ips_badaddr++; error = EADDRNOTAVAIL; goto bad; } } m->m_pkthdr.csum_flags |= CSUM_IP; sw_csum = m->m_pkthdr.csum_flags & ~ifp->if_hwassist; if (sw_csum & CSUM_DELAY_DATA) { in_delayed_cksum(m); sw_csum &= ~CSUM_DELAY_DATA; } m->m_pkthdr.csum_flags &= ifp->if_hwassist; /* * If small enough for interface, or the interface will take * care of the fragmentation for us, we can just send directly. */ if (ip->ip_len <= mtu || (m->m_pkthdr.csum_flags & ifp->if_hwassist & CSUM_TSO) != 0 || ((ip->ip_off & IP_DF) == 0 && (ifp->if_hwassist & CSUM_FRAGMENT))) { ip->ip_len = htons(ip->ip_len); ip->ip_off = htons(ip->ip_off); ip->ip_sum = 0; if (sw_csum & CSUM_DELAY_IP) ip->ip_sum = in_cksum(m, hlen); /* * Record statistics for this interface address. * With CSUM_TSO the byte/packet count will be slightly * incorrect because we count the IP+TCP headers only * once instead of for every generated packet. */ if (!(flags & IP_FORWARDING) && ia) { if (m->m_pkthdr.csum_flags & CSUM_TSO) ia->ia_ifa.if_opackets += m->m_pkthdr.len / m->m_pkthdr.tso_segsz; else ia->ia_ifa.if_opackets++; ia->ia_ifa.if_obytes += m->m_pkthdr.len; } #ifdef MBUF_STRESS_TEST if (mbuf_frag_size && m->m_pkthdr.len > mbuf_frag_size) m = m_fragment(m, M_DONTWAIT, mbuf_frag_size); #endif /* * Reset layer specific mbuf flags * to avoid confusing lower layers. */ m->m_flags &= ~(M_PROTOFLAGS); - error = (*ifp->if_output)(ifp, m, (struct sockaddr *)dst, ro->ro_rt); goto done; } /* Balk when DF bit is set or the interface didn't support TSO. */ if ((ip->ip_off & IP_DF) || (m->m_pkthdr.csum_flags & CSUM_TSO)) { error = EMSGSIZE; V_ipstat.ips_cantfrag++; goto bad; } /* * Too large for interface; fragment if possible. If successful, * on return, m will point to a list of packets to be sent. */ error = ip_fragment(ip, &m, mtu, ifp->if_hwassist, sw_csum); if (error) goto bad; for (; m; m = m0) { m0 = m->m_nextpkt; m->m_nextpkt = 0; if (error == 0) { /* Record statistics for this interface address. */ if (ia != NULL) { ia->ia_ifa.if_opackets++; ia->ia_ifa.if_obytes += m->m_pkthdr.len; } /* * Reset layer specific mbuf flags * to avoid confusing upper layers. */ m->m_flags &= ~(M_PROTOFLAGS); error = (*ifp->if_output)(ifp, m, (struct sockaddr *)dst, ro->ro_rt); } else m_freem(m); } if (error == 0) V_ipstat.ips_fragmented++; done: if (ro == &iproute && ro->ro_rt) { RTFREE(ro->ro_rt); } return (error); bad: m_freem(m); goto done; } /* * Create a chain of fragments which fit the given mtu. m_frag points to the * mbuf to be fragmented; on return it points to the chain with the fragments. * Return 0 if no error. If error, m_frag may contain a partially built * chain of fragments that should be freed by the caller. * * if_hwassist_flags is the hw offload capabilities (see if_data.ifi_hwassist) * sw_csum contains the delayed checksums flags (e.g., CSUM_DELAY_IP). 
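 *
 * Worked example (figures are illustrative only): with a 20-byte header
 * and an mtu of 1500, len is (1500 - 20) & ~7 = 1480, so a 4020-byte
 * datagram leaves here as three fragments carrying 1480, 1480 and 1040
 * payload bytes at offsets 0, 1480 and 2960.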
*/ int ip_fragment(struct ip *ip, struct mbuf **m_frag, int mtu, u_long if_hwassist_flags, int sw_csum) { INIT_VNET_INET(curvnet); int error = 0; int hlen = ip->ip_hl << 2; int len = (mtu - hlen) & ~7; /* size of payload in each fragment */ int off; struct mbuf *m0 = *m_frag; /* the original packet */ int firstlen; struct mbuf **mnext; int nfrags; if (ip->ip_off & IP_DF) { /* Fragmentation not allowed */ V_ipstat.ips_cantfrag++; return EMSGSIZE; } /* * Must be able to put at least 8 bytes per fragment. */ if (len < 8) return EMSGSIZE; /* * If the interface will not calculate checksums on * fragmented packets, then do it here. */ if (m0->m_pkthdr.csum_flags & CSUM_DELAY_DATA && (if_hwassist_flags & CSUM_IP_FRAGS) == 0) { in_delayed_cksum(m0); m0->m_pkthdr.csum_flags &= ~CSUM_DELAY_DATA; } if (len > PAGE_SIZE) { /* * Fragment large datagrams such that each segment * contains a multiple of PAGE_SIZE amount of data, * plus headers. This enables a receiver to perform * page-flipping zero-copy optimizations. * * XXX When does this help given that sender and receiver * could have different page sizes, and also mtu could * be less than the receiver's page size ? */ int newlen; struct mbuf *m; for (m = m0, off = 0; m && (off+m->m_len) <= mtu; m = m->m_next) off += m->m_len; /* * firstlen (off - hlen) must be aligned on an * 8-byte boundary */ if (off < hlen) goto smart_frag_failure; off = ((off - hlen) & ~7) + hlen; newlen = (~PAGE_MASK) & mtu; if ((newlen + sizeof (struct ip)) > mtu) { /* we failed, go back the default */ smart_frag_failure: newlen = len; off = hlen + len; } len = newlen; } else { off = hlen + len; } firstlen = off - hlen; mnext = &m0->m_nextpkt; /* pointer to next packet */ /* * Loop through length of segment after first fragment, * make new header and copy data of each part and link onto chain. * Here, m0 is the original packet, m is the fragment being created. * The fragments are linked off the m_nextpkt of the original * packet, which after processing serves as the first fragment. */ for (nfrags = 1; off < ip->ip_len; off += len, nfrags++) { struct ip *mhip; /* ip header on the fragment */ struct mbuf *m; int mhlen = sizeof (struct ip); MGETHDR(m, M_DONTWAIT, MT_DATA); if (m == NULL) { error = ENOBUFS; V_ipstat.ips_odropped++; goto done; } m->m_flags |= (m0->m_flags & M_MCAST) | M_FRAG; /* * In the first mbuf, leave room for the link header, then * copy the original IP header including options. The payload * goes into an additional mbuf chain returned by m_copy(). */ m->m_data += max_linkhdr; mhip = mtod(m, struct ip *); *mhip = *ip; if (hlen > sizeof (struct ip)) { mhlen = ip_optcopy(ip, mhip) + sizeof (struct ip); mhip->ip_v = IPVERSION; mhip->ip_hl = mhlen >> 2; } m->m_len = mhlen; /* XXX do we need to add ip->ip_off below ? */ mhip->ip_off = ((off - hlen) >> 3) + ip->ip_off; if (off + len >= ip->ip_len) { /* last fragment */ len = ip->ip_len - off; m->m_flags |= M_LASTFRAG; } else mhip->ip_off |= IP_MF; mhip->ip_len = htons((u_short)(len + mhlen)); m->m_next = m_copy(m0, off, len); if (m->m_next == NULL) { /* copy failed */ m_free(m); error = ENOBUFS; /* ??? 
*/ V_ipstat.ips_odropped++; goto done; } m->m_pkthdr.len = mhlen + len; m->m_pkthdr.rcvif = NULL; #ifdef MAC mac_netinet_fragment(m0, m); #endif m->m_pkthdr.csum_flags = m0->m_pkthdr.csum_flags; mhip->ip_off = htons(mhip->ip_off); mhip->ip_sum = 0; if (sw_csum & CSUM_DELAY_IP) mhip->ip_sum = in_cksum(m, mhlen); *mnext = m; mnext = &m->m_nextpkt; } V_ipstat.ips_ofragments += nfrags; /* set first marker for fragment chain */ m0->m_flags |= M_FIRSTFRAG | M_FRAG; m0->m_pkthdr.csum_data = nfrags; /* * Update first fragment by trimming what's been copied out * and updating header. */ m_adj(m0, hlen + firstlen - ip->ip_len); m0->m_pkthdr.len = hlen + firstlen; ip->ip_len = htons((u_short)m0->m_pkthdr.len); ip->ip_off |= IP_MF; ip->ip_off = htons(ip->ip_off); ip->ip_sum = 0; if (sw_csum & CSUM_DELAY_IP) ip->ip_sum = in_cksum(m0, hlen); done: *m_frag = m0; return error; } void in_delayed_cksum(struct mbuf *m) { struct ip *ip; u_short csum, offset; ip = mtod(m, struct ip *); offset = ip->ip_hl << 2 ; csum = in_cksum_skip(m, ip->ip_len, offset); if (m->m_pkthdr.csum_flags & CSUM_UDP && csum == 0) csum = 0xffff; offset += m->m_pkthdr.csum_data; /* checksum offset */ if (offset + sizeof(u_short) > m->m_len) { printf("delayed m_pullup, m->len: %d off: %d p: %d\n", m->m_len, offset, ip->ip_p); /* * XXX * this shouldn't happen, but if it does, the * correct behavior may be to insert the checksum * in the appropriate next mbuf in the chain. */ return; } *(u_short *)(m->m_data + offset) = csum; } /* * IP socket option processing. */ int ip_ctloutput(struct socket *so, struct sockopt *sopt) { struct inpcb *inp = sotoinpcb(so); int error, optval; error = optval = 0; if (sopt->sopt_level != IPPROTO_IP) { if ((sopt->sopt_level == SOL_SOCKET) && (sopt->sopt_name == SO_SETFIB)) { inp->inp_inc.inc_fibnum = so->so_fibnum; return (0); } return (EINVAL); } switch (sopt->sopt_dir) { case SOPT_SET: switch (sopt->sopt_name) { case IP_OPTIONS: #ifdef notyet case IP_RETOPTS: #endif { struct mbuf *m; if (sopt->sopt_valsize > MLEN) { error = EMSGSIZE; break; } MGET(m, sopt->sopt_td ? M_WAIT : M_DONTWAIT, MT_DATA); if (m == NULL) { error = ENOBUFS; break; } m->m_len = sopt->sopt_valsize; error = sooptcopyin(sopt, mtod(m, char *), m->m_len, m->m_len); if (error) { m_free(m); break; } INP_WLOCK(inp); error = ip_pcbopts(inp, sopt->sopt_name, m); INP_WUNLOCK(inp); return (error); } case IP_TOS: case IP_TTL: case IP_MINTTL: case IP_RECVOPTS: case IP_RECVRETOPTS: case IP_RECVDSTADDR: case IP_RECVTTL: case IP_RECVIF: case IP_FAITH: case IP_ONESBCAST: case IP_DONTFRAG: error = sooptcopyin(sopt, &optval, sizeof optval, sizeof optval); if (error) break; switch (sopt->sopt_name) { case IP_TOS: inp->inp_ip_tos = optval; break; case IP_TTL: inp->inp_ip_ttl = optval; break; case IP_MINTTL: if (optval > 0 && optval <= MAXTTL) inp->inp_ip_minttl = optval; else error = EINVAL; break; #define OPTSET(bit) do { \ INP_WLOCK(inp); \ if (optval) \ inp->inp_flags |= bit; \ else \ inp->inp_flags &= ~bit; \ INP_WUNLOCK(inp); \ } while (0) case IP_RECVOPTS: OPTSET(INP_RECVOPTS); break; case IP_RECVRETOPTS: OPTSET(INP_RECVRETOPTS); break; case IP_RECVDSTADDR: OPTSET(INP_RECVDSTADDR); break; case IP_RECVTTL: OPTSET(INP_RECVTTL); break; case IP_RECVIF: OPTSET(INP_RECVIF); break; case IP_FAITH: OPTSET(INP_FAITH); break; case IP_ONESBCAST: OPTSET(INP_ONESBCAST); break; case IP_DONTFRAG: OPTSET(INP_DONTFRAG); break; } break; #undef OPTSET /* * Multicast socket options are processed by the in_mcast * module. 
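 *
 * (A typical userland request, shown for illustration; the socket "s"
 * and the two addresses are assumed to be set up by the caller:
 *
 *	struct ip_mreq mreq;
 *
 *	mreq.imr_multiaddr = group_addr;
 *	mreq.imr_interface = local_if_addr;
 *	setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
 *
 * Such a request ends up in inp_setmoptions() below.)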
*/ case IP_MULTICAST_IF: case IP_MULTICAST_VIF: case IP_MULTICAST_TTL: case IP_MULTICAST_LOOP: case IP_ADD_MEMBERSHIP: case IP_DROP_MEMBERSHIP: case IP_ADD_SOURCE_MEMBERSHIP: case IP_DROP_SOURCE_MEMBERSHIP: case IP_BLOCK_SOURCE: case IP_UNBLOCK_SOURCE: case IP_MSFILTER: case MCAST_JOIN_GROUP: case MCAST_LEAVE_GROUP: case MCAST_JOIN_SOURCE_GROUP: case MCAST_LEAVE_SOURCE_GROUP: case MCAST_BLOCK_SOURCE: case MCAST_UNBLOCK_SOURCE: error = inp_setmoptions(inp, sopt); break; case IP_PORTRANGE: error = sooptcopyin(sopt, &optval, sizeof optval, sizeof optval); if (error) break; INP_WLOCK(inp); switch (optval) { case IP_PORTRANGE_DEFAULT: inp->inp_flags &= ~(INP_LOWPORT); inp->inp_flags &= ~(INP_HIGHPORT); break; case IP_PORTRANGE_HIGH: inp->inp_flags &= ~(INP_LOWPORT); inp->inp_flags |= INP_HIGHPORT; break; case IP_PORTRANGE_LOW: inp->inp_flags &= ~(INP_HIGHPORT); inp->inp_flags |= INP_LOWPORT; break; default: error = EINVAL; break; } INP_WUNLOCK(inp); break; #ifdef IPSEC case IP_IPSEC_POLICY: { caddr_t req; struct mbuf *m; if ((error = soopt_getm(sopt, &m)) != 0) /* XXX */ break; if ((error = soopt_mcopyin(sopt, m)) != 0) /* XXX */ break; req = mtod(m, caddr_t); error = ipsec4_set_policy(inp, sopt->sopt_name, req, m->m_len, (sopt->sopt_td != NULL) ? sopt->sopt_td->td_ucred : NULL); m_freem(m); break; } #endif /* IPSEC */ default: error = ENOPROTOOPT; break; } break; case SOPT_GET: switch (sopt->sopt_name) { case IP_OPTIONS: case IP_RETOPTS: if (inp->inp_options) error = sooptcopyout(sopt, mtod(inp->inp_options, char *), inp->inp_options->m_len); else sopt->sopt_valsize = 0; break; case IP_TOS: case IP_TTL: case IP_MINTTL: case IP_RECVOPTS: case IP_RECVRETOPTS: case IP_RECVDSTADDR: case IP_RECVTTL: case IP_RECVIF: case IP_PORTRANGE: case IP_FAITH: case IP_ONESBCAST: case IP_DONTFRAG: switch (sopt->sopt_name) { case IP_TOS: optval = inp->inp_ip_tos; break; case IP_TTL: optval = inp->inp_ip_ttl; break; case IP_MINTTL: optval = inp->inp_ip_minttl; break; #define OPTBIT(bit) (inp->inp_flags & bit ? 1 : 0) case IP_RECVOPTS: optval = OPTBIT(INP_RECVOPTS); break; case IP_RECVRETOPTS: optval = OPTBIT(INP_RECVRETOPTS); break; case IP_RECVDSTADDR: optval = OPTBIT(INP_RECVDSTADDR); break; case IP_RECVTTL: optval = OPTBIT(INP_RECVTTL); break; case IP_RECVIF: optval = OPTBIT(INP_RECVIF); break; case IP_PORTRANGE: if (inp->inp_flags & INP_HIGHPORT) optval = IP_PORTRANGE_HIGH; else if (inp->inp_flags & INP_LOWPORT) optval = IP_PORTRANGE_LOW; else optval = 0; break; case IP_FAITH: optval = OPTBIT(INP_FAITH); break; case IP_ONESBCAST: optval = OPTBIT(INP_ONESBCAST); break; case IP_DONTFRAG: optval = OPTBIT(INP_DONTFRAG); break; } error = sooptcopyout(sopt, &optval, sizeof optval); break; /* * Multicast socket options are processed by the in_mcast * module. */ case IP_MULTICAST_IF: case IP_MULTICAST_VIF: case IP_MULTICAST_TTL: case IP_MULTICAST_LOOP: case IP_MSFILTER: error = inp_getmoptions(inp, sopt); break; #ifdef IPSEC case IP_IPSEC_POLICY: { struct mbuf *m = NULL; caddr_t req = NULL; size_t len = 0; if (m != 0) { req = mtod(m, caddr_t); len = m->m_len; } error = ipsec4_get_policy(sotoinpcb(so), req, len, &m); if (error == 0) error = soopt_mcopyout(sopt, m); /* XXX */ if (error == 0) m_freem(m); break; } #endif /* IPSEC */ default: error = ENOPROTOOPT; break; } break; } return (error); } /* * Routine called from ip_output() to loop back a copy of an IP multicast * packet to the input queue of a specified interface. 
Note that this * calls the output routine of the loopback "driver", but with an interface * pointer that might NOT be a loopback interface -- evil, but easier than * replicating that code here. */ static void ip_mloopback(struct ifnet *ifp, struct mbuf *m, struct sockaddr_in *dst, int hlen) { register struct ip *ip; struct mbuf *copym; /* * Make a deep copy of the packet because we're going to * modify the pack in order to generate checksums. */ copym = m_dup(m, M_DONTWAIT); if (copym != NULL && (copym->m_flags & M_EXT || copym->m_len < hlen)) copym = m_pullup(copym, hlen); if (copym != NULL) { /* If needed, compute the checksum and mark it as valid. */ if (copym->m_pkthdr.csum_flags & CSUM_DELAY_DATA) { in_delayed_cksum(copym); copym->m_pkthdr.csum_flags &= ~CSUM_DELAY_DATA; copym->m_pkthdr.csum_flags |= CSUM_DATA_VALID | CSUM_PSEUDO_HDR; copym->m_pkthdr.csum_data = 0xffff; } /* * We don't bother to fragment if the IP length is greater * than the interface's MTU. Can this possibly matter? */ ip = mtod(copym, struct ip *); ip->ip_len = htons(ip->ip_len); ip->ip_off = htons(ip->ip_off); ip->ip_sum = 0; ip->ip_sum = in_cksum(copym, hlen); #if 1 /* XXX */ if (dst->sin_family != AF_INET) { printf("ip_mloopback: bad address family %d\n", dst->sin_family); dst->sin_family = AF_INET; } #endif if_simloop(ifp, copym, dst->sin_family, 0); } } Index: head/sys/netinet/tcp_subr.c =================================================================== --- head/sys/netinet/tcp_subr.c (revision 186118) +++ head/sys/netinet/tcp_subr.c (revision 186119) @@ -1,2310 +1,2310 @@ /*- * Copyright (c) 1982, 1986, 1988, 1990, 1993, 1995 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* * @(#)tcp_subr.c 8.2 (Berkeley) 5/24/95 */ #include __FBSDID("$FreeBSD$"); #include "opt_compat.h" #include "opt_inet.h" #include "opt_inet6.h" #include "opt_ipsec.h" #include "opt_mac.h" #include "opt_tcpdebug.h" #include #include #include #include #include #include #include #ifdef INET6 #include #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef INET6 #include #endif #include #ifdef INET6 #include #endif #include #include #ifdef INET6 #include #include #include #endif #include #include #include #include #include #include #include #include #ifdef INET6 #include #endif #include #ifdef TCPDEBUG #include #endif #include #include #include #ifdef IPSEC #include #include #ifdef INET6 #include #endif #include #include #endif /*IPSEC*/ #include #include #include #ifdef VIMAGE_GLOBALS int tcp_mssdflt; #ifdef INET6 int tcp_v6mssdflt; #endif int tcp_minmss; int tcp_do_rfc1323; static int icmp_may_rst; static int tcp_isn_reseed_interval; static int tcp_inflight_enable; static int tcp_inflight_rttthresh; static int tcp_inflight_min; static int tcp_inflight_max; static int tcp_inflight_stab; #endif static int sysctl_net_inet_tcp_mss_check(SYSCTL_HANDLER_ARGS) { INIT_VNET_INET(curvnet); int error, new; new = V_tcp_mssdflt; error = sysctl_handle_int(oidp, &new, 0, req); if (error == 0 && req->newptr) { if (new < TCP_MINMSS) error = EINVAL; else V_tcp_mssdflt = new; } return (error); } SYSCTL_V_PROC(V_NET, vnet_inet, _net_inet_tcp, TCPCTL_MSSDFLT, mssdflt, CTLTYPE_INT|CTLFLAG_RW, tcp_mssdflt, 0, &sysctl_net_inet_tcp_mss_check, "I", "Default TCP Maximum Segment Size"); #ifdef INET6 static int sysctl_net_inet_tcp_mss_v6_check(SYSCTL_HANDLER_ARGS) { INIT_VNET_INET(curvnet); int error, new; new = V_tcp_v6mssdflt; error = sysctl_handle_int(oidp, &new, 0, req); if (error == 0 && req->newptr) { if (new < TCP_MINMSS) error = EINVAL; else V_tcp_v6mssdflt = new; } return (error); } SYSCTL_V_PROC(V_NET, vnet_inet, _net_inet_tcp, TCPCTL_V6MSSDFLT, v6mssdflt, CTLTYPE_INT|CTLFLAG_RW, tcp_v6mssdflt, 0, &sysctl_net_inet_tcp_mss_v6_check, "I", "Default TCP Maximum Segment Size for IPv6"); #endif /* * Minimum MSS we accept and use. This prevents DoS attacks where * we are forced to a ridiculous low MSS like 20 and send hundreds * of packets instead of one. The effect scales with the available * bandwidth and quickly saturates the CPU and network interface * with packet generation and sending. Set to zero to disable MINMSS * checking. This setting prevents us from sending too small packets. 
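 * ("sysctl net.inet.tcp.minmss=0" is the knob that disables the check.)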
*/ SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_tcp, OID_AUTO, minmss, CTLFLAG_RW, tcp_minmss , 0, "Minmum TCP Maximum Segment Size"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_tcp, TCPCTL_DO_RFC1323, rfc1323, CTLFLAG_RW, tcp_do_rfc1323, 0, "Enable rfc1323 (high performance TCP) extensions"); static int tcp_log_debug = 0; SYSCTL_INT(_net_inet_tcp, OID_AUTO, log_debug, CTLFLAG_RW, &tcp_log_debug, 0, "Log errors caused by incoming TCP segments"); static int tcp_tcbhashsize = 0; SYSCTL_INT(_net_inet_tcp, OID_AUTO, tcbhashsize, CTLFLAG_RDTUN, &tcp_tcbhashsize, 0, "Size of TCP control-block hashtable"); static int do_tcpdrain = 1; SYSCTL_INT(_net_inet_tcp, OID_AUTO, do_tcpdrain, CTLFLAG_RW, &do_tcpdrain, 0, "Enable tcp_drain routine for extra help when low on mbufs"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_tcp, OID_AUTO, pcbcount, CTLFLAG_RD, tcbinfo.ipi_count, 0, "Number of active PCBs"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_tcp, OID_AUTO, icmp_may_rst, CTLFLAG_RW, icmp_may_rst, 0, "Certain ICMP unreachable messages may abort connections in SYN_SENT"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_tcp, OID_AUTO, isn_reseed_interval, CTLFLAG_RW, tcp_isn_reseed_interval, 0, "Seconds between reseeding of ISN secret"); /* * TCP bandwidth limiting sysctls. Note that the default lower bound of * 1024 exists only for debugging. A good production default would be * something like 6100. */ SYSCTL_NODE(_net_inet_tcp, OID_AUTO, inflight, CTLFLAG_RW, 0, "TCP inflight data limiting"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_tcp_inflight, OID_AUTO, enable, CTLFLAG_RW, tcp_inflight_enable, 0, "Enable automatic TCP inflight data limiting"); static int tcp_inflight_debug = 0; SYSCTL_INT(_net_inet_tcp_inflight, OID_AUTO, debug, CTLFLAG_RW, &tcp_inflight_debug, 0, "Debug TCP inflight calculations"); SYSCTL_V_PROC(V_NET, vnet_inet, _net_inet_tcp_inflight, OID_AUTO, rttthresh, CTLTYPE_INT|CTLFLAG_RW, tcp_inflight_rttthresh, 0, sysctl_msec_to_ticks, "I", "RTT threshold below which inflight will deactivate itself"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_tcp_inflight, OID_AUTO, min, CTLFLAG_RW, tcp_inflight_min, 0, "Lower-bound for TCP inflight window"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_tcp_inflight, OID_AUTO, max, CTLFLAG_RW, tcp_inflight_max, 0, "Upper-bound for TCP inflight window"); SYSCTL_V_INT(V_NET, vnet_inet, _net_inet_tcp_inflight, OID_AUTO, stab, CTLFLAG_RW, tcp_inflight_stab, 0, "Inflight Algorithm Stabilization 20 = 2 packets"); uma_zone_t sack_hole_zone; static struct inpcb *tcp_notify(struct inpcb *, int); static void tcp_isn_tick(void *); /* * Target size of TCP PCB hash tables. Must be a power of two. * * Note that this can be overridden by the kernel environment * variable net.inet.tcp.tcbhashsize */ #ifndef TCBHASHSIZE #define TCBHASHSIZE 512 #endif /* * XXX * Callouts should be moved into struct tcp directly. They are currently * separate because the tcpcb structure is exported to userland for sysctl * parsing purposes, which do not know about callouts. */ struct tcpcb_mem { struct tcpcb tcb; struct tcp_timer tt; }; static uma_zone_t tcpcb_zone; MALLOC_DEFINE(M_TCPLOG, "tcplog", "TCP address and flags print buffers"); struct callout isn_callout; static struct mtx isn_mtx; #define ISN_LOCK_INIT() mtx_init(&isn_mtx, "isn_mtx", NULL, MTX_DEF) #define ISN_LOCK() mtx_lock(&isn_mtx) #define ISN_UNLOCK() mtx_unlock(&isn_mtx) /* * TCP initialization. 
*/ static void tcp_zone_change(void *tag) { uma_zone_set_max(V_tcbinfo.ipi_zone, maxsockets); uma_zone_set_max(tcpcb_zone, maxsockets); tcp_tw_zone_change(); } static int tcp_inpcb_init(void *mem, int size, int flags) { struct inpcb *inp = mem; INP_LOCK_INIT(inp, "inp", "tcpinp"); return (0); } void tcp_init(void) { INIT_VNET_INET(curvnet); int hashsize; V_blackhole = 0; V_tcp_delack_enabled = 1; V_drop_synfin = 0; V_tcp_do_rfc3042 = 1; V_tcp_do_rfc3390 = 1; V_tcp_do_ecn = 0; V_tcp_ecn_maxretries = 1; V_tcp_insecure_rst = 0; V_tcp_do_autorcvbuf = 1; V_tcp_autorcvbuf_inc = 16*1024; V_tcp_autorcvbuf_max = 256*1024; V_tcp_mssdflt = TCP_MSS; #ifdef INET6 V_tcp_v6mssdflt = TCP6_MSS; #endif V_tcp_minmss = TCP_MINMSS; V_tcp_do_rfc1323 = 1; V_icmp_may_rst = 1; V_tcp_isn_reseed_interval = 0; V_tcp_inflight_enable = 1; V_tcp_inflight_min = 6144; V_tcp_inflight_max = TCP_MAXWIN << TCP_MAX_WINSHIFT; V_tcp_inflight_stab = 20; V_path_mtu_discovery = 1; V_ss_fltsz = 1; V_ss_fltsz_local = 4; V_tcp_do_newreno = 1; V_tcp_do_tso = 1; V_tcp_do_autosndbuf = 1; V_tcp_autosndbuf_inc = 8*1024; V_tcp_autosndbuf_max = 256*1024; V_nolocaltimewait = 0; V_tcp_do_sack = 1; V_tcp_sack_maxholes = 128; V_tcp_sack_globalmaxholes = 65536; V_tcp_sack_globalholes = 0; tcp_delacktime = TCPTV_DELACK; tcp_keepinit = TCPTV_KEEP_INIT; tcp_keepidle = TCPTV_KEEP_IDLE; tcp_keepintvl = TCPTV_KEEPINTVL; tcp_maxpersistidle = TCPTV_KEEP_IDLE; tcp_msl = TCPTV_MSL; tcp_rexmit_min = TCPTV_MIN; if (tcp_rexmit_min < 1) tcp_rexmit_min = 1; tcp_rexmit_slop = TCPTV_CPU_VAR; V_tcp_inflight_rttthresh = TCPTV_INFLIGHT_RTTTHRESH; tcp_finwait2_timeout = TCPTV_FINWAIT2_TIMEOUT; TUNABLE_INT_FETCH("net.inet.tcp.sack.enable", &V_tcp_do_sack); INP_INFO_LOCK_INIT(&V_tcbinfo, "tcp"); LIST_INIT(&V_tcb); V_tcbinfo.ipi_listhead = &V_tcb; hashsize = TCBHASHSIZE; TUNABLE_INT_FETCH("net.inet.tcp.tcbhashsize", &hashsize); if (!powerof2(hashsize)) { printf("WARNING: TCB hash size not a power of 2\n"); hashsize = 512; /* safe default */ } tcp_tcbhashsize = hashsize; V_tcbinfo.ipi_hashbase = hashinit(hashsize, M_PCB, &V_tcbinfo.ipi_hashmask); V_tcbinfo.ipi_porthashbase = hashinit(hashsize, M_PCB, &V_tcbinfo.ipi_porthashmask); V_tcbinfo.ipi_zone = uma_zcreate("inpcb", sizeof(struct inpcb), NULL, NULL, tcp_inpcb_init, NULL, UMA_ALIGN_PTR, UMA_ZONE_NOFREE); uma_zone_set_max(V_tcbinfo.ipi_zone, maxsockets); #ifdef INET6 #define TCP_MINPROTOHDR (sizeof(struct ip6_hdr) + sizeof(struct tcphdr)) #else /* INET6 */ #define TCP_MINPROTOHDR (sizeof(struct tcpiphdr)) #endif /* INET6 */ if (max_protohdr < TCP_MINPROTOHDR) max_protohdr = TCP_MINPROTOHDR; if (max_linkhdr + TCP_MINPROTOHDR > MHLEN) panic("tcp_init"); #undef TCP_MINPROTOHDR /* * These have to be type stable for the benefit of the timers. */ tcpcb_zone = uma_zcreate("tcpcb", sizeof(struct tcpcb_mem), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_NOFREE); uma_zone_set_max(tcpcb_zone, maxsockets); tcp_tw_init(); syncache_init(); tcp_hc_init(); tcp_reass_init(); ISN_LOCK_INIT(); callout_init(&isn_callout, CALLOUT_MPSAFE); tcp_isn_tick(NULL); EVENTHANDLER_REGISTER(shutdown_pre_sync, tcp_fini, NULL, SHUTDOWN_PRI_DEFAULT); sack_hole_zone = uma_zcreate("sackhole", sizeof(struct sackhole), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_NOFREE); EVENTHANDLER_REGISTER(maxsockets_change, tcp_zone_change, NULL, EVENTHANDLER_PRI_ANY); } void tcp_fini(void *xtp) { callout_stop(&isn_callout); } /* * Fill in the IP and TCP headers for an outgoing packet, given the tcpcb. 
* tcp_template used to store this data in mbufs, but we now recopy it out * of the tcpcb each time to conserve mbufs. */ void tcpip_fillheaders(struct inpcb *inp, void *ip_ptr, void *tcp_ptr) { struct tcphdr *th = (struct tcphdr *)tcp_ptr; INP_WLOCK_ASSERT(inp); #ifdef INET6 if ((inp->inp_vflag & INP_IPV6) != 0) { struct ip6_hdr *ip6; ip6 = (struct ip6_hdr *)ip_ptr; ip6->ip6_flow = (ip6->ip6_flow & ~IPV6_FLOWINFO_MASK) | (inp->in6p_flowinfo & IPV6_FLOWINFO_MASK); ip6->ip6_vfc = (ip6->ip6_vfc & ~IPV6_VERSION_MASK) | (IPV6_VERSION & IPV6_VERSION_MASK); ip6->ip6_nxt = IPPROTO_TCP; ip6->ip6_plen = htons(sizeof(struct tcphdr)); ip6->ip6_src = inp->in6p_laddr; ip6->ip6_dst = inp->in6p_faddr; } else #endif { struct ip *ip; ip = (struct ip *)ip_ptr; ip->ip_v = IPVERSION; ip->ip_hl = 5; ip->ip_tos = inp->inp_ip_tos; ip->ip_len = 0; ip->ip_id = 0; ip->ip_off = 0; ip->ip_ttl = inp->inp_ip_ttl; ip->ip_sum = 0; ip->ip_p = IPPROTO_TCP; ip->ip_src = inp->inp_laddr; ip->ip_dst = inp->inp_faddr; } th->th_sport = inp->inp_lport; th->th_dport = inp->inp_fport; th->th_seq = 0; th->th_ack = 0; th->th_x2 = 0; th->th_off = 5; th->th_flags = 0; th->th_win = 0; th->th_urp = 0; th->th_sum = 0; /* in_pseudo() is called later for ipv4 */ } /* * Create template to be used to send tcp packets on a connection. * Allocates an mbuf and fills in a skeletal tcp/ip header. The only * use for this function is in keepalives, which use tcp_respond. */ struct tcptemp * tcpip_maketemplate(struct inpcb *inp) { struct tcptemp *t; t = malloc(sizeof(*t), M_TEMP, M_NOWAIT); if (t == NULL) return (NULL); tcpip_fillheaders(inp, (void *)&t->tt_ipgen, (void *)&t->tt_t); return (t); } /* * Send a single message to the TCP at address specified by * the given TCP/IP header. If m == NULL, then we make a copy * of the tcpiphdr at ti and send directly to the addressed host. * This is used to force keep alive messages out using the TCP * template for a connection. If flags are given then we send * a message back to the TCP which originated the * segment ti, * and discard the mbuf containing it and any other attached mbufs. * * In any case the ack and sequence number of the transmitted * segment are as specified by the parameters. * * NOTE: If m != NULL, then ti must point to *inside* the mbuf. 
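 *
 * A minimal keepalive-style sketch of a caller (illustrative only; the
 * real keepalive code lives elsewhere in the tree):
 *
 *	struct tcptemp *t;
 *
 *	t = tcpip_maketemplate(inp);
 *	if (t != NULL) {
 *		tcp_respond(tp, (void *)&t->tt_ipgen, &t->tt_t, NULL,
 *		    tp->rcv_nxt, tp->snd_una - 1, 0);
 *		free(t, M_TEMP);
 *	}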
*/ void tcp_respond(struct tcpcb *tp, void *ipgen, struct tcphdr *th, struct mbuf *m, tcp_seq ack, tcp_seq seq, int flags) { INIT_VNET_INET(curvnet); int tlen; int win = 0; struct ip *ip; struct tcphdr *nth; #ifdef INET6 struct ip6_hdr *ip6; int isipv6; #endif /* INET6 */ int ipflags = 0; struct inpcb *inp; KASSERT(tp != NULL || m != NULL, ("tcp_respond: tp and m both NULL")); #ifdef INET6 isipv6 = ((struct ip *)ipgen)->ip_v == 6; ip6 = ipgen; #endif /* INET6 */ ip = ipgen; if (tp != NULL) { inp = tp->t_inpcb; KASSERT(inp != NULL, ("tcp control block w/o inpcb")); INP_WLOCK_ASSERT(inp); } else inp = NULL; if (tp != NULL) { if (!(flags & TH_RST)) { win = sbspace(&inp->inp_socket->so_rcv); if (win > (long)TCP_MAXWIN << tp->rcv_scale) win = (long)TCP_MAXWIN << tp->rcv_scale; } } if (m == NULL) { m = m_gethdr(M_DONTWAIT, MT_DATA); if (m == NULL) return; tlen = 0; m->m_data += max_linkhdr; #ifdef INET6 if (isipv6) { bcopy((caddr_t)ip6, mtod(m, caddr_t), sizeof(struct ip6_hdr)); ip6 = mtod(m, struct ip6_hdr *); nth = (struct tcphdr *)(ip6 + 1); } else #endif /* INET6 */ { bcopy((caddr_t)ip, mtod(m, caddr_t), sizeof(struct ip)); ip = mtod(m, struct ip *); nth = (struct tcphdr *)(ip + 1); } bcopy((caddr_t)th, (caddr_t)nth, sizeof(struct tcphdr)); flags = TH_ACK; } else { /* * reuse the mbuf. * XXX MRT We inherrit the FIB, which is lucky. */ m_freem(m->m_next); m->m_next = NULL; m->m_data = (caddr_t)ipgen; /* m_len is set later */ tlen = 0; #define xchg(a,b,type) { type t; t=a; a=b; b=t; } #ifdef INET6 if (isipv6) { xchg(ip6->ip6_dst, ip6->ip6_src, struct in6_addr); nth = (struct tcphdr *)(ip6 + 1); } else #endif /* INET6 */ { xchg(ip->ip_dst.s_addr, ip->ip_src.s_addr, n_long); nth = (struct tcphdr *)(ip + 1); } if (th != nth) { /* * this is usually a case when an extension header * exists between the IPv6 header and the * TCP header. */ nth->th_sport = th->th_sport; nth->th_dport = th->th_dport; } xchg(nth->th_dport, nth->th_sport, n_short); #undef xchg } #ifdef INET6 if (isipv6) { ip6->ip6_flow = 0; ip6->ip6_vfc = IPV6_VERSION; ip6->ip6_nxt = IPPROTO_TCP; ip6->ip6_plen = htons((u_short)(sizeof (struct tcphdr) + tlen)); tlen += sizeof (struct ip6_hdr) + sizeof (struct tcphdr); } else #endif { tlen += sizeof (struct tcpiphdr); ip->ip_len = tlen; ip->ip_ttl = V_ip_defttl; if (V_path_mtu_discovery) ip->ip_off |= IP_DF; } m->m_len = tlen; m->m_pkthdr.len = tlen; m->m_pkthdr.rcvif = NULL; #ifdef MAC if (inp != NULL) { /* * Packet is associated with a socket, so allow the * label of the response to reflect the socket label. */ INP_WLOCK_ASSERT(inp); mac_inpcb_create_mbuf(inp, m); } else { /* * Packet is not associated with a socket, so possibly * update the label in place. */ mac_netinet_tcp_reply(m); } #endif nth->th_seq = htonl(seq); nth->th_ack = htonl(ack); nth->th_x2 = 0; nth->th_off = sizeof (struct tcphdr) >> 2; nth->th_flags = flags; if (tp != NULL) nth->th_win = htons((u_short) (win >> tp->rcv_scale)); else nth->th_win = htons((u_short)win); nth->th_urp = 0; #ifdef INET6 if (isipv6) { nth->th_sum = 0; nth->th_sum = in6_cksum(m, IPPROTO_TCP, sizeof(struct ip6_hdr), tlen - sizeof(struct ip6_hdr)); ip6->ip6_hlim = in6_selecthlim(tp != NULL ? 
tp->t_inpcb : NULL, NULL); } else #endif /* INET6 */ { nth->th_sum = in_pseudo(ip->ip_src.s_addr, ip->ip_dst.s_addr, htons((u_short)(tlen - sizeof(struct ip) + ip->ip_p))); m->m_pkthdr.csum_flags = CSUM_TCP; m->m_pkthdr.csum_data = offsetof(struct tcphdr, th_sum); } #ifdef TCPDEBUG if (tp == NULL || (inp->inp_socket->so_options & SO_DEBUG)) tcp_trace(TA_OUTPUT, 0, tp, mtod(m, void *), th, 0); #endif #ifdef INET6 if (isipv6) (void) ip6_output(m, NULL, NULL, ipflags, NULL, NULL, inp); else #endif /* INET6 */ (void) ip_output(m, NULL, NULL, ipflags, NULL, inp); } /* * Create a new TCP control block, making an * empty reassembly queue and hooking it to the argument * protocol control block. The `inp' parameter must have * come from the zone allocator set up in tcp_init(). */ struct tcpcb * tcp_newtcpcb(struct inpcb *inp) { INIT_VNET_INET(inp->inp_vnet); struct tcpcb_mem *tm; struct tcpcb *tp; #ifdef INET6 int isipv6 = (inp->inp_vflag & INP_IPV6) != 0; #endif /* INET6 */ tm = uma_zalloc(tcpcb_zone, M_NOWAIT | M_ZERO); if (tm == NULL) return (NULL); tp = &tm->tcb; tp->t_timers = &tm->tt; /* LIST_INIT(&tp->t_segq); */ /* XXX covered by M_ZERO */ tp->t_maxseg = tp->t_maxopd = #ifdef INET6 isipv6 ? V_tcp_v6mssdflt : #endif /* INET6 */ V_tcp_mssdflt; /* Set up our timeouts. */ callout_init(&tp->t_timers->tt_rexmt, CALLOUT_MPSAFE); callout_init(&tp->t_timers->tt_persist, CALLOUT_MPSAFE); callout_init(&tp->t_timers->tt_keep, CALLOUT_MPSAFE); callout_init(&tp->t_timers->tt_2msl, CALLOUT_MPSAFE); callout_init(&tp->t_timers->tt_delack, CALLOUT_MPSAFE); if (V_tcp_do_rfc1323) tp->t_flags = (TF_REQ_SCALE|TF_REQ_TSTMP); if (V_tcp_do_sack) tp->t_flags |= TF_SACK_PERMIT; TAILQ_INIT(&tp->snd_holes); tp->t_inpcb = inp; /* XXX */ /* * Init srtt to TCPTV_SRTTBASE (0), so we can tell that we have no * rtt estimate. Set rttvar so that srtt + 4 * rttvar gives * reasonable initial retransmit time. */ tp->t_srtt = TCPTV_SRTTBASE; tp->t_rttvar = ((TCPTV_RTOBASE - TCPTV_SRTTBASE) << TCP_RTTVAR_SHIFT) / 4; tp->t_rttmin = tcp_rexmit_min; tp->t_rxtcur = TCPTV_RTOBASE; tp->snd_cwnd = TCP_MAXWIN << TCP_MAX_WINSHIFT; tp->snd_bwnd = TCP_MAXWIN << TCP_MAX_WINSHIFT; tp->snd_ssthresh = TCP_MAXWIN << TCP_MAX_WINSHIFT; tp->t_rcvtime = ticks; tp->t_bw_rtttime = ticks; /* * IPv4 TTL initialization is necessary for an IPv6 socket as well, * because the socket may be bound to an IPv6 wildcard address, * which may match an IPv4-mapped IPv6 address. */ inp->inp_ip_ttl = V_ip_defttl; inp->inp_ppcb = tp; return (tp); /* XXX */ } /* * Drop a TCP connection, reporting * the specified error. If connection is synchronized, * then send a RST to peer. */ struct tcpcb * tcp_drop(struct tcpcb *tp, int errno) { INIT_VNET_INET(tp->t_inpcb->inp_vnet); struct socket *so = tp->t_inpcb->inp_socket; INP_INFO_WLOCK_ASSERT(&V_tcbinfo); INP_WLOCK_ASSERT(tp->t_inpcb); if (TCPS_HAVERCVDSYN(tp->t_state)) { tp->t_state = TCPS_CLOSED; (void) tcp_output_reset(tp); V_tcpstat.tcps_drops++; } else V_tcpstat.tcps_conndrops++; if (errno == ETIMEDOUT && tp->t_softerror) errno = tp->t_softerror; so->so_error = errno; return (tcp_close(tp)); } void tcp_discardcb(struct tcpcb *tp) { INIT_VNET_INET(tp->t_vnet); struct tseg_qent *q; struct inpcb *inp = tp->t_inpcb; struct socket *so = inp->inp_socket; #ifdef INET6 int isipv6 = (inp->inp_vflag & INP_IPV6) != 0; #endif /* INET6 */ INP_WLOCK_ASSERT(inp); /* * Make sure that all of our timers are stopped before we * delete the PCB. 
*/ callout_stop(&tp->t_timers->tt_rexmt); callout_stop(&tp->t_timers->tt_persist); callout_stop(&tp->t_timers->tt_keep); callout_stop(&tp->t_timers->tt_2msl); callout_stop(&tp->t_timers->tt_delack); /* * If we got enough samples through the srtt filter, * save the rtt and rttvar in the routing entry. * 'Enough' is arbitrarily defined as 4 rtt samples. * 4 samples is enough for the srtt filter to converge * to within enough % of the correct value; fewer samples * and we could save a bogus rtt. The danger is not high * as tcp quickly recovers from everything. * XXX: Works very well but needs some more statistics! */ if (tp->t_rttupdated >= 4) { struct hc_metrics_lite metrics; u_long ssthresh; bzero(&metrics, sizeof(metrics)); /* * Update the ssthresh always when the conditions below * are satisfied. This gives us better new start value * for the congestion avoidance for new connections. * ssthresh is only set if packet loss occured on a session. * * XXXRW: 'so' may be NULL here, and/or socket buffer may be * being torn down. Ideally this code would not use 'so'. */ ssthresh = tp->snd_ssthresh; if (ssthresh != 0 && ssthresh < so->so_snd.sb_hiwat / 2) { /* * convert the limit from user data bytes to * packets then to packet data bytes. */ ssthresh = (ssthresh + tp->t_maxseg / 2) / tp->t_maxseg; if (ssthresh < 2) ssthresh = 2; ssthresh *= (u_long)(tp->t_maxseg + #ifdef INET6 (isipv6 ? sizeof (struct ip6_hdr) + sizeof (struct tcphdr) : #endif sizeof (struct tcpiphdr) #ifdef INET6 ) #endif ); } else ssthresh = 0; metrics.rmx_ssthresh = ssthresh; metrics.rmx_rtt = tp->t_srtt; metrics.rmx_rttvar = tp->t_rttvar; /* XXX: This wraps if the pipe is more than 4 Gbit per second */ metrics.rmx_bandwidth = tp->snd_bandwidth; metrics.rmx_cwnd = tp->snd_cwnd; metrics.rmx_sendpipe = 0; metrics.rmx_recvpipe = 0; tcp_hc_update(&inp->inp_inc, &metrics); } /* free the reassembly queue, if any */ while ((q = LIST_FIRST(&tp->t_segq)) != NULL) { LIST_REMOVE(q, tqe_q); m_freem(q->tqe_m); uma_zfree(tcp_reass_zone, q); tp->t_segqlen--; V_tcp_reass_qsize--; } /* Disconnect offload device, if any. */ tcp_offload_detach(tp); tcp_free_sackholes(tp); inp->inp_ppcb = NULL; tp->t_inpcb = NULL; uma_zfree(tcpcb_zone, tp); } /* * Attempt to close a TCP control block, marking it as dropped, and freeing * the socket if we hold the only reference. */ struct tcpcb * tcp_close(struct tcpcb *tp) { INIT_VNET_INET(tp->t_inpcb->inp_vnet); struct inpcb *inp = tp->t_inpcb; struct socket *so; INP_INFO_WLOCK_ASSERT(&V_tcbinfo); INP_WLOCK_ASSERT(inp); /* Notify any offload devices of listener close */ if (tp->t_state == TCPS_LISTEN) tcp_offload_listen_close(tp); in_pcbdrop(inp); V_tcpstat.tcps_closed++; KASSERT(inp->inp_socket != NULL, ("tcp_close: inp_socket NULL")); so = inp->inp_socket; soisdisconnected(so); if (inp->inp_vflag & INP_SOCKREF) { KASSERT(so->so_state & SS_PROTOREF, ("tcp_close: !SS_PROTOREF")); inp->inp_vflag &= ~INP_SOCKREF; INP_WUNLOCK(inp); ACCEPT_LOCK(); SOCK_LOCK(so); so->so_state &= ~SS_PROTOREF; sofree(so); return (NULL); } return (tp); } void tcp_drain(void) { VNET_ITERATOR_DECL(vnet_iter); if (!do_tcpdrain) return; VNET_LIST_RLOCK(); VNET_FOREACH(vnet_iter) { CURVNET_SET(vnet_iter); INIT_VNET_INET(vnet_iter); struct inpcb *inpb; struct tcpcb *tcpb; struct tseg_qent *te; /* * Walk the tcpbs, if existing, and flush the reassembly queue, * if there is one... 
* XXX: The "Net/3" implementation doesn't imply that the TCP * reassembly queue should be flushed, but in a situation * where we're really low on mbufs, this is potentially * usefull. */ INP_INFO_RLOCK(&V_tcbinfo); LIST_FOREACH(inpb, V_tcbinfo.ipi_listhead, inp_list) { if (inpb->inp_vflag & INP_TIMEWAIT) continue; INP_WLOCK(inpb); if ((tcpb = intotcpcb(inpb)) != NULL) { while ((te = LIST_FIRST(&tcpb->t_segq)) != NULL) { LIST_REMOVE(te, tqe_q); m_freem(te->tqe_m); uma_zfree(tcp_reass_zone, te); tcpb->t_segqlen--; V_tcp_reass_qsize--; } tcp_clean_sackreport(tcpb); } INP_WUNLOCK(inpb); } INP_INFO_RUNLOCK(&V_tcbinfo); CURVNET_RESTORE(); } VNET_LIST_RUNLOCK(); } /* * Notify a tcp user of an asynchronous error; * store error as soft error, but wake up user * (for now, won't do anything until can select for soft error). * * Do not wake up user since there currently is no mechanism for * reporting soft errors (yet - a kqueue filter may be added). */ static struct inpcb * tcp_notify(struct inpcb *inp, int error) { struct tcpcb *tp; #ifdef INVARIANTS INIT_VNET_INET(inp->inp_vnet); /* V_tcbinfo WLOCK ASSERT */ #endif INP_INFO_WLOCK_ASSERT(&V_tcbinfo); INP_WLOCK_ASSERT(inp); if ((inp->inp_vflag & INP_TIMEWAIT) || (inp->inp_vflag & INP_DROPPED)) return (inp); tp = intotcpcb(inp); KASSERT(tp != NULL, ("tcp_notify: tp == NULL")); /* * Ignore some errors if we are hooked up. * If connection hasn't completed, has retransmitted several times, * and receives a second error, give up now. This is better * than waiting a long time to establish a connection that * can never complete. */ if (tp->t_state == TCPS_ESTABLISHED && (error == EHOSTUNREACH || error == ENETUNREACH || error == EHOSTDOWN)) { return (inp); } else if (tp->t_state < TCPS_ESTABLISHED && tp->t_rxtshift > 3 && tp->t_softerror) { tp = tcp_drop(tp, error); if (tp != NULL) return (inp); else return (NULL); } else { tp->t_softerror = error; return (inp); } #if 0 wakeup( &so->so_timeo); sorwakeup(so); sowwakeup(so); #endif } static int tcp_pcblist(SYSCTL_HANDLER_ARGS) { INIT_VNET_INET(curvnet); int error, i, m, n, pcb_count; struct inpcb *inp, **inp_list; inp_gen_t gencnt; struct xinpgen xig; /* * The process of preparing the TCB list is too time-consuming and * resource-intensive to repeat twice on every request. */ if (req->oldptr == NULL) { m = syncache_pcbcount(); n = V_tcbinfo.ipi_count; req->oldidx = 2 * (sizeof xig) + ((m + n) + n/8) * sizeof(struct xtcpcb); return (0); } if (req->newptr != NULL) return (EPERM); /* * OK, now we're committed to doing something. */ INP_INFO_RLOCK(&V_tcbinfo); gencnt = V_tcbinfo.ipi_gencnt; n = V_tcbinfo.ipi_count; INP_INFO_RUNLOCK(&V_tcbinfo); m = syncache_pcbcount(); error = sysctl_wire_old_buffer(req, 2 * (sizeof xig) + (n + m) * sizeof(struct xtcpcb)); if (error != 0) return (error); xig.xig_len = sizeof xig; xig.xig_count = n + m; xig.xig_gen = gencnt; xig.xig_sogen = so_gencnt; error = SYSCTL_OUT(req, &xig, sizeof xig); if (error) return (error); error = syncache_pcblist(req, m, &pcb_count); if (error) return (error); inp_list = malloc(n * sizeof *inp_list, M_TEMP, M_WAITOK); if (inp_list == NULL) return (ENOMEM); INP_INFO_RLOCK(&V_tcbinfo); for (inp = LIST_FIRST(V_tcbinfo.ipi_listhead), i = 0; inp != NULL && i < n; inp = LIST_NEXT(inp, inp_list)) { INP_RLOCK(inp); if (inp->inp_gencnt <= gencnt) { /* * XXX: This use of cr_cansee(), introduced with * TCP state changes, is not quite right, but for * now, better than nothing. 
*/ if (inp->inp_vflag & INP_TIMEWAIT) { if (intotw(inp) != NULL) error = cr_cansee(req->td->td_ucred, intotw(inp)->tw_cred); else error = EINVAL; /* Skip this inp. */ } else error = cr_canseeinpcb(req->td->td_ucred, inp); if (error == 0) inp_list[i++] = inp; } INP_RUNLOCK(inp); } INP_INFO_RUNLOCK(&V_tcbinfo); n = i; error = 0; for (i = 0; i < n; i++) { inp = inp_list[i]; INP_RLOCK(inp); if (inp->inp_gencnt <= gencnt) { struct xtcpcb xt; void *inp_ppcb; bzero(&xt, sizeof(xt)); xt.xt_len = sizeof xt; /* XXX should avoid extra copy */ bcopy(inp, &xt.xt_inp, sizeof *inp); inp_ppcb = inp->inp_ppcb; if (inp_ppcb == NULL) bzero((char *) &xt.xt_tp, sizeof xt.xt_tp); else if (inp->inp_vflag & INP_TIMEWAIT) { bzero((char *) &xt.xt_tp, sizeof xt.xt_tp); xt.xt_tp.t_state = TCPS_TIME_WAIT; } else bcopy(inp_ppcb, &xt.xt_tp, sizeof xt.xt_tp); if (inp->inp_socket != NULL) sotoxsocket(inp->inp_socket, &xt.xt_socket); else { bzero(&xt.xt_socket, sizeof xt.xt_socket); xt.xt_socket.xso_protocol = IPPROTO_TCP; } xt.xt_inp.inp_gencnt = inp->inp_gencnt; INP_RUNLOCK(inp); error = SYSCTL_OUT(req, &xt, sizeof xt); } else INP_RUNLOCK(inp); } if (!error) { /* * Give the user an updated idea of our state. * If the generation differs from what we told * her before, she knows that something happened * while we were processing this request, and it * might be necessary to retry. */ INP_INFO_RLOCK(&V_tcbinfo); xig.xig_gen = V_tcbinfo.ipi_gencnt; xig.xig_sogen = so_gencnt; xig.xig_count = V_tcbinfo.ipi_count + pcb_count; INP_INFO_RUNLOCK(&V_tcbinfo); error = SYSCTL_OUT(req, &xig, sizeof xig); } free(inp_list, M_TEMP); return (error); } SYSCTL_PROC(_net_inet_tcp, TCPCTL_PCBLIST, pcblist, CTLFLAG_RD, 0, 0, tcp_pcblist, "S,xtcpcb", "List of active TCP connections"); static int tcp_getcred(SYSCTL_HANDLER_ARGS) { INIT_VNET_INET(curvnet); struct xucred xuc; struct sockaddr_in addrs[2]; struct inpcb *inp; int error; error = priv_check(req->td, PRIV_NETINET_GETCRED); if (error) return (error); error = SYSCTL_IN(req, addrs, sizeof(addrs)); if (error) return (error); INP_INFO_RLOCK(&V_tcbinfo); inp = in_pcblookup_hash(&V_tcbinfo, addrs[1].sin_addr, addrs[1].sin_port, addrs[0].sin_addr, addrs[0].sin_port, 0, NULL); if (inp != NULL) { INP_RLOCK(inp); INP_INFO_RUNLOCK(&V_tcbinfo); if (inp->inp_socket == NULL) error = ENOENT; if (error == 0) error = cr_canseeinpcb(req->td->td_ucred, inp); if (error == 0) cru2x(inp->inp_cred, &xuc); INP_RUNLOCK(inp); } else { INP_INFO_RUNLOCK(&V_tcbinfo); error = ENOENT; } if (error == 0) error = SYSCTL_OUT(req, &xuc, sizeof(struct xucred)); return (error); } SYSCTL_PROC(_net_inet_tcp, OID_AUTO, getcred, CTLTYPE_OPAQUE|CTLFLAG_RW|CTLFLAG_PRISON, 0, 0, tcp_getcred, "S,xucred", "Get the xucred of a TCP connection"); #ifdef INET6 static int tcp6_getcred(SYSCTL_HANDLER_ARGS) { INIT_VNET_INET(curvnet); INIT_VNET_INET6(curvnet); struct xucred xuc; struct sockaddr_in6 addrs[2]; struct inpcb *inp; int error, mapped = 0; error = priv_check(req->td, PRIV_NETINET_GETCRED); if (error) return (error); error = SYSCTL_IN(req, addrs, sizeof(addrs)); if (error) return (error); if ((error = sa6_embedscope(&addrs[0], V_ip6_use_defzone)) != 0 || (error = sa6_embedscope(&addrs[1], V_ip6_use_defzone)) != 0) { return (error); } if (IN6_IS_ADDR_V4MAPPED(&addrs[0].sin6_addr)) { if (IN6_IS_ADDR_V4MAPPED(&addrs[1].sin6_addr)) mapped = 1; else return (EINVAL); } INP_INFO_RLOCK(&V_tcbinfo); if (mapped == 1) inp = in_pcblookup_hash(&V_tcbinfo, *(struct in_addr *)&addrs[1].sin6_addr.s6_addr[12], addrs[1].sin6_port, *(struct in_addr 
*)&addrs[0].sin6_addr.s6_addr[12], addrs[0].sin6_port, 0, NULL); else inp = in6_pcblookup_hash(&V_tcbinfo, &addrs[1].sin6_addr, addrs[1].sin6_port, &addrs[0].sin6_addr, addrs[0].sin6_port, 0, NULL); if (inp != NULL) { INP_RLOCK(inp); INP_INFO_RUNLOCK(&V_tcbinfo); if (inp->inp_socket == NULL) error = ENOENT; if (error == 0) error = cr_canseeinpcb(req->td->td_ucred, inp); if (error == 0) cru2x(inp->inp_cred, &xuc); INP_RUNLOCK(inp); } else { INP_INFO_RUNLOCK(&V_tcbinfo); error = ENOENT; } if (error == 0) error = SYSCTL_OUT(req, &xuc, sizeof(struct xucred)); return (error); } SYSCTL_PROC(_net_inet6_tcp6, OID_AUTO, getcred, CTLTYPE_OPAQUE|CTLFLAG_RW|CTLFLAG_PRISON, 0, 0, tcp6_getcred, "S,xucred", "Get the xucred of a TCP6 connection"); #endif void tcp_ctlinput(int cmd, struct sockaddr *sa, void *vip) { INIT_VNET_INET(curvnet); struct ip *ip = vip; struct tcphdr *th; struct in_addr faddr; struct inpcb *inp; struct tcpcb *tp; struct inpcb *(*notify)(struct inpcb *, int) = tcp_notify; struct icmp *icp; struct in_conninfo inc; tcp_seq icmp_tcp_seq; int mtu; faddr = ((struct sockaddr_in *)sa)->sin_addr; if (sa->sa_family != AF_INET || faddr.s_addr == INADDR_ANY) return; if (cmd == PRC_MSGSIZE) notify = tcp_mtudisc; else if (V_icmp_may_rst && (cmd == PRC_UNREACH_ADMIN_PROHIB || cmd == PRC_UNREACH_PORT || cmd == PRC_TIMXCEED_INTRANS) && ip) notify = tcp_drop_syn_sent; /* * Redirects don't need to be handled up here. */ else if (PRC_IS_REDIRECT(cmd)) return; /* * Source quench is depreciated. */ else if (cmd == PRC_QUENCH) return; /* * Hostdead is ugly because it goes linearly through all PCBs. * XXX: We never get this from ICMP, otherwise it makes an * excellent DoS attack on machines with many connections. */ else if (cmd == PRC_HOSTDEAD) ip = NULL; else if ((unsigned)cmd >= PRC_NCMDS || inetctlerrmap[cmd] == 0) return; if (ip != NULL) { icp = (struct icmp *)((caddr_t)ip - offsetof(struct icmp, icmp_ip)); th = (struct tcphdr *)((caddr_t)ip + (ip->ip_hl << 2)); INP_INFO_WLOCK(&V_tcbinfo); inp = in_pcblookup_hash(&V_tcbinfo, faddr, th->th_dport, ip->ip_src, th->th_sport, 0, NULL); if (inp != NULL) { INP_WLOCK(inp); if (!(inp->inp_vflag & INP_TIMEWAIT) && !(inp->inp_vflag & INP_DROPPED) && !(inp->inp_socket == NULL)) { icmp_tcp_seq = htonl(th->th_seq); tp = intotcpcb(inp); if (SEQ_GEQ(icmp_tcp_seq, tp->snd_una) && SEQ_LT(icmp_tcp_seq, tp->snd_max)) { if (cmd == PRC_MSGSIZE) { /* * MTU discovery: * If we got a needfrag set the MTU * in the route to the suggested new * value (if given) and then notify. */ bzero(&inc, sizeof(inc)); inc.inc_flags = 0; /* IPv4 */ inc.inc_faddr = faddr; inc.inc_fibnum = inp->inp_inc.inc_fibnum; mtu = ntohs(icp->icmp_nextmtu); /* * If no alternative MTU was * proposed, try the next smaller * one. ip->ip_len has already * been swapped in icmp_input(). */ if (!mtu) mtu = ip_next_mtu(ip->ip_len, 1); if (mtu < max(296, V_tcp_minmss + sizeof(struct tcpiphdr))) mtu = 0; if (!mtu) mtu = V_tcp_mssdflt + sizeof(struct tcpiphdr); /* * Only cache the the MTU if it * is smaller than the interface * or route MTU. tcp_mtudisc() * will do right thing by itself. 
*/ if (mtu <= tcp_maxmtu(&inc, NULL)) tcp_hc_updatemtu(&inc, mtu); } inp = (*notify)(inp, inetctlerrmap[cmd]); } } if (inp != NULL) INP_WUNLOCK(inp); } else { inc.inc_fport = th->th_dport; inc.inc_lport = th->th_sport; inc.inc_faddr = faddr; inc.inc_laddr = ip->ip_src; #ifdef INET6 inc.inc_isipv6 = 0; #endif syncache_unreach(&inc, th); } INP_INFO_WUNLOCK(&V_tcbinfo); } else in_pcbnotifyall(&V_tcbinfo, faddr, inetctlerrmap[cmd], notify); } #ifdef INET6 void tcp6_ctlinput(int cmd, struct sockaddr *sa, void *d) { INIT_VNET_INET(curvnet); struct tcphdr th; struct inpcb *(*notify)(struct inpcb *, int) = tcp_notify; struct ip6_hdr *ip6; struct mbuf *m; struct ip6ctlparam *ip6cp = NULL; const struct sockaddr_in6 *sa6_src = NULL; int off; struct tcp_portonly { u_int16_t th_sport; u_int16_t th_dport; } *thp; if (sa->sa_family != AF_INET6 || sa->sa_len != sizeof(struct sockaddr_in6)) return; if (cmd == PRC_MSGSIZE) notify = tcp_mtudisc; else if (!PRC_IS_REDIRECT(cmd) && ((unsigned)cmd >= PRC_NCMDS || inet6ctlerrmap[cmd] == 0)) return; /* Source quench is depreciated. */ else if (cmd == PRC_QUENCH) return; /* if the parameter is from icmp6, decode it. */ if (d != NULL) { ip6cp = (struct ip6ctlparam *)d; m = ip6cp->ip6c_m; ip6 = ip6cp->ip6c_ip6; off = ip6cp->ip6c_off; sa6_src = ip6cp->ip6c_src; } else { m = NULL; ip6 = NULL; off = 0; /* fool gcc */ sa6_src = &sa6_any; } if (ip6 != NULL) { struct in_conninfo inc; /* * XXX: We assume that when IPV6 is non NULL, * M and OFF are valid. */ /* check if we can safely examine src and dst ports */ if (m->m_pkthdr.len < off + sizeof(*thp)) return; bzero(&th, sizeof(th)); m_copydata(m, off, sizeof(*thp), (caddr_t)&th); in6_pcbnotify(&V_tcbinfo, sa, th.th_dport, (struct sockaddr *)ip6cp->ip6c_src, th.th_sport, cmd, NULL, notify); inc.inc_fport = th.th_dport; inc.inc_lport = th.th_sport; inc.inc6_faddr = ((struct sockaddr_in6 *)sa)->sin6_addr; inc.inc6_laddr = ip6cp->ip6c_src->sin6_addr; inc.inc_isipv6 = 1; INP_INFO_WLOCK(&V_tcbinfo); syncache_unreach(&inc, &th); INP_INFO_WUNLOCK(&V_tcbinfo); } else in6_pcbnotify(&V_tcbinfo, sa, 0, (const struct sockaddr *)sa6_src, 0, cmd, NULL, notify); } #endif /* INET6 */ /* * Following is where TCP initial sequence number generation occurs. * * There are two places where we must use initial sequence numbers: * 1. In SYN-ACK packets. * 2. In SYN packets. * * All ISNs for SYN-ACK packets are generated by the syncache. See * tcp_syncache.c for details. * * The ISNs in SYN packets must be monotonic; TIME_WAIT recycling * depends on this property. In addition, these ISNs should be * unguessable so as to prevent connection hijacking. To satisfy * the requirements of this situation, the algorithm outlined in * RFC 1948 is used, with only small modifications. * * Implementation details: * * Time is based off the system timer, and is corrected so that it * increases by one megabyte per second. This allows for proper * recycling on high speed LANs while still leaving over an hour * before rollover. * * As reading the *exact* system time is too expensive to be done * whenever setting up a TCP connection, we increment the time * offset in two ways. First, a small random positive increment * is added to isn_offset for each connection that is set up. * Second, the function tcp_isn_tick fires once per clock tick * and increments isn_offset as necessary so that sequence numbers * are incremented at approximately ISN_BYTES_PER_SECOND. 
The * random positive increments serve only to ensure that the same * exact sequence number is never sent out twice (as could otherwise * happen when a port is recycled in less than the system tick * interval.) * * net.inet.tcp.isn_reseed_interval controls the number of seconds * between seeding of isn_secret. This is normally set to zero, * as reseeding should not be necessary. * * Locking of the global variables isn_secret, isn_last_reseed, isn_offset, * isn_offset_old, and isn_ctx is performed using the TCP pcbinfo lock. In * general, this means holding an exclusive (write) lock. */ #define ISN_BYTES_PER_SECOND 1048576 #define ISN_STATIC_INCREMENT 4096 #define ISN_RANDOM_INCREMENT (4096 - 1) #ifdef VIMAGE_GLOBALS static u_char isn_secret[32]; static int isn_last_reseed; static u_int32_t isn_offset, isn_offset_old; #endif tcp_seq tcp_new_isn(struct tcpcb *tp) { INIT_VNET_INET(tp->t_vnet); MD5_CTX isn_ctx; u_int32_t md5_buffer[4]; tcp_seq new_isn; INP_WLOCK_ASSERT(tp->t_inpcb); ISN_LOCK(); /* Seed if this is the first use, reseed if requested. */ if ((V_isn_last_reseed == 0) || ((V_tcp_isn_reseed_interval > 0) && (((u_int)V_isn_last_reseed + (u_int)V_tcp_isn_reseed_interval*hz) < (u_int)ticks))) { read_random(&V_isn_secret, sizeof(V_isn_secret)); V_isn_last_reseed = ticks; } /* Compute the md5 hash and return the ISN. */ MD5Init(&isn_ctx); MD5Update(&isn_ctx, (u_char *) &tp->t_inpcb->inp_fport, sizeof(u_short)); MD5Update(&isn_ctx, (u_char *) &tp->t_inpcb->inp_lport, sizeof(u_short)); #ifdef INET6 if ((tp->t_inpcb->inp_vflag & INP_IPV6) != 0) { MD5Update(&isn_ctx, (u_char *) &tp->t_inpcb->in6p_faddr, sizeof(struct in6_addr)); MD5Update(&isn_ctx, (u_char *) &tp->t_inpcb->in6p_laddr, sizeof(struct in6_addr)); } else #endif { MD5Update(&isn_ctx, (u_char *) &tp->t_inpcb->inp_faddr, sizeof(struct in_addr)); MD5Update(&isn_ctx, (u_char *) &tp->t_inpcb->inp_laddr, sizeof(struct in_addr)); } MD5Update(&isn_ctx, (u_char *) &V_isn_secret, sizeof(V_isn_secret)); MD5Final((u_char *) &md5_buffer, &isn_ctx); new_isn = (tcp_seq) md5_buffer[0]; V_isn_offset += ISN_STATIC_INCREMENT + (arc4random() & ISN_RANDOM_INCREMENT); new_isn += V_isn_offset; ISN_UNLOCK(); return (new_isn); } /* * Increment the offset to the next ISN_BYTES_PER_SECOND / 100 boundary * to keep time flowing at a relatively constant rate. If the random * increments have already pushed us past the projected offset, do nothing. */ static void tcp_isn_tick(void *xtp) { VNET_ITERATOR_DECL(vnet_iter); u_int32_t projected_offset; ISN_LOCK(); VNET_LIST_RLOCK(); VNET_FOREACH(vnet_iter) { CURVNET_SET(vnet_iter); /* XXX appease INVARIANTS */ INIT_VNET_INET(curvnet); projected_offset = V_isn_offset_old + ISN_BYTES_PER_SECOND / 100; if (SEQ_GT(projected_offset, V_isn_offset)) V_isn_offset = projected_offset; V_isn_offset_old = V_isn_offset; CURVNET_RESTORE(); } VNET_LIST_RUNLOCK(); callout_reset(&isn_callout, hz/100, tcp_isn_tick, NULL); ISN_UNLOCK(); } /* * When a specific ICMP unreachable message is received and the * connection state is SYN-SENT, drop the connection. This behavior * is controlled by the icmp_may_rst sysctl. 
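To make the ISN scheme described in the comment and implemented in tcp_new_isn() above a little more concrete, here is a minimal userland sketch under stated assumptions: the isn_demo_* names are invented, the secret would normally be (re)seeded from a random source, and libmd's MD5 plus arc4random(3) stand in for the kernel facilities.

	#include <sys/types.h>
	#include <stdint.h>
	#include <stdlib.h>
	#include <md5.h>

	static u_char   isn_demo_secret[32];	/* normally seeded from a PRNG */
	static uint32_t isn_demo_offset;	/* advanced ~1 MB/sec by a timer */

	static uint32_t
	isn_demo_new(uint32_t laddr, uint32_t faddr, uint16_t lport, uint16_t fport)
	{
		MD5_CTX ctx;
		uint32_t digest[4];

		/* Hash the connection 4-tuple and the secret (RFC 1948 style). */
		MD5Init(&ctx);
		MD5Update(&ctx, (const u_char *)&fport, sizeof(fport));
		MD5Update(&ctx, (const u_char *)&lport, sizeof(lport));
		MD5Update(&ctx, (const u_char *)&faddr, sizeof(faddr));
		MD5Update(&ctx, (const u_char *)&laddr, sizeof(laddr));
		MD5Update(&ctx, isn_demo_secret, sizeof(isn_demo_secret));
		MD5Final((u_char *)digest, &ctx);

		/* A static plus bounded random increment keeps ISNs monotonic. */
		isn_demo_offset += 4096 + (arc4random() & 4095);
		return (digest[0] + isn_demo_offset);
	}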
*/ struct inpcb * tcp_drop_syn_sent(struct inpcb *inp, int errno) { #ifdef INVARIANTS INIT_VNET_INET(inp->inp_vnet); #endif struct tcpcb *tp; INP_INFO_WLOCK_ASSERT(&V_tcbinfo); INP_WLOCK_ASSERT(inp); if ((inp->inp_vflag & INP_TIMEWAIT) || (inp->inp_vflag & INP_DROPPED)) return (inp); tp = intotcpcb(inp); if (tp->t_state != TCPS_SYN_SENT) return (inp); tp = tcp_drop(tp, errno); if (tp != NULL) return (inp); else return (NULL); } /* * When `need fragmentation' ICMP is received, update our idea of the MSS * based on the new value in the route. Also nudge TCP to send something, * since we know the packet we just sent was dropped. * This duplicates some code in the tcp_mss() function in tcp_input.c. */ struct inpcb * tcp_mtudisc(struct inpcb *inp, int errno) { INIT_VNET_INET(inp->inp_vnet); struct tcpcb *tp; struct socket *so; INP_WLOCK_ASSERT(inp); if ((inp->inp_vflag & INP_TIMEWAIT) || (inp->inp_vflag & INP_DROPPED)) return (inp); tp = intotcpcb(inp); KASSERT(tp != NULL, ("tcp_mtudisc: tp == NULL")); tcp_mss_update(tp, -1, NULL, NULL); so = inp->inp_socket; SOCKBUF_LOCK(&so->so_snd); /* If the mss is larger than the socket buffer, decrease the mss. */ if (so->so_snd.sb_hiwat < tp->t_maxseg) tp->t_maxseg = so->so_snd.sb_hiwat; SOCKBUF_UNLOCK(&so->so_snd); V_tcpstat.tcps_mturesent++; tp->t_rtttime = 0; tp->snd_nxt = tp->snd_una; tcp_free_sackholes(tp); tp->snd_recover = tp->snd_max; if (tp->t_flags & TF_SACK_PERMIT) EXIT_FASTRECOVERY(tp); tcp_output_send(tp); return (inp); } /* * Look-up the routing entry to the peer of this inpcb. If no route * is found and it cannot be allocated, then return 0. This routine * is called by TCP routines that access the rmx structure and by * tcp_mss_update to get the peer/interface MTU. */ u_long tcp_maxmtu(struct in_conninfo *inc, int *flags) { struct route sro; struct sockaddr_in *dst; struct ifnet *ifp; u_long maxmtu = 0; KASSERT(inc != NULL, ("tcp_maxmtu with NULL in_conninfo pointer")); bzero(&sro, sizeof(sro)); if (inc->inc_faddr.s_addr != INADDR_ANY) { dst = (struct sockaddr_in *)&sro.ro_dst; dst->sin_family = AF_INET; dst->sin_len = sizeof(*dst); dst->sin_addr = inc->inc_faddr; - in_rtalloc_ign(&sro, RTF_CLONING, inc->inc_fibnum); + in_rtalloc_ign(&sro, 0, inc->inc_fibnum); } if (sro.ro_rt != NULL) { ifp = sro.ro_rt->rt_ifp; if (sro.ro_rt->rt_rmx.rmx_mtu == 0) maxmtu = ifp->if_mtu; else maxmtu = min(sro.ro_rt->rt_rmx.rmx_mtu, ifp->if_mtu); /* Report additional interface capabilities. */ if (flags != NULL) { if (ifp->if_capenable & IFCAP_TSO4 && ifp->if_hwassist & CSUM_TSO) *flags |= CSUM_TSO; } RTFREE(sro.ro_rt); } return (maxmtu); } #ifdef INET6 u_long tcp_maxmtu6(struct in_conninfo *inc, int *flags) { struct route_in6 sro6; struct ifnet *ifp; u_long maxmtu = 0; KASSERT(inc != NULL, ("tcp_maxmtu6 with NULL in_conninfo pointer")); bzero(&sro6, sizeof(sro6)); if (!IN6_IS_ADDR_UNSPECIFIED(&inc->inc6_faddr)) { sro6.ro_dst.sin6_family = AF_INET6; sro6.ro_dst.sin6_len = sizeof(struct sockaddr_in6); sro6.ro_dst.sin6_addr = inc->inc6_faddr; - rtalloc_ign((struct route *)&sro6, RTF_CLONING); + rtalloc_ign((struct route *)&sro6, 0); } if (sro6.ro_rt != NULL) { ifp = sro6.ro_rt->rt_ifp; if (sro6.ro_rt->rt_rmx.rmx_mtu == 0) maxmtu = IN6_LINKMTU(sro6.ro_rt->rt_ifp); else maxmtu = min(sro6.ro_rt->rt_rmx.rmx_mtu, IN6_LINKMTU(sro6.ro_rt->rt_ifp)); /* Report additional interface capabilities. 
*/ if (flags != NULL) { if (ifp->if_capenable & IFCAP_TSO6 && ifp->if_hwassist & CSUM_TSO) *flags |= CSUM_TSO; } RTFREE(sro6.ro_rt); } return (maxmtu); } #endif /* INET6 */ #ifdef IPSEC /* compute ESP/AH header size for TCP, including outer IP header. */ size_t ipsec_hdrsiz_tcp(struct tcpcb *tp) { struct inpcb *inp; struct mbuf *m; size_t hdrsiz; struct ip *ip; #ifdef INET6 struct ip6_hdr *ip6; #endif struct tcphdr *th; if ((tp == NULL) || ((inp = tp->t_inpcb) == NULL)) return (0); MGETHDR(m, M_DONTWAIT, MT_DATA); if (!m) return (0); #ifdef INET6 if ((inp->inp_vflag & INP_IPV6) != 0) { ip6 = mtod(m, struct ip6_hdr *); th = (struct tcphdr *)(ip6 + 1); m->m_pkthdr.len = m->m_len = sizeof(struct ip6_hdr) + sizeof(struct tcphdr); tcpip_fillheaders(inp, ip6, th); hdrsiz = ipsec6_hdrsiz(m, IPSEC_DIR_OUTBOUND, inp); } else #endif /* INET6 */ { ip = mtod(m, struct ip *); th = (struct tcphdr *)(ip + 1); m->m_pkthdr.len = m->m_len = sizeof(struct tcpiphdr); tcpip_fillheaders(inp, ip, th); hdrsiz = ipsec4_hdrsiz(m, IPSEC_DIR_OUTBOUND, inp); } m_free(m); return (hdrsiz); } #endif /* IPSEC */ /* * TCP BANDWIDTH DELAY PRODUCT WINDOW LIMITING * * This code attempts to calculate the bandwidth-delay product as a * means of determining the optimal window size to maximize bandwidth, * minimize RTT, and avoid the over-allocation of buffers on interfaces and * routers. This code also does a fairly good job keeping RTTs in check * across slow links like modems. We implement an algorithm which is very * similar (but not meant to be) TCP/Vegas. The code operates on the * transmitter side of a TCP connection and so only effects the transmit * side of the connection. * * BACKGROUND: TCP makes no provision for the management of buffer space * at the end points or at the intermediate routers and switches. A TCP * stream, whether using NewReno or not, will eventually buffer as * many packets as it is able and the only reason this typically works is * due to the fairly small default buffers made available for a connection * (typicaly 16K or 32K). As machines use larger windows and/or window * scaling it is now fairly easy for even a single TCP connection to blow-out * all available buffer space not only on the local interface, but on * intermediate routers and switches as well. NewReno makes a misguided * attempt to 'solve' this problem by waiting for an actual failure to occur, * then backing off, then steadily increasing the window again until another * failure occurs, ad-infinitum. This results in terrible oscillation that * is only made worse as network loads increase and the idea of intentionally * blowing out network buffers is, frankly, a terrible way to manage network * resources. * * It is far better to limit the transmit window prior to the failure * condition being achieved. There are two general ways to do this: First * you can 'scan' through different transmit window sizes and locate the * point where the RTT stops increasing, indicating that you have filled the * pipe, then scan backwards until you note that RTT stops decreasing, then * repeat ad-infinitum. This method works in principle but has severe * implementation issues due to RTT variances, timer granularity, and * instability in the algorithm which can lead to many false positives and * create oscillations as well as interact badly with other TCP streams * implementing the same algorithm. * * The second method is to limit the window to the bandwidth delay product * of the link. This is the method we implement. 
RTT variances and our * own manipulation of the congestion window, bwnd, can potentially * destabilize the algorithm. For this reason we have to stabilize the * elements used to calculate the window. We do this by using the minimum * observed RTT, the long term average of the observed bandwidth, and * by adding two segments worth of slop. It isn't perfect but it is able * to react to changing conditions and gives us a very stable basis on * which to extend the algorithm. */ void tcp_xmit_bandwidth_limit(struct tcpcb *tp, tcp_seq ack_seq) { INIT_VNET_INET(tp->t_vnet); u_long bw; u_long bwnd; int save_ticks; INP_WLOCK_ASSERT(tp->t_inpcb); /* * If inflight_enable is disabled in the middle of a tcp connection, * make sure snd_bwnd is effectively disabled. */ if (V_tcp_inflight_enable == 0 || tp->t_rttlow < V_tcp_inflight_rttthresh) { tp->snd_bwnd = TCP_MAXWIN << TCP_MAX_WINSHIFT; tp->snd_bandwidth = 0; return; } /* * Figure out the bandwidth. Due to the tick granularity this * is a very rough number and it MUST be averaged over a fairly * long period of time. XXX we need to take into account a link * that is not using all available bandwidth, but for now our * slop will ramp us up if this case occurs and the bandwidth later * increases. * * Note: if ticks rollover 'bw' may wind up negative. We must * effectively reset t_bw_rtttime for this case. */ save_ticks = ticks; if ((u_int)(save_ticks - tp->t_bw_rtttime) < 1) return; bw = (int64_t)(ack_seq - tp->t_bw_rtseq) * hz / (save_ticks - tp->t_bw_rtttime); tp->t_bw_rtttime = save_ticks; tp->t_bw_rtseq = ack_seq; if (tp->t_bw_rtttime == 0 || (int)bw < 0) return; bw = ((int64_t)tp->snd_bandwidth * 15 + bw) >> 4; tp->snd_bandwidth = bw; /* * Calculate the semi-static bandwidth delay product, plus two maximal * segments. The additional slop puts us squarely in the sweet * spot and also handles the bandwidth run-up case and stabilization. * Without the slop we could be locking ourselves into a lower * bandwidth. * * Situations Handled: * (1) Prevents over-queueing of packets on LANs, especially on * high speed LANs, allowing larger TCP buffers to be * specified, and also does a good job preventing * over-queueing of packets over choke points like modems * (at least for the transmit side). * * (2) Is able to handle changing network loads (bandwidth * drops so bwnd drops, bandwidth increases so bwnd * increases). * * (3) Theoretically should stabilize in the face of multiple * connections implementing the same algorithm (this may need * a little work). * * (4) Stability value (defaults to 20 = 2 maximal packets) can * be adjusted with a sysctl but typically only needs to be * on very slow connections. A value no smaller then 5 * should be used, but only reduce this default if you have * no other choice. */ #define USERTT ((tp->t_srtt + tp->t_rttbest) / 2) bwnd = (int64_t)bw * USERTT / (hz << TCP_RTT_SHIFT) + V_tcp_inflight_stab * tp->t_maxseg / 10; #undef USERTT if (tcp_inflight_debug > 0) { static int ltime; if ((u_int)(ticks - ltime) >= hz / tcp_inflight_debug) { ltime = ticks; printf("%p bw %ld rttbest %d srtt %d bwnd %ld\n", tp, bw, tp->t_rttbest, tp->t_srtt, bwnd ); } } if ((long)bwnd < V_tcp_inflight_min) bwnd = V_tcp_inflight_min; if (bwnd > V_tcp_inflight_max) bwnd = V_tcp_inflight_max; if ((long)bwnd < tp->t_maxseg * 2) bwnd = tp->t_maxseg * 2; tp->snd_bwnd = bwnd; } #ifdef TCP_SIGNATURE /* * Callback function invoked by m_apply() to digest TCP segment data * contained within an mbuf chain. 
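As a rough worked example of the inflight limit computed in tcp_xmit_bandwidth_limit() above, with all numbers hypothetical and the scaled-tick arithmetic replaced by plain seconds and bytes: a link measured at 500 kB/s with a 50 ms RTT and the default stab of 20 yields a window of about 28 kB.

	#include <stdint.h>
	#include <stdio.h>

	int
	main(void)
	{
		int64_t bw = 500000;	/* measured bandwidth, bytes per second */
		double rtt = 0.050;	/* smoothed RTT, seconds */
		int maxseg = 1460;	/* typical Ethernet MSS */
		int stab = 20;		/* default slop: 2 maximal segments */

		/* bwnd = bandwidth * RTT + stab/10 segments of slop */
		double bwnd = (double)bw * rtt + (double)stab * maxseg / 10.0;

		/* Prints about 27920 bytes, i.e. roughly 19 segments in flight. */
		printf("bwnd = %.0f bytes\n", bwnd);
		return (0);
	}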
*/ static int tcp_signature_apply(void *fstate, void *data, u_int len) { MD5Update(fstate, (u_char *)data, len); return (0); } /* * Compute TCP-MD5 hash of a TCP segment. (RFC2385) * * Parameters: * m pointer to head of mbuf chain * _unused * len length of TCP segment data, excluding options * optlen length of TCP segment options * buf pointer to storage for computed MD5 digest * direction direction of flow (IPSEC_DIR_INBOUND or OUTBOUND) * * We do this over ip, tcphdr, segment data, and the key in the SADB. * When called from tcp_input(), we can be sure that th_sum has been * zeroed out and verified already. * * Return 0 if successful, otherwise return -1. * * XXX The key is retrieved from the system's PF_KEY SADB, by keying a * search with the destination IP address, and a 'magic SPI' to be * determined by the application. This is hardcoded elsewhere to 1179 * right now. Another branch of this code exists which uses the SPD to * specify per-application flows but it is unstable. */ int tcp_signature_compute(struct mbuf *m, int _unused, int len, int optlen, u_char *buf, u_int direction) { INIT_VNET_IPSEC(curvnet); union sockaddr_union dst; struct ippseudo ippseudo; MD5_CTX ctx; int doff; struct ip *ip; struct ipovly *ipovly; struct secasvar *sav; struct tcphdr *th; #ifdef INET6 struct ip6_hdr *ip6; struct in6_addr in6; char ip6buf[INET6_ADDRSTRLEN]; uint32_t plen; uint16_t nhdr; #endif u_short savecsum; KASSERT(m != NULL, ("NULL mbuf chain")); KASSERT(buf != NULL, ("NULL signature pointer")); /* Extract the destination from the IP header in the mbuf. */ bzero(&dst, sizeof(union sockaddr_union)); ip = mtod(m, struct ip *); #ifdef INET6 ip6 = NULL; /* Make the compiler happy. */ #endif switch (ip->ip_v) { case IPVERSION: dst.sa.sa_len = sizeof(struct sockaddr_in); dst.sa.sa_family = AF_INET; dst.sin.sin_addr = (direction == IPSEC_DIR_INBOUND) ? ip->ip_src : ip->ip_dst; break; #ifdef INET6 case (IPV6_VERSION >> 4): ip6 = mtod(m, struct ip6_hdr *); dst.sa.sa_len = sizeof(struct sockaddr_in6); dst.sa.sa_family = AF_INET6; dst.sin6.sin6_addr = (direction == IPSEC_DIR_INBOUND) ? ip6->ip6_src : ip6->ip6_dst; break; #endif default: return (EINVAL); /* NOTREACHED */ break; } /* Look up an SADB entry which matches the address of the peer. */ sav = KEY_ALLOCSA(&dst, IPPROTO_TCP, htonl(TCP_SIG_SPI)); if (sav == NULL) { ipseclog((LOG_ERR, "%s: SADB lookup failed for %s\n", __func__, (ip->ip_v == IPVERSION) ? inet_ntoa(dst.sin.sin_addr) : #ifdef INET6 (ip->ip_v == (IPV6_VERSION >> 4)) ? ip6_sprintf(ip6buf, &dst.sin6.sin6_addr) : #endif "(unsupported)")); return (EINVAL); } MD5Init(&ctx); /* * Step 1: Update MD5 hash with IP(v6) pseudo-header. * * XXX The ippseudo header MUST be digested in network byte order, * or else we'll fail the regression test. Assume all fields we've * been doing arithmetic on have been in host byte order. * XXX One cannot depend on ipovly->ih_len here. When called from * tcp_output(), the underlying ip_len member has not yet been set. 
*/ switch (ip->ip_v) { case IPVERSION: ipovly = (struct ipovly *)ip; ippseudo.ippseudo_src = ipovly->ih_src; ippseudo.ippseudo_dst = ipovly->ih_dst; ippseudo.ippseudo_pad = 0; ippseudo.ippseudo_p = IPPROTO_TCP; ippseudo.ippseudo_len = htons(len + sizeof(struct tcphdr) + optlen); MD5Update(&ctx, (char *)&ippseudo, sizeof(struct ippseudo)); th = (struct tcphdr *)((u_char *)ip + sizeof(struct ip)); doff = sizeof(struct ip) + sizeof(struct tcphdr) + optlen; break; #ifdef INET6 /* * RFC 2385, 2.0 Proposal * For IPv6, the pseudo-header is as described in RFC 2460, namely the * 128-bit source IPv6 address, 128-bit destination IPv6 address, zero- * extended next header value (to form 32 bits), and 32-bit segment * length. * Note: Upper-Layer Packet Length comes before Next Header. */ case (IPV6_VERSION >> 4): in6 = ip6->ip6_src; in6_clearscope(&in6); MD5Update(&ctx, (char *)&in6, sizeof(struct in6_addr)); in6 = ip6->ip6_dst; in6_clearscope(&in6); MD5Update(&ctx, (char *)&in6, sizeof(struct in6_addr)); plen = htonl(len + sizeof(struct tcphdr) + optlen); MD5Update(&ctx, (char *)&plen, sizeof(uint32_t)); nhdr = 0; MD5Update(&ctx, (char *)&nhdr, sizeof(uint8_t)); MD5Update(&ctx, (char *)&nhdr, sizeof(uint8_t)); MD5Update(&ctx, (char *)&nhdr, sizeof(uint8_t)); nhdr = IPPROTO_TCP; MD5Update(&ctx, (char *)&nhdr, sizeof(uint8_t)); th = (struct tcphdr *)((u_char *)ip6 + sizeof(struct ip6_hdr)); doff = sizeof(struct ip6_hdr) + sizeof(struct tcphdr) + optlen; break; #endif default: return (EINVAL); /* NOTREACHED */ break; } /* * Step 2: Update MD5 hash with TCP header, excluding options. * The TCP checksum must be set to zero. */ savecsum = th->th_sum; th->th_sum = 0; MD5Update(&ctx, (char *)th, sizeof(struct tcphdr)); th->th_sum = savecsum; /* * Step 3: Update MD5 hash with TCP segment data. * Use m_apply() to avoid an early m_pullup(). */ if (len > 0) m_apply(m, doff, len, tcp_signature_apply, &ctx); /* * Step 4: Update MD5 hash with shared secret. */ MD5Update(&ctx, sav->key_auth->key_data, _KEYLEN(sav->key_auth)); MD5Final(buf, &ctx); key_sa_recordxfer(sav, m); KEY_FREESAV(&sav); return (0); } #endif /* TCP_SIGNATURE */ static int sysctl_drop(SYSCTL_HANDLER_ARGS) { INIT_VNET_INET(curvnet); #ifdef INET6 INIT_VNET_INET6(curvnet); #endif /* addrs[0] is a foreign socket, addrs[1] is a local one. 
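The handler that follows is driven entirely from userland by writing two addresses into the net.inet.tcp.drop sysctl, in the spirit of tcpdrop(8). A hedged IPv4-only sketch of such a caller (the drop_connection() helper is invented; error handling is minimal):

	#include <sys/types.h>
	#include <sys/socket.h>
	#include <sys/sysctl.h>
	#include <netinet/in.h>
	#include <arpa/inet.h>
	#include <stdint.h>
	#include <string.h>

	static int
	drop_connection(const char *faddr, uint16_t fport,
	    const char *laddr, uint16_t lport)
	{
		struct sockaddr_storage addrs[2];
		struct sockaddr_in *fin = (struct sockaddr_in *)&addrs[0];
		struct sockaddr_in *lin = (struct sockaddr_in *)&addrs[1];

		/* addrs[0] is the foreign endpoint, addrs[1] the local one. */
		memset(addrs, 0, sizeof(addrs));
		fin->sin_len = lin->sin_len = sizeof(struct sockaddr_in);
		fin->sin_family = lin->sin_family = AF_INET;
		fin->sin_port = htons(fport);
		lin->sin_port = htons(lport);
		if (inet_pton(AF_INET, faddr, &fin->sin_addr) != 1 ||
		    inet_pton(AF_INET, laddr, &lin->sin_addr) != 1)
			return (-1);

		/* No old value may be requested; the addresses go in as new. */
		return (sysctlbyname("net.inet.tcp.drop", NULL, NULL,
		    addrs, sizeof(addrs)));
	}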
*/ struct sockaddr_storage addrs[2]; struct inpcb *inp; struct tcpcb *tp; struct tcptw *tw; struct sockaddr_in *fin, *lin; #ifdef INET6 struct sockaddr_in6 *fin6, *lin6; struct in6_addr f6, l6; #endif int error; inp = NULL; fin = lin = NULL; #ifdef INET6 fin6 = lin6 = NULL; #endif error = 0; if (req->oldptr != NULL || req->oldlen != 0) return (EINVAL); if (req->newptr == NULL) return (EPERM); if (req->newlen < sizeof(addrs)) return (ENOMEM); error = SYSCTL_IN(req, &addrs, sizeof(addrs)); if (error) return (error); switch (addrs[0].ss_family) { #ifdef INET6 case AF_INET6: fin6 = (struct sockaddr_in6 *)&addrs[0]; lin6 = (struct sockaddr_in6 *)&addrs[1]; if (fin6->sin6_len != sizeof(struct sockaddr_in6) || lin6->sin6_len != sizeof(struct sockaddr_in6)) return (EINVAL); if (IN6_IS_ADDR_V4MAPPED(&fin6->sin6_addr)) { if (!IN6_IS_ADDR_V4MAPPED(&lin6->sin6_addr)) return (EINVAL); in6_sin6_2_sin_in_sock((struct sockaddr *)&addrs[0]); in6_sin6_2_sin_in_sock((struct sockaddr *)&addrs[1]); fin = (struct sockaddr_in *)&addrs[0]; lin = (struct sockaddr_in *)&addrs[1]; break; } error = sa6_embedscope(fin6, V_ip6_use_defzone); if (error) return (error); error = sa6_embedscope(lin6, V_ip6_use_defzone); if (error) return (error); break; #endif case AF_INET: fin = (struct sockaddr_in *)&addrs[0]; lin = (struct sockaddr_in *)&addrs[1]; if (fin->sin_len != sizeof(struct sockaddr_in) || lin->sin_len != sizeof(struct sockaddr_in)) return (EINVAL); break; default: return (EINVAL); } INP_INFO_WLOCK(&V_tcbinfo); switch (addrs[0].ss_family) { #ifdef INET6 case AF_INET6: inp = in6_pcblookup_hash(&V_tcbinfo, &f6, fin6->sin6_port, &l6, lin6->sin6_port, 0, NULL); break; #endif case AF_INET: inp = in_pcblookup_hash(&V_tcbinfo, fin->sin_addr, fin->sin_port, lin->sin_addr, lin->sin_port, 0, NULL); break; } if (inp != NULL) { INP_WLOCK(inp); if (inp->inp_vflag & INP_TIMEWAIT) { /* * XXXRW: There currently exists a state where an * inpcb is present, but its timewait state has been * discarded. For now, don't allow dropping of this * type of inpcb. */ tw = intotw(inp); if (tw != NULL) tcp_twclose(tw, 0); else INP_WUNLOCK(inp); } else if (!(inp->inp_vflag & INP_DROPPED) && !(inp->inp_socket->so_options & SO_ACCEPTCONN)) { tp = intotcpcb(inp); tp = tcp_drop(tp, ECONNABORTED); if (tp != NULL) INP_WUNLOCK(inp); } else INP_WUNLOCK(inp); } else error = ESRCH; INP_INFO_WUNLOCK(&V_tcbinfo); return (error); } SYSCTL_PROC(_net_inet_tcp, TCPCTL_DROP, drop, CTLTYPE_STRUCT|CTLFLAG_WR|CTLFLAG_SKIP, NULL, 0, sysctl_drop, "", "Drop TCP connection"); /* * Generate a standardized TCP log line for use throughout the * tcp subsystem. Memory allocation is done with M_NOWAIT to * allow use in the interrupt context. * * NB: The caller MUST free(s, M_TCPLOG) the returned string. * NB: The function may return NULL if memory allocation failed. * * Due to header inclusion and ordering limitations the struct ip * and ip6_hdr pointers have to be passed as void pointers. */ char * tcp_log_addrs(struct in_conninfo *inc, struct tcphdr *th, void *ip4hdr, const void *ip6hdr) { char *s, *sp; size_t size; struct ip *ip; #ifdef INET6 const struct ip6_hdr *ip6; ip6 = (const struct ip6_hdr *)ip6hdr; #endif /* INET6 */ ip = (struct ip *)ip4hdr; /* * The log line looks like this: * "TCP: [1.2.3.4]:50332 to [1.2.3.4]:80 tcpflags 0x2" */ size = sizeof("TCP: []:12345 to []:12345 tcpflags 0x2<>") + sizeof(PRINT_TH_FLAGS) + 1 + #ifdef INET6 2 * INET6_ADDRSTRLEN; #else 2 * INET_ADDRSTRLEN; #endif /* INET6 */ /* Is logging enabled? 
*/ if (tcp_log_debug == 0 && tcp_log_in_vain == 0) return (NULL); s = malloc(size, M_TCPLOG, M_ZERO|M_NOWAIT); if (s == NULL) return (NULL); strcat(s, "TCP: ["); sp = s + strlen(s); if (inc && inc->inc_isipv6 == 0) { inet_ntoa_r(inc->inc_faddr, sp); sp = s + strlen(s); sprintf(sp, "]:%i to [", ntohs(inc->inc_fport)); sp = s + strlen(s); inet_ntoa_r(inc->inc_laddr, sp); sp = s + strlen(s); sprintf(sp, "]:%i", ntohs(inc->inc_lport)); #ifdef INET6 } else if (inc) { ip6_sprintf(sp, &inc->inc6_faddr); sp = s + strlen(s); sprintf(sp, "]:%i to [", ntohs(inc->inc_fport)); sp = s + strlen(s); ip6_sprintf(sp, &inc->inc6_laddr); sp = s + strlen(s); sprintf(sp, "]:%i", ntohs(inc->inc_lport)); } else if (ip6 && th) { ip6_sprintf(sp, &ip6->ip6_src); sp = s + strlen(s); sprintf(sp, "]:%i to [", ntohs(th->th_sport)); sp = s + strlen(s); ip6_sprintf(sp, &ip6->ip6_dst); sp = s + strlen(s); sprintf(sp, "]:%i", ntohs(th->th_dport)); #endif /* INET6 */ } else if (ip && th) { inet_ntoa_r(ip->ip_src, sp); sp = s + strlen(s); sprintf(sp, "]:%i to [", ntohs(th->th_sport)); sp = s + strlen(s); inet_ntoa_r(ip->ip_dst, sp); sp = s + strlen(s); sprintf(sp, "]:%i", ntohs(th->th_dport)); } else { free(s, M_TCPLOG); return (NULL); } sp = s + strlen(s); if (th) sprintf(sp, " tcpflags 0x%b", th->th_flags, PRINT_TH_FLAGS); if (*(s + size - 1) != '\0') panic("%s: string too long", __func__); return (s); } Index: head/sys/netinet6/icmp6.c =================================================================== --- head/sys/netinet6/icmp6.c (revision 186118) +++ head/sys/netinet6/icmp6.c (revision 186119) @@ -1,2827 +1,2828 @@ /*- * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the project nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $KAME: icmp6.c,v 1.211 2001/04/04 05:56:20 itojun Exp $ */ /*- * Copyright (c) 1982, 1986, 1988, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)ip_icmp.c 8.2 (Berkeley) 1/4/94 */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include "opt_ipsec.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef IPSEC #include #include #endif extern struct domain inet6domain; #ifdef VIMAGE_GLOBALS extern struct inpcbinfo ripcbinfo; extern struct inpcbhead ripcb; extern int icmp6errppslim; extern int icmp6_nodeinfo; struct icmp6stat icmp6stat; static int icmp6errpps_count; static struct timeval icmp6errppslim_last; #endif static void icmp6_errcount(struct icmp6errstat *, int, int); static int icmp6_rip6_input(struct mbuf **, int); static int icmp6_ratelimit(const struct in6_addr *, const int, const int); static const char *icmp6_redirect_diag __P((struct in6_addr *, struct in6_addr *, struct in6_addr *)); static struct mbuf *ni6_input(struct mbuf *, int); static struct mbuf *ni6_nametodns(const char *, int, int); static int ni6_dnsmatch(const char *, int, const char *, int); static int ni6_addrs __P((struct icmp6_nodeinfo *, struct mbuf *, struct ifnet **, struct in6_addr *)); static int ni6_store_addrs __P((struct icmp6_nodeinfo *, struct icmp6_nodeinfo *, struct ifnet *, int)); static int icmp6_notify_error(struct mbuf **, int, int, int); void icmp6_init(void) { INIT_VNET_INET6(curvnet); V_icmp6errpps_count = 0; mld6_init(); } static void icmp6_errcount(struct icmp6errstat *stat, int type, int code) { switch (type) { case ICMP6_DST_UNREACH: switch (code) { case ICMP6_DST_UNREACH_NOROUTE: stat->icp6errs_dst_unreach_noroute++; return; case ICMP6_DST_UNREACH_ADMIN: stat->icp6errs_dst_unreach_admin++; return; case ICMP6_DST_UNREACH_BEYONDSCOPE: stat->icp6errs_dst_unreach_beyondscope++; return; case ICMP6_DST_UNREACH_ADDR: stat->icp6errs_dst_unreach_addr++; return; case ICMP6_DST_UNREACH_NOPORT: stat->icp6errs_dst_unreach_noport++; return; } break; case ICMP6_PACKET_TOO_BIG: stat->icp6errs_packet_too_big++; return; case ICMP6_TIME_EXCEEDED: switch (code) { case 
ICMP6_TIME_EXCEED_TRANSIT: stat->icp6errs_time_exceed_transit++; return; case ICMP6_TIME_EXCEED_REASSEMBLY: stat->icp6errs_time_exceed_reassembly++; return; } break; case ICMP6_PARAM_PROB: switch (code) { case ICMP6_PARAMPROB_HEADER: stat->icp6errs_paramprob_header++; return; case ICMP6_PARAMPROB_NEXTHEADER: stat->icp6errs_paramprob_nextheader++; return; case ICMP6_PARAMPROB_OPTION: stat->icp6errs_paramprob_option++; return; } break; case ND_REDIRECT: stat->icp6errs_redirect++; return; } stat->icp6errs_unknown++; } /* * A wrapper function for icmp6_error() necessary when the erroneous packet * may not contain enough scope zone information. */ void icmp6_error2(struct mbuf *m, int type, int code, int param, struct ifnet *ifp) { INIT_VNET_INET6(curvnet); struct ip6_hdr *ip6; if (ifp == NULL) return; #ifndef PULLDOWN_TEST IP6_EXTHDR_CHECK(m, 0, sizeof(struct ip6_hdr), ); #else if (m->m_len < sizeof(struct ip6_hdr)) { m = m_pullup(m, sizeof(struct ip6_hdr)); if (m == NULL) return; } #endif ip6 = mtod(m, struct ip6_hdr *); if (in6_setscope(&ip6->ip6_src, ifp, NULL) != 0) return; if (in6_setscope(&ip6->ip6_dst, ifp, NULL) != 0) return; icmp6_error(m, type, code, param); } /* * Generate an error packet of type error in response to bad IP6 packet. */ void icmp6_error(struct mbuf *m, int type, int code, int param) { INIT_VNET_INET6(curvnet); struct ip6_hdr *oip6, *nip6; struct icmp6_hdr *icmp6; u_int preplen; int off; int nxt; V_icmp6stat.icp6s_error++; /* count per-type-code statistics */ icmp6_errcount(&V_icmp6stat.icp6s_outerrhist, type, code); #ifdef M_DECRYPTED /*not openbsd*/ if (m->m_flags & M_DECRYPTED) { V_icmp6stat.icp6s_canterror++; goto freeit; } #endif #ifndef PULLDOWN_TEST IP6_EXTHDR_CHECK(m, 0, sizeof(struct ip6_hdr), ); #else if (m->m_len < sizeof(struct ip6_hdr)) { m = m_pullup(m, sizeof(struct ip6_hdr)); if (m == NULL) return; } #endif oip6 = mtod(m, struct ip6_hdr *); /* * If the destination address of the erroneous packet is a multicast * address, or the packet was sent using link-layer multicast, * we should basically suppress sending an error (RFC 2463, Section * 2.4). * We have two exceptions (the item e.2 in that section): * - the Pakcet Too Big message can be sent for path MTU discovery. * - the Parameter Problem Message that can be allowed an icmp6 error * in the option type field. This check has been done in * ip6_unknown_opt(), so we can just check the type and code. */ if ((m->m_flags & (M_BCAST|M_MCAST) || IN6_IS_ADDR_MULTICAST(&oip6->ip6_dst)) && (type != ICMP6_PACKET_TOO_BIG && (type != ICMP6_PARAM_PROB || code != ICMP6_PARAMPROB_OPTION))) goto freeit; /* * RFC 2463, 2.4 (e.5): source address check. * XXX: the case of anycast source? */ if (IN6_IS_ADDR_UNSPECIFIED(&oip6->ip6_src) || IN6_IS_ADDR_MULTICAST(&oip6->ip6_src)) goto freeit; /* * If we are about to send ICMPv6 against ICMPv6 error/redirect, * don't do it. */ nxt = -1; off = ip6_lasthdr(m, 0, IPPROTO_IPV6, &nxt); if (off >= 0 && nxt == IPPROTO_ICMPV6) { struct icmp6_hdr *icp; #ifndef PULLDOWN_TEST IP6_EXTHDR_CHECK(m, 0, off + sizeof(struct icmp6_hdr), ); icp = (struct icmp6_hdr *)(mtod(m, caddr_t) + off); #else IP6_EXTHDR_GET(icp, struct icmp6_hdr *, m, off, sizeof(*icp)); if (icp == NULL) { V_icmp6stat.icp6s_tooshort++; return; } #endif if (icp->icmp6_type < ICMP6_ECHO_REQUEST || icp->icmp6_type == ND_REDIRECT) { /* * ICMPv6 error * Special case: for redirect (which is * informational) we must not send icmp6 error. 
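The check just above enforces the "no ICMPv6 error in response to an ICMPv6 error" rule from RFC 2463, with redirects excluded as well even though they are informational. A small self-contained restatement of the same predicate (the function name is invented for illustration):

	#include <sys/types.h>
	#include <netinet/in.h>
	#include <netinet/icmp6.h>

	/*
	 * Mirrors the test in icmp6_error(): types below ICMP6_ECHO_REQUEST
	 * (128) are error messages and must never trigger another error, and
	 * ND redirects are excluded too.  Returns non-zero if replying with
	 * an ICMPv6 error is permissible for the embedded type.
	 */
	static int
	icmp6_demo_may_error(int inner_type)
	{
		return (inner_type >= ICMP6_ECHO_REQUEST &&
		    inner_type != ND_REDIRECT);
	}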
*/ V_icmp6stat.icp6s_canterror++; goto freeit; } else { /* ICMPv6 informational - send the error */ } } else { /* non-ICMPv6 - send the error */ } oip6 = mtod(m, struct ip6_hdr *); /* adjust pointer */ /* Finally, do rate limitation check. */ if (icmp6_ratelimit(&oip6->ip6_src, type, code)) { V_icmp6stat.icp6s_toofreq++; goto freeit; } /* * OK, ICMP6 can be generated. */ if (m->m_pkthdr.len >= ICMPV6_PLD_MAXLEN) m_adj(m, ICMPV6_PLD_MAXLEN - m->m_pkthdr.len); preplen = sizeof(struct ip6_hdr) + sizeof(struct icmp6_hdr); M_PREPEND(m, preplen, M_DONTWAIT); if (m && m->m_len < preplen) m = m_pullup(m, preplen); if (m == NULL) { nd6log((LOG_DEBUG, "ENOBUFS in icmp6_error %d\n", __LINE__)); return; } nip6 = mtod(m, struct ip6_hdr *); nip6->ip6_src = oip6->ip6_src; nip6->ip6_dst = oip6->ip6_dst; in6_clearscope(&oip6->ip6_src); in6_clearscope(&oip6->ip6_dst); icmp6 = (struct icmp6_hdr *)(nip6 + 1); icmp6->icmp6_type = type; icmp6->icmp6_code = code; icmp6->icmp6_pptr = htonl((u_int32_t)param); /* * icmp6_reflect() is designed to be in the input path. * icmp6_error() can be called from both input and output path, * and if we are in output path rcvif could contain bogus value. * clear m->m_pkthdr.rcvif for safety, we should have enough scope * information in ip header (nip6). */ m->m_pkthdr.rcvif = NULL; V_icmp6stat.icp6s_outhist[type]++; icmp6_reflect(m, sizeof(struct ip6_hdr)); /* header order: IPv6 - ICMPv6 */ return; freeit: /* * If we can't tell whether or not we can generate ICMP6, free it. */ m_freem(m); } /* * Process a received ICMP6 message. */ int icmp6_input(struct mbuf **mp, int *offp, int proto) { INIT_VNET_INET6(curvnet); INIT_VPROCG(TD_TO_VPROCG(curthread)); /* XXX V_hostname needs this */ struct mbuf *m = *mp, *n; struct ip6_hdr *ip6, *nip6; struct icmp6_hdr *icmp6, *nicmp6; int off = *offp; int icmp6len = m->m_pkthdr.len - *offp; int code, sum, noff; char ip6bufs[INET6_ADDRSTRLEN], ip6bufd[INET6_ADDRSTRLEN]; #ifndef PULLDOWN_TEST IP6_EXTHDR_CHECK(m, off, sizeof(struct icmp6_hdr), IPPROTO_DONE); /* m might change if M_LOOP. So, call mtod after this */ #endif /* * Locate icmp6 structure in mbuf, and check * that not corrupted and of at least minimum length */ ip6 = mtod(m, struct ip6_hdr *); if (icmp6len < sizeof(struct icmp6_hdr)) { V_icmp6stat.icp6s_tooshort++; goto freeit; } /* * calculate the checksum */ #ifndef PULLDOWN_TEST icmp6 = (struct icmp6_hdr *)((caddr_t)ip6 + off); #else IP6_EXTHDR_GET(icmp6, struct icmp6_hdr *, m, off, sizeof(*icmp6)); if (icmp6 == NULL) { V_icmp6stat.icp6s_tooshort++; return IPPROTO_DONE; } #endif code = icmp6->icmp6_code; if ((sum = in6_cksum(m, IPPROTO_ICMPV6, off, icmp6len)) != 0) { nd6log((LOG_ERR, "ICMP6 checksum error(%d|%x) %s\n", icmp6->icmp6_type, sum, ip6_sprintf(ip6bufs, &ip6->ip6_src))); V_icmp6stat.icp6s_checksum++; goto freeit; } if (faithprefix_p != NULL && (*faithprefix_p)(&ip6->ip6_dst)) { /* * Deliver very specific ICMP6 type only. * This is important to deliver TOOBIG. Otherwise PMTUD * will not work. 
*/ switch (icmp6->icmp6_type) { case ICMP6_DST_UNREACH: case ICMP6_PACKET_TOO_BIG: case ICMP6_TIME_EXCEEDED: break; default: goto freeit; } } V_icmp6stat.icp6s_inhist[icmp6->icmp6_type]++; icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_msg); if (icmp6->icmp6_type < ICMP6_INFOMSG_MASK) icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_error); switch (icmp6->icmp6_type) { case ICMP6_DST_UNREACH: icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_dstunreach); switch (code) { case ICMP6_DST_UNREACH_NOROUTE: code = PRC_UNREACH_NET; break; case ICMP6_DST_UNREACH_ADMIN: icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_adminprohib); code = PRC_UNREACH_PROTOCOL; /* is this a good code? */ break; case ICMP6_DST_UNREACH_ADDR: code = PRC_HOSTDEAD; break; case ICMP6_DST_UNREACH_BEYONDSCOPE: /* I mean "source address was incorrect." */ code = PRC_PARAMPROB; break; case ICMP6_DST_UNREACH_NOPORT: code = PRC_UNREACH_PORT; break; default: goto badcode; } goto deliver; break; case ICMP6_PACKET_TOO_BIG: icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_pkttoobig); /* validation is made in icmp6_mtudisc_update */ code = PRC_MSGSIZE; /* * Updating the path MTU will be done after examining * intermediate extension headers. */ goto deliver; break; case ICMP6_TIME_EXCEEDED: icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_timeexceed); switch (code) { case ICMP6_TIME_EXCEED_TRANSIT: code = PRC_TIMXCEED_INTRANS; break; case ICMP6_TIME_EXCEED_REASSEMBLY: code = PRC_TIMXCEED_REASS; break; default: goto badcode; } goto deliver; break; case ICMP6_PARAM_PROB: icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_paramprob); switch (code) { case ICMP6_PARAMPROB_NEXTHEADER: code = PRC_UNREACH_PROTOCOL; break; case ICMP6_PARAMPROB_HEADER: case ICMP6_PARAMPROB_OPTION: code = PRC_PARAMPROB; break; default: goto badcode; } goto deliver; break; case ICMP6_ECHO_REQUEST: icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_echo); if (code != 0) goto badcode; if ((n = m_copy(m, 0, M_COPYALL)) == NULL) { /* Give up remote */ break; } if ((n->m_flags & M_EXT) != 0 || n->m_len < off + sizeof(struct icmp6_hdr)) { struct mbuf *n0 = n; const int maxlen = sizeof(*nip6) + sizeof(*nicmp6); int n0len; MGETHDR(n, M_DONTWAIT, n0->m_type); n0len = n0->m_pkthdr.len; /* save for use below */ if (n) M_MOVE_PKTHDR(n, n0); if (n && maxlen >= MHLEN) { MCLGET(n, M_DONTWAIT); if ((n->m_flags & M_EXT) == 0) { m_free(n); n = NULL; } } if (n == NULL) { /* Give up remote */ m_freem(n0); break; } /* * Copy IPv6 and ICMPv6 only. */ nip6 = mtod(n, struct ip6_hdr *); bcopy(ip6, nip6, sizeof(struct ip6_hdr)); nicmp6 = (struct icmp6_hdr *)(nip6 + 1); bcopy(icmp6, nicmp6, sizeof(struct icmp6_hdr)); noff = sizeof(struct ip6_hdr); /* new mbuf contains only ipv6+icmpv6 headers */ n->m_len = noff + sizeof(struct icmp6_hdr); /* * Adjust mbuf. ip6_plen will be adjusted in * ip6_output(). */ m_adj(n0, off + sizeof(struct icmp6_hdr)); /* recalculate complete packet size */ n->m_pkthdr.len = n0len + (noff - off); n->m_next = n0; } else { nip6 = mtod(n, struct ip6_hdr *); IP6_EXTHDR_GET(nicmp6, struct icmp6_hdr *, n, off, sizeof(*nicmp6)); noff = off; } nicmp6->icmp6_type = ICMP6_ECHO_REPLY; nicmp6->icmp6_code = 0; if (n) { V_icmp6stat.icp6s_reflect++; V_icmp6stat.icp6s_outhist[ICMP6_ECHO_REPLY]++; icmp6_reflect(n, noff); } break; case ICMP6_ECHO_REPLY: icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_echoreply); if (code != 0) goto badcode; break; case MLD_LISTENER_QUERY: case MLD_LISTENER_REPORT: if (icmp6len < sizeof(struct mld_hdr)) goto badlen; if (icmp6->icmp6_type == MLD_LISTENER_QUERY) /* XXX: ugly... 
*/ icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_mldquery); else icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_mldreport); if ((n = m_copym(m, 0, M_COPYALL, M_DONTWAIT)) == NULL) { /* give up local */ mld6_input(m, off); m = NULL; goto freeit; } mld6_input(n, off); /* m stays. */ break; case MLD_LISTENER_DONE: icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_mlddone); if (icmp6len < sizeof(struct mld_hdr)) /* necessary? */ goto badlen; break; /* nothing to be done in kernel */ case MLD_MTRACE_RESP: case MLD_MTRACE: /* XXX: these two are experimental. not officially defined. */ /* XXX: per-interface statistics? */ break; /* just pass it to applications */ case ICMP6_WRUREQUEST: /* ICMP6_FQDN_QUERY */ { enum { WRU, FQDN } mode; if (!V_icmp6_nodeinfo) break; if (icmp6len == sizeof(struct icmp6_hdr) + 4) mode = WRU; else if (icmp6len >= sizeof(struct icmp6_nodeinfo)) mode = FQDN; else goto badlen; #define hostnamelen strlen(V_hostname) if (mode == FQDN) { #ifndef PULLDOWN_TEST IP6_EXTHDR_CHECK(m, off, sizeof(struct icmp6_nodeinfo), IPPROTO_DONE); #endif n = m_copy(m, 0, M_COPYALL); if (n) n = ni6_input(n, off); /* XXX meaningless if n == NULL */ noff = sizeof(struct ip6_hdr); } else { u_char *p; int maxlen, maxhlen; /* * XXX: this combination of flags is pointless, * but should we keep this for compatibility? */ if ((V_icmp6_nodeinfo & 5) != 5) break; if (code != 0) goto badcode; maxlen = sizeof(*nip6) + sizeof(*nicmp6) + 4; if (maxlen >= MCLBYTES) { /* Give up remote */ break; } MGETHDR(n, M_DONTWAIT, m->m_type); if (n && maxlen > MHLEN) { MCLGET(n, M_DONTWAIT); if ((n->m_flags & M_EXT) == 0) { m_free(n); n = NULL; } } if (n && !m_dup_pkthdr(n, m, M_DONTWAIT)) { /* * Previous code did a blind M_COPY_PKTHDR * and said "just for rcvif". If true, then * we could tolerate the dup failing (due to * the deep copy of the tag chain). For now * be conservative and just fail. */ m_free(n); n = NULL; } if (n == NULL) { /* Give up remote */ break; } n->m_pkthdr.rcvif = NULL; n->m_len = 0; maxhlen = M_TRAILINGSPACE(n) - maxlen; mtx_lock(&hostname_mtx); if (maxhlen > hostnamelen) maxhlen = hostnamelen; /* * Copy IPv6 and ICMPv6 only. */ nip6 = mtod(n, struct ip6_hdr *); bcopy(ip6, nip6, sizeof(struct ip6_hdr)); nicmp6 = (struct icmp6_hdr *)(nip6 + 1); bcopy(icmp6, nicmp6, sizeof(struct icmp6_hdr)); p = (u_char *)(nicmp6 + 1); bzero(p, 4); bcopy(V_hostname, p + 4, maxhlen); /* meaningless TTL */ mtx_unlock(&hostname_mtx); noff = sizeof(struct ip6_hdr); n->m_pkthdr.len = n->m_len = sizeof(struct ip6_hdr) + sizeof(struct icmp6_hdr) + 4 + maxhlen; nicmp6->icmp6_type = ICMP6_WRUREPLY; nicmp6->icmp6_code = 0; } #undef hostnamelen if (n) { V_icmp6stat.icp6s_reflect++; V_icmp6stat.icp6s_outhist[ICMP6_WRUREPLY]++; icmp6_reflect(n, noff); } break; } case ICMP6_WRUREPLY: if (code != 0) goto badcode; break; case ND_ROUTER_SOLICIT: icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_routersolicit); if (code != 0) goto badcode; if (icmp6len < sizeof(struct nd_router_solicit)) goto badlen; if ((n = m_copym(m, 0, M_COPYALL, M_DONTWAIT)) == NULL) { /* give up local */ nd6_rs_input(m, off, icmp6len); m = NULL; goto freeit; } nd6_rs_input(n, off, icmp6len); /* m stays. */ break; case ND_ROUTER_ADVERT: icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_routeradvert); if (code != 0) goto badcode; if (icmp6len < sizeof(struct nd_router_advert)) goto badlen; if ((n = m_copym(m, 0, M_COPYALL, M_DONTWAIT)) == NULL) { /* give up local */ nd6_ra_input(m, off, icmp6len); m = NULL; goto freeit; } nd6_ra_input(n, off, icmp6len); /* m stays. 
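The who-are-you handler above distinguishes the old 4-byte WRU form from a full node-information query purely by message length. A minimal restatement of that classification:

#include <stddef.h>
#include <netinet/icmp6.h>

enum ni_mode { NI_WRU, NI_FQDN, NI_BADLEN };

/* Classify an ICMP6_WRUREQUEST by its length, as icmp6_input() does. */
static enum ni_mode
wru_mode(size_t icmp6len)
{
    if (icmp6len == sizeof(struct icmp6_hdr) + 4)
        return NI_WRU;      /* old 4-byte "who-are-you" form */
    if (icmp6len >= sizeof(struct icmp6_nodeinfo))
        return NI_FQDN;     /* node information query form */
    return NI_BADLEN;       /* counted in icp6s_badlen */
}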
*/ break; case ND_NEIGHBOR_SOLICIT: icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_neighborsolicit); if (code != 0) goto badcode; if (icmp6len < sizeof(struct nd_neighbor_solicit)) goto badlen; if ((n = m_copym(m, 0, M_COPYALL, M_DONTWAIT)) == NULL) { /* give up local */ nd6_ns_input(m, off, icmp6len); m = NULL; goto freeit; } nd6_ns_input(n, off, icmp6len); /* m stays. */ break; case ND_NEIGHBOR_ADVERT: icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_neighboradvert); if (code != 0) goto badcode; if (icmp6len < sizeof(struct nd_neighbor_advert)) goto badlen; if ((n = m_copym(m, 0, M_COPYALL, M_DONTWAIT)) == NULL) { /* give up local */ nd6_na_input(m, off, icmp6len); m = NULL; goto freeit; } nd6_na_input(n, off, icmp6len); /* m stays. */ break; case ND_REDIRECT: icmp6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_redirect); if (code != 0) goto badcode; if (icmp6len < sizeof(struct nd_redirect)) goto badlen; if ((n = m_copym(m, 0, M_COPYALL, M_DONTWAIT)) == NULL) { /* give up local */ icmp6_redirect_input(m, off); m = NULL; goto freeit; } icmp6_redirect_input(n, off); /* m stays. */ break; case ICMP6_ROUTER_RENUMBERING: if (code != ICMP6_ROUTER_RENUMBERING_COMMAND && code != ICMP6_ROUTER_RENUMBERING_RESULT) goto badcode; if (icmp6len < sizeof(struct icmp6_router_renum)) goto badlen; break; default: nd6log((LOG_DEBUG, "icmp6_input: unknown type %d(src=%s, dst=%s, ifid=%d)\n", icmp6->icmp6_type, ip6_sprintf(ip6bufs, &ip6->ip6_src), ip6_sprintf(ip6bufd, &ip6->ip6_dst), m->m_pkthdr.rcvif ? m->m_pkthdr.rcvif->if_index : 0)); if (icmp6->icmp6_type < ICMP6_ECHO_REQUEST) { /* ICMPv6 error: MUST deliver it by spec... */ code = PRC_NCMDS; /* deliver */ } else { /* ICMPv6 informational: MUST not deliver */ break; } deliver: if (icmp6_notify_error(&m, off, icmp6len, code)) { /* In this case, m should've been freed. */ return (IPPROTO_DONE); } break; badcode: V_icmp6stat.icp6s_badcode++; break; badlen: V_icmp6stat.icp6s_badlen++; break; } /* deliver the packet to appropriate sockets */ icmp6_rip6_input(&m, *offp); return IPPROTO_DONE; freeit: m_freem(m); return IPPROTO_DONE; } static int icmp6_notify_error(struct mbuf **mp, int off, int icmp6len, int code) { INIT_VNET_INET6(curvnet); struct mbuf *m = *mp; struct icmp6_hdr *icmp6; struct ip6_hdr *eip6; u_int32_t notifymtu; struct sockaddr_in6 icmp6src, icmp6dst; if (icmp6len < sizeof(struct icmp6_hdr) + sizeof(struct ip6_hdr)) { V_icmp6stat.icp6s_tooshort++; goto freeit; } #ifndef PULLDOWN_TEST IP6_EXTHDR_CHECK(m, off, sizeof(struct icmp6_hdr) + sizeof(struct ip6_hdr), -1); icmp6 = (struct icmp6_hdr *)(mtod(m, caddr_t) + off); #else IP6_EXTHDR_GET(icmp6, struct icmp6_hdr *, m, off, sizeof(*icmp6) + sizeof(struct ip6_hdr)); if (icmp6 == NULL) { V_icmp6stat.icp6s_tooshort++; return (-1); } #endif eip6 = (struct ip6_hdr *)(icmp6 + 1); /* Detect the upper level protocol */ { void (*ctlfunc)(int, struct sockaddr *, void *); u_int8_t nxt = eip6->ip6_nxt; int eoff = off + sizeof(struct icmp6_hdr) + sizeof(struct ip6_hdr); struct ip6ctlparam ip6cp; struct in6_addr *finaldst = NULL; int icmp6type = icmp6->icmp6_type; struct ip6_frag *fh; struct ip6_rthdr *rth; struct ip6_rthdr0 *rth0; int rthlen; while (1) { /* XXX: should avoid infinite loop explicitly? 
*/ struct ip6_ext *eh; switch (nxt) { case IPPROTO_HOPOPTS: case IPPROTO_DSTOPTS: case IPPROTO_AH: #ifndef PULLDOWN_TEST IP6_EXTHDR_CHECK(m, 0, eoff + sizeof(struct ip6_ext), -1); eh = (struct ip6_ext *)(mtod(m, caddr_t) + eoff); #else IP6_EXTHDR_GET(eh, struct ip6_ext *, m, eoff, sizeof(*eh)); if (eh == NULL) { V_icmp6stat.icp6s_tooshort++; return (-1); } #endif if (nxt == IPPROTO_AH) eoff += (eh->ip6e_len + 2) << 2; else eoff += (eh->ip6e_len + 1) << 3; nxt = eh->ip6e_nxt; break; case IPPROTO_ROUTING: /* * When the erroneous packet contains a * routing header, we should examine the * header to determine the final destination. * Otherwise, we can't properly update * information that depends on the final * destination (e.g. path MTU). */ #ifndef PULLDOWN_TEST IP6_EXTHDR_CHECK(m, 0, eoff + sizeof(*rth), -1); rth = (struct ip6_rthdr *) (mtod(m, caddr_t) + eoff); #else IP6_EXTHDR_GET(rth, struct ip6_rthdr *, m, eoff, sizeof(*rth)); if (rth == NULL) { V_icmp6stat.icp6s_tooshort++; return (-1); } #endif rthlen = (rth->ip6r_len + 1) << 3; /* * XXX: currently there is no * officially defined type other * than type-0. * Note that if the segment left field * is 0, all intermediate hops must * have been passed. */ if (rth->ip6r_segleft && rth->ip6r_type == IPV6_RTHDR_TYPE_0) { int hops; #ifndef PULLDOWN_TEST IP6_EXTHDR_CHECK(m, 0, eoff + rthlen, -1); rth0 = (struct ip6_rthdr0 *) (mtod(m, caddr_t) + eoff); #else IP6_EXTHDR_GET(rth0, struct ip6_rthdr0 *, m, eoff, rthlen); if (rth0 == NULL) { V_icmp6stat.icp6s_tooshort++; return (-1); } #endif /* just ignore a bogus header */ if ((rth0->ip6r0_len % 2) == 0 && (hops = rth0->ip6r0_len/2)) finaldst = (struct in6_addr *)(rth0 + 1) + (hops - 1); } eoff += rthlen; nxt = rth->ip6r_nxt; break; case IPPROTO_FRAGMENT: #ifndef PULLDOWN_TEST IP6_EXTHDR_CHECK(m, 0, eoff + sizeof(struct ip6_frag), -1); fh = (struct ip6_frag *)(mtod(m, caddr_t) + eoff); #else IP6_EXTHDR_GET(fh, struct ip6_frag *, m, eoff, sizeof(*fh)); if (fh == NULL) { V_icmp6stat.icp6s_tooshort++; return (-1); } #endif /* * Data after a fragment header is meaningless * unless it is the first fragment, but * we'll go to the notify label for path MTU * discovery. */ if (fh->ip6f_offlg & IP6F_OFF_MASK) goto notify; eoff += sizeof(struct ip6_frag); nxt = fh->ip6f_nxt; break; default: /* * This case includes ESP and the No Next * Header. In such cases going to the notify * label does not have any meaning * (i.e. ctlfunc will be NULL), but we go * anyway since we might have to update * path MTU information. */ goto notify; } } notify: #ifndef PULLDOWN_TEST icmp6 = (struct icmp6_hdr *)(mtod(m, caddr_t) + off); #else IP6_EXTHDR_GET(icmp6, struct icmp6_hdr *, m, off, sizeof(*icmp6) + sizeof(struct ip6_hdr)); if (icmp6 == NULL) { V_icmp6stat.icp6s_tooshort++; return (-1); } #endif /* * retrieve parameters from the inner IPv6 header, and convert * them into sockaddr structures. * XXX: there is no guarantee that the source or destination * addresses of the inner packet are in the same scope as * the addresses of the icmp packet. But there is no other * way to determine the zone. 
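The header walk above advances by a different stride for AH than for the other extension headers: AH expresses its length in 4-byte units not counting the first two, while hop-by-hop, destination-options and routing headers use 8-byte units not counting the first one. A standalone helper capturing just that arithmetic:

#include <stddef.h>
#include <stdint.h>
#include <netinet/in.h>
#include <netinet/ip6.h>

/* Byte length of one extension header, given its generic ip6_ext view. */
static size_t
ext_hdr_len(uint8_t nxt, const struct ip6_ext *eh)
{
    if (nxt == IPPROTO_AH)
        return ((size_t)eh->ip6e_len + 2) << 2;   /* AH: 4-byte units */
    return ((size_t)eh->ip6e_len + 1) << 3;       /* others: 8-byte units */
}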
*/ eip6 = (struct ip6_hdr *)(icmp6 + 1); bzero(&icmp6dst, sizeof(icmp6dst)); icmp6dst.sin6_len = sizeof(struct sockaddr_in6); icmp6dst.sin6_family = AF_INET6; if (finaldst == NULL) icmp6dst.sin6_addr = eip6->ip6_dst; else icmp6dst.sin6_addr = *finaldst; if (in6_setscope(&icmp6dst.sin6_addr, m->m_pkthdr.rcvif, NULL)) goto freeit; bzero(&icmp6src, sizeof(icmp6src)); icmp6src.sin6_len = sizeof(struct sockaddr_in6); icmp6src.sin6_family = AF_INET6; icmp6src.sin6_addr = eip6->ip6_src; if (in6_setscope(&icmp6src.sin6_addr, m->m_pkthdr.rcvif, NULL)) goto freeit; icmp6src.sin6_flowinfo = (eip6->ip6_flow & IPV6_FLOWLABEL_MASK); if (finaldst == NULL) finaldst = &eip6->ip6_dst; ip6cp.ip6c_m = m; ip6cp.ip6c_icmp6 = icmp6; ip6cp.ip6c_ip6 = (struct ip6_hdr *)(icmp6 + 1); ip6cp.ip6c_off = eoff; ip6cp.ip6c_finaldst = finaldst; ip6cp.ip6c_src = &icmp6src; ip6cp.ip6c_nxt = nxt; if (icmp6type == ICMP6_PACKET_TOO_BIG) { notifymtu = ntohl(icmp6->icmp6_mtu); ip6cp.ip6c_cmdarg = (void *)¬ifymtu; icmp6_mtudisc_update(&ip6cp, 1); /*XXX*/ } ctlfunc = (void (*)(int, struct sockaddr *, void *)) (inet6sw[ip6_protox[nxt]].pr_ctlinput); if (ctlfunc) { (void) (*ctlfunc)(code, (struct sockaddr *)&icmp6dst, &ip6cp); } } *mp = m; return (0); freeit: m_freem(m); return (-1); } void icmp6_mtudisc_update(struct ip6ctlparam *ip6cp, int validated) { INIT_VNET_INET6(curvnet); struct in6_addr *dst = ip6cp->ip6c_finaldst; struct icmp6_hdr *icmp6 = ip6cp->ip6c_icmp6; struct mbuf *m = ip6cp->ip6c_m; /* will be necessary for scope issue */ u_int mtu = ntohl(icmp6->icmp6_mtu); struct in_conninfo inc; #if 0 /* * RFC2460 section 5, last paragraph. * even though minimum link MTU for IPv6 is IPV6_MMTU, * we may see ICMPv6 too big with mtu < IPV6_MMTU * due to packet translator in the middle. * see ip6_output() and ip6_getpmtu() "alwaysfrag" case for * special handling. */ if (mtu < IPV6_MMTU) return; #endif /* * we reject ICMPv6 too big with abnormally small value. * XXX what is the good definition of "abnormally small"? */ if (mtu < sizeof(struct ip6_hdr) + sizeof(struct ip6_frag) + 8) return; if (!validated) return; /* * In case the suggested mtu is less than IPV6_MMTU, we * only need to remember that it was for above mentioned * "alwaysfrag" case. * Try to be as close to the spec as possible. */ if (mtu < IPV6_MMTU) mtu = IPV6_MMTU - 8; bzero(&inc, sizeof(inc)); inc.inc_flags = 1; /* IPv6 */ inc.inc6_faddr = *dst; if (in6_setscope(&inc.inc6_faddr, m->m_pkthdr.rcvif, NULL)) return; if (mtu < tcp_maxmtu6(&inc, NULL)) { tcp_hc_updatemtu(&inc, mtu); V_icmp6stat.icp6s_pmtuchg++; } } /* * Process a Node Information Query packet, based on * draft-ietf-ipngwg-icmp-name-lookups-07. 
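icmp6_mtudisc_update() above accepts the advertised MTU only after sanity checks: abnormally small values are dropped, and values below the IPv6 minimum link MTU are recorded as IPV6_MMTU - 8 so that ip6_output() later takes its "alwaysfrag" path. A sketch of that clamping, defining IPV6_MMTU locally in case the installed headers do not provide it:

#include <netinet/in.h>
#include <netinet/ip6.h>

#ifndef IPV6_MMTU
#define IPV6_MMTU 1280          /* minimum IPv6 link MTU, RFC 2460 */
#endif

/* Returns 0 for an MTU that should be ignored, otherwise the value to record. */
static unsigned int
sanitize_pmtu(unsigned int mtu)
{
    if (mtu < sizeof(struct ip6_hdr) + sizeof(struct ip6_frag) + 8)
        return 0;               /* abnormally small: reject */
    if (mtu < IPV6_MMTU)
        return IPV6_MMTU - 8;   /* remember only that fragmentation is needed */
    return mtu;
}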
* * Spec incompatibilities: * - IPv6 Subject address handling * - IPv4 Subject address handling support missing * - Proxy reply (answer even if it's not for me) * - joins NI group address at in6_ifattach() time only, does not cope * with hostname changes by sethostname(3) */ #define hostnamelen strlen(V_hostname) static struct mbuf * ni6_input(struct mbuf *m, int off) { INIT_VNET_INET6(curvnet); INIT_VPROCG(TD_TO_VPROCG(curthread)); /* XXX V_hostname needs this */ struct icmp6_nodeinfo *ni6, *nni6; struct mbuf *n = NULL; u_int16_t qtype; int subjlen; int replylen = sizeof(struct ip6_hdr) + sizeof(struct icmp6_nodeinfo); struct ni_reply_fqdn *fqdn; int addrs; /* for NI_QTYPE_NODEADDR */ struct ifnet *ifp = NULL; /* for NI_QTYPE_NODEADDR */ struct in6_addr in6_subj; /* subject address */ struct ip6_hdr *ip6; int oldfqdn = 0; /* if 1, return pascal string (03 draft) */ char *subj = NULL; struct in6_ifaddr *ia6 = NULL; ip6 = mtod(m, struct ip6_hdr *); #ifndef PULLDOWN_TEST ni6 = (struct icmp6_nodeinfo *)(mtod(m, caddr_t) + off); #else IP6_EXTHDR_GET(ni6, struct icmp6_nodeinfo *, m, off, sizeof(*ni6)); if (ni6 == NULL) { /* m is already reclaimed */ return (NULL); } #endif /* * Validate IPv6 source address. * The default configuration MUST be to refuse answering queries from * global-scope addresses according to RFC4602. * Notes: * - it's not very clear what "refuse" means; this implementation * simply drops it. * - it's not very easy to identify global-scope (unicast) addresses * since there are many prefixes for them. It should be safer * and in practice sufficient to check "all" but loopback and * link-local (note that site-local unicast was deprecated and * ULA is defined as global scope-wise) */ if ((V_icmp6_nodeinfo & ICMP6_NODEINFO_GLOBALOK) == 0 && !IN6_IS_ADDR_LOOPBACK(&ip6->ip6_src) && !IN6_IS_ADDR_LINKLOCAL(&ip6->ip6_src)) goto bad; /* * Validate IPv6 destination address. * * The Responder must discard the Query without further processing * unless it is one of the Responder's unicast or anycast addresses, or * a link-local scope multicast address which the Responder has joined. * [RFC4602, Section 5.] */ if (IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst)) { if (!IN6_IS_ADDR_MC_LINKLOCAL(&ip6->ip6_dst)) goto bad; /* else it's a link-local multicast, fine */ } else { /* unicast or anycast */ if ((ia6 = ip6_getdstifaddr(m)) == NULL) goto bad; /* XXX impossible */ if ((ia6->ia6_flags & IN6_IFF_TEMPORARY) && !(V_icmp6_nodeinfo & ICMP6_NODEINFO_TMPADDROK)) { nd6log((LOG_DEBUG, "ni6_input: ignore node info to " "a temporary address in %s:%d", __FILE__, __LINE__)); goto bad; } } /* validate query Subject field. */ qtype = ntohs(ni6->ni_qtype); subjlen = m->m_pkthdr.len - off - sizeof(struct icmp6_nodeinfo); switch (qtype) { case NI_QTYPE_NOOP: case NI_QTYPE_SUPTYPES: /* 07 draft */ if (ni6->ni_code == ICMP6_NI_SUBJ_FQDN && subjlen == 0) break; /* FALLTHROUGH */ case NI_QTYPE_FQDN: case NI_QTYPE_NODEADDR: case NI_QTYPE_IPV4ADDR: switch (ni6->ni_code) { case ICMP6_NI_SUBJ_IPV6: #if ICMP6_NI_SUBJ_IPV6 != 0 case 0: #endif /* * backward compatibility - try to accept 03 draft * format, where no Subject is present. */ if (qtype == NI_QTYPE_FQDN && ni6->ni_code == 0 && subjlen == 0) { oldfqdn++; break; } #if ICMP6_NI_SUBJ_IPV6 != 0 if (ni6->ni_code != ICMP6_NI_SUBJ_IPV6) goto bad; #endif if (subjlen != sizeof(struct in6_addr)) goto bad; /* * Validate Subject address. * * Not sure what exactly "address belongs to the node" * means in the spec, is it just unicast, or what? 
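ni6_input() below refuses node-information queries from global-scope sources unless the GLOBALOK bit is set in the nodeinfo sysctl; loopback and link-local sources are always acceptable. The policy reduces to the predicate sketched here; the flag value is a local stand-in for ICMP6_NODEINFO_GLOBALOK, not the kernel constant.

#include <netinet/in.h>

#define NI_GLOBALOK 0x08        /* stand-in for ICMP6_NODEINFO_GLOBALOK */

static int
ni_source_allowed(const struct in6_addr *src, int nodeinfo_flags)
{
    if (nodeinfo_flags & NI_GLOBALOK)
        return 1;
    return IN6_IS_ADDR_LOOPBACK(src) || IN6_IS_ADDR_LINKLOCAL(src);
}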
* * At this moment we consider Subject address as * "belong to the node" if the Subject address equals * to the IPv6 destination address; validation for * IPv6 destination address should have done enough * check for us. * * We do not do proxy at this moment. */ /* m_pulldown instead of copy? */ m_copydata(m, off + sizeof(struct icmp6_nodeinfo), subjlen, (caddr_t)&in6_subj); if (in6_setscope(&in6_subj, m->m_pkthdr.rcvif, NULL)) goto bad; subj = (char *)&in6_subj; if (IN6_ARE_ADDR_EQUAL(&ip6->ip6_dst, &in6_subj)) break; /* * XXX if we are to allow other cases, we should really * be careful about scope here. * basically, we should disallow queries toward IPv6 * destination X with subject Y, * if scope(X) > scope(Y). * if we allow scope(X) > scope(Y), it will result in * information leakage across scope boundary. */ goto bad; case ICMP6_NI_SUBJ_FQDN: /* * Validate Subject name with gethostname(3). * * The behavior may need some debate, since: * - we are not sure if the node has FQDN as * hostname (returned by gethostname(3)). * - the code does wildcard match for truncated names. * however, we are not sure if we want to perform * wildcard match, if gethostname(3) side has * truncated hostname. */ mtx_lock(&hostname_mtx); n = ni6_nametodns(V_hostname, hostnamelen, 0); mtx_unlock(&hostname_mtx); if (!n || n->m_next || n->m_len == 0) goto bad; IP6_EXTHDR_GET(subj, char *, m, off + sizeof(struct icmp6_nodeinfo), subjlen); if (subj == NULL) goto bad; if (!ni6_dnsmatch(subj, subjlen, mtod(n, const char *), n->m_len)) { goto bad; } m_freem(n); n = NULL; break; case ICMP6_NI_SUBJ_IPV4: /* XXX: to be implemented? */ default: goto bad; } break; } /* refuse based on configuration. XXX ICMP6_NI_REFUSED? */ switch (qtype) { case NI_QTYPE_FQDN: if ((V_icmp6_nodeinfo & ICMP6_NODEINFO_FQDNOK) == 0) goto bad; break; case NI_QTYPE_NODEADDR: case NI_QTYPE_IPV4ADDR: if ((V_icmp6_nodeinfo & ICMP6_NODEINFO_NODEADDROK) == 0) goto bad; break; } /* guess reply length */ switch (qtype) { case NI_QTYPE_NOOP: break; /* no reply data */ case NI_QTYPE_SUPTYPES: replylen += sizeof(u_int32_t); break; case NI_QTYPE_FQDN: /* XXX will append an mbuf */ replylen += offsetof(struct ni_reply_fqdn, ni_fqdn_namelen); break; case NI_QTYPE_NODEADDR: addrs = ni6_addrs(ni6, m, &ifp, (struct in6_addr *)subj); if ((replylen += addrs * (sizeof(struct in6_addr) + sizeof(u_int32_t))) > MCLBYTES) replylen = MCLBYTES; /* XXX: will truncate pkt later */ break; case NI_QTYPE_IPV4ADDR: /* unsupported - should respond with unknown Qtype? */ break; default: /* * XXX: We must return a reply with the ICMP6 code * `unknown Qtype' in this case. However we regard the case * as an FQDN query for backward compatibility. * Older versions set a random value to this field, * so it rarely varies in the defined qtypes. * But the mechanism is not reliable... * maybe we should obsolete older versions. */ qtype = NI_QTYPE_FQDN; /* XXX will append an mbuf */ replylen += offsetof(struct ni_reply_fqdn, ni_fqdn_namelen); oldfqdn++; break; } /* allocate an mbuf to reply. */ MGETHDR(n, M_DONTWAIT, m->m_type); if (n == NULL) { m_freem(m); return (NULL); } M_MOVE_PKTHDR(n, m); /* just for recvif */ if (replylen > MHLEN) { if (replylen > MCLBYTES) { /* * XXX: should we try to allocate more? But MCLBYTES * is probably much larger than IPV6_MMTU... 
*/ goto bad; } MCLGET(n, M_DONTWAIT); if ((n->m_flags & M_EXT) == 0) { goto bad; } } n->m_pkthdr.len = n->m_len = replylen; /* copy mbuf header and IPv6 + Node Information base headers */ bcopy(mtod(m, caddr_t), mtod(n, caddr_t), sizeof(struct ip6_hdr)); nni6 = (struct icmp6_nodeinfo *)(mtod(n, struct ip6_hdr *) + 1); bcopy((caddr_t)ni6, (caddr_t)nni6, sizeof(struct icmp6_nodeinfo)); /* qtype dependent procedure */ switch (qtype) { case NI_QTYPE_NOOP: nni6->ni_code = ICMP6_NI_SUCCESS; nni6->ni_flags = 0; break; case NI_QTYPE_SUPTYPES: { u_int32_t v; nni6->ni_code = ICMP6_NI_SUCCESS; nni6->ni_flags = htons(0x0000); /* raw bitmap */ /* supports NOOP, SUPTYPES, FQDN, and NODEADDR */ v = (u_int32_t)htonl(0x0000000f); bcopy(&v, nni6 + 1, sizeof(u_int32_t)); break; } case NI_QTYPE_FQDN: nni6->ni_code = ICMP6_NI_SUCCESS; fqdn = (struct ni_reply_fqdn *)(mtod(n, caddr_t) + sizeof(struct ip6_hdr) + sizeof(struct icmp6_nodeinfo)); nni6->ni_flags = 0; /* XXX: meaningless TTL */ fqdn->ni_fqdn_ttl = 0; /* ditto. */ /* * XXX do we really have FQDN in variable "hostname"? */ mtx_lock(&hostname_mtx); n->m_next = ni6_nametodns(V_hostname, hostnamelen, oldfqdn); mtx_unlock(&hostname_mtx); if (n->m_next == NULL) goto bad; /* XXX we assume that n->m_next is not a chain */ if (n->m_next->m_next != NULL) goto bad; n->m_pkthdr.len += n->m_next->m_len; break; case NI_QTYPE_NODEADDR: { int lenlim, copied; nni6->ni_code = ICMP6_NI_SUCCESS; n->m_pkthdr.len = n->m_len = sizeof(struct ip6_hdr) + sizeof(struct icmp6_nodeinfo); lenlim = M_TRAILINGSPACE(n); copied = ni6_store_addrs(ni6, nni6, ifp, lenlim); /* XXX: reset mbuf length */ n->m_pkthdr.len = n->m_len = sizeof(struct ip6_hdr) + sizeof(struct icmp6_nodeinfo) + copied; break; } default: break; /* XXX impossible! */ } nni6->ni_type = ICMP6_NI_REPLY; m_freem(m); return (n); bad: m_freem(m); if (n) m_freem(n); return (NULL); } #undef hostnamelen /* * make a mbuf with DNS-encoded string. no compression support. * * XXX names with less than 2 dots (like "foo" or "foo.section") will be * treated as truncated name (two \0 at the end). this is a wild guess. * * old - return pascal string if non-zero */ static struct mbuf * ni6_nametodns(const char *name, int namelen, int old) { struct mbuf *m; char *cp, *ep; const char *p, *q; int i, len, nterm; if (old) len = namelen + 1; else len = MCLBYTES; /* because MAXHOSTNAMELEN is usually 256, we use cluster mbuf */ MGET(m, M_DONTWAIT, MT_DATA); if (m && len > MLEN) { MCLGET(m, M_DONTWAIT); if ((m->m_flags & M_EXT) == 0) goto fail; } if (!m) goto fail; m->m_next = NULL; if (old) { m->m_len = len; *mtod(m, char *) = namelen; bcopy(name, mtod(m, char *) + 1, namelen); return m; } else { m->m_len = 0; cp = mtod(m, char *); ep = mtod(m, char *) + M_TRAILINGSPACE(m); /* if not certain about my name, return empty buffer */ if (namelen == 0) return m; /* * guess if it looks like shortened hostname, or FQDN. * shortened hostname needs two trailing "\0". */ i = 0; for (p = name; p < name + namelen; p++) { if (*p && *p == '.') i++; } if (i < 2) nterm = 2; else nterm = 1; p = name; while (cp < ep && p < name + namelen) { i = 0; for (q = p; q < name + namelen && *q && *q != '.'; q++) i++; /* result does not fit into mbuf */ if (cp + i + 1 >= ep) goto fail; /* * DNS label length restriction, RFC1035 page 8. * "i == 0" case is included here to avoid returning * 0-length label on "foo..bar". 
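ni6_nametodns() above emits the hostname as DNS-style length-prefixed labels, limits each label to 63 bytes, and terminates a name that looks like a shortened hostname (fewer than two dots) with two zero bytes instead of one. A self-contained userland sketch of the same encoding:

#include <stddef.h>
#include <string.h>

/* Returns the encoded length, or 0 on a bad label or lack of space. */
static size_t
encode_dns_name(const char *name, size_t namelen, char *out, size_t outlen)
{
    const char *p = name, *end = name + namelen;
    char *cp = out, *ep = out + outlen;
    size_t dots = 0, i, nterm;

    for (i = 0; i < namelen; i++)
        if (name[i] == '.')
            dots++;
    nterm = (dots < 2) ? 2 : 1;     /* shortened hostnames get two terminators */

    while (p < end) {
        const char *q = memchr(p, '.', end - p);
        size_t l = q ? (size_t)(q - p) : (size_t)(end - p);

        if (l == 0 || l > 63 || cp + 1 + l >= ep)
            return 0;               /* empty label, oversized label, or no room */
        *cp++ = (char)l;
        memcpy(cp, p, l);
        cp += l;
        p = q ? q + 1 : end;
    }
    if (cp + nterm > ep)
        return 0;
    while (nterm-- > 0)
        *cp++ = '\0';
    return (size_t)(cp - out);
}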
*/ if (i <= 0 || i >= 64) goto fail; *cp++ = i; bcopy(p, cp, i); cp += i; p = q; if (p < name + namelen && *p == '.') p++; } /* termination */ if (cp + nterm >= ep) goto fail; while (nterm-- > 0) *cp++ = '\0'; m->m_len = cp - mtod(m, char *); return m; } panic("should not reach here"); /* NOTREACHED */ fail: if (m) m_freem(m); return NULL; } /* * check if two DNS-encoded string matches. takes care of truncated * form (with \0\0 at the end). no compression support. * XXX upper/lowercase match (see RFC2065) */ static int ni6_dnsmatch(const char *a, int alen, const char *b, int blen) { const char *a0, *b0; int l; /* simplest case - need validation? */ if (alen == blen && bcmp(a, b, alen) == 0) return 1; a0 = a; b0 = b; /* termination is mandatory */ if (alen < 2 || blen < 2) return 0; if (a0[alen - 1] != '\0' || b0[blen - 1] != '\0') return 0; alen--; blen--; while (a - a0 < alen && b - b0 < blen) { if (a - a0 + 1 > alen || b - b0 + 1 > blen) return 0; if ((signed char)a[0] < 0 || (signed char)b[0] < 0) return 0; /* we don't support compression yet */ if (a[0] >= 64 || b[0] >= 64) return 0; /* truncated case */ if (a[0] == 0 && a - a0 == alen - 1) return 1; if (b[0] == 0 && b - b0 == blen - 1) return 1; if (a[0] == 0 || b[0] == 0) return 0; if (a[0] != b[0]) return 0; l = a[0]; if (a - a0 + 1 + l > alen || b - b0 + 1 + l > blen) return 0; if (bcmp(a + 1, b + 1, l) != 0) return 0; a += 1 + l; b += 1 + l; } if (a - a0 == alen && b - b0 == blen) return 1; else return 0; } /* * calculate the number of addresses to be returned in the node info reply. */ static int ni6_addrs(struct icmp6_nodeinfo *ni6, struct mbuf *m, struct ifnet **ifpp, struct in6_addr *subj) { INIT_VNET_NET(curvnet); INIT_VNET_INET6(curvnet); struct ifnet *ifp; struct in6_ifaddr *ifa6; struct ifaddr *ifa; int addrs = 0, addrsofif, iffound = 0; int niflags = ni6->ni_flags; if ((niflags & NI_NODEADDR_FLAG_ALL) == 0) { switch (ni6->ni_code) { case ICMP6_NI_SUBJ_IPV6: if (subj == NULL) /* must be impossible... */ return (0); break; default: /* * XXX: we only support IPv6 subject address for * this Qtype. */ return (0); } } IFNET_RLOCK(); for (ifp = TAILQ_FIRST(&V_ifnet); ifp; ifp = TAILQ_NEXT(ifp, if_list)) { addrsofif = 0; TAILQ_FOREACH(ifa, &ifp->if_addrlist, ifa_list) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; ifa6 = (struct in6_ifaddr *)ifa; if ((niflags & NI_NODEADDR_FLAG_ALL) == 0 && IN6_ARE_ADDR_EQUAL(subj, &ifa6->ia_addr.sin6_addr)) iffound = 1; /* * IPv4-mapped addresses can only be returned by a * Node Information proxy, since they represent * addresses of IPv4-only nodes, which perforce do * not implement this protocol. * [icmp-name-lookups-07, Section 5.4] * So we don't support NI_NODEADDR_FLAG_COMPAT in * this function at this moment. */ /* What do we have to do about ::1? */ switch (in6_addrscope(&ifa6->ia_addr.sin6_addr)) { case IPV6_ADDR_SCOPE_LINKLOCAL: if ((niflags & NI_NODEADDR_FLAG_LINKLOCAL) == 0) continue; break; case IPV6_ADDR_SCOPE_SITELOCAL: if ((niflags & NI_NODEADDR_FLAG_SITELOCAL) == 0) continue; break; case IPV6_ADDR_SCOPE_GLOBAL: if ((niflags & NI_NODEADDR_FLAG_GLOBAL) == 0) continue; break; default: continue; } /* * check if anycast is okay. * XXX: just experimental. not in the spec. 
*/ if ((ifa6->ia6_flags & IN6_IFF_ANYCAST) != 0 && (niflags & NI_NODEADDR_FLAG_ANYCAST) == 0) continue; /* we need only unicast addresses */ if ((ifa6->ia6_flags & IN6_IFF_TEMPORARY) != 0 && (V_icmp6_nodeinfo & ICMP6_NODEINFO_TMPADDROK) == 0) { continue; } addrsofif++; /* count the address */ } if (iffound) { *ifpp = ifp; IFNET_RUNLOCK(); return (addrsofif); } addrs += addrsofif; } IFNET_RUNLOCK(); return (addrs); } static int ni6_store_addrs(struct icmp6_nodeinfo *ni6, struct icmp6_nodeinfo *nni6, struct ifnet *ifp0, int resid) { INIT_VNET_NET(curvnet); INIT_VNET_INET6(curvnet); struct ifnet *ifp = ifp0 ? ifp0 : TAILQ_FIRST(&V_ifnet); struct in6_ifaddr *ifa6; struct ifaddr *ifa; struct ifnet *ifp_dep = NULL; int copied = 0, allow_deprecated = 0; u_char *cp = (u_char *)(nni6 + 1); int niflags = ni6->ni_flags; u_int32_t ltime; if (ifp0 == NULL && !(niflags & NI_NODEADDR_FLAG_ALL)) return (0); /* needless to copy */ IFNET_RLOCK(); again: for (; ifp; ifp = TAILQ_NEXT(ifp, if_list)) { for (ifa = ifp->if_addrlist.tqh_first; ifa; ifa = ifa->ifa_list.tqe_next) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; ifa6 = (struct in6_ifaddr *)ifa; if ((ifa6->ia6_flags & IN6_IFF_DEPRECATED) != 0 && allow_deprecated == 0) { /* * prefererred address should be put before * deprecated addresses. */ /* record the interface for later search */ if (ifp_dep == NULL) ifp_dep = ifp; continue; } else if ((ifa6->ia6_flags & IN6_IFF_DEPRECATED) == 0 && allow_deprecated != 0) continue; /* we now collect deprecated addrs */ /* What do we have to do about ::1? */ switch (in6_addrscope(&ifa6->ia_addr.sin6_addr)) { case IPV6_ADDR_SCOPE_LINKLOCAL: if ((niflags & NI_NODEADDR_FLAG_LINKLOCAL) == 0) continue; break; case IPV6_ADDR_SCOPE_SITELOCAL: if ((niflags & NI_NODEADDR_FLAG_SITELOCAL) == 0) continue; break; case IPV6_ADDR_SCOPE_GLOBAL: if ((niflags & NI_NODEADDR_FLAG_GLOBAL) == 0) continue; break; default: continue; } /* * check if anycast is okay. * XXX: just experimental. not in the spec. */ if ((ifa6->ia6_flags & IN6_IFF_ANYCAST) != 0 && (niflags & NI_NODEADDR_FLAG_ANYCAST) == 0) continue; if ((ifa6->ia6_flags & IN6_IFF_TEMPORARY) != 0 && (V_icmp6_nodeinfo & ICMP6_NODEINFO_TMPADDROK) == 0) { continue; } /* now we can copy the address */ if (resid < sizeof(struct in6_addr) + sizeof(u_int32_t)) { /* * We give up much more copy. * Set the truncate flag and return. */ nni6->ni_flags |= NI_NODEADDR_FLAG_TRUNCATE; IFNET_RUNLOCK(); return (copied); } /* * Set the TTL of the address. * The TTL value should be one of the following * according to the specification: * * 1. The remaining lifetime of a DHCP lease on the * address, or * 2. The remaining Valid Lifetime of a prefix from * which the address was derived through Stateless * Autoconfiguration. * * Note that we currently do not support stateful * address configuration by DHCPv6, so the former * case can't happen. 
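The TTL stored in front of each address in a NODEADDR reply follows the rule spelled out in the comment above: infinite for an address with no expiry, otherwise the remaining lifetime in seconds in network byte order, clamped at zero once expired. Restated as a small helper; the infinite value is a local stand-in for ND6_INFINITE_LIFETIME.

#include <stdint.h>
#include <time.h>
#include <arpa/inet.h>

#define INFINITE_LIFETIME 0xffffffffU   /* stand-in for ND6_INFINITE_LIFETIME */

static uint32_t
nodeaddr_ttl(time_t expire, time_t now)
{
    if (expire == 0)
        return INFINITE_LIFETIME;
    if (expire > now)
        return htonl((uint32_t)(expire - now));
    return 0;                           /* already expired */
}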
*/ if (ifa6->ia6_lifetime.ia6t_expire == 0) ltime = ND6_INFINITE_LIFETIME; else { if (ifa6->ia6_lifetime.ia6t_expire > time_second) ltime = htonl(ifa6->ia6_lifetime.ia6t_expire - time_second); else ltime = 0; } bcopy(<ime, cp, sizeof(u_int32_t)); cp += sizeof(u_int32_t); /* copy the address itself */ bcopy(&ifa6->ia_addr.sin6_addr, cp, sizeof(struct in6_addr)); in6_clearscope((struct in6_addr *)cp); /* XXX */ cp += sizeof(struct in6_addr); resid -= (sizeof(struct in6_addr) + sizeof(u_int32_t)); copied += (sizeof(struct in6_addr) + sizeof(u_int32_t)); } if (ifp0) /* we need search only on the specified IF */ break; } if (allow_deprecated == 0 && ifp_dep != NULL) { ifp = ifp_dep; allow_deprecated = 1; goto again; } IFNET_RUNLOCK(); return (copied); } /* * XXX almost dup'ed code with rip6_input. */ static int icmp6_rip6_input(struct mbuf **mp, int off) { INIT_VNET_INET(curvnet); INIT_VNET_INET6(curvnet); struct mbuf *m = *mp; struct ip6_hdr *ip6 = mtod(m, struct ip6_hdr *); struct in6pcb *in6p; struct in6pcb *last = NULL; struct sockaddr_in6 fromsa; struct icmp6_hdr *icmp6; struct mbuf *opts = NULL; #ifndef PULLDOWN_TEST /* this is assumed to be safe. */ icmp6 = (struct icmp6_hdr *)((caddr_t)ip6 + off); #else IP6_EXTHDR_GET(icmp6, struct icmp6_hdr *, m, off, sizeof(*icmp6)); if (icmp6 == NULL) { /* m is already reclaimed */ return (IPPROTO_DONE); } #endif /* * XXX: the address may have embedded scope zone ID, which should be * hidden from applications. */ bzero(&fromsa, sizeof(fromsa)); fromsa.sin6_family = AF_INET6; fromsa.sin6_len = sizeof(struct sockaddr_in6); fromsa.sin6_addr = ip6->ip6_src; if (sa6_recoverscope(&fromsa)) { m_freem(m); return (IPPROTO_DONE); } INP_INFO_RLOCK(&V_ripcbinfo); LIST_FOREACH(in6p, &V_ripcb, inp_list) { if ((in6p->inp_vflag & INP_IPV6) == 0) continue; if (in6p->in6p_ip6_nxt != IPPROTO_ICMPV6) continue; if (!IN6_IS_ADDR_UNSPECIFIED(&in6p->in6p_laddr) && !IN6_ARE_ADDR_EQUAL(&in6p->in6p_laddr, &ip6->ip6_dst)) continue; if (!IN6_IS_ADDR_UNSPECIFIED(&in6p->in6p_faddr) && !IN6_ARE_ADDR_EQUAL(&in6p->in6p_faddr, &ip6->ip6_src)) continue; INP_RLOCK(in6p); if (ICMP6_FILTER_WILLBLOCK(icmp6->icmp6_type, in6p->in6p_icmp6filt)) { INP_RUNLOCK(in6p); continue; } if (last) { struct mbuf *n = NULL; /* * Recent network drivers tend to allocate a single * mbuf cluster, rather than to make a couple of * mbufs without clusters. Also, since the IPv6 code * path tries to avoid m_pullup(), it is highly * probable that we still have an mbuf cluster here * even though the necessary length can be stored in an * mbuf's internal buffer. * Meanwhile, the default size of the receive socket * buffer for raw sockets is not so large. This means * the possibility of packet loss is relatively higher * than before. To avoid this scenario, we copy the * received data to a separate mbuf that does not use * a cluster, if possible. * XXX: it is better to copy the data after stripping * intermediate headers. 
*/ if ((m->m_flags & M_EXT) && m->m_next == NULL && m->m_len <= MHLEN) { MGET(n, M_DONTWAIT, m->m_type); if (n != NULL) { if (m_dup_pkthdr(n, m, M_NOWAIT)) { bcopy(m->m_data, n->m_data, m->m_len); n->m_len = m->m_len; } else { m_free(n); n = NULL; } } } if (n != NULL || (n = m_copy(m, 0, (int)M_COPYALL)) != NULL) { if (last->in6p_flags & IN6P_CONTROLOPTS) ip6_savecontrol(last, n, &opts); /* strip intermediate headers */ m_adj(n, off); SOCKBUF_LOCK(&last->in6p_socket->so_rcv); if (sbappendaddr_locked( &last->in6p_socket->so_rcv, (struct sockaddr *)&fromsa, n, opts) == 0) { /* should notify about lost packet */ m_freem(n); if (opts) { m_freem(opts); } SOCKBUF_UNLOCK( &last->in6p_socket->so_rcv); } else sorwakeup_locked(last->in6p_socket); opts = NULL; } INP_RUNLOCK(last); } last = in6p; } INP_INFO_RUNLOCK(&V_ripcbinfo); if (last) { if (last->in6p_flags & IN6P_CONTROLOPTS) ip6_savecontrol(last, m, &opts); /* strip intermediate headers */ m_adj(m, off); /* avoid using mbuf clusters if possible (see above) */ if ((m->m_flags & M_EXT) && m->m_next == NULL && m->m_len <= MHLEN) { struct mbuf *n; MGET(n, M_DONTWAIT, m->m_type); if (n != NULL) { if (m_dup_pkthdr(n, m, M_NOWAIT)) { bcopy(m->m_data, n->m_data, m->m_len); n->m_len = m->m_len; m_freem(m); m = n; } else { m_freem(n); n = NULL; } } } SOCKBUF_LOCK(&last->in6p_socket->so_rcv); if (sbappendaddr_locked(&last->in6p_socket->so_rcv, (struct sockaddr *)&fromsa, m, opts) == 0) { m_freem(m); if (opts) m_freem(opts); SOCKBUF_UNLOCK(&last->in6p_socket->so_rcv); } else sorwakeup_locked(last->in6p_socket); INP_RUNLOCK(last); } else { m_freem(m); V_ip6stat.ip6s_delivered--; } return IPPROTO_DONE; } /* * Reflect the ip6 packet back to the source. * OFF points to the icmp6 header, counted from the top of the mbuf. */ void icmp6_reflect(struct mbuf *m, size_t off) { INIT_VNET_INET6(curvnet); struct ip6_hdr *ip6; struct icmp6_hdr *icmp6; struct in6_ifaddr *ia; int plen; int type, code; struct ifnet *outif = NULL; struct in6_addr origdst, *src = NULL; /* too short to reflect */ if (off < sizeof(struct ip6_hdr)) { nd6log((LOG_DEBUG, "sanity fail: off=%lx, sizeof(ip6)=%lx in %s:%d\n", (u_long)off, (u_long)sizeof(struct ip6_hdr), __FILE__, __LINE__)); goto bad; } /* * If there are extra headers between IPv6 and ICMPv6, strip * off that header first. */ #ifdef DIAGNOSTIC if (sizeof(struct ip6_hdr) + sizeof(struct icmp6_hdr) > MHLEN) panic("assumption failed in icmp6_reflect"); #endif if (off > sizeof(struct ip6_hdr)) { size_t l; struct ip6_hdr nip6; l = off - sizeof(struct ip6_hdr); m_copydata(m, 0, sizeof(nip6), (caddr_t)&nip6); m_adj(m, l); l = sizeof(struct ip6_hdr) + sizeof(struct icmp6_hdr); if (m->m_len < l) { if ((m = m_pullup(m, l)) == NULL) return; } bcopy((caddr_t)&nip6, mtod(m, caddr_t), sizeof(nip6)); } else /* off == sizeof(struct ip6_hdr) */ { size_t l; l = sizeof(struct ip6_hdr) + sizeof(struct icmp6_hdr); if (m->m_len < l) { if ((m = m_pullup(m, l)) == NULL) return; } } plen = m->m_pkthdr.len - sizeof(struct ip6_hdr); ip6 = mtod(m, struct ip6_hdr *); ip6->ip6_nxt = IPPROTO_ICMPV6; icmp6 = (struct icmp6_hdr *)(ip6 + 1); type = icmp6->icmp6_type; /* keep type for statistics */ code = icmp6->icmp6_code; /* ditto. */ origdst = ip6->ip6_dst; /* * ip6_input() drops a packet if its src is multicast. * So, the src is never multicast. */ ip6->ip6_dst = ip6->ip6_src; /* * If the incoming packet was addressed directly to us (i.e. unicast), * use dst as the src for the reply. 
* The IN6_IFF_NOTREADY case should be VERY rare, but is possible * (for example) when we encounter an error while forwarding procedure * destined to a duplicated address of ours. * Note that ip6_getdstifaddr() may fail if we are in an error handling * procedure of an outgoing packet of our own, in which case we need * to search in the ifaddr list. */ if (!IN6_IS_ADDR_MULTICAST(&origdst)) { if ((ia = ip6_getdstifaddr(m))) { if (!(ia->ia6_flags & (IN6_IFF_ANYCAST|IN6_IFF_NOTREADY))) src = &ia->ia_addr.sin6_addr; } else { struct sockaddr_in6 d; bzero(&d, sizeof(d)); d.sin6_family = AF_INET6; d.sin6_len = sizeof(d); d.sin6_addr = origdst; ia = (struct in6_ifaddr *) ifa_ifwithaddr((struct sockaddr *)&d); if (ia && !(ia->ia6_flags & (IN6_IFF_ANYCAST|IN6_IFF_NOTREADY))) { src = &ia->ia_addr.sin6_addr; } } } if (src == NULL) { int e; struct sockaddr_in6 sin6; struct route_in6 ro; /* * This case matches to multicasts, our anycast, or unicasts * that we do not own. Select a source address based on the * source address of the erroneous packet. */ bzero(&sin6, sizeof(sin6)); sin6.sin6_family = AF_INET6; sin6.sin6_len = sizeof(sin6); sin6.sin6_addr = ip6->ip6_dst; /* zone ID should be embedded */ bzero(&ro, sizeof(ro)); src = in6_selectsrc(&sin6, NULL, NULL, &ro, NULL, &outif, &e); if (ro.ro_rt) RTFREE(ro.ro_rt); /* XXX: we could use this */ if (src == NULL) { char ip6buf[INET6_ADDRSTRLEN]; nd6log((LOG_DEBUG, "icmp6_reflect: source can't be determined: " "dst=%s, error=%d\n", ip6_sprintf(ip6buf, &sin6.sin6_addr), e)); goto bad; } } ip6->ip6_src = *src; ip6->ip6_flow = 0; ip6->ip6_vfc &= ~IPV6_VERSION_MASK; ip6->ip6_vfc |= IPV6_VERSION; ip6->ip6_nxt = IPPROTO_ICMPV6; if (outif) ip6->ip6_hlim = ND_IFINFO(outif)->chlim; else if (m->m_pkthdr.rcvif) { /* XXX: This may not be the outgoing interface */ ip6->ip6_hlim = ND_IFINFO(m->m_pkthdr.rcvif)->chlim; } else ip6->ip6_hlim = V_ip6_defhlim; icmp6->icmp6_cksum = 0; icmp6->icmp6_cksum = in6_cksum(m, IPPROTO_ICMPV6, sizeof(struct ip6_hdr), plen); /* * XXX option handling */ m->m_flags &= ~(M_BCAST|M_MCAST); ip6_output(m, NULL, NULL, 0, NULL, &outif, NULL); if (outif) icmp6_ifoutstat_inc(outif, type, code); return; bad: m_freem(m); return; } void icmp6_fasttimo(void) { return; } static const char * icmp6_redirect_diag(struct in6_addr *src6, struct in6_addr *dst6, struct in6_addr *tgt6) { static char buf[1024]; char ip6bufs[INET6_ADDRSTRLEN]; char ip6bufd[INET6_ADDRSTRLEN]; char ip6buft[INET6_ADDRSTRLEN]; snprintf(buf, sizeof(buf), "(src=%s dst=%s tgt=%s)", ip6_sprintf(ip6bufs, src6), ip6_sprintf(ip6bufd, dst6), ip6_sprintf(ip6buft, tgt6)); return buf; } void icmp6_redirect_input(struct mbuf *m, int off) { INIT_VNET_INET6(curvnet); struct ifnet *ifp; struct ip6_hdr *ip6 = mtod(m, struct ip6_hdr *); struct nd_redirect *nd_rd; int icmp6len = ntohs(ip6->ip6_plen); char *lladdr = NULL; int lladdrlen = 0; u_char *redirhdr = NULL; int redirhdrlen = 0; struct rtentry *rt = NULL; int is_router; int is_onlink; struct in6_addr src6 = ip6->ip6_src; struct in6_addr redtgt6; struct in6_addr reddst6; union nd_opts ndopts; char ip6buf[INET6_ADDRSTRLEN]; if (!m) return; ifp = m->m_pkthdr.rcvif; if (!ifp) return; /* XXX if we are router, we don't update route by icmp6 redirect */ if (V_ip6_forwarding) goto freeit; if (!V_icmp6_rediraccept) goto freeit; #ifndef PULLDOWN_TEST IP6_EXTHDR_CHECK(m, off, icmp6len,); nd_rd = (struct nd_redirect *)((caddr_t)ip6 + off); #else IP6_EXTHDR_GET(nd_rd, struct nd_redirect *, m, off, icmp6len); if (nd_rd == NULL) { V_icmp6stat.icp6s_tooshort++; 
return; } #endif redtgt6 = nd_rd->nd_rd_target; reddst6 = nd_rd->nd_rd_dst; if (in6_setscope(&redtgt6, m->m_pkthdr.rcvif, NULL) || in6_setscope(&reddst6, m->m_pkthdr.rcvif, NULL)) { goto freeit; } /* validation */ if (!IN6_IS_ADDR_LINKLOCAL(&src6)) { nd6log((LOG_ERR, "ICMP6 redirect sent from %s rejected; " "must be from linklocal\n", ip6_sprintf(ip6buf, &src6))); goto bad; } if (ip6->ip6_hlim != 255) { nd6log((LOG_ERR, "ICMP6 redirect sent from %s rejected; " "hlim=%d (must be 255)\n", ip6_sprintf(ip6buf, &src6), ip6->ip6_hlim)); goto bad; } { /* ip6->ip6_src must be equal to gw for icmp6->icmp6_reddst */ struct sockaddr_in6 sin6; struct in6_addr *gw6; bzero(&sin6, sizeof(sin6)); sin6.sin6_family = AF_INET6; sin6.sin6_len = sizeof(struct sockaddr_in6); bcopy(&reddst6, &sin6.sin6_addr, sizeof(reddst6)); rt = rtalloc1((struct sockaddr *)&sin6, 0, 0UL); if (rt) { if (rt->rt_gateway == NULL || rt->rt_gateway->sa_family != AF_INET6) { nd6log((LOG_ERR, "ICMP6 redirect rejected; no route " "with inet6 gateway found for redirect dst: %s\n", icmp6_redirect_diag(&src6, &reddst6, &redtgt6))); RTFREE_LOCKED(rt); goto bad; } gw6 = &(((struct sockaddr_in6 *)rt->rt_gateway)->sin6_addr); if (bcmp(&src6, gw6, sizeof(struct in6_addr)) != 0) { nd6log((LOG_ERR, "ICMP6 redirect rejected; " "not equal to gw-for-src=%s (must be same): " "%s\n", ip6_sprintf(ip6buf, gw6), icmp6_redirect_diag(&src6, &reddst6, &redtgt6))); RTFREE_LOCKED(rt); goto bad; } } else { nd6log((LOG_ERR, "ICMP6 redirect rejected; " "no route found for redirect dst: %s\n", icmp6_redirect_diag(&src6, &reddst6, &redtgt6))); goto bad; } RTFREE_LOCKED(rt); rt = NULL; } if (IN6_IS_ADDR_MULTICAST(&reddst6)) { nd6log((LOG_ERR, "ICMP6 redirect rejected; " "redirect dst must be unicast: %s\n", icmp6_redirect_diag(&src6, &reddst6, &redtgt6))); goto bad; } is_router = is_onlink = 0; if (IN6_IS_ADDR_LINKLOCAL(&redtgt6)) is_router = 1; /* router case */ if (bcmp(&redtgt6, &reddst6, sizeof(redtgt6)) == 0) is_onlink = 1; /* on-link destination case */ if (!is_router && !is_onlink) { nd6log((LOG_ERR, "ICMP6 redirect rejected; " "neither router case nor onlink case: %s\n", icmp6_redirect_diag(&src6, &reddst6, &redtgt6))); goto bad; } /* validation passed */ icmp6len -= sizeof(*nd_rd); nd6_option_init(nd_rd + 1, icmp6len, &ndopts); if (nd6_options(&ndopts) < 0) { nd6log((LOG_INFO, "icmp6_redirect_input: " "invalid ND option, rejected: %s\n", icmp6_redirect_diag(&src6, &reddst6, &redtgt6))); /* nd6_options have incremented stats */ goto freeit; } if (ndopts.nd_opts_tgt_lladdr) { lladdr = (char *)(ndopts.nd_opts_tgt_lladdr + 1); lladdrlen = ndopts.nd_opts_tgt_lladdr->nd_opt_len << 3; } if (ndopts.nd_opts_rh) { redirhdrlen = ndopts.nd_opts_rh->nd_opt_rh_len; redirhdr = (u_char *)(ndopts.nd_opts_rh + 1); /* xxx */ } if (lladdr && ((ifp->if_addrlen + 2 + 7) & ~7) != lladdrlen) { nd6log((LOG_INFO, "icmp6_redirect_input: lladdrlen mismatch for %s " "(if %d, icmp6 packet %d): %s\n", ip6_sprintf(ip6buf, &redtgt6), ifp->if_addrlen, lladdrlen - 2, icmp6_redirect_diag(&src6, &reddst6, &redtgt6))); goto bad; } /* RFC 2461 8.3 */ nd6_cache_lladdr(ifp, &redtgt6, lladdr, lladdrlen, ND_REDIRECT, is_onlink ? ND_REDIRECT_ONLINK : ND_REDIRECT_ROUTER); if (!is_onlink) { /* better router case. perform rtredirect. 
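Before touching any routing or neighbor state, icmp6_redirect_input() above applies a series of cheap syntactic checks: the redirect must come from a link-local source with hop limit 255, the redirected destination must be unicast, and the target must be either a link-local router address or the destination itself (the on-link case). Collected into one predicate for illustration:

#include <string.h>
#include <netinet/in.h>
#include <netinet/ip6.h>

/* Returns 1 if the redirect passes the basic validation steps shown above. */
static int
redirect_sanity(const struct ip6_hdr *ip6, const struct in6_addr *target,
    const struct in6_addr *dst)
{
    int is_router, is_onlink;

    if (!IN6_IS_ADDR_LINKLOCAL(&ip6->ip6_src))
        return 0;                   /* must be sent by an on-link router */
    if (ip6->ip6_hlim != 255)
        return 0;                   /* must not have been forwarded */
    if (IN6_IS_ADDR_MULTICAST(dst))
        return 0;                   /* redirected destination must be unicast */
    is_router = IN6_IS_ADDR_LINKLOCAL(target);
    is_onlink = (memcmp(target, dst, sizeof(*target)) == 0);
    return is_router || is_onlink;
}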
*/ /* perform rtredirect */ struct sockaddr_in6 sdst; struct sockaddr_in6 sgw; struct sockaddr_in6 ssrc; bzero(&sdst, sizeof(sdst)); bzero(&sgw, sizeof(sgw)); bzero(&ssrc, sizeof(ssrc)); sdst.sin6_family = sgw.sin6_family = ssrc.sin6_family = AF_INET6; sdst.sin6_len = sgw.sin6_len = ssrc.sin6_len = sizeof(struct sockaddr_in6); bcopy(&redtgt6, &sgw.sin6_addr, sizeof(struct in6_addr)); bcopy(&reddst6, &sdst.sin6_addr, sizeof(struct in6_addr)); bcopy(&src6, &ssrc.sin6_addr, sizeof(struct in6_addr)); rtredirect((struct sockaddr *)&sdst, (struct sockaddr *)&sgw, (struct sockaddr *)NULL, RTF_GATEWAY | RTF_HOST, (struct sockaddr *)&ssrc); } /* finally update cached route in each socket via pfctlinput */ { struct sockaddr_in6 sdst; bzero(&sdst, sizeof(sdst)); sdst.sin6_family = AF_INET6; sdst.sin6_len = sizeof(struct sockaddr_in6); bcopy(&reddst6, &sdst.sin6_addr, sizeof(struct in6_addr)); pfctlinput(PRC_REDIRECT_HOST, (struct sockaddr *)&sdst); #ifdef IPSEC key_sa_routechange((struct sockaddr *)&sdst); #endif /* IPSEC */ } freeit: m_freem(m); return; bad: V_icmp6stat.icp6s_badredirect++; m_freem(m); } void icmp6_redirect_output(struct mbuf *m0, struct rtentry *rt) { INIT_VNET_INET6(curvnet); struct ifnet *ifp; /* my outgoing interface */ struct in6_addr *ifp_ll6; struct in6_addr *router_ll6; struct ip6_hdr *sip6; /* m0 as struct ip6_hdr */ struct mbuf *m = NULL; /* newly allocated one */ struct ip6_hdr *ip6; /* m as struct ip6_hdr */ struct nd_redirect *nd_rd; size_t maxlen; u_char *p; struct ifnet *outif = NULL; struct sockaddr_in6 src_sa; icmp6_errcount(&V_icmp6stat.icp6s_outerrhist, ND_REDIRECT, 0); /* if we are not router, we don't send icmp6 redirect */ if (!V_ip6_forwarding) goto fail; /* sanity check */ if (!m0 || !rt || !(rt->rt_flags & RTF_UP) || !(ifp = rt->rt_ifp)) goto fail; /* * Address check: * the source address must identify a neighbor, and * the destination address must not be a multicast address * [RFC 2461, sec 8.2] */ sip6 = mtod(m0, struct ip6_hdr *); bzero(&src_sa, sizeof(src_sa)); src_sa.sin6_family = AF_INET6; src_sa.sin6_len = sizeof(src_sa); src_sa.sin6_addr = sip6->ip6_src; if (nd6_is_addr_neighbor(&src_sa, ifp) == 0) goto fail; if (IN6_IS_ADDR_MULTICAST(&sip6->ip6_dst)) goto fail; /* what should we do here? */ /* rate limit */ if (icmp6_ratelimit(&sip6->ip6_src, ND_REDIRECT, 0)) goto fail; /* * Since we are going to append up to 1280 bytes (= IPV6_MMTU), * we almost always ask for an mbuf cluster for simplicity. * (MHLEN < IPV6_MMTU is almost always true) */ #if IPV6_MMTU >= MCLBYTES # error assumption failed about IPV6_MMTU and MCLBYTES #endif MGETHDR(m, M_DONTWAIT, MT_HEADER); if (m && IPV6_MMTU >= MHLEN) MCLGET(m, M_DONTWAIT); if (!m) goto fail; m->m_pkthdr.rcvif = NULL; m->m_len = 0; maxlen = M_TRAILINGSPACE(m); maxlen = min(IPV6_MMTU, maxlen); /* just for safety */ if (maxlen < sizeof(struct ip6_hdr) + sizeof(struct icmp6_hdr) + ((sizeof(struct nd_opt_hdr) + ifp->if_addrlen + 7) & ~7)) { goto fail; } { /* get ip6 linklocal address for ifp(my outgoing interface). */ struct in6_ifaddr *ia; if ((ia = in6ifa_ifpforlinklocal(ifp, IN6_IFF_NOTREADY| IN6_IFF_ANYCAST)) == NULL) goto fail; ifp_ll6 = &ia->ia_addr.sin6_addr; } /* get ip6 linklocal address for the router. 
 */
	if (rt->rt_gateway && (rt->rt_flags & RTF_GATEWAY)) {
		struct sockaddr_in6 *sin6;
		sin6 = (struct sockaddr_in6 *)rt->rt_gateway;
		router_ll6 = &sin6->sin6_addr;
		if (!IN6_IS_ADDR_LINKLOCAL(router_ll6))
			router_ll6 = (struct in6_addr *)NULL;
	} else
		router_ll6 = (struct in6_addr *)NULL;

	/* ip6 */
	ip6 = mtod(m, struct ip6_hdr *);
	ip6->ip6_flow = 0;
	ip6->ip6_vfc &= ~IPV6_VERSION_MASK;
	ip6->ip6_vfc |= IPV6_VERSION;
	/* ip6->ip6_plen will be set later */
	ip6->ip6_nxt = IPPROTO_ICMPV6;
	ip6->ip6_hlim = 255;
	/* ip6->ip6_src must be linklocal addr for my outgoing if. */
	bcopy(ifp_ll6, &ip6->ip6_src, sizeof(struct in6_addr));
	bcopy(&sip6->ip6_src, &ip6->ip6_dst, sizeof(struct in6_addr));

	/* ND Redirect */
	nd_rd = (struct nd_redirect *)(ip6 + 1);
	nd_rd->nd_rd_type = ND_REDIRECT;
	nd_rd->nd_rd_code = 0;
	nd_rd->nd_rd_reserved = 0;
	if (rt->rt_flags & RTF_GATEWAY) {
		/*
		 * nd_rd->nd_rd_target must be a link-local address in
		 * better router cases.
		 */
		if (!router_ll6)
			goto fail;
		bcopy(router_ll6, &nd_rd->nd_rd_target,
		    sizeof(nd_rd->nd_rd_target));
		bcopy(&sip6->ip6_dst, &nd_rd->nd_rd_dst,
		    sizeof(nd_rd->nd_rd_dst));
	} else {
		/* make sure redtgt == reddst */
		bcopy(&sip6->ip6_dst, &nd_rd->nd_rd_target,
		    sizeof(nd_rd->nd_rd_target));
		bcopy(&sip6->ip6_dst, &nd_rd->nd_rd_dst,
		    sizeof(nd_rd->nd_rd_dst));
	}

	p = (u_char *)(nd_rd + 1);

	if (!router_ll6)
		goto nolladdropt;

	{
		/* target lladdr option */
-		struct rtentry *rt_router = NULL;
		int len;
-		struct sockaddr_dl *sdl;
+		struct llentry *ln;
		struct nd_opt_hdr *nd_opt;
		char *lladdr;

-		rt_router = nd6_lookup(router_ll6, 0, ifp);
-		if (!rt_router)
+		IF_AFDATA_LOCK(ifp);
+		ln = nd6_lookup(router_ll6, 0, ifp);
+		IF_AFDATA_UNLOCK(ifp);
+		if (!ln)
			goto nolladdropt;
+
		len = sizeof(*nd_opt) + ifp->if_addrlen;
		len = (len + 7) & ~7;	/* round by 8 */
		/* safety check */
		if (len + (p - (u_char *)ip6) > maxlen)
			goto nolladdropt;
-		if (!(rt_router->rt_flags & RTF_GATEWAY) &&
-		    (rt_router->rt_flags & RTF_LLINFO) &&
-		    (rt_router->rt_gateway->sa_family == AF_LINK) &&
-		    (sdl = (struct sockaddr_dl *)rt_router->rt_gateway) &&
-		    sdl->sdl_alen) {
+
+		if (ln->la_flags & LLE_VALID) {
			nd_opt = (struct nd_opt_hdr *)p;
			nd_opt->nd_opt_type = ND_OPT_TARGET_LINKADDR;
			nd_opt->nd_opt_len = len >> 3;
			lladdr = (char *)(nd_opt + 1);
-			bcopy(LLADDR(sdl), lladdr, ifp->if_addrlen);
+			bcopy(&ln->ll_addr, lladdr, ifp->if_addrlen);
			p += len;
		}
+		LLE_RUNLOCK(ln);
	}
nolladdropt:;

	m->m_pkthdr.len = m->m_len = p - (u_char *)ip6;

	/* just to be safe */
#ifdef M_DECRYPTED	/*not openbsd*/
	if (m0->m_flags & M_DECRYPTED)
		goto noredhdropt;
#endif
	if (p - (u_char *)ip6 > maxlen)
		goto noredhdropt;

	{
		/* redirected header option */
		int len;
		struct nd_opt_rd_hdr *nd_opt_rh;

		/*
		 * compute the maximum size for icmp6 redirect header option.
		 * XXX room for auth header?
		 */
		len = maxlen - (p - (u_char *)ip6);
		len &= ~7;

		/* This is just for simplicity. */
		if (m0->m_pkthdr.len != m0->m_len) {
			if (m0->m_next) {
				m_freem(m0->m_next);
				m0->m_next = NULL;
			}
			m0->m_pkthdr.len = m0->m_len;
		}

		/*
		 * Redirected header option spec (RFC2461 4.6.3) talks nothing
		 * about padding/truncate rule for the original IP packet.
		 * From the discussion on IPv6imp in Feb 1999,
		 * the consensus was:
		 * - "attach as much as possible" is the goal
		 * - pad if not aligned (original size can be guessed by
		 *   original ip6 header)
		 * Following code adds the padding if it is simple enough,
		 * and truncates if not.
*/ if (m0->m_next || m0->m_pkthdr.len != m0->m_len) panic("assumption failed in %s:%d", __FILE__, __LINE__); if (len - sizeof(*nd_opt_rh) < m0->m_pkthdr.len) { /* not enough room, truncate */ m0->m_pkthdr.len = m0->m_len = len - sizeof(*nd_opt_rh); } else { /* enough room, pad or truncate */ size_t extra; extra = m0->m_pkthdr.len % 8; if (extra) { /* pad if easy enough, truncate if not */ if (8 - extra <= M_TRAILINGSPACE(m0)) { /* pad */ m0->m_len += (8 - extra); m0->m_pkthdr.len += (8 - extra); } else { /* truncate */ m0->m_pkthdr.len -= extra; m0->m_len -= extra; } } len = m0->m_pkthdr.len + sizeof(*nd_opt_rh); m0->m_pkthdr.len = m0->m_len = len - sizeof(*nd_opt_rh); } nd_opt_rh = (struct nd_opt_rd_hdr *)p; bzero(nd_opt_rh, sizeof(*nd_opt_rh)); nd_opt_rh->nd_opt_rh_type = ND_OPT_REDIRECTED_HEADER; nd_opt_rh->nd_opt_rh_len = len >> 3; p += sizeof(*nd_opt_rh); m->m_pkthdr.len = m->m_len = p - (u_char *)ip6; /* connect m0 to m */ m_tag_delete_chain(m0, NULL); m0->m_flags &= ~M_PKTHDR; m->m_next = m0; m->m_pkthdr.len = m->m_len + m0->m_len; m0 = NULL; } noredhdropt:; if (m0) { m_freem(m0); m0 = NULL; } /* XXX: clear embedded link IDs in the inner header */ in6_clearscope(&sip6->ip6_src); in6_clearscope(&sip6->ip6_dst); in6_clearscope(&nd_rd->nd_rd_target); in6_clearscope(&nd_rd->nd_rd_dst); ip6->ip6_plen = htons(m->m_pkthdr.len - sizeof(struct ip6_hdr)); nd_rd->nd_rd_cksum = 0; nd_rd->nd_rd_cksum = in6_cksum(m, IPPROTO_ICMPV6, sizeof(*ip6), ntohs(ip6->ip6_plen)); /* send the packet to outside... */ ip6_output(m, NULL, NULL, 0, NULL, &outif, NULL); if (outif) { icmp6_ifstat_inc(outif, ifs6_out_msg); icmp6_ifstat_inc(outif, ifs6_out_redirect); } V_icmp6stat.icp6s_outhist[ND_REDIRECT]++; return; fail: if (m) m_freem(m); if (m0) m_freem(m0); } /* * ICMPv6 socket option processing. */ int icmp6_ctloutput(struct socket *so, struct sockopt *sopt) { int error = 0; int optlen; struct inpcb *inp = sotoinpcb(so); int level, op, optname; if (sopt) { level = sopt->sopt_level; op = sopt->sopt_dir; optname = sopt->sopt_name; optlen = sopt->sopt_valsize; } else level = op = optname = optlen = 0; if (level != IPPROTO_ICMPV6) { return EINVAL; } switch (op) { case PRCO_SETOPT: switch (optname) { case ICMP6_FILTER: { struct icmp6_filter ic6f; if (optlen != sizeof(ic6f)) { error = EMSGSIZE; break; } error = sooptcopyin(sopt, &ic6f, optlen, optlen); if (error == 0) { INP_WLOCK(inp); *inp->in6p_icmp6filt = ic6f; INP_WUNLOCK(inp); } break; } default: error = ENOPROTOOPT; break; } break; case PRCO_GETOPT: switch (optname) { case ICMP6_FILTER: { struct icmp6_filter ic6f; INP_RLOCK(inp); ic6f = *inp->in6p_icmp6filt; INP_RUNLOCK(inp); error = sooptcopyout(sopt, &ic6f, sizeof(ic6f)); break; } default: error = ENOPROTOOPT; break; } break; } return (error); } /* * Perform rate limit check. * Returns 0 if it is okay to send the icmp6 packet. * Returns 1 if the router SHOULD NOT send this icmp6 packet due to rate * limitation. * * XXX per-destination/type check necessary? 
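The padding logic above keeps the packet copied into the Redirected Header option on an 8-byte boundary: pad when the original mbuf has enough trailing space, truncate otherwise, and never exceed the room left in the redirect. A sketch of just the size computation, with the option budget and the spare trailing space passed in as plain numbers (an abstraction of the mbuf bookkeeping, not the kernel code):

#include <stddef.h>

/*
 * room   - bytes available for the copied packet after the option header
 * pktlen - length of the original packet
 * spare  - trailing space available for padding the original packet
 */
static size_t
redhdr_payload_len(size_t room, size_t pktlen, size_t spare)
{
    size_t extra;

    if (pktlen > room)
        return room & ~(size_t)7;       /* not enough room: truncate, 8-aligned */
    extra = pktlen % 8;
    if (extra == 0)
        return pktlen;
    if (8 - extra <= spare)
        return pktlen + (8 - extra);    /* pad up to the boundary */
    return pktlen - extra;              /* otherwise truncate down */
}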
* * dst - not used at this moment * type - not used at this moment * code - not used at this moment */ static int icmp6_ratelimit(const struct in6_addr *dst, const int type, const int code) { INIT_VNET_INET6(curvnet); int ret; ret = 0; /* okay to send */ /* PPS limit */ if (!ppsratecheck(&V_icmp6errppslim_last, &V_icmp6errpps_count, V_icmp6errppslim)) { /* The packet is subject to rate limit */ ret++; } return ret; } Index: head/sys/netinet6/in6.c =================================================================== --- head/sys/netinet6/in6.c (revision 186118) +++ head/sys/netinet6/in6.c (revision 186119) @@ -1,2332 +1,2394 @@ /*- * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the project nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $KAME: in6.c,v 1.259 2002/01/21 11:37:50 keiichi Exp $ */ /*- * Copyright (c) 1982, 1986, 1991, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
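icmp6_ratelimit() above defers to ppsratecheck() with the icmp6errppslim sysctl as the packets-per-second budget. The sketch below is a deliberately simplified stand-in for that check, showing only the windowed-counter idea; it is not the kernel's ppsratecheck(9).

#include <time.h>

struct pps_state {
    time_t window;      /* start of the current 1-second window */
    int    count;       /* packets allowed in that window so far */
};

/* Returns 1 if sending another ICMPv6 error is allowed under "limit" pps. */
static int
pps_allow(struct pps_state *st, int limit, time_t now)
{
    if (now != st->window) {
        st->window = now;
        st->count = 0;
    }
    if (st->count >= limit)
        return 0;
    st->count++;
    return 1;
}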
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)in.c 8.2 (Berkeley) 11/15/93 */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #include #include #include #include #include #include #include #include #include MALLOC_DEFINE(M_IP6MADDR, "in6_multi", "internet multicast address"); /* * Definitions of some costant IP6 addresses. */ const struct in6_addr in6addr_any = IN6ADDR_ANY_INIT; const struct in6_addr in6addr_loopback = IN6ADDR_LOOPBACK_INIT; const struct in6_addr in6addr_nodelocal_allnodes = IN6ADDR_NODELOCAL_ALLNODES_INIT; const struct in6_addr in6addr_linklocal_allnodes = IN6ADDR_LINKLOCAL_ALLNODES_INIT; const struct in6_addr in6addr_linklocal_allrouters = IN6ADDR_LINKLOCAL_ALLROUTERS_INIT; const struct in6_addr in6mask0 = IN6MASK0; const struct in6_addr in6mask32 = IN6MASK32; const struct in6_addr in6mask64 = IN6MASK64; const struct in6_addr in6mask96 = IN6MASK96; const struct in6_addr in6mask128 = IN6MASK128; const struct sockaddr_in6 sa6_any = { sizeof(sa6_any), AF_INET6, 0, 0, IN6ADDR_ANY_INIT, 0 }; static int in6_lifaddr_ioctl __P((struct socket *, u_long, caddr_t, struct ifnet *, struct thread *)); static int in6_ifinit __P((struct ifnet *, struct in6_ifaddr *, struct sockaddr_in6 *, int)); static void in6_unlink_ifa(struct in6_ifaddr *, struct ifnet *); struct in6_multihead in6_multihead; /* XXX BSS initialization */ int (*faithprefix_p)(struct in6_addr *); -/* - * Subroutine for in6_ifaddloop() and in6_ifremloop(). - * This routine does actual work. - */ -static void -in6_ifloop_request(int cmd, struct ifaddr *ifa) -{ - struct sockaddr_in6 all1_sa; - struct rtentry *nrt = NULL; - int e; - char ip6buf[INET6_ADDRSTRLEN]; - bzero(&all1_sa, sizeof(all1_sa)); - all1_sa.sin6_family = AF_INET6; - all1_sa.sin6_len = sizeof(struct sockaddr_in6); - all1_sa.sin6_addr = in6mask128; - /* - * We specify the address itself as the gateway, and set the - * RTF_LLINFO flag, so that the corresponding host route would have - * the flag, and thus applications that assume traditional behavior - * would be happy. Note that we assume the caller of the function - * (probably implicitly) set nd6_rtrequest() to ifa->ifa_rtrequest, - * which changes the outgoing interface to the loopback interface. - */ - e = rtrequest(cmd, ifa->ifa_addr, ifa->ifa_addr, - (struct sockaddr *)&all1_sa, RTF_UP|RTF_HOST|RTF_LLINFO, &nrt); - if (e != 0) { - /* XXX need more descriptive message */ - - log(LOG_ERR, "in6_ifloop_request: " - "%s operation failed for %s (errno=%d)\n", - cmd == RTM_ADD ? "ADD" : "DELETE", - ip6_sprintf(ip6buf, - &((struct in6_ifaddr *)ifa)->ia_addr.sin6_addr), e); - } - - /* - * Report the addition/removal of the address to the routing socket. - * XXX: since we called rtinit for a p2p interface with a destination, - * we end up reporting twice in such a case. 
Should we rather - * omit the second report? - */ - if (nrt) { - RT_LOCK(nrt); - /* - * Make sure rt_ifa be equal to IFA, the second argument of - * the function. We need this because when we refer to - * rt_ifa->ia6_flags in ip6_input, we assume that the rt_ifa - * points to the address instead of the loopback address. - */ - if (cmd == RTM_ADD && ifa != nrt->rt_ifa) { - IFAFREE(nrt->rt_ifa); - IFAREF(ifa); - nrt->rt_ifa = ifa; - } - - rt_newaddrmsg(cmd, ifa, e, nrt); - if (cmd == RTM_DELETE) - RTFREE_LOCKED(nrt); - else { - /* the cmd must be RTM_ADD here */ - RT_REMREF(nrt); - RT_UNLOCK(nrt); - } - } -} - -/* - * Add ownaddr as loopback rtentry. We previously add the route only if - * necessary (ex. on a p2p link). However, since we now manage addresses - * separately from prefixes, we should always add the route. We can't - * rely on the cloning mechanism from the corresponding interface route - * any more. - */ -void -in6_ifaddloop(struct ifaddr *ifa) -{ - struct rtentry *rt; - int need_loop; - - /* If there is no loopback entry, allocate one. */ - rt = rtalloc1(ifa->ifa_addr, 0, 0); - need_loop = (rt == NULL || (rt->rt_flags & RTF_HOST) == 0 || - (rt->rt_ifp->if_flags & IFF_LOOPBACK) == 0); - if (rt) - RTFREE_LOCKED(rt); - if (need_loop) - in6_ifloop_request(RTM_ADD, ifa); -} - -/* - * Remove loopback rtentry of ownaddr generated by in6_ifaddloop(), - * if it exists. - */ -void -in6_ifremloop(struct ifaddr *ifa) -{ - INIT_VNET_INET6(curvnet); - struct in6_ifaddr *ia; - struct rtentry *rt; - int ia_count = 0; - - /* - * Some of BSD variants do not remove cloned routes - * from an interface direct route, when removing the direct route - * (see comments in net/net_osdep.h). Even for variants that do remove - * cloned routes, they could fail to remove the cloned routes when - * we handle multple addresses that share a common prefix. - * So, we should remove the route corresponding to the deleted address. - */ - - /* - * Delete the entry only if exact one ifa exists. More than one ifa - * can exist if we assign a same single address to multiple - * (probably p2p) interfaces. - * XXX: we should avoid such a configuration in IPv6... - */ - for (ia = V_in6_ifaddr; ia; ia = ia->ia_next) { - if (IN6_ARE_ADDR_EQUAL(IFA_IN6(ifa), &ia->ia_addr.sin6_addr)) { - ia_count++; - if (ia_count > 1) - break; - } - } - - if (ia_count == 1) { - /* - * Before deleting, check if a corresponding loopbacked host - * route surely exists. With this check, we can avoid to - * delete an interface direct route whose destination is same - * as the address being removed. This can happen when removing - * a subnet-router anycast address on an interface attahced - * to a shared medium. - */ - rt = rtalloc1(ifa->ifa_addr, 0, 0); - if (rt != NULL) { - if ((rt->rt_flags & RTF_HOST) != 0 && - (rt->rt_ifp->if_flags & IFF_LOOPBACK) != 0) { - RTFREE_LOCKED(rt); - in6_ifloop_request(RTM_DELETE, ifa); - } else - RT_UNLOCK(rt); - } - } -} - int in6_mask2len(struct in6_addr *mask, u_char *lim0) { int x = 0, y; u_char *lim = lim0, *p; /* ignore the scope_id part */ if (lim0 == NULL || lim0 - (u_char *)mask > sizeof(*mask)) lim = (u_char *)mask + sizeof(*mask); for (p = (u_char *)mask; p < lim; x++, p++) { if (*p != 0xff) break; } y = 0; if (p < lim) { for (y = 0; y < 8; y++) { if ((*p & (0x80 >> y)) == 0) break; } } /* * when the limit pointer is given, do a stricter check on the * remaining bits. 
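 * In other words a mask whose set bits are not contiguous is rejected
 * with -1; e.g. ffff:ff00:: yields 24, while ffff:00ff:: yields -1.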
*/ if (p < lim) { if (y != 0 && (*p & (0x00ff >> y)) != 0) return (-1); for (p = p + 1; p < lim; p++) if (*p != 0) return (-1); } return x * 8 + y; } #define ifa2ia6(ifa) ((struct in6_ifaddr *)(ifa)) #define ia62ifa(ia6) (&((ia6)->ia_ifa)) int in6_control(struct socket *so, u_long cmd, caddr_t data, struct ifnet *ifp, struct thread *td) { INIT_VNET_INET6(curvnet); struct in6_ifreq *ifr = (struct in6_ifreq *)data; struct in6_ifaddr *ia = NULL; struct in6_aliasreq *ifra = (struct in6_aliasreq *)data; struct sockaddr_in6 *sa6; int error; switch (cmd) { case SIOCGETSGCNT_IN6: case SIOCGETMIFCNT_IN6: return (mrt6_ioctl ? mrt6_ioctl(cmd, data) : EOPNOTSUPP); } switch(cmd) { case SIOCAADDRCTL_POLICY: case SIOCDADDRCTL_POLICY: if (td != NULL) { error = priv_check(td, PRIV_NETINET_ADDRCTRL6); if (error) return (error); } return (in6_src_ioctl(cmd, data)); } if (ifp == NULL) return (EOPNOTSUPP); switch (cmd) { case SIOCSNDFLUSH_IN6: case SIOCSPFXFLUSH_IN6: case SIOCSRTRFLUSH_IN6: case SIOCSDEFIFACE_IN6: case SIOCSIFINFO_FLAGS: if (td != NULL) { error = priv_check(td, PRIV_NETINET_ND6); if (error) return (error); } /* FALLTHROUGH */ case OSIOCGIFINFO_IN6: case SIOCGIFINFO_IN6: case SIOCSIFINFO_IN6: case SIOCGDRLST_IN6: case SIOCGPRLST_IN6: case SIOCGNBRINFO_IN6: case SIOCGDEFIFACE_IN6: return (nd6_ioctl(cmd, data, ifp)); } switch (cmd) { case SIOCSIFPREFIX_IN6: case SIOCDIFPREFIX_IN6: case SIOCAIFPREFIX_IN6: case SIOCCIFPREFIX_IN6: case SIOCSGIFPREFIX_IN6: case SIOCGIFPREFIX_IN6: log(LOG_NOTICE, "prefix ioctls are now invalidated. " "please use ifconfig.\n"); return (EOPNOTSUPP); } switch (cmd) { case SIOCSSCOPE6: if (td != NULL) { error = priv_check(td, PRIV_NETINET_SCOPE6); if (error) return (error); } return (scope6_set(ifp, (struct scope6_id *)ifr->ifr_ifru.ifru_scope_id)); case SIOCGSCOPE6: return (scope6_get(ifp, (struct scope6_id *)ifr->ifr_ifru.ifru_scope_id)); case SIOCGSCOPE6DEF: return (scope6_get_default((struct scope6_id *) ifr->ifr_ifru.ifru_scope_id)); } switch (cmd) { case SIOCALIFADDR: if (td != NULL) { error = priv_check(td, PRIV_NET_ADDIFADDR); if (error) return (error); } return in6_lifaddr_ioctl(so, cmd, data, ifp, td); case SIOCDLIFADDR: if (td != NULL) { error = priv_check(td, PRIV_NET_DELIFADDR); if (error) return (error); } /* FALLTHROUGH */ case SIOCGLIFADDR: return in6_lifaddr_ioctl(so, cmd, data, ifp, td); } /* * Find address for this interface, if it exists. * * In netinet code, we have checked ifra_addr in SIOCSIF*ADDR operation * only, and used the first interface address as the target of other * operations (without checking ifra_addr). This was because netinet * code/API assumed at most 1 interface address per interface. * Since IPv6 allows a node to assign multiple addresses * on a single interface, we almost always look and check the * presence of ifra_addr, and reject invalid ones here. * It also decreases duplicated code among SIOC*_IN6 operations. 
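 * Note that the lookup below may leave ia as NULL; the command-specific
 * checks further down return EADDRNOTAVAIL where an existing address
 * is required.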
*/ switch (cmd) { case SIOCAIFADDR_IN6: case SIOCSIFPHYADDR_IN6: sa6 = &ifra->ifra_addr; break; case SIOCSIFADDR_IN6: case SIOCGIFADDR_IN6: case SIOCSIFDSTADDR_IN6: case SIOCSIFNETMASK_IN6: case SIOCGIFDSTADDR_IN6: case SIOCGIFNETMASK_IN6: case SIOCDIFADDR_IN6: case SIOCGIFPSRCADDR_IN6: case SIOCGIFPDSTADDR_IN6: case SIOCGIFAFLAG_IN6: case SIOCSNDFLUSH_IN6: case SIOCSPFXFLUSH_IN6: case SIOCSRTRFLUSH_IN6: case SIOCGIFALIFETIME_IN6: case SIOCSIFALIFETIME_IN6: case SIOCGIFSTAT_IN6: case SIOCGIFSTAT_ICMP6: sa6 = &ifr->ifr_addr; break; default: sa6 = NULL; break; } if (sa6 && sa6->sin6_family == AF_INET6) { int error = 0; if (sa6->sin6_scope_id != 0) error = sa6_embedscope(sa6, 0); else error = in6_setscope(&sa6->sin6_addr, ifp, NULL); if (error != 0) return (error); ia = in6ifa_ifpwithaddr(ifp, &sa6->sin6_addr); } else ia = NULL; switch (cmd) { case SIOCSIFADDR_IN6: case SIOCSIFDSTADDR_IN6: case SIOCSIFNETMASK_IN6: /* * Since IPv6 allows a node to assign multiple addresses * on a single interface, SIOCSIFxxx ioctls are deprecated. */ /* we decided to obsolete this command (20000704) */ return (EINVAL); case SIOCDIFADDR_IN6: /* * for IPv4, we look for existing in_ifaddr here to allow * "ifconfig if0 delete" to remove the first IPv4 address on * the interface. For IPv6, as the spec allows multiple * interface address from the day one, we consider "remove the * first one" semantics to be not preferable. */ if (ia == NULL) return (EADDRNOTAVAIL); /* FALLTHROUGH */ case SIOCAIFADDR_IN6: /* * We always require users to specify a valid IPv6 address for * the corresponding operation. */ if (ifra->ifra_addr.sin6_family != AF_INET6 || ifra->ifra_addr.sin6_len != sizeof(struct sockaddr_in6)) return (EAFNOSUPPORT); if (td != NULL) { error = priv_check(td, (cmd == SIOCDIFADDR_IN6) ? PRIV_NET_DELIFADDR : PRIV_NET_ADDIFADDR); if (error) return (error); } break; case SIOCGIFADDR_IN6: /* This interface is basically deprecated. use SIOCGIFCONF. */ /* FALLTHROUGH */ case SIOCGIFAFLAG_IN6: case SIOCGIFNETMASK_IN6: case SIOCGIFDSTADDR_IN6: case SIOCGIFALIFETIME_IN6: /* must think again about its semantics */ if (ia == NULL) return (EADDRNOTAVAIL); break; case SIOCSIFALIFETIME_IN6: { struct in6_addrlifetime *lt; if (td != NULL) { error = priv_check(td, PRIV_NETINET_ALIFETIME6); if (error) return (error); } if (ia == NULL) return (EADDRNOTAVAIL); /* sanity for overflow - beware unsigned */ lt = &ifr->ifr_ifru.ifru_lifetime; if (lt->ia6t_vltime != ND6_INFINITE_LIFETIME && lt->ia6t_vltime + time_second < time_second) { return EINVAL; } if (lt->ia6t_pltime != ND6_INFINITE_LIFETIME && lt->ia6t_pltime + time_second < time_second) { return EINVAL; } break; } } switch (cmd) { case SIOCGIFADDR_IN6: ifr->ifr_addr = ia->ia_addr; if ((error = sa6_recoverscope(&ifr->ifr_addr)) != 0) return (error); break; case SIOCGIFDSTADDR_IN6: if ((ifp->if_flags & IFF_POINTOPOINT) == 0) return (EINVAL); /* * XXX: should we check if ifa_dstaddr is NULL and return * an error? 
*/ ifr->ifr_dstaddr = ia->ia_dstaddr; if ((error = sa6_recoverscope(&ifr->ifr_dstaddr)) != 0) return (error); break; case SIOCGIFNETMASK_IN6: ifr->ifr_addr = ia->ia_prefixmask; break; case SIOCGIFAFLAG_IN6: ifr->ifr_ifru.ifru_flags6 = ia->ia6_flags; break; case SIOCGIFSTAT_IN6: if (ifp == NULL) return EINVAL; bzero(&ifr->ifr_ifru.ifru_stat, sizeof(ifr->ifr_ifru.ifru_stat)); ifr->ifr_ifru.ifru_stat = *((struct in6_ifextra *)ifp->if_afdata[AF_INET6])->in6_ifstat; break; case SIOCGIFSTAT_ICMP6: if (ifp == NULL) return EINVAL; bzero(&ifr->ifr_ifru.ifru_icmp6stat, sizeof(ifr->ifr_ifru.ifru_icmp6stat)); ifr->ifr_ifru.ifru_icmp6stat = *((struct in6_ifextra *)ifp->if_afdata[AF_INET6])->icmp6_ifstat; break; case SIOCGIFALIFETIME_IN6: ifr->ifr_ifru.ifru_lifetime = ia->ia6_lifetime; if (ia->ia6_lifetime.ia6t_vltime != ND6_INFINITE_LIFETIME) { time_t maxexpire; struct in6_addrlifetime *retlt = &ifr->ifr_ifru.ifru_lifetime; /* * XXX: adjust expiration time assuming time_t is * signed. */ maxexpire = (-1) & ~((time_t)1 << ((sizeof(maxexpire) * 8) - 1)); if (ia->ia6_lifetime.ia6t_vltime < maxexpire - ia->ia6_updatetime) { retlt->ia6t_expire = ia->ia6_updatetime + ia->ia6_lifetime.ia6t_vltime; } else retlt->ia6t_expire = maxexpire; } if (ia->ia6_lifetime.ia6t_pltime != ND6_INFINITE_LIFETIME) { time_t maxexpire; struct in6_addrlifetime *retlt = &ifr->ifr_ifru.ifru_lifetime; /* * XXX: adjust expiration time assuming time_t is * signed. */ maxexpire = (-1) & ~((time_t)1 << ((sizeof(maxexpire) * 8) - 1)); if (ia->ia6_lifetime.ia6t_pltime < maxexpire - ia->ia6_updatetime) { retlt->ia6t_preferred = ia->ia6_updatetime + ia->ia6_lifetime.ia6t_pltime; } else retlt->ia6t_preferred = maxexpire; } break; case SIOCSIFALIFETIME_IN6: ia->ia6_lifetime = ifr->ifr_ifru.ifru_lifetime; /* for sanity */ if (ia->ia6_lifetime.ia6t_vltime != ND6_INFINITE_LIFETIME) { ia->ia6_lifetime.ia6t_expire = time_second + ia->ia6_lifetime.ia6t_vltime; } else ia->ia6_lifetime.ia6t_expire = 0; if (ia->ia6_lifetime.ia6t_pltime != ND6_INFINITE_LIFETIME) { ia->ia6_lifetime.ia6t_preferred = time_second + ia->ia6_lifetime.ia6t_pltime; } else ia->ia6_lifetime.ia6t_preferred = 0; break; case SIOCAIFADDR_IN6: { int i, error = 0; struct nd_prefixctl pr0; struct nd_prefix *pr; /* * first, make or update the interface address structure, * and link it to the list. */ if ((error = in6_update_ifa(ifp, ifra, ia, 0)) != 0) return (error); if ((ia = in6ifa_ifpwithaddr(ifp, &ifra->ifra_addr.sin6_addr)) == NULL) { /* * this can happen when the user specify the 0 valid * lifetime. */ break; } /* * then, make the prefix on-link on the interface. * XXX: we'd rather create the prefix before the address, but * we need at least one address to install the corresponding * interface route, so we configure the address first. */ /* * convert mask to prefix length (prefixmask has already * been validated in in6_update_ifa(). */ bzero(&pr0, sizeof(pr0)); pr0.ndpr_ifp = ifp; pr0.ndpr_plen = in6_mask2len(&ifra->ifra_prefixmask.sin6_addr, NULL); if (pr0.ndpr_plen == 128) { break; /* we don't need to install a host route. */ } pr0.ndpr_prefix = ifra->ifra_addr; /* apply the mask for safety. */ for (i = 0; i < 4; i++) { pr0.ndpr_prefix.sin6_addr.s6_addr32[i] &= ifra->ifra_prefixmask.sin6_addr.s6_addr32[i]; } /* * XXX: since we don't have an API to set prefix (not address) * lifetimes, we just use the same lifetimes as addresses. * The (temporarily) installed lifetimes can be overridden by * later advertised RAs (when accept_rtadv is non 0), which is * an intended behavior. 
*/ pr0.ndpr_raf_onlink = 1; /* should be configurable? */ pr0.ndpr_raf_auto = ((ifra->ifra_flags & IN6_IFF_AUTOCONF) != 0); pr0.ndpr_vltime = ifra->ifra_lifetime.ia6t_vltime; pr0.ndpr_pltime = ifra->ifra_lifetime.ia6t_pltime; /* add the prefix if not yet. */ if ((pr = nd6_prefix_lookup(&pr0)) == NULL) { /* * nd6_prelist_add will install the corresponding * interface route. */ if ((error = nd6_prelist_add(&pr0, NULL, &pr)) != 0) return (error); if (pr == NULL) { log(LOG_ERR, "nd6_prelist_add succeeded but " "no prefix\n"); return (EINVAL); /* XXX panic here? */ } } /* relate the address to the prefix */ if (ia->ia6_ndpr == NULL) { ia->ia6_ndpr = pr; pr->ndpr_refcnt++; /* * If this is the first autoconf address from the * prefix, create a temporary address as well * (when required). */ if ((ia->ia6_flags & IN6_IFF_AUTOCONF) && V_ip6_use_tempaddr && pr->ndpr_refcnt == 1) { int e; if ((e = in6_tmpifadd(ia, 1, 0)) != 0) { log(LOG_NOTICE, "in6_control: failed " "to create a temporary address, " "errno=%d\n", e); } } } /* * this might affect the status of autoconfigured addresses, * that is, this address might make other addresses detached. */ pfxlist_onlink_check(); if (error == 0 && ia) EVENTHANDLER_INVOKE(ifaddr_event, ifp); break; } case SIOCDIFADDR_IN6: { struct nd_prefix *pr; /* * If the address being deleted is the only one that owns * the corresponding prefix, expire the prefix as well. * XXX: theoretically, we don't have to worry about such * relationship, since we separate the address management * and the prefix management. We do this, however, to provide * as much backward compatibility as possible in terms of * the ioctl operation. * Note that in6_purgeaddr() will decrement ndpr_refcnt. */ pr = ia->ia6_ndpr; in6_purgeaddr(&ia->ia_ifa); if (pr && pr->ndpr_refcnt == 0) prelist_remove(pr); EVENTHANDLER_INVOKE(ifaddr_event, ifp); break; } default: if (ifp == NULL || ifp->if_ioctl == 0) return (EOPNOTSUPP); return ((*ifp->if_ioctl)(ifp, cmd, data)); } return (0); } /* * Update parameters of an IPv6 interface address. * If necessary, a new entry is created and linked into address chains. * This function is separated from in6_control(). * XXX: should this be performed under splnet()? */ int in6_update_ifa(struct ifnet *ifp, struct in6_aliasreq *ifra, struct in6_ifaddr *ia, int flags) { INIT_VNET_INET6(ifp->if_vnet); INIT_VPROCG(TD_TO_VPROCG(curthread)); /* XXX V_hostname needs this */ int error = 0, hostIsNew = 0, plen = -1; struct in6_ifaddr *oia; struct sockaddr_in6 dst6; struct in6_addrlifetime *lt; struct in6_multi_mship *imm; struct in6_multi *in6m_sol; struct rtentry *rt; int delay; char ip6buf[INET6_ADDRSTRLEN]; /* Validate parameters */ if (ifp == NULL || ifra == NULL) /* this maybe redundant */ return (EINVAL); /* * The destination address for a p2p link must have a family * of AF_UNSPEC or AF_INET6. */ if ((ifp->if_flags & IFF_POINTOPOINT) != 0 && ifra->ifra_dstaddr.sin6_family != AF_INET6 && ifra->ifra_dstaddr.sin6_family != AF_UNSPEC) return (EAFNOSUPPORT); /* * validate ifra_prefixmask. don't check sin6_family, netmask * does not carry fields other than sin6_len. */ if (ifra->ifra_prefixmask.sin6_len > sizeof(struct sockaddr_in6)) return (EINVAL); /* * Because the IPv6 address architecture is classless, we require * users to specify a (non 0) prefix length (mask) for a new address. * We also require the prefix (when specified) mask is valid, and thus * reject a non-consecutive mask. 
*/ if (ia == NULL && ifra->ifra_prefixmask.sin6_len == 0) return (EINVAL); if (ifra->ifra_prefixmask.sin6_len != 0) { plen = in6_mask2len(&ifra->ifra_prefixmask.sin6_addr, (u_char *)&ifra->ifra_prefixmask + ifra->ifra_prefixmask.sin6_len); if (plen <= 0) return (EINVAL); } else { /* * In this case, ia must not be NULL. We just use its prefix * length. */ plen = in6_mask2len(&ia->ia_prefixmask.sin6_addr, NULL); } /* * If the destination address on a p2p interface is specified, * and the address is a scoped one, validate/set the scope * zone identifier. */ dst6 = ifra->ifra_dstaddr; if ((ifp->if_flags & (IFF_POINTOPOINT|IFF_LOOPBACK)) != 0 && (dst6.sin6_family == AF_INET6)) { struct in6_addr in6_tmp; u_int32_t zoneid; in6_tmp = dst6.sin6_addr; if (in6_setscope(&in6_tmp, ifp, &zoneid)) return (EINVAL); /* XXX: should be impossible */ if (dst6.sin6_scope_id != 0) { if (dst6.sin6_scope_id != zoneid) return (EINVAL); } else /* user omit to specify the ID. */ dst6.sin6_scope_id = zoneid; /* convert into the internal form */ if (sa6_embedscope(&dst6, 0)) return (EINVAL); /* XXX: should be impossible */ } /* * The destination address can be specified only for a p2p or a * loopback interface. If specified, the corresponding prefix length * must be 128. */ if (ifra->ifra_dstaddr.sin6_family == AF_INET6) { if ((ifp->if_flags & (IFF_POINTOPOINT|IFF_LOOPBACK)) == 0) { /* XXX: noisy message */ nd6log((LOG_INFO, "in6_update_ifa: a destination can " "be specified for a p2p or a loopback IF only\n")); return (EINVAL); } if (plen != 128) { nd6log((LOG_INFO, "in6_update_ifa: prefixlen should " "be 128 when dstaddr is specified\n")); return (EINVAL); } } /* lifetime consistency check */ lt = &ifra->ifra_lifetime; if (lt->ia6t_pltime > lt->ia6t_vltime) return (EINVAL); if (lt->ia6t_vltime == 0) { /* * the following log might be noisy, but this is a typical * configuration mistake or a tool's bug. */ nd6log((LOG_INFO, "in6_update_ifa: valid lifetime is 0 for %s\n", ip6_sprintf(ip6buf, &ifra->ifra_addr.sin6_addr))); if (ia == NULL) return (0); /* there's nothing to do */ } /* * If this is a new address, allocate a new ifaddr and link it * into chains. */ if (ia == NULL) { hostIsNew = 1; /* * When in6_update_ifa() is called in a process of a received * RA, it is called under an interrupt context. So, we should * call malloc with M_NOWAIT. */ ia = (struct in6_ifaddr *) malloc(sizeof(*ia), M_IFADDR, M_NOWAIT); if (ia == NULL) return (ENOBUFS); bzero((caddr_t)ia, sizeof(*ia)); LIST_INIT(&ia->ia6_memberships); /* Initialize the address and masks, and put time stamp */ IFA_LOCK_INIT(&ia->ia_ifa); ia->ia_ifa.ifa_addr = (struct sockaddr *)&ia->ia_addr; ia->ia_addr.sin6_family = AF_INET6; ia->ia_addr.sin6_len = sizeof(ia->ia_addr); ia->ia6_createtime = time_second; if ((ifp->if_flags & (IFF_POINTOPOINT | IFF_LOOPBACK)) != 0) { /* * XXX: some functions expect that ifa_dstaddr is not * NULL for p2p interfaces. 
*/ ia->ia_ifa.ifa_dstaddr = (struct sockaddr *)&ia->ia_dstaddr; } else { ia->ia_ifa.ifa_dstaddr = NULL; } ia->ia_ifa.ifa_netmask = (struct sockaddr *)&ia->ia_prefixmask; ia->ia_ifp = ifp; if ((oia = V_in6_ifaddr) != NULL) { for ( ; oia->ia_next; oia = oia->ia_next) continue; oia->ia_next = ia; } else V_in6_ifaddr = ia; ia->ia_ifa.ifa_refcnt = 1; TAILQ_INSERT_TAIL(&ifp->if_addrlist, &ia->ia_ifa, ifa_list); } /* update timestamp */ ia->ia6_updatetime = time_second; /* set prefix mask */ if (ifra->ifra_prefixmask.sin6_len) { /* * We prohibit changing the prefix length of an existing * address, because * + such an operation should be rare in IPv6, and * + the operation would confuse prefix management. */ if (ia->ia_prefixmask.sin6_len && in6_mask2len(&ia->ia_prefixmask.sin6_addr, NULL) != plen) { nd6log((LOG_INFO, "in6_update_ifa: the prefix length of an" " existing (%s) address should not be changed\n", ip6_sprintf(ip6buf, &ia->ia_addr.sin6_addr))); error = EINVAL; goto unlink; } ia->ia_prefixmask = ifra->ifra_prefixmask; } /* * If a new destination address is specified, scrub the old one and * install the new destination. Note that the interface must be * p2p or loopback (see the check above.) */ if (dst6.sin6_family == AF_INET6 && !IN6_ARE_ADDR_EQUAL(&dst6.sin6_addr, &ia->ia_dstaddr.sin6_addr)) { int e; if ((ia->ia_flags & IFA_ROUTE) != 0 && (e = rtinit(&(ia->ia_ifa), (int)RTM_DELETE, RTF_HOST)) != 0) { nd6log((LOG_ERR, "in6_update_ifa: failed to remove " "a route to the old destination: %s\n", ip6_sprintf(ip6buf, &ia->ia_addr.sin6_addr))); /* proceed anyway... */ } else ia->ia_flags &= ~IFA_ROUTE; ia->ia_dstaddr = dst6; } /* * Set lifetimes. We do not refer to ia6t_expire and ia6t_preferred * to see if the address is deprecated or invalidated, but initialize * these members for applications. */ ia->ia6_lifetime = ifra->ifra_lifetime; if (ia->ia6_lifetime.ia6t_vltime != ND6_INFINITE_LIFETIME) { ia->ia6_lifetime.ia6t_expire = time_second + ia->ia6_lifetime.ia6t_vltime; } else ia->ia6_lifetime.ia6t_expire = 0; if (ia->ia6_lifetime.ia6t_pltime != ND6_INFINITE_LIFETIME) { ia->ia6_lifetime.ia6t_preferred = time_second + ia->ia6_lifetime.ia6t_pltime; } else ia->ia6_lifetime.ia6t_preferred = 0; /* reset the interface and routing table appropriately. */ if ((error = in6_ifinit(ifp, ia, &ifra->ifra_addr, hostIsNew)) != 0) goto unlink; /* * configure address flags. */ ia->ia6_flags = ifra->ifra_flags; /* * backward compatibility - if IN6_IFF_DEPRECATED is set from the * userland, make it deprecated. */ if ((ifra->ifra_flags & IN6_IFF_DEPRECATED) != 0) { ia->ia6_lifetime.ia6t_pltime = 0; ia->ia6_lifetime.ia6t_preferred = time_second; } /* * Make the address tentative before joining multicast addresses, * so that corresponding MLD responses would not have a tentative * source address. */ ia->ia6_flags &= ~IN6_IFF_DUPLICATED; /* safety */ if (hostIsNew && in6if_do_dad(ifp)) ia->ia6_flags |= IN6_IFF_TENTATIVE; /* * We are done if we have simply modified an existing address. */ if (!hostIsNew) return (error); /* * Beyond this point, we should call in6_purgeaddr upon an error, * not just go to unlink. 
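 * (The cleanup label below does exactly that.)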
*/ /* Join necessary multicast groups */ in6m_sol = NULL; if ((ifp->if_flags & IFF_MULTICAST) != 0) { struct sockaddr_in6 mltaddr, mltmask; struct in6_addr llsol; /* join solicited multicast addr for new host id */ bzero(&llsol, sizeof(struct in6_addr)); llsol.s6_addr32[0] = IPV6_ADDR_INT32_MLL; llsol.s6_addr32[1] = 0; llsol.s6_addr32[2] = htonl(1); llsol.s6_addr32[3] = ifra->ifra_addr.sin6_addr.s6_addr32[3]; llsol.s6_addr8[12] = 0xff; if ((error = in6_setscope(&llsol, ifp, NULL)) != 0) { /* XXX: should not happen */ log(LOG_ERR, "in6_update_ifa: " "in6_setscope failed\n"); goto cleanup; } delay = 0; if ((flags & IN6_IFAUPDATE_DADDELAY)) { /* * We need a random delay for DAD on the address * being configured. It also means delaying * transmission of the corresponding MLD report to * avoid report collision. * [draft-ietf-ipv6-rfc2462bis-02.txt] */ delay = arc4random() % (MAX_RTR_SOLICITATION_DELAY * hz); } imm = in6_joingroup(ifp, &llsol, &error, delay); if (imm == NULL) { nd6log((LOG_WARNING, "in6_update_ifa: addmulti failed for " "%s on %s (errno=%d)\n", ip6_sprintf(ip6buf, &llsol), if_name(ifp), error)); in6_purgeaddr((struct ifaddr *)ia); return (error); } LIST_INSERT_HEAD(&ia->ia6_memberships, imm, i6mm_chain); in6m_sol = imm->i6mm_maddr; bzero(&mltmask, sizeof(mltmask)); mltmask.sin6_len = sizeof(struct sockaddr_in6); mltmask.sin6_family = AF_INET6; mltmask.sin6_addr = in6mask32; #define MLTMASK_LEN 4 /* mltmask's masklen (=32bit=4octet) */ /* * join link-local all-nodes address */ bzero(&mltaddr, sizeof(mltaddr)); mltaddr.sin6_len = sizeof(struct sockaddr_in6); mltaddr.sin6_family = AF_INET6; mltaddr.sin6_addr = in6addr_linklocal_allnodes; if ((error = in6_setscope(&mltaddr.sin6_addr, ifp, NULL)) != 0) goto cleanup; /* XXX: should not fail */ /* * XXX: do we really need this automatic routes? * We should probably reconsider this stuff. Most applications * actually do not need the routes, since they usually specify * the outgoing interface. */ rt = rtalloc1((struct sockaddr *)&mltaddr, 0, 0UL); if (rt) { /* XXX: only works in !SCOPEDROUTING case. */ if (memcmp(&mltaddr.sin6_addr, &((struct sockaddr_in6 *)rt_key(rt))->sin6_addr, MLTMASK_LEN)) { RTFREE_LOCKED(rt); rt = NULL; } } if (!rt) { - /* XXX: we need RTF_CLONING to fake nd6_rtrequest */ error = rtrequest(RTM_ADD, (struct sockaddr *)&mltaddr, (struct sockaddr *)&ia->ia_addr, - (struct sockaddr *)&mltmask, RTF_UP | RTF_CLONING, + (struct sockaddr *)&mltmask, RTF_UP, (struct rtentry **)0); if (error) goto cleanup; } else { RTFREE_LOCKED(rt); } imm = in6_joingroup(ifp, &mltaddr.sin6_addr, &error, 0); if (!imm) { nd6log((LOG_WARNING, "in6_update_ifa: addmulti failed for " "%s on %s (errno=%d)\n", ip6_sprintf(ip6buf, &mltaddr.sin6_addr), if_name(ifp), error)); goto cleanup; } LIST_INSERT_HEAD(&ia->ia6_memberships, imm, i6mm_chain); /* * join node information group address */ #define hostnamelen strlen(V_hostname) delay = 0; if ((flags & IN6_IFAUPDATE_DADDELAY)) { /* * The spec doesn't say anything about delay for this * group, but the same logic should apply. */ delay = arc4random() % (MAX_RTR_SOLICITATION_DELAY * hz); } mtx_lock(&hostname_mtx); if (in6_nigroup(ifp, V_hostname, hostnamelen, &mltaddr.sin6_addr) == 0) { mtx_unlock(&hostname_mtx); imm = in6_joingroup(ifp, &mltaddr.sin6_addr, &error, delay); /* XXX jinmei */ if (!imm) { nd6log((LOG_WARNING, "in6_update_ifa: " "addmulti failed for %s on %s " "(errno=%d)\n", ip6_sprintf(ip6buf, &mltaddr.sin6_addr), if_name(ifp), error)); /* XXX not very fatal, go on... 
*/ } else { LIST_INSERT_HEAD(&ia->ia6_memberships, imm, i6mm_chain); } } else mtx_unlock(&hostname_mtx); #undef hostnamelen /* * join interface-local all-nodes address. * (ff01::1%ifN, and ff01::%ifN/32) */ mltaddr.sin6_addr = in6addr_nodelocal_allnodes; if ((error = in6_setscope(&mltaddr.sin6_addr, ifp, NULL)) != 0) goto cleanup; /* XXX: should not fail */ /* XXX: again, do we really need the route? */ rt = rtalloc1((struct sockaddr *)&mltaddr, 0, 0UL); if (rt) { if (memcmp(&mltaddr.sin6_addr, &((struct sockaddr_in6 *)rt_key(rt))->sin6_addr, MLTMASK_LEN)) { RTFREE_LOCKED(rt); rt = NULL; } } if (!rt) { error = rtrequest(RTM_ADD, (struct sockaddr *)&mltaddr, (struct sockaddr *)&ia->ia_addr, - (struct sockaddr *)&mltmask, RTF_UP | RTF_CLONING, + (struct sockaddr *)&mltmask, RTF_UP, (struct rtentry **)0); if (error) goto cleanup; } else RTFREE_LOCKED(rt); imm = in6_joingroup(ifp, &mltaddr.sin6_addr, &error, 0); if (!imm) { nd6log((LOG_WARNING, "in6_update_ifa: " "addmulti failed for %s on %s " "(errno=%d)\n", ip6_sprintf(ip6buf, &mltaddr.sin6_addr), if_name(ifp), error)); goto cleanup; } LIST_INSERT_HEAD(&ia->ia6_memberships, imm, i6mm_chain); #undef MLTMASK_LEN } /* * Perform DAD, if needed. * XXX It may be of use, if we can administratively * disable DAD. */ if (hostIsNew && in6if_do_dad(ifp) && ((ifra->ifra_flags & IN6_IFF_NODAD) == 0) && (ia->ia6_flags & IN6_IFF_TENTATIVE)) { int mindelay, maxdelay; delay = 0; if ((flags & IN6_IFAUPDATE_DADDELAY)) { /* * We need to impose a delay before sending an NS * for DAD. Check if we also needed a delay for the * corresponding MLD message. If we did, the delay * should be larger than the MLD delay (this could be * relaxed a bit, but this simple logic is at least * safe). */ mindelay = 0; if (in6m_sol != NULL && in6m_sol->in6m_state == MLD_REPORTPENDING) { mindelay = in6m_sol->in6m_timer; } maxdelay = MAX_RTR_SOLICITATION_DELAY * hz; if (maxdelay - mindelay == 0) delay = 0; else { delay = (arc4random() % (maxdelay - mindelay)) + mindelay; } } nd6_dad_start((struct ifaddr *)ia, delay); } return (error); unlink: /* * XXX: if a change of an existing address failed, keep the entry * anyway. */ if (hostIsNew) in6_unlink_ifa(ia, ifp); return (error); cleanup: in6_purgeaddr(&ia->ia_ifa); return error; } void in6_purgeaddr(struct ifaddr *ifa) { struct ifnet *ifp = ifa->ifa_ifp; struct in6_ifaddr *ia = (struct in6_ifaddr *) ifa; - char ip6buf[INET6_ADDRSTRLEN]; struct in6_multi_mship *imm; /* stop DAD processing */ nd6_dad_stop(ifa); + IF_AFDATA_LOCK(ifp); + lla_lookup(LLTABLE6(ifp), (LLE_DELETE | LLE_IFADDR), + (struct sockaddr *)&ia->ia_addr); + IF_AFDATA_UNLOCK(ifp); + /* - * delete route to the destination of the address being purged. - * The interface must be p2p or loopback in this case. - */ - if ((ia->ia_flags & IFA_ROUTE) != 0 && ia->ia_dstaddr.sin6_len != 0) { - int e; - - if ((e = rtinit(&(ia->ia_ifa), (int)RTM_DELETE, RTF_HOST)) - != 0) { - log(LOG_ERR, "in6_purgeaddr: failed to remove " - "a route to the p2p destination: %s on %s, " - "errno=%d\n", - ip6_sprintf(ip6buf, &ia->ia_addr.sin6_addr), - if_name(ifp), e); - /* proceed anyway... */ - } else - ia->ia_flags &= ~IFA_ROUTE; - } - - /* Remove ownaddr's loopback rtentry, if it exists. 
*/ - in6_ifremloop(&(ia->ia_ifa)); - - /* * leave from multicast groups we have joined for the interface */ while ((imm = ia->ia6_memberships.lh_first) != NULL) { LIST_REMOVE(imm, i6mm_chain); in6_leavegroup(imm); } in6_unlink_ifa(ia, ifp); } static void in6_unlink_ifa(struct in6_ifaddr *ia, struct ifnet *ifp) { INIT_VNET_INET6(ifp->if_vnet); struct in6_ifaddr *oia; int s = splnet(); TAILQ_REMOVE(&ifp->if_addrlist, &ia->ia_ifa, ifa_list); oia = ia; if (oia == (ia = V_in6_ifaddr)) V_in6_ifaddr = ia->ia_next; else { while (ia->ia_next && (ia->ia_next != oia)) ia = ia->ia_next; if (ia->ia_next) ia->ia_next = oia->ia_next; else { /* search failed */ printf("Couldn't unlink in6_ifaddr from in6_ifaddr\n"); } } /* * Release the reference to the base prefix. There should be a * positive reference. */ if (oia->ia6_ndpr == NULL) { nd6log((LOG_NOTICE, "in6_unlink_ifa: autoconf'ed address " "%p has no prefix\n", oia)); } else { oia->ia6_ndpr->ndpr_refcnt--; oia->ia6_ndpr = NULL; } /* * Also, if the address being removed is autoconf'ed, call * pfxlist_onlink_check() since the release might affect the status of * other (detached) addresses. */ if ((oia->ia6_flags & IN6_IFF_AUTOCONF)) { pfxlist_onlink_check(); } /* * release another refcnt for the link from in6_ifaddr. * Note that we should decrement the refcnt at least once for all *BSD. */ IFAFREE(&oia->ia_ifa); splx(s); } void in6_purgeif(struct ifnet *ifp) { struct ifaddr *ifa, *nifa; for (ifa = TAILQ_FIRST(&ifp->if_addrlist); ifa != NULL; ifa = nifa) { nifa = TAILQ_NEXT(ifa, ifa_list); if (ifa->ifa_addr->sa_family != AF_INET6) continue; in6_purgeaddr(ifa); } in6_ifdetach(ifp); } /* * SIOC[GAD]LIFADDR. * SIOCGLIFADDR: get first address. (?) * SIOCGLIFADDR with IFLR_PREFIX: * get first address that matches the specified prefix. * SIOCALIFADDR: add the specified address. * SIOCALIFADDR with IFLR_PREFIX: * add the specified prefix, filling hostid part from * the first link-local address. prefixlen must be <= 64. * SIOCDLIFADDR: delete the specified address. * SIOCDLIFADDR with IFLR_PREFIX: * delete the first address that matches the specified prefix. * return values: * EINVAL on invalid parameters * EADDRNOTAVAIL on prefix match failed/specified address not found * other values may be returned from in6_ioctl() * * NOTE: SIOCALIFADDR(with IFLR_PREFIX set) allows prefixlen less than 64. * this is to accomodate address naming scheme other than RFC2374, * in the future. * RFC2373 defines interface id to be 64bit, but it allows non-RFC2374 * address encoding scheme. 
(see figure on page 8) */ static int in6_lifaddr_ioctl(struct socket *so, u_long cmd, caddr_t data, struct ifnet *ifp, struct thread *td) { struct if_laddrreq *iflr = (struct if_laddrreq *)data; struct ifaddr *ifa; struct sockaddr *sa; /* sanity checks */ if (!data || !ifp) { panic("invalid argument to in6_lifaddr_ioctl"); /* NOTREACHED */ } switch (cmd) { case SIOCGLIFADDR: /* address must be specified on GET with IFLR_PREFIX */ if ((iflr->flags & IFLR_PREFIX) == 0) break; /* FALLTHROUGH */ case SIOCALIFADDR: case SIOCDLIFADDR: /* address must be specified on ADD and DELETE */ sa = (struct sockaddr *)&iflr->addr; if (sa->sa_family != AF_INET6) return EINVAL; if (sa->sa_len != sizeof(struct sockaddr_in6)) return EINVAL; /* XXX need improvement */ sa = (struct sockaddr *)&iflr->dstaddr; if (sa->sa_family && sa->sa_family != AF_INET6) return EINVAL; if (sa->sa_len && sa->sa_len != sizeof(struct sockaddr_in6)) return EINVAL; break; default: /* shouldn't happen */ #if 0 panic("invalid cmd to in6_lifaddr_ioctl"); /* NOTREACHED */ #else return EOPNOTSUPP; #endif } if (sizeof(struct in6_addr) * 8 < iflr->prefixlen) return EINVAL; switch (cmd) { case SIOCALIFADDR: { struct in6_aliasreq ifra; struct in6_addr *hostid = NULL; int prefixlen; if ((iflr->flags & IFLR_PREFIX) != 0) { struct sockaddr_in6 *sin6; /* * hostid is to fill in the hostid part of the * address. hostid points to the first link-local * address attached to the interface. */ ifa = (struct ifaddr *)in6ifa_ifpforlinklocal(ifp, 0); if (!ifa) return EADDRNOTAVAIL; hostid = IFA_IN6(ifa); /* prefixlen must be <= 64. */ if (64 < iflr->prefixlen) return EINVAL; prefixlen = iflr->prefixlen; /* hostid part must be zero. */ sin6 = (struct sockaddr_in6 *)&iflr->addr; if (sin6->sin6_addr.s6_addr32[2] != 0 || sin6->sin6_addr.s6_addr32[3] != 0) { return EINVAL; } } else prefixlen = iflr->prefixlen; /* copy args to in6_aliasreq, perform ioctl(SIOCAIFADDR_IN6). */ bzero(&ifra, sizeof(ifra)); bcopy(iflr->iflr_name, ifra.ifra_name, sizeof(ifra.ifra_name)); bcopy(&iflr->addr, &ifra.ifra_addr, ((struct sockaddr *)&iflr->addr)->sa_len); if (hostid) { /* fill in hostid part */ ifra.ifra_addr.sin6_addr.s6_addr32[2] = hostid->s6_addr32[2]; ifra.ifra_addr.sin6_addr.s6_addr32[3] = hostid->s6_addr32[3]; } if (((struct sockaddr *)&iflr->dstaddr)->sa_family) { /* XXX */ bcopy(&iflr->dstaddr, &ifra.ifra_dstaddr, ((struct sockaddr *)&iflr->dstaddr)->sa_len); if (hostid) { ifra.ifra_dstaddr.sin6_addr.s6_addr32[2] = hostid->s6_addr32[2]; ifra.ifra_dstaddr.sin6_addr.s6_addr32[3] = hostid->s6_addr32[3]; } } ifra.ifra_prefixmask.sin6_len = sizeof(struct sockaddr_in6); in6_prefixlen2mask(&ifra.ifra_prefixmask.sin6_addr, prefixlen); ifra.ifra_flags = iflr->flags & ~IFLR_PREFIX; return in6_control(so, SIOCAIFADDR_IN6, (caddr_t)&ifra, ifp, td); } case SIOCGLIFADDR: case SIOCDLIFADDR: { struct in6_ifaddr *ia; struct in6_addr mask, candidate, match; struct sockaddr_in6 *sin6; int cmp; bzero(&mask, sizeof(mask)); if (iflr->flags & IFLR_PREFIX) { /* lookup a prefix rather than address. 
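 * The mask derived from iflr->prefixlen is applied to both the
 * supplied address and each candidate before they are compared.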
*/ in6_prefixlen2mask(&mask, iflr->prefixlen); sin6 = (struct sockaddr_in6 *)&iflr->addr; bcopy(&sin6->sin6_addr, &match, sizeof(match)); match.s6_addr32[0] &= mask.s6_addr32[0]; match.s6_addr32[1] &= mask.s6_addr32[1]; match.s6_addr32[2] &= mask.s6_addr32[2]; match.s6_addr32[3] &= mask.s6_addr32[3]; /* if you set extra bits, that's wrong */ if (bcmp(&match, &sin6->sin6_addr, sizeof(match))) return EINVAL; cmp = 1; } else { if (cmd == SIOCGLIFADDR) { /* on getting an address, take the 1st match */ cmp = 0; /* XXX */ } else { /* on deleting an address, do exact match */ in6_prefixlen2mask(&mask, 128); sin6 = (struct sockaddr_in6 *)&iflr->addr; bcopy(&sin6->sin6_addr, &match, sizeof(match)); cmp = 1; } } TAILQ_FOREACH(ifa, &ifp->if_addrlist, ifa_list) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; if (!cmp) break; /* * XXX: this is adhoc, but is necessary to allow * a user to specify fe80::/64 (not /10) for a * link-local address. */ bcopy(IFA_IN6(ifa), &candidate, sizeof(candidate)); in6_clearscope(&candidate); candidate.s6_addr32[0] &= mask.s6_addr32[0]; candidate.s6_addr32[1] &= mask.s6_addr32[1]; candidate.s6_addr32[2] &= mask.s6_addr32[2]; candidate.s6_addr32[3] &= mask.s6_addr32[3]; if (IN6_ARE_ADDR_EQUAL(&candidate, &match)) break; } if (!ifa) return EADDRNOTAVAIL; ia = ifa2ia6(ifa); if (cmd == SIOCGLIFADDR) { int error; /* fill in the if_laddrreq structure */ bcopy(&ia->ia_addr, &iflr->addr, ia->ia_addr.sin6_len); error = sa6_recoverscope( (struct sockaddr_in6 *)&iflr->addr); if (error != 0) return (error); if ((ifp->if_flags & IFF_POINTOPOINT) != 0) { bcopy(&ia->ia_dstaddr, &iflr->dstaddr, ia->ia_dstaddr.sin6_len); error = sa6_recoverscope( (struct sockaddr_in6 *)&iflr->dstaddr); if (error != 0) return (error); } else bzero(&iflr->dstaddr, sizeof(iflr->dstaddr)); iflr->prefixlen = in6_mask2len(&ia->ia_prefixmask.sin6_addr, NULL); iflr->flags = ia->ia6_flags; /* XXX */ return 0; } else { struct in6_aliasreq ifra; /* fill in6_aliasreq and do ioctl(SIOCDIFADDR_IN6) */ bzero(&ifra, sizeof(ifra)); bcopy(iflr->iflr_name, ifra.ifra_name, sizeof(ifra.ifra_name)); bcopy(&ia->ia_addr, &ifra.ifra_addr, ia->ia_addr.sin6_len); if ((ifp->if_flags & IFF_POINTOPOINT) != 0) { bcopy(&ia->ia_dstaddr, &ifra.ifra_dstaddr, ia->ia_dstaddr.sin6_len); } else { bzero(&ifra.ifra_dstaddr, sizeof(ifra.ifra_dstaddr)); } bcopy(&ia->ia_prefixmask, &ifra.ifra_dstaddr, ia->ia_prefixmask.sin6_len); ifra.ifra_flags = ia->ia6_flags; return in6_control(so, SIOCDIFADDR_IN6, (caddr_t)&ifra, ifp, td); } } } return EOPNOTSUPP; /* just for safety */ } /* * Initialize an interface's intetnet6 address * and routing table entry. */ static int in6_ifinit(struct ifnet *ifp, struct in6_ifaddr *ia, struct sockaddr_in6 *sin6, int newhost) { int error = 0, plen, ifacount = 0; int s = splimp(); struct ifaddr *ifa; /* * Give the interface a chance to initialize * if this is its first address, * and to validate the address if necessary. */ TAILQ_FOREACH(ifa, &ifp->if_addrlist, ifa_list) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; ifacount++; } ia->ia_addr = *sin6; if (ifacount <= 1 && ifp->if_ioctl) { IFF_LOCKGIANT(ifp); error = (*ifp->if_ioctl)(ifp, SIOCSIFADDR, (caddr_t)ia); IFF_UNLOCKGIANT(ifp); if (error) { splx(s); return (error); } } splx(s); ia->ia_ifa.ifa_metric = ifp->if_metric; /* we could do in(6)_socktrim here, but just omit it at this moment. */ - if (newhost) { - /* - * set the rtrequest function to create llinfo. 
It also - * adjust outgoing interface of the route for the local - * address when called via in6_ifaddloop() below. - */ - ia->ia_ifa.ifa_rtrequest = nd6_rtrequest; - } - /* * Special case: * If a new destination address is specified for a point-to-point * interface, install a route to the destination as an interface - * direct route. In addition, if the link is expected to have neighbor - * cache entries, specify RTF_LLINFO so that a cache entry for the - * destination address will be created. - * created + * direct route. * XXX: the logic below rejects assigning multiple addresses on a p2p * interface that share the same destination. */ +#if 0 /* QL - verify */ plen = in6_mask2len(&ia->ia_prefixmask.sin6_addr, NULL); /* XXX */ if (!(ia->ia_flags & IFA_ROUTE) && plen == 128 && ia->ia_dstaddr.sin6_family == AF_INET6) { int rtflags = RTF_UP | RTF_HOST; struct rtentry *rt = NULL, **rtp = NULL; if (nd6_need_cache(ifp) != 0) { - rtflags |= RTF_LLINFO; rtp = &rt; } error = rtrequest(RTM_ADD, (struct sockaddr *)&ia->ia_dstaddr, (struct sockaddr *)&ia->ia_addr, (struct sockaddr *)&ia->ia_prefixmask, ia->ia_flags | rtflags, rtp); if (error != 0) return (error); if (rt != NULL) { struct llinfo_nd6 *ln; RT_LOCK(rt); ln = (struct llinfo_nd6 *)rt->rt_llinfo; if (ln != NULL) { /* * Set the state to STALE because we don't * have to perform address resolution on this * link. */ ln->ln_state = ND6_LLINFO_STALE; } RT_REMREF(rt); RT_UNLOCK(rt); } ia->ia_flags |= IFA_ROUTE; } - if (plen < 128) { - /* - * The RTF_CLONING flag is necessary for in6_is_ifloop_auto(). - */ - ia->ia_ifa.ifa_flags |= RTF_CLONING; +#else + plen = in6_mask2len(&ia->ia_prefixmask.sin6_addr, NULL); /* XXX */ + if (!(ia->ia_flags & IFA_ROUTE) && plen == 128 && + ia->ia_dstaddr.sin6_family == AF_INET6) { + if ((error = rtinit(&(ia->ia_ifa), (int)RTM_ADD, + RTF_UP | RTF_HOST)) != 0) + return (error); + ia->ia_flags |= IFA_ROUTE; } +#endif /* Add ownaddr as loopback rtentry, if necessary (ex. on p2p link). */ - if (newhost) - in6_ifaddloop(&(ia->ia_ifa)); + if (newhost) { + struct llentry *ln; + IF_AFDATA_LOCK(ifp); + ia->ia_ifa.ifa_rtrequest = NULL; + + /* XXX QL + * we need to report rt_newaddrmsg + */ + ln = lla_lookup(LLTABLE6(ifp), (LLE_CREATE | LLE_IFADDR | LLE_EXCLUSIVE), + (struct sockaddr *)&ia->ia_addr); + IF_AFDATA_UNLOCK(ifp); + if (ln) { + ln->la_expire = 0; /* for IPv6 this means permanent */ + ln->ln_state = ND6_LLINFO_REACHABLE; + LLE_WUNLOCK(ln); + } + } + return (error); } struct in6_multi_mship * in6_joingroup(struct ifnet *ifp, struct in6_addr *addr, int *errorp, int delay) { struct in6_multi_mship *imm; imm = malloc(sizeof(*imm), M_IP6MADDR, M_NOWAIT); if (!imm) { *errorp = ENOBUFS; return NULL; } imm->i6mm_maddr = in6_addmulti(addr, ifp, errorp, delay); if (!imm->i6mm_maddr) { /* *errorp is alrady set */ free(imm, M_IP6MADDR); return NULL; } return imm; } int in6_leavegroup(struct in6_multi_mship *imm) { if (imm->i6mm_maddr) in6_delmulti(imm->i6mm_maddr); free(imm, M_IP6MADDR); return 0; } /* * Find an IPv6 interface link-local address specific to an interface. */ struct in6_ifaddr * in6ifa_ifpforlinklocal(struct ifnet *ifp, int ignoreflags) { struct ifaddr *ifa; TAILQ_FOREACH(ifa, &ifp->if_addrlist, ifa_list) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; if (IN6_IS_ADDR_LINKLOCAL(IFA_IN6(ifa))) { if ((((struct in6_ifaddr *)ifa)->ia6_flags & ignoreflags) != 0) continue; break; } } return ((struct in6_ifaddr *)ifa); } /* * find the internet address corresponding to a given interface and address. 
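 * Returns NULL if the interface owns no such address.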
*/ struct in6_ifaddr * in6ifa_ifpwithaddr(struct ifnet *ifp, struct in6_addr *addr) { struct ifaddr *ifa; TAILQ_FOREACH(ifa, &ifp->if_addrlist, ifa_list) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; if (IN6_ARE_ADDR_EQUAL(addr, IFA_IN6(ifa))) break; } return ((struct in6_ifaddr *)ifa); } /* * Convert IP6 address to printable (loggable) representation. Caller * has to make sure that ip6buf is at least INET6_ADDRSTRLEN long. */ static char digits[] = "0123456789abcdef"; char * ip6_sprintf(char *ip6buf, const struct in6_addr *addr) { int i; char *cp; const u_int16_t *a = (const u_int16_t *)addr; const u_int8_t *d; int dcolon = 0, zero = 0; cp = ip6buf; for (i = 0; i < 8; i++) { if (dcolon == 1) { if (*a == 0) { if (i == 7) *cp++ = ':'; a++; continue; } else dcolon = 2; } if (*a == 0) { if (dcolon == 0 && *(a + 1) == 0) { if (i == 0) *cp++ = ':'; *cp++ = ':'; dcolon = 1; } else { *cp++ = '0'; *cp++ = ':'; } a++; continue; } d = (const u_char *)a; /* Try to eliminate leading zeros in printout like in :0001. */ zero = 1; *cp = digits[*d >> 4]; if (*cp != '0') { zero = 0; cp++; } *cp = digits[*d++ & 0xf]; if (zero == 0 || (*cp != '0')) { zero = 0; cp++; } *cp = digits[*d >> 4]; if (zero == 0 || (*cp != '0')) { zero = 0; cp++; } *cp++ = digits[*d & 0xf]; *cp++ = ':'; a++; } *--cp = '\0'; return (ip6buf); } int in6_localaddr(struct in6_addr *in6) { INIT_VNET_INET6(curvnet); struct in6_ifaddr *ia; if (IN6_IS_ADDR_LOOPBACK(in6) || IN6_IS_ADDR_LINKLOCAL(in6)) return 1; for (ia = V_in6_ifaddr; ia; ia = ia->ia_next) { if (IN6_ARE_MASKED_ADDR_EQUAL(in6, &ia->ia_addr.sin6_addr, &ia->ia_prefixmask.sin6_addr)) { return 1; } } return (0); } int in6_is_addr_deprecated(struct sockaddr_in6 *sa6) { INIT_VNET_INET6(curvnet); struct in6_ifaddr *ia; for (ia = V_in6_ifaddr; ia; ia = ia->ia_next) { if (IN6_ARE_ADDR_EQUAL(&ia->ia_addr.sin6_addr, &sa6->sin6_addr) && (ia->ia6_flags & IN6_IFF_DEPRECATED) != 0) return (1); /* true */ /* XXX: do we still have to go thru the rest of the list? */ } return (0); /* false */ } /* * return length of part which dst and src are equal * hard coding... */ int in6_matchlen(struct in6_addr *src, struct in6_addr *dst) { int match = 0; u_char *s = (u_char *)src, *d = (u_char *)dst; u_char *lim = s + 16, r; while (s < lim) if ((r = (*d++ ^ *s++)) != 0) { while (r < 128) { match++; r <<= 1; } break; } else match += 8; return match; } /* XXX: to be scope conscious */ int in6_are_prefix_equal(struct in6_addr *p1, struct in6_addr *p2, int len) { int bytelen, bitlen; /* sanity check */ if (0 > len || len > 128) { log(LOG_ERR, "in6_are_prefix_equal: invalid prefix length(%d)\n", len); return (0); } bytelen = len / 8; bitlen = len % 8; if (bcmp(&p1->s6_addr, &p2->s6_addr, bytelen)) return (0); if (bitlen != 0 && p1->s6_addr[bytelen] >> (8 - bitlen) != p2->s6_addr[bytelen] >> (8 - bitlen)) return (0); return (1); } void in6_prefixlen2mask(struct in6_addr *maskp, int len) { u_char maskarray[8] = {0x80, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc, 0xfe, 0xff}; int bytelen, bitlen, i; /* sanity check */ if (0 > len || len > 128) { log(LOG_ERR, "in6_prefixlen2mask: invalid prefix length(%d)\n", len); return; } bzero(maskp, sizeof(*maskp)); bytelen = len / 8; bitlen = len % 8; for (i = 0; i < bytelen; i++) maskp->s6_addr[i] = 0xff; if (bitlen) maskp->s6_addr[bytelen] = maskarray[bitlen - 1]; } /* * return the best address out of the same scope. if no address was * found, return the first valid address from designated IF. 
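 * Deprecated addresses are remembered separately and returned only as
 * a last resort.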
*/ struct in6_ifaddr * in6_ifawithifp(struct ifnet *ifp, struct in6_addr *dst) { INIT_VNET_INET6(curvnet); int dst_scope = in6_addrscope(dst), blen = -1, tlen; struct ifaddr *ifa; struct in6_ifaddr *besta = 0; struct in6_ifaddr *dep[2]; /* last-resort: deprecated */ dep[0] = dep[1] = NULL; /* * We first look for addresses in the same scope. * If there is one, return it. * If two or more, return one which matches the dst longest. * If none, return one of global addresses assigned other ifs. */ TAILQ_FOREACH(ifa, &ifp->if_addrlist, ifa_list) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; if (((struct in6_ifaddr *)ifa)->ia6_flags & IN6_IFF_ANYCAST) continue; /* XXX: is there any case to allow anycast? */ if (((struct in6_ifaddr *)ifa)->ia6_flags & IN6_IFF_NOTREADY) continue; /* don't use this interface */ if (((struct in6_ifaddr *)ifa)->ia6_flags & IN6_IFF_DETACHED) continue; if (((struct in6_ifaddr *)ifa)->ia6_flags & IN6_IFF_DEPRECATED) { if (V_ip6_use_deprecated) dep[0] = (struct in6_ifaddr *)ifa; continue; } if (dst_scope == in6_addrscope(IFA_IN6(ifa))) { /* * call in6_matchlen() as few as possible */ if (besta) { if (blen == -1) blen = in6_matchlen(&besta->ia_addr.sin6_addr, dst); tlen = in6_matchlen(IFA_IN6(ifa), dst); if (tlen > blen) { blen = tlen; besta = (struct in6_ifaddr *)ifa; } } else besta = (struct in6_ifaddr *)ifa; } } if (besta) return (besta); TAILQ_FOREACH(ifa, &ifp->if_addrlist, ifa_list) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; if (((struct in6_ifaddr *)ifa)->ia6_flags & IN6_IFF_ANYCAST) continue; /* XXX: is there any case to allow anycast? */ if (((struct in6_ifaddr *)ifa)->ia6_flags & IN6_IFF_NOTREADY) continue; /* don't use this interface */ if (((struct in6_ifaddr *)ifa)->ia6_flags & IN6_IFF_DETACHED) continue; if (((struct in6_ifaddr *)ifa)->ia6_flags & IN6_IFF_DEPRECATED) { if (V_ip6_use_deprecated) dep[1] = (struct in6_ifaddr *)ifa; continue; } return (struct in6_ifaddr *)ifa; } /* use the last-resort values, that are, deprecated addresses */ if (dep[0]) return dep[0]; if (dep[1]) return dep[1]; return NULL; } /* * perform DAD when interface becomes IFF_UP. */ void in6_if_up(struct ifnet *ifp) { struct ifaddr *ifa; struct in6_ifaddr *ia; TAILQ_FOREACH(ifa, &ifp->if_addrlist, ifa_list) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; ia = (struct in6_ifaddr *)ifa; if (ia->ia6_flags & IN6_IFF_TENTATIVE) { /* * The TENTATIVE flag was likely set by hand * beforehand, implicitly indicating the need for DAD. * We may be able to skip the random delay in this * case, but we impose delays just in case. */ nd6_dad_start(ifa, arc4random() % (MAX_RTR_SOLICITATION_DELAY * hz)); } } /* * special cases, like 6to4, are handled in in6_ifattach */ in6_ifattach(ifp, NULL); } int in6if_do_dad(struct ifnet *ifp) { if ((ifp->if_flags & IFF_LOOPBACK) != 0) return (0); switch (ifp->if_type) { #ifdef IFT_DUMMY case IFT_DUMMY: #endif case IFT_FAITH: /* * These interfaces do not have the IFF_LOOPBACK flag, * but loop packets back. We do not have to do DAD on such * interfaces. We should even omit it, because loop-backed * NS would confuse the DAD procedure. */ return (0); default: /* * Our DAD routine requires the interface up and running. * However, some interfaces can be up before the RUNNING * status. Additionaly, users may try to assign addresses * before the interface becomes up (or running). * We simply skip DAD in such a case as a work around. * XXX: we should rather mark "tentative" on such addresses, * and do DAD after the interface becomes ready. 
*/ if (!((ifp->if_flags & IFF_UP) && (ifp->if_drv_flags & IFF_DRV_RUNNING))) return (0); return (1); } } /* * Calculate max IPv6 MTU through all the interfaces and store it * to in6_maxmtu. */ void in6_setmaxmtu(void) { INIT_VNET_NET(curvnet); INIT_VNET_INET6(curvnet); unsigned long maxmtu = 0; struct ifnet *ifp; IFNET_RLOCK(); for (ifp = TAILQ_FIRST(&V_ifnet); ifp; ifp = TAILQ_NEXT(ifp, if_list)) { /* this function can be called during ifnet initialization */ if (!ifp->if_afdata[AF_INET6]) continue; if ((ifp->if_flags & IFF_LOOPBACK) == 0 && IN6_LINKMTU(ifp) > maxmtu) maxmtu = IN6_LINKMTU(ifp); } IFNET_RUNLOCK(); if (maxmtu) /* update only when maxmtu is positive */ V_in6_maxmtu = maxmtu; } /* * Provide the length of interface identifiers to be used for the link attached * to the given interface. The length should be defined in "IPv6 over * xxx-link" document. Note that address architecture might also define * the length for a particular set of address prefixes, regardless of the * link type. As clarified in rfc2462bis, those two definitions should be * consistent, and those really are as of August 2004. */ int in6_if2idlen(struct ifnet *ifp) { switch (ifp->if_type) { case IFT_ETHER: /* RFC2464 */ #ifdef IFT_PROPVIRTUAL case IFT_PROPVIRTUAL: /* XXX: no RFC. treat it as ether */ #endif #ifdef IFT_L2VLAN case IFT_L2VLAN: /* ditto */ #endif #ifdef IFT_IEEE80211 case IFT_IEEE80211: /* ditto */ #endif #ifdef IFT_MIP case IFT_MIP: /* ditto */ #endif return (64); case IFT_FDDI: /* RFC2467 */ return (64); case IFT_ISO88025: /* RFC2470 (IPv6 over Token Ring) */ return (64); case IFT_PPP: /* RFC2472 */ return (64); case IFT_ARCNET: /* RFC2497 */ return (64); case IFT_FRELAY: /* RFC2590 */ return (64); case IFT_IEEE1394: /* RFC3146 */ return (64); case IFT_GIF: return (64); /* draft-ietf-v6ops-mech-v2-07 */ case IFT_LOOP: return (64); /* XXX: is this really correct? */ default: /* * Unknown link type: * It might be controversial to use the today's common constant * of 64 for these cases unconditionally. For full compliance, * we should return an error in this case. On the other hand, * if we simply miss the standard for the link type or a new * standard is defined for a new link type, the IFID length * is very likely to be the common constant. As a compromise, * we always use the constant, but make an explicit notice * indicating the "unknown" case. */ printf("in6_if2idlen: unknown link type (%d)\n", ifp->if_type); return (64); } } +#include + +struct in6_llentry { + struct llentry base; + struct sockaddr_in6 l3_addr6; +}; + +static struct llentry * +in6_lltable_new(const struct sockaddr *l3addr, u_int flags) +{ + struct in6_llentry *lle; + + lle = malloc(sizeof(struct in6_llentry), M_LLTABLE, + M_DONTWAIT | M_ZERO); + if (lle == NULL) /* NB: caller generates msg */ + return NULL; + + callout_init(&lle->base.ln_timer_ch, CALLOUT_MPSAFE); + lle->l3_addr6 = *(const struct sockaddr_in6 *)l3addr; + lle->base.lle_refcnt = 1; + LLE_LOCK_INIT(&lle->base); + return &lle->base; +} + +/* + * Deletes an address from the address table. + * This function is called by the timer functions + * such as arptimer() and nd6_llinfo_timer(), and + * the caller does the locking. 
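+ * It only releases the storage of the entry itself.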
+ */ +static void +in6_lltable_free(struct lltable *llt, struct llentry *lle) +{ + free(lle, M_LLTABLE); +} + +static int +in6_lltable_rtcheck(struct ifnet *ifp, const struct sockaddr *l3addr) +{ + struct rtentry *rt; + char ip6buf[INET6_ADDRSTRLEN]; + + KASSERT(l3addr->sa_family == AF_INET6, + ("sin_family %d", l3addr->sa_family)); + + /* XXX rtalloc1 should take a const param */ + rt = rtalloc1(__DECONST(struct sockaddr *, l3addr), 0, 0); + if (rt == NULL || (rt->rt_flags & RTF_GATEWAY) || rt->rt_ifp != ifp) { + struct ifaddr *ifa; + /* + * Create an ND6 cache for an IPv6 neighbor + * that is not covered by our own prefix. + */ + /* XXX ifaof_ifpforaddr should take a const param */ + ifa = ifaof_ifpforaddr(__DECONST(struct sockaddr *, l3addr), ifp); + if (ifa != NULL) { + if (rt != NULL) + rtfree(rt); + return 0; + } + log(LOG_INFO, "IPv6 address: \"%s\" is not on the network\n", + ip6_sprintf(ip6buf, &((const struct sockaddr_in6 *)l3addr)->sin6_addr)); + if (rt != NULL) + rtfree(rt); + return EINVAL; + } + rtfree(rt); + return 0; +} + +static struct llentry * +in6_lltable_lookup(struct lltable *llt, u_int flags, + const struct sockaddr *l3addr) +{ + const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)l3addr; + struct ifnet *ifp = llt->llt_ifp; + struct llentry *lle; + struct llentries *lleh; + u_int hashkey; + + IF_AFDATA_LOCK_ASSERT(ifp); + KASSERT(l3addr->sa_family == AF_INET6, + ("sin_family %d", l3addr->sa_family)); + + hashkey = sin6->sin6_addr.s6_addr32[3]; + lleh = &llt->lle_head[LLATBL_HASH(hashkey, LLTBL_HASHMASK)]; + LIST_FOREACH(lle, lleh, lle_next) { + if (lle->la_flags & LLE_DELETED) + continue; + if (bcmp(L3_ADDR(lle), l3addr, l3addr->sa_len) == 0) + break; + } + + if (lle == NULL) { + if (!(flags & LLE_CREATE)) + return (NULL); + /* + * A route that covers the given address must have + * been installed 1st because we are doing a resolution, + * verify this. 
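+ * Entries created for one of our own addresses (LLE_IFADDR) skip
+ * this check.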
+ */ + if (!(flags & LLE_IFADDR) && + in6_lltable_rtcheck(ifp, l3addr) != 0) + return NULL; + + lle = in6_lltable_new(l3addr, flags); + if (lle == NULL) { + log(LOG_INFO, "lla_lookup: new lle malloc failed\n"); + return NULL; + } + lle->la_flags = flags & ~LLE_CREATE; + if ((flags & (LLE_CREATE | LLE_IFADDR)) == (LLE_CREATE | LLE_IFADDR)) { + bcopy(IF_LLADDR(ifp), &lle->ll_addr, ifp->if_addrlen); + lle->la_flags |= (LLE_VALID | LLE_STATIC); + } + + lle->lle_tbl = llt; + lle->lle_head = lleh; + LIST_INSERT_HEAD(lleh, lle, lle_next); + } else if (flags & LLE_DELETE) { + LLE_WLOCK(lle); + lle->la_flags = LLE_DELETED; + LLE_WUNLOCK(lle); +#ifdef DIAGNOSTICS + log(LOG_INFO, "ifaddr cache = %p is deleted\n", lle); +#endif + lle = (void *)-1; + } + if (LLE_IS_VALID(lle)) { + if (flags & LLE_EXCLUSIVE) + LLE_WLOCK(lle); + else + LLE_RLOCK(lle); + } + return (lle); +} + +static int +in6_lltable_dump(struct lltable *llt, struct sysctl_req *wr) +{ + struct ifnet *ifp = llt->llt_ifp; + struct llentry *lle; + /* XXX stack use */ + struct { + struct rt_msghdr rtm; + struct sockaddr_in6 sin6; + /* + * ndp.c assumes that sdl is word aligned + */ +#ifdef __LP64__ + uint32_t pad; +#endif + struct sockaddr_dl sdl; + } ndpc; + int i, error; + + /* XXXXX + * current IFNET_RLOCK() is mapped to IFNET_WLOCK() + * so it is okay to use this ASSERT, change it when + * IFNET lock is finalized + */ + IFNET_WLOCK_ASSERT(); + + error = 0; + for (i = 0; i < LLTBL_HASHTBL_SIZE; i++) { + LIST_FOREACH(lle, &llt->lle_head[i], lle_next) { + struct sockaddr_dl *sdl; + + /* skip deleted or invalid entries */ + if ((lle->la_flags & (LLE_DELETED|LLE_VALID)) != LLE_VALID) + continue; + /* + * produce a msg made of: + * struct rt_msghdr; + * struct sockaddr_in6 (IPv6) + * struct sockaddr_dl; + */ + bzero(&ndpc, sizeof(ndpc)); + ndpc.rtm.rtm_msglen = sizeof(ndpc); + ndpc.sin6.sin6_family = AF_INET6; + ndpc.sin6.sin6_len = sizeof(ndpc.sin6); + bcopy(L3_ADDR(lle), &ndpc.sin6, L3_ADDR_LEN(lle)); + + /* publish */ + if (lle->la_flags & LLE_PUB) + ndpc.rtm.rtm_flags |= RTF_ANNOUNCE; + + sdl = &ndpc.sdl; + sdl->sdl_family = AF_LINK; + sdl->sdl_len = sizeof(*sdl); + sdl->sdl_alen = ifp->if_addrlen; + sdl->sdl_index = ifp->if_index; + sdl->sdl_type = ifp->if_type; + bcopy(&lle->ll_addr, LLADDR(sdl), ifp->if_addrlen); + ndpc.rtm.rtm_rmx.rmx_expire = + lle->la_flags & LLE_STATIC ? 
0 : lle->la_expire; + ndpc.rtm.rtm_flags |= RTF_HOST; + if (lle->la_flags & LLE_STATIC) + ndpc.rtm.rtm_flags |= RTF_STATIC; + ndpc.rtm.rtm_index = ifp->if_index; + error = SYSCTL_OUT(wr, &ndpc, sizeof(ndpc)); + if (error) + break; + } + } + return error; +} + void * in6_domifattach(struct ifnet *ifp) { struct in6_ifextra *ext; ext = (struct in6_ifextra *)malloc(sizeof(*ext), M_IFADDR, M_WAITOK); bzero(ext, sizeof(*ext)); ext->in6_ifstat = (struct in6_ifstat *)malloc(sizeof(struct in6_ifstat), M_IFADDR, M_WAITOK); bzero(ext->in6_ifstat, sizeof(*ext->in6_ifstat)); ext->icmp6_ifstat = (struct icmp6_ifstat *)malloc(sizeof(struct icmp6_ifstat), M_IFADDR, M_WAITOK); bzero(ext->icmp6_ifstat, sizeof(*ext->icmp6_ifstat)); ext->nd_ifinfo = nd6_ifattach(ifp); ext->scope6_id = scope6_ifattach(ifp); + ext->lltable = lltable_init(ifp, AF_INET6); + if (ext->lltable != NULL) { + ext->lltable->llt_new = in6_lltable_new; + ext->lltable->llt_free = in6_lltable_free; + ext->lltable->llt_rtcheck = in6_lltable_rtcheck; + ext->lltable->llt_lookup = in6_lltable_lookup; + ext->lltable->llt_dump = in6_lltable_dump; + } return ext; } void in6_domifdetach(struct ifnet *ifp, void *aux) { struct in6_ifextra *ext = (struct in6_ifextra *)aux; scope6_ifdetach(ext->scope6_id); nd6_ifdetach(ext->nd_ifinfo); + lltable_free(ext->lltable); free(ext->in6_ifstat, M_IFADDR); free(ext->icmp6_ifstat, M_IFADDR); free(ext, M_IFADDR); } /* * Convert sockaddr_in6 to sockaddr_in. Original sockaddr_in6 must be * v4 mapped addr or v4 compat addr */ void in6_sin6_2_sin(struct sockaddr_in *sin, struct sockaddr_in6 *sin6) { bzero(sin, sizeof(*sin)); sin->sin_len = sizeof(struct sockaddr_in); sin->sin_family = AF_INET; sin->sin_port = sin6->sin6_port; sin->sin_addr.s_addr = sin6->sin6_addr.s6_addr32[3]; } /* Convert sockaddr_in to sockaddr_in6 in v4 mapped addr format. */ void in6_sin_2_v4mapsin6(struct sockaddr_in *sin, struct sockaddr_in6 *sin6) { bzero(sin6, sizeof(*sin6)); sin6->sin6_len = sizeof(struct sockaddr_in6); sin6->sin6_family = AF_INET6; sin6->sin6_port = sin->sin_port; sin6->sin6_addr.s6_addr32[0] = 0; sin6->sin6_addr.s6_addr32[1] = 0; sin6->sin6_addr.s6_addr32[2] = IPV6_ADDR_INT32_SMP; sin6->sin6_addr.s6_addr32[3] = sin->sin_addr.s_addr; } /* Convert sockaddr_in6 into sockaddr_in. */ void in6_sin6_2_sin_in_sock(struct sockaddr *nam) { struct sockaddr_in *sin_p; struct sockaddr_in6 sin6; /* * Save original sockaddr_in6 addr and convert it * to sockaddr_in. */ sin6 = *(struct sockaddr_in6 *)nam; sin_p = (struct sockaddr_in *)nam; in6_sin6_2_sin(sin_p, &sin6); } /* Convert sockaddr_in into sockaddr_in6 in v4 mapped addr format. */ void in6_sin_2_v4mapsin6_in_sock(struct sockaddr **nam) { struct sockaddr_in *sin_p; struct sockaddr_in6 *sin6_p; sin6_p = malloc(sizeof *sin6_p, M_SONAME, M_WAITOK); sin_p = (struct sockaddr_in *)*nam; in6_sin_2_v4mapsin6(sin_p, sin6_p); free(*nam, M_SONAME); *nam = (struct sockaddr *)sin6_p; } Index: head/sys/netinet6/in6_rmx.c =================================================================== --- head/sys/netinet6/in6_rmx.c (revision 186118) +++ head/sys/netinet6/in6_rmx.c (revision 186119) @@ -1,504 +1,478 @@ /*- * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the project nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $KAME: in6_rmx.c,v 1.11 2001/07/26 06:53:16 jinmei Exp $ */ /*- * Copyright 1994, 1995 Massachusetts Institute of Technology * * Permission to use, copy, modify, and distribute this software and * its documentation for any purpose and without fee is hereby * granted, provided that both the above copyright notice and this * permission notice appear in all copies, that both the above * copyright notice and this permission notice appear in all * supporting documentation, and that the name of M.I.T. not be used * in advertising or publicity pertaining to distribution of the * software without specific, written prior permission. M.I.T. makes * no representations about the suitability of this software for any * purpose. It is provided "as is" without express or implied * warranty. * * THIS SOFTWARE IS PROVIDED BY M.I.T. ``AS IS''. M.I.T. DISCLAIMS * ALL EXPRESS OR IMPLIED WARRANTIES WITH REGARD TO THIS SOFTWARE, * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT * SHALL M.I.T. BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ /* * This code does two things necessary for the enhanced TCP metrics to * function in a useful manner: * 1) It marks all non-host routes as `cloning', thus ensuring that * every actual reference to such a route actually gets turned * into a reference to a host route to the specific destination * requested. * 2) When such routes lose all their references, it arranges for them * to be deleted in some random collection of circumstances, so that * a large quantity of stale routing data is not kept in kernel memory * indefinitely. See in6_rtqtimo() below for the exact mechanism. 
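 * Note: the arp-v2 change in this revision drops the RTF_CLONING/RTF_LLINFO/
 * RTF_WASCLONED handling from the routines below; IPv6 neighbor (L2) state is
 * now kept in the per-interface lltable (set up in in6_domifattach() in in6.c)
 * rather than in cloned host routes.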
*/ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include extern int in6_inithead(void **head, int off); #define RTPRF_OURS RTF_PROTO3 /* set on routes we manage */ /* * Do what we need to do when inserting a route. */ static struct radix_node * in6_addroute(void *v_arg, void *n_arg, struct radix_node_head *head, struct radix_node *treenodes) { struct rtentry *rt = (struct rtentry *)treenodes; struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)rt_key(rt); struct radix_node *ret; + RADIX_NODE_HEAD_WLOCK_ASSERT(head); if (IN6_IS_ADDR_MULTICAST(&sin6->sin6_addr)) rt->rt_flags |= RTF_MULTICAST; /* * A little bit of help for both IPv6 output and input: * For local addresses, we make sure that RTF_LOCAL is set, * with the thought that this might one day be used to speed up * ip_input(). * * We also mark routes to multicast addresses as such, because * it's easy to do and might be useful (but this is much more * dubious since it's so easy to inspect the address). (This * is done above.) * * XXX * should elaborate the code. */ if (rt->rt_flags & RTF_HOST) { if (IN6_ARE_ADDR_EQUAL(&satosin6(rt->rt_ifa->ifa_addr) ->sin6_addr, &sin6->sin6_addr)) { rt->rt_flags |= RTF_LOCAL; } } if (!rt->rt_rmx.rmx_mtu && rt->rt_ifp) rt->rt_rmx.rmx_mtu = IN6_LINKMTU(rt->rt_ifp); ret = rn_addroute(v_arg, n_arg, head, treenodes); - if (ret == NULL && rt->rt_flags & RTF_HOST) { + if (ret == NULL) { struct rtentry *rt2; /* - * We are trying to add a host route, but can't. - * Find out if it is because of an - * ARP entry and delete it if so. - */ - rt2 = rtalloc1((struct sockaddr *)sin6, 0, RTF_RNH_LOCKED|RTF_CLONING); - if (rt2) { - if (rt2->rt_flags & RTF_LLINFO && - rt2->rt_flags & RTF_HOST && - rt2->rt_gateway && - rt2->rt_gateway->sa_family == AF_LINK) { - rtexpunge(rt2); - RTFREE_LOCKED(rt2); - ret = rn_addroute(v_arg, n_arg, head, - treenodes); - } else - RTFREE_LOCKED(rt2); - } - } else if (ret == NULL && rt->rt_flags & RTF_CLONING) { - struct rtentry *rt2; - /* * We are trying to add a net route, but can't. * The following case should be allowed, so we'll make a * special check for this: * Two IPv6 addresses with the same prefix is assigned * to a single interrface. * # ifconfig if0 inet6 3ffe:0501::1 prefix 64 alias (*1) * # ifconfig if0 inet6 3ffe:0501::2 prefix 64 alias (*2) * In this case, (*1) and (*2) want to add the same * net route entry, 3ffe:0501:: -> if0. * This case should not raise an error. */ - rt2 = rtalloc1((struct sockaddr *)sin6, 0, RTF_RNH_LOCKED|RTF_CLONING); + rt2 = rtalloc1((struct sockaddr *)sin6, 0, RTF_RNH_LOCKED); if (rt2) { - if ((rt2->rt_flags & (RTF_CLONING|RTF_HOST|RTF_GATEWAY)) - == RTF_CLONING + if (((rt2->rt_flags & (RTF_HOST|RTF_GATEWAY)) == 0) && rt2->rt_gateway && rt2->rt_gateway->sa_family == AF_LINK && rt2->rt_ifp == rt->rt_ifp) { ret = rt2->rt_nodes; } RTFREE_LOCKED(rt2); } } - return ret; + return (ret); } /* * This code is the inverse of in6_clsroute: on first reference, if we * were managing the route, stop doing so and set the expiration timer * back off again. 
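 * (in6_matroute() below clears RTPRF_OURS and rmx_expire when such a route
 * gains its first reference again.)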
*/ static struct radix_node * in6_matroute(void *v_arg, struct radix_node_head *head) { struct radix_node *rn = rn_match(v_arg, head); struct rtentry *rt = (struct rtentry *)rn; if (rt && rt->rt_refcnt == 0) { /* this is first reference */ if (rt->rt_flags & RTPRF_OURS) { rt->rt_flags &= ~RTPRF_OURS; rt->rt_rmx.rmx_expire = 0; } } return rn; } SYSCTL_DECL(_net_inet6_ip6); #ifdef VIMAGE_GLOBALS static int rtq_reallyold6; static int rtq_minreallyold6; static int rtq_toomany6; #endif SYSCTL_V_INT(V_NET, vnet_inet6, _net_inet6_ip6, IPV6CTL_RTEXPIRE, rtexpire, CTLFLAG_RW, rtq_reallyold6 , 0, ""); SYSCTL_V_INT(V_NET, vnet_inet6, _net_inet6_ip6, IPV6CTL_RTMINEXPIRE, rtminexpire, CTLFLAG_RW, rtq_minreallyold6 , 0, ""); SYSCTL_V_INT(V_NET, vnet_inet6, _net_inet6_ip6, IPV6CTL_RTMAXCACHE, rtmaxcache, CTLFLAG_RW, rtq_toomany6 , 0, ""); /* * On last reference drop, mark the route as belong to us so that it can be * timed out. */ static void in6_clsroute(struct radix_node *rn, struct radix_node_head *head) { INIT_VNET_INET6(curvnet); struct rtentry *rt = (struct rtentry *)rn; RT_LOCK_ASSERT(rt); if (!(rt->rt_flags & RTF_UP)) return; /* prophylactic measures */ - if ((rt->rt_flags & (RTF_LLINFO | RTF_HOST)) != RTF_HOST) - return; - - if ((rt->rt_flags & (RTF_WASCLONED | RTPRF_OURS)) != RTF_WASCLONED) - return; - /* * As requested by David Greenman: * If rtq_reallyold6 is 0, just delete the route without * waiting for a timeout cycle to kill it. */ if (V_rtq_reallyold6 != 0) { rt->rt_flags |= RTPRF_OURS; rt->rt_rmx.rmx_expire = time_uptime + V_rtq_reallyold6; } else { rtexpunge(rt); } } struct rtqk_arg { struct radix_node_head *rnh; int mode; int updating; int draining; int killed; int found; time_t nextstop; }; /* * Get rid of old routes. When draining, this deletes everything, even when * the timeout is not expired yet. When updating, this makes sure that * nothing has a timeout longer than the current value of rtq_reallyold6. 
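 * Only routes marked RTPRF_OURS by in6_clsroute() above are examined; expired
 * ones are deleted while the radix node head is already held by the caller
 * (hence RTF_RNH_LOCKED on the delete request).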
*/ static int in6_rtqkill(struct radix_node *rn, void *rock) { INIT_VNET_INET6(curvnet); struct rtqk_arg *ap = rock; struct rtentry *rt = (struct rtentry *)rn; int err; if (rt->rt_flags & RTPRF_OURS) { ap->found++; if (ap->draining || rt->rt_rmx.rmx_expire <= time_uptime) { if (rt->rt_refcnt > 0) panic("rtqkill route really not free"); err = rtrequest(RTM_DELETE, (struct sockaddr *)rt_key(rt), rt->rt_gateway, rt_mask(rt), - rt->rt_flags, 0); + rt->rt_flags|RTF_RNH_LOCKED, 0); if (err) { log(LOG_WARNING, "in6_rtqkill: error %d", err); } else { ap->killed++; } } else { if (ap->updating && (rt->rt_rmx.rmx_expire - time_uptime > V_rtq_reallyold6)) { rt->rt_rmx.rmx_expire = time_uptime + V_rtq_reallyold6; } ap->nextstop = lmin(ap->nextstop, rt->rt_rmx.rmx_expire); } } return 0; } #define RTQ_TIMEOUT 60*10 /* run no less than once every ten minutes */ #ifdef VIMAGE_GLOBALS static int rtq_timeout6; static struct callout rtq_timer6; #endif static void in6_rtqtimo(void *rock) { CURVNET_SET_QUIET((struct vnet *) rock); INIT_VNET_NET((struct vnet *) rock); INIT_VNET_INET6((struct vnet *) rock); struct radix_node_head *rnh = rock; struct rtqk_arg arg; struct timeval atv; static time_t last_adjusted_timeout = 0; arg.found = arg.killed = 0; arg.rnh = rnh; arg.nextstop = time_uptime + V_rtq_timeout6; arg.draining = arg.updating = 0; RADIX_NODE_HEAD_LOCK(rnh); rnh->rnh_walktree(rnh, in6_rtqkill, &arg); RADIX_NODE_HEAD_UNLOCK(rnh); /* * Attempt to be somewhat dynamic about this: * If there are ``too many'' routes sitting around taking up space, * then crank down the timeout, and see if we can't make some more * go away. However, we make sure that we will never adjust more * than once in rtq_timeout6 seconds, to keep from cranking down too * hard. */ if ((arg.found - arg.killed > V_rtq_toomany6) && (time_uptime - last_adjusted_timeout >= V_rtq_timeout6) && V_rtq_reallyold6 > V_rtq_minreallyold6) { V_rtq_reallyold6 = 2*V_rtq_reallyold6 / 3; if (V_rtq_reallyold6 < V_rtq_minreallyold6) { V_rtq_reallyold6 = V_rtq_minreallyold6; } last_adjusted_timeout = time_uptime; #ifdef DIAGNOSTIC log(LOG_DEBUG, "in6_rtqtimo: adjusted rtq_reallyold6 to %d", V_rtq_reallyold6); #endif arg.found = arg.killed = 0; arg.updating = 1; RADIX_NODE_HEAD_LOCK(rnh); rnh->rnh_walktree(rnh, in6_rtqkill, &arg); RADIX_NODE_HEAD_UNLOCK(rnh); } atv.tv_usec = 0; atv.tv_sec = arg.nextstop - time_uptime; callout_reset(&V_rtq_timer6, tvtohz(&atv), in6_rtqtimo, rock); CURVNET_RESTORE(); } /* * Age old PMTUs. 
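 * (in6_mtuexpire() below sets RTF_PROBEMTU on a route once its recorded
 * rmx_expire time has passed, marking the cached path MTU as stale.)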
*/ struct mtuex_arg { struct radix_node_head *rnh; time_t nextstop; }; #ifdef VIMAGE_GLOBALS static struct callout rtq_mtutimer; #endif static int in6_mtuexpire(struct radix_node *rn, void *rock) { struct rtentry *rt = (struct rtentry *)rn; struct mtuex_arg *ap = rock; /* sanity */ if (!rt) panic("rt == NULL in in6_mtuexpire"); if (rt->rt_rmx.rmx_expire && !(rt->rt_flags & RTF_PROBEMTU)) { if (rt->rt_rmx.rmx_expire <= time_uptime) { rt->rt_flags |= RTF_PROBEMTU; } else { ap->nextstop = lmin(ap->nextstop, rt->rt_rmx.rmx_expire); } } return 0; } #define MTUTIMO_DEFAULT (60*1) static void in6_mtutimo(void *rock) { CURVNET_SET_QUIET((struct vnet *) rock); INIT_VNET_NET((struct vnet *) rock); INIT_VNET_INET6((struct vnet *) rock); struct radix_node_head *rnh = rock; struct mtuex_arg arg; struct timeval atv; arg.rnh = rnh; arg.nextstop = time_uptime + MTUTIMO_DEFAULT; RADIX_NODE_HEAD_LOCK(rnh); rnh->rnh_walktree(rnh, in6_mtuexpire, &arg); RADIX_NODE_HEAD_UNLOCK(rnh); atv.tv_usec = 0; atv.tv_sec = arg.nextstop - time_uptime; if (atv.tv_sec < 0) { printf("invalid mtu expiration time on routing table\n"); arg.nextstop = time_uptime + 30; /* last resort */ atv.tv_sec = 30; } callout_reset(&V_rtq_mtutimer, tvtohz(&atv), in6_mtutimo, rock); CURVNET_RESTORE(); } #if 0 void in6_rtqdrain(void) { INIT_VNET_NET(curvnet); struct radix_node_head *rnh = V_rt_tables[AF_INET6]; struct rtqk_arg arg; arg.found = arg.killed = 0; arg.rnh = rnh; arg.nextstop = 0; arg.draining = 1; arg.updating = 0; RADIX_NODE_HEAD_LOCK(rnh); rnh->rnh_walktree(rnh, in6_rtqkill, &arg); RADIX_NODE_HEAD_UNLOCK(rnh); } #endif /* * Initialize our routing tree. * XXX MRT When off == 0, we are being called from vfs_export.c * so just set up their table and leave. (we know what the correct * value should be so just use that).. FIX AFTER RELENG_7 is MFC'd * see also comments in in_inithead() vfs_export.c and domain.h */ int in6_inithead(void **head, int off) { INIT_VNET_INET6(curvnet); struct radix_node_head *rnh; if (!rn_inithead(head, offsetof(struct sockaddr_in6, sin6_addr) << 3)) return 0; /* See above */ if (off == 0) /* See above */ return 1; /* only do the rest for the real thing */ V_rtq_reallyold6 = 60*60; /* one hour is ``really old'' */ V_rtq_minreallyold6 = 10; /* never automatically crank down to less */ V_rtq_toomany6 = 128; /* 128 cached routes is ``too many'' */ V_rtq_timeout6 = RTQ_TIMEOUT; rnh = *head; rnh->rnh_addaddr = in6_addroute; rnh->rnh_matchaddr = in6_matroute; rnh->rnh_close = in6_clsroute; callout_init(&V_rtq_timer6, CALLOUT_MPSAFE); in6_rtqtimo(rnh); /* kick off timeout first time */ callout_init(&V_rtq_mtutimer, CALLOUT_MPSAFE); in6_mtutimo(rnh); /* kick off timeout first time */ return 1; } Index: head/sys/netinet6/in6_src.c =================================================================== --- head/sys/netinet6/in6_src.c (revision 186118) +++ head/sys/netinet6/in6_src.c (revision 186119) @@ -1,1154 +1,1179 @@ /*- * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. 
Neither the name of the project nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $KAME: in6_src.c,v 1.132 2003/08/26 04:42:27 keiichi Exp $ */ /*- * Copyright (c) 1982, 1986, 1991, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* * @(#)in_pcb.c 8.2 (Berkeley) 1/4/94 */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include "opt_mpath.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #ifdef RADIX_MPATH #include #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include static struct mtx addrsel_lock; #define ADDRSEL_LOCK_INIT() mtx_init(&addrsel_lock, "addrsel_lock", NULL, MTX_DEF) #define ADDRSEL_LOCK() mtx_lock(&addrsel_lock) #define ADDRSEL_UNLOCK() mtx_unlock(&addrsel_lock) #define ADDRSEL_LOCK_ASSERT() mtx_assert(&addrsel_lock, MA_OWNED) static struct sx addrsel_sxlock; #define ADDRSEL_SXLOCK_INIT() sx_init(&addrsel_sxlock, "addrsel_sxlock") #define ADDRSEL_SLOCK() sx_slock(&addrsel_sxlock) #define ADDRSEL_SUNLOCK() sx_sunlock(&addrsel_sxlock) #define ADDRSEL_XLOCK() sx_xlock(&addrsel_sxlock) #define ADDRSEL_XUNLOCK() sx_xunlock(&addrsel_sxlock) #define ADDR_LABEL_NOTAPP (-1) #ifdef VIMAGE_GLOBALS struct in6_addrpolicy defaultaddrpolicy; int ip6_prefer_tempaddr; #endif static int selectroute __P((struct sockaddr_in6 *, struct ip6_pktopts *, struct ip6_moptions *, struct route_in6 *, struct ifnet **, - struct rtentry **, int, int)); + struct rtentry **, int)); static int in6_selectif __P((struct sockaddr_in6 *, struct ip6_pktopts *, struct ip6_moptions *, struct route_in6 *ro, struct ifnet **)); static struct in6_addrpolicy *lookup_addrsel_policy(struct sockaddr_in6 *); static void init_policy_queue(void); static int add_addrsel_policyent(struct in6_addrpolicy *); static int delete_addrsel_policyent(struct in6_addrpolicy *); static int walk_addrsel_policy __P((int (*)(struct in6_addrpolicy *, void *), void *)); static int dump_addrsel_policyent(struct in6_addrpolicy *, void *); static struct in6_addrpolicy *match_addrsel_policy(struct sockaddr_in6 *); /* * Return an IPv6 address, which is the most appropriate for a given * destination and user specified options. * If necessary, this function lookups the routing table and returns * an entry to the caller for later use. */ #define REPLACE(r) do {\ if ((r) < sizeof(V_ip6stat.ip6s_sources_rule) / \ sizeof(V_ip6stat.ip6s_sources_rule[0])) /* check for safety */ \ V_ip6stat.ip6s_sources_rule[(r)]++; \ /* { \ char ip6buf[INET6_ADDRSTRLEN], ip6b[INET6_ADDRSTRLEN]; \ printf("in6_selectsrc: replace %s with %s by %d\n", ia_best ? ip6_sprintf(ip6buf, &ia_best->ia_addr.sin6_addr) : "none", ip6_sprintf(ip6b, &ia->ia_addr.sin6_addr), (r)); \ } */ \ goto replace; \ } while(0) #define NEXT(r) do {\ if ((r) < sizeof(V_ip6stat.ip6s_sources_rule) / \ sizeof(V_ip6stat.ip6s_sources_rule[0])) /* check for safety */ \ V_ip6stat.ip6s_sources_rule[(r)]++; \ /* { \ char ip6buf[INET6_ADDRSTRLEN], ip6b[INET6_ADDRSTRLEN]; \ printf("in6_selectsrc: keep %s against %s by %d\n", ia_best ? 
ip6_sprintf(ip6buf, &ia_best->ia_addr.sin6_addr) : "none", ip6_sprintf(ip6b, &ia->ia_addr.sin6_addr), (r)); \ } */ \ goto next; /* XXX: we can't use 'continue' here */ \ } while(0) #define BREAK(r) do { \ if ((r) < sizeof(V_ip6stat.ip6s_sources_rule) / \ sizeof(V_ip6stat.ip6s_sources_rule[0])) /* check for safety */ \ V_ip6stat.ip6s_sources_rule[(r)]++; \ goto out; /* XXX: we can't use 'break' here */ \ } while(0) struct in6_addr * in6_selectsrc(struct sockaddr_in6 *dstsock, struct ip6_pktopts *opts, struct inpcb *inp, struct route_in6 *ro, struct ucred *cred, struct ifnet **ifpp, int *errorp) { INIT_VNET_INET6(curvnet); struct in6_addr dst; struct ifnet *ifp = NULL; struct in6_ifaddr *ia = NULL, *ia_best = NULL; struct in6_pktinfo *pi = NULL; int dst_scope = -1, best_scope = -1, best_matchlen = -1; struct in6_addrpolicy *dst_policy = NULL, *best_policy = NULL; u_int32_t odstzone; int prefer_tempaddr; struct ip6_moptions *mopts; dst = dstsock->sin6_addr; /* make a copy for local operation */ *errorp = 0; if (ifpp) *ifpp = NULL; if (inp != NULL) { INP_LOCK_ASSERT(inp); mopts = inp->in6p_moptions; } else { mopts = NULL; } /* * If the source address is explicitly specified by the caller, * check if the requested source address is indeed a unicast address * assigned to the node, and can be used as the packet's source * address. If everything is okay, use the address as source. */ if (opts && (pi = opts->ip6po_pktinfo) && !IN6_IS_ADDR_UNSPECIFIED(&pi->ipi6_addr)) { struct sockaddr_in6 srcsock; struct in6_ifaddr *ia6; /* get the outgoing interface */ if ((*errorp = in6_selectif(dstsock, opts, mopts, ro, &ifp)) != 0) { return (NULL); } /* * determine the appropriate zone id of the source based on * the zone of the destination and the outgoing interface. * If the specified address is ambiguous wrt the scope zone, * the interface must be specified; otherwise, ifa_ifwithaddr() * will fail matching the address. */ bzero(&srcsock, sizeof(srcsock)); srcsock.sin6_family = AF_INET6; srcsock.sin6_len = sizeof(srcsock); srcsock.sin6_addr = pi->ipi6_addr; if (ifp) { *errorp = in6_setscope(&srcsock.sin6_addr, ifp, NULL); if (*errorp != 0) return (NULL); } if (cred != NULL && prison_local_ip6(cred, &srcsock.sin6_addr, (inp != NULL && (inp->inp_flags & IN6P_IPV6_V6ONLY) != 0)) != 0) { *errorp = EADDRNOTAVAIL; return (NULL); } ia6 = (struct in6_ifaddr *)ifa_ifwithaddr((struct sockaddr *)(&srcsock)); if (ia6 == NULL || (ia6->ia6_flags & (IN6_IFF_ANYCAST | IN6_IFF_NOTREADY))) { *errorp = EADDRNOTAVAIL; return (NULL); } pi->ipi6_addr = srcsock.sin6_addr; /* XXX: this overrides pi */ if (ifpp) *ifpp = ifp; return (&ia6->ia_addr.sin6_addr); } /* * Otherwise, if the socket has already bound the source, just use it. */ if (inp != NULL && !IN6_IS_ADDR_UNSPECIFIED(&inp->in6p_laddr)) { if (cred != NULL && prison_local_ip6(cred, &inp->in6p_laddr, ((inp->inp_flags & IN6P_IPV6_V6ONLY) != 0)) != 0) { *errorp = EADDRNOTAVAIL; return (NULL); } return (&inp->in6p_laddr); } /* * If the address is not specified, choose the best one based on * the outgoing interface and the destination address. 
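 * The loop below walks all configured IPv6 addresses and applies the selection
 * rules in order (Rule 1 same address, Rule 2 appropriate scope, Rule 3 avoid
 * deprecated, Rule 5 outgoing interface, Rule 6 matching label, Rule 7 public
 * vs. temporary, Rule 8 interface up, Rule 14 longest prefix match), replacing
 * the current best candidate whenever a rule prefers the new address.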
*/ /* get the outgoing interface */ if ((*errorp = in6_selectif(dstsock, opts, mopts, ro, &ifp)) != 0) return (NULL); #ifdef DIAGNOSTIC if (ifp == NULL) /* this should not happen */ panic("in6_selectsrc: NULL ifp"); #endif *errorp = in6_setscope(&dst, ifp, &odstzone); if (*errorp != 0) return (NULL); for (ia = V_in6_ifaddr; ia; ia = ia->ia_next) { int new_scope = -1, new_matchlen = -1; struct in6_addrpolicy *new_policy = NULL; u_int32_t srczone, osrczone, dstzone; struct in6_addr src; struct ifnet *ifp1 = ia->ia_ifp; /* * We'll never take an address that breaks the scope zone * of the destination. We also skip an address if its zone * does not contain the outgoing interface. * XXX: we should probably use sin6_scope_id here. */ if (in6_setscope(&dst, ifp1, &dstzone) || odstzone != dstzone) { continue; } src = ia->ia_addr.sin6_addr; if (in6_setscope(&src, ifp, &osrczone) || in6_setscope(&src, ifp1, &srczone) || osrczone != srczone) { continue; } /* avoid unusable addresses */ if ((ia->ia6_flags & (IN6_IFF_NOTREADY | IN6_IFF_ANYCAST | IN6_IFF_DETACHED))) { continue; } if (!V_ip6_use_deprecated && IFA6_IS_DEPRECATED(ia)) continue; if (cred != NULL && prison_local_ip6(cred, &ia->ia_addr.sin6_addr, (inp != NULL && (inp->inp_flags & IN6P_IPV6_V6ONLY) != 0)) != 0) continue; /* Rule 1: Prefer same address */ if (IN6_ARE_ADDR_EQUAL(&dst, &ia->ia_addr.sin6_addr)) { ia_best = ia; BREAK(1); /* there should be no better candidate */ } if (ia_best == NULL) REPLACE(0); /* Rule 2: Prefer appropriate scope */ if (dst_scope < 0) dst_scope = in6_addrscope(&dst); new_scope = in6_addrscope(&ia->ia_addr.sin6_addr); if (IN6_ARE_SCOPE_CMP(best_scope, new_scope) < 0) { if (IN6_ARE_SCOPE_CMP(best_scope, dst_scope) < 0) REPLACE(2); NEXT(2); } else if (IN6_ARE_SCOPE_CMP(new_scope, best_scope) < 0) { if (IN6_ARE_SCOPE_CMP(new_scope, dst_scope) < 0) NEXT(2); REPLACE(2); } /* * Rule 3: Avoid deprecated addresses. Note that the case of * !ip6_use_deprecated is already rejected above. */ if (!IFA6_IS_DEPRECATED(ia_best) && IFA6_IS_DEPRECATED(ia)) NEXT(3); if (IFA6_IS_DEPRECATED(ia_best) && !IFA6_IS_DEPRECATED(ia)) REPLACE(3); /* Rule 4: Prefer home addresses */ /* * XXX: This is a TODO. We should probably merge the MIP6 * case above. */ /* Rule 5: Prefer outgoing interface */ if (ia_best->ia_ifp == ifp && ia->ia_ifp != ifp) NEXT(5); if (ia_best->ia_ifp != ifp && ia->ia_ifp == ifp) REPLACE(5); /* * Rule 6: Prefer matching label * Note that best_policy should be non-NULL here. */ if (dst_policy == NULL) dst_policy = lookup_addrsel_policy(dstsock); if (dst_policy->label != ADDR_LABEL_NOTAPP) { new_policy = lookup_addrsel_policy(&ia->ia_addr); if (dst_policy->label == best_policy->label && dst_policy->label != new_policy->label) NEXT(6); if (dst_policy->label != best_policy->label && dst_policy->label == new_policy->label) REPLACE(6); } /* * Rule 7: Prefer public addresses. * We allow users to reverse the logic by configuring * a sysctl variable, so that privacy conscious users can * always prefer temporary addresses. 
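 * (V_ip6_prefer_tempaddr is the system-wide setting; a per-packet preference
 * may also be supplied in opts->ip6po_prefer_tempaddr, checked just below.)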
*/ if (opts == NULL || opts->ip6po_prefer_tempaddr == IP6PO_TEMPADDR_SYSTEM) { prefer_tempaddr = V_ip6_prefer_tempaddr; } else if (opts->ip6po_prefer_tempaddr == IP6PO_TEMPADDR_NOTPREFER) { prefer_tempaddr = 0; } else prefer_tempaddr = 1; if (!(ia_best->ia6_flags & IN6_IFF_TEMPORARY) && (ia->ia6_flags & IN6_IFF_TEMPORARY)) { if (prefer_tempaddr) REPLACE(7); else NEXT(7); } if ((ia_best->ia6_flags & IN6_IFF_TEMPORARY) && !(ia->ia6_flags & IN6_IFF_TEMPORARY)) { if (prefer_tempaddr) NEXT(7); else REPLACE(7); } /* * Rule 8: prefer addresses on alive interfaces. * This is a KAME specific rule. */ if ((ia_best->ia_ifp->if_flags & IFF_UP) && !(ia->ia_ifp->if_flags & IFF_UP)) NEXT(8); if (!(ia_best->ia_ifp->if_flags & IFF_UP) && (ia->ia_ifp->if_flags & IFF_UP)) REPLACE(8); /* * Rule 14: Use longest matching prefix. * Note: in the address selection draft, this rule is * documented as "Rule 8". However, since it is also * documented that this rule can be overridden, we assign * a large number so that it is easy to assign smaller numbers * to more preferred rules. */ new_matchlen = in6_matchlen(&ia->ia_addr.sin6_addr, &dst); if (best_matchlen < new_matchlen) REPLACE(14); if (new_matchlen < best_matchlen) NEXT(14); /* Rule 15 is reserved. */ /* * Last resort: just keep the current candidate. * Or, do we need more rules? */ continue; replace: ia_best = ia; best_scope = (new_scope >= 0 ? new_scope : in6_addrscope(&ia_best->ia_addr.sin6_addr)); best_policy = (new_policy ? new_policy : lookup_addrsel_policy(&ia_best->ia_addr)); best_matchlen = (new_matchlen >= 0 ? new_matchlen : in6_matchlen(&ia_best->ia_addr.sin6_addr, &dst)); next: continue; out: break; } if ((ia = ia_best) == NULL) { *errorp = EADDRNOTAVAIL; return (NULL); } if (ifpp) *ifpp = ifp; return (&ia->ia_addr.sin6_addr); } /* * clone - meaningful only for bsdi and freebsd */ static int selectroute(struct sockaddr_in6 *dstsock, struct ip6_pktopts *opts, struct ip6_moptions *mopts, struct route_in6 *ro, - struct ifnet **retifp, struct rtentry **retrt, int clone, - int norouteok) + struct ifnet **retifp, struct rtentry **retrt, int norouteok) { INIT_VNET_INET6(curvnet); int error = 0; struct ifnet *ifp = NULL; struct rtentry *rt = NULL; struct sockaddr_in6 *sin6_next; struct in6_pktinfo *pi = NULL; struct in6_addr *dst = &dstsock->sin6_addr; #if 0 char ip6buf[INET6_ADDRSTRLEN]; if (dstsock->sin6_addr.s6_addr32[0] == 0 && dstsock->sin6_addr.s6_addr32[1] == 0 && !IN6_IS_ADDR_LOOPBACK(&dstsock->sin6_addr)) { printf("in6_selectroute: strange destination %s\n", ip6_sprintf(ip6buf, &dstsock->sin6_addr)); } else { printf("in6_selectroute: destination = %s%%%d\n", ip6_sprintf(ip6buf, &dstsock->sin6_addr), dstsock->sin6_scope_id); /* for debug */ } #endif /* If the caller specify the outgoing interface explicitly, use it. */ if (opts && (pi = opts->ip6po_pktinfo) != NULL && pi->ipi6_ifindex) { /* XXX boundary check is assumed to be already done. */ ifp = ifnet_byindex(pi->ipi6_ifindex); if (ifp != NULL && (norouteok || retrt == NULL || IN6_IS_ADDR_MULTICAST(dst))) { /* * we do not have to check or get the route for * multicast. */ goto done; } else goto getroute; } /* * If the destination address is a multicast address and the outgoing * interface for the address is specified by the caller, use it. */ if (IN6_IS_ADDR_MULTICAST(dst) && mopts != NULL && (ifp = mopts->im6o_multicast_ifp) != NULL) { goto done; /* we do not need a route for multicast. 
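 * (the outgoing interface alone is sufficient to transmit it)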
*/ } getroute: /* * If the next hop address for the packet is specified by the caller, * use it as the gateway. */ if (opts && opts->ip6po_nexthop) { struct route_in6 *ron; - + struct llentry *la; + sin6_next = satosin6(opts->ip6po_nexthop); - + /* at this moment, we only support AF_INET6 next hops */ if (sin6_next->sin6_family != AF_INET6) { error = EAFNOSUPPORT; /* or should we proceed? */ goto done; } /* * If the next hop is an IPv6 address, then the node identified * by that address must be a neighbor of the sending host. */ ron = &opts->ip6po_nextroute; + /* + * XXX what do we do here? + * PLZ to be fixing + */ + + + if (ron->ro_rt == NULL) { + rtalloc((struct route *)ron); /* multi path case? */ + if (ron->ro_rt == NULL) { + if (ron->ro_rt) { + RTFREE(ron->ro_rt); + ron->ro_rt = NULL; + } + error = EHOSTUNREACH; + goto done; + } + } + + rt = ron->ro_rt; + ifp = rt->rt_ifp; + IF_AFDATA_LOCK(ifp); + la = lla_lookup(LLTABLE6(ifp), 0, (struct sockaddr *)&sin6_next->sin6_addr); + IF_AFDATA_UNLOCK(ifp); + if (la) + LLE_RUNLOCK(la); + else { + error = EHOSTUNREACH; + goto done; + } +#if 0 if ((ron->ro_rt && (ron->ro_rt->rt_flags & (RTF_UP | RTF_LLINFO)) != (RTF_UP | RTF_LLINFO)) || !IN6_ARE_ADDR_EQUAL(&satosin6(&ron->ro_dst)->sin6_addr, &sin6_next->sin6_addr)) { if (ron->ro_rt) { RTFREE(ron->ro_rt); ron->ro_rt = NULL; } *satosin6(&ron->ro_dst) = *sin6_next; } if (ron->ro_rt == NULL) { rtalloc((struct route *)ron); /* multi path case? */ if (ron->ro_rt == NULL || !(ron->ro_rt->rt_flags & RTF_LLINFO)) { if (ron->ro_rt) { RTFREE(ron->ro_rt); ron->ro_rt = NULL; } error = EHOSTUNREACH; goto done; } } - rt = ron->ro_rt; - ifp = rt->rt_ifp; +#endif /* * When cloning is required, try to allocate a route to the * destination so that the caller can store path MTU * information. */ - if (!clone) - goto done; + goto done; } /* * Use a cached route if it exists and is valid, else try to allocate * a new one. Note that we should check the address family of the * cached destination, in case of sharing the cache with IPv4. */ if (ro) { if (ro->ro_rt && (!(ro->ro_rt->rt_flags & RTF_UP) || ((struct sockaddr *)(&ro->ro_dst))->sa_family != AF_INET6 || !IN6_ARE_ADDR_EQUAL(&satosin6(&ro->ro_dst)->sin6_addr, dst))) { RTFREE(ro->ro_rt); ro->ro_rt = (struct rtentry *)NULL; } if (ro->ro_rt == (struct rtentry *)NULL) { struct sockaddr_in6 *sa6; /* No route yet, so try to acquire one */ bzero(&ro->ro_dst, sizeof(struct sockaddr_in6)); sa6 = (struct sockaddr_in6 *)&ro->ro_dst; *sa6 = *dstsock; sa6->sin6_scope_id = 0; - if (clone) { #ifdef RADIX_MPATH rtalloc_mpath((struct route *)ro, ntohl(sa6->sin6_addr.s6_addr32[3])); -#else - rtalloc((struct route *)ro); -#endif - } else { +#else ro->ro_rt = rtalloc1(&((struct route *)ro) - ->ro_dst, 0, 0UL); + ->ro_dst, 0, 0UL); if (ro->ro_rt) RT_UNLOCK(ro->ro_rt); - } +#endif } - + /* * do not care about the result if we have the nexthop * explicitly specified. */ if (opts && opts->ip6po_nexthop) goto done; if (ro->ro_rt) { ifp = ro->ro_rt->rt_ifp; if (ifp == NULL) { /* can this really happen? */ RTFREE(ro->ro_rt); ro->ro_rt = NULL; } } if (ro->ro_rt == NULL) error = EHOSTUNREACH; rt = ro->ro_rt; /* * Check if the outgoing interface conflicts with * the interface specified by ipi6_ifindex (if specified). * Note that loopback interface is always okay. * (this may happen when we are sending a packet to one of * our own addresses.) 
*/ if (ifp && opts && opts->ip6po_pktinfo && opts->ip6po_pktinfo->ipi6_ifindex) { if (!(ifp->if_flags & IFF_LOOPBACK) && ifp->if_index != opts->ip6po_pktinfo->ipi6_ifindex) { error = EHOSTUNREACH; goto done; } } } done: if (ifp == NULL && rt == NULL) { /* * This can happen if the caller did not pass a cached route * nor any other hints. We treat this case an error. */ error = EHOSTUNREACH; } if (error == EHOSTUNREACH) V_ip6stat.ip6s_noroute++; if (retifp != NULL) *retifp = ifp; if (retrt != NULL) *retrt = rt; /* rt may be NULL */ return (error); } static int in6_selectif(struct sockaddr_in6 *dstsock, struct ip6_pktopts *opts, struct ip6_moptions *mopts, struct route_in6 *ro, struct ifnet **retifp) { int error; struct route_in6 sro; struct rtentry *rt = NULL; if (ro == NULL) { bzero(&sro, sizeof(sro)); ro = &sro; } if ((error = selectroute(dstsock, opts, mopts, ro, retifp, - &rt, 0, 1)) != 0) { + &rt, 1)) != 0) { if (ro == &sro && rt && rt == sro.ro_rt) RTFREE(rt); return (error); } /* * do not use a rejected or black hole route. * XXX: this check should be done in the L2 output routine. * However, if we skipped this check here, we'd see the following * scenario: * - install a rejected route for a scoped address prefix * (like fe80::/10) * - send a packet to a destination that matches the scoped prefix, * with ambiguity about the scope zone. * - pick the outgoing interface from the route, and disambiguate the * scope zone with the interface. * - ip6_output() would try to get another route with the "new" * destination, which may be valid. * - we'd see no error on output. * Although this may not be very harmful, it should still be confusing. * We thus reject the case here. */ if (rt && (rt->rt_flags & (RTF_REJECT | RTF_BLACKHOLE))) { int flags = (rt->rt_flags & RTF_HOST ? EHOSTUNREACH : ENETUNREACH); if (ro == &sro && rt && rt == sro.ro_rt) RTFREE(rt); return (flags); } /* * Adjust the "outgoing" interface. If we're going to loop the packet * back to ourselves, the ifp would be the loopback interface. * However, we'd rather know the interface associated to the * destination address (which should probably be one of our own * addresses.) */ if (rt && rt->rt_ifa && rt->rt_ifa->ifa_ifp) *retifp = rt->rt_ifa->ifa_ifp; if (ro == &sro && rt && rt == sro.ro_rt) RTFREE(rt); return (0); } /* * clone - meaningful only for bsdi and freebsd */ int in6_selectroute(struct sockaddr_in6 *dstsock, struct ip6_pktopts *opts, struct ip6_moptions *mopts, struct route_in6 *ro, - struct ifnet **retifp, struct rtentry **retrt, int clone) + struct ifnet **retifp, struct rtentry **retrt) { return (selectroute(dstsock, opts, mopts, ro, retifp, - retrt, clone, 0)); + retrt, 0)); } /* * Default hop limit selection. The precedence is as follows: * 1. Hoplimit value specified via ioctl. * 2. (If the outgoing interface is detected) the current * hop limit of the interface specified by router advertisement. * 3. The system default hoplimit. 
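 * For example, a per-socket value (in6p_hops >= 0) always wins over the
 * ND_IFINFO(ifp)->chlim learned from router advertisements, which in turn
 * wins over V_ip6_defhlim.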
*/ int in6_selecthlim(struct in6pcb *in6p, struct ifnet *ifp) { INIT_VNET_INET6(curvnet); if (in6p && in6p->in6p_hops >= 0) return (in6p->in6p_hops); else if (ifp) return (ND_IFINFO(ifp)->chlim); else if (in6p && !IN6_IS_ADDR_UNSPECIFIED(&in6p->in6p_faddr)) { struct route_in6 ro6; struct ifnet *lifp; bzero(&ro6, sizeof(ro6)); ro6.ro_dst.sin6_family = AF_INET6; ro6.ro_dst.sin6_len = sizeof(struct sockaddr_in6); ro6.ro_dst.sin6_addr = in6p->in6p_faddr; rtalloc((struct route *)&ro6); if (ro6.ro_rt) { lifp = ro6.ro_rt->rt_ifp; RTFREE(ro6.ro_rt); if (lifp) return (ND_IFINFO(lifp)->chlim); } else return (V_ip6_defhlim); } return (V_ip6_defhlim); } /* * XXX: this is borrowed from in6_pcbbind(). If possible, we should * share this function by all *bsd*... */ int in6_pcbsetport(struct in6_addr *laddr, struct inpcb *inp, struct ucred *cred) { INIT_VNET_INET(curvnet); struct socket *so = inp->inp_socket; u_int16_t lport = 0, first, last, *lastport; int count, error = 0, wild = 0, dorandom; struct inpcbinfo *pcbinfo = inp->inp_pcbinfo; INP_INFO_WLOCK_ASSERT(pcbinfo); INP_WLOCK_ASSERT(inp); if (prison_local_ip6(cred, laddr, ((inp->inp_flags & IN6P_IPV6_V6ONLY) != 0)) != 0) return(EINVAL); /* XXX: this is redundant when called from in6_pcbbind */ if ((so->so_options & (SO_REUSEADDR|SO_REUSEPORT)) == 0) wild = INPLOOKUP_WILDCARD; inp->inp_flags |= INP_ANONPORT; if (inp->inp_flags & INP_HIGHPORT) { first = V_ipport_hifirstauto; /* sysctl */ last = V_ipport_hilastauto; lastport = &pcbinfo->ipi_lasthi; } else if (inp->inp_flags & INP_LOWPORT) { error = priv_check_cred(cred, PRIV_NETINET_RESERVEDPORT, 0); if (error) return error; first = V_ipport_lowfirstauto; /* 1023 */ last = V_ipport_lowlastauto; /* 600 */ lastport = &pcbinfo->ipi_lastlow; } else { first = V_ipport_firstauto; /* sysctl */ last = V_ipport_lastauto; lastport = &pcbinfo->ipi_lastport; } /* * For UDP, use random port allocation as long as the user * allows it. For TCP (and as of yet unknown) connections, * use random port allocation only if the user allows it AND * ipport_tick() allows it. */ if (V_ipport_randomized && (!V_ipport_stoprandom || pcbinfo == &V_udbinfo)) dorandom = 1; else dorandom = 0; /* * It makes no sense to do random port allocation if * we have the only port available. */ if (first == last) dorandom = 0; /* Make sure to not include UDP packets in the count. */ if (pcbinfo != &V_udbinfo) V_ipport_tcpallocs++; /* * Instead of having two loops further down counting up or down * make sure that first is always <= last and go with only one * code path implementing all logic. */ if (first > last) { u_int16_t aux; aux = first; first = last; last = aux; } if (dorandom) *lastport = first + (arc4random() % (last - first)); count = last - first; do { if (count-- < 0) { /* completely used? */ /* Undo an address bind that may have occurred. 
*/ inp->in6p_laddr = in6addr_any; return (EADDRNOTAVAIL); } ++*lastport; if (*lastport < first || *lastport > last) *lastport = first; lport = htons(*lastport); } while (in6_pcblookup_local(pcbinfo, &inp->in6p_laddr, lport, wild, cred)); inp->inp_lport = lport; if (in_pcbinshash(inp) != 0) { inp->in6p_laddr = in6addr_any; inp->inp_lport = 0; return (EAGAIN); } return (0); } void addrsel_policy_init(void) { ADDRSEL_LOCK_INIT(); ADDRSEL_SXLOCK_INIT(); INIT_VNET_INET6(curvnet); V_ip6_prefer_tempaddr = 0; init_policy_queue(); /* initialize the "last resort" policy */ bzero(&V_defaultaddrpolicy, sizeof(V_defaultaddrpolicy)); V_defaultaddrpolicy.label = ADDR_LABEL_NOTAPP; } static struct in6_addrpolicy * lookup_addrsel_policy(struct sockaddr_in6 *key) { INIT_VNET_INET6(curvnet); struct in6_addrpolicy *match = NULL; ADDRSEL_LOCK(); match = match_addrsel_policy(key); if (match == NULL) match = &V_defaultaddrpolicy; else match->use++; ADDRSEL_UNLOCK(); return (match); } /* * Subroutines to manage the address selection policy table via sysctl. */ struct walkarg { struct sysctl_req *w_req; }; static int in6_src_sysctl(SYSCTL_HANDLER_ARGS); SYSCTL_DECL(_net_inet6_ip6); SYSCTL_NODE(_net_inet6_ip6, IPV6CTL_ADDRCTLPOLICY, addrctlpolicy, CTLFLAG_RD, in6_src_sysctl, ""); static int in6_src_sysctl(SYSCTL_HANDLER_ARGS) { struct walkarg w; if (req->newptr) return EPERM; bzero(&w, sizeof(w)); w.w_req = req; return (walk_addrsel_policy(dump_addrsel_policyent, &w)); } int in6_src_ioctl(u_long cmd, caddr_t data) { int i; struct in6_addrpolicy ent0; if (cmd != SIOCAADDRCTL_POLICY && cmd != SIOCDADDRCTL_POLICY) return (EOPNOTSUPP); /* check for safety */ ent0 = *(struct in6_addrpolicy *)data; if (ent0.label == ADDR_LABEL_NOTAPP) return (EINVAL); /* check if the prefix mask is consecutive. */ if (in6_mask2len(&ent0.addrmask.sin6_addr, NULL) < 0) return (EINVAL); /* clear trailing garbages (if any) of the prefix address. */ for (i = 0; i < 4; i++) { ent0.addr.sin6_addr.s6_addr32[i] &= ent0.addrmask.sin6_addr.s6_addr32[i]; } ent0.use = 0; switch (cmd) { case SIOCAADDRCTL_POLICY: return (add_addrsel_policyent(&ent0)); case SIOCDADDRCTL_POLICY: return (delete_addrsel_policyent(&ent0)); } return (0); /* XXX: compromise compilers */ } /* * The followings are implementation of the policy table using a * simple tail queue. * XXX such details should be hidden. * XXX implementation using binary tree should be more efficient. */ struct addrsel_policyent { TAILQ_ENTRY(addrsel_policyent) ape_entry; struct in6_addrpolicy ape_policy; }; TAILQ_HEAD(addrsel_policyhead, addrsel_policyent); #ifdef VIMAGE_GLOBALS struct addrsel_policyhead addrsel_policytab; #endif static void init_policy_queue(void) { INIT_VNET_INET6(curvnet); TAILQ_INIT(&V_addrsel_policytab); } static int add_addrsel_policyent(struct in6_addrpolicy *newpolicy) { INIT_VNET_INET6(curvnet); struct addrsel_policyent *new, *pol; new = malloc(sizeof(*new), M_IFADDR, M_WAITOK); ADDRSEL_XLOCK(); ADDRSEL_LOCK(); /* duplication check */ TAILQ_FOREACH(pol, &V_addrsel_policytab, ape_entry) { if (IN6_ARE_ADDR_EQUAL(&newpolicy->addr.sin6_addr, &pol->ape_policy.addr.sin6_addr) && IN6_ARE_ADDR_EQUAL(&newpolicy->addrmask.sin6_addr, &pol->ape_policy.addrmask.sin6_addr)) { ADDRSEL_UNLOCK(); ADDRSEL_XUNLOCK(); free(new, M_IFADDR); return (EEXIST); /* or override it? 
*/ } } bzero(new, sizeof(*new)); /* XXX: should validate entry */ new->ape_policy = *newpolicy; TAILQ_INSERT_TAIL(&V_addrsel_policytab, new, ape_entry); ADDRSEL_UNLOCK(); ADDRSEL_XUNLOCK(); return (0); } static int delete_addrsel_policyent(struct in6_addrpolicy *key) { INIT_VNET_INET6(curvnet); struct addrsel_policyent *pol; ADDRSEL_XLOCK(); ADDRSEL_LOCK(); /* search for the entry in the table */ TAILQ_FOREACH(pol, &V_addrsel_policytab, ape_entry) { if (IN6_ARE_ADDR_EQUAL(&key->addr.sin6_addr, &pol->ape_policy.addr.sin6_addr) && IN6_ARE_ADDR_EQUAL(&key->addrmask.sin6_addr, &pol->ape_policy.addrmask.sin6_addr)) { break; } } if (pol == NULL) { ADDRSEL_UNLOCK(); ADDRSEL_XUNLOCK(); return (ESRCH); } TAILQ_REMOVE(&V_addrsel_policytab, pol, ape_entry); ADDRSEL_UNLOCK(); ADDRSEL_XUNLOCK(); return (0); } static int walk_addrsel_policy(int (*callback)(struct in6_addrpolicy *, void *), void *w) { INIT_VNET_INET6(curvnet); struct addrsel_policyent *pol; int error = 0; ADDRSEL_SLOCK(); TAILQ_FOREACH(pol, &V_addrsel_policytab, ape_entry) { if ((error = (*callback)(&pol->ape_policy, w)) != 0) { ADDRSEL_SUNLOCK(); return (error); } } ADDRSEL_SUNLOCK(); return (error); } static int dump_addrsel_policyent(struct in6_addrpolicy *pol, void *arg) { int error = 0; struct walkarg *w = arg; error = SYSCTL_OUT(w->w_req, pol, sizeof(*pol)); return (error); } static struct in6_addrpolicy * match_addrsel_policy(struct sockaddr_in6 *key) { INIT_VNET_INET6(curvnet); struct addrsel_policyent *pent; struct in6_addrpolicy *bestpol = NULL, *pol; int matchlen, bestmatchlen = -1; u_char *mp, *ep, *k, *p, m; TAILQ_FOREACH(pent, &V_addrsel_policytab, ape_entry) { matchlen = 0; pol = &pent->ape_policy; mp = (u_char *)&pol->addrmask.sin6_addr; ep = mp + 16; /* XXX: scope field? */ k = (u_char *)&key->sin6_addr; p = (u_char *)&pol->addr.sin6_addr; for (; mp < ep && *mp; mp++, k++, p++) { m = *mp; if ((*k & m) != *p) goto next; /* not match */ if (m == 0xff) /* short cut for a typical case */ matchlen += 8; else { while (m >= 0x80) { matchlen++; m <<= 1; } } } /* matched. check if this is better than the current best. */ if (bestpol == NULL || matchlen > bestmatchlen) { bestpol = pol; bestmatchlen = matchlen; } next: continue; } return (bestpol); } Index: head/sys/netinet6/in6_var.h =================================================================== --- head/sys/netinet6/in6_var.h (revision 186118) +++ head/sys/netinet6/in6_var.h (revision 186119) @@ -1,635 +1,639 @@ /*- * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the project nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $KAME: in6_var.h,v 1.56 2001/03/29 05:34:31 itojun Exp $ */ /*- * Copyright (c) 1985, 1986, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)in_var.h 8.1 (Berkeley) 6/10/93 * $FreeBSD$ */ #ifndef _NETINET6_IN6_VAR_H_ #define _NETINET6_IN6_VAR_H_ /* * Interface address, Internet version. One of these structures * is allocated for each interface with an Internet address. * The ifaddr structure contains the protocol-independent part * of the structure and is assumed to be first. */ /* * pltime/vltime are just for future reference (required to implements 2 * hour rule for hosts). they should never be modified by nd6_timeout or * anywhere else. 
* userland -> kernel: accept pltime/vltime * kernel -> userland: throw up everything * in kernel: modify preferred/expire only */ struct in6_addrlifetime { time_t ia6t_expire; /* valid lifetime expiration time */ time_t ia6t_preferred; /* preferred lifetime expiration time */ u_int32_t ia6t_vltime; /* valid lifetime */ u_int32_t ia6t_pltime; /* prefix lifetime */ }; struct nd_ifinfo; struct scope6_id; +struct lltable; struct in6_ifextra { struct in6_ifstat *in6_ifstat; struct icmp6_ifstat *icmp6_ifstat; struct nd_ifinfo *nd_ifinfo; struct scope6_id *scope6_id; + struct lltable *lltable; }; + +#define LLTABLE6(ifp) (((struct in6_ifextra *)(ifp)->if_afdata[AF_INET6])->lltable) struct in6_ifaddr { struct ifaddr ia_ifa; /* protocol-independent info */ #define ia_ifp ia_ifa.ifa_ifp #define ia_flags ia_ifa.ifa_flags struct sockaddr_in6 ia_addr; /* interface address */ struct sockaddr_in6 ia_net; /* network number of interface */ struct sockaddr_in6 ia_dstaddr; /* space for destination addr */ struct sockaddr_in6 ia_prefixmask; /* prefix mask */ u_int32_t ia_plen; /* prefix length */ struct in6_ifaddr *ia_next; /* next in6 list of IP6 addresses */ int ia6_flags; struct in6_addrlifetime ia6_lifetime; time_t ia6_createtime; /* the creation time of this address, which is * currently used for temporary addresses only. */ time_t ia6_updatetime; /* back pointer to the ND prefix (for autoconfigured addresses only) */ struct nd_prefix *ia6_ndpr; /* multicast addresses joined from the kernel */ LIST_HEAD(, in6_multi_mship) ia6_memberships; }; /* control structure to manage address selection policy */ struct in6_addrpolicy { struct sockaddr_in6 addr; /* prefix address */ struct sockaddr_in6 addrmask; /* prefix mask */ int preced; /* precedence */ int label; /* matching label */ u_quad_t use; /* statistics */ }; /* * IPv6 interface statistics, as defined in RFC2465 Ipv6IfStatsEntry (p12). 
*/ struct in6_ifstat { u_quad_t ifs6_in_receive; /* # of total input datagram */ u_quad_t ifs6_in_hdrerr; /* # of datagrams with invalid hdr */ u_quad_t ifs6_in_toobig; /* # of datagrams exceeded MTU */ u_quad_t ifs6_in_noroute; /* # of datagrams with no route */ u_quad_t ifs6_in_addrerr; /* # of datagrams with invalid dst */ u_quad_t ifs6_in_protounknown; /* # of datagrams with unknown proto */ /* NOTE: increment on final dst if */ u_quad_t ifs6_in_truncated; /* # of truncated datagrams */ u_quad_t ifs6_in_discard; /* # of discarded datagrams */ /* NOTE: fragment timeout is not here */ u_quad_t ifs6_in_deliver; /* # of datagrams delivered to ULP */ /* NOTE: increment on final dst if */ u_quad_t ifs6_out_forward; /* # of datagrams forwarded */ /* NOTE: increment on outgoing if */ u_quad_t ifs6_out_request; /* # of outgoing datagrams from ULP */ /* NOTE: does not include forwrads */ u_quad_t ifs6_out_discard; /* # of discarded datagrams */ u_quad_t ifs6_out_fragok; /* # of datagrams fragmented */ u_quad_t ifs6_out_fragfail; /* # of datagrams failed on fragment */ u_quad_t ifs6_out_fragcreat; /* # of fragment datagrams */ /* NOTE: this is # after fragment */ u_quad_t ifs6_reass_reqd; /* # of incoming fragmented packets */ /* NOTE: increment on final dst if */ u_quad_t ifs6_reass_ok; /* # of reassembled packets */ /* NOTE: this is # after reass */ /* NOTE: increment on final dst if */ u_quad_t ifs6_reass_fail; /* # of reass failures */ /* NOTE: may not be packet count */ /* NOTE: increment on final dst if */ u_quad_t ifs6_in_mcast; /* # of inbound multicast datagrams */ u_quad_t ifs6_out_mcast; /* # of outbound multicast datagrams */ }; /* * ICMPv6 interface statistics, as defined in RFC2466 Ipv6IfIcmpEntry. * XXX: I'm not sure if this file is the right place for this structure... */ struct icmp6_ifstat { /* * Input statistics */ /* ipv6IfIcmpInMsgs, total # of input messages */ u_quad_t ifs6_in_msg; /* ipv6IfIcmpInErrors, # of input error messages */ u_quad_t ifs6_in_error; /* ipv6IfIcmpInDestUnreachs, # of input dest unreach errors */ u_quad_t ifs6_in_dstunreach; /* ipv6IfIcmpInAdminProhibs, # of input administratively prohibited errs */ u_quad_t ifs6_in_adminprohib; /* ipv6IfIcmpInTimeExcds, # of input time exceeded errors */ u_quad_t ifs6_in_timeexceed; /* ipv6IfIcmpInParmProblems, # of input parameter problem errors */ u_quad_t ifs6_in_paramprob; /* ipv6IfIcmpInPktTooBigs, # of input packet too big errors */ u_quad_t ifs6_in_pkttoobig; /* ipv6IfIcmpInEchos, # of input echo requests */ u_quad_t ifs6_in_echo; /* ipv6IfIcmpInEchoReplies, # of input echo replies */ u_quad_t ifs6_in_echoreply; /* ipv6IfIcmpInRouterSolicits, # of input router solicitations */ u_quad_t ifs6_in_routersolicit; /* ipv6IfIcmpInRouterAdvertisements, # of input router advertisements */ u_quad_t ifs6_in_routeradvert; /* ipv6IfIcmpInNeighborSolicits, # of input neighbor solicitations */ u_quad_t ifs6_in_neighborsolicit; /* ipv6IfIcmpInNeighborAdvertisements, # of input neighbor advertisements */ u_quad_t ifs6_in_neighboradvert; /* ipv6IfIcmpInRedirects, # of input redirects */ u_quad_t ifs6_in_redirect; /* ipv6IfIcmpInGroupMembQueries, # of input MLD queries */ u_quad_t ifs6_in_mldquery; /* ipv6IfIcmpInGroupMembResponses, # of input MLD reports */ u_quad_t ifs6_in_mldreport; /* ipv6IfIcmpInGroupMembReductions, # of input MLD done */ u_quad_t ifs6_in_mlddone; /* * Output statistics. We should solve unresolved routing problem... 
*/ /* ipv6IfIcmpOutMsgs, total # of output messages */ u_quad_t ifs6_out_msg; /* ipv6IfIcmpOutErrors, # of output error messages */ u_quad_t ifs6_out_error; /* ipv6IfIcmpOutDestUnreachs, # of output dest unreach errors */ u_quad_t ifs6_out_dstunreach; /* ipv6IfIcmpOutAdminProhibs, # of output administratively prohibited errs */ u_quad_t ifs6_out_adminprohib; /* ipv6IfIcmpOutTimeExcds, # of output time exceeded errors */ u_quad_t ifs6_out_timeexceed; /* ipv6IfIcmpOutParmProblems, # of output parameter problem errors */ u_quad_t ifs6_out_paramprob; /* ipv6IfIcmpOutPktTooBigs, # of output packet too big errors */ u_quad_t ifs6_out_pkttoobig; /* ipv6IfIcmpOutEchos, # of output echo requests */ u_quad_t ifs6_out_echo; /* ipv6IfIcmpOutEchoReplies, # of output echo replies */ u_quad_t ifs6_out_echoreply; /* ipv6IfIcmpOutRouterSolicits, # of output router solicitations */ u_quad_t ifs6_out_routersolicit; /* ipv6IfIcmpOutRouterAdvertisements, # of output router advertisements */ u_quad_t ifs6_out_routeradvert; /* ipv6IfIcmpOutNeighborSolicits, # of output neighbor solicitations */ u_quad_t ifs6_out_neighborsolicit; /* ipv6IfIcmpOutNeighborAdvertisements, # of output neighbor advertisements */ u_quad_t ifs6_out_neighboradvert; /* ipv6IfIcmpOutRedirects, # of output redirects */ u_quad_t ifs6_out_redirect; /* ipv6IfIcmpOutGroupMembQueries, # of output MLD queries */ u_quad_t ifs6_out_mldquery; /* ipv6IfIcmpOutGroupMembResponses, # of output MLD reports */ u_quad_t ifs6_out_mldreport; /* ipv6IfIcmpOutGroupMembReductions, # of output MLD done */ u_quad_t ifs6_out_mlddone; }; struct in6_ifreq { char ifr_name[IFNAMSIZ]; union { struct sockaddr_in6 ifru_addr; struct sockaddr_in6 ifru_dstaddr; int ifru_flags; int ifru_flags6; int ifru_metric; caddr_t ifru_data; struct in6_addrlifetime ifru_lifetime; struct in6_ifstat ifru_stat; struct icmp6_ifstat ifru_icmp6stat; u_int32_t ifru_scope_id[16]; } ifr_ifru; }; struct in6_aliasreq { char ifra_name[IFNAMSIZ]; struct sockaddr_in6 ifra_addr; struct sockaddr_in6 ifra_dstaddr; struct sockaddr_in6 ifra_prefixmask; int ifra_flags; struct in6_addrlifetime ifra_lifetime; }; /* prefix type macro */ #define IN6_PREFIX_ND 1 #define IN6_PREFIX_RR 2 /* * prefix related flags passed between kernel(NDP related part) and * user land command(ifconfig) and daemon(rtadvd). 
*/ struct in6_prflags { struct prf_ra { u_char onlink : 1; u_char autonomous : 1; u_char reserved : 6; } prf_ra; u_char prf_reserved1; u_short prf_reserved2; /* want to put this on 4byte offset */ struct prf_rr { u_char decrvalid : 1; u_char decrprefd : 1; u_char reserved : 6; } prf_rr; u_char prf_reserved3; u_short prf_reserved4; }; struct in6_prefixreq { char ipr_name[IFNAMSIZ]; u_char ipr_origin; u_char ipr_plen; u_int32_t ipr_vltime; u_int32_t ipr_pltime; struct in6_prflags ipr_flags; struct sockaddr_in6 ipr_prefix; }; #define PR_ORIG_RA 0 #define PR_ORIG_RR 1 #define PR_ORIG_STATIC 2 #define PR_ORIG_KERNEL 3 #define ipr_raf_onlink ipr_flags.prf_ra.onlink #define ipr_raf_auto ipr_flags.prf_ra.autonomous #define ipr_statef_onlink ipr_flags.prf_state.onlink #define ipr_rrf_decrvalid ipr_flags.prf_rr.decrvalid #define ipr_rrf_decrprefd ipr_flags.prf_rr.decrprefd struct in6_rrenumreq { char irr_name[IFNAMSIZ]; u_char irr_origin; u_char irr_m_len; /* match len for matchprefix */ u_char irr_m_minlen; /* minlen for matching prefix */ u_char irr_m_maxlen; /* maxlen for matching prefix */ u_char irr_u_uselen; /* uselen for adding prefix */ u_char irr_u_keeplen; /* keeplen from matching prefix */ struct irr_raflagmask { u_char onlink : 1; u_char autonomous : 1; u_char reserved : 6; } irr_raflagmask; u_int32_t irr_vltime; u_int32_t irr_pltime; struct in6_prflags irr_flags; struct sockaddr_in6 irr_matchprefix; struct sockaddr_in6 irr_useprefix; }; #define irr_raf_mask_onlink irr_raflagmask.onlink #define irr_raf_mask_auto irr_raflagmask.autonomous #define irr_raf_mask_reserved irr_raflagmask.reserved #define irr_raf_onlink irr_flags.prf_ra.onlink #define irr_raf_auto irr_flags.prf_ra.autonomous #define irr_statef_onlink irr_flags.prf_state.onlink #define irr_rrf irr_flags.prf_rr #define irr_rrf_decrvalid irr_flags.prf_rr.decrvalid #define irr_rrf_decrprefd irr_flags.prf_rr.decrprefd /* * Given a pointer to an in6_ifaddr (ifaddr), * return a pointer to the addr as a sockaddr_in6 */ #define IA6_IN6(ia) (&((ia)->ia_addr.sin6_addr)) #define IA6_DSTIN6(ia) (&((ia)->ia_dstaddr.sin6_addr)) #define IA6_MASKIN6(ia) (&((ia)->ia_prefixmask.sin6_addr)) #define IA6_SIN6(ia) (&((ia)->ia_addr)) #define IA6_DSTSIN6(ia) (&((ia)->ia_dstaddr)) #define IFA_IN6(x) (&((struct sockaddr_in6 *)((x)->ifa_addr))->sin6_addr) #define IFA_DSTIN6(x) (&((struct sockaddr_in6 *)((x)->ifa_dstaddr))->sin6_addr) #define IFPR_IN6(x) (&((struct sockaddr_in6 *)((x)->ifpr_prefix))->sin6_addr) #ifdef _KERNEL #define IN6_ARE_MASKED_ADDR_EQUAL(d, a, m) ( \ (((d)->s6_addr32[0] ^ (a)->s6_addr32[0]) & (m)->s6_addr32[0]) == 0 && \ (((d)->s6_addr32[1] ^ (a)->s6_addr32[1]) & (m)->s6_addr32[1]) == 0 && \ (((d)->s6_addr32[2] ^ (a)->s6_addr32[2]) & (m)->s6_addr32[2]) == 0 && \ (((d)->s6_addr32[3] ^ (a)->s6_addr32[3]) & (m)->s6_addr32[3]) == 0 ) #endif #define SIOCSIFADDR_IN6 _IOW('i', 12, struct in6_ifreq) #define SIOCGIFADDR_IN6 _IOWR('i', 33, struct in6_ifreq) #ifdef _KERNEL /* * SIOCSxxx ioctls should be unused (see comments in in6.c), but * we do not shift numbers for binary compatibility. 
*/ #define SIOCSIFDSTADDR_IN6 _IOW('i', 14, struct in6_ifreq) #define SIOCSIFNETMASK_IN6 _IOW('i', 22, struct in6_ifreq) #endif #define SIOCGIFDSTADDR_IN6 _IOWR('i', 34, struct in6_ifreq) #define SIOCGIFNETMASK_IN6 _IOWR('i', 37, struct in6_ifreq) #define SIOCDIFADDR_IN6 _IOW('i', 25, struct in6_ifreq) #define SIOCAIFADDR_IN6 _IOW('i', 26, struct in6_aliasreq) #define SIOCSIFPHYADDR_IN6 _IOW('i', 70, struct in6_aliasreq) #define SIOCGIFPSRCADDR_IN6 _IOWR('i', 71, struct in6_ifreq) #define SIOCGIFPDSTADDR_IN6 _IOWR('i', 72, struct in6_ifreq) #define SIOCGIFAFLAG_IN6 _IOWR('i', 73, struct in6_ifreq) #define SIOCGDRLST_IN6 _IOWR('i', 74, struct in6_drlist) #ifdef _KERNEL /* XXX: SIOCGPRLST_IN6 is exposed in KAME but in6_oprlist is not. */ #define SIOCGPRLST_IN6 _IOWR('i', 75, struct in6_oprlist) #endif #ifdef _KERNEL #define OSIOCGIFINFO_IN6 _IOWR('i', 76, struct in6_ondireq) #endif #define SIOCGIFINFO_IN6 _IOWR('i', 108, struct in6_ndireq) #define SIOCSIFINFO_IN6 _IOWR('i', 109, struct in6_ndireq) #define SIOCSNDFLUSH_IN6 _IOWR('i', 77, struct in6_ifreq) #define SIOCGNBRINFO_IN6 _IOWR('i', 78, struct in6_nbrinfo) #define SIOCSPFXFLUSH_IN6 _IOWR('i', 79, struct in6_ifreq) #define SIOCSRTRFLUSH_IN6 _IOWR('i', 80, struct in6_ifreq) #define SIOCGIFALIFETIME_IN6 _IOWR('i', 81, struct in6_ifreq) #define SIOCSIFALIFETIME_IN6 _IOWR('i', 82, struct in6_ifreq) #define SIOCGIFSTAT_IN6 _IOWR('i', 83, struct in6_ifreq) #define SIOCGIFSTAT_ICMP6 _IOWR('i', 84, struct in6_ifreq) #define SIOCSDEFIFACE_IN6 _IOWR('i', 85, struct in6_ndifreq) #define SIOCGDEFIFACE_IN6 _IOWR('i', 86, struct in6_ndifreq) #define SIOCSIFINFO_FLAGS _IOWR('i', 87, struct in6_ndireq) /* XXX */ #define SIOCSSCOPE6 _IOW('i', 88, struct in6_ifreq) #define SIOCGSCOPE6 _IOWR('i', 89, struct in6_ifreq) #define SIOCGSCOPE6DEF _IOWR('i', 90, struct in6_ifreq) #define SIOCSIFPREFIX_IN6 _IOW('i', 100, struct in6_prefixreq) /* set */ #define SIOCGIFPREFIX_IN6 _IOWR('i', 101, struct in6_prefixreq) /* get */ #define SIOCDIFPREFIX_IN6 _IOW('i', 102, struct in6_prefixreq) /* del */ #define SIOCAIFPREFIX_IN6 _IOW('i', 103, struct in6_rrenumreq) /* add */ #define SIOCCIFPREFIX_IN6 _IOW('i', 104, \ struct in6_rrenumreq) /* change */ #define SIOCSGIFPREFIX_IN6 _IOW('i', 105, \ struct in6_rrenumreq) /* set global */ #define SIOCGETSGCNT_IN6 _IOWR('u', 106, \ struct sioc_sg_req6) /* get s,g pkt cnt */ #define SIOCGETMIFCNT_IN6 _IOWR('u', 107, \ struct sioc_mif_req6) /* get pkt cnt per if */ #define SIOCAADDRCTL_POLICY _IOW('u', 108, struct in6_addrpolicy) #define SIOCDADDRCTL_POLICY _IOW('u', 109, struct in6_addrpolicy) #define IN6_IFF_ANYCAST 0x01 /* anycast address */ #define IN6_IFF_TENTATIVE 0x02 /* tentative address */ #define IN6_IFF_DUPLICATED 0x04 /* DAD detected duplicate */ #define IN6_IFF_DETACHED 0x08 /* may be detached from the link */ #define IN6_IFF_DEPRECATED 0x10 /* deprecated address */ #define IN6_IFF_NODAD 0x20 /* don't perform DAD on this address * (used only at first SIOC* call) */ #define IN6_IFF_AUTOCONF 0x40 /* autoconfigurable address. */ #define IN6_IFF_TEMPORARY 0x80 /* temporary (anonymous) address. */ #define IN6_IFF_NOPFX 0x8000 /* skip kernel prefix management. * XXX: this should be temporary. 
*/ /* do not input/output */ #define IN6_IFF_NOTREADY (IN6_IFF_TENTATIVE|IN6_IFF_DUPLICATED) #ifdef _KERNEL #define IN6_ARE_SCOPE_CMP(a,b) ((a)-(b)) #define IN6_ARE_SCOPE_EQUAL(a,b) ((a)==(b)) #endif #ifdef _KERNEL #ifdef VIMAGE_GLOBALS extern struct in6_ifaddr *in6_ifaddr; extern struct icmp6stat icmp6stat; extern unsigned long in6_maxmtu; #endif /* VIMAGE_GLOBALS */ #define in6_ifstat_inc(ifp, tag) \ do { \ if (ifp) \ ((struct in6_ifextra *)((ifp)->if_afdata[AF_INET6]))->in6_ifstat->tag++; \ } while (/*CONSTCOND*/ 0) extern struct in6_addr zeroin6_addr; extern u_char inet6ctlerrmap[]; #ifdef MALLOC_DECLARE MALLOC_DECLARE(M_IP6MADDR); #endif /* MALLOC_DECLARE */ /* * Macro for finding the internet address structure (in6_ifaddr) corresponding * to a given interface (ifnet structure). */ #define IFP_TO_IA6(ifp, ia) \ /* struct ifnet *ifp; */ \ /* struct in6_ifaddr *ia; */ \ do { \ struct ifaddr *ifa; \ TAILQ_FOREACH(ifa, &(ifp)->if_addrlist, ifa_list) { \ if (ifa->ifa_addr->sa_family == AF_INET6) \ break; \ } \ (ia) = (struct in6_ifaddr *)ifa; \ } while (/*CONSTCOND*/ 0) #endif /* _KERNEL */ /* * Multi-cast membership entry. One for each group/ifp that a PCB * belongs to. */ struct in6_multi_mship { struct in6_multi *i6mm_maddr; /* Multicast address pointer */ LIST_ENTRY(in6_multi_mship) i6mm_chain; /* multicast options chain */ }; struct in6_multi { LIST_ENTRY(in6_multi) in6m_entry; /* list glue */ struct in6_addr in6m_addr; /* IP6 multicast address */ struct ifnet *in6m_ifp; /* back pointer to ifnet */ struct ifmultiaddr *in6m_ifma; /* back pointer to ifmultiaddr */ u_int in6m_refcount; /* # membership claims by sockets */ u_int in6m_state; /* state of the membership */ u_int in6m_timer; /* MLD6 listener report timer */ struct timeval in6m_timer_expire; /* when the timer expires */ struct callout *in6m_timer_ch; }; #define IN6M_TIMER_UNDEF -1 #ifdef _KERNEL /* flags to in6_update_ifa */ #define IN6_IFAUPDATE_DADDELAY 0x1 /* first time to configure an address */ extern LIST_HEAD(in6_multihead, in6_multi) in6_multihead; /* * Structure used by macros below to remember position when stepping through * all of the in6_multi records. */ struct in6_multistep { struct in6_ifaddr *i_ia; struct in6_multi *i_in6m; }; /* * Macros for looking up the in6_multi record for a given IP6 multicast * address on a given interface. If no matching record is found, "in6m" * returns NULL. */ #define IN6_LOOKUP_MULTI(addr, ifp, in6m) \ /* struct in6_addr addr; */ \ /* struct ifnet *ifp; */ \ /* struct in6_multi *in6m; */ \ do { \ struct ifmultiaddr *ifma; \ IF_ADDR_LOCK(ifp); \ TAILQ_FOREACH(ifma, &(ifp)->if_multiaddrs, ifma_link) { \ if (ifma->ifma_addr->sa_family == AF_INET6 \ && IN6_ARE_ADDR_EQUAL(&((struct sockaddr_in6 *)ifma->ifma_addr)->sin6_addr, \ &(addr))) \ break; \ } \ (in6m) = (struct in6_multi *)(ifma ? ifma->ifma_protospec : 0); \ IF_ADDR_UNLOCK(ifp); \ } while(0) /* * Macro to step through all of the in6_multi records, one at a time. * The current position is remembered in "step", which the caller must * provide. IN6_FIRST_MULTI(), below, must be called to initialize "step" * and get the first record. Both macros return a NULL "in6m" when there * are no remaining records. 
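 *
 * A minimal usage sketch (added for illustration; not part of the
 * original header):
 *
 *	struct in6_multistep step;
 *	struct in6_multi *in6m;
 *
 *	IN6_FIRST_MULTI(step, in6m);
 *	while (in6m != NULL) {
 *		... look at in6m->in6m_addr, in6m->in6m_ifp ...
 *		IN6_NEXT_MULTI(step, in6m);
 *	}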
*/ #define IN6_NEXT_MULTI(step, in6m) \ /* struct in6_multistep step; */ \ /* struct in6_multi *in6m; */ \ do { \ if (((in6m) = (step).i_in6m) != NULL) \ (step).i_in6m = LIST_NEXT((step).i_in6m, in6m_entry); \ } while(0) #define IN6_FIRST_MULTI(step, in6m) \ /* struct in6_multistep step; */ \ /* struct in6_multi *in6m */ \ do { \ (step).i_in6m = LIST_FIRST(&in6_multihead); \ IN6_NEXT_MULTI((step), (in6m)); \ } while(0) struct in6_multi *in6_addmulti __P((struct in6_addr *, struct ifnet *, int *, int)); void in6_delmulti __P((struct in6_multi *)); struct in6_multi_mship *in6_joingroup(struct ifnet *, struct in6_addr *, int *, int); int in6_leavegroup(struct in6_multi_mship *); int in6_mask2len __P((struct in6_addr *, u_char *)); int in6_control __P((struct socket *, u_long, caddr_t, struct ifnet *, struct thread *)); int in6_update_ifa __P((struct ifnet *, struct in6_aliasreq *, struct in6_ifaddr *, int)); void in6_purgeaddr __P((struct ifaddr *)); int in6if_do_dad __P((struct ifnet *)); void in6_purgeif __P((struct ifnet *)); void in6_savemkludge __P((struct in6_ifaddr *)); void *in6_domifattach __P((struct ifnet *)); void in6_domifdetach __P((struct ifnet *, void *)); void in6_setmaxmtu __P((void)); int in6_if2idlen __P((struct ifnet *)); void in6_restoremkludge __P((struct in6_ifaddr *, struct ifnet *)); void in6_purgemkludge __P((struct ifnet *)); struct in6_ifaddr *in6ifa_ifpforlinklocal __P((struct ifnet *, int)); struct in6_ifaddr *in6ifa_ifpwithaddr __P((struct ifnet *, struct in6_addr *)); char *ip6_sprintf __P((char *, const struct in6_addr *)); int in6_addr2zoneid __P((struct ifnet *, struct in6_addr *, u_int32_t *)); int in6_matchlen __P((struct in6_addr *, struct in6_addr *)); int in6_are_prefix_equal __P((struct in6_addr *, struct in6_addr *, int)); void in6_prefixlen2mask __P((struct in6_addr *, int)); int in6_prefix_ioctl __P((struct socket *, u_long, caddr_t, struct ifnet *)); int in6_prefix_add_ifid __P((int, struct in6_ifaddr *)); void in6_prefix_remove_ifid __P((int, struct in6_ifaddr *)); void in6_purgeprefix __P((struct ifnet *)); void in6_ifremloop(struct ifaddr *); void in6_ifaddloop(struct ifaddr *); int in6_is_addr_deprecated __P((struct sockaddr_in6 *)); struct inpcb; int in6_src_ioctl __P((u_long, caddr_t)); #endif /* _KERNEL */ #endif /* _NETINET6_IN6_VAR_H_ */ Index: head/sys/netinet6/ip6_input.c =================================================================== --- head/sys/netinet6/ip6_input.c (revision 186118) +++ head/sys/netinet6/ip6_input.c (revision 186119) @@ -1,1681 +1,1702 @@ /*- * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the project nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. 
* * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $KAME: ip6_input.c,v 1.259 2002/01/21 04:58:09 jinmei Exp $ */ /*- * Copyright (c) 1982, 1986, 1988, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* * @(#)ip_input.c 8.2 (Berkeley) 1/4/94 */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include "opt_ipsec.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #ifdef INET #include #include #include #endif /* INET */ #include #include #include #include #include #include #include #include #include #ifdef IPSEC #include #include #include #endif /* IPSEC */ #include extern struct domain inet6domain; u_char ip6_protox[IPPROTO_MAX]; static struct ifqueue ip6intrq; #ifndef VIMAGE #ifndef VIMAGE_GLOBALS struct vnet_inet6 vnet_inet6_0; #endif #endif #ifdef VIMAGE_GLOBALS static int ip6qmaxlen; struct in6_ifaddr *in6_ifaddr; struct ip6stat ip6stat; extern struct callout in6_tmpaddrtimer_ch; extern int dad_init; extern int pmtu_expire; extern int pmtu_probe; extern u_long rip6_sendspace; extern u_long rip6_recvspace; extern int icmp6errppslim; extern int icmp6_nodeinfo; extern int udp6_sendspace; extern int udp6_recvspace; extern struct route_in6 ip6_forward_rt; int ip6_forward_srcrt; /* XXX */ int ip6_sourcecheck; /* XXX */ int ip6_sourcecheck_interval; /* XXX */ int ip6_ours_check_algorithm; #endif struct pfil_head inet6_pfil_hook; static void ip6_init2(void *); static struct ip6aux *ip6_setdstifaddr(struct mbuf *, struct in6_ifaddr *); static int ip6_hopopts_input(u_int32_t *, u_int32_t *, struct mbuf **, int *); #ifdef PULLDOWN_TEST static struct mbuf *ip6_pullexthdr(struct mbuf *, size_t, int); #endif /* * IP6 initialization: fill in IP6 protocol switch table. * All protocols not implemented in kernel go to raw IP6 protocol handler. */ void ip6_init(void) { INIT_VNET_INET6(curvnet); struct ip6protosw *pr; int i; V_ip6qmaxlen = IFQ_MAXLEN; V_in6_maxmtu = 0; #ifdef IP6_AUTO_LINKLOCAL V_ip6_auto_linklocal = IP6_AUTO_LINKLOCAL; #else V_ip6_auto_linklocal = 1; /* enable by default */ #endif TUNABLE_INT_FETCH("net.inet6.ip6.auto_linklocal", &V_ip6_auto_linklocal); #ifndef IPV6FORWARDING #ifdef GATEWAY6 #define IPV6FORWARDING 1 /* forward IP6 packets not for us */ #else #define IPV6FORWARDING 0 /* don't forward IP6 packets not for us */ #endif /* GATEWAY6 */ #endif /* !IPV6FORWARDING */ #ifndef IPV6_SENDREDIRECTS #define IPV6_SENDREDIRECTS 1 #endif V_ip6_forwarding = IPV6FORWARDING; /* act as router? */ V_ip6_sendredirects = IPV6_SENDREDIRECTS; V_ip6_defhlim = IPV6_DEFHLIM; V_ip6_defmcasthlim = IPV6_DEFAULT_MULTICAST_HOPS; V_ip6_accept_rtadv = 0; /* "IPV6FORWARDING ? 0 : 1" is dangerous */ V_ip6_log_interval = 5; V_ip6_hdrnestlimit = 15; /* How many header options will we process? */ V_ip6_dad_count = 1; /* DupAddrDetectionTransmits */ V_ip6_auto_flowlabel = 1; V_ip6_use_deprecated = 1;/* allow deprecated addr (RFC2462 5.5.4) */ V_ip6_rr_prune = 5; /* router renumbering prefix * walk list every 5 sec. */ V_ip6_mcast_pmtu = 0; /* enable pMTU discovery for multicast? */ V_ip6_v6only = 1; V_ip6_keepfaith = 0; V_ip6_log_time = (time_t)0L; #ifdef IPSTEALTH V_ip6stealth = 0; #endif V_nd6_onlink_ns_rfc4861 = 0; /* allow 'on-link' nd6 NS (RFC 4861) */ V_pmtu_expire = 60*10; V_pmtu_probe = 60*2; /* raw IP6 parameters */ /* * Nominal space allocated to a raw ip socket. 
*/ #define RIPV6SNDQ 8192 #define RIPV6RCVQ 8192 V_rip6_sendspace = RIPV6SNDQ; V_rip6_recvspace = RIPV6RCVQ; /* ICMPV6 parameters */ V_icmp6_rediraccept = 1; /* accept and process redirects */ V_icmp6_redirtimeout = 10 * 60; /* 10 minutes */ V_icmp6errppslim = 100; /* 100pps */ /* control how to respond to NI queries */ V_icmp6_nodeinfo = (ICMP6_NODEINFO_FQDNOK|ICMP6_NODEINFO_NODEADDROK); /* UDP on IP6 parameters */ V_udp6_sendspace = 9216; /* really max datagram size */ V_udp6_recvspace = 40 * (1024 + sizeof(struct sockaddr_in6)); /* 40 1K datagrams */ V_dad_init = 0; #ifdef DIAGNOSTIC if (sizeof(struct protosw) != sizeof(struct ip6protosw)) panic("sizeof(protosw) != sizeof(ip6protosw)"); #endif pr = (struct ip6protosw *)pffindproto(PF_INET6, IPPROTO_RAW, SOCK_RAW); if (pr == 0) panic("ip6_init"); /* Initialize the entire ip_protox[] array to IPPROTO_RAW. */ for (i = 0; i < IPPROTO_MAX; i++) ip6_protox[i] = pr - inet6sw; /* * Cycle through IP protocols and put them into the appropriate place * in ip6_protox[]. */ for (pr = (struct ip6protosw *)inet6domain.dom_protosw; pr < (struct ip6protosw *)inet6domain.dom_protoswNPROTOSW; pr++) if (pr->pr_domain->dom_family == PF_INET6 && pr->pr_protocol && pr->pr_protocol != IPPROTO_RAW) { /* Be careful to only index valid IP protocols. */ if (pr->pr_protocol < IPPROTO_MAX) ip6_protox[pr->pr_protocol] = pr - inet6sw; } /* Initialize packet filter hooks. */ inet6_pfil_hook.ph_type = PFIL_TYPE_AF; inet6_pfil_hook.ph_af = AF_INET6; if ((i = pfil_head_register(&inet6_pfil_hook)) != 0) printf("%s: WARNING: unable to register pfil hook, " "error %d\n", __func__, i); ip6intrq.ifq_maxlen = V_ip6qmaxlen; mtx_init(&ip6intrq.ifq_mtx, "ip6_inq", NULL, MTX_DEF); netisr_register(NETISR_IPV6, ip6_input, &ip6intrq, 0); scope6_init(); addrsel_policy_init(); nd6_init(); frag6_init(); V_ip6_desync_factor = arc4random() % MAX_TEMP_DESYNC_FACTOR; } static void ip6_init2(void *dummy) { INIT_VNET_INET6(curvnet); /* nd6_timer_init */ callout_init(&V_nd6_timer_ch, 0); callout_reset(&V_nd6_timer_ch, hz, nd6_timer, NULL); /* timer for regeneranation of temporary addresses randomize ID */ callout_init(&V_in6_tmpaddrtimer_ch, 0); callout_reset(&V_in6_tmpaddrtimer_ch, (V_ip6_temp_preferred_lifetime - V_ip6_desync_factor - V_ip6_temp_regen_advance) * hz, in6_tmpaddrtimer, NULL); } /* cheat */ /* This must be after route_init(), which is now SI_ORDER_THIRD */ SYSINIT(netinet6init2, SI_SUB_PROTO_DOMAIN, SI_ORDER_MIDDLE, ip6_init2, NULL); void ip6_input(struct mbuf *m) { INIT_VNET_NET(curvnet); INIT_VNET_INET6(curvnet); struct ip6_hdr *ip6; int off = sizeof(struct ip6_hdr), nest; u_int32_t plen; u_int32_t rtalert = ~0; int nxt, ours = 0; - struct ifnet *deliverifp = NULL; + struct ifnet *deliverifp = NULL, *ifp = NULL; struct in6_addr odst; int srcrt = 0; + struct llentry *lle = NULL; + struct sockaddr_in6 dst6; #ifdef IPSEC /* * should the inner packet be considered authentic? * see comment in ah4_input(). * NB: m cannot be NULL when passed to the input routine */ m->m_flags &= ~M_AUTHIPHDR; m->m_flags &= ~M_AUTHIPDGM; #endif /* IPSEC */ /* * make sure we don't have onion peering information into m_tag. 
*/ ip6_delaux(m); /* * mbuf statistics */ if (m->m_flags & M_EXT) { if (m->m_next) V_ip6stat.ip6s_mext2m++; else V_ip6stat.ip6s_mext1++; } else { #define M2MMAX (sizeof(V_ip6stat.ip6s_m2m)/sizeof(V_ip6stat.ip6s_m2m[0])) if (m->m_next) { if (m->m_flags & M_LOOP) { V_ip6stat.ip6s_m2m[V_loif[0].if_index]++; /* XXX */ } else if (m->m_pkthdr.rcvif->if_index < M2MMAX) V_ip6stat.ip6s_m2m[m->m_pkthdr.rcvif->if_index]++; else V_ip6stat.ip6s_m2m[0]++; } else V_ip6stat.ip6s_m1++; #undef M2MMAX } /* drop the packet if IPv6 operation is disabled on the IF */ if ((ND_IFINFO(m->m_pkthdr.rcvif)->flags & ND6_IFF_IFDISABLED)) { m_freem(m); return; } in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_receive); V_ip6stat.ip6s_total++; #ifndef PULLDOWN_TEST /* * L2 bridge code and some other code can return mbuf chain * that does not conform to KAME requirement. too bad. * XXX: fails to join if interface MTU > MCLBYTES. jumbogram? */ if (m && m->m_next != NULL && m->m_pkthdr.len < MCLBYTES) { struct mbuf *n; MGETHDR(n, M_DONTWAIT, MT_HEADER); if (n) M_MOVE_PKTHDR(n, m); if (n && n->m_pkthdr.len > MHLEN) { MCLGET(n, M_DONTWAIT); if ((n->m_flags & M_EXT) == 0) { m_freem(n); n = NULL; } } if (n == NULL) { m_freem(m); return; /* ENOBUFS */ } m_copydata(m, 0, n->m_pkthdr.len, mtod(n, caddr_t)); n->m_len = n->m_pkthdr.len; m_freem(m); m = n; } IP6_EXTHDR_CHECK(m, 0, sizeof(struct ip6_hdr), /* nothing */); #endif if (m->m_len < sizeof(struct ip6_hdr)) { struct ifnet *inifp; inifp = m->m_pkthdr.rcvif; if ((m = m_pullup(m, sizeof(struct ip6_hdr))) == NULL) { V_ip6stat.ip6s_toosmall++; in6_ifstat_inc(inifp, ifs6_in_hdrerr); return; } } ip6 = mtod(m, struct ip6_hdr *); if ((ip6->ip6_vfc & IPV6_VERSION_MASK) != IPV6_VERSION) { V_ip6stat.ip6s_badvers++; in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_hdrerr); goto bad; } V_ip6stat.ip6s_nxthist[ip6->ip6_nxt]++; /* * Check against address spoofing/corruption. */ if (IN6_IS_ADDR_MULTICAST(&ip6->ip6_src) || IN6_IS_ADDR_UNSPECIFIED(&ip6->ip6_dst)) { /* * XXX: "badscope" is not very suitable for a multicast source. */ V_ip6stat.ip6s_badscope++; in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_addrerr); goto bad; } if (IN6_IS_ADDR_MC_INTFACELOCAL(&ip6->ip6_dst) && !(m->m_flags & M_LOOP)) { /* * In this case, the packet should come from the loopback * interface. However, we cannot just check the if_flags, * because ip6_mloopback() passes the "actual" interface * as the outgoing/incoming interface. */ V_ip6stat.ip6s_badscope++; in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_addrerr); goto bad; } #ifdef ALTQ if (altq_input != NULL && (*altq_input)(m, AF_INET6) == 0) { /* packet is dropped by traffic conditioner */ return; } #endif /* * The following check is not documented in specs. A malicious * party may be able to use IPv4 mapped addr to confuse tcp/udp stack * and bypass security checks (act as if it was from 127.0.0.1 by using * IPv6 src ::ffff:127.0.0.1). Be cautious. * * This check chokes if we are in an SIIT cloud. As none of BSDs * support IPv4-less kernel compilation, we cannot support SIIT * environment at all. So, it makes more sense for us to reject any * malicious packets for non-SIIT environment, than try to do a * partial support for SIIT environment. */ if (IN6_IS_ADDR_V4MAPPED(&ip6->ip6_src) || IN6_IS_ADDR_V4MAPPED(&ip6->ip6_dst)) { V_ip6stat.ip6s_badscope++; in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_addrerr); goto bad; } #if 0 /* * Reject packets with IPv4 compatible addresses (auto tunnel). * * The code forbids auto tunnel relay case in RFC1933 (the check is * stronger than RFC1933). 
We may want to re-enable it if mech-xx * is revised to forbid relaying case. */ if (IN6_IS_ADDR_V4COMPAT(&ip6->ip6_src) || IN6_IS_ADDR_V4COMPAT(&ip6->ip6_dst)) { V_ip6stat.ip6s_badscope++; in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_addrerr); goto bad; } #endif /* * Run through list of hooks for input packets. * * NB: Beware of the destination address changing * (e.g. by NAT rewriting). When this happens, * tell ip6_forward to do the right thing. */ odst = ip6->ip6_dst; /* Jump over all PFIL processing if hooks are not active. */ if (!PFIL_HOOKED(&inet6_pfil_hook)) goto passin; if (pfil_run_hooks(&inet6_pfil_hook, &m, m->m_pkthdr.rcvif, PFIL_IN, NULL)) return; if (m == NULL) /* consumed by filter */ return; ip6 = mtod(m, struct ip6_hdr *); srcrt = !IN6_ARE_ADDR_EQUAL(&odst, &ip6->ip6_dst); passin: /* * Disambiguate address scope zones (if there is ambiguity). * We first make sure that the original source or destination address * is not in our internal form for scoped addresses. Such addresses * are not necessarily invalid spec-wise, but we cannot accept them due * to the usage conflict. * in6_setscope() then also checks and rejects the cases where src or * dst are the loopback address and the receiving interface * is not loopback. */ if (in6_clearscope(&ip6->ip6_src) || in6_clearscope(&ip6->ip6_dst)) { V_ip6stat.ip6s_badscope++; /* XXX */ goto bad; } if (in6_setscope(&ip6->ip6_src, m->m_pkthdr.rcvif, NULL) || in6_setscope(&ip6->ip6_dst, m->m_pkthdr.rcvif, NULL)) { V_ip6stat.ip6s_badscope++; goto bad; } /* * Multicast check */ if (IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst)) { struct in6_multi *in6m = 0; in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_mcast); /* * See if we belong to the destination multicast group on the * arrival interface. */ IN6_LOOKUP_MULTI(ip6->ip6_dst, m->m_pkthdr.rcvif, in6m); if (in6m) ours = 1; else if (!ip6_mrouter) { V_ip6stat.ip6s_notmember++; V_ip6stat.ip6s_cantforward++; in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_discard); goto bad; } deliverifp = m->m_pkthdr.rcvif; goto hbhcheck; } /* * Unicast check */ + + bzero(&dst6, sizeof(dst6)); + dst6.sin6_family = AF_INET6; + dst6.sin6_len = sizeof(struct sockaddr_in6); + dst6.sin6_addr = ip6->ip6_dst; + ifp = m->m_pkthdr.rcvif; + IF_AFDATA_LOCK(ifp); + lle = lla_lookup(LLTABLE6(ifp), 0, + (struct sockaddr *)&dst6); + IF_AFDATA_UNLOCK(ifp); + if ((lle != NULL) && (lle->la_flags & LLE_IFADDR)) { + ours = 1; + deliverifp = ifp; + LLE_RUNLOCK(lle); + goto hbhcheck; + } + LLE_RUNLOCK(lle); + if (V_ip6_forward_rt.ro_rt != NULL && (V_ip6_forward_rt.ro_rt->rt_flags & RTF_UP) != 0 && IN6_ARE_ADDR_EQUAL(&ip6->ip6_dst, &((struct sockaddr_in6 *)(&V_ip6_forward_rt.ro_dst))->sin6_addr)) V_ip6stat.ip6s_forward_cachehit++; else { struct sockaddr_in6 *dst6; if (V_ip6_forward_rt.ro_rt) { /* route is down or destination is different */ V_ip6stat.ip6s_forward_cachemiss++; RTFREE(V_ip6_forward_rt.ro_rt); V_ip6_forward_rt.ro_rt = 0; } bzero(&V_ip6_forward_rt.ro_dst, sizeof(struct sockaddr_in6)); dst6 = (struct sockaddr_in6 *)&V_ip6_forward_rt.ro_dst; dst6->sin6_len = sizeof(struct sockaddr_in6); dst6->sin6_family = AF_INET6; dst6->sin6_addr = ip6->ip6_dst; rtalloc((struct route *)&V_ip6_forward_rt); } #define rt6_key(r) ((struct sockaddr_in6 *)((r)->rt_nodes->rn_key)) /* * Accept the packet if the forwarding interface to the destination * according to the routing table is the loopback interface, * unless the associated route has a gateway. 
* Note that this approach causes to accept a packet if there is a * route to the loopback interface for the destination of the packet. * But we think it's even useful in some situations, e.g. when using * a special daemon which wants to intercept the packet. * * XXX: some OSes automatically make a cloned route for the destination * of an outgoing packet. If the outgoing interface of the packet * is a loopback one, the kernel would consider the packet to be * accepted, even if we have no such address assinged on the interface. * We check the cloned flag of the route entry to reject such cases, * assuming that route entries for our own addresses are not made by * cloning (it should be true because in6_addloop explicitly installs * the host route). However, we might have to do an explicit check * while it would be less efficient. Or, should we rather install a * reject route for such a case? */ if (V_ip6_forward_rt.ro_rt && (V_ip6_forward_rt.ro_rt->rt_flags & (RTF_HOST|RTF_GATEWAY)) == RTF_HOST && #ifdef RTF_WASCLONED !(V_ip6_forward_rt.ro_rt->rt_flags & RTF_WASCLONED) && #endif #ifdef RTF_CLONED !(V_ip6_forward_rt.ro_rt->rt_flags & RTF_CLONED) && #endif #if 0 /* * The check below is redundant since the comparison of * the destination and the key of the rtentry has * already done through looking up the routing table. */ IN6_ARE_ADDR_EQUAL(&ip6->ip6_dst, &rt6_key(V_ip6_forward_rt.ro_rt)->sin6_addr) #endif V_ip6_forward_rt.ro_rt->rt_ifp->if_type == IFT_LOOP) { struct in6_ifaddr *ia6 = (struct in6_ifaddr *)V_ip6_forward_rt.ro_rt->rt_ifa; /* * record address information into m_tag. */ (void)ip6_setdstifaddr(m, ia6); /* * packets to a tentative, duplicated, or somehow invalid * address must not be accepted. */ if (!(ia6->ia6_flags & IN6_IFF_NOTREADY)) { /* this address is ready */ ours = 1; deliverifp = ia6->ia_ifp; /* correct? */ /* Count the packet in the ip address stats */ ia6->ia_ifa.if_ipackets++; ia6->ia_ifa.if_ibytes += m->m_pkthdr.len; goto hbhcheck; } else { char ip6bufs[INET6_ADDRSTRLEN]; char ip6bufd[INET6_ADDRSTRLEN]; /* address is not ready, so discard the packet. */ nd6log((LOG_INFO, "ip6_input: packet to an unready address %s->%s\n", ip6_sprintf(ip6bufs, &ip6->ip6_src), ip6_sprintf(ip6bufd, &ip6->ip6_dst))); goto bad; } } /* * FAITH (Firewall Aided Internet Translator) */ if (V_ip6_keepfaith) { if (V_ip6_forward_rt.ro_rt && V_ip6_forward_rt.ro_rt->rt_ifp && V_ip6_forward_rt.ro_rt->rt_ifp->if_type == IFT_FAITH) { /* XXX do we need more sanity checks? */ ours = 1; deliverifp = V_ip6_forward_rt.ro_rt->rt_ifp; /* faith */ goto hbhcheck; } } /* * Now there is no reason to process the packet if it's not our own * and we're not a router. */ if (!V_ip6_forwarding) { V_ip6stat.ip6s_cantforward++; in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_discard); goto bad; } hbhcheck: /* * record address information into m_tag, if we don't have one yet. * note that we are unable to record it, if the address is not listed * as our interface address (e.g. multicast addresses, addresses * within FAITH prefixes and such). */ if (deliverifp && !ip6_getdstifaddr(m)) { struct in6_ifaddr *ia6; ia6 = in6_ifawithifp(deliverifp, &ip6->ip6_dst); if (ia6) { if (!ip6_setdstifaddr(m, ia6)) { /* * XXX maybe we should drop the packet here, * as we could not provide enough information * to the upper layers. */ } } } /* * Process Hop-by-Hop options header if it's contained. * m may be modified in ip6_hopopts_input(). * If a JumboPayload option is included, plen will also be modified. 
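 *
 * (Added illustration, not in the original source: a Hop-by-Hop
 *  header carrying a Jumbo Payload option, RFC 2675, is laid out as
 *
 *	+---------+---------+---------+---------+
 *	| nxt hdr | ext len | 0xc2    | optlen 4|
 *	+---------+---------+---------+---------+
 *	|        jumbo payload length           |
 *	+---------+---------+---------+---------+
 *
 *  where ext len is 0; the fixed IPv6 header carries ip6_plen == 0 in
 *  that case, and the 32-bit option value is what ip6_hopopts_input()
 *  stores in "plen" below.)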
*/ plen = (u_int32_t)ntohs(ip6->ip6_plen); if (ip6->ip6_nxt == IPPROTO_HOPOPTS) { struct ip6_hbh *hbh; if (ip6_hopopts_input(&plen, &rtalert, &m, &off)) { #if 0 /*touches NULL pointer*/ in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_discard); #endif return; /* m have already been freed */ } /* adjust pointer */ ip6 = mtod(m, struct ip6_hdr *); /* * if the payload length field is 0 and the next header field * indicates Hop-by-Hop Options header, then a Jumbo Payload * option MUST be included. */ if (ip6->ip6_plen == 0 && plen == 0) { /* * Note that if a valid jumbo payload option is * contained, ip6_hopopts_input() must set a valid * (non-zero) payload length to the variable plen. */ V_ip6stat.ip6s_badoptions++; in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_discard); in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_hdrerr); icmp6_error(m, ICMP6_PARAM_PROB, ICMP6_PARAMPROB_HEADER, (caddr_t)&ip6->ip6_plen - (caddr_t)ip6); return; } #ifndef PULLDOWN_TEST /* ip6_hopopts_input() ensures that mbuf is contiguous */ hbh = (struct ip6_hbh *)(ip6 + 1); #else IP6_EXTHDR_GET(hbh, struct ip6_hbh *, m, sizeof(struct ip6_hdr), sizeof(struct ip6_hbh)); if (hbh == NULL) { V_ip6stat.ip6s_tooshort++; return; } #endif nxt = hbh->ip6h_nxt; /* * If we are acting as a router and the packet contains a * router alert option, see if we know the option value. * Currently, we only support the option value for MLD, in which * case we should pass the packet to the multicast routing * daemon. */ if (rtalert != ~0 && V_ip6_forwarding) { switch (rtalert) { case IP6OPT_RTALERT_MLD: ours = 1; break; default: /* * RFC2711 requires unrecognized values must be * silently ignored. */ break; } } } else nxt = ip6->ip6_nxt; /* * Check that the amount of data in the buffers * is as at least much as the IPv6 header would have us expect. * Trim mbufs if longer than we expect. * Drop packet if shorter than we expect. */ if (m->m_pkthdr.len - sizeof(struct ip6_hdr) < plen) { V_ip6stat.ip6s_tooshort++; in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_truncated); goto bad; } if (m->m_pkthdr.len > sizeof(struct ip6_hdr) + plen) { if (m->m_len == m->m_pkthdr.len) { m->m_len = sizeof(struct ip6_hdr) + plen; m->m_pkthdr.len = sizeof(struct ip6_hdr) + plen; } else m_adj(m, sizeof(struct ip6_hdr) + plen - m->m_pkthdr.len); } /* * Forward if desirable. */ if (IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst)) { /* * If we are acting as a multicast router, all * incoming multicast packets are passed to the * kernel-level multicast forwarding function. * The packet is returned (relatively) intact; if * ip6_mforward() returns a non-zero value, the packet * must be discarded, else it may be accepted below. */ if (ip6_mrouter && ip6_mforward && ip6_mforward(ip6, m->m_pkthdr.rcvif, m)) { V_ip6stat.ip6s_cantforward++; m_freem(m); return; } if (!ours) { m_freem(m); return; } } else if (!ours) { ip6_forward(m, srcrt); return; } ip6 = mtod(m, struct ip6_hdr *); /* * Malicious party may be able to use IPv4 mapped addr to confuse * tcp/udp stack and bypass security checks (act as if it was from * 127.0.0.1 by using IPv6 src ::ffff:127.0.0.1). Be cautious. * * For SIIT end node behavior, you may want to disable the check. * However, you will become vulnerable to attacks using IPv4 mapped * source. 
*/ if (IN6_IS_ADDR_V4MAPPED(&ip6->ip6_src) || IN6_IS_ADDR_V4MAPPED(&ip6->ip6_dst)) { V_ip6stat.ip6s_badscope++; in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_addrerr); goto bad; } /* * Tell launch routine the next header */ V_ip6stat.ip6s_delivered++; in6_ifstat_inc(deliverifp, ifs6_in_deliver); nest = 0; while (nxt != IPPROTO_DONE) { if (V_ip6_hdrnestlimit && (++nest > V_ip6_hdrnestlimit)) { V_ip6stat.ip6s_toomanyhdr++; goto bad; } /* * protection against faulty packet - there should be * more sanity checks in header chain processing. */ if (m->m_pkthdr.len < off) { V_ip6stat.ip6s_tooshort++; in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_truncated); goto bad; } #ifdef IPSEC /* * enforce IPsec policy checking if we are seeing last header. * note that we do not visit this with protocols with pcb layer * code - like udp/tcp/raw ip. */ if (ip6_ipsec_input(m, nxt)) goto bad; #endif /* IPSEC */ nxt = (*inet6sw[ip6_protox[nxt]].pr_input)(&m, &off, nxt); } return; bad: m_freem(m); } /* * set/grab in6_ifaddr correspond to IPv6 destination address. * XXX backward compatibility wrapper */ static struct ip6aux * ip6_setdstifaddr(struct mbuf *m, struct in6_ifaddr *ia6) { struct ip6aux *ip6a; ip6a = ip6_addaux(m); if (ip6a) ip6a->ip6a_dstia6 = ia6; return ip6a; /* NULL if failed to set */ } struct in6_ifaddr * ip6_getdstifaddr(struct mbuf *m) { struct ip6aux *ip6a; ip6a = ip6_findaux(m); if (ip6a) return ip6a->ip6a_dstia6; else return NULL; } /* * Hop-by-Hop options header processing. If a valid jumbo payload option is * included, the real payload length will be stored in plenp. * * rtalertp - XXX: should be stored more smart way */ static int ip6_hopopts_input(u_int32_t *plenp, u_int32_t *rtalertp, struct mbuf **mp, int *offp) { INIT_VNET_INET6(curvnet); struct mbuf *m = *mp; int off = *offp, hbhlen; struct ip6_hbh *hbh; u_int8_t *opt; /* validation of the length of the header */ #ifndef PULLDOWN_TEST IP6_EXTHDR_CHECK(m, off, sizeof(*hbh), -1); hbh = (struct ip6_hbh *)(mtod(m, caddr_t) + off); hbhlen = (hbh->ip6h_len + 1) << 3; IP6_EXTHDR_CHECK(m, off, hbhlen, -1); hbh = (struct ip6_hbh *)(mtod(m, caddr_t) + off); #else IP6_EXTHDR_GET(hbh, struct ip6_hbh *, m, sizeof(struct ip6_hdr), sizeof(struct ip6_hbh)); if (hbh == NULL) { V_ip6stat.ip6s_tooshort++; return -1; } hbhlen = (hbh->ip6h_len + 1) << 3; IP6_EXTHDR_GET(hbh, struct ip6_hbh *, m, sizeof(struct ip6_hdr), hbhlen); if (hbh == NULL) { V_ip6stat.ip6s_tooshort++; return -1; } #endif off += hbhlen; hbhlen -= sizeof(struct ip6_hbh); opt = (u_int8_t *)hbh + sizeof(struct ip6_hbh); if (ip6_process_hopopts(m, (u_int8_t *)hbh + sizeof(struct ip6_hbh), hbhlen, rtalertp, plenp) < 0) return (-1); *offp = off; *mp = m; return (0); } /* * Search header for all Hop-by-hop options and process each option. * This function is separate from ip6_hopopts_input() in order to * handle a case where the sending node itself process its hop-by-hop * options header. In such a case, the function is called from ip6_output(). * * The function assumes that hbh header is located right after the IPv6 header * (RFC2460 p7), opthead is pointer into data content in m, and opthead to * opthead + hbhlen is located in continuous memory region. 
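 *
 * (Illustrative call sketch under those assumptions, not part of the
 *  original file; it roughly mirrors how ip6_hopopts_input() above
 *  invokes the function:
 *
 *	struct ip6_hbh *hbh;
 *	u_int32_t rtalert = ~0, plen = 0;
 *	int hbhlen;
 *
 *	hbh = (struct ip6_hbh *)(mtod(m, caddr_t) + sizeof(struct ip6_hdr));
 *	hbhlen = ((hbh->ip6h_len + 1) << 3) - sizeof(struct ip6_hbh);
 *	if (ip6_process_hopopts(m, (u_int8_t *)(hbh + 1), hbhlen,
 *	    &rtalert, &plen) < 0)
 *		return;		-- m is no longer valid on the error path
 *
 *  with "m" assumed contiguous as described above.)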
*/ int ip6_process_hopopts(struct mbuf *m, u_int8_t *opthead, int hbhlen, u_int32_t *rtalertp, u_int32_t *plenp) { INIT_VNET_INET6(curvnet); struct ip6_hdr *ip6; int optlen = 0; u_int8_t *opt = opthead; u_int16_t rtalert_val; u_int32_t jumboplen; const int erroff = sizeof(struct ip6_hdr) + sizeof(struct ip6_hbh); for (; hbhlen > 0; hbhlen -= optlen, opt += optlen) { switch (*opt) { case IP6OPT_PAD1: optlen = 1; break; case IP6OPT_PADN: if (hbhlen < IP6OPT_MINLEN) { V_ip6stat.ip6s_toosmall++; goto bad; } optlen = *(opt + 1) + 2; break; case IP6OPT_ROUTER_ALERT: /* XXX may need check for alignment */ if (hbhlen < IP6OPT_RTALERT_LEN) { V_ip6stat.ip6s_toosmall++; goto bad; } if (*(opt + 1) != IP6OPT_RTALERT_LEN - 2) { /* XXX stat */ icmp6_error(m, ICMP6_PARAM_PROB, ICMP6_PARAMPROB_HEADER, erroff + opt + 1 - opthead); return (-1); } optlen = IP6OPT_RTALERT_LEN; bcopy((caddr_t)(opt + 2), (caddr_t)&rtalert_val, 2); *rtalertp = ntohs(rtalert_val); break; case IP6OPT_JUMBO: /* XXX may need check for alignment */ if (hbhlen < IP6OPT_JUMBO_LEN) { V_ip6stat.ip6s_toosmall++; goto bad; } if (*(opt + 1) != IP6OPT_JUMBO_LEN - 2) { /* XXX stat */ icmp6_error(m, ICMP6_PARAM_PROB, ICMP6_PARAMPROB_HEADER, erroff + opt + 1 - opthead); return (-1); } optlen = IP6OPT_JUMBO_LEN; /* * IPv6 packets that have non 0 payload length * must not contain a jumbo payload option. */ ip6 = mtod(m, struct ip6_hdr *); if (ip6->ip6_plen) { V_ip6stat.ip6s_badoptions++; icmp6_error(m, ICMP6_PARAM_PROB, ICMP6_PARAMPROB_HEADER, erroff + opt - opthead); return (-1); } /* * We may see jumbolen in unaligned location, so * we'd need to perform bcopy(). */ bcopy(opt + 2, &jumboplen, sizeof(jumboplen)); jumboplen = (u_int32_t)htonl(jumboplen); #if 1 /* * if there are multiple jumbo payload options, * *plenp will be non-zero and the packet will be * rejected. * the behavior may need some debate in ipngwg - * multiple options does not make sense, however, * there's no explicit mention in specification. */ if (*plenp != 0) { V_ip6stat.ip6s_badoptions++; icmp6_error(m, ICMP6_PARAM_PROB, ICMP6_PARAMPROB_HEADER, erroff + opt + 2 - opthead); return (-1); } #endif /* * jumbo payload length must be larger than 65535. */ if (jumboplen <= IPV6_MAXPACKET) { V_ip6stat.ip6s_badoptions++; icmp6_error(m, ICMP6_PARAM_PROB, ICMP6_PARAMPROB_HEADER, erroff + opt + 2 - opthead); return (-1); } *plenp = jumboplen; break; default: /* unknown option */ if (hbhlen < IP6OPT_MINLEN) { V_ip6stat.ip6s_toosmall++; goto bad; } optlen = ip6_unknown_opt(opt, m, erroff + opt - opthead); if (optlen == -1) return (-1); optlen += 2; break; } } return (0); bad: m_freem(m); return (-1); } /* * Unknown option processing. * The third argument `off' is the offset from the IPv6 header to the option, * which is necessary if the IPv6 header the and option header and IPv6 header * is not continuous in order to return an ICMPv6 error. 
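 *
 * (Clarifying note, added; not in the original source: `off' lets the
 *  ICMPv6 Parameter Problem message point at the offending byte even
 *  when the option lives in an extension header that is not contiguous
 *  with the IPv6 header.  The action taken below is selected by the
 *  two high-order bits of the option type, per RFC 2460 section 4.2:
 *
 *	00  IP6OPT_TYPE_SKIP       skip over the option
 *	01  IP6OPT_TYPE_DISCARD    drop the packet silently
 *	10  IP6OPT_TYPE_FORCEICMP  drop, always send Parameter Problem
 *	11  IP6OPT_TYPE_ICMP       drop, send it unless dst is multicast)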
*/ int ip6_unknown_opt(u_int8_t *optp, struct mbuf *m, int off) { INIT_VNET_INET6(curvnet); struct ip6_hdr *ip6; switch (IP6OPT_TYPE(*optp)) { case IP6OPT_TYPE_SKIP: /* ignore the option */ return ((int)*(optp + 1)); case IP6OPT_TYPE_DISCARD: /* silently discard */ m_freem(m); return (-1); case IP6OPT_TYPE_FORCEICMP: /* send ICMP even if multicasted */ V_ip6stat.ip6s_badoptions++; icmp6_error(m, ICMP6_PARAM_PROB, ICMP6_PARAMPROB_OPTION, off); return (-1); case IP6OPT_TYPE_ICMP: /* send ICMP if not multicasted */ V_ip6stat.ip6s_badoptions++; ip6 = mtod(m, struct ip6_hdr *); if (IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst) || (m->m_flags & (M_BCAST|M_MCAST))) m_freem(m); else icmp6_error(m, ICMP6_PARAM_PROB, ICMP6_PARAMPROB_OPTION, off); return (-1); } m_freem(m); /* XXX: NOTREACHED */ return (-1); } /* * Create the "control" list for this pcb. * These functions will not modify mbuf chain at all. * * With KAME mbuf chain restriction: * The routine will be called from upper layer handlers like tcp6_input(). * Thus the routine assumes that the caller (tcp6_input) have already * called IP6_EXTHDR_CHECK() and all the extension headers are located in the * very first mbuf on the mbuf chain. * * ip6_savecontrol_v4 will handle those options that are possible to be * set on a v4-mapped socket. * ip6_savecontrol will directly call ip6_savecontrol_v4 to handle those * options and handle the v6-only ones itself. */ struct mbuf ** ip6_savecontrol_v4(struct inpcb *inp, struct mbuf *m, struct mbuf **mp, int *v4only) { struct ip6_hdr *ip6 = mtod(m, struct ip6_hdr *); #ifdef SO_TIMESTAMP if ((inp->inp_socket->so_options & SO_TIMESTAMP) != 0) { struct timeval tv; microtime(&tv); *mp = sbcreatecontrol((caddr_t) &tv, sizeof(tv), SCM_TIMESTAMP, SOL_SOCKET); if (*mp) mp = &(*mp)->m_next; } #endif if ((ip6->ip6_vfc & IPV6_VERSION_MASK) != IPV6_VERSION) { if (v4only != NULL) *v4only = 1; return (mp); } #define IS2292(inp, x, y) (((inp)->inp_flags & IN6P_RFC2292) ? (x) : (y)) /* RFC 2292 sec. 5 */ if ((inp->inp_flags & IN6P_PKTINFO) != 0) { struct in6_pktinfo pi6; bcopy(&ip6->ip6_dst, &pi6.ipi6_addr, sizeof(struct in6_addr)); in6_clearscope(&pi6.ipi6_addr); /* XXX */ pi6.ipi6_ifindex = (m && m->m_pkthdr.rcvif) ? m->m_pkthdr.rcvif->if_index : 0; *mp = sbcreatecontrol((caddr_t) &pi6, sizeof(struct in6_pktinfo), IS2292(inp, IPV6_2292PKTINFO, IPV6_PKTINFO), IPPROTO_IPV6); if (*mp) mp = &(*mp)->m_next; } if ((inp->inp_flags & IN6P_HOPLIMIT) != 0) { int hlim = ip6->ip6_hlim & 0xff; *mp = sbcreatecontrol((caddr_t) &hlim, sizeof(int), IS2292(inp, IPV6_2292HOPLIMIT, IPV6_HOPLIMIT), IPPROTO_IPV6); if (*mp) mp = &(*mp)->m_next; } if (v4only != NULL) *v4only = 0; return (mp); } void ip6_savecontrol(struct inpcb *in6p, struct mbuf *m, struct mbuf **mp) { struct ip6_hdr *ip6 = mtod(m, struct ip6_hdr *); int v4only = 0; mp = ip6_savecontrol_v4(in6p, m, mp, &v4only); if (v4only) return; if ((in6p->in6p_flags & IN6P_TCLASS) != 0) { u_int32_t flowinfo; int tclass; flowinfo = (u_int32_t)ntohl(ip6->ip6_flow & IPV6_FLOWINFO_MASK); flowinfo >>= 20; tclass = flowinfo & 0xff; *mp = sbcreatecontrol((caddr_t) &tclass, sizeof(tclass), IPV6_TCLASS, IPPROTO_IPV6); if (*mp) mp = &(*mp)->m_next; } /* * IPV6_HOPOPTS socket option. Recall that we required super-user * privilege for the option (see ip6_ctloutput), but it might be too * strict, since there might be some hop-by-hop options which can be * returned to normal user. * See also RFC 2292 section 6 (or RFC 3542 section 8). 
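 *
 * (Added for illustration; not part of the original file.  The
 *  ancillary data built here is what a userland receiver sees after
 *  enabling the corresponding RFC 3542 options, roughly:
 *
 *	int on = 1;
 *	struct msghdr msg;
 *	struct cmsghdr *cm;
 *	char cbuf[CMSG_SPACE(sizeof(struct in6_pktinfo))];
 *
 *	setsockopt(s, IPPROTO_IPV6, IPV6_RECVPKTINFO, &on, sizeof(on));
 *	... set msg.msg_control = cbuf, msg.msg_controllen = sizeof(cbuf),
 *	    plus msg_name/msg_iov, then recvmsg(s, &msg, 0) ...
 *	for (cm = CMSG_FIRSTHDR(&msg); cm != NULL; cm = CMSG_NXTHDR(&msg, cm))
 *		if (cm->cmsg_level == IPPROTO_IPV6 &&
 *		    cm->cmsg_type == IPV6_PKTINFO)
 *			use (struct in6_pktinfo *)CMSG_DATA(cm);
 *
 *  the hop-by-hop case discussed above arrives the same way with
 *  cmsg_type IPV6_HOPOPTS.)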
*/ if ((in6p->in6p_flags & IN6P_HOPOPTS) != 0) { /* * Check if a hop-by-hop options header is contatined in the * received packet, and if so, store the options as ancillary * data. Note that a hop-by-hop options header must be * just after the IPv6 header, which is assured through the * IPv6 input processing. */ if (ip6->ip6_nxt == IPPROTO_HOPOPTS) { struct ip6_hbh *hbh; int hbhlen = 0; #ifdef PULLDOWN_TEST struct mbuf *ext; #endif #ifndef PULLDOWN_TEST hbh = (struct ip6_hbh *)(ip6 + 1); hbhlen = (hbh->ip6h_len + 1) << 3; #else ext = ip6_pullexthdr(m, sizeof(struct ip6_hdr), ip6->ip6_nxt); if (ext == NULL) { V_ip6stat.ip6s_tooshort++; return; } hbh = mtod(ext, struct ip6_hbh *); hbhlen = (hbh->ip6h_len + 1) << 3; if (hbhlen != ext->m_len) { m_freem(ext); V_ip6stat.ip6s_tooshort++; return; } #endif /* * XXX: We copy the whole header even if a * jumbo payload option is included, the option which * is to be removed before returning according to * RFC2292. * Note: this constraint is removed in RFC3542 */ *mp = sbcreatecontrol((caddr_t)hbh, hbhlen, IS2292(in6p, IPV6_2292HOPOPTS, IPV6_HOPOPTS), IPPROTO_IPV6); if (*mp) mp = &(*mp)->m_next; #ifdef PULLDOWN_TEST m_freem(ext); #endif } } if ((in6p->in6p_flags & (IN6P_RTHDR | IN6P_DSTOPTS)) != 0) { int nxt = ip6->ip6_nxt, off = sizeof(struct ip6_hdr); /* * Search for destination options headers or routing * header(s) through the header chain, and stores each * header as ancillary data. * Note that the order of the headers remains in * the chain of ancillary data. */ while (1) { /* is explicit loop prevention necessary? */ struct ip6_ext *ip6e = NULL; int elen; #ifdef PULLDOWN_TEST struct mbuf *ext = NULL; #endif /* * if it is not an extension header, don't try to * pull it from the chain. */ switch (nxt) { case IPPROTO_DSTOPTS: case IPPROTO_ROUTING: case IPPROTO_HOPOPTS: case IPPROTO_AH: /* is it possible? */ break; default: goto loopend; } #ifndef PULLDOWN_TEST if (off + sizeof(*ip6e) > m->m_len) goto loopend; ip6e = (struct ip6_ext *)(mtod(m, caddr_t) + off); if (nxt == IPPROTO_AH) elen = (ip6e->ip6e_len + 2) << 2; else elen = (ip6e->ip6e_len + 1) << 3; if (off + elen > m->m_len) goto loopend; #else ext = ip6_pullexthdr(m, off, nxt); if (ext == NULL) { V_ip6stat.ip6s_tooshort++; return; } ip6e = mtod(ext, struct ip6_ext *); if (nxt == IPPROTO_AH) elen = (ip6e->ip6e_len + 2) << 2; else elen = (ip6e->ip6e_len + 1) << 3; if (elen != ext->m_len) { m_freem(ext); V_ip6stat.ip6s_tooshort++; return; } #endif switch (nxt) { case IPPROTO_DSTOPTS: if (!(in6p->in6p_flags & IN6P_DSTOPTS)) break; *mp = sbcreatecontrol((caddr_t)ip6e, elen, IS2292(in6p, IPV6_2292DSTOPTS, IPV6_DSTOPTS), IPPROTO_IPV6); if (*mp) mp = &(*mp)->m_next; break; case IPPROTO_ROUTING: if (!in6p->in6p_flags & IN6P_RTHDR) break; *mp = sbcreatecontrol((caddr_t)ip6e, elen, IS2292(in6p, IPV6_2292RTHDR, IPV6_RTHDR), IPPROTO_IPV6); if (*mp) mp = &(*mp)->m_next; break; case IPPROTO_HOPOPTS: case IPPROTO_AH: /* is it possible? */ break; default: /* * other cases have been filtered in the above. * none will visit this case. here we supply * the code just in case (nxt overwritten or * other cases). */ #ifdef PULLDOWN_TEST m_freem(ext); #endif goto loopend; } /* proceed with the next header. 
*/ off += elen; nxt = ip6e->ip6e_nxt; ip6e = NULL; #ifdef PULLDOWN_TEST m_freem(ext); ext = NULL; #endif } loopend: ; } } #undef IS2292 void ip6_notify_pmtu(struct inpcb *in6p, struct sockaddr_in6 *dst, u_int32_t *mtu) { struct socket *so; struct mbuf *m_mtu; struct ip6_mtuinfo mtuctl; so = in6p->inp_socket; if (mtu == NULL) return; #ifdef DIAGNOSTIC if (so == NULL) /* I believe this is impossible */ panic("ip6_notify_pmtu: socket is NULL"); #endif bzero(&mtuctl, sizeof(mtuctl)); /* zero-clear for safety */ mtuctl.ip6m_mtu = *mtu; mtuctl.ip6m_addr = *dst; if (sa6_recoverscope(&mtuctl.ip6m_addr)) return; if ((m_mtu = sbcreatecontrol((caddr_t)&mtuctl, sizeof(mtuctl), IPV6_PATHMTU, IPPROTO_IPV6)) == NULL) return; if (sbappendaddr(&so->so_rcv, (struct sockaddr *)dst, NULL, m_mtu) == 0) { m_freem(m_mtu); /* XXX: should count statistics */ } else sorwakeup(so); return; } #ifdef PULLDOWN_TEST /* * pull single extension header from mbuf chain. returns single mbuf that * contains the result, or NULL on error. */ static struct mbuf * ip6_pullexthdr(struct mbuf *m, size_t off, int nxt) { struct ip6_ext ip6e; size_t elen; struct mbuf *n; #ifdef DIAGNOSTIC switch (nxt) { case IPPROTO_DSTOPTS: case IPPROTO_ROUTING: case IPPROTO_HOPOPTS: case IPPROTO_AH: /* is it possible? */ break; default: printf("ip6_pullexthdr: invalid nxt=%d\n", nxt); } #endif m_copydata(m, off, sizeof(ip6e), (caddr_t)&ip6e); if (nxt == IPPROTO_AH) elen = (ip6e.ip6e_len + 2) << 2; else elen = (ip6e.ip6e_len + 1) << 3; MGET(n, M_DONTWAIT, MT_DATA); if (n && elen >= MLEN) { MCLGET(n, M_DONTWAIT); if ((n->m_flags & M_EXT) == 0) { m_free(n); n = NULL; } } if (!n) return NULL; n->m_len = 0; if (elen >= M_TRAILINGSPACE(n)) { m_free(n); return NULL; } m_copydata(m, off, elen, mtod(n, caddr_t)); n->m_len = elen; return n; } #endif /* * Get pointer to the previous header followed by the header * currently processed. * XXX: This function supposes that * M includes all headers, * the next header field and the header length field of each header * are valid, and * the sum of each header length equals to OFF. * Because of these assumptions, this function must be called very * carefully. Moreover, it will not be used in the near future when * we develop `neater' mechanism to process extension headers. */ char * ip6_get_prevhdr(struct mbuf *m, int off) { struct ip6_hdr *ip6 = mtod(m, struct ip6_hdr *); if (off == sizeof(struct ip6_hdr)) return (&ip6->ip6_nxt); else { int len, nxt; struct ip6_ext *ip6e = NULL; nxt = ip6->ip6_nxt; len = sizeof(struct ip6_hdr); while (len < off) { ip6e = (struct ip6_ext *)(mtod(m, caddr_t) + len); switch (nxt) { case IPPROTO_FRAGMENT: len += sizeof(struct ip6_frag); break; case IPPROTO_AH: len += (ip6e->ip6e_len + 2) << 2; break; default: len += (ip6e->ip6e_len + 1) << 3; break; } nxt = ip6e->ip6e_nxt; } if (ip6e) return (&ip6e->ip6e_nxt); else return NULL; } } /* * get next header offset. m will be retained. */ int ip6_nexthdr(struct mbuf *m, int off, int proto, int *nxtp) { struct ip6_hdr ip6; struct ip6_ext ip6e; struct ip6_frag fh; /* just in case */ if (m == NULL) panic("ip6_nexthdr: m == NULL"); if ((m->m_flags & M_PKTHDR) == 0 || m->m_pkthdr.len < off) return -1; switch (proto) { case IPPROTO_IPV6: if (m->m_pkthdr.len < off + sizeof(ip6)) return -1; m_copydata(m, off, sizeof(ip6), (caddr_t)&ip6); if (nxtp) *nxtp = ip6.ip6_nxt; off += sizeof(ip6); return off; case IPPROTO_FRAGMENT: /* * terminate parsing if it is not the first fragment, * it does not make sense to parse through it. 
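 * (Added note: only the first fragment, the one with fragment offset
 *  zero, carries the remaining header chain, so there is nothing
 *  meaningful to walk past once a non-zero offset is seen.)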
*/ if (m->m_pkthdr.len < off + sizeof(fh)) return -1; m_copydata(m, off, sizeof(fh), (caddr_t)&fh); /* IP6F_OFF_MASK = 0xfff8(BigEndian), 0xf8ff(LittleEndian) */ if (fh.ip6f_offlg & IP6F_OFF_MASK) return -1; if (nxtp) *nxtp = fh.ip6f_nxt; off += sizeof(struct ip6_frag); return off; case IPPROTO_AH: if (m->m_pkthdr.len < off + sizeof(ip6e)) return -1; m_copydata(m, off, sizeof(ip6e), (caddr_t)&ip6e); if (nxtp) *nxtp = ip6e.ip6e_nxt; off += (ip6e.ip6e_len + 2) << 2; return off; case IPPROTO_HOPOPTS: case IPPROTO_ROUTING: case IPPROTO_DSTOPTS: if (m->m_pkthdr.len < off + sizeof(ip6e)) return -1; m_copydata(m, off, sizeof(ip6e), (caddr_t)&ip6e); if (nxtp) *nxtp = ip6e.ip6e_nxt; off += (ip6e.ip6e_len + 1) << 3; return off; case IPPROTO_NONE: case IPPROTO_ESP: case IPPROTO_IPCOMP: /* give up */ return -1; default: return -1; } return -1; } /* * get offset for the last header in the chain. m will be kept untainted. */ int ip6_lasthdr(struct mbuf *m, int off, int proto, int *nxtp) { int newoff; int nxt; if (!nxtp) { nxt = -1; nxtp = &nxt; } while (1) { newoff = ip6_nexthdr(m, off, proto, nxtp); if (newoff < 0) return off; else if (newoff < off) return -1; /* invalid */ else if (newoff == off) return newoff; off = newoff; proto = *nxtp; } } struct ip6aux * ip6_addaux(struct mbuf *m) { struct m_tag *mtag; mtag = m_tag_find(m, PACKET_TAG_IPV6_INPUT, NULL); if (!mtag) { mtag = m_tag_get(PACKET_TAG_IPV6_INPUT, sizeof(struct ip6aux), M_NOWAIT); if (mtag) { m_tag_prepend(m, mtag); bzero(mtag + 1, sizeof(struct ip6aux)); } } return mtag ? (struct ip6aux *)(mtag + 1) : NULL; } struct ip6aux * ip6_findaux(struct mbuf *m) { struct m_tag *mtag; mtag = m_tag_find(m, PACKET_TAG_IPV6_INPUT, NULL); return mtag ? (struct ip6aux *)(mtag + 1) : NULL; } void ip6_delaux(struct mbuf *m) { struct m_tag *mtag; mtag = m_tag_find(m, PACKET_TAG_IPV6_INPUT, NULL); if (mtag) m_tag_delete(m, mtag); } /* * System control for IP6 */ u_char inet6ctlerrmap[PRC_NCMDS] = { 0, 0, 0, 0, 0, EMSGSIZE, EHOSTDOWN, EHOSTUNREACH, EHOSTUNREACH, EHOSTUNREACH, ECONNREFUSED, ECONNREFUSED, EMSGSIZE, EHOSTUNREACH, 0, 0, 0, 0, 0, 0, ENOPROTOOPT }; Index: head/sys/netinet6/ip6_output.c =================================================================== --- head/sys/netinet6/ip6_output.c (revision 186118) +++ head/sys/netinet6/ip6_output.c (revision 186119) @@ -1,3348 +1,3348 @@ /*- * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the project nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $KAME: ip6_output.c,v 1.279 2002/01/26 06:12:30 jinmei Exp $ */ /*- * Copyright (c) 1982, 1986, 1988, 1990, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* * @(#)ip_output.c 8.3 (Berkeley) 1/21/94 */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include "opt_ipsec.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef IPSEC #include #include #include #include #endif /* IPSEC */ #include #include #include static MALLOC_DEFINE(M_IP6MOPTS, "ip6_moptions", "internet multicast options"); struct ip6_exthdrs { struct mbuf *ip6e_ip6; struct mbuf *ip6e_hbh; struct mbuf *ip6e_dest1; struct mbuf *ip6e_rthdr; struct mbuf *ip6e_dest2; }; static int ip6_pcbopt __P((int, u_char *, int, struct ip6_pktopts **, struct ucred *, int)); static int ip6_pcbopts __P((struct ip6_pktopts **, struct mbuf *, struct socket *, struct sockopt *)); static int ip6_getpcbopt(struct ip6_pktopts *, int, struct sockopt *); static int ip6_setpktopt __P((int, u_char *, int, struct ip6_pktopts *, struct ucred *, int, int, int)); static int ip6_setmoptions(int, struct ip6_moptions **, struct mbuf *); static int ip6_getmoptions(int, struct ip6_moptions *, struct mbuf **); static int ip6_copyexthdr(struct mbuf **, caddr_t, int); static int ip6_insertfraghdr __P((struct mbuf *, struct mbuf *, int, struct ip6_frag **)); static int ip6_insert_jumboopt(struct ip6_exthdrs *, u_int32_t); static int ip6_splithdr(struct mbuf *, struct ip6_exthdrs *); static int ip6_getpmtu __P((struct route_in6 *, struct route_in6 *, struct ifnet *, struct in6_addr *, u_long *, int *)); static int copypktopts(struct ip6_pktopts *, struct ip6_pktopts *, int); /* * Make an extension header from option data. hp is the source, and * mp is the destination. */ #define MAKE_EXTHDR(hp, mp) \ do { \ if (hp) { \ struct ip6_ext *eh = (struct ip6_ext *)(hp); \ error = ip6_copyexthdr((mp), (caddr_t)(hp), \ ((eh)->ip6e_len + 1) << 3); \ if (error) \ goto freehdrs; \ } \ } while (/*CONSTCOND*/ 0) /* * Form a chain of extension headers. * m is the extension header mbuf * mp is the previous mbuf in the chain * p is the next header * i is the type of option. */ #define MAKE_CHAIN(m, mp, p, i)\ do {\ if (m) {\ if (!hdrsplit) \ panic("assumption failed: hdr not split"); \ *mtod((m), u_char *) = *(p);\ *(p) = (i);\ p = mtod((m), u_char *);\ (m)->m_next = (mp)->m_next;\ (mp)->m_next = (m);\ (mp) = (m);\ }\ } while (/*CONSTCOND*/ 0) /* * IP6 output. The packet in mbuf chain m contains a skeletal IP6 * header (with pri, len, nxt, hlim, src, dst). * This function may modify ver and hlim only. * The mbuf chain containing the packet will be freed. * The mbuf opt, if present, will not be freed. * * type of "mtu": rt_rmx.rmx_mtu is u_long, ifnet.ifr_mtu is int, and * nd_ifinfo.linkmtu is u_int32_t. so we use u_long to hold largest one, * which is rt_rmx.rmx_mtu. 
* * ifpp - XXX: just for statistics */ int ip6_output(struct mbuf *m0, struct ip6_pktopts *opt, struct route_in6 *ro, int flags, struct ip6_moptions *im6o, struct ifnet **ifpp, struct inpcb *inp) { INIT_VNET_NET(curvnet); INIT_VNET_INET6(curvnet); struct ip6_hdr *ip6, *mhip6; struct ifnet *ifp, *origifp; struct mbuf *m = m0; struct mbuf *mprev = NULL; int hlen, tlen, len, off; struct route_in6 ip6route; struct rtentry *rt = NULL; struct sockaddr_in6 *dst, src_sa, dst_sa; struct in6_addr odst; int error = 0; struct in6_ifaddr *ia = NULL; u_long mtu; int alwaysfrag, dontfrag; u_int32_t optlen = 0, plen = 0, unfragpartlen = 0; struct ip6_exthdrs exthdrs; struct in6_addr finaldst, src0, dst0; u_int32_t zone; struct route_in6 *ro_pmtu = NULL; int hdrsplit = 0; int needipsec = 0; #ifdef IPSEC struct ipsec_output_state state; struct ip6_rthdr *rh = NULL; int needipsectun = 0; int segleft_org = 0; struct secpolicy *sp = NULL; #endif /* IPSEC */ ip6 = mtod(m, struct ip6_hdr *); if (ip6 == NULL) { printf ("ip6 is NULL"); goto bad; } finaldst = ip6->ip6_dst; bzero(&exthdrs, sizeof(exthdrs)); if (opt) { /* Hop-by-Hop options header */ MAKE_EXTHDR(opt->ip6po_hbh, &exthdrs.ip6e_hbh); /* Destination options header(1st part) */ if (opt->ip6po_rthdr) { /* * Destination options header(1st part) * This only makes sense with a routing header. * See Section 9.2 of RFC 3542. * Disabling this part just for MIP6 convenience is * a bad idea. We need to think carefully about a * way to make the advanced API coexist with MIP6 * options, which might automatically be inserted in * the kernel. */ MAKE_EXTHDR(opt->ip6po_dest1, &exthdrs.ip6e_dest1); } /* Routing header */ MAKE_EXTHDR(opt->ip6po_rthdr, &exthdrs.ip6e_rthdr); /* Destination options header(2nd part) */ MAKE_EXTHDR(opt->ip6po_dest2, &exthdrs.ip6e_dest2); } /* * IPSec checking which handles several cases. * FAST IPSEC: We re-injected the packet. */ #ifdef IPSEC switch(ip6_ipsec_output(&m, inp, &flags, &error, &ifp, &sp)) { case 1: /* Bad packet */ goto freehdrs; case -1: /* Do IPSec */ needipsec = 1; case 0: /* No IPSec */ default: break; } #endif /* IPSEC */ /* * Calculate the total length of the extension header chain. * Keep the length of the unfragmentable part for fragmentation. */ optlen = 0; if (exthdrs.ip6e_hbh) optlen += exthdrs.ip6e_hbh->m_len; if (exthdrs.ip6e_dest1) optlen += exthdrs.ip6e_dest1->m_len; if (exthdrs.ip6e_rthdr) optlen += exthdrs.ip6e_rthdr->m_len; unfragpartlen = optlen + sizeof(struct ip6_hdr); /* NOTE: we don't add AH/ESP length here. do that later. */ if (exthdrs.ip6e_dest2) optlen += exthdrs.ip6e_dest2->m_len; /* * If we need IPsec, or there is at least one extension header, * separate IP6 header from the payload. */ if ((needipsec || optlen) && !hdrsplit) { if ((error = ip6_splithdr(m, &exthdrs)) != 0) { m = NULL; goto freehdrs; } m = exthdrs.ip6e_ip6; hdrsplit++; } /* adjust pointer */ ip6 = mtod(m, struct ip6_hdr *); /* adjust mbuf packet header length */ m->m_pkthdr.len += optlen; plen = m->m_pkthdr.len - sizeof(*ip6); /* If this is a jumbo payload, insert a jumbo payload option. */ if (plen > IPV6_MAXPACKET) { if (!hdrsplit) { if ((error = ip6_splithdr(m, &exthdrs)) != 0) { m = NULL; goto freehdrs; } m = exthdrs.ip6e_ip6; hdrsplit++; } /* adjust pointer */ ip6 = mtod(m, struct ip6_hdr *); if ((error = ip6_insert_jumboopt(&exthdrs, plen)) != 0) goto freehdrs; ip6->ip6_plen = 0; } else ip6->ip6_plen = htons(plen); /* * Concatenate headers and fill in next header fields. 
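 * The chain is built in the order recommended by RFC 2460, section 4.1:
 * hop-by-hop options first, then destination options (1), the routing
 * header, and destination options (2) immediately in front of the
 * upper-layer payload; a fragment header, if needed, is inserted later.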
* Here we have, on "m" * IPv6 payload * and we insert headers accordingly. Finally, we should be getting: * IPv6 hbh dest1 rthdr ah* [esp* dest2 payload] * * during the header composing process, "m" points to IPv6 header. * "mprev" points to an extension header prior to esp. */ u_char *nexthdrp = &ip6->ip6_nxt; mprev = m; /* * we treat dest2 specially. this makes IPsec processing * much easier. the goal here is to make mprev point the * mbuf prior to dest2. * * result: IPv6 dest2 payload * m and mprev will point to IPv6 header. */ if (exthdrs.ip6e_dest2) { if (!hdrsplit) panic("assumption failed: hdr not split"); exthdrs.ip6e_dest2->m_next = m->m_next; m->m_next = exthdrs.ip6e_dest2; *mtod(exthdrs.ip6e_dest2, u_char *) = ip6->ip6_nxt; ip6->ip6_nxt = IPPROTO_DSTOPTS; } /* * result: IPv6 hbh dest1 rthdr dest2 payload * m will point to IPv6 header. mprev will point to the * extension header prior to dest2 (rthdr in the above case). */ MAKE_CHAIN(exthdrs.ip6e_hbh, mprev, nexthdrp, IPPROTO_HOPOPTS); MAKE_CHAIN(exthdrs.ip6e_dest1, mprev, nexthdrp, IPPROTO_DSTOPTS); MAKE_CHAIN(exthdrs.ip6e_rthdr, mprev, nexthdrp, IPPROTO_ROUTING); #ifdef IPSEC if (!needipsec) goto skip_ipsec2; /* * pointers after IPsec headers are not valid any more. * other pointers need a great care too. * (IPsec routines should not mangle mbufs prior to AH/ESP) */ exthdrs.ip6e_dest2 = NULL; if (exthdrs.ip6e_rthdr) { rh = mtod(exthdrs.ip6e_rthdr, struct ip6_rthdr *); segleft_org = rh->ip6r_segleft; rh->ip6r_segleft = 0; } bzero(&state, sizeof(state)); state.m = m; error = ipsec6_output_trans(&state, nexthdrp, mprev, sp, flags, &needipsectun); m = state.m; if (error == EJUSTRETURN) { /* * We had a SP with a level of 'use' and no SA. We * will just continue to process the packet without * IPsec processing. */ ; } else if (error) { /* mbuf is already reclaimed in ipsec6_output_trans. */ m = NULL; switch (error) { case EHOSTUNREACH: case ENETUNREACH: case EMSGSIZE: case ENOBUFS: case ENOMEM: break; default: printf("[%s:%d] (ipsec): error code %d\n", __func__, __LINE__, error); /* FALLTHROUGH */ case ENOENT: /* don't show these error codes to the user */ error = 0; break; } goto bad; } else if (!needipsectun) { /* * In the FAST IPSec case we have already * re-injected the packet and it has been freed * by the ipsec_done() function. So, just clean * up after ourselves. */ m = NULL; goto done; } if (exthdrs.ip6e_rthdr) { /* ah6_output doesn't modify mbuf chain */ rh->ip6r_segleft = segleft_org; } skip_ipsec2:; #endif /* IPSEC */ /* * If there is a routing header, replace the destination address field * with the first hop of the routing header. */ if (exthdrs.ip6e_rthdr) { struct ip6_rthdr *rh = (struct ip6_rthdr *)(mtod(exthdrs.ip6e_rthdr, struct ip6_rthdr *)); struct ip6_rthdr0 *rh0; struct in6_addr *addr; struct sockaddr_in6 sa; switch (rh->ip6r_type) { case IPV6_RTHDR_TYPE_0: rh0 = (struct ip6_rthdr0 *)rh; addr = (struct in6_addr *)(rh0 + 1); /* * construct a sockaddr_in6 form of * the first hop. * * XXX: we may not have enough * information about its scope zone; * there is no standard API to pass * the information from the * application. 
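 *
 * The code below copies the first hop of the type 0 routing header into
 * ip6_dst, shifts the remaining hops down by one slot, and stores the
 * original final destination in the last slot, which is the on-the-wire
 * layout expected from the originating node.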
*/ bzero(&sa, sizeof(sa)); sa.sin6_family = AF_INET6; sa.sin6_len = sizeof(sa); sa.sin6_addr = addr[0]; if ((error = sa6_embedscope(&sa, V_ip6_use_defzone)) != 0) { goto bad; } ip6->ip6_dst = sa.sin6_addr; bcopy(&addr[1], &addr[0], sizeof(struct in6_addr) * (rh0->ip6r0_segleft - 1)); addr[rh0->ip6r0_segleft - 1] = finaldst; /* XXX */ in6_clearscope(addr + rh0->ip6r0_segleft - 1); break; default: /* is it possible? */ error = EINVAL; goto bad; } } /* Source address validation */ if (IN6_IS_ADDR_UNSPECIFIED(&ip6->ip6_src) && (flags & IPV6_UNSPECSRC) == 0) { error = EOPNOTSUPP; V_ip6stat.ip6s_badscope++; goto bad; } if (IN6_IS_ADDR_MULTICAST(&ip6->ip6_src)) { error = EOPNOTSUPP; V_ip6stat.ip6s_badscope++; goto bad; } V_ip6stat.ip6s_localout++; /* * Route packet. */ if (ro == 0) { ro = &ip6route; bzero((caddr_t)ro, sizeof(*ro)); } ro_pmtu = ro; if (opt && opt->ip6po_rthdr) ro = &opt->ip6po_route; dst = (struct sockaddr_in6 *)&ro->ro_dst; again: /* * if specified, try to fill in the traffic class field. * do not override if a non-zero value is already set. * we check the diffserv field and the ecn field separately. */ if (opt && opt->ip6po_tclass >= 0) { int mask = 0; if ((ip6->ip6_flow & htonl(0xfc << 20)) == 0) mask |= 0xfc; if ((ip6->ip6_flow & htonl(0x03 << 20)) == 0) mask |= 0x03; if (mask != 0) ip6->ip6_flow |= htonl((opt->ip6po_tclass & mask) << 20); } /* fill in or override the hop limit field, if necessary. */ if (opt && opt->ip6po_hlim != -1) ip6->ip6_hlim = opt->ip6po_hlim & 0xff; else if (IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst)) { if (im6o != NULL) ip6->ip6_hlim = im6o->im6o_multicast_hlim; else ip6->ip6_hlim = V_ip6_defmcasthlim; } #ifdef IPSEC /* * We may re-inject packets into the stack here. */ if (needipsec && needipsectun) { struct ipsec_output_state state; /* * All the extension headers will become inaccessible * (since they can be encrypted). * Don't panic, we need no more updates to extension headers * on inner IPv6 packet (since they are now encapsulated). * * IPv6 [ESP|AH] IPv6 [extension headers] payload */ bzero(&exthdrs, sizeof(exthdrs)); exthdrs.ip6e_ip6 = m; bzero(&state, sizeof(state)); state.m = m; state.ro = (struct route *)ro; state.dst = (struct sockaddr *)dst; error = ipsec6_output_tunnel(&state, sp, flags); m = state.m; ro = (struct route_in6 *)state.ro; dst = (struct sockaddr_in6 *)state.dst; if (error == EJUSTRETURN) { /* * We had a SP with a level of 'use' and no SA. We * will just continue to process the packet without * IPsec processing. */ ; } else if (error) { /* mbuf is already reclaimed in ipsec6_output_tunnel. */ m0 = m = NULL; m = NULL; switch (error) { case EHOSTUNREACH: case ENETUNREACH: case EMSGSIZE: case ENOBUFS: case ENOMEM: break; default: printf("[%s:%d] (ipsec): error code %d\n", __func__, __LINE__, error); /* FALLTHROUGH */ case ENOENT: /* don't show these error codes to the user */ error = 0; break; } goto bad; } else { /* * In the FAST IPSec case we have already * re-injected the packet and it has been freed * by the ipsec_done() function. So, just clean * up after ourselves. 
*/ m = NULL; goto done; } exthdrs.ip6e_ip6 = m; } #endif /* IPSEC */ /* adjust pointer */ ip6 = mtod(m, struct ip6_hdr *); bzero(&dst_sa, sizeof(dst_sa)); dst_sa.sin6_family = AF_INET6; dst_sa.sin6_len = sizeof(dst_sa); dst_sa.sin6_addr = ip6->ip6_dst; if ((error = in6_selectroute(&dst_sa, opt, im6o, ro, - &ifp, &rt, 0)) != 0) { + &ifp, &rt)) != 0) { switch (error) { case EHOSTUNREACH: V_ip6stat.ip6s_noroute++; break; case EADDRNOTAVAIL: default: break; /* XXX statistics? */ } if (ifp != NULL) in6_ifstat_inc(ifp, ifs6_out_discard); goto bad; } if (rt == NULL) { /* * If in6_selectroute() does not return a route entry, * dst may not have been updated. */ *dst = dst_sa; /* XXX */ } /* * then rt (for unicast) and ifp must be non-NULL valid values. */ if ((flags & IPV6_FORWARDING) == 0) { /* XXX: the FORWARDING flag can be set for mrouting. */ in6_ifstat_inc(ifp, ifs6_out_request); } if (rt != NULL) { ia = (struct in6_ifaddr *)(rt->rt_ifa); rt->rt_use++; } /* * The outgoing interface must be in the zone of source and * destination addresses. We should use ia_ifp to support the * case of sending packets to an address of our own. */ if (ia != NULL && ia->ia_ifp) origifp = ia->ia_ifp; else origifp = ifp; src0 = ip6->ip6_src; if (in6_setscope(&src0, origifp, &zone)) goto badscope; bzero(&src_sa, sizeof(src_sa)); src_sa.sin6_family = AF_INET6; src_sa.sin6_len = sizeof(src_sa); src_sa.sin6_addr = ip6->ip6_src; if (sa6_recoverscope(&src_sa) || zone != src_sa.sin6_scope_id) goto badscope; dst0 = ip6->ip6_dst; if (in6_setscope(&dst0, origifp, &zone)) goto badscope; /* re-initialize to be sure */ bzero(&dst_sa, sizeof(dst_sa)); dst_sa.sin6_family = AF_INET6; dst_sa.sin6_len = sizeof(dst_sa); dst_sa.sin6_addr = ip6->ip6_dst; if (sa6_recoverscope(&dst_sa) || zone != dst_sa.sin6_scope_id) { goto badscope; } /* scope check is done. */ goto routefound; badscope: V_ip6stat.ip6s_badscope++; in6_ifstat_inc(origifp, ifs6_out_discard); if (error == 0) error = EHOSTUNREACH; /* XXX */ goto bad; routefound: if (rt && !IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst)) { if (opt && opt->ip6po_nextroute.ro_rt) { /* * The nexthop is explicitly specified by the * application. We assume the next hop is an IPv6 * address. */ dst = (struct sockaddr_in6 *)opt->ip6po_nexthop; } else if ((rt->rt_flags & RTF_GATEWAY)) dst = (struct sockaddr_in6 *)rt->rt_gateway; } if (!IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst)) { m->m_flags &= ~(M_BCAST | M_MCAST); /* just in case */ } else { struct in6_multi *in6m; m->m_flags = (m->m_flags & ~M_BCAST) | M_MCAST; in6_ifstat_inc(ifp, ifs6_out_mcast); /* * Confirm that the outgoing interface supports multicast. */ if (!(ifp->if_flags & IFF_MULTICAST)) { V_ip6stat.ip6s_noroute++; in6_ifstat_inc(ifp, ifs6_out_discard); error = ENETUNREACH; goto bad; } IN6_LOOKUP_MULTI(ip6->ip6_dst, ifp, in6m); if (in6m != NULL && (im6o == NULL || im6o->im6o_multicast_loop)) { /* * If we belong to the destination multicast group * on the outgoing interface, and the caller did not * forbid loopback, loop back a copy. */ ip6_mloopback(ifp, m, dst); } else { /* * If we are acting as a multicast router, perform * multicast forwarding as if the packet had just * arrived on the interface to which we are about * to send. The multicast forwarding function * recursively calls this function, using the * IPV6_FORWARDING flag to prevent infinite recursion. * * Multicasts that are looped back by ip6_mloopback(), * above, will be forwarded by the ip6_input() routine, * if necessary. 
*/ if (ip6_mrouter && (flags & IPV6_FORWARDING) == 0) { /* * XXX: ip6_mforward expects that rcvif is NULL * when it is called from the originating path. * However, it is not always the case, since * some versions of MGETHDR() does not * initialize the field. */ m->m_pkthdr.rcvif = NULL; if (ip6_mforward(ip6, ifp, m) != 0) { m_freem(m); goto done; } } } /* * Multicasts with a hoplimit of zero may be looped back, * above, but must not be transmitted on a network. * Also, multicasts addressed to the loopback interface * are not sent -- the above call to ip6_mloopback() will * loop back a copy if this host actually belongs to the * destination group on the loopback interface. */ if (ip6->ip6_hlim == 0 || (ifp->if_flags & IFF_LOOPBACK) || IN6_IS_ADDR_MC_INTFACELOCAL(&ip6->ip6_dst)) { m_freem(m); goto done; } } /* * Fill the outgoing inteface to tell the upper layer * to increment per-interface statistics. */ if (ifpp) *ifpp = ifp; /* Determine path MTU. */ if ((error = ip6_getpmtu(ro_pmtu, ro, ifp, &finaldst, &mtu, &alwaysfrag)) != 0) goto bad; /* * The caller of this function may specify to use the minimum MTU * in some cases. * An advanced API option (IPV6_USE_MIN_MTU) can also override MTU * setting. The logic is a bit complicated; by default, unicast * packets will follow path MTU while multicast packets will be sent at * the minimum MTU. If IP6PO_MINMTU_ALL is specified, all packets * including unicast ones will be sent at the minimum MTU. Multicast * packets will always be sent at the minimum MTU unless * IP6PO_MINMTU_DISABLE is explicitly specified. * See RFC 3542 for more details. */ if (mtu > IPV6_MMTU) { if ((flags & IPV6_MINMTU)) mtu = IPV6_MMTU; else if (opt && opt->ip6po_minmtu == IP6PO_MINMTU_ALL) mtu = IPV6_MMTU; else if (IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst) && (opt == NULL || opt->ip6po_minmtu != IP6PO_MINMTU_DISABLE)) { mtu = IPV6_MMTU; } } /* * clear embedded scope identifiers if necessary. * in6_clearscope will touch the addresses only when necessary. */ in6_clearscope(&ip6->ip6_src); in6_clearscope(&ip6->ip6_dst); /* * If the outgoing packet contains a hop-by-hop options header, * it must be examined and processed even by the source node. * (RFC 2460, section 4.) */ if (exthdrs.ip6e_hbh) { struct ip6_hbh *hbh = mtod(exthdrs.ip6e_hbh, struct ip6_hbh *); u_int32_t dummy; /* XXX unused */ u_int32_t plen = 0; /* XXX: ip6_process will check the value */ #ifdef DIAGNOSTIC if ((hbh->ip6h_len + 1) << 3 > exthdrs.ip6e_hbh->m_len) panic("ip6e_hbh is not continuous"); #endif /* * XXX: if we have to send an ICMPv6 error to the sender, * we need the M_LOOP flag since icmp6_error() expects * the IPv6 and the hop-by-hop options header are * continuous unless the flag is set. */ m->m_flags |= M_LOOP; m->m_pkthdr.rcvif = ifp; if (ip6_process_hopopts(m, (u_int8_t *)(hbh + 1), ((hbh->ip6h_len + 1) << 3) - sizeof(struct ip6_hbh), &dummy, &plen) < 0) { /* m was already freed at this point */ error = EINVAL;/* better error? */ goto done; } m->m_flags &= ~M_LOOP; /* XXX */ m->m_pkthdr.rcvif = NULL; } /* Jump over all PFIL processing if hooks are not active. */ if (!PFIL_HOOKED(&inet6_pfil_hook)) goto passout; odst = ip6->ip6_dst; /* Run through list of hooks for output packets. */ error = pfil_run_hooks(&inet6_pfil_hook, &m, ifp, PFIL_OUT, inp); if (error != 0 || m == NULL) goto done; ip6 = mtod(m, struct ip6_hdr *); /* See if destination IP address was changed by packet filter. 
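 * If a filter rewrote the destination to one of our own addresses, the
 * packet is handed back to ip6_input() through the netisr queue below;
 * any other rewrite jumps back to "again" so that the route is looked up
 * anew for the new destination.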
*/ if (!IN6_ARE_ADDR_EQUAL(&odst, &ip6->ip6_dst)) { m->m_flags |= M_SKIP_FIREWALL; /* If destination is now ourself drop to ip6_input(). */ if (in6_localaddr(&ip6->ip6_dst)) { if (m->m_pkthdr.rcvif == NULL) m->m_pkthdr.rcvif = V_loif; if (m->m_pkthdr.csum_flags & CSUM_DELAY_DATA) { m->m_pkthdr.csum_flags |= CSUM_DATA_VALID | CSUM_PSEUDO_HDR; m->m_pkthdr.csum_data = 0xffff; } m->m_pkthdr.csum_flags |= CSUM_IP_CHECKED | CSUM_IP_VALID; error = netisr_queue(NETISR_IPV6, m); goto done; } else goto again; /* Redo the routing table lookup. */ } /* XXX: IPFIREWALL_FORWARD */ passout: /* * Send the packet to the outgoing interface. * If necessary, do IPv6 fragmentation before sending. * * the logic here is rather complex: * 1: normal case (dontfrag == 0, alwaysfrag == 0) * 1-a: send as is if tlen <= path mtu * 1-b: fragment if tlen > path mtu * * 2: if user asks us not to fragment (dontfrag == 1) * 2-a: send as is if tlen <= interface mtu * 2-b: error if tlen > interface mtu * * 3: if we always need to attach fragment header (alwaysfrag == 1) * always fragment * * 4: if dontfrag == 1 && alwaysfrag == 1 * error, as we cannot handle this conflicting request */ tlen = m->m_pkthdr.len; if (opt && (opt->ip6po_flags & IP6PO_DONTFRAG)) dontfrag = 1; else dontfrag = 0; if (dontfrag && alwaysfrag) { /* case 4 */ /* conflicting request - can't transmit */ error = EMSGSIZE; goto bad; } if (dontfrag && tlen > IN6_LINKMTU(ifp)) { /* case 2-b */ /* * Even if the DONTFRAG option is specified, we cannot send the * packet when the data length is larger than the MTU of the * outgoing interface. * Notify the error by sending IPV6_PATHMTU ancillary data as * well as returning an error code (the latter is not described * in the API spec.) */ u_int32_t mtu32; struct ip6ctlparam ip6cp; mtu32 = (u_int32_t)mtu; bzero(&ip6cp, sizeof(ip6cp)); ip6cp.ip6c_cmdarg = (void *)&mtu32; pfctlinput2(PRC_MSGSIZE, (struct sockaddr *)&ro_pmtu->ro_dst, (void *)&ip6cp); error = EMSGSIZE; goto bad; } /* * transmit packet without fragmentation */ if (dontfrag || (!alwaysfrag && tlen <= mtu)) { /* case 1-a and 2-a */ struct in6_ifaddr *ia6; ip6 = mtod(m, struct ip6_hdr *); ia6 = in6_ifawithifp(ifp, &ip6->ip6_src); if (ia6) { /* Record statistics for this interface address. */ ia6->ia_ifa.if_opackets++; ia6->ia_ifa.if_obytes += m->m_pkthdr.len; } error = nd6_output(ifp, origifp, m, dst, ro->ro_rt); goto done; } /* * try to fragment the packet. case 1-b and 3 */ if (mtu < IPV6_MMTU) { /* path MTU cannot be less than IPV6_MMTU */ error = EMSGSIZE; in6_ifstat_inc(ifp, ifs6_out_fragfail); goto bad; } else if (ip6->ip6_plen == 0) { /* jumbo payload cannot be fragmented */ error = EMSGSIZE; in6_ifstat_inc(ifp, ifs6_out_fragfail); goto bad; } else { struct mbuf **mnext, *m_frgpart; struct ip6_frag *ip6f; u_int32_t id = htonl(ip6_randomid()); u_char nextproto; int qslots = ifp->if_snd.ifq_maxlen - ifp->if_snd.ifq_len; /* * Too large for the destination or interface; * fragment if possible. * Must be able to put at least 8 bytes per fragment. 
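 * Each fragment carries the unfragmentable part (hlen bytes), an 8-byte
 * fragment header, and a multiple of 8 bytes of payload; e.g. with an MTU
 * of 1500 and a 40-byte unfragmentable part, (1500 - 40 - 8) & ~7 leaves
 * 1448 bytes of payload per fragment.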
*/ hlen = unfragpartlen; if (mtu > IPV6_MAXPACKET) mtu = IPV6_MAXPACKET; len = (mtu - hlen - sizeof(struct ip6_frag)) & ~7; if (len < 8) { error = EMSGSIZE; in6_ifstat_inc(ifp, ifs6_out_fragfail); goto bad; } /* * Verify that we have any chance at all of being able to queue * the packet or packet fragments */ if (qslots <= 0 || ((u_int)qslots * (mtu - hlen) < tlen /* - hlen */)) { error = ENOBUFS; V_ip6stat.ip6s_odropped++; goto bad; } mnext = &m->m_nextpkt; /* * Change the next header field of the last header in the * unfragmentable part. */ if (exthdrs.ip6e_rthdr) { nextproto = *mtod(exthdrs.ip6e_rthdr, u_char *); *mtod(exthdrs.ip6e_rthdr, u_char *) = IPPROTO_FRAGMENT; } else if (exthdrs.ip6e_dest1) { nextproto = *mtod(exthdrs.ip6e_dest1, u_char *); *mtod(exthdrs.ip6e_dest1, u_char *) = IPPROTO_FRAGMENT; } else if (exthdrs.ip6e_hbh) { nextproto = *mtod(exthdrs.ip6e_hbh, u_char *); *mtod(exthdrs.ip6e_hbh, u_char *) = IPPROTO_FRAGMENT; } else { nextproto = ip6->ip6_nxt; ip6->ip6_nxt = IPPROTO_FRAGMENT; } /* * Loop through length of segment after first fragment, * make new header and copy data of each part and link onto * chain. */ m0 = m; for (off = hlen; off < tlen; off += len) { MGETHDR(m, M_DONTWAIT, MT_HEADER); if (!m) { error = ENOBUFS; V_ip6stat.ip6s_odropped++; goto sendorfree; } m->m_pkthdr.rcvif = NULL; m->m_flags = m0->m_flags & M_COPYFLAGS; *mnext = m; mnext = &m->m_nextpkt; m->m_data += max_linkhdr; mhip6 = mtod(m, struct ip6_hdr *); *mhip6 = *ip6; m->m_len = sizeof(*mhip6); error = ip6_insertfraghdr(m0, m, hlen, &ip6f); if (error) { V_ip6stat.ip6s_odropped++; goto sendorfree; } ip6f->ip6f_offlg = htons((u_short)((off - hlen) & ~7)); if (off + len >= tlen) len = tlen - off; else ip6f->ip6f_offlg |= IP6F_MORE_FRAG; mhip6->ip6_plen = htons((u_short)(len + hlen + sizeof(*ip6f) - sizeof(struct ip6_hdr))); if ((m_frgpart = m_copy(m0, off, len)) == 0) { error = ENOBUFS; V_ip6stat.ip6s_odropped++; goto sendorfree; } m_cat(m, m_frgpart); m->m_pkthdr.len = len + hlen + sizeof(*ip6f); m->m_pkthdr.rcvif = NULL; ip6f->ip6f_reserved = 0; ip6f->ip6f_ident = id; ip6f->ip6f_nxt = nextproto; V_ip6stat.ip6s_ofragments++; in6_ifstat_inc(ifp, ifs6_out_fragcreat); } in6_ifstat_inc(ifp, ifs6_out_fragok); } /* * Remove leading garbages. */ sendorfree: m = m0->m_nextpkt; m0->m_nextpkt = 0; m_freem(m0); for (m0 = m; m; m = m0) { m0 = m->m_nextpkt; m->m_nextpkt = 0; if (error == 0) { /* Record statistics for this interface address. */ if (ia) { ia->ia_ifa.if_opackets++; ia->ia_ifa.if_obytes += m->m_pkthdr.len; } error = nd6_output(ifp, origifp, m, dst, ro->ro_rt); } else m_freem(m); } if (error == 0) V_ip6stat.ip6s_fragmented++; done: if (ro == &ip6route && ro->ro_rt) { /* brace necessary for RTFREE */ RTFREE(ro->ro_rt); } else if (ro_pmtu == &ip6route && ro_pmtu->ro_rt) { RTFREE(ro_pmtu->ro_rt); } #ifdef IPSEC if (sp != NULL) KEY_FREESP(&sp); #endif return (error); freehdrs: m_freem(exthdrs.ip6e_hbh); /* m_freem will check if mbuf is 0 */ m_freem(exthdrs.ip6e_dest1); m_freem(exthdrs.ip6e_rthdr); m_freem(exthdrs.ip6e_dest2); /* FALLTHROUGH */ bad: if (m) m_freem(m); goto done; } static int ip6_copyexthdr(struct mbuf **mp, caddr_t hdr, int hlen) { struct mbuf *m; if (hlen > MCLBYTES) return (ENOBUFS); /* XXX */ MGET(m, M_DONTWAIT, MT_DATA); if (!m) return (ENOBUFS); if (hlen > MLEN) { MCLGET(m, M_DONTWAIT); if ((m->m_flags & M_EXT) == 0) { m_free(m); return (ENOBUFS); } } m->m_len = hlen; if (hdr) bcopy(hdr, mtod(m, caddr_t), hlen); *mp = m; return (0); } /* * Insert jumbo payload option. 
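 * The jumbo payload option (RFC 2675) is a hop-by-hop option carrying the
 * real payload length when it does not fit in the 16-bit ip6_plen field,
 * in which case ip6_output() sends the packet with ip6_plen set to zero.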
*/ static int ip6_insert_jumboopt(struct ip6_exthdrs *exthdrs, u_int32_t plen) { struct mbuf *mopt; u_char *optbuf; u_int32_t v; #define JUMBOOPTLEN 8 /* length of jumbo payload option and padding */ /* * If there is no hop-by-hop options header, allocate new one. * If there is one but it doesn't have enough space to store the * jumbo payload option, allocate a cluster to store the whole options. * Otherwise, use it to store the options. */ if (exthdrs->ip6e_hbh == 0) { MGET(mopt, M_DONTWAIT, MT_DATA); if (mopt == 0) return (ENOBUFS); mopt->m_len = JUMBOOPTLEN; optbuf = mtod(mopt, u_char *); optbuf[1] = 0; /* = ((JUMBOOPTLEN) >> 3) - 1 */ exthdrs->ip6e_hbh = mopt; } else { struct ip6_hbh *hbh; mopt = exthdrs->ip6e_hbh; if (M_TRAILINGSPACE(mopt) < JUMBOOPTLEN) { /* * XXX assumption: * - exthdrs->ip6e_hbh is not referenced from places * other than exthdrs. * - exthdrs->ip6e_hbh is not an mbuf chain. */ int oldoptlen = mopt->m_len; struct mbuf *n; /* * XXX: give up if the whole (new) hbh header does * not fit even in an mbuf cluster. */ if (oldoptlen + JUMBOOPTLEN > MCLBYTES) return (ENOBUFS); /* * As a consequence, we must always prepare a cluster * at this point. */ MGET(n, M_DONTWAIT, MT_DATA); if (n) { MCLGET(n, M_DONTWAIT); if ((n->m_flags & M_EXT) == 0) { m_freem(n); n = NULL; } } if (!n) return (ENOBUFS); n->m_len = oldoptlen + JUMBOOPTLEN; bcopy(mtod(mopt, caddr_t), mtod(n, caddr_t), oldoptlen); optbuf = mtod(n, caddr_t) + oldoptlen; m_freem(mopt); mopt = exthdrs->ip6e_hbh = n; } else { optbuf = mtod(mopt, u_char *) + mopt->m_len; mopt->m_len += JUMBOOPTLEN; } optbuf[0] = IP6OPT_PADN; optbuf[1] = 1; /* * Adjust the header length according to the pad and * the jumbo payload option. */ hbh = mtod(mopt, struct ip6_hbh *); hbh->ip6h_len += (JUMBOOPTLEN >> 3); } /* fill in the option. */ optbuf[2] = IP6OPT_JUMBO; optbuf[3] = 4; v = (u_int32_t)htonl(plen + JUMBOOPTLEN); bcopy(&v, &optbuf[4], sizeof(u_int32_t)); /* finally, adjust the packet header length */ exthdrs->ip6e_ip6->m_pkthdr.len += JUMBOOPTLEN; return (0); #undef JUMBOOPTLEN } /* * Insert fragment header and copy unfragmentable header portions. */ static int ip6_insertfraghdr(struct mbuf *m0, struct mbuf *m, int hlen, struct ip6_frag **frghdrp) { struct mbuf *n, *mlast; if (hlen > sizeof(struct ip6_hdr)) { n = m_copym(m0, sizeof(struct ip6_hdr), hlen - sizeof(struct ip6_hdr), M_DONTWAIT); if (n == 0) return (ENOBUFS); m->m_next = n; } else n = m; /* Search for the last mbuf of unfragmentable part. */ for (mlast = n; mlast->m_next; mlast = mlast->m_next) ; if ((mlast->m_flags & M_EXT) == 0 && M_TRAILINGSPACE(mlast) >= sizeof(struct ip6_frag)) { /* use the trailing space of the last mbuf for the fragment hdr */ *frghdrp = (struct ip6_frag *)(mtod(mlast, caddr_t) + mlast->m_len); mlast->m_len += sizeof(struct ip6_frag); m->m_pkthdr.len += sizeof(struct ip6_frag); } else { /* allocate a new mbuf for the fragment header */ struct mbuf *mfrg; MGET(mfrg, M_DONTWAIT, MT_DATA); if (mfrg == 0) return (ENOBUFS); mfrg->m_len = sizeof(struct ip6_frag); *frghdrp = mtod(mfrg, struct ip6_frag *); mlast->m_next = mfrg; } return (0); } static int ip6_getpmtu(struct route_in6 *ro_pmtu, struct route_in6 *ro, struct ifnet *ifp, struct in6_addr *dst, u_long *mtup, int *alwaysfragp) { u_int32_t mtu = 0; int alwaysfrag = 0; int error = 0; if (ro_pmtu != ro) { /* The first hop and the final destination may differ. 
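 * When a routing header is in use, "ro" describes the route to the first
 * hop while "ro_pmtu" tracks the final destination, so the path MTU below
 * is looked up against the final destination address.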
*/ struct sockaddr_in6 *sa6_dst = (struct sockaddr_in6 *)&ro_pmtu->ro_dst; if (ro_pmtu->ro_rt && ((ro_pmtu->ro_rt->rt_flags & RTF_UP) == 0 || !IN6_ARE_ADDR_EQUAL(&sa6_dst->sin6_addr, dst))) { RTFREE(ro_pmtu->ro_rt); ro_pmtu->ro_rt = (struct rtentry *)NULL; } if (ro_pmtu->ro_rt == NULL) { bzero(sa6_dst, sizeof(*sa6_dst)); sa6_dst->sin6_family = AF_INET6; sa6_dst->sin6_len = sizeof(struct sockaddr_in6); sa6_dst->sin6_addr = *dst; rtalloc((struct route *)ro_pmtu); } } if (ro_pmtu->ro_rt) { u_int32_t ifmtu; struct in_conninfo inc; bzero(&inc, sizeof(inc)); inc.inc_flags = 1; /* IPv6 */ inc.inc6_faddr = *dst; if (ifp == NULL) ifp = ro_pmtu->ro_rt->rt_ifp; ifmtu = IN6_LINKMTU(ifp); mtu = tcp_hc_getmtu(&inc); if (mtu) mtu = min(mtu, ro_pmtu->ro_rt->rt_rmx.rmx_mtu); else mtu = ro_pmtu->ro_rt->rt_rmx.rmx_mtu; if (mtu == 0) mtu = ifmtu; else if (mtu < IPV6_MMTU) { /* * RFC2460 section 5, last paragraph: * if we record ICMPv6 too big message with * mtu < IPV6_MMTU, transmit packets sized IPV6_MMTU * or smaller, with framgent header attached. * (fragment header is needed regardless from the * packet size, for translators to identify packets) */ alwaysfrag = 1; mtu = IPV6_MMTU; } else if (mtu > ifmtu) { /* * The MTU on the route is larger than the MTU on * the interface! This shouldn't happen, unless the * MTU of the interface has been changed after the * interface was brought up. Change the MTU in the * route to match the interface MTU (as long as the * field isn't locked). */ mtu = ifmtu; ro_pmtu->ro_rt->rt_rmx.rmx_mtu = mtu; } } else if (ifp) { mtu = IN6_LINKMTU(ifp); } else error = EHOSTUNREACH; /* XXX */ *mtup = mtu; if (alwaysfragp) *alwaysfragp = alwaysfrag; return (error); } /* * IP6 socket option processing. */ int ip6_ctloutput(struct socket *so, struct sockopt *sopt) { int optdatalen, uproto; void *optdata; struct inpcb *in6p = sotoinpcb(so); int error, optval; int level, op, optname; int optlen; struct thread *td; level = sopt->sopt_level; op = sopt->sopt_dir; optname = sopt->sopt_name; optlen = sopt->sopt_valsize; td = sopt->sopt_td; error = 0; optval = 0; uproto = (int)so->so_proto->pr_protocol; if (level == IPPROTO_IPV6) { switch (op) { case SOPT_SET: switch (optname) { case IPV6_2292PKTOPTIONS: #ifdef IPV6_PKTOPTIONS case IPV6_PKTOPTIONS: #endif { struct mbuf *m; error = soopt_getm(sopt, &m); /* XXX */ if (error != 0) break; error = soopt_mcopyin(sopt, m); /* XXX */ if (error != 0) break; error = ip6_pcbopts(&in6p->in6p_outputopts, m, so, sopt); m_freem(m); /* XXX */ break; } /* * Use of some Hop-by-Hop options or some * Destination options, might require special * privilege. That is, normal applications * (without special privilege) might be forbidden * from setting certain options in outgoing packets, * and might never see certain options in received * packets. [RFC 2292 Section 6] * KAME specific note: * KAME prevents non-privileged users from sending or * receiving ANY hbh/dst options in order to avoid * overhead of parsing options in the kernel. 
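 * Hence the priv_check(PRIV_NETINET_SETHDROPTS) calls below before any of
 * the RECVHOPOPTS, RECVDSTOPTS or RECVRTHDRDSTOPTS bits may be set.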
*/ case IPV6_RECVHOPOPTS: case IPV6_RECVDSTOPTS: case IPV6_RECVRTHDRDSTOPTS: if (td != NULL) { error = priv_check(td, PRIV_NETINET_SETHDROPTS); if (error) break; } /* FALLTHROUGH */ case IPV6_UNICAST_HOPS: case IPV6_HOPLIMIT: case IPV6_FAITH: case IPV6_RECVPKTINFO: case IPV6_RECVHOPLIMIT: case IPV6_RECVRTHDR: case IPV6_RECVPATHMTU: case IPV6_RECVTCLASS: case IPV6_V6ONLY: case IPV6_AUTOFLOWLABEL: if (optlen != sizeof(int)) { error = EINVAL; break; } error = sooptcopyin(sopt, &optval, sizeof optval, sizeof optval); if (error) break; switch (optname) { case IPV6_UNICAST_HOPS: if (optval < -1 || optval >= 256) error = EINVAL; else { /* -1 = kernel default */ in6p->in6p_hops = optval; if ((in6p->in6p_vflag & INP_IPV4) != 0) in6p->inp_ip_ttl = optval; } break; #define OPTSET(bit) \ do { \ if (optval) \ in6p->in6p_flags |= (bit); \ else \ in6p->in6p_flags &= ~(bit); \ } while (/*CONSTCOND*/ 0) #define OPTSET2292(bit) \ do { \ in6p->in6p_flags |= IN6P_RFC2292; \ if (optval) \ in6p->in6p_flags |= (bit); \ else \ in6p->in6p_flags &= ~(bit); \ } while (/*CONSTCOND*/ 0) #define OPTBIT(bit) (in6p->in6p_flags & (bit) ? 1 : 0) case IPV6_RECVPKTINFO: /* cannot mix with RFC2292 */ if (OPTBIT(IN6P_RFC2292)) { error = EINVAL; break; } OPTSET(IN6P_PKTINFO); break; case IPV6_HOPLIMIT: { struct ip6_pktopts **optp; /* cannot mix with RFC2292 */ if (OPTBIT(IN6P_RFC2292)) { error = EINVAL; break; } optp = &in6p->in6p_outputopts; error = ip6_pcbopt(IPV6_HOPLIMIT, (u_char *)&optval, sizeof(optval), optp, (td != NULL) ? td->td_ucred : NULL, uproto); break; } case IPV6_RECVHOPLIMIT: /* cannot mix with RFC2292 */ if (OPTBIT(IN6P_RFC2292)) { error = EINVAL; break; } OPTSET(IN6P_HOPLIMIT); break; case IPV6_RECVHOPOPTS: /* cannot mix with RFC2292 */ if (OPTBIT(IN6P_RFC2292)) { error = EINVAL; break; } OPTSET(IN6P_HOPOPTS); break; case IPV6_RECVDSTOPTS: /* cannot mix with RFC2292 */ if (OPTBIT(IN6P_RFC2292)) { error = EINVAL; break; } OPTSET(IN6P_DSTOPTS); break; case IPV6_RECVRTHDRDSTOPTS: /* cannot mix with RFC2292 */ if (OPTBIT(IN6P_RFC2292)) { error = EINVAL; break; } OPTSET(IN6P_RTHDRDSTOPTS); break; case IPV6_RECVRTHDR: /* cannot mix with RFC2292 */ if (OPTBIT(IN6P_RFC2292)) { error = EINVAL; break; } OPTSET(IN6P_RTHDR); break; case IPV6_FAITH: OPTSET(IN6P_FAITH); break; case IPV6_RECVPATHMTU: /* * We ignore this option for TCP * sockets. * (RFC3542 leaves this case * unspecified.) */ if (uproto != IPPROTO_TCP) OPTSET(IN6P_MTU); break; case IPV6_V6ONLY: /* * make setsockopt(IPV6_V6ONLY) * available only prior to bind(2). * see ipng mailing list, Jun 22 2001. */ if (in6p->in6p_lport || !IN6_IS_ADDR_UNSPECIFIED(&in6p->in6p_laddr)) { error = EINVAL; break; } OPTSET(IN6P_IPV6_V6ONLY); if (optval) in6p->in6p_vflag &= ~INP_IPV4; else in6p->in6p_vflag |= INP_IPV4; break; case IPV6_RECVTCLASS: /* cannot mix with RFC2292 XXX */ if (OPTBIT(IN6P_RFC2292)) { error = EINVAL; break; } OPTSET(IN6P_TCLASS); break; case IPV6_AUTOFLOWLABEL: OPTSET(IN6P_AUTOFLOWLABEL); break; } break; case IPV6_TCLASS: case IPV6_DONTFRAG: case IPV6_USE_MIN_MTU: case IPV6_PREFER_TEMPADDR: if (optlen != sizeof(optval)) { error = EINVAL; break; } error = sooptcopyin(sopt, &optval, sizeof optval, sizeof optval); if (error) break; { struct ip6_pktopts **optp; optp = &in6p->in6p_outputopts; error = ip6_pcbopt(optname, (u_char *)&optval, sizeof(optval), optp, (td != NULL) ? 
td->td_ucred : NULL, uproto); break; } case IPV6_2292PKTINFO: case IPV6_2292HOPLIMIT: case IPV6_2292HOPOPTS: case IPV6_2292DSTOPTS: case IPV6_2292RTHDR: /* RFC 2292 */ if (optlen != sizeof(int)) { error = EINVAL; break; } error = sooptcopyin(sopt, &optval, sizeof optval, sizeof optval); if (error) break; switch (optname) { case IPV6_2292PKTINFO: OPTSET2292(IN6P_PKTINFO); break; case IPV6_2292HOPLIMIT: OPTSET2292(IN6P_HOPLIMIT); break; case IPV6_2292HOPOPTS: /* * Check super-user privilege. * See comments for IPV6_RECVHOPOPTS. */ if (td != NULL) { error = priv_check(td, PRIV_NETINET_SETHDROPTS); if (error) return (error); } OPTSET2292(IN6P_HOPOPTS); break; case IPV6_2292DSTOPTS: if (td != NULL) { error = priv_check(td, PRIV_NETINET_SETHDROPTS); if (error) return (error); } OPTSET2292(IN6P_DSTOPTS|IN6P_RTHDRDSTOPTS); /* XXX */ break; case IPV6_2292RTHDR: OPTSET2292(IN6P_RTHDR); break; } break; case IPV6_PKTINFO: case IPV6_HOPOPTS: case IPV6_RTHDR: case IPV6_DSTOPTS: case IPV6_RTHDRDSTOPTS: case IPV6_NEXTHOP: { /* new advanced API (RFC3542) */ u_char *optbuf; u_char optbuf_storage[MCLBYTES]; int optlen; struct ip6_pktopts **optp; /* cannot mix with RFC2292 */ if (OPTBIT(IN6P_RFC2292)) { error = EINVAL; break; } /* * We only ensure valsize is not too large * here. Further validation will be done * later. */ error = sooptcopyin(sopt, optbuf_storage, sizeof(optbuf_storage), 0); if (error) break; optlen = sopt->sopt_valsize; optbuf = optbuf_storage; optp = &in6p->in6p_outputopts; error = ip6_pcbopt(optname, optbuf, optlen, optp, (td != NULL) ? td->td_ucred : NULL, uproto); break; } #undef OPTSET case IPV6_MULTICAST_IF: case IPV6_MULTICAST_HOPS: case IPV6_MULTICAST_LOOP: case IPV6_JOIN_GROUP: case IPV6_LEAVE_GROUP: { if (sopt->sopt_valsize > MLEN) { error = EMSGSIZE; break; } /* XXX */ } /* FALLTHROUGH */ { struct mbuf *m; if (sopt->sopt_valsize > MCLBYTES) { error = EMSGSIZE; break; } /* XXX */ MGET(m, sopt->sopt_td ? M_WAIT : M_DONTWAIT, MT_DATA); if (m == 0) { error = ENOBUFS; break; } if (sopt->sopt_valsize > MLEN) { MCLGET(m, sopt->sopt_td ? M_WAIT : M_DONTWAIT); if ((m->m_flags & M_EXT) == 0) { m_free(m); error = ENOBUFS; break; } } m->m_len = sopt->sopt_valsize; error = sooptcopyin(sopt, mtod(m, char *), m->m_len, m->m_len); if (error) { (void)m_free(m); break; } error = ip6_setmoptions(sopt->sopt_name, &in6p->in6p_moptions, m); (void)m_free(m); } break; case IPV6_PORTRANGE: error = sooptcopyin(sopt, &optval, sizeof optval, sizeof optval); if (error) break; switch (optval) { case IPV6_PORTRANGE_DEFAULT: in6p->in6p_flags &= ~(IN6P_LOWPORT); in6p->in6p_flags &= ~(IN6P_HIGHPORT); break; case IPV6_PORTRANGE_HIGH: in6p->in6p_flags &= ~(IN6P_LOWPORT); in6p->in6p_flags |= IN6P_HIGHPORT; break; case IPV6_PORTRANGE_LOW: in6p->in6p_flags &= ~(IN6P_HIGHPORT); in6p->in6p_flags |= IN6P_LOWPORT; break; default: error = EINVAL; break; } break; #ifdef IPSEC case IPV6_IPSEC_POLICY: { caddr_t req; struct mbuf *m; if ((error = soopt_getm(sopt, &m)) != 0) /* XXX */ break; if ((error = soopt_mcopyin(sopt, m)) != 0) /* XXX */ break; req = mtod(m, caddr_t); error = ipsec6_set_policy(in6p, optname, req, m->m_len, (sopt->sopt_td != NULL) ? sopt->sopt_td->td_ucred : NULL); m_freem(m); break; } #endif /* IPSEC */ default: error = ENOPROTOOPT; break; } break; case SOPT_GET: switch (optname) { case IPV6_2292PKTOPTIONS: #ifdef IPV6_PKTOPTIONS case IPV6_PKTOPTIONS: #endif /* * RFC3542 (effectively) deprecated the * semantics of the 2292-style pktoptions. 
* Since it was not reliable in nature (i.e., * applications had to expect the lack of some * information after all), it would make sense * to simplify this part by always returning * empty data. */ sopt->sopt_valsize = 0; break; case IPV6_RECVHOPOPTS: case IPV6_RECVDSTOPTS: case IPV6_RECVRTHDRDSTOPTS: case IPV6_UNICAST_HOPS: case IPV6_RECVPKTINFO: case IPV6_RECVHOPLIMIT: case IPV6_RECVRTHDR: case IPV6_RECVPATHMTU: case IPV6_FAITH: case IPV6_V6ONLY: case IPV6_PORTRANGE: case IPV6_RECVTCLASS: case IPV6_AUTOFLOWLABEL: switch (optname) { case IPV6_RECVHOPOPTS: optval = OPTBIT(IN6P_HOPOPTS); break; case IPV6_RECVDSTOPTS: optval = OPTBIT(IN6P_DSTOPTS); break; case IPV6_RECVRTHDRDSTOPTS: optval = OPTBIT(IN6P_RTHDRDSTOPTS); break; case IPV6_UNICAST_HOPS: optval = in6p->in6p_hops; break; case IPV6_RECVPKTINFO: optval = OPTBIT(IN6P_PKTINFO); break; case IPV6_RECVHOPLIMIT: optval = OPTBIT(IN6P_HOPLIMIT); break; case IPV6_RECVRTHDR: optval = OPTBIT(IN6P_RTHDR); break; case IPV6_RECVPATHMTU: optval = OPTBIT(IN6P_MTU); break; case IPV6_FAITH: optval = OPTBIT(IN6P_FAITH); break; case IPV6_V6ONLY: optval = OPTBIT(IN6P_IPV6_V6ONLY); break; case IPV6_PORTRANGE: { int flags; flags = in6p->in6p_flags; if (flags & IN6P_HIGHPORT) optval = IPV6_PORTRANGE_HIGH; else if (flags & IN6P_LOWPORT) optval = IPV6_PORTRANGE_LOW; else optval = 0; break; } case IPV6_RECVTCLASS: optval = OPTBIT(IN6P_TCLASS); break; case IPV6_AUTOFLOWLABEL: optval = OPTBIT(IN6P_AUTOFLOWLABEL); break; } if (error) break; error = sooptcopyout(sopt, &optval, sizeof optval); break; case IPV6_PATHMTU: { u_long pmtu = 0; struct ip6_mtuinfo mtuinfo; struct route_in6 sro; bzero(&sro, sizeof(sro)); if (!(so->so_state & SS_ISCONNECTED)) return (ENOTCONN); /* * XXX: we do not consider the case of source * routing, or optional information to specify * the outgoing interface.
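 * The value obtained below is clamped to IPV6_MAXPACKET before it is
 * copied out to the application in a struct ip6_mtuinfo.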
*/ error = ip6_getpmtu(&sro, NULL, NULL, &in6p->in6p_faddr, &pmtu, NULL); if (sro.ro_rt) RTFREE(sro.ro_rt); if (error) break; if (pmtu > IPV6_MAXPACKET) pmtu = IPV6_MAXPACKET; bzero(&mtuinfo, sizeof(mtuinfo)); mtuinfo.ip6m_mtu = (u_int32_t)pmtu; optdata = (void *)&mtuinfo; optdatalen = sizeof(mtuinfo); error = sooptcopyout(sopt, optdata, optdatalen); break; } case IPV6_2292PKTINFO: case IPV6_2292HOPLIMIT: case IPV6_2292HOPOPTS: case IPV6_2292RTHDR: case IPV6_2292DSTOPTS: switch (optname) { case IPV6_2292PKTINFO: optval = OPTBIT(IN6P_PKTINFO); break; case IPV6_2292HOPLIMIT: optval = OPTBIT(IN6P_HOPLIMIT); break; case IPV6_2292HOPOPTS: optval = OPTBIT(IN6P_HOPOPTS); break; case IPV6_2292RTHDR: optval = OPTBIT(IN6P_RTHDR); break; case IPV6_2292DSTOPTS: optval = OPTBIT(IN6P_DSTOPTS|IN6P_RTHDRDSTOPTS); break; } error = sooptcopyout(sopt, &optval, sizeof optval); break; case IPV6_PKTINFO: case IPV6_HOPOPTS: case IPV6_RTHDR: case IPV6_DSTOPTS: case IPV6_RTHDRDSTOPTS: case IPV6_NEXTHOP: case IPV6_TCLASS: case IPV6_DONTFRAG: case IPV6_USE_MIN_MTU: case IPV6_PREFER_TEMPADDR: error = ip6_getpcbopt(in6p->in6p_outputopts, optname, sopt); break; case IPV6_MULTICAST_IF: case IPV6_MULTICAST_HOPS: case IPV6_MULTICAST_LOOP: case IPV6_JOIN_GROUP: case IPV6_LEAVE_GROUP: { struct mbuf *m; error = ip6_getmoptions(sopt->sopt_name, in6p->in6p_moptions, &m); if (error == 0) error = sooptcopyout(sopt, mtod(m, char *), m->m_len); m_freem(m); } break; #ifdef IPSEC case IPV6_IPSEC_POLICY: { caddr_t req = NULL; size_t len = 0; struct mbuf *m = NULL; struct mbuf **mp = &m; size_t ovalsize = sopt->sopt_valsize; caddr_t oval = (caddr_t)sopt->sopt_val; error = soopt_getm(sopt, &m); /* XXX */ if (error != 0) break; error = soopt_mcopyin(sopt, m); /* XXX */ if (error != 0) break; sopt->sopt_valsize = ovalsize; sopt->sopt_val = oval; if (m) { req = mtod(m, caddr_t); len = m->m_len; } error = ipsec6_get_policy(in6p, req, len, mp); if (error == 0) error = soopt_mcopyout(sopt, m); /* XXX */ if (error == 0 && m) m_freem(m); break; } #endif /* IPSEC */ default: error = ENOPROTOOPT; break; } break; } } else { /* level != IPPROTO_IPV6 */ error = EINVAL; } return (error); } int ip6_raw_ctloutput(struct socket *so, struct sockopt *sopt) { int error = 0, optval, optlen; const int icmp6off = offsetof(struct icmp6_hdr, icmp6_cksum); struct in6pcb *in6p = sotoin6pcb(so); int level, op, optname; level = sopt->sopt_level; op = sopt->sopt_dir; optname = sopt->sopt_name; optlen = sopt->sopt_valsize; if (level != IPPROTO_IPV6) { return (EINVAL); } switch (optname) { case IPV6_CHECKSUM: /* * For ICMPv6 sockets, no modification allowed for checksum * offset, permit "no change" values to help existing apps. * * RFC3542 says: "An attempt to set IPV6_CHECKSUM * for an ICMPv6 socket will fail." * The current behavior does not meet RFC3542. 
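 * Odd offsets are always rejected, and for ICMPv6 sockets any value other
 * than offsetof(struct icmp6_hdr, icmp6_cksum) (i.e. 2) also fails with
 * EINVAL, so only a "no change" request can succeed there.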
*/ switch (op) { case SOPT_SET: if (optlen != sizeof(int)) { error = EINVAL; break; } error = sooptcopyin(sopt, &optval, sizeof(optval), sizeof(optval)); if (error) break; if ((optval % 2) != 0) { /* the API assumes even offset values */ error = EINVAL; } else if (so->so_proto->pr_protocol == IPPROTO_ICMPV6) { if (optval != icmp6off) error = EINVAL; } else in6p->in6p_cksum = optval; break; case SOPT_GET: if (so->so_proto->pr_protocol == IPPROTO_ICMPV6) optval = icmp6off; else optval = in6p->in6p_cksum; error = sooptcopyout(sopt, &optval, sizeof(optval)); break; default: error = EINVAL; break; } break; default: error = ENOPROTOOPT; break; } return (error); } /* * Set up IP6 options in pcb for insertion in output packets or * specifying behavior of outgoing packets. */ static int ip6_pcbopts(struct ip6_pktopts **pktopt, struct mbuf *m, struct socket *so, struct sockopt *sopt) { struct ip6_pktopts *opt = *pktopt; int error = 0; struct thread *td = sopt->sopt_td; /* turn off any old options. */ if (opt) { #ifdef DIAGNOSTIC if (opt->ip6po_pktinfo || opt->ip6po_nexthop || opt->ip6po_hbh || opt->ip6po_dest1 || opt->ip6po_dest2 || opt->ip6po_rhinfo.ip6po_rhi_rthdr) printf("ip6_pcbopts: all specified options are cleared.\n"); #endif ip6_clearpktopts(opt, -1); } else opt = malloc(sizeof(*opt), M_IP6OPT, M_WAITOK); *pktopt = NULL; if (!m || m->m_len == 0) { /* * Only turning off any previous options, regardless of * whether the opt is just created or given. */ free(opt, M_IP6OPT); return (0); } /* set options specified by user. */ if ((error = ip6_setpktopts(m, opt, NULL, (td != NULL) ? td->td_ucred : NULL, so->so_proto->pr_protocol)) != 0) { ip6_clearpktopts(opt, -1); /* XXX: discard all options */ free(opt, M_IP6OPT); return (error); } *pktopt = opt; return (0); } /* * initialize ip6_pktopts. beware that there are non-zero default values in * the struct. */ void ip6_initpktopts(struct ip6_pktopts *opt) { bzero(opt, sizeof(*opt)); opt->ip6po_hlim = -1; /* -1 means default hop limit */ opt->ip6po_tclass = -1; /* -1 means default traffic class */ opt->ip6po_minmtu = IP6PO_MINMTU_MCASTONLY; opt->ip6po_prefer_tempaddr = IP6PO_TEMPADDR_SYSTEM; } static int ip6_pcbopt(int optname, u_char *buf, int len, struct ip6_pktopts **pktopt, struct ucred *cred, int uproto) { struct ip6_pktopts *opt; if (*pktopt == NULL) { *pktopt = malloc(sizeof(struct ip6_pktopts), M_IP6OPT, M_WAITOK); ip6_initpktopts(*pktopt); } opt = *pktopt; return (ip6_setpktopt(optname, buf, len, opt, cred, 1, 0, uproto)); } static int ip6_getpcbopt(struct ip6_pktopts *pktopt, int optname, struct sockopt *sopt) { void *optdata = NULL; int optdatalen = 0; struct ip6_ext *ip6e; int error = 0; struct in6_pktinfo null_pktinfo; int deftclass = 0, on; int defminmtu = IP6PO_MINMTU_MCASTONLY; int defpreftemp = IP6PO_TEMPADDR_SYSTEM; switch (optname) { case IPV6_PKTINFO: if (pktopt && pktopt->ip6po_pktinfo) optdata = (void *)pktopt->ip6po_pktinfo; else { /* XXX: we don't have to do this every time... 
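 * When no sticky IPV6_PKTINFO is set on the pcb, a zeroed in6_pktinfo
 * (unspecified address, interface index 0) is returned to the caller.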
*/ bzero(&null_pktinfo, sizeof(null_pktinfo)); optdata = (void *)&null_pktinfo; } optdatalen = sizeof(struct in6_pktinfo); break; case IPV6_TCLASS: if (pktopt && pktopt->ip6po_tclass >= 0) optdata = (void *)&pktopt->ip6po_tclass; else optdata = (void *)&deftclass; optdatalen = sizeof(int); break; case IPV6_HOPOPTS: if (pktopt && pktopt->ip6po_hbh) { optdata = (void *)pktopt->ip6po_hbh; ip6e = (struct ip6_ext *)pktopt->ip6po_hbh; optdatalen = (ip6e->ip6e_len + 1) << 3; } break; case IPV6_RTHDR: if (pktopt && pktopt->ip6po_rthdr) { optdata = (void *)pktopt->ip6po_rthdr; ip6e = (struct ip6_ext *)pktopt->ip6po_rthdr; optdatalen = (ip6e->ip6e_len + 1) << 3; } break; case IPV6_RTHDRDSTOPTS: if (pktopt && pktopt->ip6po_dest1) { optdata = (void *)pktopt->ip6po_dest1; ip6e = (struct ip6_ext *)pktopt->ip6po_dest1; optdatalen = (ip6e->ip6e_len + 1) << 3; } break; case IPV6_DSTOPTS: if (pktopt && pktopt->ip6po_dest2) { optdata = (void *)pktopt->ip6po_dest2; ip6e = (struct ip6_ext *)pktopt->ip6po_dest2; optdatalen = (ip6e->ip6e_len + 1) << 3; } break; case IPV6_NEXTHOP: if (pktopt && pktopt->ip6po_nexthop) { optdata = (void *)pktopt->ip6po_nexthop; optdatalen = pktopt->ip6po_nexthop->sa_len; } break; case IPV6_USE_MIN_MTU: if (pktopt) optdata = (void *)&pktopt->ip6po_minmtu; else optdata = (void *)&defminmtu; optdatalen = sizeof(int); break; case IPV6_DONTFRAG: if (pktopt && ((pktopt->ip6po_flags) & IP6PO_DONTFRAG)) on = 1; else on = 0; optdata = (void *)&on; optdatalen = sizeof(on); break; case IPV6_PREFER_TEMPADDR: if (pktopt) optdata = (void *)&pktopt->ip6po_prefer_tempaddr; else optdata = (void *)&defpreftemp; optdatalen = sizeof(int); break; default: /* should not happen */ #ifdef DIAGNOSTIC panic("ip6_getpcbopt: unexpected option\n"); #endif return (ENOPROTOOPT); } error = sooptcopyout(sopt, optdata, optdatalen); return (error); } void ip6_clearpktopts(struct ip6_pktopts *pktopt, int optname) { if (pktopt == NULL) return; if (optname == -1 || optname == IPV6_PKTINFO) { if (pktopt->ip6po_pktinfo) free(pktopt->ip6po_pktinfo, M_IP6OPT); pktopt->ip6po_pktinfo = NULL; } if (optname == -1 || optname == IPV6_HOPLIMIT) pktopt->ip6po_hlim = -1; if (optname == -1 || optname == IPV6_TCLASS) pktopt->ip6po_tclass = -1; if (optname == -1 || optname == IPV6_NEXTHOP) { if (pktopt->ip6po_nextroute.ro_rt) { RTFREE(pktopt->ip6po_nextroute.ro_rt); pktopt->ip6po_nextroute.ro_rt = NULL; } if (pktopt->ip6po_nexthop) free(pktopt->ip6po_nexthop, M_IP6OPT); pktopt->ip6po_nexthop = NULL; } if (optname == -1 || optname == IPV6_HOPOPTS) { if (pktopt->ip6po_hbh) free(pktopt->ip6po_hbh, M_IP6OPT); pktopt->ip6po_hbh = NULL; } if (optname == -1 || optname == IPV6_RTHDRDSTOPTS) { if (pktopt->ip6po_dest1) free(pktopt->ip6po_dest1, M_IP6OPT); pktopt->ip6po_dest1 = NULL; } if (optname == -1 || optname == IPV6_RTHDR) { if (pktopt->ip6po_rhinfo.ip6po_rhi_rthdr) free(pktopt->ip6po_rhinfo.ip6po_rhi_rthdr, M_IP6OPT); pktopt->ip6po_rhinfo.ip6po_rhi_rthdr = NULL; if (pktopt->ip6po_route.ro_rt) { RTFREE(pktopt->ip6po_route.ro_rt); pktopt->ip6po_route.ro_rt = NULL; } } if (optname == -1 || optname == IPV6_DSTOPTS) { if (pktopt->ip6po_dest2) free(pktopt->ip6po_dest2, M_IP6OPT); pktopt->ip6po_dest2 = NULL; } } #define PKTOPT_EXTHDRCPY(type) \ do {\ if (src->type) {\ int hlen = (((struct ip6_ext *)src->type)->ip6e_len + 1) << 3;\ dst->type = malloc(hlen, M_IP6OPT, canwait);\ if (dst->type == NULL && canwait == M_NOWAIT)\ goto bad;\ bcopy(src->type, dst->type, hlen);\ }\ } while (/*CONSTCOND*/ 0) static int copypktopts(struct ip6_pktopts *dst, 
struct ip6_pktopts *src, int canwait) { if (dst == NULL || src == NULL) { printf("ip6_clearpktopts: invalid argument\n"); return (EINVAL); } dst->ip6po_hlim = src->ip6po_hlim; dst->ip6po_tclass = src->ip6po_tclass; dst->ip6po_flags = src->ip6po_flags; if (src->ip6po_pktinfo) { dst->ip6po_pktinfo = malloc(sizeof(*dst->ip6po_pktinfo), M_IP6OPT, canwait); if (dst->ip6po_pktinfo == NULL) goto bad; *dst->ip6po_pktinfo = *src->ip6po_pktinfo; } if (src->ip6po_nexthop) { dst->ip6po_nexthop = malloc(src->ip6po_nexthop->sa_len, M_IP6OPT, canwait); if (dst->ip6po_nexthop == NULL) goto bad; bcopy(src->ip6po_nexthop, dst->ip6po_nexthop, src->ip6po_nexthop->sa_len); } PKTOPT_EXTHDRCPY(ip6po_hbh); PKTOPT_EXTHDRCPY(ip6po_dest1); PKTOPT_EXTHDRCPY(ip6po_dest2); PKTOPT_EXTHDRCPY(ip6po_rthdr); /* not copy the cached route */ return (0); bad: ip6_clearpktopts(dst, -1); return (ENOBUFS); } #undef PKTOPT_EXTHDRCPY struct ip6_pktopts * ip6_copypktopts(struct ip6_pktopts *src, int canwait) { int error; struct ip6_pktopts *dst; dst = malloc(sizeof(*dst), M_IP6OPT, canwait); if (dst == NULL) return (NULL); ip6_initpktopts(dst); if ((error = copypktopts(dst, src, canwait)) != 0) { free(dst, M_IP6OPT); return (NULL); } return (dst); } void ip6_freepcbopts(struct ip6_pktopts *pktopt) { if (pktopt == NULL) return; ip6_clearpktopts(pktopt, -1); free(pktopt, M_IP6OPT); } /* * Set the IP6 multicast options in response to user setsockopt(). */ static int ip6_setmoptions(int optname, struct ip6_moptions **im6op, struct mbuf *m) { INIT_VNET_NET(curvnet); INIT_VNET_INET6(curvnet); int error = 0; u_int loop, ifindex; struct ipv6_mreq *mreq; struct ifnet *ifp; struct ip6_moptions *im6o = *im6op; struct route_in6 ro; struct in6_multi_mship *imm; if (im6o == NULL) { /* * No multicast option buffer attached to the pcb; * allocate one and initialize to default values. */ im6o = (struct ip6_moptions *) malloc(sizeof(*im6o), M_IP6MOPTS, M_WAITOK); if (im6o == NULL) return (ENOBUFS); *im6op = im6o; im6o->im6o_multicast_ifp = NULL; im6o->im6o_multicast_hlim = V_ip6_defmcasthlim; im6o->im6o_multicast_loop = IPV6_DEFAULT_MULTICAST_LOOP; LIST_INIT(&im6o->im6o_memberships); } switch (optname) { case IPV6_MULTICAST_IF: /* * Select the interface for outgoing multicast packets. */ if (m == NULL || m->m_len != sizeof(u_int)) { error = EINVAL; break; } bcopy(mtod(m, u_int *), &ifindex, sizeof(ifindex)); if (ifindex < 0 || V_if_index < ifindex) { error = ENXIO; /* XXX EINVAL? */ break; } ifp = ifnet_byindex(ifindex); if (ifp == NULL || (ifp->if_flags & IFF_MULTICAST) == 0) { error = EADDRNOTAVAIL; break; } im6o->im6o_multicast_ifp = ifp; break; case IPV6_MULTICAST_HOPS: { /* * Set the IP6 hoplimit for outgoing multicast packets. */ int optval; if (m == NULL || m->m_len != sizeof(int)) { error = EINVAL; break; } bcopy(mtod(m, u_int *), &optval, sizeof(optval)); if (optval < -1 || optval >= 256) error = EINVAL; else if (optval == -1) im6o->im6o_multicast_hlim = V_ip6_defmcasthlim; else im6o->im6o_multicast_hlim = optval; break; } case IPV6_MULTICAST_LOOP: /* * Set the loopback flag for outgoing multicast packets. * Must be zero or one. */ if (m == NULL || m->m_len != sizeof(u_int)) { error = EINVAL; break; } bcopy(mtod(m, u_int *), &loop, sizeof(loop)); if (loop > 1) { error = EINVAL; break; } im6o->im6o_multicast_loop = loop; break; case IPV6_JOIN_GROUP: /* * Add a multicast group membership. * Group must be a valid IP6 multicast address. 
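 * A typical caller fills in a struct ipv6_mreq, e.g.:
 *
 *	mreq.ipv6mr_multiaddr = <some ff02::... group>;
 *	mreq.ipv6mr_interface = if_nametoindex("em0");
 *	setsockopt(s, IPPROTO_IPV6, IPV6_JOIN_GROUP, &mreq, sizeof(mreq));
 *
 * With ipv6mr_interface set to zero, the outgoing interface is chosen by
 * a routing lookup on the group address.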
*/ if (m == NULL || m->m_len != sizeof(struct ipv6_mreq)) { error = EINVAL; break; } mreq = mtod(m, struct ipv6_mreq *); if (IN6_IS_ADDR_UNSPECIFIED(&mreq->ipv6mr_multiaddr)) { /* * We use the unspecified address to specify to accept * all multicast addresses. Only super user is allowed * to do this. */ /* XXX-BZ might need a better PRIV_NETINET_x for this */ error = priv_check(curthread, PRIV_NETINET_MROUTE); if (error) break; } else if (!IN6_IS_ADDR_MULTICAST(&mreq->ipv6mr_multiaddr)) { error = EINVAL; break; } /* * If no interface was explicitly specified, choose an * appropriate one according to the given multicast address. */ if (mreq->ipv6mr_interface == 0) { struct sockaddr_in6 *dst; /* * Look up the routing table for the * address, and choose the outgoing interface. * XXX: is it a good approach? */ ro.ro_rt = NULL; dst = (struct sockaddr_in6 *)&ro.ro_dst; bzero(dst, sizeof(*dst)); dst->sin6_family = AF_INET6; dst->sin6_len = sizeof(*dst); dst->sin6_addr = mreq->ipv6mr_multiaddr; rtalloc((struct route *)&ro); if (ro.ro_rt == NULL) { error = EADDRNOTAVAIL; break; } ifp = ro.ro_rt->rt_ifp; RTFREE(ro.ro_rt); } else { /* * If the interface is specified, validate it. */ if (mreq->ipv6mr_interface < 0 || V_if_index < mreq->ipv6mr_interface) { error = ENXIO; /* XXX EINVAL? */ break; } ifp = ifnet_byindex(mreq->ipv6mr_interface); if (!ifp) { error = ENXIO; /* XXX EINVAL? */ break; } } /* * See if we found an interface, and confirm that it * supports multicast */ if (ifp == NULL || (ifp->if_flags & IFF_MULTICAST) == 0) { error = EADDRNOTAVAIL; break; } if (in6_setscope(&mreq->ipv6mr_multiaddr, ifp, NULL)) { error = EADDRNOTAVAIL; /* XXX: should not happen */ break; } /* * See if the membership already exists. */ for (imm = im6o->im6o_memberships.lh_first; imm != NULL; imm = imm->i6mm_chain.le_next) if (imm->i6mm_maddr->in6m_ifp == ifp && IN6_ARE_ADDR_EQUAL(&imm->i6mm_maddr->in6m_addr, &mreq->ipv6mr_multiaddr)) break; if (imm != NULL) { error = EADDRINUSE; break; } /* * Everything looks good; add a new record to the multicast * address list for the given interface. */ imm = in6_joingroup(ifp, &mreq->ipv6mr_multiaddr, &error, 0); if (imm == NULL) break; LIST_INSERT_HEAD(&im6o->im6o_memberships, imm, i6mm_chain); break; case IPV6_LEAVE_GROUP: /* * Drop a multicast group membership. * Group must be a valid IP6 multicast address. */ if (m == NULL || m->m_len != sizeof(struct ipv6_mreq)) { error = EINVAL; break; } mreq = mtod(m, struct ipv6_mreq *); /* * If an interface address was specified, get a pointer * to its ifnet structure. */ if (mreq->ipv6mr_interface < 0 || V_if_index < mreq->ipv6mr_interface) { error = ENXIO; /* XXX EINVAL? */ break; } if (mreq->ipv6mr_interface == 0) ifp = NULL; else ifp = ifnet_byindex(mreq->ipv6mr_interface); /* Fill in the scope zone ID */ if (ifp) { if (in6_setscope(&mreq->ipv6mr_multiaddr, ifp, NULL)) { /* XXX: should not happen */ error = EADDRNOTAVAIL; break; } } else if (mreq->ipv6mr_interface != 0) { /* * This case happens when the (positive) index is in * the valid range, but the corresponding interface has * been detached dynamically (XXX). */ error = EADDRNOTAVAIL; break; } else { /* ipv6mr_interface == 0 */ struct sockaddr_in6 sa6_mc; /* * The API spec says as follows: * If the interface index is specified as 0, the * system may choose a multicast group membership to * drop by matching the multicast address only. * On the other hand, we cannot disambiguate the scope * zone unless an interface is provided. 
Thus, we * check if there's ambiguity with the default scope * zone as the last resort. */ bzero(&sa6_mc, sizeof(sa6_mc)); sa6_mc.sin6_family = AF_INET6; sa6_mc.sin6_len = sizeof(sa6_mc); sa6_mc.sin6_addr = mreq->ipv6mr_multiaddr; error = sa6_embedscope(&sa6_mc, V_ip6_use_defzone); if (error != 0) break; mreq->ipv6mr_multiaddr = sa6_mc.sin6_addr; } /* * Find the membership in the membership list. */ for (imm = im6o->im6o_memberships.lh_first; imm != NULL; imm = imm->i6mm_chain.le_next) { if ((ifp == NULL || imm->i6mm_maddr->in6m_ifp == ifp) && IN6_ARE_ADDR_EQUAL(&imm->i6mm_maddr->in6m_addr, &mreq->ipv6mr_multiaddr)) break; } if (imm == NULL) { /* Unable to resolve interface */ error = EADDRNOTAVAIL; break; } /* * Give up the multicast address record to which the * membership points. */ LIST_REMOVE(imm, i6mm_chain); in6_delmulti(imm->i6mm_maddr); free(imm, M_IP6MADDR); break; default: error = EOPNOTSUPP; break; } /* * If all options have default values, no need to keep the mbuf. */ if (im6o->im6o_multicast_ifp == NULL && im6o->im6o_multicast_hlim == V_ip6_defmcasthlim && im6o->im6o_multicast_loop == IPV6_DEFAULT_MULTICAST_LOOP && im6o->im6o_memberships.lh_first == NULL) { free(*im6op, M_IP6MOPTS); *im6op = NULL; } return (error); } /* * Return the IP6 multicast options in response to user getsockopt(). */ static int ip6_getmoptions(int optname, struct ip6_moptions *im6o, struct mbuf **mp) { INIT_VNET_INET6(curvnet); u_int *hlim, *loop, *ifindex; *mp = m_get(M_WAIT, MT_HEADER); /* XXX */ switch (optname) { case IPV6_MULTICAST_IF: ifindex = mtod(*mp, u_int *); (*mp)->m_len = sizeof(u_int); if (im6o == NULL || im6o->im6o_multicast_ifp == NULL) *ifindex = 0; else *ifindex = im6o->im6o_multicast_ifp->if_index; return (0); case IPV6_MULTICAST_HOPS: hlim = mtod(*mp, u_int *); (*mp)->m_len = sizeof(u_int); if (im6o == NULL) *hlim = V_ip6_defmcasthlim; else *hlim = im6o->im6o_multicast_hlim; return (0); case IPV6_MULTICAST_LOOP: loop = mtod(*mp, u_int *); (*mp)->m_len = sizeof(u_int); if (im6o == NULL) *loop = V_ip6_defmcasthlim; else *loop = im6o->im6o_multicast_loop; return (0); default: return (EOPNOTSUPP); } } /* * Discard the IP6 multicast options. */ void ip6_freemoptions(struct ip6_moptions *im6o) { struct in6_multi_mship *imm; if (im6o == NULL) return; while ((imm = im6o->im6o_memberships.lh_first) != NULL) { LIST_REMOVE(imm, i6mm_chain); if (imm->i6mm_maddr) in6_delmulti(imm->i6mm_maddr); free(imm, M_IP6MADDR); } free(im6o, M_IP6MOPTS); } /* * Set IPv6 outgoing packet options based on advanced API. */ int ip6_setpktopts(struct mbuf *control, struct ip6_pktopts *opt, struct ip6_pktopts *stickyopt, struct ucred *cred, int uproto) { struct cmsghdr *cm = 0; if (control == NULL || opt == NULL) return (EINVAL); ip6_initpktopts(opt); if (stickyopt) { int error; /* * If stickyopt is provided, make a local copy of the options * for this particular packet, then override them by ancillary * objects. * XXX: copypktopts() does not copy the cached route to a next * hop (if any). This is not very good in terms of efficiency, * but we can allow this since this option should be rarely * used. */ if ((error = copypktopts(opt, stickyopt, M_NOWAIT)) != 0) return (error); } /* * XXX: Currently, we assume all the optional information is stored * in a single mbuf. 
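 * The loop below walks the cmsghdr records in that mbuf, ignores any
 * level other than IPPROTO_IPV6, and passes each option to
 * ip6_setpktopt() as ancillary (cmsg) data.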
*/ if (control->m_next) return (EINVAL); for (; control->m_len > 0; control->m_data += CMSG_ALIGN(cm->cmsg_len), control->m_len -= CMSG_ALIGN(cm->cmsg_len)) { int error; if (control->m_len < CMSG_LEN(0)) return (EINVAL); cm = mtod(control, struct cmsghdr *); if (cm->cmsg_len == 0 || cm->cmsg_len > control->m_len) return (EINVAL); if (cm->cmsg_level != IPPROTO_IPV6) continue; error = ip6_setpktopt(cm->cmsg_type, CMSG_DATA(cm), cm->cmsg_len - CMSG_LEN(0), opt, cred, 0, 1, uproto); if (error) return (error); } return (0); } /* * Set a particular packet option, as a sticky option or an ancillary data * item. "len" can be 0 only when it's a sticky option. * We have 4 cases of combination of "sticky" and "cmsg": * "sticky=0, cmsg=0": impossible * "sticky=0, cmsg=1": RFC2292 or RFC3542 ancillary data * "sticky=1, cmsg=0": RFC3542 socket option * "sticky=1, cmsg=1": RFC2292 socket option */ static int ip6_setpktopt(int optname, u_char *buf, int len, struct ip6_pktopts *opt, struct ucred *cred, int sticky, int cmsg, int uproto) { INIT_VNET_NET(curvnet); INIT_VNET_INET6(curvnet); int minmtupolicy, preftemp; int error; if (!sticky && !cmsg) { #ifdef DIAGNOSTIC printf("ip6_setpktopt: impossible case\n"); #endif return (EINVAL); } /* * IPV6_2292xxx is for backward compatibility to RFC2292, and should * not be specified in the context of RFC3542. Conversely, * RFC3542 types should not be specified in the context of RFC2292. */ if (!cmsg) { switch (optname) { case IPV6_2292PKTINFO: case IPV6_2292HOPLIMIT: case IPV6_2292NEXTHOP: case IPV6_2292HOPOPTS: case IPV6_2292DSTOPTS: case IPV6_2292RTHDR: case IPV6_2292PKTOPTIONS: return (ENOPROTOOPT); } } if (sticky && cmsg) { switch (optname) { case IPV6_PKTINFO: case IPV6_HOPLIMIT: case IPV6_NEXTHOP: case IPV6_HOPOPTS: case IPV6_DSTOPTS: case IPV6_RTHDRDSTOPTS: case IPV6_RTHDR: case IPV6_USE_MIN_MTU: case IPV6_DONTFRAG: case IPV6_TCLASS: case IPV6_PREFER_TEMPADDR: /* XXX: not an RFC3542 option */ return (ENOPROTOOPT); } } switch (optname) { case IPV6_2292PKTINFO: case IPV6_PKTINFO: { struct ifnet *ifp = NULL; struct in6_pktinfo *pktinfo; if (len != sizeof(struct in6_pktinfo)) return (EINVAL); pktinfo = (struct in6_pktinfo *)buf; /* * An application can clear any sticky IPV6_PKTINFO option by * doing a "regular" setsockopt with ipi6_addr being * in6addr_any and ipi6_ifindex being zero. * [RFC 3542, Section 6] */ if (optname == IPV6_PKTINFO && opt->ip6po_pktinfo && pktinfo->ipi6_ifindex == 0 && IN6_IS_ADDR_UNSPECIFIED(&pktinfo->ipi6_addr)) { ip6_clearpktopts(opt, optname); break; } if (uproto == IPPROTO_TCP && optname == IPV6_PKTINFO && sticky && !IN6_IS_ADDR_UNSPECIFIED(&pktinfo->ipi6_addr)) { return (EINVAL); } /* validate the interface index if specified. */ if (pktinfo->ipi6_ifindex > V_if_index || pktinfo->ipi6_ifindex < 0) { return (ENXIO); } if (pktinfo->ipi6_ifindex) { ifp = ifnet_byindex(pktinfo->ipi6_ifindex); if (ifp == NULL) return (ENXIO); } /* * We store the address anyway, and let in6_selectsrc() * validate the specified address. This is because ipi6_addr * may not have enough information about its scope zone, and * we may need additional information (such as outgoing * interface or the scope zone of a destination address) to * disambiguate the scope. * XXX: the delay of the validation may confuse the * application when it is used as a sticky option. 
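/*
 * The parsing loop above walks exactly the kind of control buffer an
 * RFC 3542 application builds for sendmsg(); a minimal userland sketch
 * follows.  send_with_pktinfo() and its parameters are illustrative,
 * not part of this change.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>
#include <string.h>

/* Send one datagram, pinning the outgoing interface via IPV6_PKTINFO. */
static ssize_t
send_with_pktinfo(int s, struct sockaddr_in6 *dst, void *buf, size_t buflen,
    unsigned int ifindex)
{
	char cbuf[CMSG_SPACE(sizeof(struct in6_pktinfo))];
	struct in6_pktinfo pi;
	struct msghdr msg;
	struct cmsghdr *cm;
	struct iovec iov;

	memset(cbuf, 0, sizeof(cbuf));
	memset(&pi, 0, sizeof(pi));
	memset(&msg, 0, sizeof(msg));
	pi.ipi6_ifindex = ifindex;	/* source address left unspecified */

	iov.iov_base = buf;
	iov.iov_len = buflen;
	msg.msg_name = dst;
	msg.msg_namelen = sizeof(*dst);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cbuf;
	msg.msg_controllen = sizeof(cbuf);

	cm = CMSG_FIRSTHDR(&msg);
	cm->cmsg_level = IPPROTO_IPV6;
	cm->cmsg_type = IPV6_PKTINFO;
	cm->cmsg_len = CMSG_LEN(sizeof(pi));
	memcpy(CMSG_DATA(cm), &pi, sizeof(pi));

	return (sendmsg(s, &msg, 0));
}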
*/ if (opt->ip6po_pktinfo == NULL) { opt->ip6po_pktinfo = malloc(sizeof(*pktinfo), M_IP6OPT, M_NOWAIT); if (opt->ip6po_pktinfo == NULL) return (ENOBUFS); } bcopy(pktinfo, opt->ip6po_pktinfo, sizeof(*pktinfo)); break; } case IPV6_2292HOPLIMIT: case IPV6_HOPLIMIT: { int *hlimp; /* * RFC 3542 deprecated the usage of sticky IPV6_HOPLIMIT * to simplify the ordering among hoplimit options. */ if (optname == IPV6_HOPLIMIT && sticky) return (ENOPROTOOPT); if (len != sizeof(int)) return (EINVAL); hlimp = (int *)buf; if (*hlimp < -1 || *hlimp > 255) return (EINVAL); opt->ip6po_hlim = *hlimp; break; } case IPV6_TCLASS: { int tclass; if (len != sizeof(int)) return (EINVAL); tclass = *(int *)buf; if (tclass < -1 || tclass > 255) return (EINVAL); opt->ip6po_tclass = tclass; break; } case IPV6_2292NEXTHOP: case IPV6_NEXTHOP: if (cred != NULL) { error = priv_check_cred(cred, PRIV_NETINET_SETHDROPTS, 0); if (error) return (error); } if (len == 0) { /* just remove the option */ ip6_clearpktopts(opt, IPV6_NEXTHOP); break; } /* check if cmsg_len is large enough for sa_len */ if (len < sizeof(struct sockaddr) || len < *buf) return (EINVAL); switch (((struct sockaddr *)buf)->sa_family) { case AF_INET6: { struct sockaddr_in6 *sa6 = (struct sockaddr_in6 *)buf; int error; if (sa6->sin6_len != sizeof(struct sockaddr_in6)) return (EINVAL); if (IN6_IS_ADDR_UNSPECIFIED(&sa6->sin6_addr) || IN6_IS_ADDR_MULTICAST(&sa6->sin6_addr)) { return (EINVAL); } if ((error = sa6_embedscope(sa6, V_ip6_use_defzone)) != 0) { return (error); } break; } case AF_LINK: /* should eventually be supported */ default: return (EAFNOSUPPORT); } /* turn off the previous option, then set the new option. */ ip6_clearpktopts(opt, IPV6_NEXTHOP); opt->ip6po_nexthop = malloc(*buf, M_IP6OPT, M_NOWAIT); if (opt->ip6po_nexthop == NULL) return (ENOBUFS); bcopy(buf, opt->ip6po_nexthop, *buf); break; case IPV6_2292HOPOPTS: case IPV6_HOPOPTS: { struct ip6_hbh *hbh; int hbhlen; /* * XXX: We don't allow a non-privileged user to set ANY HbH * options, since per-option restriction has too much * overhead. */ if (cred != NULL) { error = priv_check_cred(cred, PRIV_NETINET_SETHDROPTS, 0); if (error) return (error); } if (len == 0) { ip6_clearpktopts(opt, IPV6_HOPOPTS); break; /* just remove the option */ } /* message length validation */ if (len < sizeof(struct ip6_hbh)) return (EINVAL); hbh = (struct ip6_hbh *)buf; hbhlen = (hbh->ip6h_len + 1) << 3; if (len != hbhlen) return (EINVAL); /* turn off the previous option, then set the new option. */ ip6_clearpktopts(opt, IPV6_HOPOPTS); opt->ip6po_hbh = malloc(hbhlen, M_IP6OPT, M_NOWAIT); if (opt->ip6po_hbh == NULL) return (ENOBUFS); bcopy(hbh, opt->ip6po_hbh, hbhlen); break; } case IPV6_2292DSTOPTS: case IPV6_DSTOPTS: case IPV6_RTHDRDSTOPTS: { struct ip6_dest *dest, **newdest = NULL; int destlen; if (cred != NULL) { /* XXX: see the comment for IPV6_HOPOPTS */ error = priv_check_cred(cred, PRIV_NETINET_SETHDROPTS, 0); if (error) return (error); } if (len == 0) { ip6_clearpktopts(opt, optname); break; /* just remove the option */ } /* message length validation */ if (len < sizeof(struct ip6_dest)) return (EINVAL); dest = (struct ip6_dest *)buf; destlen = (dest->ip6d_len + 1) << 3; if (len != destlen) return (EINVAL); /* * Determine the position that the destination options header * should be inserted; before or after the routing header. */ switch (optname) { case IPV6_2292DSTOPTS: /* * The old advacned API is ambiguous on this point. 
* Our approach is to determine the position based * according to the existence of a routing header. * Note, however, that this depends on the order of the * extension headers in the ancillary data; the 1st * part of the destination options header must appear * before the routing header in the ancillary data, * too. * RFC3542 solved the ambiguity by introducing * separate ancillary data or option types. */ if (opt->ip6po_rthdr == NULL) newdest = &opt->ip6po_dest1; else newdest = &opt->ip6po_dest2; break; case IPV6_RTHDRDSTOPTS: newdest = &opt->ip6po_dest1; break; case IPV6_DSTOPTS: newdest = &opt->ip6po_dest2; break; } /* turn off the previous option, then set the new option. */ ip6_clearpktopts(opt, optname); *newdest = malloc(destlen, M_IP6OPT, M_NOWAIT); if (*newdest == NULL) return (ENOBUFS); bcopy(dest, *newdest, destlen); break; } case IPV6_2292RTHDR: case IPV6_RTHDR: { struct ip6_rthdr *rth; int rthlen; if (len == 0) { ip6_clearpktopts(opt, IPV6_RTHDR); break; /* just remove the option */ } /* message length validation */ if (len < sizeof(struct ip6_rthdr)) return (EINVAL); rth = (struct ip6_rthdr *)buf; rthlen = (rth->ip6r_len + 1) << 3; if (len != rthlen) return (EINVAL); switch (rth->ip6r_type) { case IPV6_RTHDR_TYPE_0: if (rth->ip6r_len == 0) /* must contain one addr */ return (EINVAL); if (rth->ip6r_len % 2) /* length must be even */ return (EINVAL); if (rth->ip6r_len / 2 != rth->ip6r_segleft) return (EINVAL); break; default: return (EINVAL); /* not supported */ } /* turn off the previous option */ ip6_clearpktopts(opt, IPV6_RTHDR); opt->ip6po_rthdr = malloc(rthlen, M_IP6OPT, M_NOWAIT); if (opt->ip6po_rthdr == NULL) return (ENOBUFS); bcopy(rth, opt->ip6po_rthdr, rthlen); break; } case IPV6_USE_MIN_MTU: if (len != sizeof(int)) return (EINVAL); minmtupolicy = *(int *)buf; if (minmtupolicy != IP6PO_MINMTU_MCASTONLY && minmtupolicy != IP6PO_MINMTU_DISABLE && minmtupolicy != IP6PO_MINMTU_ALL) { return (EINVAL); } opt->ip6po_minmtu = minmtupolicy; break; case IPV6_DONTFRAG: if (len != sizeof(int)) return (EINVAL); if (uproto == IPPROTO_TCP || *(int *)buf == 0) { /* * we ignore this option for TCP sockets. * (RFC3542 leaves this case unspecified.) */ opt->ip6po_flags &= ~IP6PO_DONTFRAG; } else opt->ip6po_flags |= IP6PO_DONTFRAG; break; case IPV6_PREFER_TEMPADDR: if (len != sizeof(int)) return (EINVAL); preftemp = *(int *)buf; if (preftemp != IP6PO_TEMPADDR_SYSTEM && preftemp != IP6PO_TEMPADDR_NOTPREFER && preftemp != IP6PO_TEMPADDR_PREFER) { return (EINVAL); } opt->ip6po_prefer_tempaddr = preftemp; break; default: return (ENOPROTOOPT); } /* end of switch */ return (0); } /* * Routine called from ip6_output() to loop back a copy of an IP6 multicast * packet to the input queue of a specified interface. Note that this * calls the output routine of the loopback "driver", but with an interface * pointer that might NOT be &loif -- easier than replicating that code here. */ void ip6_mloopback(struct ifnet *ifp, struct mbuf *m, struct sockaddr_in6 *dst) { struct mbuf *copym; struct ip6_hdr *ip6; copym = m_copy(m, 0, M_COPYALL); if (copym == NULL) return; /* * Make sure to deep-copy IPv6 header portion in case the data * is in an mbuf cluster, so that we can safely override the IPv6 * header portion later. 
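/*
 * Options such as IPV6_USE_MIN_MTU, IPV6_DONTFRAG and IPV6_TCLASS,
 * validated in ip6_setpktopt() above, are normally installed as sticky
 * options with setsockopt(); a minimal sketch (set_sticky_opts() is an
 * illustrative name).  For IPV6_USE_MIN_MTU, -1 is the default
 * (minimum MTU for multicast only), 0 disables it and 1 always sends
 * at the minimum MTU; IPV6_DONTFRAG is effectively ignored for TCP
 * sockets, and IPV6_TCLASS accepts -1 to restore the default traffic
 * class.
 */
#include <sys/socket.h>
#include <netinet/in.h>

static int
set_sticky_opts(int s)
{
	int minmtu = 1;		/* always send at the IPv6 minimum MTU */
	int dontfrag = 1;
	int tclass = 0x20;

	if (setsockopt(s, IPPROTO_IPV6, IPV6_USE_MIN_MTU,
	    &minmtu, sizeof(minmtu)) == -1)
		return (-1);
	if (setsockopt(s, IPPROTO_IPV6, IPV6_DONTFRAG,
	    &dontfrag, sizeof(dontfrag)) == -1)
		return (-1);
	return (setsockopt(s, IPPROTO_IPV6, IPV6_TCLASS,
	    &tclass, sizeof(tclass)));
}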
*/ if ((copym->m_flags & M_EXT) != 0 || copym->m_len < sizeof(struct ip6_hdr)) { copym = m_pullup(copym, sizeof(struct ip6_hdr)); if (copym == NULL) return; } #ifdef DIAGNOSTIC if (copym->m_len < sizeof(*ip6)) { m_freem(copym); return; } #endif ip6 = mtod(copym, struct ip6_hdr *); /* * clear embedded scope identifiers if necessary. * in6_clearscope will touch the addresses only when necessary. */ in6_clearscope(&ip6->ip6_src); in6_clearscope(&ip6->ip6_dst); (void)if_simloop(ifp, copym, dst->sin6_family, 0); } /* * Chop IPv6 header off from the payload. */ static int ip6_splithdr(struct mbuf *m, struct ip6_exthdrs *exthdrs) { struct mbuf *mh; struct ip6_hdr *ip6; ip6 = mtod(m, struct ip6_hdr *); if (m->m_len > sizeof(*ip6)) { MGETHDR(mh, M_DONTWAIT, MT_HEADER); if (mh == 0) { m_freem(m); return ENOBUFS; } M_MOVE_PKTHDR(mh, m); MH_ALIGN(mh, sizeof(*ip6)); m->m_len -= sizeof(*ip6); m->m_data += sizeof(*ip6); mh->m_next = m; m = mh; m->m_len = sizeof(*ip6); bcopy((caddr_t)ip6, mtod(m, caddr_t), sizeof(*ip6)); } exthdrs->ip6e_ip6 = m; return 0; } /* * Compute IPv6 extension header length. */ int ip6_optlen(struct in6pcb *in6p) { int len; if (!in6p->in6p_outputopts) return 0; len = 0; #define elen(x) \ (((struct ip6_ext *)(x)) ? (((struct ip6_ext *)(x))->ip6e_len + 1) << 3 : 0) len += elen(in6p->in6p_outputopts->ip6po_hbh); if (in6p->in6p_outputopts->ip6po_rthdr) /* dest1 is valid with rthdr only */ len += elen(in6p->in6p_outputopts->ip6po_dest1); len += elen(in6p->in6p_outputopts->ip6po_rthdr); len += elen(in6p->in6p_outputopts->ip6po_dest2); return len; #undef elen } Index: head/sys/netinet6/ip6_var.h =================================================================== --- head/sys/netinet6/ip6_var.h (revision 186118) +++ head/sys/netinet6/ip6_var.h (revision 186119) @@ -1,406 +1,406 @@ /*- * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the project nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $KAME: ip6_var.h,v 1.62 2001/05/03 14:51:48 itojun Exp $ */ /*- * Copyright (c) 1982, 1986, 1993 * The Regents of the University of California. All rights reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)ip_var.h 8.1 (Berkeley) 6/10/93 * $FreeBSD$ */ #ifndef _NETINET6_IP6_VAR_H_ #define _NETINET6_IP6_VAR_H_ /* * IP6 reassembly queue structure. Each fragment * being reassembled is attached to one of these structures. */ struct ip6q { struct ip6asfrag *ip6q_down; struct ip6asfrag *ip6q_up; u_int32_t ip6q_ident; u_int8_t ip6q_nxt; u_int8_t ip6q_ecn; u_int8_t ip6q_ttl; struct in6_addr ip6q_src, ip6q_dst; struct ip6q *ip6q_next; struct ip6q *ip6q_prev; int ip6q_unfrglen; /* len of unfragmentable part */ #ifdef notyet u_char *ip6q_nxtp; #endif int ip6q_nfrag; /* # of fragments */ struct label *ip6q_label; }; struct ip6asfrag { struct ip6asfrag *ip6af_down; struct ip6asfrag *ip6af_up; struct mbuf *ip6af_m; int ip6af_offset; /* offset in ip6af_m to next header */ int ip6af_frglen; /* fragmentable part length */ int ip6af_off; /* fragment offset */ u_int16_t ip6af_mff; /* more fragment bit in frag off */ }; #define IP6_REASS_MBUF(ip6af) (*(struct mbuf **)&((ip6af)->ip6af_m)) struct ip6_moptions { struct ifnet *im6o_multicast_ifp; /* ifp for outgoing multicasts */ u_char im6o_multicast_hlim; /* hoplimit for outgoing multicasts */ u_char im6o_multicast_loop; /* 1 >= hear sends if a member */ LIST_HEAD(, in6_multi_mship) im6o_memberships; }; /* * Control options for outgoing packets */ /* Routing header related info */ struct ip6po_rhinfo { struct ip6_rthdr *ip6po_rhi_rthdr; /* Routing header */ struct route_in6 ip6po_rhi_route; /* Route to the 1st hop */ }; #define ip6po_rthdr ip6po_rhinfo.ip6po_rhi_rthdr #define ip6po_route ip6po_rhinfo.ip6po_rhi_route /* Nexthop related info */ struct ip6po_nhinfo { struct sockaddr *ip6po_nhi_nexthop; struct route_in6 ip6po_nhi_route; /* Route to the nexthop */ }; #define ip6po_nexthop ip6po_nhinfo.ip6po_nhi_nexthop #define ip6po_nextroute ip6po_nhinfo.ip6po_nhi_route struct ip6_pktopts { struct mbuf *ip6po_m; /* Pointer to mbuf storing the data */ int ip6po_hlim; /* Hoplimit for outgoing packets */ /* Outgoing IF/address information */ struct in6_pktinfo *ip6po_pktinfo; /* Next-hop address information */ struct ip6po_nhinfo 
ip6po_nhinfo; struct ip6_hbh *ip6po_hbh; /* Hop-by-Hop options header */ /* Destination options header (before a routing header) */ struct ip6_dest *ip6po_dest1; /* Routing header related info. */ struct ip6po_rhinfo ip6po_rhinfo; /* Destination options header (after a routing header) */ struct ip6_dest *ip6po_dest2; int ip6po_tclass; /* traffic class */ int ip6po_minmtu; /* fragment vs PMTU discovery policy */ #define IP6PO_MINMTU_MCASTONLY -1 /* default; send at min MTU for multicast*/ #define IP6PO_MINMTU_DISABLE 0 /* always perform pmtu disc */ #define IP6PO_MINMTU_ALL 1 /* always send at min MTU */ int ip6po_prefer_tempaddr; /* whether temporary addresses are preferred as source address */ #define IP6PO_TEMPADDR_SYSTEM -1 /* follow the system default */ #define IP6PO_TEMPADDR_NOTPREFER 0 /* not prefer temporary address */ #define IP6PO_TEMPADDR_PREFER 1 /* prefer temporary address */ int ip6po_flags; #if 0 /* parameters in this block is obsolete. do not reuse the values. */ #define IP6PO_REACHCONF 0x01 /* upper-layer reachability confirmation. */ #define IP6PO_MINMTU 0x02 /* use minimum MTU (IPV6_USE_MIN_MTU) */ #endif #define IP6PO_DONTFRAG 0x04 /* disable fragmentation (IPV6_DONTFRAG) */ #define IP6PO_USECOA 0x08 /* use care of address */ }; /* * Control options for incoming packets */ struct ip6stat { u_quad_t ip6s_total; /* total packets received */ u_quad_t ip6s_tooshort; /* packet too short */ u_quad_t ip6s_toosmall; /* not enough data */ u_quad_t ip6s_fragments; /* fragments received */ u_quad_t ip6s_fragdropped; /* frags dropped(dups, out of space) */ u_quad_t ip6s_fragtimeout; /* fragments timed out */ u_quad_t ip6s_fragoverflow; /* fragments that exceeded limit */ u_quad_t ip6s_forward; /* packets forwarded */ u_quad_t ip6s_cantforward; /* packets rcvd for unreachable dest */ u_quad_t ip6s_redirectsent; /* packets forwarded on same net */ u_quad_t ip6s_delivered; /* datagrams delivered to upper level*/ u_quad_t ip6s_localout; /* total ip packets generated here */ u_quad_t ip6s_odropped; /* lost packets due to nobufs, etc. */ u_quad_t ip6s_reassembled; /* total packets reassembled ok */ u_quad_t ip6s_fragmented; /* datagrams successfully fragmented */ u_quad_t ip6s_ofragments; /* output fragments created */ u_quad_t ip6s_cantfrag; /* don't fragment flag was set, etc. 
*/ u_quad_t ip6s_badoptions; /* error in option processing */ u_quad_t ip6s_noroute; /* packets discarded due to no route */ u_quad_t ip6s_badvers; /* ip6 version != 6 */ u_quad_t ip6s_rawout; /* total raw ip packets generated */ u_quad_t ip6s_badscope; /* scope error */ u_quad_t ip6s_notmember; /* don't join this multicast group */ u_quad_t ip6s_nxthist[256]; /* next header history */ u_quad_t ip6s_m1; /* one mbuf */ u_quad_t ip6s_m2m[32]; /* two or more mbuf */ u_quad_t ip6s_mext1; /* one ext mbuf */ u_quad_t ip6s_mext2m; /* two or more ext mbuf */ u_quad_t ip6s_exthdrtoolong; /* ext hdr are not continuous */ u_quad_t ip6s_nogif; /* no match gif found */ u_quad_t ip6s_toomanyhdr; /* discarded due to too many headers */ /* * statistics for improvement of the source address selection * algorithm: * XXX: hardcoded 16 = # of ip6 multicast scope types + 1 */ /* number of times that address selection fails */ u_quad_t ip6s_sources_none; /* number of times that an address on the outgoing I/F is chosen */ u_quad_t ip6s_sources_sameif[16]; /* number of times that an address on a non-outgoing I/F is chosen */ u_quad_t ip6s_sources_otherif[16]; /* * number of times that an address that has the same scope * from the destination is chosen. */ u_quad_t ip6s_sources_samescope[16]; /* * number of times that an address that has a different scope * from the destination is chosen. */ u_quad_t ip6s_sources_otherscope[16]; /* number of times that a deprecated address is chosen */ u_quad_t ip6s_sources_deprecated[16]; u_quad_t ip6s_forward_cachehit; u_quad_t ip6s_forward_cachemiss; /* number of times that each rule of source selection is applied. */ u_quad_t ip6s_sources_rule[16]; }; #ifdef _KERNEL /* * IPv6 onion peeling state. * it will be initialized when we come into ip6_input(). * XXX do not make it a kitchen sink! */ struct ip6aux { u_int32_t ip6a_flags; #define IP6A_SWAP 0x01 /* swapped home/care-of on packet */ #define IP6A_HASEEN 0x02 /* HA was present */ #define IP6A_BRUID 0x04 /* BR Unique Identifier was present */ #define IP6A_RTALERTSEEN 0x08 /* rtalert present */ /* ip6.ip6_src */ struct in6_addr ip6a_careof; /* care-of address of the peer */ struct in6_addr ip6a_home; /* home address of the peer */ u_int16_t ip6a_bruid; /* BR unique identifier */ /* ip6.ip6_dst */ struct in6_ifaddr *ip6a_dstia6; /* my ifaddr that matches ip6_dst */ /* rtalert */ u_int16_t ip6a_rtalert; /* rtalert option value */ /* * decapsulation history will be here. * with IPsec it may not be accurate. */ }; #endif #ifdef _KERNEL /* flags passed to ip6_output as last parameter */ #define IPV6_UNSPECSRC 0x01 /* allow :: as the source address */ #define IPV6_FORWARDING 0x02 /* most of IPv6 header exists */ #define IPV6_MINMTU 0x04 /* use minimum MTU (IPV6_USE_MIN_MTU) */ #ifdef __NO_STRICT_ALIGNMENT #define IP6_HDR_ALIGNED_P(ip) 1 #else #define IP6_HDR_ALIGNED_P(ip) ((((intptr_t) (ip)) & 3) == 0) #endif #ifdef VIMAGE_GLOBALS extern struct ip6stat ip6stat; /* statistics */ extern int ip6_defhlim; /* default hop limit */ extern int ip6_defmcasthlim; /* default multicast hop limit */ extern int ip6_forwarding; /* act as router? */ extern int ip6_forward_srcrt; /* forward src-routed? */ extern int ip6_gif_hlim; /* Hop limit for gif encap packet */ extern int ip6_use_deprecated; /* allow deprecated addr as source */ extern int ip6_rr_prune; /* router renumbering prefix * walk list every 5 sec. */ extern int ip6_mcast_pmtu; /* enable pMTU discovery for multicast? 
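/*
 * struct ip6stat above is what "netstat -s -p ip6" reports; a minimal
 * sketch of reading it from userland, assuming the net.inet6.ip6.stats
 * sysctl exported by ip6_input.c (print_ip6_totals() is an
 * illustrative name).
 */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <netinet/in.h>
#include <netinet/ip6.h>
#include <netinet6/ip6_var.h>
#include <stdint.h>
#include <stdio.h>

static void
print_ip6_totals(void)
{
	struct ip6stat st;
	size_t len = sizeof(st);

	if (sysctlbyname("net.inet6.ip6.stats", &st, &len, NULL, 0) == -1)
		return;
	printf("received %ju sent %ju fragments %ju\n",
	    (uintmax_t)st.ip6s_total, (uintmax_t)st.ip6s_localout,
	    (uintmax_t)st.ip6s_fragments);
}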
*/ extern int ip6_v6only; #endif /* VIMAGE_GLOBALS */ extern struct socket *ip6_mrouter; /* multicast routing daemon */ #ifdef VIMAGE_GLOBALS extern int ip6_sendredirects; /* send IP redirects when forwarding? */ extern int ip6_maxfragpackets; /* Maximum packets in reassembly queue */ extern int ip6_maxfrags; /* Maximum fragments in reassembly queue */ extern int ip6_sourcecheck; /* Verify source interface */ extern int ip6_sourcecheck_interval; /* Interval between log messages */ extern int ip6_accept_rtadv; /* Acts as a host not a router */ extern int ip6_keepfaith; /* Firewall Aided Internet Translator */ extern int ip6_log_interval; extern time_t ip6_log_time; extern int ip6_hdrnestlimit; /* upper limit of # of extension headers */ extern int ip6_dad_count; /* DupAddrDetectionTransmits */ extern int ip6_auto_flowlabel; extern int ip6_auto_linklocal; extern int ip6_use_tempaddr; /* whether to use temporary addresses. */ extern int ip6_prefer_tempaddr; /* whether to prefer temporary addresses in the source address selection */ #ifdef IPSTEALTH extern int ip6stealth; #endif extern int ip6_use_defzone; /* whether to use the default scope zone when unspecified */ #endif /* VIMAGE_GLOBALS */ extern struct pfil_head inet6_pfil_hook; /* packet filter hooks */ extern struct pr_usrreqs rip6_usrreqs; struct sockopt; struct inpcb; int icmp6_ctloutput __P((struct socket *, struct sockopt *sopt)); struct in6_ifaddr; void ip6_init __P((void)); void ip6_input __P((struct mbuf *)); struct in6_ifaddr *ip6_getdstifaddr __P((struct mbuf *)); void ip6_freepcbopts __P((struct ip6_pktopts *)); void ip6_freemoptions __P((struct ip6_moptions *)); int ip6_unknown_opt __P((u_int8_t *, struct mbuf *, int)); char * ip6_get_prevhdr __P((struct mbuf *, int)); int ip6_nexthdr __P((struct mbuf *, int, int, int *)); int ip6_lasthdr __P((struct mbuf *, int, int, int *)); struct ip6aux *ip6_addaux __P((struct mbuf *)); struct ip6aux *ip6_findaux __P((struct mbuf *)); void ip6_delaux __P((struct mbuf *)); extern int (*ip6_mforward)(struct ip6_hdr *, struct ifnet *, struct mbuf *); int ip6_process_hopopts __P((struct mbuf *, u_int8_t *, int, u_int32_t *, u_int32_t *)); struct mbuf **ip6_savecontrol_v4(struct inpcb *, struct mbuf *, struct mbuf **, int *); void ip6_savecontrol __P((struct inpcb *, struct mbuf *, struct mbuf **)); void ip6_notify_pmtu __P((struct inpcb *, struct sockaddr_in6 *, u_int32_t *)); int ip6_sysctl __P((int *, u_int, void *, size_t *, void *, size_t)); void ip6_forward __P((struct mbuf *, int)); void ip6_mloopback __P((struct ifnet *, struct mbuf *, struct sockaddr_in6 *)); int ip6_output __P((struct mbuf *, struct ip6_pktopts *, struct route_in6 *, int, struct ip6_moptions *, struct ifnet **, struct inpcb *)); int ip6_ctloutput __P((struct socket *, struct sockopt *)); int ip6_raw_ctloutput __P((struct socket *, struct sockopt *)); void ip6_initpktopts __P((struct ip6_pktopts *)); int ip6_setpktopts __P((struct mbuf *, struct ip6_pktopts *, struct ip6_pktopts *, struct ucred *, int)); void ip6_clearpktopts __P((struct ip6_pktopts *, int)); struct ip6_pktopts *ip6_copypktopts __P((struct ip6_pktopts *, int)); int ip6_optlen __P((struct inpcb *)); int route6_input __P((struct mbuf **, int *, int)); void frag6_init __P((void)); int frag6_input __P((struct mbuf **, int *, int)); void frag6_slowtimo __P((void)); void frag6_drain __P((void)); void rip6_init __P((void)); int rip6_input __P((struct mbuf **, int *, int)); void rip6_ctlinput __P((int, struct sockaddr *, void *)); int rip6_ctloutput 
__P((struct socket *, struct sockopt *)); int rip6_output __P((struct mbuf *, ...)); int rip6_usrreq __P((struct socket *, int, struct mbuf *, struct mbuf *, struct mbuf *, struct thread *)); int dest6_input __P((struct mbuf **, int *, int)); int none_input __P((struct mbuf **, int *, int)); struct in6_addr *in6_selectsrc __P((struct sockaddr_in6 *, struct ip6_pktopts *, struct inpcb *inp, struct route_in6 *, struct ucred *cred, struct ifnet **, int *)); int in6_selectroute __P((struct sockaddr_in6 *, struct ip6_pktopts *, struct ip6_moptions *, struct route_in6 *, struct ifnet **, - struct rtentry **, int)); + struct rtentry **)); u_int32_t ip6_randomid __P((void)); u_int32_t ip6_randomflowlabel __P((void)); #endif /* _KERNEL */ #endif /* !_NETINET6_IP6_VAR_H_ */ Index: head/sys/netinet6/nd6.c =================================================================== --- head/sys/netinet6/nd6.c (revision 186118) +++ head/sys/netinet6/nd6.c (revision 186119) @@ -1,2462 +1,2207 @@ /*- * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the project nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $KAME: nd6.c,v 1.144 2001/05/24 07:44:00 itojun Exp $ */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include "opt_mac.h" #include #include #include #include #include #include #include #include #include #include #include #include +#include +#include #include #include #include #include #include #include #include #include #include #include #include +#include +#define L3_ADDR_SIN6(le) ((struct sockaddr_in6 *) L3_ADDR(le)) #include #include #include #include #include #include #include #include #include #include #include #define ND6_SLOWTIMER_INTERVAL (60 * 60) /* 1 hour */ #define ND6_RECALC_REACHTM_INTERVAL (60 * 120) /* 2 hours */ #define SIN6(s) ((struct sockaddr_in6 *)s) #define SDL(s) ((struct sockaddr_dl *)s) #ifdef VIMAGE_GLOBALS int nd6_prune; int nd6_delay; int nd6_umaxtries; int nd6_mmaxtries; int nd6_useloopback; int nd6_gctimer; /* preventing too many loops in ND option parsing */ int nd6_maxndopt; int nd6_maxnudhint; int nd6_maxqueuelen; int nd6_debug; /* for debugging? 
*/ +#if 0 static int nd6_inuse, nd6_allocated; -struct llinfo_nd6 llinfo_nd6; +#endif struct nd_drhead nd_defrouter; struct nd_prhead nd_prefix; int nd6_recalc_reachtm_interval; #endif /* VIMAGE_GLOBALS */ static struct sockaddr_in6 all1_sa; static int nd6_is_new_addr_neighbor __P((struct sockaddr_in6 *, struct ifnet *)); static void nd6_setmtu0(struct ifnet *, struct nd_ifinfo *); static void nd6_slowtimo(void *); static int regen_tmpaddr(struct in6_ifaddr *); -static struct llinfo_nd6 *nd6_free(struct rtentry *, int); +static struct llentry *nd6_free(struct llentry *, int); static void nd6_llinfo_timer(void *); -static void clear_llinfo_pqueue(struct llinfo_nd6 *); +static void clear_llinfo_pqueue(struct llentry *); #ifdef VIMAGE_GLOBALS struct callout nd6_slowtimo_ch; struct callout nd6_timer_ch; extern struct callout in6_tmpaddrtimer_ch; extern int dad_ignore_ns; extern int dad_maxtry; #endif void nd6_init(void) { INIT_VNET_INET6(curvnet); static int nd6_init_done = 0; int i; if (nd6_init_done) { log(LOG_NOTICE, "nd6_init called more than once(ignored)\n"); return; } V_nd6_prune = 1; /* walk list every 1 seconds */ V_nd6_delay = 5; /* delay first probe time 5 second */ V_nd6_umaxtries = 3; /* maximum unicast query */ V_nd6_mmaxtries = 3; /* maximum multicast query */ V_nd6_useloopback = 1; /* use loopback interface for local traffic */ V_nd6_gctimer = (60 * 60 * 24); /* 1 day: garbage collection timer */ /* preventing too many loops in ND option parsing */ V_nd6_maxndopt = 10; /* max # of ND options allowed */ V_nd6_maxnudhint = 0; /* max # of subsequent upper layer hints */ V_nd6_maxqueuelen = 1; /* max pkts cached in unresolved ND entries */ #ifdef ND6_DEBUG V_nd6_debug = 1; #else V_nd6_debug = 0; #endif V_nd6_recalc_reachtm_interval = ND6_RECALC_REACHTM_INTERVAL; V_dad_ignore_ns = 0; /* ignore NS in DAD - specwise incorrect*/ V_dad_maxtry = 15; /* max # of *tries* to transmit DAD packet */ + /* + * XXX just to get this to compile KMM + */ +#ifdef notyet V_llinfo_nd6.ln_next = &V_llinfo_nd6; V_llinfo_nd6.ln_prev = &V_llinfo_nd6; +#endif LIST_INIT(&V_nd_prefix); V_ip6_use_tempaddr = 0; V_ip6_temp_preferred_lifetime = DEF_TEMP_PREFERRED_LIFETIME; V_ip6_temp_valid_lifetime = DEF_TEMP_VALID_LIFETIME; V_ip6_temp_regen_advance = TEMPADDR_REGEN_ADVANCE; all1_sa.sin6_family = AF_INET6; all1_sa.sin6_len = sizeof(struct sockaddr_in6); for (i = 0; i < sizeof(all1_sa.sin6_addr); i++) all1_sa.sin6_addr.s6_addr[i] = 0xff; /* initialization of the default router list */ TAILQ_INIT(&V_nd_defrouter); /* start timer */ callout_init(&V_nd6_slowtimo_ch, 0); callout_reset(&V_nd6_slowtimo_ch, ND6_SLOWTIMER_INTERVAL * hz, nd6_slowtimo, NULL); nd6_init_done = 1; } struct nd_ifinfo * nd6_ifattach(struct ifnet *ifp) { struct nd_ifinfo *nd; nd = (struct nd_ifinfo *)malloc(sizeof(*nd), M_IP6NDP, M_WAITOK); bzero(nd, sizeof(*nd)); nd->initialized = 1; nd->chlim = IPV6_DEFHLIM; nd->basereachable = REACHABLE_TIME; nd->reachable = ND_COMPUTE_RTIME(nd->basereachable); nd->retrans = RETRANS_TIMER; /* * Note that the default value of ip6_accept_rtadv is 0, which means * we won't accept RAs by default even if we set ND6_IFF_ACCEPT_RTADV * here. */ nd->flags = (ND6_IFF_PERFORMNUD | ND6_IFF_ACCEPT_RTADV); /* XXX: we cannot call nd6_setmtu since ifp is not fully initialized */ nd6_setmtu0(ifp, nd); return nd; } void nd6_ifdetach(struct nd_ifinfo *nd) { free(nd, M_IP6NDP); } /* * Reset ND level link MTU. This function is called when the physical MTU * changes, which means we might have to adjust the ND level MTU. 
*/ void nd6_setmtu(struct ifnet *ifp) { nd6_setmtu0(ifp, ND_IFINFO(ifp)); } /* XXX todo: do not maintain copy of ifp->if_mtu in ndi->maxmtu */ void nd6_setmtu0(struct ifnet *ifp, struct nd_ifinfo *ndi) { INIT_VNET_INET6(ifp->if_vnet); u_int32_t omaxmtu; omaxmtu = ndi->maxmtu; switch (ifp->if_type) { case IFT_ARCNET: ndi->maxmtu = MIN(ARC_PHDS_MAXMTU, ifp->if_mtu); /* RFC2497 */ break; case IFT_FDDI: ndi->maxmtu = MIN(FDDIIPMTU, ifp->if_mtu); /* RFC2467 */ break; case IFT_ISO88025: ndi->maxmtu = MIN(ISO88025_MAX_MTU, ifp->if_mtu); break; default: ndi->maxmtu = ifp->if_mtu; break; } /* * Decreasing the interface MTU under IPV6 minimum MTU may cause * undesirable situation. We thus notify the operator of the change * explicitly. The check for omaxmtu is necessary to restrict the * log to the case of changing the MTU, not initializing it. */ if (omaxmtu >= IPV6_MMTU && ndi->maxmtu < IPV6_MMTU) { log(LOG_NOTICE, "nd6_setmtu0: " "new link MTU on %s (%lu) is too small for IPv6\n", if_name(ifp), (unsigned long)ndi->maxmtu); } if (ndi->maxmtu > V_in6_maxmtu) in6_setmaxmtu(); /* check all interfaces just in case */ #undef MIN } void nd6_option_init(void *opt, int icmp6len, union nd_opts *ndopts) { bzero(ndopts, sizeof(*ndopts)); ndopts->nd_opts_search = (struct nd_opt_hdr *)opt; ndopts->nd_opts_last = (struct nd_opt_hdr *)(((u_char *)opt) + icmp6len); if (icmp6len == 0) { ndopts->nd_opts_done = 1; ndopts->nd_opts_search = NULL; } } /* * Take one ND option. */ struct nd_opt_hdr * nd6_option(union nd_opts *ndopts) { struct nd_opt_hdr *nd_opt; int olen; if (ndopts == NULL) panic("ndopts == NULL in nd6_option"); if (ndopts->nd_opts_last == NULL) panic("uninitialized ndopts in nd6_option"); if (ndopts->nd_opts_search == NULL) return NULL; if (ndopts->nd_opts_done) return NULL; nd_opt = ndopts->nd_opts_search; /* make sure nd_opt_len is inside the buffer */ if ((caddr_t)&nd_opt->nd_opt_len >= (caddr_t)ndopts->nd_opts_last) { bzero(ndopts, sizeof(*ndopts)); return NULL; } olen = nd_opt->nd_opt_len << 3; if (olen == 0) { /* * Message validation requires that all included * options have a length that is greater than zero. */ bzero(ndopts, sizeof(*ndopts)); return NULL; } ndopts->nd_opts_search = (struct nd_opt_hdr *)((caddr_t)nd_opt + olen); if (ndopts->nd_opts_search > ndopts->nd_opts_last) { /* option overruns the end of buffer, invalid */ bzero(ndopts, sizeof(*ndopts)); return NULL; } else if (ndopts->nd_opts_search == ndopts->nd_opts_last) { /* reached the end of options chain */ ndopts->nd_opts_done = 1; ndopts->nd_opts_search = NULL; } return nd_opt; } /* * Parse multiple ND options. * This function is much easier to use, for ND routines that do not need * multiple options of the same type. */ int nd6_options(union nd_opts *ndopts) { INIT_VNET_INET6(curvnet); struct nd_opt_hdr *nd_opt; int i = 0; if (ndopts == NULL) panic("ndopts == NULL in nd6_options"); if (ndopts->nd_opts_last == NULL) panic("uninitialized ndopts in nd6_options"); if (ndopts->nd_opts_search == NULL) return 0; while (1) { nd_opt = nd6_option(ndopts); if (nd_opt == NULL && ndopts->nd_opts_last == NULL) { /* * Message validation requires that all included * options have a length that is greater than zero. 
*/ V_icmp6stat.icp6s_nd_badopt++; bzero(ndopts, sizeof(*ndopts)); return -1; } if (nd_opt == NULL) goto skip1; switch (nd_opt->nd_opt_type) { case ND_OPT_SOURCE_LINKADDR: case ND_OPT_TARGET_LINKADDR: case ND_OPT_MTU: case ND_OPT_REDIRECTED_HEADER: if (ndopts->nd_opt_array[nd_opt->nd_opt_type]) { nd6log((LOG_INFO, "duplicated ND6 option found (type=%d)\n", nd_opt->nd_opt_type)); /* XXX bark? */ } else { ndopts->nd_opt_array[nd_opt->nd_opt_type] = nd_opt; } break; case ND_OPT_PREFIX_INFORMATION: if (ndopts->nd_opt_array[nd_opt->nd_opt_type] == 0) { ndopts->nd_opt_array[nd_opt->nd_opt_type] = nd_opt; } ndopts->nd_opts_pi_end = (struct nd_opt_prefix_info *)nd_opt; break; default: /* * Unknown options must be silently ignored, * to accomodate future extension to the protocol. */ nd6log((LOG_DEBUG, "nd6_options: unsupported option %d - " "option ignored\n", nd_opt->nd_opt_type)); } skip1: i++; if (i > V_nd6_maxndopt) { V_icmp6stat.icp6s_nd_toomanyopt++; nd6log((LOG_INFO, "too many loop in nd opt\n")); break; } if (ndopts->nd_opts_done) break; } return 0; } /* * ND6 timer routine to handle ND6 entries */ void -nd6_llinfo_settimer(struct llinfo_nd6 *ln, long tick) +nd6_llinfo_settimer_locked(struct llentry *ln, long tick) { if (tick < 0) { - ln->ln_expire = 0; + ln->la_expire = 0; ln->ln_ntick = 0; callout_stop(&ln->ln_timer_ch); + /* + * XXX - do we know that there is + * callout installed? i.e. are we + * guaranteed that we're not dropping + * a reference that we did not add? + * KMM + */ + LLE_REMREF(ln); } else { - ln->ln_expire = time_second + tick / hz; + ln->la_expire = time_second + tick / hz; + LLE_ADDREF(ln); if (tick > INT_MAX) { ln->ln_ntick = tick - INT_MAX; callout_reset(&ln->ln_timer_ch, INT_MAX, nd6_llinfo_timer, ln); } else { ln->ln_ntick = 0; callout_reset(&ln->ln_timer_ch, tick, nd6_llinfo_timer, ln); } } } +void +nd6_llinfo_settimer(struct llentry *ln, long tick) +{ + + LLE_WLOCK(ln); + nd6_llinfo_settimer_locked(ln, tick); + LLE_WUNLOCK(ln); +} + static void nd6_llinfo_timer(void *arg) { - struct llinfo_nd6 *ln; - struct rtentry *rt; + struct llentry *ln; struct in6_addr *dst; struct ifnet *ifp; struct nd_ifinfo *ndi = NULL; - ln = (struct llinfo_nd6 *)arg; + ln = (struct llentry *)arg; + if (ln == NULL) { + panic("%s: NULL entry!\n", __func__); + return; + } + if ((ifp = ((ln->lle_tbl != NULL) ? 
ln->lle_tbl->llt_ifp : NULL)) == NULL) + panic("ln ifp == NULL"); + + CURVNET_SET(ifp->if_vnet); + INIT_VNET_INET6(curvnet); + if (ln->ln_ntick > 0) { if (ln->ln_ntick > INT_MAX) { ln->ln_ntick -= INT_MAX; nd6_llinfo_settimer(ln, INT_MAX); } else { ln->ln_ntick = 0; nd6_llinfo_settimer(ln, ln->ln_ntick); } - return; + goto done; } - if ((rt = ln->ln_rt) == NULL) - panic("ln->ln_rt == NULL"); - if ((ifp = rt->rt_ifp) == NULL) - panic("ln->ln_rt->rt_ifp == NULL"); ndi = ND_IFINFO(ifp); + dst = &L3_ADDR_SIN6(ln)->sin6_addr; + if ((ln->la_flags & LLE_STATIC) || (ln->la_expire > time_second)) { + goto done; + } - CURVNET_SET(ifp->if_vnet); - INIT_VNET_INET6(curvnet); + if (ln->la_flags & LLE_DELETED) { + (void)nd6_free(ln, 0); + goto done; + } - /* sanity check */ - if (rt->rt_llinfo && (struct llinfo_nd6 *)rt->rt_llinfo != ln) - panic("rt_llinfo(%p) is not equal to ln(%p)", - rt->rt_llinfo, ln); - if (rt_key(rt) == NULL) - panic("rt key is NULL in nd6_timer(ln=%p)", ln); - - dst = &((struct sockaddr_in6 *)rt_key(rt))->sin6_addr; - switch (ln->ln_state) { case ND6_LLINFO_INCOMPLETE: - if (ln->ln_asked < V_nd6_mmaxtries) { - ln->ln_asked++; + if (ln->la_asked < V_nd6_mmaxtries) { + ln->la_asked++; nd6_llinfo_settimer(ln, (long)ndi->retrans * hz / 1000); nd6_ns_output(ifp, NULL, dst, ln, 0); } else { - struct mbuf *m = ln->ln_hold; + struct mbuf *m = ln->la_hold; if (m) { struct mbuf *m0; /* - * assuming every packet in ln_hold has the + * assuming every packet in la_hold has the * same IP header */ m0 = m->m_nextpkt; m->m_nextpkt = NULL; icmp6_error2(m, ICMP6_DST_UNREACH, - ICMP6_DST_UNREACH_ADDR, 0, rt->rt_ifp); + ICMP6_DST_UNREACH_ADDR, 0, ifp); - ln->ln_hold = m0; + ln->la_hold = m0; clear_llinfo_pqueue(ln); } - if (rt && rt->rt_llinfo) - (void)nd6_free(rt, 0); + (void)nd6_free(ln, 0); ln = NULL; } break; case ND6_LLINFO_REACHABLE: if (!ND6_LLINFO_PERMANENT(ln)) { ln->ln_state = ND6_LLINFO_STALE; nd6_llinfo_settimer(ln, (long)V_nd6_gctimer * hz); } break; case ND6_LLINFO_STALE: /* Garbage Collection(RFC 2461 5.3) */ if (!ND6_LLINFO_PERMANENT(ln)) { - if (rt && rt->rt_llinfo) - (void)nd6_free(rt, 1); + (void)nd6_free(ln, 1); ln = NULL; } break; case ND6_LLINFO_DELAY: if (ndi && (ndi->flags & ND6_IFF_PERFORMNUD) != 0) { /* We need NUD */ - ln->ln_asked = 1; + ln->la_asked = 1; ln->ln_state = ND6_LLINFO_PROBE; nd6_llinfo_settimer(ln, (long)ndi->retrans * hz / 1000); nd6_ns_output(ifp, dst, dst, ln, 0); } else { ln->ln_state = ND6_LLINFO_STALE; /* XXX */ nd6_llinfo_settimer(ln, (long)V_nd6_gctimer * hz); } break; case ND6_LLINFO_PROBE: - if (ln->ln_asked < V_nd6_umaxtries) { - ln->ln_asked++; + if (ln->la_asked < V_nd6_umaxtries) { + ln->la_asked++; nd6_llinfo_settimer(ln, (long)ndi->retrans * hz / 1000); nd6_ns_output(ifp, dst, dst, ln, 0); - } else if (rt->rt_ifa != NULL && - rt->rt_ifa->ifa_addr->sa_family == AF_INET6 && - (((struct in6_ifaddr *)rt->rt_ifa)->ia_flags & IFA_ROUTE)) { - /* - * This is an unreachable neighbor whose address is - * specified as the destination of a p2p interface - * (see in6_ifinit()). We should not free the entry - * since this is sort of a "static" entry generated - * via interface address configuration. 
- */ - ln->ln_asked = 0; - ln->ln_expire = 0; /* make it permanent */ - ln->ln_state = ND6_LLINFO_STALE; } else { - if (rt && rt->rt_llinfo) - (void)nd6_free(rt, 0); + (void)nd6_free(ln, 0); ln = NULL; } break; } CURVNET_RESTORE(); +done: + if (ln != NULL) + LLE_FREE(ln); } /* * ND6 timer routine to expire default route list and prefix list */ void nd6_timer(void *arg) { CURVNET_SET_QUIET((struct vnet *) arg); INIT_VNET_INET6((struct vnet *) arg); int s; struct nd_defrouter *dr; struct nd_prefix *pr; struct in6_ifaddr *ia6, *nia6; struct in6_addrlifetime *lt6; callout_reset(&V_nd6_timer_ch, V_nd6_prune * hz, nd6_timer, NULL); /* expire default router list */ s = splnet(); dr = TAILQ_FIRST(&V_nd_defrouter); while (dr) { if (dr->expire && dr->expire < time_second) { struct nd_defrouter *t; t = TAILQ_NEXT(dr, dr_entry); defrtrlist_del(dr); dr = t; } else { dr = TAILQ_NEXT(dr, dr_entry); } } /* * expire interface addresses. * in the past the loop was inside prefix expiry processing. * However, from a stricter speci-confrmance standpoint, we should * rather separate address lifetimes and prefix lifetimes. */ addrloop: for (ia6 = V_in6_ifaddr; ia6; ia6 = nia6) { nia6 = ia6->ia_next; /* check address lifetime */ lt6 = &ia6->ia6_lifetime; if (IFA6_IS_INVALID(ia6)) { int regen = 0; /* * If the expiring address is temporary, try * regenerating a new one. This would be useful when * we suspended a laptop PC, then turned it on after a * period that could invalidate all temporary * addresses. Although we may have to restart the * loop (see below), it must be after purging the * address. Otherwise, we'd see an infinite loop of * regeneration. */ if (V_ip6_use_tempaddr && (ia6->ia6_flags & IN6_IFF_TEMPORARY) != 0) { if (regen_tmpaddr(ia6) == 0) regen = 1; } in6_purgeaddr(&ia6->ia_ifa); if (regen) goto addrloop; /* XXX: see below */ } else if (IFA6_IS_DEPRECATED(ia6)) { int oldflags = ia6->ia6_flags; ia6->ia6_flags |= IN6_IFF_DEPRECATED; /* * If a temporary address has just become deprecated, * regenerate a new one if possible. */ if (V_ip6_use_tempaddr && (ia6->ia6_flags & IN6_IFF_TEMPORARY) != 0 && (oldflags & IN6_IFF_DEPRECATED) == 0) { if (regen_tmpaddr(ia6) == 0) { /* * A new temporary address is * generated. * XXX: this means the address chain * has changed while we are still in * the loop. Although the change * would not cause disaster (because * it's not a deletion, but an * addition,) we'd rather restart the * loop just for safety. Or does this * significantly reduce performance?? */ goto addrloop; } } } else { /* * A new RA might have made a deprecated address * preferred. */ ia6->ia6_flags &= ~IN6_IFF_DEPRECATED; } } /* expire prefix list */ pr = V_nd_prefix.lh_first; while (pr) { /* * check prefix lifetime. * since pltime is just for autoconf, pltime processing for * prefix is not necessary. */ if (pr->ndpr_vltime != ND6_INFINITE_LIFETIME && time_second - pr->ndpr_lastupdate > pr->ndpr_vltime) { struct nd_prefix *t; t = pr->ndpr_next; /* * address expiration and prefix expiration are * separate. NEVER perform in6_purgeaddr here. 
*/ prelist_remove(pr); pr = t; } else pr = pr->ndpr_next; } splx(s); CURVNET_RESTORE(); } /* * ia6 - deprecated/invalidated temporary address */ static int regen_tmpaddr(struct in6_ifaddr *ia6) { struct ifaddr *ifa; struct ifnet *ifp; struct in6_ifaddr *public_ifa6 = NULL; ifp = ia6->ia_ifa.ifa_ifp; for (ifa = ifp->if_addrlist.tqh_first; ifa; ifa = ifa->ifa_list.tqe_next) { struct in6_ifaddr *it6; if (ifa->ifa_addr->sa_family != AF_INET6) continue; it6 = (struct in6_ifaddr *)ifa; /* ignore no autoconf addresses. */ if ((it6->ia6_flags & IN6_IFF_AUTOCONF) == 0) continue; /* ignore autoconf addresses with different prefixes. */ if (it6->ia6_ndpr == NULL || it6->ia6_ndpr != ia6->ia6_ndpr) continue; /* * Now we are looking at an autoconf address with the same * prefix as ours. If the address is temporary and is still * preferred, do not create another one. It would be rare, but * could happen, for example, when we resume a laptop PC after * a long period. */ if ((it6->ia6_flags & IN6_IFF_TEMPORARY) != 0 && !IFA6_IS_DEPRECATED(it6)) { public_ifa6 = NULL; break; } /* * This is a public autoconf address that has the same prefix * as ours. If it is preferred, keep it. We can't break the * loop here, because there may be a still-preferred temporary * address with the prefix. */ if (!IFA6_IS_DEPRECATED(it6)) public_ifa6 = it6; } if (public_ifa6 != NULL) { int e; if ((e = in6_tmpifadd(public_ifa6, 0, 0)) != 0) { log(LOG_NOTICE, "regen_tmpaddr: failed to create a new" " tmp addr,errno=%d\n", e); return (-1); } return (0); } return (-1); } /* * Nuke neighbor cache/prefix/default router management table, right before * ifp goes away. */ void nd6_purge(struct ifnet *ifp) { INIT_VNET_INET6(ifp->if_vnet); - struct llinfo_nd6 *ln, *nln; struct nd_defrouter *dr, *ndr; struct nd_prefix *pr, *npr; /* * Nuke default router list entries toward ifp. * We defer removal of default router list entries that is installed * in the routing table, in order to keep additional side effects as * small as possible. */ for (dr = TAILQ_FIRST(&V_nd_defrouter); dr; dr = ndr) { ndr = TAILQ_NEXT(dr, dr_entry); if (dr->installed) continue; if (dr->ifp == ifp) defrtrlist_del(dr); } for (dr = TAILQ_FIRST(&V_nd_defrouter); dr; dr = ndr) { ndr = TAILQ_NEXT(dr, dr_entry); if (!dr->installed) continue; if (dr->ifp == ifp) defrtrlist_del(dr); } /* Nuke prefix list entries toward ifp */ for (pr = V_nd_prefix.lh_first; pr; pr = npr) { npr = pr->ndpr_next; if (pr->ndpr_ifp == ifp) { /* * Because if_detach() does *not* release prefixes * while purging addresses the reference count will * still be above zero. We therefore reset it to * make sure that the prefix really gets purged. */ pr->ndpr_refcnt = 0; /* * Previously, pr->ndpr_addr is removed as well, * but I strongly believe we don't have to do it. * nd6_purge() is only called from in6_ifdetach(), * which removes all the associated interface addresses * by itself. * (jinmei@kame.net 20010129) */ prelist_remove(pr); } } /* cancel default outgoing interface setting */ if (V_nd6_defifindex == ifp->if_index) nd6_setdefaultiface(0); if (!V_ip6_forwarding && V_ip6_accept_rtadv) { /* XXX: too restrictive? */ - /* refresh default router list */ + /* refresh default router list + * + * + */ defrouter_select(); + } - /* - * Nuke neighbor cache entries for the ifp. - * Note that rt->rt_ifp may not be the same as ifp, - * due to KAME goto ours hack. See RTM_RESOLVE case in - * nd6_rtrequest(), and ip6_input(). 
+ /* XXXXX + * We do not nuke the neighbor cache entries here any more + * because the neighbor cache is kept in if_afdata[AF_INET6]. + * nd6_purge() is invoked by in6_ifdetach() which is called + * from if_detach() where everything gets purged. So let + * in6_domifdetach() do the actual L2 table purging work. */ - ln = V_llinfo_nd6.ln_next; - while (ln && ln != &V_llinfo_nd6) { - struct rtentry *rt; - struct sockaddr_dl *sdl; - - nln = ln->ln_next; - rt = ln->ln_rt; - if (rt && rt->rt_gateway && - rt->rt_gateway->sa_family == AF_LINK) { - sdl = (struct sockaddr_dl *)rt->rt_gateway; - if (sdl->sdl_index == ifp->if_index) - nln = nd6_free(rt, 0); - } - ln = nln; - } } -struct rtentry * -nd6_lookup(struct in6_addr *addr6, int create, struct ifnet *ifp) +/* + * the caller acquires and releases the lock on the lltbls + * Returns the llentry locked + */ +struct llentry * +nd6_lookup(struct in6_addr *addr6, int flags, struct ifnet *ifp) { INIT_VNET_INET6(curvnet); - struct rtentry *rt; struct sockaddr_in6 sin6; - char ip6buf[INET6_ADDRSTRLEN]; - + struct llentry *ln; + int llflags = 0; + bzero(&sin6, sizeof(sin6)); sin6.sin6_len = sizeof(struct sockaddr_in6); sin6.sin6_family = AF_INET6; sin6.sin6_addr = *addr6; - rt = rtalloc1((struct sockaddr *)&sin6, create, 0UL); - if (rt) { - if ((rt->rt_flags & RTF_LLINFO) == 0 && create) { - /* - * This is the case for the default route. - * If we want to create a neighbor cache for the - * address, we should free the route for the - * destination and allocate an interface route. - */ - RTFREE_LOCKED(rt); - rt = NULL; - } - } - if (rt == NULL) { - if (create && ifp) { - int e; - /* - * If no route is available and create is set, - * we allocate a host route for the destination - * and treat it like an interface route. - * This hack is necessary for a neighbor which can't - * be covered by our own prefix. - */ - struct ifaddr *ifa = - ifaof_ifpforaddr((struct sockaddr *)&sin6, ifp); - if (ifa == NULL) - return (NULL); + IF_AFDATA_LOCK_ASSERT(ifp); - /* - * Create a new route. RTF_LLINFO is necessary - * to create a Neighbor Cache entry for the - * destination in nd6_rtrequest which will be - * called in rtrequest via ifa->ifa_rtrequest. - */ - if ((e = rtrequest(RTM_ADD, (struct sockaddr *)&sin6, - ifa->ifa_addr, (struct sockaddr *)&all1_sa, - (ifa->ifa_flags | RTF_HOST | RTF_LLINFO) & - ~RTF_CLONING, &rt)) != 0) { - log(LOG_ERR, - "nd6_lookup: failed to add route for a " - "neighbor(%s), errno=%d\n", - ip6_sprintf(ip6buf, addr6), e); - } - if (rt == NULL) - return (NULL); - RT_LOCK(rt); - if (rt->rt_llinfo) { - struct llinfo_nd6 *ln = - (struct llinfo_nd6 *)rt->rt_llinfo; - ln->ln_state = ND6_LLINFO_NOSTATE; - } - } else - return (NULL); + if (flags & ND6_CREATE) + llflags |= LLE_CREATE; + if (flags & ND6_EXCLUSIVE) + llflags |= LLE_EXCLUSIVE; + + ln = lla_lookup(LLTABLE6(ifp), llflags, (struct sockaddr *)&sin6); + if ((ln != NULL) && (flags & LLE_CREATE)) { + ln->ln_state = ND6_LLINFO_NOSTATE; + callout_init(&ln->ln_timer_ch, 0); } - RT_LOCK_ASSERT(rt); - RT_REMREF(rt); - /* - * Validation for the entry. - * Note that the check for rt_llinfo is necessary because a cloned - * route from a parent route that has the L flag (e.g. the default - * route to a p2p interface) may have the flag, too, while the - * destination is not actually a neighbor. - * XXX: we can't use rt->rt_ifp to check for the interface, since - * it might be the loopback interface if the entry is for our - * own address on a non-loopback interface. 
Instead, we should - * use rt->rt_ifa->ifa_ifp, which would specify the REAL - * interface. - * Note also that ifa_ifp and ifp may differ when we connect two - * interfaces to a same link, install a link prefix to an interface, - * and try to install a neighbor cache on an interface that does not - * have a route to the prefix. - */ - if ((rt->rt_flags & RTF_GATEWAY) || (rt->rt_flags & RTF_LLINFO) == 0 || - rt->rt_gateway->sa_family != AF_LINK || rt->rt_llinfo == NULL || - (ifp && rt->rt_ifa->ifa_ifp != ifp)) { - if (create) { - nd6log((LOG_DEBUG, - "nd6_lookup: failed to lookup %s (if = %s)\n", - ip6_sprintf(ip6buf, addr6), - ifp ? if_name(ifp) : "unspec")); - } - RT_UNLOCK(rt); - return (NULL); - } - RT_UNLOCK(rt); /* XXX not ready to return rt locked */ - return (rt); + + return (ln); } /* * Test whether a given IPv6 address is a neighbor or not, ignoring * the actual neighbor cache. The neighbor cache is ignored in order * to not reenter the routing code from within itself. */ static int nd6_is_new_addr_neighbor(struct sockaddr_in6 *addr, struct ifnet *ifp) { INIT_VNET_INET6(ifp->if_vnet); struct nd_prefix *pr; struct ifaddr *dstaddr; /* * A link-local address is always a neighbor. * XXX: a link does not necessarily specify a single interface. */ if (IN6_IS_ADDR_LINKLOCAL(&addr->sin6_addr)) { struct sockaddr_in6 sin6_copy; u_int32_t zone; /* * We need sin6_copy since sa6_recoverscope() may modify the * content (XXX). */ sin6_copy = *addr; if (sa6_recoverscope(&sin6_copy)) return (0); /* XXX: should be impossible */ if (in6_setscope(&sin6_copy.sin6_addr, ifp, &zone)) return (0); if (sin6_copy.sin6_scope_id == zone) return (1); else return (0); } /* * If the address matches one of our addresses, * it should be a neighbor. * If the address matches one of our on-link prefixes, it should be a * neighbor. */ for (pr = V_nd_prefix.lh_first; pr; pr = pr->ndpr_next) { if (pr->ndpr_ifp != ifp) continue; if (!(pr->ndpr_stateflags & NDPRF_ONLINK)) continue; if (IN6_ARE_MASKED_ADDR_EQUAL(&pr->ndpr_prefix.sin6_addr, &addr->sin6_addr, &pr->ndpr_mask)) return (1); } /* * If the address is assigned on the node of the other side of * a p2p interface, the address should be a neighbor. */ dstaddr = ifa_ifwithdstaddr((struct sockaddr *)addr); if ((dstaddr != NULL) && (dstaddr->ifa_ifp == ifp)) return (1); /* * If the default router list is empty, all addresses are regarded * as on-link, and thus, as a neighbor. * XXX: we restrict the condition to hosts, because routers usually do * not have the "default router list". */ if (!V_ip6_forwarding && TAILQ_FIRST(&V_nd_defrouter) == NULL && V_nd6_defifindex == ifp->if_index) { return (1); } return (0); } /* * Detect if a given IPv6 address identifies a neighbor on a given link. * XXX: should take care of the destination of a p2p link? */ int nd6_is_addr_neighbor(struct sockaddr_in6 *addr, struct ifnet *ifp) { + struct llentry *lle; + int rc = 0; + IF_AFDATA_UNLOCK_ASSERT(ifp); if (nd6_is_new_addr_neighbor(addr, ifp)) return (1); /* * Even if the address matches none of our addresses, it might be * in the neighbor cache. */ - if (nd6_lookup(&addr->sin6_addr, 0, ifp) != NULL) - return (1); - - return (0); + IF_AFDATA_LOCK(ifp); + if ((lle = nd6_lookup(&addr->sin6_addr, 0, ifp)) != NULL) { + LLE_RUNLOCK(lle); + rc = 1; + } + IF_AFDATA_UNLOCK(ifp); + return (rc); } /* * Free an nd6 llinfo entry. 
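/*
 * With the arp-v2 switch, callers no longer walk rtentry/llinfo_nd6
 * chains; the convention used by nd6_is_addr_neighbor() above is the
 * general pattern.  A condensed sketch, assuming dst6 (a struct
 * sockaddr_in6 *) and ifp are in scope, with error handling trimmed:
 *
 *	struct llentry *lle;
 *
 *	IF_AFDATA_LOCK(ifp);
 *	lle = nd6_lookup(&dst6->sin6_addr, 0, ifp);
 *	if (lle != NULL) {
 *		... inspect lle->ln_state, lle->la_expire, ...
 *		LLE_RUNLOCK(lle);
 *	}
 *	IF_AFDATA_UNLOCK(ifp);
 *
 * Passing ND6_EXCLUSIVE instead of 0 returns the entry write-locked
 * (release with LLE_WUNLOCK, as nd6_nud_hint() below does), and
 * ND6_CREATE asks lla_lookup() to allocate a new cache entry when none
 * exists.
 */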
* Since the function would cause significant changes in the kernel, DO NOT * make it global, unless you have a strong reason for the change, and are sure * that the change is safe. */ -static struct llinfo_nd6 * -nd6_free(struct rtentry *rt, int gc) +static struct llentry * +nd6_free(struct llentry *ln, int gc) { INIT_VNET_INET6(curvnet); - struct llinfo_nd6 *ln = (struct llinfo_nd6 *)rt->rt_llinfo, *next; - struct in6_addr in6 = ((struct sockaddr_in6 *)rt_key(rt))->sin6_addr; + struct llentry *next; struct nd_defrouter *dr; + struct ifnet *ifp=NULL; /* * we used to have pfctlinput(PRC_HOSTDEAD) here. * even though it is not harmful, it was not really necessary. */ /* cancel timer */ nd6_llinfo_settimer(ln, -1); if (!V_ip6_forwarding) { int s; s = splnet(); - dr = defrouter_lookup(&((struct sockaddr_in6 *)rt_key(rt))->sin6_addr, - rt->rt_ifp); + dr = defrouter_lookup(&L3_ADDR_SIN6(ln)->sin6_addr, ln->lle_tbl->llt_ifp); if (dr != NULL && dr->expire && ln->ln_state == ND6_LLINFO_STALE && gc) { /* * If the reason for the deletion is just garbage * collection, and the neighbor is an active default * router, do not delete it. Instead, reset the GC * timer using the router's lifetime. * Simply deleting the entry would affect default * router selection, which is not necessarily a good * thing, especially when we're using router preference * values. * XXX: the check for ln_state would be redundant, * but we intentionally keep it just in case. */ if (dr->expire > time_second) nd6_llinfo_settimer(ln, (dr->expire - time_second) * hz); else nd6_llinfo_settimer(ln, (long)V_nd6_gctimer * hz); splx(s); - return (ln->ln_next); + return (LIST_NEXT(ln, lle_next)); } if (ln->ln_router || dr) { /* * rt6_flush must be called whether or not the neighbor * is in the Default Router List. * See a corresponding comment in nd6_na_input(). */ - rt6_flush(&in6, rt->rt_ifp); + rt6_flush(&L3_ADDR_SIN6(ln)->sin6_addr, ln->lle_tbl->llt_ifp); } if (dr) { /* * Unreachablity of a router might affect the default * router selection and on-link detection of advertised * prefixes. */ /* * Temporarily fake the state to choose a new default * router and to perform on-link determination of * prefixes correctly. * Below the state will be set correctly, * or the entry itself will be deleted. */ ln->ln_state = ND6_LLINFO_INCOMPLETE; /* * Since defrouter_select() does not affect the * on-link determination and MIP6 needs the check * before the default router selection, we perform * the check now. */ pfxlist_onlink_check(); /* * refresh default router list */ defrouter_select(); } splx(s); } /* * Before deleting the entry, remember the next entry as the * return value. We need this because pfxlist_onlink_check() above * might have freed other entries (particularly the old next entry) as * a side effect (XXX). */ - next = ln->ln_next; + next = LIST_NEXT(ln, lle_next); - /* - * Detach the route from the routing tree and the list of neighbor - * caches, and disable the route entry not to be used in already - * cached routes. - */ - rtrequest(RTM_DELETE, rt_key(rt), (struct sockaddr *)0, - rt_mask(rt), 0, (struct rtentry **)0); + ifp = ln->lle_tbl->llt_ifp; + IF_AFDATA_LOCK(ifp); + LLE_WLOCK(ln); + llentry_free(ln); + IF_AFDATA_UNLOCK(ifp); return (next); } /* * Upper-layer reachability hint for Neighbor Unreachability Detection. * * XXX cost-effective methods? 
*/ void nd6_nud_hint(struct rtentry *rt, struct in6_addr *dst6, int force) { INIT_VNET_INET6(curvnet); - struct llinfo_nd6 *ln; + struct llentry *ln; + struct ifnet *ifp; - /* - * If the caller specified "rt", use that. Otherwise, resolve the - * routing table by supplied "dst6". - */ - if (rt == NULL) { - if (dst6 == NULL) - return; - if ((rt = nd6_lookup(dst6, 0, NULL)) == NULL) - return; - } + if ((dst6 == NULL) || (rt == NULL)) + return; - if ((rt->rt_flags & RTF_GATEWAY) != 0 || - (rt->rt_flags & RTF_LLINFO) == 0 || - rt->rt_llinfo == NULL || rt->rt_gateway == NULL || - rt->rt_gateway->sa_family != AF_LINK) { - /* This is not a host route. */ + ifp = rt->rt_ifp; + IF_AFDATA_LOCK(ifp); + ln = nd6_lookup(dst6, ND6_EXCLUSIVE, NULL); + IF_AFDATA_UNLOCK(ifp); + if (ln == NULL) return; - } - ln = (struct llinfo_nd6 *)rt->rt_llinfo; if (ln->ln_state < ND6_LLINFO_REACHABLE) - return; + goto done; /* * if we get upper-layer reachability confirmation many times, * it is possible we have false information. */ if (!force) { ln->ln_byhint++; - if (ln->ln_byhint > V_nd6_maxnudhint) - return; + if (ln->ln_byhint > V_nd6_maxnudhint) { + goto done; + } } - ln->ln_state = ND6_LLINFO_REACHABLE; + ln->ln_state = ND6_LLINFO_REACHABLE; if (!ND6_LLINFO_PERMANENT(ln)) { nd6_llinfo_settimer(ln, (long)ND_IFINFO(rt->rt_ifp)->reachable * hz); } +done: + LLE_WUNLOCK(ln); } -/* - * info - XXX unused - */ -void -nd6_rtrequest(int req, struct rtentry *rt, struct rt_addrinfo *info) -{ - struct sockaddr *gate = rt->rt_gateway; - struct llinfo_nd6 *ln = (struct llinfo_nd6 *)rt->rt_llinfo; - static struct sockaddr_dl null_sdl = {sizeof(null_sdl), AF_LINK}; - struct ifnet *ifp = rt->rt_ifp; - struct ifaddr *ifa; - INIT_VNET_NET(ifp->if_vnet); - INIT_VNET_INET6(ifp->if_vnet); - RT_LOCK_ASSERT(rt); - - if ((rt->rt_flags & RTF_GATEWAY) != 0) - return; - - if (nd6_need_cache(ifp) == 0 && (rt->rt_flags & RTF_HOST) == 0) { - /* - * This is probably an interface direct route for a link - * which does not need neighbor caches (e.g. fe80::%lo0/64). - * We do not need special treatment below for such a route. - * Moreover, the RTF_LLINFO flag which would be set below - * would annoy the ndp(8) command. - */ - return; - } - - if (req == RTM_RESOLVE && - (nd6_need_cache(ifp) == 0 || /* stf case */ - !nd6_is_new_addr_neighbor((struct sockaddr_in6 *)rt_key(rt), - ifp))) { - /* - * FreeBSD and BSD/OS often make a cloned host route based - * on a less-specific route (e.g. the default route). - * If the less specific route does not have a "gateway" - * (this is the case when the route just goes to a p2p or an - * stf interface), we'll mistakenly make a neighbor cache for - * the host route, and will see strange neighbor solicitation - * for the corresponding destination. In order to avoid the - * confusion, we check if the destination of the route is - * a neighbor in terms of neighbor discovery, and stop the - * process if not. Additionally, we remove the LLINFO flag - * so that ndp(8) will not try to get the neighbor information - * of the destination. 
- */ - rt->rt_flags &= ~RTF_LLINFO; - return; - } - - switch (req) { - case RTM_ADD: - /* - * There is no backward compatibility :) - * - * if ((rt->rt_flags & RTF_HOST) == 0 && - * SIN(rt_mask(rt))->sin_addr.s_addr != 0xffffffff) - * rt->rt_flags |= RTF_CLONING; - */ - if ((rt->rt_flags & RTF_CLONING) || - ((rt->rt_flags & RTF_LLINFO) && ln == NULL)) { - /* - * Case 1: This route should come from a route to - * interface (RTF_CLONING case) or the route should be - * treated as on-link but is currently not - * (RTF_LLINFO && ln == NULL case). - */ - rt_setgate(rt, rt_key(rt), - (struct sockaddr *)&null_sdl); - gate = rt->rt_gateway; - SDL(gate)->sdl_type = ifp->if_type; - SDL(gate)->sdl_index = ifp->if_index; - if (ln) - nd6_llinfo_settimer(ln, 0); - if ((rt->rt_flags & RTF_CLONING) != 0) - break; - } - /* - * In IPv4 code, we try to annonuce new RTF_ANNOUNCE entry here. - * We don't do that here since llinfo is not ready yet. - * - * There are also couple of other things to be discussed: - * - unsolicited NA code needs improvement beforehand - * - RFC2461 says we MAY send multicast unsolicited NA - * (7.2.6 paragraph 4), however, it also says that we - * SHOULD provide a mechanism to prevent multicast NA storm. - * we don't have anything like it right now. - * note that the mechanism needs a mutual agreement - * between proxies, which means that we need to implement - * a new protocol, or a new kludge. - * - from RFC2461 6.2.4, host MUST NOT send an unsolicited NA. - * we need to check ip6forwarding before sending it. - * (or should we allow proxy ND configuration only for - * routers? there's no mention about proxy ND from hosts) - */ - /* FALLTHROUGH */ - case RTM_RESOLVE: - if ((ifp->if_flags & (IFF_POINTOPOINT | IFF_LOOPBACK)) == 0) { - /* - * Address resolution isn't necessary for a point to - * point link, so we can skip this test for a p2p link. - */ - if (gate->sa_family != AF_LINK || - gate->sa_len < sizeof(null_sdl)) { - log(LOG_DEBUG, - "nd6_rtrequest: bad gateway value: %s\n", - if_name(ifp)); - break; - } - SDL(gate)->sdl_type = ifp->if_type; - SDL(gate)->sdl_index = ifp->if_index; - } - if (ln != NULL) - break; /* This happens on a route change */ - /* - * Case 2: This route may come from cloning, or a manual route - * add with a LL address. - */ - R_Malloc(ln, struct llinfo_nd6 *, sizeof(*ln)); - rt->rt_llinfo = (caddr_t)ln; - if (ln == NULL) { - log(LOG_DEBUG, "nd6_rtrequest: malloc failed\n"); - break; - } - V_nd6_inuse++; - V_nd6_allocated++; - bzero(ln, sizeof(*ln)); - RT_ADDREF(rt); - ln->ln_rt = rt; - callout_init(&ln->ln_timer_ch, 0); - - /* this is required for "ndp" command. - shin */ - if (req == RTM_ADD) { - /* - * gate should have some valid AF_LINK entry, - * and ln->ln_expire should have some lifetime - * which is specified by ndp command. - */ - ln->ln_state = ND6_LLINFO_REACHABLE; - ln->ln_byhint = 0; - } else { - /* - * When req == RTM_RESOLVE, rt is created and - * initialized in rtrequest(), so rt_expire is 0. - */ - ln->ln_state = ND6_LLINFO_NOSTATE; - nd6_llinfo_settimer(ln, 0); - } - rt->rt_flags |= RTF_LLINFO; - ln->ln_next = V_llinfo_nd6.ln_next; - V_llinfo_nd6.ln_next = ln; - ln->ln_prev = &V_llinfo_nd6; - ln->ln_next->ln_prev = ln; - - /* - * check if rt_key(rt) is one of my address assigned - * to the interface. 
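With nd6_rtrequest() removed here, neighbor-cache entries are no longer created as a side effect of RTM_ADD/RTM_RESOLVE route cloning; they are created on demand by passing ND6_CREATE to nd6_lookup(), as nd6_cache_lladdr() and nd6_output_lle() later in this patch do. A sketch of that creation path, with a hypothetical helper name and the assumption that the caller holds no lltable locks:

	/*
	 * Hypothetical helper: find or create the cache entry for dst on ifp.
	 * ND6_EXCLUSIVE returns the entry write-locked so the caller can fill
	 * in link-layer state before releasing it with LLE_WUNLOCK().
	 */
	static struct llentry *
	nd6_get_or_create(struct in6_addr *dst, struct ifnet *ifp)
	{
		struct llentry *ln;

		IF_AFDATA_LOCK(ifp);
		ln = nd6_lookup(dst, ND6_CREATE | ND6_EXCLUSIVE, ifp);
		IF_AFDATA_UNLOCK(ifp);
		return (ln);	/* write-locked, or NULL if allocation failed */
	}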
- */ - ifa = (struct ifaddr *)in6ifa_ifpwithaddr(rt->rt_ifp, - &SIN6(rt_key(rt))->sin6_addr); - if (ifa) { - caddr_t macp = nd6_ifptomac(ifp); - nd6_llinfo_settimer(ln, -1); - ln->ln_state = ND6_LLINFO_REACHABLE; - ln->ln_byhint = 0; - if (macp) { - bcopy(macp, LLADDR(SDL(gate)), ifp->if_addrlen); - SDL(gate)->sdl_alen = ifp->if_addrlen; - } - if (V_nd6_useloopback) { - rt->rt_ifp = &V_loif[0]; /* XXX */ - /* - * Make sure rt_ifa be equal to the ifaddr - * corresponding to the address. - * We need this because when we refer - * rt_ifa->ia6_flags in ip6_input, we assume - * that the rt_ifa points to the address instead - * of the loopback address. - */ - if (ifa != rt->rt_ifa) { - IFAFREE(rt->rt_ifa); - IFAREF(ifa); - rt->rt_ifa = ifa; - } - } - } else if (rt->rt_flags & RTF_ANNOUNCE) { - nd6_llinfo_settimer(ln, -1); - ln->ln_state = ND6_LLINFO_REACHABLE; - ln->ln_byhint = 0; - - /* join solicited node multicast for proxy ND */ - if (ifp->if_flags & IFF_MULTICAST) { - struct in6_addr llsol; - int error; - - llsol = SIN6(rt_key(rt))->sin6_addr; - llsol.s6_addr32[0] = IPV6_ADDR_INT32_MLL; - llsol.s6_addr32[1] = 0; - llsol.s6_addr32[2] = htonl(1); - llsol.s6_addr8[12] = 0xff; - if (in6_setscope(&llsol, ifp, NULL)) - break; - if (in6_addmulti(&llsol, ifp, - &error, 0) == NULL) { - char ip6buf[INET6_ADDRSTRLEN]; - nd6log((LOG_ERR, "%s: failed to join " - "%s (errno=%d)\n", if_name(ifp), - ip6_sprintf(ip6buf, &llsol), - error)); - } - } - } - break; - - case RTM_DELETE: - if (ln == NULL) - break; - /* leave from solicited node multicast for proxy ND */ - if ((rt->rt_flags & RTF_ANNOUNCE) != 0 && - (ifp->if_flags & IFF_MULTICAST) != 0) { - struct in6_addr llsol; - struct in6_multi *in6m; - - llsol = SIN6(rt_key(rt))->sin6_addr; - llsol.s6_addr32[0] = IPV6_ADDR_INT32_MLL; - llsol.s6_addr32[1] = 0; - llsol.s6_addr32[2] = htonl(1); - llsol.s6_addr8[12] = 0xff; - if (in6_setscope(&llsol, ifp, NULL) == 0) { - IN6_LOOKUP_MULTI(llsol, ifp, in6m); - if (in6m) - in6_delmulti(in6m); - } else - ; /* XXX: should not happen. bark here? */ - } - V_nd6_inuse--; - ln->ln_next->ln_prev = ln->ln_prev; - ln->ln_prev->ln_next = ln->ln_next; - ln->ln_prev = NULL; - nd6_llinfo_settimer(ln, -1); - RT_REMREF(rt); - rt->rt_llinfo = 0; - rt->rt_flags &= ~RTF_LLINFO; - clear_llinfo_pqueue(ln); - Free((caddr_t)ln); - } -} - int nd6_ioctl(u_long cmd, caddr_t data, struct ifnet *ifp) { INIT_VNET_INET6(ifp->if_vnet); struct in6_drlist *drl = (struct in6_drlist *)data; struct in6_oprlist *oprl = (struct in6_oprlist *)data; struct in6_ndireq *ndi = (struct in6_ndireq *)data; struct in6_nbrinfo *nbi = (struct in6_nbrinfo *)data; struct in6_ndifreq *ndif = (struct in6_ndifreq *)data; struct nd_defrouter *dr; struct nd_prefix *pr; - struct rtentry *rt; int i = 0, error = 0; int s; switch (cmd) { case SIOCGDRLST_IN6: /* * obsolete API, use sysctl under net.inet6.icmp6 */ bzero(drl, sizeof(*drl)); s = splnet(); dr = TAILQ_FIRST(&V_nd_defrouter); while (dr && i < DRLSTSIZ) { drl->defrouter[i].rtaddr = dr->rtaddr; in6_clearscope(&drl->defrouter[i].rtaddr); drl->defrouter[i].flags = dr->flags; drl->defrouter[i].rtlifetime = dr->rtlifetime; drl->defrouter[i].expire = dr->expire; drl->defrouter[i].if_index = dr->ifp->if_index; i++; dr = TAILQ_NEXT(dr, dr_entry); } splx(s); break; case SIOCGPRLST_IN6: /* * obsolete API, use sysctl under net.inet6.icmp6 * * XXX the structure in6_prlist was changed in backward- * incompatible manner. in6_oprlist is used for SIOCGPRLST_IN6, * in6_prlist is used for nd6_sysctl() - fill_prlist(). 
*/ /* * XXX meaning of fields, especialy "raflags", is very * differnet between RA prefix list and RR/static prefix list. * how about separating ioctls into two? */ bzero(oprl, sizeof(*oprl)); s = splnet(); pr = V_nd_prefix.lh_first; while (pr && i < PRLSTSIZ) { struct nd_pfxrouter *pfr; int j; oprl->prefix[i].prefix = pr->ndpr_prefix.sin6_addr; oprl->prefix[i].raflags = pr->ndpr_raf; oprl->prefix[i].prefixlen = pr->ndpr_plen; oprl->prefix[i].vltime = pr->ndpr_vltime; oprl->prefix[i].pltime = pr->ndpr_pltime; oprl->prefix[i].if_index = pr->ndpr_ifp->if_index; if (pr->ndpr_vltime == ND6_INFINITE_LIFETIME) oprl->prefix[i].expire = 0; else { time_t maxexpire; /* XXX: we assume time_t is signed. */ maxexpire = (-1) & ~((time_t)1 << ((sizeof(maxexpire) * 8) - 1)); if (pr->ndpr_vltime < maxexpire - pr->ndpr_lastupdate) { oprl->prefix[i].expire = pr->ndpr_lastupdate + pr->ndpr_vltime; } else oprl->prefix[i].expire = maxexpire; } pfr = pr->ndpr_advrtrs.lh_first; j = 0; while (pfr) { if (j < DRLSTSIZ) { #define RTRADDR oprl->prefix[i].advrtr[j] RTRADDR = pfr->router->rtaddr; in6_clearscope(&RTRADDR); #undef RTRADDR } j++; pfr = pfr->pfr_next; } oprl->prefix[i].advrtrs = j; oprl->prefix[i].origin = PR_ORIG_RA; i++; pr = pr->ndpr_next; } splx(s); break; case OSIOCGIFINFO_IN6: #define ND ndi->ndi /* XXX: old ndp(8) assumes a positive value for linkmtu. */ bzero(&ND, sizeof(ND)); ND.linkmtu = IN6_LINKMTU(ifp); ND.maxmtu = ND_IFINFO(ifp)->maxmtu; ND.basereachable = ND_IFINFO(ifp)->basereachable; ND.reachable = ND_IFINFO(ifp)->reachable; ND.retrans = ND_IFINFO(ifp)->retrans; ND.flags = ND_IFINFO(ifp)->flags; ND.recalctm = ND_IFINFO(ifp)->recalctm; ND.chlim = ND_IFINFO(ifp)->chlim; break; case SIOCGIFINFO_IN6: ND = *ND_IFINFO(ifp); break; case SIOCSIFINFO_IN6: /* * used to change host variables from userland. * intented for a use on router to reflect RA configurations. */ /* 0 means 'unspecified' */ if (ND.linkmtu != 0) { if (ND.linkmtu < IPV6_MMTU || ND.linkmtu > IN6_LINKMTU(ifp)) { error = EINVAL; break; } ND_IFINFO(ifp)->linkmtu = ND.linkmtu; } if (ND.basereachable != 0) { int obasereachable = ND_IFINFO(ifp)->basereachable; ND_IFINFO(ifp)->basereachable = ND.basereachable; if (ND.basereachable != obasereachable) ND_IFINFO(ifp)->reachable = ND_COMPUTE_RTIME(ND.basereachable); } if (ND.retrans != 0) ND_IFINFO(ifp)->retrans = ND.retrans; if (ND.chlim != 0) ND_IFINFO(ifp)->chlim = ND.chlim; /* FALLTHROUGH */ case SIOCSIFINFO_FLAGS: ND_IFINFO(ifp)->flags = ND.flags; break; #undef ND case SIOCSNDFLUSH_IN6: /* XXX: the ioctl name is confusing... */ /* sync kernel routing table with the default router list */ defrouter_reset(); defrouter_select(); break; case SIOCSPFXFLUSH_IN6: { /* flush all the prefix advertised by routers */ struct nd_prefix *pr, *next; s = splnet(); for (pr = V_nd_prefix.lh_first; pr; pr = next) { struct in6_ifaddr *ia, *ia_next; next = pr->ndpr_next; if (IN6_IS_ADDR_LINKLOCAL(&pr->ndpr_prefix.sin6_addr)) continue; /* XXX */ /* do we really have to remove addresses as well? */ for (ia = V_in6_ifaddr; ia; ia = ia_next) { /* ia might be removed. keep the next ptr. 
*/ ia_next = ia->ia_next; if ((ia->ia6_flags & IN6_IFF_AUTOCONF) == 0) continue; if (ia->ia6_ndpr == pr) in6_purgeaddr(&ia->ia_ifa); } prelist_remove(pr); } splx(s); break; } case SIOCSRTRFLUSH_IN6: { /* flush all the default routers */ struct nd_defrouter *dr, *next; s = splnet(); defrouter_reset(); for (dr = TAILQ_FIRST(&V_nd_defrouter); dr; dr = next) { next = TAILQ_NEXT(dr, dr_entry); defrtrlist_del(dr); } defrouter_select(); splx(s); break; } case SIOCGNBRINFO_IN6: { - struct llinfo_nd6 *ln; + struct llentry *ln; struct in6_addr nb_addr = nbi->addr; /* make local for safety */ if ((error = in6_setscope(&nb_addr, ifp, NULL)) != 0) return (error); - s = splnet(); - if ((rt = nd6_lookup(&nb_addr, 0, ifp)) == NULL) { + IF_AFDATA_LOCK(ifp); + ln = nd6_lookup(&nb_addr, 0, ifp); + IF_AFDATA_UNLOCK(ifp); + + if (ln == NULL) { error = EINVAL; - splx(s); break; } - ln = (struct llinfo_nd6 *)rt->rt_llinfo; nbi->state = ln->ln_state; - nbi->asked = ln->ln_asked; + nbi->asked = ln->la_asked; nbi->isrouter = ln->ln_router; - nbi->expire = ln->ln_expire; - splx(s); - + nbi->expire = ln->la_expire; + LLE_RUNLOCK(ln); break; } case SIOCGDEFIFACE_IN6: /* XXX: should be implemented as a sysctl? */ ndif->ifindex = V_nd6_defifindex; break; case SIOCSDEFIFACE_IN6: /* XXX: should be implemented as a sysctl? */ return (nd6_setdefaultiface(ndif->ifindex)); } return (error); } /* * Create neighbor cache entry and cache link-layer address, * on reception of inbound ND6 packets. (RS/RA/NS/redirect) * * type - ICMP6 type * code - type dependent information + * + * XXXXX + * The caller of this function already acquired the ndp + * cache table lock because the cache entry is returned. */ -struct rtentry * +struct llentry * nd6_cache_lladdr(struct ifnet *ifp, struct in6_addr *from, char *lladdr, int lladdrlen, int type, int code) { INIT_VNET_INET6(curvnet); - struct rtentry *rt = NULL; - struct llinfo_nd6 *ln = NULL; + struct llentry *ln = NULL; int is_newentry; - struct sockaddr_dl *sdl = NULL; int do_update; int olladdr; int llchange; + int flags = 0; int newstate = 0; + struct sockaddr_in6 sin6; + struct mbuf *chain = NULL; + IF_AFDATA_UNLOCK_ASSERT(ifp); + if (ifp == NULL) panic("ifp == NULL in nd6_cache_lladdr"); if (from == NULL) panic("from == NULL in nd6_cache_lladdr"); /* nothing must be updated for unspecified address */ if (IN6_IS_ADDR_UNSPECIFIED(from)) return NULL; /* * Validation about ifp->if_addrlen and lladdrlen must be done in * the caller. * * XXX If the link does not have link-layer adderss, what should * we do? (ifp->if_addrlen == 0) * Spec says nothing in sections for RA, RS and NA. There's small * description on it in NS section (RFC 2461 7.2.3). */ - - rt = nd6_lookup(from, 0, ifp); - if (rt == NULL) { - rt = nd6_lookup(from, 1, ifp); + flags |= lladdr ? 
ND6_EXCLUSIVE : 0; + IF_AFDATA_LOCK(ifp); + ln = nd6_lookup(from, flags, ifp); + if (ln) + IF_AFDATA_UNLOCK(ifp); + if (ln == NULL) { + flags |= LLE_EXCLUSIVE; + ln = nd6_lookup(from, flags |ND6_CREATE, ifp); + IF_AFDATA_UNLOCK(ifp); is_newentry = 1; } else { /* do nothing if static ndp is set */ - if (rt->rt_flags & RTF_STATIC) - return NULL; + if (ln->la_flags & LLE_STATIC) + goto done; is_newentry = 0; } - - if (rt == NULL) - return NULL; - if ((rt->rt_flags & (RTF_GATEWAY | RTF_LLINFO)) != RTF_LLINFO) { -fail: - (void)nd6_free(rt, 0); - return NULL; - } - ln = (struct llinfo_nd6 *)rt->rt_llinfo; if (ln == NULL) - goto fail; - if (rt->rt_gateway == NULL) - goto fail; - if (rt->rt_gateway->sa_family != AF_LINK) - goto fail; - sdl = SDL(rt->rt_gateway); + return (NULL); - olladdr = (sdl->sdl_alen) ? 1 : 0; + olladdr = (ln->la_flags & LLE_VALID) ? 1 : 0; if (olladdr && lladdr) { - if (bcmp(lladdr, LLADDR(sdl), ifp->if_addrlen)) - llchange = 1; - else - llchange = 0; + llchange = bcmp(lladdr, &ln->ll_addr, + ifp->if_addrlen); } else llchange = 0; /* * newentry olladdr lladdr llchange (*=record) * 0 n n -- (1) * 0 y n -- (2) * 0 n y -- (3) * STALE * 0 y y n (4) * * 0 y y y (5) * STALE * 1 -- n -- (6) NOSTATE(= PASSIVE) * 1 -- y -- (7) * STALE */ if (lladdr) { /* (3-5) and (7) */ /* * Record source link-layer address * XXX is it dependent to ifp->if_type? */ - sdl->sdl_alen = ifp->if_addrlen; - bcopy(lladdr, LLADDR(sdl), ifp->if_addrlen); + bcopy(lladdr, &ln->ll_addr, ifp->if_addrlen); + ln->la_flags |= LLE_VALID; } if (!is_newentry) { if ((!olladdr && lladdr != NULL) || /* (3) */ (olladdr && lladdr != NULL && llchange)) { /* (5) */ do_update = 1; newstate = ND6_LLINFO_STALE; } else /* (1-2,4) */ do_update = 0; } else { do_update = 1; if (lladdr == NULL) /* (6) */ newstate = ND6_LLINFO_NOSTATE; else /* (7) */ newstate = ND6_LLINFO_STALE; } if (do_update) { /* * Update the state of the neighbor cache. */ ln->ln_state = newstate; if (ln->ln_state == ND6_LLINFO_STALE) { /* * XXX: since nd6_output() below will cause * state tansition to DELAY and reset the timer, * we must set the timer now, although it is actually * meaningless. */ - nd6_llinfo_settimer(ln, (long)V_nd6_gctimer * hz); + nd6_llinfo_settimer_locked(ln, (long)V_nd6_gctimer * hz); - if (ln->ln_hold) { + if (ln->la_hold) { struct mbuf *m_hold, *m_hold_next; /* - * reset the ln_hold in advance, to explicitly - * prevent a ln_hold lookup in nd6_output() + * reset the la_hold in advance, to explicitly + * prevent a la_hold lookup in nd6_output() * (wouldn't happen, though...) */ - for (m_hold = ln->ln_hold, ln->ln_hold = NULL; + for (m_hold = ln->la_hold, ln->la_hold = NULL; m_hold; m_hold = m_hold_next) { m_hold_next = m_hold->m_nextpkt; m_hold->m_nextpkt = NULL; /* * we assume ifp is not a p2p here, so * just set the 2nd argument as the * 1st one. */ - nd6_output(ifp, ifp, m_hold, - (struct sockaddr_in6 *)rt_key(rt), - rt); + nd6_output_lle(ifp, ifp, m_hold, L3_ADDR_SIN6(ln), NULL, ln, &chain); } + if (chain) + memcpy(&sin6, L3_ADDR_SIN6(ln), sizeof(sin6)); } } else if (ln->ln_state == ND6_LLINFO_INCOMPLETE) { /* probe right away */ - nd6_llinfo_settimer((void *)ln, 0); + nd6_llinfo_settimer_locked((void *)ln, 0); } } /* * ICMP6 type dependent behavior. * * NS: clear IsRouter if new entry * RS: clear IsRouter * RA: set IsRouter if there's lladdr * redir: clear IsRouter if new entry * * RA case, (1): * The spec says that we must set IsRouter in the following cases: * - If lladdr exist, set IsRouter. This means (1-5). 
* - If it is old entry (!newentry), set IsRouter. This means (7). * So, based on the spec, in (1-5) and (7) cases we must set IsRouter. * A quetion arises for (1) case. (1) case has no lladdr in the * neighbor cache, this is similar to (6). * This case is rare but we figured that we MUST NOT set IsRouter. * * newentry olladdr lladdr llchange NS RS RA redir * D R * 0 n n -- (1) c ? s * 0 y n -- (2) c s s * 0 n y -- (3) c s s * 0 y y n (4) c s s * 0 y y y (5) c s s * 1 -- n -- (6) c c c s * 1 -- y -- (7) c c s c s * * (c=clear s=set) */ switch (type & 0xff) { case ND_NEIGHBOR_SOLICIT: /* * New entry must have is_router flag cleared. */ if (is_newentry) /* (6-7) */ ln->ln_router = 0; break; case ND_REDIRECT: /* * If the icmp is a redirect to a better router, always set the * is_router flag. Otherwise, if the entry is newly created, * clear the flag. [RFC 2461, sec 8.3] */ if (code == ND_REDIRECT_ROUTER) ln->ln_router = 1; else if (is_newentry) /* (6-7) */ ln->ln_router = 0; break; case ND_ROUTER_SOLICIT: /* * is_router flag must always be cleared. */ ln->ln_router = 0; break; case ND_ROUTER_ADVERT: /* * Mark an entry with lladdr as a router. */ if ((!is_newentry && (olladdr || lladdr)) || /* (2-5) */ (is_newentry && lladdr)) { /* (7) */ ln->ln_router = 1; } break; } + if (ln) { + if (flags & ND6_EXCLUSIVE) + LLE_WUNLOCK(ln); + else + LLE_RUNLOCK(ln); + if (ln->la_flags & LLE_STATIC) + ln = NULL; + } + if (chain) + nd6_output_flush(ifp, ifp, chain, &sin6, NULL); + /* * When the link-layer address of a router changes, select the * best router again. In particular, when the neighbor entry is newly * created, it might affect the selection policy. * Question: can we restrict the first condition to the "is_newentry" * case? * XXX: when we hear an RA from a new router with the link-layer * address option, defrouter_select() is called twice, since * defrtrlist_update called the function as well. However, I believe * we can compromise the overhead, since it only happens the first * time. * XXX: although defrouter_select() should not have a bad effect * for those are not autoconfigured hosts, we explicitly avoid such * cases for safety. */ - if (do_update && ln->ln_router && !V_ip6_forwarding && V_ip6_accept_rtadv) + if (do_update && ln->ln_router && !V_ip6_forwarding && V_ip6_accept_rtadv) { + /* + * guaranteed recursion + */ defrouter_select(); - - return rt; + } + + return (ln); +done: + if (ln) { + if (flags & ND6_EXCLUSIVE) + LLE_WUNLOCK(ln); + else + LLE_RUNLOCK(ln); + if (ln->la_flags & LLE_STATIC) + ln = NULL; + } + return (ln); } static void nd6_slowtimo(void *arg) { CURVNET_SET((struct vnet *) arg); INIT_VNET_NET((struct vnet *) arg); INIT_VNET_INET6((struct vnet *) arg); struct nd_ifinfo *nd6if; struct ifnet *ifp; callout_reset(&V_nd6_slowtimo_ch, ND6_SLOWTIMER_INTERVAL * hz, nd6_slowtimo, NULL); IFNET_RLOCK(); for (ifp = TAILQ_FIRST(&V_ifnet); ifp; ifp = TAILQ_NEXT(ifp, if_list)) { nd6if = ND_IFINFO(ifp); if (nd6if->basereachable && /* already initialized */ (nd6if->recalctm -= ND6_SLOWTIMER_INTERVAL) <= 0) { /* * Since reachable time rarely changes by router * advertisements, we SHOULD insure that a new random * value gets recomputed at least once every few hours. 
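The tail of nd6_cache_lladdr() above also shows the new rule for packets parked on la_hold while an address was unresolved: they are no longer pushed to if_output() with the entry lock held. nd6_output_lle() collects them on a local chain and nd6_output_flush() transmits that chain only after the llentry has been unlocked. A condensed sketch of the sequence, assuming the caller holds the entry write-locked; names other than the nd6_* functions are illustrative:

	/* Drain ln->la_hold without transmitting under the entry lock. */
	static void
	example_drain_held_packets(struct ifnet *ifp, struct llentry *ln)
	{
		struct mbuf *m, *next, *chain = NULL;
		struct sockaddr_in6 dst;

		LLE_WLOCK_ASSERT(ln);
		memcpy(&dst, L3_ADDR_SIN6(ln), sizeof(dst));
		for (m = ln->la_hold, ln->la_hold = NULL; m != NULL; m = next) {
			next = m->m_nextpkt;
			m->m_nextpkt = NULL;
			/* queues m on 'chain' instead of sending it */
			nd6_output_lle(ifp, ifp, m, L3_ADDR_SIN6(ln), NULL, ln, &chain);
		}
		LLE_WUNLOCK(ln);
		if (chain != NULL)
			nd6_output_flush(ifp, ifp, chain, &dst, NULL);	/* now send */
	}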
* (RFC 2461, 6.3.4) */ nd6if->recalctm = V_nd6_recalc_reachtm_interval; nd6if->reachable = ND_COMPUTE_RTIME(nd6if->basereachable); } } IFNET_RUNLOCK(); CURVNET_RESTORE(); } -#define senderr(e) { error = (e); goto bad;} int nd6_output(struct ifnet *ifp, struct ifnet *origifp, struct mbuf *m0, struct sockaddr_in6 *dst, struct rtentry *rt0) { + + return (nd6_output_lle(ifp, origifp, m0, dst, rt0, NULL, NULL)); +} + + +/* + * Note that I'm not enforcing any global serialization + * lle state or asked changes here as the logic is too + * complicated to avoid having to always acquire an exclusive + * lock + * KMM + * + */ +#define senderr(e) { error = (e); goto bad;} + +int +nd6_output_lle(struct ifnet *ifp, struct ifnet *origifp, struct mbuf *m0, + struct sockaddr_in6 *dst, struct rtentry *rt0, struct llentry *lle, + struct mbuf **tail) +{ INIT_VNET_INET6(curvnet); struct mbuf *m = m0; struct rtentry *rt = rt0; - struct sockaddr_in6 *gw6 = NULL; - struct llinfo_nd6 *ln = NULL; + struct llentry *ln = lle; int error = 0; + int flags = 0; +#ifdef INVARIANTS + if (lle) { + + LLE_WLOCK_ASSERT(lle); + + KASSERT(tail != NULL, (" lle locked but no tail pointer passed")); + } +#endif if (IN6_IS_ADDR_MULTICAST(&dst->sin6_addr)) goto sendpkt; if (nd6_need_cache(ifp) == 0) goto sendpkt; /* * next hop determination. This routine is derived from ether_output. */ - /* NB: the locking here is tortuous... */ - if (rt != NULL) - RT_LOCK(rt); -again: - if (rt != NULL) { - if ((rt->rt_flags & RTF_UP) == 0) { - RT_UNLOCK(rt); - rt0 = rt = rtalloc1((struct sockaddr *)dst, 1, 0UL); - if (rt != NULL) { - RT_REMREF(rt); - if (rt->rt_ifp != ifp) - /* - * XXX maybe we should update ifp too, - * but the original code didn't and I - * don't know what is correct here. - */ - goto again; - } else - senderr(EHOSTUNREACH); - } - if (rt->rt_flags & RTF_GATEWAY) { - gw6 = (struct sockaddr_in6 *)rt->rt_gateway; - - /* - * We skip link-layer address resolution and NUD - * if the gateway is not a neighbor from ND point - * of view, regardless of the value of nd_ifinfo.flags. - * The second condition is a bit tricky; we skip - * if the gateway is our own address, which is - * sometimes used to install a route to a p2p link. - */ - if (!nd6_is_addr_neighbor(gw6, ifp) || - in6ifa_ifpwithaddr(ifp, &gw6->sin6_addr)) { - RT_UNLOCK(rt); - /* - * We allow this kind of tricky route only - * when the outgoing interface is p2p. - * XXX: we may need a more generic rule here. - */ - if ((ifp->if_flags & IFF_POINTOPOINT) == 0) - senderr(EHOSTUNREACH); - - goto sendpkt; - } - - if (rt->rt_gwroute == NULL) - goto lookup; - rt = rt->rt_gwroute; - RT_LOCK(rt); /* NB: gwroute */ - if ((rt->rt_flags & RTF_UP) == 0) { - RTFREE_LOCKED(rt); /* unlock gwroute */ - rt = rt0; - rt0->rt_gwroute = NULL; - lookup: - RT_UNLOCK(rt0); - rt = rtalloc1(rt->rt_gateway, 1, 0UL); - if (rt == rt0) { - RT_REMREF(rt0); - RT_UNLOCK(rt0); - senderr(EHOSTUNREACH); - } - RT_LOCK(rt0); - if (rt0->rt_gwroute != NULL) - RTFREE(rt0->rt_gwroute); - rt0->rt_gwroute = rt; - if (rt == NULL) { - RT_UNLOCK(rt0); - senderr(EHOSTUNREACH); - } - } - RT_UNLOCK(rt0); - } - RT_UNLOCK(rt); - } - /* * Address resolution or Neighbor Unreachability Detection * for the next hop. * At this point, the destination of the packet must be a unicast * or an anycast address(i.e. not a multicast). 
*/ - /* Look up the neighbor cache for the nexthop */ - if (rt && (rt->rt_flags & RTF_LLINFO) != 0) - ln = (struct llinfo_nd6 *)rt->rt_llinfo; - else { - /* - * Since nd6_is_addr_neighbor() internally calls nd6_lookup(), - * the condition below is not very efficient. But we believe - * it is tolerable, because this should be a rare case. - */ - if (nd6_is_addr_neighbor(dst, ifp) && - (rt = nd6_lookup(&dst->sin6_addr, 1, ifp)) != NULL) - ln = (struct llinfo_nd6 *)rt->rt_llinfo; - } - if (ln == NULL || rt == NULL) { + flags = ((m != NULL) || (lle != NULL)) ? LLE_EXCLUSIVE : 0; + if (ln == NULL) { + retry: + IF_AFDATA_LOCK(rt->rt_ifp); + ln = lla_lookup(LLTABLE6(ifp), flags, (struct sockaddr *)dst); + IF_AFDATA_UNLOCK(rt->rt_ifp); + if ((ln == NULL) && nd6_is_addr_neighbor(dst, ifp)) { + /* + * Since nd6_is_addr_neighbor() internally calls nd6_lookup(), + * the condition below is not very efficient. But we believe + * it is tolerable, because this should be a rare case. + */ + flags = ND6_CREATE | (m ? ND6_EXCLUSIVE : 0); + IF_AFDATA_LOCK(rt->rt_ifp); + ln = nd6_lookup(&dst->sin6_addr, flags, ifp); + IF_AFDATA_UNLOCK(rt->rt_ifp); + } + } + if (ln == NULL) { if ((ifp->if_flags & IFF_POINTOPOINT) == 0 && !(ND_IFINFO(ifp)->flags & ND6_IFF_PERFORMNUD)) { char ip6buf[INET6_ADDRSTRLEN]; log(LOG_DEBUG, "nd6_output: can't allocate llinfo for %s " "(ln=%p, rt=%p)\n", ip6_sprintf(ip6buf, &dst->sin6_addr), ln, rt); senderr(EIO); /* XXX: good error? */ } - goto sendpkt; /* send anyway */ } /* We don't have to do link-layer address resolution on a p2p link. */ if ((ifp->if_flags & IFF_POINTOPOINT) != 0 && ln->ln_state < ND6_LLINFO_REACHABLE) { + if ((flags & LLE_EXCLUSIVE) == 0) { + flags |= LLE_EXCLUSIVE; + goto retry; + } ln->ln_state = ND6_LLINFO_STALE; - nd6_llinfo_settimer(ln, (long)V_nd6_gctimer * hz); + nd6_llinfo_settimer_locked(ln, (long)V_nd6_gctimer * hz); } /* * The first time we send a packet to a neighbor whose entry is * STALE, we have to change the state to DELAY and a sets a timer to * expire in DELAY_FIRST_PROBE_TIME seconds to ensure do * neighbor unreachability detection on expiration. * (RFC 2461 7.3.3) */ if (ln->ln_state == ND6_LLINFO_STALE) { - ln->ln_asked = 0; + if ((flags & LLE_EXCLUSIVE) == 0) { + flags |= LLE_EXCLUSIVE; + LLE_RUNLOCK(ln); + goto retry; + } + ln->la_asked = 0; ln->ln_state = ND6_LLINFO_DELAY; - nd6_llinfo_settimer(ln, (long)V_nd6_delay * hz); + nd6_llinfo_settimer_locked(ln, (long)V_nd6_delay * hz); } /* * If the neighbor cache entry has a state other than INCOMPLETE * (i.e. its link-layer address is already resolved), just * send the packet. */ if (ln->ln_state > ND6_LLINFO_INCOMPLETE) goto sendpkt; /* * There is a neighbor cache entry, but no ethernet address * response yet. Append this latest packet to the end of the * packet queue in the mbuf, unless the number of the packet * does not exceed nd6_maxqueuelen. When it exceeds nd6_maxqueuelen, * the oldest packet in the queue will be removed. 
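nd6_output_lle() above never blocks to upgrade a read lock to a write lock; whenever it discovers it must modify the entry (state change, la_asked, la_hold) while holding only the read lock, it releases the entry, sets LLE_EXCLUSIVE and repeats the lookup ("goto retry"). A minimal sketch of that upgrade-by-retry idiom, using only calls that appear in this patch; the helper and label names are hypothetical:

	/*
	 * Hypothetical: demote dst's entry to STALE, starting with a read lock
	 * and re-looking it up exclusively only when a modification is needed.
	 */
	static void
	example_mark_stale(struct ifnet *ifp, struct sockaddr_in6 *dst)
	{
		INIT_VNET_INET6(curvnet);
		struct llentry *ln;
		int flags = 0;			/* start with a read lock */

	retry:
		IF_AFDATA_LOCK(ifp);
		ln = lla_lookup(LLTABLE6(ifp), flags, (struct sockaddr *)dst);
		IF_AFDATA_UNLOCK(ifp);
		if (ln == NULL)
			return;
		if (ln->ln_state == ND6_LLINFO_REACHABLE) {
			if ((flags & LLE_EXCLUSIVE) == 0) {
				/* must modify the entry: retry with a write lock */
				LLE_RUNLOCK(ln);
				flags = LLE_EXCLUSIVE;
				goto retry;
			}
			ln->ln_state = ND6_LLINFO_STALE;
			nd6_llinfo_settimer_locked(ln, (long)V_nd6_gctimer * hz);
		}
		if (flags & LLE_EXCLUSIVE)
			LLE_WUNLOCK(ln);
		else
			LLE_RUNLOCK(ln);
	}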
*/ if (ln->ln_state == ND6_LLINFO_NOSTATE) ln->ln_state = ND6_LLINFO_INCOMPLETE; - if (ln->ln_hold) { + + if ((flags & LLE_EXCLUSIVE) == 0) { + flags |= LLE_EXCLUSIVE; + LLE_RUNLOCK(ln); + goto retry; + } + if (ln->la_hold) { struct mbuf *m_hold; int i; - + i = 0; - for (m_hold = ln->ln_hold; m_hold; m_hold = m_hold->m_nextpkt) { + for (m_hold = ln->la_hold; m_hold; m_hold = m_hold->m_nextpkt) { i++; if (m_hold->m_nextpkt == NULL) { m_hold->m_nextpkt = m; break; } } while (i >= V_nd6_maxqueuelen) { - m_hold = ln->ln_hold; - ln->ln_hold = ln->ln_hold->m_nextpkt; + m_hold = ln->la_hold; + ln->la_hold = ln->la_hold->m_nextpkt; m_freem(m_hold); i--; } } else { - ln->ln_hold = m; + ln->la_hold = m; } - /* + * We did the lookup (no lle arg) so we + * need to do the unlock here + */ + if (lle == NULL) { + if (flags & LLE_EXCLUSIVE) + LLE_WUNLOCK(ln); + else + LLE_RUNLOCK(ln); + } + + /* * If there has been no NS for the neighbor after entering the * INCOMPLETE state, send the first solicitation. */ - if (!ND6_LLINFO_PERMANENT(ln) && ln->ln_asked == 0) { - ln->ln_asked++; + if (!ND6_LLINFO_PERMANENT(ln) && ln->la_asked == 0) { + ln->la_asked++; + nd6_llinfo_settimer(ln, (long)ND_IFINFO(ifp)->retrans * hz / 1000); nd6_ns_output(ifp, NULL, &dst->sin6_addr, ln, 0); } return (0); sendpkt: /* discard the packet if IPv6 operation is disabled on the interface */ if ((ND_IFINFO(ifp)->flags & ND6_IFF_IFDISABLED)) { error = ENETDOWN; /* better error? */ goto bad; } + /* + * ln is valid and the caller did not pass in + * an llentry + */ + if (ln && (lle == NULL)) { + if (flags & LLE_EXCLUSIVE) + LLE_WUNLOCK(ln); + else + LLE_RUNLOCK(ln); + } #ifdef MAC mac_netinet6_nd6_send(ifp, m); #endif + if (lle != NULL) { + if (*tail == NULL) + *tail = m; + else + (*tail)->m_nextpkt = m; + return (error); + } if ((ifp->if_flags & IFF_LOOPBACK) != 0) { return ((*ifp->if_output)(origifp, m, (struct sockaddr *)dst, rt)); } - return ((*ifp->if_output)(ifp, m, (struct sockaddr *)dst, rt)); + error = (*ifp->if_output)(ifp, m, (struct sockaddr *)dst, rt); + return (error); bad: + /* + * ln is valid and the caller did not pass in + * an llentry + */ + if (ln && (lle == NULL)) { + if (flags & LLE_EXCLUSIVE) + LLE_WUNLOCK(ln); + else + LLE_RUNLOCK(ln); + } if (m) m_freem(m); return (error); } #undef senderr + int +nd6_output_flush(struct ifnet *ifp, struct ifnet *origifp, struct mbuf *chain, + struct sockaddr_in6 *dst, struct rtentry *rt) +{ + struct mbuf *m, *m_head; + struct ifnet *outifp; + int error = 0; + + m_head = chain; + if ((ifp->if_flags & IFF_LOOPBACK) != 0) + outifp = origifp; + else + outifp = ifp; + + while (m_head) { + m = m_head; + m_head = m_head->m_nextpkt; + error = (*ifp->if_output)(ifp, m, (struct sockaddr *)dst, rt); + } + + /* + * XXX + * note that intermediate errors are blindly ignored - but this is + * the same convention as used with nd6_output when called by + * nd6_cache_lladdr + */ + return (error); +} + + +int nd6_need_cache(struct ifnet *ifp) { /* * XXX: we currently do not make neighbor cache on any interface * other than ARCnet, Ethernet, FDDI and GIF. * * RFC2893 says: * - unidirectional tunnels needs no ND */ switch (ifp->if_type) { case IFT_ARCNET: case IFT_ETHER: case IFT_FDDI: case IFT_IEEE1394: #ifdef IFT_L2VLAN case IFT_L2VLAN: #endif #ifdef IFT_IEEE80211 case IFT_IEEE80211: #endif #ifdef IFT_CARP case IFT_CARP: #endif case IFT_GIF: /* XXX need more cases? 
*/ case IFT_PPP: case IFT_TUNNEL: case IFT_BRIDGE: case IFT_PROPVIRTUAL: return (1); default: return (0); } } +/* + * the callers of this function need to be re-worked to drop + * the lle lock, drop here for now + */ int nd6_storelladdr(struct ifnet *ifp, struct rtentry *rt0, struct mbuf *m, - struct sockaddr *dst, u_char *desten) + struct sockaddr *dst, u_char *desten, struct llentry **lle) { - struct sockaddr_dl *sdl; - struct rtentry *rt; - int error; + struct llentry *ln; + *lle = NULL; + IF_AFDATA_UNLOCK_ASSERT(ifp); if (m->m_flags & M_MCAST) { int i; switch (ifp->if_type) { case IFT_ETHER: case IFT_FDDI: #ifdef IFT_L2VLAN case IFT_L2VLAN: #endif #ifdef IFT_IEEE80211 case IFT_IEEE80211: #endif case IFT_BRIDGE: case IFT_ISO88025: ETHER_MAP_IPV6_MULTICAST(&SIN6(dst)->sin6_addr, desten); return (0); case IFT_IEEE1394: /* * netbsd can use if_broadcastaddr, but we don't do so * to reduce # of ifdef. */ for (i = 0; i < ifp->if_addrlen; i++) desten[i] = ~0; return (0); case IFT_ARCNET: *desten = 0; return (0); default: m_freem(m); return (EAFNOSUPPORT); } } - if (rt0 == NULL) { + + /* + * the entry should have been created in nd6_store_lladdr + */ + IF_AFDATA_LOCK(ifp); + ln = lla_lookup(LLTABLE6(ifp), 0, dst); + IF_AFDATA_UNLOCK(ifp); + if ((ln == NULL) || !(ln->la_flags & LLE_VALID)) { + if (ln) + LLE_RUNLOCK(ln); /* this could happen, if we could not allocate memory */ m_freem(m); - return (ENOMEM); + return (1); } - error = rt_check(&rt, &rt0, dst); - if (error) { - m_freem(m); - return (error); - } - RT_UNLOCK(rt); - - if (rt->rt_gateway->sa_family != AF_LINK) { - printf("nd6_storelladdr: something odd happens\n"); - m_freem(m); - return (EINVAL); - } - sdl = SDL(rt->rt_gateway); - if (sdl->sdl_alen == 0) { - /* this should be impossible, but we bark here for debugging */ - printf("nd6_storelladdr: sdl_alen == 0\n"); - m_freem(m); - return (EINVAL); - } - - bcopy(LLADDR(sdl), desten, sdl->sdl_alen); + bcopy(&ln->ll_addr, desten, ifp->if_addrlen); + *lle = ln; + LLE_RUNLOCK(ln); + /* + * A *small* use after free race exists here + */ return (0); } -static void -clear_llinfo_pqueue(struct llinfo_nd6 *ln) +static void +clear_llinfo_pqueue(struct llentry *ln) { struct mbuf *m_hold, *m_hold_next; - for (m_hold = ln->ln_hold; m_hold; m_hold = m_hold_next) { + for (m_hold = ln->la_hold; m_hold; m_hold = m_hold_next) { m_hold_next = m_hold->m_nextpkt; m_hold->m_nextpkt = NULL; m_freem(m_hold); } - ln->ln_hold = NULL; + ln->la_hold = NULL; return; } static int nd6_sysctl_drlist(SYSCTL_HANDLER_ARGS); static int nd6_sysctl_prlist(SYSCTL_HANDLER_ARGS); #ifdef SYSCTL_DECL SYSCTL_DECL(_net_inet6_icmp6); #endif SYSCTL_NODE(_net_inet6_icmp6, ICMPV6CTL_ND6_DRLIST, nd6_drlist, CTLFLAG_RD, nd6_sysctl_drlist, ""); SYSCTL_NODE(_net_inet6_icmp6, ICMPV6CTL_ND6_PRLIST, nd6_prlist, CTLFLAG_RD, nd6_sysctl_prlist, ""); SYSCTL_V_INT(V_NET, vnet_inet6, _net_inet6_icmp6, ICMPV6CTL_ND6_MAXQLEN, nd6_maxqueuelen, CTLFLAG_RW, nd6_maxqueuelen, 1, ""); static int nd6_sysctl_drlist(SYSCTL_HANDLER_ARGS) { INIT_VNET_INET6(curvnet); int error; char buf[1024] __aligned(4); struct in6_defrouter *d, *de; struct nd_defrouter *dr; if (req->newptr) return EPERM; error = 0; for (dr = TAILQ_FIRST(&V_nd_defrouter); dr; dr = TAILQ_NEXT(dr, dr_entry)) { d = (struct in6_defrouter *)buf; de = (struct in6_defrouter *)(buf + sizeof(buf)); if (d + 1 <= de) { bzero(d, sizeof(*d)); d->rtaddr.sin6_family = AF_INET6; d->rtaddr.sin6_len = sizeof(d->rtaddr); d->rtaddr.sin6_addr = dr->rtaddr; error = sa6_recoverscope(&d->rtaddr); if (error != 0) 
return (error); d->flags = dr->flags; d->rtlifetime = dr->rtlifetime; d->expire = dr->expire; d->if_index = dr->ifp->if_index; } else panic("buffer too short"); error = SYSCTL_OUT(req, buf, sizeof(*d)); if (error) break; } return (error); } static int nd6_sysctl_prlist(SYSCTL_HANDLER_ARGS) { INIT_VNET_INET6(curvnet); int error; char buf[1024] __aligned(4); struct in6_prefix *p, *pe; struct nd_prefix *pr; char ip6buf[INET6_ADDRSTRLEN]; if (req->newptr) return EPERM; error = 0; for (pr = V_nd_prefix.lh_first; pr; pr = pr->ndpr_next) { u_short advrtrs; size_t advance; struct sockaddr_in6 *sin6, *s6; struct nd_pfxrouter *pfr; p = (struct in6_prefix *)buf; pe = (struct in6_prefix *)(buf + sizeof(buf)); if (p + 1 <= pe) { bzero(p, sizeof(*p)); sin6 = (struct sockaddr_in6 *)(p + 1); p->prefix = pr->ndpr_prefix; if (sa6_recoverscope(&p->prefix)) { log(LOG_ERR, "scope error in prefix list (%s)\n", ip6_sprintf(ip6buf, &p->prefix.sin6_addr)); /* XXX: press on... */ } p->raflags = pr->ndpr_raf; p->prefixlen = pr->ndpr_plen; p->vltime = pr->ndpr_vltime; p->pltime = pr->ndpr_pltime; p->if_index = pr->ndpr_ifp->if_index; if (pr->ndpr_vltime == ND6_INFINITE_LIFETIME) p->expire = 0; else { time_t maxexpire; /* XXX: we assume time_t is signed. */ maxexpire = (-1) & ~((time_t)1 << ((sizeof(maxexpire) * 8) - 1)); if (pr->ndpr_vltime < maxexpire - pr->ndpr_lastupdate) { p->expire = pr->ndpr_lastupdate + pr->ndpr_vltime; } else p->expire = maxexpire; } p->refcnt = pr->ndpr_refcnt; p->flags = pr->ndpr_stateflags; p->origin = PR_ORIG_RA; advrtrs = 0; for (pfr = pr->ndpr_advrtrs.lh_first; pfr; pfr = pfr->pfr_next) { if ((void *)&sin6[advrtrs + 1] > (void *)pe) { advrtrs++; continue; } s6 = &sin6[advrtrs]; bzero(s6, sizeof(*s6)); s6->sin6_family = AF_INET6; s6->sin6_len = sizeof(*sin6); s6->sin6_addr = pfr->router->rtaddr; if (sa6_recoverscope(s6)) { log(LOG_ERR, "scope error in " "prefix list (%s)\n", ip6_sprintf(ip6buf, &pfr->router->rtaddr)); } advrtrs++; } p->advrtrs = advrtrs; } else panic("buffer too short"); advance = sizeof(*p) + sizeof(*sin6) * advrtrs; error = SYSCTL_OUT(req, buf, advance); if (error) break; } return (error); } Index: head/sys/netinet6/nd6.h =================================================================== --- head/sys/netinet6/nd6.h (revision 186118) +++ head/sys/netinet6/nd6.h (revision 186119) @@ -1,443 +1,437 @@ /*- * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the project nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $KAME: nd6.h,v 1.76 2001/12/18 02:10:31 itojun Exp $ * $FreeBSD$ */ #ifndef _NETINET6_ND6_H_ #define _NETINET6_ND6_H_ /* see net/route.h, or net/if_inarp.h */ #ifndef RTF_ANNOUNCE #define RTF_ANNOUNCE RTF_PROTO2 #endif #include #include -struct llinfo_nd6 { - struct llinfo_nd6 *ln_next; - struct llinfo_nd6 *ln_prev; - struct rtentry *ln_rt; - struct mbuf *ln_hold; /* last packet until resolved/timeout */ - long ln_asked; /* number of queries already sent for this addr */ - u_long ln_expire; /* lifetime for NDP state transition */ - short ln_state; /* reachability state */ - short ln_router; /* 2^0: ND6 router bit */ - int ln_byhint; /* # of times we made it reachable by UL hint */ +struct llentry; - long ln_ntick; - struct callout ln_timer_ch; -}; - #define ND6_LLINFO_NOSTATE -2 /* * We don't need the WAITDELETE state any more, but we keep the definition * in a comment line instead of removing it. This is necessary to avoid * unintentionally reusing the value for another purpose, which might * affect backward compatibility with old applications. * (20000711 jinmei@kame.net) */ /* #define ND6_LLINFO_WAITDELETE -1 */ #define ND6_LLINFO_INCOMPLETE 0 #define ND6_LLINFO_REACHABLE 1 #define ND6_LLINFO_STALE 2 #define ND6_LLINFO_DELAY 3 #define ND6_LLINFO_PROBE 4 #define ND6_IS_LLINFO_PROBREACH(n) ((n)->ln_state > ND6_LLINFO_INCOMPLETE) -#define ND6_LLINFO_PERMANENT(n) (((n)->ln_expire == 0) && ((n)->ln_state > ND6_LLINFO_INCOMPLETE)) +#define ND6_LLINFO_PERMANENT(n) (((n)->la_expire == 0) && ((n)->ln_state > ND6_LLINFO_INCOMPLETE)) struct nd_ifinfo { u_int32_t linkmtu; /* LinkMTU */ u_int32_t maxmtu; /* Upper bound of LinkMTU */ u_int32_t basereachable; /* BaseReachableTime */ u_int32_t reachable; /* Reachable Time */ u_int32_t retrans; /* Retrans Timer */ u_int32_t flags; /* Flags */ int recalctm; /* BaseReacable re-calculation timer */ u_int8_t chlim; /* CurHopLimit */ u_int8_t initialized; /* Flag to see the entry is initialized */ /* the following 3 members are for privacy extension for addrconf */ u_int8_t randomseed0[8]; /* upper 64 bits of MD5 digest */ u_int8_t randomseed1[8]; /* lower 64 bits (usually the EUI64 IFID) */ u_int8_t randomid[8]; /* current random ID */ }; #define ND6_IFF_PERFORMNUD 0x1 #define ND6_IFF_ACCEPT_RTADV 0x2 #define ND6_IFF_PREFER_SOURCE 0x4 /* XXX: not related to ND. */ #define ND6_IFF_IFDISABLED 0x8 /* IPv6 operation is disabled due to * DAD failure. (XXX: not ND-specific) */ #define ND6_IFF_DONT_SET_IFROUTE 0x10 +#define ND6_CREATE LLE_CREATE +#define ND6_EXCLUSIVE LLE_EXCLUSIVE + #ifdef _KERNEL #define ND_IFINFO(ifp) \ (((struct in6_ifextra *)(ifp)->if_afdata[AF_INET6])->nd_ifinfo) #define IN6_LINKMTU(ifp) \ ((ND_IFINFO(ifp)->linkmtu && ND_IFINFO(ifp)->linkmtu < (ifp)->if_mtu) \ ? ND_IFINFO(ifp)->linkmtu \ : ((ND_IFINFO(ifp)->maxmtu && ND_IFINFO(ifp)->maxmtu < (ifp)->if_mtu) \ ? ND_IFINFO(ifp)->maxmtu : (ifp)->if_mtu)) #endif struct in6_nbrinfo { char ifname[IFNAMSIZ]; /* if name, e.g. 
"en0" */ struct in6_addr addr; /* IPv6 address of the neighbor */ long asked; /* number of queries already sent for this addr */ int isrouter; /* if it acts as a router */ int state; /* reachability state */ int expire; /* lifetime for NDP state transition */ }; #define DRLSTSIZ 10 #define PRLSTSIZ 10 struct in6_drlist { char ifname[IFNAMSIZ]; struct { struct in6_addr rtaddr; u_char flags; u_short rtlifetime; u_long expire; u_short if_index; } defrouter[DRLSTSIZ]; }; struct in6_defrouter { struct sockaddr_in6 rtaddr; u_char flags; u_short rtlifetime; u_long expire; u_short if_index; }; #ifdef _KERNEL struct in6_oprlist { char ifname[IFNAMSIZ]; struct { struct in6_addr prefix; struct prf_ra raflags; u_char prefixlen; u_char origin; u_long vltime; u_long pltime; u_long expire; u_short if_index; u_short advrtrs; /* number of advertisement routers */ struct in6_addr advrtr[DRLSTSIZ]; /* XXX: explicit limit */ } prefix[PRLSTSIZ]; }; #endif struct in6_prlist { char ifname[IFNAMSIZ]; struct { struct in6_addr prefix; struct prf_ra raflags; u_char prefixlen; u_char origin; u_int32_t vltime; u_int32_t pltime; time_t expire; u_short if_index; u_short advrtrs; /* number of advertisement routers */ struct in6_addr advrtr[DRLSTSIZ]; /* XXX: explicit limit */ } prefix[PRLSTSIZ]; }; struct in6_prefix { struct sockaddr_in6 prefix; struct prf_ra raflags; u_char prefixlen; u_char origin; u_int32_t vltime; u_int32_t pltime; time_t expire; u_int32_t flags; int refcnt; u_short if_index; u_short advrtrs; /* number of advertisement routers */ /* struct sockaddr_in6 advrtr[] */ }; #ifdef _KERNEL struct in6_ondireq { char ifname[IFNAMSIZ]; struct { u_int32_t linkmtu; /* LinkMTU */ u_int32_t maxmtu; /* Upper bound of LinkMTU */ u_int32_t basereachable; /* BaseReachableTime */ u_int32_t reachable; /* Reachable Time */ u_int32_t retrans; /* Retrans Timer */ u_int32_t flags; /* Flags */ int recalctm; /* BaseReacable re-calculation timer */ u_int8_t chlim; /* CurHopLimit */ u_int8_t receivedra; } ndi; }; #endif struct in6_ndireq { char ifname[IFNAMSIZ]; struct nd_ifinfo ndi; }; struct in6_ndifreq { char ifname[IFNAMSIZ]; u_long ifindex; }; /* Prefix status */ #define NDPRF_ONLINK 0x1 #define NDPRF_DETACHED 0x2 /* protocol constants */ #define MAX_RTR_SOLICITATION_DELAY 1 /* 1sec */ #define RTR_SOLICITATION_INTERVAL 4 /* 4sec */ #define MAX_RTR_SOLICITATIONS 3 #define ND6_INFINITE_LIFETIME 0xffffffff #ifdef _KERNEL /* node constants */ #define MAX_REACHABLE_TIME 3600000 /* msec */ #define REACHABLE_TIME 30000 /* msec */ #define RETRANS_TIMER 1000 /* msec */ #define MIN_RANDOM_FACTOR 512 /* 1024 * 0.5 */ #define MAX_RANDOM_FACTOR 1536 /* 1024 * 1.5 */ #define DEF_TEMP_VALID_LIFETIME 604800 /* 1 week */ #define DEF_TEMP_PREFERRED_LIFETIME 86400 /* 1 day */ #define TEMPADDR_REGEN_ADVANCE 5 /* sec */ #define MAX_TEMP_DESYNC_FACTOR 600 /* 10 min */ #define ND_COMPUTE_RTIME(x) \ (((MIN_RANDOM_FACTOR * (x >> 10)) + (arc4random() & \ ((MAX_RANDOM_FACTOR - MIN_RANDOM_FACTOR) * (x >> 10)))) /1000) TAILQ_HEAD(nd_drhead, nd_defrouter); struct nd_defrouter { TAILQ_ENTRY(nd_defrouter) dr_entry; struct in6_addr rtaddr; u_char flags; /* flags on RA message */ u_short rtlifetime; u_long expire; struct ifnet *ifp; int installed; /* is installed into kernel routing table */ }; struct nd_prefixctl { struct ifnet *ndpr_ifp; /* prefix */ struct sockaddr_in6 ndpr_prefix; u_char ndpr_plen; u_int32_t ndpr_vltime; /* advertised valid lifetime */ u_int32_t ndpr_pltime; /* advertised preferred lifetime */ struct prf_ra ndpr_flags; }; struct 
nd_prefix { struct ifnet *ndpr_ifp; LIST_ENTRY(nd_prefix) ndpr_entry; struct sockaddr_in6 ndpr_prefix; /* prefix */ struct in6_addr ndpr_mask; /* netmask derived from the prefix */ u_int32_t ndpr_vltime; /* advertised valid lifetime */ u_int32_t ndpr_pltime; /* advertised preferred lifetime */ time_t ndpr_expire; /* expiration time of the prefix */ time_t ndpr_preferred; /* preferred time of the prefix */ time_t ndpr_lastupdate; /* reception time of last advertisement */ struct prf_ra ndpr_flags; u_int32_t ndpr_stateflags; /* actual state flags */ /* list of routers that advertise the prefix: */ LIST_HEAD(pr_rtrhead, nd_pfxrouter) ndpr_advrtrs; u_char ndpr_plen; int ndpr_refcnt; /* reference couter from addresses */ }; #define ndpr_next ndpr_entry.le_next #define ndpr_raf ndpr_flags #define ndpr_raf_onlink ndpr_flags.onlink #define ndpr_raf_auto ndpr_flags.autonomous #define ndpr_raf_router ndpr_flags.router /* * Message format for use in obtaining information about prefixes * from inet6 sysctl function */ struct inet6_ndpr_msghdr { u_short inpm_msglen; /* to skip over non-understood messages */ u_char inpm_version; /* future binary compatibility */ u_char inpm_type; /* message type */ struct in6_addr inpm_prefix; u_long prm_vltim; u_long prm_pltime; u_long prm_expire; u_long prm_preferred; struct in6_prflags prm_flags; u_short prm_index; /* index for associated ifp */ u_char prm_plen; /* length of prefix in bits */ }; #define prm_raf_onlink prm_flags.prf_ra.onlink #define prm_raf_auto prm_flags.prf_ra.autonomous #define prm_statef_onlink prm_flags.prf_state.onlink #define prm_rrf_decrvalid prm_flags.prf_rr.decrvalid #define prm_rrf_decrprefd prm_flags.prf_rr.decrprefd struct nd_pfxrouter { LIST_ENTRY(nd_pfxrouter) pfr_entry; #define pfr_next pfr_entry.le_next struct nd_defrouter *router; }; LIST_HEAD(nd_prhead, nd_prefix); /* nd6.c */ #ifdef VIMAGE_GLOBALS extern int nd6_prune; extern int nd6_delay; extern int nd6_umaxtries; extern int nd6_mmaxtries; extern int nd6_useloopback; extern int nd6_maxnudhint; extern int nd6_gctimer; -extern struct llinfo_nd6 llinfo_nd6; extern struct nd_drhead nd_defrouter; extern struct nd_prhead nd_prefix; extern int nd6_debug; extern int nd6_onlink_ns_rfc4861; extern struct callout nd6_timer_ch; /* nd6_rtr.c */ extern int nd6_defifindex; extern int ip6_desync_factor; /* seconds */ extern u_int32_t ip6_temp_preferred_lifetime; /* seconds */ extern u_int32_t ip6_temp_valid_lifetime; /* seconds */ extern int ip6_temp_regen_advance; /* seconds */ #endif /* VIMAGE_GLOBALS */ #define nd6log(x) do { if (V_nd6_debug) log x; } while (/*CONSTCOND*/ 0) union nd_opts { struct nd_opt_hdr *nd_opt_array[8]; /* max = target address list */ struct { struct nd_opt_hdr *zero; struct nd_opt_hdr *src_lladdr; struct nd_opt_hdr *tgt_lladdr; struct nd_opt_prefix_info *pi_beg; /* multiple opts, start */ struct nd_opt_rd_hdr *rh; struct nd_opt_mtu *mtu; struct nd_opt_hdr *search; /* multiple opts */ struct nd_opt_hdr *last; /* multiple opts */ int done; struct nd_opt_prefix_info *pi_end;/* multiple opts, end */ } nd_opt_each; }; #define nd_opts_src_lladdr nd_opt_each.src_lladdr #define nd_opts_tgt_lladdr nd_opt_each.tgt_lladdr #define nd_opts_pi nd_opt_each.pi_beg #define nd_opts_pi_end nd_opt_each.pi_end #define nd_opts_rh nd_opt_each.rh #define nd_opts_mtu nd_opt_each.mtu #define nd_opts_search nd_opt_each.search #define nd_opts_last nd_opt_each.last #define nd_opts_done nd_opt_each.done /* XXX: need nd6_var.h?? 
*/ /* nd6.c */ void nd6_init __P((void)); struct nd_ifinfo *nd6_ifattach __P((struct ifnet *)); void nd6_ifdetach __P((struct nd_ifinfo *)); int nd6_is_addr_neighbor __P((struct sockaddr_in6 *, struct ifnet *)); void nd6_option_init __P((void *, int, union nd_opts *)); struct nd_opt_hdr *nd6_option __P((union nd_opts *)); int nd6_options __P((union nd_opts *)); -struct rtentry *nd6_lookup __P((struct in6_addr *, int, struct ifnet *)); +struct llentry *nd6_lookup __P((struct in6_addr *, int, struct ifnet *)); void nd6_setmtu __P((struct ifnet *)); -void nd6_llinfo_settimer __P((struct llinfo_nd6 *, long)); +void nd6_llinfo_settimer __P((struct llentry *, long)); +void nd6_llinfo_settimer_locked __P((struct llentry *, long)); void nd6_timer __P((void *)); void nd6_purge __P((struct ifnet *)); void nd6_nud_hint __P((struct rtentry *, struct in6_addr *, int)); int nd6_resolve __P((struct ifnet *, struct rtentry *, struct mbuf *, struct sockaddr *, u_char *)); -void nd6_rtrequest __P((int, struct rtentry *, struct rt_addrinfo *)); int nd6_ioctl __P((u_long, caddr_t, struct ifnet *)); -struct rtentry *nd6_cache_lladdr __P((struct ifnet *, struct in6_addr *, +struct llentry *nd6_cache_lladdr __P((struct ifnet *, struct in6_addr *, char *, int, int, int)); int nd6_output __P((struct ifnet *, struct ifnet *, struct mbuf *, struct sockaddr_in6 *, struct rtentry *)); +int nd6_output_lle __P((struct ifnet *, struct ifnet *, struct mbuf *, + struct sockaddr_in6 *, struct rtentry *, struct llentry *, + struct mbuf **)); +int nd6_output_flush __P((struct ifnet *, struct ifnet *, struct mbuf *, + struct sockaddr_in6 *, struct rtentry *)); int nd6_need_cache __P((struct ifnet *)); int nd6_storelladdr __P((struct ifnet *, struct rtentry *, struct mbuf *, - struct sockaddr *, u_char *)); + struct sockaddr *, u_char *, struct llentry **)); /* nd6_nbr.c */ void nd6_na_input __P((struct mbuf *, int, int)); void nd6_na_output __P((struct ifnet *, const struct in6_addr *, const struct in6_addr *, u_long, int, struct sockaddr *)); void nd6_ns_input __P((struct mbuf *, int, int)); void nd6_ns_output __P((struct ifnet *, const struct in6_addr *, - const struct in6_addr *, struct llinfo_nd6 *, int)); + const struct in6_addr *, struct llentry *, int)); caddr_t nd6_ifptomac __P((struct ifnet *)); void nd6_dad_start __P((struct ifaddr *, int)); void nd6_dad_stop __P((struct ifaddr *)); void nd6_dad_duplicated __P((struct ifaddr *)); /* nd6_rtr.c */ void nd6_rs_input __P((struct mbuf *, int, int)); void nd6_ra_input __P((struct mbuf *, int, int)); void prelist_del __P((struct nd_prefix *)); void defrouter_addreq __P((struct nd_defrouter *)); void defrouter_reset __P((void)); void defrouter_select __P((void)); void defrtrlist_del __P((struct nd_defrouter *)); void prelist_remove __P((struct nd_prefix *)); int nd6_prelist_add __P((struct nd_prefixctl *, struct nd_defrouter *, struct nd_prefix **)); int nd6_prefix_onlink __P((struct nd_prefix *)); int nd6_prefix_offlink __P((struct nd_prefix *)); void pfxlist_onlink_check __P((void)); struct nd_defrouter *defrouter_lookup __P((struct in6_addr *, struct ifnet *)); struct nd_prefix *nd6_prefix_lookup __P((struct nd_prefixctl *)); void rt6_flush __P((struct in6_addr *, struct ifnet *)); int nd6_setdefaultiface __P((int)); int in6_tmpifadd __P((const struct in6_ifaddr *, int, int)); #endif /* _KERNEL */ #endif /* _NETINET6_ND6_H_ */ Index: head/sys/netinet6/nd6_nbr.c =================================================================== --- head/sys/netinet6/nd6_nbr.c (revision 
186118) +++ head/sys/netinet6/nd6_nbr.c (revision 186119) @@ -1,1505 +1,1519 @@ /*- * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the project nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $KAME: nd6_nbr.c,v 1.86 2002/01/21 02:33:04 jinmei Exp $ */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include "opt_ipsec.h" #include "opt_carp.h" #include "opt_mpath.h" #include #include #include +#include +#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef RADIX_MPATH #include #endif #include #include +#include +#define L3_ADDR_SIN6(le) ((struct sockaddr_in6 *) L3_ADDR(le)) #include #include #include #include #include #include #include #include #ifdef DEV_CARP #include #endif #define SDL(s) ((struct sockaddr_dl *)s) struct dadq; static struct dadq *nd6_dad_find(struct ifaddr *); static void nd6_dad_starttimer(struct dadq *, int); static void nd6_dad_stoptimer(struct dadq *); static void nd6_dad_timer(struct ifaddr *); static void nd6_dad_ns_output(struct dadq *, struct ifaddr *); static void nd6_dad_ns_input(struct ifaddr *); static void nd6_dad_na_input(struct ifaddr *); #ifdef VIMAGE_GLOBALS int dad_ignore_ns; int dad_maxtry; #endif /* * Input a Neighbor Solicitation Message. 
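nd6_nbr.c, like nd6.c earlier in this patch, defines L3_ADDR_SIN6() to recover the IPv6 key of an llentry, since the address is no longer available as rt_key() of a host route. A small usage sketch, assuming an entry already obtained (and kept locked) through nd6_lookup(); the function and buffer names are illustrative:

	/* Log which neighbor an entry describes; ln must be held locked by the caller. */
	static void
	example_log_entry(struct llentry *ln)
	{
		char ip6buf[INET6_ADDRSTRLEN];
		struct sockaddr_in6 *sin6 = L3_ADDR_SIN6(ln);	/* entry's IPv6 key */

		log(LOG_DEBUG, "neighbor %s on %s\n",
		    ip6_sprintf(ip6buf, &sin6->sin6_addr),
		    if_name(ln->lle_tbl->llt_ifp));
	}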
* * Based on RFC 2461 * Based on RFC 2462 (duplicate address detection) */ void nd6_ns_input(struct mbuf *m, int off, int icmp6len) { INIT_VNET_INET6(curvnet); struct ifnet *ifp = m->m_pkthdr.rcvif; struct ip6_hdr *ip6 = mtod(m, struct ip6_hdr *); struct nd_neighbor_solicit *nd_ns; struct in6_addr saddr6 = ip6->ip6_src; struct in6_addr daddr6 = ip6->ip6_dst; struct in6_addr taddr6; struct in6_addr myaddr6; char *lladdr = NULL; struct ifaddr *ifa = NULL; int lladdrlen = 0; int anycast = 0, proxy = 0, tentative = 0; int tlladdr; union nd_opts ndopts; struct sockaddr_dl *proxydl = NULL; char ip6bufs[INET6_ADDRSTRLEN], ip6bufd[INET6_ADDRSTRLEN]; #ifndef PULLDOWN_TEST IP6_EXTHDR_CHECK(m, off, icmp6len,); nd_ns = (struct nd_neighbor_solicit *)((caddr_t)ip6 + off); #else IP6_EXTHDR_GET(nd_ns, struct nd_neighbor_solicit *, m, off, icmp6len); if (nd_ns == NULL) { V_icmp6stat.icp6s_tooshort++; return; } #endif ip6 = mtod(m, struct ip6_hdr *); /* adjust pointer for safety */ taddr6 = nd_ns->nd_ns_target; if (in6_setscope(&taddr6, ifp, NULL) != 0) goto bad; if (ip6->ip6_hlim != 255) { nd6log((LOG_ERR, "nd6_ns_input: invalid hlim (%d) from %s to %s on %s\n", ip6->ip6_hlim, ip6_sprintf(ip6bufs, &ip6->ip6_src), ip6_sprintf(ip6bufd, &ip6->ip6_dst), if_name(ifp))); goto bad; } if (IN6_IS_ADDR_UNSPECIFIED(&saddr6)) { /* dst has to be a solicited node multicast address. */ if (daddr6.s6_addr16[0] == IPV6_ADDR_INT16_MLL && /* don't check ifindex portion */ daddr6.s6_addr32[1] == 0 && daddr6.s6_addr32[2] == IPV6_ADDR_INT32_ONE && daddr6.s6_addr8[12] == 0xff) { ; /* good */ } else { nd6log((LOG_INFO, "nd6_ns_input: bad DAD packet " "(wrong ip6 dst)\n")); goto bad; } } else if (!V_nd6_onlink_ns_rfc4861) { struct sockaddr_in6 src_sa6; /* * According to recent IETF discussions, it is not a good idea * to accept a NS from an address which would not be deemed * to be a neighbor otherwise. This point is expected to be * clarified in future revisions of the specification. */ bzero(&src_sa6, sizeof(src_sa6)); src_sa6.sin6_family = AF_INET6; src_sa6.sin6_len = sizeof(src_sa6); src_sa6.sin6_addr = saddr6; - if (!nd6_is_addr_neighbor(&src_sa6, ifp)) { + if (nd6_is_addr_neighbor(&src_sa6, ifp) == 0) { nd6log((LOG_INFO, "nd6_ns_input: " "NS packet from non-neighbor\n")); goto bad; } } if (IN6_IS_ADDR_MULTICAST(&taddr6)) { nd6log((LOG_INFO, "nd6_ns_input: bad NS target (multicast)\n")); goto bad; } icmp6len -= sizeof(*nd_ns); nd6_option_init(nd_ns + 1, icmp6len, &ndopts); if (nd6_options(&ndopts) < 0) { nd6log((LOG_INFO, "nd6_ns_input: invalid ND option, ignored\n")); /* nd6_options have incremented stats */ goto freeit; } if (ndopts.nd_opts_src_lladdr) { lladdr = (char *)(ndopts.nd_opts_src_lladdr + 1); lladdrlen = ndopts.nd_opts_src_lladdr->nd_opt_len << 3; } if (IN6_IS_ADDR_UNSPECIFIED(&ip6->ip6_src) && lladdr) { nd6log((LOG_INFO, "nd6_ns_input: bad DAD packet " "(link-layer address option)\n")); goto bad; } /* * Attaching target link-layer address to the NA? * (RFC 2461 7.2.4) * * NS IP dst is unicast/anycast MUST NOT add * NS IP dst is solicited-node multicast MUST add * * In implementation, we add target link-layer address by default. * We do not add one in MUST NOT cases. */ if (!IN6_IS_ADDR_MULTICAST(&daddr6)) tlladdr = 0; else tlladdr = 1; /* * Target address (taddr6) must be either: * (1) Valid unicast/anycast address for my receiving interface, * (2) Unicast address for which I'm offering proxy service, or * (3) "tentative" address on which DAD is being performed. */ /* (1) and (3) check. 
*/ #ifdef DEV_CARP if (ifp->if_carp) ifa = carp_iamatch6(ifp->if_carp, &taddr6); if (ifa == NULL) ifa = (struct ifaddr *)in6ifa_ifpwithaddr(ifp, &taddr6); #else ifa = (struct ifaddr *)in6ifa_ifpwithaddr(ifp, &taddr6); #endif /* (2) check. */ if (ifa == NULL) { struct rtentry *rt; struct sockaddr_in6 tsin6; int need_proxy; #ifdef RADIX_MPATH struct route_in6 ro; #endif bzero(&tsin6, sizeof tsin6); tsin6.sin6_len = sizeof(struct sockaddr_in6); tsin6.sin6_family = AF_INET6; tsin6.sin6_addr = taddr6; #ifdef RADIX_MPATH bzero(&ro, sizeof(ro)); ro.ro_dst = tsin6; rtalloc_mpath((struct route *)&ro, RTF_ANNOUNCE); rt = ro.ro_rt; #else rt = rtalloc1((struct sockaddr *)&tsin6, 0, 0); #endif need_proxy = (rt && (rt->rt_flags & RTF_ANNOUNCE) != 0 && rt->rt_gateway->sa_family == AF_LINK); if (rt) rtfree(rt); if (need_proxy) { /* * proxy NDP for single entry */ ifa = (struct ifaddr *)in6ifa_ifpforlinklocal(ifp, IN6_IFF_NOTREADY|IN6_IFF_ANYCAST); if (ifa) { proxy = 1; proxydl = SDL(rt->rt_gateway); } } } if (ifa == NULL) { /* * We've got an NS packet, and we don't have that adddress * assigned for us. We MUST silently ignore it. * See RFC2461 7.2.3. */ goto freeit; } myaddr6 = *IFA_IN6(ifa); anycast = ((struct in6_ifaddr *)ifa)->ia6_flags & IN6_IFF_ANYCAST; tentative = ((struct in6_ifaddr *)ifa)->ia6_flags & IN6_IFF_TENTATIVE; if (((struct in6_ifaddr *)ifa)->ia6_flags & IN6_IFF_DUPLICATED) goto freeit; if (lladdr && ((ifp->if_addrlen + 2 + 7) & ~7) != lladdrlen) { nd6log((LOG_INFO, "nd6_ns_input: lladdrlen mismatch for %s " "(if %d, NS packet %d)\n", ip6_sprintf(ip6bufs, &taddr6), ifp->if_addrlen, lladdrlen - 2)); goto bad; } if (IN6_ARE_ADDR_EQUAL(&myaddr6, &saddr6)) { nd6log((LOG_INFO, "nd6_ns_input: duplicate IP6 address %s\n", ip6_sprintf(ip6bufs, &saddr6))); goto freeit; } /* * We have neighbor solicitation packet, with target address equals to * one of my tentative address. * * src addr how to process? * --- --- * multicast of course, invalid (rejected in ip6_input) * unicast somebody is doing address resolution -> ignore * unspec dup address detection * * The processing is defined in RFC 2462. */ if (tentative) { /* * If source address is unspecified address, it is for * duplicate address detection. * * If not, the packet is for addess resolution; * silently ignore it. */ if (IN6_IS_ADDR_UNSPECIFIED(&saddr6)) nd6_dad_ns_input(ifa); goto freeit; } /* * If the source address is unspecified address, entries must not * be created or updated. * It looks that sender is performing DAD. Output NA toward * all-node multicast address, to tell the sender that I'm using * the address. * S bit ("solicited") must be zero. */ if (IN6_IS_ADDR_UNSPECIFIED(&saddr6)) { struct in6_addr in6_all; in6_all = in6addr_linklocal_allnodes; if (in6_setscope(&in6_all, ifp, NULL) != 0) goto bad; nd6_na_output(ifp, &in6_all, &taddr6, ((anycast || proxy || !tlladdr) ? 0 : ND_NA_FLAG_OVERRIDE) | (V_ip6_forwarding ? ND_NA_FLAG_ROUTER : 0), tlladdr, (struct sockaddr *)proxydl); goto freeit; } nd6_cache_lladdr(ifp, &saddr6, lladdr, lladdrlen, ND_NEIGHBOR_SOLICIT, 0); nd6_na_output(ifp, &saddr6, &taddr6, ((anycast || proxy || !tlladdr) ? 0 : ND_NA_FLAG_OVERRIDE) | (V_ip6_forwarding ? 
ND_NA_FLAG_ROUTER : 0) | ND_NA_FLAG_SOLICITED, tlladdr, (struct sockaddr *)proxydl); freeit: m_freem(m); return; bad: nd6log((LOG_ERR, "nd6_ns_input: src=%s\n", ip6_sprintf(ip6bufs, &saddr6))); nd6log((LOG_ERR, "nd6_ns_input: dst=%s\n", ip6_sprintf(ip6bufs, &daddr6))); nd6log((LOG_ERR, "nd6_ns_input: tgt=%s\n", ip6_sprintf(ip6bufs, &taddr6))); V_icmp6stat.icp6s_badns++; m_freem(m); } /* * Output a Neighbor Solicitation Message. Caller specifies: * - ICMP6 header source IP6 address * - ND6 header target IP6 address * - ND6 header source datalink address * * Based on RFC 2461 * Based on RFC 2462 (duplicate address detection) * * ln - for source address determination * dad - duplicate address detection */ void -nd6_ns_output(struct ifnet *ifp, const struct in6_addr *daddr6, - const struct in6_addr *taddr6, struct llinfo_nd6 *ln, int dad) +nd6_ns_output(struct ifnet *ifp, const struct in6_addr *daddr6, + const struct in6_addr *taddr6, struct llentry *ln, int dad) { INIT_VNET_INET6(ifp->if_vnet); struct mbuf *m; struct ip6_hdr *ip6; struct nd_neighbor_solicit *nd_ns; struct in6_addr *src, src_in; struct ip6_moptions im6o; int icmp6len; int maxlen; caddr_t mac; struct route_in6 ro; bzero(&ro, sizeof(ro)); if (IN6_IS_ADDR_MULTICAST(taddr6)) return; /* estimate the size of message */ maxlen = sizeof(*ip6) + sizeof(*nd_ns); maxlen += (sizeof(struct nd_opt_hdr) + ifp->if_addrlen + 7) & ~7; if (max_linkhdr + maxlen >= MCLBYTES) { #ifdef DIAGNOSTIC printf("nd6_ns_output: max_linkhdr + maxlen >= MCLBYTES " "(%d + %d > %d)\n", max_linkhdr, maxlen, MCLBYTES); #endif return; } MGETHDR(m, M_DONTWAIT, MT_DATA); if (m && max_linkhdr + maxlen >= MHLEN) { MCLGET(m, M_DONTWAIT); if ((m->m_flags & M_EXT) == 0) { m_free(m); m = NULL; } } if (m == NULL) return; m->m_pkthdr.rcvif = NULL; if (daddr6 == NULL || IN6_IS_ADDR_MULTICAST(daddr6)) { m->m_flags |= M_MCAST; im6o.im6o_multicast_ifp = ifp; im6o.im6o_multicast_hlim = 255; im6o.im6o_multicast_loop = 0; } icmp6len = sizeof(*nd_ns); m->m_pkthdr.len = m->m_len = sizeof(*ip6) + icmp6len; m->m_data += max_linkhdr; /* or MH_ALIGN() equivalent? */ /* fill neighbor solicitation packet */ ip6 = mtod(m, struct ip6_hdr *); ip6->ip6_flow = 0; ip6->ip6_vfc &= ~IPV6_VERSION_MASK; ip6->ip6_vfc |= IPV6_VERSION; /* ip6->ip6_plen will be set later */ ip6->ip6_nxt = IPPROTO_ICMPV6; ip6->ip6_hlim = 255; if (daddr6) ip6->ip6_dst = *daddr6; else { ip6->ip6_dst.s6_addr16[0] = IPV6_ADDR_INT16_MLL; ip6->ip6_dst.s6_addr16[1] = 0; ip6->ip6_dst.s6_addr32[1] = 0; ip6->ip6_dst.s6_addr32[2] = IPV6_ADDR_INT32_ONE; ip6->ip6_dst.s6_addr32[3] = taddr6->s6_addr32[3]; ip6->ip6_dst.s6_addr8[12] = 0xff; if (in6_setscope(&ip6->ip6_dst, ifp, NULL) != 0) goto bad; } if (!dad) { /* * RFC2461 7.2.2: * "If the source address of the packet prompting the * solicitation is the same as one of the addresses assigned * to the outgoing interface, that address SHOULD be placed * in the IP Source Address of the outgoing solicitation. * Otherwise, any one of the addresses assigned to the * interface should be used." * * We use the source address for the prompting packet * (saddr6), if: * - saddr6 is given from the caller (by giving "ln"), and * - saddr6 belongs to the outgoing interface. * Otherwise, we perform the source address selection as usual. 
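 *
 * [Editorial note, not part of this change: a minimal sketch of the rule
 *  above, using the "ln", "ifp" and "dst_sa" names from the code that
 *  follows.  Roughly:
 *
 *      hip6 = mtod(ln->la_hold, struct ip6_hdr *);   (if a packet is queued)
 *      hsrc = &hip6->ip6_src;                        (candidate source)
 *      src  = (hsrc && in6ifa_ifpwithaddr(ifp, hsrc)) ? hsrc :
 *          in6_selectsrc(&dst_sa, NULL, NULL, &ro, NULL, NULL, &error);
 *
 *  i.e. prefer the source address of the packet that prompted the
 *  solicitation when that address is configured on the outgoing
 *  interface, otherwise fall back to normal source selection.]
 *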
*/ struct ip6_hdr *hip6; /* hold ip6 */ struct in6_addr *hsrc = NULL; - if (ln && ln->ln_hold) { + if (ln && ln->la_hold) { /* - * assuming every packet in ln_hold has the same IP + * assuming every packet in la_hold has the same IP * header */ - hip6 = mtod(ln->ln_hold, struct ip6_hdr *); + hip6 = mtod(ln->la_hold, struct ip6_hdr *); /* XXX pullup? */ - if (sizeof(*hip6) < ln->ln_hold->m_len) + if (sizeof(*hip6) < ln->la_hold->m_len) hsrc = &hip6->ip6_src; else hsrc = NULL; } if (hsrc && in6ifa_ifpwithaddr(ifp, hsrc)) src = hsrc; else { int error; struct sockaddr_in6 dst_sa; bzero(&dst_sa, sizeof(dst_sa)); dst_sa.sin6_family = AF_INET6; dst_sa.sin6_len = sizeof(dst_sa); dst_sa.sin6_addr = ip6->ip6_dst; src = in6_selectsrc(&dst_sa, NULL, NULL, &ro, NULL, NULL, &error); if (src == NULL) { char ip6buf[INET6_ADDRSTRLEN]; nd6log((LOG_DEBUG, "nd6_ns_output: source can't be " "determined: dst=%s, error=%d\n", ip6_sprintf(ip6buf, &dst_sa.sin6_addr), error)); goto bad; } } } else { /* * Source address for DAD packet must always be IPv6 * unspecified address. (0::0) * We actually don't have to 0-clear the address (we did it * above), but we do so here explicitly to make the intention * clearer. */ bzero(&src_in, sizeof(src_in)); src = &src_in; } ip6->ip6_src = *src; nd_ns = (struct nd_neighbor_solicit *)(ip6 + 1); nd_ns->nd_ns_type = ND_NEIGHBOR_SOLICIT; nd_ns->nd_ns_code = 0; nd_ns->nd_ns_reserved = 0; nd_ns->nd_ns_target = *taddr6; in6_clearscope(&nd_ns->nd_ns_target); /* XXX */ /* * Add source link-layer address option. * * spec implementation * --- --- * DAD packet MUST NOT do not add the option * there's no link layer address: * impossible do not add the option * there's link layer address: * Multicast NS MUST add one add the option * Unicast NS SHOULD add one add the option */ if (!dad && (mac = nd6_ifptomac(ifp))) { int optlen = sizeof(struct nd_opt_hdr) + ifp->if_addrlen; struct nd_opt_hdr *nd_opt = (struct nd_opt_hdr *)(nd_ns + 1); /* 8 byte alignments... */ optlen = (optlen + 7) & ~7; m->m_pkthdr.len += optlen; m->m_len += optlen; icmp6len += optlen; bzero((caddr_t)nd_opt, optlen); nd_opt->nd_opt_type = ND_OPT_SOURCE_LINKADDR; nd_opt->nd_opt_len = optlen >> 3; bcopy(mac, (caddr_t)(nd_opt + 1), ifp->if_addrlen); } ip6->ip6_plen = htons((u_short)icmp6len); nd_ns->nd_ns_cksum = 0; nd_ns->nd_ns_cksum = in6_cksum(m, IPPROTO_ICMPV6, sizeof(*ip6), icmp6len); ip6_output(m, NULL, &ro, dad ? IPV6_UNSPECSRC : 0, &im6o, NULL, NULL); icmp6_ifstat_inc(ifp, ifs6_out_msg); icmp6_ifstat_inc(ifp, ifs6_out_neighborsolicit); V_icmp6stat.icp6s_outhist[ND_NEIGHBOR_SOLICIT]++; if (ro.ro_rt) { /* we don't cache this route. */ RTFREE(ro.ro_rt); } return; bad: if (ro.ro_rt) { RTFREE(ro.ro_rt); } m_freem(m); return; } /* * Neighbor advertisement input handling. 
* * Based on RFC 2461 * Based on RFC 2462 (duplicate address detection) * * the following items are not implemented yet: * - proxy advertisement delay rule (RFC2461 7.2.8, last paragraph, SHOULD) * - anycast advertisement delay rule (RFC2461 7.2.7, SHOULD) */ void nd6_na_input(struct mbuf *m, int off, int icmp6len) { INIT_VNET_INET6(curvnet); struct ifnet *ifp = m->m_pkthdr.rcvif; struct ip6_hdr *ip6 = mtod(m, struct ip6_hdr *); struct nd_neighbor_advert *nd_na; struct in6_addr daddr6 = ip6->ip6_dst; struct in6_addr taddr6; int flags; int is_router; int is_solicited; int is_override; char *lladdr = NULL; int lladdrlen = 0; struct ifaddr *ifa; - struct llinfo_nd6 *ln; - struct rtentry *rt; - struct sockaddr_dl *sdl; + struct llentry *ln = NULL; union nd_opts ndopts; + struct mbuf *chain = NULL; + struct sockaddr_in6 sin6; char ip6bufs[INET6_ADDRSTRLEN], ip6bufd[INET6_ADDRSTRLEN]; if (ip6->ip6_hlim != 255) { nd6log((LOG_ERR, "nd6_na_input: invalid hlim (%d) from %s to %s on %s\n", ip6->ip6_hlim, ip6_sprintf(ip6bufs, &ip6->ip6_src), ip6_sprintf(ip6bufd, &ip6->ip6_dst), if_name(ifp))); goto bad; } #ifndef PULLDOWN_TEST IP6_EXTHDR_CHECK(m, off, icmp6len,); nd_na = (struct nd_neighbor_advert *)((caddr_t)ip6 + off); #else IP6_EXTHDR_GET(nd_na, struct nd_neighbor_advert *, m, off, icmp6len); if (nd_na == NULL) { V_icmp6stat.icp6s_tooshort++; return; } #endif flags = nd_na->nd_na_flags_reserved; is_router = ((flags & ND_NA_FLAG_ROUTER) != 0); is_solicited = ((flags & ND_NA_FLAG_SOLICITED) != 0); is_override = ((flags & ND_NA_FLAG_OVERRIDE) != 0); taddr6 = nd_na->nd_na_target; if (in6_setscope(&taddr6, ifp, NULL)) goto bad; /* XXX: impossible */ if (IN6_IS_ADDR_MULTICAST(&taddr6)) { nd6log((LOG_ERR, "nd6_na_input: invalid target address %s\n", ip6_sprintf(ip6bufs, &taddr6))); goto bad; } if (IN6_IS_ADDR_MULTICAST(&daddr6)) if (is_solicited) { nd6log((LOG_ERR, "nd6_na_input: a solicited adv is multicasted\n")); goto bad; } icmp6len -= sizeof(*nd_na); nd6_option_init(nd_na + 1, icmp6len, &ndopts); if (nd6_options(&ndopts) < 0) { nd6log((LOG_INFO, "nd6_na_input: invalid ND option, ignored\n")); /* nd6_options have incremented stats */ goto freeit; } if (ndopts.nd_opts_tgt_lladdr) { lladdr = (char *)(ndopts.nd_opts_tgt_lladdr + 1); lladdrlen = ndopts.nd_opts_tgt_lladdr->nd_opt_len << 3; } ifa = (struct ifaddr *)in6ifa_ifpwithaddr(ifp, &taddr6); /* * Target address matches one of my interface address. * * If my address is tentative, this means that there's somebody * already using the same address as mine. This indicates DAD failure. * This is defined in RFC 2462. * * Otherwise, process as defined in RFC 2461. */ if (ifa && (((struct in6_ifaddr *)ifa)->ia6_flags & IN6_IFF_TENTATIVE)) { nd6_dad_na_input(ifa); goto freeit; } /* Just for safety, maybe unnecessary. */ if (ifa) { log(LOG_ERR, "nd6_na_input: duplicate IP6 address %s\n", ip6_sprintf(ip6bufs, &taddr6)); goto freeit; } if (lladdr && ((ifp->if_addrlen + 2 + 7) & ~7) != lladdrlen) { nd6log((LOG_INFO, "nd6_na_input: lladdrlen mismatch for %s " "(if %d, NA packet %d)\n", ip6_sprintf(ip6bufs, &taddr6), ifp->if_addrlen, lladdrlen - 2)); goto bad; } /* * If no neighbor cache entry is found, NA SHOULD silently be * discarded. 
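 *
 * [Editorial note, not part of this change: with the arp-v2 rewrite the
 *  lookup below hands back a locked struct llentry rather than a
 *  struct rtentry, so the code that follows uses the pattern
 *
 *      IF_AFDATA_LOCK(ifp);
 *      ln = nd6_lookup(&taddr6, LLE_EXCLUSIVE, ifp);
 *      IF_AFDATA_UNLOCK(ifp);
 *      if (ln == NULL)
 *              goto freeit;
 *      ... update ln->ll_addr, ln->ln_state and timers ...
 *      LLE_WUNLOCK(ln);
 *
 *  with the write lock released on every exit path (both the freeit and
 *  bad labels).]
 *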
*/ - rt = nd6_lookup(&taddr6, 0, ifp); - if ((rt == NULL) || - ((ln = (struct llinfo_nd6 *)rt->rt_llinfo) == NULL) || - ((sdl = SDL(rt->rt_gateway)) == NULL)) + IF_AFDATA_LOCK(ifp); + ln = nd6_lookup(&taddr6, LLE_EXCLUSIVE, ifp); + IF_AFDATA_UNLOCK(ifp); + if (ln == NULL) { goto freeit; + } if (ln->ln_state == ND6_LLINFO_INCOMPLETE) { /* * If the link-layer has address, and no lladdr option came, * discard the packet. */ - if (ifp->if_addrlen && lladdr == NULL) + if (ifp->if_addrlen && lladdr == NULL) { goto freeit; + } /* * Record link-layer address, and update the state. */ - sdl->sdl_alen = ifp->if_addrlen; - bcopy(lladdr, LLADDR(sdl), ifp->if_addrlen); + bcopy(lladdr, &ln->ll_addr, ifp->if_addrlen); + ln->la_flags |= LLE_VALID; if (is_solicited) { ln->ln_state = ND6_LLINFO_REACHABLE; ln->ln_byhint = 0; if (!ND6_LLINFO_PERMANENT(ln)) { - nd6_llinfo_settimer(ln, - (long)ND_IFINFO(rt->rt_ifp)->reachable * hz); + nd6_llinfo_settimer_locked(ln, + (long)ND_IFINFO(ln->lle_tbl->llt_ifp)->reachable * hz); } } else { ln->ln_state = ND6_LLINFO_STALE; - nd6_llinfo_settimer(ln, (long)V_nd6_gctimer * hz); + nd6_llinfo_settimer_locked(ln, (long)V_nd6_gctimer * hz); } if ((ln->ln_router = is_router) != 0) { /* * This means a router's state has changed from * non-reachable to probably reachable, and might * affect the status of associated prefixes.. */ pfxlist_onlink_check(); } } else { int llchange; /* * Check if the link-layer address has changed or not. */ if (lladdr == NULL) llchange = 0; else { - if (sdl->sdl_alen) { - if (bcmp(lladdr, LLADDR(sdl), ifp->if_addrlen)) + if (ln->la_flags & LLE_VALID) { + if (bcmp(lladdr, &ln->ll_addr, ifp->if_addrlen)) llchange = 1; else llchange = 0; } else llchange = 1; } /* * This is VERY complex. Look at it with care. * * override solicit lladdr llchange action * (L: record lladdr) * * 0 0 n -- (2c) * 0 0 y n (2b) L * 0 0 y y (1) REACHABLE->STALE * 0 1 n -- (2c) *->REACHABLE * 0 1 y n (2b) L *->REACHABLE * 0 1 y y (1) REACHABLE->STALE * 1 0 n -- (2a) * 1 0 y n (2a) L * 1 0 y y (2a) L *->STALE * 1 1 n -- (2a) *->REACHABLE * 1 1 y n (2a) L *->REACHABLE * 1 1 y y (2a) L *->REACHABLE */ if (!is_override && (lladdr != NULL && llchange)) { /* (1) */ /* * If state is REACHABLE, make it STALE. * no other updates should be done. */ if (ln->ln_state == ND6_LLINFO_REACHABLE) { ln->ln_state = ND6_LLINFO_STALE; - nd6_llinfo_settimer(ln, (long)V_nd6_gctimer * hz); + nd6_llinfo_settimer_locked(ln, (long)V_nd6_gctimer * hz); } goto freeit; } else if (is_override /* (2a) */ || (!is_override && (lladdr != NULL && !llchange)) /* (2b) */ || lladdr == NULL) { /* (2c) */ /* * Update link-local address, if any. */ if (lladdr != NULL) { - sdl->sdl_alen = ifp->if_addrlen; - bcopy(lladdr, LLADDR(sdl), ifp->if_addrlen); + bcopy(lladdr, &ln->ll_addr, ifp->if_addrlen); + ln->la_flags |= LLE_VALID; } /* * If solicited, make the state REACHABLE. * If not solicited and the link-layer address was * changed, make it STALE. */ if (is_solicited) { ln->ln_state = ND6_LLINFO_REACHABLE; ln->ln_byhint = 0; if (!ND6_LLINFO_PERMANENT(ln)) { - nd6_llinfo_settimer(ln, + nd6_llinfo_settimer_locked(ln, (long)ND_IFINFO(ifp)->reachable * hz); } } else { if (lladdr != NULL && llchange) { ln->ln_state = ND6_LLINFO_STALE; - nd6_llinfo_settimer(ln, + nd6_llinfo_settimer_locked(ln, (long)V_nd6_gctimer * hz); } } } if (ln->ln_router && !is_router) { /* * The peer dropped the router flag. * Remove the sender from the Default Router List and * update the Destination Cache entries. 
*/ struct nd_defrouter *dr; struct in6_addr *in6; - int s; - in6 = &((struct sockaddr_in6 *)rt_key(rt))->sin6_addr; + in6 = &L3_ADDR_SIN6(ln)->sin6_addr; /* * Lock to protect the default router list. * XXX: this might be unnecessary, since this function * is only called under the network software interrupt * context. However, we keep it just for safety. */ - s = splnet(); - dr = defrouter_lookup(in6, ifp); + dr = defrouter_lookup(in6, ln->lle_tbl->llt_ifp); if (dr) defrtrlist_del(dr); else if (!V_ip6_forwarding) { /* * Even if the neighbor is not in the default * router list, the neighbor may be used * as a next hop for some destinations * (e.g. redirect case). So we must * call rt6_flush explicitly. */ rt6_flush(&ip6->ip6_src, ifp); } - splx(s); } ln->ln_router = is_router; } - rt->rt_flags &= ~RTF_REJECT; - ln->ln_asked = 0; - if (ln->ln_hold) { + /* XXX - QL + * Does this matter? + * rt->rt_flags &= ~RTF_REJECT; + */ + ln->la_asked = 0; + if (ln->la_hold) { struct mbuf *m_hold, *m_hold_next; /* - * reset the ln_hold in advance, to explicitly - * prevent a ln_hold lookup in nd6_output() + * reset the la_hold in advance, to explicitly + * prevent a la_hold lookup in nd6_output() * (wouldn't happen, though...) */ - for (m_hold = ln->ln_hold; + for (m_hold = ln->la_hold, ln->la_hold = NULL; m_hold; m_hold = m_hold_next) { m_hold_next = m_hold->m_nextpkt; m_hold->m_nextpkt = NULL; /* * we assume ifp is not a loopback here, so just set * the 2nd argument as the 1st one. */ - nd6_output(ifp, ifp, m_hold, - (struct sockaddr_in6 *)rt_key(rt), rt); + nd6_output_lle(ifp, ifp, m_hold, L3_ADDR_SIN6(ln), NULL, ln, &chain); } - ln->ln_hold = NULL; } - freeit: + if (ln) { + if (chain) + memcpy(&sin6, L3_ADDR_SIN6(ln), sizeof(sin6)); + LLE_WUNLOCK(ln); + + if (chain) + nd6_output_flush(ifp, ifp, chain, &sin6, NULL); + } m_freem(m); return; bad: + if (ln) + LLE_WUNLOCK(ln); + V_icmp6stat.icp6s_badna++; m_freem(m); } /* * Neighbor advertisement output handling. 
* * Based on RFC 2461 * * the following items are not implemented yet: * - proxy advertisement delay rule (RFC2461 7.2.8, last paragraph, SHOULD) * - anycast advertisement delay rule (RFC2461 7.2.7, SHOULD) * * tlladdr - 1 if include target link-layer address * sdl0 - sockaddr_dl (= proxy NA) or NULL */ void nd6_na_output(struct ifnet *ifp, const struct in6_addr *daddr6_0, const struct in6_addr *taddr6, u_long flags, int tlladdr, struct sockaddr *sdl0) { INIT_VNET_INET6(ifp->if_vnet); struct mbuf *m; struct ip6_hdr *ip6; struct nd_neighbor_advert *nd_na; struct ip6_moptions im6o; struct in6_addr *src, daddr6; struct sockaddr_in6 dst_sa; int icmp6len, maxlen, error; caddr_t mac = NULL; struct route_in6 ro; bzero(&ro, sizeof(ro)); daddr6 = *daddr6_0; /* make a local copy for modification */ /* estimate the size of message */ maxlen = sizeof(*ip6) + sizeof(*nd_na); maxlen += (sizeof(struct nd_opt_hdr) + ifp->if_addrlen + 7) & ~7; if (max_linkhdr + maxlen >= MCLBYTES) { #ifdef DIAGNOSTIC printf("nd6_na_output: max_linkhdr + maxlen >= MCLBYTES " "(%d + %d > %d)\n", max_linkhdr, maxlen, MCLBYTES); #endif return; } MGETHDR(m, M_DONTWAIT, MT_DATA); if (m && max_linkhdr + maxlen >= MHLEN) { MCLGET(m, M_DONTWAIT); if ((m->m_flags & M_EXT) == 0) { m_free(m); m = NULL; } } if (m == NULL) return; m->m_pkthdr.rcvif = NULL; if (IN6_IS_ADDR_MULTICAST(&daddr6)) { m->m_flags |= M_MCAST; im6o.im6o_multicast_ifp = ifp; im6o.im6o_multicast_hlim = 255; im6o.im6o_multicast_loop = 0; } icmp6len = sizeof(*nd_na); m->m_pkthdr.len = m->m_len = sizeof(struct ip6_hdr) + icmp6len; m->m_data += max_linkhdr; /* or MH_ALIGN() equivalent? */ /* fill neighbor advertisement packet */ ip6 = mtod(m, struct ip6_hdr *); ip6->ip6_flow = 0; ip6->ip6_vfc &= ~IPV6_VERSION_MASK; ip6->ip6_vfc |= IPV6_VERSION; ip6->ip6_nxt = IPPROTO_ICMPV6; ip6->ip6_hlim = 255; if (IN6_IS_ADDR_UNSPECIFIED(&daddr6)) { /* reply to DAD */ daddr6.s6_addr16[0] = IPV6_ADDR_INT16_MLL; daddr6.s6_addr16[1] = 0; daddr6.s6_addr32[1] = 0; daddr6.s6_addr32[2] = 0; daddr6.s6_addr32[3] = IPV6_ADDR_INT32_ONE; if (in6_setscope(&daddr6, ifp, NULL)) goto bad; flags &= ~ND_NA_FLAG_SOLICITED; } ip6->ip6_dst = daddr6; bzero(&dst_sa, sizeof(struct sockaddr_in6)); dst_sa.sin6_family = AF_INET6; dst_sa.sin6_len = sizeof(struct sockaddr_in6); dst_sa.sin6_addr = daddr6; /* * Select a source whose scope is the same as that of the dest. */ bcopy(&dst_sa, &ro.ro_dst, sizeof(dst_sa)); src = in6_selectsrc(&dst_sa, NULL, NULL, &ro, NULL, NULL, &error); if (src == NULL) { char ip6buf[INET6_ADDRSTRLEN]; nd6log((LOG_DEBUG, "nd6_na_output: source can't be " "determined: dst=%s, error=%d\n", ip6_sprintf(ip6buf, &dst_sa.sin6_addr), error)); goto bad; } ip6->ip6_src = *src; nd_na = (struct nd_neighbor_advert *)(ip6 + 1); nd_na->nd_na_type = ND_NEIGHBOR_ADVERT; nd_na->nd_na_code = 0; nd_na->nd_na_target = *taddr6; in6_clearscope(&nd_na->nd_na_target); /* XXX */ /* * "tlladdr" indicates NS's condition for adding tlladdr or not. * see nd6_ns_input() for details. * Basically, if NS packet is sent to unicast/anycast addr, * target lladdr option SHOULD NOT be included. */ if (tlladdr) { /* * sdl0 != NULL indicates proxy NA. If we do proxy, use * lladdr in sdl0. If we are not proxying (sending NA for * my address) use lladdr configured for the interface. 
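 *
 * [Editorial note, not part of this change: in short, the code below
 *  picks the link-layer address to advertise as
 *
 *      mac = (sdl0 == NULL) ? nd6_ifptomac(ifp)           -- our own address
 *          : (sdl0->sa_family == AF_LINK &&
 *             SDL(sdl0)->sdl_alen == ifp->if_addrlen) ?
 *            LLADDR(SDL(sdl0)) : NULL;                     -- proxied address
 *
 *  (with carp_macmatch6() consulted first when DEV_CARP is compiled in);
 *  if no usable address is found, the OVERRIDE flag is cleared.]
 *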
*/ if (sdl0 == NULL) { #ifdef DEV_CARP if (ifp->if_carp) mac = carp_macmatch6(ifp->if_carp, m, taddr6); if (mac == NULL) mac = nd6_ifptomac(ifp); #else mac = nd6_ifptomac(ifp); #endif } else if (sdl0->sa_family == AF_LINK) { struct sockaddr_dl *sdl; sdl = (struct sockaddr_dl *)sdl0; if (sdl->sdl_alen == ifp->if_addrlen) mac = LLADDR(sdl); } } if (tlladdr && mac) { int optlen = sizeof(struct nd_opt_hdr) + ifp->if_addrlen; struct nd_opt_hdr *nd_opt = (struct nd_opt_hdr *)(nd_na + 1); /* roundup to 8 bytes alignment! */ optlen = (optlen + 7) & ~7; m->m_pkthdr.len += optlen; m->m_len += optlen; icmp6len += optlen; bzero((caddr_t)nd_opt, optlen); nd_opt->nd_opt_type = ND_OPT_TARGET_LINKADDR; nd_opt->nd_opt_len = optlen >> 3; bcopy(mac, (caddr_t)(nd_opt + 1), ifp->if_addrlen); } else flags &= ~ND_NA_FLAG_OVERRIDE; ip6->ip6_plen = htons((u_short)icmp6len); nd_na->nd_na_flags_reserved = flags; nd_na->nd_na_cksum = 0; nd_na->nd_na_cksum = in6_cksum(m, IPPROTO_ICMPV6, sizeof(struct ip6_hdr), icmp6len); ip6_output(m, NULL, &ro, 0, &im6o, NULL, NULL); icmp6_ifstat_inc(ifp, ifs6_out_msg); icmp6_ifstat_inc(ifp, ifs6_out_neighboradvert); V_icmp6stat.icp6s_outhist[ND_NEIGHBOR_ADVERT]++; if (ro.ro_rt) { /* we don't cache this route. */ RTFREE(ro.ro_rt); } return; bad: if (ro.ro_rt) { RTFREE(ro.ro_rt); } m_freem(m); return; } caddr_t nd6_ifptomac(struct ifnet *ifp) { switch (ifp->if_type) { case IFT_ARCNET: case IFT_ETHER: case IFT_FDDI: case IFT_IEEE1394: #ifdef IFT_L2VLAN case IFT_L2VLAN: #endif #ifdef IFT_IEEE80211 case IFT_IEEE80211: #endif #ifdef IFT_CARP case IFT_CARP: #endif case IFT_BRIDGE: case IFT_ISO88025: return IF_LLADDR(ifp); default: return NULL; } } TAILQ_HEAD(dadq_head, dadq); struct dadq { TAILQ_ENTRY(dadq) dad_list; struct ifaddr *dad_ifa; int dad_count; /* max NS to send */ int dad_ns_tcount; /* # of trials to send NS */ int dad_ns_ocount; /* NS sent so far */ int dad_ns_icount; int dad_na_icount; struct callout dad_timer_ch; }; #ifdef VIMAGE_GLOBALS static struct dadq_head dadq; int dad_init; #endif static struct dadq * nd6_dad_find(struct ifaddr *ifa) { INIT_VNET_INET6(curvnet); struct dadq *dp; for (dp = V_dadq.tqh_first; dp; dp = dp->dad_list.tqe_next) { if (dp->dad_ifa == ifa) return dp; } return NULL; } static void nd6_dad_starttimer(struct dadq *dp, int ticks) { callout_reset(&dp->dad_timer_ch, ticks, (void (*)(void *))nd6_dad_timer, (void *)dp->dad_ifa); } static void nd6_dad_stoptimer(struct dadq *dp) { callout_stop(&dp->dad_timer_ch); } /* * Start Duplicate Address Detection (DAD) for specified interface address. */ void nd6_dad_start(struct ifaddr *ifa, int delay) { INIT_VNET_INET6(curvnet); struct in6_ifaddr *ia = (struct in6_ifaddr *)ifa; struct dadq *dp; char ip6buf[INET6_ADDRSTRLEN]; if (!V_dad_init) { TAILQ_INIT(&V_dadq); V_dad_init++; } /* * If we don't need DAD, don't do it. * There are several cases: * - DAD is disabled (ip6_dad_count == 0) * - the interface address is anycast */ if (!(ia->ia6_flags & IN6_IFF_TENTATIVE)) { log(LOG_DEBUG, "nd6_dad_start: called with non-tentative address " "%s(%s)\n", ip6_sprintf(ip6buf, &ia->ia_addr.sin6_addr), ifa->ifa_ifp ? 
if_name(ifa->ifa_ifp) : "???"); return; } if (ia->ia6_flags & IN6_IFF_ANYCAST) { ia->ia6_flags &= ~IN6_IFF_TENTATIVE; return; } if (!V_ip6_dad_count) { ia->ia6_flags &= ~IN6_IFF_TENTATIVE; return; } if (ifa->ifa_ifp == NULL) panic("nd6_dad_start: ifa->ifa_ifp == NULL"); if (!(ifa->ifa_ifp->if_flags & IFF_UP)) { return; } if (nd6_dad_find(ifa) != NULL) { /* DAD already in progress */ return; } dp = malloc(sizeof(*dp), M_IP6NDP, M_NOWAIT); if (dp == NULL) { log(LOG_ERR, "nd6_dad_start: memory allocation failed for " "%s(%s)\n", ip6_sprintf(ip6buf, &ia->ia_addr.sin6_addr), ifa->ifa_ifp ? if_name(ifa->ifa_ifp) : "???"); return; } bzero(dp, sizeof(*dp)); callout_init(&dp->dad_timer_ch, 0); TAILQ_INSERT_TAIL(&V_dadq, (struct dadq *)dp, dad_list); nd6log((LOG_DEBUG, "%s: starting DAD for %s\n", if_name(ifa->ifa_ifp), ip6_sprintf(ip6buf, &ia->ia_addr.sin6_addr))); /* * Send NS packet for DAD, ip6_dad_count times. * Note that we must delay the first transmission, if this is the * first packet to be sent from the interface after interface * (re)initialization. */ dp->dad_ifa = ifa; IFAREF(ifa); /* just for safety */ dp->dad_count = V_ip6_dad_count; dp->dad_ns_icount = dp->dad_na_icount = 0; dp->dad_ns_ocount = dp->dad_ns_tcount = 0; if (delay == 0) { nd6_dad_ns_output(dp, ifa); nd6_dad_starttimer(dp, (long)ND_IFINFO(ifa->ifa_ifp)->retrans * hz / 1000); } else { nd6_dad_starttimer(dp, delay); } } /* * terminate DAD unconditionally. used for address removals. */ void nd6_dad_stop(struct ifaddr *ifa) { INIT_VNET_INET6(curvnet); struct dadq *dp; if (!V_dad_init) return; dp = nd6_dad_find(ifa); if (!dp) { /* DAD wasn't started yet */ return; } nd6_dad_stoptimer(dp); TAILQ_REMOVE(&V_dadq, (struct dadq *)dp, dad_list); free(dp, M_IP6NDP); dp = NULL; IFAFREE(ifa); } static void nd6_dad_timer(struct ifaddr *ifa) { CURVNET_SET(dp->dad_vnet); INIT_VNET_INET6(curvnet); int s; struct in6_ifaddr *ia = (struct in6_ifaddr *)ifa; struct dadq *dp; char ip6buf[INET6_ADDRSTRLEN]; s = splnet(); /* XXX */ /* Sanity check */ if (ia == NULL) { log(LOG_ERR, "nd6_dad_timer: called with null parameter\n"); goto done; } dp = nd6_dad_find(ifa); if (dp == NULL) { log(LOG_ERR, "nd6_dad_timer: DAD structure not found\n"); goto done; } if (ia->ia6_flags & IN6_IFF_DUPLICATED) { log(LOG_ERR, "nd6_dad_timer: called with duplicated address " "%s(%s)\n", ip6_sprintf(ip6buf, &ia->ia_addr.sin6_addr), ifa->ifa_ifp ? if_name(ifa->ifa_ifp) : "???"); goto done; } if ((ia->ia6_flags & IN6_IFF_TENTATIVE) == 0) { log(LOG_ERR, "nd6_dad_timer: called with non-tentative address " "%s(%s)\n", ip6_sprintf(ip6buf, &ia->ia_addr.sin6_addr), ifa->ifa_ifp ? if_name(ifa->ifa_ifp) : "???"); goto done; } /* timeouted with IFF_{RUNNING,UP} check */ if (dp->dad_ns_tcount > V_dad_maxtry) { nd6log((LOG_INFO, "%s: could not run DAD, driver problem?\n", if_name(ifa->ifa_ifp))); TAILQ_REMOVE(&V_dadq, (struct dadq *)dp, dad_list); free(dp, M_IP6NDP); dp = NULL; IFAFREE(ifa); goto done; } /* Need more checks? */ if (dp->dad_ns_ocount < dp->dad_count) { /* * We have more NS to go. Send NS packet for DAD. */ nd6_dad_ns_output(dp, ifa); nd6_dad_starttimer(dp, (long)ND_IFINFO(ifa->ifa_ifp)->retrans * hz / 1000); } else { /* * We have transmitted sufficient number of DAD packets. * See what we've got. */ int duplicate; duplicate = 0; if (dp->dad_na_icount) { /* * the check is in nd6_dad_na_input(), * but just in case */ duplicate++; } if (dp->dad_ns_icount) { /* We've seen NS, means DAD has failed. 
*/ duplicate++; } if (duplicate) { /* (*dp) will be freed in nd6_dad_duplicated() */ dp = NULL; nd6_dad_duplicated(ifa); } else { /* * We are done with DAD. No NA came, no NS came. * No duplicate address found. */ ia->ia6_flags &= ~IN6_IFF_TENTATIVE; nd6log((LOG_DEBUG, "%s: DAD complete for %s - no duplicates found\n", if_name(ifa->ifa_ifp), ip6_sprintf(ip6buf, &ia->ia_addr.sin6_addr))); TAILQ_REMOVE(&V_dadq, (struct dadq *)dp, dad_list); free(dp, M_IP6NDP); dp = NULL; IFAFREE(ifa); } } done: splx(s); CURVNET_RESTORE(); } void nd6_dad_duplicated(struct ifaddr *ifa) { INIT_VNET_INET6(curvnet); struct in6_ifaddr *ia = (struct in6_ifaddr *)ifa; struct ifnet *ifp; struct dadq *dp; char ip6buf[INET6_ADDRSTRLEN]; dp = nd6_dad_find(ifa); if (dp == NULL) { log(LOG_ERR, "nd6_dad_duplicated: DAD structure not found\n"); return; } log(LOG_ERR, "%s: DAD detected duplicate IPv6 address %s: " "NS in/out=%d/%d, NA in=%d\n", if_name(ifa->ifa_ifp), ip6_sprintf(ip6buf, &ia->ia_addr.sin6_addr), dp->dad_ns_icount, dp->dad_ns_ocount, dp->dad_na_icount); ia->ia6_flags &= ~IN6_IFF_TENTATIVE; ia->ia6_flags |= IN6_IFF_DUPLICATED; /* We are done with DAD, with duplicate address found. (failure) */ nd6_dad_stoptimer(dp); ifp = ifa->ifa_ifp; log(LOG_ERR, "%s: DAD complete for %s - duplicate found\n", if_name(ifp), ip6_sprintf(ip6buf, &ia->ia_addr.sin6_addr)); log(LOG_ERR, "%s: manual intervention required\n", if_name(ifp)); /* * If the address is a link-local address formed from an interface * identifier based on the hardware address which is supposed to be * uniquely assigned (e.g., EUI-64 for an Ethernet interface), IP * operation on the interface SHOULD be disabled. * [rfc2462bis-03 Section 5.4.5] */ if (IN6_IS_ADDR_LINKLOCAL(&ia->ia_addr.sin6_addr)) { struct in6_addr in6; /* * To avoid over-reaction, we only apply this logic when we are * very sure that hardware addresses are supposed to be unique. */ switch (ifp->if_type) { case IFT_ETHER: case IFT_FDDI: case IFT_ATM: case IFT_IEEE1394: #ifdef IFT_IEEE80211 case IFT_IEEE80211: #endif in6 = ia->ia_addr.sin6_addr; if (in6_get_hw_ifid(ifp, &in6) == 0 && IN6_ARE_ADDR_EQUAL(&ia->ia_addr.sin6_addr, &in6)) { ND_IFINFO(ifp)->flags |= ND6_IFF_IFDISABLED; log(LOG_ERR, "%s: possible hardware address " "duplication detected, disable IPv6\n", if_name(ifp)); } break; } } TAILQ_REMOVE(&V_dadq, (struct dadq *)dp, dad_list); free(dp, M_IP6NDP); dp = NULL; IFAFREE(ifa); } static void nd6_dad_ns_output(struct dadq *dp, struct ifaddr *ifa) { struct in6_ifaddr *ia = (struct in6_ifaddr *)ifa; struct ifnet *ifp = ifa->ifa_ifp; dp->dad_ns_tcount++; if ((ifp->if_flags & IFF_UP) == 0) { return; } if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0) { return; } dp->dad_ns_ocount++; nd6_ns_output(ifp, NULL, &ia->ia_addr.sin6_addr, NULL, 1); } static void nd6_dad_ns_input(struct ifaddr *ifa) { INIT_VNET_INET6(curvnet); struct in6_ifaddr *ia; struct ifnet *ifp; const struct in6_addr *taddr6; struct dadq *dp; int duplicate; if (ifa == NULL) panic("ifa == NULL in nd6_dad_ns_input"); ia = (struct in6_ifaddr *)ifa; ifp = ifa->ifa_ifp; taddr6 = &ia->ia_addr.sin6_addr; duplicate = 0; dp = nd6_dad_find(ifa); /* Quickhack - completely ignore DAD NS packets */ if (V_dad_ignore_ns) { char ip6buf[INET6_ADDRSTRLEN]; nd6log((LOG_INFO, "nd6_dad_ns_input: ignoring DAD NS packet for " "address %s(%s)\n", ip6_sprintf(ip6buf, taddr6), if_name(ifa->ifa_ifp))); return; } /* * if I'm yet to start DAD, someone else started using this address * first. I have a duplicate and you win. 
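 *
 * [Editorial note, not part of this change: the decision made by the
 *  code below can be summarized as
 *
 *      if (dp == NULL || dp->dad_ns_ocount == 0)
 *              -> duplicate: we have not sent any DAD NS ourselves yet,
 *                 so the NS must come from another node claiming the
 *                 address; nd6_dad_duplicated() is called.
 *      else
 *              -> ambiguous (possibly our own looped-back NS);
 *                 just bump dp->dad_ns_icount and let nd6_dad_timer()
 *                 decide once the probes are done.]
 *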
*/ if (dp == NULL || dp->dad_ns_ocount == 0) duplicate++; /* XXX more checks for loopback situation - see nd6_dad_timer too */ if (duplicate) { dp = NULL; /* will be freed in nd6_dad_duplicated() */ nd6_dad_duplicated(ifa); } else { /* * not sure if I got a duplicate. * increment ns count and see what happens. */ if (dp) dp->dad_ns_icount++; } } static void nd6_dad_na_input(struct ifaddr *ifa) { struct dadq *dp; if (ifa == NULL) panic("ifa == NULL in nd6_dad_na_input"); dp = nd6_dad_find(ifa); if (dp) dp->dad_na_icount++; /* remove the address. */ nd6_dad_duplicated(ifa); } Index: head/sys/netinet6/nd6_rtr.c =================================================================== --- head/sys/netinet6/nd6_rtr.c (revision 186118) +++ head/sys/netinet6/nd6_rtr.c (revision 186119) @@ -1,2118 +1,2125 @@ /*- * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the project nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* * $KAME: nd6_rtr.c,v 1.111 2001/04/27 01:37:15 jinmei Exp $ */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #include #include #include #include #define SDL(s) ((struct sockaddr_dl *)s) static int rtpref(struct nd_defrouter *); static struct nd_defrouter *defrtrlist_update(struct nd_defrouter *); static int prelist_update __P((struct nd_prefixctl *, struct nd_defrouter *, struct mbuf *, int)); static struct in6_ifaddr *in6_ifadd(struct nd_prefixctl *, int); static struct nd_pfxrouter *pfxrtr_lookup __P((struct nd_prefix *, struct nd_defrouter *)); static void pfxrtr_add(struct nd_prefix *, struct nd_defrouter *); static void pfxrtr_del(struct nd_pfxrouter *); static struct nd_pfxrouter *find_pfxlist_reachable_router (struct nd_prefix *); static void defrouter_delreq(struct nd_defrouter *); static void nd6_rtmsg(int, struct rtentry *); static int in6_init_prefix_ltimes(struct nd_prefix *); static void in6_init_address_ltimes __P((struct nd_prefix *, struct in6_addrlifetime *)); static int rt6_deleteroute(struct radix_node *, void *); #ifdef VIMAGE_GLOBALS extern int nd6_recalc_reachtm_interval; static struct ifnet *nd6_defifp; int nd6_defifindex; int ip6_use_tempaddr; int ip6_desync_factor; u_int32_t ip6_temp_preferred_lifetime; u_int32_t ip6_temp_valid_lifetime; int ip6_temp_regen_advance; #endif /* RTPREF_MEDIUM has to be 0! */ #define RTPREF_HIGH 1 #define RTPREF_MEDIUM 0 #define RTPREF_LOW (-1) #define RTPREF_RESERVED (-2) #define RTPREF_INVALID (-3) /* internal */ /* * Receive Router Solicitation Message - just for routers. * Router solicitation/advertisement is mostly managed by userland program * (rtadvd) so here we have no function like nd6_ra_output(). * * Based on RFC 2461 */ void nd6_rs_input(struct mbuf *m, int off, int icmp6len) { INIT_VNET_INET6(curvnet); struct ifnet *ifp = m->m_pkthdr.rcvif; struct ip6_hdr *ip6 = mtod(m, struct ip6_hdr *); struct nd_router_solicit *nd_rs; struct in6_addr saddr6 = ip6->ip6_src; char *lladdr = NULL; int lladdrlen = 0; union nd_opts ndopts; char ip6bufs[INET6_ADDRSTRLEN], ip6bufd[INET6_ADDRSTRLEN]; /* If I'm not a router, ignore it. */ if (V_ip6_accept_rtadv != 0 || V_ip6_forwarding != 1) goto freeit; /* Sanity checks */ if (ip6->ip6_hlim != 255) { nd6log((LOG_ERR, "nd6_rs_input: invalid hlim (%d) from %s to %s on %s\n", ip6->ip6_hlim, ip6_sprintf(ip6bufs, &ip6->ip6_src), ip6_sprintf(ip6bufd, &ip6->ip6_dst), if_name(ifp))); goto bad; } /* * Don't update the neighbor cache, if src = ::. * This indicates that the src has no IP address assigned yet. 
*/ if (IN6_IS_ADDR_UNSPECIFIED(&saddr6)) goto freeit; #ifndef PULLDOWN_TEST IP6_EXTHDR_CHECK(m, off, icmp6len,); nd_rs = (struct nd_router_solicit *)((caddr_t)ip6 + off); #else IP6_EXTHDR_GET(nd_rs, struct nd_router_solicit *, m, off, icmp6len); if (nd_rs == NULL) { V_icmp6stat.icp6s_tooshort++; return; } #endif icmp6len -= sizeof(*nd_rs); nd6_option_init(nd_rs + 1, icmp6len, &ndopts); if (nd6_options(&ndopts) < 0) { nd6log((LOG_INFO, "nd6_rs_input: invalid ND option, ignored\n")); /* nd6_options have incremented stats */ goto freeit; } if (ndopts.nd_opts_src_lladdr) { lladdr = (char *)(ndopts.nd_opts_src_lladdr + 1); lladdrlen = ndopts.nd_opts_src_lladdr->nd_opt_len << 3; } if (lladdr && ((ifp->if_addrlen + 2 + 7) & ~7) != lladdrlen) { nd6log((LOG_INFO, "nd6_rs_input: lladdrlen mismatch for %s " "(if %d, RS packet %d)\n", ip6_sprintf(ip6bufs, &saddr6), ifp->if_addrlen, lladdrlen - 2)); goto bad; } nd6_cache_lladdr(ifp, &saddr6, lladdr, lladdrlen, ND_ROUTER_SOLICIT, 0); freeit: m_freem(m); return; bad: V_icmp6stat.icp6s_badrs++; m_freem(m); } /* * Receive Router Advertisement Message. * * Based on RFC 2461 * TODO: on-link bit on prefix information * TODO: ND_RA_FLAG_{OTHER,MANAGED} processing */ void nd6_ra_input(struct mbuf *m, int off, int icmp6len) { INIT_VNET_INET6(curvnet); struct ifnet *ifp = m->m_pkthdr.rcvif; struct nd_ifinfo *ndi = ND_IFINFO(ifp); struct ip6_hdr *ip6 = mtod(m, struct ip6_hdr *); struct nd_router_advert *nd_ra; struct in6_addr saddr6 = ip6->ip6_src; int mcast = 0; union nd_opts ndopts; struct nd_defrouter *dr; char ip6bufs[INET6_ADDRSTRLEN], ip6bufd[INET6_ADDRSTRLEN]; /* * We only accept RAs only when * the system-wide variable allows the acceptance, and * per-interface variable allows RAs on the receiving interface. */ if (V_ip6_accept_rtadv == 0) goto freeit; if (!(ndi->flags & ND6_IFF_ACCEPT_RTADV)) goto freeit; if (ip6->ip6_hlim != 255) { nd6log((LOG_ERR, "nd6_ra_input: invalid hlim (%d) from %s to %s on %s\n", ip6->ip6_hlim, ip6_sprintf(ip6bufs, &ip6->ip6_src), ip6_sprintf(ip6bufd, &ip6->ip6_dst), if_name(ifp))); goto bad; } if (!IN6_IS_ADDR_LINKLOCAL(&saddr6)) { nd6log((LOG_ERR, "nd6_ra_input: src %s is not link-local\n", ip6_sprintf(ip6bufs, &saddr6))); goto bad; } #ifndef PULLDOWN_TEST IP6_EXTHDR_CHECK(m, off, icmp6len,); nd_ra = (struct nd_router_advert *)((caddr_t)ip6 + off); #else IP6_EXTHDR_GET(nd_ra, struct nd_router_advert *, m, off, icmp6len); if (nd_ra == NULL) { V_icmp6stat.icp6s_tooshort++; return; } #endif icmp6len -= sizeof(*nd_ra); nd6_option_init(nd_ra + 1, icmp6len, &ndopts); if (nd6_options(&ndopts) < 0) { nd6log((LOG_INFO, "nd6_ra_input: invalid ND option, ignored\n")); /* nd6_options have incremented stats */ goto freeit; } { struct nd_defrouter dr0; u_int32_t advreachable = nd_ra->nd_ra_reachable; /* remember if this is a multicasted advertisement */ if (IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst)) mcast = 1; bzero(&dr0, sizeof(dr0)); dr0.rtaddr = saddr6; dr0.flags = nd_ra->nd_ra_flags_reserved; dr0.rtlifetime = ntohs(nd_ra->nd_ra_router_lifetime); dr0.expire = time_second + dr0.rtlifetime; dr0.ifp = ifp; /* unspecified or not? 
(RFC 2461 6.3.4) */ if (advreachable) { advreachable = ntohl(advreachable); if (advreachable <= MAX_REACHABLE_TIME && ndi->basereachable != advreachable) { ndi->basereachable = advreachable; ndi->reachable = ND_COMPUTE_RTIME(ndi->basereachable); ndi->recalctm = V_nd6_recalc_reachtm_interval; /* reset */ } } if (nd_ra->nd_ra_retransmit) ndi->retrans = ntohl(nd_ra->nd_ra_retransmit); if (nd_ra->nd_ra_curhoplimit) ndi->chlim = nd_ra->nd_ra_curhoplimit; dr = defrtrlist_update(&dr0); } /* * prefix */ if (ndopts.nd_opts_pi) { struct nd_opt_hdr *pt; struct nd_opt_prefix_info *pi = NULL; struct nd_prefixctl pr; for (pt = (struct nd_opt_hdr *)ndopts.nd_opts_pi; pt <= (struct nd_opt_hdr *)ndopts.nd_opts_pi_end; pt = (struct nd_opt_hdr *)((caddr_t)pt + (pt->nd_opt_len << 3))) { if (pt->nd_opt_type != ND_OPT_PREFIX_INFORMATION) continue; pi = (struct nd_opt_prefix_info *)pt; if (pi->nd_opt_pi_len != 4) { nd6log((LOG_INFO, "nd6_ra_input: invalid option " "len %d for prefix information option, " "ignored\n", pi->nd_opt_pi_len)); continue; } if (128 < pi->nd_opt_pi_prefix_len) { nd6log((LOG_INFO, "nd6_ra_input: invalid prefix " "len %d for prefix information option, " "ignored\n", pi->nd_opt_pi_prefix_len)); continue; } if (IN6_IS_ADDR_MULTICAST(&pi->nd_opt_pi_prefix) || IN6_IS_ADDR_LINKLOCAL(&pi->nd_opt_pi_prefix)) { nd6log((LOG_INFO, "nd6_ra_input: invalid prefix " "%s, ignored\n", ip6_sprintf(ip6bufs, &pi->nd_opt_pi_prefix))); continue; } bzero(&pr, sizeof(pr)); pr.ndpr_prefix.sin6_family = AF_INET6; pr.ndpr_prefix.sin6_len = sizeof(pr.ndpr_prefix); pr.ndpr_prefix.sin6_addr = pi->nd_opt_pi_prefix; pr.ndpr_ifp = (struct ifnet *)m->m_pkthdr.rcvif; pr.ndpr_raf_onlink = (pi->nd_opt_pi_flags_reserved & ND_OPT_PI_FLAG_ONLINK) ? 1 : 0; pr.ndpr_raf_auto = (pi->nd_opt_pi_flags_reserved & ND_OPT_PI_FLAG_AUTO) ? 1 : 0; pr.ndpr_plen = pi->nd_opt_pi_prefix_len; pr.ndpr_vltime = ntohl(pi->nd_opt_pi_valid_time); pr.ndpr_pltime = ntohl(pi->nd_opt_pi_preferred_time); (void)prelist_update(&pr, dr, m, mcast); } } /* * MTU */ if (ndopts.nd_opts_mtu && ndopts.nd_opts_mtu->nd_opt_mtu_len == 1) { u_long mtu; u_long maxmtu; mtu = (u_long)ntohl(ndopts.nd_opts_mtu->nd_opt_mtu_mtu); /* lower bound */ if (mtu < IPV6_MMTU) { nd6log((LOG_INFO, "nd6_ra_input: bogus mtu option " "mtu=%lu sent from %s, ignoring\n", mtu, ip6_sprintf(ip6bufs, &ip6->ip6_src))); goto skip; } /* upper bound */ maxmtu = (ndi->maxmtu && ndi->maxmtu < ifp->if_mtu) ? ndi->maxmtu : ifp->if_mtu; if (mtu <= maxmtu) { int change = (ndi->linkmtu != mtu); ndi->linkmtu = mtu; if (change) /* in6_maxmtu may change */ in6_setmaxmtu(); } else { nd6log((LOG_INFO, "nd6_ra_input: bogus mtu " "mtu=%lu sent from %s; " "exceeds maxmtu %lu, ignoring\n", mtu, ip6_sprintf(ip6bufs, &ip6->ip6_src), maxmtu)); } } skip: /* * Source link layer address */ { char *lladdr = NULL; int lladdrlen = 0; if (ndopts.nd_opts_src_lladdr) { lladdr = (char *)(ndopts.nd_opts_src_lladdr + 1); lladdrlen = ndopts.nd_opts_src_lladdr->nd_opt_len << 3; } if (lladdr && ((ifp->if_addrlen + 2 + 7) & ~7) != lladdrlen) { nd6log((LOG_INFO, "nd6_ra_input: lladdrlen mismatch for %s " "(if %d, RA packet %d)\n", ip6_sprintf(ip6bufs, &saddr6), ifp->if_addrlen, lladdrlen - 2)); goto bad; } nd6_cache_lladdr(ifp, &saddr6, lladdr, lladdrlen, ND_ROUTER_ADVERT, 0); /* * Installing a link-layer address might change the state of the * router's neighbor cache, which might also affect our on-link * detection of adveritsed prefixes. 
*/ pfxlist_onlink_check(); } freeit: m_freem(m); return; bad: V_icmp6stat.icp6s_badra++; m_freem(m); } /* * default router list proccessing sub routines */ /* tell the change to user processes watching the routing socket. */ static void nd6_rtmsg(int cmd, struct rtentry *rt) { struct rt_addrinfo info; bzero((caddr_t)&info, sizeof(info)); info.rti_info[RTAX_DST] = rt_key(rt); info.rti_info[RTAX_GATEWAY] = rt->rt_gateway; info.rti_info[RTAX_NETMASK] = rt_mask(rt); if (rt->rt_ifp) { info.rti_info[RTAX_IFP] = TAILQ_FIRST(&rt->rt_ifp->if_addrlist)->ifa_addr; info.rti_info[RTAX_IFA] = rt->rt_ifa->ifa_addr; } rt_missmsg(cmd, &info, rt->rt_flags, 0); } void defrouter_addreq(struct nd_defrouter *new) { struct sockaddr_in6 def, mask, gate; struct rtentry *newrt = NULL; int s; int error; bzero(&def, sizeof(def)); bzero(&mask, sizeof(mask)); bzero(&gate, sizeof(gate)); def.sin6_len = mask.sin6_len = gate.sin6_len = sizeof(struct sockaddr_in6); def.sin6_family = gate.sin6_family = AF_INET6; gate.sin6_addr = new->rtaddr; s = splnet(); error = rtrequest(RTM_ADD, (struct sockaddr *)&def, (struct sockaddr *)&gate, (struct sockaddr *)&mask, RTF_GATEWAY, &newrt); if (newrt) { - RT_LOCK(newrt); nd6_rtmsg(RTM_ADD, newrt); /* tell user process */ - RT_REMREF(newrt); - RT_UNLOCK(newrt); + RTFREE(newrt); } if (error == 0) new->installed = 1; splx(s); return; } struct nd_defrouter * defrouter_lookup(struct in6_addr *addr, struct ifnet *ifp) { INIT_VNET_INET6(ifp->if_vnet); struct nd_defrouter *dr; for (dr = TAILQ_FIRST(&V_nd_defrouter); dr; dr = TAILQ_NEXT(dr, dr_entry)) { if (dr->ifp == ifp && IN6_ARE_ADDR_EQUAL(addr, &dr->rtaddr)) return (dr); } return (NULL); /* search failed */ } /* * Remove the default route for a given router. * This is just a subroutine function for defrouter_select(), and should * not be called from anywhere else. */ static void defrouter_delreq(struct nd_defrouter *dr) { struct sockaddr_in6 def, mask, gate; struct rtentry *oldrt = NULL; bzero(&def, sizeof(def)); bzero(&mask, sizeof(mask)); bzero(&gate, sizeof(gate)); def.sin6_len = mask.sin6_len = gate.sin6_len = sizeof(struct sockaddr_in6); def.sin6_family = gate.sin6_family = AF_INET6; gate.sin6_addr = dr->rtaddr; rtrequest(RTM_DELETE, (struct sockaddr *)&def, (struct sockaddr *)&gate, (struct sockaddr *)&mask, RTF_GATEWAY, &oldrt); if (oldrt) { nd6_rtmsg(RTM_DELETE, oldrt); RTFREE(oldrt); } dr->installed = 0; } /* * remove all default routes from default router list */ void defrouter_reset(void) { INIT_VNET_INET6(curvnet); struct nd_defrouter *dr; for (dr = TAILQ_FIRST(&V_nd_defrouter); dr; dr = TAILQ_NEXT(dr, dr_entry)) defrouter_delreq(dr); /* * XXX should we also nuke any default routers in the kernel, by * going through them by rtalloc1()? */ } void defrtrlist_del(struct nd_defrouter *dr) { INIT_VNET_INET6(curvnet); struct nd_defrouter *deldr = NULL; struct nd_prefix *pr; /* * Flush all the routing table entries that use the router * as a next hop. */ if (!V_ip6_forwarding && V_ip6_accept_rtadv) /* XXX: better condition? */ rt6_flush(&dr->rtaddr, dr->ifp); if (dr->installed) { deldr = dr; defrouter_delreq(dr); } TAILQ_REMOVE(&V_nd_defrouter, dr, dr_entry); /* * Also delete all the pointers to the router in each prefix lists. */ for (pr = V_nd_prefix.lh_first; pr; pr = pr->ndpr_next) { struct nd_pfxrouter *pfxrtr; if ((pfxrtr = pfxrtr_lookup(pr, dr)) != NULL) pfxrtr_del(pfxrtr); } pfxlist_onlink_check(); /* * If the router is the primary one, choose a new one. 
* Note that defrouter_select() will remove the current gateway * from the routing table. */ if (deldr) defrouter_select(); free(dr, M_IP6NDP); } /* * Default Router Selection according to Section 6.3.6 of RFC 2461 and * draft-ietf-ipngwg-router-selection: * 1) Routers that are reachable or probably reachable should be preferred. * If we have more than one (probably) reachable router, prefer ones * with the highest router preference. * 2) When no routers on the list are known to be reachable or * probably reachable, routers SHOULD be selected in a round-robin * fashion, regardless of router preference values. * 3) If the Default Router List is empty, assume that all * destinations are on-link. * * We assume nd_defrouter is sorted by router preference value. * Since the code below covers both with and without router preference cases, * we do not need to classify the cases by ifdef. * * At this moment, we do not try to install more than one default router, * even when the multipath routing is available, because we're not sure about * the benefits for stub hosts comparing to the risk of making the code * complicated and the possibility of introducing bugs. */ void defrouter_select(void) { INIT_VNET_INET6(curvnet); int s = splnet(); struct nd_defrouter *dr, *selected_dr = NULL, *installed_dr = NULL; - struct rtentry *rt = NULL; - struct llinfo_nd6 *ln = NULL; + struct llentry *ln = NULL; /* * This function should be called only when acting as an autoconfigured * host. Although the remaining part of this function is not effective * if the node is not an autoconfigured host, we explicitly exclude * such cases here for safety. */ if (V_ip6_forwarding || !V_ip6_accept_rtadv) { nd6log((LOG_WARNING, "defrouter_select: called unexpectedly (forwarding=%d, " "accept_rtadv=%d)\n", V_ip6_forwarding, V_ip6_accept_rtadv)); splx(s); return; } /* * Let's handle easy case (3) first: * If default router list is empty, there's nothing to be done. */ if (!TAILQ_FIRST(&V_nd_defrouter)) { splx(s); return; } /* * Search for a (probably) reachable router from the list. * We just pick up the first reachable one (if any), assuming that * the ordering rule of the list described in defrtrlist_update(). */ for (dr = TAILQ_FIRST(&V_nd_defrouter); dr; dr = TAILQ_NEXT(dr, dr_entry)) { + IF_AFDATA_LOCK(dr->ifp); if (selected_dr == NULL && - (rt = nd6_lookup(&dr->rtaddr, 0, dr->ifp)) && - (ln = (struct llinfo_nd6 *)rt->rt_llinfo) && + (ln = nd6_lookup(&dr->rtaddr, 0, dr->ifp)) && ND6_IS_LLINFO_PROBREACH(ln)) { selected_dr = dr; } + IF_AFDATA_UNLOCK(dr->ifp); if (dr->installed && installed_dr == NULL) installed_dr = dr; else if (dr->installed && installed_dr) { /* this should not happen. warn for diagnosis. */ log(LOG_ERR, "defrouter_select: more than one router" " is installed\n"); } } /* * If none of the default routers was found to be reachable, * round-robin the list regardless of preference. * Otherwise, if we have an installed router, check if the selected * (reachable) router should really be preferred to the installed one. * We only prefer the new router when the old one is not reachable * or when the new one has a really higher preference value. 
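 *
 * [Editorial note, not part of this change: the selection below boils
 *  down to
 *
 *      if (selected_dr == NULL)                -- nothing reachable
 *              selected_dr = next entry after installed_dr,
 *                  or the list head;           -- round-robin
 *      else if (installed_dr is still probably reachable &&
 *          rtpref(selected_dr) <= rtpref(installed_dr))
 *              selected_dr = installed_dr;     -- keep the old one
 *
 *  followed by defrouter_delreq()/defrouter_addreq() only when the
 *  installed and selected routers differ.]
 *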
*/ if (selected_dr == NULL) { if (installed_dr == NULL || !TAILQ_NEXT(installed_dr, dr_entry)) selected_dr = TAILQ_FIRST(&V_nd_defrouter); else selected_dr = TAILQ_NEXT(installed_dr, dr_entry); - } else if (installed_dr && - (rt = nd6_lookup(&installed_dr->rtaddr, 0, installed_dr->ifp)) && - (ln = (struct llinfo_nd6 *)rt->rt_llinfo) && - ND6_IS_LLINFO_PROBREACH(ln) && - rtpref(selected_dr) <= rtpref(installed_dr)) { - selected_dr = installed_dr; + } else if (installed_dr) { + IF_AFDATA_LOCK(installed_dr->ifp); + if ((ln = nd6_lookup(&installed_dr->rtaddr, 0, installed_dr->ifp)) && + ND6_IS_LLINFO_PROBREACH(ln) && + rtpref(selected_dr) <= rtpref(installed_dr)) { + selected_dr = installed_dr; + } + IF_AFDATA_UNLOCK(installed_dr->ifp); } /* * If the selected router is different than the installed one, * remove the installed router and install the selected one. * Note that the selected router is never NULL here. */ if (installed_dr != selected_dr) { if (installed_dr) defrouter_delreq(installed_dr); defrouter_addreq(selected_dr); } splx(s); return; } /* * for default router selection * regards router-preference field as a 2-bit signed integer */ static int rtpref(struct nd_defrouter *dr) { switch (dr->flags & ND_RA_FLAG_RTPREF_MASK) { case ND_RA_FLAG_RTPREF_HIGH: return (RTPREF_HIGH); case ND_RA_FLAG_RTPREF_MEDIUM: case ND_RA_FLAG_RTPREF_RSV: return (RTPREF_MEDIUM); case ND_RA_FLAG_RTPREF_LOW: return (RTPREF_LOW); default: /* * This case should never happen. If it did, it would mean a * serious bug of kernel internal. We thus always bark here. * Or, can we even panic? */ log(LOG_ERR, "rtpref: impossible RA flag %x\n", dr->flags); return (RTPREF_INVALID); } /* NOTREACHED */ } static struct nd_defrouter * defrtrlist_update(struct nd_defrouter *new) { INIT_VNET_INET6(curvnet); struct nd_defrouter *dr, *n; int s = splnet(); if ((dr = defrouter_lookup(&new->rtaddr, new->ifp)) != NULL) { /* entry exists */ if (new->rtlifetime == 0) { defrtrlist_del(dr); dr = NULL; } else { int oldpref = rtpref(dr); /* override */ dr->flags = new->flags; /* xxx flag check */ dr->rtlifetime = new->rtlifetime; dr->expire = new->expire; /* * If the preference does not change, there's no need * to sort the entries. */ if (rtpref(new) == oldpref) { splx(s); return (dr); } /* * preferred router may be changed, so relocate * this router. * XXX: calling TAILQ_REMOVE directly is a bad manner. * However, since defrtrlist_del() has many side * effects, we intentionally do so here. * defrouter_select() below will handle routing * changes later. */ TAILQ_REMOVE(&V_nd_defrouter, dr, dr_entry); n = dr; goto insert; } splx(s); return (dr); } /* entry does not exist */ if (new->rtlifetime == 0) { splx(s); return (NULL); } n = (struct nd_defrouter *)malloc(sizeof(*n), M_IP6NDP, M_NOWAIT); if (n == NULL) { splx(s); return (NULL); } bzero(n, sizeof(*n)); *n = *new; insert: /* * Insert the new router in the Default Router List; * The Default Router List should be in the descending order * of router-preferece. Routers with the same preference are * sorted in the arriving time order. 
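 *
 * [Editorial note, not part of this change: rtpref() above maps the
 *  2-bit router-preference field to
 *
 *      ND_RA_FLAG_RTPREF_HIGH                  ->  1 (RTPREF_HIGH)
 *      ND_RA_FLAG_RTPREF_MEDIUM / _RSV         ->  0 (RTPREF_MEDIUM)
 *      ND_RA_FLAG_RTPREF_LOW                   -> -1 (RTPREF_LOW)
 *
 *  so the loop below, which breaks at the first entry with a strictly
 *  lower rtpref() and inserts before it, effectively appends the new
 *  router at the tail of its own preference group.]
 *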
*/ /* insert at the end of the group */ for (dr = TAILQ_FIRST(&V_nd_defrouter); dr; dr = TAILQ_NEXT(dr, dr_entry)) { if (rtpref(n) > rtpref(dr)) break; } if (dr) TAILQ_INSERT_BEFORE(dr, n, dr_entry); else TAILQ_INSERT_TAIL(&V_nd_defrouter, n, dr_entry); defrouter_select(); splx(s); return (n); } static struct nd_pfxrouter * pfxrtr_lookup(struct nd_prefix *pr, struct nd_defrouter *dr) { struct nd_pfxrouter *search; for (search = pr->ndpr_advrtrs.lh_first; search; search = search->pfr_next) { if (search->router == dr) break; } return (search); } static void pfxrtr_add(struct nd_prefix *pr, struct nd_defrouter *dr) { struct nd_pfxrouter *new; new = (struct nd_pfxrouter *)malloc(sizeof(*new), M_IP6NDP, M_NOWAIT); if (new == NULL) return; bzero(new, sizeof(*new)); new->router = dr; LIST_INSERT_HEAD(&pr->ndpr_advrtrs, new, pfr_entry); pfxlist_onlink_check(); } static void pfxrtr_del(struct nd_pfxrouter *pfr) { LIST_REMOVE(pfr, pfr_entry); free(pfr, M_IP6NDP); } struct nd_prefix * nd6_prefix_lookup(struct nd_prefixctl *key) { INIT_VNET_INET6(curvnet); struct nd_prefix *search; for (search = V_nd_prefix.lh_first; search; search = search->ndpr_next) { if (key->ndpr_ifp == search->ndpr_ifp && key->ndpr_plen == search->ndpr_plen && in6_are_prefix_equal(&key->ndpr_prefix.sin6_addr, &search->ndpr_prefix.sin6_addr, key->ndpr_plen)) { break; } } return (search); } int nd6_prelist_add(struct nd_prefixctl *pr, struct nd_defrouter *dr, struct nd_prefix **newp) { INIT_VNET_INET6(curvnet); struct nd_prefix *new = NULL; int error = 0; int i, s; char ip6buf[INET6_ADDRSTRLEN]; new = (struct nd_prefix *)malloc(sizeof(*new), M_IP6NDP, M_NOWAIT); if (new == NULL) return(ENOMEM); bzero(new, sizeof(*new)); new->ndpr_ifp = pr->ndpr_ifp; new->ndpr_prefix = pr->ndpr_prefix; new->ndpr_plen = pr->ndpr_plen; new->ndpr_vltime = pr->ndpr_vltime; new->ndpr_pltime = pr->ndpr_pltime; new->ndpr_flags = pr->ndpr_flags; if ((error = in6_init_prefix_ltimes(new)) != 0) { free(new, M_IP6NDP); return(error); } new->ndpr_lastupdate = time_second; if (newp != NULL) *newp = new; /* initialization */ LIST_INIT(&new->ndpr_advrtrs); in6_prefixlen2mask(&new->ndpr_mask, new->ndpr_plen); /* make prefix in the canonical form */ for (i = 0; i < 4; i++) new->ndpr_prefix.sin6_addr.s6_addr32[i] &= new->ndpr_mask.s6_addr32[i]; s = splnet(); /* link ndpr_entry to nd_prefix list */ LIST_INSERT_HEAD(&V_nd_prefix, new, ndpr_entry); splx(s); /* ND_OPT_PI_FLAG_ONLINK processing */ if (new->ndpr_raf_onlink) { int e; if ((e = nd6_prefix_onlink(new)) != 0) { nd6log((LOG_ERR, "nd6_prelist_add: failed to make " "the prefix %s/%d on-link on %s (errno=%d)\n", ip6_sprintf(ip6buf, &pr->ndpr_prefix.sin6_addr), pr->ndpr_plen, if_name(pr->ndpr_ifp), e)); /* proceed anyway. XXX: is it correct? */ } } if (dr) pfxrtr_add(new, dr); return 0; } void prelist_remove(struct nd_prefix *pr) { INIT_VNET_INET6(curvnet); struct nd_pfxrouter *pfr, *next; int e, s; char ip6buf[INET6_ADDRSTRLEN]; /* make sure to invalidate the prefix until it is really freed. */ pr->ndpr_vltime = 0; pr->ndpr_pltime = 0; /* * Though these flags are now meaningless, we'd rather keep the value * of pr->ndpr_raf_onlink and pr->ndpr_raf_auto not to confuse users * when executing "ndp -p". */ if ((pr->ndpr_stateflags & NDPRF_ONLINK) != 0 && (e = nd6_prefix_offlink(pr)) != 0) { nd6log((LOG_ERR, "prelist_remove: failed to make %s/%d offlink " "on %s, errno=%d\n", ip6_sprintf(ip6buf, &pr->ndpr_prefix.sin6_addr), pr->ndpr_plen, if_name(pr->ndpr_ifp), e)); /* what should we do? 
*/ } if (pr->ndpr_refcnt > 0) return; /* notice here? */ s = splnet(); /* unlink ndpr_entry from nd_prefix list */ LIST_REMOVE(pr, ndpr_entry); /* free list of routers that adversed the prefix */ for (pfr = pr->ndpr_advrtrs.lh_first; pfr; pfr = next) { next = pfr->pfr_next; free(pfr, M_IP6NDP); } splx(s); free(pr, M_IP6NDP); pfxlist_onlink_check(); } /* * dr - may be NULL */ static int prelist_update(struct nd_prefixctl *new, struct nd_defrouter *dr, struct mbuf *m, int mcast) { INIT_VNET_INET6(curvnet); struct in6_ifaddr *ia6 = NULL, *ia6_match = NULL; struct ifaddr *ifa; struct ifnet *ifp = new->ndpr_ifp; struct nd_prefix *pr; int s = splnet(); int error = 0; int newprefix = 0; int auth; struct in6_addrlifetime lt6_tmp; char ip6buf[INET6_ADDRSTRLEN]; auth = 0; if (m) { /* * Authenticity for NA consists authentication for * both IP header and IP datagrams, doesn't it ? */ #if defined(M_AUTHIPHDR) && defined(M_AUTHIPDGM) auth = ((m->m_flags & M_AUTHIPHDR) && (m->m_flags & M_AUTHIPDGM)); #endif } if ((pr = nd6_prefix_lookup(new)) != NULL) { /* * nd6_prefix_lookup() ensures that pr and new have the same * prefix on a same interface. */ /* * Update prefix information. Note that the on-link (L) bit * and the autonomous (A) bit should NOT be changed from 1 * to 0. */ if (new->ndpr_raf_onlink == 1) pr->ndpr_raf_onlink = 1; if (new->ndpr_raf_auto == 1) pr->ndpr_raf_auto = 1; if (new->ndpr_raf_onlink) { pr->ndpr_vltime = new->ndpr_vltime; pr->ndpr_pltime = new->ndpr_pltime; (void)in6_init_prefix_ltimes(pr); /* XXX error case? */ pr->ndpr_lastupdate = time_second; } if (new->ndpr_raf_onlink && (pr->ndpr_stateflags & NDPRF_ONLINK) == 0) { int e; if ((e = nd6_prefix_onlink(pr)) != 0) { nd6log((LOG_ERR, "prelist_update: failed to make " "the prefix %s/%d on-link on %s " "(errno=%d)\n", ip6_sprintf(ip6buf, &pr->ndpr_prefix.sin6_addr), pr->ndpr_plen, if_name(pr->ndpr_ifp), e)); /* proceed anyway. XXX: is it correct? */ } } if (dr && pfxrtr_lookup(pr, dr) == NULL) pfxrtr_add(pr, dr); } else { struct nd_prefix *newpr = NULL; newprefix = 1; if (new->ndpr_vltime == 0) goto end; if (new->ndpr_raf_onlink == 0 && new->ndpr_raf_auto == 0) goto end; error = nd6_prelist_add(new, dr, &newpr); if (error != 0 || newpr == NULL) { nd6log((LOG_NOTICE, "prelist_update: " "nd6_prelist_add failed for %s/%d on %s " "errno=%d, returnpr=%p\n", ip6_sprintf(ip6buf, &new->ndpr_prefix.sin6_addr), new->ndpr_plen, if_name(new->ndpr_ifp), error, newpr)); goto end; /* we should just give up in this case. */ } /* * XXX: from the ND point of view, we can ignore a prefix * with the on-link bit being zero. However, we need a * prefix structure for references from autoconfigured * addresses. Thus, we explicitly make sure that the prefix * itself expires now. */ if (newpr->ndpr_raf_onlink == 0) { newpr->ndpr_vltime = 0; newpr->ndpr_pltime = 0; in6_init_prefix_ltimes(newpr); } pr = newpr; } /* * Address autoconfiguration based on Section 5.5.3 of RFC 2462. * Note that pr must be non NULL at this point. */ /* 5.5.3 (a). Ignore the prefix without the A bit set. */ if (!new->ndpr_raf_auto) goto end; /* * 5.5.3 (b). the link-local prefix should have been ignored in * nd6_ra_input. */ /* 5.5.3 (c). Consistency check on lifetimes: pltime <= vltime. */ if (new->ndpr_pltime > new->ndpr_vltime) { error = EINVAL; /* XXX: won't be used */ goto end; } /* * 5.5.3 (d). 
If the prefix advertised is not equal to the prefix of * an address configured by stateless autoconfiguration already in the * list of addresses associated with the interface, and the Valid * Lifetime is not 0, form an address. We first check if we have * a matching prefix. * Note: we apply a clarification in rfc2462bis-02 here. We only * consider autoconfigured addresses while RFC2462 simply said * "address". */ TAILQ_FOREACH(ifa, &ifp->if_addrlist, ifa_list) { struct in6_ifaddr *ifa6; u_int32_t remaininglifetime; if (ifa->ifa_addr->sa_family != AF_INET6) continue; ifa6 = (struct in6_ifaddr *)ifa; /* * We only consider autoconfigured addresses as per rfc2462bis. */ if (!(ifa6->ia6_flags & IN6_IFF_AUTOCONF)) continue; /* * Spec is not clear here, but I believe we should concentrate * on unicast (i.e. not anycast) addresses. * XXX: other ia6_flags? detached or duplicated? */ if ((ifa6->ia6_flags & IN6_IFF_ANYCAST) != 0) continue; /* * Ignore the address if it is not associated with a prefix * or is associated with a prefix that is different from this * one. (pr is never NULL here) */ if (ifa6->ia6_ndpr != pr) continue; if (ia6_match == NULL) /* remember the first one */ ia6_match = ifa6; /* * An already autoconfigured address matched. Now that we * are sure there is at least one matched address, we can * proceed to 5.5.3. (e): update the lifetimes according to the * "two hours" rule and the privacy extension. * We apply some clarifications in rfc2462bis: * - use remaininglifetime instead of storedlifetime as a * variable name * - remove the dead code in the "two-hour" rule */ #define TWOHOUR (120*60) lt6_tmp = ifa6->ia6_lifetime; if (lt6_tmp.ia6t_vltime == ND6_INFINITE_LIFETIME) remaininglifetime = ND6_INFINITE_LIFETIME; else if (time_second - ifa6->ia6_updatetime > lt6_tmp.ia6t_vltime) { /* * The case of "invalid" address. We should usually * not see this case. */ remaininglifetime = 0; } else remaininglifetime = lt6_tmp.ia6t_vltime - (time_second - ifa6->ia6_updatetime); /* when not updating, keep the current stored lifetime. */ lt6_tmp.ia6t_vltime = remaininglifetime; if (TWOHOUR < new->ndpr_vltime || remaininglifetime < new->ndpr_vltime) { lt6_tmp.ia6t_vltime = new->ndpr_vltime; } else if (remaininglifetime <= TWOHOUR) { if (auth) { lt6_tmp.ia6t_vltime = new->ndpr_vltime; } } else { /* * new->ndpr_vltime <= TWOHOUR && * TWOHOUR < remaininglifetime */ lt6_tmp.ia6t_vltime = TWOHOUR; } /* The 2 hour rule is not imposed for preferred lifetime. */ lt6_tmp.ia6t_pltime = new->ndpr_pltime; in6_init_address_ltimes(pr, <6_tmp); /* * We need to treat lifetimes for temporary addresses * differently, according to * draft-ietf-ipv6-privacy-addrs-v2-01.txt 3.3 (1); * we only update the lifetimes when they are in the maximum * intervals. 
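
The lifetime update applied above is the "two hours" rule of RFC 2462 Section 5.5.3 (e): an RA may always extend the remaining Valid Lifetime, but an unauthenticated RA may only shorten it down to two hours. A standalone sketch of that decision (function and variable names are illustrative only):

        #include <stdio.h>

        #define TWOHOUR (120 * 60)

        /*
         * Return the new valid lifetime, given the lifetime remaining on the
         * address and the lifetime advertised in the RA; 'auth' is nonzero
         * if the RA was authenticated.  Mirrors the rule in prelist_update().
         */
        static unsigned int
        apply_two_hour_rule(unsigned int remaining, unsigned int advertised,
            int auth)
        {
                if (advertised > TWOHOUR || advertised > remaining)
                        return (advertised);    /* extending, or already short */
                if (remaining <= TWOHOUR)
                        return (auth ? advertised : remaining);
                return (TWOHOUR);       /* unauthenticated shortening is capped */
        }

        int
        main(void)
        {
                printf("%u\n", apply_two_hour_rule(86400, 600, 0));  /* 7200 */
                printf("%u\n", apply_two_hour_rule(3600, 600, 0));   /* 3600 */
                printf("%u\n", apply_two_hour_rule(3600, 600, 1));   /* 600 */
                printf("%u\n", apply_two_hour_rule(600, 86400, 0));  /* 86400 */
                return (0);
        }
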
*/ if ((ifa6->ia6_flags & IN6_IFF_TEMPORARY) != 0) { u_int32_t maxvltime, maxpltime; if (V_ip6_temp_valid_lifetime > (u_int32_t)((time_second - ifa6->ia6_createtime) + V_ip6_desync_factor)) { maxvltime = V_ip6_temp_valid_lifetime - (time_second - ifa6->ia6_createtime) - V_ip6_desync_factor; } else maxvltime = 0; if (V_ip6_temp_preferred_lifetime > (u_int32_t)((time_second - ifa6->ia6_createtime) + V_ip6_desync_factor)) { maxpltime = V_ip6_temp_preferred_lifetime - (time_second - ifa6->ia6_createtime) - V_ip6_desync_factor; } else maxpltime = 0; if (lt6_tmp.ia6t_vltime == ND6_INFINITE_LIFETIME || lt6_tmp.ia6t_vltime > maxvltime) { lt6_tmp.ia6t_vltime = maxvltime; } if (lt6_tmp.ia6t_pltime == ND6_INFINITE_LIFETIME || lt6_tmp.ia6t_pltime > maxpltime) { lt6_tmp.ia6t_pltime = maxpltime; } } ifa6->ia6_lifetime = lt6_tmp; ifa6->ia6_updatetime = time_second; } if (ia6_match == NULL && new->ndpr_vltime) { int ifidlen; /* * 5.5.3 (d) (continued) * No address matched and the valid lifetime is non-zero. * Create a new address. */ /* * Prefix Length check: * If the sum of the prefix length and interface identifier * length does not equal 128 bits, the Prefix Information * option MUST be ignored. The length of the interface * identifier is defined in a separate link-type specific * document. */ ifidlen = in6_if2idlen(ifp); if (ifidlen < 0) { /* this should not happen, so we always log it. */ log(LOG_ERR, "prelist_update: IFID undefined (%s)\n", if_name(ifp)); goto end; } if (ifidlen + pr->ndpr_plen != 128) { nd6log((LOG_INFO, "prelist_update: invalid prefixlen " "%d for %s, ignored\n", pr->ndpr_plen, if_name(ifp))); goto end; } if ((ia6 = in6_ifadd(new, mcast)) != NULL) { /* * note that we should use pr (not new) for reference. */ pr->ndpr_refcnt++; ia6->ia6_ndpr = pr; /* * RFC 3041 3.3 (2). * When a new public address is created as described * in RFC2462, also create a new temporary address. * * RFC 3041 3.5. * When an interface connects to a new link, a new * randomized interface identifier should be generated * immediately together with a new set of temporary * addresses. Thus, we specifiy 1 as the 2nd arg of * in6_tmpifadd(). */ if (V_ip6_use_tempaddr) { int e; if ((e = in6_tmpifadd(ia6, 1, 1)) != 0) { nd6log((LOG_NOTICE, "prelist_update: " "failed to create a temporary " "address, errno=%d\n", e)); } } /* * A newly added address might affect the status * of other addresses, so we check and update it. * XXX: what if address duplication happens? */ pfxlist_onlink_check(); } else { /* just set an error. do not bark here. */ error = EADDRNOTAVAIL; /* XXX: might be unused. */ } } end: splx(s); return error; } /* * A supplement function used in the on-link detection below; * detect if a given prefix has a (probably) reachable advertising router. * XXX: lengthy function name... 
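
The temporary-address handling above caps the advertised lifetimes at the configured maxima minus the address age and the desync factor, per RFC 3041, so a temporary address cannot outlive its regeneration schedule. A rough, self-contained illustration of that clamp (names are invented, not the ip6_temp_* globals):

        #include <stdio.h>
        #include <time.h>

        /*
         * Cap a lifetime for a temporary address: it may not exceed
         * max_lifetime minus the address age minus the desync factor,
         * and never goes negative.
         */
        static unsigned int
        clamp_temp_lifetime(unsigned int wanted, unsigned int max_lifetime,
            time_t now, time_t createtime, unsigned int desync)
        {
                unsigned int age = (unsigned int)(now - createtime);
                unsigned int ceiling;

                if (max_lifetime > age + desync)
                        ceiling = max_lifetime - age - desync;
                else
                        ceiling = 0;
                return (wanted > ceiling ? ceiling : wanted);
        }

        int
        main(void)
        {
                time_t now = time(NULL);

                /* created a day ago, 7-day maximum, 10-minute desync factor */
                printf("%u\n", clamp_temp_lifetime(604800, 604800,
                    now, now - 86400, 600));    /* 517800 */
                return (0);
        }
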
*/ static struct nd_pfxrouter * find_pfxlist_reachable_router(struct nd_prefix *pr) { struct nd_pfxrouter *pfxrtr; - struct rtentry *rt; - struct llinfo_nd6 *ln; + struct llentry *ln; for (pfxrtr = LIST_FIRST(&pr->ndpr_advrtrs); pfxrtr; pfxrtr = LIST_NEXT(pfxrtr, pfr_entry)) { - if ((rt = nd6_lookup(&pfxrtr->router->rtaddr, 0, + IF_AFDATA_LOCK(pfxrtr->router->ifp); + if ((ln = nd6_lookup(&pfxrtr->router->rtaddr, 0, pfxrtr->router->ifp)) && - (ln = (struct llinfo_nd6 *)rt->rt_llinfo) && - ND6_IS_LLINFO_PROBREACH(ln)) + ND6_IS_LLINFO_PROBREACH(ln)) { + IF_AFDATA_UNLOCK(pfxrtr->router->ifp); break; /* found */ + } + IF_AFDATA_UNLOCK(pfxrtr->router->ifp); } - return (pfxrtr); } /* * Check if each prefix in the prefix list has at least one available router * that advertised the prefix (a router is "available" if its neighbor cache * entry is reachable or probably reachable). * If the check fails, the prefix may be off-link, because, for example, * we have moved from the network but the lifetime of the prefix has not * expired yet. So we should not use the prefix if there is another prefix * that has an available router. * But, if there is no prefix that has an available router, we still regards * all the prefixes as on-link. This is because we can't tell if all the * routers are simply dead or if we really moved from the network and there * is no router around us. */ void pfxlist_onlink_check() { INIT_VNET_INET6(curvnet); struct nd_prefix *pr; struct in6_ifaddr *ifa; struct nd_defrouter *dr; struct nd_pfxrouter *pfxrtr = NULL; /* * Check if there is a prefix that has a reachable advertising * router. */ for (pr = V_nd_prefix.lh_first; pr; pr = pr->ndpr_next) { if (pr->ndpr_raf_onlink && find_pfxlist_reachable_router(pr)) break; } /* * If we have no such prefix, check whether we still have a router * that does not advertise any prefixes. */ if (pr == NULL) { for (dr = TAILQ_FIRST(&V_nd_defrouter); dr; dr = TAILQ_NEXT(dr, dr_entry)) { struct nd_prefix *pr0; for (pr0 = V_nd_prefix.lh_first; pr0; pr0 = pr0->ndpr_next) { if ((pfxrtr = pfxrtr_lookup(pr0, dr)) != NULL) break; } if (pfxrtr != NULL) break; } } if (pr != NULL || (TAILQ_FIRST(&V_nd_defrouter) && pfxrtr == NULL)) { /* * There is at least one prefix that has a reachable router, * or at least a router which probably does not advertise * any prefixes. The latter would be the case when we move * to a new link where we have a router that does not provide * prefixes and we configure an address by hand. * Detach prefixes which have no reachable advertising * router, and attach other prefixes. */ for (pr = V_nd_prefix.lh_first; pr; pr = pr->ndpr_next) { /* XXX: a link-local prefix should never be detached */ if (IN6_IS_ADDR_LINKLOCAL(&pr->ndpr_prefix.sin6_addr)) continue; /* * we aren't interested in prefixes without the L bit * set. 
*/ if (pr->ndpr_raf_onlink == 0) continue; if ((pr->ndpr_stateflags & NDPRF_DETACHED) == 0 && find_pfxlist_reachable_router(pr) == NULL) pr->ndpr_stateflags |= NDPRF_DETACHED; if ((pr->ndpr_stateflags & NDPRF_DETACHED) != 0 && find_pfxlist_reachable_router(pr) != 0) pr->ndpr_stateflags &= ~NDPRF_DETACHED; } } else { /* there is no prefix that has a reachable router */ for (pr = V_nd_prefix.lh_first; pr; pr = pr->ndpr_next) { if (IN6_IS_ADDR_LINKLOCAL(&pr->ndpr_prefix.sin6_addr)) continue; if (pr->ndpr_raf_onlink == 0) continue; if ((pr->ndpr_stateflags & NDPRF_DETACHED) != 0) pr->ndpr_stateflags &= ~NDPRF_DETACHED; } } /* * Remove each interface route associated with a (just) detached * prefix, and reinstall the interface route for a (just) attached * prefix. Note that all attempt of reinstallation does not * necessarily success, when a same prefix is shared among multiple * interfaces. Such cases will be handled in nd6_prefix_onlink, * so we don't have to care about them. */ for (pr = V_nd_prefix.lh_first; pr; pr = pr->ndpr_next) { int e; char ip6buf[INET6_ADDRSTRLEN]; if (IN6_IS_ADDR_LINKLOCAL(&pr->ndpr_prefix.sin6_addr)) continue; if (pr->ndpr_raf_onlink == 0) continue; if ((pr->ndpr_stateflags & NDPRF_DETACHED) != 0 && (pr->ndpr_stateflags & NDPRF_ONLINK) != 0) { if ((e = nd6_prefix_offlink(pr)) != 0) { nd6log((LOG_ERR, "pfxlist_onlink_check: failed to " "make %s/%d offlink, errno=%d\n", ip6_sprintf(ip6buf, &pr->ndpr_prefix.sin6_addr), pr->ndpr_plen, e)); } } if ((pr->ndpr_stateflags & NDPRF_DETACHED) == 0 && (pr->ndpr_stateflags & NDPRF_ONLINK) == 0 && pr->ndpr_raf_onlink) { if ((e = nd6_prefix_onlink(pr)) != 0) { nd6log((LOG_ERR, "pfxlist_onlink_check: failed to " "make %s/%d onlink, errno=%d\n", ip6_sprintf(ip6buf, &pr->ndpr_prefix.sin6_addr), pr->ndpr_plen, e)); } } } /* * Changes on the prefix status might affect address status as well. * Make sure that all addresses derived from an attached prefix are * attached, and that all addresses derived from a detached prefix are * detached. Note, however, that a manually configured address should * always be attached. * The precise detection logic is same as the one for prefixes. */ for (ifa = V_in6_ifaddr; ifa; ifa = ifa->ia_next) { if (!(ifa->ia6_flags & IN6_IFF_AUTOCONF)) continue; if (ifa->ia6_ndpr == NULL) { /* * This can happen when we first configure the address * (i.e. the address exists, but the prefix does not). * XXX: complicated relationships... */ continue; } if (find_pfxlist_reachable_router(ifa->ia6_ndpr)) break; } if (ifa) { for (ifa = V_in6_ifaddr; ifa; ifa = ifa->ia_next) { if ((ifa->ia6_flags & IN6_IFF_AUTOCONF) == 0) continue; if (ifa->ia6_ndpr == NULL) /* XXX: see above. */ continue; if (find_pfxlist_reachable_router(ifa->ia6_ndpr)) { if (ifa->ia6_flags & IN6_IFF_DETACHED) { ifa->ia6_flags &= ~IN6_IFF_DETACHED; ifa->ia6_flags |= IN6_IFF_TENTATIVE; nd6_dad_start((struct ifaddr *)ifa, 0); } } else { ifa->ia6_flags |= IN6_IFF_DETACHED; } } } else { for (ifa = V_in6_ifaddr; ifa; ifa = ifa->ia_next) { if ((ifa->ia6_flags & IN6_IFF_AUTOCONF) == 0) continue; if (ifa->ia6_flags & IN6_IFF_DETACHED) { ifa->ia6_flags &= ~IN6_IFF_DETACHED; ifa->ia6_flags |= IN6_IFF_TENTATIVE; /* Do we need a delay in this case? 
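
The detach/attach pass in pfxlist_onlink_check() above boils down to this: if at least one prefix still has a (probably) reachable advertising router, prefixes without one are marked NDPRF_DETACHED and the rest are attached; if nothing is reachable, everything stays attached, because a dead-router situation cannot be told apart from a link change. A toy model of that rule (types and names invented for illustration; link-local prefixes and prefix-less routers are omitted):

        #include <stdio.h>
        #include <stdbool.h>

        struct toy_prefix {
                const char *name;
                bool onlink_flag;       /* the L bit from the RA */
                bool router_reachable;  /* any advertising router reachable? */
                bool detached;
        };

        /* Apply the pfxlist_onlink_check() detach rule to all prefixes. */
        static void
        recheck(struct toy_prefix *p, int n)
        {
                bool any_reachable = false;
                int i;

                for (i = 0; i < n; i++)
                        if (p[i].onlink_flag && p[i].router_reachable)
                                any_reachable = true;

                for (i = 0; i < n; i++) {
                        if (!p[i].onlink_flag)
                                continue;
                        /* With nothing reachable, keep everything attached. */
                        p[i].detached = any_reachable && !p[i].router_reachable;
                }
        }

        int
        main(void)
        {
                struct toy_prefix p[] = {
                        { "2001:db8:1::/64", true, true,  false },
                        { "2001:db8:2::/64", true, false, false },
                };
                int i;

                recheck(p, 2);
                for (i = 0; i < 2; i++)
                        printf("%s %s\n", p[i].name,
                            p[i].detached ? "detached" : "attached");
                return (0);
        }
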
*/ nd6_dad_start((struct ifaddr *)ifa, 0); } } } } int nd6_prefix_onlink(struct nd_prefix *pr) { INIT_VNET_INET6(curvnet); struct ifaddr *ifa; struct ifnet *ifp = pr->ndpr_ifp; struct sockaddr_in6 mask6; struct nd_prefix *opr; u_long rtflags; int error = 0; + struct radix_node_head *rnh; struct rtentry *rt = NULL; char ip6buf[INET6_ADDRSTRLEN]; + struct sockaddr_dl null_sdl = {sizeof(null_sdl), AF_LINK}; /* sanity check */ if ((pr->ndpr_stateflags & NDPRF_ONLINK) != 0) { nd6log((LOG_ERR, "nd6_prefix_onlink: %s/%d is already on-link\n", ip6_sprintf(ip6buf, &pr->ndpr_prefix.sin6_addr), pr->ndpr_plen)); return (EEXIST); } /* * Add the interface route associated with the prefix. Before * installing the route, check if there's the same prefix on another * interface, and the prefix has already installed the interface route. * Although such a configuration is expected to be rare, we explicitly * allow it. */ for (opr = V_nd_prefix.lh_first; opr; opr = opr->ndpr_next) { if (opr == pr) continue; if ((opr->ndpr_stateflags & NDPRF_ONLINK) == 0) continue; if (opr->ndpr_plen == pr->ndpr_plen && in6_are_prefix_equal(&pr->ndpr_prefix.sin6_addr, &opr->ndpr_prefix.sin6_addr, pr->ndpr_plen)) return (0); } /* * We prefer link-local addresses as the associated interface address. */ /* search for a link-local addr */ ifa = (struct ifaddr *)in6ifa_ifpforlinklocal(ifp, IN6_IFF_NOTREADY | IN6_IFF_ANYCAST); if (ifa == NULL) { /* XXX: freebsd does not have ifa_ifwithaf */ TAILQ_FOREACH(ifa, &ifp->if_addrlist, ifa_list) { if (ifa->ifa_addr->sa_family == AF_INET6) break; } /* should we care about ia6_flags? */ } if (ifa == NULL) { /* * This can still happen, when, for example, we receive an RA * containing a prefix with the L bit set and the A bit clear, * after removing all IPv6 addresses on the receiving * interface. This should, of course, be rare though. */ nd6log((LOG_NOTICE, "nd6_prefix_onlink: failed to find any ifaddr" " to add route for a prefix(%s/%d) on %s\n", ip6_sprintf(ip6buf, &pr->ndpr_prefix.sin6_addr), pr->ndpr_plen, if_name(ifp))); return (0); } /* * in6_ifinit() sets nd6_rtrequest to ifa_rtrequest for all ifaddrs. * ifa->ifa_rtrequest = nd6_rtrequest; */ bzero(&mask6, sizeof(mask6)); mask6.sin6_len = sizeof(mask6); mask6.sin6_addr = pr->ndpr_mask; - rtflags = ifa->ifa_flags | RTF_CLONING | RTF_UP; - if (nd6_need_cache(ifp)) { - /* explicitly set in case ifa_flags does not set the flag. */ - rtflags |= RTF_CLONING; - } else { - /* - * explicitly clear the cloning bit in case ifa_flags sets it. 
- */ - rtflags &= ~RTF_CLONING; - } + rtflags = ifa->ifa_flags | RTF_UP; error = rtrequest(RTM_ADD, (struct sockaddr *)&pr->ndpr_prefix, ifa->ifa_addr, (struct sockaddr *)&mask6, rtflags, &rt); if (error == 0) { - if (rt != NULL) /* this should be non NULL, though */ + if (rt != NULL) /* this should be non NULL, though */ { + rnh = V_rt_tables[rt->rt_fibnum][AF_INET6]; + RADIX_NODE_HEAD_LOCK(rnh); + RT_LOCK(rt); + if (!rt_setgate(rt, rt_key(rt), (struct sockaddr *)&null_sdl)) { + ((struct sockaddr_dl *)rt->rt_gateway)->sdl_type = + rt->rt_ifp->if_type; + ((struct sockaddr_dl *)rt->rt_gateway)->sdl_index = + rt->rt_ifp->if_index; + } + RADIX_NODE_HEAD_UNLOCK(rnh); nd6_rtmsg(RTM_ADD, rt); + RT_UNLOCK(rt); + } pr->ndpr_stateflags |= NDPRF_ONLINK; } else { char ip6bufg[INET6_ADDRSTRLEN], ip6bufm[INET6_ADDRSTRLEN]; nd6log((LOG_ERR, "nd6_prefix_onlink: failed to add route for a" " prefix (%s/%d) on %s, gw=%s, mask=%s, flags=%lx " "errno = %d\n", ip6_sprintf(ip6buf, &pr->ndpr_prefix.sin6_addr), pr->ndpr_plen, if_name(ifp), ip6_sprintf(ip6bufg, &((struct sockaddr_in6 *)ifa->ifa_addr)->sin6_addr), ip6_sprintf(ip6bufm, &mask6.sin6_addr), rtflags, error)); } if (rt != NULL) { RT_LOCK(rt); RT_REMREF(rt); RT_UNLOCK(rt); } return (error); } int nd6_prefix_offlink(struct nd_prefix *pr) { INIT_VNET_INET6(curvnet); int error = 0; struct ifnet *ifp = pr->ndpr_ifp; struct nd_prefix *opr; struct sockaddr_in6 sa6, mask6; struct rtentry *rt = NULL; char ip6buf[INET6_ADDRSTRLEN]; /* sanity check */ if ((pr->ndpr_stateflags & NDPRF_ONLINK) == 0) { nd6log((LOG_ERR, "nd6_prefix_offlink: %s/%d is already off-link\n", ip6_sprintf(ip6buf, &pr->ndpr_prefix.sin6_addr), pr->ndpr_plen)); return (EEXIST); } bzero(&sa6, sizeof(sa6)); sa6.sin6_family = AF_INET6; sa6.sin6_len = sizeof(sa6); bcopy(&pr->ndpr_prefix.sin6_addr, &sa6.sin6_addr, sizeof(struct in6_addr)); bzero(&mask6, sizeof(mask6)); mask6.sin6_family = AF_INET6; mask6.sin6_len = sizeof(sa6); bcopy(&pr->ndpr_mask, &mask6.sin6_addr, sizeof(struct in6_addr)); error = rtrequest(RTM_DELETE, (struct sockaddr *)&sa6, NULL, (struct sockaddr *)&mask6, 0, &rt); if (error == 0) { pr->ndpr_stateflags &= ~NDPRF_ONLINK; /* report the route deletion to the routing socket. */ if (rt != NULL) nd6_rtmsg(RTM_DELETE, rt); /* * There might be the same prefix on another interface, * the prefix which could not be on-link just because we have * the interface route (see comments in nd6_prefix_onlink). * If there's one, try to make the prefix on-link on the * interface. */ for (opr = V_nd_prefix.lh_first; opr; opr = opr->ndpr_next) { if (opr == pr) continue; if ((opr->ndpr_stateflags & NDPRF_ONLINK) != 0) continue; /* * KAME specific: detached prefixes should not be * on-link. */ if ((opr->ndpr_stateflags & NDPRF_DETACHED) != 0) continue; if (opr->ndpr_plen == pr->ndpr_plen && in6_are_prefix_equal(&pr->ndpr_prefix.sin6_addr, &opr->ndpr_prefix.sin6_addr, pr->ndpr_plen)) { int e; if ((e = nd6_prefix_onlink(opr)) != 0) { nd6log((LOG_ERR, "nd6_prefix_offlink: failed to " "recover a prefix %s/%d from %s " "to %s (errno = %d)\n", ip6_sprintf(ip6buf, &opr->ndpr_prefix.sin6_addr), opr->ndpr_plen, if_name(ifp), if_name(opr->ndpr_ifp), e)); } } } } else { /* XXX: can we still set the NDPRF_ONLINK flag? 
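
With the arp-v2 change shown above, the prefix route installed by nd6_prefix_onlink() is no longer an RTF_CLONING route; its gateway is instead rewritten through rt_setgate() to an AF_LINK sockaddr_dl carrying the interface type and index, which netstat renders as "link#N". A user-space sketch of such a link-level gateway address (illustrative only, not the kernel code path):

        #include <stdio.h>
        #include <string.h>
        #include <sys/socket.h>
        #include <net/if_dl.h>
        #include <net/if_types.h>

        int
        main(void)
        {
                struct sockaddr_dl sdl;

                /*
                 * An "empty" link-level gateway: family AF_LINK, with the
                 * interface type and index filled in, analogous to what
                 * nd6_prefix_onlink() does after rt_setgate().
                 */
                memset(&sdl, 0, sizeof(sdl));
                sdl.sdl_len = sizeof(sdl);
                sdl.sdl_family = AF_LINK;
                sdl.sdl_type = IFT_ETHER;  /* would come from rt_ifp->if_type */
                sdl.sdl_index = 1;         /* would come from rt_ifp->if_index */

                printf("link#%d (type %d)\n", sdl.sdl_index, sdl.sdl_type);
                return (0);
        }
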
*/ nd6log((LOG_ERR, "nd6_prefix_offlink: failed to delete route: " "%s/%d on %s (errno = %d)\n", ip6_sprintf(ip6buf, &sa6.sin6_addr), pr->ndpr_plen, if_name(ifp), error)); } if (rt != NULL) { RTFREE(rt); } return (error); } static struct in6_ifaddr * in6_ifadd(struct nd_prefixctl *pr, int mcast) { INIT_VNET_INET6(curvnet); struct ifnet *ifp = pr->ndpr_ifp; struct ifaddr *ifa; struct in6_aliasreq ifra; struct in6_ifaddr *ia, *ib; int error, plen0; struct in6_addr mask; int prefixlen = pr->ndpr_plen; int updateflags; char ip6buf[INET6_ADDRSTRLEN]; in6_prefixlen2mask(&mask, prefixlen); /* * find a link-local address (will be interface ID). * Is it really mandatory? Theoretically, a global or a site-local * address can be configured without a link-local address, if we * have a unique interface identifier... * * it is not mandatory to have a link-local address, we can generate * interface identifier on the fly. we do this because: * (1) it should be the easiest way to find interface identifier. * (2) RFC2462 5.4 suggesting the use of the same interface identifier * for multiple addresses on a single interface, and possible shortcut * of DAD. we omitted DAD for this reason in the past. * (3) a user can prevent autoconfiguration of global address * by removing link-local address by hand (this is partly because we * don't have other way to control the use of IPv6 on an interface. * this has been our design choice - cf. NRL's "ifconfig auto"). * (4) it is easier to manage when an interface has addresses * with the same interface identifier, than to have multiple addresses * with different interface identifiers. */ ifa = (struct ifaddr *)in6ifa_ifpforlinklocal(ifp, 0); /* 0 is OK? */ if (ifa) ib = (struct in6_ifaddr *)ifa; else return NULL; /* prefixlen + ifidlen must be equal to 128 */ plen0 = in6_mask2len(&ib->ia_prefixmask.sin6_addr, NULL); if (prefixlen != plen0) { nd6log((LOG_INFO, "in6_ifadd: wrong prefixlen for %s " "(prefix=%d ifid=%d)\n", if_name(ifp), prefixlen, 128 - plen0)); return NULL; } /* make ifaddr */ bzero(&ifra, sizeof(ifra)); /* * in6_update_ifa() does not use ifra_name, but we accurately set it * for safety. */ strncpy(ifra.ifra_name, if_name(ifp), sizeof(ifra.ifra_name)); ifra.ifra_addr.sin6_family = AF_INET6; ifra.ifra_addr.sin6_len = sizeof(struct sockaddr_in6); /* prefix */ ifra.ifra_addr.sin6_addr = pr->ndpr_prefix.sin6_addr; ifra.ifra_addr.sin6_addr.s6_addr32[0] &= mask.s6_addr32[0]; ifra.ifra_addr.sin6_addr.s6_addr32[1] &= mask.s6_addr32[1]; ifra.ifra_addr.sin6_addr.s6_addr32[2] &= mask.s6_addr32[2]; ifra.ifra_addr.sin6_addr.s6_addr32[3] &= mask.s6_addr32[3]; /* interface ID */ ifra.ifra_addr.sin6_addr.s6_addr32[0] |= (ib->ia_addr.sin6_addr.s6_addr32[0] & ~mask.s6_addr32[0]); ifra.ifra_addr.sin6_addr.s6_addr32[1] |= (ib->ia_addr.sin6_addr.s6_addr32[1] & ~mask.s6_addr32[1]); ifra.ifra_addr.sin6_addr.s6_addr32[2] |= (ib->ia_addr.sin6_addr.s6_addr32[2] & ~mask.s6_addr32[2]); ifra.ifra_addr.sin6_addr.s6_addr32[3] |= (ib->ia_addr.sin6_addr.s6_addr32[3] & ~mask.s6_addr32[3]); /* new prefix mask. */ ifra.ifra_prefixmask.sin6_len = sizeof(struct sockaddr_in6); ifra.ifra_prefixmask.sin6_family = AF_INET6; bcopy(&mask, &ifra.ifra_prefixmask.sin6_addr, sizeof(ifra.ifra_prefixmask.sin6_addr)); /* lifetimes. */ ifra.ifra_lifetime.ia6t_vltime = pr->ndpr_vltime; ifra.ifra_lifetime.ia6t_pltime = pr->ndpr_pltime; /* XXX: scope zone ID? */ ifra.ifra_flags |= IN6_IFF_AUTOCONF; /* obey autoconf */ /* * Make sure that we do not have this address already. 
This should * usually not happen, but we can still see this case, e.g., if we * have manually configured the exact address to be configured. */ if (in6ifa_ifpwithaddr(ifp, &ifra.ifra_addr.sin6_addr) != NULL) { /* this should be rare enough to make an explicit log */ log(LOG_INFO, "in6_ifadd: %s is already configured\n", ip6_sprintf(ip6buf, &ifra.ifra_addr.sin6_addr)); return (NULL); } /* * Allocate ifaddr structure, link into chain, etc. * If we are going to create a new address upon receiving a multicasted * RA, we need to impose a random delay before starting DAD. * [draft-ietf-ipv6-rfc2462bis-02.txt, Section 5.4.2] */ updateflags = 0; if (mcast) updateflags |= IN6_IFAUPDATE_DADDELAY; if ((error = in6_update_ifa(ifp, &ifra, NULL, updateflags)) != 0) { nd6log((LOG_ERR, "in6_ifadd: failed to make ifaddr %s on %s (errno=%d)\n", ip6_sprintf(ip6buf, &ifra.ifra_addr.sin6_addr), if_name(ifp), error)); return (NULL); /* ifaddr must not have been allocated. */ } ia = in6ifa_ifpwithaddr(ifp, &ifra.ifra_addr.sin6_addr); return (ia); /* this is always non-NULL */ } /* * ia0 - corresponding public address */ int in6_tmpifadd(const struct in6_ifaddr *ia0, int forcegen, int delay) { INIT_VNET_INET6(curvnet); struct ifnet *ifp = ia0->ia_ifa.ifa_ifp; struct in6_ifaddr *newia, *ia; struct in6_aliasreq ifra; int i, error; int trylimit = 3; /* XXX: adhoc value */ int updateflags; u_int32_t randid[2]; time_t vltime0, pltime0; bzero(&ifra, sizeof(ifra)); strncpy(ifra.ifra_name, if_name(ifp), sizeof(ifra.ifra_name)); ifra.ifra_addr = ia0->ia_addr; /* copy prefix mask */ ifra.ifra_prefixmask = ia0->ia_prefixmask; /* clear the old IFID */ for (i = 0; i < 4; i++) { ifra.ifra_addr.sin6_addr.s6_addr32[i] &= ifra.ifra_prefixmask.sin6_addr.s6_addr32[i]; } again: if (in6_get_tmpifid(ifp, (u_int8_t *)randid, (const u_int8_t *)&ia0->ia_addr.sin6_addr.s6_addr[8], forcegen)) { nd6log((LOG_NOTICE, "in6_tmpifadd: failed to find a good " "random IFID\n")); return (EINVAL); } ifra.ifra_addr.sin6_addr.s6_addr32[2] |= (randid[0] & ~(ifra.ifra_prefixmask.sin6_addr.s6_addr32[2])); ifra.ifra_addr.sin6_addr.s6_addr32[3] |= (randid[1] & ~(ifra.ifra_prefixmask.sin6_addr.s6_addr32[3])); /* * in6_get_tmpifid() quite likely provided a unique interface ID. * However, we may still have a chance to see collision, because * there may be a time lag between generation of the ID and generation * of the address. So, we'll do one more sanity check. */ for (ia = V_in6_ifaddr; ia; ia = ia->ia_next) { if (IN6_ARE_ADDR_EQUAL(&ia->ia_addr.sin6_addr, &ifra.ifra_addr.sin6_addr)) { if (trylimit-- == 0) { /* * Give up. Something strange should have * happened. */ nd6log((LOG_NOTICE, "in6_tmpifadd: failed to " "find a unique random IFID\n")); return (EEXIST); } forcegen = 1; goto again; } } /* * The Valid Lifetime is the lower of the Valid Lifetime of the * public address or TEMP_VALID_LIFETIME. * The Preferred Lifetime is the lower of the Preferred Lifetime * of the public address or TEMP_PREFERRED_LIFETIME - * DESYNC_FACTOR. */ if (ia0->ia6_lifetime.ia6t_vltime != ND6_INFINITE_LIFETIME) { vltime0 = IFA6_IS_INVALID(ia0) ? 0 : (ia0->ia6_lifetime.ia6t_vltime - (time_second - ia0->ia6_updatetime)); if (vltime0 > V_ip6_temp_valid_lifetime) vltime0 = V_ip6_temp_valid_lifetime; } else vltime0 = V_ip6_temp_valid_lifetime; if (ia0->ia6_lifetime.ia6t_pltime != ND6_INFINITE_LIFETIME) { pltime0 = IFA6_IS_DEPRECATED(ia0) ? 
0 : (ia0->ia6_lifetime.ia6t_pltime - (time_second - ia0->ia6_updatetime)); if (pltime0 > V_ip6_temp_preferred_lifetime - V_ip6_desync_factor){ pltime0 = V_ip6_temp_preferred_lifetime - V_ip6_desync_factor; } } else pltime0 = V_ip6_temp_preferred_lifetime - V_ip6_desync_factor; ifra.ifra_lifetime.ia6t_vltime = vltime0; ifra.ifra_lifetime.ia6t_pltime = pltime0; /* * A temporary address is created only if this calculated Preferred * Lifetime is greater than REGEN_ADVANCE time units. */ if (ifra.ifra_lifetime.ia6t_pltime <= V_ip6_temp_regen_advance) return (0); /* XXX: scope zone ID? */ ifra.ifra_flags |= (IN6_IFF_AUTOCONF|IN6_IFF_TEMPORARY); /* allocate ifaddr structure, link into chain, etc. */ updateflags = 0; if (delay) updateflags |= IN6_IFAUPDATE_DADDELAY; if ((error = in6_update_ifa(ifp, &ifra, NULL, updateflags)) != 0) return (error); newia = in6ifa_ifpwithaddr(ifp, &ifra.ifra_addr.sin6_addr); if (newia == NULL) { /* XXX: can it happen? */ nd6log((LOG_ERR, "in6_tmpifadd: ifa update succeeded, but we got " "no ifaddr\n")); return (EINVAL); /* XXX */ } newia->ia6_ndpr = ia0->ia6_ndpr; newia->ia6_ndpr->ndpr_refcnt++; /* * A newly added address might affect the status of other addresses. * XXX: when the temporary address is generated with a new public * address, the onlink check is redundant. However, it would be safe * to do the check explicitly everywhere a new address is generated, * and, in fact, we surely need the check when we create a new * temporary address due to deprecation of an old temporary address. */ pfxlist_onlink_check(); return (0); } static int in6_init_prefix_ltimes(struct nd_prefix *ndpr) { if (ndpr->ndpr_pltime == ND6_INFINITE_LIFETIME) ndpr->ndpr_preferred = 0; else ndpr->ndpr_preferred = time_second + ndpr->ndpr_pltime; if (ndpr->ndpr_vltime == ND6_INFINITE_LIFETIME) ndpr->ndpr_expire = 0; else ndpr->ndpr_expire = time_second + ndpr->ndpr_vltime; return 0; } static void in6_init_address_ltimes(struct nd_prefix *new, struct in6_addrlifetime *lt6) { /* init ia6t_expire */ if (lt6->ia6t_vltime == ND6_INFINITE_LIFETIME) lt6->ia6t_expire = 0; else { lt6->ia6t_expire = time_second; lt6->ia6t_expire += lt6->ia6t_vltime; } /* init ia6t_preferred */ if (lt6->ia6t_pltime == ND6_INFINITE_LIFETIME) lt6->ia6t_preferred = 0; else { lt6->ia6t_preferred = time_second; lt6->ia6t_preferred += lt6->ia6t_pltime; } } /* * Delete all the routing table entries that use the specified gateway. * XXX: this function causes search through all entries of routing table, so * it shouldn't be called when acting as a router. */ void rt6_flush(struct in6_addr *gateway, struct ifnet *ifp) { INIT_VNET_NET(curvnet); struct radix_node_head *rnh = V_rt_tables[0][AF_INET6]; int s = splnet(); /* We'll care only link-local addresses */ if (!IN6_IS_ADDR_LINKLOCAL(gateway)) { splx(s); return; } RADIX_NODE_HEAD_LOCK(rnh); rnh->rnh_walktree(rnh, rt6_deleteroute, (void *)gateway); RADIX_NODE_HEAD_UNLOCK(rnh); splx(s); } static int rt6_deleteroute(struct radix_node *rn, void *arg) { #define SIN6(s) ((struct sockaddr_in6 *)s) struct rtentry *rt = (struct rtentry *)rn; struct in6_addr *gate = (struct in6_addr *)arg; if (rt->rt_gateway == NULL || rt->rt_gateway->sa_family != AF_INET6) return (0); if (!IN6_ARE_ADDR_EQUAL(gate, &SIN6(rt->rt_gateway)->sin6_addr)) { return (0); } /* * Do not delete a static route. * XXX: this seems to be a bit ad-hoc. Should we consider the * 'cloned' bit instead? */ if ((rt->rt_flags & RTF_STATIC) != 0) return (0); /* * We delete only host route. 
This means, in particular, we don't * delete default route. */ if ((rt->rt_flags & RTF_HOST) == 0) return (0); return (rtrequest(RTM_DELETE, rt_key(rt), rt->rt_gateway, rt_mask(rt), rt->rt_flags, 0)); #undef SIN6 } int nd6_setdefaultiface(int ifindex) { INIT_VNET_NET(curvnet); INIT_VNET_INET6(curvnet); int error = 0; if (ifindex < 0 || V_if_index < ifindex) return (EINVAL); if (ifindex != 0 && !ifnet_byindex(ifindex)) return (EINVAL); if (V_nd6_defifindex != ifindex) { V_nd6_defifindex = ifindex; if (V_nd6_defifindex > 0) V_nd6_defifp = ifnet_byindex(V_nd6_defifindex); else V_nd6_defifp = NULL; /* * Our current implementation assumes one-to-one maping between * interfaces and links, so it would be natural to use the * default interface as the default link. */ scope6_setdefault(V_nd6_defifp); } return (error); } Index: head/sys/netinet6/vinet6.h =================================================================== --- head/sys/netinet6/vinet6.h (revision 186118) +++ head/sys/netinet6/vinet6.h (revision 186119) @@ -1,265 +1,264 @@ /*- * Copyright (c) 2006-2008 University of Zagreb * Copyright (c) 2006-2008 FreeBSD Foundation * * This software was developed by the University of Zagreb and the * FreeBSD Foundation under sponsorship by the Stichting NLnet and the * FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* * $FreeBSD$ */ #ifndef _NETINET6_VINET6_H_ #define _NETINET6_VINET6_H_ #include #include #include #include #include #include #include #include #include #include struct vnet_inet6 { struct in6_ifaddr * _in6_ifaddr; u_int _frag6_nfragpackets; u_int _frag6_nfrags; struct ip6q _ip6q; struct route_in6 _ip6_forward_rt; struct in6_addrpolicy _defaultaddrpolicy; TAILQ_HEAD(, addrsel_policyent) _addrsel_policytab; u_int _in6_maxmtu; int _ip6_auto_linklocal; int _rtq_minreallyold6; int _rtq_reallyold6; int _rtq_toomany6; struct ip6stat _ip6stat; struct rip6stat _rip6stat; struct icmp6stat _icmp6stat; int _rtq_timeout6; struct callout _rtq_timer6; struct callout _rtq_mtutimer; struct callout _nd6_slowtimo_ch; struct callout _nd6_timer_ch; struct callout _in6_tmpaddrtimer_ch; int _nd6_inuse; int _nd6_allocated; int _nd6_onlink_ns_rfc4861; - struct llinfo_nd6 _llinfo_nd6; struct nd_drhead _nd_defrouter; struct nd_prhead _nd_prefix; struct ifnet * _nd6_defifp; int _nd6_defifindex; struct scope6_id _sid_default; TAILQ_HEAD(, dadq) _dadq; int _dad_init; int _icmp6errpps_count; struct timeval _icmp6errppslim_last; int _ip6_forwarding; int _ip6_sendredirects; int _ip6_defhlim; int _ip6_defmcasthlim; int _ip6_accept_rtadv; int _ip6_maxfragpackets; int _ip6_maxfrags; int _ip6_log_interval; int _ip6_hdrnestlimit; int _ip6_dad_count; int _ip6_auto_flowlabel; int _ip6_use_deprecated; int _ip6_rr_prune; int _ip6_mcast_pmtu; int _ip6_v6only; int _ip6_keepfaith; int _ip6stealth; time_t _ip6_log_time; int _pmtu_expire; int _pmtu_probe; u_long _rip6_sendspace; u_long _rip6_recvspace; int _icmp6_rediraccept; int _icmp6_redirtimeout; int _icmp6errppslim; int _icmp6_nodeinfo; int _udp6_sendspace; int _udp6_recvspace; int _ip6qmaxlen; int _ip6_prefer_tempaddr; int _ip6_forward_srcrt; int _ip6_sourcecheck; int _ip6_sourcecheck_interval; int _ip6_ours_check_algorithm; int _nd6_prune; int _nd6_delay; int _nd6_umaxtries; int _nd6_mmaxtries; int _nd6_useloopback; int _nd6_gctimer; int _nd6_maxndopt; int _nd6_maxnudhint; int _nd6_maxqueuelen; int _nd6_debug; int _nd6_recalc_reachtm_interval; int _dad_ignore_ns; int _dad_maxtry; int _ip6_use_tempaddr; int _ip6_desync_factor; u_int32_t _ip6_temp_preferred_lifetime; u_int32_t _ip6_temp_valid_lifetime; int _ip6_mrouter_ver; int _pim6; u_int _mrt6debug; int _ip6_temp_regen_advance; int _ip6_use_defzone; struct ip6_pktopts _ip6_opts; }; #ifndef VIMAGE #ifndef VIMAGE_GLOBALS extern struct vnet_inet6 vnet_inet6_0; #endif #endif #define INIT_VNET_INET6(vnet) \ INIT_FROM_VNET(vnet, VNET_MOD_INET6, struct vnet_inet6, vnet_inet6) #define VNET_INET6(sym) VSYM(vnet_inet6, sym) /* * Symbol translation macros */ #define V_addrsel_policytab VNET_INET6(addrsel_policytab) #define V_dad_ignore_ns VNET_INET6(dad_ignore_ns) #define V_dad_init VNET_INET6(dad_init) #define V_dad_maxtry VNET_INET6(dad_maxtry) #define V_dadq VNET_INET6(dadq) #define V_defaultaddrpolicy VNET_INET6(defaultaddrpolicy) #define V_frag6_nfragpackets VNET_INET6(frag6_nfragpackets) #define V_frag6_nfrags VNET_INET6(frag6_nfrags) #define V_icmp6_nodeinfo VNET_INET6(icmp6_nodeinfo) #define V_icmp6_rediraccept VNET_INET6(icmp6_rediraccept) #define V_icmp6_redirtimeout VNET_INET6(icmp6_redirtimeout) #define V_icmp6errpps_count VNET_INET6(icmp6errpps_count) #define V_icmp6errppslim VNET_INET6(icmp6errppslim) #define V_icmp6errppslim_last VNET_INET6(icmp6errppslim_last) #define V_icmp6stat VNET_INET6(icmp6stat) #define V_in6_ifaddr VNET_INET6(in6_ifaddr) #define V_in6_maxmtu VNET_INET6(in6_maxmtu) #define V_in6_tmpaddrtimer_ch 
VNET_INET6(in6_tmpaddrtimer_ch) #define V_ip6_accept_rtadv VNET_INET6(ip6_accept_rtadv) #define V_ip6_auto_flowlabel VNET_INET6(ip6_auto_flowlabel) #define V_ip6_auto_linklocal VNET_INET6(ip6_auto_linklocal) #define V_ip6_dad_count VNET_INET6(ip6_dad_count) #define V_ip6_defhlim VNET_INET6(ip6_defhlim) #define V_ip6_defmcasthlim VNET_INET6(ip6_defmcasthlim) #define V_ip6_desync_factor VNET_INET6(ip6_desync_factor) #define V_ip6_forward_rt VNET_INET6(ip6_forward_rt) #define V_ip6_forward_srcrt VNET_INET6(ip6_forward_srcrt) #define V_ip6_forwarding VNET_INET6(ip6_forwarding) #define V_ip6_hdrnestlimit VNET_INET6(ip6_hdrnestlimit) #define V_ip6_keepfaith VNET_INET6(ip6_keepfaith) #define V_ip6_log_interval VNET_INET6(ip6_log_interval) #define V_ip6_log_time VNET_INET6(ip6_log_time) #define V_ip6_maxfragpackets VNET_INET6(ip6_maxfragpackets) #define V_ip6_maxfrags VNET_INET6(ip6_maxfrags) #define V_ip6_mcast_pmtu VNET_INET6(ip6_mcast_pmtu) #define V_ip6_mrouter_ver VNET_INET6(ip6_mrouter_ver) #define V_ip6_opts VNET_INET6(ip6_opts) #define V_ip6_ours_check_algorithm VNET_INET6(ip6_ours_check_algorithm) #define V_ip6_prefer_tempaddr VNET_INET6(ip6_prefer_tempaddr) #define V_ip6_rr_prune VNET_INET6(ip6_rr_prune) #define V_ip6_sendredirects VNET_INET6(ip6_sendredirects) #define V_ip6_sourcecheck VNET_INET6(ip6_sourcecheck) #define V_ip6_sourcecheck_interval VNET_INET6(ip6_sourcecheck_interval) #define V_ip6_temp_preferred_lifetime VNET_INET6(ip6_temp_preferred_lifetime) #define V_ip6_temp_regen_advance VNET_INET6(ip6_temp_regen_advance) #define V_ip6_temp_valid_lifetime VNET_INET6(ip6_temp_valid_lifetime) #define V_ip6_use_defzone VNET_INET6(ip6_use_defzone) #define V_ip6_use_deprecated VNET_INET6(ip6_use_deprecated) #define V_ip6_use_tempaddr VNET_INET6(ip6_use_tempaddr) #define V_ip6_v6only VNET_INET6(ip6_v6only) #define V_ip6q VNET_INET6(ip6q) #define V_ip6qmaxlen VNET_INET6(ip6qmaxlen) #define V_ip6stat VNET_INET6(ip6stat) #define V_ip6stealth VNET_INET6(ip6stealth) #define V_llinfo_nd6 VNET_INET6(llinfo_nd6) #define V_mrt6debug VNET_INET6(mrt6debug) #define V_nd6_allocated VNET_INET6(nd6_allocated) #define V_nd6_debug VNET_INET6(nd6_debug) #define V_nd6_defifindex VNET_INET6(nd6_defifindex) #define V_nd6_defifp VNET_INET6(nd6_defifp) #define V_nd6_delay VNET_INET6(nd6_delay) #define V_nd6_gctimer VNET_INET6(nd6_gctimer) #define V_nd6_inuse VNET_INET6(nd6_inuse) #define V_nd6_maxndopt VNET_INET6(nd6_maxndopt) #define V_nd6_maxnudhint VNET_INET6(nd6_maxnudhint) #define V_nd6_maxqueuelen VNET_INET6(nd6_maxqueuelen) #define V_nd6_mmaxtries VNET_INET6(nd6_mmaxtries) #define V_nd6_onlink_ns_rfc4861 VNET_INET6(nd6_onlink_ns_rfc4861) #define V_nd6_prune VNET_INET6(nd6_prune) #define V_nd6_recalc_reachtm_interval VNET_INET6(nd6_recalc_reachtm_interval) #define V_nd6_slowtimo_ch VNET_INET6(nd6_slowtimo_ch) #define V_nd6_timer_ch VNET_INET6(nd6_timer_ch) #define V_nd6_umaxtries VNET_INET6(nd6_umaxtries) #define V_nd6_useloopback VNET_INET6(nd6_useloopback) #define V_nd_defrouter VNET_INET6(nd_defrouter) #define V_nd_prefix VNET_INET6(nd_prefix) #define V_pim6 VNET_INET6(pim6) #define V_pmtu_expire VNET_INET6(pmtu_expire) #define V_pmtu_probe VNET_INET6(pmtu_probe) #define V_rip6_recvspace VNET_INET6(rip6_recvspace) #define V_rip6_sendspace VNET_INET6(rip6_sendspace) #define V_rip6stat VNET_INET6(rip6stat) #define V_rtq_minreallyold6 VNET_INET6(rtq_minreallyold6) #define V_rtq_mtutimer VNET_INET6(rtq_mtutimer) #define V_rtq_reallyold6 VNET_INET6(rtq_reallyold6) #define V_rtq_timeout6 
VNET_INET6(rtq_timeout6) #define V_rtq_timer6 VNET_INET6(rtq_timer6) #define V_rtq_toomany6 VNET_INET6(rtq_toomany6) #define V_sid_default VNET_INET6(sid_default) #define V_udp6_recvspace VNET_INET6(udp6_recvspace) #define V_udp6_sendspace VNET_INET6(udp6_sendspace) #endif /* !_NETINET6_VINET6_H_ */ Index: head/sys/sys/param.h =================================================================== --- head/sys/sys/param.h (revision 186118) +++ head/sys/sys/param.h (revision 186119) @@ -1,318 +1,318 @@ /*- * Copyright (c) 1982, 1986, 1989, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)param.h 8.3 (Berkeley) 4/4/95 * $FreeBSD$ */ #ifndef _SYS_PARAM_H_ #define _SYS_PARAM_H_ #include #define BSD 199506 /* System version (year & month). */ #define BSD4_3 1 #define BSD4_4 1 /* * __FreeBSD_version numbers are documented in the Porter's Handbook. * If you bump the version for any reason, you should update the documentation * there. * Currently this lives here: * * doc/en_US.ISO8859-1/books/porters-handbook/book.sgml * * scheme is: Rxx * 'R' is 0 if release branch or x.0-CURRENT before RELENG_*_0 * is created, otherwise 1. */ #undef __FreeBSD_version -#define __FreeBSD_version 800058 /* Master, propagated to newvers */ +#define __FreeBSD_version 800059 /* Master, propagated to newvers */ #ifndef LOCORE #include #endif /* * Machine-independent constants (some used in following include files). * Redefined constants are from POSIX 1003.1 limits file. * * MAXCOMLEN should be >= sizeof(ac_comm) (see ) * MAXLOGNAME should be == UT_NAMESIZE+1 (see ) */ #include #define MAXCOMLEN 19 /* max command name remembered */ #define MAXINTERP 32 /* max interpreter file name length */ #define MAXLOGNAME 17 /* max login name length (incl. 
NUL) */ #define MAXUPRC CHILD_MAX /* max simultaneous processes */ #define NCARGS ARG_MAX /* max bytes for an exec function */ #define NGROUPS NGROUPS_MAX /* max number groups */ #define NOFILE OPEN_MAX /* max open files per process */ #define NOGROUP 65535 /* marker for empty group set member */ #define MAXHOSTNAMELEN 256 /* max hostname size */ #define SPECNAMELEN 63 /* max length of devicename */ /* More types and definitions used throughout the kernel. */ #ifdef _KERNEL #include #include #ifndef LOCORE #include #include #endif #ifndef FALSE #define FALSE 0 #endif #ifndef TRUE #define TRUE 1 #endif #endif #ifndef _KERNEL /* Signals. */ #include #endif /* Machine type dependent parameters. */ #include #ifndef _KERNEL #include #endif #ifndef _NO_NAMESPACE_POLLUTION #ifndef DEV_BSHIFT #define DEV_BSHIFT 9 /* log2(DEV_BSIZE) */ #endif #define DEV_BSIZE (1<>PAGE_SHIFT) #endif /* * btodb() is messy and perhaps slow because `bytes' may be an off_t. We * want to shift an unsigned type to avoid sign extension and we don't * want to widen `bytes' unnecessarily. Assume that the result fits in * a daddr_t. */ #ifndef btodb #define btodb(bytes) /* calculates (bytes / DEV_BSIZE) */ \ (sizeof (bytes) > sizeof(long) \ ? (daddr_t)((unsigned long long)(bytes) >> DEV_BSHIFT) \ : (daddr_t)((unsigned long)(bytes) >> DEV_BSHIFT)) #endif #ifndef dbtob #define dbtob(db) /* calculates (db * DEV_BSIZE) */ \ ((off_t)(db) << DEV_BSHIFT) #endif #endif /* _NO_NAMESPACE_POLLUTION */ #define PRIMASK 0x0ff #define PCATCH 0x100 /* OR'd with pri for tsleep to check signals */ #define PDROP 0x200 /* OR'd with pri to stop re-entry of interlock mutex */ #define NZERO 0 /* default "nice" */ #define NBBY 8 /* number of bits in a byte */ #define NBPW sizeof(int) /* number of bytes per word (integer) */ #define CMASK 022 /* default file mask: S_IWGRP|S_IWOTH */ #define NODEV (dev_t)(-1) /* non-existent device */ #define CBLOCK 128 /* Clist block size, must be a power of 2. */ /* Data chars/clist. */ #define CBSIZE (CBLOCK - sizeof(struct cblock *)) #define CROUND (CBLOCK - 1) /* Clist rounding. */ /* * File system parameters and macros. * * MAXBSIZE - Filesystems are made out of blocks of at most MAXBSIZE bytes * per block. MAXBSIZE may be made larger without effecting * any existing filesystems as long as it does not exceed MAXPHYS, * and may be made smaller at the risk of not being able to use * filesystems which require a block size exceeding MAXBSIZE. * * BKVASIZE - Nominal buffer space per buffer, in bytes. BKVASIZE is the * minimum KVM memory reservation the kernel is willing to make. * Filesystems can of course request smaller chunks. Actual * backing memory uses a chunk size of a page (PAGE_SIZE). * * If you make BKVASIZE too small you risk seriously fragmenting * the buffer KVM map which may slow things down a bit. If you * make it too big the kernel will not be able to optimally use * the KVM memory reserved for the buffer cache and will wind * up with too-few buffers. * * The default is 16384, roughly 2x the block size used by a * normal UFS filesystem. */ #define MAXBSIZE 65536 /* must be power of 2 */ #define BKVASIZE 16384 /* must be power of 2 */ #define BKVAMASK (BKVASIZE-1) /* * MAXPATHLEN defines the longest permissible path length after expanding * symbolic links. It is used to allocate a temporary buffer from the buffer * pool in which to do the name expansion, hence should be a power of two, * and must be less than or equal to MAXBSIZE. 
MAXSYMLINKS defines the * maximum number of symbolic links that may be expanded in a path name. * It should be set high enough to allow all legitimate uses, but halt * infinite loops reasonably quickly. */ #define MAXPATHLEN PATH_MAX #define MAXSYMLINKS 32 /* Bit map related macros. */ #define setbit(a,i) (((unsigned char *)(a))[(i)/NBBY] |= 1<<((i)%NBBY)) #define clrbit(a,i) (((unsigned char *)(a))[(i)/NBBY] &= ~(1<<((i)%NBBY))) #define isset(a,i) \ (((const unsigned char *)(a))[(i)/NBBY] & (1<<((i)%NBBY))) #define isclr(a,i) \ ((((const unsigned char *)(a))[(i)/NBBY] & (1<<((i)%NBBY))) == 0) /* Macros for counting and rounding. */ #ifndef howmany #define howmany(x, y) (((x)+((y)-1))/(y)) #endif #define rounddown(x, y) (((x)/(y))*(y)) #define roundup(x, y) ((((x)+((y)-1))/(y))*(y)) /* to any y */ #define roundup2(x, y) (((x)+((y)-1))&(~((y)-1))) /* if y is powers of two */ #define powerof2(x) ((((x)-1)&(x))==0) /* Macros for min/max. */ #define MIN(a,b) (((a)<(b))?(a):(b)) #define MAX(a,b) (((a)>(b))?(a):(b)) #ifdef _KERNEL /* * Basic byte order function prototypes for non-inline functions. */ #ifndef LOCORE #ifndef _BYTEORDER_PROTOTYPED #define _BYTEORDER_PROTOTYPED __BEGIN_DECLS __uint32_t htonl(__uint32_t); __uint16_t htons(__uint16_t); __uint32_t ntohl(__uint32_t); __uint16_t ntohs(__uint16_t); __END_DECLS #endif #endif #ifndef lint #ifndef _BYTEORDER_FUNC_DEFINED #define _BYTEORDER_FUNC_DEFINED #define htonl(x) __htonl(x) #define htons(x) __htons(x) #define ntohl(x) __ntohl(x) #define ntohs(x) __ntohs(x) #endif /* !_BYTEORDER_FUNC_DEFINED */ #endif /* lint */ #endif /* _KERNEL */ /* * Scale factor for scaled integers used to count %cpu time and load avgs. * * The number of CPU `tick's that map to a unique `%age' can be expressed * by the formula (1 / (2 ^ (FSHIFT - 11))). The maximum load average that * can be calculated (assuming 32 bits) can be closely approximated using * the formula (2 ^ (2 * (16 - FSHIFT))) for (FSHIFT < 15). * * For the scheduler to maintain a 1:1 mapping of CPU `tick' to `%age', * FSHIFT must be at least 11; this gives us a maximum load avg of ~1024. */ #define FSHIFT 11 /* bits to right of fixed binary point */ #define FSCALE (1<> (PAGE_SHIFT - DEV_BSHIFT)) #define ctodb(db) /* calculates pages to devblks */ \ ((db) << (PAGE_SHIFT - DEV_BSHIFT)) /* * Given the pointer x to the member m of the struct s, return * a pointer to the containing structure. */ #define member2struct(s, m, x) \ ((struct s *)(void *)((char *)(x) - offsetof(struct s, m))) #endif /* _SYS_PARAM_H_ */ Index: head/usr.bin/netstat/route.c =================================================================== --- head/usr.bin/netstat/route.c (revision 186118) +++ head/usr.bin/netstat/route.c (revision 186119) @@ -1,1142 +1,1126 @@ /*- * Copyright (c) 1983, 1988, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. 
All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #if 0 #ifndef lint static char sccsid[] = "From: @(#)route.c 8.6 (Berkeley) 4/28/95"; #endif /* not lint */ #endif #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "netstat.h" #define kget(p, d) (kread((u_long)(p), (char *)&(d), sizeof (d))) /* * Definitions for showing gateway flags. */ struct bits { u_long b_mask; char b_val; } bits[] = { { RTF_UP, 'U' }, { RTF_GATEWAY, 'G' }, { RTF_HOST, 'H' }, { RTF_REJECT, 'R' }, { RTF_DYNAMIC, 'D' }, { RTF_MODIFIED, 'M' }, { RTF_DONE, 'd' }, /* Completed -- for routing messages only */ - { RTF_CLONING, 'C' }, { RTF_XRESOLVE, 'X' }, - { RTF_LLINFO, 'L' }, { RTF_STATIC, 'S' }, { RTF_PROTO1, '1' }, { RTF_PROTO2, '2' }, - { RTF_WASCLONED,'W' }, { RTF_PRCLONING,'c' }, { RTF_PROTO3, '3' }, { RTF_BLACKHOLE,'B' }, { RTF_BROADCAST,'b' }, +#ifdef RTF_LLINFO + { RTF_LLINFO, 'L' }, +#endif +#ifdef RTF_WASCLONED + { RTF_WASCLONED,'W' }, +#endif +#ifdef RTF_CLONING + { RTF_CLONING, 'C' }, +#endif { 0 , 0 } }; typedef union { long dummy; /* Helps align structure. */ struct sockaddr u_sa; u_short u_data[128]; } sa_u; static sa_u pt_u; int fibnum; int do_rtent = 0; struct rtentry rtentry; struct radix_node rnode; struct radix_mask rmask; struct rtline { struct radix_node_head *tables[AF_MAX+1]; /*xxx*/ }; struct rtline *rt_tables; struct radix_node_head *rt_tables_line[1][AF_MAX+1]; /*xxx*/ int NewTree = 0; struct timespec uptime; static struct sockaddr *kgetsa(struct sockaddr *); static void size_cols(int ef, struct radix_node *rn); static void size_cols_tree(struct radix_node *rn); static void size_cols_rtentry(struct rtentry *rt); static void p_tree(struct radix_node *); static void p_rtnode(void); static void ntreestuff(void); static void np_rtentry(struct rt_msghdr *); static void p_sockaddr(struct sockaddr *, struct sockaddr *, int, int); static const char *fmt_sockaddr(struct sockaddr *sa, struct sockaddr *mask, int flags); static void p_flags(int, const char *); static const char *fmt_flags(int f); static void p_rtentry(struct rtentry *); static void domask(char *, in_addr_t, u_long); /* * Print routing tables. 
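
The bits[] table above now wraps the C, L and W letters in #ifdef, so on kernels without RTF_CLONING, RTF_LLINFO and RTF_WASCLONED those letters simply drop out of the Flags column. The translation itself is a plain table walk; a cut-down standalone version with only a few flags, for illustration:

        #include <stdio.h>

        #define RTF_UP          0x1
        #define RTF_GATEWAY     0x2
        #define RTF_HOST        0x4
        #define RTF_STATIC      0x800

        static const struct {
                unsigned long b_mask;
                char b_val;
        } bits[] = {
                { RTF_UP,      'U' },
                { RTF_GATEWAY, 'G' },
                { RTF_HOST,    'H' },
                { RTF_STATIC,  'S' },
                { 0, 0 }
        };

        /* Build the netstat-style flag string for a set of route flags. */
        static const char *
        fmt_flags(unsigned long f)
        {
                static char name[sizeof(bits) / sizeof(bits[0])];
                char *cp = name;
                int i;

                for (i = 0; bits[i].b_mask != 0; i++)
                        if (bits[i].b_mask & f)
                                *cp++ = bits[i].b_val;
                *cp = '\0';
                return (name);
        }

        int
        main(void)
        {
                printf("%s\n", fmt_flags(RTF_UP | RTF_GATEWAY | RTF_STATIC));
                /* prints UGS */
                return (0);
        }
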
*/ void routepr(u_long rtree) { struct radix_node_head *rnh, head; size_t intsize; int i; int numfibs; intsize = sizeof(int); if (sysctlbyname("net.my_fibnum", &fibnum, &intsize, NULL, 0) == -1) fibnum = 0; if (sysctlbyname("net.fibs", &numfibs, &intsize, NULL, 0) == -1) numfibs = 1; rt_tables = calloc(numfibs, sizeof(struct rtline)); if (rt_tables == NULL) err(EX_OSERR, "memory allocation failed"); /* * Since kernel & userland use different timebase * (time_uptime vs time_second) and we are reading kernel memory * directly we should do rt_rmx.rmx_expire --> expire_time conversion. */ if (clock_gettime(CLOCK_UPTIME, &uptime) < 0) err(EX_OSERR, "clock_gettime() failed"); printf("Routing tables\n"); if (Aflag == 0 && NewTree) ntreestuff(); else { if (rtree == 0) { printf("rt_tables: symbol not in namelist\n"); return; } if (kread((u_long)(rtree), (char *)(rt_tables), (numfibs * sizeof(struct rtline))) != 0) return; for (i = 0; i <= AF_MAX; i++) { int tmpfib; if (i != AF_INET) tmpfib = 0; else tmpfib = fibnum; if ((rnh = rt_tables[tmpfib].tables[i]) == 0) continue; if (kget(rnh, head) != 0) continue; if (i == AF_UNSPEC) { if (Aflag && af == 0) { printf("Netmasks:\n"); p_tree(head.rnh_treetop); } } else if (af == AF_UNSPEC || af == i) { size_cols(i, head.rnh_treetop); pr_family(i); do_rtent = 1; pr_rthdr(i); p_tree(head.rnh_treetop); } } } } /* * Print address family header before a section of the routing table. */ void pr_family(int af1) { const char *afname; switch (af1) { case AF_INET: afname = "Internet"; break; #ifdef INET6 case AF_INET6: afname = "Internet6"; break; #endif /*INET6*/ case AF_IPX: afname = "IPX"; break; case AF_ISO: afname = "ISO"; break; case AF_APPLETALK: afname = "AppleTalk"; break; case AF_CCITT: afname = "X.25"; break; case AF_NETGRAPH: afname = "Netgraph"; break; default: afname = NULL; break; } if (afname) printf("\n%s:\n", afname); else printf("\nProtocol Family %d:\n", af1); } /* column widths; each followed by one space */ #ifndef INET6 #define WID_DST_DEFAULT(af) 18 /* width of destination column */ #define WID_GW_DEFAULT(af) 18 /* width of gateway column */ #define WID_IF_DEFAULT(af) (Wflag ? 8 : 6) /* width of netif column */ #else #define WID_DST_DEFAULT(af) \ ((af) == AF_INET6 ? (numeric_addr ? 33: 18) : 18) #define WID_GW_DEFAULT(af) \ ((af) == AF_INET6 ? (numeric_addr ? 29 : 18) : 18) #define WID_IF_DEFAULT(af) ((af) == AF_INET6 ? 8 : (Wflag ? 8 : 6)) #endif /*INET6*/ static int wid_dst; static int wid_gw; static int wid_flags; static int wid_refs; static int wid_use; static int wid_mtu; static int wid_if; static int wid_expire; static void size_cols(int ef __unused, struct radix_node *rn) { wid_dst = WID_DST_DEFAULT(ef); wid_gw = WID_GW_DEFAULT(ef); wid_flags = 6; wid_refs = 6; wid_use = 8; wid_mtu = 6; wid_if = WID_IF_DEFAULT(ef); wid_expire = 6; if (Wflag) size_cols_tree(rn); } static void size_cols_tree(struct radix_node *rn) { again: if (kget(rn, rnode) != 0) return; if (!(rnode.rn_flags & RNF_ACTIVE)) return; if (rnode.rn_bit < 0) { if ((rnode.rn_flags & RNF_ROOT) == 0) { if (kget(rn, rtentry) != 0) return; size_cols_rtentry(&rtentry); } if ((rn = rnode.rn_dupedkey)) goto again; } else { rn = rnode.rn_right; size_cols_tree(rnode.rn_left); size_cols_tree(rn); } } static void size_cols_rtentry(struct rtentry *rt) { static struct ifnet ifnet, *lastif; - struct rtentry parent; static char buffer[100]; const char *bp; struct sockaddr *sa; sa_u addr, mask; int len; - /* - * Don't print protocol-cloned routes unless -a. 
- */ - if (rt->rt_flags & RTF_WASCLONED && !aflag) { - if (kget(rt->rt_parent, parent) != 0) - return; - if (parent.rt_flags & RTF_PRCLONING) - return; - } - bzero(&addr, sizeof(addr)); if ((sa = kgetsa(rt_key(rt)))) bcopy(sa, &addr, sa->sa_len); bzero(&mask, sizeof(mask)); if (rt_mask(rt) && (sa = kgetsa(rt_mask(rt)))) bcopy(sa, &mask, sa->sa_len); bp = fmt_sockaddr(&addr.u_sa, &mask.u_sa, rt->rt_flags); len = strlen(bp); wid_dst = MAX(len, wid_dst); bp = fmt_sockaddr(kgetsa(rt->rt_gateway), NULL, RTF_HOST); len = strlen(bp); wid_gw = MAX(len, wid_gw); bp = fmt_flags(rt->rt_flags); len = strlen(bp); wid_flags = MAX(len, wid_flags); if (addr.u_sa.sa_family == AF_INET || Wflag) { - len = snprintf(buffer, sizeof(buffer), "%ld", rt->rt_refcnt); + len = snprintf(buffer, sizeof(buffer), "%d", rt->rt_refcnt); wid_refs = MAX(len, wid_refs); len = snprintf(buffer, sizeof(buffer), "%lu", rt->rt_use); wid_use = MAX(len, wid_use); if (Wflag && rt->rt_rmx.rmx_mtu != 0) { len = snprintf(buffer, sizeof(buffer), "%lu", rt->rt_rmx.rmx_mtu); wid_mtu = MAX(len, wid_mtu); } } if (rt->rt_ifp) { if (rt->rt_ifp != lastif) { if (kget(rt->rt_ifp, ifnet) == 0) len = strlen(ifnet.if_xname); else len = strlen("---"); lastif = rt->rt_ifp; wid_if = MAX(len, wid_if); } if (rt->rt_rmx.rmx_expire) { time_t expire_time; if ((expire_time = rt->rt_rmx.rmx_expire - uptime.tv_sec) > 0) { len = snprintf(buffer, sizeof(buffer), "%d", (int)expire_time); wid_expire = MAX(len, wid_expire); } } } } /* * Print header for routing table columns. */ void pr_rthdr(int af1) { if (Aflag) printf("%-8.8s ","Address"); if (af1 == AF_INET || Wflag) { if (Wflag) { printf("%-*.*s %-*.*s %-*.*s %*.*s %*.*s %*.*s %*.*s %*s\n", wid_dst, wid_dst, "Destination", wid_gw, wid_gw, "Gateway", wid_flags, wid_flags, "Flags", wid_refs, wid_refs, "Refs", wid_use, wid_use, "Use", wid_mtu, wid_mtu, "Mtu", wid_if, wid_if, "Netif", wid_expire, "Expire"); } else { printf("%-*.*s %-*.*s %-*.*s %*.*s %*.*s %*.*s %*s\n", wid_dst, wid_dst, "Destination", wid_gw, wid_gw, "Gateway", wid_flags, wid_flags, "Flags", wid_refs, wid_refs, "Refs", wid_use, wid_use, "Use", wid_if, wid_if, "Netif", wid_expire, "Expire"); } } else { printf("%-*.*s %-*.*s %-*.*s %*.*s %*s\n", wid_dst, wid_dst, "Destination", wid_gw, wid_gw, "Gateway", wid_flags, wid_flags, "Flags", wid_if, wid_if, "Netif", wid_expire, "Expire"); } } static struct sockaddr * kgetsa(struct sockaddr *dst) { if (kget(dst, pt_u.u_sa) != 0) return (NULL); if (pt_u.u_sa.sa_len > sizeof (pt_u.u_sa)) kread((u_long)dst, (char *)pt_u.u_data, pt_u.u_sa.sa_len); return (&pt_u.u_sa); } static void p_tree(struct radix_node *rn) { again: if (kget(rn, rnode) != 0) return; if (!(rnode.rn_flags & RNF_ACTIVE)) return; if (rnode.rn_bit < 0) { if (Aflag) printf("%-8.8lx ", (u_long)rn); if (rnode.rn_flags & RNF_ROOT) { if (Aflag) printf("(root node)%s", rnode.rn_dupedkey ? 
" =>\n" : "\n"); } else if (do_rtent) { if (kget(rn, rtentry) == 0) { p_rtentry(&rtentry); if (Aflag) p_rtnode(); } } else { p_sockaddr(kgetsa((struct sockaddr *)rnode.rn_key), NULL, 0, 44); putchar('\n'); } if ((rn = rnode.rn_dupedkey)) goto again; } else { if (Aflag && do_rtent) { printf("%-8.8lx ", (u_long)rn); p_rtnode(); } rn = rnode.rn_right; p_tree(rnode.rn_left); p_tree(rn); } } char nbuf[20]; static void p_rtnode(void) { struct radix_mask *rm = rnode.rn_mklist; if (rnode.rn_bit < 0) { if (rnode.rn_mask) { printf("\t mask "); p_sockaddr(kgetsa((struct sockaddr *)rnode.rn_mask), NULL, 0, -1); } else if (rm == 0) return; } else { sprintf(nbuf, "(%d)", rnode.rn_bit); printf("%6.6s %8.8lx : %8.8lx", nbuf, (u_long)rnode.rn_left, (u_long)rnode.rn_right); } while (rm) { if (kget(rm, rmask) != 0) break; sprintf(nbuf, " %d refs, ", rmask.rm_refs); printf(" mk = %8.8lx {(%d),%s", (u_long)rm, -1 - rmask.rm_bit, rmask.rm_refs ? nbuf : " "); if (rmask.rm_flags & RNF_NORMAL) { struct radix_node rnode_aux; printf(" , "); if (kget(rmask.rm_leaf, rnode_aux) == 0) p_sockaddr(kgetsa((struct sockaddr *)rnode_aux.rn_mask), NULL, 0, -1); else p_sockaddr(NULL, NULL, 0, -1); } else p_sockaddr(kgetsa((struct sockaddr *)rmask.rm_mask), NULL, 0, -1); putchar('}'); if ((rm = rmask.rm_mklist)) printf(" ->"); } putchar('\n'); } static void ntreestuff(void) { size_t needed; int mib[6]; char *buf, *next, *lim; struct rt_msghdr *rtm; mib[0] = CTL_NET; mib[1] = PF_ROUTE; mib[2] = 0; mib[3] = 0; mib[4] = NET_RT_DUMP; mib[5] = 0; if (sysctl(mib, 6, NULL, &needed, NULL, 0) < 0) { err(1, "sysctl: net.route.0.0.dump estimate"); } if ((buf = malloc(needed)) == 0) { errx(2, "malloc(%lu)", (unsigned long)needed); } if (sysctl(mib, 6, buf, &needed, NULL, 0) < 0) { err(1, "sysctl: net.route.0.0.dump"); } lim = buf + needed; for (next = buf; next < lim; next += rtm->rtm_msglen) { rtm = (struct rt_msghdr *)next; np_rtentry(rtm); } } static void np_rtentry(struct rt_msghdr *rtm) { struct sockaddr *sa = (struct sockaddr *)(rtm + 1); #ifdef notdef static int masks_done, banner_printed; #endif static int old_af; int af1 = 0, interesting = RTF_UP | RTF_GATEWAY | RTF_HOST; #ifdef notdef /* for the moment, netmasks are skipped over */ if (!banner_printed) { printf("Netmasks:\n"); banner_printed = 1; } if (masks_done == 0) { if (rtm->rtm_addrs != RTA_DST ) { masks_done = 1; af1 = sa->sa_family; } } else #endif af1 = sa->sa_family; if (af1 != old_af) { pr_family(af1); old_af = af1; } if (rtm->rtm_addrs == RTA_DST) p_sockaddr(sa, NULL, 0, 36); else { p_sockaddr(sa, NULL, rtm->rtm_flags, 16); sa = (struct sockaddr *)(SA_SIZE(sa) + (char *)sa); p_sockaddr(sa, NULL, 0, 18); } p_flags(rtm->rtm_flags & interesting, "%-6.6s "); putchar('\n'); } static void p_sockaddr(struct sockaddr *sa, struct sockaddr *mask, int flags, int width) { const char *cp; cp = fmt_sockaddr(sa, mask, flags); if (width < 0 ) printf("%s ", cp); else { if (numeric_addr) printf("%-*s ", width, cp); else printf("%-*.*s ", width, width, cp); } } static const char * fmt_sockaddr(struct sockaddr *sa, struct sockaddr *mask, int flags) { static char workbuf[128]; const char *cp; if (sa == NULL) return ("null"); switch(sa->sa_family) { case AF_INET: { struct sockaddr_in *sockin = (struct sockaddr_in *)sa; if ((sockin->sin_addr.s_addr == INADDR_ANY) && mask && ntohl(((struct sockaddr_in *)mask)->sin_addr.s_addr) ==0L) cp = "default" ; else if (flags & RTF_HOST) cp = routename(sockin->sin_addr.s_addr); else if (mask) cp = netname(sockin->sin_addr.s_addr, ntohl(((struct 
sockaddr_in *)mask) ->sin_addr.s_addr)); else cp = netname(sockin->sin_addr.s_addr, 0L); break; } #ifdef INET6 case AF_INET6: { struct sockaddr_in6 *sa6 = (struct sockaddr_in6 *)sa; struct in6_addr *in6 = &sa6->sin6_addr; /* * XXX: This is a special workaround for KAME kernels. * sin6_scope_id field of SA should be set in the future. */ if (IN6_IS_ADDR_LINKLOCAL(in6) || IN6_IS_ADDR_MC_LINKLOCAL(in6)) { /* XXX: override is ok? */ sa6->sin6_scope_id = (u_int32_t)ntohs(*(u_short *)&in6->s6_addr[2]); *(u_short *)&in6->s6_addr[2] = 0; } if (flags & RTF_HOST) cp = routename6(sa6); else if (mask) cp = netname6(sa6, &((struct sockaddr_in6 *)mask)->sin6_addr); else { cp = netname6(sa6, NULL); } break; } #endif /*INET6*/ case AF_IPX: { struct ipx_addr work = ((struct sockaddr_ipx *)sa)->sipx_addr; if (ipx_nullnet(satoipx_addr(work))) cp = "default"; else cp = ipx_print(sa); break; } case AF_APPLETALK: { if (!(flags & RTF_HOST) && mask) cp = atalk_print2(sa,mask,9); else cp = atalk_print(sa,11); break; } case AF_NETGRAPH: { strlcpy(workbuf, ((struct sockaddr_ng *)sa)->sg_data, sizeof(workbuf)); cp = workbuf; break; } case AF_LINK: { struct sockaddr_dl *sdl = (struct sockaddr_dl *)sa; if (sdl->sdl_nlen == 0 && sdl->sdl_alen == 0 && sdl->sdl_slen == 0) { (void) sprintf(workbuf, "link#%d", sdl->sdl_index); cp = workbuf; } else switch (sdl->sdl_type) { case IFT_ETHER: case IFT_L2VLAN: case IFT_BRIDGE: if (sdl->sdl_alen == ETHER_ADDR_LEN) { cp = ether_ntoa((struct ether_addr *) (sdl->sdl_data + sdl->sdl_nlen)); break; } /* FALLTHROUGH */ default: cp = link_ntoa(sdl); break; } break; } default: { u_char *s = (u_char *)sa->sa_data, *slim; char *cq, *cqlim; cq = workbuf; slim = sa->sa_len + (u_char *) sa; cqlim = cq + sizeof(workbuf) - 6; cq += sprintf(cq, "(%d)", sa->sa_family); while (s < slim && cq < cqlim) { cq += sprintf(cq, " %02x", *s++); if (s < slim) cq += sprintf(cq, "%02x", *s++); } cp = workbuf; } } return (cp); } static void p_flags(int f, const char *format) { printf(format, fmt_flags(f)); } static const char * fmt_flags(int f) { static char name[33]; char *flags; struct bits *p = bits; for (flags = name; p->b_mask; p++) if (p->b_mask & f) *flags++ = p->b_val; *flags = '\0'; return (name); } static void p_rtentry(struct rtentry *rt) { static struct ifnet ifnet, *lastif; - struct rtentry parent; static char buffer[128]; static char prettyname[128]; struct sockaddr *sa; sa_u addr, mask; - /* - * Don't print protocol-cloned routes unless -a. 
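/*
 * Side note, hedged: ntreestuff()/np_rtentry() above are the kvm-free path;
 * they size and fetch the table with the routing sysctl, roughly
 *
 *	int mib[6] = { CTL_NET, PF_ROUTE, 0, 0, NET_RT_DUMP, 0 };
 *	sysctl(mib, 6, NULL, &needed, NULL, 0);   size estimate
 *	sysctl(mib, 6, buf,  &needed, NULL, 0);   actual dump
 *
 * and then walk the rt_msghdr records by rtm_msglen.  fmt_sockaddr() keeps
 * the KAME workaround of pulling a link-local scope id out of s6_addr[2..3]
 * before printing.  The block deleted just below is the same
 * protocol-cloned-route filter that was dropped from size_cols_rtentry().
 */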
- */ - if (rt->rt_flags & RTF_WASCLONED && !aflag) { - if (kget(rt->rt_parent, parent) != 0) - return; - if (parent.rt_flags & RTF_PRCLONING) - return; - } - bzero(&addr, sizeof(addr)); if ((sa = kgetsa(rt_key(rt)))) bcopy(sa, &addr, sa->sa_len); bzero(&mask, sizeof(mask)); if (rt_mask(rt) && (sa = kgetsa(rt_mask(rt)))) bcopy(sa, &mask, sa->sa_len); p_sockaddr(&addr.u_sa, &mask.u_sa, rt->rt_flags, wid_dst); p_sockaddr(kgetsa(rt->rt_gateway), NULL, RTF_HOST, wid_gw); snprintf(buffer, sizeof(buffer), "%%-%d.%ds ", wid_flags, wid_flags); p_flags(rt->rt_flags, buffer); if (addr.u_sa.sa_family == AF_INET || Wflag) { - printf("%*ld %*lu ", wid_refs, rt->rt_refcnt, + printf("%*d %*lu ", wid_refs, rt->rt_refcnt, wid_use, rt->rt_use); if (Wflag) { if (rt->rt_rmx.rmx_mtu != 0) printf("%*lu ", wid_mtu, rt->rt_rmx.rmx_mtu); else printf("%*s ", wid_mtu, ""); } } if (rt->rt_ifp) { if (rt->rt_ifp != lastif) { if (kget(rt->rt_ifp, ifnet) == 0) strlcpy(prettyname, ifnet.if_xname, sizeof(prettyname)); else strlcpy(prettyname, "---", sizeof(prettyname)); lastif = rt->rt_ifp; } printf("%*.*s", wid_if, wid_if, prettyname); if (rt->rt_rmx.rmx_expire) { time_t expire_time; if ((expire_time = rt->rt_rmx.rmx_expire - uptime.tv_sec) > 0) printf(" %*d", wid_expire, (int)expire_time); } if (rt->rt_nodes[0].rn_dupedkey) printf(" =>"); } putchar('\n'); } char * routename(in_addr_t in) { char *cp; static char line[MAXHOSTNAMELEN]; struct hostent *hp; cp = 0; if (!numeric_addr) { hp = gethostbyaddr(&in, sizeof (struct in_addr), AF_INET); if (hp) { cp = hp->h_name; trimdomain(cp, strlen(cp)); } } if (cp) { strlcpy(line, cp, sizeof(line)); } else { #define C(x) ((x) & 0xff) in = ntohl(in); sprintf(line, "%u.%u.%u.%u", C(in >> 24), C(in >> 16), C(in >> 8), C(in)); } return (line); } #define NSHIFT(m) ( \ (m) == IN_CLASSA_NET ? IN_CLASSA_NSHIFT : \ (m) == IN_CLASSB_NET ? IN_CLASSB_NSHIFT : \ (m) == IN_CLASSC_NET ? IN_CLASSC_NSHIFT : \ 0) static void domask(char *dst, in_addr_t addr __unused, u_long mask) { int b, i; if (mask == 0 || (!numeric_addr && NSHIFT(mask) != 0)) { *dst = '\0'; return; } i = 0; for (b = 0; b < 32; b++) if (mask & (1 << b)) { int bb; i = b; for (bb = b+1; bb < 32; bb++) if (!(mask & (1 << bb))) { i = -1; /* noncontig */ break; } break; } if (i == -1) sprintf(dst, "&0x%lx", mask); else sprintf(dst, "/%d", 32-i); } /* * Return the name of the network whose address is given. * The address is assumed to be that of a net or subnet, not a host. 
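 * As far as the code below goes: unless -n was given it first tries
 * getnetbyaddr() on the network part, otherwise (or on lookup failure) it
 * falls back to inet_ntop(), and domask() then appends "/len" for a
 * contiguous mask (e.g. 0xffffff00 -> "/24"), omits the suffix for the
 * classful masks when -n is not in effect, and prints "&0x..." when the
 * mask is non-contiguous.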
*/ char * netname(in_addr_t in, u_long mask) { char *cp = 0; static char line[MAXHOSTNAMELEN]; struct netent *np = 0; in_addr_t i; i = ntohl(in); if (!numeric_addr && i) { np = getnetbyaddr(i >> NSHIFT(mask), AF_INET); if (np != NULL) { cp = np->n_name; trimdomain(cp, strlen(cp)); } } if (cp != NULL) { strlcpy(line, cp, sizeof(line)); } else { inet_ntop(AF_INET, &in, line, sizeof(line) - 1); } domask(line + strlen(line), i, mask); return (line); } #undef NSHIFT #ifdef INET6 const char * netname6(struct sockaddr_in6 *sa6, struct in6_addr *mask) { static char line[MAXHOSTNAMELEN]; u_char *p = (u_char *)mask; u_char *lim; int masklen, illegal = 0, flag = 0; if (mask) { for (masklen = 0, lim = p + 16; p < lim; p++) { switch (*p) { case 0xff: masklen += 8; break; case 0xfe: masklen += 7; break; case 0xfc: masklen += 6; break; case 0xf8: masklen += 5; break; case 0xf0: masklen += 4; break; case 0xe0: masklen += 3; break; case 0xc0: masklen += 2; break; case 0x80: masklen += 1; break; case 0x00: break; default: illegal ++; break; } } if (illegal) fprintf(stderr, "illegal prefixlen\n"); } else masklen = 128; if (masklen == 0 && IN6_IS_ADDR_UNSPECIFIED(&sa6->sin6_addr)) return("default"); if (numeric_addr) flag |= NI_NUMERICHOST; getnameinfo((struct sockaddr *)sa6, sa6->sin6_len, line, sizeof(line), NULL, 0, flag); if (numeric_addr) sprintf(&line[strlen(line)], "/%d", masklen); return line; } char * routename6(struct sockaddr_in6 *sa6) { static char line[MAXHOSTNAMELEN]; int flag = 0; /* use local variable for safety */ struct sockaddr_in6 sa6_local; sa6_local.sin6_family = AF_INET6; sa6_local.sin6_len = sizeof(sa6_local); sa6_local.sin6_addr = sa6->sin6_addr; sa6_local.sin6_scope_id = sa6->sin6_scope_id; if (numeric_addr) flag |= NI_NUMERICHOST; getnameinfo((struct sockaddr *)&sa6_local, sa6_local.sin6_len, line, sizeof(line), NULL, 0, flag); return line; } #endif /*INET6*/ /* * Print routing statistics */ void rt_stats(u_long rtsaddr, u_long rttaddr) { struct rtstat rtstat; int rttrash; if (rtsaddr == 0) { printf("rtstat: symbol not in namelist\n"); return; } if (rttaddr == 0) { printf("rttrash: symbol not in namelist\n"); return; } kread(rtsaddr, (char *)&rtstat, sizeof (rtstat)); kread(rttaddr, (char *)&rttrash, sizeof (rttrash)); printf("routing:\n"); #define p(f, m) if (rtstat.f || sflag <= 1) \ printf(m, rtstat.f, plural(rtstat.f)) p(rts_badredirect, "\t%u bad routing redirect%s\n"); p(rts_dynamic, "\t%u dynamically created route%s\n"); p(rts_newgateway, "\t%u new gateway%s due to redirects\n"); p(rts_unreach, "\t%u destination%s found unreachable\n"); p(rts_wildcard, "\t%u use%s of a wildcard route\n"); #undef p if (rttrash || sflag <= 1) printf("\t%u route%s not in table but not freed\n", rttrash, plural(rttrash)); } char * ipx_print(struct sockaddr *sa) { u_short port; struct servent *sp = 0; const char *net = "", *host = ""; char *p; u_char *q; struct ipx_addr work = ((struct sockaddr_ipx *)sa)->sipx_addr; static char mybuf[50]; char cport[10], chost[15], cnet[15]; port = ntohs(work.x_port); if (ipx_nullnet(work) && ipx_nullhost(work)) { if (port) { if (sp) sprintf(mybuf, "*.%s", sp->s_name); else sprintf(mybuf, "*.%x", port); } else sprintf(mybuf, "*.*"); return (mybuf); } if (ipx_wildnet(work)) net = "any"; else if (ipx_nullnet(work)) net = "*"; else { q = work.x_net.c_net; sprintf(cnet, "%02x%02x%02x%02x", q[0], q[1], q[2], q[3]); for (p = cnet; *p == '0' && p < cnet + 8; p++) continue; net = p; } if (ipx_wildhost(work)) host = "any"; else if (ipx_nullhost(work)) host = "*"; else { q 
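/*
 * Quick gloss on netname6() above: with a mask it simply sums bits per byte
 * (0xff=8, 0xfe=7, ... 0x80=1, 0x00=0) to get the prefix length and warns
 * about any other byte value as an illegal prefix; with no mask it assumes
 * /128, and an unspecified address with a zero-length mask is printed as
 * "default".  The numeric "/len" suffix is only appended under -n.
 */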
= work.x_host.c_host; sprintf(chost, "%02x%02x%02x%02x%02x%02x", q[0], q[1], q[2], q[3], q[4], q[5]); for (p = chost; *p == '0' && p < chost + 12; p++) continue; host = p; } if (port) { if (strcmp(host, "*") == 0) host = ""; if (sp) snprintf(cport, sizeof(cport), "%s%s", *host ? "." : "", sp->s_name); else snprintf(cport, sizeof(cport), "%s%x", *host ? "." : "", port); } else *cport = 0; snprintf(mybuf, sizeof(mybuf), "%s.%s%s", net, host, cport); return(mybuf); } char * ipx_phost(struct sockaddr *sa) { struct sockaddr_ipx *sipx = (struct sockaddr_ipx *)sa; struct sockaddr_ipx work; static union ipx_net ipx_zeronet; char *p; struct ipx_addr in; work = *sipx; in = work.sipx_addr; work.sipx_addr.x_port = 0; work.sipx_addr.x_net = ipx_zeronet; p = ipx_print((struct sockaddr *)&work); if (strncmp("*.", p, 2) == 0) p += 2; return(p); } void upHex(char *p0) { char *p = p0; for (; *p; p++) switch (*p) { case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': *p += ('A' - 'a'); break; } } Index: head/usr.sbin/arp/arp.c =================================================================== --- head/usr.sbin/arp/arp.c (revision 186118) +++ head/usr.sbin/arp/arp.c (revision 186119) @@ -1,821 +1,839 @@ /* * Copyright (c) 1984, 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Sun Microsystems, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #if 0 #ifndef lint static char const copyright[] = "@(#) Copyright (c) 1984, 1993\n\ The Regents of the University of California. 
All rights reserved.\n"; #endif /* not lint */ #ifndef lint static char const sccsid[] = "@(#)from: arp.c 8.2 (Berkeley) 1/2/94"; #endif /* not lint */ #endif #include __FBSDID("$FreeBSD$"); /* * arp - display, set, and delete arp table entries */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include typedef void (action_fn)(struct sockaddr_dl *sdl, struct sockaddr_inarp *s_in, struct rt_msghdr *rtm); static int search(u_long addr, action_fn *action); static action_fn print_entry; static action_fn nuke_entry; static int delete(char *host, int do_proxy); static void usage(void); static int set(int argc, char **argv); static int get(char *host); static int file(char *name); static struct rt_msghdr *rtmsg(int cmd, struct sockaddr_inarp *dst, struct sockaddr_dl *sdl); static int get_ether_addr(in_addr_t ipaddr, struct ether_addr *hwaddr); static struct sockaddr_inarp *getaddr(char *host); static int valid_type(int type); static int nflag; /* no reverse dns lookups */ static char *rifname; static int expire_time, flags, doing_proxy, proxy_only; /* which function we're supposed to do */ #define F_GET 1 #define F_SET 2 #define F_FILESET 3 #define F_REPLACE 4 #define F_DELETE 5 #define SETFUNC(f) { if (func) usage(); func = (f); } int main(int argc, char *argv[]) { int ch, func = 0; int rtn = 0; int aflag = 0; /* do it for all entries */ while ((ch = getopt(argc, argv, "andfsSi:")) != -1) switch((char)ch) { case 'a': aflag = 1; break; case 'd': SETFUNC(F_DELETE); break; case 'n': nflag = 1; break; case 'S': SETFUNC(F_REPLACE); break; case 's': SETFUNC(F_SET); break; case 'f' : SETFUNC(F_FILESET); break; case 'i': rifname = optarg; break; case '?': default: usage(); } argc -= optind; argv += optind; if (!func) func = F_GET; if (rifname) { if (func != F_GET && !(func == F_DELETE && aflag)) errx(1, "-i not applicable to this operation"); if (if_nametoindex(rifname) == 0) { if (errno == ENXIO) errx(1, "interface %s does not exist", rifname); else err(1, "if_nametoindex(%s)", rifname); } } switch (func) { case F_GET: if (aflag) { if (argc != 0) usage(); search(0, print_entry); } else { if (argc != 1) usage(); rtn = get(argv[0]); } break; case F_SET: case F_REPLACE: if (argc < 2 || argc > 6) usage(); if (func == F_REPLACE) (void)delete(argv[0], 0); rtn = set(argc, argv) ? 
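/*
 * Rough summary of the option handling above: SETFUNC() makes -d, -s, -S
 * and -f mutually exclusive, -i is only accepted for lookups (-a or a
 * single host) and for "arp -d -a", and -S is implemented as delete()
 * followed by set() on the same host.
 */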
1 : 0; break; case F_DELETE: if (aflag) { if (argc != 0) usage(); search(0, nuke_entry); } else { if (argc == 2 && strncmp(argv[1], "pub", 3) == 0) ch = SIN_PROXY; else if (argc == 1) ch = 0; else usage(); rtn = delete(argv[0], ch); } break; case F_FILESET: if (argc != 1) usage(); rtn = file(argv[0]); break; } return (rtn); } /* * Process a file to set standard arp entries */ static int file(char *name) { FILE *fp; int i, retval; char line[100], arg[5][50], *args[5], *p; if ((fp = fopen(name, "r")) == NULL) err(1, "cannot open %s", name); args[0] = &arg[0][0]; args[1] = &arg[1][0]; args[2] = &arg[2][0]; args[3] = &arg[3][0]; args[4] = &arg[4][0]; retval = 0; while(fgets(line, sizeof(line), fp) != NULL) { if ((p = strchr(line, '#')) != NULL) *p = '\0'; for (p = line; isblank(*p); p++); if (*p == '\n' || *p == '\0') continue; i = sscanf(p, "%49s %49s %49s %49s %49s", arg[0], arg[1], arg[2], arg[3], arg[4]); if (i < 2) { warnx("bad line: %s", line); retval = 1; continue; } if (set(i, args)) retval = 1; } fclose(fp); return (retval); } /* * Given a hostname, fills up a (static) struct sockaddr_inarp with * the address of the host and returns a pointer to the * structure. */ static struct sockaddr_inarp * getaddr(char *host) { struct hostent *hp; static struct sockaddr_inarp reply; bzero(&reply, sizeof(reply)); reply.sin_len = sizeof(reply); reply.sin_family = AF_INET; reply.sin_addr.s_addr = inet_addr(host); if (reply.sin_addr.s_addr == INADDR_NONE) { if (!(hp = gethostbyname(host))) { warnx("%s: %s", host, hstrerror(h_errno)); return (NULL); } bcopy((char *)hp->h_addr, (char *)&reply.sin_addr, sizeof reply.sin_addr); } return (&reply); } /* * Returns true if the type is a valid one for ARP. */ static int valid_type(int type) { switch (type) { case IFT_ETHER: case IFT_FDDI: case IFT_ISO88023: case IFT_ISO88024: case IFT_ISO88025: case IFT_L2VLAN: case IFT_BRIDGE: return (1); default: return (0); } } /* * Set an individual arp entry */ static int set(int argc, char **argv) { struct sockaddr_inarp *addr; struct sockaddr_inarp *dst; /* what are we looking for */ struct sockaddr_dl *sdl; struct rt_msghdr *rtm; struct ether_addr *ea; char *host = argv[0], *eaddr = argv[1]; struct sockaddr_dl sdl_m; argc -= 2; argv += 2; bzero(&sdl_m, sizeof(sdl_m)); sdl_m.sdl_len = sizeof(sdl_m); sdl_m.sdl_family = AF_LINK; dst = getaddr(host); if (dst == NULL) return (1); doing_proxy = flags = proxy_only = expire_time = 0; while (argc-- > 0) { if (strncmp(argv[0], "temp", 4) == 0) { struct timeval tv; gettimeofday(&tv, 0); expire_time = tv.tv_sec + 20 * 60; } else if (strncmp(argv[0], "pub", 3) == 0) { flags |= RTF_ANNOUNCE; doing_proxy = 1; if (argc && strncmp(argv[1], "only", 3) == 0) { proxy_only = 1; dst->sin_other = SIN_PROXY; argc--; argv++; } } else if (strncmp(argv[0], "blackhole", 9) == 0) { flags |= RTF_BLACKHOLE; } else if (strncmp(argv[0], "reject", 6) == 0) { flags |= RTF_REJECT; } else if (strncmp(argv[0], "trail", 5) == 0) { /* XXX deprecated and undocumented feature */ printf("%s: Sending trailers is no longer supported\n", host); } argv++; } ea = (struct ether_addr *)LLADDR(&sdl_m); if (doing_proxy && !strcmp(eaddr, "auto")) { if (!get_ether_addr(dst->sin_addr.s_addr, ea)) { printf("no interface found for %s\n", inet_ntoa(dst->sin_addr)); return (1); } sdl_m.sdl_alen = ETHER_ADDR_LEN; } else { struct ether_addr *ea1 = ether_aton(eaddr); if (ea1 == NULL) { warnx("invalid Ethernet address '%s'", eaddr); return (1); } else { *ea = *ea1; sdl_m.sdl_alen = ETHER_ADDR_LEN; } } for (;;) { /* try at most 
twice */ rtm = rtmsg(RTM_GET, dst, &sdl_m); if (rtm == NULL) { warn("%s", host); return (1); } addr = (struct sockaddr_inarp *)(rtm + 1); sdl = (struct sockaddr_dl *)(SA_SIZE(addr) + (char *)addr); if (addr->sin_addr.s_addr != dst->sin_addr.s_addr) break; if (sdl->sdl_family == AF_LINK && - (rtm->rtm_flags & RTF_LLINFO) && !(rtm->rtm_flags & RTF_GATEWAY) && valid_type(sdl->sdl_type) ) break; if (doing_proxy == 0) { printf("set: can only proxy for %s\n", host); return (1); } if (dst->sin_other & SIN_PROXY) { printf("set: proxy entry exists for non 802 device\n"); return (1); } dst->sin_other = SIN_PROXY; proxy_only = 1; } if (sdl->sdl_family != AF_LINK) { printf("cannot intuit interface index and type for %s\n", host); return (1); } sdl_m.sdl_type = sdl->sdl_type; sdl_m.sdl_index = sdl->sdl_index; return (rtmsg(RTM_ADD, dst, &sdl_m) == NULL); } /* * Display an individual arp entry */ static int get(char *host) { struct sockaddr_inarp *addr; addr = getaddr(host); if (addr == NULL) return (1); if (0 == search(addr->sin_addr.s_addr, print_entry)) { printf("%s (%s) -- no entry", host, inet_ntoa(addr->sin_addr)); if (rifname) printf(" on %s", rifname); printf("\n"); return (1); } return (0); } /* * Delete an arp entry */ static int delete(char *host, int do_proxy) { struct sockaddr_inarp *addr, *dst; struct rt_msghdr *rtm; struct sockaddr_dl *sdl; + struct sockaddr_dl sdl_m; dst = getaddr(host); if (dst == NULL) return (1); dst->sin_other = do_proxy; + + /* + * setup the data structure to notify the kernel + * it is the ARP entry the RTM_GET is interested + * in + */ + bzero(&sdl_m, sizeof(sdl_m)); + sdl_m.sdl_len = sizeof(sdl_m); + sdl_m.sdl_family = AF_LINK; + for (;;) { /* try twice */ - rtm = rtmsg(RTM_GET, dst, NULL); + rtm = rtmsg(RTM_GET, dst, &sdl_m); if (rtm == NULL) { warn("%s", host); return (1); } addr = (struct sockaddr_inarp *)(rtm + 1); sdl = (struct sockaddr_dl *)(SA_SIZE(addr) + (char *)addr); - if (addr->sin_addr.s_addr == dst->sin_addr.s_addr && - sdl->sdl_family == AF_LINK && - (rtm->rtm_flags & RTF_LLINFO) && + + /* + * With the new L2/L3 restructure, the route + * returned is a prefix route. The important + * piece of information from the previous + * RTM_GET is the interface index. In the + * case of ECMP, the kernel will traverse + * the route group for the given entry. 
+ */ + if (sdl->sdl_family == AF_LINK && !(rtm->rtm_flags & RTF_GATEWAY) && - valid_type(sdl->sdl_type) ) - break; /* found it */ + valid_type(sdl->sdl_type) ) { + addr->sin_addr.s_addr = dst->sin_addr.s_addr; + break; + } + if (dst->sin_other & SIN_PROXY) { fprintf(stderr, "delete: cannot locate %s\n",host); return (1); } dst->sin_other = SIN_PROXY; } if (rtmsg(RTM_DELETE, dst, NULL) != NULL) { printf("%s (%s) deleted\n", host, inet_ntoa(addr->sin_addr)); return (0); } return (1); } /* * Search the arp table and do some action on matching entries */ static int search(u_long addr, action_fn *action) { int mib[6]; size_t needed; char *lim, *buf, *newbuf, *next; struct rt_msghdr *rtm; struct sockaddr_inarp *sin2; struct sockaddr_dl *sdl; char ifname[IF_NAMESIZE]; int st, found_entry = 0; mib[0] = CTL_NET; mib[1] = PF_ROUTE; mib[2] = 0; mib[3] = AF_INET; mib[4] = NET_RT_FLAGS; +#ifdef RTF_LLINFO mib[5] = RTF_LLINFO; +#else + mib[5] = 0; +#endif if (sysctl(mib, 6, NULL, &needed, NULL, 0) < 0) err(1, "route-sysctl-estimate"); if (needed == 0) /* empty table */ return 0; buf = NULL; for (;;) { newbuf = realloc(buf, needed); if (newbuf == NULL) { if (buf != NULL) free(buf); errx(1, "could not reallocate memory"); } buf = newbuf; st = sysctl(mib, 6, buf, &needed, NULL, 0); if (st == 0 || errno != ENOMEM) break; needed += needed / 8; } if (st == -1) err(1, "actual retrieval of routing table"); lim = buf + needed; for (next = buf; next < lim; next += rtm->rtm_msglen) { rtm = (struct rt_msghdr *)next; sin2 = (struct sockaddr_inarp *)(rtm + 1); sdl = (struct sockaddr_dl *)((char *)sin2 + SA_SIZE(sin2)); if (rifname && if_indextoname(sdl->sdl_index, ifname) && strcmp(ifname, rifname)) continue; if (addr) { if (addr != sin2->sin_addr.s_addr) continue; found_entry = 1; } (*action)(sdl, sin2, rtm); } free(buf); return (found_entry); } /* * Display an arp entry */ static void print_entry(struct sockaddr_dl *sdl, struct sockaddr_inarp *addr, struct rt_msghdr *rtm) { const char *host; struct hostent *hp; struct iso88025_sockaddr_dl_data *trld; char ifname[IF_NAMESIZE]; int seg; if (nflag == 0) hp = gethostbyaddr((caddr_t)&(addr->sin_addr), sizeof addr->sin_addr, AF_INET); else hp = 0; if (hp) host = hp->h_name; else { host = "?"; if (h_errno == TRY_AGAIN) nflag = 1; } printf("%s (%s) at ", host, inet_ntoa(addr->sin_addr)); if (sdl->sdl_alen) { if ((sdl->sdl_type == IFT_ETHER || sdl->sdl_type == IFT_L2VLAN || sdl->sdl_type == IFT_BRIDGE) && sdl->sdl_alen == ETHER_ADDR_LEN) printf("%s", ether_ntoa((struct ether_addr *)LLADDR(sdl))); else { int n = sdl->sdl_nlen > 0 ? 
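/*
 * Hedged reading of the delete()/search() changes above: the RTM_GET is now
 * primed with an empty AF_LINK sockaddr_dl and may legitimately come back
 * with a prefix route, so only the interface information from the reply is
 * trusted and the requested address is copied back into it before the
 * RTM_DELETE.  search() keeps compiling where RTF_LLINFO is no longer
 * defined by passing a 0 flags filter to the NET_RT_FLAGS sysctl.  In
 * print_entry() below, "published" is now taken straight from RTF_ANNOUNCE
 * instead of being inferred from an all-ones netmask sockaddr.
 */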
sdl->sdl_nlen + 1 : 0; printf("%s", link_ntoa(sdl) + n); } } else printf("(incomplete)"); if (if_indextoname(sdl->sdl_index, ifname) != NULL) printf(" on %s", ifname); if (rtm->rtm_rmx.rmx_expire == 0) printf(" permanent"); if (addr->sin_other & SIN_PROXY) printf(" published (proxy only)"); - if (rtm->rtm_addrs & RTA_NETMASK) { - addr = (struct sockaddr_inarp *) - (SA_SIZE(sdl) + (char *)sdl); - if (addr->sin_addr.s_addr == 0xffffffff) - printf(" published"); - if (addr->sin_len != 8) - printf("(weird)"); - } - switch(sdl->sdl_type) { + if (rtm->rtm_flags & RTF_ANNOUNCE) + printf(" published"); + switch(sdl->sdl_type) { case IFT_ETHER: printf(" [ethernet]"); break; case IFT_ISO88025: printf(" [token-ring]"); trld = SDL_ISO88025(sdl); if (trld->trld_rcf != 0) { printf(" rt=%x", ntohs(trld->trld_rcf)); for (seg = 0; seg < ((TR_RCF_RIFLEN(trld->trld_rcf) - 2 ) / 2); seg++) printf(":%x", ntohs(*(trld->trld_route[seg]))); } break; case IFT_FDDI: printf(" [fddi]"); break; case IFT_ATM: printf(" [atm]"); break; case IFT_L2VLAN: printf(" [vlan]"); break; case IFT_IEEE1394: printf(" [firewire]"); break; case IFT_BRIDGE: printf(" [bridge]"); break; default: break; } printf("\n"); } /* * Nuke an arp entry */ static void nuke_entry(struct sockaddr_dl *sdl __unused, struct sockaddr_inarp *addr, struct rt_msghdr *rtm __unused) { char ip[20]; snprintf(ip, sizeof(ip), "%s", inet_ntoa(addr->sin_addr)); (void)delete(ip, 0); } static void usage(void) { fprintf(stderr, "%s\n%s\n%s\n%s\n%s\n%s\n%s\n", "usage: arp [-n] [-i interface] hostname", " arp [-n] [-i interface] -a", " arp -d hostname [pub]", " arp -d [-i interface] -a", " arp -s hostname ether_addr [temp] [reject] [blackhole] [pub [only]]", " arp -S hostname ether_addr [temp] [reject] [blackhole] [pub [only]]", " arp -f filename"); exit(1); } static struct rt_msghdr * rtmsg(int cmd, struct sockaddr_inarp *dst, struct sockaddr_dl *sdl) { static int seq; int rlen; int l; struct sockaddr_in so_mask, *som = &so_mask; static int s = -1; static pid_t pid; static struct { struct rt_msghdr m_rtm; char m_space[512]; } m_rtmsg; struct rt_msghdr *rtm = &m_rtmsg.m_rtm; char *cp = m_rtmsg.m_space; if (s < 0) { /* first time: open socket, get pid */ s = socket(PF_ROUTE, SOCK_RAW, 0); if (s < 0) err(1, "socket"); pid = getpid(); } bzero(&so_mask, sizeof(so_mask)); so_mask.sin_len = 8; so_mask.sin_addr.s_addr = 0xffffffff; errno = 0; /* * XXX RTM_DELETE relies on a previous RTM_GET to fill the buffer * appropriately. 
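 * (In other words, as far as this reads: delete() always issues an RTM_GET
 *  first, and the RTM_DELETE case below jumps straight to "doit" and
 *  resends whatever that reply left in m_rtmsg, so the preceding GET is
 *  not optional.)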
*/ if (cmd == RTM_DELETE) goto doit; bzero((char *)&m_rtmsg, sizeof(m_rtmsg)); rtm->rtm_flags = flags; rtm->rtm_version = RTM_VERSION; switch (cmd) { default: errx(1, "internal wrong cmd"); case RTM_ADD: rtm->rtm_addrs |= RTA_GATEWAY; rtm->rtm_rmx.rmx_expire = expire_time; rtm->rtm_inits = RTV_EXPIRE; rtm->rtm_flags |= (RTF_HOST | RTF_STATIC); dst->sin_other = 0; if (doing_proxy) { if (proxy_only) dst->sin_other = SIN_PROXY; else { rtm->rtm_addrs |= RTA_NETMASK; rtm->rtm_flags &= ~RTF_HOST; } } /* FALLTHROUGH */ case RTM_GET: rtm->rtm_addrs |= RTA_DST; } #define NEXTADDR(w, s) \ if ((s) != NULL && rtm->rtm_addrs & (w)) { \ bcopy((s), cp, sizeof(*(s))); cp += SA_SIZE(s);} NEXTADDR(RTA_DST, dst); NEXTADDR(RTA_GATEWAY, sdl); NEXTADDR(RTA_NETMASK, som); rtm->rtm_msglen = cp - (char *)&m_rtmsg; doit: l = rtm->rtm_msglen; rtm->rtm_seq = ++seq; rtm->rtm_type = cmd; if ((rlen = write(s, (char *)&m_rtmsg, l)) < 0) { if (errno != ESRCH || cmd != RTM_DELETE) { warn("writing to routing socket"); return (NULL); } } do { l = read(s, (char *)&m_rtmsg, sizeof(m_rtmsg)); } while (l > 0 && (rtm->rtm_seq != seq || rtm->rtm_pid != pid)); if (l < 0) warn("read from routing socket"); return (rtm); } /* * get_ether_addr - get the hardware address of an interface on the * the same subnet as ipaddr. */ #define MAX_IFS 32 static int get_ether_addr(in_addr_t ipaddr, struct ether_addr *hwaddr) { struct ifreq *ifr, *ifend, *ifp; in_addr_t ina, mask; struct sockaddr_dl *dla; struct ifreq ifreq; struct ifconf ifc; struct ifreq ifs[MAX_IFS]; int sock; int retval = 0; sock = socket(AF_INET, SOCK_DGRAM, 0); if (sock < 0) err(1, "socket"); ifc.ifc_len = sizeof(ifs); ifc.ifc_req = ifs; if (ioctl(sock, SIOCGIFCONF, &ifc) < 0) { warnx("ioctl(SIOCGIFCONF)"); goto done; } #define NEXTIFR(i) \ ((struct ifreq *)((char *)&(i)->ifr_addr \ + MAX((i)->ifr_addr.sa_len, sizeof((i)->ifr_addr))) ) /* * Scan through looking for an interface with an Internet * address on the same subnet as `ipaddr'. */ ifend = (struct ifreq *)(ifc.ifc_buf + ifc.ifc_len); for (ifr = ifc.ifc_req; ifr < ifend; ifr = NEXTIFR(ifr) ) { if (ifr->ifr_addr.sa_family != AF_INET) continue; strncpy(ifreq.ifr_name, ifr->ifr_name, sizeof(ifreq.ifr_name)); ifreq.ifr_addr = ifr->ifr_addr; /* * Check that the interface is up, * and not point-to-point or loopback. */ if (ioctl(sock, SIOCGIFFLAGS, &ifreq) < 0) continue; if ((ifreq.ifr_flags & (IFF_UP|IFF_BROADCAST|IFF_POINTOPOINT| IFF_LOOPBACK|IFF_NOARP)) != (IFF_UP|IFF_BROADCAST)) continue; /* * Get its netmask and check that it's on * the right subnet. */ if (ioctl(sock, SIOCGIFNETMASK, &ifreq) < 0) continue; mask = ((struct sockaddr_in *) &ifreq.ifr_addr)->sin_addr.s_addr; ina = ((struct sockaddr_in *) &ifr->ifr_addr)->sin_addr.s_addr; if ((ipaddr & mask) == (ina & mask)) break; /* ok, we got it! */ } if (ifr >= ifend) goto done; /* * Now scan through again looking for a link-level address * for this interface. 
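 * (Two-pass sketch: the first loop above picked an interface that is UP and
 *  BROADCAST, not LOOPBACK, POINTOPOINT or NOARP, and that satisfies
 *  (ipaddr & mask) == (ifaddr & mask); this second loop re-walks the list
 *  for an AF_LINK entry with the same name and copies its lladdr into
 *  hwaddr.)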
*/ ifp = ifr; for (ifr = ifc.ifc_req; ifr < ifend; ifr = NEXTIFR(ifr)) if (strcmp(ifp->ifr_name, ifr->ifr_name) == 0 && ifr->ifr_addr.sa_family == AF_LINK) break; if (ifr >= ifend) goto done; /* * Found the link-level address - copy it out */ dla = (struct sockaddr_dl *) &ifr->ifr_addr; memcpy(hwaddr, LLADDR(dla), dla->sdl_alen); printf("using interface %s for proxy with address ", ifp->ifr_name); printf("%s\n", ether_ntoa(hwaddr)); retval = dla->sdl_alen; done: close(sock); return (retval); } Index: head/usr.sbin/ndp/ndp.c =================================================================== --- head/usr.sbin/ndp/ndp.c (revision 186118) +++ head/usr.sbin/ndp/ndp.c (revision 186119) @@ -1,1618 +1,1630 @@ /* $FreeBSD$ */ /* $KAME: ndp.c,v 1.104 2003/06/27 07:48:39 itojun Exp $ */ /* * Copyright (C) 1995, 1996, 1997, 1998, and 1999 WIDE Project. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the project nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* * Copyright (c) 1984, 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Sun Microsystems, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* * Based on: * "@(#) Copyright (c) 1984, 1993\n\ * The Regents of the University of California. All rights reserved.\n"; * * "@(#)arp.c 8.2 (Berkeley) 1/2/94"; */ /* * ndp - display, set, delete and flush neighbor cache */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "gmt2local.h" /* packing rule for routing socket */ #define ROUNDUP(a) \ ((a) > 0 ? (1 + (((a) - 1) | (sizeof(long) - 1))) : sizeof(long)) #define ADVANCE(x, n) (x += ROUNDUP((n)->sa_len)) +#define NEXTADDR(w, s) \ + if (rtm->rtm_addrs & (w)) { \ + bcopy((char *)&s, cp, sizeof(s)); cp += sizeof(s);} + + static pid_t pid; static int nflag; static int tflag; static int32_t thiszone; /* time difference with gmt */ static int s = -1; static int repeat = 0; char ntop_buf[INET6_ADDRSTRLEN]; /* inet_ntop() */ char host_buf[NI_MAXHOST]; /* getnameinfo() */ char ifix_buf[IFNAMSIZ]; /* if_indextoname() */ int main(int, char **); int file(char *); void getsocket(void); int set(int, char **); void get(char *); int delete(char *); void dump(struct in6_addr *, int); static struct in6_nbrinfo *getnbrinfo(struct in6_addr *, int, int); static char *ether_str(struct sockaddr_dl *); int ndp_ether_aton(char *, u_char *); void usage(void); int rtmsg(int); void ifinfo(char *, int, char **); void rtrlist(void); void plist(void); void pfx_flush(void); void rtr_flush(void); void harmonize_rtr(void); #ifdef SIOCSDEFIFACE_IN6 /* XXX: check SIOCGDEFIFACE_IN6 as well? */ static void getdefif(void); static void setdefif(char *); #endif static char *sec2str(time_t); static char *ether_str(struct sockaddr_dl *); static void ts_print(const struct timeval *); #ifdef ICMPV6CTL_ND6_DRLIST static char *rtpref_str[] = { "medium", /* 00 */ "high", /* 01 */ "rsv", /* 10 */ "low" /* 11 */ }; #endif int mode = 0; char *arg = NULL; int main(argc, argv) int argc; char **argv; { int ch; pid = getpid(); thiszone = gmt2local(0); while ((ch = getopt(argc, argv, "acd:f:Ii:nprstA:HPR")) != -1) switch (ch) { case 'a': case 'c': case 'p': case 'r': case 'H': case 'P': case 'R': case 's': case 'I': if (mode) { usage(); /*NOTREACHED*/ } mode = ch; arg = NULL; break; case 'd': case 'f': case 'i' : if (mode) { usage(); /*NOTREACHED*/ } mode = ch; arg = optarg; break; case 'n': nflag = 1; break; case 't': tflag = 1; break; case 'A': if (mode) { usage(); /*NOTREACHED*/ } mode = 'a'; repeat = atoi(optarg); if (repeat < 0) { usage(); /*NOTREACHED*/ } break; default: usage(); } argc -= optind; argv += optind; switch (mode) { case 'a': case 'c': if (argc != 0) { usage(); /*NOTREACHED*/ } dump(0, mode == 'c'); break; case 'd': if (argc != 0) { usage(); /*NOTREACHED*/ } delete(arg); break; case 'I': #ifdef SIOCSDEFIFACE_IN6 /* XXX: check SIOCGDEFIFACE_IN6 as well? 
*/ if (argc > 1) { usage(); /*NOTREACHED*/ } else if (argc == 1) { if (strcmp(*argv, "delete") == 0 || if_nametoindex(*argv)) setdefif(*argv); else errx(1, "invalid interface %s", *argv); } getdefif(); /* always call it to print the result */ break; #else errx(1, "not supported yet"); /*NOTREACHED*/ #endif case 'p': if (argc != 0) { usage(); /*NOTREACHED*/ } plist(); break; case 'i': ifinfo(arg, argc, argv); break; case 'r': if (argc != 0) { usage(); /*NOTREACHED*/ } rtrlist(); break; case 's': if (argc < 2 || argc > 4) usage(); exit(set(argc, argv) ? 1 : 0); case 'H': if (argc != 0) { usage(); /*NOTREACHED*/ } harmonize_rtr(); break; case 'P': if (argc != 0) { usage(); /*NOTREACHED*/ } pfx_flush(); break; case 'R': if (argc != 0) { usage(); /*NOTREACHED*/ } rtr_flush(); break; case 0: if (argc != 1) { usage(); /*NOTREACHED*/ } get(argv[0]); break; } exit(0); } /* * Process a file to set standard ndp entries */ int file(name) char *name; { FILE *fp; int i, retval; char line[100], arg[5][50], *args[5]; if ((fp = fopen(name, "r")) == NULL) { fprintf(stderr, "ndp: cannot open %s\n", name); exit(1); } args[0] = &arg[0][0]; args[1] = &arg[1][0]; args[2] = &arg[2][0]; args[3] = &arg[3][0]; args[4] = &arg[4][0]; retval = 0; while (fgets(line, sizeof(line), fp) != NULL) { i = sscanf(line, "%49s %49s %49s %49s %49s", arg[0], arg[1], arg[2], arg[3], arg[4]); if (i < 2) { fprintf(stderr, "ndp: bad line: %s\n", line); retval = 1; continue; } if (set(i, args)) retval = 1; } fclose(fp); return (retval); } void getsocket() { if (s < 0) { s = socket(PF_ROUTE, SOCK_RAW, 0); if (s < 0) { err(1, "socket"); /* NOTREACHED */ } } } struct sockaddr_in6 so_mask = {sizeof(so_mask), AF_INET6 }; struct sockaddr_in6 blank_sin = {sizeof(blank_sin), AF_INET6 }, sin_m; struct sockaddr_dl blank_sdl = {sizeof(blank_sdl), AF_LINK }, sdl_m; int expire_time, flags, found_entry; struct { struct rt_msghdr m_rtm; char m_space[512]; } m_rtmsg; /* * Set an individual neighbor cache entry */ int set(argc, argv) int argc; char **argv; { register struct sockaddr_in6 *sin = &sin_m; register struct sockaddr_dl *sdl; register struct rt_msghdr *rtm = &(m_rtmsg.m_rtm); struct addrinfo hints, *res; int gai_error; u_char *ea; char *host = argv[0], *eaddr = argv[1]; getsocket(); argc -= 2; argv += 2; sdl_m = blank_sdl; sin_m = blank_sin; bzero(&hints, sizeof(hints)); hints.ai_family = AF_INET6; gai_error = getaddrinfo(host, NULL, &hints, &res); if (gai_error) { fprintf(stderr, "ndp: %s: %s\n", host, gai_strerror(gai_error)); return 1; } sin->sin6_addr = ((struct sockaddr_in6 *)res->ai_addr)->sin6_addr; #ifdef __KAME__ if (IN6_IS_ADDR_LINKLOCAL(&sin->sin6_addr)) { *(u_int16_t *)&sin->sin6_addr.s6_addr[2] = htons(((struct sockaddr_in6 *)res->ai_addr)->sin6_scope_id); } #endif ea = (u_char *)LLADDR(&sdl_m); if (ndp_ether_aton(eaddr, ea) == 0) sdl_m.sdl_alen = 6; flags = expire_time = 0; while (argc-- > 0) { if (strncmp(argv[0], "temp", 4) == 0) { struct timeval time; gettimeofday(&time, 0); expire_time = time.tv_sec + 20 * 60; } else if (strncmp(argv[0], "proxy", 5) == 0) flags |= RTF_ANNOUNCE; argv++; } if (rtmsg(RTM_GET) < 0) { errx(1, "RTM_GET(%s) failed", host); /* NOTREACHED */ } sin = (struct sockaddr_in6 *)(rtm + 1); sdl = (struct sockaddr_dl *)(ROUNDUP(sin->sin6_len) + (char *)sin); if (IN6_ARE_ADDR_EQUAL(&sin->sin6_addr, &sin_m.sin6_addr)) { if (sdl->sdl_family == AF_LINK && - (rtm->rtm_flags & RTF_LLINFO) && !(rtm->rtm_flags & RTF_GATEWAY)) { switch (sdl->sdl_type) { case IFT_ETHER: case IFT_FDDI: case IFT_ISO88023: case 
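/*
 * Hedged note on the ndp changes in this area: the RTM_GET checks in set()
 * and delete() drop the old RTF_LLINFO test and settle for an AF_LINK
 * gateway without RTF_GATEWAY (plus, in set(), one of the recognised
 * interface types), and the NEXTADDR() macro now lives at file scope,
 * copying and advancing by sizeof(s) rather than SA_SIZE(), so that
 * delete() can reuse it outside rtmsg().
 */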
IFT_ISO88024: case IFT_ISO88025: goto overwrite; } } /* * IPv4 arp command retries with sin_other = SIN_PROXY here. */ fprintf(stderr, "set: cannot configure a new entry\n"); return 1; } overwrite: if (sdl->sdl_family != AF_LINK) { printf("cannot intuit interface index and type for %s\n", host); return (1); } sdl_m.sdl_type = sdl->sdl_type; sdl_m.sdl_index = sdl->sdl_index; return (rtmsg(RTM_ADD)); } /* * Display an individual neighbor cache entry */ void get(host) char *host; { struct sockaddr_in6 *sin = &sin_m; struct addrinfo hints, *res; int gai_error; sin_m = blank_sin; bzero(&hints, sizeof(hints)); hints.ai_family = AF_INET6; gai_error = getaddrinfo(host, NULL, &hints, &res); if (gai_error) { fprintf(stderr, "ndp: %s: %s\n", host, gai_strerror(gai_error)); return; } sin->sin6_addr = ((struct sockaddr_in6 *)res->ai_addr)->sin6_addr; #ifdef __KAME__ if (IN6_IS_ADDR_LINKLOCAL(&sin->sin6_addr)) { *(u_int16_t *)&sin->sin6_addr.s6_addr[2] = htons(((struct sockaddr_in6 *)res->ai_addr)->sin6_scope_id); } #endif dump(&sin->sin6_addr, 0); if (found_entry == 0) { getnameinfo((struct sockaddr *)sin, sin->sin6_len, host_buf, sizeof(host_buf), NULL ,0, (nflag ? NI_NUMERICHOST : 0)); printf("%s (%s) -- no entry\n", host, host_buf); exit(1); } } /* * Delete a neighbor cache entry */ int delete(host) char *host; { struct sockaddr_in6 *sin = &sin_m; register struct rt_msghdr *rtm = &m_rtmsg.m_rtm; + register char *cp = m_rtmsg.m_space; struct sockaddr_dl *sdl; struct addrinfo hints, *res; int gai_error; getsocket(); sin_m = blank_sin; bzero(&hints, sizeof(hints)); hints.ai_family = AF_INET6; gai_error = getaddrinfo(host, NULL, &hints, &res); if (gai_error) { fprintf(stderr, "ndp: %s: %s\n", host, gai_strerror(gai_error)); return 1; } sin->sin6_addr = ((struct sockaddr_in6 *)res->ai_addr)->sin6_addr; #ifdef __KAME__ if (IN6_IS_ADDR_LINKLOCAL(&sin->sin6_addr)) { *(u_int16_t *)&sin->sin6_addr.s6_addr[2] = htons(((struct sockaddr_in6 *)res->ai_addr)->sin6_scope_id); } #endif if (rtmsg(RTM_GET) < 0) { errx(1, "RTM_GET(%s) failed", host); /* NOTREACHED */ } sin = (struct sockaddr_in6 *)(rtm + 1); sdl = (struct sockaddr_dl *)(ROUNDUP(sin->sin6_len) + (char *)sin); if (IN6_ARE_ADDR_EQUAL(&sin->sin6_addr, &sin_m.sin6_addr)) { if (sdl->sdl_family == AF_LINK && - (rtm->rtm_flags & RTF_LLINFO) && !(rtm->rtm_flags & RTF_GATEWAY)) { goto delete; } /* * IPv4 arp command retries with sin_other = SIN_PROXY here. */ fprintf(stderr, "delete: cannot delete non-NDP entry\n"); return 1; } delete: if (sdl->sdl_family != AF_LINK) { printf("cannot locate %s\n", host); return (1); } + /* + * need to reinit the field because it has rt_key + * but we want the actual address + */ + NEXTADDR(RTA_DST, sin_m); if (rtmsg(RTM_DELETE) == 0) { struct sockaddr_in6 s6 = *sin; /* XXX: for safety */ #ifdef __KAME__ if (IN6_IS_ADDR_LINKLOCAL(&s6.sin6_addr)) { s6.sin6_scope_id = ntohs(*(u_int16_t *)&s6.sin6_addr.s6_addr[2]); *(u_int16_t *)&s6.sin6_addr.s6_addr[2] = 0; } #endif getnameinfo((struct sockaddr *)&s6, s6.sin6_len, host_buf, sizeof(host_buf), NULL, 0, (nflag ? 
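/*
 * The NEXTADDR(RTA_DST, sin_m) above appears to rewrite the RTA_DST
 * sockaddr at the front of the reply buffer because the RTM_GET answer
 * carries the kernel's rt_key, which after this change may be a prefix
 * rather than the host entry itself.  The __KAME__ blocks seen here embed
 * a link-local scope id into s6_addr[2..3] (and strip it again before
 * printing), which is how KAME-derived kernels store it internally.
 */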
NI_NUMERICHOST : 0)); printf("%s (%s) deleted\n", host, host_buf); } return 0; } #define W_ADDR 36 #define W_LL 17 #define W_IF 6 /* * Dump the entire neighbor cache */ void dump(addr, cflag) struct in6_addr *addr; int cflag; { int mib[6]; size_t needed; char *lim, *buf, *next; struct rt_msghdr *rtm; struct sockaddr_in6 *sin; struct sockaddr_dl *sdl; extern int h_errno; struct in6_nbrinfo *nbi; struct timeval time; int addrwidth; int llwidth; int ifwidth; char flgbuf[8]; char *ifname; /* Print header */ if (!tflag && !cflag) printf("%-*.*s %-*.*s %*.*s %-9.9s %1s %5s\n", W_ADDR, W_ADDR, "Neighbor", W_LL, W_LL, "Linklayer Address", W_IF, W_IF, "Netif", "Expire", "S", "Flags"); again:; mib[0] = CTL_NET; mib[1] = PF_ROUTE; mib[2] = 0; mib[3] = AF_INET6; mib[4] = NET_RT_FLAGS; +#ifdef RTF_LLINFO mib[5] = RTF_LLINFO; +#else + mib[5] = 0; +#endif if (sysctl(mib, 6, NULL, &needed, NULL, 0) < 0) err(1, "sysctl(PF_ROUTE estimate)"); if (needed > 0) { if ((buf = malloc(needed)) == NULL) err(1, "malloc"); if (sysctl(mib, 6, buf, &needed, NULL, 0) < 0) err(1, "sysctl(PF_ROUTE, NET_RT_FLAGS)"); lim = buf + needed; } else buf = lim = NULL; for (next = buf; next && next < lim; next += rtm->rtm_msglen) { int isrouter = 0, prbs = 0; rtm = (struct rt_msghdr *)next; sin = (struct sockaddr_in6 *)(rtm + 1); sdl = (struct sockaddr_dl *)((char *)sin + ROUNDUP(sin->sin6_len)); /* * Some OSes can produce a route that has the LINK flag but * has a non-AF_LINK gateway (e.g. fe80::xx%lo0 on FreeBSD * and BSD/OS, where xx is not the interface identifier on * lo0). Such routes entry would annoy getnbrinfo() below, * so we skip them. * XXX: such routes should have the GATEWAY flag, not the * LINK flag. However, there is rotten routing software * that advertises all routes that have the GATEWAY flag. * Thus, KAME kernel intentionally does not set the LINK flag. * What is to be fixed is not ndp, but such routing software * (and the kernel workaround)... */ if (sdl->sdl_family != AF_LINK) continue; if (!(rtm->rtm_flags & RTF_HOST)) continue; if (addr) { if (!IN6_ARE_ADDR_EQUAL(addr, &sin->sin6_addr)) continue; found_entry = 1; } else if (IN6_IS_ADDR_MULTICAST(&sin->sin6_addr)) continue; if (IN6_IS_ADDR_LINKLOCAL(&sin->sin6_addr) || IN6_IS_ADDR_MC_LINKLOCAL(&sin->sin6_addr)) { /* XXX: should scope id be filled in the kernel? */ if (sin->sin6_scope_id == 0) sin->sin6_scope_id = sdl->sdl_index; #ifdef __KAME__ /* KAME specific hack; removed the embedded id */ *(u_int16_t *)&sin->sin6_addr.s6_addr[2] = 0; #endif } getnameinfo((struct sockaddr *)sin, sin->sin6_len, host_buf, sizeof(host_buf), NULL, 0, (nflag ? 
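/*
 * Glossing dump() above, with the usual caveats: the table is fetched via
 * NET_RT_FLAGS (flags filter 0 when RTF_LLINFO is not defined), entries
 * whose gateway is not AF_LINK or that lack RTF_HOST are skipped, and under
 * -c each matching entry is handed back to delete(); where neither
 * RTF_WASCLONED nor RTF_CLONED exists any more, that amounts to flushing
 * every cached entry.
 */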
NI_NUMERICHOST : 0)); if (cflag) { #ifdef RTF_WASCLONED if (rtm->rtm_flags & RTF_WASCLONED) delete(host_buf); #elif defined(RTF_CLONED) if (rtm->rtm_flags & RTF_CLONED) delete(host_buf); #else delete(host_buf); #endif continue; } gettimeofday(&time, 0); if (tflag) ts_print(&time); addrwidth = strlen(host_buf); if (addrwidth < W_ADDR) addrwidth = W_ADDR; llwidth = strlen(ether_str(sdl)); if (W_ADDR + W_LL - addrwidth > llwidth) llwidth = W_ADDR + W_LL - addrwidth; ifname = if_indextoname(sdl->sdl_index, ifix_buf); if (!ifname) ifname = "?"; ifwidth = strlen(ifname); if (W_ADDR + W_LL + W_IF - addrwidth - llwidth > ifwidth) ifwidth = W_ADDR + W_LL + W_IF - addrwidth - llwidth; printf("%-*.*s %-*.*s %*.*s", addrwidth, addrwidth, host_buf, llwidth, llwidth, ether_str(sdl), ifwidth, ifwidth, ifname); /* Print neighbor discovery specific informations */ nbi = getnbrinfo(&sin->sin6_addr, sdl->sdl_index, 1); if (nbi) { if (nbi->expire > time.tv_sec) { printf(" %-9.9s", sec2str(nbi->expire - time.tv_sec)); } else if (nbi->expire == 0) printf(" %-9.9s", "permanent"); else printf(" %-9.9s", "expired"); switch (nbi->state) { case ND6_LLINFO_NOSTATE: printf(" N"); break; #ifdef ND6_LLINFO_WAITDELETE case ND6_LLINFO_WAITDELETE: printf(" W"); break; #endif case ND6_LLINFO_INCOMPLETE: printf(" I"); break; case ND6_LLINFO_REACHABLE: printf(" R"); break; case ND6_LLINFO_STALE: printf(" S"); break; case ND6_LLINFO_DELAY: printf(" D"); break; case ND6_LLINFO_PROBE: printf(" P"); break; default: printf(" ?"); break; } isrouter = nbi->isrouter; prbs = nbi->asked; } else { warnx("failed to get neighbor information"); printf(" "); } /* * other flags. R: router, P: proxy, W: ?? */ if ((rtm->rtm_addrs & RTA_NETMASK) == 0) { snprintf(flgbuf, sizeof(flgbuf), "%s%s", isrouter ? "R" : "", (rtm->rtm_flags & RTF_ANNOUNCE) ? "p" : ""); } else { sin = (struct sockaddr_in6 *) (sdl->sdl_len + (char *)sdl); #if 0 /* W and P are mystery even for us */ snprintf(flgbuf, sizeof(flgbuf), "%s%s%s%s", isrouter ? "R" : "", !IN6_IS_ADDR_UNSPECIFIED(&sin->sin6_addr) ? "P" : "", (sin->sin6_len != sizeof(struct sockaddr_in6)) ? "W" : "", (rtm->rtm_flags & RTF_ANNOUNCE) ? "p" : ""); #else snprintf(flgbuf, sizeof(flgbuf), "%s%s", isrouter ? "R" : "", (rtm->rtm_flags & RTF_ANNOUNCE) ? 
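/*
 * Legend for the columns printed here, as read from the code: the
 * single-letter state comes from SIOCGNBRINFO_IN6 -- N nostate,
 * I incomplete, R reachable, S stale, D delay, P probe (plus W waitdelete
 * where defined) -- and the flags column shows R for a router and p when
 * the entry is proxied (RTF_ANNOUNCE).
 */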
"p" : ""); #endif } printf(" %s", flgbuf); if (prbs) printf(" %d", prbs); printf("\n"); } if (buf != NULL) free(buf); if (repeat) { printf("\n"); fflush(stdout); sleep(repeat); goto again; } } static struct in6_nbrinfo * getnbrinfo(addr, ifindex, warning) struct in6_addr *addr; int ifindex; int warning; { static struct in6_nbrinfo nbi; int s; if ((s = socket(AF_INET6, SOCK_DGRAM, 0)) < 0) err(1, "socket"); bzero(&nbi, sizeof(nbi)); if_indextoname(ifindex, nbi.ifname); nbi.addr = *addr; if (ioctl(s, SIOCGNBRINFO_IN6, (caddr_t)&nbi) < 0) { if (warning) warn("ioctl(SIOCGNBRINFO_IN6)"); close(s); return(NULL); } close(s); return(&nbi); } static char * ether_str(sdl) struct sockaddr_dl *sdl; { static char hbuf[NI_MAXHOST]; u_char *cp; if (sdl->sdl_alen) { cp = (u_char *)LLADDR(sdl); snprintf(hbuf, sizeof(hbuf), "%x:%x:%x:%x:%x:%x", cp[0], cp[1], cp[2], cp[3], cp[4], cp[5]); } else snprintf(hbuf, sizeof(hbuf), "(incomplete)"); return(hbuf); } int ndp_ether_aton(a, n) char *a; u_char *n; { int i, o[6]; i = sscanf(a, "%x:%x:%x:%x:%x:%x", &o[0], &o[1], &o[2], &o[3], &o[4], &o[5]); if (i != 6) { fprintf(stderr, "ndp: invalid Ethernet address '%s'\n", a); return (1); } for (i = 0; i < 6; i++) n[i] = o[i]; return (0); } void usage() { printf("usage: ndp [-nt] hostname\n"); printf(" ndp [-nt] -a | -c | -p | -r | -H | -P | -R\n"); printf(" ndp [-nt] -A wait\n"); printf(" ndp [-nt] -d hostname\n"); printf(" ndp [-nt] -f filename\n"); printf(" ndp [-nt] -i interface [flags...]\n"); #ifdef SIOCSDEFIFACE_IN6 printf(" ndp [-nt] -I [interface|delete]\n"); #endif printf(" ndp [-nt] -s nodename etheraddr [temp] [proxy]\n"); exit(1); } int rtmsg(cmd) int cmd; { static int seq; int rlen; register struct rt_msghdr *rtm = &m_rtmsg.m_rtm; register char *cp = m_rtmsg.m_space; register int l; errno = 0; if (cmd == RTM_DELETE) goto doit; bzero((char *)&m_rtmsg, sizeof(m_rtmsg)); rtm->rtm_flags = flags; rtm->rtm_version = RTM_VERSION; switch (cmd) { default: fprintf(stderr, "ndp: internal wrong cmd\n"); exit(1); case RTM_ADD: rtm->rtm_addrs |= RTA_GATEWAY; if (expire_time) { rtm->rtm_rmx.rmx_expire = expire_time; rtm->rtm_inits = RTV_EXPIRE; } rtm->rtm_flags |= (RTF_HOST | RTF_STATIC); #if 0 /* we don't support ipv6addr/128 type proxying */ if (rtm->rtm_flags & RTF_ANNOUNCE) { rtm->rtm_flags &= ~RTF_HOST; rtm->rtm_addrs |= RTA_NETMASK; } #endif /* FALLTHROUGH */ case RTM_GET: rtm->rtm_addrs |= RTA_DST; } -#define NEXTADDR(w, s) \ - if (rtm->rtm_addrs & (w)) { \ - bcopy((char *)&s, cp, sizeof(s)); cp += SA_SIZE(&s);} NEXTADDR(RTA_DST, sin_m); NEXTADDR(RTA_GATEWAY, sdl_m); #if 0 /* we don't support ipv6addr/128 type proxying */ memset(&so_mask.sin6_addr, 0xff, sizeof(so_mask.sin6_addr)); NEXTADDR(RTA_NETMASK, so_mask); #endif rtm->rtm_msglen = cp - (char *)&m_rtmsg; doit: l = rtm->rtm_msglen; rtm->rtm_seq = ++seq; rtm->rtm_type = cmd; if ((rlen = write(s, (char *)&m_rtmsg, l)) < 0) { if (errno != ESRCH || cmd != RTM_DELETE) { err(1, "writing to routing socket"); /* NOTREACHED */ } } do { l = read(s, (char *)&m_rtmsg, sizeof(m_rtmsg)); } while (l > 0 && (rtm->rtm_seq != seq || rtm->rtm_pid != pid)); if (l < 0) (void) fprintf(stderr, "ndp: read from routing socket: %s\n", strerror(errno)); return (0); } void ifinfo(ifname, argc, argv) char *ifname; int argc; char **argv; { struct in6_ndireq nd; int i, s; u_int32_t newflags; #ifdef IPV6CTL_USETEMPADDR u_int8_t nullbuf[8]; #endif if ((s = socket(AF_INET6, SOCK_DGRAM, 0)) < 0) { err(1, "socket"); /* NOTREACHED */ } bzero(&nd, sizeof(nd)); strlcpy(nd.ifname, ifname, 
sizeof(nd.ifname)); if (ioctl(s, SIOCGIFINFO_IN6, (caddr_t)&nd) < 0) { err(1, "ioctl(SIOCGIFINFO_IN6)"); /* NOTREACHED */ } #define ND nd.ndi newflags = ND.flags; for (i = 0; i < argc; i++) { int clear = 0; char *cp = argv[i]; if (*cp == '-') { clear = 1; cp++; } #define SETFLAG(s, f) \ do {\ if (strcmp(cp, (s)) == 0) {\ if (clear)\ newflags &= ~(f);\ else\ newflags |= (f);\ }\ } while (0) /* * XXX: this macro is not 100% correct, in that it matches "nud" against * "nudbogus". But we just let it go since this is minor. */ #define SETVALUE(f, v) \ do { \ char *valptr; \ unsigned long newval; \ v = 0; /* unspecified */ \ if (strncmp(cp, f, strlen(f)) == 0) { \ valptr = strchr(cp, '='); \ if (valptr == NULL) \ err(1, "syntax error in %s field", (f)); \ errno = 0; \ newval = strtoul(++valptr, NULL, 0); \ if (errno) \ err(1, "syntax error in %s's value", (f)); \ v = newval; \ } \ } while (0) SETFLAG("disabled", ND6_IFF_IFDISABLED); SETFLAG("nud", ND6_IFF_PERFORMNUD); #ifdef ND6_IFF_ACCEPT_RTADV SETFLAG("accept_rtadv", ND6_IFF_ACCEPT_RTADV); #endif #ifdef ND6_IFF_PREFER_SOURCE SETFLAG("prefer_source", ND6_IFF_PREFER_SOURCE); #endif SETVALUE("basereachable", ND.basereachable); SETVALUE("retrans", ND.retrans); SETVALUE("curhlim", ND.chlim); ND.flags = newflags; if (ioctl(s, SIOCSIFINFO_IN6, (caddr_t)&nd) < 0) { err(1, "ioctl(SIOCSIFINFO_IN6)"); /* NOTREACHED */ } #undef SETFLAG #undef SETVALUE } if (!ND.initialized) { errx(1, "%s: not initialized yet", ifname); /* NOTREACHED */ } if (ioctl(s, SIOCGIFINFO_IN6, (caddr_t)&nd) < 0) { err(1, "ioctl(SIOCGIFINFO_IN6)"); /* NOTREACHED */ } printf("linkmtu=%d", ND.linkmtu); printf(", maxmtu=%d", ND.maxmtu); printf(", curhlim=%d", ND.chlim); printf(", basereachable=%ds%dms", ND.basereachable / 1000, ND.basereachable % 1000); printf(", reachable=%ds", ND.reachable); printf(", retrans=%ds%dms", ND.retrans / 1000, ND.retrans % 1000); #ifdef IPV6CTL_USETEMPADDR memset(nullbuf, 0, sizeof(nullbuf)); if (memcmp(nullbuf, ND.randomid, sizeof(nullbuf)) != 0) { int j; u_int8_t *rbuf; for (i = 0; i < 3; i++) { switch (i) { case 0: printf("\nRandom seed(0): "); rbuf = ND.randomseed0; break; case 1: printf("\nRandom seed(1): "); rbuf = ND.randomseed1; break; case 2: printf("\nRandom ID: "); rbuf = ND.randomid; break; default: errx(1, "impossible case for tempaddr display"); } for (j = 0; j < 8; j++) printf("%02x", rbuf[j]); } } #endif if (ND.flags) { printf("\nFlags: "); #ifdef ND6_IFF_IFDISABLED if ((ND.flags & ND6_IFF_IFDISABLED)) printf("disabled "); #endif if ((ND.flags & ND6_IFF_PERFORMNUD)) printf("nud "); #ifdef ND6_IFF_ACCEPT_RTADV if ((ND.flags & ND6_IFF_ACCEPT_RTADV)) printf("accept_rtadv "); #endif #ifdef ND6_IFF_PREFER_SOURCE if ((ND.flags & ND6_IFF_PREFER_SOURCE)) printf("prefer_source "); #endif } putc('\n', stdout); #undef ND close(s); } #ifndef ND_RA_FLAG_RTPREF_MASK /* XXX: just for compilation on *BSD release */ #define ND_RA_FLAG_RTPREF_MASK 0x18 /* 00011000 */ #endif void rtrlist() { #ifdef ICMPV6CTL_ND6_DRLIST int mib[] = { CTL_NET, PF_INET6, IPPROTO_ICMPV6, ICMPV6CTL_ND6_DRLIST }; char *buf; struct in6_defrouter *p, *ep; size_t l; struct timeval time; if (sysctl(mib, sizeof(mib) / sizeof(mib[0]), NULL, &l, NULL, 0) < 0) { err(1, "sysctl(ICMPV6CTL_ND6_DRLIST)"); /*NOTREACHED*/ } if (l == 0) return; buf = malloc(l); if (!buf) { err(1, "malloc"); /*NOTREACHED*/ } if (sysctl(mib, sizeof(mib) / sizeof(mib[0]), buf, &l, NULL, 0) < 0) { err(1, "sysctl(ICMPV6CTL_ND6_DRLIST)"); /*NOTREACHED*/ } ep = (struct in6_defrouter *)(buf + l); for (p = (struct 
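/*
 * ifinfo() above follows a get/modify/set pattern: SIOCGIFINFO_IN6 reads
 * the per-interface ND variables, SETFLAG()/SETVALUE() apply arguments such
 * as "nud", "disabled" or "basereachable=<ms>", and SIOCSIFINFO_IN6 writes
 * them back before the current values are printed.  A plausible invocation
 * (interface name and values purely illustrative) would be
 *
 *	ndp -i em0 nud basereachable=30000
 *
 * though the exact argument syntax should be checked against ndp(8).
 */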
in6_defrouter *)buf; p < ep; p++) { int rtpref; if (getnameinfo((struct sockaddr *)&p->rtaddr, p->rtaddr.sin6_len, host_buf, sizeof(host_buf), NULL, 0, (nflag ? NI_NUMERICHOST : 0)) != 0) strlcpy(host_buf, "?", sizeof(host_buf)); printf("%s if=%s", host_buf, if_indextoname(p->if_index, ifix_buf)); printf(", flags=%s%s", p->flags & ND_RA_FLAG_MANAGED ? "M" : "", p->flags & ND_RA_FLAG_OTHER ? "O" : ""); rtpref = ((p->flags & ND_RA_FLAG_RTPREF_MASK) >> 3) & 0xff; printf(", pref=%s", rtpref_str[rtpref]); gettimeofday(&time, 0); if (p->expire == 0) printf(", expire=Never\n"); else printf(", expire=%s\n", sec2str(p->expire - time.tv_sec)); } free(buf); #else struct in6_drlist dr; int s, i; struct timeval time; if ((s = socket(AF_INET6, SOCK_DGRAM, 0)) < 0) { err(1, "socket"); /* NOTREACHED */ } bzero(&dr, sizeof(dr)); strlcpy(dr.ifname, "lo0", sizeof(dr.ifname)); /* dummy */ if (ioctl(s, SIOCGDRLST_IN6, (caddr_t)&dr) < 0) { err(1, "ioctl(SIOCGDRLST_IN6)"); /* NOTREACHED */ } #define DR dr.defrouter[i] for (i = 0 ; DR.if_index && i < DRLSTSIZ ; i++) { struct sockaddr_in6 sin6; bzero(&sin6, sizeof(sin6)); sin6.sin6_family = AF_INET6; sin6.sin6_len = sizeof(sin6); sin6.sin6_addr = DR.rtaddr; getnameinfo((struct sockaddr *)&sin6, sin6.sin6_len, host_buf, sizeof(host_buf), NULL, 0, (nflag ? NI_NUMERICHOST : 0)); printf("%s if=%s", host_buf, if_indextoname(DR.if_index, ifix_buf)); printf(", flags=%s%s", DR.flags & ND_RA_FLAG_MANAGED ? "M" : "", DR.flags & ND_RA_FLAG_OTHER ? "O" : ""); gettimeofday(&time, 0); if (DR.expire == 0) printf(", expire=Never\n"); else printf(", expire=%s\n", sec2str(DR.expire - time.tv_sec)); } #undef DR close(s); #endif } void plist() { #ifdef ICMPV6CTL_ND6_PRLIST int mib[] = { CTL_NET, PF_INET6, IPPROTO_ICMPV6, ICMPV6CTL_ND6_PRLIST }; char *buf; struct in6_prefix *p, *ep, *n; struct sockaddr_in6 *advrtr; size_t l; struct timeval time; const int niflags = NI_NUMERICHOST; int ninflags = nflag ? NI_NUMERICHOST : 0; char namebuf[NI_MAXHOST]; if (sysctl(mib, sizeof(mib) / sizeof(mib[0]), NULL, &l, NULL, 0) < 0) { err(1, "sysctl(ICMPV6CTL_ND6_PRLIST)"); /*NOTREACHED*/ } buf = malloc(l); if (!buf) { err(1, "malloc"); /*NOTREACHED*/ } if (sysctl(mib, sizeof(mib) / sizeof(mib[0]), buf, &l, NULL, 0) < 0) { err(1, "sysctl(ICMPV6CTL_ND6_PRLIST)"); /*NOTREACHED*/ } ep = (struct in6_prefix *)(buf + l); for (p = (struct in6_prefix *)buf; p < ep; p = n) { advrtr = (struct sockaddr_in6 *)(p + 1); n = (struct in6_prefix *)&advrtr[p->advrtrs]; if (getnameinfo((struct sockaddr *)&p->prefix, p->prefix.sin6_len, namebuf, sizeof(namebuf), NULL, 0, niflags) != 0) strlcpy(namebuf, "?", sizeof(namebuf)); printf("%s/%d if=%s\n", namebuf, p->prefixlen, if_indextoname(p->if_index, ifix_buf)); gettimeofday(&time, 0); /* * meaning of fields, especially flags, is very different * by origin. notify the difference to the users. */ printf("flags=%s%s%s%s%s", p->raflags.onlink ? "L" : "", p->raflags.autonomous ? "A" : "", (p->flags & NDPRF_ONLINK) != 0 ? "O" : "", (p->flags & NDPRF_DETACHED) != 0 ? "D" : "", #ifdef NDPRF_HOME (p->flags & NDPRF_HOME) != 0 ? 
"H" : "" #else "" #endif ); if (p->vltime == ND6_INFINITE_LIFETIME) printf(" vltime=infinity"); else printf(" vltime=%lu", (unsigned long)p->vltime); if (p->pltime == ND6_INFINITE_LIFETIME) printf(", pltime=infinity"); else printf(", pltime=%lu", (unsigned long)p->pltime); if (p->expire == 0) printf(", expire=Never"); else if (p->expire >= time.tv_sec) printf(", expire=%s", sec2str(p->expire - time.tv_sec)); else printf(", expired"); printf(", ref=%d", p->refcnt); printf("\n"); /* * "advertising router" list is meaningful only if the prefix * information is from RA. */ if (p->advrtrs) { int j; struct sockaddr_in6 *sin6; sin6 = advrtr; printf(" advertised by\n"); for (j = 0; j < p->advrtrs; j++) { struct in6_nbrinfo *nbi; if (getnameinfo((struct sockaddr *)sin6, sin6->sin6_len, namebuf, sizeof(namebuf), NULL, 0, ninflags) != 0) strlcpy(namebuf, "?", sizeof(namebuf)); printf(" %s", namebuf); nbi = getnbrinfo(&sin6->sin6_addr, p->if_index, 0); if (nbi) { switch (nbi->state) { case ND6_LLINFO_REACHABLE: case ND6_LLINFO_STALE: case ND6_LLINFO_DELAY: case ND6_LLINFO_PROBE: printf(" (reachable)\n"); break; default: printf(" (unreachable)\n"); } } else printf(" (no neighbor state)\n"); sin6++; } } else printf(" No advertising router\n"); } free(buf); #else struct in6_prlist pr; int s, i; struct timeval time; gettimeofday(&time, 0); if ((s = socket(AF_INET6, SOCK_DGRAM, 0)) < 0) { err(1, "socket"); /* NOTREACHED */ } bzero(&pr, sizeof(pr)); strlcpy(pr.ifname, "lo0", sizeof(pr.ifname)); /* dummy */ if (ioctl(s, SIOCGPRLST_IN6, (caddr_t)&pr) < 0) { err(1, "ioctl(SIOCGPRLST_IN6)"); /* NOTREACHED */ } #define PR pr.prefix[i] for (i = 0; PR.if_index && i < PRLSTSIZ ; i++) { struct sockaddr_in6 p6; char namebuf[NI_MAXHOST]; int niflags; #ifdef NDPRF_ONLINK p6 = PR.prefix; #else memset(&p6, 0, sizeof(p6)); p6.sin6_family = AF_INET6; p6.sin6_len = sizeof(p6); p6.sin6_addr = PR.prefix; #endif /* * copy link index to sin6_scope_id field. * XXX: KAME specific. */ if (IN6_IS_ADDR_LINKLOCAL(&p6.sin6_addr)) { u_int16_t linkid; memcpy(&linkid, &p6.sin6_addr.s6_addr[2], sizeof(linkid)); linkid = ntohs(linkid); p6.sin6_scope_id = linkid; p6.sin6_addr.s6_addr[2] = 0; p6.sin6_addr.s6_addr[3] = 0; } niflags = NI_NUMERICHOST; if (getnameinfo((struct sockaddr *)&p6, sizeof(p6), namebuf, sizeof(namebuf), NULL, 0, niflags)) { warnx("getnameinfo failed"); continue; } printf("%s/%d if=%s\n", namebuf, PR.prefixlen, if_indextoname(PR.if_index, ifix_buf)); gettimeofday(&time, 0); /* * meaning of fields, especially flags, is very different * by origin. notify the difference to the users. */ #if 0 printf(" %s", PR.origin == PR_ORIG_RA ? "" : "advertise: "); #endif #ifdef NDPRF_ONLINK printf("flags=%s%s%s%s%s", PR.raflags.onlink ? "L" : "", PR.raflags.autonomous ? "A" : "", (PR.flags & NDPRF_ONLINK) != 0 ? "O" : "", (PR.flags & NDPRF_DETACHED) != 0 ? "D" : "", #ifdef NDPRF_HOME (PR.flags & NDPRF_HOME) != 0 ? "H" : "" #else "" #endif ); #else printf("flags=%s%s", PR.raflags.onlink ? "L" : "", PR.raflags.autonomous ? 
"A" : ""); #endif if (PR.vltime == ND6_INFINITE_LIFETIME) printf(" vltime=infinity"); else printf(" vltime=%lu", PR.vltime); if (PR.pltime == ND6_INFINITE_LIFETIME) printf(", pltime=infinity"); else printf(", pltime=%lu", PR.pltime); if (PR.expire == 0) printf(", expire=Never"); else if (PR.expire >= time.tv_sec) printf(", expire=%s", sec2str(PR.expire - time.tv_sec)); else printf(", expired"); #ifdef NDPRF_ONLINK printf(", ref=%d", PR.refcnt); #endif #if 0 switch (PR.origin) { case PR_ORIG_RA: printf(", origin=RA"); break; case PR_ORIG_RR: printf(", origin=RR"); break; case PR_ORIG_STATIC: printf(", origin=static"); break; case PR_ORIG_KERNEL: printf(", origin=kernel"); break; default: printf(", origin=?"); break; } #endif printf("\n"); /* * "advertising router" list is meaningful only if the prefix * information is from RA. */ if (0 && /* prefix origin is almost obsolted */ PR.origin != PR_ORIG_RA) ; else if (PR.advrtrs) { int j; printf(" advertised by\n"); for (j = 0; j < PR.advrtrs; j++) { struct sockaddr_in6 sin6; struct in6_nbrinfo *nbi; bzero(&sin6, sizeof(sin6)); sin6.sin6_family = AF_INET6; sin6.sin6_len = sizeof(sin6); sin6.sin6_addr = PR.advrtr[j]; sin6.sin6_scope_id = PR.if_index; /* XXX */ getnameinfo((struct sockaddr *)&sin6, sin6.sin6_len, host_buf, sizeof(host_buf), NULL, 0, (nflag ? NI_NUMERICHOST : 0)); printf(" %s", host_buf); nbi = getnbrinfo(&sin6.sin6_addr, PR.if_index, 0); if (nbi) { switch (nbi->state) { case ND6_LLINFO_REACHABLE: case ND6_LLINFO_STALE: case ND6_LLINFO_DELAY: case ND6_LLINFO_PROBE: printf(" (reachable)\n"); break; default: printf(" (unreachable)\n"); } } else printf(" (no neighbor state)\n"); } if (PR.advrtrs > DRLSTSIZ) printf(" and %d routers\n", PR.advrtrs - DRLSTSIZ); } else printf(" No advertising router\n"); } #undef PR close(s); #endif } void pfx_flush() { char dummyif[IFNAMSIZ+8]; int s; if ((s = socket(AF_INET6, SOCK_DGRAM, 0)) < 0) err(1, "socket"); strlcpy(dummyif, "lo0", sizeof(dummyif)); /* dummy */ if (ioctl(s, SIOCSPFXFLUSH_IN6, (caddr_t)&dummyif) < 0) err(1, "ioctl(SIOCSPFXFLUSH_IN6)"); } void rtr_flush() { char dummyif[IFNAMSIZ+8]; int s; if ((s = socket(AF_INET6, SOCK_DGRAM, 0)) < 0) err(1, "socket"); strlcpy(dummyif, "lo0", sizeof(dummyif)); /* dummy */ if (ioctl(s, SIOCSRTRFLUSH_IN6, (caddr_t)&dummyif) < 0) err(1, "ioctl(SIOCSRTRFLUSH_IN6)"); close(s); } void harmonize_rtr() { char dummyif[IFNAMSIZ+8]; int s; if ((s = socket(AF_INET6, SOCK_DGRAM, 0)) < 0) err(1, "socket"); strlcpy(dummyif, "lo0", sizeof(dummyif)); /* dummy */ if (ioctl(s, SIOCSNDFLUSH_IN6, (caddr_t)&dummyif) < 0) err(1, "ioctl(SIOCSNDFLUSH_IN6)"); close(s); } #ifdef SIOCSDEFIFACE_IN6 /* XXX: check SIOCGDEFIFACE_IN6 as well? 
*/ static void setdefif(ifname) char *ifname; { struct in6_ndifreq ndifreq; unsigned int ifindex; if (strcasecmp(ifname, "delete") == 0) ifindex = 0; else { if ((ifindex = if_nametoindex(ifname)) == 0) err(1, "failed to resolve i/f index for %s", ifname); } if ((s = socket(AF_INET6, SOCK_DGRAM, 0)) < 0) err(1, "socket"); strlcpy(ndifreq.ifname, "lo0", sizeof(ndifreq.ifname)); /* dummy */ ndifreq.ifindex = ifindex; if (ioctl(s, SIOCSDEFIFACE_IN6, (caddr_t)&ndifreq) < 0) err(1, "ioctl(SIOCSDEFIFACE_IN6)"); close(s); } static void getdefif() { struct in6_ndifreq ndifreq; char ifname[IFNAMSIZ+8]; if ((s = socket(AF_INET6, SOCK_DGRAM, 0)) < 0) err(1, "socket"); memset(&ndifreq, 0, sizeof(ndifreq)); strlcpy(ndifreq.ifname, "lo0", sizeof(ndifreq.ifname)); /* dummy */ if (ioctl(s, SIOCGDEFIFACE_IN6, (caddr_t)&ndifreq) < 0) err(1, "ioctl(SIOCGDEFIFACE_IN6)"); if (ndifreq.ifindex == 0) printf("No default interface.\n"); else { if ((if_indextoname(ndifreq.ifindex, ifname)) == NULL) err(1, "failed to resolve ifname for index %lu", ndifreq.ifindex); printf("ND default interface = %s\n", ifname); } close(s); } #endif static char * sec2str(total) time_t total; { static char result[256]; int days, hours, mins, secs; int first = 1; char *p = result; char *ep = &result[sizeof(result)]; int n; days = total / 3600 / 24; hours = (total / 3600) % 24; mins = (total / 60) % 60; secs = total % 60; if (days) { first = 0; n = snprintf(p, ep - p, "%dd", days); if (n < 0 || n >= ep - p) return "?"; p += n; } if (!first || hours) { first = 0; n = snprintf(p, ep - p, "%dh", hours); if (n < 0 || n >= ep - p) return "?"; p += n; } if (!first || mins) { first = 0; n = snprintf(p, ep - p, "%dm", mins); if (n < 0 || n >= ep - p) return "?"; p += n; } snprintf(p, ep - p, "%ds", secs); return(result); } /* * Print the timestamp * from tcpdump/util.c */ static void ts_print(tvp) const struct timeval *tvp; { int s; /* Default */ s = (tvp->tv_sec + thiszone) % 86400; (void)printf("%02d:%02d:%02d.%06u ", s / 3600, (s % 3600) / 60, s % 60, (u_int32_t)tvp->tv_usec); } + +#undef NEXTADDR Index: head/usr.sbin/ppp/route.c =================================================================== --- head/usr.sbin/ppp/route.c (revision 186118) +++ head/usr.sbin/ppp/route.c (revision 186119) @@ -1,929 +1,937 @@ /*- * Copyright (c) 1996 - 2001 Brian Somers * based on work by Toshiharu OHNO * Internet Initiative Japan, Inc (IIJ) * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "layer.h" #include "defs.h" #include "command.h" #include "mbuf.h" #include "log.h" #include "iplist.h" #include "timer.h" #include "throughput.h" #include "lqr.h" #include "hdlc.h" #include "fsm.h" #include "lcp.h" #include "ccp.h" #include "link.h" #include "slcompress.h" #include "ncpaddr.h" #include "ipcp.h" #include "filter.h" #include "descriptor.h" #include "mp.h" #ifndef NORADIUS #include "radius.h" #endif #include "ipv6cp.h" #include "ncp.h" #include "bundle.h" #include "route.h" #include "prompt.h" #include "iface.h" #include "id.h" static void p_sockaddr(struct prompt *prompt, struct sockaddr *phost, struct sockaddr *pmask, int width) { struct ncprange range; char buf[29]; struct sockaddr_dl *dl = (struct sockaddr_dl *)phost; if (log_IsKept(LogDEBUG)) { char tmp[50]; log_Printf(LogDEBUG, "Found the following sockaddr:\n"); log_Printf(LogDEBUG, " Family %d, len %d\n", (int)phost->sa_family, (int)phost->sa_len); inet_ntop(phost->sa_family, phost->sa_data, tmp, sizeof tmp); log_Printf(LogDEBUG, " Addr %s\n", tmp); if (pmask) { inet_ntop(pmask->sa_family, pmask->sa_data, tmp, sizeof tmp); log_Printf(LogDEBUG, " Mask %s\n", tmp); } } switch (phost->sa_family) { case AF_INET: #ifndef NOINET6 case AF_INET6: #endif ncprange_setsa(&range, phost, pmask); if (ncprange_isdefault(&range)) prompt_Printf(prompt, "%-*s ", width - 1, "default"); else prompt_Printf(prompt, "%-*s ", width - 1, ncprange_ntoa(&range)); return; case AF_LINK: if (dl->sdl_nlen) snprintf(buf, sizeof buf, "%.*s", dl->sdl_nlen, dl->sdl_data); else if (dl->sdl_alen) { if (dl->sdl_type == IFT_ETHER) { if (dl->sdl_alen < sizeof buf / 3) { int f; u_char *MAC; MAC = (u_char *)dl->sdl_data + dl->sdl_nlen; for (f = 0; f < dl->sdl_alen; f++) sprintf(buf+f*3, "%02x:", MAC[f]); buf[f*3-1] = '\0'; } else strcpy(buf, "??:??:??:??:??:??"); } else sprintf(buf, "", dl->sdl_type); } else if (dl->sdl_slen) sprintf(buf, "", dl->sdl_slen); else sprintf(buf, "link#%d", dl->sdl_index); break; default: sprintf(buf, "", phost->sa_family); break; } prompt_Printf(prompt, "%-*s ", width-1, buf); } static struct bits { u_int32_t b_mask; char b_val; } bits[] = { { RTF_UP, 'U' }, { RTF_GATEWAY, 'G' }, { RTF_HOST, 'H' }, { RTF_REJECT, 'R' }, { RTF_DYNAMIC, 'D' }, { RTF_MODIFIED, 'M' }, { RTF_DONE, 'd' }, - { RTF_CLONING, 'C' }, { RTF_XRESOLVE, 'X' }, - { RTF_LLINFO, 'L' }, +#ifdef RTF_CLONING + { RTF_CLONING, 'C' }, +#endif { RTF_STATIC, 'S' }, { RTF_PROTO1, '1' }, { RTF_PROTO2, '2' }, { RTF_BLACKHOLE, 'B' }, + +#ifdef RTF_LLINFO + { RTF_LLINFO, 'L' }, +#endif +#ifdef RTF_CLONING + { RTF_CLONING, 'C' }, +#endif #ifdef RTF_WASCLONED { RTF_WASCLONED, 'W' }, #endif #ifdef RTF_PRCLONING { RTF_PRCLONING, 'c' }, #endif #ifdef RTF_PROTO3 { RTF_PROTO3, '3' }, #endif #ifdef RTF_BROADCAST { RTF_BROADCAST, 'b' }, #endif { 0, '\0' } }; #ifndef RTF_WASCLONED 
#define RTF_WASCLONED (0) #endif static void p_flags(struct prompt *prompt, u_int32_t f, unsigned max) { char name[33], *flags; register struct bits *p = bits; if (max > sizeof name - 1) max = sizeof name - 1; for (flags = name; p->b_mask && flags - name < (int)max; p++) if (p->b_mask & f) *flags++ = p->b_val; *flags = '\0'; prompt_Printf(prompt, "%-*.*s", (int)max, (int)max, name); } static int route_nifs = -1; const char * Index2Nam(int idx) { /* * XXX: Maybe we should select() on the routing socket so that we can * notice interfaces that come & go (PCCARD support). * Or we could even support a signal that resets these so that * the PCCARD insert/remove events can signal ppp. */ static char **ifs; /* Figure these out once */ static int debug_done; /* Debug once */ if (idx > route_nifs || (idx > 0 && ifs[idx-1] == NULL)) { int mib[6], have, had; size_t needed; char *buf, *ptr, *end; struct sockaddr_dl *dl; struct if_msghdr *ifm; if (ifs) { free(ifs); ifs = NULL; route_nifs = 0; } debug_done = 0; mib[0] = CTL_NET; mib[1] = PF_ROUTE; mib[2] = 0; mib[3] = 0; mib[4] = NET_RT_IFLIST; mib[5] = 0; if (sysctl(mib, 6, NULL, &needed, NULL, 0) < 0) { log_Printf(LogERROR, "Index2Nam: sysctl: estimate: %s\n", strerror(errno)); return NumStr(idx, NULL, 0); } if ((buf = malloc(needed)) == NULL) return NumStr(idx, NULL, 0); if (sysctl(mib, 6, buf, &needed, NULL, 0) < 0) { free(buf); return NumStr(idx, NULL, 0); } end = buf + needed; have = 0; for (ptr = buf; ptr < end; ptr += ifm->ifm_msglen) { ifm = (struct if_msghdr *)ptr; if (ifm->ifm_type != RTM_IFINFO) continue; dl = (struct sockaddr_dl *)(ifm + 1); if (ifm->ifm_index > 0) { if (ifm->ifm_index > have) { char **newifs; had = have; have = ifm->ifm_index + 5; if (had) newifs = (char **)realloc(ifs, sizeof(char *) * have); else newifs = (char **)malloc(sizeof(char *) * have); if (!newifs) { log_Printf(LogDEBUG, "Index2Nam: %s\n", strerror(errno)); route_nifs = 0; if (ifs) { free(ifs); ifs = NULL; } free(buf); return NumStr(idx, NULL, 0); } ifs = newifs; memset(ifs + had, '\0', sizeof(char *) * (have - had)); } if (ifs[ifm->ifm_index-1] == NULL) { ifs[ifm->ifm_index-1] = (char *)malloc(dl->sdl_nlen+1); if (ifs[ifm->ifm_index-1] == NULL) log_Printf(LogDEBUG, "Skipping interface %d: Out of memory\n", ifm->ifm_index); else { memcpy(ifs[ifm->ifm_index-1], dl->sdl_data, dl->sdl_nlen); ifs[ifm->ifm_index-1][dl->sdl_nlen] = '\0'; if (route_nifs < ifm->ifm_index) route_nifs = ifm->ifm_index; } } } else if (log_IsKept(LogDEBUG)) log_Printf(LogDEBUG, "Skipping out-of-range interface %d!\n", ifm->ifm_index); } free(buf); } if (log_IsKept(LogDEBUG) && !debug_done) { int f; log_Printf(LogDEBUG, "Found the following interfaces:\n"); for (f = 0; f < route_nifs; f++) if (ifs[f] != NULL) log_Printf(LogDEBUG, " Index %d, name \"%s\"\n", f+1, ifs[f]); debug_done = 1; } if (idx < 1 || idx > route_nifs || ifs[idx-1] == NULL) return NumStr(idx, NULL, 0); return ifs[idx-1]; } void route_ParseHdr(struct rt_msghdr *rtm, struct sockaddr *sa[RTAX_MAX]) { char *wp; int rtax; wp = (char *)(rtm + 1); for (rtax = 0; rtax < RTAX_MAX; rtax++) if (rtm->rtm_addrs & (1 << rtax)) { sa[rtax] = (struct sockaddr *)wp; wp += ROUNDUP(sa[rtax]->sa_len); if (sa[rtax]->sa_family == 0) sa[rtax] = NULL; /* ??? 
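 *
 * A short note on the walk above, since the padding rules are easy to trip
 * over: every sockaddr named in rtm_addrs follows the rt_msghdr in RTAX
 * order, each one padded to the alignment ROUNDUP() enforces (a multiple of
 * sizeof(long) on FreeBSD).  For example, assuming an LP64 system, a 28-byte
 * sockaddr_in6 occupies ROUNDUP(28) = 32 bytes in the message, while a
 * 16-byte sockaddr_in occupies exactly 16.  The sa_family == 0 case
 * presumably guards against zero-filled placeholder sockaddrs the kernel can
 * emit (for instance for a missing netmask), which would otherwise be handed
 * back to the caller as if they were real addresses.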
*/ } else sa[rtax] = NULL; } int route_Show(struct cmdargs const *arg) { struct rt_msghdr *rtm; struct sockaddr *sa[RTAX_MAX]; char *sp, *ep, *cp; size_t needed; int mib[6]; mib[0] = CTL_NET; mib[1] = PF_ROUTE; mib[2] = 0; mib[3] = 0; mib[4] = NET_RT_DUMP; mib[5] = 0; if (sysctl(mib, 6, NULL, &needed, NULL, 0) < 0) { log_Printf(LogERROR, "route_Show: sysctl: estimate: %s\n", strerror(errno)); return (1); } sp = malloc(needed); if (sp == NULL) return (1); if (sysctl(mib, 6, sp, &needed, NULL, 0) < 0) { log_Printf(LogERROR, "route_Show: sysctl: getroute: %s\n", strerror(errno)); free(sp); return (1); } ep = sp + needed; prompt_Printf(arg->prompt, "%-20s%-20sFlags Netif\n", "Destination", "Gateway"); for (cp = sp; cp < ep; cp += rtm->rtm_msglen) { rtm = (struct rt_msghdr *)cp; route_ParseHdr(rtm, sa); if (sa[RTAX_DST] && sa[RTAX_GATEWAY]) { p_sockaddr(arg->prompt, sa[RTAX_DST], sa[RTAX_NETMASK], 20); p_sockaddr(arg->prompt, sa[RTAX_GATEWAY], NULL, 20); p_flags(arg->prompt, rtm->rtm_flags, 6); prompt_Printf(arg->prompt, " %s\n", Index2Nam(rtm->rtm_index)); } else prompt_Printf(arg->prompt, "\n"); } free(sp); return 0; } /* * Delete routes associated with our interface */ void route_IfDelete(struct bundle *bundle, int all) { struct rt_msghdr *rtm; struct sockaddr *sa[RTAX_MAX]; struct ncprange range; int pass; size_t needed; char *sp, *cp, *ep; int mib[6]; log_Printf(LogDEBUG, "route_IfDelete (%d)\n", bundle->iface->index); mib[0] = CTL_NET; mib[1] = PF_ROUTE; mib[2] = 0; mib[3] = 0; mib[4] = NET_RT_DUMP; mib[5] = 0; if (sysctl(mib, 6, NULL, &needed, NULL, 0) < 0) { log_Printf(LogERROR, "route_IfDelete: sysctl: estimate: %s\n", strerror(errno)); return; } sp = malloc(needed); if (sp == NULL) return; if (sysctl(mib, 6, sp, &needed, NULL, 0) < 0) { log_Printf(LogERROR, "route_IfDelete: sysctl: getroute: %s\n", strerror(errno)); free(sp); return; } ep = sp + needed; for (pass = 0; pass < 2; pass++) { /* * We do 2 passes. The first deletes all cloned routes. The second * deletes all non-cloned routes. This is done to avoid * potential errors from trying to delete route X after route Y where * route X was cloned from route Y (and is no longer there 'cos it * may have gone with route Y). */ if (RTF_WASCLONED == 0 && pass == 0) /* So we can't tell ! 
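 *
 * (On systems whose net/route.h no longer defines RTF_WASCLONED, the
 * fallback "#define RTF_WASCLONED (0)" earlier in this file makes this test
 * true whenever pass == 0, so the first pass becomes a no-op and every
 * matching route is deleted in the single remaining pass.  The two-pass
 * ordering only matters on kernels that still create cloned routes.)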
*/ continue; for (cp = sp; cp < ep; cp += rtm->rtm_msglen) { rtm = (struct rt_msghdr *)cp; route_ParseHdr(rtm, sa); if (rtm->rtm_index == bundle->iface->index && sa[RTAX_DST] && sa[RTAX_GATEWAY] && (sa[RTAX_DST]->sa_family == AF_INET #ifndef NOINET6 || sa[RTAX_DST]->sa_family == AF_INET6 #endif ) && (all || (rtm->rtm_flags & RTF_GATEWAY))) { if (log_IsKept(LogDEBUG)) { char gwstr[41]; struct ncpaddr gw; ncprange_setsa(&range, sa[RTAX_DST], sa[RTAX_NETMASK]); ncpaddr_setsa(&gw, sa[RTAX_GATEWAY]); snprintf(gwstr, sizeof gwstr, "%s", ncpaddr_ntoa(&gw)); log_Printf(LogDEBUG, "Found %s %s\n", ncprange_ntoa(&range), gwstr); } if (sa[RTAX_GATEWAY]->sa_family == AF_INET || #ifndef NOINET6 sa[RTAX_GATEWAY]->sa_family == AF_INET6 || #endif sa[RTAX_GATEWAY]->sa_family == AF_LINK) { if ((pass == 0 && (rtm->rtm_flags & RTF_WASCLONED)) || (pass == 1 && !(rtm->rtm_flags & RTF_WASCLONED))) { ncprange_setsa(&range, sa[RTAX_DST], sa[RTAX_NETMASK]); rt_Set(bundle, RTM_DELETE, &range, NULL, 0, 0); } else log_Printf(LogDEBUG, "route_IfDelete: Skip it (pass %d)\n", pass); } else log_Printf(LogDEBUG, "route_IfDelete: Can't remove routes for family %d\n", sa[RTAX_GATEWAY]->sa_family); } } } free(sp); } /* * Update the MTU on all routes for the given interface */ void route_UpdateMTU(struct bundle *bundle) { struct rt_msghdr *rtm; struct sockaddr *sa[RTAX_MAX]; struct ncprange dst; size_t needed; char *sp, *cp, *ep; int mib[6]; log_Printf(LogDEBUG, "route_UpdateMTU (%d)\n", bundle->iface->index); mib[0] = CTL_NET; mib[1] = PF_ROUTE; mib[2] = 0; mib[3] = 0; mib[4] = NET_RT_DUMP; mib[5] = 0; if (sysctl(mib, 6, NULL, &needed, NULL, 0) < 0) { log_Printf(LogERROR, "route_IfDelete: sysctl: estimate: %s\n", strerror(errno)); return; } sp = malloc(needed); if (sp == NULL) return; if (sysctl(mib, 6, sp, &needed, NULL, 0) < 0) { log_Printf(LogERROR, "route_IfDelete: sysctl: getroute: %s\n", strerror(errno)); free(sp); return; } ep = sp + needed; for (cp = sp; cp < ep; cp += rtm->rtm_msglen) { rtm = (struct rt_msghdr *)cp; route_ParseHdr(rtm, sa); if (sa[RTAX_DST] && (sa[RTAX_DST]->sa_family == AF_INET #ifndef NOINET6 || sa[RTAX_DST]->sa_family == AF_INET6 #endif ) && sa[RTAX_GATEWAY] && rtm->rtm_index == bundle->iface->index) { if (log_IsKept(LogTCPIP)) { ncprange_setsa(&dst, sa[RTAX_DST], sa[RTAX_NETMASK]); log_Printf(LogTCPIP, "route_UpdateMTU: Netif: %d (%s), dst %s," " mtu %lu\n", rtm->rtm_index, Index2Nam(rtm->rtm_index), ncprange_ntoa(&dst), bundle->iface->mtu); } rt_Update(bundle, sa[RTAX_DST], sa[RTAX_GATEWAY], sa[RTAX_NETMASK]); } } free(sp); } int GetIfIndex(char *name) { int idx; idx = 1; while (route_nifs == -1 || idx < route_nifs) if (strcmp(Index2Nam(idx), name) == 0) return idx; else idx++; return -1; } void route_Change(struct bundle *bundle, struct sticky_route *r, const struct ncpaddr *me, const struct ncpaddr *peer) { struct ncpaddr dst; for (; r; r = r->next) { ncprange_getaddr(&r->dst, &dst); if (ncpaddr_family(me) == AF_INET) { if ((r->type & ROUTE_DSTMYADDR) && !ncpaddr_equal(&dst, me)) { rt_Set(bundle, RTM_DELETE, &r->dst, NULL, 1, 0); ncprange_sethost(&r->dst, me); if (r->type & ROUTE_GWHISADDR) ncpaddr_copy(&r->gw, peer); } else if ((r->type & ROUTE_DSTHISADDR) && !ncpaddr_equal(&dst, peer)) { rt_Set(bundle, RTM_DELETE, &r->dst, NULL, 1, 0); ncprange_sethost(&r->dst, peer); if (r->type & ROUTE_GWHISADDR) ncpaddr_copy(&r->gw, peer); } else if ((r->type & ROUTE_DSTDNS0) && !ncpaddr_equal(&dst, peer)) { if (bundle->ncp.ipcp.ns.dns[0].s_addr == INADDR_NONE) continue; rt_Set(bundle, RTM_DELETE, &r->dst, 
NULL, 1, 0); if (r->type & ROUTE_GWHISADDR) ncpaddr_copy(&r->gw, peer); } else if ((r->type & ROUTE_DSTDNS1) && !ncpaddr_equal(&dst, peer)) { if (bundle->ncp.ipcp.ns.dns[1].s_addr == INADDR_NONE) continue; rt_Set(bundle, RTM_DELETE, &r->dst, NULL, 1, 0); if (r->type & ROUTE_GWHISADDR) ncpaddr_copy(&r->gw, peer); } else if ((r->type & ROUTE_GWHISADDR) && !ncpaddr_equal(&r->gw, peer)) ncpaddr_copy(&r->gw, peer); #ifndef NOINET6 } else if (ncpaddr_family(me) == AF_INET6) { if ((r->type & ROUTE_DSTMYADDR6) && !ncpaddr_equal(&dst, me)) { rt_Set(bundle, RTM_DELETE, &r->dst, NULL, 1, 0); ncprange_sethost(&r->dst, me); if (r->type & ROUTE_GWHISADDR) ncpaddr_copy(&r->gw, peer); } else if ((r->type & ROUTE_DSTHISADDR6) && !ncpaddr_equal(&dst, peer)) { rt_Set(bundle, RTM_DELETE, &r->dst, NULL, 1, 0); ncprange_sethost(&r->dst, peer); if (r->type & ROUTE_GWHISADDR) ncpaddr_copy(&r->gw, peer); } else if ((r->type & ROUTE_GWHISADDR6) && !ncpaddr_equal(&r->gw, peer)) ncpaddr_copy(&r->gw, peer); #endif } rt_Set(bundle, RTM_ADD, &r->dst, &r->gw, 1, 0); } } void route_Add(struct sticky_route **rp, int type, const struct ncprange *dst, const struct ncpaddr *gw) { struct sticky_route *r; int dsttype = type & ROUTE_DSTANY; r = NULL; while (*rp) { if ((dsttype && dsttype == ((*rp)->type & ROUTE_DSTANY)) || (!dsttype && ncprange_equal(&(*rp)->dst, dst))) { /* Oops, we already have this route - unlink it */ free(r); /* impossible really */ r = *rp; *rp = r->next; } else rp = &(*rp)->next; } if (r == NULL) { r = (struct sticky_route *)malloc(sizeof(struct sticky_route)); if (r == NULL) { log_Printf(LogERROR, "route_Add: Out of memory!\n"); return; } } r->type = type; r->next = NULL; ncprange_copy(&r->dst, dst); ncpaddr_copy(&r->gw, gw); *rp = r; } void route_Delete(struct sticky_route **rp, int type, const struct ncprange *dst) { struct sticky_route *r; int dsttype = type & ROUTE_DSTANY; for (; *rp; rp = &(*rp)->next) { if ((dsttype && dsttype == ((*rp)->type & ROUTE_DSTANY)) || (!dsttype && ncprange_equal(dst, &(*rp)->dst))) { r = *rp; *rp = r->next; free(r); break; } } } void route_DeleteAll(struct sticky_route **rp) { struct sticky_route *r, *rn; for (r = *rp; r; r = rn) { rn = r->next; free(r); } *rp = NULL; } void route_ShowSticky(struct prompt *p, struct sticky_route *r, const char *tag, int indent) { int tlen = strlen(tag); if (tlen + 2 > indent) prompt_Printf(p, "%s:\n%*s", tag, indent, ""); else prompt_Printf(p, "%s:%*s", tag, indent - tlen - 1, ""); for (; r; r = r->next) { prompt_Printf(p, "%*sadd ", tlen ? 
0 : indent, ""); tlen = 0; if (r->type & ROUTE_DSTMYADDR) prompt_Printf(p, "MYADDR"); else if (r->type & ROUTE_DSTMYADDR6) prompt_Printf(p, "MYADDR6"); else if (r->type & ROUTE_DSTHISADDR) prompt_Printf(p, "HISADDR"); else if (r->type & ROUTE_DSTHISADDR6) prompt_Printf(p, "HISADDR6"); else if (r->type & ROUTE_DSTDNS0) prompt_Printf(p, "DNS0"); else if (r->type & ROUTE_DSTDNS1) prompt_Printf(p, "DNS1"); else if (ncprange_isdefault(&r->dst)) prompt_Printf(p, "default"); else prompt_Printf(p, "%s", ncprange_ntoa(&r->dst)); if (r->type & ROUTE_GWHISADDR) prompt_Printf(p, " HISADDR\n"); else if (r->type & ROUTE_GWHISADDR6) prompt_Printf(p, " HISADDR6\n"); else prompt_Printf(p, " %s\n", ncpaddr_ntoa(&r->gw)); } } struct rtmsg { struct rt_msghdr m_rtm; char m_space[256]; }; static size_t memcpy_roundup(char *cp, const void *data, size_t len) { size_t padlen; padlen = ROUNDUP(len); memcpy(cp, data, len); if (padlen > len) memset(cp + len, '\0', padlen - len); return padlen; } #if defined(__KAME__) && !defined(NOINET6) static void add_scope(struct sockaddr *sa, int ifindex) { struct sockaddr_in6 *sa6; if (sa->sa_family != AF_INET6) return; sa6 = (struct sockaddr_in6 *)sa; if (!IN6_IS_ADDR_LINKLOCAL(&sa6->sin6_addr) && !IN6_IS_ADDR_MC_LINKLOCAL(&sa6->sin6_addr)) return; if (*(u_int16_t *)&sa6->sin6_addr.s6_addr[2] != 0) return; *(u_int16_t *)&sa6->sin6_addr.s6_addr[2] = htons(ifindex); } #endif int rt_Set(struct bundle *bundle, int cmd, const struct ncprange *dst, const struct ncpaddr *gw, int bang, int quiet) { struct rtmsg rtmes; int s, nb, wb; char *cp; const char *cmdstr; struct sockaddr_storage sadst, samask, sagw; int result = 1; if (bang) cmdstr = (cmd == RTM_ADD ? "Add!" : "Delete!"); else cmdstr = (cmd == RTM_ADD ? "Add" : "Delete"); s = ID0socket(PF_ROUTE, SOCK_RAW, 0); if (s < 0) { log_Printf(LogERROR, "rt_Set: socket(): %s\n", strerror(errno)); return result; } memset(&rtmes, '\0', sizeof rtmes); rtmes.m_rtm.rtm_version = RTM_VERSION; rtmes.m_rtm.rtm_type = cmd; rtmes.m_rtm.rtm_addrs = RTA_DST; rtmes.m_rtm.rtm_seq = ++bundle->routing_seq; rtmes.m_rtm.rtm_pid = getpid(); rtmes.m_rtm.rtm_flags = RTF_UP | RTF_GATEWAY | RTF_STATIC; if (cmd == RTM_ADD) { if (bundle->ncp.cfg.sendpipe > 0) { rtmes.m_rtm.rtm_rmx.rmx_sendpipe = bundle->ncp.cfg.sendpipe; rtmes.m_rtm.rtm_inits |= RTV_SPIPE; } if (bundle->ncp.cfg.recvpipe > 0) { rtmes.m_rtm.rtm_rmx.rmx_recvpipe = bundle->ncp.cfg.recvpipe; rtmes.m_rtm.rtm_inits |= RTV_RPIPE; } } ncprange_getsa(dst, &sadst, &samask); #if defined(__KAME__) && !defined(NOINET6) add_scope((struct sockaddr *)&sadst, bundle->iface->index); #endif cp = rtmes.m_space; cp += memcpy_roundup(cp, &sadst, sadst.ss_len); if (cmd == RTM_ADD) { if (gw == NULL) { log_Printf(LogERROR, "rt_Set: Program error\n"); close(s); return result; } ncpaddr_getsa(gw, &sagw); #if defined(__KAME__) && !defined(NOINET6) add_scope((struct sockaddr *)&sagw, bundle->iface->index); #endif if (ncpaddr_isdefault(gw)) { if (!quiet) log_Printf(LogERROR, "rt_Set: Cannot add a route with" " gateway 0.0.0.0\n"); close(s); return result; } else { cp += memcpy_roundup(cp, &sagw, sagw.ss_len); rtmes.m_rtm.rtm_addrs |= RTA_GATEWAY; } } if (!ncprange_ishost(dst)) { cp += memcpy_roundup(cp, &samask, samask.ss_len); rtmes.m_rtm.rtm_addrs |= RTA_NETMASK; } nb = cp - (char *)&rtmes; rtmes.m_rtm.rtm_msglen = nb; wb = ID0write(s, &rtmes, nb); if (wb < 0) { log_Printf(LogTCPIP, "rt_Set failure:\n"); log_Printf(LogTCPIP, "rt_Set: Cmd = %s\n", cmdstr); log_Printf(LogTCPIP, "rt_Set: Dst = %s\n", ncprange_ntoa(dst)); if (gw 
!= NULL) log_Printf(LogTCPIP, "rt_Set: Gateway = %s\n", ncpaddr_ntoa(gw)); failed: if (cmd == RTM_ADD && (rtmes.m_rtm.rtm_errno == EEXIST || (rtmes.m_rtm.rtm_errno == 0 && errno == EEXIST))) { if (!bang) { log_Printf(LogWARN, "Add route failed: %s already exists\n", ncprange_ntoa(dst)); result = 0; /* Don't add to our dynamic list */ } else { rtmes.m_rtm.rtm_type = cmd = RTM_CHANGE; if ((wb = ID0write(s, &rtmes, nb)) < 0) goto failed; } } else if (cmd == RTM_DELETE && (rtmes.m_rtm.rtm_errno == ESRCH || (rtmes.m_rtm.rtm_errno == 0 && errno == ESRCH))) { if (!bang) log_Printf(LogWARN, "Del route failed: %s: Non-existent\n", ncprange_ntoa(dst)); } else if (rtmes.m_rtm.rtm_errno == 0) { if (!quiet || errno != ENETUNREACH) log_Printf(LogWARN, "%s route failed: %s: errno: %s\n", cmdstr, ncprange_ntoa(dst), strerror(errno)); } else log_Printf(LogWARN, "%s route failed: %s: %s\n", cmdstr, ncprange_ntoa(dst), strerror(rtmes.m_rtm.rtm_errno)); } if (log_IsKept(LogDEBUG)) { char gwstr[40]; if (gw) snprintf(gwstr, sizeof gwstr, "%s", ncpaddr_ntoa(gw)); else snprintf(gwstr, sizeof gwstr, ""); log_Printf(LogDEBUG, "wrote %d: cmd = %s, dst = %s, gateway = %s\n", wb, cmdstr, ncprange_ntoa(dst), gwstr); } close(s); return result; } void rt_Update(struct bundle *bundle, const struct sockaddr *dst, const struct sockaddr *gw, const struct sockaddr *mask) { struct ncprange ncpdst; struct rtmsg rtmes; char *p; int s, wb; s = ID0socket(PF_ROUTE, SOCK_RAW, 0); if (s < 0) { log_Printf(LogERROR, "rt_Update: socket(): %s\n", strerror(errno)); return; } memset(&rtmes, '\0', sizeof rtmes); rtmes.m_rtm.rtm_version = RTM_VERSION; rtmes.m_rtm.rtm_type = RTM_CHANGE; rtmes.m_rtm.rtm_addrs = 0; rtmes.m_rtm.rtm_seq = ++bundle->routing_seq; rtmes.m_rtm.rtm_pid = getpid(); rtmes.m_rtm.rtm_flags = RTF_UP | RTF_STATIC; if (bundle->ncp.cfg.sendpipe > 0) { rtmes.m_rtm.rtm_rmx.rmx_sendpipe = bundle->ncp.cfg.sendpipe; rtmes.m_rtm.rtm_inits |= RTV_SPIPE; } if (bundle->ncp.cfg.recvpipe > 0) { rtmes.m_rtm.rtm_rmx.rmx_recvpipe = bundle->ncp.cfg.recvpipe; rtmes.m_rtm.rtm_inits |= RTV_RPIPE; } rtmes.m_rtm.rtm_rmx.rmx_mtu = bundle->iface->mtu; rtmes.m_rtm.rtm_inits |= RTV_MTU; p = rtmes.m_space; if (dst) { rtmes.m_rtm.rtm_addrs |= RTA_DST; p += memcpy_roundup(p, dst, dst->sa_len); } rtmes.m_rtm.rtm_addrs |= RTA_GATEWAY; p += memcpy_roundup(p, gw, gw->sa_len); if (mask) { rtmes.m_rtm.rtm_addrs |= RTA_NETMASK; p += memcpy_roundup(p, mask, mask->sa_len); } rtmes.m_rtm.rtm_msglen = p - (char *)&rtmes; wb = ID0write(s, &rtmes, rtmes.m_rtm.rtm_msglen); if (wb < 0) { ncprange_setsa(&ncpdst, dst, mask); log_Printf(LogTCPIP, "rt_Update failure:\n"); log_Printf(LogTCPIP, "rt_Update: Dst = %s\n", ncprange_ntoa(&ncpdst)); if (rtmes.m_rtm.rtm_errno == 0) log_Printf(LogWARN, "%s: Change route failed: errno: %s\n", ncprange_ntoa(&ncpdst), strerror(errno)); else log_Printf(LogWARN, "%s: Change route failed: %s\n", ncprange_ntoa(&ncpdst), strerror(rtmes.m_rtm.rtm_errno)); } close(s); } Index: head/usr.sbin/route6d/route6d.c =================================================================== --- head/usr.sbin/route6d/route6d.c (revision 186118) +++ head/usr.sbin/route6d/route6d.c (revision 186119) @@ -1,3633 +1,3631 @@ /* $FreeBSD$ */ /* $KAME: route6d.c,v 1.104 2003/10/31 00:30:20 itojun Exp $ */ /* * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the project nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #ifndef lint static char _rcsid[] = "$KAME: route6d.c,v 1.104 2003/10/31 00:30:20 itojun Exp $"; #endif #include #include #include #include #include #include #ifdef __STDC__ #include #else #include #endif #include #include #include #include #ifdef HAVE_POLL_H #include #endif #include #include #include #include #include #include #include #include #include #define _KERNEL 1 #include #undef _KERNEL #include #include #include #include #include #include #include #include "route6d.h" #define MAXFILTER 40 #ifdef DEBUG #define INIT_INTERVAL6 6 #else #define INIT_INTERVAL6 10 /* Wait to submit an initial riprequest */ #endif /* alignment constraint for routing socket */ #define ROUNDUP(a) \ ((a) > 0 ? 
(1 + (((a) - 1) | (sizeof(long) - 1))) : sizeof(long)) #define ADVANCE(x, n) (x += ROUNDUP((n)->sa_len)) /* * Following two macros are highly depending on KAME Release */ #define IN6_LINKLOCAL_IFINDEX(addr) \ ((addr).s6_addr[2] << 8 | (addr).s6_addr[3]) #define SET_IN6_LINKLOCAL_IFINDEX(addr, index) \ do { \ (addr).s6_addr[2] = ((index) >> 8) & 0xff; \ (addr).s6_addr[3] = (index) & 0xff; \ } while (0) struct ifc { /* Configuration of an interface */ char *ifc_name; /* if name */ struct ifc *ifc_next; int ifc_index; /* if index */ int ifc_mtu; /* if mtu */ int ifc_metric; /* if metric */ u_int ifc_flags; /* flags */ short ifc_cflags; /* IFC_XXX */ struct in6_addr ifc_mylladdr; /* my link-local address */ struct sockaddr_in6 ifc_ripsin; /* rip multicast address */ struct iff *ifc_filter; /* filter structure */ struct ifac *ifc_addr; /* list of AF_INET6 addresses */ int ifc_joined; /* joined to ff02::9 */ }; struct ifac { /* Adddress associated to an interface */ struct ifc *ifa_conf; /* back pointer */ struct ifac *ifa_next; struct in6_addr ifa_addr; /* address */ struct in6_addr ifa_raddr; /* remote address, valid in p2p */ int ifa_plen; /* prefix length */ }; struct iff { int iff_type; struct in6_addr iff_addr; int iff_plen; struct iff *iff_next; }; struct ifc *ifc; int nifc; /* number of valid ifc's */ struct ifc **index2ifc; int nindex2ifc; struct ifc *loopifcp = NULL; /* pointing to loopback */ #ifdef HAVE_POLL_H struct pollfd set[2]; #else fd_set *sockvecp; /* vector to select() for receiving */ fd_set *recvecp; int fdmasks; int maxfd; /* maximum fd for select() */ #endif int rtsock; /* the routing socket */ int ripsock; /* socket to send/receive RIP datagram */ struct rip6 *ripbuf; /* packet buffer for sending */ /* * Maintain the routes in a linked list. When the number of the routes * grows, somebody would like to introduce a hash based or a radix tree * based structure. I believe the number of routes handled by RIP is * limited and I don't have to manage a complex data structure, however. * * One of the major drawbacks of the linear linked list is the difficulty * of representing the relationship between a couple of routes. This may * be a significant problem when we have to support route aggregation with * supressing the specifices covered by the aggregate. 
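 *
 * A worked example of the two KAME-dependent macros above may help: for a
 * link-local address the kernel keeps the interface index in address bytes
 * 2 and 3, which are always zero on the wire.  With an (illustrative)
 * interface index of 2, fe80::1 is held internally as fe80:2::1, that is,
 * s6_addr[2] == 0x00 and s6_addr[3] == 0x02, so
 *
 *     IN6_LINKLOCAL_IFINDEX(addr)        yields (0x00 << 8) | 0x02 = 2
 *     SET_IN6_LINKLOCAL_IFINDEX(addr, 0) clears those two bytes again
 *
 * which is why the code below repeatedly embeds or strips the index when
 * copying addresses between the routing table and the wire format.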
*/ struct riprt { struct riprt *rrt_next; /* next destination */ struct riprt *rrt_same; /* same destination - future use */ struct netinfo6 rrt_info; /* network info */ struct in6_addr rrt_gw; /* gateway */ u_long rrt_flags; /* kernel routing table flags */ u_long rrt_rflags; /* route6d routing table flags */ time_t rrt_t; /* when the route validated */ int rrt_index; /* ifindex from which this route got */ }; struct riprt *riprt = 0; int dflag = 0; /* debug flag */ int qflag = 0; /* quiet flag */ int nflag = 0; /* don't update kernel routing table */ int aflag = 0; /* age out even the statically defined routes */ int hflag = 0; /* don't split horizon */ int lflag = 0; /* exchange site local routes */ int sflag = 0; /* announce static routes w/ split horizon */ int Sflag = 0; /* announce static routes to every interface */ unsigned long routetag = 0; /* route tag attached on originating case */ char *filter[MAXFILTER]; int filtertype[MAXFILTER]; int nfilter = 0; pid_t pid; struct sockaddr_storage ripsin; int interval = 1; time_t nextalarm = 0; time_t sup_trig_update = 0; FILE *rtlog = NULL; int logopened = 0; static int seq = 0; volatile sig_atomic_t seenalrm; volatile sig_atomic_t seenquit; volatile sig_atomic_t seenusr1; #define RRTF_AGGREGATE 0x08000000 #define RRTF_NOADVERTISE 0x10000000 #define RRTF_NH_NOT_LLADDR 0x20000000 #define RRTF_SENDANYWAY 0x40000000 #define RRTF_CHANGED 0x80000000 int main(int, char **); void sighandler(int); void ripalarm(void); void riprecv(void); void ripsend(struct ifc *, struct sockaddr_in6 *, int); int out_filter(struct riprt *, struct ifc *); void init(void); void sockopt(struct ifc *); void ifconfig(void); void ifconfig1(const char *, const struct sockaddr *, struct ifc *, int); void rtrecv(void); int rt_del(const struct sockaddr_in6 *, const struct sockaddr_in6 *, const struct sockaddr_in6 *); int rt_deladdr(struct ifc *, const struct sockaddr_in6 *, const struct sockaddr_in6 *); void filterconfig(void); int getifmtu(int); const char *rttypes(struct rt_msghdr *); const char *rtflags(struct rt_msghdr *); const char *ifflags(int); int ifrt(struct ifc *, int); void ifrt_p2p(struct ifc *, int); void applymask(struct in6_addr *, struct in6_addr *); void applyplen(struct in6_addr *, int); void ifrtdump(int); void ifdump(int); void ifdump0(FILE *, const struct ifc *); void rtdump(int); void rt_entry(struct rt_msghdr *, int); void rtdexit(void); void riprequest(struct ifc *, struct netinfo6 *, int, struct sockaddr_in6 *); void ripflush(struct ifc *, struct sockaddr_in6 *); void sendrequest(struct ifc *); int sin6mask2len(const struct sockaddr_in6 *); int mask2len(const struct in6_addr *, int); int sendpacket(struct sockaddr_in6 *, int); int addroute(struct riprt *, const struct in6_addr *, struct ifc *); int delroute(struct netinfo6 *, struct in6_addr *); struct in6_addr *getroute(struct netinfo6 *, struct in6_addr *); void krtread(int); int tobeadv(struct riprt *, struct ifc *); char *allocopy(char *); char *hms(void); const char *inet6_n2p(const struct in6_addr *); struct ifac *ifa_match(const struct ifc *, const struct in6_addr *, int); struct in6_addr *plen2mask(int); struct riprt *rtsearch(struct netinfo6 *, struct riprt **); int ripinterval(int); time_t ripsuptrig(void); void fatal(const char *, ...) __attribute__((__format__(__printf__, 1, 2))); void trace(int, const char *, ...) __attribute__((__format__(__printf__, 2, 3))); void tracet(int, const char *, ...) 
__attribute__((__format__(__printf__, 2, 3))); unsigned int if_maxindex(void); struct ifc *ifc_find(char *); struct iff *iff_find(struct ifc *, int); void setindex2ifc(int, struct ifc *); #define MALLOC(type) ((type *)malloc(sizeof(type))) int main(argc, argv) int argc; char **argv; { int ch; int error = 0; struct ifc *ifcp; sigset_t mask, omask; FILE *pidfile; char *progname; char *ep; progname = strrchr(*argv, '/'); if (progname) progname++; else progname = *argv; pid = getpid(); while ((ch = getopt(argc, argv, "A:N:O:R:T:L:t:adDhlnqsS")) != -1) { switch (ch) { case 'A': case 'N': case 'O': case 'T': case 'L': if (nfilter >= MAXFILTER) { fatal("Exceeds MAXFILTER"); /*NOTREACHED*/ } filtertype[nfilter] = ch; filter[nfilter++] = allocopy(optarg); break; case 't': ep = NULL; routetag = strtoul(optarg, &ep, 0); if (!ep || *ep != '\0' || (routetag & ~0xffff) != 0) { fatal("invalid route tag"); /*NOTREACHED*/ } break; case 'R': if ((rtlog = fopen(optarg, "w")) == NULL) { fatal("Can not write to routelog"); /*NOTREACHED*/ } break; #define FLAG(c, flag, n) case c: do { flag = n; break; } while(0) FLAG('a', aflag, 1); break; FLAG('d', dflag, 1); break; FLAG('D', dflag, 2); break; FLAG('h', hflag, 1); break; FLAG('l', lflag, 1); break; FLAG('n', nflag, 1); break; FLAG('q', qflag, 1); break; FLAG('s', sflag, 1); break; FLAG('S', Sflag, 1); break; #undef FLAG default: fatal("Invalid option specified, terminating"); /*NOTREACHED*/ } } argc -= optind; argv += optind; if (argc > 0) { fatal("bogus extra arguments"); /*NOTREACHED*/ } if (geteuid()) { nflag = 1; fprintf(stderr, "No kernel update is allowed\n"); } if (dflag == 0) { if (daemon(0, 0) < 0) { fatal("daemon"); /*NOTREACHED*/ } } openlog(progname, LOG_NDELAY|LOG_PID, LOG_DAEMON); logopened++; if ((ripbuf = (struct rip6 *)malloc(RIP6_MAXMTU)) == NULL) fatal("malloc"); memset(ripbuf, 0, RIP6_MAXMTU); ripbuf->rip6_cmd = RIP6_RESPONSE; ripbuf->rip6_vers = RIP6_VERSION; ripbuf->rip6_res1[0] = 0; ripbuf->rip6_res1[1] = 0; init(); ifconfig(); for (ifcp = ifc; ifcp; ifcp = ifcp->ifc_next) { if (ifcp->ifc_index < 0) { fprintf(stderr, "No ifindex found at %s (no link-local address?)\n", ifcp->ifc_name); error++; } } if (error) exit(1); if (loopifcp == NULL) { fatal("No loopback found"); /*NOTREACHED*/ } for (ifcp = ifc; ifcp; ifcp = ifcp->ifc_next) ifrt(ifcp, 0); filterconfig(); krtread(0); if (dflag) ifrtdump(0); #if 1 pid = getpid(); if ((pidfile = fopen(ROUTE6D_PID, "w")) != NULL) { fprintf(pidfile, "%d\n", pid); fclose(pidfile); } #endif if ((ripbuf = (struct rip6 *)malloc(RIP6_MAXMTU)) == NULL) { fatal("malloc"); /*NOTREACHED*/ } memset(ripbuf, 0, RIP6_MAXMTU); ripbuf->rip6_cmd = RIP6_RESPONSE; ripbuf->rip6_vers = RIP6_VERSION; ripbuf->rip6_res1[0] = 0; ripbuf->rip6_res1[1] = 0; if (signal(SIGALRM, sighandler) == SIG_ERR || signal(SIGQUIT, sighandler) == SIG_ERR || signal(SIGTERM, sighandler) == SIG_ERR || signal(SIGUSR1, sighandler) == SIG_ERR || signal(SIGHUP, sighandler) == SIG_ERR || signal(SIGINT, sighandler) == SIG_ERR) { fatal("signal"); /*NOTREACHED*/ } /* * To avoid rip packet congestion (not on a cable but in this * process), wait for a moment to send the first RIP6_RESPONSE * packets. 
*/ alarm(ripinterval(INIT_INTERVAL6)); for (ifcp = ifc; ifcp; ifcp = ifcp->ifc_next) { if (iff_find(ifcp, 'N')) continue; if (ifcp->ifc_index > 0 && (ifcp->ifc_flags & IFF_UP)) sendrequest(ifcp); } syslog(LOG_INFO, "**** Started ****"); sigemptyset(&mask); sigaddset(&mask, SIGALRM); while (1) { if (seenalrm) { ripalarm(); seenalrm = 0; continue; } if (seenquit) { rtdexit(); seenquit = 0; continue; } if (seenusr1) { ifrtdump(SIGUSR1); seenusr1 = 0; continue; } #ifdef HAVE_POLL_H switch (poll(set, 2, INFTIM)) #else memcpy(recvecp, sockvecp, fdmasks); switch (select(maxfd + 1, recvecp, 0, 0, 0)) #endif { case -1: if (errno != EINTR) { fatal("select"); /*NOTREACHED*/ } continue; case 0: continue; default: #ifdef HAVE_POLL_H if (set[0].revents & POLLIN) #else if (FD_ISSET(ripsock, recvecp)) #endif { sigprocmask(SIG_BLOCK, &mask, &omask); riprecv(); sigprocmask(SIG_SETMASK, &omask, NULL); } #ifdef HAVE_POLL_H if (set[1].revents & POLLIN) #else if (FD_ISSET(rtsock, recvecp)) #endif { sigprocmask(SIG_BLOCK, &mask, &omask); rtrecv(); sigprocmask(SIG_SETMASK, &omask, NULL); } } } } void sighandler(signo) int signo; { switch (signo) { case SIGALRM: seenalrm++; break; case SIGQUIT: case SIGTERM: seenquit++; break; case SIGUSR1: case SIGHUP: case SIGINT: seenusr1++; break; } } /* * gracefully exits after resetting sockopts. */ /* ARGSUSED */ void rtdexit() { struct riprt *rrt; alarm(0); for (rrt = riprt; rrt; rrt = rrt->rrt_next) { if (rrt->rrt_rflags & RRTF_AGGREGATE) { delroute(&rrt->rrt_info, &rrt->rrt_gw); } } close(ripsock); close(rtsock); syslog(LOG_INFO, "**** Terminated ****"); closelog(); exit(1); } /* * Called periodically: * 1. age out the learned route. remove it if necessary. * 2. submit RIP6_RESPONSE packets. * Invoked in every SUPPLY_INTERVAL6 (30) seconds. I believe we don't have * to invoke this function in every 1 or 5 or 10 seconds only to age the * routes more precisely. 
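 *
 * The aging below works from two cut-offs compared against rrt_t, the time
 * the route was last validated: an entry not refreshed for RIP_LIFETIME
 * seconds has its metric forced to HOPCNT_INFINITY6 so it is advertised as
 * unreachable, and one not refreshed for a further RIP_HOLDDOWN seconds is
 * unlinked, removed from the kernel via delroute() and freed.  Entries with
 * rrt_t == 0 are never aged.  Assuming the conventional RIPng timers of 180
 * and 120 seconds (check route6d.h for the actual values), a route last
 * heard at time T is therefore poisoned from about T+180 and dropped on the
 * first periodic pass after T+300.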
*/ /* ARGSUSED */ void ripalarm() { struct ifc *ifcp; struct riprt *rrt, *rrt_prev, *rrt_next; time_t t_lifetime, t_holddown; /* age the RIP routes */ rrt_prev = 0; t_lifetime = time(NULL) - RIP_LIFETIME; t_holddown = t_lifetime - RIP_HOLDDOWN; for (rrt = riprt; rrt; rrt = rrt_next) { rrt_next = rrt->rrt_next; if (rrt->rrt_t == 0) { rrt_prev = rrt; continue; } if (rrt->rrt_t < t_holddown) { if (rrt_prev) { rrt_prev->rrt_next = rrt->rrt_next; } else { riprt = rrt->rrt_next; } delroute(&rrt->rrt_info, &rrt->rrt_gw); free(rrt); continue; } if (rrt->rrt_t < t_lifetime) rrt->rrt_info.rip6_metric = HOPCNT_INFINITY6; rrt_prev = rrt; } /* Supply updates */ for (ifcp = ifc; ifcp; ifcp = ifcp->ifc_next) { if (ifcp->ifc_index > 0 && (ifcp->ifc_flags & IFF_UP)) ripsend(ifcp, &ifcp->ifc_ripsin, 0); } alarm(ripinterval(SUPPLY_INTERVAL6)); } void init() { int error; const int int0 = 0, int1 = 1, int255 = 255; struct addrinfo hints, *res; char port[NI_MAXSERV]; ifc = (struct ifc *)NULL; nifc = 0; nindex2ifc = 0; /*initial guess*/ index2ifc = NULL; snprintf(port, sizeof(port), "%u", RIP6_PORT); memset(&hints, 0, sizeof(hints)); hints.ai_family = PF_INET6; hints.ai_socktype = SOCK_DGRAM; hints.ai_protocol = IPPROTO_UDP; hints.ai_flags = AI_PASSIVE; error = getaddrinfo(NULL, port, &hints, &res); if (error) { fatal("%s", gai_strerror(error)); /*NOTREACHED*/ } if (res->ai_next) { fatal(":: resolved to multiple address"); /*NOTREACHED*/ } ripsock = socket(res->ai_family, res->ai_socktype, res->ai_protocol); if (ripsock < 0) { fatal("rip socket"); /*NOTREACHED*/ } #ifdef IPV6_V6ONLY if (setsockopt(ripsock, IPPROTO_IPV6, IPV6_V6ONLY, &int1, sizeof(int1)) < 0) { fatal("rip IPV6_V6ONLY"); /*NOTREACHED*/ } #endif if (bind(ripsock, res->ai_addr, res->ai_addrlen) < 0) { fatal("rip bind"); /*NOTREACHED*/ } if (setsockopt(ripsock, IPPROTO_IPV6, IPV6_MULTICAST_HOPS, &int255, sizeof(int255)) < 0) { fatal("rip IPV6_MULTICAST_HOPS"); /*NOTREACHED*/ } if (setsockopt(ripsock, IPPROTO_IPV6, IPV6_MULTICAST_LOOP, &int0, sizeof(int0)) < 0) { fatal("rip IPV6_MULTICAST_LOOP"); /*NOTREACHED*/ } #ifdef IPV6_RECVPKTINFO if (setsockopt(ripsock, IPPROTO_IPV6, IPV6_RECVPKTINFO, &int1, sizeof(int1)) < 0) { fatal("rip IPV6_RECVPKTINFO"); /*NOTREACHED*/ } #else /* old adv. API */ if (setsockopt(ripsock, IPPROTO_IPV6, IPV6_PKTINFO, &int1, sizeof(int1)) < 0) { fatal("rip IPV6_PKTINFO"); /*NOTREACHED*/ } #endif #ifdef IPV6_RECVPKTINFO if (setsockopt(ripsock, IPPROTO_IPV6, IPV6_RECVHOPLIMIT, &int1, sizeof(int1)) < 0) { fatal("rip IPV6_RECVHOPLIMIT"); /*NOTREACHED*/ } #else /* old adv. 
API */ if (setsockopt(ripsock, IPPROTO_IPV6, IPV6_HOPLIMIT, &int1, sizeof(int1)) < 0) { fatal("rip IPV6_HOPLIMIT"); /*NOTREACHED*/ } #endif memset(&hints, 0, sizeof(hints)); hints.ai_family = PF_INET6; hints.ai_socktype = SOCK_DGRAM; hints.ai_protocol = IPPROTO_UDP; error = getaddrinfo(RIP6_DEST, port, &hints, &res); if (error) { fatal("%s", gai_strerror(error)); /*NOTREACHED*/ } if (res->ai_next) { fatal("%s resolved to multiple address", RIP6_DEST); /*NOTREACHED*/ } memcpy(&ripsin, res->ai_addr, res->ai_addrlen); #ifdef HAVE_POLL_H set[0].fd = ripsock; set[0].events = POLLIN; #else maxfd = ripsock; #endif if (nflag == 0) { if ((rtsock = socket(PF_ROUTE, SOCK_RAW, 0)) < 0) { fatal("route socket"); /*NOTREACHED*/ } #ifdef HAVE_POLL_H set[1].fd = rtsock; set[1].events = POLLIN; #else if (rtsock > maxfd) maxfd = rtsock; #endif } else { #ifdef HAVE_POLL_H set[1].fd = -1; #else rtsock = -1; /*just for safety */ #endif } #ifndef HAVE_POLL_H fdmasks = howmany(maxfd + 1, NFDBITS) * sizeof(fd_mask); if ((sockvecp = malloc(fdmasks)) == NULL) { fatal("malloc"); /*NOTREACHED*/ } if ((recvecp = malloc(fdmasks)) == NULL) { fatal("malloc"); /*NOTREACHED*/ } memset(sockvecp, 0, fdmasks); FD_SET(ripsock, sockvecp); if (rtsock >= 0) FD_SET(rtsock, sockvecp); #endif } #define RIPSIZE(n) \ (sizeof(struct rip6) + ((n)-1) * sizeof(struct netinfo6)) /* * ripflush flushes the rip datagram stored in the rip buffer */ static int nrt; static struct netinfo6 *np; void ripflush(ifcp, sin6) struct ifc *ifcp; struct sockaddr_in6 *sin6; { int i; int error; if (ifcp) tracet(1, "Send(%s): info(%d) to %s.%d\n", ifcp->ifc_name, nrt, inet6_n2p(&sin6->sin6_addr), ntohs(sin6->sin6_port)); else tracet(1, "Send: info(%d) to %s.%d\n", nrt, inet6_n2p(&sin6->sin6_addr), ntohs(sin6->sin6_port)); if (dflag >= 2) { np = ripbuf->rip6_nets; for (i = 0; i < nrt; i++, np++) { if (np->rip6_metric == NEXTHOP_METRIC) { if (IN6_IS_ADDR_UNSPECIFIED(&np->rip6_dest)) trace(2, " NextHop reset"); else { trace(2, " NextHop %s", inet6_n2p(&np->rip6_dest)); } } else { trace(2, " %s/%d[%d]", inet6_n2p(&np->rip6_dest), np->rip6_plen, np->rip6_metric); } if (np->rip6_tag) { trace(2, " tag=0x%04x", ntohs(np->rip6_tag) & 0xffff); } trace(2, "\n"); } } error = sendpacket(sin6, RIPSIZE(nrt)); if (error == EAFNOSUPPORT) { /* Protocol not supported */ tracet(1, "Could not send info to %s (%s): " "set IFF_UP to 0\n", ifcp->ifc_name, inet6_n2p(&ifcp->ifc_ripsin.sin6_addr)); ifcp->ifc_flags &= ~IFF_UP; /* As if down for AF_INET6 */ } nrt = 0; np = ripbuf->rip6_nets; } /* * Generate RIP6_RESPONSE packets and send them. */ void ripsend(ifcp, sin6, flag) struct ifc *ifcp; struct sockaddr_in6 *sin6; int flag; { struct riprt *rrt; struct in6_addr *nh; /* next hop */ int maxrte; if (qflag) return; if (ifcp == NULL) { /* * Request from non-link local address is not * a regular route6d update. 
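 *
 * The maxrte calculation below simply asks how many netinfo6 entries fit
 * into one datagram: link MTU minus the IPv6 and UDP headers minus the RIPng
 * header, divided by the per-route size (the "+ sizeof(struct netinfo6)"
 * term compensates for the first entry already counted inside struct rip6).
 * As a worked example, assuming IFMINMTU is the 1280-byte IPv6 minimum MTU
 * and the usual 20-byte netinfo6 and 24-byte rip6 layouts (check route6d.h):
 *
 *     (1280 - 40 - 8 - 24 + 20) / 20 = 61
 *
 * so at most 61 routes go into each response built for a non-link-local
 * requester; ripflush() is called whenever nrt reaches that limit.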
*/ maxrte = (IFMINMTU - sizeof(struct ip6_hdr) - sizeof(struct udphdr) - sizeof(struct rip6) + sizeof(struct netinfo6)) / sizeof(struct netinfo6); nrt = 0; np = ripbuf->rip6_nets; nh = NULL; for (rrt = riprt; rrt; rrt = rrt->rrt_next) { if (rrt->rrt_rflags & RRTF_NOADVERTISE) continue; /* Put the route to the buffer */ *np = rrt->rrt_info; np++; nrt++; if (nrt == maxrte) { ripflush(NULL, sin6); nh = NULL; } } if (nrt) /* Send last packet */ ripflush(NULL, sin6); return; } if ((flag & RRTF_SENDANYWAY) == 0 && (qflag || (ifcp->ifc_flags & IFF_LOOPBACK))) return; /* -N: no use */ if (iff_find(ifcp, 'N') != NULL) return; /* -T: generate default route only */ if (iff_find(ifcp, 'T') != NULL) { struct netinfo6 rrt_info; memset(&rrt_info, 0, sizeof(struct netinfo6)); rrt_info.rip6_dest = in6addr_any; rrt_info.rip6_plen = 0; rrt_info.rip6_metric = 1; rrt_info.rip6_metric += ifcp->ifc_metric; rrt_info.rip6_tag = htons(routetag & 0xffff); np = ripbuf->rip6_nets; *np = rrt_info; nrt = 1; ripflush(ifcp, sin6); return; } maxrte = (ifcp->ifc_mtu - sizeof(struct ip6_hdr) - sizeof(struct udphdr) - sizeof(struct rip6) + sizeof(struct netinfo6)) / sizeof(struct netinfo6); nrt = 0; np = ripbuf->rip6_nets; nh = NULL; for (rrt = riprt; rrt; rrt = rrt->rrt_next) { if (rrt->rrt_rflags & RRTF_NOADVERTISE) continue; /* Need to check filter here */ if (out_filter(rrt, ifcp) == 0) continue; /* Check split horizon and other conditions */ if (tobeadv(rrt, ifcp) == 0) continue; /* Only considers the routes with flag if specified */ if ((flag & RRTF_CHANGED) && (rrt->rrt_rflags & RRTF_CHANGED) == 0) continue; /* Check nexthop */ if (rrt->rrt_index == ifcp->ifc_index && !IN6_IS_ADDR_UNSPECIFIED(&rrt->rrt_gw) && (rrt->rrt_rflags & RRTF_NH_NOT_LLADDR) == 0) { if (nh == NULL || !IN6_ARE_ADDR_EQUAL(nh, &rrt->rrt_gw)) { if (nrt == maxrte - 2) ripflush(ifcp, sin6); np->rip6_dest = rrt->rrt_gw; if (IN6_IS_ADDR_LINKLOCAL(&np->rip6_dest)) SET_IN6_LINKLOCAL_IFINDEX(np->rip6_dest, 0); np->rip6_plen = 0; np->rip6_tag = 0; np->rip6_metric = NEXTHOP_METRIC; nh = &rrt->rrt_gw; np++; nrt++; } } else if (nh && (rrt->rrt_index != ifcp->ifc_index || !IN6_ARE_ADDR_EQUAL(nh, &rrt->rrt_gw) || rrt->rrt_rflags & RRTF_NH_NOT_LLADDR)) { /* Reset nexthop */ if (nrt == maxrte - 2) ripflush(ifcp, sin6); memset(np, 0, sizeof(struct netinfo6)); np->rip6_metric = NEXTHOP_METRIC; nh = NULL; np++; nrt++; } /* Put the route to the buffer */ *np = rrt->rrt_info; np++; nrt++; if (nrt == maxrte) { ripflush(ifcp, sin6); nh = NULL; } } if (nrt) /* Send last packet */ ripflush(ifcp, sin6); } /* * outbound filter logic, per-route/interface. */ int out_filter(rrt, ifcp) struct riprt *rrt; struct ifc *ifcp; { struct iff *iffp; struct in6_addr ia; int ok; /* * -A: filter out less specific routes, if we have aggregated * route configured. */ for (iffp = ifcp->ifc_filter; iffp; iffp = iffp->iff_next) { if (iffp->iff_type != 'A') continue; if (rrt->rrt_info.rip6_plen <= iffp->iff_plen) continue; ia = rrt->rrt_info.rip6_dest; applyplen(&ia, iffp->iff_plen); if (IN6_ARE_ADDR_EQUAL(&ia, &iffp->iff_addr)) return 0; } /* * if it is an aggregated route, advertise it only to the * interfaces specified on -A. 
*/ if ((rrt->rrt_rflags & RRTF_AGGREGATE) != 0) { ok = 0; for (iffp = ifcp->ifc_filter; iffp; iffp = iffp->iff_next) { if (iffp->iff_type != 'A') continue; if (rrt->rrt_info.rip6_plen == iffp->iff_plen && IN6_ARE_ADDR_EQUAL(&rrt->rrt_info.rip6_dest, &iffp->iff_addr)) { ok = 1; break; } } if (!ok) return 0; } /* * -O: advertise only if prefix matches the configured prefix. */ if (iff_find(ifcp, 'O')) { ok = 0; for (iffp = ifcp->ifc_filter; iffp; iffp = iffp->iff_next) { if (iffp->iff_type != 'O') continue; if (rrt->rrt_info.rip6_plen < iffp->iff_plen) continue; ia = rrt->rrt_info.rip6_dest; applyplen(&ia, iffp->iff_plen); if (IN6_ARE_ADDR_EQUAL(&ia, &iffp->iff_addr)) { ok = 1; break; } } if (!ok) return 0; } /* the prefix should be advertised */ return 1; } /* * Determine if the route is to be advertised on the specified interface. * It checks options specified in the arguments and the split horizon rule. */ int tobeadv(rrt, ifcp) struct riprt *rrt; struct ifc *ifcp; { /* Special care for static routes */ if (rrt->rrt_flags & RTF_STATIC) { /* XXX don't advertise reject/blackhole routes */ if (rrt->rrt_flags & (RTF_REJECT | RTF_BLACKHOLE)) return 0; if (Sflag) /* Yes, advertise it anyway */ return 1; if (sflag && rrt->rrt_index != ifcp->ifc_index) return 1; return 0; } /* Regular split horizon */ if (hflag == 0 && rrt->rrt_index == ifcp->ifc_index) return 0; return 1; } /* * Send a rip packet actually. */ int sendpacket(sin6, len) struct sockaddr_in6 *sin6; int len; { struct msghdr m; struct cmsghdr *cm; struct iovec iov[2]; u_char cmsgbuf[256]; struct in6_pktinfo *pi; int idx; struct sockaddr_in6 sincopy; /* do not overwrite the given sin */ sincopy = *sin6; sin6 = &sincopy; if (IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr) || IN6_IS_ADDR_MULTICAST(&sin6->sin6_addr)) { /* XXX: do not mix the interface index and link index */ idx = IN6_LINKLOCAL_IFINDEX(sin6->sin6_addr); SET_IN6_LINKLOCAL_IFINDEX(sin6->sin6_addr, 0); sin6->sin6_scope_id = idx; } else idx = 0; m.msg_name = (caddr_t)sin6; m.msg_namelen = sizeof(*sin6); iov[0].iov_base = (caddr_t)ripbuf; iov[0].iov_len = len; m.msg_iov = iov; m.msg_iovlen = 1; if (!idx) { m.msg_control = NULL; m.msg_controllen = 0; } else { memset(cmsgbuf, 0, sizeof(cmsgbuf)); cm = (struct cmsghdr *)cmsgbuf; m.msg_control = (caddr_t)cm; m.msg_controllen = CMSG_SPACE(sizeof(struct in6_pktinfo)); cm->cmsg_len = CMSG_LEN(sizeof(struct in6_pktinfo)); cm->cmsg_level = IPPROTO_IPV6; cm->cmsg_type = IPV6_PKTINFO; pi = (struct in6_pktinfo *)CMSG_DATA(cm); memset(&pi->ipi6_addr, 0, sizeof(pi->ipi6_addr)); /*::*/ pi->ipi6_ifindex = idx; } if (sendmsg(ripsock, &m, 0 /*MSG_DONTROUTE*/) < 0) { trace(1, "sendmsg: %s\n", strerror(errno)); return errno; } return 0; } /* * Receive and process RIP packets. Update the routes/kernel forwarding * table if necessary. 
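 *
 * In outline: the response is sanity-checked first (link-local source
 * address, the RIPng source port, hop limit 255 where required, and not one
 * of our own addresses), then each netinfo6 entry is run through the -L
 * filters; an accepted entry has its metric incremented by one hop plus the
 * receiving interface's metric, clamped at HOPCNT_INFINITY6, before being
 * matched against the existing table with rtsearch().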
*/ void riprecv() { struct ifc *ifcp, *ic; struct sockaddr_in6 fsock; struct in6_addr nh; /* next hop */ struct rip6 *rp; struct netinfo6 *np, *nq; struct riprt *rrt; ssize_t len, nn; unsigned int need_trigger, idx; char buf[4 * RIP6_MAXMTU]; time_t t; struct msghdr m; struct cmsghdr *cm; struct iovec iov[2]; u_char cmsgbuf[256]; struct in6_pktinfo *pi = NULL; int *hlimp = NULL; struct iff *iffp; struct in6_addr ia; int ok; time_t t_half_lifetime; need_trigger = 0; m.msg_name = (caddr_t)&fsock; m.msg_namelen = sizeof(fsock); iov[0].iov_base = (caddr_t)buf; iov[0].iov_len = sizeof(buf); m.msg_iov = iov; m.msg_iovlen = 1; cm = (struct cmsghdr *)cmsgbuf; m.msg_control = (caddr_t)cm; m.msg_controllen = sizeof(cmsgbuf); if ((len = recvmsg(ripsock, &m, 0)) < 0) { fatal("recvmsg"); /*NOTREACHED*/ } idx = 0; for (cm = (struct cmsghdr *)CMSG_FIRSTHDR(&m); cm; cm = (struct cmsghdr *)CMSG_NXTHDR(&m, cm)) { if (cm->cmsg_level != IPPROTO_IPV6) continue; switch (cm->cmsg_type) { case IPV6_PKTINFO: if (cm->cmsg_len != CMSG_LEN(sizeof(*pi))) { trace(1, "invalid cmsg length for IPV6_PKTINFO\n"); return; } pi = (struct in6_pktinfo *)(CMSG_DATA(cm)); idx = pi->ipi6_ifindex; break; case IPV6_HOPLIMIT: if (cm->cmsg_len != CMSG_LEN(sizeof(int))) { trace(1, "invalid cmsg length for IPV6_HOPLIMIT\n"); return; } hlimp = (int *)CMSG_DATA(cm); break; } } if (idx && IN6_IS_ADDR_LINKLOCAL(&fsock.sin6_addr)) SET_IN6_LINKLOCAL_IFINDEX(fsock.sin6_addr, idx); if (len < sizeof(struct rip6)) { trace(1, "Packet too short\n"); return; } if (pi == NULL || hlimp == NULL) { /* * This can happen when the kernel failed to allocate memory * for the ancillary data. Although we might be able to handle * some cases without this info, those are minor and not so * important, so it's better to discard the packet for safer * operation. */ trace(1, "IPv6 packet information cannot be retrieved\n"); return; } nh = fsock.sin6_addr; nn = (len - sizeof(struct rip6) + sizeof(struct netinfo6)) / sizeof(struct netinfo6); rp = (struct rip6 *)buf; np = rp->rip6_nets; if (rp->rip6_vers != RIP6_VERSION) { trace(1, "Incorrect RIP version %d\n", rp->rip6_vers); return; } if (rp->rip6_cmd == RIP6_REQUEST) { if (idx && idx < nindex2ifc) { ifcp = index2ifc[idx]; riprequest(ifcp, np, nn, &fsock); } else { riprequest(NULL, np, nn, &fsock); } return; } if (!IN6_IS_ADDR_LINKLOCAL(&fsock.sin6_addr)) { trace(1, "Response from non-ll addr: %s\n", inet6_n2p(&fsock.sin6_addr)); return; /* Ignore packets from non-link-local addr */ } if (ntohs(fsock.sin6_port) != RIP6_PORT) { trace(1, "Response from non-rip port from %s\n", inet6_n2p(&fsock.sin6_addr)); return; } if (IN6_IS_ADDR_MULTICAST(&pi->ipi6_addr) && *hlimp != 255) { trace(1, "Response packet with a smaller hop limit (%d) from %s\n", *hlimp, inet6_n2p(&fsock.sin6_addr)); return; } /* * Further validation: since this program does not send off-link * requests, an incoming response must always come from an on-link * node. Although this is normally ensured by the source address * check above, it may not 100% be safe because there are router * implementations that (invalidly) allow a packet with a link-local * source address to be forwarded to a different link. * So we also check whether the destination address is a link-local * address or the hop limit is 255. Note that RFC2080 does not require * the specific hop limit for a unicast response, so we cannot assume * the limitation. 
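riprecv() below refuses packets for which the kernel did not deliver IPV6_PKTINFO and IPV6_HOPLIMIT ancillary data, so the receive socket must have asked for both. The actual socket setup lives elsewhere in this file; with the RFC 3542 option names the request is simply:

#include <sys/socket.h>
#include <netinet/in.h>

static int
request_rip6_cmsgs(int s)
{
    int on = 1;

    /* deliver the receiving interface and destination with each packet */
    if (setsockopt(s, IPPROTO_IPV6, IPV6_RECVPKTINFO, &on, sizeof(on)) < 0)
        return -1;
    /* deliver the received hop limit, needed for the 255-hop check */
    if (setsockopt(s, IPPROTO_IPV6, IPV6_RECVHOPLIMIT, &on, sizeof(on)) < 0)
        return -1;
    return 0;
}

Systems that only provide the older RFC 2292 API spell the same requests with the IPV6_PKTINFO and IPV6_HOPLIMIT option names instead.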
*/ if (!IN6_IS_ADDR_LINKLOCAL(&pi->ipi6_addr) && *hlimp != 255) { trace(1, "Response packet possibly from an off-link node: " "from %s to %s hlim=%d\n", inet6_n2p(&fsock.sin6_addr), inet6_n2p(&pi->ipi6_addr), *hlimp); return; } idx = IN6_LINKLOCAL_IFINDEX(fsock.sin6_addr); ifcp = (idx < nindex2ifc) ? index2ifc[idx] : NULL; if (!ifcp) { trace(1, "Packets to unknown interface index %d\n", idx); return; /* Ignore it */ } if (IN6_ARE_ADDR_EQUAL(&ifcp->ifc_mylladdr, &fsock.sin6_addr)) return; /* The packet is from me; ignore */ if (rp->rip6_cmd != RIP6_RESPONSE) { trace(1, "Invalid command %d\n", rp->rip6_cmd); return; } /* -N: no use */ if (iff_find(ifcp, 'N') != NULL) return; tracet(1, "Recv(%s): from %s.%d info(%d)\n", ifcp->ifc_name, inet6_n2p(&nh), ntohs(fsock.sin6_port), nn); t = time(NULL); t_half_lifetime = t - (RIP_LIFETIME/2); for (; nn; nn--, np++) { if (np->rip6_metric == NEXTHOP_METRIC) { /* modify neighbor address */ if (IN6_IS_ADDR_LINKLOCAL(&np->rip6_dest)) { nh = np->rip6_dest; SET_IN6_LINKLOCAL_IFINDEX(nh, idx); trace(1, "\tNexthop: %s\n", inet6_n2p(&nh)); } else if (IN6_IS_ADDR_UNSPECIFIED(&np->rip6_dest)) { nh = fsock.sin6_addr; trace(1, "\tNexthop: %s\n", inet6_n2p(&nh)); } else { nh = fsock.sin6_addr; trace(1, "\tInvalid Nexthop: %s\n", inet6_n2p(&np->rip6_dest)); } continue; } if (IN6_IS_ADDR_MULTICAST(&np->rip6_dest)) { trace(1, "\tMulticast netinfo6: %s/%d [%d]\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, np->rip6_metric); continue; } if (IN6_IS_ADDR_LOOPBACK(&np->rip6_dest)) { trace(1, "\tLoopback netinfo6: %s/%d [%d]\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, np->rip6_metric); continue; } if (IN6_IS_ADDR_LINKLOCAL(&np->rip6_dest)) { trace(1, "\tLink Local netinfo6: %s/%d [%d]\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, np->rip6_metric); continue; } /* may need to pass sitelocal prefix in some case, however*/ if (IN6_IS_ADDR_SITELOCAL(&np->rip6_dest) && !lflag) { trace(1, "\tSite Local netinfo6: %s/%d [%d]\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, np->rip6_metric); continue; } trace(2, "\tnetinfo6: %s/%d [%d]", inet6_n2p(&np->rip6_dest), np->rip6_plen, np->rip6_metric); if (np->rip6_tag) trace(2, " tag=0x%04x", ntohs(np->rip6_tag) & 0xffff); if (dflag >= 2) { ia = np->rip6_dest; applyplen(&ia, np->rip6_plen); if (!IN6_ARE_ADDR_EQUAL(&ia, &np->rip6_dest)) trace(2, " [junk outside prefix]"); } /* * -L: listen only if the prefix matches the configuration */ ok = 1; /* if there's no L filter, it is ok */ for (iffp = ifcp->ifc_filter; iffp; iffp = iffp->iff_next) { if (iffp->iff_type != 'L') continue; ok = 0; if (np->rip6_plen < iffp->iff_plen) continue; /* special rule: ::/0 means default, not "in /0" */ if (iffp->iff_plen == 0 && np->rip6_plen > 0) continue; ia = np->rip6_dest; applyplen(&ia, iffp->iff_plen); if (IN6_ARE_ADDR_EQUAL(&ia, &iffp->iff_addr)) { ok = 1; break; } } if (!ok) { trace(2, " (filtered)\n"); continue; } trace(2, "\n"); np->rip6_metric++; np->rip6_metric += ifcp->ifc_metric; if (np->rip6_metric > HOPCNT_INFINITY6) np->rip6_metric = HOPCNT_INFINITY6; applyplen(&np->rip6_dest, np->rip6_plen); if ((rrt = rtsearch(np, NULL)) != NULL) { if (rrt->rrt_t == 0) continue; /* Intf route has priority */ nq = &rrt->rrt_info; if (nq->rip6_metric > np->rip6_metric) { if (rrt->rrt_index == ifcp->ifc_index && IN6_ARE_ADDR_EQUAL(&nh, &rrt->rrt_gw)) { /* Small metric from the same gateway */ nq->rip6_metric = np->rip6_metric; } else { /* Better route found */ rrt->rrt_index = ifcp->ifc_index; /* Update routing table */ delroute(nq, &rrt->rrt_gw); rrt->rrt_gw = 
nh; *nq = *np; addroute(rrt, &nh, ifcp); } rrt->rrt_rflags |= RRTF_CHANGED; rrt->rrt_t = t; need_trigger = 1; } else if (nq->rip6_metric < np->rip6_metric && rrt->rrt_index == ifcp->ifc_index && IN6_ARE_ADDR_EQUAL(&nh, &rrt->rrt_gw)) { /* Got worse route from same gw */ nq->rip6_metric = np->rip6_metric; rrt->rrt_t = t; rrt->rrt_rflags |= RRTF_CHANGED; need_trigger = 1; } else if (nq->rip6_metric == np->rip6_metric && np->rip6_metric < HOPCNT_INFINITY6) { if (rrt->rrt_index == ifcp->ifc_index && IN6_ARE_ADDR_EQUAL(&nh, &rrt->rrt_gw)) { /* same metric, same route from same gw */ rrt->rrt_t = t; } else if (rrt->rrt_t < t_half_lifetime) { /* Better route found */ rrt->rrt_index = ifcp->ifc_index; /* Update routing table */ delroute(nq, &rrt->rrt_gw); rrt->rrt_gw = nh; *nq = *np; addroute(rrt, &nh, ifcp); rrt->rrt_rflags |= RRTF_CHANGED; rrt->rrt_t = t; } } /* * if nq->rip6_metric == HOPCNT_INFINITY6 then * do not update age value. Do nothing. */ } else if (np->rip6_metric < HOPCNT_INFINITY6) { /* Got a new valid route */ if ((rrt = MALLOC(struct riprt)) == NULL) { fatal("malloc: struct riprt"); /*NOTREACHED*/ } memset(rrt, 0, sizeof(*rrt)); nq = &rrt->rrt_info; rrt->rrt_same = NULL; rrt->rrt_index = ifcp->ifc_index; rrt->rrt_flags = RTF_UP|RTF_GATEWAY; rrt->rrt_gw = nh; *nq = *np; applyplen(&nq->rip6_dest, nq->rip6_plen); if (nq->rip6_plen == sizeof(struct in6_addr) * 8) rrt->rrt_flags |= RTF_HOST; /* Put the route to the list */ rrt->rrt_next = riprt; riprt = rrt; /* Update routing table */ addroute(rrt, &nh, ifcp); rrt->rrt_rflags |= RRTF_CHANGED; need_trigger = 1; rrt->rrt_t = t; } } /* XXX need to care the interval between triggered updates */ if (need_trigger) { if (nextalarm > time(NULL) + RIP_TRIG_INT6_MAX) { for (ic = ifc; ic; ic = ic->ifc_next) { if (ifcp->ifc_index == ic->ifc_index) continue; if (ic->ifc_flags & IFF_UP) ripsend(ic, &ic->ifc_ripsin, RRTF_CHANGED); } } /* Reset the flag */ for (rrt = riprt; rrt; rrt = rrt->rrt_next) rrt->rrt_rflags &= ~RRTF_CHANGED; } } /* * Send all routes request packet to the specified interface. */ void sendrequest(ifcp) struct ifc *ifcp; { struct netinfo6 *np; int error; if (ifcp->ifc_flags & IFF_LOOPBACK) return; ripbuf->rip6_cmd = RIP6_REQUEST; np = ripbuf->rip6_nets; memset(np, 0, sizeof(struct netinfo6)); np->rip6_metric = HOPCNT_INFINITY6; tracet(1, "Send rtdump Request to %s (%s)\n", ifcp->ifc_name, inet6_n2p(&ifcp->ifc_ripsin.sin6_addr)); error = sendpacket(&ifcp->ifc_ripsin, RIPSIZE(1)); if (error == EAFNOSUPPORT) { /* Protocol not supported */ tracet(1, "Could not send rtdump Request to %s (%s): " "set IFF_UP to 0\n", ifcp->ifc_name, inet6_n2p(&ifcp->ifc_ripsin.sin6_addr)); ifcp->ifc_flags &= ~IFF_UP; /* As if down for AF_INET6 */ } ripbuf->rip6_cmd = RIP6_RESPONSE; } /* * Process a RIP6_REQUEST packet. */ void riprequest(ifcp, np, nn, sin6) struct ifc *ifcp; struct netinfo6 *np; int nn; struct sockaddr_in6 *sin6; { int i; struct riprt *rrt; if (!(nn == 1 && IN6_IS_ADDR_UNSPECIFIED(&np->rip6_dest) && np->rip6_plen == 0 && np->rip6_metric == HOPCNT_INFINITY6)) { /* Specific response, don't split-horizon */ trace(1, "\tRIP Request\n"); for (i = 0; i < nn; i++, np++) { rrt = rtsearch(np, NULL); if (rrt) np->rip6_metric = rrt->rrt_info.rip6_metric; else np->rip6_metric = HOPCNT_INFINITY6; } (void)sendpacket(sin6, RIPSIZE(nn)); return; } /* Whole routing table dump */ trace(1, "\tRIP Request -- whole routing table\n"); ripsend(ifcp, sin6, RRTF_SENDANYWAY); } /* * Get information of each interface. 
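The metric handling in the receive loop above is the standard RIPng rule: one hop is added to the advertised metric plus the receiving interface's metric, saturating at infinity. A tiny standalone restatement, with the constant assumed to match HOPCNT_INFINITY6 (16):

#include <stdio.h>

#define HOPCNT_INFINITY6    16

static int
rip6_adjust_metric(int received, int ifmetric)
{
    int m = received + 1 + ifmetric;

    return m > HOPCNT_INFINITY6 ? HOPCNT_INFINITY6 : m;
}

int
main(void)
{
    printf("%d\n", rip6_adjust_metric(3, 0));    /* 4 */
    printf("%d\n", rip6_adjust_metric(15, 2));   /* 16: unreachable */
    return 0;
}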
*/ void ifconfig() { struct ifaddrs *ifap, *ifa; struct ifc *ifcp; struct ipv6_mreq mreq; int s; if ((s = socket(AF_INET6, SOCK_DGRAM, 0)) < 0) { fatal("socket"); /*NOTREACHED*/ } if (getifaddrs(&ifap) != 0) { fatal("getifaddrs"); /*NOTREACHED*/ } for (ifa = ifap; ifa; ifa = ifa->ifa_next) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; ifcp = ifc_find(ifa->ifa_name); /* we are interested in multicast-capable interfaces */ if ((ifa->ifa_flags & IFF_MULTICAST) == 0) continue; if (!ifcp) { /* new interface */ if ((ifcp = MALLOC(struct ifc)) == NULL) { fatal("malloc: struct ifc"); /*NOTREACHED*/ } memset(ifcp, 0, sizeof(*ifcp)); ifcp->ifc_index = -1; ifcp->ifc_next = ifc; ifc = ifcp; nifc++; ifcp->ifc_name = allocopy(ifa->ifa_name); ifcp->ifc_addr = 0; ifcp->ifc_filter = 0; ifcp->ifc_flags = ifa->ifa_flags; trace(1, "newif %s <%s>\n", ifcp->ifc_name, ifflags(ifcp->ifc_flags)); if (!strcmp(ifcp->ifc_name, LOOPBACK_IF)) loopifcp = ifcp; } else { /* update flag, this may be up again */ if (ifcp->ifc_flags != ifa->ifa_flags) { trace(1, "%s: <%s> -> ", ifcp->ifc_name, ifflags(ifcp->ifc_flags)); trace(1, "<%s>\n", ifflags(ifa->ifa_flags)); ifcp->ifc_cflags |= IFC_CHANGED; } ifcp->ifc_flags = ifa->ifa_flags; } ifconfig1(ifa->ifa_name, ifa->ifa_addr, ifcp, s); if ((ifcp->ifc_flags & (IFF_LOOPBACK | IFF_UP)) == IFF_UP && 0 < ifcp->ifc_index && !ifcp->ifc_joined) { mreq.ipv6mr_multiaddr = ifcp->ifc_ripsin.sin6_addr; mreq.ipv6mr_interface = ifcp->ifc_index; if (setsockopt(ripsock, IPPROTO_IPV6, IPV6_JOIN_GROUP, &mreq, sizeof(mreq)) < 0) { fatal("IPV6_JOIN_GROUP"); /*NOTREACHED*/ } trace(1, "join %s %s\n", ifcp->ifc_name, RIP6_DEST); ifcp->ifc_joined++; } } close(s); freeifaddrs(ifap); } void ifconfig1(name, sa, ifcp, s) const char *name; const struct sockaddr *sa; struct ifc *ifcp; int s; { struct in6_ifreq ifr; const struct sockaddr_in6 *sin6; struct ifac *ifa; int plen; char buf[BUFSIZ]; sin6 = (const struct sockaddr_in6 *)sa; if (IN6_IS_ADDR_SITELOCAL(&sin6->sin6_addr) && !lflag) return; ifr.ifr_addr = *sin6; strncpy(ifr.ifr_name, name, sizeof(ifr.ifr_name)); if (ioctl(s, SIOCGIFNETMASK_IN6, (char *)&ifr) < 0) { fatal("ioctl: SIOCGIFNETMASK_IN6"); /*NOTREACHED*/ } plen = sin6mask2len(&ifr.ifr_addr); if ((ifa = ifa_match(ifcp, &sin6->sin6_addr, plen)) != NULL) { /* same interface found */ /* need check if something changed */ /* XXX not yet implemented */ return; } /* * New address is found */ if ((ifa = MALLOC(struct ifac)) == NULL) { fatal("malloc: struct ifac"); /*NOTREACHED*/ } memset(ifa, 0, sizeof(*ifa)); ifa->ifa_conf = ifcp; ifa->ifa_next = ifcp->ifc_addr; ifcp->ifc_addr = ifa; ifa->ifa_addr = sin6->sin6_addr; ifa->ifa_plen = plen; if (ifcp->ifc_flags & IFF_POINTOPOINT) { ifr.ifr_addr = *sin6; if (ioctl(s, SIOCGIFDSTADDR_IN6, (char *)&ifr) < 0) { fatal("ioctl: SIOCGIFDSTADDR_IN6"); /*NOTREACHED*/ } ifa->ifa_raddr = ifr.ifr_dstaddr.sin6_addr; inet_ntop(AF_INET6, (void *)&ifa->ifa_raddr, buf, sizeof(buf)); trace(1, "found address %s/%d -- %s\n", inet6_n2p(&ifa->ifa_addr), ifa->ifa_plen, buf); } else { trace(1, "found address %s/%d\n", inet6_n2p(&ifa->ifa_addr), ifa->ifa_plen); } if (ifcp->ifc_index < 0 && IN6_IS_ADDR_LINKLOCAL(&ifa->ifa_addr)) { ifcp->ifc_mylladdr = ifa->ifa_addr; ifcp->ifc_index = IN6_LINKLOCAL_IFINDEX(ifa->ifa_addr); memcpy(&ifcp->ifc_ripsin, &ripsin, ripsin.ss_len); SET_IN6_LINKLOCAL_IFINDEX(ifcp->ifc_ripsin.sin6_addr, ifcp->ifc_index); setindex2ifc(ifcp->ifc_index, ifcp); ifcp->ifc_mtu = getifmtu(ifcp->ifc_index); if (ifcp->ifc_mtu > RIP6_MAXMTU) ifcp->ifc_mtu = 
RIP6_MAXMTU; if (ioctl(s, SIOCGIFMETRIC, (char *)&ifr) < 0) { fatal("ioctl: SIOCGIFMETRIC"); /*NOTREACHED*/ } ifcp->ifc_metric = ifr.ifr_metric; trace(1, "\tindex: %d, mtu: %d, metric: %d\n", ifcp->ifc_index, ifcp->ifc_mtu, ifcp->ifc_metric); } else ifcp->ifc_cflags |= IFC_CHANGED; } /* * Receive and process routing messages. * Update interface information as necesssary. */ void rtrecv() { char buf[BUFSIZ]; char *p, *q; struct rt_msghdr *rtm; struct ifa_msghdr *ifam; struct if_msghdr *ifm; int len; struct ifc *ifcp, *ic; int iface = 0, rtable = 0; struct sockaddr_in6 *rta[RTAX_MAX]; struct sockaddr_in6 mask; int i, addrs; struct riprt *rrt; if ((len = read(rtsock, buf, sizeof(buf))) < 0) { perror("read from rtsock"); exit(1); } if (len < sizeof(*rtm)) { trace(1, "short read from rtsock: %d (should be > %lu)\n", len, (u_long)sizeof(*rtm)); return; } if (dflag >= 2) { fprintf(stderr, "rtmsg:\n"); for (i = 0; i < len; i++) { fprintf(stderr, "%02x ", buf[i] & 0xff); if (i % 16 == 15) fprintf(stderr, "\n"); } fprintf(stderr, "\n"); } for (p = buf; p - buf < len; p += ((struct rt_msghdr *)p)->rtm_msglen) { /* safety against bogus message */ if (((struct rt_msghdr *)p)->rtm_msglen <= 0) { trace(1, "bogus rtmsg: length=%d\n", ((struct rt_msghdr *)p)->rtm_msglen); break; } rtm = NULL; ifam = NULL; ifm = NULL; switch (((struct rt_msghdr *)p)->rtm_type) { case RTM_NEWADDR: case RTM_DELADDR: ifam = (struct ifa_msghdr *)p; addrs = ifam->ifam_addrs; q = (char *)(ifam + 1); break; case RTM_IFINFO: ifm = (struct if_msghdr *)p; addrs = ifm->ifm_addrs; q = (char *)(ifm + 1); break; default: rtm = (struct rt_msghdr *)p; addrs = rtm->rtm_addrs; q = (char *)(rtm + 1); if (rtm->rtm_version != RTM_VERSION) { trace(1, "unexpected rtmsg version %d " "(should be %d)\n", rtm->rtm_version, RTM_VERSION); continue; } if (rtm->rtm_pid == pid) { #if 0 trace(1, "rtmsg looped back to me, ignored\n"); #endif continue; } break; } memset(&rta, 0, sizeof(rta)); for (i = 0; i < RTAX_MAX; i++) { if (addrs & (1 << i)) { rta[i] = (struct sockaddr_in6 *)q; q += ROUNDUP(rta[i]->sin6_len); } } trace(1, "rtsock: %s (addrs=%x)\n", rttypes((struct rt_msghdr *)p), addrs); if (dflag >= 2) { for (i = 0; i < ((struct rt_msghdr *)p)->rtm_msglen; i++) { fprintf(stderr, "%02x ", p[i] & 0xff); if (i % 16 == 15) fprintf(stderr, "\n"); } fprintf(stderr, "\n"); } /* * Easy ones first. * * We may be able to optimize by using ifm->ifm_index or * ifam->ifam_index. For simplicity we don't do that here. 
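rtrecv() above walks the sockaddrs that follow each routing-socket message: every bit set in the addrs mask corresponds to one sockaddr appended after the header, and each sockaddr is padded to a long-word boundary. The sketch below uses the conventional ROUNDUP definition; route6d's own macro is defined earlier in the file, so the name here is deliberately different.

#include <sys/types.h>
#include <sys/socket.h>
#include <net/route.h>
#include <string.h>

#define RT_ROUNDUP(a) \
    ((a) > 0 ? (1 + (((a) - 1) | (sizeof(long) - 1))) : sizeof(long))

static void
parse_rtaddrs(struct rt_msghdr *rtm, struct sockaddr *rta[RTAX_MAX])
{
    char *cp = (char *)(rtm + 1);
    int i;

    memset(rta, 0, sizeof(struct sockaddr *) * RTAX_MAX);
    for (i = 0; i < RTAX_MAX; i++) {
        if ((rtm->rtm_addrs & (1 << i)) == 0)
            continue;
        rta[i] = (struct sockaddr *)cp;
        cp += RT_ROUNDUP(rta[i]->sa_len);
    }
}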
*/ switch (((struct rt_msghdr *)p)->rtm_type) { case RTM_NEWADDR: case RTM_IFINFO: iface++; continue; case RTM_ADD: rtable++; continue; case RTM_LOSING: case RTM_MISS: - case RTM_RESOLVE: case RTM_GET: case RTM_LOCK: /* nothing to be done here */ trace(1, "\tnothing to be done, ignored\n"); continue; } #if 0 if (rta[RTAX_DST] == NULL) { trace(1, "\tno destination, ignored\n"); continue; } if (rta[RTAX_DST]->sin6_family != AF_INET6) { trace(1, "\taf mismatch, ignored\n"); continue; } if (IN6_IS_ADDR_LINKLOCAL(&rta[RTAX_DST]->sin6_addr)) { trace(1, "\tlinklocal destination, ignored\n"); continue; } if (IN6_ARE_ADDR_EQUAL(&rta[RTAX_DST]->sin6_addr, &in6addr_loopback)) { trace(1, "\tloopback destination, ignored\n"); continue; /* Loopback */ } if (IN6_IS_ADDR_MULTICAST(&rta[RTAX_DST]->sin6_addr)) { trace(1, "\tmulticast destination, ignored\n"); continue; } #endif /* hard ones */ switch (((struct rt_msghdr *)p)->rtm_type) { case RTM_NEWADDR: case RTM_IFINFO: case RTM_ADD: case RTM_LOSING: case RTM_MISS: - case RTM_RESOLVE: case RTM_GET: case RTM_LOCK: /* should already be handled */ fatal("rtrecv: never reach here"); /*NOTREACHED*/ case RTM_DELETE: if (!rta[RTAX_DST] || !rta[RTAX_GATEWAY]) { trace(1, "\tsome of dst/gw/netamsk are " "unavailable, ignored\n"); break; } if ((rtm->rtm_flags & RTF_HOST) != 0) { mask.sin6_len = sizeof(mask); memset(&mask.sin6_addr, 0xff, sizeof(mask.sin6_addr)); rta[RTAX_NETMASK] = &mask; } else if (!rta[RTAX_NETMASK]) { trace(1, "\tsome of dst/gw/netamsk are " "unavailable, ignored\n"); break; } if (rt_del(rta[RTAX_DST], rta[RTAX_GATEWAY], rta[RTAX_NETMASK]) == 0) { rtable++; /*just to be sure*/ } break; case RTM_CHANGE: case RTM_REDIRECT: trace(1, "\tnot supported yet, ignored\n"); break; case RTM_DELADDR: if (!rta[RTAX_NETMASK] || !rta[RTAX_IFA]) { trace(1, "\tno netmask or ifa given, ignored\n"); break; } if (ifam->ifam_index < nindex2ifc) ifcp = index2ifc[ifam->ifam_index]; else ifcp = NULL; if (!ifcp) { trace(1, "\tinvalid ifam_index %d, ignored\n", ifam->ifam_index); break; } if (!rt_deladdr(ifcp, rta[RTAX_IFA], rta[RTAX_NETMASK])) iface++; break; case RTM_OLDADD: case RTM_OLDDEL: trace(1, "\tnot supported yet, ignored\n"); break; } } if (iface) { trace(1, "rtsock: reconfigure interfaces, refresh interface routes\n"); ifconfig(); for (ifcp = ifc; ifcp; ifcp = ifcp->ifc_next) if (ifcp->ifc_cflags & IFC_CHANGED) { if (ifrt(ifcp, 1)) { for (ic = ifc; ic; ic = ic->ifc_next) { if (ifcp->ifc_index == ic->ifc_index) continue; if (ic->ifc_flags & IFF_UP) ripsend(ic, &ic->ifc_ripsin, RRTF_CHANGED); } /* Reset the flag */ for (rrt = riprt; rrt; rrt = rrt->rrt_next) rrt->rrt_rflags &= ~RRTF_CHANGED; } ifcp->ifc_cflags &= ~IFC_CHANGED; } } if (rtable) { trace(1, "rtsock: read routing table again\n"); krtread(1); } } /* * remove specified route from the internal routing table. 
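One detail of the RTM_DELETE arm above: a host route arrives without an RTA_NETMASK, so an all-ones /128 mask is synthesized before rt_del() is called. Stated on its own (the helper name is invented for this sketch; the original only fills in the length and address bytes):

#include <netinet/in.h>
#include <string.h>

static void
make_host_mask(struct sockaddr_in6 *mask)
{
    memset(mask, 0, sizeof(*mask));
    mask->sin6_len = sizeof(*mask);
    mask->sin6_family = AF_INET6;
    memset(&mask->sin6_addr, 0xff, sizeof(mask->sin6_addr));   /* /128 */
}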
*/ int rt_del(sdst, sgw, smask) const struct sockaddr_in6 *sdst; const struct sockaddr_in6 *sgw; const struct sockaddr_in6 *smask; { const struct in6_addr *dst = NULL; const struct in6_addr *gw = NULL; int prefix; struct netinfo6 ni6; struct riprt *rrt = NULL; time_t t_lifetime; if (sdst->sin6_family != AF_INET6) { trace(1, "\tother AF, ignored\n"); return -1; } if (IN6_IS_ADDR_LINKLOCAL(&sdst->sin6_addr) || IN6_ARE_ADDR_EQUAL(&sdst->sin6_addr, &in6addr_loopback) || IN6_IS_ADDR_MULTICAST(&sdst->sin6_addr)) { trace(1, "\taddress %s not interesting, ignored\n", inet6_n2p(&sdst->sin6_addr)); return -1; } dst = &sdst->sin6_addr; if (sgw->sin6_family == AF_INET6) { /* easy case */ gw = &sgw->sin6_addr; prefix = sin6mask2len(smask); } else if (sgw->sin6_family == AF_LINK) { /* * Interface route... a hard case. We need to get the prefix * length from the kernel, but we now are parsing rtmsg. * We'll purge matching routes from my list, then get the * fresh list. */ struct riprt *longest; trace(1, "\t%s is an interface route, guessing prefixlen\n", inet6_n2p(dst)); longest = NULL; for (rrt = riprt; rrt; rrt = rrt->rrt_next) { if (IN6_ARE_ADDR_EQUAL(&rrt->rrt_info.rip6_dest, &sdst->sin6_addr) && IN6_IS_ADDR_LOOPBACK(&rrt->rrt_gw)) { if (!longest || longest->rrt_info.rip6_plen < rrt->rrt_info.rip6_plen) { longest = rrt; } } } rrt = longest; if (!rrt) { trace(1, "\tno matching interface route found\n"); return -1; } gw = &in6addr_loopback; prefix = rrt->rrt_info.rip6_plen; } else { trace(1, "\tunsupported af: (gw=%d)\n", sgw->sin6_family); return -1; } trace(1, "\tdeleting %s/%d ", inet6_n2p(dst), prefix); trace(1, "gw %s\n", inet6_n2p(gw)); t_lifetime = time(NULL) - RIP_LIFETIME; /* age route for interface address */ memset(&ni6, 0, sizeof(ni6)); ni6.rip6_dest = *dst; ni6.rip6_plen = prefix; applyplen(&ni6.rip6_dest, ni6.rip6_plen); /*to be sure*/ trace(1, "\tfind route %s/%d\n", inet6_n2p(&ni6.rip6_dest), ni6.rip6_plen); if (!rrt && (rrt = rtsearch(&ni6, NULL)) == NULL) { trace(1, "\tno route found\n"); return -1; } #if 0 if ((rrt->rrt_flags & RTF_STATIC) == 0) { trace(1, "\tyou can delete static routes only\n"); } else #endif if (!IN6_ARE_ADDR_EQUAL(&rrt->rrt_gw, gw)) { trace(1, "\tgw mismatch: %s <-> ", inet6_n2p(&rrt->rrt_gw)); trace(1, "%s\n", inet6_n2p(gw)); } else { trace(1, "\troute found, age it\n"); if (rrt->rrt_t == 0 || rrt->rrt_t > t_lifetime) { rrt->rrt_t = t_lifetime; rrt->rrt_info.rip6_metric = HOPCNT_INFINITY6; } } return 0; } /* * remove specified address from internal interface/routing table. 
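Note the pattern rt_del() uses when it finds the entry: the route is not unlinked on the spot but "aged", that is, its timestamp is pushed one lifetime into the past and its metric set to infinity, so the ordinary timer path advertises the withdrawal and reclaims the entry. A stand-alone sketch, assuming the usual RIPng constants (180-second lifetime, infinity of 16) and a cut-down stand-in for struct riprt:

#include <time.h>

#define RIP_LIFETIME        180
#define HOPCNT_INFINITY6    16

struct fake_rte {       /* stand-in for the struct riprt fields used here */
    time_t  t;
    int     metric;
};

static void
age_out(struct fake_rte *r)
{
    time_t expired = time(NULL) - RIP_LIFETIME;

    if (r->t == 0 || r->t > expired) {
        r->t = expired;
        r->metric = HOPCNT_INFINITY6;
    }
}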
*/ int rt_deladdr(ifcp, sifa, smask) struct ifc *ifcp; const struct sockaddr_in6 *sifa; const struct sockaddr_in6 *smask; { const struct in6_addr *addr = NULL; int prefix; struct ifac *ifa = NULL; struct netinfo6 ni6; struct riprt *rrt = NULL; time_t t_lifetime; int updated = 0; if (sifa->sin6_family != AF_INET6) { trace(1, "\tother AF, ignored\n"); return -1; } addr = &sifa->sin6_addr; prefix = sin6mask2len(smask); trace(1, "\tdeleting %s/%d from %s\n", inet6_n2p(addr), prefix, ifcp->ifc_name); ifa = ifa_match(ifcp, addr, prefix); if (!ifa) { trace(1, "\tno matching ifa found for %s/%d on %s\n", inet6_n2p(addr), prefix, ifcp->ifc_name); return -1; } if (ifa->ifa_conf != ifcp) { trace(1, "\taddress table corrupt: back pointer does not match " "(%s != %s)\n", ifcp->ifc_name, ifa->ifa_conf->ifc_name); return -1; } /* remove ifa from interface */ if (ifcp->ifc_addr == ifa) ifcp->ifc_addr = ifa->ifa_next; else { struct ifac *p; for (p = ifcp->ifc_addr; p; p = p->ifa_next) { if (p->ifa_next == ifa) { p->ifa_next = ifa->ifa_next; break; } } } ifa->ifa_next = NULL; ifa->ifa_conf = NULL; t_lifetime = time(NULL) - RIP_LIFETIME; /* age route for interface address */ memset(&ni6, 0, sizeof(ni6)); ni6.rip6_dest = ifa->ifa_addr; ni6.rip6_plen = ifa->ifa_plen; applyplen(&ni6.rip6_dest, ni6.rip6_plen); trace(1, "\tfind interface route %s/%d on %d\n", inet6_n2p(&ni6.rip6_dest), ni6.rip6_plen, ifcp->ifc_index); if ((rrt = rtsearch(&ni6, NULL)) != NULL) { struct in6_addr none; memset(&none, 0, sizeof(none)); if (rrt->rrt_index == ifcp->ifc_index && (IN6_ARE_ADDR_EQUAL(&rrt->rrt_gw, &none) || IN6_IS_ADDR_LOOPBACK(&rrt->rrt_gw))) { trace(1, "\troute found, age it\n"); if (rrt->rrt_t == 0 || rrt->rrt_t > t_lifetime) { rrt->rrt_t = t_lifetime; rrt->rrt_info.rip6_metric = HOPCNT_INFINITY6; } updated++; } else { trace(1, "\tnon-interface route found: %s/%d on %d\n", inet6_n2p(&rrt->rrt_info.rip6_dest), rrt->rrt_info.rip6_plen, rrt->rrt_index); } } else trace(1, "\tno interface route found\n"); /* age route for p2p destination */ if (ifcp->ifc_flags & IFF_POINTOPOINT) { memset(&ni6, 0, sizeof(ni6)); ni6.rip6_dest = ifa->ifa_raddr; ni6.rip6_plen = 128; applyplen(&ni6.rip6_dest, ni6.rip6_plen); /*to be sure*/ trace(1, "\tfind p2p route %s/%d on %d\n", inet6_n2p(&ni6.rip6_dest), ni6.rip6_plen, ifcp->ifc_index); if ((rrt = rtsearch(&ni6, NULL)) != NULL) { if (rrt->rrt_index == ifcp->ifc_index && IN6_ARE_ADDR_EQUAL(&rrt->rrt_gw, &ifa->ifa_addr)) { trace(1, "\troute found, age it\n"); if (rrt->rrt_t == 0 || rrt->rrt_t > t_lifetime) { rrt->rrt_t = t_lifetime; rrt->rrt_info.rip6_metric = HOPCNT_INFINITY6; updated++; } } else { trace(1, "\tnon-p2p route found: %s/%d on %d\n", inet6_n2p(&rrt->rrt_info.rip6_dest), rrt->rrt_info.rip6_plen, rrt->rrt_index); } } else trace(1, "\tno p2p route found\n"); } return updated ? 0 : -1; } /* * Get each interface address and put those interface routes to the route * list. 
*/ int ifrt(ifcp, again) struct ifc *ifcp; int again; { struct ifac *ifa; struct riprt *rrt = NULL, *search_rrt, *prev_rrt, *loop_rrt; struct netinfo6 *np; time_t t_lifetime; int need_trigger = 0; #if 0 if (ifcp->ifc_flags & IFF_LOOPBACK) return 0; /* ignore loopback */ #endif if (ifcp->ifc_flags & IFF_POINTOPOINT) { ifrt_p2p(ifcp, again); return 0; } for (ifa = ifcp->ifc_addr; ifa; ifa = ifa->ifa_next) { if (IN6_IS_ADDR_LINKLOCAL(&ifa->ifa_addr)) { #if 0 trace(1, "route: %s on %s: " "skip linklocal interface address\n", inet6_n2p(&ifa->ifa_addr), ifcp->ifc_name); #endif continue; } if (IN6_IS_ADDR_UNSPECIFIED(&ifa->ifa_addr)) { #if 0 trace(1, "route: %s: skip unspec interface address\n", ifcp->ifc_name); #endif continue; } if (IN6_IS_ADDR_LOOPBACK(&ifa->ifa_addr)) { #if 0 trace(1, "route: %s: skip loopback address\n", ifcp->ifc_name); #endif continue; } if (ifcp->ifc_flags & IFF_UP) { if ((rrt = MALLOC(struct riprt)) == NULL) fatal("malloc: struct riprt"); memset(rrt, 0, sizeof(*rrt)); rrt->rrt_same = NULL; rrt->rrt_index = ifcp->ifc_index; rrt->rrt_t = 0; /* don't age */ rrt->rrt_info.rip6_dest = ifa->ifa_addr; rrt->rrt_info.rip6_tag = htons(routetag & 0xffff); rrt->rrt_info.rip6_metric = 1 + ifcp->ifc_metric; rrt->rrt_info.rip6_plen = ifa->ifa_plen; - if (ifa->ifa_plen == 128) - rrt->rrt_flags = RTF_HOST; - else - rrt->rrt_flags = RTF_CLONING; + rrt->rrt_flags = RTF_HOST; rrt->rrt_rflags |= RRTF_CHANGED; applyplen(&rrt->rrt_info.rip6_dest, ifa->ifa_plen); memset(&rrt->rrt_gw, 0, sizeof(struct in6_addr)); rrt->rrt_gw = ifa->ifa_addr; np = &rrt->rrt_info; search_rrt = rtsearch(np, &prev_rrt); if (search_rrt != NULL) { if (search_rrt->rrt_info.rip6_metric <= rrt->rrt_info.rip6_metric) { /* Already have better route */ if (!again) { trace(1, "route: %s/%d: " "already registered (%s)\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, ifcp->ifc_name); } goto next; } if (prev_rrt) prev_rrt->rrt_next = rrt->rrt_next; else riprt = rrt->rrt_next; delroute(&rrt->rrt_info, &rrt->rrt_gw); } /* Attach the route to the list */ trace(1, "route: %s/%d: register route (%s)\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, ifcp->ifc_name); rrt->rrt_next = riprt; riprt = rrt; addroute(rrt, &rrt->rrt_gw, ifcp); rrt = NULL; sendrequest(ifcp); ripsend(ifcp, &ifcp->ifc_ripsin, 0); need_trigger = 1; } else { for (loop_rrt = riprt; loop_rrt; loop_rrt = loop_rrt->rrt_next) { if (loop_rrt->rrt_index == ifcp->ifc_index) { t_lifetime = time(NULL) - RIP_LIFETIME; if (loop_rrt->rrt_t == 0 || loop_rrt->rrt_t > t_lifetime) { loop_rrt->rrt_t = t_lifetime; loop_rrt->rrt_info.rip6_metric = HOPCNT_INFINITY6; loop_rrt->rrt_rflags |= RRTF_CHANGED; need_trigger = 1; } } } } next: if (rrt) free(rrt); } return need_trigger; } /* * there are couple of p2p interface routing models. "behavior" lets * you pick one. it looks that gated behavior fits best with BSDs, * since BSD kernels do not look at prefix length on p2p interfaces. 
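ifrt_p2p() below implements the "gated" model named in this comment: on a point-to-point link only the two endpoint /128 routes are announced and the shared network prefix is suppressed, since BSD kernels ignore the prefix length on such interfaces. A toy illustration with documentation addresses (2001:db8::1 and 2001:db8::2 are made up for the example):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>

int
main(void)
{
    struct in6_addr local, remote;
    char p[INET6_ADDRSTRLEN];

    inet_pton(AF_INET6, "2001:db8::1", &local);
    inet_pton(AF_INET6, "2001:db8::2", &remote);
    printf("advertise %s/128\n", inet_ntop(AF_INET6, &local, p, sizeof(p)));
    printf("advertise %s/128\n", inet_ntop(AF_INET6, &remote, p, sizeof(p)));
    /* the shared 2001:db8::/64 network route is deliberately not announced */
    return 0;
}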
*/ void ifrt_p2p(ifcp, again) struct ifc *ifcp; int again; { struct ifac *ifa; struct riprt *rrt, *orrt, *prevrrt; struct netinfo6 *np; struct in6_addr addr, dest; int advert, ignore, i; #define P2PADVERT_NETWORK 1 #define P2PADVERT_ADDR 2 #define P2PADVERT_DEST 4 #define P2PADVERT_MAX 4 const enum { CISCO, GATED, ROUTE6D } behavior = GATED; const char *category = ""; const char *noadv; for (ifa = ifcp->ifc_addr; ifa; ifa = ifa->ifa_next) { addr = ifa->ifa_addr; dest = ifa->ifa_raddr; applyplen(&addr, ifa->ifa_plen); applyplen(&dest, ifa->ifa_plen); advert = ignore = 0; switch (behavior) { case CISCO: /* * honor addr/plen, just like normal shared medium * interface. this may cause trouble if you reuse * addr/plen on other interfaces. * * advertise addr/plen. */ advert |= P2PADVERT_NETWORK; break; case GATED: /* * prefixlen on p2p interface is meaningless. * advertise addr/128 and dest/128. * * do not install network route to route6d routing * table (if we do, it would prevent route installation * for other p2p interface that shares addr/plen). * * XXX what should we do if dest is ::? it will not * get announced anyways (see following filter), * but we need to think. */ advert |= P2PADVERT_ADDR; advert |= P2PADVERT_DEST; ignore |= P2PADVERT_NETWORK; break; case ROUTE6D: /* * just for testing. actually the code is redundant * given the current p2p interface address assignment * rule for kame kernel. * * intent: * A/n -> announce A/n * A B/n, A and B share prefix -> A/n (= B/n) * A B/n, do not share prefix -> A/128 and B/128 * actually, A/64 and A B/128 are the only cases * permitted by the kernel: * A/64 -> A/64 * A B/128 -> A/128 and B/128 */ if (!IN6_IS_ADDR_UNSPECIFIED(&ifa->ifa_raddr)) { if (IN6_ARE_ADDR_EQUAL(&addr, &dest)) advert |= P2PADVERT_NETWORK; else { advert |= P2PADVERT_ADDR; advert |= P2PADVERT_DEST; ignore |= P2PADVERT_NETWORK; } } else advert |= P2PADVERT_NETWORK; break; } for (i = 1; i <= P2PADVERT_MAX; i *= 2) { if ((ignore & i) != 0) continue; if ((rrt = MALLOC(struct riprt)) == NULL) { fatal("malloc: struct riprt"); /*NOTREACHED*/ } memset(rrt, 0, sizeof(*rrt)); rrt->rrt_same = NULL; rrt->rrt_index = ifcp->ifc_index; rrt->rrt_t = 0; /* don't age */ switch (i) { case P2PADVERT_NETWORK: rrt->rrt_info.rip6_dest = ifa->ifa_addr; rrt->rrt_info.rip6_plen = ifa->ifa_plen; applyplen(&rrt->rrt_info.rip6_dest, ifa->ifa_plen); category = "network"; break; case P2PADVERT_ADDR: rrt->rrt_info.rip6_dest = ifa->ifa_addr; rrt->rrt_info.rip6_plen = 128; rrt->rrt_gw = in6addr_loopback; category = "addr"; break; case P2PADVERT_DEST: rrt->rrt_info.rip6_dest = ifa->ifa_raddr; rrt->rrt_info.rip6_plen = 128; rrt->rrt_gw = ifa->ifa_addr; category = "dest"; break; } if (IN6_IS_ADDR_UNSPECIFIED(&rrt->rrt_info.rip6_dest) || IN6_IS_ADDR_LINKLOCAL(&rrt->rrt_info.rip6_dest)) { #if 0 trace(1, "route: %s: skip unspec/linklocal " "(%s on %s)\n", category, ifcp->ifc_name); #endif free(rrt); continue; } if ((advert & i) == 0) { rrt->rrt_rflags |= RRTF_NOADVERTISE; noadv = ", NO-ADV"; } else noadv = ""; rrt->rrt_info.rip6_tag = htons(routetag & 0xffff); rrt->rrt_info.rip6_metric = 1 + ifcp->ifc_metric; np = &rrt->rrt_info; orrt = rtsearch(np, &prevrrt); if (!orrt) { /* Attach the route to the list */ trace(1, "route: %s/%d: register route " "(%s on %s%s)\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, category, ifcp->ifc_name, noadv); rrt->rrt_next = riprt; riprt = rrt; } else if (rrt->rrt_index != orrt->rrt_index || rrt->rrt_info.rip6_metric != orrt->rrt_info.rip6_metric) { /* swap route */ rrt->rrt_next = 
orrt->rrt_next; if (prevrrt) prevrrt->rrt_next = rrt; else riprt = rrt; free(orrt); trace(1, "route: %s/%d: update (%s on %s%s)\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, category, ifcp->ifc_name, noadv); } else { /* Already found */ if (!again) { trace(1, "route: %s/%d: " "already registered (%s on %s%s)\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, category, ifcp->ifc_name, noadv); } free(rrt); } } } #undef P2PADVERT_NETWORK #undef P2PADVERT_ADDR #undef P2PADVERT_DEST #undef P2PADVERT_MAX } int getifmtu(ifindex) int ifindex; { int mib[6]; char *buf; size_t msize; struct if_msghdr *ifm; int mtu; mib[0] = CTL_NET; mib[1] = PF_ROUTE; mib[2] = 0; mib[3] = AF_INET6; mib[4] = NET_RT_IFLIST; mib[5] = ifindex; if (sysctl(mib, 6, NULL, &msize, NULL, 0) < 0) { fatal("sysctl estimate NET_RT_IFLIST"); /*NOTREACHED*/ } if ((buf = malloc(msize)) == NULL) { fatal("malloc"); /*NOTREACHED*/ } if (sysctl(mib, 6, buf, &msize, NULL, 0) < 0) { fatal("sysctl NET_RT_IFLIST"); /*NOTREACHED*/ } ifm = (struct if_msghdr *)buf; mtu = ifm->ifm_data.ifi_mtu; if (ifindex != ifm->ifm_index) { fatal("ifindex does not match with ifm_index"); /*NOTREACHED*/ } free(buf); return mtu; } const char * rttypes(rtm) struct rt_msghdr *rtm; { #define RTTYPE(s, f) \ do { \ if (rtm->rtm_type == (f)) \ return (s); \ } while (0) RTTYPE("ADD", RTM_ADD); RTTYPE("DELETE", RTM_DELETE); RTTYPE("CHANGE", RTM_CHANGE); RTTYPE("GET", RTM_GET); RTTYPE("LOSING", RTM_LOSING); RTTYPE("REDIRECT", RTM_REDIRECT); RTTYPE("MISS", RTM_MISS); RTTYPE("LOCK", RTM_LOCK); RTTYPE("OLDADD", RTM_OLDADD); RTTYPE("OLDDEL", RTM_OLDDEL); - RTTYPE("RESOLVE", RTM_RESOLVE); RTTYPE("NEWADDR", RTM_NEWADDR); RTTYPE("DELADDR", RTM_DELADDR); RTTYPE("IFINFO", RTM_IFINFO); #ifdef RTM_OLDADD RTTYPE("OLDADD", RTM_OLDADD); #endif #ifdef RTM_OLDDEL RTTYPE("OLDDEL", RTM_OLDDEL); #endif #ifdef RTM_OIFINFO RTTYPE("OIFINFO", RTM_OIFINFO); #endif #ifdef RTM_IFANNOUNCE RTTYPE("IFANNOUNCE", RTM_IFANNOUNCE); #endif #ifdef RTM_NEWMADDR RTTYPE("NEWMADDR", RTM_NEWMADDR); #endif #ifdef RTM_DELMADDR RTTYPE("DELMADDR", RTM_DELMADDR); #endif #undef RTTYPE return NULL; } const char * rtflags(rtm) struct rt_msghdr *rtm; { static char buf[BUFSIZ]; /* * letter conflict should be okay. painful when *BSD diverges... 
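getifmtu() above is the classic two-step sysctl: ask NET_RT_IFLIST how much space it needs, allocate, fetch, and read the MTU out of the leading if_msghdr. Reduced to a self-contained helper (error handling collapsed to returning -1; the function name is invented here):

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/sysctl.h>
#include <net/if.h>
#include <stdlib.h>

static int
iflist_mtu(int ifindex)
{
    int mib[6] = { CTL_NET, PF_ROUTE, 0, AF_INET6, NET_RT_IFLIST, ifindex };
    size_t need;
    char *buf;
    struct if_msghdr *ifm;
    int mtu;

    if (sysctl(mib, 6, NULL, &need, NULL, 0) < 0)
        return -1;
    if ((buf = malloc(need)) == NULL)
        return -1;
    if (sysctl(mib, 6, buf, &need, NULL, 0) < 0) {
        free(buf);
        return -1;
    }
    ifm = (struct if_msghdr *)buf;
    mtu = ifm->ifm_data.ifi_mtu;
    free(buf);
    return mtu;
}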
*/ strlcpy(buf, "", sizeof(buf)); #define RTFLAG(s, f) \ do { \ if (rtm->rtm_flags & (f)) \ strlcat(buf, (s), sizeof(buf)); \ } while (0) RTFLAG("U", RTF_UP); RTFLAG("G", RTF_GATEWAY); RTFLAG("H", RTF_HOST); RTFLAG("R", RTF_REJECT); RTFLAG("D", RTF_DYNAMIC); RTFLAG("M", RTF_MODIFIED); RTFLAG("d", RTF_DONE); #ifdef RTF_MASK RTFLAG("m", RTF_MASK); #endif +#ifdef RTF_CLONING RTFLAG("C", RTF_CLONING); +#endif #ifdef RTF_CLONED RTFLAG("c", RTF_CLONED); #endif #ifdef RTF_PRCLONING RTFLAG("c", RTF_PRCLONING); #endif #ifdef RTF_WASCLONED RTFLAG("W", RTF_WASCLONED); #endif RTFLAG("X", RTF_XRESOLVE); +#ifdef RTF_LLINFO RTFLAG("L", RTF_LLINFO); +#endif RTFLAG("S", RTF_STATIC); RTFLAG("B", RTF_BLACKHOLE); #ifdef RTF_PROTO3 RTFLAG("3", RTF_PROTO3); #endif RTFLAG("2", RTF_PROTO2); RTFLAG("1", RTF_PROTO1); #ifdef RTF_BROADCAST RTFLAG("b", RTF_BROADCAST); #endif #ifdef RTF_DEFAULT RTFLAG("d", RTF_DEFAULT); #endif #ifdef RTF_ISAROUTER RTFLAG("r", RTF_ISAROUTER); #endif #ifdef RTF_TUNNEL RTFLAG("T", RTF_TUNNEL); #endif #ifdef RTF_AUTH RTFLAG("A", RTF_AUTH); #endif #ifdef RTF_CRYPT RTFLAG("E", RTF_CRYPT); #endif #undef RTFLAG return buf; } const char * ifflags(flags) int flags; { static char buf[BUFSIZ]; strlcpy(buf, "", sizeof(buf)); #define IFFLAG(s, f) \ do { \ if (flags & (f)) { \ if (buf[0]) \ strlcat(buf, ",", sizeof(buf)); \ strlcat(buf, (s), sizeof(buf)); \ } \ } while (0) IFFLAG("UP", IFF_UP); IFFLAG("BROADCAST", IFF_BROADCAST); IFFLAG("DEBUG", IFF_DEBUG); IFFLAG("LOOPBACK", IFF_LOOPBACK); IFFLAG("POINTOPOINT", IFF_POINTOPOINT); #ifdef IFF_NOTRAILERS IFFLAG("NOTRAILERS", IFF_NOTRAILERS); #endif #ifdef IFF_SMART IFFLAG("SMART", IFF_SMART); #endif IFFLAG("RUNNING", IFF_RUNNING); IFFLAG("NOARP", IFF_NOARP); IFFLAG("PROMISC", IFF_PROMISC); IFFLAG("ALLMULTI", IFF_ALLMULTI); IFFLAG("OACTIVE", IFF_OACTIVE); IFFLAG("SIMPLEX", IFF_SIMPLEX); IFFLAG("LINK0", IFF_LINK0); IFFLAG("LINK1", IFF_LINK1); IFFLAG("LINK2", IFF_LINK2); IFFLAG("MULTICAST", IFF_MULTICAST); #undef IFFLAG return buf; } void krtread(again) int again; { int mib[6]; size_t msize; char *buf, *p, *lim; struct rt_msghdr *rtm; int retry; const char *errmsg; retry = 0; buf = NULL; mib[0] = CTL_NET; mib[1] = PF_ROUTE; mib[2] = 0; mib[3] = AF_INET6; /* Address family */ mib[4] = NET_RT_DUMP; /* Dump the kernel routing table */ mib[5] = 0; /* No flags */ do { retry++; errmsg = NULL; if (buf) free(buf); if (sysctl(mib, 6, NULL, &msize, NULL, 0) < 0) { errmsg = "sysctl estimate"; continue; } if ((buf = malloc(msize)) == NULL) { errmsg = "malloc"; continue; } if (sysctl(mib, 6, buf, &msize, NULL, 0) < 0) { errmsg = "sysctl NET_RT_DUMP"; continue; } } while (retry < 5 && errmsg != NULL); if (errmsg) { fatal("%s (with %d retries, msize=%lu)", errmsg, retry, (u_long)msize); /*NOTREACHED*/ } else if (1 < retry) syslog(LOG_INFO, "NET_RT_DUMP %d retires", retry); lim = buf + msize; for (p = buf; p < lim; p += rtm->rtm_msglen) { rtm = (struct rt_msghdr *)p; rt_entry(rtm, again); } free(buf); } void rt_entry(rtm, again) struct rt_msghdr *rtm; int again; { struct sockaddr_in6 *sin6_dst, *sin6_gw, *sin6_mask; struct sockaddr_in6 *sin6_genmask, *sin6_ifp; char *rtmp, *ifname = NULL; struct riprt *rrt, *orrt; struct netinfo6 *np; int s; sin6_dst = sin6_gw = sin6_mask = sin6_genmask = sin6_ifp = 0; if ((rtm->rtm_flags & RTF_UP) == 0 || rtm->rtm_flags & - (RTF_CLONING|RTF_XRESOLVE|RTF_LLINFO|RTF_BLACKHOLE)) { + (RTF_XRESOLVE|RTF_BLACKHOLE)) { return; /* not interested in the link route */ } /* do not look at cloned routes */ #ifdef RTF_WASCLONED if (rtm->rtm_flags & 
RTF_WASCLONED) return; #endif #ifdef RTF_CLONED if (rtm->rtm_flags & RTF_CLONED) return; #endif /* * do not look at dynamic routes. * netbsd/openbsd cloned routes have UGHD. */ if (rtm->rtm_flags & RTF_DYNAMIC) return; rtmp = (char *)(rtm + 1); /* Destination */ if ((rtm->rtm_addrs & RTA_DST) == 0) return; /* ignore routes without destination address */ sin6_dst = (struct sockaddr_in6 *)rtmp; rtmp += ROUNDUP(sin6_dst->sin6_len); if (rtm->rtm_addrs & RTA_GATEWAY) { sin6_gw = (struct sockaddr_in6 *)rtmp; rtmp += ROUNDUP(sin6_gw->sin6_len); } if (rtm->rtm_addrs & RTA_NETMASK) { sin6_mask = (struct sockaddr_in6 *)rtmp; rtmp += ROUNDUP(sin6_mask->sin6_len); } if (rtm->rtm_addrs & RTA_GENMASK) { sin6_genmask = (struct sockaddr_in6 *)rtmp; rtmp += ROUNDUP(sin6_genmask->sin6_len); } if (rtm->rtm_addrs & RTA_IFP) { sin6_ifp = (struct sockaddr_in6 *)rtmp; rtmp += ROUNDUP(sin6_ifp->sin6_len); } /* Destination */ if (sin6_dst->sin6_family != AF_INET6) return; if (IN6_IS_ADDR_LINKLOCAL(&sin6_dst->sin6_addr)) return; /* Link-local */ if (IN6_ARE_ADDR_EQUAL(&sin6_dst->sin6_addr, &in6addr_loopback)) return; /* Loopback */ if (IN6_IS_ADDR_MULTICAST(&sin6_dst->sin6_addr)) return; if ((rrt = MALLOC(struct riprt)) == NULL) { fatal("malloc: struct riprt"); /*NOTREACHED*/ } memset(rrt, 0, sizeof(*rrt)); np = &rrt->rrt_info; rrt->rrt_same = NULL; rrt->rrt_t = time(NULL); if (aflag == 0 && (rtm->rtm_flags & RTF_STATIC)) rrt->rrt_t = 0; /* Don't age static routes */ if ((rtm->rtm_flags & (RTF_HOST|RTF_GATEWAY)) == RTF_HOST) rrt->rrt_t = 0; /* Don't age non-gateway host routes */ np->rip6_tag = 0; np->rip6_metric = rtm->rtm_rmx.rmx_hopcount; if (np->rip6_metric < 1) np->rip6_metric = 1; rrt->rrt_flags = rtm->rtm_flags; np->rip6_dest = sin6_dst->sin6_addr; /* Mask or plen */ if (rtm->rtm_flags & RTF_HOST) np->rip6_plen = 128; /* Host route */ else if (sin6_mask) np->rip6_plen = sin6mask2len(sin6_mask); else np->rip6_plen = 0; orrt = rtsearch(np, NULL); if (orrt && orrt->rrt_info.rip6_metric != HOPCNT_INFINITY6) { /* Already found */ if (!again) { trace(1, "route: %s/%d flags %s: already registered\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, rtflags(rtm)); } free(rrt); return; } /* Gateway */ if (!sin6_gw) memset(&rrt->rrt_gw, 0, sizeof(struct in6_addr)); else { if (sin6_gw->sin6_family == AF_INET6) rrt->rrt_gw = sin6_gw->sin6_addr; else if (sin6_gw->sin6_family == AF_LINK) { /* XXX in case ppp link? 
*/ rrt->rrt_gw = in6addr_loopback; } else memset(&rrt->rrt_gw, 0, sizeof(struct in6_addr)); } trace(1, "route: %s/%d flags %s", inet6_n2p(&np->rip6_dest), np->rip6_plen, rtflags(rtm)); trace(1, " gw %s", inet6_n2p(&rrt->rrt_gw)); /* Interface */ s = rtm->rtm_index; if (s < nindex2ifc && index2ifc[s]) ifname = index2ifc[s]->ifc_name; else { trace(1, " not configured\n"); free(rrt); return; } trace(1, " if %s sock %d", ifname, s); rrt->rrt_index = s; trace(1, "\n"); /* Check gateway */ if (!IN6_IS_ADDR_LINKLOCAL(&rrt->rrt_gw) && !IN6_IS_ADDR_LOOPBACK(&rrt->rrt_gw) && (rrt->rrt_flags & RTF_LOCAL) == 0) { trace(0, "***** Gateway %s is not a link-local address.\n", inet6_n2p(&rrt->rrt_gw)); trace(0, "***** dest(%s) if(%s) -- Not optimized.\n", inet6_n2p(&rrt->rrt_info.rip6_dest), ifname); rrt->rrt_rflags |= RRTF_NH_NOT_LLADDR; } /* Put it to the route list */ if (orrt && orrt->rrt_info.rip6_metric == HOPCNT_INFINITY6) { /* replace route list */ rrt->rrt_next = orrt->rrt_next; *orrt = *rrt; trace(1, "route: %s/%d flags %s: replace new route\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, rtflags(rtm)); free(rrt); } else { rrt->rrt_next = riprt; riprt = rrt; } } int addroute(rrt, gw, ifcp) struct riprt *rrt; const struct in6_addr *gw; struct ifc *ifcp; { struct netinfo6 *np; u_char buf[BUFSIZ], buf1[BUFSIZ], buf2[BUFSIZ]; struct rt_msghdr *rtm; struct sockaddr_in6 *sin6; int len; np = &rrt->rrt_info; inet_ntop(AF_INET6, (const void *)gw, (char *)buf1, sizeof(buf1)); inet_ntop(AF_INET6, (void *)&ifcp->ifc_mylladdr, (char *)buf2, sizeof(buf2)); tracet(1, "ADD: %s/%d gw %s [%d] ifa %s\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, buf1, np->rip6_metric - 1, buf2); if (rtlog) fprintf(rtlog, "%s: ADD: %s/%d gw %s [%d] ifa %s\n", hms(), inet6_n2p(&np->rip6_dest), np->rip6_plen, buf1, np->rip6_metric - 1, buf2); if (nflag) return 0; memset(buf, 0, sizeof(buf)); rtm = (struct rt_msghdr *)buf; rtm->rtm_type = RTM_ADD; rtm->rtm_version = RTM_VERSION; rtm->rtm_seq = ++seq; rtm->rtm_pid = pid; rtm->rtm_flags = rrt->rrt_flags; rtm->rtm_addrs = RTA_DST | RTA_GATEWAY | RTA_NETMASK; rtm->rtm_rmx.rmx_hopcount = np->rip6_metric - 1; rtm->rtm_inits = RTV_HOPCOUNT; sin6 = (struct sockaddr_in6 *)&buf[sizeof(struct rt_msghdr)]; /* Destination */ sin6->sin6_len = sizeof(struct sockaddr_in6); sin6->sin6_family = AF_INET6; sin6->sin6_addr = np->rip6_dest; sin6 = (struct sockaddr_in6 *)((char *)sin6 + ROUNDUP(sin6->sin6_len)); /* Gateway */ sin6->sin6_len = sizeof(struct sockaddr_in6); sin6->sin6_family = AF_INET6; sin6->sin6_addr = *gw; sin6 = (struct sockaddr_in6 *)((char *)sin6 + ROUNDUP(sin6->sin6_len)); /* Netmask */ sin6->sin6_len = sizeof(struct sockaddr_in6); sin6->sin6_family = AF_INET6; sin6->sin6_addr = *(plen2mask(np->rip6_plen)); sin6 = (struct sockaddr_in6 *)((char *)sin6 + ROUNDUP(sin6->sin6_len)); len = (char *)sin6 - (char *)buf; rtm->rtm_msglen = len; if (write(rtsock, buf, len) > 0) return 0; if (errno == EEXIST) { trace(0, "ADD: Route already exists %s/%d gw %s\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, buf1); if (rtlog) fprintf(rtlog, "ADD: Route already exists %s/%d gw %s\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, buf1); } else { trace(0, "Can not write to rtsock (addroute): %s\n", strerror(errno)); if (rtlog) fprintf(rtlog, "\tCan not write to rtsock: %s\n", strerror(errno)); } return -1; } int delroute(np, gw) struct netinfo6 *np; struct in6_addr *gw; { u_char buf[BUFSIZ], buf2[BUFSIZ]; struct rt_msghdr *rtm; struct sockaddr_in6 *sin6; int len; inet_ntop(AF_INET6, (void *)gw, (char *)buf2, 
sizeof(buf2)); tracet(1, "DEL: %s/%d gw %s\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, buf2); if (rtlog) fprintf(rtlog, "%s: DEL: %s/%d gw %s\n", hms(), inet6_n2p(&np->rip6_dest), np->rip6_plen, buf2); if (nflag) return 0; memset(buf, 0, sizeof(buf)); rtm = (struct rt_msghdr *)buf; rtm->rtm_type = RTM_DELETE; rtm->rtm_version = RTM_VERSION; rtm->rtm_seq = ++seq; rtm->rtm_pid = pid; rtm->rtm_flags = RTF_UP | RTF_GATEWAY; if (np->rip6_plen == sizeof(struct in6_addr) * 8) rtm->rtm_flags |= RTF_HOST; rtm->rtm_addrs = RTA_DST | RTA_GATEWAY | RTA_NETMASK; sin6 = (struct sockaddr_in6 *)&buf[sizeof(struct rt_msghdr)]; /* Destination */ sin6->sin6_len = sizeof(struct sockaddr_in6); sin6->sin6_family = AF_INET6; sin6->sin6_addr = np->rip6_dest; sin6 = (struct sockaddr_in6 *)((char *)sin6 + ROUNDUP(sin6->sin6_len)); /* Gateway */ sin6->sin6_len = sizeof(struct sockaddr_in6); sin6->sin6_family = AF_INET6; sin6->sin6_addr = *gw; sin6 = (struct sockaddr_in6 *)((char *)sin6 + ROUNDUP(sin6->sin6_len)); /* Netmask */ sin6->sin6_len = sizeof(struct sockaddr_in6); sin6->sin6_family = AF_INET6; sin6->sin6_addr = *(plen2mask(np->rip6_plen)); sin6 = (struct sockaddr_in6 *)((char *)sin6 + ROUNDUP(sin6->sin6_len)); len = (char *)sin6 - (char *)buf; rtm->rtm_msglen = len; if (write(rtsock, buf, len) >= 0) return 0; if (errno == ESRCH) { trace(0, "RTDEL: Route does not exist: %s/%d gw %s\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, buf2); if (rtlog) fprintf(rtlog, "RTDEL: Route does not exist: %s/%d gw %s\n", inet6_n2p(&np->rip6_dest), np->rip6_plen, buf2); } else { trace(0, "Can not write to rtsock (delroute): %s\n", strerror(errno)); if (rtlog) fprintf(rtlog, "\tCan not write to rtsock: %s\n", strerror(errno)); } return -1; } struct in6_addr * getroute(np, gw) struct netinfo6 *np; struct in6_addr *gw; { u_char buf[BUFSIZ]; int myseq; int len; struct rt_msghdr *rtm; struct sockaddr_in6 *sin6; rtm = (struct rt_msghdr *)buf; len = sizeof(struct rt_msghdr) + sizeof(struct sockaddr_in6); memset(rtm, 0, len); rtm->rtm_type = RTM_GET; rtm->rtm_version = RTM_VERSION; myseq = ++seq; rtm->rtm_seq = myseq; rtm->rtm_addrs = RTA_DST; rtm->rtm_msglen = len; sin6 = (struct sockaddr_in6 *)&buf[sizeof(struct rt_msghdr)]; sin6->sin6_len = sizeof(struct sockaddr_in6); sin6->sin6_family = AF_INET6; sin6->sin6_addr = np->rip6_dest; if (write(rtsock, buf, len) < 0) { if (errno == ESRCH) /* No such route found */ return NULL; perror("write to rtsock"); exit(1); } do { if ((len = read(rtsock, buf, sizeof(buf))) < 0) { perror("read from rtsock"); exit(1); } rtm = (struct rt_msghdr *)buf; } while (rtm->rtm_seq != myseq || rtm->rtm_pid != pid); sin6 = (struct sockaddr_in6 *)&buf[sizeof(struct rt_msghdr)]; if (rtm->rtm_addrs & RTA_DST) { sin6 = (struct sockaddr_in6 *) ((char *)sin6 + ROUNDUP(sin6->sin6_len)); } if (rtm->rtm_addrs & RTA_GATEWAY) { *gw = sin6->sin6_addr; return gw; } return NULL; } const char * inet6_n2p(p) const struct in6_addr *p; { static char buf[BUFSIZ]; return inet_ntop(AF_INET6, (const void *)p, buf, sizeof(buf)); } void ifrtdump(sig) int sig; { ifdump(sig); rtdump(sig); } void ifdump(sig) int sig; { struct ifc *ifcp; FILE *dump; int i; if (sig == 0) dump = stderr; else if ((dump = fopen(ROUTE6D_DUMP, "a")) == NULL) dump = stderr; fprintf(dump, "%s: Interface Table Dump\n", hms()); fprintf(dump, " Number of interfaces: %d\n", nifc); for (i = 0; i < 2; i++) { fprintf(dump, " %sadvertising interfaces:\n", i ? 
"non-" : ""); for (ifcp = ifc; ifcp; ifcp = ifcp->ifc_next) { if (i == 0) { if ((ifcp->ifc_flags & IFF_UP) == 0) continue; if (iff_find(ifcp, 'N') != NULL) continue; } else { if (ifcp->ifc_flags & IFF_UP) continue; } ifdump0(dump, ifcp); } } fprintf(dump, "\n"); if (dump != stderr) fclose(dump); } void ifdump0(dump, ifcp) FILE *dump; const struct ifc *ifcp; { struct ifac *ifa; struct iff *iffp; char buf[BUFSIZ]; const char *ft; int addr; fprintf(dump, " %s: index(%d) flags(%s) addr(%s) mtu(%d) metric(%d)\n", ifcp->ifc_name, ifcp->ifc_index, ifflags(ifcp->ifc_flags), inet6_n2p(&ifcp->ifc_mylladdr), ifcp->ifc_mtu, ifcp->ifc_metric); for (ifa = ifcp->ifc_addr; ifa; ifa = ifa->ifa_next) { if (ifcp->ifc_flags & IFF_POINTOPOINT) { inet_ntop(AF_INET6, (void *)&ifa->ifa_raddr, buf, sizeof(buf)); fprintf(dump, "\t%s/%d -- %s\n", inet6_n2p(&ifa->ifa_addr), ifa->ifa_plen, buf); } else { fprintf(dump, "\t%s/%d\n", inet6_n2p(&ifa->ifa_addr), ifa->ifa_plen); } } if (ifcp->ifc_filter) { fprintf(dump, "\tFilter:"); for (iffp = ifcp->ifc_filter; iffp; iffp = iffp->iff_next) { addr = 0; switch (iffp->iff_type) { case 'A': ft = "Aggregate"; addr++; break; case 'N': ft = "No-use"; break; case 'O': ft = "Advertise-only"; addr++; break; case 'T': ft = "Default-only"; break; case 'L': ft = "Listen-only"; addr++; break; default: snprintf(buf, sizeof(buf), "Unknown-%c", iffp->iff_type); ft = buf; addr++; break; } fprintf(dump, " %s", ft); if (addr) { fprintf(dump, "(%s/%d)", inet6_n2p(&iffp->iff_addr), iffp->iff_plen); } } fprintf(dump, "\n"); } } void rtdump(sig) int sig; { struct riprt *rrt; char buf[BUFSIZ]; FILE *dump; time_t t, age; if (sig == 0) dump = stderr; else if ((dump = fopen(ROUTE6D_DUMP, "a")) == NULL) dump = stderr; t = time(NULL); fprintf(dump, "\n%s: Routing Table Dump\n", hms()); for (rrt = riprt; rrt; rrt = rrt->rrt_next) { if (rrt->rrt_t == 0) age = 0; else age = t - rrt->rrt_t; inet_ntop(AF_INET6, (void *)&rrt->rrt_info.rip6_dest, buf, sizeof(buf)); fprintf(dump, " %s/%d if(%d:%s) gw(%s) [%d] age(%ld)", buf, rrt->rrt_info.rip6_plen, rrt->rrt_index, index2ifc[rrt->rrt_index]->ifc_name, inet6_n2p(&rrt->rrt_gw), rrt->rrt_info.rip6_metric, (long)age); if (rrt->rrt_info.rip6_tag) { fprintf(dump, " tag(0x%04x)", ntohs(rrt->rrt_info.rip6_tag) & 0xffff); } if (rrt->rrt_rflags & RRTF_NH_NOT_LLADDR) fprintf(dump, " NOT-LL"); if (rrt->rrt_rflags & RRTF_NOADVERTISE) fprintf(dump, " NO-ADV"); fprintf(dump, "\n"); } fprintf(dump, "\n"); if (dump != stderr) fclose(dump); } /* * Parse the -A (and -O) options and put corresponding filter object to the * specified interface structures. 
Each of the -A/O option has the following * syntax: -A 5f09:c400::/32,ef0,ef1 (aggregate) * -O 5f09:c400::/32,ef0,ef1 (only when match) */ void filterconfig() { int i; char *p, *ap, *iflp, *ifname, *ep; struct iff ftmp, *iff_obj; struct ifc *ifcp; struct riprt *rrt; #if 0 struct in6_addr gw; #endif u_long plen; for (i = 0; i < nfilter; i++) { ap = filter[i]; iflp = NULL; ifcp = NULL; if (filtertype[i] == 'N' || filtertype[i] == 'T') { iflp = ap; goto ifonly; } if ((p = strchr(ap, ',')) != NULL) { *p++ = '\0'; iflp = p; } if ((p = strchr(ap, '/')) == NULL) { fatal("no prefixlen specified for '%s'", ap); /*NOTREACHED*/ } *p++ = '\0'; if (inet_pton(AF_INET6, ap, &ftmp.iff_addr) != 1) { fatal("invalid prefix specified for '%s'", ap); /*NOTREACHED*/ } errno = 0; ep = NULL; plen = strtoul(p, &ep, 10); if (errno || !*p || *ep || plen > sizeof(ftmp.iff_addr) * 8) { fatal("invalid prefix length specified for '%s'", ap); /*NOTREACHED*/ } ftmp.iff_plen = plen; ftmp.iff_next = NULL; applyplen(&ftmp.iff_addr, ftmp.iff_plen); ifonly: ftmp.iff_type = filtertype[i]; if (iflp == NULL || *iflp == '\0') { fatal("no interface specified for '%s'", ap); /*NOTREACHED*/ } /* parse the interface listing portion */ while (iflp) { ifname = iflp; if ((iflp = strchr(iflp, ',')) != NULL) *iflp++ = '\0'; ifcp = ifc_find(ifname); if (ifcp == NULL) { fatal("no interface %s exists", ifname); /*NOTREACHED*/ } iff_obj = (struct iff *)malloc(sizeof(struct iff)); if (iff_obj == NULL) { fatal("malloc of iff_obj"); /*NOTREACHED*/ } memcpy((void *)iff_obj, (void *)&ftmp, sizeof(struct iff)); /* link it to the interface filter */ iff_obj->iff_next = ifcp->ifc_filter; ifcp->ifc_filter = iff_obj; } /* * -A: aggregate configuration. */ if (filtertype[i] != 'A') continue; /* put the aggregate to the kernel routing table */ rrt = (struct riprt *)malloc(sizeof(struct riprt)); if (rrt == NULL) { fatal("malloc: rrt"); /*NOTREACHED*/ } memset(rrt, 0, sizeof(struct riprt)); rrt->rrt_info.rip6_dest = ftmp.iff_addr; rrt->rrt_info.rip6_plen = ftmp.iff_plen; rrt->rrt_info.rip6_metric = 1; rrt->rrt_info.rip6_tag = htons(routetag & 0xffff); rrt->rrt_gw = in6addr_loopback; rrt->rrt_flags = RTF_UP | RTF_REJECT; rrt->rrt_rflags = RRTF_AGGREGATE; rrt->rrt_t = 0; rrt->rrt_index = loopifcp->ifc_index; #if 0 if (getroute(&rrt->rrt_info, &gw)) { #if 0 /* * When the address has already been registered in the * kernel routing table, it should be removed */ delroute(&rrt->rrt_info, &gw); #else /* it is safer behavior */ errno = EINVAL; fatal("%s/%u already in routing table, " "cannot aggregate", inet6_n2p(&rrt->rrt_info.rip6_dest), rrt->rrt_info.rip6_plen); /*NOTREACHED*/ #endif } #endif /* Put the route to the list */ rrt->rrt_next = riprt; riprt = rrt; trace(1, "Aggregate: %s/%d for %s\n", inet6_n2p(&ftmp.iff_addr), ftmp.iff_plen, ifcp->ifc_name); /* Add this route to the kernel */ if (nflag) /* do not modify kernel routing table */ continue; addroute(rrt, &in6addr_loopback, loopifcp); } } /***************** utility functions *****************/ /* * Returns a pointer to ifac whose address and prefix length matches * with the address and prefix length specified in the arguments. 
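filterconfig() above splits each -A/-O argument at the first ',' into the prefix part and the interface list, then at '/' into address and prefix length. A toy version of just that parsing, fed the example string from the comment (the real code calls fatal() where this sketch returns -1):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int
parse_filter(char *arg, struct in6_addr *addr, int *plen, char **iflist)
{
    char *slash, *comma, *ep;
    unsigned long l;

    if ((comma = strchr(arg, ',')) != NULL)
        *comma++ = '\0';
    *iflist = comma;                    /* "ef0,ef1" or NULL */
    if ((slash = strchr(arg, '/')) == NULL)
        return -1;
    *slash++ = '\0';
    if (inet_pton(AF_INET6, arg, addr) != 1)
        return -1;
    l = strtoul(slash, &ep, 10);
    if (*slash == '\0' || *ep != '\0' || l > 128)
        return -1;
    *plen = (int)l;
    return 0;
}

int
main(void)
{
    char arg[] = "5f09:c400::/32,ef0,ef1";   /* example from the comment above */
    struct in6_addr a;
    int plen;
    char *ifs;

    if (parse_filter(arg, &a, &plen, &ifs) == 0)
        printf("plen %d, interfaces %s\n", plen, ifs ? ifs : "(none)");
    return 0;
}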
*/ struct ifac * ifa_match(ifcp, ia, plen) const struct ifc *ifcp; const struct in6_addr *ia; int plen; { struct ifac *ifa; for (ifa = ifcp->ifc_addr; ifa; ifa = ifa->ifa_next) { if (IN6_ARE_ADDR_EQUAL(&ifa->ifa_addr, ia) && ifa->ifa_plen == plen) break; } return ifa; } /* * Return a pointer to riprt structure whose address and prefix length * matches with the address and prefix length found in the argument. * Note: This is not a rtalloc(). Therefore exact match is necessary. */ struct riprt * rtsearch(np, prev_rrt) struct netinfo6 *np; struct riprt **prev_rrt; { struct riprt *rrt; if (prev_rrt) *prev_rrt = NULL; for (rrt = riprt; rrt; rrt = rrt->rrt_next) { if (rrt->rrt_info.rip6_plen == np->rip6_plen && IN6_ARE_ADDR_EQUAL(&rrt->rrt_info.rip6_dest, &np->rip6_dest)) return rrt; if (prev_rrt) *prev_rrt = rrt; } if (prev_rrt) *prev_rrt = NULL; return 0; } int sin6mask2len(sin6) const struct sockaddr_in6 *sin6; { return mask2len(&sin6->sin6_addr, sin6->sin6_len - offsetof(struct sockaddr_in6, sin6_addr)); } int mask2len(addr, lenlim) const struct in6_addr *addr; int lenlim; { int i = 0, j; const u_char *p = (const u_char *)addr; for (j = 0; j < lenlim; j++, p++) { if (*p != 0xff) break; i += 8; } if (j < lenlim) { switch (*p) { #define MASKLEN(m, l) case m: do { i += l; break; } while (0) MASKLEN(0xfe, 7); break; MASKLEN(0xfc, 6); break; MASKLEN(0xf8, 5); break; MASKLEN(0xf0, 4); break; MASKLEN(0xe0, 3); break; MASKLEN(0xc0, 2); break; MASKLEN(0x80, 1); break; #undef MASKLEN } } return i; } void applymask(addr, mask) struct in6_addr *addr, *mask; { int i; u_long *p, *q; p = (u_long *)addr; q = (u_long *)mask; for (i = 0; i < 4; i++) *p++ &= *q++; } static const u_char plent[8] = { 0x00, 0x80, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc, 0xfe }; void applyplen(ia, plen) struct in6_addr *ia; int plen; { u_char *p; int i; p = ia->s6_addr; for (i = 0; i < 16; i++) { if (plen <= 0) *p = 0; else if (plen < 8) *p &= plent[plen]; p++, plen -= 8; } } static const int pl2m[9] = { 0x00, 0x80, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc, 0xfe, 0xff }; struct in6_addr * plen2mask(n) int n; { static struct in6_addr ia; u_char *p; int i; memset(&ia, 0, sizeof(struct in6_addr)); p = (u_char *)&ia; for (i = 0; i < 16; i++, p++, n -= 8) { if (n >= 8) { *p = 0xff; continue; } *p = pl2m[n]; break; } return &ia; } char * allocopy(p) char *p; { int len = strlen(p) + 1; char *q = (char *)malloc(len); if (!q) { fatal("malloc"); /*NOTREACHED*/ } strlcpy(q, p, len); return q; } char * hms() { static char buf[BUFSIZ]; time_t t; struct tm *tm; t = time(NULL); if ((tm = localtime(&t)) == 0) { fatal("localtime"); /*NOTREACHED*/ } snprintf(buf, sizeof(buf), "%02d:%02d:%02d", tm->tm_hour, tm->tm_min, tm->tm_sec); return buf; } #define RIPRANDDEV 1.0 /* 30 +- 15, max - min = 30 */ int ripinterval(timer) int timer; { double r = rand(); interval = (int)(timer + timer * RIPRANDDEV * (r / RAND_MAX - 0.5)); nextalarm = time(NULL) + interval; return interval; } time_t ripsuptrig() { time_t t; double r = rand(); t = (int)(RIP_TRIG_INT6_MIN + (RIP_TRIG_INT6_MAX - RIP_TRIG_INT6_MIN) * (r / RAND_MAX)); sup_trig_update = time(NULL) + t; return t; } void #ifdef __STDC__ fatal(const char *fmt, ...) #else fatal(fmt, va_alist) char *fmt; va_dcl #endif { va_list ap; char buf[1024]; #ifdef __STDC__ va_start(ap, fmt); #else va_start(ap); #endif vsnprintf(buf, sizeof(buf), fmt, ap); va_end(ap); perror(buf); if (errno) syslog(LOG_ERR, "%s: %s", buf, strerror(errno)); else syslog(LOG_ERR, "%s", buf); rtdexit(); } void #ifdef __STDC__ tracet(int level, const char *fmt, ...) 
#else tracet(level, fmt, va_alist) int level; char *fmt; va_dcl #endif { va_list ap; if (level <= dflag) { #ifdef __STDC__ va_start(ap, fmt); #else va_start(ap); #endif fprintf(stderr, "%s: ", hms()); vfprintf(stderr, fmt, ap); va_end(ap); } if (dflag) { #ifdef __STDC__ va_start(ap, fmt); #else va_start(ap); #endif if (level > 0) vsyslog(LOG_DEBUG, fmt, ap); else vsyslog(LOG_WARNING, fmt, ap); va_end(ap); } } void #ifdef __STDC__ trace(int level, const char *fmt, ...) #else trace(level, fmt, va_alist) int level; char *fmt; va_dcl #endif { va_list ap; if (level <= dflag) { #ifdef __STDC__ va_start(ap, fmt); #else va_start(ap); #endif vfprintf(stderr, fmt, ap); va_end(ap); } if (dflag) { #ifdef __STDC__ va_start(ap, fmt); #else va_start(ap); #endif if (level > 0) vsyslog(LOG_DEBUG, fmt, ap); else vsyslog(LOG_WARNING, fmt, ap); va_end(ap); } } unsigned int if_maxindex() { struct if_nameindex *p, *p0; unsigned int max = 0; p0 = if_nameindex(); for (p = p0; p && p->if_index && p->if_name; p++) { if (max < p->if_index) max = p->if_index; } if_freenameindex(p0); return max; } struct ifc * ifc_find(name) char *name; { struct ifc *ifcp; for (ifcp = ifc; ifcp; ifcp = ifcp->ifc_next) { if (strcmp(name, ifcp->ifc_name) == 0) return ifcp; } return (struct ifc *)NULL; } struct iff * iff_find(ifcp, type) struct ifc *ifcp; int type; { struct iff *iffp; for (iffp = ifcp->ifc_filter; iffp; iffp = iffp->iff_next) { if (iffp->iff_type == type) return iffp; } return NULL; } void setindex2ifc(idx, ifcp) int idx; struct ifc *ifcp; { int n, nsize; struct ifc **p; if (!index2ifc) { nindex2ifc = 5; /*initial guess*/ index2ifc = (struct ifc **) malloc(sizeof(*index2ifc) * nindex2ifc); if (index2ifc == NULL) { fatal("malloc"); /*NOTREACHED*/ } memset(index2ifc, 0, sizeof(*index2ifc) * nindex2ifc); } n = nindex2ifc; for (nsize = nindex2ifc; nsize <= idx; nsize *= 2) ; if (n != nsize) { p = (struct ifc **)realloc(index2ifc, sizeof(*index2ifc) * nsize); if (p == NULL) { fatal("realloc"); /*NOTREACHED*/ } memset(p + n, 0, sizeof(*index2ifc) * (nindex2ifc - n)); index2ifc = p; nindex2ifc = nsize; } index2ifc[idx] = ifcp; }
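A closing note on the utility helpers: rtsearch() compares entries with an exact match on destination and prefix length, which is why applyplen() must clear the host bits of every prefix before it is stored or looked up. The standalone sketch below mirrors the effect of the table-driven applyplen() with a shift instead of the plent[] table; the address is a made-up documentation prefix.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>

static void
clear_host_bits(struct in6_addr *ia, int plen)
{
    int i;

    for (i = 0; i < 16; i++) {
        if (plen >= 8) {
            plen -= 8;                  /* byte lies entirely inside the prefix */
            continue;
        }
        if (plen > 0) {
            ia->s6_addr[i] &= (unsigned char)(0xff << (8 - plen));
            plen = 0;
        } else
            ia->s6_addr[i] = 0;         /* byte lies entirely outside the prefix */
    }
}

int
main(void)
{
    struct in6_addr a;
    char buf[INET6_ADDRSTRLEN];

    inet_pton(AF_INET6, "2001:db8::dead:beef", &a);
    clear_host_bits(&a, 32);
    printf("%s\n", inet_ntop(AF_INET6, &a, buf, sizeof(buf)));   /* 2001:db8:: */
    return 0;
}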