Index: user/alc/PQ_LAUNDRY/MAINTAINERS =================================================================== --- user/alc/PQ_LAUNDRY/MAINTAINERS (revision 303205) +++ user/alc/PQ_LAUNDRY/MAINTAINERS (revision 303206) @@ -1,107 +1,108 @@ $FreeBSD$ Please note that the content of this file is strictly advisory. No locks listed here are valid. The only strict review requirements are granted by core. These are documented in head/LOCKS and enforced by svnadmin/conf/approvers. The source tree is a community effort. However, some folks go to the trouble of looking after particular areas of the tree. In return for their active caretaking of the code it is polite to coordinate changes with them. This is a list of people who have expressed an interest in part of the code or listed their active caretaking role so that other committers can easily find somebody who is familiar with it. The notes should specify if there is a 3rd party source tree involved or other things that should be kept in mind. However, this is not a 'big stick', it is an offer to help and a source of guidance. It does not override the communal nature of the tree. It is not a registry of 'turf' or private property. *** This list is prone to becoming stale quickly. The best way to find the recent maintainer of a sub-system is to check recent logs for that directory or sub-system. *** *** Maintainers are encouraged to visit: https://reviews.freebsd.org/herald and configure notifications for parts of the tree which they maintain. Notifications can automatically be sent when someone proposes a revision or makes a commit to the specified subtree. *** subsystem login notes ----------------------------- atf freebsd-testing,jmmv,ngie Pre-commit review requested. ath(4) adrian Pre-commit review requested, send to freebsd-wireless@freebsd.org callout_*(9) rrs Pre-commit review requested -- becareful its tricksy code :o. contrib/compiler-rt dim Pre-commit review preferred. contrib/libc++ dim Pre-commit review preferred. contrib/libcxxrt dim Pre-commit review preferred. contrib/llvm dim Pre-commit review preferred. contrib/llvm/tools/lldb emaste Pre-commit review preferred. contrib/netbsd-tests freebsd-testing,ngie Pre-commit review requested. contrib/pjdfstest freebsd-testing,ngie,pjd Pre-commit review requested. dev/usb/wlan adrian Pre-commit review requested, send to freebsd-wireless@freebsd.org *env(3) secteam Due to the problematic security history of this code, please have patches reviewed by secteam. etc/mail gshapiro Pre-commit review requested. Keep in sync with -STABLE. etc/sendmail gshapiro Pre-commit review requested. Keep in sync with -STABLE. fetch des Pre-commit review requested. geli pjd Pre-commit review requested (both sys/geom/eli/ and sbin/geom/class/eli/). isci(4) jimharris Pre-commit review requested. iwm(4) adrian Pre-commit review requested, send to freebsd-wireless@freebsd.org iwn(4) adrian Pre-commit review requested, send to freebsd-wireless@freebsd.org kqueue jmg Pre-commit review requested. Documentation Required. libdpv dteske Pre-commit review requested. Keep in sync with dpv(1). libfetch des Pre-commit review requested. libfigpar dteske Pre-commit review requested. libpam des Pre-commit review requested. linprocfs des Pre-commit review requested. lpr gad Pre-commit review requested, particularly for lpd/recvjob.c and lpd/printjob.c. nanobsd imp Pre-commit phabricator review requested. net80211 adrian Pre-commit review requested, send to freebsd-wireless@freebsd.org nfs freebsd-fs@FreeBSD.org, rmacklem is best for reviews. nis(8), yp(8) araujo Pre-commit review requested. nvd(4) jimharris Pre-commit review requested. nvme(4) jimharris Pre-commit review requested. nvmecontrol(8) jimharris Pre-commit review requested. opencrypto jmg Pre-commit review requested. Documentation Required. openssh des Pre-commit review requested. openssl benl,jkim Pre-commit review requested. otus(4) adrian Pre-commit review requested, send to freebsd-wireless@freebsd.org pci bus imp,jhb Pre-commit review requested. pmcstudy(8) rrs Pre-commit review requested. procfs des Pre-commit review requested. pseudofs des Pre-commit review requested. release/release.sh gjb,re Pre-commit review and regression tests requested. sctp rrs,tuexen Pre-commit review requested (changes need to be backported to github). sendmail gshapiro Pre-commit review requested. sh(1) jilles Pre-commit review requested. This also applies to kill(1), printf(1) and test(1) which are compiled in as builtins. share/mk imp, bapt, bdrewery, emaste, sjg Make is hard. share/mk/*.test.mk freebsd-testing,ngie (same list as share/mk too) Pre-commit review requested. sys/boot/forth dteske Pre-commit review requested. sys/compat/linuxkpi hselasky If in doubt, ask. sys/dev/e1000 erj Pre-commit phabricator review requested. sys/dev/ixgbe erj Pre-commit phabricator review requested. sys/dev/ixl erj Pre-commit phabricator review requested. sys/dev/sound/usb hselasky If in doubt, ask. sys/dev/usb hselasky If in doubt, ask. sys/netinet/ip_carp.c glebius Pre-commit review recommended. sys/netpfil/pf kp,glebius Pre-commit review recommended. tests freebsd-testing,ngie Pre-commit review requested. usr.sbin/bsdconfig dteske Pre-commit phabricator review requested. usr.sbin/dpv dteske Pre-commit review requested. Keep in sync with libdpv. usr.sbin/pkg pkg@ Please coordinate behavior or flag changes with pkg team. usr.sbin/sysrc dteske Pre-commit phabricator review requested. Keep in sync with bsdconfig(8) sysrc.subr. vmm(4) neel,grehan Pre-commit review requested. autofs(5) trasz Pre-commit review recommended. iscsi(4) trasz Pre-commit review recommended. rctl(8) trasz Pre-commit review recommended. +sys/dev/ofw nwhitehorn Pre-commit review recommended. Property changes on: user/alc/PQ_LAUNDRY/MAINTAINERS ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/MAINTAINERS:r299821-303204 Index: user/alc/PQ_LAUNDRY/Makefile.inc1 =================================================================== --- user/alc/PQ_LAUNDRY/Makefile.inc1 (revision 303205) +++ user/alc/PQ_LAUNDRY/Makefile.inc1 (revision 303206) @@ -1,2561 +1,2564 @@ # # $FreeBSD$ # # Make command line options: # -DNO_CLEANDIR run ${MAKE} clean, instead of ${MAKE} cleandir # -DNO_CLEAN do not clean at all # -DDB_FROM_SRC use the user/group databases in src/etc instead of # the system database when installing. # -DNO_SHARE do not go into share subdir # -DKERNFAST define NO_KERNEL{CONFIG,CLEAN,OBJ} # -DNO_KERNELCONFIG do not run config in ${MAKE} buildkernel # -DNO_KERNELCLEAN do not run ${MAKE} clean in ${MAKE} buildkernel # -DNO_KERNELOBJ do not run ${MAKE} obj in ${MAKE} buildkernel # -DNO_PORTSUPDATE do not update ports in ${MAKE} update # -DNO_ROOT install without using root privilege # -DNO_DOCUPDATE do not update doc in ${MAKE} update # -DWITHOUT_CTF do not run the DTrace CTF conversion tools on built objects # LOCAL_DIRS="list of dirs" to add additional dirs to the SUBDIR list # LOCAL_ITOOLS="list of tools" to add additional tools to the ITOOLS list # LOCAL_LIB_DIRS="list of dirs" to add additional dirs to libraries target # LOCAL_MTREE="list of mtree files" to process to allow local directories # to be created before files are installed # LOCAL_TOOL_DIRS="list of dirs" to add additional dirs to the build-tools # list +# LOCAL_XTOOL_DIRS="list of dirs" to add additional dirs to the +# cross-tools target # METALOG="path to metadata log" to write permission and ownership # when NO_ROOT is set. (default: ${DESTDIR}/METALOG) # TARGET="machine" to crossbuild world for a different machine type # TARGET_ARCH= may be required when a TARGET supports multiple endians # BUILDENV_SHELL= shell to launch for the buildenv target (def:${SHELL}) # WORLD_FLAGS= additional flags to pass to make(1) during buildworld # KERNEL_FLAGS= additional flags to pass to make(1) during buildkernel # SUBDIR_OVERRIDE="list of dirs" to build rather than everything. # All libraries and includes, and some build tools will still build. # # The intended user-driven targets are: # buildworld - rebuild *everything*, including glue to help do upgrades # installworld- install everything built by "buildworld" # checkworld - run test suite on installed world # doxygen - build API documentation of the kernel # update - convenient way to update your source tree (eg: svn/svnup) # # Standard targets (not defined here) are documented in the makefiles in # /usr/share/mk. These include: # obj depend all install clean cleandepend cleanobj .if !defined(TARGET) || !defined(TARGET_ARCH) .error "Both TARGET and TARGET_ARCH must be defined." .endif SRCDIR?= ${.CURDIR} LOCALBASE?= /usr/local # Cross toolchain changes must be in effect before bsd.compiler.mk # so that gets the right CC, and pass CROSS_TOOLCHAIN to submakes. .if defined(CROSS_TOOLCHAIN) .include "${LOCALBASE}/share/toolchains/${CROSS_TOOLCHAIN}.mk" CROSSENV+=CROSS_TOOLCHAIN="${CROSS_TOOLCHAIN}" .endif .if defined(CROSS_TOOLCHAIN_PREFIX) CROSS_COMPILER_PREFIX?=${CROSS_TOOLCHAIN_PREFIX} .endif XCOMPILERS= CC CXX CPP .for COMPILER in ${XCOMPILERS} .if defined(CROSS_COMPILER_PREFIX) X${COMPILER}?= ${CROSS_COMPILER_PREFIX}${${COMPILER}} .else X${COMPILER}?= ${${COMPILER}} .endif .endfor # If a full path to an external cross compiler is given, don't build # a cross compiler. .if ${XCC:N${CCACHE_BIN}:M/*} MK_CROSS_COMPILER= no .endif # Pull in COMPILER_TYPE and COMPILER_FREEBSD_VERSION early. .include .include "share/mk/src.opts.mk" # Check if there is a local compiler that can satisfy as an external compiler. .if ${MK_SYSTEM_COMPILER} == "yes" && ${MK_CROSS_COMPILER} == "yes" && \ (${MK_CLANG_BOOTSTRAP} == "yes" || ${MK_GCC_BOOTSTRAP} == "yes") && \ !make(showconfig) && !make(native-xtools) && !make(xdev*) # Which compiler is expected to be used? .if ${MK_CLANG_BOOTSTRAP} == "yes" _expected_compiler_type= clang .elif ${MK_GCC_BOOTSTRAP} == "yes" _expected_compiler_type= gcc .endif # If the expected vs CC is different then we can't skip. # GCC cannot be used for cross-arch yet. For clang we pass -target later if # TARGET_ARCH!=MACHINE_ARCH. .if ${_expected_compiler_type} == ${COMPILER_TYPE} && \ (${COMPILER_TYPE} == "clang" || ${TARGET_ARCH} == ${MACHINE_ARCH}) # It needs to be the same revision as we would build for the bootstrap. .if !defined(CROSS_COMPILER_FREEBSD_VERSION) .if ${_expected_compiler_type} == "clang" CROSS_COMPILER_FREEBSD_VERSION!= \ awk '$$2 == "FREEBSD_CC_VERSION" {printf("%d\n", $$3)}' \ ${SRCDIR}/lib/clang/freebsd_cc_version.h || echo unknown CROSS_COMPILER_VERSION!= \ awk '$$2 == "CLANG_VERSION" {split($$3, a, "."); print a[1] * 10000 + a[2] * 100 + a[3]}' \ ${SRCDIR}/lib/clang/include/clang/Basic/Version.inc || echo unknown .elif ${_expected_compiler_type} == "gcc" CROSS_COMPILER_FREEBSD_VERSION!= \ awk '$$2 == "FBSD_CC_VER" {printf("%d\n", $$3)}' \ ${SRCDIR}/gnu/usr.bin/cc/cc_tools/freebsd-native.h || echo unknown CROSS_COMPILER_VERSION!= \ awk -F. '{print $$1 * 10000 + $$2 * 100 + $$3}' \ ${SRCDIR}/contrib/gcc/BASE-VER || echo unknown .endif .export CROSS_COMPILER_FREEBSD_VERSION CROSS_COMPILER_VERSION .endif # !defined(CROSS_COMPILER_FREEBSD_VERSION) .if ${COMPILER_VERSION} == ${CROSS_COMPILER_VERSION} && \ ${COMPILER_FREEBSD_VERSION} == ${CROSS_COMPILER_FREEBSD_VERSION} # Everything matches, disable the bootstrap compiler. MK_CLANG_BOOTSTRAP= no MK_GCC_BOOTSTRAP= no .if make(buildworld) .info SYSTEM_COMPILER: Determined that CC=${CC} matches the source tree. Not bootstrapping a cross-compiler. .endif .endif # ${COMPILER_VERSION} == ${CROSS_COMPILER_VERSION} .endif # ${_expected_compiler_type} == ${COMPILER_TYPE} .endif # ${XCC:N${CCACHE_BIN}:M/*} # For installworld need to ensure that the looked-up compiler metadata is # passed along rather than trying to run cc from the restricted # STRICTTMPPATH. .if ${MK_CLANG_BOOTSTRAP} == "no" && ${MK_GCC_BOOTSTRAP} == "no" .if !defined(X_COMPILER_TYPE) CROSSENV+= COMPILER_VERSION=${COMPILER_VERSION} \ COMPILER_TYPE=${COMPILER_TYPE} \ COMPILER_FREEBSD_VERSION=${COMPILER_FREEBSD_VERSION} .else CROSSENV+= COMPILER_VERSION=${X_COMPILER_VERSION} \ COMPILER_TYPE=${X_COMPILER_TYPE} \ COMPILER_FREEBSD_VERSION=${X_COMPILER_FREEBSD_VERSION} .endif .endif # Handle external binutils. .if defined(CROSS_TOOLCHAIN_PREFIX) CROSS_BINUTILS_PREFIX?=${CROSS_TOOLCHAIN_PREFIX} .endif # If we do not have a bootstrap binutils (because the in-tree one does not # support the target architecture), provide a default cross-binutils prefix. # This allows aarch64 builds, for example, to automatically use the # aarch64-binutils port or package. .if !make(showconfig) .if !empty(BROKEN_OPTIONS:MBINUTILS_BOOTSTRAP) && \ !defined(CROSS_BINUTILS_PREFIX) CROSS_BINUTILS_PREFIX=/usr/local/${TARGET_ARCH}-freebsd/bin/ .if !exists(${CROSS_BINUTILS_PREFIX}) .error In-tree binutils does not support the ${TARGET_ARCH} architecture. Install the ${TARGET_ARCH}-binutils port or package or set CROSS_BINUTILS_PREFIX. .endif .endif .endif XBINUTILS= AS AR LD NM OBJCOPY OBJDUMP RANLIB SIZE STRINGS .for BINUTIL in ${XBINUTILS} .if defined(CROSS_BINUTILS_PREFIX) && \ exists(${CROSS_BINUTILS_PREFIX}${${BINUTIL}}) X${BINUTIL}?= ${CROSS_BINUTILS_PREFIX}${${BINUTIL}} .else X${BINUTIL}?= ${${BINUTIL}} .endif .endfor # We must do lib/ and libexec/ before bin/ in case of a mid-install error to # keep the users system reasonably usable. For static->dynamic root upgrades, # we don't want to install a dynamic binary without rtld and the needed # libraries. More commonly, for dynamic root, we don't want to install a # binary that requires a newer library version that hasn't been installed yet. # This ordering is not a guarantee though. The only guarantee of a working # system here would require fine-grained ordering of all components based # on their dependencies. .if !empty(SUBDIR_OVERRIDE) SUBDIR= ${SUBDIR_OVERRIDE} .else SUBDIR= lib libexec .if !defined(NO_ROOT) && (make(installworld) || make(install)) # Ensure libraries are installed before progressing. SUBDIR+=.WAIT .endif SUBDIR+=bin .if ${MK_CDDL} != "no" SUBDIR+=cddl .endif SUBDIR+=gnu include .if ${MK_KERBEROS} != "no" SUBDIR+=kerberos5 .endif .if ${MK_RESCUE} != "no" SUBDIR+=rescue .endif SUBDIR+=sbin .if ${MK_CRYPT} != "no" SUBDIR+=secure .endif .if !defined(NO_SHARE) SUBDIR+=share .endif SUBDIR+=sys usr.bin usr.sbin .if ${MK_TESTS} != "no" SUBDIR+= tests .endif .if ${MK_OFED} != "no" SUBDIR+=contrib/ofed .endif # Local directories are last, since it is nice to at least get the base # system rebuilt before you do them. .for _DIR in ${LOCAL_DIRS} .if exists(${.CURDIR}/${_DIR}/Makefile) SUBDIR+= ${_DIR} .endif .endfor # Add LOCAL_LIB_DIRS, but only if they will not be picked up as a SUBDIR # of a LOCAL_DIRS directory. This allows LOCAL_DIRS=foo and # LOCAL_LIB_DIRS=foo/lib to behave as expected. .for _DIR in ${LOCAL_DIRS:M*/} ${LOCAL_DIRS:N*/:S|$|/|} _REDUNDENT_LIB_DIRS+= ${LOCAL_LIB_DIRS:M${_DIR}*} .endfor .for _DIR in ${LOCAL_LIB_DIRS} .if empty(_REDUNDENT_LIB_DIRS:M${_DIR}) && exists(${.CURDIR}/${_DIR}/Makefile) SUBDIR+= ${_DIR} .else .warning ${_DIR} not added to SUBDIR list. See UPDATING 20141121. .endif .endfor # We must do etc/ last as it hooks into building the man whatis file # by calling 'makedb' in share/man. This is only relevant for # install/distribute so they build the whatis file after every manpage is # installed. .if make(installworld) || make(install) SUBDIR+=.WAIT .endif SUBDIR+=etc .endif # !empty(SUBDIR_OVERRIDE) .if defined(NOCLEAN) .warning NOCLEAN option is deprecated. Use NO_CLEAN instead. NO_CLEAN= ${NOCLEAN} .endif .if defined(NO_CLEANDIR) CLEANDIR= clean cleandepend .else CLEANDIR= cleandir .endif .if ${MK_META_MODE} == "yes" # If filemon is used then we can rely on the build being incremental-safe. # The .meta files will also track the build command and rebuild should # it change. .if empty(.MAKE.MODE:Mnofilemon) NO_CLEAN= t .endif .endif LOCAL_TOOL_DIRS?= PACKAGEDIR?= ${DESTDIR}/${DISTDIR} .if empty(SHELL:M*csh*) BUILDENV_SHELL?=${SHELL} .else BUILDENV_SHELL?=/bin/sh .endif .if !defined(SVN) || empty(SVN) . for _P in /usr/bin /usr/local/bin . for _S in svn svnlite . if exists(${_P}/${_S}) SVN= ${_P}/${_S} . endif . endfor . endfor .endif SVNFLAGS?= -r HEAD MAKEOBJDIRPREFIX?= /usr/obj .if !defined(OSRELDATE) .if exists(/usr/include/osreldate.h) OSRELDATE!= awk '/^\#define[[:space:]]*__FreeBSD_version/ { print $$3 }' \ /usr/include/osreldate.h .else OSRELDATE= 0 .endif .export OSRELDATE .endif # Set VERSION for CTFMERGE to use via the default CTFFLAGS=-L VERSION. .if !defined(_REVISION) _REVISION!= MK_AUTO_OBJ=no ${MAKE} -C ${SRCDIR}/release -V REVISION .export _REVISION .endif .if !defined(_BRANCH) _BRANCH!= MK_AUTO_OBJ=no ${MAKE} -C ${SRCDIR}/release -V BRANCH .export _BRANCH .endif .if !defined(SRCRELDATE) SRCRELDATE!= awk '/^\#define[[:space:]]*__FreeBSD_version/ { print $$3 }' \ ${SRCDIR}/sys/sys/param.h .export SRCRELDATE .endif .if !defined(VERSION) VERSION= FreeBSD ${_REVISION}-${_BRANCH:C/-p[0-9]+$//} ${TARGET_ARCH} ${SRCRELDATE} .export VERSION .endif .if !defined(PKG_VERSION) .if ${_BRANCH:MSTABLE*} || ${_BRANCH:MCURRENT*} || ${_BRANCH:MALPHA*} TIMENOW= %Y%m%d%H%M%S EXTRA_REVISION= .s${TIMENOW:gmtime} .endif .if ${_BRANCH:M*-p*} EXTRA_REVISION= _${_BRANCH:C/.*-p([0-9]+$)/\1/} .endif PKG_VERSION= ${_REVISION}${EXTRA_REVISION} .endif KNOWN_ARCHES?= aarch64/arm64 \ amd64 \ arm \ armeb/arm \ armv6/arm \ i386 \ i386/pc98 \ mips \ mipsel/mips \ mips64el/mips \ mipsn32el/mips \ mips64/mips \ mipsn32/mips \ powerpc \ powerpc64/powerpc \ riscv64/riscv \ sparc64 .if ${TARGET} == ${TARGET_ARCH} _t= ${TARGET} .else _t= ${TARGET_ARCH}/${TARGET} .endif .for _t in ${_t} .if empty(KNOWN_ARCHES:M${_t}) .error Unknown target ${TARGET_ARCH}:${TARGET}. .endif .endfor .if ${TARGET} == ${MACHINE} TARGET_CPUTYPE?=${CPUTYPE} .else TARGET_CPUTYPE?= .endif .if !empty(TARGET_CPUTYPE) _TARGET_CPUTYPE=${TARGET_CPUTYPE} .else _TARGET_CPUTYPE=dummy .endif _CPUTYPE!= MK_AUTO_OBJ=no MAKEFLAGS= CPUTYPE=${_TARGET_CPUTYPE} ${MAKE} \ -f /dev/null -m ${.CURDIR}/share/mk -V CPUTYPE .if ${_CPUTYPE} != ${_TARGET_CPUTYPE} .error CPUTYPE global should be set with ?=. .endif .if make(buildworld) BUILD_ARCH!= uname -p .if ${MACHINE_ARCH} != ${BUILD_ARCH} .error To cross-build, set TARGET_ARCH. .endif .endif .if ${MACHINE} == ${TARGET} && ${MACHINE_ARCH} == ${TARGET_ARCH} && !defined(CROSS_BUILD_TESTING) OBJTREE= ${MAKEOBJDIRPREFIX} .else OBJTREE= ${MAKEOBJDIRPREFIX}/${TARGET}.${TARGET_ARCH} .endif WORLDTMP= ${OBJTREE}${.CURDIR}/tmp BPATH= ${WORLDTMP}/legacy/usr/sbin:${WORLDTMP}/legacy/usr/bin:${WORLDTMP}/legacy/bin XPATH= ${WORLDTMP}/usr/sbin:${WORLDTMP}/usr/bin STRICTTMPPATH= ${BPATH}:${XPATH} TMPPATH= ${STRICTTMPPATH}:${PATH} # # Avoid running mktemp(1) unless actually needed. # It may not be functional, e.g., due to new ABI # when in the middle of installing over this system. # .if make(distributeworld) || make(installworld) || make(stageworld) INSTALLTMP!= /usr/bin/mktemp -d -u -t install .endif .if make(stagekernel) || make(distributekernel) TAGS+= kernel PACKAGE= kernel .endif # # Building a world goes through the following stages # # 1. legacy stage [BMAKE] # This stage is responsible for creating compatibility # shims that are needed by the bootstrap-tools, # build-tools and cross-tools stages. These are generally # APIs that tools from one of those three stages need to # build that aren't present on the host. # 1. bootstrap-tools stage [BMAKE] # This stage is responsible for creating programs that # are needed for backward compatibility reasons. They # are not built as cross-tools. # 2. build-tools stage [TMAKE] # This stage is responsible for creating the object # tree and building any tools that are needed during # the build process. Some programs are listed during # this phase because they build binaries to generate # files needed to build these programs. This stage also # builds the 'build-tools' target rather than 'all'. # 3. cross-tools stage [XMAKE] # This stage is responsible for creating any tools that # are needed for building the system. A cross-compiler is one # of them. This differs from build tools in two ways: # 1. the 'all' target is built rather than 'build-tools' # 2. these tools are installed into TMPPATH for stage 4. # 4. world stage [WMAKE] # This stage actually builds the world. # 5. install stage (optional) [IMAKE] # This stage installs a previously built world. # BOOTSTRAPPING?= 0 # Keep these in sync MINIMUM_SUPPORTED_OSREL?= 900044 MINIMUM_SUPPORTED_REL?= 9.1 # Common environment for world related stages CROSSENV+= MAKEOBJDIRPREFIX=${OBJTREE} \ MACHINE_ARCH=${TARGET_ARCH} \ MACHINE=${TARGET} \ CPUTYPE=${TARGET_CPUTYPE} .if ${MK_META_MODE} != "no" # Don't rebuild build-tools targets during normal build. CROSSENV+= BUILD_TOOLS_META=.NOMETA_CMP .endif .if ${MK_GROFF} != "no" CROSSENV+= GROFF_BIN_PATH=${WORLDTMP}/legacy/usr/bin \ GROFF_FONT_PATH=${WORLDTMP}/legacy/usr/share/groff_font \ GROFF_TMAC_PATH=${WORLDTMP}/legacy/usr/share/tmac .endif .if defined(TARGET_CFLAGS) CROSSENV+= ${TARGET_CFLAGS} .endif # bootstrap-tools stage BMAKEENV= INSTALL="sh ${.CURDIR}/tools/install.sh" \ TOOLS_PREFIX=${WORLDTMP} \ PATH=${BPATH}:${PATH} \ WORLDTMP=${WORLDTMP} \ MAKEFLAGS="-m ${.CURDIR}/tools/build/mk ${.MAKEFLAGS}" # need to keep this in sync with targets/pseudo/bootstrap-tools/Makefile BSARGS= DESTDIR= \ BOOTSTRAPPING=${OSRELDATE} \ SSP_CFLAGS= \ MK_HTML=no NO_LINT=yes MK_MAN=no \ -DNO_PIC MK_PROFILE=no -DNO_SHARED \ -DNO_CPU_CFLAGS MK_WARNS=no MK_CTF=no \ MK_CLANG_EXTRAS=no MK_CLANG_FULL=no \ MK_LLDB=no MK_TESTS=no \ MK_INCLUDES=yes BMAKE= MAKEOBJDIRPREFIX=${WORLDTMP} \ ${BMAKEENV} ${MAKE} ${WORLD_FLAGS} -f Makefile.inc1 \ ${BSARGS} # build-tools stage TMAKE= MAKEOBJDIRPREFIX=${OBJTREE} \ ${BMAKEENV} ${MAKE} ${WORLD_FLAGS} -f Makefile.inc1 \ TARGET=${TARGET} TARGET_ARCH=${TARGET_ARCH} \ DESTDIR= \ BOOTSTRAPPING=${OSRELDATE} \ SSP_CFLAGS= \ -DNO_LINT \ -DNO_CPU_CFLAGS MK_WARNS=no MK_CTF=no \ MK_CLANG_EXTRAS=no MK_CLANG_FULL=no \ MK_LLDB=no MK_TESTS=no # cross-tools stage XMAKE= TOOLS_PREFIX=${WORLDTMP} ${BMAKE} \ TARGET=${TARGET} TARGET_ARCH=${TARGET_ARCH} \ MK_GDB=no MK_TESTS=no # kernel-tools stage KTMAKEENV= INSTALL="sh ${.CURDIR}/tools/install.sh" \ PATH=${BPATH}:${PATH} \ WORLDTMP=${WORLDTMP} KTMAKE= TOOLS_PREFIX=${WORLDTMP} MAKEOBJDIRPREFIX=${WORLDTMP} \ ${KTMAKEENV} ${MAKE} ${WORLD_FLAGS} -f Makefile.inc1 \ DESTDIR= \ BOOTSTRAPPING=${OSRELDATE} \ SSP_CFLAGS= \ MK_HTML=no -DNO_LINT MK_MAN=no \ -DNO_PIC MK_PROFILE=no -DNO_SHARED \ -DNO_CPU_CFLAGS MK_WARNS=no MK_CTF=no # world stage WMAKEENV= ${CROSSENV} \ INSTALL="sh ${.CURDIR}/tools/install.sh" \ PATH=${TMPPATH} # make hierarchy HMAKE= PATH=${TMPPATH} ${MAKE} LOCAL_MTREE=${LOCAL_MTREE:Q} .if defined(NO_ROOT) HMAKE+= PATH=${TMPPATH} METALOG=${METALOG} -DNO_ROOT .endif CROSSENV+= CC="${XCC} ${XCFLAGS}" CXX="${XCXX} ${XCXXFLAGS} ${XCFLAGS}" \ CPP="${XCPP} ${XCFLAGS}" \ AS="${XAS}" AR="${XAR}" LD="${XLD}" NM=${XNM} \ OBJDUMP=${XOBJDUMP} OBJCOPY="${XOBJCOPY}" \ RANLIB=${XRANLIB} STRINGS=${XSTRINGS} \ SIZE="${XSIZE}" .if defined(CROSS_BINUTILS_PREFIX) && exists(${CROSS_BINUTILS_PREFIX}) # In the case of xdev-build tools, CROSS_BINUTILS_PREFIX won't be a # directory, but the compiler will look in the right place for its # tools so we don't need to tell it where to look. BFLAGS+= -B${CROSS_BINUTILS_PREFIX} .endif # External compiler needs sysroot and target flags. .if ${MK_CROSS_COMPILER} == "no" || \ (${MK_CLANG_BOOTSTRAP} == "no" && ${MK_GCC_BOOTSTRAP} == "no") .if !defined(CROSS_BINUTILS_PREFIX) || !exists(${CROSS_BINUTILS_PREFIX}) BFLAGS+= -B${WORLDTMP}/usr/bin .endif .if ${TARGET} == "arm" .if ${TARGET_ARCH:Marmv6*} != "" && ${TARGET_CPUTYPE:M*soft*} == "" TARGET_ABI= gnueabihf .else TARGET_ABI= gnueabi .endif .endif .if defined(X_COMPILER_TYPE) && ${X_COMPILER_TYPE} == gcc # GCC requires -isystem and -L when using a cross-compiler. --sysroot # won't set header path and -L is used to ensure the base library path # is added before the port PREFIX library path. XCFLAGS+= -isystem ${WORLDTMP}/usr/include -L${WORLDTMP}/usr/lib # Force using libc++ for external GCC. # XXX: This should be checking MK_GNUCXX == no .if ${X_COMPILER_VERSION} >= 40800 XCXXFLAGS+= -isystem ${WORLDTMP}/usr/include/c++/v1 -std=c++11 \ -nostdinc++ -L${WORLDTMP}/../lib/libc++ .endif .else TARGET_ABI?= unknown TARGET_TRIPLE?= ${TARGET_ARCH:C/amd64/x86_64/}-${TARGET_ABI}-freebsd12.0 XCFLAGS+= -target ${TARGET_TRIPLE} .endif XCFLAGS+= --sysroot=${WORLDTMP} .endif # ${MK_CROSS_COMPILER} == "no" .if !empty(BFLAGS) XCFLAGS+= ${BFLAGS} .endif .if ${MK_LIB32} != "no" && (${TARGET_ARCH} == "amd64" || \ ${TARGET_ARCH} == "powerpc64") LIBCOMPAT= 32 .include "Makefile.libcompat" .elif ${MK_LIBSOFT} != "no" && ${TARGET_ARCH} == "armv6" LIBCOMPAT= SOFT .include "Makefile.libcompat" .endif WMAKE= ${WMAKEENV} ${MAKE} ${WORLD_FLAGS} -f Makefile.inc1 DESTDIR=${WORLDTMP} IMAKEENV= ${CROSSENV} IMAKE= ${IMAKEENV} ${MAKE} -f Makefile.inc1 \ ${IMAKE_INSTALL} ${IMAKE_MTREE} .if empty(.MAKEFLAGS:M-n) IMAKEENV+= PATH=${STRICTTMPPATH}:${INSTALLTMP} \ LD_LIBRARY_PATH=${INSTALLTMP} \ PATH_LOCALE=${INSTALLTMP}/locale IMAKE+= __MAKE_SHELL=${INSTALLTMP}/sh .else IMAKEENV+= PATH=${TMPPATH}:${INSTALLTMP} .endif .if defined(DB_FROM_SRC) INSTALLFLAGS+= -N ${.CURDIR}/etc MTREEFLAGS+= -N ${.CURDIR}/etc .endif _INSTALL_DDIR= ${DESTDIR}/${DISTDIR} INSTALL_DDIR= ${_INSTALL_DDIR:S://:/:g:C:/$::} .if defined(NO_ROOT) METALOG?= ${DESTDIR}/${DISTDIR}/METALOG IMAKE+= -DNO_ROOT METALOG=${METALOG} INSTALLFLAGS+= -U -M ${METALOG} -D ${INSTALL_DDIR} MTREEFLAGS+= -W .endif .if defined(BUILD_PKGS) INSTALLFLAGS+= -h sha256 .endif .if defined(DB_FROM_SRC) || defined(NO_ROOT) IMAKE_INSTALL= INSTALL="install ${INSTALLFLAGS}" IMAKE_MTREE= MTREE_CMD="mtree ${MTREEFLAGS}" .endif # kernel stage KMAKEENV= ${WMAKEENV} KMAKE= ${KMAKEENV} ${MAKE} ${.MAKEFLAGS} ${KERNEL_FLAGS} KERNEL=${INSTKERNNAME} # # buildworld # # Attempt to rebuild the entire system, with reasonable chance of # success, regardless of how old your existing system is. # _worldtmp: .PHONY .if ${.CURDIR:C/[^,]//g} != "" # The m4 build of sendmail files doesn't like it if ',' is used # anywhere in the path of it's files. @echo @echo "*** Error: path to source tree contains a comma ','" @echo false .endif @echo @echo "--------------------------------------------------------------" @echo ">>> Rebuilding the temporary build tree" @echo "--------------------------------------------------------------" .if !defined(NO_CLEAN) rm -rf ${WORLDTMP} .if defined(LIBCOMPAT) rm -rf ${LIBCOMPATTMP} .endif .else rm -rf ${WORLDTMP}/legacy/usr/include # XXX - These can depend on any header file. rm -f ${OBJTREE}${.CURDIR}/lib/libsysdecode/ioctl.c rm -f ${OBJTREE}${.CURDIR}/usr.bin/kdump/kdump_subr.c .endif .for _dir in \ lib lib/casper usr legacy/bin legacy/usr mkdir -p ${WORLDTMP}/${_dir} .endfor mtree -deU -f ${.CURDIR}/etc/mtree/BSD.usr.dist \ -p ${WORLDTMP}/legacy/usr >/dev/null .if ${MK_GROFF} != "no" mtree -deU -f ${.CURDIR}/etc/mtree/BSD.groff.dist \ -p ${WORLDTMP}/legacy/usr >/dev/null .endif mtree -deU -f ${.CURDIR}/etc/mtree/BSD.include.dist \ -p ${WORLDTMP}/legacy/usr/include >/dev/null mtree -deU -f ${.CURDIR}/etc/mtree/BSD.usr.dist \ -p ${WORLDTMP}/usr >/dev/null mtree -deU -f ${.CURDIR}/etc/mtree/BSD.include.dist \ -p ${WORLDTMP}/usr/include >/dev/null ln -sf ${.CURDIR}/sys ${WORLDTMP} .if ${MK_DEBUG_FILES} != "no" # We could instead disable debug files for these build stages mtree -deU -f ${.CURDIR}/etc/mtree/BSD.debug.dist \ -p ${WORLDTMP}/legacy/usr/lib >/dev/null mtree -deU -f ${.CURDIR}/etc/mtree/BSD.debug.dist \ -p ${WORLDTMP}/usr/lib >/dev/null .endif .if defined(LIBCOMPAT) mtree -deU -f ${.CURDIR}/etc/mtree/BSD.lib${libcompat}.dist \ -p ${WORLDTMP}/usr >/dev/null .if ${MK_DEBUG_FILES} != "no" mtree -deU -f ${.CURDIR}/etc/mtree/BSD.lib${libcompat}.dist \ -p ${WORLDTMP}/legacy/usr/lib/debug/usr >/dev/null mtree -deU -f ${.CURDIR}/etc/mtree/BSD.lib${libcompat}.dist \ -p ${WORLDTMP}/usr/lib/debug/usr >/dev/null .endif .endif .if ${MK_TESTS} != "no" mkdir -p ${WORLDTMP}${TESTSBASE} mtree -deU -f ${.CURDIR}/etc/mtree/BSD.tests.dist \ -p ${WORLDTMP}${TESTSBASE} >/dev/null .if ${MK_DEBUG_FILES} != "no" mkdir -p ${WORLDTMP}/usr/lib/debug/${TESTSBASE} mtree -deU -f ${.CURDIR}/etc/mtree/BSD.tests.dist \ -p ${WORLDTMP}/usr/lib/debug/${TESTSBASE} >/dev/null .endif .endif .for _mtree in ${LOCAL_MTREE} mtree -deU -f ${.CURDIR}/${_mtree} -p ${WORLDTMP} > /dev/null .endfor _legacy: @echo @echo "--------------------------------------------------------------" @echo ">>> stage 1.1: legacy release compatibility shims" @echo "--------------------------------------------------------------" ${_+_}cd ${.CURDIR}; ${BMAKE} legacy _bootstrap-tools: @echo @echo "--------------------------------------------------------------" @echo ">>> stage 1.2: bootstrap tools" @echo "--------------------------------------------------------------" ${_+_}cd ${.CURDIR}; ${BMAKE} bootstrap-tools _cleanobj: .if !defined(NO_CLEAN) @echo @echo "--------------------------------------------------------------" @echo ">>> stage 2.1: cleaning up the object tree" @echo "--------------------------------------------------------------" ${_+_}cd ${.CURDIR}; ${WMAKE} ${CLEANDIR} .if defined(LIBCOMPAT) ${_+_}cd ${.CURDIR}; ${LIBCOMPATWMAKE} -f Makefile.inc1 ${CLEANDIR} .endif .endif _obj: @echo @echo "--------------------------------------------------------------" @echo ">>> stage 2.2: rebuilding the object tree" @echo "--------------------------------------------------------------" ${_+_}cd ${.CURDIR}; ${WMAKE} obj _build-tools: @echo @echo "--------------------------------------------------------------" @echo ">>> stage 2.3: build tools" @echo "--------------------------------------------------------------" ${_+_}cd ${.CURDIR}; ${TMAKE} build-tools _cross-tools: @echo @echo "--------------------------------------------------------------" @echo ">>> stage 3: cross tools" @echo "--------------------------------------------------------------" ${_+_}cd ${.CURDIR}; ${XMAKE} cross-tools ${_+_}cd ${.CURDIR}; ${XMAKE} kernel-tools _includes: @echo @echo "--------------------------------------------------------------" @echo ">>> stage 4.1: building includes" @echo "--------------------------------------------------------------" # Special handling for SUBDIR_OVERRIDE in buildworld as they most likely need # headers from default SUBDIR. Do SUBDIR_OVERRIDE includes last. ${_+_}cd ${.CURDIR}; ${WMAKE} SUBDIR_OVERRIDE= SHARED=symlinks \ MK_INCLUDES=yes includes .if !empty(SUBDIR_OVERRIDE) && make(buildworld) ${_+_}cd ${.CURDIR}; ${WMAKE} MK_INCLUDES=yes SHARED=symlinks includes .endif _libraries: @echo @echo "--------------------------------------------------------------" @echo ">>> stage 4.2: building libraries" @echo "--------------------------------------------------------------" ${_+_}cd ${.CURDIR}; \ ${WMAKE} -DNO_FSCHG MK_HTML=no -DNO_LINT MK_MAN=no \ MK_PROFILE=no MK_TESTS=no MK_TESTS_SUPPORT=${MK_TESTS} libraries everything: .PHONY @echo @echo "--------------------------------------------------------------" @echo ">>> stage 4.3: building everything" @echo "--------------------------------------------------------------" ${_+_}cd ${.CURDIR}; _PARALLEL_SUBDIR_OK=1 ${WMAKE} all WMAKE_TGTS= WMAKE_TGTS+= _worldtmp _legacy .if empty(SUBDIR_OVERRIDE) WMAKE_TGTS+= _bootstrap-tools .endif WMAKE_TGTS+= _cleanobj _obj _build-tools _cross-tools WMAKE_TGTS+= _includes _libraries WMAKE_TGTS+= everything .if defined(LIBCOMPAT) && empty(SUBDIR_OVERRIDE) WMAKE_TGTS+= build${libcompat} .endif buildworld: buildworld_prologue ${WMAKE_TGTS} buildworld_epilogue .PHONY .ORDER: buildworld_prologue ${WMAKE_TGTS} buildworld_epilogue buildworld_prologue: .PHONY @echo "--------------------------------------------------------------" @echo ">>> World build started on `LC_ALL=C date`" @echo "--------------------------------------------------------------" buildworld_epilogue: .PHONY @echo @echo "--------------------------------------------------------------" @echo ">>> World build completed on `LC_ALL=C date`" @echo "--------------------------------------------------------------" # # We need to have this as a target because the indirection between Makefile # and Makefile.inc1 causes the correct PATH to be used, rather than a # modification of the current environment's PATH. In addition, we need # to quote multiword values. # buildenvvars: .PHONY @echo ${WMAKEENV:Q} ${.MAKE.EXPORTED:@v@$v=\"${$v}\"@} .if ${.TARGETS:Mbuildenv} .if ${.MAKEFLAGS:M-j} .error The buildenv target is incompatible with -j .endif .endif BUILDENV_DIR?= ${.CURDIR} buildenv: .PHONY @echo Entering world for ${TARGET_ARCH}:${TARGET} .if ${BUILDENV_SHELL:M*zsh*} @echo For ZSH you must run: export CPUTYPE=${TARGET_CPUTYPE} .endif @cd ${BUILDENV_DIR} && env ${WMAKEENV} BUILDENV=1 ${BUILDENV_SHELL} \ || true TOOLCHAIN_TGTS= ${WMAKE_TGTS:Neverything:Nbuild${libcompat}} toolchain: ${TOOLCHAIN_TGTS} .PHONY kernel-toolchain: ${TOOLCHAIN_TGTS:N_includes:N_libraries} .PHONY # # installcheck # # Checks to be sure system is ready for installworld/installkernel. # installcheck: _installcheck_world _installcheck_kernel .PHONY _installcheck_world: .PHONY _installcheck_kernel: .PHONY # # Require DESTDIR to be set if installing for a different architecture or # using the user/group database in the source tree. # .if ${TARGET_ARCH} != ${MACHINE_ARCH} || ${TARGET} != ${MACHINE} || \ defined(DB_FROM_SRC) .if !make(distributeworld) _installcheck_world: __installcheck_DESTDIR _installcheck_kernel: __installcheck_DESTDIR __installcheck_DESTDIR: .PHONY .if !defined(DESTDIR) || empty(DESTDIR) @echo "ERROR: Please set DESTDIR!"; \ false .endif .endif .endif .if !defined(DB_FROM_SRC) # # Check for missing UIDs/GIDs. # CHECK_UIDS= auditdistd CHECK_GIDS= audit .if ${MK_SENDMAIL} != "no" CHECK_UIDS+= smmsp CHECK_GIDS+= smmsp .endif .if ${MK_PF} != "no" CHECK_UIDS+= proxy CHECK_GIDS+= proxy authpf .endif .if ${MK_UNBOUND} != "no" CHECK_UIDS+= unbound CHECK_GIDS+= unbound .endif _installcheck_world: __installcheck_UGID __installcheck_UGID: .PHONY .for uid in ${CHECK_UIDS} @if ! `id -u ${uid} >/dev/null 2>&1`; then \ echo "ERROR: Required ${uid} user is missing, see /usr/src/UPDATING."; \ false; \ fi .endfor .for gid in ${CHECK_GIDS} @if ! `find / -prune -group ${gid} >/dev/null 2>&1`; then \ echo "ERROR: Required ${gid} group is missing, see /usr/src/UPDATING."; \ false; \ fi .endfor .endif # # Required install tools to be saved in a scratch dir for safety. # .if ${MK_ZONEINFO} != "no" _zoneinfo= zic tzsetup .endif ITOOLS= [ awk cap_mkdb cat chflags chmod chown cmp cp \ date echo egrep find grep id install ${_install-info} \ ln make mkdir mtree mv pwd_mkdb \ rm sed services_mkdb sh strip sysctl test true uname wc ${_zoneinfo} \ ${LOCAL_ITOOLS} # Needed for share/man .if ${MK_MAN_UTILS} != "no" ITOOLS+=makewhatis .endif # # distributeworld # # Distributes everything compiled by a `buildworld'. # # installworld # # Installs everything compiled by a 'buildworld'. # # Non-base distributions produced by the base system EXTRA_DISTRIBUTIONS= doc .if defined(LIBCOMPAT) EXTRA_DISTRIBUTIONS+= lib${libcompat} .endif .if ${MK_TESTS} != "no" EXTRA_DISTRIBUTIONS+= tests .endif DEBUG_DISTRIBUTIONS= .if ${MK_DEBUG_FILES} != "no" DEBUG_DISTRIBUTIONS+= base ${EXTRA_DISTRIBUTIONS:S,doc,,:S,tests,,} .endif MTREE_MAGIC?= mtree 2.0 distributeworld installworld stageworld: _installcheck_world .PHONY mkdir -p ${INSTALLTMP} progs=$$(for prog in ${ITOOLS}; do \ if progpath=`which $$prog`; then \ echo $$progpath; \ else \ echo "Required tool $$prog not found in PATH." >&2; \ exit 1; \ fi; \ done); \ libs=$$(ldd -f "%o %p\n" -f "%o %p\n" $$progs 2>/dev/null | sort -u | \ while read line; do \ set -- $$line; \ if [ "$$2 $$3" != "not found" ]; then \ echo $$2; \ else \ echo "Required library $$1 not found." >&2; \ exit 1; \ fi; \ done); \ cp $$libs $$progs ${INSTALLTMP} cp -R $${PATH_LOCALE:-"/usr/share/locale"} ${INSTALLTMP}/locale .if defined(NO_ROOT) -mkdir -p ${METALOG:H} echo "#${MTREE_MAGIC}" > ${METALOG} .endif .if make(distributeworld) .for dist in ${EXTRA_DISTRIBUTIONS} -mkdir ${DESTDIR}/${DISTDIR}/${dist} mtree -deU -f ${.CURDIR}/etc/mtree/BSD.root.dist \ -p ${DESTDIR}/${DISTDIR}/${dist} >/dev/null mtree -deU -f ${.CURDIR}/etc/mtree/BSD.usr.dist \ -p ${DESTDIR}/${DISTDIR}/${dist}/usr >/dev/null mtree -deU -f ${.CURDIR}/etc/mtree/BSD.include.dist \ -p ${DESTDIR}/${DISTDIR}/${dist}/usr/include >/dev/null .if ${MK_DEBUG_FILES} != "no" mtree -deU -f ${.CURDIR}/etc/mtree/BSD.debug.dist \ -p ${DESTDIR}/${DISTDIR}/${dist}/usr/lib >/dev/null .endif .if defined(LIBCOMPAT) mtree -deU -f ${.CURDIR}/etc/mtree/BSD.lib${libcompat}.dist \ -p ${DESTDIR}/${DISTDIR}/${dist}/usr >/dev/null .if ${MK_DEBUG_FILES} != "no" mtree -deU -f ${.CURDIR}/etc/mtree/BSD.lib${libcompat}.dist \ -p ${DESTDIR}/${DISTDIR}/${dist}/usr/lib/debug/usr >/dev/null .endif .endif .if ${MK_TESTS} != "no" && ${dist} == "tests" -mkdir -p ${DESTDIR}/${DISTDIR}/${dist}${TESTSBASE} mtree -deU -f ${.CURDIR}/etc/mtree/BSD.tests.dist \ -p ${DESTDIR}/${DISTDIR}/${dist}${TESTSBASE} >/dev/null .if ${MK_DEBUG_FILES} != "no" mtree -deU -f ${.CURDIR}/etc/mtree/BSD.tests.dist \ -p ${DESTDIR}/${DISTDIR}/${dist}/usr/lib/debug/${TESTSBASE} >/dev/null .endif .endif .if defined(NO_ROOT) ${IMAKEENV} mtree -C -f ${.CURDIR}/etc/mtree/BSD.root.dist | \ sed -e 's#^\./#./${dist}/#' >> ${METALOG} ${IMAKEENV} mtree -C -f ${.CURDIR}/etc/mtree/BSD.usr.dist | \ sed -e 's#^\./#./${dist}/usr/#' >> ${METALOG} ${IMAKEENV} mtree -C -f ${.CURDIR}/etc/mtree/BSD.include.dist | \ sed -e 's#^\./#./${dist}/usr/include/#' >> ${METALOG} .if defined(LIBCOMPAT) ${IMAKEENV} mtree -C -f ${.CURDIR}/etc/mtree/BSD.lib${libcompat}.dist | \ sed -e 's#^\./#./${dist}/usr/#' >> ${METALOG} .endif .endif .endfor -mkdir ${DESTDIR}/${DISTDIR}/base ${_+_}cd ${.CURDIR}/etc; ${CROSSENV} PATH=${TMPPATH} ${MAKE} \ METALOG=${METALOG} ${IMAKE_INSTALL} ${IMAKE_MTREE} \ DISTBASE=/base DESTDIR=${DESTDIR}/${DISTDIR}/base \ LOCAL_MTREE=${LOCAL_MTREE:Q} distrib-dirs .endif ${_+_}cd ${.CURDIR}; ${IMAKE} re${.TARGET:S/world$//}; \ ${IMAKEENV} rm -rf ${INSTALLTMP} .if make(distributeworld) .for dist in ${EXTRA_DISTRIBUTIONS} find ${DESTDIR}/${DISTDIR}/${dist} -mindepth 1 -empty -delete .endfor .if defined(NO_ROOT) .for dist in base ${EXTRA_DISTRIBUTIONS} @# For each file that exists in this dist, print the corresponding @# line from the METALOG. This relies on the fact that @# a line containing only the filename will sort immediately before @# the relevant mtree line. cd ${DESTDIR}/${DISTDIR}; \ find ./${dist} | sort -u ${METALOG} - | \ awk 'BEGIN { print "#${MTREE_MAGIC}" } !/ type=/ { file = $$1 } / type=/ { if ($$1 == file) { sub(/^\.\/${dist}\//, "./"); print } }' > \ ${DESTDIR}/${DISTDIR}/${dist}.meta .endfor .for dist in ${DEBUG_DISTRIBUTIONS} @# For each file that exists in this dist, print the corresponding @# line from the METALOG. This relies on the fact that @# a line containing only the filename will sort immediately before @# the relevant mtree line. cd ${DESTDIR}/${DISTDIR}; \ find ./${dist}/usr/lib/debug | sort -u ${METALOG} - | \ awk 'BEGIN { print "#${MTREE_MAGIC}" } !/ type=/ { file = $$1 } / type=/ { if ($$1 == file) { sub(/^\.\/${dist}\//, "./"); print } }' > \ ${DESTDIR}/${DISTDIR}/${dist}.debug.meta .endfor .endif .endif packageworld: .PHONY .for dist in base ${EXTRA_DISTRIBUTIONS} .if defined(NO_ROOT) ${_+_}cd ${DESTDIR}/${DISTDIR}/${dist}; \ tar cvf - --exclude usr/lib/debug \ @${DESTDIR}/${DISTDIR}/${dist}.meta | \ ${XZ_CMD} > ${PACKAGEDIR}/${dist}.txz .else ${_+_}cd ${DESTDIR}/${DISTDIR}/${dist}; \ tar cvf - --exclude usr/lib/debug . | \ ${XZ_CMD} > ${PACKAGEDIR}/${dist}.txz .endif .endfor .for dist in ${DEBUG_DISTRIBUTIONS} . if defined(NO_ROOT) ${_+_}cd ${DESTDIR}/${DISTDIR}/${dist}; \ tar cvf - @${DESTDIR}/${DISTDIR}/${dist}.debug.meta | \ ${XZ_CMD} > ${PACKAGEDIR}/${dist}-dbg.txz . else ${_+_}cd ${DESTDIR}/${DISTDIR}/${dist}; \ tar cvLf - usr/lib/debug | \ ${XZ_CMD} > ${PACKAGEDIR}/${dist}-dbg.txz . endif .endfor # # reinstall # # If you have a build server, you can NFS mount the source and obj directories # and do a 'make reinstall' on the *client* to install new binaries from the # most recent server build. # restage reinstall: .MAKE .PHONY @echo "--------------------------------------------------------------" @echo ">>> Making hierarchy" @echo "--------------------------------------------------------------" ${_+_}cd ${.CURDIR}; ${MAKE} -f Makefile.inc1 \ LOCAL_MTREE=${LOCAL_MTREE:Q} hierarchy .if make(restage) @echo "--------------------------------------------------------------" @echo ">>> Making distribution" @echo "--------------------------------------------------------------" ${_+_}cd ${.CURDIR}; ${MAKE} -f Makefile.inc1 \ LOCAL_MTREE=${LOCAL_MTREE:Q} distribution .endif @echo @echo "--------------------------------------------------------------" @echo ">>> Installing everything" @echo "--------------------------------------------------------------" ${_+_}cd ${.CURDIR}; ${MAKE} -f Makefile.inc1 install .if defined(LIBCOMPAT) ${_+_}cd ${.CURDIR}; ${MAKE} -f Makefile.inc1 install${libcompat} .endif redistribute: .MAKE .PHONY @echo "--------------------------------------------------------------" @echo ">>> Distributing everything" @echo "--------------------------------------------------------------" ${_+_}cd ${.CURDIR}; ${MAKE} -f Makefile.inc1 distribute .if defined(LIBCOMPAT) ${_+_}cd ${.CURDIR}; ${MAKE} -f Makefile.inc1 distribute${libcompat} \ DISTRIBUTION=lib${libcompat} .endif distrib-dirs distribution: .MAKE .PHONY ${_+_}cd ${.CURDIR}/etc; ${CROSSENV} PATH=${TMPPATH} ${MAKE} \ ${IMAKE_INSTALL} ${IMAKE_MTREE} METALOG=${METALOG} ${.TARGET} .if make(distribution) ${_+_}cd ${.CURDIR}; ${CROSSENV} PATH=${TMPPATH} \ ${MAKE} -f Makefile.inc1 ${IMAKE_INSTALL} \ METALOG=${METALOG} MK_TESTS=no installconfig .endif # # buildkernel and installkernel # # Which kernels to build and/or install is specified by setting # KERNCONF. If not defined a GENERIC kernel is built/installed. # Only the existing (depending TARGET) config files are used # for building kernels and only the first of these is designated # as the one being installed. # # Note that we have to use TARGET instead of TARGET_ARCH when # we're in kernel-land. Since only TARGET_ARCH is (expected) to # be set to cross-build, we have to make sure TARGET is set # properly. .if defined(KERNFAST) NO_KERNELCLEAN= t NO_KERNELCONFIG= t NO_KERNELOBJ= t # Shortcut for KERNCONF=Blah -DKERNFAST is now KERNFAST=Blah .if !defined(KERNCONF) && ${KERNFAST} != "1" KERNCONF=${KERNFAST} .endif .endif .if ${TARGET_ARCH} == "powerpc64" KERNCONF?= GENERIC64 .else KERNCONF?= GENERIC .endif INSTKERNNAME?= kernel KERNSRCDIR?= ${.CURDIR}/sys KRNLCONFDIR= ${KERNSRCDIR}/${TARGET}/conf KRNLOBJDIR= ${OBJTREE}${KERNSRCDIR} KERNCONFDIR?= ${KRNLCONFDIR} BUILDKERNELS= INSTALLKERNEL= .if defined(NO_INSTALLKERNEL) # All of the BUILDKERNELS loops start at index 1. BUILDKERNELS+= dummy .endif .for _kernel in ${KERNCONF} .if exists(${KERNCONFDIR}/${_kernel}) BUILDKERNELS+= ${_kernel} .if empty(INSTALLKERNEL) && !defined(NO_INSTALLKERNEL) INSTALLKERNEL= ${_kernel} .endif .endif .endfor ${WMAKE_TGTS:N_worldtmp:Nbuild${libcompat}} ${.ALLTARGETS:M_*:N_worldtmp}: .MAKE .PHONY # # buildkernel # # Builds all kernels defined by BUILDKERNELS. # buildkernel: .MAKE .PHONY .if empty(BUILDKERNELS:Ndummy) @echo "ERROR: Missing kernel configuration file(s) (${KERNCONF})."; \ false .endif @echo .for _kernel in ${BUILDKERNELS:Ndummy} @echo "--------------------------------------------------------------" @echo ">>> Kernel build for ${_kernel} started on `LC_ALL=C date`" @echo "--------------------------------------------------------------" @echo "===> ${_kernel}" mkdir -p ${KRNLOBJDIR} .if !defined(NO_KERNELCONFIG) @echo @echo "--------------------------------------------------------------" @echo ">>> stage 1: configuring the kernel" @echo "--------------------------------------------------------------" cd ${KRNLCONFDIR}; \ PATH=${TMPPATH} \ config ${CONFIGARGS} -d ${KRNLOBJDIR}/${_kernel} \ -I '${KERNCONFDIR}' '${KERNCONFDIR}/${_kernel}' .endif .if !defined(NO_CLEAN) && !defined(NO_KERNELCLEAN) @echo @echo "--------------------------------------------------------------" @echo ">>> stage 2.1: cleaning up the object tree" @echo "--------------------------------------------------------------" ${_+_}cd ${KRNLOBJDIR}/${_kernel}; ${KMAKE} ${CLEANDIR} .endif .if !defined(NO_KERNELOBJ) @echo @echo "--------------------------------------------------------------" @echo ">>> stage 2.2: rebuilding the object tree" @echo "--------------------------------------------------------------" ${_+_}cd ${KRNLOBJDIR}/${_kernel}; ${KMAKE} obj .endif @echo @echo "--------------------------------------------------------------" @echo ">>> stage 2.3: build tools" @echo "--------------------------------------------------------------" ${_+_}cd ${.CURDIR}; ${KTMAKE} kernel-tools @echo @echo "--------------------------------------------------------------" @echo ">>> stage 3.1: building everything" @echo "--------------------------------------------------------------" ${_+_}cd ${KRNLOBJDIR}/${_kernel}; ${KMAKE} all -DNO_MODULES_OBJ @echo "--------------------------------------------------------------" @echo ">>> Kernel build for ${_kernel} completed on `LC_ALL=C date`" @echo "--------------------------------------------------------------" .endfor NO_INSTALLEXTRAKERNELS?= yes # # installkernel, etc. # # Install the kernel defined by INSTALLKERNEL # installkernel installkernel.debug \ reinstallkernel reinstallkernel.debug: _installcheck_kernel .PHONY .if !defined(NO_INSTALLKERNEL) .if empty(INSTALLKERNEL) @echo "ERROR: No kernel \"${KERNCONF}\" to install."; \ false .endif @echo "--------------------------------------------------------------" @echo ">>> Installing kernel ${INSTALLKERNEL}" @echo "--------------------------------------------------------------" cd ${KRNLOBJDIR}/${INSTALLKERNEL}; \ ${CROSSENV} PATH=${TMPPATH} \ ${MAKE} ${IMAKE_INSTALL} KERNEL=${INSTKERNNAME} ${.TARGET:S/kernel//} .endif .if ${BUILDKERNELS:[#]} > 1 && ${NO_INSTALLEXTRAKERNELS} != "yes" .for _kernel in ${BUILDKERNELS:[2..-1]} @echo "--------------------------------------------------------------" @echo ">>> Installing kernel ${_kernel}" @echo "--------------------------------------------------------------" cd ${KRNLOBJDIR}/${_kernel}; \ ${CROSSENV} PATH=${TMPPATH} \ ${MAKE} ${IMAKE_INSTALL} KERNEL=${INSTKERNNAME}.${_kernel} ${.TARGET:S/kernel//} .endfor .endif distributekernel distributekernel.debug: .PHONY .if !defined(NO_INSTALLKERNEL) .if empty(INSTALLKERNEL) @echo "ERROR: No kernel \"${KERNCONF}\" to install."; \ false .endif mkdir -p ${DESTDIR}/${DISTDIR} .if defined(NO_ROOT) @echo "#${MTREE_MAGIC}" > ${DESTDIR}/${DISTDIR}/kernel.premeta .endif cd ${KRNLOBJDIR}/${INSTALLKERNEL}; \ ${IMAKEENV} ${IMAKE_INSTALL:S/METALOG/kernel.premeta/} \ ${IMAKE_MTREE} PATH=${TMPPATH} ${MAKE} KERNEL=${INSTKERNNAME} \ DESTDIR=${INSTALL_DDIR}/kernel \ ${.TARGET:S/distributekernel/install/} .if defined(NO_ROOT) @sed -e 's|^./kernel|.|' ${DESTDIR}/${DISTDIR}/kernel.premeta > \ ${DESTDIR}/${DISTDIR}/kernel.meta .endif .endif .if ${BUILDKERNELS:[#]} > 1 && ${NO_INSTALLEXTRAKERNELS} != "yes" .for _kernel in ${BUILDKERNELS:[2..-1]} .if defined(NO_ROOT) @echo "#${MTREE_MAGIC}" > ${DESTDIR}/${DISTDIR}/kernel.${_kernel}.premeta .endif cd ${KRNLOBJDIR}/${_kernel}; \ ${IMAKEENV} ${IMAKE_INSTALL:S/METALOG/kernel.${_kernel}.premeta/} \ ${IMAKE_MTREE} PATH=${TMPPATH} ${MAKE} \ KERNEL=${INSTKERNNAME}.${_kernel} \ DESTDIR=${INSTALL_DDIR}/kernel.${_kernel} \ ${.TARGET:S/distributekernel/install/} .if defined(NO_ROOT) @sed -e "s|^./kernel.${_kernel}|.|" \ ${DESTDIR}/${DISTDIR}/kernel.${_kernel}.premeta > \ ${DESTDIR}/${DISTDIR}/kernel.${_kernel}.meta .endif .endfor .endif packagekernel: .PHONY .if defined(NO_ROOT) .if !defined(NO_INSTALLKERNEL) cd ${DESTDIR}/${DISTDIR}/kernel; \ tar cvf - --exclude '*.debug' \ @${DESTDIR}/${DISTDIR}/kernel.meta | \ ${XZ_CMD} > ${PACKAGEDIR}/kernel.txz .endif cd ${DESTDIR}/${DISTDIR}/kernel; \ tar cvf - --include '*/*/*.debug' \ @${DESTDIR}/${DISTDIR}/kernel.meta | \ ${XZ_CMD} > ${DESTDIR}/${DISTDIR}/kernel-dbg.txz .if ${BUILDKERNELS:[#]} > 1 && ${NO_INSTALLEXTRAKERNELS} != "yes" .for _kernel in ${BUILDKERNELS:[2..-1]} cd ${DESTDIR}/${DISTDIR}/kernel.${_kernel}; \ tar cvf - --exclude '*.debug' \ @${DESTDIR}/${DISTDIR}/kernel.${_kernel}.meta | \ ${XZ_CMD} > ${PACKAGEDIR}/kernel.${_kernel}.txz cd ${DESTDIR}/${DISTDIR}/kernel.${_kernel}; \ tar cvf - --include '*/*/*.debug' \ @${DESTDIR}/${DISTDIR}/kernel.${_kernel}.meta | \ ${XZ_CMD} > ${DESTDIR}/${DISTDIR}/kernel.${_kernel}-dbg.txz .endfor .endif .else .if !defined(NO_INSTALLKERNEL) cd ${DESTDIR}/${DISTDIR}/kernel; \ tar cvf - --exclude '*.debug' . | \ ${XZ_CMD} > ${PACKAGEDIR}/kernel.txz .endif cd ${DESTDIR}/${DISTDIR}/kernel; \ tar cvf - --include '*/*/*.debug' $$(eval find .) | \ ${XZ_CMD} > ${DESTDIR}/${DISTDIR}/kernel-dbg.txz .if ${BUILDKERNELS:[#]} > 1 && ${NO_INSTALLEXTRAKERNELS} != "yes" .for _kernel in ${BUILDKERNELS:[2..-1]} cd ${DESTDIR}/${DISTDIR}/kernel.${_kernel}; \ tar cvf - --exclude '*.debug' . | \ ${XZ_CMD} > ${PACKAGEDIR}/kernel.${_kernel}.txz cd ${DESTDIR}/${DISTDIR}/kernel.${_kernel}; \ tar cvf - --include '*/*/*.debug' $$(eval find .) | \ ${XZ_CMD} > ${DESTDIR}/${DISTDIR}/kernel.${_kernel}-dbg.txz .endfor .endif .endif stagekernel: .PHONY ${_+_}${MAKE} -C ${.CURDIR} ${.MAKEFLAGS} distributekernel PORTSDIR?= /usr/ports WSTAGEDIR?= ${MAKEOBJDIRPREFIX}${.CURDIR}/${TARGET}.${TARGET_ARCH}/worldstage KSTAGEDIR?= ${MAKEOBJDIRPREFIX}${.CURDIR}/${TARGET}.${TARGET_ARCH}/kernelstage REPODIR?= ${MAKEOBJDIRPREFIX}${.CURDIR}/repo PKGSIGNKEY?= # empty .ORDER: stage-packages create-packages .ORDER: create-packages create-world-packages .ORDER: create-packages create-kernel-packages .ORDER: create-packages sign-packages _pkgbootstrap: .PHONY .if !exists(${LOCALBASE}/sbin/pkg) @env ASSUME_ALWAYS_YES=YES pkg bootstrap .endif packages: .PHONY ${_+_}${MAKE} -C ${.CURDIR} PKG_VERSION=${PKG_VERSION} real-packages package-pkg: .PHONY rm -rf /tmp/ports.${TARGET} || : env ${WMAKEENV:Q} SRCDIR=${.CURDIR} PORTSDIR=${PORTSDIR} REVISION=${_REVISION} \ PKG_VERSION=${PKG_VERSION} REPODIR=${REPODIR} WSTAGEDIR=${WSTAGEDIR} \ sh ${.CURDIR}/release/scripts/make-pkg-package.sh real-packages: stage-packages create-packages sign-packages .PHONY stage-packages: .PHONY @mkdir -p ${REPODIR} ${WSTAGEDIR} ${KSTAGEDIR} ${_+_}@cd ${.CURDIR}; \ ${MAKE} DESTDIR=${WSTAGEDIR} -DNO_ROOT -B stageworld ; \ ${MAKE} DESTDIR=${KSTAGEDIR} -DNO_ROOT -B stagekernel create-packages: _pkgbootstrap .PHONY @mkdir -p ${REPODIR} ${_+_}@cd ${.CURDIR}; \ ${MAKE} DESTDIR=${WSTAGEDIR} \ PKG_VERSION=${PKG_VERSION} create-world-packages ; \ ${MAKE} DESTDIR=${KSTAGEDIR} \ PKG_VERSION=${PKG_VERSION} DISTDIR=kernel \ create-kernel-packages create-world-packages: _pkgbootstrap .PHONY @rm -f ${WSTAGEDIR}/*.plist 2>/dev/null || : @cd ${WSTAGEDIR} ; \ awk -f ${SRCDIR}/release/scripts/mtree-to-plist.awk \ ${WSTAGEDIR}/METALOG @for plist in ${WSTAGEDIR}/*.plist; do \ plist=$${plist##*/} ; \ pkgname=$${plist%.plist} ; \ sh ${SRCDIR}/release/packages/generate-ucl.sh -o $${pkgname} \ -s ${SRCDIR} -u ${WSTAGEDIR}/$${pkgname}.ucl ; \ done @for plist in ${WSTAGEDIR}/*.plist; do \ plist=$${plist##*/} ; \ pkgname=$${plist%.plist} ; \ awk -F\" ' \ /^name/ { printf("===> Creating %s-", $$2); next } \ /^version/ { print $$2; next } \ ' ${WSTAGEDIR}/$${pkgname}.ucl ; \ pkg -o ABI_FILE=${WSTAGEDIR}/bin/sh -o ALLOW_BASE_SHLIBS=yes \ create -M ${WSTAGEDIR}/$${pkgname}.ucl \ -p ${WSTAGEDIR}/$${pkgname}.plist \ -r ${WSTAGEDIR} \ -o ${REPODIR}/$$(pkg -o ABI_FILE=${WSTAGEDIR}/bin/sh config ABI)/${PKG_VERSION} ; \ done create-kernel-packages: _pkgbootstrap .PHONY .if exists(${KSTAGEDIR}/kernel.meta) .for flavor in "" -debug @cd ${KSTAGEDIR}/${DISTDIR} ; \ awk -f ${SRCDIR}/release/scripts/mtree-to-plist.awk \ -v kernel=yes -v _kernconf=${INSTALLKERNEL} \ ${KSTAGEDIR}/kernel.meta ; \ cap_arg=`cd ${SRCDIR}/etc ; ${MAKE} -VCAP_MKDB_ENDIAN` ; \ pwd_arg=`cd ${SRCDIR}/etc ; ${MAKE} -VPWD_MKDB_ENDIAN` ; \ sed -e "s/%VERSION%/${PKG_VERSION}/" \ -e "s/%PKGNAME%/kernel-${INSTALLKERNEL:tl}${flavor}/" \ -e "s/%COMMENT%/FreeBSD ${INSTALLKERNEL} kernel ${flavor}/" \ -e "s/%DESC%/FreeBSD ${INSTALLKERNEL} kernel ${flavor}/" \ -e "s/%CAP_MKDB_ENDIAN%/$${cap_arg}/g" \ -e "s/%PWD_MKDB_ENDIAN%/$${pwd_arg}/g" \ ${SRCDIR}/release/packages/kernel.ucl \ > ${KSTAGEDIR}/${DISTDIR}/kernel.${INSTALLKERNEL}${flavor}.ucl ; \ awk -F\" ' \ /name/ { printf("===> Creating %s-", $$2); next } \ /version/ {print $$2; next } ' \ ${KSTAGEDIR}/${DISTDIR}/kernel.${INSTALLKERNEL}${flavor}.ucl ; \ pkg -o ABI_FILE=${WSTAGEDIR}/bin/sh -o ALLOW_BASE_SHLIBS=yes \ create -M ${KSTAGEDIR}/${DISTDIR}/kernel.${INSTALLKERNEL}${flavor}.ucl \ -p ${KSTAGEDIR}/${DISTDIR}/kernel.${INSTALLKERNEL}${flavor}.plist \ -r ${KSTAGEDIR}/${DISTDIR} \ -o ${REPODIR}/$$(pkg -o ABI_FILE=${WSTAGEDIR}/bin/sh config ABI)/${PKG_VERSION} .endfor .endif .if ${BUILDKERNELS:[#]} > 1 && ${NO_INSTALLEXTRAKERNELS} != "yes" .for _kernel in ${BUILDKERNELS:[2..-1]} .if exists(${KSTAGEDIR}/kernel.${_kernel}.meta) .for flavor in "" -debug @cd ${KSTAGEDIR}/kernel.${_kernel} ; \ awk -f ${SRCDIR}/release/scripts/mtree-to-plist.awk \ -v kernel=yes -v _kernconf=${_kernel} \ ${KSTAGEDIR}/kernel.${_kernel}.meta ; \ cap_arg=`cd ${SRCDIR}/etc ; ${MAKE} -VCAP_MKDB_ENDIAN` ; \ pwd_arg=`cd ${SRCDIR}/etc ; ${MAKE} -VPWD_MKDB_ENDIAN` ; \ sed -e "s/%VERSION%/${PKG_VERSION}/" \ -e "s/%PKGNAME%/kernel-${_kernel:tl}${flavor}/" \ -e "s/%COMMENT%/FreeBSD ${_kernel} kernel ${flavor}/" \ -e "s/%DESC%/FreeBSD ${_kernel} kernel ${flavor}/" \ -e "s/%CAP_MKDB_ENDIAN%/$${cap_arg}/g" \ -e "s/%PWD_MKDB_ENDIAN%/$${pwd_arg}/g" \ ${SRCDIR}/release/packages/kernel.ucl \ > ${KSTAGEDIR}/kernel.${_kernel}/kernel.${_kernel}${flavor}.ucl ; \ awk -F\" ' \ /name/ { printf("===> Creating %s-", $$2); next } \ /version/ {print $$2; next } ' \ ${KSTAGEDIR}/kernel.${_kernel}/kernel.${_kernel}${flavor}.ucl ; \ pkg -o ABI_FILE=${WSTAGEDIR}/bin/sh -o ALLOW_BASE_SHLIBS=yes \ create -M ${KSTAGEDIR}/kernel.${_kernel}/kernel.${_kernel}${flavor}.ucl \ -p ${KSTAGEDIR}/kernel.${_kernel}/kernel.${_kernel}${flavor}.plist \ -r ${KSTAGEDIR}/kernel.${_kernel} \ -o ${REPODIR}/$$(pkg -o ABI_FILE=${WSTAGEDIR}/bin/sh config ABI)/${PKG_VERSION} .endfor .endif .endfor .endif sign-packages: _pkgbootstrap .PHONY @[ -L "${REPODIR}/$$(pkg -o ABI_FILE=${WSTAGEDIR}/bin/sh config ABI)/latest" ] && \ unlink ${REPODIR}/$$(pkg -o ABI_FILE=${WSTAGEDIR}/bin/sh config ABI)/latest ; \ pkg -o ABI_FILE=${WSTAGEDIR}/bin/sh repo \ -o ${REPODIR}/$$(pkg -o ABI_FILE=${WSTAGEDIR}/bin/sh config ABI)/${PKG_VERSION} \ ${REPODIR}/$$(pkg -o ABI_FILE=${WSTAGEDIR}/bin/sh config ABI)/${PKG_VERSION} \ ${PKGSIGNKEY} ; \ ln -s ${REPODIR}/$$(pkg -o ABI_FILE=${WSTAGEDIR}/bin/sh config ABI)/${PKG_VERSION} \ ${REPODIR}/$$(pkg -o ABI_FILE=${WSTAGEDIR}/bin/sh config ABI)/latest # # # checkworld # # Run test suite on installed world. # checkworld: .PHONY @if [ ! -x ${LOCALBASE}/bin/kyua ]; then \ echo "You need kyua (devel/kyua) to run the test suite." | /usr/bin/fmt; \ exit 1; \ fi ${_+_}${LOCALBASE}/bin/kyua test -k ${TESTSBASE}/Kyuafile # # # doxygen # # Build the API documentation with doxygen # doxygen: .PHONY @if [ ! -x ${LOCALBASE}/bin/doxygen ]; then \ echo "You need doxygen (devel/doxygen) to generate the API documentation of the kernel." | /usr/bin/fmt; \ exit 1; \ fi ${_+_}cd ${.CURDIR}/tools/kerneldoc/subsys; ${MAKE} obj all # # update # # Update the source tree(s), by running svn/svnup to update to the # latest copy. # update: .PHONY .if defined(SVN_UPDATE) @echo "--------------------------------------------------------------" @echo ">>> Updating ${.CURDIR} using Subversion" @echo "--------------------------------------------------------------" @(cd ${.CURDIR}; ${SVN} update ${SVNFLAGS}) .endif # # ------------------------------------------------------------------------ # # From here onwards are utility targets used by the 'make world' and # related targets. If your 'world' breaks, you may like to try to fix # the problem and manually run the following targets to attempt to # complete the build. Beware, this is *not* guaranteed to work, you # need to have a pretty good grip on the current state of the system # to attempt to manually finish it. If in doubt, 'make world' again. # # # legacy: Build compatibility shims for the next three targets. This is a # minimal set of tools and shims necessary to compensate for older systems # which don't have the APIs required by the targets built in bootstrap-tools, # build-tools or cross-tools. # # ELF Tool Chain libraries are needed for ELF tools and dtrace tools. # r296685 fix cross-endian objcopy .if ${BOOTSTRAPPING} < 1100102 _elftoolchain_libs= lib/libelf lib/libdwarf .endif legacy: .PHONY .if ${BOOTSTRAPPING} < ${MINIMUM_SUPPORTED_OSREL} && ${BOOTSTRAPPING} != 0 @echo "ERROR: Source upgrades from versions prior to ${MINIMUM_SUPPORTED_REL} are not supported."; \ false .endif .for _tool in tools/build ${_elftoolchain_libs} ${_+_}@${ECHODIR} "===> ${_tool} (obj,includes,all,install)"; \ cd ${.CURDIR}/${_tool}; \ ${MAKE} DIRPRFX=${_tool}/ obj; \ ${MAKE} DIRPRFX=${_tool}/ DESTDIR=${MAKEOBJDIRPREFIX}/legacy includes; \ ${MAKE} DIRPRFX=${_tool}/ MK_INCLUDES=no all; \ ${MAKE} DIRPRFX=${_tool}/ MK_INCLUDES=no \ DESTDIR=${MAKEOBJDIRPREFIX}/legacy install .endfor # # bootstrap-tools: Build tools needed for compatibility. These are binaries that # are built to build other binaries in the system. However, the focus of these # binaries is usually quite narrow. Bootstrap tools use the host's compiler and # libraries, augmented by -legacy. # _bt= _bootstrap-tools .if ${MK_GAMES} != "no" _strfile= usr.bin/fortune/strfile .endif .if ${MK_GCC} != "no" && ${MK_CXX} != "no" _gperf= gnu/usr.bin/gperf .endif .if ${MK_GROFF} != "no" _groff= gnu/usr.bin/groff \ usr.bin/soelim .endif .if ${MK_VT} != "no" _vtfontcvt= usr.bin/vtfontcvt .endif .if ${BOOTSTRAPPING} < 900002 _sed= usr.bin/sed .endif .if ${BOOTSTRAPPING} < 1000033 _libopenbsd= lib/libopenbsd _m4= usr.bin/m4 _lex= usr.bin/lex ${_bt}-usr.bin/m4: ${_bt}-lib/libopenbsd ${_bt}-usr.bin/lex: ${_bt}-usr.bin/m4 .endif .if ${BOOTSTRAPPING} < 1000026 _nmtree= lib/libnetbsd \ usr.sbin/nmtree ${_bt}-usr.sbin/nmtree: ${_bt}-lib/libnetbsd .endif .if ${BOOTSTRAPPING} < 1000027 _cat= bin/cat .endif # r264059 support for status= .if ${BOOTSTRAPPING} < 1100017 _dd= bin/dd .endif # r277259 crunchide: Correct 64-bit section header offset # r281674 crunchide: always include both 32- and 64-bit ELF support .if ${BOOTSTRAPPING} < 1100078 _crunchide= usr.sbin/crunch/crunchide .endif # r285986 crunchen: use STRIPBIN rather than STRIP # 1100113: Support MK_AUTO_OBJ .if ${BOOTSTRAPPING} < 1100078 || \ (${MK_AUTO_OBJ} == "yes" && ${BOOTSTRAPPING} < 1100114) _crunchgen= usr.sbin/crunch/crunchgen .endif .if ${BOOTSTRAPPING} >= 900040 && ${BOOTSTRAPPING} < 900041 _awk= usr.bin/awk .endif # r296926 -P keymap search path, MFC to stable/10 in r298297 .if ${BOOTSTRAPPING} < 1003501 || \ (${BOOTSTRAPPING} >= 1100000 && ${BOOTSTRAPPING} < 1100103) _kbdcontrol= usr.sbin/kbdcontrol .endif _yacc= lib/liby \ usr.bin/yacc ${_bt}-usr.bin/yacc: ${_bt}-lib/liby .if ${MK_BSNMP} != "no" _gensnmptree= usr.sbin/bsnmpd/gensnmptree .endif # We need to build tblgen when we're building clang either as # the bootstrap compiler, or as the part of the normal build. .if ${MK_CLANG_BOOTSTRAP} != "no" || ${MK_CLANG} != "no" _clang_tblgen= \ lib/clang/libllvmsupport \ lib/clang/libllvmtablegen \ usr.bin/clang/llvm-tblgen \ usr.bin/clang/clang-tblgen ${_bt}-usr.bin/clang/clang-tblgen: ${_bt}-lib/clang/libllvmtablegen ${_bt}-lib/clang/libllvmsupport ${_bt}-usr.bin/clang/llvm-tblgen: ${_bt}-lib/clang/libllvmtablegen ${_bt}-lib/clang/libllvmsupport .endif # Default to building the GPL DTC, but build the BSDL one if users explicitly # request it. _dtc= usr.bin/dtc .if ${MK_GPL_DTC} != "no" _dtc= gnu/usr.bin/dtc .endif .if ${MK_KERBEROS} != "no" _kerberos5_bootstrap_tools= \ kerberos5/tools/make-roken \ kerberos5/lib/libroken \ kerberos5/lib/libvers \ kerberos5/tools/asn1_compile \ kerberos5/tools/slc \ usr.bin/compile_et .ORDER: ${_kerberos5_bootstrap_tools:C/^/${_bt}-/g} .endif # r283777 makewhatis(1) replaced with mandoc version which builds a database. .if ${MK_MANDOCDB} != "no" && ${BOOTSTRAPPING} < 1100075 _libopenbsd?= lib/libopenbsd _makewhatis= lib/libsqlite3 \ usr.bin/mandoc ${_bt}-usr.bin/mandoc: ${_bt}-lib/libopenbsd ${_bt}-lib/libsqlite3 .endif bootstrap-tools: .PHONY # Please document (add comment) why something is in 'bootstrap-tools'. # Try to bound the building of the bootstrap-tool to just the # FreeBSD versions that need the tool built at this stage of the build. .for _tool in \ ${_clang_tblgen} \ ${_kerberos5_bootstrap_tools} \ ${_strfile} \ ${_gperf} \ ${_groff} \ ${_dtc} \ ${_awk} \ ${_cat} \ ${_dd} \ ${_kbdcontrol} \ usr.bin/lorder \ ${_libopenbsd} \ ${_makewhatis} \ usr.bin/rpcgen \ ${_sed} \ ${_yacc} \ ${_m4} \ ${_lex} \ usr.bin/xinstall \ ${_gensnmptree} \ usr.sbin/config \ ${_crunchide} \ ${_crunchgen} \ ${_nmtree} \ ${_vtfontcvt} \ usr.bin/localedef ${_bt}-${_tool}: .PHONY .MAKE ${_+_}@${ECHODIR} "===> ${_tool} (obj,all,install)"; \ cd ${.CURDIR}/${_tool}; \ ${MAKE} DIRPRFX=${_tool}/ obj; \ ${MAKE} DIRPRFX=${_tool}/ all; \ ${MAKE} DIRPRFX=${_tool}/ DESTDIR=${MAKEOBJDIRPREFIX}/legacy install bootstrap-tools: ${_bt}-${_tool} .endfor # # build-tools: Build special purpose build tools # .if !defined(NO_SHARE) _share= share/syscons/scrnmaps .endif .if ${MK_GCC} != "no" _gcc_tools= gnu/usr.bin/cc/cc_tools .endif .if ${MK_RESCUE} != "no" # rescue includes programs that have build-tools targets _rescue=rescue/rescue .endif .for _tool in \ bin/csh \ bin/sh \ ${LOCAL_TOOL_DIRS} \ lib/ncurses/ncurses \ lib/ncurses/ncursesw \ ${_rescue} \ ${_share} \ usr.bin/awk \ lib/libmagic \ usr.bin/mkesdb_static \ usr.bin/mkcsmapper_static \ usr.bin/vi/catalog build-tools_${_tool}: .PHONY ${_+_}@${ECHODIR} "===> ${_tool} (obj,build-tools)"; \ cd ${.CURDIR}/${_tool}; \ ${MAKE} DIRPRFX=${_tool}/ obj; \ ${MAKE} DIRPRFX=${_tool}/ build-tools build-tools: build-tools_${_tool} .endfor .for _tool in \ ${_gcc_tools} build-tools_${_tool}: .PHONY ${_+_}@${ECHODIR} "===> ${_tool} (obj,all)"; \ cd ${.CURDIR}/${_tool}; \ ${MAKE} DIRPRFX=${_tool}/ obj; \ ${MAKE} DIRPRFX=${_tool}/ all build-tools: build-tools_${_tool} .endfor # # kernel-tools: Build kernel-building tools # kernel-tools: .PHONY mkdir -p ${MAKEOBJDIRPREFIX}/usr mtree -deU -f ${.CURDIR}/etc/mtree/BSD.usr.dist \ -p ${MAKEOBJDIRPREFIX}/usr >/dev/null # # cross-tools: All the tools needed to build the rest of the system after # we get done with the earlier stages. It is the last set of tools needed # to begin building the target binaries. # .if ${TARGET_ARCH} != ${MACHINE_ARCH} .if ${TARGET_ARCH} == "amd64" || ${TARGET_ARCH} == "i386" _btxld= usr.sbin/btxld .endif .endif # Rebuild ctfconvert and ctfmerge to avoid difficult-to-diagnose failures # resulting from missing bug fixes or ELF Toolchain updates. .if ${MK_CDDL} != "no" _dtrace_tools= cddl/lib/libctf cddl/usr.bin/ctfconvert \ cddl/usr.bin/ctfmerge .endif # If we're given an XAS, don't build binutils. .if ${XAS:M/*} == "" .if ${MK_BINUTILS_BOOTSTRAP} != "no" _binutils= gnu/usr.bin/binutils .endif .if ${MK_ELFTOOLCHAIN_BOOTSTRAP} != "no" _elftctools= lib/libelftc \ lib/libpe \ usr.bin/elfcopy \ usr.bin/nm \ usr.bin/size \ usr.bin/strings # These are not required by the build, but can be useful for developers who # cross-build on a FreeBSD 10 host: _elftctools+= usr.bin/addr2line .endif .elif ${TARGET_ARCH} != ${MACHINE_ARCH} && ${MK_ELFTOOLCHAIN_BOOTSTRAP} != "no" # If cross-building with an external binutils we still need to build strip for # the target (for at least crunchide). _elftctools= lib/libelftc \ lib/libpe \ usr.bin/elfcopy .endif .if ${MK_CROSS_COMPILER} != "no" .if ${MK_CLANG_BOOTSTRAP} != "no" _clang= usr.bin/clang _clang_libs= lib/clang .endif .if ${MK_GCC_BOOTSTRAP} != "no" _cc= gnu/usr.bin/cc .endif .endif .if ${MK_USB} != "no" _usb_tools= sys/boot/usb/tools .endif cross-tools: .MAKE .PHONY .for _tool in \ + ${LOCAL_XTOOL_DIRS} \ ${_clang_libs} \ ${_clang} \ ${_binutils} \ ${_elftctools} \ ${_dtrace_tools} \ ${_cc} \ ${_btxld} \ ${_usb_tools} ${_+_}@${ECHODIR} "===> ${_tool} (obj,all,install)"; \ cd ${.CURDIR}/${_tool}; \ ${MAKE} DIRPRFX=${_tool}/ obj; \ ${MAKE} DIRPRFX=${_tool}/ all; \ ${MAKE} DIRPRFX=${_tool}/ DESTDIR=${MAKEOBJDIRPREFIX} install .endfor NXBDESTDIR= ${OBJTREE}/nxb-bin NXBENV= MAKEOBJDIRPREFIX=${OBJTREE}/nxb \ INSTALL="sh ${.CURDIR}/tools/install.sh" \ PATH=${PATH}:${OBJTREE}/gperf_for_gcc/usr/bin NXBMAKE= ${NXBENV} ${MAKE} \ LLVM_TBLGEN=${NXBDESTDIR}/usr/bin/llvm-tblgen \ CLANG_TBLGEN=${NXBDESTDIR}/usr/bin/clang-tblgen \ MACHINE=${TARGET} MACHINE_ARCH=${TARGET_ARCH} \ MK_GDB=no MK_TESTS=no \ SSP_CFLAGS= \ MK_HTML=no NO_LINT=yes MK_MAN=no \ -DNO_PIC MK_PROFILE=no -DNO_SHARED \ -DNO_CPU_CFLAGS MK_WARNS=no MK_CTF=no \ MK_CLANG_EXTRAS=no MK_CLANG_FULL=no \ MK_LLDB=no MK_DEBUG_FILES=no # native-xtools is the current target for qemu-user cross builds of ports # via poudriere and the imgact_binmisc kernel module. # For non-clang enabled targets that are still using the in tree gcc # we must build a gperf binary for one instance of its Makefiles. On # clang-enabled systems, the gperf binary is obsolete. native-xtools: .PHONY .if ${MK_GCC_BOOTSTRAP} != "no" mkdir -p ${OBJTREE}/gperf_for_gcc/usr/bin ${_+_}@${ECHODIR} "===> ${_gperf} (obj,all,install)"; \ cd ${.CURDIR}/${_gperf}; \ ${NXBMAKE} DIRPRFX=${_gperf}/ obj; \ ${NXBMAKE} DIRPRFX=${_gperf}/ all; \ ${NXBMAKE} DIRPRFX=${_gperf}/ DESTDIR=${OBJTREE}/gperf_for_gcc install .endif mkdir -p ${NXBDESTDIR}/bin ${NXBDESTDIR}/sbin ${NXBDESTDIR}/usr mtree -deU -f ${.CURDIR}/etc/mtree/BSD.usr.dist \ -p ${NXBDESTDIR}/usr >/dev/null mtree -deU -f ${.CURDIR}/etc/mtree/BSD.include.dist \ -p ${NXBDESTDIR}/usr/include >/dev/null .if ${MK_DEBUG_FILES} != "no" mtree -deU -f ${.CURDIR}/etc/mtree/BSD.debug.dist \ -p ${NXBDESTDIR}/usr/lib >/dev/null .endif .for _tool in \ bin/cat \ bin/chmod \ bin/cp \ bin/csh \ bin/echo \ bin/expr \ bin/hostname \ bin/ln \ bin/ls \ bin/mkdir \ bin/mv \ bin/ps \ bin/realpath \ bin/rm \ bin/rmdir \ bin/sh \ bin/sleep \ ${_clang_tblgen} \ usr.bin/ar \ ${_binutils} \ ${_elftctools} \ ${_cc} \ ${_gcc_tools} \ ${_clang_libs} \ ${_clang} \ sbin/md5 \ sbin/sysctl \ gnu/usr.bin/diff \ usr.bin/awk \ usr.bin/basename \ usr.bin/bmake \ usr.bin/bzip2 \ usr.bin/cmp \ usr.bin/dirname \ usr.bin/env \ usr.bin/fetch \ usr.bin/find \ usr.bin/grep \ usr.bin/gzip \ usr.bin/id \ usr.bin/lex \ usr.bin/lorder \ usr.bin/mktemp \ usr.bin/mt \ usr.bin/patch \ usr.bin/sed \ usr.bin/sort \ usr.bin/tar \ usr.bin/touch \ usr.bin/tr \ usr.bin/true \ usr.bin/uniq \ usr.bin/unzip \ usr.bin/xargs \ usr.bin/xinstall \ usr.bin/xz \ usr.bin/yacc \ usr.sbin/chown ${_+_}@${ECHODIR} "===> ${_tool} (obj,all,install)"; \ cd ${.CURDIR}/${_tool}; \ ${NXBMAKE} DIRPRFX=${_tool}/ obj; \ ${NXBMAKE} DIRPRFX=${_tool}/ all; \ ${NXBMAKE} DIRPRFX=${_tool}/ DESTDIR=${NXBDESTDIR} install .endfor # # hierarchy - ensure that all the needed directories are present # hierarchy hier: .MAKE .PHONY ${_+_}cd ${.CURDIR}/etc; ${HMAKE} distrib-dirs # # libraries - build all libraries, and install them under ${DESTDIR}. # # The list of libraries with dependents (${_prebuild_libs}) and their # interdependencies (__L) are built automatically by the # ${.CURDIR}/tools/make_libdeps.sh script. # libraries: .MAKE .PHONY ${_+_}cd ${.CURDIR}; \ ${MAKE} -f Makefile.inc1 _prereq_libs; \ ${MAKE} -f Makefile.inc1 _startup_libs; \ ${MAKE} -f Makefile.inc1 _prebuild_libs; \ ${MAKE} -f Makefile.inc1 _generic_libs # # static libgcc.a prerequisite for shared libc # _prereq_libs= gnu/lib/libssp/libssp_nonshared gnu/lib/libgcc lib/libcompiler_rt # These dependencies are not automatically generated: # # gnu/lib/csu, gnu/lib/libgcc, lib/csu and lib/libc must be built before # all shared libraries for ELF. # _startup_libs= gnu/lib/csu _startup_libs+= lib/csu _startup_libs+= gnu/lib/libgcc _startup_libs+= lib/libcompiler_rt _startup_libs+= lib/libc _startup_libs+= lib/libc_nonshared .if ${MK_LIBCPLUSPLUS} != "no" _startup_libs+= lib/libcxxrt .endif gnu/lib/libgcc__L: lib/libc__L gnu/lib/libgcc__L: lib/libc_nonshared__L .if ${MK_LIBCPLUSPLUS} != "no" lib/libcxxrt__L: gnu/lib/libgcc__L .endif _prebuild_libs= ${_kerberos5_lib_libasn1} \ ${_kerberos5_lib_libhdb} \ ${_kerberos5_lib_libheimbase} \ ${_kerberos5_lib_libheimntlm} \ ${_libsqlite3} \ ${_kerberos5_lib_libheimipcc} \ ${_kerberos5_lib_libhx509} ${_kerberos5_lib_libkrb5} \ ${_kerberos5_lib_libroken} \ ${_kerberos5_lib_libwind} \ lib/libbz2 ${_libcom_err} lib/libcrypt \ lib/libelf lib/libexpat \ lib/libfigpar \ ${_lib_libgssapi} \ lib/libkiconv lib/libkvm lib/liblzma lib/libmd lib/libnv \ ${_lib_casper} \ lib/ncurses/ncurses lib/ncurses/ncursesw \ lib/libopie lib/libpam/libpam ${_lib_libthr} \ ${_lib_libradius} lib/libsbuf lib/libtacplus \ lib/libgeom \ ${_cddl_lib_libumem} ${_cddl_lib_libnvpair} \ ${_cddl_lib_libuutil} \ ${_cddl_lib_libavl} \ ${_cddl_lib_libzfs_core} \ ${_cddl_lib_libctf} \ lib/libutil lib/libpjdlog ${_lib_libypclnt} lib/libz lib/msun \ ${_secure_lib_libcrypto} ${_lib_libldns} \ ${_secure_lib_libssh} ${_secure_lib_libssl} \ gnu/lib/libdialog .if ${MK_GNUCXX} != "no" _prebuild_libs+= gnu/lib/libstdc++ gnu/lib/libsupc++ gnu/lib/libstdc++__L: lib/msun__L gnu/lib/libsupc++__L: gnu/lib/libstdc++__L .endif .if ${MK_LIBCPLUSPLUS} != "no" _prebuild_libs+= lib/libc++ .endif lib/libgeom__L: lib/libexpat__L lib/libkvm__L: lib/libelf__L .if ${MK_LIBTHR} != "no" _lib_libthr= lib/libthr .endif .if ${MK_RADIUS_SUPPORT} != "no" _lib_libradius= lib/libradius .endif .if ${MK_OFED} != "no" _ofed_lib= contrib/ofed/usr.lib _prebuild_libs+= contrib/ofed/usr.lib/libosmcomp _prebuild_libs+= contrib/ofed/usr.lib/libopensm _prebuild_libs+= contrib/ofed/usr.lib/libibcommon _prebuild_libs+= contrib/ofed/usr.lib/libibverbs _prebuild_libs+= contrib/ofed/usr.lib/libibumad contrib/ofed/usr.lib/libopensm__L: lib/libthr__L contrib/ofed/usr.lib/libosmcomp__L: lib/libthr__L contrib/ofed/usr.lib/libibumad__L: contrib/ofed/usr.lib/libibcommon__L .endif .if ${MK_CASPER} != "no" _lib_casper= lib/libcasper .endif lib/libpjdlog__L: lib/libutil__L lib/libcasper__L: lib/libnv__L lib/liblzma__L: lib/libthr__L _generic_libs= ${_cddl_lib} gnu/lib ${_kerberos5_lib} lib ${_secure_lib} usr.bin/lex/lib ${_ofed_lib} .for _DIR in ${LOCAL_LIB_DIRS} .if exists(${.CURDIR}/${_DIR}/Makefile) && empty(_generic_libs:M${_DIR}) _generic_libs+= ${_DIR} .endif .endfor lib/libopie__L lib/libtacplus__L: lib/libmd__L .if ${MK_CDDL} != "no" _cddl_lib_libumem= cddl/lib/libumem _cddl_lib_libnvpair= cddl/lib/libnvpair _cddl_lib_libavl= cddl/lib/libavl _cddl_lib_libuutil= cddl/lib/libuutil _cddl_lib_libzfs_core= cddl/lib/libzfs_core _cddl_lib_libctf= cddl/lib/libctf _cddl_lib= cddl/lib cddl/lib/libzfs_core__L: cddl/lib/libnvpair__L cddl/lib/libzfs__L: lib/libgeom__L cddl/lib/libctf__L: lib/libz__L .endif # cddl/lib/libdtrace requires lib/libproc and lib/librtld_db; it's only built # on select architectures though (see cddl/lib/Makefile) .if ${MACHINE_CPUARCH} != "sparc64" _prebuild_libs+= lib/libproc lib/librtld_db .endif .if ${MK_CRYPT} != "no" .if ${MK_OPENSSL} != "no" _secure_lib_libcrypto= secure/lib/libcrypto _secure_lib_libssl= secure/lib/libssl lib/libradius__L secure/lib/libssl__L: secure/lib/libcrypto__L .if ${MK_LDNS} != "no" _lib_libldns= lib/libldns lib/libldns__L: secure/lib/libcrypto__L .endif .if ${MK_OPENSSH} != "no" _secure_lib_libssh= secure/lib/libssh secure/lib/libssh__L: lib/libz__L secure/lib/libcrypto__L lib/libcrypt__L .if ${MK_LDNS} != "no" secure/lib/libssh__L: lib/libldns__L .endif .if ${MK_KERBEROS_SUPPORT} != "no" secure/lib/libssh__L: lib/libgssapi__L kerberos5/lib/libkrb5__L \ kerberos5/lib/libhx509__L kerberos5/lib/libasn1__L lib/libcom_err__L \ lib/libmd__L kerberos5/lib/libroken__L .endif .endif .endif _secure_lib= secure/lib .endif .if ${MK_KERBEROS} != "no" kerberos5/lib/libasn1__L: lib/libcom_err__L kerberos5/lib/libroken__L kerberos5/lib/libhdb__L: kerberos5/lib/libasn1__L lib/libcom_err__L \ kerberos5/lib/libkrb5__L kerberos5/lib/libroken__L \ kerberos5/lib/libwind__L lib/libsqlite3__L kerberos5/lib/libheimntlm__L: secure/lib/libcrypto__L kerberos5/lib/libkrb5__L \ kerberos5/lib/libroken__L lib/libcom_err__L kerberos5/lib/libhx509__L: kerberos5/lib/libasn1__L lib/libcom_err__L \ secure/lib/libcrypto__L kerberos5/lib/libroken__L kerberos5/lib/libwind__L kerberos5/lib/libkrb5__L: kerberos5/lib/libasn1__L lib/libcom_err__L \ lib/libcrypt__L secure/lib/libcrypto__L kerberos5/lib/libhx509__L \ kerberos5/lib/libroken__L kerberos5/lib/libwind__L \ kerberos5/lib/libheimbase__L kerberos5/lib/libheimipcc__L kerberos5/lib/libroken__L: lib/libcrypt__L kerberos5/lib/libwind__L: kerberos5/lib/libroken__L lib/libcom_err__L kerberos5/lib/libheimbase__L: lib/libthr__L kerberos5/lib/libheimipcc__L: kerberos5/lib/libroken__L kerberos5/lib/libheimbase__L lib/libthr__L .endif lib/libsqlite3__L: lib/libthr__L .if ${MK_GSSAPI} != "no" _lib_libgssapi= lib/libgssapi .endif .if ${MK_KERBEROS} != "no" _kerberos5_lib= kerberos5/lib _kerberos5_lib_libasn1= kerberos5/lib/libasn1 _kerberos5_lib_libhdb= kerberos5/lib/libhdb _kerberos5_lib_libheimbase= kerberos5/lib/libheimbase _kerberos5_lib_libkrb5= kerberos5/lib/libkrb5 _kerberos5_lib_libhx509= kerberos5/lib/libhx509 _kerberos5_lib_libroken= kerberos5/lib/libroken _kerberos5_lib_libheimntlm= kerberos5/lib/libheimntlm _libsqlite3= lib/libsqlite3 _kerberos5_lib_libheimipcc= kerberos5/lib/libheimipcc _kerberos5_lib_libwind= kerberos5/lib/libwind _libcom_err= lib/libcom_err .endif .if ${MK_NIS} != "no" _lib_libypclnt= lib/libypclnt .endif .if ${MK_OPENSSL} == "no" lib/libradius__L: lib/libmd__L .endif lib/libproc__L: \ ${_cddl_lib_libctf:D${_cddl_lib_libctf}__L} lib/libelf__L lib/librtld_db__L lib/libutil__L .if ${MK_CXX} != "no" .if ${MK_LIBCPLUSPLUS} != "no" lib/libproc__L: lib/libcxxrt__L .else # This implies MK_GNUCXX != "no"; see lib/libproc lib/libproc__L: gnu/lib/libsupc++__L .endif .endif gnu/lib/libdialog__L: lib/msun__L lib/ncurses/ncursesw__L .for _lib in ${_prereq_libs} ${_lib}__PL: .PHONY .MAKE .if exists(${.CURDIR}/${_lib}) ${_+_}@${ECHODIR} "===> ${_lib} (obj,all,install)"; \ cd ${.CURDIR}/${_lib}; \ ${MAKE} MK_TESTS=no DIRPRFX=${_lib}/ obj; \ ${MAKE} MK_TESTS=no MK_PROFILE=no -DNO_PIC \ DIRPRFX=${_lib}/ all; \ ${MAKE} MK_TESTS=no MK_PROFILE=no -DNO_PIC \ DIRPRFX=${_lib}/ install .endif .endfor .for _lib in ${_startup_libs} ${_prebuild_libs} ${_generic_libs} ${_lib}__L: .PHONY .MAKE .if exists(${.CURDIR}/${_lib}) ${_+_}@${ECHODIR} "===> ${_lib} (obj,all,install)"; \ cd ${.CURDIR}/${_lib}; \ ${MAKE} MK_TESTS=no DIRPRFX=${_lib}/ obj; \ ${MAKE} MK_TESTS=no DIRPRFX=${_lib}/ all; \ ${MAKE} MK_TESTS=no DIRPRFX=${_lib}/ install .endif .endfor _prereq_libs: ${_prereq_libs:S/$/__PL/} _startup_libs: ${_startup_libs:S/$/__L/} _prebuild_libs: ${_prebuild_libs:S/$/__L/} _generic_libs: ${_generic_libs:S/$/__L/} # Enable SUBDIR_PARALLEL when not calling 'make all', unless called from # 'everything' with _PARALLEL_SUBDIR_OK set. This is because it is unlikely # that running 'make all' from the top-level, especially with a SUBDIR_OVERRIDE # or LOCAL_DIRS set, will have a reliable build if SUBDIRs are built in # parallel. This is safe for the world stage of buildworld though since it has # already built libraries in a proper order and installed includes into # WORLDTMP. Special handling is done for SUBDIR ordering for 'install*' to # avoid trashing a system if it crashes mid-install. .if !make(all) || defined(_PARALLEL_SUBDIR_OK) SUBDIR_PARALLEL= .endif .include .if make(check-old) || make(check-old-dirs) || \ make(check-old-files) || make(check-old-libs) || \ make(delete-old) || make(delete-old-dirs) || \ make(delete-old-files) || make(delete-old-libs) # # check for / delete old files section # .include "ObsoleteFiles.inc" OLD_LIBS_MESSAGE="Please be sure no application still uses those libraries, \ else you can not start such an application. Consult UPDATING for more \ information regarding how to cope with the removal/revision bump of a \ specific library." .if !defined(BATCH_DELETE_OLD_FILES) RM_I=-i .else RM_I=-v .endif delete-old-files: .PHONY @echo ">>> Removing old files (only deletes safe to delete libs)" # Ask for every old file if the user really wants to remove it. # It's annoying, but better safe than sorry. # NB: We cannot pass the list of OLD_FILES as a parameter because the # argument list will get too long. Using .for/.endfor make "loops" will make # the Makefile parser segfault. @exec 3<&0; \ cd ${.CURDIR}; \ ${MAKE} -f ${.CURDIR}/Makefile.inc1 ${.MAKEFLAGS} ${.TARGET} \ -V OLD_FILES -V "OLD_FILES:Musr/share/*.gz:R" | xargs -n1 | \ while read file; do \ if [ -f "${DESTDIR}/$${file}" -o -L "${DESTDIR}/$${file}" ]; then \ chflags noschg "${DESTDIR}/$${file}" 2>/dev/null || true; \ rm ${RM_I} "${DESTDIR}/$${file}" <&3; \ fi; \ for ext in debug symbols; do \ if ! [ -e "${DESTDIR}/$${file}" ] && [ -f \ "${DESTDIR}${DEBUGDIR}/$${file}.$${ext}" ]; then \ rm ${RM_I} "${DESTDIR}${DEBUGDIR}/$${file}.$${ext}" \ <&3; \ fi; \ done; \ done # Remove catpages without corresponding manpages. @exec 3<&0; \ find ${DESTDIR}/usr/share/man/cat* ! -type d | \ sed -ep -e's:${DESTDIR}/usr/share/man/cat:${DESTDIR}/usr/share/man/man:' | \ while read catpage; do \ read manpage; \ if [ ! -e "$${manpage}" ]; then \ rm ${RM_I} $${catpage} <&3; \ fi; \ done @echo ">>> Old files removed" check-old-files: .PHONY @echo ">>> Checking for old files" @cd ${.CURDIR}; \ ${MAKE} -f ${.CURDIR}/Makefile.inc1 ${.MAKEFLAGS} ${.TARGET} \ -V OLD_FILES -V "OLD_FILES:Musr/share/*.gz:R" | xargs -n1 | \ while read file; do \ if [ -f "${DESTDIR}/$${file}" -o -L "${DESTDIR}/$${file}" ]; then \ echo "${DESTDIR}/$${file}"; \ fi; \ for ext in debug symbols; do \ if [ -f "${DESTDIR}${DEBUGDIR}/$${file}.$${ext}" ]; then \ echo "${DESTDIR}${DEBUGDIR}/$${file}.$${ext}"; \ fi; \ done; \ done # Check for catpages without corresponding manpages. @find ${DESTDIR}/usr/share/man/cat* ! -type d | \ sed -ep -e's:${DESTDIR}/usr/share/man/cat:${DESTDIR}/usr/share/man/man:' | \ while read catpage; do \ read manpage; \ if [ ! -e "$${manpage}" ]; then \ echo $${catpage}; \ fi; \ done delete-old-libs: .PHONY @echo ">>> Removing old libraries" @echo "${OLD_LIBS_MESSAGE}" | fmt @exec 3<&0; \ cd ${.CURDIR}; \ ${MAKE} -f ${.CURDIR}/Makefile.inc1 ${.MAKEFLAGS} ${.TARGET} \ -V OLD_LIBS | xargs -n1 | \ while read file; do \ if [ -f "${DESTDIR}/$${file}" -o -L "${DESTDIR}/$${file}" ]; then \ chflags noschg "${DESTDIR}/$${file}" 2>/dev/null || true; \ rm ${RM_I} "${DESTDIR}/$${file}" <&3; \ fi; \ for ext in debug symbols; do \ if ! [ -e "${DESTDIR}/$${file}" ] && [ -f \ "${DESTDIR}${DEBUGDIR}/$${file}.$${ext}" ]; then \ rm ${RM_I} "${DESTDIR}${DEBUGDIR}/$${file}.$${ext}" \ <&3; \ fi; \ done; \ done @echo ">>> Old libraries removed" check-old-libs: .PHONY @echo ">>> Checking for old libraries" @cd ${.CURDIR}; \ ${MAKE} -f ${.CURDIR}/Makefile.inc1 ${.MAKEFLAGS} ${.TARGET} \ -V OLD_LIBS | xargs -n1 | \ while read file; do \ if [ -f "${DESTDIR}/$${file}" -o -L "${DESTDIR}/$${file}" ]; then \ echo "${DESTDIR}/$${file}"; \ fi; \ for ext in debug symbols; do \ if [ -f "${DESTDIR}${DEBUGDIR}/$${file}.$${ext}" ]; then \ echo "${DESTDIR}${DEBUGDIR}/$${file}.$${ext}"; \ fi; \ done; \ done delete-old-dirs: .PHONY @echo ">>> Removing old directories" @cd ${.CURDIR}; \ ${MAKE} -f ${.CURDIR}/Makefile.inc1 ${.MAKEFLAGS} ${.TARGET} \ -V OLD_DIRS | xargs -n1 | sort -r | \ while read dir; do \ if [ -d "${DESTDIR}/$${dir}" ]; then \ rmdir -v "${DESTDIR}/$${dir}" || true; \ elif [ -L "${DESTDIR}/$${dir}" ]; then \ echo "${DESTDIR}/$${dir} is a link, please remove everything manually."; \ fi; \ done @echo ">>> Old directories removed" check-old-dirs: .PHONY @echo ">>> Checking for old directories" @cd ${.CURDIR}; \ ${MAKE} -f ${.CURDIR}/Makefile.inc1 ${.MAKEFLAGS} ${.TARGET} \ -V OLD_DIRS | xargs -n1 | \ while read dir; do \ if [ -d "${DESTDIR}/$${dir}" ]; then \ echo "${DESTDIR}/$${dir}"; \ elif [ -L "${DESTDIR}/$${dir}" ]; then \ echo "${DESTDIR}/$${dir} is a link, please remove everything manually."; \ fi; \ done delete-old: delete-old-files delete-old-dirs .PHONY @echo "To remove old libraries run '${MAKE} delete-old-libs'." check-old: check-old-files check-old-libs check-old-dirs .PHONY @echo "To remove old files and directories run '${MAKE} delete-old'." @echo "To remove old libraries run '${MAKE} delete-old-libs'." .endif # # showconfig - show build configuration. # showconfig: .PHONY @(${MAKE} -n -f ${.CURDIR}/sys/conf/kern.opts.mk -V dummy -dg1; \ ${MAKE} -n -f ${.CURDIR}/share/mk/src.opts.mk -V dummy -dg1) 2>&1 | grep ^MK_ | sort -u .if !empty(KRNLOBJDIR) && !empty(KERNCONF) DTBOUTPUTPATH= ${KRNLOBJDIR}/${KERNCONF}/ .if !defined(FDT_DTS_FILE) || empty(FDT_DTS_FILE) .if exists(${KERNCONFDIR}/${KERNCONF}) FDT_DTS_FILE!= awk 'BEGIN {FS="="} /^makeoptions[[:space:]]+FDT_DTS_FILE/ {print $$2}' \ '${KERNCONFDIR}/${KERNCONF}' ; echo .endif .endif .endif .if !defined(DTBOUTPUTPATH) || !exists(${DTBOUTPUTPATH}) DTBOUTPUTPATH= ${.CURDIR} .endif # # Build 'standalone' Device Tree Blob # builddtb: .PHONY @PATH=${TMPPATH} MACHINE=${TARGET} \ ${.CURDIR}/sys/tools/fdt/make_dtb.sh ${.CURDIR}/sys \ "${FDT_DTS_FILE}" ${DTBOUTPUTPATH} ############### # cleanworld # In the following, the first 'rm' in a series will usually remove all # files and directories. If it does not, then there are probably some # files with file flags set, so this unsets them and tries the 'rm' a # second time. There are situations where this target will be cleaning # some directories via more than one method, but that duplication is # needed to correctly handle all the possible situations. Removing all # files without file flags set in the first 'rm' instance saves time, # because 'chflags' will need to operate on fewer files afterwards. # # It is expected that BW_CANONICALOBJDIR == the CANONICALOBJDIR as would be # created by bsd.obj.mk, except that we don't want to .include that file # in this makefile. # BW_CANONICALOBJDIR:=${OBJTREE}${.CURDIR} cleanworld: .PHONY .if exists(${BW_CANONICALOBJDIR}/) -rm -rf ${BW_CANONICALOBJDIR}/* -chflags -R 0 ${BW_CANONICALOBJDIR} rm -rf ${BW_CANONICALOBJDIR}/* .endif .if ${.CURDIR} == ${.OBJDIR} || ${.CURDIR}/obj == ${.OBJDIR} # To be safe in this case, fall back to a 'make cleandir' ${_+_}@cd ${.CURDIR}; ${MAKE} cleandir .endif .if defined(TARGET) && defined(TARGET_ARCH) .if ${TARGET} == ${MACHINE} && ${TARGET_ARCH} == ${MACHINE_ARCH} XDEV_CPUTYPE?=${CPUTYPE} .else XDEV_CPUTYPE?=${TARGET_CPUTYPE} .endif NOFUN=-DNO_FSCHG MK_HTML=no -DNO_LINT \ MK_MAN=no MK_NLS=no MK_PROFILE=no \ MK_KERBEROS=no MK_RESCUE=no MK_TESTS=no MK_WARNS=no \ TARGET=${TARGET} TARGET_ARCH=${TARGET_ARCH} \ CPUTYPE=${XDEV_CPUTYPE} XDDIR=${TARGET_ARCH}-freebsd XDTP?=/usr/${XDDIR} .if ${XDTP:N/*} .error XDTP variable should be an absolute path .endif CDBENV=MAKEOBJDIRPREFIX=${MAKEOBJDIRPREFIX}/${XDDIR} \ INSTALL="sh ${.CURDIR}/tools/install.sh" CDENV= ${CDBENV} \ TOOLS_PREFIX=${XDTP} CD2CFLAGS=-isystem ${XDDESTDIR}/usr/include -L${XDDESTDIR}/usr/lib \ --sysroot=${XDDESTDIR}/ -B${XDDESTDIR}/usr/libexec \ -B${XDDESTDIR}/usr/bin -B${XDDESTDIR}/usr/lib CD2ENV=${CDENV} CC="${CC} ${CD2CFLAGS}" CXX="${CXX} ${CD2CFLAGS}" \ CPP="${CPP} ${CD2CFLAGS}" \ MACHINE=${TARGET} MACHINE_ARCH=${TARGET_ARCH} CDTMP= ${MAKEOBJDIRPREFIX}/${XDDIR}/${.CURDIR}/tmp CDMAKE=${CDENV} PATH=${CDTMP}/usr/bin:${PATH} ${MAKE} ${NOFUN} CD2MAKE=${CD2ENV} PATH=${CDTMP}/usr/bin:${XDDESTDIR}/usr/bin:${PATH} ${MAKE} ${NOFUN} XDDESTDIR=${DESTDIR}/${XDTP} .if !defined(OSREL) OSREL!= uname -r | sed -e 's/[-(].*//' .endif .ORDER: xdev-build xdev-install xdev-links xdev: xdev-build xdev-install .PHONY .ORDER: _xb-worldtmp _xb-bootstrap-tools _xb-build-tools _xb-cross-tools xdev-build: _xb-worldtmp _xb-bootstrap-tools _xb-build-tools _xb-cross-tools .PHONY _xb-worldtmp: .PHONY mkdir -p ${CDTMP}/usr mtree -deU -f ${.CURDIR}/etc/mtree/BSD.usr.dist \ -p ${CDTMP}/usr >/dev/null _xb-bootstrap-tools: .PHONY .for _tool in \ ${_clang_tblgen} \ ${_gperf} ${_+_}@${ECHODIR} "===> ${_tool} (obj,all,install)"; \ cd ${.CURDIR}/${_tool}; \ ${CDMAKE} DIRPRFX=${_tool}/ obj; \ ${CDMAKE} DIRPRFX=${_tool}/ all; \ ${CDMAKE} DIRPRFX=${_tool}/ DESTDIR=${CDTMP} install .endfor _xb-build-tools: .PHONY ${_+_}@cd ${.CURDIR}; \ ${CDBENV} ${MAKE} -f Makefile.inc1 ${NOFUN} build-tools _xb-cross-tools: .PHONY .for _tool in \ ${_binutils} \ ${_elftctools} \ usr.bin/ar \ ${_clang_libs} \ ${_clang} \ ${_cc} ${_+_}@${ECHODIR} "===> xdev ${_tool} (obj,all)"; \ cd ${.CURDIR}/${_tool}; \ ${CDMAKE} DIRPRFX=${_tool}/ obj; \ ${CDMAKE} DIRPRFX=${_tool}/ all .endfor _xi-mtree: .PHONY ${_+_}@${ECHODIR} "mtree populating ${XDDESTDIR}" mkdir -p ${XDDESTDIR} mtree -deU -f ${.CURDIR}/etc/mtree/BSD.root.dist \ -p ${XDDESTDIR} >/dev/null mtree -deU -f ${.CURDIR}/etc/mtree/BSD.usr.dist \ -p ${XDDESTDIR}/usr >/dev/null mtree -deU -f ${.CURDIR}/etc/mtree/BSD.include.dist \ -p ${XDDESTDIR}/usr/include >/dev/null .if defined(LIBCOMPAT) mtree -deU -f ${.CURDIR}/etc/mtree/BSD.lib${libcompat}.dist \ -p ${XDDESTDIR}/usr >/dev/null .endif .if ${MK_TESTS} != "no" mkdir -p ${XDDESTDIR}${TESTSBASE} mtree -deU -f ${.CURDIR}/etc/mtree/BSD.tests.dist \ -p ${XDDESTDIR}${TESTSBASE} >/dev/null .endif .ORDER: xdev-build _xi-mtree _xi-cross-tools _xi-includes _xi-libraries xdev-install: xdev-build _xi-mtree _xi-cross-tools _xi-includes _xi-libraries .PHONY _xi-cross-tools: .PHONY @echo "_xi-cross-tools" .for _tool in \ ${_binutils} \ ${_elftctools} \ usr.bin/ar \ ${_clang_libs} \ ${_clang} \ ${_cc} ${_+_}@${ECHODIR} "===> xdev ${_tool} (install)"; \ cd ${.CURDIR}/${_tool}; \ ${CDMAKE} DIRPRFX=${_tool}/ install DESTDIR=${XDDESTDIR} .endfor _xi-includes: .PHONY ${_+_}cd ${.CURDIR}; ${CD2MAKE} -f Makefile.inc1 includes \ DESTDIR=${XDDESTDIR} _xi-libraries: .PHONY ${_+_}cd ${.CURDIR}; ${CD2MAKE} -f Makefile.inc1 libraries \ DESTDIR=${XDDESTDIR} xdev-links: .PHONY ${_+_}cd ${XDDESTDIR}/usr/bin; \ mkdir -p ../../../../usr/bin; \ for i in *; do \ ln -sf ../../${XDTP}/usr/bin/$$i \ ../../../../usr/bin/${XDDIR}-$$i; \ ln -sf ../../${XDTP}/usr/bin/$$i \ ../../../../usr/bin/${XDDIR}${OSREL}-$$i; \ done .else xdev xdev-build xdev-install xdev-links: .PHONY @echo "*** Error: Both TARGET and TARGET_ARCH must be defined for \"${.TARGET}\" target" .endif Index: user/alc/PQ_LAUNDRY/UPDATING =================================================================== --- user/alc/PQ_LAUNDRY/UPDATING (revision 303205) +++ user/alc/PQ_LAUNDRY/UPDATING (revision 303206) @@ -1,1622 +1,1623 @@ Updating Information for FreeBSD current users. This file is maintained and copyrighted by M. Warner Losh . See end of file for further details. For commonly done items, please see the COMMON ITEMS: section later in the file. These instructions assume that you basically know what you are doing. If not, then please consult the FreeBSD handbook: http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html Items affecting the ports and packages system can be found in /usr/ports/UPDATING. Please read that file before running portupgrade. NOTE: FreeBSD has switched from gcc to clang. If you have trouble bootstrapping from older versions of FreeBSD, try WITHOUT_CLANG and WITH_GCC to bootstrap to the tip of head, and then rebuild without this option. The bootstrap process from older version of current across the gcc/clang cutover is a bit fragile. NOTE TO PEOPLE WHO THINK THAT FreeBSD 12.x IS SLOW: FreeBSD 12.x has many debugging features turned on, in both the kernel and userland. These features attempt to detect incorrect use of system primitives, and encourage loud failure through extra sanity checking and fail stop semantics. They also substantially impact system performance. If you want to do performance measurement, benchmarking, and optimization, you'll want to turn them off. This includes various WITNESS- related kernel options, INVARIANTS, malloc debugging flags in userland, and various verbose features in the kernel. Many developers choose to disable these features on build machines to maximize performance. (To completely disable malloc debugging, define MALLOC_PRODUCTION in /etc/make.conf, or to merely disable the most expensive debugging functionality run "ln -s 'abort:false,junk:false' /etc/malloc.conf".) 20160622: The libc stub for the pipe(2) system call has been replaced with - a wrapper which calls the pipe2(2) system call and the pipe(2) is now - only implemented by the kernels which include "options - COMPAT_FREEBSD10" in their config file (this is the default). - Users should ensure that this option is enabled in their kernel - or upgrade userspace to r302092 before upgrading their kernel. + a wrapper that calls the pipe2(2) system call and the pipe(2) + system call is now only implemented by the kernels that include + "options COMPAT_FREEBSD10" in their config file (this is the + default). Users should ensure that this option is enabled in + their kernel or upgrade userspace to r302092 before upgrading their + kernel. 20160527: CAM will now strip leading spaces from SCSI disks' serial numbers. This will effect users who create UFS filesystems on SCSI disks using those disk's diskid device nodes. For example, if /etc/fstab previously contained a line like "/dev/diskid/DISK-%20%20%20%20%20%20%20ABCDEFG0123456", you should change it to "/dev/diskid/DISK-ABCDEFG0123456". Users of geom transforms like gmirror may also be affected. ZFS users should generally be fine. 20160523: The bitstring(3) API has been updated with new functionality and improved performance. But it is binary-incompatible with the old API. Objects built with the new headers may not be linked against objects built with the old headers. 20160520: The brk and sbrk functions have been removed from libc on arm64. Binutils from ports has been updated to not link to these functions and should be updated to the latest version before installing a new libc. 20160517: The armv6 port now defaults to hard float ABI. Limited support for running both hardfloat and soft float on the same system is available using the libraries installed with -DWITH_LIBSOFT. This has only been tested as an upgrade path for installworld and packages may fail or need manual intervention to run. New packages will be needed. To update an existing self-hosted armv6hf system, you must add TARGET_ARCH=armv6 on the make command line for both the build and the install steps. 20160510: Kernel modules compiled outside of a kernel build now default to installing to /boot/modules instead of /boot/kernel. Many kernel modules built this way (such as those in ports) already overrode KMODDIR explicitly to install into /boot/modules. However, manually building and installing a module from /sys/modules will now install to /boot/modules instead of /boot/kernel. 20160414: The CAM I/O scheduler has been committed to the kernel. There should be no user visible impact. This does enable NCQ Trim on ada SSDs. While the list of known rogues that claim support for this but actually corrupt data is believed to be complete, be on the lookout for data corruption. The known rogue list is believed to be complete: o Crucial MX100, M550 drives with MU01 firmware. o Micron M510 and M550 drives with MU01 firmware. o Micron M500 prior to MU07 firmware o Samsung 830, 840, and 850 all firmwares o FCCT M500 all firmwares Crucial has firmware http://www.crucial.com/usa/en/support-ssd-firmware with working NCQ TRIM. For Micron branded drives, see your sales rep for updated firmware. Black listed drives will work correctly because these drives work correctly so long as no NCQ TRIMs are sent to them. Given this list is the same as found in Linux, it's believed there are no other rogues in the market place. All other models from the above vendors work. To be safe, if you are at all concerned, you can quirk each of your drives to prevent NCQ from being sent by setting: kern.cam.ada.X.quirks="0x2" in loader.conf. If the drive requires the 4k sector quirk, set the quirks entry to 0x3. 20160330: The FAST_DEPEND build option has been removed and its functionality is now the one true way. The old mkdep(1) style of 'make depend' has been removed. See 20160311 for further details. 20160317: Resource range types have grown from unsigned long to uintmax_t. All drivers, and anything using libdevinfo, need to be recompiled. 20160311: WITH_FAST_DEPEND is now enabled by default for in-tree and out-of-tree builds. It no longer runs mkdep(1) during 'make depend', and the 'make depend' stage can safely be skipped now as it is auto ran when building 'make all' and will generate all SRCS and DPSRCS before building anything else. Dependencies are gathered at compile time with -MF flags kept in separate .depend files per object file. Users should run 'make cleandepend' once if using -DNO_CLEAN to clean out older stale .depend files. 20160306: On amd64, clang 3.8.0 can now insert sections of type AMD64_UNWIND into kernel modules. Therefore, if you load any kernel modules at boot time, please install the boot loaders after you install the kernel, but before rebooting, e.g.: make buildworld make kernel KERNCONF=YOUR_KERNEL_HERE make -C sys/boot install Then follow the usual steps, described in the General Notes section, below. 20160305: Clang, llvm, lldb and compiler-rt have been upgraded to 3.8.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20160301: The AIO subsystem is now a standard part of the kernel. The VFS_AIO kernel option and aio.ko kernel module have been removed. Due to stability concerns, asynchronous I/O requests are only permitted on sockets and raw disks by default. To enable asynchronous I/O requests on all file types, set the vfs.aio.enable_unsafe sysctl to a non-zero value. 20160226: The ELF object manipulation tool objcopy is now provided by the ELF Tool Chain project rather than by GNU binutils. It should be a drop-in replacement, with the addition of arm64 support. The (temporary) src.conf knob WITHOUT_ELFCOPY_AS_OBJCOPY knob may be set to obtain the GNU version if necessary. 20160129: Building ZFS pools on top of zvols is prohibited by default. That feature has never worked safely; it's always been prone to deadlocks. Using a zvol as the backing store for a VM guest's virtual disk will still work, even if the guest is using ZFS. Legacy behavior can be restored by setting vfs.zfs.vol.recursive=1. 20160119: The NONE and HPN patches has been removed from OpenSSH. They are still available in the security/openssh-portable port. 20160113: With the addition of ypldap(8), a new _ypldap user is now required during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook. 20151216: The tftp loader (pxeboot) now uses the option root-path directive. As a consequence it no longer looks for a pxeboot.4th file on the tftp server. Instead it uses the regular /boot infrastructure as with the other loaders. 20151211: The code to start recording plug and play data into the modules has been committed. While the old tools will properly build a new kernel, a number of warnings about "unknown metadata record 4" will be produced for an older kldxref. To avoid such warnings, make sure to rebuild the kernel toolchain (or world). Make sure that you have r292078 or later when trying to build 292077 or later before rebuilding. 20151207: Debug data files are now built by default with 'make buildworld' and installed with 'make installworld'. This facilitates debugging but requires more disk space both during the build and for the installed world. Debug files may be disabled by setting WITHOUT_DEBUG_FILES=yes in src.conf(5). 20151130: r291527 changed the internal interface between the nfsd.ko and nfscommon.ko modules. As such, they must both be upgraded to-gether. __FreeBSD_version has been bumped because of this. 20151108: Add support for unicode collation strings leads to a change of order of files listed by ls(1) for example. To get back to the old behaviour, set LC_COLLATE environment variable to "C". Databases administrators will need to reindex their databases given collation results will be different. Due to a bug in install(1) it is recommended to remove the ancient locales before running make installworld. rm -rf /usr/share/locale/* 20151030: The OpenSSL has been upgraded to 1.0.2d. Any binaries requiring libcrypto.so.7 or libssl.so.7 must be recompiled. 20151020: Qlogic 24xx/25xx firmware images were updated from 5.5.0 to 7.3.0. Kernel modules isp_2400_multi and isp_2500_multi were removed and should be replaced with isp_2400 and isp_2500 modules respectively. 20151017: The build previously allowed using 'make -n' to not recurse into sub-directories while showing what commands would be executed, and 'make -n -n' to recursively show commands. Now 'make -n' will recurse and 'make -N' will not. 20151012: If you specify SENDMAIL_MC or SENDMAIL_CF in make.conf, mergemaster and etcupdate will now use this file. A custom sendmail.cf is now updated via this mechanism rather than via installworld. If you had excluded sendmail.cf in mergemaster.rc or etcupdate.conf, you may want to remove the exclusion or change it to "always install". /etc/mail/sendmail.cf is now managed the same way regardless of whether SENDMAIL_MC/SENDMAIL_CF is used. If you are not using SENDMAIL_MC/SENDMAIL_CF there should be no change in behavior. 20151011: Compatibility shims for legacy ATA device names have been removed. It includes ATA_STATIC_ID kernel option, kern.cam.ada.legacy_aliases and kern.geom.raid.legacy_aliases loader tunables, kern.devalias.* environment variables, /dev/ad* and /dev/ar* symbolic links. 20151006: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 3.7.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20150924: Kernel debug files have been moved to /usr/lib/debug/boot/kernel/, and renamed from .symbols to .debug. This reduces the size requirements on the boot partition or file system and provides consistency with userland debug files. When using the supported kernel installation method the /usr/lib/debug/boot/kernel directory will be renamed (to kernel.old) as is done with /boot/kernel. Developers wishing to maintain the historical behavior of installing debug files in /boot/kernel/ can set KERN_DEBUGDIR="" in src.conf(5). 20150827: The wireless drivers had undergone changes that remove the 'parent interface' from the ifconfig -l output. The rc.d network scripts used to check presence of a parent interface in the list, so old scripts would fail to start wireless networking. Thus, etcupdate(3) or mergemaster(8) run is required after kernel update, to update your rc.d scripts in /etc. 20150827: pf no longer supports 'scrub fragment crop' or 'scrub fragment drop-ovl' These configurations are now automatically interpreted as 'scrub fragment reassemble'. 20150817: Kernel-loadable modules for the random(4) device are back. To use them, the kernel must have device random options RANDOM_LOADABLE kldload(8) can then be used to load random_fortuna.ko or random_yarrow.ko. Please note that due to the indirect function calls that the loadable modules need to provide, the build-in variants will be slightly more efficient. The random(4) kernel option RANDOM_DUMMY has been retired due to unpopularity. It was not all that useful anyway. 20150813: The WITHOUT_ELFTOOLCHAIN_TOOLS src.conf(5) knob has been retired. Control over building the ELF Tool Chain tools is now provided by the WITHOUT_TOOLCHAIN knob. 20150810: The polarity of Pulse Per Second (PPS) capture events with the uart(4) driver has been corrected. Prior to this change the PPS "assert" event corresponded to the trailing edge of a positive PPS pulse and the "clear" event was the leading edge of the next pulse. As the width of a PPS pulse in a typical GPS receiver is on the order of 1 millisecond, most users will not notice any significant difference with this change. Anyone who has compensated for the historical polarity reversal by configuring a negative offset equal to the pulse width will need to remove that workaround. 20150809: The default group assigned to /dev/dri entries has been changed from 'wheel' to 'video' with the id of '44'. If you want to have access to the dri devices please add yourself to the video group with: # pw groupmod video -m $USER 20150806: The menu.rc and loader.rc files will now be replaced during upgrades. Please migrate local changes to menu.rc.local and loader.rc.local instead. 20150805: GNU Binutils versions of addr2line, c++filt, nm, readelf, size, strings and strip have been removed. The src.conf(5) knob WITHOUT_ELFTOOLCHAIN_TOOLS no longer provides the binutils tools. 20150728: As ZFS requires more kernel stack pages than is the default on some architectures e.g. i386, it now warns if KSTACK_PAGES is less than ZFS_MIN_KSTACK_PAGES (which is 4 at the time of writing). Please consider using 'options KSTACK_PAGES=X' where X is greater than or equal to ZFS_MIN_KSTACK_PAGES i.e. 4 in such configurations. 20150706: sendmail has been updated to 8.15.2. Starting with FreeBSD 11.0 and sendmail 8.15, sendmail uses uncompressed IPv6 addresses by default, i.e., they will not contain "::". For example, instead of ::1, it will be 0:0:0:0:0:0:0:1. This permits a zero subnet to have a more specific match, such as different map entries for IPv6:0:0 vs IPv6:0. This change requires that configuration data (including maps, files, classes, custom ruleset, etc.) must use the same format, so make certain such configuration data is upgrading. As a very simple check search for patterns like 'IPv6:[0-9a-fA-F:]*::' and 'IPv6::'. To return to the old behavior, set the m4 option confUSE_COMPRESSED_IPV6_ADDRESSES or the cf option UseCompressedIPv6Addresses. 20150630: The default kernel entropy-processing algorithm is now Fortuna, replacing Yarrow. Assuming you have 'device random' in your kernel config file, the configurations allow a kernel option to override this default. You may choose *ONE* of: options RANDOM_YARROW # Legacy /dev/random algorithm. options RANDOM_DUMMY # Blocking-only driver. If you have neither, you get Fortuna. For most people, read no further, Fortuna will give a /dev/random that works like it always used to, and the difference will be irrelevant. If you remove 'device random', you get *NO* kernel-processed entropy at all. This may be acceptable to folks building embedded systems, but has complications. Carry on reading, and it is assumed you know what you need. *PLEASE* read random(4) and random(9) if you are in the habit of tweaking kernel configs, and/or if you are a member of the embedded community, wanting specific and not-usual behaviour from your security subsystems. NOTE!! If you use RANDOM_DUMMY and/or have no 'device random', you will NOT have a functioning /dev/random, and many cryptographic features will not work, including SSH. You may also find strange behaviour from the random(3) set of library functions, in particular sranddev(3), srandomdev(3) and arc4random(3). The reason for this is that the KERN_ARND sysctl only returns entropy if it thinks it has some to share, and with RANDOM_DUMMY or no 'device random' this will never happen. 20150623: An additional fix for the issue described in the 20150614 sendmail entry below has been been committed in revision 284717. 20150616: FreeBSD's old make (fmake) has been removed from the system. It is available as the devel/fmake port or via pkg install fmake. 20150615: The fix for the issue described in the 20150614 sendmail entry below has been been committed in revision 284436. The work around described in that entry is no longer needed unless the default setting is overridden by a confDH_PARAMETERS configuration setting of '5' or pointing to a 512 bit DH parameter file. 20150614: ALLOW_DEPRECATED_ATF_TOOLS/ATFFILE support has been removed from atf.test.mk (included from bsd.test.mk). Please upgrade devel/atf and devel/kyua to version 0.20+ and adjust any calling code to work with Kyuafile and kyua. 20150614: The import of openssl to address the FreeBSD-SA-15:10.openssl security advisory includes a change which rejects handshakes with DH parameters below 768 bits. sendmail releases prior to 8.15.2 (not yet released), defaulted to a 512 bit DH parameter setting for client connections. To work around this interoperability, sendmail can be configured to use a 2048 bit DH parameter by: 1. Edit /etc/mail/`hostname`.mc 2. If a setting for confDH_PARAMETERS does not exist or exists and is set to a string beginning with '5', replace it with '2'. 3. If a setting for confDH_PARAMETERS exists and is set to a file path, create a new file with: openssl dhparam -out /path/to/file 2048 4. Rebuild the .cf file: cd /etc/mail/; make; make install 5. Restart sendmail: cd /etc/mail/; make restart A sendmail patch is coming, at which time this file will be updated. 20150604: Generation of legacy formatted entries have been disabled by default in pwd_mkdb(8), as all base system consumers of the legacy formatted entries were converted to use the new format by default when the new, machine independent format have been added and supported since FreeBSD 5.x. Please see the pwd_mkdb(8) manual page for further details. 20150525: Clang and llvm have been upgraded to 3.6.1 release. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0 or higher. 20150521: TI platform code switched to using vendor DTS files and this update may break existing systems running on Beaglebone, Beaglebone Black, and Pandaboard: - dtb files should be regenerated/reinstalled. Filenames are the same but content is different now - GPIO addressing was changed, now each GPIO bank (32 pins per bank) has its own /dev/gpiocX device, e.g. pin 121 on /dev/gpioc0 in old addressing scheme is now pin 25 on /dev/gpioc3. - Pandaboard: /etc/ttys should be updated, serial console device is now /dev/ttyu2, not /dev/ttyu0 20150501: soelim(1) from gnu/usr.bin/groff has been replaced by usr.bin/soelim. If you need the GNU extension from groff soelim(1), install groff from package: pkg install groff, or via ports: textproc/groff. 20150423: chmod, chflags, chown and chgrp now affect symlinks in -R mode as defined in symlink(7); previously symlinks were silently ignored. 20150415: The const qualifier has been removed from iconv(3) to comply with POSIX. The ports tree is aware of this from r384038 onwards. 20150416: Libraries specified by LIBADD in Makefiles must have a corresponding DPADD_ variable to ensure correct dependencies. This is now enforced in src.libnames.mk. 20150324: From legacy ata(4) driver was removed support for SATA controllers supported by more functional drivers ahci(4), siis(4) and mvs(4). Kernel modules ataahci and ataadaptec were removed completely, replaced by ahci and mvs modules respectively. 20150315: Clang, llvm and lldb have been upgraded to 3.6.0 release. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0 or higher. 20150307: The 32-bit PowerPC kernel has been changed to a position-independent executable. This can only be booted with a version of loader(8) newer than January 31, 2015, so make sure to update both world and kernel before rebooting. 20150217: If you are running a -CURRENT kernel since r273872 (Oct 30th, 2014), but before r278950, the RNG was not seeded properly. Immediately upgrade the kernel to r278950 or later and regenerate any keys (e.g. ssh keys or openssl keys) that were generated w/ a kernel from that range. This does not affect programs that directly used /dev/random or /dev/urandom. All userland uses of arc4random(3) are affected. 20150210: The autofs(4) ABI was changed in order to restore binary compatibility with 10.1-RELEASE. The automountd(8) daemon needs to be rebuilt to work with the new kernel. 20150131: The powerpc64 kernel has been changed to a position-independent executable. This can only be booted with a new version of loader(8), so make sure to update both world and kernel before rebooting. 20150118: Clang and llvm have been upgraded to 3.5.1 release. This is a bugfix only release, no new features have been added. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0. 20150107: ELF tools addr2line, elfcopy (strip), nm, size, and strings are now taken from the ELF Tool Chain project rather than GNU binutils. They should be drop-in replacements, with the addition of arm64 support. The WITHOUT_ELFTOOLCHAIN_TOOLS= knob may be used to obtain the binutils tools, if necessary. See 20150805 for updated information. 20150105: The default Unbound configuration now enables remote control using a local socket. Users who have already enabled the local_unbound service should regenerate their configuration by running "service local_unbound setup" as root. 20150102: The GNU texinfo and GNU info pages have been removed. To be able to view GNU info pages please install texinfo from ports. 20141231: Clang, llvm and lldb have been upgraded to 3.5.0 release. As of this release, a prerequisite for building clang, llvm and lldb is a C++11 capable compiler and C++11 standard library. This means that to be able to successfully build the cross-tools stage of buildworld, with clang as the bootstrap compiler, your system compiler or cross compiler should either be clang 3.3 or later, or gcc 4.8 or later, and your system C++ library should be libc++, or libdstdc++ from gcc 4.8 or later. On any standard FreeBSD 10.x or 11.x installation, where clang and libc++ are on by default (that is, on x86 or arm), this should work out of the box. On 9.x installations where clang is enabled by default, e.g. on x86 and powerpc, libc++ will not be enabled by default, so libc++ should be built (with clang) and installed first. If both clang and libc++ are missing, build clang first, then use it to build libc++. On 8.x and earlier installations, upgrade to 9.x first, and then follow the instructions for 9.x above. Sparc64 and mips users are unaffected, as they still use gcc 4.2.1 by default, and do not build clang. Many embedded systems are resource constrained, and will not be able to build clang in a reasonable time, or in some cases at all. In those cases, cross building bootable systems on amd64 is a workaround. This new version of clang introduces a number of new warnings, of which the following are most likely to appear: -Wabsolute-value This warns in two cases, for both C and C++: * When the code is trying to take the absolute value of an unsigned quantity, which is effectively a no-op, and almost never what was intended. The code should be fixed, if at all possible. If you are sure that the unsigned quantity can be safely cast to signed, without loss of information or undefined behavior, you can add an explicit cast, or disable the warning. * When the code is trying to take an absolute value, but the called abs() variant is for the wrong type, which can lead to truncation. If you want to disable the warning instead of fixing the code, please make sure that truncation will not occur, or it might lead to unwanted side-effects. -Wtautological-undefined-compare and -Wundefined-bool-conversion These warn when C++ code is trying to compare 'this' against NULL, while 'this' should never be NULL in well-defined C++ code. However, there is some legacy (pre C++11) code out there, which actively abuses this feature, which was less strictly defined in previous C++ versions. Squid and openjdk do this, for example. The warning can be turned off for C++98 and earlier, but compiling the code in C++11 mode might result in unexpected behavior; for example, the parts of the program that are unreachable could be optimized away. 20141222: The old NFS client and server (kernel options NFSCLIENT, NFSSERVER) kernel sources have been removed. The .h files remain, since some utilities include them. This will need to be fixed later. If "mount -t oldnfs ..." is attempted, it will fail. If the "-o" option on mountd(8), nfsd(8) or nfsstat(1) is used, the utilities will report errors. 20141121: The handling of LOCAL_LIB_DIRS has been altered to skip addition of directories to top level SUBDIR variable when their parent directory is included in LOCAL_DIRS. Users with build systems with such hierarchies and without SUBDIR entries in the parent directory Makefiles should add them or add the directories to LOCAL_DIRS. 20141109: faith(4) and faithd(8) have been removed from the base system. Faith has been obsolete for a very long time. 20141104: vt(4), the new console driver, is enabled by default. It brings support for Unicode and double-width characters, as well as support for UEFI and integration with the KMS kernel video drivers. You may need to update your console settings in /etc/rc.conf, most probably the keymap. During boot, /etc/rc.d/syscons will indicate what you need to do. vt(4) still has issues and lacks some features compared to syscons(4). See the wiki for up-to-date information: https://wiki.freebsd.org/Newcons If you want to keep using syscons(4), you can do so by adding the following line to /boot/loader.conf: kern.vty=sc 20141102: pjdfstest has been integrated into kyua as an opt-in test suite. Please see share/doc/pjdfstest/README for more details on how to execute it. 20141009: gperf has been removed from the base system for architectures that use clang. Ports that require gperf will obtain it from the devel/gperf port. 20140923: pjdfstest has been moved from tools/regression/pjdfstest to contrib/pjdfstest . 20140922: At svn r271982, The default linux compat kernel ABI has been adjusted to 2.6.18 in support of the linux-c6 compat ports infrastructure update. If you wish to continue using the linux-f10 compat ports, add compat.linux.osrelease=2.6.16 to your local sysctl.conf. Users are encouraged to update their linux-compat packages to linux-c6 during their next update cycle. 20140729: The ofwfb driver, used to provide a graphics console on PowerPC when using vt(4), no longer allows mmap() of all physical memory. This will prevent Xorg on PowerPC with some ATI graphics cards from initializing properly unless x11-servers/xorg-server is updated to 1.12.4_8 or newer. 20140723: The xdev targets have been converted to using TARGET and TARGET_ARCH instead of XDEV and XDEV_ARCH. 20140719: The default unbound configuration has been modified to address issues with reverse lookups on networks that use private address ranges. If you use the local_unbound service, run "service local_unbound setup" as root to regenerate your configuration, then "service local_unbound reload" to load the new configuration. 20140709: The GNU texinfo and GNU info pages are not built and installed anymore, WITH_INFO knob has been added to allow to built and install them again. UPDATE: see 20150102 entry on texinfo's removal 20140708: The GNU readline library is now an INTERNALLIB - that is, it is statically linked into consumers (GDB and variants) in the base system, and the shared library is no longer installed. The devel/readline port is available for third party software that requires readline. 20140702: The Itanium architecture (ia64) has been removed from the list of known architectures. This is the first step in the removal of the architecture. 20140701: Commit r268115 has added NFSv4.1 server support, merged from projects/nfsv4.1-server. Since this includes changes to the internal interfaces between the NFS related modules, a full build of the kernel and modules will be necessary. __FreeBSD_version has been bumped. 20140629: The WITHOUT_VT_SUPPORT kernel config knob has been renamed WITHOUT_VT. (The other _SUPPORT knobs have a consistent meaning which differs from the behaviour controlled by this knob.) 20140619: Maximal length of the serial number in CTL was increased from 16 to 64 chars, that breaks ABI. All CTL-related tools, such as ctladm and ctld, need to be rebuilt to work with a new kernel. 20140606: The libatf-c and libatf-c++ major versions were downgraded to 0 and 1 respectively to match the upstream numbers. They were out of sync because, when they were originally added to FreeBSD, the upstream versions were not respected. These libraries are private and not yet built by default, so renumbering them should be a non-issue. However, unclean source trees will yield broken test programs once the operator executes "make delete-old-libs" after a "make installworld". Additionally, the atf-sh binary was made private by moving it into /usr/libexec/. Already-built shell test programs will keep the path to the old binary so they will break after "make delete-old" is run. If you are using WITH_TESTS=yes (not the default), wipe the object tree and rebuild from scratch to prevent spurious test failures. This is only needed once: the misnumbered libraries and misplaced binaries have been added to OptionalObsoleteFiles.inc so they will be removed during a clean upgrade. 20140512: Clang and llvm have been upgraded to 3.4.1 release. 20140508: We bogusly installed src.opts.mk in /usr/share/mk. This file should be removed to avoid issues in the future (and has been added to ObsoleteFiles.inc). 20140505: /etc/src.conf now affects only builds of the FreeBSD src tree. In the past, it affected all builds that used the bsd.*.mk files. The old behavior was a bug, but people may have relied upon it. To get this behavior back, you can .include /etc/src.conf from /etc/make.conf (which is still global and isn't changed). This also changes the behavior of incremental builds inside the tree of individual directories. Set MAKESYSPATH to ".../share/mk" to do that. Although this has survived make universe and some upgrade scenarios, other upgrade scenarios may have broken. At least one form of temporary breakage was fixed with MAKESYSPATH settings for buildworld as well... In cases where MAKESYSPATH isn't working with this setting, you'll need to set it to the full path to your tree. One side effect of all this cleaning up is that bsd.compiler.mk is no longer implicitly included by bsd.own.mk. If you wish to use COMPILER_TYPE, you must now explicitly include bsd.compiler.mk as well. 20140430: The lindev device has been removed since /dev/full has been made a standard device. __FreeBSD_version has been bumped. 20140424: The knob WITHOUT_VI was added to the base system, which controls building ex(1), vi(1), etc. Older releases of FreeBSD required ex(1) in order to reorder files share/termcap and didn't build ex(1) as a build tool, so building/installing with WITH_VI is highly advised for build hosts for older releases. This issue has been fixed in stable/9 and stable/10 in r277022 and r276991, respectively. 20140418: The YES_HESIOD knob has been removed. It has been obsolete for a decade. Please move to using WITH_HESIOD instead or your builds will silently lack HESIOD. 20140405: The uart(4) driver has been changed with respect to its handling of the low-level console. Previously the uart(4) driver prevented any process from changing the baudrate or the CLOCAL and HUPCL control flags. By removing the restrictions, operators can make changes to the serial console port without having to reboot. However, when getty(8) is started on the serial device that is associated with the low-level console, a misconfigured terminal line in /etc/ttys will now have a real impact. Before upgrading the kernel, make sure that /etc/ttys has the serial console device configured as 3wire without baudrate to preserve the previous behaviour. E.g: ttyu0 "/usr/libexec/getty 3wire" vt100 on secure 20140306: Support for libwrap (TCP wrappers) in rpcbind was disabled by default to improve performance. To re-enable it, if needed, run rpcbind with command line option -W. 20140226: Switched back to the GPL dtc compiler due to updates in the upstream dts files not being supported by the BSDL dtc compiler. You will need to rebuild your kernel toolchain to pick up the new compiler. Core dumps may result while building dtb files during a kernel build if you fail to do so. Set WITHOUT_GPL_DTC if you require the BSDL compiler. 20140216: Clang and llvm have been upgraded to 3.4 release. 20140216: The nve(4) driver has been removed. Please use the nfe(4) driver for NVIDIA nForce MCP Ethernet adapters instead. 20140212: An ABI incompatibility crept into the libc++ 3.4 import in r261283. This could cause certain C++ applications using shared libraries built against the previous version of libc++ to crash. The incompatibility has now been fixed, but any C++ applications or shared libraries built between r261283 and r261801 should be recompiled. 20140204: OpenSSH will now ignore errors caused by kernel lacking of Capsicum capability mode support. Please note that enabling the feature in kernel is still highly recommended. 20140131: OpenSSH is now built with sandbox support, and will use sandbox as the default privilege separation method. This requires Capsicum capability mode support in kernel. 20140128: The libelf and libdwarf libraries have been updated to newer versions from upstream. Shared library version numbers for these two libraries were bumped. Any ports or binaries requiring these two libraries should be recompiled. __FreeBSD_version is bumped to 1100006. 20140110: If a Makefile in a tests/ directory was auto-generating a Kyuafile instead of providing an explicit one, this would prevent such Makefile from providing its own Kyuafile in the future during NO_CLEAN builds. This has been fixed in the Makefiles but manual intervention is needed to clean an objdir if you use NO_CLEAN: # find /usr/obj -name Kyuafile | xargs rm -f 20131213: The behavior of gss_pseudo_random() for the krb5 mechanism has changed, for applications requesting a longer random string than produced by the underlying enctype's pseudo-random() function. In particular, the random string produced from a session key of enctype aes256-cts-hmac-sha1-96 or aes256-cts-hmac-sha1-96 will be different at the 17th octet and later, after this change. The counter used in the PRF+ construction is now encoded as a big-endian integer in accordance with RFC 4402. __FreeBSD_version is bumped to 1100004. 20131108: The WITHOUT_ATF build knob has been removed and its functionality has been subsumed into the more generic WITHOUT_TESTS. If you were using the former to disable the build of the ATF libraries, you should change your settings to use the latter. 20131025: The default version of mtree is nmtree which is obtained from NetBSD. The output is generally the same, but may vary slightly. If you found you need identical output adding "-F freebsd9" to the command line should do the trick. For the time being, the old mtree is available as fmtree. 20131014: libbsdyml has been renamed to libyaml and moved to /usr/lib/private. This will break ports-mgmt/pkg. Rebuild the port, or upgrade to pkg 1.1.4_8 and verify bsdyml not linked in, before running "make delete-old-libs": # make -C /usr/ports/ports-mgmt/pkg build deinstall install clean or # pkg install pkg; ldd /usr/local/sbin/pkg | grep bsdyml 20131010: The stable/10 branch has been created in subversion from head revision r256279. 20131010: The rc.d/jail script has been updated to support jail(8) configuration file. The "jail__*" rc.conf(5) variables for per-jail configuration are automatically converted to /var/run/jail..conf before the jail(8) utility is invoked. This is transparently backward compatible. See below about some incompatibilities and rc.conf(5) manual page for more details. These variables are now deprecated in favor of jail(8) configuration file. One can use "rc.d/jail config " command to generate a jail(8) configuration file in /var/run/jail..conf without running the jail(8) utility. The default pathname of the configuration file is /etc/jail.conf and can be specified by using $jail_conf or $jail__conf variables. Please note that jail_devfs_ruleset accepts an integer at this moment. Please consider to rewrite the ruleset name with an integer. 20130930: BIND has been removed from the base system. If all you need is a local resolver, simply enable and start the local_unbound service instead. Otherwise, several versions of BIND are available in the ports tree. The dns/bind99 port is one example. With this change, nslookup(1) and dig(1) are no longer in the base system. Users should instead use host(1) and drill(1) which are in the base system. Alternatively, nslookup and dig can be obtained by installing the dns/bind-tools port. 20130916: With the addition of unbound(8), a new unbound user is now required during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook. 20130911: OpenSSH is now built with DNSSEC support, and will by default silently trust signed SSHFP records. This can be controlled with the VerifyHostKeyDNS client configuration setting. DNSSEC support can be disabled entirely with the WITHOUT_LDNS option in src.conf. 20130906: The GNU Compiler Collection and C++ standard library (libstdc++) are no longer built by default on platforms where clang is the system compiler. You can enable them with the WITH_GCC and WITH_GNUCXX options in src.conf. 20130905: The PROCDESC kernel option is now part of the GENERIC kernel configuration and is required for the rwhod(8) to work. If you are using custom kernel configuration, you should include 'options PROCDESC'. 20130905: The API and ABI related to the Capsicum framework was modified in backward incompatible way. The userland libraries and programs have to be recompiled to work with the new kernel. This includes the following libraries and programs, but the whole buildworld is advised: libc, libprocstat, dhclient, tcpdump, hastd, hastctl, kdump, procstat, rwho, rwhod, uniq. 20130903: AES-NI intrinsic support has been added to gcc. The AES-NI module has been updated to use this support. A new gcc is required to build the aesni module on both i386 and amd64. 20130821: The PADLOCK_RNG and RDRAND_RNG kernel options are now devices. Thus "device padlock_rng" and "device rdrand_rng" should be used instead of "options PADLOCK_RNG" & "options RDRAND_RNG". 20130813: WITH_ICONV has been split into two feature sets. WITH_ICONV now enables just the iconv* functionality and is now on by default. WITH_LIBICONV_COMPAT enables the libiconv api and link time compatibility. Set WITHOUT_ICONV to build the old way. If you have been using WITH_ICONV before, you will very likely need to turn on WITH_LIBICONV_COMPAT. 20130806: INVARIANTS option now enables DEBUG for code with OpenSolaris and Illumos origin, including ZFS. If you have INVARIANTS in your kernel configuration, then there is no need to set DEBUG or ZFS_DEBUG explicitly. DEBUG used to enable witness(9) tracking of OpenSolaris (mostly ZFS) locks if WITNESS option was set. Because that generated a lot of witness(9) reports and all of them were believed to be false positives, this is no longer done. New option OPENSOLARIS_WITNESS can be used to achieve the previous behavior. 20130806: Timer values in IPv6 data structures now use time_uptime instead of time_second. Although this is not a user-visible functional change, userland utilities which directly use them---ndp(8), rtadvd(8), and rtsold(8) in the base system---need to be updated to r253970 or later. 20130802: find -delete can now delete the pathnames given as arguments, instead of only files found below them or if the pathname did not contain any slashes. Formerly, the following error message would result: find: -delete: : relative path potentially not safe Deleting the pathnames given as arguments can be prevented without error messages using -mindepth 1 or by changing directory and passing "." as argument to find. This works in the old as well as the new version of find. 20130726: Behavior of devfs rules path matching has been changed. Pattern is now always matched against fully qualified devfs path and slash characters must be explicitly matched by slashes in pattern (FNM_PATHNAME). Rulesets involving devfs subdirectories must be reviewed. 20130716: The default ARM ABI has changed to the ARM EABI. The old ABI is incompatible with the ARM EABI and all programs and modules will need to be rebuilt to work with a new kernel. To keep using the old ABI ensure the WITHOUT_ARM_EABI knob is set. NOTE: Support for the old ABI will be removed in the future and users are advised to upgrade. 20130709: pkg_install has been disconnected from the build if you really need it you should add WITH_PKGTOOLS in your src.conf(5). 20130709: Most of network statistics structures were changed to be able keep 64-bits counters. Thus all tools, that work with networking statistics, must be rebuilt (netstat(1), bsnmpd(1), etc.) 20130618: Fix a bug that allowed a tracing process (e.g. gdb) to write to a memory-mapped file in the traced process's address space even if neither the traced process nor the tracing process had write access to that file. 20130615: CVS has been removed from the base system. An exact copy of the code is available from the devel/cvs port. 20130613: Some people report the following error after the switch to bmake: make: illegal option -- J usage: make [-BPSXeiknpqrstv] [-C directory] [-D variable] ... *** [buildworld] Error code 2 this likely due to an old instance of make in ${MAKEPATH} (${MAKEOBJDIRPREFIX}${.CURDIR}/make.${MACHINE}) which src/Makefile will use that blindly, if it exists, so if you see the above error: rm -rf `make -V MAKEPATH` should resolve it. 20130516: Use bmake by default. Whereas before one could choose to build with bmake via -DWITH_BMAKE one must now use -DWITHOUT_BMAKE to use the old make. The goal is to remove these knobs for 10-RELEASE. It is worth noting that bmake (like gmake) treats the command line as the unit of failure, rather than statements within the command line. Thus '(cd some/where && dosomething)' is safer than 'cd some/where; dosomething'. The '()' allows consistent behavior in parallel build. 20130429: Fix a bug that allows NFS clients to issue READDIR on files. 20130426: The WITHOUT_IDEA option has been removed because the IDEA patent expired. 20130426: The sysctl which controls TRIM support under ZFS has been renamed from vfs.zfs.trim_disable -> vfs.zfs.trim.enabled and has been enabled by default. 20130425: The mergemaster command now uses the default MAKEOBJDIRPREFIX rather than creating it's own in the temporary directory in order allow access to bootstrapped versions of tools such as install and mtree. When upgrading from version of FreeBSD where the install command does not support -l, you will need to install a new mergemaster command if mergemaster -p is required. This can be accomplished with the command (cd src/usr.sbin/mergemaster && make install). 20130404: Legacy ATA stack, disabled and replaced by new CAM-based one since FreeBSD 9.0, completely removed from the sources. Kernel modules atadisk and atapi*, user-level tools atacontrol and burncd are removed. Kernel option `options ATA_CAM` is now permanently enabled and removed. 20130319: SOCK_CLOEXEC and SOCK_NONBLOCK flags have been added to socket(2) and socketpair(2). Software, in particular Kerberos, may automatically detect and use these during building. The resulting binaries will not work on older kernels. 20130308: CTL_DISABLE has also been added to the sparc64 GENERIC (for further information, see the respective 20130304 entry). 20130304: Recent commits to callout(9) changed the size of struct callout, so the KBI is probably heavily disturbed. Also, some functions in callout(9)/sleep(9)/sleepqueue(9)/condvar(9) KPIs were replaced by macros. Every kernel module using it won't load, so rebuild is requested. The ctl device has been re-enabled in GENERIC for i386 and amd64, but does not initialize by default (because of the new CTL_DISABLE option) to save memory. To re-enable it, remove the CTL_DISABLE option from the kernel config file or set kern.cam.ctl.disable=0 in /boot/loader.conf. 20130301: The ctl device has been disabled in GENERIC for i386 and amd64. This was done due to the extra memory being allocated at system initialisation time by the ctl driver which was only used if a CAM target device was created. This makes a FreeBSD system unusable on 128MB or less of RAM. 20130208: A new compression method (lz4) has been merged to -HEAD. Please refer to zpool-features(7) for more information. Please refer to the "ZFS notes" section of this file for information on upgrading boot ZFS pools. 20130129: A BSD-licensed patch(1) variant has been added and is installed as bsdpatch, being the GNU version the default patch. To inverse the logic and use the BSD-licensed one as default, while having the GNU version installed as gnupatch, rebuild and install world with the WITH_BSD_PATCH knob set. 20130121: Due to the use of the new -l option to install(1) during build and install, you must take care not to directly set the INSTALL make variable in your /etc/make.conf, /etc/src.conf, or on the command line. If you wish to use the -C flag for all installs you may be able to add INSTALL+=-C to /etc/make.conf or /etc/src.conf. 20130118: The install(1) option -M has changed meaning and now takes an argument that is a file or path to append logs to. In the unlikely event that -M was the last option on the command line and the command line contained at least two files and a target directory the first file will have logs appended to it. The -M option served little practical purpose in the last decade so its use is expected to be extremely rare. 20121223: After switching to Clang as the default compiler some users of ZFS on i386 systems started to experience stack overflow kernel panics. Please consider using 'options KSTACK_PAGES=4' in such configurations. 20121222: GEOM_LABEL now mangles label names read from file system metadata. Mangling affect labels containing spaces, non-printable characters, '%' or '"'. Device names in /etc/fstab and other places may need to be updated. 20121217: By default, only the 10 most recent kernel dumps will be saved. To restore the previous behaviour (no limit on the number of kernel dumps stored in the dump directory) add the following line to /etc/rc.conf: savecore_flags="" 20121201: With the addition of auditdistd(8), a new auditdistd user is now required during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook. 20121117: The sin6_scope_id member variable in struct sockaddr_in6 is now filled by the kernel before passing the structure to the userland via sysctl or routing socket. This means the KAME-specific embedded scope id in sin6_addr.s6_addr[2] is always cleared in userland application. This behavior can be controlled by net.inet6.ip6.deembed_scopeid. __FreeBSD_version is bumped to 1000025. 20121105: On i386 and amd64 systems WITH_CLANG_IS_CC is now the default. This means that the world and kernel will be compiled with clang and that clang will be installed as /usr/bin/cc, /usr/bin/c++, and /usr/bin/cpp. To disable this behavior and revert to building with gcc, compile with WITHOUT_CLANG_IS_CC. Really old versions of current may need to bootstrap WITHOUT_CLANG first if the clang build fails (its compatibility window doesn't extend to the 9 stable branch point). 20121102: The IPFIREWALL_FORWARD kernel option has been removed. Its functionality now turned on by default. 20121023: The ZERO_COPY_SOCKET kernel option has been removed and split into SOCKET_SEND_COW and SOCKET_RECV_PFLIP. NB: SOCKET_SEND_COW uses the VM page based copy-on-write mechanism which is not safe and may result in kernel crashes. NB: The SOCKET_RECV_PFLIP mechanism is useless as no current driver supports disposeable external page sized mbuf storage. Proper replacements for both zero-copy mechanisms are under consideration and will eventually lead to complete removal of the two kernel options. 20121023: The IPv4 network stack has been converted to network byte order. The following modules need to be recompiled together with kernel: carp(4), divert(4), gif(4), siftr(4), gre(4), pf(4), ipfw(4), ng_ipfw(4), stf(4). 20121022: Support for non-MPSAFE filesystems was removed from VFS. The VFS_VERSION was bumped, all filesystem modules shall be recompiled. 20121018: All the non-MPSAFE filesystems have been disconnected from the build. The full list includes: codafs, hpfs, ntfs, nwfs, portalfs, smbfs, xfs. 20121016: The interface cloning API and ABI has changed. The following modules need to be recompiled together with kernel: ipfw(4), pfsync(4), pflog(4), usb(4), wlan(4), stf(4), vlan(4), disc(4), edsc(4), if_bridge(4), gif(4), tap(4), faith(4), epair(4), enc(4), tun(4), if_lagg(4), gre(4). 20121015: The sdhci driver was split in two parts: sdhci (generic SD Host Controller logic) and sdhci_pci (actual hardware driver). No kernel config modifications are required, but if you load sdhc as a module you must switch to sdhci_pci instead. 20121014: Import the FUSE kernel and userland support into base system. 20121013: The GNU sort(1) program has been removed since the BSD-licensed sort(1) has been the default for quite some time and no serious problems have been reported. The corresponding WITH_GNU_SORT knob has also gone. 20121006: The pfil(9) API/ABI for AF_INET family has been changed. Packet filtering modules: pf(4), ipfw(4), ipfilter(4) need to be recompiled with new kernel. 20121001: The net80211(4) ABI has been changed to allow for improved driver PS-POLL and power-save support. All wireless drivers need to be recompiled to work with the new kernel. 20120913: The random(4) support for the VIA hardware random number generator (`PADLOCK') is no longer enabled unconditionally. Add the padlock_rng device in the custom kernel config if needed. The GENERIC kernels on i386 and amd64 do include the device, so the change only affects the custom kernel configurations. 20120908: The pf(4) packet filter ABI has been changed. pfctl(8) and snmp_pf module need to be recompiled to work with new kernel. 20120828: A new ZFS feature flag "com.delphix:empty_bpobj" has been merged to -HEAD. Pools that have empty_bpobj in active state can not be imported read-write with ZFS implementations that do not support this feature. For more information read the zpool-features(5) manual page. 20120727: The sparc64 ZFS loader has been changed to no longer try to auto- detect ZFS providers based on diskN aliases but now requires these to be explicitly listed in the OFW boot-device environment variable. 20120712: The OpenSSL has been upgraded to 1.0.1c. Any binaries requiring libcrypto.so.6 or libssl.so.6 must be recompiled. Also, there are configuration changes. Make sure to merge /etc/ssl/openssl.cnf. 20120712: The following sysctls and tunables have been renamed for consistency with other variables: kern.cam.da.da_send_ordered -> kern.cam.da.send_ordered kern.cam.ada.ada_send_ordered -> kern.cam.ada.send_ordered 20120628: The sort utility has been replaced with BSD sort. For now, GNU sort is also available as "gnusort" or the default can be set back to GNU sort by setting WITH_GNU_SORT. In this case, BSD sort will be installed as "bsdsort". 20120611: A new version of ZFS (pool version 5000) has been merged to -HEAD. Starting with this version the old system of ZFS pool versioning is superseded by "feature flags". This concept enables forward compatibility against certain future changes in functionality of ZFS pools. The first read-only compatible "feature flag" for ZFS pools is named "com.delphix:async_destroy". For more information read the new zpool-features(5) manual page. Please refer to the "ZFS notes" section of this file for information on upgrading boot ZFS pools. 20120417: The malloc(3) implementation embedded in libc now uses sources imported as contrib/jemalloc. The most disruptive API change is to /etc/malloc.conf. If your system has an old-style /etc/malloc.conf, delete it prior to installworld, and optionally re-create it using the new format after rebooting. See malloc.conf(5) for details (specifically the TUNING section and the "opt.*" entries in the MALLCTL NAMESPACE section). 20120328: Big-endian MIPS TARGET_ARCH values no longer end in "eb". mips64eb is now spelled mips64. mipsn32eb is now spelled mipsn32. mipseb is now spelled mips. This is to aid compatibility with third-party software that expects this naming scheme in uname(3). Little-endian settings are unchanged. If you are updating a big-endian mips64 machine from before this change, you may need to set MACHINE_ARCH=mips64 in your environment before the new build system will recognize your machine. 20120306: Disable by default the option VFS_ALLOW_NONMPSAFE for all supported platforms. 20120229: Now unix domain sockets behave "as expected" on nullfs(5). Previously nullfs(5) did not pass through all behaviours to the underlying layer, as a result if we bound to a socket on the lower layer we could connect only to the lower path; if we bound to the upper layer we could connect only to the upper path. The new behavior is one can connect to both the lower and the upper paths regardless what layer path one binds to. 20120211: The getifaddrs upgrade path broken with 20111215 has been restored. If you have upgraded in between 20111215 and 20120209 you need to recompile libc again with your kernel. You still need to recompile world to be able to configure CARP but this restriction already comes from 20111215. 20120114: The set_rcvar() function has been removed from /etc/rc.subr. All base and ports rc.d scripts have been updated, so if you have a port installed with a script in /usr/local/etc/rc.d you can either hand-edit the rcvar= line, or reinstall the port. An easy way to handle the mass-update of /etc/rc.d: rm /etc/rc.d/* && mergemaster -i 20120109: panic(9) now stops other CPUs in the SMP systems, disables interrupts on the current CPU and prevents other threads from running. This behavior can be reverted using the kern.stop_scheduler_on_panic tunable/sysctl. The new behavior can be incompatible with kern.sync_on_panic. 20111215: The carp(4) facility has been changed significantly. Configuration of the CARP protocol via ifconfig(8) has changed, as well as format of CARP events submitted to devd(8) has changed. See manual pages for more information. The arpbalance feature of carp(4) is currently not supported anymore. Size of struct in_aliasreq, struct in6_aliasreq has changed. User utilities using SIOCAIFADDR, SIOCAIFADDR_IN6, e.g. ifconfig(8), need to be recompiled. 20111122: The acpi_wmi(4) status device /dev/wmistat has been renamed to /dev/wmistat0. 20111108: The option VFS_ALLOW_NONMPSAFE option has been added in order to explicitely support non-MPSAFE filesystems. It is on by default for all supported platform at this present time. 20111101: The broken amd(4) driver has been replaced with esp(4) in the amd64, i386 and pc98 GENERIC kernel configuration files. 20110930: sysinstall has been removed 20110923: The stable/9 branch created in subversion. This corresponds to the RELENG_9 branch in CVS. COMMON ITEMS: General Notes ------------- Avoid using make -j when upgrading. While generally safe, there are sometimes problems using -j to upgrade. If your upgrade fails with -j, please try again without -j. From time to time in the past there have been problems using -j with buildworld and/or installworld. This is especially true when upgrading between "distant" versions (eg one that cross a major release boundary or several minor releases, or when several months have passed on the -current branch). Sometimes, obscure build problems are the result of environment poisoning. This can happen because the make utility reads its environment when searching for values for global variables. To run your build attempts in an "environmental clean room", prefix all make commands with 'env -i '. See the env(1) manual page for more details. When upgrading from one major version to another it is generally best to upgrade to the latest code in the currently installed branch first, then do an upgrade to the new branch. This is the best-tested upgrade path, and has the highest probability of being successful. Please try this approach before reporting problems with a major version upgrade. When upgrading a live system, having a root shell around before installing anything can help undo problems. Not having a root shell around can lead to problems if pam has changed too much from your starting point to allow continued authentication after the upgrade. This file should be read as a log of events. When a later event changes information of a prior event, the prior event should not be deleted. Instead, a pointer to the entry with the new information should be placed in the old entry. Readers of this file should also sanity check older entries before relying on them blindly. Authors of new entries should write them with this in mind. ZFS notes --------- When upgrading the boot ZFS pool to a new version, always follow these two steps: 1.) recompile and reinstall the ZFS boot loader and boot block (this is part of "make buildworld" and "make installworld") 2.) update the ZFS boot block on your boot drive The following example updates the ZFS boot block on the first partition (freebsd-boot) of a GPT partitioned drive ada0: "gpart bootcode -p /boot/gptzfsboot -i 1 ada0" Non-boot pools do not need these updates. To build a kernel ----------------- If you are updating from a prior version of FreeBSD (even one just a few days old), you should follow this procedure. It is the most failsafe as it uses a /usr/obj tree with a fresh mini-buildworld, make kernel-toolchain make -DALWAYS_CHECK_MAKE buildkernel KERNCONF=YOUR_KERNEL_HERE make -DALWAYS_CHECK_MAKE installkernel KERNCONF=YOUR_KERNEL_HERE To test a kernel once --------------------- If you just want to boot a kernel once (because you are not sure if it works, or if you want to boot a known bad kernel to provide debugging information) run make installkernel KERNCONF=YOUR_KERNEL_HERE KODIR=/boot/testkernel nextboot -k testkernel To just build a kernel when you know that it won't mess you up -------------------------------------------------------------- This assumes you are already running a CURRENT system. Replace ${arch} with the architecture of your machine (e.g. "i386", "arm", "amd64", "ia64", "pc98", "sparc64", "powerpc", "mips", etc). cd src/sys/${arch}/conf config KERNEL_NAME_HERE cd ../compile/KERNEL_NAME_HERE make depend make make install If this fails, go to the "To build a kernel" section. To rebuild everything and install it on the current system. ----------------------------------------------------------- # Note: sometimes if you are running current you gotta do more than # is listed here if you are upgrading from a really old current. make buildworld make kernel KERNCONF=YOUR_KERNEL_HERE [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] To cross-install current onto a separate partition -------------------------------------------------- # In this approach we use a separate partition to hold # current's root, 'usr', and 'var' directories. A partition # holding "/", "/usr" and "/var" should be about 2GB in # size. make buildworld make buildkernel KERNCONF=YOUR_KERNEL_HERE make installworld DESTDIR=${CURRENT_ROOT} -DDB_FROM_SRC make distribution DESTDIR=${CURRENT_ROOT} # if newfs'd make installkernel KERNCONF=YOUR_KERNEL_HERE DESTDIR=${CURRENT_ROOT} cp /etc/fstab ${CURRENT_ROOT}/etc/fstab # if newfs'd To upgrade in-place from stable to current ---------------------------------------------- make buildworld [9] make kernel KERNCONF=YOUR_KERNEL_HERE [8] [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] Make sure that you've read the UPDATING file to understand the tweaks to various things you need. At this point in the life cycle of current, things change often and you are on your own to cope. The defaults can also change, so please read ALL of the UPDATING entries. Also, if you are tracking -current, you must be subscribed to freebsd-current@freebsd.org. Make sure that before you update your sources that you have read and understood all the recent messages there. If in doubt, please track -stable which has much fewer pitfalls. [1] If you have third party modules, such as vmware, you should disable them at this point so they don't crash your system on reboot. [3] From the bootblocks, boot -s, and then do fsck -p mount -u / mount -a cd src adjkerntz -i # if CMOS is wall time Also, when doing a major release upgrade, it is required that you boot into single user mode to do the installworld. [4] Note: This step is non-optional. Failure to do this step can result in a significant reduction in the functionality of the system. Attempting to do it by hand is not recommended and those that pursue this avenue should read this file carefully, as well as the archives of freebsd-current and freebsd-hackers mailing lists for potential gotchas. The -U option is also useful to consider. See mergemaster(8) for more information. [5] Usually this step is a noop. However, from time to time you may need to do this if you get unknown user in the following step. It never hurts to do it all the time. You may need to install a new mergemaster (cd src/usr.sbin/mergemaster && make install) after the buildworld before this step if you last updated from current before 20130425 or from -stable before 20130430. [6] This only deletes old files and directories. Old libraries can be deleted by "make delete-old-libs", but you have to make sure that no program is using those libraries anymore. [8] In order to have a kernel that can run the 4.x binaries needed to do an installworld, you must include the COMPAT_FREEBSD4 option in your kernel. Failure to do so may leave you with a system that is hard to boot to recover. A similar kernel option COMPAT_FREEBSD5 is required to run the 5.x binaries on more recent kernels. And so on for COMPAT_FREEBSD6 and COMPAT_FREEBSD7. Make sure that you merge any new devices from GENERIC since the last time you updated your kernel config file. [9] When checking out sources, you must include the -P flag to have cvs prune empty directories. If CPUTYPE is defined in your /etc/make.conf, make sure to use the "?=" instead of the "=" assignment operator, so that buildworld can override the CPUTYPE if it needs to. MAKEOBJDIRPREFIX must be defined in an environment variable, and not on the command line, or in /etc/make.conf. buildworld will warn if it is improperly defined. FORMAT: This file contains a list, in reverse chronological order, of major breakages in tracking -current. It is not guaranteed to be a complete list of such breakages, and only contains entries since September 23, 2011. If you need to see UPDATING entries from before that date, you will need to fetch an UPDATING file from an older FreeBSD release. Copyright information: Copyright 1998-2009 M. Warner Losh. All Rights Reserved. Redistribution, publication, translation and use, with or without modification, in full or in part, in any form or format of this document are permitted without further permission from the author. THIS DOCUMENT IS PROVIDED BY WARNER LOSH ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL WARNER LOSH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Contact Warner Losh if you have any questions about your use of this document. $FreeBSD$ Index: user/alc/PQ_LAUNDRY/cddl/contrib/opensolaris/cmd/zdb/zdb.8 =================================================================== --- user/alc/PQ_LAUNDRY/cddl/contrib/opensolaris/cmd/zdb/zdb.8 (revision 303205) +++ user/alc/PQ_LAUNDRY/cddl/contrib/opensolaris/cmd/zdb/zdb.8 (revision 303206) @@ -1,340 +1,351 @@ '\" te .\" Copyright (c) 2012, Martin Matuska . .\" All Rights Reserved. .\" .\" This file and its contents are supplied under the terms of the .\" Common Development and Distribution License ("CDDL"), version 1.0. .\" You may only use this file in accordance with the terms of version .\" 1.0 of the CDDL. .\" .\" A full copy of the text of the CDDL should have accompanied this .\" source. A copy of the CDDL is also available via the Internet at .\" http://www.illumos.org/license/CDDL. .\" .\" .\" Copyright 2012, Richard Lowe. .\" Copyright (c) 2012, Marcelo Araujo . .\" Copyright (c) 2012, 2014 by Delphix. All rights reserved. .\" All Rights Reserved. .\" .\" $FreeBSD$ .\" .Dd July 26, 2014 .Dt ZDB 8 .Os .Sh NAME .Nm zdb .Nd Display zpool debugging and consistency information .Sh SYNOPSIS .Nm .Op Fl CumdibcsDvhLMXFPA .Op Fl e Op Fl p Ar path... .Op Fl t Ar txg .Op Fl U Ar cache .Op Fl I Ar inflight I/Os .Op Fl x Ar dumpdir .Ar poolname .Op Ar object ... .Nm .Op Fl divPA .Op Fl e Op Fl p Ar path... .Op Fl U Ar cache .Ar dataset .Op Ar object ... .Nm .Fl m Op Fl MLXFPA .Op Fl t Ar txg .Op Fl e Op Fl p Ar path... .Op Fl U Ar cache .Ar poolname .Nm .Fl R Op Fl A .Op Fl e Op Fl p Ar path... .Op Fl U Ar cache .Ar poolname .Ar poolname .Ar vdev Ns : Ns Ar offset Ns : Ns Ar size Ns Op Ns : Ns Ar flags .Nm .Fl S .Op Fl AP .Op Fl e Op Fl p Ar path... .Op Fl U Ar cache .Ar poolname .Ar poolname .Nm .Fl l .Op Fl uA .Ar device .Nm .Fl C .Op Fl A .Op Fl U Ar cache .Sh DESCRIPTION The .Nm utility displays information about a ZFS pool useful for debugging and performs some amount of consistency checking. It is a not a general purpose tool and options (and facilities) may change. This is neither a .Xr fsck 8 nor a .Xr fsdb 8 utility. .Pp The output of this command in general reflects the on-disk structure of a ZFS pool, and is inherently unstable. The precise output of most invocations is not documented, a knowledge of ZFS internals is assumed. .Pp +If the +.Ar dataset +argument does not contain any +.Sy / +or +.Sy @ +characters, it is interpreted as a pool name. +The root dataset can be specified as +.Pa pool Ns Sy / +(pool name followed by a slash). +.Pp When operating on an imported and active pool it is possible, though unlikely, that zdb may interpret inconsistent pool data and behave erratically. .Sh OPTIONS Display options: .Bl -tag -width indent .It Fl b Display statistics regarding the number, size (logical, physical and allocated) and deduplication of blocks. .It Fl c Verify the checksum of all metadata blocks while printing block statistics (see .Fl b Ns ). .Pp If specified multiple times, verify the checksums of all blocks. .It Fl C Display information about the configuration. If specified with no other options, instead display information about the cache file .Po Pa /etc/zfs/zpool.cache Pc . To specify the cache file to display, see .Fl U .Pp If specified multiple times, and a pool name is also specified display both the cached configuration and the on-disk configuration. If specified multiple times with .Fl e also display the configuration that would be used were the pool to be imported. .It Fl d Display information about datasets. Specified once, displays basic dataset information: ID, create transaction, size, and object count. .Pp If specified multiple times provides greater and greater verbosity. .Pp If object IDs are specified, display information about those specific objects only. .It Fl D Display deduplication statistics, including the deduplication ratio (dedup), compression ratio (compress), inflation due to the zfs copies property (copies), and an overall effective ratio (dedup * compress / copies). .Pp If specified twice, display a histogram of deduplication statistics, showing the allocated (physically present on disk) and referenced (logically referenced in the pool) block counts and sizes by reference count. .Pp If specified a third time, display the statistics independently for each deduplication table. .Pp If specified a fourth time, dump the contents of the deduplication tables describing duplicate blocks. .Pp If specified a fifth time, also dump the contents of the deduplication tables describing unique blocks. .It Fl h Display pool history similar to .Cm zpool history , but include internal changes, transaction, and dataset information. .It Fl i Display information about intent log (ZIL) entries relating to each dataset. If specified multiple times, display counts of each intent log transaction type. .It Fl l Ar device Display the vdev labels from the specified device. If the .Fl u option is also specified, also display the uberblocks on this device. .It Fl L Disable leak tracing and the loading of space maps. By default, .Nm verifies that all non-free blocks are referenced, which can be very expensive. .It Fl m Display the offset, spacemap, and free space of each metaslab. When specified twice, also display information about the on-disk free space histogram associated with each metaslab. When specified three time, display the maximum contiguous free space, the in-core free space histogram, and the percentage of free space in each space map. When specified four times display every spacemap record. .It Fl M Display the offset, spacemap, and free space of each metaslab. When specified twice, also display information about the maximum contiguous free space and the percentage of free space in each space map. When specified three times display every spacemap record. .It Xo .Fl R Ar poolname .Ar vdev Ns : Ns Ar offset Ns : Ns Ar size Ns Op Ns : Ns Ar flags .Xc Read and display a block from the specified device. By default the block is displayed as a hex dump, but see the description of the .Fl r flag, below. .Pp The block is specified in terms of a colon-separated tuple .Ar vdev (an integer vdev identifier) .Ar offset (the offset within the vdev) .Ar size (the size of the block to read) and, optionally, .Ar flags (a set of flags, described below). .Bl -tag -width indent .It Sy b offset Print block pointer .It Sy d Decompress the block .It Sy e Byte swap the block .It Sy g Dump gang block header .It Sy i Dump indirect block .It Sy r Dump raw uninterpreted block data .El .It Fl s Report statistics on .Nm Ns 's I/O. Display operation counts, bandwidth, and error counts of I/O to the pool from .Nm . .It Fl S Simulate the effects of deduplication, constructing a DDT and then display that DDT as with \fB-DD\fR. .It Fl u Display the current uberblock. .El .Pp Other options: .Bl -tag -width indent .It Fl A Do not abort should any assertion fail. .It Fl AA Enable panic recovery, certain errors which would otherwise be fatal are demoted to warnings. .It Fl AAA Do not abort if asserts fail and also enable panic recovery. .It Fl e Op Fl p Ar path... Operate on an exported pool, not present in .Pa /etc/zfs/zpool.cache . The .Fl p flag specifies the path under which devices are to be searched. .It Fl x Ar dumpdir All blocks accessed will be copied to files in the specified directory. The blocks will be placed in sparse files whose name is the same as that of the file or device read. zdb can be then run on the generated files. Note that the .Fl bbc flags are sufficient to access (and thus copy) all metadata on the pool. .It Fl F Attempt to make an unreadable pool readable by trying progressively older transactions. .It Fl I Ar inflight I/Os Limit the number of outstanding checksum I/Os to the specified value. The default value is 200. This option affects the performance of the .Fl c option. .It Fl P Print numbers in an unscaled form more amenable to parsing, eg. 1000000 rather than 1M. .It Fl t Ar transaction Specify the highest transaction to use when searching for uberblocks. See also the .Fl u and .Fl l options for a means to see the available uberblocks and their associated transaction numbers. .It Fl U Ar cachefile Use a cache file other than .Pa /boot/zfs/zpool.cache . .It Fl v Enable verbosity. Specify multiple times for increased verbosity. .It Fl X Attempt .Ql extreme transaction rewind, that is attempt the same recovery as .Fl F but read transactions otherwise deemed too old. .El .Pp Specifying a display option more than once enables verbosity for only that option, with more occurrences enabling more verbosity. .Pp If no options are specified, all information about the named pool will be displayed at default verbosity. .Sh EXAMPLES .Bl -tag -width 0n .It Sy Example 1 Display the configuration of imported pool 'rpool' .Bd -literal -offset 2n .Li # Ic zdb -C rpool MOS Configuration: version: 28 name: 'rpool' ... .Ed .It Sy Example 2 Display basic dataset information about 'rpool' .Bd -literal -offset 2n .Li # Ic zdb -d rpool Dataset mos [META], ID 0, cr_txg 4, 26.9M, 1051 objects Dataset rpool/swap [ZVOL], ID 59, cr_txg 356, 486M, 2 objects ... .Ed .It Xo Sy Example 3 Display basic information about object 0 in .Sy 'rpool/export/home' .Xc .Bd -literal -offset 2n .Li # Ic zdb -d rpool/export/home 0 Dataset rpool/export/home [ZPL], ID 137, cr_txg 1546, 32K, 8 objects Object lvl iblk dblk dsize lsize %full type 0 7 16K 16K 15.0K 16K 25.00 DMU dnode .Ed .It Xo Sy Example 4 Display the predicted effect of enabling deduplication on .Sy 'rpool' .Xc .Bd -literal -offset 2n .Li # Ic zdb -S rpool Simulated DDT histogram: bucket allocated referenced ______ ______________________________ ______________________________ refcnt blocks LSIZE PSIZE DSIZE blocks LSIZE PSIZE DSIZE ------ ------ ----- ----- ----- ------ ----- ----- ----- 1 694K 27.1G 15.0G 15.0G 694K 27.1G 15.0G 15.0G 2 35.0K 1.33G 699M 699M 74.7K 2.79G 1.45G 1.45G ... dedup = 1.11, compress = 1.80, copies = 1.00, dedup * compress / copies = 2.00 .Ed .El .Sh SEE ALSO .Xr zfs 8 , .Xr zpool 8 .Sh AUTHORS This manual page is a .Xr mdoc 7 reimplementation of the .Tn illumos manual page .Em zdb(1M) , modified and customized for .Fx and licensed under the Common Development and Distribution License .Pq Tn CDDL . .Pp The .Xr mdoc 7 implementation of this manual page was initially written by .An Martin Matuska Aq mm@FreeBSD.org and .An Marcelo Araujo Aq araujo@FreeBSD.org . Index: user/alc/PQ_LAUNDRY/cddl/contrib/opensolaris/cmd/zdb/zdb.c =================================================================== --- user/alc/PQ_LAUNDRY/cddl/contrib/opensolaris/cmd/zdb/zdb.c (revision 303205) +++ user/alc/PQ_LAUNDRY/cddl/contrib/opensolaris/cmd/zdb/zdb.c (revision 303206) @@ -1,3808 +1,3834 @@ /* * CDDL HEADER START * * The contents of this file are subject to the terms of the * Common Development and Distribution License (the "License"). * You may not use this file except in compliance with the License. * * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE * or http://www.opensolaris.org/os/licensing. * See the License for the specific language governing permissions * and limitations under the License. * * When distributing Covered Code, include this CDDL HEADER in each * file and include the License file at usr/src/OPENSOLARIS.LICENSE. * If applicable, add the following below this CDDL HEADER, with the * fields enclosed by brackets "[]" replaced with your own identifying * information: Portions Copyright [yyyy] [name of copyright owner] * * CDDL HEADER END */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2011, 2015 by Delphix. All rights reserved. * Copyright (c) 2014 Integros [integros.com] */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #undef ZFS_MAXNAMELEN #undef verify #include #define ZDB_COMPRESS_NAME(idx) ((idx) < ZIO_COMPRESS_FUNCTIONS ? \ zio_compress_table[(idx)].ci_name : "UNKNOWN") #define ZDB_CHECKSUM_NAME(idx) ((idx) < ZIO_CHECKSUM_FUNCTIONS ? \ zio_checksum_table[(idx)].ci_name : "UNKNOWN") #define ZDB_OT_NAME(idx) ((idx) < DMU_OT_NUMTYPES ? \ dmu_ot[(idx)].ot_name : DMU_OT_IS_VALID(idx) ? \ dmu_ot_byteswap[DMU_OT_BYTESWAP(idx)].ob_name : "UNKNOWN") #define ZDB_OT_TYPE(idx) ((idx) < DMU_OT_NUMTYPES ? (idx) : \ (((idx) == DMU_OTN_ZAP_DATA || (idx) == DMU_OTN_ZAP_METADATA) ? \ DMU_OT_ZAP_OTHER : DMU_OT_NUMTYPES)) #ifndef lint extern boolean_t zfs_recover; extern uint64_t zfs_arc_max, zfs_arc_meta_limit; extern int zfs_vdev_async_read_max_active; #else boolean_t zfs_recover; uint64_t zfs_arc_max, zfs_arc_meta_limit; int zfs_vdev_async_read_max_active; #endif const char cmdname[] = "zdb"; uint8_t dump_opt[256]; typedef void object_viewer_t(objset_t *, uint64_t, void *data, size_t size); extern void dump_intent_log(zilog_t *); static uint64_t *zopt_object = NULL; static int zopt_objects = 0; static libzfs_handle_t *g_zfs; static uint64_t max_inflight = 1000; static void snprintf_blkptr_compact(char *, size_t, const blkptr_t *); /* * These libumem hooks provide a reasonable set of defaults for the allocator's * debugging facilities. */ const char * _umem_debug_init() { return ("default,verbose"); /* $UMEM_DEBUG setting */ } const char * _umem_logging_init(void) { return ("fail,contents"); /* $UMEM_LOGGING setting */ } static void usage(void) { (void) fprintf(stderr, "Usage: %s [-CumMdibcsDvhLXFPA] [-t txg] [-e [-p path...]] " "[-U config] [-I inflight I/Os] [-x dumpdir] poolname [object...]\n" " %s [-divPA] [-e -p path...] [-U config] dataset " "[object...]\n" " %s -mM [-LXFPA] [-t txg] [-e [-p path...]] [-U config] " "poolname [vdev [metaslab...]]\n" " %s -R [-A] [-e [-p path...]] poolname " "vdev:offset:size[:flags]\n" " %s -S [-PA] [-e [-p path...]] [-U config] poolname\n" " %s -l [-uA] device\n" " %s -C [-A] [-U config]\n\n", cmdname, cmdname, cmdname, cmdname, cmdname, cmdname, cmdname); (void) fprintf(stderr, " Dataset name must include at least one " "separator character '/' or '@'\n"); (void) fprintf(stderr, " If dataset name is specified, only that " "dataset is dumped\n"); (void) fprintf(stderr, " If object numbers are specified, only " "those objects are dumped\n\n"); (void) fprintf(stderr, " Options to control amount of output:\n"); (void) fprintf(stderr, " -u uberblock\n"); (void) fprintf(stderr, " -d dataset(s)\n"); (void) fprintf(stderr, " -i intent logs\n"); (void) fprintf(stderr, " -C config (or cachefile if alone)\n"); (void) fprintf(stderr, " -h pool history\n"); (void) fprintf(stderr, " -b block statistics\n"); (void) fprintf(stderr, " -m metaslabs\n"); (void) fprintf(stderr, " -M metaslab groups\n"); (void) fprintf(stderr, " -c checksum all metadata (twice for " "all data) blocks\n"); (void) fprintf(stderr, " -s report stats on zdb's I/O\n"); (void) fprintf(stderr, " -D dedup statistics\n"); (void) fprintf(stderr, " -S simulate dedup to measure effect\n"); (void) fprintf(stderr, " -v verbose (applies to all others)\n"); (void) fprintf(stderr, " -l dump label contents\n"); (void) fprintf(stderr, " -L disable leak tracking (do not " "load spacemaps)\n"); (void) fprintf(stderr, " -R read and display block from a " "device\n\n"); (void) fprintf(stderr, " Below options are intended for use " "with other options:\n"); (void) fprintf(stderr, " -A ignore assertions (-A), enable " "panic recovery (-AA) or both (-AAA)\n"); (void) fprintf(stderr, " -F attempt automatic rewind within " "safe range of transaction groups\n"); (void) fprintf(stderr, " -U -- use alternate " "cachefile\n"); (void) fprintf(stderr, " -X attempt extreme rewind (does not " "work with dataset)\n"); (void) fprintf(stderr, " -e pool is exported/destroyed/" "has altroot/not in a cachefile\n"); (void) fprintf(stderr, " -p -- use one or more with " "-e to specify path to vdev dir\n"); (void) fprintf(stderr, " -x -- " "dump all read blocks into specified directory\n"); (void) fprintf(stderr, " -P print numbers in parseable form\n"); (void) fprintf(stderr, " -t -- highest txg to use when " "searching for uberblocks\n"); (void) fprintf(stderr, " -I -- " "specify the maximum number of " "checksumming I/Os [default is 200]\n"); (void) fprintf(stderr, "Specify an option more than once (e.g. -bb) " "to make only that option verbose\n"); (void) fprintf(stderr, "Default is to dump everything non-verbosely\n"); exit(1); } /* * Called for usage errors that are discovered after a call to spa_open(), * dmu_bonus_hold(), or pool_match(). abort() is called for other errors. */ static void fatal(const char *fmt, ...) { va_list ap; va_start(ap, fmt); (void) fprintf(stderr, "%s: ", cmdname); (void) vfprintf(stderr, fmt, ap); va_end(ap); (void) fprintf(stderr, "\n"); exit(1); } /* ARGSUSED */ static void dump_packed_nvlist(objset_t *os, uint64_t object, void *data, size_t size) { nvlist_t *nv; size_t nvsize = *(uint64_t *)data; char *packed = umem_alloc(nvsize, UMEM_NOFAIL); VERIFY(0 == dmu_read(os, object, 0, nvsize, packed, DMU_READ_PREFETCH)); VERIFY(nvlist_unpack(packed, nvsize, &nv, 0) == 0); umem_free(packed, nvsize); dump_nvlist(nv, 8); nvlist_free(nv); } /* ARGSUSED */ static void dump_history_offsets(objset_t *os, uint64_t object, void *data, size_t size) { spa_history_phys_t *shp = data; if (shp == NULL) return; (void) printf("\t\tpool_create_len = %llu\n", (u_longlong_t)shp->sh_pool_create_len); (void) printf("\t\tphys_max_off = %llu\n", (u_longlong_t)shp->sh_phys_max_off); (void) printf("\t\tbof = %llu\n", (u_longlong_t)shp->sh_bof); (void) printf("\t\teof = %llu\n", (u_longlong_t)shp->sh_eof); (void) printf("\t\trecords_lost = %llu\n", (u_longlong_t)shp->sh_records_lost); } static void zdb_nicenum(uint64_t num, char *buf) { if (dump_opt['P']) (void) sprintf(buf, "%llu", (longlong_t)num); else nicenum(num, buf); } const char histo_stars[] = "****************************************"; const int histo_width = sizeof (histo_stars) - 1; static void dump_histogram(const uint64_t *histo, int size, int offset) { int i; int minidx = size - 1; int maxidx = 0; uint64_t max = 0; for (i = 0; i < size; i++) { if (histo[i] > max) max = histo[i]; if (histo[i] > 0 && i > maxidx) maxidx = i; if (histo[i] > 0 && i < minidx) minidx = i; } if (max < histo_width) max = histo_width; for (i = minidx; i <= maxidx; i++) { (void) printf("\t\t\t%3u: %6llu %s\n", i + offset, (u_longlong_t)histo[i], &histo_stars[(max - histo[i]) * histo_width / max]); } } static void dump_zap_stats(objset_t *os, uint64_t object) { int error; zap_stats_t zs; error = zap_get_stats(os, object, &zs); if (error) return; if (zs.zs_ptrtbl_len == 0) { ASSERT(zs.zs_num_blocks == 1); (void) printf("\tmicrozap: %llu bytes, %llu entries\n", (u_longlong_t)zs.zs_blocksize, (u_longlong_t)zs.zs_num_entries); return; } (void) printf("\tFat ZAP stats:\n"); (void) printf("\t\tPointer table:\n"); (void) printf("\t\t\t%llu elements\n", (u_longlong_t)zs.zs_ptrtbl_len); (void) printf("\t\t\tzt_blk: %llu\n", (u_longlong_t)zs.zs_ptrtbl_zt_blk); (void) printf("\t\t\tzt_numblks: %llu\n", (u_longlong_t)zs.zs_ptrtbl_zt_numblks); (void) printf("\t\t\tzt_shift: %llu\n", (u_longlong_t)zs.zs_ptrtbl_zt_shift); (void) printf("\t\t\tzt_blks_copied: %llu\n", (u_longlong_t)zs.zs_ptrtbl_blks_copied); (void) printf("\t\t\tzt_nextblk: %llu\n", (u_longlong_t)zs.zs_ptrtbl_nextblk); (void) printf("\t\tZAP entries: %llu\n", (u_longlong_t)zs.zs_num_entries); (void) printf("\t\tLeaf blocks: %llu\n", (u_longlong_t)zs.zs_num_leafs); (void) printf("\t\tTotal blocks: %llu\n", (u_longlong_t)zs.zs_num_blocks); (void) printf("\t\tzap_block_type: 0x%llx\n", (u_longlong_t)zs.zs_block_type); (void) printf("\t\tzap_magic: 0x%llx\n", (u_longlong_t)zs.zs_magic); (void) printf("\t\tzap_salt: 0x%llx\n", (u_longlong_t)zs.zs_salt); (void) printf("\t\tLeafs with 2^n pointers:\n"); dump_histogram(zs.zs_leafs_with_2n_pointers, ZAP_HISTOGRAM_SIZE, 0); (void) printf("\t\tBlocks with n*5 entries:\n"); dump_histogram(zs.zs_blocks_with_n5_entries, ZAP_HISTOGRAM_SIZE, 0); (void) printf("\t\tBlocks n/10 full:\n"); dump_histogram(zs.zs_blocks_n_tenths_full, ZAP_HISTOGRAM_SIZE, 0); (void) printf("\t\tEntries with n chunks:\n"); dump_histogram(zs.zs_entries_using_n_chunks, ZAP_HISTOGRAM_SIZE, 0); (void) printf("\t\tBuckets with n entries:\n"); dump_histogram(zs.zs_buckets_with_n_entries, ZAP_HISTOGRAM_SIZE, 0); } /*ARGSUSED*/ static void dump_none(objset_t *os, uint64_t object, void *data, size_t size) { } /*ARGSUSED*/ static void dump_unknown(objset_t *os, uint64_t object, void *data, size_t size) { (void) printf("\tUNKNOWN OBJECT TYPE\n"); } /*ARGSUSED*/ void dump_uint8(objset_t *os, uint64_t object, void *data, size_t size) { } /*ARGSUSED*/ static void dump_uint64(objset_t *os, uint64_t object, void *data, size_t size) { } /*ARGSUSED*/ static void dump_zap(objset_t *os, uint64_t object, void *data, size_t size) { zap_cursor_t zc; zap_attribute_t attr; void *prop; int i; dump_zap_stats(os, object); (void) printf("\n"); for (zap_cursor_init(&zc, os, object); zap_cursor_retrieve(&zc, &attr) == 0; zap_cursor_advance(&zc)) { (void) printf("\t\t%s = ", attr.za_name); if (attr.za_num_integers == 0) { (void) printf("\n"); continue; } prop = umem_zalloc(attr.za_num_integers * attr.za_integer_length, UMEM_NOFAIL); (void) zap_lookup(os, object, attr.za_name, attr.za_integer_length, attr.za_num_integers, prop); if (attr.za_integer_length == 1) { (void) printf("%s", (char *)prop); } else { for (i = 0; i < attr.za_num_integers; i++) { switch (attr.za_integer_length) { case 2: (void) printf("%u ", ((uint16_t *)prop)[i]); break; case 4: (void) printf("%u ", ((uint32_t *)prop)[i]); break; case 8: (void) printf("%lld ", (u_longlong_t)((int64_t *)prop)[i]); break; } } } (void) printf("\n"); umem_free(prop, attr.za_num_integers * attr.za_integer_length); } zap_cursor_fini(&zc); } static void dump_bpobj(objset_t *os, uint64_t object, void *data, size_t size) { bpobj_phys_t *bpop = data; char bytes[32], comp[32], uncomp[32]; if (bpop == NULL) return; zdb_nicenum(bpop->bpo_bytes, bytes); zdb_nicenum(bpop->bpo_comp, comp); zdb_nicenum(bpop->bpo_uncomp, uncomp); (void) printf("\t\tnum_blkptrs = %llu\n", (u_longlong_t)bpop->bpo_num_blkptrs); (void) printf("\t\tbytes = %s\n", bytes); if (size >= BPOBJ_SIZE_V1) { (void) printf("\t\tcomp = %s\n", comp); (void) printf("\t\tuncomp = %s\n", uncomp); } if (size >= sizeof (*bpop)) { (void) printf("\t\tsubobjs = %llu\n", (u_longlong_t)bpop->bpo_subobjs); (void) printf("\t\tnum_subobjs = %llu\n", (u_longlong_t)bpop->bpo_num_subobjs); } if (dump_opt['d'] < 5) return; for (uint64_t i = 0; i < bpop->bpo_num_blkptrs; i++) { char blkbuf[BP_SPRINTF_LEN]; blkptr_t bp; int err = dmu_read(os, object, i * sizeof (bp), sizeof (bp), &bp, 0); if (err != 0) { (void) printf("got error %u from dmu_read\n", err); break; } snprintf_blkptr_compact(blkbuf, sizeof (blkbuf), &bp); (void) printf("\t%s\n", blkbuf); } } /* ARGSUSED */ static void dump_bpobj_subobjs(objset_t *os, uint64_t object, void *data, size_t size) { dmu_object_info_t doi; VERIFY0(dmu_object_info(os, object, &doi)); uint64_t *subobjs = kmem_alloc(doi.doi_max_offset, KM_SLEEP); int err = dmu_read(os, object, 0, doi.doi_max_offset, subobjs, 0); if (err != 0) { (void) printf("got error %u from dmu_read\n", err); kmem_free(subobjs, doi.doi_max_offset); return; } int64_t last_nonzero = -1; for (uint64_t i = 0; i < doi.doi_max_offset / 8; i++) { if (subobjs[i] != 0) last_nonzero = i; } for (int64_t i = 0; i <= last_nonzero; i++) { (void) printf("\t%llu\n", (longlong_t)subobjs[i]); } kmem_free(subobjs, doi.doi_max_offset); } /*ARGSUSED*/ static void dump_ddt_zap(objset_t *os, uint64_t object, void *data, size_t size) { dump_zap_stats(os, object); /* contents are printed elsewhere, properly decoded */ } /*ARGSUSED*/ static void dump_sa_attrs(objset_t *os, uint64_t object, void *data, size_t size) { zap_cursor_t zc; zap_attribute_t attr; dump_zap_stats(os, object); (void) printf("\n"); for (zap_cursor_init(&zc, os, object); zap_cursor_retrieve(&zc, &attr) == 0; zap_cursor_advance(&zc)) { (void) printf("\t\t%s = ", attr.za_name); if (attr.za_num_integers == 0) { (void) printf("\n"); continue; } (void) printf(" %llx : [%d:%d:%d]\n", (u_longlong_t)attr.za_first_integer, (int)ATTR_LENGTH(attr.za_first_integer), (int)ATTR_BSWAP(attr.za_first_integer), (int)ATTR_NUM(attr.za_first_integer)); } zap_cursor_fini(&zc); } /*ARGSUSED*/ static void dump_sa_layouts(objset_t *os, uint64_t object, void *data, size_t size) { zap_cursor_t zc; zap_attribute_t attr; uint16_t *layout_attrs; int i; dump_zap_stats(os, object); (void) printf("\n"); for (zap_cursor_init(&zc, os, object); zap_cursor_retrieve(&zc, &attr) == 0; zap_cursor_advance(&zc)) { (void) printf("\t\t%s = [", attr.za_name); if (attr.za_num_integers == 0) { (void) printf("\n"); continue; } VERIFY(attr.za_integer_length == 2); layout_attrs = umem_zalloc(attr.za_num_integers * attr.za_integer_length, UMEM_NOFAIL); VERIFY(zap_lookup(os, object, attr.za_name, attr.za_integer_length, attr.za_num_integers, layout_attrs) == 0); for (i = 0; i != attr.za_num_integers; i++) (void) printf(" %d ", (int)layout_attrs[i]); (void) printf("]\n"); umem_free(layout_attrs, attr.za_num_integers * attr.za_integer_length); } zap_cursor_fini(&zc); } /*ARGSUSED*/ static void dump_zpldir(objset_t *os, uint64_t object, void *data, size_t size) { zap_cursor_t zc; zap_attribute_t attr; const char *typenames[] = { /* 0 */ "not specified", /* 1 */ "FIFO", /* 2 */ "Character Device", /* 3 */ "3 (invalid)", /* 4 */ "Directory", /* 5 */ "5 (invalid)", /* 6 */ "Block Device", /* 7 */ "7 (invalid)", /* 8 */ "Regular File", /* 9 */ "9 (invalid)", /* 10 */ "Symbolic Link", /* 11 */ "11 (invalid)", /* 12 */ "Socket", /* 13 */ "Door", /* 14 */ "Event Port", /* 15 */ "15 (invalid)", }; dump_zap_stats(os, object); (void) printf("\n"); for (zap_cursor_init(&zc, os, object); zap_cursor_retrieve(&zc, &attr) == 0; zap_cursor_advance(&zc)) { (void) printf("\t\t%s = %lld (type: %s)\n", attr.za_name, ZFS_DIRENT_OBJ(attr.za_first_integer), typenames[ZFS_DIRENT_TYPE(attr.za_first_integer)]); } zap_cursor_fini(&zc); } int get_dtl_refcount(vdev_t *vd) { int refcount = 0; if (vd->vdev_ops->vdev_op_leaf) { space_map_t *sm = vd->vdev_dtl_sm; if (sm != NULL && sm->sm_dbuf->db_size == sizeof (space_map_phys_t)) return (1); return (0); } for (int c = 0; c < vd->vdev_children; c++) refcount += get_dtl_refcount(vd->vdev_child[c]); return (refcount); } int get_metaslab_refcount(vdev_t *vd) { int refcount = 0; if (vd->vdev_top == vd && !vd->vdev_removing) { for (int m = 0; m < vd->vdev_ms_count; m++) { space_map_t *sm = vd->vdev_ms[m]->ms_sm; if (sm != NULL && sm->sm_dbuf->db_size == sizeof (space_map_phys_t)) refcount++; } } for (int c = 0; c < vd->vdev_children; c++) refcount += get_metaslab_refcount(vd->vdev_child[c]); return (refcount); } static int verify_spacemap_refcounts(spa_t *spa) { uint64_t expected_refcount = 0; uint64_t actual_refcount; (void) feature_get_refcount(spa, &spa_feature_table[SPA_FEATURE_SPACEMAP_HISTOGRAM], &expected_refcount); actual_refcount = get_dtl_refcount(spa->spa_root_vdev); actual_refcount += get_metaslab_refcount(spa->spa_root_vdev); if (expected_refcount != actual_refcount) { (void) printf("space map refcount mismatch: expected %lld != " "actual %lld\n", (longlong_t)expected_refcount, (longlong_t)actual_refcount); return (2); } return (0); } static void dump_spacemap(objset_t *os, space_map_t *sm) { uint64_t alloc, offset, entry; char *ddata[] = { "ALLOC", "FREE", "CONDENSE", "INVALID", "INVALID", "INVALID", "INVALID", "INVALID" }; if (sm == NULL) return; /* * Print out the freelist entries in both encoded and decoded form. */ alloc = 0; for (offset = 0; offset < space_map_length(sm); offset += sizeof (entry)) { uint8_t mapshift = sm->sm_shift; VERIFY0(dmu_read(os, space_map_object(sm), offset, sizeof (entry), &entry, DMU_READ_PREFETCH)); if (SM_DEBUG_DECODE(entry)) { (void) printf("\t [%6llu] %s: txg %llu, pass %llu\n", (u_longlong_t)(offset / sizeof (entry)), ddata[SM_DEBUG_ACTION_DECODE(entry)], (u_longlong_t)SM_DEBUG_TXG_DECODE(entry), (u_longlong_t)SM_DEBUG_SYNCPASS_DECODE(entry)); } else { (void) printf("\t [%6llu] %c range:" " %010llx-%010llx size: %06llx\n", (u_longlong_t)(offset / sizeof (entry)), SM_TYPE_DECODE(entry) == SM_ALLOC ? 'A' : 'F', (u_longlong_t)((SM_OFFSET_DECODE(entry) << mapshift) + sm->sm_start), (u_longlong_t)((SM_OFFSET_DECODE(entry) << mapshift) + sm->sm_start + (SM_RUN_DECODE(entry) << mapshift)), (u_longlong_t)(SM_RUN_DECODE(entry) << mapshift)); if (SM_TYPE_DECODE(entry) == SM_ALLOC) alloc += SM_RUN_DECODE(entry) << mapshift; else alloc -= SM_RUN_DECODE(entry) << mapshift; } } if (alloc != space_map_allocated(sm)) { (void) printf("space_map_object alloc (%llu) INCONSISTENT " "with space map summary (%llu)\n", (u_longlong_t)space_map_allocated(sm), (u_longlong_t)alloc); } } static void dump_metaslab_stats(metaslab_t *msp) { char maxbuf[32]; range_tree_t *rt = msp->ms_tree; avl_tree_t *t = &msp->ms_size_tree; int free_pct = range_tree_space(rt) * 100 / msp->ms_size; zdb_nicenum(metaslab_block_maxsize(msp), maxbuf); (void) printf("\t %25s %10lu %7s %6s %4s %4d%%\n", "segments", avl_numnodes(t), "maxsize", maxbuf, "freepct", free_pct); (void) printf("\tIn-memory histogram:\n"); dump_histogram(rt->rt_histogram, RANGE_TREE_HISTOGRAM_SIZE, 0); } static void dump_metaslab(metaslab_t *msp) { vdev_t *vd = msp->ms_group->mg_vd; spa_t *spa = vd->vdev_spa; space_map_t *sm = msp->ms_sm; char freebuf[32]; zdb_nicenum(msp->ms_size - space_map_allocated(sm), freebuf); (void) printf( "\tmetaslab %6llu offset %12llx spacemap %6llu free %5s\n", (u_longlong_t)msp->ms_id, (u_longlong_t)msp->ms_start, (u_longlong_t)space_map_object(sm), freebuf); if (dump_opt['m'] > 2 && !dump_opt['L']) { mutex_enter(&msp->ms_lock); metaslab_load_wait(msp); if (!msp->ms_loaded) { VERIFY0(metaslab_load(msp)); range_tree_stat_verify(msp->ms_tree); } dump_metaslab_stats(msp); metaslab_unload(msp); mutex_exit(&msp->ms_lock); } if (dump_opt['m'] > 1 && sm != NULL && spa_feature_is_active(spa, SPA_FEATURE_SPACEMAP_HISTOGRAM)) { /* * The space map histogram represents free space in chunks * of sm_shift (i.e. bucket 0 refers to 2^sm_shift). */ (void) printf("\tOn-disk histogram:\t\tfragmentation %llu\n", (u_longlong_t)msp->ms_fragmentation); dump_histogram(sm->sm_phys->smp_histogram, SPACE_MAP_HISTOGRAM_SIZE, sm->sm_shift); } if (dump_opt['d'] > 5 || dump_opt['m'] > 3) { ASSERT(msp->ms_size == (1ULL << vd->vdev_ms_shift)); mutex_enter(&msp->ms_lock); dump_spacemap(spa->spa_meta_objset, msp->ms_sm); mutex_exit(&msp->ms_lock); } } static void print_vdev_metaslab_header(vdev_t *vd) { (void) printf("\tvdev %10llu\n\t%-10s%5llu %-19s %-15s %-10s\n", (u_longlong_t)vd->vdev_id, "metaslabs", (u_longlong_t)vd->vdev_ms_count, "offset", "spacemap", "free"); (void) printf("\t%15s %19s %15s %10s\n", "---------------", "-------------------", "---------------", "-------------"); } static void dump_metaslab_groups(spa_t *spa) { vdev_t *rvd = spa->spa_root_vdev; metaslab_class_t *mc = spa_normal_class(spa); uint64_t fragmentation; metaslab_class_histogram_verify(mc); for (int c = 0; c < rvd->vdev_children; c++) { vdev_t *tvd = rvd->vdev_child[c]; metaslab_group_t *mg = tvd->vdev_mg; if (mg->mg_class != mc) continue; metaslab_group_histogram_verify(mg); mg->mg_fragmentation = metaslab_group_fragmentation(mg); (void) printf("\tvdev %10llu\t\tmetaslabs%5llu\t\t" "fragmentation", (u_longlong_t)tvd->vdev_id, (u_longlong_t)tvd->vdev_ms_count); if (mg->mg_fragmentation == ZFS_FRAG_INVALID) { (void) printf("%3s\n", "-"); } else { (void) printf("%3llu%%\n", (u_longlong_t)mg->mg_fragmentation); } dump_histogram(mg->mg_histogram, RANGE_TREE_HISTOGRAM_SIZE, 0); } (void) printf("\tpool %s\tfragmentation", spa_name(spa)); fragmentation = metaslab_class_fragmentation(mc); if (fragmentation == ZFS_FRAG_INVALID) (void) printf("\t%3s\n", "-"); else (void) printf("\t%3llu%%\n", (u_longlong_t)fragmentation); dump_histogram(mc->mc_histogram, RANGE_TREE_HISTOGRAM_SIZE, 0); } static void dump_metaslabs(spa_t *spa) { vdev_t *vd, *rvd = spa->spa_root_vdev; uint64_t m, c = 0, children = rvd->vdev_children; (void) printf("\nMetaslabs:\n"); if (!dump_opt['d'] && zopt_objects > 0) { c = zopt_object[0]; if (c >= children) (void) fatal("bad vdev id: %llu", (u_longlong_t)c); if (zopt_objects > 1) { vd = rvd->vdev_child[c]; print_vdev_metaslab_header(vd); for (m = 1; m < zopt_objects; m++) { if (zopt_object[m] < vd->vdev_ms_count) dump_metaslab( vd->vdev_ms[zopt_object[m]]); else (void) fprintf(stderr, "bad metaslab " "number %llu\n", (u_longlong_t)zopt_object[m]); } (void) printf("\n"); return; } children = c + 1; } for (; c < children; c++) { vd = rvd->vdev_child[c]; print_vdev_metaslab_header(vd); for (m = 0; m < vd->vdev_ms_count; m++) dump_metaslab(vd->vdev_ms[m]); (void) printf("\n"); } } static void dump_dde(const ddt_t *ddt, const ddt_entry_t *dde, uint64_t index) { const ddt_phys_t *ddp = dde->dde_phys; const ddt_key_t *ddk = &dde->dde_key; char *types[4] = { "ditto", "single", "double", "triple" }; char blkbuf[BP_SPRINTF_LEN]; blkptr_t blk; for (int p = 0; p < DDT_PHYS_TYPES; p++, ddp++) { if (ddp->ddp_phys_birth == 0) continue; ddt_bp_create(ddt->ddt_checksum, ddk, ddp, &blk); snprintf_blkptr(blkbuf, sizeof (blkbuf), &blk); (void) printf("index %llx refcnt %llu %s %s\n", (u_longlong_t)index, (u_longlong_t)ddp->ddp_refcnt, types[p], blkbuf); } } static void dump_dedup_ratio(const ddt_stat_t *dds) { double rL, rP, rD, D, dedup, compress, copies; if (dds->dds_blocks == 0) return; rL = (double)dds->dds_ref_lsize; rP = (double)dds->dds_ref_psize; rD = (double)dds->dds_ref_dsize; D = (double)dds->dds_dsize; dedup = rD / D; compress = rL / rP; copies = rD / rP; (void) printf("dedup = %.2f, compress = %.2f, copies = %.2f, " "dedup * compress / copies = %.2f\n\n", dedup, compress, copies, dedup * compress / copies); } static void dump_ddt(ddt_t *ddt, enum ddt_type type, enum ddt_class class) { char name[DDT_NAMELEN]; ddt_entry_t dde; uint64_t walk = 0; dmu_object_info_t doi; uint64_t count, dspace, mspace; int error; error = ddt_object_info(ddt, type, class, &doi); if (error == ENOENT) return; ASSERT(error == 0); error = ddt_object_count(ddt, type, class, &count); ASSERT(error == 0); if (count == 0) return; dspace = doi.doi_physical_blocks_512 << 9; mspace = doi.doi_fill_count * doi.doi_data_block_size; ddt_object_name(ddt, type, class, name); (void) printf("%s: %llu entries, size %llu on disk, %llu in core\n", name, (u_longlong_t)count, (u_longlong_t)(dspace / count), (u_longlong_t)(mspace / count)); if (dump_opt['D'] < 3) return; zpool_dump_ddt(NULL, &ddt->ddt_histogram[type][class]); if (dump_opt['D'] < 4) return; if (dump_opt['D'] < 5 && class == DDT_CLASS_UNIQUE) return; (void) printf("%s contents:\n\n", name); while ((error = ddt_object_walk(ddt, type, class, &walk, &dde)) == 0) dump_dde(ddt, &dde, walk); ASSERT(error == ENOENT); (void) printf("\n"); } static void dump_all_ddts(spa_t *spa) { ddt_histogram_t ddh_total = { 0 }; ddt_stat_t dds_total = { 0 }; for (enum zio_checksum c = 0; c < ZIO_CHECKSUM_FUNCTIONS; c++) { ddt_t *ddt = spa->spa_ddt[c]; for (enum ddt_type type = 0; type < DDT_TYPES; type++) { for (enum ddt_class class = 0; class < DDT_CLASSES; class++) { dump_ddt(ddt, type, class); } } } ddt_get_dedup_stats(spa, &dds_total); if (dds_total.dds_blocks == 0) { (void) printf("All DDTs are empty\n"); return; } (void) printf("\n"); if (dump_opt['D'] > 1) { (void) printf("DDT histogram (aggregated over all DDTs):\n"); ddt_get_dedup_histogram(spa, &ddh_total); zpool_dump_ddt(&dds_total, &ddh_total); } dump_dedup_ratio(&dds_total); } static void dump_dtl_seg(void *arg, uint64_t start, uint64_t size) { char *prefix = arg; (void) printf("%s [%llu,%llu) length %llu\n", prefix, (u_longlong_t)start, (u_longlong_t)(start + size), (u_longlong_t)(size)); } static void dump_dtl(vdev_t *vd, int indent) { spa_t *spa = vd->vdev_spa; boolean_t required; char *name[DTL_TYPES] = { "missing", "partial", "scrub", "outage" }; char prefix[256]; spa_vdev_state_enter(spa, SCL_NONE); required = vdev_dtl_required(vd); (void) spa_vdev_state_exit(spa, NULL, 0); if (indent == 0) (void) printf("\nDirty time logs:\n\n"); (void) printf("\t%*s%s [%s]\n", indent, "", vd->vdev_path ? vd->vdev_path : vd->vdev_parent ? vd->vdev_ops->vdev_op_type : spa_name(spa), required ? "DTL-required" : "DTL-expendable"); for (int t = 0; t < DTL_TYPES; t++) { range_tree_t *rt = vd->vdev_dtl[t]; if (range_tree_space(rt) == 0) continue; (void) snprintf(prefix, sizeof (prefix), "\t%*s%s", indent + 2, "", name[t]); mutex_enter(rt->rt_lock); range_tree_walk(rt, dump_dtl_seg, prefix); mutex_exit(rt->rt_lock); if (dump_opt['d'] > 5 && vd->vdev_children == 0) dump_spacemap(spa->spa_meta_objset, vd->vdev_dtl_sm); } for (int c = 0; c < vd->vdev_children; c++) dump_dtl(vd->vdev_child[c], indent + 4); } /* from spa_history.c: spa_history_create_obj() */ #define HIS_BUF_LEN_DEF (128 << 10) #define HIS_BUF_LEN_MAX (1 << 30) static void dump_history(spa_t *spa) { nvlist_t **events = NULL; char *buf = NULL; uint64_t bufsize = HIS_BUF_LEN_DEF; uint64_t resid, len, off = 0; uint_t num = 0; int error; time_t tsec; struct tm t; char tbuf[30]; char internalstr[MAXPATHLEN]; if ((buf = malloc(bufsize)) == NULL) (void) fprintf(stderr, "Unable to read history: " "out of memory\n"); do { len = bufsize; if ((error = spa_history_get(spa, &off, &len, buf)) != 0) { (void) fprintf(stderr, "Unable to read history: " "error %d\n", error); return; } if (zpool_history_unpack(buf, len, &resid, &events, &num) != 0) break; off -= resid; /* * If the history block is too big, double the buffer * size and try again. */ if (resid == len) { free(buf); buf = NULL; bufsize <<= 1; if ((bufsize >= HIS_BUF_LEN_MAX) || ((buf = malloc(bufsize)) == NULL)) { (void) fprintf(stderr, "Unable to read history: " "out of memory\n"); return; } } } while (len != 0); free(buf); (void) printf("\nHistory:\n"); for (int i = 0; i < num; i++) { uint64_t time, txg, ievent; char *cmd, *intstr; boolean_t printed = B_FALSE; if (nvlist_lookup_uint64(events[i], ZPOOL_HIST_TIME, &time) != 0) goto next; if (nvlist_lookup_string(events[i], ZPOOL_HIST_CMD, &cmd) != 0) { if (nvlist_lookup_uint64(events[i], ZPOOL_HIST_INT_EVENT, &ievent) != 0) goto next; verify(nvlist_lookup_uint64(events[i], ZPOOL_HIST_TXG, &txg) == 0); verify(nvlist_lookup_string(events[i], ZPOOL_HIST_INT_STR, &intstr) == 0); if (ievent >= ZFS_NUM_LEGACY_HISTORY_EVENTS) goto next; (void) snprintf(internalstr, sizeof (internalstr), "[internal %s txg:%lld] %s", zfs_history_event_names[ievent], txg, intstr); cmd = internalstr; } tsec = time; (void) localtime_r(&tsec, &t); (void) strftime(tbuf, sizeof (tbuf), "%F.%T", &t); (void) printf("%s %s\n", tbuf, cmd); printed = B_TRUE; next: if (dump_opt['h'] > 1) { if (!printed) (void) printf("unrecognized record:\n"); dump_nvlist(events[i], 2); } } } /*ARGSUSED*/ static void dump_dnode(objset_t *os, uint64_t object, void *data, size_t size) { } static uint64_t blkid2offset(const dnode_phys_t *dnp, const blkptr_t *bp, const zbookmark_phys_t *zb) { if (dnp == NULL) { ASSERT(zb->zb_level < 0); if (zb->zb_object == 0) return (zb->zb_blkid); return (zb->zb_blkid * BP_GET_LSIZE(bp)); } ASSERT(zb->zb_level >= 0); return ((zb->zb_blkid << (zb->zb_level * (dnp->dn_indblkshift - SPA_BLKPTRSHIFT))) * dnp->dn_datablkszsec << SPA_MINBLOCKSHIFT); } static void snprintf_blkptr_compact(char *blkbuf, size_t buflen, const blkptr_t *bp) { const dva_t *dva = bp->blk_dva; int ndvas = dump_opt['d'] > 5 ? BP_GET_NDVAS(bp) : 1; if (dump_opt['b'] >= 6) { snprintf_blkptr(blkbuf, buflen, bp); return; } if (BP_IS_EMBEDDED(bp)) { (void) sprintf(blkbuf, "EMBEDDED et=%u %llxL/%llxP B=%llu", (int)BPE_GET_ETYPE(bp), (u_longlong_t)BPE_GET_LSIZE(bp), (u_longlong_t)BPE_GET_PSIZE(bp), (u_longlong_t)bp->blk_birth); return; } blkbuf[0] = '\0'; for (int i = 0; i < ndvas; i++) (void) snprintf(blkbuf + strlen(blkbuf), buflen - strlen(blkbuf), "%llu:%llx:%llx ", (u_longlong_t)DVA_GET_VDEV(&dva[i]), (u_longlong_t)DVA_GET_OFFSET(&dva[i]), (u_longlong_t)DVA_GET_ASIZE(&dva[i])); if (BP_IS_HOLE(bp)) { (void) snprintf(blkbuf + strlen(blkbuf), buflen - strlen(blkbuf), "%llxL B=%llu", (u_longlong_t)BP_GET_LSIZE(bp), (u_longlong_t)bp->blk_birth); } else { (void) snprintf(blkbuf + strlen(blkbuf), buflen - strlen(blkbuf), "%llxL/%llxP F=%llu B=%llu/%llu", (u_longlong_t)BP_GET_LSIZE(bp), (u_longlong_t)BP_GET_PSIZE(bp), (u_longlong_t)BP_GET_FILL(bp), (u_longlong_t)bp->blk_birth, (u_longlong_t)BP_PHYSICAL_BIRTH(bp)); } } static void print_indirect(blkptr_t *bp, const zbookmark_phys_t *zb, const dnode_phys_t *dnp) { char blkbuf[BP_SPRINTF_LEN]; int l; if (!BP_IS_EMBEDDED(bp)) { ASSERT3U(BP_GET_TYPE(bp), ==, dnp->dn_type); ASSERT3U(BP_GET_LEVEL(bp), ==, zb->zb_level); } (void) printf("%16llx ", (u_longlong_t)blkid2offset(dnp, bp, zb)); ASSERT(zb->zb_level >= 0); for (l = dnp->dn_nlevels - 1; l >= -1; l--) { if (l == zb->zb_level) { (void) printf("L%llx", (u_longlong_t)zb->zb_level); } else { (void) printf(" "); } } snprintf_blkptr_compact(blkbuf, sizeof (blkbuf), bp); (void) printf("%s\n", blkbuf); } static int visit_indirect(spa_t *spa, const dnode_phys_t *dnp, blkptr_t *bp, const zbookmark_phys_t *zb) { int err = 0; if (bp->blk_birth == 0) return (0); print_indirect(bp, zb, dnp); if (BP_GET_LEVEL(bp) > 0 && !BP_IS_HOLE(bp)) { arc_flags_t flags = ARC_FLAG_WAIT; int i; blkptr_t *cbp; int epb = BP_GET_LSIZE(bp) >> SPA_BLKPTRSHIFT; arc_buf_t *buf; uint64_t fill = 0; err = arc_read(NULL, spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb); if (err) return (err); ASSERT(buf->b_data); /* recursively visit blocks below this */ cbp = buf->b_data; for (i = 0; i < epb; i++, cbp++) { zbookmark_phys_t czb; SET_BOOKMARK(&czb, zb->zb_objset, zb->zb_object, zb->zb_level - 1, zb->zb_blkid * epb + i); err = visit_indirect(spa, dnp, cbp, &czb); if (err) break; fill += BP_GET_FILL(cbp); } if (!err) ASSERT3U(fill, ==, BP_GET_FILL(bp)); (void) arc_buf_remove_ref(buf, &buf); } return (err); } /*ARGSUSED*/ static void dump_indirect(dnode_t *dn) { dnode_phys_t *dnp = dn->dn_phys; int j; zbookmark_phys_t czb; (void) printf("Indirect blocks:\n"); SET_BOOKMARK(&czb, dmu_objset_id(dn->dn_objset), dn->dn_object, dnp->dn_nlevels - 1, 0); for (j = 0; j < dnp->dn_nblkptr; j++) { czb.zb_blkid = j; (void) visit_indirect(dmu_objset_spa(dn->dn_objset), dnp, &dnp->dn_blkptr[j], &czb); } (void) printf("\n"); } /*ARGSUSED*/ static void dump_dsl_dir(objset_t *os, uint64_t object, void *data, size_t size) { dsl_dir_phys_t *dd = data; time_t crtime; char nice[32]; if (dd == NULL) return; ASSERT3U(size, >=, sizeof (dsl_dir_phys_t)); crtime = dd->dd_creation_time; (void) printf("\t\tcreation_time = %s", ctime(&crtime)); (void) printf("\t\thead_dataset_obj = %llu\n", (u_longlong_t)dd->dd_head_dataset_obj); (void) printf("\t\tparent_dir_obj = %llu\n", (u_longlong_t)dd->dd_parent_obj); (void) printf("\t\torigin_obj = %llu\n", (u_longlong_t)dd->dd_origin_obj); (void) printf("\t\tchild_dir_zapobj = %llu\n", (u_longlong_t)dd->dd_child_dir_zapobj); zdb_nicenum(dd->dd_used_bytes, nice); (void) printf("\t\tused_bytes = %s\n", nice); zdb_nicenum(dd->dd_compressed_bytes, nice); (void) printf("\t\tcompressed_bytes = %s\n", nice); zdb_nicenum(dd->dd_uncompressed_bytes, nice); (void) printf("\t\tuncompressed_bytes = %s\n", nice); zdb_nicenum(dd->dd_quota, nice); (void) printf("\t\tquota = %s\n", nice); zdb_nicenum(dd->dd_reserved, nice); (void) printf("\t\treserved = %s\n", nice); (void) printf("\t\tprops_zapobj = %llu\n", (u_longlong_t)dd->dd_props_zapobj); (void) printf("\t\tdeleg_zapobj = %llu\n", (u_longlong_t)dd->dd_deleg_zapobj); (void) printf("\t\tflags = %llx\n", (u_longlong_t)dd->dd_flags); #define DO(which) \ zdb_nicenum(dd->dd_used_breakdown[DD_USED_ ## which], nice); \ (void) printf("\t\tused_breakdown[" #which "] = %s\n", nice) DO(HEAD); DO(SNAP); DO(CHILD); DO(CHILD_RSRV); DO(REFRSRV); #undef DO } /*ARGSUSED*/ static void dump_dsl_dataset(objset_t *os, uint64_t object, void *data, size_t size) { dsl_dataset_phys_t *ds = data; time_t crtime; char used[32], compressed[32], uncompressed[32], unique[32]; char blkbuf[BP_SPRINTF_LEN]; if (ds == NULL) return; ASSERT(size == sizeof (*ds)); crtime = ds->ds_creation_time; zdb_nicenum(ds->ds_referenced_bytes, used); zdb_nicenum(ds->ds_compressed_bytes, compressed); zdb_nicenum(ds->ds_uncompressed_bytes, uncompressed); zdb_nicenum(ds->ds_unique_bytes, unique); snprintf_blkptr(blkbuf, sizeof (blkbuf), &ds->ds_bp); (void) printf("\t\tdir_obj = %llu\n", (u_longlong_t)ds->ds_dir_obj); (void) printf("\t\tprev_snap_obj = %llu\n", (u_longlong_t)ds->ds_prev_snap_obj); (void) printf("\t\tprev_snap_txg = %llu\n", (u_longlong_t)ds->ds_prev_snap_txg); (void) printf("\t\tnext_snap_obj = %llu\n", (u_longlong_t)ds->ds_next_snap_obj); (void) printf("\t\tsnapnames_zapobj = %llu\n", (u_longlong_t)ds->ds_snapnames_zapobj); (void) printf("\t\tnum_children = %llu\n", (u_longlong_t)ds->ds_num_children); (void) printf("\t\tuserrefs_obj = %llu\n", (u_longlong_t)ds->ds_userrefs_obj); (void) printf("\t\tcreation_time = %s", ctime(&crtime)); (void) printf("\t\tcreation_txg = %llu\n", (u_longlong_t)ds->ds_creation_txg); (void) printf("\t\tdeadlist_obj = %llu\n", (u_longlong_t)ds->ds_deadlist_obj); (void) printf("\t\tused_bytes = %s\n", used); (void) printf("\t\tcompressed_bytes = %s\n", compressed); (void) printf("\t\tuncompressed_bytes = %s\n", uncompressed); (void) printf("\t\tunique = %s\n", unique); (void) printf("\t\tfsid_guid = %llu\n", (u_longlong_t)ds->ds_fsid_guid); (void) printf("\t\tguid = %llu\n", (u_longlong_t)ds->ds_guid); (void) printf("\t\tflags = %llx\n", (u_longlong_t)ds->ds_flags); (void) printf("\t\tnext_clones_obj = %llu\n", (u_longlong_t)ds->ds_next_clones_obj); (void) printf("\t\tprops_obj = %llu\n", (u_longlong_t)ds->ds_props_obj); (void) printf("\t\tbp = %s\n", blkbuf); } /* ARGSUSED */ static int dump_bptree_cb(void *arg, const blkptr_t *bp, dmu_tx_t *tx) { char blkbuf[BP_SPRINTF_LEN]; if (bp->blk_birth != 0) { snprintf_blkptr(blkbuf, sizeof (blkbuf), bp); (void) printf("\t%s\n", blkbuf); } return (0); } static void dump_bptree(objset_t *os, uint64_t obj, char *name) { char bytes[32]; bptree_phys_t *bt; dmu_buf_t *db; if (dump_opt['d'] < 3) return; VERIFY3U(0, ==, dmu_bonus_hold(os, obj, FTAG, &db)); bt = db->db_data; zdb_nicenum(bt->bt_bytes, bytes); (void) printf("\n %s: %llu datasets, %s\n", name, (unsigned long long)(bt->bt_end - bt->bt_begin), bytes); dmu_buf_rele(db, FTAG); if (dump_opt['d'] < 5) return; (void) printf("\n"); (void) bptree_iterate(os, obj, B_FALSE, dump_bptree_cb, NULL, NULL); } /* ARGSUSED */ static int dump_bpobj_cb(void *arg, const blkptr_t *bp, dmu_tx_t *tx) { char blkbuf[BP_SPRINTF_LEN]; ASSERT(bp->blk_birth != 0); snprintf_blkptr_compact(blkbuf, sizeof (blkbuf), bp); (void) printf("\t%s\n", blkbuf); return (0); } static void dump_full_bpobj(bpobj_t *bpo, char *name, int indent) { char bytes[32]; char comp[32]; char uncomp[32]; if (dump_opt['d'] < 3) return; zdb_nicenum(bpo->bpo_phys->bpo_bytes, bytes); if (bpo->bpo_havesubobj && bpo->bpo_phys->bpo_subobjs != 0) { zdb_nicenum(bpo->bpo_phys->bpo_comp, comp); zdb_nicenum(bpo->bpo_phys->bpo_uncomp, uncomp); (void) printf(" %*s: object %llu, %llu local blkptrs, " "%llu subobjs in object %llu, %s (%s/%s comp)\n", indent * 8, name, (u_longlong_t)bpo->bpo_object, (u_longlong_t)bpo->bpo_phys->bpo_num_blkptrs, (u_longlong_t)bpo->bpo_phys->bpo_num_subobjs, (u_longlong_t)bpo->bpo_phys->bpo_subobjs, bytes, comp, uncomp); for (uint64_t i = 0; i < bpo->bpo_phys->bpo_num_subobjs; i++) { uint64_t subobj; bpobj_t subbpo; int error; VERIFY0(dmu_read(bpo->bpo_os, bpo->bpo_phys->bpo_subobjs, i * sizeof (subobj), sizeof (subobj), &subobj, 0)); error = bpobj_open(&subbpo, bpo->bpo_os, subobj); if (error != 0) { (void) printf("ERROR %u while trying to open " "subobj id %llu\n", error, (u_longlong_t)subobj); continue; } dump_full_bpobj(&subbpo, "subobj", indent + 1); bpobj_close(&subbpo); } } else { (void) printf(" %*s: object %llu, %llu blkptrs, %s\n", indent * 8, name, (u_longlong_t)bpo->bpo_object, (u_longlong_t)bpo->bpo_phys->bpo_num_blkptrs, bytes); } if (dump_opt['d'] < 5) return; if (indent == 0) { (void) bpobj_iterate_nofree(bpo, dump_bpobj_cb, NULL, NULL); (void) printf("\n"); } } static void dump_deadlist(dsl_deadlist_t *dl) { dsl_deadlist_entry_t *dle; uint64_t unused; char bytes[32]; char comp[32]; char uncomp[32]; if (dump_opt['d'] < 3) return; if (dl->dl_oldfmt) { dump_full_bpobj(&dl->dl_bpobj, "old-format deadlist", 0); return; } zdb_nicenum(dl->dl_phys->dl_used, bytes); zdb_nicenum(dl->dl_phys->dl_comp, comp); zdb_nicenum(dl->dl_phys->dl_uncomp, uncomp); (void) printf("\n Deadlist: %s (%s/%s comp)\n", bytes, comp, uncomp); if (dump_opt['d'] < 4) return; (void) printf("\n"); /* force the tree to be loaded */ dsl_deadlist_space_range(dl, 0, UINT64_MAX, &unused, &unused, &unused); for (dle = avl_first(&dl->dl_tree); dle; dle = AVL_NEXT(&dl->dl_tree, dle)) { if (dump_opt['d'] >= 5) { char buf[128]; (void) snprintf(buf, sizeof (buf), "mintxg %llu -> " "obj %llu", (longlong_t)dle->dle_mintxg, (longlong_t)dle->dle_bpobj.bpo_object); dump_full_bpobj(&dle->dle_bpobj, buf, 0); } else { (void) printf("mintxg %llu -> obj %llu\n", (longlong_t)dle->dle_mintxg, (longlong_t)dle->dle_bpobj.bpo_object); } } } static avl_tree_t idx_tree; static avl_tree_t domain_tree; static boolean_t fuid_table_loaded; static boolean_t sa_loaded; sa_attr_type_t *sa_attr_table; static void fuid_table_destroy() { if (fuid_table_loaded) { zfs_fuid_table_destroy(&idx_tree, &domain_tree); fuid_table_loaded = B_FALSE; } } /* * print uid or gid information. * For normal POSIX id just the id is printed in decimal format. * For CIFS files with FUID the fuid is printed in hex followed by * the domain-rid string. */ static void print_idstr(uint64_t id, const char *id_type) { if (FUID_INDEX(id)) { char *domain; domain = zfs_fuid_idx_domain(&idx_tree, FUID_INDEX(id)); (void) printf("\t%s %llx [%s-%d]\n", id_type, (u_longlong_t)id, domain, (int)FUID_RID(id)); } else { (void) printf("\t%s %llu\n", id_type, (u_longlong_t)id); } } static void dump_uidgid(objset_t *os, uint64_t uid, uint64_t gid) { uint32_t uid_idx, gid_idx; uid_idx = FUID_INDEX(uid); gid_idx = FUID_INDEX(gid); /* Load domain table, if not already loaded */ if (!fuid_table_loaded && (uid_idx || gid_idx)) { uint64_t fuid_obj; /* first find the fuid object. It lives in the master node */ VERIFY(zap_lookup(os, MASTER_NODE_OBJ, ZFS_FUID_TABLES, 8, 1, &fuid_obj) == 0); zfs_fuid_avl_tree_create(&idx_tree, &domain_tree); (void) zfs_fuid_table_load(os, fuid_obj, &idx_tree, &domain_tree); fuid_table_loaded = B_TRUE; } print_idstr(uid, "uid"); print_idstr(gid, "gid"); } /*ARGSUSED*/ static void dump_znode(objset_t *os, uint64_t object, void *data, size_t size) { char path[MAXPATHLEN * 2]; /* allow for xattr and failure prefix */ sa_handle_t *hdl; uint64_t xattr, rdev, gen; uint64_t uid, gid, mode, fsize, parent, links; uint64_t pflags; uint64_t acctm[2], modtm[2], chgtm[2], crtm[2]; time_t z_crtime, z_atime, z_mtime, z_ctime; sa_bulk_attr_t bulk[12]; int idx = 0; int error; if (!sa_loaded) { uint64_t sa_attrs = 0; uint64_t version; VERIFY(zap_lookup(os, MASTER_NODE_OBJ, ZPL_VERSION_STR, 8, 1, &version) == 0); if (version >= ZPL_VERSION_SA) { VERIFY(zap_lookup(os, MASTER_NODE_OBJ, ZFS_SA_ATTRS, 8, 1, &sa_attrs) == 0); } if ((error = sa_setup(os, sa_attrs, zfs_attr_table, ZPL_END, &sa_attr_table)) != 0) { (void) printf("sa_setup failed errno %d, can't " "display znode contents\n", error); return; } sa_loaded = B_TRUE; } if (sa_handle_get(os, object, NULL, SA_HDL_PRIVATE, &hdl)) { (void) printf("Failed to get handle for SA znode\n"); return; } SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_UID], NULL, &uid, 8); SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_GID], NULL, &gid, 8); SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_LINKS], NULL, &links, 8); SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_GEN], NULL, &gen, 8); SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_MODE], NULL, &mode, 8); SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_PARENT], NULL, &parent, 8); SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_SIZE], NULL, &fsize, 8); SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_ATIME], NULL, acctm, 16); SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_MTIME], NULL, modtm, 16); SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_CRTIME], NULL, crtm, 16); SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_CTIME], NULL, chgtm, 16); SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_FLAGS], NULL, &pflags, 8); if (sa_bulk_lookup(hdl, bulk, idx)) { (void) sa_handle_destroy(hdl); return; } error = zfs_obj_to_path(os, object, path, sizeof (path)); if (error != 0) { (void) snprintf(path, sizeof (path), "\?\?\?", (u_longlong_t)object); } if (dump_opt['d'] < 3) { (void) printf("\t%s\n", path); (void) sa_handle_destroy(hdl); return; } z_crtime = (time_t)crtm[0]; z_atime = (time_t)acctm[0]; z_mtime = (time_t)modtm[0]; z_ctime = (time_t)chgtm[0]; (void) printf("\tpath %s\n", path); dump_uidgid(os, uid, gid); (void) printf("\tatime %s", ctime(&z_atime)); (void) printf("\tmtime %s", ctime(&z_mtime)); (void) printf("\tctime %s", ctime(&z_ctime)); (void) printf("\tcrtime %s", ctime(&z_crtime)); (void) printf("\tgen %llu\n", (u_longlong_t)gen); (void) printf("\tmode %llo\n", (u_longlong_t)mode); (void) printf("\tsize %llu\n", (u_longlong_t)fsize); (void) printf("\tparent %llu\n", (u_longlong_t)parent); (void) printf("\tlinks %llu\n", (u_longlong_t)links); (void) printf("\tpflags %llx\n", (u_longlong_t)pflags); if (sa_lookup(hdl, sa_attr_table[ZPL_XATTR], &xattr, sizeof (uint64_t)) == 0) (void) printf("\txattr %llu\n", (u_longlong_t)xattr); if (sa_lookup(hdl, sa_attr_table[ZPL_RDEV], &rdev, sizeof (uint64_t)) == 0) (void) printf("\trdev 0x%016llx\n", (u_longlong_t)rdev); sa_handle_destroy(hdl); } /*ARGSUSED*/ static void dump_acl(objset_t *os, uint64_t object, void *data, size_t size) { } /*ARGSUSED*/ static void dump_dmu_objset(objset_t *os, uint64_t object, void *data, size_t size) { } static object_viewer_t *object_viewer[DMU_OT_NUMTYPES + 1] = { dump_none, /* unallocated */ dump_zap, /* object directory */ dump_uint64, /* object array */ dump_none, /* packed nvlist */ dump_packed_nvlist, /* packed nvlist size */ dump_none, /* bpobj */ dump_bpobj, /* bpobj header */ dump_none, /* SPA space map header */ dump_none, /* SPA space map */ dump_none, /* ZIL intent log */ dump_dnode, /* DMU dnode */ dump_dmu_objset, /* DMU objset */ dump_dsl_dir, /* DSL directory */ dump_zap, /* DSL directory child map */ dump_zap, /* DSL dataset snap map */ dump_zap, /* DSL props */ dump_dsl_dataset, /* DSL dataset */ dump_znode, /* ZFS znode */ dump_acl, /* ZFS V0 ACL */ dump_uint8, /* ZFS plain file */ dump_zpldir, /* ZFS directory */ dump_zap, /* ZFS master node */ dump_zap, /* ZFS delete queue */ dump_uint8, /* zvol object */ dump_zap, /* zvol prop */ dump_uint8, /* other uint8[] */ dump_uint64, /* other uint64[] */ dump_zap, /* other ZAP */ dump_zap, /* persistent error log */ dump_uint8, /* SPA history */ dump_history_offsets, /* SPA history offsets */ dump_zap, /* Pool properties */ dump_zap, /* DSL permissions */ dump_acl, /* ZFS ACL */ dump_uint8, /* ZFS SYSACL */ dump_none, /* FUID nvlist */ dump_packed_nvlist, /* FUID nvlist size */ dump_zap, /* DSL dataset next clones */ dump_zap, /* DSL scrub queue */ dump_zap, /* ZFS user/group used */ dump_zap, /* ZFS user/group quota */ dump_zap, /* snapshot refcount tags */ dump_ddt_zap, /* DDT ZAP object */ dump_zap, /* DDT statistics */ dump_znode, /* SA object */ dump_zap, /* SA Master Node */ dump_sa_attrs, /* SA attribute registration */ dump_sa_layouts, /* SA attribute layouts */ dump_zap, /* DSL scrub translations */ dump_none, /* fake dedup BP */ dump_zap, /* deadlist */ dump_none, /* deadlist hdr */ dump_zap, /* dsl clones */ dump_bpobj_subobjs, /* bpobj subobjs */ dump_unknown, /* Unknown type, must be last */ }; static void dump_object(objset_t *os, uint64_t object, int verbosity, int *print_header) { dmu_buf_t *db = NULL; dmu_object_info_t doi; dnode_t *dn; void *bonus = NULL; size_t bsize = 0; char iblk[32], dblk[32], lsize[32], asize[32], fill[32]; char bonus_size[32]; char aux[50]; int error; if (*print_header) { (void) printf("\n%10s %3s %5s %5s %5s %5s %6s %s\n", "Object", "lvl", "iblk", "dblk", "dsize", "lsize", "%full", "type"); *print_header = 0; } if (object == 0) { dn = DMU_META_DNODE(os); } else { error = dmu_bonus_hold(os, object, FTAG, &db); if (error) fatal("dmu_bonus_hold(%llu) failed, errno %u", object, error); bonus = db->db_data; bsize = db->db_size; dn = DB_DNODE((dmu_buf_impl_t *)db); } dmu_object_info_from_dnode(dn, &doi); zdb_nicenum(doi.doi_metadata_block_size, iblk); zdb_nicenum(doi.doi_data_block_size, dblk); zdb_nicenum(doi.doi_max_offset, lsize); zdb_nicenum(doi.doi_physical_blocks_512 << 9, asize); zdb_nicenum(doi.doi_bonus_size, bonus_size); (void) sprintf(fill, "%6.2f", 100.0 * doi.doi_fill_count * doi.doi_data_block_size / (object == 0 ? DNODES_PER_BLOCK : 1) / doi.doi_max_offset); aux[0] = '\0'; if (doi.doi_checksum != ZIO_CHECKSUM_INHERIT || verbosity >= 6) { (void) snprintf(aux + strlen(aux), sizeof (aux), " (K=%s)", ZDB_CHECKSUM_NAME(doi.doi_checksum)); } if (doi.doi_compress != ZIO_COMPRESS_INHERIT || verbosity >= 6) { (void) snprintf(aux + strlen(aux), sizeof (aux), " (Z=%s)", ZDB_COMPRESS_NAME(doi.doi_compress)); } (void) printf("%10lld %3u %5s %5s %5s %5s %6s %s%s\n", (u_longlong_t)object, doi.doi_indirection, iblk, dblk, asize, lsize, fill, ZDB_OT_NAME(doi.doi_type), aux); if (doi.doi_bonus_type != DMU_OT_NONE && verbosity > 3) { (void) printf("%10s %3s %5s %5s %5s %5s %6s %s\n", "", "", "", "", "", bonus_size, "bonus", ZDB_OT_NAME(doi.doi_bonus_type)); } if (verbosity >= 4) { (void) printf("\tdnode flags: %s%s%s\n", (dn->dn_phys->dn_flags & DNODE_FLAG_USED_BYTES) ? "USED_BYTES " : "", (dn->dn_phys->dn_flags & DNODE_FLAG_USERUSED_ACCOUNTED) ? "USERUSED_ACCOUNTED " : "", (dn->dn_phys->dn_flags & DNODE_FLAG_SPILL_BLKPTR) ? "SPILL_BLKPTR" : ""); (void) printf("\tdnode maxblkid: %llu\n", (longlong_t)dn->dn_phys->dn_maxblkid); object_viewer[ZDB_OT_TYPE(doi.doi_bonus_type)](os, object, bonus, bsize); object_viewer[ZDB_OT_TYPE(doi.doi_type)](os, object, NULL, 0); *print_header = 1; } if (verbosity >= 5) dump_indirect(dn); if (verbosity >= 5) { /* * Report the list of segments that comprise the object. */ uint64_t start = 0; uint64_t end; uint64_t blkfill = 1; int minlvl = 1; if (dn->dn_type == DMU_OT_DNODE) { minlvl = 0; blkfill = DNODES_PER_BLOCK; } for (;;) { char segsize[32]; error = dnode_next_offset(dn, 0, &start, minlvl, blkfill, 0); if (error) break; end = start; error = dnode_next_offset(dn, DNODE_FIND_HOLE, &end, minlvl, blkfill, 0); zdb_nicenum(end - start, segsize); (void) printf("\t\tsegment [%016llx, %016llx)" " size %5s\n", (u_longlong_t)start, (u_longlong_t)end, segsize); if (error) break; start = end; } } if (db != NULL) dmu_buf_rele(db, FTAG); } static char *objset_types[DMU_OST_NUMTYPES] = { "NONE", "META", "ZPL", "ZVOL", "OTHER", "ANY" }; static void dump_dir(objset_t *os) { dmu_objset_stats_t dds; uint64_t object, object_count; uint64_t refdbytes, usedobjs, scratch; char numbuf[32]; char blkbuf[BP_SPRINTF_LEN + 20]; char osname[MAXNAMELEN]; char *type = "UNKNOWN"; int verbosity = dump_opt['d']; int print_header = 1; int i, error; dsl_pool_config_enter(dmu_objset_pool(os), FTAG); dmu_objset_fast_stat(os, &dds); dsl_pool_config_exit(dmu_objset_pool(os), FTAG); if (dds.dds_type < DMU_OST_NUMTYPES) type = objset_types[dds.dds_type]; if (dds.dds_type == DMU_OST_META) { dds.dds_creation_txg = TXG_INITIAL; usedobjs = BP_GET_FILL(os->os_rootbp); refdbytes = dsl_dir_phys(os->os_spa->spa_dsl_pool->dp_mos_dir)-> dd_used_bytes; } else { dmu_objset_space(os, &refdbytes, &scratch, &usedobjs, &scratch); } ASSERT3U(usedobjs, ==, BP_GET_FILL(os->os_rootbp)); zdb_nicenum(refdbytes, numbuf); if (verbosity >= 4) { (void) snprintf(blkbuf, sizeof (blkbuf), ", rootbp "); (void) snprintf_blkptr(blkbuf + strlen(blkbuf), sizeof (blkbuf) - strlen(blkbuf), os->os_rootbp); } else { blkbuf[0] = '\0'; } dmu_objset_name(os, osname); (void) printf("Dataset %s [%s], ID %llu, cr_txg %llu, " "%s, %llu objects%s\n", osname, type, (u_longlong_t)dmu_objset_id(os), (u_longlong_t)dds.dds_creation_txg, numbuf, (u_longlong_t)usedobjs, blkbuf); if (zopt_objects != 0) { for (i = 0; i < zopt_objects; i++) dump_object(os, zopt_object[i], verbosity, &print_header); (void) printf("\n"); return; } if (dump_opt['i'] != 0 || verbosity >= 2) dump_intent_log(dmu_objset_zil(os)); if (dmu_objset_ds(os) != NULL) dump_deadlist(&dmu_objset_ds(os)->ds_deadlist); if (verbosity < 2) return; if (BP_IS_HOLE(os->os_rootbp)) return; dump_object(os, 0, verbosity, &print_header); object_count = 0; if (DMU_USERUSED_DNODE(os) != NULL && DMU_USERUSED_DNODE(os)->dn_type != 0) { dump_object(os, DMU_USERUSED_OBJECT, verbosity, &print_header); dump_object(os, DMU_GROUPUSED_OBJECT, verbosity, &print_header); } object = 0; while ((error = dmu_object_next(os, &object, B_FALSE, 0)) == 0) { dump_object(os, object, verbosity, &print_header); object_count++; } ASSERT3U(object_count, ==, usedobjs); (void) printf("\n"); if (error != ESRCH) { (void) fprintf(stderr, "dmu_object_next() = %d\n", error); abort(); } } static void dump_uberblock(uberblock_t *ub, const char *header, const char *footer) { time_t timestamp = ub->ub_timestamp; (void) printf(header ? header : ""); (void) printf("\tmagic = %016llx\n", (u_longlong_t)ub->ub_magic); (void) printf("\tversion = %llu\n", (u_longlong_t)ub->ub_version); (void) printf("\ttxg = %llu\n", (u_longlong_t)ub->ub_txg); (void) printf("\tguid_sum = %llu\n", (u_longlong_t)ub->ub_guid_sum); (void) printf("\ttimestamp = %llu UTC = %s", (u_longlong_t)ub->ub_timestamp, asctime(localtime(×tamp))); if (dump_opt['u'] >= 3) { char blkbuf[BP_SPRINTF_LEN]; snprintf_blkptr(blkbuf, sizeof (blkbuf), &ub->ub_rootbp); (void) printf("\trootbp = %s\n", blkbuf); } (void) printf(footer ? footer : ""); } static void dump_config(spa_t *spa) { dmu_buf_t *db; size_t nvsize = 0; int error = 0; error = dmu_bonus_hold(spa->spa_meta_objset, spa->spa_config_object, FTAG, &db); if (error == 0) { nvsize = *(uint64_t *)db->db_data; dmu_buf_rele(db, FTAG); (void) printf("\nMOS Configuration:\n"); dump_packed_nvlist(spa->spa_meta_objset, spa->spa_config_object, (void *)&nvsize, 1); } else { (void) fprintf(stderr, "dmu_bonus_hold(%llu) failed, errno %d", (u_longlong_t)spa->spa_config_object, error); } } static void dump_cachefile(const char *cachefile) { int fd; struct stat64 statbuf; char *buf; nvlist_t *config; if ((fd = open64(cachefile, O_RDONLY)) < 0) { (void) fprintf(stderr, "cannot open '%s': %s\n", cachefile, strerror(errno)); exit(1); } if (fstat64(fd, &statbuf) != 0) { (void) fprintf(stderr, "failed to stat '%s': %s\n", cachefile, strerror(errno)); exit(1); } if ((buf = malloc(statbuf.st_size)) == NULL) { (void) fprintf(stderr, "failed to allocate %llu bytes\n", (u_longlong_t)statbuf.st_size); exit(1); } if (read(fd, buf, statbuf.st_size) != statbuf.st_size) { (void) fprintf(stderr, "failed to read %llu bytes\n", (u_longlong_t)statbuf.st_size); exit(1); } (void) close(fd); if (nvlist_unpack(buf, statbuf.st_size, &config, 0) != 0) { (void) fprintf(stderr, "failed to unpack nvlist\n"); exit(1); } free(buf); dump_nvlist(config, 0); nvlist_free(config); } #define ZDB_MAX_UB_HEADER_SIZE 32 static void dump_label_uberblocks(vdev_label_t *lbl, uint64_t ashift) { vdev_t vd; vdev_t *vdp = &vd; char header[ZDB_MAX_UB_HEADER_SIZE]; vd.vdev_ashift = ashift; vdp->vdev_top = vdp; for (int i = 0; i < VDEV_UBERBLOCK_COUNT(vdp); i++) { uint64_t uoff = VDEV_UBERBLOCK_OFFSET(vdp, i); uberblock_t *ub = (void *)((char *)lbl + uoff); if (uberblock_verify(ub)) continue; (void) snprintf(header, ZDB_MAX_UB_HEADER_SIZE, "Uberblock[%d]\n", i); dump_uberblock(ub, header, ""); } } static void dump_label(const char *dev) { int fd; vdev_label_t label; char *path, *buf = label.vl_vdev_phys.vp_nvlist; size_t buflen = sizeof (label.vl_vdev_phys.vp_nvlist); struct stat64 statbuf; uint64_t psize, ashift; int len = strlen(dev) + 1; if (strncmp(dev, ZFS_DISK_ROOTD, strlen(ZFS_DISK_ROOTD)) == 0) { len++; path = malloc(len); (void) snprintf(path, len, "%s%s", ZFS_RDISK_ROOTD, dev + strlen(ZFS_DISK_ROOTD)); } else { path = strdup(dev); } if ((fd = open64(path, O_RDONLY)) < 0) { (void) printf("cannot open '%s': %s\n", path, strerror(errno)); free(path); exit(1); } if (fstat64(fd, &statbuf) != 0) { (void) printf("failed to stat '%s': %s\n", path, strerror(errno)); free(path); (void) close(fd); exit(1); } if (S_ISBLK(statbuf.st_mode)) { (void) printf("cannot use '%s': character device required\n", path); free(path); (void) close(fd); exit(1); } psize = statbuf.st_size; psize = P2ALIGN(psize, (uint64_t)sizeof (vdev_label_t)); for (int l = 0; l < VDEV_LABELS; l++) { nvlist_t *config = NULL; (void) printf("--------------------------------------------\n"); (void) printf("LABEL %d\n", l); (void) printf("--------------------------------------------\n"); if (pread64(fd, &label, sizeof (label), vdev_label_offset(psize, l, 0)) != sizeof (label)) { (void) printf("failed to read label %d\n", l); continue; } if (nvlist_unpack(buf, buflen, &config, 0) != 0) { (void) printf("failed to unpack label %d\n", l); ashift = SPA_MINBLOCKSHIFT; } else { nvlist_t *vdev_tree = NULL; dump_nvlist(config, 4); if ((nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE, &vdev_tree) != 0) || (nvlist_lookup_uint64(vdev_tree, ZPOOL_CONFIG_ASHIFT, &ashift) != 0)) ashift = SPA_MINBLOCKSHIFT; nvlist_free(config); } if (dump_opt['u']) dump_label_uberblocks(&label, ashift); } free(path); (void) close(fd); } static uint64_t dataset_feature_count[SPA_FEATURES]; /*ARGSUSED*/ static int dump_one_dir(const char *dsname, void *arg) { int error; objset_t *os; error = dmu_objset_own(dsname, DMU_OST_ANY, B_TRUE, FTAG, &os); if (error) { (void) printf("Could not open %s, error %d\n", dsname, error); return (0); } for (spa_feature_t f = 0; f < SPA_FEATURES; f++) { if (!dmu_objset_ds(os)->ds_feature_inuse[f]) continue; ASSERT(spa_feature_table[f].fi_flags & ZFEATURE_FLAG_PER_DATASET); dataset_feature_count[f]++; } dump_dir(os); dmu_objset_disown(os, FTAG); fuid_table_destroy(); sa_loaded = B_FALSE; return (0); } /* * Block statistics. */ #define PSIZE_HISTO_SIZE (SPA_OLD_MAXBLOCKSIZE / SPA_MINBLOCKSIZE + 2) typedef struct zdb_blkstats { uint64_t zb_asize; uint64_t zb_lsize; uint64_t zb_psize; uint64_t zb_count; uint64_t zb_gangs; uint64_t zb_ditto_samevdev; uint64_t zb_psize_histogram[PSIZE_HISTO_SIZE]; } zdb_blkstats_t; /* * Extended object types to report deferred frees and dedup auto-ditto blocks. */ #define ZDB_OT_DEFERRED (DMU_OT_NUMTYPES + 0) #define ZDB_OT_DITTO (DMU_OT_NUMTYPES + 1) #define ZDB_OT_OTHER (DMU_OT_NUMTYPES + 2) #define ZDB_OT_TOTAL (DMU_OT_NUMTYPES + 3) static char *zdb_ot_extname[] = { "deferred free", "dedup ditto", "other", "Total", }; #define ZB_TOTAL DN_MAX_LEVELS typedef struct zdb_cb { zdb_blkstats_t zcb_type[ZB_TOTAL + 1][ZDB_OT_TOTAL + 1]; uint64_t zcb_dedup_asize; uint64_t zcb_dedup_blocks; uint64_t zcb_embedded_blocks[NUM_BP_EMBEDDED_TYPES]; uint64_t zcb_embedded_histogram[NUM_BP_EMBEDDED_TYPES] [BPE_PAYLOAD_SIZE]; uint64_t zcb_start; uint64_t zcb_lastprint; uint64_t zcb_totalasize; uint64_t zcb_errors[256]; int zcb_readfails; int zcb_haderrors; spa_t *zcb_spa; } zdb_cb_t; static void zdb_count_block(zdb_cb_t *zcb, zilog_t *zilog, const blkptr_t *bp, dmu_object_type_t type) { uint64_t refcnt = 0; ASSERT(type < ZDB_OT_TOTAL); if (zilog && zil_bp_tree_add(zilog, bp) != 0) return; for (int i = 0; i < 4; i++) { int l = (i < 2) ? BP_GET_LEVEL(bp) : ZB_TOTAL; int t = (i & 1) ? type : ZDB_OT_TOTAL; int equal; zdb_blkstats_t *zb = &zcb->zcb_type[l][t]; zb->zb_asize += BP_GET_ASIZE(bp); zb->zb_lsize += BP_GET_LSIZE(bp); zb->zb_psize += BP_GET_PSIZE(bp); zb->zb_count++; /* * The histogram is only big enough to record blocks up to * SPA_OLD_MAXBLOCKSIZE; larger blocks go into the last, * "other", bucket. */ int idx = BP_GET_PSIZE(bp) >> SPA_MINBLOCKSHIFT; idx = MIN(idx, SPA_OLD_MAXBLOCKSIZE / SPA_MINBLOCKSIZE + 1); zb->zb_psize_histogram[idx]++; zb->zb_gangs += BP_COUNT_GANG(bp); switch (BP_GET_NDVAS(bp)) { case 2: if (DVA_GET_VDEV(&bp->blk_dva[0]) == DVA_GET_VDEV(&bp->blk_dva[1])) zb->zb_ditto_samevdev++; break; case 3: equal = (DVA_GET_VDEV(&bp->blk_dva[0]) == DVA_GET_VDEV(&bp->blk_dva[1])) + (DVA_GET_VDEV(&bp->blk_dva[0]) == DVA_GET_VDEV(&bp->blk_dva[2])) + (DVA_GET_VDEV(&bp->blk_dva[1]) == DVA_GET_VDEV(&bp->blk_dva[2])); if (equal != 0) zb->zb_ditto_samevdev++; break; } } if (BP_IS_EMBEDDED(bp)) { zcb->zcb_embedded_blocks[BPE_GET_ETYPE(bp)]++; zcb->zcb_embedded_histogram[BPE_GET_ETYPE(bp)] [BPE_GET_PSIZE(bp)]++; return; } if (dump_opt['L']) return; if (BP_GET_DEDUP(bp)) { ddt_t *ddt; ddt_entry_t *dde; ddt = ddt_select(zcb->zcb_spa, bp); ddt_enter(ddt); dde = ddt_lookup(ddt, bp, B_FALSE); if (dde == NULL) { refcnt = 0; } else { ddt_phys_t *ddp = ddt_phys_select(dde, bp); ddt_phys_decref(ddp); refcnt = ddp->ddp_refcnt; if (ddt_phys_total_refcnt(dde) == 0) ddt_remove(ddt, dde); } ddt_exit(ddt); } VERIFY3U(zio_wait(zio_claim(NULL, zcb->zcb_spa, refcnt ? 0 : spa_first_txg(zcb->zcb_spa), bp, NULL, NULL, ZIO_FLAG_CANFAIL)), ==, 0); } /* ARGSUSED */ static void zdb_blkptr_done(zio_t *zio) { spa_t *spa = zio->io_spa; blkptr_t *bp = zio->io_bp; int ioerr = zio->io_error; zdb_cb_t *zcb = zio->io_private; zbookmark_phys_t *zb = &zio->io_bookmark; zio_data_buf_free(zio->io_data, zio->io_size); mutex_enter(&spa->spa_scrub_lock); spa->spa_scrub_inflight--; cv_broadcast(&spa->spa_scrub_io_cv); if (ioerr && !(zio->io_flags & ZIO_FLAG_SPECULATIVE)) { char blkbuf[BP_SPRINTF_LEN]; zcb->zcb_haderrors = 1; zcb->zcb_errors[ioerr]++; if (dump_opt['b'] >= 2) snprintf_blkptr(blkbuf, sizeof (blkbuf), bp); else blkbuf[0] = '\0'; (void) printf("zdb_blkptr_cb: " "Got error %d reading " "<%llu, %llu, %lld, %llx> %s -- skipping\n", ioerr, (u_longlong_t)zb->zb_objset, (u_longlong_t)zb->zb_object, (u_longlong_t)zb->zb_level, (u_longlong_t)zb->zb_blkid, blkbuf); } mutex_exit(&spa->spa_scrub_lock); } /* ARGSUSED */ static int zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_phys_t *zb, const dnode_phys_t *dnp, void *arg) { zdb_cb_t *zcb = arg; dmu_object_type_t type; boolean_t is_metadata; if (bp == NULL) return (0); if (dump_opt['b'] >= 5 && bp->blk_birth > 0) { char blkbuf[BP_SPRINTF_LEN]; snprintf_blkptr(blkbuf, sizeof (blkbuf), bp); (void) printf("objset %llu object %llu " "level %lld offset 0x%llx %s\n", (u_longlong_t)zb->zb_objset, (u_longlong_t)zb->zb_object, (longlong_t)zb->zb_level, (u_longlong_t)blkid2offset(dnp, bp, zb), blkbuf); } if (BP_IS_HOLE(bp)) return (0); type = BP_GET_TYPE(bp); zdb_count_block(zcb, zilog, bp, (type & DMU_OT_NEWTYPE) ? ZDB_OT_OTHER : type); is_metadata = (BP_GET_LEVEL(bp) != 0 || DMU_OT_IS_METADATA(type)); if (!BP_IS_EMBEDDED(bp) && (dump_opt['c'] > 1 || (dump_opt['c'] && is_metadata))) { size_t size = BP_GET_PSIZE(bp); void *data = zio_data_buf_alloc(size); int flags = ZIO_FLAG_CANFAIL | ZIO_FLAG_SCRUB | ZIO_FLAG_RAW; /* If it's an intent log block, failure is expected. */ if (zb->zb_level == ZB_ZIL_LEVEL) flags |= ZIO_FLAG_SPECULATIVE; mutex_enter(&spa->spa_scrub_lock); while (spa->spa_scrub_inflight > max_inflight) cv_wait(&spa->spa_scrub_io_cv, &spa->spa_scrub_lock); spa->spa_scrub_inflight++; mutex_exit(&spa->spa_scrub_lock); zio_nowait(zio_read(NULL, spa, bp, data, size, zdb_blkptr_done, zcb, ZIO_PRIORITY_ASYNC_READ, flags, zb)); } zcb->zcb_readfails = 0; /* only call gethrtime() every 100 blocks */ static int iters; if (++iters > 100) iters = 0; else return (0); if (dump_opt['b'] < 5 && gethrtime() > zcb->zcb_lastprint + NANOSEC) { uint64_t now = gethrtime(); char buf[10]; uint64_t bytes = zcb->zcb_type[ZB_TOTAL][ZDB_OT_TOTAL].zb_asize; int kb_per_sec = 1 + bytes / (1 + ((now - zcb->zcb_start) / 1000 / 1000)); int sec_remaining = (zcb->zcb_totalasize - bytes) / 1024 / kb_per_sec; zfs_nicenum(bytes, buf, sizeof (buf)); (void) fprintf(stderr, "\r%5s completed (%4dMB/s) " "estimated time remaining: %uhr %02umin %02usec ", buf, kb_per_sec / 1024, sec_remaining / 60 / 60, sec_remaining / 60 % 60, sec_remaining % 60); zcb->zcb_lastprint = now; } return (0); } static void zdb_leak(void *arg, uint64_t start, uint64_t size) { vdev_t *vd = arg; (void) printf("leaked space: vdev %llu, offset 0x%llx, size %llu\n", (u_longlong_t)vd->vdev_id, (u_longlong_t)start, (u_longlong_t)size); } static metaslab_ops_t zdb_metaslab_ops = { NULL /* alloc */ }; static void zdb_ddt_leak_init(spa_t *spa, zdb_cb_t *zcb) { ddt_bookmark_t ddb = { 0 }; ddt_entry_t dde; int error; while ((error = ddt_walk(spa, &ddb, &dde)) == 0) { blkptr_t blk; ddt_phys_t *ddp = dde.dde_phys; if (ddb.ddb_class == DDT_CLASS_UNIQUE) return; ASSERT(ddt_phys_total_refcnt(&dde) > 1); for (int p = 0; p < DDT_PHYS_TYPES; p++, ddp++) { if (ddp->ddp_phys_birth == 0) continue; ddt_bp_create(ddb.ddb_checksum, &dde.dde_key, ddp, &blk); if (p == DDT_PHYS_DITTO) { zdb_count_block(zcb, NULL, &blk, ZDB_OT_DITTO); } else { zcb->zcb_dedup_asize += BP_GET_ASIZE(&blk) * (ddp->ddp_refcnt - 1); zcb->zcb_dedup_blocks++; } } if (!dump_opt['L']) { ddt_t *ddt = spa->spa_ddt[ddb.ddb_checksum]; ddt_enter(ddt); VERIFY(ddt_lookup(ddt, &blk, B_TRUE) != NULL); ddt_exit(ddt); } } ASSERT(error == ENOENT); } static void zdb_leak_init(spa_t *spa, zdb_cb_t *zcb) { zcb->zcb_spa = spa; if (!dump_opt['L']) { vdev_t *rvd = spa->spa_root_vdev; for (uint64_t c = 0; c < rvd->vdev_children; c++) { vdev_t *vd = rvd->vdev_child[c]; for (uint64_t m = 0; m < vd->vdev_ms_count; m++) { metaslab_t *msp = vd->vdev_ms[m]; mutex_enter(&msp->ms_lock); metaslab_unload(msp); /* * For leak detection, we overload the metaslab * ms_tree to contain allocated segments * instead of free segments. As a result, * we can't use the normal metaslab_load/unload * interfaces. */ if (msp->ms_sm != NULL) { (void) fprintf(stderr, "\rloading space map for " "vdev %llu of %llu, " "metaslab %llu of %llu ...", (longlong_t)c, (longlong_t)rvd->vdev_children, (longlong_t)m, (longlong_t)vd->vdev_ms_count); msp->ms_ops = &zdb_metaslab_ops; /* * We don't want to spend the CPU * manipulating the size-ordered * tree, so clear the range_tree * ops. */ msp->ms_tree->rt_ops = NULL; VERIFY0(space_map_load(msp->ms_sm, msp->ms_tree, SM_ALLOC)); msp->ms_loaded = B_TRUE; } mutex_exit(&msp->ms_lock); } } (void) fprintf(stderr, "\n"); } spa_config_enter(spa, SCL_CONFIG, FTAG, RW_READER); zdb_ddt_leak_init(spa, zcb); spa_config_exit(spa, SCL_CONFIG, FTAG); } static void zdb_leak_fini(spa_t *spa) { if (!dump_opt['L']) { vdev_t *rvd = spa->spa_root_vdev; for (int c = 0; c < rvd->vdev_children; c++) { vdev_t *vd = rvd->vdev_child[c]; for (int m = 0; m < vd->vdev_ms_count; m++) { metaslab_t *msp = vd->vdev_ms[m]; mutex_enter(&msp->ms_lock); /* * The ms_tree has been overloaded to * contain allocated segments. Now that we * finished traversing all blocks, any * block that remains in the ms_tree * represents an allocated block that we * did not claim during the traversal. * Claimed blocks would have been removed * from the ms_tree. */ range_tree_vacate(msp->ms_tree, zdb_leak, vd); msp->ms_loaded = B_FALSE; mutex_exit(&msp->ms_lock); } } } } /* ARGSUSED */ static int count_block_cb(void *arg, const blkptr_t *bp, dmu_tx_t *tx) { zdb_cb_t *zcb = arg; if (dump_opt['b'] >= 5) { char blkbuf[BP_SPRINTF_LEN]; snprintf_blkptr(blkbuf, sizeof (blkbuf), bp); (void) printf("[%s] %s\n", "deferred free", blkbuf); } zdb_count_block(zcb, NULL, bp, ZDB_OT_DEFERRED); return (0); } static int dump_block_stats(spa_t *spa) { zdb_cb_t zcb = { 0 }; zdb_blkstats_t *zb, *tzb; uint64_t norm_alloc, norm_space, total_alloc, total_found; int flags = TRAVERSE_PRE | TRAVERSE_PREFETCH_METADATA | TRAVERSE_HARD; boolean_t leaks = B_FALSE; (void) printf("\nTraversing all blocks %s%s%s%s%s...\n\n", (dump_opt['c'] || !dump_opt['L']) ? "to verify " : "", (dump_opt['c'] == 1) ? "metadata " : "", dump_opt['c'] ? "checksums " : "", (dump_opt['c'] && !dump_opt['L']) ? "and verify " : "", !dump_opt['L'] ? "nothing leaked " : ""); /* * Load all space maps as SM_ALLOC maps, then traverse the pool * claiming each block we discover. If the pool is perfectly * consistent, the space maps will be empty when we're done. * Anything left over is a leak; any block we can't claim (because * it's not part of any space map) is a double allocation, * reference to a freed block, or an unclaimed log block. */ zdb_leak_init(spa, &zcb); /* * If there's a deferred-free bplist, process that first. */ (void) bpobj_iterate_nofree(&spa->spa_deferred_bpobj, count_block_cb, &zcb, NULL); if (spa_version(spa) >= SPA_VERSION_DEADLISTS) { (void) bpobj_iterate_nofree(&spa->spa_dsl_pool->dp_free_bpobj, count_block_cb, &zcb, NULL); } if (spa_feature_is_active(spa, SPA_FEATURE_ASYNC_DESTROY)) { VERIFY3U(0, ==, bptree_iterate(spa->spa_meta_objset, spa->spa_dsl_pool->dp_bptree_obj, B_FALSE, count_block_cb, &zcb, NULL)); } if (dump_opt['c'] > 1) flags |= TRAVERSE_PREFETCH_DATA; zcb.zcb_totalasize = metaslab_class_get_alloc(spa_normal_class(spa)); zcb.zcb_start = zcb.zcb_lastprint = gethrtime(); zcb.zcb_haderrors |= traverse_pool(spa, 0, flags, zdb_blkptr_cb, &zcb); /* * If we've traversed the data blocks then we need to wait for those * I/Os to complete. We leverage "The Godfather" zio to wait on * all async I/Os to complete. */ if (dump_opt['c']) { for (int i = 0; i < max_ncpus; i++) { (void) zio_wait(spa->spa_async_zio_root[i]); spa->spa_async_zio_root[i] = zio_root(spa, NULL, NULL, ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE | ZIO_FLAG_GODFATHER); } } if (zcb.zcb_haderrors) { (void) printf("\nError counts:\n\n"); (void) printf("\t%5s %s\n", "errno", "count"); for (int e = 0; e < 256; e++) { if (zcb.zcb_errors[e] != 0) { (void) printf("\t%5d %llu\n", e, (u_longlong_t)zcb.zcb_errors[e]); } } } /* * Report any leaked segments. */ zdb_leak_fini(spa); tzb = &zcb.zcb_type[ZB_TOTAL][ZDB_OT_TOTAL]; norm_alloc = metaslab_class_get_alloc(spa_normal_class(spa)); norm_space = metaslab_class_get_space(spa_normal_class(spa)); total_alloc = norm_alloc + metaslab_class_get_alloc(spa_log_class(spa)); total_found = tzb->zb_asize - zcb.zcb_dedup_asize; if (total_found == total_alloc) { if (!dump_opt['L']) (void) printf("\n\tNo leaks (block sum matches space" " maps exactly)\n"); } else { (void) printf("block traversal size %llu != alloc %llu " "(%s %lld)\n", (u_longlong_t)total_found, (u_longlong_t)total_alloc, (dump_opt['L']) ? "unreachable" : "leaked", (longlong_t)(total_alloc - total_found)); leaks = B_TRUE; } if (tzb->zb_count == 0) return (2); (void) printf("\n"); (void) printf("\tbp count: %10llu\n", (u_longlong_t)tzb->zb_count); (void) printf("\tganged count: %10llu\n", (longlong_t)tzb->zb_gangs); (void) printf("\tbp logical: %10llu avg: %6llu\n", (u_longlong_t)tzb->zb_lsize, (u_longlong_t)(tzb->zb_lsize / tzb->zb_count)); (void) printf("\tbp physical: %10llu avg:" " %6llu compression: %6.2f\n", (u_longlong_t)tzb->zb_psize, (u_longlong_t)(tzb->zb_psize / tzb->zb_count), (double)tzb->zb_lsize / tzb->zb_psize); (void) printf("\tbp allocated: %10llu avg:" " %6llu compression: %6.2f\n", (u_longlong_t)tzb->zb_asize, (u_longlong_t)(tzb->zb_asize / tzb->zb_count), (double)tzb->zb_lsize / tzb->zb_asize); (void) printf("\tbp deduped: %10llu ref>1:" " %6llu deduplication: %6.2f\n", (u_longlong_t)zcb.zcb_dedup_asize, (u_longlong_t)zcb.zcb_dedup_blocks, (double)zcb.zcb_dedup_asize / tzb->zb_asize + 1.0); (void) printf("\tSPA allocated: %10llu used: %5.2f%%\n", (u_longlong_t)norm_alloc, 100.0 * norm_alloc / norm_space); for (bp_embedded_type_t i = 0; i < NUM_BP_EMBEDDED_TYPES; i++) { if (zcb.zcb_embedded_blocks[i] == 0) continue; (void) printf("\n"); (void) printf("\tadditional, non-pointer bps of type %u: " "%10llu\n", i, (u_longlong_t)zcb.zcb_embedded_blocks[i]); if (dump_opt['b'] >= 3) { (void) printf("\t number of (compressed) bytes: " "number of bps\n"); dump_histogram(zcb.zcb_embedded_histogram[i], sizeof (zcb.zcb_embedded_histogram[i]) / sizeof (zcb.zcb_embedded_histogram[i][0]), 0); } } if (tzb->zb_ditto_samevdev != 0) { (void) printf("\tDittoed blocks on same vdev: %llu\n", (longlong_t)tzb->zb_ditto_samevdev); } if (dump_opt['b'] >= 2) { int l, t, level; (void) printf("\nBlocks\tLSIZE\tPSIZE\tASIZE" "\t avg\t comp\t%%Total\tType\n"); for (t = 0; t <= ZDB_OT_TOTAL; t++) { char csize[32], lsize[32], psize[32], asize[32]; char avg[32], gang[32]; char *typename; if (t < DMU_OT_NUMTYPES) typename = dmu_ot[t].ot_name; else typename = zdb_ot_extname[t - DMU_OT_NUMTYPES]; if (zcb.zcb_type[ZB_TOTAL][t].zb_asize == 0) { (void) printf("%6s\t%5s\t%5s\t%5s" "\t%5s\t%5s\t%6s\t%s\n", "-", "-", "-", "-", "-", "-", "-", typename); continue; } for (l = ZB_TOTAL - 1; l >= -1; l--) { level = (l == -1 ? ZB_TOTAL : l); zb = &zcb.zcb_type[level][t]; if (zb->zb_asize == 0) continue; if (dump_opt['b'] < 3 && level != ZB_TOTAL) continue; if (level == 0 && zb->zb_asize == zcb.zcb_type[ZB_TOTAL][t].zb_asize) continue; zdb_nicenum(zb->zb_count, csize); zdb_nicenum(zb->zb_lsize, lsize); zdb_nicenum(zb->zb_psize, psize); zdb_nicenum(zb->zb_asize, asize); zdb_nicenum(zb->zb_asize / zb->zb_count, avg); zdb_nicenum(zb->zb_gangs, gang); (void) printf("%6s\t%5s\t%5s\t%5s\t%5s" "\t%5.2f\t%6.2f\t", csize, lsize, psize, asize, avg, (double)zb->zb_lsize / zb->zb_psize, 100.0 * zb->zb_asize / tzb->zb_asize); if (level == ZB_TOTAL) (void) printf("%s\n", typename); else (void) printf(" L%d %s\n", level, typename); if (dump_opt['b'] >= 3 && zb->zb_gangs > 0) { (void) printf("\t number of ganged " "blocks: %s\n", gang); } if (dump_opt['b'] >= 4) { (void) printf("psize " "(in 512-byte sectors): " "number of blocks\n"); dump_histogram(zb->zb_psize_histogram, PSIZE_HISTO_SIZE, 0); } } } } (void) printf("\n"); if (leaks) return (2); if (zcb.zcb_haderrors) return (3); return (0); } typedef struct zdb_ddt_entry { ddt_key_t zdde_key; uint64_t zdde_ref_blocks; uint64_t zdde_ref_lsize; uint64_t zdde_ref_psize; uint64_t zdde_ref_dsize; avl_node_t zdde_node; } zdb_ddt_entry_t; /* ARGSUSED */ static int zdb_ddt_add_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp, const zbookmark_phys_t *zb, const dnode_phys_t *dnp, void *arg) { avl_tree_t *t = arg; avl_index_t where; zdb_ddt_entry_t *zdde, zdde_search; if (bp == NULL || BP_IS_HOLE(bp) || BP_IS_EMBEDDED(bp)) return (0); if (dump_opt['S'] > 1 && zb->zb_level == ZB_ROOT_LEVEL) { (void) printf("traversing objset %llu, %llu objects, " "%lu blocks so far\n", (u_longlong_t)zb->zb_objset, (u_longlong_t)BP_GET_FILL(bp), avl_numnodes(t)); } if (BP_IS_HOLE(bp) || BP_GET_CHECKSUM(bp) == ZIO_CHECKSUM_OFF || BP_GET_LEVEL(bp) > 0 || DMU_OT_IS_METADATA(BP_GET_TYPE(bp))) return (0); ddt_key_fill(&zdde_search.zdde_key, bp); zdde = avl_find(t, &zdde_search, &where); if (zdde == NULL) { zdde = umem_zalloc(sizeof (*zdde), UMEM_NOFAIL); zdde->zdde_key = zdde_search.zdde_key; avl_insert(t, zdde, where); } zdde->zdde_ref_blocks += 1; zdde->zdde_ref_lsize += BP_GET_LSIZE(bp); zdde->zdde_ref_psize += BP_GET_PSIZE(bp); zdde->zdde_ref_dsize += bp_get_dsize_sync(spa, bp); return (0); } static void dump_simulated_ddt(spa_t *spa) { avl_tree_t t; void *cookie = NULL; zdb_ddt_entry_t *zdde; ddt_histogram_t ddh_total = { 0 }; ddt_stat_t dds_total = { 0 }; avl_create(&t, ddt_entry_compare, sizeof (zdb_ddt_entry_t), offsetof(zdb_ddt_entry_t, zdde_node)); spa_config_enter(spa, SCL_CONFIG, FTAG, RW_READER); (void) traverse_pool(spa, 0, TRAVERSE_PRE | TRAVERSE_PREFETCH_METADATA, zdb_ddt_add_cb, &t); spa_config_exit(spa, SCL_CONFIG, FTAG); while ((zdde = avl_destroy_nodes(&t, &cookie)) != NULL) { ddt_stat_t dds; uint64_t refcnt = zdde->zdde_ref_blocks; ASSERT(refcnt != 0); dds.dds_blocks = zdde->zdde_ref_blocks / refcnt; dds.dds_lsize = zdde->zdde_ref_lsize / refcnt; dds.dds_psize = zdde->zdde_ref_psize / refcnt; dds.dds_dsize = zdde->zdde_ref_dsize / refcnt; dds.dds_ref_blocks = zdde->zdde_ref_blocks; dds.dds_ref_lsize = zdde->zdde_ref_lsize; dds.dds_ref_psize = zdde->zdde_ref_psize; dds.dds_ref_dsize = zdde->zdde_ref_dsize; ddt_stat_add(&ddh_total.ddh_stat[highbit64(refcnt) - 1], &dds, 0); umem_free(zdde, sizeof (*zdde)); } avl_destroy(&t); ddt_histogram_stat(&dds_total, &ddh_total); (void) printf("Simulated DDT histogram:\n"); zpool_dump_ddt(&dds_total, &ddh_total); dump_dedup_ratio(&dds_total); } static void dump_zpool(spa_t *spa) { dsl_pool_t *dp = spa_get_dsl(spa); int rc = 0; if (dump_opt['S']) { dump_simulated_ddt(spa); return; } if (!dump_opt['e'] && dump_opt['C'] > 1) { (void) printf("\nCached configuration:\n"); dump_nvlist(spa->spa_config, 8); } if (dump_opt['C']) dump_config(spa); if (dump_opt['u']) dump_uberblock(&spa->spa_uberblock, "\nUberblock:\n", "\n"); if (dump_opt['D']) dump_all_ddts(spa); if (dump_opt['d'] > 2 || dump_opt['m']) dump_metaslabs(spa); if (dump_opt['M']) dump_metaslab_groups(spa); if (dump_opt['d'] || dump_opt['i']) { dump_dir(dp->dp_meta_objset); if (dump_opt['d'] >= 3) { dump_full_bpobj(&spa->spa_deferred_bpobj, "Deferred frees", 0); if (spa_version(spa) >= SPA_VERSION_DEADLISTS) { dump_full_bpobj( &spa->spa_dsl_pool->dp_free_bpobj, "Pool snapshot frees", 0); } if (spa_feature_is_active(spa, SPA_FEATURE_ASYNC_DESTROY)) { dump_bptree(spa->spa_meta_objset, spa->spa_dsl_pool->dp_bptree_obj, "Pool dataset frees"); } dump_dtl(spa->spa_root_vdev, 0); } (void) dmu_objset_find(spa_name(spa), dump_one_dir, NULL, DS_FIND_SNAPSHOTS | DS_FIND_CHILDREN); for (spa_feature_t f = 0; f < SPA_FEATURES; f++) { uint64_t refcount; if (!(spa_feature_table[f].fi_flags & ZFEATURE_FLAG_PER_DATASET)) { ASSERT0(dataset_feature_count[f]); continue; } (void) feature_get_refcount(spa, &spa_feature_table[f], &refcount); if (dataset_feature_count[f] != refcount) { (void) printf("%s feature refcount mismatch: " "%lld datasets != %lld refcount\n", spa_feature_table[f].fi_uname, (longlong_t)dataset_feature_count[f], (longlong_t)refcount); rc = 2; } else { (void) printf("Verified %s feature refcount " "of %llu is correct\n", spa_feature_table[f].fi_uname, (longlong_t)refcount); } } } if (rc == 0 && (dump_opt['b'] || dump_opt['c'])) rc = dump_block_stats(spa); if (rc == 0) rc = verify_spacemap_refcounts(spa); if (dump_opt['s']) show_pool_stats(spa); if (dump_opt['h']) dump_history(spa); if (rc != 0) exit(rc); } #define ZDB_FLAG_CHECKSUM 0x0001 #define ZDB_FLAG_DECOMPRESS 0x0002 #define ZDB_FLAG_BSWAP 0x0004 #define ZDB_FLAG_GBH 0x0008 #define ZDB_FLAG_INDIRECT 0x0010 #define ZDB_FLAG_PHYS 0x0020 #define ZDB_FLAG_RAW 0x0040 #define ZDB_FLAG_PRINT_BLKPTR 0x0080 int flagbits[256]; static void zdb_print_blkptr(blkptr_t *bp, int flags) { char blkbuf[BP_SPRINTF_LEN]; if (flags & ZDB_FLAG_BSWAP) byteswap_uint64_array((void *)bp, sizeof (blkptr_t)); snprintf_blkptr(blkbuf, sizeof (blkbuf), bp); (void) printf("%s\n", blkbuf); } static void zdb_dump_indirect(blkptr_t *bp, int nbps, int flags) { int i; for (i = 0; i < nbps; i++) zdb_print_blkptr(&bp[i], flags); } static void zdb_dump_gbh(void *buf, int flags) { zdb_dump_indirect((blkptr_t *)buf, SPA_GBH_NBLKPTRS, flags); } static void zdb_dump_block_raw(void *buf, uint64_t size, int flags) { if (flags & ZDB_FLAG_BSWAP) byteswap_uint64_array(buf, size); (void) write(1, buf, size); } static void zdb_dump_block(char *label, void *buf, uint64_t size, int flags) { uint64_t *d = (uint64_t *)buf; int nwords = size / sizeof (uint64_t); int do_bswap = !!(flags & ZDB_FLAG_BSWAP); int i, j; char *hdr, *c; if (do_bswap) hdr = " 7 6 5 4 3 2 1 0 f e d c b a 9 8"; else hdr = " 0 1 2 3 4 5 6 7 8 9 a b c d e f"; (void) printf("\n%s\n%6s %s 0123456789abcdef\n", label, "", hdr); for (i = 0; i < nwords; i += 2) { (void) printf("%06llx: %016llx %016llx ", (u_longlong_t)(i * sizeof (uint64_t)), (u_longlong_t)(do_bswap ? BSWAP_64(d[i]) : d[i]), (u_longlong_t)(do_bswap ? BSWAP_64(d[i + 1]) : d[i + 1])); c = (char *)&d[i]; for (j = 0; j < 2 * sizeof (uint64_t); j++) (void) printf("%c", isprint(c[j]) ? c[j] : '.'); (void) printf("\n"); } } /* * There are two acceptable formats: * leaf_name - For example: c1t0d0 or /tmp/ztest.0a * child[.child]* - For example: 0.1.1 * * The second form can be used to specify arbitrary vdevs anywhere * in the heirarchy. For example, in a pool with a mirror of * RAID-Zs, you can specify either RAID-Z vdev with 0.0 or 0.1 . */ static vdev_t * zdb_vdev_lookup(vdev_t *vdev, char *path) { char *s, *p, *q; int i; if (vdev == NULL) return (NULL); /* First, assume the x.x.x.x format */ i = (int)strtoul(path, &s, 10); if (s == path || (s && *s != '.' && *s != '\0')) goto name; if (i < 0 || i >= vdev->vdev_children) return (NULL); vdev = vdev->vdev_child[i]; if (*s == '\0') return (vdev); return (zdb_vdev_lookup(vdev, s+1)); name: for (i = 0; i < vdev->vdev_children; i++) { vdev_t *vc = vdev->vdev_child[i]; if (vc->vdev_path == NULL) { vc = zdb_vdev_lookup(vc, path); if (vc == NULL) continue; else return (vc); } p = strrchr(vc->vdev_path, '/'); p = p ? p + 1 : vc->vdev_path; q = &vc->vdev_path[strlen(vc->vdev_path) - 2]; if (strcmp(vc->vdev_path, path) == 0) return (vc); if (strcmp(p, path) == 0) return (vc); if (strcmp(q, "s0") == 0 && strncmp(p, path, q - p) == 0) return (vc); } return (NULL); } /* * Read a block from a pool and print it out. The syntax of the * block descriptor is: * * pool:vdev_specifier:offset:size[:flags] * * pool - The name of the pool you wish to read from * vdev_specifier - Which vdev (see comment for zdb_vdev_lookup) * offset - offset, in hex, in bytes * size - Amount of data to read, in hex, in bytes * flags - A string of characters specifying options * b: Decode a blkptr at given offset within block * *c: Calculate and display checksums * d: Decompress data before dumping * e: Byteswap data before dumping * g: Display data as a gang block header * i: Display as an indirect block * p: Do I/O to physical offset * r: Dump raw data to stdout * * * = not yet implemented */ static void zdb_read_block(char *thing, spa_t *spa) { blkptr_t blk, *bp = &blk; dva_t *dva = bp->blk_dva; int flags = 0; uint64_t offset = 0, size = 0, psize = 0, lsize = 0, blkptr_offset = 0; zio_t *zio; vdev_t *vd; void *pbuf, *lbuf, *buf; char *s, *p, *dup, *vdev, *flagstr; int i, error; dup = strdup(thing); s = strtok(dup, ":"); vdev = s ? s : ""; s = strtok(NULL, ":"); offset = strtoull(s ? s : "", NULL, 16); s = strtok(NULL, ":"); size = strtoull(s ? s : "", NULL, 16); s = strtok(NULL, ":"); flagstr = s ? s : ""; s = NULL; if (size == 0) s = "size must not be zero"; if (!IS_P2ALIGNED(size, DEV_BSIZE)) s = "size must be a multiple of sector size"; if (!IS_P2ALIGNED(offset, DEV_BSIZE)) s = "offset must be a multiple of sector size"; if (s) { (void) printf("Invalid block specifier: %s - %s\n", thing, s); free(dup); return; } for (s = strtok(flagstr, ":"); s; s = strtok(NULL, ":")) { for (i = 0; flagstr[i]; i++) { int bit = flagbits[(uchar_t)flagstr[i]]; if (bit == 0) { (void) printf("***Invalid flag: %c\n", flagstr[i]); continue; } flags |= bit; /* If it's not something with an argument, keep going */ if ((bit & (ZDB_FLAG_CHECKSUM | ZDB_FLAG_PRINT_BLKPTR)) == 0) continue; p = &flagstr[i + 1]; if (bit == ZDB_FLAG_PRINT_BLKPTR) blkptr_offset = strtoull(p, &p, 16); if (*p != ':' && *p != '\0') { (void) printf("***Invalid flag arg: '%s'\n", s); free(dup); return; } i += p - &flagstr[i + 1]; /* skip over the number */ } } vd = zdb_vdev_lookup(spa->spa_root_vdev, vdev); if (vd == NULL) { (void) printf("***Invalid vdev: %s\n", vdev); free(dup); return; } else { if (vd->vdev_path) (void) fprintf(stderr, "Found vdev: %s\n", vd->vdev_path); else (void) fprintf(stderr, "Found vdev type: %s\n", vd->vdev_ops->vdev_op_type); } psize = size; lsize = size; pbuf = umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL); lbuf = umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL); BP_ZERO(bp); DVA_SET_VDEV(&dva[0], vd->vdev_id); DVA_SET_OFFSET(&dva[0], offset); DVA_SET_GANG(&dva[0], !!(flags & ZDB_FLAG_GBH)); DVA_SET_ASIZE(&dva[0], vdev_psize_to_asize(vd, psize)); BP_SET_BIRTH(bp, TXG_INITIAL, TXG_INITIAL); BP_SET_LSIZE(bp, lsize); BP_SET_PSIZE(bp, psize); BP_SET_COMPRESS(bp, ZIO_COMPRESS_OFF); BP_SET_CHECKSUM(bp, ZIO_CHECKSUM_OFF); BP_SET_TYPE(bp, DMU_OT_NONE); BP_SET_LEVEL(bp, 0); BP_SET_DEDUP(bp, 0); BP_SET_BYTEORDER(bp, ZFS_HOST_BYTEORDER); spa_config_enter(spa, SCL_STATE, FTAG, RW_READER); zio = zio_root(spa, NULL, NULL, 0); if (vd == vd->vdev_top) { /* * Treat this as a normal block read. */ zio_nowait(zio_read(zio, spa, bp, pbuf, psize, NULL, NULL, ZIO_PRIORITY_SYNC_READ, ZIO_FLAG_CANFAIL | ZIO_FLAG_RAW, NULL)); } else { /* * Treat this as a vdev child I/O. */ zio_nowait(zio_vdev_child_io(zio, bp, vd, offset, pbuf, psize, ZIO_TYPE_READ, ZIO_PRIORITY_SYNC_READ, ZIO_FLAG_DONT_CACHE | ZIO_FLAG_DONT_QUEUE | ZIO_FLAG_DONT_PROPAGATE | ZIO_FLAG_DONT_RETRY | ZIO_FLAG_CANFAIL | ZIO_FLAG_RAW, NULL, NULL)); } error = zio_wait(zio); spa_config_exit(spa, SCL_STATE, FTAG); if (error) { (void) printf("Read of %s failed, error: %d\n", thing, error); goto out; } if (flags & ZDB_FLAG_DECOMPRESS) { /* * We don't know how the data was compressed, so just try * every decompress function at every inflated blocksize. */ enum zio_compress c; void *pbuf2 = umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL); void *lbuf2 = umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL); bcopy(pbuf, pbuf2, psize); VERIFY(random_get_pseudo_bytes((uint8_t *)pbuf + psize, SPA_MAXBLOCKSIZE - psize) == 0); VERIFY(random_get_pseudo_bytes((uint8_t *)pbuf2 + psize, SPA_MAXBLOCKSIZE - psize) == 0); for (lsize = SPA_MAXBLOCKSIZE; lsize > psize; lsize -= SPA_MINBLOCKSIZE) { for (c = 0; c < ZIO_COMPRESS_FUNCTIONS; c++) { if (zio_decompress_data(c, pbuf, lbuf, psize, lsize) == 0 && zio_decompress_data(c, pbuf2, lbuf2, psize, lsize) == 0 && bcmp(lbuf, lbuf2, lsize) == 0) break; } if (c != ZIO_COMPRESS_FUNCTIONS) break; lsize -= SPA_MINBLOCKSIZE; } umem_free(pbuf2, SPA_MAXBLOCKSIZE); umem_free(lbuf2, SPA_MAXBLOCKSIZE); if (lsize <= psize) { (void) printf("Decompress of %s failed\n", thing); goto out; } buf = lbuf; size = lsize; } else { buf = pbuf; size = psize; } if (flags & ZDB_FLAG_PRINT_BLKPTR) zdb_print_blkptr((blkptr_t *)(void *) ((uintptr_t)buf + (uintptr_t)blkptr_offset), flags); else if (flags & ZDB_FLAG_RAW) zdb_dump_block_raw(buf, size, flags); else if (flags & ZDB_FLAG_INDIRECT) zdb_dump_indirect((blkptr_t *)buf, size / sizeof (blkptr_t), flags); else if (flags & ZDB_FLAG_GBH) zdb_dump_gbh(buf, flags); else zdb_dump_block(thing, buf, size, flags); out: umem_free(pbuf, SPA_MAXBLOCKSIZE); umem_free(lbuf, SPA_MAXBLOCKSIZE); free(dup); } static boolean_t pool_match(nvlist_t *cfg, char *tgt) { uint64_t v, guid = strtoull(tgt, NULL, 0); char *s; if (guid != 0) { if (nvlist_lookup_uint64(cfg, ZPOOL_CONFIG_POOL_GUID, &v) == 0) return (v == guid); } else { if (nvlist_lookup_string(cfg, ZPOOL_CONFIG_POOL_NAME, &s) == 0) return (strcmp(s, tgt) == 0); } return (B_FALSE); } static char * find_zpool(char **target, nvlist_t **configp, int dirc, char **dirv) { nvlist_t *pools; nvlist_t *match = NULL; char *name = NULL; char *sepp = NULL; char sep; int count = 0; importargs_t args = { 0 }; args.paths = dirc; args.path = dirv; args.can_be_active = B_TRUE; if ((sepp = strpbrk(*target, "/@")) != NULL) { sep = *sepp; *sepp = '\0'; } pools = zpool_search_import(g_zfs, &args); if (pools != NULL) { nvpair_t *elem = NULL; while ((elem = nvlist_next_nvpair(pools, elem)) != NULL) { verify(nvpair_value_nvlist(elem, configp) == 0); if (pool_match(*configp, *target)) { count++; if (match != NULL) { /* print previously found config */ if (name != NULL) { (void) printf("%s\n", name); dump_nvlist(match, 8); name = NULL; } (void) printf("%s\n", nvpair_name(elem)); dump_nvlist(*configp, 8); } else { match = *configp; name = nvpair_name(elem); } } } } if (count > 1) (void) fatal("\tMatched %d pools - use pool GUID " "instead of pool name or \n" "\tpool name part of a dataset name to select pool", count); if (sepp) *sepp = sep; /* * If pool GUID was specified for pool id, replace it with pool name */ if (name && (strstr(*target, name) != *target)) { int sz = 1 + strlen(name) + ((sepp) ? strlen(sepp) : 0); *target = umem_alloc(sz, UMEM_NOFAIL); (void) snprintf(*target, sz, "%s%s", name, sepp ? sepp : ""); } *configp = name ? match : NULL; return (name); } int main(int argc, char **argv) { int i, c; struct rlimit rl = { 1024, 1024 }; spa_t *spa = NULL; objset_t *os = NULL; int dump_all = 1; int verbose = 0; int error = 0; char **searchdirs = NULL; int nsearch = 0; char *target; nvlist_t *policy = NULL; uint64_t max_txg = UINT64_MAX; int rewind = ZPOOL_NEVER_REWIND; + char *spa_config_path_env; + boolean_t target_is_spa = B_TRUE; (void) setrlimit(RLIMIT_NOFILE, &rl); (void) enable_extended_FILE_stdio(-1, -1); dprintf_setup(&argc, argv); + /* + * If there is an environment variable SPA_CONFIG_PATH it overrides + * default spa_config_path setting. If -U flag is specified it will + * override this environment variable settings once again. + */ + spa_config_path_env = getenv("SPA_CONFIG_PATH"); + if (spa_config_path_env != NULL) + spa_config_path = spa_config_path_env; + while ((c = getopt(argc, argv, "bcdhilmMI:suCDRSAFLXx:evp:t:U:P")) != -1) { switch (c) { case 'b': case 'c': case 'd': case 'h': case 'i': case 'l': case 'm': case 's': case 'u': case 'C': case 'D': case 'M': case 'R': case 'S': dump_opt[c]++; dump_all = 0; break; case 'A': case 'F': case 'L': case 'X': case 'e': case 'P': dump_opt[c]++; break; case 'I': max_inflight = strtoull(optarg, NULL, 0); if (max_inflight == 0) { (void) fprintf(stderr, "maximum number " "of inflight I/Os must be greater " "than 0\n"); usage(); } break; case 'p': if (searchdirs == NULL) { searchdirs = umem_alloc(sizeof (char *), UMEM_NOFAIL); } else { char **tmp = umem_alloc((nsearch + 1) * sizeof (char *), UMEM_NOFAIL); bcopy(searchdirs, tmp, nsearch * sizeof (char *)); umem_free(searchdirs, nsearch * sizeof (char *)); searchdirs = tmp; } searchdirs[nsearch++] = optarg; break; case 't': max_txg = strtoull(optarg, NULL, 0); if (max_txg < TXG_INITIAL) { (void) fprintf(stderr, "incorrect txg " "specified: %s\n", optarg); usage(); } break; case 'U': spa_config_path = optarg; break; case 'v': verbose++; break; case 'x': vn_dumpdir = optarg; break; default: usage(); break; } } if (!dump_opt['e'] && searchdirs != NULL) { (void) fprintf(stderr, "-p option requires use of -e\n"); usage(); } /* * ZDB does not typically re-read blocks; therefore limit the ARC * to 256 MB, which can be used entirely for metadata. */ zfs_arc_max = zfs_arc_meta_limit = 256 * 1024 * 1024; /* * "zdb -c" uses checksum-verifying scrub i/os which are async reads. * "zdb -b" uses traversal prefetch which uses async reads. * For good performance, let several of them be active at once. */ zfs_vdev_async_read_max_active = 10; kernel_init(FREAD); g_zfs = libzfs_init(); if (g_zfs == NULL) fatal("Fail to initialize zfs"); if (dump_all) verbose = MAX(verbose, 1); for (c = 0; c < 256; c++) { if (dump_all && !strchr("elAFLRSXP", c)) dump_opt[c] = 1; if (dump_opt[c]) dump_opt[c] += verbose; } aok = (dump_opt['A'] == 1) || (dump_opt['A'] > 2); zfs_recover = (dump_opt['A'] > 1); argc -= optind; argv += optind; if (argc < 2 && dump_opt['R']) usage(); if (argc < 1) { if (!dump_opt['e'] && dump_opt['C']) { dump_cachefile(spa_config_path); return (0); } usage(); } if (dump_opt['l']) { dump_label(argv[0]); return (0); } if (dump_opt['X'] || dump_opt['F']) rewind = ZPOOL_DO_REWIND | (dump_opt['X'] ? ZPOOL_EXTREME_REWIND : 0); if (nvlist_alloc(&policy, NV_UNIQUE_NAME_TYPE, 0) != 0 || nvlist_add_uint64(policy, ZPOOL_REWIND_REQUEST_TXG, max_txg) != 0 || nvlist_add_uint32(policy, ZPOOL_REWIND_REQUEST, rewind) != 0) fatal("internal error: %s", strerror(ENOMEM)); error = 0; target = argv[0]; if (dump_opt['e']) { nvlist_t *cfg = NULL; char *name = find_zpool(&target, &cfg, nsearch, searchdirs); error = ENOENT; if (name) { if (dump_opt['C'] > 1) { (void) printf("\nConfiguration for import:\n"); dump_nvlist(cfg, 8); } if (nvlist_add_nvlist(cfg, ZPOOL_REWIND_POLICY, policy) != 0) { fatal("can't open '%s': %s", target, strerror(ENOMEM)); } if ((error = spa_import(name, cfg, NULL, ZFS_IMPORT_MISSING_LOG)) != 0) { error = spa_import(name, cfg, NULL, ZFS_IMPORT_VERBATIM); } } } + if (strpbrk(target, "/@") != NULL) { + size_t targetlen; + + target_is_spa = B_FALSE; + /* + * Remove any trailing slash. Later code would get confused + * by it, but we want to allow it so that "pool/" can + * indicate that we want to dump the topmost filesystem, + * rather than the whole pool. + */ + targetlen = strlen(target); + if (targetlen != 0 && target[targetlen - 1] == '/') + target[targetlen - 1] = '\0'; + } + if (error == 0) { - if (strpbrk(target, "/@") == NULL || dump_opt['R']) { + if (target_is_spa || dump_opt['R']) { error = spa_open_rewind(target, &spa, FTAG, policy, NULL); if (error) { /* * If we're missing the log device then * try opening the pool after clearing the * log state. */ mutex_enter(&spa_namespace_lock); if ((spa = spa_lookup(target)) != NULL && spa->spa_log_state == SPA_LOG_MISSING) { spa->spa_log_state = SPA_LOG_CLEAR; error = 0; } mutex_exit(&spa_namespace_lock); if (!error) { error = spa_open_rewind(target, &spa, FTAG, policy, NULL); } } } else { error = dmu_objset_own(target, DMU_OST_ANY, B_TRUE, FTAG, &os); } } nvlist_free(policy); if (error) fatal("can't open '%s': %s", target, strerror(error)); argv++; argc--; if (!dump_opt['R']) { if (argc > 0) { zopt_objects = argc; zopt_object = calloc(zopt_objects, sizeof (uint64_t)); for (i = 0; i < zopt_objects; i++) { errno = 0; zopt_object[i] = strtoull(argv[i], NULL, 0); if (zopt_object[i] == 0 && errno != 0) fatal("bad number %s: %s", argv[i], strerror(errno)); } } if (os != NULL) { dump_dir(os); } else if (zopt_objects > 0 && !dump_opt['m']) { dump_dir(spa->spa_meta_objset); } else { dump_zpool(spa); } } else { flagbits['b'] = ZDB_FLAG_PRINT_BLKPTR; flagbits['c'] = ZDB_FLAG_CHECKSUM; flagbits['d'] = ZDB_FLAG_DECOMPRESS; flagbits['e'] = ZDB_FLAG_BSWAP; flagbits['g'] = ZDB_FLAG_GBH; flagbits['i'] = ZDB_FLAG_INDIRECT; flagbits['p'] = ZDB_FLAG_PHYS; flagbits['r'] = ZDB_FLAG_RAW; for (i = 0; i < argc; i++) zdb_read_block(argv[i], spa); } (os != NULL) ? dmu_objset_disown(os, FTAG) : spa_close(spa, FTAG); fuid_table_destroy(); sa_loaded = B_FALSE; libzfs_fini(g_zfs); kernel_fini(); return (0); } Index: user/alc/PQ_LAUNDRY/cddl/contrib/opensolaris/cmd/zdb =================================================================== --- user/alc/PQ_LAUNDRY/cddl/contrib/opensolaris/cmd/zdb (revision 303205) +++ user/alc/PQ_LAUNDRY/cddl/contrib/opensolaris/cmd/zdb (revision 303206) Property changes on: user/alc/PQ_LAUNDRY/cddl/contrib/opensolaris/cmd/zdb ___________________________________________________________________ Added: svn:mergeinfo ## -0,0 +0,17 ## Merged /projects/libzfs_core/cddl/contrib/opensolaris/cmd/zdb:r247831-248551 Merged /projects/clang350-import/cddl/contrib/opensolaris/cmd/zdb:r274961-276476 Merged /vendor/resolver/dist/cddl/contrib/opensolaris/cmd/zdb:r1540-186085 Merged /projects/clang360-import/cddl/contrib/opensolaris/cmd/zdb:r277327-280030 Merged /projects/clang-sparc64/cddl/contrib/opensolaris/cmd/zdb:r262258-262612 Merged /projects/clang370-import/cddl/contrib/opensolaris/cmd/zdb:r287506-288928 Merged /projects/clang380-import/cddl/contrib/opensolaris/cmd/zdb:r292913-296412 Merged /projects/multi-fibv6/head/cddl/contrib/opensolaris/cmd/zdb:r230929-231848 Merged /projects/quota64/cddl/contrib/opensolaris/cmd/zdb:r184125-207707 Merged /vendor/illumos/dist/cmd/zdb:r238592,238725,239610,239746,240110,240262,240326,240357,240949,242729,242732-242733,243012-243013,243395,244245,246388,246392,247176,247180,247580,247844-247845,248217,248266,249185,249332,251619,251623-251624,251644,252213,252215,253781,253784,254070-254071,254079,254421-254422,254746,254748,254750-254751,255255,258371-258374,258384,258972,259170,260152,260154,260710,262570,263436-263438,263886-263887,264666,264829,266766,266986-266989,266992,267565-267566,267568,267570,267931,268119,268121,268453-268455,268714,268848,269010,269223,269426,270197,271225,271511,271516,272493,272585,272588,272802,272851,274271-274273,275532,275536-275537,275547,275551,275783-275784,277425-277430,277432,279822,284030,284035,284042,286224,286538,286540,286542,286544,286546,286548,286550,286553,286555,286586,286588,286597,286599,286602,286604,286704,286707,288408,289003,289310-289312,289493,289498,289526,289530,289535,289561,289689,294816,295046,296505,296518,296527,296532,296534,296536,296538,296540,297505,297760,298471,303082-303083 Merged /head/cddl/contrib/opensolaris/cmd/zdb:r2-168403,289108-303204 Merged /projects/mpsutil/cddl/contrib/opensolaris/cmd/zdb:r286179-290100 Merged /projects/collation/cddl/contrib/opensolaris/cmd/zdb:r286424-290491 Merged /vendor-cddl/opensolaris/dist/cddl/contrib/opensolaris/cmd/zdb:r178477-194441 Merged /projects/largeSMP/cddl/contrib/opensolaris/cmd/zdb:r221273-222812,222815-223757 Merged /vendor/opensolaris/dist/cmd/zdb:r194442-210759 Merged /projects/clang-trunk/cddl/contrib/opensolaris/cmd/zdb:r283596-287505 Index: user/alc/PQ_LAUNDRY/cddl/contrib/opensolaris =================================================================== --- user/alc/PQ_LAUNDRY/cddl/contrib/opensolaris (revision 303205) +++ user/alc/PQ_LAUNDRY/cddl/contrib/opensolaris (revision 303206) Property changes on: user/alc/PQ_LAUNDRY/cddl/contrib/opensolaris ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/cddl/contrib/opensolaris:r303053-303204 Index: user/alc/PQ_LAUNDRY/cddl =================================================================== --- user/alc/PQ_LAUNDRY/cddl (revision 303205) +++ user/alc/PQ_LAUNDRY/cddl (revision 303206) Property changes on: user/alc/PQ_LAUNDRY/cddl ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/cddl:r303053-303204 Index: user/alc/PQ_LAUNDRY/contrib/binutils/bfd/elfxx-mips.c =================================================================== --- user/alc/PQ_LAUNDRY/contrib/binutils/bfd/elfxx-mips.c (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/binutils/bfd/elfxx-mips.c (revision 303206) @@ -1,11445 +1,11445 @@ /* MIPS-specific support for ELF Copyright 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007 Free Software Foundation, Inc. Most of the information added by Ian Lance Taylor, Cygnus Support, . N32/64 ABI support added by Mark Mitchell, CodeSourcery, LLC. Traditional MIPS targets support added by Koundinya.K, Dansk Data Elektronik & Operations Research Group. This file is part of BFD, the Binary File Descriptor library. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA 02110-1301, USA. */ /* This file handles functionality common to the different MIPS ABI's. */ #include "sysdep.h" #include "bfd.h" #include "libbfd.h" #include "libiberty.h" #include "elf-bfd.h" #include "elfxx-mips.h" #include "elf/mips.h" #include "elf-vxworks.h" /* Get the ECOFF swapping routines. */ #include "coff/sym.h" #include "coff/symconst.h" #include "coff/ecoff.h" #include "coff/mips.h" #include "hashtab.h" /* This structure is used to hold information about one GOT entry. There are three types of entry: (1) absolute addresses (abfd == NULL) (2) SYMBOL + OFFSET addresses, where SYMBOL is local to an input bfd (abfd != NULL, symndx >= 0) (3) global and forced-local symbols (abfd != NULL, symndx == -1) Type (3) entries are treated differently for different types of GOT. In the "master" GOT -- i.e. the one that describes every GOT reference needed in the link -- the mips_got_entry is keyed on both the symbol and the input bfd that references it. If it turns out that we need multiple GOTs, we can then use this information to create separate GOTs for each input bfd. However, we want each of these separate GOTs to have at most one entry for a given symbol, so their type (3) entries are keyed only on the symbol. The input bfd given by the "abfd" field is somewhat arbitrary in this case. This means that when there are multiple GOTs, each GOT has a unique mips_got_entry for every symbol within it. We can therefore use the mips_got_entry fields (tls_type and gotidx) to track the symbol's GOT index. However, if it turns out that we need only a single GOT, we continue to use the master GOT to describe it. There may therefore be several mips_got_entries for the same symbol, each with a different input bfd. We want to make sure that each symbol gets a unique GOT entry, so when there's a single GOT, we use the symbol's hash entry, not the mips_got_entry fields, to track a symbol's GOT index. */ struct mips_got_entry { /* The input bfd in which the symbol is defined. */ bfd *abfd; /* The index of the symbol, as stored in the relocation r_info, if we have a local symbol; -1 otherwise. */ long symndx; union { /* If abfd == NULL, an address that must be stored in the got. */ bfd_vma address; /* If abfd != NULL && symndx != -1, the addend of the relocation that should be added to the symbol value. */ bfd_vma addend; /* If abfd != NULL && symndx == -1, the hash table entry corresponding to a global symbol in the got (or, local, if h->forced_local). */ struct mips_elf_link_hash_entry *h; } d; /* The TLS types included in this GOT entry (specifically, GD and IE). The GD and IE flags can be added as we encounter new relocations. LDM can also be set; it will always be alone, not combined with any GD or IE flags. An LDM GOT entry will be a local symbol entry with r_symndx == 0. */ unsigned char tls_type; /* The offset from the beginning of the .got section to the entry corresponding to this symbol+addend. If it's a global symbol whose offset is yet to be decided, it's going to be -1. */ long gotidx; }; /* This structure is used to hold .got information when linking. */ struct mips_got_info { /* The global symbol in the GOT with the lowest index in the dynamic symbol table. */ struct elf_link_hash_entry *global_gotsym; /* The number of global .got entries. */ unsigned int global_gotno; /* The number of .got slots used for TLS. */ unsigned int tls_gotno; /* The first unused TLS .got entry. Used only during mips_elf_initialize_tls_index. */ unsigned int tls_assigned_gotno; /* The number of local .got entries. */ unsigned int local_gotno; /* The number of local .got entries we have used. */ unsigned int assigned_gotno; /* A hash table holding members of the got. */ struct htab *got_entries; /* A hash table mapping input bfds to other mips_got_info. NULL unless multi-got was necessary. */ struct htab *bfd2got; /* In multi-got links, a pointer to the next got (err, rather, most of the time, it points to the previous got). */ struct mips_got_info *next; /* This is the GOT index of the TLS LDM entry for the GOT, MINUS_ONE for none, or MINUS_TWO for not yet assigned. This is needed because a single-GOT link may have multiple hash table entries for the LDM. It does not get initialized in multi-GOT mode. */ bfd_vma tls_ldm_offset; }; /* Map an input bfd to a got in a multi-got link. */ struct mips_elf_bfd2got_hash { bfd *bfd; struct mips_got_info *g; }; /* Structure passed when traversing the bfd2got hash table, used to create and merge bfd's gots. */ struct mips_elf_got_per_bfd_arg { /* A hashtable that maps bfds to gots. */ htab_t bfd2got; /* The output bfd. */ bfd *obfd; /* The link information. */ struct bfd_link_info *info; /* A pointer to the primary got, i.e., the one that's going to get the implicit relocations from DT_MIPS_LOCAL_GOTNO and DT_MIPS_GOTSYM. */ struct mips_got_info *primary; /* A non-primary got we're trying to merge with other input bfd's gots. */ struct mips_got_info *current; /* The maximum number of got entries that can be addressed with a 16-bit offset. */ unsigned int max_count; /* The number of local and global entries in the primary got. */ unsigned int primary_count; /* The number of local and global entries in the current got. */ unsigned int current_count; /* The total number of global entries which will live in the primary got and be automatically relocated. This includes those not referenced by the primary GOT but included in the "master" GOT. */ unsigned int global_count; }; /* Another structure used to pass arguments for got entries traversal. */ struct mips_elf_set_global_got_offset_arg { struct mips_got_info *g; int value; unsigned int needed_relocs; struct bfd_link_info *info; }; /* A structure used to count TLS relocations or GOT entries, for GOT entry or ELF symbol table traversal. */ struct mips_elf_count_tls_arg { struct bfd_link_info *info; unsigned int needed; }; struct _mips_elf_section_data { struct bfd_elf_section_data elf; union { struct mips_got_info *got_info; bfd_byte *tdata; } u; }; #define mips_elf_section_data(sec) \ ((struct _mips_elf_section_data *) elf_section_data (sec)) /* This structure is passed to mips_elf_sort_hash_table_f when sorting the dynamic symbols. */ struct mips_elf_hash_sort_data { /* The symbol in the global GOT with the lowest dynamic symbol table index. */ struct elf_link_hash_entry *low; /* The least dynamic symbol table index corresponding to a non-TLS symbol with a GOT entry. */ long min_got_dynindx; /* The greatest dynamic symbol table index corresponding to a symbol with a GOT entry that is not referenced (e.g., a dynamic symbol with dynamic relocations pointing to it from non-primary GOTs). */ long max_unref_got_dynindx; /* The greatest dynamic symbol table index not corresponding to a symbol without a GOT entry. */ long max_non_got_dynindx; }; /* The MIPS ELF linker needs additional information for each symbol in the global hash table. */ struct mips_elf_link_hash_entry { struct elf_link_hash_entry root; /* External symbol information. */ EXTR esym; /* Number of R_MIPS_32, R_MIPS_REL32, or R_MIPS_64 relocs against this symbol. */ unsigned int possibly_dynamic_relocs; /* If the R_MIPS_32, R_MIPS_REL32, or R_MIPS_64 reloc is against a readonly section. */ bfd_boolean readonly_reloc; /* We must not create a stub for a symbol that has relocations related to taking the function's address, i.e. any but R_MIPS_CALL*16 ones -- see "MIPS ABI Supplement, 3rd Edition", p. 4-20. */ bfd_boolean no_fn_stub; /* If there is a stub that 32 bit functions should use to call this 16 bit function, this points to the section containing the stub. */ asection *fn_stub; /* Whether we need the fn_stub; this is set if this symbol appears in any relocs other than a 16 bit call. */ bfd_boolean need_fn_stub; /* If there is a stub that 16 bit functions should use to call this 32 bit function, this points to the section containing the stub. */ asection *call_stub; /* This is like the call_stub field, but it is used if the function being called returns a floating point value. */ asection *call_fp_stub; /* Are we forced local? This will only be set if we have converted the initial global GOT entry to a local GOT entry. */ bfd_boolean forced_local; /* Are we referenced by some kind of relocation? */ bfd_boolean is_relocation_target; /* Are we referenced by branch relocations? */ bfd_boolean is_branch_target; #define GOT_NORMAL 0 #define GOT_TLS_GD 1 #define GOT_TLS_LDM 2 #define GOT_TLS_IE 4 #define GOT_TLS_OFFSET_DONE 0x40 #define GOT_TLS_DONE 0x80 unsigned char tls_type; /* This is only used in single-GOT mode; in multi-GOT mode there is one mips_got_entry per GOT entry, so the offset is stored there. In single-GOT mode there may be many mips_got_entry structures all referring to the same GOT slot. It might be possible to use root.got.offset instead, but that field is overloaded already. */ bfd_vma tls_got_offset; }; /* MIPS ELF linker hash table. */ struct mips_elf_link_hash_table { struct elf_link_hash_table root; #if 0 /* We no longer use this. */ /* String section indices for the dynamic section symbols. */ bfd_size_type dynsym_sec_strindex[SIZEOF_MIPS_DYNSYM_SECNAMES]; #endif /* The number of .rtproc entries. */ bfd_size_type procedure_count; /* The size of the .compact_rel section (if SGI_COMPAT). */ bfd_size_type compact_rel_size; /* This flag indicates that the value of DT_MIPS_RLD_MAP dynamic entry is set to the address of __rld_obj_head as in IRIX5. */ bfd_boolean use_rld_obj_head; /* This is the value of the __rld_map or __rld_obj_head symbol. */ bfd_vma rld_value; /* This is set if we see any mips16 stub sections. */ bfd_boolean mips16_stubs_seen; /* True if we're generating code for VxWorks. */ bfd_boolean is_vxworks; /* Shortcuts to some dynamic sections, or NULL if they are not being used. */ asection *srelbss; asection *sdynbss; asection *srelplt; asection *srelplt2; asection *sgotplt; asection *splt; /* The size of the PLT header in bytes (VxWorks only). */ bfd_vma plt_header_size; /* The size of a PLT entry in bytes (VxWorks only). */ bfd_vma plt_entry_size; /* The size of a function stub entry in bytes. */ bfd_vma function_stub_size; }; #define TLS_RELOC_P(r_type) \ (r_type == R_MIPS_TLS_DTPMOD32 \ || r_type == R_MIPS_TLS_DTPMOD64 \ || r_type == R_MIPS_TLS_DTPREL32 \ || r_type == R_MIPS_TLS_DTPREL64 \ || r_type == R_MIPS_TLS_GD \ || r_type == R_MIPS_TLS_LDM \ || r_type == R_MIPS_TLS_DTPREL_HI16 \ || r_type == R_MIPS_TLS_DTPREL_LO16 \ || r_type == R_MIPS_TLS_GOTTPREL \ || r_type == R_MIPS_TLS_TPREL32 \ || r_type == R_MIPS_TLS_TPREL64 \ || r_type == R_MIPS_TLS_TPREL_HI16 \ || r_type == R_MIPS_TLS_TPREL_LO16) /* Structure used to pass information to mips_elf_output_extsym. */ struct extsym_info { bfd *abfd; struct bfd_link_info *info; struct ecoff_debug_info *debug; const struct ecoff_debug_swap *swap; bfd_boolean failed; }; /* The names of the runtime procedure table symbols used on IRIX5. */ static const char * const mips_elf_dynsym_rtproc_names[] = { "_procedure_table", "_procedure_string_table", "_procedure_table_size", NULL }; /* These structures are used to generate the .compact_rel section on IRIX5. */ typedef struct { unsigned long id1; /* Always one? */ unsigned long num; /* Number of compact relocation entries. */ unsigned long id2; /* Always two? */ unsigned long offset; /* The file offset of the first relocation. */ unsigned long reserved0; /* Zero? */ unsigned long reserved1; /* Zero? */ } Elf32_compact_rel; typedef struct { bfd_byte id1[4]; bfd_byte num[4]; bfd_byte id2[4]; bfd_byte offset[4]; bfd_byte reserved0[4]; bfd_byte reserved1[4]; } Elf32_External_compact_rel; typedef struct { unsigned int ctype : 1; /* 1: long 0: short format. See below. */ unsigned int rtype : 4; /* Relocation types. See below. */ unsigned int dist2to : 8; unsigned int relvaddr : 19; /* (VADDR - vaddr of the previous entry)/ 4 */ unsigned long konst; /* KONST field. See below. */ unsigned long vaddr; /* VADDR to be relocated. */ } Elf32_crinfo; typedef struct { unsigned int ctype : 1; /* 1: long 0: short format. See below. */ unsigned int rtype : 4; /* Relocation types. See below. */ unsigned int dist2to : 8; unsigned int relvaddr : 19; /* (VADDR - vaddr of the previous entry)/ 4 */ unsigned long konst; /* KONST field. See below. */ } Elf32_crinfo2; typedef struct { bfd_byte info[4]; bfd_byte konst[4]; bfd_byte vaddr[4]; } Elf32_External_crinfo; typedef struct { bfd_byte info[4]; bfd_byte konst[4]; } Elf32_External_crinfo2; /* These are the constants used to swap the bitfields in a crinfo. */ #define CRINFO_CTYPE (0x1) #define CRINFO_CTYPE_SH (31) #define CRINFO_RTYPE (0xf) #define CRINFO_RTYPE_SH (27) #define CRINFO_DIST2TO (0xff) #define CRINFO_DIST2TO_SH (19) #define CRINFO_RELVADDR (0x7ffff) #define CRINFO_RELVADDR_SH (0) /* A compact relocation info has long (3 words) or short (2 words) formats. A short format doesn't have VADDR field and relvaddr fields contains ((VADDR - vaddr of the previous entry) >> 2). */ #define CRF_MIPS_LONG 1 #define CRF_MIPS_SHORT 0 /* There are 4 types of compact relocation at least. The value KONST has different meaning for each type: (type) (konst) CT_MIPS_REL32 Address in data CT_MIPS_WORD Address in word (XXX) CT_MIPS_GPHI_LO GP - vaddr CT_MIPS_JMPAD Address to jump */ #define CRT_MIPS_REL32 0xa #define CRT_MIPS_WORD 0xb #define CRT_MIPS_GPHI_LO 0xc #define CRT_MIPS_JMPAD 0xd #define mips_elf_set_cr_format(x,format) ((x).ctype = (format)) #define mips_elf_set_cr_type(x,type) ((x).rtype = (type)) #define mips_elf_set_cr_dist2to(x,v) ((x).dist2to = (v)) #define mips_elf_set_cr_relvaddr(x,d) ((x).relvaddr = (d)<<2) /* The structure of the runtime procedure descriptor created by the loader for use by the static exception system. */ typedef struct runtime_pdr { bfd_vma adr; /* Memory address of start of procedure. */ long regmask; /* Save register mask. */ long regoffset; /* Save register offset. */ long fregmask; /* Save floating point register mask. */ long fregoffset; /* Save floating point register offset. */ long frameoffset; /* Frame size. */ short framereg; /* Frame pointer register. */ short pcreg; /* Offset or reg of return pc. */ long irpss; /* Index into the runtime string table. */ long reserved; struct exception_info *exception_info;/* Pointer to exception array. */ } RPDR, *pRPDR; #define cbRPDR sizeof (RPDR) #define rpdNil ((pRPDR) 0) static struct mips_got_entry *mips_elf_create_local_got_entry (bfd *, struct bfd_link_info *, bfd *, struct mips_got_info *, asection *, bfd_vma, unsigned long, struct mips_elf_link_hash_entry *, int); static bfd_boolean mips_elf_sort_hash_table_f (struct mips_elf_link_hash_entry *, void *); static bfd_vma mips_elf_high (bfd_vma); static bfd_boolean mips16_stub_section_p (bfd *, asection *); static bfd_boolean mips_elf_create_dynamic_relocation (bfd *, struct bfd_link_info *, const Elf_Internal_Rela *, struct mips_elf_link_hash_entry *, asection *, bfd_vma, bfd_vma *, asection *); static hashval_t mips_elf_got_entry_hash (const void *); static bfd_vma mips_elf_adjust_gp (bfd *, struct mips_got_info *, bfd *); static struct mips_got_info *mips_elf_got_for_ibfd (struct mips_got_info *, bfd *); /* This will be used when we sort the dynamic relocation records. */ static bfd *reldyn_sorting_bfd; /* Nonzero if ABFD is using the N32 ABI. */ #define ABI_N32_P(abfd) \ ((elf_elfheader (abfd)->e_flags & EF_MIPS_ABI2) != 0) /* Nonzero if ABFD is using the N64 ABI. */ #define ABI_64_P(abfd) \ (get_elf_backend_data (abfd)->s->elfclass == ELFCLASS64) /* Nonzero if ABFD is using NewABI conventions. */ #define NEWABI_P(abfd) (ABI_N32_P (abfd) || ABI_64_P (abfd)) /* The IRIX compatibility level we are striving for. */ #define IRIX_COMPAT(abfd) \ (get_elf_backend_data (abfd)->elf_backend_mips_irix_compat (abfd)) /* Whether we are trying to be compatible with IRIX at all. */ #define SGI_COMPAT(abfd) \ (IRIX_COMPAT (abfd) != ict_none) /* The name of the options section. */ #define MIPS_ELF_OPTIONS_SECTION_NAME(abfd) \ (NEWABI_P (abfd) ? ".MIPS.options" : ".options") /* True if NAME is the recognized name of any SHT_MIPS_OPTIONS section. Some IRIX system files do not use MIPS_ELF_OPTIONS_SECTION_NAME. */ #define MIPS_ELF_OPTIONS_SECTION_NAME_P(NAME) \ (strcmp (NAME, ".MIPS.options") == 0 || strcmp (NAME, ".options") == 0) /* Whether the section is readonly. */ #define MIPS_ELF_READONLY_SECTION(sec) \ ((sec->flags & (SEC_ALLOC | SEC_LOAD | SEC_READONLY)) \ == (SEC_ALLOC | SEC_LOAD | SEC_READONLY)) /* The name of the stub section. */ #define MIPS_ELF_STUB_SECTION_NAME(abfd) ".MIPS.stubs" /* The size of an external REL relocation. */ #define MIPS_ELF_REL_SIZE(abfd) \ (get_elf_backend_data (abfd)->s->sizeof_rel) /* The size of an external RELA relocation. */ #define MIPS_ELF_RELA_SIZE(abfd) \ (get_elf_backend_data (abfd)->s->sizeof_rela) /* The size of an external dynamic table entry. */ #define MIPS_ELF_DYN_SIZE(abfd) \ (get_elf_backend_data (abfd)->s->sizeof_dyn) /* The size of the rld_map pointer. */ #define MIPS_ELF_RLD_MAP_SIZE(abfd) \ (get_elf_backend_data (abfd)->s->arch_size / 8) /* The size of a GOT entry. */ #define MIPS_ELF_GOT_SIZE(abfd) \ (get_elf_backend_data (abfd)->s->arch_size / 8) /* The size of a symbol-table entry. */ #define MIPS_ELF_SYM_SIZE(abfd) \ (get_elf_backend_data (abfd)->s->sizeof_sym) /* The default alignment for sections, as a power of two. */ #define MIPS_ELF_LOG_FILE_ALIGN(abfd) \ (get_elf_backend_data (abfd)->s->log_file_align) /* Get word-sized data. */ #define MIPS_ELF_GET_WORD(abfd, ptr) \ (ABI_64_P (abfd) ? bfd_get_64 (abfd, ptr) : bfd_get_32 (abfd, ptr)) /* Put out word-sized data. */ #define MIPS_ELF_PUT_WORD(abfd, val, ptr) \ (ABI_64_P (abfd) \ ? bfd_put_64 (abfd, val, ptr) \ : bfd_put_32 (abfd, val, ptr)) /* Add a dynamic symbol table-entry. */ #define MIPS_ELF_ADD_DYNAMIC_ENTRY(info, tag, val) \ _bfd_elf_add_dynamic_entry (info, tag, val) #define MIPS_ELF_RTYPE_TO_HOWTO(abfd, rtype, rela) \ (get_elf_backend_data (abfd)->elf_backend_mips_rtype_to_howto (rtype, rela)) /* Determine whether the internal relocation of index REL_IDX is REL (zero) or RELA (non-zero). The assumption is that, if there are two relocation sections for this section, one of them is REL and the other is RELA. If the index of the relocation we're testing is in range for the first relocation section, check that the external relocation size is that for RELA. It is also assumed that, if rel_idx is not in range for the first section, and this first section contains REL relocs, then the relocation is in the second section, that is RELA. */ #define MIPS_RELOC_RELA_P(abfd, sec, rel_idx) \ ((NUM_SHDR_ENTRIES (&elf_section_data (sec)->rel_hdr) \ * get_elf_backend_data (abfd)->s->int_rels_per_ext_rel \ > (bfd_vma)(rel_idx)) \ == (elf_section_data (sec)->rel_hdr.sh_entsize \ == (ABI_64_P (abfd) ? sizeof (Elf64_External_Rela) \ : sizeof (Elf32_External_Rela)))) /* The name of the dynamic relocation section. */ #define MIPS_ELF_REL_DYN_NAME(INFO) \ (mips_elf_hash_table (INFO)->is_vxworks ? ".rela.dyn" : ".rel.dyn") /* In case we're on a 32-bit machine, construct a 64-bit "-1" value from smaller values. Start with zero, widen, *then* decrement. */ #define MINUS_ONE (((bfd_vma)0) - 1) #define MINUS_TWO (((bfd_vma)0) - 2) /* The number of local .got entries we reserve. */ #define MIPS_RESERVED_GOTNO(INFO) \ (mips_elf_hash_table (INFO)->is_vxworks ? 3 : 2) /* The offset of $gp from the beginning of the .got section. */ #define ELF_MIPS_GP_OFFSET(INFO) \ (mips_elf_hash_table (INFO)->is_vxworks ? 0x0 : 0x7ff0) /* The maximum size of the GOT for it to be addressable using 16-bit offsets from $gp. */ #define MIPS_ELF_GOT_MAX_SIZE(INFO) (ELF_MIPS_GP_OFFSET (INFO) + 0x7fff) /* Instructions which appear in a stub. */ #define STUB_LW(abfd) \ ((ABI_64_P (abfd) \ ? 0xdf998010 /* ld t9,0x8010(gp) */ \ : 0x8f998010)) /* lw t9,0x8010(gp) */ #define STUB_MOVE(abfd) \ ((ABI_64_P (abfd) \ ? 0x03e0782d /* daddu t7,ra */ \ : 0x03e07821)) /* addu t7,ra */ #define STUB_LUI(VAL) (0x3c180000 + (VAL)) /* lui t8,VAL */ #define STUB_JALR 0x0320f809 /* jalr t9,ra */ #define STUB_ORI(VAL) (0x37180000 + (VAL)) /* ori t8,t8,VAL */ #define STUB_LI16U(VAL) (0x34180000 + (VAL)) /* ori t8,zero,VAL unsigned */ #define STUB_LI16S(abfd, VAL) \ ((ABI_64_P (abfd) \ ? (0x64180000 + (VAL)) /* daddiu t8,zero,VAL sign extended */ \ : (0x24180000 + (VAL)))) /* addiu t8,zero,VAL sign extended */ #define MIPS_FUNCTION_STUB_NORMAL_SIZE 16 #define MIPS_FUNCTION_STUB_BIG_SIZE 20 /* The name of the dynamic interpreter. This is put in the .interp section. */ #define ELF_DYNAMIC_INTERPRETER(abfd) \ (ABI_N32_P (abfd) ? "/usr/lib32/libc.so.1" \ : ABI_64_P (abfd) ? "/usr/lib64/libc.so.1" \ : "/usr/lib/libc.so.1") #ifdef BFD64 #define MNAME(bfd,pre,pos) \ (ABI_64_P (bfd) ? CONCAT4 (pre,64,_,pos) : CONCAT4 (pre,32,_,pos)) #define ELF_R_SYM(bfd, i) \ (ABI_64_P (bfd) ? ELF64_R_SYM (i) : ELF32_R_SYM (i)) #define ELF_R_TYPE(bfd, i) \ (ABI_64_P (bfd) ? ELF64_MIPS_R_TYPE (i) : ELF32_R_TYPE (i)) #define ELF_R_INFO(bfd, s, t) \ (ABI_64_P (bfd) ? ELF64_R_INFO (s, t) : ELF32_R_INFO (s, t)) #else #define MNAME(bfd,pre,pos) CONCAT4 (pre,32,_,pos) #define ELF_R_SYM(bfd, i) \ (ELF32_R_SYM (i)) #define ELF_R_TYPE(bfd, i) \ (ELF32_R_TYPE (i)) #define ELF_R_INFO(bfd, s, t) \ (ELF32_R_INFO (s, t)) #endif /* The mips16 compiler uses a couple of special sections to handle floating point arguments. Section names that look like .mips16.fn.FNNAME contain stubs that copy floating point arguments from the fp regs to the gp regs and then jump to FNNAME. If any 32 bit function calls FNNAME, the call should be redirected to the stub instead. If no 32 bit function calls FNNAME, the stub should be discarded. We need to consider any reference to the function, not just a call, because if the address of the function is taken we will need the stub, since the address might be passed to a 32 bit function. Section names that look like .mips16.call.FNNAME contain stubs that copy floating point arguments from the gp regs to the fp regs and then jump to FNNAME. If FNNAME is a 32 bit function, then any 16 bit function that calls FNNAME should be redirected to the stub instead. If FNNAME is not a 32 bit function, the stub should be discarded. .mips16.call.fp.FNNAME sections are similar, but contain stubs which call FNNAME and then copy the return value from the fp regs to the gp regs. These stubs store the return value in $18 while calling FNNAME; any function which might call one of these stubs must arrange to save $18 around the call. (This case is not needed for 32 bit functions that call 16 bit functions, because 16 bit functions always return floating point values in both $f0/$f1 and $2/$3.) Note that in all cases FNNAME might be defined statically. Therefore, FNNAME is not used literally. Instead, the relocation information will indicate which symbol the section is for. We record any stubs that we find in the symbol table. */ #define FN_STUB ".mips16.fn." #define CALL_STUB ".mips16.call." #define CALL_FP_STUB ".mips16.call.fp." #define FN_STUB_P(name) CONST_STRNEQ (name, FN_STUB) #define CALL_STUB_P(name) CONST_STRNEQ (name, CALL_STUB) #define CALL_FP_STUB_P(name) CONST_STRNEQ (name, CALL_FP_STUB) /* The format of the first PLT entry in a VxWorks executable. */ static const bfd_vma mips_vxworks_exec_plt0_entry[] = { 0x3c190000, /* lui t9, %hi(_GLOBAL_OFFSET_TABLE_) */ 0x27390000, /* addiu t9, t9, %lo(_GLOBAL_OFFSET_TABLE_) */ 0x8f390008, /* lw t9, 8(t9) */ 0x00000000, /* nop */ 0x03200008, /* jr t9 */ 0x00000000 /* nop */ }; /* The format of subsequent PLT entries. */ static const bfd_vma mips_vxworks_exec_plt_entry[] = { 0x10000000, /* b .PLT_resolver */ 0x24180000, /* li t8, */ 0x3c190000, /* lui t9, %hi(<.got.plt slot>) */ 0x27390000, /* addiu t9, t9, %lo(<.got.plt slot>) */ 0x8f390000, /* lw t9, 0(t9) */ 0x00000000, /* nop */ 0x03200008, /* jr t9 */ 0x00000000 /* nop */ }; /* The format of the first PLT entry in a VxWorks shared object. */ static const bfd_vma mips_vxworks_shared_plt0_entry[] = { 0x8f990008, /* lw t9, 8(gp) */ 0x00000000, /* nop */ 0x03200008, /* jr t9 */ 0x00000000, /* nop */ 0x00000000, /* nop */ 0x00000000 /* nop */ }; /* The format of subsequent PLT entries. */ static const bfd_vma mips_vxworks_shared_plt_entry[] = { 0x10000000, /* b .PLT_resolver */ 0x24180000 /* li t8, */ }; /* Look up an entry in a MIPS ELF linker hash table. */ #define mips_elf_link_hash_lookup(table, string, create, copy, follow) \ ((struct mips_elf_link_hash_entry *) \ elf_link_hash_lookup (&(table)->root, (string), (create), \ (copy), (follow))) /* Traverse a MIPS ELF linker hash table. */ #define mips_elf_link_hash_traverse(table, func, info) \ (elf_link_hash_traverse \ (&(table)->root, \ (bfd_boolean (*) (struct elf_link_hash_entry *, void *)) (func), \ (info))) /* Get the MIPS ELF linker hash table from a link_info structure. */ #define mips_elf_hash_table(p) \ ((struct mips_elf_link_hash_table *) ((p)->hash)) /* Find the base offsets for thread-local storage in this object, for GD/LD and IE/LE respectively. */ #define TP_OFFSET 0x7000 #define DTP_OFFSET 0x8000 static bfd_vma dtprel_base (struct bfd_link_info *info) { /* If tls_sec is NULL, we should have signalled an error already. */ if (elf_hash_table (info)->tls_sec == NULL) return 0; return elf_hash_table (info)->tls_sec->vma + DTP_OFFSET; } static bfd_vma tprel_base (struct bfd_link_info *info) { /* If tls_sec is NULL, we should have signalled an error already. */ if (elf_hash_table (info)->tls_sec == NULL) return 0; return elf_hash_table (info)->tls_sec->vma + TP_OFFSET; } /* Create an entry in a MIPS ELF linker hash table. */ static struct bfd_hash_entry * mips_elf_link_hash_newfunc (struct bfd_hash_entry *entry, struct bfd_hash_table *table, const char *string) { struct mips_elf_link_hash_entry *ret = (struct mips_elf_link_hash_entry *) entry; /* Allocate the structure if it has not already been allocated by a subclass. */ if (ret == NULL) ret = bfd_hash_allocate (table, sizeof (struct mips_elf_link_hash_entry)); if (ret == NULL) return (struct bfd_hash_entry *) ret; /* Call the allocation method of the superclass. */ ret = ((struct mips_elf_link_hash_entry *) _bfd_elf_link_hash_newfunc ((struct bfd_hash_entry *) ret, table, string)); if (ret != NULL) { /* Set local fields. */ memset (&ret->esym, 0, sizeof (EXTR)); /* We use -2 as a marker to indicate that the information has not been set. -1 means there is no associated ifd. */ ret->esym.ifd = -2; ret->possibly_dynamic_relocs = 0; ret->readonly_reloc = FALSE; ret->no_fn_stub = FALSE; ret->fn_stub = NULL; ret->need_fn_stub = FALSE; ret->call_stub = NULL; ret->call_fp_stub = NULL; ret->forced_local = FALSE; ret->is_branch_target = FALSE; ret->is_relocation_target = FALSE; ret->tls_type = GOT_NORMAL; } return (struct bfd_hash_entry *) ret; } bfd_boolean _bfd_mips_elf_new_section_hook (bfd *abfd, asection *sec) { if (!sec->used_by_bfd) { struct _mips_elf_section_data *sdata; bfd_size_type amt = sizeof (*sdata); sdata = bfd_zalloc (abfd, amt); if (sdata == NULL) return FALSE; sec->used_by_bfd = sdata; } return _bfd_elf_new_section_hook (abfd, sec); } /* Read ECOFF debugging information from a .mdebug section into a ecoff_debug_info structure. */ bfd_boolean _bfd_mips_elf_read_ecoff_info (bfd *abfd, asection *section, struct ecoff_debug_info *debug) { HDRR *symhdr; const struct ecoff_debug_swap *swap; char *ext_hdr; swap = get_elf_backend_data (abfd)->elf_backend_ecoff_debug_swap; memset (debug, 0, sizeof (*debug)); ext_hdr = bfd_malloc (swap->external_hdr_size); if (ext_hdr == NULL && swap->external_hdr_size != 0) goto error_return; if (! bfd_get_section_contents (abfd, section, ext_hdr, 0, swap->external_hdr_size)) goto error_return; symhdr = &debug->symbolic_header; (*swap->swap_hdr_in) (abfd, ext_hdr, symhdr); /* The symbolic header contains absolute file offsets and sizes to read. */ #define READ(ptr, offset, count, size, type) \ if (symhdr->count == 0) \ debug->ptr = NULL; \ else \ { \ bfd_size_type amt = (bfd_size_type) size * symhdr->count; \ debug->ptr = bfd_malloc (amt); \ if (debug->ptr == NULL) \ goto error_return; \ if (bfd_seek (abfd, symhdr->offset, SEEK_SET) != 0 \ || bfd_bread (debug->ptr, amt, abfd) != amt) \ goto error_return; \ } READ (line, cbLineOffset, cbLine, sizeof (unsigned char), unsigned char *); READ (external_dnr, cbDnOffset, idnMax, swap->external_dnr_size, void *); READ (external_pdr, cbPdOffset, ipdMax, swap->external_pdr_size, void *); READ (external_sym, cbSymOffset, isymMax, swap->external_sym_size, void *); READ (external_opt, cbOptOffset, ioptMax, swap->external_opt_size, void *); READ (external_aux, cbAuxOffset, iauxMax, sizeof (union aux_ext), union aux_ext *); READ (ss, cbSsOffset, issMax, sizeof (char), char *); READ (ssext, cbSsExtOffset, issExtMax, sizeof (char), char *); READ (external_fdr, cbFdOffset, ifdMax, swap->external_fdr_size, void *); READ (external_rfd, cbRfdOffset, crfd, swap->external_rfd_size, void *); READ (external_ext, cbExtOffset, iextMax, swap->external_ext_size, void *); #undef READ debug->fdr = NULL; return TRUE; error_return: if (ext_hdr != NULL) free (ext_hdr); if (debug->line != NULL) free (debug->line); if (debug->external_dnr != NULL) free (debug->external_dnr); if (debug->external_pdr != NULL) free (debug->external_pdr); if (debug->external_sym != NULL) free (debug->external_sym); if (debug->external_opt != NULL) free (debug->external_opt); if (debug->external_aux != NULL) free (debug->external_aux); if (debug->ss != NULL) free (debug->ss); if (debug->ssext != NULL) free (debug->ssext); if (debug->external_fdr != NULL) free (debug->external_fdr); if (debug->external_rfd != NULL) free (debug->external_rfd); if (debug->external_ext != NULL) free (debug->external_ext); return FALSE; } /* Swap RPDR (runtime procedure table entry) for output. */ static void ecoff_swap_rpdr_out (bfd *abfd, const RPDR *in, struct rpdr_ext *ex) { H_PUT_S32 (abfd, in->adr, ex->p_adr); H_PUT_32 (abfd, in->regmask, ex->p_regmask); H_PUT_32 (abfd, in->regoffset, ex->p_regoffset); H_PUT_32 (abfd, in->fregmask, ex->p_fregmask); H_PUT_32 (abfd, in->fregoffset, ex->p_fregoffset); H_PUT_32 (abfd, in->frameoffset, ex->p_frameoffset); H_PUT_16 (abfd, in->framereg, ex->p_framereg); H_PUT_16 (abfd, in->pcreg, ex->p_pcreg); H_PUT_32 (abfd, in->irpss, ex->p_irpss); } /* Create a runtime procedure table from the .mdebug section. */ static bfd_boolean mips_elf_create_procedure_table (void *handle, bfd *abfd, struct bfd_link_info *info, asection *s, struct ecoff_debug_info *debug) { const struct ecoff_debug_swap *swap; HDRR *hdr = &debug->symbolic_header; RPDR *rpdr, *rp; struct rpdr_ext *erp; void *rtproc; struct pdr_ext *epdr; struct sym_ext *esym; char *ss, **sv; char *str; bfd_size_type size; bfd_size_type count; unsigned long sindex; unsigned long i; PDR pdr; SYMR sym; const char *no_name_func = _("static procedure (no name)"); epdr = NULL; rpdr = NULL; esym = NULL; ss = NULL; sv = NULL; swap = get_elf_backend_data (abfd)->elf_backend_ecoff_debug_swap; sindex = strlen (no_name_func) + 1; count = hdr->ipdMax; if (count > 0) { size = swap->external_pdr_size; epdr = bfd_malloc (size * count); if (epdr == NULL) goto error_return; if (! _bfd_ecoff_get_accumulated_pdr (handle, (bfd_byte *) epdr)) goto error_return; size = sizeof (RPDR); rp = rpdr = bfd_malloc (size * count); if (rpdr == NULL) goto error_return; size = sizeof (char *); sv = bfd_malloc (size * count); if (sv == NULL) goto error_return; count = hdr->isymMax; size = swap->external_sym_size; esym = bfd_malloc (size * count); if (esym == NULL) goto error_return; if (! _bfd_ecoff_get_accumulated_sym (handle, (bfd_byte *) esym)) goto error_return; count = hdr->issMax; ss = bfd_malloc (count); if (ss == NULL) goto error_return; if (! _bfd_ecoff_get_accumulated_ss (handle, (bfd_byte *) ss)) goto error_return; count = hdr->ipdMax; for (i = 0; i < (unsigned long) count; i++, rp++) { (*swap->swap_pdr_in) (abfd, epdr + i, &pdr); (*swap->swap_sym_in) (abfd, &esym[pdr.isym], &sym); rp->adr = sym.value; rp->regmask = pdr.regmask; rp->regoffset = pdr.regoffset; rp->fregmask = pdr.fregmask; rp->fregoffset = pdr.fregoffset; rp->frameoffset = pdr.frameoffset; rp->framereg = pdr.framereg; rp->pcreg = pdr.pcreg; rp->irpss = sindex; sv[i] = ss + sym.iss; sindex += strlen (sv[i]) + 1; } } size = sizeof (struct rpdr_ext) * (count + 2) + sindex; size = BFD_ALIGN (size, 16); rtproc = bfd_alloc (abfd, size); if (rtproc == NULL) { mips_elf_hash_table (info)->procedure_count = 0; goto error_return; } mips_elf_hash_table (info)->procedure_count = count + 2; erp = rtproc; memset (erp, 0, sizeof (struct rpdr_ext)); erp++; str = (char *) rtproc + sizeof (struct rpdr_ext) * (count + 2); strcpy (str, no_name_func); str += strlen (no_name_func) + 1; for (i = 0; i < count; i++) { ecoff_swap_rpdr_out (abfd, rpdr + i, erp + i); strcpy (str, sv[i]); str += strlen (sv[i]) + 1; } H_PUT_S32 (abfd, -1, (erp + count)->p_adr); /* Set the size and contents of .rtproc section. */ s->size = size; s->contents = rtproc; /* Skip this section later on (I don't think this currently matters, but someday it might). */ s->map_head.link_order = NULL; if (epdr != NULL) free (epdr); if (rpdr != NULL) free (rpdr); if (esym != NULL) free (esym); if (ss != NULL) free (ss); if (sv != NULL) free (sv); return TRUE; error_return: if (epdr != NULL) free (epdr); if (rpdr != NULL) free (rpdr); if (esym != NULL) free (esym); if (ss != NULL) free (ss); if (sv != NULL) free (sv); return FALSE; } /* Check the mips16 stubs for a particular symbol, and see if we can discard them. */ static bfd_boolean mips_elf_check_mips16_stubs (struct mips_elf_link_hash_entry *h, void *data ATTRIBUTE_UNUSED) { if (h->root.root.type == bfd_link_hash_warning) h = (struct mips_elf_link_hash_entry *) h->root.root.u.i.link; if (h->fn_stub != NULL && ! h->need_fn_stub) { /* We don't need the fn_stub; the only references to this symbol are 16 bit calls. Clobber the size to 0 to prevent it from being included in the link. */ h->fn_stub->size = 0; h->fn_stub->flags &= ~SEC_RELOC; h->fn_stub->reloc_count = 0; h->fn_stub->flags |= SEC_EXCLUDE; } if (h->call_stub != NULL && h->root.other == STO_MIPS16) { /* We don't need the call_stub; this is a 16 bit function, so calls from other 16 bit functions are OK. Clobber the size to 0 to prevent it from being included in the link. */ h->call_stub->size = 0; h->call_stub->flags &= ~SEC_RELOC; h->call_stub->reloc_count = 0; h->call_stub->flags |= SEC_EXCLUDE; } if (h->call_fp_stub != NULL && h->root.other == STO_MIPS16) { /* We don't need the call_stub; this is a 16 bit function, so calls from other 16 bit functions are OK. Clobber the size to 0 to prevent it from being included in the link. */ h->call_fp_stub->size = 0; h->call_fp_stub->flags &= ~SEC_RELOC; h->call_fp_stub->reloc_count = 0; h->call_fp_stub->flags |= SEC_EXCLUDE; } return TRUE; } /* R_MIPS16_26 is used for the mips16 jal and jalx instructions. Most mips16 instructions are 16 bits, but these instructions are 32 bits. The format of these instructions is: +--------------+--------------------------------+ | JALX | X| Imm 20:16 | Imm 25:21 | +--------------+--------------------------------+ | Immediate 15:0 | +-----------------------------------------------+ JALX is the 5-bit value 00011. X is 0 for jal, 1 for jalx. Note that the immediate value in the first word is swapped. When producing a relocatable object file, R_MIPS16_26 is handled mostly like R_MIPS_26. In particular, the addend is stored as a straight 26-bit value in a 32-bit instruction. (gas makes life simpler for itself by never adjusting a R_MIPS16_26 reloc to be against a section, so the addend is always zero). However, the 32 bit instruction is stored as 2 16-bit values, rather than a single 32-bit value. In a big-endian file, the result is the same; in a little-endian file, the two 16-bit halves of the 32 bit value are swapped. This is so that a disassembler can recognize the jal instruction. When doing a final link, R_MIPS16_26 is treated as a 32 bit instruction stored as two 16-bit values. The addend A is the contents of the targ26 field. The calculation is the same as R_MIPS_26. When storing the calculated value, reorder the immediate value as shown above, and don't forget to store the value as two 16-bit values. To put it in MIPS ABI terms, the relocation field is T-targ26-16, defined as big-endian: +--------+----------------------+ | | | | | targ26-16 | |31 26|25 0| +--------+----------------------+ little-endian: +----------+------+-------------+ | | | | | sub1 | | sub2 | |0 9|10 15|16 31| +----------+--------------------+ where targ26-16 is sub1 followed by sub2 (i.e., the addend field A is ((sub1 << 16) | sub2)). When producing a relocatable object file, the calculation is (((A < 2) | ((P + 4) & 0xf0000000) + S) >> 2) When producing a fully linked file, the calculation is let R = (((A < 2) | ((P + 4) & 0xf0000000) + S) >> 2) ((R & 0x1f0000) << 5) | ((R & 0x3e00000) >> 5) | (R & 0xffff) R_MIPS16_GPREL is used for GP-relative addressing in mips16 mode. A typical instruction will have a format like this: +--------------+--------------------------------+ | EXTEND | Imm 10:5 | Imm 15:11 | +--------------+--------------------------------+ | Major | rx | ry | Imm 4:0 | +--------------+--------------------------------+ EXTEND is the five bit value 11110. Major is the instruction opcode. This is handled exactly like R_MIPS_GPREL16, except that the addend is retrieved and stored as shown in this diagram; that is, the Imm fields above replace the V-rel16 field. All we need to do here is shuffle the bits appropriately. As above, the two 16-bit halves must be swapped on a little-endian system. R_MIPS16_HI16 and R_MIPS16_LO16 are used in mips16 mode to access data when neither GP-relative nor PC-relative addressing can be used. They are handled like R_MIPS_HI16 and R_MIPS_LO16, except that the addend is retrieved and stored as shown above for R_MIPS16_GPREL. */ void _bfd_mips16_elf_reloc_unshuffle (bfd *abfd, int r_type, bfd_boolean jal_shuffle, bfd_byte *data) { bfd_vma extend, insn, val; if (r_type != R_MIPS16_26 && r_type != R_MIPS16_GPREL && r_type != R_MIPS16_HI16 && r_type != R_MIPS16_LO16) return; /* Pick up the mips16 extend instruction and the real instruction. */ extend = bfd_get_16 (abfd, data); insn = bfd_get_16 (abfd, data + 2); if (r_type == R_MIPS16_26) { if (jal_shuffle) val = ((extend & 0xfc00) << 16) | ((extend & 0x3e0) << 11) | ((extend & 0x1f) << 21) | insn; else val = extend << 16 | insn; } else val = ((extend & 0xf800) << 16) | ((insn & 0xffe0) << 11) | ((extend & 0x1f) << 11) | (extend & 0x7e0) | (insn & 0x1f); bfd_put_32 (abfd, val, data); } void _bfd_mips16_elf_reloc_shuffle (bfd *abfd, int r_type, bfd_boolean jal_shuffle, bfd_byte *data) { bfd_vma extend, insn, val; if (r_type != R_MIPS16_26 && r_type != R_MIPS16_GPREL && r_type != R_MIPS16_HI16 && r_type != R_MIPS16_LO16) return; val = bfd_get_32 (abfd, data); if (r_type == R_MIPS16_26) { if (jal_shuffle) { insn = val & 0xffff; extend = ((val >> 16) & 0xfc00) | ((val >> 11) & 0x3e0) | ((val >> 21) & 0x1f); } else { insn = val & 0xffff; extend = val >> 16; } } else { insn = ((val >> 11) & 0xffe0) | (val & 0x1f); extend = ((val >> 16) & 0xf800) | ((val >> 11) & 0x1f) | (val & 0x7e0); } bfd_put_16 (abfd, insn, data + 2); bfd_put_16 (abfd, extend, data); } bfd_reloc_status_type _bfd_mips_elf_gprel16_with_gp (bfd *abfd, asymbol *symbol, arelent *reloc_entry, asection *input_section, bfd_boolean relocatable, void *data, bfd_vma gp) { bfd_vma relocation; bfd_signed_vma val; bfd_reloc_status_type status; if (bfd_is_com_section (symbol->section)) relocation = 0; else relocation = symbol->value; relocation += symbol->section->output_section->vma; relocation += symbol->section->output_offset; if (reloc_entry->address > bfd_get_section_limit (abfd, input_section)) return bfd_reloc_outofrange; /* Set val to the offset into the section or symbol. */ val = reloc_entry->addend; _bfd_mips_elf_sign_extend (val, 16); /* Adjust val for the final section location and GP value. If we are producing relocatable output, we don't want to do this for an external symbol. */ if (! relocatable || (symbol->flags & BSF_SECTION_SYM) != 0) val += relocation - gp; if (reloc_entry->howto->partial_inplace) { status = _bfd_relocate_contents (reloc_entry->howto, abfd, val, (bfd_byte *) data + reloc_entry->address); if (status != bfd_reloc_ok) return status; } else reloc_entry->addend = val; if (relocatable) reloc_entry->address += input_section->output_offset; return bfd_reloc_ok; } /* Used to store a REL high-part relocation such as R_MIPS_HI16 or R_MIPS_GOT16. REL is the relocation, INPUT_SECTION is the section that contains the relocation field and DATA points to the start of INPUT_SECTION. */ struct mips_hi16 { struct mips_hi16 *next; bfd_byte *data; asection *input_section; arelent rel; }; /* FIXME: This should not be a static variable. */ static struct mips_hi16 *mips_hi16_list; /* A howto special_function for REL *HI16 relocations. We can only calculate the correct value once we've seen the partnering *LO16 relocation, so just save the information for later. The ABI requires that the *LO16 immediately follow the *HI16. However, as a GNU extension, we permit an arbitrary number of *HI16s to be associated with a single *LO16. This significantly simplies the relocation handling in gcc. */ bfd_reloc_status_type _bfd_mips_elf_hi16_reloc (bfd *abfd ATTRIBUTE_UNUSED, arelent *reloc_entry, asymbol *symbol ATTRIBUTE_UNUSED, void *data, asection *input_section, bfd *output_bfd, char **error_message ATTRIBUTE_UNUSED) { struct mips_hi16 *n; if (reloc_entry->address > bfd_get_section_limit (abfd, input_section)) return bfd_reloc_outofrange; n = bfd_malloc (sizeof *n); if (n == NULL) return bfd_reloc_outofrange; n->next = mips_hi16_list; n->data = data; n->input_section = input_section; n->rel = *reloc_entry; mips_hi16_list = n; if (output_bfd != NULL) reloc_entry->address += input_section->output_offset; return bfd_reloc_ok; } /* A howto special_function for REL R_MIPS_GOT16 relocations. This is just like any other 16-bit relocation when applied to global symbols, but is treated in the same as R_MIPS_HI16 when applied to local symbols. */ bfd_reloc_status_type _bfd_mips_elf_got16_reloc (bfd *abfd, arelent *reloc_entry, asymbol *symbol, void *data, asection *input_section, bfd *output_bfd, char **error_message) { if ((symbol->flags & (BSF_GLOBAL | BSF_WEAK)) != 0 || bfd_is_und_section (bfd_get_section (symbol)) || bfd_is_com_section (bfd_get_section (symbol))) /* The relocation is against a global symbol. */ return _bfd_mips_elf_generic_reloc (abfd, reloc_entry, symbol, data, input_section, output_bfd, error_message); return _bfd_mips_elf_hi16_reloc (abfd, reloc_entry, symbol, data, input_section, output_bfd, error_message); } /* A howto special_function for REL *LO16 relocations. The *LO16 itself is a straightforward 16 bit inplace relocation, but we must deal with any partnering high-part relocations as well. */ bfd_reloc_status_type _bfd_mips_elf_lo16_reloc (bfd *abfd, arelent *reloc_entry, asymbol *symbol, void *data, asection *input_section, bfd *output_bfd, char **error_message) { bfd_vma vallo; bfd_byte *location = (bfd_byte *) data + reloc_entry->address; if (reloc_entry->address > bfd_get_section_limit (abfd, input_section)) return bfd_reloc_outofrange; _bfd_mips16_elf_reloc_unshuffle (abfd, reloc_entry->howto->type, FALSE, location); vallo = bfd_get_32 (abfd, location); _bfd_mips16_elf_reloc_shuffle (abfd, reloc_entry->howto->type, FALSE, location); while (mips_hi16_list != NULL) { bfd_reloc_status_type ret; struct mips_hi16 *hi; hi = mips_hi16_list; /* R_MIPS_GOT16 relocations are something of a special case. We want to install the addend in the same way as for a R_MIPS_HI16 relocation (with a rightshift of 16). However, since GOT16 relocations can also be used with global symbols, their howto has a rightshift of 0. */ if (hi->rel.howto->type == R_MIPS_GOT16) hi->rel.howto = MIPS_ELF_RTYPE_TO_HOWTO (abfd, R_MIPS_HI16, FALSE); /* VALLO is a signed 16-bit number. Bias it by 0x8000 so that any carry or borrow will induce a change of +1 or -1 in the high part. */ hi->rel.addend += (vallo + 0x8000) & 0xffff; ret = _bfd_mips_elf_generic_reloc (abfd, &hi->rel, symbol, hi->data, hi->input_section, output_bfd, error_message); if (ret != bfd_reloc_ok) return ret; mips_hi16_list = hi->next; free (hi); } return _bfd_mips_elf_generic_reloc (abfd, reloc_entry, symbol, data, input_section, output_bfd, error_message); } /* A generic howto special_function. This calculates and installs the relocation itself, thus avoiding the oft-discussed problems in bfd_perform_relocation and bfd_install_relocation. */ bfd_reloc_status_type _bfd_mips_elf_generic_reloc (bfd *abfd ATTRIBUTE_UNUSED, arelent *reloc_entry, asymbol *symbol, void *data ATTRIBUTE_UNUSED, asection *input_section, bfd *output_bfd, char **error_message ATTRIBUTE_UNUSED) { bfd_signed_vma val; bfd_reloc_status_type status; bfd_boolean relocatable; relocatable = (output_bfd != NULL); if (reloc_entry->address > bfd_get_section_limit (abfd, input_section)) return bfd_reloc_outofrange; /* Build up the field adjustment in VAL. */ val = 0; if (!relocatable || (symbol->flags & BSF_SECTION_SYM) != 0) { /* Either we're calculating the final field value or we have a relocation against a section symbol. Add in the section's offset or address. */ val += symbol->section->output_section->vma; val += symbol->section->output_offset; } if (!relocatable) { /* We're calculating the final field value. Add in the symbol's value and, if pc-relative, subtract the address of the field itself. */ val += symbol->value; if (reloc_entry->howto->pc_relative) { val -= input_section->output_section->vma; val -= input_section->output_offset; val -= reloc_entry->address; } } /* VAL is now the final adjustment. If we're keeping this relocation in the output file, and if the relocation uses a separate addend, we just need to add VAL to that addend. Otherwise we need to add VAL to the relocation field itself. */ if (relocatable && !reloc_entry->howto->partial_inplace) reloc_entry->addend += val; else { bfd_byte *location = (bfd_byte *) data + reloc_entry->address; /* Add in the separate addend, if any. */ val += reloc_entry->addend; /* Add VAL to the relocation field. */ _bfd_mips16_elf_reloc_unshuffle (abfd, reloc_entry->howto->type, FALSE, location); status = _bfd_relocate_contents (reloc_entry->howto, abfd, val, location); _bfd_mips16_elf_reloc_shuffle (abfd, reloc_entry->howto->type, FALSE, location); if (status != bfd_reloc_ok) return status; } if (relocatable) reloc_entry->address += input_section->output_offset; return bfd_reloc_ok; } /* Swap an entry in a .gptab section. Note that these routines rely on the equivalence of the two elements of the union. */ static void bfd_mips_elf32_swap_gptab_in (bfd *abfd, const Elf32_External_gptab *ex, Elf32_gptab *in) { in->gt_entry.gt_g_value = H_GET_32 (abfd, ex->gt_entry.gt_g_value); in->gt_entry.gt_bytes = H_GET_32 (abfd, ex->gt_entry.gt_bytes); } static void bfd_mips_elf32_swap_gptab_out (bfd *abfd, const Elf32_gptab *in, Elf32_External_gptab *ex) { H_PUT_32 (abfd, in->gt_entry.gt_g_value, ex->gt_entry.gt_g_value); H_PUT_32 (abfd, in->gt_entry.gt_bytes, ex->gt_entry.gt_bytes); } static void bfd_elf32_swap_compact_rel_out (bfd *abfd, const Elf32_compact_rel *in, Elf32_External_compact_rel *ex) { H_PUT_32 (abfd, in->id1, ex->id1); H_PUT_32 (abfd, in->num, ex->num); H_PUT_32 (abfd, in->id2, ex->id2); H_PUT_32 (abfd, in->offset, ex->offset); H_PUT_32 (abfd, in->reserved0, ex->reserved0); H_PUT_32 (abfd, in->reserved1, ex->reserved1); } static void bfd_elf32_swap_crinfo_out (bfd *abfd, const Elf32_crinfo *in, Elf32_External_crinfo *ex) { unsigned long l; l = (((in->ctype & CRINFO_CTYPE) << CRINFO_CTYPE_SH) | ((in->rtype & CRINFO_RTYPE) << CRINFO_RTYPE_SH) | ((in->dist2to & CRINFO_DIST2TO) << CRINFO_DIST2TO_SH) | ((in->relvaddr & CRINFO_RELVADDR) << CRINFO_RELVADDR_SH)); H_PUT_32 (abfd, l, ex->info); H_PUT_32 (abfd, in->konst, ex->konst); H_PUT_32 (abfd, in->vaddr, ex->vaddr); } /* A .reginfo section holds a single Elf32_RegInfo structure. These routines swap this structure in and out. They are used outside of BFD, so they are globally visible. */ void bfd_mips_elf32_swap_reginfo_in (bfd *abfd, const Elf32_External_RegInfo *ex, Elf32_RegInfo *in) { in->ri_gprmask = H_GET_32 (abfd, ex->ri_gprmask); in->ri_cprmask[0] = H_GET_32 (abfd, ex->ri_cprmask[0]); in->ri_cprmask[1] = H_GET_32 (abfd, ex->ri_cprmask[1]); in->ri_cprmask[2] = H_GET_32 (abfd, ex->ri_cprmask[2]); in->ri_cprmask[3] = H_GET_32 (abfd, ex->ri_cprmask[3]); in->ri_gp_value = H_GET_32 (abfd, ex->ri_gp_value); } void bfd_mips_elf32_swap_reginfo_out (bfd *abfd, const Elf32_RegInfo *in, Elf32_External_RegInfo *ex) { H_PUT_32 (abfd, in->ri_gprmask, ex->ri_gprmask); H_PUT_32 (abfd, in->ri_cprmask[0], ex->ri_cprmask[0]); H_PUT_32 (abfd, in->ri_cprmask[1], ex->ri_cprmask[1]); H_PUT_32 (abfd, in->ri_cprmask[2], ex->ri_cprmask[2]); H_PUT_32 (abfd, in->ri_cprmask[3], ex->ri_cprmask[3]); H_PUT_32 (abfd, in->ri_gp_value, ex->ri_gp_value); } /* In the 64 bit ABI, the .MIPS.options section holds register information in an Elf64_Reginfo structure. These routines swap them in and out. They are globally visible because they are used outside of BFD. These routines are here so that gas can call them without worrying about whether the 64 bit ABI has been included. */ void bfd_mips_elf64_swap_reginfo_in (bfd *abfd, const Elf64_External_RegInfo *ex, Elf64_Internal_RegInfo *in) { in->ri_gprmask = H_GET_32 (abfd, ex->ri_gprmask); in->ri_pad = H_GET_32 (abfd, ex->ri_pad); in->ri_cprmask[0] = H_GET_32 (abfd, ex->ri_cprmask[0]); in->ri_cprmask[1] = H_GET_32 (abfd, ex->ri_cprmask[1]); in->ri_cprmask[2] = H_GET_32 (abfd, ex->ri_cprmask[2]); in->ri_cprmask[3] = H_GET_32 (abfd, ex->ri_cprmask[3]); in->ri_gp_value = H_GET_64 (abfd, ex->ri_gp_value); } void bfd_mips_elf64_swap_reginfo_out (bfd *abfd, const Elf64_Internal_RegInfo *in, Elf64_External_RegInfo *ex) { H_PUT_32 (abfd, in->ri_gprmask, ex->ri_gprmask); H_PUT_32 (abfd, in->ri_pad, ex->ri_pad); H_PUT_32 (abfd, in->ri_cprmask[0], ex->ri_cprmask[0]); H_PUT_32 (abfd, in->ri_cprmask[1], ex->ri_cprmask[1]); H_PUT_32 (abfd, in->ri_cprmask[2], ex->ri_cprmask[2]); H_PUT_32 (abfd, in->ri_cprmask[3], ex->ri_cprmask[3]); H_PUT_64 (abfd, in->ri_gp_value, ex->ri_gp_value); } /* Swap in an options header. */ void bfd_mips_elf_swap_options_in (bfd *abfd, const Elf_External_Options *ex, Elf_Internal_Options *in) { in->kind = H_GET_8 (abfd, ex->kind); in->size = H_GET_8 (abfd, ex->size); in->section = H_GET_16 (abfd, ex->section); in->info = H_GET_32 (abfd, ex->info); } /* Swap out an options header. */ void bfd_mips_elf_swap_options_out (bfd *abfd, const Elf_Internal_Options *in, Elf_External_Options *ex) { H_PUT_8 (abfd, in->kind, ex->kind); H_PUT_8 (abfd, in->size, ex->size); H_PUT_16 (abfd, in->section, ex->section); H_PUT_32 (abfd, in->info, ex->info); } /* This function is called via qsort() to sort the dynamic relocation entries by increasing r_symndx value. */ static int sort_dynamic_relocs (const void *arg1, const void *arg2) { Elf_Internal_Rela int_reloc1; Elf_Internal_Rela int_reloc2; int diff; bfd_elf32_swap_reloc_in (reldyn_sorting_bfd, arg1, &int_reloc1); bfd_elf32_swap_reloc_in (reldyn_sorting_bfd, arg2, &int_reloc2); diff = ELF32_R_SYM (int_reloc1.r_info) - ELF32_R_SYM (int_reloc2.r_info); if (diff != 0) return diff; if (int_reloc1.r_offset < int_reloc2.r_offset) return -1; if (int_reloc1.r_offset > int_reloc2.r_offset) return 1; return 0; } /* Like sort_dynamic_relocs, but used for elf64 relocations. */ static int sort_dynamic_relocs_64 (const void *arg1 ATTRIBUTE_UNUSED, const void *arg2 ATTRIBUTE_UNUSED) { #ifdef BFD64 Elf_Internal_Rela int_reloc1[3]; Elf_Internal_Rela int_reloc2[3]; (*get_elf_backend_data (reldyn_sorting_bfd)->s->swap_reloc_in) (reldyn_sorting_bfd, arg1, int_reloc1); (*get_elf_backend_data (reldyn_sorting_bfd)->s->swap_reloc_in) (reldyn_sorting_bfd, arg2, int_reloc2); if (ELF64_R_SYM (int_reloc1[0].r_info) < ELF64_R_SYM (int_reloc2[0].r_info)) return -1; if (ELF64_R_SYM (int_reloc1[0].r_info) > ELF64_R_SYM (int_reloc2[0].r_info)) return 1; if (int_reloc1[0].r_offset < int_reloc2[0].r_offset) return -1; if (int_reloc1[0].r_offset > int_reloc2[0].r_offset) return 1; return 0; #else abort (); #endif } /* This routine is used to write out ECOFF debugging external symbol information. It is called via mips_elf_link_hash_traverse. The ECOFF external symbol information must match the ELF external symbol information. Unfortunately, at this point we don't know whether a symbol is required by reloc information, so the two tables may wind up being different. We must sort out the external symbol information before we can set the final size of the .mdebug section, and we must set the size of the .mdebug section before we can relocate any sections, and we can't know which symbols are required by relocation until we relocate the sections. Fortunately, it is relatively unlikely that any symbol will be stripped but required by a reloc. In particular, it can not happen when generating a final executable. */ static bfd_boolean mips_elf_output_extsym (struct mips_elf_link_hash_entry *h, void *data) { struct extsym_info *einfo = data; bfd_boolean strip; asection *sec, *output_section; if (h->root.root.type == bfd_link_hash_warning) h = (struct mips_elf_link_hash_entry *) h->root.root.u.i.link; if (h->root.indx == -2) strip = FALSE; else if ((h->root.def_dynamic || h->root.ref_dynamic || h->root.type == bfd_link_hash_new) && !h->root.def_regular && !h->root.ref_regular) strip = TRUE; else if (einfo->info->strip == strip_all || (einfo->info->strip == strip_some && bfd_hash_lookup (einfo->info->keep_hash, h->root.root.root.string, FALSE, FALSE) == NULL)) strip = TRUE; else strip = FALSE; if (strip) return TRUE; if (h->esym.ifd == -2) { h->esym.jmptbl = 0; h->esym.cobol_main = 0; h->esym.weakext = 0; h->esym.reserved = 0; h->esym.ifd = ifdNil; h->esym.asym.value = 0; h->esym.asym.st = stGlobal; if (h->root.root.type == bfd_link_hash_undefined || h->root.root.type == bfd_link_hash_undefweak) { const char *name; /* Use undefined class. Also, set class and type for some special symbols. */ name = h->root.root.root.string; if (strcmp (name, mips_elf_dynsym_rtproc_names[0]) == 0 || strcmp (name, mips_elf_dynsym_rtproc_names[1]) == 0) { h->esym.asym.sc = scData; h->esym.asym.st = stLabel; h->esym.asym.value = 0; } else if (strcmp (name, mips_elf_dynsym_rtproc_names[2]) == 0) { h->esym.asym.sc = scAbs; h->esym.asym.st = stLabel; h->esym.asym.value = mips_elf_hash_table (einfo->info)->procedure_count; } else if (strcmp (name, "_gp_disp") == 0 && ! NEWABI_P (einfo->abfd)) { h->esym.asym.sc = scAbs; h->esym.asym.st = stLabel; h->esym.asym.value = elf_gp (einfo->abfd); } else h->esym.asym.sc = scUndefined; } else if (h->root.root.type != bfd_link_hash_defined && h->root.root.type != bfd_link_hash_defweak) h->esym.asym.sc = scAbs; else { const char *name; sec = h->root.root.u.def.section; output_section = sec->output_section; /* When making a shared library and symbol h is the one from the another shared library, OUTPUT_SECTION may be null. */ if (output_section == NULL) h->esym.asym.sc = scUndefined; else { name = bfd_section_name (output_section->owner, output_section); if (strcmp (name, ".text") == 0) h->esym.asym.sc = scText; else if (strcmp (name, ".data") == 0) h->esym.asym.sc = scData; else if (strcmp (name, ".sdata") == 0) h->esym.asym.sc = scSData; else if (strcmp (name, ".rodata") == 0 || strcmp (name, ".rdata") == 0) h->esym.asym.sc = scRData; else if (strcmp (name, ".bss") == 0) h->esym.asym.sc = scBss; else if (strcmp (name, ".sbss") == 0) h->esym.asym.sc = scSBss; else if (strcmp (name, ".init") == 0) h->esym.asym.sc = scInit; else if (strcmp (name, ".fini") == 0) h->esym.asym.sc = scFini; else h->esym.asym.sc = scAbs; } } h->esym.asym.reserved = 0; h->esym.asym.index = indexNil; } if (h->root.root.type == bfd_link_hash_common) h->esym.asym.value = h->root.root.u.c.size; else if (h->root.root.type == bfd_link_hash_defined || h->root.root.type == bfd_link_hash_defweak) { if (h->esym.asym.sc == scCommon) h->esym.asym.sc = scBss; else if (h->esym.asym.sc == scSCommon) h->esym.asym.sc = scSBss; sec = h->root.root.u.def.section; output_section = sec->output_section; if (output_section != NULL) h->esym.asym.value = (h->root.root.u.def.value + sec->output_offset + output_section->vma); else h->esym.asym.value = 0; } else if (h->root.needs_plt) { struct mips_elf_link_hash_entry *hd = h; bfd_boolean no_fn_stub = h->no_fn_stub; while (hd->root.root.type == bfd_link_hash_indirect) { hd = (struct mips_elf_link_hash_entry *)h->root.root.u.i.link; no_fn_stub = no_fn_stub || hd->no_fn_stub; } if (!no_fn_stub) { /* Set type and value for a symbol with a function stub. */ h->esym.asym.st = stProc; sec = hd->root.root.u.def.section; if (sec == NULL) h->esym.asym.value = 0; else { output_section = sec->output_section; if (output_section != NULL) h->esym.asym.value = (hd->root.plt.offset + sec->output_offset + output_section->vma); else h->esym.asym.value = 0; } } } if (! bfd_ecoff_debug_one_external (einfo->abfd, einfo->debug, einfo->swap, h->root.root.root.string, &h->esym)) { einfo->failed = TRUE; return FALSE; } return TRUE; } /* A comparison routine used to sort .gptab entries. */ static int gptab_compare (const void *p1, const void *p2) { const Elf32_gptab *a1 = p1; const Elf32_gptab *a2 = p2; return a1->gt_entry.gt_g_value - a2->gt_entry.gt_g_value; } /* Functions to manage the got entry hash table. */ /* Use all 64 bits of a bfd_vma for the computation of a 32-bit hash number. */ static INLINE hashval_t mips_elf_hash_bfd_vma (bfd_vma addr) { #ifdef BFD64 return addr + (addr >> 32); #else return addr; #endif } /* got_entries only match if they're identical, except for gotidx, so use all fields to compute the hash, and compare the appropriate union members. */ static hashval_t mips_elf_got_entry_hash (const void *entry_) { const struct mips_got_entry *entry = (struct mips_got_entry *)entry_; return entry->symndx + ((entry->tls_type & GOT_TLS_LDM) << 17) + (! entry->abfd ? mips_elf_hash_bfd_vma (entry->d.address) : entry->abfd->id + (entry->symndx >= 0 ? mips_elf_hash_bfd_vma (entry->d.addend) : entry->d.h->root.root.root.hash)); } static int mips_elf_got_entry_eq (const void *entry1, const void *entry2) { const struct mips_got_entry *e1 = (struct mips_got_entry *)entry1; const struct mips_got_entry *e2 = (struct mips_got_entry *)entry2; /* An LDM entry can only match another LDM entry. */ if ((e1->tls_type ^ e2->tls_type) & GOT_TLS_LDM) return 0; return e1->abfd == e2->abfd && e1->symndx == e2->symndx && (! e1->abfd ? e1->d.address == e2->d.address : e1->symndx >= 0 ? e1->d.addend == e2->d.addend : e1->d.h == e2->d.h); } /* multi_got_entries are still a match in the case of global objects, even if the input bfd in which they're referenced differs, so the hash computation and compare functions are adjusted accordingly. */ static hashval_t mips_elf_multi_got_entry_hash (const void *entry_) { const struct mips_got_entry *entry = (struct mips_got_entry *)entry_; return entry->symndx + (! entry->abfd ? mips_elf_hash_bfd_vma (entry->d.address) : entry->symndx >= 0 ? ((entry->tls_type & GOT_TLS_LDM) ? (GOT_TLS_LDM << 17) : (entry->abfd->id + mips_elf_hash_bfd_vma (entry->d.addend))) : entry->d.h->root.root.root.hash); } static int mips_elf_multi_got_entry_eq (const void *entry1, const void *entry2) { const struct mips_got_entry *e1 = (struct mips_got_entry *)entry1; const struct mips_got_entry *e2 = (struct mips_got_entry *)entry2; /* Any two LDM entries match. */ if (e1->tls_type & e2->tls_type & GOT_TLS_LDM) return 1; /* Nothing else matches an LDM entry. */ if ((e1->tls_type ^ e2->tls_type) & GOT_TLS_LDM) return 0; return e1->symndx == e2->symndx && (e1->symndx >= 0 ? e1->abfd == e2->abfd && e1->d.addend == e2->d.addend : e1->abfd == NULL || e2->abfd == NULL ? e1->abfd == e2->abfd && e1->d.address == e2->d.address : e1->d.h == e2->d.h); } /* Return the dynamic relocation section. If it doesn't exist, try to create a new it if CREATE_P, otherwise return NULL. Also return NULL if creation fails. */ static asection * mips_elf_rel_dyn_section (struct bfd_link_info *info, bfd_boolean create_p) { const char *dname; asection *sreloc; bfd *dynobj; dname = MIPS_ELF_REL_DYN_NAME (info); dynobj = elf_hash_table (info)->dynobj; sreloc = bfd_get_section_by_name (dynobj, dname); if (sreloc == NULL && create_p) { sreloc = bfd_make_section_with_flags (dynobj, dname, (SEC_ALLOC | SEC_LOAD | SEC_HAS_CONTENTS | SEC_IN_MEMORY | SEC_LINKER_CREATED | SEC_READONLY)); if (sreloc == NULL || ! bfd_set_section_alignment (dynobj, sreloc, MIPS_ELF_LOG_FILE_ALIGN (dynobj))) return NULL; } return sreloc; } /* Returns the GOT section for ABFD. */ static asection * mips_elf_got_section (bfd *abfd, bfd_boolean maybe_excluded) { asection *sgot = bfd_get_section_by_name (abfd, ".got"); if (sgot == NULL || (! maybe_excluded && (sgot->flags & SEC_EXCLUDE) != 0)) return NULL; return sgot; } /* Returns the GOT information associated with the link indicated by INFO. If SGOTP is non-NULL, it is filled in with the GOT section. */ static struct mips_got_info * mips_elf_got_info (bfd *abfd, asection **sgotp) { asection *sgot; struct mips_got_info *g; sgot = mips_elf_got_section (abfd, TRUE); BFD_ASSERT (sgot != NULL); BFD_ASSERT (mips_elf_section_data (sgot) != NULL); g = mips_elf_section_data (sgot)->u.got_info; BFD_ASSERT (g != NULL); if (sgotp) *sgotp = (sgot->flags & SEC_EXCLUDE) == 0 ? sgot : NULL; return g; } /* Count the number of relocations needed for a TLS GOT entry, with access types from TLS_TYPE, and symbol H (or a local symbol if H is NULL). */ static int mips_tls_got_relocs (struct bfd_link_info *info, unsigned char tls_type, struct elf_link_hash_entry *h) { int indx = 0; int ret = 0; bfd_boolean need_relocs = FALSE; bfd_boolean dyn = elf_hash_table (info)->dynamic_sections_created; if (h && WILL_CALL_FINISH_DYNAMIC_SYMBOL (dyn, info->shared, h) && (!info->shared || !SYMBOL_REFERENCES_LOCAL (info, h))) indx = h->dynindx; if ((info->shared || indx != 0) && (h == NULL || ELF_ST_VISIBILITY (h->other) == STV_DEFAULT || h->root.type != bfd_link_hash_undefweak)) need_relocs = TRUE; if (!need_relocs) return FALSE; if (tls_type & GOT_TLS_GD) { ret++; if (indx != 0) ret++; } if (tls_type & GOT_TLS_IE) ret++; if ((tls_type & GOT_TLS_LDM) && info->shared) ret++; return ret; } /* Count the number of TLS relocations required for the GOT entry in ARG1, if it describes a local symbol. */ static int mips_elf_count_local_tls_relocs (void **arg1, void *arg2) { struct mips_got_entry *entry = * (struct mips_got_entry **) arg1; struct mips_elf_count_tls_arg *arg = arg2; if (entry->abfd != NULL && entry->symndx != -1) arg->needed += mips_tls_got_relocs (arg->info, entry->tls_type, NULL); return 1; } /* Count the number of TLS GOT entries required for the global (or forced-local) symbol in ARG1. */ static int mips_elf_count_global_tls_entries (void *arg1, void *arg2) { struct mips_elf_link_hash_entry *hm = (struct mips_elf_link_hash_entry *) arg1; struct mips_elf_count_tls_arg *arg = arg2; if (hm->tls_type & GOT_TLS_GD) arg->needed += 2; if (hm->tls_type & GOT_TLS_IE) arg->needed += 1; return 1; } /* Count the number of TLS relocations required for the global (or forced-local) symbol in ARG1. */ static int mips_elf_count_global_tls_relocs (void *arg1, void *arg2) { struct mips_elf_link_hash_entry *hm = (struct mips_elf_link_hash_entry *) arg1; struct mips_elf_count_tls_arg *arg = arg2; arg->needed += mips_tls_got_relocs (arg->info, hm->tls_type, &hm->root); return 1; } /* Output a simple dynamic relocation into SRELOC. */ static void mips_elf_output_dynamic_relocation (bfd *output_bfd, asection *sreloc, unsigned long indx, int r_type, bfd_vma offset) { Elf_Internal_Rela rel[3]; memset (rel, 0, sizeof (rel)); rel[0].r_info = ELF_R_INFO (output_bfd, indx, r_type); rel[0].r_offset = rel[1].r_offset = rel[2].r_offset = offset; if (ABI_64_P (output_bfd)) { (*get_elf_backend_data (output_bfd)->s->swap_reloc_out) (output_bfd, &rel[0], (sreloc->contents + sreloc->reloc_count * sizeof (Elf64_Mips_External_Rel))); } else bfd_elf32_swap_reloc_out (output_bfd, &rel[0], (sreloc->contents + sreloc->reloc_count * sizeof (Elf32_External_Rel))); ++sreloc->reloc_count; } /* Initialize a set of TLS GOT entries for one symbol. */ static void mips_elf_initialize_tls_slots (bfd *abfd, bfd_vma got_offset, unsigned char *tls_type_p, struct bfd_link_info *info, struct mips_elf_link_hash_entry *h, bfd_vma value) { int indx; asection *sreloc, *sgot; bfd_vma offset, offset2; bfd *dynobj; bfd_boolean need_relocs = FALSE; dynobj = elf_hash_table (info)->dynobj; sgot = mips_elf_got_section (dynobj, FALSE); indx = 0; if (h != NULL) { bfd_boolean dyn = elf_hash_table (info)->dynamic_sections_created; if (WILL_CALL_FINISH_DYNAMIC_SYMBOL (dyn, info->shared, &h->root) && (!info->shared || !SYMBOL_REFERENCES_LOCAL (info, &h->root))) indx = h->root.dynindx; } if (*tls_type_p & GOT_TLS_DONE) return; if ((info->shared || indx != 0) && (h == NULL || ELF_ST_VISIBILITY (h->root.other) == STV_DEFAULT || h->root.type != bfd_link_hash_undefweak)) need_relocs = TRUE; /* MINUS_ONE means the symbol is not defined in this object. It may not be defined at all; assume that the value doesn't matter in that case. Otherwise complain if we would use the value. */ BFD_ASSERT (value != MINUS_ONE || (indx != 0 && need_relocs) || h->root.root.type == bfd_link_hash_undefweak); /* Emit necessary relocations. */ sreloc = mips_elf_rel_dyn_section (info, FALSE); /* General Dynamic. */ if (*tls_type_p & GOT_TLS_GD) { offset = got_offset; offset2 = offset + MIPS_ELF_GOT_SIZE (abfd); if (need_relocs) { mips_elf_output_dynamic_relocation (abfd, sreloc, indx, ABI_64_P (abfd) ? R_MIPS_TLS_DTPMOD64 : R_MIPS_TLS_DTPMOD32, sgot->output_offset + sgot->output_section->vma + offset); if (indx) mips_elf_output_dynamic_relocation (abfd, sreloc, indx, ABI_64_P (abfd) ? R_MIPS_TLS_DTPREL64 : R_MIPS_TLS_DTPREL32, sgot->output_offset + sgot->output_section->vma + offset2); else MIPS_ELF_PUT_WORD (abfd, value - dtprel_base (info), sgot->contents + offset2); } else { MIPS_ELF_PUT_WORD (abfd, 1, sgot->contents + offset); MIPS_ELF_PUT_WORD (abfd, value - dtprel_base (info), sgot->contents + offset2); } got_offset += 2 * MIPS_ELF_GOT_SIZE (abfd); } /* Initial Exec model. */ if (*tls_type_p & GOT_TLS_IE) { offset = got_offset; if (need_relocs) { if (indx == 0) MIPS_ELF_PUT_WORD (abfd, value - elf_hash_table (info)->tls_sec->vma, sgot->contents + offset); else MIPS_ELF_PUT_WORD (abfd, 0, sgot->contents + offset); mips_elf_output_dynamic_relocation (abfd, sreloc, indx, ABI_64_P (abfd) ? R_MIPS_TLS_TPREL64 : R_MIPS_TLS_TPREL32, sgot->output_offset + sgot->output_section->vma + offset); } else MIPS_ELF_PUT_WORD (abfd, value - tprel_base (info), sgot->contents + offset); } if (*tls_type_p & GOT_TLS_LDM) { /* The initial offset is zero, and the LD offsets will include the bias by DTP_OFFSET. */ MIPS_ELF_PUT_WORD (abfd, 0, sgot->contents + got_offset + MIPS_ELF_GOT_SIZE (abfd)); if (!info->shared) MIPS_ELF_PUT_WORD (abfd, 1, sgot->contents + got_offset); else mips_elf_output_dynamic_relocation (abfd, sreloc, indx, ABI_64_P (abfd) ? R_MIPS_TLS_DTPMOD64 : R_MIPS_TLS_DTPMOD32, sgot->output_offset + sgot->output_section->vma + got_offset); } *tls_type_p |= GOT_TLS_DONE; } /* Return the GOT index to use for a relocation of type R_TYPE against a symbol accessed using TLS_TYPE models. The GOT entries for this symbol in this GOT start at GOT_INDEX. This function initializes the GOT entries and corresponding relocations. */ static bfd_vma mips_tls_got_index (bfd *abfd, bfd_vma got_index, unsigned char *tls_type, int r_type, struct bfd_link_info *info, struct mips_elf_link_hash_entry *h, bfd_vma symbol) { BFD_ASSERT (r_type == R_MIPS_TLS_GOTTPREL || r_type == R_MIPS_TLS_GD || r_type == R_MIPS_TLS_LDM); mips_elf_initialize_tls_slots (abfd, got_index, tls_type, info, h, symbol); if (r_type == R_MIPS_TLS_GOTTPREL) { BFD_ASSERT (*tls_type & GOT_TLS_IE); if (*tls_type & GOT_TLS_GD) return got_index + 2 * MIPS_ELF_GOT_SIZE (abfd); else return got_index; } if (r_type == R_MIPS_TLS_GD) { BFD_ASSERT (*tls_type & GOT_TLS_GD); return got_index; } if (r_type == R_MIPS_TLS_LDM) { BFD_ASSERT (*tls_type & GOT_TLS_LDM); return got_index; } return got_index; } /* Return the offset from _GLOBAL_OFFSET_TABLE_ of the .got.plt entry for global symbol H. .got.plt comes before the GOT, so the offset will be negative. */ static bfd_vma mips_elf_gotplt_index (struct bfd_link_info *info, struct elf_link_hash_entry *h) { bfd_vma plt_index, got_address, got_value; struct mips_elf_link_hash_table *htab; htab = mips_elf_hash_table (info); BFD_ASSERT (h->plt.offset != (bfd_vma) -1); /* Calculate the index of the symbol's PLT entry. */ plt_index = (h->plt.offset - htab->plt_header_size) / htab->plt_entry_size; /* Calculate the address of the associated .got.plt entry. */ got_address = (htab->sgotplt->output_section->vma + htab->sgotplt->output_offset + plt_index * 4); /* Calculate the value of _GLOBAL_OFFSET_TABLE_. */ got_value = (htab->root.hgot->root.u.def.section->output_section->vma + htab->root.hgot->root.u.def.section->output_offset + htab->root.hgot->root.u.def.value); return got_address - got_value; } /* Return the GOT offset for address VALUE. If there is not yet a GOT entry for this value, create one. If R_SYMNDX refers to a TLS symbol, create a TLS GOT entry instead. Return -1 if no satisfactory GOT offset can be found. */ static bfd_vma mips_elf_local_got_index (bfd *abfd, bfd *ibfd, struct bfd_link_info *info, bfd_vma value, unsigned long r_symndx, struct mips_elf_link_hash_entry *h, int r_type) { asection *sgot; struct mips_got_info *g; struct mips_got_entry *entry; g = mips_elf_got_info (elf_hash_table (info)->dynobj, &sgot); entry = mips_elf_create_local_got_entry (abfd, info, ibfd, g, sgot, value, r_symndx, h, r_type); if (!entry) return MINUS_ONE; if (TLS_RELOC_P (r_type)) { if (entry->symndx == -1 && g->next == NULL) /* A type (3) entry in the single-GOT case. We use the symbol's hash table entry to track the index. */ return mips_tls_got_index (abfd, h->tls_got_offset, &h->tls_type, r_type, info, h, value); else return mips_tls_got_index (abfd, entry->gotidx, &entry->tls_type, r_type, info, h, value); } else return entry->gotidx; } /* Returns the GOT index for the global symbol indicated by H. */ static bfd_vma mips_elf_global_got_index (bfd *abfd, bfd *ibfd, struct elf_link_hash_entry *h, int r_type, struct bfd_link_info *info) { bfd_vma index; asection *sgot; struct mips_got_info *g, *gg; long global_got_dynindx = 0; gg = g = mips_elf_got_info (abfd, &sgot); if (g->bfd2got && ibfd) { struct mips_got_entry e, *p; BFD_ASSERT (h->dynindx >= 0); g = mips_elf_got_for_ibfd (g, ibfd); if (g->next != gg || TLS_RELOC_P (r_type)) { e.abfd = ibfd; e.symndx = -1; e.d.h = (struct mips_elf_link_hash_entry *)h; e.tls_type = 0; p = htab_find (g->got_entries, &e); BFD_ASSERT (p->gotidx > 0); if (TLS_RELOC_P (r_type)) { bfd_vma value = MINUS_ONE; if ((h->root.type == bfd_link_hash_defined || h->root.type == bfd_link_hash_defweak) && h->root.u.def.section->output_section) value = (h->root.u.def.value + h->root.u.def.section->output_offset + h->root.u.def.section->output_section->vma); return mips_tls_got_index (abfd, p->gotidx, &p->tls_type, r_type, info, e.d.h, value); } else return p->gotidx; } } if (gg->global_gotsym != NULL) global_got_dynindx = gg->global_gotsym->dynindx; if (TLS_RELOC_P (r_type)) { struct mips_elf_link_hash_entry *hm = (struct mips_elf_link_hash_entry *) h; bfd_vma value = MINUS_ONE; if ((h->root.type == bfd_link_hash_defined || h->root.type == bfd_link_hash_defweak) && h->root.u.def.section->output_section) value = (h->root.u.def.value + h->root.u.def.section->output_offset + h->root.u.def.section->output_section->vma); index = mips_tls_got_index (abfd, hm->tls_got_offset, &hm->tls_type, r_type, info, hm, value); } else { /* Once we determine the global GOT entry with the lowest dynamic symbol table index, we must put all dynamic symbols with greater indices into the GOT. That makes it easy to calculate the GOT offset. */ BFD_ASSERT (h->dynindx >= global_got_dynindx); index = ((h->dynindx - global_got_dynindx + g->local_gotno) * MIPS_ELF_GOT_SIZE (abfd)); } BFD_ASSERT (index < sgot->size); return index; } /* Find a GOT page entry that points to within 32KB of VALUE. These entries are supposed to be placed at small offsets in the GOT, i.e., within 32KB of GP. Return the index of the GOT entry, or -1 if no entry could be created. If OFFSETP is nonnull, use it to return the offset of the GOT entry from VALUE. */ static bfd_vma mips_elf_got_page (bfd *abfd, bfd *ibfd, struct bfd_link_info *info, bfd_vma value, bfd_vma *offsetp) { asection *sgot; struct mips_got_info *g; bfd_vma page, index; struct mips_got_entry *entry; g = mips_elf_got_info (elf_hash_table (info)->dynobj, &sgot); page = (value + 0x8000) & ~(bfd_vma) 0xffff; entry = mips_elf_create_local_got_entry (abfd, info, ibfd, g, sgot, page, 0, NULL, R_MIPS_GOT_PAGE); if (!entry) return MINUS_ONE; index = entry->gotidx; if (offsetp) *offsetp = value - entry->d.address; return index; } /* Find a local GOT entry for an R_MIPS_GOT16 relocation against VALUE. EXTERNAL is true if the relocation was against a global symbol that has been forced local. */ static bfd_vma mips_elf_got16_entry (bfd *abfd, bfd *ibfd, struct bfd_link_info *info, bfd_vma value, bfd_boolean external) { asection *sgot; struct mips_got_info *g; struct mips_got_entry *entry; /* GOT16 relocations against local symbols are followed by a LO16 relocation; those against global symbols are not. Thus if the symbol was originally local, the GOT16 relocation should load the equivalent of %hi(VALUE), otherwise it should load VALUE itself. */ if (! external) value = mips_elf_high (value) << 16; g = mips_elf_got_info (elf_hash_table (info)->dynobj, &sgot); entry = mips_elf_create_local_got_entry (abfd, info, ibfd, g, sgot, value, 0, NULL, R_MIPS_GOT16); if (entry) return entry->gotidx; else return MINUS_ONE; } /* Returns the offset for the entry at the INDEXth position in the GOT. */ static bfd_vma mips_elf_got_offset_from_index (bfd *dynobj, bfd *output_bfd, bfd *input_bfd, bfd_vma index) { asection *sgot; bfd_vma gp; struct mips_got_info *g; g = mips_elf_got_info (dynobj, &sgot); gp = _bfd_get_gp_value (output_bfd) + mips_elf_adjust_gp (output_bfd, g, input_bfd); return sgot->output_section->vma + sgot->output_offset + index - gp; } /* Create and return a local GOT entry for VALUE, which was calculated from a symbol belonging to INPUT_SECTON. Return NULL if it could not be created. If R_SYMNDX refers to a TLS symbol, create a TLS entry instead. */ static struct mips_got_entry * mips_elf_create_local_got_entry (bfd *abfd, struct bfd_link_info *info, bfd *ibfd, struct mips_got_info *gg, asection *sgot, bfd_vma value, unsigned long r_symndx, struct mips_elf_link_hash_entry *h, int r_type) { struct mips_got_entry entry, **loc; struct mips_got_info *g; struct mips_elf_link_hash_table *htab; htab = mips_elf_hash_table (info); entry.abfd = NULL; entry.symndx = -1; entry.d.address = value; entry.tls_type = 0; g = mips_elf_got_for_ibfd (gg, ibfd); if (g == NULL) { g = mips_elf_got_for_ibfd (gg, abfd); BFD_ASSERT (g != NULL); } /* We might have a symbol, H, if it has been forced local. Use the global entry then. It doesn't matter whether an entry is local or global for TLS, since the dynamic linker does not automatically relocate TLS GOT entries. */ BFD_ASSERT (h == NULL || h->root.forced_local); if (TLS_RELOC_P (r_type)) { struct mips_got_entry *p; entry.abfd = ibfd; if (r_type == R_MIPS_TLS_LDM) { entry.tls_type = GOT_TLS_LDM; entry.symndx = 0; entry.d.addend = 0; } else if (h == NULL) { entry.symndx = r_symndx; entry.d.addend = 0; } else entry.d.h = h; p = (struct mips_got_entry *) htab_find (g->got_entries, &entry); BFD_ASSERT (p); return p; } loc = (struct mips_got_entry **) htab_find_slot (g->got_entries, &entry, INSERT); if (*loc) return *loc; entry.gotidx = MIPS_ELF_GOT_SIZE (abfd) * g->assigned_gotno++; entry.tls_type = 0; *loc = (struct mips_got_entry *)bfd_alloc (abfd, sizeof entry); if (! *loc) return NULL; memcpy (*loc, &entry, sizeof entry); if (g->assigned_gotno >= g->local_gotno) { (*loc)->gotidx = -1; /* We didn't allocate enough space in the GOT. */ (*_bfd_error_handler) (_("not enough GOT space for local GOT entries")); bfd_set_error (bfd_error_bad_value); return NULL; } MIPS_ELF_PUT_WORD (abfd, value, (sgot->contents + entry.gotidx)); /* These GOT entries need a dynamic relocation on VxWorks. */ if (htab->is_vxworks) { Elf_Internal_Rela outrel; asection *s; bfd_byte *loc; bfd_vma got_address; s = mips_elf_rel_dyn_section (info, FALSE); got_address = (sgot->output_section->vma + sgot->output_offset + entry.gotidx); loc = s->contents + (s->reloc_count++ * sizeof (Elf32_External_Rela)); outrel.r_offset = got_address; outrel.r_info = ELF32_R_INFO (STN_UNDEF, R_MIPS_32); outrel.r_addend = value; bfd_elf32_swap_reloca_out (abfd, &outrel, loc); } return *loc; } /* Sort the dynamic symbol table so that symbols that need GOT entries appear towards the end. This reduces the amount of GOT space required. MAX_LOCAL is used to set the number of local symbols known to be in the dynamic symbol table. During _bfd_mips_elf_size_dynamic_sections, this value is 1. Afterward, the section symbols are added and the count is higher. */ static bfd_boolean mips_elf_sort_hash_table (struct bfd_link_info *info, unsigned long max_local) { struct mips_elf_hash_sort_data hsd; struct mips_got_info *g; bfd *dynobj; dynobj = elf_hash_table (info)->dynobj; g = mips_elf_got_info (dynobj, NULL); hsd.low = NULL; hsd.max_unref_got_dynindx = hsd.min_got_dynindx = elf_hash_table (info)->dynsymcount /* In the multi-got case, assigned_gotno of the master got_info indicate the number of entries that aren't referenced in the primary GOT, but that must have entries because there are dynamic relocations that reference it. Since they aren't referenced, we move them to the end of the GOT, so that they don't prevent other entries that are referenced from getting too large offsets. */ - (g->next ? g->assigned_gotno : 0); hsd.max_non_got_dynindx = max_local; mips_elf_link_hash_traverse (((struct mips_elf_link_hash_table *) elf_hash_table (info)), mips_elf_sort_hash_table_f, &hsd); /* There should have been enough room in the symbol table to accommodate both the GOT and non-GOT symbols. */ BFD_ASSERT (hsd.max_non_got_dynindx <= hsd.min_got_dynindx); BFD_ASSERT ((unsigned long)hsd.max_unref_got_dynindx <= elf_hash_table (info)->dynsymcount); /* Now we know which dynamic symbol has the lowest dynamic symbol table index in the GOT. */ g->global_gotsym = hsd.low; return TRUE; } /* If H needs a GOT entry, assign it the highest available dynamic index. Otherwise, assign it the lowest available dynamic index. */ static bfd_boolean mips_elf_sort_hash_table_f (struct mips_elf_link_hash_entry *h, void *data) { struct mips_elf_hash_sort_data *hsd = data; if (h->root.root.type == bfd_link_hash_warning) h = (struct mips_elf_link_hash_entry *) h->root.root.u.i.link; /* Symbols without dynamic symbol table entries aren't interesting at all. */ if (h->root.dynindx == -1) return TRUE; /* Global symbols that need GOT entries that are not explicitly referenced are marked with got offset 2. Those that are referenced get a 1, and those that don't need GOT entries get -1. */ if (h->root.got.offset == 2) { BFD_ASSERT (h->tls_type == GOT_NORMAL); if (hsd->max_unref_got_dynindx == hsd->min_got_dynindx) hsd->low = (struct elf_link_hash_entry *) h; h->root.dynindx = hsd->max_unref_got_dynindx++; } else if (h->root.got.offset != 1) h->root.dynindx = hsd->max_non_got_dynindx++; else { BFD_ASSERT (h->tls_type == GOT_NORMAL); h->root.dynindx = --hsd->min_got_dynindx; hsd->low = (struct elf_link_hash_entry *) h; } return TRUE; } /* If H is a symbol that needs a global GOT entry, but has a dynamic symbol table index lower than any we've seen to date, record it for posterity. */ static bfd_boolean mips_elf_record_global_got_symbol (struct elf_link_hash_entry *h, bfd *abfd, struct bfd_link_info *info, struct mips_got_info *g, unsigned char tls_flag) { struct mips_got_entry entry, **loc; /* A global symbol in the GOT must also be in the dynamic symbol table. */ if (h->dynindx == -1) { switch (ELF_ST_VISIBILITY (h->other)) { case STV_INTERNAL: case STV_HIDDEN: _bfd_mips_elf_hide_symbol (info, h, TRUE); break; } if (!bfd_elf_link_record_dynamic_symbol (info, h)) return FALSE; } /* Make sure we have a GOT to put this entry into. */ BFD_ASSERT (g != NULL); entry.abfd = abfd; entry.symndx = -1; entry.d.h = (struct mips_elf_link_hash_entry *) h; entry.tls_type = 0; loc = (struct mips_got_entry **) htab_find_slot (g->got_entries, &entry, INSERT); /* If we've already marked this entry as needing GOT space, we don't need to do it again. */ if (*loc) { (*loc)->tls_type |= tls_flag; return TRUE; } *loc = (struct mips_got_entry *)bfd_alloc (abfd, sizeof entry); if (! *loc) return FALSE; entry.gotidx = -1; entry.tls_type = tls_flag; memcpy (*loc, &entry, sizeof entry); if (h->got.offset != MINUS_ONE) return TRUE; /* By setting this to a value other than -1, we are indicating that there needs to be a GOT entry for H. Avoid using zero, as the generic ELF copy_indirect_symbol tests for <= 0. */ if (tls_flag == 0) h->got.offset = 1; return TRUE; } /* Reserve space in G for a GOT entry containing the value of symbol SYMNDX in input bfd ABDF, plus ADDEND. */ static bfd_boolean mips_elf_record_local_got_symbol (bfd *abfd, long symndx, bfd_vma addend, struct mips_got_info *g, unsigned char tls_flag) { struct mips_got_entry entry, **loc; entry.abfd = abfd; entry.symndx = symndx; entry.d.addend = addend; entry.tls_type = tls_flag; loc = (struct mips_got_entry **) htab_find_slot (g->got_entries, &entry, INSERT); if (*loc) { if (tls_flag == GOT_TLS_GD && !((*loc)->tls_type & GOT_TLS_GD)) { g->tls_gotno += 2; (*loc)->tls_type |= tls_flag; } else if (tls_flag == GOT_TLS_IE && !((*loc)->tls_type & GOT_TLS_IE)) { g->tls_gotno += 1; (*loc)->tls_type |= tls_flag; } return TRUE; } if (tls_flag != 0) { entry.gotidx = -1; entry.tls_type = tls_flag; if (tls_flag == GOT_TLS_IE) g->tls_gotno += 1; else if (tls_flag == GOT_TLS_GD) g->tls_gotno += 2; else if (g->tls_ldm_offset == MINUS_ONE) { g->tls_ldm_offset = MINUS_TWO; g->tls_gotno += 2; } } else { entry.gotidx = g->local_gotno++; entry.tls_type = 0; } *loc = (struct mips_got_entry *)bfd_alloc (abfd, sizeof entry); if (! *loc) return FALSE; memcpy (*loc, &entry, sizeof entry); return TRUE; } /* Compute the hash value of the bfd in a bfd2got hash entry. */ static hashval_t mips_elf_bfd2got_entry_hash (const void *entry_) { const struct mips_elf_bfd2got_hash *entry = (struct mips_elf_bfd2got_hash *)entry_; return entry->bfd->id; } /* Check whether two hash entries have the same bfd. */ static int mips_elf_bfd2got_entry_eq (const void *entry1, const void *entry2) { const struct mips_elf_bfd2got_hash *e1 = (const struct mips_elf_bfd2got_hash *)entry1; const struct mips_elf_bfd2got_hash *e2 = (const struct mips_elf_bfd2got_hash *)entry2; return e1->bfd == e2->bfd; } /* In a multi-got link, determine the GOT to be used for IBFD. G must be the master GOT data. */ static struct mips_got_info * mips_elf_got_for_ibfd (struct mips_got_info *g, bfd *ibfd) { struct mips_elf_bfd2got_hash e, *p; if (! g->bfd2got) return g; e.bfd = ibfd; p = htab_find (g->bfd2got, &e); return p ? p->g : NULL; } /* Create one separate got for each bfd that has entries in the global got, such that we can tell how many local and global entries each bfd requires. */ static int mips_elf_make_got_per_bfd (void **entryp, void *p) { struct mips_got_entry *entry = (struct mips_got_entry *)*entryp; struct mips_elf_got_per_bfd_arg *arg = (struct mips_elf_got_per_bfd_arg *)p; htab_t bfd2got = arg->bfd2got; struct mips_got_info *g; struct mips_elf_bfd2got_hash bfdgot_entry, *bfdgot; void **bfdgotp; /* Find the got_info for this GOT entry's input bfd. Create one if none exists. */ bfdgot_entry.bfd = entry->abfd; bfdgotp = htab_find_slot (bfd2got, &bfdgot_entry, INSERT); bfdgot = (struct mips_elf_bfd2got_hash *)*bfdgotp; if (bfdgot != NULL) g = bfdgot->g; else { bfdgot = (struct mips_elf_bfd2got_hash *)bfd_alloc (arg->obfd, sizeof (struct mips_elf_bfd2got_hash)); if (bfdgot == NULL) { arg->obfd = 0; return 0; } *bfdgotp = bfdgot; bfdgot->bfd = entry->abfd; bfdgot->g = g = (struct mips_got_info *) bfd_alloc (arg->obfd, sizeof (struct mips_got_info)); if (g == NULL) { arg->obfd = 0; return 0; } g->global_gotsym = NULL; g->global_gotno = 0; g->local_gotno = 0; g->assigned_gotno = -1; g->tls_gotno = 0; g->tls_assigned_gotno = 0; g->tls_ldm_offset = MINUS_ONE; g->got_entries = htab_try_create (1, mips_elf_multi_got_entry_hash, mips_elf_multi_got_entry_eq, NULL); if (g->got_entries == NULL) { arg->obfd = 0; return 0; } g->bfd2got = NULL; g->next = NULL; } /* Insert the GOT entry in the bfd's got entry hash table. */ entryp = htab_find_slot (g->got_entries, entry, INSERT); if (*entryp != NULL) return 1; *entryp = entry; if (entry->tls_type) { if (entry->tls_type & (GOT_TLS_GD | GOT_TLS_LDM)) g->tls_gotno += 2; if (entry->tls_type & GOT_TLS_IE) g->tls_gotno += 1; } else if (entry->symndx >= 0 || entry->d.h->forced_local) ++g->local_gotno; else ++g->global_gotno; return 1; } /* Attempt to merge gots of different input bfds. Try to use as much as possible of the primary got, since it doesn't require explicit dynamic relocations, but don't use bfds that would reference global symbols out of the addressable range. Failing the primary got, attempt to merge with the current got, or finish the current got and then make make the new got current. */ static int mips_elf_merge_gots (void **bfd2got_, void *p) { struct mips_elf_bfd2got_hash *bfd2got = (struct mips_elf_bfd2got_hash *)*bfd2got_; struct mips_elf_got_per_bfd_arg *arg = (struct mips_elf_got_per_bfd_arg *)p; unsigned int lcount = bfd2got->g->local_gotno; unsigned int gcount = bfd2got->g->global_gotno; unsigned int tcount = bfd2got->g->tls_gotno; unsigned int maxcnt = arg->max_count; bfd_boolean too_many_for_tls = FALSE; /* We place TLS GOT entries after both locals and globals. The globals for the primary GOT may overflow the normal GOT size limit, so be sure not to merge a GOT which requires TLS with the primary GOT in that case. This doesn't affect non-primary GOTs. */ if (tcount > 0) { unsigned int primary_total = lcount + tcount + arg->global_count; if (primary_total > maxcnt) too_many_for_tls = TRUE; } /* If we don't have a primary GOT and this is not too big, use it as a starting point for the primary GOT. */ if (! arg->primary && lcount + gcount + tcount <= maxcnt && ! too_many_for_tls) { arg->primary = bfd2got->g; arg->primary_count = lcount + gcount; } /* If it looks like we can merge this bfd's entries with those of the primary, merge them. The heuristics is conservative, but we don't have to squeeze it too hard. */ else if (arg->primary && ! too_many_for_tls && (arg->primary_count + lcount + gcount + tcount) <= maxcnt) { struct mips_got_info *g = bfd2got->g; int old_lcount = arg->primary->local_gotno; int old_gcount = arg->primary->global_gotno; int old_tcount = arg->primary->tls_gotno; bfd2got->g = arg->primary; htab_traverse (g->got_entries, mips_elf_make_got_per_bfd, arg); if (arg->obfd == NULL) return 0; htab_delete (g->got_entries); /* We don't have to worry about releasing memory of the actual got entries, since they're all in the master got_entries hash table anyway. */ BFD_ASSERT (old_lcount + lcount >= arg->primary->local_gotno); BFD_ASSERT (old_gcount + gcount >= arg->primary->global_gotno); BFD_ASSERT (old_tcount + tcount >= arg->primary->tls_gotno); arg->primary_count = arg->primary->local_gotno + arg->primary->global_gotno + arg->primary->tls_gotno; } /* If we can merge with the last-created got, do it. */ else if (arg->current && arg->current_count + lcount + gcount + tcount <= maxcnt) { struct mips_got_info *g = bfd2got->g; int old_lcount = arg->current->local_gotno; int old_gcount = arg->current->global_gotno; int old_tcount = arg->current->tls_gotno; bfd2got->g = arg->current; htab_traverse (g->got_entries, mips_elf_make_got_per_bfd, arg); if (arg->obfd == NULL) return 0; htab_delete (g->got_entries); BFD_ASSERT (old_lcount + lcount >= arg->current->local_gotno); BFD_ASSERT (old_gcount + gcount >= arg->current->global_gotno); BFD_ASSERT (old_tcount + tcount >= arg->current->tls_gotno); arg->current_count = arg->current->local_gotno + arg->current->global_gotno + arg->current->tls_gotno; } /* Well, we couldn't merge, so create a new GOT. Don't check if it fits; if it turns out that it doesn't, we'll get relocation overflows anyway. */ else { bfd2got->g->next = arg->current; arg->current = bfd2got->g; arg->current_count = lcount + gcount + 2 * tcount; } return 1; } /* Set the TLS GOT index for the GOT entry in ENTRYP. ENTRYP's NEXT field is null iff there is just a single GOT. */ static int mips_elf_initialize_tls_index (void **entryp, void *p) { struct mips_got_entry *entry = (struct mips_got_entry *)*entryp; struct mips_got_info *g = p; bfd_vma next_index; unsigned char tls_type; /* We're only interested in TLS symbols. */ if (entry->tls_type == 0) return 1; next_index = MIPS_ELF_GOT_SIZE (entry->abfd) * (long) g->tls_assigned_gotno; if (entry->symndx == -1 && g->next == NULL) { /* A type (3) got entry in the single-GOT case. We use the symbol's hash table entry to track its index. */ if (entry->d.h->tls_type & GOT_TLS_OFFSET_DONE) return 1; entry->d.h->tls_type |= GOT_TLS_OFFSET_DONE; entry->d.h->tls_got_offset = next_index; tls_type = entry->d.h->tls_type; } else { if (entry->tls_type & GOT_TLS_LDM) { /* There are separate mips_got_entry objects for each input bfd that requires an LDM entry. Make sure that all LDM entries in a GOT resolve to the same index. */ if (g->tls_ldm_offset != MINUS_TWO && g->tls_ldm_offset != MINUS_ONE) { entry->gotidx = g->tls_ldm_offset; return 1; } g->tls_ldm_offset = next_index; } entry->gotidx = next_index; tls_type = entry->tls_type; } /* Account for the entries we've just allocated. */ if (tls_type & (GOT_TLS_GD | GOT_TLS_LDM)) g->tls_assigned_gotno += 2; if (tls_type & GOT_TLS_IE) g->tls_assigned_gotno += 1; return 1; } /* If passed a NULL mips_got_info in the argument, set the marker used to tell whether a global symbol needs a got entry (in the primary got) to the given VALUE. If passed a pointer G to a mips_got_info in the argument (it must not be the primary GOT), compute the offset from the beginning of the (primary) GOT section to the entry in G corresponding to the global symbol. G's assigned_gotno must contain the index of the first available global GOT entry in G. VALUE must contain the size of a GOT entry in bytes. For each global GOT entry that requires a dynamic relocation, NEEDED_RELOCS is incremented, and the symbol is marked as not eligible for lazy resolution through a function stub. */ static int mips_elf_set_global_got_offset (void **entryp, void *p) { struct mips_got_entry *entry = (struct mips_got_entry *)*entryp; struct mips_elf_set_global_got_offset_arg *arg = (struct mips_elf_set_global_got_offset_arg *)p; struct mips_got_info *g = arg->g; if (g && entry->tls_type != GOT_NORMAL) arg->needed_relocs += mips_tls_got_relocs (arg->info, entry->tls_type, entry->symndx == -1 ? &entry->d.h->root : NULL); if (entry->abfd != NULL && entry->symndx == -1 && entry->d.h->root.dynindx != -1 && entry->d.h->tls_type == GOT_NORMAL) { if (g) { BFD_ASSERT (g->global_gotsym == NULL); entry->gotidx = arg->value * (long) g->assigned_gotno++; if (arg->info->shared || (elf_hash_table (arg->info)->dynamic_sections_created && entry->d.h->root.def_dynamic && !entry->d.h->root.def_regular)) ++arg->needed_relocs; } else entry->d.h->root.got.offset = arg->value; } return 1; } /* Mark any global symbols referenced in the GOT we are iterating over as inelligible for lazy resolution stubs. */ static int mips_elf_set_no_stub (void **entryp, void *p ATTRIBUTE_UNUSED) { struct mips_got_entry *entry = (struct mips_got_entry *)*entryp; if (entry->abfd != NULL && entry->symndx == -1 && entry->d.h->root.dynindx != -1) entry->d.h->no_fn_stub = TRUE; return 1; } /* Follow indirect and warning hash entries so that each got entry points to the final symbol definition. P must point to a pointer to the hash table we're traversing. Since this traversal may modify the hash table, we set this pointer to NULL to indicate we've made a potentially-destructive change to the hash table, so the traversal must be restarted. */ static int mips_elf_resolve_final_got_entry (void **entryp, void *p) { struct mips_got_entry *entry = (struct mips_got_entry *)*entryp; htab_t got_entries = *(htab_t *)p; if (entry->abfd != NULL && entry->symndx == -1) { struct mips_elf_link_hash_entry *h = entry->d.h; while (h->root.root.type == bfd_link_hash_indirect || h->root.root.type == bfd_link_hash_warning) h = (struct mips_elf_link_hash_entry *) h->root.root.u.i.link; if (entry->d.h == h) return 1; entry->d.h = h; /* If we can't find this entry with the new bfd hash, re-insert it, and get the traversal restarted. */ if (! htab_find (got_entries, entry)) { htab_clear_slot (got_entries, entryp); entryp = htab_find_slot (got_entries, entry, INSERT); if (! *entryp) *entryp = entry; /* Abort the traversal, since the whole table may have moved, and leave it up to the parent to restart the process. */ *(htab_t *)p = NULL; return 0; } /* We might want to decrement the global_gotno count, but it's either too early or too late for that at this point. */ } return 1; } /* Turn indirect got entries in a got_entries table into their final locations. */ static void mips_elf_resolve_final_got_entries (struct mips_got_info *g) { htab_t got_entries; do { got_entries = g->got_entries; htab_traverse (got_entries, mips_elf_resolve_final_got_entry, &got_entries); } while (got_entries == NULL); } /* Return the offset of an input bfd IBFD's GOT from the beginning of the primary GOT. */ static bfd_vma mips_elf_adjust_gp (bfd *abfd, struct mips_got_info *g, bfd *ibfd) { if (g->bfd2got == NULL) return 0; g = mips_elf_got_for_ibfd (g, ibfd); if (! g) return 0; BFD_ASSERT (g->next); g = g->next; return (g->local_gotno + g->global_gotno + g->tls_gotno) * MIPS_ELF_GOT_SIZE (abfd); } /* Turn a single GOT that is too big for 16-bit addressing into a sequence of GOTs, each one 16-bit addressable. */ static bfd_boolean mips_elf_multi_got (bfd *abfd, struct bfd_link_info *info, struct mips_got_info *g, asection *got, bfd_size_type pages) { struct mips_elf_got_per_bfd_arg got_per_bfd_arg; struct mips_elf_set_global_got_offset_arg set_got_offset_arg; struct mips_got_info *gg; unsigned int assign; g->bfd2got = htab_try_create (1, mips_elf_bfd2got_entry_hash, mips_elf_bfd2got_entry_eq, NULL); if (g->bfd2got == NULL) return FALSE; got_per_bfd_arg.bfd2got = g->bfd2got; got_per_bfd_arg.obfd = abfd; got_per_bfd_arg.info = info; /* Count how many GOT entries each input bfd requires, creating a map from bfd to got info while at that. */ htab_traverse (g->got_entries, mips_elf_make_got_per_bfd, &got_per_bfd_arg); if (got_per_bfd_arg.obfd == NULL) return FALSE; got_per_bfd_arg.current = NULL; got_per_bfd_arg.primary = NULL; /* Taking out PAGES entries is a worst-case estimate. We could compute the maximum number of pages that each separate input bfd uses, but it's probably not worth it. */ got_per_bfd_arg.max_count = ((MIPS_ELF_GOT_MAX_SIZE (info) / MIPS_ELF_GOT_SIZE (abfd)) - MIPS_RESERVED_GOTNO (info) - pages); /* The number of globals that will be included in the primary GOT. See the calls to mips_elf_set_global_got_offset below for more information. */ got_per_bfd_arg.global_count = g->global_gotno; /* Try to merge the GOTs of input bfds together, as long as they don't seem to exceed the maximum GOT size, choosing one of them to be the primary GOT. */ htab_traverse (g->bfd2got, mips_elf_merge_gots, &got_per_bfd_arg); if (got_per_bfd_arg.obfd == NULL) return FALSE; /* If we do not find any suitable primary GOT, create an empty one. */ if (got_per_bfd_arg.primary == NULL) { g->next = (struct mips_got_info *) bfd_alloc (abfd, sizeof (struct mips_got_info)); if (g->next == NULL) return FALSE; g->next->global_gotsym = NULL; g->next->global_gotno = 0; g->next->local_gotno = 0; g->next->tls_gotno = 0; g->next->assigned_gotno = 0; g->next->tls_assigned_gotno = 0; g->next->tls_ldm_offset = MINUS_ONE; g->next->got_entries = htab_try_create (1, mips_elf_multi_got_entry_hash, mips_elf_multi_got_entry_eq, NULL); if (g->next->got_entries == NULL) return FALSE; g->next->bfd2got = NULL; } else g->next = got_per_bfd_arg.primary; g->next->next = got_per_bfd_arg.current; /* GG is now the master GOT, and G is the primary GOT. */ gg = g; g = g->next; /* Map the output bfd to the primary got. That's what we're going to use for bfds that use GOT16 or GOT_PAGE relocations that we didn't mark in check_relocs, and we want a quick way to find it. We can't just use gg->next because we're going to reverse the list. */ { struct mips_elf_bfd2got_hash *bfdgot; void **bfdgotp; bfdgot = (struct mips_elf_bfd2got_hash *)bfd_alloc (abfd, sizeof (struct mips_elf_bfd2got_hash)); if (bfdgot == NULL) return FALSE; bfdgot->bfd = abfd; bfdgot->g = g; bfdgotp = htab_find_slot (gg->bfd2got, bfdgot, INSERT); BFD_ASSERT (*bfdgotp == NULL); *bfdgotp = bfdgot; } /* The IRIX dynamic linker requires every symbol that is referenced in a dynamic relocation to be present in the primary GOT, so arrange for them to appear after those that are actually referenced. GNU/Linux could very well do without it, but it would slow down the dynamic linker, since it would have to resolve every dynamic symbol referenced in other GOTs more than once, without help from the cache. Also, knowing that every external symbol has a GOT helps speed up the resolution of local symbols too, so GNU/Linux follows IRIX's practice. The number 2 is used by mips_elf_sort_hash_table_f to count global GOT symbols that are unreferenced in the primary GOT, with an initial dynamic index computed from gg->assigned_gotno, where the number of unreferenced global entries in the primary GOT is preserved. */ if (1) { gg->assigned_gotno = gg->global_gotno - g->global_gotno; g->global_gotno = gg->global_gotno; set_got_offset_arg.value = 2; } else { /* This could be used for dynamic linkers that don't optimize symbol resolution while applying relocations so as to use primary GOT entries or assuming the symbol is locally-defined. With this code, we assign lower dynamic indices to global symbols that are not referenced in the primary GOT, so that their entries can be omitted. */ gg->assigned_gotno = 0; set_got_offset_arg.value = -1; } /* Reorder dynamic symbols as described above (which behavior depends on the setting of VALUE). */ set_got_offset_arg.g = NULL; htab_traverse (gg->got_entries, mips_elf_set_global_got_offset, &set_got_offset_arg); set_got_offset_arg.value = 1; htab_traverse (g->got_entries, mips_elf_set_global_got_offset, &set_got_offset_arg); if (! mips_elf_sort_hash_table (info, 1)) return FALSE; /* Now go through the GOTs assigning them offset ranges. [assigned_gotno, local_gotno[ will be set to the range of local entries in each GOT. We can then compute the end of a GOT by adding local_gotno to global_gotno. We reverse the list and make it circular since then we'll be able to quickly compute the beginning of a GOT, by computing the end of its predecessor. To avoid special cases for the primary GOT, while still preserving assertions that are valid for both single- and multi-got links, we arrange for the main got struct to have the right number of global entries, but set its local_gotno such that the initial offset of the primary GOT is zero. Remember that the primary GOT will become the last item in the circular linked list, so it points back to the master GOT. */ gg->local_gotno = -g->global_gotno; gg->global_gotno = g->global_gotno; gg->tls_gotno = 0; assign = 0; gg->next = gg; do { struct mips_got_info *gn; assign += MIPS_RESERVED_GOTNO (info); g->assigned_gotno = assign; g->local_gotno += assign + pages; assign = g->local_gotno + g->global_gotno + g->tls_gotno; /* Take g out of the direct list, and push it onto the reversed list that gg points to. g->next is guaranteed to be nonnull after this operation, as required by mips_elf_initialize_tls_index. */ gn = g->next; g->next = gg->next; gg->next = g; /* Set up any TLS entries. We always place the TLS entries after all non-TLS entries. */ g->tls_assigned_gotno = g->local_gotno + g->global_gotno; htab_traverse (g->got_entries, mips_elf_initialize_tls_index, g); /* Move onto the next GOT. It will be a secondary GOT if nonull. */ g = gn; /* Mark global symbols in every non-primary GOT as ineligible for stubs. */ if (g) htab_traverse (g->got_entries, mips_elf_set_no_stub, NULL); } while (g); got->size = (gg->next->local_gotno + gg->next->global_gotno + gg->next->tls_gotno) * MIPS_ELF_GOT_SIZE (abfd); return TRUE; } /* Returns the first relocation of type r_type found, beginning with RELOCATION. RELEND is one-past-the-end of the relocation table. */ static const Elf_Internal_Rela * mips_elf_next_relocation (bfd *abfd ATTRIBUTE_UNUSED, unsigned int r_type, const Elf_Internal_Rela *relocation, const Elf_Internal_Rela *relend) { unsigned long r_symndx = ELF_R_SYM (abfd, relocation->r_info); while (relocation < relend) { if (ELF_R_TYPE (abfd, relocation->r_info) == r_type && ELF_R_SYM (abfd, relocation->r_info) == r_symndx) return relocation; ++relocation; } /* We didn't find it. */ return NULL; } /* Return whether a relocation is against a local symbol. */ static bfd_boolean mips_elf_local_relocation_p (bfd *input_bfd, const Elf_Internal_Rela *relocation, asection **local_sections, bfd_boolean check_forced) { unsigned long r_symndx; Elf_Internal_Shdr *symtab_hdr; struct mips_elf_link_hash_entry *h; size_t extsymoff; r_symndx = ELF_R_SYM (input_bfd, relocation->r_info); symtab_hdr = &elf_tdata (input_bfd)->symtab_hdr; extsymoff = (elf_bad_symtab (input_bfd)) ? 0 : symtab_hdr->sh_info; if (r_symndx < extsymoff) return TRUE; if (elf_bad_symtab (input_bfd) && local_sections[r_symndx] != NULL) return TRUE; if (check_forced) { /* Look up the hash table to check whether the symbol was forced local. */ h = (struct mips_elf_link_hash_entry *) elf_sym_hashes (input_bfd) [r_symndx - extsymoff]; /* Find the real hash-table entry for this symbol. */ while (h->root.root.type == bfd_link_hash_indirect || h->root.root.type == bfd_link_hash_warning) h = (struct mips_elf_link_hash_entry *) h->root.root.u.i.link; if (h->root.forced_local) return TRUE; } return FALSE; } /* Sign-extend VALUE, which has the indicated number of BITS. */ bfd_vma _bfd_mips_elf_sign_extend (bfd_vma value, int bits) { if (value & ((bfd_vma) 1 << (bits - 1))) /* VALUE is negative. */ value |= ((bfd_vma) - 1) << bits; return value; } /* Return non-zero if the indicated VALUE has overflowed the maximum range expressible by a signed number with the indicated number of BITS. */ static bfd_boolean mips_elf_overflow_p (bfd_vma value, int bits) { bfd_signed_vma svalue = (bfd_signed_vma) value; if (svalue > (1 << (bits - 1)) - 1) /* The value is too big. */ return TRUE; else if (svalue < -(1 << (bits - 1))) /* The value is too small. */ return TRUE; /* All is well. */ return FALSE; } /* Calculate the %high function. */ static bfd_vma mips_elf_high (bfd_vma value) { return ((value + (bfd_vma) 0x8000) >> 16) & 0xffff; } /* Calculate the %higher function. */ static bfd_vma mips_elf_higher (bfd_vma value ATTRIBUTE_UNUSED) { #ifdef BFD64 return ((value + (bfd_vma) 0x80008000) >> 32) & 0xffff; #else abort (); return MINUS_ONE; #endif } /* Calculate the %highest function. */ static bfd_vma mips_elf_highest (bfd_vma value ATTRIBUTE_UNUSED) { #ifdef BFD64 return ((value + (((bfd_vma) 0x8000 << 32) | 0x80008000)) >> 48) & 0xffff; #else abort (); return MINUS_ONE; #endif } /* Create the .compact_rel section. */ static bfd_boolean mips_elf_create_compact_rel_section (bfd *abfd, struct bfd_link_info *info ATTRIBUTE_UNUSED) { flagword flags; register asection *s; if (bfd_get_section_by_name (abfd, ".compact_rel") == NULL) { flags = (SEC_HAS_CONTENTS | SEC_IN_MEMORY | SEC_LINKER_CREATED | SEC_READONLY); s = bfd_make_section_with_flags (abfd, ".compact_rel", flags); if (s == NULL || ! bfd_set_section_alignment (abfd, s, MIPS_ELF_LOG_FILE_ALIGN (abfd))) return FALSE; s->size = sizeof (Elf32_External_compact_rel); } return TRUE; } /* Create the .got section to hold the global offset table. */ static bfd_boolean mips_elf_create_got_section (bfd *abfd, struct bfd_link_info *info, bfd_boolean maybe_exclude) { flagword flags; register asection *s; struct elf_link_hash_entry *h; struct bfd_link_hash_entry *bh; struct mips_got_info *g; bfd_size_type amt; struct mips_elf_link_hash_table *htab; htab = mips_elf_hash_table (info); /* This function may be called more than once. */ s = mips_elf_got_section (abfd, TRUE); if (s) { if (! maybe_exclude) s->flags &= ~SEC_EXCLUDE; return TRUE; } flags = (SEC_ALLOC | SEC_LOAD | SEC_HAS_CONTENTS | SEC_IN_MEMORY | SEC_LINKER_CREATED); if (maybe_exclude) flags |= SEC_EXCLUDE; /* We have to use an alignment of 2**4 here because this is hardcoded in the function stub generation and in the linker script. */ s = bfd_make_section_with_flags (abfd, ".got", flags); if (s == NULL || ! bfd_set_section_alignment (abfd, s, 4)) return FALSE; /* Define the symbol _GLOBAL_OFFSET_TABLE_. We don't do this in the linker script because we don't want to define the symbol if we are not creating a global offset table. */ bh = NULL; if (! (_bfd_generic_link_add_one_symbol (info, abfd, "_GLOBAL_OFFSET_TABLE_", BSF_GLOBAL, s, 0, NULL, FALSE, get_elf_backend_data (abfd)->collect, &bh))) return FALSE; h = (struct elf_link_hash_entry *) bh; h->non_elf = 0; h->def_regular = 1; h->type = STT_OBJECT; elf_hash_table (info)->hgot = h; if (info->shared && ! bfd_elf_link_record_dynamic_symbol (info, h)) return FALSE; amt = sizeof (struct mips_got_info); g = bfd_alloc (abfd, amt); if (g == NULL) return FALSE; g->global_gotsym = NULL; g->global_gotno = 0; g->tls_gotno = 0; g->local_gotno = MIPS_RESERVED_GOTNO (info); g->assigned_gotno = MIPS_RESERVED_GOTNO (info); g->bfd2got = NULL; g->next = NULL; g->tls_ldm_offset = MINUS_ONE; g->got_entries = htab_try_create (1, mips_elf_got_entry_hash, mips_elf_got_entry_eq, NULL); if (g->got_entries == NULL) return FALSE; mips_elf_section_data (s)->u.got_info = g; mips_elf_section_data (s)->elf.this_hdr.sh_flags |= SHF_ALLOC | SHF_WRITE | SHF_MIPS_GPREL; /* VxWorks also needs a .got.plt section. */ if (htab->is_vxworks) { s = bfd_make_section_with_flags (abfd, ".got.plt", SEC_ALLOC | SEC_LOAD | SEC_HAS_CONTENTS | SEC_IN_MEMORY | SEC_LINKER_CREATED); if (s == NULL || !bfd_set_section_alignment (abfd, s, 4)) return FALSE; htab->sgotplt = s; } return TRUE; } /* Return true if H refers to the special VxWorks __GOTT_BASE__ or __GOTT_INDEX__ symbols. These symbols are only special for shared objects; they are not used in executables. */ static bfd_boolean is_gott_symbol (struct bfd_link_info *info, struct elf_link_hash_entry *h) { return (mips_elf_hash_table (info)->is_vxworks && info->shared && (strcmp (h->root.root.string, "__GOTT_BASE__") == 0 || strcmp (h->root.root.string, "__GOTT_INDEX__") == 0)); } /* Calculate the value produced by the RELOCATION (which comes from the INPUT_BFD). The ADDEND is the addend to use for this RELOCATION; RELOCATION->R_ADDEND is ignored. The result of the relocation calculation is stored in VALUEP. REQUIRE_JALXP indicates whether or not the opcode used with this relocation must be JALX. This function returns bfd_reloc_continue if the caller need take no further action regarding this relocation, bfd_reloc_notsupported if something goes dramatically wrong, bfd_reloc_overflow if an overflow occurs, and bfd_reloc_ok to indicate success. */ static bfd_reloc_status_type mips_elf_calculate_relocation (bfd *abfd, bfd *input_bfd, asection *input_section, struct bfd_link_info *info, const Elf_Internal_Rela *relocation, bfd_vma addend, reloc_howto_type *howto, Elf_Internal_Sym *local_syms, asection **local_sections, bfd_vma *valuep, const char **namep, bfd_boolean *require_jalxp, bfd_boolean save_addend) { /* The eventual value we will return. */ bfd_vma value; /* The address of the symbol against which the relocation is occurring. */ bfd_vma symbol = 0; /* The final GP value to be used for the relocatable, executable, or shared object file being produced. */ bfd_vma gp = MINUS_ONE; /* The place (section offset or address) of the storage unit being relocated. */ bfd_vma p; /* The value of GP used to create the relocatable object. */ bfd_vma gp0 = MINUS_ONE; /* The offset into the global offset table at which the address of the relocation entry symbol, adjusted by the addend, resides during execution. */ bfd_vma g = MINUS_ONE; /* The section in which the symbol referenced by the relocation is located. */ asection *sec = NULL; struct mips_elf_link_hash_entry *h = NULL; /* TRUE if the symbol referred to by this relocation is a local symbol. */ bfd_boolean local_p, was_local_p; /* TRUE if the symbol referred to by this relocation is "_gp_disp". */ bfd_boolean gp_disp_p = FALSE; /* TRUE if the symbol referred to by this relocation is "__gnu_local_gp". */ bfd_boolean gnu_local_gp_p = FALSE; Elf_Internal_Shdr *symtab_hdr; size_t extsymoff; unsigned long r_symndx; int r_type; /* TRUE if overflow occurred during the calculation of the relocation value. */ bfd_boolean overflowed_p; /* TRUE if this relocation refers to a MIPS16 function. */ bfd_boolean target_is_16_bit_code_p = FALSE; struct mips_elf_link_hash_table *htab; bfd *dynobj; dynobj = elf_hash_table (info)->dynobj; htab = mips_elf_hash_table (info); /* Parse the relocation. */ r_symndx = ELF_R_SYM (input_bfd, relocation->r_info); r_type = ELF_R_TYPE (input_bfd, relocation->r_info); p = (input_section->output_section->vma + input_section->output_offset + relocation->r_offset); /* Assume that there will be no overflow. */ overflowed_p = FALSE; /* Figure out whether or not the symbol is local, and get the offset used in the array of hash table entries. */ symtab_hdr = &elf_tdata (input_bfd)->symtab_hdr; local_p = mips_elf_local_relocation_p (input_bfd, relocation, local_sections, FALSE); was_local_p = local_p; if (! elf_bad_symtab (input_bfd)) extsymoff = symtab_hdr->sh_info; else { /* The symbol table does not follow the rule that local symbols must come before globals. */ extsymoff = 0; } /* Figure out the value of the symbol. */ if (local_p) { Elf_Internal_Sym *sym; sym = local_syms + r_symndx; sec = local_sections[r_symndx]; symbol = sec->output_section->vma + sec->output_offset; if (ELF_ST_TYPE (sym->st_info) != STT_SECTION || (sec->flags & SEC_MERGE)) symbol += sym->st_value; if ((sec->flags & SEC_MERGE) && ELF_ST_TYPE (sym->st_info) == STT_SECTION) { addend = _bfd_elf_rel_local_sym (abfd, sym, &sec, addend); addend -= symbol; addend += sec->output_section->vma + sec->output_offset; } /* MIPS16 text labels should be treated as odd. */ if (sym->st_other == STO_MIPS16) ++symbol; /* Record the name of this symbol, for our caller. */ *namep = bfd_elf_string_from_elf_section (input_bfd, symtab_hdr->sh_link, sym->st_name); if (*namep == '\0') *namep = bfd_section_name (input_bfd, sec); target_is_16_bit_code_p = (sym->st_other == STO_MIPS16); } else { /* ??? Could we use RELOC_FOR_GLOBAL_SYMBOL here ? */ /* For global symbols we look up the symbol in the hash-table. */ h = ((struct mips_elf_link_hash_entry *) elf_sym_hashes (input_bfd) [r_symndx - extsymoff]); /* Find the real hash-table entry for this symbol. */ while (h->root.root.type == bfd_link_hash_indirect || h->root.root.type == bfd_link_hash_warning) h = (struct mips_elf_link_hash_entry *) h->root.root.u.i.link; /* Record the name of this symbol, for our caller. */ *namep = h->root.root.root.string; /* See if this is the special _gp_disp symbol. Note that such a symbol must always be a global symbol. */ if (strcmp (*namep, "_gp_disp") == 0 && ! NEWABI_P (input_bfd)) { /* Relocations against _gp_disp are permitted only with R_MIPS_HI16 and R_MIPS_LO16 relocations. */ if (r_type != R_MIPS_HI16 && r_type != R_MIPS_LO16 && r_type != R_MIPS16_HI16 && r_type != R_MIPS16_LO16) return bfd_reloc_notsupported; gp_disp_p = TRUE; } /* See if this is the special _gp symbol. Note that such a symbol must always be a global symbol. */ else if (strcmp (*namep, "__gnu_local_gp") == 0) gnu_local_gp_p = TRUE; /* If this symbol is defined, calculate its address. Note that _gp_disp is a magic symbol, always implicitly defined by the linker, so it's inappropriate to check to see whether or not its defined. */ else if ((h->root.root.type == bfd_link_hash_defined || h->root.root.type == bfd_link_hash_defweak) && h->root.root.u.def.section) { sec = h->root.root.u.def.section; if (sec->output_section) symbol = (h->root.root.u.def.value + sec->output_section->vma + sec->output_offset); else symbol = h->root.root.u.def.value; } else if (h->root.root.type == bfd_link_hash_undefweak) /* We allow relocations against undefined weak symbols, giving it the value zero, so that you can undefined weak functions and check to see if they exist by looking at their addresses. */ symbol = 0; else if (info->unresolved_syms_in_objects == RM_IGNORE && ELF_ST_VISIBILITY (h->root.other) == STV_DEFAULT) symbol = 0; else if (strcmp (*namep, SGI_COMPAT (input_bfd) ? "_DYNAMIC_LINK" : "_DYNAMIC_LINKING") == 0) { /* If this is a dynamic link, we should have created a _DYNAMIC_LINK symbol or _DYNAMIC_LINKING(for normal mips) symbol in in _bfd_mips_elf_create_dynamic_sections. Otherwise, we should define the symbol with a value of 0. FIXME: It should probably get into the symbol table somehow as well. */ BFD_ASSERT (! info->shared); BFD_ASSERT (bfd_get_section_by_name (abfd, ".dynamic") == NULL); symbol = 0; } else if (ELF_MIPS_IS_OPTIONAL (h->root.other)) { /* This is an optional symbol - an Irix specific extension to the ELF spec. Ignore it for now. XXX - FIXME - there is more to the spec for OPTIONAL symbols than simply ignoring them, but we do not handle this for now. For information see the "64-bit ELF Object File Specification" which is available from here: http://techpubs.sgi.com/library/manuals/4000/007-4658-001/pdf/007-4658-001.pdf */ symbol = 0; } else { if (! ((*info->callbacks->undefined_symbol) (info, h->root.root.root.string, input_bfd, input_section, relocation->r_offset, (info->unresolved_syms_in_objects == RM_GENERATE_ERROR) || ELF_ST_VISIBILITY (h->root.other)))) return bfd_reloc_undefined; symbol = 0; } target_is_16_bit_code_p = (h->root.other == STO_MIPS16); } /* If this is a 32- or 64-bit call to a 16-bit function with a stub, we need to redirect the call to the stub, unless we're already *in* a stub. */ if (r_type != R_MIPS16_26 && !info->relocatable && ((h != NULL && h->fn_stub != NULL) || (local_p && elf_tdata (input_bfd)->local_stubs != NULL && elf_tdata (input_bfd)->local_stubs[r_symndx] != NULL)) && !mips16_stub_section_p (input_bfd, input_section)) { /* This is a 32- or 64-bit call to a 16-bit function. We should have already noticed that we were going to need the stub. */ if (local_p) sec = elf_tdata (input_bfd)->local_stubs[r_symndx]; else { BFD_ASSERT (h->need_fn_stub); sec = h->fn_stub; } symbol = sec->output_section->vma + sec->output_offset; /* The target is 16-bit, but the stub isn't. */ target_is_16_bit_code_p = FALSE; } /* If this is a 16-bit call to a 32- or 64-bit function with a stub, we need to redirect the call to the stub. */ else if (r_type == R_MIPS16_26 && !info->relocatable && ((h != NULL && (h->call_stub != NULL || h->call_fp_stub != NULL)) || (local_p && elf_tdata (input_bfd)->local_call_stubs != NULL && elf_tdata (input_bfd)->local_call_stubs[r_symndx] != NULL)) && !target_is_16_bit_code_p) { if (local_p) sec = elf_tdata (input_bfd)->local_call_stubs[r_symndx]; else { /* If both call_stub and call_fp_stub are defined, we can figure out which one to use by checking which one appears in the input file. */ if (h->call_stub != NULL && h->call_fp_stub != NULL) { asection *o; sec = NULL; for (o = input_bfd->sections; o != NULL; o = o->next) { if (CALL_FP_STUB_P (bfd_get_section_name (input_bfd, o))) { sec = h->call_fp_stub; break; } } if (sec == NULL) sec = h->call_stub; } else if (h->call_stub != NULL) sec = h->call_stub; else sec = h->call_fp_stub; } BFD_ASSERT (sec->size > 0); symbol = sec->output_section->vma + sec->output_offset; } /* Calls from 16-bit code to 32-bit code and vice versa require the special jalx instruction. */ *require_jalxp = (!info->relocatable && (((r_type == R_MIPS16_26) && !target_is_16_bit_code_p) || ((r_type == R_MIPS_26) && target_is_16_bit_code_p))); local_p = mips_elf_local_relocation_p (input_bfd, relocation, local_sections, TRUE); /* If we haven't already determined the GOT offset, or the GP value, and we're going to need it, get it now. */ switch (r_type) { case R_MIPS_GOT_PAGE: case R_MIPS_GOT_OFST: /* We need to decay to GOT_DISP/addend if the symbol doesn't bind locally. */ local_p = local_p || _bfd_elf_symbol_refs_local_p (&h->root, info, 1); if (local_p || r_type == R_MIPS_GOT_OFST) break; /* Fall through. */ case R_MIPS_CALL16: case R_MIPS_GOT16: case R_MIPS_GOT_DISP: case R_MIPS_GOT_HI16: case R_MIPS_CALL_HI16: case R_MIPS_GOT_LO16: case R_MIPS_CALL_LO16: case R_MIPS_TLS_GD: case R_MIPS_TLS_GOTTPREL: case R_MIPS_TLS_LDM: /* Find the index into the GOT where this value is located. */ if (r_type == R_MIPS_TLS_LDM) { g = mips_elf_local_got_index (abfd, input_bfd, info, 0, 0, NULL, r_type); if (g == MINUS_ONE) return bfd_reloc_outofrange; } else if (!local_p) { /* On VxWorks, CALL relocations should refer to the .got.plt entry, which is initialized to point at the PLT stub. */ if (htab->is_vxworks && (r_type == R_MIPS_CALL_HI16 || r_type == R_MIPS_CALL_LO16 || r_type == R_MIPS_CALL16)) { BFD_ASSERT (addend == 0); BFD_ASSERT (h->root.needs_plt); g = mips_elf_gotplt_index (info, &h->root); } else { /* GOT_PAGE may take a non-zero addend, that is ignored in a GOT_PAGE relocation that decays to GOT_DISP because the symbol turns out to be global. The addend is then added as GOT_OFST. */ BFD_ASSERT (addend == 0 || r_type == R_MIPS_GOT_PAGE); g = mips_elf_global_got_index (dynobj, input_bfd, &h->root, r_type, info); if (h->tls_type == GOT_NORMAL && (! elf_hash_table(info)->dynamic_sections_created || (info->shared && (info->symbolic || h->root.forced_local) && h->root.def_regular))) { /* This is a static link or a -Bsymbolic link. The symbol is defined locally, or was forced to be local. We must initialize this entry in the GOT. */ asection *sgot = mips_elf_got_section (dynobj, FALSE); MIPS_ELF_PUT_WORD (dynobj, symbol, sgot->contents + g); } } } else if (!htab->is_vxworks && (r_type == R_MIPS_CALL16 || (r_type == R_MIPS_GOT16))) /* The calculation below does not involve "g". */ break; else { g = mips_elf_local_got_index (abfd, input_bfd, info, symbol + addend, r_symndx, h, r_type); if (g == MINUS_ONE) return bfd_reloc_outofrange; } /* Convert GOT indices to actual offsets. */ g = mips_elf_got_offset_from_index (dynobj, abfd, input_bfd, g); break; case R_MIPS_HI16: case R_MIPS_LO16: case R_MIPS_GPREL16: case R_MIPS_GPREL32: case R_MIPS_LITERAL: case R_MIPS16_HI16: case R_MIPS16_LO16: case R_MIPS16_GPREL: gp0 = _bfd_get_gp_value (input_bfd); gp = _bfd_get_gp_value (abfd); if (dynobj) gp += mips_elf_adjust_gp (abfd, mips_elf_got_info (dynobj, NULL), input_bfd); break; default: break; } if (gnu_local_gp_p) symbol = gp; /* Relocations against the VxWorks __GOTT_BASE__ and __GOTT_INDEX__ symbols are resolved by the loader. Add them to .rela.dyn. */ if (h != NULL && is_gott_symbol (info, &h->root)) { Elf_Internal_Rela outrel; bfd_byte *loc; asection *s; s = mips_elf_rel_dyn_section (info, FALSE); loc = s->contents + s->reloc_count++ * sizeof (Elf32_External_Rela); outrel.r_offset = (input_section->output_section->vma + input_section->output_offset + relocation->r_offset); outrel.r_info = ELF32_R_INFO (h->root.dynindx, r_type); outrel.r_addend = addend; bfd_elf32_swap_reloca_out (abfd, &outrel, loc); /* If we've written this relocation for a readonly section, we need to set DF_TEXTREL again, so that we do not delete the DT_TEXTREL tag. */ if (MIPS_ELF_READONLY_SECTION (input_section)) info->flags |= DF_TEXTREL; *valuep = 0; return bfd_reloc_ok; } /* Figure out what kind of relocation is being performed. */ switch (r_type) { case R_MIPS_NONE: return bfd_reloc_continue; case R_MIPS_16: value = symbol + _bfd_mips_elf_sign_extend (addend, 16); overflowed_p = mips_elf_overflow_p (value, 16); break; case R_MIPS_32: case R_MIPS_REL32: case R_MIPS_64: if ((info->shared || (!htab->is_vxworks && htab->root.dynamic_sections_created && h != NULL && h->root.def_dynamic && !h->root.def_regular)) && r_symndx != 0 && (input_section->flags & SEC_ALLOC) != 0) { /* If we're creating a shared library, or this relocation is against a symbol in a shared library, then we can't know where the symbol will end up. So, we create a relocation record in the output, and leave the job up to the dynamic linker. In VxWorks executables, references to external symbols are handled using copy relocs or PLT stubs, so there's no need to add a dynamic relocation here. */ value = addend; if (!mips_elf_create_dynamic_relocation (abfd, info, relocation, h, sec, symbol, &value, input_section)) return bfd_reloc_undefined; } else { if (r_type != R_MIPS_REL32) value = symbol + addend; else value = addend; } value &= howto->dst_mask; break; case R_MIPS_PC32: value = symbol + addend - p; value &= howto->dst_mask; break; case R_MIPS16_26: /* The calculation for R_MIPS16_26 is just the same as for an R_MIPS_26. It's only the storage of the relocated field into the output file that's different. That's handled in mips_elf_perform_relocation. So, we just fall through to the R_MIPS_26 case here. */ case R_MIPS_26: if (local_p) value = ((addend | ((p + 4) & 0xf0000000)) + symbol) >> 2; else { value = (_bfd_mips_elf_sign_extend (addend, 28) + symbol) >> 2; if (h->root.root.type != bfd_link_hash_undefweak) overflowed_p = (value >> 26) != ((p + 4) >> 28); } value &= howto->dst_mask; break; case R_MIPS_TLS_DTPREL_HI16: value = (mips_elf_high (addend + symbol - dtprel_base (info)) & howto->dst_mask); break; case R_MIPS_TLS_DTPREL_LO16: case R_MIPS_TLS_DTPREL32: case R_MIPS_TLS_DTPREL64: value = (symbol + addend - dtprel_base (info)) & howto->dst_mask; break; case R_MIPS_TLS_TPREL_HI16: value = (mips_elf_high (addend + symbol - tprel_base (info)) & howto->dst_mask); break; case R_MIPS_TLS_TPREL_LO16: value = (symbol + addend - tprel_base (info)) & howto->dst_mask; break; case R_MIPS_HI16: case R_MIPS16_HI16: if (!gp_disp_p) { value = mips_elf_high (addend + symbol); value &= howto->dst_mask; } else { /* For MIPS16 ABI code we generate this sequence 0: li $v0,%hi(_gp_disp) 4: addiupc $v1,%lo(_gp_disp) 8: sll $v0,16 12: addu $v0,$v1 14: move $gp,$v0 So the offsets of hi and lo relocs are the same, but the $pc is four higher than $t9 would be, so reduce both reloc addends by 4. */ if (r_type == R_MIPS16_HI16) value = mips_elf_high (addend + gp - p - 4); else value = mips_elf_high (addend + gp - p); overflowed_p = mips_elf_overflow_p (value, 16); } break; case R_MIPS_LO16: case R_MIPS16_LO16: if (!gp_disp_p) value = (symbol + addend) & howto->dst_mask; else { /* See the comment for R_MIPS16_HI16 above for the reason for this conditional. */ if (r_type == R_MIPS16_LO16) value = addend + gp - p; else value = addend + gp - p + 4; /* The MIPS ABI requires checking the R_MIPS_LO16 relocation for overflow. But, on, say, IRIX5, relocations against _gp_disp are normally generated from the .cpload pseudo-op. It generates code that normally looks like this: lui $gp,%hi(_gp_disp) addiu $gp,$gp,%lo(_gp_disp) addu $gp,$gp,$t9 Here $t9 holds the address of the function being called, as required by the MIPS ELF ABI. The R_MIPS_LO16 relocation can easily overflow in this situation, but the R_MIPS_HI16 relocation will handle the overflow. Therefore, we consider this a bug in the MIPS ABI, and do not check for overflow here. */ } break; case R_MIPS_LITERAL: /* Because we don't merge literal sections, we can handle this just like R_MIPS_GPREL16. In the long run, we should merge shared literals, and then we will need to additional work here. */ /* Fall through. */ case R_MIPS16_GPREL: /* The R_MIPS16_GPREL performs the same calculation as R_MIPS_GPREL16, but stores the relocated bits in a different order. We don't need to do anything special here; the differences are handled in mips_elf_perform_relocation. */ case R_MIPS_GPREL16: /* Only sign-extend the addend if it was extracted from the instruction. If the addend was separate, leave it alone, otherwise we may lose significant bits. */ if (howto->partial_inplace) addend = _bfd_mips_elf_sign_extend (addend, 16); value = symbol + addend - gp; /* If the symbol was local, any earlier relocatable links will have adjusted its addend with the gp offset, so compensate for that now. Don't do it for symbols forced local in this link, though, since they won't have had the gp offset applied to them before. */ if (was_local_p) value += gp0; overflowed_p = mips_elf_overflow_p (value, 16); break; case R_MIPS_GOT16: case R_MIPS_CALL16: /* VxWorks does not have separate local and global semantics for R_MIPS_GOT16; every relocation evaluates to "G". */ if (!htab->is_vxworks && local_p) { bfd_boolean forced; forced = ! mips_elf_local_relocation_p (input_bfd, relocation, local_sections, FALSE); value = mips_elf_got16_entry (abfd, input_bfd, info, symbol + addend, forced); if (value == MINUS_ONE) return bfd_reloc_outofrange; value = mips_elf_got_offset_from_index (dynobj, abfd, input_bfd, value); overflowed_p = mips_elf_overflow_p (value, 16); break; } /* Fall through. */ case R_MIPS_TLS_GD: case R_MIPS_TLS_GOTTPREL: case R_MIPS_TLS_LDM: case R_MIPS_GOT_DISP: got_disp: value = g; overflowed_p = mips_elf_overflow_p (value, 16); break; case R_MIPS_GPREL32: value = (addend + symbol + gp0 - gp); if (!save_addend) value &= howto->dst_mask; break; case R_MIPS_PC16: case R_MIPS_GNU_REL16_S2: value = symbol + _bfd_mips_elf_sign_extend (addend, 18) - p; overflowed_p = mips_elf_overflow_p (value, 18); value >>= howto->rightshift; value &= howto->dst_mask; break; case R_MIPS_GOT_HI16: case R_MIPS_CALL_HI16: /* We're allowed to handle these two relocations identically. The dynamic linker is allowed to handle the CALL relocations differently by creating a lazy evaluation stub. */ value = g; value = mips_elf_high (value); value &= howto->dst_mask; break; case R_MIPS_GOT_LO16: case R_MIPS_CALL_LO16: value = g & howto->dst_mask; break; case R_MIPS_GOT_PAGE: /* GOT_PAGE relocations that reference non-local symbols decay to GOT_DISP. The corresponding GOT_OFST relocation decays to 0. */ if (! local_p) goto got_disp; value = mips_elf_got_page (abfd, input_bfd, info, symbol + addend, NULL); if (value == MINUS_ONE) return bfd_reloc_outofrange; value = mips_elf_got_offset_from_index (dynobj, abfd, input_bfd, value); overflowed_p = mips_elf_overflow_p (value, 16); break; case R_MIPS_GOT_OFST: if (local_p) mips_elf_got_page (abfd, input_bfd, info, symbol + addend, &value); else value = addend; overflowed_p = mips_elf_overflow_p (value, 16); break; case R_MIPS_SUB: value = symbol - addend; value &= howto->dst_mask; break; case R_MIPS_HIGHER: value = mips_elf_higher (addend + symbol); value &= howto->dst_mask; break; case R_MIPS_HIGHEST: value = mips_elf_highest (addend + symbol); value &= howto->dst_mask; break; case R_MIPS_SCN_DISP: value = symbol + addend - sec->output_offset; value &= howto->dst_mask; break; case R_MIPS_JALR: /* This relocation is only a hint. In some cases, we optimize it into a bal instruction. But we don't try to optimize branches to the PLT; that will wind up wasting time. */ if (h != NULL && h->root.plt.offset != (bfd_vma) -1) return bfd_reloc_continue; value = symbol + addend; break; case R_MIPS_PJUMP: case R_MIPS_GNU_VTINHERIT: case R_MIPS_GNU_VTENTRY: /* We don't do anything with these at present. */ return bfd_reloc_continue; default: /* An unrecognized relocation type. */ return bfd_reloc_notsupported; } /* Store the VALUE for our caller. */ *valuep = value; return overflowed_p ? bfd_reloc_overflow : bfd_reloc_ok; } /* Obtain the field relocated by RELOCATION. */ static bfd_vma mips_elf_obtain_contents (reloc_howto_type *howto, const Elf_Internal_Rela *relocation, bfd *input_bfd, bfd_byte *contents) { bfd_vma x; bfd_byte *location = contents + relocation->r_offset; /* Obtain the bytes. */ x = bfd_get ((8 * bfd_get_reloc_size (howto)), input_bfd, location); return x; } /* It has been determined that the result of the RELOCATION is the VALUE. Use HOWTO to place VALUE into the output file at the appropriate position. The SECTION is the section to which the relocation applies. If REQUIRE_JALX is TRUE, then the opcode used for the relocation must be either JAL or JALX, and it is unconditionally converted to JALX. Returns FALSE if anything goes wrong. */ static bfd_boolean mips_elf_perform_relocation (struct bfd_link_info *info, reloc_howto_type *howto, const Elf_Internal_Rela *relocation, bfd_vma value, bfd *input_bfd, asection *input_section, bfd_byte *contents, bfd_boolean require_jalx) { bfd_vma x; bfd_byte *location; int r_type = ELF_R_TYPE (input_bfd, relocation->r_info); /* Figure out where the relocation is occurring. */ location = contents + relocation->r_offset; _bfd_mips16_elf_reloc_unshuffle (input_bfd, r_type, FALSE, location); /* Obtain the current value. */ x = mips_elf_obtain_contents (howto, relocation, input_bfd, contents); /* Clear the field we are setting. */ x &= ~howto->dst_mask; /* Set the field. */ x |= (value & howto->dst_mask); /* If required, turn JAL into JALX. */ if (require_jalx) { bfd_boolean ok; bfd_vma opcode = x >> 26; bfd_vma jalx_opcode; /* Check to see if the opcode is already JAL or JALX. */ if (r_type == R_MIPS16_26) { ok = ((opcode == 0x6) || (opcode == 0x7)); jalx_opcode = 0x7; } else { ok = ((opcode == 0x3) || (opcode == 0x1d)); jalx_opcode = 0x1d; } /* If the opcode is not JAL or JALX, there's a problem. */ if (!ok) { (*_bfd_error_handler) (_("%B: %A+0x%lx: jump to stub routine which is not jal"), input_bfd, input_section, (unsigned long) relocation->r_offset); bfd_set_error (bfd_error_bad_value); return FALSE; } /* Make this the JALX opcode. */ x = (x & ~(0x3f << 26)) | (jalx_opcode << 26); } /* On the RM9000, bal is faster than jal, because bal uses branch prediction hardware. If we are linking for the RM9000, and we see jal, and bal fits, use it instead. Note that this transformation should be safe for all architectures. */ if (bfd_get_mach (input_bfd) == bfd_mach_mips9000 && !info->relocatable && !require_jalx && ((r_type == R_MIPS_26 && (x >> 26) == 0x3) /* jal addr */ || (r_type == R_MIPS_JALR && x == 0x0320f809))) /* jalr t9 */ { bfd_vma addr; bfd_vma dest; bfd_signed_vma off; addr = (input_section->output_section->vma + input_section->output_offset + relocation->r_offset + 4); if (r_type == R_MIPS_26) dest = (value << 2) | ((addr >> 28) << 28); else dest = value; off = dest - addr; if (off <= 0x1ffff && off >= -0x20000) x = 0x04110000 | (((bfd_vma) off >> 2) & 0xffff); /* bal addr */ } /* Put the value into the output. */ bfd_put (8 * bfd_get_reloc_size (howto), input_bfd, x, location); _bfd_mips16_elf_reloc_shuffle(input_bfd, r_type, !info->relocatable, location); return TRUE; } /* Returns TRUE if SECTION is a MIPS16 stub section. */ static bfd_boolean mips16_stub_section_p (bfd *abfd ATTRIBUTE_UNUSED, asection *section) { const char *name = bfd_get_section_name (abfd, section); return FN_STUB_P (name) || CALL_STUB_P (name) || CALL_FP_STUB_P (name); } /* Add room for N relocations to the .rel(a).dyn section in ABFD. */ static void mips_elf_allocate_dynamic_relocations (bfd *abfd, struct bfd_link_info *info, unsigned int n) { asection *s; struct mips_elf_link_hash_table *htab; htab = mips_elf_hash_table (info); s = mips_elf_rel_dyn_section (info, FALSE); BFD_ASSERT (s != NULL); if (htab->is_vxworks) s->size += n * MIPS_ELF_RELA_SIZE (abfd); else { if (s->size == 0) { /* Make room for a null element. */ s->size += MIPS_ELF_REL_SIZE (abfd); ++s->reloc_count; } s->size += n * MIPS_ELF_REL_SIZE (abfd); } } /* Create a rel.dyn relocation for the dynamic linker to resolve. REL is the original relocation, which is now being transformed into a dynamic relocation. The ADDENDP is adjusted if necessary; the caller should store the result in place of the original addend. */ static bfd_boolean mips_elf_create_dynamic_relocation (bfd *output_bfd, struct bfd_link_info *info, const Elf_Internal_Rela *rel, struct mips_elf_link_hash_entry *h, asection *sec, bfd_vma symbol, bfd_vma *addendp, asection *input_section) { Elf_Internal_Rela outrel[3]; asection *sreloc; bfd *dynobj; int r_type; long indx; bfd_boolean defined_p; struct mips_elf_link_hash_table *htab; htab = mips_elf_hash_table (info); r_type = ELF_R_TYPE (output_bfd, rel->r_info); dynobj = elf_hash_table (info)->dynobj; sreloc = mips_elf_rel_dyn_section (info, FALSE); BFD_ASSERT (sreloc != NULL); BFD_ASSERT (sreloc->contents != NULL); BFD_ASSERT (sreloc->reloc_count * MIPS_ELF_REL_SIZE (output_bfd) < sreloc->size); outrel[0].r_offset = _bfd_elf_section_offset (output_bfd, info, input_section, rel[0].r_offset); if (ABI_64_P (output_bfd)) { outrel[1].r_offset = _bfd_elf_section_offset (output_bfd, info, input_section, rel[1].r_offset); outrel[2].r_offset = _bfd_elf_section_offset (output_bfd, info, input_section, rel[2].r_offset); } if (outrel[0].r_offset == MINUS_ONE) /* The relocation field has been deleted. */ return TRUE; if (outrel[0].r_offset == MINUS_TWO) { /* The relocation field has been converted into a relative value of some sort. Functions like _bfd_elf_write_section_eh_frame expect the field to be fully relocated, so add in the symbol's value. */ *addendp += symbol; return TRUE; } /* We must now calculate the dynamic symbol table index to use in the relocation. */ if (h != NULL - && (!h->root.def_regular + && (sec == NULL || !h->root.def_regular || (info->shared && !info->symbolic && !h->root.forced_local))) { indx = h->root.dynindx; if (SGI_COMPAT (output_bfd)) defined_p = h->root.def_regular; else /* ??? glibc's ld.so just adds the final GOT entry to the relocation field. It therefore treats relocs against defined symbols in the same way as relocs against undefined symbols. */ defined_p = FALSE; } else { if (sec != NULL && bfd_is_abs_section (sec)) indx = 0; else if (sec == NULL || sec->owner == NULL) { bfd_set_error (bfd_error_bad_value); return FALSE; } else { indx = elf_section_data (sec->output_section)->dynindx; if (indx == 0) { asection *osec = htab->root.text_index_section; indx = elf_section_data (osec)->dynindx; } if (indx == 0) abort (); } /* Instead of generating a relocation using the section symbol, we may as well make it a fully relative relocation. We want to avoid generating relocations to local symbols because we used to generate them incorrectly, without adding the original symbol value, which is mandated by the ABI for section symbols. In order to give dynamic loaders and applications time to phase out the incorrect use, we refrain from emitting section-relative relocations. It's not like they're useful, after all. This should be a bit more efficient as well. */ /* ??? Although this behavior is compatible with glibc's ld.so, the ABI says that relocations against STN_UNDEF should have a symbol value of 0. Irix rld honors this, so relocations against STN_UNDEF have no effect. */ if (!SGI_COMPAT (output_bfd)) indx = 0; defined_p = TRUE; } /* If the relocation was previously an absolute relocation and this symbol will not be referred to by the relocation, we must adjust it by the value we give it in the dynamic symbol table. Otherwise leave the job up to the dynamic linker. */ if (defined_p && r_type != R_MIPS_REL32) *addendp += symbol; if (htab->is_vxworks) /* VxWorks uses non-relative relocations for this. */ outrel[0].r_info = ELF32_R_INFO (indx, R_MIPS_32); else /* The relocation is always an REL32 relocation because we don't know where the shared library will wind up at load-time. */ outrel[0].r_info = ELF_R_INFO (output_bfd, (unsigned long) indx, R_MIPS_REL32); /* For strict adherence to the ABI specification, we should generate a R_MIPS_64 relocation record by itself before the _REL32/_64 record as well, such that the addend is read in as a 64-bit value (REL32 is a 32-bit relocation, after all). However, since none of the existing ELF64 MIPS dynamic loaders seems to care, we don't waste space with these artificial relocations. If this turns out to not be true, mips_elf_allocate_dynamic_relocation() should be tweaked so as to make room for a pair of dynamic relocations per invocation if ABI_64_P, and here we should generate an additional relocation record with R_MIPS_64 by itself for a NULL symbol before this relocation record. */ outrel[1].r_info = ELF_R_INFO (output_bfd, 0, ABI_64_P (output_bfd) ? R_MIPS_64 : R_MIPS_NONE); outrel[2].r_info = ELF_R_INFO (output_bfd, 0, R_MIPS_NONE); /* Adjust the output offset of the relocation to reference the correct location in the output file. */ outrel[0].r_offset += (input_section->output_section->vma + input_section->output_offset); outrel[1].r_offset += (input_section->output_section->vma + input_section->output_offset); outrel[2].r_offset += (input_section->output_section->vma + input_section->output_offset); /* Put the relocation back out. We have to use the special relocation outputter in the 64-bit case since the 64-bit relocation format is non-standard. */ if (ABI_64_P (output_bfd)) { (*get_elf_backend_data (output_bfd)->s->swap_reloc_out) (output_bfd, &outrel[0], (sreloc->contents + sreloc->reloc_count * sizeof (Elf64_Mips_External_Rel))); } else if (htab->is_vxworks) { /* VxWorks uses RELA rather than REL dynamic relocations. */ outrel[0].r_addend = *addendp; bfd_elf32_swap_reloca_out (output_bfd, &outrel[0], (sreloc->contents + sreloc->reloc_count * sizeof (Elf32_External_Rela))); } else bfd_elf32_swap_reloc_out (output_bfd, &outrel[0], (sreloc->contents + sreloc->reloc_count * sizeof (Elf32_External_Rel))); /* We've now added another relocation. */ ++sreloc->reloc_count; /* Make sure the output section is writable. The dynamic linker will be writing to it. */ elf_section_data (input_section->output_section)->this_hdr.sh_flags |= SHF_WRITE; /* On IRIX5, make an entry of compact relocation info. */ if (IRIX_COMPAT (output_bfd) == ict_irix5) { asection *scpt = bfd_get_section_by_name (dynobj, ".compact_rel"); bfd_byte *cr; if (scpt) { Elf32_crinfo cptrel; mips_elf_set_cr_format (cptrel, CRF_MIPS_LONG); cptrel.vaddr = (rel->r_offset + input_section->output_section->vma + input_section->output_offset); if (r_type == R_MIPS_REL32) mips_elf_set_cr_type (cptrel, CRT_MIPS_REL32); else mips_elf_set_cr_type (cptrel, CRT_MIPS_WORD); mips_elf_set_cr_dist2to (cptrel, 0); cptrel.konst = *addendp; cr = (scpt->contents + sizeof (Elf32_External_compact_rel)); mips_elf_set_cr_relvaddr (cptrel, 0); bfd_elf32_swap_crinfo_out (output_bfd, &cptrel, ((Elf32_External_crinfo *) cr + scpt->reloc_count)); ++scpt->reloc_count; } } /* If we've written this relocation for a readonly section, we need to set DF_TEXTREL again, so that we do not delete the DT_TEXTREL tag. */ if (MIPS_ELF_READONLY_SECTION (input_section)) info->flags |= DF_TEXTREL; return TRUE; } /* Return the MACH for a MIPS e_flags value. */ unsigned long _bfd_elf_mips_mach (flagword flags) { switch (flags & EF_MIPS_MACH) { case E_MIPS_MACH_3900: return bfd_mach_mips3900; case E_MIPS_MACH_4010: return bfd_mach_mips4010; case E_MIPS_MACH_4100: return bfd_mach_mips4100; case E_MIPS_MACH_4111: return bfd_mach_mips4111; case E_MIPS_MACH_4120: return bfd_mach_mips4120; case E_MIPS_MACH_4650: return bfd_mach_mips4650; case E_MIPS_MACH_5400: return bfd_mach_mips5400; case E_MIPS_MACH_5500: return bfd_mach_mips5500; case E_MIPS_MACH_9000: return bfd_mach_mips9000; case E_MIPS_MACH_OCTEON: return bfd_mach_mips_octeon; case E_MIPS_MACH_SB1: return bfd_mach_mips_sb1; default: switch (flags & EF_MIPS_ARCH) { default: case E_MIPS_ARCH_1: return bfd_mach_mips3000; case E_MIPS_ARCH_2: return bfd_mach_mips6000; case E_MIPS_ARCH_3: return bfd_mach_mips4000; case E_MIPS_ARCH_4: return bfd_mach_mips8000; case E_MIPS_ARCH_5: return bfd_mach_mips5; case E_MIPS_ARCH_32: return bfd_mach_mipsisa32; case E_MIPS_ARCH_64: return bfd_mach_mipsisa64; case E_MIPS_ARCH_32R2: return bfd_mach_mipsisa32r2; case E_MIPS_ARCH_64R2: return bfd_mach_mipsisa64r2; } } return 0; } /* Return printable name for ABI. */ static INLINE char * elf_mips_abi_name (bfd *abfd) { flagword flags; flags = elf_elfheader (abfd)->e_flags; switch (flags & EF_MIPS_ABI) { case 0: if (ABI_N32_P (abfd)) return "N32"; else if (ABI_64_P (abfd)) return "64"; else return "none"; case E_MIPS_ABI_O32: return "O32"; case E_MIPS_ABI_O64: return "O64"; case E_MIPS_ABI_EABI32: return "EABI32"; case E_MIPS_ABI_EABI64: return "EABI64"; default: return "unknown abi"; } } /* MIPS ELF uses two common sections. One is the usual one, and the other is for small objects. All the small objects are kept together, and then referenced via the gp pointer, which yields faster assembler code. This is what we use for the small common section. This approach is copied from ecoff.c. */ static asection mips_elf_scom_section; static asymbol mips_elf_scom_symbol; static asymbol *mips_elf_scom_symbol_ptr; /* MIPS ELF also uses an acommon section, which represents an allocated common symbol which may be overridden by a definition in a shared library. */ static asection mips_elf_acom_section; static asymbol mips_elf_acom_symbol; static asymbol *mips_elf_acom_symbol_ptr; /* Handle the special MIPS section numbers that a symbol may use. This is used for both the 32-bit and the 64-bit ABI. */ void _bfd_mips_elf_symbol_processing (bfd *abfd, asymbol *asym) { elf_symbol_type *elfsym; elfsym = (elf_symbol_type *) asym; switch (elfsym->internal_elf_sym.st_shndx) { case SHN_MIPS_ACOMMON: /* This section is used in a dynamically linked executable file. It is an allocated common section. The dynamic linker can either resolve these symbols to something in a shared library, or it can just leave them here. For our purposes, we can consider these symbols to be in a new section. */ if (mips_elf_acom_section.name == NULL) { /* Initialize the acommon section. */ mips_elf_acom_section.name = ".acommon"; mips_elf_acom_section.flags = SEC_ALLOC; mips_elf_acom_section.output_section = &mips_elf_acom_section; mips_elf_acom_section.symbol = &mips_elf_acom_symbol; mips_elf_acom_section.symbol_ptr_ptr = &mips_elf_acom_symbol_ptr; mips_elf_acom_symbol.name = ".acommon"; mips_elf_acom_symbol.flags = BSF_SECTION_SYM; mips_elf_acom_symbol.section = &mips_elf_acom_section; mips_elf_acom_symbol_ptr = &mips_elf_acom_symbol; } asym->section = &mips_elf_acom_section; break; case SHN_COMMON: /* Common symbols less than the GP size are automatically treated as SHN_MIPS_SCOMMON symbols on IRIX5. */ if (asym->value > elf_gp_size (abfd) || ELF_ST_TYPE (elfsym->internal_elf_sym.st_info) == STT_TLS || IRIX_COMPAT (abfd) == ict_irix6) break; /* Fall through. */ case SHN_MIPS_SCOMMON: if (mips_elf_scom_section.name == NULL) { /* Initialize the small common section. */ mips_elf_scom_section.name = ".scommon"; mips_elf_scom_section.flags = SEC_IS_COMMON; mips_elf_scom_section.output_section = &mips_elf_scom_section; mips_elf_scom_section.symbol = &mips_elf_scom_symbol; mips_elf_scom_section.symbol_ptr_ptr = &mips_elf_scom_symbol_ptr; mips_elf_scom_symbol.name = ".scommon"; mips_elf_scom_symbol.flags = BSF_SECTION_SYM; mips_elf_scom_symbol.section = &mips_elf_scom_section; mips_elf_scom_symbol_ptr = &mips_elf_scom_symbol; } asym->section = &mips_elf_scom_section; asym->value = elfsym->internal_elf_sym.st_size; break; case SHN_MIPS_SUNDEFINED: asym->section = bfd_und_section_ptr; break; case SHN_MIPS_TEXT: { asection *section = bfd_get_section_by_name (abfd, ".text"); BFD_ASSERT (SGI_COMPAT (abfd)); if (section != NULL) { asym->section = section; /* MIPS_TEXT is a bit special, the address is not an offset to the base of the .text section. So substract the section base address to make it an offset. */ asym->value -= section->vma; } } break; case SHN_MIPS_DATA: { asection *section = bfd_get_section_by_name (abfd, ".data"); BFD_ASSERT (SGI_COMPAT (abfd)); if (section != NULL) { asym->section = section; /* MIPS_DATA is a bit special, the address is not an offset to the base of the .data section. So substract the section base address to make it an offset. */ asym->value -= section->vma; } } break; } } /* Implement elf_backend_eh_frame_address_size. This differs from the default in the way it handles EABI64. EABI64 was originally specified as an LP64 ABI, and that is what -mabi=eabi normally gives on a 64-bit target. However, gcc has historically accepted the combination of -mabi=eabi and -mlong32, and this ILP32 variation has become semi-official over time. Both forms use elf32 and have pointer-sized FDE addresses. If an EABI object was generated by GCC 4.0 or above, it will have an empty .gcc_compiled_longXX section, where XX is the size of longs in bits. Unfortunately, ILP32 objects generated by earlier compilers have no special marking to distinguish them from LP64 objects. We don't want users of the official LP64 ABI to be punished for the existence of the ILP32 variant, but at the same time, we don't want to mistakenly interpret pre-4.0 ILP32 objects as being LP64 objects. We therefore take the following approach: - If ABFD contains a .gcc_compiled_longXX section, use it to determine the pointer size. - Otherwise check the type of the first relocation. Assume that the LP64 ABI is being used if the relocation is of type R_MIPS_64. - Otherwise punt. The second check is enough to detect LP64 objects generated by pre-4.0 compilers because, in the kind of output generated by those compilers, the first relocation will be associated with either a CIE personality routine or an FDE start address. Furthermore, the compilers never used a special (non-pointer) encoding for this ABI. Checking the relocation type should also be safe because there is no reason to use R_MIPS_64 in an ILP32 object. Pre-4.0 compilers never did so. */ unsigned int _bfd_mips_elf_eh_frame_address_size (bfd *abfd, asection *sec) { if (elf_elfheader (abfd)->e_ident[EI_CLASS] == ELFCLASS64) return 8; if ((elf_elfheader (abfd)->e_flags & EF_MIPS_ABI) == E_MIPS_ABI_EABI64) { bfd_boolean long32_p, long64_p; long32_p = bfd_get_section_by_name (abfd, ".gcc_compiled_long32") != 0; long64_p = bfd_get_section_by_name (abfd, ".gcc_compiled_long64") != 0; if (long32_p && long64_p) return 0; if (long32_p) return 4; if (long64_p) return 8; if (sec->reloc_count > 0 && elf_section_data (sec)->relocs != NULL && (ELF32_R_TYPE (elf_section_data (sec)->relocs[0].r_info) == R_MIPS_64)) return 8; return 0; } return 4; } /* There appears to be a bug in the MIPSpro linker that causes GOT_DISP relocations against two unnamed section symbols to resolve to the same address. For example, if we have code like: lw $4,%got_disp(.data)($gp) lw $25,%got_disp(.text)($gp) jalr $25 then the linker will resolve both relocations to .data and the program will jump there rather than to .text. We can work around this problem by giving names to local section symbols. This is also what the MIPSpro tools do. */ bfd_boolean _bfd_mips_elf_name_local_section_symbols (bfd *abfd) { return SGI_COMPAT (abfd); } /* Work over a section just before writing it out. This routine is used by both the 32-bit and the 64-bit ABI. FIXME: We recognize sections that need the SHF_MIPS_GPREL flag by name; there has to be a better way. */ bfd_boolean _bfd_mips_elf_section_processing (bfd *abfd, Elf_Internal_Shdr *hdr) { if (hdr->sh_type == SHT_MIPS_REGINFO && hdr->sh_size > 0) { bfd_byte buf[4]; BFD_ASSERT (hdr->sh_size == sizeof (Elf32_External_RegInfo)); BFD_ASSERT (hdr->contents == NULL); if (bfd_seek (abfd, hdr->sh_offset + sizeof (Elf32_External_RegInfo) - 4, SEEK_SET) != 0) return FALSE; H_PUT_32 (abfd, elf_gp (abfd), buf); if (bfd_bwrite (buf, 4, abfd) != 4) return FALSE; } if (hdr->sh_type == SHT_MIPS_OPTIONS && hdr->bfd_section != NULL && mips_elf_section_data (hdr->bfd_section) != NULL && mips_elf_section_data (hdr->bfd_section)->u.tdata != NULL) { bfd_byte *contents, *l, *lend; /* We stored the section contents in the tdata field in the set_section_contents routine. We save the section contents so that we don't have to read them again. At this point we know that elf_gp is set, so we can look through the section contents to see if there is an ODK_REGINFO structure. */ contents = mips_elf_section_data (hdr->bfd_section)->u.tdata; l = contents; lend = contents + hdr->sh_size; while (l + sizeof (Elf_External_Options) <= lend) { Elf_Internal_Options intopt; bfd_mips_elf_swap_options_in (abfd, (Elf_External_Options *) l, &intopt); if (intopt.size < sizeof (Elf_External_Options)) { (*_bfd_error_handler) (_("%B: Warning: bad `%s' option size %u smaller than its header"), abfd, MIPS_ELF_OPTIONS_SECTION_NAME (abfd), intopt.size); break; } if (ABI_64_P (abfd) && intopt.kind == ODK_REGINFO) { bfd_byte buf[8]; if (bfd_seek (abfd, (hdr->sh_offset + (l - contents) + sizeof (Elf_External_Options) + (sizeof (Elf64_External_RegInfo) - 8)), SEEK_SET) != 0) return FALSE; H_PUT_64 (abfd, elf_gp (abfd), buf); if (bfd_bwrite (buf, 8, abfd) != 8) return FALSE; } else if (intopt.kind == ODK_REGINFO) { bfd_byte buf[4]; if (bfd_seek (abfd, (hdr->sh_offset + (l - contents) + sizeof (Elf_External_Options) + (sizeof (Elf32_External_RegInfo) - 4)), SEEK_SET) != 0) return FALSE; H_PUT_32 (abfd, elf_gp (abfd), buf); if (bfd_bwrite (buf, 4, abfd) != 4) return FALSE; } l += intopt.size; } } if (hdr->bfd_section != NULL) { const char *name = bfd_get_section_name (abfd, hdr->bfd_section); if (strcmp (name, ".sdata") == 0 || strcmp (name, ".lit8") == 0 || strcmp (name, ".lit4") == 0) { hdr->sh_flags |= SHF_ALLOC | SHF_WRITE | SHF_MIPS_GPREL; hdr->sh_type = SHT_PROGBITS; } else if (strcmp (name, ".sbss") == 0) { hdr->sh_flags |= SHF_ALLOC | SHF_WRITE | SHF_MIPS_GPREL; hdr->sh_type = SHT_NOBITS; } else if (strcmp (name, ".srdata") == 0) { hdr->sh_flags |= SHF_ALLOC | SHF_MIPS_GPREL; hdr->sh_type = SHT_PROGBITS; } else if (strcmp (name, ".compact_rel") == 0) { hdr->sh_flags = 0; hdr->sh_type = SHT_PROGBITS; } else if (strcmp (name, ".rtproc") == 0) { if (hdr->sh_addralign != 0 && hdr->sh_entsize == 0) { unsigned int adjust; adjust = hdr->sh_size % hdr->sh_addralign; if (adjust != 0) hdr->sh_size += hdr->sh_addralign - adjust; } } } return TRUE; } /* Handle a MIPS specific section when reading an object file. This is called when elfcode.h finds a section with an unknown type. This routine supports both the 32-bit and 64-bit ELF ABI. FIXME: We need to handle the SHF_MIPS_GPREL flag, but I'm not sure how to. */ bfd_boolean _bfd_mips_elf_section_from_shdr (bfd *abfd, Elf_Internal_Shdr *hdr, const char *name, int shindex) { flagword flags = 0; /* There ought to be a place to keep ELF backend specific flags, but at the moment there isn't one. We just keep track of the sections by their name, instead. Fortunately, the ABI gives suggested names for all the MIPS specific sections, so we will probably get away with this. */ switch (hdr->sh_type) { case SHT_MIPS_LIBLIST: if (strcmp (name, ".liblist") != 0) return FALSE; break; case SHT_MIPS_MSYM: if (strcmp (name, ".msym") != 0) return FALSE; break; case SHT_MIPS_CONFLICT: if (strcmp (name, ".conflict") != 0) return FALSE; break; case SHT_MIPS_GPTAB: if (! CONST_STRNEQ (name, ".gptab.")) return FALSE; break; case SHT_MIPS_UCODE: if (strcmp (name, ".ucode") != 0) return FALSE; break; case SHT_MIPS_DEBUG: if (strcmp (name, ".mdebug") != 0) return FALSE; flags = SEC_DEBUGGING; break; case SHT_MIPS_REGINFO: if (strcmp (name, ".reginfo") != 0 || hdr->sh_size != sizeof (Elf32_External_RegInfo)) return FALSE; flags = (SEC_LINK_ONCE | SEC_LINK_DUPLICATES_SAME_SIZE); break; case SHT_MIPS_IFACE: if (strcmp (name, ".MIPS.interfaces") != 0) return FALSE; break; case SHT_MIPS_CONTENT: if (! CONST_STRNEQ (name, ".MIPS.content")) return FALSE; break; case SHT_MIPS_OPTIONS: if (!MIPS_ELF_OPTIONS_SECTION_NAME_P (name)) return FALSE; break; case SHT_MIPS_DWARF: if (! CONST_STRNEQ (name, ".debug_")) return FALSE; break; case SHT_MIPS_SYMBOL_LIB: if (strcmp (name, ".MIPS.symlib") != 0) return FALSE; break; case SHT_MIPS_EVENTS: if (! CONST_STRNEQ (name, ".MIPS.events") && ! CONST_STRNEQ (name, ".MIPS.post_rel")) return FALSE; break; default: break; } if (! _bfd_elf_make_section_from_shdr (abfd, hdr, name, shindex)) return FALSE; if (flags) { if (! bfd_set_section_flags (abfd, hdr->bfd_section, (bfd_get_section_flags (abfd, hdr->bfd_section) | flags))) return FALSE; } /* FIXME: We should record sh_info for a .gptab section. */ /* For a .reginfo section, set the gp value in the tdata information from the contents of this section. We need the gp value while processing relocs, so we just get it now. The .reginfo section is not used in the 64-bit MIPS ELF ABI. */ if (hdr->sh_type == SHT_MIPS_REGINFO) { Elf32_External_RegInfo ext; Elf32_RegInfo s; if (! bfd_get_section_contents (abfd, hdr->bfd_section, &ext, 0, sizeof ext)) return FALSE; bfd_mips_elf32_swap_reginfo_in (abfd, &ext, &s); elf_gp (abfd) = s.ri_gp_value; } /* For a SHT_MIPS_OPTIONS section, look for a ODK_REGINFO entry, and set the gp value based on what we find. We may see both SHT_MIPS_REGINFO and SHT_MIPS_OPTIONS/ODK_REGINFO; in that case, they should agree. */ if (hdr->sh_type == SHT_MIPS_OPTIONS) { bfd_byte *contents, *l, *lend; contents = bfd_malloc (hdr->sh_size); if (contents == NULL) return FALSE; if (! bfd_get_section_contents (abfd, hdr->bfd_section, contents, 0, hdr->sh_size)) { free (contents); return FALSE; } l = contents; lend = contents + hdr->sh_size; while (l + sizeof (Elf_External_Options) <= lend) { Elf_Internal_Options intopt; bfd_mips_elf_swap_options_in (abfd, (Elf_External_Options *) l, &intopt); if (intopt.size < sizeof (Elf_External_Options)) { (*_bfd_error_handler) (_("%B: Warning: bad `%s' option size %u smaller than its header"), abfd, MIPS_ELF_OPTIONS_SECTION_NAME (abfd), intopt.size); break; } if (ABI_64_P (abfd) && intopt.kind == ODK_REGINFO) { Elf64_Internal_RegInfo intreg; bfd_mips_elf64_swap_reginfo_in (abfd, ((Elf64_External_RegInfo *) (l + sizeof (Elf_External_Options))), &intreg); elf_gp (abfd) = intreg.ri_gp_value; } else if (intopt.kind == ODK_REGINFO) { Elf32_RegInfo intreg; bfd_mips_elf32_swap_reginfo_in (abfd, ((Elf32_External_RegInfo *) (l + sizeof (Elf_External_Options))), &intreg); elf_gp (abfd) = intreg.ri_gp_value; } l += intopt.size; } free (contents); } return TRUE; } /* Set the correct type for a MIPS ELF section. We do this by the section name, which is a hack, but ought to work. This routine is used by both the 32-bit and the 64-bit ABI. */ bfd_boolean _bfd_mips_elf_fake_sections (bfd *abfd, Elf_Internal_Shdr *hdr, asection *sec) { const char *name = bfd_get_section_name (abfd, sec); if (strcmp (name, ".liblist") == 0) { hdr->sh_type = SHT_MIPS_LIBLIST; hdr->sh_info = sec->size / sizeof (Elf32_Lib); /* The sh_link field is set in final_write_processing. */ } else if (strcmp (name, ".conflict") == 0) hdr->sh_type = SHT_MIPS_CONFLICT; else if (CONST_STRNEQ (name, ".gptab.")) { hdr->sh_type = SHT_MIPS_GPTAB; hdr->sh_entsize = sizeof (Elf32_External_gptab); /* The sh_info field is set in final_write_processing. */ } else if (strcmp (name, ".ucode") == 0) hdr->sh_type = SHT_MIPS_UCODE; else if (strcmp (name, ".mdebug") == 0) { hdr->sh_type = SHT_MIPS_DEBUG; /* In a shared object on IRIX 5.3, the .mdebug section has an entsize of 0. FIXME: Does this matter? */ if (SGI_COMPAT (abfd) && (abfd->flags & DYNAMIC) != 0) hdr->sh_entsize = 0; else hdr->sh_entsize = 1; } else if (strcmp (name, ".reginfo") == 0) { hdr->sh_type = SHT_MIPS_REGINFO; /* In a shared object on IRIX 5.3, the .reginfo section has an entsize of 0x18. FIXME: Does this matter? */ if (SGI_COMPAT (abfd)) { if ((abfd->flags & DYNAMIC) != 0) hdr->sh_entsize = sizeof (Elf32_External_RegInfo); else hdr->sh_entsize = 1; } else hdr->sh_entsize = sizeof (Elf32_External_RegInfo); } else if (SGI_COMPAT (abfd) && (strcmp (name, ".hash") == 0 || strcmp (name, ".dynamic") == 0 || strcmp (name, ".dynstr") == 0)) { if (SGI_COMPAT (abfd)) hdr->sh_entsize = 0; #if 0 /* This isn't how the IRIX6 linker behaves. */ hdr->sh_info = SIZEOF_MIPS_DYNSYM_SECNAMES; #endif } else if (strcmp (name, ".got") == 0 || strcmp (name, ".srdata") == 0 || strcmp (name, ".sdata") == 0 || strcmp (name, ".sbss") == 0 || strcmp (name, ".lit4") == 0 || strcmp (name, ".lit8") == 0) hdr->sh_flags |= SHF_MIPS_GPREL; else if (strcmp (name, ".MIPS.interfaces") == 0) { hdr->sh_type = SHT_MIPS_IFACE; hdr->sh_flags |= SHF_MIPS_NOSTRIP; } else if (CONST_STRNEQ (name, ".MIPS.content")) { hdr->sh_type = SHT_MIPS_CONTENT; hdr->sh_flags |= SHF_MIPS_NOSTRIP; /* The sh_info field is set in final_write_processing. */ } else if (MIPS_ELF_OPTIONS_SECTION_NAME_P (name)) { hdr->sh_type = SHT_MIPS_OPTIONS; hdr->sh_entsize = 1; hdr->sh_flags |= SHF_MIPS_NOSTRIP; } else if (CONST_STRNEQ (name, ".debug_")) hdr->sh_type = SHT_MIPS_DWARF; else if (strcmp (name, ".MIPS.symlib") == 0) { hdr->sh_type = SHT_MIPS_SYMBOL_LIB; /* The sh_link and sh_info fields are set in final_write_processing. */ } else if (CONST_STRNEQ (name, ".MIPS.events") || CONST_STRNEQ (name, ".MIPS.post_rel")) { hdr->sh_type = SHT_MIPS_EVENTS; hdr->sh_flags |= SHF_MIPS_NOSTRIP; /* The sh_link field is set in final_write_processing. */ } else if (strcmp (name, ".msym") == 0) { hdr->sh_type = SHT_MIPS_MSYM; hdr->sh_flags |= SHF_ALLOC; hdr->sh_entsize = 8; } /* The generic elf_fake_sections will set up REL_HDR using the default kind of relocations. We used to set up a second header for the non-default kind of relocations here, but only NewABI would use these, and the IRIX ld doesn't like resulting empty RELA sections. Thus we create those header only on demand now. */ return TRUE; } /* Given a BFD section, try to locate the corresponding ELF section index. This is used by both the 32-bit and the 64-bit ABI. Actually, it's not clear to me that the 64-bit ABI supports these, but for non-PIC objects we will certainly want support for at least the .scommon section. */ bfd_boolean _bfd_mips_elf_section_from_bfd_section (bfd *abfd ATTRIBUTE_UNUSED, asection *sec, int *retval) { if (strcmp (bfd_get_section_name (abfd, sec), ".scommon") == 0) { *retval = SHN_MIPS_SCOMMON; return TRUE; } if (strcmp (bfd_get_section_name (abfd, sec), ".acommon") == 0) { *retval = SHN_MIPS_ACOMMON; return TRUE; } return FALSE; } /* Hook called by the linker routine which adds symbols from an object file. We must handle the special MIPS section numbers here. */ bfd_boolean _bfd_mips_elf_add_symbol_hook (bfd *abfd, struct bfd_link_info *info, Elf_Internal_Sym *sym, const char **namep, flagword *flagsp ATTRIBUTE_UNUSED, asection **secp, bfd_vma *valp) { if (SGI_COMPAT (abfd) && (abfd->flags & DYNAMIC) != 0 && strcmp (*namep, "_rld_new_interface") == 0) { /* Skip IRIX5 rld entry name. */ *namep = NULL; return TRUE; } /* Shared objects may have a dynamic symbol '_gp_disp' defined as a SECTION *ABS*. This causes ld to think it can resolve _gp_disp by setting a DT_NEEDED for the shared object. Since _gp_disp is a magic symbol resolved by the linker, we ignore this bogus definition of _gp_disp. New ABI objects do not suffer from this problem so this is not done for them. */ if (!NEWABI_P(abfd) && (sym->st_shndx == SHN_ABS) && (strcmp (*namep, "_gp_disp") == 0)) { *namep = NULL; return TRUE; } switch (sym->st_shndx) { case SHN_COMMON: /* Common symbols less than the GP size are automatically treated as SHN_MIPS_SCOMMON symbols. */ if (sym->st_size > elf_gp_size (abfd) || ELF_ST_TYPE (sym->st_info) == STT_TLS || IRIX_COMPAT (abfd) == ict_irix6) break; /* Fall through. */ case SHN_MIPS_SCOMMON: *secp = bfd_make_section_old_way (abfd, ".scommon"); (*secp)->flags |= SEC_IS_COMMON; *valp = sym->st_size; break; case SHN_MIPS_TEXT: /* This section is used in a shared object. */ if (elf_tdata (abfd)->elf_text_section == NULL) { asymbol *elf_text_symbol; asection *elf_text_section; bfd_size_type amt = sizeof (asection); elf_text_section = bfd_zalloc (abfd, amt); if (elf_text_section == NULL) return FALSE; amt = sizeof (asymbol); elf_text_symbol = bfd_zalloc (abfd, amt); if (elf_text_symbol == NULL) return FALSE; /* Initialize the section. */ elf_tdata (abfd)->elf_text_section = elf_text_section; elf_tdata (abfd)->elf_text_symbol = elf_text_symbol; elf_text_section->symbol = elf_text_symbol; elf_text_section->symbol_ptr_ptr = &elf_tdata (abfd)->elf_text_symbol; elf_text_section->name = ".text"; elf_text_section->flags = SEC_NO_FLAGS; elf_text_section->output_section = NULL; elf_text_section->owner = abfd; elf_text_symbol->name = ".text"; elf_text_symbol->flags = BSF_SECTION_SYM | BSF_DYNAMIC; elf_text_symbol->section = elf_text_section; } /* This code used to do *secp = bfd_und_section_ptr if info->shared. I don't know why, and that doesn't make sense, so I took it out. */ *secp = elf_tdata (abfd)->elf_text_section; break; case SHN_MIPS_ACOMMON: /* Fall through. XXX Can we treat this as allocated data? */ case SHN_MIPS_DATA: /* This section is used in a shared object. */ if (elf_tdata (abfd)->elf_data_section == NULL) { asymbol *elf_data_symbol; asection *elf_data_section; bfd_size_type amt = sizeof (asection); elf_data_section = bfd_zalloc (abfd, amt); if (elf_data_section == NULL) return FALSE; amt = sizeof (asymbol); elf_data_symbol = bfd_zalloc (abfd, amt); if (elf_data_symbol == NULL) return FALSE; /* Initialize the section. */ elf_tdata (abfd)->elf_data_section = elf_data_section; elf_tdata (abfd)->elf_data_symbol = elf_data_symbol; elf_data_section->symbol = elf_data_symbol; elf_data_section->symbol_ptr_ptr = &elf_tdata (abfd)->elf_data_symbol; elf_data_section->name = ".data"; elf_data_section->flags = SEC_NO_FLAGS; elf_data_section->output_section = NULL; elf_data_section->owner = abfd; elf_data_symbol->name = ".data"; elf_data_symbol->flags = BSF_SECTION_SYM | BSF_DYNAMIC; elf_data_symbol->section = elf_data_section; } /* This code used to do *secp = bfd_und_section_ptr if info->shared. I don't know why, and that doesn't make sense, so I took it out. */ *secp = elf_tdata (abfd)->elf_data_section; break; case SHN_MIPS_SUNDEFINED: *secp = bfd_und_section_ptr; break; } if (SGI_COMPAT (abfd) && ! info->shared && info->hash->creator == abfd->xvec && strcmp (*namep, "__rld_obj_head") == 0) { struct elf_link_hash_entry *h; struct bfd_link_hash_entry *bh; /* Mark __rld_obj_head as dynamic. */ bh = NULL; if (! (_bfd_generic_link_add_one_symbol (info, abfd, *namep, BSF_GLOBAL, *secp, *valp, NULL, FALSE, get_elf_backend_data (abfd)->collect, &bh))) return FALSE; h = (struct elf_link_hash_entry *) bh; h->non_elf = 0; h->def_regular = 1; h->type = STT_OBJECT; if (! bfd_elf_link_record_dynamic_symbol (info, h)) return FALSE; mips_elf_hash_table (info)->use_rld_obj_head = TRUE; } /* If this is a mips16 text symbol, add 1 to the value to make it odd. This will cause something like .word SYM to come up with the right value when it is loaded into the PC. */ if (sym->st_other == STO_MIPS16) ++*valp; return TRUE; } /* This hook function is called before the linker writes out a global symbol. We mark symbols as small common if appropriate. This is also where we undo the increment of the value for a mips16 symbol. */ bfd_boolean _bfd_mips_elf_link_output_symbol_hook (struct bfd_link_info *info ATTRIBUTE_UNUSED, const char *name ATTRIBUTE_UNUSED, Elf_Internal_Sym *sym, asection *input_sec, struct elf_link_hash_entry *h ATTRIBUTE_UNUSED) { /* If we see a common symbol, which implies a relocatable link, then if a symbol was small common in an input file, mark it as small common in the output file. */ if (sym->st_shndx == SHN_COMMON && strcmp (input_sec->name, ".scommon") == 0) sym->st_shndx = SHN_MIPS_SCOMMON; if (sym->st_other == STO_MIPS16) sym->st_value &= ~1; return TRUE; } /* Functions for the dynamic linker. */ /* Create dynamic sections when linking against a dynamic object. */ bfd_boolean _bfd_mips_elf_create_dynamic_sections (bfd *abfd, struct bfd_link_info *info) { struct elf_link_hash_entry *h; struct bfd_link_hash_entry *bh; flagword flags; register asection *s; const char * const *namep; struct mips_elf_link_hash_table *htab; htab = mips_elf_hash_table (info); flags = (SEC_ALLOC | SEC_LOAD | SEC_HAS_CONTENTS | SEC_IN_MEMORY | SEC_LINKER_CREATED | SEC_READONLY); /* The psABI requires a read-only .dynamic section, but the VxWorks EABI doesn't. */ if (!htab->is_vxworks) { s = bfd_get_section_by_name (abfd, ".dynamic"); if (s != NULL) { if (! bfd_set_section_flags (abfd, s, flags)) return FALSE; } } /* We need to create .got section. */ if (! mips_elf_create_got_section (abfd, info, FALSE)) return FALSE; if (! mips_elf_rel_dyn_section (info, TRUE)) return FALSE; /* Create .stub section. */ if (bfd_get_section_by_name (abfd, MIPS_ELF_STUB_SECTION_NAME (abfd)) == NULL) { s = bfd_make_section_with_flags (abfd, MIPS_ELF_STUB_SECTION_NAME (abfd), flags | SEC_CODE); if (s == NULL || ! bfd_set_section_alignment (abfd, s, MIPS_ELF_LOG_FILE_ALIGN (abfd))) return FALSE; } if ((IRIX_COMPAT (abfd) == ict_irix5 || IRIX_COMPAT (abfd) == ict_none) && !info->shared && bfd_get_section_by_name (abfd, ".rld_map") == NULL) { s = bfd_make_section_with_flags (abfd, ".rld_map", flags &~ (flagword) SEC_READONLY); if (s == NULL || ! bfd_set_section_alignment (abfd, s, MIPS_ELF_LOG_FILE_ALIGN (abfd))) return FALSE; } /* On IRIX5, we adjust add some additional symbols and change the alignments of several sections. There is no ABI documentation indicating that this is necessary on IRIX6, nor any evidence that the linker takes such action. */ if (IRIX_COMPAT (abfd) == ict_irix5) { for (namep = mips_elf_dynsym_rtproc_names; *namep != NULL; namep++) { bh = NULL; if (! (_bfd_generic_link_add_one_symbol (info, abfd, *namep, BSF_GLOBAL, bfd_und_section_ptr, 0, NULL, FALSE, get_elf_backend_data (abfd)->collect, &bh))) return FALSE; h = (struct elf_link_hash_entry *) bh; h->non_elf = 0; h->def_regular = 1; h->type = STT_SECTION; if (! bfd_elf_link_record_dynamic_symbol (info, h)) return FALSE; } /* We need to create a .compact_rel section. */ if (SGI_COMPAT (abfd)) { if (!mips_elf_create_compact_rel_section (abfd, info)) return FALSE; } /* Change alignments of some sections. */ s = bfd_get_section_by_name (abfd, ".hash"); if (s != NULL) bfd_set_section_alignment (abfd, s, MIPS_ELF_LOG_FILE_ALIGN (abfd)); s = bfd_get_section_by_name (abfd, ".dynsym"); if (s != NULL) bfd_set_section_alignment (abfd, s, MIPS_ELF_LOG_FILE_ALIGN (abfd)); s = bfd_get_section_by_name (abfd, ".dynstr"); if (s != NULL) bfd_set_section_alignment (abfd, s, MIPS_ELF_LOG_FILE_ALIGN (abfd)); s = bfd_get_section_by_name (abfd, ".reginfo"); if (s != NULL) bfd_set_section_alignment (abfd, s, MIPS_ELF_LOG_FILE_ALIGN (abfd)); s = bfd_get_section_by_name (abfd, ".dynamic"); if (s != NULL) bfd_set_section_alignment (abfd, s, MIPS_ELF_LOG_FILE_ALIGN (abfd)); } if (!info->shared) { const char *name; name = SGI_COMPAT (abfd) ? "_DYNAMIC_LINK" : "_DYNAMIC_LINKING"; bh = NULL; if (!(_bfd_generic_link_add_one_symbol (info, abfd, name, BSF_GLOBAL, bfd_abs_section_ptr, 0, NULL, FALSE, get_elf_backend_data (abfd)->collect, &bh))) return FALSE; h = (struct elf_link_hash_entry *) bh; h->non_elf = 0; h->def_regular = 1; h->type = STT_SECTION; if (! bfd_elf_link_record_dynamic_symbol (info, h)) return FALSE; if (! mips_elf_hash_table (info)->use_rld_obj_head) { /* __rld_map is a four byte word located in the .data section and is filled in by the rtld to contain a pointer to the _r_debug structure. Its symbol value will be set in _bfd_mips_elf_finish_dynamic_symbol. */ s = bfd_get_section_by_name (abfd, ".rld_map"); BFD_ASSERT (s != NULL); name = SGI_COMPAT (abfd) ? "__rld_map" : "__RLD_MAP"; bh = NULL; if (!(_bfd_generic_link_add_one_symbol (info, abfd, name, BSF_GLOBAL, s, 0, NULL, FALSE, get_elf_backend_data (abfd)->collect, &bh))) return FALSE; h = (struct elf_link_hash_entry *) bh; h->non_elf = 0; h->def_regular = 1; h->type = STT_OBJECT; if (! bfd_elf_link_record_dynamic_symbol (info, h)) return FALSE; } } if (htab->is_vxworks) { /* Create the .plt, .rela.plt, .dynbss and .rela.bss sections. Also create the _PROCEDURE_LINKAGE_TABLE symbol. */ if (!_bfd_elf_create_dynamic_sections (abfd, info)) return FALSE; /* Cache the sections created above. */ htab->sdynbss = bfd_get_section_by_name (abfd, ".dynbss"); htab->srelbss = bfd_get_section_by_name (abfd, ".rela.bss"); htab->srelplt = bfd_get_section_by_name (abfd, ".rela.plt"); htab->splt = bfd_get_section_by_name (abfd, ".plt"); if (!htab->sdynbss || (!htab->srelbss && !info->shared) || !htab->srelplt || !htab->splt) abort (); /* Do the usual VxWorks handling. */ if (!elf_vxworks_create_dynamic_sections (abfd, info, &htab->srelplt2)) return FALSE; /* Work out the PLT sizes. */ if (info->shared) { htab->plt_header_size = 4 * ARRAY_SIZE (mips_vxworks_shared_plt0_entry); htab->plt_entry_size = 4 * ARRAY_SIZE (mips_vxworks_shared_plt_entry); } else { htab->plt_header_size = 4 * ARRAY_SIZE (mips_vxworks_exec_plt0_entry); htab->plt_entry_size = 4 * ARRAY_SIZE (mips_vxworks_exec_plt_entry); } } return TRUE; } /* Look through the relocs for a section during the first phase, and allocate space in the global offset table. */ bfd_boolean _bfd_mips_elf_check_relocs (bfd *abfd, struct bfd_link_info *info, asection *sec, const Elf_Internal_Rela *relocs) { const char *name; bfd *dynobj; Elf_Internal_Shdr *symtab_hdr; struct elf_link_hash_entry **sym_hashes; struct mips_got_info *g; size_t extsymoff; const Elf_Internal_Rela *rel; const Elf_Internal_Rela *rel_end; asection *sgot; asection *sreloc; const struct elf_backend_data *bed; struct mips_elf_link_hash_table *htab; if (info->relocatable) return TRUE; htab = mips_elf_hash_table (info); dynobj = elf_hash_table (info)->dynobj; symtab_hdr = &elf_tdata (abfd)->symtab_hdr; sym_hashes = elf_sym_hashes (abfd); extsymoff = (elf_bad_symtab (abfd)) ? 0 : symtab_hdr->sh_info; /* Check for the mips16 stub sections. */ name = bfd_get_section_name (abfd, sec); if (FN_STUB_P (name)) { unsigned long r_symndx; /* Look at the relocation information to figure out which symbol this is for. */ r_symndx = ELF_R_SYM (abfd, relocs->r_info); if (r_symndx < extsymoff || sym_hashes[r_symndx - extsymoff] == NULL) { asection *o; /* This stub is for a local symbol. This stub will only be needed if there is some relocation in this BFD, other than a 16 bit function call, which refers to this symbol. */ for (o = abfd->sections; o != NULL; o = o->next) { Elf_Internal_Rela *sec_relocs; const Elf_Internal_Rela *r, *rend; /* We can ignore stub sections when looking for relocs. */ if ((o->flags & SEC_RELOC) == 0 || o->reloc_count == 0 || mips16_stub_section_p (abfd, o)) continue; sec_relocs = _bfd_elf_link_read_relocs (abfd, o, NULL, NULL, info->keep_memory); if (sec_relocs == NULL) return FALSE; rend = sec_relocs + o->reloc_count; for (r = sec_relocs; r < rend; r++) if (ELF_R_SYM (abfd, r->r_info) == r_symndx && ELF_R_TYPE (abfd, r->r_info) != R_MIPS16_26) break; if (elf_section_data (o)->relocs != sec_relocs) free (sec_relocs); if (r < rend) break; } if (o == NULL) { /* There is no non-call reloc for this stub, so we do not need it. Since this function is called before the linker maps input sections to output sections, we can easily discard it by setting the SEC_EXCLUDE flag. */ sec->flags |= SEC_EXCLUDE; return TRUE; } /* Record this stub in an array of local symbol stubs for this BFD. */ if (elf_tdata (abfd)->local_stubs == NULL) { unsigned long symcount; asection **n; bfd_size_type amt; if (elf_bad_symtab (abfd)) symcount = NUM_SHDR_ENTRIES (symtab_hdr); else symcount = symtab_hdr->sh_info; amt = symcount * sizeof (asection *); n = bfd_zalloc (abfd, amt); if (n == NULL) return FALSE; elf_tdata (abfd)->local_stubs = n; } sec->flags |= SEC_KEEP; elf_tdata (abfd)->local_stubs[r_symndx] = sec; /* We don't need to set mips16_stubs_seen in this case. That flag is used to see whether we need to look through the global symbol table for stubs. We don't need to set it here, because we just have a local stub. */ } else { struct mips_elf_link_hash_entry *h; h = ((struct mips_elf_link_hash_entry *) sym_hashes[r_symndx - extsymoff]); while (h->root.root.type == bfd_link_hash_indirect || h->root.root.type == bfd_link_hash_warning) h = (struct mips_elf_link_hash_entry *) h->root.root.u.i.link; /* H is the symbol this stub is for. */ /* If we already have an appropriate stub for this function, we don't need another one, so we can discard this one. Since this function is called before the linker maps input sections to output sections, we can easily discard it by setting the SEC_EXCLUDE flag. */ if (h->fn_stub != NULL) { sec->flags |= SEC_EXCLUDE; return TRUE; } sec->flags |= SEC_KEEP; h->fn_stub = sec; mips_elf_hash_table (info)->mips16_stubs_seen = TRUE; } } else if (CALL_STUB_P (name) || CALL_FP_STUB_P (name)) { unsigned long r_symndx; struct mips_elf_link_hash_entry *h; asection **loc; /* Look at the relocation information to figure out which symbol this is for. */ r_symndx = ELF_R_SYM (abfd, relocs->r_info); if (r_symndx < extsymoff || sym_hashes[r_symndx - extsymoff] == NULL) { asection *o; /* This stub is for a local symbol. This stub will only be needed if there is some relocation (R_MIPS16_26) in this BFD that refers to this symbol. */ for (o = abfd->sections; o != NULL; o = o->next) { Elf_Internal_Rela *sec_relocs; const Elf_Internal_Rela *r, *rend; /* We can ignore stub sections when looking for relocs. */ if ((o->flags & SEC_RELOC) == 0 || o->reloc_count == 0 || mips16_stub_section_p (abfd, o)) continue; sec_relocs = _bfd_elf_link_read_relocs (abfd, o, NULL, NULL, info->keep_memory); if (sec_relocs == NULL) return FALSE; rend = sec_relocs + o->reloc_count; for (r = sec_relocs; r < rend; r++) if (ELF_R_SYM (abfd, r->r_info) == r_symndx && ELF_R_TYPE (abfd, r->r_info) == R_MIPS16_26) break; if (elf_section_data (o)->relocs != sec_relocs) free (sec_relocs); if (r < rend) break; } if (o == NULL) { /* There is no non-call reloc for this stub, so we do not need it. Since this function is called before the linker maps input sections to output sections, we can easily discard it by setting the SEC_EXCLUDE flag. */ sec->flags |= SEC_EXCLUDE; return TRUE; } /* Record this stub in an array of local symbol call_stubs for this BFD. */ if (elf_tdata (abfd)->local_call_stubs == NULL) { unsigned long symcount; asection **n; bfd_size_type amt; if (elf_bad_symtab (abfd)) symcount = NUM_SHDR_ENTRIES (symtab_hdr); else symcount = symtab_hdr->sh_info; amt = symcount * sizeof (asection *); n = bfd_zalloc (abfd, amt); if (n == NULL) return FALSE; elf_tdata (abfd)->local_call_stubs = n; } sec->flags |= SEC_KEEP; elf_tdata (abfd)->local_call_stubs[r_symndx] = sec; /* We don't need to set mips16_stubs_seen in this case. That flag is used to see whether we need to look through the global symbol table for stubs. We don't need to set it here, because we just have a local stub. */ } else { h = ((struct mips_elf_link_hash_entry *) sym_hashes[r_symndx - extsymoff]); /* H is the symbol this stub is for. */ if (CALL_FP_STUB_P (name)) loc = &h->call_fp_stub; else loc = &h->call_stub; /* If we already have an appropriate stub for this function, we don't need another one, so we can discard this one. Since this function is called before the linker maps input sections to output sections, we can easily discard it by setting the SEC_EXCLUDE flag. */ if (*loc != NULL) { sec->flags |= SEC_EXCLUDE; return TRUE; } sec->flags |= SEC_KEEP; *loc = sec; mips_elf_hash_table (info)->mips16_stubs_seen = TRUE; } } if (dynobj == NULL) { sgot = NULL; g = NULL; } else { sgot = mips_elf_got_section (dynobj, FALSE); if (sgot == NULL) g = NULL; else { BFD_ASSERT (mips_elf_section_data (sgot) != NULL); g = mips_elf_section_data (sgot)->u.got_info; BFD_ASSERT (g != NULL); } } sreloc = NULL; bed = get_elf_backend_data (abfd); rel_end = relocs + sec->reloc_count * bed->s->int_rels_per_ext_rel; for (rel = relocs; rel < rel_end; ++rel) { unsigned long r_symndx; unsigned int r_type; struct elf_link_hash_entry *h; r_symndx = ELF_R_SYM (abfd, rel->r_info); r_type = ELF_R_TYPE (abfd, rel->r_info); if (r_symndx < extsymoff) h = NULL; else if (r_symndx >= extsymoff + NUM_SHDR_ENTRIES (symtab_hdr)) { (*_bfd_error_handler) (_("%B: Malformed reloc detected for section %s"), abfd, name); bfd_set_error (bfd_error_bad_value); return FALSE; } else { h = sym_hashes[r_symndx - extsymoff]; /* This may be an indirect symbol created because of a version. */ if (h != NULL) { while (h->root.type == bfd_link_hash_indirect) h = (struct elf_link_hash_entry *) h->root.u.i.link; } } /* Some relocs require a global offset table. */ if (dynobj == NULL || sgot == NULL) { switch (r_type) { case R_MIPS_GOT16: case R_MIPS_CALL16: case R_MIPS_CALL_HI16: case R_MIPS_CALL_LO16: case R_MIPS_GOT_HI16: case R_MIPS_GOT_LO16: case R_MIPS_GOT_PAGE: case R_MIPS_GOT_OFST: case R_MIPS_GOT_DISP: case R_MIPS_TLS_GOTTPREL: case R_MIPS_TLS_GD: case R_MIPS_TLS_LDM: if (dynobj == NULL) elf_hash_table (info)->dynobj = dynobj = abfd; if (! mips_elf_create_got_section (dynobj, info, FALSE)) return FALSE; g = mips_elf_got_info (dynobj, &sgot); if (htab->is_vxworks && !info->shared) { (*_bfd_error_handler) (_("%B: GOT reloc at 0x%lx not expected in executables"), abfd, (unsigned long) rel->r_offset); bfd_set_error (bfd_error_bad_value); return FALSE; } break; case R_MIPS_32: case R_MIPS_REL32: case R_MIPS_64: /* In VxWorks executables, references to external symbols are handled using copy relocs or PLT stubs, so there's no need to add a dynamic relocation here. */ if (dynobj == NULL && (info->shared || (h != NULL && !htab->is_vxworks)) && (sec->flags & SEC_ALLOC) != 0) elf_hash_table (info)->dynobj = dynobj = abfd; break; default: break; } } if (h) { ((struct mips_elf_link_hash_entry *) h)->is_relocation_target = TRUE; /* Relocations against the special VxWorks __GOTT_BASE__ and __GOTT_INDEX__ symbols must be left to the loader. Allocate room for them in .rela.dyn. */ if (is_gott_symbol (info, h)) { if (sreloc == NULL) { sreloc = mips_elf_rel_dyn_section (info, TRUE); if (sreloc == NULL) return FALSE; } mips_elf_allocate_dynamic_relocations (dynobj, info, 1); if (MIPS_ELF_READONLY_SECTION (sec)) /* We tell the dynamic linker that there are relocations against the text segment. */ info->flags |= DF_TEXTREL; } } else if (r_type == R_MIPS_CALL_LO16 || r_type == R_MIPS_GOT_LO16 || r_type == R_MIPS_GOT_DISP || (r_type == R_MIPS_GOT16 && htab->is_vxworks)) { /* We may need a local GOT entry for this relocation. We don't count R_MIPS_GOT_PAGE because we can estimate the maximum number of pages needed by looking at the size of the segment. Similar comments apply to R_MIPS_GOT16 and R_MIPS_CALL16, except on VxWorks, where GOT relocations always evaluate to "G". We don't count R_MIPS_GOT_HI16, or R_MIPS_CALL_HI16 because these are always followed by an R_MIPS_GOT_LO16 or R_MIPS_CALL_LO16. */ if (! mips_elf_record_local_got_symbol (abfd, r_symndx, rel->r_addend, g, 0)) return FALSE; } switch (r_type) { case R_MIPS_CALL16: if (h == NULL) { (*_bfd_error_handler) (_("%B: CALL16 reloc at 0x%lx not against global symbol"), abfd, (unsigned long) rel->r_offset); bfd_set_error (bfd_error_bad_value); return FALSE; } /* Fall through. */ case R_MIPS_CALL_HI16: case R_MIPS_CALL_LO16: if (h != NULL) { /* VxWorks call relocations point the function's .got.plt entry, which will be allocated by adjust_dynamic_symbol. Otherwise, this symbol requires a global GOT entry. */ if (!htab->is_vxworks && !mips_elf_record_global_got_symbol (h, abfd, info, g, 0)) return FALSE; /* We need a stub, not a plt entry for the undefined function. But we record it as if it needs plt. See _bfd_elf_adjust_dynamic_symbol. */ h->needs_plt = 1; h->type = STT_FUNC; } break; case R_MIPS_GOT_PAGE: /* If this is a global, overridable symbol, GOT_PAGE will decay to GOT_DISP, so we'll need a GOT entry for it. */ if (h == NULL) break; else { struct mips_elf_link_hash_entry *hmips = (struct mips_elf_link_hash_entry *) h; while (hmips->root.root.type == bfd_link_hash_indirect || hmips->root.root.type == bfd_link_hash_warning) hmips = (struct mips_elf_link_hash_entry *) hmips->root.root.u.i.link; if (hmips->root.def_regular && ! (info->shared && ! info->symbolic && ! hmips->root.forced_local)) break; } /* Fall through. */ case R_MIPS_GOT16: case R_MIPS_GOT_HI16: case R_MIPS_GOT_LO16: case R_MIPS_GOT_DISP: if (h && ! mips_elf_record_global_got_symbol (h, abfd, info, g, 0)) return FALSE; break; case R_MIPS_TLS_GOTTPREL: if (info->shared) info->flags |= DF_STATIC_TLS; /* Fall through */ case R_MIPS_TLS_LDM: if (r_type == R_MIPS_TLS_LDM) { r_symndx = 0; h = NULL; } /* Fall through */ case R_MIPS_TLS_GD: /* This symbol requires a global offset table entry, or two for TLS GD relocations. */ { unsigned char flag = (r_type == R_MIPS_TLS_GD ? GOT_TLS_GD : r_type == R_MIPS_TLS_LDM ? GOT_TLS_LDM : GOT_TLS_IE); if (h != NULL) { struct mips_elf_link_hash_entry *hmips = (struct mips_elf_link_hash_entry *) h; hmips->tls_type |= flag; if (h && ! mips_elf_record_global_got_symbol (h, abfd, info, g, flag)) return FALSE; } else { BFD_ASSERT (flag == GOT_TLS_LDM || r_symndx != 0); if (! mips_elf_record_local_got_symbol (abfd, r_symndx, rel->r_addend, g, flag)) return FALSE; } } break; case R_MIPS_32: case R_MIPS_REL32: case R_MIPS_64: /* In VxWorks executables, references to external symbols are handled using copy relocs or PLT stubs, so there's no need to add a .rela.dyn entry for this relocation. */ if ((info->shared || (h != NULL && !htab->is_vxworks)) && (sec->flags & SEC_ALLOC) != 0) { if (sreloc == NULL) { sreloc = mips_elf_rel_dyn_section (info, TRUE); if (sreloc == NULL) return FALSE; } if (info->shared) { /* When creating a shared object, we must copy these reloc types into the output file as R_MIPS_REL32 relocs. Make room for this reloc in .rel(a).dyn. */ mips_elf_allocate_dynamic_relocations (dynobj, info, 1); if (MIPS_ELF_READONLY_SECTION (sec)) /* We tell the dynamic linker that there are relocations against the text segment. */ info->flags |= DF_TEXTREL; } else { struct mips_elf_link_hash_entry *hmips; /* We only need to copy this reloc if the symbol is defined in a dynamic object. */ hmips = (struct mips_elf_link_hash_entry *) h; ++hmips->possibly_dynamic_relocs; if (MIPS_ELF_READONLY_SECTION (sec)) /* We need it to tell the dynamic linker if there are relocations against the text segment. */ hmips->readonly_reloc = TRUE; } /* Even though we don't directly need a GOT entry for this symbol, a symbol must have a dynamic symbol table index greater that DT_MIPS_GOTSYM if there are dynamic relocations against it. This does not apply to VxWorks, which does not have the usual coupling between global GOT entries and .dynsym entries. */ if (h != NULL && !htab->is_vxworks) { if (dynobj == NULL) elf_hash_table (info)->dynobj = dynobj = abfd; if (! mips_elf_create_got_section (dynobj, info, TRUE)) return FALSE; g = mips_elf_got_info (dynobj, &sgot); if (! mips_elf_record_global_got_symbol (h, abfd, info, g, 0)) return FALSE; } } if (SGI_COMPAT (abfd)) mips_elf_hash_table (info)->compact_rel_size += sizeof (Elf32_External_crinfo); break; case R_MIPS_PC16: if (h) ((struct mips_elf_link_hash_entry *) h)->is_branch_target = TRUE; break; case R_MIPS_26: if (h) ((struct mips_elf_link_hash_entry *) h)->is_branch_target = TRUE; /* Fall through. */ case R_MIPS_GPREL16: case R_MIPS_LITERAL: case R_MIPS_GPREL32: if (SGI_COMPAT (abfd)) mips_elf_hash_table (info)->compact_rel_size += sizeof (Elf32_External_crinfo); break; /* This relocation describes the C++ object vtable hierarchy. Reconstruct it for later use during GC. */ case R_MIPS_GNU_VTINHERIT: if (!bfd_elf_gc_record_vtinherit (abfd, sec, h, rel->r_offset)) return FALSE; break; /* This relocation describes which C++ vtable entries are actually used. Record for later use during GC. */ case R_MIPS_GNU_VTENTRY: if (!bfd_elf_gc_record_vtentry (abfd, sec, h, rel->r_offset)) return FALSE; break; default: break; } /* We must not create a stub for a symbol that has relocations related to taking the function's address. This doesn't apply to VxWorks, where CALL relocs refer to a .got.plt entry instead of a normal .got entry. */ if (!htab->is_vxworks && h != NULL) switch (r_type) { default: ((struct mips_elf_link_hash_entry *) h)->no_fn_stub = TRUE; break; case R_MIPS_CALL16: case R_MIPS_CALL_HI16: case R_MIPS_CALL_LO16: case R_MIPS_JALR: break; } /* If this reloc is not a 16 bit call, and it has a global symbol, then we will need the fn_stub if there is one. References from a stub section do not count. */ if (h != NULL && r_type != R_MIPS16_26 && !mips16_stub_section_p (abfd, sec)) { struct mips_elf_link_hash_entry *mh; mh = (struct mips_elf_link_hash_entry *) h; mh->need_fn_stub = TRUE; } } return TRUE; } bfd_boolean _bfd_mips_relax_section (bfd *abfd, asection *sec, struct bfd_link_info *link_info, bfd_boolean *again) { Elf_Internal_Rela *internal_relocs; Elf_Internal_Rela *irel, *irelend; Elf_Internal_Shdr *symtab_hdr; bfd_byte *contents = NULL; size_t extsymoff; bfd_boolean changed_contents = FALSE; bfd_vma sec_start = sec->output_section->vma + sec->output_offset; Elf_Internal_Sym *isymbuf = NULL; /* We are not currently changing any sizes, so only one pass. */ *again = FALSE; if (link_info->relocatable) return TRUE; internal_relocs = _bfd_elf_link_read_relocs (abfd, sec, NULL, NULL, link_info->keep_memory); if (internal_relocs == NULL) return TRUE; irelend = internal_relocs + sec->reloc_count * get_elf_backend_data (abfd)->s->int_rels_per_ext_rel; symtab_hdr = &elf_tdata (abfd)->symtab_hdr; extsymoff = (elf_bad_symtab (abfd)) ? 0 : symtab_hdr->sh_info; for (irel = internal_relocs; irel < irelend; irel++) { bfd_vma symval; bfd_signed_vma sym_offset; unsigned int r_type; unsigned long r_symndx; asection *sym_sec; unsigned long instruction; /* Turn jalr into bgezal, and jr into beq, if they're marked with a JALR relocation, that indicate where they jump to. This saves some pipeline bubbles. */ r_type = ELF_R_TYPE (abfd, irel->r_info); if (r_type != R_MIPS_JALR) continue; r_symndx = ELF_R_SYM (abfd, irel->r_info); /* Compute the address of the jump target. */ if (r_symndx >= extsymoff) { struct mips_elf_link_hash_entry *h = ((struct mips_elf_link_hash_entry *) elf_sym_hashes (abfd) [r_symndx - extsymoff]); while (h->root.root.type == bfd_link_hash_indirect || h->root.root.type == bfd_link_hash_warning) h = (struct mips_elf_link_hash_entry *) h->root.root.u.i.link; /* If a symbol is undefined, or if it may be overridden, skip it. */ if (! ((h->root.root.type == bfd_link_hash_defined || h->root.root.type == bfd_link_hash_defweak) && h->root.root.u.def.section) || (link_info->shared && ! link_info->symbolic && !h->root.forced_local)) continue; sym_sec = h->root.root.u.def.section; if (sym_sec->output_section) symval = (h->root.root.u.def.value + sym_sec->output_section->vma + sym_sec->output_offset); else symval = h->root.root.u.def.value; } else { Elf_Internal_Sym *isym; /* Read this BFD's symbols if we haven't done so already. */ if (isymbuf == NULL && symtab_hdr->sh_info != 0) { isymbuf = (Elf_Internal_Sym *) symtab_hdr->contents; if (isymbuf == NULL) isymbuf = bfd_elf_get_elf_syms (abfd, symtab_hdr, symtab_hdr->sh_info, 0, NULL, NULL, NULL); if (isymbuf == NULL) goto relax_return; } isym = isymbuf + r_symndx; if (isym->st_shndx == SHN_UNDEF) continue; else if (isym->st_shndx == SHN_ABS) sym_sec = bfd_abs_section_ptr; else if (isym->st_shndx == SHN_COMMON) sym_sec = bfd_com_section_ptr; else sym_sec = bfd_section_from_elf_index (abfd, isym->st_shndx); symval = isym->st_value + sym_sec->output_section->vma + sym_sec->output_offset; } /* Compute branch offset, from delay slot of the jump to the branch target. */ sym_offset = (symval + irel->r_addend) - (sec_start + irel->r_offset + 4); /* Branch offset must be properly aligned. */ if ((sym_offset & 3) != 0) continue; sym_offset >>= 2; /* Check that it's in range. */ if (sym_offset < -0x8000 || sym_offset >= 0x8000) continue; /* Get the section contents if we haven't done so already. */ if (contents == NULL) { /* Get cached copy if it exists. */ if (elf_section_data (sec)->this_hdr.contents != NULL) contents = elf_section_data (sec)->this_hdr.contents; else { if (!bfd_malloc_and_get_section (abfd, sec, &contents)) goto relax_return; } } instruction = bfd_get_32 (abfd, contents + irel->r_offset); /* If it was jalr , turn it into bgezal $zero, . */ if ((instruction & 0xfc1fffff) == 0x0000f809) instruction = 0x04110000; /* If it was jr , turn it into b . */ else if ((instruction & 0xfc1fffff) == 0x00000008) instruction = 0x10000000; else continue; instruction |= (sym_offset & 0xffff); bfd_put_32 (abfd, instruction, contents + irel->r_offset); changed_contents = TRUE; } if (contents != NULL && elf_section_data (sec)->this_hdr.contents != contents) { if (!changed_contents && !link_info->keep_memory) free (contents); else { /* Cache the section contents for elf_link_input_bfd. */ elf_section_data (sec)->this_hdr.contents = contents; } } return TRUE; relax_return: if (contents != NULL && elf_section_data (sec)->this_hdr.contents != contents) free (contents); return FALSE; } /* Adjust a symbol defined by a dynamic object and referenced by a regular object. The current definition is in some section of the dynamic object, but we're not including those sections. We have to change the definition to something the rest of the link can understand. */ bfd_boolean _bfd_mips_elf_adjust_dynamic_symbol (struct bfd_link_info *info, struct elf_link_hash_entry *h) { bfd *dynobj; struct mips_elf_link_hash_entry *hmips; asection *s; struct mips_elf_link_hash_table *htab; htab = mips_elf_hash_table (info); dynobj = elf_hash_table (info)->dynobj; /* Make sure we know what is going on here. */ BFD_ASSERT (dynobj != NULL && (h->needs_plt || h->u.weakdef != NULL || (h->def_dynamic && h->ref_regular && !h->def_regular))); /* If this symbol is defined in a dynamic object, we need to copy any R_MIPS_32 or R_MIPS_REL32 relocs against it into the output file. */ hmips = (struct mips_elf_link_hash_entry *) h; if (! info->relocatable && hmips->possibly_dynamic_relocs != 0 && (h->root.type == bfd_link_hash_defweak || !h->def_regular)) { mips_elf_allocate_dynamic_relocations (dynobj, info, hmips->possibly_dynamic_relocs); if (hmips->readonly_reloc) /* We tell the dynamic linker that there are relocations against the text segment. */ info->flags |= DF_TEXTREL; } /* For a function, create a stub, if allowed. */ if (! hmips->no_fn_stub && h->needs_plt) { if (! elf_hash_table (info)->dynamic_sections_created) return TRUE; /* If this symbol is not defined in a regular file, then set the symbol to the stub location. This is required to make function pointers compare as equal between the normal executable and the shared library. */ if (!h->def_regular) { /* We need .stub section. */ s = bfd_get_section_by_name (dynobj, MIPS_ELF_STUB_SECTION_NAME (dynobj)); BFD_ASSERT (s != NULL); h->root.u.def.section = s; h->root.u.def.value = s->size; /* XXX Write this stub address somewhere. */ h->plt.offset = s->size; /* Make room for this stub code. */ s->size += htab->function_stub_size; /* The last half word of the stub will be filled with the index of this symbol in .dynsym section. */ return TRUE; } } else if ((h->type == STT_FUNC) && !h->needs_plt) { /* This will set the entry for this symbol in the GOT to 0, and the dynamic linker will take care of this. */ h->root.u.def.value = 0; return TRUE; } /* If this is a weak symbol, and there is a real definition, the processor independent code will have arranged for us to see the real definition first, and we can just use the same value. */ if (h->u.weakdef != NULL) { BFD_ASSERT (h->u.weakdef->root.type == bfd_link_hash_defined || h->u.weakdef->root.type == bfd_link_hash_defweak); h->root.u.def.section = h->u.weakdef->root.u.def.section; h->root.u.def.value = h->u.weakdef->root.u.def.value; return TRUE; } /* This is a reference to a symbol defined by a dynamic object which is not a function. */ return TRUE; } /* Likewise, for VxWorks. */ bfd_boolean _bfd_mips_vxworks_adjust_dynamic_symbol (struct bfd_link_info *info, struct elf_link_hash_entry *h) { bfd *dynobj; struct mips_elf_link_hash_entry *hmips; struct mips_elf_link_hash_table *htab; htab = mips_elf_hash_table (info); dynobj = elf_hash_table (info)->dynobj; hmips = (struct mips_elf_link_hash_entry *) h; /* Make sure we know what is going on here. */ BFD_ASSERT (dynobj != NULL && (h->needs_plt || h->needs_copy || h->u.weakdef != NULL || (h->def_dynamic && h->ref_regular && !h->def_regular))); /* If the symbol is defined by a dynamic object, we need a PLT stub if either (a) we want to branch to the symbol or (b) we're linking an executable that needs a canonical function address. In the latter case, the canonical address will be the address of the executable's load stub. */ if ((hmips->is_branch_target || (!info->shared && h->type == STT_FUNC && hmips->is_relocation_target)) && h->def_dynamic && h->ref_regular && !h->def_regular && !h->forced_local) h->needs_plt = 1; /* Locally-binding symbols do not need a PLT stub; we can refer to the functions directly. */ else if (h->needs_plt && (SYMBOL_CALLS_LOCAL (info, h) || (ELF_ST_VISIBILITY (h->other) != STV_DEFAULT && h->root.type == bfd_link_hash_undefweak))) { h->needs_plt = 0; return TRUE; } if (h->needs_plt) { /* If this is the first symbol to need a PLT entry, allocate room for the header, and for the header's .rela.plt.unloaded entries. */ if (htab->splt->size == 0) { htab->splt->size += htab->plt_header_size; if (!info->shared) htab->srelplt2->size += 2 * sizeof (Elf32_External_Rela); } /* Assign the next .plt entry to this symbol. */ h->plt.offset = htab->splt->size; htab->splt->size += htab->plt_entry_size; /* If the output file has no definition of the symbol, set the symbol's value to the address of the stub. For executables, point at the PLT load stub rather than the lazy resolution stub; this stub will become the canonical function address. */ if (!h->def_regular) { h->root.u.def.section = htab->splt; h->root.u.def.value = h->plt.offset; if (!info->shared) h->root.u.def.value += 8; } /* Make room for the .got.plt entry and the R_JUMP_SLOT relocation. */ htab->sgotplt->size += 4; htab->srelplt->size += sizeof (Elf32_External_Rela); /* Make room for the .rela.plt.unloaded relocations. */ if (!info->shared) htab->srelplt2->size += 3 * sizeof (Elf32_External_Rela); return TRUE; } /* If a function symbol is defined by a dynamic object, and we do not need a PLT stub for it, the symbol's value should be zero. */ if (h->type == STT_FUNC && h->def_dynamic && h->ref_regular && !h->def_regular) { h->root.u.def.value = 0; return TRUE; } /* If this is a weak symbol, and there is a real definition, the processor independent code will have arranged for us to see the real definition first, and we can just use the same value. */ if (h->u.weakdef != NULL) { BFD_ASSERT (h->u.weakdef->root.type == bfd_link_hash_defined || h->u.weakdef->root.type == bfd_link_hash_defweak); h->root.u.def.section = h->u.weakdef->root.u.def.section; h->root.u.def.value = h->u.weakdef->root.u.def.value; return TRUE; } /* This is a reference to a symbol defined by a dynamic object which is not a function. */ if (info->shared) return TRUE; /* We must allocate the symbol in our .dynbss section, which will become part of the .bss section of the executable. There will be an entry for this symbol in the .dynsym section. The dynamic object will contain position independent code, so all references from the dynamic object to this symbol will go through the global offset table. The dynamic linker will use the .dynsym entry to determine the address it must put in the global offset table, so both the dynamic object and the regular object will refer to the same memory location for the variable. */ if ((h->root.u.def.section->flags & SEC_ALLOC) != 0) { htab->srelbss->size += sizeof (Elf32_External_Rela); h->needs_copy = 1; } return _bfd_elf_adjust_dynamic_copy (h, htab->sdynbss); } /* Return the number of dynamic section symbols required by OUTPUT_BFD. The number might be exact or a worst-case estimate, depending on how much information is available to elf_backend_omit_section_dynsym at the current linking stage. */ static bfd_size_type count_section_dynsyms (bfd *output_bfd, struct bfd_link_info *info) { bfd_size_type count; count = 0; if (info->shared || elf_hash_table (info)->is_relocatable_executable) { asection *p; const struct elf_backend_data *bed; bed = get_elf_backend_data (output_bfd); for (p = output_bfd->sections; p ; p = p->next) if ((p->flags & SEC_EXCLUDE) == 0 && (p->flags & SEC_ALLOC) != 0 && !(*bed->elf_backend_omit_section_dynsym) (output_bfd, info, p)) ++count; } return count; } /* This function is called after all the input files have been read, and the input sections have been assigned to output sections. We check for any mips16 stub sections that we can discard. */ bfd_boolean _bfd_mips_elf_always_size_sections (bfd *output_bfd, struct bfd_link_info *info) { asection *ri; bfd *dynobj; asection *s; struct mips_got_info *g; int i; bfd_size_type loadable_size = 0; bfd_size_type local_gotno; bfd_size_type dynsymcount; bfd *sub; struct mips_elf_count_tls_arg count_tls_arg; struct mips_elf_link_hash_table *htab; htab = mips_elf_hash_table (info); /* The .reginfo section has a fixed size. */ ri = bfd_get_section_by_name (output_bfd, ".reginfo"); if (ri != NULL) bfd_set_section_size (output_bfd, ri, sizeof (Elf32_External_RegInfo)); if (! (info->relocatable || ! mips_elf_hash_table (info)->mips16_stubs_seen)) mips_elf_link_hash_traverse (mips_elf_hash_table (info), mips_elf_check_mips16_stubs, NULL); dynobj = elf_hash_table (info)->dynobj; if (dynobj == NULL) /* Relocatable links don't have it. */ return TRUE; g = mips_elf_got_info (dynobj, &s); if (s == NULL) return TRUE; /* Calculate the total loadable size of the output. That will give us the maximum number of GOT_PAGE entries required. */ for (sub = info->input_bfds; sub; sub = sub->link_next) { asection *subsection; for (subsection = sub->sections; subsection; subsection = subsection->next) { if ((subsection->flags & SEC_ALLOC) == 0) continue; loadable_size += ((subsection->size + 0xf) &~ (bfd_size_type) 0xf); } } /* There has to be a global GOT entry for every symbol with a dynamic symbol table index of DT_MIPS_GOTSYM or higher. Therefore, it make sense to put those symbols that need GOT entries at the end of the symbol table. We do that here. */ if (! mips_elf_sort_hash_table (info, 1)) return FALSE; if (g->global_gotsym != NULL) i = elf_hash_table (info)->dynsymcount - g->global_gotsym->dynindx; else /* If there are no global symbols, or none requiring relocations, then GLOBAL_GOTSYM will be NULL. */ i = 0; /* Get a worst-case estimate of the number of dynamic symbols needed. At this point, dynsymcount does not account for section symbols and count_section_dynsyms may overestimate the number that will be needed. */ dynsymcount = (elf_hash_table (info)->dynsymcount + count_section_dynsyms (output_bfd, info)); /* Determine the size of one stub entry. */ htab->function_stub_size = (dynsymcount > 0x10000 ? MIPS_FUNCTION_STUB_BIG_SIZE : MIPS_FUNCTION_STUB_NORMAL_SIZE); /* In the worst case, we'll get one stub per dynamic symbol, plus one to account for the dummy entry at the end required by IRIX rld. */ loadable_size += htab->function_stub_size * (i + 1); if (htab->is_vxworks) /* There's no need to allocate page entries for VxWorks; R_MIPS_GOT16 relocations against local symbols evaluate to "G", and the EABI does not include R_MIPS_GOT_PAGE. */ local_gotno = 0; else /* Assume there are two loadable segments consisting of contiguous sections. Is 5 enough? */ local_gotno = (loadable_size >> 16) + 5; g->local_gotno += local_gotno; s->size += g->local_gotno * MIPS_ELF_GOT_SIZE (output_bfd); g->global_gotno = i; s->size += i * MIPS_ELF_GOT_SIZE (output_bfd); /* We need to calculate tls_gotno for global symbols at this point instead of building it up earlier, to avoid doublecounting entries for one global symbol from multiple input files. */ count_tls_arg.info = info; count_tls_arg.needed = 0; elf_link_hash_traverse (elf_hash_table (info), mips_elf_count_global_tls_entries, &count_tls_arg); g->tls_gotno += count_tls_arg.needed; s->size += g->tls_gotno * MIPS_ELF_GOT_SIZE (output_bfd); mips_elf_resolve_final_got_entries (g); /* VxWorks does not support multiple GOTs. It initializes $gp to __GOTT_BASE__[__GOTT_INDEX__], the value of which is set by the dynamic loader. */ if (!htab->is_vxworks && s->size > MIPS_ELF_GOT_MAX_SIZE (info)) { if (! mips_elf_multi_got (output_bfd, info, g, s, local_gotno)) return FALSE; } else { /* Set up TLS entries for the first GOT. */ g->tls_assigned_gotno = g->global_gotno + g->local_gotno; htab_traverse (g->got_entries, mips_elf_initialize_tls_index, g); } return TRUE; } /* Set the sizes of the dynamic sections. */ bfd_boolean _bfd_mips_elf_size_dynamic_sections (bfd *output_bfd, struct bfd_link_info *info) { bfd *dynobj; asection *s, *sreldyn; bfd_boolean reltext; struct mips_elf_link_hash_table *htab; htab = mips_elf_hash_table (info); dynobj = elf_hash_table (info)->dynobj; BFD_ASSERT (dynobj != NULL); if (elf_hash_table (info)->dynamic_sections_created) { /* Set the contents of the .interp section to the interpreter. */ if (info->executable) { s = bfd_get_section_by_name (dynobj, ".interp"); BFD_ASSERT (s != NULL); s->size = strlen (ELF_DYNAMIC_INTERPRETER (output_bfd)) + 1; s->contents = (bfd_byte *) ELF_DYNAMIC_INTERPRETER (output_bfd); } } /* The check_relocs and adjust_dynamic_symbol entry points have determined the sizes of the various dynamic sections. Allocate memory for them. */ reltext = FALSE; sreldyn = NULL; for (s = dynobj->sections; s != NULL; s = s->next) { const char *name; /* It's OK to base decisions on the section name, because none of the dynobj section names depend upon the input files. */ name = bfd_get_section_name (dynobj, s); if ((s->flags & SEC_LINKER_CREATED) == 0) continue; if (CONST_STRNEQ (name, ".rel")) { if (s->size != 0) { const char *outname; asection *target; /* If this relocation section applies to a read only section, then we probably need a DT_TEXTREL entry. If the relocation section is .rel(a).dyn, we always assert a DT_TEXTREL entry rather than testing whether there exists a relocation to a read only section or not. */ outname = bfd_get_section_name (output_bfd, s->output_section); target = bfd_get_section_by_name (output_bfd, outname + 4); if ((target != NULL && (target->flags & SEC_READONLY) != 0 && (target->flags & SEC_ALLOC) != 0) || strcmp (outname, MIPS_ELF_REL_DYN_NAME (info)) == 0) reltext = TRUE; /* We use the reloc_count field as a counter if we need to copy relocs into the output file. */ if (strcmp (name, MIPS_ELF_REL_DYN_NAME (info)) != 0) s->reloc_count = 0; /* If combreloc is enabled, elf_link_sort_relocs() will sort relocations, but in a different way than we do, and before we're done creating relocations. Also, it will move them around between input sections' relocation's contents, so our sorting would be broken, so don't let it run. */ info->combreloc = 0; } } else if (htab->is_vxworks && strcmp (name, ".got") == 0) { /* Executables do not need a GOT. */ if (info->shared) { /* Allocate relocations for all but the reserved entries. */ struct mips_got_info *g; unsigned int count; g = mips_elf_got_info (dynobj, NULL); count = (g->global_gotno + g->local_gotno - MIPS_RESERVED_GOTNO (info)); mips_elf_allocate_dynamic_relocations (dynobj, info, count); } } else if (!htab->is_vxworks && CONST_STRNEQ (name, ".got")) { /* _bfd_mips_elf_always_size_sections() has already done most of the work, but some symbols may have been mapped to versions that we must now resolve in the got_entries hash tables. */ struct mips_got_info *gg = mips_elf_got_info (dynobj, NULL); struct mips_got_info *g = gg; struct mips_elf_set_global_got_offset_arg set_got_offset_arg; unsigned int needed_relocs = 0; if (gg->next) { set_got_offset_arg.value = MIPS_ELF_GOT_SIZE (output_bfd); set_got_offset_arg.info = info; /* NOTE 2005-02-03: How can this call, or the next, ever find any indirect entries to resolve? They were all resolved in mips_elf_multi_got. */ mips_elf_resolve_final_got_entries (gg); for (g = gg->next; g && g->next != gg; g = g->next) { unsigned int save_assign; mips_elf_resolve_final_got_entries (g); /* Assign offsets to global GOT entries. */ save_assign = g->assigned_gotno; g->assigned_gotno = g->local_gotno; set_got_offset_arg.g = g; set_got_offset_arg.needed_relocs = 0; htab_traverse (g->got_entries, mips_elf_set_global_got_offset, &set_got_offset_arg); needed_relocs += set_got_offset_arg.needed_relocs; BFD_ASSERT (g->assigned_gotno - g->local_gotno <= g->global_gotno); g->assigned_gotno = save_assign; if (info->shared) { needed_relocs += g->local_gotno - g->assigned_gotno; BFD_ASSERT (g->assigned_gotno == g->next->local_gotno + g->next->global_gotno + g->next->tls_gotno + MIPS_RESERVED_GOTNO (info)); } } } else { struct mips_elf_count_tls_arg arg; arg.info = info; arg.needed = 0; htab_traverse (gg->got_entries, mips_elf_count_local_tls_relocs, &arg); elf_link_hash_traverse (elf_hash_table (info), mips_elf_count_global_tls_relocs, &arg); needed_relocs += arg.needed; } if (needed_relocs) mips_elf_allocate_dynamic_relocations (dynobj, info, needed_relocs); } else if (strcmp (name, MIPS_ELF_STUB_SECTION_NAME (output_bfd)) == 0) { /* IRIX rld assumes that the function stub isn't at the end of .text section. So put a dummy. XXX */ s->size += htab->function_stub_size; } else if (! info->shared && ! mips_elf_hash_table (info)->use_rld_obj_head && CONST_STRNEQ (name, ".rld_map")) { /* We add a room for __rld_map. It will be filled in by the rtld to contain a pointer to the _r_debug structure. */ s->size += MIPS_ELF_RLD_MAP_SIZE (output_bfd); } else if (SGI_COMPAT (output_bfd) && CONST_STRNEQ (name, ".compact_rel")) s->size += mips_elf_hash_table (info)->compact_rel_size; else if (! CONST_STRNEQ (name, ".init") && s != htab->sgotplt && s != htab->splt) { /* It's not one of our sections, so don't allocate space. */ continue; } if (s->size == 0) { s->flags |= SEC_EXCLUDE; continue; } if ((s->flags & SEC_HAS_CONTENTS) == 0) continue; /* Allocate memory for this section last, since we may increase its size above. */ if (strcmp (name, MIPS_ELF_REL_DYN_NAME (info)) == 0) { sreldyn = s; continue; } /* Allocate memory for the section contents. */ s->contents = bfd_zalloc (dynobj, s->size); if (s->contents == NULL) { bfd_set_error (bfd_error_no_memory); return FALSE; } } /* Allocate memory for the .rel(a).dyn section. */ if (sreldyn != NULL) { sreldyn->contents = bfd_zalloc (dynobj, sreldyn->size); if (sreldyn->contents == NULL) { bfd_set_error (bfd_error_no_memory); return FALSE; } } if (elf_hash_table (info)->dynamic_sections_created) { /* Add some entries to the .dynamic section. We fill in the values later, in _bfd_mips_elf_finish_dynamic_sections, but we must add the entries now so that we get the correct size for the .dynamic section. */ /* SGI object has the equivalence of DT_DEBUG in the DT_MIPS_RLD_MAP entry. This must come first because glibc only fills in DT_MIPS_RLD_MAP (not DT_DEBUG) and GDB only looks at the first one it sees. */ if (!info->shared && !MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_MIPS_RLD_MAP, 0)) return FALSE; /* The DT_DEBUG entry may be filled in by the dynamic linker and used by the debugger. */ if (info->executable && !SGI_COMPAT (output_bfd) && !MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_DEBUG, 0)) return FALSE; if (reltext && (SGI_COMPAT (output_bfd) || htab->is_vxworks)) info->flags |= DF_TEXTREL; if ((info->flags & DF_TEXTREL) != 0) { if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_TEXTREL, 0)) return FALSE; /* Clear the DF_TEXTREL flag. It will be set again if we write out an actual text relocation; we may not, because at this point we do not know whether e.g. any .eh_frame absolute relocations have been converted to PC-relative. */ info->flags &= ~DF_TEXTREL; } if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_PLTGOT, 0)) return FALSE; if (htab->is_vxworks) { /* VxWorks uses .rela.dyn instead of .rel.dyn. It does not use any of the DT_MIPS_* tags. */ if (mips_elf_rel_dyn_section (info, FALSE)) { if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_RELA, 0)) return FALSE; if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_RELASZ, 0)) return FALSE; if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_RELAENT, 0)) return FALSE; } if (htab->splt->size > 0) { if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_PLTREL, 0)) return FALSE; if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_JMPREL, 0)) return FALSE; if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_PLTRELSZ, 0)) return FALSE; } } else { if (mips_elf_rel_dyn_section (info, FALSE)) { if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_REL, 0)) return FALSE; if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_RELSZ, 0)) return FALSE; if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_RELENT, 0)) return FALSE; } if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_MIPS_RLD_VERSION, 0)) return FALSE; if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_MIPS_FLAGS, 0)) return FALSE; if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_MIPS_BASE_ADDRESS, 0)) return FALSE; if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_MIPS_LOCAL_GOTNO, 0)) return FALSE; if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_MIPS_SYMTABNO, 0)) return FALSE; if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_MIPS_UNREFEXTNO, 0)) return FALSE; if (! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_MIPS_GOTSYM, 0)) return FALSE; if (IRIX_COMPAT (dynobj) == ict_irix5 && ! MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_MIPS_HIPAGENO, 0)) return FALSE; if (IRIX_COMPAT (dynobj) == ict_irix6 && (bfd_get_section_by_name (dynobj, MIPS_ELF_OPTIONS_SECTION_NAME (dynobj))) && !MIPS_ELF_ADD_DYNAMIC_ENTRY (info, DT_MIPS_OPTIONS, 0)) return FALSE; } } return TRUE; } /* REL is a relocation in INPUT_BFD that is being copied to OUTPUT_BFD. Adjust its R_ADDEND field so that it is correct for the output file. LOCAL_SYMS and LOCAL_SECTIONS are arrays of INPUT_BFD's local symbols and sections respectively; both use symbol indexes. */ static void mips_elf_adjust_addend (bfd *output_bfd, struct bfd_link_info *info, bfd *input_bfd, Elf_Internal_Sym *local_syms, asection **local_sections, Elf_Internal_Rela *rel) { unsigned int r_type, r_symndx; Elf_Internal_Sym *sym; asection *sec; if (mips_elf_local_relocation_p (input_bfd, rel, local_sections, FALSE)) { r_type = ELF_R_TYPE (output_bfd, rel->r_info); if (r_type == R_MIPS16_GPREL || r_type == R_MIPS_GPREL16 || r_type == R_MIPS_GPREL32 || r_type == R_MIPS_LITERAL) { rel->r_addend += _bfd_get_gp_value (input_bfd); rel->r_addend -= _bfd_get_gp_value (output_bfd); } r_symndx = ELF_R_SYM (output_bfd, rel->r_info); sym = local_syms + r_symndx; /* Adjust REL's addend to account for section merging. */ if (!info->relocatable) { sec = local_sections[r_symndx]; _bfd_elf_rela_local_sym (output_bfd, sym, &sec, rel); } /* This would normally be done by the rela_normal code in elflink.c. */ if (ELF_ST_TYPE (sym->st_info) == STT_SECTION) rel->r_addend += local_sections[r_symndx]->output_offset; } } /* Relocate a MIPS ELF section. */ bfd_boolean _bfd_mips_elf_relocate_section (bfd *output_bfd, struct bfd_link_info *info, bfd *input_bfd, asection *input_section, bfd_byte *contents, Elf_Internal_Rela *relocs, Elf_Internal_Sym *local_syms, asection **local_sections) { Elf_Internal_Rela *rel; const Elf_Internal_Rela *relend; bfd_vma addend = 0; bfd_boolean use_saved_addend_p = FALSE; const struct elf_backend_data *bed; bed = get_elf_backend_data (output_bfd); relend = relocs + input_section->reloc_count * bed->s->int_rels_per_ext_rel; for (rel = relocs; rel < relend; ++rel) { const char *name; bfd_vma value = 0; reloc_howto_type *howto; bfd_boolean require_jalx; /* TRUE if the relocation is a RELA relocation, rather than a REL relocation. */ bfd_boolean rela_relocation_p = TRUE; unsigned int r_type = ELF_R_TYPE (output_bfd, rel->r_info); const char *msg; unsigned long r_symndx; asection *sec; Elf_Internal_Shdr *symtab_hdr; struct elf_link_hash_entry *h; /* Find the relocation howto for this relocation. */ howto = MIPS_ELF_RTYPE_TO_HOWTO (input_bfd, r_type, NEWABI_P (input_bfd) && (MIPS_RELOC_RELA_P (input_bfd, input_section, rel - relocs))); r_symndx = ELF_R_SYM (input_bfd, rel->r_info); symtab_hdr = &elf_tdata (input_bfd)->symtab_hdr; if (mips_elf_local_relocation_p (input_bfd, rel, local_sections, FALSE)) { sec = local_sections[r_symndx]; h = NULL; } else { unsigned long extsymoff; extsymoff = 0; if (!elf_bad_symtab (input_bfd)) extsymoff = symtab_hdr->sh_info; h = elf_sym_hashes (input_bfd) [r_symndx - extsymoff]; while (h->root.type == bfd_link_hash_indirect || h->root.type == bfd_link_hash_warning) h = (struct elf_link_hash_entry *) h->root.u.i.link; sec = NULL; if (h->root.type == bfd_link_hash_defined || h->root.type == bfd_link_hash_defweak) sec = h->root.u.def.section; } if (sec != NULL && elf_discarded_section (sec)) { /* For relocs against symbols from removed linkonce sections, or sections discarded by a linker script, we just want the section contents zeroed. Avoid any special processing. */ _bfd_clear_contents (howto, input_bfd, contents + rel->r_offset); rel->r_info = 0; rel->r_addend = 0; continue; } if (r_type == R_MIPS_64 && ! NEWABI_P (input_bfd)) { /* Some 32-bit code uses R_MIPS_64. In particular, people use 64-bit code, but make sure all their addresses are in the lowermost or uppermost 32-bit section of the 64-bit address space. Thus, when they use an R_MIPS_64 they mean what is usually meant by R_MIPS_32, with the exception that the stored value is sign-extended to 64 bits. */ howto = MIPS_ELF_RTYPE_TO_HOWTO (input_bfd, R_MIPS_32, FALSE); /* On big-endian systems, we need to lie about the position of the reloc. */ if (bfd_big_endian (input_bfd)) rel->r_offset += 4; } if (!use_saved_addend_p) { Elf_Internal_Shdr *rel_hdr; /* If these relocations were originally of the REL variety, we must pull the addend out of the field that will be relocated. Otherwise, we simply use the contents of the RELA relocation. To determine which flavor or relocation this is, we depend on the fact that the INPUT_SECTION's REL_HDR is read before its REL_HDR2. */ rel_hdr = &elf_section_data (input_section)->rel_hdr; if ((size_t) (rel - relocs) >= (NUM_SHDR_ENTRIES (rel_hdr) * bed->s->int_rels_per_ext_rel)) rel_hdr = elf_section_data (input_section)->rel_hdr2; if (rel_hdr->sh_entsize == MIPS_ELF_REL_SIZE (input_bfd)) { bfd_byte *location = contents + rel->r_offset; /* Note that this is a REL relocation. */ rela_relocation_p = FALSE; /* Get the addend, which is stored in the input file. */ _bfd_mips16_elf_reloc_unshuffle (input_bfd, r_type, FALSE, location); addend = mips_elf_obtain_contents (howto, rel, input_bfd, contents); _bfd_mips16_elf_reloc_shuffle(input_bfd, r_type, FALSE, location); addend &= howto->src_mask; /* For some kinds of relocations, the ADDEND is a combination of the addend stored in two different relocations. */ if (r_type == R_MIPS_HI16 || r_type == R_MIPS16_HI16 || (r_type == R_MIPS_GOT16 && mips_elf_local_relocation_p (input_bfd, rel, local_sections, FALSE))) { const Elf_Internal_Rela *lo16_relocation; reloc_howto_type *lo16_howto; int lo16_type; if (r_type == R_MIPS16_HI16) lo16_type = R_MIPS16_LO16; else lo16_type = R_MIPS_LO16; /* The combined value is the sum of the HI16 addend, left-shifted by sixteen bits, and the LO16 addend, sign extended. (Usually, the code does a `lui' of the HI16 value, and then an `addiu' of the LO16 value.) Scan ahead to find a matching LO16 relocation. According to the MIPS ELF ABI, the R_MIPS_LO16 relocation must be immediately following. However, for the IRIX6 ABI, the next relocation may be a composed relocation consisting of several relocations for the same address. In that case, the R_MIPS_LO16 relocation may occur as one of these. We permit a similar extension in general, as that is useful for GCC. In some cases GCC dead code elimination removes the LO16 but keeps the corresponding HI16. This is strictly speaking a violation of the ABI but not immediately harmful. */ lo16_relocation = mips_elf_next_relocation (input_bfd, lo16_type, rel, relend); if (lo16_relocation == NULL) { const char *name; if (h) name = h->root.root.string; else name = bfd_elf_sym_name (input_bfd, symtab_hdr, local_syms + r_symndx, sec); (*_bfd_error_handler) (_("%B: Can't find matching LO16 reloc against `%s' for %s at 0x%lx in section `%A'"), input_bfd, input_section, name, howto->name, rel->r_offset); } else { bfd_byte *lo16_location; bfd_vma l; lo16_location = contents + lo16_relocation->r_offset; /* Obtain the addend kept there. */ lo16_howto = MIPS_ELF_RTYPE_TO_HOWTO (input_bfd, lo16_type, FALSE); _bfd_mips16_elf_reloc_unshuffle (input_bfd, lo16_type, FALSE, lo16_location); l = mips_elf_obtain_contents (lo16_howto, lo16_relocation, input_bfd, contents); _bfd_mips16_elf_reloc_shuffle (input_bfd, lo16_type, FALSE, lo16_location); l &= lo16_howto->src_mask; l <<= lo16_howto->rightshift; l = _bfd_mips_elf_sign_extend (l, 16); addend <<= 16; /* Compute the combined addend. */ addend += l; } } else addend <<= howto->rightshift; } else addend = rel->r_addend; mips_elf_adjust_addend (output_bfd, info, input_bfd, local_syms, local_sections, rel); } if (info->relocatable) { if (r_type == R_MIPS_64 && ! NEWABI_P (output_bfd) && bfd_big_endian (input_bfd)) rel->r_offset -= 4; if (!rela_relocation_p && rel->r_addend) { addend += rel->r_addend; if (r_type == R_MIPS_HI16 || r_type == R_MIPS_GOT16) addend = mips_elf_high (addend); else if (r_type == R_MIPS_HIGHER) addend = mips_elf_higher (addend); else if (r_type == R_MIPS_HIGHEST) addend = mips_elf_highest (addend); else addend >>= howto->rightshift; /* We use the source mask, rather than the destination mask because the place to which we are writing will be source of the addend in the final link. */ addend &= howto->src_mask; if (r_type == R_MIPS_64 && ! NEWABI_P (output_bfd)) /* See the comment above about using R_MIPS_64 in the 32-bit ABI. Here, we need to update the addend. It would be possible to get away with just using the R_MIPS_32 reloc but for endianness. */ { bfd_vma sign_bits; bfd_vma low_bits; bfd_vma high_bits; if (addend & ((bfd_vma) 1 << 31)) #ifdef BFD64 sign_bits = ((bfd_vma) 1 << 32) - 1; #else sign_bits = -1; #endif else sign_bits = 0; /* If we don't know that we have a 64-bit type, do two separate stores. */ if (bfd_big_endian (input_bfd)) { /* Store the sign-bits (which are most significant) first. */ low_bits = sign_bits; high_bits = addend; } else { low_bits = addend; high_bits = sign_bits; } bfd_put_32 (input_bfd, low_bits, contents + rel->r_offset); bfd_put_32 (input_bfd, high_bits, contents + rel->r_offset + 4); continue; } if (! mips_elf_perform_relocation (info, howto, rel, addend, input_bfd, input_section, contents, FALSE)) return FALSE; } /* Go on to the next relocation. */ continue; } /* In the N32 and 64-bit ABIs there may be multiple consecutive relocations for the same offset. In that case we are supposed to treat the output of each relocation as the addend for the next. */ if (rel + 1 < relend && rel->r_offset == rel[1].r_offset && ELF_R_TYPE (input_bfd, rel[1].r_info) != R_MIPS_NONE) use_saved_addend_p = TRUE; else use_saved_addend_p = FALSE; /* Figure out what value we are supposed to relocate. */ switch (mips_elf_calculate_relocation (output_bfd, input_bfd, input_section, info, rel, addend, howto, local_syms, local_sections, &value, &name, &require_jalx, use_saved_addend_p)) { case bfd_reloc_continue: /* There's nothing to do. */ continue; case bfd_reloc_undefined: /* mips_elf_calculate_relocation already called the undefined_symbol callback. There's no real point in trying to perform the relocation at this point, so we just skip ahead to the next relocation. */ continue; case bfd_reloc_notsupported: msg = _("internal error: unsupported relocation error"); info->callbacks->warning (info, msg, name, input_bfd, input_section, rel->r_offset); return FALSE; case bfd_reloc_overflow: if (use_saved_addend_p) /* Ignore overflow until we reach the last relocation for a given location. */ ; else { BFD_ASSERT (name != NULL); if (! ((*info->callbacks->reloc_overflow) (info, NULL, name, howto->name, (bfd_vma) 0, input_bfd, input_section, rel->r_offset))) return FALSE; } break; case bfd_reloc_ok: break; default: abort (); break; } /* If we've got another relocation for the address, keep going until we reach the last one. */ if (use_saved_addend_p) { addend = value; continue; } if (r_type == R_MIPS_64 && ! NEWABI_P (output_bfd)) /* See the comment above about using R_MIPS_64 in the 32-bit ABI. Until now, we've been using the HOWTO for R_MIPS_32; that calculated the right value. Now, however, we sign-extend the 32-bit result to 64-bits, and store it as a 64-bit value. We are especially generous here in that we go to extreme lengths to support this usage on systems with only a 32-bit VMA. */ { bfd_vma sign_bits; bfd_vma low_bits; bfd_vma high_bits; if (value & ((bfd_vma) 1 << 31)) #ifdef BFD64 sign_bits = ((bfd_vma) 1 << 32) - 1; #else sign_bits = -1; #endif else sign_bits = 0; /* If we don't know that we have a 64-bit type, do two separate stores. */ if (bfd_big_endian (input_bfd)) { /* Undo what we did above. */ rel->r_offset -= 4; /* Store the sign-bits (which are most significant) first. */ low_bits = sign_bits; high_bits = value; } else { low_bits = value; high_bits = sign_bits; } bfd_put_32 (input_bfd, low_bits, contents + rel->r_offset); bfd_put_32 (input_bfd, high_bits, contents + rel->r_offset + 4); continue; } /* Actually perform the relocation. */ if (! mips_elf_perform_relocation (info, howto, rel, value, input_bfd, input_section, contents, require_jalx)) return FALSE; } return TRUE; } /* If NAME is one of the special IRIX6 symbols defined by the linker, adjust it appropriately now. */ static void mips_elf_irix6_finish_dynamic_symbol (bfd *abfd ATTRIBUTE_UNUSED, const char *name, Elf_Internal_Sym *sym) { /* The linker script takes care of providing names and values for these, but we must place them into the right sections. */ static const char* const text_section_symbols[] = { "_ftext", "_etext", "__dso_displacement", "__elf_header", "__program_header_table", NULL }; static const char* const data_section_symbols[] = { "_fdata", "_edata", "_end", "_fbss", NULL }; const char* const *p; int i; for (i = 0; i < 2; ++i) for (p = (i == 0) ? text_section_symbols : data_section_symbols; *p; ++p) if (strcmp (*p, name) == 0) { /* All of these symbols are given type STT_SECTION by the IRIX6 linker. */ sym->st_info = ELF_ST_INFO (STB_GLOBAL, STT_SECTION); sym->st_other = STO_PROTECTED; /* The IRIX linker puts these symbols in special sections. */ if (i == 0) sym->st_shndx = SHN_MIPS_TEXT; else sym->st_shndx = SHN_MIPS_DATA; break; } } /* Finish up dynamic symbol handling. We set the contents of various dynamic sections here. */ bfd_boolean _bfd_mips_elf_finish_dynamic_symbol (bfd *output_bfd, struct bfd_link_info *info, struct elf_link_hash_entry *h, Elf_Internal_Sym *sym) { bfd *dynobj; asection *sgot; struct mips_got_info *g, *gg; const char *name; int idx; struct mips_elf_link_hash_table *htab; htab = mips_elf_hash_table (info); dynobj = elf_hash_table (info)->dynobj; if (h->plt.offset != MINUS_ONE) { asection *s; bfd_byte stub[MIPS_FUNCTION_STUB_BIG_SIZE]; /* This symbol has a stub. Set it up. */ BFD_ASSERT (h->dynindx != -1); s = bfd_get_section_by_name (dynobj, MIPS_ELF_STUB_SECTION_NAME (dynobj)); BFD_ASSERT (s != NULL); BFD_ASSERT ((htab->function_stub_size == MIPS_FUNCTION_STUB_BIG_SIZE) || (h->dynindx <= 0xffff)); /* Values up to 2^31 - 1 are allowed. Larger values would cause sign extension at runtime in the stub, resulting in a negative index value. */ if (h->dynindx & ~0x7fffffff) return FALSE; /* Fill the stub. */ idx = 0; bfd_put_32 (output_bfd, STUB_LW (output_bfd), stub + idx); idx += 4; bfd_put_32 (output_bfd, STUB_MOVE (output_bfd), stub + idx); idx += 4; if (htab->function_stub_size == MIPS_FUNCTION_STUB_BIG_SIZE) { bfd_put_32 (output_bfd, STUB_LUI ((h->dynindx >> 16) & 0x7fff), stub + idx); idx += 4; } bfd_put_32 (output_bfd, STUB_JALR, stub + idx); idx += 4; /* If a large stub is not required and sign extension is not a problem, then use legacy code in the stub. */ if (htab->function_stub_size == MIPS_FUNCTION_STUB_BIG_SIZE) bfd_put_32 (output_bfd, STUB_ORI (h->dynindx & 0xffff), stub + idx); else if (h->dynindx & ~0x7fff) bfd_put_32 (output_bfd, STUB_LI16U (h->dynindx & 0xffff), stub + idx); else bfd_put_32 (output_bfd, STUB_LI16S (output_bfd, h->dynindx), stub + idx); BFD_ASSERT (h->plt.offset <= s->size); memcpy (s->contents + h->plt.offset, stub, htab->function_stub_size); /* Mark the symbol as undefined. plt.offset != -1 occurs only for the referenced symbol. */ sym->st_shndx = SHN_UNDEF; /* The run-time linker uses the st_value field of the symbol to reset the global offset table entry for this external to its stub address when unlinking a shared object. */ sym->st_value = (s->output_section->vma + s->output_offset + h->plt.offset); } BFD_ASSERT (h->dynindx != -1 || h->forced_local); sgot = mips_elf_got_section (dynobj, FALSE); BFD_ASSERT (sgot != NULL); BFD_ASSERT (mips_elf_section_data (sgot) != NULL); g = mips_elf_section_data (sgot)->u.got_info; BFD_ASSERT (g != NULL); /* Run through the global symbol table, creating GOT entries for all the symbols that need them. */ if (g->global_gotsym != NULL && h->dynindx >= g->global_gotsym->dynindx) { bfd_vma offset; bfd_vma value; value = sym->st_value; offset = mips_elf_global_got_index (dynobj, output_bfd, h, R_MIPS_GOT16, info); MIPS_ELF_PUT_WORD (output_bfd, value, sgot->contents + offset); } if (g->next && h->dynindx != -1 && h->type != STT_TLS) { struct mips_got_entry e, *p; bfd_vma entry; bfd_vma offset; gg = g; e.abfd = output_bfd; e.symndx = -1; e.d.h = (struct mips_elf_link_hash_entry *)h; e.tls_type = 0; for (g = g->next; g->next != gg; g = g->next) { if (g->got_entries && (p = (struct mips_got_entry *) htab_find (g->got_entries, &e))) { offset = p->gotidx; if (info->shared || (elf_hash_table (info)->dynamic_sections_created && p->d.h != NULL && p->d.h->root.def_dynamic && !p->d.h->root.def_regular)) { /* Create an R_MIPS_REL32 relocation for this entry. Due to the various compatibility problems, it's easier to mock up an R_MIPS_32 or R_MIPS_64 relocation and leave mips_elf_create_dynamic_relocation to calculate the appropriate addend. */ Elf_Internal_Rela rel[3]; memset (rel, 0, sizeof (rel)); if (ABI_64_P (output_bfd)) rel[0].r_info = ELF_R_INFO (output_bfd, 0, R_MIPS_64); else rel[0].r_info = ELF_R_INFO (output_bfd, 0, R_MIPS_32); rel[0].r_offset = rel[1].r_offset = rel[2].r_offset = offset; entry = 0; if (! (mips_elf_create_dynamic_relocation (output_bfd, info, rel, e.d.h, NULL, sym->st_value, &entry, sgot))) return FALSE; } else entry = sym->st_value; MIPS_ELF_PUT_WORD (output_bfd, entry, sgot->contents + offset); } } } /* Mark _DYNAMIC and _GLOBAL_OFFSET_TABLE_ as absolute. */ name = h->root.root.string; if (strcmp (name, "_DYNAMIC") == 0 || h == elf_hash_table (info)->hgot) sym->st_shndx = SHN_ABS; else if (strcmp (name, "_DYNAMIC_LINK") == 0 || strcmp (name, "_DYNAMIC_LINKING") == 0) { sym->st_shndx = SHN_ABS; sym->st_info = ELF_ST_INFO (STB_GLOBAL, STT_SECTION); sym->st_value = 1; } else if (strcmp (name, "_gp_disp") == 0 && ! NEWABI_P (output_bfd)) { sym->st_shndx = SHN_ABS; sym->st_info = ELF_ST_INFO (STB_GLOBAL, STT_SECTION); sym->st_value = elf_gp (output_bfd); } else if (SGI_COMPAT (output_bfd)) { if (strcmp (name, mips_elf_dynsym_rtproc_names[0]) == 0 || strcmp (name, mips_elf_dynsym_rtproc_names[1]) == 0) { sym->st_info = ELF_ST_INFO (STB_GLOBAL, STT_SECTION); sym->st_other = STO_PROTECTED; sym->st_value = 0; sym->st_shndx = SHN_MIPS_DATA; } else if (strcmp (name, mips_elf_dynsym_rtproc_names[2]) == 0) { sym->st_info = ELF_ST_INFO (STB_GLOBAL, STT_SECTION); sym->st_other = STO_PROTECTED; sym->st_value = mips_elf_hash_table (info)->procedure_count; sym->st_shndx = SHN_ABS; } else if (sym->st_shndx != SHN_UNDEF && sym->st_shndx != SHN_ABS) { if (h->type == STT_FUNC) sym->st_shndx = SHN_MIPS_TEXT; else if (h->type == STT_OBJECT) sym->st_shndx = SHN_MIPS_DATA; } } /* Handle the IRIX6-specific symbols. */ if (IRIX_COMPAT (output_bfd) == ict_irix6) mips_elf_irix6_finish_dynamic_symbol (output_bfd, name, sym); if (! info->shared) { if (! mips_elf_hash_table (info)->use_rld_obj_head && (strcmp (name, "__rld_map") == 0 || strcmp (name, "__RLD_MAP") == 0)) { asection *s = bfd_get_section_by_name (dynobj, ".rld_map"); BFD_ASSERT (s != NULL); sym->st_value = s->output_section->vma + s->output_offset; bfd_put_32 (output_bfd, 0, s->contents); if (mips_elf_hash_table (info)->rld_value == 0) mips_elf_hash_table (info)->rld_value = sym->st_value; } else if (mips_elf_hash_table (info)->use_rld_obj_head && strcmp (name, "__rld_obj_head") == 0) { /* IRIX6 does not use a .rld_map section. */ if (IRIX_COMPAT (output_bfd) == ict_irix5 || IRIX_COMPAT (output_bfd) == ict_none) BFD_ASSERT (bfd_get_section_by_name (dynobj, ".rld_map") != NULL); mips_elf_hash_table (info)->rld_value = sym->st_value; } } /* If this is a mips16 symbol, force the value to be even. */ if (sym->st_other == STO_MIPS16) sym->st_value &= ~1; return TRUE; } /* Likewise, for VxWorks. */ bfd_boolean _bfd_mips_vxworks_finish_dynamic_symbol (bfd *output_bfd, struct bfd_link_info *info, struct elf_link_hash_entry *h, Elf_Internal_Sym *sym) { bfd *dynobj; asection *sgot; struct mips_got_info *g; struct mips_elf_link_hash_table *htab; htab = mips_elf_hash_table (info); dynobj = elf_hash_table (info)->dynobj; if (h->plt.offset != (bfd_vma) -1) { bfd_byte *loc; bfd_vma plt_address, plt_index, got_address, got_offset, branch_offset; Elf_Internal_Rela rel; static const bfd_vma *plt_entry; BFD_ASSERT (h->dynindx != -1); BFD_ASSERT (htab->splt != NULL); BFD_ASSERT (h->plt.offset <= htab->splt->size); /* Calculate the address of the .plt entry. */ plt_address = (htab->splt->output_section->vma + htab->splt->output_offset + h->plt.offset); /* Calculate the index of the entry. */ plt_index = ((h->plt.offset - htab->plt_header_size) / htab->plt_entry_size); /* Calculate the address of the .got.plt entry. */ got_address = (htab->sgotplt->output_section->vma + htab->sgotplt->output_offset + plt_index * 4); /* Calculate the offset of the .got.plt entry from _GLOBAL_OFFSET_TABLE_. */ got_offset = mips_elf_gotplt_index (info, h); /* Calculate the offset for the branch at the start of the PLT entry. The branch jumps to the beginning of .plt. */ branch_offset = -(h->plt.offset / 4 + 1) & 0xffff; /* Fill in the initial value of the .got.plt entry. */ bfd_put_32 (output_bfd, plt_address, htab->sgotplt->contents + plt_index * 4); /* Find out where the .plt entry should go. */ loc = htab->splt->contents + h->plt.offset; if (info->shared) { plt_entry = mips_vxworks_shared_plt_entry; bfd_put_32 (output_bfd, plt_entry[0] | branch_offset, loc); bfd_put_32 (output_bfd, plt_entry[1] | plt_index, loc + 4); } else { bfd_vma got_address_high, got_address_low; plt_entry = mips_vxworks_exec_plt_entry; got_address_high = ((got_address + 0x8000) >> 16) & 0xffff; got_address_low = got_address & 0xffff; bfd_put_32 (output_bfd, plt_entry[0] | branch_offset, loc); bfd_put_32 (output_bfd, plt_entry[1] | plt_index, loc + 4); bfd_put_32 (output_bfd, plt_entry[2] | got_address_high, loc + 8); bfd_put_32 (output_bfd, plt_entry[3] | got_address_low, loc + 12); bfd_put_32 (output_bfd, plt_entry[4], loc + 16); bfd_put_32 (output_bfd, plt_entry[5], loc + 20); bfd_put_32 (output_bfd, plt_entry[6], loc + 24); bfd_put_32 (output_bfd, plt_entry[7], loc + 28); loc = (htab->srelplt2->contents + (plt_index * 3 + 2) * sizeof (Elf32_External_Rela)); /* Emit a relocation for the .got.plt entry. */ rel.r_offset = got_address; rel.r_info = ELF32_R_INFO (htab->root.hplt->indx, R_MIPS_32); rel.r_addend = h->plt.offset; bfd_elf32_swap_reloca_out (output_bfd, &rel, loc); /* Emit a relocation for the lui of %hi(<.got.plt slot>). */ loc += sizeof (Elf32_External_Rela); rel.r_offset = plt_address + 8; rel.r_info = ELF32_R_INFO (htab->root.hgot->indx, R_MIPS_HI16); rel.r_addend = got_offset; bfd_elf32_swap_reloca_out (output_bfd, &rel, loc); /* Emit a relocation for the addiu of %lo(<.got.plt slot>). */ loc += sizeof (Elf32_External_Rela); rel.r_offset += 4; rel.r_info = ELF32_R_INFO (htab->root.hgot->indx, R_MIPS_LO16); bfd_elf32_swap_reloca_out (output_bfd, &rel, loc); } /* Emit an R_MIPS_JUMP_SLOT relocation against the .got.plt entry. */ loc = htab->srelplt->contents + plt_index * sizeof (Elf32_External_Rela); rel.r_offset = got_address; rel.r_info = ELF32_R_INFO (h->dynindx, R_MIPS_JUMP_SLOT); rel.r_addend = 0; bfd_elf32_swap_reloca_out (output_bfd, &rel, loc); if (!h->def_regular) sym->st_shndx = SHN_UNDEF; } BFD_ASSERT (h->dynindx != -1 || h->forced_local); sgot = mips_elf_got_section (dynobj, FALSE); BFD_ASSERT (sgot != NULL); BFD_ASSERT (mips_elf_section_data (sgot) != NULL); g = mips_elf_section_data (sgot)->u.got_info; BFD_ASSERT (g != NULL); /* See if this symbol has an entry in the GOT. */ if (g->global_gotsym != NULL && h->dynindx >= g->global_gotsym->dynindx) { bfd_vma offset; Elf_Internal_Rela outrel; bfd_byte *loc; asection *s; /* Install the symbol value in the GOT. */ offset = mips_elf_global_got_index (dynobj, output_bfd, h, R_MIPS_GOT16, info); MIPS_ELF_PUT_WORD (output_bfd, sym->st_value, sgot->contents + offset); /* Add a dynamic relocation for it. */ s = mips_elf_rel_dyn_section (info, FALSE); loc = s->contents + (s->reloc_count++ * sizeof (Elf32_External_Rela)); outrel.r_offset = (sgot->output_section->vma + sgot->output_offset + offset); outrel.r_info = ELF32_R_INFO (h->dynindx, R_MIPS_32); outrel.r_addend = 0; bfd_elf32_swap_reloca_out (dynobj, &outrel, loc); } /* Emit a copy reloc, if needed. */ if (h->needs_copy) { Elf_Internal_Rela rel; BFD_ASSERT (h->dynindx != -1); rel.r_offset = (h->root.u.def.section->output_section->vma + h->root.u.def.section->output_offset + h->root.u.def.value); rel.r_info = ELF32_R_INFO (h->dynindx, R_MIPS_COPY); rel.r_addend = 0; bfd_elf32_swap_reloca_out (output_bfd, &rel, htab->srelbss->contents + (htab->srelbss->reloc_count * sizeof (Elf32_External_Rela))); ++htab->srelbss->reloc_count; } /* If this is a mips16 symbol, force the value to be even. */ if (sym->st_other == STO_MIPS16) sym->st_value &= ~1; return TRUE; } /* Install the PLT header for a VxWorks executable and finalize the contents of .rela.plt.unloaded. */ static void mips_vxworks_finish_exec_plt (bfd *output_bfd, struct bfd_link_info *info) { Elf_Internal_Rela rela; bfd_byte *loc; bfd_vma got_value, got_value_high, got_value_low, plt_address; static const bfd_vma *plt_entry; struct mips_elf_link_hash_table *htab; htab = mips_elf_hash_table (info); plt_entry = mips_vxworks_exec_plt0_entry; /* Calculate the value of _GLOBAL_OFFSET_TABLE_. */ got_value = (htab->root.hgot->root.u.def.section->output_section->vma + htab->root.hgot->root.u.def.section->output_offset + htab->root.hgot->root.u.def.value); got_value_high = ((got_value + 0x8000) >> 16) & 0xffff; got_value_low = got_value & 0xffff; /* Calculate the address of the PLT header. */ plt_address = htab->splt->output_section->vma + htab->splt->output_offset; /* Install the PLT header. */ loc = htab->splt->contents; bfd_put_32 (output_bfd, plt_entry[0] | got_value_high, loc); bfd_put_32 (output_bfd, plt_entry[1] | got_value_low, loc + 4); bfd_put_32 (output_bfd, plt_entry[2], loc + 8); bfd_put_32 (output_bfd, plt_entry[3], loc + 12); bfd_put_32 (output_bfd, plt_entry[4], loc + 16); bfd_put_32 (output_bfd, plt_entry[5], loc + 20); /* Output the relocation for the lui of %hi(_GLOBAL_OFFSET_TABLE_). */ loc = htab->srelplt2->contents; rela.r_offset = plt_address; rela.r_info = ELF32_R_INFO (htab->root.hgot->indx, R_MIPS_HI16); rela.r_addend = 0; bfd_elf32_swap_reloca_out (output_bfd, &rela, loc); loc += sizeof (Elf32_External_Rela); /* Output the relocation for the following addiu of %lo(_GLOBAL_OFFSET_TABLE_). */ rela.r_offset += 4; rela.r_info = ELF32_R_INFO (htab->root.hgot->indx, R_MIPS_LO16); bfd_elf32_swap_reloca_out (output_bfd, &rela, loc); loc += sizeof (Elf32_External_Rela); /* Fix up the remaining relocations. They may have the wrong symbol index for _G_O_T_ or _P_L_T_ depending on the order in which symbols were output. */ while (loc < htab->srelplt2->contents + htab->srelplt2->size) { Elf_Internal_Rela rel; bfd_elf32_swap_reloca_in (output_bfd, loc, &rel); rel.r_info = ELF32_R_INFO (htab->root.hplt->indx, R_MIPS_32); bfd_elf32_swap_reloca_out (output_bfd, &rel, loc); loc += sizeof (Elf32_External_Rela); bfd_elf32_swap_reloca_in (output_bfd, loc, &rel); rel.r_info = ELF32_R_INFO (htab->root.hgot->indx, R_MIPS_HI16); bfd_elf32_swap_reloca_out (output_bfd, &rel, loc); loc += sizeof (Elf32_External_Rela); bfd_elf32_swap_reloca_in (output_bfd, loc, &rel); rel.r_info = ELF32_R_INFO (htab->root.hgot->indx, R_MIPS_LO16); bfd_elf32_swap_reloca_out (output_bfd, &rel, loc); loc += sizeof (Elf32_External_Rela); } } /* Install the PLT header for a VxWorks shared library. */ static void mips_vxworks_finish_shared_plt (bfd *output_bfd, struct bfd_link_info *info) { unsigned int i; struct mips_elf_link_hash_table *htab; htab = mips_elf_hash_table (info); /* We just need to copy the entry byte-by-byte. */ for (i = 0; i < ARRAY_SIZE (mips_vxworks_shared_plt0_entry); i++) bfd_put_32 (output_bfd, mips_vxworks_shared_plt0_entry[i], htab->splt->contents + i * 4); } /* Finish up the dynamic sections. */ bfd_boolean _bfd_mips_elf_finish_dynamic_sections (bfd *output_bfd, struct bfd_link_info *info) { bfd *dynobj; asection *sdyn; asection *sgot; struct mips_got_info *gg, *g; struct mips_elf_link_hash_table *htab; htab = mips_elf_hash_table (info); dynobj = elf_hash_table (info)->dynobj; sdyn = bfd_get_section_by_name (dynobj, ".dynamic"); sgot = mips_elf_got_section (dynobj, FALSE); if (sgot == NULL) gg = g = NULL; else { BFD_ASSERT (mips_elf_section_data (sgot) != NULL); gg = mips_elf_section_data (sgot)->u.got_info; BFD_ASSERT (gg != NULL); g = mips_elf_got_for_ibfd (gg, output_bfd); BFD_ASSERT (g != NULL); } if (elf_hash_table (info)->dynamic_sections_created) { bfd_byte *b; int dyn_to_skip = 0, dyn_skipped = 0; BFD_ASSERT (sdyn != NULL); BFD_ASSERT (g != NULL); for (b = sdyn->contents; b < sdyn->contents + sdyn->size; b += MIPS_ELF_DYN_SIZE (dynobj)) { Elf_Internal_Dyn dyn; const char *name; size_t elemsize; asection *s; bfd_boolean swap_out_p; /* Read in the current dynamic entry. */ (*get_elf_backend_data (dynobj)->s->swap_dyn_in) (dynobj, b, &dyn); /* Assume that we're going to modify it and write it out. */ swap_out_p = TRUE; switch (dyn.d_tag) { case DT_RELENT: dyn.d_un.d_val = MIPS_ELF_REL_SIZE (dynobj); break; case DT_RELAENT: BFD_ASSERT (htab->is_vxworks); dyn.d_un.d_val = MIPS_ELF_RELA_SIZE (dynobj); break; case DT_STRSZ: /* Rewrite DT_STRSZ. */ dyn.d_un.d_val = _bfd_elf_strtab_size (elf_hash_table (info)->dynstr); break; case DT_PLTGOT: name = ".got"; if (htab->is_vxworks) { /* _GLOBAL_OFFSET_TABLE_ is defined to be the beginning of the ".got" section in DYNOBJ. */ s = bfd_get_section_by_name (dynobj, name); BFD_ASSERT (s != NULL); dyn.d_un.d_ptr = s->output_section->vma + s->output_offset; } else { s = bfd_get_section_by_name (output_bfd, name); BFD_ASSERT (s != NULL); dyn.d_un.d_ptr = s->vma; } break; case DT_MIPS_RLD_VERSION: dyn.d_un.d_val = 1; /* XXX */ break; case DT_MIPS_FLAGS: dyn.d_un.d_val = RHF_NOTPOT; /* XXX */ break; case DT_MIPS_TIME_STAMP: { time_t t; time (&t); dyn.d_un.d_val = t; } break; case DT_MIPS_ICHECKSUM: /* XXX FIXME: */ swap_out_p = FALSE; break; case DT_MIPS_IVERSION: /* XXX FIXME: */ swap_out_p = FALSE; break; case DT_MIPS_BASE_ADDRESS: s = output_bfd->sections; BFD_ASSERT (s != NULL); dyn.d_un.d_ptr = s->vma & ~(bfd_vma) 0xffff; break; case DT_MIPS_LOCAL_GOTNO: dyn.d_un.d_val = g->local_gotno; break; case DT_MIPS_UNREFEXTNO: /* The index into the dynamic symbol table which is the entry of the first external symbol that is not referenced within the same object. */ dyn.d_un.d_val = bfd_count_sections (output_bfd) + 1; break; case DT_MIPS_GOTSYM: if (gg->global_gotsym) { dyn.d_un.d_val = gg->global_gotsym->dynindx; break; } /* In case if we don't have global got symbols we default to setting DT_MIPS_GOTSYM to the same value as DT_MIPS_SYMTABNO, so we just fall through. */ case DT_MIPS_SYMTABNO: name = ".dynsym"; elemsize = MIPS_ELF_SYM_SIZE (output_bfd); s = bfd_get_section_by_name (output_bfd, name); BFD_ASSERT (s != NULL); dyn.d_un.d_val = s->size / elemsize; break; case DT_MIPS_HIPAGENO: dyn.d_un.d_val = g->local_gotno - MIPS_RESERVED_GOTNO (info); break; case DT_MIPS_RLD_MAP: dyn.d_un.d_ptr = mips_elf_hash_table (info)->rld_value; break; case DT_MIPS_OPTIONS: s = (bfd_get_section_by_name (output_bfd, MIPS_ELF_OPTIONS_SECTION_NAME (output_bfd))); dyn.d_un.d_ptr = s->vma; break; case DT_RELASZ: BFD_ASSERT (htab->is_vxworks); /* The count does not include the JUMP_SLOT relocations. */ if (htab->srelplt) dyn.d_un.d_val -= htab->srelplt->size; break; case DT_PLTREL: BFD_ASSERT (htab->is_vxworks); dyn.d_un.d_val = DT_RELA; break; case DT_PLTRELSZ: BFD_ASSERT (htab->is_vxworks); dyn.d_un.d_val = htab->srelplt->size; break; case DT_JMPREL: BFD_ASSERT (htab->is_vxworks); dyn.d_un.d_val = (htab->srelplt->output_section->vma + htab->srelplt->output_offset); break; case DT_TEXTREL: /* If we didn't need any text relocations after all, delete the dynamic tag. */ if (!(info->flags & DF_TEXTREL)) { dyn_to_skip = MIPS_ELF_DYN_SIZE (dynobj); swap_out_p = FALSE; } break; case DT_FLAGS: /* If we didn't need any text relocations after all, clear DF_TEXTREL from DT_FLAGS. */ if (!(info->flags & DF_TEXTREL)) dyn.d_un.d_val &= ~DF_TEXTREL; else swap_out_p = FALSE; break; default: swap_out_p = FALSE; break; } if (swap_out_p || dyn_skipped) (*get_elf_backend_data (dynobj)->s->swap_dyn_out) (dynobj, &dyn, b - dyn_skipped); if (dyn_to_skip) { dyn_skipped += dyn_to_skip; dyn_to_skip = 0; } } /* Wipe out any trailing entries if we shifted down a dynamic tag. */ if (dyn_skipped > 0) memset (b - dyn_skipped, 0, dyn_skipped); } if (sgot != NULL && sgot->size > 0) { if (htab->is_vxworks) { /* The first entry of the global offset table points to the ".dynamic" section. The second is initialized by the loader and contains the shared library identifier. The third is also initialized by the loader and points to the lazy resolution stub. */ MIPS_ELF_PUT_WORD (output_bfd, sdyn->output_offset + sdyn->output_section->vma, sgot->contents); MIPS_ELF_PUT_WORD (output_bfd, 0, sgot->contents + MIPS_ELF_GOT_SIZE (output_bfd)); MIPS_ELF_PUT_WORD (output_bfd, 0, sgot->contents + 2 * MIPS_ELF_GOT_SIZE (output_bfd)); } else { /* The first entry of the global offset table will be filled at runtime. The second entry will be used by some runtime loaders. This isn't the case of IRIX rld. */ MIPS_ELF_PUT_WORD (output_bfd, (bfd_vma) 0, sgot->contents); MIPS_ELF_PUT_WORD (output_bfd, (bfd_vma) 0x80000000, sgot->contents + MIPS_ELF_GOT_SIZE (output_bfd)); } elf_section_data (sgot->output_section)->this_hdr.sh_entsize = MIPS_ELF_GOT_SIZE (output_bfd); } /* Generate dynamic relocations for the non-primary gots. */ if (gg != NULL && gg->next) { Elf_Internal_Rela rel[3]; bfd_vma addend = 0; memset (rel, 0, sizeof (rel)); rel[0].r_info = ELF_R_INFO (output_bfd, 0, R_MIPS_REL32); for (g = gg->next; g->next != gg; g = g->next) { bfd_vma index = g->next->local_gotno + g->next->global_gotno + g->next->tls_gotno; MIPS_ELF_PUT_WORD (output_bfd, 0, sgot->contents + index++ * MIPS_ELF_GOT_SIZE (output_bfd)); MIPS_ELF_PUT_WORD (output_bfd, 0x80000000, sgot->contents + index++ * MIPS_ELF_GOT_SIZE (output_bfd)); if (! info->shared) continue; while (index < g->assigned_gotno) { rel[0].r_offset = rel[1].r_offset = rel[2].r_offset = index++ * MIPS_ELF_GOT_SIZE (output_bfd); if (!(mips_elf_create_dynamic_relocation (output_bfd, info, rel, NULL, bfd_abs_section_ptr, 0, &addend, sgot))) return FALSE; BFD_ASSERT (addend == 0); } } } /* The generation of dynamic relocations for the non-primary gots adds more dynamic relocations. We cannot count them until here. */ if (elf_hash_table (info)->dynamic_sections_created) { bfd_byte *b; bfd_boolean swap_out_p; BFD_ASSERT (sdyn != NULL); for (b = sdyn->contents; b < sdyn->contents + sdyn->size; b += MIPS_ELF_DYN_SIZE (dynobj)) { Elf_Internal_Dyn dyn; asection *s; /* Read in the current dynamic entry. */ (*get_elf_backend_data (dynobj)->s->swap_dyn_in) (dynobj, b, &dyn); /* Assume that we're going to modify it and write it out. */ swap_out_p = TRUE; switch (dyn.d_tag) { case DT_RELSZ: /* Reduce DT_RELSZ to account for any relocations we decided not to make. This is for the n64 irix rld, which doesn't seem to apply any relocations if there are trailing null entries. */ s = mips_elf_rel_dyn_section (info, FALSE); dyn.d_un.d_val = (s->reloc_count * (ABI_64_P (output_bfd) ? sizeof (Elf64_Mips_External_Rel) : sizeof (Elf32_External_Rel))); /* Adjust the section size too. Tools like the prelinker can reasonably expect the values to the same. */ elf_section_data (s->output_section)->this_hdr.sh_size = dyn.d_un.d_val; break; default: swap_out_p = FALSE; break; } if (swap_out_p) (*get_elf_backend_data (dynobj)->s->swap_dyn_out) (dynobj, &dyn, b); } } { asection *s; Elf32_compact_rel cpt; if (SGI_COMPAT (output_bfd)) { /* Write .compact_rel section out. */ s = bfd_get_section_by_name (dynobj, ".compact_rel"); if (s != NULL) { cpt.id1 = 1; cpt.num = s->reloc_count; cpt.id2 = 2; cpt.offset = (s->output_section->filepos + sizeof (Elf32_External_compact_rel)); cpt.reserved0 = 0; cpt.reserved1 = 0; bfd_elf32_swap_compact_rel_out (output_bfd, &cpt, ((Elf32_External_compact_rel *) s->contents)); /* Clean up a dummy stub function entry in .text. */ s = bfd_get_section_by_name (dynobj, MIPS_ELF_STUB_SECTION_NAME (dynobj)); if (s != NULL) { file_ptr dummy_offset; BFD_ASSERT (s->size >= htab->function_stub_size); dummy_offset = s->size - htab->function_stub_size; memset (s->contents + dummy_offset, 0, htab->function_stub_size); } } } /* The psABI says that the dynamic relocations must be sorted in increasing order of r_symndx. The VxWorks EABI doesn't require this, and because the code below handles REL rather than RELA relocations, using it for VxWorks would be outright harmful. */ if (!htab->is_vxworks) { s = mips_elf_rel_dyn_section (info, FALSE); if (s != NULL && s->size > (bfd_vma)2 * MIPS_ELF_REL_SIZE (output_bfd)) { reldyn_sorting_bfd = output_bfd; if (ABI_64_P (output_bfd)) qsort ((Elf64_External_Rel *) s->contents + 1, s->reloc_count - 1, sizeof (Elf64_Mips_External_Rel), sort_dynamic_relocs_64); else qsort ((Elf32_External_Rel *) s->contents + 1, s->reloc_count - 1, sizeof (Elf32_External_Rel), sort_dynamic_relocs); } } } if (htab->is_vxworks && htab->splt->size > 0) { if (info->shared) mips_vxworks_finish_shared_plt (output_bfd, info); else mips_vxworks_finish_exec_plt (output_bfd, info); } return TRUE; } /* Set ABFD's EF_MIPS_ARCH and EF_MIPS_MACH flags. */ static void mips_set_isa_flags (bfd *abfd) { flagword val; switch (bfd_get_mach (abfd)) { default: case bfd_mach_mips3000: val = E_MIPS_ARCH_1; break; case bfd_mach_mips3900: val = E_MIPS_ARCH_1 | E_MIPS_MACH_3900; break; case bfd_mach_mips6000: val = E_MIPS_ARCH_2; break; case bfd_mach_mips4000: case bfd_mach_mips4300: case bfd_mach_mips4400: case bfd_mach_mips4600: val = E_MIPS_ARCH_3; break; case bfd_mach_mips4010: val = E_MIPS_ARCH_3 | E_MIPS_MACH_4010; break; case bfd_mach_mips4100: val = E_MIPS_ARCH_3 | E_MIPS_MACH_4100; break; case bfd_mach_mips4111: val = E_MIPS_ARCH_3 | E_MIPS_MACH_4111; break; case bfd_mach_mips4120: val = E_MIPS_ARCH_3 | E_MIPS_MACH_4120; break; case bfd_mach_mips4650: val = E_MIPS_ARCH_3 | E_MIPS_MACH_4650; break; case bfd_mach_mips5400: val = E_MIPS_ARCH_4 | E_MIPS_MACH_5400; break; case bfd_mach_mips5500: val = E_MIPS_ARCH_4 | E_MIPS_MACH_5500; break; case bfd_mach_mips9000: val = E_MIPS_ARCH_4 | E_MIPS_MACH_9000; break; case bfd_mach_mips5000: case bfd_mach_mips7000: case bfd_mach_mips8000: case bfd_mach_mips10000: case bfd_mach_mips12000: val = E_MIPS_ARCH_4; break; case bfd_mach_mips5: val = E_MIPS_ARCH_5; break; case bfd_mach_mips_octeon: val = E_MIPS_ARCH_64R2 | E_MIPS_MACH_OCTEON; break; case bfd_mach_mips_sb1: val = E_MIPS_ARCH_64 | E_MIPS_MACH_SB1; break; case bfd_mach_mipsisa32: val = E_MIPS_ARCH_32; break; case bfd_mach_mipsisa64: val = E_MIPS_ARCH_64; break; case bfd_mach_mipsisa32r2: val = E_MIPS_ARCH_32R2; break; case bfd_mach_mipsisa64r2: val = E_MIPS_ARCH_64R2; break; } elf_elfheader (abfd)->e_flags &= ~(EF_MIPS_ARCH | EF_MIPS_MACH); elf_elfheader (abfd)->e_flags |= val; } /* The final processing done just before writing out a MIPS ELF object file. This gets the MIPS architecture right based on the machine number. This is used by both the 32-bit and the 64-bit ABI. */ void _bfd_mips_elf_final_write_processing (bfd *abfd, bfd_boolean linker ATTRIBUTE_UNUSED) { unsigned int i; Elf_Internal_Shdr **hdrpp; const char *name; asection *sec; /* Keep the existing EF_MIPS_MACH and EF_MIPS_ARCH flags if the former is nonzero. This is for compatibility with old objects, which used a combination of a 32-bit EF_MIPS_ARCH and a 64-bit EF_MIPS_MACH. */ if ((elf_elfheader (abfd)->e_flags & EF_MIPS_MACH) == 0) mips_set_isa_flags (abfd); /* Set the sh_info field for .gptab sections and other appropriate info for each special section. */ for (i = 1, hdrpp = elf_elfsections (abfd) + 1; i < elf_numsections (abfd); i++, hdrpp++) { switch ((*hdrpp)->sh_type) { case SHT_MIPS_MSYM: case SHT_MIPS_LIBLIST: sec = bfd_get_section_by_name (abfd, ".dynstr"); if (sec != NULL) (*hdrpp)->sh_link = elf_section_data (sec)->this_idx; break; case SHT_MIPS_GPTAB: BFD_ASSERT ((*hdrpp)->bfd_section != NULL); name = bfd_get_section_name (abfd, (*hdrpp)->bfd_section); BFD_ASSERT (name != NULL && CONST_STRNEQ (name, ".gptab.")); sec = bfd_get_section_by_name (abfd, name + sizeof ".gptab" - 1); BFD_ASSERT (sec != NULL); (*hdrpp)->sh_info = elf_section_data (sec)->this_idx; break; case SHT_MIPS_CONTENT: BFD_ASSERT ((*hdrpp)->bfd_section != NULL); name = bfd_get_section_name (abfd, (*hdrpp)->bfd_section); BFD_ASSERT (name != NULL && CONST_STRNEQ (name, ".MIPS.content")); sec = bfd_get_section_by_name (abfd, name + sizeof ".MIPS.content" - 1); BFD_ASSERT (sec != NULL); (*hdrpp)->sh_link = elf_section_data (sec)->this_idx; break; case SHT_MIPS_SYMBOL_LIB: sec = bfd_get_section_by_name (abfd, ".dynsym"); if (sec != NULL) (*hdrpp)->sh_link = elf_section_data (sec)->this_idx; sec = bfd_get_section_by_name (abfd, ".liblist"); if (sec != NULL) (*hdrpp)->sh_info = elf_section_data (sec)->this_idx; break; case SHT_MIPS_EVENTS: BFD_ASSERT ((*hdrpp)->bfd_section != NULL); name = bfd_get_section_name (abfd, (*hdrpp)->bfd_section); BFD_ASSERT (name != NULL); if (CONST_STRNEQ (name, ".MIPS.events")) sec = bfd_get_section_by_name (abfd, name + sizeof ".MIPS.events" - 1); else { BFD_ASSERT (CONST_STRNEQ (name, ".MIPS.post_rel")); sec = bfd_get_section_by_name (abfd, (name + sizeof ".MIPS.post_rel" - 1)); } BFD_ASSERT (sec != NULL); (*hdrpp)->sh_link = elf_section_data (sec)->this_idx; break; } } } /* When creating an IRIX5 executable, we need REGINFO and RTPROC segments. */ int _bfd_mips_elf_additional_program_headers (bfd *abfd, struct bfd_link_info *info ATTRIBUTE_UNUSED) { asection *s; int ret = 0; /* See if we need a PT_MIPS_REGINFO segment. */ s = bfd_get_section_by_name (abfd, ".reginfo"); if (s && (s->flags & SEC_LOAD)) ++ret; /* See if we need a PT_MIPS_OPTIONS segment. */ if (IRIX_COMPAT (abfd) == ict_irix6 && bfd_get_section_by_name (abfd, MIPS_ELF_OPTIONS_SECTION_NAME (abfd))) ++ret; /* See if we need a PT_MIPS_RTPROC segment. */ if (IRIX_COMPAT (abfd) == ict_irix5 && bfd_get_section_by_name (abfd, ".dynamic") && bfd_get_section_by_name (abfd, ".mdebug")) ++ret; /* Allocate a PT_NULL header in dynamic objects. See _bfd_mips_elf_modify_segment_map for details. */ if (!SGI_COMPAT (abfd) && bfd_get_section_by_name (abfd, ".dynamic")) ++ret; return ret; } /* Modify the segment map for an IRIX5 executable. */ bfd_boolean _bfd_mips_elf_modify_segment_map (bfd *abfd, struct bfd_link_info *info ATTRIBUTE_UNUSED) { asection *s; struct elf_segment_map *m, **pm; bfd_size_type amt; /* If there is a .reginfo section, we need a PT_MIPS_REGINFO segment. */ s = bfd_get_section_by_name (abfd, ".reginfo"); if (s != NULL && (s->flags & SEC_LOAD) != 0) { for (m = elf_tdata (abfd)->segment_map; m != NULL; m = m->next) if (m->p_type == PT_MIPS_REGINFO) break; if (m == NULL) { amt = sizeof *m; m = bfd_zalloc (abfd, amt); if (m == NULL) return FALSE; m->p_type = PT_MIPS_REGINFO; m->count = 1; m->sections[0] = s; /* We want to put it after the PHDR and INTERP segments. */ pm = &elf_tdata (abfd)->segment_map; while (*pm != NULL && ((*pm)->p_type == PT_PHDR || (*pm)->p_type == PT_INTERP)) pm = &(*pm)->next; m->next = *pm; *pm = m; } } /* For IRIX 6, we don't have .mdebug sections, nor does anything but .dynamic end up in PT_DYNAMIC. However, we do have to insert a PT_MIPS_OPTIONS segment immediately following the program header table. */ if (NEWABI_P (abfd) /* On non-IRIX6 new abi, we'll have already created a segment for this section, so don't create another. I'm not sure this is not also the case for IRIX 6, but I can't test it right now. */ && IRIX_COMPAT (abfd) == ict_irix6) { for (s = abfd->sections; s; s = s->next) if (elf_section_data (s)->this_hdr.sh_type == SHT_MIPS_OPTIONS) break; if (s) { struct elf_segment_map *options_segment; pm = &elf_tdata (abfd)->segment_map; while (*pm != NULL && ((*pm)->p_type == PT_PHDR || (*pm)->p_type == PT_INTERP)) pm = &(*pm)->next; if (*pm == NULL || (*pm)->p_type != PT_MIPS_OPTIONS) { amt = sizeof (struct elf_segment_map); options_segment = bfd_zalloc (abfd, amt); options_segment->next = *pm; options_segment->p_type = PT_MIPS_OPTIONS; options_segment->p_flags = PF_R; options_segment->p_flags_valid = TRUE; options_segment->count = 1; options_segment->sections[0] = s; *pm = options_segment; } } } else { if (IRIX_COMPAT (abfd) == ict_irix5) { /* If there are .dynamic and .mdebug sections, we make a room for the RTPROC header. FIXME: Rewrite without section names. */ if (bfd_get_section_by_name (abfd, ".interp") == NULL && bfd_get_section_by_name (abfd, ".dynamic") != NULL && bfd_get_section_by_name (abfd, ".mdebug") != NULL) { for (m = elf_tdata (abfd)->segment_map; m != NULL; m = m->next) if (m->p_type == PT_MIPS_RTPROC) break; if (m == NULL) { amt = sizeof *m; m = bfd_zalloc (abfd, amt); if (m == NULL) return FALSE; m->p_type = PT_MIPS_RTPROC; s = bfd_get_section_by_name (abfd, ".rtproc"); if (s == NULL) { m->count = 0; m->p_flags = 0; m->p_flags_valid = 1; } else { m->count = 1; m->sections[0] = s; } /* We want to put it after the DYNAMIC segment. */ pm = &elf_tdata (abfd)->segment_map; while (*pm != NULL && (*pm)->p_type != PT_DYNAMIC) pm = &(*pm)->next; if (*pm != NULL) pm = &(*pm)->next; m->next = *pm; *pm = m; } } } /* On IRIX5, the PT_DYNAMIC segment includes the .dynamic, .dynstr, .dynsym, and .hash sections, and everything in between. */ for (pm = &elf_tdata (abfd)->segment_map; *pm != NULL; pm = &(*pm)->next) if ((*pm)->p_type == PT_DYNAMIC) break; m = *pm; if (m != NULL && IRIX_COMPAT (abfd) == ict_none) { /* For a normal mips executable the permissions for the PT_DYNAMIC segment are read, write and execute. We do that here since the code in elf.c sets only the read permission. This matters sometimes for the dynamic linker. */ if (bfd_get_section_by_name (abfd, ".dynamic") != NULL) { m->p_flags = PF_R | PF_W | PF_X; m->p_flags_valid = 1; } } /* GNU/Linux binaries do not need the extended PT_DYNAMIC section. glibc's dynamic linker has traditionally derived the number of tags from the p_filesz field, and sometimes allocates stack arrays of that size. An overly-big PT_DYNAMIC segment can be actively harmful in such cases. Making PT_DYNAMIC contain other sections can also make life hard for the prelinker, which might move one of the other sections to a different PT_LOAD segment. */ if (SGI_COMPAT (abfd) && m != NULL && m->count == 1 && strcmp (m->sections[0]->name, ".dynamic") == 0) { static const char *sec_names[] = { ".dynamic", ".dynstr", ".dynsym", ".hash" }; bfd_vma low, high; unsigned int i, c; struct elf_segment_map *n; low = ~(bfd_vma) 0; high = 0; for (i = 0; i < sizeof sec_names / sizeof sec_names[0]; i++) { s = bfd_get_section_by_name (abfd, sec_names[i]); if (s != NULL && (s->flags & SEC_LOAD) != 0) { bfd_size_type sz; if (low > s->vma) low = s->vma; sz = s->size; if (high < s->vma + sz) high = s->vma + sz; } } c = 0; for (s = abfd->sections; s != NULL; s = s->next) if ((s->flags & SEC_LOAD) != 0 && s->vma >= low && s->vma + s->size <= high) ++c; amt = sizeof *n + (bfd_size_type) (c - 1) * sizeof (asection *); n = bfd_zalloc (abfd, amt); if (n == NULL) return FALSE; *n = *m; n->count = c; i = 0; for (s = abfd->sections; s != NULL; s = s->next) { if ((s->flags & SEC_LOAD) != 0 && s->vma >= low && s->vma + s->size <= high) { n->sections[i] = s; ++i; } } *pm = n; } } /* Allocate a spare program header in dynamic objects so that tools like the prelinker can add an extra PT_LOAD entry. If the prelinker needs to make room for a new PT_LOAD entry, its standard procedure is to move the first (read-only) sections into the new (writable) segment. However, the MIPS ABI requires .dynamic to be in a read-only segment, and the section will often start within sizeof (ElfNN_Phdr) bytes of the last program header. Although the prelinker could in principle move .dynamic to a writable segment, it seems better to allocate a spare program header instead, and avoid the need to move any sections. There is a long tradition of allocating spare dynamic tags, so allocating a spare program header seems like a natural extension. */ if (!SGI_COMPAT (abfd) && bfd_get_section_by_name (abfd, ".dynamic")) { for (pm = &elf_tdata (abfd)->segment_map; *pm != NULL; pm = &(*pm)->next) if ((*pm)->p_type == PT_NULL) break; if (*pm == NULL) { m = bfd_zalloc (abfd, sizeof (*m)); if (m == NULL) return FALSE; m->p_type = PT_NULL; *pm = m; } } return TRUE; } /* Return the section that should be marked against GC for a given relocation. */ asection * _bfd_mips_elf_gc_mark_hook (asection *sec, struct bfd_link_info *info, Elf_Internal_Rela *rel, struct elf_link_hash_entry *h, Elf_Internal_Sym *sym) { /* ??? Do mips16 stub sections need to be handled special? */ if (h != NULL) switch (ELF_R_TYPE (sec->owner, rel->r_info)) { case R_MIPS_GNU_VTINHERIT: case R_MIPS_GNU_VTENTRY: return NULL; } return _bfd_elf_gc_mark_hook (sec, info, rel, h, sym); } /* Update the got entry reference counts for the section being removed. */ bfd_boolean _bfd_mips_elf_gc_sweep_hook (bfd *abfd ATTRIBUTE_UNUSED, struct bfd_link_info *info ATTRIBUTE_UNUSED, asection *sec ATTRIBUTE_UNUSED, const Elf_Internal_Rela *relocs ATTRIBUTE_UNUSED) { #if 0 Elf_Internal_Shdr *symtab_hdr; struct elf_link_hash_entry **sym_hashes; bfd_signed_vma *local_got_refcounts; const Elf_Internal_Rela *rel, *relend; unsigned long r_symndx; struct elf_link_hash_entry *h; symtab_hdr = &elf_tdata (abfd)->symtab_hdr; sym_hashes = elf_sym_hashes (abfd); local_got_refcounts = elf_local_got_refcounts (abfd); relend = relocs + sec->reloc_count; for (rel = relocs; rel < relend; rel++) switch (ELF_R_TYPE (abfd, rel->r_info)) { case R_MIPS_GOT16: case R_MIPS_CALL16: case R_MIPS_CALL_HI16: case R_MIPS_CALL_LO16: case R_MIPS_GOT_HI16: case R_MIPS_GOT_LO16: case R_MIPS_GOT_DISP: case R_MIPS_GOT_PAGE: case R_MIPS_GOT_OFST: /* ??? It would seem that the existing MIPS code does no sort of reference counting or whatnot on its GOT and PLT entries, so it is not possible to garbage collect them at this time. */ break; default: break; } #endif return TRUE; } /* Copy data from a MIPS ELF indirect symbol to its direct symbol, hiding the old indirect symbol. Process additional relocation information. Also called for weakdefs, in which case we just let _bfd_elf_link_hash_copy_indirect copy the flags for us. */ void _bfd_mips_elf_copy_indirect_symbol (struct bfd_link_info *info, struct elf_link_hash_entry *dir, struct elf_link_hash_entry *ind) { struct mips_elf_link_hash_entry *dirmips, *indmips; _bfd_elf_link_hash_copy_indirect (info, dir, ind); if (ind->root.type != bfd_link_hash_indirect) return; dirmips = (struct mips_elf_link_hash_entry *) dir; indmips = (struct mips_elf_link_hash_entry *) ind; dirmips->possibly_dynamic_relocs += indmips->possibly_dynamic_relocs; if (indmips->readonly_reloc) dirmips->readonly_reloc = TRUE; if (indmips->no_fn_stub) dirmips->no_fn_stub = TRUE; if (dirmips->tls_type == 0) dirmips->tls_type = indmips->tls_type; } void _bfd_mips_elf_hide_symbol (struct bfd_link_info *info, struct elf_link_hash_entry *entry, bfd_boolean force_local) { bfd *dynobj; asection *got; struct mips_got_info *g; struct mips_elf_link_hash_entry *h; h = (struct mips_elf_link_hash_entry *) entry; if (h->forced_local) return; h->forced_local = force_local; dynobj = elf_hash_table (info)->dynobj; if (dynobj != NULL && force_local && h->root.type != STT_TLS && (got = mips_elf_got_section (dynobj, TRUE)) != NULL && (g = mips_elf_section_data (got)->u.got_info) != NULL) { if (g->next) { struct mips_got_entry e; struct mips_got_info *gg = g; /* Since we're turning what used to be a global symbol into a local one, bump up the number of local entries of each GOT that had an entry for it. This will automatically decrease the number of global entries, since global_gotno is actually the upper limit of global entries. */ e.abfd = dynobj; e.symndx = -1; e.d.h = h; e.tls_type = 0; for (g = g->next; g != gg; g = g->next) if (htab_find (g->got_entries, &e)) { BFD_ASSERT (g->global_gotno > 0); g->local_gotno++; g->global_gotno--; } /* If this was a global symbol forced into the primary GOT, we no longer need an entry for it. We can't release the entry at this point, but we must at least stop counting it as one of the symbols that required a forced got entry. */ if (h->root.got.offset == 2) { BFD_ASSERT (gg->assigned_gotno > 0); gg->assigned_gotno--; } } else if (g->global_gotno == 0 && g->global_gotsym == NULL) /* If we haven't got through GOT allocation yet, just bump up the number of local entries, as this symbol won't be counted as global. */ g->local_gotno++; else if (h->root.got.offset == 1) { /* If we're past non-multi-GOT allocation and this symbol had been marked for a global got entry, give it a local entry instead. */ BFD_ASSERT (g->global_gotno > 0); g->local_gotno++; g->global_gotno--; } } _bfd_elf_link_hash_hide_symbol (info, &h->root, force_local); } #define PDR_SIZE 32 bfd_boolean _bfd_mips_elf_discard_info (bfd *abfd, struct elf_reloc_cookie *cookie, struct bfd_link_info *info) { asection *o; bfd_boolean ret = FALSE; unsigned char *tdata; size_t i, skip; o = bfd_get_section_by_name (abfd, ".pdr"); if (! o) return FALSE; if (o->size == 0) return FALSE; if (o->size % PDR_SIZE != 0) return FALSE; if (o->output_section != NULL && bfd_is_abs_section (o->output_section)) return FALSE; tdata = bfd_zmalloc (o->size / PDR_SIZE); if (! tdata) return FALSE; cookie->rels = _bfd_elf_link_read_relocs (abfd, o, NULL, NULL, info->keep_memory); if (!cookie->rels) { free (tdata); return FALSE; } cookie->rel = cookie->rels; cookie->relend = cookie->rels + o->reloc_count; for (i = 0, skip = 0; i < o->size / PDR_SIZE; i ++) { if (bfd_elf_reloc_symbol_deleted_p (i * PDR_SIZE, cookie)) { tdata[i] = 1; skip ++; } } if (skip != 0) { mips_elf_section_data (o)->u.tdata = tdata; o->size -= skip * PDR_SIZE; ret = TRUE; } else free (tdata); if (! info->keep_memory) free (cookie->rels); return ret; } bfd_boolean _bfd_mips_elf_ignore_discarded_relocs (asection *sec) { if (strcmp (sec->name, ".pdr") == 0) return TRUE; return FALSE; } bfd_boolean _bfd_mips_elf_write_section (bfd *output_bfd, struct bfd_link_info *link_info ATTRIBUTE_UNUSED, asection *sec, bfd_byte *contents) { bfd_byte *to, *from, *end; int i; if (strcmp (sec->name, ".pdr") != 0) return FALSE; if (mips_elf_section_data (sec)->u.tdata == NULL) return FALSE; to = contents; end = contents + sec->size; for (from = contents, i = 0; from < end; from += PDR_SIZE, i++) { if ((mips_elf_section_data (sec)->u.tdata)[i] == 1) continue; if (to != from) memcpy (to, from, PDR_SIZE); to += PDR_SIZE; } bfd_set_section_contents (output_bfd, sec->output_section, contents, sec->output_offset, sec->size); return TRUE; } /* MIPS ELF uses a special find_nearest_line routine in order the handle the ECOFF debugging information. */ struct mips_elf_find_line { struct ecoff_debug_info d; struct ecoff_find_line i; }; bfd_boolean _bfd_mips_elf_find_nearest_line (bfd *abfd, asection *section, asymbol **symbols, bfd_vma offset, const char **filename_ptr, const char **functionname_ptr, unsigned int *line_ptr) { asection *msec; if (_bfd_dwarf1_find_nearest_line (abfd, section, symbols, offset, filename_ptr, functionname_ptr, line_ptr)) return TRUE; if (_bfd_dwarf2_find_nearest_line (abfd, section, symbols, offset, filename_ptr, functionname_ptr, line_ptr, ABI_64_P (abfd) ? 8 : 0, &elf_tdata (abfd)->dwarf2_find_line_info)) return TRUE; msec = bfd_get_section_by_name (abfd, ".mdebug"); if (msec != NULL) { flagword origflags; struct mips_elf_find_line *fi; const struct ecoff_debug_swap * const swap = get_elf_backend_data (abfd)->elf_backend_ecoff_debug_swap; /* If we are called during a link, mips_elf_final_link may have cleared the SEC_HAS_CONTENTS field. We force it back on here if appropriate (which it normally will be). */ origflags = msec->flags; if (elf_section_data (msec)->this_hdr.sh_type != SHT_NOBITS) msec->flags |= SEC_HAS_CONTENTS; fi = elf_tdata (abfd)->find_line_info; if (fi == NULL) { bfd_size_type external_fdr_size; char *fraw_src; char *fraw_end; struct fdr *fdr_ptr; bfd_size_type amt = sizeof (struct mips_elf_find_line); fi = bfd_zalloc (abfd, amt); if (fi == NULL) { msec->flags = origflags; return FALSE; } if (! _bfd_mips_elf_read_ecoff_info (abfd, msec, &fi->d)) { msec->flags = origflags; return FALSE; } /* Swap in the FDR information. */ amt = fi->d.symbolic_header.ifdMax * sizeof (struct fdr); fi->d.fdr = bfd_alloc (abfd, amt); if (fi->d.fdr == NULL) { msec->flags = origflags; return FALSE; } external_fdr_size = swap->external_fdr_size; fdr_ptr = fi->d.fdr; fraw_src = (char *) fi->d.external_fdr; fraw_end = (fraw_src + fi->d.symbolic_header.ifdMax * external_fdr_size); for (; fraw_src < fraw_end; fraw_src += external_fdr_size, fdr_ptr++) (*swap->swap_fdr_in) (abfd, fraw_src, fdr_ptr); elf_tdata (abfd)->find_line_info = fi; /* Note that we don't bother to ever free this information. find_nearest_line is either called all the time, as in objdump -l, so the information should be saved, or it is rarely called, as in ld error messages, so the memory wasted is unimportant. Still, it would probably be a good idea for free_cached_info to throw it away. */ } if (_bfd_ecoff_locate_line (abfd, section, offset, &fi->d, swap, &fi->i, filename_ptr, functionname_ptr, line_ptr)) { msec->flags = origflags; return TRUE; } msec->flags = origflags; } /* Fall back on the generic ELF find_nearest_line routine. */ return _bfd_elf_find_nearest_line (abfd, section, symbols, offset, filename_ptr, functionname_ptr, line_ptr); } bfd_boolean _bfd_mips_elf_find_inliner_info (bfd *abfd, const char **filename_ptr, const char **functionname_ptr, unsigned int *line_ptr) { bfd_boolean found; found = _bfd_dwarf2_find_inliner_info (abfd, filename_ptr, functionname_ptr, line_ptr, & elf_tdata (abfd)->dwarf2_find_line_info); return found; } /* When are writing out the .options or .MIPS.options section, remember the bytes we are writing out, so that we can install the GP value in the section_processing routine. */ bfd_boolean _bfd_mips_elf_set_section_contents (bfd *abfd, sec_ptr section, const void *location, file_ptr offset, bfd_size_type count) { if (MIPS_ELF_OPTIONS_SECTION_NAME_P (section->name)) { bfd_byte *c; if (elf_section_data (section) == NULL) { bfd_size_type amt = sizeof (struct bfd_elf_section_data); section->used_by_bfd = bfd_zalloc (abfd, amt); if (elf_section_data (section) == NULL) return FALSE; } c = mips_elf_section_data (section)->u.tdata; if (c == NULL) { c = bfd_zalloc (abfd, section->size); if (c == NULL) return FALSE; mips_elf_section_data (section)->u.tdata = c; } memcpy (c + offset, location, count); } return _bfd_elf_set_section_contents (abfd, section, location, offset, count); } /* This is almost identical to bfd_generic_get_... except that some MIPS relocations need to be handled specially. Sigh. */ bfd_byte * _bfd_elf_mips_get_relocated_section_contents (bfd *abfd, struct bfd_link_info *link_info, struct bfd_link_order *link_order, bfd_byte *data, bfd_boolean relocatable, asymbol **symbols) { /* Get enough memory to hold the stuff */ bfd *input_bfd = link_order->u.indirect.section->owner; asection *input_section = link_order->u.indirect.section; bfd_size_type sz; long reloc_size = bfd_get_reloc_upper_bound (input_bfd, input_section); arelent **reloc_vector = NULL; long reloc_count; if (reloc_size < 0) goto error_return; reloc_vector = bfd_malloc (reloc_size); if (reloc_vector == NULL && reloc_size != 0) goto error_return; /* read in the section */ sz = input_section->rawsize ? input_section->rawsize : input_section->size; if (!bfd_get_section_contents (input_bfd, input_section, data, 0, sz)) goto error_return; reloc_count = bfd_canonicalize_reloc (input_bfd, input_section, reloc_vector, symbols); if (reloc_count < 0) goto error_return; if (reloc_count > 0) { arelent **parent; /* for mips */ int gp_found; bfd_vma gp = 0x12345678; /* initialize just to shut gcc up */ { struct bfd_hash_entry *h; struct bfd_link_hash_entry *lh; /* Skip all this stuff if we aren't mixing formats. */ if (abfd && input_bfd && abfd->xvec == input_bfd->xvec) lh = 0; else { h = bfd_hash_lookup (&link_info->hash->table, "_gp", FALSE, FALSE); lh = (struct bfd_link_hash_entry *) h; } lookup: if (lh) { switch (lh->type) { case bfd_link_hash_undefined: case bfd_link_hash_undefweak: case bfd_link_hash_common: gp_found = 0; break; case bfd_link_hash_defined: case bfd_link_hash_defweak: gp_found = 1; gp = lh->u.def.value; break; case bfd_link_hash_indirect: case bfd_link_hash_warning: lh = lh->u.i.link; /* @@FIXME ignoring warning for now */ goto lookup; case bfd_link_hash_new: default: abort (); } } else gp_found = 0; } /* end mips */ for (parent = reloc_vector; *parent != NULL; parent++) { char *error_message = NULL; bfd_reloc_status_type r; /* Specific to MIPS: Deal with relocation types that require knowing the gp of the output bfd. */ asymbol *sym = *(*parent)->sym_ptr_ptr; /* If we've managed to find the gp and have a special function for the relocation then go ahead, else default to the generic handling. */ if (gp_found && (*parent)->howto->special_function == _bfd_mips_elf32_gprel16_reloc) r = _bfd_mips_elf_gprel16_with_gp (input_bfd, sym, *parent, input_section, relocatable, data, gp); else r = bfd_perform_relocation (input_bfd, *parent, data, input_section, relocatable ? abfd : NULL, &error_message); if (relocatable) { asection *os = input_section->output_section; /* A partial link, so keep the relocs */ os->orelocation[os->reloc_count] = *parent; os->reloc_count++; } if (r != bfd_reloc_ok) { switch (r) { case bfd_reloc_undefined: if (!((*link_info->callbacks->undefined_symbol) (link_info, bfd_asymbol_name (*(*parent)->sym_ptr_ptr), input_bfd, input_section, (*parent)->address, TRUE))) goto error_return; break; case bfd_reloc_dangerous: BFD_ASSERT (error_message != NULL); if (!((*link_info->callbacks->reloc_dangerous) (link_info, error_message, input_bfd, input_section, (*parent)->address))) goto error_return; break; case bfd_reloc_overflow: if (!((*link_info->callbacks->reloc_overflow) (link_info, NULL, bfd_asymbol_name (*(*parent)->sym_ptr_ptr), (*parent)->howto->name, (*parent)->addend, input_bfd, input_section, (*parent)->address))) goto error_return; break; case bfd_reloc_outofrange: default: abort (); break; } } } } if (reloc_vector != NULL) free (reloc_vector); return data; error_return: if (reloc_vector != NULL) free (reloc_vector); return NULL; } /* Create a MIPS ELF linker hash table. */ struct bfd_link_hash_table * _bfd_mips_elf_link_hash_table_create (bfd *abfd) { struct mips_elf_link_hash_table *ret; bfd_size_type amt = sizeof (struct mips_elf_link_hash_table); ret = bfd_malloc (amt); if (ret == NULL) return NULL; if (!_bfd_elf_link_hash_table_init (&ret->root, abfd, mips_elf_link_hash_newfunc, sizeof (struct mips_elf_link_hash_entry))) { free (ret); return NULL; } #if 0 /* We no longer use this. */ for (i = 0; i < SIZEOF_MIPS_DYNSYM_SECNAMES; i++) ret->dynsym_sec_strindex[i] = (bfd_size_type) -1; #endif ret->procedure_count = 0; ret->compact_rel_size = 0; ret->use_rld_obj_head = FALSE; ret->rld_value = 0; ret->mips16_stubs_seen = FALSE; ret->is_vxworks = FALSE; ret->srelbss = NULL; ret->sdynbss = NULL; ret->srelplt = NULL; ret->srelplt2 = NULL; ret->sgotplt = NULL; ret->splt = NULL; ret->plt_header_size = 0; ret->plt_entry_size = 0; ret->function_stub_size = 0; return &ret->root.root; } /* Likewise, but indicate that the target is VxWorks. */ struct bfd_link_hash_table * _bfd_mips_vxworks_link_hash_table_create (bfd *abfd) { struct bfd_link_hash_table *ret; ret = _bfd_mips_elf_link_hash_table_create (abfd); if (ret) { struct mips_elf_link_hash_table *htab; htab = (struct mips_elf_link_hash_table *) ret; htab->is_vxworks = 1; } return ret; } /* We need to use a special link routine to handle the .reginfo and the .mdebug sections. We need to merge all instances of these sections together, not write them all out sequentially. */ bfd_boolean _bfd_mips_elf_final_link (bfd *abfd, struct bfd_link_info *info) { asection *o; struct bfd_link_order *p; asection *reginfo_sec, *mdebug_sec, *gptab_data_sec, *gptab_bss_sec; asection *rtproc_sec; Elf32_RegInfo reginfo; struct ecoff_debug_info debug; const struct elf_backend_data *bed = get_elf_backend_data (abfd); const struct ecoff_debug_swap *swap = bed->elf_backend_ecoff_debug_swap; HDRR *symhdr = &debug.symbolic_header; void *mdebug_handle = NULL; asection *s; EXTR esym; unsigned int i; bfd_size_type amt; struct mips_elf_link_hash_table *htab; static const char * const secname[] = { ".text", ".init", ".fini", ".data", ".rodata", ".sdata", ".sbss", ".bss" }; static const int sc[] = { scText, scInit, scFini, scData, scRData, scSData, scSBss, scBss }; /* We'd carefully arranged the dynamic symbol indices, and then the generic size_dynamic_sections renumbered them out from under us. Rather than trying somehow to prevent the renumbering, just do the sort again. */ htab = mips_elf_hash_table (info); if (elf_hash_table (info)->dynamic_sections_created) { bfd *dynobj; asection *got; struct mips_got_info *g; bfd_size_type dynsecsymcount; /* When we resort, we must tell mips_elf_sort_hash_table what the lowest index it may use is. That's the number of section symbols we're going to add. The generic ELF linker only adds these symbols when building a shared object. Note that we count the sections after (possibly) removing the .options section above. */ dynsecsymcount = count_section_dynsyms (abfd, info); if (! mips_elf_sort_hash_table (info, dynsecsymcount + 1)) return FALSE; /* Make sure we didn't grow the global .got region. */ dynobj = elf_hash_table (info)->dynobj; got = mips_elf_got_section (dynobj, FALSE); g = mips_elf_section_data (got)->u.got_info; if (g->global_gotsym != NULL) BFD_ASSERT ((elf_hash_table (info)->dynsymcount - g->global_gotsym->dynindx) <= g->global_gotno); } /* Get a value for the GP register. */ if (elf_gp (abfd) == 0) { struct bfd_link_hash_entry *h; h = bfd_link_hash_lookup (info->hash, "_gp", FALSE, FALSE, TRUE); if (h != NULL && h->type == bfd_link_hash_defined) elf_gp (abfd) = (h->u.def.value + h->u.def.section->output_section->vma + h->u.def.section->output_offset); else if (htab->is_vxworks && (h = bfd_link_hash_lookup (info->hash, "_GLOBAL_OFFSET_TABLE_", FALSE, FALSE, TRUE)) && h->type == bfd_link_hash_defined) elf_gp (abfd) = (h->u.def.section->output_section->vma + h->u.def.section->output_offset + h->u.def.value); else if (info->relocatable) { bfd_vma lo = MINUS_ONE; /* Find the GP-relative section with the lowest offset. */ for (o = abfd->sections; o != NULL; o = o->next) if (o->vma < lo && (elf_section_data (o)->this_hdr.sh_flags & SHF_MIPS_GPREL)) lo = o->vma; /* And calculate GP relative to that. */ elf_gp (abfd) = lo + ELF_MIPS_GP_OFFSET (info); } else { /* If the relocate_section function needs to do a reloc involving the GP value, it should make a reloc_dangerous callback to warn that GP is not defined. */ } } /* Go through the sections and collect the .reginfo and .mdebug information. */ reginfo_sec = NULL; mdebug_sec = NULL; gptab_data_sec = NULL; gptab_bss_sec = NULL; for (o = abfd->sections; o != NULL; o = o->next) { if (strcmp (o->name, ".reginfo") == 0) { memset (®info, 0, sizeof reginfo); /* We have found the .reginfo section in the output file. Look through all the link_orders comprising it and merge the information together. */ for (p = o->map_head.link_order; p != NULL; p = p->next) { asection *input_section; bfd *input_bfd; Elf32_External_RegInfo ext; Elf32_RegInfo sub; if (p->type != bfd_indirect_link_order) { if (p->type == bfd_data_link_order) continue; abort (); } input_section = p->u.indirect.section; input_bfd = input_section->owner; if (! bfd_get_section_contents (input_bfd, input_section, &ext, 0, sizeof ext)) return FALSE; bfd_mips_elf32_swap_reginfo_in (input_bfd, &ext, &sub); reginfo.ri_gprmask |= sub.ri_gprmask; reginfo.ri_cprmask[0] |= sub.ri_cprmask[0]; reginfo.ri_cprmask[1] |= sub.ri_cprmask[1]; reginfo.ri_cprmask[2] |= sub.ri_cprmask[2]; reginfo.ri_cprmask[3] |= sub.ri_cprmask[3]; /* ri_gp_value is set by the function mips_elf32_section_processing when the section is finally written out. */ /* Hack: reset the SEC_HAS_CONTENTS flag so that elf_link_input_bfd ignores this section. */ input_section->flags &= ~SEC_HAS_CONTENTS; } /* Size has been set in _bfd_mips_elf_always_size_sections. */ BFD_ASSERT(o->size == sizeof (Elf32_External_RegInfo)); /* Skip this section later on (I don't think this currently matters, but someday it might). */ o->map_head.link_order = NULL; reginfo_sec = o; } if (strcmp (o->name, ".mdebug") == 0) { struct extsym_info einfo; bfd_vma last; /* We have found the .mdebug section in the output file. Look through all the link_orders comprising it and merge the information together. */ symhdr->magic = swap->sym_magic; /* FIXME: What should the version stamp be? */ symhdr->vstamp = 0; symhdr->ilineMax = 0; symhdr->cbLine = 0; symhdr->idnMax = 0; symhdr->ipdMax = 0; symhdr->isymMax = 0; symhdr->ioptMax = 0; symhdr->iauxMax = 0; symhdr->issMax = 0; symhdr->issExtMax = 0; symhdr->ifdMax = 0; symhdr->crfd = 0; symhdr->iextMax = 0; /* We accumulate the debugging information itself in the debug_info structure. */ debug.line = NULL; debug.external_dnr = NULL; debug.external_pdr = NULL; debug.external_sym = NULL; debug.external_opt = NULL; debug.external_aux = NULL; debug.ss = NULL; debug.ssext = debug.ssext_end = NULL; debug.external_fdr = NULL; debug.external_rfd = NULL; debug.external_ext = debug.external_ext_end = NULL; mdebug_handle = bfd_ecoff_debug_init (abfd, &debug, swap, info); if (mdebug_handle == NULL) return FALSE; esym.jmptbl = 0; esym.cobol_main = 0; esym.weakext = 0; esym.reserved = 0; esym.ifd = ifdNil; esym.asym.iss = issNil; esym.asym.st = stLocal; esym.asym.reserved = 0; esym.asym.index = indexNil; last = 0; for (i = 0; i < sizeof (secname) / sizeof (secname[0]); i++) { esym.asym.sc = sc[i]; s = bfd_get_section_by_name (abfd, secname[i]); if (s != NULL) { esym.asym.value = s->vma; last = s->vma + s->size; } else esym.asym.value = last; if (!bfd_ecoff_debug_one_external (abfd, &debug, swap, secname[i], &esym)) return FALSE; } for (p = o->map_head.link_order; p != NULL; p = p->next) { asection *input_section; bfd *input_bfd; const struct ecoff_debug_swap *input_swap; struct ecoff_debug_info input_debug; char *eraw_src; char *eraw_end; if (p->type != bfd_indirect_link_order) { if (p->type == bfd_data_link_order) continue; abort (); } input_section = p->u.indirect.section; input_bfd = input_section->owner; if (bfd_get_flavour (input_bfd) != bfd_target_elf_flavour || (get_elf_backend_data (input_bfd) ->elf_backend_ecoff_debug_swap) == NULL) { /* I don't know what a non MIPS ELF bfd would be doing with a .mdebug section, but I don't really want to deal with it. */ continue; } input_swap = (get_elf_backend_data (input_bfd) ->elf_backend_ecoff_debug_swap); BFD_ASSERT (p->size == input_section->size); /* The ECOFF linking code expects that we have already read in the debugging information and set up an ecoff_debug_info structure, so we do that now. */ if (! _bfd_mips_elf_read_ecoff_info (input_bfd, input_section, &input_debug)) return FALSE; if (! (bfd_ecoff_debug_accumulate (mdebug_handle, abfd, &debug, swap, input_bfd, &input_debug, input_swap, info))) return FALSE; /* Loop through the external symbols. For each one with interesting information, try to find the symbol in the linker global hash table and save the information for the output external symbols. */ eraw_src = input_debug.external_ext; eraw_end = (eraw_src + (input_debug.symbolic_header.iextMax * input_swap->external_ext_size)); for (; eraw_src < eraw_end; eraw_src += input_swap->external_ext_size) { EXTR ext; const char *name; struct mips_elf_link_hash_entry *h; (*input_swap->swap_ext_in) (input_bfd, eraw_src, &ext); if (ext.asym.sc == scNil || ext.asym.sc == scUndefined || ext.asym.sc == scSUndefined) continue; name = input_debug.ssext + ext.asym.iss; h = mips_elf_link_hash_lookup (mips_elf_hash_table (info), name, FALSE, FALSE, TRUE); if (h == NULL || h->esym.ifd != -2) continue; if (ext.ifd != -1) { BFD_ASSERT (ext.ifd < input_debug.symbolic_header.ifdMax); ext.ifd = input_debug.ifdmap[ext.ifd]; } h->esym = ext; } /* Free up the information we just read. */ free (input_debug.line); free (input_debug.external_dnr); free (input_debug.external_pdr); free (input_debug.external_sym); free (input_debug.external_opt); free (input_debug.external_aux); free (input_debug.ss); free (input_debug.ssext); free (input_debug.external_fdr); free (input_debug.external_rfd); free (input_debug.external_ext); /* Hack: reset the SEC_HAS_CONTENTS flag so that elf_link_input_bfd ignores this section. */ input_section->flags &= ~SEC_HAS_CONTENTS; } if (SGI_COMPAT (abfd) && info->shared) { /* Create .rtproc section. */ rtproc_sec = bfd_get_section_by_name (abfd, ".rtproc"); if (rtproc_sec == NULL) { flagword flags = (SEC_HAS_CONTENTS | SEC_IN_MEMORY | SEC_LINKER_CREATED | SEC_READONLY); rtproc_sec = bfd_make_section_with_flags (abfd, ".rtproc", flags); if (rtproc_sec == NULL || ! bfd_set_section_alignment (abfd, rtproc_sec, 4)) return FALSE; } if (! mips_elf_create_procedure_table (mdebug_handle, abfd, info, rtproc_sec, &debug)) return FALSE; } /* Build the external symbol information. */ einfo.abfd = abfd; einfo.info = info; einfo.debug = &debug; einfo.swap = swap; einfo.failed = FALSE; mips_elf_link_hash_traverse (mips_elf_hash_table (info), mips_elf_output_extsym, &einfo); if (einfo.failed) return FALSE; /* Set the size of the .mdebug section. */ o->size = bfd_ecoff_debug_size (abfd, &debug, swap); /* Skip this section later on (I don't think this currently matters, but someday it might). */ o->map_head.link_order = NULL; mdebug_sec = o; } if (CONST_STRNEQ (o->name, ".gptab.")) { const char *subname; unsigned int c; Elf32_gptab *tab; Elf32_External_gptab *ext_tab; unsigned int j; /* The .gptab.sdata and .gptab.sbss sections hold information describing how the small data area would change depending upon the -G switch. These sections not used in executables files. */ if (! info->relocatable) { for (p = o->map_head.link_order; p != NULL; p = p->next) { asection *input_section; if (p->type != bfd_indirect_link_order) { if (p->type == bfd_data_link_order) continue; abort (); } input_section = p->u.indirect.section; /* Hack: reset the SEC_HAS_CONTENTS flag so that elf_link_input_bfd ignores this section. */ input_section->flags &= ~SEC_HAS_CONTENTS; } /* Skip this section later on (I don't think this currently matters, but someday it might). */ o->map_head.link_order = NULL; /* Really remove the section. */ bfd_section_list_remove (abfd, o); --abfd->section_count; continue; } /* There is one gptab for initialized data, and one for uninitialized data. */ if (strcmp (o->name, ".gptab.sdata") == 0) gptab_data_sec = o; else if (strcmp (o->name, ".gptab.sbss") == 0) gptab_bss_sec = o; else { (*_bfd_error_handler) (_("%s: illegal section name `%s'"), bfd_get_filename (abfd), o->name); bfd_set_error (bfd_error_nonrepresentable_section); return FALSE; } /* The linker script always combines .gptab.data and .gptab.sdata into .gptab.sdata, and likewise for .gptab.bss and .gptab.sbss. It is possible that there is no .sdata or .sbss section in the output file, in which case we must change the name of the output section. */ subname = o->name + sizeof ".gptab" - 1; if (bfd_get_section_by_name (abfd, subname) == NULL) { if (o == gptab_data_sec) o->name = ".gptab.data"; else o->name = ".gptab.bss"; subname = o->name + sizeof ".gptab" - 1; BFD_ASSERT (bfd_get_section_by_name (abfd, subname) != NULL); } /* Set up the first entry. */ c = 1; amt = c * sizeof (Elf32_gptab); tab = bfd_malloc (amt); if (tab == NULL) return FALSE; tab[0].gt_header.gt_current_g_value = elf_gp_size (abfd); tab[0].gt_header.gt_unused = 0; /* Combine the input sections. */ for (p = o->map_head.link_order; p != NULL; p = p->next) { asection *input_section; bfd *input_bfd; bfd_size_type size; unsigned long last; bfd_size_type gpentry; if (p->type != bfd_indirect_link_order) { if (p->type == bfd_data_link_order) continue; abort (); } input_section = p->u.indirect.section; input_bfd = input_section->owner; /* Combine the gptab entries for this input section one by one. We know that the input gptab entries are sorted by ascending -G value. */ size = input_section->size; last = 0; for (gpentry = sizeof (Elf32_External_gptab); gpentry < size; gpentry += sizeof (Elf32_External_gptab)) { Elf32_External_gptab ext_gptab; Elf32_gptab int_gptab; unsigned long val; unsigned long add; bfd_boolean exact; unsigned int look; if (! (bfd_get_section_contents (input_bfd, input_section, &ext_gptab, gpentry, sizeof (Elf32_External_gptab)))) { free (tab); return FALSE; } bfd_mips_elf32_swap_gptab_in (input_bfd, &ext_gptab, &int_gptab); val = int_gptab.gt_entry.gt_g_value; add = int_gptab.gt_entry.gt_bytes - last; exact = FALSE; for (look = 1; look < c; look++) { if (tab[look].gt_entry.gt_g_value >= val) tab[look].gt_entry.gt_bytes += add; if (tab[look].gt_entry.gt_g_value == val) exact = TRUE; } if (! exact) { Elf32_gptab *new_tab; unsigned int max; /* We need a new table entry. */ amt = (bfd_size_type) (c + 1) * sizeof (Elf32_gptab); new_tab = bfd_realloc (tab, amt); if (new_tab == NULL) { free (tab); return FALSE; } tab = new_tab; tab[c].gt_entry.gt_g_value = val; tab[c].gt_entry.gt_bytes = add; /* Merge in the size for the next smallest -G value, since that will be implied by this new value. */ max = 0; for (look = 1; look < c; look++) { if (tab[look].gt_entry.gt_g_value < val && (max == 0 || (tab[look].gt_entry.gt_g_value > tab[max].gt_entry.gt_g_value))) max = look; } if (max != 0) tab[c].gt_entry.gt_bytes += tab[max].gt_entry.gt_bytes; ++c; } last = int_gptab.gt_entry.gt_bytes; } /* Hack: reset the SEC_HAS_CONTENTS flag so that elf_link_input_bfd ignores this section. */ input_section->flags &= ~SEC_HAS_CONTENTS; } /* The table must be sorted by -G value. */ if (c > 2) qsort (tab + 1, c - 1, sizeof (tab[0]), gptab_compare); /* Swap out the table. */ amt = (bfd_size_type) c * sizeof (Elf32_External_gptab); ext_tab = bfd_alloc (abfd, amt); if (ext_tab == NULL) { free (tab); return FALSE; } for (j = 0; j < c; j++) bfd_mips_elf32_swap_gptab_out (abfd, tab + j, ext_tab + j); free (tab); o->size = c * sizeof (Elf32_External_gptab); o->contents = (bfd_byte *) ext_tab; /* Skip this section later on (I don't think this currently matters, but someday it might). */ o->map_head.link_order = NULL; } } /* Invoke the regular ELF backend linker to do all the work. */ if (!bfd_elf_final_link (abfd, info)) return FALSE; /* Now write out the computed sections. */ if (reginfo_sec != NULL) { Elf32_External_RegInfo ext; bfd_mips_elf32_swap_reginfo_out (abfd, ®info, &ext); if (! bfd_set_section_contents (abfd, reginfo_sec, &ext, 0, sizeof ext)) return FALSE; } if (mdebug_sec != NULL) { BFD_ASSERT (abfd->output_has_begun); if (! bfd_ecoff_write_accumulated_debug (mdebug_handle, abfd, &debug, swap, info, mdebug_sec->filepos)) return FALSE; bfd_ecoff_debug_free (mdebug_handle, abfd, &debug, swap, info); } if (gptab_data_sec != NULL) { if (! bfd_set_section_contents (abfd, gptab_data_sec, gptab_data_sec->contents, 0, gptab_data_sec->size)) return FALSE; } if (gptab_bss_sec != NULL) { if (! bfd_set_section_contents (abfd, gptab_bss_sec, gptab_bss_sec->contents, 0, gptab_bss_sec->size)) return FALSE; } if (SGI_COMPAT (abfd)) { rtproc_sec = bfd_get_section_by_name (abfd, ".rtproc"); if (rtproc_sec != NULL) { if (! bfd_set_section_contents (abfd, rtproc_sec, rtproc_sec->contents, 0, rtproc_sec->size)) return FALSE; } } return TRUE; } /* Structure for saying that BFD machine EXTENSION extends BASE. */ struct mips_mach_extension { unsigned long extension, base; }; /* An array describing how BFD machines relate to one another. The entries are ordered topologically with MIPS I extensions listed last. */ static const struct mips_mach_extension mips_mach_extensions[] = { /* MIPS64r2 extensions. */ { bfd_mach_mips_octeon, bfd_mach_mipsisa64r2 }, /* MIPS64 extensions. */ { bfd_mach_mipsisa64r2, bfd_mach_mipsisa64 }, { bfd_mach_mips_sb1, bfd_mach_mipsisa64 }, /* MIPS V extensions. */ { bfd_mach_mipsisa64, bfd_mach_mips5 }, /* R10000 extensions. */ { bfd_mach_mips12000, bfd_mach_mips10000 }, /* R5000 extensions. Note: the vr5500 ISA is an extension of the core vr5400 ISA, but doesn't include the multimedia stuff. It seems better to allow vr5400 and vr5500 code to be merged anyway, since many libraries will just use the core ISA. Perhaps we could add some sort of ASE flag if this ever proves a problem. */ { bfd_mach_mips5500, bfd_mach_mips5400 }, { bfd_mach_mips5400, bfd_mach_mips5000 }, /* MIPS IV extensions. */ { bfd_mach_mips5, bfd_mach_mips8000 }, { bfd_mach_mips10000, bfd_mach_mips8000 }, { bfd_mach_mips5000, bfd_mach_mips8000 }, { bfd_mach_mips7000, bfd_mach_mips8000 }, { bfd_mach_mips9000, bfd_mach_mips8000 }, /* VR4100 extensions. */ { bfd_mach_mips4120, bfd_mach_mips4100 }, { bfd_mach_mips4111, bfd_mach_mips4100 }, /* MIPS III extensions. */ { bfd_mach_mips8000, bfd_mach_mips4000 }, { bfd_mach_mips4650, bfd_mach_mips4000 }, { bfd_mach_mips4600, bfd_mach_mips4000 }, { bfd_mach_mips4400, bfd_mach_mips4000 }, { bfd_mach_mips4300, bfd_mach_mips4000 }, { bfd_mach_mips4100, bfd_mach_mips4000 }, { bfd_mach_mips4010, bfd_mach_mips4000 }, /* MIPS32 extensions. */ { bfd_mach_mipsisa32r2, bfd_mach_mipsisa32 }, /* MIPS II extensions. */ { bfd_mach_mips4000, bfd_mach_mips6000 }, { bfd_mach_mipsisa32, bfd_mach_mips6000 }, /* MIPS I extensions. */ { bfd_mach_mips6000, bfd_mach_mips3000 }, { bfd_mach_mips3900, bfd_mach_mips3000 } }; /* Return true if bfd machine EXTENSION is an extension of machine BASE. */ static bfd_boolean mips_mach_extends_p (unsigned long base, unsigned long extension) { size_t i; if (extension == base) return TRUE; if (base == bfd_mach_mipsisa32 && mips_mach_extends_p (bfd_mach_mipsisa64, extension)) return TRUE; if (base == bfd_mach_mipsisa32r2 && mips_mach_extends_p (bfd_mach_mipsisa64r2, extension)) return TRUE; for (i = 0; i < ARRAY_SIZE (mips_mach_extensions); i++) if (extension == mips_mach_extensions[i].extension) { extension = mips_mach_extensions[i].base; if (extension == base) return TRUE; } return FALSE; } /* Return true if the given ELF header flags describe a 32-bit binary. */ static bfd_boolean mips_32bit_flags_p (flagword flags) { return ((flags & EF_MIPS_32BITMODE) != 0 || (flags & EF_MIPS_ABI) == E_MIPS_ABI_O32 || (flags & EF_MIPS_ABI) == E_MIPS_ABI_EABI32 || (flags & EF_MIPS_ARCH) == E_MIPS_ARCH_1 || (flags & EF_MIPS_ARCH) == E_MIPS_ARCH_2 || (flags & EF_MIPS_ARCH) == E_MIPS_ARCH_32 || (flags & EF_MIPS_ARCH) == E_MIPS_ARCH_32R2); } /* Merge object attributes from IBFD into OBFD. Raise an error if there are conflicting attributes. */ static bfd_boolean mips_elf_merge_obj_attributes (bfd *ibfd, bfd *obfd) { obj_attribute *in_attr; obj_attribute *out_attr; if (!elf_known_obj_attributes_proc (obfd)[0].i) { /* This is the first object. Copy the attributes. */ _bfd_elf_copy_obj_attributes (ibfd, obfd); /* Use the Tag_null value to indicate the attributes have been initialized. */ elf_known_obj_attributes_proc (obfd)[0].i = 1; return TRUE; } /* Check for conflicting Tag_GNU_MIPS_ABI_FP attributes and merge non-conflicting ones. */ in_attr = elf_known_obj_attributes (ibfd)[OBJ_ATTR_GNU]; out_attr = elf_known_obj_attributes (obfd)[OBJ_ATTR_GNU]; if (in_attr[Tag_GNU_MIPS_ABI_FP].i != out_attr[Tag_GNU_MIPS_ABI_FP].i) { out_attr[Tag_GNU_MIPS_ABI_FP].type = 1; if (out_attr[Tag_GNU_MIPS_ABI_FP].i == 0) out_attr[Tag_GNU_MIPS_ABI_FP].i = in_attr[Tag_GNU_MIPS_ABI_FP].i; else if (in_attr[Tag_GNU_MIPS_ABI_FP].i == 0) ; else if (in_attr[Tag_GNU_MIPS_ABI_FP].i > 3) _bfd_error_handler (_("Warning: %B uses unknown floating point ABI %d"), ibfd, in_attr[Tag_GNU_MIPS_ABI_FP].i); else if (out_attr[Tag_GNU_MIPS_ABI_FP].i > 3) _bfd_error_handler (_("Warning: %B uses unknown floating point ABI %d"), obfd, out_attr[Tag_GNU_MIPS_ABI_FP].i); else switch (out_attr[Tag_GNU_MIPS_ABI_FP].i) { case 1: switch (in_attr[Tag_GNU_MIPS_ABI_FP].i) { case 2: _bfd_error_handler (_("Warning: %B uses -msingle-float, %B uses -mdouble-float"), obfd, ibfd); case 3: _bfd_error_handler (_("Warning: %B uses hard float, %B uses soft float"), obfd, ibfd); break; default: abort (); } break; case 2: switch (in_attr[Tag_GNU_MIPS_ABI_FP].i) { case 1: _bfd_error_handler (_("Warning: %B uses -msingle-float, %B uses -mdouble-float"), ibfd, obfd); case 3: _bfd_error_handler (_("Warning: %B uses hard float, %B uses soft float"), obfd, ibfd); break; default: abort (); } break; case 3: switch (in_attr[Tag_GNU_MIPS_ABI_FP].i) { case 1: case 2: _bfd_error_handler (_("Warning: %B uses hard float, %B uses soft float"), ibfd, obfd); break; default: abort (); } break; default: abort (); } } /* Merge Tag_compatibility attributes and any common GNU ones. */ _bfd_elf_merge_object_attributes (ibfd, obfd); return TRUE; } /* Merge backend specific data from an object file to the output object file when linking. */ bfd_boolean _bfd_mips_elf_merge_private_bfd_data (bfd *ibfd, bfd *obfd) { flagword old_flags; flagword new_flags; bfd_boolean ok; bfd_boolean null_input_bfd = TRUE; asection *sec; /* Check if we have the same endianess */ if (! _bfd_generic_verify_endian_match (ibfd, obfd)) { (*_bfd_error_handler) (_("%B: endianness incompatible with that of the selected emulation"), ibfd); return FALSE; } if (bfd_get_flavour (ibfd) != bfd_target_elf_flavour || bfd_get_flavour (obfd) != bfd_target_elf_flavour) return TRUE; if (strcmp (bfd_get_target (ibfd), bfd_get_target (obfd)) != 0) { (*_bfd_error_handler) (_("%B: ABI is incompatible with that of the selected emulation"), ibfd); return FALSE; } if (!mips_elf_merge_obj_attributes (ibfd, obfd)) return FALSE; new_flags = elf_elfheader (ibfd)->e_flags; elf_elfheader (obfd)->e_flags |= new_flags & EF_MIPS_NOREORDER; old_flags = elf_elfheader (obfd)->e_flags; if (! elf_flags_init (obfd)) { elf_flags_init (obfd) = TRUE; elf_elfheader (obfd)->e_flags = new_flags; elf_elfheader (obfd)->e_ident[EI_CLASS] = elf_elfheader (ibfd)->e_ident[EI_CLASS]; if (bfd_get_arch (obfd) == bfd_get_arch (ibfd) && (bfd_get_arch_info (obfd)->the_default || mips_mach_extends_p (bfd_get_mach (obfd), bfd_get_mach (ibfd)))) { if (! bfd_set_arch_mach (obfd, bfd_get_arch (ibfd), bfd_get_mach (ibfd))) return FALSE; } return TRUE; } /* Check flag compatibility. */ new_flags &= ~EF_MIPS_NOREORDER; old_flags &= ~EF_MIPS_NOREORDER; /* Some IRIX 6 BSD-compatibility objects have this bit set. It doesn't seem to matter. */ new_flags &= ~EF_MIPS_XGOT; old_flags &= ~EF_MIPS_XGOT; /* MIPSpro generates ucode info in n64 objects. Again, we should just be able to ignore this. */ new_flags &= ~EF_MIPS_UCODE; old_flags &= ~EF_MIPS_UCODE; /* Don't care about the PIC flags from dynamic objects; they are PIC by design. */ if ((new_flags & (EF_MIPS_PIC | EF_MIPS_CPIC)) != 0 && (ibfd->flags & DYNAMIC) != 0) new_flags &= ~ (EF_MIPS_PIC | EF_MIPS_CPIC); if (new_flags == old_flags) return TRUE; /* Check to see if the input BFD actually contains any sections. If not, its flags may not have been initialised either, but it cannot actually cause any incompatibility. */ for (sec = ibfd->sections; sec != NULL; sec = sec->next) { /* Ignore synthetic sections and empty .text, .data and .bss sections which are automatically generated by gas. */ if (strcmp (sec->name, ".reginfo") && strcmp (sec->name, ".mdebug") && (sec->size != 0 || (strcmp (sec->name, ".text") && strcmp (sec->name, ".data") && strcmp (sec->name, ".bss")))) { null_input_bfd = FALSE; break; } } if (null_input_bfd) return TRUE; ok = TRUE; if (((new_flags & (EF_MIPS_PIC | EF_MIPS_CPIC)) != 0) != ((old_flags & (EF_MIPS_PIC | EF_MIPS_CPIC)) != 0)) { (*_bfd_error_handler) (_("%B: warning: linking PIC files with non-PIC files"), ibfd); ok = TRUE; } if (new_flags & (EF_MIPS_PIC | EF_MIPS_CPIC)) elf_elfheader (obfd)->e_flags |= EF_MIPS_CPIC; if (! (new_flags & EF_MIPS_PIC)) elf_elfheader (obfd)->e_flags &= ~EF_MIPS_PIC; new_flags &= ~ (EF_MIPS_PIC | EF_MIPS_CPIC); old_flags &= ~ (EF_MIPS_PIC | EF_MIPS_CPIC); /* Compare the ISAs. */ if (mips_32bit_flags_p (old_flags) != mips_32bit_flags_p (new_flags)) { (*_bfd_error_handler) (_("%B: linking 32-bit code with 64-bit code"), ibfd); ok = FALSE; } else if (!mips_mach_extends_p (bfd_get_mach (ibfd), bfd_get_mach (obfd))) { /* OBFD's ISA isn't the same as, or an extension of, IBFD's. */ if (mips_mach_extends_p (bfd_get_mach (obfd), bfd_get_mach (ibfd))) { /* Copy the architecture info from IBFD to OBFD. Also copy the 32-bit flag (if set) so that we continue to recognise OBFD as a 32-bit binary. */ bfd_set_arch_info (obfd, bfd_get_arch_info (ibfd)); elf_elfheader (obfd)->e_flags &= ~(EF_MIPS_ARCH | EF_MIPS_MACH); elf_elfheader (obfd)->e_flags |= new_flags & (EF_MIPS_ARCH | EF_MIPS_MACH | EF_MIPS_32BITMODE); /* Copy across the ABI flags if OBFD doesn't use them and if that was what caused us to treat IBFD as 32-bit. */ if ((old_flags & EF_MIPS_ABI) == 0 && mips_32bit_flags_p (new_flags) && !mips_32bit_flags_p (new_flags & ~EF_MIPS_ABI)) elf_elfheader (obfd)->e_flags |= new_flags & EF_MIPS_ABI; } else { /* The ISAs aren't compatible. */ (*_bfd_error_handler) (_("%B: linking %s module with previous %s modules"), ibfd, bfd_printable_name (ibfd), bfd_printable_name (obfd)); ok = FALSE; } } new_flags &= ~(EF_MIPS_ARCH | EF_MIPS_MACH | EF_MIPS_32BITMODE); old_flags &= ~(EF_MIPS_ARCH | EF_MIPS_MACH | EF_MIPS_32BITMODE); /* Compare ABIs. The 64-bit ABI does not use EF_MIPS_ABI. But, it does set EI_CLASS differently from any 32-bit ABI. */ if ((new_flags & EF_MIPS_ABI) != (old_flags & EF_MIPS_ABI) || (elf_elfheader (ibfd)->e_ident[EI_CLASS] != elf_elfheader (obfd)->e_ident[EI_CLASS])) { /* Only error if both are set (to different values). */ if (((new_flags & EF_MIPS_ABI) && (old_flags & EF_MIPS_ABI)) || (elf_elfheader (ibfd)->e_ident[EI_CLASS] != elf_elfheader (obfd)->e_ident[EI_CLASS])) { (*_bfd_error_handler) (_("%B: ABI mismatch: linking %s module with previous %s modules"), ibfd, elf_mips_abi_name (ibfd), elf_mips_abi_name (obfd)); ok = FALSE; } new_flags &= ~EF_MIPS_ABI; old_flags &= ~EF_MIPS_ABI; } /* For now, allow arbitrary mixing of ASEs (retain the union). */ if ((new_flags & EF_MIPS_ARCH_ASE) != (old_flags & EF_MIPS_ARCH_ASE)) { elf_elfheader (obfd)->e_flags |= new_flags & EF_MIPS_ARCH_ASE; new_flags &= ~ EF_MIPS_ARCH_ASE; old_flags &= ~ EF_MIPS_ARCH_ASE; } /* Warn about any other mismatches */ if (new_flags != old_flags) { (*_bfd_error_handler) (_("%B: uses different e_flags (0x%lx) fields than previous modules (0x%lx)"), ibfd, (unsigned long) new_flags, (unsigned long) old_flags); ok = FALSE; } if (! ok) { bfd_set_error (bfd_error_bad_value); return FALSE; } return TRUE; } /* Function to keep MIPS specific file flags like as EF_MIPS_PIC. */ bfd_boolean _bfd_mips_elf_set_private_flags (bfd *abfd, flagword flags) { BFD_ASSERT (!elf_flags_init (abfd) || elf_elfheader (abfd)->e_flags == flags); elf_elfheader (abfd)->e_flags = flags; elf_flags_init (abfd) = TRUE; return TRUE; } bfd_boolean _bfd_mips_elf_print_private_bfd_data (bfd *abfd, void *ptr) { FILE *file = ptr; BFD_ASSERT (abfd != NULL && ptr != NULL); /* Print normal ELF private data. */ _bfd_elf_print_private_bfd_data (abfd, ptr); /* xgettext:c-format */ fprintf (file, _("private flags = %lx:"), elf_elfheader (abfd)->e_flags); if ((elf_elfheader (abfd)->e_flags & EF_MIPS_ABI) == E_MIPS_ABI_O32) fprintf (file, _(" [abi=O32]")); else if ((elf_elfheader (abfd)->e_flags & EF_MIPS_ABI) == E_MIPS_ABI_O64) fprintf (file, _(" [abi=O64]")); else if ((elf_elfheader (abfd)->e_flags & EF_MIPS_ABI) == E_MIPS_ABI_EABI32) fprintf (file, _(" [abi=EABI32]")); else if ((elf_elfheader (abfd)->e_flags & EF_MIPS_ABI) == E_MIPS_ABI_EABI64) fprintf (file, _(" [abi=EABI64]")); else if ((elf_elfheader (abfd)->e_flags & EF_MIPS_ABI)) fprintf (file, _(" [abi unknown]")); else if (ABI_N32_P (abfd)) fprintf (file, _(" [abi=N32]")); else if (ABI_64_P (abfd)) fprintf (file, _(" [abi=64]")); else fprintf (file, _(" [no abi set]")); if ((elf_elfheader (abfd)->e_flags & EF_MIPS_ARCH) == E_MIPS_ARCH_1) fprintf (file, " [mips1]"); else if ((elf_elfheader (abfd)->e_flags & EF_MIPS_ARCH) == E_MIPS_ARCH_2) fprintf (file, " [mips2]"); else if ((elf_elfheader (abfd)->e_flags & EF_MIPS_ARCH) == E_MIPS_ARCH_3) fprintf (file, " [mips3]"); else if ((elf_elfheader (abfd)->e_flags & EF_MIPS_ARCH) == E_MIPS_ARCH_4) fprintf (file, " [mips4]"); else if ((elf_elfheader (abfd)->e_flags & EF_MIPS_ARCH) == E_MIPS_ARCH_5) fprintf (file, " [mips5]"); else if ((elf_elfheader (abfd)->e_flags & EF_MIPS_ARCH) == E_MIPS_ARCH_32) fprintf (file, " [mips32]"); else if ((elf_elfheader (abfd)->e_flags & EF_MIPS_ARCH) == E_MIPS_ARCH_64) fprintf (file, " [mips64]"); else if ((elf_elfheader (abfd)->e_flags & EF_MIPS_ARCH) == E_MIPS_ARCH_32R2) fprintf (file, " [mips32r2]"); else if ((elf_elfheader (abfd)->e_flags & EF_MIPS_ARCH) == E_MIPS_ARCH_64R2) fprintf (file, " [mips64r2]"); else fprintf (file, _(" [unknown ISA]")); if (elf_elfheader (abfd)->e_flags & EF_MIPS_ARCH_ASE_MDMX) fprintf (file, " [mdmx]"); if (elf_elfheader (abfd)->e_flags & EF_MIPS_ARCH_ASE_M16) fprintf (file, " [mips16]"); if (elf_elfheader (abfd)->e_flags & EF_MIPS_32BITMODE) fprintf (file, " [32bitmode]"); else fprintf (file, _(" [not 32bitmode]")); if (elf_elfheader (abfd)->e_flags & EF_MIPS_NOREORDER) fprintf (file, " [noreorder]"); if (elf_elfheader (abfd)->e_flags & EF_MIPS_PIC) fprintf (file, " [PIC]"); if (elf_elfheader (abfd)->e_flags & EF_MIPS_CPIC) fprintf (file, " [CPIC]"); if (elf_elfheader (abfd)->e_flags & EF_MIPS_XGOT) fprintf (file, " [XGOT]"); if (elf_elfheader (abfd)->e_flags & EF_MIPS_UCODE) fprintf (file, " [UCODE]"); fputc ('\n', file); return TRUE; } const struct bfd_elf_special_section _bfd_mips_elf_special_sections[] = { { STRING_COMMA_LEN (".lit4"), 0, SHT_PROGBITS, SHF_ALLOC + SHF_WRITE + SHF_MIPS_GPREL }, { STRING_COMMA_LEN (".lit8"), 0, SHT_PROGBITS, SHF_ALLOC + SHF_WRITE + SHF_MIPS_GPREL }, { STRING_COMMA_LEN (".mdebug"), 0, SHT_MIPS_DEBUG, 0 }, { STRING_COMMA_LEN (".sbss"), -2, SHT_NOBITS, SHF_ALLOC + SHF_WRITE + SHF_MIPS_GPREL }, { STRING_COMMA_LEN (".sdata"), -2, SHT_PROGBITS, SHF_ALLOC + SHF_WRITE + SHF_MIPS_GPREL }, { STRING_COMMA_LEN (".ucode"), 0, SHT_MIPS_UCODE, 0 }, { NULL, 0, 0, 0, 0 } }; /* Merge non visibility st_other attributes. Ensure that the STO_OPTIONAL flag is copied into h->other, even if this is not a definiton of the symbol. */ void _bfd_mips_elf_merge_symbol_attribute (struct elf_link_hash_entry *h, const Elf_Internal_Sym *isym, bfd_boolean definition, bfd_boolean dynamic ATTRIBUTE_UNUSED) { if ((isym->st_other & ~ELF_ST_VISIBILITY (-1)) != 0) { unsigned char other; other = (definition ? isym->st_other : h->other); other &= ~ELF_ST_VISIBILITY (-1); h->other = other | ELF_ST_VISIBILITY (h->other); } if (!definition && ELF_MIPS_IS_OPTIONAL (isym->st_other)) h->other |= STO_OPTIONAL; } /* Decide whether an undefined symbol is special and can be ignored. This is the case for OPTIONAL symbols on IRIX. */ bfd_boolean _bfd_mips_elf_ignore_undef_symbol (struct elf_link_hash_entry *h) { return ELF_MIPS_IS_OPTIONAL (h->other) ? TRUE : FALSE; } bfd_boolean _bfd_mips_elf_common_definition (Elf_Internal_Sym *sym) { return (sym->st_shndx == SHN_COMMON || sym->st_shndx == SHN_MIPS_ACOMMON || sym->st_shndx == SHN_MIPS_SCOMMON); } Index: user/alc/PQ_LAUNDRY/contrib/binutils =================================================================== --- user/alc/PQ_LAUNDRY/contrib/binutils (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/binutils (revision 303206) Property changes on: user/alc/PQ_LAUNDRY/contrib/binutils ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/contrib/binutils:r303053-303204 Index: user/alc/PQ_LAUNDRY/contrib/libcxxrt/exception.cc =================================================================== --- user/alc/PQ_LAUNDRY/contrib/libcxxrt/exception.cc (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/libcxxrt/exception.cc (revision 303206) @@ -1,1550 +1,1568 @@ /* * Copyright 2010-2011 PathScale, Inc. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * 1. Redistributions of source code must retain the above copyright notice, * this list of conditions and the following disclaimer. * * 2. Redistributions in binary form must reproduce the above copyright notice, * this list of conditions and the following disclaimer in the documentation * and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS * IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include #include #include #include #include #include #include "typeinfo.h" #include "dwarf_eh.h" #include "atomic.h" #include "cxxabi.h" #pragma weak pthread_key_create #pragma weak pthread_setspecific #pragma weak pthread_getspecific #pragma weak pthread_once #ifdef LIBCXXRT_WEAK_LOCKS #pragma weak pthread_mutex_lock #define pthread_mutex_lock(mtx) do {\ if (pthread_mutex_lock) pthread_mutex_lock(mtx);\ } while(0) #pragma weak pthread_mutex_unlock #define pthread_mutex_unlock(mtx) do {\ if (pthread_mutex_unlock) pthread_mutex_unlock(mtx);\ } while(0) #pragma weak pthread_cond_signal #define pthread_cond_signal(cv) do {\ if (pthread_cond_signal) pthread_cond_signal(cv);\ } while(0) #pragma weak pthread_cond_wait #define pthread_cond_wait(cv, mtx) do {\ if (pthread_cond_wait) pthread_cond_wait(cv, mtx);\ } while(0) #endif using namespace ABI_NAMESPACE; /** * Saves the result of the landing pad that we have found. For ARM, this is * stored in the generic unwind structure, while on other platforms it is * stored in the C++ exception. */ static void saveLandingPad(struct _Unwind_Context *context, struct _Unwind_Exception *ucb, struct __cxa_exception *ex, int selector, dw_eh_ptr_t landingPad) { #if defined(__arm__) && !defined(__ARM_DWARF_EH__) // On ARM, we store the saved exception in the generic part of the structure ucb->barrier_cache.sp = _Unwind_GetGR(context, 13); ucb->barrier_cache.bitpattern[1] = static_cast(selector); ucb->barrier_cache.bitpattern[3] = reinterpret_cast(landingPad); #endif // Cache the results for the phase 2 unwind, if we found a handler // and this is not a foreign exception. if (ex) { ex->handlerSwitchValue = selector; ex->catchTemp = landingPad; } } /** * Loads the saved landing pad. Returns 1 on success, 0 on failure. */ static int loadLandingPad(struct _Unwind_Context *context, struct _Unwind_Exception *ucb, struct __cxa_exception *ex, unsigned long *selector, dw_eh_ptr_t *landingPad) { #if defined(__arm__) && !defined(__ARM_DWARF_EH__) *selector = ucb->barrier_cache.bitpattern[1]; *landingPad = reinterpret_cast(ucb->barrier_cache.bitpattern[3]); return 1; #else if (ex) { *selector = ex->handlerSwitchValue; *landingPad = reinterpret_cast(ex->catchTemp); return 0; } return 0; #endif } static inline _Unwind_Reason_Code continueUnwinding(struct _Unwind_Exception *ex, struct _Unwind_Context *context) { #if defined(__arm__) && !defined(__ARM_DWARF_EH__) if (__gnu_unwind_frame(ex, context) != _URC_OK) { return _URC_FAILURE; } #endif return _URC_CONTINUE_UNWIND; } extern "C" void __cxa_free_exception(void *thrown_exception); extern "C" void __cxa_free_dependent_exception(void *thrown_exception); extern "C" void* __dynamic_cast(const void *sub, const __class_type_info *src, const __class_type_info *dst, ptrdiff_t src2dst_offset); /** * The type of a handler that has been found. */ typedef enum { /** No handler. */ handler_none, /** * A cleanup - the exception will propagate through this frame, but code * must be run when this happens. */ handler_cleanup, /** * A catch statement. The exception will not propagate past this frame * (without an explicit rethrow). */ handler_catch } handler_type; /** * Per-thread info required by the runtime. We store a single structure * pointer in thread-local storage, because this tends to be a scarce resource * and it's impolite to steal all of it and not leave any for the rest of the * program. * * Instances of this structure are allocated lazily - at most one per thread - * and are destroyed on thread termination. */ struct __cxa_thread_info { /** The termination handler for this thread. */ terminate_handler terminateHandler; /** The unexpected exception handler for this thread. */ unexpected_handler unexpectedHandler; /** * The number of emergency buffers held by this thread. This is 0 in * normal operation - the emergency buffers are only used when malloc() * fails to return memory for allocating an exception. Threads are not * permitted to hold more than 4 emergency buffers (as per recommendation * in ABI spec [3.3.1]). */ int emergencyBuffersHeld; /** * The exception currently running in a cleanup. */ _Unwind_Exception *currentCleanup; /** * Our state with respect to foreign exceptions. Usually none, set to * caught if we have just caught an exception and rethrown if we are * rethrowing it. */ enum { none, caught, rethrown } foreign_exception_state; /** * The public part of this structure, accessible from outside of this * module. */ __cxa_eh_globals globals; }; /** * Dependent exception. This */ struct __cxa_dependent_exception { #if __LP64__ void *primaryException; #endif std::type_info *exceptionType; void (*exceptionDestructor) (void *); unexpected_handler unexpectedHandler; terminate_handler terminateHandler; __cxa_exception *nextException; int handlerCount; #if defined(__arm__) && !defined(__ARM_DWARF_EH__) _Unwind_Exception *nextCleanup; int cleanupCount; #endif int handlerSwitchValue; const char *actionRecord; const char *languageSpecificData; void *catchTemp; void *adjustedPtr; #if !__LP64__ void *primaryException; #endif _Unwind_Exception unwindHeader; }; namespace std { void unexpected(); class exception { public: virtual ~exception() throw(); virtual const char* what() const throw(); }; } /** * Class of exceptions to distinguish between this and other exception types. * * The first four characters are the vendor ID. Currently, we use GNUC, * because we aim for ABI-compatibility with the GNU implementation, and * various checks may test for equality of the class, which is incorrect. */ static const uint64_t exception_class = EXCEPTION_CLASS('G', 'N', 'U', 'C', 'C', '+', '+', '\0'); /** * Class used for dependent exceptions. */ static const uint64_t dependent_exception_class = EXCEPTION_CLASS('G', 'N', 'U', 'C', 'C', '+', '+', '\x01'); /** * The low four bytes of the exception class, indicating that we conform to the * Itanium C++ ABI. This is currently unused, but should be used in the future * if we change our exception class, to allow this library and libsupc++ to be * linked to the same executable and both to interoperate. */ static const uint32_t abi_exception_class = GENERIC_EXCEPTION_CLASS('C', '+', '+', '\0'); static bool isCXXException(uint64_t cls) { return (cls == exception_class) || (cls == dependent_exception_class); } static bool isDependentException(uint64_t cls) { return cls == dependent_exception_class; } static __cxa_exception *exceptionFromPointer(void *ex) { return reinterpret_cast<__cxa_exception*>(static_cast(ex) - offsetof(struct __cxa_exception, unwindHeader)); } static __cxa_exception *realExceptionFromException(__cxa_exception *ex) { if (!isDependentException(ex->unwindHeader.exception_class)) { return ex; } return reinterpret_cast<__cxa_exception*>((reinterpret_cast<__cxa_dependent_exception*>(ex))->primaryException)-1; } namespace std { // Forward declaration of standard library terminate() function used to // abort execution. void terminate(void); } using namespace ABI_NAMESPACE; /** The global termination handler. */ static terminate_handler terminateHandler = abort; /** The global unexpected exception handler. */ static unexpected_handler unexpectedHandler = std::terminate; /** Key used for thread-local data. */ static pthread_key_t eh_key; /** * Cleanup function, allowing foreign exception handlers to correctly destroy * this exception if they catch it. */ static void exception_cleanup(_Unwind_Reason_Code reason, struct _Unwind_Exception *ex) { // Exception layout: // [__cxa_exception [_Unwind_Exception]] [exception object] // // __cxa_free_exception expects a pointer to the exception object __cxa_free_exception(static_cast(ex + 1)); } static void dependent_exception_cleanup(_Unwind_Reason_Code reason, struct _Unwind_Exception *ex) { __cxa_free_dependent_exception(static_cast(ex + 1)); } /** * Recursively walk a list of exceptions and delete them all in post-order. */ static void free_exception_list(__cxa_exception *ex) { if (0 != ex->nextException) { free_exception_list(ex->nextException); } // __cxa_free_exception() expects to be passed the thrown object, which // immediately follows the exception, not the exception itself __cxa_free_exception(ex+1); } /** * Cleanup function called when a thread exists to make certain that all of the * per-thread data is deleted. */ static void thread_cleanup(void* thread_info) { __cxa_thread_info *info = static_cast<__cxa_thread_info*>(thread_info); if (info->globals.caughtExceptions) { // If this is a foreign exception, ask it to clean itself up. if (info->foreign_exception_state != __cxa_thread_info::none) { _Unwind_Exception *e = reinterpret_cast<_Unwind_Exception*>(info->globals.caughtExceptions); if (e->exception_cleanup) e->exception_cleanup(_URC_FOREIGN_EXCEPTION_CAUGHT, e); } else { free_exception_list(info->globals.caughtExceptions); } } free(thread_info); } /** * Once control used to protect the key creation. */ static pthread_once_t once_control = PTHREAD_ONCE_INIT; /** * We may not be linked against a full pthread implementation. If we're not, * then we need to fake the thread-local storage by storing 'thread-local' * things in a global. */ static bool fakeTLS; /** * Thread-local storage for a single-threaded program. */ static __cxa_thread_info singleThreadInfo; /** * Initialise eh_key. */ static void init_key(void) { if ((0 == pthread_key_create) || (0 == pthread_setspecific) || (0 == pthread_getspecific)) { fakeTLS = true; return; } pthread_key_create(&eh_key, thread_cleanup); pthread_setspecific(eh_key, reinterpret_cast(0x42)); fakeTLS = (pthread_getspecific(eh_key) != reinterpret_cast(0x42)); pthread_setspecific(eh_key, 0); } /** * Returns the thread info structure, creating it if it is not already created. */ static __cxa_thread_info *thread_info() { if ((0 == pthread_once) || pthread_once(&once_control, init_key)) { fakeTLS = true; } if (fakeTLS) { return &singleThreadInfo; } __cxa_thread_info *info = static_cast<__cxa_thread_info*>(pthread_getspecific(eh_key)); if (0 == info) { info = static_cast<__cxa_thread_info*>(calloc(1, sizeof(__cxa_thread_info))); pthread_setspecific(eh_key, info); } return info; } /** * Fast version of thread_info(). May fail if thread_info() is not called on * this thread at least once already. */ static __cxa_thread_info *thread_info_fast() { if (fakeTLS) { return &singleThreadInfo; } return static_cast<__cxa_thread_info*>(pthread_getspecific(eh_key)); } /** * ABI function returning the __cxa_eh_globals structure. */ extern "C" __cxa_eh_globals *ABI_NAMESPACE::__cxa_get_globals(void) { return &(thread_info()->globals); } /** * Version of __cxa_get_globals() assuming that __cxa_get_globals() has already * been called at least once by this thread. */ extern "C" __cxa_eh_globals *ABI_NAMESPACE::__cxa_get_globals_fast(void) { return &(thread_info_fast()->globals); } /** * An emergency allocation reserved for when malloc fails. This is treated as * 16 buffers of 1KB each. */ static char emergency_buffer[16384]; /** * Flag indicating whether each buffer is allocated. */ static bool buffer_allocated[16]; /** * Lock used to protect emergency allocation. */ static pthread_mutex_t emergency_malloc_lock = PTHREAD_MUTEX_INITIALIZER; /** * Condition variable used to wait when two threads are both trying to use the * emergency malloc() buffer at once. */ static pthread_cond_t emergency_malloc_wait = PTHREAD_COND_INITIALIZER; /** * Allocates size bytes from the emergency allocation mechanism, if possible. * This function will fail if size is over 1KB or if this thread already has 4 * emergency buffers. If all emergency buffers are allocated, it will sleep * until one becomes available. */ static char *emergency_malloc(size_t size) { if (size > 1024) { return 0; } __cxa_thread_info *info = thread_info(); // Only 4 emergency buffers allowed per thread! if (info->emergencyBuffersHeld > 3) { return 0; } pthread_mutex_lock(&emergency_malloc_lock); int buffer = -1; while (buffer < 0) { // While we were sleeping on the lock, another thread might have free'd // enough memory for us to use, so try the allocation again - no point // using the emergency buffer if there is some real memory that we can // use... void *m = calloc(1, size); if (0 != m) { pthread_mutex_unlock(&emergency_malloc_lock); return static_cast(m); } for (int i=0 ; i<16 ; i++) { if (!buffer_allocated[i]) { buffer = i; buffer_allocated[i] = true; break; } } // If there still isn't a buffer available, then sleep on the condition // variable. This will be signalled when another thread releases one // of the emergency buffers. if (buffer < 0) { pthread_cond_wait(&emergency_malloc_wait, &emergency_malloc_lock); } } pthread_mutex_unlock(&emergency_malloc_lock); info->emergencyBuffersHeld++; return emergency_buffer + (1024 * buffer); } /** * Frees a buffer returned by emergency_malloc(). * * Note: Neither this nor emergency_malloc() is particularly efficient. This * should not matter, because neither will be called in normal operation - they * are only used when the program runs out of memory, which should not happen * often. */ static void emergency_malloc_free(char *ptr) { int buffer = -1; // Find the buffer corresponding to this pointer. for (int i=0 ; i<16 ; i++) { if (ptr == static_cast(emergency_buffer + (1024 * i))) { buffer = i; break; } } assert(buffer >= 0 && "Trying to free something that is not an emergency buffer!"); // emergency_malloc() is expected to return 0-initialized data. We don't // zero the buffer when allocating it, because the static buffers will // begin life containing 0 values. memset(ptr, 0, 1024); // Signal the condition variable to wake up any threads that are blocking // waiting for some space in the emergency buffer pthread_mutex_lock(&emergency_malloc_lock); // In theory, we don't need to do this with the lock held. In practice, // our array of bools will probably be updated using 32-bit or 64-bit // memory operations, so this update may clobber adjacent values. buffer_allocated[buffer] = false; pthread_cond_signal(&emergency_malloc_wait); pthread_mutex_unlock(&emergency_malloc_lock); } static char *alloc_or_die(size_t size) { char *buffer = static_cast(calloc(1, size)); // If calloc() doesn't want to give us any memory, try using an emergency // buffer. if (0 == buffer) { buffer = emergency_malloc(size); // This is only reached if the allocation is greater than 1KB, and // anyone throwing objects that big really should know better. if (0 == buffer) { fprintf(stderr, "Out of memory attempting to allocate exception\n"); std::terminate(); } } return buffer; } static void free_exception(char *e) { // If this allocation is within the address range of the emergency buffer, // don't call free() because it was not allocated with malloc() if ((e >= emergency_buffer) && (e < (emergency_buffer + sizeof(emergency_buffer)))) { emergency_malloc_free(e); } else { free(e); } } +#ifdef __LP64__ /** + * There's an ABI bug in __cxa_exception: unwindHeader requires 16-byte + * alignment but it was broken by the addition of the referenceCount. + * The unwindHeader is at offset 0x58 in __cxa_exception. In order to keep + * compatibility with consumers of the broken __cxa_exception, explicitly add + * padding on allocation (and account for it on free). + */ +static const int exception_alignment_padding = 8; +#else +static const int exception_alignment_padding = 0; +#endif + +/** * Allocates an exception structure. Returns a pointer to the space that can * be used to store an object of thrown_size bytes. This function will use an * emergency buffer if malloc() fails, and may block if there are no such * buffers available. */ extern "C" void *__cxa_allocate_exception(size_t thrown_size) { - size_t size = thrown_size + sizeof(__cxa_exception); + size_t size = exception_alignment_padding + sizeof(__cxa_exception) + + thrown_size; char *buffer = alloc_or_die(size); - return buffer+sizeof(__cxa_exception); + return buffer + exception_alignment_padding + sizeof(__cxa_exception); } extern "C" void *__cxa_allocate_dependent_exception(void) { - size_t size = sizeof(__cxa_dependent_exception); + size_t size = exception_alignment_padding + + sizeof(__cxa_dependent_exception); char *buffer = alloc_or_die(size); - return buffer+sizeof(__cxa_dependent_exception); + return buffer + exception_alignment_padding + + sizeof(__cxa_dependent_exception); } /** * __cxa_free_exception() is called when an exception was thrown in between * calling __cxa_allocate_exception() and actually throwing the exception. * This happens when the object's copy constructor throws an exception. * * In this implementation, it is also called by __cxa_end_catch() and during * thread cleanup. */ extern "C" void __cxa_free_exception(void *thrown_exception) { __cxa_exception *ex = reinterpret_cast<__cxa_exception*>(thrown_exception) - 1; // Free the object that was thrown, calling its destructor if (0 != ex->exceptionDestructor) { try { ex->exceptionDestructor(thrown_exception); } catch(...) { // FIXME: Check that this is really what the spec says to do. std::terminate(); } } - free_exception(reinterpret_cast(ex)); + free_exception(reinterpret_cast(ex) - + exception_alignment_padding); } static void releaseException(__cxa_exception *exception) { if (isDependentException(exception->unwindHeader.exception_class)) { __cxa_free_dependent_exception(exception+1); return; } if (__sync_sub_and_fetch(&exception->referenceCount, 1) == 0) { // __cxa_free_exception() expects to be passed the thrown object, // which immediately follows the exception, not the exception // itself __cxa_free_exception(exception+1); } } void __cxa_free_dependent_exception(void *thrown_exception) { __cxa_dependent_exception *ex = reinterpret_cast<__cxa_dependent_exception*>(thrown_exception) - 1; assert(isDependentException(ex->unwindHeader.exception_class)); if (ex->primaryException) { releaseException(realExceptionFromException(reinterpret_cast<__cxa_exception*>(ex))); } - free_exception(reinterpret_cast(ex)); + free_exception(reinterpret_cast(ex) - + exception_alignment_padding); } /** * Callback function used with _Unwind_Backtrace(). * * Prints a stack trace. Used only for debugging help. * * Note: As of FreeBSD 8.1, dladd() still doesn't work properly, so this only * correctly prints function names from public, relocatable, symbols. */ static _Unwind_Reason_Code trace(struct _Unwind_Context *context, void *c) { Dl_info myinfo; int mylookup = dladdr(reinterpret_cast(__cxa_current_exception_type), &myinfo); void *ip = reinterpret_cast(_Unwind_GetIP(context)); Dl_info info; if (dladdr(ip, &info) != 0) { if (mylookup == 0 || strcmp(info.dli_fname, myinfo.dli_fname) != 0) { printf("%p:%s() in %s\n", ip, info.dli_sname, info.dli_fname); } } return _URC_CONTINUE_UNWIND; } /** * Report a failure that occurred when attempting to throw an exception. * * If the failure happened by falling off the end of the stack without finding * a handler, prints a back trace before aborting. */ #if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4) extern "C" void *__cxa_begin_catch(void *e) throw(); #else extern "C" void *__cxa_begin_catch(void *e); #endif static void report_failure(_Unwind_Reason_Code err, __cxa_exception *thrown_exception) { switch (err) { default: break; case _URC_FATAL_PHASE1_ERROR: fprintf(stderr, "Fatal error during phase 1 unwinding\n"); break; #if !defined(__arm__) || defined(__ARM_DWARF_EH__) case _URC_FATAL_PHASE2_ERROR: fprintf(stderr, "Fatal error during phase 2 unwinding\n"); break; #endif case _URC_END_OF_STACK: __cxa_begin_catch (&(thrown_exception->unwindHeader)); std::terminate(); fprintf(stderr, "Terminating due to uncaught exception %p", static_cast(thrown_exception)); thrown_exception = realExceptionFromException(thrown_exception); static const __class_type_info *e_ti = static_cast(&typeid(std::exception)); const __class_type_info *throw_ti = dynamic_cast(thrown_exception->exceptionType); if (throw_ti) { std::exception *e = static_cast(e_ti->cast_to(static_cast(thrown_exception+1), throw_ti)); if (e) { fprintf(stderr, " '%s'", e->what()); } } size_t bufferSize = 128; char *demangled = static_cast(malloc(bufferSize)); const char *mangled = thrown_exception->exceptionType->name(); int status; demangled = __cxa_demangle(mangled, demangled, &bufferSize, &status); fprintf(stderr, " of type %s\n", status == 0 ? demangled : mangled); if (status == 0) { free(demangled); } // Print a back trace if no handler is found. // TODO: Make this optional #ifndef __arm__ _Unwind_Backtrace(trace, 0); #endif // Just abort. No need to call std::terminate for the second time abort(); break; } std::terminate(); } static void throw_exception(__cxa_exception *ex) { __cxa_thread_info *info = thread_info(); ex->unexpectedHandler = info->unexpectedHandler; if (0 == ex->unexpectedHandler) { ex->unexpectedHandler = unexpectedHandler; } ex->terminateHandler = info->terminateHandler; if (0 == ex->terminateHandler) { ex->terminateHandler = terminateHandler; } info->globals.uncaughtExceptions++; _Unwind_Reason_Code err = _Unwind_RaiseException(&ex->unwindHeader); // The _Unwind_RaiseException() function should not return, it should // unwind the stack past this function. If it does return, then something // has gone wrong. report_failure(err, ex); } /** * ABI function for throwing an exception. Takes the object to be thrown (the * pointer returned by __cxa_allocate_exception()), the type info for the * pointee, and the destructor (if there is one) as arguments. */ extern "C" void __cxa_throw(void *thrown_exception, std::type_info *tinfo, void(*dest)(void*)) { __cxa_exception *ex = reinterpret_cast<__cxa_exception*>(thrown_exception) - 1; ex->referenceCount = 1; ex->exceptionType = tinfo; ex->exceptionDestructor = dest; ex->unwindHeader.exception_class = exception_class; ex->unwindHeader.exception_cleanup = exception_cleanup; throw_exception(ex); } extern "C" void __cxa_rethrow_primary_exception(void* thrown_exception) { if (NULL == thrown_exception) { return; } __cxa_exception *original = exceptionFromPointer(thrown_exception); __cxa_dependent_exception *ex = reinterpret_cast<__cxa_dependent_exception*>(__cxa_allocate_dependent_exception())-1; ex->primaryException = thrown_exception; __cxa_increment_exception_refcount(thrown_exception); ex->exceptionType = original->exceptionType; ex->unwindHeader.exception_class = dependent_exception_class; ex->unwindHeader.exception_cleanup = dependent_exception_cleanup; throw_exception(reinterpret_cast<__cxa_exception*>(ex)); } extern "C" void *__cxa_current_primary_exception(void) { __cxa_eh_globals* globals = __cxa_get_globals(); __cxa_exception *ex = globals->caughtExceptions; if (0 == ex) { return NULL; } ex = realExceptionFromException(ex); __sync_fetch_and_add(&ex->referenceCount, 1); return ex + 1; } extern "C" void __cxa_increment_exception_refcount(void* thrown_exception) { if (NULL == thrown_exception) { return; } __cxa_exception *ex = static_cast<__cxa_exception*>(thrown_exception) - 1; if (isDependentException(ex->unwindHeader.exception_class)) { return; } __sync_fetch_and_add(&ex->referenceCount, 1); } extern "C" void __cxa_decrement_exception_refcount(void* thrown_exception) { if (NULL == thrown_exception) { return; } __cxa_exception *ex = static_cast<__cxa_exception*>(thrown_exception) - 1; releaseException(ex); } /** * ABI function. Rethrows the current exception. Does not remove the * exception from the stack or decrement its handler count - the compiler is * expected to set the landing pad for this function to the end of the catch * block, and then call _Unwind_Resume() to continue unwinding once * __cxa_end_catch() has been called and any cleanup code has been run. */ extern "C" void __cxa_rethrow() { __cxa_thread_info *ti = thread_info(); __cxa_eh_globals *globals = &ti->globals; // Note: We don't remove this from the caught list here, because // __cxa_end_catch will be called when we unwind out of the try block. We // could probably make this faster by providing an alternative rethrow // function and ensuring that all cleanup code is run before calling it, so // we can skip the top stack frame when unwinding. __cxa_exception *ex = globals->caughtExceptions; if (0 == ex) { fprintf(stderr, "Attempting to rethrow an exception that doesn't exist!\n"); std::terminate(); } if (ti->foreign_exception_state != __cxa_thread_info::none) { ti->foreign_exception_state = __cxa_thread_info::rethrown; _Unwind_Exception *e = reinterpret_cast<_Unwind_Exception*>(ex); _Unwind_Reason_Code err = _Unwind_Resume_or_Rethrow(e); report_failure(err, ex); return; } assert(ex->handlerCount > 0 && "Rethrowing uncaught exception!"); // ex->handlerCount will be decremented in __cxa_end_catch in enclosing // catch block // Make handler count negative. This will tell __cxa_end_catch that // exception was rethrown and exception object should not be destroyed // when handler count become zero ex->handlerCount = -ex->handlerCount; // Continue unwinding the stack with this exception. This should unwind to // the place in the caller where __cxa_end_catch() is called. The caller // will then run cleanup code and bounce the exception back with // _Unwind_Resume(). _Unwind_Reason_Code err = _Unwind_Resume_or_Rethrow(&ex->unwindHeader); report_failure(err, ex); } /** * Returns the type_info object corresponding to the filter. */ static std::type_info *get_type_info_entry(_Unwind_Context *context, dwarf_eh_lsda *lsda, int filter) { // Get the address of the record in the table. dw_eh_ptr_t record = lsda->type_table - dwarf_size_of_fixed_size_field(lsda->type_table_encoding)*filter; //record -= 4; dw_eh_ptr_t start = record; // Read the value, but it's probably an indirect reference... int64_t offset = read_value(lsda->type_table_encoding, &record); // (If the entry is 0, don't try to dereference it. That would be bad.) if (offset == 0) { return 0; } // ...so we need to resolve it return reinterpret_cast(resolve_indirect_value(context, lsda->type_table_encoding, offset, start)); } /** * Checks the type signature found in a handler against the type of the thrown * object. If ex is 0 then it is assumed to be a foreign exception and only * matches cleanups. */ static bool check_type_signature(__cxa_exception *ex, const std::type_info *type, void *&adjustedPtr) { void *exception_ptr = static_cast(ex+1); const std::type_info *ex_type = ex ? ex->exceptionType : 0; bool is_ptr = ex ? ex_type->__is_pointer_p() : false; if (is_ptr) { exception_ptr = *static_cast(exception_ptr); } // Always match a catchall, even with a foreign exception // // Note: A 0 here is a catchall, not a cleanup, so we return true to // indicate that we found a catch. if (0 == type) { if (ex) { adjustedPtr = exception_ptr; } return true; } if (0 == ex) { return false; } // If the types are the same, no casting is needed. if (*type == *ex_type) { adjustedPtr = exception_ptr; return true; } if (type->__do_catch(ex_type, &exception_ptr, 1)) { adjustedPtr = exception_ptr; return true; } return false; } /** * Checks whether the exception matches the type specifiers in this action * record. If the exception only matches cleanups, then this returns false. * If it matches a catch (including a catchall) then it returns true. * * The selector argument is used to return the selector that is passed in the * second exception register when installing the context. */ static handler_type check_action_record(_Unwind_Context *context, dwarf_eh_lsda *lsda, dw_eh_ptr_t action_record, __cxa_exception *ex, unsigned long *selector, void *&adjustedPtr) { if (!action_record) { return handler_cleanup; } handler_type found = handler_none; while (action_record) { int filter = read_sleb128(&action_record); dw_eh_ptr_t action_record_offset_base = action_record; int displacement = read_sleb128(&action_record); action_record = displacement ? action_record_offset_base + displacement : 0; // We only check handler types for C++ exceptions - foreign exceptions // are only allowed for cleanups and catchalls. if (filter > 0) { std::type_info *handler_type = get_type_info_entry(context, lsda, filter); if (check_type_signature(ex, handler_type, adjustedPtr)) { *selector = filter; return handler_catch; } } else if (filter < 0 && 0 != ex) { bool matched = false; *selector = filter; #if defined(__arm__) && !defined(__ARM_DWARF_EH__) filter++; std::type_info *handler_type = get_type_info_entry(context, lsda, filter--); while (handler_type) { if (check_type_signature(ex, handler_type, adjustedPtr)) { matched = true; break; } handler_type = get_type_info_entry(context, lsda, filter--); } #else unsigned char *type_index = reinterpret_cast(lsda->type_table) - filter - 1; while (*type_index) { std::type_info *handler_type = get_type_info_entry(context, lsda, *(type_index++)); // If the exception spec matches a permitted throw type for // this function, don't report a handler - we are allowed to // propagate this exception out. if (check_type_signature(ex, handler_type, adjustedPtr)) { matched = true; break; } } #endif if (matched) { continue; } // If we don't find an allowed exception spec, we need to install // the context for this action. The landing pad will then call the // unexpected exception function. Treat this as a catch return handler_catch; } else if (filter == 0) { *selector = filter; found = handler_cleanup; } } return found; } static void pushCleanupException(_Unwind_Exception *exceptionObject, __cxa_exception *ex) { #if defined(__arm__) && !defined(__ARM_DWARF_EH__) __cxa_thread_info *info = thread_info_fast(); if (ex) { ex->cleanupCount++; if (ex->cleanupCount > 1) { assert(exceptionObject == info->currentCleanup); return; } ex->nextCleanup = info->currentCleanup; } info->currentCleanup = exceptionObject; #endif } /** * The exception personality function. This is referenced in the unwinding * DWARF metadata and is called by the unwind library for each C++ stack frame * containing catch or cleanup code. */ extern "C" BEGIN_PERSONALITY_FUNCTION(__gxx_personality_v0) // This personality function is for version 1 of the ABI. If you use it // with a future version of the ABI, it won't know what to do, so it // reports a fatal error and give up before it breaks anything. if (1 != version) { return _URC_FATAL_PHASE1_ERROR; } __cxa_exception *ex = 0; __cxa_exception *realEx = 0; // If this exception is throw by something else then we can't make any // assumptions about its layout beyond the fields declared in // _Unwind_Exception. bool foreignException = !isCXXException(exceptionClass); // If this isn't a foreign exception, then we have a C++ exception structure if (!foreignException) { ex = exceptionFromPointer(exceptionObject); realEx = realExceptionFromException(ex); } #if defined(__arm__) && !defined(__ARM_DWARF_EH__) unsigned char *lsda_addr = static_cast(_Unwind_GetLanguageSpecificData(context)); #else unsigned char *lsda_addr = reinterpret_cast(static_cast(_Unwind_GetLanguageSpecificData(context))); #endif // No LSDA implies no landing pads - try the next frame if (0 == lsda_addr) { return continueUnwinding(exceptionObject, context); } // These two variables define how the exception will be handled. dwarf_eh_action action = {0}; unsigned long selector = 0; // During the search phase, we do a complete lookup. If we return // _URC_HANDLER_FOUND, then the phase 2 unwind will call this function with // a _UA_HANDLER_FRAME action, telling us to install the handler frame. If // we return _URC_CONTINUE_UNWIND, we may be called again later with a // _UA_CLEANUP_PHASE action for this frame. // // The point of the two-stage unwind allows us to entirely avoid any stack // unwinding if there is no handler. If there are just cleanups found, // then we can just panic call an abort function. // // Matching a handler is much more expensive than matching a cleanup, // because we don't need to bother doing type comparisons (or looking at // the type table at all) for a cleanup. This means that there is no need // to cache the result of finding a cleanup, because it's (quite) quick to // look it up again from the action table. if (actions & _UA_SEARCH_PHASE) { struct dwarf_eh_lsda lsda = parse_lsda(context, lsda_addr); if (!dwarf_eh_find_callsite(context, &lsda, &action)) { // EH range not found. This happens if exception is thrown and not // caught inside a cleanup (destructor). We should call // terminate() in this case. The catchTemp (landing pad) field of // exception object will contain null when personality function is // called with _UA_HANDLER_FRAME action for phase 2 unwinding. return _URC_HANDLER_FOUND; } handler_type found_handler = check_action_record(context, &lsda, action.action_record, realEx, &selector, ex->adjustedPtr); // If there's no action record, we've only found a cleanup, so keep // searching for something real if (found_handler == handler_catch) { // Cache the results for the phase 2 unwind, if we found a handler // and this is not a foreign exception. if (ex) { saveLandingPad(context, exceptionObject, ex, selector, action.landing_pad); ex->languageSpecificData = reinterpret_cast(lsda_addr); ex->actionRecord = reinterpret_cast(action.action_record); // ex->adjustedPtr is set when finding the action record. } return _URC_HANDLER_FOUND; } return continueUnwinding(exceptionObject, context); } // If this is a foreign exception, we didn't have anywhere to cache the // lookup stuff, so we need to do it again. If this is either a forced // unwind, a foreign exception, or a cleanup, then we just install the // context for a cleanup. if (!(actions & _UA_HANDLER_FRAME)) { // cleanup struct dwarf_eh_lsda lsda = parse_lsda(context, lsda_addr); dwarf_eh_find_callsite(context, &lsda, &action); if (0 == action.landing_pad) { return continueUnwinding(exceptionObject, context); } handler_type found_handler = check_action_record(context, &lsda, action.action_record, realEx, &selector, ex->adjustedPtr); // Ignore handlers this time. if (found_handler != handler_cleanup) { return continueUnwinding(exceptionObject, context); } pushCleanupException(exceptionObject, ex); } else if (foreignException) { struct dwarf_eh_lsda lsda = parse_lsda(context, lsda_addr); dwarf_eh_find_callsite(context, &lsda, &action); check_action_record(context, &lsda, action.action_record, realEx, &selector, ex->adjustedPtr); } else if (ex->catchTemp == 0) { // Uncaught exception in cleanup, calling terminate std::terminate(); } else { // Restore the saved info if we saved some last time. loadLandingPad(context, exceptionObject, ex, &selector, &action.landing_pad); ex->catchTemp = 0; ex->handlerSwitchValue = 0; } _Unwind_SetIP(context, reinterpret_cast(action.landing_pad)); _Unwind_SetGR(context, __builtin_eh_return_data_regno(0), reinterpret_cast(exceptionObject)); _Unwind_SetGR(context, __builtin_eh_return_data_regno(1), selector); return _URC_INSTALL_CONTEXT; } /** * ABI function called when entering a catch statement. The argument is the * pointer passed out of the personality function. This is always the start of * the _Unwind_Exception object. The return value for this function is the * pointer to the caught exception, which is either the adjusted pointer (for * C++ exceptions) of the unadjusted pointer (for foreign exceptions). */ #if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4) extern "C" void *__cxa_begin_catch(void *e) throw() #else extern "C" void *__cxa_begin_catch(void *e) #endif { // We can't call the fast version here, because if the first exception that // we see is a foreign exception then we won't have called it yet. __cxa_thread_info *ti = thread_info(); __cxa_eh_globals *globals = &ti->globals; globals->uncaughtExceptions--; _Unwind_Exception *exceptionObject = static_cast<_Unwind_Exception*>(e); if (isCXXException(exceptionObject->exception_class)) { __cxa_exception *ex = exceptionFromPointer(exceptionObject); if (ex->handlerCount == 0) { // Add this to the front of the list of exceptions being handled // and increment its handler count so that it won't be deleted // prematurely. ex->nextException = globals->caughtExceptions; globals->caughtExceptions = ex; } if (ex->handlerCount < 0) { // Rethrown exception is catched before end of catch block. // Clear the rethrow flag (make value positive) - we are allowed // to delete this exception at the end of the catch block, as long // as it isn't thrown again later. // Code pattern: // // try { // throw x; // } // catch() { // try { // throw; // } // catch() { // __cxa_begin_catch() <- we are here // } // } ex->handlerCount = -ex->handlerCount + 1; } else { ex->handlerCount++; } ti->foreign_exception_state = __cxa_thread_info::none; return ex->adjustedPtr; } else { // If this is a foreign exception, then we need to be able to // store it. We can't chain foreign exceptions, so we give up // if there are already some outstanding ones. if (globals->caughtExceptions != 0) { std::terminate(); } globals->caughtExceptions = reinterpret_cast<__cxa_exception*>(exceptionObject); ti->foreign_exception_state = __cxa_thread_info::caught; } // exceptionObject is the pointer to the _Unwind_Exception within the // __cxa_exception. The throw object is after this return (reinterpret_cast(exceptionObject) + sizeof(_Unwind_Exception)); } /** * ABI function called when exiting a catch block. This will free the current * exception if it is no longer referenced in other catch blocks. */ extern "C" void __cxa_end_catch() { // We can call the fast version here because the slow version is called in // __cxa_throw(), which must have been called before we end a catch block __cxa_thread_info *ti = thread_info_fast(); __cxa_eh_globals *globals = &ti->globals; __cxa_exception *ex = globals->caughtExceptions; assert(0 != ex && "Ending catch when no exception is on the stack!"); if (ti->foreign_exception_state != __cxa_thread_info::none) { if (ti->foreign_exception_state != __cxa_thread_info::rethrown) { _Unwind_Exception *e = reinterpret_cast<_Unwind_Exception*>(ti->globals.caughtExceptions); if (e->exception_cleanup) e->exception_cleanup(_URC_FOREIGN_EXCEPTION_CAUGHT, e); } globals->caughtExceptions = 0; ti->foreign_exception_state = __cxa_thread_info::none; return; } bool deleteException = true; if (ex->handlerCount < 0) { // exception was rethrown. Exception should not be deleted even if // handlerCount become zero. // Code pattern: // try { // throw x; // } // catch() { // { // throw; // } // cleanup { // __cxa_end_catch(); <- we are here // } // } // ex->handlerCount++; deleteException = false; } else { ex->handlerCount--; } if (ex->handlerCount == 0) { globals->caughtExceptions = ex->nextException; if (deleteException) { releaseException(ex); } } } /** * ABI function. Returns the type of the current exception. */ extern "C" std::type_info *__cxa_current_exception_type() { __cxa_eh_globals *globals = __cxa_get_globals(); __cxa_exception *ex = globals->caughtExceptions; return ex ? ex->exceptionType : 0; } /** * ABI function, called when an exception specification is violated. * * This function does not return. */ extern "C" void __cxa_call_unexpected(void*exception) { _Unwind_Exception *exceptionObject = static_cast<_Unwind_Exception*>(exception); if (exceptionObject->exception_class == exception_class) { __cxa_exception *ex = exceptionFromPointer(exceptionObject); if (ex->unexpectedHandler) { ex->unexpectedHandler(); // Should not be reached. abort(); } } std::unexpected(); // Should not be reached. abort(); } /** * ABI function, returns the adjusted pointer to the exception object. */ extern "C" void *__cxa_get_exception_ptr(void *exceptionObject) { return exceptionFromPointer(exceptionObject)->adjustedPtr; } /** * As an extension, we provide the ability for the unexpected and terminate * handlers to be thread-local. We default to the standards-compliant * behaviour where they are global. */ static bool thread_local_handlers = false; namespace pathscale { /** * Sets whether unexpected and terminate handlers should be thread-local. */ void set_use_thread_local_handlers(bool flag) throw() { thread_local_handlers = flag; } /** * Sets a thread-local unexpected handler. */ unexpected_handler set_unexpected(unexpected_handler f) throw() { static __cxa_thread_info *info = thread_info(); unexpected_handler old = info->unexpectedHandler; info->unexpectedHandler = f; return old; } /** * Sets a thread-local terminate handler. */ terminate_handler set_terminate(terminate_handler f) throw() { static __cxa_thread_info *info = thread_info(); terminate_handler old = info->terminateHandler; info->terminateHandler = f; return old; } } namespace std { /** * Sets the function that will be called when an exception specification is * violated. */ unexpected_handler set_unexpected(unexpected_handler f) throw() { if (thread_local_handlers) { return pathscale::set_unexpected(f); } return ATOMIC_SWAP(&unexpectedHandler, f); } /** * Sets the function that is called to terminate the program. */ terminate_handler set_terminate(terminate_handler f) throw() { if (thread_local_handlers) { return pathscale::set_terminate(f); } return ATOMIC_SWAP(&terminateHandler, f); } /** * Terminates the program, calling a custom terminate implementation if * required. */ void terminate() { static __cxa_thread_info *info = thread_info(); if (0 != info && 0 != info->terminateHandler) { info->terminateHandler(); // Should not be reached - a terminate handler is not expected to // return. abort(); } terminateHandler(); } /** * Called when an unexpected exception is encountered (i.e. an exception * violates an exception specification). This calls abort() unless a * custom handler has been set.. */ void unexpected() { static __cxa_thread_info *info = thread_info(); if (0 != info && 0 != info->unexpectedHandler) { info->unexpectedHandler(); // Should not be reached - a terminate handler is not expected to // return. abort(); } unexpectedHandler(); } /** * Returns whether there are any exceptions currently being thrown that * have not been caught. This can occur inside a nested catch statement. */ bool uncaught_exception() throw() { __cxa_thread_info *info = thread_info(); return info->globals.uncaughtExceptions != 0; } /** * Returns the number of exceptions currently being thrown that have not * been caught. This can occur inside a nested catch statement. */ int uncaught_exceptions() throw() { __cxa_thread_info *info = thread_info(); return info->globals.uncaughtExceptions; } /** * Returns the current unexpected handler. */ unexpected_handler get_unexpected() throw() { __cxa_thread_info *info = thread_info(); if (info->unexpectedHandler) { return info->unexpectedHandler; } return ATOMIC_LOAD(&unexpectedHandler); } /** * Returns the current terminate handler. */ terminate_handler get_terminate() throw() { __cxa_thread_info *info = thread_info(); if (info->terminateHandler) { return info->terminateHandler; } return ATOMIC_LOAD(&terminateHandler); } } #if defined(__arm__) && !defined(__ARM_DWARF_EH__) extern "C" _Unwind_Exception *__cxa_get_cleanup(void) { __cxa_thread_info *info = thread_info_fast(); _Unwind_Exception *exceptionObject = info->currentCleanup; if (isCXXException(exceptionObject->exception_class)) { __cxa_exception *ex = exceptionFromPointer(exceptionObject); ex->cleanupCount--; if (ex->cleanupCount == 0) { info->currentCleanup = ex->nextCleanup; ex->nextCleanup = 0; } } else { info->currentCleanup = 0; } return exceptionObject; } asm ( ".pushsection .text.__cxa_end_cleanup \n" ".global __cxa_end_cleanup \n" ".type __cxa_end_cleanup, \"function\" \n" "__cxa_end_cleanup: \n" " push {r1, r2, r3, r4} \n" " bl __cxa_get_cleanup \n" " push {r1, r2, r3, r4} \n" " b _Unwind_Resume \n" " bl abort \n" ".popsection \n" ); #endif Index: user/alc/PQ_LAUNDRY/contrib/libcxxrt =================================================================== --- user/alc/PQ_LAUNDRY/contrib/libcxxrt (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/libcxxrt (revision 303206) Property changes on: user/alc/PQ_LAUNDRY/contrib/libcxxrt ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/contrib/libcxxrt:r299821-303204 Index: user/alc/PQ_LAUNDRY/contrib/llvm/projects/libunwind/include/__libunwind_config.h =================================================================== --- user/alc/PQ_LAUNDRY/contrib/llvm/projects/libunwind/include/__libunwind_config.h (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/llvm/projects/libunwind/include/__libunwind_config.h (revision 303206) @@ -1,71 +1,71 @@ //===------------------------- __libunwind_config.h -----------------------===// // // The LLVM Compiler Infrastructure // // This file is dual licensed under the MIT and the University of Illinois Open // Source Licenses. See LICENSE.TXT for details. // //===----------------------------------------------------------------------===// #ifndef ____LIBUNWIND_CONFIG_H__ #define ____LIBUNWIND_CONFIG_H__ #if defined(__arm__) && !defined(__USING_SJLJ_EXCEPTIONS__) && \ !defined(__ARM_DWARF_EH__) #define _LIBUNWIND_ARM_EHABI 1 #else #define _LIBUNWIND_ARM_EHABI 0 #endif #if defined(_LIBUNWIND_IS_NATIVE_ONLY) # if defined(__i386__) # define _LIBUNWIND_TARGET_I386 1 # define _LIBUNWIND_CONTEXT_SIZE 8 # define _LIBUNWIND_CURSOR_SIZE 19 # define _LIBUNWIND_MAX_REGISTER 9 # elif defined(__x86_64__) # define _LIBUNWIND_TARGET_X86_64 1 # define _LIBUNWIND_CONTEXT_SIZE 21 # define _LIBUNWIND_CURSOR_SIZE 33 # define _LIBUNWIND_MAX_REGISTER 17 # elif defined(__ppc__) # define _LIBUNWIND_TARGET_PPC 1 # define _LIBUNWIND_CONTEXT_SIZE 117 # define _LIBUNWIND_CURSOR_SIZE 128 # define _LIBUNWIND_MAX_REGISTER 113 # elif defined(__aarch64__) # define _LIBUNWIND_TARGET_AARCH64 1 # define _LIBUNWIND_CONTEXT_SIZE 66 # define _LIBUNWIND_CURSOR_SIZE 78 # define _LIBUNWIND_MAX_REGISTER 96 # elif defined(__arm__) # define _LIBUNWIND_TARGET_ARM 1 # define _LIBUNWIND_CONTEXT_SIZE 60 # define _LIBUNWIND_CURSOR_SIZE 67 # define _LIBUNWIND_MAX_REGISTER 96 # elif defined(__or1k__) # define _LIBUNWIND_TARGET_OR1K 1 # define _LIBUNWIND_CONTEXT_SIZE 16 # define _LIBUNWIND_CURSOR_SIZE 28 # define _LIBUNWIND_MAX_REGISTER 32 # elif defined(__riscv__) # define _LIBUNWIND_TARGET_RISCV 1 -# define _LIBUNWIND_CONTEXT_SIZE 128 /* XXX */ -# define _LIBUNWIND_CURSOR_SIZE 140 /* XXX */ +# define _LIBUNWIND_CONTEXT_SIZE 64 +# define _LIBUNWIND_CURSOR_SIZE 76 # define _LIBUNWIND_MAX_REGISTER 96 # else # error "Unsupported architecture." # endif #else // !_LIBUNWIND_IS_NATIVE_ONLY # define _LIBUNWIND_TARGET_I386 1 # define _LIBUNWIND_TARGET_X86_64 1 # define _LIBUNWIND_TARGET_PPC 1 # define _LIBUNWIND_TARGET_AARCH64 1 # define _LIBUNWIND_TARGET_ARM 1 # define _LIBUNWIND_TARGET_OR1K 1 # define _LIBUNWIND_CONTEXT_SIZE 128 # define _LIBUNWIND_CURSOR_SIZE 140 # define _LIBUNWIND_MAX_REGISTER 120 #endif // _LIBUNWIND_IS_NATIVE_ONLY #endif // ____LIBUNWIND_CONFIG_H__ Index: user/alc/PQ_LAUNDRY/contrib/llvm/projects/libunwind/include/unwind.h =================================================================== --- user/alc/PQ_LAUNDRY/contrib/llvm/projects/libunwind/include/unwind.h (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/llvm/projects/libunwind/include/unwind.h (revision 303206) @@ -1,372 +1,372 @@ //===------------------------------- unwind.h -----------------------------===// // // The LLVM Compiler Infrastructure // // This file is dual licensed under the MIT and the University of Illinois Open // Source Licenses. See LICENSE.TXT for details. // // // C++ ABI Level 1 ABI documented at: // http://mentorembedded.github.io/cxx-abi/abi-eh.html // //===----------------------------------------------------------------------===// #ifndef __UNWIND_H__ #define __UNWIND_H__ #include <__libunwind_config.h> #include #include #if defined(__APPLE__) #define LIBUNWIND_UNAVAIL __attribute__ (( unavailable )) #else #define LIBUNWIND_UNAVAIL #endif typedef enum { _URC_NO_REASON = 0, _URC_OK = 0, _URC_FOREIGN_EXCEPTION_CAUGHT = 1, _URC_FATAL_PHASE2_ERROR = 2, _URC_FATAL_PHASE1_ERROR = 3, _URC_NORMAL_STOP = 4, _URC_END_OF_STACK = 5, _URC_HANDLER_FOUND = 6, _URC_INSTALL_CONTEXT = 7, _URC_CONTINUE_UNWIND = 8, #if _LIBUNWIND_ARM_EHABI _URC_FAILURE = 9 #endif } _Unwind_Reason_Code; typedef enum { _UA_SEARCH_PHASE = 1, _UA_CLEANUP_PHASE = 2, _UA_HANDLER_FRAME = 4, _UA_FORCE_UNWIND = 8, _UA_END_OF_STACK = 16 // gcc extension to C++ ABI } _Unwind_Action; typedef struct _Unwind_Context _Unwind_Context; // opaque #if _LIBUNWIND_ARM_EHABI typedef uint32_t _Unwind_State; static const _Unwind_State _US_VIRTUAL_UNWIND_FRAME = 0; static const _Unwind_State _US_UNWIND_FRAME_STARTING = 1; static const _Unwind_State _US_UNWIND_FRAME_RESUME = 2; /* Undocumented flag for force unwinding. */ static const _Unwind_State _US_FORCE_UNWIND = 8; typedef uint32_t _Unwind_EHT_Header; struct _Unwind_Control_Block; typedef struct _Unwind_Control_Block _Unwind_Control_Block; typedef struct _Unwind_Control_Block _Unwind_Exception; /* Alias */ struct _Unwind_Control_Block { uint64_t exception_class; void (*exception_cleanup)(_Unwind_Reason_Code, _Unwind_Control_Block*); /* Unwinder cache, private fields for the unwinder's use */ struct { uint32_t reserved1; /* init reserved1 to 0, then don't touch */ uint32_t reserved2; uint32_t reserved3; uint32_t reserved4; uint32_t reserved5; } unwinder_cache; /* Propagation barrier cache (valid after phase 1): */ struct { uint32_t sp; uint32_t bitpattern[5]; } barrier_cache; /* Cleanup cache (preserved over cleanup): */ struct { uint32_t bitpattern[4]; } cleanup_cache; /* Pr cache (for pr's benefit): */ struct { uint32_t fnstart; /* function start address */ _Unwind_EHT_Header* ehtp; /* pointer to EHT entry header word */ uint32_t additional; uint32_t reserved1; } pr_cache; long long int :0; /* Enforce the 8-byte alignment */ }; typedef _Unwind_Reason_Code (*_Unwind_Stop_Fn) (_Unwind_State state, _Unwind_Exception* exceptionObject, struct _Unwind_Context* context); typedef _Unwind_Reason_Code (*__personality_routine) (_Unwind_State state, _Unwind_Exception* exceptionObject, struct _Unwind_Context* context); #else struct _Unwind_Context; // opaque struct _Unwind_Exception; // forward declaration typedef struct _Unwind_Exception _Unwind_Exception; struct _Unwind_Exception { uint64_t exception_class; void (*exception_cleanup)(_Unwind_Reason_Code reason, _Unwind_Exception *exc); uintptr_t private_1; // non-zero means forced unwind uintptr_t private_2; // holds sp that phase1 found for phase2 to use #ifndef __LP64__ // The gcc implementation of _Unwind_Exception used attribute mode on the // above fields which had the side effect of causing this whole struct to // round up to 32 bytes in size. To be more explicit, we add pad fields // added for binary compatibility. uint32_t reserved[3]; #endif -}; +} __attribute__((__aligned__)); typedef _Unwind_Reason_Code (*_Unwind_Stop_Fn) (int version, _Unwind_Action actions, uint64_t exceptionClass, _Unwind_Exception* exceptionObject, struct _Unwind_Context* context, void* stop_parameter ); typedef _Unwind_Reason_Code (*__personality_routine) (int version, _Unwind_Action actions, uint64_t exceptionClass, _Unwind_Exception* exceptionObject, struct _Unwind_Context* context); #endif #ifdef __cplusplus extern "C" { #endif // // The following are the base functions documented by the C++ ABI // #ifdef __USING_SJLJ_EXCEPTIONS__ extern _Unwind_Reason_Code _Unwind_SjLj_RaiseException(_Unwind_Exception *exception_object); extern void _Unwind_SjLj_Resume(_Unwind_Exception *exception_object); #else extern _Unwind_Reason_Code _Unwind_RaiseException(_Unwind_Exception *exception_object); extern void _Unwind_Resume(_Unwind_Exception *exception_object); #endif extern void _Unwind_DeleteException(_Unwind_Exception *exception_object); #if _LIBUNWIND_ARM_EHABI typedef enum { _UVRSC_CORE = 0, /* integer register */ _UVRSC_VFP = 1, /* vfp */ _UVRSC_WMMXD = 3, /* Intel WMMX data register */ _UVRSC_WMMXC = 4 /* Intel WMMX control register */ } _Unwind_VRS_RegClass; typedef enum { _UVRSD_UINT32 = 0, _UVRSD_VFPX = 1, _UVRSD_UINT64 = 3, _UVRSD_FLOAT = 4, _UVRSD_DOUBLE = 5 } _Unwind_VRS_DataRepresentation; typedef enum { _UVRSR_OK = 0, _UVRSR_NOT_IMPLEMENTED = 1, _UVRSR_FAILED = 2 } _Unwind_VRS_Result; extern void _Unwind_Complete(_Unwind_Exception* exception_object); extern _Unwind_VRS_Result _Unwind_VRS_Get(_Unwind_Context *context, _Unwind_VRS_RegClass regclass, uint32_t regno, _Unwind_VRS_DataRepresentation representation, void *valuep); extern _Unwind_VRS_Result _Unwind_VRS_Set(_Unwind_Context *context, _Unwind_VRS_RegClass regclass, uint32_t regno, _Unwind_VRS_DataRepresentation representation, void *valuep); extern _Unwind_VRS_Result _Unwind_VRS_Pop(_Unwind_Context *context, _Unwind_VRS_RegClass regclass, uint32_t discriminator, _Unwind_VRS_DataRepresentation representation); #endif #if !_LIBUNWIND_ARM_EHABI extern uintptr_t _Unwind_GetGR(struct _Unwind_Context *context, int index); extern void _Unwind_SetGR(struct _Unwind_Context *context, int index, uintptr_t new_value); extern uintptr_t _Unwind_GetIP(struct _Unwind_Context *context); extern void _Unwind_SetIP(struct _Unwind_Context *, uintptr_t new_value); #else // _LIBUNWIND_ARM_EHABI #if defined(_LIBUNWIND_UNWIND_LEVEL1_EXTERNAL_LINKAGE) #define _LIBUNWIND_EXPORT_UNWIND_LEVEL1 extern #else #define _LIBUNWIND_EXPORT_UNWIND_LEVEL1 static __inline__ #endif // These are de facto helper functions for ARM, which delegate the function // calls to _Unwind_VRS_Get/Set(). These are not a part of ARM EHABI // specification, thus these function MUST be inlined. Please don't replace // these with the "extern" function declaration; otherwise, the program // including this header won't be ABI compatible and will result in // link error when we are linking the program with libgcc. _LIBUNWIND_EXPORT_UNWIND_LEVEL1 uintptr_t _Unwind_GetGR(struct _Unwind_Context *context, int index) { uintptr_t value = 0; _Unwind_VRS_Get(context, _UVRSC_CORE, (uint32_t)index, _UVRSD_UINT32, &value); return value; } _LIBUNWIND_EXPORT_UNWIND_LEVEL1 void _Unwind_SetGR(struct _Unwind_Context *context, int index, uintptr_t value) { _Unwind_VRS_Set(context, _UVRSC_CORE, (uint32_t)index, _UVRSD_UINT32, &value); } _LIBUNWIND_EXPORT_UNWIND_LEVEL1 uintptr_t _Unwind_GetIP(struct _Unwind_Context *context) { // remove the thumb-bit before returning return _Unwind_GetGR(context, 15) & (~(uintptr_t)0x1); } _LIBUNWIND_EXPORT_UNWIND_LEVEL1 void _Unwind_SetIP(struct _Unwind_Context *context, uintptr_t value) { uintptr_t thumb_bit = _Unwind_GetGR(context, 15) & ((uintptr_t)0x1); _Unwind_SetGR(context, 15, value | thumb_bit); } #endif // _LIBUNWIND_ARM_EHABI extern uintptr_t _Unwind_GetRegionStart(struct _Unwind_Context *context); extern uintptr_t _Unwind_GetLanguageSpecificData(struct _Unwind_Context *context); #ifdef __USING_SJLJ_EXCEPTIONS__ extern _Unwind_Reason_Code _Unwind_SjLj_ForcedUnwind(_Unwind_Exception *exception_object, _Unwind_Stop_Fn stop, void *stop_parameter); #else extern _Unwind_Reason_Code _Unwind_ForcedUnwind(_Unwind_Exception *exception_object, _Unwind_Stop_Fn stop, void *stop_parameter); #endif #ifdef __USING_SJLJ_EXCEPTIONS__ typedef struct _Unwind_FunctionContext *_Unwind_FunctionContext_t; extern void _Unwind_SjLj_Register(_Unwind_FunctionContext_t fc); extern void _Unwind_SjLj_Unregister(_Unwind_FunctionContext_t fc); #endif // // The following are semi-suppoted extensions to the C++ ABI // // // called by __cxa_rethrow(). // #ifdef __USING_SJLJ_EXCEPTIONS__ extern _Unwind_Reason_Code _Unwind_SjLj_Resume_or_Rethrow(_Unwind_Exception *exception_object); #else extern _Unwind_Reason_Code _Unwind_Resume_or_Rethrow(_Unwind_Exception *exception_object); #endif // _Unwind_Backtrace() is a gcc extension that walks the stack and calls the // _Unwind_Trace_Fn once per frame until it reaches the bottom of the stack // or the _Unwind_Trace_Fn function returns something other than _URC_NO_REASON. typedef _Unwind_Reason_Code (*_Unwind_Trace_Fn)(struct _Unwind_Context *, void *); extern _Unwind_Reason_Code _Unwind_Backtrace(_Unwind_Trace_Fn, void *); // _Unwind_GetCFA is a gcc extension that can be called from within a // personality handler to get the CFA (stack pointer before call) of // current frame. extern uintptr_t _Unwind_GetCFA(struct _Unwind_Context *); // _Unwind_GetIPInfo is a gcc extension that can be called from within a // personality handler. Similar to _Unwind_GetIP() but also returns in // *ipBefore a non-zero value if the instruction pointer is at or before the // instruction causing the unwind. Normally, in a function call, the IP returned // is the return address which is after the call instruction and may be past the // end of the function containing the call instruction. extern uintptr_t _Unwind_GetIPInfo(struct _Unwind_Context *context, int *ipBefore); // __register_frame() is used with dynamically generated code to register the // FDE for a generated (JIT) code. The FDE must use pc-rel addressing to point // to its function and optional LSDA. // __register_frame() has existed in all versions of Mac OS X, but in 10.4 and // 10.5 it was buggy and did not actually register the FDE with the unwinder. // In 10.6 and later it does register properly. extern void __register_frame(const void *fde); extern void __deregister_frame(const void *fde); // _Unwind_Find_FDE() will locate the FDE if the pc is in some function that has // an associated FDE. Note, Mac OS X 10.6 and later, introduces "compact unwind // info" which the runtime uses in preference to dwarf unwind info. This // function will only work if the target function has an FDE but no compact // unwind info. struct dwarf_eh_bases { uintptr_t tbase; uintptr_t dbase; uintptr_t func; }; extern const void *_Unwind_Find_FDE(const void *pc, struct dwarf_eh_bases *); // This function attempts to find the start (address of first instruction) of // a function given an address inside the function. It only works if the // function has an FDE (dwarf unwind info). // This function is unimplemented on Mac OS X 10.6 and later. Instead, use // _Unwind_Find_FDE() and look at the dwarf_eh_bases.func result. extern void *_Unwind_FindEnclosingFunction(void *pc); // Mac OS X does not support text-rel and data-rel addressing so these functions // are unimplemented extern uintptr_t _Unwind_GetDataRelBase(struct _Unwind_Context *context) LIBUNWIND_UNAVAIL; extern uintptr_t _Unwind_GetTextRelBase(struct _Unwind_Context *context) LIBUNWIND_UNAVAIL; // Mac OS X 10.4 and 10.5 had implementations of these functions in // libgcc_s.dylib, but they never worked. /// These functions are no longer available on Mac OS X. extern void __register_frame_info_bases(const void *fde, void *ob, void *tb, void *db) LIBUNWIND_UNAVAIL; extern void __register_frame_info(const void *fde, void *ob) LIBUNWIND_UNAVAIL; extern void __register_frame_info_table_bases(const void *fde, void *ob, void *tb, void *db) LIBUNWIND_UNAVAIL; extern void __register_frame_info_table(const void *fde, void *ob) LIBUNWIND_UNAVAIL; extern void __register_frame_table(const void *fde) LIBUNWIND_UNAVAIL; extern void *__deregister_frame_info(const void *fde) LIBUNWIND_UNAVAIL; extern void *__deregister_frame_info_bases(const void *fde) LIBUNWIND_UNAVAIL; #ifdef __cplusplus } #endif #endif // __UNWIND_H__ Index: user/alc/PQ_LAUNDRY/contrib/llvm/projects/libunwind =================================================================== --- user/alc/PQ_LAUNDRY/contrib/llvm/projects/libunwind (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/llvm/projects/libunwind (revision 303206) Property changes on: user/alc/PQ_LAUNDRY/contrib/llvm/projects/libunwind ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/contrib/llvm/projects/libunwind:r303053-303204 Index: user/alc/PQ_LAUNDRY/contrib/llvm =================================================================== --- user/alc/PQ_LAUNDRY/contrib/llvm (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/llvm (revision 303206) Property changes on: user/alc/PQ_LAUNDRY/contrib/llvm ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/contrib/llvm:r303053-303204 Index: user/alc/PQ_LAUNDRY/contrib/openresolv/Makefile =================================================================== --- user/alc/PQ_LAUNDRY/contrib/openresolv/Makefile (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/openresolv/Makefile (revision 303206) @@ -1,85 +1,85 @@ PKG= openresolv -VERSION= 3.7.3 # Nasty hack so that make clean works without configure being run _CONFIG_MK!= test -e config.mk && echo config.mk || echo config-null.mk CONFIG_MK?= ${_CONFIG_MK} include ${CONFIG_MK} SBINDIR?= /sbin SYSCONFDIR?= /etc LIBEXECDIR?= /libexec/resolvconf VARDIR?= /var/run/resolvconf -RCDIR?= /etc/rc.d -RESTARTCMD?= if ${RCDIR}/\1 status >/dev/null 2>\&1; then \ - ${RCDIR}/\1 restart; \ - fi INSTALL?= install SED?= sed +VERSION!= ${SED} -n 's/OPENRESOLV_VERSION="\(.*\)".*/\1/p' resolvconf.in + BINMODE?= 0755 DOCMODE?= 0644 MANMODE?= 0444 RESOLVCONF= resolvconf resolvconf.8 resolvconf.conf.5 SUBSCRIBERS= libc dnsmasq named pdnsd unbound TARGET= ${RESOLVCONF} ${SUBSCRIBERS} SRCS= ${TARGET:C,$,.in,} # pmake SRCS:= ${TARGET:=.in} # gmake SED_SBINDIR= -e 's:@SBINDIR@:${SBINDIR}:g' SED_SYSCONFDIR= -e 's:@SYSCONFDIR@:${SYSCONFDIR}:g' SED_LIBEXECDIR= -e 's:@LIBEXECDIR@:${LIBEXECDIR}:g' SED_VARDIR= -e 's:@VARDIR@:${VARDIR}:g' SED_RCDIR= -e 's:@RCDIR@:${RCDIR}:g' -SED_RESTARTCMD= -e 's:@RESTARTCMD \(.*\)@:${RESTARTCMD}:g' +SED_RESTARTCMD= -e 's:@RESTARTCMD@:${RESTARTCMD}:g' +SED_RCDIR= -e 's:@RCDIR@:${RCDIR}:g' +SED_STATUSARG= -e 's:@STATUSARG@:${STATUSARG}:g' DISTPREFIX?= ${PKG}-${VERSION} DISTFILEGZ?= ${DISTPREFIX}.tar.gz DISTFILE?= ${DISTPREFIX}.tar.xz FOSSILID?= current .SUFFIXES: .in all: ${TARGET} -.in: +.in: Makefile ${CONFIG_MK} ${SED} ${SED_SBINDIR} ${SED_SYSCONFDIR} ${SED_LIBEXECDIR} \ - ${SED_VARDIR} ${SED_RCDIR} ${SED_RESTARTCMD} \ + ${SED_VARDIR} \ + ${SED_RCDIR} ${SED_RESTARTCMD} ${SED_RCDIR} ${SED_STATUSARG} \ $< > $@ clean: rm -f ${TARGET} distclean: clean rm -f config.mk ${DISTFILE} installdirs: proginstall: ${TARGET} ${INSTALL} -d ${DESTDIR}${SBINDIR} ${INSTALL} -m ${BINMODE} resolvconf ${DESTDIR}${SBINDIR} ${INSTALL} -d ${DESTDIR}${SYSCONFDIR} test -e ${DESTDIR}${SYSCONFDIR}/resolvconf.conf || \ ${INSTALL} -m ${DOCMODE} resolvconf.conf ${DESTDIR}${SYSCONFDIR} ${INSTALL} -d ${DESTDIR}${LIBEXECDIR} ${INSTALL} -m ${DOCMODE} ${SUBSCRIBERS} ${DESTDIR}${LIBEXECDIR} maninstall: ${INSTALL} -d ${DESTDIR}${MANDIR}/man8 ${INSTALL} -m ${MANMODE} resolvconf.8 ${DESTDIR}${MANDIR}/man8 ${INSTALL} -d ${DESTDIR}${MANDIR}/man5 ${INSTALL} -m ${MANMODE} resolvconf.conf.5 ${DESTDIR}${MANDIR}/man5 install: proginstall maninstall import: rm -rf /tmp/${DISTPREFIX} ${INSTALL} -d /tmp/${DISTPREFIX} cp README ${SRCS} /tmp/${DISTPREFIX} dist: fossil tarball --name ${DISTPREFIX} ${FOSSILID} ${DISTFILEGZ} gunzip -c ${DISTFILEGZ} | xz >${DISTFILE} rm ${DISTFILEGZ} Index: user/alc/PQ_LAUNDRY/contrib/openresolv/configure =================================================================== --- user/alc/PQ_LAUNDRY/contrib/openresolv/configure (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/openresolv/configure (revision 303206) @@ -1,225 +1,152 @@ #!/bin/sh # Try and be like autotools configure, but without autotools # Ensure that we do not inherit these from env OS= BUILD= HOST= TARGET= RESTARTCMD= RCDIR= +STATUSARG= for x do opt=${x%%=*} var=${x#*=} case "$opt" in --os|OS) OS=$var;; --with-cc|CC) CC=$var;; --debug) DEBUG=$var;; --disable-debug) DEBUG=no;; --enable-debug) DEBUG=yes;; --prefix) PREFIX=$var;; --sysconfdir) SYSCONFDIR=$var;; --bindir|--sbindir) SBINDIR=$var;; --libexecdir) LIBEXECDIR=$var;; --statedir|--localstatedir) STATEDIR=$var;; --dbdir) DBDIR=$var;; --rundir) RUNDIR=$var;; --mandir) MANDIR=$var;; --with-ccopts|CFLAGS) CFLAGS=$var;; CPPFLAGS) CPPFLAGS=$var;; --build) BUILD=$var;; --host) HOST=$var;; --target) TARGET=$var;; --libdir) LIBDIR=$var;; --restartcmd) RESTARTCMD=$var;; + --rcdir) RCDIR=$var;; + --statusarg) STATUSARG=$var;; --includedir) eval INCLUDEDIR="$INCLUDEDIR${INCLUDEDIR:+ }$var";; --datadir|--infodir) ;; # ignore autotools --disable-maintainer-mode|--disable-dependency-tracking) ;; --help) echo "See the README file for available options"; exit 0;; *) echo "$0: WARNING: unknown option $opt" >&2;; esac done if [ -z "$LIBEXECDIR" ]; then printf "Checking for directory /libexec ... " if [ -d /libexec ]; then echo "yes" LIBEXECDIR=$PREFIX/libexec/resolvconf else echo "no" LIBEXECDIR=$PREFIX/lib/resolvconf fi fi if [ -z "$RUNDIR" ]; then printf "Checking for directory /run ... " if [ -d /run ]; then echo "yes" RUNDIR=/run else echo "no" RUNDIR=/var/run fi fi : ${SED:=sed} : ${SYSCONFDIR:=$PREFIX/etc} : ${SBINDIR:=$PREFIX/sbin} : ${LIBEXECDIR:=$PREFIX/libexec/resolvconf} : ${STATEDIR:=/var} : ${RUNDIR:=$STATEDIR/run} : ${MANDIR:=${PREFIX:-/usr}/share/man} eval SYSCONFDIR="$SYSCONFDIR" eval SBINDIR="$SBINDIR" eval LIBEXECDIR="$LIBEXECDIR" eval VARDIR="$RUNDIR/resolvconf" eval MANDIR="$MANDIR" CONFIG_MK=config.mk if [ -z "$BUILD" ]; then # autoconf target triplet: cpu-vendor-os BUILD=$(uname -m)-unknown-$(uname -s | tr '[:upper:]' '[:lower:]') fi : ${HOST:=$BUILD} if [ -z "$OS" ]; then echo "Deriving operating system from ... $HOST" # Derive OS from cpu-vendor-[kernel-]os CPU=${HOST%%-*} REST=${HOST#*-} if [ "$CPU" != "$REST" ]; then VENDOR=${REST%%-*} REST=${REST#*-} if [ "$VENDOR" != "$REST" ]; then # Use kernel if given, otherwise os OS=${REST%%-*} else # 2 tupple OS=$VENDOR VENDOR= fi fi # Work with cpu-kernel-os, ie Debian case "$VENDOR" in linux*|kfreebsd*) OS=$VENDOR; VENDOR= ;; esac # Special case case "$OS" in gnu*) OS=hurd;; # No HURD support as yet esac fi echo "Configuring openresolv for ... $OS" rm -rf $CONFIG_MK echo "# $OS" >$CONFIG_MK -for x in SYSCONFDIR SBINDIR LIBEXECDIR VARDIR MANDIR; do +# On FreeBSD, /etc/init.d/foo status returns 0 if foo is not enabled +# regardless of if it's not running. +# So we force onestatus to work around this silly bug. +if [ -z "$STATUSARG" ]; then + case "$OS" in + freebsd*) STATUSARG="onestatus";; + esac +fi + +for x in SYSCONFDIR SBINDIR LIBEXECDIR VARDIR MANDIR RESTARTCMD RCDIR STATUSARG +do eval v=\$$x # Make files look nice for import l=$((10 - ${#x})) unset t [ $l -gt 3 ] && t=" " echo "$x=$t $v" >>$CONFIG_MK done -if [ -z "$RESTARTCMD" ]; then - printf "Checking for systemd ... " - if [ -x /bin/systemctl ]; then - RESTARTCMD="/bin/systemctl try-restart \1" - echo "yes" - elif [ -x /usr/bin/systemctl ]; then - RESTARTCMD="/usr/bin/systemctl try-restart \1" - echo "yes" - else - echo "no" - fi -fi - -# Arch upgraded to systemd, so this check has to be just after systemd -# but higher than the others -if [ -z "$RESTARTCMD" ]; then - printf "Checking for Arch ... " - if [ -e /etc/arch-release -a -d /etc/rc.d ]; then - RCDIR=/etc/rc.d - RESTARTCMD="[ -e /var/run/daemons/\1 ] \&\& /etc/rc.d/\1 restart" - echo "yes" - else - echo "no" - fi -fi - -if [ -z "$RESTARTCMD" ]; then - printf "Checking for OpenRC ... " - if [ -x /sbin/rc-service ]; then - RESTARTCMD="if /sbin/rc-service -e \1; then /sbin/rc-service \1 -- -Ds restart; fi" - echo "yes" - else - echo "no" - fi -fi -if [ -z "$RESTARTCMD" ]; then - printf "Checking for invoke-rc.d ... " - if [ -x /usr/sbin/invoke-rc.d ]; then - RCDIR=/etc/init.d - RESTARTCMD="if /usr/sbin/invoke-rc.d --quiet \1 status >/dev/null 2>\&1; then /usr/sbin/invoke-rc.d \1 restart; fi" - echo "yes" - else - echo "no" - fi -fi -if [ -z "$RESTARTCMD" ]; then - printf "Checking for service ... " - if [ -x /sbin/service ]; then - RCDIR=/etc/init.d - RESTARTCMD="if /sbin/service \1; then /sbin/service \1 restart; fi" - echo "yes" - else - echo "no" - fi -fi -if [ -z "$RESTARTCMD" ]; then - printf "Checking for runit... " - if [ -x /bin/sv ]; then - RESTARTCMD="/bin/sv try-restart \1" - echo "yes" - elif [ -x /usr/bin/sv ]; then - RESTARTCMD="/usr/bin/sv try-restart \1" - echo "yes" - else - echo "no" - fi -fi -if [ -z "$RESTARTCMD" ]; then - for x in /etc/init.d/rc.d /etc/rc.d /etc/init.d; do - printf "Checking for $x ... " - if [ -d $x ]; then - RCDIR=$x - RESTARTCMD="if $x/\1 status >/dev/null 2>\&1; then $x/\1 restart; fi" - echo "yes" - break - else - echo "no" - fi - done -fi - -if [ -z "$RESTARTCMD" ]; then - echo "$0: WARNING: No means of interacting with system services detected!" - exit 1 -fi - -echo "RCDIR= $RCDIR" >>$CONFIG_MK -# Work around bug in the dash shell as "echo 'foo \1'" does bad things -printf "%s\n" "RESTARTCMD= $RESTARTCMD" >>$CONFIG_MK - echo echo " SYSCONFDIR = $SYSCONFDIR" echo " SBINDIR = $SBINDIR" echo " LIBEXECDIR = $LIBEXECDIR" echo " VARDIR = $RUNDIR" echo " MANDIR = $MANDIR" +echo +echo " RESTARTCMD = $RESTARTCMD" +echo " RCDIR = $RCDIR" +echo " STATUSARG = $STATUSARG" echo Index: user/alc/PQ_LAUNDRY/contrib/openresolv/dnsmasq.in =================================================================== --- user/alc/PQ_LAUNDRY/contrib/openresolv/dnsmasq.in (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/openresolv/dnsmasq.in (revision 303206) @@ -1,202 +1,209 @@ #!/bin/sh -# Copyright (c) 2007-2012 Roy Marples +# Copyright (c) 2007-2016 Roy Marples # All rights reserved # dnsmasq subscriber for resolvconf # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above # copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials provided # with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS # "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT # LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR # A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT # OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, # SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT # LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. [ -f "@SYSCONFDIR@"/resolvconf.conf ] || exit 0 . "@SYSCONFDIR@/resolvconf.conf" || exit 1 [ -z "$dnsmasq_conf" -a -z "$dnsmasq_resolv" ] && exit 0 [ -z "$RESOLVCONF" ] && eval "$(@SBINDIR@/resolvconf -v)" NL=" " : ${dnsmasq_pid:=/var/run/dnsmasq.pid} [ -s "$dnsmasq_pid" ] || dnsmasq_pid=/var/run/dnsmasq/dnsmasq.pid [ -s "$dnsmasq_pid" ] || unset dnsmasq_pid : ${dnsmasq_service:=dnsmasq} -: ${dnsmasq_restart:=@RESTARTCMD ${dnsmasq_service}@} newconf="# Generated by resolvconf$NL" newresolv="$newconf" # Using dbus means that we never have to restart the daemon # This is important as it means we should not drop DNS queries # whilst changing DNS options around. However, dbus support is optional # so we need to validate a few things first. # Check for DBus support in the binary dbus=false dbus_ex=false dbus_introspect=$(dbus-send --print-reply --system \ --dest=uk.org.thekelleys.dnsmasq \ /uk/org/thekelleys/dnsmasq \ org.freedesktop.DBus.Introspectable.Introspect \ 2>/dev/null) if [ $? = 0 ]; then dbus=true if printf %s "$dbus_introspect" | \ grep -q '' then dbus_ex=true fi fi for n in $NAMESERVERS; do newresolv="${newresolv}nameserver $n$NL" done dbusdest= dbusdest_ex= conf= for d in $DOMAINS; do dn="${d%%:*}" ns="${d#*:}" while [ -n "$ns" ]; do n="${ns%%,*}" if $dbus && ! $dbus_ex; then case "$n" in *.*.*.*) SIFS=${IFS-y} OIFS=$IFS IFS=. set -- $n num="0x$(printf %02x $1 $2 $3 $4)" if [ "$SIFS" = y ]; then unset IFS else IFS=$OIFS fi dbusdest="$dbusdest uint32:$(printf %u $num)" dbusdest="$dbusdest string:$dn" ;; *:*%*) # This version of dnsmasq won't accept # scoped IPv6 addresses dbus=false ;; *:*) SIFS=${IFS-y} OIFS=$IFS bytes= front= back= empty=false i=0 IFS=: set -- $n while [ -n "$1" -o -n "$2" ]; do addr="$1" shift if [ -z "$addr" ]; then empty=true continue fi i=$(($i + 1)) while [ ${#addr} -lt 4 ]; do addr="0${addr}" done byte1="$(printf %d 0x${addr%??})" byte2="$(printf %d 0x${addr#??})" if $empty; then back="$back byte:$byte1 byte:$byte2" else front="$front byte:$byte1 byte:$byte2" fi done while [ $i != 8 ]; do i=$(($i + 1)) front="$front byte:0 byte:0" done front="${front}$back" if [ "$SIFS" = y ]; then unset IFS else IFS=$OIFS fi dbusdest="${dbusdest}$front string:$dn" ;; *) if ! $dbus_ex; then dbus=false fi ;; esac fi dbusdest_ex="$dbusdest_ex${dbusdest_ex:+,}/$dn/$n" conf="${conf}server=/$dn/$n$NL" [ "$ns" = "${ns#*,}" ] && break ns="${ns#*,}" done done if $dbus; then newconf="$newconf$NL# Domain specific servers will" newconf="$newconf be sent over dbus${NL}" else newconf="$newconf$conf" fi # Try to ensure that config dirs exist if type config_mkdirs >/dev/null 2>&1; then config_mkdirs "$dnsmasq_conf" "$dnsmasq_resolv" else @SBINDIR@/resolvconf -D "$dnsmasq_conf" "$dnsmasq_resolv" fi changed=false if [ -n "$dnsmasq_conf" ]; then if [ ! -f "$dnsmasq_conf" ] || \ [ "$(cat "$dnsmasq_conf")" != "$(printf %s "$newconf")" ] then changed=true printf %s "$newconf" >"$dnsmasq_conf" fi fi if [ -n "$dnsmasq_resolv" ]; then # dnsmasq polls this file so no need to set changed=true if [ -f "$dnsmasq_resolv" ]; then if [ "$(cat "$dnsmasq_resolv")" != "$(printf %s "$newresolv")" ] then printf %s "$newresolv" >"$dnsmasq_resolv" fi else printf %s "$newresolv" >"$dnsmasq_resolv" fi fi if $changed; then - eval $dnsmasq_restart + # dnsmasq does not re-read the configuration file on SIGHUP + if [ -n "$dnsmasq_restart" ]; then + eval $dnsmasq_restart + elif [ -n "$RESTARTCMD" ]; then + set -- ${dnsmasq_service} + eval $RESTARTCMD + else + @SBINDIR@/resolvconf -r ${dnsmasq_service} + fi fi if $dbus; then if [ -s "$dnsmasq_pid" ]; then $changed || kill -HUP $(cat "$dnsmasq_pid") fi # Send even if empty so old servers are cleared if $dbus_ex; then method=SetDomainServers if [ -n "$dbusdest_ex" ]; then dbusdest_ex="array:string:$dbusdest_ex" fi dbusdest="$dbusdest_ex" else method=SetServers fi dbus-send --system --dest=uk.org.thekelleys.dnsmasq \ /uk/org/thekelleys/dnsmasq uk.org.thekelleys.$method \ $dbusdest fi Index: user/alc/PQ_LAUNDRY/contrib/openresolv/libc.in =================================================================== --- user/alc/PQ_LAUNDRY/contrib/openresolv/libc.in (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/openresolv/libc.in (revision 303206) @@ -1,246 +1,252 @@ #!/bin/sh -# Copyright (c) 2007-2014 Roy Marples +# Copyright (c) 2007-2016 Roy Marples # All rights reserved # libc subscriber for resolvconf # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above # copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials provided # with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS # "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT # LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR # A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT # OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, # SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT # LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. SYSCONFDIR=@SYSCONFDIR@ LIBEXECDIR=@LIBEXECDIR@ VARDIR=@VARDIR@ IFACEDIR="$VARDIR/interfaces" NL=" " # sed may not be available, and this is faster on small files key_get_value() { local key="$1" x= line= shift if [ $# -eq 0 ]; then while read -r line; do case "$line" in "$key"*) echo "${line##$key}";; esac done else for x do while read -r line; do case "$line" in "$key"*) echo "${line##$key}";; esac done < "$x" done fi } keys_remove() { local key x line found while read -r line; do found=false for key do case "$line" in "$key"*|"#"*|" "*|" "*|"") found=true;; esac $found && break done $found || echo "$line" done } local_nameservers="127.* 0.0.0.0 255.255.255.255 ::1" # Support original resolvconf configuration layout # as well as the openresolv config file if [ -f "$SYSCONFDIR"/resolvconf.conf ]; then . "$SYSCONFDIR"/resolvconf.conf elif [ -d "$SYSCONFDIR"/resolvconf ]; then SYSCONFDIR="$SYSCONFDIR/resolvconf/resolv.conf.d" base="$SYSCONFDIR/resolv.conf.d/base" if [ -f "$base" ]; then prepend_nameservers="$(key_get_value "nameserver " "$base")" domain="$(key_get_value "domain " "$base")" prepend_search="$(key_get_value "search " "$base")" resolv_conf_options="$(key_get_value "options " "$base")" resolv_conf_sortlist="$(key_get_value "sortlist " "$base")" fi if [ -f "$SYSCONFDIR"/resolv.conf.d/head ]; then resolv_conf_head="$(cat "${SYSCONFDIR}"/resolv.conf.d/head)" fi if [ -f "$SYSCONFDIR"/resolv.conf.d/tail ]; then resolv_conf_tail="$(cat "$SYSCONFDIR"/resolv.conf.d/tail)" fi fi : ${resolv_conf:=/etc/resolv.conf} : ${libc_service:=nscd} -: ${libc_restart:=@RESTARTCMD ${libc_service}@} : ${list_resolv:=@SBINDIR@/resolvconf -l} if [ "${resolv_conf_head-x}" = x -a -f "$SYSCONFDIR"/resolv.conf.head ]; then resolv_conf_head="$(cat "${SYSCONFDIR}"/resolv.conf.head)" fi if [ "${resolv_conf_tail-x}" = x -a -f "$SYSCONFDIR"/resolv.conf.tail ]; then resolv_conf_tail="$(cat "$SYSCONFDIR"/resolv.conf.tail)" fi backup=true signature="# Generated by resolvconf" uniqify() { local result= while [ -n "$1" ]; do case " $result " in *" $1 "*);; *) result="$result $1";; esac shift done echo "${result# *}" } case "${resolv_conf_passthrough:-NO}" in [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1) backup=false newest= for conf in "$IFACEDIR"/*; do if [ -z "$newest" -o "$conf" -nt "$newest" ]; then newest="$conf" fi done [ -z "$newest" ] && exit 0 newconf="$(cat "$newest")$NL" ;; /dev/null|[Nn][Uu][Ll][Ll]) : ${resolv_conf_local_only:=NO} if [ "$local_nameservers" = "127.* 0.0.0.0 255.255.255.255 ::1" ]; then local_nameservers= fi # Need to overwrite our variables. eval "$(@SBINDIR@/resolvconf -V)" ;; *) [ -z "$RESOLVCONF" ] && eval "$(@SBINDIR@/resolvconf -v)" ;; esac case "${resolv_conf_passthrough:-NO}" in [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1) ;; *) : ${domain:=$DOMAIN} newsearch="$(uniqify $prepend_search $SEARCH $append_search)" NS="$LOCALNAMESERVERS $NAMESERVERS" newns= gotlocal=false for n in $(uniqify $prepend_nameservers $NS $append_nameservers); do add=true islocal=false for l in $local_nameservers; do case "$n" in $l) islocal=true; gotlocal=true; break;; esac done if ! $islocal; then case "${resolv_conf_local_only:-YES}" in [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1) $gotlocal && add=false;; esac fi $add && newns="$newns $n" done # Hold our new resolv.conf in a variable to save on temporary files newconf="$signature$NL" if [ -n "$resolv_conf_head" ]; then newconf="$newconf$resolv_conf_head$NL" fi [ -n "$domain" ] && newconf="${newconf}domain $domain$NL" if [ -n "$newsearch" -a "$newsearch" != "$domain" ]; then newconf="${newconf}search $newsearch$NL" fi for n in $newns; do newconf="${newconf}nameserver $n$NL" done # Now add anything we don't care about such as sortlist and options stuff="$($list_resolv | keys_remove nameserver domain search)" if [ -n "$stuff" ]; then newconf="$newconf$stuff$NL" fi # Append any user defined ones if [ -n "$resolv_conf_options" ]; then newconf="${newconf}options $resolv_conf_options$NL" fi if [ -n "$resolv_conf_sortlist" ]; then newconf="${newconf}sortlist $resolv_conf_sortlist$NL" fi if [ -n "$resolv_conf_tail" ]; then newconf="$newconf$resolv_conf_tail$NL" fi ;; esac # Check if the file has actually changed or not if [ -e "$resolv_conf" ]; then [ "$(cat "$resolv_conf")" = "$(printf %s "$newconf")" ] && exit 0 fi # Change is good. # If the old file does not have our signature, back it up. # If the new file just has our signature, restore the backup. if $backup; then if [ "$newconf" = "$signature$NL" ]; then if [ -e "$resolv_conf.bak" ]; then newconf="$(cat "$resolv_conf.bak")" fi elif [ -e "$resolv_conf" ]; then read line <"$resolv_conf" if [ "$line" != "$signature" ]; then cp "$resolv_conf" "$resolv_conf.bak" fi fi fi # Create our resolv.conf now (umask 022; echo "$newconf" >"$resolv_conf") -eval $libc_restart +if [ -n "$libc_restart" ]; then + eval $libc_restart +elif [ -n "$RESTARTCMD" ]; then + set -- ${libc_service} + eval $RESTARTCMD +else + @SBINDIR@/resolvconf -r ${libc_service} +fi retval=0 # Notify users of the resolver for script in "$LIBEXECDIR"/libc.d/*; do if [ -f "$script" ]; then if [ -x "$script" ]; then "$script" "$@" else (. "$script") fi retval=$(($retval + $?)) fi done exit $retval Index: user/alc/PQ_LAUNDRY/contrib/openresolv/named.in =================================================================== --- user/alc/PQ_LAUNDRY/contrib/openresolv/named.in (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/openresolv/named.in (revision 303206) @@ -1,106 +1,118 @@ #!/bin/sh -# Copyright (c) 2007-2012 Roy Marples +# Copyright (c) 2007-2016 Roy Marples # All rights reserved # named subscriber for resolvconf # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above # copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials provided # with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS # "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT # LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR # A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT # OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, # SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT # LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. [ -f "@SYSCONFDIR@"/resolvconf.conf ] || exit 0 . "@SYSCONFDIR@/resolvconf.conf" || exit 1 [ -z "$named_zones" -a -z "$named_options" ] && exit 0 [ -z "$RESOLVCONF" ] && eval "$(@SBINDIR@/resolvconf -v)" NL=" " # Platform specific kludges if [ -z "$named_service" -a -z "$named_restart" -a \ - -d "@RCDIR@" -a ! -x "@RCDIR@"/named ] + -d "$RCDIR" -a ! -x "$RCDIR"/named ] then - if [ -x "@RCDIR@"/bind9 ]; then + if [ -x "$RCDIR"/bind9 ]; then # Debian and derivatives named_service=bind9 - elif [ -x "@RCDIR@"/rc.bind ]; then + elif [ -x "$RCDIR"/rc.bind ]; then # Slackware named_service=rc.bind fi fi : ${named_service:=named} -: ${named_restart:=@RESTARTCMD ${named_service}@} + +: ${named_pid:=/var/run/$named_service.pid} +[ -s "$named_pid" ] || named_pid=/var/run/$named_service/$named_service.pid +[ -s "$named_pid" ] || unset named_pid + newoptions="# Generated by resolvconf$NL" newzones="$newoptions" forward= for n in $NAMESERVERS; do case "$forward" in *"$NL $n;"*);; *) forward="$forward$NL $n;";; esac done if [ -n "$forward" ]; then newoptions="${newoptions}forward first;${NL}forwarders {$forward${NL}};$NL" fi for d in $DOMAINS; do newzones="${newzones}zone \"${d%%:*}\" {$NL" newzones="$newzones type forward;$NL" newzones="$newzones forward first;$NL forwarders {$NL" ns="${d#*:}" while [ -n "$ns" ]; do newzones="$newzones ${ns%%,*};$NL" [ "$ns" = "${ns#*,}" ] && break ns="${ns#*,}" done newzones="$newzones };$NL};$NL" done # Try to ensure that config dirs exist if type config_mkdirs >/dev/null 2>&1; then config_mkdirs "$named_options" "$named_zones" else @SBINDIR@/resolvconf -D "$named_options" "$named_zones" fi # No point in changing files or reloading bind if the end result has not # changed changed=false if [ -n "$named_options" ]; then if [ ! -f "$named_options" ] || \ [ "$(cat "$named_options")" != "$(printf %s "$newoptions")" ] then printf %s "$newoptions" >"$named_options" changed=true fi fi if [ -n "$named_zones" ]; then if [ ! -f "$named_zones" ] || \ [ "$(cat "$named_zones")" != "$(printf %s "$newzones")" ] then printf %s "$newzones" >"$named_zones" changed=true fi fi +# named does not seem to work with SIGHUP which is a same if $changed; then - eval $named_restart + if [ -n "$named_restart" ]; then + eval $named_restart + elif [ -n "$RESTARTCMD" ]; then + set -- ${named_service} + eval $RESTARTCMD + else + @SBINDIR@/resolvconf -r ${named_service} + fi fi Index: user/alc/PQ_LAUNDRY/contrib/openresolv/resolvconf.8.in =================================================================== --- user/alc/PQ_LAUNDRY/contrib/openresolv/resolvconf.8.in (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/openresolv/resolvconf.8.in (revision 303206) @@ -1,306 +1,318 @@ -.\" Copyright (c) 2007-2015 Roy Marples +.\" Copyright (c) 2007-2016 Roy Marples .\" All rights reserved .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" -.Dd April 27, 2015 +.Dd May 7, 2016 .Dt RESOLVCONF 8 .Os .Sh NAME .Nm resolvconf .Nd a framework for managing multiple DNS configurations .Sh SYNOPSIS .Nm .Fl I .Nm .Op Fl m Ar metric .Op Fl p .Op Fl x .Fl a Ar interface Ns Op Ar .protocol .No < Ns Pa file .Nm .Op Fl f .Fl d Ar interface Ns Op Ar .protocol .Nm .Op Fl x .Fl il Ar pattern .Nm .Fl u .Sh DESCRIPTION .Nm manages .Xr resolv.conf 5 files from multiple sources, such as DHCP and VPN clients. Traditionally, the host runs just one client and that updates .Pa /etc/resolv.conf . More modern systems frequently have wired and wireless interfaces and there is no guarantee both are on the same network. With the advent of VPN and other types of networking daemons, many things now contend for the contents of .Pa /etc/resolv.conf . .Pp .Nm solves this by letting the daemon send their .Xr resolv.conf 5 file to .Nm via -.Xr stdin 3 +.Xr stdin 4 with the argument .Fl a Ar interface Ns Op Ar .protocol instead of the filesystem. .Nm then updates .Pa /etc/resolv.conf as it thinks best. When a local resolver other than libc is installed, such as .Xr dnsmasq 8 or .Xr named 8 , then .Nm will supply files that the resolver should be configured to include. .Pp .Nm assumes it has a job to do. In some situations .Nm needs to act as a deterrent to writing to .Pa /etc/resolv.conf . Where this file cannot be made immutable or you just need to toggle this behaviour, .Nm can be disabled by adding .Sy resolvconf Ns = Ns NO to .Xr resolvconf.conf 5 . .Pp .Nm can mark an interfaces .Pa resolv.conf as private. This means that the name servers listed in that .Pa resolv.conf are only used for queries against the domain/search listed in the same file. -This only works when a local resolver other than libc is installed. +This only works when a local resolver other than libc is installed. See .Xr resolvconf.conf 5 for how to configure .Nm to use a local name server. .Pp .Nm can mark an interfaces .Pa resolv.conf as exclusive. Only the latest exclusive interface is used for processing, otherwise all are. .Pp When an interface goes down, it should then call .Nm with .Fl d Ar interface.* arguments to delete the .Pa resolv.conf file(s) for all the .Ar protocols on the .Ar interface . .Pp -Here are some more options that -.Nm -has:- +Here are some options for the above commands:- .Bl -tag -width indent -.It Fl I -Initialise the state directory -.Pa @VARDIR@ . -This only needs to be called if the initial system boot sequence does not -automatically clean it out; for example the state directory is moved -somewhere other than -.Pa /var/run . -If used, it should only be called once as early in the system boot sequence -as possible and before -.Nm -is used to add interfaces. .It Fl f -Ignore non existant interfaces. +Ignore non existent interfaces. Only really useful for deleting interfaces. +.It Fl m Ar metric +Set the metric of the interface when adding it, default of 0. +Lower metrics take precedence. +This affects the default order of interfaces when listed. +.It Fl p +Marks the interface +.Pa resolv.conf +as private. +.It Fl x +Mark the interface +.Pa resolv.conf +as exclusive when adding, otherwise only use the latest exclusive interface. +.El +.Pp +.Nm +has some more commands for general usage:- +.Bl -tag -width indent .It Fl i Ar pattern List the interfaces and protocols, optionally matching .Ar pattern , we have .Pa resolv.conf files for. .It Fl l Ar pattern List the .Pa resolv.conf files we have. If .Ar pattern is specified then we list the files for the interfaces and protocols that match it. -.It Fl m Ar metric -Set the metric of the interface when adding it, default of 0. -Lower metrics take precedence. -This affects the default order of interfaces when listed. -.It Fl p -Marks the interface -.Pa resolv.conf -as private. .It Fl u Force .Nm to update all its subscribers. .Nm does not update the subscribers when adding a resolv.conf that matches what it already has for that interface. -.It Fl x -Mark the interface -.Pa resolv.conf -as exclusive when adding, otherwise only use the latest exclusive interface. .El .Pp .Nm -also has some options designed to be used by its subscribers:- +also has some commands designed to be used by it's subscribers and +system startup:- .Bl -tag -width indent +.It Fl I +Initialise the state directory +.Pa @VARDIR@ . +This only needs to be called if the initial system boot sequence does not +automatically clean it out; for example the state directory is moved +somewhere other than +.Pa /var/run . +If used, it should only be called once as early in the system boot sequence +as possible and before +.Nm +is used to add interfaces. +.It Fl R +Echo the command used to restart a service. +.It Fl r Ar service +If the +.Ar service +is running then restart it. +If the service does not exist or is not running then zero is returned, +otherwise the result of restarting the service. .It Fl v Echo variables DOMAINS, SEARCH and NAMESERVERS so that the subscriber can configure the resolver easily. .It Fl V Same as .Fl v except that only the information configured in .Xr resolvconf.conf 5 is set. .El .Sh INTERFACE ORDERING For .Nm to work effectively, it has to process the resolv.confs for the interfaces in the correct order. .Nm first processes interfaces from the .Sy interface_order list, then interfaces without a metic and that match the .Sy dynamic_order list, then interfaces with a metric in order and finally the rest in the operating systems lexical order. See .Xr resolvconf.conf 5 for details on these lists. .Sh PROTOCOLS Here are some suggested protocol tags to use for each .Pa resolv.conf file registered on an .Ar interface Ns No :- .Bl -tag -width indent .It dhcp Dynamic Host Configuration Protocol. Initial versions of .Nm did not recommend a .Ar protocol tag be appended to the .Ar interface name. When the protocol is absent, it is assumed to be the DHCP protocol. .It ppp Point-to-Point Protocol. .It ra IPv6 Router Advertisement. .It dhcp6 Dynamic Host Configuration Protocol, version 6. .El .Sh IMPLEMENTATION NOTES If a subscriber has the executable bit then it is executed otherwise it is assumed to be a shell script and sourced into the current environment in a subshell. This is done so that subscribers can remain fast, but are also not limited to the shell language. .Pp Portable subscribers should not use anything outside of .Pa /bin and .Pa /sbin because .Pa /usr and others may not be available when booting. Also, it would be unwise to assume any shell specific features. .Sh ENVIRONMENT .Bl -ohang .It Va IF_METRIC If the .Fl m option is not present then we use .Va IF_METRIC for the metric. .It Va IF_PRIVATE Marks the interface .Pa resolv.conf as private. .It Va IF_EXCLUSIVE Marks the interface .Pa resolv.conf as exclusive. .El .Sh FILES .Bl -ohang .It Pa /etc/resolv.conf.bak Backup file of the original resolv.conf. .It Pa @SYSCONFDIR@/resolvconf.conf Configuration file for .Nm . .It Pa @LIBEXECDIR@ Directory of subscribers which are run every time .Nm adds, deletes or updates. .It Pa @LIBEXECDIR@/libc.d Directory of subscribers which are run after the libc subscriber is run. .It Pa @VARDIR@ State directory for .Nm . .El +.Sh SEE ALSO +.Xr resolver 3 , +.Xr stdin 4 , +.Xr resolv.conf 5 , +.Xr resolvconf.conf 5 .Sh HISTORY This implementation of .Nm is called openresolv and is fully command line compatible with Debian's resolvconf, as written by Thomas Hood. -.Sh SEE ALSO -.Xr resolv.conf 5 , -.Xr resolvconf.conf 5 , -.Xr resolver 3 , -.Xr stdin 3 .Sh AUTHORS .An Roy Marples Aq Mt roy@marples.name .Sh BUGS Please report them to .Lk http://roy.marples.name/projects/openresolv .Pp .Nm does not validate any of the files given to it. .Pp When running a local resolver other than libc, you will need to configure it to include files that .Nm will generate. You should consult .Xr resolvconf.conf 5 for instructions on how to configure your resolver. Index: user/alc/PQ_LAUNDRY/contrib/openresolv/resolvconf.conf.5.in =================================================================== --- user/alc/PQ_LAUNDRY/contrib/openresolv/resolvconf.conf.5.in (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/openresolv/resolvconf.conf.5.in (revision 303206) @@ -1,321 +1,321 @@ .\" Copyright (c) 2009-2016 Roy Marples .\" All rights reserved .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" -.Dd February 21, 2016 +.Dd April 28, 2016 .Dt RESOLVCONF.CONF 5 .Os .Sh NAME .Nm resolvconf.conf .Nd resolvconf configuration file .Sh DESCRIPTION .Nm is the configuration file for .Xr resolvconf 8 . The .Nm file is a shell script that is sourced by .Xr resolvconf 8 , meaning that .Nm must contain valid shell commands. Listed below are the standard .Nm variables that may be set. If the values contain whitespace, wildcards or other special shell characters, ensure they are quoted and escaped correctly. See the .Sy replace variable for an example on quoting. .Pp After updating this file, you may wish to run .Nm resolvconf -u to apply the new configuration. .Pp When a dynamically generated list is appended or prepended to, the whole is made unique where left-most wins. .Sh RESOLVCONF OPTIONS .Bl -tag -width indent .It Sy resolvconf Set to NO to disable .Nm resolvconf from running any subscribers. Defaults to YES. .It Sy interface_order These interfaces will always be processed first. If unset, defaults to the following:- .D1 lo lo[0-9]* .It Sy dynamic_order These interfaces will be processed next, unless they have a metric. If unset, defaults to the following:- .D1 tap[0-9]* tun[0-9]* vpn vpn[0-9]* ppp[0-9]* ippp[0-9]* .It Sy local_nameservers If unset, defaults to the following:- .D1 127.* 0.0.0.0 255.255.255.255 ::1 .It Sy search_domains Prepend search domains to the dynamically generated list. .It Sy search_domains_append Append search domains to the dynamically generated list. .It Sy domain_blacklist A list of domains to be removed from consideration. To remove a domain, you can use foo.* To remove a sub domain, you can use *.bar .It Sy name_servers Prepend name servers to the dynamically generated list. You should set this to 127.0.0.1 if you use a local name server other than libc. .It Sy name_servers_append Append name servers to the dynamically generated list. .It Sy name_server_blacklist A list of name servers to be removed from consideration. The default is 0.0.0.0 as some faulty routers send it via DHCP. To remove a block, you can use 192.168.* .It Sy private_interfaces These interfaces name servers will only be queried for the domains listed in their resolv.conf. Useful for VPN domains. Setting .Sy private_interfaces Ns ="*" will stop the forwarding of the root zone and allows the local resolver to recursively query the root servers directly. Requires a local nameserver other than libc. This is equivalent to the .Nm resolvconf -p option. .It Sy replace -Is a space separated list of replacement keywords. The syntax is this: +Is a space separated list of replacement keywords. +The syntax is this: .Va $keyword Ns / Ns Va $match Ns / Ns Va $replacement .Pp Example, given this resolv.conf: .D1 domain foo.org .D1 search foo.org dead.beef .D1 nameserver 1.2.3.4 .D1 nameserver 2.3.4.5 and this configuaration: .D1 replace="search/foo*/bar.com nameserver/1.2.3.4/5.6.7.8 nameserver/2.3.4.5/" you would get this resolv.conf instead: .D1 domain foo.org .D1 search bar.com .D1 nameserver 5.6.7.8 .It Sy replace_sub Works the same way as .Sy replace except it works on each space separated value rather than the whole line, so it's useful for the replacing a single domain within the search directive. Using the same example resolv.conf and changing .Sy replace to .Sy replace_sub , you would get this resolv.conf instead: .D1 domain foo.org .D1 search bar.com dead.beef .D1 nameserver 5.6.7.8 .It Sy state_dir Override the default state directory of .Pa @VARDIR@ . This should not be changed once .Nm resolvconf is in use unless the old directory is copied to the new one. .El .Sh LIBC OPTIONS The following variables affect .Xr resolv.conf 5 directly:- .Bl -tag -width indent .It Sy resolv_conf Defaults to .Pa /etc/resolv.conf if not set. .It Sy resolv_conf_options A list of libc resolver options, as specified in .Xr resolv.conf 5 . .It Sy resolv_conf_passthrough When set to YES the latest resolv.conf is written to .Sy resolv_conf without any alteration. When set to /dev/null or NULL, .Sy resolv_conf_local_only is defaulted to NO, .Sy local_nameservers is unset unless overridden and only the information set in .Nm is written to .Sy resolv_conf . .It Sy resolv_conf_sortlist A libc resolver sortlist, as specified in .Xr resolv.conf 5 . .It Sy resolv_conf_local_only If a local name server is configured then the default is just to specify that and ignore all other entries as they will be configured for the local name server. Set this to NO to also list non-local nameservers. This will give you working DNS even if the local nameserver stops functioning at the expense of duplicated server queries. .It Sy append_nameservers Append name servers to the dynamically generated list. .It Sy prepend_nameservers Prepend name servers to the dynamically generated list. .It Sy append_search Append search domains to the dynamically generated list. .It Sy prepend_search Prepend search domains to the dynamically generated list. .El .Sh SUBSCRIBER OPTIONS openresolv ships with subscribers for the name servers .Xr dnsmasq 8 , .Xr named 8 , .Xr pdnsd 8 and .Xr unbound 8 . Each subscriber can create configuration files which should be included in in the subscribers main configuration file. .Pp To disable a subscriber, simply set it's name to NO. For example, to disable the libc subscriber you would set: .D1 libc=NO .Bl -tag -width indent .It Sy dnsmasq_conf This file tells dnsmasq which name servers to use for specific domains. .It Sy dnsmasq_resolv This file tells dnsmasq which name servers to use for global lookups. .Pp Example resolvconf.conf for dnsmasq: .D1 name_servers=127.0.0.1 .D1 dnsmasq_conf=/etc/dnsmasq-conf.conf .D1 dnsmasq_resolv=/etc/dnsmasq-resolv.conf .Pp Example dnsmasq.conf: .D1 listen-address=127.0.0.1 .D1 # If dnsmasq is compiled for DBus then we can take .D1 # advantage of not having to restart dnsmasq. .D1 enable-dbus .D1 conf-file=/etc/dnsmasq-conf.conf .D1 resolv-file=/etc/dnsmasq-resolv.conf .It Sy named_options Include this file in the named options block. This file tells named which name servers to use for global lookups. .It Sy named_zones Include this file in the named global scope, after the options block. This file tells named which name servers to use for specific domains. .Pp Example resolvconf.conf for named: .D1 name_servers=127.0.0.1 .D1 named_options=/etc/named-options.conf .D1 named_zones=/etc/named-zones.conf .Pp Example named.conf: .D1 options { .D1 listen-on { 127.0.0.1; }; .D1 include "/etc/named-options.conf"; .D1 }; .D1 include "/etc/named-zones.conf"; .It Sy pdnsd_conf This is the main pdnsd configuration file which we modify to add our forward domains to. If this variable is not set then we rely on the pdnsd configuration file setup to read .Pa pdnsd_resolv as documented below. .It Sy pdnsd_resolv This file tells pdnsd about global name servers. If this variable is not set then it's written to .Pa pdnsd_conf . .Pp Example resolvconf.conf for pdnsd: .D1 name_servers=127.0.0.1 .D1 pdnsd_conf=/etc/pdnsd.conf .D1 # pdnsd_resolv=/etc/pdnsd-resolv.conf .Pp Example pdnsd.conf: .D1 global { .D1 server_ip = 127.0.0.1; .D1 status_ctl = on; .D1 } .D1 server { .D1 # A server definition is required, even if emtpy. .D1 label="empty"; .D1 proxy_only=on; .D1 # file="/etc/pdnsd-resolv.conf"; .D1 } .It Sy unbound_conf This file tells unbound about specific and global name servers. .It Sy unbound_insecure When set to YES, unbound marks the domains as insecure, thus ignoring DNSSEC. .Pp Example resolvconf.conf for unbound: .D1 name_servers=127.0.0.1 .D1 unbound_conf=/etc/unbound-resolvconf.conf .Pp Example unbound.conf: .D1 include: /etc/unbound-resolvconf.conf .El .Sh SUBSCRIBER INTEGRATION Not all distributions store the files the subscribers need in the same locations. For example, named service scripts have been called named, bind and rc.bind and they could be located in a directory called /etc/rc.d, /etc/init.d or similar. Each subscriber attempts to automatically configure itself, but not every distribution has been catered for. Also, users could equally want to use a different version from the one installed by default, such as bind8 and bind9. To accommodate this, the subscribers have these files in configurable variables, documented below. .Pp .Bl -tag -width indent .It Sy dnsmasq_service -Location of the dnsmasq service. +Name of the dnsmasq service. .It Sy dnsmasq_restart Command to restart the dnsmasq service. .It Sy dnsmasq_pid Location of the dnsmasq pidfile. .It Sy libc_service -Location of the libc service. +Name of the libc service. .It Sy libc_restart Command to restart the libc service. .It Sy named_service -Location of the named service. +Name of the named service. .It Sy named_restart Command to restart the named service. .It Sy pdnsd_restart Command to restart the pdnsd service. .It Sy unbound_service -Location of the unbound service. +Name of the unbound service. .It Sy unbound_restart Command to restart the unbound service. .It Sy unbound_pid Location of the unbound pidfile. .El .Sh SEE ALSO +.Xr sh 1 , .Xr resolv.conf 5 , .Xr resolvconf 8 -and -.Xr sh 1 . .Sh AUTHORS .An Roy Marples Aq Mt roy@marples.name .Sh BUGS Each distribution is a special snowflake and likes to name the same thing differently, namely the named service script. .Pp Please report them to .Lk http://roy.marples.name/projects/openresolv Index: user/alc/PQ_LAUNDRY/contrib/openresolv/resolvconf.in =================================================================== --- user/alc/PQ_LAUNDRY/contrib/openresolv/resolvconf.in (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/openresolv/resolvconf.in (revision 303206) @@ -1,791 +1,907 @@ #!/bin/sh # Copyright (c) 2007-2016 Roy Marples # All rights reserved # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above # copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials provided # with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS # "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT # LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR # A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT # OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, # SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT # LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. RESOLVCONF="$0" +OPENRESOLV_VERSION="3.8.1" SYSCONFDIR=@SYSCONFDIR@ LIBEXECDIR=@LIBEXECDIR@ VARDIR=@VARDIR@ +RCDIR=@RCDIR@ +RESTARTCMD=@RESTARTCMD@ # Disregard dhcpcd setting unset interface_order state_dir # If you change this, change the test in VFLAG and libc.in as well local_nameservers="127.* 0.0.0.0 255.255.255.255 ::1" dynamic_order="tap[0-9]* tun[0-9]* vpn vpn[0-9]* ppp[0-9]* ippp[0-9]*" interface_order="lo lo[0-9]*" name_server_blacklist="0.0.0.0" # Support original resolvconf configuration layout # as well as the openresolv config file if [ -f "$SYSCONFDIR"/resolvconf.conf ]; then . "$SYSCONFDIR"/resolvconf.conf [ -n "$state_dir" ] && VARDIR="$state_dir" elif [ -d "$SYSCONFDIR/resolvconf" ]; then SYSCONFDIR="$SYSCONFDIR/resolvconf" if [ -f "$SYSCONFDIR"/interface-order ]; then interface_order="$(cat "$SYSCONFDIR"/interface-order)" fi fi IFACEDIR="$VARDIR/interfaces" METRICDIR="$VARDIR/metrics" PRIVATEDIR="$VARDIR/private" EXCLUSIVEDIR="$VARDIR/exclusive" LOCKDIR="$VARDIR/lock" _PWD="$PWD" warn() { echo "$*" >&2 } error_exit() { echo "$*" >&2 exit 1 } usage() { cat <<-EOF - Usage: ${RESOLVCONF##*/} [options] + Usage: ${RESOLVCONF##*/} [options] command [argument] Inform the system about any DNS updates. - Options: + Commands: -a \$INTERFACE Add DNS information to the specified interface (DNS supplied via stdin in resolv.conf format) - -m metric Give the added DNS information a metric - -p Mark the interface as private - -x Mark the interface as exclusive -d \$INTERFACE Delete DNS information from the specified interface - -f Ignore non existant interfaces - -I Init the state dir - -u Run updates from our current DNS information - -l [\$PATTERN] Show DNS information, optionally from interfaces - that match the specified pattern + -h Show this help cruft -i [\$PATTERN] Show interfaces that have supplied DNS information optionally from interfaces that match the specified pattern + -l [\$PATTERN] Show DNS information, optionally from interfaces + that match the specified pattern + + -u Run updates from our current DNS information + + Options: + -f Ignore non existent interfaces + -m metric Give the added DNS information a metric + -p Mark the interface as private + -x Mark the interface as exclusive + + Subscriber and System Init Commands: + -I Init the state dir + -r \$SERVICE Restart the system service + (restarting a non-existent or non-running service + should have no output and return 0) + -R Show the system service restart command -v [\$PATTERN] echo NEWDOMAIN, NEWSEARCH and NEWNS variables to the console - -h Show this help cruft + -V [\$PATTERN] Same as -v, but only uses configuration in + $SYSCONFDIR/resolvconf.conf EOF [ -z "$1" ] && exit 0 echo error_exit "$*" } -echo_resolv() -{ - local line= OIFS="$IFS" - - [ -n "$1" -a -f "$IFACEDIR/$1" ] || return 1 - echo "# resolv.conf from $1" - # Our variable maker works of the fact each resolv.conf per interface - # is separated by blank lines. - # So we remove them when echoing them. - while read -r line; do - IFS="$OIFS" - if [ -n "$line" ]; then - # We need to set IFS here to preserve any whitespace - IFS='' - printf "%s\n" "$line" - fi - done < "$IFACEDIR/$1" - echo - IFS="$OIFS" -} - # Strip any trailing dot from each name as a FQDN does not belong # in resolv.conf(5) # If you think otherwise, capture a DNS trace and you'll see libc # will strip it regardless. # This also solves setting up duplicate zones in our subscribers. strip_trailing_dots() { local n= d= for n; do printf "$d%s" "${n%.}" d=" " done printf "\n" } # Parse resolv.conf's and make variables # for domain name servers, search name servers and global nameservers parse_resolv() { local line= ns= ds= search= d= n= newns= local new=true iface= private=false p= domain= l= islocal= newns= while read -r line; do case "$line" in "# resolv.conf from "*) if ${new}; then iface="${line#\# resolv.conf from *}" new=false if [ -e "$PRIVATEDIR/$iface" ]; then private=true else # Allow expansion cd "$IFACEDIR" private=false for p in $private_interfaces; do case "$iface" in - "$p"|"$p":*) private=true; break;; + "$p"|"$p":*) + private=true + break + ;; esac done fi fi ;; "nameserver "*) islocal=false for l in $local_nameservers; do case "${line#* }" in $l) islocal=true echo "LOCALNAMESERVERS=\"\$LOCALNAMESERVERS ${line#* }\"" break ;; esac done $islocal || ns="$ns${line#* } " ;; "domain "*) search="$(strip_trailing_dots ${line#* })" if [ -z "$domain" ]; then domain="$search" echo "DOMAIN=\"$domain\"" fi ;; "search "*) search="$(strip_trailing_dots ${line#* })" ;; *) [ -n "$line" ] && continue if [ -n "$ns" -a -n "$search" ]; then newns= for n in $ns; do newns="$newns${newns:+,}$n" done ds= for d in $search; do ds="$ds${ds:+ }$d:$newns" done echo "DOMAINS=\"\$DOMAINS $ds\"" fi echo "SEARCH=\"\$SEARCH $search\"" if ! $private; then echo "NAMESERVERS=\"\$NAMESERVERS $ns\"" fi ns= search= new=true ;; esac done } uniqify() { local result= while [ -n "$1" ]; do case " $result " in *" $1 "*);; *) result="$result $1";; esac shift done echo "${result# *}" } dirname() { local dir= OIFS="$IFS" local IFS=/ set -- $@ IFS="$OIFS" if [ -n "$1" ]; then printf %s . else shift fi while [ -n "$2" ]; do printf "/%s" "$1" shift done printf "\n" } config_mkdirs() { local e=0 f d for f; do [ -n "$f" ] || continue d="$(dirname "$f")" if [ ! -d "$d" ]; then if type install >/dev/null 2>&1; then install -d "$d" || e=$? else mkdir "$d" || e=$? fi fi done return $e } +# With the advent of alternative init systems, it's possible to have +# more than one installed. So we need to try and guess what one we're +# using unless overriden by configure. +# Note that restarting a service is a last resort - the subscribers +# should make a reasonable attempt to reconfigre the service via some +# method, normally SIGHUP. +detect_init() +{ + [ -n "$RESTARTCMD" ] && return 0 + + # Detect the running init system. + # As systemd and OpenRC can be installed on top of legacy init + # systems we try to detect them first. + local status="@STATUSARG@" + : ${status:=status} + if [ -x /bin/systemctl -a -S /run/systemd/private ]; then + RESTARTCMD="if /bin/systemctl --quiet is-active \$1.service; then + /bin/systemctl restart \$1.service; +fi" + elif [ -x /usr/bin/systemctl -a -S /run/systemd/private ]; then + RESTARTCMD="if /usr/bin/systemctl --quiet is-active \$1.service; then + /usr/bin/systemctl restart \$1.service; +fi" + elif [ -x /sbin/rc-service -a \ + -s /libexec/rc/init.d/softlevel -o -s /run/openrc/softlevel ] + then + RESTARTCMD="/sbin/rc-service -i \$1 -- -Ds restart" + elif [ -x /usr/sbin/invoke-rc.d ]; then + RCDIR=/etc/init.d + RESTARTCMD="if /usr/sbin/invoke-rc.d --quiet \$1 status 1>/dev/null 2>&1; then + /usr/sbin/invoke-rc.d \$1 restart; +fi" + elif [ -x /sbin/service ]; then + # Old RedHat + RCDIR=/etc/init.d + RESTARTCMD="if /sbin/service \$1; then + /sbin/service \$1 restart; +fi" + elif [ -x /usr/sbin/service ]; then + # Could be FreeBSD + RESTARTCMD="if /usr/sbin/service \$1 $status 1>/dev/null 2>&1; then + /usr/sbin/service \$1 restart; +fi" + elif [ -x /bin/sv ]; then + RESTARTCMD="/bin/sv try-restart \$1" + elif [ -x /usr/bin/sv ]; then + RESTARTCMD="/usr/bin/sv try-restart \$1" + elif [ -e /etc/arch-release -a -d /etc/rc.d ]; then + RCDIR=/etc/rc.d + RESTARTCMD="if [ -e /var/run/daemons/\$1 ]; then + /etc/rc.d/\$1 restart; +fi" + elif [ -e /etc/slackware-version -a -d /etc/rc.d ]; then + RESTARTCMD="if /etc/rc.d/rc.\$1 status 1>/dev/null 2>&1; then + /etc/rc.d/rc.\$1 restart; +fi" + elif [ -e /etc/rc.d/rc.subr -a -d /etc/rc.d ]; then + # OpenBSD + RESTARTCMD="if /etc/rc.d/\$1 check 1>/dev/null 2>&1; then + /etc/rc.d/\$1 restart; +fi" + else + for x in /etc/init.d/rc.d /etc/rc.d /etc/init.d; do + [ -d $x ] || continue + RESTARTCMD="if $x/\$1 $status 1>/dev/null 2>&1; then + $x/\$1 restart; +fi" + break + done + fi + + if [ -z "$RESTARTCMD" ]; then + if [ "$NOINIT_WARNED" != true ]; then + warn "could not detect a useable init system" + _NOINIT_WARNED=true + fi + return 1 + fi + _NOINIT_WARNED= + return 0 +} + +echo_resolv() +{ + local line= OIFS="$IFS" + + [ -n "$1" -a -f "$IFACEDIR/$1" ] || return 1 + echo "# resolv.conf from $1" + # Our variable maker works of the fact each resolv.conf per interface + # is separated by blank lines. + # So we remove them when echoing them. + while read -r line; do + IFS="$OIFS" + if [ -n "$line" ]; then + # We need to set IFS here to preserve any whitespace + IFS='' + printf "%s\n" "$line" + fi + done < "$IFACEDIR/$1" + IFS="$OIFS" +} + list_resolv() { [ -d "$IFACEDIR" ] || return 0 local report=false list= retval=0 cmd="$1" excl= shift case "$IF_EXCLUSIVE" in [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1) if [ -d "$EXCLUSIVEDIR" ]; then cd "$EXCLUSIVEDIR" for i in *; do if [ -f "$i" ]; then list="${i#* }" break fi done fi excl=true ;; *) excl=false ;; esac # If we have an interface ordering list, then use that. # It works by just using pathname expansion in the interface directory. if [ -n "$1" ]; then list="$*" $force || report=true elif ! $excl; then cd "$IFACEDIR" for i in $interface_order; do [ -f "$i" ] && list="$list $i" for ii in "$i":* "$i".*; do [ -f "$ii" ] && list="$list $ii" done done for i in $dynamic_order; do if [ -e "$i" -a ! -e "$METRICDIR/"*" $i" ]; then list="$list $i" fi for ii in "$i":* "$i".*; do if [ -f "$ii" -a ! -e "$METRICDIR/"*" $ii" ]; then list="$list $ii" fi done done if [ -d "$METRICDIR" ]; then cd "$METRICDIR" for i in *; do [ -f "$i" ] && list="$list ${i#* }" done fi list="$list *" fi cd "$IFACEDIR" retval=1 + excl=true for i in $(uniqify $list); do # Only list interfaces which we really have if ! [ -f "$i" ]; then if $report; then echo "No resolv.conf for interface $i" >&2 retval=2 fi continue fi if [ "$cmd" = i -o "$cmd" = "-i" ]; then printf %s "$i " else echo_resolv "$i" + echo fi [ $? = 0 -a "$retval" = 1 ] && retval=0 done [ "$cmd" = i -o "$cmd" = "-i" ] && echo return $retval } list_remove() { local list= e= l= result= found= retval=0 [ -z "$2" ] && return 0 eval list=\"\$$1\" shift set -f for e; do found=false for l in $list; do case "$e" in $l) found=true;; esac $found && break done if $found; then retval=$(($retval + 1)) else result="$result $e" fi done set +f echo "${result# *}" return $retval } echo_prepend() { echo "# Generated by resolvconf" if [ -n "$search_domains" ]; then echo "search $search_domains" fi for n in $name_servers; do echo "nameserver $n" done echo } echo_append() { echo "# Generated by resolvconf" if [ -n "$search_domains_append" ]; then echo "search $search_domains_append" fi for n in $name_servers_append; do echo "nameserver $n" done echo } replace() { local r= k= f= v= val= sub= while read -r keyword value; do for r in $replace; do k="${r%%/*}" r="${r#*/}" f="${r%%/*}" r="${r#*/}" v="${r%%/*}" case "$keyword" in $k) case "$value" in $f) value="$v";; esac ;; esac done val= for sub in $value; do for r in $replace_sub; do k="${r%%/*}" r="${r#*/}" f="${r%%/*}" r="${r#*/}" v="${r%%/*}" case "$keyword" in $k) case "$sub" in $f) sub="$v";; esac ;; esac done val="$val${val:+ }$sub" done printf "%s %s\n" "$keyword" "$val" done } make_vars() { local newdomains= d= dn= newns= ns= # Clear variables DOMAIN= DOMAINS= SEARCH= NAMESERVERS= LOCALNAMESERVERS= if [ -n "$name_servers" -o -n "$search_domains" ]; then eval "$(echo_prepend | parse_resolv)" fi if [ -z "$VFLAG" ]; then IF_EXCLUSIVE=1 list_resolv -i "$@" >/dev/null || IF_EXCLUSIVE=0 eval "$(list_resolv -l "$@" | replace | parse_resolv)" fi if [ -n "$name_servers_append" -o -n "$search_domains_append" ]; then eval "$(echo_append | parse_resolv)" fi # Ensure that we only list each domain once for d in $DOMAINS; do dn="${d%%:*}" list_remove domain_blacklist "$dn" >/dev/null || continue case " $newdomains" in *" ${dn}:"*) continue;; esac newns= for nd in $DOMAINS; do if [ "$dn" = "${nd%%:*}" ]; then ns="${nd#*:}" while [ -n "$ns" ]; do case ",$newns," in *,${ns%%,*},*) ;; *) list_remove name_server_blacklist \ "${ns%%,*}" >/dev/null \ && newns="$newns${newns:+,}${ns%%,*}";; esac [ "$ns" = "${ns#*,}" ] && break ns="${ns#*,}" done fi done if [ -n "$newns" ]; then newdomains="$newdomains${newdomains:+ }$dn:$newns" fi done DOMAIN="$(list_remove domain_blacklist $DOMAIN)" SEARCH="$(uniqify $SEARCH)" SEARCH="$(list_remove domain_blacklist $SEARCH)" NAMESERVERS="$(uniqify $NAMESERVERS)" NAMESERVERS="$(list_remove name_server_blacklist $NAMESERVERS)" LOCALNAMESERVERS="$(uniqify $LOCALNAMESERVERS)" LOCALNAMESERVERS="$(list_remove name_server_blacklist $LOCALNAMESERVERS)" echo "DOMAIN='$DOMAIN'" echo "SEARCH='$SEARCH'" echo "NAMESERVERS='$NAMESERVERS'" echo "LOCALNAMESERVERS='$LOCALNAMESERVERS'" echo "DOMAINS='$newdomains'" } force=false VFLAG= -while getopts a:Dd:fhIilm:puvVx OPT; do +while getopts a:Dd:fhIilm:pRruvVx OPT; do case "$OPT" in f) force=true;; h) usage;; m) IF_METRIC="$OPTARG";; p) IF_PRIVATE=1;; V) VFLAG=1 if [ "$local_nameservers" = \ "127.* 0.0.0.0 255.255.255.255 ::1" ] then local_nameservers= fi ;; x) IF_EXCLUSIVE=1;; '?') ;; *) cmd="$OPT"; iface="$OPTARG";; esac done shift $(($OPTIND - 1)) args="$iface${iface:+ }$*" # -I inits the state dir if [ "$cmd" = I ]; then if [ -d "$VARDIR" ]; then rm -rf "$VARDIR"/* fi exit $? fi # -D ensures that the listed config file base dirs exist if [ "$cmd" = D ]; then config_mkdirs "$@" exit $? fi # -l lists our resolv files, optionally for a specific interface if [ "$cmd" = l -o "$cmd" = i ]; then list_resolv "$cmd" "$args" exit $? fi +# Restart a service or echo the command to restart a service +if [ "$cmd" = r -o "$cmd" = R ]; then + detect_init || exit 1 + if [ "$cmd" = r ]; then + set -- $args + eval $RESTARTCMD + else + echo "$RESTARTCMD" + fi + exit $? +fi + # Not normally needed, but subscribers should be able to run independently if [ "$cmd" = v -o -n "$VFLAG" ]; then make_vars "$iface" exit $? fi # Test that we have valid options if [ "$cmd" = a -o "$cmd" = d ]; then if [ -z "$iface" ]; then usage "Interface not specified" fi elif [ "$cmd" != u ]; then [ -n "$cmd" -a "$cmd" != h ] && usage "Unknown option $cmd" usage fi if [ "$cmd" = a ]; then for x in '/' \\ ' ' '*'; do case "$iface" in *[$x]*) error_exit "$x not allowed in interface name";; esac done for x in '.' '-' '~'; do case "$iface" in [$x]*) error_exit \ "$x not allowed at start of interface name";; esac done [ "$cmd" = a -a -t 0 ] && error_exit "No file given via stdin" fi if [ ! -d "$VARDIR" ]; then if [ -L "$VARDIR" ]; then dir="$(readlink "$VARDIR")" # link maybe relative cd "${VARDIR%/*}" if ! mkdir -m 0755 -p "$dir"; then error_exit "Failed to create needed" \ "directory $dir" fi else if ! mkdir -m 0755 -p "$VARDIR"; then error_exit "Failed to create needed" \ "directory $VARDIR" fi fi fi if [ ! -d "$IFACEDIR" ]; then mkdir -m 0755 -p "$IFACEDIR" || \ error_exit "Failed to create needed directory $IFACEDIR" if [ "$cmd" = d ]; then # Provide the same error messages as below if ! ${force}; then cd "$IFACEDIR" for i in $args; do warn "No resolv.conf for interface $i" done fi ${force} exit $? fi fi # An interface was added, changed, deleted or a general update was called. # Due to exclusivity we need to ensure that this is an atomic operation. # Our subscribers *may* need this as well if the init system is sub par. # As such we spinlock at this point as best we can. # We don't use flock(1) because it's not widely available and normally resides # in /usr which we do our very best to operate without. [ -w "$VARDIR" ] || error_exit "Cannot write to $LOCKDIR" : ${lock_timeout:=10} while true; do if mkdir "$LOCKDIR" 2>/dev/null; then trap 'rm -rf "$LOCKDIR";' EXIT trap 'rm -rf "$LOCKDIR"; exit 1' INT QUIT ABRT SEGV ALRM TERM echo $$ >"$LOCKDIR/pid" break fi pid=$(cat "$LOCKDIR/pid") if ! kill -0 "$pid"; then warn "clearing stale lock pid $pid" rm -rf "$LOCKDIR" continue fi lock_timeout=$(($lock_timeout - 1)) if [ "$lock_timeout" -le 0 ]; then error_exit "timed out waiting for lock from pid $pid" fi sleep 1 done case "$cmd" in a) # Read resolv.conf from stdin resolv="$(cat)" changed=false changedfile=false # If what we are given matches what we have, then do nothing if [ -e "$IFACEDIR/$iface" ]; then if [ "$(echo "$resolv")" != \ "$(cat "$IFACEDIR/$iface")" ] then changed=true changedfile=true fi else changed=true changedfile=true fi # Set metric and private before creating the interface resolv.conf file # to ensure that it will have the correct flags [ ! -d "$METRICDIR" ] && mkdir "$METRICDIR" oldmetric="$METRICDIR/"*" $iface" newmetric= if [ -n "$IF_METRIC" ]; then # Pad metric to 6 characters, so 5 is less than 10 while [ ${#IF_METRIC} -le 6 ]; do IF_METRIC="0$IF_METRIC" done newmetric="$METRICDIR/$IF_METRIC $iface" fi rm -f "$METRICDIR/"*" $iface" [ "$oldmetric" != "$newmetric" -a \ "$oldmetric" != "$METRICDIR/* $iface" ] && changed=true [ -n "$newmetric" ] && echo " " >"$newmetric" case "$IF_PRIVATE" in [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1) if [ ! -d "$PRIVATEDIR" ]; then [ -e "$PRIVATEDIR" ] && rm "$PRIVATEDIR" mkdir "$PRIVATEDIR" fi [ -e "$PRIVATEDIR/$iface" ] || changed=true [ -d "$PRIVATEDIR" ] && echo " " >"$PRIVATEDIR/$iface" ;; *) if [ -e "$PRIVATEDIR/$iface" ]; then rm -f "$PRIVATEDIR/$iface" changed=true fi ;; esac oldexcl= for x in "$EXCLUSIVEDIR/"*" $iface"; do if [ -f "$x" ]; then oldexcl="$x" break fi done case "$IF_EXCLUSIVE" in [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1) if [ ! -d "$EXCLUSIVEDIR" ]; then [ -e "$EXCLUSIVEDIR" ] && rm "$EXCLUSIVEDIR" mkdir "$EXCLUSIVEDIR" fi cd "$EXCLUSIVEDIR" for x in *; do [ -f "$x" ] && break done if [ "${x#* }" != "$iface" ]; then if [ "$x" = "${x% *}" ]; then x=10000000 else x="${x% *}" fi if [ "$x" = "0000000" ]; then warn "exclusive underflow" else x=$(($x - 1)) fi if [ -d "$EXCLUSIVEDIR" ]; then echo " " >"$EXCLUSIVEDIR/$x $iface" fi changed=true fi ;; *) if [ -f "$oldexcl" ]; then rm -f "$oldexcl" changed=true fi ;; esac if $changedfile; then printf "%s\n" "$resolv" >"$IFACEDIR/$iface" || exit $? elif ! $changed; then exit 0 fi unset changed changedfile oldmetric newmetric x oldexcl ;; d) # Delete any existing information about the interface cd "$IFACEDIR" changed=false for i in $args; do if [ -e "$i" ]; then changed=true elif ! ${force}; then warn "No resolv.conf for interface $i" fi rm -f "$i" "$METRICDIR/"*" $i" \ "$PRIVATEDIR/$i" \ "$EXCLUSIVEDIR/"*" $i" || exit $? done if ! ${changed}; then # Set the return code based on the forced flag ${force} exit $? fi unset changed i ;; esac case "${resolvconf:-YES}" in [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1) ;; *) exit 0;; esac +# Try and detect a suitable init system for our scripts +detect_init +export RESTARTCMD RCDIR _NOINIT_WARNED + eval "$(make_vars)" export RESOLVCONF DOMAINS SEARCH NAMESERVERS LOCALNAMESERVERS : ${list_resolv:=list_resolv -l} retval=0 # Run scripts in the same directory resolvconf is run from -# in case any scripts accidently dump files in the wrong place. +# in case any scripts accidentally dump files in the wrong place. cd "$_PWD" for script in "$LIBEXECDIR"/*; do if [ -f "$script" ]; then eval script_enabled="\$${script##*/}" case "${script_enabled:-YES}" in [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1) ;; *) continue;; esac if [ -x "$script" ]; then "$script" "$cmd" "$iface" else (set -- "$cmd" "$iface"; . "$script") fi retval=$(($retval + $?)) fi done exit $retval Index: user/alc/PQ_LAUNDRY/contrib/openresolv/unbound.in =================================================================== --- user/alc/PQ_LAUNDRY/contrib/openresolv/unbound.in (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/openresolv/unbound.in (revision 303206) @@ -1,86 +1,97 @@ #!/bin/sh -# Copyright (c) 2009-2014 Roy Marples +# Copyright (c) 2009-2016 Roy Marples # All rights reserved # unbound subscriber for resolvconf # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above # copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials provided # with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS # "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT # LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR # A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT # OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, # SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT # LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. unbound_insecure= [ -f "@SYSCONFDIR@"/resolvconf.conf ] || exit 0 . "@SYSCONFDIR@/resolvconf.conf" || exit 1 [ -z "$unbound_conf" ] && exit 0 [ -z "$RESOLVCONF" ] && eval "$(@SBINDIR@/resolvconf -v)" NL=" " : ${unbound_pid:=/var/run/unbound.pid} : ${unbound_service:=unbound} -: ${unbound_restart:=@RESTARTCMD ${unbound_service}@} newconf="# Generated by resolvconf$NL" for d in $DOMAINS; do dn="${d%%:*}" ns="${d#*:}" case "$unbound_insecure" in [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1) newconf="$newconf${NL}server:$NL" newconf="$newconf domain-insecure: \"$dn\"$NL" ;; esac newconf="$newconf${NL}forward-zone:$NL name: \"$dn\"$NL" while [ -n "$ns" ]; do newconf="$newconf forward-addr: ${ns%%,*}$NL" [ "$ns" = "${ns#*,}" ] && break ns="${ns#*,}" done done if [ -n "$NAMESERVERS" ]; then newconf="$newconf${NL}forward-zone:$NL name: \".\"$NL" for n in $NAMESERVERS; do newconf="$newconf forward-addr: $n$NL" done fi # Try to ensure that config dirs exist if type config_mkdirs >/dev/null 2>&1; then config_mkdirs "$unbound_conf" else @SBINDIR@/resolvconf -D "$unbound_conf" fi +restart_unbound() +{ + if [ -n "$unbound_restart" ]; then + eval $unbound_restart + elif [ -n "$RESTARTCMD" ]; then + set -- ${unbound_service} + eval $RESTARTCMD + else + @SBINDIR@/resolvconf -r ${unbound_service} + fi +} + if [ ! -f "$unbound_conf" ] || \ [ "$(cat "$unbound_conf")" != "$(printf %s "$newconf")" ] then printf %s "$newconf" >"$unbound_conf" # If we can't sent a HUP then force a restart if [ -s "$unbound_pid" ]; then if ! kill -HUP $(cat "$unbound_pid") 2>/dev/null; then - eval $unbound_restart + restart_unbound fi else - eval $unbound_restart + restart_unbound fi fi Index: user/alc/PQ_LAUNDRY/contrib/openresolv =================================================================== --- user/alc/PQ_LAUNDRY/contrib/openresolv (revision 303205) +++ user/alc/PQ_LAUNDRY/contrib/openresolv (revision 303206) Property changes on: user/alc/PQ_LAUNDRY/contrib/openresolv ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,2 ## Merged /head/contrib/openresolv:r296481-303204 Merged /vendor/openresolv/dist:r298167,300962,303048 Index: user/alc/PQ_LAUNDRY/crypto/heimdal/lib/roken/version-script.map =================================================================== --- user/alc/PQ_LAUNDRY/crypto/heimdal/lib/roken/version-script.map (revision 303205) +++ user/alc/PQ_LAUNDRY/crypto/heimdal/lib/roken/version-script.map (revision 303206) @@ -1,203 +1,197 @@ HEIMDAL_ROKEN_1.0 { global: arg_printusage; arg_printusage_i18n; base64_decode; base64_encode; cgetcap; cgetclose; cgetmatch; cgetnum; cgetset; cgetustr; ct_memcmp; err; errx; free_getarg_strings; get_default_username; get_window_size; getarg; getnameinfo_verified; hex_decode; hex_encode; issuid; k_getpwnam; k_getpwuid; mini_inetd; mini_inetd_addrinfo; net_read; net_write; parse_bytes; parse_flags; parse_time; parse_units; print_flags_table; print_time_table; print_units_table; rk_asnprintf; rk_asprintf; rk_bswap16; rk_bswap32; rk_cgetent; rk_cgetstr; rk_cloexec; rk_cloexec_file; rk_cloexec_dir; rk_closefrom; rk_copyhostent; rk_dns_free_data; rk_dns_lookup; rk_dns_srv_order; rk_dns_string_to_type; rk_dns_type_to_string; rk_dumpdata; rk_ecalloc; rk_emalloc; rk_eread; rk_erealloc; rk_esetenv; rk_estrdup; rk_ewrite; rk_flock; rk_fnmatch; rk_free_environment; rk_freeaddrinfo; rk_freehostent; rk_freeifaddrs; rk_gai_strerror; rk_getaddrinfo; rk_getifaddrs; rk_getipnodebyaddr; rk_getipnodebyname; rk_getnameinfo; rk_getprogname; rk_glob; rk_globfree; rk_hex_decode; rk_hex_encode; rk_hostent_find_fqdn; rk_inet_ntop; rk_inet_pton; rk_localtime_r; rk_mkstemp; rk_pid_file_delete; rk_pid_file_write; rk_pidfile; rk_pipe_execv; rk_random_init; rk_read_environment; rk_readv; rk_realloc; rk_strerror; rk_strerror_r; rk_setprogname; rk_simple_execle; rk_simple_execlp; rk_simple_execve; rk_simple_execve_timed; rk_simple_execvp; rk_simple_execvp_timed; rk_socket; rk_socket_addr_size; rk_socket_get_address; rk_socket_get_port; rk_socket_set_address_and_port; rk_socket_set_any; rk_socket_set_debug; rk_socket_set_ipv6only; rk_socket_set_port; rk_socket_set_portrange; rk_socket_set_reuseaddr; rk_socket_set_tos; rk_socket_sockaddr_size; rk_strcollect; rk_strftime; rk_strlcat; rk_strlcpy; rk_strlwr; rk_strndup; rk_strnlen; rk_strpoolcollect; rk_strpoolfree; rk_strpoolprintf; rk_strptime; rk_strsep_copy; rk_strsvis; - rk_strsvis; rk_strsvisx; rk_strunvis; - rk_strunvis; rk_strunvisx; rk_strupr; rk_strvis; - rk_strvis; rk_strvisx; - rk_strvisx; rk_svis; - rk_svis; rk_timegm; rk_timevaladd; rk_timevalfix; rk_timevalsub; rk_tdelete; rk_tfind; rk_tsearch; rk_twalk; rk_undumpdata; rk_unvis; rk_vasnprintf; rk_vasprintf; - rk_vis; rk_vis; rk_vsnprintf; rk_vstrcollect; rk_wait_for_process; rk_wait_for_process_timed; rk_warnerr; rk_xfree; roken_concat; roken_getaddrinfo_hostspec2; roken_getaddrinfo_hostspec; roken_gethostby_setup; roken_gethostbyaddr; roken_gethostbyname; roken_mconcat; roken_vconcat; roken_vmconcat; rtbl_add_column; rtbl_add_column_by_id; rtbl_add_column_entry; rtbl_add_column_entry_by_id; rtbl_add_column_entryv; rtbl_add_column_entryv_by_id; rtbl_create; rtbl_destroy; rtbl_format; rtbl_get_flags; rtbl_new_row; rtbl_set_column_affix_by_id; rtbl_set_column_prefix; rtbl_set_flags; rtbl_set_prefix; rtbl_set_separator; signal; simple_execl; tm2time; unix_verify_user; unparse_bytes; unparse_bytes_short; unparse_flags; unparse_time; unparse_time_approx; unparse_units; unparse_units_approx; verr; verrx; vwarn; vwarnx; warn; warnx; writev; local: *; }; Index: user/alc/PQ_LAUNDRY/crypto/heimdal =================================================================== --- user/alc/PQ_LAUNDRY/crypto/heimdal (revision 303205) +++ user/alc/PQ_LAUNDRY/crypto/heimdal (revision 303206) Property changes on: user/alc/PQ_LAUNDRY/crypto/heimdal ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/crypto/heimdal:r299821-303204 Index: user/alc/PQ_LAUNDRY/crypto/openssh =================================================================== --- user/alc/PQ_LAUNDRY/crypto/openssh (revision 303205) +++ user/alc/PQ_LAUNDRY/crypto/openssh (revision 303206) Property changes on: user/alc/PQ_LAUNDRY/crypto/openssh ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/crypto/openssh:r296872-303204 Index: user/alc/PQ_LAUNDRY/etc/ntp/leap-seconds =================================================================== --- user/alc/PQ_LAUNDRY/etc/ntp/leap-seconds (revision 303205) +++ user/alc/PQ_LAUNDRY/etc/ntp/leap-seconds (revision 303206) @@ -1,221 +1,222 @@ # # $FreeBSD$ # # In the following text, the symbol '#' introduces # a comment, which continues from that symbol until # the end of the line. A plain comment line has a # whitespace character following the comment indicator. # There are also special comment lines defined below. # A special comment will always have a non-whitespace # character in column 2. # # A blank line should be ignored. # # The following table shows the corrections that must # be applied to compute International Atomic Time (TAI) # from the Coordinated Universal Time (UTC) values that # are transmitted by almost all time services. # # The first column shows an epoch as a number of seconds # since 1900.0 and the second column shows the number of # seconds that must be added to UTC to compute TAI for # any timestamp at or after that epoch. The value on # each line is valid from the indicated initial instant # until the epoch given on the next one or indefinitely # into the future if there is no next line. # (The comment on each line shows the representation of # the corresponding initial epoch in the usual # day-month-year format. The epoch always begins at # 00:00:00 UTC on the indicated day. See Note 5 below.) # # Important notes: # # 1. Coordinated Universal Time (UTC) is often referred to # as Greenwich Mean Time (GMT). The GMT time scale is no # longer used, and the use of GMT to designate UTC is # discouraged. # # 2. The UTC time scale is realized by many national # laboratories and timing centers. Each laboratory # identifies its realization with its name: Thus # UTC(NIST), UTC(USNO), etc. The differences among # these different realizations are typically on the # order of a few nanoseconds (i.e., 0.000 000 00x s) # and can be ignored for many purposes. These differences # are tabulated in Circular T, which is published monthly # by the International Bureau of Weights and Measures # (BIPM). See www.bipm.fr for more information. # # 3. The current definition of the relationship between UTC # and TAI dates from 1 January 1972. A number of different # time scales were in use before than epoch, and it can be # quite difficult to compute precise timestamps and time # intervals in those "prehistoric" days. For more information, # consult: # # The Explanatory Supplement to the Astronomical # Ephemeris. # or # Terry Quinn, "The BIPM and the Accurate Measurement # of Time," Proc. of the IEEE, Vol. 79, pp. 894-905, # July, 1991. # # 4. The insertion of leap seconds into UTC is currently the # responsibility of the International Earth Rotation Service, # which is located at the Paris Observatory: # # Central Bureau of IERS # 61, Avenue de l'Observatoire # 75014 Paris, France. # # Leap seconds are announced by the IERS in its Bulletin C # # See hpiers.obspm.fr or www.iers.org for more details. # # All national laboratories and timing centers use the # data from the BIPM and the IERS to construct their # local realizations of UTC. # # Although the definition also includes the possibility # of dropping seconds ("negative" leap seconds), this has # never been done and is unlikely to be necessary in the # foreseeable future. # # 5. If your system keeps time as the number of seconds since # some epoch (e.g., NTP timestamps), then the algorithm for # assigning a UTC time stamp to an event that happens during a positive # leap second is not well defined. The official name of that leap # second is 23:59:60, but there is no way of representing that time # in these systems. # Many systems of this type effectively stop the system clock for # one second during the leap second and use a time that is equivalent # to 23:59:59 UTC twice. For these systems, the corresponding TAI # timestamp would be obtained by advancing to the next entry in the # following table when the time equivalent to 23:59:59 UTC # is used for the second time. Thus the leap second which # occurred on 30 June 1972 at 23:59:59 UTC would have TAI # timestamps computed as follows: # # ... # 30 June 1972 23:59:59 (2287785599, first time): TAI= UTC + 10 seconds # 30 June 1972 23:59:60 (2287785599,second time): TAI= UTC + 11 seconds # 1 July 1972 00:00:00 (2287785600) TAI= UTC + 11 seconds # ... # # If your system realizes the leap second by repeating 00:00:00 UTC twice # (this is possible but not usual), then the advance to the next entry # in the table must occur the second time that a time equivlent to # 00:00:00 UTC is used. Thus, using the same example as above: # # ... # 30 June 1972 23:59:59 (2287785599): TAI= UTC + 10 seconds # 30 June 1972 23:59:60 (2287785600, first time): TAI= UTC + 10 seconds # 1 July 1972 00:00:00 (2287785600,second time): TAI= UTC + 11 seconds # ... # # in both cases the use of timestamps based on TAI produces a smooth # time scale with no discontinuity in the time interval. # # This complexity would not be needed for negative leap seconds (if they # are ever used). The UTC time would skip 23:59:59 and advance from # 23:59:58 to 00:00:00 in that case. The TAI offset would decrease by # 1 second at the same instant. This is a much easier situation to deal # with, since the difficulty of unambiguously representing the epoch # during the leap second does not arise. # # Questions or comments to: # Jeff Prillaman # Time Service Department # US Naval Observatory # Washington, DC # jeffrey.prillaman@usno.navy.mil # -# Last Update of leap second values: 11 Jan 2016 +# Last Update of leap second values: 6 Jul 2016 # # The following line shows this last update date in NTP timestamp # format. This is the date on which the most recent change to # the leap second data was added to the file. This line can # be identified by the unique pair of characters in the first two # columns as shown below. # -#$ 3661459200 +#$ 3676752000 # # The data in this file will be updated periodically as new leap # seconds are announced. In addition to being entered on the line # above, the update time (in NTP format) will be added to the basic # file name leap-seconds to form the name leap-seconds.. # In addition, the generic name leap-seconds.list will always point to # the most recent version of the file. # # This update procedure will be performed only when a new leap second # is announced. # # The following entry specifies the expiration date of the data # in this file in units of seconds since 1900.0. This expiration date # will be changed at least twice per year whether or not a new leap # second is announced. These semi-annual changes will be made no # later than 1 June and 1 December of each year to indicate what # action (if any) is to be taken on 30 June and 31 December, # respectively. (These are the customary effective dates for new # leap seconds.) This expiration date will be identified by a # unique pair of characters in columns 1 and 2 as shown below. # In the unlikely event that a leap second is announced with an # effective date other than 30 June or 31 December, then this # file will be edited to include that leap second as soon as it is # announced or at least one month before the effective date # (whichever is later). # If an announcement by the IERS specifies that no leap second is # scheduled, then only the expiration date of the file will # be advanced to show that the information in the file is still # current -- the update time stamp, the data and the name of the file # will not change. # -# Updated through IERS Bulletin C 51 -# File expires on: 1 Dec 2016 +# Updated through IERS Bulletin C 52 +# File expires on: 1 Jun 2017 # -#@ 3689539200 +#@ 3705264000 # 2272060800 10 # 1 Jan 1972 2287785600 11 # 1 Jul 1972 2303683200 12 # 1 Jan 1973 2335219200 13 # 1 Jan 1974 2366755200 14 # 1 Jan 1975 2398291200 15 # 1 Jan 1976 2429913600 16 # 1 Jan 1977 2461449600 17 # 1 Jan 1978 2492985600 18 # 1 Jan 1979 2524521600 19 # 1 Jan 1980 2571782400 20 # 1 Jul 1981 2603318400 21 # 1 Jul 1982 2634854400 22 # 1 Jul 1983 2698012800 23 # 1 Jul 1985 2776982400 24 # 1 Jan 1988 2840140800 25 # 1 Jan 1990 2871676800 26 # 1 Jan 1991 2918937600 27 # 1 Jul 1992 2950473600 28 # 1 Jul 1993 2982009600 29 # 1 Jul 1994 3029443200 30 # 1 Jan 1996 3076704000 31 # 1 Jul 1997 3124137600 32 # 1 Jan 1999 3345062400 33 # 1 Jan 2006 3439756800 34 # 1 Jan 2009 3550089600 35 # 1 Jul 2012 3644697600 36 # 1 Jul 2015 +3692217600 37 # 1 Jan 2017 # # the following special comment contains the # hash value of the data in this file computed # use the secure hash algorithm as specified # by FIPS 180-1. See the files in ~/sha for # the details of how this hash value is # computed. Note that the hash computation # ignores comments and whitespace characters # in data lines. It includes the NTP values # of both the last modification time and the # expiration time of the file, but not the # white space on those lines. # the hash line is also ignored in the # computation. # -#h 63b4df04 0907d94f 2dadb7a1 684f7767 2a372421 +#h 63f8fea8 587c099d abcf130a ad525eae 3e105052 # Index: user/alc/PQ_LAUNDRY/gnu/usr.bin/binutils =================================================================== --- user/alc/PQ_LAUNDRY/gnu/usr.bin/binutils (revision 303205) +++ user/alc/PQ_LAUNDRY/gnu/usr.bin/binutils (revision 303206) Property changes on: user/alc/PQ_LAUNDRY/gnu/usr.bin/binutils ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/gnu/usr.bin/binutils:r300263-303204 Index: user/alc/PQ_LAUNDRY/gnu/usr.bin/gdb =================================================================== --- user/alc/PQ_LAUNDRY/gnu/usr.bin/gdb (revision 303205) +++ user/alc/PQ_LAUNDRY/gnu/usr.bin/gdb (revision 303206) Property changes on: user/alc/PQ_LAUNDRY/gnu/usr.bin/gdb ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/gnu/usr.bin/gdb:r300263-303204 Index: user/alc/PQ_LAUNDRY/lib/libc/gen/glob.c =================================================================== --- user/alc/PQ_LAUNDRY/lib/libc/gen/glob.c (revision 303205) +++ user/alc/PQ_LAUNDRY/lib/libc/gen/glob.c (revision 303206) @@ -1,1030 +1,1086 @@ /* * Copyright (c) 1989, 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Guido van Rossum. * * Copyright (c) 2011 The FreeBSD Foundation * All rights reserved. * Portions of this software were developed by David Chisnall * under sponsorship from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #if defined(LIBC_SCCS) && !defined(lint) static char sccsid[] = "@(#)glob.c 8.3 (Berkeley) 10/13/93"; #endif /* LIBC_SCCS and not lint */ #include __FBSDID("$FreeBSD$"); /* * glob(3) -- a superset of the one defined in POSIX 1003.2. * * The [!...] convention to negate a range is supported (SysV, Posix, ksh). * * Optional extra services, controlled by flags not defined by POSIX: * * GLOB_QUOTE: * Escaping convention: \ inhibits any special meaning the following * character might have (except \ at end of string is retained). * GLOB_MAGCHAR: * Set in gl_flags if pattern contained a globbing character. * GLOB_NOMAGIC: * Same as GLOB_NOCHECK, but it will only append pattern if it did * not contain any magic characters. [Used in csh style globbing] * GLOB_ALTDIRFUNC: * Use alternately specified directory access functions. * GLOB_TILDE: * expand ~user/foo to the /home/dir/of/user/foo * GLOB_BRACE: * expand {1,2}{a,b} to 1a 1b 2a 2b * gl_matchc: * Number of matches in the current invocation of glob. */ /* * Some notes on multibyte character support: * 1. Patterns with illegal byte sequences match nothing - even if * GLOB_NOCHECK is specified. * 2. Illegal byte sequences in filenames are handled by treating them as * single-byte characters with a values of such bytes of the sequence * cast to wchar_t. * 3. State-dependent encodings are not currently supported. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "collate.h" /* * glob(3) expansion limits. Stop the expansion if any of these limits * is reached. This caps the runtime in the face of DoS attacks. See * also CVE-2010-2632 */ #define GLOB_LIMIT_BRACE 128 /* number of brace calls */ #define GLOB_LIMIT_PATH 65536 /* number of path elements */ #define GLOB_LIMIT_READDIR 16384 /* number of readdirs */ #define GLOB_LIMIT_STAT 1024 /* number of stat system calls */ #define GLOB_LIMIT_STRING ARG_MAX /* maximum total size for paths */ struct glob_limit { size_t l_brace_cnt; size_t l_path_lim; size_t l_readdir_cnt; size_t l_stat_cnt; size_t l_string_cnt; }; #define DOT L'.' #define EOS L'\0' #define LBRACKET L'[' #define NOT L'!' #define QUESTION L'?' #define QUOTE L'\\' #define RANGE L'-' #define RBRACKET L']' #define SEP L'/' #define STAR L'*' #define TILDE L'~' #define LBRACE L'{' #define RBRACE L'}' -#define SLASH L'/' #define COMMA L',' -#ifndef DEBUG - #define M_QUOTE 0x8000000000ULL #define M_PROTECT 0x4000000000ULL #define M_MASK 0xffffffffffULL #define M_CHAR 0x00ffffffffULL typedef uint_fast64_t Char; -#else - -#define M_QUOTE 0x80 -#define M_PROTECT 0x40 -#define M_MASK 0xff -#define M_CHAR 0x7f - -typedef char Char; - -#endif - - #define CHAR(c) ((Char)((c)&M_CHAR)) #define META(c) ((Char)((c)|M_QUOTE)) +#define UNPROT(c) ((c) & ~M_PROTECT) #define M_ALL META(L'*') #define M_END META(L']') #define M_NOT META(L'!') #define M_ONE META(L'?') #define M_RNG META(L'-') #define M_SET META(L'[') #define ismeta(c) (((c)&M_QUOTE) != 0) +#define isprot(c) (((c)&M_PROTECT) != 0) static int compare(const void *, const void *); -static int g_Ctoc(const Char *, char *, size_t); +static int g_Ctoc(const Char *, char *, size_t, int); static int g_lstat(Char *, struct stat *, glob_t *); static DIR *g_opendir(Char *, glob_t *); static const Char *g_strchr(const Char *, wchar_t); #ifdef notdef static Char *g_strcat(Char *, const Char *); #endif static int g_stat(Char *, struct stat *, glob_t *); -static int glob0(const Char *, glob_t *, struct glob_limit *); +static int glob0(const Char *, glob_t *, struct glob_limit *, int); static int glob1(Char *, glob_t *, struct glob_limit *); static int glob2(Char *, Char *, Char *, Char *, glob_t *, struct glob_limit *); static int glob3(Char *, Char *, Char *, Char *, Char *, glob_t *, struct glob_limit *); -static int globextend(const Char *, glob_t *, struct glob_limit *); +static int globextend(const Char *, glob_t *, struct glob_limit *, int); static const Char * globtilde(const Char *, Char *, size_t, glob_t *); +static int globexp0(const Char *, glob_t *, struct glob_limit *); static int globexp1(const Char *, glob_t *, struct glob_limit *); -static int globexp2(const Char *, const Char *, glob_t *, int *, +static int globexp2(const Char *, const Char *, glob_t *, struct glob_limit *); static int match(Char *, Char *, Char *); #ifdef DEBUG static void qprintf(const char *, Char *); #endif int glob(const char * __restrict pattern, int flags, int (*errfunc)(const char *, int), glob_t * __restrict pglob) { struct glob_limit limit = { 0, 0, 0, 0, 0 }; const char *patnext; Char *bufnext, *bufend, patbuf[MAXPATHLEN], prot; mbstate_t mbs; wchar_t wc; size_t clen; + int too_long; patnext = pattern; if (!(flags & GLOB_APPEND)) { pglob->gl_pathc = 0; pglob->gl_pathv = NULL; if (!(flags & GLOB_DOOFFS)) pglob->gl_offs = 0; } if (flags & GLOB_LIMIT) { limit.l_path_lim = pglob->gl_matchc; if (limit.l_path_lim == 0) limit.l_path_lim = GLOB_LIMIT_PATH; } pglob->gl_flags = flags & ~GLOB_MAGCHAR; pglob->gl_errfunc = errfunc; pglob->gl_matchc = 0; bufnext = patbuf; bufend = bufnext + MAXPATHLEN - 1; + too_long = 1; if (flags & GLOB_NOESCAPE) { memset(&mbs, 0, sizeof(mbs)); - while (bufend - bufnext >= MB_CUR_MAX) { + while (bufnext <= bufend) { clen = mbrtowc(&wc, patnext, MB_LEN_MAX, &mbs); if (clen == (size_t)-1 || clen == (size_t)-2) return (GLOB_NOMATCH); - else if (clen == 0) + else if (clen == 0) { + too_long = 0; break; + } *bufnext++ = wc; patnext += clen; } } else { /* Protect the quoted characters. */ memset(&mbs, 0, sizeof(mbs)); - while (bufend - bufnext >= MB_CUR_MAX) { + while (bufnext <= bufend) { if (*patnext == '\\') { if (*++patnext == '\0') { - *bufnext++ = QUOTE | M_PROTECT; + *bufnext++ = QUOTE; continue; } prot = M_PROTECT; } else prot = 0; clen = mbrtowc(&wc, patnext, MB_LEN_MAX, &mbs); if (clen == (size_t)-1 || clen == (size_t)-2) return (GLOB_NOMATCH); - else if (clen == 0) + else if (clen == 0) { + too_long = 0; break; - *bufnext++ = wc | ((wc != DOT && wc != SEP) ? - prot : 0); + } + *bufnext++ = wc | prot; patnext += clen; } } + if (too_long) + return (GLOB_NOMATCH); *bufnext = EOS; if (flags & GLOB_BRACE) - return (globexp1(patbuf, pglob, &limit)); + return (globexp0(patbuf, pglob, &limit)); else - return (glob0(patbuf, pglob, &limit)); + return (glob0(patbuf, pglob, &limit, 1)); } +static int +globexp0(const Char *pattern, glob_t *pglob, struct glob_limit *limit) +{ + int rv; + size_t oldpathc; + + /* Protect a single {}, for find(1), like csh */ + if (pattern[0] == LBRACE && pattern[1] == RBRACE && pattern[2] == EOS) { + if ((pglob->gl_flags & GLOB_LIMIT) && + limit->l_brace_cnt++ >= GLOB_LIMIT_BRACE) { + errno = 0; + return (GLOB_NOSPACE); + } + return (glob0(pattern, pglob, limit, 1)); + } + + oldpathc = pglob->gl_pathc; + + if ((rv = globexp1(pattern, pglob, limit)) != 0) + return rv; + /* + * If there was no match we are going to append the pattern + * if GLOB_NOCHECK was specified or if GLOB_NOMAGIC was specified + * and the pattern did not contain any magic characters + * GLOB_NOMAGIC is there just for compatibility with csh. + */ + if (pglob->gl_pathc == oldpathc) { + if (((pglob->gl_flags & GLOB_NOCHECK) || + ((pglob->gl_flags & GLOB_NOMAGIC) && + !(pglob->gl_flags & GLOB_MAGCHAR)))) + return (globextend(pattern, pglob, limit, 1)); + else + return (GLOB_NOMATCH); + } + if (!(pglob->gl_flags & GLOB_NOSORT)) + qsort(pglob->gl_pathv + pglob->gl_offs + oldpathc, + pglob->gl_pathc - oldpathc, sizeof(char *), compare); + return (0); +} + /* * Expand recursively a glob {} pattern. When there is no more expansion * invoke the standard globbing routine to glob the rest of the magic * characters */ static int globexp1(const Char *pattern, glob_t *pglob, struct glob_limit *limit) { - const Char* ptr = pattern; - int rv; + const Char* ptr; - if ((pglob->gl_flags & GLOB_LIMIT) && - limit->l_brace_cnt++ >= GLOB_LIMIT_BRACE) { - errno = 0; - return (GLOB_NOSPACE); + if ((ptr = g_strchr(pattern, LBRACE)) != NULL) { + if ((pglob->gl_flags & GLOB_LIMIT) && + limit->l_brace_cnt++ >= GLOB_LIMIT_BRACE) { + errno = 0; + return (GLOB_NOSPACE); + } + return (globexp2(ptr, pattern, pglob, limit)); } - /* Protect a single {}, for find(1), like csh */ - if (pattern[0] == LBRACE && pattern[1] == RBRACE && pattern[2] == EOS) - return glob0(pattern, pglob, limit); - - while ((ptr = g_strchr(ptr, LBRACE)) != NULL) - if (!globexp2(ptr, pattern, pglob, &rv, limit)) - return rv; - - return glob0(pattern, pglob, limit); + return (glob0(pattern, pglob, limit, 0)); } /* * Recursive brace globbing helper. Tries to expand a single brace. * If it succeeds then it invokes globexp1 with the new pattern. * If it fails then it tries to glob the rest of the pattern and returns. */ static int -globexp2(const Char *ptr, const Char *pattern, glob_t *pglob, int *rv, +globexp2(const Char *ptr, const Char *pattern, glob_t *pglob, struct glob_limit *limit) { - int i; + int i, rv; Char *lm, *ls; const Char *pe, *pm, *pm1, *pl; Char patbuf[MAXPATHLEN]; /* copy part up to the brace */ for (lm = patbuf, pm = pattern; pm != ptr; *lm++ = *pm++) continue; *lm = EOS; ls = lm; /* Find the balanced brace */ - for (i = 0, pe = ++ptr; *pe; pe++) + for (i = 0, pe = ++ptr; *pe != EOS; pe++) if (*pe == LBRACKET) { /* Ignore everything between [] */ for (pm = pe++; *pe != RBRACKET && *pe != EOS; pe++) continue; if (*pe == EOS) { /* * We could not find a matching RBRACKET. * Ignore and just look for RBRACE */ pe = pm; } } else if (*pe == LBRACE) i++; else if (*pe == RBRACE) { if (i == 0) break; i--; } /* Non matching braces; just glob the pattern */ - if (i != 0 || *pe == EOS) { - *rv = glob0(patbuf, pglob, limit); - return (0); - } + if (i != 0 || *pe == EOS) + return (glob0(pattern, pglob, limit, 0)); for (i = 0, pl = pm = ptr; pm <= pe; pm++) switch (*pm) { case LBRACKET: /* Ignore everything between [] */ for (pm1 = pm++; *pm != RBRACKET && *pm != EOS; pm++) continue; if (*pm == EOS) { /* * We could not find a matching RBRACKET. * Ignore and just look for RBRACE */ pm = pm1; } break; case LBRACE: i++; break; case RBRACE: if (i) { i--; break; } /* FALLTHROUGH */ case COMMA: if (i && *pm == COMMA) break; else { /* Append the current string */ for (lm = ls; (pl < pm); *lm++ = *pl++) continue; /* * Append the rest of the pattern after the * closing brace */ for (pl = pe + 1; (*lm++ = *pl++) != EOS;) continue; /* Expand the current pattern */ #ifdef DEBUG qprintf("globexp2:", patbuf); #endif - *rv = globexp1(patbuf, pglob, limit); + rv = globexp1(patbuf, pglob, limit); + if (rv) + return (rv); /* move after the comma, to the next string */ pl = pm + 1; } break; default: break; } - *rv = 0; return (0); } /* * expand tilde from the passwd file. */ static const Char * globtilde(const Char *pattern, Char *patbuf, size_t patbuf_len, glob_t *pglob) { struct passwd *pwd; char *h, *sc; const Char *p; Char *b, *eb; wchar_t wc; wchar_t wbuf[MAXPATHLEN]; wchar_t *wbufend, *dc; size_t clen; mbstate_t mbs; int too_long; if (*pattern != TILDE || !(pglob->gl_flags & GLOB_TILDE)) return (pattern); /* * Copy up to the end of the string or / */ eb = &patbuf[patbuf_len - 1]; for (p = pattern + 1, b = patbuf; - b < eb && *p != EOS && *p != SLASH; *b++ = *p++) + b < eb && *p != EOS && UNPROT(*p) != SEP; *b++ = *p++) continue; - if (*p != EOS && *p != SLASH) + if (*p != EOS && UNPROT(*p) != SEP) return (NULL); *b = EOS; h = NULL; if (patbuf[0] == EOS) { /* * handle a plain ~ or ~/ by expanding $HOME first (iff * we're not running setuid or setgid) and then trying * the password file */ if (issetugid() != 0 || (h = getenv("HOME")) == NULL) { if (((h = getlogin()) != NULL && (pwd = getpwnam(h)) != NULL) || (pwd = getpwuid(getuid())) != NULL) h = pwd->pw_dir; else return (pattern); } } else { /* * Expand a ~user */ - if (g_Ctoc(patbuf, (char *)wbuf, sizeof(wbuf))) + if (g_Ctoc(patbuf, (char *)wbuf, sizeof(wbuf), 0)) return (NULL); if ((pwd = getpwnam((char *)wbuf)) == NULL) return (pattern); else h = pwd->pw_dir; } /* Copy the home directory */ dc = wbuf; sc = h; wbufend = wbuf + MAXPATHLEN - 1; too_long = 1; memset(&mbs, 0, sizeof(mbs)); while (dc <= wbufend) { clen = mbrtowc(&wc, sc, MB_LEN_MAX, &mbs); if (clen == (size_t)-1 || clen == (size_t)-2) { /* XXX See initial comment #2. */ wc = (unsigned char)*sc; clen = 1; memset(&mbs, 0, sizeof(mbs)); } if ((*dc++ = wc) == EOS) { too_long = 0; break; } sc += clen; } if (too_long) return (NULL); dc = wbuf; - for (b = patbuf; b < eb && *dc != EOS; b++, dc++) - *b = *dc | ((*dc != DOT && *dc != SEP) ? M_PROTECT : 0); + for (b = patbuf; b < eb && *dc != EOS; *b++ = *dc++ | M_PROTECT) + continue; if (*dc != EOS) return (NULL); /* Append the rest of the pattern */ if (*p != EOS) { too_long = 1; while (b <= eb) { if ((*b++ = *p++) == EOS) { too_long = 0; break; } } if (too_long) return (NULL); } else *b = EOS; return (patbuf); } /* * The main glob() routine: compiles the pattern (optionally processing * quotes), calls glob1() to do the real pattern matching, and finally * sorts the list (unless unsorted operation is requested). Returns 0 * if things went well, nonzero if errors occurred. */ static int -glob0(const Char *pattern, glob_t *pglob, struct glob_limit *limit) +glob0(const Char *pattern, glob_t *pglob, struct glob_limit *limit, + int final) { const Char *qpatnext; int err; size_t oldpathc; Char *bufnext, c, patbuf[MAXPATHLEN]; qpatnext = globtilde(pattern, patbuf, MAXPATHLEN, pglob); if (qpatnext == NULL) { errno = 0; return (GLOB_NOSPACE); } oldpathc = pglob->gl_pathc; bufnext = patbuf; /* We don't need to check for buffer overflow any more. */ while ((c = *qpatnext++) != EOS) { switch (c) { case LBRACKET: c = *qpatnext; if (c == NOT) ++qpatnext; if (*qpatnext == EOS || g_strchr(qpatnext+1, RBRACKET) == NULL) { *bufnext++ = LBRACKET; if (c == NOT) --qpatnext; break; } *bufnext++ = M_SET; if (c == NOT) *bufnext++ = M_NOT; c = *qpatnext++; do { *bufnext++ = CHAR(c); if (*qpatnext == RANGE && (c = qpatnext[1]) != RBRACKET) { *bufnext++ = M_RNG; *bufnext++ = CHAR(c); qpatnext += 2; } } while ((c = *qpatnext++) != RBRACKET); pglob->gl_flags |= GLOB_MAGCHAR; *bufnext++ = M_END; break; case QUESTION: pglob->gl_flags |= GLOB_MAGCHAR; *bufnext++ = M_ONE; break; case STAR: pglob->gl_flags |= GLOB_MAGCHAR; /* collapse adjacent stars to one, * to avoid exponential behavior */ if (bufnext == patbuf || bufnext[-1] != M_ALL) *bufnext++ = M_ALL; break; default: *bufnext++ = CHAR(c); break; } } *bufnext = EOS; #ifdef DEBUG qprintf("glob0:", patbuf); #endif if ((err = glob1(patbuf, pglob, limit)) != 0) return(err); - /* - * If there was no match we are going to append the pattern - * if GLOB_NOCHECK was specified or if GLOB_NOMAGIC was specified - * and the pattern did not contain any magic characters - * GLOB_NOMAGIC is there just for compatibility with csh. - */ - if (pglob->gl_pathc == oldpathc) { - if (((pglob->gl_flags & GLOB_NOCHECK) || - ((pglob->gl_flags & GLOB_NOMAGIC) && - !(pglob->gl_flags & GLOB_MAGCHAR)))) - return (globextend(pattern, pglob, limit)); - else - return (GLOB_NOMATCH); + if (final) { + /* + * If there was no match we are going to append the pattern + * if GLOB_NOCHECK was specified or if GLOB_NOMAGIC was specified + * and the pattern did not contain any magic characters + * GLOB_NOMAGIC is there just for compatibility with csh. + */ + if (pglob->gl_pathc == oldpathc) { + if (((pglob->gl_flags & GLOB_NOCHECK) || + ((pglob->gl_flags & GLOB_NOMAGIC) && + !(pglob->gl_flags & GLOB_MAGCHAR)))) + return (globextend(pattern, pglob, limit, 1)); + else + return (GLOB_NOMATCH); + } + if (!(pglob->gl_flags & GLOB_NOSORT)) + qsort(pglob->gl_pathv + pglob->gl_offs + oldpathc, + pglob->gl_pathc - oldpathc, sizeof(char *), compare); } - if (!(pglob->gl_flags & GLOB_NOSORT)) - qsort(pglob->gl_pathv + pglob->gl_offs + oldpathc, - pglob->gl_pathc - oldpathc, sizeof(char *), compare); return (0); } static int compare(const void *p, const void *q) { return (strcoll(*(char **)p, *(char **)q)); } static int glob1(Char *pattern, glob_t *pglob, struct glob_limit *limit) { Char pathbuf[MAXPATHLEN]; /* A null pathname is invalid -- POSIX 1003.1 sect. 2.4. */ if (*pattern == EOS) return (0); return (glob2(pathbuf, pathbuf, pathbuf + MAXPATHLEN - 1, pattern, pglob, limit)); } /* * The functions glob2 and glob3 are mutually recursive; there is one level * of recursion for each segment in the pattern that contains one or more * meta characters. */ static int glob2(Char *pathbuf, Char *pathend, Char *pathend_last, Char *pattern, glob_t *pglob, struct glob_limit *limit) { struct stat sb; Char *p, *q; int anymeta; /* * Loop over pattern segments until end of pattern or until * segment with meta character found. */ for (anymeta = 0;;) { if (*pattern == EOS) { /* End of pattern? */ *pathend = EOS; if (g_lstat(pathbuf, &sb, pglob)) return (0); if ((pglob->gl_flags & GLOB_LIMIT) && limit->l_stat_cnt++ >= GLOB_LIMIT_STAT) { errno = 0; return (GLOB_NOSPACE); } - if (((pglob->gl_flags & GLOB_MARK) && - pathend[-1] != SEP) && (S_ISDIR(sb.st_mode) - || (S_ISLNK(sb.st_mode) && - (g_stat(pathbuf, &sb, pglob) == 0) && + if ((pglob->gl_flags & GLOB_MARK) && + UNPROT(pathend[-1]) != SEP && + (S_ISDIR(sb.st_mode) || + (S_ISLNK(sb.st_mode) && + g_stat(pathbuf, &sb, pglob) == 0 && S_ISDIR(sb.st_mode)))) { if (pathend + 1 > pathend_last) { errno = 0; return (GLOB_NOSPACE); } *pathend++ = SEP; *pathend = EOS; } ++pglob->gl_matchc; - return (globextend(pathbuf, pglob, limit)); + return (globextend(pathbuf, pglob, limit, 0)); } /* Find end of next segment, copy tentatively to pathend. */ q = pathend; p = pattern; - while (*p != EOS && *p != SEP) { + while (*p != EOS && UNPROT(*p) != SEP) { if (ismeta(*p)) anymeta = 1; if (q + 1 > pathend_last) { errno = 0; return (GLOB_NOSPACE); } *q++ = *p++; } if (!anymeta) { /* No expansion, do next segment. */ pathend = q; pattern = p; - while (*pattern == SEP) { + while (UNPROT(*pattern) == SEP) { if (pathend + 1 > pathend_last) { errno = 0; return (GLOB_NOSPACE); } *pathend++ = *pattern++; } } else /* Need expansion, recurse. */ return (glob3(pathbuf, pathend, pathend_last, pattern, p, pglob, limit)); } /* NOTREACHED */ } static int glob3(Char *pathbuf, Char *pathend, Char *pathend_last, Char *pattern, Char *restpattern, glob_t *pglob, struct glob_limit *limit) { struct dirent *dp; DIR *dirp; - int err; + int err, too_long, saverrno; char buf[MAXPATHLEN + MB_LEN_MAX - 1]; struct dirent *(*readdirfunc)(DIR *); - errno = 0; - if (pathend > pathend_last) + if (pathend > pathend_last) { + errno = 0; return (GLOB_NOSPACE); + } *pathend = EOS; + if (pglob->gl_errfunc != NULL && + g_Ctoc(pathbuf, buf, sizeof(buf), 0)) { + errno = 0; + return (GLOB_NOSPACE); + } + errno = 0; if ((dirp = g_opendir(pathbuf, pglob)) == NULL) { if (errno == ENOENT || errno == ENOTDIR) return (0); - if (pglob->gl_errfunc != NULL) { - if (g_Ctoc(pathbuf, buf, sizeof(buf))) { - errno = 0; - return (GLOB_NOSPACE); - } - if (pglob->gl_errfunc(buf, errno)) - return (GLOB_ABORTED); - } - if (pglob->gl_flags & GLOB_ERR) + if ((pglob->gl_errfunc != NULL && + pglob->gl_errfunc(buf, errno)) || + (pglob->gl_flags & GLOB_ERR)) return (GLOB_ABORTED); return (0); } err = 0; /* pglob->gl_readdir takes a void *, fix this manually */ if (pglob->gl_flags & GLOB_ALTDIRFUNC) readdirfunc = (struct dirent *(*)(DIR *))pglob->gl_readdir; else readdirfunc = readdir; + errno = 0; /* Search directory for matching names. */ while ((dp = (*readdirfunc)(dirp)) != NULL) { char *sc; Char *dc; wchar_t wc; size_t clen; mbstate_t mbs; if ((pglob->gl_flags & GLOB_LIMIT) && limit->l_readdir_cnt++ >= GLOB_LIMIT_READDIR) { errno = 0; - if (pathend + 1 > pathend_last) - err = GLOB_NOSPACE; - else { - *pathend++ = SEP; - *pathend = EOS; - err = GLOB_NOSPACE; - } + err = GLOB_NOSPACE; break; } /* Initial DOT must be matched literally. */ - if (dp->d_name[0] == '.' && *pattern != DOT) + if (dp->d_name[0] == '.' && UNPROT(*pattern) != DOT) continue; memset(&mbs, 0, sizeof(mbs)); dc = pathend; sc = dp->d_name; - while (dc < pathend_last) { + too_long = 1; + while (dc <= pathend_last) { clen = mbrtowc(&wc, sc, MB_LEN_MAX, &mbs); if (clen == (size_t)-1 || clen == (size_t)-2) { /* XXX See initial comment #2. */ wc = (unsigned char)*sc; clen = 1; memset(&mbs, 0, sizeof(mbs)); } - if ((*dc++ = wc) == EOS) + if ((*dc++ = wc) == EOS) { + too_long = 0; break; + } sc += clen; } - if (!match(pathend, pattern, restpattern)) { + if (too_long || !match(pathend, pattern, restpattern)) { *pathend = EOS; continue; } err = glob2(pathbuf, --dc, pathend_last, restpattern, pglob, limit); if (err) break; + errno = 0; } + saverrno = errno; if (pglob->gl_flags & GLOB_ALTDIRFUNC) (*pglob->gl_closedir)(dirp); else closedir(dirp); - return (err); + errno = saverrno; + + if (err) + return (err); + + if (dp == NULL && errno != 0 && ((pglob->gl_errfunc != NULL && + pglob->gl_errfunc(buf, errno)) || (pglob->gl_flags & GLOB_ERR))) + return (GLOB_ABORTED); + + return (0); } /* * Extend the gl_pathv member of a glob_t structure to accommodate a new item, * add the new item, and update gl_pathc. * * This assumes the BSD realloc, which only copies the block when its size * crosses a power-of-two boundary; for v7 realloc, this would cause quadratic * behavior. * * Return 0 if new item added, error code if memory couldn't be allocated. * * Invariant of the glob_t structure: * Either gl_pathc is zero and gl_pathv is NULL; or gl_pathc > 0 and * gl_pathv points to (gl_offs + gl_pathc + 1) items. */ static int -globextend(const Char *path, glob_t *pglob, struct glob_limit *limit) +globextend(const Char *path, glob_t *pglob, struct glob_limit *limit, + int prot) { char **pathv; size_t i, newsize, len; char *copy; const Char *p; if ((pglob->gl_flags & GLOB_LIMIT) && pglob->gl_matchc > limit->l_path_lim) { errno = 0; return (GLOB_NOSPACE); } newsize = sizeof(*pathv) * (2 + pglob->gl_pathc + pglob->gl_offs); /* realloc(NULL, newsize) is equivalent to malloc(newsize). */ pathv = realloc((void *)pglob->gl_pathv, newsize); if (pathv == NULL) return (GLOB_NOSPACE); if (pglob->gl_pathv == NULL && pglob->gl_offs > 0) { /* first time around -- clear initial gl_offs items */ pathv += pglob->gl_offs; for (i = pglob->gl_offs + 1; --i > 0; ) *--pathv = NULL; } pglob->gl_pathv = pathv; - for (p = path; *p++;) + for (p = path; *p++ != EOS;) continue; len = MB_CUR_MAX * (size_t)(p - path); /* XXX overallocation */ - limit->l_string_cnt += len; - if ((pglob->gl_flags & GLOB_LIMIT) && - limit->l_string_cnt >= GLOB_LIMIT_STRING) { - errno = 0; - return (GLOB_NOSPACE); - } + if (prot) + len += (size_t)(p - path) - 1; if ((copy = malloc(len)) != NULL) { - if (g_Ctoc(path, copy, len)) { + if (g_Ctoc(path, copy, len, prot)) { free(copy); errno = 0; return (GLOB_NOSPACE); } + limit->l_string_cnt += strlen(copy) + 1; + if ((pglob->gl_flags & GLOB_LIMIT) && + limit->l_string_cnt >= GLOB_LIMIT_STRING) { + free(copy); + errno = 0; + return (GLOB_NOSPACE); + } pathv[pglob->gl_offs + pglob->gl_pathc++] = copy; } pathv[pglob->gl_offs + pglob->gl_pathc] = NULL; return (copy == NULL ? GLOB_NOSPACE : 0); } /* * pattern matching function for filenames. Each occurrence of the * * pattern causes a recursion level. */ static int match(Char *name, Char *pat, Char *patend) { int ok, negate_range; Char c, k; struct xlocale_collate *table = (struct xlocale_collate*)__get_locale()->components[XLC_COLLATE]; while (pat < patend) { c = *pat++; switch (c & M_MASK) { case M_ALL: if (pat == patend) return (1); do if (match(name, pat, patend)) return (1); while (*name++ != EOS); return (0); case M_ONE: if (*name++ == EOS) return (0); break; case M_SET: ok = 0; if ((k = *name++) == EOS) return (0); if ((negate_range = ((*pat & M_MASK) == M_NOT)) != 0) ++pat; while (((c = *pat++) & M_MASK) != M_END) if ((*pat & M_MASK) == M_RNG) { if (table->__collate_load_error ? CHAR(c) <= CHAR(k) && CHAR(k) <= CHAR(pat[1]) : __wcollate_range_cmp(CHAR(c), CHAR(k)) <= 0 && __wcollate_range_cmp(CHAR(k), CHAR(pat[1])) <= 0) ok = 1; pat += 2; } else if (c == k) ok = 1; if (ok == negate_range) return (0); break; default: if (*name++ != c) return (0); break; } } return (*name == EOS); } /* Free allocated data belonging to a glob_t structure. */ void globfree(glob_t *pglob) { size_t i; char **pp; if (pglob->gl_pathv != NULL) { pp = pglob->gl_pathv + pglob->gl_offs; for (i = pglob->gl_pathc; i--; ++pp) if (*pp) free(*pp); free(pglob->gl_pathv); pglob->gl_pathv = NULL; } } static DIR * g_opendir(Char *str, glob_t *pglob) { char buf[MAXPATHLEN + MB_LEN_MAX - 1]; if (*str == EOS) strcpy(buf, "."); else { - if (g_Ctoc(str, buf, sizeof(buf))) { + if (g_Ctoc(str, buf, sizeof(buf), 0)) { errno = ENAMETOOLONG; return (NULL); } } if (pglob->gl_flags & GLOB_ALTDIRFUNC) return ((*pglob->gl_opendir)(buf)); return (opendir(buf)); } static int g_lstat(Char *fn, struct stat *sb, glob_t *pglob) { char buf[MAXPATHLEN + MB_LEN_MAX - 1]; - if (g_Ctoc(fn, buf, sizeof(buf))) { + if (g_Ctoc(fn, buf, sizeof(buf), 0)) { errno = ENAMETOOLONG; return (-1); } if (pglob->gl_flags & GLOB_ALTDIRFUNC) return((*pglob->gl_lstat)(buf, sb)); return (lstat(buf, sb)); } static int g_stat(Char *fn, struct stat *sb, glob_t *pglob) { char buf[MAXPATHLEN + MB_LEN_MAX - 1]; - if (g_Ctoc(fn, buf, sizeof(buf))) { + if (g_Ctoc(fn, buf, sizeof(buf), 0)) { errno = ENAMETOOLONG; return (-1); } if (pglob->gl_flags & GLOB_ALTDIRFUNC) return ((*pglob->gl_stat)(buf, sb)); return (stat(buf, sb)); } static const Char * g_strchr(const Char *str, wchar_t ch) { do { if (*str == ch) return (str); } while (*str++); return (NULL); } static int -g_Ctoc(const Char *str, char *buf, size_t len) +g_Ctoc(const Char *str, char *buf, size_t len, int prot) { mbstate_t mbs; size_t clen; + Char Ch; + Ch = *str; memset(&mbs, 0, sizeof(mbs)); while (len >= MB_CUR_MAX) { - clen = wcrtomb(buf, CHAR(*str), &mbs); + if (prot && isprot(Ch)) { + Ch = UNPROT(Ch); + *buf++ = '\\'; + len--; + continue; + } + clen = wcrtomb(buf, CHAR(Ch), &mbs); if (clen == (size_t)-1) { /* XXX See initial comment #2. */ - *buf = (char)CHAR(*str); + *buf = (char)CHAR(Ch); clen = 1; memset(&mbs, 0, sizeof(mbs)); } - if (CHAR(*str) == EOS) + if (CHAR(Ch) == EOS) return (0); - str++; + Ch = *++str; buf += clen; len -= clen; } return (1); } #ifdef DEBUG static void qprintf(const char *str, Char *s) { Char *p; - (void)printf("%s:\n", str); - for (p = s; *p; p++) - (void)printf("%c", CHAR(*p)); - (void)printf("\n"); - for (p = s; *p; p++) - (void)printf("%c", *p & M_PROTECT ? '"' : ' '); - (void)printf("\n"); - for (p = s; *p; p++) - (void)printf("%c", ismeta(*p) ? '_' : ' '); - (void)printf("\n"); + (void)printf("%s\n", str); + if (s != NULL) { + for (p = s; *p != EOS; p++) + (void)printf("%c", (char)CHAR(*p)); + (void)printf("\n"); + for (p = s; *p != EOS; p++) + (void)printf("%c", (isprot(*p) ? '\\' : ' ')); + (void)printf("\n"); + for (p = s; *p != EOS; p++) + (void)printf("%c", (ismeta(*p) ? '_' : ' ')); + (void)printf("\n"); + } } #endif Index: user/alc/PQ_LAUNDRY/lib/libc/sys/aio_fsync.2 =================================================================== --- user/alc/PQ_LAUNDRY/lib/libc/sys/aio_fsync.2 (revision 303205) +++ user/alc/PQ_LAUNDRY/lib/libc/sys/aio_fsync.2 (revision 303206) @@ -1,181 +1,181 @@ .\" Copyright (c) 2013 Sergey Kandaurov .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd July 15, 2016 +.Dd July 21, 2016 .Dt AIO_FSYNC 2 .Os .Sh NAME .Nm aio_fsync .Nd asynchronous file synchronization (REALTIME) .Sh LIBRARY .Lb libc .Sh SYNOPSIS .In aio.h .Ft int .Fn aio_fsync "int op" "struct aiocb *iocb" .Sh DESCRIPTION The .Fn aio_fsync system call allows the calling process to move all modified data associated with the descriptor .Fa iocb->aio_fildes to a permanent storage device. The call returns immediately after the synchronization request has been enqueued to the descriptor; the synchronization may or may not have completed at the time the call returns. .Pp The .Fa op argument can only be set to .Dv O_SYNC to cause all currently queued I/O operations to be completed as if by a call to .Xr fsync 2 . .Pp If _POSIX_PRIORITIZED_IO is defined, and the descriptor supports it, then the enqueued operation is submitted at a priority equal to that of the calling process minus .Fa iocb->aio_reqprio . .Pp The .Fa iocb pointer may be subsequently used as an argument to .Fn aio_return and .Fn aio_error in order to determine return or error status for the enqueued operation while it is in progress. .Pp If the request could not be enqueued (generally due to invalid arguments), the call returns without having enqueued the request. .Pp The .Fa iocb->aio_sigevent structure can be used to request notification of the request's completion as described in .Xr aio 4 . .Sh RESTRICTIONS The asynchronous I/O Control Block structure pointed to by .Fa iocb must remain valid until the operation has completed. For this reason, use of auto (stack) variables for these objects is discouraged. .Pp The asynchronous I/O control buffer .Fa iocb should be zeroed before the .Fn aio_fsync call to avoid passing bogus context information to the kernel. .Pp Modifications of the Asynchronous I/O Control Block structure or the buffer contents after the request has been enqueued, but before the request has completed, are not allowed. .Sh RETURN VALUES .Rv -std aio_fsync .Sh ERRORS The .Fn aio_fsync system call will fail if: .Bl -tag -width Er .It Bq Er EAGAIN The request was not queued because of system resource limitations. .It Bq Er EINVAL The asynchronous notification method in .Fa iocb->aio_sigevent.sigev_notify is invalid or not supported. -.It Bq Er ENOSYS -The -.Fn aio_fsync -system call is not supported. +.It Bq Er EOPNOTSUPP +Asynchronous file synchronization operations on the file descriptor +.Fa iocb->aio_fildes +are unsafe and unsafe asynchronous I/O operations are disabled. .It Bq Er EINVAL A value of the .Fa op argument is not set to .Dv O_SYNC . .El .Pp The following conditions may be synchronously detected when the .Fn aio_fsync system call is made, or asynchronously, at any time thereafter. If they are detected at call time, .Fn aio_fsync returns -1 and sets .Va errno appropriately; otherwise the .Fn aio_return system call must be called, and will return -1, and .Fn aio_error must be called to determine the actual value that would have been returned in .Va errno . .Bl -tag -width Er .It Bq Er EBADF The .Fa iocb->aio_fildes argument is not a valid descriptor. .It Bq Er EINVAL This implementation does not support synchronized I/O for this file. .El .Pp If the request is successfully enqueued, but subsequently cancelled or an error occurs, the value returned by the .Fn aio_return system call is per the .Xr read 2 and .Xr write 2 system calls, and the value returned by the .Fn aio_error system call is one of the error returns from the .Xr read 2 or .Xr write 2 system calls. .Sh SEE ALSO .Xr aio_cancel 2 , .Xr aio_error 2 , .Xr aio_read 2 , .Xr aio_return 2 , .Xr aio_suspend 2 , .Xr aio_waitcomplete 2 , .Xr aio_write 2 , .Xr fsync 2 , .Xr sigevent 3 , .Xr siginfo 3 , .Xr aio 4 .Sh STANDARDS The .Fn aio_fsync system call is expected to conform to the .St -p1003.1 standard. .Sh HISTORY The .Fn aio_fsync system call first appeared in .Fx 7.0 . Index: user/alc/PQ_LAUNDRY/lib/libc/sys/aio_mlock.2 =================================================================== --- user/alc/PQ_LAUNDRY/lib/libc/sys/aio_mlock.2 (revision 303205) +++ user/alc/PQ_LAUNDRY/lib/libc/sys/aio_mlock.2 (revision 303206) @@ -1,144 +1,140 @@ .\" Copyright (c) 2013 Gleb Smirnoff .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd July 15, 2016 +.Dd July 21, 2016 .Dt AIO_MLOCK 2 .Os .Sh NAME .Nm aio_mlock .Nd asynchronous .Xr mlock 2 operation .Sh LIBRARY .Lb libc .Sh SYNOPSIS .In aio.h .Ft int .Fn aio_mlock "struct aiocb *iocb" .Sh DESCRIPTION The .Fn aio_mlock system call allows the calling process to lock into memory the physical pages associated with the virtual address range starting at .Fa iocb->aio_buf for .Fa iocb->aio_nbytes bytes. The call returns immediately after the locking request has been enqueued; the operation may or may not have completed at the time the call returns. .Pp The .Fa iocb pointer may be subsequently used as an argument to .Fn aio_return and .Fn aio_error in order to determine return or error status for the enqueued operation while it is in progress. .Pp If the request could not be enqueued (generally due to .Xr aio 4 limits), then the call returns without having enqueued the request. .Pp The .Fa iocb->aio_sigevent structure can be used to request notification of the request's completion as described in .Xr aio 4 . .Sh RESTRICTIONS The Asynchronous I/O Control Block structure pointed to by .Fa iocb and the buffer that the .Fa iocb->aio_buf member of that structure references must remain valid until the operation has completed. For this reason, use of auto (stack) variables for these objects is discouraged. .Pp The asynchronous I/O control buffer .Fa iocb should be zeroed before the .Fn aio_mlock call to avoid passing bogus context information to the kernel. .Pp Modifications of the Asynchronous I/O Control Block structure or the buffer contents after the request has been enqueued, but before the request has completed, are not allowed. .Sh RETURN VALUES .Rv -std aio_mlock .Sh ERRORS The .Fn aio_mlock system call will fail if: .Bl -tag -width Er .It Bq Er EAGAIN The request was not queued because of system resource limitations. .It Bq Er EINVAL The asynchronous notification method in .Fa iocb->aio_sigevent.sigev_notify is invalid or not supported. -.It Bq Er ENOSYS -The -.Fn aio_mlock -system call is not supported. .El .Pp If the request is successfully enqueued, but subsequently cancelled or an error occurs, the value returned by the .Fn aio_return system call is per the .Xr mlock 2 system call, and the value returned by the .Fn aio_error system call is one of the error returns from the .Xr mlock 2 system call, or .Er ECANCELED if the request was explicitly cancelled via a call to .Fn aio_cancel . .Sh SEE ALSO .Xr aio_cancel 2 , .Xr aio_error 2 , .Xr aio_return 2 , .Xr mlock 2 , .Xr sigevent 3 , .Xr aio 4 .Sh PORTABILITY The .Fn aio_mlock system call is a .Fx extension, and should not be used in portable code. .Sh HISTORY The .Fn aio_mlock system call first appeared in .Fx 10.0 . .Sh AUTHORS The system call was introduced by .An Gleb Smirnoff Aq Mt glebius@FreeBSD.org . Index: user/alc/PQ_LAUNDRY/lib/libc/sys/aio_read.2 =================================================================== --- user/alc/PQ_LAUNDRY/lib/libc/sys/aio_read.2 (revision 303205) +++ user/alc/PQ_LAUNDRY/lib/libc/sys/aio_read.2 (revision 303206) @@ -1,225 +1,225 @@ .\" Copyright (c) 1998 Terry Lambert .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd July 15, 2016 +.Dd July 21, 2016 .Dt AIO_READ 2 .Os .Sh NAME .Nm aio_read .Nd asynchronous read from a file (REALTIME) .Sh LIBRARY .Lb libc .Sh SYNOPSIS .In aio.h .Ft int .Fn aio_read "struct aiocb *iocb" .Sh DESCRIPTION The .Fn aio_read system call allows the calling process to read .Fa iocb->aio_nbytes from the descriptor .Fa iocb->aio_fildes beginning at the offset .Fa iocb->aio_offset into the buffer pointed to by .Fa iocb->aio_buf . The call returns immediately after the read request has been enqueued to the descriptor; the read may or may not have completed at the time the call returns. .Pp If _POSIX_PRIORITIZED_IO is defined, and the descriptor supports it, then the enqueued operation is submitted at a priority equal to that of the calling process minus .Fa iocb->aio_reqprio . .Pp The .Fa iocb->aio_lio_opcode argument is ignored by the .Fn aio_read system call. .Pp The .Fa iocb pointer may be subsequently used as an argument to .Fn aio_return and .Fn aio_error in order to determine return or error status for the enqueued operation while it is in progress. .Pp If the request could not be enqueued (generally due to invalid arguments), then the call returns without having enqueued the request. .Pp If the request is successfully enqueued, the value of .Fa iocb->aio_offset can be modified during the request as context, so this value must not be referenced after the request is enqueued. .Pp The .Fa iocb->aio_sigevent structure can be used to request notification of the request's completion as described in .Xr aio 4 . .Sh RESTRICTIONS The Asynchronous I/O Control Block structure pointed to by .Fa iocb and the buffer that the .Fa iocb->aio_buf member of that structure references must remain valid until the operation has completed. For this reason, use of auto (stack) variables for these objects is discouraged. .Pp The asynchronous I/O control buffer .Fa iocb should be zeroed before the .Fn aio_read call to avoid passing bogus context information to the kernel. .Pp Modifications of the Asynchronous I/O Control Block structure or the buffer contents after the request has been enqueued, but before the request has completed, are not allowed. .Pp If the file offset in .Fa iocb->aio_offset is past the offset maximum for .Fa iocb->aio_fildes , no I/O will occur. .Sh RETURN VALUES .Rv -std aio_read .Sh DIAGNOSTICS None. .Sh ERRORS The .Fn aio_read system call will fail if: .Bl -tag -width Er .It Bq Er EAGAIN The request was not queued because of system resource limitations. .It Bq Er EINVAL The asynchronous notification method in .Fa iocb->aio_sigevent.sigev_notify is invalid or not supported. -.It Bq Er ENOSYS -The -.Fn aio_read -system call is not supported. +.It Bq Er EOPNOTSUPP +Asynchronous read operations on the file descriptor +.Fa iocb->aio_fildes +are unsafe and unsafe asynchronous I/O operations are disabled. .El .Pp The following conditions may be synchronously detected when the .Fn aio_read system call is made, or asynchronously, at any time thereafter. If they are detected at call time, .Fn aio_read returns -1 and sets .Va errno appropriately; otherwise the .Fn aio_return system call must be called, and will return -1, and .Fn aio_error must be called to determine the actual value that would have been returned in .Va errno . .Bl -tag -width Er .It Bq Er EBADF The .Fa iocb->aio_fildes argument is invalid. .It Bq Er EINVAL The offset .Fa iocb->aio_offset is not valid, the priority specified by .Fa iocb->aio_reqprio is not a valid priority, or the number of bytes specified by .Fa iocb->aio_nbytes is not valid. .It Bq Er EOVERFLOW The file is a regular file, .Fa iocb->aio_nbytes is greater than zero, the starting offset in .Fa iocb->aio_offset is before the end of the file, but is at or beyond the .Fa iocb->aio_fildes offset maximum. .El .Pp If the request is successfully enqueued, but subsequently cancelled or an error occurs, the value returned by the .Fn aio_return system call is per the .Xr read 2 system call, and the value returned by the .Fn aio_error system call is either one of the error returns from the .Xr read 2 system call, or one of: .Bl -tag -width Er .It Bq Er EBADF The .Fa iocb->aio_fildes argument is invalid for reading. .It Bq Er ECANCELED The request was explicitly cancelled via a call to .Fn aio_cancel . .It Bq Er EINVAL The offset .Fa iocb->aio_offset would be invalid. .El .Sh SEE ALSO .Xr aio_cancel 2 , .Xr aio_error 2 , .Xr aio_return 2 , .Xr aio_suspend 2 , .Xr aio_waitcomplete 2 , .Xr aio_write 2 , .Xr sigevent 3 , .Xr siginfo 3 , .Xr aio 4 .Sh STANDARDS The .Fn aio_read system call is expected to conform to the .St -p1003.1 standard. .Sh HISTORY The .Fn aio_read system call first appeared in .Fx 3.0 . .Sh AUTHORS This manual page was written by .An Terry Lambert Aq Mt terry@whistle.com . .Sh BUGS Invalid information in .Fa iocb->_aiocb_private may confuse the kernel. Index: user/alc/PQ_LAUNDRY/lib/libc/sys/aio_write.2 =================================================================== --- user/alc/PQ_LAUNDRY/lib/libc/sys/aio_write.2 (revision 303205) +++ user/alc/PQ_LAUNDRY/lib/libc/sys/aio_write.2 (revision 303206) @@ -1,220 +1,220 @@ .\" Copyright (c) 1999 Softweyr LLC. .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY Softweyr LLC AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL Softweyr LLC OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd July 15, 2016 +.Dd July 21, 2016 .Dt AIO_WRITE 2 .Os .Sh NAME .Nm aio_write .Nd asynchronous write to a file (REALTIME) .Sh LIBRARY .Lb libc .Sh SYNOPSIS .In aio.h .Ft int .Fn aio_write "struct aiocb *iocb" .Sh DESCRIPTION The .Fn aio_write system call allows the calling process to write .Fa iocb->aio_nbytes from the buffer pointed to by .Fa iocb->aio_buf to the descriptor .Fa iocb->aio_fildes . The call returns immediately after the write request has been enqueued to the descriptor; the write may or may not have completed at the time the call returns. If the request could not be enqueued, generally due to invalid arguments, the call returns without having enqueued the request. .Pp If .Dv O_APPEND is set for .Fa iocb->aio_fildes , .Fn aio_write operations append to the file in the same order as the calls were made. If .Dv O_APPEND is not set for the file descriptor, the write operation will occur at the absolute position from the beginning of the file plus .Fa iocb->aio_offset . .Pp If .Dv _POSIX_PRIORITIZED_IO is defined, and the descriptor supports it, then the enqueued operation is submitted at a priority equal to that of the calling process minus .Fa iocb->aio_reqprio . .Pp The .Fa iocb pointer may be subsequently used as an argument to .Fn aio_return and .Fn aio_error in order to determine return or error status for the enqueued operation while it is in progress. .Pp If the request is successfully enqueued, the value of .Fa iocb->aio_offset can be modified during the request as context, so this value must not be referenced after the request is enqueued. .Pp The .Fa iocb->aio_sigevent structure can be used to request notification of the request's completion as described in .Xr aio 4 . .Sh RESTRICTIONS The Asynchronous I/O Control Block structure pointed to by .Fa iocb and the buffer that the .Fa iocb->aio_buf member of that structure references must remain valid until the operation has completed. For this reason, use of auto (stack) variables for these objects is discouraged. .Pp The asynchronous I/O control buffer .Fa iocb should be zeroed before the .Fn aio_write system call to avoid passing bogus context information to the kernel. .Pp Modifications of the Asynchronous I/O Control Block structure or the buffer contents after the request has been enqueued, but before the request has completed, are not allowed. .Pp If the file offset in .Fa iocb->aio_offset is past the offset maximum for .Fa iocb->aio_fildes , no I/O will occur. .Sh RETURN VALUES .Rv -std aio_write .Sh ERRORS The .Fn aio_write system call will fail if: .Bl -tag -width Er .It Bq Er EAGAIN The request was not queued because of system resource limitations. .It Bq Er EINVAL The asynchronous notification method in .Fa iocb->aio_sigevent.sigev_notify is invalid or not supported. -.It Bq Er ENOSYS -The -.Fn aio_write -system call is not supported. +.It Bq Er EOPNOTSUPP +Asynchronous write operations on the file descriptor +.Fa iocb->aio_fildes +are unsafe and unsafe asynchronous I/O operations are disabled. .El .Pp The following conditions may be synchronously detected when the .Fn aio_write system call is made, or asynchronously, at any time thereafter. If they are detected at call time, .Fn aio_write returns -1 and sets .Va errno appropriately; otherwise the .Fn aio_return system call must be called, and will return -1, and .Fn aio_error must be called to determine the actual value that would have been returned in .Va errno . .Bl -tag -width Er .It Bq Er EBADF The .Fa iocb->aio_fildes argument is invalid, or is not opened for writing. .It Bq Er EINVAL The offset .Fa iocb->aio_offset is not valid, the priority specified by .Fa iocb->aio_reqprio is not a valid priority, or the number of bytes specified by .Fa iocb->aio_nbytes is not valid. .El .Pp If the request is successfully enqueued, but subsequently canceled or an error occurs, the value returned by the .Fn aio_return system call is per the .Xr write 2 system call, and the value returned by the .Fn aio_error system call is either one of the error returns from the .Xr write 2 system call, or one of: .Bl -tag -width Er .It Bq Er EBADF The .Fa iocb->aio_fildes argument is invalid for writing. .It Bq Er ECANCELED The request was explicitly canceled via a call to .Fn aio_cancel . .It Bq Er EINVAL The offset .Fa iocb->aio_offset would be invalid. .El .Sh SEE ALSO .Xr aio_cancel 2 , .Xr aio_error 2 , .Xr aio_return 2 , .Xr aio_suspend 2 , .Xr aio_waitcomplete 2 , .Xr sigevent 3 , .Xr siginfo 3 , .Xr aio 4 .Sh STANDARDS The .Fn aio_write system call is expected to conform to the .St -p1003.1 standard. .Sh HISTORY The .Fn aio_write system call first appeared in .Fx 3.0 . .Sh AUTHORS This manual page was written by .An Wes Peters Aq Mt wes@softweyr.com . .Sh BUGS Invalid information in .Fa iocb->_aiocb_private may confuse the kernel. Index: user/alc/PQ_LAUNDRY/lib/libc/sys/pipe.2 =================================================================== --- user/alc/PQ_LAUNDRY/lib/libc/sys/pipe.2 (revision 303205) +++ user/alc/PQ_LAUNDRY/lib/libc/sys/pipe.2 (revision 303206) @@ -1,159 +1,178 @@ .\" Copyright (c) 1980, 1991, 1993 .\" The Regents of the University of California. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 4. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" @(#)pipe.2 8.1 (Berkeley) 6/4/93 .\" $FreeBSD$ .\" -.Dd June 22, 2016 +.Dd July 20, 2016 .Dt PIPE 2 .Os .Sh NAME .Nm pipe , .Nm pipe2 .Nd create descriptor pair for interprocess communication .Sh LIBRARY .Lb libc .Sh SYNOPSIS .In unistd.h .Ft int .Fn pipe "int fildes[2]" .Ft int .Fn pipe2 "int fildes[2]" "int flags" .Sh DESCRIPTION The .Fn pipe -system call +function creates a .Em pipe , which is an object allowing bidirectional data flow, and allocates a pair of file descriptors. .Pp The .Fn pipe2 system call allows control over the attributes of the file descriptors via the .Fa flags argument. Values for .Fa flags are constructed by a bitwise-inclusive OR of flags from the following list, defined in .In fcntl.h : .Bl -tag -width ".Dv O_NONBLOCK" .It Dv O_CLOEXEC Set the close-on-exec flag for the new file descriptors. .It Dv O_NONBLOCK Set the non-blocking flag for the ends of the pipe. .El .Pp If the .Fa flags argument is 0, the behavior is identical to a call to .Fn pipe . .Pp By convention, the first descriptor is normally used as the .Em read end of the pipe, and the second is normally the .Em write end , so that data written to .Fa fildes[1] appears on (i.e., can be read from) .Fa fildes[0] . This allows the output of one program to be sent to another program: the source's standard output is set up to be the write end of the pipe, and the sink's standard input is set up to be the read end of the pipe. The pipe itself persists until all its associated descriptors are closed. .Pp A pipe that has had an end closed is considered .Em widowed . Writing on such a pipe causes the writing process to receive a .Dv SIGPIPE signal. Widowing a pipe is the only way to deliver end-of-file to a reader: after the reader consumes any buffered data, reading a widowed pipe returns a zero count. .Pp The bidirectional nature of this implementation of pipes is not portable to older systems, so it is recommended to use the convention for using the endpoints in the traditional manner when using a pipe in one direction. +.Sh IMPLEMENTATION NOTES +The +.Fn pipe +function calls the +.Fn pipe2 +system call. +As a result, system call traces such as those captured by +.Xr dtrace 1 +or +.Xr ktrace 1 +will show calls to +.Fn pipe2 . .Sh RETURN VALUES .Rv -std pipe .Sh ERRORS The .Fn pipe and .Fn pipe2 system calls will fail if: .Bl -tag -width Er .It Bq Er EFAULT .Ar fildes argument points to an invalid memory location. .It Bq Er EMFILE Too many descriptors are active. .It Bq Er ENFILE The system file table is full. .It Bq Er ENOMEM Not enough kernel memory to establish a pipe. .El .Pp The .Fn pipe2 system call will also fail if: .Bl -tag -width Er .It Bq Er EINVAL The .Fa flags argument is invalid. .El .Sh SEE ALSO .Xr sh 1 , .Xr fork 2 , .Xr read 2 , .Xr socketpair 2 , .Xr write 2 .Sh HISTORY The .Fn pipe function appeared in .At v3 . .Pp Bidirectional pipes were first used on .At V.4 . .Pp The .Fn pipe2 function appeared in .Fx 10.0 . +.Pp +The +.Fn pipe +function became a wrapper around +.Fn pipe2 +in +.Fx 11.0 . Index: user/alc/PQ_LAUNDRY/lib/libmd/Makefile =================================================================== --- user/alc/PQ_LAUNDRY/lib/libmd/Makefile (revision 303205) +++ user/alc/PQ_LAUNDRY/lib/libmd/Makefile (revision 303206) @@ -1,385 +1,386 @@ # $FreeBSD$ PACKAGE=lib${LIB} LIB= md SHLIB_MAJOR= 6 SHLIBDIR?= /lib SRCS= md4c.c md5c.c md4hl.c md5hl.c \ rmd160c.c rmd160hl.c \ sha0c.c sha0hl.c sha1c.c sha1hl.c \ sha256c.c sha256hl.c \ sha384hl.c \ sha512c.c sha512hl.c sha512thl.c \ skein.c skein_block.c \ skein256hl.c skein512hl.c skein1024hl.c INCS= md4.h md5.h ripemd.h sha.h sha256.h sha384.h sha512.h sha512t.h \ skein.h skein_port.h skein_freebsd.h skein_iv.h WARNS?= 0 MAN+= md4.3 md5.3 ripemd.3 sha.3 sha256.3 sha512.3 skein.3 MLINKS+=md4.3 MD4Init.3 md4.3 MD4Update.3 md4.3 MD4Final.3 MLINKS+=md4.3 MD4End.3 md4.3 MD4File.3 md4.3 MD4FileChunk.3 MLINKS+=md4.3 MD4Data.3 MLINKS+=md5.3 MD5Init.3 md5.3 MD5Update.3 md5.3 MD5Final.3 MLINKS+=md5.3 MD5End.3 md5.3 MD5File.3 md5.3 MD5FileChunk.3 MLINKS+=md5.3 MD5Data.3 MLINKS+=ripemd.3 RIPEMD160_Init.3 ripemd.3 RIPEMD160_Update.3 MLINKS+=ripemd.3 RIPEMD160_Final.3 ripemd.3 RIPEMD160_Data.3 MLINKS+=ripemd.3 RIPEMD160_End.3 ripemd.3 RIPEMD160_File.3 MLINKS+=ripemd.3 RIPEMD160_FileChunk.3 MLINKS+=sha.3 SHA_Init.3 sha.3 SHA_Update.3 sha.3 SHA_Final.3 MLINKS+=sha.3 SHA_End.3 sha.3 SHA_File.3 sha.3 SHA_FileChunk.3 MLINKS+=sha.3 SHA_Data.3 MLINKS+=sha.3 SHA1_Init.3 sha.3 SHA1_Update.3 sha.3 SHA1_Final.3 MLINKS+=sha.3 SHA1_End.3 sha.3 SHA1_File.3 sha.3 SHA1_FileChunk.3 MLINKS+=sha.3 SHA1_Data.3 MLINKS+=sha256.3 SHA256_Init.3 sha256.3 SHA256_Update.3 MLINKS+=sha256.3 SHA256_Final.3 sha256.3 SHA256_End.3 MLINKS+=sha256.3 SHA256_File.3 sha256.3 SHA256_FileChunk.3 MLINKS+=sha256.3 SHA256_Data.3 MLINKS+=sha512.3 SHA384_Init.3 sha512.3 SHA384_Update.3 MLINKS+=sha512.3 SHA384_Final.3 sha512.3 SHA384_End.3 MLINKS+=sha512.3 SHA384_File.3 sha512.3 SHA384_FileChunk.3 MLINKS+=sha512.3 SHA384_Data.3 sha512.3 sha384.3 MLINKS+=sha512.3 SHA512_Init.3 sha512.3 SHA512_Update.3 MLINKS+=sha512.3 SHA512_Final.3 sha512.3 SHA512_End.3 MLINKS+=sha512.3 SHA512_File.3 sha512.3 SHA512_FileChunk.3 MLINKS+=sha512.3 SHA512_Data.3 MLINKS+=sha512.3 SHA512_256_Init.3 sha512.3 SHA512_256_Update.3 MLINKS+=sha512.3 SHA512_256_Final.3 sha512.3 SHA512_256_End.3 MLINKS+=sha512.3 SHA512_256_File.3 sha512.3 SHA512_256_FileChunk.3 MLINKS+=sha512.3 SHA512_256_Data.3 MLINKS+=skein.3 SKEIN256_Init.3 skein.3 SKEIN256_Update.3 MLINKS+=skein.3 SKEIN256_Final.3 skein.3 SKEIN256_End.3 MLINKS+=skein.3 SKEIN256_File.3 skein.3 SKEIN256_FileChunk.3 MLINKS+=skein.3 SKEIN256_Data.3 skein.3 skein256.3 MLINKS+=skein.3 SKEIN512_Init.3 skein.3 SKEIN512_Update.3 MLINKS+=skein.3 SKEIN512_Final.3 skein.3 SKEIN512_End.3 MLINKS+=skein.3 SKEIN512_File.3 skein.3 SKEIN512_FileChunk.3 MLINKS+=skein.3 SKEIN512_Data.3 skein.3 skein512.3 MLINKS+=skein.3 SKEIN1024_Init.3 skein.3 SKEIN1024_Update.3 MLINKS+=skein.3 SKEIN1024_Final.3 skein.3 SKEIN1024_End.3 MLINKS+=skein.3 SKEIN1024_File.3 skein.3 SKEIN1024_FileChunk.3 MLINKS+=skein.3 SKEIN1024_Data.3 skein.3 skein1024.3 CLEANFILES+= md[245]hl.c md[245].ref md[245].3 mddriver \ rmd160.ref rmd160hl.c rmddriver \ sha0.ref sha0hl.c sha1.ref sha1hl.c shadriver \ sha256.ref sha256hl.c sha384hl.c sha384.ref \ sha512.ref sha512hl.c sha512t256.ref sha512thl.c \ skein256hl.c skein512hl.c skein1024hl.c \ skein256.ref skein512.ref skein1024.ref \ skeindriver # Define WEAK_REFS to provide weak aliases for libmd symbols # # Note that the same sources are also used internally by libcrypt, # in which case: # * macros are used to rename symbols to libcrypt internal names # * no weak aliases are generated CFLAGS+= -I${.CURDIR} -I${.CURDIR}/../../sys/crypto/sha2 CFLAGS+= -I${.CURDIR}/../../sys/crypto/skein CFLAGS+= -DWEAK_REFS .PATH: ${.CURDIR}/${MACHINE_ARCH} ${.CURDIR}/../../sys/crypto/sha2 .PATH: ${.CURDIR}/../../sys/crypto/skein ${.CURDIR}/../../sys/crypto/skein/${MACHINE_ARCH} .if exists(${MACHINE_ARCH}/sha.S) SRCS+= sha.S CFLAGS+= -DSHA1_ASM .endif .if exists(${MACHINE_ARCH}/rmd160.S) SRCS+= rmd160.S CFLAGS+= -DRMD160_ASM .endif .if exists(${MACHINE_ARCH}/skein_block_asm.s) +AFLAGS += --strip-local-absolute SRCS+= skein_block_asm.s CFLAGS+= -DSKEIN_ASM -DSKEIN_USE_ASM=1792 # list of block functions to replace with assembly: 256+512+1024 = 1792 .endif .if exists(${MACHINE_ARCH}/sha.S) || exists(${MACHINE_ARCH}/rmd160.S) || exists(${MACHINE_ARCH}/skein_block_asm.s) ACFLAGS+= -DELF -Wa,--noexecstack .endif md4hl.c: mdXhl.c (echo '#define LENGTH 16'; \ sed -e 's/mdX/md4/g' -e 's/MDX/MD4/g' ${.ALLSRC}) > ${.TARGET} md5hl.c: mdXhl.c (echo '#define LENGTH 16'; \ sed -e 's/mdX/md5/g' -e 's/MDX/MD5/g' ${.ALLSRC}) > ${.TARGET} sha0hl.c: mdXhl.c (echo '#define LENGTH 20'; \ sed -e 's/mdX/sha/g' -e 's/MDX/SHA_/g' -e 's/SHA__/SHA_/g' \ ${.ALLSRC}) > ${.TARGET} sha1hl.c: mdXhl.c (echo '#define LENGTH 20'; \ sed -e 's/mdX/sha/g' -e 's/MDX/SHA1_/g' -e 's/SHA1__/SHA1_/g' \ ${.ALLSRC}) > ${.TARGET} sha256hl.c: mdXhl.c (echo '#define LENGTH 32'; \ sed -e 's/mdX/sha256/g' -e 's/MDX/SHA256_/g' \ -e 's/SHA256__/SHA256_/g' \ ${.ALLSRC}) > ${.TARGET} sha384hl.c: mdXhl.c (echo '#define LENGTH 48'; \ sed -e 's/mdX/sha384/g' -e 's/MDX/SHA384_/g' \ -e 's/SHA384__/SHA384_/g' \ ${.ALLSRC}) > ${.TARGET} sha512hl.c: mdXhl.c (echo '#define LENGTH 64'; \ sed -e 's/mdX/sha512/g' -e 's/MDX/SHA512_/g' \ -e 's/SHA512__/SHA512_/g' \ ${.ALLSRC}) > ${.TARGET} sha512thl.c: mdXhl.c (echo '#define LENGTH 32'; \ sed -e 's/mdX/sha512t/g' -e 's/MDX/SHA512_256_/g' \ -e 's/SHA512_256__/SHA512_256_/g' \ -e 's/SHA512_256_CTX/SHA512_CTX/g' \ ${.ALLSRC}) > ${.TARGET} rmd160hl.c: mdXhl.c (echo '#define LENGTH 20'; \ sed -e 's/mdX/ripemd/g' -e 's/MDX/RIPEMD160_/g' \ -e 's/RIPEMD160__/RIPEMD160_/g' \ ${.ALLSRC}) > ${.TARGET} skein256hl.c: mdXhl.c (echo '#define LENGTH 32'; \ sed -e 's/mdX/skein/g' -e 's/MDX/SKEIN256_/g' \ -e 's/SKEIN256__/SKEIN256_/g' \ ${.ALLSRC}) > ${.TARGET} skein512hl.c: mdXhl.c (echo '#define LENGTH 64'; \ sed -e 's/mdX/skein/g' -e 's/MDX/SKEIN512_/g' \ -e 's/SKEIN512__/SKEIN512_/g' \ ${.ALLSRC}) > ${.TARGET} skein1024hl.c: mdXhl.c (echo '#define LENGTH 128'; \ sed -e 's/mdX/skein/g' -e 's/MDX/SKEIN1024_/g' \ -e 's/SKEIN1024__/SKEIN1024_/g' \ ${.ALLSRC}) > ${.TARGET} .for i in 2 4 5 md${i}.3: ${.CURDIR}/mdX.3 sed -e "s/mdX/md${i}/g" -e "s/MDX/MD${i}/g" ${.ALLSRC} > ${.TARGET} cat ${.CURDIR}/md${i}.copyright >> ${.TARGET} .endfor md4.ref: echo 'MD4 test suite:' > ${.TARGET} @echo 'MD4 ("") = 31d6cfe0d16ae931b73c59d7e0c089c0' >> ${.TARGET} @echo 'MD4 ("a") = bde52cb31de33e46245e05fbdbd6fb24' >> ${.TARGET} @echo 'MD4 ("abc") = a448017aaf21d8525fc10ae87aa6729d' >> ${.TARGET} @echo 'MD4 ("message digest") = d9130a8164549fe818874806e1c7014b' >> ${.TARGET} @echo 'MD4 ("abcdefghijklmnopqrstuvwxyz") = d79e1c308aa5bbcdeea8ed63df412da9' >> ${.TARGET} @echo 'MD4 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") =' \ '043f8582f241db351ce627e153e7f0e4' >> ${.TARGET} @echo 'MD4 ("12345678901234567890123456789012345678901234567890123456789012345678901234567890") =' \ 'e33b4ddc9c38f2199c3e7b164fcc0536' >> ${.TARGET} md5.ref: echo 'MD5 test suite:' > ${.TARGET} @echo 'MD5 ("") = d41d8cd98f00b204e9800998ecf8427e' >> ${.TARGET} @echo 'MD5 ("a") = 0cc175b9c0f1b6a831c399e269772661' >> ${.TARGET} @echo 'MD5 ("abc") = 900150983cd24fb0d6963f7d28e17f72' >> ${.TARGET} @echo 'MD5 ("message digest") = f96b697d7cb7938d525a2f31aaf161d0' >> ${.TARGET} @echo 'MD5 ("abcdefghijklmnopqrstuvwxyz") = c3fcd3d76192e4007dfb496cca67e13b' >> ${.TARGET} @echo 'MD5 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") = d174ab98d277d9f5a5611c2c9f419d9f' >> ${.TARGET} @echo 'MD5 ("12345678901234567890123456789012345678901234567890123456789012345678901234567890") = 57edf4a22be3c955ac49da2e2107b67a' >> ${.TARGET} sha0.ref: echo 'SHA-0 test suite:' > ${.TARGET} @echo 'SHA-0 ("") = f96cea198ad1dd5617ac084a3d92c6107708c0ef' >> ${.TARGET} @echo 'SHA-0 ("abc") = 0164b8a914cd2a5e74c4f7ff082c4d97f1edf880' >> ${.TARGET} @echo 'SHA-0 ("message digest") =' \ 'c1b0f222d150ebb9aa36a40cafdc8bcbed830b14' >> ${.TARGET} @echo 'SHA-0 ("abcdefghijklmnopqrstuvwxyz") =' \ 'b40ce07a430cfd3c033039b9fe9afec95dc1bdcd' >> ${.TARGET} @echo 'SHA-0 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") =' \ '79e966f7a3a990df33e40e3d7f8f18d2caebadfa' >> ${.TARGET} @echo 'SHA-0 ("12345678901234567890123456789012345678901234567890123456789012345678901234567890") =' \ '4aa29d14d171522ece47bee8957e35a41f3e9cff' >> ${.TARGET} sha1.ref: echo 'SHA-1 test suite:' > ${.TARGET} @echo 'SHA-1 ("") = da39a3ee5e6b4b0d3255bfef95601890afd80709' >> ${.TARGET} @echo 'SHA-1 ("abc") = a9993e364706816aba3e25717850c26c9cd0d89d' >> ${.TARGET} @echo 'SHA-1 ("message digest") =' \ 'c12252ceda8be8994d5fa0290a47231c1d16aae3' >> ${.TARGET} @echo 'SHA-1 ("abcdefghijklmnopqrstuvwxyz") =' \ '32d10c7b8cf96570ca04ce37f2a19d84240d3a89' >> ${.TARGET} @echo 'SHA-1 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") =' \ '761c457bf73b14d27e9e9265c46f4b4dda11f940' >> ${.TARGET} @echo 'SHA-1 ("12345678901234567890123456789012345678901234567890123456789012345678901234567890") =' \ '50abf5706a150990a08b2c5ea40fa0e585554732' >> ${.TARGET} sha256.ref: echo 'SHA-256 test suite:' > ${.TARGET} @echo 'SHA-256 ("") = e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855' >> ${.TARGET} @echo 'SHA-256 ("abc") =' \ 'ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad' >> ${.TARGET} @echo 'SHA-256 ("message digest") =' \ 'f7846f55cf23e14eebeab5b4e1550cad5b509e3348fbc4efa3a1413d393cb650' >> ${.TARGET} @echo 'SHA-256 ("abcdefghijklmnopqrstuvwxyz") =' \ '71c480df93d6ae2f1efad1447c66c9525e316218cf51fc8d9ed832f2daf18b73' >> ${.TARGET} @echo 'SHA-256 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") =' \ 'db4bfcbd4da0cd85a60c3c37d3fbd8805c77f15fc6b1fdfe614ee0a7c8fdb4c0' >> ${.TARGET} @echo 'SHA-256 ("12345678901234567890123456789012345678901234567890123456789012345678901234567890") =' \ 'f371bc4a311f2b009eef952dd83ca80e2b60026c8e935592d0f9c308453c813e' >> ${.TARGET} sha384.ref: echo 'SHA-384 test suite:' > ${.TARGET} @echo 'SHA-384 ("") =' \ '38b060a751ac96384cd9327eb1b1e36a21fdb71114be07434c0cc7bf63f6e1da274edebfe76f65fbd51ad2f14898b95b' >> ${.TARGET} @echo 'SHA-384 ("abc") =' \ 'cb00753f45a35e8bb5a03d699ac65007272c32ab0eded1631a8b605a43ff5bed8086072ba1e7cc2358baeca134c825a7' >> ${.TARGET} @echo 'SHA-384 ("message digest") =' \ '473ed35167ec1f5d8e550368a3db39be54639f828868e9454c239fc8b52e3c61dbd0d8b4de1390c256dcbb5d5fd99cd5' >> ${.TARGET} @echo 'SHA-384 ("abcdefghijklmnopqrstuvwxyz") =' \ 'feb67349df3db6f5924815d6c3dc133f091809213731fe5c7b5f4999e463479ff2877f5f2936fa63bb43784b12f3ebb4' >> ${.TARGET} @echo 'SHA-384 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") =' \ '1761336e3f7cbfe51deb137f026f89e01a448e3b1fafa64039c1464ee8732f11a5341a6f41e0c202294736ed64db1a84' >> ${.TARGET} @echo 'SHA-384 ("12345678901234567890123456789012345678901234567890123456789012345678901234567890") =' \ 'b12932b0627d1c060942f5447764155655bd4da0c9afa6dd9b9ef53129af1b8fb0195996d2de9ca0df9d821ffee67026' >> ${.TARGET} sha512.ref: echo 'SHA-512 test suite:' > ${.TARGET} @echo 'SHA-512 ("") =' \ 'cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e' >> ${.TARGET} @echo 'SHA-512 ("abc") =' \ 'ddaf35a193617abacc417349ae20413112e6fa4e89a97ea20a9eeee64b55d39a2192992a274fc1a836ba3c23a3feebbd454d4423643ce80e2a9ac94fa54ca49f' >> ${.TARGET} @echo 'SHA-512 ("message digest") =' \ '107dbf389d9e9f71a3a95f6c055b9251bc5268c2be16d6c13492ea45b0199f3309e16455ab1e96118e8a905d5597b72038ddb372a89826046de66687bb420e7c' >> ${.TARGET} @echo 'SHA-512 ("abcdefghijklmnopqrstuvwxyz") =' \ '4dbff86cc2ca1bae1e16468a05cb9881c97f1753bce3619034898faa1aabe429955a1bf8ec483d7421fe3c1646613a59ed5441fb0f321389f77f48a879c7b1f1' >> ${.TARGET} @echo 'SHA-512 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") =' \ '1e07be23c26a86ea37ea810c8ec7809352515a970e9253c26f536cfc7a9996c45c8370583e0a78fa4a90041d71a4ceab7423f19c71b9d5a3e01249f0bebd5894' >> ${.TARGET} @echo 'SHA-512 ("12345678901234567890123456789012345678901234567890123456789012345678901234567890") =' \ '72ec1ef1124a45b047e8b7c75a932195135bb61de24ec0d1914042246e0aec3a2354e093d76f3048b456764346900cb130d2a4fd5dd16abb5e30bcb850dee843' >> ${.TARGET} sha512t256.ref: echo 'SHA-512256 test suite:' > ${.TARGET} @echo 'SHA-512256 ("") =' \ 'c672b8d1ef56ed28ab87c3622c5114069bdd3ad7b8f9737498d0c01ecef0967a' >> ${.TARGET} @echo 'SHA-512256 ("abc") =' \ '53048e2681941ef99b2e29b76b4c7dabe4c2d0c634fc6d46e0e2f13107e7af23' >> ${.TARGET} @echo 'SHA-512256 ("message digest") =' \ '0cf471fd17ed69d990daf3433c89b16d63dec1bb9cb42a6094604ee5d7b4e9fb' >> ${.TARGET} @echo 'SHA-512256 ("abcdefghijklmnopqrstuvwxyz") =' \ 'fc3189443f9c268f626aea08a756abe7b726b05f701cb08222312ccfd6710a26' >> ${.TARGET} @echo 'SHA-512256 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") =' \ 'cdf1cc0effe26ecc0c13758f7b4a48e000615df241284185c39eb05d355bb9c8' >> ${.TARGET} @echo 'SHA-512256 ("12345678901234567890123456789012345678901234567890123456789012345678901234567890") =' \ '2c9fdbc0c90bdd87612ee8455474f9044850241dc105b1e8b94b8ddf5fac9148' >> ${.TARGET} rmd160.ref: echo 'RIPEMD160 test suite:' > ${.TARGET} @echo 'RIPEMD160 ("") = 9c1185a5c5e9fc54612808977ee8f548b2258d31' >> ${.TARGET} @echo 'RIPEMD160 ("abc") = 8eb208f7e05d987a9b044a8e98c6b087f15a0bfc' >> ${.TARGET} @echo 'RIPEMD160 ("message digest") =' \ '5d0689ef49d2fae572b881b123a85ffa21595f36' >> ${.TARGET} @echo 'RIPEMD160 ("abcdefghijklmnopqrstuvwxyz") =' \ 'f71c27109c692c1b56bbdceb5b9d2865b3708dbc' >> ${.TARGET} @echo 'RIPEMD160 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") =' \ 'b0e20b6e3116640286ed3a87a5713079b21f5189' >> ${.TARGET} @echo 'RIPEMD160 ("12345678901234567890123456789012345678901234567890123456789012345678901234567890") =' \ '9b752e45573d4b39f4dbd3323cab82bf63326bfb' >> ${.TARGET} skein256.ref: echo 'SKEIN256 test suite:' > ${.TARGET} @echo 'SKEIN256 ("") = c8877087da56e072870daa843f176e9453115929094c3a40c463a196c29bf7ba' >> ${.TARGET} @echo 'SKEIN256 ("abc") = 258bdec343b9fde1639221a5ae0144a96e552e5288753c5fec76c05fc2fc1870' >> ${.TARGET} @echo 'SKEIN256 ("message digest") =' \ '4d2ce0062b5eb3a4db95bc1117dd8aa014f6cd50fdc8e64f31f7d41f9231e488' >> ${.TARGET} @echo 'SKEIN256 ("abcdefghijklmnopqrstuvwxyz") =' \ '46d8440685461b00e3ddb891b2ecc6855287d2bd8834a95fb1c1708b00ea5e82' >> ${.TARGET} @echo 'SKEIN256 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") =' \ '7c5eb606389556b33d34eb2536459528dc0af97adbcd0ce273aeb650f598d4b2' >> ${.TARGET} @echo 'SKEIN256 ("12345678901234567890123456789012345678901234567890123456789012345678901234567890") =' \ '4def7a7e5464a140ae9c3a80279fbebce4bd00f9faad819ab7e001512f67a10d' >> ${.TARGET} skein512.ref: echo 'SKEIN512 test suite:' > ${.TARGET} @echo 'SKEIN512 ("") =' \ 'bc5b4c50925519c290cc634277ae3d6257212395cba733bbad37a4af0fa06af41fca7903d06564fea7a2d3730dbdb80c1f85562dfcc070334ea4d1d9e72cba7a' >> ${.TARGET} @echo 'SKEIN512 ("abc") =' \ '8f5dd9ec798152668e35129496b029a960c9a9b88662f7f9482f110b31f9f93893ecfb25c009baad9e46737197d5630379816a886aa05526d3a70df272d96e75' >> ${.TARGET} @echo 'SKEIN512 ("message digest") =' \ '15b73c158ffb875fed4d72801ded0794c720b121c0c78edf45f900937e6933d9e21a3a984206933d504b5dbb2368000411477ee1b204c986068df77886542fcc' >> ${.TARGET} @echo 'SKEIN512 ("abcdefghijklmnopqrstuvwxyz") =' \ '23793ad900ef12f9165c8080da6fdfd2c8354a2929b8aadf83aa82a3c6470342f57cf8c035ec0d97429b626c4d94f28632c8f5134fd367dca5cf293d2ec13f8c' >> ${.TARGET} @echo 'SKEIN512 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") =' \ '0c6bed927e022f5ddcf81877d42e5f75798a9f8fd3ede3d83baac0a2f364b082e036c11af35fe478745459dd8f5c0b73efe3c56ba5bb2009208d5a29cc6e469c' >> ${.TARGET} @echo 'SKEIN512 ("12345678901234567890123456789012345678901234567890123456789012345678901234567890") =' \ '2ca9fcffb3456f297d1b5f407014ecb856f0baac8eb540f534b1f187196f21e88f31103128c2f03fcc9857d7a58eb66f9525e2302d88833ee069295537a434ce' >> ${.TARGET} skein1024.ref: echo 'SKEIN1024 test suite:' > ${.TARGET} @echo 'SKEIN1024 ("") =' \ '0fff9563bb3279289227ac77d319b6fff8d7e9f09da1247b72a0a265cd6d2a62645ad547ed8193db48cff847c06494a03f55666d3b47eb4c20456c9373c86297d630d5578ebd34cb40991578f9f52b18003efa35d3da6553ff35db91b81ab890bec1b189b7f52cb2a783ebb7d823d725b0b4a71f6824e88f68f982eefc6d19c6' >> ${.TARGET} @echo 'SKEIN1024 ("abc") =' \ '35a599a0f91abcdb4cb73c19b8cb8d947742d82c309137a7caed29e8e0a2ca7a9ff9a90c34c1908cc7e7fd99bb15032fb86e76df21b72628399b5f7c3cc209d7bb31c99cd4e19465622a049afbb87c03b5ce3888d17e6e667279ec0aa9b3e2712624c01b5f5bbe1a564220bdcf6990af0c2539019f313fdd7406cca3892a1f1f' >> ${.TARGET} @echo 'SKEIN1024 ("message digest") =' \ 'ea891f5268acd0fac97467fc1aa89d1ce8681a9992a42540e53babee861483110c2d16f49e73bac27653ff173003e40cfb08516cd34262e6af95a5d8645c9c1abb3e813604d508b8511b30f9a5c1b352aa0791c7d2f27b2706dccea54bc7de6555b5202351751c3299f97c09cf89c40f67187e2521c0fad82b30edbb224f0458' >> ${.TARGET} @echo 'SKEIN1024 ("abcdefghijklmnopqrstuvwxyz") =' \ 'f23d95c2a25fbcd0e797cd058fec39d3c52d2b5afd7a9af1df934e63257d1d3dcf3246e7329c0f1104c1e51e3d22e300507b0c3b9f985bb1f645ef49835080536becf83788e17fed09c9982ba65c3cb7ffe6a5f745b911c506962adf226e435c42f6f6bc08d288f9c810e807e3216ef444f3db22744441deefa4900982a1371f' >> ${.TARGET} @echo 'SKEIN1024 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") =' \ 'cf3889e8a8d11bfd3938055d7d061437962bc5eac8ae83b1b71c94be201b8cf657fdbfc38674997a008c0c903f56a23feb3ae30e012377f1cfa080a9ca7fe8b96138662653fb3335c7d06595bf8baf65e215307532094cfdfa056bd8052ab792a3944a2adaa47b30335b8badb8fe9eb94fe329cdca04e58bbc530f0af709f469' >> ${.TARGET} @echo 'SKEIN1024 ("12345678901234567890123456789012345678901234567890123456789012345678901234567890") =' \ 'cf21a613620e6c119eca31fdfaad449a8e02f95ca256c21d2a105f8e4157048f9fe1e897893ea18b64e0e37cb07d5ac947f27ba544caf7cbc1ad094e675aed77a366270f7eb7f46543bccfa61c526fd628408058ed00ed566ac35a9761d002e629c4fb0d430b2f4ad016fcc49c44d2981c4002da0eecc42144160e2eaea4855a' >> ${.TARGET} test: md4.ref md5.ref sha0.ref rmd160.ref sha1.ref sha256.ref sha384.ref \ sha512.ref sha512t256.ref skein256.ref skein512.ref skein1024.ref @${ECHO} if any of these test fail, the code produces wrong results @${ECHO} and should NOT be used. ${CC} ${CFLAGS} ${LDFLAGS} -DMD=4 -o mddriver ${.CURDIR}/mddriver.c libmd.a ./mddriver | cmp md4.ref - @${ECHO} MD4 passed test ${CC} ${CFLAGS} ${LDFLAGS} -DMD=5 -o mddriver ${.CURDIR}/mddriver.c libmd.a ./mddriver | cmp md5.ref - @${ECHO} MD5 passed test -rm -f mddriver ${CC} ${CFLAGS} ${LDFLAGS} -o rmddriver ${.CURDIR}/rmddriver.c libmd.a ./rmddriver | cmp rmd160.ref - @${ECHO} RIPEMD160 passed test -rm -f rmddriver ${CC} ${CFLAGS} ${LDFLAGS} -DSHA=0 -o shadriver ${.CURDIR}/shadriver.c libmd.a ./shadriver | cmp sha0.ref - @${ECHO} SHA-0 passed test ${CC} ${CFLAGS} ${LDFLAGS} -DSHA=1 -o shadriver ${.CURDIR}/shadriver.c libmd.a ./shadriver | cmp sha1.ref - @${ECHO} SHA-1 passed test ${CC} ${CFLAGS} ${LDFLAGS} -DSHA=256 -o shadriver ${.CURDIR}/shadriver.c libmd.a ./shadriver | cmp sha256.ref - @${ECHO} SHA-256 passed test ${CC} ${CFLAGS} ${LDFLAGS} -DSHA=384 -o shadriver ${.CURDIR}/shadriver.c libmd.a ./shadriver | cmp sha384.ref - @${ECHO} SHA-384 passed test ${CC} ${CFLAGS} ${LDFLAGS} -DSHA=512 -o shadriver ${.CURDIR}/shadriver.c libmd.a ./shadriver | cmp sha512.ref - @${ECHO} SHA-512 passed test ${CC} ${CFLAGS} ${LDFLAGS} -DSHA=512256 -o shadriver ${.CURDIR}/shadriver.c libmd.a ./shadriver | cmp sha512t256.ref - @${ECHO} SHA-512t256 passed test -rm -f shadriver ${CC} ${CFLAGS} ${LDFLAGS} -DSKEIN=256 -o skeindriver ${.CURDIR}/skeindriver.c libmd.a ./skeindriver | cmp skein256.ref - @${ECHO} SKEIN256 passed test ${CC} ${CFLAGS} ${LDFLAGS} -DSKEIN=512 -o skeindriver ${.CURDIR}/skeindriver.c libmd.a ./skeindriver | cmp skein512.ref - @${ECHO} SKEIN512 passed test ${CC} ${CFLAGS} ${LDFLAGS} -DSKEIN=1024 -o skeindriver ${.CURDIR}/skeindriver.c libmd.a ./skeindriver | cmp skein1024.ref - @${ECHO} SKEIN1024 passed test -rm -f skeindriver .include Index: user/alc/PQ_LAUNDRY/release/tools/arm.subr =================================================================== --- user/alc/PQ_LAUNDRY/release/tools/arm.subr (revision 303205) +++ user/alc/PQ_LAUNDRY/release/tools/arm.subr (revision 303206) @@ -1,136 +1,137 @@ #!/bin/sh #- # Copyright (c) 2015 The FreeBSD Foundation # All rights reserved. # # Portions of this software were developed by Glen Barber # under sponsorship from the FreeBSD Foundation. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. # # Common subroutines used to build arm/armv6 images. # # $FreeBSD$ # cleanup() { if [ -c "${DESTDIR}/dev/null" ]; then umount_loop ${DESTDIR}/dev 2>/dev/null fi umount_loop ${DESTDIR} if [ ! -z "${mddev}" ]; then mdconfig -d -u ${mddev} fi return 0 } umount_loop() { DIR=$1 i=0 sync while ! umount ${DIR}; do i=$(( $i + 1 )) if [ $i -ge 10 ]; then # This should never happen. But, it has happened. echo "Cannot umount(8) ${DIR}" echo "Something has gone horribly wrong." return 1 fi sleep 1 done return 0 } arm_create_disk() { # Create the target raw file and temporary work directory. chroot ${CHROOTDIR} gpart create -s ${PART_SCHEME} ${mddev} chroot ${CHROOTDIR} gpart add -t '!12' -a 63 -s ${FAT_SIZE} ${mddev} chroot ${CHROOTDIR} gpart set -a active -i 1 ${mddev} chroot ${CHROOTDIR} newfs_msdos -L msdosboot -F ${FAT_TYPE} /dev/${mddev}s1 chroot ${CHROOTDIR} gpart add -t freebsd ${mddev} chroot ${CHROOTDIR} gpart create -s bsd ${mddev}s2 chroot ${CHROOTDIR} gpart add -t freebsd-ufs -a 64k /dev/${mddev}s2 chroot ${CHROOTDIR} newfs -U -L rootfs /dev/${mddev}s2a return 0 } arm_create_user() { # Create a default user account 'freebsd' with the password 'freebsd', # and set the default password for the 'root' user to 'root'. chroot ${CHROOTDIR} /usr/sbin/pw -R ${DESTDIR} \ groupadd freebsd -g 1001 chroot ${CHROOTDIR} mkdir -p ${DESTDIR}/home/freebsd chroot ${CHROOTDIR} /usr/sbin/pw -R ${DESTDIR} \ useradd freebsd \ -m -M 0755 -w yes -n freebsd -u 1001 -g 1001 -G 0 \ -c 'FreeBSD User' -d '/home/freebsd' -s '/bin/csh' chroot ${CHROOTDIR} /usr/sbin/pw -R ${DESTDIR} \ usermod root -w yes + chroot ${CHROOTDIR} ln -s /home ${DESTDIR}/usr/home return 0 } arm_install_base() { chroot ${CHROOTDIR} mount /dev/${mddev}s2a ${DESTDIR} eval chroot ${CHROOTDIR} make -C ${WORLDDIR} \ TARGET=${EMBEDDED_TARGET} \ TARGET_ARCH=${EMBEDDED_TARGET_ARCH} \ DESTDIR=${DESTDIR} KERNCONF=${KERNEL} \ installworld installkernel distribution chroot ${CHROOTDIR} mkdir -p ${DESTDIR}/boot/msdos arm_create_user echo '# Custom /etc/fstab for FreeBSD embedded images' \ > ${CHROOTDIR}/${DESTDIR}/etc/fstab echo "/dev/ufs/rootfs / ufs rw 1 1" \ >> ${CHROOTDIR}/${DESTDIR}/etc/fstab echo "/dev/msdosfs/MSDOSBOOT /boot/msdos msdosfs rw,noatime 0 0" \ >> ${CHROOTDIR}/${DESTDIR}/etc/fstab echo "tmpfs /tmp tmpfs rw,mode=1777,size=50m 0 0" \ >> ${CHROOTDIR}/${DESTDIR}/etc/fstab local hostname hostname="$(echo ${KERNEL} | tr '[:upper:]' '[:lower:]')" echo "hostname=\"${hostname}\"" > ${CHROOTDIR}/${DESTDIR}/etc/rc.conf echo 'ifconfig_DEFAULT="DHCP"' >> ${CHROOTDIR}/${DESTDIR}/etc/rc.conf echo 'sshd_enable="YES"' >> ${CHROOTDIR}/${DESTDIR}/etc/rc.conf echo 'sendmail_enable="NONE"' >> ${CHROOTDIR}/${DESTDIR}/etc/rc.conf echo 'sendmail_submit_enable="NO"' >> ${CHROOTDIR}/${DESTDIR}/etc/rc.conf echo 'sendmail_outbound_enable="NO"' >> ${CHROOTDIR}/${DESTDIR}/etc/rc.conf echo 'sendmail_msp_queue_enable="NO"' >> ${CHROOTDIR}/${DESTDIR}/etc/rc.conf echo 'growfs_enable="YES"' >> ${CHROOTDIR}/${DESTDIR}/etc/rc.conf sync umount_loop ${CHROOTDIR}/${DESTDIR} return 0 } arm_install_uboot() { # Override in the arm/KERNEL.conf file. return 0 } Index: user/alc/PQ_LAUNDRY/share/man/man4/aio.4 =================================================================== --- user/alc/PQ_LAUNDRY/share/man/man4/aio.4 (revision 303205) +++ user/alc/PQ_LAUNDRY/share/man/man4/aio.4 (revision 303206) @@ -1,211 +1,221 @@ .\"- .\" Copyright (c) 2002 Dag-Erling Coïdan Smørgrav .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. The name of the author may not be used to endorse or promote products .\" derived from this software without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd July 15, 2016 +.Dd July 21, 2016 .Dt AIO 4 .Os .Sh NAME .Nm aio .Nd asynchronous I/O .Sh DESCRIPTION The .Nm facility provides system calls for asynchronous I/O. -However, asynchronous I/O operations are only enabled for certain file -types by default. -Asynchronous I/O operations for other file types may block an AIO daemon -indefinitely resulting in process and/or system hangs. -Asynchronous I/O operations can be enabled for all file types by setting +Asynchronous I/O operations are not completed synchronously by the +calling thread. +Instead, the calling thread invokes one system call to request an +asynchronous I/O operation. +The status of a completed request is retrieved later via a separate +system call. +.Pp +Asynchronous I/O operations on some file descriptor types may block an +AIO daemon indefinitely resulting in process and/or system hangs. +Operations on these file descriptor types are considered +.Dq unsafe +and disabled by default. +They can be enabled by setting the .Va vfs.aio.enable_unsafe sysctl node to a non-zero value. .Pp -Asynchronous I/O operations on sockets and raw disk devices do not block -indefinitely and are enabled by default. +Asynchronous I/O operations on sockets, +raw disk devices, +and regular files on local filesystems do not block +indefinitely and are always enabled. .Pp The .Nm facility uses kernel processes (also known as AIO daemons) to service most asynchronous I/O requests. These processes are grouped into pools containing a variable number of processes. Each pool will add or remove processes to the pool based on load. Pools can be configured by sysctl nodes that define the minimum and maximum number of processes as well as the amount of time an idle process will wait before exiting. .Pp One pool of AIO daemons is used to service asynchronous I/O requests for sockets. These processes are named .Dq soaiod . The following sysctl nodes are used with this pool: .Bl -tag -width indent .It Va kern.ipc.aio.num_procs The current number of processes in the pool. .It Va kern.ipc.aio.target_procs The minimum number of processes that should be present in the pool. .It Va kern.ipc.aio.max_procs The maximum number of processes permitted in the pool. .It Va kern.ipc.aio.lifetime The amount of time a process is permitted to idle in clock ticks. If a process is idle for this amount of time and there are more processes in the pool than the target minimum, the process will exit. .El .Pp A second pool of AIO daemons is used to service all other asynchronous I/O requests except for I/O requests to raw disks. These processes are named .Dq aiod . The following sysctl nodes are used with this pool: .Bl -tag -width indent .It Va vfs.aio.num_aio_procs The current number of processes in the pool. .It Va vfs.aio.target_aio_procs The minimum number of processes that should be present in the pool. .It Va vfs.aio.max_aio_procs The maximum number of processes permitted in the pool. .It Va vfs.aio.aiod_lifetime The amount of time a process is permitted to idle in clock ticks. If a process is idle for this amount of time and there are more processes in the pool than the target minimum, the process will exit. .El .Pp Asynchronous I/O requests for raw disks are queued directly to the disk device layer after temporarily wiring the user pages associated with the request. These requests are not serviced by any of the AIO daemon pools. .Pp Several limits on the number of asynchronous I/O requests are imposed both system-wide and per-process. These limits are configured via the following sysctls: .Bl -tag -width indent .It Va vfs.aio.max_buf_aio The maximum number of queued asynchronous I/O requests for raw disks permitted for a single process. Asynchronous I/O requests that have completed but whose status has not been retrieved via .Xr aio_return 2 or .Xr aio_waitcomplete 2 are not counted against this limit. .It Va vfs.aio.num_buf_aio The number of queued asynchronous I/O requests for raw disks system-wide. .It Va vfs.aio.max_aio_queue_per_proc The maximum number of asynchronous I/O requests for a single process serviced concurrently by the default AIO daemon pool. .It Va vfs.aio.max_aio_per_proc The maximum number of outstanding asynchronous I/O requests permitted for a single process. This includes requests that have not been serviced, requests currently being serviced, and requests that have completed but whose status has not been retrieved via .Xr aio_return 2 or .Xr aio_waitcomplete 2 . .It Va vfs.aio.num_queue_count The number of outstanding asynchronous I/O requests system-wide. .It Va vfs.aio.max_aio_queue The maximum number of outstanding asynchronous I/O requests permitted system-wide. .El .Pp Asynchronous I/O control buffers should be zeroed before initializing individual fields. This ensures all fields are initialized. .Pp All asynchronous I/O control buffers contain a .Vt sigevent structure in the .Va aio_sigevent field which can be used to request notification when an operation completes. .Pp For .Dv SIGEV_KEVENT notifications, the posted kevent will contain: .Bl -column ".Va filter" .It Sy Member Ta Sy Value .It Va ident Ta asynchronous I/O control buffer pointer .It Va filter Ta Dv EVFILT_AIO .It Va udata Ta value stored in .Va aio_sigevent.sigev_value .El .Pp For .Dv SIGEV_SIGNO and .Dv SIGEV_THREAD_ID notifications, the information for the queued signal will include .Dv SI_ASYNCIO in the .Va si_code field and the value stored in .Va sigevent.sigev_value in the .Va si_value field. .Pp For .Dv SIGEV_THREAD notifications, the value stored in .Va aio_sigevent.sigev_value is passed to the .Va aio_sigevent.sigev_notify_function as described in .Xr sigevent 3 . .Sh SEE ALSO .Xr aio_cancel 2 , .Xr aio_error 2 , .Xr aio_read 2 , .Xr aio_return 2 , .Xr aio_suspend 2 , .Xr aio_waitcomplete 2 , .Xr aio_write 2 , .Xr lio_listio 2 , .Xr sigevent 3 , .Xr sysctl 8 .Sh HISTORY The .Nm facility appeared as a kernel option in .Fx 3.0 . The .Nm kernel module appeared in .Fx 5.0 . The .Nm facility was integrated into all kernels in .Fx 11.0 . Index: user/alc/PQ_LAUNDRY/share/man/man4/amdpm.4 =================================================================== --- user/alc/PQ_LAUNDRY/share/man/man4/amdpm.4 (revision 303205) +++ user/alc/PQ_LAUNDRY/share/man/man4/amdpm.4 (revision 303206) @@ -1,72 +1,73 @@ .\" Copyright (c) 2001 Murray Stokely .\" Copyright (c) 1999 Takanori Watanabe .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd December 31, 2005 .Dt AMDPM 4 .Os .Sh NAME .Nm amdpm .Nd AMD 756/766/768/8111 Power Management controller driver .Sh SYNOPSIS .Cd device smbus .Cd device smb .Cd device amdpm .Sh DESCRIPTION This driver provides access to .Tn AMD 756/766/768/8111 Power management controllers . Currently, only the SMBus 1.0 controller function is implemented. The SMBus 2.0 functionality of the AMD 8111 controller is supported via the .Xr amdsmb 4 driver. .Pp The embedded SMBus controller of the AMD 756 chipset may give you access to the monitoring facilities of your mainboard. See .Xr smb 4 for writing user code to fetch voltages, temperature and so on from the monitoring chip of your mainboard. .Sh SEE ALSO .Xr amdsmb 4 , +.Xr intpm 4 , .Xr smb 4 , .Xr smbus 4 .Sh HISTORY The .Nm driver first appeared in .Fx 4.5 . .Sh AUTHORS .An -nosplit This driver was written by .An "Matthew C. Forman" . Based heavily on the .Nm alpm driver by .An Nicolas Souchu . This manual page was written by .An Murray Stokely Aq Mt murray@FreeBSD.org . .Sh BUGS Only polling mode is supported. Index: user/alc/PQ_LAUNDRY/share/man/man4/amdsmb.4 =================================================================== --- user/alc/PQ_LAUNDRY/share/man/man4/amdsmb.4 (revision 303205) +++ user/alc/PQ_LAUNDRY/share/man/man4/amdsmb.4 (revision 303206) @@ -1,54 +1,57 @@ .\" Copyright (c) 2005 Christian Brueffer .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd December 31, 2005 .Dt AMDSMB 4 .Os .Sh NAME .Nm amdsmb .Nd "AMD-8111 SMBus 2.0 controller driver" .Sh SYNOPSIS +.Cd "device pci" .Cd "device smbus" .Cd "device smb" .Cd "device amdsmb" .Sh DESCRIPTION The .Nm driver provides access to the AMD-8111 SMBus 2.0 controller. .Sh SEE ALSO +.Xr amdpm 4 , +.Xr intpm 4 , .Xr smb 4 , .Xr smbus 4 .Sh HISTORY The .Nm driver first appeared in .Fx 6.2 . .Sh AUTHORS .An -nosplit The .Nm driver was written by .An Ruslan Ermilov Aq Mt ru@FreeBSD.org . Index: user/alc/PQ_LAUNDRY/share/man/man4/ichsmb.4 =================================================================== --- user/alc/PQ_LAUNDRY/share/man/man4/ichsmb.4 (revision 303205) +++ user/alc/PQ_LAUNDRY/share/man/man4/ichsmb.4 (revision 303206) @@ -1,57 +1,61 @@ .\" Copyright (c) 1996-1999 Whistle Communications, Inc. .\" All rights reserved. .\" .\" Subject to the following obligations and disclaimer of warranty, use and .\" redistribution of this software, in source or object code forms, with or .\" without modifications are expressly permitted by Whistle Communications; .\" provided, however, that: .\" 1. Any and all reproductions of the source or object code must include the .\" copyright notice above and the following disclaimer of warranties; and .\" 2. No rights are granted, in any manner or form, to use Whistle .\" Communications, Inc. trademarks, including the mark "WHISTLE .\" COMMUNICATIONS" on advertising, endorsements, or otherwise except as .\" such appears in the above copyright notice or in the software. .\" .\" THIS SOFTWARE IS BEING PROVIDED BY WHISTLE COMMUNICATIONS "AS IS", AND .\" TO THE MAXIMUM EXTENT PERMITTED BY LAW, WHISTLE COMMUNICATIONS MAKES NO .\" REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, REGARDING THIS SOFTWARE, .\" INCLUDING WITHOUT LIMITATION, ANY AND ALL IMPLIED WARRANTIES OF .\" MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. .\" WHISTLE COMMUNICATIONS DOES NOT WARRANT, GUARANTEE, OR MAKE ANY .\" REPRESENTATIONS REGARDING THE USE OF, OR THE RESULTS OF THE USE OF THIS .\" SOFTWARE IN TERMS OF ITS CORRECTNESS, ACCURACY, RELIABILITY OR OTHERWISE. .\" IN NO EVENT SHALL WHISTLE COMMUNICATIONS BE LIABLE FOR ANY DAMAGES .\" RESULTING FROM OR ARISING OUT OF ANY USE OF THIS SOFTWARE, INCLUDING .\" WITHOUT LIMITATION, ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, .\" PUNITIVE, OR CONSEQUENTIAL DAMAGES, PROCUREMENT OF SUBSTITUTE GOODS OR .\" SERVICES, LOSS OF USE, DATA OR PROFITS, HOWEVER CAUSED AND UNDER ANY .\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT .\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF .\" THIS SOFTWARE, EVEN IF WHISTLE COMMUNICATIONS IS ADVISED OF THE POSSIBILITY .\" OF SUCH DAMAGE. .\" .\" Author: Archie Cobbs .\" .\" $FreeBSD$ .\" -.Dd November 28, 2007 +.Dd July 20, 2016 .Dt ICHSMB 4 .Os .Sh NAME .Nm ichsmb .Nd Intel ICH SMBus controller driver .Sh SYNOPSIS .Cd device pci .Cd device smbus .Cd device smb .Cd device ichsmb .Sh DESCRIPTION -This driver provides access to the SMBus controller logical -device contained in the Intel 82801AA (ICH), 82801AB (ICH0), -82801BA (ICH2), 82801CA (ICH3), 82801DC (ICH4), 82801EB (ICH5), -82801FB (ICH6) and 82801GB (ICH7) PCI chips. +The +.Nm +driver provides +.Xr smbus 4 +support for the SMBus controller logical device contained in all Intel +motherboard chipsets starting from 82801AA (ICH). .Sh SEE ALSO +.Xr intpm 4 , +.Xr ismt 4 , .Xr smb 4 , .Xr smbus 4 .Sh AUTHORS .An Archie L. Cobbs Aq Mt archie@FreeBSD.org Index: user/alc/PQ_LAUNDRY/share/man/man4/intpm.4 =================================================================== --- user/alc/PQ_LAUNDRY/share/man/man4/intpm.4 (revision 303205) +++ user/alc/PQ_LAUNDRY/share/man/man4/intpm.4 (revision 303206) @@ -1,63 +1,80 @@ .\" Copyright (c) 1999 Takanori Watanabe .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd January 8, 1999 +.Dd July 20, 2016 .Dt INTPM 4 .Os .Sh NAME .Nm intpm .Nd Intel PIIX4 Power Management controller driver .Sh SYNOPSIS +.Cd device pci .Cd device smbus .Cd device smb .Cd device intpm .Sh DESCRIPTION -This driver provides access to -.Tn Intel PIIX4 PCI Controller function 3 , -Power management controller. -Currently, only smbus controller -function is implemented. -But it also have bus idle monitoring function. -It -will display mapped I/O address for bus monitoring function when attaching. +The +.Nm +driver provides access to +.Tn Intel PIIX4 +compatible Power Management controllers. +Currently, only +.Xr smbus 4 +controller function is implemented. +.Sh HARDWARE +The +.Nm +driver supports the following chipsets: +.Pp +.Bl -bullet -compact +.It +Intel 82371AB/82443MX +.It +ATI IXP400 +.It +AMD SB600/700/710/750 +.El .Sh SEE ALSO +.Xr amdpm 4 , +.Xr amdsmb 4 , +.Xr ichsmb 4 , .Xr smb 4 , .Xr smbus 4 .Sh HISTORY The .Nm driver first appeared in .Fx 3.4 . .Sh AUTHORS This manual page was written by .An Takanori Watanabe Aq Mt takawata@shidahara1.planet.sci.kobe-u.ac.jp . .Sh BUGS This device requires IRQ 9 exclusively. To use this, you should enable ACPI function in BIOS configuration, or PnP mechanism assigns conflicted IRQ for PnP ISA card. And do not use IRQ 9 for Non-PnP ISA cards. Index: user/alc/PQ_LAUNDRY/share/man/man4/ismt.4 =================================================================== --- user/alc/PQ_LAUNDRY/share/man/man4/ismt.4 (revision 303205) +++ user/alc/PQ_LAUNDRY/share/man/man4/ismt.4 (revision 303206) @@ -1,59 +1,60 @@ .\" .\" Copyright (c) 2014 Intel Corporation .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions, and the following disclaimer, .\" without modification. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. Neither the name of Intel Corporation nor the names of its .\" contributors may be used to endorse or promote products derived from .\" this software without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS .\" "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT .\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR .\" A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT .\" HOLDERS OR CONTRIBUTORS BE LIABLE FOR SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, .\" STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING .\" IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE .\" POSSIBILITY OF SUCH DAMAGES. .\" .\" ismt driver man page. .\" .\" Author: Jim Harris .\" .\" $FreeBSD$ .\" .Dd January 11, 2016 .Dt ISMT 4 .Os .Sh NAME .Nm ismt .Nd Intel SMBus Message Transport (SMBus 2.0) driver .Sh SYNOPSIS .Cd device pci .Cd device smbus .Cd device smb .Cd device ismt .Sh DESCRIPTION This driver provides access to the SMBus 2.0 controller device contained in the Intel Atom S1200 and C2000 CPUs. .Sh SEE ALSO +.Xr ichsmb 4 , .Xr smb 4 , .Xr smbus 4 .Sh HISTORY The .Nm driver first appeared in .Fx 10.3 . .Sh AUTHORS .An Jim Harris Aq Mt jimharris@FreeBSD.org Index: user/alc/PQ_LAUNDRY/share/man/man7/arch.7 =================================================================== --- user/alc/PQ_LAUNDRY/share/man/man7/arch.7 (revision 303205) +++ user/alc/PQ_LAUNDRY/share/man/man7/arch.7 (revision 303206) @@ -1,171 +1,173 @@ .\" Copyright (c) 2016 The FreeBSD Foundation. All rights reserved. .\" .\" This documentation was created by Ed Maste under sponsorship of .\" The FreeBSD Foundation. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd July 19, 2016 .Dt ARCH 7 .Os .Sh NAME .Nm arch .Nd Architecture-specific details .Sh DESCRIPTION Differences between CPU architectures and platforms supported by .Fx . .Pp .Ss Type sizes On all supported architectures, .Bl -column -offset -indent "long long" "Size" .It Sy Type Ta Sy Size .It short Ta 2 .It int Ta 4 .It long Ta sizeof(void*) .It long long Ta 8 .It float Ta 4 .It double Ta 8 .El -.Bl -column -offset indent ".Sy Architecture" ".Sy sizeof(void *)" ".Sy "sizeof(long double)" +.Bl -column -offset indent "Sy Architecture" "Sy sizeof(void *)" "Sy sizeof(long double)" .It Sy Architecture Ta Sy sizeof(void *) Ta Sy sizeof(long double) .It amd64 Ta 8 Ta 16 .It arm Ta 4 Ta 8 .It armeb Ta 4 Ta 8 .It armv6 Ta 4 Ta 8 .It arm64 Ta 8 Ta 16 .It i386 Ta 4 Ta 12 .It mips Ta 4 Ta 8 .It mipsel Ta 4 Ta 8 .It mipsn32 Ta 4 Ta 8 .It mips64 Ta 8 Ta 8 .It mips64el Ta 8 Ta 8 .It powerpc Ta 4 Ta 8 .It powerpc64 Ta 8 Ta 8 .It riscv Ta 8 Ta .It sparc64 Ta 8 Ta 16 .El .Ss Endianness and Char Signedness -.Bl -column -offset indent ".Sy Architecture" ".Sy Endianness" ".Sy "char Signedness" +.Bl -column -offset indent "Sy Architecture" "Sy Endianness" "Sy char Signedness" .It Sy Architecture Ta Sy Endianness Ta Sy char Signedness .It amd64 Ta little Ta signed .It arm Ta little Ta unsigned .It armeb Ta big Ta unsigned .It armv6 Ta little Ta unsigned .It arm64 Ta little Ta unsigned .It i386 Ta little Ta signed .It mips Ta big Ta signed .It mipsel Ta little Ta signed .It mipsn32 Ta big Ta signed .It mips64 Ta big Ta signed .It mips64el Ta little Ta signed .It powerpc Ta big Ta unsigned .It powerpc64 Ta big Ta unsigned .It riscv Ta little Ta signed .It sparc64 Ta big Ta signed .El .Ss Page Size -.Bl -column -offset indent ".Sy Architecture" ".Sy Page Sizes" +.Bl -column -offset indent "Sy Architecture" "Sy Page Sizes" .It Sy Architecture Ta Sy Page Sizes .It amd64 Ta 4K, 2M, 1G .It arm Ta 4K .It armeb Ta 4K .It armv6 Ta 4K, 1M .It arm64 Ta 4K, 2M, 1G .It i386 Ta 4K, 2M (PAE), 4M .It mips Ta 4K .It mipsel Ta 4K .It mipsn32 Ta 4K .It mips64 Ta 4K .It mips64el Ta 4K .It powerpc Ta 4K .It powerpc64 Ta 4K .It riscv Ta 4K .It sparc64 Ta 8K .El .Ss Floating Point -.Bl -column -offset indent ".Sy Architecture" ".Sy float, double" ".Sy long double" +.Bl -column -offset indent "Sy Architecture" "Sy float, double" "Sy long double" .It Sy Architecture Ta Sy float, double Ta Sy long double .It amd64 Ta hard Ta hard, 80 bit .It arm Ta soft Ta soft, double precision .It armeb Ta soft Ta soft, double precision .It armv6 Ta hard Ta hard, double precision .It arm64 Ta hard Ta soft, quad precision .It i386 Ta hard Ta hard, 80 bit .It mips Ta soft Ta identical to double .It mipsel Ta soft Ta identical to double .It mipsn32 Ta soft Ta identical to double .It mips64 Ta soft Ta identical to double .It mips64el Ta soft Ta identical to double .It powerpc Ta hard Ta hard, double precision .It powerpc64 Ta hard Ta hard, double precision .It riscv Ta .It sparc64 Ta hard Ta hard, quad precision .El .Ss Predefined Macros The compiler provides a number of predefined macros. Some of these provide architecture-specific details and are explained below. Other macros, including those required by the language standard, are not included here. .Pp The full set of predefined macros can be obtained with this command: .Bd -literal -offset indent cc -x c -Dm -E /dev/null .Ed .Pp Common type size and endianness macros: -.Bl -column -offset indent "BYTE_ORDER" ".Sy Meaning" +.Bl -column -offset indent "BYTE_ORDER" "Sy Meaning" .It Sy Macro Ta Sy Meaning .It Dv __LP64__ Ta 64-bit (8-byte) long and pointer, 32-bit (4-byte) int .It Dv __ILP32__ Ta 32-bit (4-byte) int, long and pointer .It Dv BYTE_ORDER Ta Either Dv BIG_ENDIAN or Dv LITTLE_ENDIAN . -.Dv PDP11_ENDIAN is not used on FreeBSD. +.Dv PDP11_ENDIAN +is not used on +.Fx . .El .Pp Architecture-specific macros: -.Bl -column -offset indent ".Sy Architecture" ".Sy Predefined macros" +.Bl -column -offset indent "Sy Architecture" "Sy Predefined macros" .It Sy Architecture Ta Sy Predefined macros .It amd64 Ta Dv __amd64__, Dv __x86_64__ .It arm Ta Dv __arm__ .It armeb Ta Dv __arm__ .It armv6 Ta Dv __arm__, Dv __ARM_ARCH >= 6 .It arm64 Ta Dv __aarch64__ .It i386 Ta Dv __i386__ .It mips Ta Dv __mips__, Dv __MIPSEB__, Dv __mips_o32 .It mipsel Ta Dv __mips__, Dv __mips_o32 .It mipsn32 Ta Dv __mips__, Dv __MIPSEB__, Dv __mips_n32 .It mips64 Ta Dv __mips__, Dv __MIPSEB__, Dv __mips_n64 .It mips64el Ta Dv __mips__, Dv __mips_n64 .It powerpc Ta Dv __powerpc__ .It powerpc64 Ta Dv __powerpc__, Dv __powerpc64__ .It riscv Ta Dv __riscv__, Dv __riscv64 .It sparc64 Ta Dv __sparc64__ .El .Sh SEE ALSO .Xr src.conf 5 , .Xr build 7 .Sh HISTORY An .Nm manual page appeared in .Fx 12 . Index: user/alc/PQ_LAUNDRY/share/man/man7/build.7 =================================================================== --- user/alc/PQ_LAUNDRY/share/man/man7/build.7 (revision 303205) +++ user/alc/PQ_LAUNDRY/share/man/man7/build.7 (revision 303206) @@ -1,688 +1,693 @@ .\" Copyright (c) 2000 .\" Mike W. Meyer .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd April 28, 2016 +.Dd July 20, 2016 .Dt BUILD 7 .Os .Sh NAME .Nm build .Nd information on how to build the system .Sh DESCRIPTION The sources for the .Fx system and its applications are contained in three different directories, normally .Pa /usr/src , .Pa /usr/doc , and .Pa /usr/ports . These directories may be initially empty or non-existent until updated with .Xr svn 1 or .Xr portsnap 8 . Directory .Pa /usr/src contains the .Dq "base system" sources, which is loosely defined as the things required to rebuild the system to a useful state. Directory .Pa /usr/doc contains the source for the system documentation, excluding the manual pages. Directory .Pa /usr/ports contains a tree that provides a consistent interface for building and installing third party applications. For more information about the ports build process, see .Xr ports 7 . .Pp The .Xr make 1 command is used in each of these directories to build and install the things in that directory. Issuing the .Xr make 1 command in any directory or subdirectory of those directories has the same effect as issuing the same command in all subdirectories of that directory. With no target specified, the things in that directory are just built. .Pp A source tree is allowed to be read-only. As described in .Xr make 1 , objects are usually built in a separate object directory hierarchy specified by the environment variable .Va MAKEOBJDIRPREFIX , or under .Pa /usr/obj if variable .Va MAKEOBJDIRPREFIX is not set. For a given source directory, its canonical object directory would be .Pa ${MAKEOBJDIRPREFIX}${.CURDIR} if .Xr make 1 variable .Va MAKEOBJDIRPREFIX is set, or .Pa /usr/obj${.CURDIR} if this variable is not set. Cross-builds set the object directory as described in the documentation for the .Cm buildworld target below. .Pp The build may be controlled by defining .Xr make 1 variables described in the .Sx ENVIRONMENT section below, and by the variables documented in .Xr make.conf 5 . .Pp The following list provides the names and actions for the targets supported by the build system: .Bl -tag -width ".Cm cleandepend" .It Cm analyze Run Clang static analyzer against all objects and present output on stdout. .It Cm check Run tests for a given subdirectory. The default directory used is .Pa ${.OBJDIR} , but the check directory can be changed with .Pa ${CHECKDIR} . .It Cm checkworld Run the .Fx test suite on installed world. .It Cm clean Remove any files created during the build process. .It Cm cleandepend Remove the .Pa ${.OBJDIR}/${DEPENDFILE}* files generated by prior .Dq Li "make" and .Dq Li "make depend" steps. .It Cm cleandir Remove the canonical object directory if it exists, or perform actions equivalent to .Dq Li "make clean cleandepend" if it does not. This target will also remove an .Pa obj link in .Pa ${.CURDIR} if that exists. .Pp It is advisable to run .Dq Li "make cleandir" twice: the first invocation will remove the canonical object directory and the second one will clean up .Pa ${.CURDIR} . .It Cm depend Generate a list of build dependencies in file .Pa ${.OBJDIR}/${DEPENDFILE} . Per-object dependencies are generated at build time and stored in .Pa ${.OBJDIR}/${DEPENDFILE}.${OBJ} . .It Cm install Install the results of the build to the appropriate location in the installation directory hierarchy specified in variable .Va DESTDIR . .It Cm obj Create the canonical object directory associated with the current directory. .It Cm objlink Create a symbolic link to the canonical object directory in .Pa ${.CURDIR} . .It Cm tags Generate a tags file using the program specified in the .Xr make 1 variable .Va CTAGS . The build system supports .Xr ctags 1 and .Nm "GNU Global" . .El .Pp The other supported targets under directory .Pa /usr/src are: .Bl -tag -width ".Cm distributeworld" .It Cm buildenv Spawn an interactive shell with environment variables set up for cross-building the system. The target architecture needs to be specified with .Xr make 1 variables .Va TARGET_ARCH and .Va TARGET . .Pp This target is only useful after a complete cross-toolchain including the compiler, linker, assembler, headers and libraries has been built; see the .Cm toolchain target below. .It Cm buildworld Build everything but the kernel, configure files in .Pa etc , and .Pa release . The object directory can be changed from the default .Pa /usr/obj by setting the .Pa MAKEOBJDIRPREFIX .Xr make 1 variable. The actual build location prefix used is .Pa ${MAKEOBJDIRPREFIX}${.CURDIR} for native builds, and .Pa ${MAKEOBJDIRPREFIX}/${TARGET}${.CURDIR} for cross builds and native builds with variable .Va CROSS_BUILD_TESTING set. .It Cm cleanworld Attempt to clean up targets built by a preceding .Cm buildworld step. .It Cm distributeworld Distribute everything compiled by a preceding .Cm buildworld step. Files are placed in the directory hierarchy specified by .Xr make 1 variable .Va DISTDIR . This target is used while building a release; see .Xr release 7 . .It Cm packageworld Archive the results of .Cm distributeworld , placing the results in .Va DISTDIR . This target is used while building a release; see .Xr release 7 . .It Cm installworld Install everything built by a preceding .Cm buildworld step into the directory hierarchy pointed to by .Xr make 1 variable .Va DESTDIR . .Pp If installing onto an NFS file system and running .Xr make 1 with the .Fl j option, make sure that .Xr rpc.lockd 8 is running on both client and server. See .Xr rc.conf 5 on how to make it start at boot time. .It Cm toolchain Create the build toolchain needed to build the rest of the system. For cross-architecture builds, this step creates a cross-toolchain. .It Cm universe For each architecture, execute a .Cm buildworld followed by a .Cm buildkernel for all kernels for that architecture, including .Pa LINT . This command takes a long time. .It Cm update Get updated sources as configured in .Xr make.conf 5 . .It Cm targets Print a list of supported .Va TARGET / .Va TARGET_ARCH pairs for world and kernel targets. .It Cm tinderbox Execute the same targets as .Cm universe . In addition print a summary of all failed targets at the end and exit with an error if there were any. .It Cm toolchains Create a build toolchain for each architecture supported by the build system. .El .Pp Kernel specific build targets in .Pa /usr/src are: .Bl -tag -width ".Cm distributekernel" .It Cm buildkernel Rebuild the kernel and the kernel modules. The object directory can be changed from the default .Pa /usr/obj by setting the .Pa MAKEOBJDIRPREFIX .Xr make 1 variable. .It Cm installkernel Install the kernel and the kernel modules to directory .Pa ${DESTDIR}/boot/kernel , renaming any pre-existing directory with this name to .Pa kernel.old if it contained the currently running kernel. The target directory under .Pa ${DESTDIR} may be modified using the .Va INSTKERNNAME and .Va KODIR .Xr make 1 variables. .It Cm distributekernel Install the kernel to the directory .Pa ${DISTDIR}/kernel/boot/kernel . This target is used while building a release; see .Xr release 7 . .It Cm packagekernel Archive the results of .Cm distributekernel , placing the results in .Va DISTDIR . This target is used while building a release; see .Xr release 7 . .It Cm kernel Equivalent to .Cm buildkernel followed by .Cm installkernel .It Cm kernel-toolchain Rebuild the tools needed for kernel compilation. Use this if you did not do a .Cm buildworld first. .It Cm reinstallkernel Reinstall the kernel and the kernel modules, overwriting the contents of the target directory. As with the .Cm installkernel target, the target directory can be specified using the .Xr make 1 variable .Va INSTKERNNAME . .El .Pp Convenience targets for cleaning up the install destination directory denoted by variable .Va DESTDIR include: .Bl -tag -width ".Cm delete-old-libs" .It Cm check-old Print a list of old files and directories in the system. .It Cm delete-old Delete obsolete base system files and directories interactively. When .Li -DBATCH_DELETE_OLD_FILES is specified at the command line, the delete operation will be non-interactive. The variables .Va DESTDIR , .Va TARGET_ARCH and .Va TARGET should be set as with .Dq Li "make installworld" . .It Cm delete-old-libs Delete obsolete base system libraries interactively. This target should only be used if no third party software uses these libraries. When .Li -DBATCH_DELETE_OLD_FILES is specified at the command line, the delete operation will be non-interactive. The variables .Va DESTDIR , .Va TARGET_ARCH and .Va TARGET should be set as with .Dq Li "make installworld" . .El .Sh ENVIRONMENT Variables that influence all builds include: .Bl -tag -width ".Va MAKEOBJDIRPREFIX" .It Va DEBUG_FLAGS Defines a set of debugging flags that will be used to build all userland binaries under .Pa /usr/src . When .Va DEBUG_FLAGS is defined, the .Cm install and .Cm installworld targets install binaries from the current .Va MAKEOBJDIRPREFIX without stripping, so that debugging information is retained in the installed binaries. .It Va DESTDIR The directory hierarchy prefix where built objects will be installed. If not set, .Va DESTDIR defaults to the empty string. .It Va MAKEOBJDIRPREFIX Defines the prefix for directory names in the tree of built objects. Defaults to .Pa /usr/obj if not defined. This variable should only be set in the environment and not via .Pa /etc/make.conf or the command line. .It Va NO_WERROR If defined, compiler warnings will not cause the build to halt, even if the makefile says otherwise. .It Va WITH_CTF If defined, the build process will run the DTrace CTF conversion tools on built objects. .El .Pp Additionally, builds in .Pa /usr/src are influenced by the following .Xr make 1 variables: .Bl -tag -width ".Va SUBDIR_OVERRIDE" .It Va KERNCONF Overrides which kernel to build and install for the various kernel make targets. It defaults to .Cm GENERIC . .It Va KERNFAST If set, the build target .Cm buildkernel defaults to setting .Va NO_KERNELCLEAN , .Va NO_KERNELCONFIG , and .Va NO_KERNELOBJ . When set to a value other than .Cm 1 then .Va KERNCONF is set to the value of .Va KERNFAST . .It Va LOCAL_DIRS If set, this variable supplies a list of additional directories relative to the root of the source tree to build as part of the .Cm everything target. .It Va LOCAL_ITOOLS If set, this variable supplies a list of additional tools that are used by the .Cm installworld and .Cm distributeworld targets. .It Va LOCAL_LIB_DIRS If set, this variable supplies a list of additional directories relative to the root of the source tree to build as part of the .Cm libraries target. .It Va LOCAL_MTREE If set, this variable supplies a list of additional mtrees relative to the root of the source tree to use as part of the .Cm hierarchy target. .It Va LOCAL_TOOL_DIRS If set, this variable supplies a list of additional directories relative to the root of the source tree to build as part of the .Cm build-tools +target. +.It Va LOCAL_XTOOL_DIRS +If set, this variable supplies a list of additional directories relative to +the root of the source tree to build as part of the +.Cm cross-tools target. .It Va PORTS_MODULES A list of ports with kernel modules that should be built and installed as part of the .Cm buildkernel and .Cm installkernel process. .Bd -literal -offset indent make PORTS_MODULES=emulators/kqemu-kmod kernel .Ed .It Va STRIPBIN Command to use at install time when stripping binaries. Be sure to add any additional tools required to run .Va STRIPBIN to the .Va LOCAL_ITOOLS .Xr make 1 variable before running the .Cm distributeworld or .Cm installworld targets. See .Xr install 1 for more details. .It Va SUBDIR_OVERRIDE Override the default list of sub-directories and only build the sub-directory named in this variable. If combined with .Cm buildworld then all libraries and includes, and some of the build tools will still build as well. When combined with .Cm buildworld it is necesarry to override .Va LOCAL_LIB_DIRS with any custom directories containing libraries. This allows building a subset of the system in the same way as .Cm buildworld does using its sysroot handling. This variable can also be useful when debugging failed builds. .Bd -literal -offset indent make some-target SUBDIR_OVERRIDE=foo/bar .Ed .It Va TARGET The target hardware platform. This is analogous to the .Dq Nm uname Fl m output. This is necessary to cross-build some target architectures. For example, cross-building for PC98 machines requires .Va TARGET_ARCH Ns = Ns Li i386 and .Va TARGET Ns = Ns Li pc98 . If not set, .Va TARGET defaults to the current hardware platform. .It Va TARGET_ARCH The target machine processor architecture. This is analogous to the .Dq Nm uname Fl p output. Set this to cross-build for a different architecture. If not set, .Va TARGET_ARCH defaults to the current machine architecture, unless .Va TARGET is also set, in which case it defaults to the appropriate value for that platform. Typically, one only needs to set .Va TARGET . .El .Pp Builds under directory .Pa /usr/src are also influenced by defining one or more of the following symbols, using the .Fl D option of .Xr make 1 : .Bl -tag -width ".Va -DNO_KERNELCONFIG" .It Va NO_CLEANDIR If set, the build targets that clean parts of the object tree use the equivalent of .Dq make clean instead of .Dq make cleandir . .It Va NO_CLEAN If set, no object tree files are cleaned at all. This is the default when .Va WITH_META_MODE is used with .Xr filemon 4 loaded. See .Xr src.conf 5 for more details. Setting .Va NO_CLEAN implies .Va NO_KERNELCLEAN , so when .Va NO_CLEAN is set no kernel objects are cleaned either. .It Va NO_CTF If set, the build process does not run the DTrace CTF conversion tools on built objects. .It Va NO_SHARE If set, the build does not descend into the .Pa /usr/src/share subdirectory (i.e., manual pages, locale data files, timezone data files and other .Pa /usr/src/share files will not be rebuild from their sources). .It Va NO_KERNELCLEAN If set, the build process does not run .Dq make clean as part of the .Cm buildkernel target. .It Va NO_KERNELCONFIG If set, the build process does not run .Xr config 8 as part of the .Cm buildkernel target. .It Va NO_KERNELOBJ If set, the build process does not run .Dq make obj as part of the .Cm buildkernel target. .It Va NO_DOCUPDATE If set, the update process does not update the source of the .Fx documentation as part of the .Dq make update target. .It Va NO_PORTSUPDATE If set, the update process does not update the Ports tree as part of the .Dq make update target. .It Va NO_WWWUPDATE If set, the update process does not update the www tree as part of the .Dq make update target. .El .Pp Builds under directory .Pa /usr/doc are influenced by the following .Xr make 1 variables: .Bl -tag -width ".Va DOC_LANG" .It Va DOC_LANG If set, restricts the documentation build to the language subdirectories specified as its content. The default action is to build documentation for all languages. .El .Pp Builds using the .Cm universe target are influenced by the following .Xr make 1 variables: .Bl -tag -width ".Va MAKE_JUST_KERNELS" .It Va JFLAG Pass the value of this variable to each .Xr make 1 invocation used to build worlds and kernels. This can be used to enable multiple jobs within a single architecture's build while still building each architecture serially. .It Va MAKE_JUST_KERNELS Only build kernels for each supported architecture. .It Va MAKE_JUST_WORLDS Only build worlds for each supported architecture. .It Va UNIVERSE_TARGET Execute the specified .Xr make 1 target for each supported architecture instead of the default action of building a world and one or more kernels. .El .Sh FILES .Bl -tag -width ".Pa /usr/share/examples/etc/make.conf" -compact .It Pa /usr/doc/Makefile .It Pa /usr/doc/share/mk/doc.project.mk .It Pa /usr/ports/Mk/bsd.port.mk .It Pa /usr/ports/Mk/bsd.sites.mk .It Pa /usr/share/examples/etc/make.conf .It Pa /usr/src/Makefile .It Pa /usr/src/Makefile.inc1 .El .Sh EXAMPLES For an .Dq approved method of updating your system from the latest sources, please see the .Sx COMMON ITEMS section in .Pa src/UPDATING . .Pp The following sequence of commands can be used to cross-build the system for the sparc64 architecture on an i386 host: .Bd -literal -offset indent cd /usr/src make TARGET=sparc64 buildworld make TARGET=sparc64 DESTDIR=/clients/sparc64 installworld .Ed .Sh SEE ALSO .Xr cc 1 , .Xr install 1 , .Xr make 1 , .Xr svn 1 , .Xr make.conf 5 , .Xr src.conf 5 , .Xr ports 7 , .Xr release 7 , .Xr config 8 , .Xr mergemaster 8 , .Xr portsnap 8 , .Xr reboot 8 , .Xr shutdown 8 , .Xr tests 7 .Sh AUTHORS .An Mike W. Meyer Aq Mt mwm@mired.org Index: user/alc/PQ_LAUNDRY/share/misc/committers-src.dot =================================================================== --- user/alc/PQ_LAUNDRY/share/misc/committers-src.dot (revision 303205) +++ user/alc/PQ_LAUNDRY/share/misc/committers-src.dot (revision 303206) @@ -1,783 +1,787 @@ # $FreeBSD$ # This file is meant to list all FreeBSD src committers and describe the # mentor-mentee relationships between them. # The graphical output can be generated from this file with the following # command: # $ dot -T png -o file.png committers-src.dot # # The dot binary is part of the graphics/graphviz port. digraph src { # Node definitions follow this example: # # foo [label="Foo Bar\nfoo@FreeBSD.org\n????/??/??"] # # ????/??/?? is the date when the commit bit was obtained, usually the one you # can find looking at svn logs for the svnadmin/access file. # Use YYYY/MM/DD format. # # For returned commit bits, the node definition will follow this example: # # foo [label="Foo Bar\nfoo@FreeBSD.org\n????/??/??\n????/??/??"] # # The first date is the same as for an active committer, the second date is # the date when the commit bit has been returned. Again, check svn logs. node [color=grey62, style=filled, bgcolor=black]; # Alumni go here.. Try to keep things sorted. alm [label="Andrew Moore\nalm@FreeBSD.org\n1993/06/12\n????/??/??"] anholt [label="Eric Anholt\nanholt@FreeBSD.org\n2002/04/22\n2008/08/07"] archie [label="Archie Cobbs\narchie@FreeBSD.org\n1998/11/06\n2006/06/09"] arr [label="Andrew R. Reiter\narr@FreeBSD.org\n2001/11/02\n2005/05/25"] arun [label="Arun Sharma\narun@FreeBSD.org\n2003/03/06\n2006/12/16"] asmodai [label="Jeroen Ruigrok\nasmodai@FreeBSD.org\n1999/12/16\n2001/11/16"] benjsc [label="Benjamin Close\nbenjsc@FreeBSD.org\n2007/02/09\n2010/09/15"] billf [label="Bill Fumerola\nbillf@FreeBSD.org\n1998/11/11\n2008/11/10"] bmah [label="Bruce A. Mah\nbmah@FreeBSD.org\n2002/01/29\n2009/09/13"] bmilekic [label="Bosko Milekic\nbmilekic@FreeBSD.org\n2000/09/21\n2008/11/10"] bushman [label="Michael Bushkov\nbushman@FreeBSD.org\n2007/03/10\n2010/04/29"] carl [label="Carl Delsey\ncarl@FreeBSD.org\n2013/01/14\n2014/03/06"] ceri [label="Ceri Davies\nceri@FreeBSD.org\n2006/11/07\n2012/03/07"] cjc [label="Crist J. Clark\ncjc@FreeBSD.org\n2001/06/01\n2006/12/29"] davidxu [label="David Xu\ndavidxu@FreeBSD.org\n2002/09/02\n2014/04/14"] dds [label="Diomidis Spinellis\ndds@FreeBSD.org\n2003/06/20\n2010/09/22"] dhartmei [label="Daniel Hartmeier\ndhartmei@FreeBSD.org\n2004/04/06\n2008/12/08"] dmlb [label="Duncan Barclay\ndmlb@FreeBSD.org\n2001/12/14\n2008/11/10"] dougb [label="Doug Barton\ndougb@FreeBSD.org\n2000/10/26\n2012/10/08"] eik [label="Oliver Eikemeier\neik@FreeBSD.org\n2004/05/20\n2008/11/10"] furuta [label="Atsushi Furuta\nfuruta@FreeBSD.org\n2000/06/21\n2003/03/08"] gj [label="Gary L. Jennejohn\ngj@FreeBSD.org\n1994/??/??\n2006/04/28"] groudier [label="Gerard Roudier\ngroudier@FreeBSD.org\n1999/12/30\n2006/04/06"] jake [label="Jake Burkholder\njake@FreeBSD.org\n2000/05/16\n2008/11/10"] jayanth [label="Jayanth Vijayaraghavan\njayanth@FreeBSD.org\n2000/05/08\n2008/11/10"] jb [label="John Birrell\njb@FreeBSD.org\n1997/03/27\n2009/12/15"] jdp [label="John Polstra\njdp@FreeBSD.org\n1995/12/07\n2008/02/26"] jedgar [label="Chris D. Faulhaber\njedgar@FreeBSD.org\n1999/12/15\n2006/04/07"] jkh [label="Jordan K. Hubbard\njkh@FreeBSD.org\n1993/06/12\n2008/06/13"] jlemon [label="Jonathan Lemon\njlemon@FreeBSD.org\n1997/08/14\n2008/11/10"] joe [label="Josef Karthauser\njoe@FreeBSD.org\n1999/10/22\n2008/08/10"] jtc [label="J.T. Conklin\njtc@FreeBSD.org\n1993/06/12\n????/??/??"] kargl [label="Steven G. Kargl\nkargl@FreeBSD.org\n2011/01/17\n2015/06/28"] kbyanc [label="Kelly Yancey\nkbyanc@FreeBSD.org\n2000/07/11\n2006/07/25"] keichii [label="Michael Wu\nkeichii@FreeBSD.org\n2001/03/07\n2006/04/28"] linimon [label="Mark Linimon\nlinimon@FreeBSD.org\n2006/09/30\n2008/05/04"] lulf [label="Ulf Lilleengen\nlulf@FreeBSD.org\n2007/10/24\n2012/01/19"] mb [label="Maxim Bolotin\nmb@FreeBSD.org\n2000/04/06\n2003/03/08"] marks [label="Mark Santcroos\nmarks@FreeBSD.org\n2004/03/18\n2008/09/29"] mike [label="Mike Barcroft\nmike@FreeBSD.org\n2001/07/17\n2006/04/28"] msmith [label="Mike Smith\nmsmith@FreeBSD.org\n1996/10/22\n2003/12/15"] murray [label="Murray Stokely\nmurray@FreeBSD.org\n2000/04/05\n2010/07/25"] mux [label="Maxime Henrion\nmux@FreeBSD.org\n2002/03/03\n2011/06/22"] nate [label="Nate Willams\nnate@FreeBSD.org\n1993/06/12\n2003/12/15"] njl [label="Nate Lawson\nnjl@FreeBSD.org\n2002/08/07\n2008/02/16"] non [label="Noriaki Mitsnaga\nnon@FreeBSD.org\n2000/06/19\n2007/03/06"] onoe [label="Atsushi Onoe\nonoe@FreeBSD.org\n2000/07/21\n2008/11/10"] rafan [label="Rong-En Fan\nrafan@FreeBSD.org\n2007/01/31\n2012/07/23"] randi [label="Randi Harper\nrandi@FreeBSD.org\n2010/04/20\n2012/05/10"] rgrimes [label="Rod Grimes\nrgrimes@FreeBSD.org\n1993/06/12\n2003/03/08"] rink [label="Rink Springer\nrink@FreeBSD.org\n2006/01/16\n2010/11/04"] robert [label="Robert Drehmel\nrobert@FreeBSD.org\n2001/08/23\n2006/05/13"] sah [label="Sam Hopkins\nsah@FreeBSD.org\n2004/12/15\n2008/11/10"] shafeeq [label="Shafeeq Sinnamohideen\nshafeeq@FreeBSD.org\n2000/06/19\n2006/04/06"] sheldonh [label="Sheldon Hearn\nsheldonh@FreeBSD.org\n1999/06/14\n2006/05/13"] shiba [label="Takeshi Shibagaki\nshiba@FreeBSD.org\n2000/06/19\n2008/11/10"] shin [label="Yoshinobu Inoue\nshin@FreeBSD.org\n1999/07/29\n2003/03/08"] snb [label="Nick Barkas\nsnb@FreeBSD.org\n2009/05/05\n2010/11/04"] tmm [label="Thomas Moestl\ntmm@FreeBSD.org\n2001/03/07\n2006/07/12"] toshi [label="Toshihiko Arai\ntoshi@FreeBSD.org\n2000/07/06\n2003/03/08"] tshiozak [label="Takuya SHIOZAKI\ntshiozak@FreeBSD.org\n2001/04/25\n2003/03/08"] uch [label="UCHIYAMA Yasushi\nuch@FreeBSD.org\n2000/06/21\n2002/04/24"] wilko [label="Wilko Bulte\nwilko@FreeBSD.org\n2000/01/13\n2013/01/17"] yar [label="Yar Tikhiy\nyar@FreeBSD.org\n2001/03/25\n2012/05/23"] zack [label="Zack Kirsch\nzack@FreeBSD.org\n2010/11/05\n2012/09/08"] node [color=lightblue2, style=filled, bgcolor=black]; # Current src committers go here. Try to keep things sorted. ache [label="Andrey Chernov\nache@FreeBSD.org\n1993/10/31"] achim [label="Achim Leubner\nachim@FreeBSD.org\n2013/01/23"] adrian [label="Adrian Chadd\nadrian@FreeBSD.org\n2000/07/03"] ae [label="Andrey V. Elsukov\nae@FreeBSD.org\n2010/06/03"] akiyama [label="Shunsuke Akiyama\nakiyama@FreeBSD.org\n2000/06/19"] alc [label="Alan Cox\nalc@FreeBSD.org\n1999/02/23"] allanjude [label="Allan Jude\nallanjude@FreeBSD.org\n2015/07/30"] ambrisko [label="Doug Ambrisko\nambrisko@FreeBSD.org\n2001/12/19"] anchie [label="Ana Kukec\nanchie@FreeBSD.org\n2010/04/14"] andre [label="Andre Oppermann\nandre@FreeBSD.org\n2003/11/12"] andreast [label="Andreas Tobler\nandreast@FreeBSD.org\n2010/09/05"] andrew [label="Andrew Turner\nandrew@FreeBSD.org\n2010/07/19"] antoine [label="Antoine Brodin\nantoine@FreeBSD.org\n2008/02/03"] araujo [label="Marcelo Araujo\naraujo@FreeBSD.org\n2015/08/04"] ariff [label="Ariff Abdullah\nariff@FreeBSD.org\n2005/11/14"] art [label="Artem Belevich\nart@FreeBSD.org\n2011/03/29"] arybchik [label="Andrew Rybchenko\narybchik@FreeBSD.org\n2014/10/12"] asomers [label="Alan Somers\nasomers@FreeBSD.org\n2013/04/24"] avg [label="Andriy Gapon\navg@FreeBSD.org\n2009/02/18"] avos [label="Andriy Voskoboinyk\navos@FreeBSD.org\n2015/09/24"] badger [label="Eric Badger\nbadger@FreeBSD.org\n2016/07/01"] bapt [label="Baptiste Daroussin\nbapt@FreeBSD.org\n2011/12/23"] bdrewery [label="Bryan Drewery\nbdrewery@FreeBSD.org\n2013/12/14"] benl [label="Ben Laurie\nbenl@FreeBSD.org\n2011/05/18"] benno [label="Benno Rice\nbenno@FreeBSD.org\n2000/11/02"] bms [label="Bruce M Simpson\nbms@FreeBSD.org\n2003/08/06"] br [label="Ruslan Bukin\nbr@FreeBSD.org\n2013/09/02"] brian [label="Brian Somers\nbrian@FreeBSD.org\n1996/12/16"] brooks [label="Brooks Davis\nbrooks@FreeBSD.org\n2001/06/21"] brucec [label="Bruce Cran\nbrucec@FreeBSD.org\n2010/01/29"] brueffer [label="Christian Brueffer\nbrueffer@FreeBSD.org\n2006/02/28"] bruno [label="Bruno Ducrot\nbruno@FreeBSD.org\n2005/07/18"] bryanv [label="Bryan Venteicher\nbryanv@FreeBSD.org\n2012/11/03"] bschmidt [label="Bernhard Schmidt\nbschmidt@FreeBSD.org\n2010/02/06"] bz [label="Bjoern A. Zeeb\nbz@FreeBSD.org\n2004/07/27"] cem [label="Conrad Meyer\ncem@FreeBSD.org\n2015/07/05"] cognet [label="Olivier Houchard\ncognet@FreeBSD.org\n2002/10/09"] cokane [label="Coleman Kane\ncokane@FreeBSD.org\n2000/06/19"] cperciva [label="Colin Percival\ncperciva@FreeBSD.org\n2004/01/20"] csjp [label="Christian S.J. Peron\ncsjp@FreeBSD.org\n2004/05/04"] das [label="David Schultz\ndas@FreeBSD.org\n2003/02/21"] davide [label="Davide Italiano\ndavide@FreeBSD.org\n2012/01/27"] dchagin [label="Dmitry Chagin\ndchagin@FreeBSD.org\n2009/02/28"] delphij [label="Xin Li\ndelphij@FreeBSD.org\n2004/09/14"] des [label="Dag-Erling Smorgrav\ndes@FreeBSD.org\n1998/04/03"] dfr [label="Doug Rabson\ndfr@FreeBSD.org\n????/??/??"] dg [label="David Greenman\ndg@FreeBSD.org\n1993/06/14"] dim [label="Dimitry Andric\ndim@FreeBSD.org\n2010/08/30"] dteske [label="Devin Teske\ndteske@FreeBSD.org\n2012/04/10"] dumbbell [label="Jean-Sebastien Pedron\ndumbbell@FreeBSD.org\n2004/11/29"] dwmalone [label="David Malone\ndwmalone@FreeBSD.org\n2000/07/11"] eadler [label="Eitan Adler\neadler@FreeBSD.org\n2012/01/18"] ed [label="Ed Schouten\ned@FreeBSD.org\n2008/05/22"] edavis [label="Eric Davis\nedavis@FreeBSD.org\n2013/10/09"] edwin [label="Edwin Groothuis\nedwin@FreeBSD.org\n2007/06/25"] eivind [label="Eivind Eklund\neivind@FreeBSD.org\n1997/02/02"] emaste [label="Ed Maste\nemaste@FreeBSD.org\n2005/10/04"] emax [label="Maksim Yevmenkin\nemax@FreeBSD.org\n2003/10/12"] eri [label="Ermal Luci\neri@FreeBSD.org\n2008/06/11"] erj [label="Eric Joyner\nerj@FreeBSD.org\n2014/12/14"] fabient [label="Fabien Thomas\nfabient@FreeBSD.org\n2009/03/16"] fanf [label="Tony Finch\nfanf@FreeBSD.org\n2002/05/05"] fjoe [label="Max Khon\nfjoe@FreeBSD.org\n2001/08/06"] flz [label="Florent Thoumie\nflz@FreeBSD.org\n2006/03/30"] gabor [label="Gabor Kovesdan\ngabor@FreeBSD.org\n2010/02/02"] gad [label="Garance A. Drosehn\ngad@FreeBSD.org\n2000/10/27"] gallatin [label="Andrew Gallatin\ngallatin@FreeBSD.org\n1999/01/15"] gavin [label="Gavin Atkinson\ngavin@FreeBSD.org\n2009/12/07"] gibbs [label="Justin T. Gibbs\ngibbs@FreeBSD.org\n????/??/??"] gjb [label="Glen Barber\ngjb@FreeBSD.org\n2013/06/04"] gleb [label="Gleb Kurtsou\ngleb@FreeBSD.org\n2011/09/19"] glebius [label="Gleb Smirnoff\nglebius@FreeBSD.org\n2004/07/14"] gnn [label="George V. Neville-Neil\ngnn@FreeBSD.org\n2004/10/11"] gordon [label="Gordon Tetlow\ngordon@FreeBSD.org\n2002/05/17"] grehan [label="Peter Grehan\ngrehan@FreeBSD.org\n2002/08/08"] grog [label="Greg Lehey\ngrog@FreeBSD.org\n1998/08/30"] gshapiro [label="Gregory Shapiro\ngshapiro@FreeBSD.org\n2000/07/12"] harti [label="Hartmut Brandt\nharti@FreeBSD.org\n2003/01/29"] hiren [label="Hiren Panchasara\nhiren@FreeBSD.org\n2013/04/12"] hmp [label="Hiten Pandya\nhmp@FreeBSD.org\n2004/03/23"] ian [label="Ian Lepore\nian@FreeBSD.org\n2013/01/07"] iedowse [label="Ian Dowse\niedowse@FreeBSD.org\n2000/12/01"] imp [label="Warner Losh\nimp@FreeBSD.org\n1996/09/20"] ivoras [label="Ivan Voras\nivoras@FreeBSD.org\n2008/06/10"] jah [label="Jason A. Harmening\njah@FreeBSD.org\n2015/03/08"] jamie [label="Jamie Gritton\njamie@FreeBSD.org\n2009/01/28"] jasone [label="Jason Evans\njasone@FreeBSD.org\n1999/03/03"] jceel [label="Jakub Klama\njceel@FreeBSD.org\n2011/09/25"] jch [label="Julien Charbon\njch@FreeBSD.org\n2014/09/24"] jchandra [label="Jayachandran C.\njchandra@FreeBSD.org\n2010/05/19"] jeff [label="Jeff Roberson\njeff@FreeBSD.org\n2002/02/21"] jh [label="Jaakko Heinonen\njh@FreeBSD.org\n2009/10/02"] jhb [label="John Baldwin\njhb@FreeBSD.org\n1999/08/23"] jhibbits [label="Justin Hibbits\njhibbits@FreeBSD.org\n2011/11/30"] jilles [label="Jilles Tjoelker\njilles@FreeBSD.org\n2009/05/22"] jimharris [label="Jim Harris\njimharris@FreeBSD.org\n2011/12/09"] jinmei [label="JINMEI Tatuya\njinmei@FreeBSD.org\n2007/03/17"] jkim [label="Jung-uk Kim\njkim@FreeBSD.org\n2005/07/06"] jkoshy [label="A. Joseph Koshy\njkoshy@FreeBSD.org\n1998/05/13"] jlh [label="Jeremie Le Hen\njlh@FreeBSD.org\n2012/04/22"] jls [label="Jordan Sissel\njls@FreeBSD.org\n2006/12/06"] jmcneill [label="Jared McNeill\njmcneill@FreeBSD.org\n2016/02/24"] jmg [label="John-Mark Gurney\njmg@FreeBSD.org\n1997/02/13"] jmmv [label="Julio Merino\njmmv@FreeBSD.org\n2013/11/02"] joerg [label="Joerg Wunsch\njoerg@FreeBSD.org\n1993/11/14"] jon [label="Jonathan Chen\njon@FreeBSD.org\n2000/10/17"] jonathan [label="Jonathan Anderson\njonathan@FreeBSD.org\n2010/10/07"] jpaetzel [label="Josh Paetzel\njpaetzel@FreeBSD.org\n2011/01/21"] jtl [label="Jonathan T. Looney\njtl@FreeBSD.org\n2015/10/26"] julian [label="Julian Elischer\njulian@FreeBSD.org\n1993/04/19"] jwd [label="John De Boskey\njwd@FreeBSD.org\n2000/05/19"] kaiw [label="Kai Wang\nkaiw@FreeBSD.org\n2007/09/26"] kan [label="Alexander Kabaev\nkan@FreeBSD.org\n2002/07/21"] karels [label="Mike Karels\nkarels@FreeBSD.org\n2016/06/09"] ken [label="Ken Merry\nken@FreeBSD.org\n1998/09/08"] kensmith [label="Ken Smith\nkensmith@FreeBSD.org\n2004/01/23"] kevlo [label="Kevin Lo\nkevlo@FreeBSD.org\n2006/07/23"] kib [label="Konstantin Belousov\nkib@FreeBSD.org\n2006/06/03"] kmacy [label="Kip Macy\nkmacy@FreeBSD.org\n2005/06/01"] kp [label="Kristof Provost\nkp@FreeBSD.org\n2015/03/22"] landonf [label="Landon Fuller\nlandonf@FreeBSD.org\n2016/05/31"] le [label="Lukas Ertl\nle@FreeBSD.org\n2004/02/02"] lidl [label="Kurt Lidl\nlidl@FreeBSD.org\n2015/10/21"] loos [label="Luiz Otavio O Souza\nloos@FreeBSD.org\n2013/07/03"] lstewart [label="Lawrence Stewart\nlstewart@FreeBSD.org\n2008/10/06"] manu [label="Emmanuel Vadot\nmanu@FreeBSD.org\n2016/04/24"] marcel [label="Marcel Moolenaar\nmarcel@FreeBSD.org\n1999/07/03"] marius [label="Marius Strobl\nmarius@FreeBSD.org\n2004/04/17"] markj [label="Mark Johnston\nmarkj@FreeBSD.org\n2012/12/18"] markm [label="Mark Murray\nmarkm@FreeBSD.org\n1995/04/24"] markus [label="Markus Brueffer\nmarkus@FreeBSD.org\n2006/06/01"] matteo [label="Matteo Riondato\nmatteo@FreeBSD.org\n2006/01/18"] mav [label="Alexander Motin\nmav@FreeBSD.org\n2007/04/12"] maxim [label="Maxim Konovalov\nmaxim@FreeBSD.org\n2002/02/07"] mdf [label="Matthew Fleming\nmdf@FreeBSD.org\n2010/06/04"] mdodd [label="Matthew N. Dodd\nmdodd@FreeBSD.org\n1999/07/27"] melifaro [label="Alexander V. Chernikov\nmelifaro@FreeBSD.org\n2011/10/04"] +mizhka [label="Michael Zhilin\nmizhka@FreeBSD.org\n2016/07/19"] mjacob [label="Matt Jacob\nmjacob@FreeBSD.org\n1997/08/13"] mjg [label="Mateusz Guzik\nmjg@FreeBSD.org\n2012/06/04"] mlaier [label="Max Laier\nmlaier@FreeBSD.org\n2004/02/10"] mmel [label="Michal Meloun\nmmel@FreeBSD.org\n2015/11/01"] monthadar [label="Monthadar Al Jaberi\nmonthadar@FreeBSD.org\n2012/04/02"] mp [label="Mark Peek\nmp@FreeBSD.org\n2001/07/27"] mr [label="Michael Reifenberger\nmr@FreeBSD.org\n2001/09/30"] neel [label="Neel Natu\nneel@FreeBSD.org\n2009/09/20"] netchild [label="Alexander Leidinger\nnetchild@FreeBSD.org\n2005/03/31"] ngie [label="Ngie Cooper\nngie@FreeBSD.org\n2014/07/27"] nork [label="Norikatsu Shigemura\nnork@FreeBSD.org\n2009/06/09"] np [label="Navdeep Parhar\nnp@FreeBSD.org\n2009/06/05"] nwhitehorn [label="Nathan Whitehorn\nnwhitehorn@FreeBSD.org\n2008/07/03"] n_hibma [label="Nick Hibma\nn_hibma@FreeBSD.org\n1998/11/26"] obrien [label="David E. O'Brien\nobrien@FreeBSD.org\n1996/10/29"] olli [label="Oliver Fromme\nolli@FreeBSD.org\n2008/02/14"] oshogbo [label="Mariusz Zaborski\noshogbo@FreeBSD.org\n2015/04/15"] peadar [label="Peter Edwards\npeadar@FreeBSD.org\n2004/03/08"] peter [label="Peter Wemm\npeter@FreeBSD.org\n1995/07/04"] peterj [label="Peter Jeremy\npeterj@FreeBSD.org\n2012/09/14"] pfg [label="Pedro Giffuni\npfg@FreeBSD.org\n2011/12/01"] phil [label="Phil Shafer\nphil@FreeBSD.ogr\n2016/12/30"] philip [label="Philip Paeps\nphilip@FreeBSD.org\n2004/01/21"] phk [label="Poul-Henning Kamp\nphk@FreeBSD.org\n1994/02/21"] pho [label="Peter Holm\npho@FreeBSD.org\n2008/11/16"] pjd [label="Pawel Jakub Dawidek\npjd@FreeBSD.org\n2004/02/02"] pkelsey [label="Patrick Kelsey\pkelsey@FreeBSD.org\n2014/05/29"] pluknet [label="Sergey Kandaurov\npluknet@FreeBSD.org\n2010/10/05"] ps [label="Paul Saab\nps@FreeBSD.org\n2000/02/23"] qingli [label="Qing Li\nqingli@FreeBSD.org\n2005/04/13"] ray [label="Aleksandr Rybalko\nray@FreeBSD.org\n2011/05/25"] rdivacky [label="Roman Divacky\nrdivacky@FreeBSD.org\n2008/03/13"] remko [label="Remko Lodder\nremko@FreeBSD.org\n2007/02/23"] rik [label="Roman Kurakin\nrik@FreeBSD.org\n2003/12/18"] rmacklem [label="Rick Macklem\nrmacklem@FreeBSD.org\n2009/03/27"] rmh [label="Robert Millan\nrmh@FreeBSD.org\n2011/09/18"] rnoland [label="Robert Noland\nrnoland@FreeBSD.org\n2008/09/15"] roberto [label="Ollivier Robert\nroberto@FreeBSD.org\n1995/02/22"] rodrigc [label="Craig Rodrigues\nrodrigc@FreeBSD.org\n2005/05/14"] royger [label="Roger Pau Monne\nroyger@FreeBSD.org\n2013/11/26"] rpaulo [label="Rui Paulo\nrpaulo@FreeBSD.org\n2007/09/25"] rpokala [label="Ravi Pokala\nrpokala@FreeBSD.org\n2015/11/19"] rrs [label="Randall R Stewart\nrrs@FreeBSD.org\n2007/02/08"] rse [label="Ralf S. Engelschall\nrse@FreeBSD.org\n1997/07/31"] rstone [label="Ryan Stone\nrstone@FreeBSD.org\n2010/04/19"] ru [label="Ruslan Ermilov\nru@FreeBSD.org\n1999/05/27"] rwatson [label="Robert N. M. Watson\nrwatson@FreeBSD.org\n1999/12/16"] sam [label="Sam Leffler\nsam@FreeBSD.org\n2002/07/02"] sanpei [label="MIHIRA Sanpei Yoshiro\nsanpei@FreeBSD.org\n2000/06/19"] sbruno [label="Sean Bruno\nsbruno@FreeBSD.org\n2008/08/02"] scf [label="Sean C. Farley\nscf@FreeBSD.org\n2007/06/24"] schweikh [label="Jens Schweikhardt\nschweikh@FreeBSD.org\n2001/04/06"] scottl [label="Scott Long\nscottl@FreeBSD.org\n2000/09/28"] se [label="Stefan Esser\nse@FreeBSD.org\n1994/08/26"] sephe [label="Sepherosa Ziehau\nsephe@FreeBSD.org\n2007/03/28"] sepotvin [label="Stephane E. Potvin\nsepotvin@FreeBSD.org\n2007/02/15"] sgalabov [label="Stanislav Galabov\nsgalabov@FreeBSD.org\n2016/02/24"] simon [label="Simon L. Nielsen\nsimon@FreeBSD.org\n2006/03/07"] sjg [label="Simon J. Gerraty\nsjg@FreeBSD.org\n2012/10/23"] skra [label="Svatopluk Kraus\nskra@FreeBSD.org\n2015/10/28"] slm [label="Stephen McConnell\nslm@FreeBSD.org\n2014/05/07"] smh [label="Steven Hartland\nsmh@FreeBSD.org\n2012/11/12"] sobomax [label="Maxim Sobolev\nsobomax@FreeBSD.org\n2001/07/25"] sos [label="Soren Schmidt\nsos@FreeBSD.org\n????/??/??"] sson [label="Stacey Son\nsson@FreeBSD.org\n2008/07/08"] stas [label="Stanislav Sedov\nstas@FreeBSD.org\n2008/08/22"] +stevek [label="Stephen J. Kiernan\nstevek@FreeBSD.org\n2016/07/18"] suz [label="SUZUKI Shinsuke\nsuz@FreeBSD.org\n2002/03/26"] syrinx [label="Shteryana Shopova\nsyrinx@FreeBSD.org\n2006/10/07"] takawata [label="Takanori Watanabe\ntakawata@FreeBSD.org\n2000/07/06"] theraven [label="David Chisnall\ntheraven@FreeBSD.org\n2011/11/11"] thompsa [label="Andrew Thompson\nthompsa@FreeBSD.org\n2005/05/25"] ticso [label="Bernd Walter\nticso@FreeBSD.org\n2002/01/31"] tijl [label="Tijl Coosemans\ntijl@FreeBSD.org\n2010/07/16"] trasz [label="Edward Tomasz Napierala\ntrasz@FreeBSD.org\n2008/08/22"] trhodes [label="Tom Rhodes\ntrhodes@FreeBSD.org\n2002/05/28"] trociny [label="Mikolaj Golub\ntrociny@FreeBSD.org\n2011/03/10"] tuexen [label="Michael Tuexen\ntuexen@FreeBSD.org\n2009/06/06"] tychon [label="Tycho Nightingale\ntychon@FreeBSD.org\n2014/01/21"] ume [label="Hajimu UMEMOTO\nume@FreeBSD.org\n2000/02/26"] uqs [label="Ulrich Spoerlein\nuqs@FreeBSD.org\n2010/01/28"] vangyzen [label="Eric van Gyzen\nvangyzen@FreeBSD.org\n2015/03/08"] vanhu [label="Yvan Vanhullebus\nvanhu@FreeBSD.org\n2008/07/21"] versus [label="Konrad Jankowski\nversus@FreeBSD.org\n2008/10/27"] weongyo [label="Weongyo Jeong\nweongyo@FreeBSD.org\n2007/12/21"] wes [label="Wes Peters\nwes@FreeBSD.org\n1998/11/25"] whu [label="Wei Hu\nwhu@FreeBSD.org\n2015/02/11"] wkoszek [label="Wojciech A. Koszek\nwkoszek@FreeBSD.org\n2006/02/21"] wma [label="Wojciech Macek\nwma@FreeBSD.org\n2016/01/18"] wollman [label="Garrett Wollman\nwollman@FreeBSD.org\n????/??/??"] wsalamon [label="Wayne Salamon\nwsalamon@FreeBSD.org\n2005/06/25"] yongari [label="Pyun YongHyeon\nyongari@FreeBSD.org\n2004/08/01"] zbb [label="Zbigniew Bodek\nzbb@FreeBSD.org\n2013/09/02"] zec [label="Marko Zec\nzec@FreeBSD.org\n2008/06/22"] zml [label="Zachary Loafman\nzml@FreeBSD.org\n2009/05/27"] zont [label="Andrey Zonov\nzont@FreeBSD.org\n2012/08/21"] # Pseudo target representing rev 1.1 of commit.allow day1 [label="Birth of FreeBSD"] # Here are the mentor/mentee relationships. # Group together all the mentees for a particular mentor. # Keep the list sorted by mentor login. day1 -> jtc day1 -> jkh day1 -> nate day1 -> rgrimes day1 -> alm day1 -> dg adrian -> avos adrian -> jmcneill adrian -> landonf adrian -> lidl adrian -> loos +adrian -> mizhka adrian -> monthadar adrian -> ray adrian -> rmh adrian -> sephe adrian -> sgalabov ae -> melifaro alc -> davide andre -> qingli andrew -> manu anholt -> jkim avg -> art avg -> pluknet avg -> smh bapt -> allanjude bapt -> araujo bapt -> bdrewery benno -> grehan billf -> dougb billf -> gad billf -> jedgar billf -> jhb billf -> shafeeq bmilekic -> csjp bms -> dhartmei bms -> mlaier bms -> thompsa brian -> joe brooks -> bushman brooks -> jamie brooks -> theraven bz -> anchie bz -> jamie bz -> syrinx cognet -> br cognet -> jceel cognet -> kevlo cognet -> ian cognet -> manu cognet -> wkoszek cognet -> wma cognet -> zbb cperciva -> eadler cperciva -> flz cperciva -> randi cperciva -> simon csjp -> bushman das -> kargl das -> rodrigc delphij -> gabor delphij -> rafan delphij -> sephe des -> anholt des -> hmp des -> mike des -> olli des -> ru des -> bapt dds -> versus dfr -> gallatin dfr -> zml dg -> peter dim -> theraven dwmalone -> fanf dwmalone -> peadar dwmalone -> snb ed -> dim ed -> gavin ed -> jilles ed -> rdivacky ed -> uqs eivind -> des eivind -> rwatson emaste -> achim emaste -> rstone emaste -> dteske emaste -> markj emax -> markus fjoe -> versus gallatin -> ticso gavin -> versus gibbs -> mjacob gibbs -> njl gibbs -> royger gibbs -> whu glebius -> mav gnn -> jinmei gnn -> rrs gnn -> ivoras gnn -> vanhu gnn -> lstewart gnn -> np gnn -> davide gnn -> arybchik gnn -> erj gnn -> kp gnn -> jtl gnn -> karels gonzo -> jmcneill grehan -> bryanv grog -> edwin grog -> le grog -> peterj imp -> akiyama imp -> ambrisko imp -> andrew imp -> bmah imp -> bruno imp -> dmlb imp -> emax imp -> furuta imp -> joe imp -> jon imp -> keichii imp -> mb imp -> mr imp -> neel imp -> non imp -> nork imp -> onoe imp -> remko imp -> rik imp -> rink imp -> sanpei imp -> shiba imp -> takawata imp -> toshi imp -> uch jake -> bms jake -> gordon jake -> harti jake -> jeff jake -> kmacy jake -> robert jake -> yongari jb -> sson jdp -> fjoe jfv -> erj jhb -> arr jhb -> avg jhb -> jch jhb -> jeff jhb -> kbyanc jhb -> peterj jhb -> pfg jhb -> rnoland jhb -> rpokala jimharris -> carl jkh -> dfr jkh -> gj jkh -> grog jkh -> imp jkh -> jlemon jkh -> joerg jkh -> jwd jkh -> msmith jkh -> murray jkh -> phk jkh -> wes jkh -> yar jkoshy -> kaiw jkoshy -> fabient jkoshy -> rstone jlemon -> bmilekic jlemon -> brooks jmallett -> pkelsey jmmv -> ngie joerg -> brian joerg -> eik joerg -> jmg joerg -> le joerg -> netchild joerg -> schweikh julian -> glebius julian -> davidxu julian -> archie julian -> adrian julian -> zec julian -> mp kan -> kib ken -> asomers ken -> slm kib -> ae kib -> badger kib -> dchagin kib -> gjb kib -> jah kib -> jlh kib -> jpaetzel kib -> lulf kib -> melifaro kib -> mmel kib -> pho kib -> pluknet kib -> rdivacky kib -> rmacklem kib -> rmh kib -> skra kib -> stas kib -> tijl kib -> trociny kib -> vangyzen kib -> zont kmacy -> lstewart marcel -> allanjude marcel -> art marcel -> arun marcel -> marius marcel -> nwhitehorn marcel -> sjg markj -> cem markm -> jasone markm -> sheldonh mav -> ae mdf -> gleb mdodd -> jake mike -> das mlaier -> benjsc mlaier -> dhartmei mlaier -> thompsa mlaier -> eri msmith -> cokane msmith -> jasone msmith -> scottl murray -> delphij mux -> cognet mux -> dumbbell netchild -> ariff njl -> marks njl -> philip njl -> rpaulo njl -> sepotvin nwhitehorn -> andreast nwhitehorn -> jhibbits obrien -> benno obrien -> groudier obrien -> gshapiro obrien -> kan obrien -> sam peter -> asmodai peter -> jayanth peter -> ps philip -> benl philip -> ed philip -> jls philip -> matteo philip -> uqs philip -> kp phk -> jkoshy phk -> mux pjd -> kib pjd -> lulf pjd -> oshogbo pjd -> smh pjd -> trociny rgrimes -> markm rmacklem -> jwd royger -> whu rpaulo -> avg rpaulo -> bschmidt rpaulo -> dim rpaulo -> jmmv rpaulo -> lidl rpaulo -> ngie rrs -> brucec rrs -> jchandra rrs -> tuexen rstone -> markj ru -> ceri ru -> cjc ru -> eik ru -> maxim ru -> sobomax rwatson -> adrian rwatson -> antoine rwatson -> bmah rwatson -> brueffer rwatson -> bz rwatson -> cperciva rwatson -> emaste rwatson -> gnn rwatson -> jh rwatson -> jonathan rwatson -> kensmith rwatson -> kmacy rwatson -> linimon rwatson -> rmacklem rwatson -> shafeeq rwatson -> tmm rwatson -> trasz rwatson -> trhodes rwatson -> wsalamon rodrigc -> araujo sam -> andre sam -> benjsc sam -> sephe sbruno -> hiren sbruno -> jimharris schweikh -> dds scottl -> achim scottl -> jimharris scottl -> pjd scottl -> sah scottl -> sbruno scottl -> slm scottl -> yongari sheldonh -> dwmalone sheldonh -> iedowse shin -> ume simon -> benl sjg -> phil +sjg -> stevek sos -> marcel theraven -> phil thompsa -> weongyo thompsa -> eri trasz -> jh trasz -> mjg ume -> jinmei ume -> suz ume -> tshiozak vangyzen -> badger wes -> scf wkoszek -> jceel wollman -> gad zml -> mdf zml -> zack } Index: user/alc/PQ_LAUNDRY/share/mk/bsd.cpu.mk =================================================================== --- user/alc/PQ_LAUNDRY/share/mk/bsd.cpu.mk (revision 303205) +++ user/alc/PQ_LAUNDRY/share/mk/bsd.cpu.mk (revision 303206) @@ -1,364 +1,365 @@ # $FreeBSD$ # Set default CPU compile flags and baseline CPUTYPE for each arch. The # compile flags must support the minimum CPU type for each architecture but # may tune support for more advanced processors. .if !defined(CPUTYPE) || empty(CPUTYPE) _CPUCFLAGS = . if ${MACHINE_CPUARCH} == "aarch64" MACHINE_CPU = arm64 . elif ${MACHINE_CPUARCH} == "amd64" MACHINE_CPU = amd64 sse2 sse mmx . elif ${MACHINE_CPUARCH} == "arm" MACHINE_CPU = arm . elif ${MACHINE_CPUARCH} == "i386" MACHINE_CPU = i486 . elif ${MACHINE_CPUARCH} == "mips" MACHINE_CPU = mips . elif ${MACHINE_CPUARCH} == "powerpc" MACHINE_CPU = aim . elif ${MACHINE_CPUARCH} == "riscv" MACHINE_CPU = riscv . elif ${MACHINE_CPUARCH} == "sparc64" MACHINE_CPU = ultrasparc . endif .else # Handle aliases (not documented in make.conf to avoid user confusion # between e.g. i586 and pentium) . if ${MACHINE_CPUARCH} == "amd64" || ${MACHINE_CPUARCH} == "i386" . if ${CPUTYPE} == "barcelona" CPUTYPE = amdfam10 . elif ${CPUTYPE} == "core-avx2" CPUTYPE = haswell . elif ${CPUTYPE} == "core-avx-i" CPUTYPE = ivybridge . elif ${CPUTYPE} == "corei7-avx" CPUTYPE = sandybridge . elif ${CPUTYPE} == "corei7" CPUTYPE = nehalem . elif ${CPUTYPE} == "slm" CPUTYPE = silvermont . elif ${CPUTYPE} == "atom" CPUTYPE = bonnell . elif ${CPUTYPE} == "core" CPUTYPE = prescott . endif . if ${MACHINE_CPUARCH} == "amd64" . if ${CPUTYPE} == "prescott" CPUTYPE = nocona . endif . else . if ${CPUTYPE} == "k7" CPUTYPE = athlon . elif ${CPUTYPE} == "p4" CPUTYPE = pentium4 . elif ${CPUTYPE} == "p4m" CPUTYPE = pentium4m . elif ${CPUTYPE} == "p3" CPUTYPE = pentium3 . elif ${CPUTYPE} == "p3m" CPUTYPE = pentium3m . elif ${CPUTYPE} == "p-m" CPUTYPE = pentium-m . elif ${CPUTYPE} == "p2" CPUTYPE = pentium2 . elif ${CPUTYPE} == "i686" CPUTYPE = pentiumpro . elif ${CPUTYPE} == "i586/mmx" CPUTYPE = pentium-mmx . elif ${CPUTYPE} == "i586" CPUTYPE = pentium . endif . endif . elif ${MACHINE_ARCH} == "sparc64" . if ${CPUTYPE} == "us" CPUTYPE = ultrasparc . elif ${CPUTYPE} == "us3" CPUTYPE = ultrasparc3 . endif . endif ############################################################################### # Logic to set up correct gcc optimization flag. This must be included # after /etc/make.conf so it can react to the local value of CPUTYPE # defined therein. Consult: # http://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html # http://gcc.gnu.org/onlinedocs/gcc/RS-6000-and-PowerPC-Options.html # http://gcc.gnu.org/onlinedocs/gcc/MIPS-Options.html # http://gcc.gnu.org/onlinedocs/gcc/SPARC-Options.html # http://gcc.gnu.org/onlinedocs/gcc/i386-and-x86_002d64-Options.html . if ${MACHINE_CPUARCH} == "i386" . if ${CPUTYPE} == "crusoe" _CPUCFLAGS = -march=i686 -falign-functions=0 -falign-jumps=0 -falign-loops=0 . elif ${CPUTYPE} == "k5" _CPUCFLAGS = -march=pentium . elif ${CPUTYPE} == "c7" _CPUCFLAGS = -march=c3-2 . else _CPUCFLAGS = -march=${CPUTYPE} . endif . elif ${MACHINE_CPUARCH} == "amd64" _CPUCFLAGS = -march=${CPUTYPE} . elif ${MACHINE_CPUARCH} == "arm" . if ${CPUTYPE} == "xscale" #XXX: gcc doesn't seem to like -mcpu=xscale, and dies while rebuilding itself #_CPUCFLAGS = -mcpu=xscale _CPUCFLAGS = -march=armv5te -D__XSCALE__ . elif ${CPUTYPE:M*soft*} != "" _CPUCFLAGS = -mfloat-abi=softfp . elif ${CPUTYPE} == "armv6" # Not sure we still need ARM_ARCH_6=1 here. _CPUCFLAGS = -march=${CPUTYPE} -DARM_ARCH_6=1 . elif ${CPUTYPE} == "cortexa" _CPUCFLAGS = -march=armv7 -DARM_ARCH_6=1 -mfpu=vfp . elif ${CPUTYPE:Marmv[4567]*} != "" # Handle all the armvX types that FreeBSD runs: # armv4, armv4t, armv5, armv5te, armv6, armv6t2, armv7, armv7-a, armv7ve # they require -march=. All the others require -mcpu=. _CPUCFLAGS = -march=${CPUTYPE} . else # Common values for FreeBSD # arm: (any arm v4 or v5 processor you are targeting) # arm920t, arm926ej-s, marvell-pj4, fa526, fa626, # fa606te, fa626te, fa726te # armv6: (any arm v7 or v8 processor you are targeting and the arm1176jzf-s) # arm1176jzf-s, generic-armv7-a, cortex-a5, cortex-a7, cortex-a8, # cortex-a9, cortex-a12, cortex-a15, cortex-a17, cortex-a53, cortex-a57, # cortex-a72, exynos-m1 _CPUCFLAGS = -mcpu=${CPUTYPE} . endif . elif ${MACHINE_ARCH} == "powerpc" . if ${CPUTYPE} == "e500" _CPUCFLAGS = -Wa,-me500 -msoft-float . else _CPUCFLAGS = -mcpu=${CPUTYPE} -mno-powerpc64 . endif . elif ${MACHINE_ARCH} == "powerpc64" _CPUCFLAGS = -mcpu=${CPUTYPE} . elif ${MACHINE_CPUARCH} == "mips" # mips[1234], mips32, mips64, and all later releases need to have mips # preserved (releases later than r2 require external toolchain) . if ${CPUTYPE:Mmips32*} != "" || ${CPUTYPE:Mmips64*} != "" || \ ${CPUTYPE:Mmips[1234]} != "" _CPUCFLAGS = -march=${CPUTYPE} . else # Default -march to the CPUTYPE passed in, with mips stripped off so we # accept either mips4kc or 4kc, mostly for historical reasons # Typical values for cores: # 4kc, 24kc, 34kc, 74kc, 1004kc, octeon, octeon+, octeon2, octeon3, # sb1, xlp, xlr _CPUCFLAGS = -march=${CPUTYPE:S/^mips//} . endif . elif ${MACHINE_CPUARCH} == "riscv" _CPUCFLAGS = -msoft-float # -march="RV64I" # RISCVTODO . elif ${MACHINE_ARCH} == "sparc64" . if ${CPUTYPE} == "v9" _CPUCFLAGS = -mcpu=v9 . elif ${CPUTYPE} == "ultrasparc" _CPUCFLAGS = -mcpu=ultrasparc . elif ${CPUTYPE} == "ultrasparc3" _CPUCFLAGS = -mcpu=ultrasparc3 . endif . elif ${MACHINE_CPUARCH} == "aarch64" _CPUCFLAGS = -mcpu=${CPUTYPE} . endif # Set up the list of CPU features based on the CPU type. This is an # unordered list to make it easy for client makefiles to test for the # presence of a CPU feature. ########## i386 . if ${MACHINE_CPUARCH} == "i386" . if ${CPUTYPE} == "bdver4" MACHINE_CPU = xop avx2 avx sse42 sse41 ssse3 sse4a sse3 sse2 sse mmx k6 k5 i586 . elif ${CPUTYPE} == "bdver3" || ${CPUTYPE} == "bdver2" || \ ${CPUTYPE} == "bdver1" MACHINE_CPU = xop avx sse42 sse41 ssse3 sse4a sse3 sse2 sse mmx k6 k5 i586 . elif ${CPUTYPE} == "btver2" MACHINE_CPU = avx sse42 sse41 ssse3 sse4a sse3 sse2 sse mmx k6 k5 i586 . elif ${CPUTYPE} == "btver1" MACHINE_CPU = ssse3 sse4a sse3 sse2 sse mmx k6 k5 i586 . elif ${CPUTYPE} == "amdfam10" MACHINE_CPU = athlon-xp athlon k7 3dnow sse4a sse3 sse2 sse mmx k6 k5 i586 . elif ${CPUTYPE} == "opteron-sse3" || ${CPUTYPE} == "athlon64-sse3" MACHINE_CPU = athlon-xp athlon k7 3dnow sse3 sse2 sse mmx k6 k5 i586 . elif ${CPUTYPE} == "opteron" || ${CPUTYPE} == "athlon64" || \ ${CPUTYPE} == "athlon-fx" MACHINE_CPU = athlon-xp athlon k7 3dnow sse2 sse mmx k6 k5 i586 . elif ${CPUTYPE} == "athlon-mp" || ${CPUTYPE} == "athlon-xp" || \ ${CPUTYPE} == "athlon-4" MACHINE_CPU = athlon-xp athlon k7 3dnow sse mmx k6 k5 i586 . elif ${CPUTYPE} == "athlon" || ${CPUTYPE} == "athlon-tbird" MACHINE_CPU = athlon k7 3dnow mmx k6 k5 i586 . elif ${CPUTYPE} == "k6-3" || ${CPUTYPE} == "k6-2" || ${CPUTYPE} == "geode" MACHINE_CPU = 3dnow mmx k6 k5 i586 . elif ${CPUTYPE} == "k6" MACHINE_CPU = mmx k6 k5 i586 . elif ${CPUTYPE} == "k5" MACHINE_CPU = k5 i586 . elif ${CPUTYPE} == "skylake" || ${CPUTYPE} == "knl" MACHINE_CPU = avx512 avx2 avx sse42 sse41 ssse3 sse3 sse2 sse i686 mmx i586 . elif ${CPUTYPE} == "broadwell" || ${CPUTYPE} == "haswell" MACHINE_CPU = avx2 avx sse42 sse41 ssse3 sse3 sse2 sse i686 mmx i586 . elif ${CPUTYPE} == "ivybridge" || ${CPUTYPE} == "sandybridge" MACHINE_CPU = avx sse42 sse41 ssse3 sse3 sse2 sse i686 mmx i586 . elif ${CPUTYPE} == "westmere" || ${CPUTYPE} == "nehalem" || \ ${CPUTYPE} == "silvermont" MACHINE_CPU = sse42 sse41 ssse3 sse3 sse2 sse i686 mmx i586 . elif ${CPUTYPE} == "penryn" MACHINE_CPU = sse41 ssse3 sse3 sse2 sse i686 mmx i586 . elif ${CPUTYPE} == "core2" || ${CPUTYPE} == "bonnell" MACHINE_CPU = ssse3 sse3 sse2 sse i686 mmx i586 . elif ${CPUTYPE} == "yonah" || ${CPUTYPE} == "prescott" MACHINE_CPU = sse3 sse2 sse i686 mmx i586 . elif ${CPUTYPE} == "pentium4" || ${CPUTYPE} == "pentium4m" || \ ${CPUTYPE} == "pentium-m" MACHINE_CPU = sse2 sse i686 mmx i586 . elif ${CPUTYPE} == "pentium3" || ${CPUTYPE} == "pentium3m" MACHINE_CPU = sse i686 mmx i586 . elif ${CPUTYPE} == "pentium2" MACHINE_CPU = i686 mmx i586 . elif ${CPUTYPE} == "pentiumpro" MACHINE_CPU = i686 i586 . elif ${CPUTYPE} == "pentium-mmx" MACHINE_CPU = mmx i586 . elif ${CPUTYPE} == "pentium" MACHINE_CPU = i586 . elif ${CPUTYPE} == "c7" MACHINE_CPU = sse3 sse2 sse i686 mmx i586 . elif ${CPUTYPE} == "c3-2" MACHINE_CPU = sse i686 mmx i586 . elif ${CPUTYPE} == "c3" MACHINE_CPU = 3dnow mmx i586 . elif ${CPUTYPE} == "winchip2" MACHINE_CPU = 3dnow mmx . elif ${CPUTYPE} == "winchip-c6" MACHINE_CPU = mmx . endif MACHINE_CPU += i486 ########## amd64 . elif ${MACHINE_CPUARCH} == "amd64" . if ${CPUTYPE} == "bdver4" MACHINE_CPU = xop avx2 avx sse42 sse41 ssse3 sse4a sse3 . elif ${CPUTYPE} == "bdver3" || ${CPUTYPE} == "bdver2" || \ ${CPUTYPE} == "bdver1" MACHINE_CPU = xop avx sse42 sse41 ssse3 sse4a sse3 . elif ${CPUTYPE} == "btver2" MACHINE_CPU = avx sse42 sse41 ssse3 sse4a sse3 . elif ${CPUTYPE} == "btver1" MACHINE_CPU = ssse3 sse4a sse3 . elif ${CPUTYPE} == "amdfam10" MACHINE_CPU = k8 3dnow sse4a sse3 . elif ${CPUTYPE} == "opteron-sse3" || ${CPUTYPE} == "athlon64-sse3" || \ ${CPUTYPE} == "k8-sse3" MACHINE_CPU = k8 3dnow sse3 . elif ${CPUTYPE} == "opteron" || ${CPUTYPE} == "athlon64" || \ ${CPUTYPE} == "athlon-fx" || ${CPUTYPE} == "k8" MACHINE_CPU = k8 3dnow . elif ${CPUTYPE} == "skylake" || ${CPUTYPE} == "knl" MACHINE_CPU = avx512 avx2 avx sse42 sse41 ssse3 sse3 . elif ${CPUTYPE} == "broadwell" || ${CPUTYPE} == "haswell" MACHINE_CPU = avx2 avx sse42 sse41 ssse3 sse3 . elif ${CPUTYPE} == "ivybridge" || ${CPUTYPE} == "sandybridge" MACHINE_CPU = avx sse42 sse41 ssse3 sse3 . elif ${CPUTYPE} == "westmere" || ${CPUTYPE} == "nehalem" || \ ${CPUTYPE} == "silvermont" MACHINE_CPU = sse42 sse41 ssse3 sse3 . elif ${CPUTYPE} == "penryn" MACHINE_CPU = sse41 ssse3 sse3 . elif ${CPUTYPE} == "core2" || ${CPUTYPE} == "bonnell" MACHINE_CPU = ssse3 sse3 . elif ${CPUTYPE} == "nocona" MACHINE_CPU = sse3 . endif MACHINE_CPU += amd64 sse2 sse mmx ########## Mips . elif ${MACHINE_CPUARCH} == "mips" MACHINE_CPU = mips ########## powerpc . elif ${MACHINE_ARCH} == "powerpc" . if ${CPUTYPE} == "e500" MACHINE_CPU = booke softfp . endif ########## riscv . elif ${MACHINE_CPUARCH} == "riscv" MACHINE_CPU = riscv ########## sparc64 . elif ${MACHINE_ARCH} == "sparc64" . if ${CPUTYPE} == "v9" MACHINE_CPU = v9 . elif ${CPUTYPE} == "ultrasparc" MACHINE_CPU = v9 ultrasparc . elif ${CPUTYPE} == "ultrasparc3" MACHINE_CPU = v9 ultrasparc ultrasparc3 . endif . endif .endif .if ${MACHINE_CPUARCH} == "mips" CFLAGS += -G0 .endif ########## arm .if ${MACHINE_CPUARCH} == "arm" MACHINE_CPU += arm . if ${MACHINE_ARCH:Marmv6*} != "" MACHINE_CPU += armv6 . endif # armv6 is a hybrid. It can use the softfp ABI, but doesn't emulate # floating point in the general case, so don't define softfp for # it at this time. arm and armeb are pure softfp, so define it # for them. . if ${MACHINE_ARCH:Marmv6*} == "" MACHINE_CPU += softfp . endif # Normally armv6 is hard float ABI from FreeBSD 11 onwards. However # when CPUTYPE has 'soft' in it, we use the soft-float ABI to allow # building of soft-float ABI libraries. In this case, we have to # add the -mfloat-abi=softfp to force that. .if ${MACHINE_ARCH:Marmv6*} && defined(CPUTYPE) && ${CPUTYPE:M*soft*} != "" # Needs to be CFLAGS not _CPUCFLAGS because it's needed for the ABI # not a nice optimization. CFLAGS += -mfloat-abi=softfp .endif .endif .if ${MACHINE_CPUARCH} == "riscv" CFLAGS += -msoft-float +ACFLAGS += -msoft-float .endif # NB: COPTFLAGS is handled in /usr/src/sys/conf/kern.pre.mk .if !defined(NO_CPU_CFLAGS) CFLAGS += ${_CPUCFLAGS} .endif # # Prohibit the compiler from emitting SIMD instructions. # These flags are added to CFLAGS in areas where the extra context-switch # cost outweighs the advantages of SIMD instructions. # # gcc: # Setting -mno-mmx implies -mno-3dnow # Setting -mno-sse implies -mno-sse2, -mno-sse3, -mno-ssse3 and -mfpmath=387 # # clang: # Setting -mno-mmx implies -mno-3dnow and -mno-3dnowa # Setting -mno-sse implies -mno-sse2, -mno-sse3, -mno-ssse3, -mno-sse41 and # -mno-sse42 # (-mfpmath= is not supported) # .if ${MACHINE_CPUARCH} == "i386" || ${MACHINE_CPUARCH} == "amd64" CFLAGS_NO_SIMD.clang= -mno-avx CFLAGS_NO_SIMD= -mno-mmx -mno-sse .endif CFLAGS_NO_SIMD += ${CFLAGS_NO_SIMD.${COMPILER_TYPE}} # Add in any architecture-specific CFLAGS. # These come from make.conf or the command line or the environment. CFLAGS += ${CFLAGS.${MACHINE_ARCH}} CXXFLAGS += ${CXXFLAGS.${MACHINE_ARCH}} Index: user/alc/PQ_LAUNDRY/share/mk/bsd.sys.mk =================================================================== --- user/alc/PQ_LAUNDRY/share/mk/bsd.sys.mk (revision 303205) +++ user/alc/PQ_LAUNDRY/share/mk/bsd.sys.mk (revision 303206) @@ -1,314 +1,319 @@ # $FreeBSD$ # # This file contains common settings used for building FreeBSD # sources. # Enable various levels of compiler warning checks. These may be # overridden (e.g. if using a non-gcc compiler) by defining MK_WARNS=no. # for GCC: http://gcc.gnu.org/onlinedocs/gcc-4.2.1/gcc/Warning-Options.html .include # the default is gnu99 for now CSTD?= gnu99 .if ${CSTD} == "k&r" CFLAGS+= -traditional .elif ${CSTD} == "c89" || ${CSTD} == "c90" CFLAGS+= -std=iso9899:1990 .elif ${CSTD} == "c94" || ${CSTD} == "c95" CFLAGS+= -std=iso9899:199409 .elif ${CSTD} == "c99" CFLAGS+= -std=iso9899:1999 .else # CSTD CFLAGS+= -std=${CSTD} .endif # CSTD # -pedantic is problematic because it also imposes namespace restrictions #CFLAGS+= -pedantic .if defined(WARNS) .if ${WARNS} >= 1 CWARNFLAGS+= -Wsystem-headers .if !defined(NO_WERROR) && !defined(NO_WERROR.${COMPILER_TYPE}) CWARNFLAGS+= -Werror .endif # !NO_WERROR && !NO_WERROR.${COMPILER_TYPE} .endif # WARNS >= 1 .if ${WARNS} >= 2 CWARNFLAGS+= -Wall -Wno-format-y2k .endif # WARNS >= 2 .if ${WARNS} >= 3 CWARNFLAGS+= -W -Wno-unused-parameter -Wstrict-prototypes\ -Wmissing-prototypes -Wpointer-arith .endif # WARNS >= 3 .if ${WARNS} >= 4 CWARNFLAGS+= -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow\ -Wunused-parameter .if !defined(NO_WCAST_ALIGN) && !defined(NO_WCAST_ALIGN.${COMPILER_TYPE}) CWARNFLAGS+= -Wcast-align .endif # !NO_WCAST_ALIGN !NO_WCAST_ALIGN.${COMPILER_TYPE} .endif # WARNS >= 4 # BDECFLAGS .if ${WARNS} >= 6 CWARNFLAGS+= -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls\ -Wold-style-definition .if !defined(NO_WMISSING_VARIABLE_DECLARATIONS) CWARNFLAGS.clang+= -Wmissing-variable-declarations .endif .if !defined(NO_WTHREAD_SAFETY) CWARNFLAGS.clang+= -Wthread-safety .endif .endif # WARNS >= 6 .if ${WARNS} >= 2 && ${WARNS} <= 4 # XXX Delete -Wuninitialized by default for now -- the compiler doesn't # XXX always get it right. CWARNFLAGS+= -Wno-uninitialized .endif # WARNS >=2 && WARNS <= 4 CWARNFLAGS+= -Wno-pointer-sign # Clang has more warnings enabled by default, and when using -Wall, so if WARNS # is set to low values, these have to be disabled explicitly. .if ${WARNS} <= 6 CWARNFLAGS.clang+= -Wno-empty-body -Wno-string-plus-int .if ${COMPILER_TYPE} == "clang" && ${COMPILER_VERSION} >= 30400 CWARNFLAGS.clang+= -Wno-unused-const-variable .endif .endif # WARNS <= 6 .if ${WARNS} <= 3 CWARNFLAGS.clang+= -Wno-tautological-compare -Wno-unused-value\ -Wno-parentheses-equality -Wno-unused-function -Wno-enum-conversion .if ${COMPILER_TYPE} == "clang" && ${COMPILER_VERSION} >= 30600 CWARNFLAGS.clang+= -Wno-unused-local-typedef .endif .endif # WARNS <= 3 .if ${WARNS} <= 2 CWARNFLAGS.clang+= -Wno-switch -Wno-switch-enum -Wno-knr-promoted-parameter .endif # WARNS <= 2 .if ${WARNS} <= 1 CWARNFLAGS.clang+= -Wno-parentheses .endif # WARNS <= 1 .if defined(NO_WARRAY_BOUNDS) CWARNFLAGS.clang+= -Wno-array-bounds .endif # NO_WARRAY_BOUNDS .endif # WARNS .if defined(FORMAT_AUDIT) WFORMAT= 1 .endif # FORMAT_AUDIT .if defined(WFORMAT) .if ${WFORMAT} > 0 #CWARNFLAGS+= -Wformat-nonliteral -Wformat-security -Wno-format-extra-args CWARNFLAGS+= -Wformat=2 -Wno-format-extra-args .if ${WARNS} <= 3 CWARNFLAGS.clang+= -Wno-format-nonliteral .endif # WARNS <= 3 .if !defined(NO_WERROR) && !defined(NO_WERROR.${COMPILER_TYPE}) CWARNFLAGS+= -Werror .endif # !NO_WERROR && !NO_WERROR.${COMPILER_TYPE} .endif # WFORMAT > 0 .endif # WFORMAT .if defined(NO_WFORMAT) || defined(NO_WFORMAT.${COMPILER_TYPE}) CWARNFLAGS+= -Wno-format .endif # NO_WFORMAT || NO_WFORMAT.${COMPILER_TYPE} # GCC 5.2.0 .if ${COMPILER_TYPE} == "gcc" && ${COMPILER_VERSION} >= 50200 CWARNFLAGS+= -Wno-error=unused-function -Wno-error=enum-compare -Wno-error=logical-not-parentheses -Wno-error=bool-compare -Wno-error=uninitialized -Wno-error=array-bounds -Wno-error=clobbered -Wno-error=cast-align -Wno-error=extra -Wno-error=attributes -Wno-error=inline -Wno-error=unused-but-set-variable -Wno-error=unused-value -Wno-error=strict-aliasing -Wno-error=address .endif +# GCC 6.1.0 +.if ${COMPILER_TYPE} == "gcc" && ${COMPILER_VERSION} >= 60100 +CWARNFLAGS+= -Wno-error=unused-const-variable= -Wno-error=nonnull-compare -Wno-error=shift-negative-value -Wno-error=misleading-indentation -Wno-error=tautological-compare +.endif + # How to handle FreeBSD custom printf format specifiers. .if ${COMPILER_TYPE} == "clang" && ${COMPILER_VERSION} >= 30600 FORMAT_EXTENSIONS= -D__printf__=__freebsd_kprintf__ .else FORMAT_EXTENSIONS= -fformat-extensions .endif .if defined(IGNORE_PRAGMA) CWARNFLAGS+= -Wno-unknown-pragmas .endif # IGNORE_PRAGMA # We need this conditional because many places that use it # only enable it for some files with CLFAGS.$FILE+=${CLANG_NO_IAS}. # unconditionally, and can't easily use the CFLAGS.clang= # mechanism. .if ${COMPILER_TYPE} == "clang" CLANG_NO_IAS= -no-integrated-as .endif CLANG_OPT_SMALL= -mstack-alignment=8 -mllvm -inline-threshold=3\ -mllvm -simplifycfg-dup-ret .if ${COMPILER_VERSION} >= 30500 && ${COMPILER_VERSION} < 30700 CLANG_OPT_SMALL+= -mllvm -enable-gvn=false .else CLANG_OPT_SMALL+= -mllvm -enable-load-pre=false .endif CFLAGS.clang+= -Qunused-arguments .if ${MACHINE_CPUARCH} == "sparc64" # Don't emit .cfi directives, since we must use GNU as on sparc64, for now. CFLAGS.clang+= -fno-dwarf2-cfi-asm .endif # SPARC64 # The libc++ headers use c++11 extensions. These are normally silenced because # they are treated as system headers, but we explicitly disable that warning # suppression when building the base system to catch bugs in our headers. # Eventually we'll want to start building the base system C++ code as C++11, # but not yet. CXXFLAGS.clang+= -Wno-c++11-extensions .if ${MK_SSP} != "no" && \ ${MACHINE_CPUARCH} != "arm" && ${MACHINE_CPUARCH} != "mips" .if (${COMPILER_TYPE} == "clang" && ${COMPILER_VERSION} >= 30500) || \ (${COMPILER_TYPE} == "gcc" && \ (${COMPILER_VERSION} == 40201 || ${COMPILER_VERSION} >= 40900)) # Don't use -Wstack-protector as it breaks world with -Werror. SSP_CFLAGS?= -fstack-protector-strong .else SSP_CFLAGS?= -fstack-protector .endif CFLAGS+= ${SSP_CFLAGS} .endif # SSP && !ARM && !MIPS # Allow user-specified additional warning flags, plus compiler and file # specific flag overrides, unless we've overriden this... .if ${MK_WARNS} != "no" CFLAGS+= ${CWARNFLAGS} ${CWARNFLAGS.${COMPILER_TYPE}} CFLAGS+= ${CWARNFLAGS.${.IMPSRC:T}} .endif CFLAGS+= ${CFLAGS.${COMPILER_TYPE}} CXXFLAGS+= ${CXXFLAGS.${COMPILER_TYPE}} AFLAGS+= ${AFLAGS.${.IMPSRC:T}} ACFLAGS+= ${ACFLAGS.${.IMPSRC:T}} CFLAGS+= ${CFLAGS.${.IMPSRC:T}} CXXFLAGS+= ${CXXFLAGS.${.IMPSRC:T}} .if defined(SRCTOP) # Prevent rebuilding during install to support read-only objdirs. .if !make(all) && make(install) && empty(.MAKE.MODE:Mmeta) CFLAGS+= ERROR-tried-to-rebuild-during-make-install .endif .endif # Tell bmake not to mistake standard targets for things to be searched for # or expect to ever be up-to-date. PHONY_NOTMAIN = analyze afterdepend afterinstall all beforedepend beforeinstall \ beforelinking build build-tools buildconfig buildfiles \ buildincludes check checkdpadd clean cleandepend cleandir \ cleanobj configure depend distclean distribute exe \ files html includes install installconfig installfiles \ installincludes lint obj objlink objs objwarn \ realinstall tags whereobj # we don't want ${PROG} to be PHONY .PHONY: ${PHONY_NOTMAIN:N${PROG:U}} .NOTMAIN: ${PHONY_NOTMAIN:Nall} .if ${MK_STAGING} != "no" .if defined(_SKIP_BUILD) || (!make(all) && !make(clean*)) _SKIP_STAGING?= yes .endif .if ${_SKIP_STAGING:Uno} == "yes" staging stage_libs stage_files stage_as stage_links stage_symlinks: .else # allow targets like beforeinstall to be leveraged DESTDIR= ${STAGE_OBJTOP} .export DESTDIR .if target(beforeinstall) .if !empty(_LIBS) || (${MK_STAGING_PROG} != "no" && !defined(INTERNALPROG)) staging: beforeinstall .endif .endif # normally only libs and includes are staged .if ${MK_STAGING_PROG} != "no" && !defined(INTERNALPROG) STAGE_DIR.prog= ${STAGE_OBJTOP}${BINDIR} .if !empty(PROG) .if defined(PROGNAME) STAGE_AS_SETS+= prog STAGE_AS_${PROG}= ${PROGNAME} stage_as.prog: ${PROG} .else STAGE_SETS+= prog stage_files.prog: ${PROG} STAGE_TARGETS+= stage_files .endif .endif .endif .if !empty(_LIBS) && !defined(INTERNALLIB) .if defined(SHLIBDIR) && ${SHLIBDIR} != ${LIBDIR} && ${_LIBS:Uno:M*.so.*} != "" STAGE_SETS+= shlib STAGE_DIR.shlib= ${STAGE_OBJTOP}${SHLIBDIR} STAGE_FILES.shlib+= ${_LIBS:M*.so.*} stage_files.shlib: ${_LIBS:M*.so.*} .endif .if defined(SHLIB_LINK) && commands(${SHLIB_LINK:R}.ld) STAGE_AS_SETS+= ldscript STAGE_AS.ldscript+= ${SHLIB_LINK:R}.ld stage_as.ldscript: ${SHLIB_LINK:R}.ld STAGE_DIR.ldscript = ${STAGE_LIBDIR} STAGE_AS_${SHLIB_LINK:R}.ld:= ${SHLIB_LINK} NO_SHLIB_LINKS= .endif .if target(stage_files.shlib) stage_libs: ${_LIBS} .if defined(DEBUG_FLAGS) && target(${SHLIB_NAME}.symbols) stage_files.shlib: ${SHLIB_NAME}.symbols .endif .else stage_libs: ${_LIBS} .endif .if defined(SHLIB_NAME) && defined(DEBUG_FLAGS) && target(${SHLIB_NAME}.symbols) stage_libs: ${SHLIB_NAME}.symbols .endif .endif .if !empty(INCS) || !empty(INCSGROUPS) && target(buildincludes) .if !defined(NO_BEFOREBUILD_INCLUDES) stage_includes: buildincludes beforebuild: stage_includes .endif .endif .for t in stage_libs stage_files stage_as .if target($t) STAGE_TARGETS+= $t .endif .endfor .if !empty(STAGE_AS_SETS) STAGE_TARGETS+= stage_as .endif .if !empty(_LIBS) || (${MK_STAGING_PROG} != "no" && !defined(INTERNALPROG)) .if !empty(LINKS) STAGE_TARGETS+= stage_links .if ${MAKE_VERSION} < 20131001 stage_links.links: ${_LIBS} ${PROG} .endif STAGE_SETS+= links STAGE_LINKS.links= ${LINKS} .endif .if !empty(SYMLINKS) STAGE_TARGETS+= stage_symlinks STAGE_SETS+= links STAGE_SYMLINKS.links= ${SYMLINKS} .endif .endif .include .endif .endif .if defined(META_TARGETS) .for _tgt in ${META_TARGETS} .if target(${_tgt}) ${_tgt}: ${META_DEPS} .endif .endfor .endif Index: user/alc/PQ_LAUNDRY/sys/arm/allwinner/a10_gpio.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/arm/allwinner/a10_gpio.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/arm/allwinner/a10_gpio.c (revision 303206) @@ -1,746 +1,746 @@ /*- * Copyright (c) 2013 Ganbold Tsagaankhuu * Copyright (c) 2012 Oleksandr Tymoshenko * Copyright (c) 2012 Luiz Otavio O Souza. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #if defined(__aarch64__) #include "opt_soc.h" #endif #include "gpio_if.h" #define A10_GPIO_DEFAULT_CAPS (GPIO_PIN_INPUT | GPIO_PIN_OUTPUT | \ GPIO_PIN_PULLUP | GPIO_PIN_PULLDOWN) #define A10_GPIO_NONE 0 #define A10_GPIO_PULLUP 1 #define A10_GPIO_PULLDOWN 2 #define A10_GPIO_INPUT 0 #define A10_GPIO_OUTPUT 1 #define AW_GPIO_DRV_MASK 0x3 #define AW_GPIO_PUD_MASK 0x3 #define AW_PINCTRL 1 #define AW_R_PINCTRL 2 /* Defined in a10_padconf.c */ #ifdef SOC_ALLWINNER_A10 extern const struct allwinner_padconf a10_padconf; #endif /* Defined in a13_padconf.c */ #ifdef SOC_ALLWINNER_A13 extern const struct allwinner_padconf a13_padconf; #endif /* Defined in a20_padconf.c */ #ifdef SOC_ALLWINNER_A20 extern const struct allwinner_padconf a20_padconf; #endif /* Defined in a31_padconf.c */ #ifdef SOC_ALLWINNER_A31 extern const struct allwinner_padconf a31_padconf; #endif /* Defined in a31s_padconf.c */ #ifdef SOC_ALLWINNER_A31S extern const struct allwinner_padconf a31s_padconf; #endif #if defined(SOC_ALLWINNER_A31) || defined(SOC_ALLWINNER_A31S) extern const struct allwinner_padconf a31_r_padconf; #endif /* Defined in h3_padconf.c */ #ifdef SOC_ALLWINNER_H3 extern const struct allwinner_padconf h3_padconf; extern const struct allwinner_padconf h3_r_padconf; #endif /* Defined in a83t_padconf.c */ #ifdef SOC_ALLWINNER_A83T extern const struct allwinner_padconf a83t_padconf; extern const struct allwinner_padconf a83t_r_padconf; #endif /* Defined in a64_padconf.c */ #ifdef SOC_ALLWINNER_A64 extern const struct allwinner_padconf a64_padconf; extern const struct allwinner_padconf a64_r_padconf; #endif static struct ofw_compat_data compat_data[] = { #ifdef SOC_ALLWINNER_A10 {"allwinner,sun4i-a10-pinctrl", (uintptr_t)&a10_padconf}, #endif #ifdef SOC_ALLWINNER_A13 {"allwinner,sun5i-a13-pinctrl", (uintptr_t)&a13_padconf}, #endif #ifdef SOC_ALLWINNER_A20 {"allwinner,sun7i-a20-pinctrl", (uintptr_t)&a20_padconf}, #endif #ifdef SOC_ALLWINNER_A31 {"allwinner,sun6i-a31-pinctrl", (uintptr_t)&a31_padconf}, #endif #ifdef SOC_ALLWINNER_A31S {"allwinner,sun6i-a31s-pinctrl", (uintptr_t)&a31s_padconf}, #endif #if defined(SOC_ALLWINNER_A31) || defined(SOC_ALLWINNER_A31S) {"allwinner,sun6i-a31-r-pinctrl", (uintptr_t)&a31_r_padconf}, #endif #ifdef SOC_ALLWINNER_A83T {"allwinner,sun8i-a83t-pinctrl", (uintptr_t)&a83t_padconf}, {"allwinner,sun8i-a83t-r-pinctrl", (uintptr_t)&a83t_r_padconf}, #endif #ifdef SOC_ALLWINNER_H3 {"allwinner,sun8i-h3-pinctrl", (uintptr_t)&h3_padconf}, {"allwinner,sun8i-h3-r-pinctrl", (uintptr_t)&h3_r_padconf}, #endif #ifdef SOC_ALLWINNER_A64 {"allwinner,sun50i-a64-pinctrl", (uintptr_t)&a64_padconf}, {"allwinner,sun50i-a64-r-pinctrl", (uintptr_t)&a64_r_padconf}, #endif {NULL, 0} }; struct a10_gpio_softc { device_t sc_dev; device_t sc_busdev; struct mtx sc_mtx; struct resource * sc_mem_res; struct resource * sc_irq_res; bus_space_tag_t sc_bst; bus_space_handle_t sc_bsh; void * sc_intrhand; const struct allwinner_padconf * padconf; }; #define A10_GPIO_LOCK(_sc) mtx_lock_spin(&(_sc)->sc_mtx) #define A10_GPIO_UNLOCK(_sc) mtx_unlock_spin(&(_sc)->sc_mtx) #define A10_GPIO_LOCK_ASSERT(_sc) mtx_assert(&(_sc)->sc_mtx, MA_OWNED) #define A10_GPIO_GP_CFG(_bank, _idx) 0x00 + ((_bank) * 0x24) + ((_idx) << 2) #define A10_GPIO_GP_DAT(_bank) 0x10 + ((_bank) * 0x24) #define A10_GPIO_GP_DRV(_bank, _idx) 0x14 + ((_bank) * 0x24) + ((_idx) << 2) #define A10_GPIO_GP_PUL(_bank, _idx) 0x1c + ((_bank) * 0x24) + ((_idx) << 2) #define A10_GPIO_GP_INT_CFG0 0x200 #define A10_GPIO_GP_INT_CFG1 0x204 #define A10_GPIO_GP_INT_CFG2 0x208 #define A10_GPIO_GP_INT_CFG3 0x20c #define A10_GPIO_GP_INT_CTL 0x210 #define A10_GPIO_GP_INT_STA 0x214 #define A10_GPIO_GP_INT_DEB 0x218 #define A10_GPIO_WRITE(_sc, _off, _val) \ bus_space_write_4(_sc->sc_bst, _sc->sc_bsh, _off, _val) #define A10_GPIO_READ(_sc, _off) \ bus_space_read_4(_sc->sc_bst, _sc->sc_bsh, _off) static uint32_t a10_gpio_get_function(struct a10_gpio_softc *sc, uint32_t pin) { uint32_t bank, func, offset; /* Must be called with lock held. */ A10_GPIO_LOCK_ASSERT(sc); if (pin > sc->padconf->npins) return (0); bank = sc->padconf->pins[pin].port; pin = sc->padconf->pins[pin].pin; offset = ((pin & 0x07) << 2); func = A10_GPIO_READ(sc, A10_GPIO_GP_CFG(bank, pin >> 3)); switch ((func >> offset) & 0x7) { case A10_GPIO_INPUT: return (GPIO_PIN_INPUT); case A10_GPIO_OUTPUT: return (GPIO_PIN_OUTPUT); } return (0); } static int a10_gpio_set_function(struct a10_gpio_softc *sc, uint32_t pin, uint32_t f) { uint32_t bank, data, offset; /* Check if the function exists in the padconf data */ if (sc->padconf->pins[pin].functions[f] == NULL) return (EINVAL); /* Must be called with lock held. */ A10_GPIO_LOCK_ASSERT(sc); bank = sc->padconf->pins[pin].port; pin = sc->padconf->pins[pin].pin; offset = ((pin & 0x07) << 2); data = A10_GPIO_READ(sc, A10_GPIO_GP_CFG(bank, pin >> 3)); data &= ~(7 << offset); data |= (f << offset); A10_GPIO_WRITE(sc, A10_GPIO_GP_CFG(bank, pin >> 3), data); return (0); } static uint32_t a10_gpio_get_pud(struct a10_gpio_softc *sc, uint32_t pin) { uint32_t bank, offset, val; /* Must be called with lock held. */ A10_GPIO_LOCK_ASSERT(sc); bank = sc->padconf->pins[pin].port; pin = sc->padconf->pins[pin].pin; offset = ((pin & 0x0f) << 1); val = A10_GPIO_READ(sc, A10_GPIO_GP_PUL(bank, pin >> 4)); switch ((val >> offset) & 0x3) { case A10_GPIO_PULLDOWN: return (GPIO_PIN_PULLDOWN); case A10_GPIO_PULLUP: return (GPIO_PIN_PULLUP); } return (0); } static void a10_gpio_set_pud(struct a10_gpio_softc *sc, uint32_t pin, uint32_t state) { uint32_t bank, offset, val; /* Must be called with lock held. */ A10_GPIO_LOCK_ASSERT(sc); bank = sc->padconf->pins[pin].port; pin = sc->padconf->pins[pin].pin; offset = ((pin & 0x0f) << 1); val = A10_GPIO_READ(sc, A10_GPIO_GP_PUL(bank, pin >> 4)); val &= ~(AW_GPIO_PUD_MASK << offset); val |= (state << offset); A10_GPIO_WRITE(sc, A10_GPIO_GP_PUL(bank, pin >> 4), val); } static void a10_gpio_set_drv(struct a10_gpio_softc *sc, uint32_t pin, uint32_t drive) { uint32_t bank, offset, val; /* Must be called with lock held. */ A10_GPIO_LOCK_ASSERT(sc); bank = sc->padconf->pins[pin].port; pin = sc->padconf->pins[pin].pin; offset = ((pin & 0x0f) << 1); val = A10_GPIO_READ(sc, A10_GPIO_GP_DRV(bank, pin >> 4)); val &= ~(AW_GPIO_DRV_MASK << offset); val |= (drive << offset); A10_GPIO_WRITE(sc, A10_GPIO_GP_DRV(bank, pin >> 4), val); } static int a10_gpio_pin_configure(struct a10_gpio_softc *sc, uint32_t pin, uint32_t flags) { int err = 0; /* Must be called with lock held. */ A10_GPIO_LOCK_ASSERT(sc); /* Manage input/output. */ if (flags & (GPIO_PIN_INPUT | GPIO_PIN_OUTPUT)) { if (flags & GPIO_PIN_OUTPUT) err = a10_gpio_set_function(sc, pin, A10_GPIO_OUTPUT); else err = a10_gpio_set_function(sc, pin, A10_GPIO_INPUT); } if (err) return (err); /* Manage Pull-up/pull-down. */ if (flags & (GPIO_PIN_PULLUP | GPIO_PIN_PULLDOWN)) { if (flags & GPIO_PIN_PULLUP) a10_gpio_set_pud(sc, pin, A10_GPIO_PULLUP); else a10_gpio_set_pud(sc, pin, A10_GPIO_PULLDOWN); } else a10_gpio_set_pud(sc, pin, A10_GPIO_NONE); return (0); } static device_t a10_gpio_get_bus(device_t dev) { struct a10_gpio_softc *sc; sc = device_get_softc(dev); return (sc->sc_busdev); } static int a10_gpio_pin_max(device_t dev, int *maxpin) { struct a10_gpio_softc *sc; sc = device_get_softc(dev); *maxpin = sc->padconf->npins - 1; return (0); } static int a10_gpio_pin_getcaps(device_t dev, uint32_t pin, uint32_t *caps) { struct a10_gpio_softc *sc; sc = device_get_softc(dev); if (pin >= sc->padconf->npins) return (EINVAL); *caps = A10_GPIO_DEFAULT_CAPS; return (0); } static int a10_gpio_pin_getflags(device_t dev, uint32_t pin, uint32_t *flags) { struct a10_gpio_softc *sc; sc = device_get_softc(dev); if (pin >= sc->padconf->npins) return (EINVAL); A10_GPIO_LOCK(sc); *flags = a10_gpio_get_function(sc, pin); *flags |= a10_gpio_get_pud(sc, pin); A10_GPIO_UNLOCK(sc); return (0); } static int a10_gpio_pin_getname(device_t dev, uint32_t pin, char *name) { struct a10_gpio_softc *sc; sc = device_get_softc(dev); if (pin >= sc->padconf->npins) return (EINVAL); snprintf(name, GPIOMAXNAME - 1, "%s", sc->padconf->pins[pin].name); name[GPIOMAXNAME - 1] = '\0'; return (0); } static int a10_gpio_pin_setflags(device_t dev, uint32_t pin, uint32_t flags) { struct a10_gpio_softc *sc; int err; sc = device_get_softc(dev); if (pin > sc->padconf->npins) return (EINVAL); A10_GPIO_LOCK(sc); err = a10_gpio_pin_configure(sc, pin, flags); A10_GPIO_UNLOCK(sc); return (err); } static int a10_gpio_pin_set(device_t dev, uint32_t pin, unsigned int value) { struct a10_gpio_softc *sc; uint32_t bank, data; sc = device_get_softc(dev); if (pin > sc->padconf->npins) return (EINVAL); bank = sc->padconf->pins[pin].port; pin = sc->padconf->pins[pin].pin; A10_GPIO_LOCK(sc); data = A10_GPIO_READ(sc, A10_GPIO_GP_DAT(bank)); if (value) data |= (1 << pin); else data &= ~(1 << pin); A10_GPIO_WRITE(sc, A10_GPIO_GP_DAT(bank), data); A10_GPIO_UNLOCK(sc); return (0); } static int a10_gpio_pin_get(device_t dev, uint32_t pin, unsigned int *val) { struct a10_gpio_softc *sc; uint32_t bank, reg_data; sc = device_get_softc(dev); if (pin > sc->padconf->npins) return (EINVAL); bank = sc->padconf->pins[pin].port; pin = sc->padconf->pins[pin].pin; A10_GPIO_LOCK(sc); reg_data = A10_GPIO_READ(sc, A10_GPIO_GP_DAT(bank)); A10_GPIO_UNLOCK(sc); *val = (reg_data & (1 << pin)) ? 1 : 0; return (0); } static int a10_gpio_pin_toggle(device_t dev, uint32_t pin) { struct a10_gpio_softc *sc; uint32_t bank, data; sc = device_get_softc(dev); if (pin > sc->padconf->npins) return (EINVAL); bank = sc->padconf->pins[pin].port; pin = sc->padconf->pins[pin].pin; A10_GPIO_LOCK(sc); data = A10_GPIO_READ(sc, A10_GPIO_GP_DAT(bank)); if (data & (1 << pin)) data &= ~(1 << pin); else data |= (1 << pin); A10_GPIO_WRITE(sc, A10_GPIO_GP_DAT(bank), data); A10_GPIO_UNLOCK(sc); return (0); } static int aw_find_pinnum_by_name(struct a10_gpio_softc *sc, const char *pinname) { int i; for (i = 0; i < sc->padconf->npins; i++) if (!strcmp(pinname, sc->padconf->pins[i].name)) return i; return (-1); } static int aw_find_pin_func(struct a10_gpio_softc *sc, int pin, const char *func) { int i; for (i = 0; i < AW_MAX_FUNC_BY_PIN; i++) if (sc->padconf->pins[pin].functions[i] && !strcmp(func, sc->padconf->pins[pin].functions[i])) return (i); return (-1); } static int aw_fdt_configure_pins(device_t dev, phandle_t cfgxref) { struct a10_gpio_softc *sc; phandle_t node; const char **pinlist = NULL; char *pin_function = NULL; uint32_t pin_drive, pin_pull; int pins_nb, pin_num, pin_func, i, ret; sc = device_get_softc(dev); node = OF_node_from_xref(cfgxref); ret = 0; /* Getting all prop for configuring pins */ pins_nb = ofw_bus_string_list_to_array(node, "allwinner,pins", &pinlist); if (pins_nb <= 0) return (ENOENT); if (OF_getprop_alloc(node, "allwinner,function", sizeof(*pin_function), (void **)&pin_function) == -1) { ret = ENOENT; goto out; } if (OF_getencprop(node, "allwinner,drive", &pin_drive, sizeof(pin_drive)) == -1) { ret = ENOENT; goto out; } if (OF_getencprop(node, "allwinner,pull", &pin_pull, sizeof(pin_pull)) == -1) { ret = ENOENT; goto out; } /* Configure each pin to the correct function, drive and pull */ for (i = 0; i < pins_nb; i++) { pin_num = aw_find_pinnum_by_name(sc, pinlist[i]); if (pin_num == -1) { ret = ENOENT; goto out; } pin_func = aw_find_pin_func(sc, pin_num, pin_function); if (pin_func == -1) { ret = ENOENT; goto out; } A10_GPIO_LOCK(sc); a10_gpio_set_function(sc, pin_num, pin_func); a10_gpio_set_drv(sc, pin_num, pin_drive); a10_gpio_set_pud(sc, pin_num, pin_pull); A10_GPIO_UNLOCK(sc); } out: OF_prop_free(pinlist); OF_prop_free(pin_function); return (ret); } static int a10_gpio_probe(device_t dev) { if (!ofw_bus_status_okay(dev)) return (ENXIO); if (ofw_bus_search_compatible(dev, compat_data)->ocd_data == 0) return (ENXIO); device_set_desc(dev, "Allwinner GPIO/Pinmux controller"); return (BUS_PROBE_DEFAULT); } static int a10_gpio_attach(device_t dev) { int rid, error; phandle_t gpio; struct a10_gpio_softc *sc; clk_t clk; hwreset_t rst; sc = device_get_softc(dev); sc->sc_dev = dev; mtx_init(&sc->sc_mtx, "a10 gpio", "gpio", MTX_SPIN); rid = 0; sc->sc_mem_res = bus_alloc_resource_any(dev, SYS_RES_MEMORY, &rid, RF_ACTIVE); if (!sc->sc_mem_res) { device_printf(dev, "cannot allocate memory window\n"); goto fail; } sc->sc_bst = rman_get_bustag(sc->sc_mem_res); sc->sc_bsh = rman_get_bushandle(sc->sc_mem_res); rid = 0; sc->sc_irq_res = bus_alloc_resource_any(dev, SYS_RES_IRQ, &rid, RF_ACTIVE); if (!sc->sc_irq_res) { device_printf(dev, "cannot allocate interrupt\n"); goto fail; } /* Find our node. */ gpio = ofw_bus_get_node(sc->sc_dev); if (!OF_hasprop(gpio, "gpio-controller")) /* Node is not a GPIO controller. */ goto fail; /* Use the right pin data for the current SoC */ sc->padconf = (struct allwinner_padconf *)ofw_bus_search_compatible(dev, compat_data)->ocd_data; if (hwreset_get_by_ofw_idx(dev, 0, 0, &rst) == 0) { error = hwreset_deassert(rst); if (error != 0) { device_printf(dev, "cannot de-assert reset\n"); return (error); } } if (clk_get_by_ofw_index(dev, 0, 0, &clk) == 0) { error = clk_enable(clk); if (error != 0) { device_printf(dev, "could not enable clock\n"); return (error); } } sc->sc_busdev = gpiobus_attach_bus(dev); if (sc->sc_busdev == NULL) goto fail; /* * Register as a pinctrl device */ fdt_pinctrl_register(dev, "allwinner,pins"); fdt_pinctrl_configure_tree(dev); return (0); fail: if (sc->sc_irq_res) bus_release_resource(dev, SYS_RES_IRQ, 0, sc->sc_irq_res); if (sc->sc_mem_res) bus_release_resource(dev, SYS_RES_MEMORY, 0, sc->sc_mem_res); mtx_destroy(&sc->sc_mtx); return (ENXIO); } static int a10_gpio_detach(device_t dev) { return (EBUSY); } static phandle_t a10_gpio_get_node(device_t dev, device_t bus) { /* We only have one child, the GPIO bus, which needs our own node. */ return (ofw_bus_get_node(dev)); } static int a10_gpio_map_gpios(device_t bus, phandle_t dev, phandle_t gparent, int gcells, pcell_t *gpios, uint32_t *pin, uint32_t *flags) { struct a10_gpio_softc *sc; int i; sc = device_get_softc(bus); /* The GPIO pins are mapped as: . */ for (i = 0; i < sc->padconf->npins; i++) if (sc->padconf->pins[i].port == gpios[0] && sc->padconf->pins[i].pin == gpios[1]) { *pin = i; break; } *flags = gpios[gcells - 1]; return (0); } static device_method_t a10_gpio_methods[] = { /* Device interface */ DEVMETHOD(device_probe, a10_gpio_probe), DEVMETHOD(device_attach, a10_gpio_attach), DEVMETHOD(device_detach, a10_gpio_detach), /* GPIO protocol */ DEVMETHOD(gpio_get_bus, a10_gpio_get_bus), DEVMETHOD(gpio_pin_max, a10_gpio_pin_max), DEVMETHOD(gpio_pin_getname, a10_gpio_pin_getname), DEVMETHOD(gpio_pin_getflags, a10_gpio_pin_getflags), DEVMETHOD(gpio_pin_getcaps, a10_gpio_pin_getcaps), DEVMETHOD(gpio_pin_setflags, a10_gpio_pin_setflags), DEVMETHOD(gpio_pin_get, a10_gpio_pin_get), DEVMETHOD(gpio_pin_set, a10_gpio_pin_set), DEVMETHOD(gpio_pin_toggle, a10_gpio_pin_toggle), DEVMETHOD(gpio_map_gpios, a10_gpio_map_gpios), /* ofw_bus interface */ DEVMETHOD(ofw_bus_get_node, a10_gpio_get_node), /* fdt_pinctrl interface */ DEVMETHOD(fdt_pinctrl_configure,aw_fdt_configure_pins), DEVMETHOD_END }; static devclass_t a10_gpio_devclass; static driver_t a10_gpio_driver = { "gpio", a10_gpio_methods, sizeof(struct a10_gpio_softc), }; EARLY_DRIVER_MODULE(a10_gpio, simplebus, a10_gpio_driver, a10_gpio_devclass, 0, 0, - BUS_PASS_INTERRUPT + BUS_PASS_ORDER_MIDDLE); + BUS_PASS_INTERRUPT + BUS_PASS_ORDER_LATE); Index: user/alc/PQ_LAUNDRY/sys/arm/allwinner/a20/a20_padconf.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/arm/allwinner/a20/a20_padconf.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/arm/allwinner/a20/a20_padconf.c (revision 303206) @@ -1,231 +1,231 @@ /*- * Copyright (c) 2016 Emmanuel Vadot * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #ifdef SOC_ALLWINNER_A20 const static struct allwinner_pins a20_pins[] = { {"PA0", 0, 0, {"gpio_in", "gpio_out", "emac", "spi1", "uart2", "gmac", NULL, NULL}}, {"PA1", 0, 1, {"gpio_in", "gpio_out", "emac", "spi1", "uart2", "gmac", NULL, NULL}}, {"PA2", 0, 2, {"gpio_in", "gpio_out", "emac", "spi1", "uart2", "gmac", NULL, NULL}}, {"PA3", 0, 3, {"gpio_in", "gpio_out", "emac", "spi1", "uart2", "gmac", NULL, NULL}}, {"PA4", 0, 4, {"gpio_in", "gpio_out", "emac", "spi1", NULL, "gmac", NULL, NULL}}, {"PA5", 0, 5, {"gpio_in", "gpio_out", "emac", "spi3", NULL, "gmac", NULL, NULL}}, {"PA6", 0, 6, {"gpio_in", "gpio_out", "emac", "spi3", NULL, "gmac", NULL, NULL}}, {"PA7", 0, 7, {"gpio_in", "gpio_out", "emac", "spi3", NULL, "gmac", NULL, NULL}}, {"PA8", 0, 8, {"gpio_in", "gpio_out", "emac", "spi3", NULL, "gmac", NULL, NULL}}, {"PA9", 0, 9, {"gpio_in", "gpio_out", "emac", "spi3", NULL, "gmac", "i2c1", NULL}}, {"PA10", 0, 10, {"gpio_in", "gpio_out", "emac", NULL, "uart1", "gmac", NULL, NULL}}, {"PA11", 0, 11, {"gpio_in", "gpio_out", "emac", NULL, "uart1", "gmac", NULL, NULL}}, {"PA12", 0, 12, {"gpio_in", "gpio_out", "emac", "uart6", "uart1", "gmac", NULL, NULL}}, {"PA13", 0, 13, {"gpio_in", "gpio_out", "emac", "uart6", "uart1", "gmac", NULL, NULL}}, {"PA14", 0, 14, {"gpio_in", "gpio_out", "emac", "uart7", "uart1", "gmac", "i2c1", NULL}}, {"PA15", 0, 15, {"gpio_in", "gpio_out", "emac", "uart7", "uart1", "gmac", "i2c1", NULL}}, {"PA16", 0, 16, {"gpio_in", "gpio_out", NULL, "can", "uart1", "gmac", "i2c1", NULL}}, {"PA17", 0, 17, {"gpio_in", "gpio_out", NULL, "can", "uart1", "gmac", "i2c1", NULL}}, {"PB0", 1, 0, {"gpio_in", "gpio_out", "i2c0", NULL, NULL, NULL, NULL, NULL}}, {"PB1", 1, 1, {"gpio_in", "gpio_out", "i2c0", NULL, NULL, NULL, NULL, NULL}}, {"PB2", 1, 2, {"gpio_in", "gpio_out", "pwm", NULL, NULL, NULL, NULL, NULL}}, {"PB3", 1, 3, {"gpio_in", "gpio_out", "ir0", NULL, "spdif", NULL, NULL, NULL}}, {"PB4", 1, 4, {"gpio_in", "gpio_out", "ir0", NULL, NULL, NULL, NULL, NULL}}, {"PB5", 1, 5, {"gpio_in", "gpio_out", "i2s0", "ac97", NULL, NULL, NULL, NULL}}, {"PB6", 1, 6, {"gpio_in", "gpio_out", "i2c0", "ac97", NULL, NULL, NULL, NULL}}, {"PB7", 1, 7, {"gpio_in", "gpio_out", "i2c0", "ac97", NULL, NULL, NULL, NULL}}, {"PB8", 1, 8, {"gpio_in", "gpio_out", "i2c0", "ac97", NULL, NULL, NULL, NULL}}, {"PB9", 1, 9, {"gpio_in", "gpio_out", "i2c0", NULL, NULL, NULL, NULL, NULL}}, {"PB10", 1, 10, {"gpio_in", "gpio_out", "i2c0", NULL, NULL, NULL, NULL, NULL}}, {"PB11", 1, 11, {"gpio_in", "gpio_out", "i2c0", NULL, NULL, NULL, NULL, NULL}}, {"PB12", 1, 12, {"gpio_in", "gpio_out", "i2c0", "ac97", "spdif", NULL, NULL, NULL}}, {"PB13", 1, 13, {"gpio_in", "gpio_out", "spi2", NULL, "spdif", NULL, NULL, NULL}}, {"PB14", 1, 14, {"gpio_in", "gpio_out", "spi2", "jtag", NULL, NULL, NULL, NULL}}, {"PB15", 1, 15, {"gpio_in", "gpio_out", "spi2", "jtag", NULL, NULL, NULL, NULL}}, {"PB16", 1, 16, {"gpio_in", "gpio_out", "spi2", "jtag", NULL, NULL, NULL, NULL}}, {"PB17", 1, 17, {"gpio_in", "gpio_out", "spi2", "jtag", NULL, NULL, NULL, NULL}}, {"PB18", 1, 18, {"gpio_in", "gpio_out", "i2c1", NULL, NULL, NULL, NULL, NULL}}, {"PB19", 1, 19, {"gpio_in", "gpio_out", "i2c1", NULL, NULL, NULL, NULL, NULL}}, {"PB20", 1, 20, {"gpio_in", "gpio_out", "i2c2", NULL, NULL, NULL, NULL, NULL}}, {"PB21", 1, 21, {"gpio_in", "gpio_out", "i2c2", NULL, NULL, NULL, NULL, NULL}}, {"PB22", 1, 22, {"gpio_in", "gpio_out", "uart0", "ir1", NULL, NULL, NULL, NULL}}, {"PB23", 1, 23, {"gpio_in", "gpio_out", "uart0", "ir1", NULL, NULL, NULL, NULL}}, {"PC0", 2, 0, {"gpio_in", "gpio_out", "nand0", "spi0", NULL, NULL, NULL, NULL}}, {"PC1", 2, 1, {"gpio_in", "gpio_out", "nand0", "spi0", NULL, NULL, NULL, NULL}}, {"PC2", 2, 2, {"gpio_in", "gpio_out", "nand0", "spi0", NULL, NULL, NULL, NULL}}, {"PC3", 2, 3, {"gpio_in", "gpio_out", "nand0", NULL, NULL, NULL, NULL, NULL}}, {"PC4", 2, 4, {"gpio_in", "gpio_out", "nand0", NULL, NULL, NULL, NULL, NULL}}, - {"PC5", 2, 5, {"gpio_in", "gpio_out", "nand0", "mmc2", NULL, NULL, NULL, NULL}}, + {"PC5", 2, 5, {"gpio_in", "gpio_out", "nand0", NULL, NULL, NULL, NULL, NULL}}, {"PC6", 2, 6, {"gpio_in", "gpio_out", "nand0", "mmc2", NULL, NULL, NULL, NULL}}, {"PC7", 2, 7, {"gpio_in", "gpio_out", "nand0", "mmc2", NULL, NULL, NULL, NULL}}, {"PC8", 2, 8, {"gpio_in", "gpio_out", "nand0", "mmc2", NULL, NULL, NULL, NULL}}, {"PC9", 2, 9, {"gpio_in", "gpio_out", "nand0", "mmc2", NULL, NULL, NULL, NULL}}, {"PC10", 2, 10, {"gpio_in", "gpio_out", "nand0", "mmc2", NULL, NULL, NULL, NULL}}, {"PC11", 2, 11, {"gpio_in", "gpio_out", "nand0", "mmc2", NULL, NULL, NULL, NULL}}, {"PC12", 2, 12, {"gpio_in", "gpio_out", "nand0", NULL, NULL, NULL, NULL, NULL}}, {"PC13", 2, 13, {"gpio_in", "gpio_out", "nand0", NULL, NULL, NULL, NULL, NULL}}, {"PC14", 2, 14, {"gpio_in", "gpio_out", "nand0", NULL, NULL, NULL, NULL, NULL}}, {"PC15", 2, 15, {"gpio_in", "gpio_out", "nand0", NULL, NULL, NULL, NULL, NULL}}, {"PC16", 2, 16, {"gpio_in", "gpio_out", "nand0", NULL, NULL, NULL, NULL, NULL}}, {"PC17", 2, 17, {"gpio_in", "gpio_out", "nand0", NULL, NULL, NULL, NULL, NULL}}, {"PC18", 2, 18, {"gpio_in", "gpio_out", "nand0", NULL, NULL, NULL, NULL, NULL}}, {"PC19", 2, 19, {"gpio_in", "gpio_out", "nand0", "spi2", NULL, NULL, "eint", NULL}}, {"PC20", 2, 20, {"gpio_in", "gpio_out", "nand0", "spi2", NULL, NULL, "eint", NULL}}, {"PC21", 2, 21, {"gpio_in", "gpio_out", "nand0", "spi2", NULL, NULL, "eint", NULL}}, {"PC22", 2, 22, {"gpio_in", "gpio_out", "nand0", "spi2", NULL, NULL, "eint", NULL}}, {"PC23", 2, 23, {"gpio_in", "gpio_out", NULL, "spi0", NULL, NULL, NULL, NULL}}, {"PC24", 2, 24, {"gpio_in", "gpio_out", "nand0", NULL, NULL, NULL, NULL, NULL}}, {"PD0", 3, 0, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD1", 3, 1, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD2", 3, 2, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD3", 3, 3, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD4", 3, 4, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD5", 3, 5, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD6", 3, 6, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD7", 3, 7, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD8", 3, 8, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD9", 3, 9, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD10", 3, 10, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD11", 3, 11, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD12", 3, 12, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD13", 3, 13, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD14", 3, 14, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD15", 3, 15, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD16", 3, 16, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD17", 3, 17, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD18", 3, 18, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD19", 3, 19, {"gpio_in", "gpio_out", "lcd0", "lvds0", NULL, NULL, NULL, NULL}}, {"PD20", 3, 20, {"gpio_in", "gpio_out", "lcd0", "csi1", NULL, NULL, NULL, NULL}}, {"PD21", 3, 21, {"gpio_in", "gpio_out", "lcd0", "sim", NULL, NULL, NULL, NULL}}, {"PD22", 3, 22, {"gpio_in", "gpio_out", "lcd0", "sim", NULL, NULL, NULL, NULL}}, {"PD23", 3, 23, {"gpio_in", "gpio_out", "lcd0", "sim", NULL, NULL, NULL, NULL}}, {"PD24", 3, 24, {"gpio_in", "gpio_out", "lcd0", "sim", NULL, NULL, NULL, NULL}}, {"PD25", 3, 25, {"gpio_in", "gpio_out", "lcd0", "sim", NULL, NULL, NULL, NULL}}, {"PD26", 3, 26, {"gpio_in", "gpio_out", "lcd0", "sim", NULL, NULL, NULL, NULL}}, {"PD27", 3, 27, {"gpio_in", "gpio_out", "lcd0", "sim", NULL, NULL, NULL, NULL}}, {"PE0", 4, 0, {"gpio_in", "gpio_out", "ts0", "csi0", NULL, NULL, NULL, NULL}}, {"PE1", 4, 1, {"gpio_in", "gpio_out", "ts0", "csi0", NULL, NULL, NULL, NULL}}, {"PE2", 4, 2, {"gpio_in", "gpio_out", "ts0", "csi0", NULL, NULL, NULL, NULL}}, {"PE3", 4, 3, {"gpio_in", "gpio_out", "ts0", "csi0", NULL, NULL, NULL, NULL}}, {"PE4", 4, 4, {"gpio_in", "gpio_out", "ts0", "csi0", NULL, NULL, NULL, NULL}}, {"PE5", 4, 5, {"gpio_in", "gpio_out", "ts0", "csi0", NULL, NULL, NULL, NULL}}, {"PE6", 4, 6, {"gpio_in", "gpio_out", "ts0", "csi0", NULL, NULL, NULL, NULL}}, {"PE7", 4, 7, {"gpio_in", "gpio_out", "ts0", "csi0", NULL, NULL, NULL, NULL}}, {"PE8", 4, 8, {"gpio_in", "gpio_out", "ts0", "csi0", NULL, NULL, NULL, NULL}}, {"PE9", 4, 9, {"gpio_in", "gpio_out", "ts0", "csi0", NULL, NULL, NULL, NULL}}, {"PE10", 4, 10, {"gpio_in", "gpio_out", "ts0", "csi0", NULL, NULL, NULL, NULL}}, {"PE11", 4, 11, {"gpio_in", "gpio_out", "ts0", "csi0", NULL, NULL, NULL, NULL}}, {"PF0", 5, 0, {"gpio_in", "gpio_out", "mmc0", NULL, "jtag", NULL, NULL, NULL}}, {"PF1", 5, 1, {"gpio_in", "gpio_out", "mmc0", NULL, "jtag", NULL, NULL, NULL}}, {"PF2", 5, 2, {"gpio_in", "gpio_out", "mmc0", NULL, "uart0", NULL, NULL, NULL}}, {"PF3", 5, 3, {"gpio_in", "gpio_out", "mmc0", NULL, "jtag", NULL, NULL, NULL}}, {"PF4", 5, 4, {"gpio_in", "gpio_out", "mmc0", NULL, "uart0", NULL, NULL, NULL}}, {"PF5", 5, 5, {"gpio_in", "gpio_out", "mmc0", NULL, "jtag", NULL, NULL, NULL}}, {"PG0", 6, 0, {"gpio_in", "gpio_out", "ts1", "csi1", "mmc1", NULL, NULL, NULL}}, {"PG1", 6, 1, {"gpio_in", "gpio_out", "ts1", "csi1", "mmc1", NULL, NULL, NULL}}, {"PG2", 6, 2, {"gpio_in", "gpio_out", "ts1", "csi1", "mmc1", NULL, NULL, NULL}}, {"PG3", 6, 3, {"gpio_in", "gpio_out", "ts1", "csi1", "mmc1", NULL, NULL, NULL}}, {"PG4", 6, 4, {"gpio_in", "gpio_out", "ts1", "csi1", "mmc1", "csi0", NULL, NULL}}, {"PG5", 6, 5, {"gpio_in", "gpio_out", "ts1", "csi1", "mmc1", "csi0", NULL, NULL}}, {"PG6", 6, 6, {"gpio_in", "gpio_out", "ts1", "csi1", "uart3", "csi0", NULL, NULL}}, {"PG7", 6, 7, {"gpio_in", "gpio_out", "ts1", "csi1", "uart3", "csi0", NULL, NULL}}, {"PG8", 6, 8, {"gpio_in", "gpio_out", "ts1", "csi1", "uart3", "csi0", NULL, NULL}}, {"PG9", 6, 9, {"gpio_in", "gpio_out", "ts1", "csi1", "uart3", "csi0", NULL, NULL}}, {"PG10", 6, 10, {"gpio_in", "gpio_out", "ts1", "csi1", "uart4", "csi0", NULL, NULL}}, {"PG11", 6, 11, {"gpio_in", "gpio_out", "ts1", "csi1", "uart4", "csi0", NULL, NULL}}, {"PH0", 7, 0, {"gpio_in", "gpio_out", "lcd1", NULL, "uart3", NULL, "eint", "csi1"}}, {"PH1", 7, 1, {"gpio_in", "gpio_out", "lcd1", NULL, "uart3", NULL, "eint", "csi1"}}, {"PH2", 7, 2, {"gpio_in", "gpio_out", "lcd1", NULL, "uart3", NULL, "eint", "csi1"}}, {"PH3", 7, 3, {"gpio_in", "gpio_out", "lcd1", NULL, "uart3", NULL, "eint", "csi1"}}, {"PH4", 7, 4, {"gpio_in", "gpio_out", "lcd1", NULL, "uart4", NULL, "eint", "csi1"}}, {"PH5", 7, 5, {"gpio_in", "gpio_out", "lcd1", NULL, "uart4", NULL, "eint", "csi1"}}, {"PH6", 7, 6, {"gpio_in", "gpio_out", "lcd1", NULL, "uart5", "ms", "eint", "csi1"}}, {"PH7", 7, 7, {"gpio_in", "gpio_out", "lcd1", NULL, "uart5", "ms", "eint", "csi1"}}, {"PH8", 7, 8, {"gpio_in", "gpio_out", "lcd1", NULL, "keypad", "ms", "eint", "csi1"}}, {"PH9", 7, 9, {"gpio_in", "gpio_out", "lcd1", NULL, "keypad", "ms", "eint", "csi1"}}, {"PH10", 7, 10, {"gpio_in", "gpio_out", "lcd1", NULL, "keypad", "ms", "eint", "csi1"}}, {"PH11", 7, 11, {"gpio_in", "gpio_out", "lcd1", NULL, "keypad", "ms", "eint", "csi1"}}, {"PH12", 7, 12, {"gpio_in", "gpio_out", "lcd1", NULL, "ps2", NULL, "eint", "csi1"}}, {"PH13", 7, 13, {"gpio_in", "gpio_out", "lcd1", NULL, "ps2", "sim", "eint", "csi1"}}, {"PH14", 7, 14, {"gpio_in", "gpio_out", "lcd1", NULL, "keypad", "sim", "eint", "csi1"}}, {"PH15", 7, 15, {"gpio_in", "gpio_out", "lcd1", NULL, "keypad", "sim", "eint", "csi1"}}, {"PH16", 7, 16, {"gpio_in", "gpio_out", "lcd1", NULL, "keypad", "sim", "eint", "csi1"}}, {"PH17", 7, 17, {"gpio_in", "gpio_out", "lcd1", NULL, "keypad", "sim", "eint", "csi1"}}, {"PH18", 7, 18, {"gpio_in", "gpio_out", "lcd1", NULL, "keypad", "sim", "eint", "csi1"}}, {"PH19", 7, 19, {"gpio_in", "gpio_out", "lcd1", NULL, "keypad", "sim", "eint", "csi1"}}, {"PH20", 7, 20, {"gpio_in", "gpio_out", "lcd1", NULL, "can", NULL, "eint", "csi1"}}, {"PH21", 7, 21, {"gpio_in", "gpio_out", "lcd1", NULL, "can", NULL, "eint", "csi1"}}, {"PH22", 7, 22, {"gpio_in", "gpio_out", "lcd1", NULL, "keypad", "mmc1", NULL, "csi1"}}, {"PH23", 7, 23, {"gpio_in", "gpio_out", "lcd1", NULL, "keypad", "mmc1", NULL, "csi1"}}, {"PH24", 7, 24, {"gpio_in", "gpio_out", "lcd1", NULL, "keypad", "mmc1", NULL, "csi1"}}, {"PH25", 7, 25, {"gpio_in", "gpio_out", "lcd1", NULL, "keypad", "mmc1", NULL, "csi1"}}, {"PH26", 7, 26, {"gpio_in", "gpio_out", "lcd1", NULL, "keypad", "mmc1", NULL, "csi1"}}, {"PH27", 7, 27, {"gpio_in", "gpio_out", "lcd1", NULL, "keypad", "mmc1", NULL, "csi1"}}, {"PI0", 8, 0, {"gpio_in", "gpio_out", NULL, "i2c3", NULL, NULL, NULL, NULL}}, {"PI1", 8, 1, {"gpio_in", "gpio_out", NULL, "i2c3", NULL, NULL, NULL, NULL}}, {"PI2", 8, 2, {"gpio_in", "gpio_out", NULL, "i2c4", NULL, NULL, NULL, NULL}}, {"PI3", 8, 3, {"gpio_in", "gpio_out", "pwm", "i2c4", NULL, NULL, NULL, NULL}}, {"PI4", 8, 4, {"gpio_in", "gpio_out", "mmc3", NULL, NULL, NULL, NULL, NULL}}, {"PI5", 8, 5, {"gpio_in", "gpio_out", "mmc3", NULL, NULL, NULL, NULL, NULL}}, {"PI6", 8, 6, {"gpio_in", "gpio_out", "mmc3", NULL, NULL, NULL, NULL, NULL}}, {"PI7", 8, 7, {"gpio_in", "gpio_out", "mmc3", NULL, NULL, NULL, NULL, NULL}}, {"PI8", 8, 8, {"gpio_in", "gpio_out", "mmc3", NULL, NULL, NULL, NULL, NULL}}, {"PI9", 8, 9, {"gpio_in", "gpio_out", "mmc3", NULL, NULL, NULL, NULL, NULL}}, {"PI10", 8, 10, {"gpio_in", "gpio_out", "spi0", "uart5", NULL, NULL, "eint", NULL}}, {"PI11", 8, 11, {"gpio_in", "gpio_out", "spi0", "uart5", NULL, NULL, "eint", NULL}}, {"PI12", 8, 12, {"gpio_in", "gpio_out", "spi0", "uart6", "clk_out_a", NULL, "eint", NULL}}, {"PI13", 8, 13, {"gpio_in", "gpio_out", "spi0", "uart6", "clk_out_b", NULL, "eint", NULL}}, {"PI14", 8, 14, {"gpio_in", "gpio_out", "spi0", "ps2", "timer4", NULL, "eint", NULL}}, {"PI15", 8, 15, {"gpio_in", "gpio_out", "spi1", "ps2", "timer5", NULL, "eint", NULL}}, {"PI16", 8, 16, {"gpio_in", "gpio_out", "spi1", "uart2", NULL, NULL, "eint", NULL}}, {"PI17", 8, 17, {"gpio_in", "gpio_out", "spi1", "uart2", NULL, NULL, "eint", NULL}}, {"PI18", 8, 18, {"gpio_in", "gpio_out", "spi1", "uart2", NULL, NULL, "eint", NULL}}, {"PI19", 8, 19, {"gpio_in", "gpio_out", "spi1", "uart2", NULL, NULL, "eint", NULL}}, {"PI20", 8, 20, {"gpio_in", "gpio_out", "ps2", "uart7", "hdmi", NULL, NULL, NULL}}, {"PI21", 8, 21, {"gpio_in", "gpio_out", "ps2", "uart7", "hdmi", NULL, NULL, NULL}}, }; const struct allwinner_padconf a20_padconf = { .npins = sizeof(a20_pins) / sizeof(struct allwinner_pins), .pins = a20_pins, }; #endif /* SOC_ALLWINNER_A20 */ Index: user/alc/PQ_LAUNDRY/sys/arm/allwinner/aw_nmi.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/arm/allwinner/aw_nmi.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/arm/allwinner/aw_nmi.c (revision 303206) @@ -1,404 +1,404 @@ /*- * Copyright (c) 2016 Emmanuel Vadot * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_platform.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "pic_if.h" #define NMI_IRQ_CTRL_REG 0x0 #define NMI_IRQ_LOW_LEVEL 0x0 #define NMI_IRQ_LOW_EDGE 0x1 #define NMI_IRQ_HIGH_LEVEL 0x2 #define NMI_IRQ_HIGH_EDGE 0x3 #define NMI_IRQ_PENDING_REG 0x4 #define NMI_IRQ_ACK (1U << 0) #define A20_NMI_IRQ_ENABLE_REG 0x8 #define A31_NMI_IRQ_ENABLE_REG 0x34 #define NMI_IRQ_ENABLE (1U << 0) #define SC_NMI_READ(_sc, _reg) bus_read_4(_sc->res[0], _reg) #define SC_NMI_WRITE(_sc, _reg, _val) bus_write_4(_sc->res[0], _reg, _val) static struct resource_spec aw_nmi_res_spec[] = { { SYS_RES_MEMORY, 0, RF_ACTIVE }, { SYS_RES_IRQ, 0, RF_ACTIVE }, { -1, 0, 0 } }; struct aw_nmi_intr { struct intr_irqsrc isrc; u_int irq; enum intr_polarity pol; enum intr_trigger tri; }; struct aw_nmi_softc { device_t dev; struct resource * res[2]; void * intrcookie; struct aw_nmi_intr intr; uint8_t enable_reg; }; #define A20_NMI 1 #define A31_NMI 2 static struct ofw_compat_data compat_data[] = { {"allwinner,sun7i-a20-sc-nmi", A20_NMI}, {"allwinner,sun6i-a31-sc-nmi", A31_NMI}, {NULL, 0}, }; static int aw_nmi_intr(void *arg) { struct aw_nmi_softc *sc; sc = arg; if (SC_NMI_READ(sc, NMI_IRQ_PENDING_REG) == 0) { device_printf(sc->dev, "Spurious interrupt\n"); return (FILTER_HANDLED); } if (intr_isrc_dispatch(&sc->intr.isrc, curthread->td_intr_frame) != 0) { SC_NMI_WRITE(sc, sc->enable_reg, !NMI_IRQ_ENABLE); device_printf(sc->dev, "Stray interrupt, NMI disabled\n"); } return (FILTER_HANDLED); } static void aw_nmi_enable_intr(device_t dev, struct intr_irqsrc *isrc) { struct aw_nmi_softc *sc; sc = device_get_softc(dev); SC_NMI_WRITE(sc, sc->enable_reg, NMI_IRQ_ENABLE); } static void aw_nmi_disable_intr(device_t dev, struct intr_irqsrc *isrc) { struct aw_nmi_softc *sc; sc = device_get_softc(dev); SC_NMI_WRITE(sc, sc->enable_reg, !NMI_IRQ_ENABLE); } static int aw_nmi_map_fdt(device_t dev, u_int ncells, pcell_t *cells, u_int *irqp, enum intr_polarity *polp, enum intr_trigger *trigp) { u_int irq, tripol; enum intr_polarity pol; enum intr_trigger trig; if (ncells != 2) { device_printf(dev, "Invalid #interrupt-cells\n"); return (EINVAL); } irq = cells[0]; if (irq != 0) { device_printf(dev, "Controller only support irq 0\n"); return (EINVAL); } tripol = cells[1]; switch (tripol) { case IRQ_TYPE_EDGE_RISING: trig = INTR_TRIGGER_EDGE; pol = INTR_POLARITY_HIGH; break; case IRQ_TYPE_EDGE_FALLING: trig = INTR_TRIGGER_EDGE; pol = INTR_POLARITY_LOW; break; case IRQ_TYPE_LEVEL_HIGH: trig = INTR_TRIGGER_LEVEL; pol = INTR_POLARITY_HIGH; break; case IRQ_TYPE_LEVEL_LOW: trig = INTR_TRIGGER_LEVEL; pol = INTR_POLARITY_LOW; break; default: device_printf(dev, "unsupported trigger/polarity 0x%2x\n", tripol); return (ENOTSUP); } *irqp = irq; if (polp != NULL) *polp = pol; if (trigp != NULL) *trigp = trig; return (0); } static int aw_nmi_map_intr(device_t dev, struct intr_map_data *data, struct intr_irqsrc **isrcp) { struct intr_map_data_fdt *daf; struct aw_nmi_softc *sc; int error; u_int irq; if (data->type != INTR_MAP_DATA_FDT) return (ENOTSUP); sc = device_get_softc(dev); daf = (struct intr_map_data_fdt *)data; error = aw_nmi_map_fdt(dev, daf->ncells, daf->cells, &irq, NULL, NULL); if (error == 0) *isrcp = &sc->intr.isrc; return (error); } static int aw_nmi_setup_intr(device_t dev, struct intr_irqsrc *isrc, struct resource *res, struct intr_map_data *data) { struct intr_map_data_fdt *daf; struct aw_nmi_softc *sc; struct aw_nmi_intr *nmi_intr; int error, icfg; u_int irq; enum intr_trigger trig; enum intr_polarity pol; /* Get config for interrupt. */ if (data == NULL || data->type != INTR_MAP_DATA_FDT) return (ENOTSUP); sc = device_get_softc(dev); nmi_intr = (struct aw_nmi_intr *)isrc; daf = (struct intr_map_data_fdt *)data; error = aw_nmi_map_fdt(dev, daf->ncells, daf->cells, &irq, &pol, &trig); if (error != 0) return (error); if (nmi_intr->irq != irq) return (EINVAL); /* Compare config if this is not first setup. */ if (isrc->isrc_handlers != 0) { if (pol != nmi_intr->pol || trig != nmi_intr->tri) return (EINVAL); else return (0); } nmi_intr->pol = pol; nmi_intr->tri = trig; if (trig == INTR_TRIGGER_LEVEL) { if (pol == INTR_POLARITY_LOW) icfg = NMI_IRQ_LOW_LEVEL; else icfg = NMI_IRQ_HIGH_LEVEL; } else { if (pol == INTR_POLARITY_HIGH) icfg = NMI_IRQ_HIGH_EDGE; else icfg = NMI_IRQ_LOW_EDGE; } SC_NMI_WRITE(sc, NMI_IRQ_CTRL_REG, icfg); return (0); } static int aw_nmi_teardown_intr(device_t dev, struct intr_irqsrc *isrc, struct resource *res, struct intr_map_data *data) { struct aw_nmi_softc *sc; sc = device_get_softc(dev); if (isrc->isrc_handlers == 0) { sc->intr.pol = INTR_POLARITY_CONFORM; sc->intr.tri = INTR_TRIGGER_CONFORM; SC_NMI_WRITE(sc, sc->enable_reg, !NMI_IRQ_ENABLE); } return (0); } static void aw_nmi_pre_ithread(device_t dev, struct intr_irqsrc *isrc) { struct aw_nmi_softc *sc; sc = device_get_softc(dev); aw_nmi_disable_intr(dev, isrc); SC_NMI_WRITE(sc, NMI_IRQ_PENDING_REG, NMI_IRQ_ACK); } static void aw_nmi_post_ithread(device_t dev, struct intr_irqsrc *isrc) { arm_irq_memory_barrier(0); aw_nmi_enable_intr(dev, isrc); } static void aw_nmi_post_filter(device_t dev, struct intr_irqsrc *isrc) { struct aw_nmi_softc *sc; sc = device_get_softc(dev); arm_irq_memory_barrier(0); SC_NMI_WRITE(sc, NMI_IRQ_PENDING_REG, NMI_IRQ_ACK); } static int aw_nmi_probe(device_t dev) { if (!ofw_bus_status_okay(dev)) return (ENXIO); if (ofw_bus_search_compatible(dev, compat_data)->ocd_data == 0) return (ENXIO); device_set_desc(dev, "Allwinner NMI Controller"); return (BUS_PROBE_DEFAULT); } static int aw_nmi_attach(device_t dev) { struct aw_nmi_softc *sc; phandle_t xref; sc = device_get_softc(dev); sc->dev = dev; if (bus_alloc_resources(dev, aw_nmi_res_spec, sc->res) != 0) { device_printf(dev, "can't allocate device resources\n"); return (ENXIO); } if ((bus_setup_intr(dev, sc->res[1], INTR_TYPE_MISC, aw_nmi_intr, NULL, sc, &sc->intrcookie))) { device_printf(dev, "unable to register interrupt handler\n"); bus_release_resources(dev, aw_nmi_res_spec, sc->res); return (ENXIO); } switch (ofw_bus_search_compatible(dev, compat_data)->ocd_data) { case A20_NMI: sc->enable_reg = A20_NMI_IRQ_ENABLE_REG; break; case A31_NMI: sc->enable_reg = A31_NMI_IRQ_ENABLE_REG; break; } /* Disable and clear interrupts */ SC_NMI_WRITE(sc, sc->enable_reg, !NMI_IRQ_ENABLE); SC_NMI_WRITE(sc, NMI_IRQ_PENDING_REG, NMI_IRQ_ACK); xref = OF_xref_from_node(ofw_bus_get_node(dev)); /* Register our isrc */ sc->intr.irq = 0; sc->intr.pol = INTR_POLARITY_CONFORM; sc->intr.tri = INTR_TRIGGER_CONFORM; if (intr_isrc_register(&sc->intr.isrc, sc->dev, 0, "%s,%u", device_get_nameunit(sc->dev), sc->intr.irq) != 0) goto error; if (intr_pic_register(dev, (intptr_t)xref) == NULL) { device_printf(dev, "could not register pic\n"); goto error; } return (0); error: bus_teardown_intr(dev, sc->res[1], sc->intrcookie); bus_release_resources(dev, aw_nmi_res_spec, sc->res); return (ENXIO); } static device_method_t aw_nmi_methods[] = { DEVMETHOD(device_probe, aw_nmi_probe), DEVMETHOD(device_attach, aw_nmi_attach), /* Interrupt controller interface */ DEVMETHOD(pic_disable_intr, aw_nmi_disable_intr), DEVMETHOD(pic_enable_intr, aw_nmi_enable_intr), DEVMETHOD(pic_map_intr, aw_nmi_map_intr), DEVMETHOD(pic_setup_intr, aw_nmi_setup_intr), DEVMETHOD(pic_teardown_intr, aw_nmi_teardown_intr), DEVMETHOD(pic_post_filter, aw_nmi_post_filter), DEVMETHOD(pic_post_ithread, aw_nmi_post_ithread), DEVMETHOD(pic_pre_ithread, aw_nmi_pre_ithread), {0, 0}, }; static driver_t aw_nmi_driver = { "aw_nmi", aw_nmi_methods, sizeof(struct aw_nmi_softc), }; static devclass_t aw_nmi_devclass; EARLY_DRIVER_MODULE(aw_nmi, simplebus, aw_nmi_driver, - aw_nmi_devclass, 0, 0, BUS_PASS_INTERRUPT + BUS_PASS_ORDER_LAST); + aw_nmi_devclass, 0, 0, BUS_PASS_INTERRUPT + BUS_PASS_ORDER_LATE); Index: user/alc/PQ_LAUNDRY/sys/arm/allwinner/aw_sid.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/arm/allwinner/aw_sid.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/arm/allwinner/aw_sid.c (revision 303206) @@ -1,135 +1,209 @@ /*- * Copyright (c) 2016 Jared McNeill * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ /* * Allwinner secure ID controller */ #include __FBSDID("$FreeBSD$"); +#include #include #include #include #include #include #include +#include #include #include #include #include #define SID_SRAM 0x200 #define SID_THERMAL_CALIB0 (SID_SRAM + 0x34) #define SID_THERMAL_CALIB1 (SID_SRAM + 0x38) +enum sid_type { + A10_SID = 1, + A20_SID, + A83T_SID, +}; + static struct ofw_compat_data compat_data[] = { - { "allwinner,sun8i-a83t-sid", 1 }, + { "allwinner,sun4i-a10-sid", A10_SID}, + { "allwinner,sun7i-a20-sid", A20_SID}, + { "allwinner,sun8i-a83t-sid", A83T_SID}, { NULL, 0 } }; struct aw_sid_softc { struct resource *res; + int type; }; static struct aw_sid_softc *aw_sid_sc; static struct resource_spec aw_sid_spec[] = { { SYS_RES_MEMORY, 0, RF_ACTIVE }, { -1, 0 } }; +enum sid_keys { + AW_SID_ROOT_KEY, +}; + +#define ROOT_KEY_OFF 0x0 +#define ROOT_KEY_SIZE 4 + #define RD4(sc, reg) bus_read_4((sc)->res, (reg)) #define WR4(sc, reg, val) bus_write_4((sc)->res, (reg), (val)) +static int aw_sid_sysctl(SYSCTL_HANDLER_ARGS); + static int aw_sid_probe(device_t dev) { if (!ofw_bus_status_okay(dev)) return (ENXIO); if (ofw_bus_search_compatible(dev, compat_data)->ocd_data == 0) return (ENXIO); device_set_desc(dev, "Allwinner Secure ID Controller"); return (BUS_PROBE_DEFAULT); } static int aw_sid_attach(device_t dev) { struct aw_sid_softc *sc; sc = device_get_softc(dev); if (bus_alloc_resources(dev, aw_sid_spec, &sc->res) != 0) { device_printf(dev, "cannot allocate resources for device\n"); return (ENXIO); } aw_sid_sc = sc; + sc->type = ofw_bus_search_compatible(dev, compat_data)->ocd_data; + switch (sc->type) { + case A10_SID: + case A20_SID: + SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), + SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), + OID_AUTO, "rootkey", + CTLTYPE_STRING | CTLFLAG_RD, + dev, AW_SID_ROOT_KEY, aw_sid_sysctl, "A", "Root Key"); + break; + default: + break; + } return (0); } int aw_sid_read_tscalib(uint32_t *calib0, uint32_t *calib1) { struct aw_sid_softc *sc; sc = aw_sid_sc; if (sc == NULL) return (ENXIO); + if (sc->type != A83T_SID) + return (ENXIO); *calib0 = RD4(sc, SID_THERMAL_CALIB0); *calib1 = RD4(sc, SID_THERMAL_CALIB1); return (0); +} + +int +aw_sid_get_rootkey(u_char *out) +{ + struct aw_sid_softc *sc; + int i; + u_int tmp; + + sc = aw_sid_sc; + if (sc == NULL) + return (ENXIO); + if (sc->type != A10_SID && sc->type != A20_SID) + return (ENXIO); + + for (i = 0; i < ROOT_KEY_SIZE ; i++) { + tmp = RD4(aw_sid_sc, ROOT_KEY_OFF + (i * 4)); + be32enc(&out[i * 4], tmp); + } + + return (0); +} + +static int +aw_sid_sysctl(SYSCTL_HANDLER_ARGS) +{ + enum sid_keys key = arg2; + u_char rootkey[16]; + char out[33]; + + if (key != AW_SID_ROOT_KEY) + return (ENOENT); + + if (aw_sid_get_rootkey(rootkey) != 0) + return (ENOENT); + snprintf(out, sizeof(out), + "%16D", rootkey, ""); + + return sysctl_handle_string(oidp, out, sizeof(out), req); } static device_method_t aw_sid_methods[] = { /* Device interface */ DEVMETHOD(device_probe, aw_sid_probe), DEVMETHOD(device_attach, aw_sid_attach), DEVMETHOD_END }; static driver_t aw_sid_driver = { "aw_sid", aw_sid_methods, sizeof(struct aw_sid_softc), }; static devclass_t aw_sid_devclass; EARLY_DRIVER_MODULE(aw_sid, simplebus, aw_sid_driver, aw_sid_devclass, 0, 0, BUS_PASS_RESOURCE + BUS_PASS_ORDER_FIRST); MODULE_VERSION(aw_sid, 1); Index: user/alc/PQ_LAUNDRY/sys/arm/allwinner/aw_sid.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/arm/allwinner/aw_sid.h (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/arm/allwinner/aw_sid.h (revision 303206) @@ -1,34 +1,35 @@ /*- * Copyright (c) 2016 Jared McNeill * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef __AW_SID_H__ #define __AW_SID_H__ int aw_sid_read_tscalib(uint32_t *, uint32_t *); +int aw_sid_get_rootkey(u_char *out); #endif /* !__AW_SID_H__ */ Index: user/alc/PQ_LAUNDRY/sys/cam/cam_ccb.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/cam/cam_ccb.h (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/cam/cam_ccb.h (revision 303206) @@ -1,1434 +1,1435 @@ /*- * Data structures and definitions for CAM Control Blocks (CCBs). * * Copyright (c) 1997, 1998 Justin T. Gibbs. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions, and the following disclaimer, * without modification, immediately at the beginning of the file. * 2. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _CAM_CAM_CCB_H #define _CAM_CAM_CCB_H 1 #include #include #include #include #ifndef _KERNEL #include #endif #include #include #include #include /* General allocation length definitions for CCB structures */ #define IOCDBLEN CAM_MAX_CDBLEN /* Space for CDB bytes/pointer */ #define VUHBALEN 14 /* Vendor Unique HBA length */ #define SIM_IDLEN 16 /* ASCII string len for SIM ID */ #define HBA_IDLEN 16 /* ASCII string len for HBA ID */ #define DEV_IDLEN 16 /* ASCII string len for device names */ #define CCB_PERIPH_PRIV_SIZE 2 /* size of peripheral private area */ #define CCB_SIM_PRIV_SIZE 2 /* size of sim private area */ /* Struct definitions for CAM control blocks */ /* Common CCB header */ /* CAM CCB flags */ typedef enum { CAM_CDB_POINTER = 0x00000001,/* The CDB field is a pointer */ CAM_QUEUE_ENABLE = 0x00000002,/* SIM queue actions are enabled */ CAM_CDB_LINKED = 0x00000004,/* CCB contains a linked CDB */ CAM_NEGOTIATE = 0x00000008,/* * Perform transport negotiation * with this command. */ CAM_DATA_ISPHYS = 0x00000010,/* Data type with physical addrs */ CAM_DIS_AUTOSENSE = 0x00000020,/* Disable autosense feature */ CAM_DIR_BOTH = 0x00000000,/* Data direction (00:IN/OUT) */ CAM_DIR_IN = 0x00000040,/* Data direction (01:DATA IN) */ CAM_DIR_OUT = 0x00000080,/* Data direction (10:DATA OUT) */ CAM_DIR_NONE = 0x000000C0,/* Data direction (11:no data) */ CAM_DIR_MASK = 0x000000C0,/* Data direction Mask */ CAM_DATA_VADDR = 0x00000000,/* Data type (000:Virtual) */ CAM_DATA_PADDR = 0x00000010,/* Data type (001:Physical) */ CAM_DATA_SG = 0x00040000,/* Data type (010:sglist) */ CAM_DATA_SG_PADDR = 0x00040010,/* Data type (011:sglist phys) */ CAM_DATA_BIO = 0x00200000,/* Data type (100:bio) */ CAM_DATA_MASK = 0x00240010,/* Data type mask */ CAM_SOFT_RST_OP = 0x00000100,/* Use Soft reset alternative */ CAM_ENG_SYNC = 0x00000200,/* Flush resid bytes on complete */ CAM_DEV_QFRZDIS = 0x00000400,/* Disable DEV Q freezing */ CAM_DEV_QFREEZE = 0x00000800,/* Freeze DEV Q on execution */ CAM_HIGH_POWER = 0x00001000,/* Command takes a lot of power */ CAM_SENSE_PTR = 0x00002000,/* Sense data is a pointer */ CAM_SENSE_PHYS = 0x00004000,/* Sense pointer is physical addr*/ CAM_TAG_ACTION_VALID = 0x00008000,/* Use the tag action in this ccb*/ CAM_PASS_ERR_RECOVER = 0x00010000,/* Pass driver does err. recovery*/ CAM_DIS_DISCONNECT = 0x00020000,/* Disable disconnect */ CAM_MSG_BUF_PHYS = 0x00080000,/* Message buffer ptr is physical*/ CAM_SNS_BUF_PHYS = 0x00100000,/* Autosense data ptr is physical*/ CAM_CDB_PHYS = 0x00400000,/* CDB poiner is physical */ CAM_ENG_SGLIST = 0x00800000,/* SG list is for the HBA engine */ /* Phase cognizant mode flags */ CAM_DIS_AUTOSRP = 0x01000000,/* Disable autosave/restore ptrs */ CAM_DIS_AUTODISC = 0x02000000,/* Disable auto disconnect */ CAM_TGT_CCB_AVAIL = 0x04000000,/* Target CCB available */ CAM_TGT_PHASE_MODE = 0x08000000,/* The SIM runs in phase mode */ CAM_MSGB_VALID = 0x10000000,/* Message buffer valid */ CAM_STATUS_VALID = 0x20000000,/* Status buffer valid */ CAM_DATAB_VALID = 0x40000000,/* Data buffer valid */ /* Host target Mode flags */ CAM_SEND_SENSE = 0x08000000,/* Send sense data with status */ CAM_TERM_IO = 0x10000000,/* Terminate I/O Message sup. */ CAM_DISCONNECT = 0x20000000,/* Disconnects are mandatory */ CAM_SEND_STATUS = 0x40000000,/* Send status after data phase */ CAM_UNLOCKED = 0x80000000 /* Call callback without lock. */ } ccb_flags; typedef enum { CAM_USER_DATA_ADDR = 0x00000002,/* Userspace data pointers */ CAM_SG_FORMAT_IOVEC = 0x00000004,/* iovec instead of busdma S/G*/ CAM_UNMAPPED_BUF = 0x00000008 /* use unmapped I/O */ } ccb_xflags; /* XPT Opcodes for xpt_action */ typedef enum { /* Function code flags are bits greater than 0xff */ XPT_FC_QUEUED = 0x100, /* Non-immediate function code */ XPT_FC_USER_CCB = 0x200, XPT_FC_XPT_ONLY = 0x400, /* Only for the transport layer device */ XPT_FC_DEV_QUEUED = 0x800 | XPT_FC_QUEUED, /* Passes through the device queues */ /* Common function commands: 0x00->0x0F */ XPT_NOOP = 0x00, /* Execute Nothing */ XPT_SCSI_IO = 0x01 | XPT_FC_DEV_QUEUED, /* Execute the requested I/O operation */ XPT_GDEV_TYPE = 0x02, /* Get type information for specified device */ XPT_GDEVLIST = 0x03, /* Get a list of peripheral devices */ XPT_PATH_INQ = 0x04, /* Path routing inquiry */ XPT_REL_SIMQ = 0x05, /* Release a frozen device queue */ XPT_SASYNC_CB = 0x06, /* Set Asynchronous Callback Parameters */ XPT_SDEV_TYPE = 0x07, /* Set device type information */ XPT_SCAN_BUS = 0x08 | XPT_FC_QUEUED | XPT_FC_USER_CCB | XPT_FC_XPT_ONLY, /* (Re)Scan the SCSI Bus */ XPT_DEV_MATCH = 0x09 | XPT_FC_XPT_ONLY, /* Get EDT entries matching the given pattern */ XPT_DEBUG = 0x0a, /* Turn on debugging for a bus, target or lun */ XPT_PATH_STATS = 0x0b, /* Path statistics (error counts, etc.) */ XPT_GDEV_STATS = 0x0c, /* Device statistics (error counts, etc.) */ XPT_DEV_ADVINFO = 0x0e, /* Get/Set Device advanced information */ XPT_ASYNC = 0x0f | XPT_FC_QUEUED | XPT_FC_USER_CCB | XPT_FC_XPT_ONLY, /* Asynchronous event */ /* SCSI Control Functions: 0x10->0x1F */ XPT_ABORT = 0x10, /* Abort the specified CCB */ XPT_RESET_BUS = 0x11 | XPT_FC_XPT_ONLY, /* Reset the specified SCSI bus */ XPT_RESET_DEV = 0x12 | XPT_FC_DEV_QUEUED, /* Bus Device Reset the specified SCSI device */ XPT_TERM_IO = 0x13, /* Terminate the I/O process */ XPT_SCAN_LUN = 0x14 | XPT_FC_QUEUED | XPT_FC_USER_CCB | XPT_FC_XPT_ONLY, /* Scan Logical Unit */ XPT_GET_TRAN_SETTINGS = 0x15, /* * Get default/user transfer settings * for the target */ XPT_SET_TRAN_SETTINGS = 0x16, /* * Set transfer rate/width * negotiation settings */ XPT_CALC_GEOMETRY = 0x17, /* * Calculate the geometry parameters for * a device give the sector size and * volume size. */ XPT_ATA_IO = 0x18 | XPT_FC_DEV_QUEUED, /* Execute the requested ATA I/O operation */ XPT_GET_SIM_KNOB_OLD = 0x18, /* Compat only */ XPT_SET_SIM_KNOB = 0x19, /* * Set SIM specific knob values. */ XPT_GET_SIM_KNOB = 0x1a, /* * Get SIM specific knob values. */ XPT_SMP_IO = 0x1b | XPT_FC_DEV_QUEUED, /* Serial Management Protocol */ XPT_NVME_IO = 0x1c | XPT_FC_DEV_QUEUED, /* Execiute the requestred NVMe I/O operation */ XPT_MMCSD_IO = 0x1d | XPT_FC_DEV_QUEUED, /* Placeholder for MMC / SD / SDIO I/O stuff */ XPT_SCAN_TGT = 0x1E | XPT_FC_QUEUED | XPT_FC_USER_CCB | XPT_FC_XPT_ONLY, /* Scan Target */ /* HBA engine commands 0x20->0x2F */ XPT_ENG_INQ = 0x20 | XPT_FC_XPT_ONLY, /* HBA engine feature inquiry */ XPT_ENG_EXEC = 0x21 | XPT_FC_DEV_QUEUED, /* HBA execute engine request */ /* Target mode commands: 0x30->0x3F */ XPT_EN_LUN = 0x30, /* Enable LUN as a target */ XPT_TARGET_IO = 0x31 | XPT_FC_DEV_QUEUED, /* Execute target I/O request */ XPT_ACCEPT_TARGET_IO = 0x32 | XPT_FC_QUEUED | XPT_FC_USER_CCB, /* Accept Host Target Mode CDB */ XPT_CONT_TARGET_IO = 0x33 | XPT_FC_DEV_QUEUED, /* Continue Host Target I/O Connection */ XPT_IMMED_NOTIFY = 0x34 | XPT_FC_QUEUED | XPT_FC_USER_CCB, /* Notify Host Target driver of event (obsolete) */ XPT_NOTIFY_ACK = 0x35, /* Acknowledgement of event (obsolete) */ XPT_IMMEDIATE_NOTIFY = 0x36 | XPT_FC_QUEUED | XPT_FC_USER_CCB, /* Notify Host Target driver of event */ XPT_NOTIFY_ACKNOWLEDGE = 0x37 | XPT_FC_QUEUED | XPT_FC_USER_CCB, /* Acknowledgement of event */ XPT_REPROBE_LUN = 0x38 | XPT_FC_QUEUED | XPT_FC_USER_CCB, /* Query device capacity and notify GEOM */ /* Vendor Unique codes: 0x80->0x8F */ XPT_VUNIQUE = 0x80 } xpt_opcode; #define XPT_FC_GROUP_MASK 0xF0 #define XPT_FC_GROUP(op) ((op) & XPT_FC_GROUP_MASK) #define XPT_FC_GROUP_COMMON 0x00 #define XPT_FC_GROUP_SCSI_CONTROL 0x10 #define XPT_FC_GROUP_HBA_ENGINE 0x20 #define XPT_FC_GROUP_TMODE 0x30 #define XPT_FC_GROUP_VENDOR_UNIQUE 0x80 #define XPT_FC_IS_DEV_QUEUED(ccb) \ (((ccb)->ccb_h.func_code & XPT_FC_DEV_QUEUED) == XPT_FC_DEV_QUEUED) #define XPT_FC_IS_QUEUED(ccb) \ (((ccb)->ccb_h.func_code & XPT_FC_QUEUED) != 0) typedef enum { PROTO_UNKNOWN, PROTO_UNSPECIFIED, PROTO_SCSI, /* Small Computer System Interface */ PROTO_ATA, /* AT Attachment */ PROTO_ATAPI, /* AT Attachment Packetized Interface */ PROTO_SATAPM, /* SATA Port Multiplier */ PROTO_SEMB, /* SATA Enclosure Management Bridge */ PROTO_NVME, /* NVME */ } cam_proto; typedef enum { XPORT_UNKNOWN, XPORT_UNSPECIFIED, XPORT_SPI, /* SCSI Parallel Interface */ XPORT_FC, /* Fiber Channel */ XPORT_SSA, /* Serial Storage Architecture */ XPORT_USB, /* Universal Serial Bus */ XPORT_PPB, /* Parallel Port Bus */ XPORT_ATA, /* AT Attachment */ XPORT_SAS, /* Serial Attached SCSI */ XPORT_SATA, /* Serial AT Attachment */ XPORT_ISCSI, /* iSCSI */ XPORT_SRP, /* SCSI RDMA Protocol */ XPORT_NVME, /* NVMe over PCIe */ } cam_xport; +#define XPORT_IS_NVME(t) ((t) == XPORT_NVME) #define XPORT_IS_ATA(t) ((t) == XPORT_ATA || (t) == XPORT_SATA) #define XPORT_IS_SCSI(t) ((t) != XPORT_UNKNOWN && \ (t) != XPORT_UNSPECIFIED && \ - !XPORT_IS_ATA(t)) + !XPORT_IS_ATA(t) && !XPORT_IS_NVME(t)) #define XPORT_DEVSTAT_TYPE(t) (XPORT_IS_ATA(t) ? DEVSTAT_TYPE_IF_IDE : \ XPORT_IS_SCSI(t) ? DEVSTAT_TYPE_IF_SCSI : \ DEVSTAT_TYPE_IF_OTHER) #define PROTO_VERSION_UNKNOWN (UINT_MAX - 1) #define PROTO_VERSION_UNSPECIFIED UINT_MAX #define XPORT_VERSION_UNKNOWN (UINT_MAX - 1) #define XPORT_VERSION_UNSPECIFIED UINT_MAX typedef union { LIST_ENTRY(ccb_hdr) le; SLIST_ENTRY(ccb_hdr) sle; TAILQ_ENTRY(ccb_hdr) tqe; STAILQ_ENTRY(ccb_hdr) stqe; } camq_entry; typedef union { void *ptr; u_long field; u_int8_t bytes[sizeof(uintptr_t)]; } ccb_priv_entry; typedef union { ccb_priv_entry entries[CCB_PERIPH_PRIV_SIZE]; u_int8_t bytes[CCB_PERIPH_PRIV_SIZE * sizeof(ccb_priv_entry)]; } ccb_ppriv_area; typedef union { ccb_priv_entry entries[CCB_SIM_PRIV_SIZE]; u_int8_t bytes[CCB_SIM_PRIV_SIZE * sizeof(ccb_priv_entry)]; } ccb_spriv_area; typedef struct { struct timeval *etime; uintptr_t sim_data; uintptr_t periph_data; } ccb_qos_area; struct ccb_hdr { cam_pinfo pinfo; /* Info for priority scheduling */ camq_entry xpt_links; /* For chaining in the XPT layer */ camq_entry sim_links; /* For chaining in the SIM layer */ camq_entry periph_links; /* For chaining in the type driver */ u_int32_t retry_count; void (*cbfcnp)(struct cam_periph *, union ccb *); /* Callback on completion function */ xpt_opcode func_code; /* XPT function code */ u_int32_t status; /* Status returned by CAM subsystem */ struct cam_path *path; /* Compiled path for this ccb */ path_id_t path_id; /* Path ID for the request */ target_id_t target_id; /* Target device ID */ lun_id_t target_lun; /* Target LUN number */ u_int32_t flags; /* ccb_flags */ u_int32_t xflags; /* Extended flags */ ccb_ppriv_area periph_priv; ccb_spriv_area sim_priv; ccb_qos_area qos; u_int32_t timeout; /* Hard timeout value in mseconds */ struct timeval softtimeout; /* Soft timeout value in sec + usec */ }; /* Get Device Information CCB */ struct ccb_getdev { struct ccb_hdr ccb_h; cam_proto protocol; struct scsi_inquiry_data inq_data; struct ata_params ident_data; u_int8_t serial_num[252]; u_int8_t inq_flags; u_int8_t serial_num_len; const struct nvme_controller_data *nvme_cdata; const struct nvme_namespace_data *nvme_data; }; /* Device Statistics CCB */ struct ccb_getdevstats { struct ccb_hdr ccb_h; int dev_openings; /* Space left for more work on device*/ int dev_active; /* Transactions running on the device */ int allocated; /* CCBs allocated for the device */ int queued; /* CCBs queued to be sent to the device */ int held; /* * CCBs held by peripheral drivers * for this device */ int maxtags; /* * Boundary conditions for number of * tagged operations */ int mintags; struct timeval last_reset; /* Time of last bus reset/loop init */ }; typedef enum { CAM_GDEVLIST_LAST_DEVICE, CAM_GDEVLIST_LIST_CHANGED, CAM_GDEVLIST_MORE_DEVS, CAM_GDEVLIST_ERROR } ccb_getdevlist_status_e; struct ccb_getdevlist { struct ccb_hdr ccb_h; char periph_name[DEV_IDLEN]; u_int32_t unit_number; unsigned int generation; u_int32_t index; ccb_getdevlist_status_e status; }; typedef enum { PERIPH_MATCH_NONE = 0x000, PERIPH_MATCH_PATH = 0x001, PERIPH_MATCH_TARGET = 0x002, PERIPH_MATCH_LUN = 0x004, PERIPH_MATCH_NAME = 0x008, PERIPH_MATCH_UNIT = 0x010, PERIPH_MATCH_ANY = 0x01f } periph_pattern_flags; struct periph_match_pattern { char periph_name[DEV_IDLEN]; u_int32_t unit_number; path_id_t path_id; target_id_t target_id; lun_id_t target_lun; periph_pattern_flags flags; }; typedef enum { DEV_MATCH_NONE = 0x000, DEV_MATCH_PATH = 0x001, DEV_MATCH_TARGET = 0x002, DEV_MATCH_LUN = 0x004, DEV_MATCH_INQUIRY = 0x008, DEV_MATCH_DEVID = 0x010, DEV_MATCH_ANY = 0x00f } dev_pattern_flags; struct device_id_match_pattern { uint8_t id_len; uint8_t id[256]; }; struct device_match_pattern { path_id_t path_id; target_id_t target_id; lun_id_t target_lun; dev_pattern_flags flags; union { struct scsi_static_inquiry_pattern inq_pat; struct device_id_match_pattern devid_pat; } data; }; typedef enum { BUS_MATCH_NONE = 0x000, BUS_MATCH_PATH = 0x001, BUS_MATCH_NAME = 0x002, BUS_MATCH_UNIT = 0x004, BUS_MATCH_BUS_ID = 0x008, BUS_MATCH_ANY = 0x00f } bus_pattern_flags; struct bus_match_pattern { path_id_t path_id; char dev_name[DEV_IDLEN]; u_int32_t unit_number; u_int32_t bus_id; bus_pattern_flags flags; }; union match_pattern { struct periph_match_pattern periph_pattern; struct device_match_pattern device_pattern; struct bus_match_pattern bus_pattern; }; typedef enum { DEV_MATCH_PERIPH, DEV_MATCH_DEVICE, DEV_MATCH_BUS } dev_match_type; struct dev_match_pattern { dev_match_type type; union match_pattern pattern; }; struct periph_match_result { char periph_name[DEV_IDLEN]; u_int32_t unit_number; path_id_t path_id; target_id_t target_id; lun_id_t target_lun; }; typedef enum { DEV_RESULT_NOFLAG = 0x00, DEV_RESULT_UNCONFIGURED = 0x01 } dev_result_flags; struct device_match_result { path_id_t path_id; target_id_t target_id; lun_id_t target_lun; cam_proto protocol; struct scsi_inquiry_data inq_data; struct ata_params ident_data; dev_result_flags flags; }; struct bus_match_result { path_id_t path_id; char dev_name[DEV_IDLEN]; u_int32_t unit_number; u_int32_t bus_id; }; union match_result { struct periph_match_result periph_result; struct device_match_result device_result; struct bus_match_result bus_result; }; struct dev_match_result { dev_match_type type; union match_result result; }; typedef enum { CAM_DEV_MATCH_LAST, CAM_DEV_MATCH_MORE, CAM_DEV_MATCH_LIST_CHANGED, CAM_DEV_MATCH_SIZE_ERROR, CAM_DEV_MATCH_ERROR } ccb_dev_match_status; typedef enum { CAM_DEV_POS_NONE = 0x000, CAM_DEV_POS_BUS = 0x001, CAM_DEV_POS_TARGET = 0x002, CAM_DEV_POS_DEVICE = 0x004, CAM_DEV_POS_PERIPH = 0x008, CAM_DEV_POS_PDPTR = 0x010, CAM_DEV_POS_TYPEMASK = 0xf00, CAM_DEV_POS_EDT = 0x100, CAM_DEV_POS_PDRV = 0x200 } dev_pos_type; struct ccb_dm_cookie { void *bus; void *target; void *device; void *periph; void *pdrv; }; struct ccb_dev_position { u_int generations[4]; #define CAM_BUS_GENERATION 0x00 #define CAM_TARGET_GENERATION 0x01 #define CAM_DEV_GENERATION 0x02 #define CAM_PERIPH_GENERATION 0x03 dev_pos_type position_type; struct ccb_dm_cookie cookie; }; struct ccb_dev_match { struct ccb_hdr ccb_h; ccb_dev_match_status status; u_int32_t num_patterns; u_int32_t pattern_buf_len; struct dev_match_pattern *patterns; u_int32_t num_matches; u_int32_t match_buf_len; struct dev_match_result *matches; struct ccb_dev_position pos; }; /* * Definitions for the path inquiry CCB fields. */ #define CAM_VERSION 0x19 /* Hex value for current version */ typedef enum { PI_MDP_ABLE = 0x80, /* Supports MDP message */ PI_WIDE_32 = 0x40, /* Supports 32 bit wide SCSI */ PI_WIDE_16 = 0x20, /* Supports 16 bit wide SCSI */ PI_SDTR_ABLE = 0x10, /* Supports SDTR message */ PI_LINKED_CDB = 0x08, /* Supports linked CDBs */ PI_SATAPM = 0x04, /* Supports SATA PM */ PI_TAG_ABLE = 0x02, /* Supports tag queue messages */ PI_SOFT_RST = 0x01 /* Supports soft reset alternative */ } pi_inqflag; typedef enum { PIT_PROCESSOR = 0x80, /* Target mode processor mode */ PIT_PHASE = 0x40, /* Target mode phase cog. mode */ PIT_DISCONNECT = 0x20, /* Disconnects supported in target mode */ PIT_TERM_IO = 0x10, /* Terminate I/O message supported in TM */ PIT_GRP_6 = 0x08, /* Group 6 commands supported */ PIT_GRP_7 = 0x04 /* Group 7 commands supported */ } pi_tmflag; typedef enum { PIM_ATA_EXT = 0x200,/* ATA requests can understand ata_ext requests */ PIM_EXTLUNS = 0x100,/* 64bit extended LUNs supported */ PIM_SCANHILO = 0x80, /* Bus scans from high ID to low ID */ PIM_NOREMOVE = 0x40, /* Removeable devices not included in scan */ PIM_NOINITIATOR = 0x20, /* Initiator role not supported. */ PIM_NOBUSRESET = 0x10, /* User has disabled initial BUS RESET */ PIM_NO_6_BYTE = 0x08, /* Do not send 6-byte commands */ PIM_SEQSCAN = 0x04, /* Do bus scans sequentially, not in parallel */ PIM_UNMAPPED = 0x02, PIM_NOSCAN = 0x01 /* SIM does its own scanning */ } pi_miscflag; /* Path Inquiry CCB */ struct ccb_pathinq_settings_spi { u_int8_t ppr_options; }; struct ccb_pathinq_settings_fc { u_int64_t wwnn; /* world wide node name */ u_int64_t wwpn; /* world wide port name */ u_int32_t port; /* 24 bit port id, if known */ u_int32_t bitrate; /* Mbps */ }; struct ccb_pathinq_settings_sas { u_int32_t bitrate; /* Mbps */ }; struct ccb_pathinq_settings_nvme { uint16_t nsid; /* Namespace ID for this path */ }; #define PATHINQ_SETTINGS_SIZE 128 struct ccb_pathinq { struct ccb_hdr ccb_h; u_int8_t version_num; /* Version number for the SIM/HBA */ u_int8_t hba_inquiry; /* Mimic of INQ byte 7 for the HBA */ u_int16_t target_sprt; /* Flags for target mode support */ u_int32_t hba_misc; /* Misc HBA features */ u_int16_t hba_eng_cnt; /* HBA engine count */ /* Vendor Unique capabilities */ u_int8_t vuhba_flags[VUHBALEN]; u_int32_t max_target; /* Maximum supported Target */ u_int32_t max_lun; /* Maximum supported Lun */ u_int32_t async_flags; /* Installed Async handlers */ path_id_t hpath_id; /* Highest Path ID in the subsystem */ target_id_t initiator_id; /* ID of the HBA on the SCSI bus */ char sim_vid[SIM_IDLEN]; /* Vendor ID of the SIM */ char hba_vid[HBA_IDLEN]; /* Vendor ID of the HBA */ char dev_name[DEV_IDLEN];/* Device name for SIM */ u_int32_t unit_number; /* Unit number for SIM */ u_int32_t bus_id; /* Bus ID for SIM */ u_int32_t base_transfer_speed;/* Base bus speed in KB/sec */ cam_proto protocol; u_int protocol_version; cam_xport transport; u_int transport_version; union { struct ccb_pathinq_settings_spi spi; struct ccb_pathinq_settings_fc fc; struct ccb_pathinq_settings_sas sas; struct ccb_pathinq_settings_nvme nvme; char ccb_pathinq_settings_opaque[PATHINQ_SETTINGS_SIZE]; } xport_specific; u_int maxio; /* Max supported I/O size, in bytes. */ u_int16_t hba_vendor; /* HBA vendor ID */ u_int16_t hba_device; /* HBA device ID */ u_int16_t hba_subvendor; /* HBA subvendor ID */ u_int16_t hba_subdevice; /* HBA subdevice ID */ }; /* Path Statistics CCB */ struct ccb_pathstats { struct ccb_hdr ccb_h; struct timeval last_reset; /* Time of last bus reset/loop init */ }; typedef enum { SMP_FLAG_NONE = 0x00, SMP_FLAG_REQ_SG = 0x01, SMP_FLAG_RSP_SG = 0x02 } ccb_smp_pass_flags; /* * Serial Management Protocol CCB * XXX Currently the semantics for this CCB are that it is executed either * by the addressed device, or that device's parent (i.e. an expander for * any device on an expander) if the addressed device doesn't support SMP. * Later, once we have the ability to probe SMP-only devices and put them * in CAM's topology, the CCB will only be executed by the addressed device * if possible. */ struct ccb_smpio { struct ccb_hdr ccb_h; uint8_t *smp_request; int smp_request_len; uint16_t smp_request_sglist_cnt; uint8_t *smp_response; int smp_response_len; uint16_t smp_response_sglist_cnt; ccb_smp_pass_flags flags; }; typedef union { u_int8_t *sense_ptr; /* * Pointer to storage * for sense information */ /* Storage Area for sense information */ struct scsi_sense_data sense_buf; } sense_t; typedef union { u_int8_t *cdb_ptr; /* Pointer to the CDB bytes to send */ /* Area for the CDB send */ u_int8_t cdb_bytes[IOCDBLEN]; } cdb_t; /* * SCSI I/O Request CCB used for the XPT_SCSI_IO and XPT_CONT_TARGET_IO * function codes. */ struct ccb_scsiio { struct ccb_hdr ccb_h; union ccb *next_ccb; /* Ptr for next CCB for action */ u_int8_t *req_map; /* Ptr to mapping info */ u_int8_t *data_ptr; /* Ptr to the data buf/SG list */ u_int32_t dxfer_len; /* Data transfer length */ /* Autosense storage */ struct scsi_sense_data sense_data; u_int8_t sense_len; /* Number of bytes to autosense */ u_int8_t cdb_len; /* Number of bytes for the CDB */ u_int16_t sglist_cnt; /* Number of SG list entries */ u_int8_t scsi_status; /* Returned SCSI status */ u_int8_t sense_resid; /* Autosense resid length: 2's comp */ u_int32_t resid; /* Transfer residual length: 2's comp */ cdb_t cdb_io; /* Union for CDB bytes/pointer */ u_int8_t *msg_ptr; /* Pointer to the message buffer */ u_int16_t msg_len; /* Number of bytes for the Message */ u_int8_t tag_action; /* What to do for tag queueing */ /* * The tag action should be either the define below (to send a * non-tagged transaction) or one of the defined scsi tag messages * from scsi_message.h. */ #define CAM_TAG_ACTION_NONE 0x00 u_int tag_id; /* tag id from initator (target mode) */ u_int init_id; /* initiator id of who selected */ }; static __inline uint8_t * scsiio_cdb_ptr(struct ccb_scsiio *ccb) { return ((ccb->ccb_h.flags & CAM_CDB_POINTER) ? ccb->cdb_io.cdb_ptr : ccb->cdb_io.cdb_bytes); } /* * ATA I/O Request CCB used for the XPT_ATA_IO function code. */ struct ccb_ataio { struct ccb_hdr ccb_h; union ccb *next_ccb; /* Ptr for next CCB for action */ struct ata_cmd cmd; /* ATA command register set */ struct ata_res res; /* ATA result register set */ u_int8_t *data_ptr; /* Ptr to the data buf/SG list */ u_int32_t dxfer_len; /* Data transfer length */ u_int32_t resid; /* Transfer residual length: 2's comp */ u_int8_t ata_flags; /* Flags for the rest of the buffer */ #define ATA_FLAG_AUX 0x1 uint32_t aux; uint32_t unused; }; struct ccb_accept_tio { struct ccb_hdr ccb_h; cdb_t cdb_io; /* Union for CDB bytes/pointer */ u_int8_t cdb_len; /* Number of bytes for the CDB */ u_int8_t tag_action; /* What to do for tag queueing */ u_int8_t sense_len; /* Number of bytes of Sense Data */ u_int tag_id; /* tag id from initator (target mode) */ u_int init_id; /* initiator id of who selected */ struct scsi_sense_data sense_data; }; /* Release SIM Queue */ struct ccb_relsim { struct ccb_hdr ccb_h; u_int32_t release_flags; #define RELSIM_ADJUST_OPENINGS 0x01 #define RELSIM_RELEASE_AFTER_TIMEOUT 0x02 #define RELSIM_RELEASE_AFTER_CMDCMPLT 0x04 #define RELSIM_RELEASE_AFTER_QEMPTY 0x08 u_int32_t openings; u_int32_t release_timeout; /* Abstract argument. */ u_int32_t qfrozen_cnt; }; /* * NVMe I/O Request CCB used for the XPT_NVME_IO function code. */ struct ccb_nvmeio { struct ccb_hdr ccb_h; union ccb *next_ccb; /* Ptr for next CCB for action */ struct nvme_command cmd; /* NVME command, per NVME standard */ struct nvme_completion cpl; /* NVME completion, per NVME standard */ uint8_t *data_ptr; /* Ptr to the data buf/SG list */ uint32_t dxfer_len; /* Data transfer length */ uint32_t resid; /* Transfer residual length: 2's comp unused ?*/ }; /* * Definitions for the asynchronous callback CCB fields. */ typedef enum { AC_UNIT_ATTENTION = 0x4000,/* Device reported UNIT ATTENTION */ AC_ADVINFO_CHANGED = 0x2000,/* Advance info might have changes */ AC_CONTRACT = 0x1000,/* A contractual callback */ AC_GETDEV_CHANGED = 0x800,/* Getdev info might have changed */ AC_INQ_CHANGED = 0x400,/* Inquiry info might have changed */ AC_TRANSFER_NEG = 0x200,/* New transfer settings in effect */ AC_LOST_DEVICE = 0x100,/* A device went away */ AC_FOUND_DEVICE = 0x080,/* A new device was found */ AC_PATH_DEREGISTERED = 0x040,/* A path has de-registered */ AC_PATH_REGISTERED = 0x020,/* A new path has been registered */ AC_SENT_BDR = 0x010,/* A BDR message was sent to target */ AC_SCSI_AEN = 0x008,/* A SCSI AEN has been received */ AC_UNSOL_RESEL = 0x002,/* Unsolicited reselection occurred */ AC_BUS_RESET = 0x001 /* A SCSI bus reset occurred */ } ac_code; typedef void ac_callback_t (void *softc, u_int32_t code, struct cam_path *path, void *args); /* * Generic Asynchronous callbacks. * * Generic arguments passed bac which are then interpreted between a per-system * contract number. */ #define AC_CONTRACT_DATA_MAX (128 - sizeof (u_int64_t)) struct ac_contract { u_int64_t contract_number; u_int8_t contract_data[AC_CONTRACT_DATA_MAX]; }; #define AC_CONTRACT_DEV_CHG 1 struct ac_device_changed { u_int64_t wwpn; u_int32_t port; target_id_t target; u_int8_t arrived; }; /* Set Asynchronous Callback CCB */ struct ccb_setasync { struct ccb_hdr ccb_h; u_int32_t event_enable; /* Async Event enables */ ac_callback_t *callback; void *callback_arg; }; /* Set Device Type CCB */ struct ccb_setdev { struct ccb_hdr ccb_h; u_int8_t dev_type; /* Value for dev type field in EDT */ }; /* SCSI Control Functions */ /* Abort XPT request CCB */ struct ccb_abort { struct ccb_hdr ccb_h; union ccb *abort_ccb; /* Pointer to CCB to abort */ }; /* Reset SCSI Bus CCB */ struct ccb_resetbus { struct ccb_hdr ccb_h; }; /* Reset SCSI Device CCB */ struct ccb_resetdev { struct ccb_hdr ccb_h; }; /* Terminate I/O Process Request CCB */ struct ccb_termio { struct ccb_hdr ccb_h; union ccb *termio_ccb; /* Pointer to CCB to terminate */ }; typedef enum { CTS_TYPE_CURRENT_SETTINGS, CTS_TYPE_USER_SETTINGS } cts_type; struct ccb_trans_settings_scsi { u_int valid; /* Which fields to honor */ #define CTS_SCSI_VALID_TQ 0x01 u_int flags; #define CTS_SCSI_FLAGS_TAG_ENB 0x01 }; struct ccb_trans_settings_ata { u_int valid; /* Which fields to honor */ #define CTS_ATA_VALID_TQ 0x01 u_int flags; #define CTS_ATA_FLAGS_TAG_ENB 0x01 }; struct ccb_trans_settings_spi { u_int valid; /* Which fields to honor */ #define CTS_SPI_VALID_SYNC_RATE 0x01 #define CTS_SPI_VALID_SYNC_OFFSET 0x02 #define CTS_SPI_VALID_BUS_WIDTH 0x04 #define CTS_SPI_VALID_DISC 0x08 #define CTS_SPI_VALID_PPR_OPTIONS 0x10 u_int flags; #define CTS_SPI_FLAGS_DISC_ENB 0x01 u_int sync_period; u_int sync_offset; u_int bus_width; u_int ppr_options; }; struct ccb_trans_settings_fc { u_int valid; /* Which fields to honor */ #define CTS_FC_VALID_WWNN 0x8000 #define CTS_FC_VALID_WWPN 0x4000 #define CTS_FC_VALID_PORT 0x2000 #define CTS_FC_VALID_SPEED 0x1000 u_int64_t wwnn; /* world wide node name */ u_int64_t wwpn; /* world wide port name */ u_int32_t port; /* 24 bit port id, if known */ u_int32_t bitrate; /* Mbps */ }; struct ccb_trans_settings_sas { u_int valid; /* Which fields to honor */ #define CTS_SAS_VALID_SPEED 0x1000 u_int32_t bitrate; /* Mbps */ }; struct ccb_trans_settings_pata { u_int valid; /* Which fields to honor */ #define CTS_ATA_VALID_MODE 0x01 #define CTS_ATA_VALID_BYTECOUNT 0x02 #define CTS_ATA_VALID_ATAPI 0x20 #define CTS_ATA_VALID_CAPS 0x40 int mode; /* Mode */ u_int bytecount; /* Length of PIO transaction */ u_int atapi; /* Length of ATAPI CDB */ u_int caps; /* Device and host SATA caps. */ #define CTS_ATA_CAPS_H 0x0000ffff #define CTS_ATA_CAPS_H_DMA48 0x00000001 /* 48-bit DMA */ #define CTS_ATA_CAPS_D 0xffff0000 }; struct ccb_trans_settings_sata { u_int valid; /* Which fields to honor */ #define CTS_SATA_VALID_MODE 0x01 #define CTS_SATA_VALID_BYTECOUNT 0x02 #define CTS_SATA_VALID_REVISION 0x04 #define CTS_SATA_VALID_PM 0x08 #define CTS_SATA_VALID_TAGS 0x10 #define CTS_SATA_VALID_ATAPI 0x20 #define CTS_SATA_VALID_CAPS 0x40 int mode; /* Legacy PATA mode */ u_int bytecount; /* Length of PIO transaction */ int revision; /* SATA revision */ u_int pm_present; /* PM is present (XPT->SIM) */ u_int tags; /* Number of allowed tags */ u_int atapi; /* Length of ATAPI CDB */ u_int caps; /* Device and host SATA caps. */ #define CTS_SATA_CAPS_H 0x0000ffff #define CTS_SATA_CAPS_H_PMREQ 0x00000001 #define CTS_SATA_CAPS_H_APST 0x00000002 #define CTS_SATA_CAPS_H_DMAAA 0x00000010 /* Auto-activation */ #define CTS_SATA_CAPS_H_AN 0x00000020 /* Async. notification */ #define CTS_SATA_CAPS_D 0xffff0000 #define CTS_SATA_CAPS_D_PMREQ 0x00010000 #define CTS_SATA_CAPS_D_APST 0x00020000 }; struct ccb_trans_settings_nvme { u_int valid; /* Which fields to honor */ #define CTS_NVME_VALID_SPEC 0x01 #define CTS_NVME_VALID_CAPS 0x02 u_int spec_major; /* Major version of spec supported */ u_int spec_minor; /* Minor verison of spec supported */ u_int spec_tiny; /* Tiny version of spec supported */ u_int max_xfer; /* Max transfer size (0 -> unlimited */ u_int caps; }; /* Get/Set transfer rate/width/disconnection/tag queueing settings */ struct ccb_trans_settings { struct ccb_hdr ccb_h; cts_type type; /* Current or User settings */ cam_proto protocol; u_int protocol_version; cam_xport transport; u_int transport_version; union { u_int valid; /* Which fields to honor */ struct ccb_trans_settings_ata ata; struct ccb_trans_settings_scsi scsi; struct ccb_trans_settings_nvme nvme; } proto_specific; union { u_int valid; /* Which fields to honor */ struct ccb_trans_settings_spi spi; struct ccb_trans_settings_fc fc; struct ccb_trans_settings_sas sas; struct ccb_trans_settings_pata ata; struct ccb_trans_settings_sata sata; struct ccb_trans_settings_nvme nvme; } xport_specific; }; /* * Calculate the geometry parameters for a device * give the block size and volume size in blocks. */ struct ccb_calc_geometry { struct ccb_hdr ccb_h; u_int32_t block_size; u_int64_t volume_size; u_int32_t cylinders; u_int8_t heads; u_int8_t secs_per_track; }; /* * Set or get SIM (and transport) specific knobs */ #define KNOB_VALID_ADDRESS 0x1 #define KNOB_VALID_ROLE 0x2 #define KNOB_ROLE_NONE 0x0 #define KNOB_ROLE_INITIATOR 0x1 #define KNOB_ROLE_TARGET 0x2 #define KNOB_ROLE_BOTH 0x3 struct ccb_sim_knob_settings_spi { u_int valid; u_int initiator_id; u_int role; }; struct ccb_sim_knob_settings_fc { u_int valid; u_int64_t wwnn; /* world wide node name */ u_int64_t wwpn; /* world wide port name */ u_int role; }; struct ccb_sim_knob_settings_sas { u_int valid; u_int64_t wwnn; /* world wide node name */ u_int role; }; #define KNOB_SETTINGS_SIZE 128 struct ccb_sim_knob { struct ccb_hdr ccb_h; union { u_int valid; /* Which fields to honor */ struct ccb_sim_knob_settings_spi spi; struct ccb_sim_knob_settings_fc fc; struct ccb_sim_knob_settings_sas sas; char pad[KNOB_SETTINGS_SIZE]; } xport_specific; }; /* * Rescan the given bus, or bus/target/lun */ struct ccb_rescan { struct ccb_hdr ccb_h; cam_flags flags; }; /* * Turn on debugging for the given bus, bus/target, or bus/target/lun. */ struct ccb_debug { struct ccb_hdr ccb_h; cam_debug_flags flags; }; /* Target mode structures. */ struct ccb_en_lun { struct ccb_hdr ccb_h; u_int16_t grp6_len; /* Group 6 VU CDB length */ u_int16_t grp7_len; /* Group 7 VU CDB length */ u_int8_t enable; }; /* old, barely used immediate notify, binary compatibility */ struct ccb_immed_notify { struct ccb_hdr ccb_h; struct scsi_sense_data sense_data; u_int8_t sense_len; /* Number of bytes in sense buffer */ u_int8_t initiator_id; /* Id of initiator that selected */ u_int8_t message_args[7]; /* Message Arguments */ }; struct ccb_notify_ack { struct ccb_hdr ccb_h; u_int16_t seq_id; /* Sequence identifier */ u_int8_t event; /* Event flags */ }; struct ccb_immediate_notify { struct ccb_hdr ccb_h; u_int tag_id; /* Tag for immediate notify */ u_int seq_id; /* Tag for target of notify */ u_int initiator_id; /* Initiator Identifier */ u_int arg; /* Function specific */ }; struct ccb_notify_acknowledge { struct ccb_hdr ccb_h; u_int tag_id; /* Tag for immediate notify */ u_int seq_id; /* Tar for target of notify */ u_int initiator_id; /* Initiator Identifier */ u_int arg; /* Response information */ /* * Lower byte of arg is one of RESPONSE CODE values defined below * (subset of response codes from SPL-4 and FCP-4 specifications), * upper 3 bytes is code-specific ADDITIONAL RESPONSE INFORMATION. */ #define CAM_RSP_TMF_COMPLETE 0x00 #define CAM_RSP_TMF_REJECTED 0x04 #define CAM_RSP_TMF_FAILED 0x05 #define CAM_RSP_TMF_SUCCEEDED 0x08 #define CAM_RSP_TMF_INCORRECT_LUN 0x09 }; /* HBA engine structures. */ typedef enum { EIT_BUFFER, /* Engine type: buffer memory */ EIT_LOSSLESS, /* Engine type: lossless compression */ EIT_LOSSY, /* Engine type: lossy compression */ EIT_ENCRYPT /* Engine type: encryption */ } ei_type; typedef enum { EAD_VUNIQUE, /* Engine algorithm ID: vendor unique */ EAD_LZ1V1, /* Engine algorithm ID: LZ1 var.1 */ EAD_LZ2V1, /* Engine algorithm ID: LZ2 var.1 */ EAD_LZ2V2 /* Engine algorithm ID: LZ2 var.2 */ } ei_algo; struct ccb_eng_inq { struct ccb_hdr ccb_h; u_int16_t eng_num; /* The engine number for this inquiry */ ei_type eng_type; /* Returned engine type */ ei_algo eng_algo; /* Returned engine algorithm type */ u_int32_t eng_memeory; /* Returned engine memory size */ }; struct ccb_eng_exec { /* This structure must match SCSIIO size */ struct ccb_hdr ccb_h; u_int8_t *pdrv_ptr; /* Ptr used by the peripheral driver */ u_int8_t *req_map; /* Ptr for mapping info on the req. */ u_int8_t *data_ptr; /* Pointer to the data buf/SG list */ u_int32_t dxfer_len; /* Data transfer length */ u_int8_t *engdata_ptr; /* Pointer to the engine buffer data */ u_int16_t sglist_cnt; /* Num of scatter gather list entries */ u_int32_t dmax_len; /* Destination data maximum length */ u_int32_t dest_len; /* Destination data length */ int32_t src_resid; /* Source residual length: 2's comp */ u_int32_t timeout; /* Timeout value */ u_int16_t eng_num; /* Engine number for this request */ u_int16_t vu_flags; /* Vendor Unique flags */ }; /* * Definitions for the timeout field in the SCSI I/O CCB. */ #define CAM_TIME_DEFAULT 0x00000000 /* Use SIM default value */ #define CAM_TIME_INFINITY 0xFFFFFFFF /* Infinite timeout */ #define CAM_SUCCESS 0 /* For signaling general success */ #define CAM_FAILURE 1 /* For signaling general failure */ #define CAM_FALSE 0 #define CAM_TRUE 1 #define XPT_CCB_INVALID -1 /* for signaling a bad CCB to free */ /* * CCB for working with advanced device information. This operates in a fashion * similar to XPT_GDEV_TYPE. Specify the target in ccb_h, the buffer * type requested, and provide a buffer size/buffer to write to. If the * buffer is too small, provsiz will be larger than bufsiz. */ struct ccb_dev_advinfo { struct ccb_hdr ccb_h; uint32_t flags; #define CDAI_FLAG_NONE 0x0 /* No flags set */ #define CDAI_FLAG_STORE 0x1 /* If set, action becomes store */ uint32_t buftype; /* IN: Type of data being requested */ /* NB: buftype is interpreted on a per-transport basis */ #define CDAI_TYPE_SCSI_DEVID 1 #define CDAI_TYPE_SERIAL_NUM 2 #define CDAI_TYPE_PHYS_PATH 3 #define CDAI_TYPE_RCAPLONG 4 #define CDAI_TYPE_EXT_INQ 5 off_t bufsiz; /* IN: Size of external buffer */ #define CAM_SCSI_DEVID_MAXLEN 65536 /* length in buffer is an uint16_t */ off_t provsiz; /* OUT: Size required/used */ uint8_t *buf; /* IN/OUT: Buffer for requested data */ }; /* * CCB for sending async events */ struct ccb_async { struct ccb_hdr ccb_h; uint32_t async_code; off_t async_arg_size; void *async_arg_ptr; }; /* * Union of all CCB types for kernel space allocation. This union should * never be used for manipulating CCBs - its only use is for the allocation * and deallocation of raw CCB space and is the return type of xpt_ccb_alloc * and the argument to xpt_ccb_free. */ union ccb { struct ccb_hdr ccb_h; /* For convenience */ struct ccb_scsiio csio; struct ccb_getdev cgd; struct ccb_getdevlist cgdl; struct ccb_pathinq cpi; struct ccb_relsim crs; struct ccb_setasync csa; struct ccb_setdev csd; struct ccb_pathstats cpis; struct ccb_getdevstats cgds; struct ccb_dev_match cdm; struct ccb_trans_settings cts; struct ccb_calc_geometry ccg; struct ccb_sim_knob knob; struct ccb_abort cab; struct ccb_resetbus crb; struct ccb_resetdev crd; struct ccb_termio tio; struct ccb_accept_tio atio; struct ccb_scsiio ctio; struct ccb_en_lun cel; struct ccb_immed_notify cin; struct ccb_notify_ack cna; struct ccb_immediate_notify cin1; struct ccb_notify_acknowledge cna2; struct ccb_eng_inq cei; struct ccb_eng_exec cee; struct ccb_smpio smpio; struct ccb_rescan crcn; struct ccb_debug cdbg; struct ccb_ataio ataio; struct ccb_dev_advinfo cdai; struct ccb_async casync; struct ccb_nvmeio nvmeio; }; #define CCB_CLEAR_ALL_EXCEPT_HDR(ccbp) \ bzero((char *)(ccbp) + sizeof((ccbp)->ccb_h), \ sizeof(*(ccbp)) - sizeof((ccbp)->ccb_h)) __BEGIN_DECLS static __inline void cam_fill_csio(struct ccb_scsiio *csio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int32_t flags, u_int8_t tag_action, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int8_t sense_len, u_int8_t cdb_len, u_int32_t timeout); static __inline void cam_fill_nvmeio(struct ccb_nvmeio *nvmeio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int32_t flags, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int32_t timeout); static __inline void cam_fill_ctio(struct ccb_scsiio *csio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int32_t flags, u_int tag_action, u_int tag_id, u_int init_id, u_int scsi_status, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int32_t timeout); static __inline void cam_fill_ataio(struct ccb_ataio *ataio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int32_t flags, u_int tag_action, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int32_t timeout); static __inline void cam_fill_smpio(struct ccb_smpio *smpio, uint32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), uint32_t flags, uint8_t *smp_request, int smp_request_len, uint8_t *smp_response, int smp_response_len, uint32_t timeout); static __inline void cam_fill_csio(struct ccb_scsiio *csio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int32_t flags, u_int8_t tag_action, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int8_t sense_len, u_int8_t cdb_len, u_int32_t timeout) { csio->ccb_h.func_code = XPT_SCSI_IO; csio->ccb_h.flags = flags; csio->ccb_h.xflags = 0; csio->ccb_h.retry_count = retries; csio->ccb_h.cbfcnp = cbfcnp; csio->ccb_h.timeout = timeout; csio->data_ptr = data_ptr; csio->dxfer_len = dxfer_len; csio->sense_len = sense_len; csio->cdb_len = cdb_len; csio->tag_action = tag_action; } static __inline void cam_fill_ctio(struct ccb_scsiio *csio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int32_t flags, u_int tag_action, u_int tag_id, u_int init_id, u_int scsi_status, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int32_t timeout) { csio->ccb_h.func_code = XPT_CONT_TARGET_IO; csio->ccb_h.flags = flags; csio->ccb_h.xflags = 0; csio->ccb_h.retry_count = retries; csio->ccb_h.cbfcnp = cbfcnp; csio->ccb_h.timeout = timeout; csio->data_ptr = data_ptr; csio->dxfer_len = dxfer_len; csio->scsi_status = scsi_status; csio->tag_action = tag_action; csio->tag_id = tag_id; csio->init_id = init_id; } static __inline void cam_fill_ataio(struct ccb_ataio *ataio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int32_t flags, u_int tag_action __unused, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int32_t timeout) { ataio->ccb_h.func_code = XPT_ATA_IO; ataio->ccb_h.flags = flags; ataio->ccb_h.retry_count = retries; ataio->ccb_h.cbfcnp = cbfcnp; ataio->ccb_h.timeout = timeout; ataio->data_ptr = data_ptr; ataio->dxfer_len = dxfer_len; ataio->ata_flags = 0; } static __inline void cam_fill_smpio(struct ccb_smpio *smpio, uint32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), uint32_t flags, uint8_t *smp_request, int smp_request_len, uint8_t *smp_response, int smp_response_len, uint32_t timeout) { #ifdef _KERNEL KASSERT((flags & CAM_DIR_MASK) == CAM_DIR_BOTH, ("direction != CAM_DIR_BOTH")); KASSERT((smp_request != NULL) && (smp_response != NULL), ("need valid request and response buffers")); KASSERT((smp_request_len != 0) && (smp_response_len != 0), ("need non-zero request and response lengths")); #endif /*_KERNEL*/ smpio->ccb_h.func_code = XPT_SMP_IO; smpio->ccb_h.flags = flags; smpio->ccb_h.retry_count = retries; smpio->ccb_h.cbfcnp = cbfcnp; smpio->ccb_h.timeout = timeout; smpio->smp_request = smp_request; smpio->smp_request_len = smp_request_len; smpio->smp_response = smp_response; smpio->smp_response_len = smp_response_len; } static __inline void cam_set_ccbstatus(union ccb *ccb, cam_status status) { ccb->ccb_h.status &= ~CAM_STATUS_MASK; ccb->ccb_h.status |= status; } static __inline cam_status cam_ccb_status(union ccb *ccb) { return ((cam_status)(ccb->ccb_h.status & CAM_STATUS_MASK)); } void cam_calc_geometry(struct ccb_calc_geometry *ccg, int extended); static __inline void cam_fill_nvmeio(struct ccb_nvmeio *nvmeio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int32_t flags, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int32_t timeout) { nvmeio->ccb_h.func_code = XPT_NVME_IO; nvmeio->ccb_h.flags = flags; nvmeio->ccb_h.retry_count = retries; nvmeio->ccb_h.cbfcnp = cbfcnp; nvmeio->ccb_h.timeout = timeout; nvmeio->data_ptr = data_ptr; nvmeio->dxfer_len = dxfer_len; } __END_DECLS #endif /* _CAM_CAM_CCB_H */ Index: user/alc/PQ_LAUNDRY/sys/cam/cam_xpt.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/cam/cam_xpt.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/cam/cam_xpt.c (revision 303206) @@ -1,5395 +1,5417 @@ /*- * Implementation of the Common Access Method Transport (XPT) layer. * * Copyright (c) 1997, 1998, 1999 Justin T. Gibbs. * Copyright (c) 1997, 1998, 1999 Kenneth D. Merry. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions, and the following disclaimer, * without modification, immediately at the beginning of the file. * 2. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* geometry translation */ #include /* for xpt_print below */ #include "opt_cam.h" /* * This is the maximum number of high powered commands (e.g. start unit) * that can be outstanding at a particular time. */ #ifndef CAM_MAX_HIGHPOWER #define CAM_MAX_HIGHPOWER 4 #endif /* Datastructures internal to the xpt layer */ MALLOC_DEFINE(M_CAMXPT, "CAM XPT", "CAM XPT buffers"); MALLOC_DEFINE(M_CAMDEV, "CAM DEV", "CAM devices"); MALLOC_DEFINE(M_CAMCCB, "CAM CCB", "CAM CCBs"); MALLOC_DEFINE(M_CAMPATH, "CAM path", "CAM paths"); /* Object for defering XPT actions to a taskqueue */ struct xpt_task { struct task task; void *data1; uintptr_t data2; }; struct xpt_softc { uint32_t xpt_generation; /* number of high powered commands that can go through right now */ struct mtx xpt_highpower_lock; STAILQ_HEAD(highpowerlist, cam_ed) highpowerq; int num_highpower; /* queue for handling async rescan requests. */ TAILQ_HEAD(, ccb_hdr) ccb_scanq; int buses_to_config; int buses_config_done; /* Registered busses */ TAILQ_HEAD(,cam_eb) xpt_busses; u_int bus_generation; struct intr_config_hook *xpt_config_hook; int boot_delay; struct callout boot_callout; struct mtx xpt_topo_lock; struct mtx xpt_lock; struct taskqueue *xpt_taskq; }; typedef enum { DM_RET_COPY = 0x01, DM_RET_FLAG_MASK = 0x0f, DM_RET_NONE = 0x00, DM_RET_STOP = 0x10, DM_RET_DESCEND = 0x20, DM_RET_ERROR = 0x30, DM_RET_ACTION_MASK = 0xf0 } dev_match_ret; typedef enum { XPT_DEPTH_BUS, XPT_DEPTH_TARGET, XPT_DEPTH_DEVICE, XPT_DEPTH_PERIPH } xpt_traverse_depth; struct xpt_traverse_config { xpt_traverse_depth depth; void *tr_func; void *tr_arg; }; typedef int xpt_busfunc_t (struct cam_eb *bus, void *arg); typedef int xpt_targetfunc_t (struct cam_et *target, void *arg); typedef int xpt_devicefunc_t (struct cam_ed *device, void *arg); typedef int xpt_periphfunc_t (struct cam_periph *periph, void *arg); typedef int xpt_pdrvfunc_t (struct periph_driver **pdrv, void *arg); /* Transport layer configuration information */ static struct xpt_softc xsoftc; MTX_SYSINIT(xpt_topo_init, &xsoftc.xpt_topo_lock, "XPT topology lock", MTX_DEF); SYSCTL_INT(_kern_cam, OID_AUTO, boot_delay, CTLFLAG_RDTUN, &xsoftc.boot_delay, 0, "Bus registration wait time"); SYSCTL_UINT(_kern_cam, OID_AUTO, xpt_generation, CTLFLAG_RD, &xsoftc.xpt_generation, 0, "CAM peripheral generation count"); struct cam_doneq { struct mtx_padalign cam_doneq_mtx; STAILQ_HEAD(, ccb_hdr) cam_doneq; int cam_doneq_sleep; }; static struct cam_doneq cam_doneqs[MAXCPU]; static int cam_num_doneqs; static struct proc *cam_proc; SYSCTL_INT(_kern_cam, OID_AUTO, num_doneqs, CTLFLAG_RDTUN, &cam_num_doneqs, 0, "Number of completion queues/threads"); struct cam_periph *xpt_periph; static periph_init_t xpt_periph_init; static struct periph_driver xpt_driver = { xpt_periph_init, "xpt", TAILQ_HEAD_INITIALIZER(xpt_driver.units), /* generation */ 0, CAM_PERIPH_DRV_EARLY }; PERIPHDRIVER_DECLARE(xpt, xpt_driver); static d_open_t xptopen; static d_close_t xptclose; static d_ioctl_t xptioctl; static d_ioctl_t xptdoioctl; static struct cdevsw xpt_cdevsw = { .d_version = D_VERSION, .d_flags = 0, .d_open = xptopen, .d_close = xptclose, .d_ioctl = xptioctl, .d_name = "xpt", }; /* Storage for debugging datastructures */ struct cam_path *cam_dpath; u_int32_t cam_dflags = CAM_DEBUG_FLAGS; SYSCTL_UINT(_kern_cam, OID_AUTO, dflags, CTLFLAG_RWTUN, &cam_dflags, 0, "Enabled debug flags"); u_int32_t cam_debug_delay = CAM_DEBUG_DELAY; SYSCTL_UINT(_kern_cam, OID_AUTO, debug_delay, CTLFLAG_RWTUN, &cam_debug_delay, 0, "Delay in us after each debug message"); /* Our boot-time initialization hook */ static int cam_module_event_handler(module_t, int /*modeventtype_t*/, void *); static moduledata_t cam_moduledata = { "cam", cam_module_event_handler, NULL }; static int xpt_init(void *); DECLARE_MODULE(cam, cam_moduledata, SI_SUB_CONFIGURE, SI_ORDER_SECOND); MODULE_VERSION(cam, 1); static void xpt_async_bcast(struct async_list *async_head, u_int32_t async_code, struct cam_path *path, void *async_arg); static path_id_t xptnextfreepathid(void); static path_id_t xptpathid(const char *sim_name, int sim_unit, int sim_bus); static union ccb *xpt_get_ccb(struct cam_periph *periph); static union ccb *xpt_get_ccb_nowait(struct cam_periph *periph); static void xpt_run_allocq(struct cam_periph *periph, int sleep); static void xpt_run_allocq_task(void *context, int pending); static void xpt_run_devq(struct cam_devq *devq); static timeout_t xpt_release_devq_timeout; static void xpt_release_simq_timeout(void *arg) __unused; static void xpt_acquire_bus(struct cam_eb *bus); static void xpt_release_bus(struct cam_eb *bus); static uint32_t xpt_freeze_devq_device(struct cam_ed *dev, u_int count); static int xpt_release_devq_device(struct cam_ed *dev, u_int count, int run_queue); static struct cam_et* xpt_alloc_target(struct cam_eb *bus, target_id_t target_id); static void xpt_acquire_target(struct cam_et *target); static void xpt_release_target(struct cam_et *target); static struct cam_eb* xpt_find_bus(path_id_t path_id); static struct cam_et* xpt_find_target(struct cam_eb *bus, target_id_t target_id); static struct cam_ed* xpt_find_device(struct cam_et *target, lun_id_t lun_id); static void xpt_config(void *arg); static int xpt_schedule_dev(struct camq *queue, cam_pinfo *dev_pinfo, u_int32_t new_priority); static xpt_devicefunc_t xptpassannouncefunc; static void xptaction(struct cam_sim *sim, union ccb *work_ccb); static void xptpoll(struct cam_sim *sim); static void camisr_runqueue(void); static void xpt_done_process(struct ccb_hdr *ccb_h); static void xpt_done_td(void *); static dev_match_ret xptbusmatch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_eb *bus); static dev_match_ret xptdevicematch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_ed *device); static dev_match_ret xptperiphmatch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_periph *periph); static xpt_busfunc_t xptedtbusfunc; static xpt_targetfunc_t xptedttargetfunc; static xpt_devicefunc_t xptedtdevicefunc; static xpt_periphfunc_t xptedtperiphfunc; static xpt_pdrvfunc_t xptplistpdrvfunc; static xpt_periphfunc_t xptplistperiphfunc; static int xptedtmatch(struct ccb_dev_match *cdm); static int xptperiphlistmatch(struct ccb_dev_match *cdm); static int xptbustraverse(struct cam_eb *start_bus, xpt_busfunc_t *tr_func, void *arg); static int xpttargettraverse(struct cam_eb *bus, struct cam_et *start_target, xpt_targetfunc_t *tr_func, void *arg); static int xptdevicetraverse(struct cam_et *target, struct cam_ed *start_device, xpt_devicefunc_t *tr_func, void *arg); static int xptperiphtraverse(struct cam_ed *device, struct cam_periph *start_periph, xpt_periphfunc_t *tr_func, void *arg); static int xptpdrvtraverse(struct periph_driver **start_pdrv, xpt_pdrvfunc_t *tr_func, void *arg); static int xptpdperiphtraverse(struct periph_driver **pdrv, struct cam_periph *start_periph, xpt_periphfunc_t *tr_func, void *arg); static xpt_busfunc_t xptdefbusfunc; static xpt_targetfunc_t xptdeftargetfunc; static xpt_devicefunc_t xptdefdevicefunc; static xpt_periphfunc_t xptdefperiphfunc; static void xpt_finishconfig_task(void *context, int pending); static void xpt_dev_async_default(u_int32_t async_code, struct cam_eb *bus, struct cam_et *target, struct cam_ed *device, void *async_arg); static struct cam_ed * xpt_alloc_device_default(struct cam_eb *bus, struct cam_et *target, lun_id_t lun_id); static xpt_devicefunc_t xptsetasyncfunc; static xpt_busfunc_t xptsetasyncbusfunc; static cam_status xptregister(struct cam_periph *periph, void *arg); static const char * xpt_action_name(uint32_t action); static __inline int device_is_queued(struct cam_ed *device); static __inline int xpt_schedule_devq(struct cam_devq *devq, struct cam_ed *dev) { int retval; mtx_assert(&devq->send_mtx, MA_OWNED); if ((dev->ccbq.queue.entries > 0) && (dev->ccbq.dev_openings > 0) && (dev->ccbq.queue.qfrozen_cnt == 0)) { /* * The priority of a device waiting for controller * resources is that of the highest priority CCB * enqueued. */ retval = xpt_schedule_dev(&devq->send_queue, &dev->devq_entry, CAMQ_GET_PRIO(&dev->ccbq.queue)); } else { retval = 0; } return (retval); } static __inline int device_is_queued(struct cam_ed *device) { return (device->devq_entry.index != CAM_UNQUEUED_INDEX); } static void xpt_periph_init() { make_dev(&xpt_cdevsw, 0, UID_ROOT, GID_OPERATOR, 0600, "xpt0"); } static int xptopen(struct cdev *dev, int flags, int fmt, struct thread *td) { /* * Only allow read-write access. */ if (((flags & FWRITE) == 0) || ((flags & FREAD) == 0)) return(EPERM); /* * We don't allow nonblocking access. */ if ((flags & O_NONBLOCK) != 0) { printf("%s: can't do nonblocking access\n", devtoname(dev)); return(ENODEV); } return(0); } static int xptclose(struct cdev *dev, int flag, int fmt, struct thread *td) { return(0); } /* * Don't automatically grab the xpt softc lock here even though this is going * through the xpt device. The xpt device is really just a back door for * accessing other devices and SIMs, so the right thing to do is to grab * the appropriate SIM lock once the bus/SIM is located. */ static int xptioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flag, struct thread *td) { int error; if ((error = xptdoioctl(dev, cmd, addr, flag, td)) == ENOTTY) { error = cam_compat_ioctl(dev, cmd, addr, flag, td, xptdoioctl); } return (error); } static int xptdoioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flag, struct thread *td) { int error; error = 0; switch(cmd) { /* * For the transport layer CAMIOCOMMAND ioctl, we really only want * to accept CCB types that don't quite make sense to send through a * passthrough driver. XPT_PATH_INQ is an exception to this, as stated * in the CAM spec. */ case CAMIOCOMMAND: { union ccb *ccb; union ccb *inccb; struct cam_eb *bus; inccb = (union ccb *)addr; bus = xpt_find_bus(inccb->ccb_h.path_id); if (bus == NULL) return (EINVAL); switch (inccb->ccb_h.func_code) { case XPT_SCAN_BUS: case XPT_RESET_BUS: if (inccb->ccb_h.target_id != CAM_TARGET_WILDCARD || inccb->ccb_h.target_lun != CAM_LUN_WILDCARD) { xpt_release_bus(bus); return (EINVAL); } break; case XPT_SCAN_TGT: if (inccb->ccb_h.target_id == CAM_TARGET_WILDCARD || inccb->ccb_h.target_lun != CAM_LUN_WILDCARD) { xpt_release_bus(bus); return (EINVAL); } break; default: break; } switch(inccb->ccb_h.func_code) { case XPT_SCAN_BUS: case XPT_RESET_BUS: case XPT_PATH_INQ: case XPT_ENG_INQ: case XPT_SCAN_LUN: case XPT_SCAN_TGT: ccb = xpt_alloc_ccb(); /* * Create a path using the bus, target, and lun the * user passed in. */ if (xpt_create_path(&ccb->ccb_h.path, NULL, inccb->ccb_h.path_id, inccb->ccb_h.target_id, inccb->ccb_h.target_lun) != CAM_REQ_CMP){ error = EINVAL; xpt_free_ccb(ccb); break; } /* Ensure all of our fields are correct */ xpt_setup_ccb(&ccb->ccb_h, ccb->ccb_h.path, inccb->ccb_h.pinfo.priority); xpt_merge_ccb(ccb, inccb); xpt_path_lock(ccb->ccb_h.path); cam_periph_runccb(ccb, NULL, 0, 0, NULL); xpt_path_unlock(ccb->ccb_h.path); bcopy(ccb, inccb, sizeof(union ccb)); xpt_free_path(ccb->ccb_h.path); xpt_free_ccb(ccb); break; case XPT_DEBUG: { union ccb ccb; /* * This is an immediate CCB, so it's okay to * allocate it on the stack. */ /* * Create a path using the bus, target, and lun the * user passed in. */ if (xpt_create_path(&ccb.ccb_h.path, NULL, inccb->ccb_h.path_id, inccb->ccb_h.target_id, inccb->ccb_h.target_lun) != CAM_REQ_CMP){ error = EINVAL; break; } /* Ensure all of our fields are correct */ xpt_setup_ccb(&ccb.ccb_h, ccb.ccb_h.path, inccb->ccb_h.pinfo.priority); xpt_merge_ccb(&ccb, inccb); xpt_action(&ccb); bcopy(&ccb, inccb, sizeof(union ccb)); xpt_free_path(ccb.ccb_h.path); break; } case XPT_DEV_MATCH: { struct cam_periph_map_info mapinfo; struct cam_path *old_path; /* * We can't deal with physical addresses for this * type of transaction. */ if ((inccb->ccb_h.flags & CAM_DATA_MASK) != CAM_DATA_VADDR) { error = EINVAL; break; } /* * Save this in case the caller had it set to * something in particular. */ old_path = inccb->ccb_h.path; /* * We really don't need a path for the matching * code. The path is needed because of the * debugging statements in xpt_action(). They * assume that the CCB has a valid path. */ inccb->ccb_h.path = xpt_periph->path; bzero(&mapinfo, sizeof(mapinfo)); /* * Map the pattern and match buffers into kernel * virtual address space. */ error = cam_periph_mapmem(inccb, &mapinfo, MAXPHYS); if (error) { inccb->ccb_h.path = old_path; break; } /* * This is an immediate CCB, we can send it on directly. */ xpt_action(inccb); /* * Map the buffers back into user space. */ cam_periph_unmapmem(inccb, &mapinfo); inccb->ccb_h.path = old_path; error = 0; break; } default: error = ENOTSUP; break; } xpt_release_bus(bus); break; } /* * This is the getpassthru ioctl. It takes a XPT_GDEVLIST ccb as input, * with the periphal driver name and unit name filled in. The other * fields don't really matter as input. The passthrough driver name * ("pass"), and unit number are passed back in the ccb. The current * device generation number, and the index into the device peripheral * driver list, and the status are also passed back. Note that * since we do everything in one pass, unlike the XPT_GDEVLIST ccb, * we never return a status of CAM_GDEVLIST_LIST_CHANGED. It is * (or rather should be) impossible for the device peripheral driver * list to change since we look at the whole thing in one pass, and * we do it with lock protection. * */ case CAMGETPASSTHRU: { union ccb *ccb; struct cam_periph *periph; struct periph_driver **p_drv; char *name; u_int unit; int base_periph_found; ccb = (union ccb *)addr; unit = ccb->cgdl.unit_number; name = ccb->cgdl.periph_name; base_periph_found = 0; /* * Sanity check -- make sure we don't get a null peripheral * driver name. */ if (*ccb->cgdl.periph_name == '\0') { error = EINVAL; break; } /* Keep the list from changing while we traverse it */ xpt_lock_buses(); /* first find our driver in the list of drivers */ for (p_drv = periph_drivers; *p_drv != NULL; p_drv++) if (strcmp((*p_drv)->driver_name, name) == 0) break; if (*p_drv == NULL) { xpt_unlock_buses(); ccb->ccb_h.status = CAM_REQ_CMP_ERR; ccb->cgdl.status = CAM_GDEVLIST_ERROR; *ccb->cgdl.periph_name = '\0'; ccb->cgdl.unit_number = 0; error = ENOENT; break; } /* * Run through every peripheral instance of this driver * and check to see whether it matches the unit passed * in by the user. If it does, get out of the loops and * find the passthrough driver associated with that * peripheral driver. */ for (periph = TAILQ_FIRST(&(*p_drv)->units); periph != NULL; periph = TAILQ_NEXT(periph, unit_links)) { if (periph->unit_number == unit) break; } /* * If we found the peripheral driver that the user passed * in, go through all of the peripheral drivers for that * particular device and look for a passthrough driver. */ if (periph != NULL) { struct cam_ed *device; int i; base_periph_found = 1; device = periph->path->device; for (i = 0, periph = SLIST_FIRST(&device->periphs); periph != NULL; periph = SLIST_NEXT(periph, periph_links), i++) { /* * Check to see whether we have a * passthrough device or not. */ if (strcmp(periph->periph_name, "pass") == 0) { /* * Fill in the getdevlist fields. */ strcpy(ccb->cgdl.periph_name, periph->periph_name); ccb->cgdl.unit_number = periph->unit_number; if (SLIST_NEXT(periph, periph_links)) ccb->cgdl.status = CAM_GDEVLIST_MORE_DEVS; else ccb->cgdl.status = CAM_GDEVLIST_LAST_DEVICE; ccb->cgdl.generation = device->generation; ccb->cgdl.index = i; /* * Fill in some CCB header fields * that the user may want. */ ccb->ccb_h.path_id = periph->path->bus->path_id; ccb->ccb_h.target_id = periph->path->target->target_id; ccb->ccb_h.target_lun = periph->path->device->lun_id; ccb->ccb_h.status = CAM_REQ_CMP; break; } } } /* * If the periph is null here, one of two things has * happened. The first possibility is that we couldn't * find the unit number of the particular peripheral driver * that the user is asking about. e.g. the user asks for * the passthrough driver for "da11". We find the list of * "da" peripherals all right, but there is no unit 11. * The other possibility is that we went through the list * of peripheral drivers attached to the device structure, * but didn't find one with the name "pass". Either way, * we return ENOENT, since we couldn't find something. */ if (periph == NULL) { ccb->ccb_h.status = CAM_REQ_CMP_ERR; ccb->cgdl.status = CAM_GDEVLIST_ERROR; *ccb->cgdl.periph_name = '\0'; ccb->cgdl.unit_number = 0; error = ENOENT; /* * It is unfortunate that this is even necessary, * but there are many, many clueless users out there. * If this is true, the user is looking for the * passthrough driver, but doesn't have one in his * kernel. */ if (base_periph_found == 1) { printf("xptioctl: pass driver is not in the " "kernel\n"); printf("xptioctl: put \"device pass\" in " "your kernel config file\n"); } } xpt_unlock_buses(); break; } default: error = ENOTTY; break; } return(error); } static int cam_module_event_handler(module_t mod, int what, void *arg) { int error; switch (what) { case MOD_LOAD: if ((error = xpt_init(NULL)) != 0) return (error); break; case MOD_UNLOAD: return EBUSY; default: return EOPNOTSUPP; } return 0; } static void xpt_rescan_done(struct cam_periph *periph, union ccb *done_ccb) { if (done_ccb->ccb_h.ppriv_ptr1 == NULL) { xpt_free_path(done_ccb->ccb_h.path); xpt_free_ccb(done_ccb); } else { done_ccb->ccb_h.cbfcnp = done_ccb->ccb_h.ppriv_ptr1; (*done_ccb->ccb_h.cbfcnp)(periph, done_ccb); } xpt_release_boot(); } /* thread to handle bus rescans */ static void xpt_scanner_thread(void *dummy) { union ccb *ccb; struct cam_path path; xpt_lock_buses(); for (;;) { if (TAILQ_EMPTY(&xsoftc.ccb_scanq)) msleep(&xsoftc.ccb_scanq, &xsoftc.xpt_topo_lock, PRIBIO, "-", 0); if ((ccb = (union ccb *)TAILQ_FIRST(&xsoftc.ccb_scanq)) != NULL) { TAILQ_REMOVE(&xsoftc.ccb_scanq, &ccb->ccb_h, sim_links.tqe); xpt_unlock_buses(); /* * Since lock can be dropped inside and path freed * by completion callback even before return here, * take our own path copy for reference. */ xpt_copy_path(&path, ccb->ccb_h.path); xpt_path_lock(&path); xpt_action(ccb); xpt_path_unlock(&path); xpt_release_path(&path); xpt_lock_buses(); } } } void xpt_rescan(union ccb *ccb) { struct ccb_hdr *hdr; /* Prepare request */ if (ccb->ccb_h.path->target->target_id == CAM_TARGET_WILDCARD && ccb->ccb_h.path->device->lun_id == CAM_LUN_WILDCARD) ccb->ccb_h.func_code = XPT_SCAN_BUS; else if (ccb->ccb_h.path->target->target_id != CAM_TARGET_WILDCARD && ccb->ccb_h.path->device->lun_id == CAM_LUN_WILDCARD) ccb->ccb_h.func_code = XPT_SCAN_TGT; else if (ccb->ccb_h.path->target->target_id != CAM_TARGET_WILDCARD && ccb->ccb_h.path->device->lun_id != CAM_LUN_WILDCARD) ccb->ccb_h.func_code = XPT_SCAN_LUN; else { xpt_print(ccb->ccb_h.path, "illegal scan path\n"); xpt_free_path(ccb->ccb_h.path); xpt_free_ccb(ccb); return; } CAM_DEBUG(ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xpt_rescan: func %#x %s\n", ccb->ccb_h.func_code, xpt_action_name(ccb->ccb_h.func_code))); ccb->ccb_h.ppriv_ptr1 = ccb->ccb_h.cbfcnp; ccb->ccb_h.cbfcnp = xpt_rescan_done; xpt_setup_ccb(&ccb->ccb_h, ccb->ccb_h.path, CAM_PRIORITY_XPT); /* Don't make duplicate entries for the same paths. */ xpt_lock_buses(); if (ccb->ccb_h.ppriv_ptr1 == NULL) { TAILQ_FOREACH(hdr, &xsoftc.ccb_scanq, sim_links.tqe) { if (xpt_path_comp(hdr->path, ccb->ccb_h.path) == 0) { wakeup(&xsoftc.ccb_scanq); xpt_unlock_buses(); xpt_print(ccb->ccb_h.path, "rescan already queued\n"); xpt_free_path(ccb->ccb_h.path); xpt_free_ccb(ccb); return; } } } TAILQ_INSERT_TAIL(&xsoftc.ccb_scanq, &ccb->ccb_h, sim_links.tqe); xsoftc.buses_to_config++; wakeup(&xsoftc.ccb_scanq); xpt_unlock_buses(); } /* Functions accessed by the peripheral drivers */ static int xpt_init(void *dummy) { struct cam_sim *xpt_sim; struct cam_path *path; struct cam_devq *devq; cam_status status; int error, i; TAILQ_INIT(&xsoftc.xpt_busses); TAILQ_INIT(&xsoftc.ccb_scanq); STAILQ_INIT(&xsoftc.highpowerq); xsoftc.num_highpower = CAM_MAX_HIGHPOWER; mtx_init(&xsoftc.xpt_lock, "XPT lock", NULL, MTX_DEF); mtx_init(&xsoftc.xpt_highpower_lock, "XPT highpower lock", NULL, MTX_DEF); xsoftc.xpt_taskq = taskqueue_create("CAM XPT task", M_WAITOK, taskqueue_thread_enqueue, /*context*/&xsoftc.xpt_taskq); #ifdef CAM_BOOT_DELAY /* * Override this value at compile time to assist our users * who don't use loader to boot a kernel. */ xsoftc.boot_delay = CAM_BOOT_DELAY; #endif /* * The xpt layer is, itself, the equivalent of a SIM. * Allow 16 ccbs in the ccb pool for it. This should * give decent parallelism when we probe busses and * perform other XPT functions. */ devq = cam_simq_alloc(16); xpt_sim = cam_sim_alloc(xptaction, xptpoll, "xpt", /*softc*/NULL, /*unit*/0, /*mtx*/&xsoftc.xpt_lock, /*max_dev_transactions*/0, /*max_tagged_dev_transactions*/0, devq); if (xpt_sim == NULL) return (ENOMEM); mtx_lock(&xsoftc.xpt_lock); if ((status = xpt_bus_register(xpt_sim, NULL, 0)) != CAM_SUCCESS) { mtx_unlock(&xsoftc.xpt_lock); printf("xpt_init: xpt_bus_register failed with status %#x," " failing attach\n", status); return (EINVAL); } mtx_unlock(&xsoftc.xpt_lock); /* * Looking at the XPT from the SIM layer, the XPT is * the equivalent of a peripheral driver. Allocate * a peripheral driver entry for us. */ if ((status = xpt_create_path(&path, NULL, CAM_XPT_PATH_ID, CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD)) != CAM_REQ_CMP) { printf("xpt_init: xpt_create_path failed with status %#x," " failing attach\n", status); return (EINVAL); } xpt_path_lock(path); cam_periph_alloc(xptregister, NULL, NULL, NULL, "xpt", CAM_PERIPH_BIO, path, NULL, 0, xpt_sim); xpt_path_unlock(path); xpt_free_path(path); if (cam_num_doneqs < 1) cam_num_doneqs = 1 + mp_ncpus / 6; else if (cam_num_doneqs > MAXCPU) cam_num_doneqs = MAXCPU; for (i = 0; i < cam_num_doneqs; i++) { mtx_init(&cam_doneqs[i].cam_doneq_mtx, "CAM doneq", NULL, MTX_DEF); STAILQ_INIT(&cam_doneqs[i].cam_doneq); error = kproc_kthread_add(xpt_done_td, &cam_doneqs[i], &cam_proc, NULL, 0, 0, "cam", "doneq%d", i); if (error != 0) { cam_num_doneqs = i; break; } } if (cam_num_doneqs < 1) { printf("xpt_init: Cannot init completion queues " "- failing attach\n"); return (ENOMEM); } /* * Register a callback for when interrupts are enabled. */ xsoftc.xpt_config_hook = (struct intr_config_hook *)malloc(sizeof(struct intr_config_hook), M_CAMXPT, M_NOWAIT | M_ZERO); if (xsoftc.xpt_config_hook == NULL) { printf("xpt_init: Cannot malloc config hook " "- failing attach\n"); return (ENOMEM); } xsoftc.xpt_config_hook->ich_func = xpt_config; if (config_intrhook_establish(xsoftc.xpt_config_hook) != 0) { free (xsoftc.xpt_config_hook, M_CAMXPT); printf("xpt_init: config_intrhook_establish failed " "- failing attach\n"); } return (0); } static cam_status xptregister(struct cam_periph *periph, void *arg) { struct cam_sim *xpt_sim; if (periph == NULL) { printf("xptregister: periph was NULL!!\n"); return(CAM_REQ_CMP_ERR); } xpt_sim = (struct cam_sim *)arg; xpt_sim->softc = periph; xpt_periph = periph; periph->softc = NULL; return(CAM_REQ_CMP); } int32_t xpt_add_periph(struct cam_periph *periph) { struct cam_ed *device; int32_t status; TASK_INIT(&periph->periph_run_task, 0, xpt_run_allocq_task, periph); device = periph->path->device; status = CAM_REQ_CMP; if (device != NULL) { mtx_lock(&device->target->bus->eb_mtx); device->generation++; SLIST_INSERT_HEAD(&device->periphs, periph, periph_links); mtx_unlock(&device->target->bus->eb_mtx); atomic_add_32(&xsoftc.xpt_generation, 1); } return (status); } void xpt_remove_periph(struct cam_periph *periph) { struct cam_ed *device; device = periph->path->device; if (device != NULL) { mtx_lock(&device->target->bus->eb_mtx); device->generation++; SLIST_REMOVE(&device->periphs, periph, cam_periph, periph_links); mtx_unlock(&device->target->bus->eb_mtx); atomic_add_32(&xsoftc.xpt_generation, 1); } } void xpt_announce_periph(struct cam_periph *periph, char *announce_string) { struct cam_path *path = periph->path; cam_periph_assert(periph, MA_OWNED); periph->flags |= CAM_PERIPH_ANNOUNCED; printf("%s%d at %s%d bus %d scbus%d target %d lun %jx\n", periph->periph_name, periph->unit_number, path->bus->sim->sim_name, path->bus->sim->unit_number, path->bus->sim->bus_id, path->bus->path_id, path->target->target_id, (uintmax_t)path->device->lun_id); printf("%s%d: ", periph->periph_name, periph->unit_number); if (path->device->protocol == PROTO_SCSI) scsi_print_inquiry(&path->device->inq_data); else if (path->device->protocol == PROTO_ATA || path->device->protocol == PROTO_SATAPM) ata_print_ident(&path->device->ident_data); else if (path->device->protocol == PROTO_SEMB) semb_print_ident( (struct sep_identify_data *)&path->device->ident_data); + else if (path->device->protocol == PROTO_NVME) + nvme_print_ident(path->device->nvme_cdata, path->device->nvme_data); else printf("Unknown protocol device\n"); if (path->device->serial_num_len > 0) { /* Don't wrap the screen - print only the first 60 chars */ printf("%s%d: Serial Number %.60s\n", periph->periph_name, periph->unit_number, path->device->serial_num); } /* Announce transport details. */ (*(path->bus->xport->announce))(periph); /* Announce command queueing. */ if (path->device->inq_flags & SID_CmdQue || path->device->flags & CAM_DEV_TAG_AFTER_COUNT) { printf("%s%d: Command Queueing enabled\n", periph->periph_name, periph->unit_number); } /* Announce caller's details if they've passed in. */ if (announce_string != NULL) printf("%s%d: %s\n", periph->periph_name, periph->unit_number, announce_string); } void xpt_announce_quirks(struct cam_periph *periph, int quirks, char *bit_string) { if (quirks != 0) { printf("%s%d: quirks=0x%b\n", periph->periph_name, periph->unit_number, quirks, bit_string); } } void xpt_denounce_periph(struct cam_periph *periph) { struct cam_path *path = periph->path; cam_periph_assert(periph, MA_OWNED); printf("%s%d at %s%d bus %d scbus%d target %d lun %jx\n", periph->periph_name, periph->unit_number, path->bus->sim->sim_name, path->bus->sim->unit_number, path->bus->sim->bus_id, path->bus->path_id, path->target->target_id, (uintmax_t)path->device->lun_id); printf("%s%d: ", periph->periph_name, periph->unit_number); if (path->device->protocol == PROTO_SCSI) scsi_print_inquiry_short(&path->device->inq_data); else if (path->device->protocol == PROTO_ATA || path->device->protocol == PROTO_SATAPM) ata_print_ident_short(&path->device->ident_data); else if (path->device->protocol == PROTO_SEMB) semb_print_ident_short( (struct sep_identify_data *)&path->device->ident_data); + else if (path->device->protocol == PROTO_NVME) + nvme_print_ident(path->device->nvme_cdata, path->device->nvme_data); else printf("Unknown protocol device"); if (path->device->serial_num_len > 0) printf(" s/n %.60s", path->device->serial_num); printf(" detached\n"); } int xpt_getattr(char *buf, size_t len, const char *attr, struct cam_path *path) { int ret = -1, l; struct ccb_dev_advinfo cdai; struct scsi_vpd_id_descriptor *idd; xpt_path_assert(path, MA_OWNED); memset(&cdai, 0, sizeof(cdai)); xpt_setup_ccb(&cdai.ccb_h, path, CAM_PRIORITY_NORMAL); cdai.ccb_h.func_code = XPT_DEV_ADVINFO; cdai.bufsiz = len; if (!strcmp(attr, "GEOM::ident")) cdai.buftype = CDAI_TYPE_SERIAL_NUM; else if (!strcmp(attr, "GEOM::physpath")) cdai.buftype = CDAI_TYPE_PHYS_PATH; else if (strcmp(attr, "GEOM::lunid") == 0 || strcmp(attr, "GEOM::lunname") == 0) { cdai.buftype = CDAI_TYPE_SCSI_DEVID; cdai.bufsiz = CAM_SCSI_DEVID_MAXLEN; } else goto out; cdai.buf = malloc(cdai.bufsiz, M_CAMXPT, M_NOWAIT|M_ZERO); if (cdai.buf == NULL) { ret = ENOMEM; goto out; } xpt_action((union ccb *)&cdai); /* can only be synchronous */ if ((cdai.ccb_h.status & CAM_DEV_QFRZN) != 0) cam_release_devq(cdai.ccb_h.path, 0, 0, 0, FALSE); if (cdai.provsiz == 0) goto out; if (cdai.buftype == CDAI_TYPE_SCSI_DEVID) { if (strcmp(attr, "GEOM::lunid") == 0) { idd = scsi_get_devid((struct scsi_vpd_device_id *)cdai.buf, cdai.provsiz, scsi_devid_is_lun_naa); if (idd == NULL) idd = scsi_get_devid((struct scsi_vpd_device_id *)cdai.buf, cdai.provsiz, scsi_devid_is_lun_eui64); } else idd = NULL; if (idd == NULL) idd = scsi_get_devid((struct scsi_vpd_device_id *)cdai.buf, cdai.provsiz, scsi_devid_is_lun_t10); if (idd == NULL) idd = scsi_get_devid((struct scsi_vpd_device_id *)cdai.buf, cdai.provsiz, scsi_devid_is_lun_name); if (idd == NULL) goto out; ret = 0; if ((idd->proto_codeset & SVPD_ID_CODESET_MASK) == SVPD_ID_CODESET_ASCII) { if (idd->length < len) { for (l = 0; l < idd->length; l++) buf[l] = idd->identifier[l] ? idd->identifier[l] : ' '; buf[l] = 0; } else ret = EFAULT; } else if ((idd->proto_codeset & SVPD_ID_CODESET_MASK) == SVPD_ID_CODESET_UTF8) { l = strnlen(idd->identifier, idd->length); if (l < len) { bcopy(idd->identifier, buf, l); buf[l] = 0; } else ret = EFAULT; } else { if (idd->length * 2 < len) { for (l = 0; l < idd->length; l++) sprintf(buf + l * 2, "%02x", idd->identifier[l]); } else ret = EFAULT; } } else { ret = 0; if (strlcpy(buf, cdai.buf, len) >= len) ret = EFAULT; } out: if (cdai.buf != NULL) free(cdai.buf, M_CAMXPT); return ret; } static dev_match_ret xptbusmatch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_eb *bus) { dev_match_ret retval; u_int i; retval = DM_RET_NONE; /* * If we aren't given something to match against, that's an error. */ if (bus == NULL) return(DM_RET_ERROR); /* * If there are no match entries, then this bus matches no * matter what. */ if ((patterns == NULL) || (num_patterns == 0)) return(DM_RET_DESCEND | DM_RET_COPY); for (i = 0; i < num_patterns; i++) { struct bus_match_pattern *cur_pattern; /* * If the pattern in question isn't for a bus node, we * aren't interested. However, we do indicate to the * calling routine that we should continue descending the * tree, since the user wants to match against lower-level * EDT elements. */ if (patterns[i].type != DEV_MATCH_BUS) { if ((retval & DM_RET_ACTION_MASK) == DM_RET_NONE) retval |= DM_RET_DESCEND; continue; } cur_pattern = &patterns[i].pattern.bus_pattern; /* * If they want to match any bus node, we give them any * device node. */ if (cur_pattern->flags == BUS_MATCH_ANY) { /* set the copy flag */ retval |= DM_RET_COPY; /* * If we've already decided on an action, go ahead * and return. */ if ((retval & DM_RET_ACTION_MASK) != DM_RET_NONE) return(retval); } /* * Not sure why someone would do this... */ if (cur_pattern->flags == BUS_MATCH_NONE) continue; if (((cur_pattern->flags & BUS_MATCH_PATH) != 0) && (cur_pattern->path_id != bus->path_id)) continue; if (((cur_pattern->flags & BUS_MATCH_BUS_ID) != 0) && (cur_pattern->bus_id != bus->sim->bus_id)) continue; if (((cur_pattern->flags & BUS_MATCH_UNIT) != 0) && (cur_pattern->unit_number != bus->sim->unit_number)) continue; if (((cur_pattern->flags & BUS_MATCH_NAME) != 0) && (strncmp(cur_pattern->dev_name, bus->sim->sim_name, DEV_IDLEN) != 0)) continue; /* * If we get to this point, the user definitely wants * information on this bus. So tell the caller to copy the * data out. */ retval |= DM_RET_COPY; /* * If the return action has been set to descend, then we * know that we've already seen a non-bus matching * expression, therefore we need to further descend the tree. * This won't change by continuing around the loop, so we * go ahead and return. If we haven't seen a non-bus * matching expression, we keep going around the loop until * we exhaust the matching expressions. We'll set the stop * flag once we fall out of the loop. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_DESCEND) return(retval); } /* * If the return action hasn't been set to descend yet, that means * we haven't seen anything other than bus matching patterns. So * tell the caller to stop descending the tree -- the user doesn't * want to match against lower level tree elements. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_NONE) retval |= DM_RET_STOP; return(retval); } static dev_match_ret xptdevicematch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_ed *device) { dev_match_ret retval; u_int i; retval = DM_RET_NONE; /* * If we aren't given something to match against, that's an error. */ if (device == NULL) return(DM_RET_ERROR); /* * If there are no match entries, then this device matches no * matter what. */ if ((patterns == NULL) || (num_patterns == 0)) return(DM_RET_DESCEND | DM_RET_COPY); for (i = 0; i < num_patterns; i++) { struct device_match_pattern *cur_pattern; struct scsi_vpd_device_id *device_id_page; /* * If the pattern in question isn't for a device node, we * aren't interested. */ if (patterns[i].type != DEV_MATCH_DEVICE) { if ((patterns[i].type == DEV_MATCH_PERIPH) && ((retval & DM_RET_ACTION_MASK) == DM_RET_NONE)) retval |= DM_RET_DESCEND; continue; } cur_pattern = &patterns[i].pattern.device_pattern; /* Error out if mutually exclusive options are specified. */ if ((cur_pattern->flags & (DEV_MATCH_INQUIRY|DEV_MATCH_DEVID)) == (DEV_MATCH_INQUIRY|DEV_MATCH_DEVID)) return(DM_RET_ERROR); /* * If they want to match any device node, we give them any * device node. */ if (cur_pattern->flags == DEV_MATCH_ANY) goto copy_dev_node; /* * Not sure why someone would do this... */ if (cur_pattern->flags == DEV_MATCH_NONE) continue; if (((cur_pattern->flags & DEV_MATCH_PATH) != 0) && (cur_pattern->path_id != device->target->bus->path_id)) continue; if (((cur_pattern->flags & DEV_MATCH_TARGET) != 0) && (cur_pattern->target_id != device->target->target_id)) continue; if (((cur_pattern->flags & DEV_MATCH_LUN) != 0) && (cur_pattern->target_lun != device->lun_id)) continue; if (((cur_pattern->flags & DEV_MATCH_INQUIRY) != 0) && (cam_quirkmatch((caddr_t)&device->inq_data, (caddr_t)&cur_pattern->data.inq_pat, 1, sizeof(cur_pattern->data.inq_pat), scsi_static_inquiry_match) == NULL)) continue; device_id_page = (struct scsi_vpd_device_id *)device->device_id; if (((cur_pattern->flags & DEV_MATCH_DEVID) != 0) && (device->device_id_len < SVPD_DEVICE_ID_HDR_LEN || scsi_devid_match((uint8_t *)device_id_page->desc_list, device->device_id_len - SVPD_DEVICE_ID_HDR_LEN, cur_pattern->data.devid_pat.id, cur_pattern->data.devid_pat.id_len) != 0)) continue; copy_dev_node: /* * If we get to this point, the user definitely wants * information on this device. So tell the caller to copy * the data out. */ retval |= DM_RET_COPY; /* * If the return action has been set to descend, then we * know that we've already seen a peripheral matching * expression, therefore we need to further descend the tree. * This won't change by continuing around the loop, so we * go ahead and return. If we haven't seen a peripheral * matching expression, we keep going around the loop until * we exhaust the matching expressions. We'll set the stop * flag once we fall out of the loop. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_DESCEND) return(retval); } /* * If the return action hasn't been set to descend yet, that means * we haven't seen any peripheral matching patterns. So tell the * caller to stop descending the tree -- the user doesn't want to * match against lower level tree elements. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_NONE) retval |= DM_RET_STOP; return(retval); } /* * Match a single peripheral against any number of match patterns. */ static dev_match_ret xptperiphmatch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_periph *periph) { dev_match_ret retval; u_int i; /* * If we aren't given something to match against, that's an error. */ if (periph == NULL) return(DM_RET_ERROR); /* * If there are no match entries, then this peripheral matches no * matter what. */ if ((patterns == NULL) || (num_patterns == 0)) return(DM_RET_STOP | DM_RET_COPY); /* * There aren't any nodes below a peripheral node, so there's no * reason to descend the tree any further. */ retval = DM_RET_STOP; for (i = 0; i < num_patterns; i++) { struct periph_match_pattern *cur_pattern; /* * If the pattern in question isn't for a peripheral, we * aren't interested. */ if (patterns[i].type != DEV_MATCH_PERIPH) continue; cur_pattern = &patterns[i].pattern.periph_pattern; /* * If they want to match on anything, then we will do so. */ if (cur_pattern->flags == PERIPH_MATCH_ANY) { /* set the copy flag */ retval |= DM_RET_COPY; /* * We've already set the return action to stop, * since there are no nodes below peripherals in * the tree. */ return(retval); } /* * Not sure why someone would do this... */ if (cur_pattern->flags == PERIPH_MATCH_NONE) continue; if (((cur_pattern->flags & PERIPH_MATCH_PATH) != 0) && (cur_pattern->path_id != periph->path->bus->path_id)) continue; /* * For the target and lun id's, we have to make sure the * target and lun pointers aren't NULL. The xpt peripheral * has a wildcard target and device. */ if (((cur_pattern->flags & PERIPH_MATCH_TARGET) != 0) && ((periph->path->target == NULL) ||(cur_pattern->target_id != periph->path->target->target_id))) continue; if (((cur_pattern->flags & PERIPH_MATCH_LUN) != 0) && ((periph->path->device == NULL) || (cur_pattern->target_lun != periph->path->device->lun_id))) continue; if (((cur_pattern->flags & PERIPH_MATCH_UNIT) != 0) && (cur_pattern->unit_number != periph->unit_number)) continue; if (((cur_pattern->flags & PERIPH_MATCH_NAME) != 0) && (strncmp(cur_pattern->periph_name, periph->periph_name, DEV_IDLEN) != 0)) continue; /* * If we get to this point, the user definitely wants * information on this peripheral. So tell the caller to * copy the data out. */ retval |= DM_RET_COPY; /* * The return action has already been set to stop, since * peripherals don't have any nodes below them in the EDT. */ return(retval); } /* * If we get to this point, the peripheral that was passed in * doesn't match any of the patterns. */ return(retval); } static int xptedtbusfunc(struct cam_eb *bus, void *arg) { struct ccb_dev_match *cdm; struct cam_et *target; dev_match_ret retval; cdm = (struct ccb_dev_match *)arg; /* * If our position is for something deeper in the tree, that means * that we've already seen this node. So, we keep going down. */ if ((cdm->pos.position_type & CAM_DEV_POS_BUS) && (cdm->pos.cookie.bus == bus) && (cdm->pos.position_type & CAM_DEV_POS_TARGET) && (cdm->pos.cookie.target != NULL)) retval = DM_RET_DESCEND; else retval = xptbusmatch(cdm->patterns, cdm->num_patterns, bus); /* * If we got an error, bail out of the search. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_ERROR) { cdm->status = CAM_DEV_MATCH_ERROR; return(0); } /* * If the copy flag is set, copy this bus out. */ if (retval & DM_RET_COPY) { int spaceleft, j; spaceleft = cdm->match_buf_len - (cdm->num_matches * sizeof(struct dev_match_result)); /* * If we don't have enough space to put in another * match result, save our position and tell the * user there are more devices to check. */ if (spaceleft < sizeof(struct dev_match_result)) { bzero(&cdm->pos, sizeof(cdm->pos)); cdm->pos.position_type = CAM_DEV_POS_EDT | CAM_DEV_POS_BUS; cdm->pos.cookie.bus = bus; cdm->pos.generations[CAM_BUS_GENERATION]= xsoftc.bus_generation; cdm->status = CAM_DEV_MATCH_MORE; return(0); } j = cdm->num_matches; cdm->num_matches++; cdm->matches[j].type = DEV_MATCH_BUS; cdm->matches[j].result.bus_result.path_id = bus->path_id; cdm->matches[j].result.bus_result.bus_id = bus->sim->bus_id; cdm->matches[j].result.bus_result.unit_number = bus->sim->unit_number; strncpy(cdm->matches[j].result.bus_result.dev_name, bus->sim->sim_name, DEV_IDLEN); } /* * If the user is only interested in busses, there's no * reason to descend to the next level in the tree. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_STOP) return(1); /* * If there is a target generation recorded, check it to * make sure the target list hasn't changed. */ mtx_lock(&bus->eb_mtx); if ((cdm->pos.position_type & CAM_DEV_POS_BUS) && (cdm->pos.cookie.bus == bus) && (cdm->pos.position_type & CAM_DEV_POS_TARGET) && (cdm->pos.cookie.target != NULL)) { if ((cdm->pos.generations[CAM_TARGET_GENERATION] != bus->generation)) { mtx_unlock(&bus->eb_mtx); cdm->status = CAM_DEV_MATCH_LIST_CHANGED; return (0); } target = (struct cam_et *)cdm->pos.cookie.target; target->refcount++; } else target = NULL; mtx_unlock(&bus->eb_mtx); return (xpttargettraverse(bus, target, xptedttargetfunc, arg)); } static int xptedttargetfunc(struct cam_et *target, void *arg) { struct ccb_dev_match *cdm; struct cam_eb *bus; struct cam_ed *device; cdm = (struct ccb_dev_match *)arg; bus = target->bus; /* * If there is a device list generation recorded, check it to * make sure the device list hasn't changed. */ mtx_lock(&bus->eb_mtx); if ((cdm->pos.position_type & CAM_DEV_POS_BUS) && (cdm->pos.cookie.bus == bus) && (cdm->pos.position_type & CAM_DEV_POS_TARGET) && (cdm->pos.cookie.target == target) && (cdm->pos.position_type & CAM_DEV_POS_DEVICE) && (cdm->pos.cookie.device != NULL)) { if (cdm->pos.generations[CAM_DEV_GENERATION] != target->generation) { mtx_unlock(&bus->eb_mtx); cdm->status = CAM_DEV_MATCH_LIST_CHANGED; return(0); } device = (struct cam_ed *)cdm->pos.cookie.device; device->refcount++; } else device = NULL; mtx_unlock(&bus->eb_mtx); return (xptdevicetraverse(target, device, xptedtdevicefunc, arg)); } static int xptedtdevicefunc(struct cam_ed *device, void *arg) { struct cam_eb *bus; struct cam_periph *periph; struct ccb_dev_match *cdm; dev_match_ret retval; cdm = (struct ccb_dev_match *)arg; bus = device->target->bus; /* * If our position is for something deeper in the tree, that means * that we've already seen this node. So, we keep going down. */ if ((cdm->pos.position_type & CAM_DEV_POS_DEVICE) && (cdm->pos.cookie.device == device) && (cdm->pos.position_type & CAM_DEV_POS_PERIPH) && (cdm->pos.cookie.periph != NULL)) retval = DM_RET_DESCEND; else retval = xptdevicematch(cdm->patterns, cdm->num_patterns, device); if ((retval & DM_RET_ACTION_MASK) == DM_RET_ERROR) { cdm->status = CAM_DEV_MATCH_ERROR; return(0); } /* * If the copy flag is set, copy this device out. */ if (retval & DM_RET_COPY) { int spaceleft, j; spaceleft = cdm->match_buf_len - (cdm->num_matches * sizeof(struct dev_match_result)); /* * If we don't have enough space to put in another * match result, save our position and tell the * user there are more devices to check. */ if (spaceleft < sizeof(struct dev_match_result)) { bzero(&cdm->pos, sizeof(cdm->pos)); cdm->pos.position_type = CAM_DEV_POS_EDT | CAM_DEV_POS_BUS | CAM_DEV_POS_TARGET | CAM_DEV_POS_DEVICE; cdm->pos.cookie.bus = device->target->bus; cdm->pos.generations[CAM_BUS_GENERATION]= xsoftc.bus_generation; cdm->pos.cookie.target = device->target; cdm->pos.generations[CAM_TARGET_GENERATION] = device->target->bus->generation; cdm->pos.cookie.device = device; cdm->pos.generations[CAM_DEV_GENERATION] = device->target->generation; cdm->status = CAM_DEV_MATCH_MORE; return(0); } j = cdm->num_matches; cdm->num_matches++; cdm->matches[j].type = DEV_MATCH_DEVICE; cdm->matches[j].result.device_result.path_id = device->target->bus->path_id; cdm->matches[j].result.device_result.target_id = device->target->target_id; cdm->matches[j].result.device_result.target_lun = device->lun_id; cdm->matches[j].result.device_result.protocol = device->protocol; bcopy(&device->inq_data, &cdm->matches[j].result.device_result.inq_data, sizeof(struct scsi_inquiry_data)); bcopy(&device->ident_data, &cdm->matches[j].result.device_result.ident_data, sizeof(struct ata_params)); /* Let the user know whether this device is unconfigured */ if (device->flags & CAM_DEV_UNCONFIGURED) cdm->matches[j].result.device_result.flags = DEV_RESULT_UNCONFIGURED; else cdm->matches[j].result.device_result.flags = DEV_RESULT_NOFLAG; } /* * If the user isn't interested in peripherals, don't descend * the tree any further. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_STOP) return(1); /* * If there is a peripheral list generation recorded, make sure * it hasn't changed. */ xpt_lock_buses(); mtx_lock(&bus->eb_mtx); if ((cdm->pos.position_type & CAM_DEV_POS_BUS) && (cdm->pos.cookie.bus == bus) && (cdm->pos.position_type & CAM_DEV_POS_TARGET) && (cdm->pos.cookie.target == device->target) && (cdm->pos.position_type & CAM_DEV_POS_DEVICE) && (cdm->pos.cookie.device == device) && (cdm->pos.position_type & CAM_DEV_POS_PERIPH) && (cdm->pos.cookie.periph != NULL)) { if (cdm->pos.generations[CAM_PERIPH_GENERATION] != device->generation) { mtx_unlock(&bus->eb_mtx); xpt_unlock_buses(); cdm->status = CAM_DEV_MATCH_LIST_CHANGED; return(0); } periph = (struct cam_periph *)cdm->pos.cookie.periph; periph->refcount++; } else periph = NULL; mtx_unlock(&bus->eb_mtx); xpt_unlock_buses(); return (xptperiphtraverse(device, periph, xptedtperiphfunc, arg)); } static int xptedtperiphfunc(struct cam_periph *periph, void *arg) { struct ccb_dev_match *cdm; dev_match_ret retval; cdm = (struct ccb_dev_match *)arg; retval = xptperiphmatch(cdm->patterns, cdm->num_patterns, periph); if ((retval & DM_RET_ACTION_MASK) == DM_RET_ERROR) { cdm->status = CAM_DEV_MATCH_ERROR; return(0); } /* * If the copy flag is set, copy this peripheral out. */ if (retval & DM_RET_COPY) { int spaceleft, j; spaceleft = cdm->match_buf_len - (cdm->num_matches * sizeof(struct dev_match_result)); /* * If we don't have enough space to put in another * match result, save our position and tell the * user there are more devices to check. */ if (spaceleft < sizeof(struct dev_match_result)) { bzero(&cdm->pos, sizeof(cdm->pos)); cdm->pos.position_type = CAM_DEV_POS_EDT | CAM_DEV_POS_BUS | CAM_DEV_POS_TARGET | CAM_DEV_POS_DEVICE | CAM_DEV_POS_PERIPH; cdm->pos.cookie.bus = periph->path->bus; cdm->pos.generations[CAM_BUS_GENERATION]= xsoftc.bus_generation; cdm->pos.cookie.target = periph->path->target; cdm->pos.generations[CAM_TARGET_GENERATION] = periph->path->bus->generation; cdm->pos.cookie.device = periph->path->device; cdm->pos.generations[CAM_DEV_GENERATION] = periph->path->target->generation; cdm->pos.cookie.periph = periph; cdm->pos.generations[CAM_PERIPH_GENERATION] = periph->path->device->generation; cdm->status = CAM_DEV_MATCH_MORE; return(0); } j = cdm->num_matches; cdm->num_matches++; cdm->matches[j].type = DEV_MATCH_PERIPH; cdm->matches[j].result.periph_result.path_id = periph->path->bus->path_id; cdm->matches[j].result.periph_result.target_id = periph->path->target->target_id; cdm->matches[j].result.periph_result.target_lun = periph->path->device->lun_id; cdm->matches[j].result.periph_result.unit_number = periph->unit_number; strncpy(cdm->matches[j].result.periph_result.periph_name, periph->periph_name, DEV_IDLEN); } return(1); } static int xptedtmatch(struct ccb_dev_match *cdm) { struct cam_eb *bus; int ret; cdm->num_matches = 0; /* * Check the bus list generation. If it has changed, the user * needs to reset everything and start over. */ xpt_lock_buses(); if ((cdm->pos.position_type & CAM_DEV_POS_BUS) && (cdm->pos.cookie.bus != NULL)) { if (cdm->pos.generations[CAM_BUS_GENERATION] != xsoftc.bus_generation) { xpt_unlock_buses(); cdm->status = CAM_DEV_MATCH_LIST_CHANGED; return(0); } bus = (struct cam_eb *)cdm->pos.cookie.bus; bus->refcount++; } else bus = NULL; xpt_unlock_buses(); ret = xptbustraverse(bus, xptedtbusfunc, cdm); /* * If we get back 0, that means that we had to stop before fully * traversing the EDT. It also means that one of the subroutines * has set the status field to the proper value. If we get back 1, * we've fully traversed the EDT and copied out any matching entries. */ if (ret == 1) cdm->status = CAM_DEV_MATCH_LAST; return(ret); } static int xptplistpdrvfunc(struct periph_driver **pdrv, void *arg) { struct cam_periph *periph; struct ccb_dev_match *cdm; cdm = (struct ccb_dev_match *)arg; xpt_lock_buses(); if ((cdm->pos.position_type & CAM_DEV_POS_PDPTR) && (cdm->pos.cookie.pdrv == pdrv) && (cdm->pos.position_type & CAM_DEV_POS_PERIPH) && (cdm->pos.cookie.periph != NULL)) { if (cdm->pos.generations[CAM_PERIPH_GENERATION] != (*pdrv)->generation) { xpt_unlock_buses(); cdm->status = CAM_DEV_MATCH_LIST_CHANGED; return(0); } periph = (struct cam_periph *)cdm->pos.cookie.periph; periph->refcount++; } else periph = NULL; xpt_unlock_buses(); return (xptpdperiphtraverse(pdrv, periph, xptplistperiphfunc, arg)); } static int xptplistperiphfunc(struct cam_periph *periph, void *arg) { struct ccb_dev_match *cdm; dev_match_ret retval; cdm = (struct ccb_dev_match *)arg; retval = xptperiphmatch(cdm->patterns, cdm->num_patterns, periph); if ((retval & DM_RET_ACTION_MASK) == DM_RET_ERROR) { cdm->status = CAM_DEV_MATCH_ERROR; return(0); } /* * If the copy flag is set, copy this peripheral out. */ if (retval & DM_RET_COPY) { int spaceleft, j; spaceleft = cdm->match_buf_len - (cdm->num_matches * sizeof(struct dev_match_result)); /* * If we don't have enough space to put in another * match result, save our position and tell the * user there are more devices to check. */ if (spaceleft < sizeof(struct dev_match_result)) { struct periph_driver **pdrv; pdrv = NULL; bzero(&cdm->pos, sizeof(cdm->pos)); cdm->pos.position_type = CAM_DEV_POS_PDRV | CAM_DEV_POS_PDPTR | CAM_DEV_POS_PERIPH; /* * This may look a bit non-sensical, but it is * actually quite logical. There are very few * peripheral drivers, and bloating every peripheral * structure with a pointer back to its parent * peripheral driver linker set entry would cost * more in the long run than doing this quick lookup. */ for (pdrv = periph_drivers; *pdrv != NULL; pdrv++) { if (strcmp((*pdrv)->driver_name, periph->periph_name) == 0) break; } if (*pdrv == NULL) { cdm->status = CAM_DEV_MATCH_ERROR; return(0); } cdm->pos.cookie.pdrv = pdrv; /* * The periph generation slot does double duty, as * does the periph pointer slot. They are used for * both edt and pdrv lookups and positioning. */ cdm->pos.cookie.periph = periph; cdm->pos.generations[CAM_PERIPH_GENERATION] = (*pdrv)->generation; cdm->status = CAM_DEV_MATCH_MORE; return(0); } j = cdm->num_matches; cdm->num_matches++; cdm->matches[j].type = DEV_MATCH_PERIPH; cdm->matches[j].result.periph_result.path_id = periph->path->bus->path_id; /* * The transport layer peripheral doesn't have a target or * lun. */ if (periph->path->target) cdm->matches[j].result.periph_result.target_id = periph->path->target->target_id; else cdm->matches[j].result.periph_result.target_id = CAM_TARGET_WILDCARD; if (periph->path->device) cdm->matches[j].result.periph_result.target_lun = periph->path->device->lun_id; else cdm->matches[j].result.periph_result.target_lun = CAM_LUN_WILDCARD; cdm->matches[j].result.periph_result.unit_number = periph->unit_number; strncpy(cdm->matches[j].result.periph_result.periph_name, periph->periph_name, DEV_IDLEN); } return(1); } static int xptperiphlistmatch(struct ccb_dev_match *cdm) { int ret; cdm->num_matches = 0; /* * At this point in the edt traversal function, we check the bus * list generation to make sure that no busses have been added or * removed since the user last sent a XPT_DEV_MATCH ccb through. * For the peripheral driver list traversal function, however, we * don't have to worry about new peripheral driver types coming or * going; they're in a linker set, and therefore can't change * without a recompile. */ if ((cdm->pos.position_type & CAM_DEV_POS_PDPTR) && (cdm->pos.cookie.pdrv != NULL)) ret = xptpdrvtraverse( (struct periph_driver **)cdm->pos.cookie.pdrv, xptplistpdrvfunc, cdm); else ret = xptpdrvtraverse(NULL, xptplistpdrvfunc, cdm); /* * If we get back 0, that means that we had to stop before fully * traversing the peripheral driver tree. It also means that one of * the subroutines has set the status field to the proper value. If * we get back 1, we've fully traversed the EDT and copied out any * matching entries. */ if (ret == 1) cdm->status = CAM_DEV_MATCH_LAST; return(ret); } static int xptbustraverse(struct cam_eb *start_bus, xpt_busfunc_t *tr_func, void *arg) { struct cam_eb *bus, *next_bus; int retval; retval = 1; if (start_bus) bus = start_bus; else { xpt_lock_buses(); bus = TAILQ_FIRST(&xsoftc.xpt_busses); if (bus == NULL) { xpt_unlock_buses(); return (retval); } bus->refcount++; xpt_unlock_buses(); } for (; bus != NULL; bus = next_bus) { retval = tr_func(bus, arg); if (retval == 0) { xpt_release_bus(bus); break; } xpt_lock_buses(); next_bus = TAILQ_NEXT(bus, links); if (next_bus) next_bus->refcount++; xpt_unlock_buses(); xpt_release_bus(bus); } return(retval); } static int xpttargettraverse(struct cam_eb *bus, struct cam_et *start_target, xpt_targetfunc_t *tr_func, void *arg) { struct cam_et *target, *next_target; int retval; retval = 1; if (start_target) target = start_target; else { mtx_lock(&bus->eb_mtx); target = TAILQ_FIRST(&bus->et_entries); if (target == NULL) { mtx_unlock(&bus->eb_mtx); return (retval); } target->refcount++; mtx_unlock(&bus->eb_mtx); } for (; target != NULL; target = next_target) { retval = tr_func(target, arg); if (retval == 0) { xpt_release_target(target); break; } mtx_lock(&bus->eb_mtx); next_target = TAILQ_NEXT(target, links); if (next_target) next_target->refcount++; mtx_unlock(&bus->eb_mtx); xpt_release_target(target); } return(retval); } static int xptdevicetraverse(struct cam_et *target, struct cam_ed *start_device, xpt_devicefunc_t *tr_func, void *arg) { struct cam_eb *bus; struct cam_ed *device, *next_device; int retval; retval = 1; bus = target->bus; if (start_device) device = start_device; else { mtx_lock(&bus->eb_mtx); device = TAILQ_FIRST(&target->ed_entries); if (device == NULL) { mtx_unlock(&bus->eb_mtx); return (retval); } device->refcount++; mtx_unlock(&bus->eb_mtx); } for (; device != NULL; device = next_device) { mtx_lock(&device->device_mtx); retval = tr_func(device, arg); mtx_unlock(&device->device_mtx); if (retval == 0) { xpt_release_device(device); break; } mtx_lock(&bus->eb_mtx); next_device = TAILQ_NEXT(device, links); if (next_device) next_device->refcount++; mtx_unlock(&bus->eb_mtx); xpt_release_device(device); } return(retval); } static int xptperiphtraverse(struct cam_ed *device, struct cam_periph *start_periph, xpt_periphfunc_t *tr_func, void *arg) { struct cam_eb *bus; struct cam_periph *periph, *next_periph; int retval; retval = 1; bus = device->target->bus; if (start_periph) periph = start_periph; else { xpt_lock_buses(); mtx_lock(&bus->eb_mtx); periph = SLIST_FIRST(&device->periphs); while (periph != NULL && (periph->flags & CAM_PERIPH_FREE) != 0) periph = SLIST_NEXT(periph, periph_links); if (periph == NULL) { mtx_unlock(&bus->eb_mtx); xpt_unlock_buses(); return (retval); } periph->refcount++; mtx_unlock(&bus->eb_mtx); xpt_unlock_buses(); } for (; periph != NULL; periph = next_periph) { retval = tr_func(periph, arg); if (retval == 0) { cam_periph_release_locked(periph); break; } xpt_lock_buses(); mtx_lock(&bus->eb_mtx); next_periph = SLIST_NEXT(periph, periph_links); while (next_periph != NULL && (next_periph->flags & CAM_PERIPH_FREE) != 0) next_periph = SLIST_NEXT(next_periph, periph_links); if (next_periph) next_periph->refcount++; mtx_unlock(&bus->eb_mtx); xpt_unlock_buses(); cam_periph_release_locked(periph); } return(retval); } static int xptpdrvtraverse(struct periph_driver **start_pdrv, xpt_pdrvfunc_t *tr_func, void *arg) { struct periph_driver **pdrv; int retval; retval = 1; /* * We don't traverse the peripheral driver list like we do the * other lists, because it is a linker set, and therefore cannot be * changed during runtime. If the peripheral driver list is ever * re-done to be something other than a linker set (i.e. it can * change while the system is running), the list traversal should * be modified to work like the other traversal functions. */ for (pdrv = (start_pdrv ? start_pdrv : periph_drivers); *pdrv != NULL; pdrv++) { retval = tr_func(pdrv, arg); if (retval == 0) return(retval); } return(retval); } static int xptpdperiphtraverse(struct periph_driver **pdrv, struct cam_periph *start_periph, xpt_periphfunc_t *tr_func, void *arg) { struct cam_periph *periph, *next_periph; int retval; retval = 1; if (start_periph) periph = start_periph; else { xpt_lock_buses(); periph = TAILQ_FIRST(&(*pdrv)->units); while (periph != NULL && (periph->flags & CAM_PERIPH_FREE) != 0) periph = TAILQ_NEXT(periph, unit_links); if (periph == NULL) { xpt_unlock_buses(); return (retval); } periph->refcount++; xpt_unlock_buses(); } for (; periph != NULL; periph = next_periph) { cam_periph_lock(periph); retval = tr_func(periph, arg); cam_periph_unlock(periph); if (retval == 0) { cam_periph_release(periph); break; } xpt_lock_buses(); next_periph = TAILQ_NEXT(periph, unit_links); while (next_periph != NULL && (next_periph->flags & CAM_PERIPH_FREE) != 0) next_periph = TAILQ_NEXT(next_periph, unit_links); if (next_periph) next_periph->refcount++; xpt_unlock_buses(); cam_periph_release(periph); } return(retval); } static int xptdefbusfunc(struct cam_eb *bus, void *arg) { struct xpt_traverse_config *tr_config; tr_config = (struct xpt_traverse_config *)arg; if (tr_config->depth == XPT_DEPTH_BUS) { xpt_busfunc_t *tr_func; tr_func = (xpt_busfunc_t *)tr_config->tr_func; return(tr_func(bus, tr_config->tr_arg)); } else return(xpttargettraverse(bus, NULL, xptdeftargetfunc, arg)); } static int xptdeftargetfunc(struct cam_et *target, void *arg) { struct xpt_traverse_config *tr_config; tr_config = (struct xpt_traverse_config *)arg; if (tr_config->depth == XPT_DEPTH_TARGET) { xpt_targetfunc_t *tr_func; tr_func = (xpt_targetfunc_t *)tr_config->tr_func; return(tr_func(target, tr_config->tr_arg)); } else return(xptdevicetraverse(target, NULL, xptdefdevicefunc, arg)); } static int xptdefdevicefunc(struct cam_ed *device, void *arg) { struct xpt_traverse_config *tr_config; tr_config = (struct xpt_traverse_config *)arg; if (tr_config->depth == XPT_DEPTH_DEVICE) { xpt_devicefunc_t *tr_func; tr_func = (xpt_devicefunc_t *)tr_config->tr_func; return(tr_func(device, tr_config->tr_arg)); } else return(xptperiphtraverse(device, NULL, xptdefperiphfunc, arg)); } static int xptdefperiphfunc(struct cam_periph *periph, void *arg) { struct xpt_traverse_config *tr_config; xpt_periphfunc_t *tr_func; tr_config = (struct xpt_traverse_config *)arg; tr_func = (xpt_periphfunc_t *)tr_config->tr_func; /* * Unlike the other default functions, we don't check for depth * here. The peripheral driver level is the last level in the EDT, * so if we're here, we should execute the function in question. */ return(tr_func(periph, tr_config->tr_arg)); } /* * Execute the given function for every bus in the EDT. */ static int xpt_for_all_busses(xpt_busfunc_t *tr_func, void *arg) { struct xpt_traverse_config tr_config; tr_config.depth = XPT_DEPTH_BUS; tr_config.tr_func = tr_func; tr_config.tr_arg = arg; return(xptbustraverse(NULL, xptdefbusfunc, &tr_config)); } /* * Execute the given function for every device in the EDT. */ static int xpt_for_all_devices(xpt_devicefunc_t *tr_func, void *arg) { struct xpt_traverse_config tr_config; tr_config.depth = XPT_DEPTH_DEVICE; tr_config.tr_func = tr_func; tr_config.tr_arg = arg; return(xptbustraverse(NULL, xptdefbusfunc, &tr_config)); } static int xptsetasyncfunc(struct cam_ed *device, void *arg) { struct cam_path path; struct ccb_getdev cgd; struct ccb_setasync *csa = (struct ccb_setasync *)arg; /* * Don't report unconfigured devices (Wildcard devs, * devices only for target mode, device instances * that have been invalidated but are waiting for * their last reference count to be released). */ if ((device->flags & CAM_DEV_UNCONFIGURED) != 0) return (1); xpt_compile_path(&path, NULL, device->target->bus->path_id, device->target->target_id, device->lun_id); xpt_setup_ccb(&cgd.ccb_h, &path, CAM_PRIORITY_NORMAL); cgd.ccb_h.func_code = XPT_GDEV_TYPE; xpt_action((union ccb *)&cgd); csa->callback(csa->callback_arg, AC_FOUND_DEVICE, &path, &cgd); xpt_release_path(&path); return(1); } static int xptsetasyncbusfunc(struct cam_eb *bus, void *arg) { struct cam_path path; struct ccb_pathinq cpi; struct ccb_setasync *csa = (struct ccb_setasync *)arg; xpt_compile_path(&path, /*periph*/NULL, bus->path_id, CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD); xpt_path_lock(&path); xpt_setup_ccb(&cpi.ccb_h, &path, CAM_PRIORITY_NORMAL); cpi.ccb_h.func_code = XPT_PATH_INQ; xpt_action((union ccb *)&cpi); csa->callback(csa->callback_arg, AC_PATH_REGISTERED, &path, &cpi); xpt_path_unlock(&path); xpt_release_path(&path); return(1); } void xpt_action(union ccb *start_ccb) { CAM_DEBUG(start_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xpt_action: func %#x %s\n", start_ccb->ccb_h.func_code, xpt_action_name(start_ccb->ccb_h.func_code))); start_ccb->ccb_h.status = CAM_REQ_INPROG; (*(start_ccb->ccb_h.path->bus->xport->action))(start_ccb); } void xpt_action_default(union ccb *start_ccb) { struct cam_path *path; struct cam_sim *sim; int lock; path = start_ccb->ccb_h.path; CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_action_default: func %#x %s\n", start_ccb->ccb_h.func_code, xpt_action_name(start_ccb->ccb_h.func_code))); switch (start_ccb->ccb_h.func_code) { case XPT_SCSI_IO: { struct cam_ed *device; /* * For the sake of compatibility with SCSI-1 * devices that may not understand the identify * message, we include lun information in the * second byte of all commands. SCSI-1 specifies * that luns are a 3 bit value and reserves only 3 * bits for lun information in the CDB. Later * revisions of the SCSI spec allow for more than 8 * luns, but have deprecated lun information in the * CDB. So, if the lun won't fit, we must omit. * * Also be aware that during initial probing for devices, * the inquiry information is unknown but initialized to 0. * This means that this code will be exercised while probing * devices with an ANSI revision greater than 2. */ device = path->device; if (device->protocol_version <= SCSI_REV_2 && start_ccb->ccb_h.target_lun < 8 && (start_ccb->ccb_h.flags & CAM_CDB_POINTER) == 0) { start_ccb->csio.cdb_io.cdb_bytes[1] |= start_ccb->ccb_h.target_lun << 5; } start_ccb->csio.scsi_status = SCSI_STATUS_OK; } /* FALLTHROUGH */ case XPT_TARGET_IO: case XPT_CONT_TARGET_IO: start_ccb->csio.sense_resid = 0; start_ccb->csio.resid = 0; /* FALLTHROUGH */ case XPT_ATA_IO: if (start_ccb->ccb_h.func_code == XPT_ATA_IO) start_ccb->ataio.resid = 0; /* FALLTHROUGH */ + case XPT_NVME_IO: + if (start_ccb->ccb_h.func_code == XPT_NVME_IO) + start_ccb->nvmeio.resid = 0; + /* FALLTHROUGH */ case XPT_RESET_DEV: case XPT_ENG_EXEC: case XPT_SMP_IO: { struct cam_devq *devq; devq = path->bus->sim->devq; mtx_lock(&devq->send_mtx); cam_ccbq_insert_ccb(&path->device->ccbq, start_ccb); if (xpt_schedule_devq(devq, path->device) != 0) xpt_run_devq(devq); mtx_unlock(&devq->send_mtx); break; } case XPT_CALC_GEOMETRY: /* Filter out garbage */ if (start_ccb->ccg.block_size == 0 || start_ccb->ccg.volume_size == 0) { start_ccb->ccg.cylinders = 0; start_ccb->ccg.heads = 0; start_ccb->ccg.secs_per_track = 0; start_ccb->ccb_h.status = CAM_REQ_CMP; break; } #if defined(PC98) || defined(__sparc64__) /* * In a PC-98 system, geometry translation depens on * the "real" device geometry obtained from mode page 4. * SCSI geometry translation is performed in the * initialization routine of the SCSI BIOS and the result * stored in host memory. If the translation is available * in host memory, use it. If not, rely on the default * translation the device driver performs. * For sparc64, we may need adjust the geometry of large * disks in order to fit the limitations of the 16-bit * fields of the VTOC8 disk label. */ if (scsi_da_bios_params(&start_ccb->ccg) != 0) { start_ccb->ccb_h.status = CAM_REQ_CMP; break; } #endif goto call_sim; case XPT_ABORT: { union ccb* abort_ccb; abort_ccb = start_ccb->cab.abort_ccb; if (XPT_FC_IS_DEV_QUEUED(abort_ccb)) { if (abort_ccb->ccb_h.pinfo.index >= 0) { struct cam_ccbq *ccbq; struct cam_ed *device; device = abort_ccb->ccb_h.path->device; ccbq = &device->ccbq; cam_ccbq_remove_ccb(ccbq, abort_ccb); abort_ccb->ccb_h.status = CAM_REQ_ABORTED|CAM_DEV_QFRZN; xpt_freeze_devq(abort_ccb->ccb_h.path, 1); xpt_done(abort_ccb); start_ccb->ccb_h.status = CAM_REQ_CMP; break; } if (abort_ccb->ccb_h.pinfo.index == CAM_UNQUEUED_INDEX && (abort_ccb->ccb_h.status & CAM_SIM_QUEUED) == 0) { /* * We've caught this ccb en route to * the SIM. Flag it for abort and the * SIM will do so just before starting * real work on the CCB. */ abort_ccb->ccb_h.status = CAM_REQ_ABORTED|CAM_DEV_QFRZN; xpt_freeze_devq(abort_ccb->ccb_h.path, 1); start_ccb->ccb_h.status = CAM_REQ_CMP; break; } } if (XPT_FC_IS_QUEUED(abort_ccb) && (abort_ccb->ccb_h.pinfo.index == CAM_DONEQ_INDEX)) { /* * It's already completed but waiting * for our SWI to get to it. */ start_ccb->ccb_h.status = CAM_UA_ABORT; break; } /* * If we weren't able to take care of the abort request * in the XPT, pass the request down to the SIM for processing. */ } /* FALLTHROUGH */ case XPT_ACCEPT_TARGET_IO: case XPT_EN_LUN: case XPT_IMMED_NOTIFY: case XPT_NOTIFY_ACK: case XPT_RESET_BUS: case XPT_IMMEDIATE_NOTIFY: case XPT_NOTIFY_ACKNOWLEDGE: case XPT_GET_SIM_KNOB_OLD: case XPT_GET_SIM_KNOB: case XPT_SET_SIM_KNOB: case XPT_GET_TRAN_SETTINGS: case XPT_SET_TRAN_SETTINGS: case XPT_PATH_INQ: call_sim: sim = path->bus->sim; lock = (mtx_owned(sim->mtx) == 0); if (lock) CAM_SIM_LOCK(sim); CAM_DEBUG(path, CAM_DEBUG_TRACE, ("sim->sim_action: func=%#x\n", start_ccb->ccb_h.func_code)); (*(sim->sim_action))(sim, start_ccb); CAM_DEBUG(path, CAM_DEBUG_TRACE, ("sim->sim_action: status=%#x\n", start_ccb->ccb_h.status)); if (lock) CAM_SIM_UNLOCK(sim); break; case XPT_PATH_STATS: start_ccb->cpis.last_reset = path->bus->last_reset; start_ccb->ccb_h.status = CAM_REQ_CMP; break; case XPT_GDEV_TYPE: { struct cam_ed *dev; dev = path->device; if ((dev->flags & CAM_DEV_UNCONFIGURED) != 0) { start_ccb->ccb_h.status = CAM_DEV_NOT_THERE; } else { struct ccb_getdev *cgd; cgd = &start_ccb->cgd; cgd->protocol = dev->protocol; cgd->inq_data = dev->inq_data; cgd->ident_data = dev->ident_data; cgd->inq_flags = dev->inq_flags; + cgd->nvme_data = dev->nvme_data; + cgd->nvme_cdata = dev->nvme_cdata; cgd->ccb_h.status = CAM_REQ_CMP; cgd->serial_num_len = dev->serial_num_len; if ((dev->serial_num_len > 0) && (dev->serial_num != NULL)) bcopy(dev->serial_num, cgd->serial_num, dev->serial_num_len); } break; } case XPT_GDEV_STATS: { struct cam_ed *dev; dev = path->device; if ((dev->flags & CAM_DEV_UNCONFIGURED) != 0) { start_ccb->ccb_h.status = CAM_DEV_NOT_THERE; } else { struct ccb_getdevstats *cgds; struct cam_eb *bus; struct cam_et *tar; struct cam_devq *devq; cgds = &start_ccb->cgds; bus = path->bus; tar = path->target; devq = bus->sim->devq; mtx_lock(&devq->send_mtx); cgds->dev_openings = dev->ccbq.dev_openings; cgds->dev_active = dev->ccbq.dev_active; cgds->allocated = dev->ccbq.allocated; cgds->queued = cam_ccbq_pending_ccb_count(&dev->ccbq); cgds->held = cgds->allocated - cgds->dev_active - cgds->queued; cgds->last_reset = tar->last_reset; cgds->maxtags = dev->maxtags; cgds->mintags = dev->mintags; if (timevalcmp(&tar->last_reset, &bus->last_reset, <)) cgds->last_reset = bus->last_reset; mtx_unlock(&devq->send_mtx); cgds->ccb_h.status = CAM_REQ_CMP; } break; } case XPT_GDEVLIST: { struct cam_periph *nperiph; struct periph_list *periph_head; struct ccb_getdevlist *cgdl; u_int i; struct cam_ed *device; int found; found = 0; /* * Don't want anyone mucking with our data. */ device = path->device; periph_head = &device->periphs; cgdl = &start_ccb->cgdl; /* * Check and see if the list has changed since the user * last requested a list member. If so, tell them that the * list has changed, and therefore they need to start over * from the beginning. */ if ((cgdl->index != 0) && (cgdl->generation != device->generation)) { cgdl->status = CAM_GDEVLIST_LIST_CHANGED; break; } /* * Traverse the list of peripherals and attempt to find * the requested peripheral. */ for (nperiph = SLIST_FIRST(periph_head), i = 0; (nperiph != NULL) && (i <= cgdl->index); nperiph = SLIST_NEXT(nperiph, periph_links), i++) { if (i == cgdl->index) { strncpy(cgdl->periph_name, nperiph->periph_name, DEV_IDLEN); cgdl->unit_number = nperiph->unit_number; found = 1; } } if (found == 0) { cgdl->status = CAM_GDEVLIST_ERROR; break; } if (nperiph == NULL) cgdl->status = CAM_GDEVLIST_LAST_DEVICE; else cgdl->status = CAM_GDEVLIST_MORE_DEVS; cgdl->index++; cgdl->generation = device->generation; cgdl->ccb_h.status = CAM_REQ_CMP; break; } case XPT_DEV_MATCH: { dev_pos_type position_type; struct ccb_dev_match *cdm; cdm = &start_ccb->cdm; /* * There are two ways of getting at information in the EDT. * The first way is via the primary EDT tree. It starts * with a list of busses, then a list of targets on a bus, * then devices/luns on a target, and then peripherals on a * device/lun. The "other" way is by the peripheral driver * lists. The peripheral driver lists are organized by * peripheral driver. (obviously) So it makes sense to * use the peripheral driver list if the user is looking * for something like "da1", or all "da" devices. If the * user is looking for something on a particular bus/target * or lun, it's generally better to go through the EDT tree. */ if (cdm->pos.position_type != CAM_DEV_POS_NONE) position_type = cdm->pos.position_type; else { u_int i; position_type = CAM_DEV_POS_NONE; for (i = 0; i < cdm->num_patterns; i++) { if ((cdm->patterns[i].type == DEV_MATCH_BUS) ||(cdm->patterns[i].type == DEV_MATCH_DEVICE)){ position_type = CAM_DEV_POS_EDT; break; } } if (cdm->num_patterns == 0) position_type = CAM_DEV_POS_EDT; else if (position_type == CAM_DEV_POS_NONE) position_type = CAM_DEV_POS_PDRV; } switch(position_type & CAM_DEV_POS_TYPEMASK) { case CAM_DEV_POS_EDT: xptedtmatch(cdm); break; case CAM_DEV_POS_PDRV: xptperiphlistmatch(cdm); break; default: cdm->status = CAM_DEV_MATCH_ERROR; break; } if (cdm->status == CAM_DEV_MATCH_ERROR) start_ccb->ccb_h.status = CAM_REQ_CMP_ERR; else start_ccb->ccb_h.status = CAM_REQ_CMP; break; } case XPT_SASYNC_CB: { struct ccb_setasync *csa; struct async_node *cur_entry; struct async_list *async_head; u_int32_t added; csa = &start_ccb->csa; added = csa->event_enable; async_head = &path->device->asyncs; /* * If there is already an entry for us, simply * update it. */ cur_entry = SLIST_FIRST(async_head); while (cur_entry != NULL) { if ((cur_entry->callback_arg == csa->callback_arg) && (cur_entry->callback == csa->callback)) break; cur_entry = SLIST_NEXT(cur_entry, links); } if (cur_entry != NULL) { /* * If the request has no flags set, * remove the entry. */ added &= ~cur_entry->event_enable; if (csa->event_enable == 0) { SLIST_REMOVE(async_head, cur_entry, async_node, links); xpt_release_device(path->device); free(cur_entry, M_CAMXPT); } else { cur_entry->event_enable = csa->event_enable; } csa->event_enable = added; } else { cur_entry = malloc(sizeof(*cur_entry), M_CAMXPT, M_NOWAIT); if (cur_entry == NULL) { csa->ccb_h.status = CAM_RESRC_UNAVAIL; break; } cur_entry->event_enable = csa->event_enable; cur_entry->event_lock = mtx_owned(path->bus->sim->mtx) ? 1 : 0; cur_entry->callback_arg = csa->callback_arg; cur_entry->callback = csa->callback; SLIST_INSERT_HEAD(async_head, cur_entry, links); xpt_acquire_device(path->device); } start_ccb->ccb_h.status = CAM_REQ_CMP; break; } case XPT_REL_SIMQ: { struct ccb_relsim *crs; struct cam_ed *dev; crs = &start_ccb->crs; dev = path->device; if (dev == NULL) { crs->ccb_h.status = CAM_DEV_NOT_THERE; break; } if ((crs->release_flags & RELSIM_ADJUST_OPENINGS) != 0) { /* Don't ever go below one opening */ if (crs->openings > 0) { xpt_dev_ccbq_resize(path, crs->openings); if (bootverbose) { xpt_print(path, "number of openings is now %d\n", crs->openings); } } } mtx_lock(&dev->sim->devq->send_mtx); if ((crs->release_flags & RELSIM_RELEASE_AFTER_TIMEOUT) != 0) { if ((dev->flags & CAM_DEV_REL_TIMEOUT_PENDING) != 0) { /* * Just extend the old timeout and decrement * the freeze count so that a single timeout * is sufficient for releasing the queue. */ start_ccb->ccb_h.flags &= ~CAM_DEV_QFREEZE; callout_stop(&dev->callout); } else { start_ccb->ccb_h.flags |= CAM_DEV_QFREEZE; } callout_reset_sbt(&dev->callout, SBT_1MS * crs->release_timeout, 0, xpt_release_devq_timeout, dev, 0); dev->flags |= CAM_DEV_REL_TIMEOUT_PENDING; } if ((crs->release_flags & RELSIM_RELEASE_AFTER_CMDCMPLT) != 0) { if ((dev->flags & CAM_DEV_REL_ON_COMPLETE) != 0) { /* * Decrement the freeze count so that a single * completion is still sufficient to unfreeze * the queue. */ start_ccb->ccb_h.flags &= ~CAM_DEV_QFREEZE; } else { dev->flags |= CAM_DEV_REL_ON_COMPLETE; start_ccb->ccb_h.flags |= CAM_DEV_QFREEZE; } } if ((crs->release_flags & RELSIM_RELEASE_AFTER_QEMPTY) != 0) { if ((dev->flags & CAM_DEV_REL_ON_QUEUE_EMPTY) != 0 || (dev->ccbq.dev_active == 0)) { start_ccb->ccb_h.flags &= ~CAM_DEV_QFREEZE; } else { dev->flags |= CAM_DEV_REL_ON_QUEUE_EMPTY; start_ccb->ccb_h.flags |= CAM_DEV_QFREEZE; } } mtx_unlock(&dev->sim->devq->send_mtx); if ((start_ccb->ccb_h.flags & CAM_DEV_QFREEZE) == 0) xpt_release_devq(path, /*count*/1, /*run_queue*/TRUE); start_ccb->crs.qfrozen_cnt = dev->ccbq.queue.qfrozen_cnt; start_ccb->ccb_h.status = CAM_REQ_CMP; break; } case XPT_DEBUG: { struct cam_path *oldpath; /* Check that all request bits are supported. */ if (start_ccb->cdbg.flags & ~(CAM_DEBUG_COMPILE)) { start_ccb->ccb_h.status = CAM_FUNC_NOTAVAIL; break; } cam_dflags = CAM_DEBUG_NONE; if (cam_dpath != NULL) { oldpath = cam_dpath; cam_dpath = NULL; xpt_free_path(oldpath); } if (start_ccb->cdbg.flags != CAM_DEBUG_NONE) { if (xpt_create_path(&cam_dpath, NULL, start_ccb->ccb_h.path_id, start_ccb->ccb_h.target_id, start_ccb->ccb_h.target_lun) != CAM_REQ_CMP) { start_ccb->ccb_h.status = CAM_RESRC_UNAVAIL; } else { cam_dflags = start_ccb->cdbg.flags; start_ccb->ccb_h.status = CAM_REQ_CMP; xpt_print(cam_dpath, "debugging flags now %x\n", cam_dflags); } } else start_ccb->ccb_h.status = CAM_REQ_CMP; break; } case XPT_NOOP: if ((start_ccb->ccb_h.flags & CAM_DEV_QFREEZE) != 0) xpt_freeze_devq(path, 1); start_ccb->ccb_h.status = CAM_REQ_CMP; break; case XPT_REPROBE_LUN: xpt_async(AC_INQ_CHANGED, path, NULL); start_ccb->ccb_h.status = CAM_REQ_CMP; xpt_done(start_ccb); break; default: case XPT_SDEV_TYPE: case XPT_TERM_IO: case XPT_ENG_INQ: /* XXX Implement */ - printf("%s: CCB type %#x not supported\n", __func__, - start_ccb->ccb_h.func_code); + xpt_print_path(start_ccb->ccb_h.path); + printf("%s: CCB type %#x %s not supported\n", __func__, + start_ccb->ccb_h.func_code, + xpt_action_name(start_ccb->ccb_h.func_code)); start_ccb->ccb_h.status = CAM_PROVIDE_FAIL; if (start_ccb->ccb_h.func_code & XPT_FC_DEV_QUEUED) { xpt_done(start_ccb); } break; } CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_action_default: func= %#x %s status %#x\n", start_ccb->ccb_h.func_code, xpt_action_name(start_ccb->ccb_h.func_code), start_ccb->ccb_h.status)); } void xpt_polled_action(union ccb *start_ccb) { u_int32_t timeout; struct cam_sim *sim; struct cam_devq *devq; struct cam_ed *dev; timeout = start_ccb->ccb_h.timeout * 10; sim = start_ccb->ccb_h.path->bus->sim; devq = sim->devq; dev = start_ccb->ccb_h.path->device; mtx_unlock(&dev->device_mtx); /* * Steal an opening so that no other queued requests * can get it before us while we simulate interrupts. */ mtx_lock(&devq->send_mtx); dev->ccbq.dev_openings--; while((devq->send_openings <= 0 || dev->ccbq.dev_openings < 0) && (--timeout > 0)) { mtx_unlock(&devq->send_mtx); DELAY(100); CAM_SIM_LOCK(sim); (*(sim->sim_poll))(sim); CAM_SIM_UNLOCK(sim); camisr_runqueue(); mtx_lock(&devq->send_mtx); } dev->ccbq.dev_openings++; mtx_unlock(&devq->send_mtx); if (timeout != 0) { xpt_action(start_ccb); while(--timeout > 0) { CAM_SIM_LOCK(sim); (*(sim->sim_poll))(sim); CAM_SIM_UNLOCK(sim); camisr_runqueue(); if ((start_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_INPROG) break; DELAY(100); } if (timeout == 0) { /* * XXX Is it worth adding a sim_timeout entry * point so we can attempt recovery? If * this is only used for dumps, I don't think * it is. */ start_ccb->ccb_h.status = CAM_CMD_TIMEOUT; } } else { start_ccb->ccb_h.status = CAM_RESRC_UNAVAIL; } mtx_lock(&dev->device_mtx); } /* * Schedule a peripheral driver to receive a ccb when its * target device has space for more transactions. */ void xpt_schedule(struct cam_periph *periph, u_int32_t new_priority) { CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("xpt_schedule\n")); cam_periph_assert(periph, MA_OWNED); if (new_priority < periph->scheduled_priority) { periph->scheduled_priority = new_priority; xpt_run_allocq(periph, 0); } } /* * Schedule a device to run on a given queue. * If the device was inserted as a new entry on the queue, * return 1 meaning the device queue should be run. If we * were already queued, implying someone else has already * started the queue, return 0 so the caller doesn't attempt * to run the queue. */ static int xpt_schedule_dev(struct camq *queue, cam_pinfo *pinfo, u_int32_t new_priority) { int retval; u_int32_t old_priority; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("xpt_schedule_dev\n")); old_priority = pinfo->priority; /* * Are we already queued? */ if (pinfo->index != CAM_UNQUEUED_INDEX) { /* Simply reorder based on new priority */ if (new_priority < old_priority) { camq_change_priority(queue, pinfo->index, new_priority); CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("changed priority to %d\n", new_priority)); retval = 1; } else retval = 0; } else { /* New entry on the queue */ if (new_priority < old_priority) pinfo->priority = new_priority; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("Inserting onto queue\n")); pinfo->generation = ++queue->generation; camq_insert(queue, pinfo); retval = 1; } return (retval); } static void xpt_run_allocq_task(void *context, int pending) { struct cam_periph *periph = context; cam_periph_lock(periph); periph->flags &= ~CAM_PERIPH_RUN_TASK; xpt_run_allocq(periph, 1); cam_periph_unlock(periph); cam_periph_release(periph); } static void xpt_run_allocq(struct cam_periph *periph, int sleep) { struct cam_ed *device; union ccb *ccb; uint32_t prio; cam_periph_assert(periph, MA_OWNED); if (periph->periph_allocating) return; periph->periph_allocating = 1; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("xpt_run_allocq(%p)\n", periph)); device = periph->path->device; ccb = NULL; restart: while ((prio = min(periph->scheduled_priority, periph->immediate_priority)) != CAM_PRIORITY_NONE && (periph->periph_allocated - (ccb != NULL ? 1 : 0) < device->ccbq.total_openings || prio <= CAM_PRIORITY_OOB)) { if (ccb == NULL && (ccb = xpt_get_ccb_nowait(periph)) == NULL) { if (sleep) { ccb = xpt_get_ccb(periph); goto restart; } if (periph->flags & CAM_PERIPH_RUN_TASK) break; cam_periph_doacquire(periph); periph->flags |= CAM_PERIPH_RUN_TASK; taskqueue_enqueue(xsoftc.xpt_taskq, &periph->periph_run_task); break; } xpt_setup_ccb(&ccb->ccb_h, periph->path, prio); if (prio == periph->immediate_priority) { periph->immediate_priority = CAM_PRIORITY_NONE; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("waking cam_periph_getccb()\n")); SLIST_INSERT_HEAD(&periph->ccb_list, &ccb->ccb_h, periph_links.sle); wakeup(&periph->ccb_list); } else { periph->scheduled_priority = CAM_PRIORITY_NONE; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("calling periph_start()\n")); periph->periph_start(periph, ccb); } ccb = NULL; } if (ccb != NULL) xpt_release_ccb(ccb); periph->periph_allocating = 0; } static void xpt_run_devq(struct cam_devq *devq) { char cdb_str[(SCSI_MAX_CDBLEN * 3) + 1]; int lock; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("xpt_run_devq\n")); devq->send_queue.qfrozen_cnt++; while ((devq->send_queue.entries > 0) && (devq->send_openings > 0) && (devq->send_queue.qfrozen_cnt <= 1)) { struct cam_ed *device; union ccb *work_ccb; struct cam_sim *sim; device = (struct cam_ed *)camq_remove(&devq->send_queue, CAMQ_HEAD); CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("running device %p\n", device)); work_ccb = cam_ccbq_peek_ccb(&device->ccbq, CAMQ_HEAD); if (work_ccb == NULL) { printf("device on run queue with no ccbs???\n"); continue; } if ((work_ccb->ccb_h.flags & CAM_HIGH_POWER) != 0) { mtx_lock(&xsoftc.xpt_highpower_lock); if (xsoftc.num_highpower <= 0) { /* * We got a high power command, but we * don't have any available slots. Freeze * the device queue until we have a slot * available. */ xpt_freeze_devq_device(device, 1); STAILQ_INSERT_TAIL(&xsoftc.highpowerq, device, highpowerq_entry); mtx_unlock(&xsoftc.xpt_highpower_lock); continue; } else { /* * Consume a high power slot while * this ccb runs. */ xsoftc.num_highpower--; } mtx_unlock(&xsoftc.xpt_highpower_lock); } cam_ccbq_remove_ccb(&device->ccbq, work_ccb); cam_ccbq_send_ccb(&device->ccbq, work_ccb); devq->send_openings--; devq->send_active++; xpt_schedule_devq(devq, device); mtx_unlock(&devq->send_mtx); if ((work_ccb->ccb_h.flags & CAM_DEV_QFREEZE) != 0) { /* * The client wants to freeze the queue * after this CCB is sent. */ xpt_freeze_devq(work_ccb->ccb_h.path, 1); } /* In Target mode, the peripheral driver knows best... */ if (work_ccb->ccb_h.func_code == XPT_SCSI_IO) { if ((device->inq_flags & SID_CmdQue) != 0 && work_ccb->csio.tag_action != CAM_TAG_ACTION_NONE) work_ccb->ccb_h.flags |= CAM_TAG_ACTION_VALID; else /* * Clear this in case of a retried CCB that * failed due to a rejected tag. */ work_ccb->ccb_h.flags &= ~CAM_TAG_ACTION_VALID; } switch (work_ccb->ccb_h.func_code) { case XPT_SCSI_IO: CAM_DEBUG(work_ccb->ccb_h.path, CAM_DEBUG_CDB,("%s. CDB: %s\n", scsi_op_desc(work_ccb->csio.cdb_io.cdb_bytes[0], &device->inq_data), scsi_cdb_string(work_ccb->csio.cdb_io.cdb_bytes, cdb_str, sizeof(cdb_str)))); break; case XPT_ATA_IO: CAM_DEBUG(work_ccb->ccb_h.path, CAM_DEBUG_CDB,("%s. ACB: %s\n", ata_op_string(&work_ccb->ataio.cmd), ata_cmd_string(&work_ccb->ataio.cmd, cdb_str, sizeof(cdb_str)))); break; + case XPT_NVME_IO: + CAM_DEBUG(work_ccb->ccb_h.path, + CAM_DEBUG_CDB,("%s. NCB: %s\n", + nvme_op_string(&work_ccb->nvmeio.cmd), + nvme_cmd_string(&work_ccb->nvmeio.cmd, + cdb_str, sizeof(cdb_str)))); + break; default: break; } /* * Device queues can be shared among multiple SIM instances * that reside on different busses. Use the SIM from the * queued device, rather than the one from the calling bus. */ sim = device->sim; lock = (mtx_owned(sim->mtx) == 0); if (lock) CAM_SIM_LOCK(sim); work_ccb->ccb_h.qos.sim_data = sbinuptime(); // xxx uintprt_t too small 32bit platforms (*(sim->sim_action))(sim, work_ccb); if (lock) CAM_SIM_UNLOCK(sim); mtx_lock(&devq->send_mtx); } devq->send_queue.qfrozen_cnt--; } /* * This function merges stuff from the slave ccb into the master ccb, while * keeping important fields in the master ccb constant. */ void xpt_merge_ccb(union ccb *master_ccb, union ccb *slave_ccb) { /* * Pull fields that are valid for peripheral drivers to set * into the master CCB along with the CCB "payload". */ master_ccb->ccb_h.retry_count = slave_ccb->ccb_h.retry_count; master_ccb->ccb_h.func_code = slave_ccb->ccb_h.func_code; master_ccb->ccb_h.timeout = slave_ccb->ccb_h.timeout; master_ccb->ccb_h.flags = slave_ccb->ccb_h.flags; bcopy(&(&slave_ccb->ccb_h)[1], &(&master_ccb->ccb_h)[1], sizeof(union ccb) - sizeof(struct ccb_hdr)); } void xpt_setup_ccb_flags(struct ccb_hdr *ccb_h, struct cam_path *path, u_int32_t priority, u_int32_t flags) { CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_setup_ccb\n")); ccb_h->pinfo.priority = priority; ccb_h->path = path; ccb_h->path_id = path->bus->path_id; if (path->target) ccb_h->target_id = path->target->target_id; else ccb_h->target_id = CAM_TARGET_WILDCARD; if (path->device) { ccb_h->target_lun = path->device->lun_id; ccb_h->pinfo.generation = ++path->device->ccbq.queue.generation; } else { ccb_h->target_lun = CAM_TARGET_WILDCARD; } ccb_h->pinfo.index = CAM_UNQUEUED_INDEX; ccb_h->flags = flags; ccb_h->xflags = 0; } void xpt_setup_ccb(struct ccb_hdr *ccb_h, struct cam_path *path, u_int32_t priority) { xpt_setup_ccb_flags(ccb_h, path, priority, /*flags*/ 0); } /* Path manipulation functions */ cam_status xpt_create_path(struct cam_path **new_path_ptr, struct cam_periph *perph, path_id_t path_id, target_id_t target_id, lun_id_t lun_id) { struct cam_path *path; cam_status status; path = (struct cam_path *)malloc(sizeof(*path), M_CAMPATH, M_NOWAIT); if (path == NULL) { status = CAM_RESRC_UNAVAIL; return(status); } status = xpt_compile_path(path, perph, path_id, target_id, lun_id); if (status != CAM_REQ_CMP) { free(path, M_CAMPATH); path = NULL; } *new_path_ptr = path; return (status); } cam_status xpt_create_path_unlocked(struct cam_path **new_path_ptr, struct cam_periph *periph, path_id_t path_id, target_id_t target_id, lun_id_t lun_id) { return (xpt_create_path(new_path_ptr, periph, path_id, target_id, lun_id)); } cam_status xpt_compile_path(struct cam_path *new_path, struct cam_periph *perph, path_id_t path_id, target_id_t target_id, lun_id_t lun_id) { struct cam_eb *bus; struct cam_et *target; struct cam_ed *device; cam_status status; status = CAM_REQ_CMP; /* Completed without error */ target = NULL; /* Wildcarded */ device = NULL; /* Wildcarded */ /* * We will potentially modify the EDT, so block interrupts * that may attempt to create cam paths. */ bus = xpt_find_bus(path_id); if (bus == NULL) { status = CAM_PATH_INVALID; } else { xpt_lock_buses(); mtx_lock(&bus->eb_mtx); target = xpt_find_target(bus, target_id); if (target == NULL) { /* Create one */ struct cam_et *new_target; new_target = xpt_alloc_target(bus, target_id); if (new_target == NULL) { status = CAM_RESRC_UNAVAIL; } else { target = new_target; } } xpt_unlock_buses(); if (target != NULL) { device = xpt_find_device(target, lun_id); if (device == NULL) { /* Create one */ struct cam_ed *new_device; new_device = (*(bus->xport->alloc_device))(bus, target, lun_id); if (new_device == NULL) { status = CAM_RESRC_UNAVAIL; } else { device = new_device; } } } mtx_unlock(&bus->eb_mtx); } /* * Only touch the user's data if we are successful. */ if (status == CAM_REQ_CMP) { new_path->periph = perph; new_path->bus = bus; new_path->target = target; new_path->device = device; CAM_DEBUG(new_path, CAM_DEBUG_TRACE, ("xpt_compile_path\n")); } else { if (device != NULL) xpt_release_device(device); if (target != NULL) xpt_release_target(target); if (bus != NULL) xpt_release_bus(bus); } return (status); } cam_status xpt_clone_path(struct cam_path **new_path_ptr, struct cam_path *path) { struct cam_path *new_path; new_path = (struct cam_path *)malloc(sizeof(*path), M_CAMPATH, M_NOWAIT); if (new_path == NULL) return(CAM_RESRC_UNAVAIL); xpt_copy_path(new_path, path); *new_path_ptr = new_path; return (CAM_REQ_CMP); } void xpt_copy_path(struct cam_path *new_path, struct cam_path *path) { *new_path = *path; if (path->bus != NULL) xpt_acquire_bus(path->bus); if (path->target != NULL) xpt_acquire_target(path->target); if (path->device != NULL) xpt_acquire_device(path->device); } void xpt_release_path(struct cam_path *path) { CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_release_path\n")); if (path->device != NULL) { xpt_release_device(path->device); path->device = NULL; } if (path->target != NULL) { xpt_release_target(path->target); path->target = NULL; } if (path->bus != NULL) { xpt_release_bus(path->bus); path->bus = NULL; } } void xpt_free_path(struct cam_path *path) { CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_free_path\n")); xpt_release_path(path); free(path, M_CAMPATH); } void xpt_path_counts(struct cam_path *path, uint32_t *bus_ref, uint32_t *periph_ref, uint32_t *target_ref, uint32_t *device_ref) { xpt_lock_buses(); if (bus_ref) { if (path->bus) *bus_ref = path->bus->refcount; else *bus_ref = 0; } if (periph_ref) { if (path->periph) *periph_ref = path->periph->refcount; else *periph_ref = 0; } xpt_unlock_buses(); if (target_ref) { if (path->target) *target_ref = path->target->refcount; else *target_ref = 0; } if (device_ref) { if (path->device) *device_ref = path->device->refcount; else *device_ref = 0; } } /* * Return -1 for failure, 0 for exact match, 1 for match with wildcards * in path1, 2 for match with wildcards in path2. */ int xpt_path_comp(struct cam_path *path1, struct cam_path *path2) { int retval = 0; if (path1->bus != path2->bus) { if (path1->bus->path_id == CAM_BUS_WILDCARD) retval = 1; else if (path2->bus->path_id == CAM_BUS_WILDCARD) retval = 2; else return (-1); } if (path1->target != path2->target) { if (path1->target->target_id == CAM_TARGET_WILDCARD) { if (retval == 0) retval = 1; } else if (path2->target->target_id == CAM_TARGET_WILDCARD) retval = 2; else return (-1); } if (path1->device != path2->device) { if (path1->device->lun_id == CAM_LUN_WILDCARD) { if (retval == 0) retval = 1; } else if (path2->device->lun_id == CAM_LUN_WILDCARD) retval = 2; else return (-1); } return (retval); } int xpt_path_comp_dev(struct cam_path *path, struct cam_ed *dev) { int retval = 0; if (path->bus != dev->target->bus) { if (path->bus->path_id == CAM_BUS_WILDCARD) retval = 1; else if (dev->target->bus->path_id == CAM_BUS_WILDCARD) retval = 2; else return (-1); } if (path->target != dev->target) { if (path->target->target_id == CAM_TARGET_WILDCARD) { if (retval == 0) retval = 1; } else if (dev->target->target_id == CAM_TARGET_WILDCARD) retval = 2; else return (-1); } if (path->device != dev) { if (path->device->lun_id == CAM_LUN_WILDCARD) { if (retval == 0) retval = 1; } else if (dev->lun_id == CAM_LUN_WILDCARD) retval = 2; else return (-1); } return (retval); } void xpt_print_path(struct cam_path *path) { if (path == NULL) printf("(nopath): "); else { if (path->periph != NULL) printf("(%s%d:", path->periph->periph_name, path->periph->unit_number); else printf("(noperiph:"); if (path->bus != NULL) printf("%s%d:%d:", path->bus->sim->sim_name, path->bus->sim->unit_number, path->bus->sim->bus_id); else printf("nobus:"); if (path->target != NULL) printf("%d:", path->target->target_id); else printf("X:"); if (path->device != NULL) printf("%jx): ", (uintmax_t)path->device->lun_id); else printf("X): "); } } void xpt_print_device(struct cam_ed *device) { if (device == NULL) printf("(nopath): "); else { printf("(noperiph:%s%d:%d:%d:%jx): ", device->sim->sim_name, device->sim->unit_number, device->sim->bus_id, device->target->target_id, (uintmax_t)device->lun_id); } } void xpt_print(struct cam_path *path, const char *fmt, ...) { va_list ap; xpt_print_path(path); va_start(ap, fmt); vprintf(fmt, ap); va_end(ap); } int xpt_path_string(struct cam_path *path, char *str, size_t str_len) { struct sbuf sb; sbuf_new(&sb, str, str_len, 0); if (path == NULL) sbuf_printf(&sb, "(nopath): "); else { if (path->periph != NULL) sbuf_printf(&sb, "(%s%d:", path->periph->periph_name, path->periph->unit_number); else sbuf_printf(&sb, "(noperiph:"); if (path->bus != NULL) sbuf_printf(&sb, "%s%d:%d:", path->bus->sim->sim_name, path->bus->sim->unit_number, path->bus->sim->bus_id); else sbuf_printf(&sb, "nobus:"); if (path->target != NULL) sbuf_printf(&sb, "%d:", path->target->target_id); else sbuf_printf(&sb, "X:"); if (path->device != NULL) sbuf_printf(&sb, "%jx): ", (uintmax_t)path->device->lun_id); else sbuf_printf(&sb, "X): "); } sbuf_finish(&sb); return(sbuf_len(&sb)); } path_id_t xpt_path_path_id(struct cam_path *path) { return(path->bus->path_id); } target_id_t xpt_path_target_id(struct cam_path *path) { if (path->target != NULL) return (path->target->target_id); else return (CAM_TARGET_WILDCARD); } lun_id_t xpt_path_lun_id(struct cam_path *path) { if (path->device != NULL) return (path->device->lun_id); else return (CAM_LUN_WILDCARD); } struct cam_sim * xpt_path_sim(struct cam_path *path) { return (path->bus->sim); } struct cam_periph* xpt_path_periph(struct cam_path *path) { return (path->periph); } /* * Release a CAM control block for the caller. Remit the cost of the structure * to the device referenced by the path. If the this device had no 'credits' * and peripheral drivers have registered async callbacks for this notification * call them now. */ void xpt_release_ccb(union ccb *free_ccb) { struct cam_ed *device; struct cam_periph *periph; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("xpt_release_ccb\n")); xpt_path_assert(free_ccb->ccb_h.path, MA_OWNED); device = free_ccb->ccb_h.path->device; periph = free_ccb->ccb_h.path->periph; xpt_free_ccb(free_ccb); periph->periph_allocated--; cam_ccbq_release_opening(&device->ccbq); xpt_run_allocq(periph, 0); } /* Functions accessed by SIM drivers */ static struct xpt_xport xport_default = { .alloc_device = xpt_alloc_device_default, .action = xpt_action_default, .async = xpt_dev_async_default, }; /* * A sim structure, listing the SIM entry points and instance * identification info is passed to xpt_bus_register to hook the SIM * into the CAM framework. xpt_bus_register creates a cam_eb entry * for this new bus and places it in the array of busses and assigns * it a path_id. The path_id may be influenced by "hard wiring" * information specified by the user. Once interrupt services are * available, the bus will be probed. */ int32_t xpt_bus_register(struct cam_sim *sim, device_t parent, u_int32_t bus) { struct cam_eb *new_bus; struct cam_eb *old_bus; struct ccb_pathinq cpi; struct cam_path *path; cam_status status; mtx_assert(sim->mtx, MA_OWNED); sim->bus_id = bus; new_bus = (struct cam_eb *)malloc(sizeof(*new_bus), M_CAMXPT, M_NOWAIT|M_ZERO); if (new_bus == NULL) { /* Couldn't satisfy request */ return (CAM_RESRC_UNAVAIL); } mtx_init(&new_bus->eb_mtx, "CAM bus lock", NULL, MTX_DEF); TAILQ_INIT(&new_bus->et_entries); cam_sim_hold(sim); new_bus->sim = sim; timevalclear(&new_bus->last_reset); new_bus->flags = 0; new_bus->refcount = 1; /* Held until a bus_deregister event */ new_bus->generation = 0; xpt_lock_buses(); sim->path_id = new_bus->path_id = xptpathid(sim->sim_name, sim->unit_number, sim->bus_id); old_bus = TAILQ_FIRST(&xsoftc.xpt_busses); while (old_bus != NULL && old_bus->path_id < new_bus->path_id) old_bus = TAILQ_NEXT(old_bus, links); if (old_bus != NULL) TAILQ_INSERT_BEFORE(old_bus, new_bus, links); else TAILQ_INSERT_TAIL(&xsoftc.xpt_busses, new_bus, links); xsoftc.bus_generation++; xpt_unlock_buses(); /* * Set a default transport so that a PATH_INQ can be issued to * the SIM. This will then allow for probing and attaching of * a more appropriate transport. */ new_bus->xport = &xport_default; status = xpt_create_path(&path, /*periph*/NULL, sim->path_id, CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD); if (status != CAM_REQ_CMP) { xpt_release_bus(new_bus); free(path, M_CAMXPT); return (CAM_RESRC_UNAVAIL); } xpt_setup_ccb(&cpi.ccb_h, path, CAM_PRIORITY_NORMAL); cpi.ccb_h.func_code = XPT_PATH_INQ; xpt_action((union ccb *)&cpi); if (cpi.ccb_h.status == CAM_REQ_CMP) { switch (cpi.transport) { case XPORT_SPI: case XPORT_SAS: case XPORT_FC: case XPORT_USB: case XPORT_ISCSI: case XPORT_SRP: case XPORT_PPB: new_bus->xport = scsi_get_xport(); break; case XPORT_ATA: case XPORT_SATA: new_bus->xport = ata_get_xport(); + break; + case XPORT_NVME: + new_bus->xport = nvme_get_xport(); break; default: new_bus->xport = &xport_default; break; } } /* Notify interested parties */ if (sim->path_id != CAM_XPT_PATH_ID) { xpt_async(AC_PATH_REGISTERED, path, &cpi); if ((cpi.hba_misc & PIM_NOSCAN) == 0) { union ccb *scan_ccb; /* Initiate bus rescan. */ scan_ccb = xpt_alloc_ccb_nowait(); if (scan_ccb != NULL) { scan_ccb->ccb_h.path = path; scan_ccb->ccb_h.func_code = XPT_SCAN_BUS; scan_ccb->crcn.flags = 0; xpt_rescan(scan_ccb); } else { xpt_print(path, "Can't allocate CCB to scan bus\n"); xpt_free_path(path); } } else xpt_free_path(path); } else xpt_free_path(path); return (CAM_SUCCESS); } int32_t xpt_bus_deregister(path_id_t pathid) { struct cam_path bus_path; cam_status status; status = xpt_compile_path(&bus_path, NULL, pathid, CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD); if (status != CAM_REQ_CMP) return (status); xpt_async(AC_LOST_DEVICE, &bus_path, NULL); xpt_async(AC_PATH_DEREGISTERED, &bus_path, NULL); /* Release the reference count held while registered. */ xpt_release_bus(bus_path.bus); xpt_release_path(&bus_path); return (CAM_REQ_CMP); } static path_id_t xptnextfreepathid(void) { struct cam_eb *bus; path_id_t pathid; const char *strval; mtx_assert(&xsoftc.xpt_topo_lock, MA_OWNED); pathid = 0; bus = TAILQ_FIRST(&xsoftc.xpt_busses); retry: /* Find an unoccupied pathid */ while (bus != NULL && bus->path_id <= pathid) { if (bus->path_id == pathid) pathid++; bus = TAILQ_NEXT(bus, links); } /* * Ensure that this pathid is not reserved for * a bus that may be registered in the future. */ if (resource_string_value("scbus", pathid, "at", &strval) == 0) { ++pathid; /* Start the search over */ goto retry; } return (pathid); } static path_id_t xptpathid(const char *sim_name, int sim_unit, int sim_bus) { path_id_t pathid; int i, dunit, val; char buf[32]; const char *dname; pathid = CAM_XPT_PATH_ID; snprintf(buf, sizeof(buf), "%s%d", sim_name, sim_unit); if (strcmp(buf, "xpt0") == 0 && sim_bus == 0) return (pathid); i = 0; while ((resource_find_match(&i, &dname, &dunit, "at", buf)) == 0) { if (strcmp(dname, "scbus")) { /* Avoid a bit of foot shooting. */ continue; } if (dunit < 0) /* unwired?! */ continue; if (resource_int_value("scbus", dunit, "bus", &val) == 0) { if (sim_bus == val) { pathid = dunit; break; } } else if (sim_bus == 0) { /* Unspecified matches bus 0 */ pathid = dunit; break; } else { printf("Ambiguous scbus configuration for %s%d " "bus %d, cannot wire down. The kernel " "config entry for scbus%d should " "specify a controller bus.\n" "Scbus will be assigned dynamically.\n", sim_name, sim_unit, sim_bus, dunit); break; } } if (pathid == CAM_XPT_PATH_ID) pathid = xptnextfreepathid(); return (pathid); } static const char * xpt_async_string(u_int32_t async_code) { switch (async_code) { case AC_BUS_RESET: return ("AC_BUS_RESET"); case AC_UNSOL_RESEL: return ("AC_UNSOL_RESEL"); case AC_SCSI_AEN: return ("AC_SCSI_AEN"); case AC_SENT_BDR: return ("AC_SENT_BDR"); case AC_PATH_REGISTERED: return ("AC_PATH_REGISTERED"); case AC_PATH_DEREGISTERED: return ("AC_PATH_DEREGISTERED"); case AC_FOUND_DEVICE: return ("AC_FOUND_DEVICE"); case AC_LOST_DEVICE: return ("AC_LOST_DEVICE"); case AC_TRANSFER_NEG: return ("AC_TRANSFER_NEG"); case AC_INQ_CHANGED: return ("AC_INQ_CHANGED"); case AC_GETDEV_CHANGED: return ("AC_GETDEV_CHANGED"); case AC_CONTRACT: return ("AC_CONTRACT"); case AC_ADVINFO_CHANGED: return ("AC_ADVINFO_CHANGED"); case AC_UNIT_ATTENTION: return ("AC_UNIT_ATTENTION"); } return ("AC_UNKNOWN"); } static int xpt_async_size(u_int32_t async_code) { switch (async_code) { case AC_BUS_RESET: return (0); case AC_UNSOL_RESEL: return (0); case AC_SCSI_AEN: return (0); case AC_SENT_BDR: return (0); case AC_PATH_REGISTERED: return (sizeof(struct ccb_pathinq)); case AC_PATH_DEREGISTERED: return (0); case AC_FOUND_DEVICE: return (sizeof(struct ccb_getdev)); case AC_LOST_DEVICE: return (0); case AC_TRANSFER_NEG: return (sizeof(struct ccb_trans_settings)); case AC_INQ_CHANGED: return (0); case AC_GETDEV_CHANGED: return (0); case AC_CONTRACT: return (sizeof(struct ac_contract)); case AC_ADVINFO_CHANGED: return (-1); case AC_UNIT_ATTENTION: return (sizeof(struct ccb_scsiio)); } return (0); } static int xpt_async_process_dev(struct cam_ed *device, void *arg) { union ccb *ccb = arg; struct cam_path *path = ccb->ccb_h.path; void *async_arg = ccb->casync.async_arg_ptr; u_int32_t async_code = ccb->casync.async_code; int relock; if (path->device != device && path->device->lun_id != CAM_LUN_WILDCARD && device->lun_id != CAM_LUN_WILDCARD) return (1); /* * The async callback could free the device. * If it is a broadcast async, it doesn't hold * device reference, so take our own reference. */ xpt_acquire_device(device); /* * If async for specific device is to be delivered to * the wildcard client, take the specific device lock. * XXX: We may need a way for client to specify it. */ if ((device->lun_id == CAM_LUN_WILDCARD && path->device->lun_id != CAM_LUN_WILDCARD) || (device->target->target_id == CAM_TARGET_WILDCARD && path->target->target_id != CAM_TARGET_WILDCARD) || (device->target->bus->path_id == CAM_BUS_WILDCARD && path->target->bus->path_id != CAM_BUS_WILDCARD)) { mtx_unlock(&device->device_mtx); xpt_path_lock(path); relock = 1; } else relock = 0; (*(device->target->bus->xport->async))(async_code, device->target->bus, device->target, device, async_arg); xpt_async_bcast(&device->asyncs, async_code, path, async_arg); if (relock) { xpt_path_unlock(path); mtx_lock(&device->device_mtx); } xpt_release_device(device); return (1); } static int xpt_async_process_tgt(struct cam_et *target, void *arg) { union ccb *ccb = arg; struct cam_path *path = ccb->ccb_h.path; if (path->target != target && path->target->target_id != CAM_TARGET_WILDCARD && target->target_id != CAM_TARGET_WILDCARD) return (1); if (ccb->casync.async_code == AC_SENT_BDR) { /* Update our notion of when the last reset occurred */ microtime(&target->last_reset); } return (xptdevicetraverse(target, NULL, xpt_async_process_dev, ccb)); } static void xpt_async_process(struct cam_periph *periph, union ccb *ccb) { struct cam_eb *bus; struct cam_path *path; void *async_arg; u_int32_t async_code; path = ccb->ccb_h.path; async_code = ccb->casync.async_code; async_arg = ccb->casync.async_arg_ptr; CAM_DEBUG(path, CAM_DEBUG_TRACE | CAM_DEBUG_INFO, ("xpt_async(%s)\n", xpt_async_string(async_code))); bus = path->bus; if (async_code == AC_BUS_RESET) { /* Update our notion of when the last reset occurred */ microtime(&bus->last_reset); } xpttargettraverse(bus, NULL, xpt_async_process_tgt, ccb); /* * If this wasn't a fully wildcarded async, tell all * clients that want all async events. */ if (bus != xpt_periph->path->bus) { xpt_path_lock(xpt_periph->path); xpt_async_process_dev(xpt_periph->path->device, ccb); xpt_path_unlock(xpt_periph->path); } if (path->device != NULL && path->device->lun_id != CAM_LUN_WILDCARD) xpt_release_devq(path, 1, TRUE); else xpt_release_simq(path->bus->sim, TRUE); if (ccb->casync.async_arg_size > 0) free(async_arg, M_CAMXPT); xpt_free_path(path); xpt_free_ccb(ccb); } static void xpt_async_bcast(struct async_list *async_head, u_int32_t async_code, struct cam_path *path, void *async_arg) { struct async_node *cur_entry; int lock; cur_entry = SLIST_FIRST(async_head); while (cur_entry != NULL) { struct async_node *next_entry; /* * Grab the next list entry before we call the current * entry's callback. This is because the callback function * can delete its async callback entry. */ next_entry = SLIST_NEXT(cur_entry, links); if ((cur_entry->event_enable & async_code) != 0) { lock = cur_entry->event_lock; if (lock) CAM_SIM_LOCK(path->device->sim); cur_entry->callback(cur_entry->callback_arg, async_code, path, async_arg); if (lock) CAM_SIM_UNLOCK(path->device->sim); } cur_entry = next_entry; } } void xpt_async(u_int32_t async_code, struct cam_path *path, void *async_arg) { union ccb *ccb; int size; ccb = xpt_alloc_ccb_nowait(); if (ccb == NULL) { xpt_print(path, "Can't allocate CCB to send %s\n", xpt_async_string(async_code)); return; } if (xpt_clone_path(&ccb->ccb_h.path, path) != CAM_REQ_CMP) { xpt_print(path, "Can't allocate path to send %s\n", xpt_async_string(async_code)); xpt_free_ccb(ccb); return; } ccb->ccb_h.path->periph = NULL; ccb->ccb_h.func_code = XPT_ASYNC; ccb->ccb_h.cbfcnp = xpt_async_process; ccb->ccb_h.flags |= CAM_UNLOCKED; ccb->casync.async_code = async_code; ccb->casync.async_arg_size = 0; size = xpt_async_size(async_code); CAM_DEBUG(ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xpt_async: func %#x %s aync_code %d %s\n", ccb->ccb_h.func_code, xpt_action_name(ccb->ccb_h.func_code), async_code, xpt_async_string(async_code))); if (size > 0 && async_arg != NULL) { ccb->casync.async_arg_ptr = malloc(size, M_CAMXPT, M_NOWAIT); if (ccb->casync.async_arg_ptr == NULL) { xpt_print(path, "Can't allocate argument to send %s\n", xpt_async_string(async_code)); xpt_free_path(ccb->ccb_h.path); xpt_free_ccb(ccb); return; } memcpy(ccb->casync.async_arg_ptr, async_arg, size); ccb->casync.async_arg_size = size; } else if (size < 0) { ccb->casync.async_arg_ptr = async_arg; ccb->casync.async_arg_size = size; } if (path->device != NULL && path->device->lun_id != CAM_LUN_WILDCARD) xpt_freeze_devq(path, 1); else xpt_freeze_simq(path->bus->sim, 1); xpt_done(ccb); } static void xpt_dev_async_default(u_int32_t async_code, struct cam_eb *bus, struct cam_et *target, struct cam_ed *device, void *async_arg) { /* * We only need to handle events for real devices. */ if (target->target_id == CAM_TARGET_WILDCARD || device->lun_id == CAM_LUN_WILDCARD) return; printf("%s called\n", __func__); } static uint32_t xpt_freeze_devq_device(struct cam_ed *dev, u_int count) { struct cam_devq *devq; uint32_t freeze; devq = dev->sim->devq; mtx_assert(&devq->send_mtx, MA_OWNED); CAM_DEBUG_DEV(dev, CAM_DEBUG_TRACE, ("xpt_freeze_devq_device(%d) %u->%u\n", count, dev->ccbq.queue.qfrozen_cnt, dev->ccbq.queue.qfrozen_cnt + count)); freeze = (dev->ccbq.queue.qfrozen_cnt += count); /* Remove frozen device from sendq. */ if (device_is_queued(dev)) camq_remove(&devq->send_queue, dev->devq_entry.index); return (freeze); } u_int32_t xpt_freeze_devq(struct cam_path *path, u_int count) { struct cam_ed *dev = path->device; struct cam_devq *devq; uint32_t freeze; devq = dev->sim->devq; mtx_lock(&devq->send_mtx); CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_freeze_devq(%d)\n", count)); freeze = xpt_freeze_devq_device(dev, count); mtx_unlock(&devq->send_mtx); return (freeze); } u_int32_t xpt_freeze_simq(struct cam_sim *sim, u_int count) { struct cam_devq *devq; uint32_t freeze; devq = sim->devq; mtx_lock(&devq->send_mtx); freeze = (devq->send_queue.qfrozen_cnt += count); mtx_unlock(&devq->send_mtx); return (freeze); } static void xpt_release_devq_timeout(void *arg) { struct cam_ed *dev; struct cam_devq *devq; dev = (struct cam_ed *)arg; CAM_DEBUG_DEV(dev, CAM_DEBUG_TRACE, ("xpt_release_devq_timeout\n")); devq = dev->sim->devq; mtx_assert(&devq->send_mtx, MA_OWNED); if (xpt_release_devq_device(dev, /*count*/1, /*run_queue*/TRUE)) xpt_run_devq(devq); } void xpt_release_devq(struct cam_path *path, u_int count, int run_queue) { struct cam_ed *dev; struct cam_devq *devq; CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_release_devq(%d, %d)\n", count, run_queue)); dev = path->device; devq = dev->sim->devq; mtx_lock(&devq->send_mtx); if (xpt_release_devq_device(dev, count, run_queue)) xpt_run_devq(dev->sim->devq); mtx_unlock(&devq->send_mtx); } static int xpt_release_devq_device(struct cam_ed *dev, u_int count, int run_queue) { mtx_assert(&dev->sim->devq->send_mtx, MA_OWNED); CAM_DEBUG_DEV(dev, CAM_DEBUG_TRACE, ("xpt_release_devq_device(%d, %d) %u->%u\n", count, run_queue, dev->ccbq.queue.qfrozen_cnt, dev->ccbq.queue.qfrozen_cnt - count)); if (count > dev->ccbq.queue.qfrozen_cnt) { #ifdef INVARIANTS printf("xpt_release_devq(): requested %u > present %u\n", count, dev->ccbq.queue.qfrozen_cnt); #endif count = dev->ccbq.queue.qfrozen_cnt; } dev->ccbq.queue.qfrozen_cnt -= count; if (dev->ccbq.queue.qfrozen_cnt == 0) { /* * No longer need to wait for a successful * command completion. */ dev->flags &= ~CAM_DEV_REL_ON_COMPLETE; /* * Remove any timeouts that might be scheduled * to release this queue. */ if ((dev->flags & CAM_DEV_REL_TIMEOUT_PENDING) != 0) { callout_stop(&dev->callout); dev->flags &= ~CAM_DEV_REL_TIMEOUT_PENDING; } /* * Now that we are unfrozen schedule the * device so any pending transactions are * run. */ xpt_schedule_devq(dev->sim->devq, dev); } else run_queue = 0; return (run_queue); } void xpt_release_simq(struct cam_sim *sim, int run_queue) { struct cam_devq *devq; devq = sim->devq; mtx_lock(&devq->send_mtx); if (devq->send_queue.qfrozen_cnt <= 0) { #ifdef INVARIANTS printf("xpt_release_simq: requested 1 > present %u\n", devq->send_queue.qfrozen_cnt); #endif } else devq->send_queue.qfrozen_cnt--; if (devq->send_queue.qfrozen_cnt == 0) { /* * If there is a timeout scheduled to release this * sim queue, remove it. The queue frozen count is * already at 0. */ if ((sim->flags & CAM_SIM_REL_TIMEOUT_PENDING) != 0){ callout_stop(&sim->callout); sim->flags &= ~CAM_SIM_REL_TIMEOUT_PENDING; } if (run_queue) { /* * Now that we are unfrozen run the send queue. */ xpt_run_devq(sim->devq); } } mtx_unlock(&devq->send_mtx); } /* * XXX Appears to be unused. */ static void xpt_release_simq_timeout(void *arg) { struct cam_sim *sim; sim = (struct cam_sim *)arg; xpt_release_simq(sim, /* run_queue */ TRUE); } void xpt_done(union ccb *done_ccb) { struct cam_doneq *queue; int run, hash; CAM_DEBUG(done_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xpt_done: func= %#x %s status %#x\n", done_ccb->ccb_h.func_code, xpt_action_name(done_ccb->ccb_h.func_code), done_ccb->ccb_h.status)); if ((done_ccb->ccb_h.func_code & XPT_FC_QUEUED) == 0) return; /* Store the time the ccb was in the sim */ done_ccb->ccb_h.qos.sim_data = sbinuptime() - done_ccb->ccb_h.qos.sim_data; hash = (done_ccb->ccb_h.path_id + done_ccb->ccb_h.target_id + done_ccb->ccb_h.target_lun) % cam_num_doneqs; queue = &cam_doneqs[hash]; mtx_lock(&queue->cam_doneq_mtx); run = (queue->cam_doneq_sleep && STAILQ_EMPTY(&queue->cam_doneq)); STAILQ_INSERT_TAIL(&queue->cam_doneq, &done_ccb->ccb_h, sim_links.stqe); done_ccb->ccb_h.pinfo.index = CAM_DONEQ_INDEX; mtx_unlock(&queue->cam_doneq_mtx); if (run) wakeup(&queue->cam_doneq); } void xpt_done_direct(union ccb *done_ccb) { CAM_DEBUG(done_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xpt_done_direct: status %#x\n", done_ccb->ccb_h.status)); if ((done_ccb->ccb_h.func_code & XPT_FC_QUEUED) == 0) return; /* Store the time the ccb was in the sim */ done_ccb->ccb_h.qos.sim_data = sbinuptime() - done_ccb->ccb_h.qos.sim_data; xpt_done_process(&done_ccb->ccb_h); } union ccb * xpt_alloc_ccb() { union ccb *new_ccb; new_ccb = malloc(sizeof(*new_ccb), M_CAMCCB, M_ZERO|M_WAITOK); return (new_ccb); } union ccb * xpt_alloc_ccb_nowait() { union ccb *new_ccb; new_ccb = malloc(sizeof(*new_ccb), M_CAMCCB, M_ZERO|M_NOWAIT); return (new_ccb); } void xpt_free_ccb(union ccb *free_ccb) { free(free_ccb, M_CAMCCB); } /* Private XPT functions */ /* * Get a CAM control block for the caller. Charge the structure to the device * referenced by the path. If we don't have sufficient resources to allocate * more ccbs, we return NULL. */ static union ccb * xpt_get_ccb_nowait(struct cam_periph *periph) { union ccb *new_ccb; new_ccb = malloc(sizeof(*new_ccb), M_CAMCCB, M_ZERO|M_NOWAIT); if (new_ccb == NULL) return (NULL); periph->periph_allocated++; cam_ccbq_take_opening(&periph->path->device->ccbq); return (new_ccb); } static union ccb * xpt_get_ccb(struct cam_periph *periph) { union ccb *new_ccb; cam_periph_unlock(periph); new_ccb = malloc(sizeof(*new_ccb), M_CAMCCB, M_ZERO|M_WAITOK); cam_periph_lock(periph); periph->periph_allocated++; cam_ccbq_take_opening(&periph->path->device->ccbq); return (new_ccb); } union ccb * cam_periph_getccb(struct cam_periph *periph, u_int32_t priority) { struct ccb_hdr *ccb_h; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("cam_periph_getccb\n")); cam_periph_assert(periph, MA_OWNED); while ((ccb_h = SLIST_FIRST(&periph->ccb_list)) == NULL || ccb_h->pinfo.priority != priority) { if (priority < periph->immediate_priority) { periph->immediate_priority = priority; xpt_run_allocq(periph, 0); } else cam_periph_sleep(periph, &periph->ccb_list, PRIBIO, "cgticb", 0); } SLIST_REMOVE_HEAD(&periph->ccb_list, periph_links.sle); return ((union ccb *)ccb_h); } static void xpt_acquire_bus(struct cam_eb *bus) { xpt_lock_buses(); bus->refcount++; xpt_unlock_buses(); } static void xpt_release_bus(struct cam_eb *bus) { xpt_lock_buses(); KASSERT(bus->refcount >= 1, ("bus->refcount >= 1")); if (--bus->refcount > 0) { xpt_unlock_buses(); return; } TAILQ_REMOVE(&xsoftc.xpt_busses, bus, links); xsoftc.bus_generation++; xpt_unlock_buses(); KASSERT(TAILQ_EMPTY(&bus->et_entries), ("destroying bus, but target list is not empty")); cam_sim_release(bus->sim); mtx_destroy(&bus->eb_mtx); free(bus, M_CAMXPT); } static struct cam_et * xpt_alloc_target(struct cam_eb *bus, target_id_t target_id) { struct cam_et *cur_target, *target; mtx_assert(&xsoftc.xpt_topo_lock, MA_OWNED); mtx_assert(&bus->eb_mtx, MA_OWNED); target = (struct cam_et *)malloc(sizeof(*target), M_CAMXPT, M_NOWAIT|M_ZERO); if (target == NULL) return (NULL); TAILQ_INIT(&target->ed_entries); target->bus = bus; target->target_id = target_id; target->refcount = 1; target->generation = 0; target->luns = NULL; mtx_init(&target->luns_mtx, "CAM LUNs lock", NULL, MTX_DEF); timevalclear(&target->last_reset); /* * Hold a reference to our parent bus so it * will not go away before we do. */ bus->refcount++; /* Insertion sort into our bus's target list */ cur_target = TAILQ_FIRST(&bus->et_entries); while (cur_target != NULL && cur_target->target_id < target_id) cur_target = TAILQ_NEXT(cur_target, links); if (cur_target != NULL) { TAILQ_INSERT_BEFORE(cur_target, target, links); } else { TAILQ_INSERT_TAIL(&bus->et_entries, target, links); } bus->generation++; return (target); } static void xpt_acquire_target(struct cam_et *target) { struct cam_eb *bus = target->bus; mtx_lock(&bus->eb_mtx); target->refcount++; mtx_unlock(&bus->eb_mtx); } static void xpt_release_target(struct cam_et *target) { struct cam_eb *bus = target->bus; mtx_lock(&bus->eb_mtx); if (--target->refcount > 0) { mtx_unlock(&bus->eb_mtx); return; } TAILQ_REMOVE(&bus->et_entries, target, links); bus->generation++; mtx_unlock(&bus->eb_mtx); KASSERT(TAILQ_EMPTY(&target->ed_entries), ("destroying target, but device list is not empty")); xpt_release_bus(bus); mtx_destroy(&target->luns_mtx); if (target->luns) free(target->luns, M_CAMXPT); free(target, M_CAMXPT); } static struct cam_ed * xpt_alloc_device_default(struct cam_eb *bus, struct cam_et *target, lun_id_t lun_id) { struct cam_ed *device; device = xpt_alloc_device(bus, target, lun_id); if (device == NULL) return (NULL); device->mintags = 1; device->maxtags = 1; return (device); } static void xpt_destroy_device(void *context, int pending) { struct cam_ed *device = context; mtx_lock(&device->device_mtx); mtx_destroy(&device->device_mtx); free(device, M_CAMDEV); } struct cam_ed * xpt_alloc_device(struct cam_eb *bus, struct cam_et *target, lun_id_t lun_id) { struct cam_ed *cur_device, *device; struct cam_devq *devq; cam_status status; mtx_assert(&bus->eb_mtx, MA_OWNED); /* Make space for us in the device queue on our bus */ devq = bus->sim->devq; mtx_lock(&devq->send_mtx); status = cam_devq_resize(devq, devq->send_queue.array_size + 1); mtx_unlock(&devq->send_mtx); if (status != CAM_REQ_CMP) return (NULL); device = (struct cam_ed *)malloc(sizeof(*device), M_CAMDEV, M_NOWAIT|M_ZERO); if (device == NULL) return (NULL); cam_init_pinfo(&device->devq_entry); device->target = target; device->lun_id = lun_id; device->sim = bus->sim; if (cam_ccbq_init(&device->ccbq, bus->sim->max_dev_openings) != 0) { free(device, M_CAMDEV); return (NULL); } SLIST_INIT(&device->asyncs); SLIST_INIT(&device->periphs); device->generation = 0; device->flags = CAM_DEV_UNCONFIGURED; device->tag_delay_count = 0; device->tag_saved_openings = 0; device->refcount = 1; mtx_init(&device->device_mtx, "CAM device lock", NULL, MTX_DEF); callout_init_mtx(&device->callout, &devq->send_mtx, 0); TASK_INIT(&device->device_destroy_task, 0, xpt_destroy_device, device); /* * Hold a reference to our parent bus so it * will not go away before we do. */ target->refcount++; cur_device = TAILQ_FIRST(&target->ed_entries); while (cur_device != NULL && cur_device->lun_id < lun_id) cur_device = TAILQ_NEXT(cur_device, links); if (cur_device != NULL) TAILQ_INSERT_BEFORE(cur_device, device, links); else TAILQ_INSERT_TAIL(&target->ed_entries, device, links); target->generation++; return (device); } void xpt_acquire_device(struct cam_ed *device) { struct cam_eb *bus = device->target->bus; mtx_lock(&bus->eb_mtx); device->refcount++; mtx_unlock(&bus->eb_mtx); } void xpt_release_device(struct cam_ed *device) { struct cam_eb *bus = device->target->bus; struct cam_devq *devq; mtx_lock(&bus->eb_mtx); if (--device->refcount > 0) { mtx_unlock(&bus->eb_mtx); return; } TAILQ_REMOVE(&device->target->ed_entries, device,links); device->target->generation++; mtx_unlock(&bus->eb_mtx); /* Release our slot in the devq */ devq = bus->sim->devq; mtx_lock(&devq->send_mtx); cam_devq_resize(devq, devq->send_queue.array_size - 1); mtx_unlock(&devq->send_mtx); KASSERT(SLIST_EMPTY(&device->periphs), ("destroying device, but periphs list is not empty")); KASSERT(device->devq_entry.index == CAM_UNQUEUED_INDEX, ("destroying device while still queued for ccbs")); if ((device->flags & CAM_DEV_REL_TIMEOUT_PENDING) != 0) callout_stop(&device->callout); xpt_release_target(device->target); cam_ccbq_fini(&device->ccbq); /* * Free allocated memory. free(9) does nothing if the * supplied pointer is NULL, so it is safe to call without * checking. */ free(device->supported_vpds, M_CAMXPT); free(device->device_id, M_CAMXPT); free(device->ext_inq, M_CAMXPT); free(device->physpath, M_CAMXPT); free(device->rcap_buf, M_CAMXPT); free(device->serial_num, M_CAMXPT); taskqueue_enqueue(xsoftc.xpt_taskq, &device->device_destroy_task); } u_int32_t xpt_dev_ccbq_resize(struct cam_path *path, int newopenings) { int result; struct cam_ed *dev; dev = path->device; mtx_lock(&dev->sim->devq->send_mtx); result = cam_ccbq_resize(&dev->ccbq, newopenings); mtx_unlock(&dev->sim->devq->send_mtx); if ((dev->flags & CAM_DEV_TAG_AFTER_COUNT) != 0 || (dev->inq_flags & SID_CmdQue) != 0) dev->tag_saved_openings = newopenings; return (result); } static struct cam_eb * xpt_find_bus(path_id_t path_id) { struct cam_eb *bus; xpt_lock_buses(); for (bus = TAILQ_FIRST(&xsoftc.xpt_busses); bus != NULL; bus = TAILQ_NEXT(bus, links)) { if (bus->path_id == path_id) { bus->refcount++; break; } } xpt_unlock_buses(); return (bus); } static struct cam_et * xpt_find_target(struct cam_eb *bus, target_id_t target_id) { struct cam_et *target; mtx_assert(&bus->eb_mtx, MA_OWNED); for (target = TAILQ_FIRST(&bus->et_entries); target != NULL; target = TAILQ_NEXT(target, links)) { if (target->target_id == target_id) { target->refcount++; break; } } return (target); } static struct cam_ed * xpt_find_device(struct cam_et *target, lun_id_t lun_id) { struct cam_ed *device; mtx_assert(&target->bus->eb_mtx, MA_OWNED); for (device = TAILQ_FIRST(&target->ed_entries); device != NULL; device = TAILQ_NEXT(device, links)) { if (device->lun_id == lun_id) { device->refcount++; break; } } return (device); } void xpt_start_tags(struct cam_path *path) { struct ccb_relsim crs; struct cam_ed *device; struct cam_sim *sim; int newopenings; device = path->device; sim = path->bus->sim; device->flags &= ~CAM_DEV_TAG_AFTER_COUNT; xpt_freeze_devq(path, /*count*/1); device->inq_flags |= SID_CmdQue; if (device->tag_saved_openings != 0) newopenings = device->tag_saved_openings; else newopenings = min(device->maxtags, sim->max_tagged_dev_openings); xpt_dev_ccbq_resize(path, newopenings); xpt_async(AC_GETDEV_CHANGED, path, NULL); xpt_setup_ccb(&crs.ccb_h, path, CAM_PRIORITY_NORMAL); crs.ccb_h.func_code = XPT_REL_SIMQ; crs.release_flags = RELSIM_RELEASE_AFTER_QEMPTY; crs.openings = crs.release_timeout = crs.qfrozen_cnt = 0; xpt_action((union ccb *)&crs); } void xpt_stop_tags(struct cam_path *path) { struct ccb_relsim crs; struct cam_ed *device; struct cam_sim *sim; device = path->device; sim = path->bus->sim; device->flags &= ~CAM_DEV_TAG_AFTER_COUNT; device->tag_delay_count = 0; xpt_freeze_devq(path, /*count*/1); device->inq_flags &= ~SID_CmdQue; xpt_dev_ccbq_resize(path, sim->max_dev_openings); xpt_async(AC_GETDEV_CHANGED, path, NULL); xpt_setup_ccb(&crs.ccb_h, path, CAM_PRIORITY_NORMAL); crs.ccb_h.func_code = XPT_REL_SIMQ; crs.release_flags = RELSIM_RELEASE_AFTER_QEMPTY; crs.openings = crs.release_timeout = crs.qfrozen_cnt = 0; xpt_action((union ccb *)&crs); } static void xpt_boot_delay(void *arg) { xpt_release_boot(); } static void xpt_config(void *arg) { /* * Now that interrupts are enabled, go find our devices */ if (taskqueue_start_threads(&xsoftc.xpt_taskq, 1, PRIBIO, "CAM taskq")) printf("xpt_config: failed to create taskqueue thread.\n"); /* Setup debugging path */ if (cam_dflags != CAM_DEBUG_NONE) { if (xpt_create_path(&cam_dpath, NULL, CAM_DEBUG_BUS, CAM_DEBUG_TARGET, CAM_DEBUG_LUN) != CAM_REQ_CMP) { printf("xpt_config: xpt_create_path() failed for debug" " target %d:%d:%d, debugging disabled\n", CAM_DEBUG_BUS, CAM_DEBUG_TARGET, CAM_DEBUG_LUN); cam_dflags = CAM_DEBUG_NONE; } } else cam_dpath = NULL; periphdriver_init(1); xpt_hold_boot(); callout_init(&xsoftc.boot_callout, 1); callout_reset_sbt(&xsoftc.boot_callout, SBT_1MS * xsoftc.boot_delay, 0, xpt_boot_delay, NULL, 0); /* Fire up rescan thread. */ if (kproc_kthread_add(xpt_scanner_thread, NULL, &cam_proc, NULL, 0, 0, "cam", "scanner")) { printf("xpt_config: failed to create rescan thread.\n"); } } void xpt_hold_boot(void) { xpt_lock_buses(); xsoftc.buses_to_config++; xpt_unlock_buses(); } void xpt_release_boot(void) { xpt_lock_buses(); xsoftc.buses_to_config--; if (xsoftc.buses_to_config == 0 && xsoftc.buses_config_done == 0) { struct xpt_task *task; xsoftc.buses_config_done = 1; xpt_unlock_buses(); /* Call manually because we don't have any busses */ task = malloc(sizeof(struct xpt_task), M_CAMXPT, M_NOWAIT); if (task != NULL) { TASK_INIT(&task->task, 0, xpt_finishconfig_task, task); taskqueue_enqueue(taskqueue_thread, &task->task); } } else xpt_unlock_buses(); } /* * If the given device only has one peripheral attached to it, and if that * peripheral is the passthrough driver, announce it. This insures that the * user sees some sort of announcement for every peripheral in their system. */ static int xptpassannouncefunc(struct cam_ed *device, void *arg) { struct cam_periph *periph; int i; for (periph = SLIST_FIRST(&device->periphs), i = 0; periph != NULL; periph = SLIST_NEXT(periph, periph_links), i++); periph = SLIST_FIRST(&device->periphs); if ((i == 1) && (strncmp(periph->periph_name, "pass", 4) == 0)) xpt_announce_periph(periph, NULL); return(1); } static void xpt_finishconfig_task(void *context, int pending) { periphdriver_init(2); /* * Check for devices with no "standard" peripheral driver * attached. For any devices like that, announce the * passthrough driver so the user will see something. */ if (!bootverbose) xpt_for_all_devices(xptpassannouncefunc, NULL); /* Release our hook so that the boot can continue. */ config_intrhook_disestablish(xsoftc.xpt_config_hook); free(xsoftc.xpt_config_hook, M_CAMXPT); xsoftc.xpt_config_hook = NULL; free(context, M_CAMXPT); } cam_status xpt_register_async(int event, ac_callback_t *cbfunc, void *cbarg, struct cam_path *path) { struct ccb_setasync csa; cam_status status; int xptpath = 0; if (path == NULL) { status = xpt_create_path(&path, /*periph*/NULL, CAM_XPT_PATH_ID, CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD); if (status != CAM_REQ_CMP) return (status); xpt_path_lock(path); xptpath = 1; } xpt_setup_ccb(&csa.ccb_h, path, CAM_PRIORITY_NORMAL); csa.ccb_h.func_code = XPT_SASYNC_CB; csa.event_enable = event; csa.callback = cbfunc; csa.callback_arg = cbarg; xpt_action((union ccb *)&csa); status = csa.ccb_h.status; CAM_DEBUG(csa.ccb_h.path, CAM_DEBUG_TRACE, ("xpt_register_async: func %p\n", cbfunc)); if (xptpath) { xpt_path_unlock(path); xpt_free_path(path); } if ((status == CAM_REQ_CMP) && (csa.event_enable & AC_FOUND_DEVICE)) { /* * Get this peripheral up to date with all * the currently existing devices. */ xpt_for_all_devices(xptsetasyncfunc, &csa); } if ((status == CAM_REQ_CMP) && (csa.event_enable & AC_PATH_REGISTERED)) { /* * Get this peripheral up to date with all * the currently existing busses. */ xpt_for_all_busses(xptsetasyncbusfunc, &csa); } return (status); } static void xptaction(struct cam_sim *sim, union ccb *work_ccb) { CAM_DEBUG(work_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xptaction\n")); switch (work_ccb->ccb_h.func_code) { /* Common cases first */ case XPT_PATH_INQ: /* Path routing inquiry */ { struct ccb_pathinq *cpi; cpi = &work_ccb->cpi; cpi->version_num = 1; /* XXX??? */ cpi->hba_inquiry = 0; cpi->target_sprt = 0; cpi->hba_misc = 0; cpi->hba_eng_cnt = 0; cpi->max_target = 0; cpi->max_lun = 0; cpi->initiator_id = 0; strncpy(cpi->sim_vid, "FreeBSD", SIM_IDLEN); strncpy(cpi->hba_vid, "", HBA_IDLEN); strncpy(cpi->dev_name, sim->sim_name, DEV_IDLEN); cpi->unit_number = sim->unit_number; cpi->bus_id = sim->bus_id; cpi->base_transfer_speed = 0; cpi->protocol = PROTO_UNSPECIFIED; cpi->protocol_version = PROTO_VERSION_UNSPECIFIED; cpi->transport = XPORT_UNSPECIFIED; cpi->transport_version = XPORT_VERSION_UNSPECIFIED; cpi->ccb_h.status = CAM_REQ_CMP; xpt_done(work_ccb); break; } default: work_ccb->ccb_h.status = CAM_REQ_INVALID; xpt_done(work_ccb); break; } } /* * The xpt as a "controller" has no interrupt sources, so polling * is a no-op. */ static void xptpoll(struct cam_sim *sim) { } void xpt_lock_buses(void) { mtx_lock(&xsoftc.xpt_topo_lock); } void xpt_unlock_buses(void) { mtx_unlock(&xsoftc.xpt_topo_lock); } struct mtx * xpt_path_mtx(struct cam_path *path) { return (&path->device->device_mtx); } static void xpt_done_process(struct ccb_hdr *ccb_h) { struct cam_sim *sim; struct cam_devq *devq; struct mtx *mtx = NULL; if (ccb_h->flags & CAM_HIGH_POWER) { struct highpowerlist *hphead; struct cam_ed *device; mtx_lock(&xsoftc.xpt_highpower_lock); hphead = &xsoftc.highpowerq; device = STAILQ_FIRST(hphead); /* * Increment the count since this command is done. */ xsoftc.num_highpower++; /* * Any high powered commands queued up? */ if (device != NULL) { STAILQ_REMOVE_HEAD(hphead, highpowerq_entry); mtx_unlock(&xsoftc.xpt_highpower_lock); mtx_lock(&device->sim->devq->send_mtx); xpt_release_devq_device(device, /*count*/1, /*runqueue*/TRUE); mtx_unlock(&device->sim->devq->send_mtx); } else mtx_unlock(&xsoftc.xpt_highpower_lock); } sim = ccb_h->path->bus->sim; if (ccb_h->status & CAM_RELEASE_SIMQ) { xpt_release_simq(sim, /*run_queue*/FALSE); ccb_h->status &= ~CAM_RELEASE_SIMQ; } if ((ccb_h->flags & CAM_DEV_QFRZDIS) && (ccb_h->status & CAM_DEV_QFRZN)) { xpt_release_devq(ccb_h->path, /*count*/1, /*run_queue*/TRUE); ccb_h->status &= ~CAM_DEV_QFRZN; } devq = sim->devq; if ((ccb_h->func_code & XPT_FC_USER_CCB) == 0) { struct cam_ed *dev = ccb_h->path->device; mtx_lock(&devq->send_mtx); devq->send_active--; devq->send_openings++; cam_ccbq_ccb_done(&dev->ccbq, (union ccb *)ccb_h); if (((dev->flags & CAM_DEV_REL_ON_QUEUE_EMPTY) != 0 && (dev->ccbq.dev_active == 0))) { dev->flags &= ~CAM_DEV_REL_ON_QUEUE_EMPTY; xpt_release_devq_device(dev, /*count*/1, /*run_queue*/FALSE); } if (((dev->flags & CAM_DEV_REL_ON_COMPLETE) != 0 && (ccb_h->status&CAM_STATUS_MASK) != CAM_REQUEUE_REQ)) { dev->flags &= ~CAM_DEV_REL_ON_COMPLETE; xpt_release_devq_device(dev, /*count*/1, /*run_queue*/FALSE); } if (!device_is_queued(dev)) (void)xpt_schedule_devq(devq, dev); xpt_run_devq(devq); mtx_unlock(&devq->send_mtx); if ((dev->flags & CAM_DEV_TAG_AFTER_COUNT) != 0) { mtx = xpt_path_mtx(ccb_h->path); mtx_lock(mtx); if ((dev->flags & CAM_DEV_TAG_AFTER_COUNT) != 0 && (--dev->tag_delay_count == 0)) xpt_start_tags(ccb_h->path); } } if ((ccb_h->flags & CAM_UNLOCKED) == 0) { if (mtx == NULL) { mtx = xpt_path_mtx(ccb_h->path); mtx_lock(mtx); } } else { if (mtx != NULL) { mtx_unlock(mtx); mtx = NULL; } } /* Call the peripheral driver's callback */ ccb_h->pinfo.index = CAM_UNQUEUED_INDEX; (*ccb_h->cbfcnp)(ccb_h->path->periph, (union ccb *)ccb_h); if (mtx != NULL) mtx_unlock(mtx); } void xpt_done_td(void *arg) { struct cam_doneq *queue = arg; struct ccb_hdr *ccb_h; STAILQ_HEAD(, ccb_hdr) doneq; STAILQ_INIT(&doneq); mtx_lock(&queue->cam_doneq_mtx); while (1) { while (STAILQ_EMPTY(&queue->cam_doneq)) { queue->cam_doneq_sleep = 1; msleep(&queue->cam_doneq, &queue->cam_doneq_mtx, PRIBIO, "-", 0); queue->cam_doneq_sleep = 0; } STAILQ_CONCAT(&doneq, &queue->cam_doneq); mtx_unlock(&queue->cam_doneq_mtx); THREAD_NO_SLEEPING(); while ((ccb_h = STAILQ_FIRST(&doneq)) != NULL) { STAILQ_REMOVE_HEAD(&doneq, sim_links.stqe); xpt_done_process(ccb_h); } THREAD_SLEEPING_OK(); mtx_lock(&queue->cam_doneq_mtx); } } static void camisr_runqueue(void) { struct ccb_hdr *ccb_h; struct cam_doneq *queue; int i; /* Process global queues. */ for (i = 0; i < cam_num_doneqs; i++) { queue = &cam_doneqs[i]; mtx_lock(&queue->cam_doneq_mtx); while ((ccb_h = STAILQ_FIRST(&queue->cam_doneq)) != NULL) { STAILQ_REMOVE_HEAD(&queue->cam_doneq, sim_links.stqe); mtx_unlock(&queue->cam_doneq_mtx); xpt_done_process(ccb_h); mtx_lock(&queue->cam_doneq_mtx); } mtx_unlock(&queue->cam_doneq_mtx); } } struct kv { uint32_t v; const char *name; }; static struct kv map[] = { { XPT_NOOP, "XPT_NOOP" }, { XPT_SCSI_IO, "XPT_SCSI_IO" }, { XPT_GDEV_TYPE, "XPT_GDEV_TYPE" }, { XPT_GDEVLIST, "XPT_GDEVLIST" }, { XPT_PATH_INQ, "XPT_PATH_INQ" }, { XPT_REL_SIMQ, "XPT_REL_SIMQ" }, { XPT_SASYNC_CB, "XPT_SASYNC_CB" }, { XPT_SDEV_TYPE, "XPT_SDEV_TYPE" }, { XPT_SCAN_BUS, "XPT_SCAN_BUS" }, { XPT_DEV_MATCH, "XPT_DEV_MATCH" }, { XPT_DEBUG, "XPT_DEBUG" }, { XPT_PATH_STATS, "XPT_PATH_STATS" }, { XPT_GDEV_STATS, "XPT_GDEV_STATS" }, { XPT_DEV_ADVINFO, "XPT_DEV_ADVINFO" }, { XPT_ASYNC, "XPT_ASYNC" }, { XPT_ABORT, "XPT_ABORT" }, { XPT_RESET_BUS, "XPT_RESET_BUS" }, { XPT_RESET_DEV, "XPT_RESET_DEV" }, { XPT_TERM_IO, "XPT_TERM_IO" }, { XPT_SCAN_LUN, "XPT_SCAN_LUN" }, { XPT_GET_TRAN_SETTINGS, "XPT_GET_TRAN_SETTINGS" }, { XPT_SET_TRAN_SETTINGS, "XPT_SET_TRAN_SETTINGS" }, { XPT_CALC_GEOMETRY, "XPT_CALC_GEOMETRY" }, { XPT_ATA_IO, "XPT_ATA_IO" }, { XPT_GET_SIM_KNOB, "XPT_GET_SIM_KNOB" }, { XPT_SET_SIM_KNOB, "XPT_SET_SIM_KNOB" }, { XPT_NVME_IO, "XPT_NVME_IO" }, { XPT_MMCSD_IO, "XPT_MMCSD_IO" }, { XPT_SMP_IO, "XPT_SMP_IO" }, { XPT_SCAN_TGT, "XPT_SCAN_TGT" }, { XPT_ENG_INQ, "XPT_ENG_INQ" }, { XPT_ENG_EXEC, "XPT_ENG_EXEC" }, { XPT_EN_LUN, "XPT_EN_LUN" }, { XPT_TARGET_IO, "XPT_TARGET_IO" }, { XPT_ACCEPT_TARGET_IO, "XPT_ACCEPT_TARGET_IO" }, { XPT_CONT_TARGET_IO, "XPT_CONT_TARGET_IO" }, { XPT_IMMED_NOTIFY, "XPT_IMMED_NOTIFY" }, { XPT_NOTIFY_ACK, "XPT_NOTIFY_ACK" }, { XPT_IMMEDIATE_NOTIFY, "XPT_IMMEDIATE_NOTIFY" }, { XPT_NOTIFY_ACKNOWLEDGE, "XPT_NOTIFY_ACKNOWLEDGE" }, { 0, 0 } }; static const char * xpt_action_name(uint32_t action) { static char buffer[32]; /* Only for unknown messages -- racy */ struct kv *walker = map; while (walker->name != NULL) { if (walker->v == action) return (walker->name); walker++; } snprintf(buffer, sizeof(buffer), "%#x", action); return (buffer); } Index: user/alc/PQ_LAUNDRY/sys/cam/nvme/nvme_xpt.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/cam/nvme/nvme_xpt.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/cam/nvme/nvme_xpt.c (revision 303206) @@ -1,607 +1,605 @@ /*- * Copyright (c) 2015 Netflix, Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer, * without modification, immediately at the beginning of the file. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * derived from ata_xpt.c: Copyright (c) 2009 Alexander Motin */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* for xpt_print below */ #include "opt_cam.h" struct nvme_quirk_entry { u_int quirks; #define CAM_QUIRK_MAXTAGS 1 u_int mintags; u_int maxtags; }; /* Not even sure why we need this */ static periph_init_t nvme_probe_periph_init; static struct periph_driver nvme_probe_driver = { nvme_probe_periph_init, "nvme_probe", TAILQ_HEAD_INITIALIZER(nvme_probe_driver.units), /* generation */ 0, CAM_PERIPH_DRV_EARLY }; PERIPHDRIVER_DECLARE(nvme_probe, nvme_probe_driver); typedef enum { NVME_PROBE_IDENTIFY, NVME_PROBE_DONE, NVME_PROBE_INVALID, NVME_PROBE_RESET } nvme_probe_action; static char *nvme_probe_action_text[] = { "NVME_PROBE_IDENTIFY", "NVME_PROBE_DONE", "NVME_PROBE_INVALID", "NVME_PROBE_RESET", }; #define NVME_PROBE_SET_ACTION(softc, newaction) \ do { \ char **text; \ text = nvme_probe_action_text; \ CAM_DEBUG((softc)->periph->path, CAM_DEBUG_PROBE, \ ("Probe %s to %s\n", text[(softc)->action], \ text[(newaction)])); \ (softc)->action = (newaction); \ } while(0) typedef enum { NVME_PROBE_NO_ANNOUNCE = 0x04 } nvme_probe_flags; typedef struct { TAILQ_HEAD(, ccb_hdr) request_ccbs; nvme_probe_action action; nvme_probe_flags flags; int restart; struct cam_periph *periph; } nvme_probe_softc; static struct nvme_quirk_entry nvme_quirk_table[] = { { // { // T_ANY, SIP_MEDIA_REMOVABLE|SIP_MEDIA_FIXED, // /*vendor*/"*", /*product*/"*", /*revision*/"*" // }, .quirks = 0, .mintags = 0, .maxtags = 0 }, }; static const int nvme_quirk_table_size = sizeof(nvme_quirk_table) / sizeof(*nvme_quirk_table); static cam_status nvme_probe_register(struct cam_periph *periph, void *arg); static void nvme_probe_schedule(struct cam_periph *nvme_probe_periph); static void nvme_probe_start(struct cam_periph *periph, union ccb *start_ccb); static void nvme_probe_cleanup(struct cam_periph *periph); //static void nvme_find_quirk(struct cam_ed *device); static void nvme_scan_lun(struct cam_periph *periph, struct cam_path *path, cam_flags flags, union ccb *ccb); static struct cam_ed * nvme_alloc_device(struct cam_eb *bus, struct cam_et *target, lun_id_t lun_id); static void nvme_device_transport(struct cam_path *path); static void nvme_dev_async(u_int32_t async_code, struct cam_eb *bus, struct cam_et *target, struct cam_ed *device, void *async_arg); static void nvme_action(union ccb *start_ccb); static void nvme_announce_periph(struct cam_periph *periph); static struct xpt_xport nvme_xport = { .alloc_device = nvme_alloc_device, .action = nvme_action, .async = nvme_dev_async, .announce = nvme_announce_periph, }; struct xpt_xport * nvme_get_xport(void) { + return (&nvme_xport); } static void nvme_probe_periph_init() { - printf("nvme cam probe device init\n"); + } static cam_status nvme_probe_register(struct cam_periph *periph, void *arg) { union ccb *request_ccb; /* CCB representing the probe request */ cam_status status; nvme_probe_softc *softc; request_ccb = (union ccb *)arg; if (request_ccb == NULL) { printf("nvme_probe_register: no probe CCB, " "can't register device\n"); return(CAM_REQ_CMP_ERR); } softc = (nvme_probe_softc *)malloc(sizeof(*softc), M_CAMXPT, M_ZERO | M_NOWAIT); if (softc == NULL) { printf("nvme_probe_register: Unable to probe new device. " "Unable to allocate softc\n"); return(CAM_REQ_CMP_ERR); } TAILQ_INIT(&softc->request_ccbs); TAILQ_INSERT_TAIL(&softc->request_ccbs, &request_ccb->ccb_h, periph_links.tqe); softc->flags = 0; periph->softc = softc; softc->periph = periph; softc->action = NVME_PROBE_INVALID; status = cam_periph_acquire(periph); if (status != CAM_REQ_CMP) { return (status); } CAM_DEBUG(periph->path, CAM_DEBUG_PROBE, ("Probe started\n")); // nvme_device_transport(periph->path); nvme_probe_schedule(periph); return(CAM_REQ_CMP); } static void nvme_probe_schedule(struct cam_periph *periph) { union ccb *ccb; nvme_probe_softc *softc; softc = (nvme_probe_softc *)periph->softc; ccb = (union ccb *)TAILQ_FIRST(&softc->request_ccbs); NVME_PROBE_SET_ACTION(softc, NVME_PROBE_IDENTIFY); if (ccb->crcn.flags & CAM_EXPECT_INQ_CHANGE) softc->flags |= NVME_PROBE_NO_ANNOUNCE; else softc->flags &= ~NVME_PROBE_NO_ANNOUNCE; xpt_schedule(periph, CAM_PRIORITY_XPT); } static void nvme_probe_start(struct cam_periph *periph, union ccb *start_ccb) { struct ccb_nvmeio *nvmeio; struct ccb_scsiio *csio; nvme_probe_softc *softc; struct cam_path *path; const struct nvme_namespace_data *nvme_data; lun_id_t lun; CAM_DEBUG(start_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("nvme_probe_start\n")); softc = (nvme_probe_softc *)periph->softc; path = start_ccb->ccb_h.path; nvmeio = &start_ccb->nvmeio; csio = &start_ccb->csio; nvme_data = periph->path->device->nvme_data; if (softc->restart) { softc->restart = 0; if (periph->path->device->flags & CAM_DEV_UNCONFIGURED) NVME_PROBE_SET_ACTION(softc, NVME_PROBE_RESET); else NVME_PROBE_SET_ACTION(softc, NVME_PROBE_IDENTIFY); } /* * Other transports have to ask their SIM to do a lot of action. * NVMe doesn't, so don't do the dance. Just do things * directly. */ switch (softc->action) { case NVME_PROBE_RESET: /* FALLTHROUGH */ case NVME_PROBE_IDENTIFY: nvme_device_transport(path); /* * Test for lun == CAM_LUN_WILDCARD is lame, but * appears to be necessary here. XXX */ lun = xpt_path_lun_id(periph->path); if (lun == CAM_LUN_WILDCARD || periph->path->device->flags & CAM_DEV_UNCONFIGURED) { path->device->flags &= ~CAM_DEV_UNCONFIGURED; xpt_acquire_device(path->device); start_ccb->ccb_h.func_code = XPT_GDEV_TYPE; xpt_action(start_ccb); xpt_async(AC_FOUND_DEVICE, path, start_ccb); } NVME_PROBE_SET_ACTION(softc, NVME_PROBE_DONE); break; default: panic("nvme_probe_start: invalid action state 0x%x\n", softc->action); } /* * Probing is now done. We need to complete any lingering items * in the queue, though there shouldn't be any. */ xpt_release_ccb(start_ccb); CAM_DEBUG(periph->path, CAM_DEBUG_PROBE, ("Probe completed\n")); while ((start_ccb = (union ccb *)TAILQ_FIRST(&softc->request_ccbs))) { TAILQ_REMOVE(&softc->request_ccbs, &start_ccb->ccb_h, periph_links.tqe); start_ccb->ccb_h.status = CAM_REQ_CMP; xpt_done(start_ccb); } -// XXX not sure I need this -// XXX unlike other XPTs, we never freeze the queue since we have a super-simple -// XXX state machine - /* Drop freeze taken due to CAM_DEV_QFREEZE flag set. -- did we really do this? */ -// cam_release_devq(path, 0, 0, 0, FALSE); cam_periph_invalidate(periph); - cam_periph_release_locked(periph); + /* Can't release periph since we hit a (possibly bogus) assertion */ +// cam_periph_release_locked(periph); } static void nvme_probe_cleanup(struct cam_periph *periph) { + free(periph->softc, M_CAMXPT); } #if 0 /* XXX should be used, don't delete */ static void nvme_find_quirk(struct cam_ed *device) { struct nvme_quirk_entry *quirk; caddr_t match; match = cam_quirkmatch((caddr_t)&device->nvme_data, (caddr_t)nvme_quirk_table, nvme_quirk_table_size, sizeof(*nvme_quirk_table), nvme_identify_match); if (match == NULL) panic("xpt_find_quirk: device didn't match wildcard entry!!"); quirk = (struct nvme_quirk_entry *)match; device->quirk = quirk; if (quirk->quirks & CAM_QUIRK_MAXTAGS) { device->mintags = quirk->mintags; device->maxtags = quirk->maxtags; } } #endif static void nvme_scan_lun(struct cam_periph *periph, struct cam_path *path, cam_flags flags, union ccb *request_ccb) { struct ccb_pathinq cpi; cam_status status; struct cam_periph *old_periph; int lock; CAM_DEBUG(path, CAM_DEBUG_TRACE, ("nvme_scan_lun\n")); xpt_setup_ccb(&cpi.ccb_h, path, CAM_PRIORITY_NONE); cpi.ccb_h.func_code = XPT_PATH_INQ; xpt_action((union ccb *)&cpi); if (cpi.ccb_h.status != CAM_REQ_CMP) { if (request_ccb != NULL) { request_ccb->ccb_h.status = cpi.ccb_h.status; xpt_done(request_ccb); } return; } if (xpt_path_lun_id(path) == CAM_LUN_WILDCARD) { CAM_DEBUG(path, CAM_DEBUG_TRACE, ("nvme_scan_lun ignoring bus\n")); request_ccb->ccb_h.status = CAM_REQ_CMP; /* XXX signal error ? */ xpt_done(request_ccb); return; } lock = (xpt_path_owned(path) == 0); if (lock) xpt_path_lock(path); if ((old_periph = cam_periph_find(path, "nvme_probe")) != NULL) { if ((old_periph->flags & CAM_PERIPH_INVALID) == 0) { nvme_probe_softc *softc; softc = (nvme_probe_softc *)old_periph->softc; TAILQ_INSERT_TAIL(&softc->request_ccbs, &request_ccb->ccb_h, periph_links.tqe); softc->restart = 1; CAM_DEBUG(path, CAM_DEBUG_TRACE, ("restarting nvme_probe device\n")); } else { request_ccb->ccb_h.status = CAM_REQ_CMP_ERR; CAM_DEBUG(path, CAM_DEBUG_TRACE, ("Failing to restart nvme_probe device\n")); xpt_done(request_ccb); } } else { CAM_DEBUG(path, CAM_DEBUG_TRACE, ("Adding nvme_probe device\n")); status = cam_periph_alloc(nvme_probe_register, NULL, nvme_probe_cleanup, nvme_probe_start, "nvme_probe", CAM_PERIPH_BIO, request_ccb->ccb_h.path, NULL, 0, request_ccb); if (status != CAM_REQ_CMP) { xpt_print(path, "xpt_scan_lun: cam_alloc_periph " "returned an error, can't continue probe\n"); request_ccb->ccb_h.status = status; xpt_done(request_ccb); } } if (lock) xpt_path_unlock(path); } static struct cam_ed * nvme_alloc_device(struct cam_eb *bus, struct cam_et *target, lun_id_t lun_id) { struct nvme_quirk_entry *quirk; struct cam_ed *device; device = xpt_alloc_device(bus, target, lun_id); if (device == NULL) return (NULL); /* * Take the default quirk entry until we have inquiry * data from nvme and can determine a better quirk to use. */ quirk = &nvme_quirk_table[nvme_quirk_table_size - 1]; device->quirk = (void *)quirk; device->mintags = 0; device->maxtags = 0; device->inq_flags = 0; device->queue_flags = 0; device->device_id = NULL; /* XXX Need to set this somewhere */ device->device_id_len = 0; device->serial_num = NULL; /* XXX Need to set this somewhere */ device->serial_num_len = 0; return (device); } static void nvme_device_transport(struct cam_path *path) { struct ccb_pathinq cpi; struct ccb_trans_settings cts; /* XXX get data from nvme namespace and other info ??? */ /* Get transport information from the SIM */ xpt_setup_ccb(&cpi.ccb_h, path, CAM_PRIORITY_NONE); cpi.ccb_h.func_code = XPT_PATH_INQ; xpt_action((union ccb *)&cpi); path->device->transport = cpi.transport; path->device->transport_version = cpi.transport_version; path->device->protocol = cpi.protocol; path->device->protocol_version = cpi.protocol_version; /* Tell the controller what we think */ xpt_setup_ccb(&cts.ccb_h, path, CAM_PRIORITY_NONE); cts.ccb_h.func_code = XPT_SET_TRAN_SETTINGS; cts.type = CTS_TYPE_CURRENT_SETTINGS; cts.transport = path->device->transport; cts.transport_version = path->device->transport_version; cts.protocol = path->device->protocol; cts.protocol_version = path->device->protocol_version; cts.proto_specific.valid = 0; cts.xport_specific.valid = 0; xpt_action((union ccb *)&cts); } static void nvme_dev_advinfo(union ccb *start_ccb) { struct cam_ed *device; struct ccb_dev_advinfo *cdai; off_t amt; start_ccb->ccb_h.status = CAM_REQ_INVALID; device = start_ccb->ccb_h.path->device; cdai = &start_ccb->cdai; switch(cdai->buftype) { case CDAI_TYPE_SCSI_DEVID: if (cdai->flags & CDAI_FLAG_STORE) return; cdai->provsiz = device->device_id_len; if (device->device_id_len == 0) break; amt = device->device_id_len; if (cdai->provsiz > cdai->bufsiz) amt = cdai->bufsiz; memcpy(cdai->buf, device->device_id, amt); break; case CDAI_TYPE_SERIAL_NUM: if (cdai->flags & CDAI_FLAG_STORE) return; cdai->provsiz = device->serial_num_len; if (device->serial_num_len == 0) break; amt = device->serial_num_len; if (cdai->provsiz > cdai->bufsiz) amt = cdai->bufsiz; memcpy(cdai->buf, device->serial_num, amt); break; case CDAI_TYPE_PHYS_PATH: if (cdai->flags & CDAI_FLAG_STORE) { if (device->physpath != NULL) free(device->physpath, M_CAMXPT); device->physpath_len = cdai->bufsiz; /* Clear existing buffer if zero length */ if (cdai->bufsiz == 0) break; device->physpath = malloc(cdai->bufsiz, M_CAMXPT, M_NOWAIT); if (device->physpath == NULL) { start_ccb->ccb_h.status = CAM_REQ_ABORTED; return; } memcpy(device->physpath, cdai->buf, cdai->bufsiz); } else { cdai->provsiz = device->physpath_len; if (device->physpath_len == 0) break; amt = device->physpath_len; if (cdai->provsiz > cdai->bufsiz) amt = cdai->bufsiz; memcpy(cdai->buf, device->physpath, amt); } break; default: return; } start_ccb->ccb_h.status = CAM_REQ_CMP; if (cdai->flags & CDAI_FLAG_STORE) { xpt_async(AC_ADVINFO_CHANGED, start_ccb->ccb_h.path, (void *)(uintptr_t)cdai->buftype); } } static void nvme_action(union ccb *start_ccb) { CAM_DEBUG(start_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("nvme_action: func= %#x\n", start_ccb->ccb_h.func_code)); switch (start_ccb->ccb_h.func_code) { case XPT_SCAN_BUS: printf("NVME scan BUS started -- ignored\n"); // break; case XPT_SCAN_TGT: printf("NVME scan TGT started -- ignored\n"); // break; case XPT_SCAN_LUN: printf("NVME scan started\n"); nvme_scan_lun(start_ccb->ccb_h.path->periph, start_ccb->ccb_h.path, start_ccb->crcn.flags, start_ccb); break; case XPT_DEV_ADVINFO: nvme_dev_advinfo(start_ccb); break; default: xpt_action_default(start_ccb); break; } } /* * Handle any per-device event notifications that require action by the XPT. */ static void nvme_dev_async(u_int32_t async_code, struct cam_eb *bus, struct cam_et *target, struct cam_ed *device, void *async_arg) { /* * We only need to handle events for real devices. */ if (target->target_id == CAM_TARGET_WILDCARD || device->lun_id == CAM_LUN_WILDCARD) return; if (async_code == AC_LOST_DEVICE && (device->flags & CAM_DEV_UNCONFIGURED) == 0) { device->flags |= CAM_DEV_UNCONFIGURED; xpt_release_device(device); } } static void nvme_announce_periph(struct cam_periph *periph) { struct ccb_pathinq cpi; struct ccb_trans_settings cts; struct cam_path *path = periph->path; cam_periph_assert(periph, MA_OWNED); xpt_setup_ccb(&cts.ccb_h, path, CAM_PRIORITY_NORMAL); cts.ccb_h.func_code = XPT_GET_TRAN_SETTINGS; cts.type = CTS_TYPE_CURRENT_SETTINGS; xpt_action((union ccb*)&cts); if ((cts.ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) return; /* Ask the SIM for its base transfer speed */ xpt_setup_ccb(&cpi.ccb_h, path, CAM_PRIORITY_NORMAL); cpi.ccb_h.func_code = XPT_PATH_INQ; xpt_action((union ccb *)&cpi); /* XXX NVME STUFF HERE */ printf("\n"); } Index: user/alc/PQ_LAUNDRY/sys/conf/config.mk =================================================================== --- user/alc/PQ_LAUNDRY/sys/conf/config.mk (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/conf/config.mk (revision 303206) @@ -1,60 +1,60 @@ # $FreeBSD$ # # Common code to marry kernel config(8) goo and module building goo. # # Generate options files that otherwise would be built # in substantially similar ways through the tree. Move # the code here when they all produce identical results # (or should) .if !defined(KERNBUILDDIR) opt_bpf.h: echo "#define DEV_BPF 1" > ${.TARGET} .if ${MK_INET_SUPPORT} != "no" opt_inet.h: @echo "#define INET 1" > ${.TARGET} @echo "#define TCP_OFFLOAD 1" >> ${.TARGET} .endif .if ${MK_INET6_SUPPORT} != "no" opt_inet6.h: @echo "#define INET6 1" > ${.TARGET} .endif .if ${MK_EISA} != "no" opt_eisa.h: @echo "#define DEV_EISA 1" > ${.TARGET} .endif opt_mrouting.h: echo "#define MROUTING 1" > ${.TARGET} opt_natm.h: echo "#define NATM 1" > ${.TARGET} opt_scsi.h: echo "#define SCSI_DELAY 15000" > ${.TARGET} opt_wlan.h: echo "#define IEEE80211_DEBUG 1" > ${.TARGET} echo "#define IEEE80211_AMPDU_AGE 1" >> ${.TARGET} echo "#define IEEE80211_SUPPORT_MESH 1" >> ${.TARGET} KERN_OPTS.i386=NEW_PCIB DEV_PCI KERN_OPTS.pc98=NEW_PCIB DEV_PCI KERN_OPTS.amd64=NEW_PCIB DEV_PCI KERN_OPTS.powerpc=NEW_PCIB DEV_PCI KERN_OPTS=MROUTING NATM IEEE80211_DEBUG \ IEEE80211_AMPDU_AGE IEEE80211_SUPPORT_MESH DEV_BPF \ ${KERN_OPTS.${MACHINE}} ${KERN_OPTS_EXTRA} .if ${MK_INET_SUPPORT} != "no" KERN_OPTS+= INET TCP_OFFLOAD .endif .if ${MK_INET6_SUPPORT} != "no" KERN_OPTS+= INET6 .endif .if ${MK_EISA} != "no" KERN_OPTS+= DEV_EISA .endif .elif !defined(KERN_OPTS) KERN_OPTS!=cat ${KERNBUILDDIR}/opt*.h | awk '{print $$2;}' | sort -u .export KERN_OPTS .endif -.if !defined(__MPATH) +.if !defined(NO_MODULES) && !defined(__MPATH) __MPATH!=find ${SYSDIR:tA}/ -name \*_if.m .export __MPATH .endif Index: user/alc/PQ_LAUNDRY/sys/conf/files =================================================================== --- user/alc/PQ_LAUNDRY/sys/conf/files (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/conf/files (revision 303206) @@ -1,4356 +1,4356 @@ # $FreeBSD$ # # The long compile-with and dependency lines are required because of # limitations in config: backslash-newline doesn't work in strings, and # dependency lines other than the first are silently ignored. # acpi_quirks.h optional acpi \ dependency "$S/tools/acpi_quirks2h.awk $S/dev/acpica/acpi_quirks" \ compile-with "${AWK} -f $S/tools/acpi_quirks2h.awk $S/dev/acpica/acpi_quirks" \ no-obj no-implicit-rule before-depend \ clean "acpi_quirks.h" bhnd_nvram_map.h optional bhnd \ dependency "$S/dev/bhnd/tools/nvram_map_gen.sh $S/dev/bhnd/tools/nvram_map_gen.awk $S/dev/bhnd/nvram/nvram_map" \ compile-with "sh $S/dev/bhnd/tools/nvram_map_gen.sh $S/dev/bhnd/nvram/nvram_map -h" \ no-obj no-implicit-rule before-depend \ clean "bhnd_nvram_map.h" bhnd_nvram_map_data.h optional bhnd \ dependency "$S/dev/bhnd/tools/nvram_map_gen.sh $S/dev/bhnd/tools/nvram_map_gen.awk $S/dev/bhnd/nvram/nvram_map" \ compile-with "sh $S/dev/bhnd/tools/nvram_map_gen.sh $S/dev/bhnd/nvram/nvram_map -d" \ no-obj no-implicit-rule before-depend \ clean "bhnd_nvram_map_data.h" # # The 'fdt_dtb_file' target covers an actual DTB file name, which is derived # from the specified source (DTS) file: .dts -> .dtb # fdt_dtb_file optional fdt fdt_dtb_static \ compile-with "sh -c 'MACHINE=${MACHINE} $S/tools/fdt/make_dtb.sh $S ${FDT_DTS_FILE} ${.CURDIR}'" \ no-obj no-implicit-rule before-depend \ clean "${FDT_DTS_FILE:R}.dtb" fdt_static_dtb.h optional fdt fdt_dtb_static \ compile-with "sh -c 'MACHINE=${MACHINE} $S/tools/fdt/make_dtbh.sh ${FDT_DTS_FILE} ${.CURDIR}'" \ dependency "fdt_dtb_file" \ no-obj no-implicit-rule before-depend \ clean "fdt_static_dtb.h" feeder_eq_gen.h optional sound \ dependency "$S/tools/sound/feeder_eq_mkfilter.awk" \ compile-with "${AWK} -f $S/tools/sound/feeder_eq_mkfilter.awk -- ${FEEDER_EQ_PRESETS} > feeder_eq_gen.h" \ no-obj no-implicit-rule before-depend \ clean "feeder_eq_gen.h" feeder_rate_gen.h optional sound \ dependency "$S/tools/sound/feeder_rate_mkfilter.awk" \ compile-with "${AWK} -f $S/tools/sound/feeder_rate_mkfilter.awk -- ${FEEDER_RATE_PRESETS} > feeder_rate_gen.h" \ no-obj no-implicit-rule before-depend \ clean "feeder_rate_gen.h" snd_fxdiv_gen.h optional sound \ dependency "$S/tools/sound/snd_fxdiv_gen.awk" \ compile-with "${AWK} -f $S/tools/sound/snd_fxdiv_gen.awk -- > snd_fxdiv_gen.h" \ no-obj no-implicit-rule before-depend \ clean "snd_fxdiv_gen.h" miidevs.h optional miibus | mii \ dependency "$S/tools/miidevs2h.awk $S/dev/mii/miidevs" \ compile-with "${AWK} -f $S/tools/miidevs2h.awk $S/dev/mii/miidevs" \ no-obj no-implicit-rule before-depend \ clean "miidevs.h" pccarddevs.h standard \ dependency "$S/tools/pccarddevs2h.awk $S/dev/pccard/pccarddevs" \ compile-with "${AWK} -f $S/tools/pccarddevs2h.awk $S/dev/pccard/pccarddevs" \ no-obj no-implicit-rule before-depend \ clean "pccarddevs.h" kbdmuxmap.h optional kbdmux_dflt_keymap \ compile-with "kbdcontrol -P ${S:S/sys$/share/}/vt/keymaps -P ${S:S/sys$/share/}/syscons/keymaps -L ${KBDMUX_DFLT_KEYMAP} | sed -e 's/^static keymap_t.* = /static keymap_t key_map = /' -e 's/^static accentmap_t.* = /static accentmap_t accent_map = /' > kbdmuxmap.h" \ no-obj no-implicit-rule before-depend \ clean "kbdmuxmap.h" teken_state.h optional sc | vt \ dependency "$S/teken/gensequences $S/teken/sequences" \ compile-with "${AWK} -f $S/teken/gensequences $S/teken/sequences > teken_state.h" \ no-obj no-implicit-rule before-depend \ clean "teken_state.h" usbdevs.h optional usb \ dependency "$S/tools/usbdevs2h.awk $S/dev/usb/usbdevs" \ compile-with "${AWK} -f $S/tools/usbdevs2h.awk $S/dev/usb/usbdevs -h" \ no-obj no-implicit-rule before-depend \ clean "usbdevs.h" usbdevs_data.h optional usb \ dependency "$S/tools/usbdevs2h.awk $S/dev/usb/usbdevs" \ compile-with "${AWK} -f $S/tools/usbdevs2h.awk $S/dev/usb/usbdevs -d" \ no-obj no-implicit-rule before-depend \ clean "usbdevs_data.h" cam/cam.c optional scbus cam/cam_compat.c optional scbus cam/cam_iosched.c optional scbus cam/cam_periph.c optional scbus cam/cam_queue.c optional scbus cam/cam_sim.c optional scbus cam/cam_xpt.c optional scbus cam/ata/ata_all.c optional scbus cam/ata/ata_xpt.c optional scbus cam/ata/ata_pmp.c optional scbus -cam/nvme/nvme_all.c optional scbus nvme +cam/nvme/nvme_all.c optional scbus cam/nvme/nvme_da.c optional scbus nvme da !nvd -cam/nvme/nvme_xpt.c optional scbus nvme +cam/nvme/nvme_xpt.c optional scbus cam/scsi/scsi_xpt.c optional scbus cam/scsi/scsi_all.c optional scbus cam/scsi/scsi_cd.c optional cd cam/scsi/scsi_ch.c optional ch cam/ata/ata_da.c optional ada | da cam/ctl/ctl.c optional ctl cam/ctl/ctl_backend.c optional ctl cam/ctl/ctl_backend_block.c optional ctl cam/ctl/ctl_backend_ramdisk.c optional ctl cam/ctl/ctl_cmd_table.c optional ctl cam/ctl/ctl_frontend.c optional ctl cam/ctl/ctl_frontend_cam_sim.c optional ctl cam/ctl/ctl_frontend_ioctl.c optional ctl cam/ctl/ctl_frontend_iscsi.c optional ctl cam/ctl/ctl_ha.c optional ctl cam/ctl/ctl_scsi_all.c optional ctl cam/ctl/ctl_tpc.c optional ctl cam/ctl/ctl_tpc_local.c optional ctl cam/ctl/ctl_error.c optional ctl cam/ctl/ctl_util.c optional ctl cam/ctl/scsi_ctl.c optional ctl cam/scsi/scsi_da.c optional da cam/scsi/scsi_low.c optional ct | ncv | nsp | stg cam/scsi/scsi_pass.c optional pass cam/scsi/scsi_pt.c optional pt cam/scsi/scsi_sa.c optional sa cam/scsi/scsi_enc.c optional ses cam/scsi/scsi_enc_ses.c optional ses cam/scsi/scsi_enc_safte.c optional ses cam/scsi/scsi_sg.c optional sg cam/scsi/scsi_targ_bh.c optional targbh cam/scsi/scsi_target.c optional targ cam/scsi/smp_all.c optional scbus # shared between zfs and dtrace cddl/compat/opensolaris/kern/opensolaris.c optional zfs | dtrace compile-with "${CDDL_C}" cddl/compat/opensolaris/kern/opensolaris_cmn_err.c optional zfs | dtrace compile-with "${CDDL_C}" cddl/compat/opensolaris/kern/opensolaris_kmem.c optional zfs | dtrace compile-with "${CDDL_C}" cddl/compat/opensolaris/kern/opensolaris_misc.c optional zfs | dtrace compile-with "${CDDL_C}" cddl/compat/opensolaris/kern/opensolaris_sunddi.c optional zfs | dtrace compile-with "${CDDL_C}" cddl/compat/opensolaris/kern/opensolaris_taskq.c optional zfs | dtrace compile-with "${CDDL_C}" # zfs specific cddl/compat/opensolaris/kern/opensolaris_acl.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_dtrace.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_kobj.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_kstat.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_lookup.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_policy.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_string.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_sysevent.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_uio.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_vfs.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_vm.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_zone.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/acl/acl_common.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/avl/avl.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/nvpair/opensolaris_fnvpair.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair_alloc_fixed.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/unicode/u8_textprep.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zfeature_common.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zfs_comutil.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zfs_deleg.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zfs_fletcher.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zfs_ioctl_compat.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zfs_namecheck.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zfs_prop.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zpool_prop.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zprop_common.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/gfs.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/vnode.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/blkptr.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/bplist.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/bpobj.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/bqueue.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/ddt.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/ddt_zap.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_zfetch.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c optional zfs compile-with "${ZFS_C}" \ warning "kernel contains CDDL licensed ZFS filesystem" cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_bookmark.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deleg.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_destroy.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_prop.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_userhold.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_synctask.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/gzip.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lz4.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lzjb.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/multilist.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/range_tree.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/refcount.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/rrwlock.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/sha256.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/skein_zfs.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/spa_errlog.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/spa_history.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/space_reftree.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/trim_map.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/uberblock.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/unique.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_missing.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_root.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfeature.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_byteswap.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_debug.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_fm.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_fuid.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_log.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_onexit.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_replay.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_sa.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zio_checksum.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zio_compress.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zio_inject.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zle.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zrlock.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/os/callb.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/os/fm.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/os/list.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/os/nvpair_alloc_system.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/zmod/adler32.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/zmod/deflate.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/zmod/inffast.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/zmod/inflate.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/zmod/inftrees.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/zmod/opensolaris_crc32.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/zmod/trees.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/zmod/zmod.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/zmod/zmod_subr.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/zmod/zutil.c optional zfs compile-with "${ZFS_C}" # dtrace specific cddl/contrib/opensolaris/uts/common/dtrace/dtrace.c optional dtrace compile-with "${DTRACE_C}" \ warning "kernel contains CDDL licensed DTRACE" cddl/dev/dtmalloc/dtmalloc.c optional dtmalloc | dtraceall compile-with "${CDDL_C}" cddl/dev/profile/profile.c optional dtrace_profile | dtraceall compile-with "${CDDL_C}" cddl/dev/sdt/sdt.c optional dtrace_sdt | dtraceall compile-with "${CDDL_C}" cddl/dev/fbt/fbt.c optional dtrace_fbt | dtraceall compile-with "${FBT_C}" cddl/dev/systrace/systrace.c optional dtrace_systrace | dtraceall compile-with "${CDDL_C}" cddl/dev/prototype.c optional dtrace_prototype | dtraceall compile-with "${CDDL_C}" fs/nfsclient/nfs_clkdtrace.c optional dtnfscl nfscl | dtraceall nfscl compile-with "${CDDL_C}" compat/cloudabi/cloudabi_clock.c optional compat_cloudabi64 compat/cloudabi/cloudabi_errno.c optional compat_cloudabi64 compat/cloudabi/cloudabi_fd.c optional compat_cloudabi64 compat/cloudabi/cloudabi_file.c optional compat_cloudabi64 compat/cloudabi/cloudabi_futex.c optional compat_cloudabi64 compat/cloudabi/cloudabi_mem.c optional compat_cloudabi64 compat/cloudabi/cloudabi_proc.c optional compat_cloudabi64 compat/cloudabi/cloudabi_random.c optional compat_cloudabi64 compat/cloudabi/cloudabi_sock.c optional compat_cloudabi64 compat/cloudabi/cloudabi_thread.c optional compat_cloudabi64 compat/cloudabi64/cloudabi64_fd.c optional compat_cloudabi64 compat/cloudabi64/cloudabi64_module.c optional compat_cloudabi64 compat/cloudabi64/cloudabi64_poll.c optional compat_cloudabi64 compat/cloudabi64/cloudabi64_sock.c optional compat_cloudabi64 compat/cloudabi64/cloudabi64_syscalls.c optional compat_cloudabi64 compat/cloudabi64/cloudabi64_sysent.c optional compat_cloudabi64 compat/cloudabi64/cloudabi64_thread.c optional compat_cloudabi64 compat/freebsd32/freebsd32_capability.c optional compat_freebsd32 compat/freebsd32/freebsd32_ioctl.c optional compat_freebsd32 compat/freebsd32/freebsd32_misc.c optional compat_freebsd32 compat/freebsd32/freebsd32_syscalls.c optional compat_freebsd32 compat/freebsd32/freebsd32_sysent.c optional compat_freebsd32 contrib/dev/acpica/common/ahids.c optional acpi acpi_debug contrib/dev/acpica/common/ahuuids.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbcmds.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbconvert.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbdisply.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbexec.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbhistry.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbinput.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbmethod.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbnames.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbobject.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbstats.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbtest.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbutils.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbxface.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmbuffer.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmcstyle.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmdeferred.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmnames.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmopcode.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmresrc.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmresrcl.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmresrcl2.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmresrcs.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmutils.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmwalk.c optional acpi acpi_debug contrib/dev/acpica/components/dispatcher/dsargs.c optional acpi contrib/dev/acpica/components/dispatcher/dscontrol.c optional acpi contrib/dev/acpica/components/dispatcher/dsdebug.c optional acpi contrib/dev/acpica/components/dispatcher/dsfield.c optional acpi contrib/dev/acpica/components/dispatcher/dsinit.c optional acpi contrib/dev/acpica/components/dispatcher/dsmethod.c optional acpi contrib/dev/acpica/components/dispatcher/dsmthdat.c optional acpi contrib/dev/acpica/components/dispatcher/dsobject.c optional acpi contrib/dev/acpica/components/dispatcher/dsopcode.c optional acpi contrib/dev/acpica/components/dispatcher/dsutils.c optional acpi contrib/dev/acpica/components/dispatcher/dswexec.c optional acpi contrib/dev/acpica/components/dispatcher/dswload.c optional acpi contrib/dev/acpica/components/dispatcher/dswload2.c optional acpi contrib/dev/acpica/components/dispatcher/dswscope.c optional acpi contrib/dev/acpica/components/dispatcher/dswstate.c optional acpi contrib/dev/acpica/components/events/evevent.c optional acpi contrib/dev/acpica/components/events/evglock.c optional acpi contrib/dev/acpica/components/events/evgpe.c optional acpi contrib/dev/acpica/components/events/evgpeblk.c optional acpi contrib/dev/acpica/components/events/evgpeinit.c optional acpi contrib/dev/acpica/components/events/evgpeutil.c optional acpi contrib/dev/acpica/components/events/evhandler.c optional acpi contrib/dev/acpica/components/events/evmisc.c optional acpi contrib/dev/acpica/components/events/evregion.c optional acpi contrib/dev/acpica/components/events/evrgnini.c optional acpi contrib/dev/acpica/components/events/evsci.c optional acpi contrib/dev/acpica/components/events/evxface.c optional acpi contrib/dev/acpica/components/events/evxfevnt.c optional acpi contrib/dev/acpica/components/events/evxfgpe.c optional acpi contrib/dev/acpica/components/events/evxfregn.c optional acpi contrib/dev/acpica/components/executer/exconcat.c optional acpi contrib/dev/acpica/components/executer/exconfig.c optional acpi contrib/dev/acpica/components/executer/exconvrt.c optional acpi contrib/dev/acpica/components/executer/excreate.c optional acpi contrib/dev/acpica/components/executer/exdebug.c optional acpi contrib/dev/acpica/components/executer/exdump.c optional acpi contrib/dev/acpica/components/executer/exfield.c optional acpi contrib/dev/acpica/components/executer/exfldio.c optional acpi contrib/dev/acpica/components/executer/exmisc.c optional acpi contrib/dev/acpica/components/executer/exmutex.c optional acpi contrib/dev/acpica/components/executer/exnames.c optional acpi contrib/dev/acpica/components/executer/exoparg1.c optional acpi contrib/dev/acpica/components/executer/exoparg2.c optional acpi contrib/dev/acpica/components/executer/exoparg3.c optional acpi contrib/dev/acpica/components/executer/exoparg6.c optional acpi contrib/dev/acpica/components/executer/exprep.c optional acpi contrib/dev/acpica/components/executer/exregion.c optional acpi contrib/dev/acpica/components/executer/exresnte.c optional acpi contrib/dev/acpica/components/executer/exresolv.c optional acpi contrib/dev/acpica/components/executer/exresop.c optional acpi contrib/dev/acpica/components/executer/exstore.c optional acpi contrib/dev/acpica/components/executer/exstoren.c optional acpi contrib/dev/acpica/components/executer/exstorob.c optional acpi contrib/dev/acpica/components/executer/exsystem.c optional acpi contrib/dev/acpica/components/executer/extrace.c optional acpi contrib/dev/acpica/components/executer/exutils.c optional acpi contrib/dev/acpica/components/hardware/hwacpi.c optional acpi contrib/dev/acpica/components/hardware/hwesleep.c optional acpi contrib/dev/acpica/components/hardware/hwgpe.c optional acpi contrib/dev/acpica/components/hardware/hwpci.c optional acpi contrib/dev/acpica/components/hardware/hwregs.c optional acpi contrib/dev/acpica/components/hardware/hwsleep.c optional acpi contrib/dev/acpica/components/hardware/hwtimer.c optional acpi contrib/dev/acpica/components/hardware/hwvalid.c optional acpi contrib/dev/acpica/components/hardware/hwxface.c optional acpi contrib/dev/acpica/components/hardware/hwxfsleep.c optional acpi contrib/dev/acpica/components/namespace/nsaccess.c optional acpi contrib/dev/acpica/components/namespace/nsalloc.c optional acpi contrib/dev/acpica/components/namespace/nsarguments.c optional acpi contrib/dev/acpica/components/namespace/nsconvert.c optional acpi contrib/dev/acpica/components/namespace/nsdump.c optional acpi contrib/dev/acpica/components/namespace/nseval.c optional acpi contrib/dev/acpica/components/namespace/nsinit.c optional acpi contrib/dev/acpica/components/namespace/nsload.c optional acpi contrib/dev/acpica/components/namespace/nsnames.c optional acpi contrib/dev/acpica/components/namespace/nsobject.c optional acpi contrib/dev/acpica/components/namespace/nsparse.c optional acpi contrib/dev/acpica/components/namespace/nspredef.c optional acpi contrib/dev/acpica/components/namespace/nsprepkg.c optional acpi contrib/dev/acpica/components/namespace/nsrepair.c optional acpi contrib/dev/acpica/components/namespace/nsrepair2.c optional acpi contrib/dev/acpica/components/namespace/nssearch.c optional acpi contrib/dev/acpica/components/namespace/nsutils.c optional acpi contrib/dev/acpica/components/namespace/nswalk.c optional acpi contrib/dev/acpica/components/namespace/nsxfeval.c optional acpi contrib/dev/acpica/components/namespace/nsxfname.c optional acpi contrib/dev/acpica/components/namespace/nsxfobj.c optional acpi contrib/dev/acpica/components/parser/psargs.c optional acpi contrib/dev/acpica/components/parser/psloop.c optional acpi contrib/dev/acpica/components/parser/psobject.c optional acpi contrib/dev/acpica/components/parser/psopcode.c optional acpi contrib/dev/acpica/components/parser/psopinfo.c optional acpi contrib/dev/acpica/components/parser/psparse.c optional acpi contrib/dev/acpica/components/parser/psscope.c optional acpi contrib/dev/acpica/components/parser/pstree.c optional acpi contrib/dev/acpica/components/parser/psutils.c optional acpi contrib/dev/acpica/components/parser/pswalk.c optional acpi contrib/dev/acpica/components/parser/psxface.c optional acpi contrib/dev/acpica/components/resources/rsaddr.c optional acpi contrib/dev/acpica/components/resources/rscalc.c optional acpi contrib/dev/acpica/components/resources/rscreate.c optional acpi contrib/dev/acpica/components/resources/rsdump.c optional acpi acpi_debug contrib/dev/acpica/components/resources/rsdumpinfo.c optional acpi contrib/dev/acpica/components/resources/rsinfo.c optional acpi contrib/dev/acpica/components/resources/rsio.c optional acpi contrib/dev/acpica/components/resources/rsirq.c optional acpi contrib/dev/acpica/components/resources/rslist.c optional acpi contrib/dev/acpica/components/resources/rsmemory.c optional acpi contrib/dev/acpica/components/resources/rsmisc.c optional acpi contrib/dev/acpica/components/resources/rsserial.c optional acpi contrib/dev/acpica/components/resources/rsutils.c optional acpi contrib/dev/acpica/components/resources/rsxface.c optional acpi contrib/dev/acpica/components/tables/tbdata.c optional acpi contrib/dev/acpica/components/tables/tbfadt.c optional acpi contrib/dev/acpica/components/tables/tbfind.c optional acpi contrib/dev/acpica/components/tables/tbinstal.c optional acpi contrib/dev/acpica/components/tables/tbprint.c optional acpi contrib/dev/acpica/components/tables/tbutils.c optional acpi contrib/dev/acpica/components/tables/tbxface.c optional acpi contrib/dev/acpica/components/tables/tbxfload.c optional acpi contrib/dev/acpica/components/tables/tbxfroot.c optional acpi contrib/dev/acpica/components/utilities/utaddress.c optional acpi contrib/dev/acpica/components/utilities/utalloc.c optional acpi contrib/dev/acpica/components/utilities/utascii.c optional acpi contrib/dev/acpica/components/utilities/utbuffer.c optional acpi contrib/dev/acpica/components/utilities/utcache.c optional acpi contrib/dev/acpica/components/utilities/utcopy.c optional acpi contrib/dev/acpica/components/utilities/utdebug.c optional acpi contrib/dev/acpica/components/utilities/utdecode.c optional acpi contrib/dev/acpica/components/utilities/utdelete.c optional acpi contrib/dev/acpica/components/utilities/uterror.c optional acpi contrib/dev/acpica/components/utilities/uteval.c optional acpi contrib/dev/acpica/components/utilities/utexcep.c optional acpi contrib/dev/acpica/components/utilities/utglobal.c optional acpi contrib/dev/acpica/components/utilities/uthex.c optional acpi contrib/dev/acpica/components/utilities/utids.c optional acpi contrib/dev/acpica/components/utilities/utinit.c optional acpi contrib/dev/acpica/components/utilities/utlock.c optional acpi contrib/dev/acpica/components/utilities/utmath.c optional acpi contrib/dev/acpica/components/utilities/utmisc.c optional acpi contrib/dev/acpica/components/utilities/utmutex.c optional acpi contrib/dev/acpica/components/utilities/utnonansi.c optional acpi contrib/dev/acpica/components/utilities/utobject.c optional acpi contrib/dev/acpica/components/utilities/utosi.c optional acpi contrib/dev/acpica/components/utilities/utownerid.c optional acpi contrib/dev/acpica/components/utilities/utpredef.c optional acpi contrib/dev/acpica/components/utilities/utresrc.c optional acpi contrib/dev/acpica/components/utilities/utstate.c optional acpi contrib/dev/acpica/components/utilities/utstring.c optional acpi contrib/dev/acpica/components/utilities/utuuid.c optional acpi acpi_debug contrib/dev/acpica/components/utilities/utxface.c optional acpi contrib/dev/acpica/components/utilities/utxferror.c optional acpi contrib/dev/acpica/components/utilities/utxfinit.c optional acpi #contrib/dev/acpica/components/utilities/utxfmutex.c optional acpi contrib/ipfilter/netinet/fil.c optional ipfilter inet \ compile-with "${NORMAL_C} ${NO_WSELF_ASSIGN} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_auth.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_fil_freebsd.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_frag.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_log.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_nat.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_proxy.c optional ipfilter inet \ compile-with "${NORMAL_C} ${NO_WSELF_ASSIGN} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_state.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_lookup.c optional ipfilter inet \ compile-with "${NORMAL_C} ${NO_WSELF_ASSIGN} -Wno-unused -Wno-error -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_pool.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_htable.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_sync.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/mlfk_ipl.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_nat6.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_rules.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_scan.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_dstlist.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/radix_ipf.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/libfdt/fdt.c optional fdt contrib/libfdt/fdt_ro.c optional fdt contrib/libfdt/fdt_rw.c optional fdt contrib/libfdt/fdt_strerror.c optional fdt contrib/libfdt/fdt_sw.c optional fdt contrib/libfdt/fdt_wip.c optional fdt contrib/libnv/dnvlist.c standard contrib/libnv/nvlist.c standard contrib/libnv/nvpair.c standard contrib/ngatm/netnatm/api/cc_conn.c optional ngatm_ccatm \ compile-with "${NORMAL_C_NOWERROR} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/cc_data.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/cc_dump.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/cc_port.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/cc_sig.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/cc_user.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/unisap.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/misc/straddr.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/misc/unimsg_common.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/msg/traffic.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/msg/uni_ie.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/msg/uni_msg.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/saal/saal_sscfu.c optional ngatm_sscfu \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/saal/saal_sscop.c optional ngatm_sscop \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_call.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_coord.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_party.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_print.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_reset.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_uni.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_unimsgcpy.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_verify.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" crypto/blowfish/bf_ecb.c optional ipsec crypto/blowfish/bf_skey.c optional crypto | ipsec crypto/camellia/camellia.c optional crypto | ipsec crypto/camellia/camellia-api.c optional crypto | ipsec crypto/des/des_ecb.c optional crypto | ipsec | netsmb crypto/des/des_setkey.c optional crypto | ipsec | netsmb crypto/rc4/rc4.c optional netgraph_mppc_encryption | kgssapi crypto/rijndael/rijndael-alg-fst.c optional crypto | geom_bde | \ ipsec | random !random_loadable | wlan_ccmp crypto/rijndael/rijndael-api-fst.c optional geom_bde | random !random_loadable crypto/rijndael/rijndael-api.c optional crypto | ipsec | wlan_ccmp crypto/sha1.c optional carp | crypto | ipsec | \ netgraph_mppc_encryption | sctp crypto/sha2/sha256c.c optional crypto | geom_bde | ipsec | random !random_loadable | \ sctp | zfs crypto/sha2/sha512c.c optional crypto | geom_bde | ipsec | zfs crypto/skein/skein.c optional crypto | zfs crypto/skein/skein_block.c optional crypto | zfs crypto/siphash/siphash.c optional inet | inet6 crypto/siphash/siphash_test.c optional inet | inet6 ddb/db_access.c optional ddb ddb/db_break.c optional ddb ddb/db_capture.c optional ddb ddb/db_command.c optional ddb ddb/db_examine.c optional ddb ddb/db_expr.c optional ddb ddb/db_input.c optional ddb ddb/db_lex.c optional ddb ddb/db_main.c optional ddb ddb/db_output.c optional ddb ddb/db_print.c optional ddb ddb/db_ps.c optional ddb ddb/db_run.c optional ddb ddb/db_script.c optional ddb ddb/db_sym.c optional ddb ddb/db_thread.c optional ddb ddb/db_textdump.c optional ddb ddb/db_variables.c optional ddb ddb/db_watch.c optional ddb ddb/db_write_cmd.c optional ddb dev/aac/aac.c optional aac dev/aac/aac_cam.c optional aacp aac dev/aac/aac_debug.c optional aac dev/aac/aac_disk.c optional aac dev/aac/aac_linux.c optional aac compat_linux dev/aac/aac_pci.c optional aac pci dev/aacraid/aacraid.c optional aacraid dev/aacraid/aacraid_cam.c optional aacraid scbus dev/aacraid/aacraid_debug.c optional aacraid dev/aacraid/aacraid_linux.c optional aacraid compat_linux dev/aacraid/aacraid_pci.c optional aacraid pci dev/acpi_support/acpi_wmi.c optional acpi_wmi acpi dev/acpi_support/acpi_asus.c optional acpi_asus acpi dev/acpi_support/acpi_asus_wmi.c optional acpi_asus_wmi acpi dev/acpi_support/acpi_fujitsu.c optional acpi_fujitsu acpi dev/acpi_support/acpi_hp.c optional acpi_hp acpi dev/acpi_support/acpi_ibm.c optional acpi_ibm acpi dev/acpi_support/acpi_panasonic.c optional acpi_panasonic acpi dev/acpi_support/acpi_sony.c optional acpi_sony acpi dev/acpi_support/acpi_toshiba.c optional acpi_toshiba acpi dev/acpi_support/atk0110.c optional aibs acpi dev/acpica/Osd/OsdDebug.c optional acpi dev/acpica/Osd/OsdHardware.c optional acpi dev/acpica/Osd/OsdInterrupt.c optional acpi dev/acpica/Osd/OsdMemory.c optional acpi dev/acpica/Osd/OsdSchedule.c optional acpi dev/acpica/Osd/OsdStream.c optional acpi dev/acpica/Osd/OsdSynch.c optional acpi dev/acpica/Osd/OsdTable.c optional acpi dev/acpica/acpi.c optional acpi dev/acpica/acpi_acad.c optional acpi dev/acpica/acpi_battery.c optional acpi dev/acpica/acpi_button.c optional acpi dev/acpica/acpi_cmbat.c optional acpi dev/acpica/acpi_cpu.c optional acpi dev/acpica/acpi_ec.c optional acpi dev/acpica/acpi_isab.c optional acpi isa dev/acpica/acpi_lid.c optional acpi dev/acpica/acpi_package.c optional acpi dev/acpica/acpi_pci.c optional acpi pci dev/acpica/acpi_pci_link.c optional acpi pci dev/acpica/acpi_pcib.c optional acpi pci dev/acpica/acpi_pcib_acpi.c optional acpi pci dev/acpica/acpi_pcib_pci.c optional acpi pci dev/acpica/acpi_perf.c optional acpi dev/acpica/acpi_powerres.c optional acpi dev/acpica/acpi_quirk.c optional acpi dev/acpica/acpi_resource.c optional acpi dev/acpica/acpi_smbat.c optional acpi dev/acpica/acpi_thermal.c optional acpi dev/acpica/acpi_throttle.c optional acpi dev/acpica/acpi_timer.c optional acpi dev/acpica/acpi_video.c optional acpi_video acpi dev/acpica/acpi_dock.c optional acpi_dock acpi dev/adlink/adlink.c optional adlink dev/advansys/adv_eisa.c optional adv eisa dev/advansys/adv_pci.c optional adv pci dev/advansys/advansys.c optional adv dev/advansys/advlib.c optional adv dev/advansys/advmcode.c optional adv dev/advansys/adw_pci.c optional adw pci dev/advansys/adwcam.c optional adw dev/advansys/adwlib.c optional adw dev/advansys/adwmcode.c optional adw dev/ae/if_ae.c optional ae pci dev/age/if_age.c optional age pci dev/agp/agp.c optional agp pci dev/agp/agp_if.m optional agp pci dev/aha/aha.c optional aha dev/aha/aha_isa.c optional aha isa dev/aha/aha_mca.c optional aha mca dev/ahb/ahb.c optional ahb eisa dev/ahci/ahci.c optional ahci dev/ahci/ahciem.c optional ahci dev/ahci/ahci_pci.c optional ahci pci dev/aic/aic.c optional aic dev/aic/aic_pccard.c optional aic pccard dev/aic7xxx/ahc_eisa.c optional ahc eisa dev/aic7xxx/ahc_isa.c optional ahc isa dev/aic7xxx/ahc_pci.c optional ahc pci \ compile-with "${NORMAL_C} ${NO_WCONSTANT_CONVERSION}" dev/aic7xxx/ahd_pci.c optional ahd pci \ compile-with "${NORMAL_C} ${NO_WCONSTANT_CONVERSION}" dev/aic7xxx/aic7770.c optional ahc dev/aic7xxx/aic79xx.c optional ahd pci dev/aic7xxx/aic79xx_osm.c optional ahd pci dev/aic7xxx/aic79xx_pci.c optional ahd pci dev/aic7xxx/aic79xx_reg_print.c optional ahd pci ahd_reg_pretty_print dev/aic7xxx/aic7xxx.c optional ahc dev/aic7xxx/aic7xxx_93cx6.c optional ahc dev/aic7xxx/aic7xxx_osm.c optional ahc dev/aic7xxx/aic7xxx_pci.c optional ahc pci dev/aic7xxx/aic7xxx_reg_print.c optional ahc ahc_reg_pretty_print dev/alc/if_alc.c optional alc pci dev/ale/if_ale.c optional ale pci dev/alpm/alpm.c optional alpm pci dev/altera/avgen/altera_avgen.c optional altera_avgen dev/altera/avgen/altera_avgen_fdt.c optional altera_avgen fdt dev/altera/avgen/altera_avgen_nexus.c optional altera_avgen dev/altera/sdcard/altera_sdcard.c optional altera_sdcard dev/altera/sdcard/altera_sdcard_disk.c optional altera_sdcard dev/altera/sdcard/altera_sdcard_io.c optional altera_sdcard dev/altera/sdcard/altera_sdcard_fdt.c optional altera_sdcard fdt dev/altera/sdcard/altera_sdcard_nexus.c optional altera_sdcard dev/altera/pio/pio.c optional altera_pio dev/altera/pio/pio_if.m optional altera_pio dev/amdpm/amdpm.c optional amdpm pci | nfpm pci dev/amdsmb/amdsmb.c optional amdsmb pci dev/amr/amr.c optional amr dev/amr/amr_cam.c optional amrp amr dev/amr/amr_disk.c optional amr dev/amr/amr_linux.c optional amr compat_linux dev/amr/amr_pci.c optional amr pci dev/an/if_an.c optional an dev/an/if_an_isa.c optional an isa dev/an/if_an_pccard.c optional an pccard dev/an/if_an_pci.c optional an pci # dev/ata/ata_if.m optional ata | atacore dev/ata/ata-all.c optional ata | atacore dev/ata/ata-dma.c optional ata | atacore dev/ata/ata-lowlevel.c optional ata | atacore dev/ata/ata-sata.c optional ata | atacore dev/ata/ata-card.c optional ata pccard | atapccard dev/ata/ata-cbus.c optional ata pc98 | atapc98 dev/ata/ata-isa.c optional ata isa | ataisa dev/ata/ata-pci.c optional ata pci | atapci dev/ata/chipsets/ata-acard.c optional ata pci | ataacard dev/ata/chipsets/ata-acerlabs.c optional ata pci | ataacerlabs dev/ata/chipsets/ata-amd.c optional ata pci | ataamd dev/ata/chipsets/ata-ati.c optional ata pci | ataati dev/ata/chipsets/ata-cenatek.c optional ata pci | atacenatek dev/ata/chipsets/ata-cypress.c optional ata pci | atacypress dev/ata/chipsets/ata-cyrix.c optional ata pci | atacyrix dev/ata/chipsets/ata-highpoint.c optional ata pci | atahighpoint dev/ata/chipsets/ata-intel.c optional ata pci | ataintel dev/ata/chipsets/ata-ite.c optional ata pci | ataite dev/ata/chipsets/ata-jmicron.c optional ata pci | atajmicron dev/ata/chipsets/ata-marvell.c optional ata pci | atamarvell dev/ata/chipsets/ata-micron.c optional ata pci | atamicron dev/ata/chipsets/ata-national.c optional ata pci | atanational dev/ata/chipsets/ata-netcell.c optional ata pci | atanetcell dev/ata/chipsets/ata-nvidia.c optional ata pci | atanvidia dev/ata/chipsets/ata-promise.c optional ata pci | atapromise dev/ata/chipsets/ata-serverworks.c optional ata pci | ataserverworks dev/ata/chipsets/ata-siliconimage.c optional ata pci | atasiliconimage | ataati dev/ata/chipsets/ata-sis.c optional ata pci | atasis dev/ata/chipsets/ata-via.c optional ata pci | atavia # dev/ath/if_ath_pci.c optional ath_pci pci \ compile-with "${NORMAL_C} -I$S/dev/ath" # dev/ath/if_ath_ahb.c optional ath_ahb \ compile-with "${NORMAL_C} -I$S/dev/ath" # dev/ath/if_ath.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_alq.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_beacon.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_btcoex.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_btcoex_mci.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_debug.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_descdma.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_keycache.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_ioctl.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_led.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_lna_div.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_tx.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_tx_edma.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_tx_ht.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_tdma.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_sysctl.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_rx.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_rx_edma.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_spectral.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ah_osdep.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" # dev/ath/ath_hal/ah.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah_eeprom_v1.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah_eeprom_v3.c optional ath_hal | ath_ar5211 | ath_ar5212 \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah_eeprom_v14.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah_eeprom_v4k.c \ optional ath_hal | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah_eeprom_9287.c \ optional ath_hal | ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah_regdomain.c optional ath \ compile-with "${NORMAL_C} ${NO_WSHIFT_COUNT_NEGATIVE} ${NO_WSHIFT_COUNT_OVERFLOW} -I$S/dev/ath" # ar5210 dev/ath/ath_hal/ar5210/ar5210_attach.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_beacon.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_interrupts.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_keycache.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_misc.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_phy.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_power.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_recv.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_reset.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_xmit.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar5211 dev/ath/ath_hal/ar5211/ar5211_attach.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_beacon.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_interrupts.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_keycache.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_misc.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_phy.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_power.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_recv.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_reset.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_xmit.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar5212 dev/ath/ath_hal/ar5212/ar5212_ani.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_attach.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_beacon.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_eeprom.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_gpio.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_interrupts.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_keycache.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_misc.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_phy.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_power.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_recv.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_reset.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_rfgain.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_xmit.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar5416 (depends on ar5212) dev/ath/ath_hal/ar5416/ar5416_ani.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_attach.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_beacon.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_btcoex.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_cal.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_cal_iq.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_cal_adcgain.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_cal_adcdc.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_eeprom.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_gpio.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_interrupts.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_keycache.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_misc.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_phy.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_power.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_radar.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_recv.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_reset.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_spectral.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_xmit.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar9130 (depends upon ar5416) - also requires AH_SUPPORT_AR9130 # # Since this is an embedded MAC SoC, there's no need to compile it into the # default HAL. dev/ath/ath_hal/ar9001/ar9130_attach.c optional ath_ar9130 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9001/ar9130_phy.c optional ath_ar9130 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9001/ar9130_eeprom.c optional ath_ar9130 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar9160 (depends on ar5416) dev/ath/ath_hal/ar9001/ar9160_attach.c optional ath_hal | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar9280 (depends on ar5416) dev/ath/ath_hal/ar9002/ar9280_attach.c optional ath_hal | ath_ar9280 | \ ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9280_olc.c optional ath_hal | ath_ar9280 | \ ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar9285 (depends on ar5416 and ar9280) dev/ath/ath_hal/ar9002/ar9285_attach.c optional ath_hal | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9285_btcoex.c optional ath_hal | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9285_reset.c optional ath_hal | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9285_cal.c optional ath_hal | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9285_phy.c optional ath_hal | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9285_diversity.c optional ath_hal | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar9287 (depends on ar5416) dev/ath/ath_hal/ar9002/ar9287_attach.c optional ath_hal | ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9287_reset.c optional ath_hal | ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9287_cal.c optional ath_hal | ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9287_olc.c optional ath_hal | ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar9300 contrib/dev/ath/ath_hal/ar9300/ar9300_ani.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_attach.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_beacon.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_eeprom.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal ${NO_WCONSTANT_CONVERSION}" contrib/dev/ath/ath_hal/ar9300/ar9300_freebsd.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_gpio.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_interrupts.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_keycache.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_mci.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_misc.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_paprd.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_phy.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_power.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_radar.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_radio.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_recv.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_recv_ds.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_reset.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal ${NO_WSOMETIMES_UNINITIALIZED} -Wno-unused-function" contrib/dev/ath/ath_hal/ar9300/ar9300_stub.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_stub_funcs.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_spectral.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_timer.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_xmit.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_xmit_ds.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" # rf backends dev/ath/ath_hal/ar5212/ar2316.c optional ath_rf2316 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar2317.c optional ath_rf2317 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar2413.c optional ath_hal | ath_rf2413 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar2425.c optional ath_hal | ath_rf2425 | ath_rf2417 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5111.c optional ath_hal | ath_rf5111 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5112.c optional ath_hal | ath_rf5112 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5413.c optional ath_hal | ath_rf5413 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar2133.c optional ath_hal | ath_ar5416 | \ ath_ar9130 | ath_ar9160 | ath_ar9280 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9280.c optional ath_hal | ath_ar9280 | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9285.c optional ath_hal | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9287.c optional ath_hal | ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ath rate control algorithms dev/ath/ath_rate/amrr/amrr.c optional ath_rate_amrr \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_rate/onoe/onoe.c optional ath_rate_onoe \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_rate/sample/sample.c optional ath_rate_sample \ compile-with "${NORMAL_C} -I$S/dev/ath" # ath DFS modules dev/ath/ath_dfs/null/dfs_null.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" # dev/bce/if_bce.c optional bce dev/bfe/if_bfe.c optional bfe dev/bge/if_bge.c optional bge dev/bhnd/bhnd.c optional bhnd dev/bhnd/bhnd_nexus.c optional bhnd siba_nexus | \ bhnd bcma_nexus dev/bhnd/bhnd_subr.c optional bhnd dev/bhnd/bhnd_bus_if.m optional bhnd dev/bhnd/bhndb/bhnd_bhndb.c optional bhndb bhnd dev/bhnd/bhndb/bhndb.c optional bhndb bhnd dev/bhnd/bhndb/bhndb_bus_if.m optional bhndb bhnd dev/bhnd/bhndb/bhndb_hwdata.c optional bhndb bhnd dev/bhnd/bhndb/bhndb_if.m optional bhndb bhnd dev/bhnd/bhndb/bhndb_pci.c optional bhndb bhnd pci dev/bhnd/bhndb/bhndb_pci_hwdata.c optional bhndb bhnd pci dev/bhnd/bhndb/bhndb_pci_sprom.c optional bhndb bhnd pci dev/bhnd/bhndb/bhndb_subr.c optional bhndb bhnd dev/bhnd/bcma/bcma.c optional bcma bhnd dev/bhnd/bcma/bcma_bhndb.c optional bcma bhnd bhndb dev/bhnd/bcma/bcma_erom.c optional bcma bhnd dev/bhnd/bcma/bcma_nexus.c optional bcma_nexus bcma bhnd dev/bhnd/bcma/bcma_subr.c optional bcma bhnd dev/bhnd/cores/chipc/chipc.c optional bhnd dev/bhnd/cores/chipc/chipc_cfi.c optional bhnd cfi dev/bhnd/cores/chipc/chipc_slicer.c optional bhnd cfi | bhnd spibus dev/bhnd/cores/chipc/chipc_spi.c optional bhnd spibus dev/bhnd/cores/chipc/chipc_subr.c optional bhnd dev/bhnd/cores/chipc/bhnd_chipc_if.m optional bhnd dev/bhnd/cores/chipc/bhnd_sprom_chipc.c optional bhnd dev/bhnd/cores/pci/bhnd_pci.c optional bhnd pci dev/bhnd/cores/pci/bhnd_pci_hostb.c optional bhndb bhnd pci dev/bhnd/cores/pci/bhnd_pcib.c optional bhnd_pcib bhnd pci dev/bhnd/cores/pcie2/bhnd_pcie2.c optional bhnd pci dev/bhnd/cores/pcie2/bhnd_pcie2_hostb.c optional bhndb bhnd pci dev/bhnd/cores/pcie2/bhnd_pcie2b.c optional bhnd_pcie2b bhnd pci dev/bhnd/nvram/bhnd_nvram_if.m optional bhnd dev/bhnd/nvram/bhnd_sprom.c optional bhnd dev/bhnd/nvram/bhnd_sprom_subr.c optional bhnd dev/bhnd/nvram/nvram_subr.c optional bhnd dev/bhnd/siba/siba.c optional siba bhnd dev/bhnd/siba/siba_bhndb.c optional siba bhnd bhndb dev/bhnd/siba/siba_nexus.c optional siba_nexus siba bhnd dev/bhnd/siba/siba_subr.c optional siba bhnd # dev/bktr/bktr_audio.c optional bktr pci dev/bktr/bktr_card.c optional bktr pci dev/bktr/bktr_core.c optional bktr pci dev/bktr/bktr_i2c.c optional bktr pci smbus dev/bktr/bktr_os.c optional bktr pci dev/bktr/bktr_tuner.c optional bktr pci dev/bktr/msp34xx.c optional bktr pci dev/buslogic/bt.c optional bt dev/buslogic/bt_eisa.c optional bt eisa dev/buslogic/bt_isa.c optional bt isa dev/buslogic/bt_mca.c optional bt mca dev/buslogic/bt_pci.c optional bt pci dev/bwi/bwimac.c optional bwi dev/bwi/bwiphy.c optional bwi dev/bwi/bwirf.c optional bwi dev/bwi/if_bwi.c optional bwi dev/bwi/if_bwi_pci.c optional bwi pci # XXX Work around clang warning, until maintainer approves fix. dev/bwn/if_bwn.c optional bwn siba_bwn \ compile-with "${NORMAL_C} ${NO_WSOMETIMES_UNINITIALIZED}" dev/bwn/if_bwn_pci.c optional bwn pci bhnd dev/bwn/if_bwn_phy_common.c optional bwn siba_bwn dev/bwn/if_bwn_phy_g.c optional bwn siba_bwn \ compile-with "${NORMAL_C} ${NO_WSOMETIMES_UNINITIALIZED}" dev/bwn/if_bwn_phy_lp.c optional bwn siba_bwn \ compile-with "${NORMAL_C} ${NO_WSOMETIMES_UNINITIALIZED}" dev/bwn/if_bwn_phy_n.c optional bwn siba_bwn dev/bwn/if_bwn_util.c optional bwn siba_bwn dev/bwn/bwn_mac.c optional bwn bhnd dev/cardbus/cardbus.c optional cardbus dev/cardbus/cardbus_cis.c optional cardbus dev/cardbus/cardbus_device.c optional cardbus dev/cas/if_cas.c optional cas dev/cfi/cfi_bus_fdt.c optional cfi fdt dev/cfi/cfi_bus_nexus.c optional cfi dev/cfi/cfi_core.c optional cfi dev/cfi/cfi_dev.c optional cfi dev/cfi/cfi_disk.c optional cfid dev/ciss/ciss.c optional ciss dev/cm/smc90cx6.c optional cm dev/cmx/cmx.c optional cmx dev/cmx/cmx_pccard.c optional cmx pccard dev/cpufreq/ichss.c optional cpufreq dev/cs/if_cs.c optional cs dev/cs/if_cs_isa.c optional cs isa dev/cs/if_cs_pccard.c optional cs pccard dev/cxgb/cxgb_main.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/cxgb_sge.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_mc5.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_vsc7323.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_vsc8211.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_ael1002.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_aq100x.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_mv88e1xxx.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_xgmac.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_t3_hw.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_tn1010.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/sys/uipc_mvec.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/cxgb_t3fw.c optional cxgb cxgb_t3fw \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgbe/t4_mp_ring.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/t4_main.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/t4_netmap.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/t4_sge.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/t4_l2t.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/t4_tracer.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/common/t4_hw.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" t4fw_cfg.c optional cxgbe \ compile-with "${AWK} -f $S/tools/fw_stub.awk t4fw_cfg.fw:t4fw_cfg t4fw_cfg_uwire.fw:t4fw_cfg_uwire t4fw.fw:t4fw -mt4fw_cfg -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "t4fw_cfg.c" t4fw_cfg.fwo optional cxgbe \ dependency "t4fw_cfg.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "t4fw_cfg.fwo" t4fw_cfg.fw optional cxgbe \ dependency "$S/dev/cxgbe/firmware/t4fw_cfg.txt" \ compile-with "${CP} ${.ALLSRC} ${.TARGET}" \ no-obj no-implicit-rule \ clean "t4fw_cfg.fw" t4fw_cfg_uwire.fwo optional cxgbe \ dependency "t4fw_cfg_uwire.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "t4fw_cfg_uwire.fwo" t4fw_cfg_uwire.fw optional cxgbe \ dependency "$S/dev/cxgbe/firmware/t4fw_cfg_uwire.txt" \ compile-with "${CP} ${.ALLSRC} ${.TARGET}" \ no-obj no-implicit-rule \ clean "t4fw_cfg_uwire.fw" t4fw.fwo optional cxgbe \ dependency "t4fw.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "t4fw.fwo" t4fw.fw optional cxgbe \ dependency "$S/dev/cxgbe/firmware/t4fw-1.15.37.0.bin.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "t4fw.fw" t5fw_cfg.c optional cxgbe \ compile-with "${AWK} -f $S/tools/fw_stub.awk t5fw_cfg.fw:t5fw_cfg t5fw.fw:t5fw -mt5fw_cfg -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "t5fw_cfg.c" t5fw_cfg.fwo optional cxgbe \ dependency "t5fw_cfg.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "t5fw_cfg.fwo" t5fw_cfg.fw optional cxgbe \ dependency "$S/dev/cxgbe/firmware/t5fw_cfg.txt" \ compile-with "${CP} ${.ALLSRC} ${.TARGET}" \ no-obj no-implicit-rule \ clean "t5fw_cfg.fw" t5fw.fwo optional cxgbe \ dependency "t5fw.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "t5fw.fwo" t5fw.fw optional cxgbe \ dependency "$S/dev/cxgbe/firmware/t5fw-1.15.37.0.bin.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "t5fw.fw" dev/cy/cy.c optional cy dev/cy/cy_isa.c optional cy isa dev/cy/cy_pci.c optional cy pci dev/cyapa/cyapa.c optional cyapa smbus dev/dc/if_dc.c optional dc pci dev/dc/dcphy.c optional dc pci dev/dc/pnphy.c optional dc pci dev/dcons/dcons.c optional dcons dev/dcons/dcons_crom.c optional dcons_crom dev/dcons/dcons_os.c optional dcons dev/de/if_de.c optional de pci dev/digi/CX.c optional digi_CX dev/digi/CX_PCI.c optional digi_CX_PCI dev/digi/EPCX.c optional digi_EPCX dev/digi/EPCX_PCI.c optional digi_EPCX_PCI dev/digi/Xe.c optional digi_Xe dev/digi/Xem.c optional digi_Xem dev/digi/Xr.c optional digi_Xr dev/digi/digi.c optional digi dev/digi/digi_isa.c optional digi isa dev/digi/digi_pci.c optional digi pci dev/dpt/dpt_eisa.c optional dpt eisa dev/dpt/dpt_pci.c optional dpt pci dev/dpt/dpt_scsi.c optional dpt dev/drm/ati_pcigart.c optional drm dev/drm/drm_agpsupport.c optional drm dev/drm/drm_auth.c optional drm dev/drm/drm_bufs.c optional drm dev/drm/drm_context.c optional drm dev/drm/drm_dma.c optional drm dev/drm/drm_drawable.c optional drm dev/drm/drm_drv.c optional drm dev/drm/drm_fops.c optional drm dev/drm/drm_hashtab.c optional drm dev/drm/drm_ioctl.c optional drm dev/drm/drm_irq.c optional drm dev/drm/drm_lock.c optional drm dev/drm/drm_memory.c optional drm dev/drm/drm_mm.c optional drm dev/drm/drm_pci.c optional drm dev/drm/drm_scatter.c optional drm dev/drm/drm_sman.c optional drm dev/drm/drm_sysctl.c optional drm dev/drm/drm_vm.c optional drm dev/drm/i915_dma.c optional i915drm dev/drm/i915_drv.c optional i915drm dev/drm/i915_irq.c optional i915drm dev/drm/i915_mem.c optional i915drm dev/drm/i915_suspend.c optional i915drm dev/drm/mach64_dma.c optional mach64drm dev/drm/mach64_drv.c optional mach64drm dev/drm/mach64_irq.c optional mach64drm dev/drm/mach64_state.c optional mach64drm dev/drm/mga_dma.c optional mgadrm dev/drm/mga_drv.c optional mgadrm dev/drm/mga_irq.c optional mgadrm dev/drm/mga_state.c optional mgadrm dev/drm/mga_warp.c optional mgadrm dev/drm/r128_cce.c optional r128drm \ compile-with "${NORMAL_C} ${NO_WCONSTANT_CONVERSION}" dev/drm/r128_drv.c optional r128drm dev/drm/r128_irq.c optional r128drm dev/drm/r128_state.c optional r128drm dev/drm/r300_cmdbuf.c optional radeondrm dev/drm/r600_blit.c optional radeondrm dev/drm/r600_cp.c optional radeondrm \ compile-with "${NORMAL_C} ${NO_WCONSTANT_CONVERSION}" dev/drm/radeon_cp.c optional radeondrm \ compile-with "${NORMAL_C} ${NO_WCONSTANT_CONVERSION}" dev/drm/radeon_cs.c optional radeondrm dev/drm/radeon_drv.c optional radeondrm dev/drm/radeon_irq.c optional radeondrm dev/drm/radeon_mem.c optional radeondrm dev/drm/radeon_state.c optional radeondrm dev/drm/savage_bci.c optional savagedrm dev/drm/savage_drv.c optional savagedrm dev/drm/savage_state.c optional savagedrm dev/drm/sis_drv.c optional sisdrm dev/drm/sis_ds.c optional sisdrm dev/drm/sis_mm.c optional sisdrm dev/drm/tdfx_drv.c optional tdfxdrm dev/drm/via_dma.c optional viadrm dev/drm/via_dmablit.c optional viadrm dev/drm/via_drv.c optional viadrm dev/drm/via_irq.c optional viadrm dev/drm/via_map.c optional viadrm dev/drm/via_mm.c optional viadrm dev/drm/via_verifier.c optional viadrm dev/drm/via_video.c optional viadrm dev/ed/if_ed.c optional ed dev/ed/if_ed_novell.c optional ed dev/ed/if_ed_rtl80x9.c optional ed dev/ed/if_ed_pccard.c optional ed pccard dev/ed/if_ed_pci.c optional ed pci dev/eisa/eisa_if.m standard dev/eisa/eisaconf.c optional eisa dev/e1000/if_em.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/if_lem.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/if_igb.c optional igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_80003es2lan.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82540.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82541.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82542.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82543.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82571.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82575.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_ich8lan.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_i210.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_api.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_mac.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_manage.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_nvm.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_phy.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_vf.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_mbx.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_osdep.c optional em | igb \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/et/if_et.c optional et dev/en/if_en_pci.c optional en pci dev/en/midway.c optional en dev/ep/if_ep.c optional ep dev/ep/if_ep_eisa.c optional ep eisa dev/ep/if_ep_isa.c optional ep isa dev/ep/if_ep_mca.c optional ep mca dev/ep/if_ep_pccard.c optional ep pccard dev/esp/esp_pci.c optional esp pci dev/esp/ncr53c9x.c optional esp dev/etherswitch/arswitch/arswitch.c optional arswitch dev/etherswitch/arswitch/arswitch_reg.c optional arswitch dev/etherswitch/arswitch/arswitch_phy.c optional arswitch dev/etherswitch/arswitch/arswitch_8216.c optional arswitch dev/etherswitch/arswitch/arswitch_8226.c optional arswitch dev/etherswitch/arswitch/arswitch_8316.c optional arswitch dev/etherswitch/arswitch/arswitch_8327.c optional arswitch dev/etherswitch/arswitch/arswitch_7240.c optional arswitch dev/etherswitch/arswitch/arswitch_9340.c optional arswitch dev/etherswitch/arswitch/arswitch_vlans.c optional arswitch dev/etherswitch/etherswitch.c optional etherswitch dev/etherswitch/etherswitch_if.m optional etherswitch dev/etherswitch/ip17x/ip17x.c optional ip17x dev/etherswitch/ip17x/ip175c.c optional ip17x dev/etherswitch/ip17x/ip175d.c optional ip17x dev/etherswitch/ip17x/ip17x_phy.c optional ip17x dev/etherswitch/ip17x/ip17x_vlans.c optional ip17x dev/etherswitch/miiproxy.c optional miiproxy dev/etherswitch/rtl8366/rtl8366rb.c optional rtl8366rb dev/etherswitch/ukswitch/ukswitch.c optional ukswitch dev/ex/if_ex.c optional ex dev/ex/if_ex_isa.c optional ex isa dev/ex/if_ex_pccard.c optional ex pccard dev/exca/exca.c optional cbb dev/extres/clk/clk.c optional ext_resources clk dev/extres/clk/clkdev_if.m optional ext_resources clk dev/extres/clk/clknode_if.m optional ext_resources clk dev/extres/clk/clk_bus.c optional ext_resources clk fdt dev/extres/clk/clk_div.c optional ext_resources clk dev/extres/clk/clk_fixed.c optional ext_resources clk dev/extres/clk/clk_gate.c optional ext_resources clk dev/extres/clk/clk_mux.c optional ext_resources clk dev/extres/phy/phy.c optional ext_resources phy dev/extres/phy/phy_if.m optional ext_resources phy dev/extres/hwreset/hwreset.c optional ext_resources hwreset dev/extres/hwreset/hwreset_if.m optional ext_resources hwreset dev/extres/regulator/regdev_if.m optional ext_resources regulator dev/extres/regulator/regnode_if.m optional ext_resources regulator dev/extres/regulator/regulator.c optional ext_resources regulator dev/extres/regulator/regulator_bus.c optional ext_resources regulator fdt dev/extres/regulator/regulator_fixed.c optional ext_resources regulator dev/fatm/if_fatm.c optional fatm pci dev/fb/fbd.c optional fbd | vt dev/fb/fb_if.m standard dev/fb/splash.c optional sc splash dev/fdt/fdt_clock.c optional fdt fdt_clock dev/fdt/fdt_clock_if.m optional fdt fdt_clock dev/fdt/fdt_common.c optional fdt dev/fdt/fdt_pinctrl.c optional fdt fdt_pinctrl dev/fdt/fdt_pinctrl_if.m optional fdt fdt_pinctrl dev/fdt/fdt_slicer.c optional fdt cfi | fdt nand | fdt mx25l dev/fdt/fdt_static_dtb.S optional fdt fdt_dtb_static \ dependency "fdt_dtb_file" dev/fdt/simplebus.c optional fdt dev/fe/if_fe.c optional fe dev/fe/if_fe_pccard.c optional fe pccard dev/filemon/filemon.c optional filemon dev/firewire/firewire.c optional firewire dev/firewire/fwcrom.c optional firewire dev/firewire/fwdev.c optional firewire dev/firewire/fwdma.c optional firewire dev/firewire/fwmem.c optional firewire dev/firewire/fwohci.c optional firewire dev/firewire/fwohci_pci.c optional firewire pci dev/firewire/if_fwe.c optional fwe dev/firewire/if_fwip.c optional fwip dev/firewire/sbp.c optional sbp dev/firewire/sbp_targ.c optional sbp_targ dev/flash/at45d.c optional at45d dev/flash/mx25l.c optional mx25l dev/fxp/if_fxp.c optional fxp dev/fxp/inphy.c optional fxp dev/gem/if_gem.c optional gem dev/gem/if_gem_pci.c optional gem pci dev/gem/if_gem_sbus.c optional gem sbus dev/gpio/gpiobacklight.c optional gpiobacklight fdt dev/gpio/gpiokeys.c optional gpiokeys fdt dev/gpio/gpiokeys_codes.c optional gpiokeys fdt dev/gpio/gpiobus.c optional gpio \ dependency "gpiobus_if.h" dev/gpio/gpioc.c optional gpio \ dependency "gpio_if.h" dev/gpio/gpioiic.c optional gpioiic dev/gpio/gpioled.c optional gpioled dev/gpio/gpiospi.c optional gpiospi dev/gpio/gpio_if.m optional gpio dev/gpio/gpiobus_if.m optional gpio dev/gpio/gpiopps.c optional gpiopps dev/gpio/ofw_gpiobus.c optional fdt gpio dev/hatm/if_hatm.c optional hatm pci dev/hatm/if_hatm_intr.c optional hatm pci dev/hatm/if_hatm_ioctl.c optional hatm pci dev/hatm/if_hatm_rx.c optional hatm pci dev/hatm/if_hatm_tx.c optional hatm pci dev/hifn/hifn7751.c optional hifn dev/hme/if_hme.c optional hme dev/hme/if_hme_pci.c optional hme pci dev/hme/if_hme_sbus.c optional hme sbus dev/hptiop/hptiop.c optional hptiop scbus dev/hwpmc/hwpmc_logging.c optional hwpmc dev/hwpmc/hwpmc_mod.c optional hwpmc dev/hwpmc/hwpmc_soft.c optional hwpmc dev/ichiic/ig4_iic.c optional ig4 smbus dev/ichiic/ig4_pci.c optional ig4 pci smbus dev/ichsmb/ichsmb.c optional ichsmb dev/ichsmb/ichsmb_pci.c optional ichsmb pci dev/ida/ida.c optional ida dev/ida/ida_disk.c optional ida dev/ida/ida_eisa.c optional ida eisa dev/ida/ida_pci.c optional ida pci dev/ie/if_ie.c optional ie isa nowerror dev/ie/if_ie_isa.c optional ie isa dev/iicbus/ad7418.c optional ad7418 dev/iicbus/ds1307.c optional ds1307 dev/iicbus/ds133x.c optional ds133x dev/iicbus/ds1374.c optional ds1374 dev/iicbus/ds1672.c optional ds1672 dev/iicbus/ds3231.c optional ds3231 dev/iicbus/icee.c optional icee dev/iicbus/if_ic.c optional ic dev/iicbus/iic.c optional iic dev/iicbus/iicbb.c optional iicbb dev/iicbus/iicbb_if.m optional iicbb dev/iicbus/iicbus.c optional iicbus dev/iicbus/iicbus_if.m optional iicbus dev/iicbus/iiconf.c optional iicbus dev/iicbus/iicsmb.c optional iicsmb \ dependency "iicbus_if.h" dev/iicbus/iicoc.c optional iicoc dev/iicbus/lm75.c optional lm75 dev/iicbus/ofw_iicbus.c optional fdt iicbus dev/iicbus/pcf8563.c optional pcf8563 dev/iicbus/s35390a.c optional s35390a dev/iir/iir.c optional iir dev/iir/iir_ctrl.c optional iir dev/iir/iir_pci.c optional iir pci dev/intpm/intpm.c optional intpm pci # XXX Work around clang warning, until maintainer approves fix. dev/ips/ips.c optional ips \ compile-with "${NORMAL_C} ${NO_WSOMETIMES_UNINITIALIZED}" dev/ips/ips_commands.c optional ips dev/ips/ips_disk.c optional ips dev/ips/ips_ioctl.c optional ips dev/ips/ips_pci.c optional ips pci dev/ipw/if_ipw.c optional ipw ipwbssfw.c optional ipwbssfw | ipwfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk ipw_bss.fw:ipw_bss:130 -lintel_ipw -mipw_bss -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "ipwbssfw.c" ipw_bss.fwo optional ipwbssfw | ipwfw \ dependency "ipw_bss.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "ipw_bss.fwo" ipw_bss.fw optional ipwbssfw | ipwfw \ dependency "$S/contrib/dev/ipw/ipw2100-1.3.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "ipw_bss.fw" ipwibssfw.c optional ipwibssfw | ipwfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk ipw_ibss.fw:ipw_ibss:130 -lintel_ipw -mipw_ibss -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "ipwibssfw.c" ipw_ibss.fwo optional ipwibssfw | ipwfw \ dependency "ipw_ibss.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "ipw_ibss.fwo" ipw_ibss.fw optional ipwibssfw | ipwfw \ dependency "$S/contrib/dev/ipw/ipw2100-1.3-i.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "ipw_ibss.fw" ipwmonitorfw.c optional ipwmonitorfw | ipwfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk ipw_monitor.fw:ipw_monitor:130 -lintel_ipw -mipw_monitor -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "ipwmonitorfw.c" ipw_monitor.fwo optional ipwmonitorfw | ipwfw \ dependency "ipw_monitor.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "ipw_monitor.fwo" ipw_monitor.fw optional ipwmonitorfw | ipwfw \ dependency "$S/contrib/dev/ipw/ipw2100-1.3-p.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "ipw_monitor.fw" dev/iscsi/icl.c optional iscsi | ctl dev/iscsi/icl_conn_if.m optional iscsi | ctl dev/iscsi/icl_soft.c optional iscsi | ctl dev/iscsi/icl_soft_proxy.c optional iscsi | ctl dev/iscsi/iscsi.c optional iscsi scbus dev/iscsi_initiator/iscsi.c optional iscsi_initiator scbus dev/iscsi_initiator/iscsi_subr.c optional iscsi_initiator scbus dev/iscsi_initiator/isc_cam.c optional iscsi_initiator scbus dev/iscsi_initiator/isc_soc.c optional iscsi_initiator scbus dev/iscsi_initiator/isc_sm.c optional iscsi_initiator scbus dev/iscsi_initiator/isc_subr.c optional iscsi_initiator scbus dev/ismt/ismt.c optional ismt dev/isl/isl.c optional isl smbus dev/isp/isp.c optional isp dev/isp/isp_freebsd.c optional isp dev/isp/isp_library.c optional isp dev/isp/isp_pci.c optional isp pci dev/isp/isp_sbus.c optional isp sbus dev/isp/isp_target.c optional isp dev/ispfw/ispfw.c optional ispfw dev/iwi/if_iwi.c optional iwi iwibssfw.c optional iwibssfw | iwifw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwi_bss.fw:iwi_bss:300 -lintel_iwi -miwi_bss -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwibssfw.c" iwi_bss.fwo optional iwibssfw | iwifw \ dependency "iwi_bss.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwi_bss.fwo" iwi_bss.fw optional iwibssfw | iwifw \ dependency "$S/contrib/dev/iwi/ipw2200-bss.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwi_bss.fw" iwiibssfw.c optional iwiibssfw | iwifw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwi_ibss.fw:iwi_ibss:300 -lintel_iwi -miwi_ibss -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwiibssfw.c" iwi_ibss.fwo optional iwiibssfw | iwifw \ dependency "iwi_ibss.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwi_ibss.fwo" iwi_ibss.fw optional iwiibssfw | iwifw \ dependency "$S/contrib/dev/iwi/ipw2200-ibss.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwi_ibss.fw" iwimonitorfw.c optional iwimonitorfw | iwifw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwi_monitor.fw:iwi_monitor:300 -lintel_iwi -miwi_monitor -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwimonitorfw.c" iwi_monitor.fwo optional iwimonitorfw | iwifw \ dependency "iwi_monitor.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwi_monitor.fwo" iwi_monitor.fw optional iwimonitorfw | iwifw \ dependency "$S/contrib/dev/iwi/ipw2200-sniffer.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwi_monitor.fw" dev/iwm/if_iwm.c optional iwm dev/iwm/if_iwm_binding.c optional iwm dev/iwm/if_iwm_led.c optional iwm dev/iwm/if_iwm_mac_ctxt.c optional iwm dev/iwm/if_iwm_pcie_trans.c optional iwm dev/iwm/if_iwm_phy_ctxt.c optional iwm dev/iwm/if_iwm_phy_db.c optional iwm dev/iwm/if_iwm_power.c optional iwm dev/iwm/if_iwm_scan.c optional iwm dev/iwm/if_iwm_time_event.c optional iwm dev/iwm/if_iwm_util.c optional iwm iwm3160fw.c optional iwm3160fw | iwmfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwm3160.fw:iwm3160fw -miwm3160fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwm3160fw.c" iwm3160fw.fwo optional iwm3160fw | iwmfw \ dependency "iwm3160.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwm3160fw.fwo" iwm3160.fw optional iwm3160fw | iwmfw \ dependency "$S/contrib/dev/iwm/iwm-3160-9.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwm3160.fw" iwm7260fw.c optional iwm7260fw | iwmfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwm7260.fw:iwm7260fw -miwm7260fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwm7260fw.c" iwm7260fw.fwo optional iwm7260fw | iwmfw \ dependency "iwm7260.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwm7260fw.fwo" iwm7260.fw optional iwm7260fw | iwmfw \ dependency "$S/contrib/dev/iwm/iwm-7260-9.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwm7260.fw" iwm7265fw.c optional iwm7265fw | iwmfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwm7265.fw:iwm7265fw -miwm7265fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwm7265fw.c" iwm7265fw.fwo optional iwm7265fw | iwmfw \ dependency "iwm7265.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwm7265fw.fwo" iwm7265.fw optional iwm7265fw | iwmfw \ dependency "$S/contrib/dev/iwm/iwm-7265-9.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwm7265.fw" dev/iwn/if_iwn.c optional iwn iwn1000fw.c optional iwn1000fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn1000.fw:iwn1000fw -miwn1000fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn1000fw.c" iwn1000fw.fwo optional iwn1000fw | iwnfw \ dependency "iwn1000.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn1000fw.fwo" iwn1000.fw optional iwn1000fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-1000-39.31.5.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn1000.fw" iwn100fw.c optional iwn100fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn100.fw:iwn100fw -miwn100fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn100fw.c" iwn100fw.fwo optional iwn100fw | iwnfw \ dependency "iwn100.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn100fw.fwo" iwn100.fw optional iwn100fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-100-39.31.5.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn100.fw" iwn105fw.c optional iwn105fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn105.fw:iwn105fw -miwn105fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn105fw.c" iwn105fw.fwo optional iwn105fw | iwnfw \ dependency "iwn105.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn105fw.fwo" iwn105.fw optional iwn105fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-105-6-18.168.6.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn105.fw" iwn135fw.c optional iwn135fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn135.fw:iwn135fw -miwn135fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn135fw.c" iwn135fw.fwo optional iwn135fw | iwnfw \ dependency "iwn135.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn135fw.fwo" iwn135.fw optional iwn135fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-135-6-18.168.6.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn135.fw" iwn2000fw.c optional iwn2000fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn2000.fw:iwn2000fw -miwn2000fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn2000fw.c" iwn2000fw.fwo optional iwn2000fw | iwnfw \ dependency "iwn2000.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn2000fw.fwo" iwn2000.fw optional iwn2000fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-2000-18.168.6.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn2000.fw" iwn2030fw.c optional iwn2030fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn2030.fw:iwn2030fw -miwn2030fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn2030fw.c" iwn2030fw.fwo optional iwn2030fw | iwnfw \ dependency "iwn2030.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn2030fw.fwo" iwn2030.fw optional iwn2030fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwnwifi-2030-18.168.6.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn2030.fw" iwn4965fw.c optional iwn4965fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn4965.fw:iwn4965fw -miwn4965fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn4965fw.c" iwn4965fw.fwo optional iwn4965fw | iwnfw \ dependency "iwn4965.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn4965fw.fwo" iwn4965.fw optional iwn4965fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-4965-228.61.2.24.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn4965.fw" iwn5000fw.c optional iwn5000fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn5000.fw:iwn5000fw -miwn5000fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn5000fw.c" iwn5000fw.fwo optional iwn5000fw | iwnfw \ dependency "iwn5000.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn5000fw.fwo" iwn5000.fw optional iwn5000fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-5000-8.83.5.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn5000.fw" iwn5150fw.c optional iwn5150fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn5150.fw:iwn5150fw -miwn5150fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn5150fw.c" iwn5150fw.fwo optional iwn5150fw | iwnfw \ dependency "iwn5150.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn5150fw.fwo" iwn5150.fw optional iwn5150fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-5150-8.24.2.2.fw.uu"\ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn5150.fw" iwn6000fw.c optional iwn6000fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn6000.fw:iwn6000fw -miwn6000fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn6000fw.c" iwn6000fw.fwo optional iwn6000fw | iwnfw \ dependency "iwn6000.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn6000fw.fwo" iwn6000.fw optional iwn6000fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-6000-9.221.4.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn6000.fw" iwn6000g2afw.c optional iwn6000g2afw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn6000g2a.fw:iwn6000g2afw -miwn6000g2afw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn6000g2afw.c" iwn6000g2afw.fwo optional iwn6000g2afw | iwnfw \ dependency "iwn6000g2a.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn6000g2afw.fwo" iwn6000g2a.fw optional iwn6000g2afw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-6000g2a-18.168.6.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn6000g2a.fw" iwn6000g2bfw.c optional iwn6000g2bfw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn6000g2b.fw:iwn6000g2bfw -miwn6000g2bfw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn6000g2bfw.c" iwn6000g2bfw.fwo optional iwn6000g2bfw | iwnfw \ dependency "iwn6000g2b.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn6000g2bfw.fwo" iwn6000g2b.fw optional iwn6000g2bfw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-6000g2b-18.168.6.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn6000g2b.fw" iwn6050fw.c optional iwn6050fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn6050.fw:iwn6050fw -miwn6050fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn6050fw.c" iwn6050fw.fwo optional iwn6050fw | iwnfw \ dependency "iwn6050.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn6050fw.fwo" iwn6050.fw optional iwn6050fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-6050-41.28.5.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn6050.fw" dev/ixgb/if_ixgb.c optional ixgb dev/ixgb/ixgb_ee.c optional ixgb dev/ixgb/ixgb_hw.c optional ixgb dev/ixgbe/if_ix.c optional ix inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe -DSMP" dev/ixgbe/if_ixv.c optional ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe -DSMP" dev/ixgbe/ix_txrx.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_osdep.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_phy.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_api.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_common.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_mbx.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_vf.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_82598.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_82599.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_x540.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_x550.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_dcb.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_dcb_82598.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_dcb_82599.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/jme/if_jme.c optional jme pci dev/joy/joy.c optional joy dev/joy/joy_isa.c optional joy isa dev/kbd/kbd.c optional atkbd | pckbd | sc | ukbd | vt dev/kbdmux/kbdmux.c optional kbdmux dev/ksyms/ksyms.c optional ksyms dev/le/am7990.c optional le dev/le/am79900.c optional le dev/le/if_le_pci.c optional le pci dev/le/lance.c optional le dev/led/led.c standard dev/lge/if_lge.c optional lge dev/lmc/if_lmc.c optional lmc dev/malo/if_malo.c optional malo dev/malo/if_malohal.c optional malo dev/malo/if_malo_pci.c optional malo pci dev/mc146818/mc146818.c optional mc146818 dev/mca/mca_bus.c optional mca dev/mcd/mcd.c optional mcd isa nowerror dev/mcd/mcd_isa.c optional mcd isa nowerror dev/md/md.c optional md dev/mdio/mdio_if.m optional miiproxy | mdio dev/mdio/mdio.c optional miiproxy | mdio dev/mem/memdev.c optional mem dev/mem/memutil.c optional mem dev/mfi/mfi.c optional mfi dev/mfi/mfi_debug.c optional mfi dev/mfi/mfi_pci.c optional mfi pci dev/mfi/mfi_disk.c optional mfi dev/mfi/mfi_syspd.c optional mfi dev/mfi/mfi_tbolt.c optional mfi dev/mfi/mfi_linux.c optional mfi compat_linux dev/mfi/mfi_cam.c optional mfip scbus dev/mii/acphy.c optional miibus | acphy dev/mii/amphy.c optional miibus | amphy dev/mii/atphy.c optional miibus | atphy dev/mii/axphy.c optional miibus | axphy dev/mii/bmtphy.c optional miibus | bmtphy dev/mii/brgphy.c optional miibus | brgphy dev/mii/ciphy.c optional miibus | ciphy dev/mii/e1000phy.c optional miibus | e1000phy dev/mii/gentbi.c optional miibus | gentbi dev/mii/icsphy.c optional miibus | icsphy dev/mii/ip1000phy.c optional miibus | ip1000phy dev/mii/jmphy.c optional miibus | jmphy dev/mii/lxtphy.c optional miibus | lxtphy dev/mii/mii.c optional miibus | mii dev/mii/mii_bitbang.c optional miibus | mii_bitbang dev/mii/mii_physubr.c optional miibus | mii dev/mii/miibus_if.m optional miibus | mii dev/mii/mlphy.c optional miibus | mlphy dev/mii/nsgphy.c optional miibus | nsgphy dev/mii/nsphy.c optional miibus | nsphy dev/mii/nsphyter.c optional miibus | nsphyter dev/mii/pnaphy.c optional miibus | pnaphy dev/mii/qsphy.c optional miibus | qsphy dev/mii/rdcphy.c optional miibus | rdcphy dev/mii/rgephy.c optional miibus | rgephy dev/mii/rlphy.c optional miibus | rlphy dev/mii/rlswitch.c optional rlswitch dev/mii/smcphy.c optional miibus | smcphy dev/mii/smscphy.c optional miibus | smscphy dev/mii/tdkphy.c optional miibus | tdkphy dev/mii/tlphy.c optional miibus | tlphy dev/mii/truephy.c optional miibus | truephy dev/mii/ukphy.c optional miibus | mii dev/mii/ukphy_subr.c optional miibus | mii dev/mii/xmphy.c optional miibus | xmphy dev/mk48txx/mk48txx.c optional mk48txx dev/mlx/mlx.c optional mlx dev/mlx/mlx_disk.c optional mlx dev/mlx/mlx_pci.c optional mlx pci dev/mly/mly.c optional mly dev/mmc/mmc.c optional mmc dev/mmc/mmcbr_if.m standard dev/mmc/mmcbus_if.m standard dev/mmc/mmcsd.c optional mmcsd dev/mn/if_mn.c optional mn pci dev/mpr/mpr.c optional mpr dev/mpr/mpr_config.c optional mpr # XXX Work around clang warning, until maintainer approves fix. dev/mpr/mpr_mapping.c optional mpr \ compile-with "${NORMAL_C} ${NO_WSOMETIMES_UNINITIALIZED}" dev/mpr/mpr_pci.c optional mpr pci dev/mpr/mpr_sas.c optional mpr \ compile-with "${NORMAL_C} ${NO_WUNNEEDED_INTERNAL_DECL}" dev/mpr/mpr_sas_lsi.c optional mpr dev/mpr/mpr_table.c optional mpr dev/mpr/mpr_user.c optional mpr dev/mps/mps.c optional mps dev/mps/mps_config.c optional mps # XXX Work around clang warning, until maintainer approves fix. dev/mps/mps_mapping.c optional mps \ compile-with "${NORMAL_C} ${NO_WSOMETIMES_UNINITIALIZED}" dev/mps/mps_pci.c optional mps pci dev/mps/mps_sas.c optional mps \ compile-with "${NORMAL_C} ${NO_WUNNEEDED_INTERNAL_DECL}" dev/mps/mps_sas_lsi.c optional mps dev/mps/mps_table.c optional mps dev/mps/mps_user.c optional mps dev/mpt/mpt.c optional mpt dev/mpt/mpt_cam.c optional mpt dev/mpt/mpt_debug.c optional mpt dev/mpt/mpt_pci.c optional mpt pci dev/mpt/mpt_raid.c optional mpt dev/mpt/mpt_user.c optional mpt dev/mrsas/mrsas.c optional mrsas dev/mrsas/mrsas_cam.c optional mrsas dev/mrsas/mrsas_ioctl.c optional mrsas dev/mrsas/mrsas_fp.c optional mrsas dev/msk/if_msk.c optional msk dev/mvs/mvs.c optional mvs dev/mvs/mvs_if.m optional mvs dev/mvs/mvs_pci.c optional mvs pci dev/mwl/if_mwl.c optional mwl dev/mwl/if_mwl_pci.c optional mwl pci dev/mwl/mwlhal.c optional mwl mwlfw.c optional mwlfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk mw88W8363.fw:mw88W8363fw mwlboot.fw:mwlboot -mmwl -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "mwlfw.c" mw88W8363.fwo optional mwlfw \ dependency "mw88W8363.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "mw88W8363.fwo" mw88W8363.fw optional mwlfw \ dependency "$S/contrib/dev/mwl/mw88W8363.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "mw88W8363.fw" mwlboot.fwo optional mwlfw \ dependency "mwlboot.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "mwlboot.fwo" mwlboot.fw optional mwlfw \ dependency "$S/contrib/dev/mwl/mwlboot.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "mwlboot.fw" dev/mxge/if_mxge.c optional mxge pci dev/mxge/mxge_eth_z8e.c optional mxge pci dev/mxge/mxge_ethp_z8e.c optional mxge pci dev/mxge/mxge_rss_eth_z8e.c optional mxge pci dev/mxge/mxge_rss_ethp_z8e.c optional mxge pci dev/my/if_my.c optional my dev/nand/nand.c optional nand dev/nand/nand_bbt.c optional nand dev/nand/nand_cdev.c optional nand dev/nand/nand_generic.c optional nand dev/nand/nand_geom.c optional nand dev/nand/nand_id.c optional nand dev/nand/nandbus.c optional nand dev/nand/nandbus_if.m optional nand dev/nand/nand_if.m optional nand dev/nand/nandsim.c optional nandsim nand dev/nand/nandsim_chip.c optional nandsim nand dev/nand/nandsim_ctrl.c optional nandsim nand dev/nand/nandsim_log.c optional nandsim nand dev/nand/nandsim_swap.c optional nandsim nand dev/nand/nfc_if.m optional nand dev/ncr/ncr.c optional ncr pci dev/ncv/ncr53c500.c optional ncv dev/ncv/ncr53c500_pccard.c optional ncv pccard dev/netmap/netmap.c optional netmap dev/netmap/netmap_freebsd.c optional netmap dev/netmap/netmap_generic.c optional netmap dev/netmap/netmap_mbq.c optional netmap dev/netmap/netmap_mem2.c optional netmap dev/netmap/netmap_monitor.c optional netmap dev/netmap/netmap_offloadings.c optional netmap dev/netmap/netmap_pipe.c optional netmap dev/netmap/netmap_vale.c optional netmap # compile-with "${NORMAL_C} -Wconversion -Wextra" dev/nfsmb/nfsmb.c optional nfsmb pci dev/nge/if_nge.c optional nge dev/nxge/if_nxge.c optional nxge \ compile-with "${NORMAL_C} ${NO_WSELF_ASSIGN}" dev/nxge/xgehal/xgehal-device.c optional nxge \ compile-with "${NORMAL_C} ${NO_WSELF_ASSIGN}" dev/nxge/xgehal/xgehal-mm.c optional nxge dev/nxge/xgehal/xge-queue.c optional nxge dev/nxge/xgehal/xgehal-driver.c optional nxge \ compile-with "${NORMAL_C} ${NO_WSELF_ASSIGN}" dev/nxge/xgehal/xgehal-ring.c optional nxge \ compile-with "${NORMAL_C} ${NO_WSELF_ASSIGN}" dev/nxge/xgehal/xgehal-channel.c optional nxge \ compile-with "${NORMAL_C} ${NO_WSELF_ASSIGN}" dev/nxge/xgehal/xgehal-fifo.c optional nxge \ compile-with "${NORMAL_C} ${NO_WSELF_ASSIGN}" dev/nxge/xgehal/xgehal-stats.c optional nxge \ compile-with "${NORMAL_C} ${NO_WSELF_ASSIGN}" dev/nxge/xgehal/xgehal-config.c optional nxge dev/nxge/xgehal/xgehal-mgmt.c optional nxge \ compile-with "${NORMAL_C} ${NO_WSELF_ASSIGN}" dev/nmdm/nmdm.c optional nmdm dev/nsp/nsp.c optional nsp dev/nsp/nsp_pccard.c optional nsp pccard dev/null/null.c standard dev/oce/oce_hw.c optional oce pci dev/oce/oce_if.c optional oce pci dev/oce/oce_mbox.c optional oce pci dev/oce/oce_queue.c optional oce pci dev/oce/oce_sysctl.c optional oce pci dev/oce/oce_util.c optional oce pci dev/ofw/ofw_bus_if.m optional fdt dev/ofw/ofw_bus_subr.c optional fdt dev/ofw/ofw_fdt.c optional fdt dev/ofw/ofw_if.m optional fdt dev/ofw/ofw_subr.c optional fdt dev/ofw/ofwbus.c optional fdt dev/ofw/openfirm.c optional fdt dev/ofw/openfirmio.c optional fdt dev/ow/ow.c optional ow \ dependency "owll_if.h" \ dependency "own_if.h" dev/ow/owll_if.m optional ow dev/ow/own_if.m optional ow dev/ow/ow_temp.c optional ow_temp dev/ow/owc_gpiobus.c optional owc gpio dev/patm/if_patm.c optional patm pci dev/patm/if_patm_attach.c optional patm pci dev/patm/if_patm_intr.c optional patm pci dev/patm/if_patm_ioctl.c optional patm pci dev/patm/if_patm_rtables.c optional patm pci dev/patm/if_patm_rx.c optional patm pci dev/patm/if_patm_tx.c optional patm pci dev/pbio/pbio.c optional pbio isa dev/pccard/card_if.m standard dev/pccard/pccard.c optional pccard dev/pccard/pccard_cis.c optional pccard dev/pccard/pccard_cis_quirks.c optional pccard dev/pccard/pccard_device.c optional pccard dev/pccard/power_if.m standard dev/pccbb/pccbb.c optional cbb dev/pccbb/pccbb_isa.c optional cbb isa dev/pccbb/pccbb_pci.c optional cbb pci dev/pcf/pcf.c optional pcf dev/pci/eisa_pci.c optional pci eisa dev/pci/fixup_pci.c optional pci dev/pci/hostb_pci.c optional pci dev/pci/ignore_pci.c optional pci dev/pci/isa_pci.c optional pci isa dev/pci/pci.c optional pci dev/pci/pci_if.m standard dev/pci/pci_iov.c optional pci pci_iov dev/pci/pci_iov_if.m standard dev/pci/pci_iov_schema.c optional pci pci_iov dev/pci/pci_pci.c optional pci dev/pci/pci_subr.c optional pci dev/pci/pci_user.c optional pci dev/pci/pcib_if.m standard dev/pci/pcib_support.c standard dev/pci/vga_pci.c optional pci dev/pcn/if_pcn.c optional pcn pci dev/pdq/if_fea.c optional fea eisa dev/pdq/if_fpa.c optional fpa pci dev/pdq/pdq.c optional nowerror fea eisa | fpa pci dev/pdq/pdq_ifsubr.c optional nowerror fea eisa | fpa pci dev/pms/freebsd/driver/ini/src/agtiapi.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sadisc.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/mpi.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/saframe.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sahw.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sainit.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/saint.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sampicmd.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sampirsp.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/saphy.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/saport.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sasata.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sasmp.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sassp.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/satimer.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sautil.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/saioctlcmd.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/mpidebug.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/discovery/dm/dminit.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/discovery/dm/dmsmp.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/discovery/dm/dmdisc.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/discovery/dm/dmport.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/discovery/dm/dmtimer.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/discovery/dm/dmmisc.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sat/src/sminit.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sat/src/smmisc.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sat/src/smsat.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sat/src/smsatcb.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sat/src/smsathw.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sat/src/smtimer.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdinit.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdmisc.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdesgl.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdport.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdint.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdioctl.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdhw.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/ossacmnapi.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tddmcmnapi.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdsmcmnapi.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdtimers.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/sas/ini/itdio.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/sas/ini/itdcb.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/sas/ini/itdinit.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/sas/ini/itddisc.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/sata/host/sat.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/sata/host/ossasat.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/sata/host/sathw.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/ppbus/if_plip.c optional plip dev/ppbus/immio.c optional vpo dev/ppbus/lpbb.c optional lpbb dev/ppbus/lpt.c optional lpt dev/ppbus/pcfclock.c optional pcfclock dev/ppbus/ppb_1284.c optional ppbus dev/ppbus/ppb_base.c optional ppbus dev/ppbus/ppb_msq.c optional ppbus dev/ppbus/ppbconf.c optional ppbus dev/ppbus/ppbus_if.m optional ppbus dev/ppbus/ppi.c optional ppi dev/ppbus/pps.c optional pps dev/ppbus/vpo.c optional vpo dev/ppbus/vpoio.c optional vpo dev/ppc/ppc.c optional ppc dev/ppc/ppc_acpi.c optional ppc acpi dev/ppc/ppc_isa.c optional ppc isa dev/ppc/ppc_pci.c optional ppc pci dev/ppc/ppc_puc.c optional ppc puc dev/proto/proto_bus_isa.c optional proto acpi | proto isa dev/proto/proto_bus_pci.c optional proto pci dev/proto/proto_busdma.c optional proto dev/proto/proto_core.c optional proto dev/pst/pst-iop.c optional pst dev/pst/pst-pci.c optional pst pci dev/pst/pst-raid.c optional pst dev/pty/pty.c optional pty dev/puc/puc.c optional puc dev/puc/puc_cfg.c optional puc dev/puc/puc_pccard.c optional puc pccard dev/puc/puc_pci.c optional puc pci dev/puc/pucdata.c optional puc pci dev/quicc/quicc_core.c optional quicc dev/ral/rt2560.c optional ral dev/ral/rt2661.c optional ral dev/ral/rt2860.c optional ral dev/ral/if_ral_pci.c optional ral pci rt2561fw.c optional rt2561fw | ralfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rt2561.fw:rt2561fw -mrt2561 -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rt2561fw.c" rt2561fw.fwo optional rt2561fw | ralfw \ dependency "rt2561.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rt2561fw.fwo" rt2561.fw optional rt2561fw | ralfw \ dependency "$S/contrib/dev/ral/rt2561.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rt2561.fw" rt2561sfw.c optional rt2561sfw | ralfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rt2561s.fw:rt2561sfw -mrt2561s -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rt2561sfw.c" rt2561sfw.fwo optional rt2561sfw | ralfw \ dependency "rt2561s.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rt2561sfw.fwo" rt2561s.fw optional rt2561sfw | ralfw \ dependency "$S/contrib/dev/ral/rt2561s.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rt2561s.fw" rt2661fw.c optional rt2661fw | ralfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rt2661.fw:rt2661fw -mrt2661 -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rt2661fw.c" rt2661fw.fwo optional rt2661fw | ralfw \ dependency "rt2661.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rt2661fw.fwo" rt2661.fw optional rt2661fw | ralfw \ dependency "$S/contrib/dev/ral/rt2661.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rt2661.fw" rt2860fw.c optional rt2860fw | ralfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rt2860.fw:rt2860fw -mrt2860 -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rt2860fw.c" rt2860fw.fwo optional rt2860fw | ralfw \ dependency "rt2860.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rt2860fw.fwo" rt2860.fw optional rt2860fw | ralfw \ dependency "$S/contrib/dev/ral/rt2860.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rt2860.fw" dev/random/random_infra.c optional random dev/random/random_harvestq.c optional random dev/random/randomdev.c optional random random_yarrow | \ random !random_yarrow !random_loadable dev/random/yarrow.c optional random random_yarrow dev/random/fortuna.c optional random !random_yarrow !random_loadable dev/random/hash.c optional random random_yarrow | \ random !random_yarrow !random_loadable dev/rc/rc.c optional rc dev/rccgpio/rccgpio.c optional rccgpio gpio dev/re/if_re.c optional re dev/rl/if_rl.c optional rl pci dev/rndtest/rndtest.c optional rndtest dev/rp/rp.c optional rp dev/rp/rp_isa.c optional rp isa dev/rp/rp_pci.c optional rp pci dev/rtwn/if_rtwn.c optional rtwn rtwn-rtl8192cfwU.c optional rtwn-rtl8192cfwU | rtwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rtwn-rtl8192cfwU.fw:rtwn-rtl8192cfwU:111 -mrtwn-rtl8192cfwU -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rtwn-rtl8192cfwU.c" rtwn-rtl8192cfwU.fwo optional rtwn-rtl8192cfwU | rtwnfw \ dependency "rtwn-rtl8192cfwU.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rtwn-rtl8192cfwU.fwo" rtwn-rtl8192cfwU.fw optional rtwn-rtl8192cfwU | rtwnfw \ dependency "$S/contrib/dev/rtwn/rtwn-rtl8192cfwU.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rtwn-rtl8192cfwU.fw" rtwn-rtl8192cfwU_B.c optional rtwn-rtl8192cfwU_B | rtwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rtwn-rtl8192cfwU_B.fw:rtwn-rtl8192cfwU_B:111 -mrtwn-rtl8192cfwU_B -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rtwn-rtl8192cfwU_B.c" rtwn-rtl8192cfwU_B.fwo optional rtwn-rtl8192cfwU_B | rtwnfw \ dependency "rtwn-rtl8192cfwU_B.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rtwn-rtl8192cfwU_B.fwo" rtwn-rtl8192cfwU_B.fw optional rtwn-rtl8192cfwU_B | rtwnfw \ dependency "$S/contrib/dev/rtwn/rtwn-rtl8192cfwU_B.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rtwn-rtl8192cfwU_B.fw" dev/safe/safe.c optional safe dev/scc/scc_if.m optional scc dev/scc/scc_bfe_ebus.c optional scc ebus dev/scc/scc_bfe_quicc.c optional scc quicc dev/scc/scc_bfe_sbus.c optional scc fhc | scc sbus dev/scc/scc_core.c optional scc dev/scc/scc_dev_quicc.c optional scc quicc dev/scc/scc_dev_sab82532.c optional scc dev/scc/scc_dev_z8530.c optional scc dev/scd/scd.c optional scd isa dev/scd/scd_isa.c optional scd isa dev/sdhci/sdhci.c optional sdhci dev/sdhci/sdhci_if.m optional sdhci dev/sdhci/sdhci_pci.c optional sdhci pci dev/sf/if_sf.c optional sf pci dev/sge/if_sge.c optional sge pci dev/si/si.c optional si dev/si/si2_z280.c optional si dev/si/si3_t225.c optional si dev/si/si_eisa.c optional si eisa dev/si/si_isa.c optional si isa dev/si/si_pci.c optional si pci dev/siba/siba_bwn.c optional siba_bwn pci dev/siba/siba_core.c optional siba_bwn pci dev/siis/siis.c optional siis pci dev/sis/if_sis.c optional sis pci dev/sk/if_sk.c optional sk pci dev/smbus/smb.c optional smb dev/smbus/smbconf.c optional smbus dev/smbus/smbus.c optional smbus dev/smbus/smbus_if.m optional smbus dev/smc/if_smc.c optional smc dev/smc/if_smc_fdt.c optional smc fdt dev/sn/if_sn.c optional sn dev/sn/if_sn_isa.c optional sn isa dev/sn/if_sn_pccard.c optional sn pccard dev/snp/snp.c optional snp dev/sound/clone.c optional sound dev/sound/unit.c optional sound dev/sound/isa/ad1816.c optional snd_ad1816 isa dev/sound/isa/ess.c optional snd_ess isa dev/sound/isa/gusc.c optional snd_gusc isa dev/sound/isa/mss.c optional snd_mss isa dev/sound/isa/sb16.c optional snd_sb16 isa dev/sound/isa/sb8.c optional snd_sb8 isa dev/sound/isa/sbc.c optional snd_sbc isa dev/sound/isa/sndbuf_dma.c optional sound isa dev/sound/pci/als4000.c optional snd_als4000 pci dev/sound/pci/atiixp.c optional snd_atiixp pci dev/sound/pci/cmi.c optional snd_cmi pci dev/sound/pci/cs4281.c optional snd_cs4281 pci dev/sound/pci/csa.c optional snd_csa pci dev/sound/pci/csapcm.c optional snd_csa pci dev/sound/pci/ds1.c optional snd_ds1 pci dev/sound/pci/emu10k1.c optional snd_emu10k1 pci dev/sound/pci/emu10kx.c optional snd_emu10kx pci dev/sound/pci/emu10kx-pcm.c optional snd_emu10kx pci dev/sound/pci/emu10kx-midi.c optional snd_emu10kx pci dev/sound/pci/envy24.c optional snd_envy24 pci dev/sound/pci/envy24ht.c optional snd_envy24ht pci dev/sound/pci/es137x.c optional snd_es137x pci dev/sound/pci/fm801.c optional snd_fm801 pci dev/sound/pci/ich.c optional snd_ich pci dev/sound/pci/maestro.c optional snd_maestro pci dev/sound/pci/maestro3.c optional snd_maestro3 pci dev/sound/pci/neomagic.c optional snd_neomagic pci dev/sound/pci/solo.c optional snd_solo pci dev/sound/pci/spicds.c optional snd_spicds pci dev/sound/pci/t4dwave.c optional snd_t4dwave pci dev/sound/pci/via8233.c optional snd_via8233 pci dev/sound/pci/via82c686.c optional snd_via82c686 pci dev/sound/pci/vibes.c optional snd_vibes pci dev/sound/pci/hda/hdaa.c optional snd_hda pci dev/sound/pci/hda/hdaa_patches.c optional snd_hda pci dev/sound/pci/hda/hdac.c optional snd_hda pci dev/sound/pci/hda/hdac_if.m optional snd_hda pci dev/sound/pci/hda/hdacc.c optional snd_hda pci dev/sound/pci/hdspe.c optional snd_hdspe pci dev/sound/pci/hdspe-pcm.c optional snd_hdspe pci dev/sound/pcm/ac97.c optional sound dev/sound/pcm/ac97_if.m optional sound dev/sound/pcm/ac97_patch.c optional sound dev/sound/pcm/buffer.c optional sound \ dependency "snd_fxdiv_gen.h" dev/sound/pcm/channel.c optional sound dev/sound/pcm/channel_if.m optional sound dev/sound/pcm/dsp.c optional sound dev/sound/pcm/feeder.c optional sound dev/sound/pcm/feeder_chain.c optional sound dev/sound/pcm/feeder_eq.c optional sound \ dependency "feeder_eq_gen.h" \ dependency "snd_fxdiv_gen.h" dev/sound/pcm/feeder_if.m optional sound dev/sound/pcm/feeder_format.c optional sound \ dependency "snd_fxdiv_gen.h" dev/sound/pcm/feeder_matrix.c optional sound \ dependency "snd_fxdiv_gen.h" dev/sound/pcm/feeder_mixer.c optional sound \ dependency "snd_fxdiv_gen.h" dev/sound/pcm/feeder_rate.c optional sound \ dependency "feeder_rate_gen.h" \ dependency "snd_fxdiv_gen.h" dev/sound/pcm/feeder_volume.c optional sound \ dependency "snd_fxdiv_gen.h" dev/sound/pcm/mixer.c optional sound dev/sound/pcm/mixer_if.m optional sound dev/sound/pcm/sndstat.c optional sound dev/sound/pcm/sound.c optional sound dev/sound/pcm/vchan.c optional sound dev/sound/usb/uaudio.c optional snd_uaudio usb dev/sound/usb/uaudio_pcm.c optional snd_uaudio usb dev/sound/midi/midi.c optional sound dev/sound/midi/mpu401.c optional sound dev/sound/midi/mpu_if.m optional sound dev/sound/midi/mpufoi_if.m optional sound dev/sound/midi/sequencer.c optional sound dev/sound/midi/synth_if.m optional sound dev/spibus/ofw_spibus.c optional fdt spibus dev/spibus/spibus.c optional spibus \ dependency "spibus_if.h" dev/spibus/spigen.c optional spigen dev/spibus/spibus_if.m optional spibus dev/ste/if_ste.c optional ste pci dev/stg/tmc18c30.c optional stg dev/stg/tmc18c30_isa.c optional stg isa dev/stg/tmc18c30_pccard.c optional stg pccard dev/stg/tmc18c30_pci.c optional stg pci dev/stg/tmc18c30_subr.c optional stg dev/stge/if_stge.c optional stge dev/streams/streams.c optional streams dev/sym/sym_hipd.c optional sym \ dependency "$S/dev/sym/sym_{conf,defs}.h" dev/syscons/blank/blank_saver.c optional blank_saver dev/syscons/daemon/daemon_saver.c optional daemon_saver dev/syscons/dragon/dragon_saver.c optional dragon_saver dev/syscons/fade/fade_saver.c optional fade_saver dev/syscons/fire/fire_saver.c optional fire_saver dev/syscons/green/green_saver.c optional green_saver dev/syscons/logo/logo.c optional logo_saver dev/syscons/logo/logo_saver.c optional logo_saver dev/syscons/rain/rain_saver.c optional rain_saver dev/syscons/schistory.c optional sc dev/syscons/scmouse.c optional sc dev/syscons/scterm.c optional sc dev/syscons/scvidctl.c optional sc dev/syscons/snake/snake_saver.c optional snake_saver dev/syscons/star/star_saver.c optional star_saver dev/syscons/syscons.c optional sc dev/syscons/sysmouse.c optional sc dev/syscons/warp/warp_saver.c optional warp_saver dev/tdfx/tdfx_linux.c optional tdfx_linux tdfx compat_linux dev/tdfx/tdfx_pci.c optional tdfx pci dev/ti/if_ti.c optional ti pci dev/tl/if_tl.c optional tl pci dev/trm/trm.c optional trm dev/twa/tw_cl_init.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twa/tw_cl_intr.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twa/tw_cl_io.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twa/tw_cl_misc.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twa/tw_osl_cam.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twa/tw_osl_freebsd.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twe/twe.c optional twe dev/twe/twe_freebsd.c optional twe dev/tws/tws.c optional tws dev/tws/tws_cam.c optional tws dev/tws/tws_hdm.c optional tws dev/tws/tws_services.c optional tws dev/tws/tws_user.c optional tws dev/tx/if_tx.c optional tx dev/txp/if_txp.c optional txp dev/uart/uart_bus_acpi.c optional uart acpi dev/uart/uart_bus_ebus.c optional uart ebus dev/uart/uart_bus_fdt.c optional uart fdt dev/uart/uart_bus_isa.c optional uart isa dev/uart/uart_bus_pccard.c optional uart pccard dev/uart/uart_bus_pci.c optional uart pci dev/uart/uart_bus_puc.c optional uart puc dev/uart/uart_bus_scc.c optional uart scc dev/uart/uart_core.c optional uart dev/uart/uart_dbg.c optional uart gdb dev/uart/uart_dev_ns8250.c optional uart uart_ns8250 | uart uart_snps dev/uart/uart_dev_pl011.c optional uart pl011 dev/uart/uart_dev_quicc.c optional uart quicc dev/uart/uart_dev_sab82532.c optional uart uart_sab82532 dev/uart/uart_dev_sab82532.c optional uart scc dev/uart/uart_dev_snps.c optional uart uart_snps dev/uart/uart_dev_z8530.c optional uart uart_z8530 dev/uart/uart_dev_z8530.c optional uart scc dev/uart/uart_if.m optional uart dev/uart/uart_subr.c optional uart dev/uart/uart_tty.c optional uart dev/ubsec/ubsec.c optional ubsec # # USB controller drivers # dev/usb/controller/at91dci.c optional at91dci dev/usb/controller/at91dci_atmelarm.c optional at91dci at91rm9200 dev/usb/controller/musb_otg.c optional musb dev/usb/controller/musb_otg_atmelarm.c optional musb at91rm9200 dev/usb/controller/dwc_otg.c optional dwcotg dev/usb/controller/dwc_otg_fdt.c optional dwcotg fdt dev/usb/controller/ehci.c optional ehci dev/usb/controller/ehci_pci.c optional ehci pci dev/usb/controller/ohci.c optional ohci dev/usb/controller/ohci_pci.c optional ohci pci dev/usb/controller/uhci.c optional uhci dev/usb/controller/uhci_pci.c optional uhci pci dev/usb/controller/xhci.c optional xhci dev/usb/controller/xhci_pci.c optional xhci pci dev/usb/controller/saf1761_otg.c optional saf1761otg dev/usb/controller/saf1761_otg_fdt.c optional saf1761otg fdt dev/usb/controller/uss820dci.c optional uss820dci dev/usb/controller/uss820dci_atmelarm.c optional uss820dci at91rm9200 dev/usb/controller/usb_controller.c optional usb # # USB storage drivers # dev/usb/storage/umass.c optional umass dev/usb/storage/urio.c optional urio dev/usb/storage/ustorage_fs.c optional usfs # # USB core # dev/usb/usb_busdma.c optional usb dev/usb/usb_core.c optional usb dev/usb/usb_debug.c optional usb dev/usb/usb_dev.c optional usb dev/usb/usb_device.c optional usb dev/usb/usb_dynamic.c optional usb dev/usb/usb_error.c optional usb dev/usb/usb_generic.c optional usb dev/usb/usb_handle_request.c optional usb dev/usb/usb_hid.c optional usb dev/usb/usb_hub.c optional usb dev/usb/usb_if.m optional usb dev/usb/usb_lookup.c optional usb dev/usb/usb_mbuf.c optional usb dev/usb/usb_msctest.c optional usb dev/usb/usb_parse.c optional usb dev/usb/usb_pf.c optional usb dev/usb/usb_process.c optional usb dev/usb/usb_request.c optional usb dev/usb/usb_transfer.c optional usb dev/usb/usb_util.c optional usb # # USB network drivers # dev/usb/net/if_aue.c optional aue dev/usb/net/if_axe.c optional axe dev/usb/net/if_axge.c optional axge dev/usb/net/if_cdce.c optional cdce dev/usb/net/if_cue.c optional cue dev/usb/net/if_ipheth.c optional ipheth dev/usb/net/if_kue.c optional kue dev/usb/net/if_mos.c optional mos dev/usb/net/if_rue.c optional rue dev/usb/net/if_smsc.c optional smsc dev/usb/net/if_udav.c optional udav dev/usb/net/if_ure.c optional ure dev/usb/net/if_usie.c optional usie dev/usb/net/if_urndis.c optional urndis dev/usb/net/ruephy.c optional rue dev/usb/net/usb_ethernet.c optional uether | aue | axe | axge | cdce | \ cue | ipheth | kue | mos | rue | \ smsc | udav | ure | urndis dev/usb/net/uhso.c optional uhso # # USB WLAN drivers # dev/usb/wlan/if_rsu.c optional rsu rsu-rtl8712fw.c optional rsu-rtl8712fw | rsufw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rsu-rtl8712fw.fw:rsu-rtl8712fw:120 -mrsu-rtl8712fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rsu-rtl8712fw.c" rsu-rtl8712fw.fwo optional rsu-rtl8712fw | rsufw \ dependency "rsu-rtl8712fw.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rsu-rtl8712fw.fwo" rsu-rtl8712fw.fw optional rsu-rtl8712.fw | rsufw \ dependency "$S/contrib/dev/rsu/rsu-rtl8712fw.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rsu-rtl8712fw.fw" dev/usb/wlan/if_rum.c optional rum dev/usb/wlan/if_run.c optional run runfw.c optional runfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk run.fw:runfw -mrunfw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "runfw.c" runfw.fwo optional runfw \ dependency "run.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "runfw.fwo" run.fw optional runfw \ dependency "$S/contrib/dev/run/rt2870.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "run.fw" dev/usb/wlan/if_uath.c optional uath dev/usb/wlan/if_upgt.c optional upgt dev/usb/wlan/if_ural.c optional ural dev/usb/wlan/if_urtw.c optional urtw dev/usb/wlan/if_zyd.c optional zyd # # USB serial and parallel port drivers # dev/usb/serial/u3g.c optional u3g dev/usb/serial/uark.c optional uark dev/usb/serial/ubsa.c optional ubsa dev/usb/serial/ubser.c optional ubser dev/usb/serial/uchcom.c optional uchcom dev/usb/serial/ucycom.c optional ucycom dev/usb/serial/ufoma.c optional ufoma dev/usb/serial/uftdi.c optional uftdi dev/usb/serial/ugensa.c optional ugensa dev/usb/serial/uipaq.c optional uipaq dev/usb/serial/ulpt.c optional ulpt dev/usb/serial/umcs.c optional umcs dev/usb/serial/umct.c optional umct dev/usb/serial/umodem.c optional umodem dev/usb/serial/umoscom.c optional umoscom dev/usb/serial/uplcom.c optional uplcom dev/usb/serial/uslcom.c optional uslcom dev/usb/serial/uvisor.c optional uvisor dev/usb/serial/uvscom.c optional uvscom dev/usb/serial/usb_serial.c optional ucom | u3g | uark | ubsa | ubser | \ uchcom | ucycom | ufoma | uftdi | \ ugensa | uipaq | umcs | umct | \ umodem | umoscom | uplcom | usie | \ uslcom | uvisor | uvscom # # USB misc drivers # dev/usb/misc/ufm.c optional ufm dev/usb/misc/udbp.c optional udbp dev/usb/misc/ugold.c optional ugold dev/usb/misc/uled.c optional uled # # USB input drivers # dev/usb/input/atp.c optional atp dev/usb/input/uep.c optional uep dev/usb/input/uhid.c optional uhid dev/usb/input/ukbd.c optional ukbd dev/usb/input/ums.c optional ums dev/usb/input/wsp.c optional wsp # # USB quirks # dev/usb/quirk/usb_quirk.c optional usb # # USB templates # dev/usb/template/usb_template.c optional usb_template dev/usb/template/usb_template_audio.c optional usb_template dev/usb/template/usb_template_cdce.c optional usb_template dev/usb/template/usb_template_kbd.c optional usb_template dev/usb/template/usb_template_modem.c optional usb_template dev/usb/template/usb_template_mouse.c optional usb_template dev/usb/template/usb_template_msc.c optional usb_template dev/usb/template/usb_template_mtp.c optional usb_template dev/usb/template/usb_template_phone.c optional usb_template dev/usb/template/usb_template_serialnet.c optional usb_template dev/usb/template/usb_template_midi.c optional usb_template # # USB video drivers # dev/usb/video/udl.c optional udl # # USB END # dev/videomode/videomode.c optional videomode dev/videomode/edid.c optional videomode dev/videomode/pickmode.c optional videomode dev/videomode/vesagtf.c optional videomode dev/utopia/idtphy.c optional utopia dev/utopia/suni.c optional utopia dev/utopia/utopia.c optional utopia dev/vge/if_vge.c optional vge dev/viapm/viapm.c optional viapm pci dev/virtio/virtio.c optional virtio dev/virtio/virtqueue.c optional virtio dev/virtio/virtio_bus_if.m optional virtio dev/virtio/virtio_if.m optional virtio dev/virtio/pci/virtio_pci.c optional virtio_pci dev/virtio/mmio/virtio_mmio.c optional virtio_mmio dev/virtio/mmio/virtio_mmio_if.m optional virtio_mmio dev/virtio/network/if_vtnet.c optional vtnet dev/virtio/block/virtio_blk.c optional virtio_blk dev/virtio/balloon/virtio_balloon.c optional virtio_balloon dev/virtio/scsi/virtio_scsi.c optional virtio_scsi dev/virtio/random/virtio_random.c optional virtio_random dev/virtio/console/virtio_console.c optional virtio_console dev/vkbd/vkbd.c optional vkbd dev/vr/if_vr.c optional vr pci dev/vt/colors/vt_termcolors.c optional vt dev/vt/font/vt_font_default.c optional vt dev/vt/font/vt_mouse_cursor.c optional vt dev/vt/hw/efifb/efifb.c optional vt_efifb dev/vt/hw/fb/vt_fb.c optional vt dev/vt/hw/vga/vt_vga.c optional vt vt_vga dev/vt/logo/logo_freebsd.c optional vt splash dev/vt/logo/logo_beastie.c optional vt splash dev/vt/vt_buf.c optional vt dev/vt/vt_consolectl.c optional vt dev/vt/vt_core.c optional vt dev/vt/vt_cpulogos.c optional vt splash dev/vt/vt_font.c optional vt dev/vt/vt_sysmouse.c optional vt dev/vte/if_vte.c optional vte pci dev/vx/if_vx.c optional vx dev/vx/if_vx_eisa.c optional vx eisa dev/vx/if_vx_pci.c optional vx pci dev/vxge/vxge.c optional vxge dev/vxge/vxgehal/vxgehal-ifmsg.c optional vxge dev/vxge/vxgehal/vxgehal-mrpcim.c optional vxge dev/vxge/vxgehal/vxge-queue.c optional vxge dev/vxge/vxgehal/vxgehal-ring.c optional vxge dev/vxge/vxgehal/vxgehal-swapper.c optional vxge dev/vxge/vxgehal/vxgehal-mgmt.c optional vxge dev/vxge/vxgehal/vxgehal-srpcim.c optional vxge dev/vxge/vxgehal/vxgehal-config.c optional vxge dev/vxge/vxgehal/vxgehal-blockpool.c optional vxge dev/vxge/vxgehal/vxgehal-doorbells.c optional vxge dev/vxge/vxgehal/vxgehal-mgmtaux.c optional vxge dev/vxge/vxgehal/vxgehal-device.c optional vxge dev/vxge/vxgehal/vxgehal-mm.c optional vxge dev/vxge/vxgehal/vxgehal-driver.c optional vxge dev/vxge/vxgehal/vxgehal-virtualpath.c optional vxge dev/vxge/vxgehal/vxgehal-channel.c optional vxge dev/vxge/vxgehal/vxgehal-fifo.c optional vxge dev/watchdog/watchdog.c standard dev/wb/if_wb.c optional wb pci dev/wds/wd7000.c optional wds isa dev/wi/if_wi.c optional wi dev/wi/if_wi_pccard.c optional wi pccard dev/wi/if_wi_pci.c optional wi pci dev/wl/if_wl.c optional wl isa dev/wpi/if_wpi.c optional wpi pci wpifw.c optional wpifw \ compile-with "${AWK} -f $S/tools/fw_stub.awk wpi.fw:wpifw:153229 -mwpi -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "wpifw.c" wpifw.fwo optional wpifw \ dependency "wpi.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "wpifw.fwo" wpi.fw optional wpifw \ dependency "$S/contrib/dev/wpi/iwlwifi-3945-15.32.2.9.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "wpi.fw" dev/xe/if_xe.c optional xe dev/xe/if_xe_pccard.c optional xe pccard dev/xen/balloon/balloon.c optional xenhvm dev/xen/blkfront/blkfront.c optional xenhvm dev/xen/blkback/blkback.c optional xenhvm dev/xen/console/xen_console.c optional xenhvm dev/xen/control/control.c optional xenhvm dev/xen/grant_table/grant_table.c optional xenhvm dev/xen/netback/netback.c optional xenhvm dev/xen/netfront/netfront.c optional xenhvm dev/xen/xenpci/xenpci.c optional xenpci dev/xen/timer/timer.c optional xenhvm dev/xen/pvcpu/pvcpu.c optional xenhvm dev/xen/xenstore/xenstore.c optional xenhvm dev/xen/xenstore/xenstore_dev.c optional xenhvm dev/xen/xenstore/xenstored_dev.c optional xenhvm dev/xen/evtchn/evtchn_dev.c optional xenhvm dev/xen/privcmd/privcmd.c optional xenhvm dev/xen/debug/debug.c optional xenhvm dev/xl/if_xl.c optional xl pci dev/xl/xlphy.c optional xl pci fs/autofs/autofs.c optional autofs fs/autofs/autofs_vfsops.c optional autofs fs/autofs/autofs_vnops.c optional autofs fs/deadfs/dead_vnops.c standard fs/devfs/devfs_devs.c standard fs/devfs/devfs_dir.c standard fs/devfs/devfs_rule.c standard fs/devfs/devfs_vfsops.c standard fs/devfs/devfs_vnops.c standard fs/fdescfs/fdesc_vfsops.c optional fdescfs fs/fdescfs/fdesc_vnops.c optional fdescfs fs/fifofs/fifo_vnops.c standard fs/cuse/cuse.c optional cuse fs/fuse/fuse_device.c optional fuse fs/fuse/fuse_file.c optional fuse fs/fuse/fuse_internal.c optional fuse fs/fuse/fuse_io.c optional fuse fs/fuse/fuse_ipc.c optional fuse fs/fuse/fuse_main.c optional fuse fs/fuse/fuse_node.c optional fuse fs/fuse/fuse_vfsops.c optional fuse fs/fuse/fuse_vnops.c optional fuse fs/msdosfs/msdosfs_conv.c optional msdosfs fs/msdosfs/msdosfs_denode.c optional msdosfs fs/msdosfs/msdosfs_fat.c optional msdosfs fs/msdosfs/msdosfs_fileno.c optional msdosfs fs/msdosfs/msdosfs_iconv.c optional msdosfs_iconv fs/msdosfs/msdosfs_lookup.c optional msdosfs fs/msdosfs/msdosfs_vfsops.c optional msdosfs fs/msdosfs/msdosfs_vnops.c optional msdosfs fs/nandfs/bmap.c optional nandfs fs/nandfs/nandfs_alloc.c optional nandfs fs/nandfs/nandfs_bmap.c optional nandfs fs/nandfs/nandfs_buffer.c optional nandfs fs/nandfs/nandfs_cleaner.c optional nandfs fs/nandfs/nandfs_cpfile.c optional nandfs fs/nandfs/nandfs_dat.c optional nandfs fs/nandfs/nandfs_dir.c optional nandfs fs/nandfs/nandfs_ifile.c optional nandfs fs/nandfs/nandfs_segment.c optional nandfs fs/nandfs/nandfs_subr.c optional nandfs fs/nandfs/nandfs_sufile.c optional nandfs fs/nandfs/nandfs_vfsops.c optional nandfs fs/nandfs/nandfs_vnops.c optional nandfs fs/nfs/nfs_commonkrpc.c optional nfscl | nfsd fs/nfs/nfs_commonsubs.c optional nfscl | nfsd fs/nfs/nfs_commonport.c optional nfscl | nfsd fs/nfs/nfs_commonacl.c optional nfscl | nfsd fs/nfsclient/nfs_clcomsubs.c optional nfscl fs/nfsclient/nfs_clsubs.c optional nfscl fs/nfsclient/nfs_clstate.c optional nfscl fs/nfsclient/nfs_clkrpc.c optional nfscl fs/nfsclient/nfs_clrpcops.c optional nfscl fs/nfsclient/nfs_clvnops.c optional nfscl fs/nfsclient/nfs_clnode.c optional nfscl fs/nfsclient/nfs_clvfsops.c optional nfscl fs/nfsclient/nfs_clport.c optional nfscl fs/nfsclient/nfs_clbio.c optional nfscl fs/nfsclient/nfs_clnfsiod.c optional nfscl fs/nfsserver/nfs_fha_new.c optional nfsd inet fs/nfsserver/nfs_nfsdsocket.c optional nfsd inet fs/nfsserver/nfs_nfsdsubs.c optional nfsd inet fs/nfsserver/nfs_nfsdstate.c optional nfsd inet fs/nfsserver/nfs_nfsdkrpc.c optional nfsd inet fs/nfsserver/nfs_nfsdserv.c optional nfsd inet fs/nfsserver/nfs_nfsdport.c optional nfsd inet fs/nfsserver/nfs_nfsdcache.c optional nfsd inet fs/nullfs/null_subr.c optional nullfs fs/nullfs/null_vfsops.c optional nullfs fs/nullfs/null_vnops.c optional nullfs fs/procfs/procfs.c optional procfs fs/procfs/procfs_ctl.c optional procfs fs/procfs/procfs_dbregs.c optional procfs fs/procfs/procfs_fpregs.c optional procfs fs/procfs/procfs_ioctl.c optional procfs fs/procfs/procfs_map.c optional procfs fs/procfs/procfs_mem.c optional procfs fs/procfs/procfs_note.c optional procfs fs/procfs/procfs_osrel.c optional procfs fs/procfs/procfs_regs.c optional procfs fs/procfs/procfs_rlimit.c optional procfs fs/procfs/procfs_status.c optional procfs fs/procfs/procfs_type.c optional procfs fs/pseudofs/pseudofs.c optional pseudofs fs/pseudofs/pseudofs_fileno.c optional pseudofs fs/pseudofs/pseudofs_vncache.c optional pseudofs fs/pseudofs/pseudofs_vnops.c optional pseudofs fs/smbfs/smbfs_io.c optional smbfs fs/smbfs/smbfs_node.c optional smbfs fs/smbfs/smbfs_smb.c optional smbfs fs/smbfs/smbfs_subr.c optional smbfs fs/smbfs/smbfs_vfsops.c optional smbfs fs/smbfs/smbfs_vnops.c optional smbfs fs/udf/osta.c optional udf fs/udf/udf_iconv.c optional udf_iconv fs/udf/udf_vfsops.c optional udf fs/udf/udf_vnops.c optional udf fs/unionfs/union_subr.c optional unionfs fs/unionfs/union_vfsops.c optional unionfs fs/unionfs/union_vnops.c optional unionfs fs/tmpfs/tmpfs_vnops.c optional tmpfs fs/tmpfs/tmpfs_fifoops.c optional tmpfs fs/tmpfs/tmpfs_vfsops.c optional tmpfs fs/tmpfs/tmpfs_subr.c optional tmpfs gdb/gdb_cons.c optional gdb gdb/gdb_main.c optional gdb gdb/gdb_packet.c optional gdb geom/bde/g_bde.c optional geom_bde geom/bde/g_bde_crypt.c optional geom_bde geom/bde/g_bde_lock.c optional geom_bde geom/bde/g_bde_work.c optional geom_bde geom/cache/g_cache.c optional geom_cache geom/concat/g_concat.c optional geom_concat geom/eli/g_eli.c optional geom_eli geom/eli/g_eli_crypto.c optional geom_eli geom/eli/g_eli_ctl.c optional geom_eli geom/eli/g_eli_hmac.c optional geom_eli geom/eli/g_eli_integrity.c optional geom_eli geom/eli/g_eli_key.c optional geom_eli geom/eli/g_eli_key_cache.c optional geom_eli geom/eli/g_eli_privacy.c optional geom_eli geom/eli/pkcs5v2.c optional geom_eli geom/gate/g_gate.c optional geom_gate geom/geom_aes.c optional geom_aes geom/geom_bsd.c optional geom_bsd geom/geom_bsd_enc.c optional geom_bsd | geom_part_bsd geom/geom_ccd.c optional ccd | geom_ccd geom/geom_ctl.c standard geom/geom_dev.c standard geom/geom_disk.c standard geom/geom_dump.c standard geom/geom_event.c standard geom/geom_fox.c optional geom_fox geom/geom_flashmap.c optional fdt cfi | fdt nand | fdt mx25l geom/geom_io.c standard geom/geom_kern.c standard geom/geom_map.c optional geom_map geom/geom_mbr.c optional geom_mbr geom/geom_mbr_enc.c optional geom_mbr geom/geom_pc98.c optional geom_pc98 geom/geom_pc98_enc.c optional geom_pc98 geom/geom_redboot.c optional geom_redboot geom/geom_slice.c standard geom/geom_subr.c standard geom/geom_sunlabel.c optional geom_sunlabel geom/geom_sunlabel_enc.c optional geom_sunlabel geom/geom_vfs.c standard geom/geom_vol_ffs.c optional geom_vol geom/journal/g_journal.c optional geom_journal geom/journal/g_journal_ufs.c optional geom_journal geom/label/g_label.c optional geom_label | geom_label_gpt geom/label/g_label_ext2fs.c optional geom_label geom/label/g_label_iso9660.c optional geom_label geom/label/g_label_msdosfs.c optional geom_label geom/label/g_label_ntfs.c optional geom_label geom/label/g_label_reiserfs.c optional geom_label geom/label/g_label_ufs.c optional geom_label geom/label/g_label_gpt.c optional geom_label | geom_label_gpt geom/label/g_label_disk_ident.c optional geom_label geom/linux_lvm/g_linux_lvm.c optional geom_linux_lvm geom/mirror/g_mirror.c optional geom_mirror geom/mirror/g_mirror_ctl.c optional geom_mirror geom/mountver/g_mountver.c optional geom_mountver geom/multipath/g_multipath.c optional geom_multipath geom/nop/g_nop.c optional geom_nop geom/part/g_part.c standard geom/part/g_part_if.m standard geom/part/g_part_apm.c optional geom_part_apm geom/part/g_part_bsd.c optional geom_part_bsd geom/part/g_part_bsd64.c optional geom_part_bsd64 geom/part/g_part_ebr.c optional geom_part_ebr geom/part/g_part_gpt.c optional geom_part_gpt geom/part/g_part_ldm.c optional geom_part_ldm geom/part/g_part_mbr.c optional geom_part_mbr geom/part/g_part_pc98.c optional geom_part_pc98 geom/part/g_part_vtoc8.c optional geom_part_vtoc8 geom/raid/g_raid.c optional geom_raid geom/raid/g_raid_ctl.c optional geom_raid geom/raid/g_raid_md_if.m optional geom_raid geom/raid/g_raid_tr_if.m optional geom_raid geom/raid/md_ddf.c optional geom_raid geom/raid/md_intel.c optional geom_raid geom/raid/md_jmicron.c optional geom_raid geom/raid/md_nvidia.c optional geom_raid geom/raid/md_promise.c optional geom_raid geom/raid/md_sii.c optional geom_raid geom/raid/tr_concat.c optional geom_raid geom/raid/tr_raid0.c optional geom_raid geom/raid/tr_raid1.c optional geom_raid geom/raid/tr_raid1e.c optional geom_raid geom/raid/tr_raid5.c optional geom_raid geom/raid3/g_raid3.c optional geom_raid3 geom/raid3/g_raid3_ctl.c optional geom_raid3 geom/shsec/g_shsec.c optional geom_shsec geom/stripe/g_stripe.c optional geom_stripe contrib/xz-embedded/freebsd/xz_malloc.c \ optional xz_embedded | geom_uzip \ compile-with "${NORMAL_C} -I$S/contrib/xz-embedded/freebsd/ -I$S/contrib/xz-embedded/linux/lib/xz/ -I$S/contrib/xz-embedded/linux/include/linux/" contrib/xz-embedded/linux/lib/xz/xz_crc32.c \ optional xz_embedded | geom_uzip \ compile-with "${NORMAL_C} -I$S/contrib/xz-embedded/freebsd/ -I$S/contrib/xz-embedded/linux/lib/xz/ -I$S/contrib/xz-embedded/linux/include/linux/" contrib/xz-embedded/linux/lib/xz/xz_dec_bcj.c \ optional xz_embedded | geom_uzip \ compile-with "${NORMAL_C} -I$S/contrib/xz-embedded/freebsd/ -I$S/contrib/xz-embedded/linux/lib/xz/ -I$S/contrib/xz-embedded/linux/include/linux/" contrib/xz-embedded/linux/lib/xz/xz_dec_lzma2.c \ optional xz_embedded | geom_uzip \ compile-with "${NORMAL_C} -I$S/contrib/xz-embedded/freebsd/ -I$S/contrib/xz-embedded/linux/lib/xz/ -I$S/contrib/xz-embedded/linux/include/linux/" contrib/xz-embedded/linux/lib/xz/xz_dec_stream.c \ optional xz_embedded | geom_uzip \ compile-with "${NORMAL_C} -I$S/contrib/xz-embedded/freebsd/ -I$S/contrib/xz-embedded/linux/lib/xz/ -I$S/contrib/xz-embedded/linux/include/linux/" geom/uzip/g_uzip.c optional geom_uzip geom/uzip/g_uzip_lzma.c optional geom_uzip geom/uzip/g_uzip_wrkthr.c optional geom_uzip geom/uzip/g_uzip_zlib.c optional geom_uzip geom/vinum/geom_vinum.c optional geom_vinum geom/vinum/geom_vinum_create.c optional geom_vinum geom/vinum/geom_vinum_drive.c optional geom_vinum geom/vinum/geom_vinum_plex.c optional geom_vinum geom/vinum/geom_vinum_volume.c optional geom_vinum geom/vinum/geom_vinum_subr.c optional geom_vinum geom/vinum/geom_vinum_raid5.c optional geom_vinum geom/vinum/geom_vinum_share.c optional geom_vinum geom/vinum/geom_vinum_list.c optional geom_vinum geom/vinum/geom_vinum_rm.c optional geom_vinum geom/vinum/geom_vinum_init.c optional geom_vinum geom/vinum/geom_vinum_state.c optional geom_vinum geom/vinum/geom_vinum_rename.c optional geom_vinum geom/vinum/geom_vinum_move.c optional geom_vinum geom/vinum/geom_vinum_events.c optional geom_vinum geom/virstor/binstream.c optional geom_virstor geom/virstor/g_virstor.c optional geom_virstor geom/virstor/g_virstor_md.c optional geom_virstor geom/zero/g_zero.c optional geom_zero fs/ext2fs/ext2_alloc.c optional ext2fs fs/ext2fs/ext2_balloc.c optional ext2fs fs/ext2fs/ext2_bmap.c optional ext2fs fs/ext2fs/ext2_extents.c optional ext2fs fs/ext2fs/ext2_inode.c optional ext2fs fs/ext2fs/ext2_inode_cnv.c optional ext2fs fs/ext2fs/ext2_hash.c optional ext2fs fs/ext2fs/ext2_htree.c optional ext2fs fs/ext2fs/ext2_lookup.c optional ext2fs fs/ext2fs/ext2_subr.c optional ext2fs fs/ext2fs/ext2_vfsops.c optional ext2fs fs/ext2fs/ext2_vnops.c optional ext2fs # isa/isa_if.m standard isa/isa_common.c optional isa isa/isahint.c optional isa isa/pnp.c optional isa isapnp isa/pnpparse.c optional isa isapnp fs/cd9660/cd9660_bmap.c optional cd9660 fs/cd9660/cd9660_lookup.c optional cd9660 fs/cd9660/cd9660_node.c optional cd9660 fs/cd9660/cd9660_rrip.c optional cd9660 fs/cd9660/cd9660_util.c optional cd9660 fs/cd9660/cd9660_vfsops.c optional cd9660 fs/cd9660/cd9660_vnops.c optional cd9660 fs/cd9660/cd9660_iconv.c optional cd9660_iconv kern/bus_if.m standard kern/clock_if.m standard kern/cpufreq_if.m standard kern/device_if.m standard kern/imgact_binmisc.c optional imagact_binmisc kern/imgact_elf.c standard kern/imgact_elf32.c optional compat_freebsd32 kern/imgact_shell.c standard kern/inflate.c optional gzip kern/init_main.c standard kern/init_sysent.c standard kern/ksched.c optional _kposix_priority_scheduling kern/kern_acct.c standard kern/kern_alq.c optional alq kern/kern_clock.c standard kern/kern_condvar.c standard kern/kern_conf.c standard kern/kern_cons.c standard kern/kern_cpu.c standard kern/kern_cpuset.c standard kern/kern_context.c standard kern/kern_descrip.c standard kern/kern_dtrace.c optional kdtrace_hooks kern/kern_dump.c standard kern/kern_environment.c standard kern/kern_et.c standard kern/kern_event.c standard kern/kern_exec.c standard kern/kern_exit.c standard kern/kern_fail.c standard kern/kern_ffclock.c standard kern/kern_fork.c standard kern/kern_gzio.c optional gzio kern/kern_hhook.c standard kern/kern_idle.c standard kern/kern_intr.c standard kern/kern_jail.c standard kern/kern_khelp.c standard kern/kern_kthread.c standard kern/kern_ktr.c optional ktr kern/kern_ktrace.c standard kern/kern_linker.c standard kern/kern_lock.c standard kern/kern_lockf.c standard kern/kern_lockstat.c optional kdtrace_hooks kern/kern_loginclass.c standard kern/kern_malloc.c standard kern/kern_mbuf.c standard kern/kern_mib.c standard kern/kern_module.c standard kern/kern_mtxpool.c standard kern/kern_mutex.c standard kern/kern_ntptime.c standard kern/kern_numa.c standard kern/kern_osd.c standard kern/kern_physio.c standard kern/kern_pmc.c standard kern/kern_poll.c optional device_polling kern/kern_priv.c standard kern/kern_proc.c standard kern/kern_procctl.c standard kern/kern_prot.c standard kern/kern_racct.c standard kern/kern_rangelock.c standard kern/kern_rctl.c standard kern/kern_resource.c standard kern/kern_rmlock.c standard kern/kern_rwlock.c standard kern/kern_sdt.c optional kdtrace_hooks kern/kern_sema.c standard kern/kern_sendfile.c standard kern/kern_sharedpage.c standard kern/kern_shutdown.c standard kern/kern_sig.c standard kern/kern_switch.c standard kern/kern_sx.c standard kern/kern_synch.c standard kern/kern_syscalls.c standard kern/kern_sysctl.c standard kern/kern_tc.c standard kern/kern_thr.c standard kern/kern_thread.c standard kern/kern_time.c standard kern/kern_timeout.c standard kern/kern_umtx.c standard kern/kern_uuid.c standard kern/kern_xxx.c standard kern/link_elf.c standard kern/linker_if.m standard kern/md4c.c optional netsmb kern/md5c.c standard kern/p1003_1b.c standard kern/posix4_mib.c standard kern/sched_4bsd.c optional sched_4bsd kern/sched_ule.c optional sched_ule kern/serdev_if.m standard kern/stack_protector.c standard \ compile-with "${NORMAL_C:N-fstack-protector*}" kern/subr_acl_nfs4.c optional ufs_acl | zfs kern/subr_acl_posix1e.c optional ufs_acl kern/subr_autoconf.c standard kern/subr_blist.c standard kern/subr_bus.c standard kern/subr_bus_dma.c standard kern/subr_bufring.c standard kern/subr_capability.c standard kern/subr_clock.c standard kern/subr_counter.c standard kern/subr_devstat.c standard kern/subr_disk.c standard kern/subr_eventhandler.c standard kern/subr_fattime.c standard kern/subr_firmware.c optional firmware kern/subr_hash.c standard kern/subr_hints.c standard kern/subr_kdb.c standard kern/subr_kobj.c standard kern/subr_lock.c standard kern/subr_log.c standard kern/subr_mbpool.c optional libmbpool kern/subr_mchain.c optional libmchain kern/subr_module.c standard kern/subr_msgbuf.c standard kern/subr_param.c standard kern/subr_pcpu.c standard kern/subr_pctrie.c standard kern/subr_power.c standard kern/subr_prf.c standard kern/subr_prof.c standard kern/subr_rman.c standard kern/subr_rtc.c standard kern/subr_sbuf.c standard kern/subr_scanf.c standard kern/subr_sglist.c standard kern/subr_sleepqueue.c standard kern/subr_smp.c standard kern/subr_stack.c optional ddb | stack | ktr kern/subr_taskqueue.c standard kern/subr_terminal.c optional vt kern/subr_trap.c standard kern/subr_turnstile.c standard kern/subr_uio.c standard kern/subr_unit.c standard kern/subr_vmem.c standard kern/subr_witness.c optional witness kern/sys_capability.c standard kern/sys_generic.c standard kern/sys_pipe.c standard kern/sys_procdesc.c standard kern/sys_process.c standard kern/sys_socket.c standard kern/syscalls.c standard kern/sysv_ipc.c standard kern/sysv_msg.c optional sysvmsg kern/sysv_sem.c optional sysvsem kern/sysv_shm.c optional sysvshm kern/tty.c standard kern/tty_compat.c optional compat_43tty kern/tty_info.c standard kern/tty_inq.c standard kern/tty_outq.c standard kern/tty_pts.c standard kern/tty_tty.c standard kern/tty_ttydisc.c standard kern/uipc_accf.c standard kern/uipc_debug.c optional ddb kern/uipc_domain.c standard kern/uipc_mbuf.c standard kern/uipc_mbuf2.c standard kern/uipc_mbufhash.c standard kern/uipc_mqueue.c optional p1003_1b_mqueue kern/uipc_sem.c optional p1003_1b_semaphores kern/uipc_shm.c standard kern/uipc_sockbuf.c standard kern/uipc_socket.c standard kern/uipc_syscalls.c standard kern/uipc_usrreq.c standard kern/vfs_acl.c standard kern/vfs_aio.c standard kern/vfs_bio.c standard kern/vfs_cache.c standard kern/vfs_cluster.c standard kern/vfs_default.c standard kern/vfs_export.c standard kern/vfs_extattr.c standard kern/vfs_hash.c standard kern/vfs_init.c standard kern/vfs_lookup.c standard kern/vfs_mount.c standard kern/vfs_mountroot.c standard kern/vfs_subr.c standard kern/vfs_syscalls.c standard kern/vfs_vnops.c standard # # Kernel GSS-API # gssd.h optional kgssapi \ dependency "$S/kgssapi/gssd.x" \ compile-with "RPCGEN_CPP='${CPP}' rpcgen -hM $S/kgssapi/gssd.x | grep -v pthread.h > gssd.h" \ no-obj no-implicit-rule before-depend local \ clean "gssd.h" gssd_xdr.c optional kgssapi \ dependency "$S/kgssapi/gssd.x gssd.h" \ compile-with "RPCGEN_CPP='${CPP}' rpcgen -c $S/kgssapi/gssd.x -o gssd_xdr.c" \ no-implicit-rule before-depend local \ clean "gssd_xdr.c" gssd_clnt.c optional kgssapi \ dependency "$S/kgssapi/gssd.x gssd.h" \ compile-with "RPCGEN_CPP='${CPP}' rpcgen -lM $S/kgssapi/gssd.x | grep -v string.h > gssd_clnt.c" \ no-implicit-rule before-depend local \ clean "gssd_clnt.c" kgssapi/gss_accept_sec_context.c optional kgssapi kgssapi/gss_add_oid_set_member.c optional kgssapi kgssapi/gss_acquire_cred.c optional kgssapi kgssapi/gss_canonicalize_name.c optional kgssapi kgssapi/gss_create_empty_oid_set.c optional kgssapi kgssapi/gss_delete_sec_context.c optional kgssapi kgssapi/gss_display_status.c optional kgssapi kgssapi/gss_export_name.c optional kgssapi kgssapi/gss_get_mic.c optional kgssapi kgssapi/gss_init_sec_context.c optional kgssapi kgssapi/gss_impl.c optional kgssapi kgssapi/gss_import_name.c optional kgssapi kgssapi/gss_names.c optional kgssapi kgssapi/gss_pname_to_uid.c optional kgssapi kgssapi/gss_release_buffer.c optional kgssapi kgssapi/gss_release_cred.c optional kgssapi kgssapi/gss_release_name.c optional kgssapi kgssapi/gss_release_oid_set.c optional kgssapi kgssapi/gss_set_cred_option.c optional kgssapi kgssapi/gss_test_oid_set_member.c optional kgssapi kgssapi/gss_unwrap.c optional kgssapi kgssapi/gss_verify_mic.c optional kgssapi kgssapi/gss_wrap.c optional kgssapi kgssapi/gss_wrap_size_limit.c optional kgssapi kgssapi/gssd_prot.c optional kgssapi kgssapi/krb5/krb5_mech.c optional kgssapi kgssapi/krb5/kcrypto.c optional kgssapi kgssapi/krb5/kcrypto_aes.c optional kgssapi kgssapi/krb5/kcrypto_arcfour.c optional kgssapi kgssapi/krb5/kcrypto_des.c optional kgssapi kgssapi/krb5/kcrypto_des3.c optional kgssapi kgssapi/kgss_if.m optional kgssapi kgssapi/gsstest.c optional kgssapi_debug # These files in libkern/ are those needed by all architectures. Some # of the files in libkern/ are only needed on some architectures, e.g., # libkern/divdi3.c is needed by i386 but not alpha. Also, some of these # routines may be optimized for a particular platform. In either case, # the file should be moved to conf/files. from here. # libkern/arc4random.c standard libkern/asprintf.c standard libkern/bcd.c standard libkern/bsearch.c standard libkern/crc32.c standard libkern/explicit_bzero.c standard libkern/fnmatch.c standard libkern/iconv.c optional libiconv libkern/iconv_converter_if.m optional libiconv libkern/iconv_ucs.c optional libiconv libkern/iconv_xlat.c optional libiconv libkern/iconv_xlat16.c optional libiconv libkern/inet_aton.c standard libkern/inet_ntoa.c standard libkern/inet_ntop.c standard libkern/inet_pton.c standard libkern/jenkins_hash.c standard libkern/murmur3_32.c standard libkern/mcount.c optional profiling-routine libkern/memcchr.c standard libkern/memchr.c standard libkern/memcmp.c standard libkern/memmem.c optional gdb libkern/qsort.c standard libkern/qsort_r.c standard libkern/random.c standard libkern/scanc.c standard libkern/strcasecmp.c standard libkern/strcat.c standard libkern/strchr.c standard libkern/strcmp.c standard libkern/strcpy.c standard libkern/strcspn.c standard libkern/strdup.c standard libkern/strndup.c standard libkern/strlcat.c standard libkern/strlcpy.c standard libkern/strlen.c standard libkern/strncat.c standard libkern/strncmp.c standard libkern/strncpy.c standard libkern/strnlen.c standard libkern/strrchr.c standard libkern/strsep.c standard libkern/strspn.c standard libkern/strstr.c standard libkern/strtol.c standard libkern/strtoq.c standard libkern/strtoul.c standard libkern/strtouq.c standard libkern/strvalid.c standard libkern/timingsafe_bcmp.c standard libkern/zlib.c optional crypto | geom_uzip | ipsec | \ mxge | netgraph_deflate | \ ddb_ctf | gzio net/altq/altq_cbq.c optional altq net/altq/altq_cdnr.c optional altq net/altq/altq_codel.c optional altq net/altq/altq_hfsc.c optional altq net/altq/altq_fairq.c optional altq net/altq/altq_priq.c optional altq net/altq/altq_red.c optional altq net/altq/altq_rio.c optional altq net/altq/altq_rmclass.c optional altq net/altq/altq_subr.c optional altq net/bpf.c standard net/bpf_buffer.c optional bpf net/bpf_jitter.c optional bpf_jitter net/bpf_filter.c optional bpf | netgraph_bpf net/bpf_zerocopy.c optional bpf net/bridgestp.c optional bridge | if_bridge net/flowtable.c optional flowtable inet | flowtable inet6 net/ieee8023ad_lacp.c optional lagg net/if.c standard net/if_arcsubr.c optional arcnet net/if_atmsubr.c optional atm net/if_bridge.c optional bridge inet | if_bridge inet net/if_clone.c standard net/if_dead.c standard net/if_debug.c optional ddb net/if_disc.c optional disc net/if_edsc.c optional edsc net/if_enc.c optional enc inet | enc inet6 net/if_epair.c optional epair net/if_ethersubr.c optional ether net/if_fddisubr.c optional fddi net/if_fwsubr.c optional fwip net/if_gif.c optional gif inet | gif inet6 | \ netgraph_gif inet | netgraph_gif inet6 net/if_gre.c optional gre inet | gre inet6 net/if_iso88025subr.c optional token net/if_lagg.c optional lagg net/if_loop.c optional loop net/if_llatbl.c standard net/if_me.c optional me inet net/if_media.c standard net/if_mib.c standard net/if_spppfr.c optional sppp | netgraph_sppp net/if_spppsubr.c optional sppp | netgraph_sppp net/if_stf.c optional stf inet inet6 net/if_tun.c optional tun net/if_tap.c optional tap net/if_vlan.c optional vlan net/if_vxlan.c optional vxlan inet | vxlan inet6 net/ifdi_if.m optional ether pci net/iflib.c optional ether pci net/mp_ring.c optional ether net/mppcc.c optional netgraph_mppc_compression net/mppcd.c optional netgraph_mppc_compression net/netisr.c standard net/pfil.c optional ether | inet net/radix.c standard net/radix_mpath.c standard net/raw_cb.c standard net/raw_usrreq.c standard net/route.c standard net/rss_config.c optional inet rss | inet6 rss net/rtsock.c standard net/slcompress.c optional netgraph_vjc | sppp | \ netgraph_sppp net/toeplitz.c optional inet rss | inet6 rss net/vnet.c optional vimage net80211/ieee80211.c optional wlan net80211/ieee80211_acl.c optional wlan wlan_acl net80211/ieee80211_action.c optional wlan net80211/ieee80211_ageq.c optional wlan net80211/ieee80211_adhoc.c optional wlan \ compile-with "${NORMAL_C} -Wno-unused-function" net80211/ieee80211_ageq.c optional wlan net80211/ieee80211_amrr.c optional wlan | wlan_amrr net80211/ieee80211_crypto.c optional wlan \ compile-with "${NORMAL_C} -Wno-unused-function" net80211/ieee80211_crypto_ccmp.c optional wlan wlan_ccmp net80211/ieee80211_crypto_none.c optional wlan net80211/ieee80211_crypto_tkip.c optional wlan wlan_tkip net80211/ieee80211_crypto_wep.c optional wlan wlan_wep net80211/ieee80211_ddb.c optional wlan ddb net80211/ieee80211_dfs.c optional wlan net80211/ieee80211_freebsd.c optional wlan net80211/ieee80211_hostap.c optional wlan \ compile-with "${NORMAL_C} -Wno-unused-function" net80211/ieee80211_ht.c optional wlan net80211/ieee80211_hwmp.c optional wlan ieee80211_support_mesh net80211/ieee80211_input.c optional wlan net80211/ieee80211_ioctl.c optional wlan net80211/ieee80211_mesh.c optional wlan ieee80211_support_mesh \ compile-with "${NORMAL_C} -Wno-unused-function" net80211/ieee80211_monitor.c optional wlan net80211/ieee80211_node.c optional wlan net80211/ieee80211_output.c optional wlan net80211/ieee80211_phy.c optional wlan net80211/ieee80211_power.c optional wlan net80211/ieee80211_proto.c optional wlan net80211/ieee80211_radiotap.c optional wlan net80211/ieee80211_ratectl.c optional wlan net80211/ieee80211_ratectl_none.c optional wlan net80211/ieee80211_regdomain.c optional wlan net80211/ieee80211_rssadapt.c optional wlan wlan_rssadapt net80211/ieee80211_scan.c optional wlan net80211/ieee80211_scan_sta.c optional wlan net80211/ieee80211_sta.c optional wlan \ compile-with "${NORMAL_C} -Wno-unused-function" net80211/ieee80211_superg.c optional wlan ieee80211_support_superg net80211/ieee80211_scan_sw.c optional wlan net80211/ieee80211_tdma.c optional wlan ieee80211_support_tdma net80211/ieee80211_wds.c optional wlan net80211/ieee80211_xauth.c optional wlan wlan_xauth net80211/ieee80211_alq.c optional wlan ieee80211_alq netgraph/atm/ccatm/ng_ccatm.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" netgraph/atm/ng_atm.c optional ngatm_atm netgraph/atm/ngatmbase.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" netgraph/atm/sscfu/ng_sscfu.c optional ngatm_sscfu \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" netgraph/atm/sscop/ng_sscop.c optional ngatm_sscop \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" netgraph/atm/uni/ng_uni.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" netgraph/bluetooth/common/ng_bluetooth.c optional netgraph_bluetooth netgraph/bluetooth/drivers/bt3c/ng_bt3c_pccard.c optional netgraph_bluetooth_bt3c netgraph/bluetooth/drivers/h4/ng_h4.c optional netgraph_bluetooth_h4 netgraph/bluetooth/drivers/ubt/ng_ubt.c optional netgraph_bluetooth_ubt usb netgraph/bluetooth/drivers/ubtbcmfw/ubtbcmfw.c optional netgraph_bluetooth_ubtbcmfw usb netgraph/bluetooth/hci/ng_hci_cmds.c optional netgraph_bluetooth_hci netgraph/bluetooth/hci/ng_hci_evnt.c optional netgraph_bluetooth_hci netgraph/bluetooth/hci/ng_hci_main.c optional netgraph_bluetooth_hci netgraph/bluetooth/hci/ng_hci_misc.c optional netgraph_bluetooth_hci netgraph/bluetooth/hci/ng_hci_ulpi.c optional netgraph_bluetooth_hci netgraph/bluetooth/l2cap/ng_l2cap_cmds.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/l2cap/ng_l2cap_evnt.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/l2cap/ng_l2cap_llpi.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/l2cap/ng_l2cap_main.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/l2cap/ng_l2cap_misc.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/l2cap/ng_l2cap_ulpi.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/socket/ng_btsocket.c optional netgraph_bluetooth_socket netgraph/bluetooth/socket/ng_btsocket_hci_raw.c optional netgraph_bluetooth_socket netgraph/bluetooth/socket/ng_btsocket_l2cap.c optional netgraph_bluetooth_socket netgraph/bluetooth/socket/ng_btsocket_l2cap_raw.c optional netgraph_bluetooth_socket netgraph/bluetooth/socket/ng_btsocket_rfcomm.c optional netgraph_bluetooth_socket netgraph/bluetooth/socket/ng_btsocket_sco.c optional netgraph_bluetooth_socket netgraph/netflow/netflow.c optional netgraph_netflow netgraph/netflow/netflow_v9.c optional netgraph_netflow netgraph/netflow/ng_netflow.c optional netgraph_netflow netgraph/ng_UI.c optional netgraph_UI netgraph/ng_async.c optional netgraph_async netgraph/ng_atmllc.c optional netgraph_atmllc netgraph/ng_base.c optional netgraph netgraph/ng_bpf.c optional netgraph_bpf netgraph/ng_bridge.c optional netgraph_bridge netgraph/ng_car.c optional netgraph_car netgraph/ng_cisco.c optional netgraph_cisco netgraph/ng_deflate.c optional netgraph_deflate netgraph/ng_device.c optional netgraph_device netgraph/ng_echo.c optional netgraph_echo netgraph/ng_eiface.c optional netgraph_eiface netgraph/ng_ether.c optional netgraph_ether netgraph/ng_ether_echo.c optional netgraph_ether_echo netgraph/ng_frame_relay.c optional netgraph_frame_relay netgraph/ng_gif.c optional netgraph_gif inet6 | netgraph_gif inet netgraph/ng_gif_demux.c optional netgraph_gif_demux netgraph/ng_hole.c optional netgraph_hole netgraph/ng_iface.c optional netgraph_iface netgraph/ng_ip_input.c optional netgraph_ip_input netgraph/ng_ipfw.c optional netgraph_ipfw inet ipfirewall netgraph/ng_ksocket.c optional netgraph_ksocket netgraph/ng_l2tp.c optional netgraph_l2tp netgraph/ng_lmi.c optional netgraph_lmi netgraph/ng_mppc.c optional netgraph_mppc_compression | \ netgraph_mppc_encryption netgraph/ng_nat.c optional netgraph_nat inet libalias netgraph/ng_one2many.c optional netgraph_one2many netgraph/ng_parse.c optional netgraph netgraph/ng_patch.c optional netgraph_patch netgraph/ng_pipe.c optional netgraph_pipe netgraph/ng_ppp.c optional netgraph_ppp netgraph/ng_pppoe.c optional netgraph_pppoe netgraph/ng_pptpgre.c optional netgraph_pptpgre netgraph/ng_pred1.c optional netgraph_pred1 netgraph/ng_rfc1490.c optional netgraph_rfc1490 netgraph/ng_socket.c optional netgraph_socket netgraph/ng_split.c optional netgraph_split netgraph/ng_sppp.c optional netgraph_sppp netgraph/ng_tag.c optional netgraph_tag netgraph/ng_tcpmss.c optional netgraph_tcpmss netgraph/ng_tee.c optional netgraph_tee netgraph/ng_tty.c optional netgraph_tty netgraph/ng_vjc.c optional netgraph_vjc netgraph/ng_vlan.c optional netgraph_vlan netinet/accf_data.c optional accept_filter_data inet netinet/accf_dns.c optional accept_filter_dns inet netinet/accf_http.c optional accept_filter_http inet netinet/if_atm.c optional atm netinet/if_ether.c optional inet ether netinet/igmp.c optional inet netinet/in.c optional inet netinet/in_debug.c optional inet ddb netinet/in_kdtrace.c optional inet | inet6 netinet/ip_carp.c optional inet carp | inet6 carp netinet/in_fib.c optional inet netinet/in_gif.c optional gif inet | netgraph_gif inet netinet/ip_gre.c optional gre inet netinet/ip_id.c optional inet netinet/in_mcast.c optional inet netinet/in_pcb.c optional inet | inet6 netinet/in_pcbgroup.c optional inet pcbgroup | inet6 pcbgroup netinet/in_proto.c optional inet | inet6 netinet/in_rmx.c optional inet netinet/in_rss.c optional inet rss netinet/ip_divert.c optional inet ipdivert ipfirewall netinet/ip_ecn.c optional inet | inet6 netinet/ip_encap.c optional inet | inet6 netinet/ip_fastfwd.c optional inet netinet/ip_icmp.c optional inet | inet6 netinet/ip_input.c optional inet netinet/ip_ipsec.c optional inet ipsec netinet/ip_mroute.c optional mrouting inet netinet/ip_options.c optional inet netinet/ip_output.c optional inet netinet/ip_reass.c optional inet netinet/raw_ip.c optional inet | inet6 netinet/cc/cc.c optional inet | inet6 netinet/cc/cc_newreno.c optional inet | inet6 netinet/sctp_asconf.c optional inet sctp | inet6 sctp netinet/sctp_auth.c optional inet sctp | inet6 sctp netinet/sctp_bsd_addr.c optional inet sctp | inet6 sctp netinet/sctp_cc_functions.c optional inet sctp | inet6 sctp netinet/sctp_crc32.c optional inet sctp | inet6 sctp netinet/sctp_indata.c optional inet sctp | inet6 sctp netinet/sctp_input.c optional inet sctp | inet6 sctp netinet/sctp_output.c optional inet sctp | inet6 sctp netinet/sctp_pcb.c optional inet sctp | inet6 sctp netinet/sctp_peeloff.c optional inet sctp | inet6 sctp netinet/sctp_ss_functions.c optional inet sctp | inet6 sctp netinet/sctp_syscalls.c optional inet sctp | inet6 sctp netinet/sctp_sysctl.c optional inet sctp | inet6 sctp netinet/sctp_timer.c optional inet sctp | inet6 sctp netinet/sctp_usrreq.c optional inet sctp | inet6 sctp netinet/sctputil.c optional inet sctp | inet6 sctp netinet/siftr.c optional inet siftr alq | inet6 siftr alq netinet/tcp_debug.c optional tcpdebug netinet/tcp_fastopen.c optional inet tcp_rfc7413 | inet6 tcp_rfc7413 netinet/tcp_hostcache.c optional inet | inet6 netinet/tcp_input.c optional inet | inet6 netinet/tcp_lro.c optional inet | inet6 netinet/tcp_output.c optional inet | inet6 netinet/tcp_offload.c optional tcp_offload inet | tcp_offload inet6 netinet/tcp_pcap.c optional inet tcppcap | inet6 tcppcap netinet/tcp_reass.c optional inet | inet6 netinet/tcp_sack.c optional inet | inet6 netinet/tcp_subr.c optional inet | inet6 netinet/tcp_syncache.c optional inet | inet6 netinet/tcp_timer.c optional inet | inet6 netinet/tcp_timewait.c optional inet | inet6 netinet/tcp_usrreq.c optional inet | inet6 netinet/udp_usrreq.c optional inet | inet6 netinet/libalias/alias.c optional libalias inet | netgraph_nat inet netinet/libalias/alias_db.c optional libalias inet | netgraph_nat inet netinet/libalias/alias_mod.c optional libalias | netgraph_nat netinet/libalias/alias_proxy.c optional libalias inet | netgraph_nat inet netinet/libalias/alias_util.c optional libalias inet | netgraph_nat inet netinet/libalias/alias_sctp.c optional libalias inet | netgraph_nat inet netinet6/dest6.c optional inet6 netinet6/frag6.c optional inet6 netinet6/icmp6.c optional inet6 netinet6/in6.c optional inet6 netinet6/in6_cksum.c optional inet6 netinet6/in6_fib.c optional inet6 netinet6/in6_gif.c optional gif inet6 | netgraph_gif inet6 netinet6/in6_ifattach.c optional inet6 netinet6/in6_mcast.c optional inet6 netinet6/in6_pcb.c optional inet6 netinet6/in6_pcbgroup.c optional inet6 pcbgroup netinet6/in6_proto.c optional inet6 netinet6/in6_rmx.c optional inet6 netinet6/in6_rss.c optional inet6 rss netinet6/in6_src.c optional inet6 netinet6/ip6_forward.c optional inet6 netinet6/ip6_gre.c optional gre inet6 netinet6/ip6_id.c optional inet6 netinet6/ip6_input.c optional inet6 netinet6/ip6_mroute.c optional mrouting inet6 netinet6/ip6_output.c optional inet6 netinet6/ip6_ipsec.c optional inet6 ipsec netinet6/mld6.c optional inet6 netinet6/nd6.c optional inet6 netinet6/nd6_nbr.c optional inet6 netinet6/nd6_rtr.c optional inet6 netinet6/raw_ip6.c optional inet6 netinet6/route6.c optional inet6 netinet6/scope6.c optional inet6 netinet6/sctp6_usrreq.c optional inet6 sctp netinet6/udp6_usrreq.c optional inet6 netipsec/ipsec.c optional ipsec inet | ipsec inet6 netipsec/ipsec_input.c optional ipsec inet | ipsec inet6 netipsec/ipsec_mbuf.c optional ipsec inet | ipsec inet6 netipsec/ipsec_output.c optional ipsec inet | ipsec inet6 netipsec/key.c optional ipsec inet | ipsec inet6 netipsec/key_debug.c optional ipsec inet | ipsec inet6 netipsec/keysock.c optional ipsec inet | ipsec inet6 netipsec/xform_ah.c optional ipsec inet | ipsec inet6 netipsec/xform_esp.c optional ipsec inet | ipsec inet6 netipsec/xform_ipcomp.c optional ipsec inet | ipsec inet6 netipsec/xform_tcp.c optional ipsec inet tcp_signature | \ ipsec inet6 tcp_signature netnatm/natm.c optional natm netnatm/natm_pcb.c optional natm netnatm/natm_proto.c optional natm netpfil/ipfw/dn_aqm_codel.c optional inet dummynet netpfil/ipfw/dn_aqm_pie.c optional inet dummynet netpfil/ipfw/dn_heap.c optional inet dummynet netpfil/ipfw/dn_sched_fifo.c optional inet dummynet netpfil/ipfw/dn_sched_fq_codel.c optional inet dummynet netpfil/ipfw/dn_sched_fq_pie.c optional inet dummynet netpfil/ipfw/dn_sched_prio.c optional inet dummynet netpfil/ipfw/dn_sched_qfq.c optional inet dummynet netpfil/ipfw/dn_sched_rr.c optional inet dummynet netpfil/ipfw/dn_sched_wf2q.c optional inet dummynet netpfil/ipfw/ip_dummynet.c optional inet dummynet netpfil/ipfw/ip_dn_io.c optional inet dummynet netpfil/ipfw/ip_dn_glue.c optional inet dummynet netpfil/ipfw/ip_fw2.c optional inet ipfirewall netpfil/ipfw/ip_fw_dynamic.c optional inet ipfirewall netpfil/ipfw/ip_fw_eaction.c optional inet ipfirewall netpfil/ipfw/ip_fw_log.c optional inet ipfirewall netpfil/ipfw/ip_fw_pfil.c optional inet ipfirewall netpfil/ipfw/ip_fw_sockopt.c optional inet ipfirewall netpfil/ipfw/ip_fw_table.c optional inet ipfirewall netpfil/ipfw/ip_fw_table_algo.c optional inet ipfirewall netpfil/ipfw/ip_fw_table_value.c optional inet ipfirewall netpfil/ipfw/ip_fw_iface.c optional inet ipfirewall netpfil/ipfw/ip_fw_nat.c optional inet ipfirewall_nat netpfil/ipfw/nptv6/ip_fw_nptv6.c optional inet inet6 ipfirewall \ ipfirewall_nptv6 netpfil/ipfw/nptv6/nptv6.c optional inet inet6 ipfirewall \ ipfirewall_nptv6 netpfil/pf/if_pflog.c optional pflog pf inet netpfil/pf/if_pfsync.c optional pfsync pf inet netpfil/pf/pf.c optional pf inet netpfil/pf/pf_if.c optional pf inet netpfil/pf/pf_ioctl.c optional pf inet netpfil/pf/pf_lb.c optional pf inet netpfil/pf/pf_norm.c optional pf inet netpfil/pf/pf_osfp.c optional pf inet netpfil/pf/pf_ruleset.c optional pf inet netpfil/pf/pf_table.c optional pf inet netpfil/pf/in4_cksum.c optional pf inet netsmb/smb_conn.c optional netsmb netsmb/smb_crypt.c optional netsmb netsmb/smb_dev.c optional netsmb netsmb/smb_iod.c optional netsmb netsmb/smb_rq.c optional netsmb netsmb/smb_smb.c optional netsmb netsmb/smb_subr.c optional netsmb netsmb/smb_trantcp.c optional netsmb netsmb/smb_usr.c optional netsmb nfs/bootp_subr.c optional bootp nfscl nfs/krpc_subr.c optional bootp nfscl nfs/nfs_diskless.c optional nfscl nfs_root nfs/nfs_fha.c optional nfsd nfs/nfs_lock.c optional nfscl | nfslockd | nfsd nfs/nfs_nfssvc.c optional nfscl | nfsd nlm/nlm_advlock.c optional nfslockd | nfsd nlm/nlm_prot_clnt.c optional nfslockd | nfsd nlm/nlm_prot_impl.c optional nfslockd | nfsd nlm/nlm_prot_server.c optional nfslockd | nfsd nlm/nlm_prot_svc.c optional nfslockd | nfsd nlm/nlm_prot_xdr.c optional nfslockd | nfsd nlm/sm_inter_xdr.c optional nfslockd | nfsd # Linux Kernel Programming Interface compat/linuxkpi/common/src/linux_kmod.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_compat.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_pci.c optional compat_linuxkpi pci \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_idr.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_radix.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_usb.c optional compat_linuxkpi usb \ compile-with "${LINUXKPI_C}" # OpenFabrics Enterprise Distribution (Infiniband) ofed/drivers/infiniband/core/addr.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/agent.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/cache.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" # XXX Mad.c must be ordered before cm.c for sysinit sets to occur in # the correct order. ofed/drivers/infiniband/core/mad.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/cm.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/ -Wno-unused-function" ofed/drivers/infiniband/core/cma.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/device.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/fmr_pool.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/iwcm.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/mad_rmpp.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/multicast.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/packer.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/peer_mem.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/sa_query.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/smi.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/sysfs.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/ucm.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/ucma.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/ud_header.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/umem.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/user_mad.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/uverbs_cmd.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/uverbs_main.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/uverbs_marshall.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/core/verbs.c optional ofed \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/core/" ofed/drivers/infiniband/ulp/ipoib/ipoib_cm.c optional ipoib \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/ipoib/" #ofed/drivers/infiniband/ulp/ipoib/ipoib_fs.c optional ipoib \ # compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/ipoib/" ofed/drivers/infiniband/ulp/ipoib/ipoib_ib.c optional ipoib \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/ipoib/" ofed/drivers/infiniband/ulp/ipoib/ipoib_main.c optional ipoib \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/ipoib/" ofed/drivers/infiniband/ulp/ipoib/ipoib_multicast.c optional ipoib \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/ipoib/" ofed/drivers/infiniband/ulp/ipoib/ipoib_verbs.c optional ipoib \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/ipoib/" #ofed/drivers/infiniband/ulp/ipoib/ipoib_vlan.c optional ipoib \ # compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/ipoib/" ofed/drivers/infiniband/ulp/sdp/sdp_bcopy.c optional sdp inet \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/sdp/" ofed/drivers/infiniband/ulp/sdp/sdp_main.c optional sdp inet \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/sdp/" ofed/drivers/infiniband/ulp/sdp/sdp_rx.c optional sdp inet \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/sdp/" ofed/drivers/infiniband/ulp/sdp/sdp_cma.c optional sdp inet \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/sdp/" ofed/drivers/infiniband/ulp/sdp/sdp_tx.c optional sdp inet \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/sdp/" ofed/drivers/infiniband/hw/mlx4/alias_GUID.c optional mlx4ib \ no-depend obj-prefix "mlx4ib_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/infiniband/hw/mlx4/" ofed/drivers/infiniband/hw/mlx4/mcg.c optional mlx4ib \ no-depend obj-prefix "mlx4ib_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/infiniband/hw/mlx4/" ofed/drivers/infiniband/hw/mlx4/sysfs.c optional mlx4ib \ no-depend obj-prefix "mlx4ib_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/infiniband/hw/mlx4/" ofed/drivers/infiniband/hw/mlx4/cm.c optional mlx4ib \ no-depend obj-prefix "mlx4ib_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/infiniband/hw/mlx4/" ofed/drivers/infiniband/hw/mlx4/ah.c optional mlx4ib \ no-depend obj-prefix "mlx4ib_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/infiniband/hw/mlx4/" ofed/drivers/infiniband/hw/mlx4/cq.c optional mlx4ib \ no-depend obj-prefix "mlx4ib_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/infiniband/hw/mlx4/" ofed/drivers/infiniband/hw/mlx4/doorbell.c optional mlx4ib \ no-depend obj-prefix "mlx4ib_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/infiniband/hw/mlx4/" ofed/drivers/infiniband/hw/mlx4/mad.c optional mlx4ib \ no-depend obj-prefix "mlx4ib_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/infiniband/hw/mlx4/" ofed/drivers/infiniband/hw/mlx4/main.c optional mlx4ib \ no-depend obj-prefix "mlx4ib_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/infiniband/hw/mlx4/" ofed/drivers/infiniband/hw/mlx4/mlx4_exp.c optional mlx4ib \ no-depend obj-prefix "mlx4ib_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/infiniband/hw/mlx4/" ofed/drivers/infiniband/hw/mlx4/mr.c optional mlx4ib \ no-depend obj-prefix "mlx4ib_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/infiniband/hw/mlx4/" ofed/drivers/infiniband/hw/mlx4/qp.c optional mlx4ib \ no-depend obj-prefix "mlx4ib_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/infiniband/hw/mlx4/" ofed/drivers/infiniband/hw/mlx4/srq.c optional mlx4ib \ no-depend obj-prefix "mlx4ib_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/infiniband/hw/mlx4/" ofed/drivers/infiniband/hw/mlx4/wc.c optional mlx4ib \ no-depend obj-prefix "mlx4ib_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/infiniband/hw/mlx4/" ofed/drivers/net/mlx4/alloc.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/catas.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/cmd.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/cq.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/eq.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/fw.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/icm.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/intf.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/main.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/mcg.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/ -Wno-unused" ofed/drivers/net/mlx4/mr.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/pd.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/port.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/profile.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/qp.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/reset.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/sense.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/srq.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/resource_tracker.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/sys_tune.c optional mlx4ib | mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/en_cq.c optional mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/en_main.c optional mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/en_netdev.c optional mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/en_port.c optional mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/en_resources.c optional mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/en_rx.c optional mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" ofed/drivers/net/mlx4/en_tx.c optional mlxen \ no-depend obj-prefix "mlx4_" \ compile-with "${OFED_C_NOIMP} -I$S/ofed/drivers/net/mlx4/" dev/mlx5/mlx5_core/mlx5_alloc.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_cmd.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_cq.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_eq.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_flow_table.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_fw.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_health.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_mad.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_main.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_mcg.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_mr.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_pagealloc.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_pd.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_port.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_qp.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_srq.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_transobj.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_uar.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_vport.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_wq.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_en/mlx5_en_ethtool.c optional mlx5en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx5/mlx5_en/mlx5_en_main.c optional mlx5en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx5/mlx5_en/mlx5_en_tx.c optional mlx5en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx5/mlx5_en/mlx5_en_flow_table.c optional mlx5en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx5/mlx5_en/mlx5_en_rx.c optional mlx5en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx5/mlx5_en/mlx5_en_txrx.c optional mlx5en pci inet inet6 \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_allocator.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_av.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_catas.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_cmd.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_cq.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_eq.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_mad.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_main.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_mcg.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_memfree.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_mr.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_pd.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_profile.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_provider.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_qp.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_reset.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_srq.c optional mthca \ compile-with "${OFED_C}" ofed/drivers/infiniband/hw/mthca/mthca_uar.c optional mthca \ compile-with "${OFED_C}" # crypto support opencrypto/cast.c optional crypto | ipsec opencrypto/criov.c optional crypto | ipsec opencrypto/crypto.c optional crypto | ipsec opencrypto/cryptodev.c optional cryptodev opencrypto/cryptodev_if.m optional crypto | ipsec opencrypto/cryptosoft.c optional crypto | ipsec opencrypto/cryptodeflate.c optional crypto | ipsec opencrypto/gmac.c optional crypto | ipsec opencrypto/gfmult.c optional crypto | ipsec opencrypto/rmd160.c optional crypto | ipsec opencrypto/skipjack.c optional crypto | ipsec opencrypto/xform.c optional crypto | ipsec rpc/auth_none.c optional krpc | nfslockd | nfscl | nfsd rpc/auth_unix.c optional krpc | nfslockd | nfscl | nfsd rpc/authunix_prot.c optional krpc | nfslockd | nfscl | nfsd rpc/clnt_bck.c optional krpc | nfslockd | nfscl | nfsd rpc/clnt_dg.c optional krpc | nfslockd | nfscl | nfsd rpc/clnt_rc.c optional krpc | nfslockd | nfscl | nfsd rpc/clnt_vc.c optional krpc | nfslockd | nfscl | nfsd rpc/getnetconfig.c optional krpc | nfslockd | nfscl | nfsd rpc/replay.c optional krpc | nfslockd | nfscl | nfsd rpc/rpc_callmsg.c optional krpc | nfslockd | nfscl | nfsd rpc/rpc_generic.c optional krpc | nfslockd | nfscl | nfsd rpc/rpc_prot.c optional krpc | nfslockd | nfscl | nfsd rpc/rpcb_clnt.c optional krpc | nfslockd | nfscl | nfsd rpc/rpcb_prot.c optional krpc | nfslockd | nfscl | nfsd rpc/svc.c optional krpc | nfslockd | nfscl | nfsd rpc/svc_auth.c optional krpc | nfslockd | nfscl | nfsd rpc/svc_auth_unix.c optional krpc | nfslockd | nfscl | nfsd rpc/svc_dg.c optional krpc | nfslockd | nfscl | nfsd rpc/svc_generic.c optional krpc | nfslockd | nfscl | nfsd rpc/svc_vc.c optional krpc | nfslockd | nfscl | nfsd rpc/rpcsec_gss/rpcsec_gss.c optional krpc kgssapi | nfslockd kgssapi | nfscl kgssapi | nfsd kgssapi rpc/rpcsec_gss/rpcsec_gss_conf.c optional krpc kgssapi | nfslockd kgssapi | nfscl kgssapi | nfsd kgssapi rpc/rpcsec_gss/rpcsec_gss_misc.c optional krpc kgssapi | nfslockd kgssapi | nfscl kgssapi | nfsd kgssapi rpc/rpcsec_gss/rpcsec_gss_prot.c optional krpc kgssapi | nfslockd kgssapi | nfscl kgssapi | nfsd kgssapi rpc/rpcsec_gss/svc_rpcsec_gss.c optional krpc kgssapi | nfslockd kgssapi | nfscl kgssapi | nfsd kgssapi security/audit/audit.c optional audit security/audit/audit_arg.c optional audit security/audit/audit_bsm.c optional audit security/audit/audit_bsm_klib.c optional audit security/audit/audit_pipe.c optional audit security/audit/audit_syscalls.c standard security/audit/audit_trigger.c optional audit security/audit/audit_worker.c optional audit security/audit/bsm_domain.c optional audit security/audit/bsm_errno.c optional audit security/audit/bsm_fcntl.c optional audit security/audit/bsm_socket_type.c optional audit security/audit/bsm_token.c optional audit security/mac/mac_audit.c optional mac audit security/mac/mac_cred.c optional mac security/mac/mac_framework.c optional mac security/mac/mac_inet.c optional mac inet | mac inet6 security/mac/mac_inet6.c optional mac inet6 security/mac/mac_label.c optional mac security/mac/mac_net.c optional mac security/mac/mac_pipe.c optional mac security/mac/mac_posix_sem.c optional mac security/mac/mac_posix_shm.c optional mac security/mac/mac_priv.c optional mac security/mac/mac_process.c optional mac security/mac/mac_socket.c optional mac security/mac/mac_syscalls.c standard security/mac/mac_system.c optional mac security/mac/mac_sysv_msg.c optional mac security/mac/mac_sysv_sem.c optional mac security/mac/mac_sysv_shm.c optional mac security/mac/mac_vfs.c optional mac security/mac_biba/mac_biba.c optional mac_biba security/mac_bsdextended/mac_bsdextended.c optional mac_bsdextended security/mac_bsdextended/ugidfw_system.c optional mac_bsdextended security/mac_bsdextended/ugidfw_vnode.c optional mac_bsdextended security/mac_ifoff/mac_ifoff.c optional mac_ifoff security/mac_lomac/mac_lomac.c optional mac_lomac security/mac_mls/mac_mls.c optional mac_mls security/mac_none/mac_none.c optional mac_none security/mac_partition/mac_partition.c optional mac_partition security/mac_portacl/mac_portacl.c optional mac_portacl security/mac_seeotheruids/mac_seeotheruids.c optional mac_seeotheruids security/mac_stub/mac_stub.c optional mac_stub security/mac_test/mac_test.c optional mac_test teken/teken.c optional sc | vt ufs/ffs/ffs_alloc.c optional ffs ufs/ffs/ffs_balloc.c optional ffs ufs/ffs/ffs_inode.c optional ffs ufs/ffs/ffs_snapshot.c optional ffs ufs/ffs/ffs_softdep.c optional ffs ufs/ffs/ffs_subr.c optional ffs ufs/ffs/ffs_tables.c optional ffs ufs/ffs/ffs_vfsops.c optional ffs ufs/ffs/ffs_vnops.c optional ffs ufs/ffs/ffs_rawread.c optional ffs directio ufs/ffs/ffs_suspend.c optional ffs ufs/ufs/ufs_acl.c optional ffs ufs/ufs/ufs_bmap.c optional ffs ufs/ufs/ufs_dirhash.c optional ffs ufs/ufs/ufs_extattr.c optional ffs ufs/ufs/ufs_gjournal.c optional ffs UFS_GJOURNAL ufs/ufs/ufs_inode.c optional ffs ufs/ufs/ufs_lookup.c optional ffs ufs/ufs/ufs_quota.c optional ffs ufs/ufs/ufs_vfsops.c optional ffs ufs/ufs/ufs_vnops.c optional ffs vm/default_pager.c standard vm/device_pager.c standard vm/phys_pager.c standard vm/redzone.c optional DEBUG_REDZONE vm/sg_pager.c standard vm/swap_pager.c standard vm/uma_core.c standard vm/uma_dbg.c standard vm/memguard.c optional DEBUG_MEMGUARD vm/vm_fault.c standard vm/vm_glue.c standard vm/vm_init.c standard vm/vm_kern.c standard vm/vm_map.c standard vm/vm_meter.c standard vm/vm_mmap.c standard vm/vm_object.c standard vm/vm_page.c standard vm/vm_pageout.c standard vm/vm_pager.c standard vm/vm_phys.c standard vm/vm_radix.c standard vm/vm_reserv.c standard vm/vm_domain.c standard vm/vm_unix.c standard vm/vm_zeroidle.c standard vm/vnode_pager.c standard xen/features.c optional xenhvm xen/xenbus/xenbus_if.m optional xenhvm xen/xenbus/xenbus.c optional xenhvm xen/xenbus/xenbusb_if.m optional xenhvm xen/xenbus/xenbusb.c optional xenhvm xen/xenbus/xenbusb_front.c optional xenhvm xen/xenbus/xenbusb_back.c optional xenhvm xen/xenmem/xenmem_if.m optional xenhvm xdr/xdr.c optional krpc | nfslockd | nfscl | nfsd xdr/xdr_array.c optional krpc | nfslockd | nfscl | nfsd xdr/xdr_mbuf.c optional krpc | nfslockd | nfscl | nfsd xdr/xdr_mem.c optional krpc | nfslockd | nfscl | nfsd xdr/xdr_reference.c optional krpc | nfslockd | nfscl | nfsd xdr/xdr_sizeof.c optional krpc | nfslockd | nfscl | nfsd Index: user/alc/PQ_LAUNDRY/sys/conf/files.arm64 =================================================================== --- user/alc/PQ_LAUNDRY/sys/conf/files.arm64 (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/conf/files.arm64 (revision 303206) @@ -1,100 +1,100 @@ # $FreeBSD$ arm/arm/generic_timer.c standard arm/arm/gic.c optional intrng arm/arm/pmu.c standard arm64/acpica/acpi_machdep.c optional acpi arm64/acpica/OsdEnvironment.c optional acpi arm64/acpica/acpi_wakeup.c optional acpi arm64/acpica/pci_cfgreg.c optional acpi pci arm64/arm64/autoconf.c standard arm64/arm64/bcopy.c standard arm64/arm64/bus_machdep.c standard arm64/arm64/bus_space_asm.S standard arm64/arm64/busdma_bounce.c standard arm64/arm64/busdma_machdep.c standard arm64/arm64/bzero.S standard arm64/arm64/clock.c standard arm64/arm64/copyinout.S standard arm64/arm64/copystr.c standard arm64/arm64/cpufunc_asm.S standard arm64/arm64/db_disasm.c optional ddb arm64/arm64/db_interface.c optional ddb arm64/arm64/db_trace.c optional ddb arm64/arm64/debug_monitor.c optional kdb arm64/arm64/disassem.c optional ddb arm64/arm64/dump_machdep.c standard arm64/arm64/elf_machdep.c standard arm64/arm64/exception.S standard arm64/arm64/gicv3_its.c optional intrng arm64/arm64/gic_v3.c standard arm64/arm64/gic_v3_fdt.c optional fdt arm64/arm64/identcpu.c standard arm64/arm64/in_cksum.c optional inet | inet6 arm64/arm64/locore.S standard no-obj arm64/arm64/machdep.c standard arm64/arm64/mem.c standard arm64/arm64/minidump_machdep.c standard arm64/arm64/mp_machdep.c optional smp arm64/arm64/nexus.c standard arm64/arm64/ofw_machdep.c optional fdt arm64/arm64/pmap.c standard arm64/arm64/stack_machdep.c optional ddb | stack arm64/arm64/support.S standard arm64/arm64/swtch.S standard arm64/arm64/sys_machdep.c standard arm64/arm64/trap.c standard arm64/arm64/uio_machdep.c standard arm64/arm64/uma_machdep.c standard arm64/arm64/unwind.c optional ddb | kdtrace_hooks | stack arm64/arm64/vfp.c standard arm64/arm64/vm_machdep.c standard arm64/cavium/thunder_pcie_fdt.c optional soc_cavm_thunderx pci fdt arm64/cavium/thunder_pcie_pem.c optional soc_cavm_thunderx pci arm64/cavium/thunder_pcie_pem_fdt.c optional soc_cavm_thunderx pci fdt arm64/cavium/thunder_pcie_common.c optional soc_cavm_thunderx pci arm64/cloudabi64/cloudabi64_sysvec.c optional compat_cloudabi64 crypto/blowfish/bf_enc.c optional crypto | ipsec crypto/des/des_enc.c optional crypto | ipsec | netsmb dev/acpica/acpi_if.m optional acpi dev/ahci/ahci_generic.c optional ahci fdt dev/hwpmc/hwpmc_arm64.c optional hwpmc dev/hwpmc/hwpmc_arm64_md.c optional hwpmc -dev/mmc/host/dwmmc.c optional dwmmc -dev/mmc/host/dwmmc_hisi.c optional dwmmc soc_hisi_hi6220 +dev/mmc/host/dwmmc.c optional dwmmc fdt +dev/mmc/host/dwmmc_hisi.c optional dwmmc fdt soc_hisi_hi6220 dev/ofw/ofw_cpu.c optional fdt dev/ofw/ofwpci.c optional fdt pci dev/pci/pci_host_generic.c optional pci fdt dev/psci/psci.c optional psci dev/psci/psci_arm64.S optional psci dev/uart/uart_cpu_fdt.c optional uart fdt dev/uart/uart_dev_pl011.c optional uart pl011 -dev/usb/controller/dwc_otg_hisi.c optional dwcotg soc_hisi_hi6220 +dev/usb/controller/dwc_otg_hisi.c optional dwcotg fdt soc_hisi_hi6220 dev/usb/controller/generic_ohci.c optional ohci fdt dev/usb/controller/generic_usb_if.m optional ohci fdt dev/vnic/mrml_bridge.c optional vnic fdt dev/vnic/nic_main.c optional vnic pci dev/vnic/nicvf_main.c optional vnic pci pci_iov dev/vnic/nicvf_queues.c optional vnic pci pci_iov dev/vnic/thunder_bgx_fdt.c optional vnic fdt dev/vnic/thunder_bgx.c optional vnic pci dev/vnic/thunder_mdio_fdt.c optional vnic fdt dev/vnic/thunder_mdio.c optional vnic dev/vnic/lmac_if.m optional inet | inet6 | vnic kern/kern_clocksource.c standard kern/msi_if.m optional intrng kern/pic_if.m optional intrng kern/subr_devmap.c standard kern/subr_intr.c optional intrng libkern/bcmp.c standard libkern/ffs.c standard libkern/ffsl.c standard libkern/ffsll.c standard libkern/fls.c standard libkern/flsl.c standard libkern/flsll.c standard libkern/memmove.c standard libkern/memset.c standard cddl/contrib/opensolaris/common/atomic/aarch64/opensolaris_atomic.S optional zfs | dtrace compile-with "${CDDL_C}" cddl/dev/dtrace/aarch64/dtrace_asm.S optional dtrace compile-with "${DTRACE_S}" cddl/dev/dtrace/aarch64/dtrace_subr.c optional dtrace compile-with "${DTRACE_C}" cddl/dev/fbt/aarch64/fbt_isa.c optional dtrace_fbt | dtraceall compile-with "${FBT_C}" Index: user/alc/PQ_LAUNDRY/sys/conf/kern.mk =================================================================== --- user/alc/PQ_LAUNDRY/sys/conf/kern.mk (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/conf/kern.mk (revision 303206) @@ -1,233 +1,236 @@ # $FreeBSD$ # # Warning flags for compiling the kernel and components of the kernel: # CWARNFLAGS?= -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes \ -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual \ -Wundef -Wno-pointer-sign ${FORMAT_EXTENSIONS} \ -Wmissing-include-dirs -fdiagnostics-show-option \ -Wno-unknown-pragmas \ ${CWARNEXTRA} # # The following flags are next up for working on: # -Wextra # Disable a few warnings for clang, since there are several places in the # kernel where fixing them is more trouble than it is worth, or where there is # a false positive. .if ${COMPILER_TYPE} == "clang" NO_WCONSTANT_CONVERSION= -Wno-constant-conversion NO_WSHIFT_COUNT_NEGATIVE= -Wno-shift-count-negative NO_WSHIFT_COUNT_OVERFLOW= -Wno-shift-count-overflow NO_WSELF_ASSIGN= -Wno-self-assign NO_WUNNEEDED_INTERNAL_DECL= -Wno-unneeded-internal-declaration NO_WSOMETIMES_UNINITIALIZED= -Wno-error-sometimes-uninitialized NO_WCAST_QUAL= -Wno-cast-qual # Several other warnings which might be useful in some cases, but not severe # enough to error out the whole kernel build. Display them anyway, so there is # some incentive to fix them eventually. CWARNEXTRA?= -Wno-error-tautological-compare -Wno-error-empty-body \ -Wno-error-parentheses-equality -Wno-error-unused-function \ -Wno-error-pointer-sign .if ${COMPILER_VERSION} >= 30700 CWARNEXTRA+= -Wno-error-shift-negative-value .endif CLANG_NO_IAS= -no-integrated-as .if ${COMPILER_VERSION} < 30500 # XXX: clang < 3.5 integrated-as doesn't grok .codeNN directives CLANG_NO_IAS34= -no-integrated-as .endif .endif .if ${COMPILER_TYPE} == "gcc" .if ${COMPILER_VERSION} >= 40800 # Catch-all for all the things that are in our tree, but for which we're # not yet ready for this compiler. CWARNEXTRA?= -Wno-error=inline -Wno-error=enum-compare -Wno-error=unused-but-set-variable \ -Wno-error=aggressive-loop-optimizations -Wno-error=maybe-uninitialized \ -Wno-error=array-bounds -Wno-error=address \ -Wno-error=cast-qual -Wno-error=sequence-point -Wno-error=attributes \ -Wno-error=strict-overflow -Wno-error=overflow +.if ${COMPILER_VERSION} >= 60100 +CWARNEXTRA+= -Wno-error=nonnull-compare -Wno-error=shift-overflow= +.endif .else # For gcc 4.2, eliminate the too-often-wrong warnings about uninitialized vars. CWARNEXTRA?= -Wno-uninitialized .endif .endif # External compilers may not support our format extensions. Allow them # to be disabled. WARNING: format checking is disabled in this case. .if ${MK_FORMAT_EXTENSIONS} == "no" FORMAT_EXTENSIONS= -Wno-format .elif ${COMPILER_TYPE} == "clang" && ${COMPILER_VERSION} >= 30600 FORMAT_EXTENSIONS= -D__printf__=__freebsd_kprintf__ .else FORMAT_EXTENSIONS= -fformat-extensions .endif # # On i386, do not align the stack to 16-byte boundaries. Otherwise GCC 2.95 # and above adds code to the entry and exit point of every function to align the # stack to 16-byte boundaries -- thus wasting approximately 12 bytes of stack # per function call. While the 16-byte alignment may benefit micro benchmarks, # it is probably an overall loss as it makes the code bigger (less efficient # use of code cache tag lines) and uses more stack (less efficient use of data # cache tag lines). Explicitly prohibit the use of FPU, SSE and other SIMD # operations inside the kernel itself. These operations are exclusively # reserved for user applications. # # gcc: # Setting -mno-mmx implies -mno-3dnow # Setting -mno-sse implies -mno-sse2, -mno-sse3 and -mno-ssse3 # # clang: # Setting -mno-mmx implies -mno-3dnow and -mno-3dnowa # Setting -mno-sse implies -mno-sse2, -mno-sse3, -mno-ssse3, -mno-sse41 and -mno-sse42 # .if ${MACHINE_CPUARCH} == "i386" CFLAGS.gcc+= -mno-align-long-strings -mpreferred-stack-boundary=2 CFLAGS.clang+= -mno-aes -mno-avx CFLAGS+= -mno-mmx -mno-sse -msoft-float INLINE_LIMIT?= 8000 .endif .if ${MACHINE_CPUARCH} == "arm" INLINE_LIMIT?= 8000 .endif .if ${MACHINE_CPUARCH} == "aarch64" # We generally don't want fpu instructions in the kernel. CFLAGS += -mgeneral-regs-only # Reserve x18 for pcpu data CFLAGS += -ffixed-x18 .endif .if ${MACHINE_CPUARCH} == "riscv" CFLAGS.gcc+= -mcmodel=medany INLINE_LIMIT?= 8000 .endif # # For sparc64 we want the medany code model so modules may be located # anywhere in the 64-bit address space. We also tell GCC to use floating # point emulation. This avoids using floating point registers for integer # operations which it has a tendency to do. # .if ${MACHINE_CPUARCH} == "sparc64" CFLAGS.clang+= -mcmodel=large -fno-dwarf2-cfi-asm CFLAGS.gcc+= -mcmodel=medany -msoft-float INLINE_LIMIT?= 15000 .endif # # For AMD64, we explicitly prohibit the use of FPU, SSE and other SIMD # operations inside the kernel itself. These operations are exclusively # reserved for user applications. # # gcc: # Setting -mno-mmx implies -mno-3dnow # Setting -mno-sse implies -mno-sse2, -mno-sse3, -mno-ssse3 and -mfpmath=387 # # clang: # Setting -mno-mmx implies -mno-3dnow and -mno-3dnowa # Setting -mno-sse implies -mno-sse2, -mno-sse3, -mno-ssse3, -mno-sse41 and -mno-sse42 # (-mfpmath= is not supported) # .if ${MACHINE_CPUARCH} == "amd64" CFLAGS.clang+= -mno-aes -mno-avx CFLAGS+= -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float \ -fno-asynchronous-unwind-tables INLINE_LIMIT?= 8000 .endif # # For PowerPC we tell gcc to use floating point emulation. This avoids using # floating point registers for integer operations which it has a tendency to do. # Also explicitly disable Altivec instructions inside the kernel. # .if ${MACHINE_CPUARCH} == "powerpc" CFLAGS+= -mno-altivec CFLAGS.clang+= -mllvm -disable-ppc-float-in-variadic=true CFLAGS.gcc+= -msoft-float INLINE_LIMIT?= 15000 .endif # # Use dot symbols on powerpc64 to make ddb happy # .if ${MACHINE_ARCH} == "powerpc64" CFLAGS.gcc+= -mcall-aixdesc .endif # # For MIPS we also tell gcc to use floating point emulation # .if ${MACHINE_CPUARCH} == "mips" CFLAGS+= -msoft-float INLINE_LIMIT?= 8000 .endif # # GCC 3.0 and above like to do certain optimizations based on the # assumption that the program is linked against libc. Stop this. # CFLAGS+= -ffreestanding # # The C standard leaves signed integer overflow behavior undefined. # gcc and clang opimizers take advantage of this. The kernel makes # use of signed integer wraparound mechanics so we need the compiler # to treat it as a wraparound and not take shortcuts. # CFLAGS+= -fwrapv # # GCC SSP support # .if ${MK_SSP} != "no" && \ ${MACHINE_CPUARCH} != "arm" && ${MACHINE_CPUARCH} != "mips" CFLAGS+= -fstack-protector .endif # # Add -gdwarf-2 when compiling -g. The default starting in clang v3.4 # and gcc 4.8 is to generate DWARF version 4. However, our tools don't # cope well with DWARF 4, so force it to genereate DWARF2, which they # understand. Do this unconditionally as it is harmless when not needed, # but critical for these newer versions. # .if ${CFLAGS:M-g} != "" && ${CFLAGS:M-gdwarf*} == "" CFLAGS+= -gdwarf-2 .endif CFLAGS+= ${CWARNFLAGS} ${CWARNFLAGS.${.IMPSRC:T}} CFLAGS+= ${CFLAGS.${COMPILER_TYPE}} ${CFLAGS.${.IMPSRC:T}} # Tell bmake not to mistake standard targets for things to be searched for # or expect to ever be up-to-date. PHONY_NOTMAIN = afterdepend afterinstall all beforedepend beforeinstall \ beforelinking build build-tools buildfiles buildincludes \ checkdpadd clean cleandepend cleandir cleanobj configure \ depend distclean distribute exe \ html includes install installfiles installincludes lint \ obj objlink objs objwarn \ realinstall regress \ tags whereobj .PHONY: ${PHONY_NOTMAIN} .NOTMAIN: ${PHONY_NOTMAIN} CSTD= c99 .if ${CSTD} == "k&r" CFLAGS+= -traditional .elif ${CSTD} == "c89" || ${CSTD} == "c90" CFLAGS+= -std=iso9899:1990 .elif ${CSTD} == "c94" || ${CSTD} == "c95" CFLAGS+= -std=iso9899:199409 .elif ${CSTD} == "c99" CFLAGS+= -std=iso9899:1999 .else # CSTD CFLAGS+= -std=${CSTD} .endif # CSTD Index: user/alc/PQ_LAUNDRY/sys/conf/kern.pre.mk =================================================================== --- user/alc/PQ_LAUNDRY/sys/conf/kern.pre.mk (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/conf/kern.pre.mk (revision 303206) @@ -1,249 +1,251 @@ # $FreeBSD$ # Part of a unified Makefile for building kernels. This part contains all # of the definitions that need to be before %BEFORE_DEPEND. # Allow user to configure things that only effect src tree builds. # Note: This is duplicated from src.sys.mk to ensure that we include # /etc/src.conf when building the kernel. Kernels can be built without # the rest of /usr/src, but they still always process SRCCONF even though # the normal mechanisms to prevent that (compiling out of tree) won't # work. To ensure they do work, we have to duplicate thee few lines here. SRCCONF?= /etc/src.conf .if (exists(${SRCCONF}) || ${SRCCONF} != "/etc/src.conf") && !target(_srcconf_included_) .include "${SRCCONF}" _srcconf_included_: .endif .include .include .include "kern.opts.mk" # The kernel build always occurs in the object directory which is .CURDIR. .if ${.MAKE.MODE:Unormal:Mmeta} .MAKE.MODE+= curdirOk=yes .endif # Can be overridden by makeoptions or /etc/make.conf KERNEL_KO?= kernel KERNEL?= kernel KODIR?= /boot/${KERNEL} LDSCRIPT_NAME?= ldscript.$M LDSCRIPT?= $S/conf/${LDSCRIPT_NAME} M= ${MACHINE} AWK?= awk CP?= cp LINT?= lint NM?= nm OBJCOPY?= objcopy SIZE?= size .if defined(DEBUG) _MINUS_O= -O CTFFLAGS+= -g .else .if ${MACHINE_CPUARCH} == "powerpc" _MINUS_O= -O # gcc miscompiles some code at -O2 .else _MINUS_O= -O2 .endif .endif .if ${MACHINE_CPUARCH} == "amd64" .if ${COMPILER_TYPE} == "clang" COPTFLAGS?=-O2 -pipe .else COPTFLAGS?=-O2 -frename-registers -pipe .endif .else COPTFLAGS?=${_MINUS_O} -pipe .endif .if !empty(COPTFLAGS:M-O[23s]) && empty(COPTFLAGS:M-fno-strict-aliasing) COPTFLAGS+= -fno-strict-aliasing .endif .if !defined(NO_CPU_COPTFLAGS) COPTFLAGS+= ${_CPUCFLAGS} .endif NOSTDINC= -nostdinc INCLUDES= ${NOSTDINC} ${INCLMAGIC} -I. -I$S CFLAGS= ${COPTFLAGS} ${DEBUG} CFLAGS+= ${INCLUDES} -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h CFLAGS_PARAM_INLINE_UNIT_GROWTH?=100 CFLAGS_PARAM_LARGE_FUNCTION_GROWTH?=1000 .if ${MACHINE_CPUARCH} == "mips" CFLAGS_ARCH_PARAMS?=--param max-inline-insns-single=1000 .endif CFLAGS.gcc+= -fno-common -fms-extensions -finline-limit=${INLINE_LIMIT} CFLAGS.gcc+= --param inline-unit-growth=${CFLAGS_PARAM_INLINE_UNIT_GROWTH} CFLAGS.gcc+= --param large-function-growth=${CFLAGS_PARAM_LARGE_FUNCTION_GROWTH} .if defined(CFLAGS_ARCH_PARAMS) CFLAGS.gcc+=${CFLAGS_ARCH_PARAMS} .endif WERROR?= -Werror # XXX LOCORE means "don't declare C stuff" not "for locore.s". ASM_CFLAGS= -x assembler-with-cpp -DLOCORE ${CFLAGS} ${ASM_CFLAGS.${.IMPSRC:T}} .if defined(PROFLEVEL) && ${PROFLEVEL} >= 1 CFLAGS+= -DGPROF CFLAGS.gcc+= -falign-functions=16 .if ${PROFLEVEL} >= 2 CFLAGS+= -DGPROF4 -DGUPROF PROF= -pg .if ${COMPILER_TYPE} == "gcc" PROF+= -mprofiler-epilogue .endif .else PROF= -pg .endif .endif DEFINED_PROF= ${PROF} # Put configuration-specific C flags last (except for ${PROF}) so that they # can override the others. CFLAGS+= ${CONF_CFLAGS} # Optional linting. This can be overridden in /etc/make.conf. LINTFLAGS= ${LINTOBJKERNFLAGS} NORMAL_C= ${CC} -c ${CFLAGS} ${WERROR} ${PROF} ${.IMPSRC} NORMAL_S= ${CC:N${CCACHE_BIN}} -c ${ASM_CFLAGS} ${WERROR} ${.IMPSRC} PROFILE_C= ${CC} -c ${CFLAGS} ${WERROR} ${.IMPSRC} NORMAL_C_NOWERROR= ${CC} -c ${CFLAGS} ${PROF} ${.IMPSRC} NORMAL_M= ${AWK} -f $S/tools/makeobjops.awk ${.IMPSRC} -c ; \ ${CC} -c ${CFLAGS} ${WERROR} ${PROF} ${.PREFIX}.c NORMAL_FW= uudecode -o ${.TARGET} ${.ALLSRC} NORMAL_FWO= ${LD} -b binary --no-warn-mismatch -d -warn-common -r \ -o ${.TARGET} ${.ALLSRC:M*.fw} # Common for dtrace / zfs CDDL_CFLAGS= -DFREEBSD_NAMECACHE -nostdinc -I$S/cddl/compat/opensolaris -I$S/cddl/contrib/opensolaris/uts/common -I$S -I$S/cddl/contrib/opensolaris/common ${CFLAGS} -Wno-unknown-pragmas -Wno-missing-prototypes -Wno-undef -Wno-strict-prototypes -Wno-cast-qual -Wno-parentheses -Wno-redundant-decls -Wno-missing-braces -Wno-uninitialized -Wno-unused -Wno-inline -Wno-switch -Wno-pointer-arith -Wno-unknown-pragmas CDDL_CFLAGS+= -include $S/cddl/compat/opensolaris/sys/debug_compat.h CDDL_C= ${CC} -c ${CDDL_CFLAGS} ${WERROR} ${PROF} ${.IMPSRC} # Special flags for managing the compat compiles for ZFS ZFS_CFLAGS= -DBUILDING_ZFS -I$S/cddl/contrib/opensolaris/uts/common/fs/zfs -I$S/cddl/contrib/opensolaris/uts/common/zmod -I$S/cddl/contrib/opensolaris/common/zfs ${CDDL_CFLAGS} ZFS_ASM_CFLAGS= -x assembler-with-cpp -DLOCORE ${ZFS_CFLAGS} ZFS_C= ${CC} -c ${ZFS_CFLAGS} ${WERROR} ${PROF} ${.IMPSRC} ZFS_S= ${CC} -c ${ZFS_ASM_CFLAGS} ${WERROR} ${.IMPSRC} # Special flags for managing the compat compiles for DTrace DTRACE_CFLAGS= -DBUILDING_DTRACE ${CDDL_CFLAGS} -I$S/cddl/dev/dtrace -I$S/cddl/dev/dtrace/${MACHINE_CPUARCH} .if ${MACHINE_CPUARCH} == "amd64" || ${MACHINE_CPUARCH} == "i386" DTRACE_CFLAGS+= -I$S/cddl/contrib/opensolaris/uts/intel -I$S/cddl/dev/dtrace/x86 .endif DTRACE_CFLAGS+= -I$S/cddl/contrib/opensolaris/common/util -I$S -DDIS_MEM -DSMP DTRACE_ASM_CFLAGS= -x assembler-with-cpp -DLOCORE ${DTRACE_CFLAGS} DTRACE_C= ${CC} -c ${DTRACE_CFLAGS} ${WERROR} ${PROF} ${.IMPSRC} DTRACE_S= ${CC} -c ${DTRACE_ASM_CFLAGS} ${WERROR} ${.IMPSRC} # Special flags for managing the compat compiles for DTrace/FBT FBT_CFLAGS= -DBUILDING_DTRACE -nostdinc -I$S/cddl/dev/fbt/${MACHINE_CPUARCH} -I$S/cddl/dev/fbt -I$S/cddl/compat/opensolaris -I$S/cddl/contrib/opensolaris/uts/common -I$S ${CDDL_CFLAGS} .if ${MACHINE_CPUARCH} == "amd64" || ${MACHINE_CPUARCH} == "i386" FBT_CFLAGS+= -I$S/cddl/dev/fbt/x86 .endif FBT_C= ${CC} -c ${FBT_CFLAGS} ${WERROR} ${PROF} ${.IMPSRC} .if ${MK_CTF} != "no" NORMAL_CTFCONVERT= ${CTFCONVERT} ${CTFFLAGS} ${.TARGET} .elif ${MAKE_VERSION} >= 5201111300 NORMAL_CTFCONVERT= .else NORMAL_CTFCONVERT= @: .endif NORMAL_LINT= ${LINT} ${LINTFLAGS} ${CFLAGS:M-[DIU]*} ${.IMPSRC} # Linux Kernel Programming Interface C-flags LINUXKPI_INCLUDES= -I$S/compat/linuxkpi/common/include LINUXKPI_C= ${NORMAL_C} ${LINUXKPI_INCLUDES} # Infiniband C flags. Correct include paths and omit errors that linux # does not honor. OFEDINCLUDES= -I$S/ofed/include ${LINUXKPI_INCLUDES} OFEDNOERR= -Wno-cast-qual -Wno-pointer-arith OFEDCFLAGS= ${CFLAGS:N-I*} ${OFEDINCLUDES} ${CFLAGS:M-I*} ${OFEDNOERR} OFED_C_NOIMP= ${CC} -c -o ${.TARGET} ${OFEDCFLAGS} ${WERROR} ${PROF} OFED_C= ${OFED_C_NOIMP} ${.IMPSRC} GEN_CFILES= $S/$M/$M/genassym.c ${MFILES:T:S/.m$/.c/} SYSTEM_CFILES= config.c env.c hints.c vnode_if.c SYSTEM_DEP= Makefile ${SYSTEM_OBJS} SYSTEM_OBJS= locore.o ${MDOBJS} ${OBJS} SYSTEM_OBJS+= ${SYSTEM_CFILES:.c=.o} SYSTEM_OBJS+= hack.So MD_ROOT_SIZE_CONFIGURED!= grep MD_ROOT_SIZE opt_md.h || true ; echo .if ${MFS_IMAGE:Uno} != "no" .if empty(MD_ROOT_SIZE_CONFIGURED) SYSTEM_OBJS+= embedfs_${MFS_IMAGE:T:R}.o .endif .endif SYSTEM_LD= @${LD} -Bdynamic -T ${LDSCRIPT} ${_LDFLAGS} --no-warn-mismatch \ --warn-common --export-dynamic --dynamic-linker /red/herring \ -o ${.TARGET} -X ${SYSTEM_OBJS} vers.o SYSTEM_LD_TAIL= @${OBJCOPY} --strip-symbol gcc2_compiled. ${.TARGET} ; \ ${SIZE} ${.TARGET} ; chmod 755 ${.TARGET} SYSTEM_DEP+= ${LDSCRIPT} # Calculate path for .m files early, if needed. -.if !defined(__MPATH) +.if !defined(NO_MODULES) && !defined(__MPATH) __MPATH!=find ${S:tA}/ -name \*_if.m .endif # MKMODULESENV is set here so that port makefiles can augment # them. MKMODULESENV+= MAKEOBJDIRPREFIX=${.OBJDIR}/modules KMODDIR=${KODIR} MKMODULESENV+= MACHINE_CPUARCH=${MACHINE_CPUARCH} MKMODULESENV+= MACHINE=${MACHINE} MACHINE_ARCH=${MACHINE_ARCH} MKMODULESENV+= MODULES_EXTRA="${MODULES_EXTRA}" WITHOUT_MODULES="${WITHOUT_MODULES}" .if (${KERN_IDENT} == LINT) MKMODULESENV+= ALL_MODULES=LINT .endif .if defined(MODULES_OVERRIDE) MKMODULESENV+= MODULES_OVERRIDE="${MODULES_OVERRIDE}" .endif .if defined(DEBUG) MKMODULESENV+= DEBUG_FLAGS="${DEBUG}" .endif +.if !defined(NO_MODULES) MKMODULESENV+= __MPATH="${__MPATH}" +.endif # Architecture and output format arguments for objdump to convert image to # object file .if ${MFS_IMAGE:Uno} != "no" .if empty(MD_ROOT_SIZE_CONFIGURED) .if !defined(EMBEDFS_FORMAT.${MACHINE_ARCH}) EMBEDFS_FORMAT.${MACHINE_ARCH}!= awk -F'"' '/OUTPUT_FORMAT/ {print $$2}' ${LDSCRIPT} .if empty(EMBEDFS_FORMAT.${MACHINE_ARCH}) .undef EMBEDFS_FORMAT.${MACHINE_ARCH} .endif .endif .if !defined(EMBEDFS_ARCH.${MACHINE_ARCH}) EMBEDFS_ARCH.${MACHINE_ARCH}!= sed -n '/OUTPUT_ARCH/s/.*(\(.*\)).*/\1/p' ${LDSCRIPT} .if empty(EMBEDFS_ARCH.${MACHINE_ARCH}) .undef EMBEDFS_ARCH.${MACHINE_ARCH} .endif .endif EMBEDFS_FORMAT.arm?= elf32-littlearm EMBEDFS_FORMAT.armv6?= elf32-littlearm EMBEDFS_FORMAT.mips?= elf32-tradbigmips EMBEDFS_FORMAT.mipsel?= elf32-tradlittlemips EMBEDFS_FORMAT.mips64?= elf64-tradbigmips EMBEDFS_FORMAT.mips64el?= elf64-tradlittlemips EMBEDFS_FORMAT.riscv?= elf64-littleriscv .endif .endif # Detect kernel config options that force stack frames to be turned on. DDB_ENABLED!= grep DDB opt_ddb.h || true ; echo DTR_ENABLED!= grep KDTRACE_FRAME opt_kdtrace.h || true ; echo HWPMC_ENABLED!= grep HWPMC opt_hwpmc_hooks.h || true ; echo Index: user/alc/PQ_LAUNDRY/sys/dev/cxgbe/t4_sge.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/cxgbe/t4_sge.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/cxgbe/t4_sge.c (revision 303206) @@ -1,4893 +1,4960 @@ /*- * Copyright (c) 2011 Chelsio Communications, Inc. * All rights reserved. * Written by: Navdeep Parhar * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef DEV_NETMAP #include #include #include #include #include #endif #include "common/common.h" #include "common/t4_regs.h" #include "common/t4_regs_values.h" #include "common/t4_msg.h" #include "t4_l2t.h" #include "t4_mp_ring.h" #ifdef T4_PKT_TIMESTAMP #define RX_COPY_THRESHOLD (MINCLSIZE - 8) #else #define RX_COPY_THRESHOLD MINCLSIZE #endif /* * Ethernet frames are DMA'd at this byte offset into the freelist buffer. * 0-7 are valid values. */ int fl_pktshift = 2; TUNABLE_INT("hw.cxgbe.fl_pktshift", &fl_pktshift); /* * Pad ethernet payload up to this boundary. * -1: driver should figure out a good value. * 0: disable padding. * Any power of 2 from 32 to 4096 (both inclusive) is also a valid value. */ int fl_pad = -1; TUNABLE_INT("hw.cxgbe.fl_pad", &fl_pad); /* * Status page length. * -1: driver should figure out a good value. * 64 or 128 are the only other valid values. */ int spg_len = -1; TUNABLE_INT("hw.cxgbe.spg_len", &spg_len); /* * Congestion drops. * -1: no congestion feedback (not recommended). * 0: backpressure the channel instead of dropping packets right away. * 1: no backpressure, drop packets for the congested queue immediately. */ static int cong_drop = 0; TUNABLE_INT("hw.cxgbe.cong_drop", &cong_drop); /* * Deliver multiple frames in the same free list buffer if they fit. * -1: let the driver decide whether to enable buffer packing or not. * 0: disable buffer packing. * 1: enable buffer packing. */ static int buffer_packing = -1; TUNABLE_INT("hw.cxgbe.buffer_packing", &buffer_packing); /* * Start next frame in a packed buffer at this boundary. * -1: driver should figure out a good value. * T4: driver will ignore this and use the same value as fl_pad above. * T5: 16, or a power of 2 from 64 to 4096 (both inclusive) is a valid value. */ static int fl_pack = -1; TUNABLE_INT("hw.cxgbe.fl_pack", &fl_pack); /* * Allow the driver to create mbuf(s) in a cluster allocated for rx. * 0: never; always allocate mbufs from the zone_mbuf UMA zone. * 1: ok to create mbuf(s) within a cluster if there is room. */ static int allow_mbufs_in_cluster = 1; TUNABLE_INT("hw.cxgbe.allow_mbufs_in_cluster", &allow_mbufs_in_cluster); /* * Largest rx cluster size that the driver is allowed to allocate. */ static int largest_rx_cluster = MJUM16BYTES; TUNABLE_INT("hw.cxgbe.largest_rx_cluster", &largest_rx_cluster); /* * Size of cluster allocation that's most likely to succeed. The driver will * fall back to this size if it fails to allocate clusters larger than this. */ static int safest_rx_cluster = PAGE_SIZE; TUNABLE_INT("hw.cxgbe.safest_rx_cluster", &safest_rx_cluster); struct txpkts { u_int wr_type; /* type 0 or type 1 */ u_int npkt; /* # of packets in this work request */ u_int plen; /* total payload (sum of all packets) */ u_int len16; /* # of 16B pieces used by this work request */ }; /* A packet's SGL. This + m_pkthdr has all info needed for tx */ struct sgl { struct sglist sg; struct sglist_seg seg[TX_SGL_SEGS]; }; static int service_iq(struct sge_iq *, int); static struct mbuf *get_fl_payload(struct adapter *, struct sge_fl *, uint32_t); static int t4_eth_rx(struct sge_iq *, const struct rss_header *, struct mbuf *); static inline void init_iq(struct sge_iq *, struct adapter *, int, int, int); static inline void init_fl(struct adapter *, struct sge_fl *, int, int, char *); static inline void init_eq(struct adapter *, struct sge_eq *, int, int, uint8_t, uint16_t, char *); static int alloc_ring(struct adapter *, size_t, bus_dma_tag_t *, bus_dmamap_t *, bus_addr_t *, void **); static int free_ring(struct adapter *, bus_dma_tag_t, bus_dmamap_t, bus_addr_t, void *); static int alloc_iq_fl(struct vi_info *, struct sge_iq *, struct sge_fl *, int, int); static int free_iq_fl(struct vi_info *, struct sge_iq *, struct sge_fl *); static void add_fl_sysctls(struct sysctl_ctx_list *, struct sysctl_oid *, struct sge_fl *); static int alloc_fwq(struct adapter *); static int free_fwq(struct adapter *); static int alloc_mgmtq(struct adapter *); static int free_mgmtq(struct adapter *); static int alloc_rxq(struct vi_info *, struct sge_rxq *, int, int, struct sysctl_oid *); static int free_rxq(struct vi_info *, struct sge_rxq *); #ifdef TCP_OFFLOAD static int alloc_ofld_rxq(struct vi_info *, struct sge_ofld_rxq *, int, int, struct sysctl_oid *); static int free_ofld_rxq(struct vi_info *, struct sge_ofld_rxq *); #endif #ifdef DEV_NETMAP static int alloc_nm_rxq(struct vi_info *, struct sge_nm_rxq *, int, int, struct sysctl_oid *); static int free_nm_rxq(struct vi_info *, struct sge_nm_rxq *); static int alloc_nm_txq(struct vi_info *, struct sge_nm_txq *, int, int, struct sysctl_oid *); static int free_nm_txq(struct vi_info *, struct sge_nm_txq *); #endif static int ctrl_eq_alloc(struct adapter *, struct sge_eq *); static int eth_eq_alloc(struct adapter *, struct vi_info *, struct sge_eq *); #ifdef TCP_OFFLOAD static int ofld_eq_alloc(struct adapter *, struct vi_info *, struct sge_eq *); #endif static int alloc_eq(struct adapter *, struct vi_info *, struct sge_eq *); static int free_eq(struct adapter *, struct sge_eq *); static int alloc_wrq(struct adapter *, struct vi_info *, struct sge_wrq *, struct sysctl_oid *); static int free_wrq(struct adapter *, struct sge_wrq *); static int alloc_txq(struct vi_info *, struct sge_txq *, int, struct sysctl_oid *); static int free_txq(struct vi_info *, struct sge_txq *); static void oneseg_dma_callback(void *, bus_dma_segment_t *, int, int); static inline void ring_fl_db(struct adapter *, struct sge_fl *); static int refill_fl(struct adapter *, struct sge_fl *, int); static void refill_sfl(void *); static int alloc_fl_sdesc(struct sge_fl *); static void free_fl_sdesc(struct adapter *, struct sge_fl *); static void find_best_refill_source(struct adapter *, struct sge_fl *, int); static void find_safe_refill_source(struct adapter *, struct sge_fl *); static void add_fl_to_sfl(struct adapter *, struct sge_fl *); static inline void get_pkt_gl(struct mbuf *, struct sglist *); static inline u_int txpkt_len16(u_int, u_int); static inline u_int txpkts0_len16(u_int); static inline u_int txpkts1_len16(void); static u_int write_txpkt_wr(struct sge_txq *, struct fw_eth_tx_pkt_wr *, struct mbuf *, u_int); static int try_txpkts(struct mbuf *, struct mbuf *, struct txpkts *, u_int); static int add_to_txpkts(struct mbuf *, struct txpkts *, u_int); static u_int write_txpkts_wr(struct sge_txq *, struct fw_eth_tx_pkts_wr *, struct mbuf *, const struct txpkts *, u_int); static void write_gl_to_txd(struct sge_txq *, struct mbuf *, caddr_t *, int); static inline void copy_to_txd(struct sge_eq *, caddr_t, caddr_t *, int); static inline void ring_eq_db(struct adapter *, struct sge_eq *, u_int); static inline uint16_t read_hw_cidx(struct sge_eq *); static inline u_int reclaimable_tx_desc(struct sge_eq *); static inline u_int total_available_tx_desc(struct sge_eq *); static u_int reclaim_tx_descs(struct sge_txq *, u_int); static void tx_reclaim(void *, int); static __be64 get_flit(struct sglist_seg *, int, int); static int handle_sge_egr_update(struct sge_iq *, const struct rss_header *, struct mbuf *); static int handle_fw_msg(struct sge_iq *, const struct rss_header *, struct mbuf *); +static int t4_handle_wrerr_rpl(struct adapter *, const __be64 *); static void wrq_tx_drain(void *, int); static void drain_wrq_wr_list(struct adapter *, struct sge_wrq *); static int sysctl_uint16(SYSCTL_HANDLER_ARGS); static int sysctl_bufsizes(SYSCTL_HANDLER_ARGS); static int sysctl_tc(SYSCTL_HANDLER_ARGS); static counter_u64_t extfree_refs; static counter_u64_t extfree_rels; an_handler_t t4_an_handler; fw_msg_handler_t t4_fw_msg_handler[NUM_FW6_TYPES]; cpl_handler_t t4_cpl_handler[NUM_CPL_CMDS]; static int an_not_handled(struct sge_iq *iq, const struct rsp_ctrl *ctrl) { #ifdef INVARIANTS panic("%s: async notification on iq %p (ctrl %p)", __func__, iq, ctrl); #else log(LOG_ERR, "%s: async notification on iq %p (ctrl %p)\n", __func__, iq, ctrl); #endif return (EDOOFUS); } int t4_register_an_handler(an_handler_t h) { uintptr_t *loc, new; new = h ? (uintptr_t)h : (uintptr_t)an_not_handled; loc = (uintptr_t *) &t4_an_handler; atomic_store_rel_ptr(loc, new); return (0); } static int fw_msg_not_handled(struct adapter *sc, const __be64 *rpl) { const struct cpl_fw6_msg *cpl = __containerof(rpl, struct cpl_fw6_msg, data[0]); #ifdef INVARIANTS panic("%s: fw_msg type %d", __func__, cpl->type); #else log(LOG_ERR, "%s: fw_msg type %d\n", __func__, cpl->type); #endif return (EDOOFUS); } int t4_register_fw_msg_handler(int type, fw_msg_handler_t h) { uintptr_t *loc, new; if (type >= nitems(t4_fw_msg_handler)) return (EINVAL); /* * These are dispatched by the handler for FW{4|6}_CPL_MSG using the CPL * handler dispatch table. Reject any attempt to install a handler for * this subtype. */ if (type == FW_TYPE_RSSCPL || type == FW6_TYPE_RSSCPL) return (EINVAL); new = h ? (uintptr_t)h : (uintptr_t)fw_msg_not_handled; loc = (uintptr_t *) &t4_fw_msg_handler[type]; atomic_store_rel_ptr(loc, new); return (0); } static int cpl_not_handled(struct sge_iq *iq, const struct rss_header *rss, struct mbuf *m) { #ifdef INVARIANTS panic("%s: opcode 0x%02x on iq %p with payload %p", __func__, rss->opcode, iq, m); #else log(LOG_ERR, "%s: opcode 0x%02x on iq %p with payload %p\n", __func__, rss->opcode, iq, m); m_freem(m); #endif return (EDOOFUS); } int t4_register_cpl_handler(int opcode, cpl_handler_t h) { uintptr_t *loc, new; if (opcode >= nitems(t4_cpl_handler)) return (EINVAL); new = h ? (uintptr_t)h : (uintptr_t)cpl_not_handled; loc = (uintptr_t *) &t4_cpl_handler[opcode]; atomic_store_rel_ptr(loc, new); return (0); } /* * Called on MOD_LOAD. Validates and calculates the SGE tunables. */ void t4_sge_modload(void) { int i; if (fl_pktshift < 0 || fl_pktshift > 7) { printf("Invalid hw.cxgbe.fl_pktshift value (%d)," " using 2 instead.\n", fl_pktshift); fl_pktshift = 2; } if (spg_len != 64 && spg_len != 128) { int len; #if defined(__i386__) || defined(__amd64__) len = cpu_clflush_line_size > 64 ? 128 : 64; #else len = 64; #endif if (spg_len != -1) { printf("Invalid hw.cxgbe.spg_len value (%d)," " using %d instead.\n", spg_len, len); } spg_len = len; } if (cong_drop < -1 || cong_drop > 1) { printf("Invalid hw.cxgbe.cong_drop value (%d)," " using 0 instead.\n", cong_drop); cong_drop = 0; } extfree_refs = counter_u64_alloc(M_WAITOK); extfree_rels = counter_u64_alloc(M_WAITOK); counter_u64_zero(extfree_refs); counter_u64_zero(extfree_rels); t4_an_handler = an_not_handled; for (i = 0; i < nitems(t4_fw_msg_handler); i++) t4_fw_msg_handler[i] = fw_msg_not_handled; for (i = 0; i < nitems(t4_cpl_handler); i++) t4_cpl_handler[i] = cpl_not_handled; t4_register_cpl_handler(CPL_FW4_MSG, handle_fw_msg); t4_register_cpl_handler(CPL_FW6_MSG, handle_fw_msg); t4_register_cpl_handler(CPL_SGE_EGR_UPDATE, handle_sge_egr_update); t4_register_cpl_handler(CPL_RX_PKT, t4_eth_rx); t4_register_fw_msg_handler(FW6_TYPE_CMD_RPL, t4_handle_fw_rpl); + t4_register_fw_msg_handler(FW6_TYPE_WRERR_RPL, t4_handle_wrerr_rpl); } void t4_sge_modunload(void) { counter_u64_free(extfree_refs); counter_u64_free(extfree_rels); } uint64_t t4_sge_extfree_refs(void) { uint64_t refs, rels; rels = counter_u64_fetch(extfree_rels); refs = counter_u64_fetch(extfree_refs); return (refs - rels); } static inline void setup_pad_and_pack_boundaries(struct adapter *sc) { uint32_t v, m; int pad, pack; pad = fl_pad; if (fl_pad < 32 || fl_pad > 4096 || !powerof2(fl_pad)) { /* * If there is any chance that we might use buffer packing and * the chip is a T4, then pick 64 as the pad/pack boundary. Set * it to 32 in all other cases. */ pad = is_t4(sc) && buffer_packing ? 64 : 32; /* * For fl_pad = 0 we'll still write a reasonable value to the * register but all the freelists will opt out of padding. * We'll complain here only if the user tried to set it to a * value greater than 0 that was invalid. */ if (fl_pad > 0) { device_printf(sc->dev, "Invalid hw.cxgbe.fl_pad value" " (%d), using %d instead.\n", fl_pad, pad); } } m = V_INGPADBOUNDARY(M_INGPADBOUNDARY); v = V_INGPADBOUNDARY(ilog2(pad) - 5); t4_set_reg_field(sc, A_SGE_CONTROL, m, v); if (is_t4(sc)) { if (fl_pack != -1 && fl_pack != pad) { /* Complain but carry on. */ device_printf(sc->dev, "hw.cxgbe.fl_pack (%d) ignored," " using %d instead.\n", fl_pack, pad); } return; } pack = fl_pack; if (fl_pack < 16 || fl_pack == 32 || fl_pack > 4096 || !powerof2(fl_pack)) { pack = max(sc->params.pci.mps, CACHE_LINE_SIZE); MPASS(powerof2(pack)); if (pack < 16) pack = 16; if (pack == 32) pack = 64; if (pack > 4096) pack = 4096; if (fl_pack != -1) { device_printf(sc->dev, "Invalid hw.cxgbe.fl_pack value" " (%d), using %d instead.\n", fl_pack, pack); } } m = V_INGPACKBOUNDARY(M_INGPACKBOUNDARY); if (pack == 16) v = V_INGPACKBOUNDARY(0); else v = V_INGPACKBOUNDARY(ilog2(pack) - 5); MPASS(!is_t4(sc)); /* T4 doesn't have SGE_CONTROL2 */ t4_set_reg_field(sc, A_SGE_CONTROL2, m, v); } /* * adap->params.vpd.cclk must be set up before this is called. */ void t4_tweak_chip_settings(struct adapter *sc) { int i; uint32_t v, m; int intr_timer[SGE_NTIMERS] = {1, 5, 10, 50, 100, 200}; int timer_max = M_TIMERVALUE0 * 1000 / sc->params.vpd.cclk; int intr_pktcount[SGE_NCOUNTERS] = {1, 8, 16, 32}; /* 63 max */ uint16_t indsz = min(RX_COPY_THRESHOLD - 1, M_INDICATESIZE); static int sge_flbuf_sizes[] = { MCLBYTES, #if MJUMPAGESIZE != MCLBYTES MJUMPAGESIZE, MJUMPAGESIZE - CL_METADATA_SIZE, MJUMPAGESIZE - 2 * MSIZE - CL_METADATA_SIZE, #endif MJUM9BYTES, MJUM16BYTES, MCLBYTES - MSIZE - CL_METADATA_SIZE, MJUM9BYTES - CL_METADATA_SIZE, MJUM16BYTES - CL_METADATA_SIZE, }; KASSERT(sc->flags & MASTER_PF, ("%s: trying to change chip settings when not master.", __func__)); m = V_PKTSHIFT(M_PKTSHIFT) | F_RXPKTCPLMODE | F_EGRSTATUSPAGESIZE; v = V_PKTSHIFT(fl_pktshift) | F_RXPKTCPLMODE | V_EGRSTATUSPAGESIZE(spg_len == 128); t4_set_reg_field(sc, A_SGE_CONTROL, m, v); setup_pad_and_pack_boundaries(sc); v = V_HOSTPAGESIZEPF0(PAGE_SHIFT - 10) | V_HOSTPAGESIZEPF1(PAGE_SHIFT - 10) | V_HOSTPAGESIZEPF2(PAGE_SHIFT - 10) | V_HOSTPAGESIZEPF3(PAGE_SHIFT - 10) | V_HOSTPAGESIZEPF4(PAGE_SHIFT - 10) | V_HOSTPAGESIZEPF5(PAGE_SHIFT - 10) | V_HOSTPAGESIZEPF6(PAGE_SHIFT - 10) | V_HOSTPAGESIZEPF7(PAGE_SHIFT - 10); t4_write_reg(sc, A_SGE_HOST_PAGE_SIZE, v); KASSERT(nitems(sge_flbuf_sizes) <= SGE_FLBUF_SIZES, ("%s: hw buffer size table too big", __func__)); for (i = 0; i < min(nitems(sge_flbuf_sizes), SGE_FLBUF_SIZES); i++) { t4_write_reg(sc, A_SGE_FL_BUFFER_SIZE0 + (4 * i), sge_flbuf_sizes[i]); } v = V_THRESHOLD_0(intr_pktcount[0]) | V_THRESHOLD_1(intr_pktcount[1]) | V_THRESHOLD_2(intr_pktcount[2]) | V_THRESHOLD_3(intr_pktcount[3]); t4_write_reg(sc, A_SGE_INGRESS_RX_THRESHOLD, v); KASSERT(intr_timer[0] <= timer_max, ("%s: not a single usable timer (%d, %d)", __func__, intr_timer[0], timer_max)); for (i = 1; i < nitems(intr_timer); i++) { KASSERT(intr_timer[i] >= intr_timer[i - 1], ("%s: timers not listed in increasing order (%d)", __func__, i)); while (intr_timer[i] > timer_max) { if (i == nitems(intr_timer) - 1) { intr_timer[i] = timer_max; break; } intr_timer[i] += intr_timer[i - 1]; intr_timer[i] /= 2; } } v = V_TIMERVALUE0(us_to_core_ticks(sc, intr_timer[0])) | V_TIMERVALUE1(us_to_core_ticks(sc, intr_timer[1])); t4_write_reg(sc, A_SGE_TIMER_VALUE_0_AND_1, v); v = V_TIMERVALUE2(us_to_core_ticks(sc, intr_timer[2])) | V_TIMERVALUE3(us_to_core_ticks(sc, intr_timer[3])); t4_write_reg(sc, A_SGE_TIMER_VALUE_2_AND_3, v); v = V_TIMERVALUE4(us_to_core_ticks(sc, intr_timer[4])) | V_TIMERVALUE5(us_to_core_ticks(sc, intr_timer[5])); t4_write_reg(sc, A_SGE_TIMER_VALUE_4_AND_5, v); /* 4K, 16K, 64K, 256K DDP "page sizes" */ v = V_HPZ0(0) | V_HPZ1(2) | V_HPZ2(4) | V_HPZ3(6); t4_write_reg(sc, A_ULP_RX_TDDP_PSZ, v); m = v = F_TDDPTAGTCB; t4_set_reg_field(sc, A_ULP_RX_CTL, m, v); m = V_INDICATESIZE(M_INDICATESIZE) | F_REARMDDPOFFSET | F_RESETDDPOFFSET; v = V_INDICATESIZE(indsz) | F_REARMDDPOFFSET | F_RESETDDPOFFSET; t4_set_reg_field(sc, A_TP_PARA_REG5, m, v); } /* * SGE wants the buffer to be at least 64B and then a multiple of 16. If * padding is is use the buffer's start and end need to be aligned to the pad * boundary as well. We'll just make sure that the size is a multiple of the * boundary here, it is up to the buffer allocation code to make sure the start * of the buffer is aligned as well. */ static inline int hwsz_ok(struct adapter *sc, int hwsz) { int mask = fl_pad ? sc->params.sge.pad_boundary - 1 : 16 - 1; return (hwsz >= 64 && (hwsz & mask) == 0); } /* * XXX: driver really should be able to deal with unexpected settings. */ int t4_read_chip_settings(struct adapter *sc) { struct sge *s = &sc->sge; struct sge_params *sp = &sc->params.sge; int i, j, n, rc = 0; uint32_t m, v, r; uint16_t indsz = min(RX_COPY_THRESHOLD - 1, M_INDICATESIZE); static int sw_buf_sizes[] = { /* Sorted by size */ MCLBYTES, #if MJUMPAGESIZE != MCLBYTES MJUMPAGESIZE, #endif MJUM9BYTES, MJUM16BYTES }; struct sw_zone_info *swz, *safe_swz; struct hw_buf_info *hwb; t4_init_sge_params(sc); m = F_RXPKTCPLMODE; v = F_RXPKTCPLMODE; r = t4_read_reg(sc, A_SGE_CONTROL); if ((r & m) != v) { device_printf(sc->dev, "invalid SGE_CONTROL(0x%x)\n", r); rc = EINVAL; } /* * If this changes then every single use of PAGE_SHIFT in the driver * needs to be carefully reviewed for PAGE_SHIFT vs sp->page_shift. */ if (sp->page_shift != PAGE_SHIFT) { device_printf(sc->dev, "invalid SGE_HOST_PAGE_SIZE(0x%x)\n", r); rc = EINVAL; } /* Filter out unusable hw buffer sizes entirely (mark with -2). */ hwb = &s->hw_buf_info[0]; for (i = 0; i < nitems(s->hw_buf_info); i++, hwb++) { r = t4_read_reg(sc, A_SGE_FL_BUFFER_SIZE0 + (4 * i)); hwb->size = r; hwb->zidx = hwsz_ok(sc, r) ? -1 : -2; hwb->next = -1; } /* * Create a sorted list in decreasing order of hw buffer sizes (and so * increasing order of spare area) for each software zone. * * If padding is enabled then the start and end of the buffer must align * to the pad boundary; if packing is enabled then they must align with * the pack boundary as well. Allocations from the cluster zones are * aligned to min(size, 4K), so the buffer starts at that alignment and * ends at hwb->size alignment. If mbuf inlining is allowed the * starting alignment will be reduced to MSIZE and the driver will * exercise appropriate caution when deciding on the best buffer layout * to use. */ n = 0; /* no usable buffer size to begin with */ swz = &s->sw_zone_info[0]; safe_swz = NULL; for (i = 0; i < SW_ZONE_SIZES; i++, swz++) { int8_t head = -1, tail = -1; swz->size = sw_buf_sizes[i]; swz->zone = m_getzone(swz->size); swz->type = m_gettype(swz->size); if (swz->size < PAGE_SIZE) { MPASS(powerof2(swz->size)); if (fl_pad && (swz->size % sp->pad_boundary != 0)) continue; } if (swz->size == safest_rx_cluster) safe_swz = swz; hwb = &s->hw_buf_info[0]; for (j = 0; j < SGE_FLBUF_SIZES; j++, hwb++) { if (hwb->zidx != -1 || hwb->size > swz->size) continue; #ifdef INVARIANTS if (fl_pad) MPASS(hwb->size % sp->pad_boundary == 0); #endif hwb->zidx = i; if (head == -1) head = tail = j; else if (hwb->size < s->hw_buf_info[tail].size) { s->hw_buf_info[tail].next = j; tail = j; } else { int8_t *cur; struct hw_buf_info *t; for (cur = &head; *cur != -1; cur = &t->next) { t = &s->hw_buf_info[*cur]; if (hwb->size == t->size) { hwb->zidx = -2; break; } if (hwb->size > t->size) { hwb->next = *cur; *cur = j; break; } } } } swz->head_hwidx = head; swz->tail_hwidx = tail; if (tail != -1) { n++; if (swz->size - s->hw_buf_info[tail].size >= CL_METADATA_SIZE) sc->flags |= BUF_PACKING_OK; } } if (n == 0) { device_printf(sc->dev, "no usable SGE FL buffer size.\n"); rc = EINVAL; } s->safe_hwidx1 = -1; s->safe_hwidx2 = -1; if (safe_swz != NULL) { s->safe_hwidx1 = safe_swz->head_hwidx; for (i = safe_swz->head_hwidx; i != -1; i = hwb->next) { int spare; hwb = &s->hw_buf_info[i]; #ifdef INVARIANTS if (fl_pad) MPASS(hwb->size % sp->pad_boundary == 0); #endif spare = safe_swz->size - hwb->size; if (spare >= CL_METADATA_SIZE) { s->safe_hwidx2 = i; break; } } } v = V_HPZ0(0) | V_HPZ1(2) | V_HPZ2(4) | V_HPZ3(6); r = t4_read_reg(sc, A_ULP_RX_TDDP_PSZ); if (r != v) { device_printf(sc->dev, "invalid ULP_RX_TDDP_PSZ(0x%x)\n", r); rc = EINVAL; } m = v = F_TDDPTAGTCB; r = t4_read_reg(sc, A_ULP_RX_CTL); if ((r & m) != v) { device_printf(sc->dev, "invalid ULP_RX_CTL(0x%x)\n", r); rc = EINVAL; } m = V_INDICATESIZE(M_INDICATESIZE) | F_REARMDDPOFFSET | F_RESETDDPOFFSET; v = V_INDICATESIZE(indsz) | F_REARMDDPOFFSET | F_RESETDDPOFFSET; r = t4_read_reg(sc, A_TP_PARA_REG5); if ((r & m) != v) { device_printf(sc->dev, "invalid TP_PARA_REG5(0x%x)\n", r); rc = EINVAL; } t4_init_tp_params(sc); t4_read_mtu_tbl(sc, sc->params.mtus, NULL); t4_load_mtus(sc, sc->params.mtus, sc->params.a_wnd, sc->params.b_wnd); return (rc); } int t4_create_dma_tag(struct adapter *sc) { int rc; rc = bus_dma_tag_create(bus_get_dma_tag(sc->dev), 1, 0, BUS_SPACE_MAXADDR, BUS_SPACE_MAXADDR, NULL, NULL, BUS_SPACE_MAXSIZE, BUS_SPACE_UNRESTRICTED, BUS_SPACE_MAXSIZE, BUS_DMA_ALLOCNOW, NULL, NULL, &sc->dmat); if (rc != 0) { device_printf(sc->dev, "failed to create main DMA tag: %d\n", rc); } return (rc); } void t4_sge_sysctls(struct adapter *sc, struct sysctl_ctx_list *ctx, struct sysctl_oid_list *children) { struct sge_params *sp = &sc->params.sge; SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "buffer_sizes", CTLTYPE_STRING | CTLFLAG_RD, &sc->sge, 0, sysctl_bufsizes, "A", "freelist buffer sizes"); SYSCTL_ADD_INT(ctx, children, OID_AUTO, "fl_pktshift", CTLFLAG_RD, NULL, sp->fl_pktshift, "payload DMA offset in rx buffer (bytes)"); SYSCTL_ADD_INT(ctx, children, OID_AUTO, "fl_pad", CTLFLAG_RD, NULL, sp->pad_boundary, "payload pad boundary (bytes)"); SYSCTL_ADD_INT(ctx, children, OID_AUTO, "spg_len", CTLFLAG_RD, NULL, sp->spg_len, "status page size (bytes)"); SYSCTL_ADD_INT(ctx, children, OID_AUTO, "cong_drop", CTLFLAG_RD, NULL, cong_drop, "congestion drop setting"); SYSCTL_ADD_INT(ctx, children, OID_AUTO, "fl_pack", CTLFLAG_RD, NULL, sp->pack_boundary, "payload pack boundary (bytes)"); } int t4_destroy_dma_tag(struct adapter *sc) { if (sc->dmat) bus_dma_tag_destroy(sc->dmat); return (0); } /* * Allocate and initialize the firmware event queue and the management queue. * * Returns errno on failure. Resources allocated up to that point may still be * allocated. Caller is responsible for cleanup in case this function fails. */ int t4_setup_adapter_queues(struct adapter *sc) { int rc; ADAPTER_LOCK_ASSERT_NOTOWNED(sc); sysctl_ctx_init(&sc->ctx); sc->flags |= ADAP_SYSCTL_CTX; /* * Firmware event queue */ rc = alloc_fwq(sc); if (rc != 0) return (rc); /* * Management queue. This is just a control queue that uses the fwq as * its associated iq. */ rc = alloc_mgmtq(sc); return (rc); } /* * Idempotent */ int t4_teardown_adapter_queues(struct adapter *sc) { ADAPTER_LOCK_ASSERT_NOTOWNED(sc); /* Do this before freeing the queue */ if (sc->flags & ADAP_SYSCTL_CTX) { sysctl_ctx_free(&sc->ctx); sc->flags &= ~ADAP_SYSCTL_CTX; } free_mgmtq(sc); free_fwq(sc); return (0); } static inline int first_vector(struct vi_info *vi) { struct adapter *sc = vi->pi->adapter; if (sc->intr_count == 1) return (0); return (vi->first_intr); } /* * Given an arbitrary "index," come up with an iq that can be used by other * queues (of this VI) for interrupt forwarding, SGE egress updates, etc. * The iq returned is guaranteed to be something that takes direct interrupts. */ static struct sge_iq * vi_intr_iq(struct vi_info *vi, int idx) { struct adapter *sc = vi->pi->adapter; struct sge *s = &sc->sge; struct sge_iq *iq = NULL; int nintr, i; if (sc->intr_count == 1) return (&sc->sge.fwq); nintr = vi->nintr; KASSERT(nintr != 0, ("%s: vi %p has no exclusive interrupts, total interrupts = %d", __func__, vi, sc->intr_count)); i = idx % nintr; if (vi->flags & INTR_RXQ) { if (i < vi->nrxq) { iq = &s->rxq[vi->first_rxq + i].iq; goto done; } i -= vi->nrxq; } #ifdef TCP_OFFLOAD if (vi->flags & INTR_OFLD_RXQ) { if (i < vi->nofldrxq) { iq = &s->ofld_rxq[vi->first_ofld_rxq + i].iq; goto done; } i -= vi->nofldrxq; } #endif panic("%s: vi %p, intr_flags 0x%lx, idx %d, total intr %d\n", __func__, vi, vi->flags & INTR_ALL, idx, nintr); done: MPASS(iq != NULL); KASSERT(iq->flags & IQ_INTR, ("%s: iq %p (vi %p, intr_flags 0x%lx, idx %d)", __func__, iq, vi, vi->flags & INTR_ALL, idx)); return (iq); } /* Maximum payload that can be delivered with a single iq descriptor */ static inline int mtu_to_max_payload(struct adapter *sc, int mtu, const int toe) { int payload; #ifdef TCP_OFFLOAD if (toe) { payload = sc->tt.rx_coalesce ? G_RXCOALESCESIZE(t4_read_reg(sc, A_TP_PARA_REG2)) : mtu; } else { #endif /* large enough even when hw VLAN extraction is disabled */ payload = sc->params.sge.fl_pktshift + ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN + mtu; #ifdef TCP_OFFLOAD } #endif return (payload); } int t4_setup_vi_queues(struct vi_info *vi) { int rc = 0, i, j, intr_idx, iqid; struct sge_rxq *rxq; struct sge_txq *txq; struct sge_wrq *ctrlq; #ifdef TCP_OFFLOAD struct sge_ofld_rxq *ofld_rxq; struct sge_wrq *ofld_txq; #endif #ifdef DEV_NETMAP int saved_idx; struct sge_nm_rxq *nm_rxq; struct sge_nm_txq *nm_txq; #endif char name[16]; struct port_info *pi = vi->pi; struct adapter *sc = pi->adapter; struct ifnet *ifp = vi->ifp; struct sysctl_oid *oid = device_get_sysctl_tree(vi->dev); struct sysctl_oid_list *children = SYSCTL_CHILDREN(oid); int maxp, mtu = ifp->if_mtu; /* Interrupt vector to start from (when using multiple vectors) */ intr_idx = first_vector(vi); #ifdef DEV_NETMAP saved_idx = intr_idx; if (ifp->if_capabilities & IFCAP_NETMAP) { /* netmap is supported with direct interrupts only. */ MPASS(vi->flags & INTR_RXQ); /* * We don't have buffers to back the netmap rx queues * right now so we create the queues in a way that * doesn't set off any congestion signal in the chip. */ oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "nm_rxq", CTLFLAG_RD, NULL, "rx queues"); for_each_nm_rxq(vi, i, nm_rxq) { rc = alloc_nm_rxq(vi, nm_rxq, intr_idx, i, oid); if (rc != 0) goto done; intr_idx++; } oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "nm_txq", CTLFLAG_RD, NULL, "tx queues"); for_each_nm_txq(vi, i, nm_txq) { iqid = vi->first_nm_rxq + (i % vi->nnmrxq); rc = alloc_nm_txq(vi, nm_txq, iqid, i, oid); if (rc != 0) goto done; } } /* Normal rx queues and netmap rx queues share the same interrupts. */ intr_idx = saved_idx; #endif /* * First pass over all NIC and TOE rx queues: * a) initialize iq and fl * b) allocate queue iff it will take direct interrupts. */ maxp = mtu_to_max_payload(sc, mtu, 0); if (vi->flags & INTR_RXQ) { oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "rxq", CTLFLAG_RD, NULL, "rx queues"); } for_each_rxq(vi, i, rxq) { init_iq(&rxq->iq, sc, vi->tmr_idx, vi->pktc_idx, vi->qsize_rxq); snprintf(name, sizeof(name), "%s rxq%d-fl", device_get_nameunit(vi->dev), i); init_fl(sc, &rxq->fl, vi->qsize_rxq / 8, maxp, name); if (vi->flags & INTR_RXQ) { rxq->iq.flags |= IQ_INTR; rc = alloc_rxq(vi, rxq, intr_idx, i, oid); if (rc != 0) goto done; intr_idx++; } } #ifdef DEV_NETMAP if (ifp->if_capabilities & IFCAP_NETMAP) intr_idx = saved_idx + max(vi->nrxq, vi->nnmrxq); #endif #ifdef TCP_OFFLOAD maxp = mtu_to_max_payload(sc, mtu, 1); if (vi->flags & INTR_OFLD_RXQ) { oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "ofld_rxq", CTLFLAG_RD, NULL, "rx queues for offloaded TCP connections"); } for_each_ofld_rxq(vi, i, ofld_rxq) { init_iq(&ofld_rxq->iq, sc, vi->tmr_idx, vi->pktc_idx, vi->qsize_rxq); snprintf(name, sizeof(name), "%s ofld_rxq%d-fl", device_get_nameunit(vi->dev), i); init_fl(sc, &ofld_rxq->fl, vi->qsize_rxq / 8, maxp, name); if (vi->flags & INTR_OFLD_RXQ) { ofld_rxq->iq.flags |= IQ_INTR; rc = alloc_ofld_rxq(vi, ofld_rxq, intr_idx, i, oid); if (rc != 0) goto done; intr_idx++; } } #endif /* * Second pass over all NIC and TOE rx queues. The queues forwarding * their interrupts are allocated now. */ j = 0; if (!(vi->flags & INTR_RXQ)) { oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "rxq", CTLFLAG_RD, NULL, "rx queues"); for_each_rxq(vi, i, rxq) { MPASS(!(rxq->iq.flags & IQ_INTR)); intr_idx = vi_intr_iq(vi, j)->abs_id; rc = alloc_rxq(vi, rxq, intr_idx, i, oid); if (rc != 0) goto done; j++; } } #ifdef TCP_OFFLOAD if (vi->nofldrxq != 0 && !(vi->flags & INTR_OFLD_RXQ)) { oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "ofld_rxq", CTLFLAG_RD, NULL, "rx queues for offloaded TCP connections"); for_each_ofld_rxq(vi, i, ofld_rxq) { MPASS(!(ofld_rxq->iq.flags & IQ_INTR)); intr_idx = vi_intr_iq(vi, j)->abs_id; rc = alloc_ofld_rxq(vi, ofld_rxq, intr_idx, i, oid); if (rc != 0) goto done; j++; } } #endif /* * Now the tx queues. Only one pass needed. */ oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "txq", CTLFLAG_RD, NULL, "tx queues"); j = 0; for_each_txq(vi, i, txq) { iqid = vi_intr_iq(vi, j)->cntxt_id; snprintf(name, sizeof(name), "%s txq%d", device_get_nameunit(vi->dev), i); init_eq(sc, &txq->eq, EQ_ETH, vi->qsize_txq, pi->tx_chan, iqid, name); rc = alloc_txq(vi, txq, i, oid); if (rc != 0) goto done; j++; } #ifdef TCP_OFFLOAD oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "ofld_txq", CTLFLAG_RD, NULL, "tx queues for offloaded TCP connections"); for_each_ofld_txq(vi, i, ofld_txq) { struct sysctl_oid *oid2; iqid = vi_intr_iq(vi, j)->cntxt_id; snprintf(name, sizeof(name), "%s ofld_txq%d", device_get_nameunit(vi->dev), i); init_eq(sc, &ofld_txq->eq, EQ_OFLD, vi->qsize_txq, pi->tx_chan, iqid, name); snprintf(name, sizeof(name), "%d", i); oid2 = SYSCTL_ADD_NODE(&vi->ctx, SYSCTL_CHILDREN(oid), OID_AUTO, name, CTLFLAG_RD, NULL, "offload tx queue"); rc = alloc_wrq(sc, vi, ofld_txq, oid2); if (rc != 0) goto done; j++; } #endif /* * Finally, the control queue. */ if (!IS_MAIN_VI(vi)) goto done; oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "ctrlq", CTLFLAG_RD, NULL, "ctrl queue"); ctrlq = &sc->sge.ctrlq[pi->port_id]; iqid = vi_intr_iq(vi, 0)->cntxt_id; snprintf(name, sizeof(name), "%s ctrlq", device_get_nameunit(vi->dev)); init_eq(sc, &ctrlq->eq, EQ_CTRL, CTRL_EQ_QSIZE, pi->tx_chan, iqid, name); rc = alloc_wrq(sc, vi, ctrlq, oid); done: if (rc) t4_teardown_vi_queues(vi); return (rc); } /* * Idempotent */ int t4_teardown_vi_queues(struct vi_info *vi) { int i; struct port_info *pi = vi->pi; struct adapter *sc = pi->adapter; struct sge_rxq *rxq; struct sge_txq *txq; #ifdef TCP_OFFLOAD struct sge_ofld_rxq *ofld_rxq; struct sge_wrq *ofld_txq; #endif #ifdef DEV_NETMAP struct sge_nm_rxq *nm_rxq; struct sge_nm_txq *nm_txq; #endif /* Do this before freeing the queues */ if (vi->flags & VI_SYSCTL_CTX) { sysctl_ctx_free(&vi->ctx); vi->flags &= ~VI_SYSCTL_CTX; } #ifdef DEV_NETMAP if (vi->ifp->if_capabilities & IFCAP_NETMAP) { for_each_nm_txq(vi, i, nm_txq) { free_nm_txq(vi, nm_txq); } for_each_nm_rxq(vi, i, nm_rxq) { free_nm_rxq(vi, nm_rxq); } } #endif /* * Take down all the tx queues first, as they reference the rx queues * (for egress updates, etc.). */ if (IS_MAIN_VI(vi)) free_wrq(sc, &sc->sge.ctrlq[pi->port_id]); for_each_txq(vi, i, txq) { free_txq(vi, txq); } #ifdef TCP_OFFLOAD for_each_ofld_txq(vi, i, ofld_txq) { free_wrq(sc, ofld_txq); } #endif /* * Then take down the rx queues that forward their interrupts, as they * reference other rx queues. */ for_each_rxq(vi, i, rxq) { if ((rxq->iq.flags & IQ_INTR) == 0) free_rxq(vi, rxq); } #ifdef TCP_OFFLOAD for_each_ofld_rxq(vi, i, ofld_rxq) { if ((ofld_rxq->iq.flags & IQ_INTR) == 0) free_ofld_rxq(vi, ofld_rxq); } #endif /* * Then take down the rx queues that take direct interrupts. */ for_each_rxq(vi, i, rxq) { if (rxq->iq.flags & IQ_INTR) free_rxq(vi, rxq); } #ifdef TCP_OFFLOAD for_each_ofld_rxq(vi, i, ofld_rxq) { if (ofld_rxq->iq.flags & IQ_INTR) free_ofld_rxq(vi, ofld_rxq); } #endif return (0); } /* * Deals with errors and the firmware event queue. All data rx queues forward * their interrupt to the firmware event queue. */ void t4_intr_all(void *arg) { struct adapter *sc = arg; struct sge_iq *fwq = &sc->sge.fwq; t4_intr_err(arg); if (atomic_cmpset_int(&fwq->state, IQS_IDLE, IQS_BUSY)) { service_iq(fwq, 0); atomic_cmpset_int(&fwq->state, IQS_BUSY, IQS_IDLE); } } /* Deals with error interrupts */ void t4_intr_err(void *arg) { struct adapter *sc = arg; t4_write_reg(sc, MYPF_REG(A_PCIE_PF_CLI), 0); t4_slow_intr_handler(sc); } void t4_intr_evt(void *arg) { struct sge_iq *iq = arg; if (atomic_cmpset_int(&iq->state, IQS_IDLE, IQS_BUSY)) { service_iq(iq, 0); atomic_cmpset_int(&iq->state, IQS_BUSY, IQS_IDLE); } } void t4_intr(void *arg) { struct sge_iq *iq = arg; if (atomic_cmpset_int(&iq->state, IQS_IDLE, IQS_BUSY)) { service_iq(iq, 0); atomic_cmpset_int(&iq->state, IQS_BUSY, IQS_IDLE); } } void t4_vi_intr(void *arg) { struct irq *irq = arg; #ifdef DEV_NETMAP if (atomic_cmpset_int(&irq->nm_state, NM_ON, NM_BUSY)) { t4_nm_intr(irq->nm_rxq); atomic_cmpset_int(&irq->nm_state, NM_BUSY, NM_ON); } #endif if (irq->rxq != NULL) t4_intr(irq->rxq); } /* * Deals with anything and everything on the given ingress queue. */ static int service_iq(struct sge_iq *iq, int budget) { struct sge_iq *q; struct sge_rxq *rxq = iq_to_rxq(iq); /* Use iff iq is part of rxq */ struct sge_fl *fl; /* Use iff IQ_HAS_FL */ struct adapter *sc = iq->adapter; struct iq_desc *d = &iq->desc[iq->cidx]; int ndescs = 0, limit; int rsp_type, refill; uint32_t lq; uint16_t fl_hw_cidx; struct mbuf *m0; STAILQ_HEAD(, sge_iq) iql = STAILQ_HEAD_INITIALIZER(iql); #if defined(INET) || defined(INET6) const struct timeval lro_timeout = {0, sc->lro_timeout}; #endif KASSERT(iq->state == IQS_BUSY, ("%s: iq %p not BUSY", __func__, iq)); limit = budget ? budget : iq->qsize / 16; if (iq->flags & IQ_HAS_FL) { fl = &rxq->fl; fl_hw_cidx = fl->hw_cidx; /* stable snapshot */ } else { fl = NULL; fl_hw_cidx = 0; /* to silence gcc warning */ } /* * We always come back and check the descriptor ring for new indirect * interrupts and other responses after running a single handler. */ for (;;) { while ((d->rsp.u.type_gen & F_RSPD_GEN) == iq->gen) { rmb(); refill = 0; m0 = NULL; rsp_type = G_RSPD_TYPE(d->rsp.u.type_gen); lq = be32toh(d->rsp.pldbuflen_qid); switch (rsp_type) { case X_RSPD_TYPE_FLBUF: KASSERT(iq->flags & IQ_HAS_FL, ("%s: data for an iq (%p) with no freelist", __func__, iq)); m0 = get_fl_payload(sc, fl, lq); if (__predict_false(m0 == NULL)) goto process_iql; refill = IDXDIFF(fl->hw_cidx, fl_hw_cidx, fl->sidx) > 2; #ifdef T4_PKT_TIMESTAMP /* * 60 bit timestamp for the payload is * *(uint64_t *)m0->m_pktdat. Note that it is * in the leading free-space in the mbuf. The * kernel can clobber it during a pullup, * m_copymdata, etc. You need to make sure that * the mbuf reaches you unmolested if you care * about the timestamp. */ *(uint64_t *)m0->m_pktdat = be64toh(ctrl->u.last_flit) & 0xfffffffffffffff; #endif /* fall through */ case X_RSPD_TYPE_CPL: KASSERT(d->rss.opcode < NUM_CPL_CMDS, ("%s: bad opcode %02x.", __func__, d->rss.opcode)); t4_cpl_handler[d->rss.opcode](iq, &d->rss, m0); break; case X_RSPD_TYPE_INTR: /* * Interrupts should be forwarded only to queues * that are not forwarding their interrupts. * This means service_iq can recurse but only 1 * level deep. */ KASSERT(budget == 0, ("%s: budget %u, rsp_type %u", __func__, budget, rsp_type)); /* * There are 1K interrupt-capable queues (qids 0 * through 1023). A response type indicating a * forwarded interrupt with a qid >= 1K is an * iWARP async notification. */ if (lq >= 1024) { t4_an_handler(iq, &d->rsp); break; } q = sc->sge.iqmap[lq - sc->sge.iq_start]; if (atomic_cmpset_int(&q->state, IQS_IDLE, IQS_BUSY)) { if (service_iq(q, q->qsize / 16) == 0) { atomic_cmpset_int(&q->state, IQS_BUSY, IQS_IDLE); } else { STAILQ_INSERT_TAIL(&iql, q, link); } } break; default: KASSERT(0, ("%s: illegal response type %d on iq %p", __func__, rsp_type, iq)); log(LOG_ERR, "%s: illegal response type %d on iq %p", device_get_nameunit(sc->dev), rsp_type, iq); break; } d++; if (__predict_false(++iq->cidx == iq->sidx)) { iq->cidx = 0; iq->gen ^= F_RSPD_GEN; d = &iq->desc[0]; } if (__predict_false(++ndescs == limit)) { t4_write_reg(sc, MYPF_REG(A_SGE_PF_GTS), V_CIDXINC(ndescs) | V_INGRESSQID(iq->cntxt_id) | V_SEINTARM(V_QINTR_TIMER_IDX(X_TIMERREG_UPDATE_CIDX))); ndescs = 0; #if defined(INET) || defined(INET6) if (iq->flags & IQ_LRO_ENABLED && sc->lro_timeout != 0) { tcp_lro_flush_inactive(&rxq->lro, &lro_timeout); } #endif if (budget) { if (iq->flags & IQ_HAS_FL) { FL_LOCK(fl); refill_fl(sc, fl, 32); FL_UNLOCK(fl); } return (EINPROGRESS); } } if (refill) { FL_LOCK(fl); refill_fl(sc, fl, 32); FL_UNLOCK(fl); fl_hw_cidx = fl->hw_cidx; } } process_iql: if (STAILQ_EMPTY(&iql)) break; /* * Process the head only, and send it to the back of the list if * it's still not done. */ q = STAILQ_FIRST(&iql); STAILQ_REMOVE_HEAD(&iql, link); if (service_iq(q, q->qsize / 8) == 0) atomic_cmpset_int(&q->state, IQS_BUSY, IQS_IDLE); else STAILQ_INSERT_TAIL(&iql, q, link); } #if defined(INET) || defined(INET6) if (iq->flags & IQ_LRO_ENABLED) { struct lro_ctrl *lro = &rxq->lro; tcp_lro_flush_all(lro); } #endif t4_write_reg(sc, MYPF_REG(A_SGE_PF_GTS), V_CIDXINC(ndescs) | V_INGRESSQID((u32)iq->cntxt_id) | V_SEINTARM(iq->intr_params)); if (iq->flags & IQ_HAS_FL) { int starved; FL_LOCK(fl); starved = refill_fl(sc, fl, 64); FL_UNLOCK(fl); if (__predict_false(starved != 0)) add_fl_to_sfl(sc, fl); } return (0); } static inline int cl_has_metadata(struct sge_fl *fl, struct cluster_layout *cll) { int rc = fl->flags & FL_BUF_PACKING || cll->region1 > 0; if (rc) MPASS(cll->region3 >= CL_METADATA_SIZE); return (rc); } static inline struct cluster_metadata * cl_metadata(struct adapter *sc, struct sge_fl *fl, struct cluster_layout *cll, caddr_t cl) { if (cl_has_metadata(fl, cll)) { struct sw_zone_info *swz = &sc->sge.sw_zone_info[cll->zidx]; return ((struct cluster_metadata *)(cl + swz->size) - 1); } return (NULL); } static void rxb_free(struct mbuf *m, void *arg1, void *arg2) { uma_zone_t zone = arg1; caddr_t cl = arg2; uma_zfree(zone, cl); counter_u64_add(extfree_rels, 1); } /* * The mbuf returned by this function could be allocated from zone_mbuf or * constructed in spare room in the cluster. * * The mbuf carries the payload in one of these ways * a) frame inside the mbuf (mbuf from zone_mbuf) * b) m_cljset (for clusters without metadata) zone_mbuf * c) m_extaddref (cluster with metadata) inline mbuf * d) m_extaddref (cluster with metadata) zone_mbuf */ static struct mbuf * get_scatter_segment(struct adapter *sc, struct sge_fl *fl, int fr_offset, int remaining) { struct mbuf *m; struct fl_sdesc *sd = &fl->sdesc[fl->cidx]; struct cluster_layout *cll = &sd->cll; struct sw_zone_info *swz = &sc->sge.sw_zone_info[cll->zidx]; struct hw_buf_info *hwb = &sc->sge.hw_buf_info[cll->hwidx]; struct cluster_metadata *clm = cl_metadata(sc, fl, cll, sd->cl); int len, blen; caddr_t payload; blen = hwb->size - fl->rx_offset; /* max possible in this buf */ len = min(remaining, blen); payload = sd->cl + cll->region1 + fl->rx_offset; if (fl->flags & FL_BUF_PACKING) { const u_int l = fr_offset + len; const u_int pad = roundup2(l, fl->buf_boundary) - l; if (fl->rx_offset + len + pad < hwb->size) blen = len + pad; MPASS(fl->rx_offset + blen <= hwb->size); } else { MPASS(fl->rx_offset == 0); /* not packing */ } if (sc->sc_do_rxcopy && len < RX_COPY_THRESHOLD) { /* * Copy payload into a freshly allocated mbuf. */ m = fr_offset == 0 ? m_gethdr(M_NOWAIT, MT_DATA) : m_get(M_NOWAIT, MT_DATA); if (m == NULL) return (NULL); fl->mbuf_allocated++; #ifdef T4_PKT_TIMESTAMP /* Leave room for a timestamp */ m->m_data += 8; #endif /* copy data to mbuf */ bcopy(payload, mtod(m, caddr_t), len); } else if (sd->nmbuf * MSIZE < cll->region1) { /* * There's spare room in the cluster for an mbuf. Create one * and associate it with the payload that's in the cluster. */ MPASS(clm != NULL); m = (struct mbuf *)(sd->cl + sd->nmbuf * MSIZE); /* No bzero required */ if (m_init(m, M_NOWAIT, MT_DATA, fr_offset == 0 ? M_PKTHDR | M_NOFREE : M_NOFREE)) return (NULL); fl->mbuf_inlined++; m_extaddref(m, payload, blen, &clm->refcount, rxb_free, swz->zone, sd->cl); if (sd->nmbuf++ == 0) counter_u64_add(extfree_refs, 1); } else { /* * Grab an mbuf from zone_mbuf and associate it with the * payload in the cluster. */ m = fr_offset == 0 ? m_gethdr(M_NOWAIT, MT_DATA) : m_get(M_NOWAIT, MT_DATA); if (m == NULL) return (NULL); fl->mbuf_allocated++; if (clm != NULL) { m_extaddref(m, payload, blen, &clm->refcount, rxb_free, swz->zone, sd->cl); if (sd->nmbuf++ == 0) counter_u64_add(extfree_refs, 1); } else { m_cljset(m, sd->cl, swz->type); sd->cl = NULL; /* consumed, not a recycle candidate */ } } if (fr_offset == 0) m->m_pkthdr.len = remaining; m->m_len = len; if (fl->flags & FL_BUF_PACKING) { fl->rx_offset += blen; MPASS(fl->rx_offset <= hwb->size); if (fl->rx_offset < hwb->size) return (m); /* without advancing the cidx */ } if (__predict_false(++fl->cidx % 8 == 0)) { uint16_t cidx = fl->cidx / 8; if (__predict_false(cidx == fl->sidx)) fl->cidx = cidx = 0; fl->hw_cidx = cidx; } fl->rx_offset = 0; return (m); } static struct mbuf * get_fl_payload(struct adapter *sc, struct sge_fl *fl, uint32_t len_newbuf) { struct mbuf *m0, *m, **pnext; u_int remaining; const u_int total = G_RSPD_LEN(len_newbuf); if (__predict_false(fl->flags & FL_BUF_RESUME)) { M_ASSERTPKTHDR(fl->m0); MPASS(fl->m0->m_pkthdr.len == total); MPASS(fl->remaining < total); m0 = fl->m0; pnext = fl->pnext; remaining = fl->remaining; fl->flags &= ~FL_BUF_RESUME; goto get_segment; } if (fl->rx_offset > 0 && len_newbuf & F_RSPD_NEWBUF) { fl->rx_offset = 0; if (__predict_false(++fl->cidx % 8 == 0)) { uint16_t cidx = fl->cidx / 8; if (__predict_false(cidx == fl->sidx)) fl->cidx = cidx = 0; fl->hw_cidx = cidx; } } /* * Payload starts at rx_offset in the current hw buffer. Its length is * 'len' and it may span multiple hw buffers. */ m0 = get_scatter_segment(sc, fl, 0, total); if (m0 == NULL) return (NULL); remaining = total - m0->m_len; pnext = &m0->m_next; while (remaining > 0) { get_segment: MPASS(fl->rx_offset == 0); m = get_scatter_segment(sc, fl, total - remaining, remaining); if (__predict_false(m == NULL)) { fl->m0 = m0; fl->pnext = pnext; fl->remaining = remaining; fl->flags |= FL_BUF_RESUME; return (NULL); } *pnext = m; pnext = &m->m_next; remaining -= m->m_len; } *pnext = NULL; M_ASSERTPKTHDR(m0); return (m0); } static int t4_eth_rx(struct sge_iq *iq, const struct rss_header *rss, struct mbuf *m0) { struct sge_rxq *rxq = iq_to_rxq(iq); struct ifnet *ifp = rxq->ifp; struct adapter *sc = iq->adapter; const struct cpl_rx_pkt *cpl = (const void *)(rss + 1); #if defined(INET) || defined(INET6) struct lro_ctrl *lro = &rxq->lro; #endif static const int sw_hashtype[4][2] = { {M_HASHTYPE_NONE, M_HASHTYPE_NONE}, {M_HASHTYPE_RSS_IPV4, M_HASHTYPE_RSS_IPV6}, {M_HASHTYPE_RSS_TCP_IPV4, M_HASHTYPE_RSS_TCP_IPV6}, {M_HASHTYPE_RSS_UDP_IPV4, M_HASHTYPE_RSS_UDP_IPV6}, }; KASSERT(m0 != NULL, ("%s: no payload with opcode %02x", __func__, rss->opcode)); m0->m_pkthdr.len -= sc->params.sge.fl_pktshift; m0->m_len -= sc->params.sge.fl_pktshift; m0->m_data += sc->params.sge.fl_pktshift; m0->m_pkthdr.rcvif = ifp; M_HASHTYPE_SET(m0, sw_hashtype[rss->hash_type][rss->ipv6]); m0->m_pkthdr.flowid = be32toh(rss->hash_val); if (cpl->csum_calc && !cpl->err_vec) { if (ifp->if_capenable & IFCAP_RXCSUM && cpl->l2info & htobe32(F_RXF_IP)) { m0->m_pkthdr.csum_flags = (CSUM_IP_CHECKED | CSUM_IP_VALID | CSUM_DATA_VALID | CSUM_PSEUDO_HDR); rxq->rxcsum++; } else if (ifp->if_capenable & IFCAP_RXCSUM_IPV6 && cpl->l2info & htobe32(F_RXF_IP6)) { m0->m_pkthdr.csum_flags = (CSUM_DATA_VALID_IPV6 | CSUM_PSEUDO_HDR); rxq->rxcsum++; } if (__predict_false(cpl->ip_frag)) m0->m_pkthdr.csum_data = be16toh(cpl->csum); else m0->m_pkthdr.csum_data = 0xffff; } if (cpl->vlan_ex) { m0->m_pkthdr.ether_vtag = be16toh(cpl->vlan); m0->m_flags |= M_VLANTAG; rxq->vlan_extraction++; } #if defined(INET) || defined(INET6) if (cpl->l2info & htobe32(F_RXF_LRO) && iq->flags & IQ_LRO_ENABLED && tcp_lro_rx(lro, m0, 0) == 0) { /* queued for LRO */ } else #endif ifp->if_input(ifp, m0); return (0); } /* * Must drain the wrq or make sure that someone else will. */ static void wrq_tx_drain(void *arg, int n) { struct sge_wrq *wrq = arg; struct sge_eq *eq = &wrq->eq; EQ_LOCK(eq); if (TAILQ_EMPTY(&wrq->incomplete_wrs) && !STAILQ_EMPTY(&wrq->wr_list)) drain_wrq_wr_list(wrq->adapter, wrq); EQ_UNLOCK(eq); } static void drain_wrq_wr_list(struct adapter *sc, struct sge_wrq *wrq) { struct sge_eq *eq = &wrq->eq; u_int available, dbdiff; /* # of hardware descriptors */ u_int n; struct wrqe *wr; struct fw_eth_tx_pkt_wr *dst; /* any fw WR struct will do */ EQ_LOCK_ASSERT_OWNED(eq); MPASS(TAILQ_EMPTY(&wrq->incomplete_wrs)); wr = STAILQ_FIRST(&wrq->wr_list); MPASS(wr != NULL); /* Must be called with something useful to do */ MPASS(eq->pidx == eq->dbidx); dbdiff = 0; do { eq->cidx = read_hw_cidx(eq); if (eq->pidx == eq->cidx) available = eq->sidx - 1; else available = IDXDIFF(eq->cidx, eq->pidx, eq->sidx) - 1; MPASS(wr->wrq == wrq); n = howmany(wr->wr_len, EQ_ESIZE); if (available < n) break; dst = (void *)&eq->desc[eq->pidx]; if (__predict_true(eq->sidx - eq->pidx > n)) { /* Won't wrap, won't end exactly at the status page. */ bcopy(&wr->wr[0], dst, wr->wr_len); eq->pidx += n; } else { int first_portion = (eq->sidx - eq->pidx) * EQ_ESIZE; bcopy(&wr->wr[0], dst, first_portion); if (wr->wr_len > first_portion) { bcopy(&wr->wr[first_portion], &eq->desc[0], wr->wr_len - first_portion); } eq->pidx = n - (eq->sidx - eq->pidx); } if (available < eq->sidx / 4 && atomic_cmpset_int(&eq->equiq, 0, 1)) { dst->equiq_to_len16 |= htobe32(F_FW_WR_EQUIQ | F_FW_WR_EQUEQ); eq->equeqidx = eq->pidx; } else if (IDXDIFF(eq->pidx, eq->equeqidx, eq->sidx) >= 32) { dst->equiq_to_len16 |= htobe32(F_FW_WR_EQUEQ); eq->equeqidx = eq->pidx; } dbdiff += n; if (dbdiff >= 16) { ring_eq_db(sc, eq, dbdiff); dbdiff = 0; } STAILQ_REMOVE_HEAD(&wrq->wr_list, link); free_wrqe(wr); MPASS(wrq->nwr_pending > 0); wrq->nwr_pending--; MPASS(wrq->ndesc_needed >= n); wrq->ndesc_needed -= n; } while ((wr = STAILQ_FIRST(&wrq->wr_list)) != NULL); if (dbdiff) ring_eq_db(sc, eq, dbdiff); } /* * Doesn't fail. Holds on to work requests it can't send right away. */ void t4_wrq_tx_locked(struct adapter *sc, struct sge_wrq *wrq, struct wrqe *wr) { #ifdef INVARIANTS struct sge_eq *eq = &wrq->eq; #endif EQ_LOCK_ASSERT_OWNED(eq); MPASS(wr != NULL); MPASS(wr->wr_len > 0 && wr->wr_len <= SGE_MAX_WR_LEN); MPASS((wr->wr_len & 0x7) == 0); STAILQ_INSERT_TAIL(&wrq->wr_list, wr, link); wrq->nwr_pending++; wrq->ndesc_needed += howmany(wr->wr_len, EQ_ESIZE); if (!TAILQ_EMPTY(&wrq->incomplete_wrs)) return; /* commit_wrq_wr will drain wr_list as well. */ drain_wrq_wr_list(sc, wrq); /* Doorbell must have caught up to the pidx. */ MPASS(eq->pidx == eq->dbidx); } void t4_update_fl_bufsize(struct ifnet *ifp) { struct vi_info *vi = ifp->if_softc; struct adapter *sc = vi->pi->adapter; struct sge_rxq *rxq; #ifdef TCP_OFFLOAD struct sge_ofld_rxq *ofld_rxq; #endif struct sge_fl *fl; int i, maxp, mtu = ifp->if_mtu; maxp = mtu_to_max_payload(sc, mtu, 0); for_each_rxq(vi, i, rxq) { fl = &rxq->fl; FL_LOCK(fl); find_best_refill_source(sc, fl, maxp); FL_UNLOCK(fl); } #ifdef TCP_OFFLOAD maxp = mtu_to_max_payload(sc, mtu, 1); for_each_ofld_rxq(vi, i, ofld_rxq) { fl = &ofld_rxq->fl; FL_LOCK(fl); find_best_refill_source(sc, fl, maxp); FL_UNLOCK(fl); } #endif } static inline int mbuf_nsegs(struct mbuf *m) { M_ASSERTPKTHDR(m); KASSERT(m->m_pkthdr.l5hlen > 0, ("%s: mbuf %p missing information on # of segments.", __func__, m)); return (m->m_pkthdr.l5hlen); } static inline void set_mbuf_nsegs(struct mbuf *m, uint8_t nsegs) { M_ASSERTPKTHDR(m); m->m_pkthdr.l5hlen = nsegs; } static inline int mbuf_len16(struct mbuf *m) { int n; M_ASSERTPKTHDR(m); n = m->m_pkthdr.PH_loc.eight[0]; MPASS(n > 0 && n <= SGE_MAX_WR_LEN / 16); return (n); } static inline void set_mbuf_len16(struct mbuf *m, uint8_t len16) { M_ASSERTPKTHDR(m); m->m_pkthdr.PH_loc.eight[0] = len16; } static inline int needs_tso(struct mbuf *m) { M_ASSERTPKTHDR(m); if (m->m_pkthdr.csum_flags & CSUM_TSO) { KASSERT(m->m_pkthdr.tso_segsz > 0, ("%s: TSO requested in mbuf %p but MSS not provided", __func__, m)); return (1); } return (0); } static inline int needs_l3_csum(struct mbuf *m) { M_ASSERTPKTHDR(m); if (m->m_pkthdr.csum_flags & (CSUM_IP | CSUM_TSO)) return (1); return (0); } static inline int needs_l4_csum(struct mbuf *m) { M_ASSERTPKTHDR(m); if (m->m_pkthdr.csum_flags & (CSUM_TCP | CSUM_UDP | CSUM_UDP_IPV6 | CSUM_TCP_IPV6 | CSUM_TSO)) return (1); return (0); } static inline int needs_vlan_insertion(struct mbuf *m) { M_ASSERTPKTHDR(m); if (m->m_flags & M_VLANTAG) { KASSERT(m->m_pkthdr.ether_vtag != 0, ("%s: HWVLAN requested in mbuf %p but tag not provided", __func__, m)); return (1); } return (0); } static void * m_advance(struct mbuf **pm, int *poffset, int len) { struct mbuf *m = *pm; int offset = *poffset; uintptr_t p = 0; MPASS(len > 0); while (len) { if (offset + len < m->m_len) { offset += len; p = mtod(m, uintptr_t) + offset; break; } len -= m->m_len - offset; m = m->m_next; offset = 0; MPASS(m != NULL); } *poffset = offset; *pm = m; return ((void *)p); } static inline int same_paddr(char *a, char *b) { if (a == b) return (1); else if (a != NULL && b != NULL) { vm_offset_t x = (vm_offset_t)a; vm_offset_t y = (vm_offset_t)b; if ((x & PAGE_MASK) == (y & PAGE_MASK) && pmap_kextract(x) == pmap_kextract(y)) return (1); } return (0); } /* * Can deal with empty mbufs in the chain that have m_len = 0, but the chain * must have at least one mbuf that's not empty. */ static inline int count_mbuf_nsegs(struct mbuf *m) { char *prev_end, *start; int len, nsegs; MPASS(m != NULL); nsegs = 0; prev_end = NULL; for (; m; m = m->m_next) { len = m->m_len; if (__predict_false(len == 0)) continue; start = mtod(m, char *); nsegs += sglist_count(start, len); if (same_paddr(prev_end, start)) nsegs--; prev_end = start + len; } MPASS(nsegs > 0); return (nsegs); } /* * Analyze the mbuf to determine its tx needs. The mbuf passed in may change: * a) caller can assume it's been freed if this function returns with an error. * b) it may get defragged up if the gather list is too long for the hardware. */ int parse_pkt(struct mbuf **mp) { struct mbuf *m0 = *mp, *m; int rc, nsegs, defragged = 0, offset; struct ether_header *eh; void *l3hdr; #if defined(INET) || defined(INET6) struct tcphdr *tcp; #endif uint16_t eh_type; M_ASSERTPKTHDR(m0); if (__predict_false(m0->m_pkthdr.len < ETHER_HDR_LEN)) { rc = EINVAL; fail: m_freem(m0); *mp = NULL; return (rc); } restart: /* * First count the number of gather list segments in the payload. * Defrag the mbuf if nsegs exceeds the hardware limit. */ M_ASSERTPKTHDR(m0); MPASS(m0->m_pkthdr.len > 0); nsegs = count_mbuf_nsegs(m0); if (nsegs > (needs_tso(m0) ? TX_SGL_SEGS_TSO : TX_SGL_SEGS)) { if (defragged++ > 0 || (m = m_defrag(m0, M_NOWAIT)) == NULL) { rc = EFBIG; goto fail; } *mp = m0 = m; /* update caller's copy after defrag */ goto restart; } if (__predict_false(nsegs > 2 && m0->m_pkthdr.len <= MHLEN)) { m0 = m_pullup(m0, m0->m_pkthdr.len); if (m0 == NULL) { /* Should have left well enough alone. */ rc = EFBIG; goto fail; } *mp = m0; /* update caller's copy after pullup */ goto restart; } set_mbuf_nsegs(m0, nsegs); set_mbuf_len16(m0, txpkt_len16(nsegs, needs_tso(m0))); if (!needs_tso(m0)) return (0); m = m0; eh = mtod(m, struct ether_header *); eh_type = ntohs(eh->ether_type); if (eh_type == ETHERTYPE_VLAN) { struct ether_vlan_header *evh = (void *)eh; eh_type = ntohs(evh->evl_proto); m0->m_pkthdr.l2hlen = sizeof(*evh); } else m0->m_pkthdr.l2hlen = sizeof(*eh); offset = 0; l3hdr = m_advance(&m, &offset, m0->m_pkthdr.l2hlen); switch (eh_type) { #ifdef INET6 case ETHERTYPE_IPV6: { struct ip6_hdr *ip6 = l3hdr; MPASS(ip6->ip6_nxt == IPPROTO_TCP); m0->m_pkthdr.l3hlen = sizeof(*ip6); break; } #endif #ifdef INET case ETHERTYPE_IP: { struct ip *ip = l3hdr; m0->m_pkthdr.l3hlen = ip->ip_hl * 4; break; } #endif default: panic("%s: ethertype 0x%04x unknown. if_cxgbe must be compiled" " with the same INET/INET6 options as the kernel.", __func__, eh_type); } #if defined(INET) || defined(INET6) tcp = m_advance(&m, &offset, m0->m_pkthdr.l3hlen); m0->m_pkthdr.l4hlen = tcp->th_off * 4; #endif MPASS(m0 == *mp); return (0); } void * start_wrq_wr(struct sge_wrq *wrq, int len16, struct wrq_cookie *cookie) { struct sge_eq *eq = &wrq->eq; struct adapter *sc = wrq->adapter; int ndesc, available; struct wrqe *wr; void *w; MPASS(len16 > 0); ndesc = howmany(len16, EQ_ESIZE / 16); MPASS(ndesc > 0 && ndesc <= SGE_MAX_WR_NDESC); EQ_LOCK(eq); if (!STAILQ_EMPTY(&wrq->wr_list)) drain_wrq_wr_list(sc, wrq); if (!STAILQ_EMPTY(&wrq->wr_list)) { slowpath: EQ_UNLOCK(eq); wr = alloc_wrqe(len16 * 16, wrq); if (__predict_false(wr == NULL)) return (NULL); cookie->pidx = -1; cookie->ndesc = ndesc; return (&wr->wr); } eq->cidx = read_hw_cidx(eq); if (eq->pidx == eq->cidx) available = eq->sidx - 1; else available = IDXDIFF(eq->cidx, eq->pidx, eq->sidx) - 1; if (available < ndesc) goto slowpath; cookie->pidx = eq->pidx; cookie->ndesc = ndesc; TAILQ_INSERT_TAIL(&wrq->incomplete_wrs, cookie, link); w = &eq->desc[eq->pidx]; IDXINCR(eq->pidx, ndesc, eq->sidx); if (__predict_false(eq->pidx < ndesc - 1)) { w = &wrq->ss[0]; wrq->ss_pidx = cookie->pidx; wrq->ss_len = len16 * 16; } EQ_UNLOCK(eq); return (w); } void commit_wrq_wr(struct sge_wrq *wrq, void *w, struct wrq_cookie *cookie) { struct sge_eq *eq = &wrq->eq; struct adapter *sc = wrq->adapter; int ndesc, pidx; struct wrq_cookie *prev, *next; if (cookie->pidx == -1) { struct wrqe *wr = __containerof(w, struct wrqe, wr); t4_wrq_tx(sc, wr); return; } ndesc = cookie->ndesc; /* Can be more than SGE_MAX_WR_NDESC here. */ pidx = cookie->pidx; MPASS(pidx >= 0 && pidx < eq->sidx); if (__predict_false(w == &wrq->ss[0])) { int n = (eq->sidx - wrq->ss_pidx) * EQ_ESIZE; MPASS(wrq->ss_len > n); /* WR had better wrap around. */ bcopy(&wrq->ss[0], &eq->desc[wrq->ss_pidx], n); bcopy(&wrq->ss[n], &eq->desc[0], wrq->ss_len - n); wrq->tx_wrs_ss++; } else wrq->tx_wrs_direct++; EQ_LOCK(eq); prev = TAILQ_PREV(cookie, wrq_incomplete_wrs, link); next = TAILQ_NEXT(cookie, link); if (prev == NULL) { MPASS(pidx == eq->dbidx); if (next == NULL || ndesc >= 16) ring_eq_db(wrq->adapter, eq, ndesc); else { MPASS(IDXDIFF(next->pidx, pidx, eq->sidx) == ndesc); next->pidx = pidx; next->ndesc += ndesc; } } else { MPASS(IDXDIFF(pidx, prev->pidx, eq->sidx) == prev->ndesc); prev->ndesc += ndesc; } TAILQ_REMOVE(&wrq->incomplete_wrs, cookie, link); if (TAILQ_EMPTY(&wrq->incomplete_wrs) && !STAILQ_EMPTY(&wrq->wr_list)) drain_wrq_wr_list(sc, wrq); #ifdef INVARIANTS if (TAILQ_EMPTY(&wrq->incomplete_wrs)) { /* Doorbell must have caught up to the pidx. */ MPASS(wrq->eq.pidx == wrq->eq.dbidx); } #endif EQ_UNLOCK(eq); } static u_int can_resume_eth_tx(struct mp_ring *r) { struct sge_eq *eq = r->cookie; return (total_available_tx_desc(eq) > eq->sidx / 8); } static inline int cannot_use_txpkts(struct mbuf *m) { /* maybe put a GL limit too, to avoid silliness? */ return (needs_tso(m)); } /* * r->items[cidx] to r->items[pidx], with a wraparound at r->size, are ready to * be consumed. Return the actual number consumed. 0 indicates a stall. */ static u_int eth_tx(struct mp_ring *r, u_int cidx, u_int pidx) { struct sge_txq *txq = r->cookie; struct sge_eq *eq = &txq->eq; struct ifnet *ifp = txq->ifp; struct vi_info *vi = ifp->if_softc; struct port_info *pi = vi->pi; struct adapter *sc = pi->adapter; u_int total, remaining; /* # of packets */ u_int available, dbdiff; /* # of hardware descriptors */ u_int n, next_cidx; struct mbuf *m0, *tail; struct txpkts txp; struct fw_eth_tx_pkts_wr *wr; /* any fw WR struct will do */ remaining = IDXDIFF(pidx, cidx, r->size); MPASS(remaining > 0); /* Must not be called without work to do. */ total = 0; TXQ_LOCK(txq); if (__predict_false((eq->flags & EQ_ENABLED) == 0)) { while (cidx != pidx) { m0 = r->items[cidx]; m_freem(m0); if (++cidx == r->size) cidx = 0; } reclaim_tx_descs(txq, 2048); total = remaining; goto done; } /* How many hardware descriptors do we have readily available. */ if (eq->pidx == eq->cidx) available = eq->sidx - 1; else available = IDXDIFF(eq->cidx, eq->pidx, eq->sidx) - 1; dbdiff = IDXDIFF(eq->pidx, eq->dbidx, eq->sidx); while (remaining > 0) { m0 = r->items[cidx]; M_ASSERTPKTHDR(m0); MPASS(m0->m_nextpkt == NULL); if (available < SGE_MAX_WR_NDESC) { available += reclaim_tx_descs(txq, 64); if (available < howmany(mbuf_len16(m0), EQ_ESIZE / 16)) break; /* out of descriptors */ } next_cidx = cidx + 1; if (__predict_false(next_cidx == r->size)) next_cidx = 0; wr = (void *)&eq->desc[eq->pidx]; if (remaining > 1 && try_txpkts(m0, r->items[next_cidx], &txp, available) == 0) { /* pkts at cidx, next_cidx should both be in txp. */ MPASS(txp.npkt == 2); tail = r->items[next_cidx]; MPASS(tail->m_nextpkt == NULL); ETHER_BPF_MTAP(ifp, m0); ETHER_BPF_MTAP(ifp, tail); m0->m_nextpkt = tail; if (__predict_false(++next_cidx == r->size)) next_cidx = 0; while (next_cidx != pidx) { if (add_to_txpkts(r->items[next_cidx], &txp, available) != 0) break; tail->m_nextpkt = r->items[next_cidx]; tail = tail->m_nextpkt; ETHER_BPF_MTAP(ifp, tail); if (__predict_false(++next_cidx == r->size)) next_cidx = 0; } n = write_txpkts_wr(txq, wr, m0, &txp, available); total += txp.npkt; remaining -= txp.npkt; } else { total++; remaining--; ETHER_BPF_MTAP(ifp, m0); n = write_txpkt_wr(txq, (void *)wr, m0, available); } MPASS(n >= 1 && n <= available && n <= SGE_MAX_WR_NDESC); available -= n; dbdiff += n; IDXINCR(eq->pidx, n, eq->sidx); if (total_available_tx_desc(eq) < eq->sidx / 4 && atomic_cmpset_int(&eq->equiq, 0, 1)) { wr->equiq_to_len16 |= htobe32(F_FW_WR_EQUIQ | F_FW_WR_EQUEQ); eq->equeqidx = eq->pidx; } else if (IDXDIFF(eq->pidx, eq->equeqidx, eq->sidx) >= 32) { wr->equiq_to_len16 |= htobe32(F_FW_WR_EQUEQ); eq->equeqidx = eq->pidx; } if (dbdiff >= 16 && remaining >= 4) { ring_eq_db(sc, eq, dbdiff); available += reclaim_tx_descs(txq, 4 * dbdiff); dbdiff = 0; } cidx = next_cidx; } if (dbdiff != 0) { ring_eq_db(sc, eq, dbdiff); reclaim_tx_descs(txq, 32); } done: TXQ_UNLOCK(txq); return (total); } static inline void init_iq(struct sge_iq *iq, struct adapter *sc, int tmr_idx, int pktc_idx, int qsize) { KASSERT(tmr_idx >= 0 && tmr_idx < SGE_NTIMERS, ("%s: bad tmr_idx %d", __func__, tmr_idx)); KASSERT(pktc_idx < SGE_NCOUNTERS, /* -ve is ok, means don't use */ ("%s: bad pktc_idx %d", __func__, pktc_idx)); iq->flags = 0; iq->adapter = sc; iq->intr_params = V_QINTR_TIMER_IDX(tmr_idx); iq->intr_pktc_idx = SGE_NCOUNTERS - 1; if (pktc_idx >= 0) { iq->intr_params |= F_QINTR_CNT_EN; iq->intr_pktc_idx = pktc_idx; } iq->qsize = roundup2(qsize, 16); /* See FW_IQ_CMD/iqsize */ iq->sidx = iq->qsize - sc->params.sge.spg_len / IQ_ESIZE; } static inline void init_fl(struct adapter *sc, struct sge_fl *fl, int qsize, int maxp, char *name) { fl->qsize = qsize; fl->sidx = qsize - sc->params.sge.spg_len / EQ_ESIZE; strlcpy(fl->lockname, name, sizeof(fl->lockname)); if (sc->flags & BUF_PACKING_OK && ((!is_t4(sc) && buffer_packing) || /* T5+: enabled unless 0 */ (is_t4(sc) && buffer_packing == 1)))/* T4: disabled unless 1 */ fl->flags |= FL_BUF_PACKING; find_best_refill_source(sc, fl, maxp); find_safe_refill_source(sc, fl); } static inline void init_eq(struct adapter *sc, struct sge_eq *eq, int eqtype, int qsize, uint8_t tx_chan, uint16_t iqid, char *name) { KASSERT(eqtype <= EQ_TYPEMASK, ("%s: bad qtype %d", __func__, eqtype)); eq->flags = eqtype & EQ_TYPEMASK; eq->tx_chan = tx_chan; eq->iqid = iqid; eq->sidx = qsize - sc->params.sge.spg_len / EQ_ESIZE; strlcpy(eq->lockname, name, sizeof(eq->lockname)); } static int alloc_ring(struct adapter *sc, size_t len, bus_dma_tag_t *tag, bus_dmamap_t *map, bus_addr_t *pa, void **va) { int rc; rc = bus_dma_tag_create(sc->dmat, 512, 0, BUS_SPACE_MAXADDR, BUS_SPACE_MAXADDR, NULL, NULL, len, 1, len, 0, NULL, NULL, tag); if (rc != 0) { device_printf(sc->dev, "cannot allocate DMA tag: %d\n", rc); goto done; } rc = bus_dmamem_alloc(*tag, va, BUS_DMA_WAITOK | BUS_DMA_COHERENT | BUS_DMA_ZERO, map); if (rc != 0) { device_printf(sc->dev, "cannot allocate DMA memory: %d\n", rc); goto done; } rc = bus_dmamap_load(*tag, *map, *va, len, oneseg_dma_callback, pa, 0); if (rc != 0) { device_printf(sc->dev, "cannot load DMA map: %d\n", rc); goto done; } done: if (rc) free_ring(sc, *tag, *map, *pa, *va); return (rc); } static int free_ring(struct adapter *sc, bus_dma_tag_t tag, bus_dmamap_t map, bus_addr_t pa, void *va) { if (pa) bus_dmamap_unload(tag, map); if (va) bus_dmamem_free(tag, va, map); if (tag) bus_dma_tag_destroy(tag); return (0); } /* * Allocates the ring for an ingress queue and an optional freelist. If the * freelist is specified it will be allocated and then associated with the * ingress queue. * * Returns errno on failure. Resources allocated up to that point may still be * allocated. Caller is responsible for cleanup in case this function fails. * * If the ingress queue will take interrupts directly (iq->flags & IQ_INTR) then * the intr_idx specifies the vector, starting from 0. Otherwise it specifies * the abs_id of the ingress queue to which its interrupts should be forwarded. */ static int alloc_iq_fl(struct vi_info *vi, struct sge_iq *iq, struct sge_fl *fl, int intr_idx, int cong) { int rc, i, cntxt_id; size_t len; struct fw_iq_cmd c; struct port_info *pi = vi->pi; struct adapter *sc = iq->adapter; struct sge_params *sp = &sc->params.sge; __be32 v = 0; len = iq->qsize * IQ_ESIZE; rc = alloc_ring(sc, len, &iq->desc_tag, &iq->desc_map, &iq->ba, (void **)&iq->desc); if (rc != 0) return (rc); bzero(&c, sizeof(c)); c.op_to_vfn = htobe32(V_FW_CMD_OP(FW_IQ_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | F_FW_CMD_EXEC | V_FW_IQ_CMD_PFN(sc->pf) | V_FW_IQ_CMD_VFN(0)); c.alloc_to_len16 = htobe32(F_FW_IQ_CMD_ALLOC | F_FW_IQ_CMD_IQSTART | FW_LEN16(c)); /* Special handling for firmware event queue */ if (iq == &sc->sge.fwq) v |= F_FW_IQ_CMD_IQASYNCH; if (iq->flags & IQ_INTR) { KASSERT(intr_idx < sc->intr_count, ("%s: invalid direct intr_idx %d", __func__, intr_idx)); } else v |= F_FW_IQ_CMD_IQANDST; v |= V_FW_IQ_CMD_IQANDSTINDEX(intr_idx); c.type_to_iqandstindex = htobe32(v | V_FW_IQ_CMD_TYPE(FW_IQ_TYPE_FL_INT_CAP) | V_FW_IQ_CMD_VIID(vi->viid) | V_FW_IQ_CMD_IQANUD(X_UPDATEDELIVERY_INTERRUPT)); c.iqdroprss_to_iqesize = htobe16(V_FW_IQ_CMD_IQPCIECH(pi->tx_chan) | F_FW_IQ_CMD_IQGTSMODE | V_FW_IQ_CMD_IQINTCNTTHRESH(iq->intr_pktc_idx) | V_FW_IQ_CMD_IQESIZE(ilog2(IQ_ESIZE) - 4)); c.iqsize = htobe16(iq->qsize); c.iqaddr = htobe64(iq->ba); if (cong >= 0) c.iqns_to_fl0congen = htobe32(F_FW_IQ_CMD_IQFLINTCONGEN); if (fl) { mtx_init(&fl->fl_lock, fl->lockname, NULL, MTX_DEF); len = fl->qsize * EQ_ESIZE; rc = alloc_ring(sc, len, &fl->desc_tag, &fl->desc_map, &fl->ba, (void **)&fl->desc); if (rc) return (rc); /* Allocate space for one software descriptor per buffer. */ rc = alloc_fl_sdesc(fl); if (rc != 0) { device_printf(sc->dev, "failed to setup fl software descriptors: %d\n", rc); return (rc); } if (fl->flags & FL_BUF_PACKING) { fl->lowat = roundup2(sp->fl_starve_threshold2, 8); fl->buf_boundary = sp->pack_boundary; } else { fl->lowat = roundup2(sp->fl_starve_threshold, 8); fl->buf_boundary = 16; } if (fl_pad && fl->buf_boundary < sp->pad_boundary) fl->buf_boundary = sp->pad_boundary; c.iqns_to_fl0congen |= htobe32(V_FW_IQ_CMD_FL0HOSTFCMODE(X_HOSTFCMODE_NONE) | F_FW_IQ_CMD_FL0FETCHRO | F_FW_IQ_CMD_FL0DATARO | (fl_pad ? F_FW_IQ_CMD_FL0PADEN : 0) | (fl->flags & FL_BUF_PACKING ? F_FW_IQ_CMD_FL0PACKEN : 0)); if (cong >= 0) { c.iqns_to_fl0congen |= htobe32(V_FW_IQ_CMD_FL0CNGCHMAP(cong) | F_FW_IQ_CMD_FL0CONGCIF | F_FW_IQ_CMD_FL0CONGEN); } c.fl0dcaen_to_fl0cidxfthresh = htobe16(V_FW_IQ_CMD_FL0FBMIN(X_FETCHBURSTMIN_128B) | V_FW_IQ_CMD_FL0FBMAX(X_FETCHBURSTMAX_512B)); c.fl0size = htobe16(fl->qsize); c.fl0addr = htobe64(fl->ba); } rc = -t4_wr_mbox(sc, sc->mbox, &c, sizeof(c), &c); if (rc != 0) { device_printf(sc->dev, "failed to create ingress queue: %d\n", rc); return (rc); } iq->cidx = 0; iq->gen = F_RSPD_GEN; iq->intr_next = iq->intr_params; iq->cntxt_id = be16toh(c.iqid); iq->abs_id = be16toh(c.physiqid); iq->flags |= IQ_ALLOCATED; cntxt_id = iq->cntxt_id - sc->sge.iq_start; if (cntxt_id >= sc->sge.niq) { panic ("%s: iq->cntxt_id (%d) more than the max (%d)", __func__, cntxt_id, sc->sge.niq - 1); } sc->sge.iqmap[cntxt_id] = iq; if (fl) { u_int qid; iq->flags |= IQ_HAS_FL; fl->cntxt_id = be16toh(c.fl0id); fl->pidx = fl->cidx = 0; cntxt_id = fl->cntxt_id - sc->sge.eq_start; if (cntxt_id >= sc->sge.neq) { panic("%s: fl->cntxt_id (%d) more than the max (%d)", __func__, cntxt_id, sc->sge.neq - 1); } sc->sge.eqmap[cntxt_id] = (void *)fl; qid = fl->cntxt_id; if (isset(&sc->doorbells, DOORBELL_UDB)) { uint32_t s_qpp = sc->params.sge.eq_s_qpp; uint32_t mask = (1 << s_qpp) - 1; volatile uint8_t *udb; udb = sc->udbs_base + UDBS_DB_OFFSET; udb += (qid >> s_qpp) << PAGE_SHIFT; qid &= mask; if (qid < PAGE_SIZE / UDBS_SEG_SIZE) { udb += qid << UDBS_SEG_SHIFT; qid = 0; } fl->udb = (volatile void *)udb; } fl->dbval = V_QID(qid) | sc->chip_params->sge_fl_db; FL_LOCK(fl); /* Enough to make sure the SGE doesn't think it's starved */ refill_fl(sc, fl, fl->lowat); FL_UNLOCK(fl); } if (is_t5(sc) && cong >= 0) { uint32_t param, val; param = V_FW_PARAMS_MNEM(FW_PARAMS_MNEM_DMAQ) | V_FW_PARAMS_PARAM_X(FW_PARAMS_PARAM_DMAQ_CONM_CTXT) | V_FW_PARAMS_PARAM_YZ(iq->cntxt_id); if (cong == 0) val = 1 << 19; else { val = 2 << 19; for (i = 0; i < 4; i++) { if (cong & (1 << i)) val |= 1 << (i << 2); } } rc = -t4_set_params(sc, sc->mbox, sc->pf, 0, 1, ¶m, &val); if (rc != 0) { /* report error but carry on */ device_printf(sc->dev, "failed to set congestion manager context for " "ingress queue %d: %d\n", iq->cntxt_id, rc); } } /* Enable IQ interrupts */ atomic_store_rel_int(&iq->state, IQS_IDLE); t4_write_reg(sc, MYPF_REG(A_SGE_PF_GTS), V_SEINTARM(iq->intr_params) | V_INGRESSQID(iq->cntxt_id)); return (0); } static int free_iq_fl(struct vi_info *vi, struct sge_iq *iq, struct sge_fl *fl) { int rc; struct adapter *sc = iq->adapter; device_t dev; if (sc == NULL) return (0); /* nothing to do */ dev = vi ? vi->dev : sc->dev; if (iq->flags & IQ_ALLOCATED) { rc = -t4_iq_free(sc, sc->mbox, sc->pf, 0, FW_IQ_TYPE_FL_INT_CAP, iq->cntxt_id, fl ? fl->cntxt_id : 0xffff, 0xffff); if (rc != 0) { device_printf(dev, "failed to free queue %p: %d\n", iq, rc); return (rc); } iq->flags &= ~IQ_ALLOCATED; } free_ring(sc, iq->desc_tag, iq->desc_map, iq->ba, iq->desc); bzero(iq, sizeof(*iq)); if (fl) { free_ring(sc, fl->desc_tag, fl->desc_map, fl->ba, fl->desc); if (fl->sdesc) free_fl_sdesc(sc, fl); if (mtx_initialized(&fl->fl_lock)) mtx_destroy(&fl->fl_lock); bzero(fl, sizeof(*fl)); } return (0); } static void add_fl_sysctls(struct sysctl_ctx_list *ctx, struct sysctl_oid *oid, struct sge_fl *fl) { struct sysctl_oid_list *children = SYSCTL_CHILDREN(oid); oid = SYSCTL_ADD_NODE(ctx, children, OID_AUTO, "fl", CTLFLAG_RD, NULL, "freelist"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cntxt_id", CTLTYPE_INT | CTLFLAG_RD, &fl->cntxt_id, 0, sysctl_uint16, "I", "SGE context id of the freelist"); SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "padding", CTLFLAG_RD, NULL, fl_pad ? 1 : 0, "padding enabled"); SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "packing", CTLFLAG_RD, NULL, fl->flags & FL_BUF_PACKING ? 1 : 0, "packing enabled"); SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "cidx", CTLFLAG_RD, &fl->cidx, 0, "consumer index"); if (fl->flags & FL_BUF_PACKING) { SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "rx_offset", CTLFLAG_RD, &fl->rx_offset, 0, "packing rx offset"); } SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "pidx", CTLFLAG_RD, &fl->pidx, 0, "producer index"); SYSCTL_ADD_UQUAD(ctx, children, OID_AUTO, "mbuf_allocated", CTLFLAG_RD, &fl->mbuf_allocated, "# of mbuf allocated"); SYSCTL_ADD_UQUAD(ctx, children, OID_AUTO, "mbuf_inlined", CTLFLAG_RD, &fl->mbuf_inlined, "# of mbuf inlined in clusters"); SYSCTL_ADD_UQUAD(ctx, children, OID_AUTO, "cluster_allocated", CTLFLAG_RD, &fl->cl_allocated, "# of clusters allocated"); SYSCTL_ADD_UQUAD(ctx, children, OID_AUTO, "cluster_recycled", CTLFLAG_RD, &fl->cl_recycled, "# of clusters recycled"); SYSCTL_ADD_UQUAD(ctx, children, OID_AUTO, "cluster_fast_recycled", CTLFLAG_RD, &fl->cl_fast_recycled, "# of clusters recycled (fast)"); } static int alloc_fwq(struct adapter *sc) { int rc, intr_idx; struct sge_iq *fwq = &sc->sge.fwq; struct sysctl_oid *oid = device_get_sysctl_tree(sc->dev); struct sysctl_oid_list *children = SYSCTL_CHILDREN(oid); init_iq(fwq, sc, 0, 0, FW_IQ_QSIZE); fwq->flags |= IQ_INTR; /* always */ intr_idx = sc->intr_count > 1 ? 1 : 0; fwq->set_tcb_rpl = t4_filter_rpl; fwq->l2t_write_rpl = do_l2t_write_rpl; rc = alloc_iq_fl(&sc->port[0]->vi[0], fwq, NULL, intr_idx, -1); if (rc != 0) { device_printf(sc->dev, "failed to create firmware event queue: %d\n", rc); return (rc); } oid = SYSCTL_ADD_NODE(&sc->ctx, children, OID_AUTO, "fwq", CTLFLAG_RD, NULL, "firmware event queue"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_PROC(&sc->ctx, children, OID_AUTO, "abs_id", CTLTYPE_INT | CTLFLAG_RD, &fwq->abs_id, 0, sysctl_uint16, "I", "absolute id of the queue"); SYSCTL_ADD_PROC(&sc->ctx, children, OID_AUTO, "cntxt_id", CTLTYPE_INT | CTLFLAG_RD, &fwq->cntxt_id, 0, sysctl_uint16, "I", "SGE context id of the queue"); SYSCTL_ADD_PROC(&sc->ctx, children, OID_AUTO, "cidx", CTLTYPE_INT | CTLFLAG_RD, &fwq->cidx, 0, sysctl_uint16, "I", "consumer index"); return (0); } static int free_fwq(struct adapter *sc) { return free_iq_fl(NULL, &sc->sge.fwq, NULL); } static int alloc_mgmtq(struct adapter *sc) { int rc; struct sge_wrq *mgmtq = &sc->sge.mgmtq; char name[16]; struct sysctl_oid *oid = device_get_sysctl_tree(sc->dev); struct sysctl_oid_list *children = SYSCTL_CHILDREN(oid); oid = SYSCTL_ADD_NODE(&sc->ctx, children, OID_AUTO, "mgmtq", CTLFLAG_RD, NULL, "management queue"); snprintf(name, sizeof(name), "%s mgmtq", device_get_nameunit(sc->dev)); init_eq(sc, &mgmtq->eq, EQ_CTRL, CTRL_EQ_QSIZE, sc->port[0]->tx_chan, sc->sge.fwq.cntxt_id, name); rc = alloc_wrq(sc, NULL, mgmtq, oid); if (rc != 0) { device_printf(sc->dev, "failed to create management queue: %d\n", rc); return (rc); } return (0); } static int free_mgmtq(struct adapter *sc) { return free_wrq(sc, &sc->sge.mgmtq); } int tnl_cong(struct port_info *pi, int drop) { if (drop == -1) return (-1); else if (drop == 1) return (0); else return (pi->rx_chan_map); } static int alloc_rxq(struct vi_info *vi, struct sge_rxq *rxq, int intr_idx, int idx, struct sysctl_oid *oid) { int rc; struct sysctl_oid_list *children; char name[16]; rc = alloc_iq_fl(vi, &rxq->iq, &rxq->fl, intr_idx, tnl_cong(vi->pi, cong_drop)); if (rc != 0) return (rc); /* * The freelist is just barely above the starvation threshold right now, * fill it up a bit more. */ FL_LOCK(&rxq->fl); refill_fl(vi->pi->adapter, &rxq->fl, 128); FL_UNLOCK(&rxq->fl); #if defined(INET) || defined(INET6) rc = tcp_lro_init(&rxq->lro); if (rc != 0) return (rc); rxq->lro.ifp = vi->ifp; /* also indicates LRO init'ed */ if (vi->ifp->if_capenable & IFCAP_LRO) rxq->iq.flags |= IQ_LRO_ENABLED; #endif rxq->ifp = vi->ifp; children = SYSCTL_CHILDREN(oid); snprintf(name, sizeof(name), "%d", idx); oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, name, CTLFLAG_RD, NULL, "rx queue"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "abs_id", CTLTYPE_INT | CTLFLAG_RD, &rxq->iq.abs_id, 0, sysctl_uint16, "I", "absolute id of the queue"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "cntxt_id", CTLTYPE_INT | CTLFLAG_RD, &rxq->iq.cntxt_id, 0, sysctl_uint16, "I", "SGE context id of the queue"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "cidx", CTLTYPE_INT | CTLFLAG_RD, &rxq->iq.cidx, 0, sysctl_uint16, "I", "consumer index"); #if defined(INET) || defined(INET6) SYSCTL_ADD_U64(&vi->ctx, children, OID_AUTO, "lro_queued", CTLFLAG_RD, &rxq->lro.lro_queued, 0, NULL); SYSCTL_ADD_U64(&vi->ctx, children, OID_AUTO, "lro_flushed", CTLFLAG_RD, &rxq->lro.lro_flushed, 0, NULL); #endif SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "rxcsum", CTLFLAG_RD, &rxq->rxcsum, "# of times hardware assisted with checksum"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "vlan_extraction", CTLFLAG_RD, &rxq->vlan_extraction, "# of times hardware extracted 802.1Q tag"); add_fl_sysctls(&vi->ctx, oid, &rxq->fl); return (rc); } static int free_rxq(struct vi_info *vi, struct sge_rxq *rxq) { int rc; #if defined(INET) || defined(INET6) if (rxq->lro.ifp) { tcp_lro_free(&rxq->lro); rxq->lro.ifp = NULL; } #endif rc = free_iq_fl(vi, &rxq->iq, &rxq->fl); if (rc == 0) bzero(rxq, sizeof(*rxq)); return (rc); } #ifdef TCP_OFFLOAD static int alloc_ofld_rxq(struct vi_info *vi, struct sge_ofld_rxq *ofld_rxq, int intr_idx, int idx, struct sysctl_oid *oid) { int rc; struct sysctl_oid_list *children; char name[16]; rc = alloc_iq_fl(vi, &ofld_rxq->iq, &ofld_rxq->fl, intr_idx, vi->pi->rx_chan_map); if (rc != 0) return (rc); children = SYSCTL_CHILDREN(oid); snprintf(name, sizeof(name), "%d", idx); oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, name, CTLFLAG_RD, NULL, "rx queue"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "abs_id", CTLTYPE_INT | CTLFLAG_RD, &ofld_rxq->iq.abs_id, 0, sysctl_uint16, "I", "absolute id of the queue"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "cntxt_id", CTLTYPE_INT | CTLFLAG_RD, &ofld_rxq->iq.cntxt_id, 0, sysctl_uint16, "I", "SGE context id of the queue"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "cidx", CTLTYPE_INT | CTLFLAG_RD, &ofld_rxq->iq.cidx, 0, sysctl_uint16, "I", "consumer index"); add_fl_sysctls(&vi->ctx, oid, &ofld_rxq->fl); return (rc); } static int free_ofld_rxq(struct vi_info *vi, struct sge_ofld_rxq *ofld_rxq) { int rc; rc = free_iq_fl(vi, &ofld_rxq->iq, &ofld_rxq->fl); if (rc == 0) bzero(ofld_rxq, sizeof(*ofld_rxq)); return (rc); } #endif #ifdef DEV_NETMAP static int alloc_nm_rxq(struct vi_info *vi, struct sge_nm_rxq *nm_rxq, int intr_idx, int idx, struct sysctl_oid *oid) { int rc; struct sysctl_oid_list *children; struct sysctl_ctx_list *ctx; char name[16]; size_t len; struct adapter *sc = vi->pi->adapter; struct netmap_adapter *na = NA(vi->ifp); MPASS(na != NULL); len = vi->qsize_rxq * IQ_ESIZE; rc = alloc_ring(sc, len, &nm_rxq->iq_desc_tag, &nm_rxq->iq_desc_map, &nm_rxq->iq_ba, (void **)&nm_rxq->iq_desc); if (rc != 0) return (rc); len = na->num_rx_desc * EQ_ESIZE + sc->params.sge.spg_len; rc = alloc_ring(sc, len, &nm_rxq->fl_desc_tag, &nm_rxq->fl_desc_map, &nm_rxq->fl_ba, (void **)&nm_rxq->fl_desc); if (rc != 0) return (rc); nm_rxq->vi = vi; nm_rxq->nid = idx; nm_rxq->iq_cidx = 0; nm_rxq->iq_sidx = vi->qsize_rxq - sc->params.sge.spg_len / IQ_ESIZE; nm_rxq->iq_gen = F_RSPD_GEN; nm_rxq->fl_pidx = nm_rxq->fl_cidx = 0; nm_rxq->fl_sidx = na->num_rx_desc; nm_rxq->intr_idx = intr_idx; ctx = &vi->ctx; children = SYSCTL_CHILDREN(oid); snprintf(name, sizeof(name), "%d", idx); oid = SYSCTL_ADD_NODE(ctx, children, OID_AUTO, name, CTLFLAG_RD, NULL, "rx queue"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "abs_id", CTLTYPE_INT | CTLFLAG_RD, &nm_rxq->iq_abs_id, 0, sysctl_uint16, "I", "absolute id of the queue"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cntxt_id", CTLTYPE_INT | CTLFLAG_RD, &nm_rxq->iq_cntxt_id, 0, sysctl_uint16, "I", "SGE context id of the queue"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cidx", CTLTYPE_INT | CTLFLAG_RD, &nm_rxq->iq_cidx, 0, sysctl_uint16, "I", "consumer index"); children = SYSCTL_CHILDREN(oid); oid = SYSCTL_ADD_NODE(ctx, children, OID_AUTO, "fl", CTLFLAG_RD, NULL, "freelist"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cntxt_id", CTLTYPE_INT | CTLFLAG_RD, &nm_rxq->fl_cntxt_id, 0, sysctl_uint16, "I", "SGE context id of the freelist"); SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "cidx", CTLFLAG_RD, &nm_rxq->fl_cidx, 0, "consumer index"); SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "pidx", CTLFLAG_RD, &nm_rxq->fl_pidx, 0, "producer index"); return (rc); } static int free_nm_rxq(struct vi_info *vi, struct sge_nm_rxq *nm_rxq) { struct adapter *sc = vi->pi->adapter; free_ring(sc, nm_rxq->iq_desc_tag, nm_rxq->iq_desc_map, nm_rxq->iq_ba, nm_rxq->iq_desc); free_ring(sc, nm_rxq->fl_desc_tag, nm_rxq->fl_desc_map, nm_rxq->fl_ba, nm_rxq->fl_desc); return (0); } static int alloc_nm_txq(struct vi_info *vi, struct sge_nm_txq *nm_txq, int iqidx, int idx, struct sysctl_oid *oid) { int rc; size_t len; struct port_info *pi = vi->pi; struct adapter *sc = pi->adapter; struct netmap_adapter *na = NA(vi->ifp); char name[16]; struct sysctl_oid_list *children = SYSCTL_CHILDREN(oid); len = na->num_tx_desc * EQ_ESIZE + sc->params.sge.spg_len; rc = alloc_ring(sc, len, &nm_txq->desc_tag, &nm_txq->desc_map, &nm_txq->ba, (void **)&nm_txq->desc); if (rc) return (rc); nm_txq->pidx = nm_txq->cidx = 0; nm_txq->sidx = na->num_tx_desc; nm_txq->nid = idx; nm_txq->iqidx = iqidx; nm_txq->cpl_ctrl0 = htobe32(V_TXPKT_OPCODE(CPL_TX_PKT) | V_TXPKT_INTF(pi->tx_chan) | V_TXPKT_VF_VLD(1) | V_TXPKT_VF(vi->viid)); snprintf(name, sizeof(name), "%d", idx); oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, name, CTLFLAG_RD, NULL, "netmap tx queue"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_UINT(&vi->ctx, children, OID_AUTO, "cntxt_id", CTLFLAG_RD, &nm_txq->cntxt_id, 0, "SGE context id of the queue"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "cidx", CTLTYPE_INT | CTLFLAG_RD, &nm_txq->cidx, 0, sysctl_uint16, "I", "consumer index"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "pidx", CTLTYPE_INT | CTLFLAG_RD, &nm_txq->pidx, 0, sysctl_uint16, "I", "producer index"); return (rc); } static int free_nm_txq(struct vi_info *vi, struct sge_nm_txq *nm_txq) { struct adapter *sc = vi->pi->adapter; free_ring(sc, nm_txq->desc_tag, nm_txq->desc_map, nm_txq->ba, nm_txq->desc); return (0); } #endif static int ctrl_eq_alloc(struct adapter *sc, struct sge_eq *eq) { int rc, cntxt_id; struct fw_eq_ctrl_cmd c; int qsize = eq->sidx + sc->params.sge.spg_len / EQ_ESIZE; bzero(&c, sizeof(c)); c.op_to_vfn = htobe32(V_FW_CMD_OP(FW_EQ_CTRL_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | F_FW_CMD_EXEC | V_FW_EQ_CTRL_CMD_PFN(sc->pf) | V_FW_EQ_CTRL_CMD_VFN(0)); c.alloc_to_len16 = htobe32(F_FW_EQ_CTRL_CMD_ALLOC | F_FW_EQ_CTRL_CMD_EQSTART | FW_LEN16(c)); c.cmpliqid_eqid = htonl(V_FW_EQ_CTRL_CMD_CMPLIQID(eq->iqid)); c.physeqid_pkd = htobe32(0); c.fetchszm_to_iqid = htobe32(V_FW_EQ_CTRL_CMD_HOSTFCMODE(X_HOSTFCMODE_NONE) | V_FW_EQ_CTRL_CMD_PCIECHN(eq->tx_chan) | F_FW_EQ_CTRL_CMD_FETCHRO | V_FW_EQ_CTRL_CMD_IQID(eq->iqid)); c.dcaen_to_eqsize = htobe32(V_FW_EQ_CTRL_CMD_FBMIN(X_FETCHBURSTMIN_64B) | V_FW_EQ_CTRL_CMD_FBMAX(X_FETCHBURSTMAX_512B) | V_FW_EQ_CTRL_CMD_EQSIZE(qsize)); c.eqaddr = htobe64(eq->ba); rc = -t4_wr_mbox(sc, sc->mbox, &c, sizeof(c), &c); if (rc != 0) { device_printf(sc->dev, "failed to create control queue %d: %d\n", eq->tx_chan, rc); return (rc); } eq->flags |= EQ_ALLOCATED; eq->cntxt_id = G_FW_EQ_CTRL_CMD_EQID(be32toh(c.cmpliqid_eqid)); cntxt_id = eq->cntxt_id - sc->sge.eq_start; if (cntxt_id >= sc->sge.neq) panic("%s: eq->cntxt_id (%d) more than the max (%d)", __func__, cntxt_id, sc->sge.neq - 1); sc->sge.eqmap[cntxt_id] = eq; return (rc); } static int eth_eq_alloc(struct adapter *sc, struct vi_info *vi, struct sge_eq *eq) { int rc, cntxt_id; struct fw_eq_eth_cmd c; int qsize = eq->sidx + sc->params.sge.spg_len / EQ_ESIZE; bzero(&c, sizeof(c)); c.op_to_vfn = htobe32(V_FW_CMD_OP(FW_EQ_ETH_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | F_FW_CMD_EXEC | V_FW_EQ_ETH_CMD_PFN(sc->pf) | V_FW_EQ_ETH_CMD_VFN(0)); c.alloc_to_len16 = htobe32(F_FW_EQ_ETH_CMD_ALLOC | F_FW_EQ_ETH_CMD_EQSTART | FW_LEN16(c)); c.autoequiqe_to_viid = htobe32(F_FW_EQ_ETH_CMD_AUTOEQUIQE | F_FW_EQ_ETH_CMD_AUTOEQUEQE | V_FW_EQ_ETH_CMD_VIID(vi->viid)); c.fetchszm_to_iqid = htobe32(V_FW_EQ_ETH_CMD_HOSTFCMODE(X_HOSTFCMODE_NONE) | V_FW_EQ_ETH_CMD_PCIECHN(eq->tx_chan) | F_FW_EQ_ETH_CMD_FETCHRO | V_FW_EQ_ETH_CMD_IQID(eq->iqid)); c.dcaen_to_eqsize = htobe32(V_FW_EQ_ETH_CMD_FBMIN(X_FETCHBURSTMIN_64B) | V_FW_EQ_ETH_CMD_FBMAX(X_FETCHBURSTMAX_512B) | V_FW_EQ_ETH_CMD_EQSIZE(qsize)); c.eqaddr = htobe64(eq->ba); rc = -t4_wr_mbox(sc, sc->mbox, &c, sizeof(c), &c); if (rc != 0) { device_printf(vi->dev, "failed to create Ethernet egress queue: %d\n", rc); return (rc); } eq->flags |= EQ_ALLOCATED; eq->cntxt_id = G_FW_EQ_ETH_CMD_EQID(be32toh(c.eqid_pkd)); cntxt_id = eq->cntxt_id - sc->sge.eq_start; if (cntxt_id >= sc->sge.neq) panic("%s: eq->cntxt_id (%d) more than the max (%d)", __func__, cntxt_id, sc->sge.neq - 1); sc->sge.eqmap[cntxt_id] = eq; return (rc); } #ifdef TCP_OFFLOAD static int ofld_eq_alloc(struct adapter *sc, struct vi_info *vi, struct sge_eq *eq) { int rc, cntxt_id; struct fw_eq_ofld_cmd c; int qsize = eq->sidx + sc->params.sge.spg_len / EQ_ESIZE; bzero(&c, sizeof(c)); c.op_to_vfn = htonl(V_FW_CMD_OP(FW_EQ_OFLD_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | F_FW_CMD_EXEC | V_FW_EQ_OFLD_CMD_PFN(sc->pf) | V_FW_EQ_OFLD_CMD_VFN(0)); c.alloc_to_len16 = htonl(F_FW_EQ_OFLD_CMD_ALLOC | F_FW_EQ_OFLD_CMD_EQSTART | FW_LEN16(c)); c.fetchszm_to_iqid = htonl(V_FW_EQ_OFLD_CMD_HOSTFCMODE(X_HOSTFCMODE_NONE) | V_FW_EQ_OFLD_CMD_PCIECHN(eq->tx_chan) | F_FW_EQ_OFLD_CMD_FETCHRO | V_FW_EQ_OFLD_CMD_IQID(eq->iqid)); c.dcaen_to_eqsize = htobe32(V_FW_EQ_OFLD_CMD_FBMIN(X_FETCHBURSTMIN_64B) | V_FW_EQ_OFLD_CMD_FBMAX(X_FETCHBURSTMAX_512B) | V_FW_EQ_OFLD_CMD_EQSIZE(qsize)); c.eqaddr = htobe64(eq->ba); rc = -t4_wr_mbox(sc, sc->mbox, &c, sizeof(c), &c); if (rc != 0) { device_printf(vi->dev, "failed to create egress queue for TCP offload: %d\n", rc); return (rc); } eq->flags |= EQ_ALLOCATED; eq->cntxt_id = G_FW_EQ_OFLD_CMD_EQID(be32toh(c.eqid_pkd)); cntxt_id = eq->cntxt_id - sc->sge.eq_start; if (cntxt_id >= sc->sge.neq) panic("%s: eq->cntxt_id (%d) more than the max (%d)", __func__, cntxt_id, sc->sge.neq - 1); sc->sge.eqmap[cntxt_id] = eq; return (rc); } #endif static int alloc_eq(struct adapter *sc, struct vi_info *vi, struct sge_eq *eq) { int rc, qsize; size_t len; mtx_init(&eq->eq_lock, eq->lockname, NULL, MTX_DEF); qsize = eq->sidx + sc->params.sge.spg_len / EQ_ESIZE; len = qsize * EQ_ESIZE; rc = alloc_ring(sc, len, &eq->desc_tag, &eq->desc_map, &eq->ba, (void **)&eq->desc); if (rc) return (rc); eq->pidx = eq->cidx = 0; eq->equeqidx = eq->dbidx = 0; eq->doorbells = sc->doorbells; switch (eq->flags & EQ_TYPEMASK) { case EQ_CTRL: rc = ctrl_eq_alloc(sc, eq); break; case EQ_ETH: rc = eth_eq_alloc(sc, vi, eq); break; #ifdef TCP_OFFLOAD case EQ_OFLD: rc = ofld_eq_alloc(sc, vi, eq); break; #endif default: panic("%s: invalid eq type %d.", __func__, eq->flags & EQ_TYPEMASK); } if (rc != 0) { device_printf(sc->dev, "failed to allocate egress queue(%d): %d\n", eq->flags & EQ_TYPEMASK, rc); } if (isset(&eq->doorbells, DOORBELL_UDB) || isset(&eq->doorbells, DOORBELL_UDBWC) || isset(&eq->doorbells, DOORBELL_WCWR)) { uint32_t s_qpp = sc->params.sge.eq_s_qpp; uint32_t mask = (1 << s_qpp) - 1; volatile uint8_t *udb; udb = sc->udbs_base + UDBS_DB_OFFSET; udb += (eq->cntxt_id >> s_qpp) << PAGE_SHIFT; /* pg offset */ eq->udb_qid = eq->cntxt_id & mask; /* id in page */ if (eq->udb_qid >= PAGE_SIZE / UDBS_SEG_SIZE) clrbit(&eq->doorbells, DOORBELL_WCWR); else { udb += eq->udb_qid << UDBS_SEG_SHIFT; /* seg offset */ eq->udb_qid = 0; } eq->udb = (volatile void *)udb; } return (rc); } static int free_eq(struct adapter *sc, struct sge_eq *eq) { int rc; if (eq->flags & EQ_ALLOCATED) { switch (eq->flags & EQ_TYPEMASK) { case EQ_CTRL: rc = -t4_ctrl_eq_free(sc, sc->mbox, sc->pf, 0, eq->cntxt_id); break; case EQ_ETH: rc = -t4_eth_eq_free(sc, sc->mbox, sc->pf, 0, eq->cntxt_id); break; #ifdef TCP_OFFLOAD case EQ_OFLD: rc = -t4_ofld_eq_free(sc, sc->mbox, sc->pf, 0, eq->cntxt_id); break; #endif default: panic("%s: invalid eq type %d.", __func__, eq->flags & EQ_TYPEMASK); } if (rc != 0) { device_printf(sc->dev, "failed to free egress queue (%d): %d\n", eq->flags & EQ_TYPEMASK, rc); return (rc); } eq->flags &= ~EQ_ALLOCATED; } free_ring(sc, eq->desc_tag, eq->desc_map, eq->ba, eq->desc); if (mtx_initialized(&eq->eq_lock)) mtx_destroy(&eq->eq_lock); bzero(eq, sizeof(*eq)); return (0); } static int alloc_wrq(struct adapter *sc, struct vi_info *vi, struct sge_wrq *wrq, struct sysctl_oid *oid) { int rc; struct sysctl_ctx_list *ctx = vi ? &vi->ctx : &sc->ctx; struct sysctl_oid_list *children = SYSCTL_CHILDREN(oid); rc = alloc_eq(sc, vi, &wrq->eq); if (rc) return (rc); wrq->adapter = sc; TASK_INIT(&wrq->wrq_tx_task, 0, wrq_tx_drain, wrq); TAILQ_INIT(&wrq->incomplete_wrs); STAILQ_INIT(&wrq->wr_list); wrq->nwr_pending = 0; wrq->ndesc_needed = 0; SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "cntxt_id", CTLFLAG_RD, &wrq->eq.cntxt_id, 0, "SGE context id of the queue"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cidx", CTLTYPE_INT | CTLFLAG_RD, &wrq->eq.cidx, 0, sysctl_uint16, "I", "consumer index"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "pidx", CTLTYPE_INT | CTLFLAG_RD, &wrq->eq.pidx, 0, sysctl_uint16, "I", "producer index"); SYSCTL_ADD_UQUAD(ctx, children, OID_AUTO, "tx_wrs_direct", CTLFLAG_RD, &wrq->tx_wrs_direct, "# of work requests (direct)"); SYSCTL_ADD_UQUAD(ctx, children, OID_AUTO, "tx_wrs_copied", CTLFLAG_RD, &wrq->tx_wrs_copied, "# of work requests (copied)"); return (rc); } static int free_wrq(struct adapter *sc, struct sge_wrq *wrq) { int rc; rc = free_eq(sc, &wrq->eq); if (rc) return (rc); bzero(wrq, sizeof(*wrq)); return (0); } static int alloc_txq(struct vi_info *vi, struct sge_txq *txq, int idx, struct sysctl_oid *oid) { int rc; struct port_info *pi = vi->pi; struct adapter *sc = pi->adapter; struct sge_eq *eq = &txq->eq; char name[16]; struct sysctl_oid_list *children = SYSCTL_CHILDREN(oid); rc = mp_ring_alloc(&txq->r, eq->sidx, txq, eth_tx, can_resume_eth_tx, M_CXGBE, M_WAITOK); if (rc != 0) { device_printf(sc->dev, "failed to allocate mp_ring: %d\n", rc); return (rc); } rc = alloc_eq(sc, vi, eq); if (rc != 0) { mp_ring_free(txq->r); txq->r = NULL; return (rc); } /* Can't fail after this point. */ TASK_INIT(&txq->tx_reclaim_task, 0, tx_reclaim, eq); txq->ifp = vi->ifp; txq->gl = sglist_alloc(TX_SGL_SEGS, M_WAITOK); txq->cpl_ctrl0 = htobe32(V_TXPKT_OPCODE(CPL_TX_PKT) | V_TXPKT_INTF(pi->tx_chan) | V_TXPKT_VF_VLD(1) | V_TXPKT_VF(vi->viid)); txq->tc_idx = -1; txq->sdesc = malloc(eq->sidx * sizeof(struct tx_sdesc), M_CXGBE, M_ZERO | M_WAITOK); snprintf(name, sizeof(name), "%d", idx); oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, name, CTLFLAG_RD, NULL, "tx queue"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_UINT(&vi->ctx, children, OID_AUTO, "cntxt_id", CTLFLAG_RD, &eq->cntxt_id, 0, "SGE context id of the queue"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "cidx", CTLTYPE_INT | CTLFLAG_RD, &eq->cidx, 0, sysctl_uint16, "I", "consumer index"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "pidx", CTLTYPE_INT | CTLFLAG_RD, &eq->pidx, 0, sysctl_uint16, "I", "producer index"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "tc", CTLTYPE_INT | CTLFLAG_RW, vi, idx, sysctl_tc, "I", "traffic class (-1 means none)"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "txcsum", CTLFLAG_RD, &txq->txcsum, "# of times hardware assisted with checksum"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "vlan_insertion", CTLFLAG_RD, &txq->vlan_insertion, "# of times hardware inserted 802.1Q tag"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "tso_wrs", CTLFLAG_RD, &txq->tso_wrs, "# of TSO work requests"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "imm_wrs", CTLFLAG_RD, &txq->imm_wrs, "# of work requests with immediate data"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "sgl_wrs", CTLFLAG_RD, &txq->sgl_wrs, "# of work requests with direct SGL"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "txpkt_wrs", CTLFLAG_RD, &txq->txpkt_wrs, "# of txpkt work requests (one pkt/WR)"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "txpkts0_wrs", CTLFLAG_RD, &txq->txpkts0_wrs, "# of txpkts (type 0) work requests"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "txpkts1_wrs", CTLFLAG_RD, &txq->txpkts1_wrs, "# of txpkts (type 1) work requests"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "txpkts0_pkts", CTLFLAG_RD, &txq->txpkts0_pkts, "# of frames tx'd using type0 txpkts work requests"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "txpkts1_pkts", CTLFLAG_RD, &txq->txpkts1_pkts, "# of frames tx'd using type1 txpkts work requests"); SYSCTL_ADD_COUNTER_U64(&vi->ctx, children, OID_AUTO, "r_enqueues", CTLFLAG_RD, &txq->r->enqueues, "# of enqueues to the mp_ring for this queue"); SYSCTL_ADD_COUNTER_U64(&vi->ctx, children, OID_AUTO, "r_drops", CTLFLAG_RD, &txq->r->drops, "# of drops in the mp_ring for this queue"); SYSCTL_ADD_COUNTER_U64(&vi->ctx, children, OID_AUTO, "r_starts", CTLFLAG_RD, &txq->r->starts, "# of normal consumer starts in the mp_ring for this queue"); SYSCTL_ADD_COUNTER_U64(&vi->ctx, children, OID_AUTO, "r_stalls", CTLFLAG_RD, &txq->r->stalls, "# of consumer stalls in the mp_ring for this queue"); SYSCTL_ADD_COUNTER_U64(&vi->ctx, children, OID_AUTO, "r_restarts", CTLFLAG_RD, &txq->r->restarts, "# of consumer restarts in the mp_ring for this queue"); SYSCTL_ADD_COUNTER_U64(&vi->ctx, children, OID_AUTO, "r_abdications", CTLFLAG_RD, &txq->r->abdications, "# of consumer abdications in the mp_ring for this queue"); return (0); } static int free_txq(struct vi_info *vi, struct sge_txq *txq) { int rc; struct adapter *sc = vi->pi->adapter; struct sge_eq *eq = &txq->eq; rc = free_eq(sc, eq); if (rc) return (rc); sglist_free(txq->gl); free(txq->sdesc, M_CXGBE); mp_ring_free(txq->r); bzero(txq, sizeof(*txq)); return (0); } static void oneseg_dma_callback(void *arg, bus_dma_segment_t *segs, int nseg, int error) { bus_addr_t *ba = arg; KASSERT(nseg == 1, ("%s meant for single segment mappings only.", __func__)); *ba = error ? 0 : segs->ds_addr; } static inline void ring_fl_db(struct adapter *sc, struct sge_fl *fl) { uint32_t n, v; n = IDXDIFF(fl->pidx / 8, fl->dbidx, fl->sidx); MPASS(n > 0); wmb(); v = fl->dbval | V_PIDX(n); if (fl->udb) *fl->udb = htole32(v); else t4_write_reg(sc, MYPF_REG(A_SGE_PF_KDOORBELL), v); IDXINCR(fl->dbidx, n, fl->sidx); } /* * Fills up the freelist by allocating up to 'n' buffers. Buffers that are * recycled do not count towards this allocation budget. * * Returns non-zero to indicate that this freelist should be added to the list * of starving freelists. */ static int refill_fl(struct adapter *sc, struct sge_fl *fl, int n) { __be64 *d; struct fl_sdesc *sd; uintptr_t pa; caddr_t cl; struct cluster_layout *cll; struct sw_zone_info *swz; struct cluster_metadata *clm; uint16_t max_pidx; uint16_t hw_cidx = fl->hw_cidx; /* stable snapshot */ FL_LOCK_ASSERT_OWNED(fl); /* * We always stop at the beginning of the hardware descriptor that's just * before the one with the hw cidx. This is to avoid hw pidx = hw cidx, * which would mean an empty freelist to the chip. */ max_pidx = __predict_false(hw_cidx == 0) ? fl->sidx - 1 : hw_cidx - 1; if (fl->pidx == max_pidx * 8) return (0); d = &fl->desc[fl->pidx]; sd = &fl->sdesc[fl->pidx]; cll = &fl->cll_def; /* default layout */ swz = &sc->sge.sw_zone_info[cll->zidx]; while (n > 0) { if (sd->cl != NULL) { if (sd->nmbuf == 0) { /* * Fast recycle without involving any atomics on * the cluster's metadata (if the cluster has * metadata). This happens when all frames * received in the cluster were small enough to * fit within a single mbuf each. */ fl->cl_fast_recycled++; #ifdef INVARIANTS clm = cl_metadata(sc, fl, &sd->cll, sd->cl); if (clm != NULL) MPASS(clm->refcount == 1); #endif goto recycled_fast; } /* * Cluster is guaranteed to have metadata. Clusters * without metadata always take the fast recycle path * when they're recycled. */ clm = cl_metadata(sc, fl, &sd->cll, sd->cl); MPASS(clm != NULL); if (atomic_fetchadd_int(&clm->refcount, -1) == 1) { fl->cl_recycled++; counter_u64_add(extfree_rels, 1); goto recycled; } sd->cl = NULL; /* gave up my reference */ } MPASS(sd->cl == NULL); alloc: cl = uma_zalloc(swz->zone, M_NOWAIT); if (__predict_false(cl == NULL)) { if (cll == &fl->cll_alt || fl->cll_alt.zidx == -1 || fl->cll_def.zidx == fl->cll_alt.zidx) break; /* fall back to the safe zone */ cll = &fl->cll_alt; swz = &sc->sge.sw_zone_info[cll->zidx]; goto alloc; } fl->cl_allocated++; n--; pa = pmap_kextract((vm_offset_t)cl); pa += cll->region1; sd->cl = cl; sd->cll = *cll; *d = htobe64(pa | cll->hwidx); clm = cl_metadata(sc, fl, cll, cl); if (clm != NULL) { recycled: #ifdef INVARIANTS clm->sd = sd; #endif clm->refcount = 1; } sd->nmbuf = 0; recycled_fast: d++; sd++; if (__predict_false(++fl->pidx % 8 == 0)) { uint16_t pidx = fl->pidx / 8; if (__predict_false(pidx == fl->sidx)) { fl->pidx = 0; pidx = 0; sd = fl->sdesc; d = fl->desc; } if (pidx == max_pidx) break; if (IDXDIFF(pidx, fl->dbidx, fl->sidx) >= 4) ring_fl_db(sc, fl); } } if (fl->pidx / 8 != fl->dbidx) ring_fl_db(sc, fl); return (FL_RUNNING_LOW(fl) && !(fl->flags & FL_STARVING)); } /* * Attempt to refill all starving freelists. */ static void refill_sfl(void *arg) { struct adapter *sc = arg; struct sge_fl *fl, *fl_temp; mtx_assert(&sc->sfl_lock, MA_OWNED); TAILQ_FOREACH_SAFE(fl, &sc->sfl, link, fl_temp) { FL_LOCK(fl); refill_fl(sc, fl, 64); if (FL_NOT_RUNNING_LOW(fl) || fl->flags & FL_DOOMED) { TAILQ_REMOVE(&sc->sfl, fl, link); fl->flags &= ~FL_STARVING; } FL_UNLOCK(fl); } if (!TAILQ_EMPTY(&sc->sfl)) callout_schedule(&sc->sfl_callout, hz / 5); } static int alloc_fl_sdesc(struct sge_fl *fl) { fl->sdesc = malloc(fl->sidx * 8 * sizeof(struct fl_sdesc), M_CXGBE, M_ZERO | M_WAITOK); return (0); } static void free_fl_sdesc(struct adapter *sc, struct sge_fl *fl) { struct fl_sdesc *sd; struct cluster_metadata *clm; struct cluster_layout *cll; int i; sd = fl->sdesc; for (i = 0; i < fl->sidx * 8; i++, sd++) { if (sd->cl == NULL) continue; cll = &sd->cll; clm = cl_metadata(sc, fl, cll, sd->cl); if (sd->nmbuf == 0) uma_zfree(sc->sge.sw_zone_info[cll->zidx].zone, sd->cl); else if (clm && atomic_fetchadd_int(&clm->refcount, -1) == 1) { uma_zfree(sc->sge.sw_zone_info[cll->zidx].zone, sd->cl); counter_u64_add(extfree_rels, 1); } sd->cl = NULL; } free(fl->sdesc, M_CXGBE); fl->sdesc = NULL; } static inline void get_pkt_gl(struct mbuf *m, struct sglist *gl) { int rc; M_ASSERTPKTHDR(m); sglist_reset(gl); rc = sglist_append_mbuf(gl, m); if (__predict_false(rc != 0)) { panic("%s: mbuf %p (%d segs) was vetted earlier but now fails " "with %d.", __func__, m, mbuf_nsegs(m), rc); } KASSERT(gl->sg_nseg == mbuf_nsegs(m), ("%s: nsegs changed for mbuf %p from %d to %d", __func__, m, mbuf_nsegs(m), gl->sg_nseg)); KASSERT(gl->sg_nseg > 0 && gl->sg_nseg <= (needs_tso(m) ? TX_SGL_SEGS_TSO : TX_SGL_SEGS), ("%s: %d segments, should have been 1 <= nsegs <= %d", __func__, gl->sg_nseg, needs_tso(m) ? TX_SGL_SEGS_TSO : TX_SGL_SEGS)); } /* * len16 for a txpkt WR with a GL. Includes the firmware work request header. */ static inline u_int txpkt_len16(u_int nsegs, u_int tso) { u_int n; MPASS(nsegs > 0); nsegs--; /* first segment is part of ulptx_sgl */ n = sizeof(struct fw_eth_tx_pkt_wr) + sizeof(struct cpl_tx_pkt_core) + sizeof(struct ulptx_sgl) + 8 * ((3 * nsegs) / 2 + (nsegs & 1)); if (tso) n += sizeof(struct cpl_tx_pkt_lso_core); return (howmany(n, 16)); } /* * len16 for a txpkts type 0 WR with a GL. Does not include the firmware work * request header. */ static inline u_int txpkts0_len16(u_int nsegs) { u_int n; MPASS(nsegs > 0); nsegs--; /* first segment is part of ulptx_sgl */ n = sizeof(struct ulp_txpkt) + sizeof(struct ulptx_idata) + sizeof(struct cpl_tx_pkt_core) + sizeof(struct ulptx_sgl) + 8 * ((3 * nsegs) / 2 + (nsegs & 1)); return (howmany(n, 16)); } /* * len16 for a txpkts type 1 WR with a GL. Does not include the firmware work * request header. */ static inline u_int txpkts1_len16(void) { u_int n; n = sizeof(struct cpl_tx_pkt_core) + sizeof(struct ulptx_sgl); return (howmany(n, 16)); } static inline u_int imm_payload(u_int ndesc) { u_int n; n = ndesc * EQ_ESIZE - sizeof(struct fw_eth_tx_pkt_wr) - sizeof(struct cpl_tx_pkt_core); return (n); } /* * Write a txpkt WR for this packet to the hardware descriptors, update the * software descriptor, and advance the pidx. It is guaranteed that enough * descriptors are available. * * The return value is the # of hardware descriptors used. */ static u_int write_txpkt_wr(struct sge_txq *txq, struct fw_eth_tx_pkt_wr *wr, struct mbuf *m0, u_int available) { struct sge_eq *eq = &txq->eq; struct tx_sdesc *txsd; struct cpl_tx_pkt_core *cpl; uint32_t ctrl; /* used in many unrelated places */ uint64_t ctrl1; int len16, ndesc, pktlen, nsegs; caddr_t dst; TXQ_LOCK_ASSERT_OWNED(txq); M_ASSERTPKTHDR(m0); MPASS(available > 0 && available < eq->sidx); len16 = mbuf_len16(m0); nsegs = mbuf_nsegs(m0); pktlen = m0->m_pkthdr.len; ctrl = sizeof(struct cpl_tx_pkt_core); if (needs_tso(m0)) ctrl += sizeof(struct cpl_tx_pkt_lso_core); else if (pktlen <= imm_payload(2) && available >= 2) { /* Immediate data. Recalculate len16 and set nsegs to 0. */ ctrl += pktlen; len16 = howmany(sizeof(struct fw_eth_tx_pkt_wr) + sizeof(struct cpl_tx_pkt_core) + pktlen, 16); nsegs = 0; } ndesc = howmany(len16, EQ_ESIZE / 16); MPASS(ndesc <= available); /* Firmware work request header */ MPASS(wr == (void *)&eq->desc[eq->pidx]); wr->op_immdlen = htobe32(V_FW_WR_OP(FW_ETH_TX_PKT_WR) | V_FW_ETH_TX_PKT_WR_IMMDLEN(ctrl)); ctrl = V_FW_WR_LEN16(len16); wr->equiq_to_len16 = htobe32(ctrl); wr->r3 = 0; if (needs_tso(m0)) { struct cpl_tx_pkt_lso_core *lso = (void *)(wr + 1); KASSERT(m0->m_pkthdr.l2hlen > 0 && m0->m_pkthdr.l3hlen > 0 && m0->m_pkthdr.l4hlen > 0, ("%s: mbuf %p needs TSO but missing header lengths", __func__, m0)); ctrl = V_LSO_OPCODE(CPL_TX_PKT_LSO) | F_LSO_FIRST_SLICE | F_LSO_LAST_SLICE | V_LSO_IPHDR_LEN(m0->m_pkthdr.l3hlen >> 2) | V_LSO_TCPHDR_LEN(m0->m_pkthdr.l4hlen >> 2); if (m0->m_pkthdr.l2hlen == sizeof(struct ether_vlan_header)) ctrl |= V_LSO_ETHHDR_LEN(1); if (m0->m_pkthdr.l3hlen == sizeof(struct ip6_hdr)) ctrl |= F_LSO_IPV6; lso->lso_ctrl = htobe32(ctrl); lso->ipid_ofst = htobe16(0); lso->mss = htobe16(m0->m_pkthdr.tso_segsz); lso->seqno_offset = htobe32(0); lso->len = htobe32(pktlen); cpl = (void *)(lso + 1); txq->tso_wrs++; } else cpl = (void *)(wr + 1); /* Checksum offload */ ctrl1 = 0; if (needs_l3_csum(m0) == 0) ctrl1 |= F_TXPKT_IPCSUM_DIS; if (needs_l4_csum(m0) == 0) ctrl1 |= F_TXPKT_L4CSUM_DIS; if (m0->m_pkthdr.csum_flags & (CSUM_IP | CSUM_TCP | CSUM_UDP | CSUM_UDP_IPV6 | CSUM_TCP_IPV6 | CSUM_TSO)) txq->txcsum++; /* some hardware assistance provided */ /* VLAN tag insertion */ if (needs_vlan_insertion(m0)) { ctrl1 |= F_TXPKT_VLAN_VLD | V_TXPKT_VLAN(m0->m_pkthdr.ether_vtag); txq->vlan_insertion++; } /* CPL header */ cpl->ctrl0 = txq->cpl_ctrl0; cpl->pack = 0; cpl->len = htobe16(pktlen); cpl->ctrl1 = htobe64(ctrl1); /* SGL */ dst = (void *)(cpl + 1); if (nsegs > 0) { write_gl_to_txd(txq, m0, &dst, eq->sidx - ndesc < eq->pidx); txq->sgl_wrs++; } else { struct mbuf *m; for (m = m0; m != NULL; m = m->m_next) { copy_to_txd(eq, mtod(m, caddr_t), &dst, m->m_len); #ifdef INVARIANTS pktlen -= m->m_len; #endif } #ifdef INVARIANTS KASSERT(pktlen == 0, ("%s: %d bytes left.", __func__, pktlen)); #endif txq->imm_wrs++; } txq->txpkt_wrs++; txsd = &txq->sdesc[eq->pidx]; txsd->m = m0; txsd->desc_used = ndesc; return (ndesc); } static int try_txpkts(struct mbuf *m, struct mbuf *n, struct txpkts *txp, u_int available) { u_int needed, nsegs1, nsegs2, l1, l2; if (cannot_use_txpkts(m) || cannot_use_txpkts(n)) return (1); nsegs1 = mbuf_nsegs(m); nsegs2 = mbuf_nsegs(n); if (nsegs1 + nsegs2 == 2) { txp->wr_type = 1; l1 = l2 = txpkts1_len16(); } else { txp->wr_type = 0; l1 = txpkts0_len16(nsegs1); l2 = txpkts0_len16(nsegs2); } txp->len16 = howmany(sizeof(struct fw_eth_tx_pkts_wr), 16) + l1 + l2; needed = howmany(txp->len16, EQ_ESIZE / 16); if (needed > SGE_MAX_WR_NDESC || needed > available) return (1); txp->plen = m->m_pkthdr.len + n->m_pkthdr.len; if (txp->plen > 65535) return (1); txp->npkt = 2; set_mbuf_len16(m, l1); set_mbuf_len16(n, l2); return (0); } static int add_to_txpkts(struct mbuf *m, struct txpkts *txp, u_int available) { u_int plen, len16, needed, nsegs; MPASS(txp->wr_type == 0 || txp->wr_type == 1); nsegs = mbuf_nsegs(m); if (needs_tso(m) || (txp->wr_type == 1 && nsegs != 1)) return (1); plen = txp->plen + m->m_pkthdr.len; if (plen > 65535) return (1); if (txp->wr_type == 0) len16 = txpkts0_len16(nsegs); else len16 = txpkts1_len16(); needed = howmany(txp->len16 + len16, EQ_ESIZE / 16); if (needed > SGE_MAX_WR_NDESC || needed > available) return (1); txp->npkt++; txp->plen = plen; txp->len16 += len16; set_mbuf_len16(m, len16); return (0); } /* * Write a txpkts WR for the packets in txp to the hardware descriptors, update * the software descriptor, and advance the pidx. It is guaranteed that enough * descriptors are available. * * The return value is the # of hardware descriptors used. */ static u_int write_txpkts_wr(struct sge_txq *txq, struct fw_eth_tx_pkts_wr *wr, struct mbuf *m0, const struct txpkts *txp, u_int available) { struct sge_eq *eq = &txq->eq; struct tx_sdesc *txsd; struct cpl_tx_pkt_core *cpl; uint32_t ctrl; uint64_t ctrl1; int ndesc, checkwrap; struct mbuf *m; void *flitp; TXQ_LOCK_ASSERT_OWNED(txq); MPASS(txp->npkt > 0); MPASS(txp->plen < 65536); MPASS(m0 != NULL); MPASS(m0->m_nextpkt != NULL); MPASS(txp->len16 <= howmany(SGE_MAX_WR_LEN, 16)); MPASS(available > 0 && available < eq->sidx); ndesc = howmany(txp->len16, EQ_ESIZE / 16); MPASS(ndesc <= available); MPASS(wr == (void *)&eq->desc[eq->pidx]); wr->op_pkd = htobe32(V_FW_WR_OP(FW_ETH_TX_PKTS_WR)); ctrl = V_FW_WR_LEN16(txp->len16); wr->equiq_to_len16 = htobe32(ctrl); wr->plen = htobe16(txp->plen); wr->npkt = txp->npkt; wr->r3 = 0; wr->type = txp->wr_type; flitp = wr + 1; /* * At this point we are 16B into a hardware descriptor. If checkwrap is * set then we know the WR is going to wrap around somewhere. We'll * check for that at appropriate points. */ checkwrap = eq->sidx - ndesc < eq->pidx; for (m = m0; m != NULL; m = m->m_nextpkt) { if (txp->wr_type == 0) { struct ulp_txpkt *ulpmc; struct ulptx_idata *ulpsc; /* ULP master command */ ulpmc = flitp; ulpmc->cmd_dest = htobe32(V_ULPTX_CMD(ULP_TX_PKT) | V_ULP_TXPKT_DEST(0) | V_ULP_TXPKT_FID(eq->iqid)); ulpmc->len = htobe32(mbuf_len16(m)); /* ULP subcommand */ ulpsc = (void *)(ulpmc + 1); ulpsc->cmd_more = htobe32(V_ULPTX_CMD(ULP_TX_SC_IMM) | F_ULP_TX_SC_MORE); ulpsc->len = htobe32(sizeof(struct cpl_tx_pkt_core)); cpl = (void *)(ulpsc + 1); if (checkwrap && (uintptr_t)cpl == (uintptr_t)&eq->desc[eq->sidx]) cpl = (void *)&eq->desc[0]; txq->txpkts0_pkts += txp->npkt; txq->txpkts0_wrs++; } else { cpl = flitp; txq->txpkts1_pkts += txp->npkt; txq->txpkts1_wrs++; } /* Checksum offload */ ctrl1 = 0; if (needs_l3_csum(m) == 0) ctrl1 |= F_TXPKT_IPCSUM_DIS; if (needs_l4_csum(m) == 0) ctrl1 |= F_TXPKT_L4CSUM_DIS; if (m->m_pkthdr.csum_flags & (CSUM_IP | CSUM_TCP | CSUM_UDP | CSUM_UDP_IPV6 | CSUM_TCP_IPV6 | CSUM_TSO)) txq->txcsum++; /* some hardware assistance provided */ /* VLAN tag insertion */ if (needs_vlan_insertion(m)) { ctrl1 |= F_TXPKT_VLAN_VLD | V_TXPKT_VLAN(m->m_pkthdr.ether_vtag); txq->vlan_insertion++; } /* CPL header */ cpl->ctrl0 = txq->cpl_ctrl0; cpl->pack = 0; cpl->len = htobe16(m->m_pkthdr.len); cpl->ctrl1 = htobe64(ctrl1); flitp = cpl + 1; if (checkwrap && (uintptr_t)flitp == (uintptr_t)&eq->desc[eq->sidx]) flitp = (void *)&eq->desc[0]; write_gl_to_txd(txq, m, (caddr_t *)(&flitp), checkwrap); } txsd = &txq->sdesc[eq->pidx]; txsd->m = m0; txsd->desc_used = ndesc; return (ndesc); } /* * If the SGL ends on an address that is not 16 byte aligned, this function will * add a 0 filled flit at the end. */ static void write_gl_to_txd(struct sge_txq *txq, struct mbuf *m, caddr_t *to, int checkwrap) { struct sge_eq *eq = &txq->eq; struct sglist *gl = txq->gl; struct sglist_seg *seg; __be64 *flitp, *wrap; struct ulptx_sgl *usgl; int i, nflits, nsegs; KASSERT(((uintptr_t)(*to) & 0xf) == 0, ("%s: SGL must start at a 16 byte boundary: %p", __func__, *to)); MPASS((uintptr_t)(*to) >= (uintptr_t)&eq->desc[0]); MPASS((uintptr_t)(*to) < (uintptr_t)&eq->desc[eq->sidx]); get_pkt_gl(m, gl); nsegs = gl->sg_nseg; MPASS(nsegs > 0); nflits = (3 * (nsegs - 1)) / 2 + ((nsegs - 1) & 1) + 2; flitp = (__be64 *)(*to); wrap = (__be64 *)(&eq->desc[eq->sidx]); seg = &gl->sg_segs[0]; usgl = (void *)flitp; /* * We start at a 16 byte boundary somewhere inside the tx descriptor * ring, so we're at least 16 bytes away from the status page. There is * no chance of a wrap around in the middle of usgl (which is 16 bytes). */ usgl->cmd_nsge = htobe32(V_ULPTX_CMD(ULP_TX_SC_DSGL) | V_ULPTX_NSGE(nsegs)); usgl->len0 = htobe32(seg->ss_len); usgl->addr0 = htobe64(seg->ss_paddr); seg++; if (checkwrap == 0 || (uintptr_t)(flitp + nflits) <= (uintptr_t)wrap) { /* Won't wrap around at all */ for (i = 0; i < nsegs - 1; i++, seg++) { usgl->sge[i / 2].len[i & 1] = htobe32(seg->ss_len); usgl->sge[i / 2].addr[i & 1] = htobe64(seg->ss_paddr); } if (i & 1) usgl->sge[i / 2].len[1] = htobe32(0); flitp += nflits; } else { /* Will wrap somewhere in the rest of the SGL */ /* 2 flits already written, write the rest flit by flit */ flitp = (void *)(usgl + 1); for (i = 0; i < nflits - 2; i++) { if (flitp == wrap) flitp = (void *)eq->desc; *flitp++ = get_flit(seg, nsegs - 1, i); } } if (nflits & 1) { MPASS(((uintptr_t)flitp) & 0xf); *flitp++ = 0; } MPASS((((uintptr_t)flitp) & 0xf) == 0); if (__predict_false(flitp == wrap)) *to = (void *)eq->desc; else *to = (void *)flitp; } static inline void copy_to_txd(struct sge_eq *eq, caddr_t from, caddr_t *to, int len) { MPASS((uintptr_t)(*to) >= (uintptr_t)&eq->desc[0]); MPASS((uintptr_t)(*to) < (uintptr_t)&eq->desc[eq->sidx]); if (__predict_true((uintptr_t)(*to) + len <= (uintptr_t)&eq->desc[eq->sidx])) { bcopy(from, *to, len); (*to) += len; } else { int portion = (uintptr_t)&eq->desc[eq->sidx] - (uintptr_t)(*to); bcopy(from, *to, portion); from += portion; portion = len - portion; /* remaining */ bcopy(from, (void *)eq->desc, portion); (*to) = (caddr_t)eq->desc + portion; } } static inline void ring_eq_db(struct adapter *sc, struct sge_eq *eq, u_int n) { u_int db; MPASS(n > 0); db = eq->doorbells; if (n > 1) clrbit(&db, DOORBELL_WCWR); wmb(); switch (ffs(db) - 1) { case DOORBELL_UDB: *eq->udb = htole32(V_QID(eq->udb_qid) | V_PIDX(n)); break; case DOORBELL_WCWR: { volatile uint64_t *dst, *src; int i; /* * Queues whose 128B doorbell segment fits in the page do not * use relative qid (udb_qid is always 0). Only queues with * doorbell segments can do WCWR. */ KASSERT(eq->udb_qid == 0 && n == 1, ("%s: inappropriate doorbell (0x%x, %d, %d) for eq %p", __func__, eq->doorbells, n, eq->dbidx, eq)); dst = (volatile void *)((uintptr_t)eq->udb + UDBS_WR_OFFSET - UDBS_DB_OFFSET); i = eq->dbidx; src = (void *)&eq->desc[i]; while (src != (void *)&eq->desc[i + 1]) *dst++ = *src++; wmb(); break; } case DOORBELL_UDBWC: *eq->udb = htole32(V_QID(eq->udb_qid) | V_PIDX(n)); wmb(); break; case DOORBELL_KDB: t4_write_reg(sc, MYPF_REG(A_SGE_PF_KDOORBELL), V_QID(eq->cntxt_id) | V_PIDX(n)); break; } IDXINCR(eq->dbidx, n, eq->sidx); } static inline u_int reclaimable_tx_desc(struct sge_eq *eq) { uint16_t hw_cidx; hw_cidx = read_hw_cidx(eq); return (IDXDIFF(hw_cidx, eq->cidx, eq->sidx)); } static inline u_int total_available_tx_desc(struct sge_eq *eq) { uint16_t hw_cidx, pidx; hw_cidx = read_hw_cidx(eq); pidx = eq->pidx; if (pidx == hw_cidx) return (eq->sidx - 1); else return (IDXDIFF(hw_cidx, pidx, eq->sidx) - 1); } static inline uint16_t read_hw_cidx(struct sge_eq *eq) { struct sge_qstat *spg = (void *)&eq->desc[eq->sidx]; uint16_t cidx = spg->cidx; /* stable snapshot */ return (be16toh(cidx)); } /* * Reclaim 'n' descriptors approximately. */ static u_int reclaim_tx_descs(struct sge_txq *txq, u_int n) { struct tx_sdesc *txsd; struct sge_eq *eq = &txq->eq; u_int can_reclaim, reclaimed; TXQ_LOCK_ASSERT_OWNED(txq); MPASS(n > 0); reclaimed = 0; can_reclaim = reclaimable_tx_desc(eq); while (can_reclaim && reclaimed < n) { int ndesc; struct mbuf *m, *nextpkt; txsd = &txq->sdesc[eq->cidx]; ndesc = txsd->desc_used; /* Firmware doesn't return "partial" credits. */ KASSERT(can_reclaim >= ndesc, ("%s: unexpected number of credits: %d, %d", __func__, can_reclaim, ndesc)); for (m = txsd->m; m != NULL; m = nextpkt) { nextpkt = m->m_nextpkt; m->m_nextpkt = NULL; m_freem(m); } reclaimed += ndesc; can_reclaim -= ndesc; IDXINCR(eq->cidx, ndesc, eq->sidx); } return (reclaimed); } static void tx_reclaim(void *arg, int n) { struct sge_txq *txq = arg; struct sge_eq *eq = &txq->eq; do { if (TXQ_TRYLOCK(txq) == 0) break; n = reclaim_tx_descs(txq, 32); if (eq->cidx == eq->pidx) eq->equeqidx = eq->pidx; TXQ_UNLOCK(txq); } while (n > 0); } static __be64 get_flit(struct sglist_seg *segs, int nsegs, int idx) { int i = (idx / 3) * 2; switch (idx % 3) { case 0: { __be64 rc; rc = htobe32(segs[i].ss_len); if (i + 1 < nsegs) rc |= (uint64_t)htobe32(segs[i + 1].ss_len) << 32; return (rc); } case 1: return (htobe64(segs[i].ss_paddr)); case 2: return (htobe64(segs[i + 1].ss_paddr)); } return (0); } static void find_best_refill_source(struct adapter *sc, struct sge_fl *fl, int maxp) { int8_t zidx, hwidx, idx; uint16_t region1, region3; int spare, spare_needed, n; struct sw_zone_info *swz; struct hw_buf_info *hwb, *hwb_list = &sc->sge.hw_buf_info[0]; /* * Buffer Packing: Look for PAGE_SIZE or larger zone which has a bufsize * large enough for the max payload and cluster metadata. Otherwise * settle for the largest bufsize that leaves enough room in the cluster * for metadata. * * Without buffer packing: Look for the smallest zone which has a * bufsize large enough for the max payload. Settle for the largest * bufsize available if there's nothing big enough for max payload. */ spare_needed = fl->flags & FL_BUF_PACKING ? CL_METADATA_SIZE : 0; swz = &sc->sge.sw_zone_info[0]; hwidx = -1; for (zidx = 0; zidx < SW_ZONE_SIZES; zidx++, swz++) { if (swz->size > largest_rx_cluster) { if (__predict_true(hwidx != -1)) break; /* * This is a misconfiguration. largest_rx_cluster is * preventing us from finding a refill source. See * dev.t5nex..buffer_sizes to figure out why. */ device_printf(sc->dev, "largest_rx_cluster=%u leaves no" " refill source for fl %p (dma %u). Ignored.\n", largest_rx_cluster, fl, maxp); } for (idx = swz->head_hwidx; idx != -1; idx = hwb->next) { hwb = &hwb_list[idx]; spare = swz->size - hwb->size; if (spare < spare_needed) continue; hwidx = idx; /* best option so far */ if (hwb->size >= maxp) { if ((fl->flags & FL_BUF_PACKING) == 0) goto done; /* stop looking (not packing) */ if (swz->size >= safest_rx_cluster) goto done; /* stop looking (packing) */ } break; /* keep looking, next zone */ } } done: /* A usable hwidx has been located. */ MPASS(hwidx != -1); hwb = &hwb_list[hwidx]; zidx = hwb->zidx; swz = &sc->sge.sw_zone_info[zidx]; region1 = 0; region3 = swz->size - hwb->size; /* * Stay within this zone and see if there is a better match when mbuf * inlining is allowed. Remember that the hwidx's are sorted in * decreasing order of size (so in increasing order of spare area). */ for (idx = hwidx; idx != -1; idx = hwb->next) { hwb = &hwb_list[idx]; spare = swz->size - hwb->size; if (allow_mbufs_in_cluster == 0 || hwb->size < maxp) break; /* * Do not inline mbufs if doing so would violate the pad/pack * boundary alignment requirement. */ if (fl_pad && (MSIZE % sc->params.sge.pad_boundary) != 0) continue; if (fl->flags & FL_BUF_PACKING && (MSIZE % sc->params.sge.pack_boundary) != 0) continue; if (spare < CL_METADATA_SIZE + MSIZE) continue; n = (spare - CL_METADATA_SIZE) / MSIZE; if (n > howmany(hwb->size, maxp)) break; hwidx = idx; if (fl->flags & FL_BUF_PACKING) { region1 = n * MSIZE; region3 = spare - region1; } else { region1 = MSIZE; region3 = spare - region1; break; } } KASSERT(zidx >= 0 && zidx < SW_ZONE_SIZES, ("%s: bad zone %d for fl %p, maxp %d", __func__, zidx, fl, maxp)); KASSERT(hwidx >= 0 && hwidx <= SGE_FLBUF_SIZES, ("%s: bad hwidx %d for fl %p, maxp %d", __func__, hwidx, fl, maxp)); KASSERT(region1 + sc->sge.hw_buf_info[hwidx].size + region3 == sc->sge.sw_zone_info[zidx].size, ("%s: bad buffer layout for fl %p, maxp %d. " "cl %d; r1 %d, payload %d, r3 %d", __func__, fl, maxp, sc->sge.sw_zone_info[zidx].size, region1, sc->sge.hw_buf_info[hwidx].size, region3)); if (fl->flags & FL_BUF_PACKING || region1 > 0) { KASSERT(region3 >= CL_METADATA_SIZE, ("%s: no room for metadata. fl %p, maxp %d; " "cl %d; r1 %d, payload %d, r3 %d", __func__, fl, maxp, sc->sge.sw_zone_info[zidx].size, region1, sc->sge.hw_buf_info[hwidx].size, region3)); KASSERT(region1 % MSIZE == 0, ("%s: bad mbuf region for fl %p, maxp %d. " "cl %d; r1 %d, payload %d, r3 %d", __func__, fl, maxp, sc->sge.sw_zone_info[zidx].size, region1, sc->sge.hw_buf_info[hwidx].size, region3)); } fl->cll_def.zidx = zidx; fl->cll_def.hwidx = hwidx; fl->cll_def.region1 = region1; fl->cll_def.region3 = region3; } static void find_safe_refill_source(struct adapter *sc, struct sge_fl *fl) { struct sge *s = &sc->sge; struct hw_buf_info *hwb; struct sw_zone_info *swz; int spare; int8_t hwidx; if (fl->flags & FL_BUF_PACKING) hwidx = s->safe_hwidx2; /* with room for metadata */ else if (allow_mbufs_in_cluster && s->safe_hwidx2 != -1) { hwidx = s->safe_hwidx2; hwb = &s->hw_buf_info[hwidx]; swz = &s->sw_zone_info[hwb->zidx]; spare = swz->size - hwb->size; /* no good if there isn't room for an mbuf as well */ if (spare < CL_METADATA_SIZE + MSIZE) hwidx = s->safe_hwidx1; } else hwidx = s->safe_hwidx1; if (hwidx == -1) { /* No fallback source */ fl->cll_alt.hwidx = -1; fl->cll_alt.zidx = -1; return; } hwb = &s->hw_buf_info[hwidx]; swz = &s->sw_zone_info[hwb->zidx]; spare = swz->size - hwb->size; fl->cll_alt.hwidx = hwidx; fl->cll_alt.zidx = hwb->zidx; if (allow_mbufs_in_cluster && (fl_pad == 0 || (MSIZE % sc->params.sge.pad_boundary) == 0)) fl->cll_alt.region1 = ((spare - CL_METADATA_SIZE) / MSIZE) * MSIZE; else fl->cll_alt.region1 = 0; fl->cll_alt.region3 = spare - fl->cll_alt.region1; } static void add_fl_to_sfl(struct adapter *sc, struct sge_fl *fl) { mtx_lock(&sc->sfl_lock); FL_LOCK(fl); if ((fl->flags & FL_DOOMED) == 0) { fl->flags |= FL_STARVING; TAILQ_INSERT_TAIL(&sc->sfl, fl, link); callout_reset(&sc->sfl_callout, hz / 5, refill_sfl, sc); } FL_UNLOCK(fl); mtx_unlock(&sc->sfl_lock); } static void handle_wrq_egr_update(struct adapter *sc, struct sge_eq *eq) { struct sge_wrq *wrq = (void *)eq; atomic_readandclear_int(&eq->equiq); taskqueue_enqueue(sc->tq[eq->tx_chan], &wrq->wrq_tx_task); } static void handle_eth_egr_update(struct adapter *sc, struct sge_eq *eq) { struct sge_txq *txq = (void *)eq; MPASS((eq->flags & EQ_TYPEMASK) == EQ_ETH); atomic_readandclear_int(&eq->equiq); mp_ring_check_drainage(txq->r, 0); taskqueue_enqueue(sc->tq[eq->tx_chan], &txq->tx_reclaim_task); } static int handle_sge_egr_update(struct sge_iq *iq, const struct rss_header *rss, struct mbuf *m) { const struct cpl_sge_egr_update *cpl = (const void *)(rss + 1); unsigned int qid = G_EGR_QID(ntohl(cpl->opcode_qid)); struct adapter *sc = iq->adapter; struct sge *s = &sc->sge; struct sge_eq *eq; static void (*h[])(struct adapter *, struct sge_eq *) = {NULL, &handle_wrq_egr_update, &handle_eth_egr_update, &handle_wrq_egr_update}; KASSERT(m == NULL, ("%s: payload with opcode %02x", __func__, rss->opcode)); eq = s->eqmap[qid - s->eq_start]; (*h[eq->flags & EQ_TYPEMASK])(sc, eq); return (0); } /* handle_fw_msg works for both fw4_msg and fw6_msg because this is valid */ CTASSERT(offsetof(struct cpl_fw4_msg, data) == \ offsetof(struct cpl_fw6_msg, data)); static int handle_fw_msg(struct sge_iq *iq, const struct rss_header *rss, struct mbuf *m) { struct adapter *sc = iq->adapter; const struct cpl_fw6_msg *cpl = (const void *)(rss + 1); KASSERT(m == NULL, ("%s: payload with opcode %02x", __func__, rss->opcode)); if (cpl->type == FW_TYPE_RSSCPL || cpl->type == FW6_TYPE_RSSCPL) { const struct rss_header *rss2; rss2 = (const struct rss_header *)&cpl->data[0]; return (t4_cpl_handler[rss2->opcode](iq, rss2, m)); } return (t4_fw_msg_handler[cpl->type](sc, &cpl->data[0])); +} + +/** + * t4_handle_wrerr_rpl - process a FW work request error message + * @adap: the adapter + * @rpl: start of the FW message + */ +static int +t4_handle_wrerr_rpl(struct adapter *adap, const __be64 *rpl) +{ + u8 opcode = *(const u8 *)rpl; + const struct fw_error_cmd *e = (const void *)rpl; + unsigned int i; + + if (opcode != FW_ERROR_CMD) { + log(LOG_ERR, + "%s: Received WRERR_RPL message with opcode %#x\n", + device_get_nameunit(adap->dev), opcode); + return (EINVAL); + } + log(LOG_ERR, "%s: FW_ERROR (%s) ", device_get_nameunit(adap->dev), + G_FW_ERROR_CMD_FATAL(be32toh(e->op_to_type)) ? "fatal" : + "non-fatal"); + switch (G_FW_ERROR_CMD_TYPE(be32toh(e->op_to_type))) { + case FW_ERROR_TYPE_EXCEPTION: + log(LOG_ERR, "exception info:\n"); + for (i = 0; i < nitems(e->u.exception.info); i++) + log(LOG_ERR, "%s%08x", i == 0 ? "\t" : " ", + be32toh(e->u.exception.info[i])); + log(LOG_ERR, "\n"); + break; + case FW_ERROR_TYPE_HWMODULE: + log(LOG_ERR, "HW module regaddr %08x regval %08x\n", + be32toh(e->u.hwmodule.regaddr), + be32toh(e->u.hwmodule.regval)); + break; + case FW_ERROR_TYPE_WR: + log(LOG_ERR, "WR cidx %d PF %d VF %d eqid %d hdr:\n", + be16toh(e->u.wr.cidx), + G_FW_ERROR_CMD_PFN(be16toh(e->u.wr.pfn_vfn)), + G_FW_ERROR_CMD_VFN(be16toh(e->u.wr.pfn_vfn)), + be32toh(e->u.wr.eqid)); + for (i = 0; i < nitems(e->u.wr.wrhdr); i++) + log(LOG_ERR, "%s%02x", i == 0 ? "\t" : " ", + e->u.wr.wrhdr[i]); + log(LOG_ERR, "\n"); + break; + case FW_ERROR_TYPE_ACL: + log(LOG_ERR, "ACL cidx %d PF %d VF %d eqid %d %s", + be16toh(e->u.acl.cidx), + G_FW_ERROR_CMD_PFN(be16toh(e->u.acl.pfn_vfn)), + G_FW_ERROR_CMD_VFN(be16toh(e->u.acl.pfn_vfn)), + be32toh(e->u.acl.eqid), + G_FW_ERROR_CMD_MV(be16toh(e->u.acl.mv_pkd)) ? "vlanid" : + "MAC"); + for (i = 0; i < nitems(e->u.acl.val); i++) + log(LOG_ERR, " %02x", e->u.acl.val[i]); + log(LOG_ERR, "\n"); + break; + default: + log(LOG_ERR, "type %#x\n", + G_FW_ERROR_CMD_TYPE(be32toh(e->op_to_type))); + return (EINVAL); + } + return (0); } static int sysctl_uint16(SYSCTL_HANDLER_ARGS) { uint16_t *id = arg1; int i = *id; return sysctl_handle_int(oidp, &i, 0, req); } static int sysctl_bufsizes(SYSCTL_HANDLER_ARGS) { struct sge *s = arg1; struct hw_buf_info *hwb = &s->hw_buf_info[0]; struct sw_zone_info *swz = &s->sw_zone_info[0]; int i, rc; struct sbuf sb; char c; sbuf_new(&sb, NULL, 32, SBUF_AUTOEXTEND); for (i = 0; i < SGE_FLBUF_SIZES; i++, hwb++) { if (hwb->zidx >= 0 && swz[hwb->zidx].size <= largest_rx_cluster) c = '*'; else c = '\0'; sbuf_printf(&sb, "%u%c ", hwb->size, c); } sbuf_trim(&sb); sbuf_finish(&sb); rc = sysctl_handle_string(oidp, sbuf_data(&sb), sbuf_len(&sb), req); sbuf_delete(&sb); return (rc); } static int sysctl_tc(SYSCTL_HANDLER_ARGS) { struct vi_info *vi = arg1; struct port_info *pi; struct adapter *sc; struct sge_txq *txq; struct tx_sched_class *tc; int qidx = arg2, rc, tc_idx; uint32_t fw_queue, fw_class; MPASS(qidx >= 0 && qidx < vi->ntxq); pi = vi->pi; sc = pi->adapter; txq = &sc->sge.txq[vi->first_txq + qidx]; tc_idx = txq->tc_idx; rc = sysctl_handle_int(oidp, &tc_idx, 0, req); if (rc != 0 || req->newptr == NULL) return (rc); /* Note that -1 is legitimate input (it means unbind). */ if (tc_idx < -1 || tc_idx >= sc->chip_params->nsched_cls) return (EINVAL); rc = begin_synchronized_op(sc, vi, SLEEP_OK | INTR_OK, "t4stc"); if (rc) return (rc); if (tc_idx == txq->tc_idx) { rc = 0; /* No change, nothing to do. */ goto done; } fw_queue = V_FW_PARAMS_MNEM(FW_PARAMS_MNEM_DMAQ) | V_FW_PARAMS_PARAM_X(FW_PARAMS_PARAM_DMAQ_EQ_SCHEDCLASS_ETH) | V_FW_PARAMS_PARAM_YZ(txq->eq.cntxt_id); if (tc_idx == -1) fw_class = 0xffffffff; /* Unbind. */ else { /* * Bind to a different class. Ethernet txq's are only allowed * to bind to cl-rl mode-class for now. XXX: too restrictive. */ tc = &pi->tc[tc_idx]; if (tc->flags & TX_SC_OK && tc->params.level == SCHED_CLASS_LEVEL_CL_RL && tc->params.mode == SCHED_CLASS_MODE_CLASS) { /* Ok to proceed. */ fw_class = tc_idx; } else { rc = tc->flags & TX_SC_OK ? EBUSY : ENXIO; goto done; } } rc = -t4_set_params(sc, sc->mbox, sc->pf, 0, 1, &fw_queue, &fw_class); if (rc == 0) { if (txq->tc_idx != -1) { tc = &pi->tc[txq->tc_idx]; MPASS(tc->refcount > 0); tc->refcount--; } if (tc_idx != -1) { tc = &pi->tc[tc_idx]; tc->refcount++; } txq->tc_idx = tc_idx; } done: end_synchronized_op(sc, 0); return (rc); } Index: user/alc/PQ_LAUNDRY/sys/dev/e1000/if_igb.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/e1000/if_igb.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/e1000/if_igb.c (revision 303206) @@ -1,6441 +1,6441 @@ /****************************************************************************** Copyright (c) 2001-2015, Intel Corporation All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. Neither the name of the Intel Corporation nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ******************************************************************************/ /*$FreeBSD$*/ #include "opt_inet.h" #include "opt_inet6.h" #include "opt_rss.h" #ifdef HAVE_KERNEL_OPTION_HEADERS #include "opt_device_polling.h" #include "opt_altq.h" #endif #include "if_igb.h" /********************************************************************* * Driver version: *********************************************************************/ char igb_driver_version[] = "2.5.3-k"; /********************************************************************* * PCI Device ID Table * * Used by probe to select devices to load on * Last field stores an index into e1000_strings * Last entry must be all 0s * * { Vendor ID, Device ID, SubVendor ID, SubDevice ID, String Index } *********************************************************************/ static igb_vendor_info_t igb_vendor_info_array[] = { {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82575EB_COPPER, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82575EB_FIBER_SERDES, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82575GB_QUAD_COPPER, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82576, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82576_NS, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82576_NS_SERDES, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82576_FIBER, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82576_SERDES, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82576_SERDES_QUAD, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82576_QUAD_COPPER, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82576_QUAD_COPPER_ET2, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82576_VF, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82580_COPPER, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82580_FIBER, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82580_SERDES, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82580_SGMII, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82580_COPPER_DUAL, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_82580_QUAD_FIBER, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_DH89XXCC_SERDES, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_DH89XXCC_SGMII, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_DH89XXCC_SFP, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_DH89XXCC_BACKPLANE, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I350_COPPER, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I350_FIBER, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I350_SERDES, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I350_SGMII, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I350_VF, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I210_COPPER, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I210_COPPER_IT, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I210_COPPER_OEM1, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I210_COPPER_FLASHLESS, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I210_SERDES_FLASHLESS, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I210_FIBER, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I210_SERDES, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I210_SGMII, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I211_COPPER, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I354_BACKPLANE_1GBPS, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I354_BACKPLANE_2_5GBPS, 0, 0, 0}, {IGB_INTEL_VENDOR_ID, E1000_DEV_ID_I354_SGMII, 0, 0, 0}, /* required last entry */ {0, 0, 0, 0, 0} }; /********************************************************************* * Table of branding strings for all supported NICs. *********************************************************************/ static char *igb_strings[] = { "Intel(R) PRO/1000 Network Connection" }; /********************************************************************* * Function prototypes *********************************************************************/ static int igb_probe(device_t); static int igb_attach(device_t); static int igb_detach(device_t); static int igb_shutdown(device_t); static int igb_suspend(device_t); static int igb_resume(device_t); #ifndef IGB_LEGACY_TX static int igb_mq_start(struct ifnet *, struct mbuf *); static int igb_mq_start_locked(struct ifnet *, struct tx_ring *); static void igb_qflush(struct ifnet *); static void igb_deferred_mq_start(void *, int); #else static void igb_start(struct ifnet *); static void igb_start_locked(struct tx_ring *, struct ifnet *ifp); #endif static int igb_ioctl(struct ifnet *, u_long, caddr_t); static uint64_t igb_get_counter(if_t, ift_counter); static void igb_init(void *); static void igb_init_locked(struct adapter *); static void igb_stop(void *); static void igb_media_status(struct ifnet *, struct ifmediareq *); static int igb_media_change(struct ifnet *); static void igb_identify_hardware(struct adapter *); static int igb_allocate_pci_resources(struct adapter *); static int igb_allocate_msix(struct adapter *); static int igb_allocate_legacy(struct adapter *); static int igb_setup_msix(struct adapter *); static void igb_free_pci_resources(struct adapter *); static void igb_local_timer(void *); static void igb_reset(struct adapter *); static int igb_setup_interface(device_t, struct adapter *); static int igb_allocate_queues(struct adapter *); static void igb_configure_queues(struct adapter *); static int igb_allocate_transmit_buffers(struct tx_ring *); static void igb_setup_transmit_structures(struct adapter *); static void igb_setup_transmit_ring(struct tx_ring *); static void igb_initialize_transmit_units(struct adapter *); static void igb_free_transmit_structures(struct adapter *); static void igb_free_transmit_buffers(struct tx_ring *); static int igb_allocate_receive_buffers(struct rx_ring *); static int igb_setup_receive_structures(struct adapter *); static int igb_setup_receive_ring(struct rx_ring *); static void igb_initialize_receive_units(struct adapter *); static void igb_free_receive_structures(struct adapter *); static void igb_free_receive_buffers(struct rx_ring *); static void igb_free_receive_ring(struct rx_ring *); static void igb_enable_intr(struct adapter *); static void igb_disable_intr(struct adapter *); static void igb_update_stats_counters(struct adapter *); static bool igb_txeof(struct tx_ring *); static __inline void igb_rx_discard(struct rx_ring *, int); static __inline void igb_rx_input(struct rx_ring *, struct ifnet *, struct mbuf *, u32); static bool igb_rxeof(struct igb_queue *, int, int *); static void igb_rx_checksum(u32, struct mbuf *, u32); static int igb_tx_ctx_setup(struct tx_ring *, struct mbuf *, u32 *, u32 *); static int igb_tso_setup(struct tx_ring *, struct mbuf *, u32 *, u32 *); static void igb_set_promisc(struct adapter *); static void igb_disable_promisc(struct adapter *); static void igb_set_multi(struct adapter *); static void igb_update_link_status(struct adapter *); static void igb_refresh_mbufs(struct rx_ring *, int); static void igb_register_vlan(void *, struct ifnet *, u16); static void igb_unregister_vlan(void *, struct ifnet *, u16); static void igb_setup_vlan_hw_support(struct adapter *); static int igb_xmit(struct tx_ring *, struct mbuf **); static int igb_dma_malloc(struct adapter *, bus_size_t, struct igb_dma_alloc *, int); static void igb_dma_free(struct adapter *, struct igb_dma_alloc *); static int igb_sysctl_nvm_info(SYSCTL_HANDLER_ARGS); static void igb_print_nvm_info(struct adapter *); static int igb_is_valid_ether_addr(u8 *); static void igb_add_hw_stats(struct adapter *); static void igb_vf_init_stats(struct adapter *); static void igb_update_vf_stats_counters(struct adapter *); /* Management and WOL Support */ static void igb_init_manageability(struct adapter *); static void igb_release_manageability(struct adapter *); static void igb_get_hw_control(struct adapter *); static void igb_release_hw_control(struct adapter *); static void igb_enable_wakeup(device_t); static void igb_led_func(void *, int); static int igb_irq_fast(void *); static void igb_msix_que(void *); static void igb_msix_link(void *); static void igb_handle_que(void *context, int pending); static void igb_handle_link(void *context, int pending); static void igb_handle_link_locked(struct adapter *); static void igb_set_sysctl_value(struct adapter *, const char *, const char *, int *, int); static int igb_set_flowcntl(SYSCTL_HANDLER_ARGS); static int igb_sysctl_dmac(SYSCTL_HANDLER_ARGS); static int igb_sysctl_eee(SYSCTL_HANDLER_ARGS); #ifdef DEVICE_POLLING static poll_handler_t igb_poll; #endif /* POLLING */ /********************************************************************* * FreeBSD Device Interface Entry Points *********************************************************************/ static device_method_t igb_methods[] = { /* Device interface */ DEVMETHOD(device_probe, igb_probe), DEVMETHOD(device_attach, igb_attach), DEVMETHOD(device_detach, igb_detach), DEVMETHOD(device_shutdown, igb_shutdown), DEVMETHOD(device_suspend, igb_suspend), DEVMETHOD(device_resume, igb_resume), DEVMETHOD_END }; static driver_t igb_driver = { "igb", igb_methods, sizeof(struct adapter), }; static devclass_t igb_devclass; DRIVER_MODULE(igb, pci, igb_driver, igb_devclass, 0, 0); MODULE_DEPEND(igb, pci, 1, 1, 1); MODULE_DEPEND(igb, ether, 1, 1, 1); #ifdef DEV_NETMAP MODULE_DEPEND(igb, netmap, 1, 1, 1); #endif /* DEV_NETMAP */ /********************************************************************* * Tunable default values. *********************************************************************/ static SYSCTL_NODE(_hw, OID_AUTO, igb, CTLFLAG_RD, 0, "IGB driver parameters"); /* Descriptor defaults */ static int igb_rxd = IGB_DEFAULT_RXD; static int igb_txd = IGB_DEFAULT_TXD; SYSCTL_INT(_hw_igb, OID_AUTO, rxd, CTLFLAG_RDTUN, &igb_rxd, 0, "Number of receive descriptors per queue"); SYSCTL_INT(_hw_igb, OID_AUTO, txd, CTLFLAG_RDTUN, &igb_txd, 0, "Number of transmit descriptors per queue"); /* ** AIM: Adaptive Interrupt Moderation ** which means that the interrupt rate ** is varied over time based on the ** traffic for that interrupt vector */ static int igb_enable_aim = TRUE; SYSCTL_INT(_hw_igb, OID_AUTO, enable_aim, CTLFLAG_RWTUN, &igb_enable_aim, 0, "Enable adaptive interrupt moderation"); /* * MSIX should be the default for best performance, * but this allows it to be forced off for testing. */ static int igb_enable_msix = 1; SYSCTL_INT(_hw_igb, OID_AUTO, enable_msix, CTLFLAG_RDTUN, &igb_enable_msix, 0, "Enable MSI-X interrupts"); /* ** Tuneable Interrupt rate */ static int igb_max_interrupt_rate = 8000; SYSCTL_INT(_hw_igb, OID_AUTO, max_interrupt_rate, CTLFLAG_RDTUN, &igb_max_interrupt_rate, 0, "Maximum interrupts per second"); #ifndef IGB_LEGACY_TX /* ** Tuneable number of buffers in the buf-ring (drbr_xxx) */ static int igb_buf_ring_size = IGB_BR_SIZE; SYSCTL_INT(_hw_igb, OID_AUTO, buf_ring_size, CTLFLAG_RDTUN, &igb_buf_ring_size, 0, "Size of the bufring"); #endif /* ** Header split causes the packet header to ** be dma'd to a separate mbuf from the payload. ** this can have memory alignment benefits. But ** another plus is that small packets often fit ** into the header and thus use no cluster. Its ** a very workload dependent type feature. */ static int igb_header_split = FALSE; SYSCTL_INT(_hw_igb, OID_AUTO, header_split, CTLFLAG_RDTUN, &igb_header_split, 0, "Enable receive mbuf header split"); /* ** This will autoconfigure based on the ** number of CPUs and max supported ** MSIX messages if left at 0. */ static int igb_num_queues = 0; SYSCTL_INT(_hw_igb, OID_AUTO, num_queues, CTLFLAG_RDTUN, &igb_num_queues, 0, "Number of queues to configure, 0 indicates autoconfigure"); /* ** Global variable to store last used CPU when binding queues ** to CPUs in igb_allocate_msix. Starts at CPU_FIRST and increments when a ** queue is bound to a cpu. */ static int igb_last_bind_cpu = -1; /* How many packets rxeof tries to clean at a time */ static int igb_rx_process_limit = 100; SYSCTL_INT(_hw_igb, OID_AUTO, rx_process_limit, CTLFLAG_RDTUN, &igb_rx_process_limit, 0, "Maximum number of received packets to process at a time, -1 means unlimited"); /* How many packets txeof tries to clean at a time */ static int igb_tx_process_limit = -1; SYSCTL_INT(_hw_igb, OID_AUTO, tx_process_limit, CTLFLAG_RDTUN, &igb_tx_process_limit, 0, "Maximum number of sent packets to process at a time, -1 means unlimited"); #ifdef DEV_NETMAP /* see ixgbe.c for details */ #include #endif /* DEV_NETMAP */ /********************************************************************* * Device identification routine * * igb_probe determines if the driver should be loaded on * adapter based on PCI vendor/device id of the adapter. * * return BUS_PROBE_DEFAULT on success, positive on failure *********************************************************************/ static int igb_probe(device_t dev) { char adapter_name[256]; uint16_t pci_vendor_id = 0; uint16_t pci_device_id = 0; uint16_t pci_subvendor_id = 0; uint16_t pci_subdevice_id = 0; igb_vendor_info_t *ent; INIT_DEBUGOUT("igb_probe: begin"); pci_vendor_id = pci_get_vendor(dev); if (pci_vendor_id != IGB_INTEL_VENDOR_ID) return (ENXIO); pci_device_id = pci_get_device(dev); pci_subvendor_id = pci_get_subvendor(dev); pci_subdevice_id = pci_get_subdevice(dev); ent = igb_vendor_info_array; while (ent->vendor_id != 0) { if ((pci_vendor_id == ent->vendor_id) && (pci_device_id == ent->device_id) && ((pci_subvendor_id == ent->subvendor_id) || (ent->subvendor_id == 0)) && ((pci_subdevice_id == ent->subdevice_id) || (ent->subdevice_id == 0))) { sprintf(adapter_name, "%s, Version - %s", igb_strings[ent->index], igb_driver_version); device_set_desc_copy(dev, adapter_name); return (BUS_PROBE_DEFAULT); } ent++; } return (ENXIO); } /********************************************************************* * Device initialization routine * * The attach entry point is called when the driver is being loaded. * This routine identifies the type of hardware, allocates all resources * and initializes the hardware. * * return 0 on success, positive on failure *********************************************************************/ static int igb_attach(device_t dev) { struct adapter *adapter; int error = 0; u16 eeprom_data; INIT_DEBUGOUT("igb_attach: begin"); if (resource_disabled("igb", device_get_unit(dev))) { device_printf(dev, "Disabled by device hint\n"); return (ENXIO); } adapter = device_get_softc(dev); adapter->dev = adapter->osdep.dev = dev; IGB_CORE_LOCK_INIT(adapter, device_get_nameunit(dev)); /* SYSCTLs */ SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "nvm", CTLTYPE_INT|CTLFLAG_RW, adapter, 0, igb_sysctl_nvm_info, "I", "NVM Information"); igb_set_sysctl_value(adapter, "enable_aim", "Interrupt Moderation", &adapter->enable_aim, igb_enable_aim); SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "fc", CTLTYPE_INT|CTLFLAG_RW, adapter, 0, igb_set_flowcntl, "I", "Flow Control"); callout_init_mtx(&adapter->timer, &adapter->core_mtx, 0); /* Determine hardware and mac info */ igb_identify_hardware(adapter); /* Setup PCI resources */ if (igb_allocate_pci_resources(adapter)) { device_printf(dev, "Allocation of PCI resources failed\n"); error = ENXIO; goto err_pci; } /* Do Shared Code initialization */ if (e1000_setup_init_funcs(&adapter->hw, TRUE)) { device_printf(dev, "Setup of Shared code failed\n"); error = ENXIO; goto err_pci; } e1000_get_bus_info(&adapter->hw); /* Sysctls for limiting the amount of work done in the taskqueues */ igb_set_sysctl_value(adapter, "rx_processing_limit", "max number of rx packets to process", &adapter->rx_process_limit, igb_rx_process_limit); igb_set_sysctl_value(adapter, "tx_processing_limit", "max number of tx packets to process", &adapter->tx_process_limit, igb_tx_process_limit); /* * Validate number of transmit and receive descriptors. It * must not exceed hardware maximum, and must be multiple * of E1000_DBA_ALIGN. */ if (((igb_txd * sizeof(struct e1000_tx_desc)) % IGB_DBA_ALIGN) != 0 || (igb_txd > IGB_MAX_TXD) || (igb_txd < IGB_MIN_TXD)) { device_printf(dev, "Using %d TX descriptors instead of %d!\n", IGB_DEFAULT_TXD, igb_txd); adapter->num_tx_desc = IGB_DEFAULT_TXD; } else adapter->num_tx_desc = igb_txd; if (((igb_rxd * sizeof(struct e1000_rx_desc)) % IGB_DBA_ALIGN) != 0 || (igb_rxd > IGB_MAX_RXD) || (igb_rxd < IGB_MIN_RXD)) { device_printf(dev, "Using %d RX descriptors instead of %d!\n", IGB_DEFAULT_RXD, igb_rxd); adapter->num_rx_desc = IGB_DEFAULT_RXD; } else adapter->num_rx_desc = igb_rxd; adapter->hw.mac.autoneg = DO_AUTO_NEG; adapter->hw.phy.autoneg_wait_to_complete = FALSE; adapter->hw.phy.autoneg_advertised = AUTONEG_ADV_DEFAULT; /* Copper options */ if (adapter->hw.phy.media_type == e1000_media_type_copper) { adapter->hw.phy.mdix = AUTO_ALL_MODES; adapter->hw.phy.disable_polarity_correction = FALSE; adapter->hw.phy.ms_type = IGB_MASTER_SLAVE; } /* * Set the frame limits assuming * standard ethernet sized frames. */ adapter->max_frame_size = ETHERMTU + ETHER_HDR_LEN + ETHERNET_FCS_SIZE; /* ** Allocate and Setup Queues */ if (igb_allocate_queues(adapter)) { error = ENOMEM; goto err_pci; } /* Allocate the appropriate stats memory */ if (adapter->vf_ifp) { adapter->stats = (struct e1000_vf_stats *)malloc(sizeof \ (struct e1000_vf_stats), M_DEVBUF, M_NOWAIT | M_ZERO); igb_vf_init_stats(adapter); } else adapter->stats = (struct e1000_hw_stats *)malloc(sizeof \ (struct e1000_hw_stats), M_DEVBUF, M_NOWAIT | M_ZERO); if (adapter->stats == NULL) { device_printf(dev, "Can not allocate stats memory\n"); error = ENOMEM; goto err_late; } /* Allocate multicast array memory. */ adapter->mta = malloc(sizeof(u8) * ETH_ADDR_LEN * MAX_NUM_MULTICAST_ADDRESSES, M_DEVBUF, M_NOWAIT); if (adapter->mta == NULL) { device_printf(dev, "Can not allocate multicast setup array\n"); error = ENOMEM; goto err_late; } /* Some adapter-specific advanced features */ if (adapter->hw.mac.type >= e1000_i350) { SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "dmac", CTLTYPE_INT|CTLFLAG_RW, adapter, 0, igb_sysctl_dmac, "I", "DMA Coalesce"); SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "eee_disabled", CTLTYPE_INT|CTLFLAG_RW, adapter, 0, igb_sysctl_eee, "I", "Disable Energy Efficient Ethernet"); if (adapter->hw.phy.media_type == e1000_media_type_copper) { if (adapter->hw.mac.type == e1000_i354) e1000_set_eee_i354(&adapter->hw, TRUE, TRUE); else e1000_set_eee_i350(&adapter->hw, TRUE, TRUE); } } /* ** Start from a known state, this is ** important in reading the nvm and ** mac from that. */ e1000_reset_hw(&adapter->hw); /* Make sure we have a good EEPROM before we read from it */ if (((adapter->hw.mac.type != e1000_i210) && (adapter->hw.mac.type != e1000_i211)) && (e1000_validate_nvm_checksum(&adapter->hw) < 0)) { /* ** Some PCI-E parts fail the first check due to ** the link being in sleep state, call it again, ** if it fails a second time its a real issue. */ if (e1000_validate_nvm_checksum(&adapter->hw) < 0) { device_printf(dev, "The EEPROM Checksum Is Not Valid\n"); error = EIO; goto err_late; } } /* ** Copy the permanent MAC address out of the EEPROM */ if (e1000_read_mac_addr(&adapter->hw) < 0) { device_printf(dev, "EEPROM read error while reading MAC" " address\n"); error = EIO; goto err_late; } /* Check its sanity */ if (!igb_is_valid_ether_addr(adapter->hw.mac.addr)) { device_printf(dev, "Invalid MAC address\n"); error = EIO; goto err_late; } /* Setup OS specific network interface */ if (igb_setup_interface(dev, adapter) != 0) goto err_late; /* Now get a good starting state */ igb_reset(adapter); /* Initialize statistics */ igb_update_stats_counters(adapter); adapter->hw.mac.get_link_status = 1; igb_update_link_status(adapter); /* Indicate SOL/IDER usage */ if (e1000_check_reset_block(&adapter->hw)) device_printf(dev, "PHY reset is blocked due to SOL/IDER session.\n"); /* Determine if we have to control management hardware */ adapter->has_manage = e1000_enable_mng_pass_thru(&adapter->hw); /* * Setup Wake-on-Lan */ /* APME bit in EEPROM is mapped to WUC.APME */ eeprom_data = E1000_READ_REG(&adapter->hw, E1000_WUC) & E1000_WUC_APME; if (eeprom_data) adapter->wol = E1000_WUFC_MAG; /* Register for VLAN events */ adapter->vlan_attach = EVENTHANDLER_REGISTER(vlan_config, igb_register_vlan, adapter, EVENTHANDLER_PRI_FIRST); adapter->vlan_detach = EVENTHANDLER_REGISTER(vlan_unconfig, igb_unregister_vlan, adapter, EVENTHANDLER_PRI_FIRST); igb_add_hw_stats(adapter); /* Tell the stack that the interface is not active */ adapter->ifp->if_drv_flags &= ~IFF_DRV_RUNNING; adapter->ifp->if_drv_flags |= IFF_DRV_OACTIVE; adapter->led_dev = led_create(igb_led_func, adapter, device_get_nameunit(dev)); /* ** Configure Interrupts */ if ((adapter->msix > 1) && (igb_enable_msix)) error = igb_allocate_msix(adapter); else /* MSI or Legacy */ error = igb_allocate_legacy(adapter); if (error) goto err_late; #ifdef DEV_NETMAP igb_netmap_attach(adapter); #endif /* DEV_NETMAP */ INIT_DEBUGOUT("igb_attach: end"); return (0); err_late: if (igb_detach(dev) == 0) /* igb_detach() already did the cleanup */ return(error); igb_free_transmit_structures(adapter); igb_free_receive_structures(adapter); igb_release_hw_control(adapter); err_pci: igb_free_pci_resources(adapter); if (adapter->ifp != NULL) if_free(adapter->ifp); free(adapter->mta, M_DEVBUF); IGB_CORE_LOCK_DESTROY(adapter); return (error); } /********************************************************************* * Device removal routine * * The detach entry point is called when the driver is being removed. * This routine stops the adapter and deallocates all the resources * that were allocated for driver operation. * * return 0 on success, positive on failure *********************************************************************/ static int igb_detach(device_t dev) { struct adapter *adapter = device_get_softc(dev); struct ifnet *ifp = adapter->ifp; INIT_DEBUGOUT("igb_detach: begin"); /* Make sure VLANS are not using driver */ if (adapter->ifp->if_vlantrunk != NULL) { device_printf(dev,"Vlan in use, detach first\n"); return (EBUSY); } ether_ifdetach(adapter->ifp); if (adapter->led_dev != NULL) led_destroy(adapter->led_dev); #ifdef DEVICE_POLLING if (ifp->if_capenable & IFCAP_POLLING) ether_poll_deregister(ifp); #endif IGB_CORE_LOCK(adapter); adapter->in_detach = 1; igb_stop(adapter); IGB_CORE_UNLOCK(adapter); e1000_phy_hw_reset(&adapter->hw); /* Give control back to firmware */ igb_release_manageability(adapter); igb_release_hw_control(adapter); if (adapter->wol) { E1000_WRITE_REG(&adapter->hw, E1000_WUC, E1000_WUC_PME_EN); E1000_WRITE_REG(&adapter->hw, E1000_WUFC, adapter->wol); igb_enable_wakeup(dev); } /* Unregister VLAN events */ if (adapter->vlan_attach != NULL) EVENTHANDLER_DEREGISTER(vlan_config, adapter->vlan_attach); if (adapter->vlan_detach != NULL) EVENTHANDLER_DEREGISTER(vlan_unconfig, adapter->vlan_detach); callout_drain(&adapter->timer); #ifdef DEV_NETMAP netmap_detach(adapter->ifp); #endif /* DEV_NETMAP */ igb_free_pci_resources(adapter); bus_generic_detach(dev); if_free(ifp); igb_free_transmit_structures(adapter); igb_free_receive_structures(adapter); if (adapter->mta != NULL) free(adapter->mta, M_DEVBUF); IGB_CORE_LOCK_DESTROY(adapter); return (0); } /********************************************************************* * * Shutdown entry point * **********************************************************************/ static int igb_shutdown(device_t dev) { return igb_suspend(dev); } /* * Suspend/resume device methods. */ static int igb_suspend(device_t dev) { struct adapter *adapter = device_get_softc(dev); IGB_CORE_LOCK(adapter); igb_stop(adapter); igb_release_manageability(adapter); igb_release_hw_control(adapter); if (adapter->wol) { E1000_WRITE_REG(&adapter->hw, E1000_WUC, E1000_WUC_PME_EN); E1000_WRITE_REG(&adapter->hw, E1000_WUFC, adapter->wol); igb_enable_wakeup(dev); } IGB_CORE_UNLOCK(adapter); return bus_generic_suspend(dev); } static int igb_resume(device_t dev) { struct adapter *adapter = device_get_softc(dev); struct tx_ring *txr = adapter->tx_rings; struct ifnet *ifp = adapter->ifp; IGB_CORE_LOCK(adapter); igb_init_locked(adapter); igb_init_manageability(adapter); if ((ifp->if_flags & IFF_UP) && (ifp->if_drv_flags & IFF_DRV_RUNNING) && adapter->link_active) { for (int i = 0; i < adapter->num_queues; i++, txr++) { IGB_TX_LOCK(txr); #ifndef IGB_LEGACY_TX /* Process the stack queue only if not depleted */ if (((txr->queue_status & IGB_QUEUE_DEPLETED) == 0) && !drbr_empty(ifp, txr->br)) igb_mq_start_locked(ifp, txr); #else if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) igb_start_locked(txr, ifp); #endif IGB_TX_UNLOCK(txr); } } IGB_CORE_UNLOCK(adapter); return bus_generic_resume(dev); } #ifdef IGB_LEGACY_TX /********************************************************************* * Transmit entry point * * igb_start is called by the stack to initiate a transmit. * The driver will remain in this routine as long as there are * packets to transmit and transmit resources are available. * In case resources are not available stack is notified and * the packet is requeued. **********************************************************************/ static void igb_start_locked(struct tx_ring *txr, struct ifnet *ifp) { struct adapter *adapter = ifp->if_softc; struct mbuf *m_head; IGB_TX_LOCK_ASSERT(txr); if ((ifp->if_drv_flags & (IFF_DRV_RUNNING|IFF_DRV_OACTIVE)) != IFF_DRV_RUNNING) return; if (!adapter->link_active) return; /* Call cleanup if number of TX descriptors low */ if (txr->tx_avail <= IGB_TX_CLEANUP_THRESHOLD) igb_txeof(txr); while (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) { if (txr->tx_avail <= IGB_MAX_SCATTER) { txr->queue_status |= IGB_QUEUE_DEPLETED; break; } IFQ_DRV_DEQUEUE(&ifp->if_snd, m_head); if (m_head == NULL) break; /* * Encapsulation can modify our pointer, and or make it * NULL on failure. In that event, we can't requeue. */ if (igb_xmit(txr, &m_head)) { if (m_head != NULL) IFQ_DRV_PREPEND(&ifp->if_snd, m_head); if (txr->tx_avail <= IGB_MAX_SCATTER) txr->queue_status |= IGB_QUEUE_DEPLETED; break; } /* Send a copy of the frame to the BPF listener */ ETHER_BPF_MTAP(ifp, m_head); /* Set watchdog on */ txr->watchdog_time = ticks; txr->queue_status |= IGB_QUEUE_WORKING; } } /* * Legacy TX driver routine, called from the * stack, always uses tx[0], and spins for it. * Should not be used with multiqueue tx */ static void igb_start(struct ifnet *ifp) { struct adapter *adapter = ifp->if_softc; struct tx_ring *txr = adapter->tx_rings; if (ifp->if_drv_flags & IFF_DRV_RUNNING) { IGB_TX_LOCK(txr); igb_start_locked(txr, ifp); IGB_TX_UNLOCK(txr); } return; } #else /* ~IGB_LEGACY_TX */ /* ** Multiqueue Transmit Entry: ** quick turnaround to the stack ** */ static int igb_mq_start(struct ifnet *ifp, struct mbuf *m) { struct adapter *adapter = ifp->if_softc; struct igb_queue *que; struct tx_ring *txr; int i, err = 0; #ifdef RSS uint32_t bucket_id; #endif /* Which queue to use */ /* * When doing RSS, map it to the same outbound queue * as the incoming flow would be mapped to. * * If everything is setup correctly, it should be the * same bucket that the current CPU we're on is. */ if (M_HASHTYPE_GET(m) != M_HASHTYPE_NONE) { #ifdef RSS if (rss_hash2bucket(m->m_pkthdr.flowid, M_HASHTYPE_GET(m), &bucket_id) == 0) { /* XXX TODO: spit out something if bucket_id > num_queues? */ i = bucket_id % adapter->num_queues; } else { #endif i = m->m_pkthdr.flowid % adapter->num_queues; #ifdef RSS } #endif } else { i = curcpu % adapter->num_queues; } txr = &adapter->tx_rings[i]; que = &adapter->queues[i]; err = drbr_enqueue(ifp, txr->br, m); if (err) return (err); if (IGB_TX_TRYLOCK(txr)) { igb_mq_start_locked(ifp, txr); IGB_TX_UNLOCK(txr); } else taskqueue_enqueue(que->tq, &txr->txq_task); return (0); } static int igb_mq_start_locked(struct ifnet *ifp, struct tx_ring *txr) { struct adapter *adapter = txr->adapter; struct mbuf *next; int err = 0, enq = 0; IGB_TX_LOCK_ASSERT(txr); if (((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0) || adapter->link_active == 0) return (ENETDOWN); /* Process the queue */ while ((next = drbr_peek(ifp, txr->br)) != NULL) { if ((err = igb_xmit(txr, &next)) != 0) { if (next == NULL) { /* It was freed, move forward */ drbr_advance(ifp, txr->br); } else { /* * Still have one left, it may not be * the same since the transmit function * may have changed it. */ drbr_putback(ifp, txr->br, next); } break; } drbr_advance(ifp, txr->br); enq++; if (next->m_flags & M_MCAST && adapter->vf_ifp) if_inc_counter(ifp, IFCOUNTER_OMCASTS, 1); ETHER_BPF_MTAP(ifp, next); if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0) break; } if (enq > 0) { /* Set the watchdog */ txr->queue_status |= IGB_QUEUE_WORKING; txr->watchdog_time = ticks; } if (txr->tx_avail <= IGB_TX_CLEANUP_THRESHOLD) igb_txeof(txr); if (txr->tx_avail <= IGB_MAX_SCATTER) txr->queue_status |= IGB_QUEUE_DEPLETED; return (err); } /* * Called from a taskqueue to drain queued transmit packets. */ static void igb_deferred_mq_start(void *arg, int pending) { struct tx_ring *txr = arg; struct adapter *adapter = txr->adapter; struct ifnet *ifp = adapter->ifp; IGB_TX_LOCK(txr); if (!drbr_empty(ifp, txr->br)) igb_mq_start_locked(ifp, txr); IGB_TX_UNLOCK(txr); } /* ** Flush all ring buffers */ static void igb_qflush(struct ifnet *ifp) { struct adapter *adapter = ifp->if_softc; struct tx_ring *txr = adapter->tx_rings; struct mbuf *m; for (int i = 0; i < adapter->num_queues; i++, txr++) { IGB_TX_LOCK(txr); while ((m = buf_ring_dequeue_sc(txr->br)) != NULL) m_freem(m); IGB_TX_UNLOCK(txr); } if_qflush(ifp); } #endif /* ~IGB_LEGACY_TX */ /********************************************************************* * Ioctl entry point * * igb_ioctl is called when the user wants to configure the * interface. * * return 0 on success, positive on failure **********************************************************************/ static int igb_ioctl(struct ifnet *ifp, u_long command, caddr_t data) { struct adapter *adapter = ifp->if_softc; struct ifreq *ifr = (struct ifreq *)data; #if defined(INET) || defined(INET6) struct ifaddr *ifa = (struct ifaddr *)data; #endif bool avoid_reset = FALSE; int error = 0; if (adapter->in_detach) return (error); switch (command) { case SIOCSIFADDR: #ifdef INET if (ifa->ifa_addr->sa_family == AF_INET) avoid_reset = TRUE; #endif #ifdef INET6 if (ifa->ifa_addr->sa_family == AF_INET6) avoid_reset = TRUE; #endif /* ** Calling init results in link renegotiation, ** so we avoid doing it when possible. */ if (avoid_reset) { ifp->if_flags |= IFF_UP; if (!(ifp->if_drv_flags & IFF_DRV_RUNNING)) igb_init(adapter); #ifdef INET if (!(ifp->if_flags & IFF_NOARP)) arp_ifinit(ifp, ifa); #endif } else error = ether_ioctl(ifp, command, data); break; case SIOCSIFMTU: { int max_frame_size; IOCTL_DEBUGOUT("ioctl rcv'd: SIOCSIFMTU (Set Interface MTU)"); IGB_CORE_LOCK(adapter); max_frame_size = 9234; if (ifr->ifr_mtu > max_frame_size - ETHER_HDR_LEN - ETHER_CRC_LEN) { IGB_CORE_UNLOCK(adapter); error = EINVAL; break; } ifp->if_mtu = ifr->ifr_mtu; adapter->max_frame_size = ifp->if_mtu + ETHER_HDR_LEN + ETHER_CRC_LEN; - if ((ifp->if_drv_flags & IFF_DRV_RUNNING)) + if (ifp->if_drv_flags & IFF_DRV_RUNNING) igb_init_locked(adapter); IGB_CORE_UNLOCK(adapter); break; } case SIOCSIFFLAGS: IOCTL_DEBUGOUT("ioctl rcv'd:\ SIOCSIFFLAGS (Set Interface Flags)"); IGB_CORE_LOCK(adapter); if (ifp->if_flags & IFF_UP) { if ((ifp->if_drv_flags & IFF_DRV_RUNNING)) { if ((ifp->if_flags ^ adapter->if_flags) & (IFF_PROMISC | IFF_ALLMULTI)) { igb_disable_promisc(adapter); igb_set_promisc(adapter); } } else igb_init_locked(adapter); } else if (ifp->if_drv_flags & IFF_DRV_RUNNING) igb_stop(adapter); adapter->if_flags = ifp->if_flags; IGB_CORE_UNLOCK(adapter); break; case SIOCADDMULTI: case SIOCDELMULTI: IOCTL_DEBUGOUT("ioctl rcv'd: SIOC(ADD|DEL)MULTI"); if (ifp->if_drv_flags & IFF_DRV_RUNNING) { IGB_CORE_LOCK(adapter); igb_disable_intr(adapter); igb_set_multi(adapter); #ifdef DEVICE_POLLING if (!(ifp->if_capenable & IFCAP_POLLING)) #endif igb_enable_intr(adapter); IGB_CORE_UNLOCK(adapter); } break; case SIOCSIFMEDIA: /* Check SOL/IDER usage */ IGB_CORE_LOCK(adapter); if (e1000_check_reset_block(&adapter->hw)) { IGB_CORE_UNLOCK(adapter); device_printf(adapter->dev, "Media change is" " blocked due to SOL/IDER session.\n"); break; } IGB_CORE_UNLOCK(adapter); case SIOCGIFMEDIA: IOCTL_DEBUGOUT("ioctl rcv'd: \ SIOCxIFMEDIA (Get/Set Interface Media)"); error = ifmedia_ioctl(ifp, ifr, &adapter->media, command); break; case SIOCSIFCAP: { int mask, reinit; IOCTL_DEBUGOUT("ioctl rcv'd: SIOCSIFCAP (Set Capabilities)"); reinit = 0; mask = ifr->ifr_reqcap ^ ifp->if_capenable; #ifdef DEVICE_POLLING if (mask & IFCAP_POLLING) { if (ifr->ifr_reqcap & IFCAP_POLLING) { error = ether_poll_register(igb_poll, ifp); if (error) return (error); IGB_CORE_LOCK(adapter); igb_disable_intr(adapter); ifp->if_capenable |= IFCAP_POLLING; IGB_CORE_UNLOCK(adapter); } else { error = ether_poll_deregister(ifp); /* Enable interrupt even in error case */ IGB_CORE_LOCK(adapter); igb_enable_intr(adapter); ifp->if_capenable &= ~IFCAP_POLLING; IGB_CORE_UNLOCK(adapter); } } #endif #if __FreeBSD_version >= 1000000 /* HW cannot turn these on/off separately */ if (mask & (IFCAP_RXCSUM | IFCAP_RXCSUM_IPV6)) { ifp->if_capenable ^= IFCAP_RXCSUM; ifp->if_capenable ^= IFCAP_RXCSUM_IPV6; reinit = 1; } if (mask & IFCAP_TXCSUM) { ifp->if_capenable ^= IFCAP_TXCSUM; reinit = 1; } if (mask & IFCAP_TXCSUM_IPV6) { ifp->if_capenable ^= IFCAP_TXCSUM_IPV6; reinit = 1; } #else if (mask & IFCAP_HWCSUM) { ifp->if_capenable ^= IFCAP_HWCSUM; reinit = 1; } #endif if (mask & IFCAP_TSO4) { ifp->if_capenable ^= IFCAP_TSO4; reinit = 1; } if (mask & IFCAP_TSO6) { ifp->if_capenable ^= IFCAP_TSO6; reinit = 1; } if (mask & IFCAP_VLAN_HWTAGGING) { ifp->if_capenable ^= IFCAP_VLAN_HWTAGGING; reinit = 1; } if (mask & IFCAP_VLAN_HWFILTER) { ifp->if_capenable ^= IFCAP_VLAN_HWFILTER; reinit = 1; } if (mask & IFCAP_VLAN_HWTSO) { ifp->if_capenable ^= IFCAP_VLAN_HWTSO; reinit = 1; } if (mask & IFCAP_LRO) { ifp->if_capenable ^= IFCAP_LRO; reinit = 1; } if (reinit && (ifp->if_drv_flags & IFF_DRV_RUNNING)) igb_init(adapter); VLAN_CAPABILITIES(ifp); break; } default: error = ether_ioctl(ifp, command, data); break; } return (error); } /********************************************************************* * Init entry point * * This routine is used in two ways. It is used by the stack as * init entry point in network interface structure. It is also used * by the driver as a hw/sw initialization routine to get to a * consistent state. * * return 0 on success, positive on failure **********************************************************************/ static void igb_init_locked(struct adapter *adapter) { struct ifnet *ifp = adapter->ifp; device_t dev = adapter->dev; INIT_DEBUGOUT("igb_init: begin"); IGB_CORE_LOCK_ASSERT(adapter); igb_disable_intr(adapter); callout_stop(&adapter->timer); /* Get the latest mac address, User can use a LAA */ bcopy(IF_LLADDR(adapter->ifp), adapter->hw.mac.addr, ETHER_ADDR_LEN); /* Put the address into the Receive Address Array */ e1000_rar_set(&adapter->hw, adapter->hw.mac.addr, 0); igb_reset(adapter); igb_update_link_status(adapter); E1000_WRITE_REG(&adapter->hw, E1000_VET, ETHERTYPE_VLAN); /* Set hardware offload abilities */ ifp->if_hwassist = 0; if (ifp->if_capenable & IFCAP_TXCSUM) { #if __FreeBSD_version >= 1000000 ifp->if_hwassist |= (CSUM_IP_TCP | CSUM_IP_UDP); if (adapter->hw.mac.type != e1000_82575) ifp->if_hwassist |= CSUM_IP_SCTP; #else ifp->if_hwassist |= (CSUM_TCP | CSUM_UDP); #if __FreeBSD_version >= 800000 if (adapter->hw.mac.type != e1000_82575) ifp->if_hwassist |= CSUM_SCTP; #endif #endif } #if __FreeBSD_version >= 1000000 if (ifp->if_capenable & IFCAP_TXCSUM_IPV6) { ifp->if_hwassist |= (CSUM_IP6_TCP | CSUM_IP6_UDP); if (adapter->hw.mac.type != e1000_82575) ifp->if_hwassist |= CSUM_IP6_SCTP; } #endif if (ifp->if_capenable & IFCAP_TSO) ifp->if_hwassist |= CSUM_TSO; /* Clear bad data from Rx FIFOs */ e1000_rx_fifo_flush_82575(&adapter->hw); /* Configure for OS presence */ igb_init_manageability(adapter); /* Prepare transmit descriptors and buffers */ igb_setup_transmit_structures(adapter); igb_initialize_transmit_units(adapter); /* Setup Multicast table */ igb_set_multi(adapter); /* ** Figure out the desired mbuf pool ** for doing jumbo/packetsplit */ if (adapter->max_frame_size <= 2048) adapter->rx_mbuf_sz = MCLBYTES; else if (adapter->max_frame_size <= 4096) adapter->rx_mbuf_sz = MJUMPAGESIZE; else adapter->rx_mbuf_sz = MJUM9BYTES; /* Prepare receive descriptors and buffers */ if (igb_setup_receive_structures(adapter)) { device_printf(dev, "Could not setup receive structures\n"); return; } igb_initialize_receive_units(adapter); /* Enable VLAN support */ if (ifp->if_capenable & IFCAP_VLAN_HWTAGGING) igb_setup_vlan_hw_support(adapter); /* Don't lose promiscuous settings */ igb_set_promisc(adapter); ifp->if_drv_flags |= IFF_DRV_RUNNING; ifp->if_drv_flags &= ~IFF_DRV_OACTIVE; callout_reset(&adapter->timer, hz, igb_local_timer, adapter); e1000_clear_hw_cntrs_base_generic(&adapter->hw); if (adapter->msix > 1) /* Set up queue routing */ igb_configure_queues(adapter); /* this clears any pending interrupts */ E1000_READ_REG(&adapter->hw, E1000_ICR); #ifdef DEVICE_POLLING /* * Only enable interrupts if we are not polling, make sure * they are off otherwise. */ if (ifp->if_capenable & IFCAP_POLLING) igb_disable_intr(adapter); else #endif /* DEVICE_POLLING */ { igb_enable_intr(adapter); E1000_WRITE_REG(&adapter->hw, E1000_ICS, E1000_ICS_LSC); } /* Set Energy Efficient Ethernet */ if (adapter->hw.phy.media_type == e1000_media_type_copper) { if (adapter->hw.mac.type == e1000_i354) e1000_set_eee_i354(&adapter->hw, TRUE, TRUE); else e1000_set_eee_i350(&adapter->hw, TRUE, TRUE); } } static void igb_init(void *arg) { struct adapter *adapter = arg; IGB_CORE_LOCK(adapter); igb_init_locked(adapter); IGB_CORE_UNLOCK(adapter); } static void igb_handle_que(void *context, int pending) { struct igb_queue *que = context; struct adapter *adapter = que->adapter; struct tx_ring *txr = que->txr; struct ifnet *ifp = adapter->ifp; if (ifp->if_drv_flags & IFF_DRV_RUNNING) { bool more; more = igb_rxeof(que, adapter->rx_process_limit, NULL); IGB_TX_LOCK(txr); igb_txeof(txr); #ifndef IGB_LEGACY_TX /* Process the stack queue only if not depleted */ if (((txr->queue_status & IGB_QUEUE_DEPLETED) == 0) && !drbr_empty(ifp, txr->br)) igb_mq_start_locked(ifp, txr); #else if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) igb_start_locked(txr, ifp); #endif IGB_TX_UNLOCK(txr); /* Do we need another? */ if (more) { taskqueue_enqueue(que->tq, &que->que_task); return; } } #ifdef DEVICE_POLLING if (ifp->if_capenable & IFCAP_POLLING) return; #endif /* Reenable this interrupt */ if (que->eims) E1000_WRITE_REG(&adapter->hw, E1000_EIMS, que->eims); else igb_enable_intr(adapter); } /* Deal with link in a sleepable context */ static void igb_handle_link(void *context, int pending) { struct adapter *adapter = context; IGB_CORE_LOCK(adapter); igb_handle_link_locked(adapter); IGB_CORE_UNLOCK(adapter); } static void igb_handle_link_locked(struct adapter *adapter) { struct tx_ring *txr = adapter->tx_rings; struct ifnet *ifp = adapter->ifp; IGB_CORE_LOCK_ASSERT(adapter); adapter->hw.mac.get_link_status = 1; igb_update_link_status(adapter); if ((ifp->if_drv_flags & IFF_DRV_RUNNING) && adapter->link_active) { for (int i = 0; i < adapter->num_queues; i++, txr++) { IGB_TX_LOCK(txr); #ifndef IGB_LEGACY_TX /* Process the stack queue only if not depleted */ if (((txr->queue_status & IGB_QUEUE_DEPLETED) == 0) && !drbr_empty(ifp, txr->br)) igb_mq_start_locked(ifp, txr); #else if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) igb_start_locked(txr, ifp); #endif IGB_TX_UNLOCK(txr); } } } /********************************************************************* * * MSI/Legacy Deferred * Interrupt Service routine * *********************************************************************/ static int igb_irq_fast(void *arg) { struct adapter *adapter = arg; struct igb_queue *que = adapter->queues; u32 reg_icr; reg_icr = E1000_READ_REG(&adapter->hw, E1000_ICR); /* Hot eject? */ if (reg_icr == 0xffffffff) return FILTER_STRAY; /* Definitely not our interrupt. */ if (reg_icr == 0x0) return FILTER_STRAY; if ((reg_icr & E1000_ICR_INT_ASSERTED) == 0) return FILTER_STRAY; /* * Mask interrupts until the taskqueue is finished running. This is * cheap, just assume that it is needed. This also works around the * MSI message reordering errata on certain systems. */ igb_disable_intr(adapter); taskqueue_enqueue(que->tq, &que->que_task); /* Link status change */ if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC)) taskqueue_enqueue(que->tq, &adapter->link_task); if (reg_icr & E1000_ICR_RXO) adapter->rx_overruns++; return FILTER_HANDLED; } #ifdef DEVICE_POLLING #if __FreeBSD_version >= 800000 #define POLL_RETURN_COUNT(a) (a) static int #else #define POLL_RETURN_COUNT(a) static void #endif igb_poll(struct ifnet *ifp, enum poll_cmd cmd, int count) { struct adapter *adapter = ifp->if_softc; struct igb_queue *que; struct tx_ring *txr; u32 reg_icr, rx_done = 0; u32 loop = IGB_MAX_LOOP; bool more; IGB_CORE_LOCK(adapter); if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0) { IGB_CORE_UNLOCK(adapter); return POLL_RETURN_COUNT(rx_done); } if (cmd == POLL_AND_CHECK_STATUS) { reg_icr = E1000_READ_REG(&adapter->hw, E1000_ICR); /* Link status change */ if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC)) igb_handle_link_locked(adapter); if (reg_icr & E1000_ICR_RXO) adapter->rx_overruns++; } IGB_CORE_UNLOCK(adapter); for (int i = 0; i < adapter->num_queues; i++) { que = &adapter->queues[i]; txr = que->txr; igb_rxeof(que, count, &rx_done); IGB_TX_LOCK(txr); do { more = igb_txeof(txr); } while (loop-- && more); #ifndef IGB_LEGACY_TX if (!drbr_empty(ifp, txr->br)) igb_mq_start_locked(ifp, txr); #else if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) igb_start_locked(txr, ifp); #endif IGB_TX_UNLOCK(txr); } return POLL_RETURN_COUNT(rx_done); } #endif /* DEVICE_POLLING */ /********************************************************************* * * MSIX Que Interrupt Service routine * **********************************************************************/ static void igb_msix_que(void *arg) { struct igb_queue *que = arg; struct adapter *adapter = que->adapter; struct ifnet *ifp = adapter->ifp; struct tx_ring *txr = que->txr; struct rx_ring *rxr = que->rxr; u32 newitr = 0; bool more_rx; /* Ignore spurious interrupts */ if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0) return; E1000_WRITE_REG(&adapter->hw, E1000_EIMC, que->eims); ++que->irqs; IGB_TX_LOCK(txr); igb_txeof(txr); #ifndef IGB_LEGACY_TX /* Process the stack queue only if not depleted */ if (((txr->queue_status & IGB_QUEUE_DEPLETED) == 0) && !drbr_empty(ifp, txr->br)) igb_mq_start_locked(ifp, txr); #else if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) igb_start_locked(txr, ifp); #endif IGB_TX_UNLOCK(txr); more_rx = igb_rxeof(que, adapter->rx_process_limit, NULL); if (adapter->enable_aim == FALSE) goto no_calc; /* ** Do Adaptive Interrupt Moderation: ** - Write out last calculated setting ** - Calculate based on average size over ** the last interval. */ if (que->eitr_setting) E1000_WRITE_REG(&adapter->hw, E1000_EITR(que->msix), que->eitr_setting); que->eitr_setting = 0; /* Idle, do nothing */ if ((txr->bytes == 0) && (rxr->bytes == 0)) goto no_calc; /* Used half Default if sub-gig */ if (adapter->link_speed != 1000) newitr = IGB_DEFAULT_ITR / 2; else { if ((txr->bytes) && (txr->packets)) newitr = txr->bytes/txr->packets; if ((rxr->bytes) && (rxr->packets)) newitr = max(newitr, (rxr->bytes / rxr->packets)); newitr += 24; /* account for hardware frame, crc */ /* set an upper boundary */ newitr = min(newitr, 3000); /* Be nice to the mid range */ if ((newitr > 300) && (newitr < 1200)) newitr = (newitr / 3); else newitr = (newitr / 2); } newitr &= 0x7FFC; /* Mask invalid bits */ if (adapter->hw.mac.type == e1000_82575) newitr |= newitr << 16; else newitr |= E1000_EITR_CNT_IGNR; /* save for next interrupt */ que->eitr_setting = newitr; /* Reset state */ txr->bytes = 0; txr->packets = 0; rxr->bytes = 0; rxr->packets = 0; no_calc: /* Schedule a clean task if needed*/ if (more_rx) taskqueue_enqueue(que->tq, &que->que_task); else /* Reenable this interrupt */ E1000_WRITE_REG(&adapter->hw, E1000_EIMS, que->eims); return; } /********************************************************************* * * MSIX Link Interrupt Service routine * **********************************************************************/ static void igb_msix_link(void *arg) { struct adapter *adapter = arg; u32 icr; ++adapter->link_irq; icr = E1000_READ_REG(&adapter->hw, E1000_ICR); if (!(icr & E1000_ICR_LSC)) goto spurious; igb_handle_link(adapter, 0); spurious: /* Rearm */ E1000_WRITE_REG(&adapter->hw, E1000_IMS, E1000_IMS_LSC); E1000_WRITE_REG(&adapter->hw, E1000_EIMS, adapter->link_mask); return; } /********************************************************************* * * Media Ioctl callback * * This routine is called whenever the user queries the status of * the interface using ifconfig. * **********************************************************************/ static void igb_media_status(struct ifnet *ifp, struct ifmediareq *ifmr) { struct adapter *adapter = ifp->if_softc; INIT_DEBUGOUT("igb_media_status: begin"); IGB_CORE_LOCK(adapter); igb_update_link_status(adapter); ifmr->ifm_status = IFM_AVALID; ifmr->ifm_active = IFM_ETHER; if (!adapter->link_active) { IGB_CORE_UNLOCK(adapter); return; } ifmr->ifm_status |= IFM_ACTIVE; switch (adapter->link_speed) { case 10: ifmr->ifm_active |= IFM_10_T; break; case 100: /* ** Support for 100Mb SFP - these are Fiber ** but the media type appears as serdes */ if (adapter->hw.phy.media_type == e1000_media_type_internal_serdes) ifmr->ifm_active |= IFM_100_FX; else ifmr->ifm_active |= IFM_100_TX; break; case 1000: ifmr->ifm_active |= IFM_1000_T; break; case 2500: ifmr->ifm_active |= IFM_2500_SX; break; } if (adapter->link_duplex == FULL_DUPLEX) ifmr->ifm_active |= IFM_FDX; else ifmr->ifm_active |= IFM_HDX; IGB_CORE_UNLOCK(adapter); } /********************************************************************* * * Media Ioctl callback * * This routine is called when the user changes speed/duplex using * media/mediopt option with ifconfig. * **********************************************************************/ static int igb_media_change(struct ifnet *ifp) { struct adapter *adapter = ifp->if_softc; struct ifmedia *ifm = &adapter->media; INIT_DEBUGOUT("igb_media_change: begin"); if (IFM_TYPE(ifm->ifm_media) != IFM_ETHER) return (EINVAL); IGB_CORE_LOCK(adapter); switch (IFM_SUBTYPE(ifm->ifm_media)) { case IFM_AUTO: adapter->hw.mac.autoneg = DO_AUTO_NEG; adapter->hw.phy.autoneg_advertised = AUTONEG_ADV_DEFAULT; break; case IFM_1000_LX: case IFM_1000_SX: case IFM_1000_T: adapter->hw.mac.autoneg = DO_AUTO_NEG; adapter->hw.phy.autoneg_advertised = ADVERTISE_1000_FULL; break; case IFM_100_TX: adapter->hw.mac.autoneg = FALSE; adapter->hw.phy.autoneg_advertised = 0; if ((ifm->ifm_media & IFM_GMASK) == IFM_FDX) adapter->hw.mac.forced_speed_duplex = ADVERTISE_100_FULL; else adapter->hw.mac.forced_speed_duplex = ADVERTISE_100_HALF; break; case IFM_10_T: adapter->hw.mac.autoneg = FALSE; adapter->hw.phy.autoneg_advertised = 0; if ((ifm->ifm_media & IFM_GMASK) == IFM_FDX) adapter->hw.mac.forced_speed_duplex = ADVERTISE_10_FULL; else adapter->hw.mac.forced_speed_duplex = ADVERTISE_10_HALF; break; default: device_printf(adapter->dev, "Unsupported media type\n"); } igb_init_locked(adapter); IGB_CORE_UNLOCK(adapter); return (0); } /********************************************************************* * * This routine maps the mbufs to Advanced TX descriptors. * **********************************************************************/ static int igb_xmit(struct tx_ring *txr, struct mbuf **m_headp) { struct adapter *adapter = txr->adapter; u32 olinfo_status = 0, cmd_type_len; int i, j, error, nsegs; int first; bool remap = TRUE; struct mbuf *m_head; bus_dma_segment_t segs[IGB_MAX_SCATTER]; bus_dmamap_t map; struct igb_tx_buf *txbuf; union e1000_adv_tx_desc *txd = NULL; m_head = *m_headp; /* Basic descriptor defines */ cmd_type_len = (E1000_ADVTXD_DTYP_DATA | E1000_ADVTXD_DCMD_IFCS | E1000_ADVTXD_DCMD_DEXT); if (m_head->m_flags & M_VLANTAG) cmd_type_len |= E1000_ADVTXD_DCMD_VLE; /* * Important to capture the first descriptor * used because it will contain the index of * the one we tell the hardware to report back */ first = txr->next_avail_desc; txbuf = &txr->tx_buffers[first]; map = txbuf->map; /* * Map the packet for DMA. */ retry: error = bus_dmamap_load_mbuf_sg(txr->txtag, map, *m_headp, segs, &nsegs, BUS_DMA_NOWAIT); if (__predict_false(error)) { struct mbuf *m; switch (error) { case EFBIG: /* Try it again? - one try */ if (remap == TRUE) { remap = FALSE; m = m_collapse(*m_headp, M_NOWAIT, IGB_MAX_SCATTER); if (m == NULL) { adapter->mbuf_defrag_failed++; m_freem(*m_headp); *m_headp = NULL; return (ENOBUFS); } *m_headp = m; goto retry; } else return (error); default: txr->no_tx_dma_setup++; m_freem(*m_headp); *m_headp = NULL; return (error); } } /* Make certain there are enough descriptors */ if (txr->tx_avail < (nsegs + 2)) { txr->no_desc_avail++; bus_dmamap_unload(txr->txtag, map); return (ENOBUFS); } m_head = *m_headp; /* ** Set up the appropriate offload context ** this will consume the first descriptor */ error = igb_tx_ctx_setup(txr, m_head, &cmd_type_len, &olinfo_status); if (__predict_false(error)) { m_freem(*m_headp); *m_headp = NULL; return (error); } /* 82575 needs the queue index added */ if (adapter->hw.mac.type == e1000_82575) olinfo_status |= txr->me << 4; i = txr->next_avail_desc; for (j = 0; j < nsegs; j++) { bus_size_t seglen; bus_addr_t segaddr; txbuf = &txr->tx_buffers[i]; txd = &txr->tx_base[i]; seglen = segs[j].ds_len; segaddr = htole64(segs[j].ds_addr); txd->read.buffer_addr = segaddr; txd->read.cmd_type_len = htole32(E1000_TXD_CMD_IFCS | cmd_type_len | seglen); txd->read.olinfo_status = htole32(olinfo_status); if (++i == txr->num_desc) i = 0; } txd->read.cmd_type_len |= htole32(E1000_TXD_CMD_EOP | E1000_TXD_CMD_RS); txr->tx_avail -= nsegs; txr->next_avail_desc = i; txbuf->m_head = m_head; /* ** Here we swap the map so the last descriptor, ** which gets the completion interrupt has the ** real map, and the first descriptor gets the ** unused map from this descriptor. */ txr->tx_buffers[first].map = txbuf->map; txbuf->map = map; bus_dmamap_sync(txr->txtag, map, BUS_DMASYNC_PREWRITE); /* Set the EOP descriptor that will be marked done */ txbuf = &txr->tx_buffers[first]; txbuf->eop = txd; bus_dmamap_sync(txr->txdma.dma_tag, txr->txdma.dma_map, BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE); /* * Advance the Transmit Descriptor Tail (Tdt), this tells the * hardware that this frame is available to transmit. */ ++txr->total_packets; E1000_WRITE_REG(&adapter->hw, E1000_TDT(txr->me), i); return (0); } static void igb_set_promisc(struct adapter *adapter) { struct ifnet *ifp = adapter->ifp; struct e1000_hw *hw = &adapter->hw; u32 reg; if (adapter->vf_ifp) { e1000_promisc_set_vf(hw, e1000_promisc_enabled); return; } reg = E1000_READ_REG(hw, E1000_RCTL); if (ifp->if_flags & IFF_PROMISC) { reg |= (E1000_RCTL_UPE | E1000_RCTL_MPE); E1000_WRITE_REG(hw, E1000_RCTL, reg); } else if (ifp->if_flags & IFF_ALLMULTI) { reg |= E1000_RCTL_MPE; reg &= ~E1000_RCTL_UPE; E1000_WRITE_REG(hw, E1000_RCTL, reg); } } static void igb_disable_promisc(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; struct ifnet *ifp = adapter->ifp; u32 reg; int mcnt = 0; if (adapter->vf_ifp) { e1000_promisc_set_vf(hw, e1000_promisc_disabled); return; } reg = E1000_READ_REG(hw, E1000_RCTL); reg &= (~E1000_RCTL_UPE); if (ifp->if_flags & IFF_ALLMULTI) mcnt = MAX_NUM_MULTICAST_ADDRESSES; else { struct ifmultiaddr *ifma; #if __FreeBSD_version < 800000 IF_ADDR_LOCK(ifp); #else if_maddr_rlock(ifp); #endif TAILQ_FOREACH(ifma, &ifp->if_multiaddrs, ifma_link) { if (ifma->ifma_addr->sa_family != AF_LINK) continue; if (mcnt == MAX_NUM_MULTICAST_ADDRESSES) break; mcnt++; } #if __FreeBSD_version < 800000 IF_ADDR_UNLOCK(ifp); #else if_maddr_runlock(ifp); #endif } /* Don't disable if in MAX groups */ if (mcnt < MAX_NUM_MULTICAST_ADDRESSES) reg &= (~E1000_RCTL_MPE); E1000_WRITE_REG(hw, E1000_RCTL, reg); } /********************************************************************* * Multicast Update * * This routine is called whenever multicast address list is updated. * **********************************************************************/ static void igb_set_multi(struct adapter *adapter) { struct ifnet *ifp = adapter->ifp; struct ifmultiaddr *ifma; u32 reg_rctl = 0; u8 *mta; int mcnt = 0; IOCTL_DEBUGOUT("igb_set_multi: begin"); mta = adapter->mta; bzero(mta, sizeof(uint8_t) * ETH_ADDR_LEN * MAX_NUM_MULTICAST_ADDRESSES); #if __FreeBSD_version < 800000 IF_ADDR_LOCK(ifp); #else if_maddr_rlock(ifp); #endif TAILQ_FOREACH(ifma, &ifp->if_multiaddrs, ifma_link) { if (ifma->ifma_addr->sa_family != AF_LINK) continue; if (mcnt == MAX_NUM_MULTICAST_ADDRESSES) break; bcopy(LLADDR((struct sockaddr_dl *)ifma->ifma_addr), &mta[mcnt * ETH_ADDR_LEN], ETH_ADDR_LEN); mcnt++; } #if __FreeBSD_version < 800000 IF_ADDR_UNLOCK(ifp); #else if_maddr_runlock(ifp); #endif if (mcnt >= MAX_NUM_MULTICAST_ADDRESSES) { reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); reg_rctl |= E1000_RCTL_MPE; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); } else e1000_update_mc_addr_list(&adapter->hw, mta, mcnt); } /********************************************************************* * Timer routine: * This routine checks for link status, * updates statistics, and does the watchdog. * **********************************************************************/ static void igb_local_timer(void *arg) { struct adapter *adapter = arg; device_t dev = adapter->dev; struct ifnet *ifp = adapter->ifp; struct tx_ring *txr = adapter->tx_rings; struct igb_queue *que = adapter->queues; int hung = 0, busy = 0; IGB_CORE_LOCK_ASSERT(adapter); igb_update_link_status(adapter); igb_update_stats_counters(adapter); /* ** Check the TX queues status ** - central locked handling of OACTIVE ** - watchdog only if all queues show hung */ for (int i = 0; i < adapter->num_queues; i++, que++, txr++) { if ((txr->queue_status & IGB_QUEUE_HUNG) && (adapter->pause_frames == 0)) ++hung; if (txr->queue_status & IGB_QUEUE_DEPLETED) ++busy; if ((txr->queue_status & IGB_QUEUE_IDLE) == 0) taskqueue_enqueue(que->tq, &que->que_task); } if (hung == adapter->num_queues) goto timeout; if (busy == adapter->num_queues) ifp->if_drv_flags |= IFF_DRV_OACTIVE; else if ((ifp->if_drv_flags & IFF_DRV_OACTIVE) && (busy < adapter->num_queues)) ifp->if_drv_flags &= ~IFF_DRV_OACTIVE; adapter->pause_frames = 0; callout_reset(&adapter->timer, hz, igb_local_timer, adapter); #ifndef DEVICE_POLLING /* Schedule all queue interrupts - deadlock protection */ E1000_WRITE_REG(&adapter->hw, E1000_EICS, adapter->que_mask); #endif return; timeout: device_printf(adapter->dev, "Watchdog timeout -- resetting\n"); device_printf(dev,"Queue(%d) tdh = %d, hw tdt = %d\n", txr->me, E1000_READ_REG(&adapter->hw, E1000_TDH(txr->me)), E1000_READ_REG(&adapter->hw, E1000_TDT(txr->me))); device_printf(dev,"TX(%d) desc avail = %d," "Next TX to Clean = %d\n", txr->me, txr->tx_avail, txr->next_to_clean); adapter->ifp->if_drv_flags &= ~IFF_DRV_RUNNING; adapter->watchdog_events++; igb_init_locked(adapter); } static void igb_update_link_status(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; struct e1000_fc_info *fc = &hw->fc; struct ifnet *ifp = adapter->ifp; device_t dev = adapter->dev; struct tx_ring *txr = adapter->tx_rings; u32 link_check, thstat, ctrl; char *flowctl = NULL; link_check = thstat = ctrl = 0; /* Get the cached link value or read for real */ switch (hw->phy.media_type) { case e1000_media_type_copper: if (hw->mac.get_link_status) { /* Do the work to read phy */ e1000_check_for_link(hw); link_check = !hw->mac.get_link_status; } else link_check = TRUE; break; case e1000_media_type_fiber: e1000_check_for_link(hw); link_check = (E1000_READ_REG(hw, E1000_STATUS) & E1000_STATUS_LU); break; case e1000_media_type_internal_serdes: e1000_check_for_link(hw); link_check = adapter->hw.mac.serdes_has_link; break; /* VF device is type_unknown */ case e1000_media_type_unknown: e1000_check_for_link(hw); link_check = !hw->mac.get_link_status; /* Fall thru */ default: break; } /* Check for thermal downshift or shutdown */ if (hw->mac.type == e1000_i350) { thstat = E1000_READ_REG(hw, E1000_THSTAT); ctrl = E1000_READ_REG(hw, E1000_CTRL_EXT); } /* Get the flow control for display */ switch (fc->current_mode) { case e1000_fc_rx_pause: flowctl = "RX"; break; case e1000_fc_tx_pause: flowctl = "TX"; break; case e1000_fc_full: flowctl = "Full"; break; case e1000_fc_none: default: flowctl = "None"; break; } /* Now we check if a transition has happened */ if (link_check && (adapter->link_active == 0)) { e1000_get_speed_and_duplex(&adapter->hw, &adapter->link_speed, &adapter->link_duplex); if (bootverbose) device_printf(dev, "Link is up %d Mbps %s," " Flow Control: %s\n", adapter->link_speed, ((adapter->link_duplex == FULL_DUPLEX) ? "Full Duplex" : "Half Duplex"), flowctl); adapter->link_active = 1; ifp->if_baudrate = adapter->link_speed * 1000000; if ((ctrl & E1000_CTRL_EXT_LINK_MODE_GMII) && (thstat & E1000_THSTAT_LINK_THROTTLE)) device_printf(dev, "Link: thermal downshift\n"); /* Delay Link Up for Phy update */ if (((hw->mac.type == e1000_i210) || (hw->mac.type == e1000_i211)) && (hw->phy.id == I210_I_PHY_ID)) msec_delay(I210_LINK_DELAY); /* Reset if the media type changed. */ if (hw->dev_spec._82575.media_changed) { hw->dev_spec._82575.media_changed = false; adapter->flags |= IGB_MEDIA_RESET; igb_reset(adapter); } /* This can sleep */ if_link_state_change(ifp, LINK_STATE_UP); } else if (!link_check && (adapter->link_active == 1)) { ifp->if_baudrate = adapter->link_speed = 0; adapter->link_duplex = 0; if (bootverbose) device_printf(dev, "Link is Down\n"); if ((ctrl & E1000_CTRL_EXT_LINK_MODE_GMII) && (thstat & E1000_THSTAT_PWR_DOWN)) device_printf(dev, "Link: thermal shutdown\n"); adapter->link_active = 0; /* This can sleep */ if_link_state_change(ifp, LINK_STATE_DOWN); /* Reset queue state */ for (int i = 0; i < adapter->num_queues; i++, txr++) txr->queue_status = IGB_QUEUE_IDLE; } } /********************************************************************* * * This routine disables all traffic on the adapter by issuing a * global reset on the MAC and deallocates TX/RX buffers. * **********************************************************************/ static void igb_stop(void *arg) { struct adapter *adapter = arg; struct ifnet *ifp = adapter->ifp; struct tx_ring *txr = adapter->tx_rings; IGB_CORE_LOCK_ASSERT(adapter); INIT_DEBUGOUT("igb_stop: begin"); igb_disable_intr(adapter); callout_stop(&adapter->timer); /* Tell the stack that the interface is no longer active */ ifp->if_drv_flags &= ~IFF_DRV_RUNNING; ifp->if_drv_flags |= IFF_DRV_OACTIVE; /* Disarm watchdog timer. */ for (int i = 0; i < adapter->num_queues; i++, txr++) { IGB_TX_LOCK(txr); txr->queue_status = IGB_QUEUE_IDLE; IGB_TX_UNLOCK(txr); } e1000_reset_hw(&adapter->hw); E1000_WRITE_REG(&adapter->hw, E1000_WUC, 0); e1000_led_off(&adapter->hw); e1000_cleanup_led(&adapter->hw); } /********************************************************************* * * Determine hardware revision. * **********************************************************************/ static void igb_identify_hardware(struct adapter *adapter) { device_t dev = adapter->dev; /* Make sure our PCI config space has the necessary stuff set */ pci_enable_busmaster(dev); adapter->hw.bus.pci_cmd_word = pci_read_config(dev, PCIR_COMMAND, 2); /* Save off the information about this board */ adapter->hw.vendor_id = pci_get_vendor(dev); adapter->hw.device_id = pci_get_device(dev); adapter->hw.revision_id = pci_read_config(dev, PCIR_REVID, 1); adapter->hw.subsystem_vendor_id = pci_read_config(dev, PCIR_SUBVEND_0, 2); adapter->hw.subsystem_device_id = pci_read_config(dev, PCIR_SUBDEV_0, 2); /* Set MAC type early for PCI setup */ e1000_set_mac_type(&adapter->hw); /* Are we a VF device? */ if ((adapter->hw.mac.type == e1000_vfadapt) || (adapter->hw.mac.type == e1000_vfadapt_i350)) adapter->vf_ifp = 1; else adapter->vf_ifp = 0; } static int igb_allocate_pci_resources(struct adapter *adapter) { device_t dev = adapter->dev; int rid; rid = PCIR_BAR(0); adapter->pci_mem = bus_alloc_resource_any(dev, SYS_RES_MEMORY, &rid, RF_ACTIVE); if (adapter->pci_mem == NULL) { device_printf(dev, "Unable to allocate bus resource: memory\n"); return (ENXIO); } adapter->osdep.mem_bus_space_tag = rman_get_bustag(adapter->pci_mem); adapter->osdep.mem_bus_space_handle = rman_get_bushandle(adapter->pci_mem); adapter->hw.hw_addr = (u8 *)&adapter->osdep.mem_bus_space_handle; adapter->num_queues = 1; /* Defaults for Legacy or MSI */ /* This will setup either MSI/X or MSI */ adapter->msix = igb_setup_msix(adapter); adapter->hw.back = &adapter->osdep; return (0); } /********************************************************************* * * Setup the Legacy or MSI Interrupt handler * **********************************************************************/ static int igb_allocate_legacy(struct adapter *adapter) { device_t dev = adapter->dev; struct igb_queue *que = adapter->queues; #ifndef IGB_LEGACY_TX struct tx_ring *txr = adapter->tx_rings; #endif int error, rid = 0; /* Turn off all interrupts */ E1000_WRITE_REG(&adapter->hw, E1000_IMC, 0xffffffff); /* MSI RID is 1 */ if (adapter->msix == 1) rid = 1; /* We allocate a single interrupt resource */ adapter->res = bus_alloc_resource_any(dev, SYS_RES_IRQ, &rid, RF_SHAREABLE | RF_ACTIVE); if (adapter->res == NULL) { device_printf(dev, "Unable to allocate bus resource: " "interrupt\n"); return (ENXIO); } #ifndef IGB_LEGACY_TX TASK_INIT(&txr->txq_task, 0, igb_deferred_mq_start, txr); #endif /* * Try allocating a fast interrupt and the associated deferred * processing contexts. */ TASK_INIT(&que->que_task, 0, igb_handle_que, que); /* Make tasklet for deferred link handling */ TASK_INIT(&adapter->link_task, 0, igb_handle_link, adapter); que->tq = taskqueue_create_fast("igb_taskq", M_NOWAIT, taskqueue_thread_enqueue, &que->tq); taskqueue_start_threads(&que->tq, 1, PI_NET, "%s taskq", device_get_nameunit(adapter->dev)); if ((error = bus_setup_intr(dev, adapter->res, INTR_TYPE_NET | INTR_MPSAFE, igb_irq_fast, NULL, adapter, &adapter->tag)) != 0) { device_printf(dev, "Failed to register fast interrupt " "handler: %d\n", error); taskqueue_free(que->tq); que->tq = NULL; return (error); } return (0); } /********************************************************************* * * Setup the MSIX Queue Interrupt handlers: * **********************************************************************/ static int igb_allocate_msix(struct adapter *adapter) { device_t dev = adapter->dev; struct igb_queue *que = adapter->queues; int error, rid, vector = 0; int cpu_id = 0; #ifdef RSS cpuset_t cpu_mask; #endif /* Be sure to start with all interrupts disabled */ E1000_WRITE_REG(&adapter->hw, E1000_IMC, ~0); E1000_WRITE_FLUSH(&adapter->hw); #ifdef RSS /* * If we're doing RSS, the number of queues needs to * match the number of RSS buckets that are configured. * * + If there's more queues than RSS buckets, we'll end * up with queues that get no traffic. * * + If there's more RSS buckets than queues, we'll end * up having multiple RSS buckets map to the same queue, * so there'll be some contention. */ if (adapter->num_queues != rss_getnumbuckets()) { device_printf(dev, "%s: number of queues (%d) != number of RSS buckets (%d)" "; performance will be impacted.\n", __func__, adapter->num_queues, rss_getnumbuckets()); } #endif for (int i = 0; i < adapter->num_queues; i++, vector++, que++) { rid = vector +1; que->res = bus_alloc_resource_any(dev, SYS_RES_IRQ, &rid, RF_SHAREABLE | RF_ACTIVE); if (que->res == NULL) { device_printf(dev, "Unable to allocate bus resource: " "MSIX Queue Interrupt\n"); return (ENXIO); } error = bus_setup_intr(dev, que->res, INTR_TYPE_NET | INTR_MPSAFE, NULL, igb_msix_que, que, &que->tag); if (error) { que->res = NULL; device_printf(dev, "Failed to register Queue handler"); return (error); } #if __FreeBSD_version >= 800504 bus_describe_intr(dev, que->res, que->tag, "que %d", i); #endif que->msix = vector; if (adapter->hw.mac.type == e1000_82575) que->eims = E1000_EICR_TX_QUEUE0 << i; else que->eims = 1 << vector; #ifdef RSS /* * The queue ID is used as the RSS layer bucket ID. * We look up the queue ID -> RSS CPU ID and select * that. */ cpu_id = rss_getcpu(i % rss_getnumbuckets()); #else /* * Bind the msix vector, and thus the * rings to the corresponding cpu. * * This just happens to match the default RSS round-robin * bucket -> queue -> CPU allocation. */ if (adapter->num_queues > 1) { if (igb_last_bind_cpu < 0) igb_last_bind_cpu = CPU_FIRST(); cpu_id = igb_last_bind_cpu; } #endif if (adapter->num_queues > 1) { bus_bind_intr(dev, que->res, cpu_id); #ifdef RSS device_printf(dev, "Bound queue %d to RSS bucket %d\n", i, cpu_id); #else device_printf(dev, "Bound queue %d to cpu %d\n", i, cpu_id); #endif } #ifndef IGB_LEGACY_TX TASK_INIT(&que->txr->txq_task, 0, igb_deferred_mq_start, que->txr); #endif /* Make tasklet for deferred handling */ TASK_INIT(&que->que_task, 0, igb_handle_que, que); que->tq = taskqueue_create("igb_que", M_NOWAIT, taskqueue_thread_enqueue, &que->tq); if (adapter->num_queues > 1) { /* * Only pin the taskqueue thread to a CPU if * RSS is in use. * * This again just happens to match the default RSS * round-robin bucket -> queue -> CPU allocation. */ #ifdef RSS CPU_SETOF(cpu_id, &cpu_mask); taskqueue_start_threads_cpuset(&que->tq, 1, PI_NET, &cpu_mask, "%s que (bucket %d)", device_get_nameunit(adapter->dev), cpu_id); #else taskqueue_start_threads(&que->tq, 1, PI_NET, "%s que (qid %d)", device_get_nameunit(adapter->dev), cpu_id); #endif } else { taskqueue_start_threads(&que->tq, 1, PI_NET, "%s que", device_get_nameunit(adapter->dev)); } /* Finally update the last bound CPU id */ if (adapter->num_queues > 1) igb_last_bind_cpu = CPU_NEXT(igb_last_bind_cpu); } /* And Link */ rid = vector + 1; adapter->res = bus_alloc_resource_any(dev, SYS_RES_IRQ, &rid, RF_SHAREABLE | RF_ACTIVE); if (adapter->res == NULL) { device_printf(dev, "Unable to allocate bus resource: " "MSIX Link Interrupt\n"); return (ENXIO); } if ((error = bus_setup_intr(dev, adapter->res, INTR_TYPE_NET | INTR_MPSAFE, NULL, igb_msix_link, adapter, &adapter->tag)) != 0) { device_printf(dev, "Failed to register Link handler"); return (error); } #if __FreeBSD_version >= 800504 bus_describe_intr(dev, adapter->res, adapter->tag, "link"); #endif adapter->linkvec = vector; return (0); } static void igb_configure_queues(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; struct igb_queue *que; u32 tmp, ivar = 0, newitr = 0; /* First turn on RSS capability */ if (adapter->hw.mac.type != e1000_82575) E1000_WRITE_REG(hw, E1000_GPIE, E1000_GPIE_MSIX_MODE | E1000_GPIE_EIAME | E1000_GPIE_PBA | E1000_GPIE_NSICR); /* Turn on MSIX */ switch (adapter->hw.mac.type) { case e1000_82580: case e1000_i350: case e1000_i354: case e1000_i210: case e1000_i211: case e1000_vfadapt: case e1000_vfadapt_i350: /* RX entries */ for (int i = 0; i < adapter->num_queues; i++) { u32 index = i >> 1; ivar = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, index); que = &adapter->queues[i]; if (i & 1) { ivar &= 0xFF00FFFF; ivar |= (que->msix | E1000_IVAR_VALID) << 16; } else { ivar &= 0xFFFFFF00; ivar |= que->msix | E1000_IVAR_VALID; } E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, index, ivar); } /* TX entries */ for (int i = 0; i < adapter->num_queues; i++) { u32 index = i >> 1; ivar = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, index); que = &adapter->queues[i]; if (i & 1) { ivar &= 0x00FFFFFF; ivar |= (que->msix | E1000_IVAR_VALID) << 24; } else { ivar &= 0xFFFF00FF; ivar |= (que->msix | E1000_IVAR_VALID) << 8; } E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, index, ivar); adapter->que_mask |= que->eims; } /* And for the link interrupt */ ivar = (adapter->linkvec | E1000_IVAR_VALID) << 8; adapter->link_mask = 1 << adapter->linkvec; E1000_WRITE_REG(hw, E1000_IVAR_MISC, ivar); break; case e1000_82576: /* RX entries */ for (int i = 0; i < adapter->num_queues; i++) { u32 index = i & 0x7; /* Each IVAR has two entries */ ivar = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, index); que = &adapter->queues[i]; if (i < 8) { ivar &= 0xFFFFFF00; ivar |= que->msix | E1000_IVAR_VALID; } else { ivar &= 0xFF00FFFF; ivar |= (que->msix | E1000_IVAR_VALID) << 16; } E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, index, ivar); adapter->que_mask |= que->eims; } /* TX entries */ for (int i = 0; i < adapter->num_queues; i++) { u32 index = i & 0x7; /* Each IVAR has two entries */ ivar = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, index); que = &adapter->queues[i]; if (i < 8) { ivar &= 0xFFFF00FF; ivar |= (que->msix | E1000_IVAR_VALID) << 8; } else { ivar &= 0x00FFFFFF; ivar |= (que->msix | E1000_IVAR_VALID) << 24; } E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, index, ivar); adapter->que_mask |= que->eims; } /* And for the link interrupt */ ivar = (adapter->linkvec | E1000_IVAR_VALID) << 8; adapter->link_mask = 1 << adapter->linkvec; E1000_WRITE_REG(hw, E1000_IVAR_MISC, ivar); break; case e1000_82575: /* enable MSI-X support*/ tmp = E1000_READ_REG(hw, E1000_CTRL_EXT); tmp |= E1000_CTRL_EXT_PBA_CLR; /* Auto-Mask interrupts upon ICR read. */ tmp |= E1000_CTRL_EXT_EIAME; tmp |= E1000_CTRL_EXT_IRCA; E1000_WRITE_REG(hw, E1000_CTRL_EXT, tmp); /* Queues */ for (int i = 0; i < adapter->num_queues; i++) { que = &adapter->queues[i]; tmp = E1000_EICR_RX_QUEUE0 << i; tmp |= E1000_EICR_TX_QUEUE0 << i; que->eims = tmp; E1000_WRITE_REG_ARRAY(hw, E1000_MSIXBM(0), i, que->eims); adapter->que_mask |= que->eims; } /* Link */ E1000_WRITE_REG(hw, E1000_MSIXBM(adapter->linkvec), E1000_EIMS_OTHER); adapter->link_mask |= E1000_EIMS_OTHER; default: break; } /* Set the starting interrupt rate */ if (igb_max_interrupt_rate > 0) newitr = (4000000 / igb_max_interrupt_rate) & 0x7FFC; if (hw->mac.type == e1000_82575) newitr |= newitr << 16; else newitr |= E1000_EITR_CNT_IGNR; for (int i = 0; i < adapter->num_queues; i++) { que = &adapter->queues[i]; E1000_WRITE_REG(hw, E1000_EITR(que->msix), newitr); } return; } static void igb_free_pci_resources(struct adapter *adapter) { struct igb_queue *que = adapter->queues; device_t dev = adapter->dev; int rid; /* ** There is a slight possibility of a failure mode ** in attach that will result in entering this function ** before interrupt resources have been initialized, and ** in that case we do not want to execute the loops below ** We can detect this reliably by the state of the adapter ** res pointer. */ if (adapter->res == NULL) goto mem; /* * First release all the interrupt resources: */ for (int i = 0; i < adapter->num_queues; i++, que++) { rid = que->msix + 1; if (que->tag != NULL) { bus_teardown_intr(dev, que->res, que->tag); que->tag = NULL; } if (que->res != NULL) bus_release_resource(dev, SYS_RES_IRQ, rid, que->res); } /* Clean the Legacy or Link interrupt last */ if (adapter->linkvec) /* we are doing MSIX */ rid = adapter->linkvec + 1; else (adapter->msix != 0) ? (rid = 1):(rid = 0); que = adapter->queues; if (adapter->tag != NULL) { taskqueue_drain(que->tq, &adapter->link_task); bus_teardown_intr(dev, adapter->res, adapter->tag); adapter->tag = NULL; } if (adapter->res != NULL) bus_release_resource(dev, SYS_RES_IRQ, rid, adapter->res); for (int i = 0; i < adapter->num_queues; i++, que++) { if (que->tq != NULL) { #ifndef IGB_LEGACY_TX taskqueue_drain(que->tq, &que->txr->txq_task); #endif taskqueue_drain(que->tq, &que->que_task); taskqueue_free(que->tq); } } mem: if (adapter->msix) pci_release_msi(dev); if (adapter->msix_mem != NULL) bus_release_resource(dev, SYS_RES_MEMORY, adapter->memrid, adapter->msix_mem); if (adapter->pci_mem != NULL) bus_release_resource(dev, SYS_RES_MEMORY, PCIR_BAR(0), adapter->pci_mem); } /* * Setup Either MSI/X or MSI */ static int igb_setup_msix(struct adapter *adapter) { device_t dev = adapter->dev; int bar, want, queues, msgs, maxqueues; /* tuneable override */ if (igb_enable_msix == 0) goto msi; /* First try MSI/X */ msgs = pci_msix_count(dev); if (msgs == 0) goto msi; /* ** Some new devices, as with ixgbe, now may ** use a different BAR, so we need to keep ** track of which is used. */ adapter->memrid = PCIR_BAR(IGB_MSIX_BAR); bar = pci_read_config(dev, adapter->memrid, 4); if (bar == 0) /* use next bar */ adapter->memrid += 4; adapter->msix_mem = bus_alloc_resource_any(dev, SYS_RES_MEMORY, &adapter->memrid, RF_ACTIVE); if (adapter->msix_mem == NULL) { /* May not be enabled */ device_printf(adapter->dev, "Unable to map MSIX table \n"); goto msi; } queues = (mp_ncpus > (msgs-1)) ? (msgs-1) : mp_ncpus; /* Override via tuneable */ if (igb_num_queues != 0) queues = igb_num_queues; #ifdef RSS /* If we're doing RSS, clamp at the number of RSS buckets */ if (queues > rss_getnumbuckets()) queues = rss_getnumbuckets(); #endif /* Sanity check based on HW */ switch (adapter->hw.mac.type) { case e1000_82575: maxqueues = 4; break; case e1000_82576: case e1000_82580: case e1000_i350: case e1000_i354: maxqueues = 8; break; case e1000_i210: maxqueues = 4; break; case e1000_i211: maxqueues = 2; break; default: /* VF interfaces */ maxqueues = 1; break; } /* Final clamp on the actual hardware capability */ if (queues > maxqueues) queues = maxqueues; /* ** One vector (RX/TX pair) per queue ** plus an additional for Link interrupt */ want = queues + 1; if (msgs >= want) msgs = want; else { device_printf(adapter->dev, "MSIX Configuration Problem, " "%d vectors configured, but %d queues wanted!\n", msgs, want); goto msi; } if ((pci_alloc_msix(dev, &msgs) == 0) && (msgs == want)) { device_printf(adapter->dev, "Using MSIX interrupts with %d vectors\n", msgs); adapter->num_queues = queues; return (msgs); } /* ** If MSIX alloc failed or provided us with ** less than needed, free and fall through to MSI */ pci_release_msi(dev); msi: if (adapter->msix_mem != NULL) { bus_release_resource(dev, SYS_RES_MEMORY, PCIR_BAR(IGB_MSIX_BAR), adapter->msix_mem); adapter->msix_mem = NULL; } msgs = 1; if (pci_alloc_msi(dev, &msgs) == 0) { device_printf(adapter->dev," Using an MSI interrupt\n"); return (msgs); } device_printf(adapter->dev," Using a Legacy interrupt\n"); return (0); } /********************************************************************* * * Initialize the DMA Coalescing feature * **********************************************************************/ static void igb_init_dmac(struct adapter *adapter, u32 pba) { device_t dev = adapter->dev; struct e1000_hw *hw = &adapter->hw; u32 dmac, reg = ~E1000_DMACR_DMAC_EN; u16 hwm; if (hw->mac.type == e1000_i211) return; if (hw->mac.type > e1000_82580) { if (adapter->dmac == 0) { /* Disabling it */ E1000_WRITE_REG(hw, E1000_DMACR, reg); return; } else device_printf(dev, "DMA Coalescing enabled\n"); /* Set starting threshold */ E1000_WRITE_REG(hw, E1000_DMCTXTH, 0); hwm = 64 * pba - adapter->max_frame_size / 16; if (hwm < 64 * (pba - 6)) hwm = 64 * (pba - 6); reg = E1000_READ_REG(hw, E1000_FCRTC); reg &= ~E1000_FCRTC_RTH_COAL_MASK; reg |= ((hwm << E1000_FCRTC_RTH_COAL_SHIFT) & E1000_FCRTC_RTH_COAL_MASK); E1000_WRITE_REG(hw, E1000_FCRTC, reg); dmac = pba - adapter->max_frame_size / 512; if (dmac < pba - 10) dmac = pba - 10; reg = E1000_READ_REG(hw, E1000_DMACR); reg &= ~E1000_DMACR_DMACTHR_MASK; reg = ((dmac << E1000_DMACR_DMACTHR_SHIFT) & E1000_DMACR_DMACTHR_MASK); /* transition to L0x or L1 if available..*/ reg |= (E1000_DMACR_DMAC_EN | E1000_DMACR_DMAC_LX_MASK); /* Check if status is 2.5Gb backplane connection * before configuration of watchdog timer, which is * in msec values in 12.8usec intervals * watchdog timer= msec values in 32usec intervals * for non 2.5Gb connection */ if (hw->mac.type == e1000_i354) { int status = E1000_READ_REG(hw, E1000_STATUS); if ((status & E1000_STATUS_2P5_SKU) && (!(status & E1000_STATUS_2P5_SKU_OVER))) reg |= ((adapter->dmac * 5) >> 6); else reg |= (adapter->dmac >> 5); } else { reg |= (adapter->dmac >> 5); } E1000_WRITE_REG(hw, E1000_DMACR, reg); E1000_WRITE_REG(hw, E1000_DMCRTRH, 0); /* Set the interval before transition */ reg = E1000_READ_REG(hw, E1000_DMCTLX); if (hw->mac.type == e1000_i350) reg |= IGB_DMCTLX_DCFLUSH_DIS; /* ** in 2.5Gb connection, TTLX unit is 0.4 usec ** which is 0x4*2 = 0xA. But delay is still 4 usec */ if (hw->mac.type == e1000_i354) { int status = E1000_READ_REG(hw, E1000_STATUS); if ((status & E1000_STATUS_2P5_SKU) && (!(status & E1000_STATUS_2P5_SKU_OVER))) reg |= 0xA; else reg |= 0x4; } else { reg |= 0x4; } E1000_WRITE_REG(hw, E1000_DMCTLX, reg); /* free space in tx packet buffer to wake from DMA coal */ E1000_WRITE_REG(hw, E1000_DMCTXTH, (IGB_TXPBSIZE - (2 * adapter->max_frame_size)) >> 6); /* make low power state decision controlled by DMA coal */ reg = E1000_READ_REG(hw, E1000_PCIEMISC); reg &= ~E1000_PCIEMISC_LX_DECISION; E1000_WRITE_REG(hw, E1000_PCIEMISC, reg); } else if (hw->mac.type == e1000_82580) { u32 reg = E1000_READ_REG(hw, E1000_PCIEMISC); E1000_WRITE_REG(hw, E1000_PCIEMISC, reg & ~E1000_PCIEMISC_LX_DECISION); E1000_WRITE_REG(hw, E1000_DMACR, 0); } } /********************************************************************* * * Set up an fresh starting state * **********************************************************************/ static void igb_reset(struct adapter *adapter) { device_t dev = adapter->dev; struct e1000_hw *hw = &adapter->hw; struct e1000_fc_info *fc = &hw->fc; struct ifnet *ifp = adapter->ifp; u32 pba = 0; u16 hwm; INIT_DEBUGOUT("igb_reset: begin"); /* Let the firmware know the OS is in control */ igb_get_hw_control(adapter); /* * Packet Buffer Allocation (PBA) * Writing PBA sets the receive portion of the buffer * the remainder is used for the transmit buffer. */ switch (hw->mac.type) { case e1000_82575: pba = E1000_PBA_32K; break; case e1000_82576: case e1000_vfadapt: pba = E1000_READ_REG(hw, E1000_RXPBS); pba &= E1000_RXPBS_SIZE_MASK_82576; break; case e1000_82580: case e1000_i350: case e1000_i354: case e1000_vfadapt_i350: pba = E1000_READ_REG(hw, E1000_RXPBS); pba = e1000_rxpbs_adjust_82580(pba); break; case e1000_i210: case e1000_i211: pba = E1000_PBA_34K; default: break; } /* Special needs in case of Jumbo frames */ if ((hw->mac.type == e1000_82575) && (ifp->if_mtu > ETHERMTU)) { u32 tx_space, min_tx, min_rx; pba = E1000_READ_REG(hw, E1000_PBA); tx_space = pba >> 16; pba &= 0xffff; min_tx = (adapter->max_frame_size + sizeof(struct e1000_tx_desc) - ETHERNET_FCS_SIZE) * 2; min_tx = roundup2(min_tx, 1024); min_tx >>= 10; min_rx = adapter->max_frame_size; min_rx = roundup2(min_rx, 1024); min_rx >>= 10; if (tx_space < min_tx && ((min_tx - tx_space) < pba)) { pba = pba - (min_tx - tx_space); /* * if short on rx space, rx wins * and must trump tx adjustment */ if (pba < min_rx) pba = min_rx; } E1000_WRITE_REG(hw, E1000_PBA, pba); } INIT_DEBUGOUT1("igb_init: pba=%dK",pba); /* * These parameters control the automatic generation (Tx) and * response (Rx) to Ethernet PAUSE frames. * - High water mark should allow for at least two frames to be * received after sending an XOFF. * - Low water mark works best when it is very near the high water mark. * This allows the receiver to restart by sending XON when it has * drained a bit. */ hwm = min(((pba << 10) * 9 / 10), ((pba << 10) - 2 * adapter->max_frame_size)); if (hw->mac.type < e1000_82576) { fc->high_water = hwm & 0xFFF8; /* 8-byte granularity */ fc->low_water = fc->high_water - 8; } else { fc->high_water = hwm & 0xFFF0; /* 16-byte granularity */ fc->low_water = fc->high_water - 16; } fc->pause_time = IGB_FC_PAUSE_TIME; fc->send_xon = TRUE; if (adapter->fc) fc->requested_mode = adapter->fc; else fc->requested_mode = e1000_fc_default; /* Issue a global reset */ e1000_reset_hw(hw); E1000_WRITE_REG(hw, E1000_WUC, 0); /* Reset for AutoMediaDetect */ if (adapter->flags & IGB_MEDIA_RESET) { e1000_setup_init_funcs(hw, TRUE); e1000_get_bus_info(hw); adapter->flags &= ~IGB_MEDIA_RESET; } if (e1000_init_hw(hw) < 0) device_printf(dev, "Hardware Initialization Failed\n"); /* Setup DMA Coalescing */ igb_init_dmac(adapter, pba); E1000_WRITE_REG(&adapter->hw, E1000_VET, ETHERTYPE_VLAN); e1000_get_phy_info(hw); e1000_check_for_link(hw); return; } /********************************************************************* * * Setup networking device structure and register an interface. * **********************************************************************/ static int igb_setup_interface(device_t dev, struct adapter *adapter) { struct ifnet *ifp; INIT_DEBUGOUT("igb_setup_interface: begin"); ifp = adapter->ifp = if_alloc(IFT_ETHER); if (ifp == NULL) { device_printf(dev, "can not allocate ifnet structure\n"); return (-1); } if_initname(ifp, device_get_name(dev), device_get_unit(dev)); ifp->if_init = igb_init; ifp->if_softc = adapter; ifp->if_flags = IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST; ifp->if_ioctl = igb_ioctl; ifp->if_get_counter = igb_get_counter; /* TSO parameters */ ifp->if_hw_tsomax = IP_MAXPACKET; ifp->if_hw_tsomaxsegcount = IGB_MAX_SCATTER; ifp->if_hw_tsomaxsegsize = IGB_TSO_SEG_SIZE; #ifndef IGB_LEGACY_TX ifp->if_transmit = igb_mq_start; ifp->if_qflush = igb_qflush; #else ifp->if_start = igb_start; IFQ_SET_MAXLEN(&ifp->if_snd, adapter->num_tx_desc - 1); ifp->if_snd.ifq_drv_maxlen = adapter->num_tx_desc - 1; IFQ_SET_READY(&ifp->if_snd); #endif ether_ifattach(ifp, adapter->hw.mac.addr); ifp->if_capabilities = ifp->if_capenable = 0; ifp->if_capabilities = IFCAP_HWCSUM | IFCAP_VLAN_HWCSUM; #if __FreeBSD_version >= 1000000 ifp->if_capabilities |= IFCAP_HWCSUM_IPV6; #endif ifp->if_capabilities |= IFCAP_TSO; ifp->if_capabilities |= IFCAP_JUMBO_MTU; ifp->if_capenable = ifp->if_capabilities; /* Don't enable LRO by default */ ifp->if_capabilities |= IFCAP_LRO; #ifdef DEVICE_POLLING ifp->if_capabilities |= IFCAP_POLLING; #endif /* * Tell the upper layer(s) we * support full VLAN capability. */ ifp->if_hdrlen = sizeof(struct ether_vlan_header); ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING | IFCAP_VLAN_HWTSO | IFCAP_VLAN_MTU; ifp->if_capenable |= IFCAP_VLAN_HWTAGGING | IFCAP_VLAN_HWTSO | IFCAP_VLAN_MTU; /* ** Don't turn this on by default, if vlans are ** created on another pseudo device (eg. lagg) ** then vlan events are not passed thru, breaking ** operation, but with HW FILTER off it works. If ** using vlans directly on the igb driver you can ** enable this and get full hardware tag filtering. */ ifp->if_capabilities |= IFCAP_VLAN_HWFILTER; /* * Specify the media types supported by this adapter and register * callbacks to update media and link information */ ifmedia_init(&adapter->media, IFM_IMASK, igb_media_change, igb_media_status); if ((adapter->hw.phy.media_type == e1000_media_type_fiber) || (adapter->hw.phy.media_type == e1000_media_type_internal_serdes)) { ifmedia_add(&adapter->media, IFM_ETHER | IFM_1000_SX | IFM_FDX, 0, NULL); ifmedia_add(&adapter->media, IFM_ETHER | IFM_1000_SX, 0, NULL); } else { ifmedia_add(&adapter->media, IFM_ETHER | IFM_10_T, 0, NULL); ifmedia_add(&adapter->media, IFM_ETHER | IFM_10_T | IFM_FDX, 0, NULL); ifmedia_add(&adapter->media, IFM_ETHER | IFM_100_TX, 0, NULL); ifmedia_add(&adapter->media, IFM_ETHER | IFM_100_TX | IFM_FDX, 0, NULL); if (adapter->hw.phy.type != e1000_phy_ife) { ifmedia_add(&adapter->media, IFM_ETHER | IFM_1000_T | IFM_FDX, 0, NULL); ifmedia_add(&adapter->media, IFM_ETHER | IFM_1000_T, 0, NULL); } } ifmedia_add(&adapter->media, IFM_ETHER | IFM_AUTO, 0, NULL); ifmedia_set(&adapter->media, IFM_ETHER | IFM_AUTO); return (0); } /* * Manage DMA'able memory. */ static void igb_dmamap_cb(void *arg, bus_dma_segment_t *segs, int nseg, int error) { if (error) return; *(bus_addr_t *) arg = segs[0].ds_addr; } static int igb_dma_malloc(struct adapter *adapter, bus_size_t size, struct igb_dma_alloc *dma, int mapflags) { int error; error = bus_dma_tag_create(bus_get_dma_tag(adapter->dev), /* parent */ IGB_DBA_ALIGN, 0, /* alignment, bounds */ BUS_SPACE_MAXADDR, /* lowaddr */ BUS_SPACE_MAXADDR, /* highaddr */ NULL, NULL, /* filter, filterarg */ size, /* maxsize */ 1, /* nsegments */ size, /* maxsegsize */ 0, /* flags */ NULL, /* lockfunc */ NULL, /* lockarg */ &dma->dma_tag); if (error) { device_printf(adapter->dev, "%s: bus_dma_tag_create failed: %d\n", __func__, error); goto fail_0; } error = bus_dmamem_alloc(dma->dma_tag, (void**) &dma->dma_vaddr, BUS_DMA_NOWAIT | BUS_DMA_COHERENT, &dma->dma_map); if (error) { device_printf(adapter->dev, "%s: bus_dmamem_alloc(%ju) failed: %d\n", __func__, (uintmax_t)size, error); goto fail_2; } dma->dma_paddr = 0; error = bus_dmamap_load(dma->dma_tag, dma->dma_map, dma->dma_vaddr, size, igb_dmamap_cb, &dma->dma_paddr, mapflags | BUS_DMA_NOWAIT); if (error || dma->dma_paddr == 0) { device_printf(adapter->dev, "%s: bus_dmamap_load failed: %d\n", __func__, error); goto fail_3; } return (0); fail_3: bus_dmamap_unload(dma->dma_tag, dma->dma_map); fail_2: bus_dmamem_free(dma->dma_tag, dma->dma_vaddr, dma->dma_map); bus_dma_tag_destroy(dma->dma_tag); fail_0: dma->dma_tag = NULL; return (error); } static void igb_dma_free(struct adapter *adapter, struct igb_dma_alloc *dma) { if (dma->dma_tag == NULL) return; if (dma->dma_paddr != 0) { bus_dmamap_sync(dma->dma_tag, dma->dma_map, BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE); bus_dmamap_unload(dma->dma_tag, dma->dma_map); dma->dma_paddr = 0; } if (dma->dma_vaddr != NULL) { bus_dmamem_free(dma->dma_tag, dma->dma_vaddr, dma->dma_map); dma->dma_vaddr = NULL; } bus_dma_tag_destroy(dma->dma_tag); dma->dma_tag = NULL; } /********************************************************************* * * Allocate memory for the transmit and receive rings, and then * the descriptors associated with each, called only once at attach. * **********************************************************************/ static int igb_allocate_queues(struct adapter *adapter) { device_t dev = adapter->dev; struct igb_queue *que = NULL; struct tx_ring *txr = NULL; struct rx_ring *rxr = NULL; int rsize, tsize, error = E1000_SUCCESS; int txconf = 0, rxconf = 0; /* First allocate the top level queue structs */ if (!(adapter->queues = (struct igb_queue *) malloc(sizeof(struct igb_queue) * adapter->num_queues, M_DEVBUF, M_NOWAIT | M_ZERO))) { device_printf(dev, "Unable to allocate queue memory\n"); error = ENOMEM; goto fail; } /* Next allocate the TX ring struct memory */ if (!(adapter->tx_rings = (struct tx_ring *) malloc(sizeof(struct tx_ring) * adapter->num_queues, M_DEVBUF, M_NOWAIT | M_ZERO))) { device_printf(dev, "Unable to allocate TX ring memory\n"); error = ENOMEM; goto tx_fail; } /* Now allocate the RX */ if (!(adapter->rx_rings = (struct rx_ring *) malloc(sizeof(struct rx_ring) * adapter->num_queues, M_DEVBUF, M_NOWAIT | M_ZERO))) { device_printf(dev, "Unable to allocate RX ring memory\n"); error = ENOMEM; goto rx_fail; } tsize = roundup2(adapter->num_tx_desc * sizeof(union e1000_adv_tx_desc), IGB_DBA_ALIGN); /* * Now set up the TX queues, txconf is needed to handle the * possibility that things fail midcourse and we need to * undo memory gracefully */ for (int i = 0; i < adapter->num_queues; i++, txconf++) { /* Set up some basics */ txr = &adapter->tx_rings[i]; txr->adapter = adapter; txr->me = i; txr->num_desc = adapter->num_tx_desc; /* Initialize the TX lock */ snprintf(txr->mtx_name, sizeof(txr->mtx_name), "%s:tx(%d)", device_get_nameunit(dev), txr->me); mtx_init(&txr->tx_mtx, txr->mtx_name, NULL, MTX_DEF); if (igb_dma_malloc(adapter, tsize, &txr->txdma, BUS_DMA_NOWAIT)) { device_printf(dev, "Unable to allocate TX Descriptor memory\n"); error = ENOMEM; goto err_tx_desc; } txr->tx_base = (union e1000_adv_tx_desc *)txr->txdma.dma_vaddr; bzero((void *)txr->tx_base, tsize); /* Now allocate transmit buffers for the ring */ if (igb_allocate_transmit_buffers(txr)) { device_printf(dev, "Critical Failure setting up transmit buffers\n"); error = ENOMEM; goto err_tx_desc; } #ifndef IGB_LEGACY_TX /* Allocate a buf ring */ txr->br = buf_ring_alloc(igb_buf_ring_size, M_DEVBUF, M_WAITOK, &txr->tx_mtx); #endif } /* * Next the RX queues... */ rsize = roundup2(adapter->num_rx_desc * sizeof(union e1000_adv_rx_desc), IGB_DBA_ALIGN); for (int i = 0; i < adapter->num_queues; i++, rxconf++) { rxr = &adapter->rx_rings[i]; rxr->adapter = adapter; rxr->me = i; /* Initialize the RX lock */ snprintf(rxr->mtx_name, sizeof(rxr->mtx_name), "%s:rx(%d)", device_get_nameunit(dev), txr->me); mtx_init(&rxr->rx_mtx, rxr->mtx_name, NULL, MTX_DEF); if (igb_dma_malloc(adapter, rsize, &rxr->rxdma, BUS_DMA_NOWAIT)) { device_printf(dev, "Unable to allocate RxDescriptor memory\n"); error = ENOMEM; goto err_rx_desc; } rxr->rx_base = (union e1000_adv_rx_desc *)rxr->rxdma.dma_vaddr; bzero((void *)rxr->rx_base, rsize); /* Allocate receive buffers for the ring*/ if (igb_allocate_receive_buffers(rxr)) { device_printf(dev, "Critical Failure setting up receive buffers\n"); error = ENOMEM; goto err_rx_desc; } } /* ** Finally set up the queue holding structs */ for (int i = 0; i < adapter->num_queues; i++) { que = &adapter->queues[i]; que->adapter = adapter; que->txr = &adapter->tx_rings[i]; que->rxr = &adapter->rx_rings[i]; } return (0); err_rx_desc: for (rxr = adapter->rx_rings; rxconf > 0; rxr++, rxconf--) igb_dma_free(adapter, &rxr->rxdma); err_tx_desc: for (txr = adapter->tx_rings; txconf > 0; txr++, txconf--) igb_dma_free(adapter, &txr->txdma); free(adapter->rx_rings, M_DEVBUF); rx_fail: #ifndef IGB_LEGACY_TX buf_ring_free(txr->br, M_DEVBUF); #endif free(adapter->tx_rings, M_DEVBUF); tx_fail: free(adapter->queues, M_DEVBUF); fail: return (error); } /********************************************************************* * * Allocate memory for tx_buffer structures. The tx_buffer stores all * the information needed to transmit a packet on the wire. This is * called only once at attach, setup is done every reset. * **********************************************************************/ static int igb_allocate_transmit_buffers(struct tx_ring *txr) { struct adapter *adapter = txr->adapter; device_t dev = adapter->dev; struct igb_tx_buf *txbuf; int error, i; /* * Setup DMA descriptor areas. */ if ((error = bus_dma_tag_create(bus_get_dma_tag(dev), 1, 0, /* alignment, bounds */ BUS_SPACE_MAXADDR, /* lowaddr */ BUS_SPACE_MAXADDR, /* highaddr */ NULL, NULL, /* filter, filterarg */ IGB_TSO_SIZE, /* maxsize */ IGB_MAX_SCATTER, /* nsegments */ PAGE_SIZE, /* maxsegsize */ 0, /* flags */ NULL, /* lockfunc */ NULL, /* lockfuncarg */ &txr->txtag))) { device_printf(dev,"Unable to allocate TX DMA tag\n"); goto fail; } if (!(txr->tx_buffers = (struct igb_tx_buf *) malloc(sizeof(struct igb_tx_buf) * adapter->num_tx_desc, M_DEVBUF, M_NOWAIT | M_ZERO))) { device_printf(dev, "Unable to allocate tx_buffer memory\n"); error = ENOMEM; goto fail; } /* Create the descriptor buffer dma maps */ txbuf = txr->tx_buffers; for (i = 0; i < adapter->num_tx_desc; i++, txbuf++) { error = bus_dmamap_create(txr->txtag, 0, &txbuf->map); if (error != 0) { device_printf(dev, "Unable to create TX DMA map\n"); goto fail; } } return 0; fail: /* We free all, it handles case where we are in the middle */ igb_free_transmit_structures(adapter); return (error); } /********************************************************************* * * Initialize a transmit ring. * **********************************************************************/ static void igb_setup_transmit_ring(struct tx_ring *txr) { struct adapter *adapter = txr->adapter; struct igb_tx_buf *txbuf; int i; #ifdef DEV_NETMAP struct netmap_adapter *na = NA(adapter->ifp); struct netmap_slot *slot; #endif /* DEV_NETMAP */ /* Clear the old descriptor contents */ IGB_TX_LOCK(txr); #ifdef DEV_NETMAP slot = netmap_reset(na, NR_TX, txr->me, 0); #endif /* DEV_NETMAP */ bzero((void *)txr->tx_base, (sizeof(union e1000_adv_tx_desc)) * adapter->num_tx_desc); /* Reset indices */ txr->next_avail_desc = 0; txr->next_to_clean = 0; /* Free any existing tx buffers. */ txbuf = txr->tx_buffers; for (i = 0; i < adapter->num_tx_desc; i++, txbuf++) { if (txbuf->m_head != NULL) { bus_dmamap_sync(txr->txtag, txbuf->map, BUS_DMASYNC_POSTWRITE); bus_dmamap_unload(txr->txtag, txbuf->map); m_freem(txbuf->m_head); txbuf->m_head = NULL; } #ifdef DEV_NETMAP if (slot) { int si = netmap_idx_n2k(&na->tx_rings[txr->me], i); /* no need to set the address */ netmap_load_map(na, txr->txtag, txbuf->map, NMB(na, slot + si)); } #endif /* DEV_NETMAP */ /* clear the watch index */ txbuf->eop = NULL; } /* Set number of descriptors available */ txr->tx_avail = adapter->num_tx_desc; bus_dmamap_sync(txr->txdma.dma_tag, txr->txdma.dma_map, BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE); IGB_TX_UNLOCK(txr); } /********************************************************************* * * Initialize all transmit rings. * **********************************************************************/ static void igb_setup_transmit_structures(struct adapter *adapter) { struct tx_ring *txr = adapter->tx_rings; for (int i = 0; i < adapter->num_queues; i++, txr++) igb_setup_transmit_ring(txr); return; } /********************************************************************* * * Enable transmit unit. * **********************************************************************/ static void igb_initialize_transmit_units(struct adapter *adapter) { struct tx_ring *txr = adapter->tx_rings; struct e1000_hw *hw = &adapter->hw; u32 tctl, txdctl; INIT_DEBUGOUT("igb_initialize_transmit_units: begin"); tctl = txdctl = 0; /* Setup the Tx Descriptor Rings */ for (int i = 0; i < adapter->num_queues; i++, txr++) { u64 bus_addr = txr->txdma.dma_paddr; E1000_WRITE_REG(hw, E1000_TDLEN(i), adapter->num_tx_desc * sizeof(struct e1000_tx_desc)); E1000_WRITE_REG(hw, E1000_TDBAH(i), (uint32_t)(bus_addr >> 32)); E1000_WRITE_REG(hw, E1000_TDBAL(i), (uint32_t)bus_addr); /* Setup the HW Tx Head and Tail descriptor pointers */ E1000_WRITE_REG(hw, E1000_TDT(i), 0); E1000_WRITE_REG(hw, E1000_TDH(i), 0); HW_DEBUGOUT2("Base = %x, Length = %x\n", E1000_READ_REG(hw, E1000_TDBAL(i)), E1000_READ_REG(hw, E1000_TDLEN(i))); txr->queue_status = IGB_QUEUE_IDLE; txdctl |= IGB_TX_PTHRESH; txdctl |= IGB_TX_HTHRESH << 8; txdctl |= IGB_TX_WTHRESH << 16; txdctl |= E1000_TXDCTL_QUEUE_ENABLE; E1000_WRITE_REG(hw, E1000_TXDCTL(i), txdctl); } if (adapter->vf_ifp) return; e1000_config_collision_dist(hw); /* Program the Transmit Control Register */ tctl = E1000_READ_REG(hw, E1000_TCTL); tctl &= ~E1000_TCTL_CT; tctl |= (E1000_TCTL_PSP | E1000_TCTL_RTLC | E1000_TCTL_EN | (E1000_COLLISION_THRESHOLD << E1000_CT_SHIFT)); /* This write will effectively turn on the transmit unit. */ E1000_WRITE_REG(hw, E1000_TCTL, tctl); } /********************************************************************* * * Free all transmit rings. * **********************************************************************/ static void igb_free_transmit_structures(struct adapter *adapter) { struct tx_ring *txr = adapter->tx_rings; for (int i = 0; i < adapter->num_queues; i++, txr++) { IGB_TX_LOCK(txr); igb_free_transmit_buffers(txr); igb_dma_free(adapter, &txr->txdma); IGB_TX_UNLOCK(txr); IGB_TX_LOCK_DESTROY(txr); } free(adapter->tx_rings, M_DEVBUF); } /********************************************************************* * * Free transmit ring related data structures. * **********************************************************************/ static void igb_free_transmit_buffers(struct tx_ring *txr) { struct adapter *adapter = txr->adapter; struct igb_tx_buf *tx_buffer; int i; INIT_DEBUGOUT("free_transmit_ring: begin"); if (txr->tx_buffers == NULL) return; tx_buffer = txr->tx_buffers; for (i = 0; i < adapter->num_tx_desc; i++, tx_buffer++) { if (tx_buffer->m_head != NULL) { bus_dmamap_sync(txr->txtag, tx_buffer->map, BUS_DMASYNC_POSTWRITE); bus_dmamap_unload(txr->txtag, tx_buffer->map); m_freem(tx_buffer->m_head); tx_buffer->m_head = NULL; if (tx_buffer->map != NULL) { bus_dmamap_destroy(txr->txtag, tx_buffer->map); tx_buffer->map = NULL; } } else if (tx_buffer->map != NULL) { bus_dmamap_unload(txr->txtag, tx_buffer->map); bus_dmamap_destroy(txr->txtag, tx_buffer->map); tx_buffer->map = NULL; } } #ifndef IGB_LEGACY_TX if (txr->br != NULL) buf_ring_free(txr->br, M_DEVBUF); #endif if (txr->tx_buffers != NULL) { free(txr->tx_buffers, M_DEVBUF); txr->tx_buffers = NULL; } if (txr->txtag != NULL) { bus_dma_tag_destroy(txr->txtag); txr->txtag = NULL; } return; } /********************************************************************** * * Setup work for hardware segmentation offload (TSO) on * adapters using advanced tx descriptors * **********************************************************************/ static int igb_tso_setup(struct tx_ring *txr, struct mbuf *mp, u32 *cmd_type_len, u32 *olinfo_status) { struct adapter *adapter = txr->adapter; struct e1000_adv_tx_context_desc *TXD; u32 vlan_macip_lens = 0, type_tucmd_mlhl = 0; u32 mss_l4len_idx = 0, paylen; u16 vtag = 0, eh_type; int ctxd, ehdrlen, ip_hlen, tcp_hlen; struct ether_vlan_header *eh; #ifdef INET6 struct ip6_hdr *ip6; #endif #ifdef INET struct ip *ip; #endif struct tcphdr *th; /* * Determine where frame payload starts. * Jump over vlan headers if already present */ eh = mtod(mp, struct ether_vlan_header *); if (eh->evl_encap_proto == htons(ETHERTYPE_VLAN)) { ehdrlen = ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN; eh_type = eh->evl_proto; } else { ehdrlen = ETHER_HDR_LEN; eh_type = eh->evl_encap_proto; } switch (ntohs(eh_type)) { #ifdef INET6 case ETHERTYPE_IPV6: ip6 = (struct ip6_hdr *)(mp->m_data + ehdrlen); /* XXX-BZ For now we do not pretend to support ext. hdrs. */ if (ip6->ip6_nxt != IPPROTO_TCP) return (ENXIO); ip_hlen = sizeof(struct ip6_hdr); ip6 = (struct ip6_hdr *)(mp->m_data + ehdrlen); th = (struct tcphdr *)((caddr_t)ip6 + ip_hlen); th->th_sum = in6_cksum_pseudo(ip6, 0, IPPROTO_TCP, 0); type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_IPV6; break; #endif #ifdef INET case ETHERTYPE_IP: ip = (struct ip *)(mp->m_data + ehdrlen); if (ip->ip_p != IPPROTO_TCP) return (ENXIO); ip->ip_sum = 0; ip_hlen = ip->ip_hl << 2; th = (struct tcphdr *)((caddr_t)ip + ip_hlen); th->th_sum = in_pseudo(ip->ip_src.s_addr, ip->ip_dst.s_addr, htons(IPPROTO_TCP)); type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_IPV4; /* Tell transmit desc to also do IPv4 checksum. */ *olinfo_status |= E1000_TXD_POPTS_IXSM << 8; break; #endif default: panic("%s: CSUM_TSO but no supported IP version (0x%04x)", __func__, ntohs(eh_type)); break; } ctxd = txr->next_avail_desc; TXD = (struct e1000_adv_tx_context_desc *) &txr->tx_base[ctxd]; tcp_hlen = th->th_off << 2; /* This is used in the transmit desc in encap */ paylen = mp->m_pkthdr.len - ehdrlen - ip_hlen - tcp_hlen; /* VLAN MACLEN IPLEN */ if (mp->m_flags & M_VLANTAG) { vtag = htole16(mp->m_pkthdr.ether_vtag); vlan_macip_lens |= (vtag << E1000_ADVTXD_VLAN_SHIFT); } vlan_macip_lens |= ehdrlen << E1000_ADVTXD_MACLEN_SHIFT; vlan_macip_lens |= ip_hlen; TXD->vlan_macip_lens = htole32(vlan_macip_lens); /* ADV DTYPE TUCMD */ type_tucmd_mlhl |= E1000_ADVTXD_DCMD_DEXT | E1000_ADVTXD_DTYP_CTXT; type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_L4T_TCP; TXD->type_tucmd_mlhl = htole32(type_tucmd_mlhl); /* MSS L4LEN IDX */ mss_l4len_idx |= (mp->m_pkthdr.tso_segsz << E1000_ADVTXD_MSS_SHIFT); mss_l4len_idx |= (tcp_hlen << E1000_ADVTXD_L4LEN_SHIFT); /* 82575 needs the queue index added */ if (adapter->hw.mac.type == e1000_82575) mss_l4len_idx |= txr->me << 4; TXD->mss_l4len_idx = htole32(mss_l4len_idx); TXD->seqnum_seed = htole32(0); if (++ctxd == txr->num_desc) ctxd = 0; txr->tx_avail--; txr->next_avail_desc = ctxd; *cmd_type_len |= E1000_ADVTXD_DCMD_TSE; *olinfo_status |= E1000_TXD_POPTS_TXSM << 8; *olinfo_status |= paylen << E1000_ADVTXD_PAYLEN_SHIFT; ++txr->tso_tx; return (0); } /********************************************************************* * * Advanced Context Descriptor setup for VLAN, CSUM or TSO * **********************************************************************/ static int igb_tx_ctx_setup(struct tx_ring *txr, struct mbuf *mp, u32 *cmd_type_len, u32 *olinfo_status) { struct e1000_adv_tx_context_desc *TXD; struct adapter *adapter = txr->adapter; struct ether_vlan_header *eh; struct ip *ip; struct ip6_hdr *ip6; u32 vlan_macip_lens = 0, type_tucmd_mlhl = 0, mss_l4len_idx = 0; int ehdrlen, ip_hlen = 0; u16 etype; u8 ipproto = 0; int offload = TRUE; int ctxd = txr->next_avail_desc; u16 vtag = 0; /* First check if TSO is to be used */ if (mp->m_pkthdr.csum_flags & CSUM_TSO) return (igb_tso_setup(txr, mp, cmd_type_len, olinfo_status)); if ((mp->m_pkthdr.csum_flags & CSUM_OFFLOAD) == 0) offload = FALSE; /* Indicate the whole packet as payload when not doing TSO */ *olinfo_status |= mp->m_pkthdr.len << E1000_ADVTXD_PAYLEN_SHIFT; /* Now ready a context descriptor */ TXD = (struct e1000_adv_tx_context_desc *) &txr->tx_base[ctxd]; /* ** In advanced descriptors the vlan tag must ** be placed into the context descriptor. Hence ** we need to make one even if not doing offloads. */ if (mp->m_flags & M_VLANTAG) { vtag = htole16(mp->m_pkthdr.ether_vtag); vlan_macip_lens |= (vtag << E1000_ADVTXD_VLAN_SHIFT); } else if (offload == FALSE) /* ... no offload to do */ return (0); /* * Determine where frame payload starts. * Jump over vlan headers if already present, * helpful for QinQ too. */ eh = mtod(mp, struct ether_vlan_header *); if (eh->evl_encap_proto == htons(ETHERTYPE_VLAN)) { etype = ntohs(eh->evl_proto); ehdrlen = ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN; } else { etype = ntohs(eh->evl_encap_proto); ehdrlen = ETHER_HDR_LEN; } /* Set the ether header length */ vlan_macip_lens |= ehdrlen << E1000_ADVTXD_MACLEN_SHIFT; switch (etype) { case ETHERTYPE_IP: ip = (struct ip *)(mp->m_data + ehdrlen); ip_hlen = ip->ip_hl << 2; ipproto = ip->ip_p; type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_IPV4; break; case ETHERTYPE_IPV6: ip6 = (struct ip6_hdr *)(mp->m_data + ehdrlen); ip_hlen = sizeof(struct ip6_hdr); /* XXX-BZ this will go badly in case of ext hdrs. */ ipproto = ip6->ip6_nxt; type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_IPV6; break; default: offload = FALSE; break; } vlan_macip_lens |= ip_hlen; type_tucmd_mlhl |= E1000_ADVTXD_DCMD_DEXT | E1000_ADVTXD_DTYP_CTXT; switch (ipproto) { case IPPROTO_TCP: #if __FreeBSD_version >= 1000000 if (mp->m_pkthdr.csum_flags & (CSUM_IP_TCP | CSUM_IP6_TCP)) #else if (mp->m_pkthdr.csum_flags & CSUM_TCP) #endif type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_L4T_TCP; break; case IPPROTO_UDP: #if __FreeBSD_version >= 1000000 if (mp->m_pkthdr.csum_flags & (CSUM_IP_UDP | CSUM_IP6_UDP)) #else if (mp->m_pkthdr.csum_flags & CSUM_UDP) #endif type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_L4T_UDP; break; #if __FreeBSD_version >= 800000 case IPPROTO_SCTP: #if __FreeBSD_version >= 1000000 if (mp->m_pkthdr.csum_flags & (CSUM_IP_SCTP | CSUM_IP6_SCTP)) #else if (mp->m_pkthdr.csum_flags & CSUM_SCTP) #endif type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_L4T_SCTP; break; #endif default: offload = FALSE; break; } if (offload) /* For the TX descriptor setup */ *olinfo_status |= E1000_TXD_POPTS_TXSM << 8; /* 82575 needs the queue index added */ if (adapter->hw.mac.type == e1000_82575) mss_l4len_idx = txr->me << 4; /* Now copy bits into descriptor */ TXD->vlan_macip_lens = htole32(vlan_macip_lens); TXD->type_tucmd_mlhl = htole32(type_tucmd_mlhl); TXD->seqnum_seed = htole32(0); TXD->mss_l4len_idx = htole32(mss_l4len_idx); /* We've consumed the first desc, adjust counters */ if (++ctxd == txr->num_desc) ctxd = 0; txr->next_avail_desc = ctxd; --txr->tx_avail; return (0); } /********************************************************************** * * Examine each tx_buffer in the used queue. If the hardware is done * processing the packet then free associated resources. The * tx_buffer is put back on the free queue. * * TRUE return means there's work in the ring to clean, FALSE its empty. **********************************************************************/ static bool igb_txeof(struct tx_ring *txr) { struct adapter *adapter = txr->adapter; #ifdef DEV_NETMAP struct ifnet *ifp = adapter->ifp; #endif /* DEV_NETMAP */ u32 work, processed = 0; int limit = adapter->tx_process_limit; struct igb_tx_buf *buf; union e1000_adv_tx_desc *txd; mtx_assert(&txr->tx_mtx, MA_OWNED); #ifdef DEV_NETMAP if (netmap_tx_irq(ifp, txr->me)) return (FALSE); #endif /* DEV_NETMAP */ if (txr->tx_avail == txr->num_desc) { txr->queue_status = IGB_QUEUE_IDLE; return FALSE; } /* Get work starting point */ work = txr->next_to_clean; buf = &txr->tx_buffers[work]; txd = &txr->tx_base[work]; work -= txr->num_desc; /* The distance to ring end */ bus_dmamap_sync(txr->txdma.dma_tag, txr->txdma.dma_map, BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE); do { union e1000_adv_tx_desc *eop = buf->eop; if (eop == NULL) /* No work */ break; if ((eop->wb.status & E1000_TXD_STAT_DD) == 0) break; /* I/O not complete */ if (buf->m_head) { txr->bytes += buf->m_head->m_pkthdr.len; bus_dmamap_sync(txr->txtag, buf->map, BUS_DMASYNC_POSTWRITE); bus_dmamap_unload(txr->txtag, buf->map); m_freem(buf->m_head); buf->m_head = NULL; } buf->eop = NULL; ++txr->tx_avail; /* We clean the range if multi segment */ while (txd != eop) { ++txd; ++buf; ++work; /* wrap the ring? */ if (__predict_false(!work)) { work -= txr->num_desc; buf = txr->tx_buffers; txd = txr->tx_base; } if (buf->m_head) { txr->bytes += buf->m_head->m_pkthdr.len; bus_dmamap_sync(txr->txtag, buf->map, BUS_DMASYNC_POSTWRITE); bus_dmamap_unload(txr->txtag, buf->map); m_freem(buf->m_head); buf->m_head = NULL; } ++txr->tx_avail; buf->eop = NULL; } ++txr->packets; ++processed; txr->watchdog_time = ticks; /* Try the next packet */ ++txd; ++buf; ++work; /* reset with a wrap */ if (__predict_false(!work)) { work -= txr->num_desc; buf = txr->tx_buffers; txd = txr->tx_base; } prefetch(txd); } while (__predict_true(--limit)); bus_dmamap_sync(txr->txdma.dma_tag, txr->txdma.dma_map, BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE); work += txr->num_desc; txr->next_to_clean = work; /* ** Watchdog calculation, we know there's ** work outstanding or the first return ** would have been taken, so none processed ** for too long indicates a hang. */ if ((!processed) && ((ticks - txr->watchdog_time) > IGB_WATCHDOG)) txr->queue_status |= IGB_QUEUE_HUNG; if (txr->tx_avail >= IGB_QUEUE_THRESHOLD) txr->queue_status &= ~IGB_QUEUE_DEPLETED; if (txr->tx_avail == txr->num_desc) { txr->queue_status = IGB_QUEUE_IDLE; return (FALSE); } return (TRUE); } /********************************************************************* * * Refresh mbuf buffers for RX descriptor rings * - now keeps its own state so discards due to resource * exhaustion are unnecessary, if an mbuf cannot be obtained * it just returns, keeping its placeholder, thus it can simply * be recalled to try again. * **********************************************************************/ static void igb_refresh_mbufs(struct rx_ring *rxr, int limit) { struct adapter *adapter = rxr->adapter; bus_dma_segment_t hseg[1]; bus_dma_segment_t pseg[1]; struct igb_rx_buf *rxbuf; struct mbuf *mh, *mp; int i, j, nsegs, error; bool refreshed = FALSE; i = j = rxr->next_to_refresh; /* ** Get one descriptor beyond ** our work mark to control ** the loop. */ if (++j == adapter->num_rx_desc) j = 0; while (j != limit) { rxbuf = &rxr->rx_buffers[i]; /* No hdr mbuf used with header split off */ if (rxr->hdr_split == FALSE) goto no_split; if (rxbuf->m_head == NULL) { mh = m_gethdr(M_NOWAIT, MT_DATA); if (mh == NULL) goto update; } else mh = rxbuf->m_head; mh->m_pkthdr.len = mh->m_len = MHLEN; mh->m_len = MHLEN; mh->m_flags |= M_PKTHDR; /* Get the memory mapping */ error = bus_dmamap_load_mbuf_sg(rxr->htag, rxbuf->hmap, mh, hseg, &nsegs, BUS_DMA_NOWAIT); if (error != 0) { printf("Refresh mbufs: hdr dmamap load" " failure - %d\n", error); m_free(mh); rxbuf->m_head = NULL; goto update; } rxbuf->m_head = mh; bus_dmamap_sync(rxr->htag, rxbuf->hmap, BUS_DMASYNC_PREREAD); rxr->rx_base[i].read.hdr_addr = htole64(hseg[0].ds_addr); no_split: if (rxbuf->m_pack == NULL) { mp = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, adapter->rx_mbuf_sz); if (mp == NULL) goto update; } else mp = rxbuf->m_pack; mp->m_pkthdr.len = mp->m_len = adapter->rx_mbuf_sz; /* Get the memory mapping */ error = bus_dmamap_load_mbuf_sg(rxr->ptag, rxbuf->pmap, mp, pseg, &nsegs, BUS_DMA_NOWAIT); if (error != 0) { printf("Refresh mbufs: payload dmamap load" " failure - %d\n", error); m_free(mp); rxbuf->m_pack = NULL; goto update; } rxbuf->m_pack = mp; bus_dmamap_sync(rxr->ptag, rxbuf->pmap, BUS_DMASYNC_PREREAD); rxr->rx_base[i].read.pkt_addr = htole64(pseg[0].ds_addr); refreshed = TRUE; /* I feel wefreshed :) */ i = j; /* our next is precalculated */ rxr->next_to_refresh = i; if (++j == adapter->num_rx_desc) j = 0; } update: if (refreshed) /* update tail */ E1000_WRITE_REG(&adapter->hw, E1000_RDT(rxr->me), rxr->next_to_refresh); return; } /********************************************************************* * * Allocate memory for rx_buffer structures. Since we use one * rx_buffer per received packet, the maximum number of rx_buffer's * that we'll need is equal to the number of receive descriptors * that we've allocated. * **********************************************************************/ static int igb_allocate_receive_buffers(struct rx_ring *rxr) { struct adapter *adapter = rxr->adapter; device_t dev = adapter->dev; struct igb_rx_buf *rxbuf; int i, bsize, error; bsize = sizeof(struct igb_rx_buf) * adapter->num_rx_desc; if (!(rxr->rx_buffers = (struct igb_rx_buf *) malloc(bsize, M_DEVBUF, M_NOWAIT | M_ZERO))) { device_printf(dev, "Unable to allocate rx_buffer memory\n"); error = ENOMEM; goto fail; } if ((error = bus_dma_tag_create(bus_get_dma_tag(dev), 1, 0, /* alignment, bounds */ BUS_SPACE_MAXADDR, /* lowaddr */ BUS_SPACE_MAXADDR, /* highaddr */ NULL, NULL, /* filter, filterarg */ MSIZE, /* maxsize */ 1, /* nsegments */ MSIZE, /* maxsegsize */ 0, /* flags */ NULL, /* lockfunc */ NULL, /* lockfuncarg */ &rxr->htag))) { device_printf(dev, "Unable to create RX DMA tag\n"); goto fail; } if ((error = bus_dma_tag_create(bus_get_dma_tag(dev), 1, 0, /* alignment, bounds */ BUS_SPACE_MAXADDR, /* lowaddr */ BUS_SPACE_MAXADDR, /* highaddr */ NULL, NULL, /* filter, filterarg */ MJUM9BYTES, /* maxsize */ 1, /* nsegments */ MJUM9BYTES, /* maxsegsize */ 0, /* flags */ NULL, /* lockfunc */ NULL, /* lockfuncarg */ &rxr->ptag))) { device_printf(dev, "Unable to create RX payload DMA tag\n"); goto fail; } for (i = 0; i < adapter->num_rx_desc; i++) { rxbuf = &rxr->rx_buffers[i]; error = bus_dmamap_create(rxr->htag, 0, &rxbuf->hmap); if (error) { device_printf(dev, "Unable to create RX head DMA maps\n"); goto fail; } error = bus_dmamap_create(rxr->ptag, 0, &rxbuf->pmap); if (error) { device_printf(dev, "Unable to create RX packet DMA maps\n"); goto fail; } } return (0); fail: /* Frees all, but can handle partial completion */ igb_free_receive_structures(adapter); return (error); } static void igb_free_receive_ring(struct rx_ring *rxr) { struct adapter *adapter = rxr->adapter; struct igb_rx_buf *rxbuf; for (int i = 0; i < adapter->num_rx_desc; i++) { rxbuf = &rxr->rx_buffers[i]; if (rxbuf->m_head != NULL) { bus_dmamap_sync(rxr->htag, rxbuf->hmap, BUS_DMASYNC_POSTREAD); bus_dmamap_unload(rxr->htag, rxbuf->hmap); rxbuf->m_head->m_flags |= M_PKTHDR; m_freem(rxbuf->m_head); } if (rxbuf->m_pack != NULL) { bus_dmamap_sync(rxr->ptag, rxbuf->pmap, BUS_DMASYNC_POSTREAD); bus_dmamap_unload(rxr->ptag, rxbuf->pmap); rxbuf->m_pack->m_flags |= M_PKTHDR; m_freem(rxbuf->m_pack); } rxbuf->m_head = NULL; rxbuf->m_pack = NULL; } } /********************************************************************* * * Initialize a receive ring and its buffers. * **********************************************************************/ static int igb_setup_receive_ring(struct rx_ring *rxr) { struct adapter *adapter; struct ifnet *ifp; device_t dev; struct igb_rx_buf *rxbuf; bus_dma_segment_t pseg[1], hseg[1]; struct lro_ctrl *lro = &rxr->lro; int rsize, nsegs, error = 0; #ifdef DEV_NETMAP struct netmap_adapter *na = NA(rxr->adapter->ifp); struct netmap_slot *slot; #endif /* DEV_NETMAP */ adapter = rxr->adapter; dev = adapter->dev; ifp = adapter->ifp; /* Clear the ring contents */ IGB_RX_LOCK(rxr); #ifdef DEV_NETMAP slot = netmap_reset(na, NR_RX, rxr->me, 0); #endif /* DEV_NETMAP */ rsize = roundup2(adapter->num_rx_desc * sizeof(union e1000_adv_rx_desc), IGB_DBA_ALIGN); bzero((void *)rxr->rx_base, rsize); /* ** Free current RX buffer structures and their mbufs */ igb_free_receive_ring(rxr); /* Configure for header split? */ if (igb_header_split) rxr->hdr_split = TRUE; /* Now replenish the ring mbufs */ for (int j = 0; j < adapter->num_rx_desc; ++j) { struct mbuf *mh, *mp; rxbuf = &rxr->rx_buffers[j]; #ifdef DEV_NETMAP if (slot) { /* slot sj is mapped to the j-th NIC-ring entry */ int sj = netmap_idx_n2k(&na->rx_rings[rxr->me], j); uint64_t paddr; void *addr; addr = PNMB(na, slot + sj, &paddr); netmap_load_map(na, rxr->ptag, rxbuf->pmap, addr); /* Update descriptor */ rxr->rx_base[j].read.pkt_addr = htole64(paddr); continue; } #endif /* DEV_NETMAP */ if (rxr->hdr_split == FALSE) goto skip_head; /* First the header */ rxbuf->m_head = m_gethdr(M_NOWAIT, MT_DATA); if (rxbuf->m_head == NULL) { error = ENOBUFS; goto fail; } m_adj(rxbuf->m_head, ETHER_ALIGN); mh = rxbuf->m_head; mh->m_len = mh->m_pkthdr.len = MHLEN; mh->m_flags |= M_PKTHDR; /* Get the memory mapping */ error = bus_dmamap_load_mbuf_sg(rxr->htag, rxbuf->hmap, rxbuf->m_head, hseg, &nsegs, BUS_DMA_NOWAIT); if (error != 0) /* Nothing elegant to do here */ goto fail; bus_dmamap_sync(rxr->htag, rxbuf->hmap, BUS_DMASYNC_PREREAD); /* Update descriptor */ rxr->rx_base[j].read.hdr_addr = htole64(hseg[0].ds_addr); skip_head: /* Now the payload cluster */ rxbuf->m_pack = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, adapter->rx_mbuf_sz); if (rxbuf->m_pack == NULL) { error = ENOBUFS; goto fail; } mp = rxbuf->m_pack; mp->m_pkthdr.len = mp->m_len = adapter->rx_mbuf_sz; /* Get the memory mapping */ error = bus_dmamap_load_mbuf_sg(rxr->ptag, rxbuf->pmap, mp, pseg, &nsegs, BUS_DMA_NOWAIT); if (error != 0) goto fail; bus_dmamap_sync(rxr->ptag, rxbuf->pmap, BUS_DMASYNC_PREREAD); /* Update descriptor */ rxr->rx_base[j].read.pkt_addr = htole64(pseg[0].ds_addr); } /* Setup our descriptor indices */ rxr->next_to_check = 0; rxr->next_to_refresh = adapter->num_rx_desc - 1; rxr->lro_enabled = FALSE; rxr->rx_split_packets = 0; rxr->rx_bytes = 0; rxr->fmp = NULL; rxr->lmp = NULL; bus_dmamap_sync(rxr->rxdma.dma_tag, rxr->rxdma.dma_map, BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE); /* ** Now set up the LRO interface, we ** also only do head split when LRO ** is enabled, since so often they ** are undesirable in similar setups. */ if (ifp->if_capenable & IFCAP_LRO) { error = tcp_lro_init(lro); if (error) { device_printf(dev, "LRO Initialization failed!\n"); goto fail; } INIT_DEBUGOUT("RX LRO Initialized\n"); rxr->lro_enabled = TRUE; lro->ifp = adapter->ifp; } IGB_RX_UNLOCK(rxr); return (0); fail: igb_free_receive_ring(rxr); IGB_RX_UNLOCK(rxr); return (error); } /********************************************************************* * * Initialize all receive rings. * **********************************************************************/ static int igb_setup_receive_structures(struct adapter *adapter) { struct rx_ring *rxr = adapter->rx_rings; int i; for (i = 0; i < adapter->num_queues; i++, rxr++) if (igb_setup_receive_ring(rxr)) goto fail; return (0); fail: /* * Free RX buffers allocated so far, we will only handle * the rings that completed, the failing case will have * cleaned up for itself. 'i' is the endpoint. */ for (int j = 0; j < i; ++j) { rxr = &adapter->rx_rings[j]; IGB_RX_LOCK(rxr); igb_free_receive_ring(rxr); IGB_RX_UNLOCK(rxr); } return (ENOBUFS); } /* * Initialise the RSS mapping for NICs that support multiple transmit/ * receive rings. */ static void igb_initialise_rss_mapping(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; int i; int queue_id; u32 reta; u32 rss_key[10], mrqc, shift = 0; /* XXX? */ if (adapter->hw.mac.type == e1000_82575) shift = 6; /* * The redirection table controls which destination * queue each bucket redirects traffic to. * Each DWORD represents four queues, with the LSB * being the first queue in the DWORD. * * This just allocates buckets to queues using round-robin * allocation. * * NOTE: It Just Happens to line up with the default * RSS allocation method. */ /* Warning FM follows */ reta = 0; for (i = 0; i < 128; i++) { #ifdef RSS queue_id = rss_get_indirection_to_bucket(i); /* * If we have more queues than buckets, we'll * end up mapping buckets to a subset of the * queues. * * If we have more buckets than queues, we'll * end up instead assigning multiple buckets * to queues. * * Both are suboptimal, but we need to handle * the case so we don't go out of bounds * indexing arrays and such. */ queue_id = queue_id % adapter->num_queues; #else queue_id = (i % adapter->num_queues); #endif /* Adjust if required */ queue_id = queue_id << shift; /* * The low 8 bits are for hash value (n+0); * The next 8 bits are for hash value (n+1), etc. */ reta = reta >> 8; reta = reta | ( ((uint32_t) queue_id) << 24); if ((i & 3) == 3) { E1000_WRITE_REG(hw, E1000_RETA(i >> 2), reta); reta = 0; } } /* Now fill in hash table */ /* * MRQC: Multiple Receive Queues Command * Set queuing to RSS control, number depends on the device. */ mrqc = E1000_MRQC_ENABLE_RSS_8Q; #ifdef RSS /* XXX ew typecasting */ rss_getkey((uint8_t *) &rss_key); #else arc4rand(&rss_key, sizeof(rss_key), 0); #endif for (i = 0; i < 10; i++) E1000_WRITE_REG_ARRAY(hw, E1000_RSSRK(0), i, rss_key[i]); /* * Configure the RSS fields to hash upon. */ mrqc |= (E1000_MRQC_RSS_FIELD_IPV4 | E1000_MRQC_RSS_FIELD_IPV4_TCP); mrqc |= (E1000_MRQC_RSS_FIELD_IPV6 | E1000_MRQC_RSS_FIELD_IPV6_TCP); mrqc |=( E1000_MRQC_RSS_FIELD_IPV4_UDP | E1000_MRQC_RSS_FIELD_IPV6_UDP); mrqc |=( E1000_MRQC_RSS_FIELD_IPV6_UDP_EX | E1000_MRQC_RSS_FIELD_IPV6_TCP_EX); E1000_WRITE_REG(hw, E1000_MRQC, mrqc); } /********************************************************************* * * Enable receive unit. * **********************************************************************/ static void igb_initialize_receive_units(struct adapter *adapter) { struct rx_ring *rxr = adapter->rx_rings; struct ifnet *ifp = adapter->ifp; struct e1000_hw *hw = &adapter->hw; u32 rctl, rxcsum, psize, srrctl = 0; INIT_DEBUGOUT("igb_initialize_receive_unit: begin"); /* * Make sure receives are disabled while setting * up the descriptor ring */ rctl = E1000_READ_REG(hw, E1000_RCTL); E1000_WRITE_REG(hw, E1000_RCTL, rctl & ~E1000_RCTL_EN); /* ** Set up for header split */ if (igb_header_split) { /* Use a standard mbuf for the header */ srrctl |= IGB_HDR_BUF << E1000_SRRCTL_BSIZEHDRSIZE_SHIFT; srrctl |= E1000_SRRCTL_DESCTYPE_HDR_SPLIT_ALWAYS; } else srrctl |= E1000_SRRCTL_DESCTYPE_ADV_ONEBUF; /* ** Set up for jumbo frames */ if (ifp->if_mtu > ETHERMTU) { rctl |= E1000_RCTL_LPE; if (adapter->rx_mbuf_sz == MJUMPAGESIZE) { srrctl |= 4096 >> E1000_SRRCTL_BSIZEPKT_SHIFT; rctl |= E1000_RCTL_SZ_4096 | E1000_RCTL_BSEX; } else if (adapter->rx_mbuf_sz > MJUMPAGESIZE) { srrctl |= 8192 >> E1000_SRRCTL_BSIZEPKT_SHIFT; rctl |= E1000_RCTL_SZ_8192 | E1000_RCTL_BSEX; } /* Set maximum packet len */ psize = adapter->max_frame_size; /* are we on a vlan? */ if (adapter->ifp->if_vlantrunk != NULL) psize += VLAN_TAG_SIZE; E1000_WRITE_REG(&adapter->hw, E1000_RLPML, psize); } else { rctl &= ~E1000_RCTL_LPE; srrctl |= 2048 >> E1000_SRRCTL_BSIZEPKT_SHIFT; rctl |= E1000_RCTL_SZ_2048; } /* * If TX flow control is disabled and there's >1 queue defined, * enable DROP. * * This drops frames rather than hanging the RX MAC for all queues. */ if ((adapter->num_queues > 1) && (adapter->fc == e1000_fc_none || adapter->fc == e1000_fc_rx_pause)) { srrctl |= E1000_SRRCTL_DROP_EN; } /* Setup the Base and Length of the Rx Descriptor Rings */ for (int i = 0; i < adapter->num_queues; i++, rxr++) { u64 bus_addr = rxr->rxdma.dma_paddr; u32 rxdctl; E1000_WRITE_REG(hw, E1000_RDLEN(i), adapter->num_rx_desc * sizeof(struct e1000_rx_desc)); E1000_WRITE_REG(hw, E1000_RDBAH(i), (uint32_t)(bus_addr >> 32)); E1000_WRITE_REG(hw, E1000_RDBAL(i), (uint32_t)bus_addr); E1000_WRITE_REG(hw, E1000_SRRCTL(i), srrctl); /* Enable this Queue */ rxdctl = E1000_READ_REG(hw, E1000_RXDCTL(i)); rxdctl |= E1000_RXDCTL_QUEUE_ENABLE; rxdctl &= 0xFFF00000; rxdctl |= IGB_RX_PTHRESH; rxdctl |= IGB_RX_HTHRESH << 8; rxdctl |= IGB_RX_WTHRESH << 16; E1000_WRITE_REG(hw, E1000_RXDCTL(i), rxdctl); } /* ** Setup for RX MultiQueue */ rxcsum = E1000_READ_REG(hw, E1000_RXCSUM); if (adapter->num_queues >1) { /* rss setup */ igb_initialise_rss_mapping(adapter); /* ** NOTE: Receive Full-Packet Checksum Offload ** is mutually exclusive with Multiqueue. However ** this is not the same as TCP/IP checksums which ** still work. */ rxcsum |= E1000_RXCSUM_PCSD; #if __FreeBSD_version >= 800000 /* For SCTP Offload */ if ((hw->mac.type != e1000_82575) && (ifp->if_capenable & IFCAP_RXCSUM)) rxcsum |= E1000_RXCSUM_CRCOFL; #endif } else { /* Non RSS setup */ if (ifp->if_capenable & IFCAP_RXCSUM) { rxcsum |= E1000_RXCSUM_IPPCSE; #if __FreeBSD_version >= 800000 if (adapter->hw.mac.type != e1000_82575) rxcsum |= E1000_RXCSUM_CRCOFL; #endif } else rxcsum &= ~E1000_RXCSUM_TUOFL; } E1000_WRITE_REG(hw, E1000_RXCSUM, rxcsum); /* Setup the Receive Control Register */ rctl &= ~(3 << E1000_RCTL_MO_SHIFT); rctl |= E1000_RCTL_EN | E1000_RCTL_BAM | E1000_RCTL_LBM_NO | E1000_RCTL_RDMTS_HALF | (hw->mac.mc_filter_type << E1000_RCTL_MO_SHIFT); /* Strip CRC bytes. */ rctl |= E1000_RCTL_SECRC; /* Make sure VLAN Filters are off */ rctl &= ~E1000_RCTL_VFE; /* Don't store bad packets */ rctl &= ~E1000_RCTL_SBP; /* Enable Receives */ E1000_WRITE_REG(hw, E1000_RCTL, rctl); /* * Setup the HW Rx Head and Tail Descriptor Pointers * - needs to be after enable */ for (int i = 0; i < adapter->num_queues; i++) { rxr = &adapter->rx_rings[i]; E1000_WRITE_REG(hw, E1000_RDH(i), rxr->next_to_check); #ifdef DEV_NETMAP /* * an init() while a netmap client is active must * preserve the rx buffers passed to userspace. * In this driver it means we adjust RDT to * something different from next_to_refresh * (which is not used in netmap mode). */ if (ifp->if_capenable & IFCAP_NETMAP) { struct netmap_adapter *na = NA(adapter->ifp); struct netmap_kring *kring = &na->rx_rings[i]; int t = rxr->next_to_refresh - nm_kr_rxspace(kring); if (t >= adapter->num_rx_desc) t -= adapter->num_rx_desc; else if (t < 0) t += adapter->num_rx_desc; E1000_WRITE_REG(hw, E1000_RDT(i), t); } else #endif /* DEV_NETMAP */ E1000_WRITE_REG(hw, E1000_RDT(i), rxr->next_to_refresh); } return; } /********************************************************************* * * Free receive rings. * **********************************************************************/ static void igb_free_receive_structures(struct adapter *adapter) { struct rx_ring *rxr = adapter->rx_rings; for (int i = 0; i < adapter->num_queues; i++, rxr++) { struct lro_ctrl *lro = &rxr->lro; igb_free_receive_buffers(rxr); tcp_lro_free(lro); igb_dma_free(adapter, &rxr->rxdma); } free(adapter->rx_rings, M_DEVBUF); } /********************************************************************* * * Free receive ring data structures. * **********************************************************************/ static void igb_free_receive_buffers(struct rx_ring *rxr) { struct adapter *adapter = rxr->adapter; struct igb_rx_buf *rxbuf; int i; INIT_DEBUGOUT("free_receive_structures: begin"); /* Cleanup any existing buffers */ if (rxr->rx_buffers != NULL) { for (i = 0; i < adapter->num_rx_desc; i++) { rxbuf = &rxr->rx_buffers[i]; if (rxbuf->m_head != NULL) { bus_dmamap_sync(rxr->htag, rxbuf->hmap, BUS_DMASYNC_POSTREAD); bus_dmamap_unload(rxr->htag, rxbuf->hmap); rxbuf->m_head->m_flags |= M_PKTHDR; m_freem(rxbuf->m_head); } if (rxbuf->m_pack != NULL) { bus_dmamap_sync(rxr->ptag, rxbuf->pmap, BUS_DMASYNC_POSTREAD); bus_dmamap_unload(rxr->ptag, rxbuf->pmap); rxbuf->m_pack->m_flags |= M_PKTHDR; m_freem(rxbuf->m_pack); } rxbuf->m_head = NULL; rxbuf->m_pack = NULL; if (rxbuf->hmap != NULL) { bus_dmamap_destroy(rxr->htag, rxbuf->hmap); rxbuf->hmap = NULL; } if (rxbuf->pmap != NULL) { bus_dmamap_destroy(rxr->ptag, rxbuf->pmap); rxbuf->pmap = NULL; } } if (rxr->rx_buffers != NULL) { free(rxr->rx_buffers, M_DEVBUF); rxr->rx_buffers = NULL; } } if (rxr->htag != NULL) { bus_dma_tag_destroy(rxr->htag); rxr->htag = NULL; } if (rxr->ptag != NULL) { bus_dma_tag_destroy(rxr->ptag); rxr->ptag = NULL; } } static __inline void igb_rx_discard(struct rx_ring *rxr, int i) { struct igb_rx_buf *rbuf; rbuf = &rxr->rx_buffers[i]; /* Partially received? Free the chain */ if (rxr->fmp != NULL) { rxr->fmp->m_flags |= M_PKTHDR; m_freem(rxr->fmp); rxr->fmp = NULL; rxr->lmp = NULL; } /* ** With advanced descriptors the writeback ** clobbers the buffer addrs, so its easier ** to just free the existing mbufs and take ** the normal refresh path to get new buffers ** and mapping. */ if (rbuf->m_head) { m_free(rbuf->m_head); rbuf->m_head = NULL; bus_dmamap_unload(rxr->htag, rbuf->hmap); } if (rbuf->m_pack) { m_free(rbuf->m_pack); rbuf->m_pack = NULL; bus_dmamap_unload(rxr->ptag, rbuf->pmap); } return; } static __inline void igb_rx_input(struct rx_ring *rxr, struct ifnet *ifp, struct mbuf *m, u32 ptype) { /* * ATM LRO is only for IPv4/TCP packets and TCP checksum of the packet * should be computed by hardware. Also it should not have VLAN tag in * ethernet header. */ if (rxr->lro_enabled && (ifp->if_capenable & IFCAP_VLAN_HWTAGGING) != 0 && (ptype & E1000_RXDADV_PKTTYPE_ETQF) == 0 && (ptype & (E1000_RXDADV_PKTTYPE_IPV4 | E1000_RXDADV_PKTTYPE_TCP)) == (E1000_RXDADV_PKTTYPE_IPV4 | E1000_RXDADV_PKTTYPE_TCP) && (m->m_pkthdr.csum_flags & (CSUM_DATA_VALID | CSUM_PSEUDO_HDR)) == (CSUM_DATA_VALID | CSUM_PSEUDO_HDR)) { /* * Send to the stack if: ** - LRO not enabled, or ** - no LRO resources, or ** - lro enqueue fails */ if (rxr->lro.lro_cnt != 0) if (tcp_lro_rx(&rxr->lro, m, 0) == 0) return; } IGB_RX_UNLOCK(rxr); (*ifp->if_input)(ifp, m); IGB_RX_LOCK(rxr); } /********************************************************************* * * This routine executes in interrupt context. It replenishes * the mbufs in the descriptor and sends data which has been * dma'ed into host memory to upper layer. * * We loop at most count times if count is > 0, or until done if * count < 0. * * Return TRUE if more to clean, FALSE otherwise *********************************************************************/ static bool igb_rxeof(struct igb_queue *que, int count, int *done) { struct adapter *adapter = que->adapter; struct rx_ring *rxr = que->rxr; struct ifnet *ifp = adapter->ifp; struct lro_ctrl *lro = &rxr->lro; int i, processed = 0, rxdone = 0; u32 ptype, staterr = 0; union e1000_adv_rx_desc *cur; IGB_RX_LOCK(rxr); /* Sync the ring. */ bus_dmamap_sync(rxr->rxdma.dma_tag, rxr->rxdma.dma_map, BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE); #ifdef DEV_NETMAP if (netmap_rx_irq(ifp, rxr->me, &processed)) { IGB_RX_UNLOCK(rxr); return (FALSE); } #endif /* DEV_NETMAP */ /* Main clean loop */ for (i = rxr->next_to_check; count != 0;) { struct mbuf *sendmp, *mh, *mp; struct igb_rx_buf *rxbuf; u16 hlen, plen, hdr, vtag, pkt_info; bool eop = FALSE; cur = &rxr->rx_base[i]; staterr = le32toh(cur->wb.upper.status_error); if ((staterr & E1000_RXD_STAT_DD) == 0) break; if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0) break; count--; sendmp = mh = mp = NULL; cur->wb.upper.status_error = 0; rxbuf = &rxr->rx_buffers[i]; plen = le16toh(cur->wb.upper.length); ptype = le32toh(cur->wb.lower.lo_dword.data) & IGB_PKTTYPE_MASK; if (((adapter->hw.mac.type == e1000_i350) || (adapter->hw.mac.type == e1000_i354)) && (staterr & E1000_RXDEXT_STATERR_LB)) vtag = be16toh(cur->wb.upper.vlan); else vtag = le16toh(cur->wb.upper.vlan); hdr = le16toh(cur->wb.lower.lo_dword.hs_rss.hdr_info); pkt_info = le16toh(cur->wb.lower.lo_dword.hs_rss.pkt_info); eop = ((staterr & E1000_RXD_STAT_EOP) == E1000_RXD_STAT_EOP); /* * Free the frame (all segments) if we're at EOP and * it's an error. * * The datasheet states that EOP + status is only valid for * the final segment in a multi-segment frame. */ if (eop && ((staterr & E1000_RXDEXT_ERR_FRAME_ERR_MASK) != 0)) { adapter->dropped_pkts++; ++rxr->rx_discarded; igb_rx_discard(rxr, i); goto next_desc; } /* ** The way the hardware is configured to ** split, it will ONLY use the header buffer ** when header split is enabled, otherwise we ** get normal behavior, ie, both header and ** payload are DMA'd into the payload buffer. ** ** The fmp test is to catch the case where a ** packet spans multiple descriptors, in that ** case only the first header is valid. */ if (rxr->hdr_split && rxr->fmp == NULL) { bus_dmamap_unload(rxr->htag, rxbuf->hmap); hlen = (hdr & E1000_RXDADV_HDRBUFLEN_MASK) >> E1000_RXDADV_HDRBUFLEN_SHIFT; if (hlen > IGB_HDR_BUF) hlen = IGB_HDR_BUF; mh = rxr->rx_buffers[i].m_head; mh->m_len = hlen; /* clear buf pointer for refresh */ rxbuf->m_head = NULL; /* ** Get the payload length, this ** could be zero if its a small ** packet. */ if (plen > 0) { mp = rxr->rx_buffers[i].m_pack; mp->m_len = plen; mh->m_next = mp; /* clear buf pointer */ rxbuf->m_pack = NULL; rxr->rx_split_packets++; } } else { /* ** Either no header split, or a ** secondary piece of a fragmented ** split packet. */ mh = rxr->rx_buffers[i].m_pack; mh->m_len = plen; /* clear buf info for refresh */ rxbuf->m_pack = NULL; } bus_dmamap_unload(rxr->ptag, rxbuf->pmap); ++processed; /* So we know when to refresh */ /* Initial frame - setup */ if (rxr->fmp == NULL) { mh->m_pkthdr.len = mh->m_len; /* Save the head of the chain */ rxr->fmp = mh; rxr->lmp = mh; if (mp != NULL) { /* Add payload if split */ mh->m_pkthdr.len += mp->m_len; rxr->lmp = mh->m_next; } } else { /* Chain mbuf's together */ rxr->lmp->m_next = mh; rxr->lmp = rxr->lmp->m_next; rxr->fmp->m_pkthdr.len += mh->m_len; } if (eop) { rxr->fmp->m_pkthdr.rcvif = ifp; rxr->rx_packets++; /* capture data for AIM */ rxr->packets++; rxr->bytes += rxr->fmp->m_pkthdr.len; rxr->rx_bytes += rxr->fmp->m_pkthdr.len; if ((ifp->if_capenable & IFCAP_RXCSUM) != 0) igb_rx_checksum(staterr, rxr->fmp, ptype); if ((ifp->if_capenable & IFCAP_VLAN_HWTAGGING) != 0 && (staterr & E1000_RXD_STAT_VP) != 0) { rxr->fmp->m_pkthdr.ether_vtag = vtag; rxr->fmp->m_flags |= M_VLANTAG; } /* * In case of multiqueue, we have RXCSUM.PCSD bit set * and never cleared. This means we have RSS hash * available to be used. */ if (adapter->num_queues > 1) { rxr->fmp->m_pkthdr.flowid = le32toh(cur->wb.lower.hi_dword.rss); switch (pkt_info & E1000_RXDADV_RSSTYPE_MASK) { case E1000_RXDADV_RSSTYPE_IPV4_TCP: M_HASHTYPE_SET(rxr->fmp, M_HASHTYPE_RSS_TCP_IPV4); break; case E1000_RXDADV_RSSTYPE_IPV4: M_HASHTYPE_SET(rxr->fmp, M_HASHTYPE_RSS_IPV4); break; case E1000_RXDADV_RSSTYPE_IPV6_TCP: M_HASHTYPE_SET(rxr->fmp, M_HASHTYPE_RSS_TCP_IPV6); break; case E1000_RXDADV_RSSTYPE_IPV6_EX: M_HASHTYPE_SET(rxr->fmp, M_HASHTYPE_RSS_IPV6_EX); break; case E1000_RXDADV_RSSTYPE_IPV6: M_HASHTYPE_SET(rxr->fmp, M_HASHTYPE_RSS_IPV6); break; case E1000_RXDADV_RSSTYPE_IPV6_TCP_EX: M_HASHTYPE_SET(rxr->fmp, M_HASHTYPE_RSS_TCP_IPV6_EX); break; default: /* XXX fallthrough */ M_HASHTYPE_SET(rxr->fmp, M_HASHTYPE_OPAQUE_HASH); } } else { #ifndef IGB_LEGACY_TX rxr->fmp->m_pkthdr.flowid = que->msix; M_HASHTYPE_SET(rxr->fmp, M_HASHTYPE_OPAQUE); #endif } sendmp = rxr->fmp; /* Make sure to set M_PKTHDR. */ sendmp->m_flags |= M_PKTHDR; rxr->fmp = NULL; rxr->lmp = NULL; } next_desc: bus_dmamap_sync(rxr->rxdma.dma_tag, rxr->rxdma.dma_map, BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE); /* Advance our pointers to the next descriptor. */ if (++i == adapter->num_rx_desc) i = 0; /* ** Send to the stack or LRO */ if (sendmp != NULL) { rxr->next_to_check = i; igb_rx_input(rxr, ifp, sendmp, ptype); i = rxr->next_to_check; rxdone++; } /* Every 8 descriptors we go to refresh mbufs */ if (processed == 8) { igb_refresh_mbufs(rxr, i); processed = 0; } } /* Catch any remainders */ if (igb_rx_unrefreshed(rxr)) igb_refresh_mbufs(rxr, i); rxr->next_to_check = i; /* * Flush any outstanding LRO work */ tcp_lro_flush_all(lro); if (done != NULL) *done += rxdone; IGB_RX_UNLOCK(rxr); return ((staterr & E1000_RXD_STAT_DD) ? TRUE : FALSE); } /********************************************************************* * * Verify that the hardware indicated that the checksum is valid. * Inform the stack about the status of checksum so that stack * doesn't spend time verifying the checksum. * *********************************************************************/ static void igb_rx_checksum(u32 staterr, struct mbuf *mp, u32 ptype) { u16 status = (u16)staterr; u8 errors = (u8) (staterr >> 24); int sctp; /* Ignore Checksum bit is set */ if (status & E1000_RXD_STAT_IXSM) { mp->m_pkthdr.csum_flags = 0; return; } if ((ptype & E1000_RXDADV_PKTTYPE_ETQF) == 0 && (ptype & E1000_RXDADV_PKTTYPE_SCTP) != 0) sctp = 1; else sctp = 0; if (status & E1000_RXD_STAT_IPCS) { /* Did it pass? */ if (!(errors & E1000_RXD_ERR_IPE)) { /* IP Checksum Good */ mp->m_pkthdr.csum_flags = CSUM_IP_CHECKED; mp->m_pkthdr.csum_flags |= CSUM_IP_VALID; } else mp->m_pkthdr.csum_flags = 0; } if (status & (E1000_RXD_STAT_TCPCS | E1000_RXD_STAT_UDPCS)) { u64 type = (CSUM_DATA_VALID | CSUM_PSEUDO_HDR); #if __FreeBSD_version >= 800000 if (sctp) /* reassign */ type = CSUM_SCTP_VALID; #endif /* Did it pass? */ if (!(errors & E1000_RXD_ERR_TCPE)) { mp->m_pkthdr.csum_flags |= type; if (sctp == 0) mp->m_pkthdr.csum_data = htons(0xffff); } } return; } /* * This routine is run via an vlan * config EVENT */ static void igb_register_vlan(void *arg, struct ifnet *ifp, u16 vtag) { struct adapter *adapter = ifp->if_softc; u32 index, bit; if (ifp->if_softc != arg) /* Not our event */ return; if ((vtag == 0) || (vtag > 4095)) /* Invalid */ return; IGB_CORE_LOCK(adapter); index = (vtag >> 5) & 0x7F; bit = vtag & 0x1F; adapter->shadow_vfta[index] |= (1 << bit); ++adapter->num_vlans; /* Change hw filter setting */ if (ifp->if_capenable & IFCAP_VLAN_HWFILTER) igb_setup_vlan_hw_support(adapter); IGB_CORE_UNLOCK(adapter); } /* * This routine is run via an vlan * unconfig EVENT */ static void igb_unregister_vlan(void *arg, struct ifnet *ifp, u16 vtag) { struct adapter *adapter = ifp->if_softc; u32 index, bit; if (ifp->if_softc != arg) return; if ((vtag == 0) || (vtag > 4095)) /* Invalid */ return; IGB_CORE_LOCK(adapter); index = (vtag >> 5) & 0x7F; bit = vtag & 0x1F; adapter->shadow_vfta[index] &= ~(1 << bit); --adapter->num_vlans; /* Change hw filter setting */ if (ifp->if_capenable & IFCAP_VLAN_HWFILTER) igb_setup_vlan_hw_support(adapter); IGB_CORE_UNLOCK(adapter); } static void igb_setup_vlan_hw_support(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; struct ifnet *ifp = adapter->ifp; u32 reg; if (adapter->vf_ifp) { e1000_rlpml_set_vf(hw, adapter->max_frame_size + VLAN_TAG_SIZE); return; } reg = E1000_READ_REG(hw, E1000_CTRL); reg |= E1000_CTRL_VME; E1000_WRITE_REG(hw, E1000_CTRL, reg); /* Enable the Filter Table */ if (ifp->if_capenable & IFCAP_VLAN_HWFILTER) { reg = E1000_READ_REG(hw, E1000_RCTL); reg &= ~E1000_RCTL_CFIEN; reg |= E1000_RCTL_VFE; E1000_WRITE_REG(hw, E1000_RCTL, reg); } /* Update the frame size */ E1000_WRITE_REG(&adapter->hw, E1000_RLPML, adapter->max_frame_size + VLAN_TAG_SIZE); /* Don't bother with table if no vlans */ if ((adapter->num_vlans == 0) || ((ifp->if_capenable & IFCAP_VLAN_HWFILTER) == 0)) return; /* ** A soft reset zero's out the VFTA, so ** we need to repopulate it now. */ for (int i = 0; i < IGB_VFTA_SIZE; i++) if (adapter->shadow_vfta[i] != 0) { if (adapter->vf_ifp) e1000_vfta_set_vf(hw, adapter->shadow_vfta[i], TRUE); else e1000_write_vfta(hw, i, adapter->shadow_vfta[i]); } } static void igb_enable_intr(struct adapter *adapter) { /* With RSS set up what to auto clear */ if (adapter->msix_mem) { u32 mask = (adapter->que_mask | adapter->link_mask); E1000_WRITE_REG(&adapter->hw, E1000_EIAC, mask); E1000_WRITE_REG(&adapter->hw, E1000_EIAM, mask); E1000_WRITE_REG(&adapter->hw, E1000_EIMS, mask); E1000_WRITE_REG(&adapter->hw, E1000_IMS, E1000_IMS_LSC); } else { E1000_WRITE_REG(&adapter->hw, E1000_IMS, IMS_ENABLE_MASK); } E1000_WRITE_FLUSH(&adapter->hw); return; } static void igb_disable_intr(struct adapter *adapter) { if (adapter->msix_mem) { E1000_WRITE_REG(&adapter->hw, E1000_EIMC, ~0); E1000_WRITE_REG(&adapter->hw, E1000_EIAC, 0); } E1000_WRITE_REG(&adapter->hw, E1000_IMC, ~0); E1000_WRITE_FLUSH(&adapter->hw); return; } /* * Bit of a misnomer, what this really means is * to enable OS management of the system... aka * to disable special hardware management features */ static void igb_init_manageability(struct adapter *adapter) { if (adapter->has_manage) { int manc2h = E1000_READ_REG(&adapter->hw, E1000_MANC2H); int manc = E1000_READ_REG(&adapter->hw, E1000_MANC); /* disable hardware interception of ARP */ manc &= ~(E1000_MANC_ARP_EN); /* enable receiving management packets to the host */ manc |= E1000_MANC_EN_MNG2HOST; manc2h |= 1 << 5; /* Mng Port 623 */ manc2h |= 1 << 6; /* Mng Port 664 */ E1000_WRITE_REG(&adapter->hw, E1000_MANC2H, manc2h); E1000_WRITE_REG(&adapter->hw, E1000_MANC, manc); } } /* * Give control back to hardware management * controller if there is one. */ static void igb_release_manageability(struct adapter *adapter) { if (adapter->has_manage) { int manc = E1000_READ_REG(&adapter->hw, E1000_MANC); /* re-enable hardware interception of ARP */ manc |= E1000_MANC_ARP_EN; manc &= ~E1000_MANC_EN_MNG2HOST; E1000_WRITE_REG(&adapter->hw, E1000_MANC, manc); } } /* * igb_get_hw_control sets CTRL_EXT:DRV_LOAD bit. * For ASF and Pass Through versions of f/w this means that * the driver is loaded. * */ static void igb_get_hw_control(struct adapter *adapter) { u32 ctrl_ext; if (adapter->vf_ifp) return; /* Let firmware know the driver has taken over */ ctrl_ext = E1000_READ_REG(&adapter->hw, E1000_CTRL_EXT); E1000_WRITE_REG(&adapter->hw, E1000_CTRL_EXT, ctrl_ext | E1000_CTRL_EXT_DRV_LOAD); } /* * igb_release_hw_control resets CTRL_EXT:DRV_LOAD bit. * For ASF and Pass Through versions of f/w this means that the * driver is no longer loaded. * */ static void igb_release_hw_control(struct adapter *adapter) { u32 ctrl_ext; if (adapter->vf_ifp) return; /* Let firmware taken over control of h/w */ ctrl_ext = E1000_READ_REG(&adapter->hw, E1000_CTRL_EXT); E1000_WRITE_REG(&adapter->hw, E1000_CTRL_EXT, ctrl_ext & ~E1000_CTRL_EXT_DRV_LOAD); } static int igb_is_valid_ether_addr(uint8_t *addr) { char zero_addr[6] = { 0, 0, 0, 0, 0, 0 }; if ((addr[0] & 1) || (!bcmp(addr, zero_addr, ETHER_ADDR_LEN))) { return (FALSE); } return (TRUE); } /* * Enable PCI Wake On Lan capability */ static void igb_enable_wakeup(device_t dev) { u16 cap, status; u8 id; /* First find the capabilities pointer*/ cap = pci_read_config(dev, PCIR_CAP_PTR, 2); /* Read the PM Capabilities */ id = pci_read_config(dev, cap, 1); if (id != PCIY_PMG) /* Something wrong */ return; /* OK, we have the power capabilities, so now get the status register */ cap += PCIR_POWER_STATUS; status = pci_read_config(dev, cap, 2); status |= PCIM_PSTAT_PME | PCIM_PSTAT_PMEENABLE; pci_write_config(dev, cap, status, 2); return; } static void igb_led_func(void *arg, int onoff) { struct adapter *adapter = arg; IGB_CORE_LOCK(adapter); if (onoff) { e1000_setup_led(&adapter->hw); e1000_led_on(&adapter->hw); } else { e1000_led_off(&adapter->hw); e1000_cleanup_led(&adapter->hw); } IGB_CORE_UNLOCK(adapter); } static uint64_t igb_get_vf_counter(if_t ifp, ift_counter cnt) { struct adapter *adapter; struct e1000_vf_stats *stats; #ifndef IGB_LEGACY_TX struct tx_ring *txr; uint64_t rv; #endif adapter = if_getsoftc(ifp); stats = (struct e1000_vf_stats *)adapter->stats; switch (cnt) { case IFCOUNTER_IPACKETS: return (stats->gprc); case IFCOUNTER_OPACKETS: return (stats->gptc); case IFCOUNTER_IBYTES: return (stats->gorc); case IFCOUNTER_OBYTES: return (stats->gotc); case IFCOUNTER_IMCASTS: return (stats->mprc); case IFCOUNTER_IERRORS: return (adapter->dropped_pkts); case IFCOUNTER_OERRORS: return (adapter->watchdog_events); #ifndef IGB_LEGACY_TX case IFCOUNTER_OQDROPS: rv = 0; txr = adapter->tx_rings; for (int i = 0; i < adapter->num_queues; i++, txr++) rv += txr->br->br_drops; return (rv); #endif default: return (if_get_counter_default(ifp, cnt)); } } static uint64_t igb_get_counter(if_t ifp, ift_counter cnt) { struct adapter *adapter; struct e1000_hw_stats *stats; #ifndef IGB_LEGACY_TX struct tx_ring *txr; uint64_t rv; #endif adapter = if_getsoftc(ifp); if (adapter->vf_ifp) return (igb_get_vf_counter(ifp, cnt)); stats = (struct e1000_hw_stats *)adapter->stats; switch (cnt) { case IFCOUNTER_IPACKETS: return (stats->gprc); case IFCOUNTER_OPACKETS: return (stats->gptc); case IFCOUNTER_IBYTES: return (stats->gorc); case IFCOUNTER_OBYTES: return (stats->gotc); case IFCOUNTER_IMCASTS: return (stats->mprc); case IFCOUNTER_OMCASTS: return (stats->mptc); case IFCOUNTER_IERRORS: return (adapter->dropped_pkts + stats->rxerrc + stats->crcerrs + stats->algnerrc + stats->ruc + stats->roc + stats->cexterr); case IFCOUNTER_OERRORS: return (stats->ecol + stats->latecol + adapter->watchdog_events); case IFCOUNTER_COLLISIONS: return (stats->colc); case IFCOUNTER_IQDROPS: return (stats->mpc); #ifndef IGB_LEGACY_TX case IFCOUNTER_OQDROPS: rv = 0; txr = adapter->tx_rings; for (int i = 0; i < adapter->num_queues; i++, txr++) rv += txr->br->br_drops; return (rv); #endif default: return (if_get_counter_default(ifp, cnt)); } } /********************************************************************** * * Update the board statistics counters. * **********************************************************************/ static void igb_update_stats_counters(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; struct e1000_hw_stats *stats; /* ** The virtual function adapter has only a ** small controlled set of stats, do only ** those and return. */ if (adapter->vf_ifp) { igb_update_vf_stats_counters(adapter); return; } stats = (struct e1000_hw_stats *)adapter->stats; if (adapter->hw.phy.media_type == e1000_media_type_copper || (E1000_READ_REG(hw, E1000_STATUS) & E1000_STATUS_LU)) { stats->symerrs += E1000_READ_REG(hw,E1000_SYMERRS); stats->sec += E1000_READ_REG(hw, E1000_SEC); } stats->crcerrs += E1000_READ_REG(hw, E1000_CRCERRS); stats->mpc += E1000_READ_REG(hw, E1000_MPC); stats->scc += E1000_READ_REG(hw, E1000_SCC); stats->ecol += E1000_READ_REG(hw, E1000_ECOL); stats->mcc += E1000_READ_REG(hw, E1000_MCC); stats->latecol += E1000_READ_REG(hw, E1000_LATECOL); stats->colc += E1000_READ_REG(hw, E1000_COLC); stats->dc += E1000_READ_REG(hw, E1000_DC); stats->rlec += E1000_READ_REG(hw, E1000_RLEC); stats->xonrxc += E1000_READ_REG(hw, E1000_XONRXC); stats->xontxc += E1000_READ_REG(hw, E1000_XONTXC); /* ** For watchdog management we need to know if we have been ** paused during the last interval, so capture that here. */ adapter->pause_frames = E1000_READ_REG(&adapter->hw, E1000_XOFFRXC); stats->xoffrxc += adapter->pause_frames; stats->xofftxc += E1000_READ_REG(hw, E1000_XOFFTXC); stats->fcruc += E1000_READ_REG(hw, E1000_FCRUC); stats->prc64 += E1000_READ_REG(hw, E1000_PRC64); stats->prc127 += E1000_READ_REG(hw, E1000_PRC127); stats->prc255 += E1000_READ_REG(hw, E1000_PRC255); stats->prc511 += E1000_READ_REG(hw, E1000_PRC511); stats->prc1023 += E1000_READ_REG(hw, E1000_PRC1023); stats->prc1522 += E1000_READ_REG(hw, E1000_PRC1522); stats->gprc += E1000_READ_REG(hw, E1000_GPRC); stats->bprc += E1000_READ_REG(hw, E1000_BPRC); stats->mprc += E1000_READ_REG(hw, E1000_MPRC); stats->gptc += E1000_READ_REG(hw, E1000_GPTC); /* For the 64-bit byte counters the low dword must be read first. */ /* Both registers clear on the read of the high dword */ stats->gorc += E1000_READ_REG(hw, E1000_GORCL) + ((u64)E1000_READ_REG(hw, E1000_GORCH) << 32); stats->gotc += E1000_READ_REG(hw, E1000_GOTCL) + ((u64)E1000_READ_REG(hw, E1000_GOTCH) << 32); stats->rnbc += E1000_READ_REG(hw, E1000_RNBC); stats->ruc += E1000_READ_REG(hw, E1000_RUC); stats->rfc += E1000_READ_REG(hw, E1000_RFC); stats->roc += E1000_READ_REG(hw, E1000_ROC); stats->rjc += E1000_READ_REG(hw, E1000_RJC); stats->mgprc += E1000_READ_REG(hw, E1000_MGTPRC); stats->mgpdc += E1000_READ_REG(hw, E1000_MGTPDC); stats->mgptc += E1000_READ_REG(hw, E1000_MGTPTC); stats->tor += E1000_READ_REG(hw, E1000_TORL) + ((u64)E1000_READ_REG(hw, E1000_TORH) << 32); stats->tot += E1000_READ_REG(hw, E1000_TOTL) + ((u64)E1000_READ_REG(hw, E1000_TOTH) << 32); stats->tpr += E1000_READ_REG(hw, E1000_TPR); stats->tpt += E1000_READ_REG(hw, E1000_TPT); stats->ptc64 += E1000_READ_REG(hw, E1000_PTC64); stats->ptc127 += E1000_READ_REG(hw, E1000_PTC127); stats->ptc255 += E1000_READ_REG(hw, E1000_PTC255); stats->ptc511 += E1000_READ_REG(hw, E1000_PTC511); stats->ptc1023 += E1000_READ_REG(hw, E1000_PTC1023); stats->ptc1522 += E1000_READ_REG(hw, E1000_PTC1522); stats->mptc += E1000_READ_REG(hw, E1000_MPTC); stats->bptc += E1000_READ_REG(hw, E1000_BPTC); /* Interrupt Counts */ stats->iac += E1000_READ_REG(hw, E1000_IAC); stats->icrxptc += E1000_READ_REG(hw, E1000_ICRXPTC); stats->icrxatc += E1000_READ_REG(hw, E1000_ICRXATC); stats->ictxptc += E1000_READ_REG(hw, E1000_ICTXPTC); stats->ictxatc += E1000_READ_REG(hw, E1000_ICTXATC); stats->ictxqec += E1000_READ_REG(hw, E1000_ICTXQEC); stats->ictxqmtc += E1000_READ_REG(hw, E1000_ICTXQMTC); stats->icrxdmtc += E1000_READ_REG(hw, E1000_ICRXDMTC); stats->icrxoc += E1000_READ_REG(hw, E1000_ICRXOC); /* Host to Card Statistics */ stats->cbtmpc += E1000_READ_REG(hw, E1000_CBTMPC); stats->htdpmc += E1000_READ_REG(hw, E1000_HTDPMC); stats->cbrdpc += E1000_READ_REG(hw, E1000_CBRDPC); stats->cbrmpc += E1000_READ_REG(hw, E1000_CBRMPC); stats->rpthc += E1000_READ_REG(hw, E1000_RPTHC); stats->hgptc += E1000_READ_REG(hw, E1000_HGPTC); stats->htcbdpc += E1000_READ_REG(hw, E1000_HTCBDPC); stats->hgorc += (E1000_READ_REG(hw, E1000_HGORCL) + ((u64)E1000_READ_REG(hw, E1000_HGORCH) << 32)); stats->hgotc += (E1000_READ_REG(hw, E1000_HGOTCL) + ((u64)E1000_READ_REG(hw, E1000_HGOTCH) << 32)); stats->lenerrs += E1000_READ_REG(hw, E1000_LENERRS); stats->scvpc += E1000_READ_REG(hw, E1000_SCVPC); stats->hrmpc += E1000_READ_REG(hw, E1000_HRMPC); stats->algnerrc += E1000_READ_REG(hw, E1000_ALGNERRC); stats->rxerrc += E1000_READ_REG(hw, E1000_RXERRC); stats->tncrs += E1000_READ_REG(hw, E1000_TNCRS); stats->cexterr += E1000_READ_REG(hw, E1000_CEXTERR); stats->tsctc += E1000_READ_REG(hw, E1000_TSCTC); stats->tsctfc += E1000_READ_REG(hw, E1000_TSCTFC); /* Driver specific counters */ adapter->device_control = E1000_READ_REG(hw, E1000_CTRL); adapter->rx_control = E1000_READ_REG(hw, E1000_RCTL); adapter->int_mask = E1000_READ_REG(hw, E1000_IMS); adapter->eint_mask = E1000_READ_REG(hw, E1000_EIMS); adapter->packet_buf_alloc_tx = ((E1000_READ_REG(hw, E1000_PBA) & 0xffff0000) >> 16); adapter->packet_buf_alloc_rx = (E1000_READ_REG(hw, E1000_PBA) & 0xffff); } /********************************************************************** * * Initialize the VF board statistics counters. * **********************************************************************/ static void igb_vf_init_stats(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; struct e1000_vf_stats *stats; stats = (struct e1000_vf_stats *)adapter->stats; if (stats == NULL) return; stats->last_gprc = E1000_READ_REG(hw, E1000_VFGPRC); stats->last_gorc = E1000_READ_REG(hw, E1000_VFGORC); stats->last_gptc = E1000_READ_REG(hw, E1000_VFGPTC); stats->last_gotc = E1000_READ_REG(hw, E1000_VFGOTC); stats->last_mprc = E1000_READ_REG(hw, E1000_VFMPRC); } /********************************************************************** * * Update the VF board statistics counters. * **********************************************************************/ static void igb_update_vf_stats_counters(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; struct e1000_vf_stats *stats; if (adapter->link_speed == 0) return; stats = (struct e1000_vf_stats *)adapter->stats; UPDATE_VF_REG(E1000_VFGPRC, stats->last_gprc, stats->gprc); UPDATE_VF_REG(E1000_VFGORC, stats->last_gorc, stats->gorc); UPDATE_VF_REG(E1000_VFGPTC, stats->last_gptc, stats->gptc); UPDATE_VF_REG(E1000_VFGOTC, stats->last_gotc, stats->gotc); UPDATE_VF_REG(E1000_VFMPRC, stats->last_mprc, stats->mprc); } /* Export a single 32-bit register via a read-only sysctl. */ static int igb_sysctl_reg_handler(SYSCTL_HANDLER_ARGS) { struct adapter *adapter; u_int val; adapter = oidp->oid_arg1; val = E1000_READ_REG(&adapter->hw, oidp->oid_arg2); return (sysctl_handle_int(oidp, &val, 0, req)); } /* ** Tuneable interrupt rate handler */ static int igb_sysctl_interrupt_rate_handler(SYSCTL_HANDLER_ARGS) { struct igb_queue *que = ((struct igb_queue *)oidp->oid_arg1); int error; u32 reg, usec, rate; reg = E1000_READ_REG(&que->adapter->hw, E1000_EITR(que->msix)); usec = ((reg & 0x7FFC) >> 2); if (usec > 0) rate = 1000000 / usec; else rate = 0; error = sysctl_handle_int(oidp, &rate, 0, req); if (error || !req->newptr) return error; return 0; } /* * Add sysctl variables, one per statistic, to the system. */ static void igb_add_hw_stats(struct adapter *adapter) { device_t dev = adapter->dev; struct tx_ring *txr = adapter->tx_rings; struct rx_ring *rxr = adapter->rx_rings; struct sysctl_ctx_list *ctx = device_get_sysctl_ctx(dev); struct sysctl_oid *tree = device_get_sysctl_tree(dev); struct sysctl_oid_list *child = SYSCTL_CHILDREN(tree); struct e1000_hw_stats *stats = adapter->stats; struct sysctl_oid *stat_node, *queue_node, *int_node, *host_node; struct sysctl_oid_list *stat_list, *queue_list, *int_list, *host_list; #define QUEUE_NAME_LEN 32 char namebuf[QUEUE_NAME_LEN]; /* Driver Statistics */ SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "dropped", CTLFLAG_RD, &adapter->dropped_pkts, "Driver dropped packets"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "link_irq", CTLFLAG_RD, &adapter->link_irq, "Link MSIX IRQ Handled"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "mbuf_defrag_fail", CTLFLAG_RD, &adapter->mbuf_defrag_failed, "Defragmenting mbuf chain failed"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "tx_dma_fail", CTLFLAG_RD, &adapter->no_tx_dma_setup, "Driver tx dma failure in xmit"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "rx_overruns", CTLFLAG_RD, &adapter->rx_overruns, "RX overruns"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "watchdog_timeouts", CTLFLAG_RD, &adapter->watchdog_events, "Watchdog timeouts"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "device_control", CTLFLAG_RD, &adapter->device_control, "Device Control Register"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "rx_control", CTLFLAG_RD, &adapter->rx_control, "Receiver Control Register"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "interrupt_mask", CTLFLAG_RD, &adapter->int_mask, "Interrupt Mask"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "extended_int_mask", CTLFLAG_RD, &adapter->eint_mask, "Extended Interrupt Mask"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "tx_buf_alloc", CTLFLAG_RD, &adapter->packet_buf_alloc_tx, "Transmit Buffer Packet Allocation"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "rx_buf_alloc", CTLFLAG_RD, &adapter->packet_buf_alloc_rx, "Receive Buffer Packet Allocation"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "fc_high_water", CTLFLAG_RD, &adapter->hw.fc.high_water, 0, "Flow Control High Watermark"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "fc_low_water", CTLFLAG_RD, &adapter->hw.fc.low_water, 0, "Flow Control Low Watermark"); for (int i = 0; i < adapter->num_queues; i++, rxr++, txr++) { struct lro_ctrl *lro = &rxr->lro; snprintf(namebuf, QUEUE_NAME_LEN, "queue%d", i); queue_node = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, namebuf, CTLFLAG_RD, NULL, "Queue Name"); queue_list = SYSCTL_CHILDREN(queue_node); SYSCTL_ADD_PROC(ctx, queue_list, OID_AUTO, "interrupt_rate", CTLTYPE_UINT | CTLFLAG_RD, &adapter->queues[i], sizeof(&adapter->queues[i]), igb_sysctl_interrupt_rate_handler, "IU", "Interrupt Rate"); SYSCTL_ADD_PROC(ctx, queue_list, OID_AUTO, "txd_head", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_TDH(txr->me), igb_sysctl_reg_handler, "IU", "Transmit Descriptor Head"); SYSCTL_ADD_PROC(ctx, queue_list, OID_AUTO, "txd_tail", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_TDT(txr->me), igb_sysctl_reg_handler, "IU", "Transmit Descriptor Tail"); SYSCTL_ADD_QUAD(ctx, queue_list, OID_AUTO, "no_desc_avail", CTLFLAG_RD, &txr->no_desc_avail, "Queue Descriptors Unavailable"); SYSCTL_ADD_UQUAD(ctx, queue_list, OID_AUTO, "tx_packets", CTLFLAG_RD, &txr->total_packets, "Queue Packets Transmitted"); SYSCTL_ADD_PROC(ctx, queue_list, OID_AUTO, "rxd_head", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_RDH(rxr->me), igb_sysctl_reg_handler, "IU", "Receive Descriptor Head"); SYSCTL_ADD_PROC(ctx, queue_list, OID_AUTO, "rxd_tail", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_RDT(rxr->me), igb_sysctl_reg_handler, "IU", "Receive Descriptor Tail"); SYSCTL_ADD_QUAD(ctx, queue_list, OID_AUTO, "rx_packets", CTLFLAG_RD, &rxr->rx_packets, "Queue Packets Received"); SYSCTL_ADD_QUAD(ctx, queue_list, OID_AUTO, "rx_bytes", CTLFLAG_RD, &rxr->rx_bytes, "Queue Bytes Received"); SYSCTL_ADD_U64(ctx, queue_list, OID_AUTO, "lro_queued", CTLFLAG_RD, &lro->lro_queued, 0, "LRO Queued"); SYSCTL_ADD_U64(ctx, queue_list, OID_AUTO, "lro_flushed", CTLFLAG_RD, &lro->lro_flushed, 0, "LRO Flushed"); } /* MAC stats get their own sub node */ stat_node = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "mac_stats", CTLFLAG_RD, NULL, "MAC Statistics"); stat_list = SYSCTL_CHILDREN(stat_node); /* ** VF adapter has a very limited set of stats ** since its not managing the metal, so to speak. */ if (adapter->vf_ifp) { SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "good_pkts_recvd", CTLFLAG_RD, &stats->gprc, "Good Packets Received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "good_pkts_txd", CTLFLAG_RD, &stats->gptc, "Good Packets Transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "good_octets_recvd", CTLFLAG_RD, &stats->gorc, "Good Octets Received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "good_octets_txd", CTLFLAG_RD, &stats->gotc, "Good Octets Transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "mcast_pkts_recvd", CTLFLAG_RD, &stats->mprc, "Multicast Packets Received"); return; } SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "excess_coll", CTLFLAG_RD, &stats->ecol, "Excessive collisions"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "single_coll", CTLFLAG_RD, &stats->scc, "Single collisions"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "multiple_coll", CTLFLAG_RD, &stats->mcc, "Multiple collisions"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "late_coll", CTLFLAG_RD, &stats->latecol, "Late collisions"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "collision_count", CTLFLAG_RD, &stats->colc, "Collision Count"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "symbol_errors", CTLFLAG_RD, &stats->symerrs, "Symbol Errors"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "sequence_errors", CTLFLAG_RD, &stats->sec, "Sequence Errors"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "defer_count", CTLFLAG_RD, &stats->dc, "Defer Count"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "missed_packets", CTLFLAG_RD, &stats->mpc, "Missed Packets"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "recv_length_errors", CTLFLAG_RD, &stats->rlec, "Receive Length Errors"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "recv_no_buff", CTLFLAG_RD, &stats->rnbc, "Receive No Buffers"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "recv_undersize", CTLFLAG_RD, &stats->ruc, "Receive Undersize"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "recv_fragmented", CTLFLAG_RD, &stats->rfc, "Fragmented Packets Received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "recv_oversize", CTLFLAG_RD, &stats->roc, "Oversized Packets Received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "recv_jabber", CTLFLAG_RD, &stats->rjc, "Recevied Jabber"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "recv_errs", CTLFLAG_RD, &stats->rxerrc, "Receive Errors"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "crc_errs", CTLFLAG_RD, &stats->crcerrs, "CRC errors"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "alignment_errs", CTLFLAG_RD, &stats->algnerrc, "Alignment Errors"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "tx_no_crs", CTLFLAG_RD, &stats->tncrs, "Transmit with No CRS"); /* On 82575 these are collision counts */ SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "coll_ext_errs", CTLFLAG_RD, &stats->cexterr, "Collision/Carrier extension errors"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "xon_recvd", CTLFLAG_RD, &stats->xonrxc, "XON Received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "xon_txd", CTLFLAG_RD, &stats->xontxc, "XON Transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "xoff_recvd", CTLFLAG_RD, &stats->xoffrxc, "XOFF Received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "xoff_txd", CTLFLAG_RD, &stats->xofftxc, "XOFF Transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "unsupported_fc_recvd", CTLFLAG_RD, &stats->fcruc, "Unsupported Flow Control Received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "mgmt_pkts_recvd", CTLFLAG_RD, &stats->mgprc, "Management Packets Received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "mgmt_pkts_drop", CTLFLAG_RD, &stats->mgpdc, "Management Packets Dropped"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "mgmt_pkts_txd", CTLFLAG_RD, &stats->mgptc, "Management Packets Transmitted"); /* Packet Reception Stats */ SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "total_pkts_recvd", CTLFLAG_RD, &stats->tpr, "Total Packets Received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "good_pkts_recvd", CTLFLAG_RD, &stats->gprc, "Good Packets Received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "bcast_pkts_recvd", CTLFLAG_RD, &stats->bprc, "Broadcast Packets Received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "mcast_pkts_recvd", CTLFLAG_RD, &stats->mprc, "Multicast Packets Received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "rx_frames_64", CTLFLAG_RD, &stats->prc64, "64 byte frames received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "rx_frames_65_127", CTLFLAG_RD, &stats->prc127, "65-127 byte frames received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "rx_frames_128_255", CTLFLAG_RD, &stats->prc255, "128-255 byte frames received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "rx_frames_256_511", CTLFLAG_RD, &stats->prc511, "256-511 byte frames received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "rx_frames_512_1023", CTLFLAG_RD, &stats->prc1023, "512-1023 byte frames received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "rx_frames_1024_1522", CTLFLAG_RD, &stats->prc1522, "1023-1522 byte frames received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "good_octets_recvd", CTLFLAG_RD, &stats->gorc, "Good Octets Received"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "total_octets_recvd", CTLFLAG_RD, &stats->tor, "Total Octets Received"); /* Packet Transmission Stats */ SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "good_octets_txd", CTLFLAG_RD, &stats->gotc, "Good Octets Transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "total_octets_txd", CTLFLAG_RD, &stats->tot, "Total Octets Transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "total_pkts_txd", CTLFLAG_RD, &stats->tpt, "Total Packets Transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "good_pkts_txd", CTLFLAG_RD, &stats->gptc, "Good Packets Transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "bcast_pkts_txd", CTLFLAG_RD, &stats->bptc, "Broadcast Packets Transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "mcast_pkts_txd", CTLFLAG_RD, &stats->mptc, "Multicast Packets Transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "tx_frames_64", CTLFLAG_RD, &stats->ptc64, "64 byte frames transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "tx_frames_65_127", CTLFLAG_RD, &stats->ptc127, "65-127 byte frames transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "tx_frames_128_255", CTLFLAG_RD, &stats->ptc255, "128-255 byte frames transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "tx_frames_256_511", CTLFLAG_RD, &stats->ptc511, "256-511 byte frames transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "tx_frames_512_1023", CTLFLAG_RD, &stats->ptc1023, "512-1023 byte frames transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "tx_frames_1024_1522", CTLFLAG_RD, &stats->ptc1522, "1024-1522 byte frames transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "tso_txd", CTLFLAG_RD, &stats->tsctc, "TSO Contexts Transmitted"); SYSCTL_ADD_QUAD(ctx, stat_list, OID_AUTO, "tso_ctx_fail", CTLFLAG_RD, &stats->tsctfc, "TSO Contexts Failed"); /* Interrupt Stats */ int_node = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "interrupts", CTLFLAG_RD, NULL, "Interrupt Statistics"); int_list = SYSCTL_CHILDREN(int_node); SYSCTL_ADD_QUAD(ctx, int_list, OID_AUTO, "asserts", CTLFLAG_RD, &stats->iac, "Interrupt Assertion Count"); SYSCTL_ADD_QUAD(ctx, int_list, OID_AUTO, "rx_pkt_timer", CTLFLAG_RD, &stats->icrxptc, "Interrupt Cause Rx Pkt Timer Expire Count"); SYSCTL_ADD_QUAD(ctx, int_list, OID_AUTO, "rx_abs_timer", CTLFLAG_RD, &stats->icrxatc, "Interrupt Cause Rx Abs Timer Expire Count"); SYSCTL_ADD_QUAD(ctx, int_list, OID_AUTO, "tx_pkt_timer", CTLFLAG_RD, &stats->ictxptc, "Interrupt Cause Tx Pkt Timer Expire Count"); SYSCTL_ADD_QUAD(ctx, int_list, OID_AUTO, "tx_abs_timer", CTLFLAG_RD, &stats->ictxatc, "Interrupt Cause Tx Abs Timer Expire Count"); SYSCTL_ADD_QUAD(ctx, int_list, OID_AUTO, "tx_queue_empty", CTLFLAG_RD, &stats->ictxqec, "Interrupt Cause Tx Queue Empty Count"); SYSCTL_ADD_QUAD(ctx, int_list, OID_AUTO, "tx_queue_min_thresh", CTLFLAG_RD, &stats->ictxqmtc, "Interrupt Cause Tx Queue Min Thresh Count"); SYSCTL_ADD_QUAD(ctx, int_list, OID_AUTO, "rx_desc_min_thresh", CTLFLAG_RD, &stats->icrxdmtc, "Interrupt Cause Rx Desc Min Thresh Count"); SYSCTL_ADD_QUAD(ctx, int_list, OID_AUTO, "rx_overrun", CTLFLAG_RD, &stats->icrxoc, "Interrupt Cause Receiver Overrun Count"); /* Host to Card Stats */ host_node = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "host", CTLFLAG_RD, NULL, "Host to Card Statistics"); host_list = SYSCTL_CHILDREN(host_node); SYSCTL_ADD_QUAD(ctx, host_list, OID_AUTO, "breaker_tx_pkt", CTLFLAG_RD, &stats->cbtmpc, "Circuit Breaker Tx Packet Count"); SYSCTL_ADD_QUAD(ctx, host_list, OID_AUTO, "host_tx_pkt_discard", CTLFLAG_RD, &stats->htdpmc, "Host Transmit Discarded Packets"); SYSCTL_ADD_QUAD(ctx, host_list, OID_AUTO, "rx_pkt", CTLFLAG_RD, &stats->rpthc, "Rx Packets To Host"); SYSCTL_ADD_QUAD(ctx, host_list, OID_AUTO, "breaker_rx_pkts", CTLFLAG_RD, &stats->cbrmpc, "Circuit Breaker Rx Packet Count"); SYSCTL_ADD_QUAD(ctx, host_list, OID_AUTO, "breaker_rx_pkt_drop", CTLFLAG_RD, &stats->cbrdpc, "Circuit Breaker Rx Dropped Count"); SYSCTL_ADD_QUAD(ctx, host_list, OID_AUTO, "tx_good_pkt", CTLFLAG_RD, &stats->hgptc, "Host Good Packets Tx Count"); SYSCTL_ADD_QUAD(ctx, host_list, OID_AUTO, "breaker_tx_pkt_drop", CTLFLAG_RD, &stats->htcbdpc, "Host Tx Circuit Breaker Dropped Count"); SYSCTL_ADD_QUAD(ctx, host_list, OID_AUTO, "rx_good_bytes", CTLFLAG_RD, &stats->hgorc, "Host Good Octets Received Count"); SYSCTL_ADD_QUAD(ctx, host_list, OID_AUTO, "tx_good_bytes", CTLFLAG_RD, &stats->hgotc, "Host Good Octets Transmit Count"); SYSCTL_ADD_QUAD(ctx, host_list, OID_AUTO, "length_errors", CTLFLAG_RD, &stats->lenerrs, "Length Errors"); SYSCTL_ADD_QUAD(ctx, host_list, OID_AUTO, "serdes_violation_pkt", CTLFLAG_RD, &stats->scvpc, "SerDes/SGMII Code Violation Pkt Count"); SYSCTL_ADD_QUAD(ctx, host_list, OID_AUTO, "header_redir_missed", CTLFLAG_RD, &stats->hrmpc, "Header Redirection Missed Packet Count"); } /********************************************************************** * * This routine provides a way to dump out the adapter eeprom, * often a useful debug/service tool. This only dumps the first * 32 words, stuff that matters is in that extent. * **********************************************************************/ static int igb_sysctl_nvm_info(SYSCTL_HANDLER_ARGS) { struct adapter *adapter; int error; int result; result = -1; error = sysctl_handle_int(oidp, &result, 0, req); if (error || !req->newptr) return (error); /* * This value will cause a hex dump of the * first 32 16-bit words of the EEPROM to * the screen. */ if (result == 1) { adapter = (struct adapter *)arg1; igb_print_nvm_info(adapter); } return (error); } static void igb_print_nvm_info(struct adapter *adapter) { u16 eeprom_data; int i, j, row = 0; /* Its a bit crude, but it gets the job done */ printf("\nInterface EEPROM Dump:\n"); printf("Offset\n0x0000 "); for (i = 0, j = 0; i < 32; i++, j++) { if (j == 8) { /* Make the offset block */ j = 0; ++row; printf("\n0x00%x0 ",row); } e1000_read_nvm(&adapter->hw, i, 1, &eeprom_data); printf("%04x ", eeprom_data); } printf("\n"); } static void igb_set_sysctl_value(struct adapter *adapter, const char *name, const char *description, int *limit, int value) { *limit = value; SYSCTL_ADD_INT(device_get_sysctl_ctx(adapter->dev), SYSCTL_CHILDREN(device_get_sysctl_tree(adapter->dev)), OID_AUTO, name, CTLFLAG_RW, limit, value, description); } /* ** Set flow control using sysctl: ** Flow control values: ** 0 - off ** 1 - rx pause ** 2 - tx pause ** 3 - full */ static int igb_set_flowcntl(SYSCTL_HANDLER_ARGS) { int error; static int input = 3; /* default is full */ struct adapter *adapter = (struct adapter *) arg1; error = sysctl_handle_int(oidp, &input, 0, req); if ((error) || (req->newptr == NULL)) return (error); switch (input) { case e1000_fc_rx_pause: case e1000_fc_tx_pause: case e1000_fc_full: case e1000_fc_none: adapter->hw.fc.requested_mode = input; adapter->fc = input; break; default: /* Do nothing */ return (error); } adapter->hw.fc.current_mode = adapter->hw.fc.requested_mode; e1000_force_mac_fc(&adapter->hw); /* XXX TODO: update DROP_EN on each RX queue if appropriate */ return (error); } /* ** Manage DMA Coalesce: ** Control values: ** 0/1 - off/on ** Legal timer values are: ** 250,500,1000-10000 in thousands */ static int igb_sysctl_dmac(SYSCTL_HANDLER_ARGS) { struct adapter *adapter = (struct adapter *) arg1; int error; error = sysctl_handle_int(oidp, &adapter->dmac, 0, req); if ((error) || (req->newptr == NULL)) return (error); switch (adapter->dmac) { case 0: /* Disabling */ break; case 1: /* Just enable and use default */ adapter->dmac = 1000; break; case 250: case 500: case 1000: case 2000: case 3000: case 4000: case 5000: case 6000: case 7000: case 8000: case 9000: case 10000: /* Legal values - allow */ break; default: /* Do nothing, illegal value */ adapter->dmac = 0; return (EINVAL); } /* Reinit the interface */ igb_init(adapter); return (error); } /* ** Manage Energy Efficient Ethernet: ** Control values: ** 0/1 - enabled/disabled */ static int igb_sysctl_eee(SYSCTL_HANDLER_ARGS) { struct adapter *adapter = (struct adapter *) arg1; int error, value; value = adapter->hw.dev_spec._82575.eee_disable; error = sysctl_handle_int(oidp, &value, 0, req); if (error || req->newptr == NULL) return (error); IGB_CORE_LOCK(adapter); adapter->hw.dev_spec._82575.eee_disable = (value != 0); igb_init_locked(adapter); IGB_CORE_UNLOCK(adapter); return (0); } Index: user/alc/PQ_LAUNDRY/sys/dev/e1000/if_lem.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/e1000/if_lem.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/e1000/if_lem.c (revision 303206) @@ -1,4875 +1,4875 @@ /****************************************************************************** Copyright (c) 2001-2015, Intel Corporation All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. Neither the name of the Intel Corporation nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ******************************************************************************/ /*$FreeBSD$*/ /* * Uncomment the following extensions for better performance in a VM, * especially if you have support in the hypervisor. * See http://info.iet.unipi.it/~luigi/netmap/ */ // #define BATCH_DISPATCH // #define NIC_SEND_COMBINING // #define NIC_PARAVIRT /* enable virtio-like synchronization */ #include "opt_inet.h" #include "opt_inet6.h" #ifdef HAVE_KERNEL_OPTION_HEADERS #include "opt_device_polling.h" #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "e1000_api.h" #include "if_lem.h" /********************************************************************* * Legacy Em Driver version: *********************************************************************/ char lem_driver_version[] = "1.1.0"; /********************************************************************* * PCI Device ID Table * * Used by probe to select devices to load on * Last field stores an index into e1000_strings * Last entry must be all 0s * * { Vendor ID, Device ID, SubVendor ID, SubDevice ID, String Index } *********************************************************************/ static em_vendor_info_t lem_vendor_info_array[] = { /* Intel(R) PRO/1000 Network Connection */ { 0x8086, E1000_DEV_ID_82540EM, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82540EM_LOM, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82540EP, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82540EP_LOM, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82540EP_LP, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82541EI, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82541ER, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82541ER_LOM, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82541EI_MOBILE, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82541GI, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82541GI_LF, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82541GI_MOBILE, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82542, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82543GC_FIBER, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82543GC_COPPER, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82544EI_COPPER, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82544EI_FIBER, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82544GC_COPPER, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82544GC_LOM, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82545EM_COPPER, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82545EM_FIBER, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82545GM_COPPER, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82545GM_FIBER, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82545GM_SERDES, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82546EB_COPPER, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82546EB_FIBER, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82546EB_QUAD_COPPER, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82546GB_COPPER, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82546GB_FIBER, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82546GB_SERDES, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82546GB_PCIE, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82546GB_QUAD_COPPER, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82546GB_QUAD_COPPER_KSP3, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82547EI, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82547EI_MOBILE, PCI_ANY_ID, PCI_ANY_ID, 0}, { 0x8086, E1000_DEV_ID_82547GI, PCI_ANY_ID, PCI_ANY_ID, 0}, /* required last entry */ { 0, 0, 0, 0, 0} }; /********************************************************************* * Table of branding strings for all supported NICs. *********************************************************************/ static char *lem_strings[] = { "Intel(R) PRO/1000 Legacy Network Connection" }; /********************************************************************* * Function prototypes *********************************************************************/ static int lem_probe(device_t); static int lem_attach(device_t); static int lem_detach(device_t); static int lem_shutdown(device_t); static int lem_suspend(device_t); static int lem_resume(device_t); static void lem_start(if_t); static void lem_start_locked(if_t ifp); static int lem_ioctl(if_t, u_long, caddr_t); static uint64_t lem_get_counter(if_t, ift_counter); static void lem_init(void *); static void lem_init_locked(struct adapter *); static void lem_stop(void *); static void lem_media_status(if_t, struct ifmediareq *); static int lem_media_change(if_t); static void lem_identify_hardware(struct adapter *); static int lem_allocate_pci_resources(struct adapter *); static int lem_allocate_irq(struct adapter *adapter); static void lem_free_pci_resources(struct adapter *); static void lem_local_timer(void *); static int lem_hardware_init(struct adapter *); static int lem_setup_interface(device_t, struct adapter *); static void lem_setup_transmit_structures(struct adapter *); static void lem_initialize_transmit_unit(struct adapter *); static int lem_setup_receive_structures(struct adapter *); static void lem_initialize_receive_unit(struct adapter *); static void lem_enable_intr(struct adapter *); static void lem_disable_intr(struct adapter *); static void lem_free_transmit_structures(struct adapter *); static void lem_free_receive_structures(struct adapter *); static void lem_update_stats_counters(struct adapter *); static void lem_add_hw_stats(struct adapter *adapter); static void lem_txeof(struct adapter *); static void lem_tx_purge(struct adapter *); static int lem_allocate_receive_structures(struct adapter *); static int lem_allocate_transmit_structures(struct adapter *); static bool lem_rxeof(struct adapter *, int, int *); #ifndef __NO_STRICT_ALIGNMENT static int lem_fixup_rx(struct adapter *); #endif static void lem_receive_checksum(struct adapter *, struct e1000_rx_desc *, struct mbuf *); static void lem_transmit_checksum_setup(struct adapter *, struct mbuf *, u32 *, u32 *); static void lem_set_promisc(struct adapter *); static void lem_disable_promisc(struct adapter *); static void lem_set_multi(struct adapter *); static void lem_update_link_status(struct adapter *); static int lem_get_buf(struct adapter *, int); static void lem_register_vlan(void *, if_t, u16); static void lem_unregister_vlan(void *, if_t, u16); static void lem_setup_vlan_hw_support(struct adapter *); static int lem_xmit(struct adapter *, struct mbuf **); static void lem_smartspeed(struct adapter *); static int lem_82547_fifo_workaround(struct adapter *, int); static void lem_82547_update_fifo_head(struct adapter *, int); static int lem_82547_tx_fifo_reset(struct adapter *); static void lem_82547_move_tail(void *); static int lem_dma_malloc(struct adapter *, bus_size_t, struct em_dma_alloc *, int); static void lem_dma_free(struct adapter *, struct em_dma_alloc *); static int lem_sysctl_nvm_info(SYSCTL_HANDLER_ARGS); static void lem_print_nvm_info(struct adapter *); static int lem_is_valid_ether_addr(u8 *); static u32 lem_fill_descriptors (bus_addr_t address, u32 length, PDESC_ARRAY desc_array); static int lem_sysctl_int_delay(SYSCTL_HANDLER_ARGS); static void lem_add_int_delay_sysctl(struct adapter *, const char *, const char *, struct em_int_delay_info *, int, int); static void lem_set_flow_cntrl(struct adapter *, const char *, const char *, int *, int); /* Management and WOL Support */ static void lem_init_manageability(struct adapter *); static void lem_release_manageability(struct adapter *); static void lem_get_hw_control(struct adapter *); static void lem_release_hw_control(struct adapter *); static void lem_get_wakeup(device_t); static void lem_enable_wakeup(device_t); static int lem_enable_phy_wakeup(struct adapter *); static void lem_led_func(void *, int); static void lem_intr(void *); static int lem_irq_fast(void *); static void lem_handle_rxtx(void *context, int pending); static void lem_handle_link(void *context, int pending); static void lem_add_rx_process_limit(struct adapter *, const char *, const char *, int *, int); #ifdef DEVICE_POLLING static poll_handler_t lem_poll; #endif /* POLLING */ /********************************************************************* * FreeBSD Device Interface Entry Points *********************************************************************/ static device_method_t lem_methods[] = { /* Device interface */ DEVMETHOD(device_probe, lem_probe), DEVMETHOD(device_attach, lem_attach), DEVMETHOD(device_detach, lem_detach), DEVMETHOD(device_shutdown, lem_shutdown), DEVMETHOD(device_suspend, lem_suspend), DEVMETHOD(device_resume, lem_resume), DEVMETHOD_END }; static driver_t lem_driver = { "em", lem_methods, sizeof(struct adapter), }; extern devclass_t em_devclass; DRIVER_MODULE(lem, pci, lem_driver, em_devclass, 0, 0); MODULE_DEPEND(lem, pci, 1, 1, 1); MODULE_DEPEND(lem, ether, 1, 1, 1); #ifdef DEV_NETMAP MODULE_DEPEND(lem, netmap, 1, 1, 1); #endif /* DEV_NETMAP */ /********************************************************************* * Tunable default values. *********************************************************************/ #define EM_TICKS_TO_USECS(ticks) ((1024 * (ticks) + 500) / 1000) #define EM_USECS_TO_TICKS(usecs) ((1000 * (usecs) + 512) / 1024) #define MAX_INTS_PER_SEC 8000 #define DEFAULT_ITR (1000000000/(MAX_INTS_PER_SEC * 256)) static int lem_tx_int_delay_dflt = EM_TICKS_TO_USECS(EM_TIDV); static int lem_rx_int_delay_dflt = EM_TICKS_TO_USECS(EM_RDTR); static int lem_tx_abs_int_delay_dflt = EM_TICKS_TO_USECS(EM_TADV); static int lem_rx_abs_int_delay_dflt = EM_TICKS_TO_USECS(EM_RADV); /* * increase lem_rxd and lem_txd to at least 2048 in netmap mode * for better performance. */ static int lem_rxd = EM_DEFAULT_RXD; static int lem_txd = EM_DEFAULT_TXD; static int lem_smart_pwr_down = FALSE; /* Controls whether promiscuous also shows bad packets */ static int lem_debug_sbp = FALSE; TUNABLE_INT("hw.em.tx_int_delay", &lem_tx_int_delay_dflt); TUNABLE_INT("hw.em.rx_int_delay", &lem_rx_int_delay_dflt); TUNABLE_INT("hw.em.tx_abs_int_delay", &lem_tx_abs_int_delay_dflt); TUNABLE_INT("hw.em.rx_abs_int_delay", &lem_rx_abs_int_delay_dflt); TUNABLE_INT("hw.em.rxd", &lem_rxd); TUNABLE_INT("hw.em.txd", &lem_txd); TUNABLE_INT("hw.em.smart_pwr_down", &lem_smart_pwr_down); TUNABLE_INT("hw.em.sbp", &lem_debug_sbp); /* Interrupt style - default to fast */ static int lem_use_legacy_irq = 0; TUNABLE_INT("hw.em.use_legacy_irq", &lem_use_legacy_irq); /* How many packets rxeof tries to clean at a time */ static int lem_rx_process_limit = 100; TUNABLE_INT("hw.em.rx_process_limit", &lem_rx_process_limit); /* Flow control setting - default to FULL */ static int lem_fc_setting = e1000_fc_full; TUNABLE_INT("hw.em.fc_setting", &lem_fc_setting); /* Global used in WOL setup with multiport cards */ static int global_quad_port_a = 0; #ifdef DEV_NETMAP /* see ixgbe.c for details */ #include #endif /* DEV_NETMAP */ /********************************************************************* * Device identification routine * * em_probe determines if the driver should be loaded on * adapter based on PCI vendor/device id of the adapter. * * return BUS_PROBE_DEFAULT on success, positive on failure *********************************************************************/ static int lem_probe(device_t dev) { char adapter_name[60]; u16 pci_vendor_id = 0; u16 pci_device_id = 0; u16 pci_subvendor_id = 0; u16 pci_subdevice_id = 0; em_vendor_info_t *ent; INIT_DEBUGOUT("em_probe: begin"); pci_vendor_id = pci_get_vendor(dev); if (pci_vendor_id != EM_VENDOR_ID) return (ENXIO); pci_device_id = pci_get_device(dev); pci_subvendor_id = pci_get_subvendor(dev); pci_subdevice_id = pci_get_subdevice(dev); ent = lem_vendor_info_array; while (ent->vendor_id != 0) { if ((pci_vendor_id == ent->vendor_id) && (pci_device_id == ent->device_id) && ((pci_subvendor_id == ent->subvendor_id) || (ent->subvendor_id == PCI_ANY_ID)) && ((pci_subdevice_id == ent->subdevice_id) || (ent->subdevice_id == PCI_ANY_ID))) { sprintf(adapter_name, "%s %s", lem_strings[ent->index], lem_driver_version); device_set_desc_copy(dev, adapter_name); return (BUS_PROBE_DEFAULT); } ent++; } return (ENXIO); } /********************************************************************* * Device initialization routine * * The attach entry point is called when the driver is being loaded. * This routine identifies the type of hardware, allocates all resources * and initializes the hardware. * * return 0 on success, positive on failure *********************************************************************/ static int lem_attach(device_t dev) { struct adapter *adapter; int tsize, rsize; int error = 0; INIT_DEBUGOUT("lem_attach: begin"); adapter = device_get_softc(dev); adapter->dev = adapter->osdep.dev = dev; EM_CORE_LOCK_INIT(adapter, device_get_nameunit(dev)); EM_TX_LOCK_INIT(adapter, device_get_nameunit(dev)); EM_RX_LOCK_INIT(adapter, device_get_nameunit(dev)); /* SYSCTL stuff */ SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "nvm", CTLTYPE_INT|CTLFLAG_RW, adapter, 0, lem_sysctl_nvm_info, "I", "NVM Information"); callout_init_mtx(&adapter->timer, &adapter->core_mtx, 0); callout_init_mtx(&adapter->tx_fifo_timer, &adapter->tx_mtx, 0); /* Determine hardware and mac info */ lem_identify_hardware(adapter); /* Setup PCI resources */ if (lem_allocate_pci_resources(adapter)) { device_printf(dev, "Allocation of PCI resources failed\n"); error = ENXIO; goto err_pci; } /* Do Shared Code initialization */ if (e1000_setup_init_funcs(&adapter->hw, TRUE)) { device_printf(dev, "Setup of Shared code failed\n"); error = ENXIO; goto err_pci; } e1000_get_bus_info(&adapter->hw); /* Set up some sysctls for the tunable interrupt delays */ lem_add_int_delay_sysctl(adapter, "rx_int_delay", "receive interrupt delay in usecs", &adapter->rx_int_delay, E1000_REGISTER(&adapter->hw, E1000_RDTR), lem_rx_int_delay_dflt); lem_add_int_delay_sysctl(adapter, "tx_int_delay", "transmit interrupt delay in usecs", &adapter->tx_int_delay, E1000_REGISTER(&adapter->hw, E1000_TIDV), lem_tx_int_delay_dflt); if (adapter->hw.mac.type >= e1000_82540) { lem_add_int_delay_sysctl(adapter, "rx_abs_int_delay", "receive interrupt delay limit in usecs", &adapter->rx_abs_int_delay, E1000_REGISTER(&adapter->hw, E1000_RADV), lem_rx_abs_int_delay_dflt); lem_add_int_delay_sysctl(adapter, "tx_abs_int_delay", "transmit interrupt delay limit in usecs", &adapter->tx_abs_int_delay, E1000_REGISTER(&adapter->hw, E1000_TADV), lem_tx_abs_int_delay_dflt); lem_add_int_delay_sysctl(adapter, "itr", "interrupt delay limit in usecs/4", &adapter->tx_itr, E1000_REGISTER(&adapter->hw, E1000_ITR), DEFAULT_ITR); } /* Sysctls for limiting the amount of work done in the taskqueue */ lem_add_rx_process_limit(adapter, "rx_processing_limit", "max number of rx packets to process", &adapter->rx_process_limit, lem_rx_process_limit); #ifdef NIC_SEND_COMBINING /* Sysctls to control mitigation */ lem_add_rx_process_limit(adapter, "sc_enable", "driver TDT mitigation", &adapter->sc_enable, 0); #endif /* NIC_SEND_COMBINING */ #ifdef BATCH_DISPATCH lem_add_rx_process_limit(adapter, "batch_enable", "driver rx batch", &adapter->batch_enable, 0); #endif /* BATCH_DISPATCH */ #ifdef NIC_PARAVIRT lem_add_rx_process_limit(adapter, "rx_retries", "driver rx retries", &adapter->rx_retries, 0); #endif /* NIC_PARAVIRT */ /* Sysctl for setting the interface flow control */ lem_set_flow_cntrl(adapter, "flow_control", "flow control setting", &adapter->fc_setting, lem_fc_setting); /* * Validate number of transmit and receive descriptors. It * must not exceed hardware maximum, and must be multiple * of E1000_DBA_ALIGN. */ if (((lem_txd * sizeof(struct e1000_tx_desc)) % EM_DBA_ALIGN) != 0 || (adapter->hw.mac.type >= e1000_82544 && lem_txd > EM_MAX_TXD) || (adapter->hw.mac.type < e1000_82544 && lem_txd > EM_MAX_TXD_82543) || (lem_txd < EM_MIN_TXD)) { device_printf(dev, "Using %d TX descriptors instead of %d!\n", EM_DEFAULT_TXD, lem_txd); adapter->num_tx_desc = EM_DEFAULT_TXD; } else adapter->num_tx_desc = lem_txd; if (((lem_rxd * sizeof(struct e1000_rx_desc)) % EM_DBA_ALIGN) != 0 || (adapter->hw.mac.type >= e1000_82544 && lem_rxd > EM_MAX_RXD) || (adapter->hw.mac.type < e1000_82544 && lem_rxd > EM_MAX_RXD_82543) || (lem_rxd < EM_MIN_RXD)) { device_printf(dev, "Using %d RX descriptors instead of %d!\n", EM_DEFAULT_RXD, lem_rxd); adapter->num_rx_desc = EM_DEFAULT_RXD; } else adapter->num_rx_desc = lem_rxd; adapter->hw.mac.autoneg = DO_AUTO_NEG; adapter->hw.phy.autoneg_wait_to_complete = FALSE; adapter->hw.phy.autoneg_advertised = AUTONEG_ADV_DEFAULT; adapter->rx_buffer_len = 2048; e1000_init_script_state_82541(&adapter->hw, TRUE); e1000_set_tbi_compatibility_82543(&adapter->hw, TRUE); /* Copper options */ if (adapter->hw.phy.media_type == e1000_media_type_copper) { adapter->hw.phy.mdix = AUTO_ALL_MODES; adapter->hw.phy.disable_polarity_correction = FALSE; adapter->hw.phy.ms_type = EM_MASTER_SLAVE; } /* * Set the frame limits assuming * standard ethernet sized frames. */ adapter->max_frame_size = ETHERMTU + ETHER_HDR_LEN + ETHERNET_FCS_SIZE; adapter->min_frame_size = ETH_ZLEN + ETHERNET_FCS_SIZE; /* * This controls when hardware reports transmit completion * status. */ adapter->hw.mac.report_tx_early = 1; #ifdef NIC_PARAVIRT device_printf(dev, "driver supports paravirt, subdev 0x%x\n", adapter->hw.subsystem_device_id); if (adapter->hw.subsystem_device_id == E1000_PARA_SUBDEV) { uint64_t bus_addr; device_printf(dev, "paravirt support on dev %p\n", adapter); tsize = 4096; // XXX one page for the csb if (lem_dma_malloc(adapter, tsize, &adapter->csb_mem, BUS_DMA_NOWAIT)) { device_printf(dev, "Unable to allocate csb memory\n"); error = ENOMEM; goto err_csb; } /* Setup the Base of the CSB */ adapter->csb = (struct paravirt_csb *)adapter->csb_mem.dma_vaddr; /* force the first kick */ adapter->csb->host_need_txkick = 1; /* txring empty */ adapter->csb->guest_need_rxkick = 1; /* no rx packets */ bus_addr = adapter->csb_mem.dma_paddr; lem_add_rx_process_limit(adapter, "csb_on", "enable paravirt.", &adapter->csb->guest_csb_on, 0); lem_add_rx_process_limit(adapter, "txc_lim", "txc_lim", &adapter->csb->host_txcycles_lim, 1); /* some stats */ #define PA_SC(name, var, val) \ lem_add_rx_process_limit(adapter, name, name, var, val) PA_SC("host_need_txkick",&adapter->csb->host_need_txkick, 1); PA_SC("host_rxkick_at",&adapter->csb->host_rxkick_at, ~0); PA_SC("guest_need_txkick",&adapter->csb->guest_need_txkick, 0); PA_SC("guest_need_rxkick",&adapter->csb->guest_need_rxkick, 1); PA_SC("tdt_reg_count",&adapter->tdt_reg_count, 0); PA_SC("tdt_csb_count",&adapter->tdt_csb_count, 0); PA_SC("tdt_int_count",&adapter->tdt_int_count, 0); PA_SC("guest_need_kick_count",&adapter->guest_need_kick_count, 0); /* tell the host where the block is */ E1000_WRITE_REG(&adapter->hw, E1000_CSBAH, (u32)(bus_addr >> 32)); E1000_WRITE_REG(&adapter->hw, E1000_CSBAL, (u32)bus_addr); } #endif /* NIC_PARAVIRT */ tsize = roundup2(adapter->num_tx_desc * sizeof(struct e1000_tx_desc), EM_DBA_ALIGN); /* Allocate Transmit Descriptor ring */ if (lem_dma_malloc(adapter, tsize, &adapter->txdma, BUS_DMA_NOWAIT)) { device_printf(dev, "Unable to allocate tx_desc memory\n"); error = ENOMEM; goto err_tx_desc; } adapter->tx_desc_base = (struct e1000_tx_desc *)adapter->txdma.dma_vaddr; rsize = roundup2(adapter->num_rx_desc * sizeof(struct e1000_rx_desc), EM_DBA_ALIGN); /* Allocate Receive Descriptor ring */ if (lem_dma_malloc(adapter, rsize, &adapter->rxdma, BUS_DMA_NOWAIT)) { device_printf(dev, "Unable to allocate rx_desc memory\n"); error = ENOMEM; goto err_rx_desc; } adapter->rx_desc_base = (struct e1000_rx_desc *)adapter->rxdma.dma_vaddr; /* Allocate multicast array memory. */ adapter->mta = malloc(sizeof(u8) * ETH_ADDR_LEN * MAX_NUM_MULTICAST_ADDRESSES, M_DEVBUF, M_NOWAIT); if (adapter->mta == NULL) { device_printf(dev, "Can not allocate multicast setup array\n"); error = ENOMEM; goto err_hw_init; } /* ** Start from a known state, this is ** important in reading the nvm and ** mac from that. */ e1000_reset_hw(&adapter->hw); /* Make sure we have a good EEPROM before we read from it */ if (e1000_validate_nvm_checksum(&adapter->hw) < 0) { /* ** Some PCI-E parts fail the first check due to ** the link being in sleep state, call it again, ** if it fails a second time its a real issue. */ if (e1000_validate_nvm_checksum(&adapter->hw) < 0) { device_printf(dev, "The EEPROM Checksum Is Not Valid\n"); error = EIO; goto err_hw_init; } } /* Copy the permanent MAC address out of the EEPROM */ if (e1000_read_mac_addr(&adapter->hw) < 0) { device_printf(dev, "EEPROM read error while reading MAC" " address\n"); error = EIO; goto err_hw_init; } if (!lem_is_valid_ether_addr(adapter->hw.mac.addr)) { device_printf(dev, "Invalid MAC address\n"); error = EIO; goto err_hw_init; } /* Initialize the hardware */ if (lem_hardware_init(adapter)) { device_printf(dev, "Unable to initialize the hardware\n"); error = EIO; goto err_hw_init; } /* Allocate transmit descriptors and buffers */ if (lem_allocate_transmit_structures(adapter)) { device_printf(dev, "Could not setup transmit structures\n"); error = ENOMEM; goto err_tx_struct; } /* Allocate receive descriptors and buffers */ if (lem_allocate_receive_structures(adapter)) { device_printf(dev, "Could not setup receive structures\n"); error = ENOMEM; goto err_rx_struct; } /* ** Do interrupt configuration */ error = lem_allocate_irq(adapter); if (error) goto err_rx_struct; /* * Get Wake-on-Lan and Management info for later use */ lem_get_wakeup(dev); /* Setup OS specific network interface */ if (lem_setup_interface(dev, adapter) != 0) goto err_rx_struct; /* Initialize statistics */ lem_update_stats_counters(adapter); adapter->hw.mac.get_link_status = 1; lem_update_link_status(adapter); /* Indicate SOL/IDER usage */ if (e1000_check_reset_block(&adapter->hw)) device_printf(dev, "PHY reset is blocked due to SOL/IDER session.\n"); /* Do we need workaround for 82544 PCI-X adapter? */ if (adapter->hw.bus.type == e1000_bus_type_pcix && adapter->hw.mac.type == e1000_82544) adapter->pcix_82544 = TRUE; else adapter->pcix_82544 = FALSE; /* Register for VLAN events */ adapter->vlan_attach = EVENTHANDLER_REGISTER(vlan_config, lem_register_vlan, adapter, EVENTHANDLER_PRI_FIRST); adapter->vlan_detach = EVENTHANDLER_REGISTER(vlan_unconfig, lem_unregister_vlan, adapter, EVENTHANDLER_PRI_FIRST); lem_add_hw_stats(adapter); /* Non-AMT based hardware can now take control from firmware */ if (adapter->has_manage && !adapter->has_amt) lem_get_hw_control(adapter); /* Tell the stack that the interface is not active */ if_setdrvflagbits(adapter->ifp, 0, IFF_DRV_OACTIVE | IFF_DRV_RUNNING); adapter->led_dev = led_create(lem_led_func, adapter, device_get_nameunit(dev)); #ifdef DEV_NETMAP lem_netmap_attach(adapter); #endif /* DEV_NETMAP */ INIT_DEBUGOUT("lem_attach: end"); return (0); err_rx_struct: lem_free_transmit_structures(adapter); err_tx_struct: err_hw_init: lem_release_hw_control(adapter); lem_dma_free(adapter, &adapter->rxdma); err_rx_desc: lem_dma_free(adapter, &adapter->txdma); err_tx_desc: #ifdef NIC_PARAVIRT lem_dma_free(adapter, &adapter->csb_mem); err_csb: #endif /* NIC_PARAVIRT */ err_pci: if (adapter->ifp != (void *)NULL) if_free(adapter->ifp); lem_free_pci_resources(adapter); free(adapter->mta, M_DEVBUF); EM_TX_LOCK_DESTROY(adapter); EM_RX_LOCK_DESTROY(adapter); EM_CORE_LOCK_DESTROY(adapter); return (error); } /********************************************************************* * Device removal routine * * The detach entry point is called when the driver is being removed. * This routine stops the adapter and deallocates all the resources * that were allocated for driver operation. * * return 0 on success, positive on failure *********************************************************************/ static int lem_detach(device_t dev) { struct adapter *adapter = device_get_softc(dev); if_t ifp = adapter->ifp; INIT_DEBUGOUT("em_detach: begin"); /* Make sure VLANS are not using driver */ if (if_vlantrunkinuse(ifp)) { device_printf(dev,"Vlan in use, detach first\n"); return (EBUSY); } #ifdef DEVICE_POLLING if (if_getcapenable(ifp) & IFCAP_POLLING) ether_poll_deregister(ifp); #endif if (adapter->led_dev != NULL) led_destroy(adapter->led_dev); EM_CORE_LOCK(adapter); EM_TX_LOCK(adapter); adapter->in_detach = 1; lem_stop(adapter); e1000_phy_hw_reset(&adapter->hw); lem_release_manageability(adapter); EM_TX_UNLOCK(adapter); EM_CORE_UNLOCK(adapter); /* Unregister VLAN events */ if (adapter->vlan_attach != NULL) EVENTHANDLER_DEREGISTER(vlan_config, adapter->vlan_attach); if (adapter->vlan_detach != NULL) EVENTHANDLER_DEREGISTER(vlan_unconfig, adapter->vlan_detach); ether_ifdetach(adapter->ifp); callout_drain(&adapter->timer); callout_drain(&adapter->tx_fifo_timer); #ifdef DEV_NETMAP netmap_detach(ifp); #endif /* DEV_NETMAP */ lem_free_pci_resources(adapter); bus_generic_detach(dev); if_free(ifp); lem_free_transmit_structures(adapter); lem_free_receive_structures(adapter); /* Free Transmit Descriptor ring */ if (adapter->tx_desc_base) { lem_dma_free(adapter, &adapter->txdma); adapter->tx_desc_base = NULL; } /* Free Receive Descriptor ring */ if (adapter->rx_desc_base) { lem_dma_free(adapter, &adapter->rxdma); adapter->rx_desc_base = NULL; } #ifdef NIC_PARAVIRT if (adapter->csb) { lem_dma_free(adapter, &adapter->csb_mem); adapter->csb = NULL; } #endif /* NIC_PARAVIRT */ lem_release_hw_control(adapter); free(adapter->mta, M_DEVBUF); EM_TX_LOCK_DESTROY(adapter); EM_RX_LOCK_DESTROY(adapter); EM_CORE_LOCK_DESTROY(adapter); return (0); } /********************************************************************* * * Shutdown entry point * **********************************************************************/ static int lem_shutdown(device_t dev) { return lem_suspend(dev); } /* * Suspend/resume device methods. */ static int lem_suspend(device_t dev) { struct adapter *adapter = device_get_softc(dev); EM_CORE_LOCK(adapter); lem_release_manageability(adapter); lem_release_hw_control(adapter); lem_enable_wakeup(dev); EM_CORE_UNLOCK(adapter); return bus_generic_suspend(dev); } static int lem_resume(device_t dev) { struct adapter *adapter = device_get_softc(dev); if_t ifp = adapter->ifp; EM_CORE_LOCK(adapter); lem_init_locked(adapter); lem_init_manageability(adapter); EM_CORE_UNLOCK(adapter); lem_start(ifp); return bus_generic_resume(dev); } static void lem_start_locked(if_t ifp) { struct adapter *adapter = if_getsoftc(ifp); struct mbuf *m_head; EM_TX_LOCK_ASSERT(adapter); if ((if_getdrvflags(ifp) & (IFF_DRV_RUNNING|IFF_DRV_OACTIVE)) != IFF_DRV_RUNNING) return; if (!adapter->link_active) return; /* * Force a cleanup if number of TX descriptors * available hits the threshold */ if (adapter->num_tx_desc_avail <= EM_TX_CLEANUP_THRESHOLD) { lem_txeof(adapter); /* Now do we at least have a minimal? */ if (adapter->num_tx_desc_avail <= EM_TX_OP_THRESHOLD) { adapter->no_tx_desc_avail1++; return; } } while (!if_sendq_empty(ifp)) { m_head = if_dequeue(ifp); if (m_head == NULL) break; /* * Encapsulation can modify our pointer, and or make it * NULL on failure. In that event, we can't requeue. */ if (lem_xmit(adapter, &m_head)) { if (m_head == NULL) break; if_setdrvflagbits(ifp, IFF_DRV_OACTIVE, 0); if_sendq_prepend(ifp, m_head); break; } /* Send a copy of the frame to the BPF listener */ if_etherbpfmtap(ifp, m_head); /* Set timeout in case hardware has problems transmitting. */ adapter->watchdog_check = TRUE; adapter->watchdog_time = ticks; } if (adapter->num_tx_desc_avail <= EM_TX_OP_THRESHOLD) if_setdrvflagbits(ifp, IFF_DRV_OACTIVE, 0); #ifdef NIC_PARAVIRT if (if_getdrvflags(ifp) & IFF_DRV_OACTIVE && adapter->csb && adapter->csb->guest_csb_on && !(adapter->csb->guest_need_txkick & 1)) { adapter->csb->guest_need_txkick = 1; adapter->guest_need_kick_count++; // XXX memory barrier lem_txeof(adapter); // XXX possibly clear IFF_DRV_OACTIVE } #endif /* NIC_PARAVIRT */ return; } static void lem_start(if_t ifp) { struct adapter *adapter = if_getsoftc(ifp); EM_TX_LOCK(adapter); if (if_getdrvflags(ifp) & IFF_DRV_RUNNING) lem_start_locked(ifp); EM_TX_UNLOCK(adapter); } /********************************************************************* * Ioctl entry point * * em_ioctl is called when the user wants to configure the * interface. * * return 0 on success, positive on failure **********************************************************************/ static int lem_ioctl(if_t ifp, u_long command, caddr_t data) { struct adapter *adapter = if_getsoftc(ifp); struct ifreq *ifr = (struct ifreq *)data; #if defined(INET) || defined(INET6) struct ifaddr *ifa = (struct ifaddr *)data; #endif bool avoid_reset = FALSE; int error = 0; if (adapter->in_detach) return (error); switch (command) { case SIOCSIFADDR: #ifdef INET if (ifa->ifa_addr->sa_family == AF_INET) avoid_reset = TRUE; #endif #ifdef INET6 if (ifa->ifa_addr->sa_family == AF_INET6) avoid_reset = TRUE; #endif /* ** Calling init results in link renegotiation, ** so we avoid doing it when possible. */ if (avoid_reset) { if_setflagbits(ifp, IFF_UP, 0); if (!(if_getdrvflags(ifp) & IFF_DRV_RUNNING)) lem_init(adapter); #ifdef INET if (!(if_getflags(ifp) & IFF_NOARP)) arp_ifinit(ifp, ifa); #endif } else error = ether_ioctl(ifp, command, data); break; case SIOCSIFMTU: { int max_frame_size; IOCTL_DEBUGOUT("ioctl rcv'd: SIOCSIFMTU (Set Interface MTU)"); EM_CORE_LOCK(adapter); switch (adapter->hw.mac.type) { case e1000_82542: max_frame_size = ETHER_MAX_LEN; break; default: max_frame_size = MAX_JUMBO_FRAME_SIZE; } if (ifr->ifr_mtu > max_frame_size - ETHER_HDR_LEN - ETHER_CRC_LEN) { EM_CORE_UNLOCK(adapter); error = EINVAL; break; } if_setmtu(ifp, ifr->ifr_mtu); adapter->max_frame_size = if_getmtu(ifp) + ETHER_HDR_LEN + ETHER_CRC_LEN; - if ((if_getdrvflags(ifp) & IFF_DRV_RUNNING)) + if (if_getdrvflags(ifp) & IFF_DRV_RUNNING) lem_init_locked(adapter); EM_CORE_UNLOCK(adapter); break; } case SIOCSIFFLAGS: IOCTL_DEBUGOUT("ioctl rcv'd:\ SIOCSIFFLAGS (Set Interface Flags)"); EM_CORE_LOCK(adapter); if (if_getflags(ifp) & IFF_UP) { if ((if_getdrvflags(ifp) & IFF_DRV_RUNNING)) { if ((if_getflags(ifp) ^ adapter->if_flags) & (IFF_PROMISC | IFF_ALLMULTI)) { lem_disable_promisc(adapter); lem_set_promisc(adapter); } } else lem_init_locked(adapter); } else if (if_getdrvflags(ifp) & IFF_DRV_RUNNING) { EM_TX_LOCK(adapter); lem_stop(adapter); EM_TX_UNLOCK(adapter); } adapter->if_flags = if_getflags(ifp); EM_CORE_UNLOCK(adapter); break; case SIOCADDMULTI: case SIOCDELMULTI: IOCTL_DEBUGOUT("ioctl rcv'd: SIOC(ADD|DEL)MULTI"); if (if_getdrvflags(ifp) & IFF_DRV_RUNNING) { EM_CORE_LOCK(adapter); lem_disable_intr(adapter); lem_set_multi(adapter); if (adapter->hw.mac.type == e1000_82542 && adapter->hw.revision_id == E1000_REVISION_2) { lem_initialize_receive_unit(adapter); } #ifdef DEVICE_POLLING if (!(if_getcapenable(ifp) & IFCAP_POLLING)) #endif lem_enable_intr(adapter); EM_CORE_UNLOCK(adapter); } break; case SIOCSIFMEDIA: /* Check SOL/IDER usage */ EM_CORE_LOCK(adapter); if (e1000_check_reset_block(&adapter->hw)) { EM_CORE_UNLOCK(adapter); device_printf(adapter->dev, "Media change is" " blocked due to SOL/IDER session.\n"); break; } EM_CORE_UNLOCK(adapter); case SIOCGIFMEDIA: IOCTL_DEBUGOUT("ioctl rcv'd: \ SIOCxIFMEDIA (Get/Set Interface Media)"); error = ifmedia_ioctl(ifp, ifr, &adapter->media, command); break; case SIOCSIFCAP: { int mask, reinit; IOCTL_DEBUGOUT("ioctl rcv'd: SIOCSIFCAP (Set Capabilities)"); reinit = 0; mask = ifr->ifr_reqcap ^ if_getcapenable(ifp); #ifdef DEVICE_POLLING if (mask & IFCAP_POLLING) { if (ifr->ifr_reqcap & IFCAP_POLLING) { error = ether_poll_register(lem_poll, ifp); if (error) return (error); EM_CORE_LOCK(adapter); lem_disable_intr(adapter); if_setcapenablebit(ifp, IFCAP_POLLING, 0); EM_CORE_UNLOCK(adapter); } else { error = ether_poll_deregister(ifp); /* Enable interrupt even in error case */ EM_CORE_LOCK(adapter); lem_enable_intr(adapter); if_setcapenablebit(ifp, 0, IFCAP_POLLING); EM_CORE_UNLOCK(adapter); } } #endif if (mask & IFCAP_HWCSUM) { if_togglecapenable(ifp, IFCAP_HWCSUM); reinit = 1; } if (mask & IFCAP_VLAN_HWTAGGING) { if_togglecapenable(ifp, IFCAP_VLAN_HWTAGGING); reinit = 1; } if ((mask & IFCAP_WOL) && (if_getcapabilities(ifp) & IFCAP_WOL) != 0) { if (mask & IFCAP_WOL_MCAST) if_togglecapenable(ifp, IFCAP_WOL_MCAST); if (mask & IFCAP_WOL_MAGIC) if_togglecapenable(ifp, IFCAP_WOL_MAGIC); } if (reinit && (if_getdrvflags(ifp) & IFF_DRV_RUNNING)) lem_init(adapter); if_vlancap(ifp); break; } default: error = ether_ioctl(ifp, command, data); break; } return (error); } /********************************************************************* * Init entry point * * This routine is used in two ways. It is used by the stack as * init entry point in network interface structure. It is also used * by the driver as a hw/sw initialization routine to get to a * consistent state. * * return 0 on success, positive on failure **********************************************************************/ static void lem_init_locked(struct adapter *adapter) { if_t ifp = adapter->ifp; device_t dev = adapter->dev; u32 pba; INIT_DEBUGOUT("lem_init: begin"); EM_CORE_LOCK_ASSERT(adapter); EM_TX_LOCK(adapter); lem_stop(adapter); EM_TX_UNLOCK(adapter); /* * Packet Buffer Allocation (PBA) * Writing PBA sets the receive portion of the buffer * the remainder is used for the transmit buffer. * * Devices before the 82547 had a Packet Buffer of 64K. * Default allocation: PBA=48K for Rx, leaving 16K for Tx. * After the 82547 the buffer was reduced to 40K. * Default allocation: PBA=30K for Rx, leaving 10K for Tx. * Note: default does not leave enough room for Jumbo Frame >10k. */ switch (adapter->hw.mac.type) { case e1000_82547: case e1000_82547_rev_2: /* 82547: Total Packet Buffer is 40K */ if (adapter->max_frame_size > 8192) pba = E1000_PBA_22K; /* 22K for Rx, 18K for Tx */ else pba = E1000_PBA_30K; /* 30K for Rx, 10K for Tx */ adapter->tx_fifo_head = 0; adapter->tx_head_addr = pba << EM_TX_HEAD_ADDR_SHIFT; adapter->tx_fifo_size = (E1000_PBA_40K - pba) << EM_PBA_BYTES_SHIFT; break; default: /* Devices before 82547 had a Packet Buffer of 64K. */ if (adapter->max_frame_size > 8192) pba = E1000_PBA_40K; /* 40K for Rx, 24K for Tx */ else pba = E1000_PBA_48K; /* 48K for Rx, 16K for Tx */ } INIT_DEBUGOUT1("lem_init: pba=%dK",pba); E1000_WRITE_REG(&adapter->hw, E1000_PBA, pba); /* Get the latest mac address, User can use a LAA */ bcopy(if_getlladdr(adapter->ifp), adapter->hw.mac.addr, ETHER_ADDR_LEN); /* Put the address into the Receive Address Array */ e1000_rar_set(&adapter->hw, adapter->hw.mac.addr, 0); /* Initialize the hardware */ if (lem_hardware_init(adapter)) { device_printf(dev, "Unable to initialize the hardware\n"); return; } lem_update_link_status(adapter); /* Setup VLAN support, basic and offload if available */ E1000_WRITE_REG(&adapter->hw, E1000_VET, ETHERTYPE_VLAN); /* Set hardware offload abilities */ if_clearhwassist(ifp); if (adapter->hw.mac.type >= e1000_82543) { if (if_getcapenable(ifp) & IFCAP_TXCSUM) if_sethwassistbits(ifp, CSUM_TCP | CSUM_UDP, 0); } /* Configure for OS presence */ lem_init_manageability(adapter); /* Prepare transmit descriptors and buffers */ lem_setup_transmit_structures(adapter); lem_initialize_transmit_unit(adapter); /* Setup Multicast table */ lem_set_multi(adapter); /* Prepare receive descriptors and buffers */ if (lem_setup_receive_structures(adapter)) { device_printf(dev, "Could not setup receive structures\n"); EM_TX_LOCK(adapter); lem_stop(adapter); EM_TX_UNLOCK(adapter); return; } lem_initialize_receive_unit(adapter); /* Use real VLAN Filter support? */ if (if_getcapenable(ifp) & IFCAP_VLAN_HWTAGGING) { if (if_getcapenable(ifp) & IFCAP_VLAN_HWFILTER) /* Use real VLAN Filter support */ lem_setup_vlan_hw_support(adapter); else { u32 ctrl; ctrl = E1000_READ_REG(&adapter->hw, E1000_CTRL); ctrl |= E1000_CTRL_VME; E1000_WRITE_REG(&adapter->hw, E1000_CTRL, ctrl); } } /* Don't lose promiscuous settings */ lem_set_promisc(adapter); if_setdrvflagbits(ifp, IFF_DRV_RUNNING, IFF_DRV_OACTIVE); callout_reset(&adapter->timer, hz, lem_local_timer, adapter); e1000_clear_hw_cntrs_base_generic(&adapter->hw); #ifdef DEVICE_POLLING /* * Only enable interrupts if we are not polling, make sure * they are off otherwise. */ if (if_getcapenable(ifp) & IFCAP_POLLING) lem_disable_intr(adapter); else #endif /* DEVICE_POLLING */ lem_enable_intr(adapter); /* AMT based hardware can now take control from firmware */ if (adapter->has_manage && adapter->has_amt) lem_get_hw_control(adapter); } static void lem_init(void *arg) { struct adapter *adapter = arg; EM_CORE_LOCK(adapter); lem_init_locked(adapter); EM_CORE_UNLOCK(adapter); } #ifdef DEVICE_POLLING /********************************************************************* * * Legacy polling routine * *********************************************************************/ static int lem_poll(if_t ifp, enum poll_cmd cmd, int count) { struct adapter *adapter = if_getsoftc(ifp); u32 reg_icr, rx_done = 0; EM_CORE_LOCK(adapter); if ((if_getdrvflags(ifp) & IFF_DRV_RUNNING) == 0) { EM_CORE_UNLOCK(adapter); return (rx_done); } if (cmd == POLL_AND_CHECK_STATUS) { reg_icr = E1000_READ_REG(&adapter->hw, E1000_ICR); if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC)) { callout_stop(&adapter->timer); adapter->hw.mac.get_link_status = 1; lem_update_link_status(adapter); callout_reset(&adapter->timer, hz, lem_local_timer, adapter); } } EM_CORE_UNLOCK(adapter); lem_rxeof(adapter, count, &rx_done); EM_TX_LOCK(adapter); lem_txeof(adapter); if(!if_sendq_empty(ifp)) lem_start_locked(ifp); EM_TX_UNLOCK(adapter); return (rx_done); } #endif /* DEVICE_POLLING */ /********************************************************************* * * Legacy Interrupt Service routine * *********************************************************************/ static void lem_intr(void *arg) { struct adapter *adapter = arg; if_t ifp = adapter->ifp; u32 reg_icr; if ((if_getcapenable(ifp) & IFCAP_POLLING) || ((if_getdrvflags(ifp) & IFF_DRV_RUNNING) == 0)) return; EM_CORE_LOCK(adapter); reg_icr = E1000_READ_REG(&adapter->hw, E1000_ICR); if (reg_icr & E1000_ICR_RXO) adapter->rx_overruns++; if ((reg_icr == 0xffffffff) || (reg_icr == 0)) { EM_CORE_UNLOCK(adapter); return; } if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC)) { callout_stop(&adapter->timer); adapter->hw.mac.get_link_status = 1; lem_update_link_status(adapter); /* Deal with TX cruft when link lost */ lem_tx_purge(adapter); callout_reset(&adapter->timer, hz, lem_local_timer, adapter); EM_CORE_UNLOCK(adapter); return; } EM_CORE_UNLOCK(adapter); lem_rxeof(adapter, -1, NULL); EM_TX_LOCK(adapter); lem_txeof(adapter); if ((if_getdrvflags(ifp) & IFF_DRV_RUNNING) && (!if_sendq_empty(ifp))) lem_start_locked(ifp); EM_TX_UNLOCK(adapter); return; } static void lem_handle_link(void *context, int pending) { struct adapter *adapter = context; if_t ifp = adapter->ifp; if (!(if_getdrvflags(ifp) & IFF_DRV_RUNNING)) return; EM_CORE_LOCK(adapter); callout_stop(&adapter->timer); lem_update_link_status(adapter); /* Deal with TX cruft when link lost */ lem_tx_purge(adapter); callout_reset(&adapter->timer, hz, lem_local_timer, adapter); EM_CORE_UNLOCK(adapter); } /* Combined RX/TX handler, used by Legacy and MSI */ static void lem_handle_rxtx(void *context, int pending) { struct adapter *adapter = context; if_t ifp = adapter->ifp; if (if_getdrvflags(ifp) & IFF_DRV_RUNNING) { bool more = lem_rxeof(adapter, adapter->rx_process_limit, NULL); EM_TX_LOCK(adapter); lem_txeof(adapter); if(!if_sendq_empty(ifp)) lem_start_locked(ifp); EM_TX_UNLOCK(adapter); if (more) { taskqueue_enqueue(adapter->tq, &adapter->rxtx_task); return; } } if (if_getdrvflags(ifp) & IFF_DRV_RUNNING) lem_enable_intr(adapter); } /********************************************************************* * * Fast Legacy/MSI Combined Interrupt Service routine * *********************************************************************/ static int lem_irq_fast(void *arg) { struct adapter *adapter = arg; if_t ifp; u32 reg_icr; ifp = adapter->ifp; reg_icr = E1000_READ_REG(&adapter->hw, E1000_ICR); /* Hot eject? */ if (reg_icr == 0xffffffff) return FILTER_STRAY; /* Definitely not our interrupt. */ if (reg_icr == 0x0) return FILTER_STRAY; /* * Mask interrupts until the taskqueue is finished running. This is * cheap, just assume that it is needed. This also works around the * MSI message reordering errata on certain systems. */ lem_disable_intr(adapter); taskqueue_enqueue(adapter->tq, &adapter->rxtx_task); /* Link status change */ if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC)) { adapter->hw.mac.get_link_status = 1; taskqueue_enqueue(taskqueue_fast, &adapter->link_task); } if (reg_icr & E1000_ICR_RXO) adapter->rx_overruns++; return FILTER_HANDLED; } /********************************************************************* * * Media Ioctl callback * * This routine is called whenever the user queries the status of * the interface using ifconfig. * **********************************************************************/ static void lem_media_status(if_t ifp, struct ifmediareq *ifmr) { struct adapter *adapter = if_getsoftc(ifp); u_char fiber_type = IFM_1000_SX; INIT_DEBUGOUT("lem_media_status: begin"); EM_CORE_LOCK(adapter); lem_update_link_status(adapter); ifmr->ifm_status = IFM_AVALID; ifmr->ifm_active = IFM_ETHER; if (!adapter->link_active) { EM_CORE_UNLOCK(adapter); return; } ifmr->ifm_status |= IFM_ACTIVE; if ((adapter->hw.phy.media_type == e1000_media_type_fiber) || (adapter->hw.phy.media_type == e1000_media_type_internal_serdes)) { if (adapter->hw.mac.type == e1000_82545) fiber_type = IFM_1000_LX; ifmr->ifm_active |= fiber_type | IFM_FDX; } else { switch (adapter->link_speed) { case 10: ifmr->ifm_active |= IFM_10_T; break; case 100: ifmr->ifm_active |= IFM_100_TX; break; case 1000: ifmr->ifm_active |= IFM_1000_T; break; } if (adapter->link_duplex == FULL_DUPLEX) ifmr->ifm_active |= IFM_FDX; else ifmr->ifm_active |= IFM_HDX; } EM_CORE_UNLOCK(adapter); } /********************************************************************* * * Media Ioctl callback * * This routine is called when the user changes speed/duplex using * media/mediopt option with ifconfig. * **********************************************************************/ static int lem_media_change(if_t ifp) { struct adapter *adapter = if_getsoftc(ifp); struct ifmedia *ifm = &adapter->media; INIT_DEBUGOUT("lem_media_change: begin"); if (IFM_TYPE(ifm->ifm_media) != IFM_ETHER) return (EINVAL); EM_CORE_LOCK(adapter); switch (IFM_SUBTYPE(ifm->ifm_media)) { case IFM_AUTO: adapter->hw.mac.autoneg = DO_AUTO_NEG; adapter->hw.phy.autoneg_advertised = AUTONEG_ADV_DEFAULT; break; case IFM_1000_LX: case IFM_1000_SX: case IFM_1000_T: adapter->hw.mac.autoneg = DO_AUTO_NEG; adapter->hw.phy.autoneg_advertised = ADVERTISE_1000_FULL; break; case IFM_100_TX: adapter->hw.mac.autoneg = FALSE; adapter->hw.phy.autoneg_advertised = 0; if ((ifm->ifm_media & IFM_GMASK) == IFM_FDX) adapter->hw.mac.forced_speed_duplex = ADVERTISE_100_FULL; else adapter->hw.mac.forced_speed_duplex = ADVERTISE_100_HALF; break; case IFM_10_T: adapter->hw.mac.autoneg = FALSE; adapter->hw.phy.autoneg_advertised = 0; if ((ifm->ifm_media & IFM_GMASK) == IFM_FDX) adapter->hw.mac.forced_speed_duplex = ADVERTISE_10_FULL; else adapter->hw.mac.forced_speed_duplex = ADVERTISE_10_HALF; break; default: device_printf(adapter->dev, "Unsupported media type\n"); } lem_init_locked(adapter); EM_CORE_UNLOCK(adapter); return (0); } /********************************************************************* * * This routine maps the mbufs to tx descriptors. * * return 0 on success, positive on failure **********************************************************************/ static int lem_xmit(struct adapter *adapter, struct mbuf **m_headp) { bus_dma_segment_t segs[EM_MAX_SCATTER]; bus_dmamap_t map; struct em_buffer *tx_buffer, *tx_buffer_mapped; struct e1000_tx_desc *ctxd = NULL; struct mbuf *m_head; u32 txd_upper, txd_lower, txd_used, txd_saved; int error, nsegs, i, j, first, last = 0; m_head = *m_headp; txd_upper = txd_lower = txd_used = txd_saved = 0; /* ** When doing checksum offload, it is critical to ** make sure the first mbuf has more than header, ** because that routine expects data to be present. */ if ((m_head->m_pkthdr.csum_flags & CSUM_OFFLOAD) && (m_head->m_len < ETHER_HDR_LEN + sizeof(struct ip))) { m_head = m_pullup(m_head, ETHER_HDR_LEN + sizeof(struct ip)); *m_headp = m_head; if (m_head == NULL) return (ENOBUFS); } /* * Map the packet for DMA * * Capture the first descriptor index, * this descriptor will have the index * of the EOP which is the only one that * now gets a DONE bit writeback. */ first = adapter->next_avail_tx_desc; tx_buffer = &adapter->tx_buffer_area[first]; tx_buffer_mapped = tx_buffer; map = tx_buffer->map; error = bus_dmamap_load_mbuf_sg(adapter->txtag, map, *m_headp, segs, &nsegs, BUS_DMA_NOWAIT); /* * There are two types of errors we can (try) to handle: * - EFBIG means the mbuf chain was too long and bus_dma ran * out of segments. Defragment the mbuf chain and try again. * - ENOMEM means bus_dma could not obtain enough bounce buffers * at this point in time. Defer sending and try again later. * All other errors, in particular EINVAL, are fatal and prevent the * mbuf chain from ever going through. Drop it and report error. */ if (error == EFBIG) { struct mbuf *m; m = m_collapse(*m_headp, M_NOWAIT, EM_MAX_SCATTER); if (m == NULL) { adapter->mbuf_defrag_failed++; m_freem(*m_headp); *m_headp = NULL; return (ENOBUFS); } *m_headp = m; /* Try it again */ error = bus_dmamap_load_mbuf_sg(adapter->txtag, map, *m_headp, segs, &nsegs, BUS_DMA_NOWAIT); if (error) { adapter->no_tx_dma_setup++; m_freem(*m_headp); *m_headp = NULL; return (error); } } else if (error != 0) { adapter->no_tx_dma_setup++; return (error); } if (adapter->num_tx_desc_avail < (nsegs + 2)) { adapter->no_tx_desc_avail2++; bus_dmamap_unload(adapter->txtag, map); return (ENOBUFS); } m_head = *m_headp; /* Do hardware assists */ if (m_head->m_pkthdr.csum_flags & CSUM_OFFLOAD) lem_transmit_checksum_setup(adapter, m_head, &txd_upper, &txd_lower); i = adapter->next_avail_tx_desc; if (adapter->pcix_82544) txd_saved = i; /* Set up our transmit descriptors */ for (j = 0; j < nsegs; j++) { bus_size_t seg_len; bus_addr_t seg_addr; /* If adapter is 82544 and on PCIX bus */ if(adapter->pcix_82544) { DESC_ARRAY desc_array; u32 array_elements, counter; /* * Check the Address and Length combination and * split the data accordingly */ array_elements = lem_fill_descriptors(segs[j].ds_addr, segs[j].ds_len, &desc_array); for (counter = 0; counter < array_elements; counter++) { if (txd_used == adapter->num_tx_desc_avail) { adapter->next_avail_tx_desc = txd_saved; adapter->no_tx_desc_avail2++; bus_dmamap_unload(adapter->txtag, map); return (ENOBUFS); } tx_buffer = &adapter->tx_buffer_area[i]; ctxd = &adapter->tx_desc_base[i]; ctxd->buffer_addr = htole64( desc_array.descriptor[counter].address); ctxd->lower.data = htole32( (adapter->txd_cmd | txd_lower | (u16) desc_array.descriptor[counter].length)); ctxd->upper.data = htole32((txd_upper)); last = i; if (++i == adapter->num_tx_desc) i = 0; tx_buffer->m_head = NULL; tx_buffer->next_eop = -1; txd_used++; } } else { tx_buffer = &adapter->tx_buffer_area[i]; ctxd = &adapter->tx_desc_base[i]; seg_addr = segs[j].ds_addr; seg_len = segs[j].ds_len; ctxd->buffer_addr = htole64(seg_addr); ctxd->lower.data = htole32( adapter->txd_cmd | txd_lower | seg_len); ctxd->upper.data = htole32(txd_upper); last = i; if (++i == adapter->num_tx_desc) i = 0; tx_buffer->m_head = NULL; tx_buffer->next_eop = -1; } } adapter->next_avail_tx_desc = i; if (adapter->pcix_82544) adapter->num_tx_desc_avail -= txd_used; else adapter->num_tx_desc_avail -= nsegs; if (m_head->m_flags & M_VLANTAG) { /* Set the vlan id. */ ctxd->upper.fields.special = htole16(m_head->m_pkthdr.ether_vtag); /* Tell hardware to add tag */ ctxd->lower.data |= htole32(E1000_TXD_CMD_VLE); } tx_buffer->m_head = m_head; tx_buffer_mapped->map = tx_buffer->map; tx_buffer->map = map; bus_dmamap_sync(adapter->txtag, map, BUS_DMASYNC_PREWRITE); /* * Last Descriptor of Packet * needs End Of Packet (EOP) * and Report Status (RS) */ ctxd->lower.data |= htole32(E1000_TXD_CMD_EOP | E1000_TXD_CMD_RS); /* * Keep track in the first buffer which * descriptor will be written back */ tx_buffer = &adapter->tx_buffer_area[first]; tx_buffer->next_eop = last; adapter->watchdog_time = ticks; /* * Advance the Transmit Descriptor Tail (TDT), this tells the E1000 * that this frame is available to transmit. */ bus_dmamap_sync(adapter->txdma.dma_tag, adapter->txdma.dma_map, BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE); #ifdef NIC_PARAVIRT if (adapter->csb) { adapter->csb->guest_tdt = i; /* XXX memory barrier ? */ if (adapter->csb->guest_csb_on && !(adapter->csb->host_need_txkick & 1)) { /* XXX maybe useless * clean the ring. maybe do it before ? * maybe a little bit of histeresys ? */ if (adapter->num_tx_desc_avail <= 64) {// XXX lem_txeof(adapter); } return (0); } } #endif /* NIC_PARAVIRT */ #ifdef NIC_SEND_COMBINING if (adapter->sc_enable) { if (adapter->shadow_tdt & MIT_PENDING_INT) { /* signal intr and data pending */ adapter->shadow_tdt = MIT_PENDING_TDT | (i & 0xffff); return (0); } else { adapter->shadow_tdt = MIT_PENDING_INT; } } #endif /* NIC_SEND_COMBINING */ if (adapter->hw.mac.type == e1000_82547 && adapter->link_duplex == HALF_DUPLEX) lem_82547_move_tail(adapter); else { E1000_WRITE_REG(&adapter->hw, E1000_TDT(0), i); if (adapter->hw.mac.type == e1000_82547) lem_82547_update_fifo_head(adapter, m_head->m_pkthdr.len); } return (0); } /********************************************************************* * * 82547 workaround to avoid controller hang in half-duplex environment. * The workaround is to avoid queuing a large packet that would span * the internal Tx FIFO ring boundary. We need to reset the FIFO pointers * in this case. We do that only when FIFO is quiescent. * **********************************************************************/ static void lem_82547_move_tail(void *arg) { struct adapter *adapter = arg; struct e1000_tx_desc *tx_desc; u16 hw_tdt, sw_tdt, length = 0; bool eop = 0; EM_TX_LOCK_ASSERT(adapter); hw_tdt = E1000_READ_REG(&adapter->hw, E1000_TDT(0)); sw_tdt = adapter->next_avail_tx_desc; while (hw_tdt != sw_tdt) { tx_desc = &adapter->tx_desc_base[hw_tdt]; length += tx_desc->lower.flags.length; eop = tx_desc->lower.data & E1000_TXD_CMD_EOP; if (++hw_tdt == adapter->num_tx_desc) hw_tdt = 0; if (eop) { if (lem_82547_fifo_workaround(adapter, length)) { adapter->tx_fifo_wrk_cnt++; callout_reset(&adapter->tx_fifo_timer, 1, lem_82547_move_tail, adapter); break; } E1000_WRITE_REG(&adapter->hw, E1000_TDT(0), hw_tdt); lem_82547_update_fifo_head(adapter, length); length = 0; } } } static int lem_82547_fifo_workaround(struct adapter *adapter, int len) { int fifo_space, fifo_pkt_len; fifo_pkt_len = roundup2(len + EM_FIFO_HDR, EM_FIFO_HDR); if (adapter->link_duplex == HALF_DUPLEX) { fifo_space = adapter->tx_fifo_size - adapter->tx_fifo_head; if (fifo_pkt_len >= (EM_82547_PKT_THRESH + fifo_space)) { if (lem_82547_tx_fifo_reset(adapter)) return (0); else return (1); } } return (0); } static void lem_82547_update_fifo_head(struct adapter *adapter, int len) { int fifo_pkt_len = roundup2(len + EM_FIFO_HDR, EM_FIFO_HDR); /* tx_fifo_head is always 16 byte aligned */ adapter->tx_fifo_head += fifo_pkt_len; if (adapter->tx_fifo_head >= adapter->tx_fifo_size) { adapter->tx_fifo_head -= adapter->tx_fifo_size; } } static int lem_82547_tx_fifo_reset(struct adapter *adapter) { u32 tctl; if ((E1000_READ_REG(&adapter->hw, E1000_TDT(0)) == E1000_READ_REG(&adapter->hw, E1000_TDH(0))) && (E1000_READ_REG(&adapter->hw, E1000_TDFT) == E1000_READ_REG(&adapter->hw, E1000_TDFH)) && (E1000_READ_REG(&adapter->hw, E1000_TDFTS) == E1000_READ_REG(&adapter->hw, E1000_TDFHS)) && (E1000_READ_REG(&adapter->hw, E1000_TDFPC) == 0)) { /* Disable TX unit */ tctl = E1000_READ_REG(&adapter->hw, E1000_TCTL); E1000_WRITE_REG(&adapter->hw, E1000_TCTL, tctl & ~E1000_TCTL_EN); /* Reset FIFO pointers */ E1000_WRITE_REG(&adapter->hw, E1000_TDFT, adapter->tx_head_addr); E1000_WRITE_REG(&adapter->hw, E1000_TDFH, adapter->tx_head_addr); E1000_WRITE_REG(&adapter->hw, E1000_TDFTS, adapter->tx_head_addr); E1000_WRITE_REG(&adapter->hw, E1000_TDFHS, adapter->tx_head_addr); /* Re-enable TX unit */ E1000_WRITE_REG(&adapter->hw, E1000_TCTL, tctl); E1000_WRITE_FLUSH(&adapter->hw); adapter->tx_fifo_head = 0; adapter->tx_fifo_reset_cnt++; return (TRUE); } else { return (FALSE); } } static void lem_set_promisc(struct adapter *adapter) { if_t ifp = adapter->ifp; u32 reg_rctl; reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); if (if_getflags(ifp) & IFF_PROMISC) { reg_rctl |= (E1000_RCTL_UPE | E1000_RCTL_MPE); /* Turn this on if you want to see bad packets */ if (lem_debug_sbp) reg_rctl |= E1000_RCTL_SBP; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); } else if (if_getflags(ifp) & IFF_ALLMULTI) { reg_rctl |= E1000_RCTL_MPE; reg_rctl &= ~E1000_RCTL_UPE; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); } } static void lem_disable_promisc(struct adapter *adapter) { if_t ifp = adapter->ifp; u32 reg_rctl; int mcnt = 0; reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); reg_rctl &= (~E1000_RCTL_UPE); if (if_getflags(ifp) & IFF_ALLMULTI) mcnt = MAX_NUM_MULTICAST_ADDRESSES; else mcnt = if_multiaddr_count(ifp, MAX_NUM_MULTICAST_ADDRESSES); /* Don't disable if in MAX groups */ if (mcnt < MAX_NUM_MULTICAST_ADDRESSES) reg_rctl &= (~E1000_RCTL_MPE); reg_rctl &= (~E1000_RCTL_SBP); E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); } /********************************************************************* * Multicast Update * * This routine is called whenever multicast address list is updated. * **********************************************************************/ static void lem_set_multi(struct adapter *adapter) { if_t ifp = adapter->ifp; u32 reg_rctl = 0; u8 *mta; /* Multicast array memory */ int mcnt = 0; IOCTL_DEBUGOUT("lem_set_multi: begin"); mta = adapter->mta; bzero(mta, sizeof(u8) * ETH_ADDR_LEN * MAX_NUM_MULTICAST_ADDRESSES); if (adapter->hw.mac.type == e1000_82542 && adapter->hw.revision_id == E1000_REVISION_2) { reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); if (adapter->hw.bus.pci_cmd_word & CMD_MEM_WRT_INVALIDATE) e1000_pci_clear_mwi(&adapter->hw); reg_rctl |= E1000_RCTL_RST; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); msec_delay(5); } if_multiaddr_array(ifp, mta, &mcnt, MAX_NUM_MULTICAST_ADDRESSES); if (mcnt >= MAX_NUM_MULTICAST_ADDRESSES) { reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); reg_rctl |= E1000_RCTL_MPE; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); } else e1000_update_mc_addr_list(&adapter->hw, mta, mcnt); if (adapter->hw.mac.type == e1000_82542 && adapter->hw.revision_id == E1000_REVISION_2) { reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); reg_rctl &= ~E1000_RCTL_RST; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); msec_delay(5); if (adapter->hw.bus.pci_cmd_word & CMD_MEM_WRT_INVALIDATE) e1000_pci_set_mwi(&adapter->hw); } } /********************************************************************* * Timer routine * * This routine checks for link status and updates statistics. * **********************************************************************/ static void lem_local_timer(void *arg) { struct adapter *adapter = arg; EM_CORE_LOCK_ASSERT(adapter); lem_update_link_status(adapter); lem_update_stats_counters(adapter); lem_smartspeed(adapter); #ifdef NIC_PARAVIRT /* recover space if needed */ if (adapter->csb && adapter->csb->guest_csb_on && (adapter->watchdog_check == TRUE) && (ticks - adapter->watchdog_time > EM_WATCHDOG) && (adapter->num_tx_desc_avail != adapter->num_tx_desc) ) { lem_txeof(adapter); /* * lem_txeof() normally (except when space in the queue * runs low XXX) cleans watchdog_check so that * we do not hung. */ } #endif /* NIC_PARAVIRT */ /* * We check the watchdog: the time since * the last TX descriptor was cleaned. * This implies a functional TX engine. */ if ((adapter->watchdog_check == TRUE) && (ticks - adapter->watchdog_time > EM_WATCHDOG)) goto hung; callout_reset(&adapter->timer, hz, lem_local_timer, adapter); return; hung: device_printf(adapter->dev, "Watchdog timeout -- resetting\n"); if_setdrvflagbits(adapter->ifp, 0, IFF_DRV_RUNNING); adapter->watchdog_events++; lem_init_locked(adapter); } static void lem_update_link_status(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; if_t ifp = adapter->ifp; device_t dev = adapter->dev; u32 link_check = 0; /* Get the cached link value or read phy for real */ switch (hw->phy.media_type) { case e1000_media_type_copper: if (hw->mac.get_link_status) { /* Do the work to read phy */ e1000_check_for_link(hw); link_check = !hw->mac.get_link_status; if (link_check) /* ESB2 fix */ e1000_cfg_on_link_up(hw); } else link_check = TRUE; break; case e1000_media_type_fiber: e1000_check_for_link(hw); link_check = (E1000_READ_REG(hw, E1000_STATUS) & E1000_STATUS_LU); break; case e1000_media_type_internal_serdes: e1000_check_for_link(hw); link_check = adapter->hw.mac.serdes_has_link; break; default: case e1000_media_type_unknown: break; } /* Now check for a transition */ if (link_check && (adapter->link_active == 0)) { e1000_get_speed_and_duplex(hw, &adapter->link_speed, &adapter->link_duplex); if (bootverbose) device_printf(dev, "Link is up %d Mbps %s\n", adapter->link_speed, ((adapter->link_duplex == FULL_DUPLEX) ? "Full Duplex" : "Half Duplex")); adapter->link_active = 1; adapter->smartspeed = 0; if_setbaudrate(ifp, adapter->link_speed * 1000000); if_link_state_change(ifp, LINK_STATE_UP); } else if (!link_check && (adapter->link_active == 1)) { if_setbaudrate(ifp, 0); adapter->link_speed = 0; adapter->link_duplex = 0; if (bootverbose) device_printf(dev, "Link is Down\n"); adapter->link_active = 0; /* Link down, disable watchdog */ adapter->watchdog_check = FALSE; if_link_state_change(ifp, LINK_STATE_DOWN); } } /********************************************************************* * * This routine disables all traffic on the adapter by issuing a * global reset on the MAC and deallocates TX/RX buffers. * * This routine should always be called with BOTH the CORE * and TX locks. **********************************************************************/ static void lem_stop(void *arg) { struct adapter *adapter = arg; if_t ifp = adapter->ifp; EM_CORE_LOCK_ASSERT(adapter); EM_TX_LOCK_ASSERT(adapter); INIT_DEBUGOUT("lem_stop: begin"); lem_disable_intr(adapter); callout_stop(&adapter->timer); callout_stop(&adapter->tx_fifo_timer); /* Tell the stack that the interface is no longer active */ if_setdrvflagbits(ifp, 0, (IFF_DRV_RUNNING | IFF_DRV_OACTIVE)); e1000_reset_hw(&adapter->hw); if (adapter->hw.mac.type >= e1000_82544) E1000_WRITE_REG(&adapter->hw, E1000_WUC, 0); e1000_led_off(&adapter->hw); e1000_cleanup_led(&adapter->hw); } /********************************************************************* * * Determine hardware revision. * **********************************************************************/ static void lem_identify_hardware(struct adapter *adapter) { device_t dev = adapter->dev; /* Make sure our PCI config space has the necessary stuff set */ pci_enable_busmaster(dev); adapter->hw.bus.pci_cmd_word = pci_read_config(dev, PCIR_COMMAND, 2); /* Save off the information about this board */ adapter->hw.vendor_id = pci_get_vendor(dev); adapter->hw.device_id = pci_get_device(dev); adapter->hw.revision_id = pci_read_config(dev, PCIR_REVID, 1); adapter->hw.subsystem_vendor_id = pci_read_config(dev, PCIR_SUBVEND_0, 2); adapter->hw.subsystem_device_id = pci_read_config(dev, PCIR_SUBDEV_0, 2); /* Do Shared Code Init and Setup */ if (e1000_set_mac_type(&adapter->hw)) { device_printf(dev, "Setup init failure\n"); return; } } static int lem_allocate_pci_resources(struct adapter *adapter) { device_t dev = adapter->dev; int val, rid, error = E1000_SUCCESS; rid = PCIR_BAR(0); adapter->memory = bus_alloc_resource_any(dev, SYS_RES_MEMORY, &rid, RF_ACTIVE); if (adapter->memory == NULL) { device_printf(dev, "Unable to allocate bus resource: memory\n"); return (ENXIO); } adapter->osdep.mem_bus_space_tag = rman_get_bustag(adapter->memory); adapter->osdep.mem_bus_space_handle = rman_get_bushandle(adapter->memory); adapter->hw.hw_addr = (u8 *)&adapter->osdep.mem_bus_space_handle; /* Only older adapters use IO mapping */ if (adapter->hw.mac.type > e1000_82543) { /* Figure our where our IO BAR is ? */ for (rid = PCIR_BAR(0); rid < PCIR_CIS;) { val = pci_read_config(dev, rid, 4); if (EM_BAR_TYPE(val) == EM_BAR_TYPE_IO) { adapter->io_rid = rid; break; } rid += 4; /* check for 64bit BAR */ if (EM_BAR_MEM_TYPE(val) == EM_BAR_MEM_TYPE_64BIT) rid += 4; } if (rid >= PCIR_CIS) { device_printf(dev, "Unable to locate IO BAR\n"); return (ENXIO); } adapter->ioport = bus_alloc_resource_any(dev, SYS_RES_IOPORT, &adapter->io_rid, RF_ACTIVE); if (adapter->ioport == NULL) { device_printf(dev, "Unable to allocate bus resource: " "ioport\n"); return (ENXIO); } adapter->hw.io_base = 0; adapter->osdep.io_bus_space_tag = rman_get_bustag(adapter->ioport); adapter->osdep.io_bus_space_handle = rman_get_bushandle(adapter->ioport); } adapter->hw.back = &adapter->osdep; return (error); } /********************************************************************* * * Setup the Legacy or MSI Interrupt handler * **********************************************************************/ int lem_allocate_irq(struct adapter *adapter) { device_t dev = adapter->dev; int error, rid = 0; /* Manually turn off all interrupts */ E1000_WRITE_REG(&adapter->hw, E1000_IMC, 0xffffffff); /* We allocate a single interrupt resource */ adapter->res[0] = bus_alloc_resource_any(dev, SYS_RES_IRQ, &rid, RF_SHAREABLE | RF_ACTIVE); if (adapter->res[0] == NULL) { device_printf(dev, "Unable to allocate bus resource: " "interrupt\n"); return (ENXIO); } /* Do Legacy setup? */ if (lem_use_legacy_irq) { if ((error = bus_setup_intr(dev, adapter->res[0], INTR_TYPE_NET | INTR_MPSAFE, NULL, lem_intr, adapter, &adapter->tag[0])) != 0) { device_printf(dev, "Failed to register interrupt handler"); return (error); } return (0); } /* * Use a Fast interrupt and the associated * deferred processing contexts. */ TASK_INIT(&adapter->rxtx_task, 0, lem_handle_rxtx, adapter); TASK_INIT(&adapter->link_task, 0, lem_handle_link, adapter); adapter->tq = taskqueue_create_fast("lem_taskq", M_NOWAIT, taskqueue_thread_enqueue, &adapter->tq); taskqueue_start_threads(&adapter->tq, 1, PI_NET, "%s taskq", device_get_nameunit(adapter->dev)); if ((error = bus_setup_intr(dev, adapter->res[0], INTR_TYPE_NET, lem_irq_fast, NULL, adapter, &adapter->tag[0])) != 0) { device_printf(dev, "Failed to register fast interrupt " "handler: %d\n", error); taskqueue_free(adapter->tq); adapter->tq = NULL; return (error); } return (0); } static void lem_free_pci_resources(struct adapter *adapter) { device_t dev = adapter->dev; if (adapter->tag[0] != NULL) { bus_teardown_intr(dev, adapter->res[0], adapter->tag[0]); adapter->tag[0] = NULL; } if (adapter->res[0] != NULL) { bus_release_resource(dev, SYS_RES_IRQ, 0, adapter->res[0]); } if (adapter->memory != NULL) bus_release_resource(dev, SYS_RES_MEMORY, PCIR_BAR(0), adapter->memory); if (adapter->ioport != NULL) bus_release_resource(dev, SYS_RES_IOPORT, adapter->io_rid, adapter->ioport); } /********************************************************************* * * Initialize the hardware to a configuration * as specified by the adapter structure. * **********************************************************************/ static int lem_hardware_init(struct adapter *adapter) { device_t dev = adapter->dev; u16 rx_buffer_size; INIT_DEBUGOUT("lem_hardware_init: begin"); /* Issue a global reset */ e1000_reset_hw(&adapter->hw); /* When hardware is reset, fifo_head is also reset */ adapter->tx_fifo_head = 0; /* * These parameters control the automatic generation (Tx) and * response (Rx) to Ethernet PAUSE frames. * - High water mark should allow for at least two frames to be * received after sending an XOFF. * - Low water mark works best when it is very near the high water mark. * This allows the receiver to restart by sending XON when it has * drained a bit. Here we use an arbitrary value of 1500 which will * restart after one full frame is pulled from the buffer. There * could be several smaller frames in the buffer and if so they will * not trigger the XON until their total number reduces the buffer * by 1500. * - The pause time is fairly large at 1000 x 512ns = 512 usec. */ rx_buffer_size = ((E1000_READ_REG(&adapter->hw, E1000_PBA) & 0xffff) << 10 ); adapter->hw.fc.high_water = rx_buffer_size - roundup2(adapter->max_frame_size, 1024); adapter->hw.fc.low_water = adapter->hw.fc.high_water - 1500; adapter->hw.fc.pause_time = EM_FC_PAUSE_TIME; adapter->hw.fc.send_xon = TRUE; /* Set Flow control, use the tunable location if sane */ if ((lem_fc_setting >= 0) && (lem_fc_setting < 4)) adapter->hw.fc.requested_mode = lem_fc_setting; else adapter->hw.fc.requested_mode = e1000_fc_none; if (e1000_init_hw(&adapter->hw) < 0) { device_printf(dev, "Hardware Initialization Failed\n"); return (EIO); } e1000_check_for_link(&adapter->hw); return (0); } /********************************************************************* * * Setup networking device structure and register an interface. * **********************************************************************/ static int lem_setup_interface(device_t dev, struct adapter *adapter) { if_t ifp; INIT_DEBUGOUT("lem_setup_interface: begin"); ifp = adapter->ifp = if_gethandle(IFT_ETHER); if (ifp == (void *)NULL) { device_printf(dev, "can not allocate ifnet structure\n"); return (-1); } if_initname(ifp, device_get_name(dev), device_get_unit(dev)); if_setinitfn(ifp, lem_init); if_setsoftc(ifp, adapter); if_setflags(ifp, IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST); if_setioctlfn(ifp, lem_ioctl); if_setstartfn(ifp, lem_start); if_setgetcounterfn(ifp, lem_get_counter); if_setsendqlen(ifp, adapter->num_tx_desc - 1); if_setsendqready(ifp); ether_ifattach(ifp, adapter->hw.mac.addr); if_setcapabilities(ifp, 0); if (adapter->hw.mac.type >= e1000_82543) { if_setcapabilitiesbit(ifp, IFCAP_HWCSUM | IFCAP_VLAN_HWCSUM, 0); if_setcapenablebit(ifp, IFCAP_HWCSUM | IFCAP_VLAN_HWCSUM, 0); } /* * Tell the upper layer(s) we support long frames. */ if_setifheaderlen(ifp, sizeof(struct ether_vlan_header)); if_setcapabilitiesbit(ifp, IFCAP_VLAN_HWTAGGING | IFCAP_VLAN_MTU, 0); if_setcapenablebit(ifp, IFCAP_VLAN_HWTAGGING | IFCAP_VLAN_MTU, 0); /* ** Dont turn this on by default, if vlans are ** created on another pseudo device (eg. lagg) ** then vlan events are not passed thru, breaking ** operation, but with HW FILTER off it works. If ** using vlans directly on the em driver you can ** enable this and get full hardware tag filtering. */ if_setcapabilitiesbit(ifp, IFCAP_VLAN_HWFILTER, 0); #ifdef DEVICE_POLLING if_setcapabilitiesbit(ifp, IFCAP_POLLING, 0); #endif /* Enable only WOL MAGIC by default */ if (adapter->wol) { if_setcapabilitiesbit(ifp, IFCAP_WOL, 0); if_setcapenablebit(ifp, IFCAP_WOL_MAGIC, 0); } /* * Specify the media types supported by this adapter and register * callbacks to update media and link information */ ifmedia_init(&adapter->media, IFM_IMASK, lem_media_change, lem_media_status); if ((adapter->hw.phy.media_type == e1000_media_type_fiber) || (adapter->hw.phy.media_type == e1000_media_type_internal_serdes)) { u_char fiber_type = IFM_1000_SX; /* default type */ if (adapter->hw.mac.type == e1000_82545) fiber_type = IFM_1000_LX; ifmedia_add(&adapter->media, IFM_ETHER | fiber_type | IFM_FDX, 0, NULL); ifmedia_add(&adapter->media, IFM_ETHER | fiber_type, 0, NULL); } else { ifmedia_add(&adapter->media, IFM_ETHER | IFM_10_T, 0, NULL); ifmedia_add(&adapter->media, IFM_ETHER | IFM_10_T | IFM_FDX, 0, NULL); ifmedia_add(&adapter->media, IFM_ETHER | IFM_100_TX, 0, NULL); ifmedia_add(&adapter->media, IFM_ETHER | IFM_100_TX | IFM_FDX, 0, NULL); if (adapter->hw.phy.type != e1000_phy_ife) { ifmedia_add(&adapter->media, IFM_ETHER | IFM_1000_T | IFM_FDX, 0, NULL); ifmedia_add(&adapter->media, IFM_ETHER | IFM_1000_T, 0, NULL); } } ifmedia_add(&adapter->media, IFM_ETHER | IFM_AUTO, 0, NULL); ifmedia_set(&adapter->media, IFM_ETHER | IFM_AUTO); return (0); } /********************************************************************* * * Workaround for SmartSpeed on 82541 and 82547 controllers * **********************************************************************/ static void lem_smartspeed(struct adapter *adapter) { u16 phy_tmp; if (adapter->link_active || (adapter->hw.phy.type != e1000_phy_igp) || adapter->hw.mac.autoneg == 0 || (adapter->hw.phy.autoneg_advertised & ADVERTISE_1000_FULL) == 0) return; if (adapter->smartspeed == 0) { /* If Master/Slave config fault is asserted twice, * we assume back-to-back */ e1000_read_phy_reg(&adapter->hw, PHY_1000T_STATUS, &phy_tmp); if (!(phy_tmp & SR_1000T_MS_CONFIG_FAULT)) return; e1000_read_phy_reg(&adapter->hw, PHY_1000T_STATUS, &phy_tmp); if (phy_tmp & SR_1000T_MS_CONFIG_FAULT) { e1000_read_phy_reg(&adapter->hw, PHY_1000T_CTRL, &phy_tmp); if(phy_tmp & CR_1000T_MS_ENABLE) { phy_tmp &= ~CR_1000T_MS_ENABLE; e1000_write_phy_reg(&adapter->hw, PHY_1000T_CTRL, phy_tmp); adapter->smartspeed++; if(adapter->hw.mac.autoneg && !e1000_copper_link_autoneg(&adapter->hw) && !e1000_read_phy_reg(&adapter->hw, PHY_CONTROL, &phy_tmp)) { phy_tmp |= (MII_CR_AUTO_NEG_EN | MII_CR_RESTART_AUTO_NEG); e1000_write_phy_reg(&adapter->hw, PHY_CONTROL, phy_tmp); } } } return; } else if(adapter->smartspeed == EM_SMARTSPEED_DOWNSHIFT) { /* If still no link, perhaps using 2/3 pair cable */ e1000_read_phy_reg(&adapter->hw, PHY_1000T_CTRL, &phy_tmp); phy_tmp |= CR_1000T_MS_ENABLE; e1000_write_phy_reg(&adapter->hw, PHY_1000T_CTRL, phy_tmp); if(adapter->hw.mac.autoneg && !e1000_copper_link_autoneg(&adapter->hw) && !e1000_read_phy_reg(&adapter->hw, PHY_CONTROL, &phy_tmp)) { phy_tmp |= (MII_CR_AUTO_NEG_EN | MII_CR_RESTART_AUTO_NEG); e1000_write_phy_reg(&adapter->hw, PHY_CONTROL, phy_tmp); } } /* Restart process after EM_SMARTSPEED_MAX iterations */ if(adapter->smartspeed++ == EM_SMARTSPEED_MAX) adapter->smartspeed = 0; } /* * Manage DMA'able memory. */ static void lem_dmamap_cb(void *arg, bus_dma_segment_t *segs, int nseg, int error) { if (error) return; *(bus_addr_t *) arg = segs[0].ds_addr; } static int lem_dma_malloc(struct adapter *adapter, bus_size_t size, struct em_dma_alloc *dma, int mapflags) { int error; error = bus_dma_tag_create(bus_get_dma_tag(adapter->dev), /* parent */ EM_DBA_ALIGN, 0, /* alignment, bounds */ BUS_SPACE_MAXADDR, /* lowaddr */ BUS_SPACE_MAXADDR, /* highaddr */ NULL, NULL, /* filter, filterarg */ size, /* maxsize */ 1, /* nsegments */ size, /* maxsegsize */ 0, /* flags */ NULL, /* lockfunc */ NULL, /* lockarg */ &dma->dma_tag); if (error) { device_printf(adapter->dev, "%s: bus_dma_tag_create failed: %d\n", __func__, error); goto fail_0; } error = bus_dmamem_alloc(dma->dma_tag, (void**) &dma->dma_vaddr, BUS_DMA_NOWAIT | BUS_DMA_COHERENT, &dma->dma_map); if (error) { device_printf(adapter->dev, "%s: bus_dmamem_alloc(%ju) failed: %d\n", __func__, (uintmax_t)size, error); goto fail_2; } dma->dma_paddr = 0; error = bus_dmamap_load(dma->dma_tag, dma->dma_map, dma->dma_vaddr, size, lem_dmamap_cb, &dma->dma_paddr, mapflags | BUS_DMA_NOWAIT); if (error || dma->dma_paddr == 0) { device_printf(adapter->dev, "%s: bus_dmamap_load failed: %d\n", __func__, error); goto fail_3; } return (0); fail_3: bus_dmamap_unload(dma->dma_tag, dma->dma_map); fail_2: bus_dmamem_free(dma->dma_tag, dma->dma_vaddr, dma->dma_map); bus_dma_tag_destroy(dma->dma_tag); fail_0: dma->dma_tag = NULL; return (error); } static void lem_dma_free(struct adapter *adapter, struct em_dma_alloc *dma) { if (dma->dma_tag == NULL) return; if (dma->dma_paddr != 0) { bus_dmamap_sync(dma->dma_tag, dma->dma_map, BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE); bus_dmamap_unload(dma->dma_tag, dma->dma_map); dma->dma_paddr = 0; } if (dma->dma_vaddr != NULL) { bus_dmamem_free(dma->dma_tag, dma->dma_vaddr, dma->dma_map); dma->dma_vaddr = NULL; } bus_dma_tag_destroy(dma->dma_tag); dma->dma_tag = NULL; } /********************************************************************* * * Allocate memory for tx_buffer structures. The tx_buffer stores all * the information needed to transmit a packet on the wire. * **********************************************************************/ static int lem_allocate_transmit_structures(struct adapter *adapter) { device_t dev = adapter->dev; struct em_buffer *tx_buffer; int error; /* * Create DMA tags for tx descriptors */ if ((error = bus_dma_tag_create(bus_get_dma_tag(dev), /* parent */ 1, 0, /* alignment, bounds */ BUS_SPACE_MAXADDR, /* lowaddr */ BUS_SPACE_MAXADDR, /* highaddr */ NULL, NULL, /* filter, filterarg */ MCLBYTES * EM_MAX_SCATTER, /* maxsize */ EM_MAX_SCATTER, /* nsegments */ MCLBYTES, /* maxsegsize */ 0, /* flags */ NULL, /* lockfunc */ NULL, /* lockarg */ &adapter->txtag)) != 0) { device_printf(dev, "Unable to allocate TX DMA tag\n"); goto fail; } adapter->tx_buffer_area = malloc(sizeof(struct em_buffer) * adapter->num_tx_desc, M_DEVBUF, M_NOWAIT | M_ZERO); if (adapter->tx_buffer_area == NULL) { device_printf(dev, "Unable to allocate tx_buffer memory\n"); error = ENOMEM; goto fail; } /* Create the descriptor buffer dma maps */ for (int i = 0; i < adapter->num_tx_desc; i++) { tx_buffer = &adapter->tx_buffer_area[i]; error = bus_dmamap_create(adapter->txtag, 0, &tx_buffer->map); if (error != 0) { device_printf(dev, "Unable to create TX DMA map\n"); goto fail; } tx_buffer->next_eop = -1; } return (0); fail: lem_free_transmit_structures(adapter); return (error); } /********************************************************************* * * (Re)Initialize transmit structures. * **********************************************************************/ static void lem_setup_transmit_structures(struct adapter *adapter) { struct em_buffer *tx_buffer; #ifdef DEV_NETMAP /* we are already locked */ struct netmap_adapter *na = netmap_getna(adapter->ifp); struct netmap_slot *slot = netmap_reset(na, NR_TX, 0, 0); #endif /* DEV_NETMAP */ /* Clear the old ring contents */ bzero(adapter->tx_desc_base, (sizeof(struct e1000_tx_desc)) * adapter->num_tx_desc); /* Free any existing TX buffers */ for (int i = 0; i < adapter->num_tx_desc; i++, tx_buffer++) { tx_buffer = &adapter->tx_buffer_area[i]; bus_dmamap_sync(adapter->txtag, tx_buffer->map, BUS_DMASYNC_POSTWRITE); bus_dmamap_unload(adapter->txtag, tx_buffer->map); m_freem(tx_buffer->m_head); tx_buffer->m_head = NULL; #ifdef DEV_NETMAP if (slot) { /* the i-th NIC entry goes to slot si */ int si = netmap_idx_n2k(&na->tx_rings[0], i); uint64_t paddr; void *addr; addr = PNMB(na, slot + si, &paddr); adapter->tx_desc_base[i].buffer_addr = htole64(paddr); /* reload the map for netmap mode */ netmap_load_map(na, adapter->txtag, tx_buffer->map, addr); } #endif /* DEV_NETMAP */ tx_buffer->next_eop = -1; } /* Reset state */ adapter->last_hw_offload = 0; adapter->next_avail_tx_desc = 0; adapter->next_tx_to_clean = 0; adapter->num_tx_desc_avail = adapter->num_tx_desc; bus_dmamap_sync(adapter->txdma.dma_tag, adapter->txdma.dma_map, BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE); return; } /********************************************************************* * * Enable transmit unit. * **********************************************************************/ static void lem_initialize_transmit_unit(struct adapter *adapter) { u32 tctl, tipg = 0; u64 bus_addr; INIT_DEBUGOUT("lem_initialize_transmit_unit: begin"); /* Setup the Base and Length of the Tx Descriptor Ring */ bus_addr = adapter->txdma.dma_paddr; E1000_WRITE_REG(&adapter->hw, E1000_TDLEN(0), adapter->num_tx_desc * sizeof(struct e1000_tx_desc)); E1000_WRITE_REG(&adapter->hw, E1000_TDBAH(0), (u32)(bus_addr >> 32)); E1000_WRITE_REG(&adapter->hw, E1000_TDBAL(0), (u32)bus_addr); /* Setup the HW Tx Head and Tail descriptor pointers */ E1000_WRITE_REG(&adapter->hw, E1000_TDT(0), 0); E1000_WRITE_REG(&adapter->hw, E1000_TDH(0), 0); HW_DEBUGOUT2("Base = %x, Length = %x\n", E1000_READ_REG(&adapter->hw, E1000_TDBAL(0)), E1000_READ_REG(&adapter->hw, E1000_TDLEN(0))); /* Set the default values for the Tx Inter Packet Gap timer */ switch (adapter->hw.mac.type) { case e1000_82542: tipg = DEFAULT_82542_TIPG_IPGT; tipg |= DEFAULT_82542_TIPG_IPGR1 << E1000_TIPG_IPGR1_SHIFT; tipg |= DEFAULT_82542_TIPG_IPGR2 << E1000_TIPG_IPGR2_SHIFT; break; default: if ((adapter->hw.phy.media_type == e1000_media_type_fiber) || (adapter->hw.phy.media_type == e1000_media_type_internal_serdes)) tipg = DEFAULT_82543_TIPG_IPGT_FIBER; else tipg = DEFAULT_82543_TIPG_IPGT_COPPER; tipg |= DEFAULT_82543_TIPG_IPGR1 << E1000_TIPG_IPGR1_SHIFT; tipg |= DEFAULT_82543_TIPG_IPGR2 << E1000_TIPG_IPGR2_SHIFT; } E1000_WRITE_REG(&adapter->hw, E1000_TIPG, tipg); E1000_WRITE_REG(&adapter->hw, E1000_TIDV, adapter->tx_int_delay.value); if(adapter->hw.mac.type >= e1000_82540) E1000_WRITE_REG(&adapter->hw, E1000_TADV, adapter->tx_abs_int_delay.value); /* Program the Transmit Control Register */ tctl = E1000_READ_REG(&adapter->hw, E1000_TCTL); tctl &= ~E1000_TCTL_CT; tctl |= (E1000_TCTL_PSP | E1000_TCTL_RTLC | E1000_TCTL_EN | (E1000_COLLISION_THRESHOLD << E1000_CT_SHIFT)); /* This write will effectively turn on the transmit unit. */ E1000_WRITE_REG(&adapter->hw, E1000_TCTL, tctl); /* Setup Transmit Descriptor Base Settings */ adapter->txd_cmd = E1000_TXD_CMD_IFCS; if (adapter->tx_int_delay.value > 0) adapter->txd_cmd |= E1000_TXD_CMD_IDE; } /********************************************************************* * * Free all transmit related data structures. * **********************************************************************/ static void lem_free_transmit_structures(struct adapter *adapter) { struct em_buffer *tx_buffer; INIT_DEBUGOUT("free_transmit_structures: begin"); if (adapter->tx_buffer_area != NULL) { for (int i = 0; i < adapter->num_tx_desc; i++) { tx_buffer = &adapter->tx_buffer_area[i]; if (tx_buffer->m_head != NULL) { bus_dmamap_sync(adapter->txtag, tx_buffer->map, BUS_DMASYNC_POSTWRITE); bus_dmamap_unload(adapter->txtag, tx_buffer->map); m_freem(tx_buffer->m_head); tx_buffer->m_head = NULL; } else if (tx_buffer->map != NULL) bus_dmamap_unload(adapter->txtag, tx_buffer->map); if (tx_buffer->map != NULL) { bus_dmamap_destroy(adapter->txtag, tx_buffer->map); tx_buffer->map = NULL; } } } if (adapter->tx_buffer_area != NULL) { free(adapter->tx_buffer_area, M_DEVBUF); adapter->tx_buffer_area = NULL; } if (adapter->txtag != NULL) { bus_dma_tag_destroy(adapter->txtag); adapter->txtag = NULL; } } /********************************************************************* * * The offload context needs to be set when we transfer the first * packet of a particular protocol (TCP/UDP). This routine has been * enhanced to deal with inserted VLAN headers, and IPV6 (not complete) * * Added back the old method of keeping the current context type * and not setting if unnecessary, as this is reported to be a * big performance win. -jfv **********************************************************************/ static void lem_transmit_checksum_setup(struct adapter *adapter, struct mbuf *mp, u32 *txd_upper, u32 *txd_lower) { struct e1000_context_desc *TXD = NULL; struct em_buffer *tx_buffer; struct ether_vlan_header *eh; struct ip *ip = NULL; struct ip6_hdr *ip6; int curr_txd, ehdrlen; u32 cmd, hdr_len, ip_hlen; u16 etype; u8 ipproto; cmd = hdr_len = ipproto = 0; *txd_upper = *txd_lower = 0; curr_txd = adapter->next_avail_tx_desc; /* * Determine where frame payload starts. * Jump over vlan headers if already present, * helpful for QinQ too. */ eh = mtod(mp, struct ether_vlan_header *); if (eh->evl_encap_proto == htons(ETHERTYPE_VLAN)) { etype = ntohs(eh->evl_proto); ehdrlen = ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN; } else { etype = ntohs(eh->evl_encap_proto); ehdrlen = ETHER_HDR_LEN; } /* * We only support TCP/UDP for IPv4 and IPv6 for the moment. * TODO: Support SCTP too when it hits the tree. */ switch (etype) { case ETHERTYPE_IP: ip = (struct ip *)(mp->m_data + ehdrlen); ip_hlen = ip->ip_hl << 2; /* Setup of IP header checksum. */ if (mp->m_pkthdr.csum_flags & CSUM_IP) { /* * Start offset for header checksum calculation. * End offset for header checksum calculation. * Offset of place to put the checksum. */ TXD = (struct e1000_context_desc *) &adapter->tx_desc_base[curr_txd]; TXD->lower_setup.ip_fields.ipcss = ehdrlen; TXD->lower_setup.ip_fields.ipcse = htole16(ehdrlen + ip_hlen); TXD->lower_setup.ip_fields.ipcso = ehdrlen + offsetof(struct ip, ip_sum); cmd |= E1000_TXD_CMD_IP; *txd_upper |= E1000_TXD_POPTS_IXSM << 8; } hdr_len = ehdrlen + ip_hlen; ipproto = ip->ip_p; break; case ETHERTYPE_IPV6: ip6 = (struct ip6_hdr *)(mp->m_data + ehdrlen); ip_hlen = sizeof(struct ip6_hdr); /* XXX: No header stacking. */ /* IPv6 doesn't have a header checksum. */ hdr_len = ehdrlen + ip_hlen; ipproto = ip6->ip6_nxt; break; default: return; } switch (ipproto) { case IPPROTO_TCP: if (mp->m_pkthdr.csum_flags & CSUM_TCP) { *txd_lower = E1000_TXD_CMD_DEXT | E1000_TXD_DTYP_D; *txd_upper |= E1000_TXD_POPTS_TXSM << 8; /* no need for context if already set */ if (adapter->last_hw_offload == CSUM_TCP) return; adapter->last_hw_offload = CSUM_TCP; /* * Start offset for payload checksum calculation. * End offset for payload checksum calculation. * Offset of place to put the checksum. */ TXD = (struct e1000_context_desc *) &adapter->tx_desc_base[curr_txd]; TXD->upper_setup.tcp_fields.tucss = hdr_len; TXD->upper_setup.tcp_fields.tucse = htole16(0); TXD->upper_setup.tcp_fields.tucso = hdr_len + offsetof(struct tcphdr, th_sum); cmd |= E1000_TXD_CMD_TCP; } break; case IPPROTO_UDP: { if (mp->m_pkthdr.csum_flags & CSUM_UDP) { *txd_lower = E1000_TXD_CMD_DEXT | E1000_TXD_DTYP_D; *txd_upper |= E1000_TXD_POPTS_TXSM << 8; /* no need for context if already set */ if (adapter->last_hw_offload == CSUM_UDP) return; adapter->last_hw_offload = CSUM_UDP; /* * Start offset for header checksum calculation. * End offset for header checksum calculation. * Offset of place to put the checksum. */ TXD = (struct e1000_context_desc *) &adapter->tx_desc_base[curr_txd]; TXD->upper_setup.tcp_fields.tucss = hdr_len; TXD->upper_setup.tcp_fields.tucse = htole16(0); TXD->upper_setup.tcp_fields.tucso = hdr_len + offsetof(struct udphdr, uh_sum); } /* Fall Thru */ } default: break; } if (TXD == NULL) return; TXD->tcp_seg_setup.data = htole32(0); TXD->cmd_and_length = htole32(adapter->txd_cmd | E1000_TXD_CMD_DEXT | cmd); tx_buffer = &adapter->tx_buffer_area[curr_txd]; tx_buffer->m_head = NULL; tx_buffer->next_eop = -1; if (++curr_txd == adapter->num_tx_desc) curr_txd = 0; adapter->num_tx_desc_avail--; adapter->next_avail_tx_desc = curr_txd; } /********************************************************************** * * Examine each tx_buffer in the used queue. If the hardware is done * processing the packet then free associated resources. The * tx_buffer is put back on the free queue. * **********************************************************************/ static void lem_txeof(struct adapter *adapter) { int first, last, done, num_avail; struct em_buffer *tx_buffer; struct e1000_tx_desc *tx_desc, *eop_desc; if_t ifp = adapter->ifp; EM_TX_LOCK_ASSERT(adapter); #ifdef DEV_NETMAP if (netmap_tx_irq(ifp, 0)) return; #endif /* DEV_NETMAP */ if (adapter->num_tx_desc_avail == adapter->num_tx_desc) return; num_avail = adapter->num_tx_desc_avail; first = adapter->next_tx_to_clean; tx_desc = &adapter->tx_desc_base[first]; tx_buffer = &adapter->tx_buffer_area[first]; last = tx_buffer->next_eop; eop_desc = &adapter->tx_desc_base[last]; /* * What this does is get the index of the * first descriptor AFTER the EOP of the * first packet, that way we can do the * simple comparison on the inner while loop. */ if (++last == adapter->num_tx_desc) last = 0; done = last; bus_dmamap_sync(adapter->txdma.dma_tag, adapter->txdma.dma_map, BUS_DMASYNC_POSTREAD); while (eop_desc->upper.fields.status & E1000_TXD_STAT_DD) { /* We clean the range of the packet */ while (first != done) { tx_desc->upper.data = 0; tx_desc->lower.data = 0; tx_desc->buffer_addr = 0; ++num_avail; if (tx_buffer->m_head) { if_inc_counter(ifp, IFCOUNTER_OPACKETS, 1); bus_dmamap_sync(adapter->txtag, tx_buffer->map, BUS_DMASYNC_POSTWRITE); bus_dmamap_unload(adapter->txtag, tx_buffer->map); m_freem(tx_buffer->m_head); tx_buffer->m_head = NULL; } tx_buffer->next_eop = -1; adapter->watchdog_time = ticks; if (++first == adapter->num_tx_desc) first = 0; tx_buffer = &adapter->tx_buffer_area[first]; tx_desc = &adapter->tx_desc_base[first]; } /* See if we can continue to the next packet */ last = tx_buffer->next_eop; if (last != -1) { eop_desc = &adapter->tx_desc_base[last]; /* Get new done point */ if (++last == adapter->num_tx_desc) last = 0; done = last; } else break; } bus_dmamap_sync(adapter->txdma.dma_tag, adapter->txdma.dma_map, BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE); adapter->next_tx_to_clean = first; adapter->num_tx_desc_avail = num_avail; #ifdef NIC_SEND_COMBINING if ((adapter->shadow_tdt & MIT_PENDING_TDT) == MIT_PENDING_TDT) { /* a tdt write is pending, do it */ E1000_WRITE_REG(&adapter->hw, E1000_TDT(0), 0xffff & adapter->shadow_tdt); adapter->shadow_tdt = MIT_PENDING_INT; } else { adapter->shadow_tdt = 0; // disable } #endif /* NIC_SEND_COMBINING */ /* * If we have enough room, clear IFF_DRV_OACTIVE to * tell the stack that it is OK to send packets. * If there are no pending descriptors, clear the watchdog. */ if (adapter->num_tx_desc_avail > EM_TX_CLEANUP_THRESHOLD) { if_setdrvflagbits(ifp, 0, IFF_DRV_OACTIVE); #ifdef NIC_PARAVIRT if (adapter->csb) { // XXX also csb_on ? adapter->csb->guest_need_txkick = 2; /* acked */ // XXX memory barrier } #endif /* NIC_PARAVIRT */ if (adapter->num_tx_desc_avail == adapter->num_tx_desc) { adapter->watchdog_check = FALSE; return; } } } /********************************************************************* * * When Link is lost sometimes there is work still in the TX ring * which may result in a watchdog, rather than allow that we do an * attempted cleanup and then reinit here. Note that this has been * seens mostly with fiber adapters. * **********************************************************************/ static void lem_tx_purge(struct adapter *adapter) { if ((!adapter->link_active) && (adapter->watchdog_check)) { EM_TX_LOCK(adapter); lem_txeof(adapter); EM_TX_UNLOCK(adapter); if (adapter->watchdog_check) /* Still outstanding? */ lem_init_locked(adapter); } } /********************************************************************* * * Get a buffer from system mbuf buffer pool. * **********************************************************************/ static int lem_get_buf(struct adapter *adapter, int i) { struct mbuf *m; bus_dma_segment_t segs[1]; bus_dmamap_t map; struct em_buffer *rx_buffer; int error, nsegs; m = m_getcl(M_NOWAIT, MT_DATA, M_PKTHDR); if (m == NULL) { adapter->mbuf_cluster_failed++; return (ENOBUFS); } m->m_len = m->m_pkthdr.len = MCLBYTES; if (adapter->max_frame_size <= (MCLBYTES - ETHER_ALIGN)) m_adj(m, ETHER_ALIGN); /* * Using memory from the mbuf cluster pool, invoke the * bus_dma machinery to arrange the memory mapping. */ error = bus_dmamap_load_mbuf_sg(adapter->rxtag, adapter->rx_sparemap, m, segs, &nsegs, BUS_DMA_NOWAIT); if (error != 0) { m_free(m); return (error); } /* If nsegs is wrong then the stack is corrupt. */ KASSERT(nsegs == 1, ("Too many segments returned!")); rx_buffer = &adapter->rx_buffer_area[i]; if (rx_buffer->m_head != NULL) bus_dmamap_unload(adapter->rxtag, rx_buffer->map); map = rx_buffer->map; rx_buffer->map = adapter->rx_sparemap; adapter->rx_sparemap = map; bus_dmamap_sync(adapter->rxtag, rx_buffer->map, BUS_DMASYNC_PREREAD); rx_buffer->m_head = m; adapter->rx_desc_base[i].buffer_addr = htole64(segs[0].ds_addr); return (0); } /********************************************************************* * * Allocate memory for rx_buffer structures. Since we use one * rx_buffer per received packet, the maximum number of rx_buffer's * that we'll need is equal to the number of receive descriptors * that we've allocated. * **********************************************************************/ static int lem_allocate_receive_structures(struct adapter *adapter) { device_t dev = adapter->dev; struct em_buffer *rx_buffer; int i, error; adapter->rx_buffer_area = malloc(sizeof(struct em_buffer) * adapter->num_rx_desc, M_DEVBUF, M_NOWAIT | M_ZERO); if (adapter->rx_buffer_area == NULL) { device_printf(dev, "Unable to allocate rx_buffer memory\n"); return (ENOMEM); } error = bus_dma_tag_create(bus_get_dma_tag(dev), /* parent */ 1, 0, /* alignment, bounds */ BUS_SPACE_MAXADDR, /* lowaddr */ BUS_SPACE_MAXADDR, /* highaddr */ NULL, NULL, /* filter, filterarg */ MCLBYTES, /* maxsize */ 1, /* nsegments */ MCLBYTES, /* maxsegsize */ 0, /* flags */ NULL, /* lockfunc */ NULL, /* lockarg */ &adapter->rxtag); if (error) { device_printf(dev, "%s: bus_dma_tag_create failed %d\n", __func__, error); goto fail; } /* Create the spare map (used by getbuf) */ error = bus_dmamap_create(adapter->rxtag, 0, &adapter->rx_sparemap); if (error) { device_printf(dev, "%s: bus_dmamap_create failed: %d\n", __func__, error); goto fail; } rx_buffer = adapter->rx_buffer_area; for (i = 0; i < adapter->num_rx_desc; i++, rx_buffer++) { error = bus_dmamap_create(adapter->rxtag, 0, &rx_buffer->map); if (error) { device_printf(dev, "%s: bus_dmamap_create failed: %d\n", __func__, error); goto fail; } } return (0); fail: lem_free_receive_structures(adapter); return (error); } /********************************************************************* * * (Re)initialize receive structures. * **********************************************************************/ static int lem_setup_receive_structures(struct adapter *adapter) { struct em_buffer *rx_buffer; int i, error; #ifdef DEV_NETMAP /* we are already under lock */ struct netmap_adapter *na = netmap_getna(adapter->ifp); struct netmap_slot *slot = netmap_reset(na, NR_RX, 0, 0); #endif /* Reset descriptor ring */ bzero(adapter->rx_desc_base, (sizeof(struct e1000_rx_desc)) * adapter->num_rx_desc); /* Free current RX buffers. */ rx_buffer = adapter->rx_buffer_area; for (i = 0; i < adapter->num_rx_desc; i++, rx_buffer++) { if (rx_buffer->m_head != NULL) { bus_dmamap_sync(adapter->rxtag, rx_buffer->map, BUS_DMASYNC_POSTREAD); bus_dmamap_unload(adapter->rxtag, rx_buffer->map); m_freem(rx_buffer->m_head); rx_buffer->m_head = NULL; } } /* Allocate new ones. */ for (i = 0; i < adapter->num_rx_desc; i++) { #ifdef DEV_NETMAP if (slot) { /* the i-th NIC entry goes to slot si */ int si = netmap_idx_n2k(&na->rx_rings[0], i); uint64_t paddr; void *addr; addr = PNMB(na, slot + si, &paddr); netmap_load_map(na, adapter->rxtag, rx_buffer->map, addr); /* Update descriptor */ adapter->rx_desc_base[i].buffer_addr = htole64(paddr); continue; } #endif /* DEV_NETMAP */ error = lem_get_buf(adapter, i); if (error) return (error); } /* Setup our descriptor pointers */ adapter->next_rx_desc_to_check = 0; bus_dmamap_sync(adapter->rxdma.dma_tag, adapter->rxdma.dma_map, BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE); return (0); } /********************************************************************* * * Enable receive unit. * **********************************************************************/ static void lem_initialize_receive_unit(struct adapter *adapter) { if_t ifp = adapter->ifp; u64 bus_addr; u32 rctl, rxcsum; INIT_DEBUGOUT("lem_initialize_receive_unit: begin"); /* * Make sure receives are disabled while setting * up the descriptor ring */ rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); E1000_WRITE_REG(&adapter->hw, E1000_RCTL, rctl & ~E1000_RCTL_EN); if (adapter->hw.mac.type >= e1000_82540) { E1000_WRITE_REG(&adapter->hw, E1000_RADV, adapter->rx_abs_int_delay.value); /* * Set the interrupt throttling rate. Value is calculated * as DEFAULT_ITR = 1/(MAX_INTS_PER_SEC * 256ns) */ E1000_WRITE_REG(&adapter->hw, E1000_ITR, DEFAULT_ITR); } /* Setup the Base and Length of the Rx Descriptor Ring */ bus_addr = adapter->rxdma.dma_paddr; E1000_WRITE_REG(&adapter->hw, E1000_RDLEN(0), adapter->num_rx_desc * sizeof(struct e1000_rx_desc)); E1000_WRITE_REG(&adapter->hw, E1000_RDBAH(0), (u32)(bus_addr >> 32)); E1000_WRITE_REG(&adapter->hw, E1000_RDBAL(0), (u32)bus_addr); /* Setup the Receive Control Register */ rctl &= ~(3 << E1000_RCTL_MO_SHIFT); rctl |= E1000_RCTL_EN | E1000_RCTL_BAM | E1000_RCTL_LBM_NO | E1000_RCTL_RDMTS_HALF | (adapter->hw.mac.mc_filter_type << E1000_RCTL_MO_SHIFT); /* Make sure VLAN Filters are off */ rctl &= ~E1000_RCTL_VFE; if (e1000_tbi_sbp_enabled_82543(&adapter->hw)) rctl |= E1000_RCTL_SBP; else rctl &= ~E1000_RCTL_SBP; switch (adapter->rx_buffer_len) { default: case 2048: rctl |= E1000_RCTL_SZ_2048; break; case 4096: rctl |= E1000_RCTL_SZ_4096 | E1000_RCTL_BSEX | E1000_RCTL_LPE; break; case 8192: rctl |= E1000_RCTL_SZ_8192 | E1000_RCTL_BSEX | E1000_RCTL_LPE; break; case 16384: rctl |= E1000_RCTL_SZ_16384 | E1000_RCTL_BSEX | E1000_RCTL_LPE; break; } if (if_getmtu(ifp) > ETHERMTU) rctl |= E1000_RCTL_LPE; else rctl &= ~E1000_RCTL_LPE; /* Enable 82543 Receive Checksum Offload for TCP and UDP */ if ((adapter->hw.mac.type >= e1000_82543) && (if_getcapenable(ifp) & IFCAP_RXCSUM)) { rxcsum = E1000_READ_REG(&adapter->hw, E1000_RXCSUM); rxcsum |= (E1000_RXCSUM_IPOFL | E1000_RXCSUM_TUOFL); E1000_WRITE_REG(&adapter->hw, E1000_RXCSUM, rxcsum); } /* Enable Receives */ E1000_WRITE_REG(&adapter->hw, E1000_RCTL, rctl); /* * Setup the HW Rx Head and * Tail Descriptor Pointers */ E1000_WRITE_REG(&adapter->hw, E1000_RDH(0), 0); rctl = adapter->num_rx_desc - 1; /* default RDT value */ #ifdef DEV_NETMAP /* preserve buffers already made available to clients */ if (if_getcapenable(ifp) & IFCAP_NETMAP) { struct netmap_adapter *na = netmap_getna(adapter->ifp); rctl -= nm_kr_rxspace(&na->rx_rings[0]); } #endif /* DEV_NETMAP */ E1000_WRITE_REG(&adapter->hw, E1000_RDT(0), rctl); return; } /********************************************************************* * * Free receive related data structures. * **********************************************************************/ static void lem_free_receive_structures(struct adapter *adapter) { struct em_buffer *rx_buffer; int i; INIT_DEBUGOUT("free_receive_structures: begin"); if (adapter->rx_sparemap) { bus_dmamap_destroy(adapter->rxtag, adapter->rx_sparemap); adapter->rx_sparemap = NULL; } /* Cleanup any existing buffers */ if (adapter->rx_buffer_area != NULL) { rx_buffer = adapter->rx_buffer_area; for (i = 0; i < adapter->num_rx_desc; i++, rx_buffer++) { if (rx_buffer->m_head != NULL) { bus_dmamap_sync(adapter->rxtag, rx_buffer->map, BUS_DMASYNC_POSTREAD); bus_dmamap_unload(adapter->rxtag, rx_buffer->map); m_freem(rx_buffer->m_head); rx_buffer->m_head = NULL; } else if (rx_buffer->map != NULL) bus_dmamap_unload(adapter->rxtag, rx_buffer->map); if (rx_buffer->map != NULL) { bus_dmamap_destroy(adapter->rxtag, rx_buffer->map); rx_buffer->map = NULL; } } } if (adapter->rx_buffer_area != NULL) { free(adapter->rx_buffer_area, M_DEVBUF); adapter->rx_buffer_area = NULL; } if (adapter->rxtag != NULL) { bus_dma_tag_destroy(adapter->rxtag); adapter->rxtag = NULL; } } /********************************************************************* * * This routine executes in interrupt context. It replenishes * the mbufs in the descriptor and sends data which has been * dma'ed into host memory to upper layer. * * We loop at most count times if count is > 0, or until done if * count < 0. * * For polling we also now return the number of cleaned packets *********************************************************************/ static bool lem_rxeof(struct adapter *adapter, int count, int *done) { if_t ifp = adapter->ifp; struct mbuf *mp; u8 status = 0, accept_frame = 0, eop = 0; u16 len, desc_len, prev_len_adj; int i, rx_sent = 0; struct e1000_rx_desc *current_desc; #ifdef BATCH_DISPATCH struct mbuf *mh = NULL, *mt = NULL; #endif /* BATCH_DISPATCH */ #ifdef NIC_PARAVIRT int retries = 0; struct paravirt_csb* csb = adapter->csb; int csb_mode = csb && csb->guest_csb_on; //ND("clear guest_rxkick at %d", adapter->next_rx_desc_to_check); if (csb_mode && csb->guest_need_rxkick) csb->guest_need_rxkick = 0; #endif /* NIC_PARAVIRT */ EM_RX_LOCK(adapter); #ifdef BATCH_DISPATCH batch_again: #endif /* BATCH_DISPATCH */ i = adapter->next_rx_desc_to_check; current_desc = &adapter->rx_desc_base[i]; bus_dmamap_sync(adapter->rxdma.dma_tag, adapter->rxdma.dma_map, BUS_DMASYNC_POSTREAD); #ifdef DEV_NETMAP if (netmap_rx_irq(ifp, 0, &rx_sent)) { EM_RX_UNLOCK(adapter); return (FALSE); } #endif /* DEV_NETMAP */ #if 1 // XXX optimization ? if (!((current_desc->status) & E1000_RXD_STAT_DD)) { if (done != NULL) *done = rx_sent; EM_RX_UNLOCK(adapter); return (FALSE); } #endif /* 0 */ while (count != 0 && if_getdrvflags(ifp) & IFF_DRV_RUNNING) { struct mbuf *m = NULL; status = current_desc->status; if ((status & E1000_RXD_STAT_DD) == 0) { #ifdef NIC_PARAVIRT if (csb_mode) { /* buffer not ready yet. Retry a few times before giving up */ if (++retries <= adapter->rx_retries) { continue; } if (csb->guest_need_rxkick == 0) { // ND("set guest_rxkick at %d", adapter->next_rx_desc_to_check); csb->guest_need_rxkick = 1; // XXX memory barrier, status volatile ? continue; /* double check */ } } /* no buffer ready, give up */ #endif /* NIC_PARAVIRT */ break; } #ifdef NIC_PARAVIRT if (csb_mode) { if (csb->guest_need_rxkick) // ND("clear again guest_rxkick at %d", adapter->next_rx_desc_to_check); csb->guest_need_rxkick = 0; retries = 0; } #endif /* NIC_PARAVIRT */ mp = adapter->rx_buffer_area[i].m_head; /* * Can't defer bus_dmamap_sync(9) because TBI_ACCEPT * needs to access the last received byte in the mbuf. */ bus_dmamap_sync(adapter->rxtag, adapter->rx_buffer_area[i].map, BUS_DMASYNC_POSTREAD); accept_frame = 1; prev_len_adj = 0; desc_len = le16toh(current_desc->length); if (status & E1000_RXD_STAT_EOP) { count--; eop = 1; if (desc_len < ETHER_CRC_LEN) { len = 0; prev_len_adj = ETHER_CRC_LEN - desc_len; } else len = desc_len - ETHER_CRC_LEN; } else { eop = 0; len = desc_len; } if (current_desc->errors & E1000_RXD_ERR_FRAME_ERR_MASK) { u8 last_byte; u32 pkt_len = desc_len; if (adapter->fmp != NULL) pkt_len += adapter->fmp->m_pkthdr.len; last_byte = *(mtod(mp, caddr_t) + desc_len - 1); if (TBI_ACCEPT(&adapter->hw, status, current_desc->errors, pkt_len, last_byte, adapter->min_frame_size, adapter->max_frame_size)) { e1000_tbi_adjust_stats_82543(&adapter->hw, &adapter->stats, pkt_len, adapter->hw.mac.addr, adapter->max_frame_size); if (len > 0) len--; } else accept_frame = 0; } if (accept_frame) { if (lem_get_buf(adapter, i) != 0) { if_inc_counter(ifp, IFCOUNTER_IQDROPS, 1); goto discard; } /* Assign correct length to the current fragment */ mp->m_len = len; if (adapter->fmp == NULL) { mp->m_pkthdr.len = len; adapter->fmp = mp; /* Store the first mbuf */ adapter->lmp = mp; } else { /* Chain mbuf's together */ mp->m_flags &= ~M_PKTHDR; /* * Adjust length of previous mbuf in chain if * we received less than 4 bytes in the last * descriptor. */ if (prev_len_adj > 0) { adapter->lmp->m_len -= prev_len_adj; adapter->fmp->m_pkthdr.len -= prev_len_adj; } adapter->lmp->m_next = mp; adapter->lmp = adapter->lmp->m_next; adapter->fmp->m_pkthdr.len += len; } if (eop) { if_setrcvif(adapter->fmp, ifp); if_inc_counter(ifp, IFCOUNTER_IPACKETS, 1); lem_receive_checksum(adapter, current_desc, adapter->fmp); #ifndef __NO_STRICT_ALIGNMENT if (adapter->max_frame_size > (MCLBYTES - ETHER_ALIGN) && lem_fixup_rx(adapter) != 0) goto skip; #endif if (status & E1000_RXD_STAT_VP) { adapter->fmp->m_pkthdr.ether_vtag = le16toh(current_desc->special); adapter->fmp->m_flags |= M_VLANTAG; } #ifndef __NO_STRICT_ALIGNMENT skip: #endif m = adapter->fmp; adapter->fmp = NULL; adapter->lmp = NULL; } } else { adapter->dropped_pkts++; discard: /* Reuse loaded DMA map and just update mbuf chain */ mp = adapter->rx_buffer_area[i].m_head; mp->m_len = mp->m_pkthdr.len = MCLBYTES; mp->m_data = mp->m_ext.ext_buf; mp->m_next = NULL; if (adapter->max_frame_size <= (MCLBYTES - ETHER_ALIGN)) m_adj(mp, ETHER_ALIGN); if (adapter->fmp != NULL) { m_freem(adapter->fmp); adapter->fmp = NULL; adapter->lmp = NULL; } m = NULL; } /* Zero out the receive descriptors status. */ current_desc->status = 0; bus_dmamap_sync(adapter->rxdma.dma_tag, adapter->rxdma.dma_map, BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE); #ifdef NIC_PARAVIRT if (csb_mode) { /* the buffer at i has been already replaced by lem_get_buf() * so it is safe to set guest_rdt = i and possibly send a kick. * XXX see if we can optimize it later. */ csb->guest_rdt = i; // XXX memory barrier if (i == csb->host_rxkick_at) E1000_WRITE_REG(&adapter->hw, E1000_RDT(0), i); } #endif /* NIC_PARAVIRT */ /* Advance our pointers to the next descriptor. */ if (++i == adapter->num_rx_desc) i = 0; /* Call into the stack */ if (m != NULL) { #ifdef BATCH_DISPATCH if (adapter->batch_enable) { if (mh == NULL) mh = mt = m; else mt->m_nextpkt = m; mt = m; m->m_nextpkt = NULL; rx_sent++; current_desc = &adapter->rx_desc_base[i]; continue; } #endif /* BATCH_DISPATCH */ adapter->next_rx_desc_to_check = i; EM_RX_UNLOCK(adapter); if_input(ifp, m); EM_RX_LOCK(adapter); rx_sent++; i = adapter->next_rx_desc_to_check; } current_desc = &adapter->rx_desc_base[i]; } adapter->next_rx_desc_to_check = i; #ifdef BATCH_DISPATCH if (mh) { EM_RX_UNLOCK(adapter); while ( (mt = mh) != NULL) { mh = mh->m_nextpkt; mt->m_nextpkt = NULL; if_input(ifp, mt); } EM_RX_LOCK(adapter); i = adapter->next_rx_desc_to_check; /* in case of interrupts */ if (count > 0) goto batch_again; } #endif /* BATCH_DISPATCH */ /* Advance the E1000's Receive Queue #0 "Tail Pointer". */ if (--i < 0) i = adapter->num_rx_desc - 1; #ifdef NIC_PARAVIRT if (!csb_mode) /* filter out writes */ #endif /* NIC_PARAVIRT */ E1000_WRITE_REG(&adapter->hw, E1000_RDT(0), i); if (done != NULL) *done = rx_sent; EM_RX_UNLOCK(adapter); return ((status & E1000_RXD_STAT_DD) ? TRUE : FALSE); } #ifndef __NO_STRICT_ALIGNMENT /* * When jumbo frames are enabled we should realign entire payload on * architecures with strict alignment. This is serious design mistake of 8254x * as it nullifies DMA operations. 8254x just allows RX buffer size to be * 2048/4096/8192/16384. What we really want is 2048 - ETHER_ALIGN to align its * payload. On architecures without strict alignment restrictions 8254x still * performs unaligned memory access which would reduce the performance too. * To avoid copying over an entire frame to align, we allocate a new mbuf and * copy ethernet header to the new mbuf. The new mbuf is prepended into the * existing mbuf chain. * * Be aware, best performance of the 8254x is achieved only when jumbo frame is * not used at all on architectures with strict alignment. */ static int lem_fixup_rx(struct adapter *adapter) { struct mbuf *m, *n; int error; error = 0; m = adapter->fmp; if (m->m_len <= (MCLBYTES - ETHER_HDR_LEN)) { bcopy(m->m_data, m->m_data + ETHER_HDR_LEN, m->m_len); m->m_data += ETHER_HDR_LEN; } else { MGETHDR(n, M_NOWAIT, MT_DATA); if (n != NULL) { bcopy(m->m_data, n->m_data, ETHER_HDR_LEN); m->m_data += ETHER_HDR_LEN; m->m_len -= ETHER_HDR_LEN; n->m_len = ETHER_HDR_LEN; M_MOVE_PKTHDR(n, m); n->m_next = m; adapter->fmp = n; } else { adapter->dropped_pkts++; m_freem(adapter->fmp); adapter->fmp = NULL; error = ENOMEM; } } return (error); } #endif /********************************************************************* * * Verify that the hardware indicated that the checksum is valid. * Inform the stack about the status of checksum so that stack * doesn't spend time verifying the checksum. * *********************************************************************/ static void lem_receive_checksum(struct adapter *adapter, struct e1000_rx_desc *rx_desc, struct mbuf *mp) { /* 82543 or newer only */ if ((adapter->hw.mac.type < e1000_82543) || /* Ignore Checksum bit is set */ (rx_desc->status & E1000_RXD_STAT_IXSM)) { mp->m_pkthdr.csum_flags = 0; return; } if (rx_desc->status & E1000_RXD_STAT_IPCS) { /* Did it pass? */ if (!(rx_desc->errors & E1000_RXD_ERR_IPE)) { /* IP Checksum Good */ mp->m_pkthdr.csum_flags = CSUM_IP_CHECKED; mp->m_pkthdr.csum_flags |= CSUM_IP_VALID; } else { mp->m_pkthdr.csum_flags = 0; } } if (rx_desc->status & E1000_RXD_STAT_TCPCS) { /* Did it pass? */ if (!(rx_desc->errors & E1000_RXD_ERR_TCPE)) { mp->m_pkthdr.csum_flags |= (CSUM_DATA_VALID | CSUM_PSEUDO_HDR); mp->m_pkthdr.csum_data = htons(0xffff); } } } /* * This routine is run via an vlan * config EVENT */ static void lem_register_vlan(void *arg, if_t ifp, u16 vtag) { struct adapter *adapter = if_getsoftc(ifp); u32 index, bit; if (if_getsoftc(ifp) != arg) /* Not our event */ return; if ((vtag == 0) || (vtag > 4095)) /* Invalid ID */ return; EM_CORE_LOCK(adapter); index = (vtag >> 5) & 0x7F; bit = vtag & 0x1F; adapter->shadow_vfta[index] |= (1 << bit); ++adapter->num_vlans; /* Re-init to load the changes */ if (if_getcapenable(ifp) & IFCAP_VLAN_HWFILTER) lem_init_locked(adapter); EM_CORE_UNLOCK(adapter); } /* * This routine is run via an vlan * unconfig EVENT */ static void lem_unregister_vlan(void *arg, if_t ifp, u16 vtag) { struct adapter *adapter = if_getsoftc(ifp); u32 index, bit; if (if_getsoftc(ifp) != arg) return; if ((vtag == 0) || (vtag > 4095)) /* Invalid */ return; EM_CORE_LOCK(adapter); index = (vtag >> 5) & 0x7F; bit = vtag & 0x1F; adapter->shadow_vfta[index] &= ~(1 << bit); --adapter->num_vlans; /* Re-init to load the changes */ if (if_getcapenable(ifp) & IFCAP_VLAN_HWFILTER) lem_init_locked(adapter); EM_CORE_UNLOCK(adapter); } static void lem_setup_vlan_hw_support(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; u32 reg; /* ** We get here thru init_locked, meaning ** a soft reset, this has already cleared ** the VFTA and other state, so if there ** have been no vlan's registered do nothing. */ if (adapter->num_vlans == 0) return; /* ** A soft reset zero's out the VFTA, so ** we need to repopulate it now. */ for (int i = 0; i < EM_VFTA_SIZE; i++) if (adapter->shadow_vfta[i] != 0) E1000_WRITE_REG_ARRAY(hw, E1000_VFTA, i, adapter->shadow_vfta[i]); reg = E1000_READ_REG(hw, E1000_CTRL); reg |= E1000_CTRL_VME; E1000_WRITE_REG(hw, E1000_CTRL, reg); /* Enable the Filter Table */ reg = E1000_READ_REG(hw, E1000_RCTL); reg &= ~E1000_RCTL_CFIEN; reg |= E1000_RCTL_VFE; E1000_WRITE_REG(hw, E1000_RCTL, reg); } static void lem_enable_intr(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; u32 ims_mask = IMS_ENABLE_MASK; E1000_WRITE_REG(hw, E1000_IMS, ims_mask); } static void lem_disable_intr(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; E1000_WRITE_REG(hw, E1000_IMC, 0xffffffff); } /* * Bit of a misnomer, what this really means is * to enable OS management of the system... aka * to disable special hardware management features */ static void lem_init_manageability(struct adapter *adapter) { /* A shared code workaround */ if (adapter->has_manage) { int manc = E1000_READ_REG(&adapter->hw, E1000_MANC); /* disable hardware interception of ARP */ manc &= ~(E1000_MANC_ARP_EN); E1000_WRITE_REG(&adapter->hw, E1000_MANC, manc); } } /* * Give control back to hardware management * controller if there is one. */ static void lem_release_manageability(struct adapter *adapter) { if (adapter->has_manage) { int manc = E1000_READ_REG(&adapter->hw, E1000_MANC); /* re-enable hardware interception of ARP */ manc |= E1000_MANC_ARP_EN; E1000_WRITE_REG(&adapter->hw, E1000_MANC, manc); } } /* * lem_get_hw_control sets the {CTRL_EXT|FWSM}:DRV_LOAD bit. * For ASF and Pass Through versions of f/w this means * that the driver is loaded. For AMT version type f/w * this means that the network i/f is open. */ static void lem_get_hw_control(struct adapter *adapter) { u32 ctrl_ext; ctrl_ext = E1000_READ_REG(&adapter->hw, E1000_CTRL_EXT); E1000_WRITE_REG(&adapter->hw, E1000_CTRL_EXT, ctrl_ext | E1000_CTRL_EXT_DRV_LOAD); return; } /* * lem_release_hw_control resets {CTRL_EXT|FWSM}:DRV_LOAD bit. * For ASF and Pass Through versions of f/w this means that * the driver is no longer loaded. For AMT versions of the * f/w this means that the network i/f is closed. */ static void lem_release_hw_control(struct adapter *adapter) { u32 ctrl_ext; if (!adapter->has_manage) return; ctrl_ext = E1000_READ_REG(&adapter->hw, E1000_CTRL_EXT); E1000_WRITE_REG(&adapter->hw, E1000_CTRL_EXT, ctrl_ext & ~E1000_CTRL_EXT_DRV_LOAD); return; } static int lem_is_valid_ether_addr(u8 *addr) { char zero_addr[6] = { 0, 0, 0, 0, 0, 0 }; if ((addr[0] & 1) || (!bcmp(addr, zero_addr, ETHER_ADDR_LEN))) { return (FALSE); } return (TRUE); } /* ** Parse the interface capabilities with regard ** to both system management and wake-on-lan for ** later use. */ static void lem_get_wakeup(device_t dev) { struct adapter *adapter = device_get_softc(dev); u16 eeprom_data = 0, device_id, apme_mask; adapter->has_manage = e1000_enable_mng_pass_thru(&adapter->hw); apme_mask = EM_EEPROM_APME; switch (adapter->hw.mac.type) { case e1000_82542: case e1000_82543: break; case e1000_82544: e1000_read_nvm(&adapter->hw, NVM_INIT_CONTROL2_REG, 1, &eeprom_data); apme_mask = EM_82544_APME; break; case e1000_82546: case e1000_82546_rev_3: if (adapter->hw.bus.func == 1) { e1000_read_nvm(&adapter->hw, NVM_INIT_CONTROL3_PORT_B, 1, &eeprom_data); break; } else e1000_read_nvm(&adapter->hw, NVM_INIT_CONTROL3_PORT_A, 1, &eeprom_data); break; default: e1000_read_nvm(&adapter->hw, NVM_INIT_CONTROL3_PORT_A, 1, &eeprom_data); break; } if (eeprom_data & apme_mask) adapter->wol = (E1000_WUFC_MAG | E1000_WUFC_MC); /* * We have the eeprom settings, now apply the special cases * where the eeprom may be wrong or the board won't support * wake on lan on a particular port */ device_id = pci_get_device(dev); switch (device_id) { case E1000_DEV_ID_82546GB_PCIE: adapter->wol = 0; break; case E1000_DEV_ID_82546EB_FIBER: case E1000_DEV_ID_82546GB_FIBER: /* Wake events only supported on port A for dual fiber * regardless of eeprom setting */ if (E1000_READ_REG(&adapter->hw, E1000_STATUS) & E1000_STATUS_FUNC_1) adapter->wol = 0; break; case E1000_DEV_ID_82546GB_QUAD_COPPER_KSP3: /* if quad port adapter, disable WoL on all but port A */ if (global_quad_port_a != 0) adapter->wol = 0; /* Reset for multiple quad port adapters */ if (++global_quad_port_a == 4) global_quad_port_a = 0; break; } return; } /* * Enable PCI Wake On Lan capability */ static void lem_enable_wakeup(device_t dev) { struct adapter *adapter = device_get_softc(dev); if_t ifp = adapter->ifp; u32 pmc, ctrl, ctrl_ext, rctl; u16 status; if ((pci_find_cap(dev, PCIY_PMG, &pmc) != 0)) return; /* Advertise the wakeup capability */ ctrl = E1000_READ_REG(&adapter->hw, E1000_CTRL); ctrl |= (E1000_CTRL_SWDPIN2 | E1000_CTRL_SWDPIN3); E1000_WRITE_REG(&adapter->hw, E1000_CTRL, ctrl); E1000_WRITE_REG(&adapter->hw, E1000_WUC, E1000_WUC_PME_EN); /* Keep the laser running on Fiber adapters */ if (adapter->hw.phy.media_type == e1000_media_type_fiber || adapter->hw.phy.media_type == e1000_media_type_internal_serdes) { ctrl_ext = E1000_READ_REG(&adapter->hw, E1000_CTRL_EXT); ctrl_ext |= E1000_CTRL_EXT_SDP3_DATA; E1000_WRITE_REG(&adapter->hw, E1000_CTRL_EXT, ctrl_ext); } /* ** Determine type of Wakeup: note that wol ** is set with all bits on by default. */ if ((if_getcapenable(ifp) & IFCAP_WOL_MAGIC) == 0) adapter->wol &= ~E1000_WUFC_MAG; if ((if_getcapenable(ifp) & IFCAP_WOL_MCAST) == 0) adapter->wol &= ~E1000_WUFC_MC; else { rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); rctl |= E1000_RCTL_MPE; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, rctl); } if (adapter->hw.mac.type == e1000_pchlan) { if (lem_enable_phy_wakeup(adapter)) return; } else { E1000_WRITE_REG(&adapter->hw, E1000_WUC, E1000_WUC_PME_EN); E1000_WRITE_REG(&adapter->hw, E1000_WUFC, adapter->wol); } /* Request PME */ status = pci_read_config(dev, pmc + PCIR_POWER_STATUS, 2); status &= ~(PCIM_PSTAT_PME | PCIM_PSTAT_PMEENABLE); if (if_getcapenable(ifp) & IFCAP_WOL) status |= PCIM_PSTAT_PME | PCIM_PSTAT_PMEENABLE; pci_write_config(dev, pmc + PCIR_POWER_STATUS, status, 2); return; } /* ** WOL in the newer chipset interfaces (pchlan) ** require thing to be copied into the phy */ static int lem_enable_phy_wakeup(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; u32 mreg, ret = 0; u16 preg; /* copy MAC RARs to PHY RARs */ for (int i = 0; i < adapter->hw.mac.rar_entry_count; i++) { mreg = E1000_READ_REG(hw, E1000_RAL(i)); e1000_write_phy_reg(hw, BM_RAR_L(i), (u16)(mreg & 0xFFFF)); e1000_write_phy_reg(hw, BM_RAR_M(i), (u16)((mreg >> 16) & 0xFFFF)); mreg = E1000_READ_REG(hw, E1000_RAH(i)); e1000_write_phy_reg(hw, BM_RAR_H(i), (u16)(mreg & 0xFFFF)); e1000_write_phy_reg(hw, BM_RAR_CTRL(i), (u16)((mreg >> 16) & 0xFFFF)); } /* copy MAC MTA to PHY MTA */ for (int i = 0; i < adapter->hw.mac.mta_reg_count; i++) { mreg = E1000_READ_REG_ARRAY(hw, E1000_MTA, i); e1000_write_phy_reg(hw, BM_MTA(i), (u16)(mreg & 0xFFFF)); e1000_write_phy_reg(hw, BM_MTA(i) + 1, (u16)((mreg >> 16) & 0xFFFF)); } /* configure PHY Rx Control register */ e1000_read_phy_reg(&adapter->hw, BM_RCTL, &preg); mreg = E1000_READ_REG(hw, E1000_RCTL); if (mreg & E1000_RCTL_UPE) preg |= BM_RCTL_UPE; if (mreg & E1000_RCTL_MPE) preg |= BM_RCTL_MPE; preg &= ~(BM_RCTL_MO_MASK); if (mreg & E1000_RCTL_MO_3) preg |= (((mreg & E1000_RCTL_MO_3) >> E1000_RCTL_MO_SHIFT) << BM_RCTL_MO_SHIFT); if (mreg & E1000_RCTL_BAM) preg |= BM_RCTL_BAM; if (mreg & E1000_RCTL_PMCF) preg |= BM_RCTL_PMCF; mreg = E1000_READ_REG(hw, E1000_CTRL); if (mreg & E1000_CTRL_RFCE) preg |= BM_RCTL_RFCE; e1000_write_phy_reg(&adapter->hw, BM_RCTL, preg); /* enable PHY wakeup in MAC register */ E1000_WRITE_REG(hw, E1000_WUC, E1000_WUC_PHY_WAKE | E1000_WUC_PME_EN); E1000_WRITE_REG(hw, E1000_WUFC, adapter->wol); /* configure and enable PHY wakeup in PHY registers */ e1000_write_phy_reg(&adapter->hw, BM_WUFC, adapter->wol); e1000_write_phy_reg(&adapter->hw, BM_WUC, E1000_WUC_PME_EN); /* activate PHY wakeup */ ret = hw->phy.ops.acquire(hw); if (ret) { printf("Could not acquire PHY\n"); return ret; } e1000_write_phy_reg_mdic(hw, IGP01E1000_PHY_PAGE_SELECT, (BM_WUC_ENABLE_PAGE << IGP_PAGE_SHIFT)); ret = e1000_read_phy_reg_mdic(hw, BM_WUC_ENABLE_REG, &preg); if (ret) { printf("Could not read PHY page 769\n"); goto out; } preg |= BM_WUC_ENABLE_BIT | BM_WUC_HOST_WU_BIT; ret = e1000_write_phy_reg_mdic(hw, BM_WUC_ENABLE_REG, preg); if (ret) printf("Could not set PHY Host Wakeup bit\n"); out: hw->phy.ops.release(hw); return ret; } static void lem_led_func(void *arg, int onoff) { struct adapter *adapter = arg; EM_CORE_LOCK(adapter); if (onoff) { e1000_setup_led(&adapter->hw); e1000_led_on(&adapter->hw); } else { e1000_led_off(&adapter->hw); e1000_cleanup_led(&adapter->hw); } EM_CORE_UNLOCK(adapter); } /********************************************************************* * 82544 Coexistence issue workaround. * There are 2 issues. * 1. Transmit Hang issue. * To detect this issue, following equation can be used... * SIZE[3:0] + ADDR[2:0] = SUM[3:0]. * If SUM[3:0] is in between 1 to 4, we will have this issue. * * 2. DAC issue. * To detect this issue, following equation can be used... * SIZE[3:0] + ADDR[2:0] = SUM[3:0]. * If SUM[3:0] is in between 9 to c, we will have this issue. * * * WORKAROUND: * Make sure we do not have ending address * as 1,2,3,4(Hang) or 9,a,b,c (DAC) * *************************************************************************/ static u32 lem_fill_descriptors (bus_addr_t address, u32 length, PDESC_ARRAY desc_array) { u32 safe_terminator; /* Since issue is sensitive to length and address.*/ /* Let us first check the address...*/ if (length <= 4) { desc_array->descriptor[0].address = address; desc_array->descriptor[0].length = length; desc_array->elements = 1; return (desc_array->elements); } safe_terminator = (u32)((((u32)address & 0x7) + (length & 0xF)) & 0xF); /* if it does not fall between 0x1 to 0x4 and 0x9 to 0xC then return */ if (safe_terminator == 0 || (safe_terminator > 4 && safe_terminator < 9) || (safe_terminator > 0xC && safe_terminator <= 0xF)) { desc_array->descriptor[0].address = address; desc_array->descriptor[0].length = length; desc_array->elements = 1; return (desc_array->elements); } desc_array->descriptor[0].address = address; desc_array->descriptor[0].length = length - 4; desc_array->descriptor[1].address = address + (length - 4); desc_array->descriptor[1].length = 4; desc_array->elements = 2; return (desc_array->elements); } /********************************************************************** * * Update the board statistics counters. * **********************************************************************/ static void lem_update_stats_counters(struct adapter *adapter) { if(adapter->hw.phy.media_type == e1000_media_type_copper || (E1000_READ_REG(&adapter->hw, E1000_STATUS) & E1000_STATUS_LU)) { adapter->stats.symerrs += E1000_READ_REG(&adapter->hw, E1000_SYMERRS); adapter->stats.sec += E1000_READ_REG(&adapter->hw, E1000_SEC); } adapter->stats.crcerrs += E1000_READ_REG(&adapter->hw, E1000_CRCERRS); adapter->stats.mpc += E1000_READ_REG(&adapter->hw, E1000_MPC); adapter->stats.scc += E1000_READ_REG(&adapter->hw, E1000_SCC); adapter->stats.ecol += E1000_READ_REG(&adapter->hw, E1000_ECOL); adapter->stats.mcc += E1000_READ_REG(&adapter->hw, E1000_MCC); adapter->stats.latecol += E1000_READ_REG(&adapter->hw, E1000_LATECOL); adapter->stats.colc += E1000_READ_REG(&adapter->hw, E1000_COLC); adapter->stats.dc += E1000_READ_REG(&adapter->hw, E1000_DC); adapter->stats.rlec += E1000_READ_REG(&adapter->hw, E1000_RLEC); adapter->stats.xonrxc += E1000_READ_REG(&adapter->hw, E1000_XONRXC); adapter->stats.xontxc += E1000_READ_REG(&adapter->hw, E1000_XONTXC); adapter->stats.xoffrxc += E1000_READ_REG(&adapter->hw, E1000_XOFFRXC); adapter->stats.xofftxc += E1000_READ_REG(&adapter->hw, E1000_XOFFTXC); adapter->stats.fcruc += E1000_READ_REG(&adapter->hw, E1000_FCRUC); adapter->stats.prc64 += E1000_READ_REG(&adapter->hw, E1000_PRC64); adapter->stats.prc127 += E1000_READ_REG(&adapter->hw, E1000_PRC127); adapter->stats.prc255 += E1000_READ_REG(&adapter->hw, E1000_PRC255); adapter->stats.prc511 += E1000_READ_REG(&adapter->hw, E1000_PRC511); adapter->stats.prc1023 += E1000_READ_REG(&adapter->hw, E1000_PRC1023); adapter->stats.prc1522 += E1000_READ_REG(&adapter->hw, E1000_PRC1522); adapter->stats.gprc += E1000_READ_REG(&adapter->hw, E1000_GPRC); adapter->stats.bprc += E1000_READ_REG(&adapter->hw, E1000_BPRC); adapter->stats.mprc += E1000_READ_REG(&adapter->hw, E1000_MPRC); adapter->stats.gptc += E1000_READ_REG(&adapter->hw, E1000_GPTC); /* For the 64-bit byte counters the low dword must be read first. */ /* Both registers clear on the read of the high dword */ adapter->stats.gorc += E1000_READ_REG(&adapter->hw, E1000_GORCL) + ((u64)E1000_READ_REG(&adapter->hw, E1000_GORCH) << 32); adapter->stats.gotc += E1000_READ_REG(&adapter->hw, E1000_GOTCL) + ((u64)E1000_READ_REG(&adapter->hw, E1000_GOTCH) << 32); adapter->stats.rnbc += E1000_READ_REG(&adapter->hw, E1000_RNBC); adapter->stats.ruc += E1000_READ_REG(&adapter->hw, E1000_RUC); adapter->stats.rfc += E1000_READ_REG(&adapter->hw, E1000_RFC); adapter->stats.roc += E1000_READ_REG(&adapter->hw, E1000_ROC); adapter->stats.rjc += E1000_READ_REG(&adapter->hw, E1000_RJC); adapter->stats.tor += E1000_READ_REG(&adapter->hw, E1000_TORH); adapter->stats.tot += E1000_READ_REG(&adapter->hw, E1000_TOTH); adapter->stats.tpr += E1000_READ_REG(&adapter->hw, E1000_TPR); adapter->stats.tpt += E1000_READ_REG(&adapter->hw, E1000_TPT); adapter->stats.ptc64 += E1000_READ_REG(&adapter->hw, E1000_PTC64); adapter->stats.ptc127 += E1000_READ_REG(&adapter->hw, E1000_PTC127); adapter->stats.ptc255 += E1000_READ_REG(&adapter->hw, E1000_PTC255); adapter->stats.ptc511 += E1000_READ_REG(&adapter->hw, E1000_PTC511); adapter->stats.ptc1023 += E1000_READ_REG(&adapter->hw, E1000_PTC1023); adapter->stats.ptc1522 += E1000_READ_REG(&adapter->hw, E1000_PTC1522); adapter->stats.mptc += E1000_READ_REG(&adapter->hw, E1000_MPTC); adapter->stats.bptc += E1000_READ_REG(&adapter->hw, E1000_BPTC); if (adapter->hw.mac.type >= e1000_82543) { adapter->stats.algnerrc += E1000_READ_REG(&adapter->hw, E1000_ALGNERRC); adapter->stats.rxerrc += E1000_READ_REG(&adapter->hw, E1000_RXERRC); adapter->stats.tncrs += E1000_READ_REG(&adapter->hw, E1000_TNCRS); adapter->stats.cexterr += E1000_READ_REG(&adapter->hw, E1000_CEXTERR); adapter->stats.tsctc += E1000_READ_REG(&adapter->hw, E1000_TSCTC); adapter->stats.tsctfc += E1000_READ_REG(&adapter->hw, E1000_TSCTFC); } } static uint64_t lem_get_counter(if_t ifp, ift_counter cnt) { struct adapter *adapter; adapter = if_getsoftc(ifp); switch (cnt) { case IFCOUNTER_COLLISIONS: return (adapter->stats.colc); case IFCOUNTER_IERRORS: return (adapter->dropped_pkts + adapter->stats.rxerrc + adapter->stats.crcerrs + adapter->stats.algnerrc + adapter->stats.ruc + adapter->stats.roc + adapter->stats.mpc + adapter->stats.cexterr); case IFCOUNTER_OERRORS: return (adapter->stats.ecol + adapter->stats.latecol + adapter->watchdog_events); default: return (if_get_counter_default(ifp, cnt)); } } /* Export a single 32-bit register via a read-only sysctl. */ static int lem_sysctl_reg_handler(SYSCTL_HANDLER_ARGS) { struct adapter *adapter; u_int val; adapter = oidp->oid_arg1; val = E1000_READ_REG(&adapter->hw, oidp->oid_arg2); return (sysctl_handle_int(oidp, &val, 0, req)); } /* * Add sysctl variables, one per statistic, to the system. */ static void lem_add_hw_stats(struct adapter *adapter) { device_t dev = adapter->dev; struct sysctl_ctx_list *ctx = device_get_sysctl_ctx(dev); struct sysctl_oid *tree = device_get_sysctl_tree(dev); struct sysctl_oid_list *child = SYSCTL_CHILDREN(tree); struct e1000_hw_stats *stats = &adapter->stats; struct sysctl_oid *stat_node; struct sysctl_oid_list *stat_list; /* Driver Statistics */ SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "cluster_alloc_fail", CTLFLAG_RD, &adapter->mbuf_cluster_failed, "Std mbuf cluster failed"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "mbuf_defrag_fail", CTLFLAG_RD, &adapter->mbuf_defrag_failed, "Defragmenting mbuf chain failed"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "dropped", CTLFLAG_RD, &adapter->dropped_pkts, "Driver dropped packets"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "tx_dma_fail", CTLFLAG_RD, &adapter->no_tx_dma_setup, "Driver tx dma failure in xmit"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "tx_desc_fail1", CTLFLAG_RD, &adapter->no_tx_desc_avail1, "Not enough tx descriptors failure in xmit"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "tx_desc_fail2", CTLFLAG_RD, &adapter->no_tx_desc_avail2, "Not enough tx descriptors failure in xmit"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "rx_overruns", CTLFLAG_RD, &adapter->rx_overruns, "RX overruns"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "watchdog_timeouts", CTLFLAG_RD, &adapter->watchdog_events, "Watchdog timeouts"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "device_control", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_CTRL, lem_sysctl_reg_handler, "IU", "Device Control Register"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "rx_control", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_RCTL, lem_sysctl_reg_handler, "IU", "Receiver Control Register"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "fc_high_water", CTLFLAG_RD, &adapter->hw.fc.high_water, 0, "Flow Control High Watermark"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "fc_low_water", CTLFLAG_RD, &adapter->hw.fc.low_water, 0, "Flow Control Low Watermark"); SYSCTL_ADD_UQUAD(ctx, child, OID_AUTO, "fifo_workaround", CTLFLAG_RD, &adapter->tx_fifo_wrk_cnt, "TX FIFO workaround events"); SYSCTL_ADD_UQUAD(ctx, child, OID_AUTO, "fifo_reset", CTLFLAG_RD, &adapter->tx_fifo_reset_cnt, "TX FIFO resets"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "txd_head", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_TDH(0), lem_sysctl_reg_handler, "IU", "Transmit Descriptor Head"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "txd_tail", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_TDT(0), lem_sysctl_reg_handler, "IU", "Transmit Descriptor Tail"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "rxd_head", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_RDH(0), lem_sysctl_reg_handler, "IU", "Receive Descriptor Head"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "rxd_tail", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_RDT(0), lem_sysctl_reg_handler, "IU", "Receive Descriptor Tail"); /* MAC stats get their own sub node */ stat_node = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "mac_stats", CTLFLAG_RD, NULL, "Statistics"); stat_list = SYSCTL_CHILDREN(stat_node); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "excess_coll", CTLFLAG_RD, &stats->ecol, "Excessive collisions"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "single_coll", CTLFLAG_RD, &stats->scc, "Single collisions"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "multiple_coll", CTLFLAG_RD, &stats->mcc, "Multiple collisions"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "late_coll", CTLFLAG_RD, &stats->latecol, "Late collisions"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "collision_count", CTLFLAG_RD, &stats->colc, "Collision Count"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "symbol_errors", CTLFLAG_RD, &adapter->stats.symerrs, "Symbol Errors"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "sequence_errors", CTLFLAG_RD, &adapter->stats.sec, "Sequence Errors"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "defer_count", CTLFLAG_RD, &adapter->stats.dc, "Defer Count"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "missed_packets", CTLFLAG_RD, &adapter->stats.mpc, "Missed Packets"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_no_buff", CTLFLAG_RD, &adapter->stats.rnbc, "Receive No Buffers"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_undersize", CTLFLAG_RD, &adapter->stats.ruc, "Receive Undersize"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_fragmented", CTLFLAG_RD, &adapter->stats.rfc, "Fragmented Packets Received "); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_oversize", CTLFLAG_RD, &adapter->stats.roc, "Oversized Packets Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_jabber", CTLFLAG_RD, &adapter->stats.rjc, "Recevied Jabber"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_errs", CTLFLAG_RD, &adapter->stats.rxerrc, "Receive Errors"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "crc_errs", CTLFLAG_RD, &adapter->stats.crcerrs, "CRC errors"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "alignment_errs", CTLFLAG_RD, &adapter->stats.algnerrc, "Alignment Errors"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "coll_ext_errs", CTLFLAG_RD, &adapter->stats.cexterr, "Collision/Carrier extension errors"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "xon_recvd", CTLFLAG_RD, &adapter->stats.xonrxc, "XON Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "xon_txd", CTLFLAG_RD, &adapter->stats.xontxc, "XON Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "xoff_recvd", CTLFLAG_RD, &adapter->stats.xoffrxc, "XOFF Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "xoff_txd", CTLFLAG_RD, &adapter->stats.xofftxc, "XOFF Transmitted"); /* Packet Reception Stats */ SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "total_pkts_recvd", CTLFLAG_RD, &adapter->stats.tpr, "Total Packets Received "); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "good_pkts_recvd", CTLFLAG_RD, &adapter->stats.gprc, "Good Packets Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "bcast_pkts_recvd", CTLFLAG_RD, &adapter->stats.bprc, "Broadcast Packets Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "mcast_pkts_recvd", CTLFLAG_RD, &adapter->stats.mprc, "Multicast Packets Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_64", CTLFLAG_RD, &adapter->stats.prc64, "64 byte frames received "); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_65_127", CTLFLAG_RD, &adapter->stats.prc127, "65-127 byte frames received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_128_255", CTLFLAG_RD, &adapter->stats.prc255, "128-255 byte frames received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_256_511", CTLFLAG_RD, &adapter->stats.prc511, "256-511 byte frames received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_512_1023", CTLFLAG_RD, &adapter->stats.prc1023, "512-1023 byte frames received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_1024_1522", CTLFLAG_RD, &adapter->stats.prc1522, "1023-1522 byte frames received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "good_octets_recvd", CTLFLAG_RD, &adapter->stats.gorc, "Good Octets Received"); /* Packet Transmission Stats */ SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "good_octets_txd", CTLFLAG_RD, &adapter->stats.gotc, "Good Octets Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "total_pkts_txd", CTLFLAG_RD, &adapter->stats.tpt, "Total Packets Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "good_pkts_txd", CTLFLAG_RD, &adapter->stats.gptc, "Good Packets Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "bcast_pkts_txd", CTLFLAG_RD, &adapter->stats.bptc, "Broadcast Packets Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "mcast_pkts_txd", CTLFLAG_RD, &adapter->stats.mptc, "Multicast Packets Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_64", CTLFLAG_RD, &adapter->stats.ptc64, "64 byte frames transmitted "); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_65_127", CTLFLAG_RD, &adapter->stats.ptc127, "65-127 byte frames transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_128_255", CTLFLAG_RD, &adapter->stats.ptc255, "128-255 byte frames transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_256_511", CTLFLAG_RD, &adapter->stats.ptc511, "256-511 byte frames transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_512_1023", CTLFLAG_RD, &adapter->stats.ptc1023, "512-1023 byte frames transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_1024_1522", CTLFLAG_RD, &adapter->stats.ptc1522, "1024-1522 byte frames transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tso_txd", CTLFLAG_RD, &adapter->stats.tsctc, "TSO Contexts Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tso_ctx_fail", CTLFLAG_RD, &adapter->stats.tsctfc, "TSO Contexts Failed"); } /********************************************************************** * * This routine provides a way to dump out the adapter eeprom, * often a useful debug/service tool. This only dumps the first * 32 words, stuff that matters is in that extent. * **********************************************************************/ static int lem_sysctl_nvm_info(SYSCTL_HANDLER_ARGS) { struct adapter *adapter; int error; int result; result = -1; error = sysctl_handle_int(oidp, &result, 0, req); if (error || !req->newptr) return (error); /* * This value will cause a hex dump of the * first 32 16-bit words of the EEPROM to * the screen. */ if (result == 1) { adapter = (struct adapter *)arg1; lem_print_nvm_info(adapter); } return (error); } static void lem_print_nvm_info(struct adapter *adapter) { u16 eeprom_data; int i, j, row = 0; /* Its a bit crude, but it gets the job done */ printf("\nInterface EEPROM Dump:\n"); printf("Offset\n0x0000 "); for (i = 0, j = 0; i < 32; i++, j++) { if (j == 8) { /* Make the offset block */ j = 0; ++row; printf("\n0x00%x0 ",row); } e1000_read_nvm(&adapter->hw, i, 1, &eeprom_data); printf("%04x ", eeprom_data); } printf("\n"); } static int lem_sysctl_int_delay(SYSCTL_HANDLER_ARGS) { struct em_int_delay_info *info; struct adapter *adapter; u32 regval; int error; int usecs; int ticks; info = (struct em_int_delay_info *)arg1; usecs = info->value; error = sysctl_handle_int(oidp, &usecs, 0, req); if (error != 0 || req->newptr == NULL) return (error); if (usecs < 0 || usecs > EM_TICKS_TO_USECS(65535)) return (EINVAL); info->value = usecs; ticks = EM_USECS_TO_TICKS(usecs); if (info->offset == E1000_ITR) /* units are 256ns here */ ticks *= 4; adapter = info->adapter; EM_CORE_LOCK(adapter); regval = E1000_READ_OFFSET(&adapter->hw, info->offset); regval = (regval & ~0xffff) | (ticks & 0xffff); /* Handle a few special cases. */ switch (info->offset) { case E1000_RDTR: break; case E1000_TIDV: if (ticks == 0) { adapter->txd_cmd &= ~E1000_TXD_CMD_IDE; /* Don't write 0 into the TIDV register. */ regval++; } else adapter->txd_cmd |= E1000_TXD_CMD_IDE; break; } E1000_WRITE_OFFSET(&adapter->hw, info->offset, regval); EM_CORE_UNLOCK(adapter); return (0); } static void lem_add_int_delay_sysctl(struct adapter *adapter, const char *name, const char *description, struct em_int_delay_info *info, int offset, int value) { info->adapter = adapter; info->offset = offset; info->value = value; SYSCTL_ADD_PROC(device_get_sysctl_ctx(adapter->dev), SYSCTL_CHILDREN(device_get_sysctl_tree(adapter->dev)), OID_AUTO, name, CTLTYPE_INT|CTLFLAG_RW, info, 0, lem_sysctl_int_delay, "I", description); } static void lem_set_flow_cntrl(struct adapter *adapter, const char *name, const char *description, int *limit, int value) { *limit = value; SYSCTL_ADD_INT(device_get_sysctl_ctx(adapter->dev), SYSCTL_CHILDREN(device_get_sysctl_tree(adapter->dev)), OID_AUTO, name, CTLFLAG_RW, limit, value, description); } static void lem_add_rx_process_limit(struct adapter *adapter, const char *name, const char *description, int *limit, int value) { *limit = value; SYSCTL_ADD_INT(device_get_sysctl_ctx(adapter->dev), SYSCTL_CHILDREN(device_get_sysctl_tree(adapter->dev)), OID_AUTO, name, CTLFLAG_RW, limit, value, description); } Index: user/alc/PQ_LAUNDRY/sys/dev/fb/vesa.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/fb/vesa.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/fb/vesa.c (revision 303206) @@ -1,1984 +1,1996 @@ /*- * Copyright (c) 1998 Kazutaka YOKOTA and Michael Smith * Copyright (c) 2009-2013 Jung-uk Kim * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer as * the first lines of this file unmodified. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHORS ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_vga.h" #include "opt_vesa.h" #ifndef VGA_NO_MODE_CHANGE #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #define VESA_BIOS_OFFSET 0xc0000 #define VESA_PALETTE_SIZE (256 * 4) #define VESA_VIA_CLE266 "VIA CLE266\r\n" #ifndef VESA_DEBUG #define VESA_DEBUG 0 #endif /* VESA video adapter state buffer stub */ struct adp_state { int sig; #define V_STATE_SIG 0x61736576 u_char regs[1]; }; typedef struct adp_state adp_state_t; static struct mtx vesa_lock; static int vesa_state; static void *vesa_state_buf; static uint32_t vesa_state_buf_offs; static size_t vesa_state_buf_size; static u_char *vesa_palette; static uint32_t vesa_palette_offs; static void *vesa_vmem_buf; static size_t vesa_vmem_max; static void *vesa_bios; static uint32_t vesa_bios_offs; static uint32_t vesa_bios_int10; static size_t vesa_bios_size; /* VESA video adapter */ static video_adapter_t *vesa_adp; static SYSCTL_NODE(_debug, OID_AUTO, vesa, CTLFLAG_RD, NULL, "VESA debugging"); static int vesa_shadow_rom; SYSCTL_INT(_debug_vesa, OID_AUTO, shadow_rom, CTLFLAG_RDTUN, &vesa_shadow_rom, 0, "Enable video BIOS shadow"); /* VESA functions */ #if 0 static int vesa_nop(void); #endif static int vesa_error(void); static vi_probe_t vesa_probe; static vi_init_t vesa_init; static vi_get_info_t vesa_get_info; static vi_query_mode_t vesa_query_mode; static vi_set_mode_t vesa_set_mode; static vi_save_font_t vesa_save_font; static vi_load_font_t vesa_load_font; static vi_show_font_t vesa_show_font; static vi_save_palette_t vesa_save_palette; static vi_load_palette_t vesa_load_palette; static vi_set_border_t vesa_set_border; static vi_save_state_t vesa_save_state; static vi_load_state_t vesa_load_state; static vi_set_win_org_t vesa_set_origin; static vi_read_hw_cursor_t vesa_read_hw_cursor; static vi_set_hw_cursor_t vesa_set_hw_cursor; static vi_set_hw_cursor_shape_t vesa_set_hw_cursor_shape; static vi_blank_display_t vesa_blank_display; static vi_mmap_t vesa_mmap; static vi_ioctl_t vesa_ioctl; static vi_clear_t vesa_clear; static vi_fill_rect_t vesa_fill_rect; static vi_bitblt_t vesa_bitblt; static vi_diag_t vesa_diag; static int vesa_bios_info(int level); +static int vesa_late_load(int flags); static video_switch_t vesavidsw = { vesa_probe, vesa_init, vesa_get_info, vesa_query_mode, vesa_set_mode, vesa_save_font, vesa_load_font, vesa_show_font, vesa_save_palette, vesa_load_palette, vesa_set_border, vesa_save_state, vesa_load_state, vesa_set_origin, vesa_read_hw_cursor, vesa_set_hw_cursor, vesa_set_hw_cursor_shape, vesa_blank_display, vesa_mmap, vesa_ioctl, vesa_clear, vesa_fill_rect, vesa_bitblt, vesa_error, vesa_error, vesa_diag, }; static video_switch_t *prevvidsw; /* VESA BIOS video modes */ #define VESA_MAXMODES 64 #define EOT (-1) #define NA (-2) #define MODE_TABLE_DELTA 8 static int vesa_vmode_max; static video_info_t *vesa_vmode; static int vesa_init_done; static struct vesa_info *vesa_adp_info; static u_int16_t *vesa_vmodetab; static char *vesa_oemstr; static char *vesa_venderstr; static char *vesa_prodstr; static char *vesa_revstr; /* local macros and functions */ #define BIOS_SADDRTOLADDR(p) ((((p) & 0xffff0000) >> 12) + ((p) & 0x0000ffff)) static int int10_set_mode(int mode); static int vesa_bios_post(void); static int vesa_bios_get_mode(int mode, struct vesa_mode *vmode, int flags); static int vesa_bios_set_mode(int mode); #if 0 static int vesa_bios_get_dac(void); #endif static int vesa_bios_set_dac(int bits); static int vesa_bios_save_palette(int start, int colors, u_char *palette, int bits); static int vesa_bios_save_palette2(int start, int colors, u_char *r, u_char *g, u_char *b, int bits); static int vesa_bios_load_palette(int start, int colors, u_char *palette, int bits); static int vesa_bios_load_palette2(int start, int colors, u_char *r, u_char *g, u_char *b, int bits); #define STATE_SIZE 0 #define STATE_SAVE 1 #define STATE_LOAD 2 static size_t vesa_bios_state_buf_size(int); static int vesa_bios_save_restore(int code, void *p); #ifdef MODE_TABLE_BROKEN static int vesa_bios_get_line_length(void); #endif static int vesa_bios_set_line_length(int pixel, int *bytes, int *lines); #if 0 static int vesa_bios_get_start(int *x, int *y); #endif static int vesa_bios_set_start(int x, int y); static int vesa_map_gen_mode_num(int type, int color, int mode); static int vesa_translate_flags(u_int16_t vflags); static int vesa_translate_mmodel(u_int8_t vmodel); static int vesa_get_bpscanline(struct vesa_mode *vmode); static int vesa_bios_init(void); static void vesa_bios_uninit(void); static void vesa_clear_modes(video_info_t *info, int color); #if 0 static int vesa_get_origin(video_adapter_t *adp, off_t *offset); #endif /* INT 10 BIOS calls */ static int int10_set_mode(int mode) { x86regs_t regs; x86bios_init_regs(®s); regs.R_AL = mode; x86bios_intr(®s, 0x10); return (0); } static int vesa_bios_post(void) { x86regs_t regs; devclass_t dc; device_t *devs; device_t dev; int count, i, is_pci; if (x86bios_get_orm(vesa_bios_offs) == NULL) return (1); dev = NULL; is_pci = 0; /* Find the matching PCI video controller. */ dc = devclass_find("vgapci"); if (dc != NULL && devclass_get_devices(dc, &devs, &count) == 0) { for (i = 0; i < count; i++) if (device_get_flags(devs[i]) != 0 && x86bios_match_device(vesa_bios_offs, devs[i])) { dev = devs[i]; is_pci = 1; break; } free(devs, M_TEMP); } /* Try VGA if a PCI device is not found. */ if (dev == NULL) { dc = devclass_find(VGA_DRIVER_NAME); if (dc != NULL) dev = devclass_get_device(dc, 0); } if (bootverbose) printf("%s: calling BIOS POST\n", dev == NULL ? "VESA" : device_get_nameunit(dev)); x86bios_init_regs(®s); if (is_pci) { regs.R_AH = pci_get_bus(dev); regs.R_AL = (pci_get_slot(dev) << 3) | (pci_get_function(dev) & 0x07); } regs.R_DL = 0x80; x86bios_call(®s, X86BIOS_PHYSTOSEG(vesa_bios_offs + 3), X86BIOS_PHYSTOOFF(vesa_bios_offs + 3)); if (x86bios_get_intr(0x10) == 0) return (1); return (0); } /* VESA BIOS calls */ static int vesa_bios_get_mode(int mode, struct vesa_mode *vmode, int flags) { x86regs_t regs; uint32_t offs; void *buf; buf = x86bios_alloc(&offs, sizeof(*vmode), flags); if (buf == NULL) return (1); x86bios_init_regs(®s); regs.R_AX = 0x4f01; regs.R_CX = mode; regs.R_ES = X86BIOS_PHYSTOSEG(offs); regs.R_DI = X86BIOS_PHYSTOOFF(offs); x86bios_intr(®s, 0x10); if (regs.R_AX != 0x004f) { x86bios_free(buf, sizeof(*vmode)); return (1); } bcopy(buf, vmode, sizeof(*vmode)); x86bios_free(buf, sizeof(*vmode)); return (0); } static int vesa_bios_set_mode(int mode) { x86regs_t regs; x86bios_init_regs(®s); regs.R_AX = 0x4f02; regs.R_BX = mode; x86bios_intr(®s, 0x10); return (regs.R_AX != 0x004f); } #if 0 static int vesa_bios_get_dac(void) { x86regs_t regs; x86bios_init_regs(®s); regs.R_AX = 0x4f08; regs.R_BL = 1; x86bios_intr(®s, 0x10); if (regs.R_AX != 0x004f) return (6); return (regs.R_BH); } #endif static int vesa_bios_set_dac(int bits) { x86regs_t regs; x86bios_init_regs(®s); regs.R_AX = 0x4f08; /* regs.R_BL = 0; */ regs.R_BH = bits; x86bios_intr(®s, 0x10); if (regs.R_AX != 0x004f) return (6); return (regs.R_BH); } static int vesa_bios_save_palette(int start, int colors, u_char *palette, int bits) { x86regs_t regs; int i; x86bios_init_regs(®s); regs.R_AX = 0x4f09; regs.R_BL = 1; regs.R_CX = colors; regs.R_DX = start; regs.R_ES = X86BIOS_PHYSTOSEG(vesa_palette_offs); regs.R_DI = X86BIOS_PHYSTOOFF(vesa_palette_offs); bits = 8 - bits; mtx_lock(&vesa_lock); x86bios_intr(®s, 0x10); if (regs.R_AX != 0x004f) { mtx_unlock(&vesa_lock); return (1); } for (i = 0; i < colors; ++i) { palette[i * 3] = vesa_palette[i * 4 + 2] << bits; palette[i * 3 + 1] = vesa_palette[i * 4 + 1] << bits; palette[i * 3 + 2] = vesa_palette[i * 4] << bits; } mtx_unlock(&vesa_lock); return (0); } static int vesa_bios_save_palette2(int start, int colors, u_char *r, u_char *g, u_char *b, int bits) { x86regs_t regs; int i; x86bios_init_regs(®s); regs.R_AX = 0x4f09; regs.R_BL = 1; regs.R_CX = colors; regs.R_DX = start; regs.R_ES = X86BIOS_PHYSTOSEG(vesa_palette_offs); regs.R_DI = X86BIOS_PHYSTOOFF(vesa_palette_offs); bits = 8 - bits; mtx_lock(&vesa_lock); x86bios_intr(®s, 0x10); if (regs.R_AX != 0x004f) { mtx_unlock(&vesa_lock); return (1); } for (i = 0; i < colors; ++i) { r[i] = vesa_palette[i * 4 + 2] << bits; g[i] = vesa_palette[i * 4 + 1] << bits; b[i] = vesa_palette[i * 4] << bits; } mtx_unlock(&vesa_lock); return (0); } static int vesa_bios_load_palette(int start, int colors, u_char *palette, int bits) { x86regs_t regs; int i; x86bios_init_regs(®s); regs.R_AX = 0x4f09; /* regs.R_BL = 0; */ regs.R_CX = colors; regs.R_DX = start; regs.R_ES = X86BIOS_PHYSTOSEG(vesa_palette_offs); regs.R_DI = X86BIOS_PHYSTOOFF(vesa_palette_offs); bits = 8 - bits; mtx_lock(&vesa_lock); for (i = 0; i < colors; ++i) { vesa_palette[i * 4] = palette[i * 3 + 2] >> bits; vesa_palette[i * 4 + 1] = palette[i * 3 + 1] >> bits; vesa_palette[i * 4 + 2] = palette[i * 3] >> bits; vesa_palette[i * 4 + 3] = 0; } x86bios_intr(®s, 0x10); mtx_unlock(&vesa_lock); return (regs.R_AX != 0x004f); } static int vesa_bios_load_palette2(int start, int colors, u_char *r, u_char *g, u_char *b, int bits) { x86regs_t regs; int i; x86bios_init_regs(®s); regs.R_AX = 0x4f09; /* regs.R_BL = 0; */ regs.R_CX = colors; regs.R_DX = start; regs.R_ES = X86BIOS_PHYSTOSEG(vesa_palette_offs); regs.R_DI = X86BIOS_PHYSTOOFF(vesa_palette_offs); bits = 8 - bits; mtx_lock(&vesa_lock); for (i = 0; i < colors; ++i) { vesa_palette[i * 4] = b[i] >> bits; vesa_palette[i * 4 + 1] = g[i] >> bits; vesa_palette[i * 4 + 2] = r[i] >> bits; vesa_palette[i * 4 + 3] = 0; } x86bios_intr(®s, 0x10); mtx_unlock(&vesa_lock); return (regs.R_AX != 0x004f); } static size_t vesa_bios_state_buf_size(int state) { x86regs_t regs; x86bios_init_regs(®s); regs.R_AX = 0x4f04; /* regs.R_DL = STATE_SIZE; */ regs.R_CX = state; x86bios_intr(®s, 0x10); if (regs.R_AX != 0x004f) return (0); return (regs.R_BX * 64); } static int vesa_bios_save_restore(int code, void *p) { x86regs_t regs; if (code != STATE_SAVE && code != STATE_LOAD) return (1); x86bios_init_regs(®s); regs.R_AX = 0x4f04; regs.R_DL = code; regs.R_CX = vesa_state; regs.R_ES = X86BIOS_PHYSTOSEG(vesa_state_buf_offs); regs.R_BX = X86BIOS_PHYSTOOFF(vesa_state_buf_offs); mtx_lock(&vesa_lock); switch (code) { case STATE_SAVE: x86bios_intr(®s, 0x10); if (regs.R_AX == 0x004f) bcopy(vesa_state_buf, p, vesa_state_buf_size); break; case STATE_LOAD: bcopy(p, vesa_state_buf, vesa_state_buf_size); x86bios_intr(®s, 0x10); break; } mtx_unlock(&vesa_lock); return (regs.R_AX != 0x004f); } #ifdef MODE_TABLE_BROKEN static int vesa_bios_get_line_length(void) { x86regs_t regs; x86bios_init_regs(®s); regs.R_AX = 0x4f06; regs.R_BL = 1; x86bios_intr(®s, 0x10); if (regs.R_AX != 0x004f) return (-1); return (regs.R_BX); } #endif static int vesa_bios_set_line_length(int pixel, int *bytes, int *lines) { x86regs_t regs; x86bios_init_regs(®s); regs.R_AX = 0x4f06; /* regs.R_BL = 0; */ regs.R_CX = pixel; x86bios_intr(®s, 0x10); #if VESA_DEBUG > 1 printf("bx:%d, cx:%d, dx:%d\n", regs.R_BX, regs.R_CX, regs.R_DX); #endif if (regs.R_AX != 0x004f) return (-1); if (bytes != NULL) *bytes = regs.R_BX; if (lines != NULL) *lines = regs.R_DX; return (0); } #if 0 static int vesa_bios_get_start(int *x, int *y) { x86regs_t regs; x86bios_init_regs(®s); regs.R_AX = 0x4f07; regs.R_BL = 1; x86bios_intr(®s, 0x10); if (regs.R_AX != 0x004f) return (-1); *x = regs.R_CX; *y = regs.R_DX; return (0); } #endif static int vesa_bios_set_start(int x, int y) { x86regs_t regs; x86bios_init_regs(®s); regs.R_AX = 0x4f07; regs.R_BL = 0x80; regs.R_CX = x; regs.R_DX = y; x86bios_intr(®s, 0x10); return (regs.R_AX != 0x004f); } /* map a generic video mode to a known mode */ static int vesa_map_gen_mode_num(int type, int color, int mode) { static struct { int from; int to; } mode_map[] = { { M_TEXT_132x25, M_VESA_C132x25 }, { M_TEXT_132x43, M_VESA_C132x43 }, { M_TEXT_132x50, M_VESA_C132x50 }, { M_TEXT_132x60, M_VESA_C132x60 }, }; int i; for (i = 0; i < nitems(mode_map); ++i) { if (mode_map[i].from == mode) return (mode_map[i].to); } return (mode); } static int vesa_translate_flags(u_int16_t vflags) { static struct { u_int16_t mask; int set; int reset; } ftable[] = { { V_MODECOLOR, V_INFO_COLOR, 0 }, { V_MODEGRAPHICS, V_INFO_GRAPHICS, 0 }, { V_MODELFB, V_INFO_LINEAR, 0 }, { V_MODENONVGA, V_INFO_NONVGA, 0 }, }; int flags; int i; for (flags = 0, i = 0; i < nitems(ftable); ++i) { flags |= (vflags & ftable[i].mask) ? ftable[i].set : ftable[i].reset; } return (flags); } static int vesa_translate_mmodel(u_int8_t vmodel) { static struct { u_int8_t vmodel; int mmodel; } mtable[] = { { V_MMTEXT, V_INFO_MM_TEXT }, { V_MMCGA, V_INFO_MM_CGA }, { V_MMHGC, V_INFO_MM_HGC }, { V_MMEGA, V_INFO_MM_PLANAR }, { V_MMPACKED, V_INFO_MM_PACKED }, { V_MMDIRCOLOR, V_INFO_MM_DIRECT }, }; int i; for (i = 0; mtable[i].mmodel >= 0; ++i) { if (mtable[i].vmodel == vmodel) return (mtable[i].mmodel); } return (V_INFO_MM_OTHER); } static int vesa_get_bpscanline(struct vesa_mode *vmode) { int bpsl; if ((vmode->v_modeattr & V_MODEGRAPHICS) != 0) { /* Find the minimum length. */ switch (vmode->v_bpp / vmode->v_planes) { case 1: bpsl = vmode->v_width / 8; break; case 2: bpsl = vmode->v_width / 4; break; case 4: bpsl = vmode->v_width / 2; break; default: bpsl = vmode->v_width * ((vmode->v_bpp + 7) / 8); bpsl /= vmode->v_planes; break; } /* Use VBE 3.0 information if it looks sane. */ if ((vmode->v_modeattr & V_MODELFB) != 0 && vesa_adp_info->v_version >= 0x0300 && vmode->v_linbpscanline > bpsl) return (vmode->v_linbpscanline); /* Return the minimum if the mode table looks absurd. */ if (vmode->v_bpscanline < bpsl) return (bpsl); } return (vmode->v_bpscanline); } #define VESA_MAXSTR 256 #define VESA_STRCPY(dst, src) do { \ char *str; \ int i; \ dst = malloc(VESA_MAXSTR, M_DEVBUF, M_WAITOK); \ str = x86bios_offset(BIOS_SADDRTOLADDR(src)); \ for (i = 0; i < VESA_MAXSTR - 1 && str[i] != '\0'; i++) \ dst[i] = str[i]; \ dst[i] = '\0'; \ } while (0) static int vesa_bios_init(void) { struct vesa_mode vmode; struct vesa_info *buf; video_info_t *p; x86regs_t regs; size_t bsize; size_t msize; void *vmbuf; uint8_t *vbios; uint32_t offs; uint16_t vers; int is_via_cle266; int modes; int i; if (vesa_init_done) return (0); vesa_bios_offs = VESA_BIOS_OFFSET; /* * If the VBE real mode interrupt vector is not found, try BIOS POST. */ vesa_bios_int10 = x86bios_get_intr(0x10); if (vesa_bios_int10 == 0) { if (vesa_bios_post() != 0) return (1); vesa_bios_int10 = x86bios_get_intr(0x10); if (vesa_bios_int10 == 0) return (1); } /* * Shadow video ROM. */ offs = vesa_bios_int10; if (vesa_shadow_rom) { vbios = x86bios_get_orm(vesa_bios_offs); if (vbios != NULL) { vesa_bios_size = vbios[2] * 512; if (((VESA_BIOS_OFFSET << 12) & 0xffff0000) == (vesa_bios_int10 & 0xffff0000) && vesa_bios_size > (vesa_bios_int10 & 0xffff)) { vesa_bios = x86bios_alloc(&vesa_bios_offs, vesa_bios_size, M_WAITOK); bcopy(vbios, vesa_bios, vesa_bios_size); offs = ((vesa_bios_offs << 12) & 0xffff0000) + (vesa_bios_int10 & 0xffff); x86bios_set_intr(0x10, offs); } } if (vesa_bios == NULL) printf("VESA: failed to shadow video ROM\n"); } if (bootverbose) printf("VESA: INT 0x10 vector 0x%04x:0x%04x\n", (offs >> 16) & 0xffff, offs & 0xffff); x86bios_init_regs(®s); regs.R_AX = 0x4f00; vmbuf = x86bios_alloc(&offs, sizeof(*buf), M_WAITOK); regs.R_ES = X86BIOS_PHYSTOSEG(offs); regs.R_DI = X86BIOS_PHYSTOOFF(offs); bcopy("VBE2", vmbuf, 4); /* try for VBE2 data */ x86bios_intr(®s, 0x10); if (regs.R_AX != 0x004f || bcmp("VESA", vmbuf, 4) != 0) goto fail; vesa_adp_info = buf = malloc(sizeof(*buf), M_DEVBUF, M_WAITOK); bcopy(vmbuf, buf, sizeof(*buf)); if (bootverbose) { printf("VESA: information block\n"); hexdump(buf, sizeof(*buf), NULL, HD_OMIT_CHARS); } vers = buf->v_version = le16toh(buf->v_version); buf->v_oemstr = le32toh(buf->v_oemstr); buf->v_flags = le32toh(buf->v_flags); buf->v_modetable = le32toh(buf->v_modetable); buf->v_memsize = le16toh(buf->v_memsize); buf->v_revision = le16toh(buf->v_revision); buf->v_venderstr = le32toh(buf->v_venderstr); buf->v_prodstr = le32toh(buf->v_prodstr); buf->v_revstr = le32toh(buf->v_revstr); if (vers < 0x0102) { printf("VESA: VBE version %d.%d is not supported; " "version 1.2 or later is required.\n", ((vers & 0xf000) >> 12) * 10 + ((vers & 0x0f00) >> 8), ((vers & 0x00f0) >> 4) * 10 + (vers & 0x000f)); goto fail; } VESA_STRCPY(vesa_oemstr, buf->v_oemstr); if (vers >= 0x0200) { VESA_STRCPY(vesa_venderstr, buf->v_venderstr); VESA_STRCPY(vesa_prodstr, buf->v_prodstr); VESA_STRCPY(vesa_revstr, buf->v_revstr); } is_via_cle266 = strncmp(vesa_oemstr, VESA_VIA_CLE266, sizeof(VESA_VIA_CLE266)) == 0; if (buf->v_modetable == 0) goto fail; msize = (size_t)buf->v_memsize * 64 * 1024; vesa_vmodetab = x86bios_offset(BIOS_SADDRTOLADDR(buf->v_modetable)); for (i = 0, modes = 0; (i < (M_VESA_MODE_MAX - M_VESA_BASE + 1)) && (vesa_vmodetab[i] != 0xffff); ++i) { vesa_vmodetab[i] = le16toh(vesa_vmodetab[i]); if (vesa_bios_get_mode(vesa_vmodetab[i], &vmode, M_WAITOK)) continue; vmode.v_modeattr = le16toh(vmode.v_modeattr); vmode.v_wgran = le16toh(vmode.v_wgran); vmode.v_wsize = le16toh(vmode.v_wsize); vmode.v_waseg = le16toh(vmode.v_waseg); vmode.v_wbseg = le16toh(vmode.v_wbseg); vmode.v_posfunc = le32toh(vmode.v_posfunc); vmode.v_bpscanline = le16toh(vmode.v_bpscanline); vmode.v_width = le16toh(vmode.v_width); vmode.v_height = le16toh(vmode.v_height); vmode.v_lfb = le32toh(vmode.v_lfb); vmode.v_offscreen = le32toh(vmode.v_offscreen); vmode.v_offscreensize = le16toh(vmode.v_offscreensize); vmode.v_linbpscanline = le16toh(vmode.v_linbpscanline); vmode.v_maxpixelclock = le32toh(vmode.v_maxpixelclock); /* reject unsupported modes */ #if 0 if ((vmode.v_modeattr & (V_MODESUPP | V_MODEOPTINFO | V_MODENONVGA)) != (V_MODESUPP | V_MODEOPTINFO)) continue; #else if ((vmode.v_modeattr & V_MODEOPTINFO) == 0) { #if VESA_DEBUG > 1 printf("Rejecting VESA %s mode: %d x %d x %d bpp " " attr = %x\n", vmode.v_modeattr & V_MODEGRAPHICS ? "graphics" : "text", vmode.v_width, vmode.v_height, vmode.v_bpp, vmode.v_modeattr); #endif continue; } #endif bsize = vesa_get_bpscanline(&vmode) * vmode.v_height; if ((vmode.v_modeattr & V_MODEGRAPHICS) != 0) bsize *= vmode.v_planes; /* Does it have enough memory to support this mode? */ if (msize < bsize) { #if VESA_DEBUG > 1 printf("Rejecting VESA %s mode: %d x %d x %d bpp " " attr = %x, not enough memory\n", vmode.v_modeattr & V_MODEGRAPHICS ? "graphics" : "text", vmode.v_width, vmode.v_height, vmode.v_bpp, vmode.v_modeattr); #endif continue; } if (bsize > vesa_vmem_max) vesa_vmem_max = bsize; /* expand the array if necessary */ if (modes >= vesa_vmode_max) { vesa_vmode_max += MODE_TABLE_DELTA; p = malloc(sizeof(*vesa_vmode) * (vesa_vmode_max + 1), M_DEVBUF, M_WAITOK); #if VESA_DEBUG > 1 printf("vesa_bios_init(): modes:%d, vesa_mode_max:%d\n", modes, vesa_vmode_max); #endif if (modes > 0) { bcopy(vesa_vmode, p, sizeof(*vesa_vmode)*modes); free(vesa_vmode, M_DEVBUF); } vesa_vmode = p; } #if VESA_DEBUG > 1 printf("Found VESA %s mode: %d x %d x %d bpp\n", vmode.v_modeattr & V_MODEGRAPHICS ? "graphics" : "text", vmode.v_width, vmode.v_height, vmode.v_bpp); #endif if (is_via_cle266) { if ((vmode.v_width & 0xff00) >> 8 == vmode.v_height - 1) { vmode.v_width &= 0xff; vmode.v_waseg = 0xb8000 >> 4; } } /* copy some fields */ bzero(&vesa_vmode[modes], sizeof(vesa_vmode[modes])); vesa_vmode[modes].vi_mode = vesa_vmodetab[i]; vesa_vmode[modes].vi_width = vmode.v_width; vesa_vmode[modes].vi_height = vmode.v_height; vesa_vmode[modes].vi_depth = vmode.v_bpp; vesa_vmode[modes].vi_planes = vmode.v_planes; vesa_vmode[modes].vi_cwidth = vmode.v_cwidth; vesa_vmode[modes].vi_cheight = vmode.v_cheight; vesa_vmode[modes].vi_window = (vm_offset_t)vmode.v_waseg << 4; /* XXX window B */ vesa_vmode[modes].vi_window_size = vmode.v_wsize * 1024; vesa_vmode[modes].vi_window_gran = vmode.v_wgran * 1024; if (vmode.v_modeattr & V_MODELFB) vesa_vmode[modes].vi_buffer = vmode.v_lfb; vesa_vmode[modes].vi_buffer_size = bsize; vesa_vmode[modes].vi_mem_model = vesa_translate_mmodel(vmode.v_memmodel); switch (vesa_vmode[modes].vi_mem_model) { case V_INFO_MM_DIRECT: if ((vmode.v_modeattr & V_MODELFB) != 0 && vers >= 0x0300) { vesa_vmode[modes].vi_pixel_fields[0] = vmode.v_linredfieldpos; vesa_vmode[modes].vi_pixel_fields[1] = vmode.v_lingreenfieldpos; vesa_vmode[modes].vi_pixel_fields[2] = vmode.v_linbluefieldpos; vesa_vmode[modes].vi_pixel_fields[3] = vmode.v_linresfieldpos; vesa_vmode[modes].vi_pixel_fsizes[0] = vmode.v_linredmasksize; vesa_vmode[modes].vi_pixel_fsizes[1] = vmode.v_lingreenmasksize; vesa_vmode[modes].vi_pixel_fsizes[2] = vmode.v_linbluemasksize; vesa_vmode[modes].vi_pixel_fsizes[3] = vmode.v_linresmasksize; } else { vesa_vmode[modes].vi_pixel_fields[0] = vmode.v_redfieldpos; vesa_vmode[modes].vi_pixel_fields[1] = vmode.v_greenfieldpos; vesa_vmode[modes].vi_pixel_fields[2] = vmode.v_bluefieldpos; vesa_vmode[modes].vi_pixel_fields[3] = vmode.v_resfieldpos; vesa_vmode[modes].vi_pixel_fsizes[0] = vmode.v_redmasksize; vesa_vmode[modes].vi_pixel_fsizes[1] = vmode.v_greenmasksize; vesa_vmode[modes].vi_pixel_fsizes[2] = vmode.v_bluemasksize; vesa_vmode[modes].vi_pixel_fsizes[3] = vmode.v_resmasksize; } /* FALLTHROUGH */ case V_INFO_MM_PACKED: vesa_vmode[modes].vi_pixel_size = (vmode.v_bpp + 7) / 8; break; } vesa_vmode[modes].vi_flags = vesa_translate_flags(vmode.v_modeattr) | V_INFO_VESA; ++modes; } if (vesa_vmode != NULL) vesa_vmode[modes].vi_mode = EOT; if (bootverbose) printf("VESA: %d mode(s) found\n", modes); if (modes == 0) goto fail; x86bios_free(vmbuf, sizeof(*buf)); /* Probe supported save/restore states. */ for (i = 0; i < 4; i++) if (vesa_bios_state_buf_size(1 << i) > 0) vesa_state |= 1 << i; if (vesa_state != 0) vesa_state_buf_size = vesa_bios_state_buf_size(vesa_state); vesa_palette = x86bios_alloc(&vesa_palette_offs, VESA_PALETTE_SIZE + vesa_state_buf_size, M_WAITOK); if (vesa_state_buf_size > 0) { vesa_state_buf = vesa_palette + VESA_PALETTE_SIZE; vesa_state_buf_offs = vesa_palette_offs + VESA_PALETTE_SIZE; } return (0); fail: x86bios_free(vmbuf, sizeof(buf)); vesa_bios_uninit(); return (1); } static void vesa_bios_uninit(void) { if (vesa_bios != NULL) { x86bios_set_intr(0x10, vesa_bios_int10); vesa_bios_offs = VESA_BIOS_OFFSET; x86bios_free(vesa_bios, vesa_bios_size); vesa_bios = NULL; } if (vesa_adp_info != NULL) { free(vesa_adp_info, M_DEVBUF); vesa_adp_info = NULL; } if (vesa_oemstr != NULL) { free(vesa_oemstr, M_DEVBUF); vesa_oemstr = NULL; } if (vesa_venderstr != NULL) { free(vesa_venderstr, M_DEVBUF); vesa_venderstr = NULL; } if (vesa_prodstr != NULL) { free(vesa_prodstr, M_DEVBUF); vesa_prodstr = NULL; } if (vesa_revstr != NULL) { free(vesa_revstr, M_DEVBUF); vesa_revstr = NULL; } if (vesa_vmode != NULL) { free(vesa_vmode, M_DEVBUF); vesa_vmode = NULL; } if (vesa_palette != NULL) { x86bios_free(vesa_palette, VESA_PALETTE_SIZE + vesa_state_buf_size); vesa_palette = NULL; } } static void vesa_clear_modes(video_info_t *info, int color) { while (info->vi_mode != EOT) { if ((info->vi_flags & V_INFO_COLOR) != color) info->vi_mode = NA; ++info; } } /* entry points */ static int vesa_configure(int flags) { video_adapter_t *adp; int adapters; int error; int i; if (vesa_init_done) return (0); if (flags & VIO_PROBE_ONLY) return (0); /* * If the VESA module has already been loaded, abort loading * the module this time. */ for (i = 0; (adp = vid_get_adapter(i)) != NULL; ++i) { if (adp->va_flags & V_ADP_VESA) return (ENXIO); if (adp->va_type == KD_VGA) break; } /* * The VGA adapter is not found. This is because either * 1) the VGA driver has not been initialized, or 2) the VGA card * is not present. If 1) is the case, we shall defer * initialization for now and try again later. */ if (adp == NULL) { - vga_sub_configure = vesa_configure; + vga_sub_configure = vesa_late_load; return (ENODEV); } /* count number of registered adapters */ for (++i; vid_get_adapter(i) != NULL; ++i) ; adapters = i; /* call VESA BIOS */ vesa_adp = adp; if (vesa_bios_init()) { vesa_adp = NULL; return (ENXIO); } vesa_adp->va_flags |= V_ADP_VESA; /* remove conflicting modes if we have more than one adapter */ if (adapters > 1) { vesa_clear_modes(vesa_vmode, (vesa_adp->va_flags & V_ADP_COLOR) ? V_INFO_COLOR : 0); } if ((error = vesa_load_ioctl()) == 0) { prevvidsw = vidsw[vesa_adp->va_index]; vidsw[vesa_adp->va_index] = &vesavidsw; vesa_init_done = TRUE; } else { vesa_adp = NULL; return (error); } return (0); } #if 0 static int vesa_nop(void) { return (0); } #endif static int vesa_error(void) { return (1); } static int vesa_probe(int unit, video_adapter_t **adpp, void *arg, int flags) { return ((*prevvidsw->probe)(unit, adpp, arg, flags)); } static int vesa_init(int unit, video_adapter_t *adp, int flags) { return ((*prevvidsw->init)(unit, adp, flags)); } static int vesa_get_info(video_adapter_t *adp, int mode, video_info_t *info) { int i; if ((*prevvidsw->get_info)(adp, mode, info) == 0) return (0); if (adp != vesa_adp) return (1); mode = vesa_map_gen_mode_num(vesa_adp->va_type, vesa_adp->va_flags & V_ADP_COLOR, mode); for (i = 0; vesa_vmode[i].vi_mode != EOT; ++i) { if (vesa_vmode[i].vi_mode == NA) continue; if (vesa_vmode[i].vi_mode == mode) { *info = vesa_vmode[i]; return (0); } } return (1); } static int vesa_query_mode(video_adapter_t *adp, video_info_t *info) { int i; if ((*prevvidsw->query_mode)(adp, info) == 0) return (0); if (adp != vesa_adp) return (ENODEV); for (i = 0; vesa_vmode[i].vi_mode != EOT; ++i) { if ((info->vi_width != 0) && (info->vi_width != vesa_vmode[i].vi_width)) continue; if ((info->vi_height != 0) && (info->vi_height != vesa_vmode[i].vi_height)) continue; if ((info->vi_cwidth != 0) && (info->vi_cwidth != vesa_vmode[i].vi_cwidth)) continue; if ((info->vi_cheight != 0) && (info->vi_cheight != vesa_vmode[i].vi_cheight)) continue; if ((info->vi_depth != 0) && (info->vi_depth != vesa_vmode[i].vi_depth)) continue; if ((info->vi_planes != 0) && (info->vi_planes != vesa_vmode[i].vi_planes)) continue; /* pixel format, memory model */ if ((info->vi_flags != 0) && (info->vi_flags != vesa_vmode[i].vi_flags)) continue; *info = vesa_vmode[i]; return (0); } return (ENODEV); } static int vesa_set_mode(video_adapter_t *adp, int mode) { video_info_t info; if (adp != vesa_adp) return ((*prevvidsw->set_mode)(adp, mode)); mode = vesa_map_gen_mode_num(adp->va_type, adp->va_flags & V_ADP_COLOR, mode); #if VESA_DEBUG > 0 printf("VESA: set_mode(): %d(%x) -> %d(%x)\n", adp->va_mode, adp->va_mode, mode, mode); #endif /* * If the current mode is a VESA mode and the new mode is not, * restore the state of the adapter first by setting one of the * standard VGA mode, so that non-standard, extended SVGA registers * are set to the state compatible with the standard VGA modes. * Otherwise (*prevvidsw->set_mode)() may not be able to set up * the new mode correctly. */ if (VESA_MODE(adp->va_mode)) { if (!VESA_MODE(mode) && (*prevvidsw->get_info)(adp, mode, &info) == 0) { if ((adp->va_flags & V_ADP_DAC8) != 0) { vesa_bios_set_dac(6); adp->va_flags &= ~V_ADP_DAC8; } int10_set_mode(adp->va_initial_bios_mode); if (adp->va_info.vi_flags & V_INFO_LINEAR) pmap_unmapdev(adp->va_buffer, vesa_vmem_max); /* * Once (*prevvidsw->get_info)() succeeded, * (*prevvidsw->set_mode)() below won't fail... */ } } /* we may not need to handle this mode after all... */ if (!VESA_MODE(mode) && (*prevvidsw->set_mode)(adp, mode) == 0) return (0); /* is the new mode supported? */ if (vesa_get_info(adp, mode, &info)) return (1); /* assert(VESA_MODE(mode)); */ #if VESA_DEBUG > 0 printf("VESA: about to set a VESA mode...\n"); #endif /* don't use the linear frame buffer for text modes. XXX */ if (!(info.vi_flags & V_INFO_GRAPHICS)) info.vi_flags &= ~V_INFO_LINEAR; if ((info.vi_flags & V_INFO_LINEAR) != 0) mode |= 0x4000; if (vesa_bios_set_mode(mode | 0x8000)) return (1); /* Palette format is reset by the above VBE function call. */ adp->va_flags &= ~V_ADP_DAC8; if ((vesa_adp_info->v_flags & V_DAC8) != 0 && (info.vi_flags & V_INFO_GRAPHICS) != 0 && vesa_bios_set_dac(8) > 6) adp->va_flags |= V_ADP_DAC8; if (adp->va_info.vi_flags & V_INFO_LINEAR) pmap_unmapdev(adp->va_buffer, vesa_vmem_max); #if VESA_DEBUG > 0 printf("VESA: mode set!\n"); #endif vesa_adp->va_mode = mode & 0x1ff; /* Mode number is 9-bit. */ vesa_adp->va_flags &= ~V_ADP_COLOR; vesa_adp->va_flags |= (info.vi_flags & V_INFO_COLOR) ? V_ADP_COLOR : 0; vesa_adp->va_crtc_addr = (vesa_adp->va_flags & V_ADP_COLOR) ? COLOR_CRTC : MONO_CRTC; vesa_adp->va_line_width = info.vi_buffer_size / info.vi_height; if ((info.vi_flags & V_INFO_GRAPHICS) != 0) vesa_adp->va_line_width /= info.vi_planes; #ifdef MODE_TABLE_BROKEN /* If VBE function returns bigger bytes per scan line, use it. */ { int bpsl = vesa_bios_get_line_length(); if (bpsl > vesa_adp->va_line_width) { vesa_adp->va_line_width = bpsl; info.vi_buffer_size = bpsl * info.vi_height; if ((info.vi_flags & V_INFO_GRAPHICS) != 0) info.vi_buffer_size *= info.vi_planes; } } #endif if (info.vi_flags & V_INFO_LINEAR) { #if VESA_DEBUG > 1 printf("VESA: setting up LFB\n"); #endif vesa_adp->va_buffer = (vm_offset_t)pmap_mapdev_attr(info.vi_buffer, vesa_vmem_max, PAT_WRITE_COMBINING); vesa_adp->va_window = vesa_adp->va_buffer; vesa_adp->va_window_size = info.vi_buffer_size / info.vi_planes; vesa_adp->va_window_gran = info.vi_buffer_size / info.vi_planes; } else { vesa_adp->va_buffer = 0; vesa_adp->va_window = (vm_offset_t)x86bios_offset(info.vi_window); vesa_adp->va_window_size = info.vi_window_size; vesa_adp->va_window_gran = info.vi_window_gran; } vesa_adp->va_buffer_size = info.vi_buffer_size; vesa_adp->va_window_orig = 0; vesa_adp->va_disp_start.x = 0; vesa_adp->va_disp_start.y = 0; #if VESA_DEBUG > 0 printf("vesa_set_mode(): vi_width:%d, line_width:%d\n", info.vi_width, vesa_adp->va_line_width); #endif bcopy(&info, &vesa_adp->va_info, sizeof(vesa_adp->va_info)); /* move hardware cursor out of the way */ (*vidsw[vesa_adp->va_index]->set_hw_cursor)(vesa_adp, -1, -1); return (0); } static int vesa_save_font(video_adapter_t *adp, int page, int fontsize, int fontwidth, u_char *data, int ch, int count) { return ((*prevvidsw->save_font)(adp, page, fontsize, fontwidth, data, ch, count)); } static int vesa_load_font(video_adapter_t *adp, int page, int fontsize, int fontwidth, u_char *data, int ch, int count) { return ((*prevvidsw->load_font)(adp, page, fontsize, fontwidth, data, ch, count)); } static int vesa_show_font(video_adapter_t *adp, int page) { return ((*prevvidsw->show_font)(adp, page)); } static int vesa_save_palette(video_adapter_t *adp, u_char *palette) { int bits; if (adp == vesa_adp && VESA_MODE(adp->va_mode)) { bits = (adp->va_flags & V_ADP_DAC8) != 0 ? 8 : 6; if (vesa_bios_save_palette(0, 256, palette, bits) == 0) return (0); } return ((*prevvidsw->save_palette)(adp, palette)); } static int vesa_load_palette(video_adapter_t *adp, u_char *palette) { int bits; if (adp == vesa_adp && VESA_MODE(adp->va_mode)) { bits = (adp->va_flags & V_ADP_DAC8) != 0 ? 8 : 6; if (vesa_bios_load_palette(0, 256, palette, bits) == 0) return (0); } return ((*prevvidsw->load_palette)(adp, palette)); } static int vesa_set_border(video_adapter_t *adp, int color) { return ((*prevvidsw->set_border)(adp, color)); } static int vesa_save_state(video_adapter_t *adp, void *p, size_t size) { void *buf; size_t bsize; if (adp != vesa_adp || (size == 0 && vesa_state_buf_size == 0)) return ((*prevvidsw->save_state)(adp, p, size)); bsize = offsetof(adp_state_t, regs) + vesa_state_buf_size; if (size == 0) return (bsize); if (vesa_state_buf_size > 0 && size < bsize) return (EINVAL); if (vesa_vmem_buf != NULL) { free(vesa_vmem_buf, M_DEVBUF); vesa_vmem_buf = NULL; } if (VESA_MODE(adp->va_mode)) { buf = (void *)adp->va_buffer; if (buf != NULL) { bsize = adp->va_buffer_size; vesa_vmem_buf = malloc(bsize, M_DEVBUF, M_NOWAIT); if (vesa_vmem_buf != NULL) bcopy(buf, vesa_vmem_buf, bsize); } } if (vesa_state_buf_size == 0) return ((*prevvidsw->save_state)(adp, p, size)); ((adp_state_t *)p)->sig = V_STATE_SIG; return (vesa_bios_save_restore(STATE_SAVE, ((adp_state_t *)p)->regs)); } static int vesa_load_state(video_adapter_t *adp, void *p) { void *buf; size_t bsize; int error, mode; if (adp != vesa_adp) return ((*prevvidsw->load_state)(adp, p)); /* Try BIOS POST to restore a sane state. */ (void)vesa_bios_post(); bsize = adp->va_buffer_size; mode = adp->va_mode; error = vesa_set_mode(adp, adp->va_initial_mode); if (mode != adp->va_initial_mode) error = vesa_set_mode(adp, mode); if (vesa_vmem_buf != NULL) { if (error == 0 && VESA_MODE(mode)) { buf = (void *)adp->va_buffer; if (buf != NULL) bcopy(vesa_vmem_buf, buf, bsize); } free(vesa_vmem_buf, M_DEVBUF); vesa_vmem_buf = NULL; } if (((adp_state_t *)p)->sig != V_STATE_SIG) return ((*prevvidsw->load_state)(adp, p)); return (vesa_bios_save_restore(STATE_LOAD, ((adp_state_t *)p)->regs)); } #if 0 static int vesa_get_origin(video_adapter_t *adp, off_t *offset) { x86regs_t regs; x86bios_init_regs(®s); regs.R_AX = 0x4f05; regs.R_BL = 0x10; x86bios_intr(®s, 0x10); if (regs.R_AX != 0x004f) return (1); *offset = regs.DX * adp->va_window_gran; return (0); } #endif static int vesa_set_origin(video_adapter_t *adp, off_t offset) { x86regs_t regs; /* * This function should return as quickly as possible to * maintain good performance of the system. For this reason, * error checking is kept minimal and let the VESA BIOS to * detect error. */ if (adp != vesa_adp) return ((*prevvidsw->set_win_org)(adp, offset)); /* if this is a linear frame buffer, do nothing */ if (adp->va_info.vi_flags & V_INFO_LINEAR) return (0); /* XXX */ if (adp->va_window_gran == 0) return (1); x86bios_init_regs(®s); regs.R_AX = 0x4f05; regs.R_DX = offset / adp->va_window_gran; x86bios_intr(®s, 0x10); if (regs.R_AX != 0x004f) return (1); x86bios_init_regs(®s); regs.R_AX = 0x4f05; regs.R_BL = 1; regs.R_DX = offset / adp->va_window_gran; x86bios_intr(®s, 0x10); adp->va_window_orig = rounddown(offset, adp->va_window_gran); return (0); /* XXX */ } static int vesa_read_hw_cursor(video_adapter_t *adp, int *col, int *row) { return ((*prevvidsw->read_hw_cursor)(adp, col, row)); } static int vesa_set_hw_cursor(video_adapter_t *adp, int col, int row) { return ((*prevvidsw->set_hw_cursor)(adp, col, row)); } static int vesa_set_hw_cursor_shape(video_adapter_t *adp, int base, int height, int celsize, int blink) { return ((*prevvidsw->set_hw_cursor_shape)(adp, base, height, celsize, blink)); } static int vesa_blank_display(video_adapter_t *adp, int mode) { /* XXX: use VESA DPMS */ return ((*prevvidsw->blank_display)(adp, mode)); } static int vesa_mmap(video_adapter_t *adp, vm_ooffset_t offset, vm_paddr_t *paddr, int prot, vm_memattr_t *memattr) { #if VESA_DEBUG > 0 printf("vesa_mmap(): window:0x%tx, buffer:0x%tx, offset:0x%jx\n", adp->va_info.vi_window, adp->va_info.vi_buffer, offset); #endif if ((adp == vesa_adp) && (adp->va_info.vi_flags & V_INFO_LINEAR) != 0) { /* va_window_size == va_buffer_size/vi_planes */ /* XXX: is this correct? */ if (offset > adp->va_window_size - PAGE_SIZE) return (-1); *paddr = adp->va_info.vi_buffer + offset; return (0); } return ((*prevvidsw->mmap)(adp, offset, paddr, prot, memattr)); } static int vesa_clear(video_adapter_t *adp) { return ((*prevvidsw->clear)(adp)); } static int vesa_fill_rect(video_adapter_t *adp, int val, int x, int y, int cx, int cy) { return ((*prevvidsw->fill_rect)(adp, val, x, y, cx, cy)); } static int vesa_bitblt(video_adapter_t *adp,...) { /* FIXME */ return (1); } static int get_palette(video_adapter_t *adp, int base, int count, u_char *red, u_char *green, u_char *blue, u_char *trans) { u_char *r; u_char *g; u_char *b; int bits; int error; if (base < 0 || base >= 256 || count < 0 || count > 256) return (1); if ((base + count) > 256) return (1); if (!VESA_MODE(adp->va_mode)) return (1); bits = (adp->va_flags & V_ADP_DAC8) != 0 ? 8 : 6; r = malloc(count * 3, M_DEVBUF, M_WAITOK); g = r + count; b = g + count; error = vesa_bios_save_palette2(base, count, r, g, b, bits); if (error == 0) { copyout(r, red, count); copyout(g, green, count); copyout(b, blue, count); if (trans != NULL) { bzero(r, count); copyout(r, trans, count); } } free(r, M_DEVBUF); return (error); } static int set_palette(video_adapter_t *adp, int base, int count, u_char *red, u_char *green, u_char *blue, u_char *trans) { u_char *r; u_char *g; u_char *b; int bits; int error; if (base < 0 || base >= 256 || count < 0 || count > 256) return (1); if ((base + count) > 256) return (1); if (!VESA_MODE(adp->va_mode)) return (1); bits = (adp->va_flags & V_ADP_DAC8) != 0 ? 8 : 6; r = malloc(count * 3, M_DEVBUF, M_WAITOK); g = r + count; b = g + count; copyin(red, r, count); copyin(green, g, count); copyin(blue, b, count); error = vesa_bios_load_palette2(base, count, r, g, b, bits); free(r, M_DEVBUF); return (error); } static int vesa_ioctl(video_adapter_t *adp, u_long cmd, caddr_t arg) { int bytes; if (adp != vesa_adp) return ((*prevvidsw->ioctl)(adp, cmd, arg)); switch (cmd) { case FBIO_SETWINORG: /* set frame buffer window origin */ if (!VESA_MODE(adp->va_mode)) return (*prevvidsw->ioctl)(adp, cmd, arg); return (vesa_set_origin(adp, *(off_t *)arg) ? ENODEV : 0); case FBIO_SETDISPSTART: /* set display start address */ if (!VESA_MODE(adp->va_mode)) return ((*prevvidsw->ioctl)(adp, cmd, arg)); if (vesa_bios_set_start(((video_display_start_t *)arg)->x, ((video_display_start_t *)arg)->y)) return (ENODEV); adp->va_disp_start.x = ((video_display_start_t *)arg)->x; adp->va_disp_start.y = ((video_display_start_t *)arg)->y; return (0); case FBIO_SETLINEWIDTH: /* set line length in pixel */ if (!VESA_MODE(adp->va_mode)) return ((*prevvidsw->ioctl)(adp, cmd, arg)); if (vesa_bios_set_line_length(*(u_int *)arg, &bytes, NULL)) return (ENODEV); adp->va_line_width = bytes; #if VESA_DEBUG > 1 printf("new line width:%d\n", adp->va_line_width); #endif return (0); case FBIO_GETPALETTE: /* get color palette */ if (get_palette(adp, ((video_color_palette_t *)arg)->index, ((video_color_palette_t *)arg)->count, ((video_color_palette_t *)arg)->red, ((video_color_palette_t *)arg)->green, ((video_color_palette_t *)arg)->blue, ((video_color_palette_t *)arg)->transparent)) return ((*prevvidsw->ioctl)(adp, cmd, arg)); return (0); case FBIO_SETPALETTE: /* set color palette */ if (set_palette(adp, ((video_color_palette_t *)arg)->index, ((video_color_palette_t *)arg)->count, ((video_color_palette_t *)arg)->red, ((video_color_palette_t *)arg)->green, ((video_color_palette_t *)arg)->blue, ((video_color_palette_t *)arg)->transparent)) return ((*prevvidsw->ioctl)(adp, cmd, arg)); return (0); case FBIOGETCMAP: /* get color palette */ if (get_palette(adp, ((struct fbcmap *)arg)->index, ((struct fbcmap *)arg)->count, ((struct fbcmap *)arg)->red, ((struct fbcmap *)arg)->green, ((struct fbcmap *)arg)->blue, NULL)) return ((*prevvidsw->ioctl)(adp, cmd, arg)); return (0); case FBIOPUTCMAP: /* set color palette */ if (set_palette(adp, ((struct fbcmap *)arg)->index, ((struct fbcmap *)arg)->count, ((struct fbcmap *)arg)->red, ((struct fbcmap *)arg)->green, ((struct fbcmap *)arg)->blue, NULL)) return ((*prevvidsw->ioctl)(adp, cmd, arg)); return (0); default: return ((*prevvidsw->ioctl)(adp, cmd, arg)); } } static int vesa_diag(video_adapter_t *adp, int level) { int error; /* call the previous handler first */ error = (*prevvidsw->diag)(adp, level); if (error) return (error); if (adp != vesa_adp) return (1); if (level <= 0) return (0); return (0); } static int vesa_bios_info(int level) { #if VESA_DEBUG > 1 struct vesa_mode vmode; int i; #endif uint16_t vers; vers = vesa_adp_info->v_version; if (bootverbose) { /* general adapter information */ printf( "VESA: v%d.%d, %dk memory, flags:0x%x, mode table:%p (%x)\n", (vers >> 12) * 10 + ((vers & 0x0f00) >> 8), ((vers & 0x00f0) >> 4) * 10 + (vers & 0x000f), vesa_adp_info->v_memsize * 64, vesa_adp_info->v_flags, vesa_vmodetab, vesa_adp_info->v_modetable); /* OEM string */ if (vesa_oemstr != NULL) printf("VESA: %s\n", vesa_oemstr); } if (level <= 0) return (0); if (vers >= 0x0200 && bootverbose) { /* vender name, product name, product revision */ printf("VESA: %s %s %s\n", (vesa_venderstr != NULL) ? vesa_venderstr : "unknown", (vesa_prodstr != NULL) ? vesa_prodstr : "unknown", (vesa_revstr != NULL) ? vesa_revstr : "?"); } #if VESA_DEBUG > 1 /* mode information */ for (i = 0; (i < (M_VESA_MODE_MAX - M_VESA_BASE + 1)) && (vesa_vmodetab[i] != 0xffff); ++i) { if (vesa_bios_get_mode(vesa_vmodetab[i], &vmode, M_NOWAIT)) continue; /* print something for diagnostic purpose */ printf("VESA: mode:0x%03x, flags:0x%04x", vesa_vmodetab[i], vmode.v_modeattr); if (vmode.v_modeattr & V_MODEOPTINFO) { if (vmode.v_modeattr & V_MODEGRAPHICS) { printf(", G %dx%dx%d %d, ", vmode.v_width, vmode.v_height, vmode.v_bpp, vmode.v_planes); } else { printf(", T %dx%d, ", vmode.v_width, vmode.v_height); } printf("font:%dx%d, ", vmode.v_cwidth, vmode.v_cheight); printf("pages:%d, mem:%d", vmode.v_ipages + 1, vmode.v_memmodel); } if (vmode.v_modeattr & V_MODELFB) { printf("\nVESA: LFB:0x%x, off:0x%x, off_size:0x%x", vmode.v_lfb, vmode.v_offscreen, vmode.v_offscreensize*1024); } printf("\n"); printf("VESA: window A:0x%x (%x), window B:0x%x (%x), ", vmode.v_waseg, vmode.v_waattr, vmode.v_wbseg, vmode.v_wbattr); printf("size:%dk, gran:%dk\n", vmode.v_wsize, vmode.v_wgran); } #endif /* VESA_DEBUG > 1 */ return (0); } /* module loading */ static int vesa_load(void) { + + return (vesa_late_load(0)); +} + +/* + * To be called from the vga_sub_configure hook in case the VGA adapter is + * not found when VESA is loaded. + */ +static int +vesa_late_load(int flags) +{ int error; if (vesa_init_done) return (0); mtx_init(&vesa_lock, "VESA lock", NULL, MTX_DEF); /* locate a VGA adapter */ vesa_adp = NULL; - error = vesa_configure(0); + error = vesa_configure(flags); if (error == 0) vesa_bios_info(bootverbose); return (error); } static int vesa_unload(void) { u_char palette[256*3]; int error; /* if the adapter is currently in a VESA mode, don't unload */ if ((vesa_adp != NULL) && VESA_MODE(vesa_adp->va_mode)) return (EBUSY); /* * FIXME: if there is at least one vty which is in a VESA mode, * we shouldn't be unloading! XXX */ if ((error = vesa_unload_ioctl()) == 0) { if (vesa_adp != NULL) { if ((vesa_adp->va_flags & V_ADP_DAC8) != 0) { vesa_bios_save_palette(0, 256, palette, 8); vesa_bios_set_dac(6); vesa_adp->va_flags &= ~V_ADP_DAC8; vesa_bios_load_palette(0, 256, palette, 6); } vesa_adp->va_flags &= ~V_ADP_VESA; vidsw[vesa_adp->va_index] = prevvidsw; } } vesa_bios_uninit(); mtx_destroy(&vesa_lock); return (error); } static int vesa_mod_event(module_t mod, int type, void *data) { switch (type) { case MOD_LOAD: return (vesa_load()); case MOD_UNLOAD: return (vesa_unload()); } return (EOPNOTSUPP); } static moduledata_t vesa_mod = { "vesa", vesa_mod_event, NULL, }; DECLARE_MODULE(vesa, vesa_mod, SI_SUB_DRIVERS, SI_ORDER_MIDDLE); MODULE_DEPEND(vesa, x86bios, 1, 1, 1); #endif /* VGA_NO_MODE_CHANGE */ Index: user/alc/PQ_LAUNDRY/sys/dev/flash/mx25l.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/flash/mx25l.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/flash/mx25l.c (revision 303206) @@ -1,642 +1,645 @@ /*- * Copyright (c) 2006 M. Warner Losh. All rights reserved. * Copyright (c) 2009 Oleksandr Tymoshenko. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_platform.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef FDT #include #include #include #endif #include #include "spibus_if.h" #include #define FL_NONE 0x00 #define FL_ERASE_4K 0x01 #define FL_ERASE_32K 0x02 #define FL_ENABLE_4B_ADDR 0x04 #define FL_DISABLE_4B_ADDR 0x08 /* * Define the sectorsize to be a smaller size rather than the flash * sector size. Trying to run FFS off of a 64k flash sector size * results in a completely un-usable system. */ #define MX25L_SECTORSIZE 512 struct mx25l_flash_ident { const char *name; uint8_t manufacturer_id; uint16_t device_id; unsigned int sectorsize; unsigned int sectorcount; unsigned int flags; }; struct mx25l_softc { device_t sc_dev; uint8_t sc_manufacturer_id; uint16_t sc_device_id; unsigned int sc_sectorsize; struct mtx sc_mtx; struct disk *sc_disk; struct proc *sc_p; struct bio_queue_head sc_bio_queue; unsigned int sc_flags; }; #define M25PXX_LOCK(_sc) mtx_lock(&(_sc)->sc_mtx) #define M25PXX_UNLOCK(_sc) mtx_unlock(&(_sc)->sc_mtx) #define M25PXX_LOCK_INIT(_sc) \ mtx_init(&_sc->sc_mtx, device_get_nameunit(_sc->sc_dev), \ "mx25l", MTX_DEF) #define M25PXX_LOCK_DESTROY(_sc) mtx_destroy(&_sc->sc_mtx); #define M25PXX_ASSERT_LOCKED(_sc) mtx_assert(&_sc->sc_mtx, MA_OWNED); #define M25PXX_ASSERT_UNLOCKED(_sc) mtx_assert(&_sc->sc_mtx, MA_NOTOWNED); /* disk routines */ static int mx25l_open(struct disk *dp); static int mx25l_close(struct disk *dp); static int mx25l_ioctl(struct disk *, u_long, void *, int, struct thread *); static void mx25l_strategy(struct bio *bp); static int mx25l_getattr(struct bio *bp); static void mx25l_task(void *arg); struct mx25l_flash_ident flash_devices[] = { { "en25f32", 0x1c, 0x3116, 64 * 1024, 64, FL_NONE }, { "en25p32", 0x1c, 0x2016, 64 * 1024, 64, FL_NONE }, { "en25p64", 0x1c, 0x2017, 64 * 1024, 128, FL_NONE }, { "en25q64", 0x1c, 0x3017, 64 * 1024, 128, FL_ERASE_4K }, { "m25p64", 0x20, 0x2017, 64 * 1024, 128, FL_NONE }, { "mx25ll32", 0xc2, 0x2016, 64 * 1024, 64, FL_NONE }, { "mx25ll64", 0xc2, 0x2017, 64 * 1024, 128, FL_NONE }, { "mx25ll128", 0xc2, 0x2018, 64 * 1024, 256, FL_ERASE_4K | FL_ERASE_32K }, { "mx25ll256", 0xc2, 0x2019, 64 * 1024, 512, FL_ERASE_4K | FL_ERASE_32K | FL_ENABLE_4B_ADDR }, { "s25fl032", 0x01, 0x0215, 64 * 1024, 64, FL_NONE }, { "s25fl064", 0x01, 0x0216, 64 * 1024, 128, FL_NONE }, { "s25fl128", 0x01, 0x2018, 64 * 1024, 256, FL_NONE }, { "s25fl256s", 0x01, 0x0219, 64 * 1024, 512, FL_NONE }, { "SST25VF032B", 0xbf, 0x254a, 64 * 1024, 64, FL_ERASE_4K | FL_ERASE_32K }, /* Winbond -- w25x "blocks" are 64K, "sectors" are 4KiB */ { "w25x32", 0xef, 0x3016, 64 * 1024, 64, FL_ERASE_4K }, { "w25x64", 0xef, 0x3017, 64 * 1024, 128, FL_ERASE_4K }, { "w25q32", 0xef, 0x4016, 64 * 1024, 64, FL_ERASE_4K }, { "w25q64", 0xef, 0x4017, 64 * 1024, 128, FL_ERASE_4K }, { "w25q64bv", 0xef, 0x4017, 64 * 1024, 128, FL_ERASE_4K }, { "w25q128", 0xef, 0x4018, 64 * 1024, 256, FL_ERASE_4K }, { "w25q256", 0xef, 0x4019, 64 * 1024, 512, FL_ERASE_4K }, /* Atmel */ { "at25df641", 0x1f, 0x4800, 64 * 1024, 128, FL_ERASE_4K }, + + /* GigaDevice */ + { "gd25q64", 0xc8, 0x4017, 64 * 1024, 128, FL_ERASE_4K }, }; static uint8_t mx25l_get_status(device_t dev) { uint8_t txBuf[2], rxBuf[2]; struct spi_command cmd; int err; memset(&cmd, 0, sizeof(cmd)); memset(txBuf, 0, sizeof(txBuf)); memset(rxBuf, 0, sizeof(rxBuf)); txBuf[0] = CMD_READ_STATUS; cmd.tx_cmd = txBuf; cmd.rx_cmd = rxBuf; cmd.rx_cmd_sz = 2; cmd.tx_cmd_sz = 2; err = SPIBUS_TRANSFER(device_get_parent(dev), dev, &cmd); return (rxBuf[1]); } static void mx25l_wait_for_device_ready(device_t dev) { while ((mx25l_get_status(dev) & STATUS_WIP)) continue; } static struct mx25l_flash_ident* mx25l_get_device_ident(struct mx25l_softc *sc) { device_t dev = sc->sc_dev; uint8_t txBuf[8], rxBuf[8]; struct spi_command cmd; uint8_t manufacturer_id; uint16_t dev_id; int err, i; memset(&cmd, 0, sizeof(cmd)); memset(txBuf, 0, sizeof(txBuf)); memset(rxBuf, 0, sizeof(rxBuf)); txBuf[0] = CMD_READ_IDENT; cmd.tx_cmd = &txBuf; cmd.rx_cmd = &rxBuf; /* * Some compatible devices has extended two-bytes ID * We'll use only manufacturer/deviceid atm */ cmd.tx_cmd_sz = 4; cmd.rx_cmd_sz = 4; err = SPIBUS_TRANSFER(device_get_parent(dev), dev, &cmd); if (err) return (NULL); manufacturer_id = rxBuf[1]; dev_id = (rxBuf[2] << 8) | (rxBuf[3]); for (i = 0; i < nitems(flash_devices); i++) { if ((flash_devices[i].manufacturer_id == manufacturer_id) && (flash_devices[i].device_id == dev_id)) return &flash_devices[i]; } printf("Unknown SPI flash device. Vendor: %02x, device id: %04x\n", manufacturer_id, dev_id); return (NULL); } static void mx25l_set_writable(device_t dev, int writable) { uint8_t txBuf[1], rxBuf[1]; struct spi_command cmd; int err; memset(&cmd, 0, sizeof(cmd)); memset(txBuf, 0, sizeof(txBuf)); memset(rxBuf, 0, sizeof(rxBuf)); txBuf[0] = writable ? CMD_WRITE_ENABLE : CMD_WRITE_DISABLE; cmd.tx_cmd = txBuf; cmd.rx_cmd = rxBuf; cmd.rx_cmd_sz = 1; cmd.tx_cmd_sz = 1; err = SPIBUS_TRANSFER(device_get_parent(dev), dev, &cmd); } static void mx25l_erase_cmd(device_t dev, off_t sector, uint8_t ecmd) { struct mx25l_softc *sc; uint8_t txBuf[5], rxBuf[5]; struct spi_command cmd; int err; sc = device_get_softc(dev); mx25l_wait_for_device_ready(dev); mx25l_set_writable(dev, 1); memset(&cmd, 0, sizeof(cmd)); memset(txBuf, 0, sizeof(txBuf)); memset(rxBuf, 0, sizeof(rxBuf)); txBuf[0] = ecmd; cmd.tx_cmd = txBuf; cmd.rx_cmd = rxBuf; if (sc->sc_flags & FL_ENABLE_4B_ADDR) { cmd.rx_cmd_sz = 5; cmd.tx_cmd_sz = 5; txBuf[1] = ((sector >> 24) & 0xff); txBuf[2] = ((sector >> 16) & 0xff); txBuf[3] = ((sector >> 8) & 0xff); txBuf[4] = (sector & 0xff); } else { cmd.rx_cmd_sz = 4; cmd.tx_cmd_sz = 4; txBuf[1] = ((sector >> 16) & 0xff); txBuf[2] = ((sector >> 8) & 0xff); txBuf[3] = (sector & 0xff); } err = SPIBUS_TRANSFER(device_get_parent(dev), dev, &cmd); } static int mx25l_write(device_t dev, off_t offset, caddr_t data, off_t count) { struct mx25l_softc *sc; uint8_t txBuf[8], rxBuf[8]; struct spi_command cmd; off_t write_offset; long bytes_to_write, bytes_writen; device_t pdev; int err = 0; pdev = device_get_parent(dev); sc = device_get_softc(dev); if (sc->sc_flags & FL_ENABLE_4B_ADDR) { cmd.tx_cmd_sz = 5; cmd.rx_cmd_sz = 5; } else { cmd.tx_cmd_sz = 4; cmd.rx_cmd_sz = 4; } bytes_writen = 0; write_offset = offset; /* * Use the erase sectorsize here since blocks are fully erased * first before they're written to. */ if (count % sc->sc_sectorsize != 0 || offset % sc->sc_sectorsize != 0) return (EIO); /* * Assume here that we write per-sector only * and sector size should be 256 bytes aligned */ KASSERT(write_offset % FLASH_PAGE_SIZE == 0, ("offset for BIO_WRITE is not page size (%d bytes) aligned", FLASH_PAGE_SIZE)); /* * Maximum write size for CMD_PAGE_PROGRAM is * FLASH_PAGE_SIZE, so split data to chunks * FLASH_PAGE_SIZE bytes eash and write them * one by one */ while (bytes_writen < count) { /* * If we crossed sector boundary - erase next sector */ if (((offset + bytes_writen) % sc->sc_sectorsize) == 0) mx25l_erase_cmd(dev, offset + bytes_writen, CMD_SECTOR_ERASE); txBuf[0] = CMD_PAGE_PROGRAM; if (sc->sc_flags & FL_ENABLE_4B_ADDR) { txBuf[1] = ((write_offset >> 24) & 0xff); txBuf[2] = ((write_offset >> 16) & 0xff); txBuf[3] = ((write_offset >> 8) & 0xff); txBuf[4] = (write_offset & 0xff); } else { txBuf[1] = ((write_offset >> 16) & 0xff); txBuf[2] = ((write_offset >> 8) & 0xff); txBuf[3] = (write_offset & 0xff); } bytes_to_write = MIN(FLASH_PAGE_SIZE, count - bytes_writen); cmd.tx_cmd = txBuf; cmd.rx_cmd = rxBuf; cmd.tx_data = data + bytes_writen; cmd.tx_data_sz = bytes_to_write; cmd.rx_data = data + bytes_writen; cmd.rx_data_sz = bytes_to_write; /* * Eash completed write operation resets WEL * (write enable latch) to disabled state, * so we re-enable it here */ mx25l_wait_for_device_ready(dev); mx25l_set_writable(dev, 1); err = SPIBUS_TRANSFER(pdev, dev, &cmd); if (err) break; bytes_writen += bytes_to_write; write_offset += bytes_to_write; } return (err); } static int mx25l_read(device_t dev, off_t offset, caddr_t data, off_t count) { struct mx25l_softc *sc; uint8_t txBuf[8], rxBuf[8]; struct spi_command cmd; device_t pdev; int err = 0; pdev = device_get_parent(dev); sc = device_get_softc(dev); /* * Enforce the disk read sectorsize not the erase sectorsize. * In this way, smaller read IO is possible,dramatically * speeding up filesystem/geom_compress access. */ if (count % sc->sc_disk->d_sectorsize != 0 || offset % sc->sc_disk->d_sectorsize != 0) return (EIO); txBuf[0] = CMD_FAST_READ; if (sc->sc_flags & FL_ENABLE_4B_ADDR) { cmd.tx_cmd_sz = 6; cmd.rx_cmd_sz = 6; txBuf[1] = ((offset >> 24) & 0xff); txBuf[2] = ((offset >> 16) & 0xff); txBuf[3] = ((offset >> 8) & 0xff); txBuf[4] = (offset & 0xff); /* Dummy byte */ txBuf[5] = 0; } else { cmd.tx_cmd_sz = 5; cmd.rx_cmd_sz = 5; txBuf[1] = ((offset >> 16) & 0xff); txBuf[2] = ((offset >> 8) & 0xff); txBuf[3] = (offset & 0xff); /* Dummy byte */ txBuf[4] = 0; } cmd.tx_cmd = txBuf; cmd.rx_cmd = rxBuf; cmd.tx_data = data; cmd.tx_data_sz = count; cmd.rx_data = data; cmd.rx_data_sz = count; err = SPIBUS_TRANSFER(pdev, dev, &cmd); return (err); } static int mx25l_set_4b_mode(device_t dev, uint8_t command) { uint8_t txBuf[1], rxBuf[1]; struct spi_command cmd; device_t pdev; int err; memset(&cmd, 0, sizeof(cmd)); memset(txBuf, 0, sizeof(txBuf)); memset(rxBuf, 0, sizeof(rxBuf)); pdev = device_get_parent(dev); cmd.tx_cmd_sz = cmd.rx_cmd_sz = 1; cmd.tx_cmd = txBuf; cmd.rx_cmd = rxBuf; txBuf[0] = command; err = SPIBUS_TRANSFER(pdev, dev, &cmd); mx25l_wait_for_device_ready(dev); return (err); } #ifdef FDT static struct ofw_compat_data compat_data[] = { { "st,m25p", 1 }, { "jedec,spi-nor", 1 }, { NULL, 0 }, }; #endif static int mx25l_probe(device_t dev) { #ifdef FDT int i; if (!ofw_bus_status_okay(dev)) return (ENXIO); /* First try to match the compatible property to the compat_data */ if (ofw_bus_search_compatible(dev, compat_data)->ocd_data == 1) goto found; /* * Next, try to find a compatible device using the names in the * flash_devices structure */ for (i = 0; i < nitems(flash_devices); i++) if (ofw_bus_is_compatible(dev, flash_devices[i].name)) goto found; return (ENXIO); found: #endif device_set_desc(dev, "M25Pxx Flash Family"); return (0); } static int mx25l_attach(device_t dev) { struct mx25l_softc *sc; struct mx25l_flash_ident *ident; sc = device_get_softc(dev); sc->sc_dev = dev; M25PXX_LOCK_INIT(sc); ident = mx25l_get_device_ident(sc); if (ident == NULL) return (ENXIO); mx25l_wait_for_device_ready(sc->sc_dev); sc->sc_disk = disk_alloc(); sc->sc_disk->d_open = mx25l_open; sc->sc_disk->d_close = mx25l_close; sc->sc_disk->d_strategy = mx25l_strategy; sc->sc_disk->d_getattr = mx25l_getattr; sc->sc_disk->d_ioctl = mx25l_ioctl; sc->sc_disk->d_name = "flash/spi"; sc->sc_disk->d_drv1 = sc; sc->sc_disk->d_maxsize = DFLTPHYS; sc->sc_disk->d_sectorsize = MX25L_SECTORSIZE; sc->sc_disk->d_mediasize = ident->sectorsize * ident->sectorcount; sc->sc_disk->d_unit = device_get_unit(sc->sc_dev); sc->sc_disk->d_dump = NULL; /* NB: no dumps */ /* Sectorsize for erase operations */ sc->sc_sectorsize = ident->sectorsize; sc->sc_flags = ident->flags; if (sc->sc_flags & FL_ENABLE_4B_ADDR) mx25l_set_4b_mode(dev, CMD_ENTER_4B_MODE); if (sc->sc_flags & FL_DISABLE_4B_ADDR) mx25l_set_4b_mode(dev, CMD_EXIT_4B_MODE); /* NB: use stripesize to hold the erase/region size for RedBoot */ sc->sc_disk->d_stripesize = ident->sectorsize; disk_create(sc->sc_disk, DISK_VERSION); bioq_init(&sc->sc_bio_queue); kproc_create(&mx25l_task, sc, &sc->sc_p, 0, 0, "task: mx25l flash"); device_printf(sc->sc_dev, "%s, sector %d bytes, %d sectors\n", ident->name, ident->sectorsize, ident->sectorcount); return (0); } static int mx25l_detach(device_t dev) { return (EIO); } static int mx25l_open(struct disk *dp) { return (0); } static int mx25l_close(struct disk *dp) { return (0); } static int mx25l_ioctl(struct disk *dp, u_long cmd, void *data, int fflag, struct thread *td) { return (EINVAL); } static void mx25l_strategy(struct bio *bp) { struct mx25l_softc *sc; sc = (struct mx25l_softc *)bp->bio_disk->d_drv1; M25PXX_LOCK(sc); bioq_disksort(&sc->sc_bio_queue, bp); wakeup(sc); M25PXX_UNLOCK(sc); } static int mx25l_getattr(struct bio *bp) { struct mx25l_softc *sc; device_t dev; if (bp->bio_disk == NULL || bp->bio_disk->d_drv1 == NULL) return (ENXIO); sc = bp->bio_disk->d_drv1; dev = sc->sc_dev; if (strcmp(bp->bio_attribute, "SPI::device") == 0) { if (bp->bio_length != sizeof(dev)) return (EFAULT); bcopy(&dev, bp->bio_data, sizeof(dev)); } else return (-1); return (0); } static void mx25l_task(void *arg) { struct mx25l_softc *sc = (struct mx25l_softc*)arg; struct bio *bp; device_t dev; for (;;) { dev = sc->sc_dev; M25PXX_LOCK(sc); do { bp = bioq_first(&sc->sc_bio_queue); if (bp == NULL) msleep(sc, &sc->sc_mtx, PRIBIO, "jobqueue", 0); } while (bp == NULL); bioq_remove(&sc->sc_bio_queue, bp); M25PXX_UNLOCK(sc); switch (bp->bio_cmd) { case BIO_READ: bp->bio_error = mx25l_read(dev, bp->bio_offset, bp->bio_data, bp->bio_bcount); break; case BIO_WRITE: bp->bio_error = mx25l_write(dev, bp->bio_offset, bp->bio_data, bp->bio_bcount); break; default: bp->bio_error = EINVAL; } biodone(bp); } } static devclass_t mx25l_devclass; static device_method_t mx25l_methods[] = { /* Device interface */ DEVMETHOD(device_probe, mx25l_probe), DEVMETHOD(device_attach, mx25l_attach), DEVMETHOD(device_detach, mx25l_detach), { 0, 0 } }; static driver_t mx25l_driver = { "mx25l", mx25l_methods, sizeof(struct mx25l_softc), }; DRIVER_MODULE(mx25l, spibus, mx25l_driver, mx25l_devclass, 0, 0); Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/include/hyperv.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/include/hyperv.h (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/include/hyperv.h (revision 303206) @@ -1,287 +1,80 @@ /*- * Copyright (c) 2009-2012,2016 Microsoft Corp. * Copyright (c) 2012 NetApp Inc. * Copyright (c) 2012 Citrix Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ /** * HyperV definitions for messages that are sent between instances of the * Channel Management Library in separate partitions, or in some cases, * back to itself. */ #ifndef __HYPERV_H__ #define __HYPERV_H__ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include -typedef uint8_t hv_bool_uint8_t; - -#define HV_S_OK 0x00000000 -#define HV_E_FAIL 0x80004005 -#define HV_ERROR_NOT_SUPPORTED 0x80070032 -#define HV_ERROR_MACHINE_LOCKED 0x800704F7 - -/* - * VMBUS version is 32 bit, upper 16 bit for major_number and lower - * 16 bit for minor_number. - * - * 0.13 -- Windows Server 2008 - * 1.1 -- Windows 7 - * 2.4 -- Windows 8 - * 3.0 -- Windows 8.1 - */ -#define VMBUS_VERSION_WS2008 ((0 << 16) | (13)) -#define VMBUS_VERSION_WIN7 ((1 << 16) | (1)) -#define VMBUS_VERSION_WIN8 ((2 << 16) | (4)) -#define VMBUS_VERSION_WIN8_1 ((3 << 16) | (0)) - -#define VMBUS_VERSION_MAJOR(ver) (((uint32_t)(ver)) >> 16) -#define VMBUS_VERSION_MINOR(ver) (((uint32_t)(ver)) & 0xffff) - struct hyperv_guid { uint8_t hv_guid[16]; } __packed; #define HYPERV_GUID_STRLEN 40 int hyperv_guid2str(const struct hyperv_guid *, char *, size_t); -#define HW_MACADDR_LEN 6 - -/* - * Common defines for Hyper-V ICs - */ -#define HV_ICMSGTYPE_NEGOTIATE 0 -#define HV_ICMSGTYPE_HEARTBEAT 1 -#define HV_ICMSGTYPE_KVPEXCHANGE 2 -#define HV_ICMSGTYPE_SHUTDOWN 3 -#define HV_ICMSGTYPE_TIMESYNC 4 -#define HV_ICMSGTYPE_VSS 5 - -#define HV_ICMSGHDRFLAG_TRANSACTION 1 -#define HV_ICMSGHDRFLAG_REQUEST 2 -#define HV_ICMSGHDRFLAG_RESPONSE 4 - -typedef struct hv_vmbus_pipe_hdr { - uint32_t flags; - uint32_t msgsize; -} __packed hv_vmbus_pipe_hdr; - -typedef struct hv_vmbus_ic_version { - uint16_t major; - uint16_t minor; -} __packed hv_vmbus_ic_version; - -typedef struct hv_vmbus_icmsg_hdr { - hv_vmbus_ic_version icverframe; - uint16_t icmsgtype; - hv_vmbus_ic_version icvermsg; - uint16_t icmsgsize; - uint32_t status; - uint8_t ictransaction_id; - uint8_t icflags; - uint8_t reserved[2]; -} __packed hv_vmbus_icmsg_hdr; - -typedef struct hv_vmbus_icmsg_negotiate { - uint16_t icframe_vercnt; - uint16_t icmsg_vercnt; - uint32_t reserved; - hv_vmbus_ic_version icversion_data[1]; /* any size array */ -} __packed hv_vmbus_icmsg_negotiate; - -typedef struct hv_vmbus_shutdown_msg_data { - uint32_t reason_code; - uint32_t timeout_seconds; - uint32_t flags; - uint8_t display_message[2048]; -} __packed hv_vmbus_shutdown_msg_data; - -typedef struct hv_vmbus_heartbeat_msg_data { - uint64_t seq_num; - uint32_t reserved[8]; -} __packed hv_vmbus_heartbeat_msg_data; - -typedef struct { - /* - * offset in bytes from the start of ring data below - */ - volatile uint32_t write_index; - /* - * offset in bytes from the start of ring data below - */ - volatile uint32_t read_index; - /* - * NOTE: The interrupt_mask field is used only for channels, but - * vmbus connection also uses this data structure - */ - volatile uint32_t interrupt_mask; - /* pad it to PAGE_SIZE so that data starts on a page */ - uint8_t reserved[4084]; - - /* - * WARNING: Ring data starts here - * !!! DO NOT place any fields below this !!! - */ - uint8_t buffer[0]; /* doubles as interrupt mask */ -} __packed hv_vmbus_ring_buffer; - -typedef struct { - hv_vmbus_ring_buffer* ring_buffer; - struct mtx ring_lock; - uint32_t ring_data_size; /* ring_size */ -} hv_vmbus_ring_buffer_info; - -typedef void (*vmbus_chan_callback_t)(void *); - -typedef struct hv_vmbus_channel { - device_t ch_dev; - struct vmbus_softc *vmbus_sc; - uint32_t ch_flags; /* VMBUS_CHAN_FLAG_ */ - uint32_t ch_id; /* channel id */ - - /* - * These are based on the offer_msg.monitor_id. - * Save it here for easy access. - */ - int ch_montrig_idx; /* MNF trig index */ - uint32_t ch_montrig_mask;/* MNF trig mask */ - - /* - * send to parent - */ - hv_vmbus_ring_buffer_info outbound; - /* - * receive from parent - */ - hv_vmbus_ring_buffer_info inbound; - - struct taskqueue *ch_tq; - struct task ch_task; - vmbus_chan_callback_t ch_cb; - void *ch_cbarg; - - struct hyperv_mon_param *ch_monprm; - struct hyperv_dma ch_monprm_dma; - - int ch_cpuid; /* owner cpu */ - /* - * Virtual cpuid for ch_cpuid; it is used to communicate cpuid - * related information w/ Hyper-V. If MSR_HV_VP_INDEX does not - * exist, ch_vcpuid will always be 0 for compatibility. - */ - uint32_t ch_vcpuid; - - /* - * If this is a primary channel, ch_subchan* fields - * contain sub-channels belonging to this primary - * channel. - */ - struct mtx ch_subchan_lock; - TAILQ_HEAD(, hv_vmbus_channel) ch_subchans; - int ch_subchan_cnt; - - /* If this is a sub-channel */ - TAILQ_ENTRY(hv_vmbus_channel) ch_sublink; /* sub-channel link */ - struct hv_vmbus_channel *ch_prichan; /* owner primary chan */ - - /* - * Driver private data - */ - void *hv_chan_priv1; - void *hv_chan_priv2; - void *hv_chan_priv3; - - void *ch_bufring; /* TX+RX bufrings */ - struct hyperv_dma ch_bufring_dma; - uint32_t ch_bufring_gpadl; - - struct task ch_detach_task; - TAILQ_ENTRY(hv_vmbus_channel) ch_prilink; /* primary chan link */ - uint32_t ch_subidx; /* subchan index */ - volatile uint32_t ch_stflags; /* atomic-op */ - /* VMBUS_CHAN_ST_ */ - struct hyperv_guid ch_guid_type; - struct hyperv_guid ch_guid_inst; - - struct sysctl_ctx_list ch_sysctl_ctx; -} hv_vmbus_channel; - -#define VMBUS_CHAN_ISPRIMARY(chan) ((chan)->ch_subidx == 0) - -#define VMBUS_CHAN_FLAG_HASMNF 0x0001 -/* - * If this flag is set, this channel's interrupt will be masked in ISR, - * and the RX bufring will be drained before this channel's interrupt is - * unmasked. - * - * This flag is turned on by default. Drivers can turn it off according - * to their own requirement. - */ -#define VMBUS_CHAN_FLAG_BATCHREAD 0x0002 - -#define VMBUS_CHAN_ST_OPENED_SHIFT 0 -#define VMBUS_CHAN_ST_OPENED (1 << VMBUS_CHAN_ST_OPENED_SHIFT) - /** * @brief Get physical address from virtual */ static inline unsigned long hv_get_phys_addr(void *virt) { unsigned long ret; ret = (vtophys(virt) | ((vm_offset_t) virt & PAGE_MASK)); return (ret); -} - -static __inline struct hv_vmbus_channel * -vmbus_get_channel(device_t dev) -{ - return device_get_ivars(dev); } #endif /* __HYPERV_H__ */ Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/include/vmbus.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/include/vmbus.h (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/include/vmbus.h (revision 303206) @@ -1,127 +1,159 @@ /*- * Copyright (c) 2016 Microsoft Corp. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _VMBUS_H_ #define _VMBUS_H_ #include /* + * VMBUS version is 32 bit, upper 16 bit for major_number and lower + * 16 bit for minor_number. + * + * 0.13 -- Windows Server 2008 + * 1.1 -- Windows 7 + * 2.4 -- Windows 8 + * 3.0 -- Windows 8.1 + */ +#define VMBUS_VERSION_WS2008 ((0 << 16) | (13)) +#define VMBUS_VERSION_WIN7 ((1 << 16) | (1)) +#define VMBUS_VERSION_WIN8 ((2 << 16) | (4)) +#define VMBUS_VERSION_WIN8_1 ((3 << 16) | (0)) + +#define VMBUS_VERSION_MAJOR(ver) (((uint32_t)(ver)) >> 16) +#define VMBUS_VERSION_MINOR(ver) (((uint32_t)(ver)) & 0xffff) + +/* * GPA stuffs. */ struct vmbus_gpa_range { uint32_t gpa_len; uint32_t gpa_ofs; uint64_t gpa_page[0]; } __packed; /* This is actually vmbus_gpa_range.gpa_page[1] */ struct vmbus_gpa { uint32_t gpa_len; uint32_t gpa_ofs; uint64_t gpa_page; } __packed; #define VMBUS_CHANPKT_SIZE_SHIFT 3 #define VMBUS_CHANPKT_GETLEN(pktlen) \ (((int)(pktlen)) << VMBUS_CHANPKT_SIZE_SHIFT) struct vmbus_chanpkt_hdr { uint16_t cph_type; /* VMBUS_CHANPKT_TYPE_ */ uint16_t cph_hlen; /* header len, in 8 bytes */ uint16_t cph_tlen; /* total len, in 8 bytes */ uint16_t cph_flags; /* VMBUS_CHANPKT_FLAG_ */ uint64_t cph_xactid; } __packed; #define VMBUS_CHANPKT_TYPE_INBAND 0x0006 #define VMBUS_CHANPKT_TYPE_RXBUF 0x0007 #define VMBUS_CHANPKT_TYPE_GPA 0x0009 #define VMBUS_CHANPKT_TYPE_COMP 0x000b #define VMBUS_CHANPKT_FLAG_RC 0x0001 /* report completion */ #define VMBUS_CHANPKT_CONST_DATA(pkt) \ (const void *)((const uint8_t *)(pkt) + \ VMBUS_CHANPKT_GETLEN((pkt)->cph_hlen)) struct vmbus_rxbuf_desc { uint32_t rb_len; uint32_t rb_ofs; } __packed; struct vmbus_chanpkt_rxbuf { struct vmbus_chanpkt_hdr cp_hdr; uint16_t cp_rxbuf_id; uint16_t cp_rsvd; uint32_t cp_rxbuf_cnt; struct vmbus_rxbuf_desc cp_rxbuf[]; } __packed; #define VMBUS_CHAN_SGLIST_MAX 32 #define VMBUS_CHAN_PRPLIST_MAX 32 -struct hv_vmbus_channel; +struct vmbus_channel; +struct hyperv_guid; -int vmbus_chan_open(struct hv_vmbus_channel *chan, +typedef void (*vmbus_chan_callback_t)(struct vmbus_channel *, void *); + +static __inline struct vmbus_channel * +vmbus_get_channel(device_t dev) +{ + return device_get_ivars(dev); +} + +int vmbus_chan_open(struct vmbus_channel *chan, int txbr_size, int rxbr_size, const void *udata, int udlen, vmbus_chan_callback_t cb, void *cbarg); -void vmbus_chan_close(struct hv_vmbus_channel *chan); +void vmbus_chan_close(struct vmbus_channel *chan); -int vmbus_chan_gpadl_connect(struct hv_vmbus_channel *chan, +int vmbus_chan_gpadl_connect(struct vmbus_channel *chan, bus_addr_t paddr, int size, uint32_t *gpadl); -int vmbus_chan_gpadl_disconnect(struct hv_vmbus_channel *chan, +int vmbus_chan_gpadl_disconnect(struct vmbus_channel *chan, uint32_t gpadl); -void vmbus_chan_cpu_set(struct hv_vmbus_channel *chan, int cpu); -void vmbus_chan_cpu_rr(struct hv_vmbus_channel *chan); -struct hv_vmbus_channel * - vmbus_chan_cpu2chan(struct hv_vmbus_channel *chan, int cpu); -void vmbus_chan_set_readbatch(struct hv_vmbus_channel *chan, bool on); +void vmbus_chan_cpu_set(struct vmbus_channel *chan, int cpu); +void vmbus_chan_cpu_rr(struct vmbus_channel *chan); +struct vmbus_channel * + vmbus_chan_cpu2chan(struct vmbus_channel *chan, int cpu); +void vmbus_chan_set_readbatch(struct vmbus_channel *chan, bool on); -struct hv_vmbus_channel ** - vmbus_subchan_get(struct hv_vmbus_channel *pri_chan, int subchan_cnt); -void vmbus_subchan_rel(struct hv_vmbus_channel **subchan, int subchan_cnt); -void vmbus_subchan_drain(struct hv_vmbus_channel *pri_chan); +struct vmbus_channel ** + vmbus_subchan_get(struct vmbus_channel *pri_chan, int subchan_cnt); +void vmbus_subchan_rel(struct vmbus_channel **subchan, int subchan_cnt); +void vmbus_subchan_drain(struct vmbus_channel *pri_chan); -int vmbus_chan_recv(struct hv_vmbus_channel *chan, void *data, int *dlen, +int vmbus_chan_recv(struct vmbus_channel *chan, void *data, int *dlen, uint64_t *xactid); -int vmbus_chan_recv_pkt(struct hv_vmbus_channel *chan, +int vmbus_chan_recv_pkt(struct vmbus_channel *chan, struct vmbus_chanpkt_hdr *pkt, int *pktlen); -int vmbus_chan_send(struct hv_vmbus_channel *chan, uint16_t type, +int vmbus_chan_send(struct vmbus_channel *chan, uint16_t type, uint16_t flags, void *data, int dlen, uint64_t xactid); -int vmbus_chan_send_sglist(struct hv_vmbus_channel *chan, +int vmbus_chan_send_sglist(struct vmbus_channel *chan, struct vmbus_gpa sg[], int sglen, void *data, int dlen, uint64_t xactid); -int vmbus_chan_send_prplist(struct hv_vmbus_channel *chan, +int vmbus_chan_send_prplist(struct vmbus_channel *chan, struct vmbus_gpa_range *prp, int prp_cnt, void *data, int dlen, uint64_t xactid); + +uint32_t vmbus_chan_id(const struct vmbus_channel *chan); +uint32_t vmbus_chan_subidx(const struct vmbus_channel *chan); +bool vmbus_chan_is_primary(const struct vmbus_channel *chan); +const struct hyperv_guid * + vmbus_chan_guid_inst(const struct vmbus_channel *chan); #endif /* !_VMBUS_H_ */ Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_net_vsc.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_net_vsc.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_net_vsc.c (revision 303206) @@ -1,1044 +1,1039 @@ /*- * Copyright (c) 2009-2012,2016 Microsoft Corp. * Copyright (c) 2010-2012 Citrix Inc. * Copyright (c) 2012 NetApp Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ /** * HyperV vmbus network VSC (virtual services client) module * */ #include #include #include #include #include #include #include #include #include #include #include "hv_net_vsc.h" #include "hv_rndis.h" #include "hv_rndis_filter.h" -/* priv1 and priv2 are consumed by the main driver */ -#define hv_chan_rdbuf hv_chan_priv3 - MALLOC_DEFINE(M_NETVSC, "netvsc", "Hyper-V netvsc driver"); /* * Forward declarations */ -static void hv_nv_on_channel_callback(void *xchan); +static void hv_nv_on_channel_callback(struct vmbus_channel *chan, + void *xrxr); static int hv_nv_init_send_buffer_with_net_vsp(struct hn_softc *sc); static int hv_nv_init_rx_buffer_with_net_vsp(struct hn_softc *); static int hv_nv_destroy_send_buffer(netvsc_dev *net_dev); static int hv_nv_destroy_rx_buffer(netvsc_dev *net_dev); static int hv_nv_connect_to_vsp(struct hn_softc *sc); static void hv_nv_on_send_completion(netvsc_dev *net_dev, - struct hv_vmbus_channel *, const struct vmbus_chanpkt_hdr *pkt); -static void hv_nv_on_receive_completion(struct hv_vmbus_channel *chan, + struct vmbus_channel *, const struct vmbus_chanpkt_hdr *pkt); +static void hv_nv_on_receive_completion(struct vmbus_channel *chan, uint64_t tid, uint32_t status); static void hv_nv_on_receive(netvsc_dev *net_dev, - struct hn_softc *sc, struct hv_vmbus_channel *chan, + struct hn_rx_ring *rxr, struct vmbus_channel *chan, const struct vmbus_chanpkt_hdr *pkt); /* * */ static inline netvsc_dev * hv_nv_alloc_net_device(struct hn_softc *sc) { netvsc_dev *net_dev; net_dev = malloc(sizeof(netvsc_dev), M_NETVSC, M_WAITOK | M_ZERO); net_dev->sc = sc; net_dev->destroy = FALSE; sc->net_dev = net_dev; return (net_dev); } /* * XXX unnecessary; nuke it. */ static inline netvsc_dev * hv_nv_get_outbound_net_device(struct hn_softc *sc) { return sc->net_dev; } /* * XXX unnecessary; nuke it. */ static inline netvsc_dev * hv_nv_get_inbound_net_device(struct hn_softc *sc) { return sc->net_dev; } int hv_nv_get_next_send_section(netvsc_dev *net_dev) { unsigned long bitsmap_words = net_dev->bitsmap_words; unsigned long *bitsmap = net_dev->send_section_bitsmap; unsigned long idx; int ret = NVSP_1_CHIMNEY_SEND_INVALID_SECTION_INDEX; int i; for (i = 0; i < bitsmap_words; i++) { idx = ffsl(~bitsmap[i]); if (0 == idx) continue; idx--; KASSERT(i * BITS_PER_LONG + idx < net_dev->send_section_count, ("invalid i %d and idx %lu", i, idx)); if (atomic_testandset_long(&bitsmap[i], idx)) continue; ret = i * BITS_PER_LONG + idx; break; } return (ret); } /* * Net VSC initialize receive buffer with net VSP * * Net VSP: Network virtual services client, also known as the * Hyper-V extensible switch and the synthetic data path. */ static int hv_nv_init_rx_buffer_with_net_vsp(struct hn_softc *sc) { netvsc_dev *net_dev; nvsp_msg *init_pkt; int ret = 0; net_dev = hv_nv_get_outbound_net_device(sc); if (!net_dev) { return (ENODEV); } net_dev->rx_buf = hyperv_dmamem_alloc(bus_get_dma_tag(sc->hn_dev), PAGE_SIZE, 0, net_dev->rx_buf_size, &net_dev->rxbuf_dma, BUS_DMA_WAITOK | BUS_DMA_ZERO); if (net_dev->rx_buf == NULL) { device_printf(sc->hn_dev, "allocate rxbuf failed\n"); return ENOMEM; } /* * Connect the RXBUF GPADL to the primary channel. * * NOTE: * Only primary channel has RXBUF connected to it. Sub-channels * just share this RXBUF. */ ret = vmbus_chan_gpadl_connect(sc->hn_prichan, net_dev->rxbuf_dma.hv_paddr, net_dev->rx_buf_size, &net_dev->rx_buf_gpadl_handle); if (ret != 0) { device_printf(sc->hn_dev, "rxbuf gpadl connect failed: %d\n", ret); goto cleanup; } /* sema_wait(&ext->channel_init_sema); KYS CHECK */ /* Notify the NetVsp of the gpadl handle */ init_pkt = &net_dev->channel_init_packet; memset(init_pkt, 0, sizeof(nvsp_msg)); init_pkt->hdr.msg_type = nvsp_msg_1_type_send_rx_buf; init_pkt->msgs.vers_1_msgs.send_rx_buf.gpadl_handle = net_dev->rx_buf_gpadl_handle; init_pkt->msgs.vers_1_msgs.send_rx_buf.id = NETVSC_RECEIVE_BUFFER_ID; /* Send the gpadl notification request */ ret = vmbus_chan_send(sc->hn_prichan, VMBUS_CHANPKT_TYPE_INBAND, VMBUS_CHANPKT_FLAG_RC, init_pkt, sizeof(nvsp_msg), (uint64_t)(uintptr_t)init_pkt); if (ret != 0) { goto cleanup; } sema_wait(&net_dev->channel_init_sema); /* Check the response */ if (init_pkt->msgs.vers_1_msgs.send_rx_buf_complete.status != nvsp_status_success) { ret = EINVAL; goto cleanup; } net_dev->rx_section_count = init_pkt->msgs.vers_1_msgs.send_rx_buf_complete.num_sections; net_dev->rx_sections = malloc(net_dev->rx_section_count * sizeof(nvsp_1_rx_buf_section), M_NETVSC, M_WAITOK); memcpy(net_dev->rx_sections, init_pkt->msgs.vers_1_msgs.send_rx_buf_complete.sections, net_dev->rx_section_count * sizeof(nvsp_1_rx_buf_section)); /* * For first release, there should only be 1 section that represents * the entire receive buffer */ if (net_dev->rx_section_count != 1 || net_dev->rx_sections->offset != 0) { ret = EINVAL; goto cleanup; } goto exit; cleanup: hv_nv_destroy_rx_buffer(net_dev); exit: return (ret); } /* * Net VSC initialize send buffer with net VSP */ static int hv_nv_init_send_buffer_with_net_vsp(struct hn_softc *sc) { netvsc_dev *net_dev; nvsp_msg *init_pkt; int ret = 0; net_dev = hv_nv_get_outbound_net_device(sc); if (!net_dev) { return (ENODEV); } net_dev->send_buf = hyperv_dmamem_alloc(bus_get_dma_tag(sc->hn_dev), PAGE_SIZE, 0, net_dev->send_buf_size, &net_dev->txbuf_dma, BUS_DMA_WAITOK | BUS_DMA_ZERO); if (net_dev->send_buf == NULL) { device_printf(sc->hn_dev, "allocate chimney txbuf failed\n"); return ENOMEM; } /* * Connect chimney sending buffer GPADL to the primary channel. * * NOTE: * Only primary channel has chimney sending buffer connected to it. * Sub-channels just share this chimney sending buffer. */ ret = vmbus_chan_gpadl_connect(sc->hn_prichan, net_dev->txbuf_dma.hv_paddr, net_dev->send_buf_size, &net_dev->send_buf_gpadl_handle); if (ret != 0) { device_printf(sc->hn_dev, "chimney sending buffer gpadl " "connect failed: %d\n", ret); goto cleanup; } /* Notify the NetVsp of the gpadl handle */ init_pkt = &net_dev->channel_init_packet; memset(init_pkt, 0, sizeof(nvsp_msg)); init_pkt->hdr.msg_type = nvsp_msg_1_type_send_send_buf; init_pkt->msgs.vers_1_msgs.send_rx_buf.gpadl_handle = net_dev->send_buf_gpadl_handle; init_pkt->msgs.vers_1_msgs.send_rx_buf.id = NETVSC_SEND_BUFFER_ID; /* Send the gpadl notification request */ ret = vmbus_chan_send(sc->hn_prichan, VMBUS_CHANPKT_TYPE_INBAND, VMBUS_CHANPKT_FLAG_RC, init_pkt, sizeof(nvsp_msg), (uint64_t)init_pkt); if (ret != 0) { goto cleanup; } sema_wait(&net_dev->channel_init_sema); /* Check the response */ if (init_pkt->msgs.vers_1_msgs.send_send_buf_complete.status != nvsp_status_success) { ret = EINVAL; goto cleanup; } net_dev->send_section_size = init_pkt->msgs.vers_1_msgs.send_send_buf_complete.section_size; net_dev->send_section_count = net_dev->send_buf_size / net_dev->send_section_size; net_dev->bitsmap_words = howmany(net_dev->send_section_count, BITS_PER_LONG); net_dev->send_section_bitsmap = malloc(net_dev->bitsmap_words * sizeof(long), M_NETVSC, M_WAITOK | M_ZERO); goto exit; cleanup: hv_nv_destroy_send_buffer(net_dev); exit: return (ret); } /* * Net VSC destroy receive buffer */ static int hv_nv_destroy_rx_buffer(netvsc_dev *net_dev) { nvsp_msg *revoke_pkt; int ret = 0; /* * If we got a section count, it means we received a * send_rx_buf_complete msg * (ie sent nvsp_msg_1_type_send_rx_buf msg) therefore, * we need to send a revoke msg here */ if (net_dev->rx_section_count) { /* Send the revoke receive buffer */ revoke_pkt = &net_dev->revoke_packet; memset(revoke_pkt, 0, sizeof(nvsp_msg)); revoke_pkt->hdr.msg_type = nvsp_msg_1_type_revoke_rx_buf; revoke_pkt->msgs.vers_1_msgs.revoke_rx_buf.id = NETVSC_RECEIVE_BUFFER_ID; ret = vmbus_chan_send(net_dev->sc->hn_prichan, VMBUS_CHANPKT_TYPE_INBAND, 0, revoke_pkt, sizeof(nvsp_msg), (uint64_t)(uintptr_t)revoke_pkt); /* * If we failed here, we might as well return and have a leak * rather than continue and a bugchk */ if (ret != 0) { return (ret); } } /* Tear down the gpadl on the vsp end */ if (net_dev->rx_buf_gpadl_handle) { ret = vmbus_chan_gpadl_disconnect(net_dev->sc->hn_prichan, net_dev->rx_buf_gpadl_handle); /* * If we failed here, we might as well return and have a leak * rather than continue and a bugchk */ if (ret != 0) { return (ret); } net_dev->rx_buf_gpadl_handle = 0; } if (net_dev->rx_buf) { /* Free up the receive buffer */ hyperv_dmamem_free(&net_dev->rxbuf_dma, net_dev->rx_buf); net_dev->rx_buf = NULL; } if (net_dev->rx_sections) { free(net_dev->rx_sections, M_NETVSC); net_dev->rx_sections = NULL; net_dev->rx_section_count = 0; } return (ret); } /* * Net VSC destroy send buffer */ static int hv_nv_destroy_send_buffer(netvsc_dev *net_dev) { nvsp_msg *revoke_pkt; int ret = 0; /* * If we got a section count, it means we received a * send_rx_buf_complete msg * (ie sent nvsp_msg_1_type_send_rx_buf msg) therefore, * we need to send a revoke msg here */ if (net_dev->send_section_size) { /* Send the revoke send buffer */ revoke_pkt = &net_dev->revoke_packet; memset(revoke_pkt, 0, sizeof(nvsp_msg)); revoke_pkt->hdr.msg_type = nvsp_msg_1_type_revoke_send_buf; revoke_pkt->msgs.vers_1_msgs.revoke_send_buf.id = NETVSC_SEND_BUFFER_ID; ret = vmbus_chan_send(net_dev->sc->hn_prichan, VMBUS_CHANPKT_TYPE_INBAND, 0, revoke_pkt, sizeof(nvsp_msg), (uint64_t)(uintptr_t)revoke_pkt); /* * If we failed here, we might as well return and have a leak * rather than continue and a bugchk */ if (ret != 0) { return (ret); } } /* Tear down the gpadl on the vsp end */ if (net_dev->send_buf_gpadl_handle) { ret = vmbus_chan_gpadl_disconnect(net_dev->sc->hn_prichan, net_dev->send_buf_gpadl_handle); /* * If we failed here, we might as well return and have a leak * rather than continue and a bugchk */ if (ret != 0) { return (ret); } net_dev->send_buf_gpadl_handle = 0; } if (net_dev->send_buf) { /* Free up the receive buffer */ hyperv_dmamem_free(&net_dev->txbuf_dma, net_dev->send_buf); net_dev->send_buf = NULL; } if (net_dev->send_section_bitsmap) { free(net_dev->send_section_bitsmap, M_NETVSC); } return (ret); } /* * Attempt to negotiate the caller-specified NVSP version * * For NVSP v2, Server 2008 R2 does not set * init_pkt->msgs.init_msgs.init_compl.negotiated_prot_vers * to the negotiated version, so we cannot rely on that. */ static int hv_nv_negotiate_nvsp_protocol(struct hn_softc *sc, netvsc_dev *net_dev, uint32_t nvsp_ver) { nvsp_msg *init_pkt; int ret; init_pkt = &net_dev->channel_init_packet; memset(init_pkt, 0, sizeof(nvsp_msg)); init_pkt->hdr.msg_type = nvsp_msg_type_init; /* * Specify parameter as the only acceptable protocol version */ init_pkt->msgs.init_msgs.init.p1.protocol_version = nvsp_ver; init_pkt->msgs.init_msgs.init.protocol_version_2 = nvsp_ver; /* Send the init request */ ret = vmbus_chan_send(sc->hn_prichan, VMBUS_CHANPKT_TYPE_INBAND, VMBUS_CHANPKT_FLAG_RC, init_pkt, sizeof(nvsp_msg), (uint64_t)(uintptr_t)init_pkt); if (ret != 0) return (-1); sema_wait(&net_dev->channel_init_sema); if (init_pkt->msgs.init_msgs.init_compl.status != nvsp_status_success) return (EINVAL); return (0); } /* * Send NDIS version 2 config packet containing MTU. * * Not valid for NDIS version 1. */ static int hv_nv_send_ndis_config(struct hn_softc *sc, uint32_t mtu) { netvsc_dev *net_dev; nvsp_msg *init_pkt; int ret; net_dev = hv_nv_get_outbound_net_device(sc); if (!net_dev) return (-ENODEV); /* * Set up configuration packet, write MTU * Indicate we are capable of handling VLAN tags */ init_pkt = &net_dev->channel_init_packet; memset(init_pkt, 0, sizeof(nvsp_msg)); init_pkt->hdr.msg_type = nvsp_msg_2_type_send_ndis_config; init_pkt->msgs.vers_2_msgs.send_ndis_config.mtu = mtu; init_pkt-> msgs.vers_2_msgs.send_ndis_config.capabilities.u1.u2.ieee8021q = 1; /* Send the configuration packet */ ret = vmbus_chan_send(sc->hn_prichan, VMBUS_CHANPKT_TYPE_INBAND, 0, init_pkt, sizeof(nvsp_msg), (uint64_t)(uintptr_t)init_pkt); if (ret != 0) return (-EINVAL); return (0); } /* * Net VSC connect to VSP */ static int hv_nv_connect_to_vsp(struct hn_softc *sc) { netvsc_dev *net_dev; nvsp_msg *init_pkt; uint32_t ndis_version; uint32_t protocol_list[] = { NVSP_PROTOCOL_VERSION_1, NVSP_PROTOCOL_VERSION_2, NVSP_PROTOCOL_VERSION_4, NVSP_PROTOCOL_VERSION_5 }; int i; int protocol_number = nitems(protocol_list); int ret = 0; device_t dev = sc->hn_dev; struct ifnet *ifp = sc->hn_ifp; net_dev = hv_nv_get_outbound_net_device(sc); /* * Negotiate the NVSP version. Try the latest NVSP first. */ for (i = protocol_number - 1; i >= 0; i--) { if (hv_nv_negotiate_nvsp_protocol(sc, net_dev, protocol_list[i]) == 0) { net_dev->nvsp_version = protocol_list[i]; if (bootverbose) device_printf(dev, "Netvsc: got version 0x%x\n", net_dev->nvsp_version); break; } } if (i < 0) { if (bootverbose) device_printf(dev, "failed to negotiate a valid " "protocol.\n"); return (EPROTO); } /* * Set the MTU if supported by this NVSP protocol version * This needs to be right after the NVSP init message per Haiyang */ if (net_dev->nvsp_version >= NVSP_PROTOCOL_VERSION_2) ret = hv_nv_send_ndis_config(sc, ifp->if_mtu); /* * Send the NDIS version */ init_pkt = &net_dev->channel_init_packet; memset(init_pkt, 0, sizeof(nvsp_msg)); if (net_dev->nvsp_version <= NVSP_PROTOCOL_VERSION_4) { ndis_version = NDIS_VERSION_6_1; } else { ndis_version = NDIS_VERSION_6_30; } init_pkt->hdr.msg_type = nvsp_msg_1_type_send_ndis_vers; init_pkt->msgs.vers_1_msgs.send_ndis_vers.ndis_major_vers = (ndis_version & 0xFFFF0000) >> 16; init_pkt->msgs.vers_1_msgs.send_ndis_vers.ndis_minor_vers = ndis_version & 0xFFFF; /* Send the init request */ ret = vmbus_chan_send(sc->hn_prichan, VMBUS_CHANPKT_TYPE_INBAND, 0, init_pkt, sizeof(nvsp_msg), (uint64_t)(uintptr_t)init_pkt); if (ret != 0) { goto cleanup; } /* * TODO: BUGBUG - We have to wait for the above msg since the netvsp * uses KMCL which acknowledges packet (completion packet) * since our Vmbus always set the VMBUS_CHANPKT_FLAG_RC flag */ /* sema_wait(&NetVscChannel->channel_init_sema); */ /* Post the big receive buffer to NetVSP */ if (net_dev->nvsp_version <= NVSP_PROTOCOL_VERSION_2) net_dev->rx_buf_size = NETVSC_RECEIVE_BUFFER_SIZE_LEGACY; else net_dev->rx_buf_size = NETVSC_RECEIVE_BUFFER_SIZE; net_dev->send_buf_size = NETVSC_SEND_BUFFER_SIZE; ret = hv_nv_init_rx_buffer_with_net_vsp(sc); if (ret == 0) ret = hv_nv_init_send_buffer_with_net_vsp(sc); cleanup: return (ret); } /* * Net VSC disconnect from VSP */ static void hv_nv_disconnect_from_vsp(netvsc_dev *net_dev) { hv_nv_destroy_rx_buffer(net_dev); hv_nv_destroy_send_buffer(net_dev); } void -hv_nv_subchan_attach(struct hv_vmbus_channel *chan) +hv_nv_subchan_attach(struct vmbus_channel *chan, struct hn_rx_ring *rxr) { - - chan->hv_chan_rdbuf = malloc(NETVSC_PACKET_SIZE, M_NETVSC, M_WAITOK); + KASSERT(rxr->hn_rx_idx == vmbus_chan_subidx(chan), + ("chan%u subidx %u, rxr%d mismatch", + vmbus_chan_id(chan), vmbus_chan_subidx(chan), rxr->hn_rx_idx)); vmbus_chan_open(chan, NETVSC_DEVICE_RING_BUFFER_SIZE, NETVSC_DEVICE_RING_BUFFER_SIZE, NULL, 0, - hv_nv_on_channel_callback, chan); + hv_nv_on_channel_callback, rxr); } /* * Net VSC on device add * * Callback when the device belonging to this driver is added */ netvsc_dev * -hv_nv_on_device_add(struct hn_softc *sc, void *additional_info) +hv_nv_on_device_add(struct hn_softc *sc, void *additional_info, + struct hn_rx_ring *rxr) { - struct hv_vmbus_channel *chan = sc->hn_prichan; + struct vmbus_channel *chan = sc->hn_prichan; netvsc_dev *net_dev; int ret = 0; net_dev = hv_nv_alloc_net_device(sc); if (net_dev == NULL) return NULL; /* Initialize the NetVSC channel extension */ sema_init(&net_dev->channel_init_sema, 0, "netdev_sema"); - chan->hv_chan_rdbuf = malloc(NETVSC_PACKET_SIZE, M_NETVSC, M_WAITOK); - /* * Open the channel */ + KASSERT(rxr->hn_rx_idx == vmbus_chan_subidx(chan), + ("chan%u subidx %u, rxr%d mismatch", + vmbus_chan_id(chan), vmbus_chan_subidx(chan), rxr->hn_rx_idx)); ret = vmbus_chan_open(chan, NETVSC_DEVICE_RING_BUFFER_SIZE, NETVSC_DEVICE_RING_BUFFER_SIZE, - NULL, 0, hv_nv_on_channel_callback, chan); - if (ret != 0) { - free(chan->hv_chan_rdbuf, M_NETVSC); + NULL, 0, hv_nv_on_channel_callback, rxr); + if (ret != 0) goto cleanup; - } /* * Connect with the NetVsp */ ret = hv_nv_connect_to_vsp(sc); if (ret != 0) goto close; return (net_dev); close: /* Now, we can close the channel safely */ - free(chan->hv_chan_rdbuf, M_NETVSC); vmbus_chan_close(chan); cleanup: /* * Free the packet buffers on the netvsc device packet queue. * Release other resources. */ sema_destroy(&net_dev->channel_init_sema); free(net_dev, M_NETVSC); return (NULL); } /* * Net VSC on device remove */ int hv_nv_on_device_remove(struct hn_softc *sc, boolean_t destroy_channel) { netvsc_dev *net_dev = sc->net_dev;; /* Stop outbound traffic ie sends and receives completions */ net_dev->destroy = TRUE; hv_nv_disconnect_from_vsp(net_dev); /* At this point, no one should be accessing net_dev except in here */ /* Now, we can close the channel safely */ - free(sc->hn_prichan->hv_chan_rdbuf, M_NETVSC); vmbus_chan_close(sc->hn_prichan); sema_destroy(&net_dev->channel_init_sema); free(net_dev, M_NETVSC); return (0); } /* * Net VSC on send completion */ static void -hv_nv_on_send_completion(netvsc_dev *net_dev, struct hv_vmbus_channel *chan, +hv_nv_on_send_completion(netvsc_dev *net_dev, struct vmbus_channel *chan, const struct vmbus_chanpkt_hdr *pkt) { const nvsp_msg *nvsp_msg_pkt; netvsc_packet *net_vsc_pkt; nvsp_msg_pkt = VMBUS_CHANPKT_CONST_DATA(pkt); if (nvsp_msg_pkt->hdr.msg_type == nvsp_msg_type_init_complete || nvsp_msg_pkt->hdr.msg_type == nvsp_msg_1_type_send_rx_buf_complete || nvsp_msg_pkt->hdr.msg_type == nvsp_msg_1_type_send_send_buf_complete || nvsp_msg_pkt->hdr.msg_type == nvsp_msg5_type_subchannel) { /* Copy the response back */ memcpy(&net_dev->channel_init_packet, nvsp_msg_pkt, sizeof(nvsp_msg)); sema_post(&net_dev->channel_init_sema); } else if (nvsp_msg_pkt->hdr.msg_type == nvsp_msg_1_type_send_rndis_pkt_complete) { /* Get the send context */ net_vsc_pkt = (netvsc_packet *)(unsigned long)pkt->cph_xactid; if (NULL != net_vsc_pkt) { if (net_vsc_pkt->send_buf_section_idx != NVSP_1_CHIMNEY_SEND_INVALID_SECTION_INDEX) { u_long mask; int idx; idx = net_vsc_pkt->send_buf_section_idx / BITS_PER_LONG; KASSERT(idx < net_dev->bitsmap_words, ("invalid section index %u", net_vsc_pkt->send_buf_section_idx)); mask = 1UL << (net_vsc_pkt->send_buf_section_idx % BITS_PER_LONG); KASSERT(net_dev->send_section_bitsmap[idx] & mask, ("index bitmap 0x%lx, section index %u, " "bitmap idx %d, bitmask 0x%lx", net_dev->send_section_bitsmap[idx], net_vsc_pkt->send_buf_section_idx, idx, mask)); atomic_clear_long( &net_dev->send_section_bitsmap[idx], mask); } /* Notify the layer above us */ net_vsc_pkt->compl.send.on_send_completion(chan, net_vsc_pkt->compl.send.send_completion_context); } } } /* * Net VSC on send * Sends a packet on the specified Hyper-V device. * Returns 0 on success, non-zero on failure. */ int -hv_nv_on_send(struct hv_vmbus_channel *chan, netvsc_packet *pkt) +hv_nv_on_send(struct vmbus_channel *chan, netvsc_packet *pkt) { nvsp_msg send_msg; int ret; send_msg.hdr.msg_type = nvsp_msg_1_type_send_rndis_pkt; if (pkt->is_data_pkt) { /* 0 is RMC_DATA */ send_msg.msgs.vers_1_msgs.send_rndis_pkt.chan_type = 0; } else { /* 1 is RMC_CONTROL */ send_msg.msgs.vers_1_msgs.send_rndis_pkt.chan_type = 1; } send_msg.msgs.vers_1_msgs.send_rndis_pkt.send_buf_section_idx = pkt->send_buf_section_idx; send_msg.msgs.vers_1_msgs.send_rndis_pkt.send_buf_section_size = pkt->send_buf_section_size; if (pkt->gpa_cnt) { ret = vmbus_chan_send_sglist(chan, pkt->gpa, pkt->gpa_cnt, &send_msg, sizeof(nvsp_msg), (uint64_t)(uintptr_t)pkt); } else { ret = vmbus_chan_send(chan, VMBUS_CHANPKT_TYPE_INBAND, VMBUS_CHANPKT_FLAG_RC, &send_msg, sizeof(nvsp_msg), (uint64_t)(uintptr_t)pkt); } return (ret); } /* * Net VSC on receive * * In the FreeBSD Hyper-V virtual world, this function deals exclusively * with virtual addresses. */ static void -hv_nv_on_receive(netvsc_dev *net_dev, struct hn_softc *sc, - struct hv_vmbus_channel *chan, const struct vmbus_chanpkt_hdr *pkthdr) +hv_nv_on_receive(netvsc_dev *net_dev, struct hn_rx_ring *rxr, + struct vmbus_channel *chan, const struct vmbus_chanpkt_hdr *pkthdr) { const struct vmbus_chanpkt_rxbuf *pkt; const nvsp_msg *nvsp_msg_pkt; netvsc_packet vsc_pkt; netvsc_packet *net_vsc_pkt = &vsc_pkt; - device_t dev = sc->hn_dev; int count = 0; int i = 0; int status = nvsp_status_success; nvsp_msg_pkt = VMBUS_CHANPKT_CONST_DATA(pkthdr); /* Make sure this is a valid nvsp packet */ if (nvsp_msg_pkt->hdr.msg_type != nvsp_msg_1_type_send_rndis_pkt) { - device_printf(dev, "packet hdr type %u is invalid!\n", + if_printf(rxr->hn_ifp, "packet hdr type %u is invalid!\n", nvsp_msg_pkt->hdr.msg_type); return; } pkt = (const struct vmbus_chanpkt_rxbuf *)pkthdr; if (pkt->cp_rxbuf_id != NETVSC_RECEIVE_BUFFER_ID) { - device_printf(dev, "rxbuf_id %d is invalid!\n", + if_printf(rxr->hn_ifp, "rxbuf_id %d is invalid!\n", pkt->cp_rxbuf_id); return; } count = pkt->cp_rxbuf_cnt; /* Each range represents 1 RNDIS pkt that contains 1 Ethernet frame */ for (i = 0; i < count; i++) { net_vsc_pkt->status = nvsp_status_success; net_vsc_pkt->data = ((uint8_t *)net_dev->rx_buf + pkt->cp_rxbuf[i].rb_ofs); net_vsc_pkt->tot_data_buf_len = pkt->cp_rxbuf[i].rb_len; - hv_rf_on_receive(net_dev, chan, net_vsc_pkt); + hv_rf_on_receive(net_dev, rxr, net_vsc_pkt); if (net_vsc_pkt->status != nvsp_status_success) { status = nvsp_status_failure; } } /* * Moved completion call back here so that all received * messages (not just data messages) will trigger a response * message back to the host. */ hv_nv_on_receive_completion(chan, pkt->cp_hdr.cph_xactid, status); } /* * Net VSC on receive completion * * Send a receive completion packet to RNDIS device (ie NetVsp) */ static void -hv_nv_on_receive_completion(struct hv_vmbus_channel *chan, uint64_t tid, +hv_nv_on_receive_completion(struct vmbus_channel *chan, uint64_t tid, uint32_t status) { nvsp_msg rx_comp_msg; int retries = 0; int ret = 0; rx_comp_msg.hdr.msg_type = nvsp_msg_1_type_send_rndis_pkt_complete; /* Pass in the status */ rx_comp_msg.msgs.vers_1_msgs.send_rndis_pkt_complete.status = status; retry_send_cmplt: /* Send the completion */ ret = vmbus_chan_send(chan, VMBUS_CHANPKT_TYPE_COMP, 0, &rx_comp_msg, sizeof(nvsp_msg), tid); if (ret == 0) { /* success */ /* no-op */ } else if (ret == EAGAIN) { /* no more room... wait a bit and attempt to retry 3 times */ retries++; if (retries < 4) { DELAY(100); goto retry_send_cmplt; } } } /* * Net VSC receiving vRSS send table from VSP */ static void hv_nv_send_table(struct hn_softc *sc, const struct vmbus_chanpkt_hdr *pkt) { netvsc_dev *net_dev; const nvsp_msg *nvsp_msg_pkt; int i; uint32_t count; const uint32_t *table; net_dev = hv_nv_get_inbound_net_device(sc); if (!net_dev) return; nvsp_msg_pkt = VMBUS_CHANPKT_CONST_DATA(pkt); if (nvsp_msg_pkt->hdr.msg_type != nvsp_msg5_type_send_indirection_table) { printf("Netvsc: !Warning! receive msg type not " "send_indirection_table. type = %d\n", nvsp_msg_pkt->hdr.msg_type); return; } count = nvsp_msg_pkt->msgs.vers_5_msgs.send_table.count; if (count != VRSS_SEND_TABLE_SIZE) { printf("Netvsc: Received wrong send table size: %u\n", count); return; } table = (const uint32_t *) ((const uint8_t *)&nvsp_msg_pkt->msgs.vers_5_msgs.send_table + nvsp_msg_pkt->msgs.vers_5_msgs.send_table.offset); for (i = 0; i < count; i++) net_dev->vrss_send_table[i] = table[i]; } /* * Net VSC on channel callback */ static void -hv_nv_on_channel_callback(void *xchan) +hv_nv_on_channel_callback(struct vmbus_channel *chan, void *xrxr) { - struct hv_vmbus_channel *chan = xchan; - device_t dev = chan->ch_dev; - struct hn_softc *sc = device_get_softc(dev); + struct hn_rx_ring *rxr = xrxr; + struct hn_softc *sc = rxr->hn_ifp->if_softc; netvsc_dev *net_dev; void *buffer; int bufferlen = NETVSC_PACKET_SIZE; net_dev = hv_nv_get_inbound_net_device(sc); if (net_dev == NULL) return; - buffer = chan->hv_chan_rdbuf; + buffer = rxr->hn_rdbuf; do { struct vmbus_chanpkt_hdr *pkt = buffer; uint32_t bytes_rxed; int ret; bytes_rxed = bufferlen; ret = vmbus_chan_recv_pkt(chan, pkt, &bytes_rxed); if (ret == 0) { if (bytes_rxed > 0) { switch (pkt->cph_type) { case VMBUS_CHANPKT_TYPE_COMP: hv_nv_on_send_completion(net_dev, chan, pkt); break; case VMBUS_CHANPKT_TYPE_RXBUF: - hv_nv_on_receive(net_dev, sc, chan, pkt); + hv_nv_on_receive(net_dev, rxr, chan, pkt); break; case VMBUS_CHANPKT_TYPE_INBAND: hv_nv_send_table(sc, pkt); break; default: - device_printf(dev, + if_printf(rxr->hn_ifp, "unknown chan pkt %u\n", pkt->cph_type); break; } } } else if (ret == ENOBUFS) { /* Handle large packet */ if (bufferlen > NETVSC_PACKET_SIZE) { free(buffer, M_NETVSC); buffer = NULL; } /* alloc new buffer */ buffer = malloc(bytes_rxed, M_NETVSC, M_NOWAIT); if (buffer == NULL) { - device_printf(dev, + if_printf(rxr->hn_ifp, "hv_cb malloc buffer failed, len=%u\n", bytes_rxed); bufferlen = 0; break; } bufferlen = bytes_rxed; } else { /* No more packets */ break; } } while (1); if (bufferlen > NETVSC_PACKET_SIZE) free(buffer, M_NETVSC); - hv_rf_channel_rollup(chan); + hv_rf_channel_rollup(rxr, rxr->hn_txr); } Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_net_vsc.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_net_vsc.h (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_net_vsc.h (revision 303206) @@ -1,1276 +1,1282 @@ /*- * Copyright (c) 2009-2012,2016 Microsoft Corp. * Copyright (c) 2010-2012 Citrix Inc. * Copyright (c) 2012 NetApp Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ /* * HyperV vmbus (virtual machine bus) network VSC (virtual services client) * header file * * (Updated from unencumbered NvspProtocol.h) */ #ifndef __HV_NET_VSC_H__ #define __HV_NET_VSC_H__ #include #include #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #include #define HN_USE_TXDESC_BUFRING MALLOC_DECLARE(M_NETVSC); #define NVSP_INVALID_PROTOCOL_VERSION (0xFFFFFFFF) #define NVSP_PROTOCOL_VERSION_1 2 #define NVSP_PROTOCOL_VERSION_2 0x30002 #define NVSP_PROTOCOL_VERSION_4 0x40000 #define NVSP_PROTOCOL_VERSION_5 0x50000 #define NVSP_MIN_PROTOCOL_VERSION (NVSP_PROTOCOL_VERSION_1) #define NVSP_MAX_PROTOCOL_VERSION (NVSP_PROTOCOL_VERSION_2) #define NVSP_PROTOCOL_VERSION_CURRENT NVSP_PROTOCOL_VERSION_2 #define VERSION_4_OFFLOAD_SIZE 22 #define NVSP_OPERATIONAL_STATUS_OK (0x00000000) #define NVSP_OPERATIONAL_STATUS_DEGRADED (0x00000001) #define NVSP_OPERATIONAL_STATUS_NONRECOVERABLE (0x00000002) #define NVSP_OPERATIONAL_STATUS_NO_CONTACT (0x00000003) #define NVSP_OPERATIONAL_STATUS_LOST_COMMUNICATION (0x00000004) /* * Maximun number of transfer pages (packets) the VSP will use on a receive */ #define NVSP_MAX_PACKETS_PER_RECEIVE 375 /* vRSS stuff */ #define RNDIS_OBJECT_TYPE_RSS_CAPABILITIES 0x88 #define RNDIS_OBJECT_TYPE_RSS_PARAMETERS 0x89 #define RNDIS_RECEIVE_SCALE_CAPABILITIES_REVISION_2 2 #define RNDIS_RECEIVE_SCALE_PARAMETERS_REVISION_2 2 struct rndis_obj_header { uint8_t type; uint8_t rev; uint16_t size; } __packed; /* rndis_recv_scale_cap/cap_flag */ #define RNDIS_RSS_CAPS_MESSAGE_SIGNALED_INTERRUPTS 0x01000000 #define RNDIS_RSS_CAPS_CLASSIFICATION_AT_ISR 0x02000000 #define RNDIS_RSS_CAPS_CLASSIFICATION_AT_DPC 0x04000000 #define RNDIS_RSS_CAPS_USING_MSI_X 0x08000000 #define RNDIS_RSS_CAPS_RSS_AVAILABLE_ON_PORTS 0x10000000 #define RNDIS_RSS_CAPS_SUPPORTS_MSI_X 0x20000000 #define RNDIS_RSS_CAPS_HASH_TYPE_TCP_IPV4 0x00000100 #define RNDIS_RSS_CAPS_HASH_TYPE_TCP_IPV6 0x00000200 #define RNDIS_RSS_CAPS_HASH_TYPE_TCP_IPV6_EX 0x00000400 /* RNDIS_RECEIVE_SCALE_CAPABILITIES */ struct rndis_recv_scale_cap { struct rndis_obj_header hdr; uint32_t cap_flag; uint32_t num_int_msg; uint32_t num_recv_que; uint16_t num_indirect_tabent; } __packed; /* rndis_recv_scale_param flags */ #define RNDIS_RSS_PARAM_FLAG_BASE_CPU_UNCHANGED 0x0001 #define RNDIS_RSS_PARAM_FLAG_HASH_INFO_UNCHANGED 0x0002 #define RNDIS_RSS_PARAM_FLAG_ITABLE_UNCHANGED 0x0004 #define RNDIS_RSS_PARAM_FLAG_HASH_KEY_UNCHANGED 0x0008 #define RNDIS_RSS_PARAM_FLAG_DISABLE_RSS 0x0010 /* Hash info bits */ #define RNDIS_HASH_FUNC_TOEPLITZ 0x00000001 #define RNDIS_HASH_IPV4 0x00000100 #define RNDIS_HASH_TCP_IPV4 0x00000200 #define RNDIS_HASH_IPV6 0x00000400 #define RNDIS_HASH_IPV6_EX 0x00000800 #define RNDIS_HASH_TCP_IPV6 0x00001000 #define RNDIS_HASH_TCP_IPV6_EX 0x00002000 #define RNDIS_RSS_INDIRECTION_TABLE_MAX_SIZE_REVISION_2 (128 * 4) #define RNDIS_RSS_HASH_SECRET_KEY_MAX_SIZE_REVISION_2 40 #define ITAB_NUM 128 #define HASH_KEYLEN RNDIS_RSS_HASH_SECRET_KEY_MAX_SIZE_REVISION_2 /* RNDIS_RECEIVE_SCALE_PARAMETERS */ typedef struct rndis_recv_scale_param_ { struct rndis_obj_header hdr; /* Qualifies the rest of the information */ uint16_t flag; /* The base CPU number to do receive processing. not used */ uint16_t base_cpu_number; /* This describes the hash function and type being enabled */ uint32_t hashinfo; /* The size of indirection table array */ uint16_t indirect_tabsize; /* The offset of the indirection table from the beginning of this * structure */ uint32_t indirect_taboffset; /* The size of the hash secret key */ uint16_t hashkey_size; /* The offset of the secret key from the beginning of this structure */ uint32_t hashkey_offset; uint32_t processor_masks_offset; uint32_t num_processor_masks; uint32_t processor_masks_entry_size; } rndis_recv_scale_param; typedef enum nvsp_msg_type_ { nvsp_msg_type_none = 0, /* * Init Messages */ nvsp_msg_type_init = 1, nvsp_msg_type_init_complete = 2, nvsp_version_msg_start = 100, /* * Version 1 Messages */ nvsp_msg_1_type_send_ndis_vers = nvsp_version_msg_start, nvsp_msg_1_type_send_rx_buf, nvsp_msg_1_type_send_rx_buf_complete, nvsp_msg_1_type_revoke_rx_buf, nvsp_msg_1_type_send_send_buf, nvsp_msg_1_type_send_send_buf_complete, nvsp_msg_1_type_revoke_send_buf, nvsp_msg_1_type_send_rndis_pkt, nvsp_msg_1_type_send_rndis_pkt_complete, /* * Version 2 Messages */ nvsp_msg_2_type_send_chimney_delegated_buf, nvsp_msg_2_type_send_chimney_delegated_buf_complete, nvsp_msg_2_type_revoke_chimney_delegated_buf, nvsp_msg_2_type_resume_chimney_rx_indication, nvsp_msg_2_type_terminate_chimney, nvsp_msg_2_type_terminate_chimney_complete, nvsp_msg_2_type_indicate_chimney_event, nvsp_msg_2_type_send_chimney_packet, nvsp_msg_2_type_send_chimney_packet_complete, nvsp_msg_2_type_post_chimney_rx_request, nvsp_msg_2_type_post_chimney_rx_request_complete, nvsp_msg_2_type_alloc_rx_buf, nvsp_msg_2_type_alloc_rx_buf_complete, nvsp_msg_2_type_free_rx_buf, nvsp_msg_2_send_vmq_rndis_pkt, nvsp_msg_2_send_vmq_rndis_pkt_complete, nvsp_msg_2_type_send_ndis_config, nvsp_msg_2_type_alloc_chimney_handle, nvsp_msg_2_type_alloc_chimney_handle_complete, nvsp_msg2_max = nvsp_msg_2_type_alloc_chimney_handle_complete, /* * Version 4 Messages */ nvsp_msg4_type_send_vf_association, nvsp_msg4_type_switch_data_path, nvsp_msg4_type_uplink_connect_state_deprecated, nvsp_msg4_max = nvsp_msg4_type_uplink_connect_state_deprecated, /* * Version 5 Messages */ nvsp_msg5_type_oid_query_ex, nvsp_msg5_type_oid_query_ex_comp, nvsp_msg5_type_subchannel, nvsp_msg5_type_send_indirection_table, nvsp_msg5_max = nvsp_msg5_type_send_indirection_table, } nvsp_msg_type; typedef enum nvsp_status_ { nvsp_status_none = 0, nvsp_status_success, nvsp_status_failure, /* Deprecated */ nvsp_status_prot_vers_range_too_new, /* Deprecated */ nvsp_status_prot_vers_range_too_old, nvsp_status_invalid_rndis_pkt, nvsp_status_busy, nvsp_status_max, } nvsp_status; typedef struct nvsp_msg_hdr_ { uint32_t msg_type; } __packed nvsp_msg_hdr; /* * Init Messages */ /* * This message is used by the VSC to initialize the channel * after the channels has been opened. This message should * never include anything other then versioning (i.e. this * message will be the same for ever). * * Forever is a long time. The values have been redefined * in Win7 to indicate major and minor protocol version * number. */ typedef struct nvsp_msg_init_ { union { struct { uint16_t minor_protocol_version; uint16_t major_protocol_version; } s; /* Formerly min_protocol_version */ uint32_t protocol_version; } p1; /* Formerly max_protocol_version */ uint32_t protocol_version_2; } __packed nvsp_msg_init; /* * This message is used by the VSP to complete the initialization * of the channel. This message should never include anything other * then versioning (i.e. this message will be the same forever). */ typedef struct nvsp_msg_init_complete_ { /* Deprecated */ uint32_t negotiated_prot_vers; uint32_t max_mdl_chain_len; uint32_t status; } __packed nvsp_msg_init_complete; typedef union nvsp_msg_init_uber_ { nvsp_msg_init init; nvsp_msg_init_complete init_compl; } __packed nvsp_msg_init_uber; /* * Version 1 Messages */ /* * This message is used by the VSC to send the NDIS version * to the VSP. The VSP can use this information when handling * OIDs sent by the VSC. */ typedef struct nvsp_1_msg_send_ndis_version_ { uint32_t ndis_major_vers; /* Deprecated */ uint32_t ndis_minor_vers; } __packed nvsp_1_msg_send_ndis_version; /* * This message is used by the VSC to send a receive buffer * to the VSP. The VSP can then use the receive buffer to * send data to the VSC. */ typedef struct nvsp_1_msg_send_rx_buf_ { uint32_t gpadl_handle; uint16_t id; } __packed nvsp_1_msg_send_rx_buf; typedef struct nvsp_1_rx_buf_section_ { uint32_t offset; uint32_t sub_allocation_size; uint32_t num_sub_allocations; uint32_t end_offset; } __packed nvsp_1_rx_buf_section; /* * This message is used by the VSP to acknowledge a receive * buffer send by the VSC. This message must be sent by the * VSP before the VSP uses the receive buffer. */ typedef struct nvsp_1_msg_send_rx_buf_complete_ { uint32_t status; uint32_t num_sections; /* * The receive buffer is split into two parts, a large * suballocation section and a small suballocation * section. These sections are then suballocated by a * certain size. * * For example, the following break up of the receive * buffer has 6 large suballocations and 10 small * suballocations. * * | Large Section | | Small Section | * ------------------------------------------------------------ * | | | | | | | | | | | | | | | | | | * | | * LargeOffset SmallOffset */ nvsp_1_rx_buf_section sections[1]; } __packed nvsp_1_msg_send_rx_buf_complete; /* * This message is sent by the VSC to revoke the receive buffer. * After the VSP completes this transaction, the VSP should never * use the receive buffer again. */ typedef struct nvsp_1_msg_revoke_rx_buf_ { uint16_t id; } __packed nvsp_1_msg_revoke_rx_buf; /* * This message is used by the VSC to send a send buffer * to the VSP. The VSC can then use the send buffer to * send data to the VSP. */ typedef struct nvsp_1_msg_send_send_buf_ { uint32_t gpadl_handle; uint16_t id; } __packed nvsp_1_msg_send_send_buf; /* * This message is used by the VSP to acknowledge a send * buffer sent by the VSC. This message must be sent by the * VSP before the VSP uses the sent buffer. */ typedef struct nvsp_1_msg_send_send_buf_complete_ { uint32_t status; /* * The VSC gets to choose the size of the send buffer and * the VSP gets to choose the sections size of the buffer. * This was done to enable dynamic reconfigurations when * the cost of GPA-direct buffers decreases. */ uint32_t section_size; } __packed nvsp_1_msg_send_send_buf_complete; /* * This message is sent by the VSC to revoke the send buffer. * After the VSP completes this transaction, the vsp should never * use the send buffer again. */ typedef struct nvsp_1_msg_revoke_send_buf_ { uint16_t id; } __packed nvsp_1_msg_revoke_send_buf; /* * This message is used by both the VSP and the VSC to send * an RNDIS message to the opposite channel endpoint. */ typedef struct nvsp_1_msg_send_rndis_pkt_ { /* * This field is specified by RNIDS. They assume there's * two different channels of communication. However, * the Network VSP only has one. Therefore, the channel * travels with the RNDIS packet. */ uint32_t chan_type; /* * This field is used to send part or all of the data * through a send buffer. This values specifies an * index into the send buffer. If the index is * 0xFFFFFFFF, then the send buffer is not being used * and all of the data was sent through other VMBus * mechanisms. */ uint32_t send_buf_section_idx; uint32_t send_buf_section_size; } __packed nvsp_1_msg_send_rndis_pkt; /* * This message is used by both the VSP and the VSC to complete * a RNDIS message to the opposite channel endpoint. At this * point, the initiator of this message cannot use any resources * associated with the original RNDIS packet. */ typedef struct nvsp_1_msg_send_rndis_pkt_complete_ { uint32_t status; } __packed nvsp_1_msg_send_rndis_pkt_complete; /* * Version 2 Messages */ /* * This message is used by the VSC to send the NDIS version * to the VSP. The VSP can use this information when handling * OIDs sent by the VSC. */ typedef struct nvsp_2_netvsc_capabilities_ { union { uint64_t as_uint64; struct { uint64_t vmq : 1; uint64_t chimney : 1; uint64_t sriov : 1; uint64_t ieee8021q : 1; uint64_t correlationid : 1; uint64_t teaming : 1; } u2; } u1; } __packed nvsp_2_netvsc_capabilities; typedef struct nvsp_2_msg_send_ndis_config_ { uint32_t mtu; uint32_t reserved; nvsp_2_netvsc_capabilities capabilities; } __packed nvsp_2_msg_send_ndis_config; /* * NvspMessage2TypeSendChimneyDelegatedBuffer */ typedef struct nvsp_2_msg_send_chimney_buf_ { /* * On WIN7 beta, delegated_obj_max_size is defined as a uint32_t * Since WIN7 RC, it was split into two uint16_t. To have the same * struct layout, delegated_obj_max_size shall be the first field. */ uint16_t delegated_obj_max_size; /* * The revision # of chimney protocol used between NVSC and NVSP. * * This revision is NOT related to the chimney revision between * NDIS protocol and miniport drivers. */ uint16_t revision; uint32_t gpadl_handle; } __packed nvsp_2_msg_send_chimney_buf; /* Unsupported chimney revision 0 (only present in WIN7 beta) */ #define NVSP_CHIMNEY_REVISION_0 0 /* WIN7 Beta Chimney QFE */ #define NVSP_CHIMNEY_REVISION_1 1 /* The chimney revision since WIN7 RC */ #define NVSP_CHIMNEY_REVISION_2 2 /* * NvspMessage2TypeSendChimneyDelegatedBufferComplete */ typedef struct nvsp_2_msg_send_chimney_buf_complete_ { uint32_t status; /* * Maximum number outstanding sends and pre-posted receives. * * NVSC should not post more than SendQuota/ReceiveQuota packets. * Otherwise, it can block the non-chimney path for an indefinite * amount of time. * (since chimney sends/receives are affected by the remote peer). * * Note: NVSP enforces the quota restrictions on a per-VMBCHANNEL * basis. It doesn't enforce the restriction separately for chimney * send/receive. If NVSC doesn't voluntarily enforce "SendQuota", * it may kill its own network connectivity. */ uint32_t send_quota; uint32_t rx_quota; } __packed nvsp_2_msg_send_chimney_buf_complete; /* * NvspMessage2TypeRevokeChimneyDelegatedBuffer */ typedef struct nvsp_2_msg_revoke_chimney_buf_ { uint32_t gpadl_handle; } __packed nvsp_2_msg_revoke_chimney_buf; #define NVSP_CHIMNEY_OBJECT_TYPE_NEIGHBOR 0 #define NVSP_CHIMNEY_OBJECT_TYPE_PATH4 1 #define NVSP_CHIMNEY_OBJECT_TYPE_PATH6 2 #define NVSP_CHIMNEY_OBJECT_TYPE_TCP 3 /* * NvspMessage2TypeAllocateChimneyHandle */ typedef struct nvsp_2_msg_alloc_chimney_handle_ { uint64_t vsc_context; uint32_t object_type; } __packed nvsp_2_msg_alloc_chimney_handle; /* * NvspMessage2TypeAllocateChimneyHandleComplete */ typedef struct nvsp_2_msg_alloc_chimney_handle_complete_ { uint32_t vsp_handle; } __packed nvsp_2_msg_alloc_chimney_handle_complete; /* * NvspMessage2TypeResumeChimneyRXIndication */ typedef struct nvsp_2_msg_resume_chimney_rx_indication { /* * Handle identifying the offloaded connection */ uint32_t vsp_tcp_handle; } __packed nvsp_2_msg_resume_chimney_rx_indication; #define NVSP_2_MSG_TERMINATE_CHIMNEY_FLAGS_FIRST_STAGE (0x01u) #define NVSP_2_MSG_TERMINATE_CHIMNEY_FLAGS_RESERVED (~(0x01u)) /* * NvspMessage2TypeTerminateChimney */ typedef struct nvsp_2_msg_terminate_chimney_ { /* * Handle identifying the offloaded object */ uint32_t vsp_handle; /* * Terminate Offload Flags * Bit 0: * When set to 0, terminate the offload at the destination NIC * Bit 1-31: Reserved, shall be zero */ uint32_t flags; union { /* * This field is valid only when bit 0 of flags is clear. * It specifies the index into the premapped delegated * object buffer. The buffer was sent through the * NvspMessage2TypeSendChimneyDelegatedBuffer * message at initialization time. * * NVSP will write the delegated state into the delegated * buffer upon upload completion. */ uint32_t index; /* * This field is valid only when bit 0 of flags is set. * * The seqence number of the most recently accepted RX * indication when VSC sets its TCP context into * "terminating" state. * * This allows NVSP to determines if there are any in-flight * RX indications for which the acceptance state is still * undefined. */ uint64_t last_accepted_rx_seq_no; } f0; } __packed nvsp_2_msg_terminate_chimney; #define NVSP_TERMINATE_CHIMNEY_COMPLETE_FLAG_DATA_CORRUPTED 0x0000001u /* * NvspMessage2TypeTerminateChimneyComplete */ typedef struct nvsp_2_msg_terminate_chimney_complete_ { uint64_t vsc_context; uint32_t flags; } __packed nvsp_2_msg_terminate_chimney_complete; /* * NvspMessage2TypeIndicateChimneyEvent */ typedef struct nvsp_2_msg_indicate_chimney_event_ { /* * When VscTcpContext is 0, event_type is an NDIS_STATUS event code * Otherwise, EventType is an TCP connection event (defined in * NdisTcpOffloadEventHandler chimney DDK document). */ uint32_t event_type; /* * When VscTcpContext is 0, EventType is an NDIS_STATUS event code * Otherwise, EventType is an TCP connection event specific information * (defined in NdisTcpOffloadEventHandler chimney DDK document). */ uint32_t event_specific_info; /* * If not 0, the event is per-TCP connection event. This field * contains the VSC's TCP context. * If 0, the event indication is global. */ uint64_t vsc_tcp_context; } __packed nvsp_2_msg_indicate_chimney_event; #define NVSP_1_CHIMNEY_SEND_INVALID_OOB_INDEX 0xffffu #define NVSP_1_CHIMNEY_SEND_INVALID_SECTION_INDEX 0xffffffff /* * NvspMessage2TypeSendChimneyPacket */ typedef struct nvsp_2_msg_send_chimney_pkt_ { /* * Identify the TCP connection for which this chimney send is */ uint32_t vsp_tcp_handle; /* * This field is used to send part or all of the data * through a send buffer. This values specifies an * index into the send buffer. If the index is * 0xFFFF, then the send buffer is not being used * and all of the data was sent through other VMBus * mechanisms. */ uint16_t send_buf_section_index; uint16_t send_buf_section_size; /* * OOB Data Index * This an index to the OOB data buffer. If the index is 0xFFFFFFFF, * then there is no OOB data. * * This field shall be always 0xFFFFFFFF for now. It is reserved for * the future. */ uint16_t oob_data_index; /* * DisconnectFlags = 0 * Normal chimney send. See MiniportTcpOffloadSend for details. * * DisconnectFlags = TCP_DISCONNECT_GRACEFUL_CLOSE (0x01) * Graceful disconnect. See MiniportTcpOffloadDisconnect for details. * * DisconnectFlags = TCP_DISCONNECT_ABORTIVE_CLOSE (0x02) * Abortive disconnect. See MiniportTcpOffloadDisconnect for details. */ uint16_t disconnect_flags; uint32_t seq_no; } __packed nvsp_2_msg_send_chimney_pkt; /* * NvspMessage2TypeSendChimneyPacketComplete */ typedef struct nvsp_2_msg_send_chimney_pkt_complete_ { /* * The NDIS_STATUS for the chimney send */ uint32_t status; /* * Number of bytes that have been sent to the peer (and ACKed by the peer). */ uint32_t bytes_transferred; } __packed nvsp_2_msg_send_chimney_pkt_complete; #define NVSP_1_CHIMNEY_RECV_FLAG_NO_PUSH 0x0001u #define NVSP_1_CHIMNEY_RECV_INVALID_OOB_INDEX 0xffffu /* * NvspMessage2TypePostChimneyRecvRequest */ typedef struct nvsp_2_msg_post_chimney_rx_request_ { /* * Identify the TCP connection which this chimney receive request * is for. */ uint32_t vsp_tcp_handle; /* * OOB Data Index * This an index to the OOB data buffer. If the index is 0xFFFFFFFF, * then there is no OOB data. * * This field shall be always 0xFFFFFFFF for now. It is reserved for * the future. */ uint32_t oob_data_index; /* * Bit 0 * When it is set, this is a "no-push" receive. * When it is clear, this is a "push" receive. * * Bit 1-15: Reserved and shall be zero */ uint16_t flags; /* * For debugging and diagnoses purpose. * The SeqNo is per TCP connection and starts from 0. */ uint32_t seq_no; } __packed nvsp_2_msg_post_chimney_rx_request; /* * NvspMessage2TypePostChimneyRecvRequestComplete */ typedef struct nvsp_2_msg_post_chimney_rx_request_complete_ { /* * The NDIS_STATUS for the chimney send */ uint32_t status; /* * Number of bytes that have been sent to the peer (and ACKed by * the peer). */ uint32_t bytes_xferred; } __packed nvsp_2_msg_post_chimney_rx_request_complete; /* * NvspMessage2TypeAllocateReceiveBuffer */ typedef struct nvsp_2_msg_alloc_rx_buf_ { /* * Allocation ID to match the allocation request and response */ uint32_t allocation_id; /* * Length of the VM shared memory receive buffer that needs to * be allocated */ uint32_t length; } __packed nvsp_2_msg_alloc_rx_buf; /* * NvspMessage2TypeAllocateReceiveBufferComplete */ typedef struct nvsp_2_msg_alloc_rx_buf_complete_ { /* * The NDIS_STATUS code for buffer allocation */ uint32_t status; /* * Allocation ID from NVSP_2_MESSAGE_ALLOCATE_RECEIVE_BUFFER */ uint32_t allocation_id; /* * GPADL handle for the allocated receive buffer */ uint32_t gpadl_handle; /* * Receive buffer ID that is further used in * NvspMessage2SendVmqRndisPacket */ uint64_t rx_buf_id; } __packed nvsp_2_msg_alloc_rx_buf_complete; /* * NvspMessage2TypeFreeReceiveBuffer */ typedef struct nvsp_2_msg_free_rx_buf_ { /* * Receive buffer ID previous returned in * NvspMessage2TypeAllocateReceiveBufferComplete message */ uint64_t rx_buf_id; } __packed nvsp_2_msg_free_rx_buf; /* * This structure is used in defining the buffers in * NVSP_2_MESSAGE_SEND_VMQ_RNDIS_PACKET structure */ typedef struct nvsp_xfer_page_range_ { /* * Specifies the ID of the receive buffer that has the buffer. This * ID can be the general receive buffer ID specified in * NvspMessage1TypeSendReceiveBuffer or it can be the shared memory * receive buffer ID allocated by the VSC and specified in * NvspMessage2TypeAllocateReceiveBufferComplete message */ uint64_t xfer_page_set_id; /* * Number of bytes */ uint32_t byte_count; /* * Offset in bytes from the beginning of the buffer */ uint32_t byte_offset; } __packed nvsp_xfer_page_range; /* * NvspMessage2SendVmqRndisPacket */ typedef struct nvsp_2_msg_send_vmq_rndis_pkt_ { /* * This field is specified by RNIDS. They assume there's * two different channels of communication. However, * the Network VSP only has one. Therefore, the channel * travels with the RNDIS packet. It must be RMC_DATA */ uint32_t channel_type; /* * Only the Range element corresponding to the RNDIS header of * the first RNDIS message in the multiple RNDIS messages sent * in one NVSP message. Information about the data portions as well * as the subsequent RNDIS messages in the same NVSP message are * embedded in the RNDIS header itself */ nvsp_xfer_page_range range; } __packed nvsp_2_msg_send_vmq_rndis_pkt; /* * This message is used by the VSC to complete * a RNDIS VMQ message to the VSP. At this point, * the initiator of this message can use any resources * associated with the original RNDIS VMQ packet. */ typedef struct nvsp_2_msg_send_vmq_rndis_pkt_complete_ { uint32_t status; } __packed nvsp_2_msg_send_vmq_rndis_pkt_complete; /* * Version 5 messages */ enum nvsp_subchannel_operation { NVSP_SUBCHANNEL_NONE = 0, NVSP_SUBCHANNE_ALLOCATE, NVSP_SUBCHANNE_MAX }; typedef struct nvsp_5_subchannel_request_ { uint32_t op; uint32_t num_subchannels; } __packed nvsp_5_subchannel_request; typedef struct nvsp_5_subchannel_complete_ { uint32_t status; /* Actual number of subchannels allocated */ uint32_t num_subchannels; } __packed nvsp_5_subchannel_complete; typedef struct nvsp_5_send_indirect_table_ { /* The number of entries in the send indirection table */ uint32_t count; /* * The offset of the send indireciton table from top of * this struct. The send indirection table tells which channel * to put the send traffic on. Each entry is a channel number. */ uint32_t offset; } __packed nvsp_5_send_indirect_table; typedef union nvsp_1_msg_uber_ { nvsp_1_msg_send_ndis_version send_ndis_vers; nvsp_1_msg_send_rx_buf send_rx_buf; nvsp_1_msg_send_rx_buf_complete send_rx_buf_complete; nvsp_1_msg_revoke_rx_buf revoke_rx_buf; nvsp_1_msg_send_send_buf send_send_buf; nvsp_1_msg_send_send_buf_complete send_send_buf_complete; nvsp_1_msg_revoke_send_buf revoke_send_buf; nvsp_1_msg_send_rndis_pkt send_rndis_pkt; nvsp_1_msg_send_rndis_pkt_complete send_rndis_pkt_complete; } __packed nvsp_1_msg_uber; typedef union nvsp_2_msg_uber_ { nvsp_2_msg_send_ndis_config send_ndis_config; nvsp_2_msg_send_chimney_buf send_chimney_buf; nvsp_2_msg_send_chimney_buf_complete send_chimney_buf_complete; nvsp_2_msg_revoke_chimney_buf revoke_chimney_buf; nvsp_2_msg_resume_chimney_rx_indication resume_chimney_rx_indication; nvsp_2_msg_terminate_chimney terminate_chimney; nvsp_2_msg_terminate_chimney_complete terminate_chimney_complete; nvsp_2_msg_indicate_chimney_event indicate_chimney_event; nvsp_2_msg_send_chimney_pkt send_chimney_packet; nvsp_2_msg_send_chimney_pkt_complete send_chimney_packet_complete; nvsp_2_msg_post_chimney_rx_request post_chimney_rx_request; nvsp_2_msg_post_chimney_rx_request_complete post_chimney_rx_request_complete; nvsp_2_msg_alloc_rx_buf alloc_rx_buffer; nvsp_2_msg_alloc_rx_buf_complete alloc_rx_buffer_complete; nvsp_2_msg_free_rx_buf free_rx_buffer; nvsp_2_msg_send_vmq_rndis_pkt send_vmq_rndis_pkt; nvsp_2_msg_send_vmq_rndis_pkt_complete send_vmq_rndis_pkt_complete; nvsp_2_msg_alloc_chimney_handle alloc_chimney_handle; nvsp_2_msg_alloc_chimney_handle_complete alloc_chimney_handle_complete; } __packed nvsp_2_msg_uber; typedef union nvsp_5_msg_uber_ { nvsp_5_subchannel_request subchannel_request; nvsp_5_subchannel_complete subchn_complete; nvsp_5_send_indirect_table send_table; } __packed nvsp_5_msg_uber; typedef union nvsp_all_msgs_ { nvsp_msg_init_uber init_msgs; nvsp_1_msg_uber vers_1_msgs; nvsp_2_msg_uber vers_2_msgs; nvsp_5_msg_uber vers_5_msgs; } __packed nvsp_all_msgs; /* * ALL Messages */ typedef struct nvsp_msg_ { nvsp_msg_hdr hdr; nvsp_all_msgs msgs; } __packed nvsp_msg; /* * The following arguably belongs in a separate header file */ /* * Defines */ #define NETVSC_SEND_BUFFER_SIZE (1024*1024*15) /* 15M */ #define NETVSC_SEND_BUFFER_ID 0xface #define NETVSC_RECEIVE_BUFFER_SIZE_LEGACY (1024*1024*15) /* 15MB */ #define NETVSC_RECEIVE_BUFFER_SIZE (1024*1024*16) /* 16MB */ #define NETVSC_RECEIVE_BUFFER_ID 0xcafe #define NETVSC_RECEIVE_SG_COUNT 1 /* Preallocated receive packets */ #define NETVSC_RECEIVE_PACKETLIST_COUNT 256 /* * Maximum MTU we permit to be configured for a netvsc interface. * When the code was developed, a max MTU of 12232 was tested and * proven to work. 9K is a reasonable maximum for an Ethernet. */ #define NETVSC_MAX_CONFIGURABLE_MTU (9 * 1024) #define NETVSC_PACKET_SIZE PAGE_SIZE #define VRSS_SEND_TABLE_SIZE 16 /* * Data types */ /* * Per netvsc channel-specific */ typedef struct netvsc_dev_ { struct hn_softc *sc; /* Send buffer allocated by us but manages by NetVSP */ void *send_buf; uint32_t send_buf_size; uint32_t send_buf_gpadl_handle; uint32_t send_section_size; uint32_t send_section_count; unsigned long bitsmap_words; unsigned long *send_section_bitsmap; /* Receive buffer allocated by us but managed by NetVSP */ void *rx_buf; uint32_t rx_buf_size; uint32_t rx_buf_gpadl_handle; uint32_t rx_section_count; nvsp_1_rx_buf_section *rx_sections; /* Used for NetVSP initialization protocol */ struct sema channel_init_sema; nvsp_msg channel_init_packet; nvsp_msg revoke_packet; - /*uint8_t hw_mac_addr[HW_MACADDR_LEN];*/ + /*uint8_t hw_mac_addr[ETHER_ADDR_LEN];*/ /* Holds rndis device info */ void *extension; - hv_bool_uint8_t destroy; + uint8_t destroy; /* Negotiated NVSP version */ uint32_t nvsp_version; uint32_t num_channel; struct hyperv_dma rxbuf_dma; struct hyperv_dma txbuf_dma; uint32_t vrss_send_table[VRSS_SEND_TABLE_SIZE]; } netvsc_dev; -struct hv_vmbus_channel; +struct vmbus_channel; -typedef void (*pfn_on_send_rx_completion)(struct hv_vmbus_channel *, void *); +typedef void (*pfn_on_send_rx_completion)(struct vmbus_channel *, void *); #define NETVSC_DEVICE_RING_BUFFER_SIZE (128 * PAGE_SIZE) #define NETVSC_VLAN_PRIO_MASK 0xe000 #define NETVSC_VLAN_PRIO_SHIFT 13 #define NETVSC_VLAN_VID_MASK 0x0fff #define TYPE_IPV4 2 #define TYPE_IPV6 4 #define TYPE_TCP 2 #define TYPE_UDP 4 #define TRANSPORT_TYPE_NOT_IP 0 #define TRANSPORT_TYPE_IPV4_TCP ((TYPE_IPV4 << 16) | TYPE_TCP) #define TRANSPORT_TYPE_IPV4_UDP ((TYPE_IPV4 << 16) | TYPE_UDP) #define TRANSPORT_TYPE_IPV6_TCP ((TYPE_IPV6 << 16) | TYPE_TCP) #define TRANSPORT_TYPE_IPV6_UDP ((TYPE_IPV6 << 16) | TYPE_UDP) #ifdef __LP64__ #define BITS_PER_LONG 64 #else #define BITS_PER_LONG 32 #endif typedef struct netvsc_packet_ { - hv_bool_uint8_t is_data_pkt; /* One byte */ + uint8_t is_data_pkt; /* One byte */ uint16_t vlan_tci; uint32_t status; /* Completion */ union { struct { uint64_t rx_completion_tid; void *rx_completion_context; /* This is no longer used */ pfn_on_send_rx_completion on_rx_completion; } rx; struct { uint64_t send_completion_tid; void *send_completion_context; /* Still used in netvsc and filter code */ pfn_on_send_rx_completion on_send_completion; } send; } compl; uint32_t send_buf_section_idx; uint32_t send_buf_section_size; void *rndis_mesg; uint32_t tot_data_buf_len; void *data; uint32_t gpa_cnt; struct vmbus_gpa gpa[VMBUS_CHAN_SGLIST_MAX]; } netvsc_packet; typedef struct { uint8_t mac_addr[6]; /* Assumption unsigned long */ - hv_bool_uint8_t link_state; + uint8_t link_state; } netvsc_device_info; #ifndef HN_USE_TXDESC_BUFRING struct hn_txdesc; SLIST_HEAD(hn_txdesc_list, hn_txdesc); #else struct buf_ring; #endif +struct hn_tx_ring; + struct hn_rx_ring { struct ifnet *hn_ifp; + struct hn_tx_ring *hn_txr; + void *hn_rdbuf; int hn_rx_idx; /* Trust csum verification on host side */ int hn_trust_hcsum; /* HN_TRUST_HCSUM_ */ struct lro_ctrl hn_lro; u_long hn_csum_ip; u_long hn_csum_tcp; u_long hn_csum_udp; u_long hn_csum_trusted; u_long hn_lro_tried; u_long hn_small_pkts; u_long hn_pkts; u_long hn_rss_pkts; /* Rarely used stuffs */ struct sysctl_oid *hn_rx_sysctl_tree; int hn_rx_flags; } __aligned(CACHE_LINE_SIZE); #define HN_TRUST_HCSUM_IP 0x0001 #define HN_TRUST_HCSUM_TCP 0x0002 #define HN_TRUST_HCSUM_UDP 0x0004 #define HN_RX_FLAG_ATTACHED 0x1 struct hn_tx_ring { #ifndef HN_USE_TXDESC_BUFRING struct mtx hn_txlist_spin; struct hn_txdesc_list hn_txlist; #else struct buf_ring *hn_txdesc_br; #endif int hn_txdesc_cnt; int hn_txdesc_avail; u_short hn_has_txeof; u_short hn_txdone_cnt; int hn_sched_tx; void (*hn_txeof)(struct hn_tx_ring *); struct taskqueue *hn_tx_taskq; struct task hn_tx_task; struct task hn_txeof_task; struct buf_ring *hn_mbuf_br; int hn_oactive; int hn_tx_idx; struct mtx hn_tx_lock; struct hn_softc *hn_sc; - struct hv_vmbus_channel *hn_chan; + struct vmbus_channel *hn_chan; int hn_direct_tx_size; int hn_tx_chimney_size; bus_dma_tag_t hn_tx_data_dtag; uint64_t hn_csum_assist; u_long hn_no_txdescs; u_long hn_send_failed; u_long hn_txdma_failed; u_long hn_tx_collapsed; u_long hn_tx_chimney_tried; u_long hn_tx_chimney; u_long hn_pkts; /* Rarely used stuffs */ struct hn_txdesc *hn_txdesc; bus_dma_tag_t hn_tx_rndis_dtag; struct sysctl_oid *hn_tx_sysctl_tree; int hn_tx_flags; } __aligned(CACHE_LINE_SIZE); #define HN_TX_FLAG_ATTACHED 0x1 /* * Device-specific softc structure */ typedef struct hn_softc { struct ifnet *hn_ifp; struct ifmedia hn_media; device_t hn_dev; uint8_t hn_unit; int hn_carrier; int hn_if_flags; struct mtx hn_lock; int hn_initdone; /* See hv_netvsc_drv_freebsd.c for rules on how to use */ int temp_unusable; netvsc_dev *net_dev; - struct hv_vmbus_channel *hn_prichan; + struct vmbus_channel *hn_prichan; int hn_rx_ring_cnt; int hn_rx_ring_inuse; struct hn_rx_ring *hn_rx_ring; int hn_tx_ring_cnt; int hn_tx_ring_inuse; struct hn_tx_ring *hn_tx_ring; int hn_cpu; int hn_tx_chimney_max; struct taskqueue *hn_tx_taskq; struct sysctl_oid *hn_tx_sysctl_tree; struct sysctl_oid *hn_rx_sysctl_tree; } hn_softc_t; /* * Externs */ extern int hv_promisc_mode; void netvsc_linkstatus_callback(struct hn_softc *sc, uint32_t status); netvsc_dev *hv_nv_on_device_add(struct hn_softc *sc, - void *additional_info); + void *additional_info, struct hn_rx_ring *rxr); int hv_nv_on_device_remove(struct hn_softc *sc, boolean_t destroy_channel); -int hv_nv_on_send(struct hv_vmbus_channel *chan, netvsc_packet *pkt); +int hv_nv_on_send(struct vmbus_channel *chan, netvsc_packet *pkt); int hv_nv_get_next_send_section(netvsc_dev *net_dev); -void hv_nv_subchan_attach(struct hv_vmbus_channel *chan); +void hv_nv_subchan_attach(struct vmbus_channel *chan, + struct hn_rx_ring *rxr); #endif /* __HV_NET_VSC_H__ */ Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c (revision 303206) @@ -1,3043 +1,3039 @@ /*- * Copyright (c) 2010-2012 Citrix Inc. * Copyright (c) 2009-2012,2016 Microsoft Corp. * Copyright (c) 2012 NetApp Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /*- * Copyright (c) 2004-2006 Kip Macy * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_inet6.h" #include "opt_inet.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "hv_net_vsc.h" #include "hv_rndis.h" #include "hv_rndis_filter.h" #include "vmbus_if.h" -#define hv_chan_rxr hv_chan_priv1 -#define hv_chan_txr hv_chan_priv2 - /* Short for Hyper-V network interface */ #define NETVSC_DEVNAME "hn" /* * It looks like offset 0 of buf is reserved to hold the softc pointer. * The sc pointer evidently not needed, and is not presently populated. * The packet offset is where the netvsc_packet starts in the buffer. */ #define HV_NV_SC_PTR_OFFSET_IN_BUF 0 #define HV_NV_PACKET_OFFSET_IN_BUF 16 /* YYY should get it from the underlying channel */ #define HN_TX_DESC_CNT 512 #define HN_LROENT_CNT_DEF 128 #define HN_RING_CNT_DEF_MAX 8 #define HN_RNDIS_MSG_LEN \ (sizeof(rndis_msg) + \ RNDIS_HASHVAL_PPI_SIZE + \ RNDIS_VLAN_PPI_SIZE + \ RNDIS_TSO_PPI_SIZE + \ RNDIS_CSUM_PPI_SIZE) #define HN_RNDIS_MSG_BOUNDARY PAGE_SIZE #define HN_RNDIS_MSG_ALIGN CACHE_LINE_SIZE #define HN_TX_DATA_BOUNDARY PAGE_SIZE #define HN_TX_DATA_MAXSIZE IP_MAXPACKET #define HN_TX_DATA_SEGSIZE PAGE_SIZE #define HN_TX_DATA_SEGCNT_MAX \ (VMBUS_CHAN_SGLIST_MAX - HV_RF_NUM_TX_RESERVED_PAGE_BUFS) #define HN_DIRECT_TX_SIZE_DEF 128 #define HN_EARLY_TXEOF_THRESH 8 struct hn_txdesc { #ifndef HN_USE_TXDESC_BUFRING SLIST_ENTRY(hn_txdesc) link; #endif struct mbuf *m; struct hn_tx_ring *txr; int refs; uint32_t flags; /* HN_TXD_FLAG_ */ netvsc_packet netvsc_pkt; /* XXX to be removed */ bus_dmamap_t data_dmap; bus_addr_t rndis_msg_paddr; rndis_msg *rndis_msg; bus_dmamap_t rndis_msg_dmap; }; #define HN_TXD_FLAG_ONLIST 0x1 #define HN_TXD_FLAG_DMAMAP 0x2 /* * Only enable UDP checksum offloading when it is on 2012R2 or * later. UDP checksum offloading doesn't work on earlier * Windows releases. */ #define HN_CSUM_ASSIST_WIN8 (CSUM_IP | CSUM_TCP) #define HN_CSUM_ASSIST (CSUM_IP | CSUM_UDP | CSUM_TCP) #define HN_LRO_LENLIM_MULTIRX_DEF (12 * ETHERMTU) #define HN_LRO_LENLIM_DEF (25 * ETHERMTU) /* YYY 2*MTU is a bit rough, but should be good enough. */ #define HN_LRO_LENLIM_MIN(ifp) (2 * (ifp)->if_mtu) #define HN_LRO_ACKCNT_DEF 1 /* * Be aware that this sleepable mutex will exhibit WITNESS errors when * certain TCP and ARP code paths are taken. This appears to be a * well-known condition, as all other drivers checked use a sleeping * mutex to protect their transmit paths. * Also Be aware that mutexes do not play well with semaphores, and there * is a conflicting semaphore in a certain channel code path. */ #define NV_LOCK_INIT(_sc, _name) \ mtx_init(&(_sc)->hn_lock, _name, MTX_NETWORK_LOCK, MTX_DEF) #define NV_LOCK(_sc) mtx_lock(&(_sc)->hn_lock) #define NV_LOCK_ASSERT(_sc) mtx_assert(&(_sc)->hn_lock, MA_OWNED) #define NV_UNLOCK(_sc) mtx_unlock(&(_sc)->hn_lock) #define NV_LOCK_DESTROY(_sc) mtx_destroy(&(_sc)->hn_lock) /* * Globals */ int hv_promisc_mode = 0; /* normal mode by default */ SYSCTL_NODE(_hw, OID_AUTO, hn, CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, "Hyper-V network interface"); /* Trust tcp segements verification on host side. */ static int hn_trust_hosttcp = 1; SYSCTL_INT(_hw_hn, OID_AUTO, trust_hosttcp, CTLFLAG_RDTUN, &hn_trust_hosttcp, 0, "Trust tcp segement verification on host side, " "when csum info is missing (global setting)"); /* Trust udp datagrams verification on host side. */ static int hn_trust_hostudp = 1; SYSCTL_INT(_hw_hn, OID_AUTO, trust_hostudp, CTLFLAG_RDTUN, &hn_trust_hostudp, 0, "Trust udp datagram verification on host side, " "when csum info is missing (global setting)"); /* Trust ip packets verification on host side. */ static int hn_trust_hostip = 1; SYSCTL_INT(_hw_hn, OID_AUTO, trust_hostip, CTLFLAG_RDTUN, &hn_trust_hostip, 0, "Trust ip packet verification on host side, " "when csum info is missing (global setting)"); #if __FreeBSD_version >= 1100045 /* Limit TSO burst size */ static int hn_tso_maxlen = 0; SYSCTL_INT(_hw_hn, OID_AUTO, tso_maxlen, CTLFLAG_RDTUN, &hn_tso_maxlen, 0, "TSO burst limit"); #endif /* Limit chimney send size */ static int hn_tx_chimney_size = 0; SYSCTL_INT(_hw_hn, OID_AUTO, tx_chimney_size, CTLFLAG_RDTUN, &hn_tx_chimney_size, 0, "Chimney send packet size limit"); /* Limit the size of packet for direct transmission */ static int hn_direct_tx_size = HN_DIRECT_TX_SIZE_DEF; SYSCTL_INT(_hw_hn, OID_AUTO, direct_tx_size, CTLFLAG_RDTUN, &hn_direct_tx_size, 0, "Size of the packet for direct transmission"); #if defined(INET) || defined(INET6) #if __FreeBSD_version >= 1100095 static int hn_lro_entry_count = HN_LROENT_CNT_DEF; SYSCTL_INT(_hw_hn, OID_AUTO, lro_entry_count, CTLFLAG_RDTUN, &hn_lro_entry_count, 0, "LRO entry count"); #endif #endif static int hn_share_tx_taskq = 0; SYSCTL_INT(_hw_hn, OID_AUTO, share_tx_taskq, CTLFLAG_RDTUN, &hn_share_tx_taskq, 0, "Enable shared TX taskqueue"); static struct taskqueue *hn_tx_taskq; #ifndef HN_USE_TXDESC_BUFRING static int hn_use_txdesc_bufring = 0; #else static int hn_use_txdesc_bufring = 1; #endif SYSCTL_INT(_hw_hn, OID_AUTO, use_txdesc_bufring, CTLFLAG_RD, &hn_use_txdesc_bufring, 0, "Use buf_ring for TX descriptors"); static int hn_bind_tx_taskq = -1; SYSCTL_INT(_hw_hn, OID_AUTO, bind_tx_taskq, CTLFLAG_RDTUN, &hn_bind_tx_taskq, 0, "Bind TX taskqueue to the specified cpu"); static int hn_use_if_start = 0; SYSCTL_INT(_hw_hn, OID_AUTO, use_if_start, CTLFLAG_RDTUN, &hn_use_if_start, 0, "Use if_start TX method"); static int hn_chan_cnt = 0; SYSCTL_INT(_hw_hn, OID_AUTO, chan_cnt, CTLFLAG_RDTUN, &hn_chan_cnt, 0, "# of channels to use; each channel has one RX ring and one TX ring"); static int hn_tx_ring_cnt = 0; SYSCTL_INT(_hw_hn, OID_AUTO, tx_ring_cnt, CTLFLAG_RDTUN, &hn_tx_ring_cnt, 0, "# of TX rings to use"); static int hn_tx_swq_depth = 0; SYSCTL_INT(_hw_hn, OID_AUTO, tx_swq_depth, CTLFLAG_RDTUN, &hn_tx_swq_depth, 0, "Depth of IFQ or BUFRING"); #if __FreeBSD_version >= 1100095 static u_int hn_lro_mbufq_depth = 0; SYSCTL_UINT(_hw_hn, OID_AUTO, lro_mbufq_depth, CTLFLAG_RDTUN, &hn_lro_mbufq_depth, 0, "Depth of LRO mbuf queue"); #endif static u_int hn_cpu_index; /* * Forward declarations */ static void hn_stop(hn_softc_t *sc); static void hn_ifinit_locked(hn_softc_t *sc); static void hn_ifinit(void *xsc); static int hn_ioctl(struct ifnet *ifp, u_long cmd, caddr_t data); static int hn_start_locked(struct hn_tx_ring *txr, int len); static void hn_start(struct ifnet *ifp); static void hn_start_txeof(struct hn_tx_ring *); static int hn_ifmedia_upd(struct ifnet *ifp); static void hn_ifmedia_sts(struct ifnet *ifp, struct ifmediareq *ifmr); #if __FreeBSD_version >= 1100099 static int hn_lro_lenlim_sysctl(SYSCTL_HANDLER_ARGS); static int hn_lro_ackcnt_sysctl(SYSCTL_HANDLER_ARGS); #endif static int hn_trust_hcsum_sysctl(SYSCTL_HANDLER_ARGS); static int hn_tx_chimney_size_sysctl(SYSCTL_HANDLER_ARGS); static int hn_rx_stat_ulong_sysctl(SYSCTL_HANDLER_ARGS); static int hn_rx_stat_u64_sysctl(SYSCTL_HANDLER_ARGS); static int hn_tx_stat_ulong_sysctl(SYSCTL_HANDLER_ARGS); static int hn_tx_conf_int_sysctl(SYSCTL_HANDLER_ARGS); static int hn_check_iplen(const struct mbuf *, int); static int hn_create_tx_ring(struct hn_softc *, int); static void hn_destroy_tx_ring(struct hn_tx_ring *); static int hn_create_tx_data(struct hn_softc *, int); static void hn_destroy_tx_data(struct hn_softc *); static void hn_start_taskfunc(void *, int); static void hn_start_txeof_taskfunc(void *, int); static void hn_stop_tx_tasks(struct hn_softc *); static int hn_encap(struct hn_tx_ring *, struct hn_txdesc *, struct mbuf **); static void hn_create_rx_data(struct hn_softc *sc, int); static void hn_destroy_rx_data(struct hn_softc *sc); static void hn_set_tx_chimney_size(struct hn_softc *, int); -static void hn_channel_attach(struct hn_softc *, struct hv_vmbus_channel *); -static void hn_subchan_attach(struct hn_softc *, struct hv_vmbus_channel *); +static void hn_channel_attach(struct hn_softc *, struct vmbus_channel *); +static void hn_subchan_attach(struct hn_softc *, struct vmbus_channel *); static void hn_subchan_setup(struct hn_softc *); static int hn_transmit(struct ifnet *, struct mbuf *); static void hn_xmit_qflush(struct ifnet *); static int hn_xmit(struct hn_tx_ring *, int); static void hn_xmit_txeof(struct hn_tx_ring *); static void hn_xmit_taskfunc(void *, int); static void hn_xmit_txeof_taskfunc(void *, int); #if __FreeBSD_version >= 1100099 static void hn_set_lro_lenlim(struct hn_softc *sc, int lenlim) { int i; for (i = 0; i < sc->hn_rx_ring_inuse; ++i) sc->hn_rx_ring[i].hn_lro.lro_length_lim = lenlim; } #endif static int hn_get_txswq_depth(const struct hn_tx_ring *txr) { KASSERT(txr->hn_txdesc_cnt > 0, ("tx ring is not setup yet")); if (hn_tx_swq_depth < txr->hn_txdesc_cnt) return txr->hn_txdesc_cnt; return hn_tx_swq_depth; } static int hn_ifmedia_upd(struct ifnet *ifp __unused) { return EOPNOTSUPP; } static void hn_ifmedia_sts(struct ifnet *ifp, struct ifmediareq *ifmr) { struct hn_softc *sc = ifp->if_softc; ifmr->ifm_status = IFM_AVALID; ifmr->ifm_active = IFM_ETHER; if (!sc->hn_carrier) { ifmr->ifm_active |= IFM_NONE; return; } ifmr->ifm_status |= IFM_ACTIVE; ifmr->ifm_active |= IFM_10G_T | IFM_FDX; } /* {F8615163-DF3E-46c5-913F-F2D2F965ED0E} */ static const struct hyperv_guid g_net_vsc_device_type = { .hv_guid = {0x63, 0x51, 0x61, 0xF8, 0x3E, 0xDF, 0xc5, 0x46, 0x91, 0x3F, 0xF2, 0xD2, 0xF9, 0x65, 0xED, 0x0E} }; /* * Standard probe entry point. * */ static int netvsc_probe(device_t dev) { if (VMBUS_PROBE_GUID(device_get_parent(dev), dev, &g_net_vsc_device_type) == 0) { device_set_desc(dev, "Hyper-V Network Interface"); return BUS_PROBE_DEFAULT; } return ENXIO; } /* * Standard attach entry point. * * Called when the driver is loaded. It allocates needed resources, * and initializes the "hardware" and software. */ static int netvsc_attach(device_t dev) { netvsc_device_info device_info; hn_softc_t *sc; int unit = device_get_unit(dev); struct ifnet *ifp = NULL; int error, ring_cnt, tx_ring_cnt; #if __FreeBSD_version >= 1100045 int tso_maxlen; #endif sc = device_get_softc(dev); sc->hn_unit = unit; sc->hn_dev = dev; sc->hn_prichan = vmbus_get_channel(dev); if (hn_tx_taskq == NULL) { sc->hn_tx_taskq = taskqueue_create("hn_tx", M_WAITOK, taskqueue_thread_enqueue, &sc->hn_tx_taskq); if (hn_bind_tx_taskq >= 0) { int cpu = hn_bind_tx_taskq; cpuset_t cpu_set; if (cpu > mp_ncpus - 1) cpu = mp_ncpus - 1; CPU_SETOF(cpu, &cpu_set); taskqueue_start_threads_cpuset(&sc->hn_tx_taskq, 1, PI_NET, &cpu_set, "%s tx", device_get_nameunit(dev)); } else { taskqueue_start_threads(&sc->hn_tx_taskq, 1, PI_NET, "%s tx", device_get_nameunit(dev)); } } else { sc->hn_tx_taskq = hn_tx_taskq; } NV_LOCK_INIT(sc, "NetVSCLock"); ifp = sc->hn_ifp = if_alloc(IFT_ETHER); ifp->if_softc = sc; if_initname(ifp, device_get_name(dev), device_get_unit(dev)); /* * Figure out the # of RX rings (ring_cnt) and the # of TX rings * to use (tx_ring_cnt). * * NOTE: * The # of RX rings to use is same as the # of channels to use. */ ring_cnt = hn_chan_cnt; if (ring_cnt <= 0) { /* Default */ ring_cnt = mp_ncpus; if (ring_cnt > HN_RING_CNT_DEF_MAX) ring_cnt = HN_RING_CNT_DEF_MAX; } else if (ring_cnt > mp_ncpus) { ring_cnt = mp_ncpus; } tx_ring_cnt = hn_tx_ring_cnt; if (tx_ring_cnt <= 0 || tx_ring_cnt > ring_cnt) tx_ring_cnt = ring_cnt; if (hn_use_if_start) { /* ifnet.if_start only needs one TX ring. */ tx_ring_cnt = 1; } /* * Set the leader CPU for channels. */ sc->hn_cpu = atomic_fetchadd_int(&hn_cpu_index, ring_cnt) % mp_ncpus; error = hn_create_tx_data(sc, tx_ring_cnt); if (error) goto failed; hn_create_rx_data(sc, ring_cnt); /* * Associate the first TX/RX ring w/ the primary channel. */ hn_channel_attach(sc, sc->hn_prichan); ifp->if_flags = IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST; ifp->if_ioctl = hn_ioctl; ifp->if_init = hn_ifinit; /* needed by hv_rf_on_device_add() code */ ifp->if_mtu = ETHERMTU; if (hn_use_if_start) { int qdepth = hn_get_txswq_depth(&sc->hn_tx_ring[0]); ifp->if_start = hn_start; IFQ_SET_MAXLEN(&ifp->if_snd, qdepth); ifp->if_snd.ifq_drv_maxlen = qdepth - 1; IFQ_SET_READY(&ifp->if_snd); } else { ifp->if_transmit = hn_transmit; ifp->if_qflush = hn_xmit_qflush; } ifmedia_init(&sc->hn_media, 0, hn_ifmedia_upd, hn_ifmedia_sts); ifmedia_add(&sc->hn_media, IFM_ETHER | IFM_AUTO, 0, NULL); ifmedia_set(&sc->hn_media, IFM_ETHER | IFM_AUTO); /* XXX ifmedia_set really should do this for us */ sc->hn_media.ifm_media = sc->hn_media.ifm_cur->ifm_media; /* * Tell upper layers that we support full VLAN capability. */ ifp->if_hdrlen = sizeof(struct ether_vlan_header); ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING | IFCAP_VLAN_MTU | IFCAP_HWCSUM | IFCAP_TSO | IFCAP_LRO; ifp->if_capenable |= IFCAP_VLAN_HWTAGGING | IFCAP_VLAN_MTU | IFCAP_HWCSUM | IFCAP_TSO | IFCAP_LRO; ifp->if_hwassist = sc->hn_tx_ring[0].hn_csum_assist | CSUM_TSO; - error = hv_rf_on_device_add(sc, &device_info, ring_cnt); + error = hv_rf_on_device_add(sc, &device_info, ring_cnt, + &sc->hn_rx_ring[0]); if (error) goto failed; KASSERT(sc->net_dev->num_channel > 0 && sc->net_dev->num_channel <= sc->hn_rx_ring_inuse, ("invalid channel count %u, should be less than %d", sc->net_dev->num_channel, sc->hn_rx_ring_inuse)); /* * Set the # of TX/RX rings that could be used according to * the # of channels that host offered. */ if (sc->hn_tx_ring_inuse > sc->net_dev->num_channel) sc->hn_tx_ring_inuse = sc->net_dev->num_channel; sc->hn_rx_ring_inuse = sc->net_dev->num_channel; device_printf(dev, "%d TX ring, %d RX ring\n", sc->hn_tx_ring_inuse, sc->hn_rx_ring_inuse); if (sc->net_dev->num_channel > 1) hn_subchan_setup(sc); #if __FreeBSD_version >= 1100099 if (sc->hn_rx_ring_inuse > 1) { /* * Reduce TCP segment aggregation limit for multiple * RX rings to increase ACK timeliness. */ hn_set_lro_lenlim(sc, HN_LRO_LENLIM_MULTIRX_DEF); } #endif if (device_info.link_state == 0) { sc->hn_carrier = 1; } #if __FreeBSD_version >= 1100045 tso_maxlen = hn_tso_maxlen; if (tso_maxlen <= 0 || tso_maxlen > IP_MAXPACKET) tso_maxlen = IP_MAXPACKET; ifp->if_hw_tsomaxsegcount = HN_TX_DATA_SEGCNT_MAX; ifp->if_hw_tsomaxsegsize = PAGE_SIZE; ifp->if_hw_tsomax = tso_maxlen - (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN); #endif ether_ifattach(ifp, device_info.mac_addr); #if __FreeBSD_version >= 1100045 if_printf(ifp, "TSO: %u/%u/%u\n", ifp->if_hw_tsomax, ifp->if_hw_tsomaxsegcount, ifp->if_hw_tsomaxsegsize); #endif sc->hn_tx_chimney_max = sc->net_dev->send_section_size; hn_set_tx_chimney_size(sc, sc->hn_tx_chimney_max); if (hn_tx_chimney_size > 0 && hn_tx_chimney_size < sc->hn_tx_chimney_max) hn_set_tx_chimney_size(sc, hn_tx_chimney_size); return (0); failed: hn_destroy_tx_data(sc); if (ifp != NULL) if_free(ifp); return (error); } /* * Standard detach entry point */ static int netvsc_detach(device_t dev) { struct hn_softc *sc = device_get_softc(dev); if (bootverbose) printf("netvsc_detach\n"); /* * XXXKYS: Need to clean up all our * driver state; this is the driver * unloading. */ /* * XXXKYS: Need to stop outgoing traffic and unregister * the netdevice. */ hv_rf_on_device_remove(sc, HV_RF_NV_DESTROY_CHANNEL); hn_stop_tx_tasks(sc); ifmedia_removeall(&sc->hn_media); hn_destroy_rx_data(sc); hn_destroy_tx_data(sc); if (sc->hn_tx_taskq != hn_tx_taskq) taskqueue_free(sc->hn_tx_taskq); return (0); } /* * Standard shutdown entry point */ static int netvsc_shutdown(device_t dev) { return (0); } static __inline int hn_txdesc_dmamap_load(struct hn_tx_ring *txr, struct hn_txdesc *txd, struct mbuf **m_head, bus_dma_segment_t *segs, int *nsegs) { struct mbuf *m = *m_head; int error; error = bus_dmamap_load_mbuf_sg(txr->hn_tx_data_dtag, txd->data_dmap, m, segs, nsegs, BUS_DMA_NOWAIT); if (error == EFBIG) { struct mbuf *m_new; m_new = m_collapse(m, M_NOWAIT, HN_TX_DATA_SEGCNT_MAX); if (m_new == NULL) return ENOBUFS; else *m_head = m = m_new; txr->hn_tx_collapsed++; error = bus_dmamap_load_mbuf_sg(txr->hn_tx_data_dtag, txd->data_dmap, m, segs, nsegs, BUS_DMA_NOWAIT); } if (!error) { bus_dmamap_sync(txr->hn_tx_data_dtag, txd->data_dmap, BUS_DMASYNC_PREWRITE); txd->flags |= HN_TXD_FLAG_DMAMAP; } return error; } static __inline void hn_txdesc_dmamap_unload(struct hn_tx_ring *txr, struct hn_txdesc *txd) { if (txd->flags & HN_TXD_FLAG_DMAMAP) { bus_dmamap_sync(txr->hn_tx_data_dtag, txd->data_dmap, BUS_DMASYNC_POSTWRITE); bus_dmamap_unload(txr->hn_tx_data_dtag, txd->data_dmap); txd->flags &= ~HN_TXD_FLAG_DMAMAP; } } static __inline int hn_txdesc_put(struct hn_tx_ring *txr, struct hn_txdesc *txd) { KASSERT((txd->flags & HN_TXD_FLAG_ONLIST) == 0, ("put an onlist txd %#x", txd->flags)); KASSERT(txd->refs > 0, ("invalid txd refs %d", txd->refs)); if (atomic_fetchadd_int(&txd->refs, -1) != 1) return 0; hn_txdesc_dmamap_unload(txr, txd); if (txd->m != NULL) { m_freem(txd->m); txd->m = NULL; } txd->flags |= HN_TXD_FLAG_ONLIST; #ifndef HN_USE_TXDESC_BUFRING mtx_lock_spin(&txr->hn_txlist_spin); KASSERT(txr->hn_txdesc_avail >= 0 && txr->hn_txdesc_avail < txr->hn_txdesc_cnt, ("txdesc_put: invalid txd avail %d", txr->hn_txdesc_avail)); txr->hn_txdesc_avail++; SLIST_INSERT_HEAD(&txr->hn_txlist, txd, link); mtx_unlock_spin(&txr->hn_txlist_spin); #else atomic_add_int(&txr->hn_txdesc_avail, 1); buf_ring_enqueue(txr->hn_txdesc_br, txd); #endif return 1; } static __inline struct hn_txdesc * hn_txdesc_get(struct hn_tx_ring *txr) { struct hn_txdesc *txd; #ifndef HN_USE_TXDESC_BUFRING mtx_lock_spin(&txr->hn_txlist_spin); txd = SLIST_FIRST(&txr->hn_txlist); if (txd != NULL) { KASSERT(txr->hn_txdesc_avail > 0, ("txdesc_get: invalid txd avail %d", txr->hn_txdesc_avail)); txr->hn_txdesc_avail--; SLIST_REMOVE_HEAD(&txr->hn_txlist, link); } mtx_unlock_spin(&txr->hn_txlist_spin); #else txd = buf_ring_dequeue_sc(txr->hn_txdesc_br); #endif if (txd != NULL) { #ifdef HN_USE_TXDESC_BUFRING atomic_subtract_int(&txr->hn_txdesc_avail, 1); #endif KASSERT(txd->m == NULL && txd->refs == 0 && (txd->flags & HN_TXD_FLAG_ONLIST), ("invalid txd")); txd->flags &= ~HN_TXD_FLAG_ONLIST; txd->refs = 1; } return txd; } static __inline void hn_txdesc_hold(struct hn_txdesc *txd) { /* 0->1 transition will never work */ KASSERT(txd->refs > 0, ("invalid refs %d", txd->refs)); atomic_add_int(&txd->refs, 1); } static __inline void hn_txeof(struct hn_tx_ring *txr) { txr->hn_has_txeof = 0; txr->hn_txeof(txr); } static void -hn_tx_done(struct hv_vmbus_channel *chan, void *xpkt) +hn_tx_done(struct vmbus_channel *chan, void *xpkt) { netvsc_packet *packet = xpkt; struct hn_txdesc *txd; struct hn_tx_ring *txr; txd = (struct hn_txdesc *)(uintptr_t) packet->compl.send.send_completion_tid; txr = txd->txr; KASSERT(txr->hn_chan == chan, - ("channel mismatch, on channel%u, should be channel%u", - chan->ch_subidx, - txr->hn_chan->ch_subidx)); + ("channel mismatch, on chan%u, should be chan%u", + vmbus_chan_subidx(chan), vmbus_chan_subidx(txr->hn_chan))); txr->hn_has_txeof = 1; hn_txdesc_put(txr, txd); ++txr->hn_txdone_cnt; if (txr->hn_txdone_cnt >= HN_EARLY_TXEOF_THRESH) { txr->hn_txdone_cnt = 0; if (txr->hn_oactive) hn_txeof(txr); } } void -netvsc_channel_rollup(struct hv_vmbus_channel *chan) +netvsc_channel_rollup(struct hn_rx_ring *rxr, struct hn_tx_ring *txr) { - struct hn_tx_ring *txr = chan->hv_chan_txr; #if defined(INET) || defined(INET6) - struct hn_rx_ring *rxr = chan->hv_chan_rxr; - tcp_lro_flush_all(&rxr->hn_lro); #endif /* * NOTE: * 'txr' could be NULL, if multiple channels and * ifnet.if_start method are enabled. */ if (txr == NULL || !txr->hn_has_txeof) return; txr->hn_txdone_cnt = 0; hn_txeof(txr); } /* * NOTE: * If this function fails, then both txd and m_head0 will be freed. */ static int hn_encap(struct hn_tx_ring *txr, struct hn_txdesc *txd, struct mbuf **m_head0) { bus_dma_segment_t segs[HN_TX_DATA_SEGCNT_MAX]; int error, nsegs, i; struct mbuf *m_head = *m_head0; netvsc_packet *packet; rndis_msg *rndis_mesg; rndis_packet *rndis_pkt; rndis_per_packet_info *rppi; struct rndis_hash_value *hash_value; uint32_t rndis_msg_size; packet = &txd->netvsc_pkt; packet->is_data_pkt = TRUE; packet->tot_data_buf_len = m_head->m_pkthdr.len; /* * extension points to the area reserved for the * rndis_filter_packet, which is placed just after * the netvsc_packet (and rppi struct, if present; * length is updated later). */ rndis_mesg = txd->rndis_msg; /* XXX not necessary */ memset(rndis_mesg, 0, HN_RNDIS_MSG_LEN); rndis_mesg->ndis_msg_type = REMOTE_NDIS_PACKET_MSG; rndis_pkt = &rndis_mesg->msg.packet; rndis_pkt->data_offset = sizeof(rndis_packet); rndis_pkt->data_length = packet->tot_data_buf_len; rndis_pkt->per_pkt_info_offset = sizeof(rndis_packet); rndis_msg_size = RNDIS_MESSAGE_SIZE(rndis_packet); /* * Set the hash value for this packet, so that the host could * dispatch the TX done event for this packet back to this TX * ring's channel. */ rndis_msg_size += RNDIS_HASHVAL_PPI_SIZE; rppi = hv_set_rppi_data(rndis_mesg, RNDIS_HASHVAL_PPI_SIZE, nbl_hash_value); hash_value = (struct rndis_hash_value *)((uint8_t *)rppi + rppi->per_packet_info_offset); hash_value->hash_value = txr->hn_tx_idx; if (m_head->m_flags & M_VLANTAG) { ndis_8021q_info *rppi_vlan_info; rndis_msg_size += RNDIS_VLAN_PPI_SIZE; rppi = hv_set_rppi_data(rndis_mesg, RNDIS_VLAN_PPI_SIZE, ieee_8021q_info); rppi_vlan_info = (ndis_8021q_info *)((uint8_t *)rppi + rppi->per_packet_info_offset); rppi_vlan_info->u1.s1.vlan_id = m_head->m_pkthdr.ether_vtag & 0xfff; } if (m_head->m_pkthdr.csum_flags & CSUM_TSO) { rndis_tcp_tso_info *tso_info; struct ether_vlan_header *eh; int ether_len; /* * XXX need m_pullup and use mtodo */ eh = mtod(m_head, struct ether_vlan_header*); if (eh->evl_encap_proto == htons(ETHERTYPE_VLAN)) ether_len = ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN; else ether_len = ETHER_HDR_LEN; rndis_msg_size += RNDIS_TSO_PPI_SIZE; rppi = hv_set_rppi_data(rndis_mesg, RNDIS_TSO_PPI_SIZE, tcp_large_send_info); tso_info = (rndis_tcp_tso_info *)((uint8_t *)rppi + rppi->per_packet_info_offset); tso_info->lso_v2_xmit.type = RNDIS_TCP_LARGE_SEND_OFFLOAD_V2_TYPE; #ifdef INET if (m_head->m_pkthdr.csum_flags & CSUM_IP_TSO) { struct ip *ip = (struct ip *)(m_head->m_data + ether_len); unsigned long iph_len = ip->ip_hl << 2; struct tcphdr *th = (struct tcphdr *)((caddr_t)ip + iph_len); tso_info->lso_v2_xmit.ip_version = RNDIS_TCP_LARGE_SEND_OFFLOAD_IPV4; ip->ip_len = 0; ip->ip_sum = 0; th->th_sum = in_pseudo(ip->ip_src.s_addr, ip->ip_dst.s_addr, htons(IPPROTO_TCP)); } #endif #if defined(INET6) && defined(INET) else #endif #ifdef INET6 { struct ip6_hdr *ip6 = (struct ip6_hdr *) (m_head->m_data + ether_len); struct tcphdr *th = (struct tcphdr *)(ip6 + 1); tso_info->lso_v2_xmit.ip_version = RNDIS_TCP_LARGE_SEND_OFFLOAD_IPV6; ip6->ip6_plen = 0; th->th_sum = in6_cksum_pseudo(ip6, 0, IPPROTO_TCP, 0); } #endif tso_info->lso_v2_xmit.tcp_header_offset = 0; tso_info->lso_v2_xmit.mss = m_head->m_pkthdr.tso_segsz; } else if (m_head->m_pkthdr.csum_flags & txr->hn_csum_assist) { rndis_tcp_ip_csum_info *csum_info; rndis_msg_size += RNDIS_CSUM_PPI_SIZE; rppi = hv_set_rppi_data(rndis_mesg, RNDIS_CSUM_PPI_SIZE, tcpip_chksum_info); csum_info = (rndis_tcp_ip_csum_info *)((uint8_t *)rppi + rppi->per_packet_info_offset); csum_info->xmit.is_ipv4 = 1; if (m_head->m_pkthdr.csum_flags & CSUM_IP) csum_info->xmit.ip_header_csum = 1; if (m_head->m_pkthdr.csum_flags & CSUM_TCP) { csum_info->xmit.tcp_csum = 1; csum_info->xmit.tcp_header_offset = 0; } else if (m_head->m_pkthdr.csum_flags & CSUM_UDP) { csum_info->xmit.udp_csum = 1; } } rndis_mesg->msg_len = packet->tot_data_buf_len + rndis_msg_size; packet->tot_data_buf_len = rndis_mesg->msg_len; /* * Chimney send, if the packet could fit into one chimney buffer. */ if (packet->tot_data_buf_len < txr->hn_tx_chimney_size) { netvsc_dev *net_dev = txr->hn_sc->net_dev; uint32_t send_buf_section_idx; txr->hn_tx_chimney_tried++; send_buf_section_idx = hv_nv_get_next_send_section(net_dev); if (send_buf_section_idx != NVSP_1_CHIMNEY_SEND_INVALID_SECTION_INDEX) { uint8_t *dest = ((uint8_t *)net_dev->send_buf + (send_buf_section_idx * net_dev->send_section_size)); memcpy(dest, rndis_mesg, rndis_msg_size); dest += rndis_msg_size; m_copydata(m_head, 0, m_head->m_pkthdr.len, dest); packet->send_buf_section_idx = send_buf_section_idx; packet->send_buf_section_size = packet->tot_data_buf_len; packet->gpa_cnt = 0; txr->hn_tx_chimney++; goto done; } } error = hn_txdesc_dmamap_load(txr, txd, &m_head, segs, &nsegs); if (error) { int freed; /* * This mbuf is not linked w/ the txd yet, so free it now. */ m_freem(m_head); *m_head0 = NULL; freed = hn_txdesc_put(txr, txd); KASSERT(freed != 0, ("fail to free txd upon txdma error")); txr->hn_txdma_failed++; if_inc_counter(txr->hn_sc->hn_ifp, IFCOUNTER_OERRORS, 1); return error; } *m_head0 = m_head; packet->gpa_cnt = nsegs + HV_RF_NUM_TX_RESERVED_PAGE_BUFS; /* send packet with page buffer */ packet->gpa[0].gpa_page = atop(txd->rndis_msg_paddr); packet->gpa[0].gpa_ofs = txd->rndis_msg_paddr & PAGE_MASK; packet->gpa[0].gpa_len = rndis_msg_size; /* * Fill the page buffers with mbuf info starting at index * HV_RF_NUM_TX_RESERVED_PAGE_BUFS. */ for (i = 0; i < nsegs; ++i) { struct vmbus_gpa *gpa = &packet->gpa[ i + HV_RF_NUM_TX_RESERVED_PAGE_BUFS]; gpa->gpa_page = atop(segs[i].ds_addr); gpa->gpa_ofs = segs[i].ds_addr & PAGE_MASK; gpa->gpa_len = segs[i].ds_len; } packet->send_buf_section_idx = NVSP_1_CHIMNEY_SEND_INVALID_SECTION_INDEX; packet->send_buf_section_size = 0; done: txd->m = m_head; /* Set the completion routine */ packet->compl.send.on_send_completion = hn_tx_done; packet->compl.send.send_completion_context = packet; packet->compl.send.send_completion_tid = (uint64_t)(uintptr_t)txd; return 0; } /* * NOTE: * If this function fails, then txd will be freed, but the mbuf * associated w/ the txd will _not_ be freed. */ static int hn_send_pkt(struct ifnet *ifp, struct hn_tx_ring *txr, struct hn_txdesc *txd) { int error, send_failed = 0; again: /* * Make sure that txd is not freed before ETHER_BPF_MTAP. */ hn_txdesc_hold(txd); error = hv_nv_on_send(txr->hn_chan, &txd->netvsc_pkt); if (!error) { ETHER_BPF_MTAP(ifp, txd->m); if_inc_counter(ifp, IFCOUNTER_OPACKETS, 1); if (!hn_use_if_start) { if_inc_counter(ifp, IFCOUNTER_OBYTES, txd->m->m_pkthdr.len); if (txd->m->m_flags & M_MCAST) if_inc_counter(ifp, IFCOUNTER_OMCASTS, 1); } txr->hn_pkts++; } hn_txdesc_put(txr, txd); if (__predict_false(error)) { int freed; /* * This should "really rarely" happen. * * XXX Too many RX to be acked or too many sideband * commands to run? Ask netvsc_channel_rollup() * to kick start later. */ txr->hn_has_txeof = 1; if (!send_failed) { txr->hn_send_failed++; send_failed = 1; /* * Try sending again after set hn_has_txeof; * in case that we missed the last * netvsc_channel_rollup(). */ goto again; } if_printf(ifp, "send failed\n"); /* * Caller will perform further processing on the * associated mbuf, so don't free it in hn_txdesc_put(); * only unload it from the DMA map in hn_txdesc_put(), * if it was loaded. */ txd->m = NULL; freed = hn_txdesc_put(txr, txd); KASSERT(freed != 0, ("fail to free txd upon send error")); txr->hn_send_failed++; } return error; } /* * Start a transmit of one or more packets */ static int hn_start_locked(struct hn_tx_ring *txr, int len) { struct hn_softc *sc = txr->hn_sc; struct ifnet *ifp = sc->hn_ifp; KASSERT(hn_use_if_start, ("hn_start_locked is called, when if_start is disabled")); KASSERT(txr == &sc->hn_tx_ring[0], ("not the first TX ring")); mtx_assert(&txr->hn_tx_lock, MA_OWNED); if ((ifp->if_drv_flags & (IFF_DRV_RUNNING | IFF_DRV_OACTIVE)) != IFF_DRV_RUNNING) return 0; while (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) { struct hn_txdesc *txd; struct mbuf *m_head; int error; IFQ_DRV_DEQUEUE(&ifp->if_snd, m_head); if (m_head == NULL) break; if (len > 0 && m_head->m_pkthdr.len > len) { /* * This sending could be time consuming; let callers * dispatch this packet sending (and sending of any * following up packets) to tx taskqueue. */ IFQ_DRV_PREPEND(&ifp->if_snd, m_head); return 1; } txd = hn_txdesc_get(txr); if (txd == NULL) { txr->hn_no_txdescs++; IFQ_DRV_PREPEND(&ifp->if_snd, m_head); atomic_set_int(&ifp->if_drv_flags, IFF_DRV_OACTIVE); break; } error = hn_encap(txr, txd, &m_head); if (error) { /* Both txd and m_head are freed */ continue; } error = hn_send_pkt(ifp, txr, txd); if (__predict_false(error)) { /* txd is freed, but m_head is not */ IFQ_DRV_PREPEND(&ifp->if_snd, m_head); atomic_set_int(&ifp->if_drv_flags, IFF_DRV_OACTIVE); break; } } return 0; } /* * Link up/down notification */ void netvsc_linkstatus_callback(struct hn_softc *sc, uint32_t status) { if (status == 1) { sc->hn_carrier = 1; } else { sc->hn_carrier = 0; } } /* * Append the specified data to the indicated mbuf chain, * Extend the mbuf chain if the new data does not fit in * existing space. * * This is a minor rewrite of m_append() from sys/kern/uipc_mbuf.c. * There should be an equivalent in the kernel mbuf code, * but there does not appear to be one yet. * * Differs from m_append() in that additional mbufs are * allocated with cluster size MJUMPAGESIZE, and filled * accordingly. * * Return 1 if able to complete the job; otherwise 0. */ static int hv_m_append(struct mbuf *m0, int len, c_caddr_t cp) { struct mbuf *m, *n; int remainder, space; for (m = m0; m->m_next != NULL; m = m->m_next) ; remainder = len; space = M_TRAILINGSPACE(m); if (space > 0) { /* * Copy into available space. */ if (space > remainder) space = remainder; bcopy(cp, mtod(m, caddr_t) + m->m_len, space); m->m_len += space; cp += space; remainder -= space; } while (remainder > 0) { /* * Allocate a new mbuf; could check space * and allocate a cluster instead. */ n = m_getjcl(M_NOWAIT, m->m_type, 0, MJUMPAGESIZE); if (n == NULL) break; n->m_len = min(MJUMPAGESIZE, remainder); bcopy(cp, mtod(n, caddr_t), n->m_len); cp += n->m_len; remainder -= n->m_len; m->m_next = n; m = n; } if (m0->m_flags & M_PKTHDR) m0->m_pkthdr.len += len - remainder; return (remainder == 0); } #if defined(INET) || defined(INET6) static __inline int hn_lro_rx(struct lro_ctrl *lc, struct mbuf *m) { #if __FreeBSD_version >= 1100095 if (hn_lro_mbufq_depth) { tcp_lro_queue_mbuf(lc, m); return 0; } #endif return tcp_lro_rx(lc, m, 0); } #endif /* * Called when we receive a data packet from the "wire" on the * specified device * * Note: This is no longer used as a callback */ int -netvsc_recv(struct hv_vmbus_channel *chan, netvsc_packet *packet, +netvsc_recv(struct hn_rx_ring *rxr, netvsc_packet *packet, const rndis_tcp_ip_csum_info *csum_info, const struct rndis_hash_info *hash_info, const struct rndis_hash_value *hash_value) { - struct hn_rx_ring *rxr = chan->hv_chan_rxr; struct ifnet *ifp = rxr->hn_ifp; struct mbuf *m_new; int size, do_lro = 0, do_csum = 1; int hash_type = M_HASHTYPE_OPAQUE_HASH; if (!(ifp->if_drv_flags & IFF_DRV_RUNNING)) return (0); /* * Bail out if packet contains more data than configured MTU. */ if (packet->tot_data_buf_len > (ifp->if_mtu + ETHER_HDR_LEN)) { return (0); } else if (packet->tot_data_buf_len <= MHLEN) { m_new = m_gethdr(M_NOWAIT, MT_DATA); if (m_new == NULL) { if_inc_counter(ifp, IFCOUNTER_IQDROPS, 1); return (0); } memcpy(mtod(m_new, void *), packet->data, packet->tot_data_buf_len); m_new->m_pkthdr.len = m_new->m_len = packet->tot_data_buf_len; rxr->hn_small_pkts++; } else { /* * Get an mbuf with a cluster. For packets 2K or less, * get a standard 2K cluster. For anything larger, get a * 4K cluster. Any buffers larger than 4K can cause problems * if looped around to the Hyper-V TX channel, so avoid them. */ size = MCLBYTES; if (packet->tot_data_buf_len > MCLBYTES) { /* 4096 */ size = MJUMPAGESIZE; } m_new = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, size); if (m_new == NULL) { if_inc_counter(ifp, IFCOUNTER_IQDROPS, 1); return (0); } hv_m_append(m_new, packet->tot_data_buf_len, packet->data); } m_new->m_pkthdr.rcvif = ifp; if (__predict_false((ifp->if_capenable & IFCAP_RXCSUM) == 0)) do_csum = 0; /* receive side checksum offload */ if (csum_info != NULL) { /* IP csum offload */ if (csum_info->receive.ip_csum_succeeded && do_csum) { m_new->m_pkthdr.csum_flags |= (CSUM_IP_CHECKED | CSUM_IP_VALID); rxr->hn_csum_ip++; } /* TCP/UDP csum offload */ if ((csum_info->receive.tcp_csum_succeeded || csum_info->receive.udp_csum_succeeded) && do_csum) { m_new->m_pkthdr.csum_flags |= (CSUM_DATA_VALID | CSUM_PSEUDO_HDR); m_new->m_pkthdr.csum_data = 0xffff; if (csum_info->receive.tcp_csum_succeeded) rxr->hn_csum_tcp++; else rxr->hn_csum_udp++; } if (csum_info->receive.ip_csum_succeeded && csum_info->receive.tcp_csum_succeeded) do_lro = 1; } else { const struct ether_header *eh; uint16_t etype; int hoff; hoff = sizeof(*eh); if (m_new->m_len < hoff) goto skip; eh = mtod(m_new, struct ether_header *); etype = ntohs(eh->ether_type); if (etype == ETHERTYPE_VLAN) { const struct ether_vlan_header *evl; hoff = sizeof(*evl); if (m_new->m_len < hoff) goto skip; evl = mtod(m_new, struct ether_vlan_header *); etype = ntohs(evl->evl_proto); } if (etype == ETHERTYPE_IP) { int pr; pr = hn_check_iplen(m_new, hoff); if (pr == IPPROTO_TCP) { if (do_csum && (rxr->hn_trust_hcsum & HN_TRUST_HCSUM_TCP)) { rxr->hn_csum_trusted++; m_new->m_pkthdr.csum_flags |= (CSUM_IP_CHECKED | CSUM_IP_VALID | CSUM_DATA_VALID | CSUM_PSEUDO_HDR); m_new->m_pkthdr.csum_data = 0xffff; } do_lro = 1; } else if (pr == IPPROTO_UDP) { if (do_csum && (rxr->hn_trust_hcsum & HN_TRUST_HCSUM_UDP)) { rxr->hn_csum_trusted++; m_new->m_pkthdr.csum_flags |= (CSUM_IP_CHECKED | CSUM_IP_VALID | CSUM_DATA_VALID | CSUM_PSEUDO_HDR); m_new->m_pkthdr.csum_data = 0xffff; } } else if (pr != IPPROTO_DONE && do_csum && (rxr->hn_trust_hcsum & HN_TRUST_HCSUM_IP)) { rxr->hn_csum_trusted++; m_new->m_pkthdr.csum_flags |= (CSUM_IP_CHECKED | CSUM_IP_VALID); } } } skip: if ((packet->vlan_tci != 0) && (ifp->if_capenable & IFCAP_VLAN_HWTAGGING) != 0) { m_new->m_pkthdr.ether_vtag = packet->vlan_tci; m_new->m_flags |= M_VLANTAG; } if (hash_info != NULL && hash_value != NULL) { rxr->hn_rss_pkts++; m_new->m_pkthdr.flowid = hash_value->hash_value; if ((hash_info->hash_info & NDIS_HASH_FUNCTION_MASK) == NDIS_HASH_FUNCTION_TOEPLITZ) { uint32_t type = (hash_info->hash_info & NDIS_HASH_TYPE_MASK); switch (type) { case NDIS_HASH_IPV4: hash_type = M_HASHTYPE_RSS_IPV4; break; case NDIS_HASH_TCP_IPV4: hash_type = M_HASHTYPE_RSS_TCP_IPV4; break; case NDIS_HASH_IPV6: hash_type = M_HASHTYPE_RSS_IPV6; break; case NDIS_HASH_IPV6_EX: hash_type = M_HASHTYPE_RSS_IPV6_EX; break; case NDIS_HASH_TCP_IPV6: hash_type = M_HASHTYPE_RSS_TCP_IPV6; break; case NDIS_HASH_TCP_IPV6_EX: hash_type = M_HASHTYPE_RSS_TCP_IPV6_EX; break; } } } else { if (hash_value != NULL) { m_new->m_pkthdr.flowid = hash_value->hash_value; } else { m_new->m_pkthdr.flowid = rxr->hn_rx_idx; hash_type = M_HASHTYPE_OPAQUE; } } M_HASHTYPE_SET(m_new, hash_type); /* * Note: Moved RX completion back to hv_nv_on_receive() so all * messages (not just data messages) will trigger a response. */ if_inc_counter(ifp, IFCOUNTER_IPACKETS, 1); rxr->hn_pkts++; if ((ifp->if_capenable & IFCAP_LRO) && do_lro) { #if defined(INET) || defined(INET6) struct lro_ctrl *lro = &rxr->hn_lro; if (lro->lro_cnt) { rxr->hn_lro_tried++; if (hn_lro_rx(lro, m_new) == 0) { /* DONE! */ return 0; } } #endif } /* We're not holding the lock here, so don't release it */ (*ifp->if_input)(ifp, m_new); return (0); } /* * Rules for using sc->temp_unusable: * 1. sc->temp_unusable can only be read or written while holding NV_LOCK() * 2. code reading sc->temp_unusable under NV_LOCK(), and finding * sc->temp_unusable set, must release NV_LOCK() and exit * 3. to retain exclusive control of the interface, * sc->temp_unusable must be set by code before releasing NV_LOCK() * 4. only code setting sc->temp_unusable can clear sc->temp_unusable * 5. code setting sc->temp_unusable must eventually clear sc->temp_unusable */ /* * Standard ioctl entry point. Called when the user wants to configure * the interface. */ static int hn_ioctl(struct ifnet *ifp, u_long cmd, caddr_t data) { hn_softc_t *sc = ifp->if_softc; struct ifreq *ifr = (struct ifreq *)data; #ifdef INET struct ifaddr *ifa = (struct ifaddr *)data; #endif netvsc_device_info device_info; int mask, error = 0; int retry_cnt = 500; switch(cmd) { case SIOCSIFADDR: #ifdef INET if (ifa->ifa_addr->sa_family == AF_INET) { ifp->if_flags |= IFF_UP; if (!(ifp->if_drv_flags & IFF_DRV_RUNNING)) hn_ifinit(sc); arp_ifinit(ifp, ifa); } else #endif error = ether_ioctl(ifp, cmd, data); break; case SIOCSIFMTU: /* Check MTU value change */ if (ifp->if_mtu == ifr->ifr_mtu) break; if (ifr->ifr_mtu > NETVSC_MAX_CONFIGURABLE_MTU) { error = EINVAL; break; } /* Obtain and record requested MTU */ ifp->if_mtu = ifr->ifr_mtu; #if __FreeBSD_version >= 1100099 /* * Make sure that LRO aggregation length limit is still * valid, after the MTU change. */ NV_LOCK(sc); if (sc->hn_rx_ring[0].hn_lro.lro_length_lim < HN_LRO_LENLIM_MIN(ifp)) hn_set_lro_lenlim(sc, HN_LRO_LENLIM_MIN(ifp)); NV_UNLOCK(sc); #endif do { NV_LOCK(sc); if (!sc->temp_unusable) { sc->temp_unusable = TRUE; retry_cnt = -1; } NV_UNLOCK(sc); if (retry_cnt > 0) { retry_cnt--; DELAY(5 * 1000); } } while (retry_cnt > 0); if (retry_cnt == 0) { error = EINVAL; break; } /* We must remove and add back the device to cause the new * MTU to take effect. This includes tearing down, but not * deleting the channel, then bringing it back up. */ error = hv_rf_on_device_remove(sc, HV_RF_NV_RETAIN_CHANNEL); if (error) { NV_LOCK(sc); sc->temp_unusable = FALSE; NV_UNLOCK(sc); break; } /* Wait for subchannels to be destroyed */ vmbus_subchan_drain(sc->hn_prichan); error = hv_rf_on_device_add(sc, &device_info, - sc->hn_rx_ring_inuse); + sc->hn_rx_ring_inuse, &sc->hn_rx_ring[0]); if (error) { NV_LOCK(sc); sc->temp_unusable = FALSE; NV_UNLOCK(sc); break; } KASSERT(sc->hn_rx_ring_cnt == sc->net_dev->num_channel, ("RX ring count %d and channel count %u mismatch", sc->hn_rx_ring_cnt, sc->net_dev->num_channel)); if (sc->net_dev->num_channel > 1) { int r; /* * Skip the rings on primary channel; they are * handled by the hv_rf_on_device_add() above. */ for (r = 1; r < sc->hn_rx_ring_cnt; ++r) { sc->hn_rx_ring[r].hn_rx_flags &= ~HN_RX_FLAG_ATTACHED; } for (r = 1; r < sc->hn_tx_ring_cnt; ++r) { sc->hn_tx_ring[r].hn_tx_flags &= ~HN_TX_FLAG_ATTACHED; } hn_subchan_setup(sc); } sc->hn_tx_chimney_max = sc->net_dev->send_section_size; if (sc->hn_tx_ring[0].hn_tx_chimney_size > sc->hn_tx_chimney_max) hn_set_tx_chimney_size(sc, sc->hn_tx_chimney_max); hn_ifinit_locked(sc); NV_LOCK(sc); sc->temp_unusable = FALSE; NV_UNLOCK(sc); break; case SIOCSIFFLAGS: do { NV_LOCK(sc); if (!sc->temp_unusable) { sc->temp_unusable = TRUE; retry_cnt = -1; } NV_UNLOCK(sc); if (retry_cnt > 0) { retry_cnt--; DELAY(5 * 1000); } } while (retry_cnt > 0); if (retry_cnt == 0) { error = EINVAL; break; } if (ifp->if_flags & IFF_UP) { /* * If only the state of the PROMISC flag changed, * then just use the 'set promisc mode' command * instead of reinitializing the entire NIC. Doing * a full re-init means reloading the firmware and * waiting for it to start up, which may take a * second or two. */ #ifdef notyet /* Fixme: Promiscuous mode? */ if (ifp->if_drv_flags & IFF_DRV_RUNNING && ifp->if_flags & IFF_PROMISC && !(sc->hn_if_flags & IFF_PROMISC)) { /* do something here for Hyper-V */ } else if (ifp->if_drv_flags & IFF_DRV_RUNNING && !(ifp->if_flags & IFF_PROMISC) && sc->hn_if_flags & IFF_PROMISC) { /* do something here for Hyper-V */ } else #endif hn_ifinit_locked(sc); } else { if (ifp->if_drv_flags & IFF_DRV_RUNNING) { hn_stop(sc); } } NV_LOCK(sc); sc->temp_unusable = FALSE; NV_UNLOCK(sc); sc->hn_if_flags = ifp->if_flags; error = 0; break; case SIOCSIFCAP: NV_LOCK(sc); mask = ifr->ifr_reqcap ^ ifp->if_capenable; if (mask & IFCAP_TXCSUM) { ifp->if_capenable ^= IFCAP_TXCSUM; if (ifp->if_capenable & IFCAP_TXCSUM) { ifp->if_hwassist |= sc->hn_tx_ring[0].hn_csum_assist; } else { ifp->if_hwassist &= ~sc->hn_tx_ring[0].hn_csum_assist; } } if (mask & IFCAP_RXCSUM) ifp->if_capenable ^= IFCAP_RXCSUM; if (mask & IFCAP_LRO) ifp->if_capenable ^= IFCAP_LRO; if (mask & IFCAP_TSO4) { ifp->if_capenable ^= IFCAP_TSO4; if (ifp->if_capenable & IFCAP_TSO4) ifp->if_hwassist |= CSUM_IP_TSO; else ifp->if_hwassist &= ~CSUM_IP_TSO; } if (mask & IFCAP_TSO6) { ifp->if_capenable ^= IFCAP_TSO6; if (ifp->if_capenable & IFCAP_TSO6) ifp->if_hwassist |= CSUM_IP6_TSO; else ifp->if_hwassist &= ~CSUM_IP6_TSO; } NV_UNLOCK(sc); error = 0; break; case SIOCADDMULTI: case SIOCDELMULTI: #ifdef notyet /* Fixme: Multicast mode? */ if (ifp->if_drv_flags & IFF_DRV_RUNNING) { NV_LOCK(sc); netvsc_setmulti(sc); NV_UNLOCK(sc); error = 0; } #endif error = EINVAL; break; case SIOCSIFMEDIA: case SIOCGIFMEDIA: error = ifmedia_ioctl(ifp, ifr, &sc->hn_media, cmd); break; default: error = ether_ioctl(ifp, cmd, data); break; } return (error); } /* * */ static void hn_stop(hn_softc_t *sc) { struct ifnet *ifp; int ret, i; ifp = sc->hn_ifp; if (bootverbose) printf(" Closing Device ...\n"); atomic_clear_int(&ifp->if_drv_flags, (IFF_DRV_RUNNING | IFF_DRV_OACTIVE)); for (i = 0; i < sc->hn_tx_ring_inuse; ++i) sc->hn_tx_ring[i].hn_oactive = 0; if_link_state_change(ifp, LINK_STATE_DOWN); sc->hn_initdone = 0; ret = hv_rf_on_close(sc); } /* * FreeBSD transmit entry point */ static void hn_start(struct ifnet *ifp) { struct hn_softc *sc = ifp->if_softc; struct hn_tx_ring *txr = &sc->hn_tx_ring[0]; if (txr->hn_sched_tx) goto do_sched; if (mtx_trylock(&txr->hn_tx_lock)) { int sched; sched = hn_start_locked(txr, txr->hn_direct_tx_size); mtx_unlock(&txr->hn_tx_lock); if (!sched) return; } do_sched: taskqueue_enqueue(txr->hn_tx_taskq, &txr->hn_tx_task); } static void hn_start_txeof(struct hn_tx_ring *txr) { struct hn_softc *sc = txr->hn_sc; struct ifnet *ifp = sc->hn_ifp; KASSERT(txr == &sc->hn_tx_ring[0], ("not the first TX ring")); if (txr->hn_sched_tx) goto do_sched; if (mtx_trylock(&txr->hn_tx_lock)) { int sched; atomic_clear_int(&ifp->if_drv_flags, IFF_DRV_OACTIVE); sched = hn_start_locked(txr, txr->hn_direct_tx_size); mtx_unlock(&txr->hn_tx_lock); if (sched) { taskqueue_enqueue(txr->hn_tx_taskq, &txr->hn_tx_task); } } else { do_sched: /* * Release the OACTIVE earlier, with the hope, that * others could catch up. The task will clear the * flag again with the hn_tx_lock to avoid possible * races. */ atomic_clear_int(&ifp->if_drv_flags, IFF_DRV_OACTIVE); taskqueue_enqueue(txr->hn_tx_taskq, &txr->hn_txeof_task); } } /* * */ static void hn_ifinit_locked(hn_softc_t *sc) { struct ifnet *ifp; int ret, i; ifp = sc->hn_ifp; if (ifp->if_drv_flags & IFF_DRV_RUNNING) { return; } hv_promisc_mode = 1; ret = hv_rf_on_open(sc); if (ret != 0) { return; } else { sc->hn_initdone = 1; } atomic_clear_int(&ifp->if_drv_flags, IFF_DRV_OACTIVE); for (i = 0; i < sc->hn_tx_ring_inuse; ++i) sc->hn_tx_ring[i].hn_oactive = 0; atomic_set_int(&ifp->if_drv_flags, IFF_DRV_RUNNING); if_link_state_change(ifp, LINK_STATE_UP); } /* * */ static void hn_ifinit(void *xsc) { hn_softc_t *sc = xsc; NV_LOCK(sc); if (sc->temp_unusable) { NV_UNLOCK(sc); return; } sc->temp_unusable = TRUE; NV_UNLOCK(sc); hn_ifinit_locked(sc); NV_LOCK(sc); sc->temp_unusable = FALSE; NV_UNLOCK(sc); } #ifdef LATER /* * */ static void hn_watchdog(struct ifnet *ifp) { hn_softc_t *sc; sc = ifp->if_softc; printf("hn%d: watchdog timeout -- resetting\n", sc->hn_unit); hn_ifinit(sc); /*???*/ if_inc_counter(ifp, IFCOUNTER_OERRORS, 1); } #endif #if __FreeBSD_version >= 1100099 static int hn_lro_lenlim_sysctl(SYSCTL_HANDLER_ARGS) { struct hn_softc *sc = arg1; unsigned int lenlim; int error; lenlim = sc->hn_rx_ring[0].hn_lro.lro_length_lim; error = sysctl_handle_int(oidp, &lenlim, 0, req); if (error || req->newptr == NULL) return error; if (lenlim < HN_LRO_LENLIM_MIN(sc->hn_ifp) || lenlim > TCP_LRO_LENGTH_MAX) return EINVAL; NV_LOCK(sc); hn_set_lro_lenlim(sc, lenlim); NV_UNLOCK(sc); return 0; } static int hn_lro_ackcnt_sysctl(SYSCTL_HANDLER_ARGS) { struct hn_softc *sc = arg1; int ackcnt, error, i; /* * lro_ackcnt_lim is append count limit, * +1 to turn it into aggregation limit. */ ackcnt = sc->hn_rx_ring[0].hn_lro.lro_ackcnt_lim + 1; error = sysctl_handle_int(oidp, &ackcnt, 0, req); if (error || req->newptr == NULL) return error; if (ackcnt < 2 || ackcnt > (TCP_LRO_ACKCNT_MAX + 1)) return EINVAL; /* * Convert aggregation limit back to append * count limit. */ --ackcnt; NV_LOCK(sc); for (i = 0; i < sc->hn_rx_ring_inuse; ++i) sc->hn_rx_ring[i].hn_lro.lro_ackcnt_lim = ackcnt; NV_UNLOCK(sc); return 0; } #endif static int hn_trust_hcsum_sysctl(SYSCTL_HANDLER_ARGS) { struct hn_softc *sc = arg1; int hcsum = arg2; int on, error, i; on = 0; if (sc->hn_rx_ring[0].hn_trust_hcsum & hcsum) on = 1; error = sysctl_handle_int(oidp, &on, 0, req); if (error || req->newptr == NULL) return error; NV_LOCK(sc); for (i = 0; i < sc->hn_rx_ring_inuse; ++i) { struct hn_rx_ring *rxr = &sc->hn_rx_ring[i]; if (on) rxr->hn_trust_hcsum |= hcsum; else rxr->hn_trust_hcsum &= ~hcsum; } NV_UNLOCK(sc); return 0; } static int hn_tx_chimney_size_sysctl(SYSCTL_HANDLER_ARGS) { struct hn_softc *sc = arg1; int chimney_size, error; chimney_size = sc->hn_tx_ring[0].hn_tx_chimney_size; error = sysctl_handle_int(oidp, &chimney_size, 0, req); if (error || req->newptr == NULL) return error; if (chimney_size > sc->hn_tx_chimney_max || chimney_size <= 0) return EINVAL; hn_set_tx_chimney_size(sc, chimney_size); return 0; } static int hn_rx_stat_ulong_sysctl(SYSCTL_HANDLER_ARGS) { struct hn_softc *sc = arg1; int ofs = arg2, i, error; struct hn_rx_ring *rxr; u_long stat; stat = 0; for (i = 0; i < sc->hn_rx_ring_inuse; ++i) { rxr = &sc->hn_rx_ring[i]; stat += *((u_long *)((uint8_t *)rxr + ofs)); } error = sysctl_handle_long(oidp, &stat, 0, req); if (error || req->newptr == NULL) return error; /* Zero out this stat. */ for (i = 0; i < sc->hn_rx_ring_inuse; ++i) { rxr = &sc->hn_rx_ring[i]; *((u_long *)((uint8_t *)rxr + ofs)) = 0; } return 0; } static int hn_rx_stat_u64_sysctl(SYSCTL_HANDLER_ARGS) { struct hn_softc *sc = arg1; int ofs = arg2, i, error; struct hn_rx_ring *rxr; uint64_t stat; stat = 0; for (i = 0; i < sc->hn_rx_ring_inuse; ++i) { rxr = &sc->hn_rx_ring[i]; stat += *((uint64_t *)((uint8_t *)rxr + ofs)); } error = sysctl_handle_64(oidp, &stat, 0, req); if (error || req->newptr == NULL) return error; /* Zero out this stat. */ for (i = 0; i < sc->hn_rx_ring_inuse; ++i) { rxr = &sc->hn_rx_ring[i]; *((uint64_t *)((uint8_t *)rxr + ofs)) = 0; } return 0; } static int hn_tx_stat_ulong_sysctl(SYSCTL_HANDLER_ARGS) { struct hn_softc *sc = arg1; int ofs = arg2, i, error; struct hn_tx_ring *txr; u_long stat; stat = 0; for (i = 0; i < sc->hn_tx_ring_inuse; ++i) { txr = &sc->hn_tx_ring[i]; stat += *((u_long *)((uint8_t *)txr + ofs)); } error = sysctl_handle_long(oidp, &stat, 0, req); if (error || req->newptr == NULL) return error; /* Zero out this stat. */ for (i = 0; i < sc->hn_tx_ring_inuse; ++i) { txr = &sc->hn_tx_ring[i]; *((u_long *)((uint8_t *)txr + ofs)) = 0; } return 0; } static int hn_tx_conf_int_sysctl(SYSCTL_HANDLER_ARGS) { struct hn_softc *sc = arg1; int ofs = arg2, i, error, conf; struct hn_tx_ring *txr; txr = &sc->hn_tx_ring[0]; conf = *((int *)((uint8_t *)txr + ofs)); error = sysctl_handle_int(oidp, &conf, 0, req); if (error || req->newptr == NULL) return error; NV_LOCK(sc); for (i = 0; i < sc->hn_tx_ring_inuse; ++i) { txr = &sc->hn_tx_ring[i]; *((int *)((uint8_t *)txr + ofs)) = conf; } NV_UNLOCK(sc); return 0; } static int hn_check_iplen(const struct mbuf *m, int hoff) { const struct ip *ip; int len, iphlen, iplen; const struct tcphdr *th; int thoff; /* TCP data offset */ len = hoff + sizeof(struct ip); /* The packet must be at least the size of an IP header. */ if (m->m_pkthdr.len < len) return IPPROTO_DONE; /* The fixed IP header must reside completely in the first mbuf. */ if (m->m_len < len) return IPPROTO_DONE; ip = mtodo(m, hoff); /* Bound check the packet's stated IP header length. */ iphlen = ip->ip_hl << 2; if (iphlen < sizeof(struct ip)) /* minimum header length */ return IPPROTO_DONE; /* The full IP header must reside completely in the one mbuf. */ if (m->m_len < hoff + iphlen) return IPPROTO_DONE; iplen = ntohs(ip->ip_len); /* * Check that the amount of data in the buffers is as * at least much as the IP header would have us expect. */ if (m->m_pkthdr.len < hoff + iplen) return IPPROTO_DONE; /* * Ignore IP fragments. */ if (ntohs(ip->ip_off) & (IP_OFFMASK | IP_MF)) return IPPROTO_DONE; /* * The TCP/IP or UDP/IP header must be entirely contained within * the first fragment of a packet. */ switch (ip->ip_p) { case IPPROTO_TCP: if (iplen < iphlen + sizeof(struct tcphdr)) return IPPROTO_DONE; if (m->m_len < hoff + iphlen + sizeof(struct tcphdr)) return IPPROTO_DONE; th = (const struct tcphdr *)((const uint8_t *)ip + iphlen); thoff = th->th_off << 2; if (thoff < sizeof(struct tcphdr) || thoff + iphlen > iplen) return IPPROTO_DONE; if (m->m_len < hoff + iphlen + thoff) return IPPROTO_DONE; break; case IPPROTO_UDP: if (iplen < iphlen + sizeof(struct udphdr)) return IPPROTO_DONE; if (m->m_len < hoff + iphlen + sizeof(struct udphdr)) return IPPROTO_DONE; break; default: if (iplen < iphlen) return IPPROTO_DONE; break; } return ip->ip_p; } static void hn_create_rx_data(struct hn_softc *sc, int ring_cnt) { struct sysctl_oid_list *child; struct sysctl_ctx_list *ctx; device_t dev = sc->hn_dev; #if defined(INET) || defined(INET6) #if __FreeBSD_version >= 1100095 int lroent_cnt; #endif #endif int i; sc->hn_rx_ring_cnt = ring_cnt; sc->hn_rx_ring_inuse = sc->hn_rx_ring_cnt; sc->hn_rx_ring = malloc(sizeof(struct hn_rx_ring) * sc->hn_rx_ring_cnt, M_NETVSC, M_WAITOK | M_ZERO); #if defined(INET) || defined(INET6) #if __FreeBSD_version >= 1100095 lroent_cnt = hn_lro_entry_count; if (lroent_cnt < TCP_LRO_ENTRIES) lroent_cnt = TCP_LRO_ENTRIES; device_printf(dev, "LRO: entry count %d\n", lroent_cnt); #endif #endif /* INET || INET6 */ ctx = device_get_sysctl_ctx(dev); child = SYSCTL_CHILDREN(device_get_sysctl_tree(dev)); /* Create dev.hn.UNIT.rx sysctl tree */ sc->hn_rx_sysctl_tree = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "rx", CTLFLAG_RD | CTLFLAG_MPSAFE, 0, ""); for (i = 0; i < sc->hn_rx_ring_cnt; ++i) { struct hn_rx_ring *rxr = &sc->hn_rx_ring[i]; if (hn_trust_hosttcp) rxr->hn_trust_hcsum |= HN_TRUST_HCSUM_TCP; if (hn_trust_hostudp) rxr->hn_trust_hcsum |= HN_TRUST_HCSUM_UDP; if (hn_trust_hostip) rxr->hn_trust_hcsum |= HN_TRUST_HCSUM_IP; rxr->hn_ifp = sc->hn_ifp; + if (i < sc->hn_tx_ring_cnt) + rxr->hn_txr = &sc->hn_tx_ring[i]; + rxr->hn_rdbuf = malloc(NETVSC_PACKET_SIZE, M_NETVSC, M_WAITOK); rxr->hn_rx_idx = i; /* * Initialize LRO. */ #if defined(INET) || defined(INET6) #if __FreeBSD_version >= 1100095 tcp_lro_init_args(&rxr->hn_lro, sc->hn_ifp, lroent_cnt, hn_lro_mbufq_depth); #else tcp_lro_init(&rxr->hn_lro); rxr->hn_lro.ifp = sc->hn_ifp; #endif #if __FreeBSD_version >= 1100099 rxr->hn_lro.lro_length_lim = HN_LRO_LENLIM_DEF; rxr->hn_lro.lro_ackcnt_lim = HN_LRO_ACKCNT_DEF; #endif #endif /* INET || INET6 */ if (sc->hn_rx_sysctl_tree != NULL) { char name[16]; /* * Create per RX ring sysctl tree: * dev.hn.UNIT.rx.RINGID */ snprintf(name, sizeof(name), "%d", i); rxr->hn_rx_sysctl_tree = SYSCTL_ADD_NODE(ctx, SYSCTL_CHILDREN(sc->hn_rx_sysctl_tree), OID_AUTO, name, CTLFLAG_RD | CTLFLAG_MPSAFE, 0, ""); if (rxr->hn_rx_sysctl_tree != NULL) { SYSCTL_ADD_ULONG(ctx, SYSCTL_CHILDREN(rxr->hn_rx_sysctl_tree), OID_AUTO, "packets", CTLFLAG_RW, &rxr->hn_pkts, "# of packets received"); SYSCTL_ADD_ULONG(ctx, SYSCTL_CHILDREN(rxr->hn_rx_sysctl_tree), OID_AUTO, "rss_pkts", CTLFLAG_RW, &rxr->hn_rss_pkts, "# of packets w/ RSS info received"); } } } SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "lro_queued", CTLTYPE_U64 | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, __offsetof(struct hn_rx_ring, hn_lro.lro_queued), hn_rx_stat_u64_sysctl, "LU", "LRO queued"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "lro_flushed", CTLTYPE_U64 | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, __offsetof(struct hn_rx_ring, hn_lro.lro_flushed), hn_rx_stat_u64_sysctl, "LU", "LRO flushed"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "lro_tried", CTLTYPE_ULONG | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, __offsetof(struct hn_rx_ring, hn_lro_tried), hn_rx_stat_ulong_sysctl, "LU", "# of LRO tries"); #if __FreeBSD_version >= 1100099 SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "lro_length_lim", CTLTYPE_UINT | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, 0, hn_lro_lenlim_sysctl, "IU", "Max # of data bytes to be aggregated by LRO"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "lro_ackcnt_lim", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, 0, hn_lro_ackcnt_sysctl, "I", "Max # of ACKs to be aggregated by LRO"); #endif SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "trust_hosttcp", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, HN_TRUST_HCSUM_TCP, hn_trust_hcsum_sysctl, "I", "Trust tcp segement verification on host side, " "when csum info is missing"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "trust_hostudp", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, HN_TRUST_HCSUM_UDP, hn_trust_hcsum_sysctl, "I", "Trust udp datagram verification on host side, " "when csum info is missing"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "trust_hostip", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, HN_TRUST_HCSUM_IP, hn_trust_hcsum_sysctl, "I", "Trust ip packet verification on host side, " "when csum info is missing"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "csum_ip", CTLTYPE_ULONG | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, __offsetof(struct hn_rx_ring, hn_csum_ip), hn_rx_stat_ulong_sysctl, "LU", "RXCSUM IP"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "csum_tcp", CTLTYPE_ULONG | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, __offsetof(struct hn_rx_ring, hn_csum_tcp), hn_rx_stat_ulong_sysctl, "LU", "RXCSUM TCP"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "csum_udp", CTLTYPE_ULONG | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, __offsetof(struct hn_rx_ring, hn_csum_udp), hn_rx_stat_ulong_sysctl, "LU", "RXCSUM UDP"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "csum_trusted", CTLTYPE_ULONG | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, __offsetof(struct hn_rx_ring, hn_csum_trusted), hn_rx_stat_ulong_sysctl, "LU", "# of packets that we trust host's csum verification"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "small_pkts", CTLTYPE_ULONG | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, __offsetof(struct hn_rx_ring, hn_small_pkts), hn_rx_stat_ulong_sysctl, "LU", "# of small packets received"); SYSCTL_ADD_INT(ctx, child, OID_AUTO, "rx_ring_cnt", CTLFLAG_RD, &sc->hn_rx_ring_cnt, 0, "# created RX rings"); SYSCTL_ADD_INT(ctx, child, OID_AUTO, "rx_ring_inuse", CTLFLAG_RD, &sc->hn_rx_ring_inuse, 0, "# used RX rings"); } static void hn_destroy_rx_data(struct hn_softc *sc) { -#if defined(INET) || defined(INET6) int i; -#endif if (sc->hn_rx_ring_cnt == 0) return; + for (i = 0; i < sc->hn_rx_ring_cnt; ++i) { + struct hn_rx_ring *rxr = &sc->hn_rx_ring[i]; + #if defined(INET) || defined(INET6) - for (i = 0; i < sc->hn_rx_ring_cnt; ++i) - tcp_lro_free(&sc->hn_rx_ring[i].hn_lro); + tcp_lro_free(&rxr->hn_lro); #endif + free(rxr->hn_rdbuf, M_NETVSC); + } free(sc->hn_rx_ring, M_NETVSC); sc->hn_rx_ring = NULL; sc->hn_rx_ring_cnt = 0; sc->hn_rx_ring_inuse = 0; } static int hn_create_tx_ring(struct hn_softc *sc, int id) { struct hn_tx_ring *txr = &sc->hn_tx_ring[id]; device_t dev = sc->hn_dev; bus_dma_tag_t parent_dtag; int error, i; uint32_t version; txr->hn_sc = sc; txr->hn_tx_idx = id; #ifndef HN_USE_TXDESC_BUFRING mtx_init(&txr->hn_txlist_spin, "hn txlist", NULL, MTX_SPIN); #endif mtx_init(&txr->hn_tx_lock, "hn tx", NULL, MTX_DEF); txr->hn_txdesc_cnt = HN_TX_DESC_CNT; txr->hn_txdesc = malloc(sizeof(struct hn_txdesc) * txr->hn_txdesc_cnt, M_NETVSC, M_WAITOK | M_ZERO); #ifndef HN_USE_TXDESC_BUFRING SLIST_INIT(&txr->hn_txlist); #else txr->hn_txdesc_br = buf_ring_alloc(txr->hn_txdesc_cnt, M_NETVSC, M_WAITOK, &txr->hn_tx_lock); #endif txr->hn_tx_taskq = sc->hn_tx_taskq; if (hn_use_if_start) { txr->hn_txeof = hn_start_txeof; TASK_INIT(&txr->hn_tx_task, 0, hn_start_taskfunc, txr); TASK_INIT(&txr->hn_txeof_task, 0, hn_start_txeof_taskfunc, txr); } else { int br_depth; txr->hn_txeof = hn_xmit_txeof; TASK_INIT(&txr->hn_tx_task, 0, hn_xmit_taskfunc, txr); TASK_INIT(&txr->hn_txeof_task, 0, hn_xmit_txeof_taskfunc, txr); br_depth = hn_get_txswq_depth(txr); txr->hn_mbuf_br = buf_ring_alloc(br_depth, M_NETVSC, M_WAITOK, &txr->hn_tx_lock); } txr->hn_direct_tx_size = hn_direct_tx_size; version = VMBUS_GET_VERSION(device_get_parent(dev), dev); if (version >= VMBUS_VERSION_WIN8_1) { txr->hn_csum_assist = HN_CSUM_ASSIST; } else { txr->hn_csum_assist = HN_CSUM_ASSIST_WIN8; if (id == 0) { device_printf(dev, "bus version %u.%u, " "no UDP checksum offloading\n", VMBUS_VERSION_MAJOR(version), VMBUS_VERSION_MINOR(version)); } } /* * Always schedule transmission instead of trying to do direct * transmission. This one gives the best performance so far. */ txr->hn_sched_tx = 1; parent_dtag = bus_get_dma_tag(dev); /* DMA tag for RNDIS messages. */ error = bus_dma_tag_create(parent_dtag, /* parent */ HN_RNDIS_MSG_ALIGN, /* alignment */ HN_RNDIS_MSG_BOUNDARY, /* boundary */ BUS_SPACE_MAXADDR, /* lowaddr */ BUS_SPACE_MAXADDR, /* highaddr */ NULL, NULL, /* filter, filterarg */ HN_RNDIS_MSG_LEN, /* maxsize */ 1, /* nsegments */ HN_RNDIS_MSG_LEN, /* maxsegsize */ 0, /* flags */ NULL, /* lockfunc */ NULL, /* lockfuncarg */ &txr->hn_tx_rndis_dtag); if (error) { device_printf(dev, "failed to create rndis dmatag\n"); return error; } /* DMA tag for data. */ error = bus_dma_tag_create(parent_dtag, /* parent */ 1, /* alignment */ HN_TX_DATA_BOUNDARY, /* boundary */ BUS_SPACE_MAXADDR, /* lowaddr */ BUS_SPACE_MAXADDR, /* highaddr */ NULL, NULL, /* filter, filterarg */ HN_TX_DATA_MAXSIZE, /* maxsize */ HN_TX_DATA_SEGCNT_MAX, /* nsegments */ HN_TX_DATA_SEGSIZE, /* maxsegsize */ 0, /* flags */ NULL, /* lockfunc */ NULL, /* lockfuncarg */ &txr->hn_tx_data_dtag); if (error) { device_printf(dev, "failed to create data dmatag\n"); return error; } for (i = 0; i < txr->hn_txdesc_cnt; ++i) { struct hn_txdesc *txd = &txr->hn_txdesc[i]; txd->txr = txr; /* * Allocate and load RNDIS messages. */ error = bus_dmamem_alloc(txr->hn_tx_rndis_dtag, (void **)&txd->rndis_msg, BUS_DMA_WAITOK | BUS_DMA_COHERENT, &txd->rndis_msg_dmap); if (error) { device_printf(dev, "failed to allocate rndis_msg, %d\n", i); return error; } error = bus_dmamap_load(txr->hn_tx_rndis_dtag, txd->rndis_msg_dmap, txd->rndis_msg, HN_RNDIS_MSG_LEN, hyperv_dma_map_paddr, &txd->rndis_msg_paddr, BUS_DMA_NOWAIT); if (error) { device_printf(dev, "failed to load rndis_msg, %d\n", i); bus_dmamem_free(txr->hn_tx_rndis_dtag, txd->rndis_msg, txd->rndis_msg_dmap); return error; } /* DMA map for TX data. */ error = bus_dmamap_create(txr->hn_tx_data_dtag, 0, &txd->data_dmap); if (error) { device_printf(dev, "failed to allocate tx data dmamap\n"); bus_dmamap_unload(txr->hn_tx_rndis_dtag, txd->rndis_msg_dmap); bus_dmamem_free(txr->hn_tx_rndis_dtag, txd->rndis_msg, txd->rndis_msg_dmap); return error; } /* All set, put it to list */ txd->flags |= HN_TXD_FLAG_ONLIST; #ifndef HN_USE_TXDESC_BUFRING SLIST_INSERT_HEAD(&txr->hn_txlist, txd, link); #else buf_ring_enqueue(txr->hn_txdesc_br, txd); #endif } txr->hn_txdesc_avail = txr->hn_txdesc_cnt; if (sc->hn_tx_sysctl_tree != NULL) { struct sysctl_oid_list *child; struct sysctl_ctx_list *ctx; char name[16]; /* * Create per TX ring sysctl tree: * dev.hn.UNIT.tx.RINGID */ ctx = device_get_sysctl_ctx(dev); child = SYSCTL_CHILDREN(sc->hn_tx_sysctl_tree); snprintf(name, sizeof(name), "%d", id); txr->hn_tx_sysctl_tree = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, name, CTLFLAG_RD | CTLFLAG_MPSAFE, 0, ""); if (txr->hn_tx_sysctl_tree != NULL) { child = SYSCTL_CHILDREN(txr->hn_tx_sysctl_tree); SYSCTL_ADD_INT(ctx, child, OID_AUTO, "txdesc_avail", CTLFLAG_RD, &txr->hn_txdesc_avail, 0, "# of available TX descs"); if (!hn_use_if_start) { SYSCTL_ADD_INT(ctx, child, OID_AUTO, "oactive", CTLFLAG_RD, &txr->hn_oactive, 0, "over active"); } SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "packets", CTLFLAG_RW, &txr->hn_pkts, "# of packets transmitted"); } } return 0; } static void hn_txdesc_dmamap_destroy(struct hn_txdesc *txd) { struct hn_tx_ring *txr = txd->txr; KASSERT(txd->m == NULL, ("still has mbuf installed")); KASSERT((txd->flags & HN_TXD_FLAG_DMAMAP) == 0, ("still dma mapped")); bus_dmamap_unload(txr->hn_tx_rndis_dtag, txd->rndis_msg_dmap); bus_dmamem_free(txr->hn_tx_rndis_dtag, txd->rndis_msg, txd->rndis_msg_dmap); bus_dmamap_destroy(txr->hn_tx_data_dtag, txd->data_dmap); } static void hn_destroy_tx_ring(struct hn_tx_ring *txr) { struct hn_txdesc *txd; if (txr->hn_txdesc == NULL) return; #ifndef HN_USE_TXDESC_BUFRING while ((txd = SLIST_FIRST(&txr->hn_txlist)) != NULL) { SLIST_REMOVE_HEAD(&txr->hn_txlist, link); hn_txdesc_dmamap_destroy(txd); } #else mtx_lock(&txr->hn_tx_lock); while ((txd = buf_ring_dequeue_sc(txr->hn_txdesc_br)) != NULL) hn_txdesc_dmamap_destroy(txd); mtx_unlock(&txr->hn_tx_lock); #endif if (txr->hn_tx_data_dtag != NULL) bus_dma_tag_destroy(txr->hn_tx_data_dtag); if (txr->hn_tx_rndis_dtag != NULL) bus_dma_tag_destroy(txr->hn_tx_rndis_dtag); #ifdef HN_USE_TXDESC_BUFRING buf_ring_free(txr->hn_txdesc_br, M_NETVSC); #endif free(txr->hn_txdesc, M_NETVSC); txr->hn_txdesc = NULL; if (txr->hn_mbuf_br != NULL) buf_ring_free(txr->hn_mbuf_br, M_NETVSC); #ifndef HN_USE_TXDESC_BUFRING mtx_destroy(&txr->hn_txlist_spin); #endif mtx_destroy(&txr->hn_tx_lock); } static int hn_create_tx_data(struct hn_softc *sc, int ring_cnt) { struct sysctl_oid_list *child; struct sysctl_ctx_list *ctx; int i; sc->hn_tx_ring_cnt = ring_cnt; sc->hn_tx_ring_inuse = sc->hn_tx_ring_cnt; sc->hn_tx_ring = malloc(sizeof(struct hn_tx_ring) * sc->hn_tx_ring_cnt, M_NETVSC, M_WAITOK | M_ZERO); ctx = device_get_sysctl_ctx(sc->hn_dev); child = SYSCTL_CHILDREN(device_get_sysctl_tree(sc->hn_dev)); /* Create dev.hn.UNIT.tx sysctl tree */ sc->hn_tx_sysctl_tree = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "tx", CTLFLAG_RD | CTLFLAG_MPSAFE, 0, ""); for (i = 0; i < sc->hn_tx_ring_cnt; ++i) { int error; error = hn_create_tx_ring(sc, i); if (error) return error; } SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "no_txdescs", CTLTYPE_ULONG | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, __offsetof(struct hn_tx_ring, hn_no_txdescs), hn_tx_stat_ulong_sysctl, "LU", "# of times short of TX descs"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "send_failed", CTLTYPE_ULONG | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, __offsetof(struct hn_tx_ring, hn_send_failed), hn_tx_stat_ulong_sysctl, "LU", "# of hyper-v sending failure"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "txdma_failed", CTLTYPE_ULONG | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, __offsetof(struct hn_tx_ring, hn_txdma_failed), hn_tx_stat_ulong_sysctl, "LU", "# of TX DMA failure"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "tx_collapsed", CTLTYPE_ULONG | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, __offsetof(struct hn_tx_ring, hn_tx_collapsed), hn_tx_stat_ulong_sysctl, "LU", "# of TX mbuf collapsed"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "tx_chimney", CTLTYPE_ULONG | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, __offsetof(struct hn_tx_ring, hn_tx_chimney), hn_tx_stat_ulong_sysctl, "LU", "# of chimney send"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "tx_chimney_tried", CTLTYPE_ULONG | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, __offsetof(struct hn_tx_ring, hn_tx_chimney_tried), hn_tx_stat_ulong_sysctl, "LU", "# of chimney send tries"); SYSCTL_ADD_INT(ctx, child, OID_AUTO, "txdesc_cnt", CTLFLAG_RD, &sc->hn_tx_ring[0].hn_txdesc_cnt, 0, "# of total TX descs"); SYSCTL_ADD_INT(ctx, child, OID_AUTO, "tx_chimney_max", CTLFLAG_RD, &sc->hn_tx_chimney_max, 0, "Chimney send packet size upper boundary"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "tx_chimney_size", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, 0, hn_tx_chimney_size_sysctl, "I", "Chimney send packet size limit"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "direct_tx_size", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, __offsetof(struct hn_tx_ring, hn_direct_tx_size), hn_tx_conf_int_sysctl, "I", "Size of the packet for direct transmission"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "sched_tx", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, sc, __offsetof(struct hn_tx_ring, hn_sched_tx), hn_tx_conf_int_sysctl, "I", "Always schedule transmission " "instead of doing direct transmission"); SYSCTL_ADD_INT(ctx, child, OID_AUTO, "tx_ring_cnt", CTLFLAG_RD, &sc->hn_tx_ring_cnt, 0, "# created TX rings"); SYSCTL_ADD_INT(ctx, child, OID_AUTO, "tx_ring_inuse", CTLFLAG_RD, &sc->hn_tx_ring_inuse, 0, "# used TX rings"); return 0; } static void hn_set_tx_chimney_size(struct hn_softc *sc, int chimney_size) { int i; NV_LOCK(sc); for (i = 0; i < sc->hn_tx_ring_inuse; ++i) sc->hn_tx_ring[i].hn_tx_chimney_size = chimney_size; NV_UNLOCK(sc); } static void hn_destroy_tx_data(struct hn_softc *sc) { int i; if (sc->hn_tx_ring_cnt == 0) return; for (i = 0; i < sc->hn_tx_ring_cnt; ++i) hn_destroy_tx_ring(&sc->hn_tx_ring[i]); free(sc->hn_tx_ring, M_NETVSC); sc->hn_tx_ring = NULL; sc->hn_tx_ring_cnt = 0; sc->hn_tx_ring_inuse = 0; } static void hn_start_taskfunc(void *xtxr, int pending __unused) { struct hn_tx_ring *txr = xtxr; mtx_lock(&txr->hn_tx_lock); hn_start_locked(txr, 0); mtx_unlock(&txr->hn_tx_lock); } static void hn_start_txeof_taskfunc(void *xtxr, int pending __unused) { struct hn_tx_ring *txr = xtxr; mtx_lock(&txr->hn_tx_lock); atomic_clear_int(&txr->hn_sc->hn_ifp->if_drv_flags, IFF_DRV_OACTIVE); hn_start_locked(txr, 0); mtx_unlock(&txr->hn_tx_lock); } static void hn_stop_tx_tasks(struct hn_softc *sc) { int i; for (i = 0; i < sc->hn_tx_ring_inuse; ++i) { struct hn_tx_ring *txr = &sc->hn_tx_ring[i]; taskqueue_drain(txr->hn_tx_taskq, &txr->hn_tx_task); taskqueue_drain(txr->hn_tx_taskq, &txr->hn_txeof_task); } } static int hn_xmit(struct hn_tx_ring *txr, int len) { struct hn_softc *sc = txr->hn_sc; struct ifnet *ifp = sc->hn_ifp; struct mbuf *m_head; mtx_assert(&txr->hn_tx_lock, MA_OWNED); KASSERT(hn_use_if_start == 0, ("hn_xmit is called, when if_start is enabled")); if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0 || txr->hn_oactive) return 0; while ((m_head = drbr_peek(ifp, txr->hn_mbuf_br)) != NULL) { struct hn_txdesc *txd; int error; if (len > 0 && m_head->m_pkthdr.len > len) { /* * This sending could be time consuming; let callers * dispatch this packet sending (and sending of any * following up packets) to tx taskqueue. */ drbr_putback(ifp, txr->hn_mbuf_br, m_head); return 1; } txd = hn_txdesc_get(txr); if (txd == NULL) { txr->hn_no_txdescs++; drbr_putback(ifp, txr->hn_mbuf_br, m_head); txr->hn_oactive = 1; break; } error = hn_encap(txr, txd, &m_head); if (error) { /* Both txd and m_head are freed; discard */ drbr_advance(ifp, txr->hn_mbuf_br); continue; } error = hn_send_pkt(ifp, txr, txd); if (__predict_false(error)) { /* txd is freed, but m_head is not */ drbr_putback(ifp, txr->hn_mbuf_br, m_head); txr->hn_oactive = 1; break; } /* Sent */ drbr_advance(ifp, txr->hn_mbuf_br); } return 0; } static int hn_transmit(struct ifnet *ifp, struct mbuf *m) { struct hn_softc *sc = ifp->if_softc; struct hn_tx_ring *txr; int error, idx = 0; /* * Select the TX ring based on flowid */ if (M_HASHTYPE_GET(m) != M_HASHTYPE_NONE) idx = m->m_pkthdr.flowid % sc->hn_tx_ring_inuse; txr = &sc->hn_tx_ring[idx]; error = drbr_enqueue(ifp, txr->hn_mbuf_br, m); if (error) { if_inc_counter(ifp, IFCOUNTER_OQDROPS, 1); return error; } if (txr->hn_oactive) return 0; if (txr->hn_sched_tx) goto do_sched; if (mtx_trylock(&txr->hn_tx_lock)) { int sched; sched = hn_xmit(txr, txr->hn_direct_tx_size); mtx_unlock(&txr->hn_tx_lock); if (!sched) return 0; } do_sched: taskqueue_enqueue(txr->hn_tx_taskq, &txr->hn_tx_task); return 0; } static void hn_xmit_qflush(struct ifnet *ifp) { struct hn_softc *sc = ifp->if_softc; int i; for (i = 0; i < sc->hn_tx_ring_inuse; ++i) { struct hn_tx_ring *txr = &sc->hn_tx_ring[i]; struct mbuf *m; mtx_lock(&txr->hn_tx_lock); while ((m = buf_ring_dequeue_sc(txr->hn_mbuf_br)) != NULL) m_freem(m); mtx_unlock(&txr->hn_tx_lock); } if_qflush(ifp); } static void hn_xmit_txeof(struct hn_tx_ring *txr) { if (txr->hn_sched_tx) goto do_sched; if (mtx_trylock(&txr->hn_tx_lock)) { int sched; txr->hn_oactive = 0; sched = hn_xmit(txr, txr->hn_direct_tx_size); mtx_unlock(&txr->hn_tx_lock); if (sched) { taskqueue_enqueue(txr->hn_tx_taskq, &txr->hn_tx_task); } } else { do_sched: /* * Release the oactive earlier, with the hope, that * others could catch up. The task will clear the * oactive again with the hn_tx_lock to avoid possible * races. */ txr->hn_oactive = 0; taskqueue_enqueue(txr->hn_tx_taskq, &txr->hn_txeof_task); } } static void hn_xmit_taskfunc(void *xtxr, int pending __unused) { struct hn_tx_ring *txr = xtxr; mtx_lock(&txr->hn_tx_lock); hn_xmit(txr, 0); mtx_unlock(&txr->hn_tx_lock); } static void hn_xmit_txeof_taskfunc(void *xtxr, int pending __unused) { struct hn_tx_ring *txr = xtxr; mtx_lock(&txr->hn_tx_lock); txr->hn_oactive = 0; hn_xmit(txr, 0); mtx_unlock(&txr->hn_tx_lock); } static void -hn_channel_attach(struct hn_softc *sc, struct hv_vmbus_channel *chan) +hn_channel_attach(struct hn_softc *sc, struct vmbus_channel *chan) { struct hn_rx_ring *rxr; int idx; - idx = chan->ch_subidx; + idx = vmbus_chan_subidx(chan); KASSERT(idx >= 0 && idx < sc->hn_rx_ring_inuse, ("invalid channel index %d, should > 0 && < %d", idx, sc->hn_rx_ring_inuse)); rxr = &sc->hn_rx_ring[idx]; KASSERT((rxr->hn_rx_flags & HN_RX_FLAG_ATTACHED) == 0, ("RX ring %d already attached", idx)); rxr->hn_rx_flags |= HN_RX_FLAG_ATTACHED; - chan->hv_chan_rxr = rxr; if (bootverbose) { if_printf(sc->hn_ifp, "link RX ring %d to channel%u\n", - idx, chan->ch_id); + idx, vmbus_chan_id(chan)); } if (idx < sc->hn_tx_ring_inuse) { struct hn_tx_ring *txr = &sc->hn_tx_ring[idx]; KASSERT((txr->hn_tx_flags & HN_TX_FLAG_ATTACHED) == 0, ("TX ring %d already attached", idx)); txr->hn_tx_flags |= HN_TX_FLAG_ATTACHED; - chan->hv_chan_txr = txr; txr->hn_chan = chan; if (bootverbose) { if_printf(sc->hn_ifp, "link TX ring %d to channel%u\n", - idx, chan->ch_id); + idx, vmbus_chan_id(chan)); } } /* Bind channel to a proper CPU */ vmbus_chan_cpu_set(chan, (sc->hn_cpu + idx) % mp_ncpus); } static void -hn_subchan_attach(struct hn_softc *sc, struct hv_vmbus_channel *chan) +hn_subchan_attach(struct hn_softc *sc, struct vmbus_channel *chan) { - KASSERT(!VMBUS_CHAN_ISPRIMARY(chan), + KASSERT(!vmbus_chan_is_primary(chan), ("subchannel callback on primary channel")); - KASSERT(chan->ch_subidx > 0, - ("invalid channel subidx %u", - chan->ch_subidx)); hn_channel_attach(sc, chan); } static void hn_subchan_setup(struct hn_softc *sc) { - struct hv_vmbus_channel **subchan; + struct vmbus_channel **subchans; int subchan_cnt = sc->net_dev->num_channel - 1; int i; /* Wait for sub-channels setup to complete. */ - subchan = vmbus_subchan_get(sc->hn_prichan, subchan_cnt); + subchans = vmbus_subchan_get(sc->hn_prichan, subchan_cnt); /* Attach the sub-channels. */ for (i = 0; i < subchan_cnt; ++i) { + struct vmbus_channel *subchan = subchans[i]; + /* NOTE: Calling order is critical. */ - hn_subchan_attach(sc, subchan[i]); - hv_nv_subchan_attach(subchan[i]); + hn_subchan_attach(sc, subchan); + hv_nv_subchan_attach(subchan, + &sc->hn_rx_ring[vmbus_chan_subidx(subchan)]); } /* Release the sub-channels */ - vmbus_subchan_rel(subchan, subchan_cnt); + vmbus_subchan_rel(subchans, subchan_cnt); if_printf(sc->hn_ifp, "%d sub-channels setup done\n", subchan_cnt); } static void hn_tx_taskq_create(void *arg __unused) { if (!hn_share_tx_taskq) return; hn_tx_taskq = taskqueue_create("hn_tx", M_WAITOK, taskqueue_thread_enqueue, &hn_tx_taskq); if (hn_bind_tx_taskq >= 0) { int cpu = hn_bind_tx_taskq; cpuset_t cpu_set; if (cpu > mp_ncpus - 1) cpu = mp_ncpus - 1; CPU_SETOF(cpu, &cpu_set); taskqueue_start_threads_cpuset(&hn_tx_taskq, 1, PI_NET, &cpu_set, "hn tx"); } else { taskqueue_start_threads(&hn_tx_taskq, 1, PI_NET, "hn tx"); } } SYSINIT(hn_txtq_create, SI_SUB_DRIVERS, SI_ORDER_FIRST, hn_tx_taskq_create, NULL); static void hn_tx_taskq_destroy(void *arg __unused) { if (hn_tx_taskq != NULL) taskqueue_free(hn_tx_taskq); } SYSUNINIT(hn_txtq_destroy, SI_SUB_DRIVERS, SI_ORDER_FIRST, hn_tx_taskq_destroy, NULL); static device_method_t netvsc_methods[] = { /* Device interface */ DEVMETHOD(device_probe, netvsc_probe), DEVMETHOD(device_attach, netvsc_attach), DEVMETHOD(device_detach, netvsc_detach), DEVMETHOD(device_shutdown, netvsc_shutdown), { 0, 0 } }; static driver_t netvsc_driver = { NETVSC_DEVNAME, netvsc_methods, sizeof(hn_softc_t) }; static devclass_t netvsc_devclass; DRIVER_MODULE(hn, vmbus, netvsc_driver, netvsc_devclass, 0, 0); MODULE_VERSION(hn, 1); MODULE_DEPEND(hn, vmbus, 1, 1, 1); Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_rndis.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_rndis.h (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_rndis.h (revision 303206) @@ -1,1100 +1,1101 @@ /*- * Copyright (c) 2009-2012,2016 Microsoft Corp. * Copyright (c) 2010-2012 Citrix Inc. * Copyright (c) 2012 NetApp Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #ifndef __HV_RNDIS_H__ #define __HV_RNDIS_H__ /* * NDIS protocol version numbers */ #define NDIS_VERSION_5_0 0x00050000 #define NDIS_VERSION_5_1 0x00050001 #define NDIS_VERSION_6_0 0x00060000 #define NDIS_VERSION_6_1 0x00060001 #define NDIS_VERSION_6_30 0x0006001e #define NDIS_VERSION (NDIS_VERSION_5_1) /* * Status codes */ #define STATUS_SUCCESS (0x00000000L) #define STATUS_UNSUCCESSFUL (0xC0000001L) #define STATUS_PENDING (0x00000103L) #define STATUS_INSUFFICIENT_RESOURCES (0xC000009AL) #define STATUS_BUFFER_OVERFLOW (0x80000005L) #define STATUS_NOT_SUPPORTED (0xC00000BBL) #define RNDIS_STATUS_SUCCESS (STATUS_SUCCESS) #define RNDIS_STATUS_PENDING (STATUS_PENDING) #define RNDIS_STATUS_NOT_RECOGNIZED (0x00010001L) #define RNDIS_STATUS_NOT_COPIED (0x00010002L) #define RNDIS_STATUS_NOT_ACCEPTED (0x00010003L) #define RNDIS_STATUS_CALL_ACTIVE (0x00010007L) #define RNDIS_STATUS_ONLINE (0x40010003L) #define RNDIS_STATUS_RESET_START (0x40010004L) #define RNDIS_STATUS_RESET_END (0x40010005L) #define RNDIS_STATUS_RING_STATUS (0x40010006L) #define RNDIS_STATUS_CLOSED (0x40010007L) #define RNDIS_STATUS_WAN_LINE_UP (0x40010008L) #define RNDIS_STATUS_WAN_LINE_DOWN (0x40010009L) #define RNDIS_STATUS_WAN_FRAGMENT (0x4001000AL) #define RNDIS_STATUS_MEDIA_CONNECT (0x4001000BL) #define RNDIS_STATUS_MEDIA_DISCONNECT (0x4001000CL) #define RNDIS_STATUS_HARDWARE_LINE_UP (0x4001000DL) #define RNDIS_STATUS_HARDWARE_LINE_DOWN (0x4001000EL) #define RNDIS_STATUS_INTERFACE_UP (0x4001000FL) #define RNDIS_STATUS_INTERFACE_DOWN (0x40010010L) #define RNDIS_STATUS_MEDIA_BUSY (0x40010011L) #define RNDIS_STATUS_MEDIA_SPECIFIC_INDICATION (0x40010012L) #define RNDIS_STATUS_WW_INDICATION RNDIS_STATUS_MEDIA_SPECIFIC_INDICATION #define RNDIS_STATUS_LINK_SPEED_CHANGE (0x40010013L) #define RNDIS_STATUS_NOT_RESETTABLE (0x80010001L) #define RNDIS_STATUS_SOFT_ERRORS (0x80010003L) #define RNDIS_STATUS_HARD_ERRORS (0x80010004L) #define RNDIS_STATUS_BUFFER_OVERFLOW (STATUS_BUFFER_OVERFLOW) #define RNDIS_STATUS_FAILURE (STATUS_UNSUCCESSFUL) #define RNDIS_STATUS_RESOURCES (STATUS_INSUFFICIENT_RESOURCES) #define RNDIS_STATUS_CLOSING (0xC0010002L) #define RNDIS_STATUS_BAD_VERSION (0xC0010004L) #define RNDIS_STATUS_BAD_CHARACTERISTICS (0xC0010005L) #define RNDIS_STATUS_ADAPTER_NOT_FOUND (0xC0010006L) #define RNDIS_STATUS_OPEN_FAILED (0xC0010007L) #define RNDIS_STATUS_DEVICE_FAILED (0xC0010008L) #define RNDIS_STATUS_MULTICAST_FULL (0xC0010009L) #define RNDIS_STATUS_MULTICAST_EXISTS (0xC001000AL) #define RNDIS_STATUS_MULTICAST_NOT_FOUND (0xC001000BL) #define RNDIS_STATUS_REQUEST_ABORTED (0xC001000CL) #define RNDIS_STATUS_RESET_IN_PROGRESS (0xC001000DL) #define RNDIS_STATUS_CLOSING_INDICATING (0xC001000EL) #define RNDIS_STATUS_NOT_SUPPORTED (STATUS_NOT_SUPPORTED) #define RNDIS_STATUS_INVALID_PACKET (0xC001000FL) #define RNDIS_STATUS_OPEN_LIST_FULL (0xC0010010L) #define RNDIS_STATUS_ADAPTER_NOT_READY (0xC0010011L) #define RNDIS_STATUS_ADAPTER_NOT_OPEN (0xC0010012L) #define RNDIS_STATUS_NOT_INDICATING (0xC0010013L) #define RNDIS_STATUS_INVALID_LENGTH (0xC0010014L) #define RNDIS_STATUS_INVALID_DATA (0xC0010015L) #define RNDIS_STATUS_BUFFER_TOO_SHORT (0xC0010016L) #define RNDIS_STATUS_INVALID_OID (0xC0010017L) #define RNDIS_STATUS_ADAPTER_REMOVED (0xC0010018L) #define RNDIS_STATUS_UNSUPPORTED_MEDIA (0xC0010019L) #define RNDIS_STATUS_GROUP_ADDRESS_IN_USE (0xC001001AL) #define RNDIS_STATUS_FILE_NOT_FOUND (0xC001001BL) #define RNDIS_STATUS_ERROR_READING_FILE (0xC001001CL) #define RNDIS_STATUS_ALREADY_MAPPED (0xC001001DL) #define RNDIS_STATUS_RESOURCE_CONFLICT (0xC001001EL) #define RNDIS_STATUS_NO_CABLE (0xC001001FL) #define RNDIS_STATUS_INVALID_SAP (0xC0010020L) #define RNDIS_STATUS_SAP_IN_USE (0xC0010021L) #define RNDIS_STATUS_INVALID_ADDRESS (0xC0010022L) #define RNDIS_STATUS_VC_NOT_ACTIVATED (0xC0010023L) #define RNDIS_STATUS_DEST_OUT_OF_ORDER (0xC0010024L) #define RNDIS_STATUS_VC_NOT_AVAILABLE (0xC0010025L) #define RNDIS_STATUS_CELLRATE_NOT_AVAILABLE (0xC0010026L) #define RNDIS_STATUS_INCOMPATABLE_QOS (0xC0010027L) #define RNDIS_STATUS_AAL_PARAMS_UNSUPPORTED (0xC0010028L) #define RNDIS_STATUS_NO_ROUTE_TO_DESTINATION (0xC0010029L) #define RNDIS_STATUS_TOKEN_RING_OPEN_ERROR (0xC0011000L) /* * Object Identifiers used by NdisRequest Query/Set Information */ /* * General Objects */ #define RNDIS_OID_GEN_SUPPORTED_LIST 0x00010101 #define RNDIS_OID_GEN_HARDWARE_STATUS 0x00010102 #define RNDIS_OID_GEN_MEDIA_SUPPORTED 0x00010103 #define RNDIS_OID_GEN_MEDIA_IN_USE 0x00010104 #define RNDIS_OID_GEN_MAXIMUM_LOOKAHEAD 0x00010105 #define RNDIS_OID_GEN_MAXIMUM_FRAME_SIZE 0x00010106 #define RNDIS_OID_GEN_LINK_SPEED 0x00010107 #define RNDIS_OID_GEN_TRANSMIT_BUFFER_SPACE 0x00010108 #define RNDIS_OID_GEN_RECEIVE_BUFFER_SPACE 0x00010109 #define RNDIS_OID_GEN_TRANSMIT_BLOCK_SIZE 0x0001010A #define RNDIS_OID_GEN_RECEIVE_BLOCK_SIZE 0x0001010B #define RNDIS_OID_GEN_VENDOR_ID 0x0001010C #define RNDIS_OID_GEN_VENDOR_DESCRIPTION 0x0001010D #define RNDIS_OID_GEN_CURRENT_PACKET_FILTER 0x0001010E #define RNDIS_OID_GEN_CURRENT_LOOKAHEAD 0x0001010F #define RNDIS_OID_GEN_DRIVER_VERSION 0x00010110 #define RNDIS_OID_GEN_MAXIMUM_TOTAL_SIZE 0x00010111 #define RNDIS_OID_GEN_PROTOCOL_OPTIONS 0x00010112 #define RNDIS_OID_GEN_MAC_OPTIONS 0x00010113 #define RNDIS_OID_GEN_MEDIA_CONNECT_STATUS 0x00010114 #define RNDIS_OID_GEN_MAXIMUM_SEND_PACKETS 0x00010115 #define RNDIS_OID_GEN_VENDOR_DRIVER_VERSION 0x00010116 #define RNDIS_OID_GEN_NETWORK_LAYER_ADDRESSES 0x00010118 #define RNDIS_OID_GEN_TRANSPORT_HEADER_OFFSET 0x00010119 #define RNDIS_OID_GEN_MACHINE_NAME 0x0001021A #define RNDIS_OID_GEN_RNDIS_CONFIG_PARAMETER 0x0001021B /* * For receive side scale */ /* Query only */ #define RNDIS_OID_GEN_RSS_CAPABILITIES 0x00010203 /* Query and set */ #define RNDIS_OID_GEN_RSS_PARAMETERS 0x00010204 #define RNDIS_OID_GEN_XMIT_OK 0x00020101 #define RNDIS_OID_GEN_RCV_OK 0x00020102 #define RNDIS_OID_GEN_XMIT_ERROR 0x00020103 #define RNDIS_OID_GEN_RCV_ERROR 0x00020104 #define RNDIS_OID_GEN_RCV_NO_BUFFER 0x00020105 #define RNDIS_OID_GEN_DIRECTED_BYTES_XMIT 0x00020201 #define RNDIS_OID_GEN_DIRECTED_FRAMES_XMIT 0x00020202 #define RNDIS_OID_GEN_MULTICAST_BYTES_XMIT 0x00020203 #define RNDIS_OID_GEN_MULTICAST_FRAMES_XMIT 0x00020204 #define RNDIS_OID_GEN_BROADCAST_BYTES_XMIT 0x00020205 #define RNDIS_OID_GEN_BROADCAST_FRAMES_XMIT 0x00020206 #define RNDIS_OID_GEN_DIRECTED_BYTES_RCV 0x00020207 #define RNDIS_OID_GEN_DIRECTED_FRAMES_RCV 0x00020208 #define RNDIS_OID_GEN_MULTICAST_BYTES_RCV 0x00020209 #define RNDIS_OID_GEN_MULTICAST_FRAMES_RCV 0x0002020A #define RNDIS_OID_GEN_BROADCAST_BYTES_RCV 0x0002020B #define RNDIS_OID_GEN_BROADCAST_FRAMES_RCV 0x0002020C #define RNDIS_OID_GEN_RCV_CRC_ERROR 0x0002020D #define RNDIS_OID_GEN_TRANSMIT_QUEUE_LENGTH 0x0002020E #define RNDIS_OID_GEN_GET_TIME_CAPS 0x0002020F #define RNDIS_OID_GEN_GET_NETCARD_TIME 0x00020210 /* * These are connection-oriented general OIDs. * These replace the above OIDs for connection-oriented media. */ #define RNDIS_OID_GEN_CO_SUPPORTED_LIST 0x00010101 #define RNDIS_OID_GEN_CO_HARDWARE_STATUS 0x00010102 #define RNDIS_OID_GEN_CO_MEDIA_SUPPORTED 0x00010103 #define RNDIS_OID_GEN_CO_MEDIA_IN_USE 0x00010104 #define RNDIS_OID_GEN_CO_LINK_SPEED 0x00010105 #define RNDIS_OID_GEN_CO_VENDOR_ID 0x00010106 #define RNDIS_OID_GEN_CO_VENDOR_DESCRIPTION 0x00010107 #define RNDIS_OID_GEN_CO_DRIVER_VERSION 0x00010108 #define RNDIS_OID_GEN_CO_PROTOCOL_OPTIONS 0x00010109 #define RNDIS_OID_GEN_CO_MAC_OPTIONS 0x0001010A #define RNDIS_OID_GEN_CO_MEDIA_CONNECT_STATUS 0x0001010B #define RNDIS_OID_GEN_CO_VENDOR_DRIVER_VERSION 0x0001010C #define RNDIS_OID_GEN_CO_MINIMUM_LINK_SPEED 0x0001010D #define RNDIS_OID_GEN_CO_GET_TIME_CAPS 0x00010201 #define RNDIS_OID_GEN_CO_GET_NETCARD_TIME 0x00010202 /* * These are connection-oriented statistics OIDs. */ #define RNDIS_OID_GEN_CO_XMIT_PDUS_OK 0x00020101 #define RNDIS_OID_GEN_CO_RCV_PDUS_OK 0x00020102 #define RNDIS_OID_GEN_CO_XMIT_PDUS_ERROR 0x00020103 #define RNDIS_OID_GEN_CO_RCV_PDUS_ERROR 0x00020104 #define RNDIS_OID_GEN_CO_RCV_PDUS_NO_BUFFER 0x00020105 #define RNDIS_OID_GEN_CO_RCV_CRC_ERROR 0x00020201 #define RNDIS_OID_GEN_CO_TRANSMIT_QUEUE_LENGTH 0x00020202 #define RNDIS_OID_GEN_CO_BYTES_XMIT 0x00020203 #define RNDIS_OID_GEN_CO_BYTES_RCV 0x00020204 #define RNDIS_OID_GEN_CO_BYTES_XMIT_OUTSTANDING 0x00020205 #define RNDIS_OID_GEN_CO_NETCARD_LOAD 0x00020206 /* * These are objects for Connection-oriented media call-managers. */ #define RNDIS_OID_CO_ADD_PVC 0xFF000001 #define RNDIS_OID_CO_DELETE_PVC 0xFF000002 #define RNDIS_OID_CO_GET_CALL_INFORMATION 0xFF000003 #define RNDIS_OID_CO_ADD_ADDRESS 0xFF000004 #define RNDIS_OID_CO_DELETE_ADDRESS 0xFF000005 #define RNDIS_OID_CO_GET_ADDRESSES 0xFF000006 #define RNDIS_OID_CO_ADDRESS_CHANGE 0xFF000007 #define RNDIS_OID_CO_SIGNALING_ENABLED 0xFF000008 #define RNDIS_OID_CO_SIGNALING_DISABLED 0xFF000009 /* * 802.3 Objects (Ethernet) */ #define RNDIS_OID_802_3_PERMANENT_ADDRESS 0x01010101 #define RNDIS_OID_802_3_CURRENT_ADDRESS 0x01010102 #define RNDIS_OID_802_3_MULTICAST_LIST 0x01010103 #define RNDIS_OID_802_3_MAXIMUM_LIST_SIZE 0x01010104 #define RNDIS_OID_802_3_MAC_OPTIONS 0x01010105 /* * */ #define NDIS_802_3_MAC_OPTION_PRIORITY 0x00000001 #define RNDIS_OID_802_3_RCV_ERROR_ALIGNMENT 0x01020101 #define RNDIS_OID_802_3_XMIT_ONE_COLLISION 0x01020102 #define RNDIS_OID_802_3_XMIT_MORE_COLLISIONS 0x01020103 #define RNDIS_OID_802_3_XMIT_DEFERRED 0x01020201 #define RNDIS_OID_802_3_XMIT_MAX_COLLISIONS 0x01020202 #define RNDIS_OID_802_3_RCV_OVERRUN 0x01020203 #define RNDIS_OID_802_3_XMIT_UNDERRUN 0x01020204 #define RNDIS_OID_802_3_XMIT_HEARTBEAT_FAILURE 0x01020205 #define RNDIS_OID_802_3_XMIT_TIMES_CRS_LOST 0x01020206 #define RNDIS_OID_802_3_XMIT_LATE_COLLISIONS 0x01020207 /* * RNDIS MP custom OID for test */ #define OID_RNDISMP_GET_RECEIVE_BUFFERS 0xFFA0C90D // Query only /* * Remote NDIS message types */ #define REMOTE_NDIS_PACKET_MSG 0x00000001 #define REMOTE_NDIS_INITIALIZE_MSG 0x00000002 #define REMOTE_NDIS_HALT_MSG 0x00000003 #define REMOTE_NDIS_QUERY_MSG 0x00000004 #define REMOTE_NDIS_SET_MSG 0x00000005 #define REMOTE_NDIS_RESET_MSG 0x00000006 #define REMOTE_NDIS_INDICATE_STATUS_MSG 0x00000007 #define REMOTE_NDIS_KEEPALIVE_MSG 0x00000008 #define REMOTE_CONDIS_MP_CREATE_VC_MSG 0x00008001 #define REMOTE_CONDIS_MP_DELETE_VC_MSG 0x00008002 #define REMOTE_CONDIS_MP_ACTIVATE_VC_MSG 0x00008005 #define REMOTE_CONDIS_MP_DEACTIVATE_VC_MSG 0x00008006 #define REMOTE_CONDIS_INDICATE_STATUS_MSG 0x00008007 /* * Remote NDIS message completion types */ #define REMOTE_NDIS_INITIALIZE_CMPLT 0x80000002 #define REMOTE_NDIS_QUERY_CMPLT 0x80000004 #define REMOTE_NDIS_SET_CMPLT 0x80000005 #define REMOTE_NDIS_RESET_CMPLT 0x80000006 #define REMOTE_NDIS_KEEPALIVE_CMPLT 0x80000008 #define REMOTE_CONDIS_MP_CREATE_VC_CMPLT 0x80008001 #define REMOTE_CONDIS_MP_DELETE_VC_CMPLT 0x80008002 #define REMOTE_CONDIS_MP_ACTIVATE_VC_CMPLT 0x80008005 #define REMOTE_CONDIS_MP_DEACTIVATE_VC_CMPLT 0x80008006 /* * Reserved message type for private communication between lower-layer * host driver and remote device, if necessary. */ #define REMOTE_NDIS_BUS_MSG 0xff000001 /* * Defines for DeviceFlags in rndis_initialize_complete */ #define RNDIS_DF_CONNECTIONLESS 0x00000001 #define RNDIS_DF_CONNECTION_ORIENTED 0x00000002 #define RNDIS_DF_RAW_DATA 0x00000004 /* * Remote NDIS medium types. */ #define RNDIS_MEDIUM_802_3 0x00000000 #define RNDIS_MEDIUM_802_5 0x00000001 #define RNDIS_MEDIUM_FDDI 0x00000002 #define RNDIS_MEDIUM_WAN 0x00000003 #define RNDIS_MEDIUM_LOCAL_TALK 0x00000004 #define RNDIS_MEDIUM_ARCNET_RAW 0x00000006 #define RNDIS_MEDIUM_ARCNET_878_2 0x00000007 #define RNDIS_MEDIUM_ATM 0x00000008 #define RNDIS_MEDIUM_WIRELESS_WAN 0x00000009 #define RNDIS_MEDIUM_IRDA 0x0000000a #define RNDIS_MEDIUM_CO_WAN 0x0000000b /* Not a real medium, defined as an upper bound */ #define RNDIS_MEDIUM_MAX 0x0000000d /* * Remote NDIS medium connection states. */ #define RNDIS_MEDIA_STATE_CONNECTED 0x00000000 #define RNDIS_MEDIA_STATE_DISCONNECTED 0x00000001 /* * Remote NDIS version numbers */ #define RNDIS_MAJOR_VERSION 0x00000001 #define RNDIS_MINOR_VERSION 0x00000000 /* * Remote NDIS offload parameters */ #define RNDIS_OBJECT_TYPE_DEFAULT 0x80 #define RNDIS_OFFLOAD_PARAMETERS_REVISION_3 3 #define RNDIS_OFFLOAD_PARAMETERS_NO_CHANGE 0 #define RNDIS_OFFLOAD_PARAMETERS_LSOV2_DISABLED 1 #define RNDIS_OFFLOAD_PARAMETERS_LSOV2_ENABLED 2 #define RNDIS_OFFLOAD_PARAMETERS_LSOV1_ENABLED 2 #define RNDIS_OFFLOAD_PARAMETERS_RSC_DISABLED 1 #define RNDIS_OFFLOAD_PARAMETERS_RSC_ENABLED 2 #define RNDIS_OFFLOAD_PARAMETERS_TX_RX_DISABLED 1 #define RNDIS_OFFLOAD_PARAMETERS_TX_ENABLED_RX_DISABLED 2 #define RNDIS_OFFLOAD_PARAMETERS_RX_ENABLED_TX_DISABLED 3 #define RNDIS_OFFLOAD_PARAMETERS_TX_RX_ENABLED 4 #define RNDIS_TCP_LARGE_SEND_OFFLOAD_V2_TYPE 1 #define RNDIS_TCP_LARGE_SEND_OFFLOAD_IPV4 0 #define RNDIS_TCP_LARGE_SEND_OFFLOAD_IPV6 1 #define RNDIS_OID_TCP_OFFLOAD_CURRENT_CONFIG 0xFC01020B /* query only */ #define RNDIS_OID_TCP_OFFLOAD_PARAMETERS 0xFC01020C /* set only */ #define RNDIS_OID_TCP_OFFLOAD_HARDWARE_CAPABILITIES 0xFC01020D/* query only */ #define RNDIS_OID_TCP_CONNECTION_OFFLOAD_CURRENT_CONFIG 0xFC01020E /* query only */ #define RNDIS_OID_TCP_CONNECTION_OFFLOAD_HARDWARE_CAPABILITIES 0xFC01020F /* query */ #define RNDIS_OID_OFFLOAD_ENCAPSULATION 0x0101010A /* set/query */ /* * NdisInitialize message */ typedef struct rndis_initialize_request_ { /* RNDIS request ID */ uint32_t request_id; uint32_t major_version; uint32_t minor_version; uint32_t max_xfer_size; } rndis_initialize_request; /* * Response to NdisInitialize */ typedef struct rndis_initialize_complete_ { /* RNDIS request ID */ uint32_t request_id; /* RNDIS status */ uint32_t status; uint32_t major_version; uint32_t minor_version; uint32_t device_flags; /* RNDIS medium */ uint32_t medium; uint32_t max_pkts_per_msg; uint32_t max_xfer_size; uint32_t pkt_align_factor; uint32_t af_list_offset; uint32_t af_list_size; } rndis_initialize_complete; /* * Call manager devices only: Information about an address family * supported by the device is appended to the response to NdisInitialize. */ typedef struct rndis_co_address_family_ { /* RNDIS AF */ uint32_t address_family; uint32_t major_version; uint32_t minor_version; } rndis_co_address_family; /* * NdisHalt message */ typedef struct rndis_halt_request_ { /* RNDIS request ID */ uint32_t request_id; } rndis_halt_request; /* * NdisQueryRequest message */ typedef struct rndis_query_request_ { /* RNDIS request ID */ uint32_t request_id; /* RNDIS OID */ uint32_t oid; uint32_t info_buffer_length; uint32_t info_buffer_offset; /* RNDIS handle */ uint32_t device_vc_handle; } rndis_query_request; /* * Response to NdisQueryRequest */ typedef struct rndis_query_complete_ { /* RNDIS request ID */ uint32_t request_id; /* RNDIS status */ uint32_t status; uint32_t info_buffer_length; uint32_t info_buffer_offset; } rndis_query_complete; /* * NdisSetRequest message */ typedef struct rndis_set_request_ { /* RNDIS request ID */ uint32_t request_id; /* RNDIS OID */ uint32_t oid; uint32_t info_buffer_length; uint32_t info_buffer_offset; /* RNDIS handle */ uint32_t device_vc_handle; } rndis_set_request; /* * Response to NdisSetRequest */ typedef struct rndis_set_complete_ { /* RNDIS request ID */ uint32_t request_id; /* RNDIS status */ uint32_t status; } rndis_set_complete; /* * NdisReset message */ typedef struct rndis_reset_request_ { uint32_t reserved; } rndis_reset_request; /* * Response to NdisReset */ typedef struct rndis_reset_complete_ { /* RNDIS status */ uint32_t status; uint32_t addressing_reset; } rndis_reset_complete; /* * NdisMIndicateStatus message */ typedef struct rndis_indicate_status_ { /* RNDIS status */ uint32_t status; uint32_t status_buf_length; uint32_t status_buf_offset; } rndis_indicate_status; /* * Diagnostic information passed as the status buffer in * rndis_indicate_status messages signifying error conditions. */ typedef struct rndis_diagnostic_info_ { /* RNDIS status */ uint32_t diag_status; uint32_t error_offset; } rndis_diagnostic_info; /* * NdisKeepAlive message */ typedef struct rndis_keepalive_request_ { /* RNDIS request ID */ uint32_t request_id; } rndis_keepalive_request; /* * Response to NdisKeepAlive */ typedef struct rndis_keepalive_complete_ { /* RNDIS request ID */ uint32_t request_id; /* RNDIS status */ uint32_t status; } rndis_keepalive_complete; /* * Data message. All offset fields contain byte offsets from the beginning * of the rndis_packet structure. All length fields are in bytes. * VcHandle is set to 0 for connectionless data, otherwise it * contains the VC handle. */ typedef struct rndis_packet_ { uint32_t data_offset; uint32_t data_length; uint32_t oob_data_offset; uint32_t oob_data_length; uint32_t num_oob_data_elements; uint32_t per_pkt_info_offset; uint32_t per_pkt_info_length; /* RNDIS handle */ uint32_t vc_handle; uint32_t reserved; } rndis_packet; typedef struct rndis_packet_ex_ { uint32_t data_offset; uint32_t data_length; uint32_t oob_data_offset; uint32_t oob_data_length; uint32_t num_oob_data_elements; uint32_t per_pkt_info_offset; uint32_t per_pkt_info_length; /* RNDIS handle */ uint32_t vc_handle; uint32_t reserved; uint64_t data_buf_id; uint32_t data_buf_offset; uint64_t next_header_buf_id; uint32_t next_header_byte_offset; uint32_t next_header_byte_count; } rndis_packet_ex; /* * Optional Out of Band data associated with a Data message. */ typedef struct rndis_oobd_ { uint32_t size; /* RNDIS class ID */ uint32_t type; uint32_t class_info_offset; } rndis_oobd; /* * Packet extension field contents associated with a Data message. */ typedef struct rndis_per_packet_info_ { uint32_t size; uint32_t type; uint32_t per_packet_info_offset; } rndis_per_packet_info; typedef enum ndis_per_pkt_infotype_ { tcpip_chksum_info, ipsec_info, tcp_large_send_info, classification_handle_info, ndis_reserved, sgl_info, ieee_8021q_info, original_pkt_info, pkt_cancel_id, original_netbuf_list, cached_netbuf_list, short_pkt_padding_info, max_perpkt_info } ndis_per_pkt_infotype; #define nbl_hash_value pkt_cancel_id #define nbl_hash_info original_netbuf_list typedef struct ndis_8021q_info_ { union { struct { uint32_t user_pri : 3; /* User Priority */ uint32_t cfi : 1; /* Canonical Format ID */ uint32_t vlan_id : 12; uint32_t reserved : 16; } s1; uint32_t value; } u1; } ndis_8021q_info; struct rndis_object_header { uint8_t type; uint8_t revision; uint16_t size; }; typedef struct rndis_offload_params_ { struct rndis_object_header header; uint8_t ipv4_csum; uint8_t tcp_ipv4_csum; uint8_t udp_ipv4_csum; uint8_t tcp_ipv6_csum; uint8_t udp_ipv6_csum; uint8_t lso_v1; uint8_t ip_sec_v1; uint8_t lso_v2_ipv4; uint8_t lso_v2_ipv6; uint8_t tcp_connection_ipv4; uint8_t tcp_connection_ipv6; uint32_t flags; uint8_t ip_sec_v2; uint8_t ip_sec_v2_ipv4; struct { uint8_t rsc_ipv4; uint8_t rsc_ipv6; }; struct { uint8_t encapsulated_packet_task_offload; uint8_t encapsulation_types; }; } rndis_offload_params; typedef struct rndis_tcp_ip_csum_info_ { union { struct { uint32_t is_ipv4:1; uint32_t is_ipv6:1; uint32_t tcp_csum:1; uint32_t udp_csum:1; uint32_t ip_header_csum:1; uint32_t reserved:11; uint32_t tcp_header_offset:10; } xmit; struct { uint32_t tcp_csum_failed:1; uint32_t udp_csum_failed:1; uint32_t ip_csum_failed:1; uint32_t tcp_csum_succeeded:1; uint32_t udp_csum_succeeded:1; uint32_t ip_csum_succeeded:1; uint32_t loopback:1; uint32_t tcp_csum_value_invalid:1; uint32_t ip_csum_value_invalid:1; } receive; uint32_t value; }; } rndis_tcp_ip_csum_info; struct rndis_hash_value { uint32_t hash_value; } __packed; struct rndis_hash_info { uint32_t hash_info; } __packed; #define NDIS_HASH_FUNCTION_MASK 0x000000FF /* see hash function */ #define NDIS_HASH_TYPE_MASK 0x00FFFF00 /* see hash type */ /* hash function */ #define NDIS_HASH_FUNCTION_TOEPLITZ 0x00000001 /* hash type */ #define NDIS_HASH_IPV4 0x00000100 #define NDIS_HASH_TCP_IPV4 0x00000200 #define NDIS_HASH_IPV6 0x00000400 #define NDIS_HASH_IPV6_EX 0x00000800 #define NDIS_HASH_TCP_IPV6 0x00001000 #define NDIS_HASH_TCP_IPV6_EX 0x00002000 typedef struct rndis_tcp_tso_info_ { union { struct { uint32_t unused:30; uint32_t type:1; uint32_t reserved2:1; } xmit; struct { uint32_t mss:20; uint32_t tcp_header_offset:10; uint32_t type:1; uint32_t reserved2:1; } lso_v1_xmit; struct { uint32_t tcp_payload:30; uint32_t type:1; uint32_t reserved2:1; } lso_v1_xmit_complete; struct { uint32_t mss:20; uint32_t tcp_header_offset:10; uint32_t type:1; uint32_t ip_version:1; } lso_v2_xmit; struct { uint32_t reserved:30; uint32_t type:1; uint32_t reserved2:1; } lso_v2_xmit_complete; uint32_t value; }; } rndis_tcp_tso_info; #define RNDIS_HASHVAL_PPI_SIZE (sizeof(rndis_per_packet_info) + \ sizeof(struct rndis_hash_value)) #define RNDIS_VLAN_PPI_SIZE (sizeof(rndis_per_packet_info) + \ sizeof(ndis_8021q_info)) #define RNDIS_CSUM_PPI_SIZE (sizeof(rndis_per_packet_info) + \ sizeof(rndis_tcp_ip_csum_info)) #define RNDIS_TSO_PPI_SIZE (sizeof(rndis_per_packet_info) + \ sizeof(rndis_tcp_tso_info)) /* * Format of Information buffer passed in a SetRequest for the OID * OID_GEN_RNDIS_CONFIG_PARAMETER. */ typedef struct rndis_config_parameter_info_ { uint32_t parameter_name_offset; uint32_t parameter_name_length; uint32_t parameter_type; uint32_t parameter_value_offset; uint32_t parameter_value_length; } rndis_config_parameter_info; /* * Values for ParameterType in rndis_config_parameter_info */ #define RNDIS_CONFIG_PARAM_TYPE_INTEGER 0 #define RNDIS_CONFIG_PARAM_TYPE_STRING 2 /* * CONDIS Miniport messages for connection oriented devices * that do not implement a call manager. */ /* * CoNdisMiniportCreateVc message */ typedef struct rcondis_mp_create_vc_ { /* RNDIS request ID */ uint32_t request_id; /* RNDIS handle */ uint32_t ndis_vc_handle; } rcondis_mp_create_vc; /* * Response to CoNdisMiniportCreateVc */ typedef struct rcondis_mp_create_vc_complete_ { /* RNDIS request ID */ uint32_t request_id; /* RNDIS handle */ uint32_t device_vc_handle; /* RNDIS status */ uint32_t status; } rcondis_mp_create_vc_complete; /* * CoNdisMiniportDeleteVc message */ typedef struct rcondis_mp_delete_vc_ { /* RNDIS request ID */ uint32_t request_id; /* RNDIS handle */ uint32_t device_vc_handle; } rcondis_mp_delete_vc; /* * Response to CoNdisMiniportDeleteVc */ typedef struct rcondis_mp_delete_vc_complete_ { /* RNDIS request ID */ uint32_t request_id; /* RNDIS status */ uint32_t status; } rcondis_mp_delete_vc_complete; /* * CoNdisMiniportQueryRequest message */ typedef struct rcondis_mp_query_request_ { /* RNDIS request ID */ uint32_t request_id; /* RNDIS request type */ uint32_t request_type; /* RNDIS OID */ uint32_t oid; /* RNDIS handle */ uint32_t device_vc_handle; uint32_t info_buf_length; uint32_t info_buf_offset; } rcondis_mp_query_request; /* * CoNdisMiniportSetRequest message */ typedef struct rcondis_mp_set_request_ { /* RNDIS request ID */ uint32_t request_id; /* RNDIS request type */ uint32_t request_type; /* RNDIS OID */ uint32_t oid; /* RNDIS handle */ uint32_t device_vc_handle; uint32_t info_buf_length; uint32_t info_buf_offset; } rcondis_mp_set_request; /* * CoNdisIndicateStatus message */ typedef struct rcondis_indicate_status_ { /* RNDIS handle */ uint32_t ndis_vc_handle; /* RNDIS status */ uint32_t status; uint32_t status_buf_length; uint32_t status_buf_offset; } rcondis_indicate_status; /* * CONDIS Call/VC parameters */ typedef struct rcondis_specific_parameters_ { uint32_t parameter_type; uint32_t parameter_length; uint32_t parameter_offset; } rcondis_specific_parameters; typedef struct rcondis_media_parameters_ { uint32_t flags; uint32_t reserved1; uint32_t reserved2; rcondis_specific_parameters media_specific; } rcondis_media_parameters; typedef struct rndis_flowspec_ { uint32_t token_rate; uint32_t token_bucket_size; uint32_t peak_bandwidth; uint32_t latency; uint32_t delay_variation; uint32_t service_type; uint32_t max_sdu_size; uint32_t minimum_policed_size; } rndis_flowspec; typedef struct rcondis_call_manager_parameters_ { rndis_flowspec transmit; rndis_flowspec receive; rcondis_specific_parameters call_mgr_specific; } rcondis_call_manager_parameters; /* * CoNdisMiniportActivateVc message */ typedef struct rcondis_mp_activate_vc_request_ { /* RNDIS request ID */ uint32_t request_id; uint32_t flags; /* RNDIS handle */ uint32_t device_vc_handle; uint32_t media_params_offset; uint32_t media_params_length; uint32_t call_mgr_params_offset; uint32_t call_mgr_params_length; } rcondis_mp_activate_vc_request; /* * Response to CoNdisMiniportActivateVc */ typedef struct rcondis_mp_activate_vc_complete_ { /* RNDIS request ID */ uint32_t request_id; /* RNDIS status */ uint32_t status; } rcondis_mp_activate_vc_complete; /* * CoNdisMiniportDeactivateVc message */ typedef struct rcondis_mp_deactivate_vc_request_ { /* RNDIS request ID */ uint32_t request_id; uint32_t flags; /* RNDIS handle */ uint32_t device_vc_handle; } rcondis_mp_deactivate_vc_request; /* * Response to CoNdisMiniportDeactivateVc */ typedef struct rcondis_mp_deactivate_vc_complete_ { /* RNDIS request ID */ uint32_t request_id; /* RNDIS status */ uint32_t status; } rcondis_mp_deactivate_vc_complete; /* * union with all of the RNDIS messages */ typedef union rndis_msg_container_ { rndis_packet packet; rndis_initialize_request init_request; rndis_halt_request halt_request; rndis_query_request query_request; rndis_set_request set_request; rndis_reset_request reset_request; rndis_keepalive_request keepalive_request; rndis_indicate_status indicate_status; rndis_initialize_complete init_complete; rndis_query_complete query_complete; rndis_set_complete set_complete; rndis_reset_complete reset_complete; rndis_keepalive_complete keepalive_complete; rcondis_mp_create_vc co_miniport_create_vc; rcondis_mp_delete_vc co_miniport_delete_vc; rcondis_indicate_status co_miniport_status; rcondis_mp_activate_vc_request co_miniport_activate_vc; rcondis_mp_deactivate_vc_request co_miniport_deactivate_vc; rcondis_mp_create_vc_complete co_miniport_create_vc_complete; rcondis_mp_delete_vc_complete co_miniport_delete_vc_complete; rcondis_mp_activate_vc_complete co_miniport_activate_vc_complete; rcondis_mp_deactivate_vc_complete co_miniport_deactivate_vc_complete; rndis_packet_ex packet_ex; } rndis_msg_container; /* * Remote NDIS message format */ typedef struct rndis_msg_ { uint32_t ndis_msg_type; /* * Total length of this message, from the beginning * of the rndis_msg struct, in bytes. */ uint32_t msg_len; /* Actual message */ rndis_msg_container msg; } rndis_msg; /* * Handy macros */ /* * get the size of an RNDIS message. Pass in the message type, * rndis_set_request, rndis_packet for example */ #define RNDIS_MESSAGE_SIZE(message) \ (sizeof(message) + (sizeof(rndis_msg) - sizeof(rndis_msg_container))) /* * get pointer to info buffer with message pointer */ #define MESSAGE_TO_INFO_BUFFER(message) \ (((PUCHAR)(message)) + message->InformationBufferOffset) /* * get pointer to status buffer with message pointer */ #define MESSAGE_TO_STATUS_BUFFER(message) \ (((PUCHAR)(message)) + message->StatusBufferOffset) /* * get pointer to OOBD buffer with message pointer */ #define MESSAGE_TO_OOBD_BUFFER(message) \ (((PUCHAR)(message)) + message->OOBDataOffset) /* * get pointer to data buffer with message pointer */ #define MESSAGE_TO_DATA_BUFFER(message) \ (((PUCHAR)(message)) + message->PerPacketInfoOffset) /* * get pointer to contained message from NDIS_MESSAGE pointer */ #define RNDIS_MESSAGE_PTR_TO_MESSAGE_PTR(rndis_message) \ ((void *) &rndis_message->Message) /* * get pointer to contained message from NDIS_MESSAGE pointer */ #define RNDIS_MESSAGE_RAW_PTR_TO_MESSAGE_PTR(rndis_message) \ ((void *) rndis_message) /* * Structures used in OID_RNDISMP_GET_RECEIVE_BUFFERS */ #define RNDISMP_RECEIVE_BUFFER_ELEM_FLAG_VMQ_RECEIVE_BUFFER 0x00000001 typedef struct rndismp_rx_buf_elem_ { uint32_t flags; uint32_t length; uint64_t rx_buf_id; uint32_t gpadl_handle; void *rx_buf; } rndismp_rx_buf_elem; typedef struct rndismp_rx_bufs_info_ { uint32_t num_rx_bufs; rndismp_rx_buf_elem rx_buf_elems[1]; } rndismp_rx_bufs_info; #define RNDIS_HEADER_SIZE (sizeof(rndis_msg) - sizeof(rndis_msg_container)) #define NDIS_PACKET_TYPE_DIRECTED 0x00000001 #define NDIS_PACKET_TYPE_MULTICAST 0x00000002 #define NDIS_PACKET_TYPE_ALL_MULTICAST 0x00000004 #define NDIS_PACKET_TYPE_BROADCAST 0x00000008 #define NDIS_PACKET_TYPE_SOURCE_ROUTING 0x00000010 #define NDIS_PACKET_TYPE_PROMISCUOUS 0x00000020 #define NDIS_PACKET_TYPE_SMT 0x00000040 #define NDIS_PACKET_TYPE_ALL_LOCAL 0x00000080 #define NDIS_PACKET_TYPE_GROUP 0x00000100 #define NDIS_PACKET_TYPE_ALL_FUNCTIONAL 0x00000200 #define NDIS_PACKET_TYPE_FUNCTIONAL 0x00000400 #define NDIS_PACKET_TYPE_MAC_FRAME 0x00000800 /* * Externs */ -struct hv_vmbus_channel; +struct hn_rx_ring; +struct hn_tx_ring; -int netvsc_recv(struct hv_vmbus_channel *chan, +int netvsc_recv(struct hn_rx_ring *rxr, netvsc_packet *packet, const rndis_tcp_ip_csum_info *csum_info, const struct rndis_hash_info *hash_info, const struct rndis_hash_value *hash_value); -void netvsc_channel_rollup(struct hv_vmbus_channel *chan); +void netvsc_channel_rollup(struct hn_rx_ring *rxr, struct hn_tx_ring *txr); void* hv_set_rppi_data(rndis_msg *rndis_mesg, uint32_t rppi_size, int pkt_type); void* hv_get_ppi_data(rndis_packet *rpkt, uint32_t type); #endif /* __HV_RNDIS_H__ */ Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_rndis_filter.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_rndis_filter.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_rndis_filter.c (revision 303206) @@ -1,1273 +1,1273 @@ /*- * Copyright (c) 2009-2012,2016 Microsoft Corp. * Copyright (c) 2010-2012 Citrix Inc. * Copyright (c) 2012 NetApp Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include +#include #include +#include #include #include #include #include #include #include #include #include #include #include "hv_net_vsc.h" #include "hv_rndis.h" #include "hv_rndis_filter.h" struct hv_rf_recvinfo { const ndis_8021q_info *vlan_info; const rndis_tcp_ip_csum_info *csum_info; const struct rndis_hash_info *hash_info; const struct rndis_hash_value *hash_value; }; #define HV_RF_RECVINFO_VLAN 0x1 #define HV_RF_RECVINFO_CSUM 0x2 #define HV_RF_RECVINFO_HASHINF 0x4 #define HV_RF_RECVINFO_HASHVAL 0x8 #define HV_RF_RECVINFO_ALL \ (HV_RF_RECVINFO_VLAN | \ HV_RF_RECVINFO_CSUM | \ HV_RF_RECVINFO_HASHINF | \ HV_RF_RECVINFO_HASHVAL) /* * Forward declarations */ static int hv_rf_send_request(rndis_device *device, rndis_request *request, uint32_t message_type); static void hv_rf_receive_response(rndis_device *device, rndis_msg *response); static void hv_rf_receive_indicate_status(rndis_device *device, rndis_msg *response); -static void hv_rf_receive_data(rndis_device *device, rndis_msg *message, - struct hv_vmbus_channel *chan, +static void hv_rf_receive_data(struct hn_rx_ring *rxr, rndis_msg *message, netvsc_packet *pkt); static int hv_rf_query_device(rndis_device *device, uint32_t oid, void *result, uint32_t *result_size); static inline int hv_rf_query_device_mac(rndis_device *device); static inline int hv_rf_query_device_link_status(rndis_device *device); static int hv_rf_set_packet_filter(rndis_device *device, uint32_t new_filter); static int hv_rf_init_device(rndis_device *device); static int hv_rf_open_device(rndis_device *device); static int hv_rf_close_device(rndis_device *device); -static void hv_rf_on_send_request_completion(struct hv_vmbus_channel *, void *context); -static void hv_rf_on_send_request_halt_completion(struct hv_vmbus_channel *, void *context); +static void hv_rf_on_send_request_completion(struct vmbus_channel *, void *context); +static void hv_rf_on_send_request_halt_completion(struct vmbus_channel *, void *context); int hv_rf_send_offload_request(struct hn_softc *sc, rndis_offload_params *offloads); /* * Set the Per-Packet-Info with the specified type */ void * hv_set_rppi_data(rndis_msg *rndis_mesg, uint32_t rppi_size, int pkt_type) { rndis_packet *rndis_pkt; rndis_per_packet_info *rppi; rndis_pkt = &rndis_mesg->msg.packet; rndis_pkt->data_offset += rppi_size; rppi = (rndis_per_packet_info *)((char *)rndis_pkt + rndis_pkt->per_pkt_info_offset + rndis_pkt->per_pkt_info_length); rppi->size = rppi_size; rppi->type = pkt_type; rppi->per_packet_info_offset = sizeof(rndis_per_packet_info); rndis_pkt->per_pkt_info_length += rppi_size; return (rppi); } /* * Get the Per-Packet-Info with the specified type * return NULL if not found. */ void * hv_get_ppi_data(rndis_packet *rpkt, uint32_t type) { rndis_per_packet_info *ppi; int len; if (rpkt->per_pkt_info_offset == 0) return (NULL); ppi = (rndis_per_packet_info *)((unsigned long)rpkt + rpkt->per_pkt_info_offset); len = rpkt->per_pkt_info_length; while (len > 0) { if (ppi->type == type) return (void *)((unsigned long)ppi + ppi->per_packet_info_offset); len -= ppi->size; ppi = (rndis_per_packet_info *)((unsigned long)ppi + ppi->size); } return (NULL); } /* * Allow module_param to work and override to switch to promiscuous mode. */ static inline rndis_device * hv_get_rndis_device(void) { rndis_device *device; device = malloc(sizeof(rndis_device), M_NETVSC, M_WAITOK | M_ZERO); mtx_init(&device->req_lock, "HV-FRL", NULL, MTX_DEF); /* Same effect as STAILQ_HEAD_INITIALIZER() static initializer */ STAILQ_INIT(&device->myrequest_list); device->state = RNDIS_DEV_UNINITIALIZED; return (device); } /* * */ static inline void hv_put_rndis_device(rndis_device *device) { mtx_destroy(&device->req_lock); free(device, M_NETVSC); } /* * */ static inline rndis_request * hv_rndis_request(rndis_device *device, uint32_t message_type, uint32_t message_length) { rndis_request *request; rndis_msg *rndis_mesg; rndis_set_request *set; request = malloc(sizeof(rndis_request), M_NETVSC, M_WAITOK | M_ZERO); sema_init(&request->wait_sema, 0, "rndis sema"); rndis_mesg = &request->request_msg; rndis_mesg->ndis_msg_type = message_type; rndis_mesg->msg_len = message_length; /* * Set the request id. This field is always after the rndis header * for request/response packet types so we just use the set_request * as a template. */ set = &rndis_mesg->msg.set_request; set->request_id = atomic_fetchadd_int(&device->new_request_id, 1); /* Increment to get the new value (call above returns old value) */ set->request_id += 1; /* Add to the request list */ mtx_lock(&device->req_lock); STAILQ_INSERT_TAIL(&device->myrequest_list, request, mylist_entry); mtx_unlock(&device->req_lock); return (request); } /* * */ static inline void hv_put_rndis_request(rndis_device *device, rndis_request *request) { mtx_lock(&device->req_lock); /* Fixme: Has O(n) performance */ /* * XXXKYS: Use Doubly linked lists. */ STAILQ_REMOVE(&device->myrequest_list, request, rndis_request_, mylist_entry); mtx_unlock(&device->req_lock); sema_destroy(&request->wait_sema); free(request, M_NETVSC); } /* * */ static int hv_rf_send_request(rndis_device *device, rndis_request *request, uint32_t message_type) { int ret; netvsc_packet *packet; netvsc_dev *net_dev = device->net_dev; int send_buf_section_idx; /* Set up the packet to send it */ packet = &request->pkt; packet->is_data_pkt = FALSE; packet->tot_data_buf_len = request->request_msg.msg_len; packet->gpa_cnt = 1; packet->gpa[0].gpa_page = hv_get_phys_addr(&request->request_msg) >> PAGE_SHIFT; packet->gpa[0].gpa_len = request->request_msg.msg_len; packet->gpa[0].gpa_ofs = (unsigned long)&request->request_msg & (PAGE_SIZE - 1); if (packet->gpa[0].gpa_ofs + packet->gpa[0].gpa_len > PAGE_SIZE) { packet->gpa_cnt = 2; packet->gpa[0].gpa_len = PAGE_SIZE - packet->gpa[0].gpa_ofs; packet->gpa[1].gpa_page = hv_get_phys_addr((char*)&request->request_msg + packet->gpa[0].gpa_len) >> PAGE_SHIFT; packet->gpa[1].gpa_ofs = 0; packet->gpa[1].gpa_len = request->request_msg.msg_len - packet->gpa[0].gpa_len; } packet->compl.send.send_completion_context = request; /* packet */ if (message_type != REMOTE_NDIS_HALT_MSG) { packet->compl.send.on_send_completion = hv_rf_on_send_request_completion; } else { packet->compl.send.on_send_completion = hv_rf_on_send_request_halt_completion; } packet->compl.send.send_completion_tid = (unsigned long)device; if (packet->tot_data_buf_len < net_dev->send_section_size) { send_buf_section_idx = hv_nv_get_next_send_section(net_dev); if (send_buf_section_idx != NVSP_1_CHIMNEY_SEND_INVALID_SECTION_INDEX) { char *dest = ((char *)net_dev->send_buf + send_buf_section_idx * net_dev->send_section_size); memcpy(dest, &request->request_msg, request->request_msg.msg_len); packet->send_buf_section_idx = send_buf_section_idx; packet->send_buf_section_size = packet->tot_data_buf_len; packet->gpa_cnt = 0; goto sendit; } /* Failed to allocate chimney send buffer; move on */ } packet->send_buf_section_idx = NVSP_1_CHIMNEY_SEND_INVALID_SECTION_INDEX; packet->send_buf_section_size = 0; sendit: ret = hv_nv_on_send(device->net_dev->sc->hn_prichan, packet); return (ret); } /* * RNDIS filter receive response */ static void hv_rf_receive_response(rndis_device *device, rndis_msg *response) { rndis_request *request = NULL; rndis_request *next_request; boolean_t found = FALSE; mtx_lock(&device->req_lock); request = STAILQ_FIRST(&device->myrequest_list); while (request != NULL) { /* * All request/response message contains request_id as the * first field */ if (request->request_msg.msg.init_request.request_id == response->msg.init_complete.request_id) { found = TRUE; break; } next_request = STAILQ_NEXT(request, mylist_entry); request = next_request; } mtx_unlock(&device->req_lock); if (found) { if (response->msg_len <= sizeof(rndis_msg)) { memcpy(&request->response_msg, response, response->msg_len); } else { if (response->ndis_msg_type == REMOTE_NDIS_RESET_CMPLT) { /* Does not have a request id field */ request->response_msg.msg.reset_complete.status = STATUS_BUFFER_OVERFLOW; } else { request->response_msg.msg.init_complete.status = STATUS_BUFFER_OVERFLOW; } } sema_post(&request->wait_sema); } } int hv_rf_send_offload_request(struct hn_softc *sc, rndis_offload_params *offloads) { rndis_request *request; rndis_set_request *set; rndis_offload_params *offload_req; rndis_set_complete *set_complete; rndis_device *rndis_dev; device_t dev = sc->hn_dev; netvsc_dev *net_dev = sc->net_dev; uint32_t vsp_version = net_dev->nvsp_version; uint32_t extlen = sizeof(rndis_offload_params); int ret; if (vsp_version <= NVSP_PROTOCOL_VERSION_4) { extlen = VERSION_4_OFFLOAD_SIZE; /* On NVSP_PROTOCOL_VERSION_4 and below, we do not support * UDP checksum offload. */ offloads->udp_ipv4_csum = 0; offloads->udp_ipv6_csum = 0; } rndis_dev = net_dev->extension; request = hv_rndis_request(rndis_dev, REMOTE_NDIS_SET_MSG, RNDIS_MESSAGE_SIZE(rndis_set_request) + extlen); if (!request) return (ENOMEM); set = &request->request_msg.msg.set_request; set->oid = RNDIS_OID_TCP_OFFLOAD_PARAMETERS; set->info_buffer_length = extlen; set->info_buffer_offset = sizeof(rndis_set_request); set->device_vc_handle = 0; offload_req = (rndis_offload_params *)((unsigned long)set + set->info_buffer_offset); *offload_req = *offloads; offload_req->header.type = RNDIS_OBJECT_TYPE_DEFAULT; offload_req->header.revision = RNDIS_OFFLOAD_PARAMETERS_REVISION_3; offload_req->header.size = extlen; ret = hv_rf_send_request(rndis_dev, request, REMOTE_NDIS_SET_MSG); if (ret != 0) { device_printf(dev, "hv send offload request failed, ret=%d!\n", ret); goto cleanup; } ret = sema_timedwait(&request->wait_sema, 5 * hz); if (ret != 0) { device_printf(dev, "hv send offload request timeout\n"); goto cleanup; } set_complete = &request->response_msg.msg.set_complete; if (set_complete->status == RNDIS_STATUS_SUCCESS) { device_printf(dev, "hv send offload request succeeded\n"); ret = 0; } else { if (set_complete->status == STATUS_NOT_SUPPORTED) { device_printf(dev, "HV Not support offload\n"); ret = 0; } else { ret = set_complete->status; } } cleanup: hv_put_rndis_request(rndis_dev, request); return (ret); } /* * RNDIS filter receive indicate status */ static void hv_rf_receive_indicate_status(rndis_device *device, rndis_msg *response) { rndis_indicate_status *indicate = &response->msg.indicate_status; switch(indicate->status) { case RNDIS_STATUS_MEDIA_CONNECT: netvsc_linkstatus_callback(device->net_dev->sc, 1); break; case RNDIS_STATUS_MEDIA_DISCONNECT: netvsc_linkstatus_callback(device->net_dev->sc, 0); break; default: /* TODO: */ device_printf(device->net_dev->sc->hn_dev, "unknown status %d received\n", indicate->status); break; } } static int hv_rf_find_recvinfo(const rndis_packet *rpkt, struct hv_rf_recvinfo *info) { const rndis_per_packet_info *ppi; uint32_t mask, len; info->vlan_info = NULL; info->csum_info = NULL; info->hash_info = NULL; info->hash_value = NULL; if (rpkt->per_pkt_info_offset == 0) return 0; ppi = (const rndis_per_packet_info *) ((const uint8_t *)rpkt + rpkt->per_pkt_info_offset); len = rpkt->per_pkt_info_length; mask = 0; while (len != 0) { const void *ppi_dptr; uint32_t ppi_dlen; if (__predict_false(ppi->size < ppi->per_packet_info_offset)) return EINVAL; ppi_dlen = ppi->size - ppi->per_packet_info_offset; ppi_dptr = (const uint8_t *)ppi + ppi->per_packet_info_offset; switch (ppi->type) { case ieee_8021q_info: if (__predict_false(ppi_dlen < sizeof(ndis_8021q_info))) return EINVAL; info->vlan_info = ppi_dptr; mask |= HV_RF_RECVINFO_VLAN; break; case tcpip_chksum_info: if (__predict_false(ppi_dlen < sizeof(rndis_tcp_ip_csum_info))) return EINVAL; info->csum_info = ppi_dptr; mask |= HV_RF_RECVINFO_CSUM; break; case nbl_hash_value: if (__predict_false(ppi_dlen < sizeof(struct rndis_hash_value))) return EINVAL; info->hash_value = ppi_dptr; mask |= HV_RF_RECVINFO_HASHVAL; break; case nbl_hash_info: if (__predict_false(ppi_dlen < sizeof(struct rndis_hash_info))) return EINVAL; info->hash_info = ppi_dptr; mask |= HV_RF_RECVINFO_HASHINF; break; default: goto skip; } if (mask == HV_RF_RECVINFO_ALL) { /* All found; done */ break; } skip: if (__predict_false(len < ppi->size)) return EINVAL; len -= ppi->size; ppi = (const rndis_per_packet_info *) ((const uint8_t *)ppi + ppi->size); } return 0; } /* * RNDIS filter receive data */ static void -hv_rf_receive_data(rndis_device *device, rndis_msg *message, - struct hv_vmbus_channel *chan, netvsc_packet *pkt) +hv_rf_receive_data(struct hn_rx_ring *rxr, rndis_msg *message, + netvsc_packet *pkt) { rndis_packet *rndis_pkt; uint32_t data_offset; - device_t dev = device->net_dev->sc->hn_dev; struct hv_rf_recvinfo info; rndis_pkt = &message->msg.packet; /* * Fixme: Handle multiple rndis pkt msgs that may be enclosed in this * netvsc packet (ie tot_data_buf_len != message_length) */ /* Remove rndis header, then pass data packet up the stack */ data_offset = RNDIS_HEADER_SIZE + rndis_pkt->data_offset; pkt->tot_data_buf_len -= data_offset; if (pkt->tot_data_buf_len < rndis_pkt->data_length) { pkt->status = nvsp_status_failure; - device_printf(dev, + if_printf(rxr->hn_ifp, "total length %u is less than data length %u\n", pkt->tot_data_buf_len, rndis_pkt->data_length); return; } pkt->tot_data_buf_len = rndis_pkt->data_length; pkt->data = (void *)((unsigned long)pkt->data + data_offset); if (hv_rf_find_recvinfo(rndis_pkt, &info)) { pkt->status = nvsp_status_failure; - device_printf(dev, "recvinfo parsing failed\n"); + if_printf(rxr->hn_ifp, "recvinfo parsing failed\n"); return; } if (info.vlan_info != NULL) pkt->vlan_tci = info.vlan_info->u1.s1.vlan_id; else pkt->vlan_tci = 0; - netvsc_recv(chan, pkt, info.csum_info, info.hash_info, info.hash_value); + netvsc_recv(rxr, pkt, info.csum_info, info.hash_info, info.hash_value); } /* * RNDIS filter on receive */ int hv_rf_on_receive(netvsc_dev *net_dev, - struct hv_vmbus_channel *chan, netvsc_packet *pkt) + struct hn_rx_ring *rxr, netvsc_packet *pkt) { rndis_device *rndis_dev; rndis_msg *rndis_hdr; /* Make sure the rndis device state is initialized */ if (net_dev->extension == NULL) { pkt->status = nvsp_status_failure; return (ENODEV); } rndis_dev = (rndis_device *)net_dev->extension; if (rndis_dev->state == RNDIS_DEV_UNINITIALIZED) { pkt->status = nvsp_status_failure; return (EINVAL); } rndis_hdr = pkt->data; switch (rndis_hdr->ndis_msg_type) { /* data message */ case REMOTE_NDIS_PACKET_MSG: - hv_rf_receive_data(rndis_dev, rndis_hdr, chan, pkt); + hv_rf_receive_data(rxr, rndis_hdr, pkt); break; /* completion messages */ case REMOTE_NDIS_INITIALIZE_CMPLT: case REMOTE_NDIS_QUERY_CMPLT: case REMOTE_NDIS_SET_CMPLT: case REMOTE_NDIS_RESET_CMPLT: case REMOTE_NDIS_KEEPALIVE_CMPLT: hv_rf_receive_response(rndis_dev, rndis_hdr); break; /* notification message */ case REMOTE_NDIS_INDICATE_STATUS_MSG: hv_rf_receive_indicate_status(rndis_dev, rndis_hdr); break; default: printf("hv_rf_on_receive(): Unknown msg_type 0x%x\n", rndis_hdr->ndis_msg_type); break; } return (0); } /* * RNDIS filter query device */ static int hv_rf_query_device(rndis_device *device, uint32_t oid, void *result, uint32_t *result_size) { rndis_request *request; uint32_t in_result_size = *result_size; rndis_query_request *query; rndis_query_complete *query_complete; int ret = 0; *result_size = 0; request = hv_rndis_request(device, REMOTE_NDIS_QUERY_MSG, RNDIS_MESSAGE_SIZE(rndis_query_request)); if (request == NULL) { ret = -1; goto cleanup; } /* Set up the rndis query */ query = &request->request_msg.msg.query_request; query->oid = oid; query->info_buffer_offset = sizeof(rndis_query_request); query->info_buffer_length = 0; query->device_vc_handle = 0; if (oid == RNDIS_OID_GEN_RSS_CAPABILITIES) { struct rndis_recv_scale_cap *cap; request->request_msg.msg_len += sizeof(struct rndis_recv_scale_cap); query->info_buffer_length = sizeof(struct rndis_recv_scale_cap); cap = (struct rndis_recv_scale_cap *)((unsigned long)query + query->info_buffer_offset); cap->hdr.type = RNDIS_OBJECT_TYPE_RSS_CAPABILITIES; cap->hdr.rev = RNDIS_RECEIVE_SCALE_CAPABILITIES_REVISION_2; cap->hdr.size = sizeof(struct rndis_recv_scale_cap); } ret = hv_rf_send_request(device, request, REMOTE_NDIS_QUERY_MSG); if (ret != 0) { /* Fixme: printf added */ printf("RNDISFILTER request failed to Send!\n"); goto cleanup; } sema_wait(&request->wait_sema); /* Copy the response back */ query_complete = &request->response_msg.msg.query_complete; if (query_complete->info_buffer_length > in_result_size) { ret = EINVAL; goto cleanup; } memcpy(result, (void *)((unsigned long)query_complete + query_complete->info_buffer_offset), query_complete->info_buffer_length); *result_size = query_complete->info_buffer_length; cleanup: if (request != NULL) hv_put_rndis_request(device, request); return (ret); } /* * RNDIS filter query device MAC address */ static inline int hv_rf_query_device_mac(rndis_device *device) { - uint32_t size = HW_MACADDR_LEN; + uint32_t size = ETHER_ADDR_LEN; return (hv_rf_query_device(device, RNDIS_OID_802_3_PERMANENT_ADDRESS, device->hw_mac_addr, &size)); } /* * RNDIS filter query device link status */ static inline int hv_rf_query_device_link_status(rndis_device *device) { uint32_t size = sizeof(uint32_t); return (hv_rf_query_device(device, RNDIS_OID_GEN_MEDIA_CONNECT_STATUS, &device->link_status, &size)); } static uint8_t netvsc_hash_key[HASH_KEYLEN] = { 0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2, 0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0, 0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4, 0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c, 0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa }; /* * RNDIS set vRSS parameters */ static int hv_rf_set_rss_param(rndis_device *device, int num_queue) { rndis_request *request; rndis_set_request *set; rndis_set_complete *set_complete; rndis_recv_scale_param *rssp; uint32_t extlen = sizeof(rndis_recv_scale_param) + (4 * ITAB_NUM) + HASH_KEYLEN; uint32_t *itab, status; uint8_t *keyp; int i, ret; request = hv_rndis_request(device, REMOTE_NDIS_SET_MSG, RNDIS_MESSAGE_SIZE(rndis_set_request) + extlen); if (request == NULL) { if (bootverbose) printf("Netvsc: No memory to set vRSS parameters.\n"); ret = -1; goto cleanup; } set = &request->request_msg.msg.set_request; set->oid = RNDIS_OID_GEN_RSS_PARAMETERS; set->info_buffer_length = extlen; set->info_buffer_offset = sizeof(rndis_set_request); set->device_vc_handle = 0; /* Fill out the rssp parameter structure */ rssp = (rndis_recv_scale_param *)(set + 1); rssp->hdr.type = RNDIS_OBJECT_TYPE_RSS_PARAMETERS; rssp->hdr.rev = RNDIS_RECEIVE_SCALE_PARAMETERS_REVISION_2; rssp->hdr.size = sizeof(rndis_recv_scale_param); rssp->flag = 0; rssp->hashinfo = RNDIS_HASH_FUNC_TOEPLITZ | RNDIS_HASH_IPV4 | RNDIS_HASH_TCP_IPV4 | RNDIS_HASH_IPV6 | RNDIS_HASH_TCP_IPV6; rssp->indirect_tabsize = 4 * ITAB_NUM; rssp->indirect_taboffset = sizeof(rndis_recv_scale_param); rssp->hashkey_size = HASH_KEYLEN; rssp->hashkey_offset = rssp->indirect_taboffset + rssp->indirect_tabsize; /* Set indirection table entries */ itab = (uint32_t *)(rssp + 1); for (i = 0; i < ITAB_NUM; i++) itab[i] = i % num_queue; /* Set hash key values */ keyp = (uint8_t *)((unsigned long)rssp + rssp->hashkey_offset); for (i = 0; i < HASH_KEYLEN; i++) keyp[i] = netvsc_hash_key[i]; ret = hv_rf_send_request(device, request, REMOTE_NDIS_SET_MSG); if (ret != 0) { goto cleanup; } /* * Wait for the response from the host. Another thread will signal * us when the response has arrived. In the failure case, * sema_timedwait() returns a non-zero status after waiting 5 seconds. */ ret = sema_timedwait(&request->wait_sema, 5 * hz); if (ret == 0) { /* Response received, check status */ set_complete = &request->response_msg.msg.set_complete; status = set_complete->status; if (status != RNDIS_STATUS_SUCCESS) { /* Bad response status, return error */ if (bootverbose) printf("Netvsc: Failed to set vRSS " "parameters.\n"); ret = -2; } else { if (bootverbose) printf("Netvsc: Successfully set vRSS " "parameters.\n"); } } else { /* * We cannot deallocate the request since we may still * receive a send completion for it. */ printf("Netvsc: vRSS set timeout, id = %u, ret = %d\n", request->request_msg.msg.init_request.request_id, ret); goto exit; } cleanup: if (request != NULL) { hv_put_rndis_request(device, request); } exit: return (ret); } /* * RNDIS filter set packet filter * Sends an rndis request with the new filter, then waits for a response * from the host. * Returns zero on success, non-zero on failure. */ static int hv_rf_set_packet_filter(rndis_device *device, uint32_t new_filter) { rndis_request *request; rndis_set_request *set; rndis_set_complete *set_complete; uint32_t status; int ret; request = hv_rndis_request(device, REMOTE_NDIS_SET_MSG, RNDIS_MESSAGE_SIZE(rndis_set_request) + sizeof(uint32_t)); if (request == NULL) { ret = -1; goto cleanup; } /* Set up the rndis set */ set = &request->request_msg.msg.set_request; set->oid = RNDIS_OID_GEN_CURRENT_PACKET_FILTER; set->info_buffer_length = sizeof(uint32_t); set->info_buffer_offset = sizeof(rndis_set_request); memcpy((void *)((unsigned long)set + sizeof(rndis_set_request)), &new_filter, sizeof(uint32_t)); ret = hv_rf_send_request(device, request, REMOTE_NDIS_SET_MSG); if (ret != 0) { goto cleanup; } /* * Wait for the response from the host. Another thread will signal * us when the response has arrived. In the failure case, * sema_timedwait() returns a non-zero status after waiting 5 seconds. */ ret = sema_timedwait(&request->wait_sema, 5 * hz); if (ret == 0) { /* Response received, check status */ set_complete = &request->response_msg.msg.set_complete; status = set_complete->status; if (status != RNDIS_STATUS_SUCCESS) { /* Bad response status, return error */ ret = -2; } } else { /* * We cannot deallocate the request since we may still * receive a send completion for it. */ goto exit; } cleanup: if (request != NULL) { hv_put_rndis_request(device, request); } exit: return (ret); } /* * RNDIS filter init device */ static int hv_rf_init_device(rndis_device *device) { rndis_request *request; rndis_initialize_request *init; rndis_initialize_complete *init_complete; uint32_t status; int ret; request = hv_rndis_request(device, REMOTE_NDIS_INITIALIZE_MSG, RNDIS_MESSAGE_SIZE(rndis_initialize_request)); if (!request) { ret = -1; goto cleanup; } /* Set up the rndis set */ init = &request->request_msg.msg.init_request; init->major_version = RNDIS_MAJOR_VERSION; init->minor_version = RNDIS_MINOR_VERSION; /* * Per the RNDIS document, this should be set to the max MTU * plus the header size. However, 2048 works fine, so leaving * it as is. */ init->max_xfer_size = 2048; device->state = RNDIS_DEV_INITIALIZING; ret = hv_rf_send_request(device, request, REMOTE_NDIS_INITIALIZE_MSG); if (ret != 0) { device->state = RNDIS_DEV_UNINITIALIZED; goto cleanup; } sema_wait(&request->wait_sema); init_complete = &request->response_msg.msg.init_complete; status = init_complete->status; if (status == RNDIS_STATUS_SUCCESS) { device->state = RNDIS_DEV_INITIALIZED; ret = 0; } else { device->state = RNDIS_DEV_UNINITIALIZED; ret = -1; } cleanup: if (request) { hv_put_rndis_request(device, request); } return (ret); } #define HALT_COMPLETION_WAIT_COUNT 25 /* * RNDIS filter halt device */ static int hv_rf_halt_device(rndis_device *device) { rndis_request *request; rndis_halt_request *halt; int i, ret; /* Attempt to do a rndis device halt */ request = hv_rndis_request(device, REMOTE_NDIS_HALT_MSG, RNDIS_MESSAGE_SIZE(rndis_halt_request)); if (request == NULL) { return (-1); } /* initialize "poor man's semaphore" */ request->halt_complete_flag = 0; /* Set up the rndis set */ halt = &request->request_msg.msg.halt_request; halt->request_id = atomic_fetchadd_int(&device->new_request_id, 1); /* Increment to get the new value (call above returns old value) */ halt->request_id += 1; ret = hv_rf_send_request(device, request, REMOTE_NDIS_HALT_MSG); if (ret != 0) { return (-1); } /* * Wait for halt response from halt callback. We must wait for * the transaction response before freeing the request and other * resources. */ for (i=HALT_COMPLETION_WAIT_COUNT; i > 0; i--) { if (request->halt_complete_flag != 0) { break; } DELAY(400); } if (i == 0) { return (-1); } device->state = RNDIS_DEV_UNINITIALIZED; hv_put_rndis_request(device, request); return (0); } /* * RNDIS filter open device */ static int hv_rf_open_device(rndis_device *device) { int ret; if (device->state != RNDIS_DEV_INITIALIZED) { return (0); } if (hv_promisc_mode != 1) { ret = hv_rf_set_packet_filter(device, NDIS_PACKET_TYPE_BROADCAST | NDIS_PACKET_TYPE_ALL_MULTICAST | NDIS_PACKET_TYPE_DIRECTED); } else { ret = hv_rf_set_packet_filter(device, NDIS_PACKET_TYPE_PROMISCUOUS); } if (ret == 0) { device->state = RNDIS_DEV_DATAINITIALIZED; } return (ret); } /* * RNDIS filter close device */ static int hv_rf_close_device(rndis_device *device) { int ret; if (device->state != RNDIS_DEV_DATAINITIALIZED) { return (0); } ret = hv_rf_set_packet_filter(device, 0); if (ret == 0) { device->state = RNDIS_DEV_INITIALIZED; } return (ret); } /* * RNDIS filter on device add */ int hv_rf_on_device_add(struct hn_softc *sc, void *additl_info, - int nchan) + int nchan, struct hn_rx_ring *rxr) { int ret; netvsc_dev *net_dev; rndis_device *rndis_dev; nvsp_msg *init_pkt; rndis_offload_params offloads; struct rndis_recv_scale_cap rsscaps; uint32_t rsscaps_size = sizeof(struct rndis_recv_scale_cap); netvsc_device_info *dev_info = (netvsc_device_info *)additl_info; device_t dev = sc->hn_dev; rndis_dev = hv_get_rndis_device(); if (rndis_dev == NULL) { return (ENOMEM); } /* * Let the inner driver handle this first to create the netvsc channel * NOTE! Once the channel is created, we may get a receive callback * (hv_rf_on_receive()) before this call is completed. * Note: Earlier code used a function pointer here. */ - net_dev = hv_nv_on_device_add(sc, additl_info); + net_dev = hv_nv_on_device_add(sc, additl_info, rxr); if (!net_dev) { hv_put_rndis_device(rndis_dev); return (ENOMEM); } /* * Initialize the rndis device */ net_dev->extension = rndis_dev; rndis_dev->net_dev = net_dev; /* Send the rndis initialization message */ ret = hv_rf_init_device(rndis_dev); if (ret != 0) { /* * TODO: If rndis init failed, we will need to shut down * the channel */ } /* Get the mac address */ ret = hv_rf_query_device_mac(rndis_dev); if (ret != 0) { /* TODO: shut down rndis device and the channel */ } /* config csum offload and send request to host */ memset(&offloads, 0, sizeof(offloads)); offloads.ipv4_csum = RNDIS_OFFLOAD_PARAMETERS_TX_RX_ENABLED; offloads.tcp_ipv4_csum = RNDIS_OFFLOAD_PARAMETERS_TX_RX_ENABLED; offloads.udp_ipv4_csum = RNDIS_OFFLOAD_PARAMETERS_TX_RX_ENABLED; offloads.tcp_ipv6_csum = RNDIS_OFFLOAD_PARAMETERS_TX_RX_ENABLED; offloads.udp_ipv6_csum = RNDIS_OFFLOAD_PARAMETERS_TX_RX_ENABLED; offloads.lso_v2_ipv4 = RNDIS_OFFLOAD_PARAMETERS_LSOV2_ENABLED; ret = hv_rf_send_offload_request(sc, &offloads); if (ret != 0) { /* TODO: shut down rndis device and the channel */ device_printf(dev, "hv_rf_send_offload_request failed, ret=%d\n", ret); } - memcpy(dev_info->mac_addr, rndis_dev->hw_mac_addr, HW_MACADDR_LEN); + memcpy(dev_info->mac_addr, rndis_dev->hw_mac_addr, ETHER_ADDR_LEN); hv_rf_query_device_link_status(rndis_dev); dev_info->link_state = rndis_dev->link_status; net_dev->num_channel = 1; if (net_dev->nvsp_version < NVSP_PROTOCOL_VERSION_5 || nchan == 1) return (0); memset(&rsscaps, 0, rsscaps_size); ret = hv_rf_query_device(rndis_dev, RNDIS_OID_GEN_RSS_CAPABILITIES, &rsscaps, &rsscaps_size); if ((ret != 0) || (rsscaps.num_recv_que < 2)) { device_printf(dev, "hv_rf_query_device failed or " "rsscaps.num_recv_que < 2 \n"); goto out; } device_printf(dev, "channel, offered %u, requested %d\n", rsscaps.num_recv_que, nchan); if (nchan > rsscaps.num_recv_que) nchan = rsscaps.num_recv_que; net_dev->num_channel = nchan; if (net_dev->num_channel == 1) { device_printf(dev, "net_dev->num_channel == 1 under VRSS\n"); goto out; } /* request host to create sub channels */ init_pkt = &net_dev->channel_init_packet; memset(init_pkt, 0, sizeof(nvsp_msg)); init_pkt->hdr.msg_type = nvsp_msg5_type_subchannel; init_pkt->msgs.vers_5_msgs.subchannel_request.op = NVSP_SUBCHANNE_ALLOCATE; init_pkt->msgs.vers_5_msgs.subchannel_request.num_subchannels = net_dev->num_channel - 1; ret = vmbus_chan_send(sc->hn_prichan, VMBUS_CHANPKT_TYPE_INBAND, VMBUS_CHANPKT_FLAG_RC, init_pkt, sizeof(nvsp_msg), (uint64_t)(uintptr_t)init_pkt); if (ret != 0) { device_printf(dev, "Fail to allocate subchannel\n"); goto out; } sema_wait(&net_dev->channel_init_sema); if (init_pkt->msgs.vers_5_msgs.subchn_complete.status != nvsp_status_success) { ret = ENODEV; device_printf(dev, "sub channel complete error\n"); goto out; } net_dev->num_channel = 1 + init_pkt->msgs.vers_5_msgs.subchn_complete.num_subchannels; ret = hv_rf_set_rss_param(rndis_dev, net_dev->num_channel); out: if (ret) net_dev->num_channel = 1; return (ret); } /* * RNDIS filter on device remove */ int hv_rf_on_device_remove(struct hn_softc *sc, boolean_t destroy_channel) { netvsc_dev *net_dev = sc->net_dev; rndis_device *rndis_dev = (rndis_device *)net_dev->extension; int ret; /* Halt and release the rndis device */ ret = hv_rf_halt_device(rndis_dev); hv_put_rndis_device(rndis_dev); net_dev->extension = NULL; /* Pass control to inner driver to remove the device */ ret |= hv_nv_on_device_remove(sc, destroy_channel); return (ret); } /* * RNDIS filter on open */ int hv_rf_on_open(struct hn_softc *sc) { netvsc_dev *net_dev = sc->net_dev; return (hv_rf_open_device((rndis_device *)net_dev->extension)); } /* * RNDIS filter on close */ int hv_rf_on_close(struct hn_softc *sc) { netvsc_dev *net_dev = sc->net_dev; return (hv_rf_close_device((rndis_device *)net_dev->extension)); } /* * RNDIS filter on send request completion callback */ static void -hv_rf_on_send_request_completion(struct hv_vmbus_channel *chan __unused, +hv_rf_on_send_request_completion(struct vmbus_channel *chan __unused, void *context __unused) { } /* * RNDIS filter on send request (halt only) completion callback */ static void -hv_rf_on_send_request_halt_completion(struct hv_vmbus_channel *chan __unused, +hv_rf_on_send_request_halt_completion(struct vmbus_channel *chan __unused, void *context) { rndis_request *request = context; /* * Notify hv_rf_halt_device() about halt completion. * The halt code must wait for completion before freeing * the transaction resources. */ request->halt_complete_flag = 1; } void -hv_rf_channel_rollup(struct hv_vmbus_channel *chan) +hv_rf_channel_rollup(struct hn_rx_ring *rxr, struct hn_tx_ring *txr) { - netvsc_channel_rollup(chan); + netvsc_channel_rollup(rxr, txr); } Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_rndis_filter.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_rndis_filter.h (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/netvsc/hv_rndis_filter.h (revision 303206) @@ -1,125 +1,128 @@ /*- * Copyright (c) 2009-2012,2016 Microsoft Corp. * Copyright (c) 2010-2012 Citrix Inc. * Copyright (c) 2012 NetApp Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #ifndef __HV_RNDIS_FILTER_H__ #define __HV_RNDIS_FILTER_H__ +#include +#include /* * Defines */ /* Destroy or preserve channel on filter/netvsc teardown */ #define HV_RF_NV_DESTROY_CHANNEL TRUE #define HV_RF_NV_RETAIN_CHANNEL FALSE /* * Number of page buffers to reserve for the RNDIS filter packet in the * transmitted message. */ #define HV_RF_NUM_TX_RESERVED_PAGE_BUFS 1 /* * Data types */ typedef enum { RNDIS_DEV_UNINITIALIZED = 0, RNDIS_DEV_INITIALIZING, RNDIS_DEV_INITIALIZED, RNDIS_DEV_DATAINITIALIZED, } rndis_device_state; typedef struct rndis_request_ { STAILQ_ENTRY(rndis_request_) mylist_entry; struct sema wait_sema; /* * The max response size is sizeof(rndis_msg) + PAGE_SIZE. * * XXX * This is ugly and should be cleaned up once we busdma-fy * RNDIS request bits. */ rndis_msg response_msg; uint8_t buf_resp[PAGE_SIZE]; /* Simplify allocation by having a netvsc packet inline */ netvsc_packet pkt; /* * The max request size is sizeof(rndis_msg) + PAGE_SIZE. * * NOTE: * This is required for the large request like RSS settings. * * XXX * This is ugly and should be cleaned up once we busdma-fy * RNDIS request bits. */ rndis_msg request_msg; uint8_t buf_req[PAGE_SIZE]; /* Fixme: Poor man's semaphore. */ uint32_t halt_complete_flag; } rndis_request; typedef struct rndis_device_ { netvsc_dev *net_dev; rndis_device_state state; uint32_t link_status; uint32_t new_request_id; struct mtx req_lock; STAILQ_HEAD(RQ, rndis_request_) myrequest_list; - uint8_t hw_mac_addr[HW_MACADDR_LEN]; + uint8_t hw_mac_addr[ETHER_ADDR_LEN]; } rndis_device; /* * Externs */ -struct hv_vmbus_channel; struct hn_softc; +struct hn_rx_ring; int hv_rf_on_receive(netvsc_dev *net_dev, - struct hv_vmbus_channel *chan, netvsc_packet *pkt); + struct hn_rx_ring *rxr, netvsc_packet *pkt); void hv_rf_receive_rollup(netvsc_dev *net_dev); -void hv_rf_channel_rollup(struct hv_vmbus_channel *chan); -int hv_rf_on_device_add(struct hn_softc *sc, void *additl_info, int nchan); +void hv_rf_channel_rollup(struct hn_rx_ring *rxr, struct hn_tx_ring *txr); +int hv_rf_on_device_add(struct hn_softc *sc, void *additl_info, int nchan, + struct hn_rx_ring *rxr); int hv_rf_on_device_remove(struct hn_softc *sc, boolean_t destroy_channel); int hv_rf_on_open(struct hn_softc *sc); int hv_rf_on_close(struct hn_softc *sc); #endif /* __HV_RNDIS_FILTER_H__ */ Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c (revision 303206) @@ -1,2087 +1,2081 @@ /*- * Copyright (c) 2009-2012,2016 Microsoft Corp. * Copyright (c) 2012 NetApp Inc. * Copyright (c) 2012 Citrix Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /** * StorVSC driver for Hyper-V. This driver presents a SCSI HBA interface * to the Comman Access Method (CAM) layer. CAM control blocks (CCBs) are * converted into VSCSI protocol messages which are delivered to the parent * partition StorVSP driver over the Hyper-V VMBUS. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "hv_vstorage.h" #include "vmbus_if.h" #define STORVSC_RINGBUFFER_SIZE (20*PAGE_SIZE) #define STORVSC_MAX_LUNS_PER_TARGET (64) #define STORVSC_MAX_IO_REQUESTS (STORVSC_MAX_LUNS_PER_TARGET * 2) #define BLKVSC_MAX_IDE_DISKS_PER_TARGET (1) #define BLKVSC_MAX_IO_REQUESTS STORVSC_MAX_IO_REQUESTS #define STORVSC_MAX_TARGETS (2) #define VSTOR_PKT_SIZE (sizeof(struct vstor_packet) - vmscsi_size_delta) #define HV_ALIGN(x, a) roundup2(x, a) struct storvsc_softc; struct hv_sgl_node { LIST_ENTRY(hv_sgl_node) link; struct sglist *sgl_data; }; struct hv_sgl_page_pool{ LIST_HEAD(, hv_sgl_node) in_use_sgl_list; LIST_HEAD(, hv_sgl_node) free_sgl_list; boolean_t is_init; } g_hv_sgl_page_pool; #define STORVSC_MAX_SG_PAGE_CNT STORVSC_MAX_IO_REQUESTS * VMBUS_CHAN_PRPLIST_MAX enum storvsc_request_type { WRITE_TYPE, READ_TYPE, UNKNOWN_TYPE }; struct hvs_gpa_range { struct vmbus_gpa_range gpa_range; uint64_t gpa_page[VMBUS_CHAN_PRPLIST_MAX]; } __packed; struct hv_storvsc_request { LIST_ENTRY(hv_storvsc_request) link; struct vstor_packet vstor_packet; int prp_cnt; struct hvs_gpa_range prp_list; void *sense_data; uint8_t sense_info_len; uint8_t retries; union ccb *ccb; struct storvsc_softc *softc; struct callout callout; struct sema synch_sema; /*Synchronize the request/response if needed */ struct sglist *bounce_sgl; unsigned int bounce_sgl_count; uint64_t not_aligned_seg_bits; }; struct storvsc_softc { - struct hv_vmbus_channel *hs_chan; + struct vmbus_channel *hs_chan; LIST_HEAD(, hv_storvsc_request) hs_free_list; struct mtx hs_lock; struct storvsc_driver_props *hs_drv_props; int hs_unit; uint32_t hs_frozen; struct cam_sim *hs_sim; struct cam_path *hs_path; uint32_t hs_num_out_reqs; boolean_t hs_destroy; boolean_t hs_drain_notify; struct sema hs_drain_sema; struct hv_storvsc_request hs_init_req; struct hv_storvsc_request hs_reset_req; device_t hs_dev; - struct hv_vmbus_channel *hs_cpu2chan[MAXCPU]; + struct vmbus_channel *hs_cpu2chan[MAXCPU]; }; /** * HyperV storvsc timeout testing cases: * a. IO returned after first timeout; * b. IO returned after second timeout and queue freeze; * c. IO returned while timer handler is running * The first can be tested by "sg_senddiag -vv /dev/daX", * and the second and third can be done by * "sg_wr_mode -v -p 08 -c 0,1a -m 0,ff /dev/daX". */ #define HVS_TIMEOUT_TEST 0 /* * Bus/adapter reset functionality on the Hyper-V host is * buggy and it will be disabled until * it can be further tested. */ #define HVS_HOST_RESET 0 struct storvsc_driver_props { char *drv_name; char *drv_desc; uint8_t drv_max_luns_per_target; uint8_t drv_max_ios_per_target; uint32_t drv_ringbuffer_size; }; enum hv_storage_type { DRIVER_BLKVSC, DRIVER_STORVSC, DRIVER_UNKNOWN }; #define HS_MAX_ADAPTERS 10 #define HV_STORAGE_SUPPORTS_MULTI_CHANNEL 0x1 /* {ba6163d9-04a1-4d29-b605-72e2ffb1dc7f} */ static const struct hyperv_guid gStorVscDeviceType={ .hv_guid = {0xd9, 0x63, 0x61, 0xba, 0xa1, 0x04, 0x29, 0x4d, 0xb6, 0x05, 0x72, 0xe2, 0xff, 0xb1, 0xdc, 0x7f} }; /* {32412632-86cb-44a2-9b5c-50d1417354f5} */ static const struct hyperv_guid gBlkVscDeviceType={ .hv_guid = {0x32, 0x26, 0x41, 0x32, 0xcb, 0x86, 0xa2, 0x44, 0x9b, 0x5c, 0x50, 0xd1, 0x41, 0x73, 0x54, 0xf5} }; static struct storvsc_driver_props g_drv_props_table[] = { {"blkvsc", "Hyper-V IDE Storage Interface", BLKVSC_MAX_IDE_DISKS_PER_TARGET, BLKVSC_MAX_IO_REQUESTS, STORVSC_RINGBUFFER_SIZE}, {"storvsc", "Hyper-V SCSI Storage Interface", STORVSC_MAX_LUNS_PER_TARGET, STORVSC_MAX_IO_REQUESTS, STORVSC_RINGBUFFER_SIZE} }; /* * Sense buffer size changed in win8; have a run-time * variable to track the size we should use. */ static int sense_buffer_size = PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE; /* * The size of the vmscsi_request has changed in win8. The * additional size is for the newly added elements in the * structure. These elements are valid only when we are talking * to a win8 host. * Track the correct size we need to apply. */ static int vmscsi_size_delta; /* * The storage protocol version is determined during the * initial exchange with the host. It will indicate which * storage functionality is available in the host. */ static int vmstor_proto_version; struct vmstor_proto { int proto_version; int sense_buffer_size; int vmscsi_size_delta; }; static const struct vmstor_proto vmstor_proto_list[] = { { VMSTOR_PROTOCOL_VERSION_WIN10, POST_WIN7_STORVSC_SENSE_BUFFER_SIZE, 0 }, { VMSTOR_PROTOCOL_VERSION_WIN8_1, POST_WIN7_STORVSC_SENSE_BUFFER_SIZE, 0 }, { VMSTOR_PROTOCOL_VERSION_WIN8, POST_WIN7_STORVSC_SENSE_BUFFER_SIZE, 0 }, { VMSTOR_PROTOCOL_VERSION_WIN7, PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE, sizeof(struct vmscsi_win8_extension), }, { VMSTOR_PROTOCOL_VERSION_WIN6, PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE, sizeof(struct vmscsi_win8_extension), } }; /* static functions */ static int storvsc_probe(device_t dev); static int storvsc_attach(device_t dev); static int storvsc_detach(device_t dev); static void storvsc_poll(struct cam_sim * sim); static void storvsc_action(struct cam_sim * sim, union ccb * ccb); static int create_storvsc_request(union ccb *ccb, struct hv_storvsc_request *reqp); static void storvsc_free_request(struct storvsc_softc *sc, struct hv_storvsc_request *reqp); static enum hv_storage_type storvsc_get_storage_type(device_t dev); static void hv_storvsc_rescan_target(struct storvsc_softc *sc); -static void hv_storvsc_on_channel_callback(void *xchan); +static void hv_storvsc_on_channel_callback(struct vmbus_channel *chan, void *xsc); static void hv_storvsc_on_iocompletion( struct storvsc_softc *sc, struct vstor_packet *vstor_packet, struct hv_storvsc_request *request); static int hv_storvsc_connect_vsp(struct storvsc_softc *); static void storvsc_io_done(struct hv_storvsc_request *reqp); static void storvsc_copy_sgl_to_bounce_buf(struct sglist *bounce_sgl, bus_dma_segment_t *orig_sgl, unsigned int orig_sgl_count, uint64_t seg_bits); void storvsc_copy_from_bounce_buf_to_sgl(bus_dma_segment_t *dest_sgl, unsigned int dest_sgl_count, struct sglist* src_sgl, uint64_t seg_bits); static device_method_t storvsc_methods[] = { /* Device interface */ DEVMETHOD(device_probe, storvsc_probe), DEVMETHOD(device_attach, storvsc_attach), DEVMETHOD(device_detach, storvsc_detach), DEVMETHOD(device_shutdown, bus_generic_shutdown), DEVMETHOD_END }; static driver_t storvsc_driver = { "storvsc", storvsc_methods, sizeof(struct storvsc_softc), }; static devclass_t storvsc_devclass; DRIVER_MODULE(storvsc, vmbus, storvsc_driver, storvsc_devclass, 0, 0); MODULE_VERSION(storvsc, 1); MODULE_DEPEND(storvsc, vmbus, 1, 1, 1); static void storvsc_subchan_attach(struct storvsc_softc *sc, - struct hv_vmbus_channel *new_channel) + struct vmbus_channel *new_channel) { struct vmstor_chan_props props; int ret = 0; memset(&props, 0, sizeof(props)); - new_channel->hv_chan_priv1 = sc; vmbus_chan_cpu_rr(new_channel); ret = vmbus_chan_open(new_channel, sc->hs_drv_props->drv_ringbuffer_size, sc->hs_drv_props->drv_ringbuffer_size, (void *)&props, sizeof(struct vmstor_chan_props), - hv_storvsc_on_channel_callback, - new_channel); + hv_storvsc_on_channel_callback, sc); } /** * @brief Send multi-channel creation request to host * * @param device a Hyper-V device pointer * @param max_chans the max channels supported by vmbus */ static void storvsc_send_multichannel_request(struct storvsc_softc *sc, int max_chans) { - struct hv_vmbus_channel **subchan; + struct vmbus_channel **subchan; struct hv_storvsc_request *request; struct vstor_packet *vstor_packet; int request_channels_cnt = 0; int ret, i; /* get multichannels count that need to create */ request_channels_cnt = MIN(max_chans, mp_ncpus); request = &sc->hs_init_req; /* request the host to create multi-channel */ memset(request, 0, sizeof(struct hv_storvsc_request)); sema_init(&request->synch_sema, 0, ("stor_synch_sema")); vstor_packet = &request->vstor_packet; vstor_packet->operation = VSTOR_OPERATION_CREATE_MULTI_CHANNELS; vstor_packet->flags = REQUEST_COMPLETION_FLAG; vstor_packet->u.multi_channels_cnt = request_channels_cnt; ret = vmbus_chan_send(sc->hs_chan, VMBUS_CHANPKT_TYPE_INBAND, VMBUS_CHANPKT_FLAG_RC, vstor_packet, VSTOR_PKT_SIZE, (uint64_t)(uintptr_t)request); /* wait for 5 seconds */ ret = sema_timedwait(&request->synch_sema, 5 * hz); if (ret != 0) { printf("Storvsc_error: create multi-channel timeout, %d\n", ret); return; } if (vstor_packet->operation != VSTOR_OPERATION_COMPLETEIO || vstor_packet->status != 0) { printf("Storvsc_error: create multi-channel invalid operation " "(%d) or statue (%u)\n", vstor_packet->operation, vstor_packet->status); return; } /* Wait for sub-channels setup to complete. */ subchan = vmbus_subchan_get(sc->hs_chan, request_channels_cnt); /* Attach the sub-channels. */ for (i = 0; i < request_channels_cnt; ++i) storvsc_subchan_attach(sc, subchan[i]); /* Release the sub-channels. */ vmbus_subchan_rel(subchan, request_channels_cnt); if (bootverbose) printf("Storvsc create multi-channel success!\n"); } /** * @brief initialize channel connection to parent partition * * @param dev a Hyper-V device pointer * @returns 0 on success, non-zero error on failure */ static int hv_storvsc_channel_init(struct storvsc_softc *sc) { int ret = 0, i; struct hv_storvsc_request *request; struct vstor_packet *vstor_packet; uint16_t max_chans = 0; boolean_t support_multichannel = FALSE; uint32_t version; max_chans = 0; support_multichannel = FALSE; request = &sc->hs_init_req; memset(request, 0, sizeof(struct hv_storvsc_request)); vstor_packet = &request->vstor_packet; request->softc = sc; /** * Initiate the vsc/vsp initialization protocol on the open channel */ sema_init(&request->synch_sema, 0, ("stor_synch_sema")); vstor_packet->operation = VSTOR_OPERATION_BEGININITIALIZATION; vstor_packet->flags = REQUEST_COMPLETION_FLAG; ret = vmbus_chan_send(sc->hs_chan, VMBUS_CHANPKT_TYPE_INBAND, VMBUS_CHANPKT_FLAG_RC, vstor_packet, VSTOR_PKT_SIZE, (uint64_t)(uintptr_t)request); if (ret != 0) goto cleanup; /* wait 5 seconds */ ret = sema_timedwait(&request->synch_sema, 5 * hz); if (ret != 0) goto cleanup; if (vstor_packet->operation != VSTOR_OPERATION_COMPLETEIO || vstor_packet->status != 0) { goto cleanup; } for (i = 0; i < nitems(vmstor_proto_list); i++) { /* reuse the packet for version range supported */ memset(vstor_packet, 0, sizeof(struct vstor_packet)); vstor_packet->operation = VSTOR_OPERATION_QUERYPROTOCOLVERSION; vstor_packet->flags = REQUEST_COMPLETION_FLAG; vstor_packet->u.version.major_minor = vmstor_proto_list[i].proto_version; /* revision is only significant for Windows guests */ vstor_packet->u.version.revision = 0; ret = vmbus_chan_send(sc->hs_chan, VMBUS_CHANPKT_TYPE_INBAND, VMBUS_CHANPKT_FLAG_RC, vstor_packet, VSTOR_PKT_SIZE, (uint64_t)(uintptr_t)request); if (ret != 0) goto cleanup; /* wait 5 seconds */ ret = sema_timedwait(&request->synch_sema, 5 * hz); if (ret) goto cleanup; if (vstor_packet->operation != VSTOR_OPERATION_COMPLETEIO) { ret = EINVAL; goto cleanup; } if (vstor_packet->status == 0) { vmstor_proto_version = vmstor_proto_list[i].proto_version; sense_buffer_size = vmstor_proto_list[i].sense_buffer_size; vmscsi_size_delta = vmstor_proto_list[i].vmscsi_size_delta; break; } } if (vstor_packet->status != 0) { ret = EINVAL; goto cleanup; } /** * Query channel properties */ memset(vstor_packet, 0, sizeof(struct vstor_packet)); vstor_packet->operation = VSTOR_OPERATION_QUERYPROPERTIES; vstor_packet->flags = REQUEST_COMPLETION_FLAG; ret = vmbus_chan_send(sc->hs_chan, VMBUS_CHANPKT_TYPE_INBAND, VMBUS_CHANPKT_FLAG_RC, vstor_packet, VSTOR_PKT_SIZE, (uint64_t)(uintptr_t)request); if ( ret != 0) goto cleanup; /* wait 5 seconds */ ret = sema_timedwait(&request->synch_sema, 5 * hz); if (ret != 0) goto cleanup; /* TODO: Check returned version */ if (vstor_packet->operation != VSTOR_OPERATION_COMPLETEIO || vstor_packet->status != 0) { goto cleanup; } /* multi-channels feature is supported by WIN8 and above version */ max_chans = vstor_packet->u.chan_props.max_channel_cnt; version = VMBUS_GET_VERSION(device_get_parent(sc->hs_dev), sc->hs_dev); if (version != VMBUS_VERSION_WIN7 && version != VMBUS_VERSION_WS2008 && (vstor_packet->u.chan_props.flags & HV_STORAGE_SUPPORTS_MULTI_CHANNEL)) { support_multichannel = TRUE; } memset(vstor_packet, 0, sizeof(struct vstor_packet)); vstor_packet->operation = VSTOR_OPERATION_ENDINITIALIZATION; vstor_packet->flags = REQUEST_COMPLETION_FLAG; ret = vmbus_chan_send(sc->hs_chan, VMBUS_CHANPKT_TYPE_INBAND, VMBUS_CHANPKT_FLAG_RC, vstor_packet, VSTOR_PKT_SIZE, (uint64_t)(uintptr_t)request); if (ret != 0) { goto cleanup; } /* wait 5 seconds */ ret = sema_timedwait(&request->synch_sema, 5 * hz); if (ret != 0) goto cleanup; if (vstor_packet->operation != VSTOR_OPERATION_COMPLETEIO || vstor_packet->status != 0) goto cleanup; /* * If multi-channel is supported, send multichannel create * request to host. */ if (support_multichannel) storvsc_send_multichannel_request(sc, max_chans); cleanup: sema_destroy(&request->synch_sema); return (ret); } /** * @brief Open channel connection to paraent partition StorVSP driver * * Open and initialize channel connection to parent partition StorVSP driver. * * @param pointer to a Hyper-V device * @returns 0 on success, non-zero error on failure */ static int hv_storvsc_connect_vsp(struct storvsc_softc *sc) { int ret = 0; struct vmstor_chan_props props; memset(&props, 0, sizeof(struct vmstor_chan_props)); /* * Open the channel */ - KASSERT(sc->hs_chan->hv_chan_priv1 == sc, ("invalid chan priv1")); vmbus_chan_cpu_rr(sc->hs_chan); ret = vmbus_chan_open( sc->hs_chan, sc->hs_drv_props->drv_ringbuffer_size, sc->hs_drv_props->drv_ringbuffer_size, (void *)&props, sizeof(struct vmstor_chan_props), - hv_storvsc_on_channel_callback, - sc->hs_chan); + hv_storvsc_on_channel_callback, sc); if (ret != 0) { return ret; } ret = hv_storvsc_channel_init(sc); return (ret); } #if HVS_HOST_RESET static int hv_storvsc_host_reset(struct storvsc_softc *sc) { int ret = 0; struct hv_storvsc_request *request; struct vstor_packet *vstor_packet; request = &sc->hs_reset_req; request->softc = sc; vstor_packet = &request->vstor_packet; sema_init(&request->synch_sema, 0, "stor synch sema"); vstor_packet->operation = VSTOR_OPERATION_RESETBUS; vstor_packet->flags = REQUEST_COMPLETION_FLAG; ret = vmbus_chan_send(dev->channel, VMBUS_CHANPKT_TYPE_INBAND, VMBUS_CHANPKT_FLAG_RC, vstor_packet, VSTOR_PKT_SIZE, (uint64_t)(uintptr_t)&sc->hs_reset_req); if (ret != 0) { goto cleanup; } ret = sema_timedwait(&request->synch_sema, 5 * hz); /* KYS 5 seconds */ if (ret) { goto cleanup; } /* * At this point, all outstanding requests in the adapter * should have been flushed out and return to us */ cleanup: sema_destroy(&request->synch_sema); return (ret); } #endif /* HVS_HOST_RESET */ /** * @brief Function to initiate an I/O request * * @param device Hyper-V device pointer * @param request pointer to a request structure * @returns 0 on success, non-zero error on failure */ static int hv_storvsc_io_request(struct storvsc_softc *sc, struct hv_storvsc_request *request) { struct vstor_packet *vstor_packet = &request->vstor_packet; - struct hv_vmbus_channel* outgoing_channel = NULL; + struct vmbus_channel* outgoing_channel = NULL; int ret = 0; vstor_packet->flags |= REQUEST_COMPLETION_FLAG; vstor_packet->u.vm_srb.length = VSTOR_PKT_SIZE; vstor_packet->u.vm_srb.sense_info_len = sense_buffer_size; vstor_packet->u.vm_srb.transfer_len = request->prp_list.gpa_range.gpa_len; vstor_packet->operation = VSTOR_OPERATION_EXECUTESRB; outgoing_channel = sc->hs_cpu2chan[curcpu]; mtx_unlock(&request->softc->hs_lock); if (request->prp_list.gpa_range.gpa_len) { ret = vmbus_chan_send_prplist(outgoing_channel, &request->prp_list.gpa_range, request->prp_cnt, vstor_packet, VSTOR_PKT_SIZE, (uint64_t)(uintptr_t)request); } else { ret = vmbus_chan_send(outgoing_channel, VMBUS_CHANPKT_TYPE_INBAND, VMBUS_CHANPKT_FLAG_RC, vstor_packet, VSTOR_PKT_SIZE, (uint64_t)(uintptr_t)request); } mtx_lock(&request->softc->hs_lock); if (ret != 0) { printf("Unable to send packet %p ret %d", vstor_packet, ret); } else { atomic_add_int(&sc->hs_num_out_reqs, 1); } return (ret); } /** * Process IO_COMPLETION_OPERATION and ready * the result to be completed for upper layer * processing by the CAM layer. */ static void hv_storvsc_on_iocompletion(struct storvsc_softc *sc, struct vstor_packet *vstor_packet, struct hv_storvsc_request *request) { struct vmscsi_req *vm_srb; vm_srb = &vstor_packet->u.vm_srb; /* * Copy some fields of the host's response into the request structure, * because the fields will be used later in storvsc_io_done(). */ request->vstor_packet.u.vm_srb.scsi_status = vm_srb->scsi_status; request->vstor_packet.u.vm_srb.transfer_len = vm_srb->transfer_len; if (((vm_srb->scsi_status & 0xFF) == SCSI_STATUS_CHECK_COND) && (vm_srb->srb_status & SRB_STATUS_AUTOSENSE_VALID)) { /* Autosense data available */ KASSERT(vm_srb->sense_info_len <= request->sense_info_len, ("vm_srb->sense_info_len <= " "request->sense_info_len")); memcpy(request->sense_data, vm_srb->u.sense_data, vm_srb->sense_info_len); request->sense_info_len = vm_srb->sense_info_len; } /* Complete request by passing to the CAM layer */ storvsc_io_done(request); atomic_subtract_int(&sc->hs_num_out_reqs, 1); if (sc->hs_drain_notify && (sc->hs_num_out_reqs == 0)) { sema_post(&sc->hs_drain_sema); } } static void hv_storvsc_rescan_target(struct storvsc_softc *sc) { path_id_t pathid; target_id_t targetid; union ccb *ccb; pathid = cam_sim_path(sc->hs_sim); targetid = CAM_TARGET_WILDCARD; /* * Allocate a CCB and schedule a rescan. */ ccb = xpt_alloc_ccb_nowait(); if (ccb == NULL) { printf("unable to alloc CCB for rescan\n"); return; } if (xpt_create_path(&ccb->ccb_h.path, NULL, pathid, targetid, CAM_LUN_WILDCARD) != CAM_REQ_CMP) { printf("unable to create path for rescan, pathid: %u," "targetid: %u\n", pathid, targetid); xpt_free_ccb(ccb); return; } if (targetid == CAM_TARGET_WILDCARD) ccb->ccb_h.func_code = XPT_SCAN_BUS; else ccb->ccb_h.func_code = XPT_SCAN_TGT; xpt_rescan(ccb); } static void -hv_storvsc_on_channel_callback(void *xchan) +hv_storvsc_on_channel_callback(struct vmbus_channel *channel, void *xsc) { int ret = 0; - hv_vmbus_channel *channel = xchan; - struct storvsc_softc *sc = channel->hv_chan_priv1; + struct storvsc_softc *sc = xsc; uint32_t bytes_recvd; uint64_t request_id; uint8_t packet[roundup2(sizeof(struct vstor_packet), 8)]; struct hv_storvsc_request *request; struct vstor_packet *vstor_packet; bytes_recvd = roundup2(VSTOR_PKT_SIZE, 8); ret = vmbus_chan_recv(channel, packet, &bytes_recvd, &request_id); KASSERT(ret != ENOBUFS, ("storvsc recvbuf is not large enough")); /* XXX check bytes_recvd to make sure that it contains enough data */ while ((ret == 0) && (bytes_recvd > 0)) { request = (struct hv_storvsc_request *)(uintptr_t)request_id; if ((request == &sc->hs_init_req) || (request == &sc->hs_reset_req)) { memcpy(&request->vstor_packet, packet, sizeof(struct vstor_packet)); sema_post(&request->synch_sema); } else { vstor_packet = (struct vstor_packet *)packet; switch(vstor_packet->operation) { case VSTOR_OPERATION_COMPLETEIO: if (request == NULL) panic("VMBUS: storvsc received a " "packet with NULL request id in " "COMPLETEIO operation."); hv_storvsc_on_iocompletion(sc, vstor_packet, request); break; case VSTOR_OPERATION_REMOVEDEVICE: printf("VMBUS: storvsc operation %d not " "implemented.\n", vstor_packet->operation); /* TODO: implement */ break; case VSTOR_OPERATION_ENUMERATE_BUS: hv_storvsc_rescan_target(sc); break; default: break; } } bytes_recvd = roundup2(VSTOR_PKT_SIZE, 8), ret = vmbus_chan_recv(channel, packet, &bytes_recvd, &request_id); KASSERT(ret != ENOBUFS, ("storvsc recvbuf is not large enough")); /* * XXX check bytes_recvd to make sure that it contains * enough data */ } } /** * @brief StorVSC probe function * * Device probe function. Returns 0 if the input device is a StorVSC * device. Otherwise, a ENXIO is returned. If the input device is * for BlkVSC (paravirtual IDE) device and this support is disabled in * favor of the emulated ATA/IDE device, return ENXIO. * * @param a device * @returns 0 on success, ENXIO if not a matcing StorVSC device */ static int storvsc_probe(device_t dev) { int ata_disk_enable = 0; int ret = ENXIO; switch (storvsc_get_storage_type(dev)) { case DRIVER_BLKVSC: if(bootverbose) device_printf(dev, "DRIVER_BLKVSC-Emulated ATA/IDE probe\n"); if (!getenv_int("hw.ata.disk_enable", &ata_disk_enable)) { if(bootverbose) device_printf(dev, "Enlightened ATA/IDE detected\n"); device_set_desc(dev, g_drv_props_table[DRIVER_BLKVSC].drv_desc); ret = BUS_PROBE_DEFAULT; } else if(bootverbose) device_printf(dev, "Emulated ATA/IDE set (hw.ata.disk_enable set)\n"); break; case DRIVER_STORVSC: if(bootverbose) device_printf(dev, "Enlightened SCSI device detected\n"); device_set_desc(dev, g_drv_props_table[DRIVER_STORVSC].drv_desc); ret = BUS_PROBE_DEFAULT; break; default: ret = ENXIO; } return (ret); } static void storvsc_create_cpu2chan(struct storvsc_softc *sc) { int cpu; CPU_FOREACH(cpu) { sc->hs_cpu2chan[cpu] = vmbus_chan_cpu2chan(sc->hs_chan, cpu); if (bootverbose) { device_printf(sc->hs_dev, "cpu%d -> chan%u\n", - cpu, sc->hs_cpu2chan[cpu]->ch_id); + cpu, vmbus_chan_id(sc->hs_cpu2chan[cpu])); } } } /** * @brief StorVSC attach function * * Function responsible for allocating per-device structures, * setting up CAM interfaces and scanning for available LUNs to * be used for SCSI device peripherals. * * @param a device * @returns 0 on success or an error on failure */ static int storvsc_attach(device_t dev) { enum hv_storage_type stor_type; struct storvsc_softc *sc; struct cam_devq *devq; int ret, i, j; struct hv_storvsc_request *reqp; struct root_hold_token *root_mount_token = NULL; struct hv_sgl_node *sgl_node = NULL; void *tmp_buff = NULL; /* * We need to serialize storvsc attach calls. */ root_mount_token = root_mount_hold("storvsc"); sc = device_get_softc(dev); sc->hs_chan = vmbus_get_channel(dev); - sc->hs_chan->hv_chan_priv1 = sc; stor_type = storvsc_get_storage_type(dev); if (stor_type == DRIVER_UNKNOWN) { ret = ENODEV; goto cleanup; } /* fill in driver specific properties */ sc->hs_drv_props = &g_drv_props_table[stor_type]; /* fill in device specific properties */ sc->hs_unit = device_get_unit(dev); sc->hs_dev = dev; LIST_INIT(&sc->hs_free_list); mtx_init(&sc->hs_lock, "hvslck", NULL, MTX_DEF); for (i = 0; i < sc->hs_drv_props->drv_max_ios_per_target; ++i) { reqp = malloc(sizeof(struct hv_storvsc_request), M_DEVBUF, M_WAITOK|M_ZERO); reqp->softc = sc; LIST_INSERT_HEAD(&sc->hs_free_list, reqp, link); } /* create sg-list page pool */ if (FALSE == g_hv_sgl_page_pool.is_init) { g_hv_sgl_page_pool.is_init = TRUE; LIST_INIT(&g_hv_sgl_page_pool.in_use_sgl_list); LIST_INIT(&g_hv_sgl_page_pool.free_sgl_list); /* * Pre-create SG list, each SG list with * VMBUS_CHAN_PRPLIST_MAX segments, each * segment has one page buffer */ for (i = 0; i < STORVSC_MAX_IO_REQUESTS; i++) { sgl_node = malloc(sizeof(struct hv_sgl_node), M_DEVBUF, M_WAITOK|M_ZERO); sgl_node->sgl_data = sglist_alloc(VMBUS_CHAN_PRPLIST_MAX, M_WAITOK|M_ZERO); for (j = 0; j < VMBUS_CHAN_PRPLIST_MAX; j++) { tmp_buff = malloc(PAGE_SIZE, M_DEVBUF, M_WAITOK|M_ZERO); sgl_node->sgl_data->sg_segs[j].ss_paddr = (vm_paddr_t)tmp_buff; } LIST_INSERT_HEAD(&g_hv_sgl_page_pool.free_sgl_list, sgl_node, link); } } sc->hs_destroy = FALSE; sc->hs_drain_notify = FALSE; sema_init(&sc->hs_drain_sema, 0, "Store Drain Sema"); ret = hv_storvsc_connect_vsp(sc); if (ret != 0) { goto cleanup; } /* Construct cpu to channel mapping */ storvsc_create_cpu2chan(sc); /* * Create the device queue. * Hyper-V maps each target to one SCSI HBA */ devq = cam_simq_alloc(sc->hs_drv_props->drv_max_ios_per_target); if (devq == NULL) { device_printf(dev, "Failed to alloc device queue\n"); ret = ENOMEM; goto cleanup; } sc->hs_sim = cam_sim_alloc(storvsc_action, storvsc_poll, sc->hs_drv_props->drv_name, sc, sc->hs_unit, &sc->hs_lock, 1, sc->hs_drv_props->drv_max_ios_per_target, devq); if (sc->hs_sim == NULL) { device_printf(dev, "Failed to alloc sim\n"); cam_simq_free(devq); ret = ENOMEM; goto cleanup; } mtx_lock(&sc->hs_lock); /* bus_id is set to 0, need to get it from VMBUS channel query? */ if (xpt_bus_register(sc->hs_sim, dev, 0) != CAM_SUCCESS) { cam_sim_free(sc->hs_sim, /*free_devq*/TRUE); mtx_unlock(&sc->hs_lock); device_printf(dev, "Unable to register SCSI bus\n"); ret = ENXIO; goto cleanup; } if (xpt_create_path(&sc->hs_path, /*periph*/NULL, cam_sim_path(sc->hs_sim), CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD) != CAM_REQ_CMP) { xpt_bus_deregister(cam_sim_path(sc->hs_sim)); cam_sim_free(sc->hs_sim, /*free_devq*/TRUE); mtx_unlock(&sc->hs_lock); device_printf(dev, "Unable to create path\n"); ret = ENXIO; goto cleanup; } mtx_unlock(&sc->hs_lock); root_mount_rel(root_mount_token); return (0); cleanup: root_mount_rel(root_mount_token); while (!LIST_EMPTY(&sc->hs_free_list)) { reqp = LIST_FIRST(&sc->hs_free_list); LIST_REMOVE(reqp, link); free(reqp, M_DEVBUF); } while (!LIST_EMPTY(&g_hv_sgl_page_pool.free_sgl_list)) { sgl_node = LIST_FIRST(&g_hv_sgl_page_pool.free_sgl_list); LIST_REMOVE(sgl_node, link); for (j = 0; j < VMBUS_CHAN_PRPLIST_MAX; j++) { if (NULL != (void*)sgl_node->sgl_data->sg_segs[j].ss_paddr) { free((void*)sgl_node->sgl_data->sg_segs[j].ss_paddr, M_DEVBUF); } } sglist_free(sgl_node->sgl_data); free(sgl_node, M_DEVBUF); } return (ret); } /** * @brief StorVSC device detach function * * This function is responsible for safely detaching a * StorVSC device. This includes waiting for inbound responses * to complete and freeing associated per-device structures. * * @param dev a device * returns 0 on success */ static int storvsc_detach(device_t dev) { struct storvsc_softc *sc = device_get_softc(dev); struct hv_storvsc_request *reqp = NULL; struct hv_sgl_node *sgl_node = NULL; int j = 0; sc->hs_destroy = TRUE; /* * At this point, all outbound traffic should be disabled. We * only allow inbound traffic (responses) to proceed so that * outstanding requests can be completed. */ sc->hs_drain_notify = TRUE; sema_wait(&sc->hs_drain_sema); sc->hs_drain_notify = FALSE; /* * Since we have already drained, we don't need to busy wait. * The call to close the channel will reset the callback * under the protection of the incoming channel lock. */ vmbus_chan_close(sc->hs_chan); mtx_lock(&sc->hs_lock); while (!LIST_EMPTY(&sc->hs_free_list)) { reqp = LIST_FIRST(&sc->hs_free_list); LIST_REMOVE(reqp, link); free(reqp, M_DEVBUF); } mtx_unlock(&sc->hs_lock); while (!LIST_EMPTY(&g_hv_sgl_page_pool.free_sgl_list)) { sgl_node = LIST_FIRST(&g_hv_sgl_page_pool.free_sgl_list); LIST_REMOVE(sgl_node, link); for (j = 0; j < VMBUS_CHAN_PRPLIST_MAX; j++){ if (NULL != (void*)sgl_node->sgl_data->sg_segs[j].ss_paddr) { free((void*)sgl_node->sgl_data->sg_segs[j].ss_paddr, M_DEVBUF); } } sglist_free(sgl_node->sgl_data); free(sgl_node, M_DEVBUF); } return (0); } #if HVS_TIMEOUT_TEST /** * @brief unit test for timed out operations * * This function provides unit testing capability to simulate * timed out operations. Recompilation with HV_TIMEOUT_TEST=1 * is required. * * @param reqp pointer to a request structure * @param opcode SCSI operation being performed * @param wait if 1, wait for I/O to complete */ static void storvsc_timeout_test(struct hv_storvsc_request *reqp, uint8_t opcode, int wait) { int ret; union ccb *ccb = reqp->ccb; struct storvsc_softc *sc = reqp->softc; if (reqp->vstor_packet.vm_srb.cdb[0] != opcode) { return; } if (wait) { mtx_lock(&reqp->event.mtx); } ret = hv_storvsc_io_request(sc, reqp); if (ret != 0) { if (wait) { mtx_unlock(&reqp->event.mtx); } printf("%s: io_request failed with %d.\n", __func__, ret); ccb->ccb_h.status = CAM_PROVIDE_FAIL; mtx_lock(&sc->hs_lock); storvsc_free_request(sc, reqp); xpt_done(ccb); mtx_unlock(&sc->hs_lock); return; } if (wait) { xpt_print(ccb->ccb_h.path, "%u: %s: waiting for IO return.\n", ticks, __func__); ret = cv_timedwait(&reqp->event.cv, &reqp->event.mtx, 60*hz); mtx_unlock(&reqp->event.mtx); xpt_print(ccb->ccb_h.path, "%u: %s: %s.\n", ticks, __func__, (ret == 0)? "IO return detected" : "IO return not detected"); /* * Now both the timer handler and io done are running * simultaneously. We want to confirm the io done always * finishes after the timer handler exits. So reqp used by * timer handler is not freed or stale. Do busy loop for * another 1/10 second to make sure io done does * wait for the timer handler to complete. */ DELAY(100*1000); mtx_lock(&sc->hs_lock); xpt_print(ccb->ccb_h.path, "%u: %s: finishing, queue frozen %d, " "ccb status 0x%x scsi_status 0x%x.\n", ticks, __func__, sc->hs_frozen, ccb->ccb_h.status, ccb->csio.scsi_status); mtx_unlock(&sc->hs_lock); } } #endif /* HVS_TIMEOUT_TEST */ #ifdef notyet /** * @brief timeout handler for requests * * This function is called as a result of a callout expiring. * * @param arg pointer to a request */ static void storvsc_timeout(void *arg) { struct hv_storvsc_request *reqp = arg; struct storvsc_softc *sc = reqp->softc; union ccb *ccb = reqp->ccb; if (reqp->retries == 0) { mtx_lock(&sc->hs_lock); xpt_print(ccb->ccb_h.path, "%u: IO timed out (req=0x%p), wait for another %u secs.\n", ticks, reqp, ccb->ccb_h.timeout / 1000); cam_error_print(ccb, CAM_ESF_ALL, CAM_EPF_ALL); mtx_unlock(&sc->hs_lock); reqp->retries++; callout_reset_sbt(&reqp->callout, SBT_1MS * ccb->ccb_h.timeout, 0, storvsc_timeout, reqp, 0); #if HVS_TIMEOUT_TEST storvsc_timeout_test(reqp, SEND_DIAGNOSTIC, 0); #endif return; } mtx_lock(&sc->hs_lock); xpt_print(ccb->ccb_h.path, "%u: IO (reqp = 0x%p) did not return for %u seconds, %s.\n", ticks, reqp, ccb->ccb_h.timeout * (reqp->retries+1) / 1000, (sc->hs_frozen == 0)? "freezing the queue" : "the queue is already frozen"); if (sc->hs_frozen == 0) { sc->hs_frozen = 1; xpt_freeze_simq(xpt_path_sim(ccb->ccb_h.path), 1); } mtx_unlock(&sc->hs_lock); #if HVS_TIMEOUT_TEST storvsc_timeout_test(reqp, MODE_SELECT_10, 1); #endif } #endif /** * @brief StorVSC device poll function * * This function is responsible for servicing requests when * interrupts are disabled (i.e when we are dumping core.) * * @param sim a pointer to a CAM SCSI interface module */ static void storvsc_poll(struct cam_sim *sim) { struct storvsc_softc *sc = cam_sim_softc(sim); mtx_assert(&sc->hs_lock, MA_OWNED); mtx_unlock(&sc->hs_lock); - hv_storvsc_on_channel_callback(sc->hs_chan); + hv_storvsc_on_channel_callback(sc->hs_chan, sc); mtx_lock(&sc->hs_lock); } /** * @brief StorVSC device action function * * This function is responsible for handling SCSI operations which * are passed from the CAM layer. The requests are in the form of * CAM control blocks which indicate the action being performed. * Not all actions require converting the request to a VSCSI protocol * message - these actions can be responded to by this driver. * Requests which are destined for a backend storage device are converted * to a VSCSI protocol message and sent on the channel connection associated * with this device. * * @param sim pointer to a CAM SCSI interface module * @param ccb pointer to a CAM control block */ static void storvsc_action(struct cam_sim *sim, union ccb *ccb) { struct storvsc_softc *sc = cam_sim_softc(sim); int res; mtx_assert(&sc->hs_lock, MA_OWNED); switch (ccb->ccb_h.func_code) { case XPT_PATH_INQ: { struct ccb_pathinq *cpi = &ccb->cpi; cpi->version_num = 1; cpi->hba_inquiry = PI_TAG_ABLE|PI_SDTR_ABLE; cpi->target_sprt = 0; cpi->hba_misc = PIM_NOBUSRESET; cpi->hba_eng_cnt = 0; cpi->max_target = STORVSC_MAX_TARGETS; cpi->max_lun = sc->hs_drv_props->drv_max_luns_per_target; cpi->initiator_id = cpi->max_target; cpi->bus_id = cam_sim_bus(sim); cpi->base_transfer_speed = 300000; cpi->transport = XPORT_SAS; cpi->transport_version = 0; cpi->protocol = PROTO_SCSI; cpi->protocol_version = SCSI_REV_SPC2; strncpy(cpi->sim_vid, "FreeBSD", SIM_IDLEN); strncpy(cpi->hba_vid, sc->hs_drv_props->drv_name, HBA_IDLEN); strncpy(cpi->dev_name, cam_sim_name(sim), DEV_IDLEN); cpi->unit_number = cam_sim_unit(sim); ccb->ccb_h.status = CAM_REQ_CMP; xpt_done(ccb); return; } case XPT_GET_TRAN_SETTINGS: { struct ccb_trans_settings *cts = &ccb->cts; cts->transport = XPORT_SAS; cts->transport_version = 0; cts->protocol = PROTO_SCSI; cts->protocol_version = SCSI_REV_SPC2; /* enable tag queuing and disconnected mode */ cts->proto_specific.valid = CTS_SCSI_VALID_TQ; cts->proto_specific.scsi.valid = CTS_SCSI_VALID_TQ; cts->proto_specific.scsi.flags = CTS_SCSI_FLAGS_TAG_ENB; cts->xport_specific.valid = CTS_SPI_VALID_DISC; cts->xport_specific.spi.flags = CTS_SPI_FLAGS_DISC_ENB; ccb->ccb_h.status = CAM_REQ_CMP; xpt_done(ccb); return; } case XPT_SET_TRAN_SETTINGS: { ccb->ccb_h.status = CAM_REQ_CMP; xpt_done(ccb); return; } case XPT_CALC_GEOMETRY:{ cam_calc_geometry(&ccb->ccg, 1); xpt_done(ccb); return; } case XPT_RESET_BUS: case XPT_RESET_DEV:{ #if HVS_HOST_RESET if ((res = hv_storvsc_host_reset(sc)) != 0) { xpt_print(ccb->ccb_h.path, "hv_storvsc_host_reset failed with %d\n", res); ccb->ccb_h.status = CAM_PROVIDE_FAIL; xpt_done(ccb); return; } ccb->ccb_h.status = CAM_REQ_CMP; xpt_done(ccb); return; #else xpt_print(ccb->ccb_h.path, "%s reset not supported.\n", (ccb->ccb_h.func_code == XPT_RESET_BUS)? "bus" : "dev"); ccb->ccb_h.status = CAM_REQ_INVALID; xpt_done(ccb); return; #endif /* HVS_HOST_RESET */ } case XPT_SCSI_IO: case XPT_IMMED_NOTIFY: { struct hv_storvsc_request *reqp = NULL; if (ccb->csio.cdb_len == 0) { panic("cdl_len is 0\n"); } if (LIST_EMPTY(&sc->hs_free_list)) { ccb->ccb_h.status = CAM_REQUEUE_REQ; if (sc->hs_frozen == 0) { sc->hs_frozen = 1; xpt_freeze_simq(sim, /* count*/1); } xpt_done(ccb); return; } reqp = LIST_FIRST(&sc->hs_free_list); LIST_REMOVE(reqp, link); bzero(reqp, sizeof(struct hv_storvsc_request)); reqp->softc = sc; ccb->ccb_h.status |= CAM_SIM_QUEUED; if ((res = create_storvsc_request(ccb, reqp)) != 0) { ccb->ccb_h.status = CAM_REQ_INVALID; xpt_done(ccb); return; } #ifdef notyet if (ccb->ccb_h.timeout != CAM_TIME_INFINITY) { callout_init(&reqp->callout, 1); callout_reset_sbt(&reqp->callout, SBT_1MS * ccb->ccb_h.timeout, 0, storvsc_timeout, reqp, 0); #if HVS_TIMEOUT_TEST cv_init(&reqp->event.cv, "storvsc timeout cv"); mtx_init(&reqp->event.mtx, "storvsc timeout mutex", NULL, MTX_DEF); switch (reqp->vstor_packet.vm_srb.cdb[0]) { case MODE_SELECT_10: case SEND_DIAGNOSTIC: /* To have timer send the request. */ return; default: break; } #endif /* HVS_TIMEOUT_TEST */ } #endif if ((res = hv_storvsc_io_request(sc, reqp)) != 0) { xpt_print(ccb->ccb_h.path, "hv_storvsc_io_request failed with %d\n", res); ccb->ccb_h.status = CAM_PROVIDE_FAIL; storvsc_free_request(sc, reqp); xpt_done(ccb); return; } return; } default: ccb->ccb_h.status = CAM_REQ_INVALID; xpt_done(ccb); return; } } /** * @brief destroy bounce buffer * * This function is responsible for destroy a Scatter/Gather list * that create by storvsc_create_bounce_buffer() * * @param sgl- the Scatter/Gather need be destroy * @param sg_count- page count of the SG list. * */ static void storvsc_destroy_bounce_buffer(struct sglist *sgl) { struct hv_sgl_node *sgl_node = NULL; if (LIST_EMPTY(&g_hv_sgl_page_pool.in_use_sgl_list)) { printf("storvsc error: not enough in use sgl\n"); return; } sgl_node = LIST_FIRST(&g_hv_sgl_page_pool.in_use_sgl_list); LIST_REMOVE(sgl_node, link); sgl_node->sgl_data = sgl; LIST_INSERT_HEAD(&g_hv_sgl_page_pool.free_sgl_list, sgl_node, link); } /** * @brief create bounce buffer * * This function is responsible for create a Scatter/Gather list, * which hold several pages that can be aligned with page size. * * @param seg_count- SG-list segments count * @param write - if WRITE_TYPE, set SG list page used size to 0, * otherwise set used size to page size. * * return NULL if create failed */ static struct sglist * storvsc_create_bounce_buffer(uint16_t seg_count, int write) { int i = 0; struct sglist *bounce_sgl = NULL; unsigned int buf_len = ((write == WRITE_TYPE) ? 0 : PAGE_SIZE); struct hv_sgl_node *sgl_node = NULL; /* get struct sglist from free_sgl_list */ if (LIST_EMPTY(&g_hv_sgl_page_pool.free_sgl_list)) { printf("storvsc error: not enough free sgl\n"); return NULL; } sgl_node = LIST_FIRST(&g_hv_sgl_page_pool.free_sgl_list); LIST_REMOVE(sgl_node, link); bounce_sgl = sgl_node->sgl_data; LIST_INSERT_HEAD(&g_hv_sgl_page_pool.in_use_sgl_list, sgl_node, link); bounce_sgl->sg_maxseg = seg_count; if (write == WRITE_TYPE) bounce_sgl->sg_nseg = 0; else bounce_sgl->sg_nseg = seg_count; for (i = 0; i < seg_count; i++) bounce_sgl->sg_segs[i].ss_len = buf_len; return bounce_sgl; } /** * @brief copy data from SG list to bounce buffer * * This function is responsible for copy data from one SG list's segments * to another SG list which used as bounce buffer. * * @param bounce_sgl - the destination SG list * @param orig_sgl - the segment of the source SG list. * @param orig_sgl_count - the count of segments. * @param orig_sgl_count - indicate which segment need bounce buffer, * set 1 means need. * */ static void storvsc_copy_sgl_to_bounce_buf(struct sglist *bounce_sgl, bus_dma_segment_t *orig_sgl, unsigned int orig_sgl_count, uint64_t seg_bits) { int src_sgl_idx = 0; for (src_sgl_idx = 0; src_sgl_idx < orig_sgl_count; src_sgl_idx++) { if (seg_bits & (1 << src_sgl_idx)) { memcpy((void*)bounce_sgl->sg_segs[src_sgl_idx].ss_paddr, (void*)orig_sgl[src_sgl_idx].ds_addr, orig_sgl[src_sgl_idx].ds_len); bounce_sgl->sg_segs[src_sgl_idx].ss_len = orig_sgl[src_sgl_idx].ds_len; } } } /** * @brief copy data from SG list which used as bounce to another SG list * * This function is responsible for copy data from one SG list with bounce * buffer to another SG list's segments. * * @param dest_sgl - the destination SG list's segments * @param dest_sgl_count - the count of destination SG list's segment. * @param src_sgl - the source SG list. * @param seg_bits - indicate which segment used bounce buffer of src SG-list. * */ void storvsc_copy_from_bounce_buf_to_sgl(bus_dma_segment_t *dest_sgl, unsigned int dest_sgl_count, struct sglist* src_sgl, uint64_t seg_bits) { int sgl_idx = 0; for (sgl_idx = 0; sgl_idx < dest_sgl_count; sgl_idx++) { if (seg_bits & (1 << sgl_idx)) { memcpy((void*)(dest_sgl[sgl_idx].ds_addr), (void*)(src_sgl->sg_segs[sgl_idx].ss_paddr), src_sgl->sg_segs[sgl_idx].ss_len); } } } /** * @brief check SG list with bounce buffer or not * * This function is responsible for check if need bounce buffer for SG list. * * @param sgl - the SG list's segments * @param sg_count - the count of SG list's segment. * @param bits - segmengs number that need bounce buffer * * return -1 if SG list needless bounce buffer */ static int storvsc_check_bounce_buffer_sgl(bus_dma_segment_t *sgl, unsigned int sg_count, uint64_t *bits) { int i = 0; int offset = 0; uint64_t phys_addr = 0; uint64_t tmp_bits = 0; boolean_t found_hole = FALSE; boolean_t pre_aligned = TRUE; if (sg_count < 2){ return -1; } *bits = 0; phys_addr = vtophys(sgl[0].ds_addr); offset = phys_addr - trunc_page(phys_addr); if (offset != 0) { pre_aligned = FALSE; tmp_bits |= 1; } for (i = 1; i < sg_count; i++) { phys_addr = vtophys(sgl[i].ds_addr); offset = phys_addr - trunc_page(phys_addr); if (offset == 0) { if (FALSE == pre_aligned){ /* * This segment is aligned, if the previous * one is not aligned, find a hole */ found_hole = TRUE; } pre_aligned = TRUE; } else { tmp_bits |= 1 << i; if (!pre_aligned) { if (phys_addr != vtophys(sgl[i-1].ds_addr + sgl[i-1].ds_len)) { /* * Check whether connect to previous * segment,if not, find the hole */ found_hole = TRUE; } } else { found_hole = TRUE; } pre_aligned = FALSE; } } if (!found_hole) { return (-1); } else { *bits = tmp_bits; return 0; } } /** * @brief Fill in a request structure based on a CAM control block * * Fills in a request structure based on the contents of a CAM control * block. The request structure holds the payload information for * VSCSI protocol request. * * @param ccb pointer to a CAM contorl block * @param reqp pointer to a request structure */ static int create_storvsc_request(union ccb *ccb, struct hv_storvsc_request *reqp) { struct ccb_scsiio *csio = &ccb->csio; uint64_t phys_addr; uint32_t bytes_to_copy = 0; uint32_t pfn_num = 0; uint32_t pfn; uint64_t not_aligned_seg_bits = 0; struct hvs_gpa_range *prplist; /* refer to struct vmscsi_req for meanings of these two fields */ reqp->vstor_packet.u.vm_srb.port = cam_sim_unit(xpt_path_sim(ccb->ccb_h.path)); reqp->vstor_packet.u.vm_srb.path_id = cam_sim_bus(xpt_path_sim(ccb->ccb_h.path)); reqp->vstor_packet.u.vm_srb.target_id = ccb->ccb_h.target_id; reqp->vstor_packet.u.vm_srb.lun = ccb->ccb_h.target_lun; reqp->vstor_packet.u.vm_srb.cdb_len = csio->cdb_len; if(ccb->ccb_h.flags & CAM_CDB_POINTER) { memcpy(&reqp->vstor_packet.u.vm_srb.u.cdb, csio->cdb_io.cdb_ptr, csio->cdb_len); } else { memcpy(&reqp->vstor_packet.u.vm_srb.u.cdb, csio->cdb_io.cdb_bytes, csio->cdb_len); } switch (ccb->ccb_h.flags & CAM_DIR_MASK) { case CAM_DIR_OUT: reqp->vstor_packet.u.vm_srb.data_in = WRITE_TYPE; break; case CAM_DIR_IN: reqp->vstor_packet.u.vm_srb.data_in = READ_TYPE; break; case CAM_DIR_NONE: reqp->vstor_packet.u.vm_srb.data_in = UNKNOWN_TYPE; break; default: reqp->vstor_packet.u.vm_srb.data_in = UNKNOWN_TYPE; break; } reqp->sense_data = &csio->sense_data; reqp->sense_info_len = csio->sense_len; reqp->ccb = ccb; if (0 == csio->dxfer_len) { return (0); } prplist = &reqp->prp_list; prplist->gpa_range.gpa_len = csio->dxfer_len; switch (ccb->ccb_h.flags & CAM_DATA_MASK) { case CAM_DATA_VADDR: { bytes_to_copy = csio->dxfer_len; phys_addr = vtophys(csio->data_ptr); prplist->gpa_range.gpa_ofs = phys_addr & PAGE_MASK; while (bytes_to_copy != 0) { int bytes, page_offset; phys_addr = vtophys(&csio->data_ptr[prplist->gpa_range.gpa_len - bytes_to_copy]); pfn = phys_addr >> PAGE_SHIFT; prplist->gpa_page[pfn_num] = pfn; page_offset = phys_addr & PAGE_MASK; bytes = min(PAGE_SIZE - page_offset, bytes_to_copy); bytes_to_copy -= bytes; pfn_num++; } reqp->prp_cnt = pfn_num; break; } case CAM_DATA_SG: { int i = 0; int offset = 0; int ret; bus_dma_segment_t *storvsc_sglist = (bus_dma_segment_t *)ccb->csio.data_ptr; u_int16_t storvsc_sg_count = ccb->csio.sglist_cnt; printf("Storvsc: get SG I/O operation, %d\n", reqp->vstor_packet.u.vm_srb.data_in); if (storvsc_sg_count > VMBUS_CHAN_PRPLIST_MAX){ printf("Storvsc: %d segments is too much, " "only support %d segments\n", storvsc_sg_count, VMBUS_CHAN_PRPLIST_MAX); return (EINVAL); } /* * We create our own bounce buffer function currently. Idealy * we should use BUS_DMA(9) framework. But with current BUS_DMA * code there is no callback API to check the page alignment of * middle segments before busdma can decide if a bounce buffer * is needed for particular segment. There is callback, * "bus_dma_filter_t *filter", but the parrameters are not * sufficient for storvsc driver. * TODO: * Add page alignment check in BUS_DMA(9) callback. Once * this is complete, switch the following code to use * BUS_DMA(9) for storvsc bounce buffer support. */ /* check if we need to create bounce buffer */ ret = storvsc_check_bounce_buffer_sgl(storvsc_sglist, storvsc_sg_count, ¬_aligned_seg_bits); if (ret != -1) { reqp->bounce_sgl = storvsc_create_bounce_buffer(storvsc_sg_count, reqp->vstor_packet.u.vm_srb.data_in); if (NULL == reqp->bounce_sgl) { printf("Storvsc_error: " "create bounce buffer failed.\n"); return (ENOMEM); } reqp->bounce_sgl_count = storvsc_sg_count; reqp->not_aligned_seg_bits = not_aligned_seg_bits; /* * if it is write, we need copy the original data *to bounce buffer */ if (WRITE_TYPE == reqp->vstor_packet.u.vm_srb.data_in) { storvsc_copy_sgl_to_bounce_buf( reqp->bounce_sgl, storvsc_sglist, storvsc_sg_count, reqp->not_aligned_seg_bits); } /* transfer virtual address to physical frame number */ if (reqp->not_aligned_seg_bits & 0x1){ phys_addr = vtophys(reqp->bounce_sgl->sg_segs[0].ss_paddr); }else{ phys_addr = vtophys(storvsc_sglist[0].ds_addr); } prplist->gpa_range.gpa_ofs = phys_addr & PAGE_MASK; pfn = phys_addr >> PAGE_SHIFT; prplist->gpa_page[0] = pfn; for (i = 1; i < storvsc_sg_count; i++) { if (reqp->not_aligned_seg_bits & (1 << i)) { phys_addr = vtophys(reqp->bounce_sgl->sg_segs[i].ss_paddr); } else { phys_addr = vtophys(storvsc_sglist[i].ds_addr); } pfn = phys_addr >> PAGE_SHIFT; prplist->gpa_page[i] = pfn; } reqp->prp_cnt = i; } else { phys_addr = vtophys(storvsc_sglist[0].ds_addr); prplist->gpa_range.gpa_ofs = phys_addr & PAGE_MASK; for (i = 0; i < storvsc_sg_count; i++) { phys_addr = vtophys(storvsc_sglist[i].ds_addr); pfn = phys_addr >> PAGE_SHIFT; prplist->gpa_page[i] = pfn; } reqp->prp_cnt = i; /* check the last segment cross boundary or not */ offset = phys_addr & PAGE_MASK; if (offset) { /* Add one more PRP entry */ phys_addr = vtophys(storvsc_sglist[i-1].ds_addr + PAGE_SIZE - offset); pfn = phys_addr >> PAGE_SHIFT; prplist->gpa_page[i] = pfn; reqp->prp_cnt++; } reqp->bounce_sgl_count = 0; } break; } default: printf("Unknow flags: %d\n", ccb->ccb_h.flags); return(EINVAL); } return(0); } /* * SCSI Inquiry checks qualifier and type. * If qualifier is 011b, means the device server is not capable * of supporting a peripheral device on this logical unit, and * the type should be set to 1Fh. * * Return 1 if it is valid, 0 otherwise. */ static inline int is_inquiry_valid(const struct scsi_inquiry_data *inq_data) { uint8_t type; if (SID_QUAL(inq_data) != SID_QUAL_LU_CONNECTED) { return (0); } type = SID_TYPE(inq_data); if (type == T_NODEVICE) { return (0); } return (1); } /** * @brief completion function before returning to CAM * * I/O process has been completed and the result needs * to be passed to the CAM layer. * Free resources related to this request. * * @param reqp pointer to a request structure */ static void storvsc_io_done(struct hv_storvsc_request *reqp) { union ccb *ccb = reqp->ccb; struct ccb_scsiio *csio = &ccb->csio; struct storvsc_softc *sc = reqp->softc; struct vmscsi_req *vm_srb = &reqp->vstor_packet.u.vm_srb; bus_dma_segment_t *ori_sglist = NULL; int ori_sg_count = 0; /* destroy bounce buffer if it is used */ if (reqp->bounce_sgl_count) { ori_sglist = (bus_dma_segment_t *)ccb->csio.data_ptr; ori_sg_count = ccb->csio.sglist_cnt; /* * If it is READ operation, we should copy back the data * to original SG list. */ if (READ_TYPE == reqp->vstor_packet.u.vm_srb.data_in) { storvsc_copy_from_bounce_buf_to_sgl(ori_sglist, ori_sg_count, reqp->bounce_sgl, reqp->not_aligned_seg_bits); } storvsc_destroy_bounce_buffer(reqp->bounce_sgl); reqp->bounce_sgl_count = 0; } if (reqp->retries > 0) { mtx_lock(&sc->hs_lock); #if HVS_TIMEOUT_TEST xpt_print(ccb->ccb_h.path, "%u: IO returned after timeout, " "waking up timer handler if any.\n", ticks); mtx_lock(&reqp->event.mtx); cv_signal(&reqp->event.cv); mtx_unlock(&reqp->event.mtx); #endif reqp->retries = 0; xpt_print(ccb->ccb_h.path, "%u: IO returned after timeout, " "stopping timer if any.\n", ticks); mtx_unlock(&sc->hs_lock); } #ifdef notyet /* * callout_drain() will wait for the timer handler to finish * if it is running. So we don't need any lock to synchronize * between this routine and the timer handler. * Note that we need to make sure reqp is not freed when timer * handler is using or will use it. */ if (ccb->ccb_h.timeout != CAM_TIME_INFINITY) { callout_drain(&reqp->callout); } #endif ccb->ccb_h.status &= ~CAM_SIM_QUEUED; ccb->ccb_h.status &= ~CAM_STATUS_MASK; if (vm_srb->scsi_status == SCSI_STATUS_OK) { const struct scsi_generic *cmd; /* * Check whether the data for INQUIRY cmd is valid or * not. Windows 10 and Windows 2016 send all zero * inquiry data to VM even for unpopulated slots. */ cmd = (const struct scsi_generic *) ((ccb->ccb_h.flags & CAM_CDB_POINTER) ? csio->cdb_io.cdb_ptr : csio->cdb_io.cdb_bytes); if (cmd->opcode == INQUIRY) { /* * The host of Windows 10 or 2016 server will response * the inquiry request with invalid data for unexisted device: [0x7f 0x0 0x5 0x2 0x1f ... ] * But on windows 2012 R2, the response is: [0x7f 0x0 0x0 0x0 0x0 ] * That is why here wants to validate the inquiry response. * The validation will skip the INQUIRY whose response is short, * which is less than SHORT_INQUIRY_LENGTH (36). * * For more information about INQUIRY, please refer to: * ftp://ftp.avc-pioneer.com/Mtfuji_7/Proposal/Jun09/INQUIRY.pdf */ const struct scsi_inquiry_data *inq_data = (const struct scsi_inquiry_data *)csio->data_ptr; uint8_t* resp_buf = (uint8_t*)csio->data_ptr; /* Get the buffer length reported by host */ int resp_xfer_len = vm_srb->transfer_len; /* Get the available buffer length */ int resp_buf_len = resp_xfer_len >= 5 ? resp_buf[4] + 5 : 0; int data_len = (resp_buf_len < resp_xfer_len) ? resp_buf_len : resp_xfer_len; if (data_len < SHORT_INQUIRY_LENGTH) { ccb->ccb_h.status |= CAM_REQ_CMP; if (bootverbose && data_len >= 5) { mtx_lock(&sc->hs_lock); xpt_print(ccb->ccb_h.path, "storvsc skips the validation for short inquiry (%d)" " [%x %x %x %x %x]\n", data_len,resp_buf[0],resp_buf[1],resp_buf[2], resp_buf[3],resp_buf[4]); mtx_unlock(&sc->hs_lock); } } else if (is_inquiry_valid(inq_data) == 0) { ccb->ccb_h.status |= CAM_DEV_NOT_THERE; if (bootverbose && data_len >= 5) { mtx_lock(&sc->hs_lock); xpt_print(ccb->ccb_h.path, "storvsc uninstalled invalid device" " [%x %x %x %x %x]\n", resp_buf[0],resp_buf[1],resp_buf[2],resp_buf[3],resp_buf[4]); mtx_unlock(&sc->hs_lock); } } else { ccb->ccb_h.status |= CAM_REQ_CMP; if (bootverbose) { mtx_lock(&sc->hs_lock); xpt_print(ccb->ccb_h.path, "storvsc has passed inquiry response (%d) validation\n", data_len); mtx_unlock(&sc->hs_lock); } } } else { ccb->ccb_h.status |= CAM_REQ_CMP; } } else { mtx_lock(&sc->hs_lock); xpt_print(ccb->ccb_h.path, "storvsc scsi_status = %d\n", vm_srb->scsi_status); mtx_unlock(&sc->hs_lock); ccb->ccb_h.status |= CAM_SCSI_STATUS_ERROR; } ccb->csio.scsi_status = (vm_srb->scsi_status & 0xFF); ccb->csio.resid = ccb->csio.dxfer_len - vm_srb->transfer_len; if (reqp->sense_info_len != 0) { csio->sense_resid = csio->sense_len - reqp->sense_info_len; ccb->ccb_h.status |= CAM_AUTOSNS_VALID; } mtx_lock(&sc->hs_lock); if (reqp->softc->hs_frozen == 1) { xpt_print(ccb->ccb_h.path, "%u: storvsc unfreezing softc 0x%p.\n", ticks, reqp->softc); ccb->ccb_h.status |= CAM_RELEASE_SIMQ; reqp->softc->hs_frozen = 0; } storvsc_free_request(sc, reqp); mtx_unlock(&sc->hs_lock); xpt_done_direct(ccb); } /** * @brief Free a request structure * * Free a request structure by returning it to the free list * * @param sc pointer to a softc * @param reqp pointer to a request structure */ static void storvsc_free_request(struct storvsc_softc *sc, struct hv_storvsc_request *reqp) { LIST_INSERT_HEAD(&sc->hs_free_list, reqp, link); } /** * @brief Determine type of storage device from GUID * * Using the type GUID, determine if this is a StorVSC (paravirtual * SCSI or BlkVSC (paravirtual IDE) device. * * @param dev a device * returns an enum */ static enum hv_storage_type storvsc_get_storage_type(device_t dev) { device_t parent = device_get_parent(dev); if (VMBUS_PROBE_GUID(parent, dev, &gBlkVscDeviceType) == 0) return DRIVER_BLKVSC; if (VMBUS_PROBE_GUID(parent, dev, &gStorVscDeviceType) == 0) return DRIVER_STORVSC; return DRIVER_UNKNOWN; } Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_heartbeat.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_heartbeat.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_heartbeat.c (revision 303206) @@ -1,134 +1,133 @@ /*- * Copyright (c) 2014,2016 Microsoft Corp. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #include #include #include #include #include #include #include #include #include +#include #include "hv_util.h" #include "vmbus_if.h" /* Heartbeat Service */ static const struct hyperv_guid service_guid = { .hv_guid = {0x39, 0x4f, 0x16, 0x57, 0x15, 0x91, 0x78, 0x4e, 0xab, 0x55, 0x38, 0x2f, 0x3b, 0xd5, 0x42, 0x2d} }; /** * Process heartbeat message */ static void -hv_heartbeat_cb(void *context) +hv_heartbeat_cb(struct vmbus_channel *channel, void *context) { uint8_t* buf; - hv_vmbus_channel* channel; int recvlen; uint64_t requestid; int ret; struct hv_vmbus_heartbeat_msg_data* heartbeat_msg; struct hv_vmbus_icmsg_hdr* icmsghdrp; hv_util_sc *softc; softc = (hv_util_sc*)context; buf = softc->receive_buffer; - channel = softc->channel; recvlen = PAGE_SIZE; ret = vmbus_chan_recv(channel, buf, &recvlen, &requestid); KASSERT(ret != ENOBUFS, ("hvheartbeat recvbuf is not large enough")); /* XXX check recvlen to make sure that it contains enough data */ if ((ret == 0) && recvlen > 0) { icmsghdrp = (struct hv_vmbus_icmsg_hdr *) &buf[sizeof(struct hv_vmbus_pipe_hdr)]; if (icmsghdrp->icmsgtype == HV_ICMSGTYPE_NEGOTIATE) { hv_negotiate_version(icmsghdrp, NULL, buf); } else { heartbeat_msg = (struct hv_vmbus_heartbeat_msg_data *) &buf[sizeof(struct hv_vmbus_pipe_hdr) + sizeof(struct hv_vmbus_icmsg_hdr)]; heartbeat_msg->seq_num += 1; } icmsghdrp->icflags = HV_ICMSGHDRFLAG_TRANSACTION | HV_ICMSGHDRFLAG_RESPONSE; vmbus_chan_send(channel, VMBUS_CHANPKT_TYPE_INBAND, 0, buf, recvlen, requestid); } } static int hv_heartbeat_probe(device_t dev) { if (resource_disabled("hvheartbeat", 0)) return ENXIO; if (VMBUS_PROBE_GUID(device_get_parent(dev), dev, &service_guid) == 0) { device_set_desc(dev, "Hyper-V Heartbeat Service"); return BUS_PROBE_DEFAULT; } return ENXIO; } static int hv_heartbeat_attach(device_t dev) { hv_util_sc *softc = (hv_util_sc*)device_get_softc(dev); softc->callback = hv_heartbeat_cb; return hv_util_attach(dev); } static device_method_t heartbeat_methods[] = { /* Device interface */ DEVMETHOD(device_probe, hv_heartbeat_probe), DEVMETHOD(device_attach, hv_heartbeat_attach), DEVMETHOD(device_detach, hv_util_detach), { 0, 0 } }; static driver_t heartbeat_driver = { "hvheartbeat", heartbeat_methods, sizeof(hv_util_sc)}; static devclass_t heartbeat_devclass; DRIVER_MODULE(hv_heartbeat, vmbus, heartbeat_driver, heartbeat_devclass, NULL, NULL); MODULE_VERSION(hv_heartbeat, 1); MODULE_DEPEND(hv_heartbeat, vmbus, 1, 1, 1); Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_kvp.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_kvp.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_kvp.c (revision 303206) @@ -1,951 +1,955 @@ /*- * Copyright (c) 2014,2016 Microsoft Corp. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /* * Author: Sainath Varanasi. * Date: 4/2012 * Email: bsdic@microsoft.com */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #include "hv_util.h" #include "unicode.h" #include "hv_kvp.h" #include "vmbus_if.h" /* hv_kvp defines */ #define BUFFERSIZE sizeof(struct hv_kvp_msg) #define KVP_SUCCESS 0 #define KVP_ERROR 1 #define kvp_hdr hdr.kvp_hdr /* hv_kvp debug control */ static int hv_kvp_log = 0; #define hv_kvp_log_error(...) do { \ if (hv_kvp_log > 0) \ log(LOG_ERR, "hv_kvp: " __VA_ARGS__); \ } while (0) #define hv_kvp_log_info(...) do { \ if (hv_kvp_log > 1) \ log(LOG_INFO, "hv_kvp: " __VA_ARGS__); \ } while (0) static const struct hyperv_guid service_guid = { .hv_guid = {0xe7, 0xf4, 0xa0, 0xa9, 0x45, 0x5a, 0x96, 0x4d, 0xb8, 0x27, 0x8a, 0x84, 0x1e, 0x8c, 0x3, 0xe6} }; /* character device prototypes */ static d_open_t hv_kvp_dev_open; static d_close_t hv_kvp_dev_close; static d_read_t hv_kvp_dev_daemon_read; static d_write_t hv_kvp_dev_daemon_write; static d_poll_t hv_kvp_dev_daemon_poll; /* hv_kvp character device structure */ static struct cdevsw hv_kvp_cdevsw = { .d_version = D_VERSION, .d_open = hv_kvp_dev_open, .d_close = hv_kvp_dev_close, .d_read = hv_kvp_dev_daemon_read, .d_write = hv_kvp_dev_daemon_write, .d_poll = hv_kvp_dev_daemon_poll, .d_name = "hv_kvp_dev", }; /* * Global state to track and synchronize multiple * KVP transaction requests from the host. */ typedef struct hv_kvp_sc { struct hv_util_sc util_sc; + device_t dev; /* Unless specified the pending mutex should be * used to alter the values of the following parameters: * 1. req_in_progress * 2. req_timed_out */ struct mtx pending_mutex; struct task task; /* To track if transaction is active or not */ boolean_t req_in_progress; /* Tracks if daemon did not reply back in time */ boolean_t req_timed_out; /* Tracks if daemon is serving a request currently */ boolean_t daemon_busy; /* Length of host message */ uint32_t host_msg_len; /* Host message id */ uint64_t host_msg_id; /* Current kvp message from the host */ struct hv_kvp_msg *host_kvp_msg; /* Current kvp message for daemon */ struct hv_kvp_msg daemon_kvp_msg; /* Rcv buffer for communicating with the host*/ uint8_t *rcv_buf; /* Device semaphore to control communication */ struct sema dev_sema; /* Indicates if daemon registered with driver */ boolean_t register_done; /* Character device status */ boolean_t dev_accessed; struct cdev *hv_kvp_dev; struct proc *daemon_task; struct selinfo hv_kvp_selinfo; } hv_kvp_sc; /* hv_kvp prototypes */ static int hv_kvp_req_in_progress(hv_kvp_sc *sc); static void hv_kvp_transaction_init(hv_kvp_sc *sc, uint32_t, uint64_t, uint8_t *); static void hv_kvp_send_msg_to_daemon(hv_kvp_sc *sc); static void hv_kvp_process_request(void *context, int pending); /* * hv_kvp low level functions */ /* * Check if kvp transaction is in progres */ static int hv_kvp_req_in_progress(hv_kvp_sc *sc) { return (sc->req_in_progress); } /* * This routine is called whenever a message is received from the host */ static void hv_kvp_transaction_init(hv_kvp_sc *sc, uint32_t rcv_len, uint64_t request_id, uint8_t *rcv_buf) { /* Store all the relevant message details in the global structure */ /* Do not need to use mutex for req_in_progress here */ sc->req_in_progress = true; sc->host_msg_len = rcv_len; sc->host_msg_id = request_id; sc->rcv_buf = rcv_buf; sc->host_kvp_msg = (struct hv_kvp_msg *)&rcv_buf[ sizeof(struct hv_vmbus_pipe_hdr) + sizeof(struct hv_vmbus_icmsg_hdr)]; } /* * hv_kvp - version neogtiation function */ static void hv_kvp_negotiate_version(struct hv_vmbus_icmsg_hdr *icmsghdrp, struct hv_vmbus_icmsg_negotiate *negop, uint8_t *buf) { int icframe_vercnt; int icmsg_vercnt; icmsghdrp->icmsgsize = 0x10; negop = (struct hv_vmbus_icmsg_negotiate *)&buf[ sizeof(struct hv_vmbus_pipe_hdr) + sizeof(struct hv_vmbus_icmsg_hdr)]; icframe_vercnt = negop->icframe_vercnt; icmsg_vercnt = negop->icmsg_vercnt; /* * Select the framework version number we will support */ if ((icframe_vercnt >= 2) && (negop->icversion_data[1].major == 3)) { icframe_vercnt = 3; if (icmsg_vercnt > 2) icmsg_vercnt = 4; else icmsg_vercnt = 3; } else { icframe_vercnt = 1; icmsg_vercnt = 1; } negop->icframe_vercnt = 1; negop->icmsg_vercnt = 1; negop->icversion_data[0].major = icframe_vercnt; negop->icversion_data[0].minor = 0; negop->icversion_data[1].major = icmsg_vercnt; negop->icversion_data[1].minor = 0; } /* * Convert ip related info in umsg from utf8 to utf16 and store in hmsg */ static int hv_kvp_convert_utf8_ipinfo_to_utf16(struct hv_kvp_msg *umsg, struct hv_kvp_ip_msg *host_ip_msg) { int err_ip, err_subnet, err_gway, err_dns, err_adap; int UNUSED_FLAG = 1; utf8_to_utf16((uint16_t *)host_ip_msg->kvp_ip_val.ip_addr, MAX_IP_ADDR_SIZE, (char *)umsg->body.kvp_ip_val.ip_addr, strlen((char *)umsg->body.kvp_ip_val.ip_addr), UNUSED_FLAG, &err_ip); utf8_to_utf16((uint16_t *)host_ip_msg->kvp_ip_val.sub_net, MAX_IP_ADDR_SIZE, (char *)umsg->body.kvp_ip_val.sub_net, strlen((char *)umsg->body.kvp_ip_val.sub_net), UNUSED_FLAG, &err_subnet); utf8_to_utf16((uint16_t *)host_ip_msg->kvp_ip_val.gate_way, MAX_GATEWAY_SIZE, (char *)umsg->body.kvp_ip_val.gate_way, strlen((char *)umsg->body.kvp_ip_val.gate_way), UNUSED_FLAG, &err_gway); utf8_to_utf16((uint16_t *)host_ip_msg->kvp_ip_val.dns_addr, MAX_IP_ADDR_SIZE, (char *)umsg->body.kvp_ip_val.dns_addr, strlen((char *)umsg->body.kvp_ip_val.dns_addr), UNUSED_FLAG, &err_dns); utf8_to_utf16((uint16_t *)host_ip_msg->kvp_ip_val.adapter_id, MAX_IP_ADDR_SIZE, (char *)umsg->body.kvp_ip_val.adapter_id, strlen((char *)umsg->body.kvp_ip_val.adapter_id), UNUSED_FLAG, &err_adap); host_ip_msg->kvp_ip_val.dhcp_enabled = umsg->body.kvp_ip_val.dhcp_enabled; host_ip_msg->kvp_ip_val.addr_family = umsg->body.kvp_ip_val.addr_family; return (err_ip | err_subnet | err_gway | err_dns | err_adap); } /* * Convert ip related info in hmsg from utf16 to utf8 and store in umsg */ static int hv_kvp_convert_utf16_ipinfo_to_utf8(struct hv_kvp_ip_msg *host_ip_msg, struct hv_kvp_msg *umsg) { int err_ip, err_subnet, err_gway, err_dns, err_adap; int UNUSED_FLAG = 1; device_t *devs; int devcnt; /* IP Address */ utf16_to_utf8((char *)umsg->body.kvp_ip_val.ip_addr, MAX_IP_ADDR_SIZE, (uint16_t *)host_ip_msg->kvp_ip_val.ip_addr, MAX_IP_ADDR_SIZE, UNUSED_FLAG, &err_ip); /* Adapter ID : GUID */ utf16_to_utf8((char *)umsg->body.kvp_ip_val.adapter_id, MAX_ADAPTER_ID_SIZE, (uint16_t *)host_ip_msg->kvp_ip_val.adapter_id, MAX_ADAPTER_ID_SIZE, UNUSED_FLAG, &err_adap); if (devclass_get_devices(devclass_find("hn"), &devs, &devcnt) == 0) { for (devcnt = devcnt - 1; devcnt >= 0; devcnt--) { /* XXX access other driver's softc? are you kidding? */ device_t dev = devs[devcnt]; struct hn_softc *sc = device_get_softc(dev); - struct hv_vmbus_channel *chan; + struct vmbus_channel *chan; char buf[HYPERV_GUID_STRLEN]; /* * Trying to find GUID of Network Device * TODO: need vmbus interface. */ chan = vmbus_get_channel(dev); - hyperv_guid2str(&chan->ch_guid_inst, buf, sizeof(buf)); + hyperv_guid2str(vmbus_chan_guid_inst(chan), + buf, sizeof(buf)); if (strncmp(buf, (char *)umsg->body.kvp_ip_val.adapter_id, HYPERV_GUID_STRLEN - 1) == 0) { strlcpy((char *)umsg->body.kvp_ip_val.adapter_id, sc->hn_ifp->if_xname, MAX_ADAPTER_ID_SIZE); break; } } free(devs, M_TEMP); } /* Address Family , DHCP , SUBNET, Gateway, DNS */ umsg->kvp_hdr.operation = host_ip_msg->operation; umsg->body.kvp_ip_val.addr_family = host_ip_msg->kvp_ip_val.addr_family; umsg->body.kvp_ip_val.dhcp_enabled = host_ip_msg->kvp_ip_val.dhcp_enabled; utf16_to_utf8((char *)umsg->body.kvp_ip_val.sub_net, MAX_IP_ADDR_SIZE, (uint16_t *)host_ip_msg->kvp_ip_val.sub_net, MAX_IP_ADDR_SIZE, UNUSED_FLAG, &err_subnet); utf16_to_utf8((char *)umsg->body.kvp_ip_val.gate_way, MAX_GATEWAY_SIZE, (uint16_t *)host_ip_msg->kvp_ip_val.gate_way, MAX_GATEWAY_SIZE, UNUSED_FLAG, &err_gway); utf16_to_utf8((char *)umsg->body.kvp_ip_val.dns_addr, MAX_IP_ADDR_SIZE, (uint16_t *)host_ip_msg->kvp_ip_val.dns_addr, MAX_IP_ADDR_SIZE, UNUSED_FLAG, &err_dns); return (err_ip | err_subnet | err_gway | err_dns | err_adap); } /* * Prepare a user kvp msg based on host kvp msg (utf16 to utf8) * Ensure utf16_utf8 takes care of the additional string terminating char!! */ static void hv_kvp_convert_hostmsg_to_usermsg(struct hv_kvp_msg *hmsg, struct hv_kvp_msg *umsg) { int utf_err = 0; uint32_t value_type; struct hv_kvp_ip_msg *host_ip_msg; host_ip_msg = (struct hv_kvp_ip_msg*)hmsg; memset(umsg, 0, sizeof(struct hv_kvp_msg)); umsg->kvp_hdr.operation = hmsg->kvp_hdr.operation; umsg->kvp_hdr.pool = hmsg->kvp_hdr.pool; switch (umsg->kvp_hdr.operation) { case HV_KVP_OP_SET_IP_INFO: hv_kvp_convert_utf16_ipinfo_to_utf8(host_ip_msg, umsg); break; case HV_KVP_OP_GET_IP_INFO: utf16_to_utf8((char *)umsg->body.kvp_ip_val.adapter_id, MAX_ADAPTER_ID_SIZE, (uint16_t *)host_ip_msg->kvp_ip_val.adapter_id, MAX_ADAPTER_ID_SIZE, 1, &utf_err); umsg->body.kvp_ip_val.addr_family = host_ip_msg->kvp_ip_val.addr_family; break; case HV_KVP_OP_SET: value_type = hmsg->body.kvp_set.data.value_type; switch (value_type) { case HV_REG_SZ: umsg->body.kvp_set.data.value_size = utf16_to_utf8( (char *)umsg->body.kvp_set.data.msg_value.value, HV_KVP_EXCHANGE_MAX_VALUE_SIZE - 1, (uint16_t *)hmsg->body.kvp_set.data.msg_value.value, hmsg->body.kvp_set.data.value_size, 1, &utf_err); /* utf8 encoding */ umsg->body.kvp_set.data.value_size = umsg->body.kvp_set.data.value_size / 2; break; case HV_REG_U32: umsg->body.kvp_set.data.value_size = sprintf(umsg->body.kvp_set.data.msg_value.value, "%d", hmsg->body.kvp_set.data.msg_value.value_u32) + 1; break; case HV_REG_U64: umsg->body.kvp_set.data.value_size = sprintf(umsg->body.kvp_set.data.msg_value.value, "%llu", (unsigned long long) hmsg->body.kvp_set.data.msg_value.value_u64) + 1; break; } umsg->body.kvp_set.data.key_size = utf16_to_utf8( umsg->body.kvp_set.data.key, HV_KVP_EXCHANGE_MAX_KEY_SIZE - 1, (uint16_t *)hmsg->body.kvp_set.data.key, hmsg->body.kvp_set.data.key_size, 1, &utf_err); /* utf8 encoding */ umsg->body.kvp_set.data.key_size = umsg->body.kvp_set.data.key_size / 2; break; case HV_KVP_OP_GET: umsg->body.kvp_get.data.key_size = utf16_to_utf8(umsg->body.kvp_get.data.key, HV_KVP_EXCHANGE_MAX_KEY_SIZE - 1, (uint16_t *)hmsg->body.kvp_get.data.key, hmsg->body.kvp_get.data.key_size, 1, &utf_err); /* utf8 encoding */ umsg->body.kvp_get.data.key_size = umsg->body.kvp_get.data.key_size / 2; break; case HV_KVP_OP_DELETE: umsg->body.kvp_delete.key_size = utf16_to_utf8(umsg->body.kvp_delete.key, HV_KVP_EXCHANGE_MAX_KEY_SIZE - 1, (uint16_t *)hmsg->body.kvp_delete.key, hmsg->body.kvp_delete.key_size, 1, &utf_err); /* utf8 encoding */ umsg->body.kvp_delete.key_size = umsg->body.kvp_delete.key_size / 2; break; case HV_KVP_OP_ENUMERATE: umsg->body.kvp_enum_data.index = hmsg->body.kvp_enum_data.index; break; default: hv_kvp_log_info("%s: daemon_kvp_msg: Invalid operation : %d\n", __func__, umsg->kvp_hdr.operation); } } /* * Prepare a host kvp msg based on user kvp msg (utf8 to utf16) */ static int hv_kvp_convert_usermsg_to_hostmsg(struct hv_kvp_msg *umsg, struct hv_kvp_msg *hmsg) { int hkey_len = 0, hvalue_len = 0, utf_err = 0; struct hv_kvp_exchg_msg_value *host_exchg_data; char *key_name, *value; struct hv_kvp_ip_msg *host_ip_msg = (struct hv_kvp_ip_msg *)hmsg; switch (hmsg->kvp_hdr.operation) { case HV_KVP_OP_GET_IP_INFO: return (hv_kvp_convert_utf8_ipinfo_to_utf16(umsg, host_ip_msg)); case HV_KVP_OP_SET_IP_INFO: case HV_KVP_OP_SET: case HV_KVP_OP_DELETE: return (KVP_SUCCESS); case HV_KVP_OP_ENUMERATE: host_exchg_data = &hmsg->body.kvp_enum_data.data; key_name = umsg->body.kvp_enum_data.data.key; hkey_len = utf8_to_utf16((uint16_t *)host_exchg_data->key, ((HV_KVP_EXCHANGE_MAX_KEY_SIZE / 2) - 2), key_name, strlen(key_name), 1, &utf_err); /* utf16 encoding */ host_exchg_data->key_size = 2 * (hkey_len + 1); value = umsg->body.kvp_enum_data.data.msg_value.value; hvalue_len = utf8_to_utf16( (uint16_t *)host_exchg_data->msg_value.value, ((HV_KVP_EXCHANGE_MAX_VALUE_SIZE / 2) - 2), value, strlen(value), 1, &utf_err); host_exchg_data->value_size = 2 * (hvalue_len + 1); host_exchg_data->value_type = HV_REG_SZ; if ((hkey_len < 0) || (hvalue_len < 0)) return (HV_KVP_E_FAIL); return (KVP_SUCCESS); case HV_KVP_OP_GET: host_exchg_data = &hmsg->body.kvp_get.data; value = umsg->body.kvp_get.data.msg_value.value; hvalue_len = utf8_to_utf16( (uint16_t *)host_exchg_data->msg_value.value, ((HV_KVP_EXCHANGE_MAX_VALUE_SIZE / 2) - 2), value, strlen(value), 1, &utf_err); /* Convert value size to uft16 */ host_exchg_data->value_size = 2 * (hvalue_len + 1); /* Use values by string */ host_exchg_data->value_type = HV_REG_SZ; if ((hkey_len < 0) || (hvalue_len < 0)) return (HV_KVP_E_FAIL); return (KVP_SUCCESS); default: return (HV_KVP_E_FAIL); } } /* * Send the response back to the host. */ static void hv_kvp_respond_host(hv_kvp_sc *sc, int error) { struct hv_vmbus_icmsg_hdr *hv_icmsg_hdrp; hv_icmsg_hdrp = (struct hv_vmbus_icmsg_hdr *) &sc->rcv_buf[sizeof(struct hv_vmbus_pipe_hdr)]; if (error) error = HV_KVP_E_FAIL; hv_icmsg_hdrp->status = error; hv_icmsg_hdrp->icflags = HV_ICMSGHDRFLAG_TRANSACTION | HV_ICMSGHDRFLAG_RESPONSE; - error = vmbus_chan_send(sc->util_sc.channel, + error = vmbus_chan_send(vmbus_get_channel(sc->dev), VMBUS_CHANPKT_TYPE_INBAND, 0, sc->rcv_buf, sc->host_msg_len, sc->host_msg_id); if (error) hv_kvp_log_info("%s: hv_kvp_respond_host: sendpacket error:%d\n", __func__, error); } /* * This is the main kvp kernel process that interacts with both user daemon * and the host */ static void hv_kvp_send_msg_to_daemon(hv_kvp_sc *sc) { struct hv_kvp_msg *hmsg = sc->host_kvp_msg; struct hv_kvp_msg *umsg = &sc->daemon_kvp_msg; /* Prepare kvp_msg to be sent to user */ hv_kvp_convert_hostmsg_to_usermsg(hmsg, umsg); /* Send the msg to user via function deamon_read - setting sema */ sema_post(&sc->dev_sema); /* We should wake up the daemon, in case it's doing poll() */ selwakeup(&sc->hv_kvp_selinfo); } /* * Function to read the kvp request buffer from host * and interact with daemon */ static void hv_kvp_process_request(void *context, int pending) { uint8_t *kvp_buf; - hv_vmbus_channel *channel; + struct vmbus_channel *channel; uint32_t recvlen = 0; uint64_t requestid; struct hv_vmbus_icmsg_hdr *icmsghdrp; int ret = 0; hv_kvp_sc *sc; hv_kvp_log_info("%s: entering hv_kvp_process_request\n", __func__); sc = (hv_kvp_sc*)context; kvp_buf = sc->util_sc.receive_buffer; - channel = sc->util_sc.channel; + channel = vmbus_get_channel(sc->dev); recvlen = 2 * PAGE_SIZE; ret = vmbus_chan_recv(channel, kvp_buf, &recvlen, &requestid); KASSERT(ret != ENOBUFS, ("hvkvp recvbuf is not large enough")); /* XXX check recvlen to make sure that it contains enough data */ while ((ret == 0) && (recvlen > 0)) { icmsghdrp = (struct hv_vmbus_icmsg_hdr *) &kvp_buf[sizeof(struct hv_vmbus_pipe_hdr)]; hv_kvp_transaction_init(sc, recvlen, requestid, kvp_buf); if (icmsghdrp->icmsgtype == HV_ICMSGTYPE_NEGOTIATE) { hv_kvp_negotiate_version(icmsghdrp, NULL, kvp_buf); hv_kvp_respond_host(sc, ret); /* * It is ok to not acquire the mutex before setting * req_in_progress here because negotiation is the * first thing that happens and hence there is no * chance of a race condition. */ sc->req_in_progress = false; hv_kvp_log_info("%s :version negotiated\n", __func__); } else { if (!sc->daemon_busy) { hv_kvp_log_info("%s: issuing qury to daemon\n", __func__); mtx_lock(&sc->pending_mutex); sc->req_timed_out = false; sc->daemon_busy = true; mtx_unlock(&sc->pending_mutex); hv_kvp_send_msg_to_daemon(sc); hv_kvp_log_info("%s: waiting for daemon\n", __func__); } /* Wait 5 seconds for daemon to respond back */ tsleep(sc, 0, "kvpworkitem", 5 * hz); hv_kvp_log_info("%s: came out of wait\n", __func__); } mtx_lock(&sc->pending_mutex); /* Notice that once req_timed_out is set to true * it will remain true until the next request is * sent to the daemon. The response from daemon * is forwarded to host only when this flag is * false. */ sc->req_timed_out = true; /* * Cancel request if so need be. */ if (hv_kvp_req_in_progress(sc)) { hv_kvp_log_info("%s: request was still active after wait so failing\n", __func__); hv_kvp_respond_host(sc, HV_KVP_E_FAIL); sc->req_in_progress = false; } mtx_unlock(&sc->pending_mutex); /* * Try reading next buffer */ recvlen = 2 * PAGE_SIZE; ret = vmbus_chan_recv(channel, kvp_buf, &recvlen, &requestid); KASSERT(ret != ENOBUFS, ("hvkvp recvbuf is not large enough")); /* XXX check recvlen to make sure that it contains enough data */ hv_kvp_log_info("%s: read: context %p, ret =%d, recvlen=%d\n", __func__, context, ret, recvlen); } } /* * Callback routine that gets called whenever there is a message from host */ static void -hv_kvp_callback(void *context) +hv_kvp_callback(struct vmbus_channel *chan __unused, void *context) { hv_kvp_sc *sc = (hv_kvp_sc*)context; /* The first request from host will not be handled until daemon is registered. when callback is triggered without a registered daemon, callback just return. When a new daemon gets regsitered, this callbcak is trigged from _write op. */ if (sc->register_done) { hv_kvp_log_info("%s: Queuing work item\n", __func__); taskqueue_enqueue(taskqueue_thread, &sc->task); } } static int hv_kvp_dev_open(struct cdev *dev, int oflags, int devtype, struct thread *td) { hv_kvp_sc *sc = (hv_kvp_sc*)dev->si_drv1; hv_kvp_log_info("%s: Opened device \"hv_kvp_device\" successfully.\n", __func__); if (sc->dev_accessed) return (-EBUSY); sc->daemon_task = curproc; sc->dev_accessed = true; sc->daemon_busy = false; return (0); } static int hv_kvp_dev_close(struct cdev *dev __unused, int fflag __unused, int devtype __unused, struct thread *td __unused) { hv_kvp_sc *sc = (hv_kvp_sc*)dev->si_drv1; hv_kvp_log_info("%s: Closing device \"hv_kvp_device\".\n", __func__); sc->dev_accessed = false; sc->register_done = false; return (0); } /* * hv_kvp_daemon read invokes this function * acts as a send to daemon */ static int hv_kvp_dev_daemon_read(struct cdev *dev, struct uio *uio, int ioflag __unused) { size_t amt; int error = 0; struct hv_kvp_msg *hv_kvp_dev_buf; hv_kvp_sc *sc = (hv_kvp_sc*)dev->si_drv1; /* Check hv_kvp daemon registration status*/ if (!sc->register_done) return (KVP_ERROR); sema_wait(&sc->dev_sema); hv_kvp_dev_buf = malloc(sizeof(*hv_kvp_dev_buf), M_TEMP, M_WAITOK); memcpy(hv_kvp_dev_buf, &sc->daemon_kvp_msg, sizeof(struct hv_kvp_msg)); amt = MIN(uio->uio_resid, uio->uio_offset >= BUFFERSIZE + 1 ? 0 : BUFFERSIZE + 1 - uio->uio_offset); if ((error = uiomove(hv_kvp_dev_buf, amt, uio)) != 0) hv_kvp_log_info("%s: hv_kvp uiomove read failed!\n", __func__); free(hv_kvp_dev_buf, M_TEMP); return (error); } /* * hv_kvp_daemon write invokes this function * acts as a receive from daemon */ static int hv_kvp_dev_daemon_write(struct cdev *dev, struct uio *uio, int ioflag __unused) { size_t amt; int error = 0; struct hv_kvp_msg *hv_kvp_dev_buf; hv_kvp_sc *sc = (hv_kvp_sc*)dev->si_drv1; uio->uio_offset = 0; hv_kvp_dev_buf = malloc(sizeof(*hv_kvp_dev_buf), M_TEMP, M_WAITOK); amt = MIN(uio->uio_resid, BUFFERSIZE); error = uiomove(hv_kvp_dev_buf, amt, uio); if (error != 0) { free(hv_kvp_dev_buf, M_TEMP); return (error); } memcpy(&sc->daemon_kvp_msg, hv_kvp_dev_buf, sizeof(struct hv_kvp_msg)); free(hv_kvp_dev_buf, M_TEMP); if (sc->register_done == false) { if (sc->daemon_kvp_msg.kvp_hdr.operation == HV_KVP_OP_REGISTER) { sc->register_done = true; - hv_kvp_callback(dev->si_drv1); + hv_kvp_callback(vmbus_get_channel(sc->dev), dev->si_drv1); } else { hv_kvp_log_info("%s, KVP Registration Failed\n", __func__); return (KVP_ERROR); } } else { mtx_lock(&sc->pending_mutex); if(!sc->req_timed_out) { struct hv_kvp_msg *hmsg = sc->host_kvp_msg; struct hv_kvp_msg *umsg = &sc->daemon_kvp_msg; hv_kvp_convert_usermsg_to_hostmsg(umsg, hmsg); hv_kvp_respond_host(sc, KVP_SUCCESS); wakeup(sc); sc->req_in_progress = false; } sc->daemon_busy = false; mtx_unlock(&sc->pending_mutex); } return (error); } /* * hv_kvp_daemon poll invokes this function to check if data is available * for daemon to read. */ static int hv_kvp_dev_daemon_poll(struct cdev *dev, int events, struct thread *td) { int revents = 0; hv_kvp_sc *sc = (hv_kvp_sc*)dev->si_drv1; mtx_lock(&sc->pending_mutex); /* * We check global flag daemon_busy for the data availiability for * userland to read. Deamon_busy is set to true before driver has data * for daemon to read. It is set to false after daemon sends * then response back to driver. */ if (sc->daemon_busy == true) revents = POLLIN; else selrecord(td, &sc->hv_kvp_selinfo); mtx_unlock(&sc->pending_mutex); return (revents); } static int hv_kvp_probe(device_t dev) { if (resource_disabled("hvkvp", 0)) return ENXIO; if (VMBUS_PROBE_GUID(device_get_parent(dev), dev, &service_guid) == 0) { device_set_desc(dev, "Hyper-V KVP Service"); return BUS_PROBE_DEFAULT; } return ENXIO; } static int hv_kvp_attach(device_t dev) { int error; struct sysctl_oid_list *child; struct sysctl_ctx_list *ctx; hv_kvp_sc *sc = (hv_kvp_sc*)device_get_softc(dev); sc->util_sc.callback = hv_kvp_callback; + sc->dev = dev; sema_init(&sc->dev_sema, 0, "hv_kvp device semaphore"); mtx_init(&sc->pending_mutex, "hv-kvp pending mutex", NULL, MTX_DEF); ctx = device_get_sysctl_ctx(dev); child = SYSCTL_CHILDREN(device_get_sysctl_tree(dev)); SYSCTL_ADD_INT(ctx, child, OID_AUTO, "hv_kvp_log", CTLFLAG_RW, &hv_kvp_log, 0, "Hyperv KVP service log level"); TASK_INIT(&sc->task, 0, hv_kvp_process_request, sc); /* create character device */ error = make_dev_p(MAKEDEV_CHECKNAME | MAKEDEV_WAITOK, &sc->hv_kvp_dev, &hv_kvp_cdevsw, 0, UID_ROOT, GID_WHEEL, 0640, "hv_kvp_dev"); if (error != 0) return (error); sc->hv_kvp_dev->si_drv1 = sc; return hv_util_attach(dev); } static int hv_kvp_detach(device_t dev) { hv_kvp_sc *sc = (hv_kvp_sc*)device_get_softc(dev); if (sc->daemon_task != NULL) { PROC_LOCK(sc->daemon_task); kern_psignal(sc->daemon_task, SIGKILL); PROC_UNLOCK(sc->daemon_task); } destroy_dev(sc->hv_kvp_dev); return hv_util_detach(dev); } static device_method_t kvp_methods[] = { /* Device interface */ DEVMETHOD(device_probe, hv_kvp_probe), DEVMETHOD(device_attach, hv_kvp_attach), DEVMETHOD(device_detach, hv_kvp_detach), { 0, 0 } }; static driver_t kvp_driver = { "hvkvp", kvp_methods, sizeof(hv_kvp_sc)}; static devclass_t kvp_devclass; DRIVER_MODULE(hv_kvp, vmbus, kvp_driver, kvp_devclass, NULL, NULL); MODULE_VERSION(hv_kvp, 1); MODULE_DEPEND(hv_kvp, vmbus, 1, 1, 1); Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_shutdown.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_shutdown.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_shutdown.c (revision 303206) @@ -1,156 +1,155 @@ /*- * Copyright (c) 2014,2016 Microsoft Corp. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ /* * A common driver for all hyper-V util services. */ #include #include #include #include #include #include #include #include #include #include +#include #include "hv_util.h" #include "vmbus_if.h" static const struct hyperv_guid service_guid = { .hv_guid = {0x31, 0x60, 0x0B, 0X0E, 0x13, 0x52, 0x34, 0x49, 0x81, 0x8B, 0x38, 0XD9, 0x0C, 0xED, 0x39, 0xDB} }; /** * Shutdown */ static void -hv_shutdown_cb(void *context) +hv_shutdown_cb(struct vmbus_channel *channel, void *context) { uint8_t* buf; - hv_vmbus_channel* channel; uint8_t execute_shutdown = 0; hv_vmbus_icmsg_hdr* icmsghdrp; uint32_t recv_len; uint64_t request_id; int ret; hv_vmbus_shutdown_msg_data* shutdown_msg; hv_util_sc *softc; softc = (hv_util_sc*)context; buf = softc->receive_buffer; - channel = softc->channel; recv_len = PAGE_SIZE; ret = vmbus_chan_recv(channel, buf, &recv_len, &request_id); KASSERT(ret != ENOBUFS, ("hvshutdown recvbuf is not large enough")); /* XXX check recv_len to make sure that it contains enough data */ if ((ret == 0) && recv_len > 0) { icmsghdrp = (struct hv_vmbus_icmsg_hdr *) &buf[sizeof(struct hv_vmbus_pipe_hdr)]; if (icmsghdrp->icmsgtype == HV_ICMSGTYPE_NEGOTIATE) { hv_negotiate_version(icmsghdrp, NULL, buf); } else { shutdown_msg = (struct hv_vmbus_shutdown_msg_data *) &buf[sizeof(struct hv_vmbus_pipe_hdr) + sizeof(struct hv_vmbus_icmsg_hdr)]; switch (shutdown_msg->flags) { case 0: case 1: icmsghdrp->status = HV_S_OK; execute_shutdown = 1; if(bootverbose) printf("Shutdown request received -" " graceful shutdown initiated\n"); break; default: icmsghdrp->status = HV_E_FAIL; execute_shutdown = 0; printf("Shutdown request received -" " Invalid request\n"); break; } } icmsghdrp->icflags = HV_ICMSGHDRFLAG_TRANSACTION | HV_ICMSGHDRFLAG_RESPONSE; vmbus_chan_send(channel, VMBUS_CHANPKT_TYPE_INBAND, 0, buf, recv_len, request_id); } if (execute_shutdown) shutdown_nice(RB_POWEROFF); } static int hv_shutdown_probe(device_t dev) { if (resource_disabled("hvshutdown", 0)) return ENXIO; if (VMBUS_PROBE_GUID(device_get_parent(dev), dev, &service_guid) == 0) { device_set_desc(dev, "Hyper-V Shutdown Service"); return BUS_PROBE_DEFAULT; } return ENXIO; } static int hv_shutdown_attach(device_t dev) { hv_util_sc *softc = (hv_util_sc*)device_get_softc(dev); softc->callback = hv_shutdown_cb; return hv_util_attach(dev); } static device_method_t shutdown_methods[] = { /* Device interface */ DEVMETHOD(device_probe, hv_shutdown_probe), DEVMETHOD(device_attach, hv_shutdown_attach), DEVMETHOD(device_detach, hv_util_detach), { 0, 0 } }; static driver_t shutdown_driver = { "hvshutdown", shutdown_methods, sizeof(hv_util_sc)}; static devclass_t shutdown_devclass; DRIVER_MODULE(hv_shutdown, vmbus, shutdown_driver, shutdown_devclass, NULL, NULL); MODULE_VERSION(hv_shutdown, 1); MODULE_DEPEND(hv_shutdown, vmbus, 1, 1, 1); Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_timesync.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_timesync.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_timesync.c (revision 303206) @@ -1,220 +1,219 @@ /*- * Copyright (c) 2014,2016 Microsoft Corp. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ /* * A common driver for all hyper-V util services. */ #include #include #include #include #include #include #include #include #include #include +#include #include "hv_util.h" #include "vmbus_if.h" #define HV_WLTIMEDELTA 116444736000000000L /* in 100ns unit */ #define HV_ICTIMESYNCFLAG_PROBE 0 #define HV_ICTIMESYNCFLAG_SYNC 1 #define HV_ICTIMESYNCFLAG_SAMPLE 2 #define HV_NANO_SEC_PER_SEC 1000000000 /* Time Sync data */ typedef struct { uint64_t data; } time_sync_data; /* Time Synch Service */ static const struct hyperv_guid service_guid = {.hv_guid = {0x30, 0xe6, 0x27, 0x95, 0xae, 0xd0, 0x7b, 0x49, 0xad, 0xce, 0xe8, 0x0a, 0xb0, 0x17, 0x5c, 0xaf } }; struct hv_ictimesync_data { uint64_t parenttime; uint64_t childtime; uint64_t roundtriptime; uint8_t flags; } __packed; typedef struct hv_timesync_sc { hv_util_sc util_sc; struct task task; time_sync_data time_msg; } hv_timesync_sc; /** * Set host time based on time sync message from host */ static void hv_set_host_time(void *context, int pending) { hv_timesync_sc *softc = (hv_timesync_sc*)context; uint64_t hosttime = softc->time_msg.data; struct timespec guest_ts, host_ts; uint64_t host_tns; int64_t diff; int error; host_tns = (hosttime - HV_WLTIMEDELTA) * 100; host_ts.tv_sec = (time_t)(host_tns/HV_NANO_SEC_PER_SEC); host_ts.tv_nsec = (long)(host_tns%HV_NANO_SEC_PER_SEC); nanotime(&guest_ts); diff = (int64_t)host_ts.tv_sec - (int64_t)guest_ts.tv_sec; /* * If host differs by 5 seconds then make the guest catch up */ if (diff > 5 || diff < -5) { error = kern_clock_settime(curthread, CLOCK_REALTIME, &host_ts); } } /** * @brief Synchronize time with host after reboot, restore, etc. * * ICTIMESYNCFLAG_SYNC flag bit indicates reboot, restore events of the VM. * After reboot the flag ICTIMESYNCFLAG_SYNC is included in the first time * message after the timesync channel is opened. Since the hv_utils module is * loaded after hv_vmbus, the first message is usually missed. The other * thing is, systime is automatically set to emulated hardware clock which may * not be UTC time or in the same time zone. So, to override these effects, we * use the first 50 time samples for initial system time setting. */ static inline void hv_adj_guesttime(hv_timesync_sc *sc, uint64_t hosttime, uint8_t flags) { sc->time_msg.data = hosttime; if (((flags & HV_ICTIMESYNCFLAG_SYNC) != 0) || ((flags & HV_ICTIMESYNCFLAG_SAMPLE) != 0)) { taskqueue_enqueue(taskqueue_thread, &sc->task); } } /** * Time Sync Channel message handler */ static void -hv_timesync_cb(void *context) +hv_timesync_cb(struct vmbus_channel *channel, void *context) { - hv_vmbus_channel* channel; hv_vmbus_icmsg_hdr* icmsghdrp; uint32_t recvlen; uint64_t requestId; int ret; uint8_t* time_buf; struct hv_ictimesync_data* timedatap; hv_timesync_sc *softc; softc = (hv_timesync_sc*)context; - channel = softc->util_sc.channel; time_buf = softc->util_sc.receive_buffer; recvlen = PAGE_SIZE; ret = vmbus_chan_recv(channel, time_buf, &recvlen, &requestId); KASSERT(ret != ENOBUFS, ("hvtimesync recvbuf is not large enough")); /* XXX check recvlen to make sure that it contains enough data */ if ((ret == 0) && recvlen > 0) { icmsghdrp = (struct hv_vmbus_icmsg_hdr *) &time_buf[ sizeof(struct hv_vmbus_pipe_hdr)]; if (icmsghdrp->icmsgtype == HV_ICMSGTYPE_NEGOTIATE) { hv_negotiate_version(icmsghdrp, NULL, time_buf); } else { timedatap = (struct hv_ictimesync_data *) &time_buf[ sizeof(struct hv_vmbus_pipe_hdr) + sizeof(struct hv_vmbus_icmsg_hdr)]; hv_adj_guesttime(softc, timedatap->parenttime, timedatap->flags); } icmsghdrp->icflags = HV_ICMSGHDRFLAG_TRANSACTION | HV_ICMSGHDRFLAG_RESPONSE; vmbus_chan_send(channel, VMBUS_CHANPKT_TYPE_INBAND, 0, time_buf, recvlen, requestId); } } static int hv_timesync_probe(device_t dev) { if (resource_disabled("hvtimesync", 0)) return ENXIO; if (VMBUS_PROBE_GUID(device_get_parent(dev), dev, &service_guid) == 0) { device_set_desc(dev, "Hyper-V Time Synch Service"); return BUS_PROBE_DEFAULT; } return ENXIO; } static int hv_timesync_attach(device_t dev) { hv_timesync_sc *softc = device_get_softc(dev); softc->util_sc.callback = hv_timesync_cb; TASK_INIT(&softc->task, 1, hv_set_host_time, softc); return hv_util_attach(dev); } static int hv_timesync_detach(device_t dev) { hv_timesync_sc *softc = device_get_softc(dev); taskqueue_drain(taskqueue_thread, &softc->task); return hv_util_detach(dev); } static device_method_t timesync_methods[] = { /* Device interface */ DEVMETHOD(device_probe, hv_timesync_probe), DEVMETHOD(device_attach, hv_timesync_attach), DEVMETHOD(device_detach, hv_timesync_detach), { 0, 0 } }; static driver_t timesync_driver = { "hvtimesync", timesync_methods, sizeof(hv_timesync_sc)}; static devclass_t timesync_devclass; DRIVER_MODULE(hv_timesync, vmbus, timesync_driver, timesync_devclass, NULL, NULL); MODULE_VERSION(hv_timesync, 1); MODULE_DEPEND(hv_timesync, vmbus, 1, 1, 1); Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_util.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_util.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_util.c (revision 303206) @@ -1,118 +1,119 @@ /*- * Copyright (c) 2014,2016 Microsoft Corp. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ /* * A common driver for all hyper-V util services. */ #include #include #include #include #include #include #include #include #include #include +#include #include "hv_util.h" void hv_negotiate_version( struct hv_vmbus_icmsg_hdr* icmsghdrp, struct hv_vmbus_icmsg_negotiate* negop, uint8_t* buf) { icmsghdrp->icmsgsize = 0x10; negop = (struct hv_vmbus_icmsg_negotiate *)&buf[ sizeof(struct hv_vmbus_pipe_hdr) + sizeof(struct hv_vmbus_icmsg_hdr)]; if (negop->icframe_vercnt >= 2 && negop->icversion_data[1].major == 3) { negop->icversion_data[0].major = 3; negop->icversion_data[0].minor = 0; negop->icversion_data[1].major = 3; negop->icversion_data[1].minor = 0; } else { negop->icversion_data[0].major = 1; negop->icversion_data[0].minor = 0; negop->icversion_data[1].major = 1; negop->icversion_data[1].minor = 0; } negop->icframe_vercnt = 1; negop->icmsg_vercnt = 1; } int hv_util_attach(device_t dev) { struct hv_util_sc* softc; + struct vmbus_channel *chan; int ret; softc = device_get_softc(dev); - softc->channel = vmbus_get_channel(dev); softc->receive_buffer = malloc(4 * PAGE_SIZE, M_DEVBUF, M_WAITOK | M_ZERO); + chan = vmbus_get_channel(dev); /* * These services are not performance critical and do not need * batched reading. Furthermore, some services such as KVP can * only handle one message from the host at a time. * Turn off batched reading for all util drivers before we open the * channel. */ - vmbus_chan_set_readbatch(softc->channel, false); + vmbus_chan_set_readbatch(chan, false); - ret = vmbus_chan_open(softc->channel, 4 * PAGE_SIZE, - 4 * PAGE_SIZE, NULL, 0, - softc->callback, softc); + ret = vmbus_chan_open(chan, 4 * PAGE_SIZE, 4 * PAGE_SIZE, NULL, 0, + softc->callback, softc); if (ret) goto error0; return (0); error0: free(softc->receive_buffer, M_DEVBUF); return (ret); } int hv_util_detach(device_t dev) { struct hv_util_sc *sc = device_get_softc(dev); - vmbus_chan_close(sc->channel); + vmbus_chan_close(vmbus_get_channel(dev)); free(sc->receive_buffer, M_DEVBUF); return (0); } Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_util.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_util.h (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_util.h (revision 303206) @@ -1,55 +1,53 @@ /*- * Copyright (c) 2009-2012,2016 Microsoft Corp. * Copyright (c) 2012 NetApp Inc. * Copyright (c) 2012 Citrix Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _HVUTIL_H_ #define _HVUTIL_H_ /** * hv_util related structures * */ typedef struct hv_util_sc { /* * function to process Hyper-V messages */ - void (*callback)(void *); - - struct hv_vmbus_channel *channel; + void (*callback)(struct vmbus_channel *, void *); uint8_t *receive_buffer; } hv_util_sc; void hv_negotiate_version( struct hv_vmbus_icmsg_hdr* icmsghdrp, struct hv_vmbus_icmsg_negotiate* negop, uint8_t* buf); int hv_util_attach(device_t dev); int hv_util_detach(device_t dev); #endif Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_utilreg.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_utilreg.h (nonexistent) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_utilreg.h (revision 303206) @@ -0,0 +1,91 @@ +/*- + * Copyright (c) 2016 Microsoft Corp. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice unmodified, this list of conditions, and the following + * disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef _HV_UTILREG_H_ +#define _HV_UTILREG_H_ + +#define HV_S_OK 0x00000000 +#define HV_E_FAIL 0x80004005 +#define HV_ERROR_NOT_SUPPORTED 0x80070032 +#define HV_ERROR_MACHINE_LOCKED 0x800704F7 + +/* + * Common defines for Hyper-V ICs + */ +#define HV_ICMSGTYPE_NEGOTIATE 0 +#define HV_ICMSGTYPE_HEARTBEAT 1 +#define HV_ICMSGTYPE_KVPEXCHANGE 2 +#define HV_ICMSGTYPE_SHUTDOWN 3 +#define HV_ICMSGTYPE_TIMESYNC 4 +#define HV_ICMSGTYPE_VSS 5 + +#define HV_ICMSGHDRFLAG_TRANSACTION 1 +#define HV_ICMSGHDRFLAG_REQUEST 2 +#define HV_ICMSGHDRFLAG_RESPONSE 4 + +typedef struct hv_vmbus_pipe_hdr { + uint32_t flags; + uint32_t msgsize; +} __packed hv_vmbus_pipe_hdr; + +typedef struct hv_vmbus_ic_version { + uint16_t major; + uint16_t minor; +} __packed hv_vmbus_ic_version; + +typedef struct hv_vmbus_icmsg_hdr { + hv_vmbus_ic_version icverframe; + uint16_t icmsgtype; + hv_vmbus_ic_version icvermsg; + uint16_t icmsgsize; + uint32_t status; + uint8_t ictransaction_id; + uint8_t icflags; + uint8_t reserved[2]; +} __packed hv_vmbus_icmsg_hdr; + +typedef struct hv_vmbus_icmsg_negotiate { + uint16_t icframe_vercnt; + uint16_t icmsg_vercnt; + uint32_t reserved; + hv_vmbus_ic_version icversion_data[1]; /* any size array */ +} __packed hv_vmbus_icmsg_negotiate; + +typedef struct hv_vmbus_shutdown_msg_data { + uint32_t reason_code; + uint32_t timeout_seconds; + uint32_t flags; + uint8_t display_message[2048]; +} __packed hv_vmbus_shutdown_msg_data; + +typedef struct hv_vmbus_heartbeat_msg_data { + uint64_t seq_num; + uint32_t reserved[8]; +} __packed hv_vmbus_heartbeat_msg_data; + +#endif /* !_HV_UTILREG_H_ */ Property changes on: user/alc/PQ_LAUNDRY/sys/dev/hyperv/utilities/hv_utilreg.h ___________________________________________________________________ Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/hv_ring_buffer.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/hv_ring_buffer.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/hv_ring_buffer.c (revision 303206) @@ -1,540 +1,524 @@ /*- * Copyright (c) 2009-2012,2016 Microsoft Corp. * Copyright (c) 2012 NetApp Inc. * Copyright (c) 2012 Citrix Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ - #include #include #include #include #include "hv_vmbus_priv.h" +#include /* Amount of space to write to */ -#define HV_BYTES_AVAIL_TO_WRITE(r, w, z) ((w) >= (r))? \ - ((z) - ((w) - (r))):((r) - (w)) +#define HV_BYTES_AVAIL_TO_WRITE(r, w, z) \ + ((w) >= (r)) ? ((z) - ((w) - (r))) : ((r) - (w)) +static uint32_t copy_to_ring_buffer(hv_vmbus_ring_buffer_info *ring_info, + uint32_t start_write_offset, const uint8_t *src, + uint32_t src_len); +static uint32_t copy_from_ring_buffer(hv_vmbus_ring_buffer_info *ring_info, + char *dest, uint32_t dest_len, uint32_t start_read_offset); + static int -hv_rbi_sysctl_stats(SYSCTL_HANDLER_ARGS) +vmbus_br_sysctl_state(SYSCTL_HANDLER_ARGS) { - hv_vmbus_ring_buffer_info* rbi; - uint32_t read_index, write_index, interrupt_mask, sz; - uint32_t read_avail, write_avail; - char rbi_stats[256]; + const hv_vmbus_ring_buffer_info *br = arg1; + uint32_t rindex, windex, intr_mask, ravail, wavail; + char state[256]; - rbi = (hv_vmbus_ring_buffer_info*)arg1; - read_index = rbi->ring_buffer->read_index; - write_index = rbi->ring_buffer->write_index; - interrupt_mask = rbi->ring_buffer->interrupt_mask; - sz = rbi->ring_data_size; - write_avail = HV_BYTES_AVAIL_TO_WRITE(read_index, - write_index, sz); - read_avail = sz - write_avail; - snprintf(rbi_stats, sizeof(rbi_stats), - "r_idx:%d " - "w_idx:%d " - "int_mask:%d " - "r_avail:%d " - "w_avail:%d", - read_index, write_index, interrupt_mask, - read_avail, write_avail); + rindex = br->ring_buffer->br_rindex; + windex = br->ring_buffer->br_windex; + intr_mask = br->ring_buffer->br_imask; + wavail = HV_BYTES_AVAIL_TO_WRITE(rindex, windex, br->ring_data_size); + ravail = br->ring_data_size - wavail; - return (sysctl_handle_string(oidp, rbi_stats, - sizeof(rbi_stats), req)); + snprintf(state, sizeof(state), + "rindex:%u windex:%u intr_mask:%u ravail:%u wavail:%u", + rindex, windex, intr_mask, ravail, wavail); + return sysctl_handle_string(oidp, state, sizeof(state), req); } +/* + * Binary bufring states. + */ +static int +vmbus_br_sysctl_state_bin(SYSCTL_HANDLER_ARGS) +{ +#define BR_STATE_RIDX 0 +#define BR_STATE_WIDX 1 +#define BR_STATE_IMSK 2 +#define BR_STATE_RSPC 3 +#define BR_STATE_WSPC 4 +#define BR_STATE_MAX 5 + + const hv_vmbus_ring_buffer_info *br = arg1; + uint32_t rindex, windex, wavail, state[BR_STATE_MAX]; + + rindex = br->ring_buffer->br_rindex; + windex = br->ring_buffer->br_windex; + wavail = HV_BYTES_AVAIL_TO_WRITE(rindex, windex, br->ring_data_size); + + state[BR_STATE_RIDX] = rindex; + state[BR_STATE_WIDX] = windex; + state[BR_STATE_IMSK] = br->ring_buffer->br_imask; + state[BR_STATE_WSPC] = wavail; + state[BR_STATE_RSPC] = br->ring_data_size - wavail; + + return sysctl_handle_opaque(oidp, state, sizeof(state), req); +} + void -hv_ring_buffer_stat( - struct sysctl_ctx_list *ctx, - struct sysctl_oid_list *tree_node, - hv_vmbus_ring_buffer_info *rbi, - const char *desc) +vmbus_br_sysctl_create(struct sysctl_ctx_list *ctx, struct sysctl_oid *br_tree, + hv_vmbus_ring_buffer_info *br, const char *name) { - SYSCTL_ADD_PROC(ctx, tree_node, OID_AUTO, - "ring_buffer_stats", - CTLTYPE_STRING|CTLFLAG_RD|CTLFLAG_MPSAFE, rbi, 0, - hv_rbi_sysctl_stats, "A", desc); + struct sysctl_oid *tree; + char desc[64]; + + tree = SYSCTL_ADD_NODE(ctx, SYSCTL_CHILDREN(br_tree), OID_AUTO, + name, CTLFLAG_RD | CTLFLAG_MPSAFE, 0, ""); + if (tree == NULL) + return; + + snprintf(desc, sizeof(desc), "%s state", name); + SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "state", + CTLTYPE_STRING | CTLFLAG_RD | CTLFLAG_MPSAFE, + br, 0, vmbus_br_sysctl_state, "A", desc); + + snprintf(desc, sizeof(desc), "%s binary state", name); + SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "state_bin", + CTLTYPE_OPAQUE | CTLFLAG_RD | CTLFLAG_MPSAFE, + br, 0, vmbus_br_sysctl_state_bin, "IU", desc); } + /** * @brief Get number of bytes available to read and to write to * for the specified ring buffer */ -static inline void -get_ring_buffer_avail_bytes( - hv_vmbus_ring_buffer_info* rbi, - uint32_t* read, - uint32_t* write) +static __inline void +get_ring_buffer_avail_bytes(hv_vmbus_ring_buffer_info *rbi, uint32_t *read, + uint32_t *write) { uint32_t read_loc, write_loc; /* * Capture the read/write indices before they changed */ - read_loc = rbi->ring_buffer->read_index; - write_loc = rbi->ring_buffer->write_index; + read_loc = rbi->ring_buffer->br_rindex; + write_loc = rbi->ring_buffer->br_windex; - *write = HV_BYTES_AVAIL_TO_WRITE( - read_loc, write_loc, rbi->ring_data_size); + *write = HV_BYTES_AVAIL_TO_WRITE(read_loc, write_loc, + rbi->ring_data_size); *read = rbi->ring_data_size - *write; } /** * @brief Get the next write location for the specified ring buffer */ -static inline uint32_t -get_next_write_location(hv_vmbus_ring_buffer_info* ring_info) +static __inline uint32_t +get_next_write_location(hv_vmbus_ring_buffer_info *ring_info) { - uint32_t next = ring_info->ring_buffer->write_index; - return (next); + return ring_info->ring_buffer->br_windex; } /** * @brief Set the next write location for the specified ring buffer */ -static inline void -set_next_write_location( - hv_vmbus_ring_buffer_info* ring_info, - uint32_t next_write_location) +static __inline void +set_next_write_location(hv_vmbus_ring_buffer_info *ring_info, + uint32_t next_write_location) { - ring_info->ring_buffer->write_index = next_write_location; + ring_info->ring_buffer->br_windex = next_write_location; } /** * @brief Get the next read location for the specified ring buffer */ -static inline uint32_t -get_next_read_location(hv_vmbus_ring_buffer_info* ring_info) +static __inline uint32_t +get_next_read_location(hv_vmbus_ring_buffer_info *ring_info) { - uint32_t next = ring_info->ring_buffer->read_index; - return (next); + return ring_info->ring_buffer->br_rindex; } /** * @brief Get the next read location + offset for the specified ring buffer. * This allows the caller to skip. */ -static inline uint32_t -get_next_read_location_with_offset( - hv_vmbus_ring_buffer_info* ring_info, - uint32_t offset) +static __inline uint32_t +get_next_read_location_with_offset(hv_vmbus_ring_buffer_info *ring_info, + uint32_t offset) { - uint32_t next = ring_info->ring_buffer->read_index; + uint32_t next = ring_info->ring_buffer->br_rindex; + next += offset; next %= ring_info->ring_data_size; return (next); } /** * @brief Set the next read location for the specified ring buffer */ -static inline void -set_next_read_location( - hv_vmbus_ring_buffer_info* ring_info, - uint32_t next_read_location) +static __inline void +set_next_read_location(hv_vmbus_ring_buffer_info *ring_info, + uint32_t next_read_location) { - ring_info->ring_buffer->read_index = next_read_location; + ring_info->ring_buffer->br_rindex = next_read_location; } /** * @brief Get the start of the ring buffer */ -static inline void * -get_ring_buffer(hv_vmbus_ring_buffer_info* ring_info) +static __inline void * +get_ring_buffer(hv_vmbus_ring_buffer_info *ring_info) { - return (void *) ring_info->ring_buffer->buffer; + return ring_info->ring_buffer->br_data; } /** * @brief Get the size of the ring buffer. */ -static inline uint32_t -get_ring_buffer_size(hv_vmbus_ring_buffer_info* ring_info) +static __inline uint32_t +get_ring_buffer_size(hv_vmbus_ring_buffer_info *ring_info) { return ring_info->ring_data_size; } /** * Get the read and write indices as uint64_t of the specified ring buffer. */ -static inline uint64_t -get_ring_buffer_indices(hv_vmbus_ring_buffer_info* ring_info) +static __inline uint64_t +get_ring_buffer_indices(hv_vmbus_ring_buffer_info *ring_info) { - return (uint64_t) ring_info->ring_buffer->write_index << 32; + return ((uint64_t)ring_info->ring_buffer->br_windex) << 32; } void -hv_ring_buffer_read_begin( - hv_vmbus_ring_buffer_info* ring_info) +hv_ring_buffer_read_begin(hv_vmbus_ring_buffer_info *ring_info) { - ring_info->ring_buffer->interrupt_mask = 1; + ring_info->ring_buffer->br_imask = 1; mb(); } uint32_t -hv_ring_buffer_read_end( - hv_vmbus_ring_buffer_info* ring_info) +hv_ring_buffer_read_end(hv_vmbus_ring_buffer_info *ring_info) { - uint32_t read, write; + uint32_t read, write; - ring_info->ring_buffer->interrupt_mask = 0; + ring_info->ring_buffer->br_imask = 0; mb(); /* * Now check to see if the ring buffer is still empty. * If it is not, we raced and we need to process new * incoming messages. */ get_ring_buffer_avail_bytes(ring_info, &read, &write); - return (read); } /* * When we write to the ring buffer, check if the host needs to * be signaled. Here is the details of this protocol: * * 1. The host guarantees that while it is draining the * ring buffer, it will set the interrupt_mask to * indicate it does not need to be interrupted when * new data is placed. * * 2. The host guarantees that it will completely drain * the ring buffer before exiting the read loop. Further, * once the ring buffer is empty, it will clear the * interrupt_mask and re-check to see if new data has * arrived. */ static boolean_t -hv_ring_buffer_needsig_on_write( - uint32_t old_write_location, - hv_vmbus_ring_buffer_info* rbi) +hv_ring_buffer_needsig_on_write(uint32_t old_write_location, + hv_vmbus_ring_buffer_info *rbi) { mb(); - if (rbi->ring_buffer->interrupt_mask) + if (rbi->ring_buffer->br_imask) return (FALSE); /* Read memory barrier */ rmb(); /* * This is the only case we need to signal when the * ring transitions from being empty to non-empty. */ - if (old_write_location == rbi->ring_buffer->read_index) + if (old_write_location == rbi->ring_buffer->br_rindex) return (TRUE); return (FALSE); } -static uint32_t copy_to_ring_buffer( - hv_vmbus_ring_buffer_info* ring_info, - uint32_t start_write_offset, - const uint8_t *src, - uint32_t src_len); - -static uint32_t copy_from_ring_buffer( - hv_vmbus_ring_buffer_info* ring_info, - char* dest, - uint32_t dest_len, - uint32_t start_read_offset); - /** * @brief Initialize the ring buffer. */ int -hv_vmbus_ring_buffer_init( - hv_vmbus_ring_buffer_info* ring_info, - void* buffer, - uint32_t buffer_len) +hv_vmbus_ring_buffer_init(hv_vmbus_ring_buffer_info *ring_info, void *buffer, + uint32_t buffer_len) { memset(ring_info, 0, sizeof(hv_vmbus_ring_buffer_info)); - ring_info->ring_buffer = (hv_vmbus_ring_buffer*) buffer; - ring_info->ring_buffer->read_index = - ring_info->ring_buffer->write_index = 0; + ring_info->ring_buffer = buffer; + ring_info->ring_buffer->br_rindex = 0; + ring_info->ring_buffer->br_windex = 0; - ring_info->ring_data_size = buffer_len - sizeof(hv_vmbus_ring_buffer); - + ring_info->ring_data_size = buffer_len - sizeof(struct vmbus_bufring); mtx_init(&ring_info->ring_lock, "vmbus ring buffer", NULL, MTX_SPIN); return (0); } /** * @brief Cleanup the ring buffer. */ -void hv_ring_buffer_cleanup(hv_vmbus_ring_buffer_info* ring_info) +void +hv_ring_buffer_cleanup(hv_vmbus_ring_buffer_info *ring_info) { mtx_destroy(&ring_info->ring_lock); } /** * @brief Write to the ring buffer. */ int -hv_ring_buffer_write( - hv_vmbus_ring_buffer_info* out_ring_info, - const struct iovec iov[], - uint32_t iovlen, - boolean_t *need_sig) +hv_ring_buffer_write(hv_vmbus_ring_buffer_info *out_ring_info, + const struct iovec iov[], uint32_t iovlen, boolean_t *need_sig) { int i = 0; uint32_t byte_avail_to_write; uint32_t byte_avail_to_read; uint32_t old_write_location; uint32_t total_bytes_to_write = 0; - volatile uint32_t next_write_location; uint64_t prev_indices = 0; - for (i = 0; i < iovlen; i++) { - total_bytes_to_write += iov[i].iov_len; - } + for (i = 0; i < iovlen; i++) + total_bytes_to_write += iov[i].iov_len; total_bytes_to_write += sizeof(uint64_t); mtx_lock_spin(&out_ring_info->ring_lock); get_ring_buffer_avail_bytes(out_ring_info, &byte_avail_to_read, &byte_avail_to_write); /* * If there is only room for the packet, assume it is full. * Otherwise, the next time around, we think the ring buffer * is empty since the read index == write index */ - if (byte_avail_to_write <= total_bytes_to_write) { - - mtx_unlock_spin(&out_ring_info->ring_lock); - return (EAGAIN); + mtx_unlock_spin(&out_ring_info->ring_lock); + return (EAGAIN); } /* * Write to the ring buffer */ next_write_location = get_next_write_location(out_ring_info); old_write_location = next_write_location; for (i = 0; i < iovlen; i++) { - next_write_location = copy_to_ring_buffer(out_ring_info, - next_write_location, iov[i].iov_base, iov[i].iov_len); + next_write_location = copy_to_ring_buffer(out_ring_info, + next_write_location, iov[i].iov_base, iov[i].iov_len); } /* * Set previous packet start */ prev_indices = get_ring_buffer_indices(out_ring_info); - next_write_location = copy_to_ring_buffer( - out_ring_info, next_write_location, - (char *) &prev_indices, sizeof(uint64_t)); + next_write_location = copy_to_ring_buffer(out_ring_info, + next_write_location, (char *)&prev_indices, sizeof(uint64_t)); /* * Full memory barrier before upding the write index. */ mb(); /* * Now, update the write location */ set_next_write_location(out_ring_info, next_write_location); mtx_unlock_spin(&out_ring_info->ring_lock); *need_sig = hv_ring_buffer_needsig_on_write(old_write_location, out_ring_info); return (0); } /** * @brief Read without advancing the read index. */ int -hv_ring_buffer_peek( - hv_vmbus_ring_buffer_info* in_ring_info, - void* buffer, - uint32_t buffer_len) +hv_ring_buffer_peek(hv_vmbus_ring_buffer_info *in_ring_info, void *buffer, + uint32_t buffer_len) { uint32_t bytesAvailToWrite; uint32_t bytesAvailToRead; uint32_t nextReadLocation = 0; mtx_lock_spin(&in_ring_info->ring_lock); get_ring_buffer_avail_bytes(in_ring_info, &bytesAvailToRead, - &bytesAvailToWrite); + &bytesAvailToWrite); /* * Make sure there is something to read */ if (bytesAvailToRead < buffer_len) { - mtx_unlock_spin(&in_ring_info->ring_lock); - return (EAGAIN); + mtx_unlock_spin(&in_ring_info->ring_lock); + return (EAGAIN); } /* * Convert to byte offset */ nextReadLocation = get_next_read_location(in_ring_info); - nextReadLocation = copy_from_ring_buffer( - in_ring_info, (char *)buffer, buffer_len, nextReadLocation); + nextReadLocation = copy_from_ring_buffer(in_ring_info, + (char *)buffer, buffer_len, nextReadLocation); mtx_unlock_spin(&in_ring_info->ring_lock); return (0); } /** * @brief Read and advance the read index. */ int -hv_ring_buffer_read( - hv_vmbus_ring_buffer_info* in_ring_info, - void* buffer, - uint32_t buffer_len, - uint32_t offset) +hv_ring_buffer_read(hv_vmbus_ring_buffer_info *in_ring_info, void *buffer, + uint32_t buffer_len, uint32_t offset) { uint32_t bytes_avail_to_write; uint32_t bytes_avail_to_read; uint32_t next_read_location = 0; uint64_t prev_indices = 0; if (buffer_len <= 0) - return (EINVAL); + return (EINVAL); mtx_lock_spin(&in_ring_info->ring_lock); - get_ring_buffer_avail_bytes( - in_ring_info, &bytes_avail_to_read, + get_ring_buffer_avail_bytes(in_ring_info, &bytes_avail_to_read, &bytes_avail_to_write); /* * Make sure there is something to read */ if (bytes_avail_to_read < buffer_len) { - mtx_unlock_spin(&in_ring_info->ring_lock); - return (EAGAIN); + mtx_unlock_spin(&in_ring_info->ring_lock); + return (EAGAIN); } - next_read_location = get_next_read_location_with_offset( - in_ring_info, + next_read_location = get_next_read_location_with_offset(in_ring_info, offset); - next_read_location = copy_from_ring_buffer( - in_ring_info, - (char *) buffer, - buffer_len, - next_read_location); + next_read_location = copy_from_ring_buffer(in_ring_info, (char *)buffer, + buffer_len, next_read_location); - next_read_location = copy_from_ring_buffer( - in_ring_info, - (char *) &prev_indices, - sizeof(uint64_t), - next_read_location); + next_read_location = copy_from_ring_buffer(in_ring_info, + (char *)&prev_indices, sizeof(uint64_t), next_read_location); /* * Make sure all reads are done before we update the read index since * the writer may start writing to the read area once the read index * is updated. */ wmb(); /* * Update the read index */ set_next_read_location(in_ring_info, next_read_location); mtx_unlock_spin(&in_ring_info->ring_lock); return (0); } /** * @brief Helper routine to copy from source to ring buffer. * * Assume there is enough room. Handles wrap-around in dest case only! */ static uint32_t -copy_to_ring_buffer( - hv_vmbus_ring_buffer_info* ring_info, - uint32_t start_write_offset, - const uint8_t *src, - uint32_t src_len) +copy_to_ring_buffer(hv_vmbus_ring_buffer_info *ring_info, + uint32_t start_write_offset, const uint8_t *src, uint32_t src_len) { char *ring_buffer = get_ring_buffer(ring_info); uint32_t ring_buffer_size = get_ring_buffer_size(ring_info); uint32_t fragLen; - if (src_len > ring_buffer_size - start_write_offset) { - /* wrap-around detected! */ - fragLen = ring_buffer_size - start_write_offset; - memcpy(ring_buffer + start_write_offset, src, fragLen); - memcpy(ring_buffer, src + fragLen, src_len - fragLen); + if (src_len > ring_buffer_size - start_write_offset) { + /* wrap-around detected! */ + fragLen = ring_buffer_size - start_write_offset; + memcpy(ring_buffer + start_write_offset, src, fragLen); + memcpy(ring_buffer, src + fragLen, src_len - fragLen); } else { - memcpy(ring_buffer + start_write_offset, src, src_len); + memcpy(ring_buffer + start_write_offset, src, src_len); } start_write_offset += src_len; start_write_offset %= ring_buffer_size; return (start_write_offset); } /** * @brief Helper routine to copy to source from ring buffer. * * Assume there is enough room. Handles wrap-around in src case only! */ -uint32_t -copy_from_ring_buffer( - hv_vmbus_ring_buffer_info* ring_info, - char* dest, - uint32_t dest_len, - uint32_t start_read_offset) +static uint32_t +copy_from_ring_buffer(hv_vmbus_ring_buffer_info *ring_info, char *dest, + uint32_t dest_len, uint32_t start_read_offset) { uint32_t fragLen; char *ring_buffer = get_ring_buffer(ring_info); uint32_t ring_buffer_size = get_ring_buffer_size(ring_info); if (dest_len > ring_buffer_size - start_read_offset) { - /* wrap-around detected at the src */ - fragLen = ring_buffer_size - start_read_offset; - memcpy(dest, ring_buffer + start_read_offset, fragLen); - memcpy(dest + fragLen, ring_buffer, dest_len - fragLen); + /* wrap-around detected at the src */ + fragLen = ring_buffer_size - start_read_offset; + memcpy(dest, ring_buffer + start_read_offset, fragLen); + memcpy(dest + fragLen, ring_buffer, dest_len - fragLen); } else { - memcpy(dest, ring_buffer + start_read_offset, dest_len); + memcpy(dest, ring_buffer + start_read_offset, dest_len); } start_read_offset += dest_len; start_read_offset %= ring_buffer_size; return (start_read_offset); } - Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/hv_vmbus_priv.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/hv_vmbus_priv.h (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/hv_vmbus_priv.h (revision 303206) @@ -1,87 +1,85 @@ /*- * Copyright (c) 2009-2012,2016 Microsoft Corp. * Copyright (c) 2012 NetApp Inc. * Copyright (c) 2012 Citrix Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #ifndef __HYPERV_PRIV_H__ #define __HYPERV_PRIV_H__ #include #include #include #include #include -#include +#include struct vmbus_softc; /* * Private, VM Bus functions */ struct sysctl_ctx_list; -struct sysctl_oid_list; +struct sysctl_oid; -void hv_ring_buffer_stat( - struct sysctl_ctx_list *ctx, - struct sysctl_oid_list *tree_node, - hv_vmbus_ring_buffer_info *rbi, - const char *desc); +void vmbus_br_sysctl_create(struct sysctl_ctx_list *ctx, + struct sysctl_oid *br_tree, hv_vmbus_ring_buffer_info *br, + const char *name); int hv_vmbus_ring_buffer_init( hv_vmbus_ring_buffer_info *ring_info, void *buffer, uint32_t buffer_len); void hv_ring_buffer_cleanup( hv_vmbus_ring_buffer_info *ring_info); int hv_ring_buffer_write( hv_vmbus_ring_buffer_info *ring_info, const struct iovec iov[], uint32_t iovlen, boolean_t *need_sig); int hv_ring_buffer_peek( hv_vmbus_ring_buffer_info *ring_info, void *buffer, uint32_t buffer_len); int hv_ring_buffer_read( hv_vmbus_ring_buffer_info *ring_info, void *buffer, uint32_t buffer_len, uint32_t offset); void hv_ring_buffer_read_begin( hv_vmbus_ring_buffer_info *ring_info); uint32_t hv_ring_buffer_read_end( hv_vmbus_ring_buffer_info *ring_info); #endif /* __HYPERV_PRIV_H__ */ Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/vmbus.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/vmbus.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/vmbus.c (revision 303206) @@ -1,1332 +1,1332 @@ /*- * Copyright (c) 2009-2012,2016 Microsoft Corp. * Copyright (c) 2012 NetApp Inc. * Copyright (c) 2012 Citrix Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /* * VM Bus Driver Implementation */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "acpi_if.h" #include "vmbus_if.h" #define VMBUS_GPADL_START 0xe1e10 struct vmbus_msghc { struct hypercall_postmsg_in *mh_inprm; struct hypercall_postmsg_in mh_inprm_save; struct hyperv_dma mh_inprm_dma; struct vmbus_message *mh_resp; struct vmbus_message mh_resp0; }; struct vmbus_msghc_ctx { struct vmbus_msghc *mhc_free; struct mtx mhc_free_lock; uint32_t mhc_flags; struct vmbus_msghc *mhc_active; struct mtx mhc_active_lock; }; #define VMBUS_MSGHC_CTXF_DESTROY 0x0001 static int vmbus_init(struct vmbus_softc *); static int vmbus_connect(struct vmbus_softc *, uint32_t); static int vmbus_req_channels(struct vmbus_softc *sc); static void vmbus_disconnect(struct vmbus_softc *); static int vmbus_scan(struct vmbus_softc *); static void vmbus_scan_wait(struct vmbus_softc *); static void vmbus_scan_newchan(struct vmbus_softc *); static void vmbus_scan_newdev(struct vmbus_softc *); static void vmbus_scan_done(struct vmbus_softc *, const struct vmbus_message *); static void vmbus_chanmsg_handle(struct vmbus_softc *, const struct vmbus_message *); static int vmbus_sysctl_version(SYSCTL_HANDLER_ARGS); static struct vmbus_msghc_ctx *vmbus_msghc_ctx_create(bus_dma_tag_t); static void vmbus_msghc_ctx_destroy( struct vmbus_msghc_ctx *); static void vmbus_msghc_ctx_free(struct vmbus_msghc_ctx *); static struct vmbus_msghc *vmbus_msghc_alloc(bus_dma_tag_t); static void vmbus_msghc_free(struct vmbus_msghc *); static struct vmbus_msghc *vmbus_msghc_get1(struct vmbus_msghc_ctx *, uint32_t); struct vmbus_softc *vmbus_sc; extern inthand_t IDTVEC(vmbus_isr); static const uint32_t vmbus_version[] = { VMBUS_VERSION_WIN8_1, VMBUS_VERSION_WIN8, VMBUS_VERSION_WIN7, VMBUS_VERSION_WS2008 }; static const vmbus_chanmsg_proc_t vmbus_chanmsg_handlers[VMBUS_CHANMSG_TYPE_MAX] = { VMBUS_CHANMSG_PROC(CHOFFER_DONE, vmbus_scan_done), VMBUS_CHANMSG_PROC_WAKEUP(CONNECT_RESP) }; static struct vmbus_msghc * vmbus_msghc_alloc(bus_dma_tag_t parent_dtag) { struct vmbus_msghc *mh; mh = malloc(sizeof(*mh), M_DEVBUF, M_WAITOK | M_ZERO); mh->mh_inprm = hyperv_dmamem_alloc(parent_dtag, HYPERCALL_PARAM_ALIGN, 0, HYPERCALL_POSTMSGIN_SIZE, &mh->mh_inprm_dma, BUS_DMA_WAITOK); if (mh->mh_inprm == NULL) { free(mh, M_DEVBUF); return NULL; } return mh; } static void vmbus_msghc_free(struct vmbus_msghc *mh) { hyperv_dmamem_free(&mh->mh_inprm_dma, mh->mh_inprm); free(mh, M_DEVBUF); } static void vmbus_msghc_ctx_free(struct vmbus_msghc_ctx *mhc) { KASSERT(mhc->mhc_active == NULL, ("still have active msg hypercall")); KASSERT(mhc->mhc_free == NULL, ("still have hypercall msg")); mtx_destroy(&mhc->mhc_free_lock); mtx_destroy(&mhc->mhc_active_lock); free(mhc, M_DEVBUF); } static struct vmbus_msghc_ctx * vmbus_msghc_ctx_create(bus_dma_tag_t parent_dtag) { struct vmbus_msghc_ctx *mhc; mhc = malloc(sizeof(*mhc), M_DEVBUF, M_WAITOK | M_ZERO); mtx_init(&mhc->mhc_free_lock, "vmbus msghc free", NULL, MTX_DEF); mtx_init(&mhc->mhc_active_lock, "vmbus msghc act", NULL, MTX_DEF); mhc->mhc_free = vmbus_msghc_alloc(parent_dtag); if (mhc->mhc_free == NULL) { vmbus_msghc_ctx_free(mhc); return NULL; } return mhc; } static struct vmbus_msghc * vmbus_msghc_get1(struct vmbus_msghc_ctx *mhc, uint32_t dtor_flag) { struct vmbus_msghc *mh; mtx_lock(&mhc->mhc_free_lock); while ((mhc->mhc_flags & dtor_flag) == 0 && mhc->mhc_free == NULL) { mtx_sleep(&mhc->mhc_free, &mhc->mhc_free_lock, 0, "gmsghc", 0); } if (mhc->mhc_flags & dtor_flag) { /* Being destroyed */ mh = NULL; } else { mh = mhc->mhc_free; KASSERT(mh != NULL, ("no free hypercall msg")); KASSERT(mh->mh_resp == NULL, ("hypercall msg has pending response")); mhc->mhc_free = NULL; } mtx_unlock(&mhc->mhc_free_lock); return mh; } void vmbus_msghc_reset(struct vmbus_msghc *mh, size_t dsize) { struct hypercall_postmsg_in *inprm; if (dsize > HYPERCALL_POSTMSGIN_DSIZE_MAX) panic("invalid data size %zu", dsize); inprm = mh->mh_inprm; memset(inprm, 0, HYPERCALL_POSTMSGIN_SIZE); inprm->hc_connid = VMBUS_CONNID_MESSAGE; inprm->hc_msgtype = HYPERV_MSGTYPE_CHANNEL; inprm->hc_dsize = dsize; } struct vmbus_msghc * vmbus_msghc_get(struct vmbus_softc *sc, size_t dsize) { struct vmbus_msghc *mh; if (dsize > HYPERCALL_POSTMSGIN_DSIZE_MAX) panic("invalid data size %zu", dsize); mh = vmbus_msghc_get1(sc->vmbus_msg_hc, VMBUS_MSGHC_CTXF_DESTROY); if (mh == NULL) return NULL; vmbus_msghc_reset(mh, dsize); return mh; } void vmbus_msghc_put(struct vmbus_softc *sc, struct vmbus_msghc *mh) { struct vmbus_msghc_ctx *mhc = sc->vmbus_msg_hc; KASSERT(mhc->mhc_active == NULL, ("msg hypercall is active")); mh->mh_resp = NULL; mtx_lock(&mhc->mhc_free_lock); KASSERT(mhc->mhc_free == NULL, ("has free hypercall msg")); mhc->mhc_free = mh; mtx_unlock(&mhc->mhc_free_lock); wakeup(&mhc->mhc_free); } void * vmbus_msghc_dataptr(struct vmbus_msghc *mh) { return mh->mh_inprm->hc_data; } static void vmbus_msghc_ctx_destroy(struct vmbus_msghc_ctx *mhc) { struct vmbus_msghc *mh; mtx_lock(&mhc->mhc_free_lock); mhc->mhc_flags |= VMBUS_MSGHC_CTXF_DESTROY; mtx_unlock(&mhc->mhc_free_lock); wakeup(&mhc->mhc_free); mh = vmbus_msghc_get1(mhc, 0); if (mh == NULL) panic("can't get msghc"); vmbus_msghc_free(mh); vmbus_msghc_ctx_free(mhc); } int vmbus_msghc_exec_noresult(struct vmbus_msghc *mh) { sbintime_t time = SBT_1MS; int i; /* * Save the input parameter so that we could restore the input * parameter if the Hypercall failed. * * XXX * Is this really necessary?! i.e. Will the Hypercall ever * overwrite the input parameter? */ memcpy(&mh->mh_inprm_save, mh->mh_inprm, HYPERCALL_POSTMSGIN_SIZE); /* * In order to cope with transient failures, e.g. insufficient * resources on host side, we retry the post message Hypercall * several times. 20 retries seem sufficient. */ #define HC_RETRY_MAX 20 for (i = 0; i < HC_RETRY_MAX; ++i) { uint64_t status; status = hypercall_post_message(mh->mh_inprm_dma.hv_paddr); if (status == HYPERCALL_STATUS_SUCCESS) return 0; pause_sbt("hcpmsg", time, 0, C_HARDCLOCK); if (time < SBT_1S * 2) time *= 2; /* Restore input parameter and try again */ memcpy(mh->mh_inprm, &mh->mh_inprm_save, HYPERCALL_POSTMSGIN_SIZE); } #undef HC_RETRY_MAX return EIO; } int vmbus_msghc_exec(struct vmbus_softc *sc, struct vmbus_msghc *mh) { struct vmbus_msghc_ctx *mhc = sc->vmbus_msg_hc; int error; KASSERT(mh->mh_resp == NULL, ("hypercall msg has pending response")); mtx_lock(&mhc->mhc_active_lock); KASSERT(mhc->mhc_active == NULL, ("pending active msg hypercall")); mhc->mhc_active = mh; mtx_unlock(&mhc->mhc_active_lock); error = vmbus_msghc_exec_noresult(mh); if (error) { mtx_lock(&mhc->mhc_active_lock); KASSERT(mhc->mhc_active == mh, ("msghc mismatch")); mhc->mhc_active = NULL; mtx_unlock(&mhc->mhc_active_lock); } return error; } const struct vmbus_message * vmbus_msghc_wait_result(struct vmbus_softc *sc, struct vmbus_msghc *mh) { struct vmbus_msghc_ctx *mhc = sc->vmbus_msg_hc; mtx_lock(&mhc->mhc_active_lock); KASSERT(mhc->mhc_active == mh, ("msghc mismatch")); while (mh->mh_resp == NULL) { mtx_sleep(&mhc->mhc_active, &mhc->mhc_active_lock, 0, "wmsghc", 0); } mhc->mhc_active = NULL; mtx_unlock(&mhc->mhc_active_lock); return mh->mh_resp; } void vmbus_msghc_wakeup(struct vmbus_softc *sc, const struct vmbus_message *msg) { struct vmbus_msghc_ctx *mhc = sc->vmbus_msg_hc; struct vmbus_msghc *mh; mtx_lock(&mhc->mhc_active_lock); mh = mhc->mhc_active; KASSERT(mh != NULL, ("no pending msg hypercall")); memcpy(&mh->mh_resp0, msg, sizeof(mh->mh_resp0)); mh->mh_resp = &mh->mh_resp0; mtx_unlock(&mhc->mhc_active_lock); wakeup(&mhc->mhc_active); } uint32_t vmbus_gpadl_alloc(struct vmbus_softc *sc) { return atomic_fetchadd_int(&sc->vmbus_gpadl, 1); } static int vmbus_connect(struct vmbus_softc *sc, uint32_t version) { struct vmbus_chanmsg_connect *req; const struct vmbus_message *msg; struct vmbus_msghc *mh; int error, done = 0; mh = vmbus_msghc_get(sc, sizeof(*req)); if (mh == NULL) return ENXIO; req = vmbus_msghc_dataptr(mh); req->chm_hdr.chm_type = VMBUS_CHANMSG_TYPE_CONNECT; req->chm_ver = version; req->chm_evtflags = sc->vmbus_evtflags_dma.hv_paddr; req->chm_mnf1 = sc->vmbus_mnf1_dma.hv_paddr; req->chm_mnf2 = sc->vmbus_mnf2_dma.hv_paddr; error = vmbus_msghc_exec(sc, mh); if (error) { vmbus_msghc_put(sc, mh); return error; } msg = vmbus_msghc_wait_result(sc, mh); done = ((const struct vmbus_chanmsg_connect_resp *) msg->msg_data)->chm_done; vmbus_msghc_put(sc, mh); return (done ? 0 : EOPNOTSUPP); } static int vmbus_init(struct vmbus_softc *sc) { int i; for (i = 0; i < nitems(vmbus_version); ++i) { int error; error = vmbus_connect(sc, vmbus_version[i]); if (!error) { sc->vmbus_version = vmbus_version[i]; device_printf(sc->vmbus_dev, "version %u.%u\n", VMBUS_VERSION_MAJOR(sc->vmbus_version), VMBUS_VERSION_MINOR(sc->vmbus_version)); return 0; } } return ENXIO; } static void vmbus_disconnect(struct vmbus_softc *sc) { struct vmbus_chanmsg_disconnect *req; struct vmbus_msghc *mh; int error; mh = vmbus_msghc_get(sc, sizeof(*req)); if (mh == NULL) { device_printf(sc->vmbus_dev, "can not get msg hypercall for disconnect\n"); return; } req = vmbus_msghc_dataptr(mh); req->chm_hdr.chm_type = VMBUS_CHANMSG_TYPE_DISCONNECT; error = vmbus_msghc_exec_noresult(mh); vmbus_msghc_put(sc, mh); if (error) { device_printf(sc->vmbus_dev, "disconnect msg hypercall failed\n"); } } static int vmbus_req_channels(struct vmbus_softc *sc) { struct vmbus_chanmsg_chrequest *req; struct vmbus_msghc *mh; int error; mh = vmbus_msghc_get(sc, sizeof(*req)); if (mh == NULL) return ENXIO; req = vmbus_msghc_dataptr(mh); req->chm_hdr.chm_type = VMBUS_CHANMSG_TYPE_CHREQUEST; error = vmbus_msghc_exec_noresult(mh); vmbus_msghc_put(sc, mh); return error; } static void vmbus_scan_newchan(struct vmbus_softc *sc) { mtx_lock(&sc->vmbus_scan_lock); if ((sc->vmbus_scan_chcnt & VMBUS_SCAN_CHCNT_DONE) == 0) sc->vmbus_scan_chcnt++; mtx_unlock(&sc->vmbus_scan_lock); } static void vmbus_scan_done(struct vmbus_softc *sc, const struct vmbus_message *msg __unused) { mtx_lock(&sc->vmbus_scan_lock); sc->vmbus_scan_chcnt |= VMBUS_SCAN_CHCNT_DONE; mtx_unlock(&sc->vmbus_scan_lock); wakeup(&sc->vmbus_scan_chcnt); } static void vmbus_scan_newdev(struct vmbus_softc *sc) { mtx_lock(&sc->vmbus_scan_lock); sc->vmbus_scan_devcnt++; mtx_unlock(&sc->vmbus_scan_lock); wakeup(&sc->vmbus_scan_devcnt); } static void vmbus_scan_wait(struct vmbus_softc *sc) { uint32_t chancnt; mtx_lock(&sc->vmbus_scan_lock); while ((sc->vmbus_scan_chcnt & VMBUS_SCAN_CHCNT_DONE) == 0) { mtx_sleep(&sc->vmbus_scan_chcnt, &sc->vmbus_scan_lock, 0, "waitch", 0); } chancnt = sc->vmbus_scan_chcnt & ~VMBUS_SCAN_CHCNT_DONE; while (sc->vmbus_scan_devcnt != chancnt) { mtx_sleep(&sc->vmbus_scan_devcnt, &sc->vmbus_scan_lock, 0, "waitdev", 0); } mtx_unlock(&sc->vmbus_scan_lock); } static int vmbus_scan(struct vmbus_softc *sc) { int error; /* * Start vmbus scanning. */ error = vmbus_req_channels(sc); if (error) { device_printf(sc->vmbus_dev, "channel request failed: %d\n", error); return error; } /* * Wait for all devices are added to vmbus. */ vmbus_scan_wait(sc); /* * Identify, probe and attach. */ bus_generic_probe(sc->vmbus_dev); bus_generic_attach(sc->vmbus_dev); if (bootverbose) { device_printf(sc->vmbus_dev, "device scan, probe and attach " "done\n"); } return 0; } static void vmbus_chanmsg_handle(struct vmbus_softc *sc, const struct vmbus_message *msg) { vmbus_chanmsg_proc_t msg_proc; uint32_t msg_type; msg_type = ((const struct vmbus_chanmsg_hdr *)msg->msg_data)->chm_type; if (msg_type >= VMBUS_CHANMSG_TYPE_MAX) { device_printf(sc->vmbus_dev, "unknown message type 0x%x\n", msg_type); return; } msg_proc = vmbus_chanmsg_handlers[msg_type]; if (msg_proc != NULL) msg_proc(sc, msg); /* Channel specific processing */ vmbus_chan_msgproc(sc, msg); } static void vmbus_msg_task(void *xsc, int pending __unused) { struct vmbus_softc *sc = xsc; volatile struct vmbus_message *msg; msg = VMBUS_PCPU_GET(sc, message, curcpu) + VMBUS_SINT_MESSAGE; for (;;) { if (msg->msg_type == HYPERV_MSGTYPE_NONE) { /* No message */ break; } else if (msg->msg_type == HYPERV_MSGTYPE_CHANNEL) { /* Channel message */ vmbus_chanmsg_handle(sc, __DEVOLATILE(const struct vmbus_message *, msg)); } msg->msg_type = HYPERV_MSGTYPE_NONE; /* * Make sure the write to msg_type (i.e. set to * HYPERV_MSGTYPE_NONE) happens before we read the * msg_flags and EOMing. Otherwise, the EOMing will * not deliver any more messages since there is no * empty slot * * NOTE: * mb() is used here, since atomic_thread_fence_seq_cst() * will become compiler fence on UP kernel. */ mb(); if (msg->msg_flags & VMBUS_MSGFLAG_PENDING) { /* * This will cause message queue rescan to possibly * deliver another msg from the hypervisor */ wrmsr(MSR_HV_EOM, 0); } } } static __inline int vmbus_handle_intr1(struct vmbus_softc *sc, struct trapframe *frame, int cpu) { volatile struct vmbus_message *msg; struct vmbus_message *msg_base; msg_base = VMBUS_PCPU_GET(sc, message, cpu); /* * Check event timer. * * TODO: move this to independent IDT vector. */ msg = msg_base + VMBUS_SINT_TIMER; if (msg->msg_type == HYPERV_MSGTYPE_TIMER_EXPIRED) { msg->msg_type = HYPERV_MSGTYPE_NONE; vmbus_et_intr(frame); /* * Make sure the write to msg_type (i.e. set to * HYPERV_MSGTYPE_NONE) happens before we read the * msg_flags and EOMing. Otherwise, the EOMing will * not deliver any more messages since there is no * empty slot * * NOTE: * mb() is used here, since atomic_thread_fence_seq_cst() * will become compiler fence on UP kernel. */ mb(); if (msg->msg_flags & VMBUS_MSGFLAG_PENDING) { /* * This will cause message queue rescan to possibly * deliver another msg from the hypervisor */ wrmsr(MSR_HV_EOM, 0); } } /* * Check events. Hot path for network and storage I/O data; high rate. * * NOTE: * As recommended by the Windows guest fellows, we check events before * checking messages. */ sc->vmbus_event_proc(sc, cpu); /* * Check messages. Mainly management stuffs; ultra low rate. */ msg = msg_base + VMBUS_SINT_MESSAGE; if (__predict_false(msg->msg_type != HYPERV_MSGTYPE_NONE)) { taskqueue_enqueue(VMBUS_PCPU_GET(sc, message_tq, cpu), VMBUS_PCPU_PTR(sc, message_task, cpu)); } return (FILTER_HANDLED); } void vmbus_handle_intr(struct trapframe *trap_frame) { struct vmbus_softc *sc = vmbus_get_softc(); int cpu = curcpu; /* * Disable preemption. */ critical_enter(); /* * Do a little interrupt counting. */ (*VMBUS_PCPU_GET(sc, intr_cnt, cpu))++; vmbus_handle_intr1(sc, trap_frame, cpu); /* * Enable preemption. */ critical_exit(); } static void vmbus_synic_setup(void *xsc) { struct vmbus_softc *sc = xsc; int cpu = curcpu; uint64_t val, orig; uint32_t sint; if (hyperv_features & CPUID_HV_MSR_VP_INDEX) { /* Save virtual processor id. */ VMBUS_PCPU_GET(sc, vcpuid, cpu) = rdmsr(MSR_HV_VP_INDEX); } else { /* Set virtual processor id to 0 for compatibility. */ VMBUS_PCPU_GET(sc, vcpuid, cpu) = 0; } /* * Setup the SynIC message. */ orig = rdmsr(MSR_HV_SIMP); val = MSR_HV_SIMP_ENABLE | (orig & MSR_HV_SIMP_RSVD_MASK) | ((VMBUS_PCPU_GET(sc, message_dma.hv_paddr, cpu) >> PAGE_SHIFT) << MSR_HV_SIMP_PGSHIFT); wrmsr(MSR_HV_SIMP, val); /* * Setup the SynIC event flags. */ orig = rdmsr(MSR_HV_SIEFP); val = MSR_HV_SIEFP_ENABLE | (orig & MSR_HV_SIEFP_RSVD_MASK) | ((VMBUS_PCPU_GET(sc, event_flags_dma.hv_paddr, cpu) >> PAGE_SHIFT) << MSR_HV_SIEFP_PGSHIFT); wrmsr(MSR_HV_SIEFP, val); /* * Configure and unmask SINT for message and event flags. */ sint = MSR_HV_SINT0 + VMBUS_SINT_MESSAGE; orig = rdmsr(sint); val = sc->vmbus_idtvec | MSR_HV_SINT_AUTOEOI | (orig & MSR_HV_SINT_RSVD_MASK); wrmsr(sint, val); /* * Configure and unmask SINT for timer. */ sint = MSR_HV_SINT0 + VMBUS_SINT_TIMER; orig = rdmsr(sint); val = sc->vmbus_idtvec | MSR_HV_SINT_AUTOEOI | (orig & MSR_HV_SINT_RSVD_MASK); wrmsr(sint, val); /* * All done; enable SynIC. */ orig = rdmsr(MSR_HV_SCONTROL); val = MSR_HV_SCTRL_ENABLE | (orig & MSR_HV_SCTRL_RSVD_MASK); wrmsr(MSR_HV_SCONTROL, val); } static void vmbus_synic_teardown(void *arg) { uint64_t orig; uint32_t sint; /* * Disable SynIC. */ orig = rdmsr(MSR_HV_SCONTROL); wrmsr(MSR_HV_SCONTROL, (orig & MSR_HV_SCTRL_RSVD_MASK)); /* * Mask message and event flags SINT. */ sint = MSR_HV_SINT0 + VMBUS_SINT_MESSAGE; orig = rdmsr(sint); wrmsr(sint, orig | MSR_HV_SINT_MASKED); /* * Mask timer SINT. */ sint = MSR_HV_SINT0 + VMBUS_SINT_TIMER; orig = rdmsr(sint); wrmsr(sint, orig | MSR_HV_SINT_MASKED); /* * Teardown SynIC message. */ orig = rdmsr(MSR_HV_SIMP); wrmsr(MSR_HV_SIMP, (orig & MSR_HV_SIMP_RSVD_MASK)); /* * Teardown SynIC event flags. */ orig = rdmsr(MSR_HV_SIEFP); wrmsr(MSR_HV_SIEFP, (orig & MSR_HV_SIEFP_RSVD_MASK)); } static int vmbus_dma_alloc(struct vmbus_softc *sc) { bus_dma_tag_t parent_dtag; uint8_t *evtflags; int cpu; parent_dtag = bus_get_dma_tag(sc->vmbus_dev); CPU_FOREACH(cpu) { void *ptr; /* * Per-cpu messages and event flags. */ ptr = hyperv_dmamem_alloc(parent_dtag, PAGE_SIZE, 0, PAGE_SIZE, VMBUS_PCPU_PTR(sc, message_dma, cpu), BUS_DMA_WAITOK | BUS_DMA_ZERO); if (ptr == NULL) return ENOMEM; VMBUS_PCPU_GET(sc, message, cpu) = ptr; ptr = hyperv_dmamem_alloc(parent_dtag, PAGE_SIZE, 0, PAGE_SIZE, VMBUS_PCPU_PTR(sc, event_flags_dma, cpu), BUS_DMA_WAITOK | BUS_DMA_ZERO); if (ptr == NULL) return ENOMEM; VMBUS_PCPU_GET(sc, event_flags, cpu) = ptr; } evtflags = hyperv_dmamem_alloc(parent_dtag, PAGE_SIZE, 0, PAGE_SIZE, &sc->vmbus_evtflags_dma, BUS_DMA_WAITOK | BUS_DMA_ZERO); if (evtflags == NULL) return ENOMEM; sc->vmbus_rx_evtflags = (u_long *)evtflags; sc->vmbus_tx_evtflags = (u_long *)(evtflags + (PAGE_SIZE / 2)); sc->vmbus_evtflags = evtflags; sc->vmbus_mnf1 = hyperv_dmamem_alloc(parent_dtag, PAGE_SIZE, 0, PAGE_SIZE, &sc->vmbus_mnf1_dma, BUS_DMA_WAITOK | BUS_DMA_ZERO); if (sc->vmbus_mnf1 == NULL) return ENOMEM; sc->vmbus_mnf2 = hyperv_dmamem_alloc(parent_dtag, PAGE_SIZE, 0, sizeof(struct vmbus_mnf), &sc->vmbus_mnf2_dma, BUS_DMA_WAITOK | BUS_DMA_ZERO); if (sc->vmbus_mnf2 == NULL) return ENOMEM; return 0; } static void vmbus_dma_free(struct vmbus_softc *sc) { int cpu; if (sc->vmbus_evtflags != NULL) { hyperv_dmamem_free(&sc->vmbus_evtflags_dma, sc->vmbus_evtflags); sc->vmbus_evtflags = NULL; sc->vmbus_rx_evtflags = NULL; sc->vmbus_tx_evtflags = NULL; } if (sc->vmbus_mnf1 != NULL) { hyperv_dmamem_free(&sc->vmbus_mnf1_dma, sc->vmbus_mnf1); sc->vmbus_mnf1 = NULL; } if (sc->vmbus_mnf2 != NULL) { hyperv_dmamem_free(&sc->vmbus_mnf2_dma, sc->vmbus_mnf2); sc->vmbus_mnf2 = NULL; } CPU_FOREACH(cpu) { if (VMBUS_PCPU_GET(sc, message, cpu) != NULL) { hyperv_dmamem_free( VMBUS_PCPU_PTR(sc, message_dma, cpu), VMBUS_PCPU_GET(sc, message, cpu)); VMBUS_PCPU_GET(sc, message, cpu) = NULL; } if (VMBUS_PCPU_GET(sc, event_flags, cpu) != NULL) { hyperv_dmamem_free( VMBUS_PCPU_PTR(sc, event_flags_dma, cpu), VMBUS_PCPU_GET(sc, event_flags, cpu)); VMBUS_PCPU_GET(sc, event_flags, cpu) = NULL; } } } static int vmbus_intr_setup(struct vmbus_softc *sc) { int cpu; CPU_FOREACH(cpu) { char buf[MAXCOMLEN + 1]; cpuset_t cpu_mask; /* Allocate an interrupt counter for Hyper-V interrupt */ snprintf(buf, sizeof(buf), "cpu%d:hyperv", cpu); intrcnt_add(buf, VMBUS_PCPU_PTR(sc, intr_cnt, cpu)); /* * Setup taskqueue to handle events. Task will be per- * channel. */ VMBUS_PCPU_GET(sc, event_tq, cpu) = taskqueue_create_fast( "hyperv event", M_WAITOK, taskqueue_thread_enqueue, VMBUS_PCPU_PTR(sc, event_tq, cpu)); CPU_SETOF(cpu, &cpu_mask); taskqueue_start_threads_cpuset( VMBUS_PCPU_PTR(sc, event_tq, cpu), 1, PI_NET, &cpu_mask, "hvevent%d", cpu); /* * Setup tasks and taskqueues to handle messages. */ VMBUS_PCPU_GET(sc, message_tq, cpu) = taskqueue_create_fast( "hyperv msg", M_WAITOK, taskqueue_thread_enqueue, VMBUS_PCPU_PTR(sc, message_tq, cpu)); CPU_SETOF(cpu, &cpu_mask); taskqueue_start_threads_cpuset( VMBUS_PCPU_PTR(sc, message_tq, cpu), 1, PI_NET, &cpu_mask, "hvmsg%d", cpu); TASK_INIT(VMBUS_PCPU_PTR(sc, message_task, cpu), 0, vmbus_msg_task, sc); } /* * All Hyper-V ISR required resources are setup, now let's find a * free IDT vector for Hyper-V ISR and set it up. */ sc->vmbus_idtvec = lapic_ipi_alloc(IDTVEC(vmbus_isr)); if (sc->vmbus_idtvec < 0) { device_printf(sc->vmbus_dev, "cannot find free IDT vector\n"); return ENXIO; } if(bootverbose) { device_printf(sc->vmbus_dev, "vmbus IDT vector %d\n", sc->vmbus_idtvec); } return 0; } static void vmbus_intr_teardown(struct vmbus_softc *sc) { int cpu; if (sc->vmbus_idtvec >= 0) { lapic_ipi_free(sc->vmbus_idtvec); sc->vmbus_idtvec = -1; } CPU_FOREACH(cpu) { if (VMBUS_PCPU_GET(sc, event_tq, cpu) != NULL) { taskqueue_free(VMBUS_PCPU_GET(sc, event_tq, cpu)); VMBUS_PCPU_GET(sc, event_tq, cpu) = NULL; } if (VMBUS_PCPU_GET(sc, message_tq, cpu) != NULL) { taskqueue_drain(VMBUS_PCPU_GET(sc, message_tq, cpu), VMBUS_PCPU_PTR(sc, message_task, cpu)); taskqueue_free(VMBUS_PCPU_GET(sc, message_tq, cpu)); VMBUS_PCPU_GET(sc, message_tq, cpu) = NULL; } } } static int vmbus_read_ivar(device_t dev, device_t child, int index, uintptr_t *result) { return (ENOENT); } static int vmbus_child_pnpinfo_str(device_t dev, device_t child, char *buf, size_t buflen) { - const struct hv_vmbus_channel *chan; + const struct vmbus_channel *chan; char guidbuf[HYPERV_GUID_STRLEN]; chan = vmbus_get_channel(child); if (chan == NULL) { /* Event timer device, which does not belong to a channel */ return (0); } strlcat(buf, "classid=", buflen); hyperv_guid2str(&chan->ch_guid_type, guidbuf, sizeof(guidbuf)); strlcat(buf, guidbuf, buflen); strlcat(buf, " deviceid=", buflen); hyperv_guid2str(&chan->ch_guid_inst, guidbuf, sizeof(guidbuf)); strlcat(buf, guidbuf, buflen); return (0); } int -vmbus_add_child(struct hv_vmbus_channel *chan) +vmbus_add_child(struct vmbus_channel *chan) { - struct vmbus_softc *sc = chan->vmbus_sc; + struct vmbus_softc *sc = chan->ch_vmbus; device_t parent = sc->vmbus_dev; int error = 0; /* New channel has been offered */ vmbus_scan_newchan(sc); chan->ch_dev = device_add_child(parent, NULL, -1); if (chan->ch_dev == NULL) { device_printf(parent, "device_add_child for chan%u failed\n", chan->ch_id); error = ENXIO; goto done; } device_set_ivars(chan->ch_dev, chan); done: /* New device has been/should be added to vmbus. */ vmbus_scan_newdev(sc); return error; } int -vmbus_delete_child(struct hv_vmbus_channel *chan) +vmbus_delete_child(struct vmbus_channel *chan) { int error; if (chan->ch_dev == NULL) { /* Failed to add a device. */ return 0; } /* * XXXKYS: Ensure that this is the opposite of * device_add_child() */ mtx_lock(&Giant); - error = device_delete_child(chan->vmbus_sc->vmbus_dev, chan->ch_dev); + error = device_delete_child(chan->ch_vmbus->vmbus_dev, chan->ch_dev); mtx_unlock(&Giant); return error; } static int vmbus_sysctl_version(SYSCTL_HANDLER_ARGS) { struct vmbus_softc *sc = arg1; char verstr[16]; snprintf(verstr, sizeof(verstr), "%u.%u", VMBUS_VERSION_MAJOR(sc->vmbus_version), VMBUS_VERSION_MINOR(sc->vmbus_version)); return sysctl_handle_string(oidp, verstr, sizeof(verstr), req); } static uint32_t vmbus_get_version_method(device_t bus, device_t dev) { struct vmbus_softc *sc = device_get_softc(bus); return sc->vmbus_version; } static int vmbus_probe_guid_method(device_t bus, device_t dev, const struct hyperv_guid *guid) { - const struct hv_vmbus_channel *chan = vmbus_get_channel(dev); + const struct vmbus_channel *chan = vmbus_get_channel(dev); if (memcmp(&chan->ch_guid_type, guid, sizeof(struct hyperv_guid)) == 0) return 0; return ENXIO; } static int vmbus_probe(device_t dev) { char *id[] = { "VMBUS", NULL }; if (ACPI_ID_PROBE(device_get_parent(dev), dev, id) == NULL || device_get_unit(dev) != 0 || vm_guest != VM_GUEST_HV || (hyperv_features & CPUID_HV_MSR_SYNIC) == 0) return (ENXIO); device_set_desc(dev, "Hyper-V Vmbus"); return (BUS_PROBE_DEFAULT); } /** * @brief Main vmbus driver initialization routine. * * Here, we * - initialize the vmbus driver context * - setup various driver entry points * - invoke the vmbus hv main init routine * - get the irq resource * - invoke the vmbus to add the vmbus root device * - setup the vmbus root device * - retrieve the channel offers */ static int vmbus_doattach(struct vmbus_softc *sc) { struct sysctl_oid_list *child; struct sysctl_ctx_list *ctx; int ret; if (sc->vmbus_flags & VMBUS_FLAG_ATTACHED) return (0); sc->vmbus_flags |= VMBUS_FLAG_ATTACHED; mtx_init(&sc->vmbus_scan_lock, "vmbus scan", NULL, MTX_DEF); sc->vmbus_gpadl = VMBUS_GPADL_START; mtx_init(&sc->vmbus_prichan_lock, "vmbus prichan", NULL, MTX_DEF); TAILQ_INIT(&sc->vmbus_prichans); sc->vmbus_chmap = malloc( - sizeof(struct hv_vmbus_channel *) * VMBUS_CHAN_MAX, M_DEVBUF, + sizeof(struct vmbus_channel *) * VMBUS_CHAN_MAX, M_DEVBUF, M_WAITOK | M_ZERO); /* * Create context for "post message" Hypercalls */ sc->vmbus_msg_hc = vmbus_msghc_ctx_create( bus_get_dma_tag(sc->vmbus_dev)); if (sc->vmbus_msg_hc == NULL) { ret = ENXIO; goto cleanup; } /* * Allocate DMA stuffs. */ ret = vmbus_dma_alloc(sc); if (ret != 0) goto cleanup; /* * Setup interrupt. */ ret = vmbus_intr_setup(sc); if (ret != 0) goto cleanup; /* * Setup SynIC. */ if (bootverbose) device_printf(sc->vmbus_dev, "smp_started = %d\n", smp_started); smp_rendezvous(NULL, vmbus_synic_setup, NULL, sc); sc->vmbus_flags |= VMBUS_FLAG_SYNIC; /* * Initialize vmbus, e.g. connect to Hypervisor. */ ret = vmbus_init(sc); if (ret != 0) goto cleanup; if (sc->vmbus_version == VMBUS_VERSION_WS2008 || sc->vmbus_version == VMBUS_VERSION_WIN7) sc->vmbus_event_proc = vmbus_event_proc_compat; else sc->vmbus_event_proc = vmbus_event_proc; ret = vmbus_scan(sc); if (ret != 0) goto cleanup; ctx = device_get_sysctl_ctx(sc->vmbus_dev); child = SYSCTL_CHILDREN(device_get_sysctl_tree(sc->vmbus_dev)); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "version", CTLTYPE_STRING | CTLFLAG_RD | CTLFLAG_MPSAFE, sc, 0, vmbus_sysctl_version, "A", "vmbus version"); return (ret); cleanup: vmbus_intr_teardown(sc); vmbus_dma_free(sc); if (sc->vmbus_msg_hc != NULL) { vmbus_msghc_ctx_destroy(sc->vmbus_msg_hc); sc->vmbus_msg_hc = NULL; } free(sc->vmbus_chmap, M_DEVBUF); mtx_destroy(&sc->vmbus_scan_lock); mtx_destroy(&sc->vmbus_prichan_lock); return (ret); } static void vmbus_event_proc_dummy(struct vmbus_softc *sc __unused, int cpu __unused) { } static int vmbus_attach(device_t dev) { vmbus_sc = device_get_softc(dev); vmbus_sc->vmbus_dev = dev; vmbus_sc->vmbus_idtvec = -1; /* * Event processing logic will be configured: * - After the vmbus protocol version negotiation. * - Before we request channel offers. */ vmbus_sc->vmbus_event_proc = vmbus_event_proc_dummy; #ifndef EARLY_AP_STARTUP /* * If the system has already booted and thread * scheduling is possible indicated by the global * cold set to zero, we just call the driver * initialization directly. */ if (!cold) #endif vmbus_doattach(vmbus_sc); return (0); } static void vmbus_sysinit(void *arg __unused) { struct vmbus_softc *sc = vmbus_get_softc(); if (vm_guest != VM_GUEST_HV || sc == NULL) return; #ifndef EARLY_AP_STARTUP /* * If the system has already booted and thread * scheduling is possible, as indicated by the * global cold set to zero, we just call the driver * initialization directly. */ if (!cold) #endif vmbus_doattach(sc); } static int vmbus_detach(device_t dev) { struct vmbus_softc *sc = device_get_softc(dev); vmbus_chan_destroy_all(sc); vmbus_disconnect(sc); if (sc->vmbus_flags & VMBUS_FLAG_SYNIC) { sc->vmbus_flags &= ~VMBUS_FLAG_SYNIC; smp_rendezvous(NULL, vmbus_synic_teardown, NULL, NULL); } vmbus_intr_teardown(sc); vmbus_dma_free(sc); if (sc->vmbus_msg_hc != NULL) { vmbus_msghc_ctx_destroy(sc->vmbus_msg_hc); sc->vmbus_msg_hc = NULL; } free(sc->vmbus_chmap, M_DEVBUF); mtx_destroy(&sc->vmbus_scan_lock); mtx_destroy(&sc->vmbus_prichan_lock); return (0); } static device_method_t vmbus_methods[] = { /* Device interface */ DEVMETHOD(device_probe, vmbus_probe), DEVMETHOD(device_attach, vmbus_attach), DEVMETHOD(device_detach, vmbus_detach), DEVMETHOD(device_shutdown, bus_generic_shutdown), DEVMETHOD(device_suspend, bus_generic_suspend), DEVMETHOD(device_resume, bus_generic_resume), /* Bus interface */ DEVMETHOD(bus_add_child, bus_generic_add_child), DEVMETHOD(bus_print_child, bus_generic_print_child), DEVMETHOD(bus_read_ivar, vmbus_read_ivar), DEVMETHOD(bus_child_pnpinfo_str, vmbus_child_pnpinfo_str), /* Vmbus interface */ DEVMETHOD(vmbus_get_version, vmbus_get_version_method), DEVMETHOD(vmbus_probe_guid, vmbus_probe_guid_method), DEVMETHOD_END }; static driver_t vmbus_driver = { "vmbus", vmbus_methods, sizeof(struct vmbus_softc) }; static devclass_t vmbus_devclass; DRIVER_MODULE(vmbus, acpi, vmbus_driver, vmbus_devclass, NULL, NULL); MODULE_DEPEND(vmbus, acpi, 1, 1, 1); MODULE_VERSION(vmbus, 1); #ifndef EARLY_AP_STARTUP /* * NOTE: * We have to start as the last step of SI_SUB_SMP, i.e. after SMP is * initialized. */ SYSINIT(vmbus_initialize, SI_SUB_SMP, SI_ORDER_ANY, vmbus_sysinit, NULL); #endif Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/vmbus_chan.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/vmbus_chan.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/vmbus_chan.c (revision 303206) @@ -1,1380 +1,1404 @@ /*- * Copyright (c) 2009-2012,2016 Microsoft Corp. * Copyright (c) 2012 NetApp Inc. * Copyright (c) 2012 Citrix Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include -static void vmbus_chan_signal_tx(struct hv_vmbus_channel *chan); static void vmbus_chan_update_evtflagcnt(struct vmbus_softc *, - const struct hv_vmbus_channel *); + const struct vmbus_channel *); static void vmbus_chan_task(void *, int); static void vmbus_chan_task_nobatch(void *, int); static void vmbus_chan_detach_task(void *, int); static void vmbus_chan_msgproc_choffer(struct vmbus_softc *, const struct vmbus_message *); static void vmbus_chan_msgproc_chrescind(struct vmbus_softc *, const struct vmbus_message *); /* * Vmbus channel message processing. */ static const vmbus_chanmsg_proc_t vmbus_chan_msgprocs[VMBUS_CHANMSG_TYPE_MAX] = { VMBUS_CHANMSG_PROC(CHOFFER, vmbus_chan_msgproc_choffer), VMBUS_CHANMSG_PROC(CHRESCIND, vmbus_chan_msgproc_chrescind), VMBUS_CHANMSG_PROC_WAKEUP(CHOPEN_RESP), VMBUS_CHANMSG_PROC_WAKEUP(GPADL_CONNRESP), VMBUS_CHANMSG_PROC_WAKEUP(GPADL_DISCONNRESP) }; -/** - * @brief Trigger an event notification on the specified channel +/* + * Notify host that there are data pending on our TX bufring. */ -static void -vmbus_chan_signal_tx(struct hv_vmbus_channel *chan) +static __inline void +vmbus_chan_signal_tx(const struct vmbus_channel *chan) { - struct vmbus_softc *sc = chan->vmbus_sc; - uint32_t chanid = chan->ch_id; - - atomic_set_long(&sc->vmbus_tx_evtflags[chanid >> VMBUS_EVTFLAG_SHIFT], - 1UL << (chanid & VMBUS_EVTFLAG_MASK)); - - if (chan->ch_flags & VMBUS_CHAN_FLAG_HASMNF) { - atomic_set_int( - &sc->vmbus_mnf2->mnf_trigs[chan->ch_montrig_idx].mt_pending, - chan->ch_montrig_mask); - } else { + atomic_set_long(chan->ch_evtflag, chan->ch_evtflag_mask); + if (chan->ch_txflags & VMBUS_CHAN_TXF_HASMNF) + atomic_set_int(chan->ch_montrig, chan->ch_montrig_mask); + else hypercall_signal_event(chan->ch_monprm_dma.hv_paddr); - } } static int vmbus_chan_sysctl_mnf(SYSCTL_HANDLER_ARGS) { - struct hv_vmbus_channel *chan = arg1; + struct vmbus_channel *chan = arg1; int mnf = 0; - if (chan->ch_flags & VMBUS_CHAN_FLAG_HASMNF) + if (chan->ch_txflags & VMBUS_CHAN_TXF_HASMNF) mnf = 1; return sysctl_handle_int(oidp, &mnf, 0, req); } static void -vmbus_chan_sysctl_create(struct hv_vmbus_channel *chan) +vmbus_chan_sysctl_create(struct vmbus_channel *chan) { struct sysctl_oid *ch_tree, *chid_tree, *br_tree; struct sysctl_ctx_list *ctx; uint32_t ch_id; char name[16]; /* * Add sysctl nodes related to this channel to this * channel's sysctl ctx, so that they can be destroyed * independently upon close of this channel, which can * happen even if the device is not detached. */ ctx = &chan->ch_sysctl_ctx; sysctl_ctx_init(ctx); /* * Create dev.NAME.UNIT.channel tree. */ ch_tree = SYSCTL_ADD_NODE(ctx, SYSCTL_CHILDREN(device_get_sysctl_tree(chan->ch_dev)), OID_AUTO, "channel", CTLFLAG_RD | CTLFLAG_MPSAFE, 0, ""); if (ch_tree == NULL) return; /* * Create dev.NAME.UNIT.channel.CHANID tree. */ if (VMBUS_CHAN_ISPRIMARY(chan)) ch_id = chan->ch_id; else ch_id = chan->ch_prichan->ch_id; snprintf(name, sizeof(name), "%d", ch_id); chid_tree = SYSCTL_ADD_NODE(ctx, SYSCTL_CHILDREN(ch_tree), OID_AUTO, name, CTLFLAG_RD | CTLFLAG_MPSAFE, 0, ""); if (chid_tree == NULL) return; if (!VMBUS_CHAN_ISPRIMARY(chan)) { /* * Create dev.NAME.UNIT.channel.CHANID.sub tree. */ ch_tree = SYSCTL_ADD_NODE(ctx, SYSCTL_CHILDREN(chid_tree), OID_AUTO, "sub", CTLFLAG_RD | CTLFLAG_MPSAFE, 0, ""); if (ch_tree == NULL) return; /* * Create dev.NAME.UNIT.channel.CHANID.sub.SUBIDX tree. * * NOTE: * chid_tree is changed to this new sysctl tree. */ snprintf(name, sizeof(name), "%d", chan->ch_subidx); chid_tree = SYSCTL_ADD_NODE(ctx, SYSCTL_CHILDREN(ch_tree), OID_AUTO, name, CTLFLAG_RD | CTLFLAG_MPSAFE, 0, ""); if (chid_tree == NULL) return; SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(chid_tree), OID_AUTO, "chanid", CTLFLAG_RD, &chan->ch_id, 0, "channel id"); } SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(chid_tree), OID_AUTO, "cpu", CTLFLAG_RD, &chan->ch_cpuid, 0, "owner CPU id"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(chid_tree), OID_AUTO, "mnf", CTLTYPE_INT | CTLFLAG_RD | CTLFLAG_MPSAFE, chan, 0, vmbus_chan_sysctl_mnf, "I", "has monitor notification facilities"); - /* - * Create sysctl tree for RX bufring. - */ br_tree = SYSCTL_ADD_NODE(ctx, SYSCTL_CHILDREN(chid_tree), OID_AUTO, - "in", CTLFLAG_RD | CTLFLAG_MPSAFE, 0, ""); + "br", CTLFLAG_RD | CTLFLAG_MPSAFE, 0, ""); if (br_tree != NULL) { - hv_ring_buffer_stat(ctx, SYSCTL_CHILDREN(br_tree), - &chan->inbound, "inbound ring buffer stats"); + /* + * Create sysctl tree for RX bufring. + */ + vmbus_br_sysctl_create(ctx, br_tree, &chan->ch_rxbr, "rx"); + /* + * Create sysctl tree for TX bufring. + */ + vmbus_br_sysctl_create(ctx, br_tree, &chan->ch_txbr, "tx"); } - - /* - * Create sysctl tree for TX bufring. - */ - br_tree = SYSCTL_ADD_NODE(ctx, SYSCTL_CHILDREN(chid_tree), OID_AUTO, - "out", CTLFLAG_RD | CTLFLAG_MPSAFE, 0, ""); - if (br_tree != NULL) { - hv_ring_buffer_stat(ctx, SYSCTL_CHILDREN(br_tree), - &chan->outbound, "outbound ring buffer stats"); - } } int -vmbus_chan_open(struct hv_vmbus_channel *chan, int txbr_size, int rxbr_size, +vmbus_chan_open(struct vmbus_channel *chan, int txbr_size, int rxbr_size, const void *udata, int udlen, vmbus_chan_callback_t cb, void *cbarg) { - struct vmbus_softc *sc = chan->vmbus_sc; + struct vmbus_softc *sc = chan->ch_vmbus; const struct vmbus_chanmsg_chopen_resp *resp; const struct vmbus_message *msg; struct vmbus_chanmsg_chopen *req; struct vmbus_msghc *mh; uint32_t status; int error; uint8_t *br; if (udlen > VMBUS_CHANMSG_CHOPEN_UDATA_SIZE) { device_printf(sc->vmbus_dev, "invalid udata len %d for chan%u\n", udlen, chan->ch_id); return EINVAL; } KASSERT((txbr_size & PAGE_MASK) == 0, ("send bufring size is not multiple page")); KASSERT((rxbr_size & PAGE_MASK) == 0, ("recv bufring size is not multiple page")); if (atomic_testandset_int(&chan->ch_stflags, VMBUS_CHAN_ST_OPENED_SHIFT)) panic("double-open chan%u", chan->ch_id); chan->ch_cb = cb; chan->ch_cbarg = cbarg; vmbus_chan_update_evtflagcnt(sc, chan); - chan->ch_tq = VMBUS_PCPU_GET(chan->vmbus_sc, event_tq, chan->ch_cpuid); + chan->ch_tq = VMBUS_PCPU_GET(chan->ch_vmbus, event_tq, chan->ch_cpuid); if (chan->ch_flags & VMBUS_CHAN_FLAG_BATCHREAD) TASK_INIT(&chan->ch_task, 0, vmbus_chan_task, chan); else TASK_INIT(&chan->ch_task, 0, vmbus_chan_task_nobatch, chan); /* * Allocate the TX+RX bufrings. * XXX should use ch_dev dtag */ br = hyperv_dmamem_alloc(bus_get_dma_tag(sc->vmbus_dev), PAGE_SIZE, 0, txbr_size + rxbr_size, &chan->ch_bufring_dma, BUS_DMA_WAITOK | BUS_DMA_ZERO); if (br == NULL) { device_printf(sc->vmbus_dev, "bufring allocation failed\n"); error = ENOMEM; goto failed; } chan->ch_bufring = br; /* TX bufring comes first */ - hv_vmbus_ring_buffer_init(&chan->outbound, br, txbr_size); + hv_vmbus_ring_buffer_init(&chan->ch_txbr, br, txbr_size); /* RX bufring immediately follows TX bufring */ - hv_vmbus_ring_buffer_init(&chan->inbound, br + txbr_size, rxbr_size); + hv_vmbus_ring_buffer_init(&chan->ch_rxbr, br + txbr_size, rxbr_size); /* Create sysctl tree for this channel */ vmbus_chan_sysctl_create(chan); /* * Connect the bufrings, both RX and TX, to this channel. */ error = vmbus_chan_gpadl_connect(chan, chan->ch_bufring_dma.hv_paddr, txbr_size + rxbr_size, &chan->ch_bufring_gpadl); if (error) { device_printf(sc->vmbus_dev, "failed to connect bufring GPADL to chan%u\n", chan->ch_id); goto failed; } /* * Open channel w/ the bufring GPADL on the target CPU. */ mh = vmbus_msghc_get(sc, sizeof(*req)); if (mh == NULL) { device_printf(sc->vmbus_dev, "can not get msg hypercall for chopen(chan%u)\n", chan->ch_id); error = ENXIO; goto failed; } req = vmbus_msghc_dataptr(mh); req->chm_hdr.chm_type = VMBUS_CHANMSG_TYPE_CHOPEN; req->chm_chanid = chan->ch_id; req->chm_openid = chan->ch_id; req->chm_gpadl = chan->ch_bufring_gpadl; req->chm_vcpuid = chan->ch_vcpuid; req->chm_txbr_pgcnt = txbr_size >> PAGE_SHIFT; if (udlen > 0) memcpy(req->chm_udata, udata, udlen); error = vmbus_msghc_exec(sc, mh); if (error) { device_printf(sc->vmbus_dev, "chopen(chan%u) msg hypercall exec failed: %d\n", chan->ch_id, error); vmbus_msghc_put(sc, mh); goto failed; } msg = vmbus_msghc_wait_result(sc, mh); resp = (const struct vmbus_chanmsg_chopen_resp *)msg->msg_data; status = resp->chm_status; vmbus_msghc_put(sc, mh); if (status == 0) { if (bootverbose) { device_printf(sc->vmbus_dev, "chan%u opened\n", chan->ch_id); } return 0; } device_printf(sc->vmbus_dev, "failed to open chan%u\n", chan->ch_id); error = ENXIO; failed: if (chan->ch_bufring_gpadl) { vmbus_chan_gpadl_disconnect(chan, chan->ch_bufring_gpadl); chan->ch_bufring_gpadl = 0; } if (chan->ch_bufring != NULL) { hyperv_dmamem_free(&chan->ch_bufring_dma, chan->ch_bufring); chan->ch_bufring = NULL; } atomic_clear_int(&chan->ch_stflags, VMBUS_CHAN_ST_OPENED); return error; } int -vmbus_chan_gpadl_connect(struct hv_vmbus_channel *chan, bus_addr_t paddr, +vmbus_chan_gpadl_connect(struct vmbus_channel *chan, bus_addr_t paddr, int size, uint32_t *gpadl0) { - struct vmbus_softc *sc = chan->vmbus_sc; + struct vmbus_softc *sc = chan->ch_vmbus; struct vmbus_msghc *mh; struct vmbus_chanmsg_gpadl_conn *req; const struct vmbus_message *msg; size_t reqsz; uint32_t gpadl, status; int page_count, range_len, i, cnt, error; uint64_t page_id; /* * Preliminary checks. */ KASSERT((size & PAGE_MASK) == 0, ("invalid GPA size %d, not multiple page size", size)); page_count = size >> PAGE_SHIFT; KASSERT((paddr & PAGE_MASK) == 0, ("GPA is not page aligned %jx", (uintmax_t)paddr)); page_id = paddr >> PAGE_SHIFT; range_len = __offsetof(struct vmbus_gpa_range, gpa_page[page_count]); /* * We don't support multiple GPA ranges. */ if (range_len > UINT16_MAX) { device_printf(sc->vmbus_dev, "GPA too large, %d pages\n", page_count); return EOPNOTSUPP; } /* * Allocate GPADL id. */ gpadl = vmbus_gpadl_alloc(sc); *gpadl0 = gpadl; /* * Connect this GPADL to the target channel. * * NOTE: * Since each message can only hold small set of page * addresses, several messages may be required to * complete the connection. */ if (page_count > VMBUS_CHANMSG_GPADL_CONN_PGMAX) cnt = VMBUS_CHANMSG_GPADL_CONN_PGMAX; else cnt = page_count; page_count -= cnt; reqsz = __offsetof(struct vmbus_chanmsg_gpadl_conn, chm_range.gpa_page[cnt]); mh = vmbus_msghc_get(sc, reqsz); if (mh == NULL) { device_printf(sc->vmbus_dev, "can not get msg hypercall for gpadl->chan%u\n", chan->ch_id); return EIO; } req = vmbus_msghc_dataptr(mh); req->chm_hdr.chm_type = VMBUS_CHANMSG_TYPE_GPADL_CONN; req->chm_chanid = chan->ch_id; req->chm_gpadl = gpadl; req->chm_range_len = range_len; req->chm_range_cnt = 1; req->chm_range.gpa_len = size; req->chm_range.gpa_ofs = 0; for (i = 0; i < cnt; ++i) req->chm_range.gpa_page[i] = page_id++; error = vmbus_msghc_exec(sc, mh); if (error) { device_printf(sc->vmbus_dev, "gpadl->chan%u msg hypercall exec failed: %d\n", chan->ch_id, error); vmbus_msghc_put(sc, mh); return error; } while (page_count > 0) { struct vmbus_chanmsg_gpadl_subconn *subreq; if (page_count > VMBUS_CHANMSG_GPADL_SUBCONN_PGMAX) cnt = VMBUS_CHANMSG_GPADL_SUBCONN_PGMAX; else cnt = page_count; page_count -= cnt; reqsz = __offsetof(struct vmbus_chanmsg_gpadl_subconn, chm_gpa_page[cnt]); vmbus_msghc_reset(mh, reqsz); subreq = vmbus_msghc_dataptr(mh); subreq->chm_hdr.chm_type = VMBUS_CHANMSG_TYPE_GPADL_SUBCONN; subreq->chm_gpadl = gpadl; for (i = 0; i < cnt; ++i) subreq->chm_gpa_page[i] = page_id++; vmbus_msghc_exec_noresult(mh); } KASSERT(page_count == 0, ("invalid page count %d", page_count)); msg = vmbus_msghc_wait_result(sc, mh); status = ((const struct vmbus_chanmsg_gpadl_connresp *) msg->msg_data)->chm_status; vmbus_msghc_put(sc, mh); if (status != 0) { device_printf(sc->vmbus_dev, "gpadl->chan%u failed: " "status %u\n", chan->ch_id, status); return EIO; } else { if (bootverbose) { device_printf(sc->vmbus_dev, "gpadl->chan%u " "succeeded\n", chan->ch_id); } } return 0; } /* * Disconnect the GPA from the target channel */ int -vmbus_chan_gpadl_disconnect(struct hv_vmbus_channel *chan, uint32_t gpadl) +vmbus_chan_gpadl_disconnect(struct vmbus_channel *chan, uint32_t gpadl) { - struct vmbus_softc *sc = chan->vmbus_sc; + struct vmbus_softc *sc = chan->ch_vmbus; struct vmbus_msghc *mh; struct vmbus_chanmsg_gpadl_disconn *req; int error; mh = vmbus_msghc_get(sc, sizeof(*req)); if (mh == NULL) { device_printf(sc->vmbus_dev, "can not get msg hypercall for gpa x->chan%u\n", chan->ch_id); return EBUSY; } req = vmbus_msghc_dataptr(mh); req->chm_hdr.chm_type = VMBUS_CHANMSG_TYPE_GPADL_DISCONN; req->chm_chanid = chan->ch_id; req->chm_gpadl = gpadl; error = vmbus_msghc_exec(sc, mh); if (error) { device_printf(sc->vmbus_dev, "gpa x->chan%u msg hypercall exec failed: %d\n", chan->ch_id, error); vmbus_msghc_put(sc, mh); return error; } vmbus_msghc_wait_result(sc, mh); /* Discard result; no useful information */ vmbus_msghc_put(sc, mh); return 0; } static void -vmbus_chan_close_internal(struct hv_vmbus_channel *chan) +vmbus_chan_close_internal(struct vmbus_channel *chan) { - struct vmbus_softc *sc = chan->vmbus_sc; + struct vmbus_softc *sc = chan->ch_vmbus; struct vmbus_msghc *mh; struct vmbus_chanmsg_chclose *req; struct taskqueue *tq = chan->ch_tq; int error; /* TODO: stringent check */ atomic_clear_int(&chan->ch_stflags, VMBUS_CHAN_ST_OPENED); /* * Free this channel's sysctl tree attached to its device's * sysctl tree. */ sysctl_ctx_free(&chan->ch_sysctl_ctx); /* * Set ch_tq to NULL to avoid more requests be scheduled. * XXX pretty broken; need rework. */ chan->ch_tq = NULL; taskqueue_drain(tq, &chan->ch_task); chan->ch_cb = NULL; /* * Close this channel. */ mh = vmbus_msghc_get(sc, sizeof(*req)); if (mh == NULL) { device_printf(sc->vmbus_dev, "can not get msg hypercall for chclose(chan%u)\n", chan->ch_id); return; } req = vmbus_msghc_dataptr(mh); req->chm_hdr.chm_type = VMBUS_CHANMSG_TYPE_CHCLOSE; req->chm_chanid = chan->ch_id; error = vmbus_msghc_exec_noresult(mh); vmbus_msghc_put(sc, mh); if (error) { device_printf(sc->vmbus_dev, "chclose(chan%u) msg hypercall exec failed: %d\n", chan->ch_id, error); return; } else if (bootverbose) { device_printf(sc->vmbus_dev, "close chan%u\n", chan->ch_id); } /* * Disconnect the TX+RX bufrings from this channel. */ if (chan->ch_bufring_gpadl) { vmbus_chan_gpadl_disconnect(chan, chan->ch_bufring_gpadl); chan->ch_bufring_gpadl = 0; } /* * Destroy the TX+RX bufrings. */ - hv_ring_buffer_cleanup(&chan->outbound); - hv_ring_buffer_cleanup(&chan->inbound); + hv_ring_buffer_cleanup(&chan->ch_txbr); + hv_ring_buffer_cleanup(&chan->ch_rxbr); if (chan->ch_bufring != NULL) { hyperv_dmamem_free(&chan->ch_bufring_dma, chan->ch_bufring); chan->ch_bufring = NULL; } } /* * Caller should make sure that all sub-channels have * been added to 'chan' and all to-be-closed channels * are not being opened. */ void -vmbus_chan_close(struct hv_vmbus_channel *chan) +vmbus_chan_close(struct vmbus_channel *chan) { int subchan_cnt; if (!VMBUS_CHAN_ISPRIMARY(chan)) { /* * Sub-channel is closed when its primary channel * is closed; done. */ return; } /* * Close all sub-channels, if any. */ subchan_cnt = chan->ch_subchan_cnt; if (subchan_cnt > 0) { - struct hv_vmbus_channel **subchan; + struct vmbus_channel **subchan; int i; subchan = vmbus_subchan_get(chan, subchan_cnt); for (i = 0; i < subchan_cnt; ++i) vmbus_chan_close_internal(subchan[i]); vmbus_subchan_rel(subchan, subchan_cnt); } /* Then close the primary channel. */ vmbus_chan_close_internal(chan); } int -vmbus_chan_send(struct hv_vmbus_channel *chan, uint16_t type, uint16_t flags, +vmbus_chan_send(struct vmbus_channel *chan, uint16_t type, uint16_t flags, void *data, int dlen, uint64_t xactid) { struct vmbus_chanpkt pkt; int pktlen, pad_pktlen, hlen, error; uint64_t pad = 0; struct iovec iov[3]; boolean_t send_evt; hlen = sizeof(pkt); pktlen = hlen + dlen; pad_pktlen = VMBUS_CHANPKT_TOTLEN(pktlen); pkt.cp_hdr.cph_type = type; pkt.cp_hdr.cph_flags = flags; VMBUS_CHANPKT_SETLEN(pkt.cp_hdr.cph_hlen, hlen); VMBUS_CHANPKT_SETLEN(pkt.cp_hdr.cph_tlen, pad_pktlen); pkt.cp_hdr.cph_xactid = xactid; iov[0].iov_base = &pkt; iov[0].iov_len = hlen; iov[1].iov_base = data; iov[1].iov_len = dlen; iov[2].iov_base = &pad; iov[2].iov_len = pad_pktlen - pktlen; - error = hv_ring_buffer_write(&chan->outbound, iov, 3, &send_evt); + error = hv_ring_buffer_write(&chan->ch_txbr, iov, 3, &send_evt); if (!error && send_evt) vmbus_chan_signal_tx(chan); return error; } int -vmbus_chan_send_sglist(struct hv_vmbus_channel *chan, +vmbus_chan_send_sglist(struct vmbus_channel *chan, struct vmbus_gpa sg[], int sglen, void *data, int dlen, uint64_t xactid) { struct vmbus_chanpkt_sglist pkt; int pktlen, pad_pktlen, hlen, error; struct iovec iov[4]; boolean_t send_evt; uint64_t pad = 0; KASSERT(sglen < VMBUS_CHAN_SGLIST_MAX, ("invalid sglist len %d", sglen)); hlen = __offsetof(struct vmbus_chanpkt_sglist, cp_gpa[sglen]); pktlen = hlen + dlen; pad_pktlen = VMBUS_CHANPKT_TOTLEN(pktlen); pkt.cp_hdr.cph_type = VMBUS_CHANPKT_TYPE_GPA; pkt.cp_hdr.cph_flags = VMBUS_CHANPKT_FLAG_RC; VMBUS_CHANPKT_SETLEN(pkt.cp_hdr.cph_hlen, hlen); VMBUS_CHANPKT_SETLEN(pkt.cp_hdr.cph_tlen, pad_pktlen); pkt.cp_hdr.cph_xactid = xactid; pkt.cp_rsvd = 0; pkt.cp_gpa_cnt = sglen; iov[0].iov_base = &pkt; iov[0].iov_len = sizeof(pkt); iov[1].iov_base = sg; iov[1].iov_len = sizeof(struct vmbus_gpa) * sglen; iov[2].iov_base = data; iov[2].iov_len = dlen; iov[3].iov_base = &pad; iov[3].iov_len = pad_pktlen - pktlen; - error = hv_ring_buffer_write(&chan->outbound, iov, 4, &send_evt); + error = hv_ring_buffer_write(&chan->ch_txbr, iov, 4, &send_evt); if (!error && send_evt) vmbus_chan_signal_tx(chan); return error; } int -vmbus_chan_send_prplist(struct hv_vmbus_channel *chan, +vmbus_chan_send_prplist(struct vmbus_channel *chan, struct vmbus_gpa_range *prp, int prp_cnt, void *data, int dlen, uint64_t xactid) { struct vmbus_chanpkt_prplist pkt; int pktlen, pad_pktlen, hlen, error; struct iovec iov[4]; boolean_t send_evt; uint64_t pad = 0; KASSERT(prp_cnt < VMBUS_CHAN_PRPLIST_MAX, ("invalid prplist entry count %d", prp_cnt)); hlen = __offsetof(struct vmbus_chanpkt_prplist, cp_range[0].gpa_page[prp_cnt]); pktlen = hlen + dlen; pad_pktlen = VMBUS_CHANPKT_TOTLEN(pktlen); pkt.cp_hdr.cph_type = VMBUS_CHANPKT_TYPE_GPA; pkt.cp_hdr.cph_flags = VMBUS_CHANPKT_FLAG_RC; VMBUS_CHANPKT_SETLEN(pkt.cp_hdr.cph_hlen, hlen); VMBUS_CHANPKT_SETLEN(pkt.cp_hdr.cph_tlen, pad_pktlen); pkt.cp_hdr.cph_xactid = xactid; pkt.cp_rsvd = 0; pkt.cp_range_cnt = 1; iov[0].iov_base = &pkt; iov[0].iov_len = sizeof(pkt); iov[1].iov_base = prp; iov[1].iov_len = __offsetof(struct vmbus_gpa_range, gpa_page[prp_cnt]); iov[2].iov_base = data; iov[2].iov_len = dlen; iov[3].iov_base = &pad; iov[3].iov_len = pad_pktlen - pktlen; - error = hv_ring_buffer_write(&chan->outbound, iov, 4, &send_evt); + error = hv_ring_buffer_write(&chan->ch_txbr, iov, 4, &send_evt); if (!error && send_evt) vmbus_chan_signal_tx(chan); return error; } int -vmbus_chan_recv(struct hv_vmbus_channel *chan, void *data, int *dlen0, +vmbus_chan_recv(struct vmbus_channel *chan, void *data, int *dlen0, uint64_t *xactid) { struct vmbus_chanpkt_hdr pkt; int error, dlen, hlen; - error = hv_ring_buffer_peek(&chan->inbound, &pkt, sizeof(pkt)); + error = hv_ring_buffer_peek(&chan->ch_rxbr, &pkt, sizeof(pkt)); if (error) return error; hlen = VMBUS_CHANPKT_GETLEN(pkt.cph_hlen); dlen = VMBUS_CHANPKT_GETLEN(pkt.cph_tlen) - hlen; if (*dlen0 < dlen) { /* Return the size of this packet's data. */ *dlen0 = dlen; return ENOBUFS; } *xactid = pkt.cph_xactid; *dlen0 = dlen; /* Skip packet header */ - error = hv_ring_buffer_read(&chan->inbound, data, dlen, hlen); + error = hv_ring_buffer_read(&chan->ch_rxbr, data, dlen, hlen); KASSERT(!error, ("hv_ring_buffer_read failed")); return 0; } int -vmbus_chan_recv_pkt(struct hv_vmbus_channel *chan, +vmbus_chan_recv_pkt(struct vmbus_channel *chan, struct vmbus_chanpkt_hdr *pkt0, int *pktlen0) { struct vmbus_chanpkt_hdr pkt; int error, pktlen; - error = hv_ring_buffer_peek(&chan->inbound, &pkt, sizeof(pkt)); + error = hv_ring_buffer_peek(&chan->ch_rxbr, &pkt, sizeof(pkt)); if (error) return error; pktlen = VMBUS_CHANPKT_GETLEN(pkt.cph_tlen); if (*pktlen0 < pktlen) { /* Return the size of this packet. */ *pktlen0 = pktlen; return ENOBUFS; } *pktlen0 = pktlen; /* Include packet header */ - error = hv_ring_buffer_read(&chan->inbound, pkt0, pktlen, 0); + error = hv_ring_buffer_read(&chan->ch_rxbr, pkt0, pktlen, 0); KASSERT(!error, ("hv_ring_buffer_read failed")); return 0; } static void vmbus_chan_task(void *xchan, int pending __unused) { - struct hv_vmbus_channel *chan = xchan; + struct vmbus_channel *chan = xchan; vmbus_chan_callback_t cb = chan->ch_cb; void *cbarg = chan->ch_cbarg; /* * Optimize host to guest signaling by ensuring: * 1. While reading the channel, we disable interrupts from * host. * 2. Ensure that we process all posted messages from the host * before returning from this callback. * 3. Once we return, enable signaling from the host. Once this * state is set we check to see if additional packets are * available to read. In this case we repeat the process. * * NOTE: Interrupt has been disabled in the ISR. */ for (;;) { uint32_t left; - cb(cbarg); + cb(chan, cbarg); - left = hv_ring_buffer_read_end(&chan->inbound); + left = hv_ring_buffer_read_end(&chan->ch_rxbr); if (left == 0) { /* No more data in RX bufring; done */ break; } - hv_ring_buffer_read_begin(&chan->inbound); + hv_ring_buffer_read_begin(&chan->ch_rxbr); } } static void vmbus_chan_task_nobatch(void *xchan, int pending __unused) { - struct hv_vmbus_channel *chan = xchan; + struct vmbus_channel *chan = xchan; - chan->ch_cb(chan->ch_cbarg); + chan->ch_cb(chan, chan->ch_cbarg); } static __inline void vmbus_event_flags_proc(struct vmbus_softc *sc, volatile u_long *event_flags, int flag_cnt) { int f; for (f = 0; f < flag_cnt; ++f) { uint32_t chid_base; u_long flags; int chid_ofs; if (event_flags[f] == 0) continue; flags = atomic_swap_long(&event_flags[f], 0); chid_base = f << VMBUS_EVTFLAG_SHIFT; while ((chid_ofs = ffsl(flags)) != 0) { - struct hv_vmbus_channel *chan; + struct vmbus_channel *chan; --chid_ofs; /* NOTE: ffsl is 1-based */ flags &= ~(1UL << chid_ofs); chan = sc->vmbus_chmap[chid_base + chid_ofs]; /* if channel is closed or closing */ if (chan == NULL || chan->ch_tq == NULL) continue; if (chan->ch_flags & VMBUS_CHAN_FLAG_BATCHREAD) - hv_ring_buffer_read_begin(&chan->inbound); + hv_ring_buffer_read_begin(&chan->ch_rxbr); taskqueue_enqueue(chan->ch_tq, &chan->ch_task); } } } void vmbus_event_proc(struct vmbus_softc *sc, int cpu) { struct vmbus_evtflags *eventf; /* * On Host with Win8 or above, the event page can be checked directly * to get the id of the channel that has the pending interrupt. */ eventf = VMBUS_PCPU_GET(sc, event_flags, cpu) + VMBUS_SINT_MESSAGE; vmbus_event_flags_proc(sc, eventf->evt_flags, VMBUS_PCPU_GET(sc, event_flags_cnt, cpu)); } void vmbus_event_proc_compat(struct vmbus_softc *sc, int cpu) { struct vmbus_evtflags *eventf; eventf = VMBUS_PCPU_GET(sc, event_flags, cpu) + VMBUS_SINT_MESSAGE; if (atomic_testandclear_long(&eventf->evt_flags[0], 0)) { vmbus_event_flags_proc(sc, sc->vmbus_rx_evtflags, VMBUS_CHAN_MAX_COMPAT >> VMBUS_EVTFLAG_SHIFT); } } static void vmbus_chan_update_evtflagcnt(struct vmbus_softc *sc, - const struct hv_vmbus_channel *chan) + const struct vmbus_channel *chan) { volatile int *flag_cnt_ptr; int flag_cnt; flag_cnt = (chan->ch_id / VMBUS_EVTFLAG_LEN) + 1; flag_cnt_ptr = VMBUS_PCPU_PTR(sc, event_flags_cnt, chan->ch_cpuid); for (;;) { int old_flag_cnt; old_flag_cnt = *flag_cnt_ptr; if (old_flag_cnt >= flag_cnt) break; if (atomic_cmpset_int(flag_cnt_ptr, old_flag_cnt, flag_cnt)) { if (bootverbose) { device_printf(sc->vmbus_dev, "channel%u update cpu%d flag_cnt to %d\n", chan->ch_id, chan->ch_cpuid, flag_cnt); } break; } } } -static struct hv_vmbus_channel * +static struct vmbus_channel * vmbus_chan_alloc(struct vmbus_softc *sc) { - struct hv_vmbus_channel *chan; + struct vmbus_channel *chan; chan = malloc(sizeof(*chan), M_DEVBUF, M_WAITOK | M_ZERO); chan->ch_monprm = hyperv_dmamem_alloc(bus_get_dma_tag(sc->vmbus_dev), HYPERCALL_PARAM_ALIGN, 0, sizeof(struct hyperv_mon_param), &chan->ch_monprm_dma, BUS_DMA_WAITOK | BUS_DMA_ZERO); if (chan->ch_monprm == NULL) { device_printf(sc->vmbus_dev, "monprm alloc failed\n"); free(chan, M_DEVBUF); return NULL; } - chan->vmbus_sc = sc; + chan->ch_vmbus = sc; mtx_init(&chan->ch_subchan_lock, "vmbus subchan", NULL, MTX_DEF); TAILQ_INIT(&chan->ch_subchans); TASK_INIT(&chan->ch_detach_task, 0, vmbus_chan_detach_task, chan); return chan; } static void -vmbus_chan_free(struct hv_vmbus_channel *chan) +vmbus_chan_free(struct vmbus_channel *chan) { /* TODO: assert sub-channel list is empty */ /* TODO: asset no longer on the primary channel's sub-channel list */ /* TODO: asset no longer on the vmbus channel list */ hyperv_dmamem_free(&chan->ch_monprm_dma, chan->ch_monprm); mtx_destroy(&chan->ch_subchan_lock); free(chan, M_DEVBUF); } static int -vmbus_chan_add(struct hv_vmbus_channel *newchan) +vmbus_chan_add(struct vmbus_channel *newchan) { - struct vmbus_softc *sc = newchan->vmbus_sc; - struct hv_vmbus_channel *prichan; + struct vmbus_softc *sc = newchan->ch_vmbus; + struct vmbus_channel *prichan; if (newchan->ch_id == 0) { /* * XXX * Chan0 will neither be processed nor should be offered; * skip it. */ device_printf(sc->vmbus_dev, "got chan0 offer, discard\n"); return EINVAL; } else if (newchan->ch_id >= VMBUS_CHAN_MAX) { device_printf(sc->vmbus_dev, "invalid chan%u offer\n", newchan->ch_id); return EINVAL; } sc->vmbus_chmap[newchan->ch_id] = newchan; if (bootverbose) { device_printf(sc->vmbus_dev, "chan%u subidx%u offer\n", newchan->ch_id, newchan->ch_subidx); } mtx_lock(&sc->vmbus_prichan_lock); TAILQ_FOREACH(prichan, &sc->vmbus_prichans, ch_prilink) { /* * Sub-channel will have the same type GUID and instance * GUID as its primary channel. */ if (memcmp(&prichan->ch_guid_type, &newchan->ch_guid_type, sizeof(struct hyperv_guid)) == 0 && memcmp(&prichan->ch_guid_inst, &newchan->ch_guid_inst, sizeof(struct hyperv_guid)) == 0) break; } if (VMBUS_CHAN_ISPRIMARY(newchan)) { if (prichan == NULL) { /* Install the new primary channel */ TAILQ_INSERT_TAIL(&sc->vmbus_prichans, newchan, ch_prilink); mtx_unlock(&sc->vmbus_prichan_lock); return 0; } else { mtx_unlock(&sc->vmbus_prichan_lock); device_printf(sc->vmbus_dev, "duplicated primary " "chan%u\n", newchan->ch_id); return EINVAL; } } else { /* Sub-channel */ if (prichan == NULL) { mtx_unlock(&sc->vmbus_prichan_lock); device_printf(sc->vmbus_dev, "no primary chan for " "chan%u\n", newchan->ch_id); return EINVAL; } /* * Found the primary channel for this sub-channel and * move on. * * XXX refcnt prichan */ } mtx_unlock(&sc->vmbus_prichan_lock); /* * This is a sub-channel; link it with the primary channel. */ KASSERT(!VMBUS_CHAN_ISPRIMARY(newchan), ("new channel is not sub-channel")); KASSERT(prichan != NULL, ("no primary channel")); newchan->ch_prichan = prichan; newchan->ch_dev = prichan->ch_dev; mtx_lock(&prichan->ch_subchan_lock); TAILQ_INSERT_TAIL(&prichan->ch_subchans, newchan, ch_sublink); /* * Bump up sub-channel count and notify anyone that is * interested in this sub-channel, after this sub-channel * is setup. */ prichan->ch_subchan_cnt++; mtx_unlock(&prichan->ch_subchan_lock); wakeup(prichan); return 0; } void -vmbus_chan_cpu_set(struct hv_vmbus_channel *chan, int cpu) +vmbus_chan_cpu_set(struct vmbus_channel *chan, int cpu) { KASSERT(cpu >= 0 && cpu < mp_ncpus, ("invalid cpu %d", cpu)); - if (chan->vmbus_sc->vmbus_version == VMBUS_VERSION_WS2008 || - chan->vmbus_sc->vmbus_version == VMBUS_VERSION_WIN7) { + if (chan->ch_vmbus->vmbus_version == VMBUS_VERSION_WS2008 || + chan->ch_vmbus->vmbus_version == VMBUS_VERSION_WIN7) { /* Only cpu0 is supported */ cpu = 0; } chan->ch_cpuid = cpu; - chan->ch_vcpuid = VMBUS_PCPU_GET(chan->vmbus_sc, vcpuid, cpu); + chan->ch_vcpuid = VMBUS_PCPU_GET(chan->ch_vmbus, vcpuid, cpu); if (bootverbose) { printf("vmbus_chan%u: assigned to cpu%u [vcpu%u]\n", chan->ch_id, chan->ch_cpuid, chan->ch_vcpuid); } } void -vmbus_chan_cpu_rr(struct hv_vmbus_channel *chan) +vmbus_chan_cpu_rr(struct vmbus_channel *chan) { static uint32_t vmbus_chan_nextcpu; int cpu; cpu = atomic_fetchadd_int(&vmbus_chan_nextcpu, 1) % mp_ncpus; vmbus_chan_cpu_set(chan, cpu); } static void -vmbus_chan_cpu_default(struct hv_vmbus_channel *chan) +vmbus_chan_cpu_default(struct vmbus_channel *chan) { /* * By default, pin the channel to cpu0. Devices having * special channel-cpu mapping requirement should call * vmbus_chan_cpu_{set,rr}(). */ vmbus_chan_cpu_set(chan, 0); } static void vmbus_chan_msgproc_choffer(struct vmbus_softc *sc, const struct vmbus_message *msg) { const struct vmbus_chanmsg_choffer *offer; - struct hv_vmbus_channel *chan; + struct vmbus_channel *chan; int error; offer = (const struct vmbus_chanmsg_choffer *)msg->msg_data; chan = vmbus_chan_alloc(sc); if (chan == NULL) { device_printf(sc->vmbus_dev, "allocate chan%u failed\n", offer->chm_chanid); return; } chan->ch_id = offer->chm_chanid; chan->ch_subidx = offer->chm_subidx; chan->ch_guid_type = offer->chm_chtype; chan->ch_guid_inst = offer->chm_chinst; /* Batch reading is on by default */ chan->ch_flags |= VMBUS_CHAN_FLAG_BATCHREAD; chan->ch_monprm->mp_connid = VMBUS_CONNID_EVENT; if (sc->vmbus_version != VMBUS_VERSION_WS2008) chan->ch_monprm->mp_connid = offer->chm_connid; if (offer->chm_flags1 & VMBUS_CHOFFER_FLAG1_HASMNF) { + int trig_idx; + /* * Setup MNF stuffs. */ - chan->ch_flags |= VMBUS_CHAN_FLAG_HASMNF; - chan->ch_montrig_idx = offer->chm_montrig / VMBUS_MONTRIG_LEN; - if (chan->ch_montrig_idx >= VMBUS_MONTRIGS_MAX) + chan->ch_txflags |= VMBUS_CHAN_TXF_HASMNF; + + trig_idx = offer->chm_montrig / VMBUS_MONTRIG_LEN; + if (trig_idx >= VMBUS_MONTRIGS_MAX) panic("invalid monitor trigger %u", offer->chm_montrig); + chan->ch_montrig = + &sc->vmbus_mnf2->mnf_trigs[trig_idx].mt_pending; + chan->ch_montrig_mask = 1 << (offer->chm_montrig % VMBUS_MONTRIG_LEN); } + /* + * Setup event flag. + */ + chan->ch_evtflag = + &sc->vmbus_tx_evtflags[chan->ch_id >> VMBUS_EVTFLAG_SHIFT]; + chan->ch_evtflag_mask = 1UL << (chan->ch_id & VMBUS_EVTFLAG_MASK); + /* Select default cpu for this channel. */ vmbus_chan_cpu_default(chan); error = vmbus_chan_add(chan); if (error) { device_printf(sc->vmbus_dev, "add chan%u failed: %d\n", chan->ch_id, error); vmbus_chan_free(chan); return; } if (VMBUS_CHAN_ISPRIMARY(chan)) { /* * Add device for this primary channel. * * NOTE: * Error is ignored here; don't have much to do if error * really happens. */ vmbus_add_child(chan); } } /* * XXX pretty broken; need rework. */ static void vmbus_chan_msgproc_chrescind(struct vmbus_softc *sc, const struct vmbus_message *msg) { const struct vmbus_chanmsg_chrescind *note; - struct hv_vmbus_channel *chan; + struct vmbus_channel *chan; note = (const struct vmbus_chanmsg_chrescind *)msg->msg_data; if (note->chm_chanid > VMBUS_CHAN_MAX) { device_printf(sc->vmbus_dev, "invalid rescinded chan%u\n", note->chm_chanid); return; } if (bootverbose) { device_printf(sc->vmbus_dev, "chan%u rescinded\n", note->chm_chanid); } chan = sc->vmbus_chmap[note->chm_chanid]; if (chan == NULL) return; sc->vmbus_chmap[note->chm_chanid] = NULL; taskqueue_enqueue(taskqueue_thread, &chan->ch_detach_task); } static void vmbus_chan_detach_task(void *xchan, int pending __unused) { - struct hv_vmbus_channel *chan = xchan; + struct vmbus_channel *chan = xchan; if (VMBUS_CHAN_ISPRIMARY(chan)) { /* Only primary channel owns the device */ vmbus_delete_child(chan); /* NOTE: DO NOT free primary channel for now */ } else { - struct vmbus_softc *sc = chan->vmbus_sc; - struct hv_vmbus_channel *pri_chan = chan->ch_prichan; + struct vmbus_softc *sc = chan->ch_vmbus; + struct vmbus_channel *pri_chan = chan->ch_prichan; struct vmbus_chanmsg_chfree *req; struct vmbus_msghc *mh; int error; mh = vmbus_msghc_get(sc, sizeof(*req)); if (mh == NULL) { device_printf(sc->vmbus_dev, "can not get msg hypercall for chfree(chan%u)\n", chan->ch_id); goto remove; } req = vmbus_msghc_dataptr(mh); req->chm_hdr.chm_type = VMBUS_CHANMSG_TYPE_CHFREE; req->chm_chanid = chan->ch_id; error = vmbus_msghc_exec_noresult(mh); vmbus_msghc_put(sc, mh); if (error) { device_printf(sc->vmbus_dev, "chfree(chan%u) failed: %d", chan->ch_id, error); /* NOTE: Move on! */ } else { if (bootverbose) { device_printf(sc->vmbus_dev, "chan%u freed\n", chan->ch_id); } } remove: mtx_lock(&pri_chan->ch_subchan_lock); TAILQ_REMOVE(&pri_chan->ch_subchans, chan, ch_sublink); KASSERT(pri_chan->ch_subchan_cnt > 0, ("invalid subchan_cnt %d", pri_chan->ch_subchan_cnt)); pri_chan->ch_subchan_cnt--; mtx_unlock(&pri_chan->ch_subchan_lock); wakeup(pri_chan); vmbus_chan_free(chan); } } /* * Detach all devices and destroy the corresponding primary channels. */ void vmbus_chan_destroy_all(struct vmbus_softc *sc) { - struct hv_vmbus_channel *chan; + struct vmbus_channel *chan; mtx_lock(&sc->vmbus_prichan_lock); while ((chan = TAILQ_FIRST(&sc->vmbus_prichans)) != NULL) { KASSERT(VMBUS_CHAN_ISPRIMARY(chan), ("not primary channel")); TAILQ_REMOVE(&sc->vmbus_prichans, chan, ch_prilink); mtx_unlock(&sc->vmbus_prichan_lock); vmbus_delete_child(chan); vmbus_chan_free(chan); mtx_lock(&sc->vmbus_prichan_lock); } bzero(sc->vmbus_chmap, - sizeof(struct hv_vmbus_channel *) * VMBUS_CHAN_MAX); + sizeof(struct vmbus_channel *) * VMBUS_CHAN_MAX); mtx_unlock(&sc->vmbus_prichan_lock); } /* * The channel whose vcpu binding is closest to the currect vcpu will * be selected. * If no multi-channel, always select primary channel. */ -struct hv_vmbus_channel * -vmbus_chan_cpu2chan(struct hv_vmbus_channel *prichan, int cpu) +struct vmbus_channel * +vmbus_chan_cpu2chan(struct vmbus_channel *prichan, int cpu) { - struct hv_vmbus_channel *sel, *chan; + struct vmbus_channel *sel, *chan; uint32_t vcpu, sel_dist; KASSERT(cpu >= 0 && cpu < mp_ncpus, ("invalid cpuid %d", cpu)); if (TAILQ_EMPTY(&prichan->ch_subchans)) return prichan; - vcpu = VMBUS_PCPU_GET(prichan->vmbus_sc, vcpuid, cpu); + vcpu = VMBUS_PCPU_GET(prichan->ch_vmbus, vcpuid, cpu); #define CHAN_VCPU_DIST(ch, vcpu) \ (((ch)->ch_vcpuid > (vcpu)) ? \ ((ch)->ch_vcpuid - (vcpu)) : ((vcpu) - (ch)->ch_vcpuid)) #define CHAN_SELECT(ch) \ do { \ sel = ch; \ sel_dist = CHAN_VCPU_DIST(ch, vcpu); \ } while (0) CHAN_SELECT(prichan); mtx_lock(&prichan->ch_subchan_lock); TAILQ_FOREACH(chan, &prichan->ch_subchans, ch_sublink) { uint32_t dist; KASSERT(chan->ch_stflags & VMBUS_CHAN_ST_OPENED, ("chan%u is not opened", chan->ch_id)); if (chan->ch_vcpuid == vcpu) { /* Exact match; done */ CHAN_SELECT(chan); break; } dist = CHAN_VCPU_DIST(chan, vcpu); if (sel_dist <= dist) { /* Far or same distance; skip */ continue; } /* Select the closer channel. */ CHAN_SELECT(chan); } mtx_unlock(&prichan->ch_subchan_lock); #undef CHAN_SELECT #undef CHAN_VCPU_DIST return sel; } -struct hv_vmbus_channel ** -vmbus_subchan_get(struct hv_vmbus_channel *pri_chan, int subchan_cnt) +struct vmbus_channel ** +vmbus_subchan_get(struct vmbus_channel *pri_chan, int subchan_cnt) { - struct hv_vmbus_channel **ret, *chan; + struct vmbus_channel **ret, *chan; int i; - ret = malloc(subchan_cnt * sizeof(struct hv_vmbus_channel *), M_TEMP, + ret = malloc(subchan_cnt * sizeof(struct vmbus_channel *), M_TEMP, M_WAITOK); mtx_lock(&pri_chan->ch_subchan_lock); while (pri_chan->ch_subchan_cnt < subchan_cnt) mtx_sleep(pri_chan, &pri_chan->ch_subchan_lock, 0, "subch", 0); i = 0; TAILQ_FOREACH(chan, &pri_chan->ch_subchans, ch_sublink) { /* TODO: refcnt chan */ ret[i] = chan; ++i; if (i == subchan_cnt) break; } KASSERT(i == subchan_cnt, ("invalid subchan count %d, should be %d", pri_chan->ch_subchan_cnt, subchan_cnt)); mtx_unlock(&pri_chan->ch_subchan_lock); return ret; } void -vmbus_subchan_rel(struct hv_vmbus_channel **subchan, int subchan_cnt __unused) +vmbus_subchan_rel(struct vmbus_channel **subchan, int subchan_cnt __unused) { free(subchan, M_TEMP); } void -vmbus_subchan_drain(struct hv_vmbus_channel *pri_chan) +vmbus_subchan_drain(struct vmbus_channel *pri_chan) { mtx_lock(&pri_chan->ch_subchan_lock); while (pri_chan->ch_subchan_cnt > 0) mtx_sleep(pri_chan, &pri_chan->ch_subchan_lock, 0, "dsubch", 0); mtx_unlock(&pri_chan->ch_subchan_lock); } void vmbus_chan_msgproc(struct vmbus_softc *sc, const struct vmbus_message *msg) { vmbus_chanmsg_proc_t msg_proc; uint32_t msg_type; msg_type = ((const struct vmbus_chanmsg_hdr *)msg->msg_data)->chm_type; KASSERT(msg_type < VMBUS_CHANMSG_TYPE_MAX, ("invalid message type %u", msg_type)); msg_proc = vmbus_chan_msgprocs[msg_type]; if (msg_proc != NULL) msg_proc(sc, msg); } void -vmbus_chan_set_readbatch(struct hv_vmbus_channel *chan, bool on) +vmbus_chan_set_readbatch(struct vmbus_channel *chan, bool on) { if (!on) chan->ch_flags &= ~VMBUS_CHAN_FLAG_BATCHREAD; else chan->ch_flags |= VMBUS_CHAN_FLAG_BATCHREAD; +} + +uint32_t +vmbus_chan_id(const struct vmbus_channel *chan) +{ + return chan->ch_id; +} + +uint32_t +vmbus_chan_subidx(const struct vmbus_channel *chan) +{ + return chan->ch_subidx; +} + +bool +vmbus_chan_is_primary(const struct vmbus_channel *chan) +{ + if (VMBUS_CHAN_ISPRIMARY(chan)) + return true; + else + return false; +} + +const struct hyperv_guid * +vmbus_chan_guid_inst(const struct vmbus_channel *chan) +{ + return &chan->ch_guid_inst; } Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/vmbus_chanvar.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/vmbus_chanvar.h (nonexistent) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/vmbus_chanvar.h (revision 303206) @@ -0,0 +1,168 @@ +/*- + * Copyright (c) 2016 Microsoft Corp. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice unmodified, this list of conditions, and the following + * disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef _VMBUS_CHANVAR_H_ +#define _VMBUS_CHANVAR_H_ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +typedef struct { + struct vmbus_bufring *ring_buffer; + struct mtx ring_lock; + uint32_t ring_data_size; /* ring_size */ +} hv_vmbus_ring_buffer_info; + +struct vmbus_channel { + /* + * NOTE: + * Fields before ch_txbr are only accessed on this channel's + * target CPU. + */ + uint32_t ch_flags; /* VMBUS_CHAN_FLAG_ */ + + /* + * RX bufring; immediately following ch_txbr. + */ + hv_vmbus_ring_buffer_info ch_rxbr; + + struct taskqueue *ch_tq; + struct task ch_task; + vmbus_chan_callback_t ch_cb; + void *ch_cbarg; + + /* + * TX bufring; at the beginning of ch_bufring. + * + * NOTE: + * Put TX bufring and the following MNF/evtflag to a new + * cacheline, since they will be accessed on all CPUs by + * locking ch_txbr first. + * + * XXX + * TX bufring and following MNF/evtflags do _not_ fit in + * one 64B cacheline. + */ + hv_vmbus_ring_buffer_info ch_txbr __aligned(CACHE_LINE_SIZE); + uint32_t ch_txflags; /* VMBUS_CHAN_TXF_ */ + + /* + * These are based on the vmbus_chanmsg_choffer.chm_montrig. + * Save it here for easy access. + */ + uint32_t ch_montrig_mask;/* MNF trig mask */ + volatile uint32_t *ch_montrig; /* MNF trigger loc. */ + + /* + * These are based on the vmbus_chanmsg_choffer.chm_chanid. + * Save it here for easy access. + */ + u_long ch_evtflag_mask;/* event flag */ + volatile u_long *ch_evtflag; /* event flag loc. */ + + /* + * Rarely used fields. + */ + + struct hyperv_mon_param *ch_monprm; + struct hyperv_dma ch_monprm_dma; + + uint32_t ch_id; /* channel id */ + device_t ch_dev; + struct vmbus_softc *ch_vmbus; + + int ch_cpuid; /* owner cpu */ + /* + * Virtual cpuid for ch_cpuid; it is used to communicate cpuid + * related information w/ Hyper-V. If MSR_HV_VP_INDEX does not + * exist, ch_vcpuid will always be 0 for compatibility. + */ + uint32_t ch_vcpuid; + + /* + * If this is a primary channel, ch_subchan* fields + * contain sub-channels belonging to this primary + * channel. + */ + struct mtx ch_subchan_lock; + TAILQ_HEAD(, vmbus_channel) ch_subchans; + int ch_subchan_cnt; + + /* If this is a sub-channel */ + TAILQ_ENTRY(vmbus_channel) ch_sublink; /* sub-channel link */ + struct vmbus_channel *ch_prichan; /* owner primary chan */ + + void *ch_bufring; /* TX+RX bufrings */ + struct hyperv_dma ch_bufring_dma; + uint32_t ch_bufring_gpadl; + + struct task ch_detach_task; + TAILQ_ENTRY(vmbus_channel) ch_prilink; /* primary chan link */ + uint32_t ch_subidx; /* subchan index */ + volatile uint32_t ch_stflags; /* atomic-op */ + /* VMBUS_CHAN_ST_ */ + struct hyperv_guid ch_guid_type; + struct hyperv_guid ch_guid_inst; + + struct sysctl_ctx_list ch_sysctl_ctx; +} __aligned(CACHE_LINE_SIZE); + +#define VMBUS_CHAN_ISPRIMARY(chan) ((chan)->ch_subidx == 0) + +/* + * If this flag is set, this channel's interrupt will be masked in ISR, + * and the RX bufring will be drained before this channel's interrupt is + * unmasked. + * + * This flag is turned on by default. Drivers can turn it off according + * to their own requirement. + */ +#define VMBUS_CHAN_FLAG_BATCHREAD 0x0002 + +#define VMBUS_CHAN_TXF_HASMNF 0x0001 + +#define VMBUS_CHAN_ST_OPENED_SHIFT 0 +#define VMBUS_CHAN_ST_OPENED (1 << VMBUS_CHAN_ST_OPENED_SHIFT) + +struct vmbus_softc; +struct vmbus_message; + +void vmbus_event_proc(struct vmbus_softc *, int); +void vmbus_event_proc_compat(struct vmbus_softc *, int); +void vmbus_chan_msgproc(struct vmbus_softc *, const struct vmbus_message *); +void vmbus_chan_destroy_all(struct vmbus_softc *); + +#endif /* !_VMBUS_CHANVAR_H_ */ Property changes on: user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/vmbus_chanvar.h ___________________________________________________________________ Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/vmbus_reg.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/vmbus_reg.h (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/vmbus_reg.h (revision 303206) @@ -1,302 +1,333 @@ /*- * Copyright (c) 2016 Microsoft Corp. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _VMBUS_REG_H_ #define _VMBUS_REG_H_ #include #include /* XXX for hyperv_guid */ #include #include /* * Hyper-V SynIC message format. */ #define VMBUS_MSG_DSIZE_MAX 240 #define VMBUS_MSG_SIZE 256 struct vmbus_message { uint32_t msg_type; /* HYPERV_MSGTYPE_ */ uint8_t msg_dsize; /* data size */ uint8_t msg_flags; /* VMBUS_MSGFLAG_ */ uint16_t msg_rsvd; uint64_t msg_id; uint8_t msg_data[VMBUS_MSG_DSIZE_MAX]; } __packed; CTASSERT(sizeof(struct vmbus_message) == VMBUS_MSG_SIZE); #define VMBUS_MSGFLAG_PENDING 0x01 /* * Hyper-V SynIC event flags */ #ifdef __LP64__ #define VMBUS_EVTFLAGS_MAX 32 #define VMBUS_EVTFLAG_SHIFT 6 #else #define VMBUS_EVTFLAGS_MAX 64 #define VMBUS_EVTFLAG_SHIFT 5 #endif #define VMBUS_EVTFLAG_LEN (1 << VMBUS_EVTFLAG_SHIFT) #define VMBUS_EVTFLAG_MASK (VMBUS_EVTFLAG_LEN - 1) #define VMBUS_EVTFLAGS_SIZE 256 struct vmbus_evtflags { u_long evt_flags[VMBUS_EVTFLAGS_MAX]; } __packed; CTASSERT(sizeof(struct vmbus_evtflags) == VMBUS_EVTFLAGS_SIZE); /* * Hyper-V Monitor Notification Facility */ struct vmbus_mon_trig { uint32_t mt_pending; uint32_t mt_armed; } __packed; #define VMBUS_MONTRIGS_MAX 4 #define VMBUS_MONTRIG_LEN 32 struct vmbus_mnf { uint32_t mnf_state; uint32_t mnf_rsvd1; struct vmbus_mon_trig mnf_trigs[VMBUS_MONTRIGS_MAX]; uint8_t mnf_rsvd2[536]; uint16_t mnf_lat[VMBUS_MONTRIGS_MAX][VMBUS_MONTRIG_LEN]; uint8_t mnf_rsvd3[256]; struct hyperv_mon_param mnf_param[VMBUS_MONTRIGS_MAX][VMBUS_MONTRIG_LEN]; uint8_t mnf_rsvd4[1984]; } __packed; CTASSERT(sizeof(struct vmbus_mnf) == PAGE_SIZE); /* + * Buffer ring + */ +struct vmbus_bufring { + /* + * If br_windex == br_rindex, this bufring is empty; this + * means we can _not_ write data to the bufring, if the + * write is going to make br_windex same as br_rindex. + */ + volatile uint32_t br_windex; + volatile uint32_t br_rindex; + + /* + * Interrupt mask {0,1} + * + * For TX bufring, host set this to 1, when it is processing + * the TX bufring, so that we can safely skip the TX event + * notification to host. + * + * For RX bufring, once this is set to 1 by us, host will not + * further dispatch interrupts to us, even if there are data + * pending on the RX bufring. This effectively disables the + * interrupt of the channel to which this RX bufring is attached. + */ + volatile uint32_t br_imask; + + uint8_t br_rsvd[4084]; + uint8_t br_data[]; +} __packed; +CTASSERT(sizeof(struct vmbus_bufring) == PAGE_SIZE); + +/* * Channel */ #define VMBUS_CHAN_MAX_COMPAT 256 #define VMBUS_CHAN_MAX (VMBUS_EVTFLAG_LEN * VMBUS_EVTFLAGS_MAX) /* * Channel packets */ #define VMBUS_CHANPKT_SIZE_ALIGN (1 << VMBUS_CHANPKT_SIZE_SHIFT) #define VMBUS_CHANPKT_SETLEN(pktlen, len) \ do { \ (pktlen) = (len) >> VMBUS_CHANPKT_SIZE_SHIFT; \ } while (0) #define VMBUS_CHANPKT_TOTLEN(tlen) \ roundup2((tlen), VMBUS_CHANPKT_SIZE_ALIGN) struct vmbus_chanpkt { struct vmbus_chanpkt_hdr cp_hdr; } __packed; struct vmbus_chanpkt_sglist { struct vmbus_chanpkt_hdr cp_hdr; uint32_t cp_rsvd; uint32_t cp_gpa_cnt; struct vmbus_gpa cp_gpa[]; } __packed; struct vmbus_chanpkt_prplist { struct vmbus_chanpkt_hdr cp_hdr; uint32_t cp_rsvd; uint32_t cp_range_cnt; struct vmbus_gpa_range cp_range[]; } __packed; /* * Channel messages * - Embedded in vmbus_message.msg_data, e.g. response and notification. * - Embedded in hypercall_postmsg_in.hc_data, e.g. request. */ #define VMBUS_CHANMSG_TYPE_CHOFFER 1 /* NOTE */ #define VMBUS_CHANMSG_TYPE_CHRESCIND 2 /* NOTE */ #define VMBUS_CHANMSG_TYPE_CHREQUEST 3 /* REQ */ #define VMBUS_CHANMSG_TYPE_CHOFFER_DONE 4 /* NOTE */ #define VMBUS_CHANMSG_TYPE_CHOPEN 5 /* REQ */ #define VMBUS_CHANMSG_TYPE_CHOPEN_RESP 6 /* RESP */ #define VMBUS_CHANMSG_TYPE_CHCLOSE 7 /* REQ */ #define VMBUS_CHANMSG_TYPE_GPADL_CONN 8 /* REQ */ #define VMBUS_CHANMSG_TYPE_GPADL_SUBCONN 9 /* REQ */ #define VMBUS_CHANMSG_TYPE_GPADL_CONNRESP 10 /* RESP */ #define VMBUS_CHANMSG_TYPE_GPADL_DISCONN 11 /* REQ */ #define VMBUS_CHANMSG_TYPE_GPADL_DISCONNRESP 12 /* RESP */ #define VMBUS_CHANMSG_TYPE_CHFREE 13 /* REQ */ #define VMBUS_CHANMSG_TYPE_CONNECT 14 /* REQ */ #define VMBUS_CHANMSG_TYPE_CONNECT_RESP 15 /* RESP */ #define VMBUS_CHANMSG_TYPE_DISCONNECT 16 /* REQ */ #define VMBUS_CHANMSG_TYPE_MAX 22 struct vmbus_chanmsg_hdr { uint32_t chm_type; /* VMBUS_CHANMSG_TYPE_ */ uint32_t chm_rsvd; } __packed; /* VMBUS_CHANMSG_TYPE_CONNECT */ struct vmbus_chanmsg_connect { struct vmbus_chanmsg_hdr chm_hdr; uint32_t chm_ver; uint32_t chm_rsvd; uint64_t chm_evtflags; uint64_t chm_mnf1; uint64_t chm_mnf2; } __packed; /* VMBUS_CHANMSG_TYPE_CONNECT_RESP */ struct vmbus_chanmsg_connect_resp { struct vmbus_chanmsg_hdr chm_hdr; uint8_t chm_done; } __packed; /* VMBUS_CHANMSG_TYPE_CHREQUEST */ struct vmbus_chanmsg_chrequest { struct vmbus_chanmsg_hdr chm_hdr; } __packed; /* VMBUS_CHANMSG_TYPE_DISCONNECT */ struct vmbus_chanmsg_disconnect { struct vmbus_chanmsg_hdr chm_hdr; } __packed; /* VMBUS_CHANMSG_TYPE_CHOPEN */ struct vmbus_chanmsg_chopen { struct vmbus_chanmsg_hdr chm_hdr; uint32_t chm_chanid; uint32_t chm_openid; uint32_t chm_gpadl; uint32_t chm_vcpuid; uint32_t chm_txbr_pgcnt; #define VMBUS_CHANMSG_CHOPEN_UDATA_SIZE 120 uint8_t chm_udata[VMBUS_CHANMSG_CHOPEN_UDATA_SIZE]; } __packed; /* VMBUS_CHANMSG_TYPE_CHOPEN_RESP */ struct vmbus_chanmsg_chopen_resp { struct vmbus_chanmsg_hdr chm_hdr; uint32_t chm_chanid; uint32_t chm_openid; uint32_t chm_status; } __packed; /* VMBUS_CHANMSG_TYPE_GPADL_CONN */ struct vmbus_chanmsg_gpadl_conn { struct vmbus_chanmsg_hdr chm_hdr; uint32_t chm_chanid; uint32_t chm_gpadl; uint16_t chm_range_len; uint16_t chm_range_cnt; struct vmbus_gpa_range chm_range; } __packed; #define VMBUS_CHANMSG_GPADL_CONN_PGMAX 26 CTASSERT(__offsetof(struct vmbus_chanmsg_gpadl_conn, chm_range.gpa_page[VMBUS_CHANMSG_GPADL_CONN_PGMAX]) <= HYPERCALL_POSTMSGIN_DSIZE_MAX); /* VMBUS_CHANMSG_TYPE_GPADL_SUBCONN */ struct vmbus_chanmsg_gpadl_subconn { struct vmbus_chanmsg_hdr chm_hdr; uint32_t chm_msgno; uint32_t chm_gpadl; uint64_t chm_gpa_page[]; } __packed; #define VMBUS_CHANMSG_GPADL_SUBCONN_PGMAX 28 CTASSERT(__offsetof(struct vmbus_chanmsg_gpadl_subconn, chm_gpa_page[VMBUS_CHANMSG_GPADL_SUBCONN_PGMAX]) <= HYPERCALL_POSTMSGIN_DSIZE_MAX); /* VMBUS_CHANMSG_TYPE_GPADL_CONNRESP */ struct vmbus_chanmsg_gpadl_connresp { struct vmbus_chanmsg_hdr chm_hdr; uint32_t chm_chanid; uint32_t chm_gpadl; uint32_t chm_status; } __packed; /* VMBUS_CHANMSG_TYPE_CHCLOSE */ struct vmbus_chanmsg_chclose { struct vmbus_chanmsg_hdr chm_hdr; uint32_t chm_chanid; } __packed; /* VMBUS_CHANMSG_TYPE_GPADL_DISCONN */ struct vmbus_chanmsg_gpadl_disconn { struct vmbus_chanmsg_hdr chm_hdr; uint32_t chm_chanid; uint32_t chm_gpadl; } __packed; /* VMBUS_CHANMSG_TYPE_CHFREE */ struct vmbus_chanmsg_chfree { struct vmbus_chanmsg_hdr chm_hdr; uint32_t chm_chanid; } __packed; /* VMBUS_CHANMSG_TYPE_CHRESCIND */ struct vmbus_chanmsg_chrescind { struct vmbus_chanmsg_hdr chm_hdr; uint32_t chm_chanid; } __packed; /* VMBUS_CHANMSG_TYPE_CHOFFER */ struct vmbus_chanmsg_choffer { struct vmbus_chanmsg_hdr chm_hdr; struct hyperv_guid chm_chtype; struct hyperv_guid chm_chinst; uint64_t chm_chlat; /* unit: 100ns */ uint32_t chm_chrev; uint32_t chm_svrctx_sz; uint16_t chm_chflags; uint16_t chm_mmio_sz; /* unit: MB */ uint8_t chm_udata[120]; uint16_t chm_subidx; uint16_t chm_rsvd; uint32_t chm_chanid; uint8_t chm_montrig; uint8_t chm_flags1; /* VMBUS_CHOFFER_FLAG1_ */ uint16_t chm_flags2; uint32_t chm_connid; } __packed; CTASSERT(sizeof(struct vmbus_chanmsg_choffer) <= VMBUS_MSG_DSIZE_MAX); #define VMBUS_CHOFFER_FLAG1_HASMNF 0x01 #endif /* !_VMBUS_REG_H_ */ Index: user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/vmbus_var.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/vmbus_var.h (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/hyperv/vmbus/vmbus_var.h (revision 303206) @@ -1,169 +1,162 @@ /*- * Copyright (c) 2016 Microsoft Corp. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _VMBUS_VAR_H_ #define _VMBUS_VAR_H_ #include #include #include #include /* * NOTE: DO NOT CHANGE THIS. */ #define VMBUS_SINT_MESSAGE 2 /* * NOTE: * - DO NOT set it to the same value as VMBUS_SINT_MESSAGE. * - DO NOT set it to 0. */ #define VMBUS_SINT_TIMER 4 /* * NOTE: DO NOT CHANGE THESE */ #define VMBUS_CONNID_MESSAGE 1 #define VMBUS_CONNID_EVENT 2 struct vmbus_message; struct vmbus_softc; typedef void (*vmbus_chanmsg_proc_t)(struct vmbus_softc *, const struct vmbus_message *); #define VMBUS_CHANMSG_PROC(name, func) \ [VMBUS_CHANMSG_TYPE_##name] = func #define VMBUS_CHANMSG_PROC_WAKEUP(name) \ VMBUS_CHANMSG_PROC(name, vmbus_msghc_wakeup) struct vmbus_pcpu_data { u_long *intr_cnt; /* Hyper-V interrupt counter */ struct vmbus_message *message; /* shared messages */ uint32_t vcpuid; /* virtual cpuid */ int event_flags_cnt;/* # of event flags */ struct vmbus_evtflags *event_flags; /* event flags from host */ /* Rarely used fields */ struct hyperv_dma message_dma; /* busdma glue */ struct hyperv_dma event_flags_dma;/* busdma glue */ struct taskqueue *event_tq; /* event taskq */ struct taskqueue *message_tq; /* message taskq */ struct task message_task; /* message task */ } __aligned(CACHE_LINE_SIZE); struct vmbus_softc { void (*vmbus_event_proc)(struct vmbus_softc *, int); u_long *vmbus_tx_evtflags; /* event flags to host */ struct vmbus_mnf *vmbus_mnf2; /* monitored by host */ u_long *vmbus_rx_evtflags; /* compat evtflgs from host */ - struct hv_vmbus_channel **vmbus_chmap; + struct vmbus_channel **vmbus_chmap; struct vmbus_msghc_ctx *vmbus_msg_hc; struct vmbus_pcpu_data vmbus_pcpu[MAXCPU]; /* * Rarely used fields */ device_t vmbus_dev; int vmbus_idtvec; uint32_t vmbus_flags; /* see VMBUS_FLAG_ */ uint32_t vmbus_version; uint32_t vmbus_gpadl; /* Shared memory for vmbus_{rx,tx}_evtflags */ void *vmbus_evtflags; struct hyperv_dma vmbus_evtflags_dma; void *vmbus_mnf1; /* monitored by VM, unused */ struct hyperv_dma vmbus_mnf1_dma; struct hyperv_dma vmbus_mnf2_dma; struct mtx vmbus_scan_lock; uint32_t vmbus_scan_chcnt; #define VMBUS_SCAN_CHCNT_DONE 0x80000000 uint32_t vmbus_scan_devcnt; /* Primary channels */ struct mtx vmbus_prichan_lock; - TAILQ_HEAD(, hv_vmbus_channel) vmbus_prichans; + TAILQ_HEAD(, vmbus_channel) vmbus_prichans; }; #define VMBUS_FLAG_ATTACHED 0x0001 /* vmbus was attached */ #define VMBUS_FLAG_SYNIC 0x0002 /* SynIC was setup */ extern struct vmbus_softc *vmbus_sc; static __inline struct vmbus_softc * vmbus_get_softc(void) { return vmbus_sc; } static __inline device_t vmbus_get_device(void) { return vmbus_sc->vmbus_dev; } #define VMBUS_PCPU_GET(sc, field, cpu) (sc)->vmbus_pcpu[(cpu)].field #define VMBUS_PCPU_PTR(sc, field, cpu) &(sc)->vmbus_pcpu[(cpu)].field -struct hv_vmbus_channel; +struct vmbus_channel; struct trapframe; struct vmbus_message; struct vmbus_msghc; -void vmbus_event_proc(struct vmbus_softc *, int); -void vmbus_event_proc_compat(struct vmbus_softc *, int); void vmbus_handle_intr(struct trapframe *); -int vmbus_add_child(struct hv_vmbus_channel *); -int vmbus_delete_child(struct hv_vmbus_channel *); - +int vmbus_add_child(struct vmbus_channel *); +int vmbus_delete_child(struct vmbus_channel *); void vmbus_et_intr(struct trapframe *); +uint32_t vmbus_gpadl_alloc(struct vmbus_softc *); -void vmbus_chan_msgproc(struct vmbus_softc *, const struct vmbus_message *); -void vmbus_chan_destroy_all(struct vmbus_softc *); - struct vmbus_msghc *vmbus_msghc_get(struct vmbus_softc *, size_t); void vmbus_msghc_put(struct vmbus_softc *, struct vmbus_msghc *); void *vmbus_msghc_dataptr(struct vmbus_msghc *); int vmbus_msghc_exec_noresult(struct vmbus_msghc *); int vmbus_msghc_exec(struct vmbus_softc *, struct vmbus_msghc *); const struct vmbus_message *vmbus_msghc_wait_result(struct vmbus_softc *, struct vmbus_msghc *); void vmbus_msghc_wakeup(struct vmbus_softc *, const struct vmbus_message *); void vmbus_msghc_reset(struct vmbus_msghc *, size_t); - -uint32_t vmbus_gpadl_alloc(struct vmbus_softc *); #endif /* !_VMBUS_VAR_H_ */ Index: user/alc/PQ_LAUNDRY/sys/dev/nvme/nvme_private.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/nvme/nvme_private.h (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/nvme/nvme_private.h (revision 303206) @@ -1,537 +1,533 @@ /*- * Copyright (C) 2012-2014 Intel Corporation * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef __NVME_PRIVATE_H__ #define __NVME_PRIVATE_H__ #include #include #include #include #include #include #include #include #include #include #include #include #include "nvme.h" #define DEVICE2SOFTC(dev) ((struct nvme_controller *) device_get_softc(dev)) MALLOC_DECLARE(M_NVME); #define IDT32_PCI_ID 0x80d0111d /* 32 channel board */ #define IDT8_PCI_ID 0x80d2111d /* 8 channel board */ /* * For commands requiring more than 2 PRP entries, one PRP will be * embedded in the command (prp1), and the rest of the PRP entries * will be in a list pointed to by the command (prp2). This means * that real max number of PRP entries we support is 32+1, which * results in a max xfer size of 32*PAGE_SIZE. */ #define NVME_MAX_PRP_LIST_ENTRIES (NVME_MAX_XFER_SIZE / PAGE_SIZE) #define NVME_ADMIN_TRACKERS (16) #define NVME_ADMIN_ENTRIES (128) /* min and max are defined in admin queue attributes section of spec */ #define NVME_MIN_ADMIN_ENTRIES (2) #define NVME_MAX_ADMIN_ENTRIES (4096) /* * NVME_IO_ENTRIES defines the size of an I/O qpair's submission and completion * queues, while NVME_IO_TRACKERS defines the maximum number of I/O that we * will allow outstanding on an I/O qpair at any time. The only advantage in * having IO_ENTRIES > IO_TRACKERS is for debugging purposes - when dumping * the contents of the submission and completion queues, it will show a longer * history of data. */ #define NVME_IO_ENTRIES (256) #define NVME_IO_TRACKERS (128) #define NVME_MIN_IO_TRACKERS (4) #define NVME_MAX_IO_TRACKERS (1024) /* * NVME_MAX_IO_ENTRIES is not defined, since it is specified in CC.MQES * for each controller. */ #define NVME_INT_COAL_TIME (0) /* disabled */ #define NVME_INT_COAL_THRESHOLD (0) /* 0-based */ #define NVME_MAX_NAMESPACES (16) #define NVME_MAX_CONSUMERS (2) #define NVME_MAX_ASYNC_EVENTS (8) #define NVME_DEFAULT_TIMEOUT_PERIOD (30) /* in seconds */ #define NVME_MIN_TIMEOUT_PERIOD (5) #define NVME_MAX_TIMEOUT_PERIOD (120) #define NVME_DEFAULT_RETRY_COUNT (4) /* Maximum log page size to fetch for AERs. */ #define NVME_MAX_AER_LOG_SIZE (4096) /* * Define CACHE_LINE_SIZE here for older FreeBSD versions that do not define * it. */ #ifndef CACHE_LINE_SIZE #define CACHE_LINE_SIZE (64) #endif /* * Use presence of the BIO_UNMAPPED flag to determine whether unmapped I/O * support and the bus_dmamap_load_bio API are available on the target * kernel. This will ease porting back to earlier stable branches at a * later point. */ #ifdef BIO_UNMAPPED #define NVME_UNMAPPED_BIO_SUPPORT #endif extern uma_zone_t nvme_request_zone; extern int32_t nvme_retry_count; struct nvme_completion_poll_status { struct nvme_completion cpl; boolean_t done; }; #define NVME_REQUEST_VADDR 1 #define NVME_REQUEST_NULL 2 /* For requests with no payload. */ #define NVME_REQUEST_UIO 3 #ifdef NVME_UNMAPPED_BIO_SUPPORT #define NVME_REQUEST_BIO 4 #endif struct nvme_request { struct nvme_command cmd; struct nvme_qpair *qpair; union { void *payload; struct bio *bio; } u; uint32_t type; uint32_t payload_size; boolean_t timeout; nvme_cb_fn_t cb_fn; void *cb_arg; int32_t retries; STAILQ_ENTRY(nvme_request) stailq; }; struct nvme_async_event_request { struct nvme_controller *ctrlr; struct nvme_request *req; struct nvme_completion cpl; uint32_t log_page_id; uint32_t log_page_size; uint8_t log_page_buffer[NVME_MAX_AER_LOG_SIZE]; }; struct nvme_tracker { TAILQ_ENTRY(nvme_tracker) tailq; struct nvme_request *req; struct nvme_qpair *qpair; struct callout timer; bus_dmamap_t payload_dma_map; uint16_t cid; uint64_t prp[NVME_MAX_PRP_LIST_ENTRIES]; bus_addr_t prp_bus_addr; bus_dmamap_t prp_dma_map; }; struct nvme_qpair { struct nvme_controller *ctrlr; uint32_t id; uint32_t phase; uint16_t vector; int rid; struct resource *res; void *tag; uint32_t num_entries; uint32_t num_trackers; uint32_t sq_tdbl_off; uint32_t cq_hdbl_off; uint32_t sq_head; uint32_t sq_tail; uint32_t cq_head; int64_t num_cmds; int64_t num_intr_handler_calls; struct nvme_command *cmd; struct nvme_completion *cpl; bus_dma_tag_t dma_tag; bus_dma_tag_t dma_tag_payload; bus_dmamap_t cmd_dma_map; uint64_t cmd_bus_addr; bus_dmamap_t cpl_dma_map; uint64_t cpl_bus_addr; TAILQ_HEAD(, nvme_tracker) free_tr; TAILQ_HEAD(, nvme_tracker) outstanding_tr; STAILQ_HEAD(, nvme_request) queued_req; struct nvme_tracker **act_tr; boolean_t is_enabled; struct mtx lock __aligned(CACHE_LINE_SIZE); } __aligned(CACHE_LINE_SIZE); struct nvme_namespace { struct nvme_controller *ctrlr; struct nvme_namespace_data data; uint16_t id; uint16_t flags; struct cdev *cdev; void *cons_cookie[NVME_MAX_CONSUMERS]; uint32_t stripesize; struct mtx lock; }; /* * One of these per allocated PCI device. */ struct nvme_controller { device_t dev; struct mtx lock; - struct cam_sim *sim; - struct cam_path *path; - int cam_ref; - uint32_t ready_timeout_in_ms; bus_space_tag_t bus_tag; bus_space_handle_t bus_handle; int resource_id; struct resource *resource; /* * The NVMe spec allows for the MSI-X table to be placed in BAR 4/5, * separate from the control registers which are in BAR 0/1. These * members track the mapping of BAR 4/5 for that reason. */ int bar4_resource_id; struct resource *bar4_resource; uint32_t msix_enabled; uint32_t force_intx; uint32_t enable_aborts; uint32_t num_io_queues; uint32_t num_cpus_per_ioq; /* Fields for tracking progress during controller initialization. */ struct intr_config_hook config_hook; uint32_t ns_identified; uint32_t queues_created; struct task reset_task; struct task fail_req_task; struct taskqueue *taskqueue; /* For shared legacy interrupt. */ int rid; struct resource *res; void *tag; bus_dma_tag_t hw_desc_tag; bus_dmamap_t hw_desc_map; /** maximum i/o size in bytes */ uint32_t max_xfer_size; /** minimum page size supported by this controller in bytes */ uint32_t min_page_size; /** interrupt coalescing time period (in microseconds) */ uint32_t int_coal_time; /** interrupt coalescing threshold */ uint32_t int_coal_threshold; /** timeout period in seconds */ uint32_t timeout_period; struct nvme_qpair adminq; struct nvme_qpair *ioq; struct nvme_registers *regs; struct nvme_controller_data cdata; struct nvme_namespace ns[NVME_MAX_NAMESPACES]; struct cdev *cdev; /** bit mask of warning types currently enabled for async events */ union nvme_critical_warning_state async_event_config; uint32_t num_aers; struct nvme_async_event_request aer[NVME_MAX_ASYNC_EVENTS]; void *cons_cookie[NVME_MAX_CONSUMERS]; uint32_t is_resetting; uint32_t is_initialized; uint32_t notification_sent; boolean_t is_failed; STAILQ_HEAD(, nvme_request) fail_req; }; #define nvme_mmio_offsetof(reg) \ offsetof(struct nvme_registers, reg) #define nvme_mmio_read_4(sc, reg) \ bus_space_read_4((sc)->bus_tag, (sc)->bus_handle, \ nvme_mmio_offsetof(reg)) #define nvme_mmio_write_4(sc, reg, val) \ bus_space_write_4((sc)->bus_tag, (sc)->bus_handle, \ nvme_mmio_offsetof(reg), val) #define nvme_mmio_write_8(sc, reg, val) \ do { \ bus_space_write_4((sc)->bus_tag, (sc)->bus_handle, \ nvme_mmio_offsetof(reg), val & 0xFFFFFFFF); \ bus_space_write_4((sc)->bus_tag, (sc)->bus_handle, \ nvme_mmio_offsetof(reg)+4, \ (val & 0xFFFFFFFF00000000UL) >> 32); \ } while (0); #if __FreeBSD_version < 800054 #define wmb() __asm volatile("sfence" ::: "memory") #define mb() __asm volatile("mfence" ::: "memory") #endif #define nvme_printf(ctrlr, fmt, args...) \ device_printf(ctrlr->dev, fmt, ##args) void nvme_ns_test(struct nvme_namespace *ns, u_long cmd, caddr_t arg); void nvme_ctrlr_cmd_identify_controller(struct nvme_controller *ctrlr, void *payload, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_identify_namespace(struct nvme_controller *ctrlr, uint16_t nsid, void *payload, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_set_interrupt_coalescing(struct nvme_controller *ctrlr, uint32_t microseconds, uint32_t threshold, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_get_error_page(struct nvme_controller *ctrlr, struct nvme_error_information_entry *payload, uint32_t num_entries, /* 0 = max */ nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_get_health_information_page(struct nvme_controller *ctrlr, uint32_t nsid, struct nvme_health_information_page *payload, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_get_firmware_page(struct nvme_controller *ctrlr, struct nvme_firmware_page *payload, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_create_io_cq(struct nvme_controller *ctrlr, struct nvme_qpair *io_que, uint16_t vector, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_create_io_sq(struct nvme_controller *ctrlr, struct nvme_qpair *io_que, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_delete_io_cq(struct nvme_controller *ctrlr, struct nvme_qpair *io_que, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_delete_io_sq(struct nvme_controller *ctrlr, struct nvme_qpair *io_que, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_set_num_queues(struct nvme_controller *ctrlr, uint32_t num_queues, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_set_async_event_config(struct nvme_controller *ctrlr, union nvme_critical_warning_state state, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_ctrlr_cmd_abort(struct nvme_controller *ctrlr, uint16_t cid, uint16_t sqid, nvme_cb_fn_t cb_fn, void *cb_arg); void nvme_completion_poll_cb(void *arg, const struct nvme_completion *cpl); int nvme_ctrlr_construct(struct nvme_controller *ctrlr, device_t dev); void nvme_ctrlr_destruct(struct nvme_controller *ctrlr, device_t dev); void nvme_ctrlr_shutdown(struct nvme_controller *ctrlr); int nvme_ctrlr_hw_reset(struct nvme_controller *ctrlr); void nvme_ctrlr_reset(struct nvme_controller *ctrlr); /* ctrlr defined as void * to allow use with config_intrhook. */ void nvme_ctrlr_start_config_hook(void *ctrlr_arg); void nvme_ctrlr_submit_admin_request(struct nvme_controller *ctrlr, struct nvme_request *req); void nvme_ctrlr_submit_io_request(struct nvme_controller *ctrlr, struct nvme_request *req); void nvme_ctrlr_post_failed_request(struct nvme_controller *ctrlr, struct nvme_request *req); void nvme_qpair_construct(struct nvme_qpair *qpair, uint32_t id, uint16_t vector, uint32_t num_entries, uint32_t num_trackers, struct nvme_controller *ctrlr); void nvme_qpair_submit_tracker(struct nvme_qpair *qpair, struct nvme_tracker *tr); void nvme_qpair_process_completions(struct nvme_qpair *qpair); void nvme_qpair_submit_request(struct nvme_qpair *qpair, struct nvme_request *req); void nvme_qpair_reset(struct nvme_qpair *qpair); void nvme_qpair_fail(struct nvme_qpair *qpair); void nvme_qpair_manual_complete_request(struct nvme_qpair *qpair, struct nvme_request *req, uint32_t sct, uint32_t sc, boolean_t print_on_error); void nvme_admin_qpair_enable(struct nvme_qpair *qpair); void nvme_admin_qpair_disable(struct nvme_qpair *qpair); void nvme_admin_qpair_destroy(struct nvme_qpair *qpair); void nvme_io_qpair_enable(struct nvme_qpair *qpair); void nvme_io_qpair_disable(struct nvme_qpair *qpair); void nvme_io_qpair_destroy(struct nvme_qpair *qpair); int nvme_ns_construct(struct nvme_namespace *ns, uint16_t id, struct nvme_controller *ctrlr); void nvme_ns_destruct(struct nvme_namespace *ns); void nvme_sysctl_initialize_ctrlr(struct nvme_controller *ctrlr); void nvme_dump_command(struct nvme_command *cmd); void nvme_dump_completion(struct nvme_completion *cpl); static __inline void nvme_single_map(void *arg, bus_dma_segment_t *seg, int nseg, int error) { uint64_t *bus_addr = (uint64_t *)arg; if (error != 0) printf("nvme_single_map err %d\n", error); *bus_addr = seg[0].ds_addr; } static __inline struct nvme_request * _nvme_allocate_request(nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; req = uma_zalloc(nvme_request_zone, M_NOWAIT | M_ZERO); if (req != NULL) { req->cb_fn = cb_fn; req->cb_arg = cb_arg; req->timeout = TRUE; } return (req); } static __inline struct nvme_request * nvme_allocate_request_vaddr(void *payload, uint32_t payload_size, nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; req = _nvme_allocate_request(cb_fn, cb_arg); if (req != NULL) { req->type = NVME_REQUEST_VADDR; req->u.payload = payload; req->payload_size = payload_size; } return (req); } static __inline struct nvme_request * nvme_allocate_request_null(nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; req = _nvme_allocate_request(cb_fn, cb_arg); if (req != NULL) req->type = NVME_REQUEST_NULL; return (req); } static __inline struct nvme_request * nvme_allocate_request_bio(struct bio *bio, nvme_cb_fn_t cb_fn, void *cb_arg) { struct nvme_request *req; req = _nvme_allocate_request(cb_fn, cb_arg); if (req != NULL) { #ifdef NVME_UNMAPPED_BIO_SUPPORT req->type = NVME_REQUEST_BIO; req->u.bio = bio; #else req->type = NVME_REQUEST_VADDR; req->u.payload = bio->bio_data; req->payload_size = bio->bio_bcount; #endif } return (req); } #define nvme_free_request(req) uma_zfree(nvme_request_zone, req) void nvme_notify_async_consumers(struct nvme_controller *ctrlr, const struct nvme_completion *async_cpl, uint32_t log_page_id, void *log_page_buffer, uint32_t log_page_size); void nvme_notify_fail_consumers(struct nvme_controller *ctrlr); void nvme_notify_new_controller(struct nvme_controller *ctrlr); void nvme_ctrlr_intx_handler(void *arg); #endif /* __NVME_PRIVATE_H__ */ Index: user/alc/PQ_LAUNDRY/sys/dev/nvme/nvme_sim.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/nvme/nvme_sim.c (nonexistent) +++ user/alc/PQ_LAUNDRY/sys/dev/nvme/nvme_sim.c (revision 303206) @@ -0,0 +1,400 @@ +/*- + * Copyright (c) 2016 Netflix, Inc + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer, + * without modification, immediately at the beginning of the file. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include +__FBSDID("$FreeBSD$"); + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include // Yes, this is wrong. +#include + +#include "nvme_private.h" + +#define ccb_accb_ptr spriv_ptr0 +#define ccb_ctrlr_ptr spriv_ptr1 +static void nvme_sim_action(struct cam_sim *sim, union ccb *ccb); +static void nvme_sim_poll(struct cam_sim *sim); + +#define sim2softc(sim) ((struct nvme_sim_softc *)cam_sim_softc(sim)) +#define sim2ns(sim) (sim2softc(sim)->s_ns) +#define sim2ctrlr(sim) (sim2softc(sim)->s_ctrlr) + +struct nvme_sim_softc +{ + struct nvme_controller *s_ctrlr; + struct nvme_namespace *s_ns; + struct cam_sim *s_sim; + struct cam_path *s_path; +}; + +static void +nvme_sim_nvmeio_done(void *ccb_arg, const struct nvme_completion *cpl) +{ + union ccb *ccb = (union ccb *)ccb_arg; + + /* + * Let the periph know the completion, and let it sort out what + * it means. Make our best guess, though for the status code. + */ + memcpy(&ccb->nvmeio.cpl, cpl, sizeof(*cpl)); + if (nvme_completion_is_error(cpl)) + ccb->ccb_h.status = CAM_REQ_CMP_ERR; + else + ccb->ccb_h.status = CAM_REQ_CMP; + xpt_done(ccb); +} + +static void +nvme_sim_nvmeio(struct cam_sim *sim, union ccb *ccb) +{ + struct ccb_nvmeio *nvmeio = &ccb->nvmeio; + struct nvme_request *req; + void *payload; + uint32_t size; + struct nvme_controller *ctrlr; + + ctrlr = sim2ctrlr(sim); + payload = nvmeio->data_ptr; + size = nvmeio->dxfer_len; + /* SG LIST ??? */ + if ((nvmeio->ccb_h.flags & CAM_DATA_MASK) == CAM_DATA_BIO) + req = nvme_allocate_request_bio((struct bio *)payload, + nvme_sim_nvmeio_done, ccb); + else if (payload == NULL) + req = nvme_allocate_request_null(nvme_sim_nvmeio_done, ccb); + else + req = nvme_allocate_request_vaddr(payload, size, + nvme_sim_nvmeio_done, ccb); + + if (req == NULL) { + nvmeio->ccb_h.status = CAM_RESRC_UNAVAIL; + xpt_done(ccb); + return; + } + + memcpy(&req->cmd, &ccb->nvmeio.cmd, sizeof(ccb->nvmeio.cmd)); + + nvme_ctrlr_submit_io_request(ctrlr, req); + + ccb->ccb_h.status |= CAM_SIM_QUEUED; +} + +static void +nvme_sim_action(struct cam_sim *sim, union ccb *ccb) +{ + struct nvme_controller *ctrlr; + struct nvme_namespace *ns; + + CAM_DEBUG(ccb->ccb_h.path, CAM_DEBUG_TRACE, + ("nvme_sim_action: func= %#x\n", + ccb->ccb_h.func_code)); + + /* + * XXX when we support multiple namespaces in the base driver we'll need + * to revisit how all this gets stored and saved in the periph driver's + * reserved areas. Right now we store all three in the softc of the sim. + */ + ns = sim2ns(sim); + ctrlr = sim2ctrlr(sim); + + printf("Sim action: ctrlr %p ns %p\n", ctrlr, ns); + + mtx_assert(&ctrlr->lock, MA_OWNED); + + switch (ccb->ccb_h.func_code) { + case XPT_CALC_GEOMETRY: /* Calculate Geometry Totally nuts ? XXX */ + /* + * Only meaningful for old-school SCSI disks since only the SCSI + * da driver generates them. Reject all these that slip through. + */ + /*FALLTHROUGH*/ + case XPT_ABORT: /* Abort the specified CCB */ + case XPT_EN_LUN: /* Enable LUN as a target */ + case XPT_TARGET_IO: /* Execute target I/O request */ + case XPT_ACCEPT_TARGET_IO: /* Accept Host Target Mode CDB */ + case XPT_CONT_TARGET_IO: /* Continue Host Target I/O Connection*/ + /* + * Only target mode generates these, and only for SCSI. They are + * all invalid/unsupported for NVMe. + */ + ccb->ccb_h.status = CAM_REQ_INVALID; + break; + case XPT_SET_TRAN_SETTINGS: + /* + * NVMe doesn't really have different transfer settings, but + * other parts of CAM think failure here is a big deal. + */ + ccb->ccb_h.status = CAM_REQ_CMP; + break; + case XPT_PATH_INQ: /* Path routing inquiry */ + { + struct ccb_pathinq *cpi = &ccb->cpi; + + /* + * NVMe may have multiple LUNs on the same path. Current generation + * of NVMe devives support only a single name space. Multiple name + * space drives are coming, but it's unclear how we should report + * them up the stack. + */ + cpi->version_num = 1; + cpi->hba_inquiry = 0; + cpi->target_sprt = 0; + cpi->hba_misc = PIM_UNMAPPED /* | PIM_NOSCAN */; + cpi->hba_eng_cnt = 0; + cpi->max_target = 0; + cpi->max_lun = ctrlr->cdata.nn; + cpi->maxio = nvme_ns_get_max_io_xfer_size(ns); + cpi->initiator_id = 0; + cpi->bus_id = cam_sim_bus(sim); + cpi->base_transfer_speed = 4000000; /* 4 GB/s 4 lanes pcie 3 */ + strncpy(cpi->sim_vid, "FreeBSD", SIM_IDLEN); + strncpy(cpi->hba_vid, "NVMe", HBA_IDLEN); + strncpy(cpi->dev_name, cam_sim_name(sim), DEV_IDLEN); + cpi->unit_number = cam_sim_unit(sim); + cpi->transport = XPORT_NVME; /* XXX XPORT_PCIE ? */ + cpi->transport_version = 1; /* XXX Get PCIe spec ? */ + cpi->protocol = PROTO_NVME; + cpi->protocol_version = NVME_REV_1; /* Groks all 1.x NVMe cards */ + cpi->xport_specific.nvme.nsid = ns->id; + cpi->ccb_h.status = CAM_REQ_CMP; + break; + } + case XPT_GET_TRAN_SETTINGS: /* Get transport settings */ + { + struct ccb_trans_settings *cts; + struct ccb_trans_settings_nvme *nvmep; + struct ccb_trans_settings_nvme *nvmex; + + cts = &ccb->cts; + nvmex = &cts->xport_specific.nvme; + nvmep = &cts->proto_specific.nvme; + + nvmex->valid = CTS_NVME_VALID_SPEC; + nvmex->spec_major = 1; /* XXX read from card */ + nvmex->spec_minor = 2; + nvmex->spec_tiny = 0; + + nvmep->valid = CTS_NVME_VALID_SPEC; + nvmep->spec_major = 1; /* XXX read from card */ + nvmep->spec_minor = 2; + nvmep->spec_tiny = 0; + cts->transport = XPORT_NVME; + cts->protocol = PROTO_NVME; + cts->ccb_h.status = CAM_REQ_CMP; + break; + } + case XPT_TERM_IO: /* Terminate the I/O process */ + /* + * every driver handles this, but nothing generates it. Assume + * it's OK to just say 'that worked'. + */ + /*FALLTHROUGH*/ + case XPT_RESET_DEV: /* Bus Device Reset the specified device */ + case XPT_RESET_BUS: /* Reset the specified bus */ + /* + * NVMe doesn't really support physically resetting the bus. It's part + * of the bus scanning dance, so return sucess to tell the process to + * proceed. + */ + ccb->ccb_h.status = CAM_REQ_CMP; + break; + case XPT_NVME_IO: /* Execute the requested I/O operation */ + nvme_sim_nvmeio(sim, ccb); + return; /* no done */ + default: + ccb->ccb_h.status = CAM_REQ_INVALID; + break; + } + xpt_done(ccb); +} + +static void +nvme_sim_poll(struct cam_sim *sim) +{ + + nvme_ctrlr_intx_handler(sim2ctrlr(sim)); +} + +static void * +nvme_sim_new_controller(struct nvme_controller *ctrlr) +{ + struct cam_devq *devq; + int max_trans; + int unit; + struct nvme_sim_softc *sc = NULL; + + max_trans = 256;/* XXX not so simple -- must match queues */ + unit = device_get_unit(ctrlr->dev); + devq = cam_simq_alloc(max_trans); + if (devq == NULL) + return NULL; + + sc = malloc(sizeof(*sc), M_NVME, M_ZERO | M_WAITOK); + + sc->s_ctrlr = ctrlr; + + sc->s_sim = cam_sim_alloc(nvme_sim_action, nvme_sim_poll, + "nvme", sc, unit, &ctrlr->lock, max_trans, max_trans, devq); + if (sc->s_sim == NULL) { + printf("Failed to allocate a sim\n"); + cam_simq_free(devq); + free(sc, M_NVME); + return NULL; + } + + return sc; +} + +static void +nvme_sim_rescan_target(struct nvme_controller *ctrlr, struct cam_path *path) +{ + union ccb *ccb; + + ccb = xpt_alloc_ccb_nowait(); + if (ccb == NULL) { + printf("unable to alloc CCB for rescan\n"); + return; + } + + if (xpt_clone_path(&ccb->ccb_h.path, path) != CAM_REQ_CMP) { + printf("unable to copy path for rescan\n"); + xpt_free_ccb(ccb); + return; + } + + xpt_rescan(ccb); +} + +static void * +nvme_sim_new_ns(struct nvme_namespace *ns, void *sc_arg) +{ + struct nvme_sim_softc *sc = sc_arg; + struct nvme_controller *ctrlr = sc->s_ctrlr; + int i; + + sc->s_ns = ns; + + printf("Our SIM's softc %p ctrlr %p ns %p\n", sc, ctrlr, ns); + + /* + * XXX this is creating one bus per ns, but it should be one + * XXX target per controller, and one LUN per namespace. + * XXX Current drives only support one NS, so there's time + * XXX to fix it later when new drives arrive. + * + * XXX I'm pretty sure the xpt_bus_register() call below is + * XXX like super lame and it really belongs in the sim_new_ctrlr + * XXX callback. Then the create_path below would be pretty close + * XXX to being right. Except we should be per-ns not per-ctrlr + * XXX data. + */ + + mtx_lock(&ctrlr->lock); +/* Create bus */ + + /* + * XXX do I need to lock ctrlr->lock ? + * XXX do I need to lock the path? + * ata and scsi seem to in their code, but their discovery is + * somewhat more asynchronous. We're only every called one at a + * time, and nothing is in parallel. + */ + + i = 0; + if (xpt_bus_register(sc->s_sim, ctrlr->dev, 0) != CAM_SUCCESS) + goto error; + i++; + if (xpt_create_path(&sc->s_path, /*periph*/NULL, cam_sim_path(sc->s_sim), + 1, ns->id) != CAM_REQ_CMP) + goto error; + i++; + + sc->s_path->device->nvme_data = nvme_ns_get_data(ns); + sc->s_path->device->nvme_cdata = nvme_ctrlr_get_data(ns->ctrlr); + +/* Scan bus */ + printf("Initiate rescan of the bus\n"); + nvme_sim_rescan_target(ctrlr, sc->s_path); + + mtx_unlock(&ctrlr->lock); + + return ns; + +error: + switch (i) { + case 2: + xpt_free_path(sc->s_path); + case 1: + xpt_bus_deregister(cam_sim_path(sc->s_sim)); + case 0: + cam_sim_free(sc->s_sim, /*free_devq*/TRUE); + } + mtx_unlock(&ctrlr->lock); + return NULL; +} + +static void +nvme_sim_controller_fail(void *ctrlr_arg) +{ + /* XXX cleanup XXX */ +} + +struct nvme_consumer *consumer_cookie; + +static void +nvme_sim_init(void) +{ + + consumer_cookie = nvme_register_consumer(nvme_sim_new_ns, + nvme_sim_new_controller, NULL, nvme_sim_controller_fail); +} + +SYSINIT(nvme_sim_register, SI_SUB_DRIVERS, SI_ORDER_ANY, + nvme_sim_init, NULL); + +static void +nvme_sim_uninit(void) +{ + /* XXX Cleanup */ + + nvme_unregister_consumer(consumer_cookie); +} + +SYSUNINIT(nvme_sim_unregister, SI_SUB_DRIVERS, SI_ORDER_ANY, + nvme_sim_uninit, NULL); Property changes on: user/alc/PQ_LAUNDRY/sys/dev/nvme/nvme_sim.c ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: user/alc/PQ_LAUNDRY/sys/dev/pty/pty.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/pty/pty.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/pty/pty.c (revision 303206) @@ -1,169 +1,164 @@ /*- * Copyright (c) 2008 Ed Schouten * All rights reserved. * * Portions of this software were developed under sponsorship from Snow * B.V., the Netherlands. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include /* * This driver implements a BSD-style compatibility naming scheme for * the pts(4) driver. We just call into pts(4) to create the actual PTY. * To make sure we don't use the same PTY multiple times, we abuse * si_drv1 inside the cdev to mark whether the PTY is in use. * * It also implements a /dev/ptmx device node, which is useful for Linux * binary emulation. */ -static unsigned int pty_warningcnt = 1; +static unsigned pty_warningcnt = 1; SYSCTL_UINT(_kern, OID_AUTO, tty_pty_warningcnt, CTLFLAG_RW, - &pty_warningcnt, 0, - "Warnings that will be triggered upon legacy PTY allocation"); + &pty_warningcnt, 0, + "Warnings that will be triggered upon legacy PTY allocation"); static int ptydev_fdopen(struct cdev *dev, int fflags, struct thread *td, struct file *fp) { int error; char name[6]; /* "ttyXX" */ if (!atomic_cmpset_ptr((uintptr_t *)&dev->si_drv1, 0, 1)) return (EBUSY); /* Generate device name and create PTY. */ strlcpy(name, devtoname(dev), sizeof(name)); name[0] = 't'; error = pts_alloc_external(fflags & (FREAD|FWRITE), td, fp, dev, name); if (error != 0) { destroy_dev_sched(dev); return (error); } /* Raise a warning when a legacy PTY has been allocated. */ - if (pty_warningcnt > 0) { - pty_warningcnt--; - log(LOG_INFO, "pid %d (%s) is using legacy pty devices%s\n", - td->td_proc->p_pid, td->td_name, - pty_warningcnt ? "" : " - not logging anymore"); - } + counted_warning(&pty_warningcnt, "is using legacy pty devices"); return (0); } static struct cdevsw ptydev_cdevsw = { .d_version = D_VERSION, .d_fdopen = ptydev_fdopen, .d_name = "ptydev", }; static void pty_clone(void *arg, struct ucred *cr, char *name, int namelen, struct cdev **dev) { struct make_dev_args mda; int error; /* Cloning is already satisfied. */ if (*dev != NULL) return; /* Only catch /dev/ptyXX. */ if (namelen != 5 || bcmp(name, "pty", 3) != 0) return; /* Only catch /dev/pty[l-sL-S]X. */ if (!(name[3] >= 'l' && name[3] <= 's') && !(name[3] >= 'L' && name[3] <= 'S')) return; /* Only catch /dev/pty[l-sL-S][0-9a-v]. */ if (!(name[4] >= '0' && name[4] <= '9') && !(name[4] >= 'a' && name[4] <= 'v')) return; /* Create the controller device node. */ make_dev_args_init(&mda); mda.mda_flags = MAKEDEV_CHECKNAME | MAKEDEV_REF; mda.mda_devsw = &ptydev_cdevsw; mda.mda_uid = UID_ROOT; mda.mda_gid = GID_WHEEL; mda.mda_mode = 0666; error = make_dev_s(&mda, dev, "%s", name); if (error != 0) *dev = NULL; } static int ptmx_fdopen(struct cdev *dev __unused, int fflags, struct thread *td, struct file *fp) { return (pts_alloc(fflags & (FREAD|FWRITE), td, fp)); } static struct cdevsw ptmx_cdevsw = { .d_version = D_VERSION, .d_fdopen = ptmx_fdopen, .d_name = "ptmx", }; static int pty_modevent(module_t mod, int type, void *data) { switch(type) { case MOD_LOAD: EVENTHANDLER_REGISTER(dev_clone, pty_clone, 0, 1000); make_dev_credf(MAKEDEV_ETERNAL_KLD, &ptmx_cdevsw, 0, NULL, UID_ROOT, GID_WHEEL, 0666, "ptmx"); break; case MOD_SHUTDOWN: break; case MOD_UNLOAD: /* XXX: No unloading support yet. */ return (EBUSY); default: return (EOPNOTSUPP); } return (0); } DEV_MODULE(pty, pty_modevent, NULL); Index: user/alc/PQ_LAUNDRY/sys/dev/uart/uart_bus_acpi.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/uart/uart_bus_acpi.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/uart/uart_bus_acpi.c (revision 303206) @@ -1,89 +1,124 @@ /*- * Copyright (c) 2001 M. Warner Losh. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include +#include +#ifdef __aarch64__ +#include +#include +#include +#endif + static int uart_acpi_probe(device_t dev); static device_method_t uart_acpi_methods[] = { /* Device interface */ DEVMETHOD(device_probe, uart_acpi_probe), DEVMETHOD(device_attach, uart_bus_attach), DEVMETHOD(device_detach, uart_bus_detach), DEVMETHOD(device_resume, uart_bus_resume), { 0, 0 } }; static driver_t uart_acpi_driver = { uart_driver_name, uart_acpi_methods, sizeof(struct uart_softc), }; +#if defined(__i386__) || defined(__amd64__) static struct isa_pnp_id acpi_ns8250_ids[] = { {0x0005d041, "Standard PC COM port"}, /* PNP0500 */ {0x0105d041, "16550A-compatible COM port"}, /* PNP0501 */ {0x0205d041, "Multiport serial device (non-intelligent 16550)"}, /* PNP0502 */ {0x1005d041, "Generic IRDA-compatible device"}, /* PNP0510 */ {0x1105d041, "Generic IRDA-compatible device"}, /* PNP0511 */ {0x04f0235c, "Wacom Tablet PC Screen"}, /* WACF004 */ {0xe502aa1a, "Wacom Tablet at FuS Lifebook T"}, /* FUJ02E5 */ {0} }; +#endif +#ifdef __aarch64__ +static struct uart_class * +uart_acpi_find_device(device_t dev) +{ + struct acpi_uart_compat_data **cd; + ACPI_HANDLE h; + + if ((h = acpi_get_handle(dev)) == NULL) + return (NULL); + + SET_FOREACH(cd, uart_acpi_class_and_device_set) { + if (acpi_MatchHid(h, (*cd)->hid)) { + return ((*cd)->clas); + } + } + + return (NULL); +} +#endif + static int uart_acpi_probe(device_t dev) { struct uart_softc *sc; device_t parent; parent = device_get_parent(dev); sc = device_get_softc(dev); +#if defined(__i386__) || defined(__amd64__) if (!ISA_PNP_PROBE(parent, dev, acpi_ns8250_ids)) { sc->sc_class = &uart_ns8250_class; return (uart_bus_probe(dev, 0, 0, 0, 0)); } /* Add checks for non-ns8250 IDs here. */ +#elif defined(__aarch64__) + if ((sc->sc_class = uart_acpi_find_device(dev)) != NULL) + return (uart_bus_probe(dev, 2, 0, 0, 0)); +#endif + return (ENXIO); } DRIVER_MODULE(uart, acpi, uart_acpi_driver, uart_devclass, 0, 0); Index: user/alc/PQ_LAUNDRY/sys/dev/uart/uart_bus_fdt.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/uart/uart_bus_fdt.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/uart/uart_bus_fdt.c (revision 303206) @@ -1,135 +1,258 @@ /*- * Copyright (c) 2009-2010 The FreeBSD Foundation * All rights reserved. * * This software was developed by Semihalf under sponsorship from * the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_platform.h" #include #include #include #include #include #include #include #include #include #include #include #include #include static int uart_fdt_probe(device_t); static device_method_t uart_fdt_methods[] = { /* Device interface */ DEVMETHOD(device_probe, uart_fdt_probe), DEVMETHOD(device_attach, uart_bus_attach), DEVMETHOD(device_detach, uart_bus_detach), { 0, 0 } }; static driver_t uart_fdt_driver = { uart_driver_name, uart_fdt_methods, sizeof(struct uart_softc), }; int uart_fdt_get_clock(phandle_t node, pcell_t *cell) { /* clock-frequency is a FreeBSD-only extension. */ if ((OF_getencprop(node, "clock-frequency", cell, sizeof(*cell))) <= 0) { /* Try to retrieve parent 'bus-frequency' */ /* XXX this should go to simple-bus fixup or so */ if ((OF_getencprop(OF_parent(node), "bus-frequency", cell, sizeof(*cell))) <= 0) *cell = 0; } return (0); } int uart_fdt_get_shift(phandle_t node, pcell_t *cell) { if ((OF_getencprop(node, "reg-shift", cell, sizeof(*cell))) <= 0) return (-1); return (0); } static uintptr_t uart_fdt_find_device(device_t dev) { struct ofw_compat_data **cd; const struct ofw_compat_data *ocd; SET_FOREACH(cd, uart_fdt_class_and_device_set) { ocd = ofw_bus_search_compatible(dev, *cd); if (ocd->ocd_data != 0) return (ocd->ocd_data); } return (0); } static int +phandle_chosen_propdev(phandle_t chosen, const char *name, phandle_t *node) +{ + char buf[64]; + + if (OF_getprop(chosen, name, buf, sizeof(buf)) <= 0) + return (ENXIO); + if ((*node = OF_finddevice(buf)) == -1) + return (ENXIO); + + return (0); +} + +static const struct ofw_compat_data * +uart_fdt_find_compatible(phandle_t node, const struct ofw_compat_data *cd) +{ + const struct ofw_compat_data *ocd; + + for (ocd = cd; ocd->ocd_str != NULL; ocd++) { + if (fdt_is_compatible(node, ocd->ocd_str)) + return (ocd); + } + return (NULL); +} + +static uintptr_t +uart_fdt_find_by_node(phandle_t node, int class_list) +{ + struct ofw_compat_data **cd; + const struct ofw_compat_data *ocd; + + if (class_list) { + SET_FOREACH(cd, uart_fdt_class_set) { + ocd = uart_fdt_find_compatible(node, *cd); + if ((ocd != NULL) && (ocd->ocd_data != 0)) + return (ocd->ocd_data); + } + } else { + SET_FOREACH(cd, uart_fdt_class_and_device_set) { + ocd = uart_fdt_find_compatible(node, *cd); + if ((ocd != NULL) && (ocd->ocd_data != 0)) + return (ocd->ocd_data); + } + } + + return (0); +} + +int +uart_cpu_fdt_probe(struct uart_class **classp, bus_space_tag_t *bst, + bus_space_handle_t *bsh, int *baud, u_int *rclk, u_int *shiftp) +{ + const char *propnames[] = {"stdout-path", "linux,stdout-path", "stdout", + "stdin-path", "stdin", NULL}; + const char **name; + struct uart_class *class; + phandle_t node, chosen; + pcell_t br, clk, shift; + char *cp; + int err; + + /* Has the user forced a specific device node? */ + cp = kern_getenv("hw.fdt.console"); + if (cp == NULL) { + /* + * Retrieve /chosen/std{in,out}. + */ + node = -1; + if ((chosen = OF_finddevice("/chosen")) != -1) { + for (name = propnames; *name != NULL; name++) { + if (phandle_chosen_propdev(chosen, *name, + &node) == 0) + break; + } + } + if (chosen == -1 || *name == NULL) + node = OF_finddevice("serial0"); /* Last ditch */ + } else { + node = OF_finddevice(cp); + } + + if (node == -1) + return (ENXIO); + + /* + * Check old style of UART definition first. Unfortunately, the common + * FDT processing is not possible if we have clock, power domains and + * pinmux stuff. + */ + class = (struct uart_class *)uart_fdt_find_by_node(node, 0); + if (class != NULL) { + if ((err = uart_fdt_get_clock(node, &clk)) != 0) + return (err); + } else { + /* Check class only linker set */ + class = + (struct uart_class *)uart_fdt_find_by_node(node, 1); + if (class == NULL) + return (ENXIO); + clk = 0; + } + + /* + * Retrieve serial attributes. + */ + if (uart_fdt_get_shift(node, &shift) != 0) + shift = uart_getregshift(class); + + if (OF_getencprop(node, "current-speed", &br, sizeof(br)) <= 0) + br = 0; + + err = OF_decode_addr(node, 0, bst, bsh, NULL); + if (err != 0) + return (err); + + *classp = class; + *baud = br; + *rclk = clk; + *shiftp = shift; + + return (0); +} + +static int uart_fdt_probe(device_t dev) { struct uart_softc *sc; phandle_t node; pcell_t clock, shift; int err; sc = device_get_softc(dev); if (!ofw_bus_status_okay(dev)) return (ENXIO); sc->sc_class = (struct uart_class *)uart_fdt_find_device(dev); if (sc->sc_class == NULL) return (ENXIO); node = ofw_bus_get_node(dev); if ((err = uart_fdt_get_clock(node, &clock)) != 0) return (err); if (uart_fdt_get_shift(node, &shift) != 0) shift = uart_getregshift(sc->sc_class); return (uart_bus_probe(dev, (int)shift, (int)clock, 0, 0)); } DRIVER_MODULE(uart, simplebus, uart_fdt_driver, uart_devclass, 0, 0); DRIVER_MODULE(uart, ofwbus, uart_fdt_driver, uart_devclass, 0, 0); Index: user/alc/PQ_LAUNDRY/sys/dev/uart/uart_cpu_acpi.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/uart/uart_cpu_acpi.h (nonexistent) +++ user/alc/PQ_LAUNDRY/sys/dev/uart/uart_cpu_acpi.h (revision 303206) @@ -0,0 +1,62 @@ +/*- + * Copyright (c) 2015 Michal Meloun + * Copyright (c) 2016 The FreeBSD Foundation + * All rights reserved. + * + * This software was developed by Andrew Turner under + * sponsorship from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef _DEV_UART_CPU_ACPI_H_ +#define _DEV_UART_CPU_ACPI_H_ + +#include + +struct uart_class; + +struct acpi_uart_compat_data { + const char *hid; + struct uart_class *clas; +}; + +/* + * If your UART driver implements only uart_class and uses uart_cpu_acpi.c + * for device instantiation, then use UART_ACPI_CLASS_AND_DEVICE for its + * declaration + */ +SET_DECLARE(uart_acpi_class_and_device_set, struct acpi_uart_compat_data); +#define UART_ACPI_CLASS_AND_DEVICE(data) \ + DATA_SET(uart_acpi_class_and_device_set, data) + +/* + * If your UART driver implements uart_class and custom device layer, + * then use UART_ACPI_CLASS for its declaration + */ +SET_DECLARE(uart_acpi_class_set, struct acpi_uart_compat_data); +#define UART_ACPI_CLASS(data) \ + DATA_SET(uart_acpi_class_set, data) + +#endif /* _DEV_UART_CPU_ACPI_H_ */ Property changes on: user/alc/PQ_LAUNDRY/sys/dev/uart/uart_cpu_acpi.h ___________________________________________________________________ Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Index: user/alc/PQ_LAUNDRY/sys/dev/uart/uart_cpu_fdt.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/uart/uart_cpu_fdt.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/uart/uart_cpu_fdt.c (revision 303206) @@ -1,208 +1,160 @@ /*- * Copyright (c) 2009-2010 The FreeBSD Foundation * All rights reserved. * * This software was developed by Semihalf under sponsorship from * the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_platform.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* * UART console routines. */ bus_space_tag_t uart_bus_space_io; bus_space_tag_t uart_bus_space_mem; int uart_cpu_eqres(struct uart_bas *b1, struct uart_bas *b2) { if (b1->bst != b2->bst) return (0); if (pmap_kextract(b1->bsh) == 0) return (0); if (pmap_kextract(b2->bsh) == 0) return (0); return ((pmap_kextract(b1->bsh) == pmap_kextract(b2->bsh)) ? 1 : 0); } static int phandle_chosen_propdev(phandle_t chosen, const char *name, phandle_t *node) { char buf[64]; if (OF_getprop(chosen, name, buf, sizeof(buf)) <= 0) return (ENXIO); if ((*node = OF_finddevice(buf)) == -1) return (ENXIO); return (0); } static const struct ofw_compat_data * uart_fdt_find_compatible(phandle_t node, const struct ofw_compat_data *cd) { const struct ofw_compat_data *ocd; for (ocd = cd; ocd->ocd_str != NULL; ocd++) { if (fdt_is_compatible(node, ocd->ocd_str)) return (ocd); } return (NULL); } static uintptr_t uart_fdt_find_by_node(phandle_t node, int class_list) { struct ofw_compat_data **cd; const struct ofw_compat_data *ocd; if (class_list) { SET_FOREACH(cd, uart_fdt_class_set) { ocd = uart_fdt_find_compatible(node, *cd); if ((ocd != NULL) && (ocd->ocd_data != 0)) return (ocd->ocd_data); } } else { SET_FOREACH(cd, uart_fdt_class_and_device_set) { ocd = uart_fdt_find_compatible(node, *cd); if ((ocd != NULL) && (ocd->ocd_data != 0)) return (ocd->ocd_data); } } return (0); } int uart_cpu_getdev(int devtype, struct uart_devinfo *di) { - const char *propnames[] = {"stdout-path", "linux,stdout-path", "stdout", - "stdin-path", "stdin", NULL}; - const char **name; struct uart_class *class; - phandle_t node, chosen; - pcell_t shift, br, rclk; - char *cp; - int err; + bus_space_tag_t bst; + bus_space_handle_t bsh; + u_int shift, rclk; + int br, err; /* Allow overriding the FDT using the environment. */ class = &uart_ns8250_class; err = uart_getenv(devtype, di, class); if (!err) return (0); if (devtype != UART_DEV_CONSOLE) return (ENXIO); - /* Has the user forced a specific device node? */ - cp = kern_getenv("hw.fdt.console"); - if (cp == NULL) { - /* - * Retrieve /chosen/std{in,out}. - */ - node = -1; - if ((chosen = OF_finddevice("/chosen")) != -1) { - for (name = propnames; *name != NULL; name++) { - if (phandle_chosen_propdev(chosen, *name, - &node) == 0) - break; - } - } - if (chosen == -1 || *name == NULL) - node = OF_finddevice("serial0"); /* Last ditch */ - } else { - node = OF_finddevice(cp); - } + err = uart_cpu_fdt_probe(&class, &bst, &bsh, &br, &rclk, &shift); + if (err != 0) + return (err); - if (node == -1) /* Can't find anything */ - return (ENXIO); - /* - * Check old style of UART definition first. Unfortunately, the common - * FDT processing is not possible if we have clock, power domains and - * pinmux stuff. - */ - class = (struct uart_class *)uart_fdt_find_by_node(node, 0); - if (class != NULL) { - if ((err = uart_fdt_get_clock(node, &rclk)) != 0) - return (err); - } else { - /* Check class only linker set */ - class = - (struct uart_class *)uart_fdt_find_by_node(node, 1); - if (class == NULL) - return (ENXIO); - rclk = 0; - } - - /* - * Retrieve serial attributes. - */ - if (uart_fdt_get_shift(node, &shift) != 0) - shift = uart_getregshift(class); - - if (OF_getencprop(node, "current-speed", &br, sizeof(br)) <= 0) - br = 0; - - /* * Finalize configuration. */ di->bas.chan = 0; - di->bas.regshft = (u_int)shift; + di->bas.regshft = shift; di->baudrate = br; - di->bas.rclk = (u_int)rclk; + di->bas.rclk = rclk; di->ops = uart_getops(class); di->databits = 8; di->stopbits = 1; di->parity = UART_PARITY_NONE; + di->bas.bst = bst; + di->bas.bsh = bsh; - err = OF_decode_addr(node, 0, &di->bas.bst, &di->bas.bsh, NULL); uart_bus_space_mem = di->bas.bst; uart_bus_space_io = NULL; return (err); } Index: user/alc/PQ_LAUNDRY/sys/dev/uart/uart_cpu_fdt.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/uart/uart_cpu_fdt.h (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/uart/uart_cpu_fdt.h (revision 303206) @@ -1,56 +1,58 @@ /*- * Copyright 2015 Michal Meloun * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _DEV_UART_CPU_FDT_H_ #define _DEV_UART_CPU_FDT_H_ #include #include /* * If your UART driver implements only uart_class and uses uart_cpu_fdt.c * for device instantiation, then use UART_FDT_CLASS_AND_DEVICE for its * declaration */ SET_DECLARE(uart_fdt_class_and_device_set, struct ofw_compat_data ); #define UART_FDT_CLASS_AND_DEVICE(data) \ DATA_SET(uart_fdt_class_and_device_set, data) /* * If your UART driver implements uart_class and custom device layer, * then use UART_FDT_CLASS for its declaration */ SET_DECLARE(uart_fdt_class_set, struct ofw_compat_data ); #define UART_FDT_CLASS(data) \ DATA_SET(uart_fdt_class_set, data) +int uart_cpu_fdt_probe(struct uart_class **, bus_space_tag_t *, + bus_space_handle_t *, int *, u_int *, u_int *); int uart_fdt_get_clock(phandle_t node, pcell_t *cell); int uart_fdt_get_shift(phandle_t node, pcell_t *cell); #endif /* _DEV_UART_CPU_FDT_H_ */ Index: user/alc/PQ_LAUNDRY/sys/dev/uart/uart_dev_pl011.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/uart/uart_dev_pl011.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/uart/uart_dev_pl011.c (revision 303206) @@ -1,505 +1,524 @@ /*- * Copyright (c) 2012 Semihalf. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ +#include "opt_acpi.h" +#include "opt_platform.h" + #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include +#ifdef DEV_ACPI +#include +#endif +#ifdef FDT #include +#endif #include #include "uart_if.h" #include /* PL011 UART registers and masks*/ #define UART_DR 0x00 /* Data register */ #define DR_FE (1 << 8) /* Framing error */ #define DR_PE (1 << 9) /* Parity error */ #define DR_BE (1 << 10) /* Break error */ #define DR_OE (1 << 11) /* Overrun error */ #define UART_FR 0x06 /* Flag register */ #define FR_TXFF (1 << 5) /* Transmit FIFO/reg full */ #define FR_RXFF (1 << 6) /* Receive FIFO/reg full */ #define FR_TXFE (1 << 7) /* Transmit FIFO/reg empty */ #define UART_IBRD 0x09 /* Integer baud rate register */ #define IBRD_BDIVINT 0xffff /* Significant part of int. divisor value */ #define UART_FBRD 0x0a /* Fractional baud rate register */ #define FBRD_BDIVFRAC 0x3f /* Significant part of frac. divisor value */ #define UART_LCR_H 0x0b /* Line control register */ #define LCR_H_WLEN8 (0x3 << 5) #define LCR_H_WLEN7 (0x2 << 5) #define LCR_H_WLEN6 (0x1 << 5) #define LCR_H_FEN (1 << 4) /* FIFO mode enable */ #define LCR_H_STP2 (1 << 3) /* 2 stop frames at the end */ #define LCR_H_EPS (1 << 2) /* Even parity select */ #define LCR_H_PEN (1 << 1) /* Parity enable */ #define UART_CR 0x0c /* Control register */ #define CR_RXE (1 << 9) /* Receive enable */ #define CR_TXE (1 << 8) /* Transmit enable */ #define CR_UARTEN (1 << 0) /* UART enable */ #define UART_IMSC 0x0e /* Interrupt mask set/clear register */ #define IMSC_MASK_ALL 0x7ff /* Mask all interrupts */ #define UART_RIS 0x0f /* Raw interrupt status register */ #define UART_RXREADY (1 << 4) /* RX buffer full */ #define UART_TXEMPTY (1 << 5) /* TX buffer empty */ #define RIS_RTIM (1 << 6) /* Receive timeout */ #define RIS_FE (1 << 7) /* Framing error interrupt status */ #define RIS_PE (1 << 8) /* Parity error interrupt status */ #define RIS_BE (1 << 9) /* Break error interrupt status */ #define RIS_OE (1 << 10) /* Overrun interrupt status */ #define UART_MIS 0x10 /* Masked interrupt status register */ #define UART_ICR 0x11 /* Interrupt clear register */ /* * FIXME: actual register size is SoC-dependent, we need to handle it */ #define __uart_getreg(bas, reg) \ bus_space_read_4((bas)->bst, (bas)->bsh, uart_regofs(bas, reg)) #define __uart_setreg(bas, reg, value) \ bus_space_write_4((bas)->bst, (bas)->bsh, uart_regofs(bas, reg), value) /* * Low-level UART interface. */ static int uart_pl011_probe(struct uart_bas *bas); static void uart_pl011_init(struct uart_bas *bas, int, int, int, int); static void uart_pl011_term(struct uart_bas *bas); static void uart_pl011_putc(struct uart_bas *bas, int); static int uart_pl011_rxready(struct uart_bas *bas); static int uart_pl011_getc(struct uart_bas *bas, struct mtx *); static struct uart_ops uart_pl011_ops = { .probe = uart_pl011_probe, .init = uart_pl011_init, .term = uart_pl011_term, .putc = uart_pl011_putc, .rxready = uart_pl011_rxready, .getc = uart_pl011_getc, }; static int uart_pl011_probe(struct uart_bas *bas) { return (0); } static void uart_pl011_param(struct uart_bas *bas, int baudrate, int databits, int stopbits, int parity) { uint32_t ctrl, line; uint32_t baud; /* * Zero all settings to make sure * UART is disabled and not configured */ ctrl = line = 0x0; __uart_setreg(bas, UART_CR, ctrl); /* As we know UART is disabled we may setup the line */ switch (databits) { case 7: line |= LCR_H_WLEN7; break; case 6: line |= LCR_H_WLEN6; break; case 8: default: line |= LCR_H_WLEN8; break; } if (stopbits == 2) line |= LCR_H_STP2; else line &= ~LCR_H_STP2; if (parity) line |= LCR_H_PEN; else line &= ~LCR_H_PEN; /* Configure the rest */ line &= ~LCR_H_FEN; ctrl |= (CR_RXE | CR_TXE | CR_UARTEN); if (bas->rclk != 0 && baudrate != 0) { baud = bas->rclk * 4 / baudrate; __uart_setreg(bas, UART_IBRD, ((uint32_t)(baud >> 6)) & IBRD_BDIVINT); __uart_setreg(bas, UART_FBRD, (uint32_t)(baud & 0x3F) & FBRD_BDIVFRAC); } /* Add config. to line before reenabling UART */ __uart_setreg(bas, UART_LCR_H, (__uart_getreg(bas, UART_LCR_H) & ~0xff) | line); __uart_setreg(bas, UART_CR, ctrl); } static void uart_pl011_init(struct uart_bas *bas, int baudrate, int databits, int stopbits, int parity) { /* Mask all interrupts */ __uart_setreg(bas, UART_IMSC, __uart_getreg(bas, UART_IMSC) & ~IMSC_MASK_ALL); uart_pl011_param(bas, baudrate, databits, stopbits, parity); } static void uart_pl011_term(struct uart_bas *bas) { } static void uart_pl011_putc(struct uart_bas *bas, int c) { /* Wait when TX FIFO full. Push character otherwise. */ while (__uart_getreg(bas, UART_FR) & FR_TXFF) ; __uart_setreg(bas, UART_DR, c & 0xff); } static int uart_pl011_rxready(struct uart_bas *bas) { return (__uart_getreg(bas, UART_FR) & FR_RXFF); } static int uart_pl011_getc(struct uart_bas *bas, struct mtx *hwmtx) { int c; while (!uart_pl011_rxready(bas)) ; c = __uart_getreg(bas, UART_DR) & 0xff; return (c); } /* * High-level UART interface. */ struct uart_pl011_softc { struct uart_softc base; uint8_t fcr; uint8_t ier; uint8_t mcr; uint8_t ier_mask; uint8_t ier_rxbits; }; static int uart_pl011_bus_attach(struct uart_softc *); static int uart_pl011_bus_detach(struct uart_softc *); static int uart_pl011_bus_flush(struct uart_softc *, int); static int uart_pl011_bus_getsig(struct uart_softc *); static int uart_pl011_bus_ioctl(struct uart_softc *, int, intptr_t); static int uart_pl011_bus_ipend(struct uart_softc *); static int uart_pl011_bus_param(struct uart_softc *, int, int, int, int); static int uart_pl011_bus_probe(struct uart_softc *); static int uart_pl011_bus_receive(struct uart_softc *); static int uart_pl011_bus_setsig(struct uart_softc *, int); static int uart_pl011_bus_transmit(struct uart_softc *); static void uart_pl011_bus_grab(struct uart_softc *); static void uart_pl011_bus_ungrab(struct uart_softc *); static kobj_method_t uart_pl011_methods[] = { KOBJMETHOD(uart_attach, uart_pl011_bus_attach), KOBJMETHOD(uart_detach, uart_pl011_bus_detach), KOBJMETHOD(uart_flush, uart_pl011_bus_flush), KOBJMETHOD(uart_getsig, uart_pl011_bus_getsig), KOBJMETHOD(uart_ioctl, uart_pl011_bus_ioctl), KOBJMETHOD(uart_ipend, uart_pl011_bus_ipend), KOBJMETHOD(uart_param, uart_pl011_bus_param), KOBJMETHOD(uart_probe, uart_pl011_bus_probe), KOBJMETHOD(uart_receive, uart_pl011_bus_receive), KOBJMETHOD(uart_setsig, uart_pl011_bus_setsig), KOBJMETHOD(uart_transmit, uart_pl011_bus_transmit), KOBJMETHOD(uart_grab, uart_pl011_bus_grab), KOBJMETHOD(uart_ungrab, uart_pl011_bus_ungrab), { 0, 0 } }; static struct uart_class uart_pl011_class = { "uart_pl011", uart_pl011_methods, sizeof(struct uart_pl011_softc), .uc_ops = &uart_pl011_ops, .uc_range = 0x48, .uc_rclk = 0, .uc_rshift = 2 }; + +#ifdef FDT static struct ofw_compat_data compat_data[] = { {"arm,pl011", (uintptr_t)&uart_pl011_class}, {NULL, (uintptr_t)NULL}, }; UART_FDT_CLASS_AND_DEVICE(compat_data); +#endif + +#ifdef DEV_ACPI +static struct acpi_uart_compat_data acpi_compat_data[] = { + {"ARMH0011", &uart_pl011_class}, + {NULL, NULL}, +}; +UART_ACPI_CLASS_AND_DEVICE(acpi_compat_data); +#endif static int uart_pl011_bus_attach(struct uart_softc *sc) { struct uart_bas *bas; int reg; bas = &sc->sc_bas; /* Enable interrupts */ reg = (UART_RXREADY | RIS_RTIM | UART_TXEMPTY); __uart_setreg(bas, UART_IMSC, reg); /* Clear interrupts */ __uart_setreg(bas, UART_ICR, IMSC_MASK_ALL); return (0); } static int uart_pl011_bus_detach(struct uart_softc *sc) { return (0); } static int uart_pl011_bus_flush(struct uart_softc *sc, int what) { return (0); } static int uart_pl011_bus_getsig(struct uart_softc *sc) { return (0); } static int uart_pl011_bus_ioctl(struct uart_softc *sc, int request, intptr_t data) { struct uart_bas *bas; int error; bas = &sc->sc_bas; error = 0; uart_lock(sc->sc_hwmtx); switch (request) { case UART_IOCTL_BREAK: break; case UART_IOCTL_BAUD: *(int*)data = 115200; break; default: error = EINVAL; break; } uart_unlock(sc->sc_hwmtx); return (error); } static int uart_pl011_bus_ipend(struct uart_softc *sc) { struct uart_bas *bas; uint32_t ints; int ipend; int reg; bas = &sc->sc_bas; uart_lock(sc->sc_hwmtx); ints = __uart_getreg(bas, UART_MIS); ipend = 0; if (ints & (UART_RXREADY | RIS_RTIM)) ipend |= SER_INT_RXREADY; if (ints & RIS_BE) ipend |= SER_INT_BREAK; if (ints & RIS_OE) ipend |= SER_INT_OVERRUN; if (ints & UART_TXEMPTY) { if (sc->sc_txbusy) ipend |= SER_INT_TXIDLE; /* Disable TX interrupt */ reg = __uart_getreg(bas, UART_IMSC); reg &= ~(UART_TXEMPTY); __uart_setreg(bas, UART_IMSC, reg); } uart_unlock(sc->sc_hwmtx); return (ipend); } static int uart_pl011_bus_param(struct uart_softc *sc, int baudrate, int databits, int stopbits, int parity) { uart_lock(sc->sc_hwmtx); uart_pl011_param(&sc->sc_bas, baudrate, databits, stopbits, parity); uart_unlock(sc->sc_hwmtx); return (0); } static int uart_pl011_bus_probe(struct uart_softc *sc) { device_set_desc(sc->sc_dev, "PrimeCell UART (PL011)"); sc->sc_rxfifosz = 1; sc->sc_txfifosz = 1; return (0); } static int uart_pl011_bus_receive(struct uart_softc *sc) { struct uart_bas *bas; uint32_t ints, xc; int rx; bas = &sc->sc_bas; uart_lock(sc->sc_hwmtx); ints = __uart_getreg(bas, UART_MIS); while (ints & (UART_RXREADY | RIS_RTIM)) { if (uart_rx_full(sc)) { sc->sc_rxbuf[sc->sc_rxput] = UART_STAT_OVERRUN; break; } xc = __uart_getreg(bas, UART_DR); rx = xc & 0xff; if (xc & DR_FE) rx |= UART_STAT_FRAMERR; if (xc & DR_PE) rx |= UART_STAT_PARERR; __uart_setreg(bas, UART_ICR, (UART_RXREADY | RIS_RTIM)); uart_rx_put(sc, rx); ints = __uart_getreg(bas, UART_MIS); } uart_unlock(sc->sc_hwmtx); return (0); } static int uart_pl011_bus_setsig(struct uart_softc *sc, int sig) { return (0); } static int uart_pl011_bus_transmit(struct uart_softc *sc) { struct uart_bas *bas; int reg; int i; bas = &sc->sc_bas; uart_lock(sc->sc_hwmtx); for (i = 0; i < sc->sc_txdatasz; i++) { __uart_setreg(bas, UART_DR, sc->sc_txbuf[i]); uart_barrier(bas); } /* If not empty wait until it is */ if ((__uart_getreg(bas, UART_FR) & FR_TXFE) != FR_TXFE) { sc->sc_txbusy = 1; /* Enable TX interrupt */ reg = __uart_getreg(bas, UART_IMSC); reg |= (UART_TXEMPTY); __uart_setreg(bas, UART_IMSC, reg); } uart_unlock(sc->sc_hwmtx); /* No interrupt expected, schedule the next fifo write */ if (!sc->sc_txbusy) uart_sched_softih(sc, SER_INT_TXIDLE); return (0); } static void uart_pl011_bus_grab(struct uart_softc *sc) { struct uart_bas *bas; bas = &sc->sc_bas; uart_lock(sc->sc_hwmtx); __uart_setreg(bas, UART_IMSC, /* Switch to RX polling while grabbed */ ~UART_RXREADY & __uart_getreg(bas, UART_IMSC)); uart_unlock(sc->sc_hwmtx); } static void uart_pl011_bus_ungrab(struct uart_softc *sc) { struct uart_bas *bas; bas = &sc->sc_bas; uart_lock(sc->sc_hwmtx); __uart_setreg(bas, UART_IMSC, /* Switch to RX interrupts while not grabbed */ UART_RXREADY | __uart_getreg(bas, UART_IMSC)); uart_unlock(sc->sc_hwmtx); } Index: user/alc/PQ_LAUNDRY/sys/dev/urtwn/if_urtwn.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/urtwn/if_urtwn.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/urtwn/if_urtwn.c (revision 303206) @@ -1,5665 +1,5669 @@ /* $OpenBSD: if_urtwn.c,v 1.16 2011/02/10 17:26:40 jakemsr Exp $ */ /*- * Copyright (c) 2010 Damien Bergamini * Copyright (c) 2014 Kevin Lo * Copyright (c) 2015 Andriy Voskoboinyk * * Permission to use, copy, modify, and distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ #include __FBSDID("$FreeBSD$"); /* * Driver for Realtek RTL8188CE-VAU/RTL8188CUS/RTL8188EU/RTL8188RU/RTL8192CU. */ #include "opt_wlan.h" #include "opt_urtwn.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef IEEE80211_SUPPORT_SUPERG #include #endif #include #include #include #include "usbdevs.h" #include #include #include #ifdef USB_DEBUG enum { URTWN_DEBUG_XMIT = 0x00000001, /* basic xmit operation */ URTWN_DEBUG_RECV = 0x00000002, /* basic recv operation */ URTWN_DEBUG_STATE = 0x00000004, /* 802.11 state transitions */ URTWN_DEBUG_RA = 0x00000008, /* f/w rate adaptation setup */ URTWN_DEBUG_USB = 0x00000010, /* usb requests */ URTWN_DEBUG_FIRMWARE = 0x00000020, /* firmware(9) loading debug */ URTWN_DEBUG_BEACON = 0x00000040, /* beacon handling */ URTWN_DEBUG_INTR = 0x00000080, /* ISR */ URTWN_DEBUG_TEMP = 0x00000100, /* temperature calibration */ URTWN_DEBUG_ROM = 0x00000200, /* various ROM info */ URTWN_DEBUG_KEY = 0x00000400, /* crypto keys management */ URTWN_DEBUG_TXPWR = 0x00000800, /* dump Tx power values */ URTWN_DEBUG_RSSI = 0x00001000, /* dump RSSI lookups */ URTWN_DEBUG_ANY = 0xffffffff }; #define URTWN_DPRINTF(_sc, _m, ...) do { \ if ((_sc)->sc_debug & (_m)) \ device_printf((_sc)->sc_dev, __VA_ARGS__); \ } while(0) #else #define URTWN_DPRINTF(_sc, _m, ...) do { (void) sc; } while (0) #endif #define IEEE80211_HAS_ADDR4(wh) IEEE80211_IS_DSTODS(wh) static int urtwn_enable_11n = 1; TUNABLE_INT("hw.usb.urtwn.enable_11n", &urtwn_enable_11n); /* various supported device vendors/products */ static const STRUCT_USB_HOST_ID urtwn_devs[] = { #define URTWN_DEV(v,p) { USB_VP(USB_VENDOR_##v, USB_PRODUCT_##v##_##p) } #define URTWN_RTL8188E_DEV(v,p) \ { USB_VPI(USB_VENDOR_##v, USB_PRODUCT_##v##_##p, URTWN_RTL8188E) } #define URTWN_RTL8188E 1 URTWN_DEV(ABOCOM, RTL8188CU_1), URTWN_DEV(ABOCOM, RTL8188CU_2), URTWN_DEV(ABOCOM, RTL8192CU), URTWN_DEV(ASUS, RTL8192CU), URTWN_DEV(ASUS, USBN10NANO), URTWN_DEV(AZUREWAVE, RTL8188CE_1), URTWN_DEV(AZUREWAVE, RTL8188CE_2), URTWN_DEV(AZUREWAVE, RTL8188CU), URTWN_DEV(BELKIN, F7D2102), URTWN_DEV(BELKIN, RTL8188CU), URTWN_DEV(BELKIN, RTL8192CU), URTWN_DEV(CHICONY, RTL8188CUS_1), URTWN_DEV(CHICONY, RTL8188CUS_2), URTWN_DEV(CHICONY, RTL8188CUS_3), URTWN_DEV(CHICONY, RTL8188CUS_4), URTWN_DEV(CHICONY, RTL8188CUS_5), URTWN_DEV(COREGA, RTL8192CU), URTWN_DEV(DLINK, RTL8188CU), URTWN_DEV(DLINK, RTL8192CU_1), URTWN_DEV(DLINK, RTL8192CU_2), URTWN_DEV(DLINK, RTL8192CU_3), URTWN_DEV(DLINK, DWA131B), URTWN_DEV(EDIMAX, EW7811UN), URTWN_DEV(EDIMAX, RTL8192CU), URTWN_DEV(FEIXUN, RTL8188CU), URTWN_DEV(FEIXUN, RTL8192CU), URTWN_DEV(GUILLEMOT, HWNUP150), URTWN_DEV(HAWKING, RTL8192CU), URTWN_DEV(HP3, RTL8188CU), URTWN_DEV(NETGEAR, WNA1000M), URTWN_DEV(NETGEAR, RTL8192CU), URTWN_DEV(NETGEAR4, RTL8188CU), URTWN_DEV(NOVATECH, RTL8188CU), URTWN_DEV(PLANEX2, RTL8188CU_1), URTWN_DEV(PLANEX2, RTL8188CU_2), URTWN_DEV(PLANEX2, RTL8188CU_3), URTWN_DEV(PLANEX2, RTL8188CU_4), URTWN_DEV(PLANEX2, RTL8188CUS), URTWN_DEV(PLANEX2, RTL8192CU), URTWN_DEV(REALTEK, RTL8188CE_0), URTWN_DEV(REALTEK, RTL8188CE_1), URTWN_DEV(REALTEK, RTL8188CTV), URTWN_DEV(REALTEK, RTL8188CU_0), URTWN_DEV(REALTEK, RTL8188CU_1), URTWN_DEV(REALTEK, RTL8188CU_2), URTWN_DEV(REALTEK, RTL8188CU_3), URTWN_DEV(REALTEK, RTL8188CU_COMBO), URTWN_DEV(REALTEK, RTL8188CUS), URTWN_DEV(REALTEK, RTL8188RU_1), URTWN_DEV(REALTEK, RTL8188RU_2), URTWN_DEV(REALTEK, RTL8188RU_3), URTWN_DEV(REALTEK, RTL8191CU), URTWN_DEV(REALTEK, RTL8192CE), URTWN_DEV(REALTEK, RTL8192CU), URTWN_DEV(SITECOMEU, RTL8188CU_1), URTWN_DEV(SITECOMEU, RTL8188CU_2), URTWN_DEV(SITECOMEU, RTL8192CU), URTWN_DEV(TRENDNET, RTL8188CU), URTWN_DEV(TRENDNET, RTL8192CU), URTWN_DEV(ZYXEL, RTL8192CU), /* URTWN_RTL8188E */ URTWN_RTL8188E_DEV(ABOCOM, RTL8188EU), URTWN_RTL8188E_DEV(DLINK, DWA123D1), URTWN_RTL8188E_DEV(DLINK, DWA125D1), URTWN_RTL8188E_DEV(ELECOM, WDC150SU2M), URTWN_RTL8188E_DEV(REALTEK, RTL8188ETV), URTWN_RTL8188E_DEV(REALTEK, RTL8188EU), #undef URTWN_RTL8188E_DEV #undef URTWN_DEV }; static device_probe_t urtwn_match; static device_attach_t urtwn_attach; static device_detach_t urtwn_detach; static usb_callback_t urtwn_bulk_tx_callback; static usb_callback_t urtwn_bulk_rx_callback; static void urtwn_sysctlattach(struct urtwn_softc *); static void urtwn_drain_mbufq(struct urtwn_softc *); static usb_error_t urtwn_do_request(struct urtwn_softc *, struct usb_device_request *, void *); static struct ieee80211vap *urtwn_vap_create(struct ieee80211com *, const char [IFNAMSIZ], int, enum ieee80211_opmode, int, const uint8_t [IEEE80211_ADDR_LEN], const uint8_t [IEEE80211_ADDR_LEN]); static void urtwn_vap_delete(struct ieee80211vap *); static void urtwn_vap_clear_tx(struct urtwn_softc *, struct ieee80211vap *); static void urtwn_vap_clear_tx_queue(struct urtwn_softc *, urtwn_datahead *, struct ieee80211vap *); static struct mbuf * urtwn_rx_copy_to_mbuf(struct urtwn_softc *, struct r92c_rx_stat *, int); static struct mbuf * urtwn_report_intr(struct usb_xfer *, struct urtwn_data *); static struct mbuf * urtwn_rxeof(struct urtwn_softc *, uint8_t *, int); static void urtwn_r88e_ratectl_tx_complete(struct urtwn_softc *, void *); static struct ieee80211_node *urtwn_rx_frame(struct urtwn_softc *, struct mbuf *, int8_t *); static void urtwn_txeof(struct urtwn_softc *, struct urtwn_data *, int); static int urtwn_alloc_list(struct urtwn_softc *, struct urtwn_data[], int, int); static int urtwn_alloc_rx_list(struct urtwn_softc *); static int urtwn_alloc_tx_list(struct urtwn_softc *); static void urtwn_free_list(struct urtwn_softc *, struct urtwn_data data[], int); static void urtwn_free_rx_list(struct urtwn_softc *); static void urtwn_free_tx_list(struct urtwn_softc *); static struct urtwn_data * _urtwn_getbuf(struct urtwn_softc *); static struct urtwn_data * urtwn_getbuf(struct urtwn_softc *); static usb_error_t urtwn_write_region_1(struct urtwn_softc *, uint16_t, uint8_t *, int); static usb_error_t urtwn_write_1(struct urtwn_softc *, uint16_t, uint8_t); static usb_error_t urtwn_write_2(struct urtwn_softc *, uint16_t, uint16_t); static usb_error_t urtwn_write_4(struct urtwn_softc *, uint16_t, uint32_t); static usb_error_t urtwn_read_region_1(struct urtwn_softc *, uint16_t, uint8_t *, int); static uint8_t urtwn_read_1(struct urtwn_softc *, uint16_t); static uint16_t urtwn_read_2(struct urtwn_softc *, uint16_t); static uint32_t urtwn_read_4(struct urtwn_softc *, uint16_t); static int urtwn_fw_cmd(struct urtwn_softc *, uint8_t, const void *, int); static void urtwn_cmdq_cb(void *, int); static int urtwn_cmd_sleepable(struct urtwn_softc *, const void *, size_t, CMD_FUNC_PROTO); static void urtwn_r92c_rf_write(struct urtwn_softc *, int, uint8_t, uint32_t); static void urtwn_r88e_rf_write(struct urtwn_softc *, int, uint8_t, uint32_t); static uint32_t urtwn_rf_read(struct urtwn_softc *, int, uint8_t); static int urtwn_llt_write(struct urtwn_softc *, uint32_t, uint32_t); static int urtwn_efuse_read_next(struct urtwn_softc *, uint8_t *); static int urtwn_efuse_read_data(struct urtwn_softc *, uint8_t *, uint8_t, uint8_t); #ifdef USB_DEBUG static void urtwn_dump_rom_contents(struct urtwn_softc *, uint8_t *, uint16_t); #endif static int urtwn_efuse_read(struct urtwn_softc *, uint8_t *, uint16_t); static int urtwn_efuse_switch_power(struct urtwn_softc *); static int urtwn_read_chipid(struct urtwn_softc *); static int urtwn_read_rom(struct urtwn_softc *); static int urtwn_r88e_read_rom(struct urtwn_softc *); static int urtwn_ra_init(struct urtwn_softc *); static void urtwn_init_beacon(struct urtwn_softc *, struct urtwn_vap *); static int urtwn_setup_beacon(struct urtwn_softc *, struct ieee80211_node *); static void urtwn_update_beacon(struct ieee80211vap *, int); static int urtwn_tx_beacon(struct urtwn_softc *sc, struct urtwn_vap *); static int urtwn_key_alloc(struct ieee80211vap *, struct ieee80211_key *, ieee80211_keyix *, ieee80211_keyix *); static void urtwn_key_set_cb(struct urtwn_softc *, union sec_param *); static void urtwn_key_del_cb(struct urtwn_softc *, union sec_param *); static int urtwn_key_set(struct ieee80211vap *, const struct ieee80211_key *); static int urtwn_key_delete(struct ieee80211vap *, const struct ieee80211_key *); static void urtwn_tsf_task_adhoc(void *, int); static void urtwn_tsf_sync_enable(struct urtwn_softc *, struct ieee80211vap *); static void urtwn_get_tsf(struct urtwn_softc *, uint64_t *); static void urtwn_set_led(struct urtwn_softc *, int, int); static void urtwn_set_mode(struct urtwn_softc *, uint8_t); static void urtwn_ibss_recv_mgmt(struct ieee80211_node *, struct mbuf *, int, const struct ieee80211_rx_stats *, int, int); static int urtwn_newstate(struct ieee80211vap *, enum ieee80211_state, int); static void urtwn_calib_to(void *); static void urtwn_calib_cb(struct urtwn_softc *, union sec_param *); static void urtwn_watchdog(void *); static void urtwn_update_avgrssi(struct urtwn_softc *, int, int8_t); static int8_t urtwn_get_rssi(struct urtwn_softc *, int, void *); static int8_t urtwn_r88e_get_rssi(struct urtwn_softc *, int, void *); static int urtwn_tx_data(struct urtwn_softc *, struct ieee80211_node *, struct mbuf *, struct urtwn_data *); static int urtwn_tx_raw(struct urtwn_softc *, struct ieee80211_node *, struct mbuf *, struct urtwn_data *, const struct ieee80211_bpf_params *); static void urtwn_tx_start(struct urtwn_softc *, struct mbuf *, uint8_t, struct urtwn_data *); static int urtwn_transmit(struct ieee80211com *, struct mbuf *); static void urtwn_start(struct urtwn_softc *); static void urtwn_parent(struct ieee80211com *); static int urtwn_r92c_power_on(struct urtwn_softc *); static int urtwn_r88e_power_on(struct urtwn_softc *); static void urtwn_r92c_power_off(struct urtwn_softc *); static void urtwn_r88e_power_off(struct urtwn_softc *); static int urtwn_llt_init(struct urtwn_softc *); #ifndef URTWN_WITHOUT_UCODE static void urtwn_fw_reset(struct urtwn_softc *); static void urtwn_r88e_fw_reset(struct urtwn_softc *); static int urtwn_fw_loadpage(struct urtwn_softc *, int, const uint8_t *, int); static int urtwn_load_firmware(struct urtwn_softc *); #endif static int urtwn_dma_init(struct urtwn_softc *); static int urtwn_mac_init(struct urtwn_softc *); static void urtwn_bb_init(struct urtwn_softc *); static void urtwn_rf_init(struct urtwn_softc *); static void urtwn_cam_init(struct urtwn_softc *); static int urtwn_cam_write(struct urtwn_softc *, uint32_t, uint32_t); static void urtwn_pa_bias_init(struct urtwn_softc *); static void urtwn_rxfilter_init(struct urtwn_softc *); static void urtwn_edca_init(struct urtwn_softc *); static void urtwn_write_txpower(struct urtwn_softc *, int, uint16_t[]); static void urtwn_get_txpower(struct urtwn_softc *, int, struct ieee80211_channel *, struct ieee80211_channel *, uint16_t[]); static void urtwn_r88e_get_txpower(struct urtwn_softc *, int, struct ieee80211_channel *, struct ieee80211_channel *, uint16_t[]); static void urtwn_set_txpower(struct urtwn_softc *, struct ieee80211_channel *, struct ieee80211_channel *); static void urtwn_set_rx_bssid_all(struct urtwn_softc *, int); static void urtwn_set_gain(struct urtwn_softc *, uint8_t); static void urtwn_scan_start(struct ieee80211com *); static void urtwn_scan_end(struct ieee80211com *); static void urtwn_getradiocaps(struct ieee80211com *, int, int *, struct ieee80211_channel[]); static void urtwn_set_channel(struct ieee80211com *); static int urtwn_wme_update(struct ieee80211com *); static void urtwn_update_slot(struct ieee80211com *); static void urtwn_update_slot_cb(struct urtwn_softc *, union sec_param *); static void urtwn_update_aifs(struct urtwn_softc *, uint8_t); static uint8_t urtwn_get_multi_pos(const uint8_t[]); static void urtwn_set_multi(struct urtwn_softc *); static void urtwn_set_promisc(struct urtwn_softc *); static void urtwn_update_promisc(struct ieee80211com *); static void urtwn_update_mcast(struct ieee80211com *); static struct ieee80211_node *urtwn_node_alloc(struct ieee80211vap *, const uint8_t mac[IEEE80211_ADDR_LEN]); static void urtwn_newassoc(struct ieee80211_node *, int); static void urtwn_node_free(struct ieee80211_node *); static void urtwn_set_chan(struct urtwn_softc *, struct ieee80211_channel *, struct ieee80211_channel *); static void urtwn_iq_calib(struct urtwn_softc *); static void urtwn_lc_calib(struct urtwn_softc *); static void urtwn_temp_calib(struct urtwn_softc *); static void urtwn_setup_static_keys(struct urtwn_softc *, struct urtwn_vap *); static int urtwn_init(struct urtwn_softc *); static void urtwn_stop(struct urtwn_softc *); static void urtwn_abort_xfers(struct urtwn_softc *); static int urtwn_raw_xmit(struct ieee80211_node *, struct mbuf *, const struct ieee80211_bpf_params *); static void urtwn_ms_delay(struct urtwn_softc *); /* Aliases. */ #define urtwn_bb_write urtwn_write_4 #define urtwn_bb_read urtwn_read_4 static const struct usb_config urtwn_config[URTWN_N_TRANSFER] = { [URTWN_BULK_RX] = { .type = UE_BULK, .endpoint = UE_ADDR_ANY, .direction = UE_DIR_IN, .bufsize = URTWN_RXBUFSZ, .flags = { .pipe_bof = 1, .short_xfer_ok = 1 }, .callback = urtwn_bulk_rx_callback, }, [URTWN_BULK_TX_BE] = { .type = UE_BULK, .endpoint = 0x03, .direction = UE_DIR_OUT, .bufsize = URTWN_TXBUFSZ, .flags = { .ext_buffer = 1, .pipe_bof = 1, .force_short_xfer = 1 }, .callback = urtwn_bulk_tx_callback, .timeout = URTWN_TX_TIMEOUT, /* ms */ }, [URTWN_BULK_TX_BK] = { .type = UE_BULK, .endpoint = 0x03, .direction = UE_DIR_OUT, .bufsize = URTWN_TXBUFSZ, .flags = { .ext_buffer = 1, .pipe_bof = 1, .force_short_xfer = 1, }, .callback = urtwn_bulk_tx_callback, .timeout = URTWN_TX_TIMEOUT, /* ms */ }, [URTWN_BULK_TX_VI] = { .type = UE_BULK, .endpoint = 0x02, .direction = UE_DIR_OUT, .bufsize = URTWN_TXBUFSZ, .flags = { .ext_buffer = 1, .pipe_bof = 1, .force_short_xfer = 1 }, .callback = urtwn_bulk_tx_callback, .timeout = URTWN_TX_TIMEOUT, /* ms */ }, [URTWN_BULK_TX_VO] = { .type = UE_BULK, .endpoint = 0x02, .direction = UE_DIR_OUT, .bufsize = URTWN_TXBUFSZ, .flags = { .ext_buffer = 1, .pipe_bof = 1, .force_short_xfer = 1 }, .callback = urtwn_bulk_tx_callback, .timeout = URTWN_TX_TIMEOUT, /* ms */ }, }; static const struct wme_to_queue { uint16_t reg; uint8_t qid; } wme2queue[WME_NUM_AC] = { { R92C_EDCA_BE_PARAM, URTWN_BULK_TX_BE}, { R92C_EDCA_BK_PARAM, URTWN_BULK_TX_BK}, { R92C_EDCA_VI_PARAM, URTWN_BULK_TX_VI}, { R92C_EDCA_VO_PARAM, URTWN_BULK_TX_VO} }; static const uint8_t urtwn_chan_2ghz[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 }; static int urtwn_match(device_t self) { struct usb_attach_arg *uaa = device_get_ivars(self); if (uaa->usb_mode != USB_MODE_HOST) return (ENXIO); if (uaa->info.bConfigIndex != URTWN_CONFIG_INDEX) return (ENXIO); if (uaa->info.bIfaceIndex != URTWN_IFACE_INDEX) return (ENXIO); return (usbd_lookup_id_by_uaa(urtwn_devs, sizeof(urtwn_devs), uaa)); } static void urtwn_update_chw(struct ieee80211com *ic) { } static int urtwn_ampdu_enable(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap) { /* We're driving this ourselves (eventually); don't involve net80211 */ return (0); } static int urtwn_attach(device_t self) { struct usb_attach_arg *uaa = device_get_ivars(self); struct urtwn_softc *sc = device_get_softc(self); struct ieee80211com *ic = &sc->sc_ic; int error; device_set_usb_desc(self); sc->sc_udev = uaa->device; sc->sc_dev = self; if (USB_GET_DRIVER_INFO(uaa) == URTWN_RTL8188E) sc->chip |= URTWN_CHIP_88E; #ifdef USB_DEBUG int debug; if (resource_int_value(device_get_name(sc->sc_dev), device_get_unit(sc->sc_dev), "debug", &debug) == 0) sc->sc_debug = debug; #endif mtx_init(&sc->sc_mtx, device_get_nameunit(self), MTX_NETWORK_LOCK, MTX_DEF); URTWN_CMDQ_LOCK_INIT(sc); URTWN_NT_LOCK_INIT(sc); callout_init(&sc->sc_calib_to, 0); callout_init(&sc->sc_watchdog_ch, 0); mbufq_init(&sc->sc_snd, ifqmaxlen); sc->sc_iface_index = URTWN_IFACE_INDEX; error = usbd_transfer_setup(uaa->device, &sc->sc_iface_index, sc->sc_xfer, urtwn_config, URTWN_N_TRANSFER, sc, &sc->sc_mtx); if (error) { device_printf(self, "could not allocate USB transfers, " "err=%s\n", usbd_errstr(error)); goto detach; } URTWN_LOCK(sc); error = urtwn_read_chipid(sc); if (error) { device_printf(sc->sc_dev, "unsupported test chip\n"); URTWN_UNLOCK(sc); goto detach; } /* Determine number of Tx/Rx chains. */ if (sc->chip & URTWN_CHIP_92C) { sc->ntxchains = (sc->chip & URTWN_CHIP_92C_1T2R) ? 1 : 2; sc->nrxchains = 2; } else { sc->ntxchains = 1; sc->nrxchains = 1; } if (sc->chip & URTWN_CHIP_88E) error = urtwn_r88e_read_rom(sc); else error = urtwn_read_rom(sc); if (error != 0) { device_printf(sc->sc_dev, "%s: cannot read rom, error %d\n", __func__, error); URTWN_UNLOCK(sc); goto detach; } device_printf(sc->sc_dev, "MAC/BB RTL%s, RF 6052 %dT%dR\n", (sc->chip & URTWN_CHIP_92C) ? "8192CU" : (sc->chip & URTWN_CHIP_88E) ? "8188EU" : (sc->board_type == R92C_BOARD_TYPE_HIGHPA) ? "8188RU" : (sc->board_type == R92C_BOARD_TYPE_MINICARD) ? "8188CE-VAU" : "8188CUS", sc->ntxchains, sc->nrxchains); URTWN_UNLOCK(sc); ic->ic_softc = sc; ic->ic_name = device_get_nameunit(self); ic->ic_phytype = IEEE80211_T_OFDM; /* not only, but not used */ ic->ic_opmode = IEEE80211_M_STA; /* default to BSS mode */ /* set device capabilities */ ic->ic_caps = IEEE80211_C_STA /* station mode */ | IEEE80211_C_MONITOR /* monitor mode */ | IEEE80211_C_IBSS /* adhoc mode */ | IEEE80211_C_HOSTAP /* hostap mode */ | IEEE80211_C_SHPREAMBLE /* short preamble supported */ | IEEE80211_C_SHSLOT /* short slot time supported */ #if 0 | IEEE80211_C_BGSCAN /* capable of bg scanning */ #endif | IEEE80211_C_WPA /* 802.11i */ | IEEE80211_C_WME /* 802.11e */ | IEEE80211_C_SWAMSDUTX /* Do software A-MSDU TX */ | IEEE80211_C_FF /* Atheros fast-frames */ ; ic->ic_cryptocaps = IEEE80211_CRYPTO_WEP | IEEE80211_CRYPTO_TKIP | IEEE80211_CRYPTO_AES_CCM; /* Assume they're all 11n capable for now */ if (urtwn_enable_11n) { device_printf(self, "enabling 11n\n"); ic->ic_htcaps = IEEE80211_HTC_HT | #if 0 IEEE80211_HTC_AMPDU | #endif IEEE80211_HTC_AMSDU | IEEE80211_HTCAP_MAXAMSDU_3839 | IEEE80211_HTCAP_SMPS_OFF; /* no HT40 just yet */ // ic->ic_htcaps |= IEEE80211_HTCAP_CHWIDTH40; /* XXX TODO: verify chains versus streams for urtwn */ ic->ic_txstream = sc->ntxchains; ic->ic_rxstream = sc->nrxchains; } /* XXX TODO: setup regdomain if R92C_CHANNEL_PLAN_BY_HW bit is set. */ urtwn_getradiocaps(ic, IEEE80211_CHAN_MAX, &ic->ic_nchans, ic->ic_channels); ieee80211_ifattach(ic); ic->ic_raw_xmit = urtwn_raw_xmit; ic->ic_scan_start = urtwn_scan_start; ic->ic_scan_end = urtwn_scan_end; ic->ic_getradiocaps = urtwn_getradiocaps; ic->ic_set_channel = urtwn_set_channel; ic->ic_transmit = urtwn_transmit; ic->ic_parent = urtwn_parent; ic->ic_vap_create = urtwn_vap_create; ic->ic_vap_delete = urtwn_vap_delete; ic->ic_wme.wme_update = urtwn_wme_update; ic->ic_updateslot = urtwn_update_slot; ic->ic_update_promisc = urtwn_update_promisc; ic->ic_update_mcast = urtwn_update_mcast; if (sc->chip & URTWN_CHIP_88E) { ic->ic_node_alloc = urtwn_node_alloc; ic->ic_newassoc = urtwn_newassoc; sc->sc_node_free = ic->ic_node_free; ic->ic_node_free = urtwn_node_free; } ic->ic_update_chw = urtwn_update_chw; ic->ic_ampdu_enable = urtwn_ampdu_enable; ieee80211_radiotap_attach(ic, &sc->sc_txtap.wt_ihdr, sizeof(sc->sc_txtap), URTWN_TX_RADIOTAP_PRESENT, &sc->sc_rxtap.wr_ihdr, sizeof(sc->sc_rxtap), URTWN_RX_RADIOTAP_PRESENT); TASK_INIT(&sc->cmdq_task, 0, urtwn_cmdq_cb, sc); urtwn_sysctlattach(sc); if (bootverbose) ieee80211_announce(ic); return (0); detach: urtwn_detach(self); return (ENXIO); /* failure */ } static void urtwn_sysctlattach(struct urtwn_softc *sc) { #ifdef USB_DEBUG struct sysctl_ctx_list *ctx = device_get_sysctl_ctx(sc->sc_dev); struct sysctl_oid *tree = device_get_sysctl_tree(sc->sc_dev); SYSCTL_ADD_U32(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "debug", CTLFLAG_RW, &sc->sc_debug, sc->sc_debug, "control debugging printfs"); #endif } static int urtwn_detach(device_t self) { struct urtwn_softc *sc = device_get_softc(self); struct ieee80211com *ic = &sc->sc_ic; /* Prevent further ioctls. */ URTWN_LOCK(sc); sc->sc_flags |= URTWN_DETACHED; URTWN_UNLOCK(sc); urtwn_stop(sc); callout_drain(&sc->sc_watchdog_ch); callout_drain(&sc->sc_calib_to); /* stop all USB transfers */ usbd_transfer_unsetup(sc->sc_xfer, URTWN_N_TRANSFER); if (ic->ic_softc == sc) { ieee80211_draintask(ic, &sc->cmdq_task); ieee80211_ifdetach(ic); } URTWN_NT_LOCK_DESTROY(sc); URTWN_CMDQ_LOCK_DESTROY(sc); mtx_destroy(&sc->sc_mtx); return (0); } static void urtwn_drain_mbufq(struct urtwn_softc *sc) { struct mbuf *m; struct ieee80211_node *ni; URTWN_ASSERT_LOCKED(sc); while ((m = mbufq_dequeue(&sc->sc_snd)) != NULL) { ni = (struct ieee80211_node *)m->m_pkthdr.rcvif; m->m_pkthdr.rcvif = NULL; ieee80211_free_node(ni); m_freem(m); } } static usb_error_t urtwn_do_request(struct urtwn_softc *sc, struct usb_device_request *req, void *data) { usb_error_t err; int ntries = 10; URTWN_ASSERT_LOCKED(sc); while (ntries--) { err = usbd_do_request_flags(sc->sc_udev, &sc->sc_mtx, req, data, 0, NULL, 250 /* ms */); if (err == 0) break; URTWN_DPRINTF(sc, URTWN_DEBUG_USB, "%s: control request failed, %s (retries left: %d)\n", __func__, usbd_errstr(err), ntries); usb_pause_mtx(&sc->sc_mtx, hz / 100); } return (err); } static struct ieee80211vap * urtwn_vap_create(struct ieee80211com *ic, const char name[IFNAMSIZ], int unit, enum ieee80211_opmode opmode, int flags, const uint8_t bssid[IEEE80211_ADDR_LEN], const uint8_t mac[IEEE80211_ADDR_LEN]) { struct urtwn_softc *sc = ic->ic_softc; struct urtwn_vap *uvp; struct ieee80211vap *vap; if (!TAILQ_EMPTY(&ic->ic_vaps)) /* only one at a time */ return (NULL); uvp = malloc(sizeof(struct urtwn_vap), M_80211_VAP, M_WAITOK | M_ZERO); vap = &uvp->vap; /* enable s/w bmiss handling for sta mode */ if (ieee80211_vap_setup(ic, vap, name, unit, opmode, flags | IEEE80211_CLONE_NOBEACONS, bssid) != 0) { /* out of memory */ free(uvp, M_80211_VAP); return (NULL); } if (opmode == IEEE80211_M_HOSTAP || opmode == IEEE80211_M_IBSS) urtwn_init_beacon(sc, uvp); /* override state transition machine */ uvp->newstate = vap->iv_newstate; vap->iv_newstate = urtwn_newstate; vap->iv_update_beacon = urtwn_update_beacon; vap->iv_key_alloc = urtwn_key_alloc; vap->iv_key_set = urtwn_key_set; vap->iv_key_delete = urtwn_key_delete; /* 802.11n parameters */ vap->iv_ampdu_density = IEEE80211_HTCAP_MPDUDENSITY_16; vap->iv_ampdu_rxmax = IEEE80211_HTCAP_MAXRXAMPDU_64K; if (opmode == IEEE80211_M_IBSS) { uvp->recv_mgmt = vap->iv_recv_mgmt; vap->iv_recv_mgmt = urtwn_ibss_recv_mgmt; TASK_INIT(&uvp->tsf_task_adhoc, 0, urtwn_tsf_task_adhoc, vap); } if (URTWN_CHIP_HAS_RATECTL(sc)) ieee80211_ratectl_init(vap); /* complete setup */ ieee80211_vap_attach(vap, ieee80211_media_change, ieee80211_media_status, mac); ic->ic_opmode = opmode; return (vap); } static void urtwn_vap_delete(struct ieee80211vap *vap) { struct ieee80211com *ic = vap->iv_ic; struct urtwn_softc *sc = ic->ic_softc; struct urtwn_vap *uvp = URTWN_VAP(vap); /* Guarantee that nothing will go through this vap. */ ieee80211_new_state(vap, IEEE80211_S_INIT, -1); ieee80211_draintask(ic, &vap->iv_nstate_task); URTWN_LOCK(sc); if (uvp->bcn_mbuf != NULL) m_freem(uvp->bcn_mbuf); /* Cancel any unfinished Tx. */ urtwn_vap_clear_tx(sc, vap); URTWN_UNLOCK(sc); if (vap->iv_opmode == IEEE80211_M_IBSS) ieee80211_draintask(ic, &uvp->tsf_task_adhoc); if (URTWN_CHIP_HAS_RATECTL(sc)) ieee80211_ratectl_deinit(vap); ieee80211_vap_detach(vap); free(uvp, M_80211_VAP); } static void urtwn_vap_clear_tx(struct urtwn_softc *sc, struct ieee80211vap *vap) { URTWN_ASSERT_LOCKED(sc); urtwn_vap_clear_tx_queue(sc, &sc->sc_tx_active, vap); urtwn_vap_clear_tx_queue(sc, &sc->sc_tx_pending, vap); } static void urtwn_vap_clear_tx_queue(struct urtwn_softc *sc, urtwn_datahead *head, struct ieee80211vap *vap) { struct urtwn_data *dp, *tmp; STAILQ_FOREACH_SAFE(dp, head, next, tmp) { if (dp->ni != NULL) { if (dp->ni->ni_vap == vap) { ieee80211_free_node(dp->ni); dp->ni = NULL; if (dp->m != NULL) { m_freem(dp->m); dp->m = NULL; } STAILQ_REMOVE(head, dp, urtwn_data, next); STAILQ_INSERT_TAIL(&sc->sc_tx_inactive, dp, next); } } } } static struct mbuf * urtwn_rx_copy_to_mbuf(struct urtwn_softc *sc, struct r92c_rx_stat *stat, int totlen) { struct ieee80211com *ic = &sc->sc_ic; struct mbuf *m; uint32_t rxdw0; int pktlen; /* * don't pass packets to the ieee80211 framework if the driver isn't * RUNNING. */ if (!(sc->sc_flags & URTWN_RUNNING)) return (NULL); rxdw0 = le32toh(stat->rxdw0); if (rxdw0 & (R92C_RXDW0_CRCERR | R92C_RXDW0_ICVERR)) { /* * This should not happen since we setup our Rx filter * to not receive these frames. */ URTWN_DPRINTF(sc, URTWN_DEBUG_RECV, "%s: RX flags error (%s)\n", __func__, rxdw0 & R92C_RXDW0_CRCERR ? "CRC" : "ICV"); goto fail; } pktlen = MS(rxdw0, R92C_RXDW0_PKTLEN); if (pktlen < sizeof(struct ieee80211_frame_ack)) { URTWN_DPRINTF(sc, URTWN_DEBUG_RECV, "%s: frame is too short: %d\n", __func__, pktlen); goto fail; } m = m_get2(totlen, M_NOWAIT, MT_DATA, M_PKTHDR); if (__predict_false(m == NULL)) { device_printf(sc->sc_dev, "%s: could not allocate RX mbuf\n", __func__); goto fail; } /* Finalize mbuf. */ memcpy(mtod(m, uint8_t *), (uint8_t *)stat, totlen); m->m_pkthdr.len = m->m_len = totlen; return (m); fail: counter_u64_add(ic->ic_ierrors, 1); return (NULL); } static struct mbuf * urtwn_report_intr(struct usb_xfer *xfer, struct urtwn_data *data) { struct urtwn_softc *sc = data->sc; struct ieee80211com *ic = &sc->sc_ic; struct r92c_rx_stat *stat; uint8_t *buf; int len; usbd_xfer_status(xfer, &len, NULL, NULL, NULL); if (len < sizeof(*stat)) { counter_u64_add(ic->ic_ierrors, 1); return (NULL); } buf = data->buf; stat = (struct r92c_rx_stat *)buf; /* * For 88E chips we can tie the FF flushing here; * this is where we do know exactly how deep the * transmit queue is. * * But it won't work for R92 chips, so we can't * take the easy way out. */ if (sc->chip & URTWN_CHIP_88E) { int report_sel = MS(le32toh(stat->rxdw3), R88E_RXDW3_RPT); switch (report_sel) { case R88E_RXDW3_RPT_RX: return (urtwn_rxeof(sc, buf, len)); case R88E_RXDW3_RPT_TX1: urtwn_r88e_ratectl_tx_complete(sc, &stat[1]); break; default: URTWN_DPRINTF(sc, URTWN_DEBUG_INTR, "%s: case %d was not handled\n", __func__, report_sel); break; } } else return (urtwn_rxeof(sc, buf, len)); return (NULL); } static struct mbuf * urtwn_rxeof(struct urtwn_softc *sc, uint8_t *buf, int len) { struct r92c_rx_stat *stat; struct mbuf *m, *m0 = NULL, *prevm = NULL; uint32_t rxdw0; int totlen, pktlen, infosz, npkts; /* Get the number of encapsulated frames. */ stat = (struct r92c_rx_stat *)buf; npkts = MS(le32toh(stat->rxdw2), R92C_RXDW2_PKTCNT); URTWN_DPRINTF(sc, URTWN_DEBUG_RECV, "%s: Rx %d frames in one chunk\n", __func__, npkts); /* Process all of them. */ while (npkts-- > 0) { if (len < sizeof(*stat)) break; stat = (struct r92c_rx_stat *)buf; rxdw0 = le32toh(stat->rxdw0); pktlen = MS(rxdw0, R92C_RXDW0_PKTLEN); if (pktlen == 0) break; infosz = MS(rxdw0, R92C_RXDW0_INFOSZ) * 8; /* Make sure everything fits in xfer. */ totlen = sizeof(*stat) + infosz + pktlen; if (totlen > len) break; m = urtwn_rx_copy_to_mbuf(sc, stat, totlen); if (m0 == NULL) m0 = m; if (prevm == NULL) prevm = m; else { prevm->m_next = m; prevm = m; } /* Next chunk is 128-byte aligned. */ totlen = (totlen + 127) & ~127; buf += totlen; len -= totlen; } return (m0); } static void urtwn_r88e_ratectl_tx_complete(struct urtwn_softc *sc, void *arg) { struct r88e_tx_rpt_ccx *rpt = arg; struct ieee80211vap *vap; struct ieee80211_node *ni; uint8_t macid; int ntries; macid = MS(rpt->rptb1, R88E_RPTB1_MACID); ntries = MS(rpt->rptb2, R88E_RPTB2_RETRY_CNT); URTWN_NT_LOCK(sc); ni = sc->node_list[macid]; if (ni != NULL) { vap = ni->ni_vap; URTWN_DPRINTF(sc, URTWN_DEBUG_INTR, "%s: frame for macid %d was" "%s sent (%d retries)\n", __func__, macid, (rpt->rptb1 & R88E_RPTB1_PKT_OK) ? "" : " not", ntries); if (rpt->rptb1 & R88E_RPTB1_PKT_OK) { ieee80211_ratectl_tx_complete(vap, ni, IEEE80211_RATECTL_TX_SUCCESS, &ntries, NULL); } else { ieee80211_ratectl_tx_complete(vap, ni, IEEE80211_RATECTL_TX_FAILURE, &ntries, NULL); } } else { URTWN_DPRINTF(sc, URTWN_DEBUG_INTR, "%s: macid %d, ni is NULL\n", __func__, macid); } URTWN_NT_UNLOCK(sc); } static struct ieee80211_node * urtwn_rx_frame(struct urtwn_softc *sc, struct mbuf *m, int8_t *rssi_p) { struct ieee80211com *ic = &sc->sc_ic; struct ieee80211_frame_min *wh; struct r92c_rx_stat *stat; uint32_t rxdw0, rxdw3; uint8_t rate, cipher; int8_t rssi = -127; int infosz; stat = mtod(m, struct r92c_rx_stat *); rxdw0 = le32toh(stat->rxdw0); rxdw3 = le32toh(stat->rxdw3); rate = MS(rxdw3, R92C_RXDW3_RATE); cipher = MS(rxdw0, R92C_RXDW0_CIPHER); infosz = MS(rxdw0, R92C_RXDW0_INFOSZ) * 8; /* Get RSSI from PHY status descriptor if present. */ if (infosz != 0 && (rxdw0 & R92C_RXDW0_PHYST)) { if (sc->chip & URTWN_CHIP_88E) rssi = urtwn_r88e_get_rssi(sc, rate, &stat[1]); else rssi = urtwn_get_rssi(sc, rate, &stat[1]); URTWN_DPRINTF(sc, URTWN_DEBUG_RSSI, "%s: rssi=%d\n", __func__, rssi); /* Update our average RSSI. */ urtwn_update_avgrssi(sc, rate, rssi); } if (ieee80211_radiotap_active(ic)) { struct urtwn_rx_radiotap_header *tap = &sc->sc_rxtap; tap->wr_flags = 0; urtwn_get_tsf(sc, &tap->wr_tsft); if (__predict_false(le32toh((uint32_t)tap->wr_tsft) < le32toh(stat->rxdw5))) { tap->wr_tsft = le32toh(tap->wr_tsft >> 32) - 1; tap->wr_tsft = (uint64_t)htole32(tap->wr_tsft) << 32; } else tap->wr_tsft &= 0xffffffff00000000; tap->wr_tsft += stat->rxdw5; /* XXX 20/40? */ /* XXX shortgi? */ /* Map HW rate index to 802.11 rate. */ if (!(rxdw3 & R92C_RXDW3_HT)) { tap->wr_rate = ridx2rate[rate]; } else if (rate >= 12) { /* MCS0~15. */ /* Bit 7 set means HT MCS instead of rate. */ tap->wr_rate = 0x80 | (rate - 12); } /* XXX TODO: this isn't right; should use the last good RSSI */ tap->wr_dbm_antsignal = rssi; tap->wr_dbm_antnoise = URTWN_NOISE_FLOOR; } *rssi_p = rssi; /* Drop descriptor. */ m_adj(m, sizeof(*stat) + infosz); wh = mtod(m, struct ieee80211_frame_min *); if ((wh->i_fc[1] & IEEE80211_FC1_PROTECTED) && cipher != R92C_CAM_ALGO_NONE) { m->m_flags |= M_WEP; } if (m->m_len >= sizeof(*wh)) return (ieee80211_find_rxnode(ic, wh)); return (NULL); } static void urtwn_bulk_rx_callback(struct usb_xfer *xfer, usb_error_t error) { struct urtwn_softc *sc = usbd_xfer_softc(xfer); struct ieee80211com *ic = &sc->sc_ic; struct ieee80211_node *ni; struct mbuf *m = NULL, *next; struct urtwn_data *data; int8_t nf, rssi; URTWN_ASSERT_LOCKED(sc); switch (USB_GET_STATE(xfer)) { case USB_ST_TRANSFERRED: data = STAILQ_FIRST(&sc->sc_rx_active); if (data == NULL) goto tr_setup; STAILQ_REMOVE_HEAD(&sc->sc_rx_active, next); m = urtwn_report_intr(xfer, data); STAILQ_INSERT_TAIL(&sc->sc_rx_inactive, data, next); /* FALLTHROUGH */ case USB_ST_SETUP: tr_setup: data = STAILQ_FIRST(&sc->sc_rx_inactive); if (data == NULL) { KASSERT(m == NULL, ("mbuf isn't NULL")); goto finish; } STAILQ_REMOVE_HEAD(&sc->sc_rx_inactive, next); STAILQ_INSERT_TAIL(&sc->sc_rx_active, data, next); usbd_xfer_set_frame_data(xfer, 0, data->buf, usbd_xfer_max_len(xfer)); usbd_transfer_submit(xfer); /* * To avoid LOR we should unlock our private mutex here to call * ieee80211_input() because here is at the end of a USB * callback and safe to unlock. */ while (m != NULL) { next = m->m_next; m->m_next = NULL; ni = urtwn_rx_frame(sc, m, &rssi); /* Store a global last-good RSSI */ if (rssi != -127) sc->last_rssi = rssi; URTWN_UNLOCK(sc); nf = URTWN_NOISE_FLOOR; if (ni != NULL) { if (rssi != -127) URTWN_NODE(ni)->last_rssi = rssi; if (ni->ni_flags & IEEE80211_NODE_HT) m->m_flags |= M_AMPDU; (void)ieee80211_input(ni, m, URTWN_NODE(ni)->last_rssi - nf, nf); ieee80211_free_node(ni); } else { /* Use last good global RSSI */ (void)ieee80211_input_all(ic, m, sc->last_rssi - nf, nf); } URTWN_LOCK(sc); m = next; } break; default: /* needs it to the inactive queue due to a error. */ data = STAILQ_FIRST(&sc->sc_rx_active); if (data != NULL) { STAILQ_REMOVE_HEAD(&sc->sc_rx_active, next); STAILQ_INSERT_TAIL(&sc->sc_rx_inactive, data, next); } if (error != USB_ERR_CANCELLED) { usbd_xfer_set_stall(xfer); counter_u64_add(ic->ic_ierrors, 1); goto tr_setup; } break; } finish: /* Finished receive; age anything left on the FF queue by a little bump */ /* * XXX TODO: just make this a callout timer schedule so we can * flush the FF staging queue if we're approaching idle. */ #ifdef IEEE80211_SUPPORT_SUPERG URTWN_UNLOCK(sc); ieee80211_ff_age_all(ic, 1); URTWN_LOCK(sc); #endif /* Kick-start more transmit in case we stalled */ urtwn_start(sc); } static void urtwn_txeof(struct urtwn_softc *sc, struct urtwn_data *data, int status) { URTWN_ASSERT_LOCKED(sc); if (data->ni != NULL) /* not a beacon frame */ ieee80211_tx_complete(data->ni, data->m, status); if (sc->sc_tx_n_active > 0) sc->sc_tx_n_active--; data->ni = NULL; data->m = NULL; sc->sc_txtimer = 0; STAILQ_INSERT_TAIL(&sc->sc_tx_inactive, data, next); } static int urtwn_alloc_list(struct urtwn_softc *sc, struct urtwn_data data[], int ndata, int maxsz) { int i, error; for (i = 0; i < ndata; i++) { struct urtwn_data *dp = &data[i]; dp->sc = sc; dp->m = NULL; dp->buf = malloc(maxsz, M_USBDEV, M_NOWAIT); if (dp->buf == NULL) { device_printf(sc->sc_dev, "could not allocate buffer\n"); error = ENOMEM; goto fail; } dp->ni = NULL; } return (0); fail: urtwn_free_list(sc, data, ndata); return (error); } static int urtwn_alloc_rx_list(struct urtwn_softc *sc) { int error, i; error = urtwn_alloc_list(sc, sc->sc_rx, URTWN_RX_LIST_COUNT, URTWN_RXBUFSZ); if (error != 0) return (error); STAILQ_INIT(&sc->sc_rx_active); STAILQ_INIT(&sc->sc_rx_inactive); for (i = 0; i < URTWN_RX_LIST_COUNT; i++) STAILQ_INSERT_HEAD(&sc->sc_rx_inactive, &sc->sc_rx[i], next); return (0); } static int urtwn_alloc_tx_list(struct urtwn_softc *sc) { int error, i; error = urtwn_alloc_list(sc, sc->sc_tx, URTWN_TX_LIST_COUNT, URTWN_TXBUFSZ); if (error != 0) return (error); STAILQ_INIT(&sc->sc_tx_active); STAILQ_INIT(&sc->sc_tx_inactive); STAILQ_INIT(&sc->sc_tx_pending); for (i = 0; i < URTWN_TX_LIST_COUNT; i++) STAILQ_INSERT_HEAD(&sc->sc_tx_inactive, &sc->sc_tx[i], next); return (0); } static void urtwn_free_list(struct urtwn_softc *sc, struct urtwn_data data[], int ndata) { int i; for (i = 0; i < ndata; i++) { struct urtwn_data *dp = &data[i]; if (dp->buf != NULL) { free(dp->buf, M_USBDEV); dp->buf = NULL; } if (dp->ni != NULL) { ieee80211_free_node(dp->ni); dp->ni = NULL; } } } static void urtwn_free_rx_list(struct urtwn_softc *sc) { urtwn_free_list(sc, sc->sc_rx, URTWN_RX_LIST_COUNT); STAILQ_INIT(&sc->sc_rx_active); STAILQ_INIT(&sc->sc_rx_inactive); } static void urtwn_free_tx_list(struct urtwn_softc *sc) { urtwn_free_list(sc, sc->sc_tx, URTWN_TX_LIST_COUNT); STAILQ_INIT(&sc->sc_tx_active); STAILQ_INIT(&sc->sc_tx_inactive); STAILQ_INIT(&sc->sc_tx_pending); } static void urtwn_bulk_tx_callback(struct usb_xfer *xfer, usb_error_t error) { struct urtwn_softc *sc = usbd_xfer_softc(xfer); #ifdef IEEE80211_SUPPORT_SUPERG struct ieee80211com *ic = &sc->sc_ic; #endif struct urtwn_data *data; URTWN_ASSERT_LOCKED(sc); switch (USB_GET_STATE(xfer)){ case USB_ST_TRANSFERRED: data = STAILQ_FIRST(&sc->sc_tx_active); if (data == NULL) goto tr_setup; STAILQ_REMOVE_HEAD(&sc->sc_tx_active, next); urtwn_txeof(sc, data, 0); /* FALLTHROUGH */ case USB_ST_SETUP: tr_setup: data = STAILQ_FIRST(&sc->sc_tx_pending); if (data == NULL) { URTWN_DPRINTF(sc, URTWN_DEBUG_XMIT, "%s: empty pending queue\n", __func__); sc->sc_tx_n_active = 0; goto finish; } STAILQ_REMOVE_HEAD(&sc->sc_tx_pending, next); STAILQ_INSERT_TAIL(&sc->sc_tx_active, data, next); usbd_xfer_set_frame_data(xfer, 0, data->buf, data->buflen); usbd_transfer_submit(xfer); sc->sc_tx_n_active++; break; default: data = STAILQ_FIRST(&sc->sc_tx_active); if (data == NULL) goto tr_setup; STAILQ_REMOVE_HEAD(&sc->sc_tx_active, next); urtwn_txeof(sc, data, 1); if (error != USB_ERR_CANCELLED) { usbd_xfer_set_stall(xfer); goto tr_setup; } break; } finish: #ifdef IEEE80211_SUPPORT_SUPERG /* * If the TX active queue drops below a certain * threshold, ensure we age fast-frames out so they're * transmitted. */ if (sc->sc_tx_n_active <= 1) { /* XXX ew - net80211 should defer this for us! */ /* * Note: this sc_tx_n_active currently tracks * the number of pending transmit submissions * and not the actual depth of the TX frames * pending to the hardware. That means that * we're going to end up with some sub-optimal * aggregation behaviour. */ /* * XXX TODO: just make this a callout timer schedule so we can * flush the FF staging queue if we're approaching idle. */ URTWN_UNLOCK(sc); ieee80211_ff_flush(ic, WME_AC_VO); ieee80211_ff_flush(ic, WME_AC_VI); ieee80211_ff_flush(ic, WME_AC_BE); ieee80211_ff_flush(ic, WME_AC_BK); URTWN_LOCK(sc); } #endif /* Kick-start more transmit */ urtwn_start(sc); } static struct urtwn_data * _urtwn_getbuf(struct urtwn_softc *sc) { struct urtwn_data *bf; bf = STAILQ_FIRST(&sc->sc_tx_inactive); if (bf != NULL) STAILQ_REMOVE_HEAD(&sc->sc_tx_inactive, next); else { URTWN_DPRINTF(sc, URTWN_DEBUG_XMIT, "%s: out of xmit buffers\n", __func__); } return (bf); } static struct urtwn_data * urtwn_getbuf(struct urtwn_softc *sc) { struct urtwn_data *bf; URTWN_ASSERT_LOCKED(sc); bf = _urtwn_getbuf(sc); if (bf == NULL) { URTWN_DPRINTF(sc, URTWN_DEBUG_XMIT, "%s: stop queue\n", __func__); } return (bf); } static usb_error_t urtwn_write_region_1(struct urtwn_softc *sc, uint16_t addr, uint8_t *buf, int len) { usb_device_request_t req; req.bmRequestType = UT_WRITE_VENDOR_DEVICE; req.bRequest = R92C_REQ_REGS; USETW(req.wValue, addr); USETW(req.wIndex, 0); USETW(req.wLength, len); return (urtwn_do_request(sc, &req, buf)); } static usb_error_t urtwn_write_1(struct urtwn_softc *sc, uint16_t addr, uint8_t val) { return (urtwn_write_region_1(sc, addr, &val, sizeof(val))); } static usb_error_t urtwn_write_2(struct urtwn_softc *sc, uint16_t addr, uint16_t val) { val = htole16(val); return (urtwn_write_region_1(sc, addr, (uint8_t *)&val, sizeof(val))); } static usb_error_t urtwn_write_4(struct urtwn_softc *sc, uint16_t addr, uint32_t val) { val = htole32(val); return (urtwn_write_region_1(sc, addr, (uint8_t *)&val, sizeof(val))); } static usb_error_t urtwn_read_region_1(struct urtwn_softc *sc, uint16_t addr, uint8_t *buf, int len) { usb_device_request_t req; req.bmRequestType = UT_READ_VENDOR_DEVICE; req.bRequest = R92C_REQ_REGS; USETW(req.wValue, addr); USETW(req.wIndex, 0); USETW(req.wLength, len); return (urtwn_do_request(sc, &req, buf)); } static uint8_t urtwn_read_1(struct urtwn_softc *sc, uint16_t addr) { uint8_t val; if (urtwn_read_region_1(sc, addr, &val, 1) != 0) return (0xff); return (val); } static uint16_t urtwn_read_2(struct urtwn_softc *sc, uint16_t addr) { uint16_t val; if (urtwn_read_region_1(sc, addr, (uint8_t *)&val, 2) != 0) return (0xffff); return (le16toh(val)); } static uint32_t urtwn_read_4(struct urtwn_softc *sc, uint16_t addr) { uint32_t val; if (urtwn_read_region_1(sc, addr, (uint8_t *)&val, 4) != 0) return (0xffffffff); return (le32toh(val)); } static int urtwn_fw_cmd(struct urtwn_softc *sc, uint8_t id, const void *buf, int len) { struct r92c_fw_cmd cmd; usb_error_t error; int ntries; if (!(sc->sc_flags & URTWN_FW_LOADED)) { URTWN_DPRINTF(sc, URTWN_DEBUG_FIRMWARE, "%s: firmware " "was not loaded; command (id %d) will be discarded\n", __func__, id); return (0); } /* Wait for current FW box to be empty. */ for (ntries = 0; ntries < 100; ntries++) { if (!(urtwn_read_1(sc, R92C_HMETFR) & (1 << sc->fwcur))) break; urtwn_ms_delay(sc); } if (ntries == 100) { device_printf(sc->sc_dev, "could not send firmware command\n"); return (ETIMEDOUT); } memset(&cmd, 0, sizeof(cmd)); cmd.id = id; if (len > 3) cmd.id |= R92C_CMD_FLAG_EXT; KASSERT(len <= sizeof(cmd.msg), ("urtwn_fw_cmd\n")); memcpy(cmd.msg, buf, len); /* Write the first word last since that will trigger the FW. */ error = urtwn_write_region_1(sc, R92C_HMEBOX_EXT(sc->fwcur), (uint8_t *)&cmd + 4, 2); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); error = urtwn_write_region_1(sc, R92C_HMEBOX(sc->fwcur), (uint8_t *)&cmd + 0, 4); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); sc->fwcur = (sc->fwcur + 1) % R92C_H2C_NBOX; return (0); } static void urtwn_cmdq_cb(void *arg, int pending) { struct urtwn_softc *sc = arg; struct urtwn_cmdq *item; /* * Device must be powered on (via urtwn_power_on()) * before any command may be sent. */ URTWN_LOCK(sc); if (!(sc->sc_flags & URTWN_RUNNING)) { URTWN_UNLOCK(sc); return; } URTWN_CMDQ_LOCK(sc); while (sc->cmdq[sc->cmdq_first].func != NULL) { item = &sc->cmdq[sc->cmdq_first]; sc->cmdq_first = (sc->cmdq_first + 1) % URTWN_CMDQ_SIZE; URTWN_CMDQ_UNLOCK(sc); item->func(sc, &item->data); URTWN_CMDQ_LOCK(sc); memset(item, 0, sizeof (*item)); } URTWN_CMDQ_UNLOCK(sc); URTWN_UNLOCK(sc); } static int urtwn_cmd_sleepable(struct urtwn_softc *sc, const void *ptr, size_t len, CMD_FUNC_PROTO) { struct ieee80211com *ic = &sc->sc_ic; KASSERT(len <= sizeof(union sec_param), ("buffer overflow")); URTWN_CMDQ_LOCK(sc); if (sc->cmdq[sc->cmdq_last].func != NULL) { device_printf(sc->sc_dev, "%s: cmdq overflow\n", __func__); URTWN_CMDQ_UNLOCK(sc); return (EAGAIN); } if (ptr != NULL) memcpy(&sc->cmdq[sc->cmdq_last].data, ptr, len); sc->cmdq[sc->cmdq_last].func = func; sc->cmdq_last = (sc->cmdq_last + 1) % URTWN_CMDQ_SIZE; URTWN_CMDQ_UNLOCK(sc); ieee80211_runtask(ic, &sc->cmdq_task); return (0); } static __inline void urtwn_rf_write(struct urtwn_softc *sc, int chain, uint8_t addr, uint32_t val) { sc->sc_rf_write(sc, chain, addr, val); } static void urtwn_r92c_rf_write(struct urtwn_softc *sc, int chain, uint8_t addr, uint32_t val) { urtwn_bb_write(sc, R92C_LSSI_PARAM(chain), SM(R92C_LSSI_PARAM_ADDR, addr) | SM(R92C_LSSI_PARAM_DATA, val)); } static void urtwn_r88e_rf_write(struct urtwn_softc *sc, int chain, uint8_t addr, uint32_t val) { urtwn_bb_write(sc, R92C_LSSI_PARAM(chain), SM(R88E_LSSI_PARAM_ADDR, addr) | SM(R92C_LSSI_PARAM_DATA, val)); } static uint32_t urtwn_rf_read(struct urtwn_softc *sc, int chain, uint8_t addr) { uint32_t reg[R92C_MAX_CHAINS], val; reg[0] = urtwn_bb_read(sc, R92C_HSSI_PARAM2(0)); if (chain != 0) reg[chain] = urtwn_bb_read(sc, R92C_HSSI_PARAM2(chain)); urtwn_bb_write(sc, R92C_HSSI_PARAM2(0), reg[0] & ~R92C_HSSI_PARAM2_READ_EDGE); urtwn_ms_delay(sc); urtwn_bb_write(sc, R92C_HSSI_PARAM2(chain), RW(reg[chain], R92C_HSSI_PARAM2_READ_ADDR, addr) | R92C_HSSI_PARAM2_READ_EDGE); urtwn_ms_delay(sc); urtwn_bb_write(sc, R92C_HSSI_PARAM2(0), reg[0] | R92C_HSSI_PARAM2_READ_EDGE); urtwn_ms_delay(sc); if (urtwn_bb_read(sc, R92C_HSSI_PARAM1(chain)) & R92C_HSSI_PARAM1_PI) val = urtwn_bb_read(sc, R92C_HSPI_READBACK(chain)); else val = urtwn_bb_read(sc, R92C_LSSI_READBACK(chain)); return (MS(val, R92C_LSSI_READBACK_DATA)); } static int urtwn_llt_write(struct urtwn_softc *sc, uint32_t addr, uint32_t data) { usb_error_t error; int ntries; error = urtwn_write_4(sc, R92C_LLT_INIT, SM(R92C_LLT_INIT_OP, R92C_LLT_INIT_OP_WRITE) | SM(R92C_LLT_INIT_ADDR, addr) | SM(R92C_LLT_INIT_DATA, data)); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); /* Wait for write operation to complete. */ for (ntries = 0; ntries < 20; ntries++) { if (MS(urtwn_read_4(sc, R92C_LLT_INIT), R92C_LLT_INIT_OP) == R92C_LLT_INIT_OP_NO_ACTIVE) return (0); urtwn_ms_delay(sc); } return (ETIMEDOUT); } static int urtwn_efuse_read_next(struct urtwn_softc *sc, uint8_t *val) { uint32_t reg; usb_error_t error; int ntries; if (sc->last_rom_addr >= URTWN_EFUSE_MAX_LEN) return (EFAULT); reg = urtwn_read_4(sc, R92C_EFUSE_CTRL); reg = RW(reg, R92C_EFUSE_CTRL_ADDR, sc->last_rom_addr); reg &= ~R92C_EFUSE_CTRL_VALID; error = urtwn_write_4(sc, R92C_EFUSE_CTRL, reg); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); /* Wait for read operation to complete. */ for (ntries = 0; ntries < 100; ntries++) { reg = urtwn_read_4(sc, R92C_EFUSE_CTRL); if (reg & R92C_EFUSE_CTRL_VALID) break; urtwn_ms_delay(sc); } if (ntries == 100) { device_printf(sc->sc_dev, "could not read efuse byte at address 0x%x\n", sc->last_rom_addr); return (ETIMEDOUT); } *val = MS(reg, R92C_EFUSE_CTRL_DATA); sc->last_rom_addr++; return (0); } static int urtwn_efuse_read_data(struct urtwn_softc *sc, uint8_t *rom, uint8_t off, uint8_t msk) { uint8_t reg; int i, error; for (i = 0; i < 4; i++) { if (msk & (1 << i)) continue; error = urtwn_efuse_read_next(sc, ®); if (error != 0) return (error); URTWN_DPRINTF(sc, URTWN_DEBUG_ROM, "rom[0x%03X] == 0x%02X\n", off * 8 + i * 2, reg); rom[off * 8 + i * 2 + 0] = reg; error = urtwn_efuse_read_next(sc, ®); if (error != 0) return (error); URTWN_DPRINTF(sc, URTWN_DEBUG_ROM, "rom[0x%03X] == 0x%02X\n", off * 8 + i * 2 + 1, reg); rom[off * 8 + i * 2 + 1] = reg; } return (0); } #ifdef USB_DEBUG static void urtwn_dump_rom_contents(struct urtwn_softc *sc, uint8_t *rom, uint16_t size) { int i; /* Dump ROM contents. */ device_printf(sc->sc_dev, "%s:", __func__); for (i = 0; i < size; i++) { if (i % 32 == 0) printf("\n%03X: ", i); else if (i % 4 == 0) printf(" "); printf("%02X", rom[i]); } printf("\n"); } #endif static int urtwn_efuse_read(struct urtwn_softc *sc, uint8_t *rom, uint16_t size) { #define URTWN_CHK(res) do { \ if ((error = res) != 0) \ goto end; \ } while(0) uint8_t msk, off, reg; int error; URTWN_CHK(urtwn_efuse_switch_power(sc)); /* Read full ROM image. */ sc->last_rom_addr = 0; memset(rom, 0xff, size); URTWN_CHK(urtwn_efuse_read_next(sc, ®)); while (reg != 0xff) { /* check for extended header */ if ((sc->chip & URTWN_CHIP_88E) && (reg & 0x1f) == 0x0f) { off = reg >> 5; URTWN_CHK(urtwn_efuse_read_next(sc, ®)); if ((reg & 0x0f) != 0x0f) off = ((reg & 0xf0) >> 1) | off; else continue; } else off = reg >> 4; msk = reg & 0xf; URTWN_CHK(urtwn_efuse_read_data(sc, rom, off, msk)); URTWN_CHK(urtwn_efuse_read_next(sc, ®)); } end: #ifdef USB_DEBUG if (sc->sc_debug & URTWN_DEBUG_ROM) urtwn_dump_rom_contents(sc, rom, size); #endif urtwn_write_1(sc, R92C_EFUSE_ACCESS, R92C_EFUSE_ACCESS_OFF); if (error != 0) { device_printf(sc->sc_dev, "%s: error while reading ROM\n", __func__); } return (error); #undef URTWN_CHK } static int urtwn_efuse_switch_power(struct urtwn_softc *sc) { usb_error_t error; uint32_t reg; error = urtwn_write_1(sc, R92C_EFUSE_ACCESS, R92C_EFUSE_ACCESS_ON); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); reg = urtwn_read_2(sc, R92C_SYS_ISO_CTRL); if (!(reg & R92C_SYS_ISO_CTRL_PWC_EV12V)) { error = urtwn_write_2(sc, R92C_SYS_ISO_CTRL, reg | R92C_SYS_ISO_CTRL_PWC_EV12V); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); } reg = urtwn_read_2(sc, R92C_SYS_FUNC_EN); if (!(reg & R92C_SYS_FUNC_EN_ELDR)) { error = urtwn_write_2(sc, R92C_SYS_FUNC_EN, reg | R92C_SYS_FUNC_EN_ELDR); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); } reg = urtwn_read_2(sc, R92C_SYS_CLKR); if ((reg & (R92C_SYS_CLKR_LOADER_EN | R92C_SYS_CLKR_ANA8M)) != (R92C_SYS_CLKR_LOADER_EN | R92C_SYS_CLKR_ANA8M)) { error = urtwn_write_2(sc, R92C_SYS_CLKR, reg | R92C_SYS_CLKR_LOADER_EN | R92C_SYS_CLKR_ANA8M); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); } return (0); } static int urtwn_read_chipid(struct urtwn_softc *sc) { uint32_t reg; if (sc->chip & URTWN_CHIP_88E) return (0); reg = urtwn_read_4(sc, R92C_SYS_CFG); if (reg & R92C_SYS_CFG_TRP_VAUX_EN) return (EIO); if (reg & R92C_SYS_CFG_TYPE_92C) { sc->chip |= URTWN_CHIP_92C; /* Check if it is a castrated 8192C. */ if (MS(urtwn_read_4(sc, R92C_HPON_FSM), R92C_HPON_FSM_CHIP_BONDING_ID) == R92C_HPON_FSM_CHIP_BONDING_ID_92C_1T2R) sc->chip |= URTWN_CHIP_92C_1T2R; } if (reg & R92C_SYS_CFG_VENDOR_UMC) { sc->chip |= URTWN_CHIP_UMC; if (MS(reg, R92C_SYS_CFG_CHIP_VER_RTL) == 0) sc->chip |= URTWN_CHIP_UMC_A_CUT; } return (0); } static int urtwn_read_rom(struct urtwn_softc *sc) { struct r92c_rom *rom = &sc->rom.r92c_rom; int error; /* Read full ROM image. */ error = urtwn_efuse_read(sc, (uint8_t *)rom, sizeof(*rom)); if (error != 0) return (error); /* XXX Weird but this is what the vendor driver does. */ sc->last_rom_addr = 0x1fa; error = urtwn_efuse_read_next(sc, &sc->pa_setting); if (error != 0) return (error); URTWN_DPRINTF(sc, URTWN_DEBUG_ROM, "%s: PA setting=0x%x\n", __func__, sc->pa_setting); sc->board_type = MS(rom->rf_opt1, R92C_ROM_RF1_BOARD_TYPE); sc->regulatory = MS(rom->rf_opt1, R92C_ROM_RF1_REGULATORY); URTWN_DPRINTF(sc, URTWN_DEBUG_ROM, "%s: regulatory type=%d\n", __func__, sc->regulatory); IEEE80211_ADDR_COPY(sc->sc_ic.ic_macaddr, rom->macaddr); sc->sc_rf_write = urtwn_r92c_rf_write; sc->sc_power_on = urtwn_r92c_power_on; sc->sc_power_off = urtwn_r92c_power_off; return (0); } static int urtwn_r88e_read_rom(struct urtwn_softc *sc) { struct r88e_rom *rom = &sc->rom.r88e_rom; int error; error = urtwn_efuse_read(sc, (uint8_t *)rom, sizeof(sc->rom.r88e_rom)); if (error != 0) return (error); sc->bw20_tx_pwr_diff = (rom->tx_pwr_diff >> 4); if (sc->bw20_tx_pwr_diff & 0x08) sc->bw20_tx_pwr_diff |= 0xf0; sc->ofdm_tx_pwr_diff = (rom->tx_pwr_diff & 0xf); if (sc->ofdm_tx_pwr_diff & 0x08) sc->ofdm_tx_pwr_diff |= 0xf0; sc->regulatory = MS(rom->rf_board_opt, R92C_ROM_RF1_REGULATORY); URTWN_DPRINTF(sc, URTWN_DEBUG_ROM, "%s: regulatory type %d\n", __func__,sc->regulatory); IEEE80211_ADDR_COPY(sc->sc_ic.ic_macaddr, rom->macaddr); sc->sc_rf_write = urtwn_r88e_rf_write; sc->sc_power_on = urtwn_r88e_power_on; sc->sc_power_off = urtwn_r88e_power_off; return (0); } static __inline uint8_t rate2ridx(uint8_t rate) { if (rate & IEEE80211_RATE_MCS) { /* 11n rates start at idx 12 */ return ((rate & 0xf) + 12); } switch (rate) { /* 11g */ case 12: return 4; case 18: return 5; case 24: return 6; case 36: return 7; case 48: return 8; case 72: return 9; case 96: return 10; case 108: return 11; /* 11b */ case 2: return 0; case 4: return 1; case 11: return 2; case 22: return 3; default: return URTWN_RIDX_UNKNOWN; } } /* * Initialize rate adaptation in firmware. */ static int urtwn_ra_init(struct urtwn_softc *sc) { struct ieee80211com *ic = &sc->sc_ic; struct ieee80211vap *vap = TAILQ_FIRST(&ic->ic_vaps); struct ieee80211_node *ni; struct ieee80211_rateset *rs, *rs_ht; struct r92c_fw_cmd_macid_cfg cmd; uint32_t rates, basicrates; uint8_t mode, ridx; int maxrate, maxbasicrate, error, i; ni = ieee80211_ref_node(vap->iv_bss); rs = &ni->ni_rates; rs_ht = (struct ieee80211_rateset *) &ni->ni_htrates; /* Get normal and basic rates mask. */ rates = basicrates = 0; maxrate = maxbasicrate = 0; /* This is for 11bg */ for (i = 0; i < rs->rs_nrates; i++) { /* Convert 802.11 rate to HW rate index. */ ridx = rate2ridx(IEEE80211_RV(rs->rs_rates[i])); if (ridx == URTWN_RIDX_UNKNOWN) /* Unknown rate, skip. */ continue; rates |= 1 << ridx; if (ridx > maxrate) maxrate = ridx; if (rs->rs_rates[i] & IEEE80211_RATE_BASIC) { basicrates |= 1 << ridx; if (ridx > maxbasicrate) maxbasicrate = ridx; } } /* If we're doing 11n, enable 11n rates */ if (ni->ni_flags & IEEE80211_NODE_HT) { for (i = 0; i < rs_ht->rs_nrates; i++) { if ((rs_ht->rs_rates[i] & 0x7f) > 0xf) continue; /* 11n rates start at index 12 */ ridx = ((rs_ht->rs_rates[i]) & 0xf) + 12; rates |= (1 << ridx); /* Guard against the rate table being oddly ordered */ if (ridx > maxrate) maxrate = ridx; } } #if 0 if (ic->ic_curmode == IEEE80211_MODE_11NG) raid = R92C_RAID_11GN; #endif /* NB: group addressed frames are done at 11bg rates for now */ if (ic->ic_curmode == IEEE80211_MODE_11B) mode = R92C_RAID_11B; else mode = R92C_RAID_11BG; /* XXX misleading 'mode' value here for unicast frames */ URTWN_DPRINTF(sc, URTWN_DEBUG_RA, "%s: mode 0x%x, rates 0x%08x, basicrates 0x%08x\n", __func__, mode, rates, basicrates); /* Set rates mask for group addressed frames. */ cmd.macid = URTWN_MACID_BC | URTWN_MACID_VALID; cmd.mask = htole32(mode << 28 | basicrates); error = urtwn_fw_cmd(sc, R92C_CMD_MACID_CONFIG, &cmd, sizeof(cmd)); if (error != 0) { ieee80211_free_node(ni); device_printf(sc->sc_dev, "could not add broadcast station\n"); return (error); } /* Set initial MRR rate. */ URTWN_DPRINTF(sc, URTWN_DEBUG_RA, "%s: maxbasicrate %d\n", __func__, maxbasicrate); urtwn_write_1(sc, R92C_INIDATA_RATE_SEL(URTWN_MACID_BC), maxbasicrate); /* Set rates mask for unicast frames. */ if (ni->ni_flags & IEEE80211_NODE_HT) mode = R92C_RAID_11GN; else if (ic->ic_curmode == IEEE80211_MODE_11B) mode = R92C_RAID_11B; else mode = R92C_RAID_11BG; cmd.macid = URTWN_MACID_BSS | URTWN_MACID_VALID; cmd.mask = htole32(mode << 28 | rates); error = urtwn_fw_cmd(sc, R92C_CMD_MACID_CONFIG, &cmd, sizeof(cmd)); if (error != 0) { ieee80211_free_node(ni); device_printf(sc->sc_dev, "could not add BSS station\n"); return (error); } /* Set initial MRR rate. */ URTWN_DPRINTF(sc, URTWN_DEBUG_RA, "%s: maxrate %d\n", __func__, maxrate); urtwn_write_1(sc, R92C_INIDATA_RATE_SEL(URTWN_MACID_BSS), maxrate); /* Indicate highest supported rate. */ if (ni->ni_flags & IEEE80211_NODE_HT) ni->ni_txrate = rs_ht->rs_rates[rs_ht->rs_nrates - 1] | IEEE80211_RATE_MCS; else ni->ni_txrate = rs->rs_rates[rs->rs_nrates - 1]; ieee80211_free_node(ni); return (0); } static void urtwn_init_beacon(struct urtwn_softc *sc, struct urtwn_vap *uvp) { struct r92c_tx_desc *txd = &uvp->bcn_desc; txd->txdw0 = htole32( SM(R92C_TXDW0_OFFSET, sizeof(*txd)) | R92C_TXDW0_BMCAST | R92C_TXDW0_OWN | R92C_TXDW0_FSG | R92C_TXDW0_LSG); txd->txdw1 = htole32( SM(R92C_TXDW1_QSEL, R92C_TXDW1_QSEL_BEACON) | SM(R92C_TXDW1_RAID, R92C_RAID_11B)); if (sc->chip & URTWN_CHIP_88E) { txd->txdw1 |= htole32(SM(R88E_TXDW1_MACID, URTWN_MACID_BC)); txd->txdseq |= htole16(R88E_TXDSEQ_HWSEQ_EN); } else { txd->txdw1 |= htole32(SM(R92C_TXDW1_MACID, URTWN_MACID_BC)); txd->txdw4 |= htole32(R92C_TXDW4_HWSEQ_EN); } txd->txdw4 = htole32(R92C_TXDW4_DRVRATE); txd->txdw5 = htole32(SM(R92C_TXDW5_DATARATE, URTWN_RIDX_CCK1)); } static int urtwn_setup_beacon(struct urtwn_softc *sc, struct ieee80211_node *ni) { struct ieee80211vap *vap = ni->ni_vap; struct urtwn_vap *uvp = URTWN_VAP(vap); struct mbuf *m; int error; URTWN_ASSERT_LOCKED(sc); if (ni->ni_chan == IEEE80211_CHAN_ANYC) return (EINVAL); m = ieee80211_beacon_alloc(ni); if (m == NULL) { device_printf(sc->sc_dev, "%s: could not allocate beacon frame\n", __func__); return (ENOMEM); } if (uvp->bcn_mbuf != NULL) m_freem(uvp->bcn_mbuf); uvp->bcn_mbuf = m; if ((error = urtwn_tx_beacon(sc, uvp)) != 0) return (error); /* XXX bcnq stuck workaround */ if ((error = urtwn_tx_beacon(sc, uvp)) != 0) return (error); URTWN_DPRINTF(sc, URTWN_DEBUG_BEACON, "%s: beacon was %srecognized\n", __func__, urtwn_read_1(sc, R92C_TDECTRL + 2) & (R92C_TDECTRL_BCN_VALID >> 16) ? "" : "not "); return (0); } static void urtwn_update_beacon(struct ieee80211vap *vap, int item) { struct urtwn_softc *sc = vap->iv_ic->ic_softc; struct urtwn_vap *uvp = URTWN_VAP(vap); struct ieee80211_beacon_offsets *bo = &vap->iv_bcn_off; struct ieee80211_node *ni = vap->iv_bss; int mcast = 0; URTWN_LOCK(sc); if (uvp->bcn_mbuf == NULL) { uvp->bcn_mbuf = ieee80211_beacon_alloc(ni); if (uvp->bcn_mbuf == NULL) { device_printf(sc->sc_dev, "%s: could not allocate beacon frame\n", __func__); URTWN_UNLOCK(sc); return; } } URTWN_UNLOCK(sc); if (item == IEEE80211_BEACON_TIM) mcast = 1; /* XXX */ setbit(bo->bo_flags, item); ieee80211_beacon_update(ni, uvp->bcn_mbuf, mcast); URTWN_LOCK(sc); urtwn_tx_beacon(sc, uvp); URTWN_UNLOCK(sc); } /* * Push a beacon frame into the chip. Beacon will * be repeated by the chip every R92C_BCN_INTERVAL. */ static int urtwn_tx_beacon(struct urtwn_softc *sc, struct urtwn_vap *uvp) { struct r92c_tx_desc *desc = &uvp->bcn_desc; struct urtwn_data *bf; URTWN_ASSERT_LOCKED(sc); bf = urtwn_getbuf(sc); if (bf == NULL) return (ENOMEM); memcpy(bf->buf, desc, sizeof(*desc)); urtwn_tx_start(sc, uvp->bcn_mbuf, IEEE80211_FC0_TYPE_MGT, bf); sc->sc_txtimer = 5; callout_reset(&sc->sc_watchdog_ch, hz, urtwn_watchdog, sc); return (0); } static int urtwn_key_alloc(struct ieee80211vap *vap, struct ieee80211_key *k, ieee80211_keyix *keyix, ieee80211_keyix *rxkeyix) { struct urtwn_softc *sc = vap->iv_ic->ic_softc; uint8_t i; if (!(&vap->iv_nw_keys[0] <= k && k < &vap->iv_nw_keys[IEEE80211_WEP_NKID])) { if (!(k->wk_flags & IEEE80211_KEY_SWCRYPT)) { URTWN_LOCK(sc); /* * First 4 slots for group keys, * what is left - for pairwise. * XXX incompatible with IBSS RSN. */ for (i = IEEE80211_WEP_NKID; i < R92C_CAM_ENTRY_COUNT; i++) { if ((sc->keys_bmap & (1 << i)) == 0) { sc->keys_bmap |= 1 << i; *keyix = i; break; } } URTWN_UNLOCK(sc); if (i == R92C_CAM_ENTRY_COUNT) { device_printf(sc->sc_dev, "%s: no free space in the key table\n", __func__); return 0; } } else *keyix = 0; } else { *keyix = k - vap->iv_nw_keys; } *rxkeyix = *keyix; return 1; } static void urtwn_key_set_cb(struct urtwn_softc *sc, union sec_param *data) { struct ieee80211_key *k = &data->key; uint8_t algo, keyid; int i, error; if (k->wk_keyix < IEEE80211_WEP_NKID) keyid = k->wk_keyix; else keyid = 0; /* Map net80211 cipher to HW crypto algorithm. */ switch (k->wk_cipher->ic_cipher) { case IEEE80211_CIPHER_WEP: if (k->wk_keylen < 8) algo = R92C_CAM_ALGO_WEP40; else algo = R92C_CAM_ALGO_WEP104; break; case IEEE80211_CIPHER_TKIP: algo = R92C_CAM_ALGO_TKIP; break; case IEEE80211_CIPHER_AES_CCM: algo = R92C_CAM_ALGO_AES; break; default: device_printf(sc->sc_dev, "%s: undefined cipher %d\n", __func__, k->wk_cipher->ic_cipher); return; } URTWN_DPRINTF(sc, URTWN_DEBUG_KEY, "%s: keyix %d, keyid %d, algo %d/%d, flags %04X, len %d, " "macaddr %s\n", __func__, k->wk_keyix, keyid, k->wk_cipher->ic_cipher, algo, k->wk_flags, k->wk_keylen, ether_sprintf(k->wk_macaddr)); + /* Clear high bits. */ + urtwn_cam_write(sc, R92C_CAM_CTL6(k->wk_keyix), 0); + urtwn_cam_write(sc, R92C_CAM_CTL7(k->wk_keyix), 0); + /* Write key. */ for (i = 0; i < 4; i++) { error = urtwn_cam_write(sc, R92C_CAM_KEY(k->wk_keyix, i), le32dec(&k->wk_key[i * 4])); if (error != 0) goto fail; } /* Write CTL0 last since that will validate the CAM entry. */ error = urtwn_cam_write(sc, R92C_CAM_CTL1(k->wk_keyix), le32dec(&k->wk_macaddr[2])); if (error != 0) goto fail; error = urtwn_cam_write(sc, R92C_CAM_CTL0(k->wk_keyix), SM(R92C_CAM_ALGO, algo) | SM(R92C_CAM_KEYID, keyid) | SM(R92C_CAM_MACLO, le16dec(&k->wk_macaddr[0])) | R92C_CAM_VALID); if (error != 0) goto fail; return; fail: device_printf(sc->sc_dev, "%s fails, error %d\n", __func__, error); } static void urtwn_key_del_cb(struct urtwn_softc *sc, union sec_param *data) { struct ieee80211_key *k = &data->key; int i; URTWN_DPRINTF(sc, URTWN_DEBUG_KEY, "%s: keyix %d, flags %04X, macaddr %s\n", __func__, k->wk_keyix, k->wk_flags, ether_sprintf(k->wk_macaddr)); urtwn_cam_write(sc, R92C_CAM_CTL0(k->wk_keyix), 0); urtwn_cam_write(sc, R92C_CAM_CTL1(k->wk_keyix), 0); /* Clear key. */ for (i = 0; i < 4; i++) urtwn_cam_write(sc, R92C_CAM_KEY(k->wk_keyix, i), 0); sc->keys_bmap &= ~(1 << k->wk_keyix); } static int urtwn_key_set(struct ieee80211vap *vap, const struct ieee80211_key *k) { struct urtwn_softc *sc = vap->iv_ic->ic_softc; struct urtwn_vap *uvp = URTWN_VAP(vap); if (k->wk_flags & IEEE80211_KEY_SWCRYPT) { /* Not for us. */ return (1); } if (&vap->iv_nw_keys[0] <= k && k < &vap->iv_nw_keys[IEEE80211_WEP_NKID]) { URTWN_LOCK(sc); uvp->keys[k->wk_keyix] = k; if ((sc->sc_flags & URTWN_RUNNING) == 0) { /* * The device was not started; * the key will be installed later. */ URTWN_UNLOCK(sc); return (1); } URTWN_UNLOCK(sc); } return (!urtwn_cmd_sleepable(sc, k, sizeof(*k), urtwn_key_set_cb)); } static int urtwn_key_delete(struct ieee80211vap *vap, const struct ieee80211_key *k) { struct urtwn_softc *sc = vap->iv_ic->ic_softc; struct urtwn_vap *uvp = URTWN_VAP(vap); if (k->wk_flags & IEEE80211_KEY_SWCRYPT) { /* Not for us. */ return (1); } if (&vap->iv_nw_keys[0] <= k && k < &vap->iv_nw_keys[IEEE80211_WEP_NKID]) { URTWN_LOCK(sc); uvp->keys[k->wk_keyix] = NULL; if ((sc->sc_flags & URTWN_RUNNING) == 0) { /* All keys are removed on device reset. */ URTWN_UNLOCK(sc); return (1); } URTWN_UNLOCK(sc); } return (!urtwn_cmd_sleepable(sc, k, sizeof(*k), urtwn_key_del_cb)); } static void urtwn_tsf_task_adhoc(void *arg, int pending) { struct ieee80211vap *vap = arg; struct urtwn_softc *sc = vap->iv_ic->ic_softc; struct ieee80211_node *ni; uint32_t reg; URTWN_LOCK(sc); ni = ieee80211_ref_node(vap->iv_bss); reg = urtwn_read_1(sc, R92C_BCN_CTRL); /* Accept beacons with the same BSSID. */ urtwn_set_rx_bssid_all(sc, 0); /* Enable synchronization. */ reg &= ~R92C_BCN_CTRL_DIS_TSF_UDT0; urtwn_write_1(sc, R92C_BCN_CTRL, reg); /* Synchronize. */ usb_pause_mtx(&sc->sc_mtx, hz * ni->ni_intval * 5 / 1000); /* Disable synchronization. */ reg |= R92C_BCN_CTRL_DIS_TSF_UDT0; urtwn_write_1(sc, R92C_BCN_CTRL, reg); /* Remove beacon filter. */ urtwn_set_rx_bssid_all(sc, 1); /* Enable beaconing. */ urtwn_write_1(sc, R92C_MBID_NUM, urtwn_read_1(sc, R92C_MBID_NUM) | R92C_MBID_TXBCN_RPT0); reg |= R92C_BCN_CTRL_EN_BCN; urtwn_write_1(sc, R92C_BCN_CTRL, reg); ieee80211_free_node(ni); URTWN_UNLOCK(sc); } static void urtwn_tsf_sync_enable(struct urtwn_softc *sc, struct ieee80211vap *vap) { struct ieee80211com *ic = &sc->sc_ic; struct urtwn_vap *uvp = URTWN_VAP(vap); /* Reset TSF. */ urtwn_write_1(sc, R92C_DUAL_TSF_RST, R92C_DUAL_TSF_RST0); switch (vap->iv_opmode) { case IEEE80211_M_STA: /* Enable TSF synchronization. */ urtwn_write_1(sc, R92C_BCN_CTRL, urtwn_read_1(sc, R92C_BCN_CTRL) & ~R92C_BCN_CTRL_DIS_TSF_UDT0); break; case IEEE80211_M_IBSS: ieee80211_runtask(ic, &uvp->tsf_task_adhoc); break; case IEEE80211_M_HOSTAP: /* Enable beaconing. */ urtwn_write_1(sc, R92C_MBID_NUM, urtwn_read_1(sc, R92C_MBID_NUM) | R92C_MBID_TXBCN_RPT0); urtwn_write_1(sc, R92C_BCN_CTRL, urtwn_read_1(sc, R92C_BCN_CTRL) | R92C_BCN_CTRL_EN_BCN); break; default: device_printf(sc->sc_dev, "undefined opmode %d\n", vap->iv_opmode); return; } } static void urtwn_get_tsf(struct urtwn_softc *sc, uint64_t *buf) { urtwn_read_region_1(sc, R92C_TSFTR, (uint8_t *)buf, sizeof(*buf)); } static void urtwn_set_led(struct urtwn_softc *sc, int led, int on) { uint8_t reg; if (led == URTWN_LED_LINK) { if (sc->chip & URTWN_CHIP_88E) { reg = urtwn_read_1(sc, R92C_LEDCFG2) & 0xf0; urtwn_write_1(sc, R92C_LEDCFG2, reg | 0x60); if (!on) { reg = urtwn_read_1(sc, R92C_LEDCFG2) & 0x90; urtwn_write_1(sc, R92C_LEDCFG2, reg | R92C_LEDCFG0_DIS); urtwn_write_1(sc, R92C_MAC_PINMUX_CFG, urtwn_read_1(sc, R92C_MAC_PINMUX_CFG) & 0xfe); } } else { reg = urtwn_read_1(sc, R92C_LEDCFG0) & 0x70; if (!on) reg |= R92C_LEDCFG0_DIS; urtwn_write_1(sc, R92C_LEDCFG0, reg); } sc->ledlink = on; /* Save LED state. */ } } static void urtwn_set_mode(struct urtwn_softc *sc, uint8_t mode) { uint8_t reg; reg = urtwn_read_1(sc, R92C_MSR); reg = (reg & ~R92C_MSR_MASK) | mode; urtwn_write_1(sc, R92C_MSR, reg); } static void urtwn_ibss_recv_mgmt(struct ieee80211_node *ni, struct mbuf *m, int subtype, const struct ieee80211_rx_stats *rxs, int rssi, int nf) { struct ieee80211vap *vap = ni->ni_vap; struct urtwn_softc *sc = vap->iv_ic->ic_softc; struct urtwn_vap *uvp = URTWN_VAP(vap); uint64_t ni_tstamp, curr_tstamp; uvp->recv_mgmt(ni, m, subtype, rxs, rssi, nf); if (vap->iv_state == IEEE80211_S_RUN && (subtype == IEEE80211_FC0_SUBTYPE_BEACON || subtype == IEEE80211_FC0_SUBTYPE_PROBE_RESP)) { ni_tstamp = le64toh(ni->ni_tstamp.tsf); URTWN_LOCK(sc); urtwn_get_tsf(sc, &curr_tstamp); URTWN_UNLOCK(sc); curr_tstamp = le64toh(curr_tstamp); if (ni_tstamp >= curr_tstamp) (void) ieee80211_ibss_merge(ni); } } static int urtwn_newstate(struct ieee80211vap *vap, enum ieee80211_state nstate, int arg) { struct urtwn_vap *uvp = URTWN_VAP(vap); struct ieee80211com *ic = vap->iv_ic; struct urtwn_softc *sc = ic->ic_softc; struct ieee80211_node *ni; enum ieee80211_state ostate; uint32_t reg; uint8_t mode; int error = 0; ostate = vap->iv_state; URTWN_DPRINTF(sc, URTWN_DEBUG_STATE, "%s -> %s\n", ieee80211_state_name[ostate], ieee80211_state_name[nstate]); IEEE80211_UNLOCK(ic); URTWN_LOCK(sc); callout_stop(&sc->sc_watchdog_ch); if (ostate == IEEE80211_S_RUN) { /* Stop calibration. */ callout_stop(&sc->sc_calib_to); /* Turn link LED off. */ urtwn_set_led(sc, URTWN_LED_LINK, 0); /* Set media status to 'No Link'. */ urtwn_set_mode(sc, R92C_MSR_NOLINK); /* Stop Rx of data frames. */ urtwn_write_2(sc, R92C_RXFLTMAP2, 0); /* Disable TSF synchronization. */ urtwn_write_1(sc, R92C_BCN_CTRL, (urtwn_read_1(sc, R92C_BCN_CTRL) & ~R92C_BCN_CTRL_EN_BCN) | R92C_BCN_CTRL_DIS_TSF_UDT0); /* Disable beaconing. */ urtwn_write_1(sc, R92C_MBID_NUM, urtwn_read_1(sc, R92C_MBID_NUM) & ~R92C_MBID_TXBCN_RPT0); /* Reset TSF. */ urtwn_write_1(sc, R92C_DUAL_TSF_RST, R92C_DUAL_TSF_RST0); /* Reset EDCA parameters. */ urtwn_write_4(sc, R92C_EDCA_VO_PARAM, 0x002f3217); urtwn_write_4(sc, R92C_EDCA_VI_PARAM, 0x005e4317); urtwn_write_4(sc, R92C_EDCA_BE_PARAM, 0x00105320); urtwn_write_4(sc, R92C_EDCA_BK_PARAM, 0x0000a444); } switch (nstate) { case IEEE80211_S_INIT: /* Turn link LED off. */ urtwn_set_led(sc, URTWN_LED_LINK, 0); break; case IEEE80211_S_SCAN: /* Pause AC Tx queues. */ urtwn_write_1(sc, R92C_TXPAUSE, urtwn_read_1(sc, R92C_TXPAUSE) | R92C_TX_QUEUE_AC); break; case IEEE80211_S_AUTH: urtwn_set_chan(sc, ic->ic_curchan, NULL); break; case IEEE80211_S_RUN: if (vap->iv_opmode == IEEE80211_M_MONITOR) { /* Turn link LED on. */ urtwn_set_led(sc, URTWN_LED_LINK, 1); break; } ni = ieee80211_ref_node(vap->iv_bss); if (ic->ic_bsschan == IEEE80211_CHAN_ANYC || ni->ni_chan == IEEE80211_CHAN_ANYC) { device_printf(sc->sc_dev, "%s: could not move to RUN state\n", __func__); error = EINVAL; goto end_run; } switch (vap->iv_opmode) { case IEEE80211_M_STA: mode = R92C_MSR_INFRA; break; case IEEE80211_M_IBSS: mode = R92C_MSR_ADHOC; break; case IEEE80211_M_HOSTAP: mode = R92C_MSR_AP; break; default: device_printf(sc->sc_dev, "undefined opmode %d\n", vap->iv_opmode); error = EINVAL; goto end_run; } /* Set media status to 'Associated'. */ urtwn_set_mode(sc, mode); /* Set BSSID. */ urtwn_write_4(sc, R92C_BSSID + 0, le32dec(&ni->ni_bssid[0])); urtwn_write_4(sc, R92C_BSSID + 4, le16dec(&ni->ni_bssid[4])); if (ic->ic_curmode == IEEE80211_MODE_11B) urtwn_write_1(sc, R92C_INIRTS_RATE_SEL, 0); else /* 802.11b/g */ urtwn_write_1(sc, R92C_INIRTS_RATE_SEL, 3); /* Enable Rx of data frames. */ urtwn_write_2(sc, R92C_RXFLTMAP2, 0xffff); /* Flush all AC queues. */ urtwn_write_1(sc, R92C_TXPAUSE, 0); /* Set beacon interval. */ urtwn_write_2(sc, R92C_BCN_INTERVAL, ni->ni_intval); /* Allow Rx from our BSSID only. */ if (ic->ic_promisc == 0) { reg = urtwn_read_4(sc, R92C_RCR); if (vap->iv_opmode != IEEE80211_M_HOSTAP) { reg |= R92C_RCR_CBSSID_DATA; if (vap->iv_opmode != IEEE80211_M_IBSS) reg |= R92C_RCR_CBSSID_BCN; } urtwn_write_4(sc, R92C_RCR, reg); } if (vap->iv_opmode == IEEE80211_M_HOSTAP || vap->iv_opmode == IEEE80211_M_IBSS) { error = urtwn_setup_beacon(sc, ni); if (error != 0) { device_printf(sc->sc_dev, "unable to push beacon into the chip, " "error %d\n", error); goto end_run; } } /* Enable TSF synchronization. */ urtwn_tsf_sync_enable(sc, vap); urtwn_write_1(sc, R92C_SIFS_CCK + 1, 10); urtwn_write_1(sc, R92C_SIFS_OFDM + 1, 10); urtwn_write_1(sc, R92C_SPEC_SIFS + 1, 10); urtwn_write_1(sc, R92C_MAC_SPEC_SIFS + 1, 10); urtwn_write_1(sc, R92C_R2T_SIFS + 1, 10); urtwn_write_1(sc, R92C_T2T_SIFS + 1, 10); /* Intialize rate adaptation. */ if (!(sc->chip & URTWN_CHIP_88E)) urtwn_ra_init(sc); /* Turn link LED on. */ urtwn_set_led(sc, URTWN_LED_LINK, 1); sc->avg_pwdb = -1; /* Reset average RSSI. */ /* Reset temperature calibration state machine. */ sc->sc_flags &= ~URTWN_TEMP_MEASURED; sc->thcal_lctemp = 0; /* Start periodic calibration. */ callout_reset(&sc->sc_calib_to, 2*hz, urtwn_calib_to, sc); end_run: ieee80211_free_node(ni); break; default: break; } URTWN_UNLOCK(sc); IEEE80211_LOCK(ic); return (error != 0 ? error : uvp->newstate(vap, nstate, arg)); } static void urtwn_calib_to(void *arg) { struct urtwn_softc *sc = arg; /* Do it in a process context. */ urtwn_cmd_sleepable(sc, NULL, 0, urtwn_calib_cb); } static void urtwn_calib_cb(struct urtwn_softc *sc, union sec_param *data) { /* Do temperature compensation. */ urtwn_temp_calib(sc); if ((urtwn_read_1(sc, R92C_MSR) & R92C_MSR_MASK) != R92C_MSR_NOLINK) callout_reset(&sc->sc_calib_to, 2*hz, urtwn_calib_to, sc); } static void urtwn_watchdog(void *arg) { struct urtwn_softc *sc = arg; if (sc->sc_txtimer > 0) { if (--sc->sc_txtimer == 0) { device_printf(sc->sc_dev, "device timeout\n"); counter_u64_add(sc->sc_ic.ic_oerrors, 1); return; } callout_reset(&sc->sc_watchdog_ch, hz, urtwn_watchdog, sc); } } static void urtwn_update_avgrssi(struct urtwn_softc *sc, int rate, int8_t rssi) { int pwdb; /* Convert antenna signal to percentage. */ if (rssi <= -100 || rssi >= 20) pwdb = 0; else if (rssi >= 0) pwdb = 100; else pwdb = 100 + rssi; if (!(sc->chip & URTWN_CHIP_88E)) { if (rate <= URTWN_RIDX_CCK11) { /* CCK gain is smaller than OFDM/MCS gain. */ pwdb += 6; if (pwdb > 100) pwdb = 100; if (pwdb <= 14) pwdb -= 4; else if (pwdb <= 26) pwdb -= 8; else if (pwdb <= 34) pwdb -= 6; else if (pwdb <= 42) pwdb -= 2; } } if (sc->avg_pwdb == -1) /* Init. */ sc->avg_pwdb = pwdb; else if (sc->avg_pwdb < pwdb) sc->avg_pwdb = ((sc->avg_pwdb * 19 + pwdb) / 20) + 1; else sc->avg_pwdb = ((sc->avg_pwdb * 19 + pwdb) / 20); URTWN_DPRINTF(sc, URTWN_DEBUG_RSSI, "%s: PWDB %d, EMA %d\n", __func__, pwdb, sc->avg_pwdb); } static int8_t urtwn_get_rssi(struct urtwn_softc *sc, int rate, void *physt) { static const int8_t cckoff[] = { 16, -12, -26, -46 }; struct r92c_rx_phystat *phy; struct r92c_rx_cck *cck; uint8_t rpt; int8_t rssi; if (rate <= URTWN_RIDX_CCK11) { cck = (struct r92c_rx_cck *)physt; if (sc->sc_flags & URTWN_FLAG_CCK_HIPWR) { rpt = (cck->agc_rpt >> 5) & 0x3; rssi = (cck->agc_rpt & 0x1f) << 1; } else { rpt = (cck->agc_rpt >> 6) & 0x3; rssi = cck->agc_rpt & 0x3e; } rssi = cckoff[rpt] - rssi; } else { /* OFDM/HT. */ phy = (struct r92c_rx_phystat *)physt; rssi = ((le32toh(phy->phydw1) >> 1) & 0x7f) - 110; } return (rssi); } static int8_t urtwn_r88e_get_rssi(struct urtwn_softc *sc, int rate, void *physt) { struct r92c_rx_phystat *phy; struct r88e_rx_cck *cck; uint8_t cck_agc_rpt, lna_idx, vga_idx; int8_t rssi; rssi = 0; if (rate <= URTWN_RIDX_CCK11) { cck = (struct r88e_rx_cck *)physt; cck_agc_rpt = cck->agc_rpt; lna_idx = (cck_agc_rpt & 0xe0) >> 5; vga_idx = cck_agc_rpt & 0x1f; switch (lna_idx) { case 7: if (vga_idx <= 27) rssi = -100 + 2* (27 - vga_idx); else rssi = -100; break; case 6: rssi = -48 + 2 * (2 - vga_idx); break; case 5: rssi = -42 + 2 * (7 - vga_idx); break; case 4: rssi = -36 + 2 * (7 - vga_idx); break; case 3: rssi = -24 + 2 * (7 - vga_idx); break; case 2: rssi = -12 + 2 * (5 - vga_idx); break; case 1: rssi = 8 - (2 * vga_idx); break; case 0: rssi = 14 - (2 * vga_idx); break; } rssi += 6; } else { /* OFDM/HT. */ phy = (struct r92c_rx_phystat *)physt; rssi = ((le32toh(phy->phydw1) >> 1) & 0x7f) - 110; } return (rssi); } static int urtwn_tx_data(struct urtwn_softc *sc, struct ieee80211_node *ni, struct mbuf *m, struct urtwn_data *data) { const struct ieee80211_txparam *tp; struct ieee80211com *ic = &sc->sc_ic; struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_key *k = NULL; struct ieee80211_channel *chan; struct ieee80211_frame *wh; struct r92c_tx_desc *txd; uint8_t macid, raid, rate, ridx, type, tid, qos, qsel; int hasqos, ismcast; URTWN_ASSERT_LOCKED(sc); wh = mtod(m, struct ieee80211_frame *); type = wh->i_fc[0] & IEEE80211_FC0_TYPE_MASK; hasqos = IEEE80211_QOS_HAS_SEQ(wh); ismcast = IEEE80211_IS_MULTICAST(wh->i_addr1); /* Select TX ring for this frame. */ if (hasqos) { qos = ((const struct ieee80211_qosframe *)wh)->i_qos[0]; tid = qos & IEEE80211_QOS_TID; } else { qos = 0; tid = 0; } chan = (ni->ni_chan != IEEE80211_CHAN_ANYC) ? ni->ni_chan : ic->ic_curchan; tp = &vap->iv_txparms[ieee80211_chan2mode(chan)]; /* Choose a TX rate index. */ if (type == IEEE80211_FC0_TYPE_MGT) rate = tp->mgmtrate; else if (ismcast) rate = tp->mcastrate; else if (tp->ucastrate != IEEE80211_FIXED_RATE_NONE) rate = tp->ucastrate; else if (m->m_flags & M_EAPOL) rate = tp->mgmtrate; else { if (URTWN_CHIP_HAS_RATECTL(sc)) { /* XXX pass pktlen */ (void) ieee80211_ratectl_rate(ni, NULL, 0); rate = ni->ni_txrate; } else { /* XXX TODO: drop the default rate for 11b/11g? */ if (ni->ni_flags & IEEE80211_NODE_HT) rate = IEEE80211_RATE_MCS | 0x4; /* MCS4 */ else if (ic->ic_curmode != IEEE80211_MODE_11B) rate = 108; else rate = 22; } } /* * XXX TODO: this should be per-node, for 11b versus 11bg * nodes in hostap mode */ ridx = rate2ridx(rate); if (ni->ni_flags & IEEE80211_NODE_HT) raid = R92C_RAID_11GN; else if (ic->ic_curmode != IEEE80211_MODE_11B) raid = R92C_RAID_11BG; else raid = R92C_RAID_11B; if (wh->i_fc[1] & IEEE80211_FC1_PROTECTED) { k = ieee80211_crypto_encap(ni, m); if (k == NULL) { device_printf(sc->sc_dev, "ieee80211_crypto_encap returns NULL.\n"); return (ENOBUFS); } /* in case packet header moved, reset pointer */ wh = mtod(m, struct ieee80211_frame *); } /* Fill Tx descriptor. */ txd = (struct r92c_tx_desc *)data->buf; memset(txd, 0, sizeof(*txd)); txd->txdw0 |= htole32( SM(R92C_TXDW0_OFFSET, sizeof(*txd)) | R92C_TXDW0_OWN | R92C_TXDW0_FSG | R92C_TXDW0_LSG); if (ismcast) txd->txdw0 |= htole32(R92C_TXDW0_BMCAST); if (!ismcast) { /* Unicast frame, check if an ACK is expected. */ if (!qos || (qos & IEEE80211_QOS_ACKPOLICY) != IEEE80211_QOS_ACKPOLICY_NOACK) { txd->txdw5 |= htole32(R92C_TXDW5_RTY_LMT_ENA); txd->txdw5 |= htole32(SM(R92C_TXDW5_RTY_LMT, tp->maxretry)); } if (sc->chip & URTWN_CHIP_88E) { struct urtwn_node *un = URTWN_NODE(ni); macid = un->id; } else macid = URTWN_MACID_BSS; if (type == IEEE80211_FC0_TYPE_DATA) { qsel = tid % URTWN_MAX_TID; if (sc->chip & URTWN_CHIP_88E) { txd->txdw2 |= htole32( R88E_TXDW2_AGGBK | R88E_TXDW2_CCX_RPT); } else txd->txdw1 |= htole32(R92C_TXDW1_AGGBK); /* protmode, non-HT */ /* XXX TODO: noack frames? */ if ((rate & 0x80) == 0 && (ic->ic_flags & IEEE80211_F_USEPROT)) { switch (ic->ic_protmode) { case IEEE80211_PROT_CTSONLY: txd->txdw4 |= htole32( R92C_TXDW4_CTS2SELF); break; case IEEE80211_PROT_RTSCTS: txd->txdw4 |= htole32( R92C_TXDW4_RTSEN | R92C_TXDW4_HWRTSEN); break; default: break; } } /* protmode, HT */ /* XXX TODO: noack frames? */ if ((rate & 0x80) && (ic->ic_htprotmode == IEEE80211_PROT_RTSCTS)) { txd->txdw4 |= htole32( R92C_TXDW4_RTSEN | R92C_TXDW4_HWRTSEN); } /* XXX TODO: rtsrate is configurable? 24mbit may * be a bit high for RTS rate? */ txd->txdw4 |= htole32(SM(R92C_TXDW4_RTSRATE, URTWN_RIDX_OFDM24)); txd->txdw5 |= htole32(0x0001ff00); } else /* IEEE80211_FC0_TYPE_MGT */ qsel = R92C_TXDW1_QSEL_MGNT; } else { macid = URTWN_MACID_BC; qsel = R92C_TXDW1_QSEL_MGNT; } txd->txdw1 |= htole32( SM(R92C_TXDW1_QSEL, qsel) | SM(R92C_TXDW1_RAID, raid)); /* XXX TODO: 40MHZ flag? */ /* XXX TODO: AMPDU flag? (AGG_ENABLE or AGG_BREAK?) Density shift? */ /* XXX Short preamble? */ /* XXX Short-GI? */ if (sc->chip & URTWN_CHIP_88E) txd->txdw1 |= htole32(SM(R88E_TXDW1_MACID, macid)); else txd->txdw1 |= htole32(SM(R92C_TXDW1_MACID, macid)); txd->txdw5 |= htole32(SM(R92C_TXDW5_DATARATE, ridx)); /* Force this rate if needed. */ if (URTWN_CHIP_HAS_RATECTL(sc) || ismcast || (tp->ucastrate != IEEE80211_FIXED_RATE_NONE) || (m->m_flags & M_EAPOL) || type != IEEE80211_FC0_TYPE_DATA) txd->txdw4 |= htole32(R92C_TXDW4_DRVRATE); if (!hasqos) { /* Use HW sequence numbering for non-QoS frames. */ if (sc->chip & URTWN_CHIP_88E) txd->txdseq = htole16(R88E_TXDSEQ_HWSEQ_EN); else txd->txdw4 |= htole32(R92C_TXDW4_HWSEQ_EN); } else { /* Set sequence number. */ txd->txdseq = htole16(M_SEQNO_GET(m) % IEEE80211_SEQ_RANGE); } if (k != NULL && !(k->wk_flags & IEEE80211_KEY_SWCRYPT)) { uint8_t cipher; switch (k->wk_cipher->ic_cipher) { case IEEE80211_CIPHER_WEP: case IEEE80211_CIPHER_TKIP: cipher = R92C_TXDW1_CIPHER_RC4; break; case IEEE80211_CIPHER_AES_CCM: cipher = R92C_TXDW1_CIPHER_AES; break; default: device_printf(sc->sc_dev, "%s: unknown cipher %d\n", __func__, k->wk_cipher->ic_cipher); return (EINVAL); } txd->txdw1 |= htole32(SM(R92C_TXDW1_CIPHER, cipher)); } if (ieee80211_radiotap_active_vap(vap)) { struct urtwn_tx_radiotap_header *tap = &sc->sc_txtap; tap->wt_flags = 0; if (k != NULL) tap->wt_flags |= IEEE80211_RADIOTAP_F_WEP; ieee80211_radiotap_tx(vap, m); } data->ni = ni; urtwn_tx_start(sc, m, type, data); return (0); } static int urtwn_tx_raw(struct urtwn_softc *sc, struct ieee80211_node *ni, struct mbuf *m, struct urtwn_data *data, const struct ieee80211_bpf_params *params) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_key *k = NULL; struct ieee80211_frame *wh; struct r92c_tx_desc *txd; uint8_t cipher, ridx, type; /* Encrypt the frame if need be. */ cipher = R92C_TXDW1_CIPHER_NONE; if (params->ibp_flags & IEEE80211_BPF_CRYPTO) { /* Retrieve key for TX. */ k = ieee80211_crypto_encap(ni, m); if (k == NULL) return (ENOBUFS); if (!(k->wk_flags & IEEE80211_KEY_SWCRYPT)) { switch (k->wk_cipher->ic_cipher) { case IEEE80211_CIPHER_WEP: case IEEE80211_CIPHER_TKIP: cipher = R92C_TXDW1_CIPHER_RC4; break; case IEEE80211_CIPHER_AES_CCM: cipher = R92C_TXDW1_CIPHER_AES; break; default: device_printf(sc->sc_dev, "%s: unknown cipher %d\n", __func__, k->wk_cipher->ic_cipher); return (EINVAL); } } } /* XXX TODO: 11n checks, matching urtwn_tx_data() */ wh = mtod(m, struct ieee80211_frame *); type = wh->i_fc[0] & IEEE80211_FC0_TYPE_MASK; /* Fill Tx descriptor. */ txd = (struct r92c_tx_desc *)data->buf; memset(txd, 0, sizeof(*txd)); txd->txdw0 |= htole32( SM(R92C_TXDW0_OFFSET, sizeof(*txd)) | R92C_TXDW0_OWN | R92C_TXDW0_FSG | R92C_TXDW0_LSG); if (IEEE80211_IS_MULTICAST(wh->i_addr1)) txd->txdw0 |= htole32(R92C_TXDW0_BMCAST); if ((params->ibp_flags & IEEE80211_BPF_NOACK) == 0) { txd->txdw5 |= htole32(R92C_TXDW5_RTY_LMT_ENA); txd->txdw5 |= htole32(SM(R92C_TXDW5_RTY_LMT, params->ibp_try0)); } if (params->ibp_flags & IEEE80211_BPF_RTS) txd->txdw4 |= htole32(R92C_TXDW4_RTSEN | R92C_TXDW4_HWRTSEN); if (params->ibp_flags & IEEE80211_BPF_CTS) txd->txdw4 |= htole32(R92C_TXDW4_CTS2SELF); if (txd->txdw4 & htole32(R92C_TXDW4_RTSEN | R92C_TXDW4_CTS2SELF)) { txd->txdw4 |= htole32(SM(R92C_TXDW4_RTSRATE, URTWN_RIDX_OFDM24)); } if (sc->chip & URTWN_CHIP_88E) txd->txdw1 |= htole32(SM(R88E_TXDW1_MACID, URTWN_MACID_BC)); else txd->txdw1 |= htole32(SM(R92C_TXDW1_MACID, URTWN_MACID_BC)); /* XXX TODO: rate index/config (RAID) for 11n? */ txd->txdw1 |= htole32(SM(R92C_TXDW1_QSEL, R92C_TXDW1_QSEL_MGNT)); txd->txdw1 |= htole32(SM(R92C_TXDW1_CIPHER, cipher)); /* Choose a TX rate index. */ ridx = rate2ridx(params->ibp_rate0); txd->txdw5 |= htole32(SM(R92C_TXDW5_DATARATE, ridx)); txd->txdw5 |= htole32(0x0001ff00); txd->txdw4 |= htole32(R92C_TXDW4_DRVRATE); if (!IEEE80211_QOS_HAS_SEQ(wh)) { /* Use HW sequence numbering for non-QoS frames. */ if (sc->chip & URTWN_CHIP_88E) txd->txdseq = htole16(R88E_TXDSEQ_HWSEQ_EN); else txd->txdw4 |= htole32(R92C_TXDW4_HWSEQ_EN); } else { /* Set sequence number. */ txd->txdseq = htole16(M_SEQNO_GET(m) % IEEE80211_SEQ_RANGE); } if (ieee80211_radiotap_active_vap(vap)) { struct urtwn_tx_radiotap_header *tap = &sc->sc_txtap; tap->wt_flags = 0; if (k != NULL) tap->wt_flags |= IEEE80211_RADIOTAP_F_WEP; ieee80211_radiotap_tx(vap, m); } data->ni = ni; urtwn_tx_start(sc, m, type, data); return (0); } static void urtwn_tx_start(struct urtwn_softc *sc, struct mbuf *m, uint8_t type, struct urtwn_data *data) { struct usb_xfer *xfer; struct r92c_tx_desc *txd; uint16_t ac, sum; int i, xferlen; URTWN_ASSERT_LOCKED(sc); ac = M_WME_GETAC(m); switch (type) { case IEEE80211_FC0_TYPE_CTL: case IEEE80211_FC0_TYPE_MGT: xfer = sc->sc_xfer[URTWN_BULK_TX_VO]; break; default: xfer = sc->sc_xfer[wme2queue[ac].qid]; break; } txd = (struct r92c_tx_desc *)data->buf; txd->txdw0 |= htole32(SM(R92C_TXDW0_PKTLEN, m->m_pkthdr.len)); /* Compute Tx descriptor checksum. */ sum = 0; for (i = 0; i < sizeof(*txd) / 2; i++) sum ^= ((uint16_t *)txd)[i]; txd->txdsum = sum; /* NB: already little endian. */ xferlen = sizeof(*txd) + m->m_pkthdr.len; m_copydata(m, 0, m->m_pkthdr.len, (caddr_t)&txd[1]); data->buflen = xferlen; data->m = m; STAILQ_INSERT_TAIL(&sc->sc_tx_pending, data, next); usbd_transfer_start(xfer); } static int urtwn_transmit(struct ieee80211com *ic, struct mbuf *m) { struct urtwn_softc *sc = ic->ic_softc; int error; URTWN_LOCK(sc); if ((sc->sc_flags & URTWN_RUNNING) == 0) { URTWN_UNLOCK(sc); return (ENXIO); } error = mbufq_enqueue(&sc->sc_snd, m); if (error) { URTWN_UNLOCK(sc); return (error); } urtwn_start(sc); URTWN_UNLOCK(sc); return (0); } static void urtwn_start(struct urtwn_softc *sc) { struct ieee80211_node *ni; struct mbuf *m; struct urtwn_data *bf; URTWN_ASSERT_LOCKED(sc); while ((m = mbufq_dequeue(&sc->sc_snd)) != NULL) { bf = urtwn_getbuf(sc); if (bf == NULL) { mbufq_prepend(&sc->sc_snd, m); break; } ni = (struct ieee80211_node *)m->m_pkthdr.rcvif; m->m_pkthdr.rcvif = NULL; URTWN_DPRINTF(sc, URTWN_DEBUG_XMIT, "%s: called; m=%p\n", __func__, m); if (urtwn_tx_data(sc, ni, m, bf) != 0) { if_inc_counter(ni->ni_vap->iv_ifp, IFCOUNTER_OERRORS, 1); STAILQ_INSERT_HEAD(&sc->sc_tx_inactive, bf, next); m_freem(m); ieee80211_free_node(ni); break; } sc->sc_txtimer = 5; callout_reset(&sc->sc_watchdog_ch, hz, urtwn_watchdog, sc); } } static void urtwn_parent(struct ieee80211com *ic) { struct urtwn_softc *sc = ic->ic_softc; URTWN_LOCK(sc); if (sc->sc_flags & URTWN_DETACHED) { URTWN_UNLOCK(sc); return; } URTWN_UNLOCK(sc); if (ic->ic_nrunning > 0) { if (urtwn_init(sc) != 0) { struct ieee80211vap *vap = TAILQ_FIRST(&ic->ic_vaps); if (vap != NULL) ieee80211_stop(vap); } else ieee80211_start_all(ic); } else urtwn_stop(sc); } static __inline int urtwn_power_on(struct urtwn_softc *sc) { return sc->sc_power_on(sc); } static int urtwn_r92c_power_on(struct urtwn_softc *sc) { uint32_t reg; usb_error_t error; int ntries; /* Wait for autoload done bit. */ for (ntries = 0; ntries < 1000; ntries++) { if (urtwn_read_1(sc, R92C_APS_FSMCO) & R92C_APS_FSMCO_PFM_ALDN) break; urtwn_ms_delay(sc); } if (ntries == 1000) { device_printf(sc->sc_dev, "timeout waiting for chip autoload\n"); return (ETIMEDOUT); } /* Unlock ISO/CLK/Power control register. */ error = urtwn_write_1(sc, R92C_RSV_CTRL, 0); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); /* Move SPS into PWM mode. */ error = urtwn_write_1(sc, R92C_SPS0_CTRL, 0x2b); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); urtwn_ms_delay(sc); reg = urtwn_read_1(sc, R92C_LDOV12D_CTRL); if (!(reg & R92C_LDOV12D_CTRL_LDV12_EN)) { error = urtwn_write_1(sc, R92C_LDOV12D_CTRL, reg | R92C_LDOV12D_CTRL_LDV12_EN); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); urtwn_ms_delay(sc); error = urtwn_write_1(sc, R92C_SYS_ISO_CTRL, urtwn_read_1(sc, R92C_SYS_ISO_CTRL) & ~R92C_SYS_ISO_CTRL_MD2PP); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); } /* Auto enable WLAN. */ error = urtwn_write_2(sc, R92C_APS_FSMCO, urtwn_read_2(sc, R92C_APS_FSMCO) | R92C_APS_FSMCO_APFM_ONMAC); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); for (ntries = 0; ntries < 1000; ntries++) { if (!(urtwn_read_2(sc, R92C_APS_FSMCO) & R92C_APS_FSMCO_APFM_ONMAC)) break; urtwn_ms_delay(sc); } if (ntries == 1000) { device_printf(sc->sc_dev, "timeout waiting for MAC auto ON\n"); return (ETIMEDOUT); } /* Enable radio, GPIO and LED functions. */ error = urtwn_write_2(sc, R92C_APS_FSMCO, R92C_APS_FSMCO_AFSM_HSUS | R92C_APS_FSMCO_PDN_EN | R92C_APS_FSMCO_PFM_ALDN); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); /* Release RF digital isolation. */ error = urtwn_write_2(sc, R92C_SYS_ISO_CTRL, urtwn_read_2(sc, R92C_SYS_ISO_CTRL) & ~R92C_SYS_ISO_CTRL_DIOR); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); /* Initialize MAC. */ error = urtwn_write_1(sc, R92C_APSD_CTRL, urtwn_read_1(sc, R92C_APSD_CTRL) & ~R92C_APSD_CTRL_OFF); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); for (ntries = 0; ntries < 200; ntries++) { if (!(urtwn_read_1(sc, R92C_APSD_CTRL) & R92C_APSD_CTRL_OFF_STATUS)) break; urtwn_ms_delay(sc); } if (ntries == 200) { device_printf(sc->sc_dev, "timeout waiting for MAC initialization\n"); return (ETIMEDOUT); } /* Enable MAC DMA/WMAC/SCHEDULE/SEC blocks. */ reg = urtwn_read_2(sc, R92C_CR); reg |= R92C_CR_HCI_TXDMA_EN | R92C_CR_HCI_RXDMA_EN | R92C_CR_TXDMA_EN | R92C_CR_RXDMA_EN | R92C_CR_PROTOCOL_EN | R92C_CR_SCHEDULE_EN | R92C_CR_MACTXEN | R92C_CR_MACRXEN | R92C_CR_ENSEC; error = urtwn_write_2(sc, R92C_CR, reg); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); error = urtwn_write_1(sc, 0xfe10, 0x19); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); return (0); } static int urtwn_r88e_power_on(struct urtwn_softc *sc) { uint32_t reg; usb_error_t error; int ntries; /* Wait for power ready bit. */ for (ntries = 0; ntries < 5000; ntries++) { if (urtwn_read_4(sc, R92C_APS_FSMCO) & R92C_APS_FSMCO_SUS_HOST) break; urtwn_ms_delay(sc); } if (ntries == 5000) { device_printf(sc->sc_dev, "timeout waiting for chip power up\n"); return (ETIMEDOUT); } /* Reset BB. */ error = urtwn_write_1(sc, R92C_SYS_FUNC_EN, urtwn_read_1(sc, R92C_SYS_FUNC_EN) & ~(R92C_SYS_FUNC_EN_BBRSTB | R92C_SYS_FUNC_EN_BB_GLB_RST)); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); error = urtwn_write_1(sc, R92C_AFE_XTAL_CTRL + 2, urtwn_read_1(sc, R92C_AFE_XTAL_CTRL + 2) | 0x80); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); /* Disable HWPDN. */ error = urtwn_write_2(sc, R92C_APS_FSMCO, urtwn_read_2(sc, R92C_APS_FSMCO) & ~R92C_APS_FSMCO_APDM_HPDN); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); /* Disable WL suspend. */ error = urtwn_write_2(sc, R92C_APS_FSMCO, urtwn_read_2(sc, R92C_APS_FSMCO) & ~(R92C_APS_FSMCO_AFSM_HSUS | R92C_APS_FSMCO_AFSM_PCIE)); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); error = urtwn_write_2(sc, R92C_APS_FSMCO, urtwn_read_2(sc, R92C_APS_FSMCO) | R92C_APS_FSMCO_APFM_ONMAC); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); for (ntries = 0; ntries < 5000; ntries++) { if (!(urtwn_read_2(sc, R92C_APS_FSMCO) & R92C_APS_FSMCO_APFM_ONMAC)) break; urtwn_ms_delay(sc); } if (ntries == 5000) return (ETIMEDOUT); /* Enable LDO normal mode. */ error = urtwn_write_1(sc, R92C_LPLDO_CTRL, urtwn_read_1(sc, R92C_LPLDO_CTRL) & ~R92C_LPLDO_CTRL_SLEEP); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); /* Enable MAC DMA/WMAC/SCHEDULE/SEC blocks. */ error = urtwn_write_2(sc, R92C_CR, 0); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); reg = urtwn_read_2(sc, R92C_CR); reg |= R92C_CR_HCI_TXDMA_EN | R92C_CR_HCI_RXDMA_EN | R92C_CR_TXDMA_EN | R92C_CR_RXDMA_EN | R92C_CR_PROTOCOL_EN | R92C_CR_SCHEDULE_EN | R92C_CR_ENSEC | R92C_CR_CALTMR_EN; error = urtwn_write_2(sc, R92C_CR, reg); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); return (0); } static __inline void urtwn_power_off(struct urtwn_softc *sc) { return sc->sc_power_off(sc); } static void urtwn_r92c_power_off(struct urtwn_softc *sc) { uint32_t reg; /* Block all Tx queues. */ urtwn_write_1(sc, R92C_TXPAUSE, R92C_TX_QUEUE_ALL); /* Disable RF */ urtwn_rf_write(sc, 0, 0, 0); urtwn_write_1(sc, R92C_APSD_CTRL, R92C_APSD_CTRL_OFF); /* Reset BB state machine */ urtwn_write_1(sc, R92C_SYS_FUNC_EN, R92C_SYS_FUNC_EN_USBD | R92C_SYS_FUNC_EN_USBA | R92C_SYS_FUNC_EN_BB_GLB_RST); urtwn_write_1(sc, R92C_SYS_FUNC_EN, R92C_SYS_FUNC_EN_USBD | R92C_SYS_FUNC_EN_USBA); /* * Reset digital sequence */ #ifndef URTWN_WITHOUT_UCODE if (urtwn_read_1(sc, R92C_MCUFWDL) & R92C_MCUFWDL_RDY) { /* Reset MCU ready status */ urtwn_write_1(sc, R92C_MCUFWDL, 0); /* If firmware in ram code, do reset */ urtwn_fw_reset(sc); } #endif /* Reset MAC and Enable 8051 */ urtwn_write_1(sc, R92C_SYS_FUNC_EN + 1, (R92C_SYS_FUNC_EN_CPUEN | R92C_SYS_FUNC_EN_ELDR | R92C_SYS_FUNC_EN_HWPDN) >> 8); /* Reset MCU ready status */ urtwn_write_1(sc, R92C_MCUFWDL, 0); /* Disable MAC clock */ urtwn_write_2(sc, R92C_SYS_CLKR, R92C_SYS_CLKR_ANAD16V_EN | R92C_SYS_CLKR_ANA8M | R92C_SYS_CLKR_LOADER_EN | R92C_SYS_CLKR_80M_SSC_DIS | R92C_SYS_CLKR_SYS_EN | R92C_SYS_CLKR_RING_EN | 0x4000); /* Disable AFE PLL */ urtwn_write_1(sc, R92C_AFE_PLL_CTRL, 0x80); /* Gated AFE DIG_CLOCK */ urtwn_write_2(sc, R92C_AFE_XTAL_CTRL, 0x880F); /* Isolated digital to PON */ urtwn_write_1(sc, R92C_SYS_ISO_CTRL, R92C_SYS_ISO_CTRL_MD2PP | R92C_SYS_ISO_CTRL_PA2PCIE | R92C_SYS_ISO_CTRL_PD2CORE | R92C_SYS_ISO_CTRL_IP2MAC | R92C_SYS_ISO_CTRL_DIOP | R92C_SYS_ISO_CTRL_DIOE); /* * Pull GPIO PIN to balance level and LED control */ /* 1. Disable GPIO[7:0] */ urtwn_write_2(sc, R92C_GPIO_IOSEL, 0x0000); reg = urtwn_read_4(sc, R92C_GPIO_PIN_CTRL) & ~0x0000ff00; reg |= ((reg << 8) & 0x0000ff00) | 0x00ff0000; urtwn_write_4(sc, R92C_GPIO_PIN_CTRL, reg); /* Disable GPIO[10:8] */ urtwn_write_1(sc, R92C_MAC_PINMUX_CFG, 0x00); reg = urtwn_read_2(sc, R92C_GPIO_IO_SEL) & ~0x00f0; reg |= (((reg & 0x000f) << 4) | 0x0780); urtwn_write_2(sc, R92C_GPIO_IO_SEL, reg); /* Disable LED0 & 1 */ urtwn_write_2(sc, R92C_LEDCFG0, 0x8080); /* * Reset digital sequence */ /* Disable ELDR clock */ urtwn_write_2(sc, R92C_SYS_CLKR, R92C_SYS_CLKR_ANAD16V_EN | R92C_SYS_CLKR_ANA8M | R92C_SYS_CLKR_LOADER_EN | R92C_SYS_CLKR_80M_SSC_DIS | R92C_SYS_CLKR_SYS_EN | R92C_SYS_CLKR_RING_EN | 0x4000); /* Isolated ELDR to PON */ urtwn_write_1(sc, R92C_SYS_ISO_CTRL + 1, (R92C_SYS_ISO_CTRL_DIOR | R92C_SYS_ISO_CTRL_PWC_EV12V) >> 8); /* * Disable analog sequence */ /* Disable A15 power */ urtwn_write_1(sc, R92C_LDOA15_CTRL, R92C_LDOA15_CTRL_OBUF); /* Disable digital core power */ urtwn_write_1(sc, R92C_LDOV12D_CTRL, urtwn_read_1(sc, R92C_LDOV12D_CTRL) & ~R92C_LDOV12D_CTRL_LDV12_EN); /* Enter PFM mode */ urtwn_write_1(sc, R92C_SPS0_CTRL, 0x23); /* Set USB suspend */ urtwn_write_2(sc, R92C_APS_FSMCO, R92C_APS_FSMCO_APDM_HOST | R92C_APS_FSMCO_AFSM_HSUS | R92C_APS_FSMCO_PFM_ALDN); /* Lock ISO/CLK/Power control register. */ urtwn_write_1(sc, R92C_RSV_CTRL, 0x0E); } static void urtwn_r88e_power_off(struct urtwn_softc *sc) { uint8_t reg; int ntries; /* Disable any kind of TX reports. */ urtwn_write_1(sc, R88E_TX_RPT_CTRL, urtwn_read_1(sc, R88E_TX_RPT_CTRL) & ~(R88E_TX_RPT1_ENA | R88E_TX_RPT2_ENA)); /* Stop Rx. */ urtwn_write_1(sc, R92C_CR, 0); /* Move card to Low Power State. */ /* Block all Tx queues. */ urtwn_write_1(sc, R92C_TXPAUSE, R92C_TX_QUEUE_ALL); for (ntries = 0; ntries < 20; ntries++) { /* Should be zero if no packet is transmitting. */ if (urtwn_read_4(sc, R88E_SCH_TXCMD) == 0) break; urtwn_ms_delay(sc); } if (ntries == 20) { device_printf(sc->sc_dev, "%s: failed to block Tx queues\n", __func__); return; } /* CCK and OFDM are disabled, and clock are gated. */ urtwn_write_1(sc, R92C_SYS_FUNC_EN, urtwn_read_1(sc, R92C_SYS_FUNC_EN) & ~R92C_SYS_FUNC_EN_BBRSTB); urtwn_ms_delay(sc); /* Reset MAC TRX */ urtwn_write_1(sc, R92C_CR, R92C_CR_HCI_TXDMA_EN | R92C_CR_HCI_RXDMA_EN | R92C_CR_TXDMA_EN | R92C_CR_RXDMA_EN | R92C_CR_PROTOCOL_EN | R92C_CR_SCHEDULE_EN); /* check if removed later */ urtwn_write_1(sc, R92C_CR + 1, urtwn_read_1(sc, R92C_CR + 1) & ~(R92C_CR_ENSEC >> 8)); /* Respond TxOK to scheduler */ urtwn_write_1(sc, R92C_DUAL_TSF_RST, urtwn_read_1(sc, R92C_DUAL_TSF_RST) | 0x20); /* If firmware in ram code, do reset. */ #ifndef URTWN_WITHOUT_UCODE if (urtwn_read_1(sc, R92C_MCUFWDL) & R92C_MCUFWDL_RDY) urtwn_r88e_fw_reset(sc); #endif /* Reset MCU ready status. */ urtwn_write_1(sc, R92C_MCUFWDL, 0x00); /* Disable 32k. */ urtwn_write_1(sc, R88E_32K_CTRL, urtwn_read_1(sc, R88E_32K_CTRL) & ~0x01); /* Move card to Disabled state. */ /* Turn off RF. */ urtwn_write_1(sc, R92C_RF_CTRL, 0); /* LDO Sleep mode. */ urtwn_write_1(sc, R92C_LPLDO_CTRL, urtwn_read_1(sc, R92C_LPLDO_CTRL) | R92C_LPLDO_CTRL_SLEEP); /* Turn off MAC by HW state machine */ urtwn_write_1(sc, R92C_APS_FSMCO + 1, urtwn_read_1(sc, R92C_APS_FSMCO + 1) | (R92C_APS_FSMCO_APFM_OFF >> 8)); for (ntries = 0; ntries < 20; ntries++) { /* Wait until it will be disabled. */ if ((urtwn_read_1(sc, R92C_APS_FSMCO + 1) & (R92C_APS_FSMCO_APFM_OFF >> 8)) == 0) break; urtwn_ms_delay(sc); } if (ntries == 20) { device_printf(sc->sc_dev, "%s: could not turn off MAC\n", __func__); return; } /* schmit trigger */ urtwn_write_1(sc, R92C_AFE_XTAL_CTRL + 2, urtwn_read_1(sc, R92C_AFE_XTAL_CTRL + 2) | 0x80); /* Enable WL suspend. */ urtwn_write_1(sc, R92C_APS_FSMCO + 1, (urtwn_read_1(sc, R92C_APS_FSMCO + 1) & ~0x10) | 0x08); /* Enable bandgap mbias in suspend. */ urtwn_write_1(sc, R92C_APS_FSMCO + 3, 0); /* Clear SIC_EN register. */ urtwn_write_1(sc, R92C_GPIO_MUXCFG + 1, urtwn_read_1(sc, R92C_GPIO_MUXCFG + 1) & ~0x10); /* Set USB suspend enable local register */ urtwn_write_1(sc, R92C_USB_SUSPEND, urtwn_read_1(sc, R92C_USB_SUSPEND) | 0x10); /* Reset MCU IO Wrapper. */ reg = urtwn_read_1(sc, R92C_RSV_CTRL + 1); urtwn_write_1(sc, R92C_RSV_CTRL + 1, reg & ~0x08); urtwn_write_1(sc, R92C_RSV_CTRL + 1, reg | 0x08); /* marked as 'For Power Consumption' code. */ urtwn_write_1(sc, R92C_GPIO_OUT, urtwn_read_1(sc, R92C_GPIO_IN)); urtwn_write_1(sc, R92C_GPIO_IOSEL, 0xff); urtwn_write_1(sc, R92C_GPIO_IO_SEL, urtwn_read_1(sc, R92C_GPIO_IO_SEL) << 4); urtwn_write_1(sc, R92C_GPIO_MOD, urtwn_read_1(sc, R92C_GPIO_MOD) | 0x0f); /* Set LNA, TRSW, EX_PA Pin to output mode. */ urtwn_write_4(sc, R88E_BB_PAD_CTRL, 0x00080808); } static int urtwn_llt_init(struct urtwn_softc *sc) { int i, error, page_count, pktbuf_count; page_count = (sc->chip & URTWN_CHIP_88E) ? R88E_TX_PAGE_COUNT : R92C_TX_PAGE_COUNT; pktbuf_count = (sc->chip & URTWN_CHIP_88E) ? R88E_TXPKTBUF_COUNT : R92C_TXPKTBUF_COUNT; /* Reserve pages [0; page_count]. */ for (i = 0; i < page_count; i++) { if ((error = urtwn_llt_write(sc, i, i + 1)) != 0) return (error); } /* NB: 0xff indicates end-of-list. */ if ((error = urtwn_llt_write(sc, i, 0xff)) != 0) return (error); /* * Use pages [page_count + 1; pktbuf_count - 1] * as ring buffer. */ for (++i; i < pktbuf_count - 1; i++) { if ((error = urtwn_llt_write(sc, i, i + 1)) != 0) return (error); } /* Make the last page point to the beginning of the ring buffer. */ error = urtwn_llt_write(sc, i, page_count + 1); return (error); } #ifndef URTWN_WITHOUT_UCODE static void urtwn_fw_reset(struct urtwn_softc *sc) { uint16_t reg; int ntries; /* Tell 8051 to reset itself. */ urtwn_write_1(sc, R92C_HMETFR + 3, 0x20); /* Wait until 8051 resets by itself. */ for (ntries = 0; ntries < 100; ntries++) { reg = urtwn_read_2(sc, R92C_SYS_FUNC_EN); if (!(reg & R92C_SYS_FUNC_EN_CPUEN)) return; urtwn_ms_delay(sc); } /* Force 8051 reset. */ urtwn_write_2(sc, R92C_SYS_FUNC_EN, reg & ~R92C_SYS_FUNC_EN_CPUEN); } static void urtwn_r88e_fw_reset(struct urtwn_softc *sc) { uint16_t reg; reg = urtwn_read_2(sc, R92C_SYS_FUNC_EN); urtwn_write_2(sc, R92C_SYS_FUNC_EN, reg & ~R92C_SYS_FUNC_EN_CPUEN); urtwn_write_2(sc, R92C_SYS_FUNC_EN, reg | R92C_SYS_FUNC_EN_CPUEN); } static int urtwn_fw_loadpage(struct urtwn_softc *sc, int page, const uint8_t *buf, int len) { uint32_t reg; usb_error_t error = USB_ERR_NORMAL_COMPLETION; int off, mlen; reg = urtwn_read_4(sc, R92C_MCUFWDL); reg = RW(reg, R92C_MCUFWDL_PAGE, page); urtwn_write_4(sc, R92C_MCUFWDL, reg); off = R92C_FW_START_ADDR; while (len > 0) { if (len > 196) mlen = 196; else if (len > 4) mlen = 4; else mlen = 1; /* XXX fix this deconst */ error = urtwn_write_region_1(sc, off, __DECONST(uint8_t *, buf), mlen); if (error != USB_ERR_NORMAL_COMPLETION) break; off += mlen; buf += mlen; len -= mlen; } return (error); } static int urtwn_load_firmware(struct urtwn_softc *sc) { const struct firmware *fw; const struct r92c_fw_hdr *hdr; const char *imagename; const u_char *ptr; size_t len; uint32_t reg; int mlen, ntries, page, error; URTWN_UNLOCK(sc); /* Read firmware image from the filesystem. */ if (sc->chip & URTWN_CHIP_88E) imagename = "urtwn-rtl8188eufw"; else if ((sc->chip & (URTWN_CHIP_UMC_A_CUT | URTWN_CHIP_92C)) == URTWN_CHIP_UMC_A_CUT) imagename = "urtwn-rtl8192cfwU"; else imagename = "urtwn-rtl8192cfwT"; fw = firmware_get(imagename); URTWN_LOCK(sc); if (fw == NULL) { device_printf(sc->sc_dev, "failed loadfirmware of file %s\n", imagename); return (ENOENT); } len = fw->datasize; if (len < sizeof(*hdr)) { device_printf(sc->sc_dev, "firmware too short\n"); error = EINVAL; goto fail; } ptr = fw->data; hdr = (const struct r92c_fw_hdr *)ptr; /* Check if there is a valid FW header and skip it. */ if ((le16toh(hdr->signature) >> 4) == 0x88c || (le16toh(hdr->signature) >> 4) == 0x88e || (le16toh(hdr->signature) >> 4) == 0x92c) { URTWN_DPRINTF(sc, URTWN_DEBUG_FIRMWARE, "FW V%d.%d %02d-%02d %02d:%02d\n", le16toh(hdr->version), le16toh(hdr->subversion), hdr->month, hdr->date, hdr->hour, hdr->minute); ptr += sizeof(*hdr); len -= sizeof(*hdr); } if (urtwn_read_1(sc, R92C_MCUFWDL) & R92C_MCUFWDL_RAM_DL_SEL) { if (sc->chip & URTWN_CHIP_88E) urtwn_r88e_fw_reset(sc); else urtwn_fw_reset(sc); urtwn_write_1(sc, R92C_MCUFWDL, 0); } if (!(sc->chip & URTWN_CHIP_88E)) { urtwn_write_2(sc, R92C_SYS_FUNC_EN, urtwn_read_2(sc, R92C_SYS_FUNC_EN) | R92C_SYS_FUNC_EN_CPUEN); } urtwn_write_1(sc, R92C_MCUFWDL, urtwn_read_1(sc, R92C_MCUFWDL) | R92C_MCUFWDL_EN); urtwn_write_1(sc, R92C_MCUFWDL + 2, urtwn_read_1(sc, R92C_MCUFWDL + 2) & ~0x08); /* Reset the FWDL checksum. */ urtwn_write_1(sc, R92C_MCUFWDL, urtwn_read_1(sc, R92C_MCUFWDL) | R92C_MCUFWDL_CHKSUM_RPT); for (page = 0; len > 0; page++) { mlen = min(len, R92C_FW_PAGE_SIZE); error = urtwn_fw_loadpage(sc, page, ptr, mlen); if (error != 0) { device_printf(sc->sc_dev, "could not load firmware page\n"); goto fail; } ptr += mlen; len -= mlen; } urtwn_write_1(sc, R92C_MCUFWDL, urtwn_read_1(sc, R92C_MCUFWDL) & ~R92C_MCUFWDL_EN); urtwn_write_1(sc, R92C_MCUFWDL + 1, 0); /* Wait for checksum report. */ for (ntries = 0; ntries < 1000; ntries++) { if (urtwn_read_4(sc, R92C_MCUFWDL) & R92C_MCUFWDL_CHKSUM_RPT) break; urtwn_ms_delay(sc); } if (ntries == 1000) { device_printf(sc->sc_dev, "timeout waiting for checksum report\n"); error = ETIMEDOUT; goto fail; } reg = urtwn_read_4(sc, R92C_MCUFWDL); reg = (reg & ~R92C_MCUFWDL_WINTINI_RDY) | R92C_MCUFWDL_RDY; urtwn_write_4(sc, R92C_MCUFWDL, reg); if (sc->chip & URTWN_CHIP_88E) urtwn_r88e_fw_reset(sc); /* Wait for firmware readiness. */ for (ntries = 0; ntries < 1000; ntries++) { if (urtwn_read_4(sc, R92C_MCUFWDL) & R92C_MCUFWDL_WINTINI_RDY) break; urtwn_ms_delay(sc); } if (ntries == 1000) { device_printf(sc->sc_dev, "timeout waiting for firmware readiness\n"); error = ETIMEDOUT; goto fail; } fail: firmware_put(fw, FIRMWARE_UNLOAD); return (error); } #endif static int urtwn_dma_init(struct urtwn_softc *sc) { struct usb_endpoint *ep, *ep_end; usb_error_t usb_err; uint32_t reg; int hashq, hasnq, haslq, nqueues, ntx; int error, pagecount, npubqpages, nqpages, nrempages, tx_boundary; /* Initialize LLT table. */ error = urtwn_llt_init(sc); if (error != 0) return (error); /* Determine the number of bulk-out pipes. */ ntx = 0; ep = sc->sc_udev->endpoints; ep_end = sc->sc_udev->endpoints + sc->sc_udev->endpoints_max; for (; ep != ep_end; ep++) { if ((ep->edesc == NULL) || (ep->iface_index != sc->sc_iface_index)) continue; if (UE_GET_DIR(ep->edesc->bEndpointAddress) == UE_DIR_OUT) ntx++; } if (ntx == 0) { device_printf(sc->sc_dev, "%d: invalid number of Tx bulk pipes\n", ntx); return (EIO); } /* Get Tx queues to USB endpoints mapping. */ hashq = hasnq = haslq = nqueues = 0; switch (ntx) { case 1: hashq = 1; break; case 2: hashq = hasnq = 1; break; case 3: case 4: hashq = hasnq = haslq = 1; break; } nqueues = hashq + hasnq + haslq; if (nqueues == 0) return (EIO); npubqpages = nqpages = nrempages = pagecount = 0; if (sc->chip & URTWN_CHIP_88E) tx_boundary = R88E_TX_PAGE_BOUNDARY; else { pagecount = R92C_TX_PAGE_COUNT; npubqpages = R92C_PUBQ_NPAGES; tx_boundary = R92C_TX_PAGE_BOUNDARY; } /* Set number of pages for normal priority queue. */ if (sc->chip & URTWN_CHIP_88E) { usb_err = urtwn_write_2(sc, R92C_RQPN_NPQ, 0xd); if (usb_err != USB_ERR_NORMAL_COMPLETION) return (EIO); usb_err = urtwn_write_4(sc, R92C_RQPN, 0x808e000d); if (usb_err != USB_ERR_NORMAL_COMPLETION) return (EIO); } else { /* Get the number of pages for each queue. */ nqpages = (pagecount - npubqpages) / nqueues; /* * The remaining pages are assigned to the high priority * queue. */ nrempages = (pagecount - npubqpages) % nqueues; usb_err = urtwn_write_1(sc, R92C_RQPN_NPQ, hasnq ? nqpages : 0); if (usb_err != USB_ERR_NORMAL_COMPLETION) return (EIO); usb_err = urtwn_write_4(sc, R92C_RQPN, /* Set number of pages for public queue. */ SM(R92C_RQPN_PUBQ, npubqpages) | /* Set number of pages for high priority queue. */ SM(R92C_RQPN_HPQ, hashq ? nqpages + nrempages : 0) | /* Set number of pages for low priority queue. */ SM(R92C_RQPN_LPQ, haslq ? nqpages : 0) | /* Load values. */ R92C_RQPN_LD); if (usb_err != USB_ERR_NORMAL_COMPLETION) return (EIO); } usb_err = urtwn_write_1(sc, R92C_TXPKTBUF_BCNQ_BDNY, tx_boundary); if (usb_err != USB_ERR_NORMAL_COMPLETION) return (EIO); usb_err = urtwn_write_1(sc, R92C_TXPKTBUF_MGQ_BDNY, tx_boundary); if (usb_err != USB_ERR_NORMAL_COMPLETION) return (EIO); usb_err = urtwn_write_1(sc, R92C_TXPKTBUF_WMAC_LBK_BF_HD, tx_boundary); if (usb_err != USB_ERR_NORMAL_COMPLETION) return (EIO); usb_err = urtwn_write_1(sc, R92C_TRXFF_BNDY, tx_boundary); if (usb_err != USB_ERR_NORMAL_COMPLETION) return (EIO); usb_err = urtwn_write_1(sc, R92C_TDECTRL + 1, tx_boundary); if (usb_err != USB_ERR_NORMAL_COMPLETION) return (EIO); /* Set queue to USB pipe mapping. */ reg = urtwn_read_2(sc, R92C_TRXDMA_CTRL); reg &= ~R92C_TRXDMA_CTRL_QMAP_M; if (nqueues == 1) { if (hashq) reg |= R92C_TRXDMA_CTRL_QMAP_HQ; else if (hasnq) reg |= R92C_TRXDMA_CTRL_QMAP_NQ; else reg |= R92C_TRXDMA_CTRL_QMAP_LQ; } else if (nqueues == 2) { /* * All 2-endpoints configs have high and normal * priority queues. */ reg |= R92C_TRXDMA_CTRL_QMAP_HQ_NQ; } else reg |= R92C_TRXDMA_CTRL_QMAP_3EP; usb_err = urtwn_write_2(sc, R92C_TRXDMA_CTRL, reg); if (usb_err != USB_ERR_NORMAL_COMPLETION) return (EIO); /* Set Tx/Rx transfer page boundary. */ usb_err = urtwn_write_2(sc, R92C_TRXFF_BNDY + 2, (sc->chip & URTWN_CHIP_88E) ? 0x23ff : 0x27ff); if (usb_err != USB_ERR_NORMAL_COMPLETION) return (EIO); /* Set Tx/Rx transfer page size. */ usb_err = urtwn_write_1(sc, R92C_PBP, SM(R92C_PBP_PSRX, R92C_PBP_128) | SM(R92C_PBP_PSTX, R92C_PBP_128)); if (usb_err != USB_ERR_NORMAL_COMPLETION) return (EIO); return (0); } static int urtwn_mac_init(struct urtwn_softc *sc) { usb_error_t error; int i; /* Write MAC initialization values. */ if (sc->chip & URTWN_CHIP_88E) { for (i = 0; i < nitems(rtl8188eu_mac); i++) { error = urtwn_write_1(sc, rtl8188eu_mac[i].reg, rtl8188eu_mac[i].val); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); } urtwn_write_1(sc, R92C_MAX_AGGR_NUM, 0x07); } else { for (i = 0; i < nitems(rtl8192cu_mac); i++) error = urtwn_write_1(sc, rtl8192cu_mac[i].reg, rtl8192cu_mac[i].val); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); } return (0); } static void urtwn_bb_init(struct urtwn_softc *sc) { const struct urtwn_bb_prog *prog; uint32_t reg; uint8_t crystalcap; int i; /* Enable BB and RF. */ urtwn_write_2(sc, R92C_SYS_FUNC_EN, urtwn_read_2(sc, R92C_SYS_FUNC_EN) | R92C_SYS_FUNC_EN_BBRSTB | R92C_SYS_FUNC_EN_BB_GLB_RST | R92C_SYS_FUNC_EN_DIO_RF); if (!(sc->chip & URTWN_CHIP_88E)) urtwn_write_2(sc, R92C_AFE_PLL_CTRL, 0xdb83); urtwn_write_1(sc, R92C_RF_CTRL, R92C_RF_CTRL_EN | R92C_RF_CTRL_RSTB | R92C_RF_CTRL_SDMRSTB); urtwn_write_1(sc, R92C_SYS_FUNC_EN, R92C_SYS_FUNC_EN_USBA | R92C_SYS_FUNC_EN_USBD | R92C_SYS_FUNC_EN_BB_GLB_RST | R92C_SYS_FUNC_EN_BBRSTB); if (!(sc->chip & URTWN_CHIP_88E)) { urtwn_write_1(sc, R92C_LDOHCI12_CTRL, 0x0f); urtwn_write_1(sc, 0x15, 0xe9); urtwn_write_1(sc, R92C_AFE_XTAL_CTRL + 1, 0x80); } /* Select BB programming based on board type. */ if (sc->chip & URTWN_CHIP_88E) prog = &rtl8188eu_bb_prog; else if (!(sc->chip & URTWN_CHIP_92C)) { if (sc->board_type == R92C_BOARD_TYPE_MINICARD) prog = &rtl8188ce_bb_prog; else if (sc->board_type == R92C_BOARD_TYPE_HIGHPA) prog = &rtl8188ru_bb_prog; else prog = &rtl8188cu_bb_prog; } else { if (sc->board_type == R92C_BOARD_TYPE_MINICARD) prog = &rtl8192ce_bb_prog; else prog = &rtl8192cu_bb_prog; } /* Write BB initialization values. */ for (i = 0; i < prog->count; i++) { urtwn_bb_write(sc, prog->regs[i], prog->vals[i]); urtwn_ms_delay(sc); } if (sc->chip & URTWN_CHIP_92C_1T2R) { /* 8192C 1T only configuration. */ reg = urtwn_bb_read(sc, R92C_FPGA0_TXINFO); reg = (reg & ~0x00000003) | 0x2; urtwn_bb_write(sc, R92C_FPGA0_TXINFO, reg); reg = urtwn_bb_read(sc, R92C_FPGA1_TXINFO); reg = (reg & ~0x00300033) | 0x00200022; urtwn_bb_write(sc, R92C_FPGA1_TXINFO, reg); reg = urtwn_bb_read(sc, R92C_CCK0_AFESETTING); reg = (reg & ~0xff000000) | 0x45 << 24; urtwn_bb_write(sc, R92C_CCK0_AFESETTING, reg); reg = urtwn_bb_read(sc, R92C_OFDM0_TRXPATHENA); reg = (reg & ~0x000000ff) | 0x23; urtwn_bb_write(sc, R92C_OFDM0_TRXPATHENA, reg); reg = urtwn_bb_read(sc, R92C_OFDM0_AGCPARAM1); reg = (reg & ~0x00000030) | 1 << 4; urtwn_bb_write(sc, R92C_OFDM0_AGCPARAM1, reg); reg = urtwn_bb_read(sc, 0xe74); reg = (reg & ~0x0c000000) | 2 << 26; urtwn_bb_write(sc, 0xe74, reg); reg = urtwn_bb_read(sc, 0xe78); reg = (reg & ~0x0c000000) | 2 << 26; urtwn_bb_write(sc, 0xe78, reg); reg = urtwn_bb_read(sc, 0xe7c); reg = (reg & ~0x0c000000) | 2 << 26; urtwn_bb_write(sc, 0xe7c, reg); reg = urtwn_bb_read(sc, 0xe80); reg = (reg & ~0x0c000000) | 2 << 26; urtwn_bb_write(sc, 0xe80, reg); reg = urtwn_bb_read(sc, 0xe88); reg = (reg & ~0x0c000000) | 2 << 26; urtwn_bb_write(sc, 0xe88, reg); } /* Write AGC values. */ for (i = 0; i < prog->agccount; i++) { urtwn_bb_write(sc, R92C_OFDM0_AGCRSSITABLE, prog->agcvals[i]); urtwn_ms_delay(sc); } if (sc->chip & URTWN_CHIP_88E) { urtwn_bb_write(sc, R92C_OFDM0_AGCCORE1(0), 0x69553422); urtwn_ms_delay(sc); urtwn_bb_write(sc, R92C_OFDM0_AGCCORE1(0), 0x69553420); urtwn_ms_delay(sc); crystalcap = sc->rom.r88e_rom.crystalcap; if (crystalcap == 0xff) crystalcap = 0x20; crystalcap &= 0x3f; reg = urtwn_bb_read(sc, R92C_AFE_XTAL_CTRL); urtwn_bb_write(sc, R92C_AFE_XTAL_CTRL, RW(reg, R92C_AFE_XTAL_CTRL_ADDR, crystalcap | crystalcap << 6)); } else { if (urtwn_bb_read(sc, R92C_HSSI_PARAM2(0)) & R92C_HSSI_PARAM2_CCK_HIPWR) sc->sc_flags |= URTWN_FLAG_CCK_HIPWR; } } static void urtwn_rf_init(struct urtwn_softc *sc) { const struct urtwn_rf_prog *prog; uint32_t reg, type; int i, j, idx, off; /* Select RF programming based on board type. */ if (sc->chip & URTWN_CHIP_88E) prog = rtl8188eu_rf_prog; else if (!(sc->chip & URTWN_CHIP_92C)) { if (sc->board_type == R92C_BOARD_TYPE_MINICARD) prog = rtl8188ce_rf_prog; else if (sc->board_type == R92C_BOARD_TYPE_HIGHPA) prog = rtl8188ru_rf_prog; else prog = rtl8188cu_rf_prog; } else prog = rtl8192ce_rf_prog; for (i = 0; i < sc->nrxchains; i++) { /* Save RF_ENV control type. */ idx = i / 2; off = (i % 2) * 16; reg = urtwn_bb_read(sc, R92C_FPGA0_RFIFACESW(idx)); type = (reg >> off) & 0x10; /* Set RF_ENV enable. */ reg = urtwn_bb_read(sc, R92C_FPGA0_RFIFACEOE(i)); reg |= 0x100000; urtwn_bb_write(sc, R92C_FPGA0_RFIFACEOE(i), reg); urtwn_ms_delay(sc); /* Set RF_ENV output high. */ reg = urtwn_bb_read(sc, R92C_FPGA0_RFIFACEOE(i)); reg |= 0x10; urtwn_bb_write(sc, R92C_FPGA0_RFIFACEOE(i), reg); urtwn_ms_delay(sc); /* Set address and data lengths of RF registers. */ reg = urtwn_bb_read(sc, R92C_HSSI_PARAM2(i)); reg &= ~R92C_HSSI_PARAM2_ADDR_LENGTH; urtwn_bb_write(sc, R92C_HSSI_PARAM2(i), reg); urtwn_ms_delay(sc); reg = urtwn_bb_read(sc, R92C_HSSI_PARAM2(i)); reg &= ~R92C_HSSI_PARAM2_DATA_LENGTH; urtwn_bb_write(sc, R92C_HSSI_PARAM2(i), reg); urtwn_ms_delay(sc); /* Write RF initialization values for this chain. */ for (j = 0; j < prog[i].count; j++) { if (prog[i].regs[j] >= 0xf9 && prog[i].regs[j] <= 0xfe) { /* * These are fake RF registers offsets that * indicate a delay is required. */ usb_pause_mtx(&sc->sc_mtx, hz / 20); /* 50ms */ continue; } urtwn_rf_write(sc, i, prog[i].regs[j], prog[i].vals[j]); urtwn_ms_delay(sc); } /* Restore RF_ENV control type. */ reg = urtwn_bb_read(sc, R92C_FPGA0_RFIFACESW(idx)); reg &= ~(0x10 << off) | (type << off); urtwn_bb_write(sc, R92C_FPGA0_RFIFACESW(idx), reg); /* Cache RF register CHNLBW. */ sc->rf_chnlbw[i] = urtwn_rf_read(sc, i, R92C_RF_CHNLBW); } if ((sc->chip & (URTWN_CHIP_UMC_A_CUT | URTWN_CHIP_92C)) == URTWN_CHIP_UMC_A_CUT) { urtwn_rf_write(sc, 0, R92C_RF_RX_G1, 0x30255); urtwn_rf_write(sc, 0, R92C_RF_RX_G2, 0x50a00); } } static void urtwn_cam_init(struct urtwn_softc *sc) { /* Invalidate all CAM entries. */ urtwn_write_4(sc, R92C_CAMCMD, R92C_CAMCMD_POLLING | R92C_CAMCMD_CLR); } static int urtwn_cam_write(struct urtwn_softc *sc, uint32_t addr, uint32_t data) { usb_error_t error; error = urtwn_write_4(sc, R92C_CAMWRITE, data); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); error = urtwn_write_4(sc, R92C_CAMCMD, R92C_CAMCMD_POLLING | R92C_CAMCMD_WRITE | SM(R92C_CAMCMD_ADDR, addr)); if (error != USB_ERR_NORMAL_COMPLETION) return (EIO); return (0); } static void urtwn_pa_bias_init(struct urtwn_softc *sc) { uint8_t reg; int i; for (i = 0; i < sc->nrxchains; i++) { if (sc->pa_setting & (1 << i)) continue; urtwn_rf_write(sc, i, R92C_RF_IPA, 0x0f406); urtwn_rf_write(sc, i, R92C_RF_IPA, 0x4f406); urtwn_rf_write(sc, i, R92C_RF_IPA, 0x8f406); urtwn_rf_write(sc, i, R92C_RF_IPA, 0xcf406); } if (!(sc->pa_setting & 0x10)) { reg = urtwn_read_1(sc, 0x16); reg = (reg & ~0xf0) | 0x90; urtwn_write_1(sc, 0x16, reg); } } static void urtwn_rxfilter_init(struct urtwn_softc *sc) { struct ieee80211com *ic = &sc->sc_ic; struct ieee80211vap *vap = TAILQ_FIRST(&ic->ic_vaps); uint32_t rcr; uint16_t filter; URTWN_ASSERT_LOCKED(sc); /* Setup multicast filter. */ urtwn_set_multi(sc); /* Filter for management frames. */ filter = 0x7f3f; switch (vap->iv_opmode) { case IEEE80211_M_STA: filter &= ~( R92C_RXFLTMAP_SUBTYPE(IEEE80211_FC0_SUBTYPE_ASSOC_REQ) | R92C_RXFLTMAP_SUBTYPE(IEEE80211_FC0_SUBTYPE_REASSOC_REQ) | R92C_RXFLTMAP_SUBTYPE(IEEE80211_FC0_SUBTYPE_PROBE_REQ)); break; case IEEE80211_M_HOSTAP: filter &= ~( R92C_RXFLTMAP_SUBTYPE(IEEE80211_FC0_SUBTYPE_ASSOC_RESP) | R92C_RXFLTMAP_SUBTYPE(IEEE80211_FC0_SUBTYPE_REASSOC_RESP)); break; case IEEE80211_M_MONITOR: case IEEE80211_M_IBSS: break; default: device_printf(sc->sc_dev, "%s: undefined opmode %d\n", __func__, vap->iv_opmode); break; } urtwn_write_2(sc, R92C_RXFLTMAP0, filter); /* Reject all control frames. */ urtwn_write_2(sc, R92C_RXFLTMAP1, 0x0000); /* Reject all data frames. */ urtwn_write_2(sc, R92C_RXFLTMAP2, 0x0000); rcr = R92C_RCR_AM | R92C_RCR_AB | R92C_RCR_APM | R92C_RCR_HTC_LOC_CTRL | R92C_RCR_APP_PHYSTS | R92C_RCR_APP_ICV | R92C_RCR_APP_MIC; if (vap->iv_opmode == IEEE80211_M_MONITOR) { /* Accept all frames. */ rcr |= R92C_RCR_ACF | R92C_RCR_ADF | R92C_RCR_AMF | R92C_RCR_AAP; } /* Set Rx filter. */ urtwn_write_4(sc, R92C_RCR, rcr); if (ic->ic_promisc != 0) { /* Update Rx filter. */ urtwn_set_promisc(sc); } } static void urtwn_edca_init(struct urtwn_softc *sc) { urtwn_write_2(sc, R92C_SPEC_SIFS, 0x100a); urtwn_write_2(sc, R92C_MAC_SPEC_SIFS, 0x100a); urtwn_write_2(sc, R92C_SIFS_CCK, 0x100a); urtwn_write_2(sc, R92C_SIFS_OFDM, 0x100a); urtwn_write_4(sc, R92C_EDCA_BE_PARAM, 0x005ea42b); urtwn_write_4(sc, R92C_EDCA_BK_PARAM, 0x0000a44f); urtwn_write_4(sc, R92C_EDCA_VI_PARAM, 0x005ea324); urtwn_write_4(sc, R92C_EDCA_VO_PARAM, 0x002fa226); } static void urtwn_write_txpower(struct urtwn_softc *sc, int chain, uint16_t power[URTWN_RIDX_COUNT]) { uint32_t reg; /* Write per-CCK rate Tx power. */ if (chain == 0) { reg = urtwn_bb_read(sc, R92C_TXAGC_A_CCK1_MCS32); reg = RW(reg, R92C_TXAGC_A_CCK1, power[0]); urtwn_bb_write(sc, R92C_TXAGC_A_CCK1_MCS32, reg); reg = urtwn_bb_read(sc, R92C_TXAGC_B_CCK11_A_CCK2_11); reg = RW(reg, R92C_TXAGC_A_CCK2, power[1]); reg = RW(reg, R92C_TXAGC_A_CCK55, power[2]); reg = RW(reg, R92C_TXAGC_A_CCK11, power[3]); urtwn_bb_write(sc, R92C_TXAGC_B_CCK11_A_CCK2_11, reg); } else { reg = urtwn_bb_read(sc, R92C_TXAGC_B_CCK1_55_MCS32); reg = RW(reg, R92C_TXAGC_B_CCK1, power[0]); reg = RW(reg, R92C_TXAGC_B_CCK2, power[1]); reg = RW(reg, R92C_TXAGC_B_CCK55, power[2]); urtwn_bb_write(sc, R92C_TXAGC_B_CCK1_55_MCS32, reg); reg = urtwn_bb_read(sc, R92C_TXAGC_B_CCK11_A_CCK2_11); reg = RW(reg, R92C_TXAGC_B_CCK11, power[3]); urtwn_bb_write(sc, R92C_TXAGC_B_CCK11_A_CCK2_11, reg); } /* Write per-OFDM rate Tx power. */ urtwn_bb_write(sc, R92C_TXAGC_RATE18_06(chain), SM(R92C_TXAGC_RATE06, power[ 4]) | SM(R92C_TXAGC_RATE09, power[ 5]) | SM(R92C_TXAGC_RATE12, power[ 6]) | SM(R92C_TXAGC_RATE18, power[ 7])); urtwn_bb_write(sc, R92C_TXAGC_RATE54_24(chain), SM(R92C_TXAGC_RATE24, power[ 8]) | SM(R92C_TXAGC_RATE36, power[ 9]) | SM(R92C_TXAGC_RATE48, power[10]) | SM(R92C_TXAGC_RATE54, power[11])); /* Write per-MCS Tx power. */ urtwn_bb_write(sc, R92C_TXAGC_MCS03_MCS00(chain), SM(R92C_TXAGC_MCS00, power[12]) | SM(R92C_TXAGC_MCS01, power[13]) | SM(R92C_TXAGC_MCS02, power[14]) | SM(R92C_TXAGC_MCS03, power[15])); urtwn_bb_write(sc, R92C_TXAGC_MCS07_MCS04(chain), SM(R92C_TXAGC_MCS04, power[16]) | SM(R92C_TXAGC_MCS05, power[17]) | SM(R92C_TXAGC_MCS06, power[18]) | SM(R92C_TXAGC_MCS07, power[19])); urtwn_bb_write(sc, R92C_TXAGC_MCS11_MCS08(chain), SM(R92C_TXAGC_MCS08, power[20]) | SM(R92C_TXAGC_MCS09, power[21]) | SM(R92C_TXAGC_MCS10, power[22]) | SM(R92C_TXAGC_MCS11, power[23])); urtwn_bb_write(sc, R92C_TXAGC_MCS15_MCS12(chain), SM(R92C_TXAGC_MCS12, power[24]) | SM(R92C_TXAGC_MCS13, power[25]) | SM(R92C_TXAGC_MCS14, power[26]) | SM(R92C_TXAGC_MCS15, power[27])); } static void urtwn_get_txpower(struct urtwn_softc *sc, int chain, struct ieee80211_channel *c, struct ieee80211_channel *extc, uint16_t power[URTWN_RIDX_COUNT]) { struct ieee80211com *ic = &sc->sc_ic; struct r92c_rom *rom = &sc->rom.r92c_rom; uint16_t cckpow, ofdmpow, htpow, diff, max; const struct urtwn_txpwr *base; int ridx, chan, group; /* Determine channel group. */ chan = ieee80211_chan2ieee(ic, c); /* XXX center freq! */ if (chan <= 3) group = 0; else if (chan <= 9) group = 1; else group = 2; /* Get original Tx power based on board type and RF chain. */ if (!(sc->chip & URTWN_CHIP_92C)) { if (sc->board_type == R92C_BOARD_TYPE_HIGHPA) base = &rtl8188ru_txagc[chain]; else base = &rtl8192cu_txagc[chain]; } else base = &rtl8192cu_txagc[chain]; memset(power, 0, URTWN_RIDX_COUNT * sizeof(power[0])); if (sc->regulatory == 0) { for (ridx = URTWN_RIDX_CCK1; ridx <= URTWN_RIDX_CCK11; ridx++) power[ridx] = base->pwr[0][ridx]; } for (ridx = URTWN_RIDX_OFDM6; ridx < URTWN_RIDX_COUNT; ridx++) { if (sc->regulatory == 3) { power[ridx] = base->pwr[0][ridx]; /* Apply vendor limits. */ if (extc != NULL) max = rom->ht40_max_pwr[group]; else max = rom->ht20_max_pwr[group]; max = (max >> (chain * 4)) & 0xf; if (power[ridx] > max) power[ridx] = max; } else if (sc->regulatory == 1) { if (extc == NULL) power[ridx] = base->pwr[group][ridx]; } else if (sc->regulatory != 2) power[ridx] = base->pwr[0][ridx]; } /* Compute per-CCK rate Tx power. */ cckpow = rom->cck_tx_pwr[chain][group]; for (ridx = URTWN_RIDX_CCK1; ridx <= URTWN_RIDX_CCK11; ridx++) { power[ridx] += cckpow; if (power[ridx] > R92C_MAX_TX_PWR) power[ridx] = R92C_MAX_TX_PWR; } htpow = rom->ht40_1s_tx_pwr[chain][group]; if (sc->ntxchains > 1) { /* Apply reduction for 2 spatial streams. */ diff = rom->ht40_2s_tx_pwr_diff[group]; diff = (diff >> (chain * 4)) & 0xf; htpow = (htpow > diff) ? htpow - diff : 0; } /* Compute per-OFDM rate Tx power. */ diff = rom->ofdm_tx_pwr_diff[group]; diff = (diff >> (chain * 4)) & 0xf; ofdmpow = htpow + diff; /* HT->OFDM correction. */ for (ridx = URTWN_RIDX_OFDM6; ridx <= URTWN_RIDX_OFDM54; ridx++) { power[ridx] += ofdmpow; if (power[ridx] > R92C_MAX_TX_PWR) power[ridx] = R92C_MAX_TX_PWR; } /* Compute per-MCS Tx power. */ if (extc == NULL) { diff = rom->ht20_tx_pwr_diff[group]; diff = (diff >> (chain * 4)) & 0xf; htpow += diff; /* HT40->HT20 correction. */ } for (ridx = 12; ridx <= 27; ridx++) { power[ridx] += htpow; if (power[ridx] > R92C_MAX_TX_PWR) power[ridx] = R92C_MAX_TX_PWR; } #ifdef USB_DEBUG if (sc->sc_debug & URTWN_DEBUG_TXPWR) { /* Dump per-rate Tx power values. */ printf("Tx power for chain %d:\n", chain); for (ridx = URTWN_RIDX_CCK1; ridx < URTWN_RIDX_COUNT; ridx++) printf("Rate %d = %u\n", ridx, power[ridx]); } #endif } static void urtwn_r88e_get_txpower(struct urtwn_softc *sc, int chain, struct ieee80211_channel *c, struct ieee80211_channel *extc, uint16_t power[URTWN_RIDX_COUNT]) { struct ieee80211com *ic = &sc->sc_ic; struct r88e_rom *rom = &sc->rom.r88e_rom; uint16_t cckpow, ofdmpow, bw20pow, htpow; const struct urtwn_r88e_txpwr *base; int ridx, chan, group; /* Determine channel group. */ chan = ieee80211_chan2ieee(ic, c); /* XXX center freq! */ if (chan <= 2) group = 0; else if (chan <= 5) group = 1; else if (chan <= 8) group = 2; else if (chan <= 11) group = 3; else if (chan <= 13) group = 4; else group = 5; /* Get original Tx power based on board type and RF chain. */ base = &rtl8188eu_txagc[chain]; memset(power, 0, URTWN_RIDX_COUNT * sizeof(power[0])); if (sc->regulatory == 0) { for (ridx = URTWN_RIDX_CCK1; ridx <= URTWN_RIDX_CCK11; ridx++) power[ridx] = base->pwr[0][ridx]; } for (ridx = URTWN_RIDX_OFDM6; ridx < URTWN_RIDX_COUNT; ridx++) { if (sc->regulatory == 3) power[ridx] = base->pwr[0][ridx]; else if (sc->regulatory == 1) { if (extc == NULL) power[ridx] = base->pwr[group][ridx]; } else if (sc->regulatory != 2) power[ridx] = base->pwr[0][ridx]; } /* Compute per-CCK rate Tx power. */ cckpow = rom->cck_tx_pwr[group]; for (ridx = URTWN_RIDX_CCK1; ridx <= URTWN_RIDX_CCK11; ridx++) { power[ridx] += cckpow; if (power[ridx] > R92C_MAX_TX_PWR) power[ridx] = R92C_MAX_TX_PWR; } htpow = rom->ht40_tx_pwr[group]; /* Compute per-OFDM rate Tx power. */ ofdmpow = htpow + sc->ofdm_tx_pwr_diff; for (ridx = URTWN_RIDX_OFDM6; ridx <= URTWN_RIDX_OFDM54; ridx++) { power[ridx] += ofdmpow; if (power[ridx] > R92C_MAX_TX_PWR) power[ridx] = R92C_MAX_TX_PWR; } bw20pow = htpow + sc->bw20_tx_pwr_diff; for (ridx = 12; ridx <= 27; ridx++) { power[ridx] += bw20pow; if (power[ridx] > R92C_MAX_TX_PWR) power[ridx] = R92C_MAX_TX_PWR; } } static void urtwn_set_txpower(struct urtwn_softc *sc, struct ieee80211_channel *c, struct ieee80211_channel *extc) { uint16_t power[URTWN_RIDX_COUNT]; int i; for (i = 0; i < sc->ntxchains; i++) { /* Compute per-rate Tx power values. */ if (sc->chip & URTWN_CHIP_88E) urtwn_r88e_get_txpower(sc, i, c, extc, power); else urtwn_get_txpower(sc, i, c, extc, power); /* Write per-rate Tx power values to hardware. */ urtwn_write_txpower(sc, i, power); } } static void urtwn_set_rx_bssid_all(struct urtwn_softc *sc, int enable) { uint32_t reg; reg = urtwn_read_4(sc, R92C_RCR); if (enable) reg &= ~R92C_RCR_CBSSID_BCN; else reg |= R92C_RCR_CBSSID_BCN; urtwn_write_4(sc, R92C_RCR, reg); } static void urtwn_set_gain(struct urtwn_softc *sc, uint8_t gain) { uint32_t reg; reg = urtwn_bb_read(sc, R92C_OFDM0_AGCCORE1(0)); reg = RW(reg, R92C_OFDM0_AGCCORE1_GAIN, gain); urtwn_bb_write(sc, R92C_OFDM0_AGCCORE1(0), reg); if (!(sc->chip & URTWN_CHIP_88E)) { reg = urtwn_bb_read(sc, R92C_OFDM0_AGCCORE1(1)); reg = RW(reg, R92C_OFDM0_AGCCORE1_GAIN, gain); urtwn_bb_write(sc, R92C_OFDM0_AGCCORE1(1), reg); } } static void urtwn_scan_start(struct ieee80211com *ic) { struct urtwn_softc *sc = ic->ic_softc; URTWN_LOCK(sc); /* Receive beacons / probe responses from any BSSID. */ if (ic->ic_opmode != IEEE80211_M_IBSS && ic->ic_opmode != IEEE80211_M_HOSTAP) urtwn_set_rx_bssid_all(sc, 1); /* Set gain for scanning. */ urtwn_set_gain(sc, 0x20); URTWN_UNLOCK(sc); } static void urtwn_scan_end(struct ieee80211com *ic) { struct urtwn_softc *sc = ic->ic_softc; URTWN_LOCK(sc); /* Restore limitations. */ if (ic->ic_promisc == 0 && ic->ic_opmode != IEEE80211_M_IBSS && ic->ic_opmode != IEEE80211_M_HOSTAP) urtwn_set_rx_bssid_all(sc, 0); /* Set gain under link. */ urtwn_set_gain(sc, 0x32); URTWN_UNLOCK(sc); } static void urtwn_getradiocaps(struct ieee80211com *ic, int maxchans, int *nchans, struct ieee80211_channel chans[]) { uint8_t bands[IEEE80211_MODE_BYTES]; memset(bands, 0, sizeof(bands)); setbit(bands, IEEE80211_MODE_11B); setbit(bands, IEEE80211_MODE_11G); if (urtwn_enable_11n) setbit(bands, IEEE80211_MODE_11NG); ieee80211_add_channel_list_2ghz(chans, maxchans, nchans, urtwn_chan_2ghz, nitems(urtwn_chan_2ghz), bands, 0); } static void urtwn_set_channel(struct ieee80211com *ic) { struct urtwn_softc *sc = ic->ic_softc; struct ieee80211_channel *c = ic->ic_curchan; struct ieee80211vap *vap = TAILQ_FIRST(&ic->ic_vaps); URTWN_LOCK(sc); if (vap->iv_state == IEEE80211_S_SCAN) { /* Make link LED blink during scan. */ urtwn_set_led(sc, URTWN_LED_LINK, !sc->ledlink); } urtwn_set_chan(sc, c, NULL); sc->sc_rxtap.wr_chan_freq = htole16(c->ic_freq); sc->sc_rxtap.wr_chan_flags = htole16(c->ic_flags); sc->sc_txtap.wt_chan_freq = htole16(c->ic_freq); sc->sc_txtap.wt_chan_flags = htole16(c->ic_flags); URTWN_UNLOCK(sc); } static int urtwn_wme_update(struct ieee80211com *ic) { const struct wmeParams *wmep = ic->ic_wme.wme_chanParams.cap_wmeParams; struct urtwn_softc *sc = ic->ic_softc; uint8_t aifs, acm, slottime; int ac; acm = 0; slottime = IEEE80211_GET_SLOTTIME(ic); URTWN_LOCK(sc); for (ac = WME_AC_BE; ac < WME_NUM_AC; ac++) { /* AIFS[AC] = AIFSN[AC] * aSlotTime + aSIFSTime. */ aifs = wmep[ac].wmep_aifsn * slottime + IEEE80211_DUR_SIFS; urtwn_write_4(sc, wme2queue[ac].reg, SM(R92C_EDCA_PARAM_TXOP, wmep[ac].wmep_txopLimit) | SM(R92C_EDCA_PARAM_ECWMIN, wmep[ac].wmep_logcwmin) | SM(R92C_EDCA_PARAM_ECWMAX, wmep[ac].wmep_logcwmax) | SM(R92C_EDCA_PARAM_AIFS, aifs)); if (ac != WME_AC_BE) acm |= wmep[ac].wmep_acm << ac; } if (acm != 0) acm |= R92C_ACMHWCTRL_EN; urtwn_write_1(sc, R92C_ACMHWCTRL, (urtwn_read_1(sc, R92C_ACMHWCTRL) & ~R92C_ACMHWCTRL_ACM_MASK) | acm); URTWN_UNLOCK(sc); return 0; } static void urtwn_update_slot(struct ieee80211com *ic) { urtwn_cmd_sleepable(ic->ic_softc, NULL, 0, urtwn_update_slot_cb); } static void urtwn_update_slot_cb(struct urtwn_softc *sc, union sec_param *data) { struct ieee80211com *ic = &sc->sc_ic; uint8_t slottime; slottime = IEEE80211_GET_SLOTTIME(ic); URTWN_DPRINTF(sc, URTWN_DEBUG_ANY, "%s: setting slot time to %uus\n", __func__, slottime); urtwn_write_1(sc, R92C_SLOT, slottime); urtwn_update_aifs(sc, slottime); } static void urtwn_update_aifs(struct urtwn_softc *sc, uint8_t slottime) { const struct wmeParams *wmep = sc->sc_ic.ic_wme.wme_chanParams.cap_wmeParams; uint8_t aifs, ac; for (ac = WME_AC_BE; ac < WME_NUM_AC; ac++) { /* AIFS[AC] = AIFSN[AC] * aSlotTime + aSIFSTime. */ aifs = wmep[ac].wmep_aifsn * slottime + IEEE80211_DUR_SIFS; urtwn_write_1(sc, wme2queue[ac].reg, aifs); } } static uint8_t urtwn_get_multi_pos(const uint8_t maddr[]) { uint64_t mask = 0x00004d101df481b4; uint8_t pos = 0x27; /* initial value */ int i, j; for (i = 0; i < IEEE80211_ADDR_LEN; i++) for (j = (i == 0) ? 1 : 0; j < 8; j++) if ((maddr[i] >> j) & 1) pos ^= (mask >> (i * 8 + j - 1)); pos &= 0x3f; return (pos); } static void urtwn_set_multi(struct urtwn_softc *sc) { struct ieee80211com *ic = &sc->sc_ic; uint32_t mfilt[2]; URTWN_ASSERT_LOCKED(sc); /* general structure was copied from ath(4). */ if (ic->ic_allmulti == 0) { struct ieee80211vap *vap; struct ifnet *ifp; struct ifmultiaddr *ifma; /* * Merge multicast addresses to form the hardware filter. */ mfilt[0] = mfilt[1] = 0; TAILQ_FOREACH(vap, &ic->ic_vaps, iv_next) { ifp = vap->iv_ifp; if_maddr_rlock(ifp); TAILQ_FOREACH(ifma, &ifp->if_multiaddrs, ifma_link) { caddr_t dl; uint8_t pos; dl = LLADDR((struct sockaddr_dl *) ifma->ifma_addr); pos = urtwn_get_multi_pos(dl); mfilt[pos / 32] |= (1 << (pos % 32)); } if_maddr_runlock(ifp); } } else mfilt[0] = mfilt[1] = ~0; urtwn_write_4(sc, R92C_MAR + 0, mfilt[0]); urtwn_write_4(sc, R92C_MAR + 4, mfilt[1]); URTWN_DPRINTF(sc, URTWN_DEBUG_STATE, "%s: MC filter %08x:%08x\n", __func__, mfilt[0], mfilt[1]); } static void urtwn_set_promisc(struct urtwn_softc *sc) { struct ieee80211com *ic = &sc->sc_ic; struct ieee80211vap *vap = TAILQ_FIRST(&ic->ic_vaps); uint32_t rcr, mask1, mask2; URTWN_ASSERT_LOCKED(sc); if (vap->iv_opmode == IEEE80211_M_MONITOR) return; mask1 = R92C_RCR_ACF | R92C_RCR_ADF | R92C_RCR_AMF | R92C_RCR_AAP; mask2 = R92C_RCR_APM; if (vap->iv_state == IEEE80211_S_RUN) { switch (vap->iv_opmode) { case IEEE80211_M_STA: mask2 |= R92C_RCR_CBSSID_BCN; /* FALLTHROUGH */ case IEEE80211_M_IBSS: mask2 |= R92C_RCR_CBSSID_DATA; break; case IEEE80211_M_HOSTAP: break; default: device_printf(sc->sc_dev, "%s: undefined opmode %d\n", __func__, vap->iv_opmode); return; } } rcr = urtwn_read_4(sc, R92C_RCR); if (ic->ic_promisc == 0) rcr = (rcr & ~mask1) | mask2; else rcr = (rcr & ~mask2) | mask1; urtwn_write_4(sc, R92C_RCR, rcr); } static void urtwn_update_promisc(struct ieee80211com *ic) { struct urtwn_softc *sc = ic->ic_softc; URTWN_LOCK(sc); if (sc->sc_flags & URTWN_RUNNING) urtwn_set_promisc(sc); URTWN_UNLOCK(sc); } static void urtwn_update_mcast(struct ieee80211com *ic) { struct urtwn_softc *sc = ic->ic_softc; URTWN_LOCK(sc); if (sc->sc_flags & URTWN_RUNNING) urtwn_set_multi(sc); URTWN_UNLOCK(sc); } static struct ieee80211_node * urtwn_node_alloc(struct ieee80211vap *vap, const uint8_t mac[IEEE80211_ADDR_LEN]) { struct urtwn_node *un; un = malloc(sizeof (struct urtwn_node), M_80211_NODE, M_NOWAIT | M_ZERO); if (un == NULL) return NULL; un->id = URTWN_MACID_UNDEFINED; return &un->ni; } static void urtwn_newassoc(struct ieee80211_node *ni, int isnew) { struct urtwn_softc *sc = ni->ni_ic->ic_softc; struct urtwn_node *un = URTWN_NODE(ni); uint8_t id; /* Only do this bit for R88E chips */ if (! (sc->chip & URTWN_CHIP_88E)) return; if (!isnew) return; URTWN_NT_LOCK(sc); for (id = 0; id <= URTWN_MACID_MAX(sc); id++) { if (id != URTWN_MACID_BC && sc->node_list[id] == NULL) { un->id = id; sc->node_list[id] = ni; break; } } URTWN_NT_UNLOCK(sc); if (id > URTWN_MACID_MAX(sc)) { device_printf(sc->sc_dev, "%s: node table is full\n", __func__); } } static void urtwn_node_free(struct ieee80211_node *ni) { struct urtwn_softc *sc = ni->ni_ic->ic_softc; struct urtwn_node *un = URTWN_NODE(ni); URTWN_NT_LOCK(sc); if (un->id != URTWN_MACID_UNDEFINED) sc->node_list[un->id] = NULL; URTWN_NT_UNLOCK(sc); sc->sc_node_free(ni); } static void urtwn_set_chan(struct urtwn_softc *sc, struct ieee80211_channel *c, struct ieee80211_channel *extc) { struct ieee80211com *ic = &sc->sc_ic; uint32_t reg; u_int chan; int i; chan = ieee80211_chan2ieee(ic, c); /* XXX center freq! */ if (chan == 0 || chan == IEEE80211_CHAN_ANY) { device_printf(sc->sc_dev, "%s: invalid channel %x\n", __func__, chan); return; } /* Set Tx power for this new channel. */ urtwn_set_txpower(sc, c, extc); for (i = 0; i < sc->nrxchains; i++) { urtwn_rf_write(sc, i, R92C_RF_CHNLBW, RW(sc->rf_chnlbw[i], R92C_RF_CHNLBW_CHNL, chan)); } #ifndef IEEE80211_NO_HT if (extc != NULL) { /* Is secondary channel below or above primary? */ int prichlo = c->ic_freq < extc->ic_freq; urtwn_write_1(sc, R92C_BWOPMODE, urtwn_read_1(sc, R92C_BWOPMODE) & ~R92C_BWOPMODE_20MHZ); reg = urtwn_read_1(sc, R92C_RRSR + 2); reg = (reg & ~0x6f) | (prichlo ? 1 : 2) << 5; urtwn_write_1(sc, R92C_RRSR + 2, reg); urtwn_bb_write(sc, R92C_FPGA0_RFMOD, urtwn_bb_read(sc, R92C_FPGA0_RFMOD) | R92C_RFMOD_40MHZ); urtwn_bb_write(sc, R92C_FPGA1_RFMOD, urtwn_bb_read(sc, R92C_FPGA1_RFMOD) | R92C_RFMOD_40MHZ); /* Set CCK side band. */ reg = urtwn_bb_read(sc, R92C_CCK0_SYSTEM); reg = (reg & ~0x00000010) | (prichlo ? 0 : 1) << 4; urtwn_bb_write(sc, R92C_CCK0_SYSTEM, reg); reg = urtwn_bb_read(sc, R92C_OFDM1_LSTF); reg = (reg & ~0x00000c00) | (prichlo ? 1 : 2) << 10; urtwn_bb_write(sc, R92C_OFDM1_LSTF, reg); urtwn_bb_write(sc, R92C_FPGA0_ANAPARAM2, urtwn_bb_read(sc, R92C_FPGA0_ANAPARAM2) & ~R92C_FPGA0_ANAPARAM2_CBW20); reg = urtwn_bb_read(sc, 0x818); reg = (reg & ~0x0c000000) | (prichlo ? 2 : 1) << 26; urtwn_bb_write(sc, 0x818, reg); /* Select 40MHz bandwidth. */ urtwn_rf_write(sc, 0, R92C_RF_CHNLBW, (sc->rf_chnlbw[0] & ~0xfff) | chan); } else #endif { urtwn_write_1(sc, R92C_BWOPMODE, urtwn_read_1(sc, R92C_BWOPMODE) | R92C_BWOPMODE_20MHZ); urtwn_bb_write(sc, R92C_FPGA0_RFMOD, urtwn_bb_read(sc, R92C_FPGA0_RFMOD) & ~R92C_RFMOD_40MHZ); urtwn_bb_write(sc, R92C_FPGA1_RFMOD, urtwn_bb_read(sc, R92C_FPGA1_RFMOD) & ~R92C_RFMOD_40MHZ); if (!(sc->chip & URTWN_CHIP_88E)) { urtwn_bb_write(sc, R92C_FPGA0_ANAPARAM2, urtwn_bb_read(sc, R92C_FPGA0_ANAPARAM2) | R92C_FPGA0_ANAPARAM2_CBW20); } /* Select 20MHz bandwidth. */ urtwn_rf_write(sc, 0, R92C_RF_CHNLBW, (sc->rf_chnlbw[0] & ~0xfff) | chan | ((sc->chip & URTWN_CHIP_88E) ? R88E_RF_CHNLBW_BW20 : R92C_RF_CHNLBW_BW20)); } } static void urtwn_iq_calib(struct urtwn_softc *sc) { /* TODO */ } static void urtwn_lc_calib(struct urtwn_softc *sc) { uint32_t rf_ac[2]; uint8_t txmode; int i; txmode = urtwn_read_1(sc, R92C_OFDM1_LSTF + 3); if ((txmode & 0x70) != 0) { /* Disable all continuous Tx. */ urtwn_write_1(sc, R92C_OFDM1_LSTF + 3, txmode & ~0x70); /* Set RF mode to standby mode. */ for (i = 0; i < sc->nrxchains; i++) { rf_ac[i] = urtwn_rf_read(sc, i, R92C_RF_AC); urtwn_rf_write(sc, i, R92C_RF_AC, RW(rf_ac[i], R92C_RF_AC_MODE, R92C_RF_AC_MODE_STANDBY)); } } else { /* Block all Tx queues. */ urtwn_write_1(sc, R92C_TXPAUSE, R92C_TX_QUEUE_ALL); } /* Start calibration. */ urtwn_rf_write(sc, 0, R92C_RF_CHNLBW, urtwn_rf_read(sc, 0, R92C_RF_CHNLBW) | R92C_RF_CHNLBW_LCSTART); /* Give calibration the time to complete. */ usb_pause_mtx(&sc->sc_mtx, hz / 10); /* 100ms */ /* Restore configuration. */ if ((txmode & 0x70) != 0) { /* Restore Tx mode. */ urtwn_write_1(sc, R92C_OFDM1_LSTF + 3, txmode); /* Restore RF mode. */ for (i = 0; i < sc->nrxchains; i++) urtwn_rf_write(sc, i, R92C_RF_AC, rf_ac[i]); } else { /* Unblock all Tx queues. */ urtwn_write_1(sc, R92C_TXPAUSE, 0x00); } } static void urtwn_temp_calib(struct urtwn_softc *sc) { uint8_t temp; URTWN_ASSERT_LOCKED(sc); if (!(sc->sc_flags & URTWN_TEMP_MEASURED)) { /* Start measuring temperature. */ URTWN_DPRINTF(sc, URTWN_DEBUG_TEMP, "%s: start measuring temperature\n", __func__); if (sc->chip & URTWN_CHIP_88E) { urtwn_rf_write(sc, 0, R88E_RF_T_METER, R88E_RF_T_METER_START); } else { urtwn_rf_write(sc, 0, R92C_RF_T_METER, R92C_RF_T_METER_START); } sc->sc_flags |= URTWN_TEMP_MEASURED; return; } sc->sc_flags &= ~URTWN_TEMP_MEASURED; /* Read measured temperature. */ if (sc->chip & URTWN_CHIP_88E) { temp = MS(urtwn_rf_read(sc, 0, R88E_RF_T_METER), R88E_RF_T_METER_VAL); } else { temp = MS(urtwn_rf_read(sc, 0, R92C_RF_T_METER), R92C_RF_T_METER_VAL); } if (temp == 0) { /* Read failed, skip. */ URTWN_DPRINTF(sc, URTWN_DEBUG_TEMP, "%s: temperature read failed, skipping\n", __func__); return; } URTWN_DPRINTF(sc, URTWN_DEBUG_TEMP, "%s: temperature: previous %u, current %u\n", __func__, sc->thcal_lctemp, temp); /* * Redo LC calibration if temperature changed significantly since * last calibration. */ if (sc->thcal_lctemp == 0) { /* First LC calibration is performed in urtwn_init(). */ sc->thcal_lctemp = temp; } else if (abs(temp - sc->thcal_lctemp) > 1) { URTWN_DPRINTF(sc, URTWN_DEBUG_TEMP, "%s: LC calib triggered by temp: %u -> %u\n", __func__, sc->thcal_lctemp, temp); urtwn_lc_calib(sc); /* Record temperature of last LC calibration. */ sc->thcal_lctemp = temp; } } static void urtwn_setup_static_keys(struct urtwn_softc *sc, struct urtwn_vap *uvp) { int i; for (i = 0; i < IEEE80211_WEP_NKID; i++) { const struct ieee80211_key *k = uvp->keys[i]; if (k != NULL) { urtwn_cmd_sleepable(sc, k, sizeof(*k), urtwn_key_set_cb); } } } static int urtwn_init(struct urtwn_softc *sc) { struct ieee80211com *ic = &sc->sc_ic; struct ieee80211vap *vap = TAILQ_FIRST(&ic->ic_vaps); uint8_t macaddr[IEEE80211_ADDR_LEN]; uint32_t reg; usb_error_t usb_err = USB_ERR_NORMAL_COMPLETION; int error; URTWN_LOCK(sc); if (sc->sc_flags & URTWN_RUNNING) { URTWN_UNLOCK(sc); return (0); } /* Init firmware commands ring. */ sc->fwcur = 0; /* Allocate Tx/Rx buffers. */ error = urtwn_alloc_rx_list(sc); if (error != 0) goto fail; error = urtwn_alloc_tx_list(sc); if (error != 0) goto fail; /* Power on adapter. */ error = urtwn_power_on(sc); if (error != 0) goto fail; /* Initialize DMA. */ error = urtwn_dma_init(sc); if (error != 0) goto fail; /* Set info size in Rx descriptors (in 64-bit words). */ urtwn_write_1(sc, R92C_RX_DRVINFO_SZ, 4); /* Init interrupts. */ if (sc->chip & URTWN_CHIP_88E) { usb_err = urtwn_write_4(sc, R88E_HISR, 0xffffffff); if (usb_err != USB_ERR_NORMAL_COMPLETION) goto fail; usb_err = urtwn_write_4(sc, R88E_HIMR, R88E_HIMR_CPWM | R88E_HIMR_CPWM2 | R88E_HIMR_TBDER | R88E_HIMR_PSTIMEOUT); if (usb_err != USB_ERR_NORMAL_COMPLETION) goto fail; usb_err = urtwn_write_4(sc, R88E_HIMRE, R88E_HIMRE_RXFOVW | R88E_HIMRE_TXFOVW | R88E_HIMRE_RXERR | R88E_HIMRE_TXERR); if (usb_err != USB_ERR_NORMAL_COMPLETION) goto fail; usb_err = urtwn_write_1(sc, R92C_USB_SPECIAL_OPTION, urtwn_read_1(sc, R92C_USB_SPECIAL_OPTION) | R92C_USB_SPECIAL_OPTION_INT_BULK_SEL); if (usb_err != USB_ERR_NORMAL_COMPLETION) goto fail; } else { usb_err = urtwn_write_4(sc, R92C_HISR, 0xffffffff); if (usb_err != USB_ERR_NORMAL_COMPLETION) goto fail; usb_err = urtwn_write_4(sc, R92C_HIMR, 0xffffffff); if (usb_err != USB_ERR_NORMAL_COMPLETION) goto fail; } /* Set MAC address. */ IEEE80211_ADDR_COPY(macaddr, vap ? vap->iv_myaddr : ic->ic_macaddr); usb_err = urtwn_write_region_1(sc, R92C_MACID, macaddr, IEEE80211_ADDR_LEN); if (usb_err != USB_ERR_NORMAL_COMPLETION) goto fail; /* Set initial network type. */ urtwn_set_mode(sc, R92C_MSR_INFRA); /* Initialize Rx filter. */ urtwn_rxfilter_init(sc); /* Set response rate. */ reg = urtwn_read_4(sc, R92C_RRSR); reg = RW(reg, R92C_RRSR_RATE_BITMAP, R92C_RRSR_RATE_CCK_ONLY_1M); urtwn_write_4(sc, R92C_RRSR, reg); /* Set short/long retry limits. */ urtwn_write_2(sc, R92C_RL, SM(R92C_RL_SRL, 0x30) | SM(R92C_RL_LRL, 0x30)); /* Initialize EDCA parameters. */ urtwn_edca_init(sc); /* Setup rate fallback. */ if (!(sc->chip & URTWN_CHIP_88E)) { urtwn_write_4(sc, R92C_DARFRC + 0, 0x00000000); urtwn_write_4(sc, R92C_DARFRC + 4, 0x10080404); urtwn_write_4(sc, R92C_RARFRC + 0, 0x04030201); urtwn_write_4(sc, R92C_RARFRC + 4, 0x08070605); } urtwn_write_1(sc, R92C_FWHW_TXQ_CTRL, urtwn_read_1(sc, R92C_FWHW_TXQ_CTRL) | R92C_FWHW_TXQ_CTRL_AMPDU_RTY_NEW); /* Set ACK timeout. */ urtwn_write_1(sc, R92C_ACKTO, 0x40); /* Setup USB aggregation. */ reg = urtwn_read_4(sc, R92C_TDECTRL); reg = RW(reg, R92C_TDECTRL_BLK_DESC_NUM, 6); urtwn_write_4(sc, R92C_TDECTRL, reg); urtwn_write_1(sc, R92C_TRXDMA_CTRL, urtwn_read_1(sc, R92C_TRXDMA_CTRL) | R92C_TRXDMA_CTRL_RXDMA_AGG_EN); urtwn_write_1(sc, R92C_RXDMA_AGG_PG_TH, 48); if (sc->chip & URTWN_CHIP_88E) urtwn_write_1(sc, R92C_RXDMA_AGG_PG_TH + 1, 4); else { urtwn_write_1(sc, R92C_USB_DMA_AGG_TO, 4); urtwn_write_1(sc, R92C_USB_SPECIAL_OPTION, urtwn_read_1(sc, R92C_USB_SPECIAL_OPTION) | R92C_USB_SPECIAL_OPTION_AGG_EN); urtwn_write_1(sc, R92C_USB_AGG_TH, 8); urtwn_write_1(sc, R92C_USB_AGG_TO, 6); } /* Initialize beacon parameters. */ urtwn_write_2(sc, R92C_BCN_CTRL, 0x1010); urtwn_write_2(sc, R92C_TBTT_PROHIBIT, 0x6404); urtwn_write_1(sc, R92C_DRVERLYINT, 0x05); urtwn_write_1(sc, R92C_BCNDMATIM, 0x02); urtwn_write_2(sc, R92C_BCNTCFG, 0x660f); if (!(sc->chip & URTWN_CHIP_88E)) { /* Setup AMPDU aggregation. */ urtwn_write_4(sc, R92C_AGGLEN_LMT, 0x99997631); /* MCS7~0 */ urtwn_write_1(sc, R92C_AGGR_BREAK_TIME, 0x16); urtwn_write_2(sc, R92C_MAX_AGGR_NUM, 0x0708); urtwn_write_1(sc, R92C_BCN_MAX_ERR, 0xff); } #ifndef URTWN_WITHOUT_UCODE /* Load 8051 microcode. */ error = urtwn_load_firmware(sc); if (error == 0) sc->sc_flags |= URTWN_FW_LOADED; #endif /* Initialize MAC/BB/RF blocks. */ error = urtwn_mac_init(sc); if (error != 0) { device_printf(sc->sc_dev, "%s: error while initializing MAC block\n", __func__); goto fail; } urtwn_bb_init(sc); urtwn_rf_init(sc); /* Reinitialize Rx filter (D3845 is not committed yet). */ urtwn_rxfilter_init(sc); if (sc->chip & URTWN_CHIP_88E) { urtwn_write_2(sc, R92C_CR, urtwn_read_2(sc, R92C_CR) | R92C_CR_MACTXEN | R92C_CR_MACRXEN); } /* Turn CCK and OFDM blocks on. */ reg = urtwn_bb_read(sc, R92C_FPGA0_RFMOD); reg |= R92C_RFMOD_CCK_EN; usb_err = urtwn_bb_write(sc, R92C_FPGA0_RFMOD, reg); if (usb_err != USB_ERR_NORMAL_COMPLETION) goto fail; reg = urtwn_bb_read(sc, R92C_FPGA0_RFMOD); reg |= R92C_RFMOD_OFDM_EN; usb_err = urtwn_bb_write(sc, R92C_FPGA0_RFMOD, reg); if (usb_err != USB_ERR_NORMAL_COMPLETION) goto fail; /* Clear per-station keys table. */ urtwn_cam_init(sc); /* Enable decryption / encryption. */ urtwn_write_2(sc, R92C_SECCFG, R92C_SECCFG_TXUCKEY_DEF | R92C_SECCFG_RXUCKEY_DEF | R92C_SECCFG_TXENC_ENA | R92C_SECCFG_RXDEC_ENA | R92C_SECCFG_TXBCKEY_DEF | R92C_SECCFG_RXBCKEY_DEF); /* Enable hardware sequence numbering. */ urtwn_write_1(sc, R92C_HWSEQ_CTRL, R92C_TX_QUEUE_ALL); /* Enable per-packet TX report. */ if (sc->chip & URTWN_CHIP_88E) { urtwn_write_1(sc, R88E_TX_RPT_CTRL, urtwn_read_1(sc, R88E_TX_RPT_CTRL) | R88E_TX_RPT1_ENA); } /* Perform LO and IQ calibrations. */ urtwn_iq_calib(sc); /* Perform LC calibration. */ urtwn_lc_calib(sc); /* Fix USB interference issue. */ if (!(sc->chip & URTWN_CHIP_88E)) { urtwn_write_1(sc, 0xfe40, 0xe0); urtwn_write_1(sc, 0xfe41, 0x8d); urtwn_write_1(sc, 0xfe42, 0x80); urtwn_pa_bias_init(sc); } /* Initialize GPIO setting. */ urtwn_write_1(sc, R92C_GPIO_MUXCFG, urtwn_read_1(sc, R92C_GPIO_MUXCFG) & ~R92C_GPIO_MUXCFG_ENBT); /* Fix for lower temperature. */ if (!(sc->chip & URTWN_CHIP_88E)) urtwn_write_1(sc, 0x15, 0xe9); usbd_transfer_start(sc->sc_xfer[URTWN_BULK_RX]); sc->sc_flags |= URTWN_RUNNING; /* * Install static keys (if any). * Must be called after urtwn_cam_init(). */ if (vap != NULL) urtwn_setup_static_keys(sc, URTWN_VAP(vap)); callout_reset(&sc->sc_watchdog_ch, hz, urtwn_watchdog, sc); fail: if (usb_err != USB_ERR_NORMAL_COMPLETION) error = EIO; URTWN_UNLOCK(sc); return (error); } static void urtwn_stop(struct urtwn_softc *sc) { URTWN_LOCK(sc); if (!(sc->sc_flags & URTWN_RUNNING)) { URTWN_UNLOCK(sc); return; } sc->sc_flags &= ~(URTWN_RUNNING | URTWN_FW_LOADED | URTWN_TEMP_MEASURED); sc->thcal_lctemp = 0; callout_stop(&sc->sc_watchdog_ch); urtwn_abort_xfers(sc); urtwn_drain_mbufq(sc); urtwn_free_tx_list(sc); urtwn_free_rx_list(sc); urtwn_power_off(sc); URTWN_UNLOCK(sc); } static void urtwn_abort_xfers(struct urtwn_softc *sc) { int i; URTWN_ASSERT_LOCKED(sc); /* abort any pending transfers */ for (i = 0; i < URTWN_N_TRANSFER; i++) usbd_transfer_stop(sc->sc_xfer[i]); } static int urtwn_raw_xmit(struct ieee80211_node *ni, struct mbuf *m, const struct ieee80211_bpf_params *params) { struct ieee80211com *ic = ni->ni_ic; struct urtwn_softc *sc = ic->ic_softc; struct urtwn_data *bf; int error; URTWN_DPRINTF(sc, URTWN_DEBUG_XMIT, "%s: called; m=%p\n", __func__, m); /* prevent management frames from being sent if we're not ready */ URTWN_LOCK(sc); if (!(sc->sc_flags & URTWN_RUNNING)) { error = ENETDOWN; goto end; } bf = urtwn_getbuf(sc); if (bf == NULL) { error = ENOBUFS; goto end; } if (params == NULL) { /* * Legacy path; interpret frame contents to decide * precisely how to send the frame. */ error = urtwn_tx_data(sc, ni, m, bf); } else { /* * Caller supplied explicit parameters to use in * sending the frame. */ error = urtwn_tx_raw(sc, ni, m, bf, params); } if (error != 0) { STAILQ_INSERT_HEAD(&sc->sc_tx_inactive, bf, next); goto end; } sc->sc_txtimer = 5; callout_reset(&sc->sc_watchdog_ch, hz, urtwn_watchdog, sc); end: if (error != 0) m_freem(m); URTWN_UNLOCK(sc); return (error); } static void urtwn_ms_delay(struct urtwn_softc *sc) { usb_pause_mtx(&sc->sc_mtx, hz / 1000); } static device_method_t urtwn_methods[] = { /* Device interface */ DEVMETHOD(device_probe, urtwn_match), DEVMETHOD(device_attach, urtwn_attach), DEVMETHOD(device_detach, urtwn_detach), DEVMETHOD_END }; static driver_t urtwn_driver = { "urtwn", urtwn_methods, sizeof(struct urtwn_softc) }; static devclass_t urtwn_devclass; DRIVER_MODULE(urtwn, uhub, urtwn_driver, urtwn_devclass, NULL, NULL); MODULE_DEPEND(urtwn, usb, 1, 1, 1); MODULE_DEPEND(urtwn, wlan, 1, 1, 1); #ifndef URTWN_WITHOUT_UCODE MODULE_DEPEND(urtwn, firmware, 1, 1, 1); #endif MODULE_VERSION(urtwn, 1); USB_PNP_HOST_INFO(urtwn_devs); Index: user/alc/PQ_LAUNDRY/sys/dev/urtwn/if_urtwnreg.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/dev/urtwn/if_urtwnreg.h (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/dev/urtwn/if_urtwnreg.h (revision 303206) @@ -1,2182 +1,2184 @@ /*- * Copyright (c) 2010 Damien Bergamini * * Permission to use, copy, modify, and distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. * * $OpenBSD: if_urtwnreg.h,v 1.3 2010/11/16 18:02:59 damien Exp $ * $FreeBSD$ */ #define URTWN_CONFIG_INDEX 0 #define URTWN_IFACE_INDEX 0 #define URTWN_NOISE_FLOOR -95 #define R92C_MAX_CHAINS 2 /* Maximum number of output pipes is 3. */ #define R92C_MAX_EPOUT 3 #define R92C_MAX_TX_PWR 0x3f #define R92C_PUBQ_NPAGES 231 #define R92C_TXPKTBUF_COUNT 256 #define R92C_TX_PAGE_COUNT 248 #define R92C_TX_PAGE_BOUNDARY (R92C_TX_PAGE_COUNT + 1) #define R88E_TXPKTBUF_COUNT 177 #define R88E_TX_PAGE_COUNT 169 #define R88E_TX_PAGE_BOUNDARY (R88E_TX_PAGE_COUNT + 1) #define R92C_H2C_NBOX 4 /* USB Requests. */ #define R92C_REQ_REGS 0x05 /* * MAC registers. */ /* System Configuration. */ #define R92C_SYS_ISO_CTRL 0x000 #define R92C_SYS_FUNC_EN 0x002 #define R92C_APS_FSMCO 0x004 #define R92C_SYS_CLKR 0x008 #define R92C_AFE_MISC 0x010 #define R92C_SPS0_CTRL 0x011 #define R92C_SPS_OCP_CFG 0x018 #define R92C_RSV_CTRL 0x01c #define R92C_RF_CTRL 0x01f #define R92C_LDOA15_CTRL 0x020 #define R92C_LDOV12D_CTRL 0x021 #define R92C_LDOHCI12_CTRL 0x022 #define R92C_LPLDO_CTRL 0x023 #define R92C_AFE_XTAL_CTRL 0x024 #define R92C_AFE_PLL_CTRL 0x028 #define R92C_EFUSE_CTRL 0x030 #define R92C_EFUSE_TEST 0x034 #define R92C_PWR_DATA 0x038 #define R92C_CAL_TIMER 0x03c #define R92C_ACLK_MON 0x03e #define R92C_GPIO_MUXCFG 0x040 #define R92C_GPIO_IO_SEL 0x042 #define R92C_MAC_PINMUX_CFG 0x043 #define R92C_GPIO_PIN_CTRL 0x044 #define R92C_GPIO_IN 0x044 #define R92C_GPIO_OUT 0x045 #define R92C_GPIO_IOSEL 0x046 #define R92C_GPIO_MOD 0x047 #define R92C_GPIO_INTM 0x048 #define R92C_LEDCFG0 0x04c #define R92C_LEDCFG1 0x04d #define R92C_LEDCFG2 0x04e #define R92C_LEDCFG3 0x04f #define R92C_FSIMR 0x050 #define R92C_FSISR 0x054 #define R92C_HSIMR 0x058 #define R92C_HSISR 0x05c #define R88E_BB_PAD_CTRL 0x064 #define R92C_MCUFWDL 0x080 #define R92C_HMEBOX_EXT(idx) (0x088 + (idx) * 2) #define R88E_HIMR 0x0b0 #define R88E_HISR 0x0b4 #define R88E_HIMRE 0x0b8 #define R88E_HISRE 0x0bc #define R92C_EFUSE_ACCESS 0x0cf #define R92C_BIST_SCAN 0x0d0 #define R92C_BIST_RPT 0x0d4 #define R92C_BIST_ROM_RPT 0x0d8 #define R92C_USB_SIE_INTF 0x0e0 #define R92C_PCIE_MIO_INTF 0x0e4 #define R92C_PCIE_MIO_INTD 0x0e8 #define R92C_HPON_FSM 0x0ec #define R92C_SYS_CFG 0x0f0 /* MAC General Configuration. */ #define R92C_CR 0x100 #define R92C_MSR 0x102 #define R92C_PBP 0x104 #define R92C_TRXDMA_CTRL 0x10c #define R92C_TRXFF_BNDY 0x114 #define R92C_TRXFF_STATUS 0x118 #define R92C_RXFF_PTR 0x11c #define R92C_HIMR 0x120 #define R92C_HISR 0x124 #define R92C_HIMRE 0x128 #define R92C_HISRE 0x12c #define R92C_CPWM 0x12f #define R92C_FWIMR 0x130 #define R92C_FWISR 0x134 #define R92C_PKTBUF_DBG_CTRL 0x140 #define R92C_PKTBUF_DBG_DATA_L 0x144 #define R92C_PKTBUF_DBG_DATA_H 0x148 #define R92C_TC0_CTRL(i) (0x150 + (i) * 4) #define R92C_TCUNIT_BASE 0x164 #define R92C_MBIST_START 0x174 #define R92C_MBIST_DONE 0x178 #define R92C_MBIST_FAIL 0x17c #define R88E_32K_CTRL 0x194 #define R92C_C2HEVT_MSG_NORMAL 0x1a0 #define R92C_C2HEVT_MSG_TEST 0x1b8 #define R92C_C2HEVT_CLEAR 0x1bf #define R92C_MCUTST_1 0x1c0 #define R92C_FMETHR 0x1c8 #define R92C_HMETFR 0x1cc #define R92C_HMEBOX(idx) (0x1d0 + (idx) * 4) #define R92C_LLT_INIT 0x1e0 #define R92C_BB_ACCESS_CTRL 0x1e8 #define R92C_BB_ACCESS_DATA 0x1ec #define R88E_HMEBOX_EXT(idx) (0x1f0 + (idx) * 4) /* Tx DMA Configuration. */ #define R92C_RQPN 0x200 #define R92C_FIFOPAGE 0x204 #define R92C_TDECTRL 0x208 #define R92C_TXDMA_OFFSET_CHK 0x20c #define R92C_TXDMA_STATUS 0x210 #define R92C_RQPN_NPQ 0x214 /* Rx DMA Configuration. */ #define R92C_RXDMA_AGG_PG_TH 0x280 #define R92C_RXPKT_NUM 0x284 #define R92C_RXDMA_STATUS 0x288 /* Protocol Configuration. */ #define R92C_FWHW_TXQ_CTRL 0x420 #define R92C_HWSEQ_CTRL 0x423 #define R92C_TXPKTBUF_BCNQ_BDNY 0x424 #define R92C_TXPKTBUF_MGQ_BDNY 0x425 #define R92C_SPEC_SIFS 0x428 #define R92C_RL 0x42a #define R92C_DARFRC 0x430 #define R92C_RARFRC 0x438 #define R92C_RRSR 0x440 #define R92C_ARFR(i) (0x444 + (i) * 4) #define R92C_AGGLEN_LMT 0x458 #define R92C_AMPDU_MIN_SPACE 0x45c #define R92C_TXPKTBUF_WMAC_LBK_BF_HD 0x45d #define R92C_FAST_EDCA_CTRL 0x460 #define R92C_RD_RESP_PKT_TH 0x463 #define R92C_INIRTS_RATE_SEL 0x480 #define R92C_INIDATA_RATE_SEL(macid) (0x484 + (macid)) #define R92C_MAX_AGGR_NUM 0x4ca #define R88E_TX_RPT_CTRL 0x4ec #define R88E_TX_RPT_MACID_MAX 0x4ed #define R88E_TX_RPT_TIME 0x4f0 /* EDCA Configuration. */ #define R92C_EDCA_VO_PARAM 0x500 #define R92C_EDCA_VI_PARAM 0x504 #define R92C_EDCA_BE_PARAM 0x508 #define R92C_EDCA_BK_PARAM 0x50c #define R92C_BCNTCFG 0x510 #define R92C_PIFS 0x512 #define R92C_RDG_PIFS 0x513 #define R92C_SIFS_CCK 0x514 #define R92C_SIFS_OFDM 0x516 #define R92C_AGGR_BREAK_TIME 0x51a #define R92C_SLOT 0x51b #define R92C_TX_PTCL_CTRL 0x520 #define R92C_TXPAUSE 0x522 #define R92C_DIS_TXREQ_CLR 0x523 #define R92C_RD_CTRL 0x524 #define R92C_TBTT_PROHIBIT 0x540 #define R92C_RD_NAV_NXT 0x544 #define R92C_NAV_PROT_LEN 0x546 #define R92C_BCN_CTRL 0x550 #define R92C_MBID_NUM 0x552 #define R92C_DUAL_TSF_RST 0x553 #define R92C_BCN_INTERVAL 0x554 #define R92C_DRVERLYINT 0x558 #define R92C_BCNDMATIM 0x559 #define R92C_ATIMWND 0x55a #define R92C_USTIME_TSF 0x55c #define R92C_BCN_MAX_ERR 0x55d #define R92C_RXTSF_OFFSET_CCK 0x55e #define R92C_RXTSF_OFFSET_OFDM 0x55f #define R92C_TSFTR 0x560 #define R92C_INIT_TSFTR 0x564 #define R92C_PSTIMER 0x580 #define R92C_TIMER0 0x584 #define R92C_TIMER1 0x588 #define R92C_ACMHWCTRL 0x5c0 #define R92C_ACMRSTCTRL 0x5c1 #define R92C_ACMAVG 0x5c2 #define R92C_VO_ADMTIME 0x5c4 #define R92C_VI_ADMTIME 0x5c6 #define R92C_BE_ADMTIME 0x5c8 #define R92C_EDCA_RANDOM_GEN 0x5cc #define R92C_SCH_TXCMD 0x5d0 #define R88E_SCH_TXCMD 0x5f8 /* WMAC Configuration. */ #define R92C_APSD_CTRL 0x600 #define R92C_BWOPMODE 0x603 #define R92C_RCR 0x608 #define R92C_RX_DRVINFO_SZ 0x60f #define R92C_MACID 0x610 #define R92C_BSSID 0x618 #define R92C_MAR 0x620 #define R92C_MAC_SPEC_SIFS 0x63a #define R92C_R2T_SIFS 0x63c #define R92C_T2T_SIFS 0x63e #define R92C_ACKTO 0x640 #define R92C_CAMCMD 0x670 #define R92C_CAMWRITE 0x674 #define R92C_CAMREAD 0x678 #define R92C_CAMDBG 0x67c #define R92C_SECCFG 0x680 #define R92C_RXFLTMAP0 0x6a0 #define R92C_RXFLTMAP1 0x6a2 #define R92C_RXFLTMAP2 0x6a4 /* Bits for R92C_SYS_ISO_CTRL. */ #define R92C_SYS_ISO_CTRL_MD2PP 0x0001 #define R92C_SYS_ISO_CTRL_UA2USB 0x0002 #define R92C_SYS_ISO_CTRL_UD2CORE 0x0004 #define R92C_SYS_ISO_CTRL_PA2PCIE 0x0008 #define R92C_SYS_ISO_CTRL_PD2CORE 0x0010 #define R92C_SYS_ISO_CTRL_IP2MAC 0x0020 #define R92C_SYS_ISO_CTRL_DIOP 0x0040 #define R92C_SYS_ISO_CTRL_DIOE 0x0080 #define R92C_SYS_ISO_CTRL_EB2CORE 0x0100 #define R92C_SYS_ISO_CTRL_DIOR 0x0200 #define R92C_SYS_ISO_CTRL_PWC_EV25V 0x4000 #define R92C_SYS_ISO_CTRL_PWC_EV12V 0x8000 /* Bits for R92C_SYS_FUNC_EN. */ #define R92C_SYS_FUNC_EN_BBRSTB 0x0001 #define R92C_SYS_FUNC_EN_BB_GLB_RST 0x0002 #define R92C_SYS_FUNC_EN_USBA 0x0004 #define R92C_SYS_FUNC_EN_UPLL 0x0008 #define R92C_SYS_FUNC_EN_USBD 0x0010 #define R92C_SYS_FUNC_EN_DIO_PCIE 0x0020 #define R92C_SYS_FUNC_EN_PCIEA 0x0040 #define R92C_SYS_FUNC_EN_PPLL 0x0080 #define R92C_SYS_FUNC_EN_PCIED 0x0100 #define R92C_SYS_FUNC_EN_DIOE 0x0200 #define R92C_SYS_FUNC_EN_CPUEN 0x0400 #define R92C_SYS_FUNC_EN_DCORE 0x0800 #define R92C_SYS_FUNC_EN_ELDR 0x1000 #define R92C_SYS_FUNC_EN_DIO_RF 0x2000 #define R92C_SYS_FUNC_EN_HWPDN 0x4000 #define R92C_SYS_FUNC_EN_MREGEN 0x8000 /* Bits for R92C_APS_FSMCO. */ #define R92C_APS_FSMCO_PFM_LDALL 0x00000001 #define R92C_APS_FSMCO_PFM_ALDN 0x00000002 #define R92C_APS_FSMCO_PFM_LDKP 0x00000004 #define R92C_APS_FSMCO_PFM_WOWL 0x00000008 #define R92C_APS_FSMCO_PDN_EN 0x00000010 #define R92C_APS_FSMCO_PDN_PL 0x00000020 #define R92C_APS_FSMCO_APFM_ONMAC 0x00000100 #define R92C_APS_FSMCO_APFM_OFF 0x00000200 #define R92C_APS_FSMCO_APFM_RSM 0x00000400 #define R92C_APS_FSMCO_AFSM_HSUS 0x00000800 #define R92C_APS_FSMCO_AFSM_PCIE 0x00001000 #define R92C_APS_FSMCO_APDM_MAC 0x00002000 #define R92C_APS_FSMCO_APDM_HOST 0x00004000 #define R92C_APS_FSMCO_APDM_HPDN 0x00008000 #define R92C_APS_FSMCO_RDY_MACON 0x00010000 #define R92C_APS_FSMCO_SUS_HOST 0x00020000 #define R92C_APS_FSMCO_ROP_ALD 0x00100000 #define R92C_APS_FSMCO_ROP_PWR 0x00200000 #define R92C_APS_FSMCO_ROP_SPS 0x00400000 #define R92C_APS_FSMCO_SOP_MRST 0x02000000 #define R92C_APS_FSMCO_SOP_FUSE 0x04000000 #define R92C_APS_FSMCO_SOP_ABG 0x08000000 #define R92C_APS_FSMCO_SOP_AMB 0x10000000 #define R92C_APS_FSMCO_SOP_RCK 0x20000000 #define R92C_APS_FSMCO_SOP_A8M 0x40000000 #define R92C_APS_FSMCO_XOP_BTCK 0x80000000 /* Bits for R92C_SYS_CLKR. */ #define R92C_SYS_CLKR_ANAD16V_EN 0x00000001 #define R92C_SYS_CLKR_ANA8M 0x00000002 #define R92C_SYS_CLKR_MACSLP 0x00000010 #define R92C_SYS_CLKR_LOADER_EN 0x00000020 #define R92C_SYS_CLKR_80M_SSC_DIS 0x00000080 #define R92C_SYS_CLKR_80M_SSC_EN_HO 0x00000100 #define R92C_SYS_CLKR_PHY_SSC_RSTB 0x00000200 #define R92C_SYS_CLKR_SEC_EN 0x00000400 #define R92C_SYS_CLKR_MAC_EN 0x00000800 #define R92C_SYS_CLKR_SYS_EN 0x00001000 #define R92C_SYS_CLKR_RING_EN 0x00002000 /* Bits for R92C_RF_CTRL. */ #define R92C_RF_CTRL_EN 0x01 #define R92C_RF_CTRL_RSTB 0x02 #define R92C_RF_CTRL_SDMRSTB 0x04 /* Bits for R92C_LDOA15_CTRL. */ #define R92C_LDOA15_CTRL_EN 0x01 #define R92C_LDOA15_CTRL_STBY 0x02 #define R92C_LDOA15_CTRL_OBUF 0x04 #define R92C_LDOA15_CTRL_REG_VOS 0x08 /* Bits for R92C_LDOV12D_CTRL. */ #define R92C_LDOV12D_CTRL_LDV12_EN 0x01 /* Bits for R92C_LPLDO_CTRL. */ #define R92C_LPLDO_CTRL_SLEEP 0x10 /* Bits for R92C_AFE_XTAL_CTRL. */ #define R92C_AFE_XTAL_CTRL_ADDR_M 0x007ff800 #define R92C_AFE_XTAL_CTRL_ADDR_S 11 /* Bits for R92C_AFE_PLL_CTRL. */ #define R92C_AFE_PLL_CTRL_EN 0x0001 #define R92C_AFE_PLL_CTRL_320_EN 0x0002 #define R92C_AFE_PLL_CTRL_FREF_SEL 0x0004 #define R92C_AFE_PLL_CTRL_EDGE_SEL 0x0008 #define R92C_AFE_PLL_CTRL_WDOGB 0x0010 #define R92C_AFE_PLL_CTRL_LPFEN 0x0020 /* Bits for R92C_EFUSE_CTRL. */ #define R92C_EFUSE_CTRL_DATA_M 0x000000ff #define R92C_EFUSE_CTRL_DATA_S 0 #define R92C_EFUSE_CTRL_ADDR_M 0x0003ff00 #define R92C_EFUSE_CTRL_ADDR_S 8 #define R92C_EFUSE_CTRL_VALID 0x80000000 /* Bits for R92C_GPIO_MUXCFG. */ #define R92C_GPIO_MUXCFG_ENBT 0x0020 /* Bits for R92C_LEDCFG0. */ #define R92C_LEDCFG0_DIS 0x08 /* Bits for R92C_MCUFWDL. */ #define R92C_MCUFWDL_EN 0x00000001 #define R92C_MCUFWDL_RDY 0x00000002 #define R92C_MCUFWDL_CHKSUM_RPT 0x00000004 #define R92C_MCUFWDL_MACINI_RDY 0x00000008 #define R92C_MCUFWDL_BBINI_RDY 0x00000010 #define R92C_MCUFWDL_RFINI_RDY 0x00000020 #define R92C_MCUFWDL_WINTINI_RDY 0x00000040 #define R92C_MCUFWDL_RAM_DL_SEL 0x00000080 #define R92C_MCUFWDL_PAGE_M 0x00070000 #define R92C_MCUFWDL_PAGE_S 16 #define R92C_MCUFWDL_CPRST 0x00800000 /* Bits for R88E_HIMR. */ #define R88E_HIMR_CPWM 0x00000100 #define R88E_HIMR_CPWM2 0x00000200 #define R88E_HIMR_TBDER 0x04000000 #define R88E_HIMR_PSTIMEOUT 0x20000000 /* Bits for R88E_HIMRE.*/ #define R88E_HIMRE_RXFOVW 0x00000100 #define R88E_HIMRE_TXFOVW 0x00000200 #define R88E_HIMRE_RXERR 0x00000400 #define R88E_HIMRE_TXERR 0x00000800 /* Bits for R92C_EFUSE_ACCESS. */ #define R92C_EFUSE_ACCESS_OFF 0x00 #define R92C_EFUSE_ACCESS_ON 0x69 /* Bits for R92C_HPON_FSM. */ #define R92C_HPON_FSM_CHIP_BONDING_ID_S 22 #define R92C_HPON_FSM_CHIP_BONDING_ID_M 0x00c00000 #define R92C_HPON_FSM_CHIP_BONDING_ID_92C_1T2R 1 /* Bits for R92C_SYS_CFG. */ #define R92C_SYS_CFG_XCLK_VLD 0x00000001 #define R92C_SYS_CFG_ACLK_VLD 0x00000002 #define R92C_SYS_CFG_UCLK_VLD 0x00000004 #define R92C_SYS_CFG_PCLK_VLD 0x00000008 #define R92C_SYS_CFG_PCIRSTB 0x00000010 #define R92C_SYS_CFG_V15_VLD 0x00000020 #define R92C_SYS_CFG_TRP_B15V_EN 0x00000080 #define R92C_SYS_CFG_SIC_IDLE 0x00000100 #define R92C_SYS_CFG_BD_MAC2 0x00000200 #define R92C_SYS_CFG_BD_MAC1 0x00000400 #define R92C_SYS_CFG_IC_MACPHY_MODE 0x00000800 #define R92C_SYS_CFG_CHIP_VER_RTL_M 0x0000f000 #define R92C_SYS_CFG_CHIP_VER_RTL_S 12 #define R92C_SYS_CFG_BT_FUNC 0x00010000 #define R92C_SYS_CFG_VENDOR_UMC 0x00080000 #define R92C_SYS_CFG_PAD_HWPD_IDN 0x00400000 #define R92C_SYS_CFG_TRP_VAUX_EN 0x00800000 #define R92C_SYS_CFG_TRP_BT_EN 0x01000000 #define R92C_SYS_CFG_BD_PKG_SEL 0x02000000 #define R92C_SYS_CFG_BD_HCI_SEL 0x04000000 #define R92C_SYS_CFG_TYPE_92C 0x08000000 /* Bits for R92C_CR. */ #define R92C_CR_HCI_TXDMA_EN 0x0001 #define R92C_CR_HCI_RXDMA_EN 0x0002 #define R92C_CR_TXDMA_EN 0x0004 #define R92C_CR_RXDMA_EN 0x0008 #define R92C_CR_PROTOCOL_EN 0x0010 #define R92C_CR_SCHEDULE_EN 0x0020 #define R92C_CR_MACTXEN 0x0040 #define R92C_CR_MACRXEN 0x0080 #define R92C_CR_ENSEC 0x0200 #define R92C_CR_CALTMR_EN 0x0400 /* Bits for R92C_MSR. */ #define R92C_MSR_NOLINK 0x00 #define R92C_MSR_ADHOC 0x01 #define R92C_MSR_INFRA 0x02 #define R92C_MSR_AP 0x03 #define R92C_MSR_MASK (R92C_MSR_AP) /* Bits for R92C_PBP. */ #define R92C_PBP_PSRX_M 0x0f #define R92C_PBP_PSRX_S 0 #define R92C_PBP_PSTX_M 0xf0 #define R92C_PBP_PSTX_S 4 #define R92C_PBP_64 0 #define R92C_PBP_128 1 #define R92C_PBP_256 2 #define R92C_PBP_512 3 #define R92C_PBP_1024 4 /* Bits for R92C_TRXDMA_CTRL. */ #define R92C_TRXDMA_CTRL_RXDMA_AGG_EN 0x0004 #define R92C_TRXDMA_CTRL_TXDMA_VOQ_MAP_M 0x0030 #define R92C_TRXDMA_CTRL_TXDMA_VOQ_MAP_S 4 #define R92C_TRXDMA_CTRL_TXDMA_VIQ_MAP_M 0x00c0 #define R92C_TRXDMA_CTRL_TXDMA_VIQ_MAP_S 6 #define R92C_TRXDMA_CTRL_TXDMA_BEQ_MAP_M 0x0300 #define R92C_TRXDMA_CTRL_TXDMA_BEQ_MAP_S 8 #define R92C_TRXDMA_CTRL_TXDMA_BKQ_MAP_M 0x0c00 #define R92C_TRXDMA_CTRL_TXDMA_BKQ_MAP_S 10 #define R92C_TRXDMA_CTRL_TXDMA_MGQ_MAP_M 0x3000 #define R92C_TRXDMA_CTRL_TXDMA_MGQ_MAP_S 12 #define R92C_TRXDMA_CTRL_TXDMA_HIQ_MAP_M 0xc000 #define R92C_TRXDMA_CTRL_TXDMA_HIQ_MAP_S 14 #define R92C_TRXDMA_CTRL_QUEUE_LOW 1 #define R92C_TRXDMA_CTRL_QUEUE_NORMAL 2 #define R92C_TRXDMA_CTRL_QUEUE_HIGH 3 #define R92C_TRXDMA_CTRL_QMAP_M 0xfff0 /* Shortcuts. */ #define R92C_TRXDMA_CTRL_QMAP_3EP 0xf5b0 #define R92C_TRXDMA_CTRL_QMAP_HQ_LQ 0xf5f0 #define R92C_TRXDMA_CTRL_QMAP_HQ_NQ 0xfaf0 #define R92C_TRXDMA_CTRL_QMAP_LQ 0x5550 #define R92C_TRXDMA_CTRL_QMAP_NQ 0xaaa0 #define R92C_TRXDMA_CTRL_QMAP_HQ 0xfff0 /* Bits for R92C_LLT_INIT. */ #define R92C_LLT_INIT_DATA_M 0x000000ff #define R92C_LLT_INIT_DATA_S 0 #define R92C_LLT_INIT_ADDR_M 0x0000ff00 #define R92C_LLT_INIT_ADDR_S 8 #define R92C_LLT_INIT_OP_M 0xc0000000 #define R92C_LLT_INIT_OP_S 30 #define R92C_LLT_INIT_OP_NO_ACTIVE 0 #define R92C_LLT_INIT_OP_WRITE 1 /* Bits for R92C_RQPN. */ #define R92C_RQPN_HPQ_M 0x000000ff #define R92C_RQPN_HPQ_S 0 #define R92C_RQPN_LPQ_M 0x0000ff00 #define R92C_RQPN_LPQ_S 8 #define R92C_RQPN_PUBQ_M 0x00ff0000 #define R92C_RQPN_PUBQ_S 16 #define R92C_RQPN_LD 0x80000000 /* Bits for R92C_TDECTRL. */ #define R92C_TDECTRL_BLK_DESC_NUM_M 0x000000f0 #define R92C_TDECTRL_BLK_DESC_NUM_S 4 #define R92C_TDECTRL_BCN_VALID 0x00010000 /* Bits for R92C_FWHW_TXQ_CTRL. */ #define R92C_FWHW_TXQ_CTRL_AMPDU_RTY_NEW 0x80 /* Bits for R92C_SPEC_SIFS. */ #define R92C_SPEC_SIFS_CCK_M 0x00ff #define R92C_SPEC_SIFS_CCK_S 0 #define R92C_SPEC_SIFS_OFDM_M 0xff00 #define R92C_SPEC_SIFS_OFDM_S 8 /* Bits for R92C_RL. */ #define R92C_RL_LRL_M 0x003f #define R92C_RL_LRL_S 0 #define R92C_RL_SRL_M 0x3f00 #define R92C_RL_SRL_S 8 /* Bits for R92C_RRSR. */ #define R92C_RRSR_RATE_BITMAP_M 0x000fffff #define R92C_RRSR_RATE_BITMAP_S 0 #define R92C_RRSR_RATE_CCK_ONLY_1M 0xffff1 #define R92C_RRSR_RSC_LOWSUBCHNL 0x00200000 #define R92C_RRSR_RSC_UPSUBCHNL 0x00400000 #define R92C_RRSR_SHORT 0x00800000 /* Bits for R88E_TX_RPT_CTRL. */ #define R88E_TX_RPT1_ENA 0x01 #define R88E_TX_RPT2_ENA 0x02 /* Bits for R92C_EDCA_XX_PARAM. */ #define R92C_EDCA_PARAM_AIFS_M 0x000000ff #define R92C_EDCA_PARAM_AIFS_S 0 #define R92C_EDCA_PARAM_ECWMIN_M 0x00000f00 #define R92C_EDCA_PARAM_ECWMIN_S 8 #define R92C_EDCA_PARAM_ECWMAX_M 0x0000f000 #define R92C_EDCA_PARAM_ECWMAX_S 12 #define R92C_EDCA_PARAM_TXOP_M 0xffff0000 #define R92C_EDCA_PARAM_TXOP_S 16 /* Bits for R92C_HWSEQ_CTRL / R92C_TXPAUSE. */ #define R92C_TX_QUEUE_VO 0x01 #define R92C_TX_QUEUE_VI 0x02 #define R92C_TX_QUEUE_BE 0x04 #define R92C_TX_QUEUE_BK 0x08 #define R92C_TX_QUEUE_MGT 0x10 #define R92C_TX_QUEUE_HIGH 0x20 #define R92C_TX_QUEUE_BCN 0x40 /* Shortcuts. */ #define R92C_TX_QUEUE_AC \ (R92C_TX_QUEUE_VO | R92C_TX_QUEUE_VI | \ R92C_TX_QUEUE_BE | R92C_TX_QUEUE_BK) #define R92C_TX_QUEUE_ALL \ (R92C_TX_QUEUE_AC | R92C_TX_QUEUE_MGT | \ R92C_TX_QUEUE_HIGH | R92C_TX_QUEUE_BCN | 0x80) /* XXX */ /* Bits for R92C_BCN_CTRL. */ #define R92C_BCN_CTRL_EN_MBSSID 0x02 #define R92C_BCN_CTRL_TXBCN_RPT 0x04 #define R92C_BCN_CTRL_EN_BCN 0x08 #define R92C_BCN_CTRL_DIS_TSF_UDT0 0x10 /* Bits for R92C_MBID_NUM. */ #define R92C_MBID_TXBCN_RPT0 0x08 #define R92C_MBID_TXBCN_RPT1 0x10 /* Bits for R92C_DUAL_TSF_RST. */ #define R92C_DUAL_TSF_RST0 0x01 #define R92C_DUAL_TSF_RST1 0x02 /* Bits for R92C_ACMHWCTRL. */ #define R92C_ACMHWCTRL_EN 0x01 #define R92C_ACMHWCTRL_BE 0x02 #define R92C_ACMHWCTRL_VI 0x04 #define R92C_ACMHWCTRL_VO 0x08 #define R92C_ACMHWCTRL_ACM_MASK 0x0f /* Bits for R92C_APSD_CTRL. */ #define R92C_APSD_CTRL_OFF 0x40 #define R92C_APSD_CTRL_OFF_STATUS 0x80 /* Bits for R92C_BWOPMODE. */ #define R92C_BWOPMODE_11J 0x01 #define R92C_BWOPMODE_5G 0x02 #define R92C_BWOPMODE_20MHZ 0x04 /* Bits for R92C_RCR. */ #define R92C_RCR_AAP 0x00000001 #define R92C_RCR_APM 0x00000002 #define R92C_RCR_AM 0x00000004 #define R92C_RCR_AB 0x00000008 #define R92C_RCR_ADD3 0x00000010 #define R92C_RCR_APWRMGT 0x00000020 #define R92C_RCR_CBSSID_DATA 0x00000040 #define R92C_RCR_CBSSID_BCN 0x00000080 #define R92C_RCR_ACRC32 0x00000100 #define R92C_RCR_AICV 0x00000200 #define R92C_RCR_ADF 0x00000800 #define R92C_RCR_ACF 0x00001000 #define R92C_RCR_AMF 0x00002000 #define R92C_RCR_HTC_LOC_CTRL 0x00004000 #define R92C_RCR_MFBEN 0x00400000 #define R92C_RCR_LSIGEN 0x00800000 #define R92C_RCR_ENMBID 0x01000000 #define R92C_RCR_APP_BA_SSN 0x08000000 #define R92C_RCR_APP_PHYSTS 0x10000000 #define R92C_RCR_APP_ICV 0x20000000 #define R92C_RCR_APP_MIC 0x40000000 #define R92C_RCR_APPFCS 0x80000000 /* Bits for R92C_CAMCMD. */ #define R92C_CAMCMD_ADDR_M 0x0000ffff #define R92C_CAMCMD_ADDR_S 0 #define R92C_CAMCMD_WRITE 0x00010000 #define R92C_CAMCMD_CLR 0x40000000 #define R92C_CAMCMD_POLLING 0x80000000 /* Bits for R92C_SECCFG. */ #define R92C_SECCFG_TXUCKEY_DEF 0x0001 #define R92C_SECCFG_RXUCKEY_DEF 0x0002 #define R92C_SECCFG_TXENC_ENA 0x0004 #define R92C_SECCFG_RXDEC_ENA 0x0008 #define R92C_SECCFG_CMP_A2 0x0010 #define R92C_SECCFG_TXBCKEY_DEF 0x0040 #define R92C_SECCFG_RXBCKEY_DEF 0x0080 #define R88E_SECCFG_CHK_KEYID 0x0100 /* Bits for R92C_RXFLTMAP*. */ #define R92C_RXFLTMAP_SUBTYPE(subtype) \ (1 << ((subtype) >> IEEE80211_FC0_SUBTYPE_SHIFT)) /* * Baseband registers. */ #define R92C_FPGA0_RFMOD 0x800 #define R92C_FPGA0_TXINFO 0x804 #define R92C_HSSI_PARAM1(chain) (0x820 + (chain) * 8) #define R92C_HSSI_PARAM2(chain) (0x824 + (chain) * 8) #define R92C_TXAGC_RATE18_06(i) (((i) == 0) ? 0xe00 : 0x830) #define R92C_TXAGC_RATE54_24(i) (((i) == 0) ? 0xe04 : 0x834) #define R92C_TXAGC_A_CCK1_MCS32 0xe08 #define R92C_TXAGC_B_CCK1_55_MCS32 0x838 #define R92C_TXAGC_B_CCK11_A_CCK2_11 0x86c #define R92C_TXAGC_MCS03_MCS00(i) (((i) == 0) ? 0xe10 : 0x83c) #define R92C_TXAGC_MCS07_MCS04(i) (((i) == 0) ? 0xe14 : 0x848) #define R92C_TXAGC_MCS11_MCS08(i) (((i) == 0) ? 0xe18 : 0x84c) #define R92C_TXAGC_MCS15_MCS12(i) (((i) == 0) ? 0xe1c : 0x868) #define R92C_LSSI_PARAM(chain) (0x840 + (chain) * 4) #define R92C_FPGA0_RFIFACEOE(chain) (0x860 + (chain) * 4) #define R92C_FPGA0_RFIFACESW(idx) (0x870 + (idx) * 4) #define R92C_FPGA0_RFPARAM(idx) (0x878 + (idx) * 4) #define R92C_FPGA0_ANAPARAM2 0x884 #define R92C_LSSI_READBACK(chain) (0x8a0 + (chain) * 4) #define R92C_HSPI_READBACK(chain) (0x8b8 + (chain) * 4) #define R92C_FPGA1_RFMOD 0x900 #define R92C_FPGA1_TXINFO 0x90c #define R92C_CCK0_SYSTEM 0xa00 #define R92C_CCK0_AFESETTING 0xa04 #define R92C_OFDM0_TRXPATHENA 0xc04 #define R92C_OFDM0_TRMUXPAR 0xc08 #define R92C_OFDM0_AGCCORE1(chain) (0xc50 + (chain) * 8) #define R92C_OFDM0_AGCPARAM1 0xc70 #define R92C_OFDM0_AGCRSSITABLE 0xc78 #define R92C_OFDM1_LSTF 0xd00 /* Bits for R92C_FPGA[01]_RFMOD. */ #define R92C_RFMOD_40MHZ 0x00000001 #define R92C_RFMOD_JAPAN 0x00000002 #define R92C_RFMOD_CCK_TXSC 0x00000030 #define R92C_RFMOD_CCK_EN 0x01000000 #define R92C_RFMOD_OFDM_EN 0x02000000 /* Bits for R92C_HSSI_PARAM1(i). */ #define R92C_HSSI_PARAM1_PI 0x00000100 /* Bits for R92C_HSSI_PARAM2(i). */ #define R92C_HSSI_PARAM2_CCK_HIPWR 0x00000200 #define R92C_HSSI_PARAM2_ADDR_LENGTH 0x00000400 #define R92C_HSSI_PARAM2_DATA_LENGTH 0x00000800 #define R92C_HSSI_PARAM2_READ_ADDR_M 0x7f800000 #define R92C_HSSI_PARAM2_READ_ADDR_S 23 #define R92C_HSSI_PARAM2_READ_EDGE 0x80000000 /* Bits for R92C_TXAGC_A_CCK1_MCS32. */ #define R92C_TXAGC_A_CCK1_M 0x0000ff00 #define R92C_TXAGC_A_CCK1_S 8 /* Bits for R92C_TXAGC_B_CCK11_A_CCK2_11. */ #define R92C_TXAGC_B_CCK11_M 0x000000ff #define R92C_TXAGC_B_CCK11_S 0 #define R92C_TXAGC_A_CCK2_M 0x0000ff00 #define R92C_TXAGC_A_CCK2_S 8 #define R92C_TXAGC_A_CCK55_M 0x00ff0000 #define R92C_TXAGC_A_CCK55_S 16 #define R92C_TXAGC_A_CCK11_M 0xff000000 #define R92C_TXAGC_A_CCK11_S 24 /* Bits for R92C_TXAGC_B_CCK1_55_MCS32. */ #define R92C_TXAGC_B_CCK1_M 0x0000ff00 #define R92C_TXAGC_B_CCK1_S 8 #define R92C_TXAGC_B_CCK2_M 0x00ff0000 #define R92C_TXAGC_B_CCK2_S 16 #define R92C_TXAGC_B_CCK55_M 0xff000000 #define R92C_TXAGC_B_CCK55_S 24 /* Bits for R92C_TXAGC_RATE18_06(x). */ #define R92C_TXAGC_RATE06_M 0x000000ff #define R92C_TXAGC_RATE06_S 0 #define R92C_TXAGC_RATE09_M 0x0000ff00 #define R92C_TXAGC_RATE09_S 8 #define R92C_TXAGC_RATE12_M 0x00ff0000 #define R92C_TXAGC_RATE12_S 16 #define R92C_TXAGC_RATE18_M 0xff000000 #define R92C_TXAGC_RATE18_S 24 /* Bits for R92C_TXAGC_RATE54_24(x). */ #define R92C_TXAGC_RATE24_M 0x000000ff #define R92C_TXAGC_RATE24_S 0 #define R92C_TXAGC_RATE36_M 0x0000ff00 #define R92C_TXAGC_RATE36_S 8 #define R92C_TXAGC_RATE48_M 0x00ff0000 #define R92C_TXAGC_RATE48_S 16 #define R92C_TXAGC_RATE54_M 0xff000000 #define R92C_TXAGC_RATE54_S 24 /* Bits for R92C_TXAGC_MCS03_MCS00(x). */ #define R92C_TXAGC_MCS00_M 0x000000ff #define R92C_TXAGC_MCS00_S 0 #define R92C_TXAGC_MCS01_M 0x0000ff00 #define R92C_TXAGC_MCS01_S 8 #define R92C_TXAGC_MCS02_M 0x00ff0000 #define R92C_TXAGC_MCS02_S 16 #define R92C_TXAGC_MCS03_M 0xff000000 #define R92C_TXAGC_MCS03_S 24 /* Bits for R92C_TXAGC_MCS07_MCS04(x). */ #define R92C_TXAGC_MCS04_M 0x000000ff #define R92C_TXAGC_MCS04_S 0 #define R92C_TXAGC_MCS05_M 0x0000ff00 #define R92C_TXAGC_MCS05_S 8 #define R92C_TXAGC_MCS06_M 0x00ff0000 #define R92C_TXAGC_MCS06_S 16 #define R92C_TXAGC_MCS07_M 0xff000000 #define R92C_TXAGC_MCS07_S 24 /* Bits for R92C_TXAGC_MCS11_MCS08(x). */ #define R92C_TXAGC_MCS08_M 0x000000ff #define R92C_TXAGC_MCS08_S 0 #define R92C_TXAGC_MCS09_M 0x0000ff00 #define R92C_TXAGC_MCS09_S 8 #define R92C_TXAGC_MCS10_M 0x00ff0000 #define R92C_TXAGC_MCS10_S 16 #define R92C_TXAGC_MCS11_M 0xff000000 #define R92C_TXAGC_MCS11_S 24 /* Bits for R92C_TXAGC_MCS15_MCS12(x). */ #define R92C_TXAGC_MCS12_M 0x000000ff #define R92C_TXAGC_MCS12_S 0 #define R92C_TXAGC_MCS13_M 0x0000ff00 #define R92C_TXAGC_MCS13_S 8 #define R92C_TXAGC_MCS14_M 0x00ff0000 #define R92C_TXAGC_MCS14_S 16 #define R92C_TXAGC_MCS15_M 0xff000000 #define R92C_TXAGC_MCS15_S 24 /* Bits for R92C_LSSI_PARAM(i). */ #define R92C_LSSI_PARAM_DATA_M 0x000fffff #define R92C_LSSI_PARAM_DATA_S 0 #define R92C_LSSI_PARAM_ADDR_M 0x03f00000 #define R92C_LSSI_PARAM_ADDR_S 20 #define R88E_LSSI_PARAM_ADDR_M 0x0ff00000 #define R88E_LSSI_PARAM_ADDR_S 20 /* Bits for R92C_FPGA0_ANAPARAM2. */ #define R92C_FPGA0_ANAPARAM2_CBW20 0x00000400 /* Bits for R92C_LSSI_READBACK(i). */ #define R92C_LSSI_READBACK_DATA_M 0x000fffff #define R92C_LSSI_READBACK_DATA_S 0 /* Bits for R92C_OFDM0_AGCCORE1(i). */ #define R92C_OFDM0_AGCCORE1_GAIN_M 0x0000007f #define R92C_OFDM0_AGCCORE1_GAIN_S 0 /* * USB registers. */ #define R92C_USB_SUSPEND 0xfe10 #define R92C_USB_INFO 0xfe17 #define R92C_USB_SPECIAL_OPTION 0xfe55 #define R92C_USB_HCPWM 0xfe57 #define R92C_USB_HRPWM 0xfe58 #define R92C_USB_DMA_AGG_TO 0xfe5b #define R92C_USB_AGG_TO 0xfe5c #define R92C_USB_AGG_TH 0xfe5d #define R92C_USB_VID 0xfe60 #define R92C_USB_PID 0xfe62 #define R92C_USB_OPTIONAL 0xfe64 #define R92C_USB_EP 0xfe65 #define R92C_USB_PHY 0xfe68 #define R92C_USB_MAC_ADDR 0xfe70 #define R92C_USB_STRING 0xfe80 /* Bits for R92C_USB_SPECIAL_OPTION. */ #define R92C_USB_SPECIAL_OPTION_AGG_EN 0x08 #define R92C_USB_SPECIAL_OPTION_INT_BULK_SEL 0x10 /* Bits for R92C_USB_EP. */ #define R92C_USB_EP_HQ_M 0x000f #define R92C_USB_EP_HQ_S 0 #define R92C_USB_EP_NQ_M 0x00f0 #define R92C_USB_EP_NQ_S 4 #define R92C_USB_EP_LQ_M 0x0f00 #define R92C_USB_EP_LQ_S 8 /* * Firmware base address. */ #define R92C_FW_START_ADDR 0x1000 #define R92C_FW_PAGE_SIZE 4096 /* * RF (6052) registers. */ #define R92C_RF_AC 0x00 #define R92C_RF_IQADJ_G(i) (0x01 + (i)) #define R92C_RF_POW_TRSW 0x05 #define R92C_RF_GAIN_RX 0x06 #define R92C_RF_GAIN_TX 0x07 #define R92C_RF_TXM_IDAC 0x08 #define R92C_RF_BS_IQGEN 0x0f #define R92C_RF_MODE1 0x10 #define R92C_RF_MODE2 0x11 #define R92C_RF_RX_AGC_HP 0x12 #define R92C_RF_TX_AGC 0x13 #define R92C_RF_BIAS 0x14 #define R92C_RF_IPA 0x15 #define R92C_RF_POW_ABILITY 0x17 #define R92C_RF_CHNLBW 0x18 #define R92C_RF_RX_G1 0x1a #define R92C_RF_RX_G2 0x1b #define R92C_RF_RX_BB2 0x1c #define R92C_RF_RX_BB1 0x1d #define R92C_RF_RCK1 0x1e #define R92C_RF_RCK2 0x1f #define R92C_RF_TX_G(i) (0x20 + (i)) #define R92C_RF_TX_BB1 0x23 #define R92C_RF_T_METER 0x24 #define R92C_RF_SYN_G(i) (0x25 + (i)) #define R92C_RF_RCK_OS 0x30 #define R92C_RF_TXPA_G(i) (0x31 + (i)) #define R88E_RF_T_METER 0x42 /* Bits for R92C_RF_AC. */ #define R92C_RF_AC_MODE_M 0x70000 #define R92C_RF_AC_MODE_S 16 #define R92C_RF_AC_MODE_STANDBY 1 /* Bits for R92C_RF_CHNLBW. */ #define R92C_RF_CHNLBW_CHNL_M 0x003ff #define R92C_RF_CHNLBW_CHNL_S 0 #define R92C_RF_CHNLBW_BW20 0x00400 #define R88E_RF_CHNLBW_BW20 0x00c00 #define R92C_RF_CHNLBW_LCSTART 0x08000 /* Bits for R92C_RF_T_METER. */ #define R92C_RF_T_METER_START 0x60 #define R92C_RF_T_METER_VAL_M 0x1f #define R92C_RF_T_METER_VAL_S 0 /* Bits for R88E_RF_T_METER. */ #define R88E_RF_T_METER_VAL_M 0x0fc00 #define R88E_RF_T_METER_VAL_S 10 #define R88E_RF_T_METER_START 0x30000 /* * CAM entries. */ #define R92C_CAM_ENTRY_COUNT 32 #define R92C_CAM_CTL0(entry) ((entry) * 8 + 0) #define R92C_CAM_CTL1(entry) ((entry) * 8 + 1) #define R92C_CAM_KEY(entry, i) ((entry) * 8 + 2 + (i)) +#define R92C_CAM_CTL6(entry) ((entry) * 8 + 6) +#define R92C_CAM_CTL7(entry) ((entry) * 8 + 7) /* Bits for R92C_CAM_CTL0(i). */ #define R92C_CAM_KEYID_M 0x00000003 #define R92C_CAM_KEYID_S 0 #define R92C_CAM_ALGO_M 0x0000001c #define R92C_CAM_ALGO_S 2 #define R92C_CAM_ALGO_NONE 0 #define R92C_CAM_ALGO_WEP40 1 #define R92C_CAM_ALGO_TKIP 2 #define R92C_CAM_ALGO_AES 4 #define R92C_CAM_ALGO_WEP104 5 #define R92C_CAM_VALID 0x00008000 #define R92C_CAM_MACLO_M 0xffff0000 #define R92C_CAM_MACLO_S 16 /* Rate adaptation modes. */ #define R92C_RAID_11GN 1 #define R92C_RAID_11N 3 #define R92C_RAID_11BG 4 #define R92C_RAID_11G 5 /* "pure" 11g */ #define R92C_RAID_11B 6 /* * Macros to access subfields in registers. */ /* Mask and Shift (getter). */ #define MS(val, field) \ (((val) & field##_M) >> field##_S) /* Shift and Mask (setter). */ #define SM(field, val) \ (((val) << field##_S) & field##_M) /* Rewrite. */ #define RW(var, field, val) \ (((var) & ~field##_M) | SM(field, val)) /* * Firmware image header. */ struct r92c_fw_hdr { /* QWORD0 */ uint16_t signature; uint8_t category; uint8_t function; uint16_t version; uint16_t subversion; /* QWORD1 */ uint8_t month; uint8_t date; uint8_t hour; uint8_t minute; uint16_t ramcodesize; uint16_t reserved2; /* QWORD2 */ uint32_t svnidx; uint32_t reserved3; /* QWORD3 */ uint32_t reserved4; uint32_t reserved5; } __packed; /* * Host to firmware commands. */ struct r92c_fw_cmd { uint8_t id; #define R92C_CMD_AP_OFFLOAD 0 #define R92C_CMD_SET_PWRMODE 1 #define R92C_CMD_JOINBSS_RPT 2 #define R92C_CMD_RSVD_PAGE 3 #define R92C_CMD_RSSI 4 #define R92C_CMD_RSSI_SETTING 5 #define R92C_CMD_MACID_CONFIG 6 #define R92C_CMD_MACID_PS_MODE 7 #define R92C_CMD_P2P_PS_OFFLOAD 8 #define R92C_CMD_SELECTIVE_SUSPEND 9 #define R92C_CMD_FLAG_EXT 0x80 uint8_t msg[5]; } __packed; /* Structure for R92C_CMD_RSSI_SETTING. */ struct r92c_fw_cmd_rssi { uint8_t macid; uint8_t reserved; uint8_t pwdb; } __packed; /* Structure for R92C_CMD_MACID_CONFIG. */ struct r92c_fw_cmd_macid_cfg { uint32_t mask; uint8_t macid; #define URTWN_MACID_BSS 0 #define URTWN_MACID_BC 4 /* Broadcast. */ #define R92C_MACID_MAX 31 #define R88E_MACID_MAX 63 #define URTWN_MACID_MAX(sc) (((sc)->chip & URTWN_CHIP_88E) ? \ R88E_MACID_MAX : R92C_MACID_MAX) #define URTWN_MACID_UNDEFINED (uint8_t)-1 #define URTWN_MACID_VALID 0x80 } __packed; /* * RTL8192CU ROM image. */ struct r92c_rom { uint16_t id; /* 0x8192 */ uint8_t reserved1[5]; uint8_t dbg_sel; uint16_t reserved2; uint16_t vid; uint16_t pid; uint8_t usb_opt; uint8_t ep_setting; uint16_t reserved3; uint8_t usb_phy; uint8_t reserved4[3]; uint8_t macaddr[IEEE80211_ADDR_LEN]; uint8_t string[61]; /* "Realtek" */ uint8_t subcustomer_id; uint8_t cck_tx_pwr[R92C_MAX_CHAINS][3]; uint8_t ht40_1s_tx_pwr[R92C_MAX_CHAINS][3]; uint8_t ht40_2s_tx_pwr_diff[3]; uint8_t ht20_tx_pwr_diff[3]; uint8_t ofdm_tx_pwr_diff[3]; uint8_t ht40_max_pwr[3]; uint8_t ht20_max_pwr[3]; uint8_t xtal_calib; uint8_t tssi[R92C_MAX_CHAINS]; uint8_t thermal_meter; uint8_t rf_opt1; #define R92C_ROM_RF1_REGULATORY_M 0x07 #define R92C_ROM_RF1_REGULATORY_S 0 #define R92C_ROM_RF1_BOARD_TYPE_M 0xe0 #define R92C_ROM_RF1_BOARD_TYPE_S 5 #define R92C_BOARD_TYPE_DONGLE 0 #define R92C_BOARD_TYPE_HIGHPA 1 #define R92C_BOARD_TYPE_MINICARD 2 #define R92C_BOARD_TYPE_SOLO 3 #define R92C_BOARD_TYPE_COMBO 4 uint8_t rf_opt2; uint8_t rf_opt3; uint8_t rf_opt4; uint8_t channel_plan; #define R92C_CHANNEL_PLAN_BY_HW 0x80 uint8_t version; uint8_t customer_id; } __packed; /* * RTL8188EU ROM image. */ struct r88e_rom { uint8_t reserved1[16]; uint8_t cck_tx_pwr[6]; uint8_t ht40_tx_pwr[5]; uint8_t tx_pwr_diff; uint8_t reserved2[156]; uint8_t channel_plan; uint8_t crystalcap; uint8_t reserved3[7]; uint8_t rf_board_opt; uint8_t rf_feature_opt; uint8_t rf_bt_opt; uint8_t version; uint8_t customer_id; uint8_t reserved4[3]; uint8_t rf_ant_opt; uint8_t reserved5[6]; uint16_t vid; uint16_t pid; uint8_t usb_opt; uint8_t reserved6[2]; uint8_t macaddr[IEEE80211_ADDR_LEN]; uint8_t reserved7[2]; uint8_t string[33]; /* "realtek 802.11n NIC" */ uint8_t reserved8[256]; } __packed; #define URTWN_EFUSE_MAX_LEN 512 /* Rx MAC descriptor. */ struct r92c_rx_stat { uint32_t rxdw0; #define R92C_RXDW0_PKTLEN_M 0x00003fff #define R92C_RXDW0_PKTLEN_S 0 #define R92C_RXDW0_CRCERR 0x00004000 #define R92C_RXDW0_ICVERR 0x00008000 #define R92C_RXDW0_INFOSZ_M 0x000f0000 #define R92C_RXDW0_INFOSZ_S 16 #define R92C_RXDW0_CIPHER_M 0x00700000 #define R92C_RXDW0_CIPHER_S 20 #define R92C_RXDW0_QOS 0x00800000 #define R92C_RXDW0_SHIFT_M 0x03000000 #define R92C_RXDW0_SHIFT_S 24 #define R92C_RXDW0_PHYST 0x04000000 #define R92C_RXDW0_DECRYPTED 0x08000000 uint32_t rxdw1; uint32_t rxdw2; #define R92C_RXDW2_PKTCNT_M 0x00ff0000 #define R92C_RXDW2_PKTCNT_S 16 uint32_t rxdw3; #define R92C_RXDW3_RATE_M 0x0000003f #define R92C_RXDW3_RATE_S 0 #define R92C_RXDW3_HT 0x00000040 #define R92C_RXDW3_HTC 0x00000400 #define R88E_RXDW3_RPT_M 0x0000c000 #define R88E_RXDW3_RPT_S 14 #define R88E_RXDW3_RPT_RX 0 #define R88E_RXDW3_RPT_TX1 1 #define R88E_RXDW3_RPT_TX2 2 uint32_t rxdw4; uint32_t rxdw5; } __packed __attribute__((aligned(4))); /* Rx PHY descriptor. */ struct r92c_rx_phystat { uint32_t phydw0; uint32_t phydw1; uint32_t phydw2; uint32_t phydw3; uint32_t phydw4; uint32_t phydw5; uint32_t phydw6; uint32_t phydw7; } __packed __attribute__((aligned(4))); /* Rx PHY CCK descriptor. */ struct r92c_rx_cck { uint8_t adc_pwdb[4]; uint8_t sq_rpt; uint8_t agc_rpt; } __packed; struct r88e_rx_cck { uint8_t path_agc[2]; uint8_t chan; uint8_t reserved1; uint8_t sig_qual; uint8_t agc_rpt; uint8_t rpt_b; uint8_t reserved2; uint8_t noise_power; uint8_t path_cfotail[2]; uint8_t pcts_mask[2]; uint8_t stream_rxevm[2]; uint8_t path_rxsnr[2]; uint8_t noise_power_db_lsb; uint8_t reserved3[3]; uint8_t stream_csi[2]; uint8_t stream_target_csi[2]; uint8_t sig_evm; } __packed; /* Tx MAC descriptor. */ struct r92c_tx_desc { uint32_t txdw0; #define R92C_TXDW0_PKTLEN_M 0x0000ffff #define R92C_TXDW0_PKTLEN_S 0 #define R92C_TXDW0_OFFSET_M 0x00ff0000 #define R92C_TXDW0_OFFSET_S 16 #define R92C_TXDW0_BMCAST 0x01000000 #define R92C_TXDW0_LSG 0x04000000 #define R92C_TXDW0_FSG 0x08000000 #define R92C_TXDW0_OWN 0x80000000 uint32_t txdw1; #define R92C_TXDW1_MACID_M 0x0000001f #define R92C_TXDW1_MACID_S 0 #define R88E_TXDW1_MACID_M 0x0000003f #define R88E_TXDW1_MACID_S 0 #define R92C_TXDW1_AGGEN 0x00000020 #define R92C_TXDW1_AGGBK 0x00000040 #define R92C_TXDW1_QSEL_M 0x00001f00 #define R92C_TXDW1_QSEL_S 8 #define R92C_TXDW1_QSEL_BE 0x00 /* or 0x03 */ #define R92C_TXDW1_QSEL_BK 0x01 /* or 0x02 */ #define R92C_TXDW1_QSEL_VI 0x04 /* or 0x05 */ #define R92C_TXDW1_QSEL_VO 0x06 /* or 0x07 */ #define URTWN_MAX_TID 8 #define R92C_TXDW1_QSEL_BEACON 0x10 #define R92C_TXDW1_QSEL_MGNT 0x12 #define R92C_TXDW1_RAID_M 0x000f0000 #define R92C_TXDW1_RAID_S 16 #define R92C_TXDW1_CIPHER_M 0x00c00000 #define R92C_TXDW1_CIPHER_S 22 #define R92C_TXDW1_CIPHER_NONE 0 #define R92C_TXDW1_CIPHER_RC4 1 #define R92C_TXDW1_CIPHER_AES 3 #define R92C_TXDW1_PKTOFF_M 0x7c000000 #define R92C_TXDW1_PKTOFF_S 26 uint32_t txdw2; #define R88E_TXDW2_AGGBK 0x00010000 #define R88E_TXDW2_CCX_RPT 0x00080000 uint16_t txdw3; uint16_t txdseq; #define R88E_TXDSEQ_HWSEQ_EN 0x8000 uint32_t txdw4; #define R92C_TXDW4_RTSRATE_M 0x0000003f #define R92C_TXDW4_RTSRATE_S 0 #define R92C_TXDW4_HWSEQ_QOS 0x00000040 #define R92C_TXDW4_HWSEQ_EN 0x00000080 #define R92C_TXDW4_DRVRATE 0x00000100 #define R92C_TXDW4_CTS2SELF 0x00000800 #define R92C_TXDW4_RTSEN 0x00001000 #define R92C_TXDW4_HWRTSEN 0x00002000 #define R92C_TXDW4_SCO_M 0x003f0000 #define R92C_TXDW4_SCO_S 20 #define R92C_TXDW4_SCO_SCA 1 #define R92C_TXDW4_SCO_SCB 2 #define R92C_TXDW4_40MHZ 0x02000000 uint32_t txdw5; #define R92C_TXDW5_DATARATE_M 0x0000003f #define R92C_TXDW5_DATARATE_S 0 #define R92C_TXDW5_SGI 0x00000040 #define R92C_TXDW5_RTY_LMT_ENA 0x00020000 #define R92C_TXDW5_RTY_LMT_M 0x00fc0000 #define R92C_TXDW5_RTY_LMT_S 18 #define R92C_TXDW5_AGGNUM_M 0xff000000 #define R92C_TXDW5_AGGNUM_S 24 uint32_t txdw6; uint16_t txdsum; uint16_t pad; } __packed __attribute__((aligned(4))); struct r88e_tx_rpt_ccx { uint8_t rptb0; uint8_t rptb1; #define R88E_RPTB1_MACID_M 0x3f #define R88E_RPTB1_MACID_S 0 #define R88E_RPTB1_PKT_OK 0x40 #define R88E_RPTB1_BMC 0x80 uint8_t rptb2; #define R88E_RPTB2_RETRY_CNT_M 0x3f #define R88E_RPTB2_RETRY_CNT_S 0 #define R88E_RPTB2_LIFE_EXPIRE 0x40 #define R88E_RPTB2_RETRY_OVER 0x80 uint8_t rptb3; uint8_t rptb4; uint8_t rptb5; uint8_t rptb6; #define R88E_RPTB6_QSEL_M 0xf0 #define R88E_RPTB6_QSEL_S 4 uint8_t rptb7; } __packed; static const uint8_t ridx2rate[] = { 2, 4, 11, 22, 12, 18, 24, 36, 48, 72, 96, 108 }; /* HW rate indices. */ #define URTWN_RIDX_CCK1 0 #define URTWN_RIDX_CCK11 3 #define URTWN_RIDX_OFDM6 4 #define URTWN_RIDX_OFDM24 8 #define URTWN_RIDX_OFDM54 11 #define URTWN_RIDX_COUNT 28 #define URTWN_RIDX_UNKNOWN (uint8_t)-1 /* * MAC initialization values. */ static const struct { uint16_t reg; uint8_t val; } rtl8188eu_mac[] = { { 0x026, 0x41 }, { 0x027, 0x35 }, { 0x040, 0x00 }, { 0x428, 0x0a }, { 0x429, 0x10 }, { 0x430, 0x00 }, { 0x431, 0x01 }, { 0x432, 0x02 }, { 0x433, 0x04 }, { 0x434, 0x05 }, { 0x435, 0x06 }, { 0x436, 0x07 }, { 0x437, 0x08 }, { 0x438, 0x00 }, { 0x439, 0x00 }, { 0x43a, 0x01 }, { 0x43b, 0x02 }, { 0x43c, 0x04 }, { 0x43d, 0x05 }, { 0x43e, 0x06 }, { 0x43f, 0x07 }, { 0x440, 0x5d }, { 0x441, 0x01 }, { 0x442, 0x00 }, { 0x444, 0x15 }, { 0x445, 0xf0 }, { 0x446, 0x0f }, { 0x447, 0x00 }, { 0x458, 0x41 }, { 0x459, 0xa8 }, { 0x45a, 0x72 }, { 0x45b, 0xb9 }, { 0x460, 0x66 }, { 0x461, 0x66 }, { 0x480, 0x08 }, { 0x4c8, 0xff }, { 0x4c9, 0x08 }, { 0x4cc, 0xff }, { 0x4cd, 0xff }, { 0x4ce, 0x01 }, { 0x4d3, 0x01 }, { 0x500, 0x26 }, { 0x501, 0xa2 }, { 0x502, 0x2f }, { 0x503, 0x00 }, { 0x504, 0x28 }, { 0x505, 0xa3 }, { 0x506, 0x5e }, { 0x507, 0x00 }, { 0x508, 0x2b }, { 0x509, 0xa4 }, { 0x50a, 0x5e }, { 0x50b, 0x00 }, { 0x50c, 0x4f }, { 0x50d, 0xa4 }, { 0x50e, 0x00 }, { 0x50f, 0x00 }, { 0x512, 0x1c }, { 0x514, 0x0a }, { 0x516, 0x0a }, { 0x525, 0x4f }, { 0x550, 0x10 }, { 0x551, 0x10 }, { 0x559, 0x02 }, { 0x55d, 0xff }, { 0x605, 0x30 }, { 0x608, 0x0e }, { 0x609, 0x2a }, { 0x620, 0xff }, { 0x621, 0xff }, { 0x622, 0xff }, { 0x623, 0xff }, { 0x624, 0xff }, { 0x625, 0xff }, { 0x626, 0xff }, { 0x627, 0xff }, { 0x652, 0x20 }, { 0x63c, 0x0a }, { 0x63d, 0x0a }, { 0x63e, 0x0e }, { 0x63f, 0x0e }, { 0x640, 0x40 }, { 0x66e, 0x05 }, { 0x700, 0x21 }, { 0x701, 0x43 }, { 0x702, 0x65 }, { 0x703, 0x87 }, { 0x708, 0x21 }, { 0x709, 0x43 }, { 0x70a, 0x65 }, { 0x70b, 0x87 } }, rtl8192cu_mac[] = { { 0x420, 0x80 }, { 0x423, 0x00 }, { 0x430, 0x00 }, { 0x431, 0x00 }, { 0x432, 0x00 }, { 0x433, 0x01 }, { 0x434, 0x04 }, { 0x435, 0x05 }, { 0x436, 0x06 }, { 0x437, 0x07 }, { 0x438, 0x00 }, { 0x439, 0x00 }, { 0x43a, 0x00 }, { 0x43b, 0x01 }, { 0x43c, 0x04 }, { 0x43d, 0x05 }, { 0x43e, 0x06 }, { 0x43f, 0x07 }, { 0x440, 0x5d }, { 0x441, 0x01 }, { 0x442, 0x00 }, { 0x444, 0x15 }, { 0x445, 0xf0 }, { 0x446, 0x0f }, { 0x447, 0x00 }, { 0x458, 0x41 }, { 0x459, 0xa8 }, { 0x45a, 0x72 }, { 0x45b, 0xb9 }, { 0x460, 0x66 }, { 0x461, 0x66 }, { 0x462, 0x08 }, { 0x463, 0x03 }, { 0x4c8, 0xff }, { 0x4c9, 0x08 }, { 0x4cc, 0xff }, { 0x4cd, 0xff }, { 0x4ce, 0x01 }, { 0x500, 0x26 }, { 0x501, 0xa2 }, { 0x502, 0x2f }, { 0x503, 0x00 }, { 0x504, 0x28 }, { 0x505, 0xa3 }, { 0x506, 0x5e }, { 0x507, 0x00 }, { 0x508, 0x2b }, { 0x509, 0xa4 }, { 0x50a, 0x5e }, { 0x50b, 0x00 }, { 0x50c, 0x4f }, { 0x50d, 0xa4 }, { 0x50e, 0x00 }, { 0x50f, 0x00 }, { 0x512, 0x1c }, { 0x514, 0x0a }, { 0x515, 0x10 }, { 0x516, 0x0a }, { 0x517, 0x10 }, { 0x51a, 0x16 }, { 0x524, 0x0f }, { 0x525, 0x4f }, { 0x546, 0x40 }, { 0x547, 0x00 }, { 0x550, 0x10 }, { 0x551, 0x10 }, { 0x559, 0x02 }, { 0x55a, 0x02 }, { 0x55d, 0xff }, { 0x605, 0x30 }, { 0x608, 0x0e }, { 0x609, 0x2a }, { 0x652, 0x20 }, { 0x63c, 0x0a }, { 0x63d, 0x0e }, { 0x63e, 0x0a }, { 0x63f, 0x0e }, { 0x66e, 0x05 }, { 0x700, 0x21 }, { 0x701, 0x43 }, { 0x702, 0x65 }, { 0x703, 0x87 }, { 0x708, 0x21 }, { 0x709, 0x43 }, { 0x70a, 0x65 }, { 0x70b, 0x87 } }; /* * Baseband initialization values. */ struct urtwn_bb_prog { int count; const uint16_t *regs; const uint32_t *vals; int agccount; const uint32_t *agcvals; }; /* * RTL8192CU and RTL8192CE-VAU. */ static const uint16_t rtl8192ce_bb_regs[] = { 0x024, 0x028, 0x800, 0x804, 0x808, 0x80c, 0x810, 0x814, 0x818, 0x81c, 0x820, 0x824, 0x828, 0x82c, 0x830, 0x834, 0x838, 0x83c, 0x840, 0x844, 0x848, 0x84c, 0x850, 0x854, 0x858, 0x85c, 0x860, 0x864, 0x868, 0x86c, 0x870, 0x874, 0x878, 0x87c, 0x880, 0x884, 0x888, 0x88c, 0x890, 0x894, 0x898, 0x89c, 0x900, 0x904, 0x908, 0x90c, 0xa00, 0xa04, 0xa08, 0xa0c, 0xa10, 0xa14, 0xa18, 0xa1c, 0xa20, 0xa24, 0xa28, 0xa2c, 0xa70, 0xa74, 0xc00, 0xc04, 0xc08, 0xc0c, 0xc10, 0xc14, 0xc18, 0xc1c, 0xc20, 0xc24, 0xc28, 0xc2c, 0xc30, 0xc34, 0xc38, 0xc3c, 0xc40, 0xc44, 0xc48, 0xc4c, 0xc50, 0xc54, 0xc58, 0xc5c, 0xc60, 0xc64, 0xc68, 0xc6c, 0xc70, 0xc74, 0xc78, 0xc7c, 0xc80, 0xc84, 0xc88, 0xc8c, 0xc90, 0xc94, 0xc98, 0xc9c, 0xca0, 0xca4, 0xca8, 0xcac, 0xcb0, 0xcb4, 0xcb8, 0xcbc, 0xcc0, 0xcc4, 0xcc8, 0xccc, 0xcd0, 0xcd4, 0xcd8, 0xcdc, 0xce0, 0xce4, 0xce8, 0xcec, 0xd00, 0xd04, 0xd08, 0xd0c, 0xd10, 0xd14, 0xd18, 0xd2c, 0xd30, 0xd34, 0xd38, 0xd3c, 0xd40, 0xd44, 0xd48, 0xd4c, 0xd50, 0xd54, 0xd58, 0xd5c, 0xd60, 0xd64, 0xd68, 0xd6c, 0xd70, 0xd74, 0xd78, 0xe00, 0xe04, 0xe08, 0xe10, 0xe14, 0xe18, 0xe1c, 0xe28, 0xe30, 0xe34, 0xe38, 0xe3c, 0xe40, 0xe44, 0xe48, 0xe4c, 0xe50, 0xe54, 0xe58, 0xe5c, 0xe60, 0xe68, 0xe6c, 0xe70, 0xe74, 0xe78, 0xe7c, 0xe80, 0xe84, 0xe88, 0xe8c, 0xed0, 0xed4, 0xed8, 0xedc, 0xee0, 0xeec, 0xf14, 0xf4c, 0xf00 }; static const uint32_t rtl8192ce_bb_vals[] = { 0x0011800d, 0x00ffdb83, 0x80040002, 0x00000003, 0x0000fc00, 0x0000000a, 0x10005388, 0x020c3d10, 0x02200385, 0x00000000, 0x01000100, 0x00390004, 0x01000100, 0x00390004, 0x27272727, 0x27272727, 0x27272727, 0x27272727, 0x00010000, 0x00010000, 0x27272727, 0x27272727, 0x00000000, 0x00000000, 0x569a569a, 0x0c1b25a4, 0x66e60230, 0x061f0130, 0x27272727, 0x2b2b2b27, 0x07000700, 0x22184000, 0x08080808, 0x00000000, 0xc0083070, 0x000004d5, 0x00000000, 0xcc0000c0, 0x00000800, 0xfffffffe, 0x40302010, 0x00706050, 0x00000000, 0x00000023, 0x00000000, 0x81121313, 0x00d047c8, 0x80ff000c, 0x8c838300, 0x2e68120f, 0x9500bb78, 0x11144028, 0x00881117, 0x89140f00, 0x1a1b0000, 0x090e1317, 0x00000204, 0x00d30000, 0x101fbf00, 0x00000007, 0x48071d40, 0x03a05633, 0x000000e4, 0x6c6c6c6c, 0x08800000, 0x40000100, 0x08800000, 0x40000100, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x69e9ac44, 0x469652cf, 0x49795994, 0x0a97971c, 0x1f7c403f, 0x000100b7, 0xec020107, 0x007f037f, 0x6954341e, 0x43bc0094, 0x6954341e, 0x433c0094, 0x00000000, 0x5116848b, 0x47c00bff, 0x00000036, 0x2c7f000d, 0x018610db, 0x0000001f, 0x00b91612, 0x40000100, 0x20f60000, 0x40000100, 0x20200000, 0x00121820, 0x00000000, 0x00121820, 0x00007f7f, 0x00000000, 0x00000080, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x28000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x64b22427, 0x00766932, 0x00222222, 0x00000000, 0x37644302, 0x2f97d40c, 0x00080740, 0x00020403, 0x0000907f, 0x20010201, 0xa0633333, 0x3333bc43, 0x7a8f5b6b, 0xcc979975, 0x00000000, 0x80608000, 0x00000000, 0x00027293, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x6437140a, 0x00000000, 0x00000000, 0x30032064, 0x4653de68, 0x04518a3c, 0x00002101, 0x2a201c16, 0x1812362e, 0x322c2220, 0x000e3c24, 0x2a2a2a2a, 0x2a2a2a2a, 0x03902a2a, 0x2a2a2a2a, 0x2a2a2a2a, 0x2a2a2a2a, 0x2a2a2a2a, 0x00000000, 0x1000dc1f, 0x10008c1f, 0x02140102, 0x681604c2, 0x01007c00, 0x01004800, 0xfb000000, 0x000028d1, 0x1000dc1f, 0x10008c1f, 0x02140102, 0x28160d05, 0x00000010, 0x001b25a4, 0x63db25a4, 0x63db25a4, 0x0c1b25a4, 0x0c1b25a4, 0x0c1b25a4, 0x0c1b25a4, 0x63db25a4, 0x0c1b25a4, 0x63db25a4, 0x63db25a4, 0x63db25a4, 0x63db25a4, 0x001b25a4, 0x001b25a4, 0x6fdb25a4, 0x00000003, 0x00000000, 0x00000300 }; static const uint32_t rtl8192ce_agc_vals[] = { 0x7b000001, 0x7b010001, 0x7b020001, 0x7b030001, 0x7b040001, 0x7b050001, 0x7a060001, 0x79070001, 0x78080001, 0x77090001, 0x760a0001, 0x750b0001, 0x740c0001, 0x730d0001, 0x720e0001, 0x710f0001, 0x70100001, 0x6f110001, 0x6e120001, 0x6d130001, 0x6c140001, 0x6b150001, 0x6a160001, 0x69170001, 0x68180001, 0x67190001, 0x661a0001, 0x651b0001, 0x641c0001, 0x631d0001, 0x621e0001, 0x611f0001, 0x60200001, 0x49210001, 0x48220001, 0x47230001, 0x46240001, 0x45250001, 0x44260001, 0x43270001, 0x42280001, 0x41290001, 0x402a0001, 0x262b0001, 0x252c0001, 0x242d0001, 0x232e0001, 0x222f0001, 0x21300001, 0x20310001, 0x06320001, 0x05330001, 0x04340001, 0x03350001, 0x02360001, 0x01370001, 0x00380001, 0x00390001, 0x003a0001, 0x003b0001, 0x003c0001, 0x003d0001, 0x003e0001, 0x003f0001, 0x7b400001, 0x7b410001, 0x7b420001, 0x7b430001, 0x7b440001, 0x7b450001, 0x7a460001, 0x79470001, 0x78480001, 0x77490001, 0x764a0001, 0x754b0001, 0x744c0001, 0x734d0001, 0x724e0001, 0x714f0001, 0x70500001, 0x6f510001, 0x6e520001, 0x6d530001, 0x6c540001, 0x6b550001, 0x6a560001, 0x69570001, 0x68580001, 0x67590001, 0x665a0001, 0x655b0001, 0x645c0001, 0x635d0001, 0x625e0001, 0x615f0001, 0x60600001, 0x49610001, 0x48620001, 0x47630001, 0x46640001, 0x45650001, 0x44660001, 0x43670001, 0x42680001, 0x41690001, 0x406a0001, 0x266b0001, 0x256c0001, 0x246d0001, 0x236e0001, 0x226f0001, 0x21700001, 0x20710001, 0x06720001, 0x05730001, 0x04740001, 0x03750001, 0x02760001, 0x01770001, 0x00780001, 0x00790001, 0x007a0001, 0x007b0001, 0x007c0001, 0x007d0001, 0x007e0001, 0x007f0001, 0x3800001e, 0x3801001e, 0x3802001e, 0x3803001e, 0x3804001e, 0x3805001e, 0x3806001e, 0x3807001e, 0x3808001e, 0x3c09001e, 0x3e0a001e, 0x400b001e, 0x440c001e, 0x480d001e, 0x4c0e001e, 0x500f001e, 0x5210001e, 0x5611001e, 0x5a12001e, 0x5e13001e, 0x6014001e, 0x6015001e, 0x6016001e, 0x6217001e, 0x6218001e, 0x6219001e, 0x621a001e, 0x621b001e, 0x621c001e, 0x621d001e, 0x621e001e, 0x621f001e }; static const struct urtwn_bb_prog rtl8192ce_bb_prog = { nitems(rtl8192ce_bb_regs), rtl8192ce_bb_regs, rtl8192ce_bb_vals, nitems(rtl8192ce_agc_vals), rtl8192ce_agc_vals }; /* * RTL8188CU. */ static const uint32_t rtl8192cu_bb_vals[] = { 0x0011800d, 0x00ffdb83, 0x80040002, 0x00000003, 0x0000fc00, 0x0000000a, 0x10005388, 0x020c3d10, 0x02200385, 0x00000000, 0x01000100, 0x00390004, 0x01000100, 0x00390004, 0x27272727, 0x27272727, 0x27272727, 0x27272727, 0x00010000, 0x00010000, 0x27272727, 0x27272727, 0x00000000, 0x00000000, 0x569a569a, 0x0c1b25a4, 0x66e60230, 0x061f0130, 0x27272727, 0x2b2b2b27, 0x07000700, 0x22184000, 0x08080808, 0x00000000, 0xc0083070, 0x000004d5, 0x00000000, 0xcc0000c0, 0x00000800, 0xfffffffe, 0x40302010, 0x00706050, 0x00000000, 0x00000023, 0x00000000, 0x81121313, 0x00d047c8, 0x80ff000c, 0x8c838300, 0x2e68120f, 0x9500bb78, 0x11144028, 0x00881117, 0x89140f00, 0x1a1b0000, 0x090e1317, 0x00000204, 0x00d30000, 0x101fbf00, 0x00000007, 0x48071d40, 0x03a05633, 0x000000e4, 0x6c6c6c6c, 0x08800000, 0x40000100, 0x08800000, 0x40000100, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x69e9ac44, 0x469652cf, 0x49795994, 0x0a97971c, 0x1f7c403f, 0x000100b7, 0xec020107, 0x007f037f, 0x6954341e, 0x43bc0094, 0x6954341e, 0x433c0094, 0x00000000, 0x5116848b, 0x47c00bff, 0x00000036, 0x2c7f000d, 0x0186115b, 0x0000001f, 0x00b99612, 0x40000100, 0x20f60000, 0x40000100, 0x20200000, 0x00121820, 0x00000000, 0x00121820, 0x00007f7f, 0x00000000, 0x00000080, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x28000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x64b22427, 0x00766932, 0x00222222, 0x00000000, 0x37644302, 0x2f97d40c, 0x00080740, 0x00020403, 0x0000907f, 0x20010201, 0xa0633333, 0x3333bc43, 0x7a8f5b6b, 0xcc979975, 0x00000000, 0x80608000, 0x00000000, 0x00027293, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x6437140a, 0x00000000, 0x00000000, 0x30032064, 0x4653de68, 0x04518a3c, 0x00002101, 0x2a201c16, 0x1812362e, 0x322c2220, 0x000e3c24, 0x2a2a2a2a, 0x2a2a2a2a, 0x03902a2a, 0x2a2a2a2a, 0x2a2a2a2a, 0x2a2a2a2a, 0x2a2a2a2a, 0x00000000, 0x1000dc1f, 0x10008c1f, 0x02140102, 0x681604c2, 0x01007c00, 0x01004800, 0xfb000000, 0x000028d1, 0x1000dc1f, 0x10008c1f, 0x02140102, 0x28160d05, 0x00000010, 0x001b25a4, 0x63db25a4, 0x63db25a4, 0x0c1b25a4, 0x0c1b25a4, 0x0c1b25a4, 0x0c1b25a4, 0x63db25a4, 0x0c1b25a4, 0x63db25a4, 0x63db25a4, 0x63db25a4, 0x63db25a4, 0x001b25a4, 0x001b25a4, 0x6fdb25a4, 0x00000003, 0x00000000, 0x00000300 }; static const struct urtwn_bb_prog rtl8192cu_bb_prog = { nitems(rtl8192ce_bb_regs), rtl8192ce_bb_regs, rtl8192cu_bb_vals, nitems(rtl8192ce_agc_vals), rtl8192ce_agc_vals }; /* * RTL8188CE-VAU. */ static const uint32_t rtl8188ce_bb_vals[] = { 0x0011800d, 0x00ffdb83, 0x80040000, 0x00000001, 0x0000fc00, 0x0000000a, 0x10005388, 0x020c3d10, 0x02200385, 0x00000000, 0x01000100, 0x00390004, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00010000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x569a569a, 0x001b25a4, 0x66e60230, 0x061f0130, 0x00000000, 0x32323200, 0x07000700, 0x22004000, 0x00000808, 0x00000000, 0xc0083070, 0x000004d5, 0x00000000, 0xccc000c0, 0x00000800, 0xfffffffe, 0x40302010, 0x00706050, 0x00000000, 0x00000023, 0x00000000, 0x81121111, 0x00d047c8, 0x80ff000c, 0x8c838300, 0x2e68120f, 0x9500bb78, 0x11144028, 0x00881117, 0x89140f00, 0x1a1b0000, 0x090e1317, 0x00000204, 0x00d30000, 0x101fbf00, 0x00000007, 0x48071d40, 0x03a05611, 0x000000e4, 0x6c6c6c6c, 0x08800000, 0x40000100, 0x08800000, 0x40000100, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x69e9ac44, 0x469652cf, 0x49795994, 0x0a97971c, 0x1f7c403f, 0x000100b7, 0xec020107, 0x007f037f, 0x6954341e, 0x43bc0094, 0x6954341e, 0x433c0094, 0x00000000, 0x5116848b, 0x47c00bff, 0x00000036, 0x2c7f000d, 0x018610db, 0x0000001f, 0x00b91612, 0x40000100, 0x20f60000, 0x40000100, 0x20200000, 0x00121820, 0x00000000, 0x00121820, 0x00007f7f, 0x00000000, 0x00000080, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x28000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x64b22427, 0x00766932, 0x00222222, 0x00000000, 0x37644302, 0x2f97d40c, 0x00080740, 0x00020401, 0x0000907f, 0x20010201, 0xa0633333, 0x3333bc43, 0x7a8f5b6b, 0xcc979975, 0x00000000, 0x80608000, 0x00000000, 0x00027293, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x6437140a, 0x00000000, 0x00000000, 0x30032064, 0x4653de68, 0x04518a3c, 0x00002101, 0x2a201c16, 0x1812362e, 0x322c2220, 0x000e3c24, 0x2a2a2a2a, 0x2a2a2a2a, 0x03902a2a, 0x2a2a2a2a, 0x2a2a2a2a, 0x2a2a2a2a, 0x2a2a2a2a, 0x00000000, 0x1000dc1f, 0x10008c1f, 0x02140102, 0x681604c2, 0x01007c00, 0x01004800, 0xfb000000, 0x000028d1, 0x1000dc1f, 0x10008c1f, 0x02140102, 0x28160d05, 0x00000008, 0x001b25a4, 0x631b25a0, 0x631b25a0, 0x081b25a0, 0x081b25a0, 0x081b25a0, 0x081b25a0, 0x631b25a0, 0x081b25a0, 0x631b25a0, 0x631b25a0, 0x631b25a0, 0x631b25a0, 0x001b25a0, 0x001b25a0, 0x6b1b25a0, 0x00000003, 0x00000000, 0x00000300 }; static const uint32_t rtl8188ce_agc_vals[] = { 0x7b000001, 0x7b010001, 0x7b020001, 0x7b030001, 0x7b040001, 0x7b050001, 0x7a060001, 0x79070001, 0x78080001, 0x77090001, 0x760a0001, 0x750b0001, 0x740c0001, 0x730d0001, 0x720e0001, 0x710f0001, 0x70100001, 0x6f110001, 0x6e120001, 0x6d130001, 0x6c140001, 0x6b150001, 0x6a160001, 0x69170001, 0x68180001, 0x67190001, 0x661a0001, 0x651b0001, 0x641c0001, 0x631d0001, 0x621e0001, 0x611f0001, 0x60200001, 0x49210001, 0x48220001, 0x47230001, 0x46240001, 0x45250001, 0x44260001, 0x43270001, 0x42280001, 0x41290001, 0x402a0001, 0x262b0001, 0x252c0001, 0x242d0001, 0x232e0001, 0x222f0001, 0x21300001, 0x20310001, 0x06320001, 0x05330001, 0x04340001, 0x03350001, 0x02360001, 0x01370001, 0x00380001, 0x00390001, 0x003a0001, 0x003b0001, 0x003c0001, 0x003d0001, 0x003e0001, 0x003f0001, 0x7b400001, 0x7b410001, 0x7b420001, 0x7b430001, 0x7b440001, 0x7b450001, 0x7a460001, 0x79470001, 0x78480001, 0x77490001, 0x764a0001, 0x754b0001, 0x744c0001, 0x734d0001, 0x724e0001, 0x714f0001, 0x70500001, 0x6f510001, 0x6e520001, 0x6d530001, 0x6c540001, 0x6b550001, 0x6a560001, 0x69570001, 0x68580001, 0x67590001, 0x665a0001, 0x655b0001, 0x645c0001, 0x635d0001, 0x625e0001, 0x615f0001, 0x60600001, 0x49610001, 0x48620001, 0x47630001, 0x46640001, 0x45650001, 0x44660001, 0x43670001, 0x42680001, 0x41690001, 0x406a0001, 0x266b0001, 0x256c0001, 0x246d0001, 0x236e0001, 0x226f0001, 0x21700001, 0x20710001, 0x06720001, 0x05730001, 0x04740001, 0x03750001, 0x02760001, 0x01770001, 0x00780001, 0x00790001, 0x007a0001, 0x007b0001, 0x007c0001, 0x007d0001, 0x007e0001, 0x007f0001, 0x3800001e, 0x3801001e, 0x3802001e, 0x3803001e, 0x3804001e, 0x3805001e, 0x3806001e, 0x3807001e, 0x3808001e, 0x3c09001e, 0x3e0a001e, 0x400b001e, 0x440c001e, 0x480d001e, 0x4c0e001e, 0x500f001e, 0x5210001e, 0x5611001e, 0x5a12001e, 0x5e13001e, 0x6014001e, 0x6015001e, 0x6016001e, 0x6217001e, 0x6218001e, 0x6219001e, 0x621a001e, 0x621b001e, 0x621c001e, 0x621d001e, 0x621e001e, 0x621f001e }; static const struct urtwn_bb_prog rtl8188ce_bb_prog = { nitems(rtl8192ce_bb_regs), rtl8192ce_bb_regs, rtl8188ce_bb_vals, nitems(rtl8188ce_agc_vals), rtl8188ce_agc_vals }; static const uint32_t rtl8188cu_bb_vals[] = { 0x0011800d, 0x00ffdb83, 0x80040000, 0x00000001, 0x0000fc00, 0x0000000a, 0x10005388, 0x020c3d10, 0x02200385, 0x00000000, 0x01000100, 0x00390004, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00010000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x569a569a, 0x001b25a4, 0x66e60230, 0x061f0130, 0x00000000, 0x32323200, 0x07000700, 0x22004000, 0x00000808, 0x00000000, 0xc0083070, 0x000004d5, 0x00000000, 0xccc000c0, 0x00000800, 0xfffffffe, 0x40302010, 0x00706050, 0x00000000, 0x00000023, 0x00000000, 0x81121111, 0x00d047c8, 0x80ff000c, 0x8c838300, 0x2e68120f, 0x9500bb78, 0x11144028, 0x00881117, 0x89140f00, 0x1a1b0000, 0x090e1317, 0x00000204, 0x00d30000, 0x101fbf00, 0x00000007, 0x48071d40, 0x03a05611, 0x000000e4, 0x6c6c6c6c, 0x08800000, 0x40000100, 0x08800000, 0x40000100, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x69e9ac44, 0x469652cf, 0x49795994, 0x0a97971c, 0x1f7c403f, 0x000100b7, 0xec020107, 0x007f037f, 0x6954341e, 0x43bc0094, 0x6954341e, 0x433c0094, 0x00000000, 0x5116848b, 0x47c00bff, 0x00000036, 0x2c7f000d, 0x018610db, 0x0000001f, 0x00b91612, 0x40000100, 0x20f60000, 0x40000100, 0x20200000, 0x00121820, 0x00000000, 0x00121820, 0x00007f7f, 0x00000000, 0x00000080, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x28000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x64b22427, 0x00766932, 0x00222222, 0x00000000, 0x37644302, 0x2f97d40c, 0x00080740, 0x00020401, 0x0000907f, 0x20010201, 0xa0633333, 0x3333bc43, 0x7a8f5b6b, 0xcc979975, 0x00000000, 0x80608000, 0x00000000, 0x00027293, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x6437140a, 0x00000000, 0x00000000, 0x30032064, 0x4653de68, 0x04518a3c, 0x00002101, 0x2a201c16, 0x1812362e, 0x322c2220, 0x000e3c24, 0x2a2a2a2a, 0x2a2a2a2a, 0x03902a2a, 0x2a2a2a2a, 0x2a2a2a2a, 0x2a2a2a2a, 0x2a2a2a2a, 0x00000000, 0x1000dc1f, 0x10008c1f, 0x02140102, 0x681604c2, 0x01007c00, 0x01004800, 0xfb000000, 0x000028d1, 0x1000dc1f, 0x10008c1f, 0x02140102, 0x28160d05, 0x00000008, 0x001b25a4, 0x631b25a0, 0x631b25a0, 0x081b25a0, 0x081b25a0, 0x081b25a0, 0x081b25a0, 0x631b25a0, 0x081b25a0, 0x631b25a0, 0x631b25a0, 0x631b25a0, 0x631b25a0, 0x001b25a0, 0x001b25a0, 0x6b1b25a0, 0x00000003, 0x00000000, 0x00000300 }; static const struct urtwn_bb_prog rtl8188cu_bb_prog = { nitems(rtl8192ce_bb_regs), rtl8192ce_bb_regs, rtl8188cu_bb_vals, nitems(rtl8188ce_agc_vals), rtl8188ce_agc_vals }; /* * RTL8188EU. */ static const uint16_t rtl8188eu_bb_regs[] = { 0x800, 0x804, 0x808, 0x80c, 0x810, 0x814, 0x818, 0x81c, 0x820, 0x824, 0x828, 0x82c, 0x830, 0x834, 0x838, 0x83c, 0x840, 0x844, 0x848, 0x84c, 0x850, 0x854, 0x858, 0x85c, 0x860, 0x864, 0x868, 0x86c, 0x870, 0x874, 0x878, 0x87c, 0x880, 0x884, 0x888, 0x88c, 0x890, 0x894, 0x898, 0x89c, 0x900, 0x904, 0x908, 0x90c, 0x910, 0x914, 0xa00, 0xa04, 0xa08, 0xa0c, 0xa10, 0xa14, 0xa18, 0xa1c, 0xa20, 0xa24, 0xa28, 0xa2c, 0xa70, 0xa74, 0xa78, 0xa7c, 0xa80, 0xb2c, 0xc00, 0xc04, 0xc08, 0xc0c, 0xc10, 0xc14, 0xc18, 0xc1c, 0xc20, 0xc24, 0xc28, 0xc2c, 0xc30, 0xc34, 0xc38, 0xc3c, 0xc40, 0xc44, 0xc48, 0xc4c, 0xc50, 0xc54, 0xc58, 0xc5c, 0xc60, 0xc64, 0xc68, 0xc6c, 0xc70, 0xc74, 0xc78, 0xc7c, 0xc80, 0xc84, 0xc88, 0xc8c, 0xc90, 0xc94, 0xc98, 0xc9c, 0xca0, 0xca4, 0xca8, 0xcac, 0xcb0, 0xcb4, 0xcb8, 0xcbc, 0xcc0, 0xcc4, 0xcc8, 0xccc, 0xcd0, 0xcd4, 0xcd8, 0xcdc, 0xce0, 0xce4, 0xce8, 0xcec, 0xd00, 0xd04, 0xd08, 0xd0c, 0xd10, 0xd14, 0xd18, 0xd2c, 0xd30, 0xd34, 0xd38, 0xd3c, 0xd40, 0xd44, 0xd48, 0xd4c, 0xd50, 0xd54, 0xd58, 0xd5c, 0xd60, 0xd64, 0xd68, 0xd6c, 0xd70, 0xd74, 0xd78, 0xe00, 0xe04, 0xe08, 0xe10, 0xe14, 0xe18, 0xe1c, 0xe28, 0xe30, 0xe34, 0xe38, 0xe3c, 0xe40, 0xe44, 0xe48, 0xe4c, 0xe50, 0xe54, 0xe58, 0xe5c, 0xe60, 0xe68, 0xe6c, 0xe70, 0xe74, 0xe78, 0xe7c, 0xe80, 0xe84, 0xe88, 0xe8c, 0xed0, 0xed4, 0xed8, 0xedc, 0xee0, 0xee8, 0xeec, 0xf14, 0xf4c, 0xf00 }; static const uint32_t rtl8188eu_bb_vals[] = { 0x80040000, 0x00000003, 0x0000fc00, 0x0000000a, 0x10001331, 0x020c3d10, 0x02200385, 0x00000000, 0x01000100, 0x00390204, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00010000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x569a11a9, 0x01000014, 0x66f60110, 0x061f0649, 0x00000000, 0x27272700, 0x07000760, 0x25004000, 0x00000808, 0x00000000, 0xb0000c1c, 0x00000001, 0x00000000, 0xccc000c0, 0x00000800, 0xfffffffe, 0x40302010, 0x00706050, 0x00000000, 0x00000023, 0x00000000, 0x81121111, 0x00000002, 0x00000201, 0x00d047c8, 0x80ff000c, 0x8c838300, 0x2e7f120f, 0x9500bb78, 0x1114d028, 0x00881117, 0x89140f00, 0x1a1b0000, 0x090e1317, 0x00000204, 0x00d30000, 0x101fbf00, 0x00000007, 0x00000900, 0x225b0606, 0x218075b1, 0x80000000, 0x48071d40, 0x03a05611, 0x000000e4, 0x6c6c6c6c, 0x08800000, 0x40000100, 0x08800000, 0x40000100, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x69e9ac47, 0x469652af, 0x49795994, 0x0a97971c, 0x1f7c403f, 0x000100b7, 0xec020107, 0x007f037f, 0x69553420, 0x43bc0094, 0x00013169, 0x00250492, 0x00000000, 0x7112848b, 0x47c00bff, 0x00000036, 0x2c7f000d, 0x020610db, 0x0000001f, 0x00b91612, 0x390000e4, 0x20f60000, 0x40000100, 0x20200000, 0x00091521, 0x00000000, 0x00121820, 0x00007f7f, 0x00000000, 0x000300a0, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x28000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x64b22427, 0x00766932, 0x00222222, 0x00000000, 0x37644302, 0x2f97d40c, 0x00000740, 0x00020401, 0x0000907f, 0x20010201, 0xa0633333, 0x3333bc43, 0x7a8f5b6f, 0xcc979975, 0x00000000, 0x80608000, 0x00000000, 0x00127353, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x6437140a, 0x00000000, 0x00000282, 0x30032064, 0x4653de68, 0x04518a3c, 0x00002101, 0x2a201c16, 0x1812362e, 0x322c2220, 0x000e3c24, 0x2d2d2d2d, 0x2d2d2d2d, 0x0390272d, 0x2d2d2d2d, 0x2d2d2d2d, 0x2d2d2d2d, 0x2d2d2d2d, 0x00000000, 0x1000dc1f, 0x10008c1f, 0x02140102, 0x681604c2, 0x01007c00, 0x01004800, 0xfb000000, 0x000028d1, 0x1000dc1f, 0x10008c1f, 0x02140102, 0x28160d05, 0x00000008, 0x001b25a4, 0x00c00014, 0x00c00014, 0x01000014, 0x01000014, 0x01000014, 0x01000014, 0x00c00014, 0x01000014, 0x00c00014, 0x00c00014, 0x00c00014, 0x00c00014, 0x00000014, 0x00000014, 0x21555448, 0x01c00014, 0x00000003, 0x00000000, 0x00000300 }; static const uint32_t rtl8188eu_agc_vals[] = { 0xfb000001, 0xfb010001, 0xfb020001, 0xfb030001, 0xfb040001, 0xfb050001, 0xfa060001, 0xf9070001, 0xf8080001, 0xf7090001, 0xf60a0001, 0xf50b0001, 0xf40c0001, 0xf30d0001, 0xf20e0001, 0xf10f0001, 0xf0100001, 0xef110001, 0xee120001, 0xed130001, 0xec140001, 0xeb150001, 0xea160001, 0xe9170001, 0xe8180001, 0xe7190001, 0xe61a0001, 0xe51b0001, 0xe41c0001, 0xe31d0001, 0xe21e0001, 0xe11f0001, 0x8a200001, 0x89210001, 0x88220001, 0x87230001, 0x86240001, 0x85250001, 0x84260001, 0x83270001, 0x82280001, 0x6b290001, 0x6a2a0001, 0x692b0001, 0x682c0001, 0x672d0001, 0x662e0001, 0x652f0001, 0x64300001, 0x63310001, 0x62320001, 0x61330001, 0x46340001, 0x45350001, 0x44360001, 0x43370001, 0x42380001, 0x41390001, 0x403a0001, 0x403b0001, 0x403c0001, 0x403d0001, 0x403e0001, 0x403f0001, 0xfb400001, 0xfb410001, 0xfb420001, 0xfb430001, 0xfb440001, 0xfb450001, 0xfb460001, 0xfb470001, 0xfb480001, 0xfa490001, 0xf94a0001, 0xf84B0001, 0xf74c0001, 0xf64d0001, 0xf54e0001, 0xf44f0001, 0xf3500001, 0xf2510001, 0xf1520001, 0xf0530001, 0xef540001, 0xee550001, 0xed560001, 0xec570001, 0xeb580001, 0xea590001, 0xe95a0001, 0xe85b0001, 0xe75c0001, 0xe65d0001, 0xe55e0001, 0xe45f0001, 0xe3600001, 0xe2610001, 0xc3620001, 0xc2630001, 0xc1640001, 0x8b650001, 0x8a660001, 0x89670001, 0x88680001, 0x87690001, 0x866a0001, 0x856b0001, 0x846c0001, 0x676d0001, 0x666e0001, 0x656f0001, 0x64700001, 0x63710001, 0x62720001, 0x61730001, 0x60740001, 0x46750001, 0x45760001, 0x44770001, 0x43780001, 0x42790001, 0x417a0001, 0x407b0001, 0x407c0001, 0x407d0001, 0x407e0001, 0x407f0001 }; static const struct urtwn_bb_prog rtl8188eu_bb_prog = { nitems(rtl8188eu_bb_regs), rtl8188eu_bb_regs, rtl8188eu_bb_vals, nitems(rtl8188eu_agc_vals), rtl8188eu_agc_vals }; /* * RTL8188RU. */ static const uint16_t rtl8188ru_bb_regs[] = { 0x024, 0x028, 0x040, 0x800, 0x804, 0x808, 0x80c, 0x810, 0x814, 0x818, 0x81c, 0x820, 0x824, 0x828, 0x82c, 0x830, 0x834, 0x838, 0x83c, 0x840, 0x844, 0x848, 0x84c, 0x850, 0x854, 0x858, 0x85c, 0x860, 0x864, 0x868, 0x86c, 0x870, 0x874, 0x878, 0x87c, 0x880, 0x884, 0x888, 0x88c, 0x890, 0x894, 0x898, 0x89c, 0x900, 0x904, 0x908, 0x90c, 0xa00, 0xa04, 0xa08, 0xa0c, 0xa10, 0xa14, 0xa18, 0xa1c, 0xa20, 0xa24, 0xa28, 0xa2c, 0xa70, 0xa74, 0xc00, 0xc04, 0xc08, 0xc0c, 0xc10, 0xc14, 0xc18, 0xc1c, 0xc20, 0xc24, 0xc28, 0xc2c, 0xc30, 0xc34, 0xc38, 0xc3c, 0xc40, 0xc44, 0xc48, 0xc4c, 0xc50, 0xc54, 0xc58, 0xc5c, 0xc60, 0xc64, 0xc68, 0xc6c, 0xc70, 0xc74, 0xc78, 0xc7c, 0xc80, 0xc84, 0xc88, 0xc8c, 0xc90, 0xc94, 0xc98, 0xc9c, 0xca0, 0xca4, 0xca8, 0xcac, 0xcb0, 0xcb4, 0xcb8, 0xcbc, 0xcc0, 0xcc4, 0xcc8, 0xccc, 0xcd0, 0xcd4, 0xcd8, 0xcdc, 0xce0, 0xce4, 0xce8, 0xcec, 0xd00, 0xd04, 0xd08, 0xd0c, 0xd10, 0xd14, 0xd18, 0xd2c, 0xd30, 0xd34, 0xd38, 0xd3c, 0xd40, 0xd44, 0xd48, 0xd4c, 0xd50, 0xd54, 0xd58, 0xd5c, 0xd60, 0xd64, 0xd68, 0xd6c, 0xd70, 0xd74, 0xd78, 0xe00, 0xe04, 0xe08, 0xe10, 0xe14, 0xe18, 0xe1c, 0xe28, 0xe30, 0xe34, 0xe38, 0xe3c, 0xe40, 0xe44, 0xe48, 0xe4c, 0xe50, 0xe54, 0xe58, 0xe5c, 0xe60, 0xe68, 0xe6c, 0xe70, 0xe74, 0xe78, 0xe7c, 0xe80, 0xe84, 0xe88, 0xe8c, 0xed0, 0xed4, 0xed8, 0xedc, 0xee0, 0xeec, 0xee8, 0xf14, 0xf4c, 0xf00 }; static const uint32_t rtl8188ru_bb_vals[] = { 0x0011800d, 0x00ffdb83, 0x000c0004, 0x80040000, 0x00000001, 0x0000fc00, 0x0000000a, 0x10005388, 0x020c3d10, 0x02200385, 0x00000000, 0x01000100, 0x00390204, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00010000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x569a569a, 0x001b25a4, 0x66e60230, 0x061f0130, 0x00000000, 0x32323200, 0x03000300, 0x22004000, 0x00000808, 0x00ffc3f1, 0xc0083070, 0x000004d5, 0x00000000, 0xccc000c0, 0x00000800, 0xfffffffe, 0x40302010, 0x00706050, 0x00000000, 0x00000023, 0x00000000, 0x81121111, 0x00d047c8, 0x80ff000c, 0x8c838300, 0x2e68120f, 0x9500bb78, 0x11144028, 0x00881117, 0x89140f00, 0x15160000, 0x070b0f12, 0x00000104, 0x00d30000, 0x101fbf00, 0x00000007, 0x48071d40, 0x03a05611, 0x000000e4, 0x6c6c6c6c, 0x08800000, 0x40000100, 0x08800000, 0x40000100, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x69e9ac44, 0x469652cf, 0x49795994, 0x0a97971c, 0x1f7c403f, 0x000100b7, 0xec020107, 0x007f037f, 0x6954342e, 0x43bc0094, 0x6954342f, 0x433c0094, 0x00000000, 0x5116848b, 0x47c00bff, 0x00000036, 0x2c56000d, 0x018610db, 0x0000001f, 0x00b91612, 0x24000090, 0x20f60000, 0x24000090, 0x20200000, 0x00121820, 0x00000000, 0x00121820, 0x00007f7f, 0x00000000, 0x00000080, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x28000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x64b22427, 0x00766932, 0x00222222, 0x00000000, 0x37644302, 0x2f97d40c, 0x00080740, 0x00020401, 0x0000907f, 0x20010201, 0xa0633333, 0x3333bc43, 0x7a8f5b6b, 0xcc979975, 0x00000000, 0x80608000, 0x00000000, 0x00027293, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x6437140a, 0x00000000, 0x00000000, 0x30032064, 0x4653de68, 0x04518a3c, 0x00002101, 0x2a201c16, 0x1812362e, 0x322c2220, 0x000e3c24, 0x2a2a2a2a, 0x2a2a2a2a, 0x03902a2a, 0x2a2a2a2a, 0x2a2a2a2a, 0x2a2a2a2a, 0x2a2a2a2a, 0x00000000, 0x1000dc1f, 0x10008c1f, 0x02140102, 0x681604c2, 0x01007c00, 0x01004800, 0xfb000000, 0x000028d1, 0x1000dc1f, 0x10008c1f, 0x02140102, 0x28160d05, 0x00000010, 0x001b25a4, 0x631b25a0, 0x631b25a0, 0x081b25a0, 0x081b25a0, 0x081b25a0, 0x081b25a0, 0x631b25a0, 0x081b25a0, 0x631b25a0, 0x631b25a0, 0x631b25a0, 0x631b25a0, 0x001b25a0, 0x001b25a0, 0x6b1b25a0, 0x31555448, 0x00000003, 0x00000000, 0x00000300 }; static const uint32_t rtl8188ru_agc_vals[] = { 0x7b000001, 0x7b010001, 0x7b020001, 0x7b030001, 0x7b040001, 0x7b050001, 0x7b060001, 0x7b070001, 0x7b080001, 0x7a090001, 0x790a0001, 0x780b0001, 0x770c0001, 0x760d0001, 0x750e0001, 0x740f0001, 0x73100001, 0x72110001, 0x71120001, 0x70130001, 0x6f140001, 0x6e150001, 0x6d160001, 0x6c170001, 0x6b180001, 0x6a190001, 0x691a0001, 0x681b0001, 0x671c0001, 0x661d0001, 0x651e0001, 0x641f0001, 0x63200001, 0x62210001, 0x61220001, 0x60230001, 0x46240001, 0x45250001, 0x44260001, 0x43270001, 0x42280001, 0x41290001, 0x402a0001, 0x262b0001, 0x252c0001, 0x242d0001, 0x232e0001, 0x222f0001, 0x21300001, 0x20310001, 0x06320001, 0x05330001, 0x04340001, 0x03350001, 0x02360001, 0x01370001, 0x00380001, 0x00390001, 0x003a0001, 0x003b0001, 0x003c0001, 0x003d0001, 0x003e0001, 0x003f0001, 0x7b400001, 0x7b410001, 0x7b420001, 0x7b430001, 0x7b440001, 0x7b450001, 0x7b460001, 0x7b470001, 0x7b480001, 0x7a490001, 0x794a0001, 0x784b0001, 0x774c0001, 0x764d0001, 0x754e0001, 0x744f0001, 0x73500001, 0x72510001, 0x71520001, 0x70530001, 0x6f540001, 0x6e550001, 0x6d560001, 0x6c570001, 0x6b580001, 0x6a590001, 0x695a0001, 0x685b0001, 0x675c0001, 0x665d0001, 0x655e0001, 0x645f0001, 0x63600001, 0x62610001, 0x61620001, 0x60630001, 0x46640001, 0x45650001, 0x44660001, 0x43670001, 0x42680001, 0x41690001, 0x406a0001, 0x266b0001, 0x256c0001, 0x246d0001, 0x236e0001, 0x226f0001, 0x21700001, 0x20710001, 0x06720001, 0x05730001, 0x04740001, 0x03750001, 0x02760001, 0x01770001, 0x00780001, 0x00790001, 0x007a0001, 0x007b0001, 0x007c0001, 0x007d0001, 0x007e0001, 0x007f0001, 0x3800001e, 0x3801001e, 0x3802001e, 0x3803001e, 0x3804001e, 0x3805001e, 0x3806001e, 0x3807001e, 0x3808001e, 0x3c09001e, 0x3e0a001e, 0x400b001e, 0x440c001e, 0x480d001e, 0x4c0e001e, 0x500f001e, 0x5210001e, 0x5611001e, 0x5a12001e, 0x5e13001e, 0x6014001e, 0x6015001e, 0x6016001e, 0x6217001e, 0x6218001e, 0x6219001e, 0x621a001e, 0x621b001e, 0x621c001e, 0x621d001e, 0x621e001e, 0x621f001e }; static const struct urtwn_bb_prog rtl8188ru_bb_prog = { nitems(rtl8188ru_bb_regs), rtl8188ru_bb_regs, rtl8188ru_bb_vals, nitems(rtl8188ru_agc_vals), rtl8188ru_agc_vals }; /* * RF initialization values. */ struct urtwn_rf_prog { int count; const uint8_t *regs; const uint32_t *vals; }; /* * RTL8192CU and RTL8192CE-VAU. */ static const uint8_t rtl8192ce_rf1_regs[] = { 0x00, 0x01, 0x02, 0x03, 0x04, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f, 0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27, 0x28, 0x29, 0x2a, 0x2b, 0x2a, 0x2b, 0x2a, 0x2b, 0x2b, 0x2c, 0x2a, 0x2b, 0x2b, 0x2c, 0x2a, 0x2b, 0x2b, 0x2c, 0x2a, 0x2b, 0x2b, 0x2c, 0x2a, 0x2b, 0x2b, 0x2c, 0x2a, 0x2b, 0x2b, 0x2c, 0x2a, 0x2b, 0x2b, 0x2c, 0x2a, 0x2b, 0x2b, 0x2c, 0x2a, 0x2b, 0x2b, 0x2c, 0x2a, 0x2b, 0x2b, 0x2c, 0x2a, 0x2b, 0x2b, 0x2c, 0x2a, 0x2b, 0x2b, 0x2c, 0x2a, 0x2b, 0x2b, 0x2c, 0x2a, 0x2b, 0x2b, 0x2c, 0x2a, 0x10, 0x11, 0x10, 0x11, 0x10, 0x11, 0x10, 0x11, 0x10, 0x11, 0x10, 0x11, 0x10, 0x11, 0x12, 0x12, 0x12, 0x12, 0x13, 0x13, 0x13, 0x13, 0x13, 0x13, 0x13, 0x13, 0x13, 0x13, 0x13, 0x14, 0x14, 0x14, 0x14, 0x15, 0x15, 0x15, 0x15, 0x16, 0x16, 0x16, 0x16, 0x00, 0x18, 0xfe, 0xfe, 0x1f, 0xfe, 0xfe, 0x1e, 0x1f, 0x00 }; static const uint32_t rtl8192ce_rf1_vals[] = { 0x30159, 0x31284, 0x98000, 0x18c63, 0x210e7, 0x2044f, 0x1adb1, 0x54867, 0x8992e, 0x0e52c, 0x39ce7, 0x00451, 0x00000, 0x10255, 0x60a00, 0xfc378, 0xa1250, 0x4445f, 0x80001, 0x0b614, 0x6c000, 0x00000, 0x01558, 0x00060, 0x00483, 0x4f000, 0xec7d9, 0x577c0, 0x04783, 0x00001, 0x21334, 0x00000, 0x00054, 0x00001, 0x00808, 0x53333, 0x0000c, 0x00002, 0x00808, 0x5b333, 0x0000d, 0x00003, 0x00808, 0x63333, 0x0000d, 0x00004, 0x00808, 0x6b333, 0x0000d, 0x00005, 0x00808, 0x73333, 0x0000d, 0x00006, 0x00709, 0x5b333, 0x0000d, 0x00007, 0x00709, 0x63333, 0x0000d, 0x00008, 0x0060a, 0x4b333, 0x0000d, 0x00009, 0x0060a, 0x53333, 0x0000d, 0x0000a, 0x0060a, 0x5b333, 0x0000d, 0x0000b, 0x0060a, 0x63333, 0x0000d, 0x0000c, 0x0060a, 0x6b333, 0x0000d, 0x0000d, 0x0060a, 0x73333, 0x0000d, 0x0000e, 0x0050b, 0x66666, 0x0001a, 0xe0000, 0x4000f, 0xe31fc, 0x6000f, 0xff9f8, 0x2000f, 0x203f9, 0x3000f, 0xff500, 0x00000, 0x00000, 0x8000f, 0x3f100, 0x9000f, 0x23100, 0x32000, 0x71000, 0xb0000, 0xfc000, 0x287af, 0x244b7, 0x204ab, 0x1c49f, 0x18493, 0x14297, 0x10295, 0x0c298, 0x0819c, 0x040a8, 0x0001c, 0x1944c, 0x59444, 0x9944c, 0xd9444, 0x0f424, 0x4f424, 0x8f424, 0xcf424, 0xe0330, 0xa0330, 0x60330, 0x20330, 0x10159, 0x0f401, 0x00000, 0x00000, 0x80003, 0x00000, 0x00000, 0x44457, 0x80000, 0x30159 }; static const uint8_t rtl8192ce_rf2_regs[] = { 0x00, 0x01, 0x02, 0x03, 0x04, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x12, 0x12, 0x12, 0x12, 0x13, 0x13, 0x13, 0x13, 0x13, 0x13, 0x13, 0x13, 0x13, 0x13, 0x13, 0x14, 0x14, 0x14, 0x14, 0x15, 0x15, 0x15, 0x15, 0x16, 0x16, 0x16, 0x16 }; static const uint32_t rtl8192ce_rf2_vals[] = { 0x30159, 0x31284, 0x98000, 0x18c63, 0x210e7, 0x2044f, 0x1adb1, 0x54867, 0x8992e, 0x0e52c, 0x39ce7, 0x00451, 0x32000, 0x71000, 0xb0000, 0xfc000, 0x287af, 0x244b7, 0x204ab, 0x1c49f, 0x18493, 0x14297, 0x10295, 0x0c298, 0x0819c, 0x040a8, 0x0001c, 0x1944c, 0x59444, 0x9944c, 0xd9444, 0x0f424, 0x4f424, 0x8f424, 0xcf424, 0xe0330, 0xa0330, 0x60330, 0x20330 }; static const struct urtwn_rf_prog rtl8192ce_rf_prog[] = { { nitems(rtl8192ce_rf1_regs), rtl8192ce_rf1_regs, rtl8192ce_rf1_vals }, { nitems(rtl8192ce_rf2_regs), rtl8192ce_rf2_regs, rtl8192ce_rf2_vals } }; /* * RTL8188CE-VAU. */ static const uint32_t rtl8188ce_rf_vals[] = { 0x30159, 0x31284, 0x98000, 0x18c63, 0x210e7, 0x2044f, 0x1adb1, 0x54867, 0x8992e, 0x0e52c, 0x39ce7, 0x00451, 0x00000, 0x10255, 0x60a00, 0xfc378, 0xa1250, 0x4445f, 0x80001, 0x0b614, 0x6c000, 0x00000, 0x01558, 0x00060, 0x00483, 0x4f200, 0xec7d9, 0x577c0, 0x04783, 0x00001, 0x21334, 0x00000, 0x00054, 0x00001, 0x00808, 0x53333, 0x0000c, 0x00002, 0x00808, 0x5b333, 0x0000d, 0x00003, 0x00808, 0x63333, 0x0000d, 0x00004, 0x00808, 0x6b333, 0x0000d, 0x00005, 0x00808, 0x73333, 0x0000d, 0x00006, 0x00709, 0x5b333, 0x0000d, 0x00007, 0x00709, 0x63333, 0x0000d, 0x00008, 0x0060a, 0x4b333, 0x0000d, 0x00009, 0x0060a, 0x53333, 0x0000d, 0x0000a, 0x0060a, 0x5b333, 0x0000d, 0x0000b, 0x0060a, 0x63333, 0x0000d, 0x0000c, 0x0060a, 0x6b333, 0x0000d, 0x0000d, 0x0060a, 0x73333, 0x0000d, 0x0000e, 0x0050b, 0x66666, 0x0001a, 0xe0000, 0x4000f, 0xe31fc, 0x6000f, 0xff9f8, 0x2000f, 0x203f9, 0x3000f, 0xff500, 0x00000, 0x00000, 0x8000f, 0x3f100, 0x9000f, 0x23100, 0x32000, 0x71000, 0xb0000, 0xfc000, 0x287b3, 0x244b7, 0x204ab, 0x1c49f, 0x18493, 0x1429b, 0x10299, 0x0c29c, 0x081a0, 0x040ac, 0x00020, 0x1944c, 0x59444, 0x9944c, 0xd9444, 0x0f424, 0x4f424, 0x8f424, 0xcf424, 0xe0330, 0xa0330, 0x60330, 0x20330, 0x10159, 0x0f401, 0x00000, 0x00000, 0x80003, 0x00000, 0x00000, 0x44457, 0x80000, 0x30159 }; static const struct urtwn_rf_prog rtl8188ce_rf_prog[] = { { nitems(rtl8192ce_rf1_regs), rtl8192ce_rf1_regs, rtl8188ce_rf_vals } }; /* * RTL8188CU. */ static const uint32_t rtl8188cu_rf_vals[] = { 0x30159, 0x31284, 0x98000, 0x18c63, 0x210e7, 0x2044f, 0x1adb1, 0x54867, 0x8992e, 0x0e52c, 0x39ce7, 0x00451, 0x00000, 0x10255, 0x60a00, 0xfc378, 0xa1250, 0x4445f, 0x80001, 0x0b614, 0x6c000, 0x00000, 0x01558, 0x00060, 0x00483, 0x4f000, 0xec7d9, 0x577c0, 0x04783, 0x00001, 0x21334, 0x00000, 0x00054, 0x00001, 0x00808, 0x53333, 0x0000c, 0x00002, 0x00808, 0x5b333, 0x0000d, 0x00003, 0x00808, 0x63333, 0x0000d, 0x00004, 0x00808, 0x6b333, 0x0000d, 0x00005, 0x00808, 0x73333, 0x0000d, 0x00006, 0x00709, 0x5b333, 0x0000d, 0x00007, 0x00709, 0x63333, 0x0000d, 0x00008, 0x0060a, 0x4b333, 0x0000d, 0x00009, 0x0060a, 0x53333, 0x0000d, 0x0000a, 0x0060a, 0x5b333, 0x0000d, 0x0000b, 0x0060a, 0x63333, 0x0000d, 0x0000c, 0x0060a, 0x6b333, 0x0000d, 0x0000d, 0x0060a, 0x73333, 0x0000d, 0x0000e, 0x0050b, 0x66666, 0x0001a, 0xe0000, 0x4000f, 0xe31fc, 0x6000f, 0xff9f8, 0x2000f, 0x203f9, 0x3000f, 0xff500, 0x00000, 0x00000, 0x8000f, 0x3f100, 0x9000f, 0x23100, 0x32000, 0x71000, 0xb0000, 0xfc000, 0x287b3, 0x244b7, 0x204ab, 0x1c49f, 0x18493, 0x1429b, 0x10299, 0x0c29c, 0x081a0, 0x040ac, 0x00020, 0x1944c, 0x59444, 0x9944c, 0xd9444, 0x0f405, 0x4f405, 0x8f405, 0xcf405, 0xe0330, 0xa0330, 0x60330, 0x20330, 0x10159, 0x0f401, 0x00000, 0x00000, 0x80003, 0x00000, 0x00000, 0x44457, 0x80000, 0x30159 }; static const struct urtwn_rf_prog rtl8188cu_rf_prog[] = { { nitems(rtl8192ce_rf1_regs), rtl8192ce_rf1_regs, rtl8188cu_rf_vals } }; /* * RTL8188EU. */ static const uint8_t rtl8188eu_rf_regs[] = { 0x00, 0x08, 0x18, 0x19, 0x1e, 0x1f, 0x2f, 0x3f, 0x42, 0x57, 0x58, 0x67, 0x83, 0xb0, 0xb1, 0xb2, 0xb4, 0xb6, 0xb7, 0xb8, 0xb9, 0xba, 0xbb, 0xbf, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7, 0xc8, 0xc9, 0xca, 0xdf, 0xef, 0x51, 0x52, 0x53, 0x56, 0x35, 0x35, 0x35, 0x36, 0x36, 0x36, 0x36, 0xb6, 0x18, 0x5a, 0x19, 0x34, 0x34, 0x34, 0x34, 0x34, 0x34, 0x34, 0x34, 0x34, 0x34, 0x34, 0x00, 0x84, 0x86, 0x87, 0x8e, 0x8f, 0xef, 0x3b, 0x3b, 0x3b, 0x3b, 0x3b, 0x3b, 0x3b, 0x3b, 0x3b, 0x3b, 0x3b, 0x3b, 0x3b, 0x3b, 0x3b, 0x3b, 0xef, 0x00, 0x18, 0xfe, 0xfe, 0x1f, 0xfe, 0xfe, 0x1e, 0x1f, 0x00 }; static const uint32_t rtl8188eu_rf_vals[] = { 0x30000, 0x84000, 0x00407, 0x00012, 0x80009, 0x00880, 0x1a060, 0x00000, 0x060c0, 0xd0000, 0xbe180, 0x01552, 0x00000, 0xff8fc, 0x54400, 0xccc19, 0x43003, 0x4953e, 0x1c718, 0x060ff, 0x80001, 0x40000, 0x00400, 0xc0000, 0x02400, 0x00009, 0x40c91, 0x99999, 0x000a3, 0x88820, 0x76c06, 0x00000, 0x80000, 0x00180, 0x001a0, 0x6b27d, 0x7e49d, 0x00073, 0x51ff3, 0x00086, 0x00186, 0x00286, 0x01c25, 0x09c25, 0x11c25, 0x19c25, 0x48538, 0x00c07, 0x4bd00, 0x739d0, 0x0adf3, 0x09df0, 0x08ded, 0x07dea, 0x06de7, 0x054ee, 0x044eb, 0x034e8, 0x0246b, 0x01468, 0x0006d, 0x30159, 0x68200, 0x000ce, 0x48a00, 0x65540, 0x88000, 0x020a0, 0xf02b0, 0xef7b0, 0xd4fb0, 0xcf060, 0xb0090, 0xa0080, 0x90080, 0x8f780, 0x722b0, 0x6f7b0, 0x54fb0, 0x4f060, 0x30090, 0x20080, 0x10080, 0x0f780, 0x000a0, 0x10159, 0x0f407, 0x00000, 0x00000, 0x80003, 0x00000, 0x00000, 0x00001, 0x80000, 0x33e60 }; static const struct urtwn_rf_prog rtl8188eu_rf_prog[] = { { nitems(rtl8188eu_rf_regs), rtl8188eu_rf_regs, rtl8188eu_rf_vals } }; /* * RTL8188RU. */ static const uint32_t rtl8188ru_rf_vals[] = { 0x30159, 0x31284, 0x98000, 0x18c63, 0x210e7, 0x2044f, 0x1adb0, 0x54867, 0x8992e, 0x0e529, 0x39ce7, 0x00451, 0x00000, 0x00255, 0x60a00, 0xfc378, 0xa1250, 0x4445f, 0x80001, 0x0b614, 0x6c000, 0x0083c, 0x01558, 0x00060, 0x00483, 0x4f000, 0xec7d9, 0x977c0, 0x04783, 0x00001, 0x21334, 0x00000, 0x00054, 0x00001, 0x00808, 0x53333, 0x0000c, 0x00002, 0x00808, 0x5b333, 0x0000d, 0x00003, 0x00808, 0x63333, 0x0000d, 0x00004, 0x00808, 0x6b333, 0x0000d, 0x00005, 0x00808, 0x73333, 0x0000d, 0x00006, 0x00709, 0x5b333, 0x0000d, 0x00007, 0x00709, 0x63333, 0x0000d, 0x00008, 0x0060a, 0x4b333, 0x0000d, 0x00009, 0x0060a, 0x53333, 0x0000d, 0x0000a, 0x0060a, 0x5b333, 0x0000d, 0x0000b, 0x0060a, 0x63333, 0x0000d, 0x0000c, 0x0060a, 0x6b333, 0x0000d, 0x0000d, 0x0060a, 0x73333, 0x0000d, 0x0000e, 0x0050b, 0x66666, 0x0001a, 0xe0000, 0x4000f, 0xe31fc, 0x6000f, 0xff9f8, 0x2000f, 0x203f9, 0x3000f, 0xff500, 0x00000, 0x00000, 0x8000f, 0x3f100, 0x9000f, 0x23100, 0xd8000, 0x90000, 0x51000, 0x12000, 0x28fb4, 0x24fa8, 0x207a4, 0x1c798, 0x183a4, 0x14398, 0x101a4, 0x0c198, 0x080a4, 0x04098, 0x00014, 0x1944c, 0x59444, 0x9944c, 0xd9444, 0x0f405, 0x4f405, 0x8f405, 0xcf405, 0xe0330, 0xa0330, 0x60330, 0x20330, 0x10159, 0x0f401, 0x00000, 0x00000, 0x80003, 0x00000, 0x00000, 0x44457, 0x80000, 0x30159 }; static const struct urtwn_rf_prog rtl8188ru_rf_prog[] = { { nitems(rtl8192ce_rf1_regs), rtl8192ce_rf1_regs, rtl8188ru_rf_vals } }; struct urtwn_txpwr { uint8_t pwr[3][28]; }; struct urtwn_r88e_txpwr { uint8_t pwr[6][28]; }; /* * Per RF chain/group/rate Tx gain values. */ static const struct urtwn_txpwr rtl8192cu_txagc[] = { { { /* Chain 0. */ { /* Group 0. */ 0x00, 0x00, 0x00, 0x00, /* CCK1~11. */ 0x0c, 0x0c, 0x0c, 0x0a, 0x08, 0x06, 0x04, 0x02, /* OFDM6~54. */ 0x0e, 0x0d, 0x0c, 0x0a, 0x08, 0x06, 0x04, 0x02, /* MCS0~7. */ 0x0e, 0x0d, 0x0c, 0x0a, 0x08, 0x06, 0x04, 0x02 /* MCS8~15. */ }, { /* Group 1. */ 0x00, 0x00, 0x00, 0x00, /* CCK1~11. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* OFDM6~54. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* MCS0~7. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 /* MCS8~15. */ }, { /* Group 2. */ 0x00, 0x00, 0x00, 0x00, /* CCK1~11. */ 0x04, 0x04, 0x04, 0x04, 0x04, 0x02, 0x02, 0x00, /* OFDM6~54. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* MCS0~7. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 /* MCS8~15. */ } } }, { { /* Chain 1. */ { /* Group 0. */ 0x00, 0x00, 0x00, 0x00, /* CCK1~11. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* OFDM6~54. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* MCS0~7. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 /* MCS8~15. */ }, { /* Group 1. */ 0x00, 0x00, 0x00, 0x00, /* CCK1~11. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* OFDM6~54. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* MCS0~7. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 /* MCS8~15. */ }, { /* Group 2. */ 0x00, 0x00, 0x00, 0x00, /* CCK1~11. */ 0x04, 0x04, 0x04, 0x04, 0x04, 0x02, 0x02, 0x00, /* OFDM6~54. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* MCS0~7. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 /* MCS8~15. */ } } } }; static const struct urtwn_txpwr rtl8188ru_txagc[] = { { { /* Chain 0. */ { /* Group 0. */ 0x00, 0x00, 0x00, 0x00, /* CCK1~11. */ 0x08, 0x08, 0x08, 0x06, 0x06, 0x04, 0x04, 0x00, /* OFDM6~54. */ 0x08, 0x06, 0x06, 0x04, 0x04, 0x02, 0x02, 0x00, /* MCS0~7. */ 0x08, 0x06, 0x06, 0x04, 0x04, 0x02, 0x02, 0x00 /* MCS8~15. */ }, { /* Group 1. */ 0x00, 0x00, 0x00, 0x00, /* CCK1~11. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* OFDM6~54. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* MCS0~7. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 /* MCS8~15. */ }, { /* Group 2. */ 0x00, 0x00, 0x00, 0x00, /* CCK1~11. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* OFDM6~54. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* MCS0~7. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 /* MCS8~15. */ } } } }; static const struct urtwn_r88e_txpwr rtl8188eu_txagc[] = { { { /* Chain 0. */ { /* Group 0. */ 0x00, 0x00, 0x00, 0x00, /* CCK1~11. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* OFDM6~54. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* MCS0~7. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 /* MCS8~15. */ }, { /* Group 1. */ 0x00, 0x00, 0x00, 0x00, /* CCK1~11. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* OFDM6~54. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* MCS0~7. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 /* MCS8~15. */ }, { /* Group 2. */ 0x00, 0x00, 0x00, 0x00, /* CCK1~11. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* OFDM6~54. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* MCS0~7. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 /* MCS8~15. */ }, { /* Group 3. */ 0x00, 0x00, 0x00, 0x00, /* CCK1~11. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* OFDM6~54. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* MCS0~7. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 /* MCS8~15. */ }, { /* Group 4. */ 0x00, 0x00, 0x00, 0x00, /* CCK1~11. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* OFDM6~54. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* MCS0~7. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 /* MCS8~15. */ }, { /* Group 5. */ 0x00, 0x00, 0x00, 0x00, /* CCK1~11. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* OFDM6~54. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* MCS0~7. */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 /* MCS8~15. */ } } } }; Index: user/alc/PQ_LAUNDRY/sys/kern/imgact_elf.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/kern/imgact_elf.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/kern/imgact_elf.c (revision 303206) @@ -1,2362 +1,2388 @@ /*- * Copyright (c) 2000 David O'Brien * Copyright (c) 1995-1996 Søren Schmidt * Copyright (c) 1996 Peter Wemm * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer * in this position and unchanged. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_capsicum.h" #include "opt_compat.h" #include "opt_gzio.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #define ELF_NOTE_ROUNDSIZE 4 #define OLD_EI_BRAND 8 static int __elfN(check_header)(const Elf_Ehdr *hdr); static Elf_Brandinfo *__elfN(get_brandinfo)(struct image_params *imgp, const char *interp, int interp_name_len, int32_t *osrel); static int __elfN(load_file)(struct proc *p, const char *file, u_long *addr, u_long *entry, size_t pagesize); static int __elfN(load_section)(struct image_params *imgp, vm_offset_t offset, caddr_t vmaddr, size_t memsz, size_t filsz, vm_prot_t prot, size_t pagesize); static int __CONCAT(exec_, __elfN(imgact))(struct image_params *imgp); static boolean_t __elfN(freebsd_trans_osrel)(const Elf_Note *note, int32_t *osrel); static boolean_t kfreebsd_trans_osrel(const Elf_Note *note, int32_t *osrel); static boolean_t __elfN(check_note)(struct image_params *imgp, Elf_Brandnote *checknote, int32_t *osrel); static vm_prot_t __elfN(trans_prot)(Elf_Word); static Elf_Word __elfN(untrans_prot)(vm_prot_t); SYSCTL_NODE(_kern, OID_AUTO, __CONCAT(elf, __ELF_WORD_SIZE), CTLFLAG_RW, 0, ""); #define CORE_BUF_SIZE (16 * 1024) int __elfN(fallback_brand) = -1; SYSCTL_INT(__CONCAT(_kern_elf, __ELF_WORD_SIZE), OID_AUTO, fallback_brand, CTLFLAG_RWTUN, &__elfN(fallback_brand), 0, __XSTRING(__CONCAT(ELF, __ELF_WORD_SIZE)) " brand of last resort"); static int elf_legacy_coredump = 0; SYSCTL_INT(_debug, OID_AUTO, __elfN(legacy_coredump), CTLFLAG_RW, &elf_legacy_coredump, 0, "include all and only RW pages in core dumps"); int __elfN(nxstack) = #if defined(__amd64__) || defined(__powerpc64__) /* both 64 and 32 bit */ || \ (defined(__arm__) && __ARM_ARCH >= 7) || defined(__aarch64__) 1; #else 0; #endif SYSCTL_INT(__CONCAT(_kern_elf, __ELF_WORD_SIZE), OID_AUTO, nxstack, CTLFLAG_RW, &__elfN(nxstack), 0, __XSTRING(__CONCAT(ELF, __ELF_WORD_SIZE)) ": enable non-executable stack"); #if __ELF_WORD_SIZE == 32 #if defined(__amd64__) int i386_read_exec = 0; SYSCTL_INT(_kern_elf32, OID_AUTO, read_exec, CTLFLAG_RW, &i386_read_exec, 0, "enable execution from readable segments"); #endif #endif static Elf_Brandinfo *elf_brand_list[MAX_BRANDS]; #define trunc_page_ps(va, ps) rounddown2(va, ps) #define round_page_ps(va, ps) roundup2(va, ps) #define aligned(a, t) (trunc_page_ps((u_long)(a), sizeof(t)) == (u_long)(a)) static const char FREEBSD_ABI_VENDOR[] = "FreeBSD"; Elf_Brandnote __elfN(freebsd_brandnote) = { .hdr.n_namesz = sizeof(FREEBSD_ABI_VENDOR), .hdr.n_descsz = sizeof(int32_t), .hdr.n_type = NT_FREEBSD_ABI_TAG, .vendor = FREEBSD_ABI_VENDOR, .flags = BN_TRANSLATE_OSREL, .trans_osrel = __elfN(freebsd_trans_osrel) }; static boolean_t __elfN(freebsd_trans_osrel)(const Elf_Note *note, int32_t *osrel) { uintptr_t p; p = (uintptr_t)(note + 1); p += roundup2(note->n_namesz, ELF_NOTE_ROUNDSIZE); *osrel = *(const int32_t *)(p); return (TRUE); } static const char GNU_ABI_VENDOR[] = "GNU"; static int GNU_KFREEBSD_ABI_DESC = 3; Elf_Brandnote __elfN(kfreebsd_brandnote) = { .hdr.n_namesz = sizeof(GNU_ABI_VENDOR), .hdr.n_descsz = 16, /* XXX at least 16 */ .hdr.n_type = 1, .vendor = GNU_ABI_VENDOR, .flags = BN_TRANSLATE_OSREL, .trans_osrel = kfreebsd_trans_osrel }; static boolean_t kfreebsd_trans_osrel(const Elf_Note *note, int32_t *osrel) { const Elf32_Word *desc; uintptr_t p; p = (uintptr_t)(note + 1); p += roundup2(note->n_namesz, ELF_NOTE_ROUNDSIZE); desc = (const Elf32_Word *)p; if (desc[0] != GNU_KFREEBSD_ABI_DESC) return (FALSE); /* * Debian GNU/kFreeBSD embed the earliest compatible kernel version * (__FreeBSD_version: Rxx) in the LSB way. */ *osrel = desc[1] * 100000 + desc[2] * 1000 + desc[3]; return (TRUE); } int __elfN(insert_brand_entry)(Elf_Brandinfo *entry) { int i; for (i = 0; i < MAX_BRANDS; i++) { if (elf_brand_list[i] == NULL) { elf_brand_list[i] = entry; break; } } if (i == MAX_BRANDS) { printf("WARNING: %s: could not insert brandinfo entry: %p\n", __func__, entry); return (-1); } return (0); } int __elfN(remove_brand_entry)(Elf_Brandinfo *entry) { int i; for (i = 0; i < MAX_BRANDS; i++) { if (elf_brand_list[i] == entry) { elf_brand_list[i] = NULL; break; } } if (i == MAX_BRANDS) return (-1); return (0); } int __elfN(brand_inuse)(Elf_Brandinfo *entry) { struct proc *p; int rval = FALSE; sx_slock(&allproc_lock); FOREACH_PROC_IN_SYSTEM(p) { if (p->p_sysent == entry->sysvec) { rval = TRUE; break; } } sx_sunlock(&allproc_lock); return (rval); } static Elf_Brandinfo * __elfN(get_brandinfo)(struct image_params *imgp, const char *interp, int interp_name_len, int32_t *osrel) { const Elf_Ehdr *hdr = (const Elf_Ehdr *)imgp->image_header; Elf_Brandinfo *bi, *bi_m; boolean_t ret; int i; /* * We support four types of branding -- (1) the ELF EI_OSABI field * that SCO added to the ELF spec, (2) FreeBSD 3.x's traditional string * branding w/in the ELF header, (3) path of the `interp_path' * field, and (4) the ".note.ABI-tag" ELF section. */ /* Look for an ".note.ABI-tag" ELF section */ bi_m = NULL; for (i = 0; i < MAX_BRANDS; i++) { bi = elf_brand_list[i]; if (bi == NULL) continue; if (hdr->e_machine == bi->machine && (bi->flags & (BI_BRAND_NOTE|BI_BRAND_NOTE_MANDATORY)) != 0) { ret = __elfN(check_note)(imgp, bi->brand_note, osrel); /* Give brand a chance to veto check_note's guess */ if (ret && bi->header_supported) ret = bi->header_supported(imgp); /* * If note checker claimed the binary, but the * interpreter path in the image does not * match default one for the brand, try to * search for other brands with the same * interpreter. Either there is better brand * with the right interpreter, or, failing * this, we return first brand which accepted * our note and, optionally, header. */ if (ret && bi_m == NULL && (strlen(bi->interp_path) + 1 != interp_name_len || strncmp(interp, bi->interp_path, interp_name_len) != 0)) { bi_m = bi; ret = 0; } if (ret) return (bi); } } if (bi_m != NULL) return (bi_m); /* If the executable has a brand, search for it in the brand list. */ for (i = 0; i < MAX_BRANDS; i++) { bi = elf_brand_list[i]; if (bi == NULL || bi->flags & BI_BRAND_NOTE_MANDATORY) continue; if (hdr->e_machine == bi->machine && (hdr->e_ident[EI_OSABI] == bi->brand || strncmp((const char *)&hdr->e_ident[OLD_EI_BRAND], bi->compat_3_brand, strlen(bi->compat_3_brand)) == 0)) { /* Looks good, but give brand a chance to veto */ if (!bi->header_supported || bi->header_supported(imgp)) return (bi); } } /* No known brand, see if the header is recognized by any brand */ for (i = 0; i < MAX_BRANDS; i++) { bi = elf_brand_list[i]; if (bi == NULL || bi->flags & BI_BRAND_NOTE_MANDATORY || bi->header_supported == NULL) continue; if (hdr->e_machine == bi->machine) { ret = bi->header_supported(imgp); if (ret) return (bi); } } /* Lacking a known brand, search for a recognized interpreter. */ if (interp != NULL) { for (i = 0; i < MAX_BRANDS; i++) { bi = elf_brand_list[i]; if (bi == NULL || bi->flags & BI_BRAND_NOTE_MANDATORY) continue; if (hdr->e_machine == bi->machine && /* ELF image p_filesz includes terminating zero */ strlen(bi->interp_path) + 1 == interp_name_len && strncmp(interp, bi->interp_path, interp_name_len) == 0) return (bi); } } /* Lacking a recognized interpreter, try the default brand */ for (i = 0; i < MAX_BRANDS; i++) { bi = elf_brand_list[i]; if (bi == NULL || bi->flags & BI_BRAND_NOTE_MANDATORY) continue; if (hdr->e_machine == bi->machine && __elfN(fallback_brand) == bi->brand) return (bi); } return (NULL); } static int __elfN(check_header)(const Elf_Ehdr *hdr) { Elf_Brandinfo *bi; int i; if (!IS_ELF(*hdr) || hdr->e_ident[EI_CLASS] != ELF_TARG_CLASS || hdr->e_ident[EI_DATA] != ELF_TARG_DATA || hdr->e_ident[EI_VERSION] != EV_CURRENT || hdr->e_phentsize != sizeof(Elf_Phdr) || hdr->e_version != ELF_TARG_VER) return (ENOEXEC); /* * Make sure we have at least one brand for this machine. */ for (i = 0; i < MAX_BRANDS; i++) { bi = elf_brand_list[i]; if (bi != NULL && bi->machine == hdr->e_machine) break; } if (i == MAX_BRANDS) return (ENOEXEC); return (0); } static int __elfN(map_partial)(vm_map_t map, vm_object_t object, vm_ooffset_t offset, vm_offset_t start, vm_offset_t end, vm_prot_t prot) { struct sf_buf *sf; int error; vm_offset_t off; /* * Create the page if it doesn't exist yet. Ignore errors. */ vm_map_lock(map); vm_map_insert(map, NULL, 0, trunc_page(start), round_page(end), VM_PROT_ALL, VM_PROT_ALL, 0); vm_map_unlock(map); /* * Find the page from the underlying object. */ if (object) { sf = vm_imgact_map_page(object, offset); if (sf == NULL) return (KERN_FAILURE); off = offset - trunc_page(offset); error = copyout((caddr_t)sf_buf_kva(sf) + off, (caddr_t)start, end - start); vm_imgact_unmap_page(sf); if (error) { return (KERN_FAILURE); } } return (KERN_SUCCESS); } static int __elfN(map_insert)(vm_map_t map, vm_object_t object, vm_ooffset_t offset, vm_offset_t start, vm_offset_t end, vm_prot_t prot, int cow) { struct sf_buf *sf; vm_offset_t off; vm_size_t sz; int error, rv; if (start != trunc_page(start)) { rv = __elfN(map_partial)(map, object, offset, start, round_page(start), prot); if (rv) return (rv); offset += round_page(start) - start; start = round_page(start); } if (end != round_page(end)) { rv = __elfN(map_partial)(map, object, offset + trunc_page(end) - start, trunc_page(end), end, prot); if (rv) return (rv); end = trunc_page(end); } if (end > start) { if (offset & PAGE_MASK) { /* * The mapping is not page aligned. This means we have * to copy the data. Sigh. */ rv = vm_map_find(map, NULL, 0, &start, end - start, 0, VMFS_NO_SPACE, prot | VM_PROT_WRITE, VM_PROT_ALL, 0); if (rv) return (rv); if (object == NULL) return (KERN_SUCCESS); for (; start < end; start += sz) { sf = vm_imgact_map_page(object, offset); if (sf == NULL) return (KERN_FAILURE); off = offset - trunc_page(offset); sz = end - start; if (sz > PAGE_SIZE - off) sz = PAGE_SIZE - off; error = copyout((caddr_t)sf_buf_kva(sf) + off, (caddr_t)start, sz); vm_imgact_unmap_page(sf); if (error) { return (KERN_FAILURE); } offset += sz; } rv = KERN_SUCCESS; } else { vm_object_reference(object); vm_map_lock(map); rv = vm_map_insert(map, object, offset, start, end, prot, VM_PROT_ALL, cow); vm_map_unlock(map); if (rv != KERN_SUCCESS) vm_object_deallocate(object); } return (rv); } else { return (KERN_SUCCESS); } } static int __elfN(load_section)(struct image_params *imgp, vm_offset_t offset, caddr_t vmaddr, size_t memsz, size_t filsz, vm_prot_t prot, size_t pagesize) { struct sf_buf *sf; size_t map_len; vm_map_t map; vm_object_t object; vm_offset_t map_addr; int error, rv, cow; size_t copy_len; vm_offset_t file_addr; /* * It's necessary to fail if the filsz + offset taken from the * header is greater than the actual file pager object's size. * If we were to allow this, then the vm_map_find() below would * walk right off the end of the file object and into the ether. * * While I'm here, might as well check for something else that * is invalid: filsz cannot be greater than memsz. */ if ((off_t)filsz + offset > imgp->attr->va_size || filsz > memsz) { uprintf("elf_load_section: truncated ELF file\n"); return (ENOEXEC); } object = imgp->object; map = &imgp->proc->p_vmspace->vm_map; map_addr = trunc_page_ps((vm_offset_t)vmaddr, pagesize); file_addr = trunc_page_ps(offset, pagesize); /* * We have two choices. We can either clear the data in the last page * of an oversized mapping, or we can start the anon mapping a page * early and copy the initialized data into that first page. We * choose the second.. */ if (memsz > filsz) map_len = trunc_page_ps(offset + filsz, pagesize) - file_addr; else map_len = round_page_ps(offset + filsz, pagesize) - file_addr; if (map_len != 0) { /* cow flags: don't dump readonly sections in core */ cow = MAP_COPY_ON_WRITE | MAP_PREFAULT | (prot & VM_PROT_WRITE ? 0 : MAP_DISABLE_COREDUMP); rv = __elfN(map_insert)(map, object, file_addr, /* file offset */ map_addr, /* virtual start */ map_addr + map_len,/* virtual end */ prot, cow); if (rv != KERN_SUCCESS) return (EINVAL); /* we can stop now if we've covered it all */ if (memsz == filsz) { return (0); } } /* * We have to get the remaining bit of the file into the first part * of the oversized map segment. This is normally because the .data * segment in the file is extended to provide bss. It's a neat idea * to try and save a page, but it's a pain in the behind to implement. */ copy_len = (offset + filsz) - trunc_page_ps(offset + filsz, pagesize); map_addr = trunc_page_ps((vm_offset_t)vmaddr + filsz, pagesize); map_len = round_page_ps((vm_offset_t)vmaddr + memsz, pagesize) - map_addr; /* This had damn well better be true! */ if (map_len != 0) { rv = __elfN(map_insert)(map, NULL, 0, map_addr, map_addr + map_len, VM_PROT_ALL, 0); if (rv != KERN_SUCCESS) { return (EINVAL); } } if (copy_len != 0) { vm_offset_t off; sf = vm_imgact_map_page(object, offset + filsz); if (sf == NULL) return (EIO); /* send the page fragment to user space */ off = trunc_page_ps(offset + filsz, pagesize) - trunc_page(offset + filsz); error = copyout((caddr_t)sf_buf_kva(sf) + off, (caddr_t)map_addr, copy_len); vm_imgact_unmap_page(sf); if (error) { return (error); } } /* * set it to the specified protection. * XXX had better undo the damage from pasting over the cracks here! */ vm_map_protect(map, trunc_page(map_addr), round_page(map_addr + map_len), prot, FALSE); return (0); } /* * Load the file "file" into memory. It may be either a shared object * or an executable. * * The "addr" reference parameter is in/out. On entry, it specifies * the address where a shared object should be loaded. If the file is * an executable, this value is ignored. On exit, "addr" specifies * where the file was actually loaded. * * The "entry" reference parameter is out only. On exit, it specifies * the entry point for the loaded file. */ static int __elfN(load_file)(struct proc *p, const char *file, u_long *addr, u_long *entry, size_t pagesize) { struct { struct nameidata nd; struct vattr attr; struct image_params image_params; } *tempdata; const Elf_Ehdr *hdr = NULL; const Elf_Phdr *phdr = NULL; struct nameidata *nd; struct vattr *attr; struct image_params *imgp; vm_prot_t prot; u_long rbase; u_long base_addr = 0; int error, i, numsegs; #ifdef CAPABILITY_MODE /* * XXXJA: This check can go away once we are sufficiently confident * that the checks in namei() are correct. */ if (IN_CAPABILITY_MODE(curthread)) return (ECAPMODE); #endif tempdata = malloc(sizeof(*tempdata), M_TEMP, M_WAITOK); nd = &tempdata->nd; attr = &tempdata->attr; imgp = &tempdata->image_params; /* * Initialize part of the common data */ imgp->proc = p; imgp->attr = attr; imgp->firstpage = NULL; imgp->image_header = NULL; imgp->object = NULL; imgp->execlabel = NULL; NDINIT(nd, LOOKUP, LOCKLEAF | FOLLOW, UIO_SYSSPACE, file, curthread); if ((error = namei(nd)) != 0) { nd->ni_vp = NULL; goto fail; } NDFREE(nd, NDF_ONLY_PNBUF); imgp->vp = nd->ni_vp; /* * Check permissions, modes, uid, etc on the file, and "open" it. */ error = exec_check_permissions(imgp); if (error) goto fail; error = exec_map_first_page(imgp); if (error) goto fail; /* * Also make certain that the interpreter stays the same, so set * its VV_TEXT flag, too. */ VOP_SET_TEXT(nd->ni_vp); imgp->object = nd->ni_vp->v_object; hdr = (const Elf_Ehdr *)imgp->image_header; if ((error = __elfN(check_header)(hdr)) != 0) goto fail; if (hdr->e_type == ET_DYN) rbase = *addr; else if (hdr->e_type == ET_EXEC) rbase = 0; else { error = ENOEXEC; goto fail; } /* Only support headers that fit within first page for now */ if ((hdr->e_phoff > PAGE_SIZE) || (u_int)hdr->e_phentsize * hdr->e_phnum > PAGE_SIZE - hdr->e_phoff) { error = ENOEXEC; goto fail; } phdr = (const Elf_Phdr *)(imgp->image_header + hdr->e_phoff); if (!aligned(phdr, Elf_Addr)) { error = ENOEXEC; goto fail; } for (i = 0, numsegs = 0; i < hdr->e_phnum; i++) { if (phdr[i].p_type == PT_LOAD && phdr[i].p_memsz != 0) { /* Loadable segment */ prot = __elfN(trans_prot)(phdr[i].p_flags); error = __elfN(load_section)(imgp, phdr[i].p_offset, (caddr_t)(uintptr_t)phdr[i].p_vaddr + rbase, phdr[i].p_memsz, phdr[i].p_filesz, prot, pagesize); if (error != 0) goto fail; /* * Establish the base address if this is the * first segment. */ if (numsegs == 0) base_addr = trunc_page(phdr[i].p_vaddr + rbase); numsegs++; } } *addr = base_addr; *entry = (unsigned long)hdr->e_entry + rbase; fail: if (imgp->firstpage) exec_unmap_first_page(imgp); if (nd->ni_vp) vput(nd->ni_vp); free(tempdata, M_TEMP); return (error); } static int __CONCAT(exec_, __elfN(imgact))(struct image_params *imgp) { struct thread *td; const Elf_Ehdr *hdr; const Elf_Phdr *phdr; Elf_Auxargs *elf_auxargs; struct vmspace *vmspace; const char *err_str, *newinterp; char *interp, *interp_buf, *path; Elf_Brandinfo *brand_info; struct sysentvec *sv; vm_prot_t prot; u_long text_size, data_size, total_size, text_addr, data_addr; u_long seg_size, seg_addr, addr, baddr, et_dyn_addr, entry, proghdr; int32_t osrel; int error, i, n, interp_name_len, have_interp; hdr = (const Elf_Ehdr *)imgp->image_header; /* * Do we have a valid ELF header ? * * Only allow ET_EXEC & ET_DYN here, reject ET_DYN later * if particular brand doesn't support it. */ if (__elfN(check_header)(hdr) != 0 || (hdr->e_type != ET_EXEC && hdr->e_type != ET_DYN)) return (-1); /* * From here on down, we return an errno, not -1, as we've * detected an ELF file. */ if ((hdr->e_phoff > PAGE_SIZE) || (u_int)hdr->e_phentsize * hdr->e_phnum > PAGE_SIZE - hdr->e_phoff) { /* Only support headers in first page for now */ uprintf("Program headers not in the first page\n"); return (ENOEXEC); } phdr = (const Elf_Phdr *)(imgp->image_header + hdr->e_phoff); if (!aligned(phdr, Elf_Addr)) { uprintf("Unaligned program headers\n"); return (ENOEXEC); } n = error = 0; baddr = 0; osrel = 0; text_size = data_size = total_size = text_addr = data_addr = 0; entry = proghdr = 0; interp_name_len = 0; err_str = newinterp = NULL; interp = interp_buf = NULL; td = curthread; for (i = 0; i < hdr->e_phnum; i++) { switch (phdr[i].p_type) { case PT_LOAD: if (n == 0) baddr = phdr[i].p_vaddr; n++; break; case PT_INTERP: /* Path to interpreter */ if (phdr[i].p_filesz > MAXPATHLEN) { uprintf("Invalid PT_INTERP\n"); error = ENOEXEC; goto ret; } if (interp != NULL) { uprintf("Multiple PT_INTERP headers\n"); error = ENOEXEC; goto ret; } interp_name_len = phdr[i].p_filesz; if (phdr[i].p_offset > PAGE_SIZE || interp_name_len > PAGE_SIZE - phdr[i].p_offset) { VOP_UNLOCK(imgp->vp, 0); interp_buf = malloc(interp_name_len + 1, M_TEMP, M_WAITOK); vn_lock(imgp->vp, LK_EXCLUSIVE | LK_RETRY); error = vn_rdwr(UIO_READ, imgp->vp, interp_buf, interp_name_len, phdr[i].p_offset, UIO_SYSSPACE, IO_NODELOCKED, td->td_ucred, NOCRED, NULL, td); if (error != 0) { uprintf("i/o error PT_INTERP\n"); goto ret; } interp_buf[interp_name_len] = '\0'; interp = interp_buf; } else { interp = __DECONST(char *, imgp->image_header) + phdr[i].p_offset; } break; case PT_GNU_STACK: if (__elfN(nxstack)) imgp->stack_prot = __elfN(trans_prot)(phdr[i].p_flags); imgp->stack_sz = phdr[i].p_memsz; break; } } brand_info = __elfN(get_brandinfo)(imgp, interp, interp_name_len, &osrel); if (brand_info == NULL) { uprintf("ELF binary type \"%u\" not known.\n", hdr->e_ident[EI_OSABI]); error = ENOEXEC; goto ret; } if (hdr->e_type == ET_DYN) { if ((brand_info->flags & BI_CAN_EXEC_DYN) == 0) { uprintf("Cannot execute shared object\n"); error = ENOEXEC; goto ret; } /* * Honour the base load address from the dso if it is * non-zero for some reason. */ if (baddr == 0) et_dyn_addr = ET_DYN_LOAD_ADDR; else et_dyn_addr = 0; } else et_dyn_addr = 0; sv = brand_info->sysvec; if (interp != NULL && brand_info->interp_newpath != NULL) newinterp = brand_info->interp_newpath; /* * Avoid a possible deadlock if the current address space is destroyed * and that address space maps the locked vnode. In the common case, * the locked vnode's v_usecount is decremented but remains greater * than zero. Consequently, the vnode lock is not needed by vrele(). * However, in cases where the vnode lock is external, such as nullfs, * v_usecount may become zero. * * The VV_TEXT flag prevents modifications to the executable while * the vnode is unlocked. */ VOP_UNLOCK(imgp->vp, 0); error = exec_new_vmspace(imgp, sv); imgp->proc->p_sysent = sv; vn_lock(imgp->vp, LK_EXCLUSIVE | LK_RETRY); if (error != 0) goto ret; for (i = 0; i < hdr->e_phnum; i++) { switch (phdr[i].p_type) { case PT_LOAD: /* Loadable segment */ if (phdr[i].p_memsz == 0) break; prot = __elfN(trans_prot)(phdr[i].p_flags); error = __elfN(load_section)(imgp, phdr[i].p_offset, (caddr_t)(uintptr_t)phdr[i].p_vaddr + et_dyn_addr, phdr[i].p_memsz, phdr[i].p_filesz, prot, sv->sv_pagesize); if (error != 0) goto ret; /* * If this segment contains the program headers, * remember their virtual address for the AT_PHDR * aux entry. Static binaries don't usually include * a PT_PHDR entry. */ if (phdr[i].p_offset == 0 && hdr->e_phoff + hdr->e_phnum * hdr->e_phentsize <= phdr[i].p_filesz) proghdr = phdr[i].p_vaddr + hdr->e_phoff + et_dyn_addr; seg_addr = trunc_page(phdr[i].p_vaddr + et_dyn_addr); seg_size = round_page(phdr[i].p_memsz + phdr[i].p_vaddr + et_dyn_addr - seg_addr); /* * Make the largest executable segment the official * text segment and all others data. * * Note that obreak() assumes that data_addr + * data_size == end of data load area, and the ELF * file format expects segments to be sorted by * address. If multiple data segments exist, the * last one will be used. */ if (phdr[i].p_flags & PF_X && text_size < seg_size) { text_size = seg_size; text_addr = seg_addr; } else { data_size = seg_size; data_addr = seg_addr; } total_size += seg_size; break; case PT_PHDR: /* Program header table info */ proghdr = phdr[i].p_vaddr + et_dyn_addr; break; default: break; } } if (data_addr == 0 && data_size == 0) { data_addr = text_addr; data_size = text_size; } entry = (u_long)hdr->e_entry + et_dyn_addr; /* * Check limits. It should be safe to check the * limits after loading the segments since we do * not actually fault in all the segments pages. */ PROC_LOCK(imgp->proc); if (data_size > lim_cur_proc(imgp->proc, RLIMIT_DATA)) err_str = "Data segment size exceeds process limit"; else if (text_size > maxtsiz) err_str = "Text segment size exceeds system limit"; else if (total_size > lim_cur_proc(imgp->proc, RLIMIT_VMEM)) err_str = "Total segment size exceeds process limit"; else if (racct_set(imgp->proc, RACCT_DATA, data_size) != 0) err_str = "Data segment size exceeds resource limit"; else if (racct_set(imgp->proc, RACCT_VMEM, total_size) != 0) err_str = "Total segment size exceeds resource limit"; if (err_str != NULL) { PROC_UNLOCK(imgp->proc); uprintf("%s\n", err_str); error = ENOMEM; goto ret; } vmspace = imgp->proc->p_vmspace; vmspace->vm_tsize = text_size >> PAGE_SHIFT; vmspace->vm_taddr = (caddr_t)(uintptr_t)text_addr; vmspace->vm_dsize = data_size >> PAGE_SHIFT; vmspace->vm_daddr = (caddr_t)(uintptr_t)data_addr; /* * We load the dynamic linker where a userland call * to mmap(0, ...) would put it. The rationale behind this * calculation is that it leaves room for the heap to grow to * its maximum allowed size. */ addr = round_page((vm_offset_t)vmspace->vm_daddr + lim_max(td, RLIMIT_DATA)); PROC_UNLOCK(imgp->proc); imgp->entry_addr = entry; if (interp != NULL) { have_interp = FALSE; VOP_UNLOCK(imgp->vp, 0); if (brand_info->emul_path != NULL && brand_info->emul_path[0] != '\0') { path = malloc(MAXPATHLEN, M_TEMP, M_WAITOK); snprintf(path, MAXPATHLEN, "%s%s", brand_info->emul_path, interp); error = __elfN(load_file)(imgp->proc, path, &addr, &imgp->entry_addr, sv->sv_pagesize); free(path, M_TEMP); if (error == 0) have_interp = TRUE; } if (!have_interp && newinterp != NULL && (brand_info->interp_path == NULL || strcmp(interp, brand_info->interp_path) == 0)) { error = __elfN(load_file)(imgp->proc, newinterp, &addr, &imgp->entry_addr, sv->sv_pagesize); if (error == 0) have_interp = TRUE; } if (!have_interp) { error = __elfN(load_file)(imgp->proc, interp, &addr, &imgp->entry_addr, sv->sv_pagesize); } vn_lock(imgp->vp, LK_EXCLUSIVE | LK_RETRY); if (error != 0) { uprintf("ELF interpreter %s not found, error %d\n", interp, error); goto ret; } } else addr = et_dyn_addr; /* * Construct auxargs table (used by the fixup routine) */ elf_auxargs = malloc(sizeof(Elf_Auxargs), M_TEMP, M_WAITOK); elf_auxargs->execfd = -1; elf_auxargs->phdr = proghdr; elf_auxargs->phent = hdr->e_phentsize; elf_auxargs->phnum = hdr->e_phnum; elf_auxargs->pagesz = PAGE_SIZE; elf_auxargs->base = addr; elf_auxargs->flags = 0; elf_auxargs->entry = entry; elf_auxargs->hdr_eflags = hdr->e_flags; imgp->auxargs = elf_auxargs; imgp->interpreted = 0; imgp->reloc_base = addr; imgp->proc->p_osrel = osrel; ret: free(interp_buf, M_TEMP); return (error); } #define suword __CONCAT(suword, __ELF_WORD_SIZE) int __elfN(freebsd_fixup)(register_t **stack_base, struct image_params *imgp) { Elf_Auxargs *args = (Elf_Auxargs *)imgp->auxargs; Elf_Addr *base; Elf_Addr *pos; base = (Elf_Addr *)*stack_base; pos = base + (imgp->args->argc + imgp->args->envc + 2); if (args->execfd != -1) AUXARGS_ENTRY(pos, AT_EXECFD, args->execfd); AUXARGS_ENTRY(pos, AT_PHDR, args->phdr); AUXARGS_ENTRY(pos, AT_PHENT, args->phent); AUXARGS_ENTRY(pos, AT_PHNUM, args->phnum); AUXARGS_ENTRY(pos, AT_PAGESZ, args->pagesz); AUXARGS_ENTRY(pos, AT_FLAGS, args->flags); AUXARGS_ENTRY(pos, AT_ENTRY, args->entry); AUXARGS_ENTRY(pos, AT_BASE, args->base); #ifdef AT_EHDRFLAGS AUXARGS_ENTRY(pos, AT_EHDRFLAGS, args->hdr_eflags); #endif if (imgp->execpathp != 0) AUXARGS_ENTRY(pos, AT_EXECPATH, imgp->execpathp); AUXARGS_ENTRY(pos, AT_OSRELDATE, imgp->proc->p_ucred->cr_prison->pr_osreldate); if (imgp->canary != 0) { AUXARGS_ENTRY(pos, AT_CANARY, imgp->canary); AUXARGS_ENTRY(pos, AT_CANARYLEN, imgp->canarylen); } AUXARGS_ENTRY(pos, AT_NCPUS, mp_ncpus); if (imgp->pagesizes != 0) { AUXARGS_ENTRY(pos, AT_PAGESIZES, imgp->pagesizes); AUXARGS_ENTRY(pos, AT_PAGESIZESLEN, imgp->pagesizeslen); } if (imgp->sysent->sv_timekeep_base != 0) { AUXARGS_ENTRY(pos, AT_TIMEKEEP, imgp->sysent->sv_timekeep_base); } AUXARGS_ENTRY(pos, AT_STACKPROT, imgp->sysent->sv_shared_page_obj != NULL && imgp->stack_prot != 0 ? imgp->stack_prot : imgp->sysent->sv_stackprot); AUXARGS_ENTRY(pos, AT_NULL, 0); free(imgp->auxargs, M_TEMP); imgp->auxargs = NULL; base--; suword(base, (long)imgp->args->argc); *stack_base = (register_t *)base; return (0); } /* * Code for generating ELF core dumps. */ typedef void (*segment_callback)(vm_map_entry_t, void *); /* Closure for cb_put_phdr(). */ struct phdr_closure { Elf_Phdr *phdr; /* Program header to fill in */ Elf_Off offset; /* Offset of segment in core file */ }; /* Closure for cb_size_segment(). */ struct sseg_closure { int count; /* Count of writable segments. */ size_t size; /* Total size of all writable segments. */ }; typedef void (*outfunc_t)(void *, struct sbuf *, size_t *); struct note_info { int type; /* Note type. */ outfunc_t outfunc; /* Output function. */ void *outarg; /* Argument for the output function. */ size_t outsize; /* Output size. */ TAILQ_ENTRY(note_info) link; /* Link to the next note info. */ }; TAILQ_HEAD(note_info_list, note_info); /* Coredump output parameters. */ struct coredump_params { off_t offset; struct ucred *active_cred; struct ucred *file_cred; struct thread *td; struct vnode *vp; struct gzio_stream *gzs; }; static void cb_put_phdr(vm_map_entry_t, void *); static void cb_size_segment(vm_map_entry_t, void *); static int core_write(struct coredump_params *, void *, size_t, off_t, enum uio_seg); -static void each_writable_segment(struct thread *, segment_callback, void *); +static void each_dumpable_segment(struct thread *, segment_callback, void *); static int __elfN(corehdr)(struct coredump_params *, int, void *, size_t, struct note_info_list *, size_t); static void __elfN(prepare_notes)(struct thread *, struct note_info_list *, size_t *); static void __elfN(puthdr)(struct thread *, void *, size_t, int, size_t); static void __elfN(putnote)(struct note_info *, struct sbuf *); static size_t register_note(struct note_info_list *, int, outfunc_t, void *); static int sbuf_drain_core_output(void *, const char *, int); static int sbuf_drain_count(void *arg, const char *data, int len); static void __elfN(note_fpregset)(void *, struct sbuf *, size_t *); static void __elfN(note_prpsinfo)(void *, struct sbuf *, size_t *); static void __elfN(note_prstatus)(void *, struct sbuf *, size_t *); static void __elfN(note_threadmd)(void *, struct sbuf *, size_t *); static void __elfN(note_thrmisc)(void *, struct sbuf *, size_t *); static void __elfN(note_procstat_auxv)(void *, struct sbuf *, size_t *); static void __elfN(note_procstat_proc)(void *, struct sbuf *, size_t *); static void __elfN(note_procstat_psstrings)(void *, struct sbuf *, size_t *); static void note_procstat_files(void *, struct sbuf *, size_t *); static void note_procstat_groups(void *, struct sbuf *, size_t *); static void note_procstat_osrel(void *, struct sbuf *, size_t *); static void note_procstat_rlimit(void *, struct sbuf *, size_t *); static void note_procstat_umask(void *, struct sbuf *, size_t *); static void note_procstat_vmmap(void *, struct sbuf *, size_t *); #ifdef GZIO extern int compress_user_cores_gzlevel; /* * Write out a core segment to the compression stream. */ static int compress_chunk(struct coredump_params *p, char *base, char *buf, u_int len) { u_int chunk_len; int error; while (len > 0) { chunk_len = MIN(len, CORE_BUF_SIZE); copyin(base, buf, chunk_len); error = gzio_write(p->gzs, buf, chunk_len); if (error != 0) break; base += chunk_len; len -= chunk_len; } return (error); } static int core_gz_write(void *base, size_t len, off_t offset, void *arg) { return (core_write((struct coredump_params *)arg, base, len, offset, UIO_SYSSPACE)); } #endif /* GZIO */ static int core_write(struct coredump_params *p, void *base, size_t len, off_t offset, enum uio_seg seg) { return (vn_rdwr_inchunks(UIO_WRITE, p->vp, base, len, offset, seg, IO_UNIT | IO_DIRECT | IO_RANGELOCKED, p->active_cred, p->file_cred, NULL, p->td)); } static int core_output(void *base, size_t len, off_t offset, struct coredump_params *p, void *tmpbuf) { #ifdef GZIO if (p->gzs != NULL) return (compress_chunk(p, base, tmpbuf, len)); #endif return (core_write(p, base, len, offset, UIO_USERSPACE)); } /* * Drain into a core file. */ static int sbuf_drain_core_output(void *arg, const char *data, int len) { struct coredump_params *p; int error, locked; p = (struct coredump_params *)arg; /* * Some kern_proc out routines that print to this sbuf may * call us with the process lock held. Draining with the * non-sleepable lock held is unsafe. The lock is needed for * those routines when dumping a live process. In our case we * can safely release the lock before draining and acquire * again after. */ locked = PROC_LOCKED(p->td->td_proc); if (locked) PROC_UNLOCK(p->td->td_proc); #ifdef GZIO if (p->gzs != NULL) error = gzio_write(p->gzs, __DECONST(char *, data), len); else #endif error = core_write(p, __DECONST(void *, data), len, p->offset, UIO_SYSSPACE); if (locked) PROC_LOCK(p->td->td_proc); if (error != 0) return (-error); p->offset += len; return (len); } /* * Drain into a counter. */ static int sbuf_drain_count(void *arg, const char *data __unused, int len) { size_t *sizep; sizep = (size_t *)arg; *sizep += len; return (len); } int __elfN(coredump)(struct thread *td, struct vnode *vp, off_t limit, int flags) { struct ucred *cred = td->td_ucred; int error = 0; struct sseg_closure seginfo; struct note_info_list notelst; struct coredump_params params; struct note_info *ninfo; void *hdr, *tmpbuf; size_t hdrsize, notesz, coresize; #ifdef GZIO boolean_t compress; compress = (flags & IMGACT_CORE_COMPRESS) != 0; #endif hdr = NULL; tmpbuf = NULL; TAILQ_INIT(¬elst); /* Size the program segments. */ seginfo.count = 0; seginfo.size = 0; - each_writable_segment(td, cb_size_segment, &seginfo); + each_dumpable_segment(td, cb_size_segment, &seginfo); /* * Collect info about the core file header area. */ hdrsize = sizeof(Elf_Ehdr) + sizeof(Elf_Phdr) * (1 + seginfo.count); + if (seginfo.count + 1 >= PN_XNUM) + hdrsize += sizeof(Elf_Shdr); __elfN(prepare_notes)(td, ¬elst, ¬esz); coresize = round_page(hdrsize + notesz) + seginfo.size; /* Set up core dump parameters. */ params.offset = 0; params.active_cred = cred; params.file_cred = NOCRED; params.td = td; params.vp = vp; params.gzs = NULL; #ifdef RACCT if (racct_enable) { PROC_LOCK(td->td_proc); error = racct_add(td->td_proc, RACCT_CORE, coresize); PROC_UNLOCK(td->td_proc); if (error != 0) { error = EFAULT; goto done; } } #endif if (coresize >= limit) { error = EFAULT; goto done; } #ifdef GZIO /* Create a compression stream if necessary. */ if (compress) { params.gzs = gzio_init(core_gz_write, GZIO_DEFLATE, CORE_BUF_SIZE, compress_user_cores_gzlevel, ¶ms); if (params.gzs == NULL) { error = EFAULT; goto done; } tmpbuf = malloc(CORE_BUF_SIZE, M_TEMP, M_WAITOK | M_ZERO); } #endif /* * Allocate memory for building the header, fill it up, * and write it out following the notes. */ hdr = malloc(hdrsize, M_TEMP, M_WAITOK); error = __elfN(corehdr)(¶ms, seginfo.count, hdr, hdrsize, ¬elst, notesz); /* Write the contents of all of the writable segments. */ if (error == 0) { Elf_Phdr *php; off_t offset; int i; php = (Elf_Phdr *)((char *)hdr + sizeof(Elf_Ehdr)) + 1; offset = round_page(hdrsize + notesz); for (i = 0; i < seginfo.count; i++) { error = core_output((caddr_t)(uintptr_t)php->p_vaddr, php->p_filesz, offset, ¶ms, tmpbuf); if (error != 0) break; offset += php->p_filesz; php++; } #ifdef GZIO if (error == 0 && compress) error = gzio_flush(params.gzs); #endif } if (error) { log(LOG_WARNING, "Failed to write core file for process %s (error %d)\n", curproc->p_comm, error); } done: #ifdef GZIO if (compress) { free(tmpbuf, M_TEMP); if (params.gzs != NULL) gzio_fini(params.gzs); } #endif while ((ninfo = TAILQ_FIRST(¬elst)) != NULL) { TAILQ_REMOVE(¬elst, ninfo, link); free(ninfo, M_TEMP); } if (hdr != NULL) free(hdr, M_TEMP); return (error); } /* - * A callback for each_writable_segment() to write out the segment's + * A callback for each_dumpable_segment() to write out the segment's * program header entry. */ static void cb_put_phdr(entry, closure) vm_map_entry_t entry; void *closure; { struct phdr_closure *phc = (struct phdr_closure *)closure; Elf_Phdr *phdr = phc->phdr; phc->offset = round_page(phc->offset); phdr->p_type = PT_LOAD; phdr->p_offset = phc->offset; phdr->p_vaddr = entry->start; phdr->p_paddr = 0; phdr->p_filesz = phdr->p_memsz = entry->end - entry->start; phdr->p_align = PAGE_SIZE; phdr->p_flags = __elfN(untrans_prot)(entry->protection); phc->offset += phdr->p_filesz; phc->phdr++; } /* - * A callback for each_writable_segment() to gather information about + * A callback for each_dumpable_segment() to gather information about * the number of segments and their total size. */ static void -cb_size_segment(entry, closure) - vm_map_entry_t entry; - void *closure; +cb_size_segment(vm_map_entry_t entry, void *closure) { struct sseg_closure *ssc = (struct sseg_closure *)closure; ssc->count++; ssc->size += entry->end - entry->start; } /* * For each writable segment in the process's memory map, call the given * function with a pointer to the map entry and some arbitrary * caller-supplied data. */ static void -each_writable_segment(td, func, closure) - struct thread *td; - segment_callback func; - void *closure; +each_dumpable_segment(struct thread *td, segment_callback func, void *closure) { struct proc *p = td->td_proc; vm_map_t map = &p->p_vmspace->vm_map; vm_map_entry_t entry; vm_object_t backing_object, object; boolean_t ignore_entry; vm_map_lock_read(map); for (entry = map->header.next; entry != &map->header; entry = entry->next) { /* * Don't dump inaccessible mappings, deal with legacy * coredump mode. * * Note that read-only segments related to the elf binary * are marked MAP_ENTRY_NOCOREDUMP now so we no longer * need to arbitrarily ignore such segments. */ if (elf_legacy_coredump) { if ((entry->protection & VM_PROT_RW) != VM_PROT_RW) continue; } else { if ((entry->protection & VM_PROT_ALL) == 0) continue; } /* * Dont include memory segment in the coredump if * MAP_NOCORE is set in mmap(2) or MADV_NOCORE in * madvise(2). Do not dump submaps (i.e. parts of the * kernel map). */ if (entry->eflags & (MAP_ENTRY_NOCOREDUMP|MAP_ENTRY_IS_SUB_MAP)) continue; if ((object = entry->object.vm_object) == NULL) continue; /* Ignore memory-mapped devices and such things. */ VM_OBJECT_RLOCK(object); while ((backing_object = object->backing_object) != NULL) { VM_OBJECT_RLOCK(backing_object); VM_OBJECT_RUNLOCK(object); object = backing_object; } ignore_entry = object->type != OBJT_DEFAULT && object->type != OBJT_SWAP && object->type != OBJT_VNODE && object->type != OBJT_PHYS; VM_OBJECT_RUNLOCK(object); if (ignore_entry) continue; (*func)(entry, closure); } vm_map_unlock_read(map); } /* * Write the core file header to the file, including padding up to * the page boundary. */ static int __elfN(corehdr)(struct coredump_params *p, int numsegs, void *hdr, size_t hdrsize, struct note_info_list *notelst, size_t notesz) { struct note_info *ninfo; struct sbuf *sb; int error; /* Fill in the header. */ bzero(hdr, hdrsize); __elfN(puthdr)(p->td, hdr, hdrsize, numsegs, notesz); sb = sbuf_new(NULL, NULL, CORE_BUF_SIZE, SBUF_FIXEDLEN); sbuf_set_drain(sb, sbuf_drain_core_output, p); sbuf_start_section(sb, NULL); sbuf_bcat(sb, hdr, hdrsize); TAILQ_FOREACH(ninfo, notelst, link) __elfN(putnote)(ninfo, sb); /* Align up to a page boundary for the program segments. */ sbuf_end_section(sb, -1, PAGE_SIZE, 0); error = sbuf_finish(sb); sbuf_delete(sb); return (error); } static void __elfN(prepare_notes)(struct thread *td, struct note_info_list *list, size_t *sizep) { struct proc *p; struct thread *thr; size_t size; p = td->td_proc; size = 0; size += register_note(list, NT_PRPSINFO, __elfN(note_prpsinfo), p); /* * To have the debugger select the right thread (LWP) as the initial * thread, we dump the state of the thread passed to us in td first. * This is the thread that causes the core dump and thus likely to * be the right thread one wants to have selected in the debugger. */ thr = td; while (thr != NULL) { size += register_note(list, NT_PRSTATUS, __elfN(note_prstatus), thr); size += register_note(list, NT_FPREGSET, __elfN(note_fpregset), thr); size += register_note(list, NT_THRMISC, __elfN(note_thrmisc), thr); size += register_note(list, -1, __elfN(note_threadmd), thr); thr = (thr == td) ? TAILQ_FIRST(&p->p_threads) : TAILQ_NEXT(thr, td_plist); if (thr == td) thr = TAILQ_NEXT(thr, td_plist); } size += register_note(list, NT_PROCSTAT_PROC, __elfN(note_procstat_proc), p); size += register_note(list, NT_PROCSTAT_FILES, note_procstat_files, p); size += register_note(list, NT_PROCSTAT_VMMAP, note_procstat_vmmap, p); size += register_note(list, NT_PROCSTAT_GROUPS, note_procstat_groups, p); size += register_note(list, NT_PROCSTAT_UMASK, note_procstat_umask, p); size += register_note(list, NT_PROCSTAT_RLIMIT, note_procstat_rlimit, p); size += register_note(list, NT_PROCSTAT_OSREL, note_procstat_osrel, p); size += register_note(list, NT_PROCSTAT_PSSTRINGS, __elfN(note_procstat_psstrings), p); size += register_note(list, NT_PROCSTAT_AUXV, __elfN(note_procstat_auxv), p); *sizep = size; } static void __elfN(puthdr)(struct thread *td, void *hdr, size_t hdrsize, int numsegs, size_t notesz) { Elf_Ehdr *ehdr; Elf_Phdr *phdr; + Elf_Shdr *shdr; struct phdr_closure phc; ehdr = (Elf_Ehdr *)hdr; - phdr = (Elf_Phdr *)((char *)hdr + sizeof(Elf_Ehdr)); ehdr->e_ident[EI_MAG0] = ELFMAG0; ehdr->e_ident[EI_MAG1] = ELFMAG1; ehdr->e_ident[EI_MAG2] = ELFMAG2; ehdr->e_ident[EI_MAG3] = ELFMAG3; ehdr->e_ident[EI_CLASS] = ELF_CLASS; ehdr->e_ident[EI_DATA] = ELF_DATA; ehdr->e_ident[EI_VERSION] = EV_CURRENT; ehdr->e_ident[EI_OSABI] = ELFOSABI_FREEBSD; ehdr->e_ident[EI_ABIVERSION] = 0; ehdr->e_ident[EI_PAD] = 0; ehdr->e_type = ET_CORE; #if defined(COMPAT_FREEBSD32) && __ELF_WORD_SIZE == 32 ehdr->e_machine = ELF_ARCH32; #else ehdr->e_machine = ELF_ARCH; #endif ehdr->e_version = EV_CURRENT; ehdr->e_entry = 0; ehdr->e_phoff = sizeof(Elf_Ehdr); ehdr->e_flags = 0; ehdr->e_ehsize = sizeof(Elf_Ehdr); ehdr->e_phentsize = sizeof(Elf_Phdr); - ehdr->e_phnum = numsegs + 1; ehdr->e_shentsize = sizeof(Elf_Shdr); - ehdr->e_shnum = 0; ehdr->e_shstrndx = SHN_UNDEF; + if (numsegs + 1 < PN_XNUM) { + ehdr->e_phnum = numsegs + 1; + ehdr->e_shnum = 0; + } else { + ehdr->e_phnum = PN_XNUM; + ehdr->e_shnum = 1; + ehdr->e_shoff = ehdr->e_phoff + + (numsegs + 1) * ehdr->e_phentsize; + KASSERT(ehdr->e_shoff == hdrsize - sizeof(Elf_Shdr), + ("e_shoff: %zu, hdrsize - shdr: %zu", + (size_t)ehdr->e_shoff, hdrsize - sizeof(Elf_Shdr))); + + shdr = (Elf_Shdr *)((char *)hdr + ehdr->e_shoff); + memset(shdr, 0, sizeof(*shdr)); + /* + * A special first section is used to hold large segment and + * section counts. This was proposed by Sun Microsystems in + * Solaris and has been adopted by Linux; the standard ELF + * tools are already familiar with the technique. + * + * See table 7-7 of the Solaris "Linker and Libraries Guide" + * (or 12-7 depending on the version of the document) for more + * details. + */ + shdr->sh_type = SHT_NULL; + shdr->sh_size = ehdr->e_shnum; + shdr->sh_link = ehdr->e_shstrndx; + shdr->sh_info = numsegs + 1; + } + /* * Fill in the program header entries. */ + phdr = (Elf_Phdr *)((char *)hdr + ehdr->e_phoff); /* The note segement. */ phdr->p_type = PT_NOTE; phdr->p_offset = hdrsize; phdr->p_vaddr = 0; phdr->p_paddr = 0; phdr->p_filesz = notesz; phdr->p_memsz = 0; phdr->p_flags = PF_R; phdr->p_align = ELF_NOTE_ROUNDSIZE; phdr++; /* All the writable segments from the program. */ phc.phdr = phdr; phc.offset = round_page(hdrsize + notesz); - each_writable_segment(td, cb_put_phdr, &phc); + each_dumpable_segment(td, cb_put_phdr, &phc); } static size_t register_note(struct note_info_list *list, int type, outfunc_t out, void *arg) { struct note_info *ninfo; size_t size, notesize; size = 0; out(arg, NULL, &size); ninfo = malloc(sizeof(*ninfo), M_TEMP, M_ZERO | M_WAITOK); ninfo->type = type; ninfo->outfunc = out; ninfo->outarg = arg; ninfo->outsize = size; TAILQ_INSERT_TAIL(list, ninfo, link); if (type == -1) return (size); notesize = sizeof(Elf_Note) + /* note header */ roundup2(sizeof(FREEBSD_ABI_VENDOR), ELF_NOTE_ROUNDSIZE) + /* note name */ roundup2(size, ELF_NOTE_ROUNDSIZE); /* note description */ return (notesize); } static size_t append_note_data(const void *src, void *dst, size_t len) { size_t padded_len; padded_len = roundup2(len, ELF_NOTE_ROUNDSIZE); if (dst != NULL) { bcopy(src, dst, len); bzero((char *)dst + len, padded_len - len); } return (padded_len); } size_t __elfN(populate_note)(int type, void *src, void *dst, size_t size, void **descp) { Elf_Note *note; char *buf; size_t notesize; buf = dst; if (buf != NULL) { note = (Elf_Note *)buf; note->n_namesz = sizeof(FREEBSD_ABI_VENDOR); note->n_descsz = size; note->n_type = type; buf += sizeof(*note); buf += append_note_data(FREEBSD_ABI_VENDOR, buf, sizeof(FREEBSD_ABI_VENDOR)); append_note_data(src, buf, size); if (descp != NULL) *descp = buf; } notesize = sizeof(Elf_Note) + /* note header */ roundup2(sizeof(FREEBSD_ABI_VENDOR), ELF_NOTE_ROUNDSIZE) + /* note name */ roundup2(size, ELF_NOTE_ROUNDSIZE); /* note description */ return (notesize); } static void __elfN(putnote)(struct note_info *ninfo, struct sbuf *sb) { Elf_Note note; ssize_t old_len, sect_len; size_t new_len, descsz, i; if (ninfo->type == -1) { ninfo->outfunc(ninfo->outarg, sb, &ninfo->outsize); return; } note.n_namesz = sizeof(FREEBSD_ABI_VENDOR); note.n_descsz = ninfo->outsize; note.n_type = ninfo->type; sbuf_bcat(sb, ¬e, sizeof(note)); sbuf_start_section(sb, &old_len); sbuf_bcat(sb, FREEBSD_ABI_VENDOR, sizeof(FREEBSD_ABI_VENDOR)); sbuf_end_section(sb, old_len, ELF_NOTE_ROUNDSIZE, 0); if (note.n_descsz == 0) return; sbuf_start_section(sb, &old_len); ninfo->outfunc(ninfo->outarg, sb, &ninfo->outsize); sect_len = sbuf_end_section(sb, old_len, ELF_NOTE_ROUNDSIZE, 0); if (sect_len < 0) return; new_len = (size_t)sect_len; descsz = roundup(note.n_descsz, ELF_NOTE_ROUNDSIZE); if (new_len < descsz) { /* * It is expected that individual note emitters will correctly * predict their expected output size and fill up to that size * themselves, padding in a format-specific way if needed. * However, in case they don't, just do it here with zeros. */ for (i = 0; i < descsz - new_len; i++) sbuf_putc(sb, 0); } else if (new_len > descsz) { /* * We can't always truncate sb -- we may have drained some * of it already. */ KASSERT(new_len == descsz, ("%s: Note type %u changed as we " "read it (%zu > %zu). Since it is longer than " "expected, this coredump's notes are corrupt. THIS " "IS A BUG in the note_procstat routine for type %u.\n", __func__, (unsigned)note.n_type, new_len, descsz, (unsigned)note.n_type)); } } /* * Miscellaneous note out functions. */ #if defined(COMPAT_FREEBSD32) && __ELF_WORD_SIZE == 32 #include typedef struct prstatus32 elf_prstatus_t; typedef struct prpsinfo32 elf_prpsinfo_t; typedef struct fpreg32 elf_prfpregset_t; typedef struct fpreg32 elf_fpregset_t; typedef struct reg32 elf_gregset_t; typedef struct thrmisc32 elf_thrmisc_t; #define ELF_KERN_PROC_MASK KERN_PROC_MASK32 typedef struct kinfo_proc32 elf_kinfo_proc_t; typedef uint32_t elf_ps_strings_t; #else typedef prstatus_t elf_prstatus_t; typedef prpsinfo_t elf_prpsinfo_t; typedef prfpregset_t elf_prfpregset_t; typedef prfpregset_t elf_fpregset_t; typedef gregset_t elf_gregset_t; typedef thrmisc_t elf_thrmisc_t; #define ELF_KERN_PROC_MASK 0 typedef struct kinfo_proc elf_kinfo_proc_t; typedef vm_offset_t elf_ps_strings_t; #endif static void __elfN(note_prpsinfo)(void *arg, struct sbuf *sb, size_t *sizep) { struct sbuf sbarg; size_t len; char *cp, *end; struct proc *p; elf_prpsinfo_t *psinfo; int error; p = (struct proc *)arg; if (sb != NULL) { KASSERT(*sizep == sizeof(*psinfo), ("invalid size")); psinfo = malloc(sizeof(*psinfo), M_TEMP, M_ZERO | M_WAITOK); psinfo->pr_version = PRPSINFO_VERSION; psinfo->pr_psinfosz = sizeof(elf_prpsinfo_t); strlcpy(psinfo->pr_fname, p->p_comm, sizeof(psinfo->pr_fname)); PROC_LOCK(p); if (p->p_args != NULL) { len = sizeof(psinfo->pr_psargs) - 1; if (len > p->p_args->ar_length) len = p->p_args->ar_length; memcpy(psinfo->pr_psargs, p->p_args->ar_args, len); PROC_UNLOCK(p); error = 0; } else { _PHOLD(p); PROC_UNLOCK(p); sbuf_new(&sbarg, psinfo->pr_psargs, sizeof(psinfo->pr_psargs), SBUF_FIXEDLEN); error = proc_getargv(curthread, p, &sbarg); PRELE(p); if (sbuf_finish(&sbarg) == 0) len = sbuf_len(&sbarg) - 1; else len = sizeof(psinfo->pr_psargs) - 1; sbuf_delete(&sbarg); } if (error || len == 0) strlcpy(psinfo->pr_psargs, p->p_comm, sizeof(psinfo->pr_psargs)); else { KASSERT(len < sizeof(psinfo->pr_psargs), ("len is too long: %zu vs %zu", len, sizeof(psinfo->pr_psargs))); cp = psinfo->pr_psargs; end = cp + len - 1; for (;;) { cp = memchr(cp, '\0', end - cp); if (cp == NULL) break; *cp = ' '; } } psinfo->pr_pid = p->p_pid; sbuf_bcat(sb, psinfo, sizeof(*psinfo)); free(psinfo, M_TEMP); } *sizep = sizeof(*psinfo); } static void __elfN(note_prstatus)(void *arg, struct sbuf *sb, size_t *sizep) { struct thread *td; elf_prstatus_t *status; td = (struct thread *)arg; if (sb != NULL) { KASSERT(*sizep == sizeof(*status), ("invalid size")); status = malloc(sizeof(*status), M_TEMP, M_ZERO | M_WAITOK); status->pr_version = PRSTATUS_VERSION; status->pr_statussz = sizeof(elf_prstatus_t); status->pr_gregsetsz = sizeof(elf_gregset_t); status->pr_fpregsetsz = sizeof(elf_fpregset_t); status->pr_osreldate = osreldate; status->pr_cursig = td->td_proc->p_sig; status->pr_pid = td->td_tid; #if defined(COMPAT_FREEBSD32) && __ELF_WORD_SIZE == 32 fill_regs32(td, &status->pr_reg); #else fill_regs(td, &status->pr_reg); #endif sbuf_bcat(sb, status, sizeof(*status)); free(status, M_TEMP); } *sizep = sizeof(*status); } static void __elfN(note_fpregset)(void *arg, struct sbuf *sb, size_t *sizep) { struct thread *td; elf_prfpregset_t *fpregset; td = (struct thread *)arg; if (sb != NULL) { KASSERT(*sizep == sizeof(*fpregset), ("invalid size")); fpregset = malloc(sizeof(*fpregset), M_TEMP, M_ZERO | M_WAITOK); #if defined(COMPAT_FREEBSD32) && __ELF_WORD_SIZE == 32 fill_fpregs32(td, fpregset); #else fill_fpregs(td, fpregset); #endif sbuf_bcat(sb, fpregset, sizeof(*fpregset)); free(fpregset, M_TEMP); } *sizep = sizeof(*fpregset); } static void __elfN(note_thrmisc)(void *arg, struct sbuf *sb, size_t *sizep) { struct thread *td; elf_thrmisc_t thrmisc; td = (struct thread *)arg; if (sb != NULL) { KASSERT(*sizep == sizeof(thrmisc), ("invalid size")); bzero(&thrmisc._pad, sizeof(thrmisc._pad)); strcpy(thrmisc.pr_tname, td->td_name); sbuf_bcat(sb, &thrmisc, sizeof(thrmisc)); } *sizep = sizeof(thrmisc); } /* * Allow for MD specific notes, as well as any MD * specific preparations for writing MI notes. */ static void __elfN(note_threadmd)(void *arg, struct sbuf *sb, size_t *sizep) { struct thread *td; void *buf; size_t size; td = (struct thread *)arg; size = *sizep; if (size != 0 && sb != NULL) buf = malloc(size, M_TEMP, M_ZERO | M_WAITOK); else buf = NULL; size = 0; __elfN(dump_thread)(td, buf, &size); KASSERT(sb == NULL || *sizep == size, ("invalid size")); if (size != 0 && sb != NULL) sbuf_bcat(sb, buf, size); free(buf, M_TEMP); *sizep = size; } #ifdef KINFO_PROC_SIZE CTASSERT(sizeof(struct kinfo_proc) == KINFO_PROC_SIZE); #endif static void __elfN(note_procstat_proc)(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; size_t size; int structsize; p = (struct proc *)arg; size = sizeof(structsize) + p->p_numthreads * sizeof(elf_kinfo_proc_t); if (sb != NULL) { KASSERT(*sizep == size, ("invalid size")); structsize = sizeof(elf_kinfo_proc_t); sbuf_bcat(sb, &structsize, sizeof(structsize)); sx_slock(&proctree_lock); PROC_LOCK(p); kern_proc_out(p, sb, ELF_KERN_PROC_MASK); sx_sunlock(&proctree_lock); } *sizep = size; } #ifdef KINFO_FILE_SIZE CTASSERT(sizeof(struct kinfo_file) == KINFO_FILE_SIZE); #endif static void note_procstat_files(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; size_t size, sect_sz, i; ssize_t start_len, sect_len; int structsize, filedesc_flags; if (coredump_pack_fileinfo) filedesc_flags = KERN_FILEDESC_PACK_KINFO; else filedesc_flags = 0; p = (struct proc *)arg; structsize = sizeof(struct kinfo_file); if (sb == NULL) { size = 0; sb = sbuf_new(NULL, NULL, 128, SBUF_FIXEDLEN); sbuf_set_drain(sb, sbuf_drain_count, &size); sbuf_bcat(sb, &structsize, sizeof(structsize)); PROC_LOCK(p); kern_proc_filedesc_out(p, sb, -1, filedesc_flags); sbuf_finish(sb); sbuf_delete(sb); *sizep = size; } else { sbuf_start_section(sb, &start_len); sbuf_bcat(sb, &structsize, sizeof(structsize)); PROC_LOCK(p); kern_proc_filedesc_out(p, sb, *sizep - sizeof(structsize), filedesc_flags); sect_len = sbuf_end_section(sb, start_len, 0, 0); if (sect_len < 0) return; sect_sz = sect_len; KASSERT(sect_sz <= *sizep, ("kern_proc_filedesc_out did not respect maxlen; " "requested %zu, got %zu", *sizep - sizeof(structsize), sect_sz - sizeof(structsize))); for (i = 0; i < *sizep - sect_sz && sb->s_error == 0; i++) sbuf_putc(sb, 0); } } #ifdef KINFO_VMENTRY_SIZE CTASSERT(sizeof(struct kinfo_vmentry) == KINFO_VMENTRY_SIZE); #endif static void note_procstat_vmmap(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; size_t size; int structsize, vmmap_flags; if (coredump_pack_vmmapinfo) vmmap_flags = KERN_VMMAP_PACK_KINFO; else vmmap_flags = 0; p = (struct proc *)arg; structsize = sizeof(struct kinfo_vmentry); if (sb == NULL) { size = 0; sb = sbuf_new(NULL, NULL, 128, SBUF_FIXEDLEN); sbuf_set_drain(sb, sbuf_drain_count, &size); sbuf_bcat(sb, &structsize, sizeof(structsize)); PROC_LOCK(p); kern_proc_vmmap_out(p, sb, -1, vmmap_flags); sbuf_finish(sb); sbuf_delete(sb); *sizep = size; } else { sbuf_bcat(sb, &structsize, sizeof(structsize)); PROC_LOCK(p); kern_proc_vmmap_out(p, sb, *sizep - sizeof(structsize), vmmap_flags); } } static void note_procstat_groups(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; size_t size; int structsize; p = (struct proc *)arg; size = sizeof(structsize) + p->p_ucred->cr_ngroups * sizeof(gid_t); if (sb != NULL) { KASSERT(*sizep == size, ("invalid size")); structsize = sizeof(gid_t); sbuf_bcat(sb, &structsize, sizeof(structsize)); sbuf_bcat(sb, p->p_ucred->cr_groups, p->p_ucred->cr_ngroups * sizeof(gid_t)); } *sizep = size; } static void note_procstat_umask(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; size_t size; int structsize; p = (struct proc *)arg; size = sizeof(structsize) + sizeof(p->p_fd->fd_cmask); if (sb != NULL) { KASSERT(*sizep == size, ("invalid size")); structsize = sizeof(p->p_fd->fd_cmask); sbuf_bcat(sb, &structsize, sizeof(structsize)); sbuf_bcat(sb, &p->p_fd->fd_cmask, sizeof(p->p_fd->fd_cmask)); } *sizep = size; } static void note_procstat_rlimit(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; struct rlimit rlim[RLIM_NLIMITS]; size_t size; int structsize, i; p = (struct proc *)arg; size = sizeof(structsize) + sizeof(rlim); if (sb != NULL) { KASSERT(*sizep == size, ("invalid size")); structsize = sizeof(rlim); sbuf_bcat(sb, &structsize, sizeof(structsize)); PROC_LOCK(p); for (i = 0; i < RLIM_NLIMITS; i++) lim_rlimit_proc(p, i, &rlim[i]); PROC_UNLOCK(p); sbuf_bcat(sb, rlim, sizeof(rlim)); } *sizep = size; } static void note_procstat_osrel(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; size_t size; int structsize; p = (struct proc *)arg; size = sizeof(structsize) + sizeof(p->p_osrel); if (sb != NULL) { KASSERT(*sizep == size, ("invalid size")); structsize = sizeof(p->p_osrel); sbuf_bcat(sb, &structsize, sizeof(structsize)); sbuf_bcat(sb, &p->p_osrel, sizeof(p->p_osrel)); } *sizep = size; } static void __elfN(note_procstat_psstrings)(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; elf_ps_strings_t ps_strings; size_t size; int structsize; p = (struct proc *)arg; size = sizeof(structsize) + sizeof(ps_strings); if (sb != NULL) { KASSERT(*sizep == size, ("invalid size")); structsize = sizeof(ps_strings); #if defined(COMPAT_FREEBSD32) && __ELF_WORD_SIZE == 32 ps_strings = PTROUT(p->p_sysent->sv_psstrings); #else ps_strings = p->p_sysent->sv_psstrings; #endif sbuf_bcat(sb, &structsize, sizeof(structsize)); sbuf_bcat(sb, &ps_strings, sizeof(ps_strings)); } *sizep = size; } static void __elfN(note_procstat_auxv)(void *arg, struct sbuf *sb, size_t *sizep) { struct proc *p; size_t size; int structsize; p = (struct proc *)arg; if (sb == NULL) { size = 0; sb = sbuf_new(NULL, NULL, 128, SBUF_FIXEDLEN); sbuf_set_drain(sb, sbuf_drain_count, &size); sbuf_bcat(sb, &structsize, sizeof(structsize)); PHOLD(p); proc_getauxv(curthread, p, sb); PRELE(p); sbuf_finish(sb); sbuf_delete(sb); *sizep = size; } else { structsize = sizeof(Elf_Auxinfo); sbuf_bcat(sb, &structsize, sizeof(structsize)); PHOLD(p); proc_getauxv(curthread, p, sb); PRELE(p); } } static boolean_t __elfN(parse_notes)(struct image_params *imgp, Elf_Brandnote *checknote, int32_t *osrel, const Elf_Phdr *pnote) { const Elf_Note *note, *note0, *note_end; const char *note_name; char *buf; int i, error; boolean_t res; /* We need some limit, might as well use PAGE_SIZE. */ if (pnote == NULL || pnote->p_filesz > PAGE_SIZE) return (FALSE); ASSERT_VOP_LOCKED(imgp->vp, "parse_notes"); if (pnote->p_offset > PAGE_SIZE || pnote->p_filesz > PAGE_SIZE - pnote->p_offset) { VOP_UNLOCK(imgp->vp, 0); buf = malloc(pnote->p_filesz, M_TEMP, M_WAITOK); vn_lock(imgp->vp, LK_EXCLUSIVE | LK_RETRY); error = vn_rdwr(UIO_READ, imgp->vp, buf, pnote->p_filesz, pnote->p_offset, UIO_SYSSPACE, IO_NODELOCKED, curthread->td_ucred, NOCRED, NULL, curthread); if (error != 0) { uprintf("i/o error PT_NOTE\n"); res = FALSE; goto ret; } note = note0 = (const Elf_Note *)buf; note_end = (const Elf_Note *)(buf + pnote->p_filesz); } else { note = note0 = (const Elf_Note *)(imgp->image_header + pnote->p_offset); note_end = (const Elf_Note *)(imgp->image_header + pnote->p_offset + pnote->p_filesz); buf = NULL; } for (i = 0; i < 100 && note >= note0 && note < note_end; i++) { if (!aligned(note, Elf32_Addr) || (const char *)note_end - (const char *)note < sizeof(Elf_Note)) { res = FALSE; goto ret; } if (note->n_namesz != checknote->hdr.n_namesz || note->n_descsz != checknote->hdr.n_descsz || note->n_type != checknote->hdr.n_type) goto nextnote; note_name = (const char *)(note + 1); if (note_name + checknote->hdr.n_namesz >= (const char *)note_end || strncmp(checknote->vendor, note_name, checknote->hdr.n_namesz) != 0) goto nextnote; /* * Fetch the osreldate for binary * from the ELF OSABI-note if necessary. */ if ((checknote->flags & BN_TRANSLATE_OSREL) != 0 && checknote->trans_osrel != NULL) { res = checknote->trans_osrel(note, osrel); goto ret; } res = TRUE; goto ret; nextnote: note = (const Elf_Note *)((const char *)(note + 1) + roundup2(note->n_namesz, ELF_NOTE_ROUNDSIZE) + roundup2(note->n_descsz, ELF_NOTE_ROUNDSIZE)); } res = FALSE; ret: free(buf, M_TEMP); return (res); } /* * Try to find the appropriate ABI-note section for checknote, * fetch the osreldate for binary from the ELF OSABI-note. Only the * first page of the image is searched, the same as for headers. */ static boolean_t __elfN(check_note)(struct image_params *imgp, Elf_Brandnote *checknote, int32_t *osrel) { const Elf_Phdr *phdr; const Elf_Ehdr *hdr; int i; hdr = (const Elf_Ehdr *)imgp->image_header; phdr = (const Elf_Phdr *)(imgp->image_header + hdr->e_phoff); for (i = 0; i < hdr->e_phnum; i++) { if (phdr[i].p_type == PT_NOTE && __elfN(parse_notes)(imgp, checknote, osrel, &phdr[i])) return (TRUE); } return (FALSE); } /* * Tell kern_execve.c about it, with a little help from the linker. */ static struct execsw __elfN(execsw) = { __CONCAT(exec_, __elfN(imgact)), __XSTRING(__CONCAT(ELF, __ELF_WORD_SIZE)) }; EXEC_SET(__CONCAT(elf, __ELF_WORD_SIZE), __elfN(execsw)); static vm_prot_t __elfN(trans_prot)(Elf_Word flags) { vm_prot_t prot; prot = 0; if (flags & PF_X) prot |= VM_PROT_EXECUTE; if (flags & PF_W) prot |= VM_PROT_WRITE; if (flags & PF_R) prot |= VM_PROT_READ; #if __ELF_WORD_SIZE == 32 #if defined(__amd64__) if (i386_read_exec && (flags & PF_R)) prot |= VM_PROT_EXECUTE; #endif #endif return (prot); } static Elf_Word __elfN(untrans_prot)(vm_prot_t prot) { Elf_Word flags; flags = 0; if (prot & VM_PROT_EXECUTE) flags |= PF_X; if (prot & VM_PROT_READ) flags |= PF_R; if (prot & VM_PROT_WRITE) flags |= PF_W; return (flags); } Index: user/alc/PQ_LAUNDRY/sys/kern/kern_timeout.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/kern/kern_timeout.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/kern/kern_timeout.c (revision 303206) @@ -1,1670 +1,1654 @@ /*- * Copyright (c) 1982, 1986, 1991, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * From: @(#)kern_clock.c 8.5 (Berkeley) 1/21/94 */ #include __FBSDID("$FreeBSD$"); #include "opt_callout_profiling.h" #include "opt_ddb.h" #if defined(__arm__) #include "opt_timer.h" #endif #include "opt_rss.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef DDB #include #include #endif #ifdef SMP #include #endif #ifndef NO_EVENTTIMERS DPCPU_DECLARE(sbintime_t, hardclocktime); #endif SDT_PROVIDER_DEFINE(callout_execute); SDT_PROBE_DEFINE1(callout_execute, , , callout__start, "struct callout *"); SDT_PROBE_DEFINE1(callout_execute, , , callout__end, "struct callout *"); #ifdef CALLOUT_PROFILING static int avg_depth; SYSCTL_INT(_debug, OID_AUTO, to_avg_depth, CTLFLAG_RD, &avg_depth, 0, "Average number of items examined per softclock call. Units = 1/1000"); static int avg_gcalls; SYSCTL_INT(_debug, OID_AUTO, to_avg_gcalls, CTLFLAG_RD, &avg_gcalls, 0, "Average number of Giant callouts made per softclock call. Units = 1/1000"); static int avg_lockcalls; SYSCTL_INT(_debug, OID_AUTO, to_avg_lockcalls, CTLFLAG_RD, &avg_lockcalls, 0, "Average number of lock callouts made per softclock call. Units = 1/1000"); static int avg_mpcalls; SYSCTL_INT(_debug, OID_AUTO, to_avg_mpcalls, CTLFLAG_RD, &avg_mpcalls, 0, "Average number of MP callouts made per softclock call. Units = 1/1000"); static int avg_depth_dir; SYSCTL_INT(_debug, OID_AUTO, to_avg_depth_dir, CTLFLAG_RD, &avg_depth_dir, 0, "Average number of direct callouts examined per callout_process call. " "Units = 1/1000"); static int avg_lockcalls_dir; SYSCTL_INT(_debug, OID_AUTO, to_avg_lockcalls_dir, CTLFLAG_RD, &avg_lockcalls_dir, 0, "Average number of lock direct callouts made per " "callout_process call. Units = 1/1000"); static int avg_mpcalls_dir; SYSCTL_INT(_debug, OID_AUTO, to_avg_mpcalls_dir, CTLFLAG_RD, &avg_mpcalls_dir, 0, "Average number of MP direct callouts made per callout_process call. " "Units = 1/1000"); #endif static int ncallout; SYSCTL_INT(_kern, OID_AUTO, ncallout, CTLFLAG_RDTUN | CTLFLAG_NOFETCH, &ncallout, 0, "Number of entries in callwheel and size of timeout() preallocation"); #ifdef RSS static int pin_default_swi = 1; static int pin_pcpu_swi = 1; #else static int pin_default_swi = 0; static int pin_pcpu_swi = 0; #endif SYSCTL_INT(_kern, OID_AUTO, pin_default_swi, CTLFLAG_RDTUN | CTLFLAG_NOFETCH, &pin_default_swi, 0, "Pin the default (non-per-cpu) swi (shared with PCPU 0 swi)"); SYSCTL_INT(_kern, OID_AUTO, pin_pcpu_swi, CTLFLAG_RDTUN | CTLFLAG_NOFETCH, &pin_pcpu_swi, 0, "Pin the per-CPU swis (except PCPU 0, which is also default"); /* * TODO: * allocate more timeout table slots when table overflows. */ u_int callwheelsize, callwheelmask; /* * The callout cpu exec entities represent informations necessary for * describing the state of callouts currently running on the CPU and the ones * necessary for migrating callouts to the new callout cpu. In particular, * the first entry of the array cc_exec_entity holds informations for callout * running in SWI thread context, while the second one holds informations * for callout running directly from hardware interrupt context. * The cached informations are very important for deferring migration when * the migrating callout is already running. */ struct cc_exec { struct callout *cc_curr; void (*cc_drain)(void *); #ifdef SMP void (*ce_migration_func)(void *); void *ce_migration_arg; int ce_migration_cpu; sbintime_t ce_migration_time; sbintime_t ce_migration_prec; #endif bool cc_cancel; bool cc_waiting; }; /* * There is one struct callout_cpu per cpu, holding all relevant * state for the callout processing thread on the individual CPU. */ struct callout_cpu { struct mtx_padalign cc_lock; struct cc_exec cc_exec_entity[2]; struct callout *cc_next; struct callout *cc_callout; struct callout_list *cc_callwheel; struct callout_tailq cc_expireq; struct callout_slist cc_callfree; sbintime_t cc_firstevent; sbintime_t cc_lastscan; void *cc_cookie; u_int cc_bucket; u_int cc_inited; char cc_ktr_event_name[20]; }; #define callout_migrating(c) ((c)->c_iflags & CALLOUT_DFRMIGRATION) #define cc_exec_curr(cc, dir) cc->cc_exec_entity[dir].cc_curr #define cc_exec_drain(cc, dir) cc->cc_exec_entity[dir].cc_drain #define cc_exec_next(cc) cc->cc_next #define cc_exec_cancel(cc, dir) cc->cc_exec_entity[dir].cc_cancel #define cc_exec_waiting(cc, dir) cc->cc_exec_entity[dir].cc_waiting #ifdef SMP #define cc_migration_func(cc, dir) cc->cc_exec_entity[dir].ce_migration_func #define cc_migration_arg(cc, dir) cc->cc_exec_entity[dir].ce_migration_arg #define cc_migration_cpu(cc, dir) cc->cc_exec_entity[dir].ce_migration_cpu #define cc_migration_time(cc, dir) cc->cc_exec_entity[dir].ce_migration_time #define cc_migration_prec(cc, dir) cc->cc_exec_entity[dir].ce_migration_prec struct callout_cpu cc_cpu[MAXCPU]; #define CPUBLOCK MAXCPU #define CC_CPU(cpu) (&cc_cpu[(cpu)]) #define CC_SELF() CC_CPU(PCPU_GET(cpuid)) #else struct callout_cpu cc_cpu; #define CC_CPU(cpu) &cc_cpu #define CC_SELF() &cc_cpu #endif #define CC_LOCK(cc) mtx_lock_spin(&(cc)->cc_lock) #define CC_UNLOCK(cc) mtx_unlock_spin(&(cc)->cc_lock) #define CC_LOCK_ASSERT(cc) mtx_assert(&(cc)->cc_lock, MA_OWNED) static int timeout_cpu; static void callout_cpu_init(struct callout_cpu *cc, int cpu); static void softclock_call_cc(struct callout *c, struct callout_cpu *cc, #ifdef CALLOUT_PROFILING int *mpcalls, int *lockcalls, int *gcalls, #endif int direct); static MALLOC_DEFINE(M_CALLOUT, "callout", "Callout datastructures"); /** * Locked by cc_lock: * cc_curr - If a callout is in progress, it is cc_curr. * If cc_curr is non-NULL, threads waiting in * callout_drain() will be woken up as soon as the * relevant callout completes. * cc_cancel - Changing to 1 with both callout_lock and cc_lock held * guarantees that the current callout will not run. * The softclock() function sets this to 0 before it * drops callout_lock to acquire c_lock, and it calls * the handler only if curr_cancelled is still 0 after * cc_lock is successfully acquired. * cc_waiting - If a thread is waiting in callout_drain(), then * callout_wait is nonzero. Set only when * cc_curr is non-NULL. */ /* * Resets the execution entity tied to a specific callout cpu. */ static void cc_cce_cleanup(struct callout_cpu *cc, int direct) { cc_exec_curr(cc, direct) = NULL; cc_exec_cancel(cc, direct) = false; cc_exec_waiting(cc, direct) = false; #ifdef SMP cc_migration_cpu(cc, direct) = CPUBLOCK; cc_migration_time(cc, direct) = 0; cc_migration_prec(cc, direct) = 0; cc_migration_func(cc, direct) = NULL; cc_migration_arg(cc, direct) = NULL; #endif } /* * Checks if migration is requested by a specific callout cpu. */ static int cc_cce_migrating(struct callout_cpu *cc, int direct) { #ifdef SMP return (cc_migration_cpu(cc, direct) != CPUBLOCK); #else return (0); #endif } /* * Kernel low level callwheel initialization * called on cpu0 during kernel startup. */ static void callout_callwheel_init(void *dummy) { struct callout_cpu *cc; /* * Calculate the size of the callout wheel and the preallocated * timeout() structures. * XXX: Clip callout to result of previous function of maxusers * maximum 384. This is still huge, but acceptable. */ memset(CC_CPU(0), 0, sizeof(cc_cpu)); ncallout = imin(16 + maxproc + maxfiles, 18508); TUNABLE_INT_FETCH("kern.ncallout", &ncallout); /* * Calculate callout wheel size, should be next power of two higher * than 'ncallout'. */ callwheelsize = 1 << fls(ncallout); callwheelmask = callwheelsize - 1; /* * Fetch whether we're pinning the swi's or not. */ TUNABLE_INT_FETCH("kern.pin_default_swi", &pin_default_swi); TUNABLE_INT_FETCH("kern.pin_pcpu_swi", &pin_pcpu_swi); /* * Only cpu0 handles timeout(9) and receives a preallocation. * * XXX: Once all timeout(9) consumers are converted this can * be removed. */ timeout_cpu = PCPU_GET(cpuid); cc = CC_CPU(timeout_cpu); cc->cc_callout = malloc(ncallout * sizeof(struct callout), M_CALLOUT, M_WAITOK); callout_cpu_init(cc, timeout_cpu); } SYSINIT(callwheel_init, SI_SUB_CPU, SI_ORDER_ANY, callout_callwheel_init, NULL); /* * Initialize the per-cpu callout structures. */ static void callout_cpu_init(struct callout_cpu *cc, int cpu) { struct callout *c; int i; mtx_init(&cc->cc_lock, "callout", NULL, MTX_SPIN | MTX_RECURSE); SLIST_INIT(&cc->cc_callfree); cc->cc_inited = 1; cc->cc_callwheel = malloc(sizeof(struct callout_list) * callwheelsize, M_CALLOUT, M_WAITOK); for (i = 0; i < callwheelsize; i++) LIST_INIT(&cc->cc_callwheel[i]); TAILQ_INIT(&cc->cc_expireq); cc->cc_firstevent = SBT_MAX; for (i = 0; i < 2; i++) cc_cce_cleanup(cc, i); snprintf(cc->cc_ktr_event_name, sizeof(cc->cc_ktr_event_name), "callwheel cpu %d", cpu); if (cc->cc_callout == NULL) /* Only cpu0 handles timeout(9) */ return; for (i = 0; i < ncallout; i++) { c = &cc->cc_callout[i]; callout_init(c, 0); c->c_iflags = CALLOUT_LOCAL_ALLOC; SLIST_INSERT_HEAD(&cc->cc_callfree, c, c_links.sle); } } #ifdef SMP /* * Switches the cpu tied to a specific callout. * The function expects a locked incoming callout cpu and returns with * locked outcoming callout cpu. */ static struct callout_cpu * callout_cpu_switch(struct callout *c, struct callout_cpu *cc, int new_cpu) { struct callout_cpu *new_cc; MPASS(c != NULL && cc != NULL); CC_LOCK_ASSERT(cc); /* * Avoid interrupts and preemption firing after the callout cpu * is blocked in order to avoid deadlocks as the new thread * may be willing to acquire the callout cpu lock. */ c->c_cpu = CPUBLOCK; spinlock_enter(); CC_UNLOCK(cc); new_cc = CC_CPU(new_cpu); CC_LOCK(new_cc); spinlock_exit(); c->c_cpu = new_cpu; return (new_cc); } #endif /* * Start standard softclock thread. */ static void start_softclock(void *dummy) { struct callout_cpu *cc; char name[MAXCOMLEN]; #ifdef SMP int cpu; struct intr_event *ie; #endif cc = CC_CPU(timeout_cpu); snprintf(name, sizeof(name), "clock (%d)", timeout_cpu); if (swi_add(&clk_intr_event, name, softclock, cc, SWI_CLOCK, INTR_MPSAFE, &cc->cc_cookie)) panic("died while creating standard software ithreads"); if (pin_default_swi && (intr_event_bind(clk_intr_event, timeout_cpu) != 0)) { printf("%s: timeout clock couldn't be pinned to cpu %d\n", __func__, timeout_cpu); } #ifdef SMP CPU_FOREACH(cpu) { if (cpu == timeout_cpu) continue; cc = CC_CPU(cpu); cc->cc_callout = NULL; /* Only cpu0 handles timeout(9). */ callout_cpu_init(cc, cpu); snprintf(name, sizeof(name), "clock (%d)", cpu); ie = NULL; if (swi_add(&ie, name, softclock, cc, SWI_CLOCK, INTR_MPSAFE, &cc->cc_cookie)) panic("died while creating standard software ithreads"); if (pin_pcpu_swi && (intr_event_bind(ie, cpu) != 0)) { printf("%s: per-cpu clock couldn't be pinned to " "cpu %d\n", __func__, cpu); } } #endif } SYSINIT(start_softclock, SI_SUB_SOFTINTR, SI_ORDER_FIRST, start_softclock, NULL); #define CC_HASH_SHIFT 8 static inline u_int callout_hash(sbintime_t sbt) { return (sbt >> (32 - CC_HASH_SHIFT)); } static inline u_int callout_get_bucket(sbintime_t sbt) { return (callout_hash(sbt) & callwheelmask); } void callout_process(sbintime_t now) { struct callout *tmp, *tmpn; struct callout_cpu *cc; struct callout_list *sc; sbintime_t first, last, max, tmp_max; uint32_t lookahead; u_int firstb, lastb, nowb; #ifdef CALLOUT_PROFILING int depth_dir = 0, mpcalls_dir = 0, lockcalls_dir = 0; #endif cc = CC_SELF(); mtx_lock_spin_flags(&cc->cc_lock, MTX_QUIET); /* Compute the buckets of the last scan and present times. */ firstb = callout_hash(cc->cc_lastscan); cc->cc_lastscan = now; nowb = callout_hash(now); /* Compute the last bucket and minimum time of the bucket after it. */ if (nowb == firstb) lookahead = (SBT_1S / 16); else if (nowb - firstb == 1) lookahead = (SBT_1S / 8); else lookahead = (SBT_1S / 2); first = last = now; first += (lookahead / 2); last += lookahead; last &= (0xffffffffffffffffLLU << (32 - CC_HASH_SHIFT)); lastb = callout_hash(last) - 1; max = last; /* * Check if we wrapped around the entire wheel from the last scan. * In case, we need to scan entirely the wheel for pending callouts. */ if (lastb - firstb >= callwheelsize) { lastb = firstb + callwheelsize - 1; if (nowb - firstb >= callwheelsize) nowb = lastb; } /* Iterate callwheel from firstb to nowb and then up to lastb. */ do { sc = &cc->cc_callwheel[firstb & callwheelmask]; tmp = LIST_FIRST(sc); while (tmp != NULL) { /* Run the callout if present time within allowed. */ if (tmp->c_time <= now) { /* * Consumer told us the callout may be run * directly from hardware interrupt context. */ if (tmp->c_iflags & CALLOUT_DIRECT) { #ifdef CALLOUT_PROFILING ++depth_dir; #endif cc_exec_next(cc) = LIST_NEXT(tmp, c_links.le); cc->cc_bucket = firstb & callwheelmask; LIST_REMOVE(tmp, c_links.le); softclock_call_cc(tmp, cc, #ifdef CALLOUT_PROFILING &mpcalls_dir, &lockcalls_dir, NULL, #endif 1); tmp = cc_exec_next(cc); cc_exec_next(cc) = NULL; } else { tmpn = LIST_NEXT(tmp, c_links.le); LIST_REMOVE(tmp, c_links.le); TAILQ_INSERT_TAIL(&cc->cc_expireq, tmp, c_links.tqe); tmp->c_iflags |= CALLOUT_PROCESSED; tmp = tmpn; } continue; } /* Skip events from distant future. */ if (tmp->c_time >= max) goto next; /* * Event minimal time is bigger than present maximal * time, so it cannot be aggregated. */ if (tmp->c_time > last) { lastb = nowb; goto next; } /* Update first and last time, respecting this event. */ if (tmp->c_time < first) first = tmp->c_time; tmp_max = tmp->c_time + tmp->c_precision; if (tmp_max < last) last = tmp_max; next: tmp = LIST_NEXT(tmp, c_links.le); } /* Proceed with the next bucket. */ firstb++; /* * Stop if we looked after present time and found * some event we can't execute at now. * Stop if we looked far enough into the future. */ } while (((int)(firstb - lastb)) <= 0); cc->cc_firstevent = last; #ifndef NO_EVENTTIMERS cpu_new_callout(curcpu, last, first); #endif #ifdef CALLOUT_PROFILING avg_depth_dir += (depth_dir * 1000 - avg_depth_dir) >> 8; avg_mpcalls_dir += (mpcalls_dir * 1000 - avg_mpcalls_dir) >> 8; avg_lockcalls_dir += (lockcalls_dir * 1000 - avg_lockcalls_dir) >> 8; #endif mtx_unlock_spin_flags(&cc->cc_lock, MTX_QUIET); /* * swi_sched acquires the thread lock, so we don't want to call it * with cc_lock held; incorrect locking order. */ if (!TAILQ_EMPTY(&cc->cc_expireq)) swi_sched(cc->cc_cookie, 0); } static struct callout_cpu * callout_lock(struct callout *c) { struct callout_cpu *cc; int cpu; for (;;) { cpu = c->c_cpu; #ifdef SMP if (cpu == CPUBLOCK) { while (c->c_cpu == CPUBLOCK) cpu_spinwait(); continue; } #endif cc = CC_CPU(cpu); CC_LOCK(cc); if (cpu == c->c_cpu) break; CC_UNLOCK(cc); } return (cc); } static void callout_cc_add(struct callout *c, struct callout_cpu *cc, sbintime_t sbt, sbintime_t precision, void (*func)(void *), void *arg, int cpu, int flags) { int bucket; CC_LOCK_ASSERT(cc); if (sbt < cc->cc_lastscan) sbt = cc->cc_lastscan; c->c_arg = arg; c->c_iflags |= CALLOUT_PENDING; c->c_iflags &= ~CALLOUT_PROCESSED; c->c_flags |= CALLOUT_ACTIVE; if (flags & C_DIRECT_EXEC) c->c_iflags |= CALLOUT_DIRECT; c->c_func = func; c->c_time = sbt; c->c_precision = precision; bucket = callout_get_bucket(c->c_time); CTR3(KTR_CALLOUT, "precision set for %p: %d.%08x", c, (int)(c->c_precision >> 32), (u_int)(c->c_precision & 0xffffffff)); LIST_INSERT_HEAD(&cc->cc_callwheel[bucket], c, c_links.le); if (cc->cc_bucket == bucket) cc_exec_next(cc) = c; #ifndef NO_EVENTTIMERS /* * Inform the eventtimers(4) subsystem there's a new callout * that has been inserted, but only if really required. */ if (SBT_MAX - c->c_time < c->c_precision) c->c_precision = SBT_MAX - c->c_time; sbt = c->c_time + c->c_precision; if (sbt < cc->cc_firstevent) { cc->cc_firstevent = sbt; cpu_new_callout(cpu, sbt, c->c_time); } #endif } static void callout_cc_del(struct callout *c, struct callout_cpu *cc) { if ((c->c_iflags & CALLOUT_LOCAL_ALLOC) == 0) return; c->c_func = NULL; SLIST_INSERT_HEAD(&cc->cc_callfree, c, c_links.sle); } static void softclock_call_cc(struct callout *c, struct callout_cpu *cc, #ifdef CALLOUT_PROFILING int *mpcalls, int *lockcalls, int *gcalls, #endif int direct) { struct rm_priotracker tracker; void (*c_func)(void *); void *c_arg; struct lock_class *class; struct lock_object *c_lock; uintptr_t lock_status; int c_iflags; #ifdef SMP struct callout_cpu *new_cc; void (*new_func)(void *); void *new_arg; int flags, new_cpu; sbintime_t new_prec, new_time; #endif #if defined(DIAGNOSTIC) || defined(CALLOUT_PROFILING) sbintime_t sbt1, sbt2; struct timespec ts2; static sbintime_t maxdt = 2 * SBT_1MS; /* 2 msec */ static timeout_t *lastfunc; #endif KASSERT((c->c_iflags & CALLOUT_PENDING) == CALLOUT_PENDING, ("softclock_call_cc: pend %p %x", c, c->c_iflags)); KASSERT((c->c_flags & CALLOUT_ACTIVE) == CALLOUT_ACTIVE, ("softclock_call_cc: act %p %x", c, c->c_flags)); class = (c->c_lock != NULL) ? LOCK_CLASS(c->c_lock) : NULL; lock_status = 0; if (c->c_flags & CALLOUT_SHAREDLOCK) { if (class == &lock_class_rm) lock_status = (uintptr_t)&tracker; else lock_status = 1; } c_lock = c->c_lock; c_func = c->c_func; c_arg = c->c_arg; c_iflags = c->c_iflags; if (c->c_iflags & CALLOUT_LOCAL_ALLOC) c->c_iflags = CALLOUT_LOCAL_ALLOC; else c->c_iflags &= ~CALLOUT_PENDING; cc_exec_curr(cc, direct) = c; cc_exec_cancel(cc, direct) = false; cc_exec_drain(cc, direct) = NULL; CC_UNLOCK(cc); if (c_lock != NULL) { class->lc_lock(c_lock, lock_status); /* * The callout may have been cancelled * while we switched locks. */ if (cc_exec_cancel(cc, direct)) { class->lc_unlock(c_lock); goto skip; } /* The callout cannot be stopped now. */ cc_exec_cancel(cc, direct) = true; if (c_lock == &Giant.lock_object) { #ifdef CALLOUT_PROFILING (*gcalls)++; #endif CTR3(KTR_CALLOUT, "callout giant %p func %p arg %p", c, c_func, c_arg); } else { #ifdef CALLOUT_PROFILING (*lockcalls)++; #endif CTR3(KTR_CALLOUT, "callout lock %p func %p arg %p", c, c_func, c_arg); } } else { #ifdef CALLOUT_PROFILING (*mpcalls)++; #endif CTR3(KTR_CALLOUT, "callout %p func %p arg %p", c, c_func, c_arg); } KTR_STATE3(KTR_SCHED, "callout", cc->cc_ktr_event_name, "running", "func:%p", c_func, "arg:%p", c_arg, "direct:%d", direct); #if defined(DIAGNOSTIC) || defined(CALLOUT_PROFILING) sbt1 = sbinuptime(); #endif THREAD_NO_SLEEPING(); SDT_PROBE1(callout_execute, , , callout__start, c); c_func(c_arg); SDT_PROBE1(callout_execute, , , callout__end, c); THREAD_SLEEPING_OK(); #if defined(DIAGNOSTIC) || defined(CALLOUT_PROFILING) sbt2 = sbinuptime(); sbt2 -= sbt1; if (sbt2 > maxdt) { if (lastfunc != c_func || sbt2 > maxdt * 2) { ts2 = sbttots(sbt2); printf( "Expensive timeout(9) function: %p(%p) %jd.%09ld s\n", c_func, c_arg, (intmax_t)ts2.tv_sec, ts2.tv_nsec); } maxdt = sbt2; lastfunc = c_func; } #endif KTR_STATE0(KTR_SCHED, "callout", cc->cc_ktr_event_name, "idle"); CTR1(KTR_CALLOUT, "callout %p finished", c); if ((c_iflags & CALLOUT_RETURNUNLOCKED) == 0) class->lc_unlock(c_lock); skip: CC_LOCK(cc); KASSERT(cc_exec_curr(cc, direct) == c, ("mishandled cc_curr")); cc_exec_curr(cc, direct) = NULL; if (cc_exec_drain(cc, direct)) { void (*drain)(void *); drain = cc_exec_drain(cc, direct); cc_exec_drain(cc, direct) = NULL; CC_UNLOCK(cc); drain(c_arg); CC_LOCK(cc); } if (cc_exec_waiting(cc, direct)) { /* * There is someone waiting for the * callout to complete. * If the callout was scheduled for * migration just cancel it. */ if (cc_cce_migrating(cc, direct)) { cc_cce_cleanup(cc, direct); /* * It should be assert here that the callout is not * destroyed but that is not easy. */ c->c_iflags &= ~CALLOUT_DFRMIGRATION; } cc_exec_waiting(cc, direct) = false; CC_UNLOCK(cc); wakeup(&cc_exec_waiting(cc, direct)); CC_LOCK(cc); } else if (cc_cce_migrating(cc, direct)) { KASSERT((c_iflags & CALLOUT_LOCAL_ALLOC) == 0, ("Migrating legacy callout %p", c)); #ifdef SMP /* * If the callout was scheduled for * migration just perform it now. */ new_cpu = cc_migration_cpu(cc, direct); new_time = cc_migration_time(cc, direct); new_prec = cc_migration_prec(cc, direct); new_func = cc_migration_func(cc, direct); new_arg = cc_migration_arg(cc, direct); cc_cce_cleanup(cc, direct); /* * It should be assert here that the callout is not destroyed * but that is not easy. * * As first thing, handle deferred callout stops. */ if (!callout_migrating(c)) { CTR3(KTR_CALLOUT, "deferred cancelled %p func %p arg %p", c, new_func, new_arg); callout_cc_del(c, cc); return; } c->c_iflags &= ~CALLOUT_DFRMIGRATION; new_cc = callout_cpu_switch(c, cc, new_cpu); flags = (direct) ? C_DIRECT_EXEC : 0; callout_cc_add(c, new_cc, new_time, new_prec, new_func, new_arg, new_cpu, flags); CC_UNLOCK(new_cc); CC_LOCK(cc); #else panic("migration should not happen"); #endif } /* * If the current callout is locally allocated (from * timeout(9)) then put it on the freelist. * * Note: we need to check the cached copy of c_iflags because * if it was not local, then it's not safe to deref the * callout pointer. */ KASSERT((c_iflags & CALLOUT_LOCAL_ALLOC) == 0 || c->c_iflags == CALLOUT_LOCAL_ALLOC, ("corrupted callout")); if (c_iflags & CALLOUT_LOCAL_ALLOC) callout_cc_del(c, cc); } /* * The callout mechanism is based on the work of Adam M. Costello and * George Varghese, published in a technical report entitled "Redesigning * the BSD Callout and Timer Facilities" and modified slightly for inclusion * in FreeBSD by Justin T. Gibbs. The original work on the data structures * used in this implementation was published by G. Varghese and T. Lauck in * the paper "Hashed and Hierarchical Timing Wheels: Data Structures for * the Efficient Implementation of a Timer Facility" in the Proceedings of * the 11th ACM Annual Symposium on Operating Systems Principles, * Austin, Texas Nov 1987. */ /* * Software (low priority) clock interrupt. * Run periodic events from timeout queue. */ void softclock(void *arg) { struct callout_cpu *cc; struct callout *c; #ifdef CALLOUT_PROFILING int depth = 0, gcalls = 0, lockcalls = 0, mpcalls = 0; #endif cc = (struct callout_cpu *)arg; CC_LOCK(cc); while ((c = TAILQ_FIRST(&cc->cc_expireq)) != NULL) { TAILQ_REMOVE(&cc->cc_expireq, c, c_links.tqe); softclock_call_cc(c, cc, #ifdef CALLOUT_PROFILING &mpcalls, &lockcalls, &gcalls, #endif 0); #ifdef CALLOUT_PROFILING ++depth; #endif } #ifdef CALLOUT_PROFILING avg_depth += (depth * 1000 - avg_depth) >> 8; avg_mpcalls += (mpcalls * 1000 - avg_mpcalls) >> 8; avg_lockcalls += (lockcalls * 1000 - avg_lockcalls) >> 8; avg_gcalls += (gcalls * 1000 - avg_gcalls) >> 8; #endif CC_UNLOCK(cc); } /* * timeout -- * Execute a function after a specified length of time. * * untimeout -- * Cancel previous timeout function call. * * callout_handle_init -- * Initialize a handle so that using it with untimeout is benign. * * See AT&T BCI Driver Reference Manual for specification. This * implementation differs from that one in that although an * identification value is returned from timeout, the original * arguments to timeout as well as the identifier are used to * identify entries for untimeout. */ struct callout_handle timeout(timeout_t *ftn, void *arg, int to_ticks) { struct callout_cpu *cc; struct callout *new; struct callout_handle handle; cc = CC_CPU(timeout_cpu); CC_LOCK(cc); /* Fill in the next free callout structure. */ new = SLIST_FIRST(&cc->cc_callfree); if (new == NULL) /* XXX Attempt to malloc first */ panic("timeout table full"); SLIST_REMOVE_HEAD(&cc->cc_callfree, c_links.sle); callout_reset(new, to_ticks, ftn, arg); handle.callout = new; CC_UNLOCK(cc); return (handle); } void untimeout(timeout_t *ftn, void *arg, struct callout_handle handle) { struct callout_cpu *cc; /* * Check for a handle that was initialized * by callout_handle_init, but never used * for a real timeout. */ if (handle.callout == NULL) return; cc = callout_lock(handle.callout); if (handle.callout->c_func == ftn && handle.callout->c_arg == arg) callout_stop(handle.callout); CC_UNLOCK(cc); } void callout_handle_init(struct callout_handle *handle) { handle->callout = NULL; } /* * New interface; clients allocate their own callout structures. * * callout_reset() - establish or change a timeout * callout_stop() - disestablish a timeout * callout_init() - initialize a callout structure so that it can * safely be passed to callout_reset() and callout_stop() * * defines three convenience macros: * * callout_active() - returns truth if callout has not been stopped, * drained, or deactivated since the last time the callout was * reset. * callout_pending() - returns truth if callout is still waiting for timeout * callout_deactivate() - marks the callout as having been serviced */ int callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision, void (*ftn)(void *), void *arg, int cpu, int flags) { sbintime_t to_sbt, pr; struct callout_cpu *cc; int cancelled, direct; int ignore_cpu=0; cancelled = 0; if (cpu == -1) { ignore_cpu = 1; } else if ((cpu >= MAXCPU) || ((CC_CPU(cpu))->cc_inited == 0)) { /* Invalid CPU spec */ panic("Invalid CPU in callout %d", cpu); } if (flags & C_ABSOLUTE) { to_sbt = sbt; } else { if ((flags & C_HARDCLOCK) && (sbt < tick_sbt)) sbt = tick_sbt; if ((flags & C_HARDCLOCK) || #ifdef NO_EVENTTIMERS sbt >= sbt_timethreshold) { to_sbt = getsbinuptime(); /* Add safety belt for the case of hz > 1000. */ to_sbt += tc_tick_sbt - tick_sbt; #else sbt >= sbt_tickthreshold) { /* * Obtain the time of the last hardclock() call on * this CPU directly from the kern_clocksource.c. * This value is per-CPU, but it is equal for all * active ones. */ #ifdef __LP64__ to_sbt = DPCPU_GET(hardclocktime); #else spinlock_enter(); to_sbt = DPCPU_GET(hardclocktime); spinlock_exit(); #endif #endif if ((flags & C_HARDCLOCK) == 0) to_sbt += tick_sbt; } else to_sbt = sbinuptime(); if (SBT_MAX - to_sbt < sbt) to_sbt = SBT_MAX; else to_sbt += sbt; pr = ((C_PRELGET(flags) < 0) ? sbt >> tc_precexp : sbt >> C_PRELGET(flags)); if (pr > precision) precision = pr; } /* * This flag used to be added by callout_cc_add, but the * first time you call this we could end up with the * wrong direct flag if we don't do it before we add. */ if (flags & C_DIRECT_EXEC) { direct = 1; } else { direct = 0; } KASSERT(!direct || c->c_lock == NULL, ("%s: direct callout %p has lock", __func__, c)); cc = callout_lock(c); /* * Don't allow migration of pre-allocated callouts lest they * become unbalanced or handle the case where the user does * not care. */ if ((c->c_iflags & CALLOUT_LOCAL_ALLOC) || ignore_cpu) { cpu = c->c_cpu; } if (cc_exec_curr(cc, direct) == c) { /* * We're being asked to reschedule a callout which is * currently in progress. If there is a lock then we * can cancel the callout if it has not really started. */ if (c->c_lock != NULL && !cc_exec_cancel(cc, direct)) cancelled = cc_exec_cancel(cc, direct) = true; - if (cc_exec_waiting(cc, direct) || cc_exec_drain(cc, direct)) { + if (cc_exec_waiting(cc, direct)) { /* * Someone has called callout_drain to kill this * callout. Don't reschedule. */ CTR4(KTR_CALLOUT, "%s %p func %p arg %p", cancelled ? "cancelled" : "failed to cancel", c, c->c_func, c->c_arg); CC_UNLOCK(cc); return (cancelled); } #ifdef SMP if (callout_migrating(c)) { /* * This only occurs when a second callout_reset_sbt_on * is made after a previous one moved it into * deferred migration (below). Note we do *not* change * the prev_cpu even though the previous target may * be different. */ cc_migration_cpu(cc, direct) = cpu; cc_migration_time(cc, direct) = to_sbt; cc_migration_prec(cc, direct) = precision; cc_migration_func(cc, direct) = ftn; cc_migration_arg(cc, direct) = arg; cancelled = 1; CC_UNLOCK(cc); return (cancelled); } #endif } if (c->c_iflags & CALLOUT_PENDING) { if ((c->c_iflags & CALLOUT_PROCESSED) == 0) { if (cc_exec_next(cc) == c) cc_exec_next(cc) = LIST_NEXT(c, c_links.le); LIST_REMOVE(c, c_links.le); } else { TAILQ_REMOVE(&cc->cc_expireq, c, c_links.tqe); } cancelled = 1; c->c_iflags &= ~ CALLOUT_PENDING; c->c_flags &= ~ CALLOUT_ACTIVE; } #ifdef SMP /* * If the callout must migrate try to perform it immediately. * If the callout is currently running, just defer the migration * to a more appropriate moment. */ if (c->c_cpu != cpu) { if (cc_exec_curr(cc, direct) == c) { /* * Pending will have been removed since we are * actually executing the callout on another * CPU. That callout should be waiting on the * lock the caller holds. If we set both * active/and/pending after we return and the * lock on the executing callout proceeds, it * will then see pending is true and return. * At the return from the actual callout execution * the migration will occur in softclock_call_cc * and this new callout will be placed on the * new CPU via a call to callout_cpu_switch() which * will get the lock on the right CPU followed * by a call callout_cc_add() which will add it there. * (see above in softclock_call_cc()). */ cc_migration_cpu(cc, direct) = cpu; cc_migration_time(cc, direct) = to_sbt; cc_migration_prec(cc, direct) = precision; cc_migration_func(cc, direct) = ftn; cc_migration_arg(cc, direct) = arg; c->c_iflags |= (CALLOUT_DFRMIGRATION | CALLOUT_PENDING); c->c_flags |= CALLOUT_ACTIVE; CTR6(KTR_CALLOUT, "migration of %p func %p arg %p in %d.%08x to %u deferred", c, c->c_func, c->c_arg, (int)(to_sbt >> 32), (u_int)(to_sbt & 0xffffffff), cpu); CC_UNLOCK(cc); return (cancelled); } cc = callout_cpu_switch(c, cc, cpu); } #endif callout_cc_add(c, cc, to_sbt, precision, ftn, arg, cpu, flags); CTR6(KTR_CALLOUT, "%sscheduled %p func %p arg %p in %d.%08x", cancelled ? "re" : "", c, c->c_func, c->c_arg, (int)(to_sbt >> 32), (u_int)(to_sbt & 0xffffffff)); CC_UNLOCK(cc); return (cancelled); } /* * Common idioms that can be optimized in the future. */ int callout_schedule_on(struct callout *c, int to_ticks, int cpu) { return callout_reset_on(c, to_ticks, c->c_func, c->c_arg, cpu); } int callout_schedule(struct callout *c, int to_ticks) { return callout_reset_on(c, to_ticks, c->c_func, c->c_arg, c->c_cpu); } int _callout_stop_safe(struct callout *c, int flags, void (*drain)(void *)) { struct callout_cpu *cc, *old_cc; struct lock_class *class; int direct, sq_locked, use_lock; - int not_on_a_list; + int cancelled, not_on_a_list; if ((flags & CS_DRAIN) != 0) WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, c->c_lock, "calling %s", __func__); /* * Some old subsystems don't hold Giant while running a callout_stop(), * so just discard this check for the moment. */ if ((flags & CS_DRAIN) == 0 && c->c_lock != NULL) { if (c->c_lock == &Giant.lock_object) use_lock = mtx_owned(&Giant); else { use_lock = 1; class = LOCK_CLASS(c->c_lock); class->lc_assert(c->c_lock, LA_XLOCKED); } } else use_lock = 0; if (c->c_iflags & CALLOUT_DIRECT) { direct = 1; } else { direct = 0; } sq_locked = 0; old_cc = NULL; again: cc = callout_lock(c); if ((c->c_iflags & (CALLOUT_DFRMIGRATION | CALLOUT_PENDING)) == (CALLOUT_DFRMIGRATION | CALLOUT_PENDING) && ((c->c_flags & CALLOUT_ACTIVE) == CALLOUT_ACTIVE)) { /* * Special case where this slipped in while we * were migrating *as* the callout is about to * execute. The caller probably holds the lock * the callout wants. * * Get rid of the migration first. Then set * the flag that tells this code *not* to * try to remove it from any lists (its not * on one yet). When the callout wheel runs, * it will ignore this callout. */ c->c_iflags &= ~CALLOUT_PENDING; c->c_flags &= ~CALLOUT_ACTIVE; not_on_a_list = 1; } else { not_on_a_list = 0; } /* * If the callout was migrating while the callout cpu lock was * dropped, just drop the sleepqueue lock and check the states * again. */ if (sq_locked != 0 && cc != old_cc) { #ifdef SMP CC_UNLOCK(cc); sleepq_release(&cc_exec_waiting(old_cc, direct)); sq_locked = 0; old_cc = NULL; goto again; #else panic("migration should not happen"); #endif } - if ((drain != NULL) && (c->c_iflags & CALLOUT_PENDING) && - (cc_exec_curr(cc, direct) != c)) { - /* - * This callout is executing and we are draining. - * The only way this can happen is if its also - * been rescheduled to run on one thread *and* asked to drain - * on this thread (at the same time it is waiting to execute). - */ - if ((c->c_iflags & CALLOUT_PROCESSED) == 0) { - if (cc_exec_next(cc) == c) - cc_exec_next(cc) = LIST_NEXT(c, c_links.le); - LIST_REMOVE(c, c_links.le); - } else { - TAILQ_REMOVE(&cc->cc_expireq, c, c_links.tqe); - } - c->c_iflags &= ~CALLOUT_PENDING; - c->c_flags &= ~CALLOUT_ACTIVE; - } + /* - * If the callout isn't pending, it's not on the queue, so - * don't attempt to remove it from the queue. We can try to - * stop it by other means however. + * If the callout is running, try to stop it or drain it. */ - if (!(c->c_iflags & CALLOUT_PENDING)) { + if (cc_exec_curr(cc, direct) == c) { /* - * If it wasn't on the queue and it isn't the current - * callout, then we can't stop it, so just bail. - * It probably has already been run (if locking - * is properly done). You could get here if the caller - * calls stop twice in a row for example. The second - * call would fall here without CALLOUT_ACTIVE set. + * Succeed we to stop it or not, we must clear the + * active flag - this is what API users expect. */ c->c_flags &= ~CALLOUT_ACTIVE; - if (cc_exec_curr(cc, direct) != c) { - CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p", - c, c->c_func, c->c_arg); - CC_UNLOCK(cc); - if (sq_locked) - sleepq_release(&cc_exec_waiting(cc, direct)); - return (-1); - } + if ((flags & CS_DRAIN) != 0) { /* * The current callout is running (or just * about to run) and blocking is allowed, so * just wait for the current invocation to * finish. */ while (cc_exec_curr(cc, direct) == c) { /* * Use direct calls to sleepqueue interface * instead of cv/msleep in order to avoid * a LOR between cc_lock and sleepqueue * chain spinlocks. This piece of code * emulates a msleep_spin() call actually. * * If we already have the sleepqueue chain * locked, then we can safely block. If we * don't already have it locked, however, * we have to drop the cc_lock to lock * it. This opens several races, so we * restart at the beginning once we have * both locks. If nothing has changed, then * we will end up back here with sq_locked * set. */ if (!sq_locked) { CC_UNLOCK(cc); sleepq_lock( &cc_exec_waiting(cc, direct)); sq_locked = 1; old_cc = cc; goto again; } + /* * Migration could be cancelled here, but * as long as it is still not sure when it * will be packed up, just let softclock() * take care of it. */ cc_exec_waiting(cc, direct) = true; DROP_GIANT(); CC_UNLOCK(cc); sleepq_add( &cc_exec_waiting(cc, direct), &cc->cc_lock.lock_object, "codrain", SLEEPQ_SLEEP, 0); sleepq_wait( &cc_exec_waiting(cc, direct), 0); sq_locked = 0; old_cc = NULL; /* Reacquire locks previously released. */ PICKUP_GIANT(); CC_LOCK(cc); } } else if (use_lock && !cc_exec_cancel(cc, direct) && (drain == NULL)) { /* * The current callout is waiting for its * lock which we hold. Cancel the callout * and return. After our caller drops the * lock, the callout will be skipped in * softclock(). This *only* works with a * callout_stop() *not* callout_drain() or * callout_async_drain(). */ cc_exec_cancel(cc, direct) = true; CTR3(KTR_CALLOUT, "cancelled %p func %p arg %p", c, c->c_func, c->c_arg); KASSERT(!cc_cce_migrating(cc, direct), ("callout wrongly scheduled for migration")); if (callout_migrating(c)) { c->c_iflags &= ~CALLOUT_DFRMIGRATION; #ifdef SMP cc_migration_cpu(cc, direct) = CPUBLOCK; cc_migration_time(cc, direct) = 0; cc_migration_prec(cc, direct) = 0; cc_migration_func(cc, direct) = NULL; cc_migration_arg(cc, direct) = NULL; #endif } CC_UNLOCK(cc); KASSERT(!sq_locked, ("sleepqueue chain locked")); return (1); } else if (callout_migrating(c)) { /* * The callout is currently being serviced * and the "next" callout is scheduled at * its completion with a migration. We remove * the migration flag so it *won't* get rescheduled, * but we can't stop the one thats running so * we return 0. */ c->c_iflags &= ~CALLOUT_DFRMIGRATION; #ifdef SMP /* * We can't call cc_cce_cleanup here since * if we do it will remove .ce_curr and * its still running. This will prevent a * reschedule of the callout when the * execution completes. */ cc_migration_cpu(cc, direct) = CPUBLOCK; cc_migration_time(cc, direct) = 0; cc_migration_prec(cc, direct) = 0; cc_migration_func(cc, direct) = NULL; cc_migration_arg(cc, direct) = NULL; #endif CTR3(KTR_CALLOUT, "postponing stop %p func %p arg %p", c, c->c_func, c->c_arg); if (drain) { cc_exec_drain(cc, direct) = drain; } CC_UNLOCK(cc); - if (drain) - return(0); return ((flags & CS_EXECUTING) != 0); } CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p", c, c->c_func, c->c_arg); if (drain) { cc_exec_drain(cc, direct) = drain; } - CC_UNLOCK(cc); KASSERT(!sq_locked, ("sleepqueue chain still locked")); - return (0); - } + cancelled = ((flags & CS_EXECUTING) != 0); + } else + cancelled = 1; + if (sq_locked) sleepq_release(&cc_exec_waiting(cc, direct)); + if ((c->c_iflags & CALLOUT_PENDING) == 0) { + CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p", + c, c->c_func, c->c_arg); + /* + * For not scheduled and not executing callout return + * negative value. + */ + if (cc_exec_curr(cc, direct) != c) + cancelled = -1; + CC_UNLOCK(cc); + return (cancelled); + } + c->c_iflags &= ~CALLOUT_PENDING; c->c_flags &= ~CALLOUT_ACTIVE; CTR3(KTR_CALLOUT, "cancelled %p func %p arg %p", c, c->c_func, c->c_arg); if (not_on_a_list == 0) { if ((c->c_iflags & CALLOUT_PROCESSED) == 0) { if (cc_exec_next(cc) == c) cc_exec_next(cc) = LIST_NEXT(c, c_links.le); LIST_REMOVE(c, c_links.le); } else { TAILQ_REMOVE(&cc->cc_expireq, c, c_links.tqe); } } callout_cc_del(c, cc); CC_UNLOCK(cc); - return (1); + return (cancelled); } void callout_init(struct callout *c, int mpsafe) { bzero(c, sizeof *c); if (mpsafe) { c->c_lock = NULL; c->c_iflags = CALLOUT_RETURNUNLOCKED; } else { c->c_lock = &Giant.lock_object; c->c_iflags = 0; } c->c_cpu = timeout_cpu; } void _callout_init_lock(struct callout *c, struct lock_object *lock, int flags) { bzero(c, sizeof *c); c->c_lock = lock; KASSERT((flags & ~(CALLOUT_RETURNUNLOCKED | CALLOUT_SHAREDLOCK)) == 0, ("callout_init_lock: bad flags %d", flags)); KASSERT(lock != NULL || (flags & CALLOUT_RETURNUNLOCKED) == 0, ("callout_init_lock: CALLOUT_RETURNUNLOCKED with no lock")); KASSERT(lock == NULL || !(LOCK_CLASS(lock)->lc_flags & (LC_SPINLOCK | LC_SLEEPABLE)), ("%s: invalid lock class", __func__)); c->c_iflags = flags & (CALLOUT_RETURNUNLOCKED | CALLOUT_SHAREDLOCK); c->c_cpu = timeout_cpu; } #ifdef APM_FIXUP_CALLTODO /* * Adjust the kernel calltodo timeout list. This routine is used after * an APM resume to recalculate the calltodo timer list values with the * number of hz's we have been sleeping. The next hardclock() will detect * that there are fired timers and run softclock() to execute them. * * Please note, I have not done an exhaustive analysis of what code this * might break. I am motivated to have my select()'s and alarm()'s that * have expired during suspend firing upon resume so that the applications * which set the timer can do the maintanence the timer was for as close * as possible to the originally intended time. Testing this code for a * week showed that resuming from a suspend resulted in 22 to 25 timers * firing, which seemed independent on whether the suspend was 2 hours or * 2 days. Your milage may vary. - Ken Key */ void adjust_timeout_calltodo(struct timeval *time_change) { register struct callout *p; unsigned long delta_ticks; /* * How many ticks were we asleep? * (stolen from tvtohz()). */ /* Don't do anything */ if (time_change->tv_sec < 0) return; else if (time_change->tv_sec <= LONG_MAX / 1000000) delta_ticks = howmany(time_change->tv_sec * 1000000 + time_change->tv_usec, tick) + 1; else if (time_change->tv_sec <= LONG_MAX / hz) delta_ticks = time_change->tv_sec * hz + howmany(time_change->tv_usec, tick) + 1; else delta_ticks = LONG_MAX; if (delta_ticks > INT_MAX) delta_ticks = INT_MAX; /* * Now rip through the timer calltodo list looking for timers * to expire. */ /* don't collide with softclock() */ CC_LOCK(cc); for (p = calltodo.c_next; p != NULL; p = p->c_next) { p->c_time -= delta_ticks; /* Break if the timer had more time on it than delta_ticks */ if (p->c_time > 0) break; /* take back the ticks the timer didn't use (p->c_time <= 0) */ delta_ticks = -p->c_time; } CC_UNLOCK(cc); return; } #endif /* APM_FIXUP_CALLTODO */ static int flssbt(sbintime_t sbt) { sbt += (uint64_t)sbt >> 1; if (sizeof(long) >= sizeof(sbintime_t)) return (flsl(sbt)); if (sbt >= SBT_1S) return (flsl(((uint64_t)sbt) >> 32) + 32); return (flsl(sbt)); } /* * Dump immediate statistic snapshot of the scheduled callouts. */ static int sysctl_kern_callout_stat(SYSCTL_HANDLER_ARGS) { struct callout *tmp; struct callout_cpu *cc; struct callout_list *sc; sbintime_t maxpr, maxt, medpr, medt, now, spr, st, t; int ct[64], cpr[64], ccpbk[32]; int error, val, i, count, tcum, pcum, maxc, c, medc; #ifdef SMP int cpu; #endif val = 0; error = sysctl_handle_int(oidp, &val, 0, req); if (error != 0 || req->newptr == NULL) return (error); count = maxc = 0; st = spr = maxt = maxpr = 0; bzero(ccpbk, sizeof(ccpbk)); bzero(ct, sizeof(ct)); bzero(cpr, sizeof(cpr)); now = sbinuptime(); #ifdef SMP CPU_FOREACH(cpu) { cc = CC_CPU(cpu); #else cc = CC_CPU(timeout_cpu); #endif CC_LOCK(cc); for (i = 0; i < callwheelsize; i++) { sc = &cc->cc_callwheel[i]; c = 0; LIST_FOREACH(tmp, sc, c_links.le) { c++; t = tmp->c_time - now; if (t < 0) t = 0; st += t / SBT_1US; spr += tmp->c_precision / SBT_1US; if (t > maxt) maxt = t; if (tmp->c_precision > maxpr) maxpr = tmp->c_precision; ct[flssbt(t)]++; cpr[flssbt(tmp->c_precision)]++; } if (c > maxc) maxc = c; ccpbk[fls(c + c / 2)]++; count += c; } CC_UNLOCK(cc); #ifdef SMP } #endif for (i = 0, tcum = 0; i < 64 && tcum < count / 2; i++) tcum += ct[i]; medt = (i >= 2) ? (((sbintime_t)1) << (i - 2)) : 0; for (i = 0, pcum = 0; i < 64 && pcum < count / 2; i++) pcum += cpr[i]; medpr = (i >= 2) ? (((sbintime_t)1) << (i - 2)) : 0; for (i = 0, c = 0; i < 32 && c < count / 2; i++) c += ccpbk[i]; medc = (i >= 2) ? (1 << (i - 2)) : 0; printf("Scheduled callouts statistic snapshot:\n"); printf(" Callouts: %6d Buckets: %6d*%-3d Bucket size: 0.%06ds\n", count, callwheelsize, mp_ncpus, 1000000 >> CC_HASH_SHIFT); printf(" C/Bk: med %5d avg %6d.%06jd max %6d\n", medc, count / callwheelsize / mp_ncpus, (uint64_t)count * 1000000 / callwheelsize / mp_ncpus % 1000000, maxc); printf(" Time: med %5jd.%06jds avg %6jd.%06jds max %6jd.%06jds\n", medt / SBT_1S, (medt & 0xffffffff) * 1000000 >> 32, (st / count) / 1000000, (st / count) % 1000000, maxt / SBT_1S, (maxt & 0xffffffff) * 1000000 >> 32); printf(" Prec: med %5jd.%06jds avg %6jd.%06jds max %6jd.%06jds\n", medpr / SBT_1S, (medpr & 0xffffffff) * 1000000 >> 32, (spr / count) / 1000000, (spr / count) % 1000000, maxpr / SBT_1S, (maxpr & 0xffffffff) * 1000000 >> 32); printf(" Distribution: \tbuckets\t time\t tcum\t" " prec\t pcum\n"); for (i = 0, tcum = pcum = 0; i < 64; i++) { if (ct[i] == 0 && cpr[i] == 0) continue; t = (i != 0) ? (((sbintime_t)1) << (i - 1)) : 0; tcum += ct[i]; pcum += cpr[i]; printf(" %10jd.%06jds\t 2**%d\t%7d\t%7d\t%7d\t%7d\n", t / SBT_1S, (t & 0xffffffff) * 1000000 >> 32, i - 1 - (32 - CC_HASH_SHIFT), ct[i], tcum, cpr[i], pcum); } return (error); } SYSCTL_PROC(_kern, OID_AUTO, callout_stat, CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, 0, 0, sysctl_kern_callout_stat, "I", "Dump immediate statistic snapshot of the scheduled callouts"); + #ifdef DDB static void _show_callout(struct callout *c) { db_printf("callout %p\n", c); #define C_DB_PRINTF(f, e) db_printf(" %s = " f "\n", #e, c->e); db_printf(" &c_links = %p\n", &(c->c_links)); C_DB_PRINTF("%" PRId64, c_time); C_DB_PRINTF("%" PRId64, c_precision); C_DB_PRINTF("%p", c_arg); C_DB_PRINTF("%p", c_func); C_DB_PRINTF("%p", c_lock); C_DB_PRINTF("%#x", c_flags); C_DB_PRINTF("%#x", c_iflags); C_DB_PRINTF("%d", c_cpu); #undef C_DB_PRINTF } DB_SHOW_COMMAND(callout, db_show_callout) { if (!have_addr) { db_printf("usage: show callout \n"); return; } _show_callout((struct callout *)addr); } #endif /* DDB */ Index: user/alc/PQ_LAUNDRY/sys/kern/subr_prf.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/kern/subr_prf.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/kern/subr_prf.c (revision 303206) @@ -1,1198 +1,1219 @@ /*- * Copyright (c) 1986, 1988, 1991, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)subr_prf.c 8.3 (Berkeley) 1/21/94 */ #include __FBSDID("$FreeBSD$"); #ifdef _KERNEL #include "opt_ddb.h" #include "opt_printf.h" #endif /* _KERNEL */ #include #ifdef _KERNEL #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #endif #include #include #ifdef DDB #include #endif /* * Note that stdarg.h and the ANSI style va_start macro is used for both * ANSI and traditional C compilers. */ #include #ifdef _KERNEL #define TOCONS 0x01 #define TOTTY 0x02 #define TOLOG 0x04 /* Max number conversion buffer length: a u_quad_t in base 2, plus NUL byte. */ #define MAXNBUF (sizeof(intmax_t) * NBBY + 1) struct putchar_arg { int flags; int pri; struct tty *tty; char *p_bufr; size_t n_bufr; char *p_next; size_t remain; }; struct snprintf_arg { char *str; size_t remain; }; extern int log_open; static void msglogchar(int c, int pri); static void msglogstr(char *str, int pri, int filter_cr); static void putchar(int ch, void *arg); static char *ksprintn(char *nbuf, uintmax_t num, int base, int *len, int upper); static void snprintf_func(int ch, void *arg); static int msgbufmapped; /* Set when safe to use msgbuf */ int msgbuftrigger; static int log_console_output = 1; SYSCTL_INT(_kern, OID_AUTO, log_console_output, CTLFLAG_RWTUN, &log_console_output, 0, "Duplicate console output to the syslog"); /* * See the comment in log_console() below for more explanation of this. */ static int log_console_add_linefeed; SYSCTL_INT(_kern, OID_AUTO, log_console_add_linefeed, CTLFLAG_RWTUN, &log_console_add_linefeed, 0, "log_console() adds extra newlines"); static int always_console_output; SYSCTL_INT(_kern, OID_AUTO, always_console_output, CTLFLAG_RWTUN, &always_console_output, 0, "Always output to console despite TIOCCONS"); /* * Warn that a system table is full. */ void tablefull(const char *tab) { log(LOG_ERR, "%s: table is full\n", tab); } /* * Uprintf prints to the controlling terminal for the current process. */ int uprintf(const char *fmt, ...) { va_list ap; struct putchar_arg pca; struct proc *p; struct thread *td; int retval; td = curthread; if (TD_IS_IDLETHREAD(td)) return (0); sx_slock(&proctree_lock); p = td->td_proc; PROC_LOCK(p); if ((p->p_flag & P_CONTROLT) == 0) { PROC_UNLOCK(p); sx_sunlock(&proctree_lock); return (0); } SESS_LOCK(p->p_session); pca.tty = p->p_session->s_ttyp; SESS_UNLOCK(p->p_session); PROC_UNLOCK(p); if (pca.tty == NULL) { sx_sunlock(&proctree_lock); return (0); } pca.flags = TOTTY; pca.p_bufr = NULL; va_start(ap, fmt); tty_lock(pca.tty); sx_sunlock(&proctree_lock); retval = kvprintf(fmt, putchar, &pca, 10, ap); tty_unlock(pca.tty); va_end(ap); return (retval); } /* * tprintf and vtprintf print on the controlling terminal associated with the * given session, possibly to the log as well. */ void tprintf(struct proc *p, int pri, const char *fmt, ...) { va_list ap; va_start(ap, fmt); vtprintf(p, pri, fmt, ap); va_end(ap); } void vtprintf(struct proc *p, int pri, const char *fmt, va_list ap) { struct tty *tp = NULL; int flags = 0; struct putchar_arg pca; struct session *sess = NULL; sx_slock(&proctree_lock); if (pri != -1) flags |= TOLOG; if (p != NULL) { PROC_LOCK(p); if (p->p_flag & P_CONTROLT && p->p_session->s_ttyvp) { sess = p->p_session; sess_hold(sess); PROC_UNLOCK(p); tp = sess->s_ttyp; if (tp != NULL && tty_checkoutq(tp)) flags |= TOTTY; else tp = NULL; } else PROC_UNLOCK(p); } pca.pri = pri; pca.tty = tp; pca.flags = flags; pca.p_bufr = NULL; if (pca.tty != NULL) tty_lock(pca.tty); sx_sunlock(&proctree_lock); kvprintf(fmt, putchar, &pca, 10, ap); if (pca.tty != NULL) tty_unlock(pca.tty); if (sess != NULL) sess_release(sess); msgbuftrigger = 1; } /* * Ttyprintf displays a message on a tty; it should be used only by * the tty driver, or anything that knows the underlying tty will not * be revoke(2)'d away. Other callers should use tprintf. */ int ttyprintf(struct tty *tp, const char *fmt, ...) { va_list ap; struct putchar_arg pca; int retval; va_start(ap, fmt); pca.tty = tp; pca.flags = TOTTY; pca.p_bufr = NULL; retval = kvprintf(fmt, putchar, &pca, 10, ap); va_end(ap); return (retval); } static int _vprintf(int level, int flags, const char *fmt, va_list ap) { struct putchar_arg pca; int retval; #ifdef PRINTF_BUFR_SIZE char bufr[PRINTF_BUFR_SIZE]; #endif pca.tty = NULL; pca.pri = level; pca.flags = flags; #ifdef PRINTF_BUFR_SIZE pca.p_bufr = bufr; pca.p_next = pca.p_bufr; pca.n_bufr = sizeof(bufr); pca.remain = sizeof(bufr); *pca.p_next = '\0'; #else /* Don't buffer console output. */ pca.p_bufr = NULL; #endif retval = kvprintf(fmt, putchar, &pca, 10, ap); #ifdef PRINTF_BUFR_SIZE /* Write any buffered console/log output: */ if (*pca.p_bufr != '\0') { if (pca.flags & TOLOG) msglogstr(pca.p_bufr, level, /*filter_cr*/1); if (pca.flags & TOCONS) cnputs(pca.p_bufr); } #endif return (retval); } /* * Log writes to the log buffer, and guarantees not to sleep (so can be * called by interrupt routines). If there is no process reading the * log yet, it writes to the console also. */ void log(int level, const char *fmt, ...) { va_list ap; va_start(ap, fmt); vlog(level, fmt, ap); va_end(ap); } void vlog(int level, const char *fmt, va_list ap) { (void)_vprintf(level, log_open ? TOLOG : TOCONS | TOLOG, fmt, ap); msgbuftrigger = 1; } #define CONSCHUNK 128 void log_console(struct uio *uio) { int c, error, nl; char *consbuffer; int pri; if (!log_console_output) return; pri = LOG_INFO | LOG_CONSOLE; uio = cloneuio(uio); consbuffer = malloc(CONSCHUNK, M_TEMP, M_WAITOK); nl = 0; while (uio->uio_resid > 0) { c = imin(uio->uio_resid, CONSCHUNK - 1); error = uiomove(consbuffer, c, uio); if (error != 0) break; /* Make sure we're NUL-terminated */ consbuffer[c] = '\0'; if (consbuffer[c - 1] == '\n') nl = 1; else nl = 0; msglogstr(consbuffer, pri, /*filter_cr*/ 1); } /* * The previous behavior in log_console() is preserved when * log_console_add_linefeed is non-zero. For that behavior, if an * individual console write came in that was not terminated with a * line feed, it would add a line feed. * * This results in different data in the message buffer than * appears on the system console (which doesn't add extra line feed * characters). * * A number of programs and rc scripts write a line feed, or a period * and a line feed when they have completed their operation. On * the console, this looks seamless, but when displayed with * 'dmesg -a', you wind up with output that looks like this: * * Updating motd: * . * * On the console, it looks like this: * Updating motd:. * * We could add logic to detect that situation, or just not insert * the extra newlines. Set the kern.log_console_add_linefeed * sysctl/tunable variable to get the old behavior. */ if (!nl && log_console_add_linefeed) { consbuffer[0] = '\n'; consbuffer[1] = '\0'; msglogstr(consbuffer, pri, /*filter_cr*/ 1); } msgbuftrigger = 1; free(uio, M_IOV); free(consbuffer, M_TEMP); return; } int printf(const char *fmt, ...) { va_list ap; int retval; va_start(ap, fmt); retval = vprintf(fmt, ap); va_end(ap); return (retval); } int vprintf(const char *fmt, va_list ap) { int retval; retval = _vprintf(-1, TOCONS | TOLOG, fmt, ap); if (!panicstr) msgbuftrigger = 1; return (retval); } static void putbuf(int c, struct putchar_arg *ap) { /* Check if no console output buffer was provided. */ if (ap->p_bufr == NULL) { /* Output direct to the console. */ if (ap->flags & TOCONS) cnputc(c); if (ap->flags & TOLOG) msglogchar(c, ap->pri); } else { /* Buffer the character: */ *ap->p_next++ = c; ap->remain--; /* Always leave the buffer zero terminated. */ *ap->p_next = '\0'; /* Check if the buffer needs to be flushed. */ if (ap->remain == 2 || c == '\n') { if (ap->flags & TOLOG) msglogstr(ap->p_bufr, ap->pri, /*filter_cr*/1); if (ap->flags & TOCONS) { if ((panicstr == NULL) && (constty != NULL)) msgbuf_addstr(&consmsgbuf, -1, ap->p_bufr, /*filter_cr*/ 0); if ((constty == NULL) ||(always_console_output)) cnputs(ap->p_bufr); } ap->p_next = ap->p_bufr; ap->remain = ap->n_bufr; *ap->p_next = '\0'; } /* * Since we fill the buffer up one character at a time, * this should not happen. We should always catch it when * ap->remain == 2 (if not sooner due to a newline), flush * the buffer and move on. One way this could happen is * if someone sets PRINTF_BUFR_SIZE to 1 or something * similarly silly. */ KASSERT(ap->remain > 2, ("Bad buffer logic, remain = %zd", ap->remain)); } } /* * Print a character on console or users terminal. If destination is * the console then the last bunch of characters are saved in msgbuf for * inspection later. */ static void putchar(int c, void *arg) { struct putchar_arg *ap = (struct putchar_arg*) arg; struct tty *tp = ap->tty; int flags = ap->flags; /* Don't use the tty code after a panic or while in ddb. */ if (kdb_active) { if (c != '\0') cnputc(c); return; } if ((flags & TOTTY) && tp != NULL && panicstr == NULL) tty_putchar(tp, c); if ((flags & (TOCONS | TOLOG)) && c != '\0') putbuf(c, ap); } /* * Scaled down version of sprintf(3). */ int sprintf(char *buf, const char *cfmt, ...) { int retval; va_list ap; va_start(ap, cfmt); retval = kvprintf(cfmt, NULL, (void *)buf, 10, ap); buf[retval] = '\0'; va_end(ap); return (retval); } /* * Scaled down version of vsprintf(3). */ int vsprintf(char *buf, const char *cfmt, va_list ap) { int retval; retval = kvprintf(cfmt, NULL, (void *)buf, 10, ap); buf[retval] = '\0'; return (retval); } /* * Scaled down version of snprintf(3). */ int snprintf(char *str, size_t size, const char *format, ...) { int retval; va_list ap; va_start(ap, format); retval = vsnprintf(str, size, format, ap); va_end(ap); return(retval); } /* * Scaled down version of vsnprintf(3). */ int vsnprintf(char *str, size_t size, const char *format, va_list ap) { struct snprintf_arg info; int retval; info.str = str; info.remain = size; retval = kvprintf(format, snprintf_func, &info, 10, ap); if (info.remain >= 1) *info.str++ = '\0'; return (retval); } /* * Kernel version which takes radix argument vsnprintf(3). */ int vsnrprintf(char *str, size_t size, int radix, const char *format, va_list ap) { struct snprintf_arg info; int retval; info.str = str; info.remain = size; retval = kvprintf(format, snprintf_func, &info, radix, ap); if (info.remain >= 1) *info.str++ = '\0'; return (retval); } static void snprintf_func(int ch, void *arg) { struct snprintf_arg *const info = arg; if (info->remain >= 2) { *info->str++ = ch; info->remain--; } } /* * Put a NUL-terminated ASCII number (base <= 36) in a buffer in reverse * order; return an optional length and a pointer to the last character * written in the buffer (i.e., the first character of the string). * The buffer pointed to by `nbuf' must have length >= MAXNBUF. */ static char * ksprintn(char *nbuf, uintmax_t num, int base, int *lenp, int upper) { char *p, c; p = nbuf; *p = '\0'; do { c = hex2ascii(num % base); *++p = upper ? toupper(c) : c; } while (num /= base); if (lenp) *lenp = p - nbuf; return (p); } /* * Scaled down version of printf(3). * * Two additional formats: * * The format %b is supported to decode error registers. * Its usage is: * * printf("reg=%b\n", regval, "*"); * * where is the output base expressed as a control character, e.g. * \10 gives octal; \20 gives hex. Each arg is a sequence of characters, * the first of which gives the bit number to be inspected (origin 1), and * the next characters (up to a control character, i.e. a character <= 32), * give the name of the register. Thus: * * kvprintf("reg=%b\n", 3, "\10\2BITTWO\1BITONE"); * * would produce output: * * reg=3 * * XXX: %D -- Hexdump, takes pointer and separator string: * ("%6D", ptr, ":") -> XX:XX:XX:XX:XX:XX * ("%*D", len, ptr, " " -> XX XX XX XX ... */ int kvprintf(char const *fmt, void (*func)(int, void*), void *arg, int radix, va_list ap) { #define PCHAR(c) {int cc=(c); if (func) (*func)(cc,arg); else *d++ = cc; retval++; } char nbuf[MAXNBUF]; char *d; const char *p, *percent, *q; u_char *up; int ch, n; uintmax_t num; int base, lflag, qflag, tmp, width, ladjust, sharpflag, neg, sign, dot; int cflag, hflag, jflag, tflag, zflag; int dwidth, upper; char padc; int stop = 0, retval = 0; num = 0; if (!func) d = (char *) arg; else d = NULL; if (fmt == NULL) fmt = "(fmt null)\n"; if (radix < 2 || radix > 36) radix = 10; for (;;) { padc = ' '; width = 0; while ((ch = (u_char)*fmt++) != '%' || stop) { if (ch == '\0') return (retval); PCHAR(ch); } percent = fmt - 1; qflag = 0; lflag = 0; ladjust = 0; sharpflag = 0; neg = 0; sign = 0; dot = 0; dwidth = 0; upper = 0; cflag = 0; hflag = 0; jflag = 0; tflag = 0; zflag = 0; reswitch: switch (ch = (u_char)*fmt++) { case '.': dot = 1; goto reswitch; case '#': sharpflag = 1; goto reswitch; case '+': sign = 1; goto reswitch; case '-': ladjust = 1; goto reswitch; case '%': PCHAR(ch); break; case '*': if (!dot) { width = va_arg(ap, int); if (width < 0) { ladjust = !ladjust; width = -width; } } else { dwidth = va_arg(ap, int); } goto reswitch; case '0': if (!dot) { padc = '0'; goto reswitch; } case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': for (n = 0;; ++fmt) { n = n * 10 + ch - '0'; ch = *fmt; if (ch < '0' || ch > '9') break; } if (dot) dwidth = n; else width = n; goto reswitch; case 'b': num = (u_int)va_arg(ap, int); p = va_arg(ap, char *); for (q = ksprintn(nbuf, num, *p++, NULL, 0); *q;) PCHAR(*q--); if (num == 0) break; for (tmp = 0; *p;) { n = *p++; if (num & (1 << (n - 1))) { PCHAR(tmp ? ',' : '<'); for (; (n = *p) > ' '; ++p) PCHAR(n); tmp = 1; } else for (; *p > ' '; ++p) continue; } if (tmp) PCHAR('>'); break; case 'c': width -= 1; if (!ladjust && width > 0) while (width--) PCHAR(padc); PCHAR(va_arg(ap, int)); if (ladjust && width > 0) while (width--) PCHAR(padc); break; case 'D': up = va_arg(ap, u_char *); p = va_arg(ap, char *); if (!width) width = 16; while(width--) { PCHAR(hex2ascii(*up >> 4)); PCHAR(hex2ascii(*up & 0x0f)); up++; if (width) for (q=p;*q;q++) PCHAR(*q); } break; case 'd': case 'i': base = 10; sign = 1; goto handle_sign; case 'h': if (hflag) { hflag = 0; cflag = 1; } else hflag = 1; goto reswitch; case 'j': jflag = 1; goto reswitch; case 'l': if (lflag) { lflag = 0; qflag = 1; } else lflag = 1; goto reswitch; case 'n': if (jflag) *(va_arg(ap, intmax_t *)) = retval; else if (qflag) *(va_arg(ap, quad_t *)) = retval; else if (lflag) *(va_arg(ap, long *)) = retval; else if (zflag) *(va_arg(ap, size_t *)) = retval; else if (hflag) *(va_arg(ap, short *)) = retval; else if (cflag) *(va_arg(ap, char *)) = retval; else *(va_arg(ap, int *)) = retval; break; case 'o': base = 8; goto handle_nosign; case 'p': base = 16; sharpflag = (width == 0); sign = 0; num = (uintptr_t)va_arg(ap, void *); goto number; case 'q': qflag = 1; goto reswitch; case 'r': base = radix; if (sign) goto handle_sign; goto handle_nosign; case 's': p = va_arg(ap, char *); if (p == NULL) p = "(null)"; if (!dot) n = strlen (p); else for (n = 0; n < dwidth && p[n]; n++) continue; width -= n; if (!ladjust && width > 0) while (width--) PCHAR(padc); while (n--) PCHAR(*p++); if (ladjust && width > 0) while (width--) PCHAR(padc); break; case 't': tflag = 1; goto reswitch; case 'u': base = 10; goto handle_nosign; case 'X': upper = 1; case 'x': base = 16; goto handle_nosign; case 'y': base = 16; sign = 1; goto handle_sign; case 'z': zflag = 1; goto reswitch; handle_nosign: sign = 0; if (jflag) num = va_arg(ap, uintmax_t); else if (qflag) num = va_arg(ap, u_quad_t); else if (tflag) num = va_arg(ap, ptrdiff_t); else if (lflag) num = va_arg(ap, u_long); else if (zflag) num = va_arg(ap, size_t); else if (hflag) num = (u_short)va_arg(ap, int); else if (cflag) num = (u_char)va_arg(ap, int); else num = va_arg(ap, u_int); goto number; handle_sign: if (jflag) num = va_arg(ap, intmax_t); else if (qflag) num = va_arg(ap, quad_t); else if (tflag) num = va_arg(ap, ptrdiff_t); else if (lflag) num = va_arg(ap, long); else if (zflag) num = va_arg(ap, ssize_t); else if (hflag) num = (short)va_arg(ap, int); else if (cflag) num = (char)va_arg(ap, int); else num = va_arg(ap, int); number: if (sign && (intmax_t)num < 0) { neg = 1; num = -(intmax_t)num; } p = ksprintn(nbuf, num, base, &n, upper); tmp = 0; if (sharpflag && num != 0) { if (base == 8) tmp++; else if (base == 16) tmp += 2; } if (neg) tmp++; if (!ladjust && padc == '0') dwidth = width - tmp; width -= tmp + imax(dwidth, n); dwidth -= n; if (!ladjust) while (width-- > 0) PCHAR(' '); if (neg) PCHAR('-'); if (sharpflag && num != 0) { if (base == 8) { PCHAR('0'); } else if (base == 16) { PCHAR('0'); PCHAR('x'); } } while (dwidth-- > 0) PCHAR('0'); while (*p) PCHAR(*p--); if (ladjust) while (width-- > 0) PCHAR(' '); break; default: while (percent < fmt) PCHAR(*percent++); /* * Since we ignore a formatting argument it is no * longer safe to obey the remaining formatting * arguments as the arguments will no longer match * the format specs. */ stop = 1; break; } } #undef PCHAR } /* * Put character in log buffer with a particular priority. */ static void msglogchar(int c, int pri) { static int lastpri = -1; static int dangling; char nbuf[MAXNBUF]; char *p; if (!msgbufmapped) return; if (c == '\0' || c == '\r') return; if (pri != -1 && pri != lastpri) { if (dangling) { msgbuf_addchar(msgbufp, '\n'); dangling = 0; } msgbuf_addchar(msgbufp, '<'); for (p = ksprintn(nbuf, (uintmax_t)pri, 10, NULL, 0); *p;) msgbuf_addchar(msgbufp, *p--); msgbuf_addchar(msgbufp, '>'); lastpri = pri; } msgbuf_addchar(msgbufp, c); if (c == '\n') { dangling = 0; lastpri = -1; } else { dangling = 1; } } static void msglogstr(char *str, int pri, int filter_cr) { if (!msgbufmapped) return; msgbuf_addstr(msgbufp, pri, str, filter_cr); } void msgbufinit(void *ptr, int size) { char *cp; static struct msgbuf *oldp = NULL; size -= sizeof(*msgbufp); cp = (char *)ptr; msgbufp = (struct msgbuf *)(cp + size); msgbuf_reinit(msgbufp, cp, size); if (msgbufmapped && oldp != msgbufp) msgbuf_copy(oldp, msgbufp); msgbufmapped = 1; oldp = msgbufp; } static int unprivileged_read_msgbuf = 1; SYSCTL_INT(_security_bsd, OID_AUTO, unprivileged_read_msgbuf, CTLFLAG_RW, &unprivileged_read_msgbuf, 0, "Unprivileged processes may read the kernel message buffer"); /* Sysctls for accessing/clearing the msgbuf */ static int sysctl_kern_msgbuf(SYSCTL_HANDLER_ARGS) { char buf[128]; u_int seq; int error, len; if (!unprivileged_read_msgbuf) { error = priv_check(req->td, PRIV_MSGBUF); if (error) return (error); } /* Read the whole buffer, one chunk at a time. */ mtx_lock(&msgbuf_lock); msgbuf_peekbytes(msgbufp, NULL, 0, &seq); for (;;) { len = msgbuf_peekbytes(msgbufp, buf, sizeof(buf), &seq); mtx_unlock(&msgbuf_lock); if (len == 0) return (SYSCTL_OUT(req, "", 1)); /* add nulterm */ error = sysctl_handle_opaque(oidp, buf, len, req); if (error) return (error); mtx_lock(&msgbuf_lock); } } SYSCTL_PROC(_kern, OID_AUTO, msgbuf, CTLTYPE_STRING | CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, 0, sysctl_kern_msgbuf, "A", "Contents of kernel message buffer"); static int msgbuf_clearflag; static int sysctl_kern_msgbuf_clear(SYSCTL_HANDLER_ARGS) { int error; error = sysctl_handle_int(oidp, oidp->oid_arg1, oidp->oid_arg2, req); if (!error && req->newptr) { mtx_lock(&msgbuf_lock); msgbuf_clear(msgbufp); mtx_unlock(&msgbuf_lock); msgbuf_clearflag = 0; } return (error); } SYSCTL_PROC(_kern, OID_AUTO, msgbuf_clear, CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_SECURE | CTLFLAG_MPSAFE, &msgbuf_clearflag, 0, sysctl_kern_msgbuf_clear, "I", "Clear kernel message buffer"); #ifdef DDB DB_SHOW_COMMAND(msgbuf, db_show_msgbuf) { int i, j; if (!msgbufmapped) { db_printf("msgbuf not mapped yet\n"); return; } db_printf("msgbufp = %p\n", msgbufp); db_printf("magic = %x, size = %d, r= %u, w = %u, ptr = %p, cksum= %u\n", msgbufp->msg_magic, msgbufp->msg_size, msgbufp->msg_rseq, msgbufp->msg_wseq, msgbufp->msg_ptr, msgbufp->msg_cksum); for (i = 0; i < msgbufp->msg_size && !db_pager_quit; i++) { j = MSGBUF_SEQ_TO_POS(msgbufp, i + msgbufp->msg_rseq); db_printf("%c", msgbufp->msg_ptr[j]); } db_printf("\n"); } #endif /* DDB */ void hexdump(const void *ptr, int length, const char *hdr, int flags) { int i, j, k; int cols; const unsigned char *cp; char delim; if ((flags & HD_DELIM_MASK) != 0) delim = (flags & HD_DELIM_MASK) >> 8; else delim = ' '; if ((flags & HD_COLUMN_MASK) != 0) cols = flags & HD_COLUMN_MASK; else cols = 16; cp = ptr; for (i = 0; i < length; i+= cols) { if (hdr != NULL) printf("%s", hdr); if ((flags & HD_OMIT_COUNT) == 0) printf("%04x ", i); if ((flags & HD_OMIT_HEX) == 0) { for (j = 0; j < cols; j++) { k = i + j; if (k < length) printf("%c%02x", delim, cp[k]); else printf(" "); } } if ((flags & HD_OMIT_CHARS) == 0) { printf(" |"); for (j = 0; j < cols; j++) { k = i + j; if (k >= length) printf(" "); else if (cp[k] >= ' ' && cp[k] <= '~') printf("%c", cp[k]); else printf("."); } printf("|"); } printf("\n"); } } #endif /* _KERNEL */ void sbuf_hexdump(struct sbuf *sb, const void *ptr, int length, const char *hdr, int flags) { int i, j, k; int cols; const unsigned char *cp; char delim; if ((flags & HD_DELIM_MASK) != 0) delim = (flags & HD_DELIM_MASK) >> 8; else delim = ' '; if ((flags & HD_COLUMN_MASK) != 0) cols = flags & HD_COLUMN_MASK; else cols = 16; cp = ptr; for (i = 0; i < length; i+= cols) { if (hdr != NULL) sbuf_printf(sb, "%s", hdr); if ((flags & HD_OMIT_COUNT) == 0) sbuf_printf(sb, "%04x ", i); if ((flags & HD_OMIT_HEX) == 0) { for (j = 0; j < cols; j++) { k = i + j; if (k < length) sbuf_printf(sb, "%c%02x", delim, cp[k]); else sbuf_printf(sb, " "); } } if ((flags & HD_OMIT_CHARS) == 0) { sbuf_printf(sb, " |"); for (j = 0; j < cols; j++) { k = i + j; if (k >= length) sbuf_printf(sb, " "); else if (cp[k] >= ' ' && cp[k] <= '~') sbuf_printf(sb, "%c", cp[k]); else sbuf_printf(sb, "."); } sbuf_printf(sb, "|"); } sbuf_printf(sb, "\n"); } } +#ifdef _KERNEL +void +counted_warning(unsigned *counter, const char *msg) +{ + struct thread *td; + unsigned c; + + for (;;) { + c = *counter; + if (c == 0) + break; + if (atomic_cmpset_int(counter, c, c - 1)) { + td = curthread; + log(LOG_INFO, "pid %d (%s) %s%s\n", + td->td_proc->p_pid, td->td_name, msg, + c > 1 ? "" : " - not logging anymore"); + break; + } + } +} +#endif Index: user/alc/PQ_LAUNDRY/sys/kern/vfs_aio.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/kern/vfs_aio.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/kern/vfs_aio.c (revision 303206) @@ -1,2959 +1,2980 @@ /*- * Copyright (c) 1997 John S. Dyson. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. John S. Dyson's name may not be used to endorse or promote products * derived from this software without specific prior written permission. * * DISCLAIMER: This code isn't warranted to do anything useful. Anything * bad that happens because of using this software isn't the responsibility * of the author. This software is distributed AS-IS. */ /* * This file contains support for the POSIX 1003.1B AIO/LIO facility. */ #include __FBSDID("$FreeBSD$"); #include "opt_compat.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* * Counter for allocating reference ids to new jobs. Wrapped to 1 on * overflow. (XXX will be removed soon.) */ static u_long jobrefid; /* * Counter for aio_fsync. */ static uint64_t jobseqno; #ifndef MAX_AIO_PER_PROC #define MAX_AIO_PER_PROC 32 #endif #ifndef MAX_AIO_QUEUE_PER_PROC #define MAX_AIO_QUEUE_PER_PROC 256 /* Bigger than AIO_LISTIO_MAX */ #endif #ifndef MAX_AIO_QUEUE #define MAX_AIO_QUEUE 1024 /* Bigger than AIO_LISTIO_MAX */ #endif #ifndef MAX_BUF_AIO #define MAX_BUF_AIO 16 #endif FEATURE(aio, "Asynchronous I/O"); static MALLOC_DEFINE(M_LIO, "lio", "listio aio control block list"); static SYSCTL_NODE(_vfs, OID_AUTO, aio, CTLFLAG_RW, 0, "Async IO management"); static int enable_aio_unsafe = 0; SYSCTL_INT(_vfs_aio, OID_AUTO, enable_unsafe, CTLFLAG_RW, &enable_aio_unsafe, 0, "Permit asynchronous IO on all file types, not just known-safe types"); +static unsigned int unsafe_warningcnt = 1; +SYSCTL_UINT(_vfs_aio, OID_AUTO, unsafe_warningcnt, CTLFLAG_RW, + &unsafe_warningcnt, 0, + "Warnings that will be triggered upon failed IO requests on unsafe files"); + static int max_aio_procs = MAX_AIO_PROCS; SYSCTL_INT(_vfs_aio, OID_AUTO, max_aio_procs, CTLFLAG_RW, &max_aio_procs, 0, "Maximum number of kernel processes to use for handling async IO "); static int num_aio_procs = 0; SYSCTL_INT(_vfs_aio, OID_AUTO, num_aio_procs, CTLFLAG_RD, &num_aio_procs, 0, "Number of presently active kernel processes for async IO"); /* * The code will adjust the actual number of AIO processes towards this * number when it gets a chance. */ static int target_aio_procs = TARGET_AIO_PROCS; SYSCTL_INT(_vfs_aio, OID_AUTO, target_aio_procs, CTLFLAG_RW, &target_aio_procs, 0, "Preferred number of ready kernel processes for async IO"); static int max_queue_count = MAX_AIO_QUEUE; SYSCTL_INT(_vfs_aio, OID_AUTO, max_aio_queue, CTLFLAG_RW, &max_queue_count, 0, "Maximum number of aio requests to queue, globally"); static int num_queue_count = 0; SYSCTL_INT(_vfs_aio, OID_AUTO, num_queue_count, CTLFLAG_RD, &num_queue_count, 0, "Number of queued aio requests"); static int num_buf_aio = 0; SYSCTL_INT(_vfs_aio, OID_AUTO, num_buf_aio, CTLFLAG_RD, &num_buf_aio, 0, "Number of aio requests presently handled by the buf subsystem"); /* Number of async I/O processes in the process of being started */ /* XXX This should be local to aio_aqueue() */ static int num_aio_resv_start = 0; static int aiod_lifetime; SYSCTL_INT(_vfs_aio, OID_AUTO, aiod_lifetime, CTLFLAG_RW, &aiod_lifetime, 0, "Maximum lifetime for idle aiod"); static int max_aio_per_proc = MAX_AIO_PER_PROC; SYSCTL_INT(_vfs_aio, OID_AUTO, max_aio_per_proc, CTLFLAG_RW, &max_aio_per_proc, 0, "Maximum active aio requests per process (stored in the process)"); static int max_aio_queue_per_proc = MAX_AIO_QUEUE_PER_PROC; SYSCTL_INT(_vfs_aio, OID_AUTO, max_aio_queue_per_proc, CTLFLAG_RW, &max_aio_queue_per_proc, 0, "Maximum queued aio requests per process (stored in the process)"); static int max_buf_aio = MAX_BUF_AIO; SYSCTL_INT(_vfs_aio, OID_AUTO, max_buf_aio, CTLFLAG_RW, &max_buf_aio, 0, "Maximum buf aio requests per process (stored in the process)"); #ifdef COMPAT_FREEBSD6 typedef struct oaiocb { int aio_fildes; /* File descriptor */ off_t aio_offset; /* File offset for I/O */ volatile void *aio_buf; /* I/O buffer in process space */ size_t aio_nbytes; /* Number of bytes for I/O */ struct osigevent aio_sigevent; /* Signal to deliver */ int aio_lio_opcode; /* LIO opcode */ int aio_reqprio; /* Request priority -- ignored */ struct __aiocb_private _aiocb_private; } oaiocb_t; #endif /* * Below is a key of locks used to protect each member of struct kaiocb * aioliojob and kaioinfo and any backends. * * * - need not protected * a - locked by kaioinfo lock * b - locked by backend lock, the backend lock can be null in some cases, * for example, BIO belongs to this type, in this case, proc lock is * reused. * c - locked by aio_job_mtx, the lock for the generic file I/O backend. */ /* * If the routine that services an AIO request blocks while running in an * AIO kernel process it can starve other I/O requests. BIO requests * queued via aio_qphysio() complete in GEOM and do not use AIO kernel * processes at all. Socket I/O requests use a separate pool of * kprocs and also force non-blocking I/O. Other file I/O requests * use the generic fo_read/fo_write operations which can block. The * fsync and mlock operations can also block while executing. Ideally * none of these requests would block while executing. * * Note that the service routines cannot toggle O_NONBLOCK in the file * structure directly while handling a request due to races with * userland threads. */ /* jobflags */ #define KAIOCB_QUEUEING 0x01 #define KAIOCB_CANCELLED 0x02 #define KAIOCB_CANCELLING 0x04 #define KAIOCB_CHECKSYNC 0x08 #define KAIOCB_CLEARED 0x10 #define KAIOCB_FINISHED 0x20 /* * AIO process info */ #define AIOP_FREE 0x1 /* proc on free queue */ struct aioproc { int aioprocflags; /* (c) AIO proc flags */ TAILQ_ENTRY(aioproc) list; /* (c) list of processes */ struct proc *aioproc; /* (*) the AIO proc */ }; /* * data-structure for lio signal management */ struct aioliojob { int lioj_flags; /* (a) listio flags */ int lioj_count; /* (a) listio flags */ int lioj_finished_count; /* (a) listio flags */ struct sigevent lioj_signal; /* (a) signal on all I/O done */ TAILQ_ENTRY(aioliojob) lioj_list; /* (a) lio list */ struct knlist klist; /* (a) list of knotes */ ksiginfo_t lioj_ksi; /* (a) Realtime signal info */ }; #define LIOJ_SIGNAL 0x1 /* signal on all done (lio) */ #define LIOJ_SIGNAL_POSTED 0x2 /* signal has been posted */ #define LIOJ_KEVENT_POSTED 0x4 /* kevent triggered */ /* * per process aio data structure */ struct kaioinfo { struct mtx kaio_mtx; /* the lock to protect this struct */ int kaio_flags; /* (a) per process kaio flags */ int kaio_maxactive_count; /* (*) maximum number of AIOs */ int kaio_active_count; /* (c) number of currently used AIOs */ int kaio_qallowed_count; /* (*) maxiumu size of AIO queue */ int kaio_count; /* (a) size of AIO queue */ int kaio_ballowed_count; /* (*) maximum number of buffers */ int kaio_buffer_count; /* (a) number of physio buffers */ TAILQ_HEAD(,kaiocb) kaio_all; /* (a) all AIOs in a process */ TAILQ_HEAD(,kaiocb) kaio_done; /* (a) done queue for process */ TAILQ_HEAD(,aioliojob) kaio_liojoblist; /* (a) list of lio jobs */ TAILQ_HEAD(,kaiocb) kaio_jobqueue; /* (a) job queue for process */ TAILQ_HEAD(,kaiocb) kaio_syncqueue; /* (a) queue for aio_fsync */ TAILQ_HEAD(,kaiocb) kaio_syncready; /* (a) second q for aio_fsync */ struct task kaio_task; /* (*) task to kick aio processes */ struct task kaio_sync_task; /* (*) task to schedule fsync jobs */ }; #define AIO_LOCK(ki) mtx_lock(&(ki)->kaio_mtx) #define AIO_UNLOCK(ki) mtx_unlock(&(ki)->kaio_mtx) #define AIO_LOCK_ASSERT(ki, f) mtx_assert(&(ki)->kaio_mtx, (f)) #define AIO_MTX(ki) (&(ki)->kaio_mtx) #define KAIO_RUNDOWN 0x1 /* process is being run down */ #define KAIO_WAKEUP 0x2 /* wakeup process when AIO completes */ /* * Operations used to interact with userland aio control blocks. * Different ABIs provide their own operations. */ struct aiocb_ops { int (*copyin)(struct aiocb *ujob, struct aiocb *kjob); long (*fetch_status)(struct aiocb *ujob); long (*fetch_error)(struct aiocb *ujob); int (*store_status)(struct aiocb *ujob, long status); int (*store_error)(struct aiocb *ujob, long error); int (*store_kernelinfo)(struct aiocb *ujob, long jobref); int (*store_aiocb)(struct aiocb **ujobp, struct aiocb *ujob); }; static TAILQ_HEAD(,aioproc) aio_freeproc; /* (c) Idle daemons */ static struct sema aio_newproc_sem; static struct mtx aio_job_mtx; static TAILQ_HEAD(,kaiocb) aio_jobs; /* (c) Async job list */ static struct unrhdr *aiod_unr; void aio_init_aioinfo(struct proc *p); static int aio_onceonly(void); static int aio_free_entry(struct kaiocb *job); static void aio_process_rw(struct kaiocb *job); static void aio_process_sync(struct kaiocb *job); static void aio_process_mlock(struct kaiocb *job); static void aio_schedule_fsync(void *context, int pending); static int aio_newproc(int *); int aio_aqueue(struct thread *td, struct aiocb *ujob, struct aioliojob *lio, int type, struct aiocb_ops *ops); static int aio_queue_file(struct file *fp, struct kaiocb *job); static void aio_physwakeup(struct bio *bp); static void aio_proc_rundown(void *arg, struct proc *p); static void aio_proc_rundown_exec(void *arg, struct proc *p, struct image_params *imgp); static int aio_qphysio(struct proc *p, struct kaiocb *job); static void aio_daemon(void *param); static void aio_bio_done_notify(struct proc *userp, struct kaiocb *job); static int aio_kick(struct proc *userp); static void aio_kick_nowait(struct proc *userp); static void aio_kick_helper(void *context, int pending); static int filt_aioattach(struct knote *kn); static void filt_aiodetach(struct knote *kn); static int filt_aio(struct knote *kn, long hint); static int filt_lioattach(struct knote *kn); static void filt_liodetach(struct knote *kn); static int filt_lio(struct knote *kn, long hint); /* * Zones for: * kaio Per process async io info * aiop async io process data * aiocb async io jobs * aiol list io job pointer - internal to aio_suspend XXX * aiolio list io jobs */ static uma_zone_t kaio_zone, aiop_zone, aiocb_zone, aiol_zone, aiolio_zone; /* kqueue filters for aio */ static struct filterops aio_filtops = { .f_isfd = 0, .f_attach = filt_aioattach, .f_detach = filt_aiodetach, .f_event = filt_aio, }; static struct filterops lio_filtops = { .f_isfd = 0, .f_attach = filt_lioattach, .f_detach = filt_liodetach, .f_event = filt_lio }; static eventhandler_tag exit_tag, exec_tag; TASKQUEUE_DEFINE_THREAD(aiod_kick); /* * Main operations function for use as a kernel module. */ static int aio_modload(struct module *module, int cmd, void *arg) { int error = 0; switch (cmd) { case MOD_LOAD: aio_onceonly(); break; case MOD_SHUTDOWN: break; default: error = EOPNOTSUPP; break; } return (error); } static moduledata_t aio_mod = { "aio", &aio_modload, NULL }; DECLARE_MODULE(aio, aio_mod, SI_SUB_VFS, SI_ORDER_ANY); MODULE_VERSION(aio, 1); /* * Startup initialization */ static int aio_onceonly(void) { exit_tag = EVENTHANDLER_REGISTER(process_exit, aio_proc_rundown, NULL, EVENTHANDLER_PRI_ANY); exec_tag = EVENTHANDLER_REGISTER(process_exec, aio_proc_rundown_exec, NULL, EVENTHANDLER_PRI_ANY); kqueue_add_filteropts(EVFILT_AIO, &aio_filtops); kqueue_add_filteropts(EVFILT_LIO, &lio_filtops); TAILQ_INIT(&aio_freeproc); sema_init(&aio_newproc_sem, 0, "aio_new_proc"); mtx_init(&aio_job_mtx, "aio_job", NULL, MTX_DEF); TAILQ_INIT(&aio_jobs); aiod_unr = new_unrhdr(1, INT_MAX, NULL); kaio_zone = uma_zcreate("AIO", sizeof(struct kaioinfo), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_NOFREE); aiop_zone = uma_zcreate("AIOP", sizeof(struct aioproc), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_NOFREE); aiocb_zone = uma_zcreate("AIOCB", sizeof(struct kaiocb), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_NOFREE); aiol_zone = uma_zcreate("AIOL", AIO_LISTIO_MAX*sizeof(intptr_t) , NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_NOFREE); aiolio_zone = uma_zcreate("AIOLIO", sizeof(struct aioliojob), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_NOFREE); aiod_lifetime = AIOD_LIFETIME_DEFAULT; jobrefid = 1; p31b_setcfg(CTL_P1003_1B_ASYNCHRONOUS_IO, _POSIX_ASYNCHRONOUS_IO); p31b_setcfg(CTL_P1003_1B_AIO_LISTIO_MAX, AIO_LISTIO_MAX); p31b_setcfg(CTL_P1003_1B_AIO_MAX, MAX_AIO_QUEUE); p31b_setcfg(CTL_P1003_1B_AIO_PRIO_DELTA_MAX, 0); return (0); } /* * Init the per-process aioinfo structure. The aioinfo limits are set * per-process for user limit (resource) management. */ void aio_init_aioinfo(struct proc *p) { struct kaioinfo *ki; ki = uma_zalloc(kaio_zone, M_WAITOK); mtx_init(&ki->kaio_mtx, "aiomtx", NULL, MTX_DEF | MTX_NEW); ki->kaio_flags = 0; ki->kaio_maxactive_count = max_aio_per_proc; ki->kaio_active_count = 0; ki->kaio_qallowed_count = max_aio_queue_per_proc; ki->kaio_count = 0; ki->kaio_ballowed_count = max_buf_aio; ki->kaio_buffer_count = 0; TAILQ_INIT(&ki->kaio_all); TAILQ_INIT(&ki->kaio_done); TAILQ_INIT(&ki->kaio_jobqueue); TAILQ_INIT(&ki->kaio_liojoblist); TAILQ_INIT(&ki->kaio_syncqueue); TAILQ_INIT(&ki->kaio_syncready); TASK_INIT(&ki->kaio_task, 0, aio_kick_helper, p); TASK_INIT(&ki->kaio_sync_task, 0, aio_schedule_fsync, ki); PROC_LOCK(p); if (p->p_aioinfo == NULL) { p->p_aioinfo = ki; PROC_UNLOCK(p); } else { PROC_UNLOCK(p); mtx_destroy(&ki->kaio_mtx); uma_zfree(kaio_zone, ki); } while (num_aio_procs < MIN(target_aio_procs, max_aio_procs)) aio_newproc(NULL); } static int aio_sendsig(struct proc *p, struct sigevent *sigev, ksiginfo_t *ksi) { struct thread *td; int error; error = sigev_findtd(p, sigev, &td); if (error) return (error); if (!KSI_ONQ(ksi)) { ksiginfo_set_sigev(ksi, sigev); ksi->ksi_code = SI_ASYNCIO; ksi->ksi_flags |= KSI_EXT | KSI_INS; tdsendsignal(p, td, ksi->ksi_signo, ksi); } PROC_UNLOCK(p); return (error); } /* * Free a job entry. Wait for completion if it is currently active, but don't * delay forever. If we delay, we return a flag that says that we have to * restart the queue scan. */ static int aio_free_entry(struct kaiocb *job) { struct kaioinfo *ki; struct aioliojob *lj; struct proc *p; p = job->userproc; MPASS(curproc == p); ki = p->p_aioinfo; MPASS(ki != NULL); AIO_LOCK_ASSERT(ki, MA_OWNED); MPASS(job->jobflags & KAIOCB_FINISHED); atomic_subtract_int(&num_queue_count, 1); ki->kaio_count--; MPASS(ki->kaio_count >= 0); TAILQ_REMOVE(&ki->kaio_done, job, plist); TAILQ_REMOVE(&ki->kaio_all, job, allist); lj = job->lio; if (lj) { lj->lioj_count--; lj->lioj_finished_count--; if (lj->lioj_count == 0) { TAILQ_REMOVE(&ki->kaio_liojoblist, lj, lioj_list); /* lio is going away, we need to destroy any knotes */ knlist_delete(&lj->klist, curthread, 1); PROC_LOCK(p); sigqueue_take(&lj->lioj_ksi); PROC_UNLOCK(p); uma_zfree(aiolio_zone, lj); } } /* job is going away, we need to destroy any knotes */ knlist_delete(&job->klist, curthread, 1); PROC_LOCK(p); sigqueue_take(&job->ksi); PROC_UNLOCK(p); AIO_UNLOCK(ki); /* * The thread argument here is used to find the owning process * and is also passed to fo_close() which may pass it to various * places such as devsw close() routines. Because of that, we * need a thread pointer from the process owning the job that is * persistent and won't disappear out from under us or move to * another process. * * Currently, all the callers of this function call it to remove * a kaiocb from the current process' job list either via a * syscall or due to the current process calling exit() or * execve(). Thus, we know that p == curproc. We also know that * curthread can't exit since we are curthread. * * Therefore, we use curthread as the thread to pass to * knlist_delete(). This does mean that it is possible for the * thread pointer at close time to differ from the thread pointer * at open time, but this is already true of file descriptors in * a multithreaded process. */ if (job->fd_file) fdrop(job->fd_file, curthread); crfree(job->cred); uma_zfree(aiocb_zone, job); AIO_LOCK(ki); return (0); } static void aio_proc_rundown_exec(void *arg, struct proc *p, struct image_params *imgp __unused) { aio_proc_rundown(arg, p); } static int aio_cancel_job(struct proc *p, struct kaioinfo *ki, struct kaiocb *job) { aio_cancel_fn_t *func; int cancelled; AIO_LOCK_ASSERT(ki, MA_OWNED); if (job->jobflags & (KAIOCB_CANCELLED | KAIOCB_FINISHED)) return (0); MPASS((job->jobflags & KAIOCB_CANCELLING) == 0); job->jobflags |= KAIOCB_CANCELLED; func = job->cancel_fn; /* * If there is no cancel routine, just leave the job marked as * cancelled. The job should be in active use by a caller who * should complete it normally or when it fails to install a * cancel routine. */ if (func == NULL) return (0); /* * Set the CANCELLING flag so that aio_complete() will defer * completions of this job. This prevents the job from being * freed out from under the cancel callback. After the * callback any deferred completion (whether from the callback * or any other source) will be completed. */ job->jobflags |= KAIOCB_CANCELLING; AIO_UNLOCK(ki); func(job); AIO_LOCK(ki); job->jobflags &= ~KAIOCB_CANCELLING; if (job->jobflags & KAIOCB_FINISHED) { cancelled = job->uaiocb._aiocb_private.error == ECANCELED; TAILQ_REMOVE(&ki->kaio_jobqueue, job, plist); aio_bio_done_notify(p, job); } else { /* * The cancel callback might have scheduled an * operation to cancel this request, but it is * only counted as cancelled if the request is * cancelled when the callback returns. */ cancelled = 0; } return (cancelled); } /* * Rundown the jobs for a given process. */ static void aio_proc_rundown(void *arg, struct proc *p) { struct kaioinfo *ki; struct aioliojob *lj; struct kaiocb *job, *jobn; KASSERT(curthread->td_proc == p, ("%s: called on non-curproc", __func__)); ki = p->p_aioinfo; if (ki == NULL) return; AIO_LOCK(ki); ki->kaio_flags |= KAIO_RUNDOWN; restart: /* * Try to cancel all pending requests. This code simulates * aio_cancel on all pending I/O requests. */ TAILQ_FOREACH_SAFE(job, &ki->kaio_jobqueue, plist, jobn) { aio_cancel_job(p, ki, job); } /* Wait for all running I/O to be finished */ if (TAILQ_FIRST(&ki->kaio_jobqueue) || ki->kaio_active_count != 0) { ki->kaio_flags |= KAIO_WAKEUP; msleep(&p->p_aioinfo, AIO_MTX(ki), PRIBIO, "aioprn", hz); goto restart; } /* Free all completed I/O requests. */ while ((job = TAILQ_FIRST(&ki->kaio_done)) != NULL) aio_free_entry(job); while ((lj = TAILQ_FIRST(&ki->kaio_liojoblist)) != NULL) { if (lj->lioj_count == 0) { TAILQ_REMOVE(&ki->kaio_liojoblist, lj, lioj_list); knlist_delete(&lj->klist, curthread, 1); PROC_LOCK(p); sigqueue_take(&lj->lioj_ksi); PROC_UNLOCK(p); uma_zfree(aiolio_zone, lj); } else { panic("LIO job not cleaned up: C:%d, FC:%d\n", lj->lioj_count, lj->lioj_finished_count); } } AIO_UNLOCK(ki); taskqueue_drain(taskqueue_aiod_kick, &ki->kaio_task); taskqueue_drain(taskqueue_aiod_kick, &ki->kaio_sync_task); mtx_destroy(&ki->kaio_mtx); uma_zfree(kaio_zone, ki); p->p_aioinfo = NULL; } /* * Select a job to run (called by an AIO daemon). */ static struct kaiocb * aio_selectjob(struct aioproc *aiop) { struct kaiocb *job; struct kaioinfo *ki; struct proc *userp; mtx_assert(&aio_job_mtx, MA_OWNED); restart: TAILQ_FOREACH(job, &aio_jobs, list) { userp = job->userproc; ki = userp->p_aioinfo; if (ki->kaio_active_count < ki->kaio_maxactive_count) { TAILQ_REMOVE(&aio_jobs, job, list); if (!aio_clear_cancel_function(job)) goto restart; /* Account for currently active jobs. */ ki->kaio_active_count++; break; } } return (job); } /* * Move all data to a permanent storage device. This code * simulates the fsync syscall. */ static int aio_fsync_vnode(struct thread *td, struct vnode *vp) { struct mount *mp; int error; if ((error = vn_start_write(vp, &mp, V_WAIT | PCATCH)) != 0) goto drop; vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); if (vp->v_object != NULL) { VM_OBJECT_WLOCK(vp->v_object); vm_object_page_clean(vp->v_object, 0, 0, 0); VM_OBJECT_WUNLOCK(vp->v_object); } error = VOP_FSYNC(vp, MNT_WAIT, td); VOP_UNLOCK(vp, 0); vn_finished_write(mp); drop: return (error); } /* * The AIO processing activity for LIO_READ/LIO_WRITE. This is the code that * does the I/O request for the non-physio version of the operations. The * normal vn operations are used, and this code should work in all instances * for every type of file, including pipes, sockets, fifos, and regular files. * * XXX I don't think it works well for socket, pipe, and fifo. */ static void aio_process_rw(struct kaiocb *job) { struct ucred *td_savedcred; struct thread *td; struct aiocb *cb; struct file *fp; struct uio auio; struct iovec aiov; ssize_t cnt; long msgsnd_st, msgsnd_end; long msgrcv_st, msgrcv_end; long oublock_st, oublock_end; long inblock_st, inblock_end; int error; KASSERT(job->uaiocb.aio_lio_opcode == LIO_READ || job->uaiocb.aio_lio_opcode == LIO_WRITE, ("%s: opcode %d", __func__, job->uaiocb.aio_lio_opcode)); aio_switch_vmspace(job); td = curthread; td_savedcred = td->td_ucred; td->td_ucred = job->cred; cb = &job->uaiocb; fp = job->fd_file; aiov.iov_base = (void *)(uintptr_t)cb->aio_buf; aiov.iov_len = cb->aio_nbytes; auio.uio_iov = &aiov; auio.uio_iovcnt = 1; auio.uio_offset = cb->aio_offset; auio.uio_resid = cb->aio_nbytes; cnt = cb->aio_nbytes; auio.uio_segflg = UIO_USERSPACE; auio.uio_td = td; msgrcv_st = td->td_ru.ru_msgrcv; msgsnd_st = td->td_ru.ru_msgsnd; inblock_st = td->td_ru.ru_inblock; oublock_st = td->td_ru.ru_oublock; /* * aio_aqueue() acquires a reference to the file that is * released in aio_free_entry(). */ if (cb->aio_lio_opcode == LIO_READ) { auio.uio_rw = UIO_READ; if (auio.uio_resid == 0) error = 0; else error = fo_read(fp, &auio, fp->f_cred, FOF_OFFSET, td); } else { if (fp->f_type == DTYPE_VNODE) bwillwrite(); auio.uio_rw = UIO_WRITE; error = fo_write(fp, &auio, fp->f_cred, FOF_OFFSET, td); } msgrcv_end = td->td_ru.ru_msgrcv; msgsnd_end = td->td_ru.ru_msgsnd; inblock_end = td->td_ru.ru_inblock; oublock_end = td->td_ru.ru_oublock; job->msgrcv = msgrcv_end - msgrcv_st; job->msgsnd = msgsnd_end - msgsnd_st; job->inblock = inblock_end - inblock_st; job->outblock = oublock_end - oublock_st; if ((error) && (auio.uio_resid != cnt)) { if (error == ERESTART || error == EINTR || error == EWOULDBLOCK) error = 0; if ((error == EPIPE) && (cb->aio_lio_opcode == LIO_WRITE)) { PROC_LOCK(job->userproc); kern_psignal(job->userproc, SIGPIPE); PROC_UNLOCK(job->userproc); } } cnt -= auio.uio_resid; td->td_ucred = td_savedcred; if (error) aio_complete(job, -1, error); else aio_complete(job, cnt, 0); } static void aio_process_sync(struct kaiocb *job) { struct thread *td = curthread; struct ucred *td_savedcred = td->td_ucred; struct file *fp = job->fd_file; int error = 0; KASSERT(job->uaiocb.aio_lio_opcode == LIO_SYNC, ("%s: opcode %d", __func__, job->uaiocb.aio_lio_opcode)); td->td_ucred = job->cred; if (fp->f_vnode != NULL) error = aio_fsync_vnode(td, fp->f_vnode); td->td_ucred = td_savedcred; if (error) aio_complete(job, -1, error); else aio_complete(job, 0, 0); } static void aio_process_mlock(struct kaiocb *job) { struct aiocb *cb = &job->uaiocb; int error; KASSERT(job->uaiocb.aio_lio_opcode == LIO_MLOCK, ("%s: opcode %d", __func__, job->uaiocb.aio_lio_opcode)); aio_switch_vmspace(job); error = vm_mlock(job->userproc, job->cred, __DEVOLATILE(void *, cb->aio_buf), cb->aio_nbytes); if (error) aio_complete(job, -1, error); else aio_complete(job, 0, 0); } static void aio_bio_done_notify(struct proc *userp, struct kaiocb *job) { struct aioliojob *lj; struct kaioinfo *ki; struct kaiocb *sjob, *sjobn; int lj_done; bool schedule_fsync; ki = userp->p_aioinfo; AIO_LOCK_ASSERT(ki, MA_OWNED); lj = job->lio; lj_done = 0; if (lj) { lj->lioj_finished_count++; if (lj->lioj_count == lj->lioj_finished_count) lj_done = 1; } TAILQ_INSERT_TAIL(&ki->kaio_done, job, plist); MPASS(job->jobflags & KAIOCB_FINISHED); if (ki->kaio_flags & KAIO_RUNDOWN) goto notification_done; if (job->uaiocb.aio_sigevent.sigev_notify == SIGEV_SIGNAL || job->uaiocb.aio_sigevent.sigev_notify == SIGEV_THREAD_ID) aio_sendsig(userp, &job->uaiocb.aio_sigevent, &job->ksi); KNOTE_LOCKED(&job->klist, 1); if (lj_done) { if (lj->lioj_signal.sigev_notify == SIGEV_KEVENT) { lj->lioj_flags |= LIOJ_KEVENT_POSTED; KNOTE_LOCKED(&lj->klist, 1); } if ((lj->lioj_flags & (LIOJ_SIGNAL|LIOJ_SIGNAL_POSTED)) == LIOJ_SIGNAL && (lj->lioj_signal.sigev_notify == SIGEV_SIGNAL || lj->lioj_signal.sigev_notify == SIGEV_THREAD_ID)) { aio_sendsig(userp, &lj->lioj_signal, &lj->lioj_ksi); lj->lioj_flags |= LIOJ_SIGNAL_POSTED; } } notification_done: if (job->jobflags & KAIOCB_CHECKSYNC) { schedule_fsync = false; TAILQ_FOREACH_SAFE(sjob, &ki->kaio_syncqueue, list, sjobn) { if (job->fd_file == sjob->fd_file && job->seqno < sjob->seqno) { if (--sjob->pending == 0) { TAILQ_REMOVE(&ki->kaio_syncqueue, sjob, list); if (!aio_clear_cancel_function(sjob)) continue; TAILQ_INSERT_TAIL(&ki->kaio_syncready, sjob, list); schedule_fsync = true; } } } if (schedule_fsync) taskqueue_enqueue(taskqueue_aiod_kick, &ki->kaio_sync_task); } if (ki->kaio_flags & KAIO_WAKEUP) { ki->kaio_flags &= ~KAIO_WAKEUP; wakeup(&userp->p_aioinfo); } } static void aio_schedule_fsync(void *context, int pending) { struct kaioinfo *ki; struct kaiocb *job; ki = context; AIO_LOCK(ki); while (!TAILQ_EMPTY(&ki->kaio_syncready)) { job = TAILQ_FIRST(&ki->kaio_syncready); TAILQ_REMOVE(&ki->kaio_syncready, job, list); AIO_UNLOCK(ki); aio_schedule(job, aio_process_sync); AIO_LOCK(ki); } AIO_UNLOCK(ki); } bool aio_cancel_cleared(struct kaiocb *job) { struct kaioinfo *ki; /* * The caller should hold the same queue lock held when * aio_clear_cancel_function() was called and set this flag * ensuring this check sees an up-to-date value. However, * there is no way to assert that. */ ki = job->userproc->p_aioinfo; return ((job->jobflags & KAIOCB_CLEARED) != 0); } bool aio_clear_cancel_function(struct kaiocb *job) { struct kaioinfo *ki; ki = job->userproc->p_aioinfo; AIO_LOCK(ki); MPASS(job->cancel_fn != NULL); if (job->jobflags & KAIOCB_CANCELLING) { job->jobflags |= KAIOCB_CLEARED; AIO_UNLOCK(ki); return (false); } job->cancel_fn = NULL; AIO_UNLOCK(ki); return (true); } bool aio_set_cancel_function(struct kaiocb *job, aio_cancel_fn_t *func) { struct kaioinfo *ki; ki = job->userproc->p_aioinfo; AIO_LOCK(ki); if (job->jobflags & KAIOCB_CANCELLED) { AIO_UNLOCK(ki); return (false); } job->cancel_fn = func; AIO_UNLOCK(ki); return (true); } void aio_complete(struct kaiocb *job, long status, int error) { struct kaioinfo *ki; struct proc *userp; job->uaiocb._aiocb_private.error = error; job->uaiocb._aiocb_private.status = status; userp = job->userproc; ki = userp->p_aioinfo; AIO_LOCK(ki); KASSERT(!(job->jobflags & KAIOCB_FINISHED), ("duplicate aio_complete")); job->jobflags |= KAIOCB_FINISHED; if ((job->jobflags & (KAIOCB_QUEUEING | KAIOCB_CANCELLING)) == 0) { TAILQ_REMOVE(&ki->kaio_jobqueue, job, plist); aio_bio_done_notify(userp, job); } AIO_UNLOCK(ki); } void aio_cancel(struct kaiocb *job) { aio_complete(job, -1, ECANCELED); } void aio_switch_vmspace(struct kaiocb *job) { vmspace_switch_aio(job->userproc->p_vmspace); } /* * The AIO daemon, most of the actual work is done in aio_process_*, * but the setup (and address space mgmt) is done in this routine. */ static void aio_daemon(void *_id) { struct kaiocb *job; struct aioproc *aiop; struct kaioinfo *ki; struct proc *p; struct vmspace *myvm; struct thread *td = curthread; int id = (intptr_t)_id; /* * Grab an extra reference on the daemon's vmspace so that it * doesn't get freed by jobs that switch to a different * vmspace. */ p = td->td_proc; myvm = vmspace_acquire_ref(p); KASSERT(p->p_textvp == NULL, ("kthread has a textvp")); /* * Allocate and ready the aio control info. There is one aiop structure * per daemon. */ aiop = uma_zalloc(aiop_zone, M_WAITOK); aiop->aioproc = p; aiop->aioprocflags = 0; /* * Wakeup parent process. (Parent sleeps to keep from blasting away * and creating too many daemons.) */ sema_post(&aio_newproc_sem); mtx_lock(&aio_job_mtx); for (;;) { /* * Take daemon off of free queue */ if (aiop->aioprocflags & AIOP_FREE) { TAILQ_REMOVE(&aio_freeproc, aiop, list); aiop->aioprocflags &= ~AIOP_FREE; } /* * Check for jobs. */ while ((job = aio_selectjob(aiop)) != NULL) { mtx_unlock(&aio_job_mtx); ki = job->userproc->p_aioinfo; job->handle_fn(job); mtx_lock(&aio_job_mtx); /* Decrement the active job count. */ ki->kaio_active_count--; } /* * Disconnect from user address space. */ if (p->p_vmspace != myvm) { mtx_unlock(&aio_job_mtx); vmspace_switch_aio(myvm); mtx_lock(&aio_job_mtx); /* * We have to restart to avoid race, we only sleep if * no job can be selected. */ continue; } mtx_assert(&aio_job_mtx, MA_OWNED); TAILQ_INSERT_HEAD(&aio_freeproc, aiop, list); aiop->aioprocflags |= AIOP_FREE; /* * If daemon is inactive for a long time, allow it to exit, * thereby freeing resources. */ if (msleep(p, &aio_job_mtx, PRIBIO, "aiordy", aiod_lifetime) == EWOULDBLOCK && TAILQ_EMPTY(&aio_jobs) && (aiop->aioprocflags & AIOP_FREE) && num_aio_procs > target_aio_procs) break; } TAILQ_REMOVE(&aio_freeproc, aiop, list); num_aio_procs--; mtx_unlock(&aio_job_mtx); uma_zfree(aiop_zone, aiop); free_unr(aiod_unr, id); vmspace_free(myvm); KASSERT(p->p_vmspace == myvm, ("AIOD: bad vmspace for exiting daemon")); KASSERT(myvm->vm_refcnt > 1, ("AIOD: bad vm refcnt for exiting daemon: %d", myvm->vm_refcnt)); kproc_exit(0); } /* * Create a new AIO daemon. This is mostly a kernel-thread fork routine. The * AIO daemon modifies its environment itself. */ static int aio_newproc(int *start) { int error; struct proc *p; int id; id = alloc_unr(aiod_unr); error = kproc_create(aio_daemon, (void *)(intptr_t)id, &p, RFNOWAIT, 0, "aiod%d", id); if (error == 0) { /* * Wait until daemon is started. */ sema_wait(&aio_newproc_sem); mtx_lock(&aio_job_mtx); num_aio_procs++; if (start != NULL) (*start)--; mtx_unlock(&aio_job_mtx); } else { free_unr(aiod_unr, id); } return (error); } /* * Try the high-performance, low-overhead physio method for eligible * VCHR devices. This method doesn't use an aio helper thread, and * thus has very low overhead. * * Assumes that the caller, aio_aqueue(), has incremented the file * structure's reference count, preventing its deallocation for the * duration of this call. */ static int aio_qphysio(struct proc *p, struct kaiocb *job) { struct aiocb *cb; struct file *fp; struct bio *bp; struct buf *pbuf; struct vnode *vp; struct cdevsw *csw; struct cdev *dev; struct kaioinfo *ki; int error, ref, poff; vm_prot_t prot; cb = &job->uaiocb; fp = job->fd_file; if (fp == NULL || fp->f_type != DTYPE_VNODE) return (-1); vp = fp->f_vnode; if (vp->v_type != VCHR) return (-1); if (vp->v_bufobj.bo_bsize == 0) return (-1); if (cb->aio_nbytes % vp->v_bufobj.bo_bsize) return (-1); ref = 0; csw = devvn_refthread(vp, &dev, &ref); if (csw == NULL) return (ENXIO); if ((csw->d_flags & D_DISK) == 0) { error = -1; goto unref; } if (cb->aio_nbytes > dev->si_iosize_max) { error = -1; goto unref; } ki = p->p_aioinfo; poff = (vm_offset_t)cb->aio_buf & PAGE_MASK; if ((dev->si_flags & SI_UNMAPPED) && unmapped_buf_allowed) { if (cb->aio_nbytes > MAXPHYS) { error = -1; goto unref; } pbuf = NULL; } else { if (cb->aio_nbytes > MAXPHYS - poff) { error = -1; goto unref; } if (ki->kaio_buffer_count >= ki->kaio_ballowed_count) { error = -1; goto unref; } job->pbuf = pbuf = (struct buf *)getpbuf(NULL); BUF_KERNPROC(pbuf); AIO_LOCK(ki); ki->kaio_buffer_count++; AIO_UNLOCK(ki); } job->bp = bp = g_alloc_bio(); bp->bio_length = cb->aio_nbytes; bp->bio_bcount = cb->aio_nbytes; bp->bio_done = aio_physwakeup; bp->bio_data = (void *)(uintptr_t)cb->aio_buf; bp->bio_offset = cb->aio_offset; bp->bio_cmd = cb->aio_lio_opcode == LIO_WRITE ? BIO_WRITE : BIO_READ; bp->bio_dev = dev; bp->bio_caller1 = (void *)job; prot = VM_PROT_READ; if (cb->aio_lio_opcode == LIO_READ) prot |= VM_PROT_WRITE; /* Less backwards than it looks */ job->npages = vm_fault_quick_hold_pages(&curproc->p_vmspace->vm_map, (vm_offset_t)bp->bio_data, bp->bio_length, prot, job->pages, nitems(job->pages)); if (job->npages < 0) { error = EFAULT; goto doerror; } if (pbuf != NULL) { pmap_qenter((vm_offset_t)pbuf->b_data, job->pages, job->npages); bp->bio_data = pbuf->b_data + poff; atomic_add_int(&num_buf_aio, 1); } else { bp->bio_ma = job->pages; bp->bio_ma_n = job->npages; bp->bio_ma_offset = poff; bp->bio_data = unmapped_buf; bp->bio_flags |= BIO_UNMAPPED; } /* Perform transfer. */ csw->d_strategy(bp); dev_relthread(dev, ref); return (0); doerror: if (pbuf != NULL) { AIO_LOCK(ki); ki->kaio_buffer_count--; AIO_UNLOCK(ki); relpbuf(pbuf, NULL); job->pbuf = NULL; } g_destroy_bio(bp); job->bp = NULL; unref: dev_relthread(dev, ref); return (error); } #ifdef COMPAT_FREEBSD6 static int convert_old_sigevent(struct osigevent *osig, struct sigevent *nsig) { /* * Only SIGEV_NONE, SIGEV_SIGNAL, and SIGEV_KEVENT are * supported by AIO with the old sigevent structure. */ nsig->sigev_notify = osig->sigev_notify; switch (nsig->sigev_notify) { case SIGEV_NONE: break; case SIGEV_SIGNAL: nsig->sigev_signo = osig->__sigev_u.__sigev_signo; break; case SIGEV_KEVENT: nsig->sigev_notify_kqueue = osig->__sigev_u.__sigev_notify_kqueue; nsig->sigev_value.sival_ptr = osig->sigev_value.sival_ptr; break; default: return (EINVAL); } return (0); } static int aiocb_copyin_old_sigevent(struct aiocb *ujob, struct aiocb *kjob) { struct oaiocb *ojob; int error; bzero(kjob, sizeof(struct aiocb)); error = copyin(ujob, kjob, sizeof(struct oaiocb)); if (error) return (error); ojob = (struct oaiocb *)kjob; return (convert_old_sigevent(&ojob->aio_sigevent, &kjob->aio_sigevent)); } #endif static int aiocb_copyin(struct aiocb *ujob, struct aiocb *kjob) { return (copyin(ujob, kjob, sizeof(struct aiocb))); } static long aiocb_fetch_status(struct aiocb *ujob) { return (fuword(&ujob->_aiocb_private.status)); } static long aiocb_fetch_error(struct aiocb *ujob) { return (fuword(&ujob->_aiocb_private.error)); } static int aiocb_store_status(struct aiocb *ujob, long status) { return (suword(&ujob->_aiocb_private.status, status)); } static int aiocb_store_error(struct aiocb *ujob, long error) { return (suword(&ujob->_aiocb_private.error, error)); } static int aiocb_store_kernelinfo(struct aiocb *ujob, long jobref) { return (suword(&ujob->_aiocb_private.kernelinfo, jobref)); } static int aiocb_store_aiocb(struct aiocb **ujobp, struct aiocb *ujob) { return (suword(ujobp, (long)ujob)); } static struct aiocb_ops aiocb_ops = { .copyin = aiocb_copyin, .fetch_status = aiocb_fetch_status, .fetch_error = aiocb_fetch_error, .store_status = aiocb_store_status, .store_error = aiocb_store_error, .store_kernelinfo = aiocb_store_kernelinfo, .store_aiocb = aiocb_store_aiocb, }; #ifdef COMPAT_FREEBSD6 static struct aiocb_ops aiocb_ops_osigevent = { .copyin = aiocb_copyin_old_sigevent, .fetch_status = aiocb_fetch_status, .fetch_error = aiocb_fetch_error, .store_status = aiocb_store_status, .store_error = aiocb_store_error, .store_kernelinfo = aiocb_store_kernelinfo, .store_aiocb = aiocb_store_aiocb, }; #endif /* * Queue a new AIO request. Choosing either the threaded or direct physio VCHR * technique is done in this code. */ int aio_aqueue(struct thread *td, struct aiocb *ujob, struct aioliojob *lj, int type, struct aiocb_ops *ops) { struct proc *p = td->td_proc; cap_rights_t rights; struct file *fp; struct kaiocb *job; struct kaioinfo *ki; struct kevent kev; int opcode; int error; int fd, kqfd; int jid; u_short evflags; if (p->p_aioinfo == NULL) aio_init_aioinfo(p); ki = p->p_aioinfo; ops->store_status(ujob, -1); ops->store_error(ujob, 0); ops->store_kernelinfo(ujob, -1); if (num_queue_count >= max_queue_count || ki->kaio_count >= ki->kaio_qallowed_count) { ops->store_error(ujob, EAGAIN); return (EAGAIN); } job = uma_zalloc(aiocb_zone, M_WAITOK | M_ZERO); knlist_init_mtx(&job->klist, AIO_MTX(ki)); error = ops->copyin(ujob, &job->uaiocb); if (error) { ops->store_error(ujob, error); uma_zfree(aiocb_zone, job); return (error); } if (job->uaiocb.aio_nbytes > IOSIZE_MAX) { uma_zfree(aiocb_zone, job); return (EINVAL); } if (job->uaiocb.aio_sigevent.sigev_notify != SIGEV_KEVENT && job->uaiocb.aio_sigevent.sigev_notify != SIGEV_SIGNAL && job->uaiocb.aio_sigevent.sigev_notify != SIGEV_THREAD_ID && job->uaiocb.aio_sigevent.sigev_notify != SIGEV_NONE) { ops->store_error(ujob, EINVAL); uma_zfree(aiocb_zone, job); return (EINVAL); } if ((job->uaiocb.aio_sigevent.sigev_notify == SIGEV_SIGNAL || job->uaiocb.aio_sigevent.sigev_notify == SIGEV_THREAD_ID) && !_SIG_VALID(job->uaiocb.aio_sigevent.sigev_signo)) { uma_zfree(aiocb_zone, job); return (EINVAL); } ksiginfo_init(&job->ksi); /* Save userspace address of the job info. */ job->ujob = ujob; /* Get the opcode. */ if (type != LIO_NOP) job->uaiocb.aio_lio_opcode = type; opcode = job->uaiocb.aio_lio_opcode; /* * Validate the opcode and fetch the file object for the specified * file descriptor. * * XXXRW: Moved the opcode validation up here so that we don't * retrieve a file descriptor without knowing what the capabiltity * should be. */ fd = job->uaiocb.aio_fildes; switch (opcode) { case LIO_WRITE: error = fget_write(td, fd, cap_rights_init(&rights, CAP_PWRITE), &fp); break; case LIO_READ: error = fget_read(td, fd, cap_rights_init(&rights, CAP_PREAD), &fp); break; case LIO_SYNC: error = fget(td, fd, cap_rights_init(&rights, CAP_FSYNC), &fp); break; case LIO_MLOCK: fp = NULL; break; case LIO_NOP: error = fget(td, fd, cap_rights_init(&rights), &fp); break; default: error = EINVAL; } if (error) { uma_zfree(aiocb_zone, job); ops->store_error(ujob, error); return (error); } if (opcode == LIO_SYNC && fp->f_vnode == NULL) { error = EINVAL; goto aqueue_fail; } if (opcode != LIO_SYNC && job->uaiocb.aio_offset == -1LL) { error = EINVAL; goto aqueue_fail; } job->fd_file = fp; mtx_lock(&aio_job_mtx); jid = jobrefid++; job->seqno = jobseqno++; mtx_unlock(&aio_job_mtx); error = ops->store_kernelinfo(ujob, jid); if (error) { error = EINVAL; goto aqueue_fail; } job->uaiocb._aiocb_private.kernelinfo = (void *)(intptr_t)jid; if (opcode == LIO_NOP) { fdrop(fp, td); uma_zfree(aiocb_zone, job); return (0); } if (job->uaiocb.aio_sigevent.sigev_notify != SIGEV_KEVENT) goto no_kqueue; evflags = job->uaiocb.aio_sigevent.sigev_notify_kevent_flags; if ((evflags & ~(EV_CLEAR | EV_DISPATCH | EV_ONESHOT)) != 0) { error = EINVAL; goto aqueue_fail; } kqfd = job->uaiocb.aio_sigevent.sigev_notify_kqueue; kev.ident = (uintptr_t)job->ujob; kev.filter = EVFILT_AIO; kev.flags = EV_ADD | EV_ENABLE | EV_FLAG1 | evflags; kev.data = (intptr_t)job; kev.udata = job->uaiocb.aio_sigevent.sigev_value.sival_ptr; error = kqfd_register(kqfd, &kev, td, 1); if (error) goto aqueue_fail; no_kqueue: ops->store_error(ujob, EINPROGRESS); job->uaiocb._aiocb_private.error = EINPROGRESS; job->userproc = p; job->cred = crhold(td->td_ucred); job->jobflags = KAIOCB_QUEUEING; job->lio = lj; if (opcode == LIO_MLOCK) { aio_schedule(job, aio_process_mlock); error = 0; } else if (fp->f_ops->fo_aio_queue == NULL) error = aio_queue_file(fp, job); else error = fo_aio_queue(fp, job); if (error) goto aqueue_fail; AIO_LOCK(ki); job->jobflags &= ~KAIOCB_QUEUEING; TAILQ_INSERT_TAIL(&ki->kaio_all, job, allist); ki->kaio_count++; if (lj) lj->lioj_count++; atomic_add_int(&num_queue_count, 1); if (job->jobflags & KAIOCB_FINISHED) { /* * The queue callback completed the request synchronously. * The bulk of the completion is deferred in that case * until this point. */ aio_bio_done_notify(p, job); } else TAILQ_INSERT_TAIL(&ki->kaio_jobqueue, job, plist); AIO_UNLOCK(ki); return (0); aqueue_fail: knlist_delete(&job->klist, curthread, 0); if (fp) fdrop(fp, td); uma_zfree(aiocb_zone, job); ops->store_error(ujob, error); return (error); } static void aio_cancel_daemon_job(struct kaiocb *job) { mtx_lock(&aio_job_mtx); if (!aio_cancel_cleared(job)) TAILQ_REMOVE(&aio_jobs, job, list); mtx_unlock(&aio_job_mtx); aio_cancel(job); } void aio_schedule(struct kaiocb *job, aio_handle_fn_t *func) { mtx_lock(&aio_job_mtx); if (!aio_set_cancel_function(job, aio_cancel_daemon_job)) { mtx_unlock(&aio_job_mtx); aio_cancel(job); return; } job->handle_fn = func; TAILQ_INSERT_TAIL(&aio_jobs, job, list); aio_kick_nowait(job->userproc); mtx_unlock(&aio_job_mtx); } static void aio_cancel_sync(struct kaiocb *job) { struct kaioinfo *ki; ki = job->userproc->p_aioinfo; mtx_lock(&aio_job_mtx); if (!aio_cancel_cleared(job)) TAILQ_REMOVE(&ki->kaio_syncqueue, job, list); mtx_unlock(&aio_job_mtx); aio_cancel(job); } int aio_queue_file(struct file *fp, struct kaiocb *job) { struct aioliojob *lj; struct kaioinfo *ki; struct kaiocb *job2; + struct vnode *vp; + struct mount *mp; int error, opcode; + bool safe; lj = job->lio; ki = job->userproc->p_aioinfo; opcode = job->uaiocb.aio_lio_opcode; if (opcode == LIO_SYNC) goto queueit; if ((error = aio_qphysio(job->userproc, job)) == 0) goto done; #if 0 /* * XXX: This means qphysio() failed with EFAULT. The current * behavior is to retry the operation via fo_read/fo_write. * Wouldn't it be better to just complete the request with an * error here? */ if (error > 0) goto done; #endif queueit: - if (!enable_aio_unsafe) + safe = false; + if (fp->f_type == DTYPE_VNODE) { + vp = fp->f_vnode; + if (vp->v_type == VREG || vp->v_type == VDIR) { + mp = fp->f_vnode->v_mount; + if (mp == NULL || (mp->mnt_flag & MNT_LOCAL) != 0) + safe = true; + } + } + if (!(safe || enable_aio_unsafe)) { + counted_warning(&unsafe_warningcnt, + "is attempting to use unsafe AIO requests"); return (EOPNOTSUPP); + } if (opcode == LIO_SYNC) { AIO_LOCK(ki); TAILQ_FOREACH(job2, &ki->kaio_jobqueue, plist) { if (job2->fd_file == job->fd_file && job2->uaiocb.aio_lio_opcode != LIO_SYNC && job2->seqno < job->seqno) { job2->jobflags |= KAIOCB_CHECKSYNC; job->pending++; } } if (job->pending != 0) { if (!aio_set_cancel_function(job, aio_cancel_sync)) { AIO_UNLOCK(ki); aio_cancel(job); return (0); } TAILQ_INSERT_TAIL(&ki->kaio_syncqueue, job, list); AIO_UNLOCK(ki); return (0); } AIO_UNLOCK(ki); } switch (opcode) { case LIO_READ: case LIO_WRITE: aio_schedule(job, aio_process_rw); error = 0; break; case LIO_SYNC: aio_schedule(job, aio_process_sync); error = 0; break; default: error = EINVAL; } done: return (error); } static void aio_kick_nowait(struct proc *userp) { struct kaioinfo *ki = userp->p_aioinfo; struct aioproc *aiop; mtx_assert(&aio_job_mtx, MA_OWNED); if ((aiop = TAILQ_FIRST(&aio_freeproc)) != NULL) { TAILQ_REMOVE(&aio_freeproc, aiop, list); aiop->aioprocflags &= ~AIOP_FREE; wakeup(aiop->aioproc); } else if (num_aio_resv_start + num_aio_procs < max_aio_procs && ki->kaio_active_count + num_aio_resv_start < ki->kaio_maxactive_count) { taskqueue_enqueue(taskqueue_aiod_kick, &ki->kaio_task); } } static int aio_kick(struct proc *userp) { struct kaioinfo *ki = userp->p_aioinfo; struct aioproc *aiop; int error, ret = 0; mtx_assert(&aio_job_mtx, MA_OWNED); retryproc: if ((aiop = TAILQ_FIRST(&aio_freeproc)) != NULL) { TAILQ_REMOVE(&aio_freeproc, aiop, list); aiop->aioprocflags &= ~AIOP_FREE; wakeup(aiop->aioproc); } else if (num_aio_resv_start + num_aio_procs < max_aio_procs && ki->kaio_active_count + num_aio_resv_start < ki->kaio_maxactive_count) { num_aio_resv_start++; mtx_unlock(&aio_job_mtx); error = aio_newproc(&num_aio_resv_start); mtx_lock(&aio_job_mtx); if (error) { num_aio_resv_start--; goto retryproc; } } else { ret = -1; } return (ret); } static void aio_kick_helper(void *context, int pending) { struct proc *userp = context; mtx_lock(&aio_job_mtx); while (--pending >= 0) { if (aio_kick(userp)) break; } mtx_unlock(&aio_job_mtx); } /* * Support the aio_return system call, as a side-effect, kernel resources are * released. */ static int kern_aio_return(struct thread *td, struct aiocb *ujob, struct aiocb_ops *ops) { struct proc *p = td->td_proc; struct kaiocb *job; struct kaioinfo *ki; long status, error; ki = p->p_aioinfo; if (ki == NULL) return (EINVAL); AIO_LOCK(ki); TAILQ_FOREACH(job, &ki->kaio_done, plist) { if (job->ujob == ujob) break; } if (job != NULL) { MPASS(job->jobflags & KAIOCB_FINISHED); status = job->uaiocb._aiocb_private.status; error = job->uaiocb._aiocb_private.error; td->td_retval[0] = status; td->td_ru.ru_oublock += job->outblock; td->td_ru.ru_inblock += job->inblock; td->td_ru.ru_msgsnd += job->msgsnd; td->td_ru.ru_msgrcv += job->msgrcv; aio_free_entry(job); AIO_UNLOCK(ki); ops->store_error(ujob, error); ops->store_status(ujob, status); } else { error = EINVAL; AIO_UNLOCK(ki); } return (error); } int sys_aio_return(struct thread *td, struct aio_return_args *uap) { return (kern_aio_return(td, uap->aiocbp, &aiocb_ops)); } /* * Allow a process to wakeup when any of the I/O requests are completed. */ static int kern_aio_suspend(struct thread *td, int njoblist, struct aiocb **ujoblist, struct timespec *ts) { struct proc *p = td->td_proc; struct timeval atv; struct kaioinfo *ki; struct kaiocb *firstjob, *job; int error, i, timo; timo = 0; if (ts) { if (ts->tv_nsec < 0 || ts->tv_nsec >= 1000000000) return (EINVAL); TIMESPEC_TO_TIMEVAL(&atv, ts); if (itimerfix(&atv)) return (EINVAL); timo = tvtohz(&atv); } ki = p->p_aioinfo; if (ki == NULL) return (EAGAIN); if (njoblist == 0) return (0); AIO_LOCK(ki); for (;;) { firstjob = NULL; error = 0; TAILQ_FOREACH(job, &ki->kaio_all, allist) { for (i = 0; i < njoblist; i++) { if (job->ujob == ujoblist[i]) { if (firstjob == NULL) firstjob = job; if (job->jobflags & KAIOCB_FINISHED) goto RETURN; } } } /* All tasks were finished. */ if (firstjob == NULL) break; ki->kaio_flags |= KAIO_WAKEUP; error = msleep(&p->p_aioinfo, AIO_MTX(ki), PRIBIO | PCATCH, "aiospn", timo); if (error == ERESTART) error = EINTR; if (error) break; } RETURN: AIO_UNLOCK(ki); return (error); } int sys_aio_suspend(struct thread *td, struct aio_suspend_args *uap) { struct timespec ts, *tsp; struct aiocb **ujoblist; int error; if (uap->nent < 0 || uap->nent > AIO_LISTIO_MAX) return (EINVAL); if (uap->timeout) { /* Get timespec struct. */ if ((error = copyin(uap->timeout, &ts, sizeof(ts))) != 0) return (error); tsp = &ts; } else tsp = NULL; ujoblist = uma_zalloc(aiol_zone, M_WAITOK); error = copyin(uap->aiocbp, ujoblist, uap->nent * sizeof(ujoblist[0])); if (error == 0) error = kern_aio_suspend(td, uap->nent, ujoblist, tsp); uma_zfree(aiol_zone, ujoblist); return (error); } /* * aio_cancel cancels any non-physio aio operations not currently in * progress. */ int sys_aio_cancel(struct thread *td, struct aio_cancel_args *uap) { struct proc *p = td->td_proc; struct kaioinfo *ki; struct kaiocb *job, *jobn; struct file *fp; cap_rights_t rights; int error; int cancelled = 0; int notcancelled = 0; struct vnode *vp; /* Lookup file object. */ error = fget(td, uap->fd, cap_rights_init(&rights), &fp); if (error) return (error); ki = p->p_aioinfo; if (ki == NULL) goto done; if (fp->f_type == DTYPE_VNODE) { vp = fp->f_vnode; if (vn_isdisk(vp, &error)) { fdrop(fp, td); td->td_retval[0] = AIO_NOTCANCELED; return (0); } } AIO_LOCK(ki); TAILQ_FOREACH_SAFE(job, &ki->kaio_jobqueue, plist, jobn) { if ((uap->fd == job->uaiocb.aio_fildes) && ((uap->aiocbp == NULL) || (uap->aiocbp == job->ujob))) { if (aio_cancel_job(p, ki, job)) { cancelled++; } else { notcancelled++; } if (uap->aiocbp != NULL) break; } } AIO_UNLOCK(ki); done: fdrop(fp, td); if (uap->aiocbp != NULL) { if (cancelled) { td->td_retval[0] = AIO_CANCELED; return (0); } } if (notcancelled) { td->td_retval[0] = AIO_NOTCANCELED; return (0); } if (cancelled) { td->td_retval[0] = AIO_CANCELED; return (0); } td->td_retval[0] = AIO_ALLDONE; return (0); } /* * aio_error is implemented in the kernel level for compatibility purposes * only. For a user mode async implementation, it would be best to do it in * a userland subroutine. */ static int kern_aio_error(struct thread *td, struct aiocb *ujob, struct aiocb_ops *ops) { struct proc *p = td->td_proc; struct kaiocb *job; struct kaioinfo *ki; int status; ki = p->p_aioinfo; if (ki == NULL) { td->td_retval[0] = EINVAL; return (0); } AIO_LOCK(ki); TAILQ_FOREACH(job, &ki->kaio_all, allist) { if (job->ujob == ujob) { if (job->jobflags & KAIOCB_FINISHED) td->td_retval[0] = job->uaiocb._aiocb_private.error; else td->td_retval[0] = EINPROGRESS; AIO_UNLOCK(ki); return (0); } } AIO_UNLOCK(ki); /* * Hack for failure of aio_aqueue. */ status = ops->fetch_status(ujob); if (status == -1) { td->td_retval[0] = ops->fetch_error(ujob); return (0); } td->td_retval[0] = EINVAL; return (0); } int sys_aio_error(struct thread *td, struct aio_error_args *uap) { return (kern_aio_error(td, uap->aiocbp, &aiocb_ops)); } /* syscall - asynchronous read from a file (REALTIME) */ #ifdef COMPAT_FREEBSD6 int freebsd6_aio_read(struct thread *td, struct freebsd6_aio_read_args *uap) { return (aio_aqueue(td, (struct aiocb *)uap->aiocbp, NULL, LIO_READ, &aiocb_ops_osigevent)); } #endif int sys_aio_read(struct thread *td, struct aio_read_args *uap) { return (aio_aqueue(td, uap->aiocbp, NULL, LIO_READ, &aiocb_ops)); } /* syscall - asynchronous write to a file (REALTIME) */ #ifdef COMPAT_FREEBSD6 int freebsd6_aio_write(struct thread *td, struct freebsd6_aio_write_args *uap) { return (aio_aqueue(td, (struct aiocb *)uap->aiocbp, NULL, LIO_WRITE, &aiocb_ops_osigevent)); } #endif int sys_aio_write(struct thread *td, struct aio_write_args *uap) { return (aio_aqueue(td, uap->aiocbp, NULL, LIO_WRITE, &aiocb_ops)); } int sys_aio_mlock(struct thread *td, struct aio_mlock_args *uap) { return (aio_aqueue(td, uap->aiocbp, NULL, LIO_MLOCK, &aiocb_ops)); } static int kern_lio_listio(struct thread *td, int mode, struct aiocb * const *uacb_list, struct aiocb **acb_list, int nent, struct sigevent *sig, struct aiocb_ops *ops) { struct proc *p = td->td_proc; struct aiocb *job; struct kaioinfo *ki; struct aioliojob *lj; struct kevent kev; int error; int nerror; int i; if ((mode != LIO_NOWAIT) && (mode != LIO_WAIT)) return (EINVAL); if (nent < 0 || nent > AIO_LISTIO_MAX) return (EINVAL); if (p->p_aioinfo == NULL) aio_init_aioinfo(p); ki = p->p_aioinfo; lj = uma_zalloc(aiolio_zone, M_WAITOK); lj->lioj_flags = 0; lj->lioj_count = 0; lj->lioj_finished_count = 0; knlist_init_mtx(&lj->klist, AIO_MTX(ki)); ksiginfo_init(&lj->lioj_ksi); /* * Setup signal. */ if (sig && (mode == LIO_NOWAIT)) { bcopy(sig, &lj->lioj_signal, sizeof(lj->lioj_signal)); if (lj->lioj_signal.sigev_notify == SIGEV_KEVENT) { /* Assume only new style KEVENT */ kev.filter = EVFILT_LIO; kev.flags = EV_ADD | EV_ENABLE | EV_FLAG1; kev.ident = (uintptr_t)uacb_list; /* something unique */ kev.data = (intptr_t)lj; /* pass user defined sigval data */ kev.udata = lj->lioj_signal.sigev_value.sival_ptr; error = kqfd_register( lj->lioj_signal.sigev_notify_kqueue, &kev, td, 1); if (error) { uma_zfree(aiolio_zone, lj); return (error); } } else if (lj->lioj_signal.sigev_notify == SIGEV_NONE) { ; } else if (lj->lioj_signal.sigev_notify == SIGEV_SIGNAL || lj->lioj_signal.sigev_notify == SIGEV_THREAD_ID) { if (!_SIG_VALID(lj->lioj_signal.sigev_signo)) { uma_zfree(aiolio_zone, lj); return EINVAL; } lj->lioj_flags |= LIOJ_SIGNAL; } else { uma_zfree(aiolio_zone, lj); return EINVAL; } } AIO_LOCK(ki); TAILQ_INSERT_TAIL(&ki->kaio_liojoblist, lj, lioj_list); /* * Add extra aiocb count to avoid the lio to be freed * by other threads doing aio_waitcomplete or aio_return, * and prevent event from being sent until we have queued * all tasks. */ lj->lioj_count = 1; AIO_UNLOCK(ki); /* * Get pointers to the list of I/O requests. */ nerror = 0; for (i = 0; i < nent; i++) { job = acb_list[i]; if (job != NULL) { error = aio_aqueue(td, job, lj, LIO_NOP, ops); if (error != 0) nerror++; } } error = 0; AIO_LOCK(ki); if (mode == LIO_WAIT) { while (lj->lioj_count - 1 != lj->lioj_finished_count) { ki->kaio_flags |= KAIO_WAKEUP; error = msleep(&p->p_aioinfo, AIO_MTX(ki), PRIBIO | PCATCH, "aiospn", 0); if (error == ERESTART) error = EINTR; if (error) break; } } else { if (lj->lioj_count - 1 == lj->lioj_finished_count) { if (lj->lioj_signal.sigev_notify == SIGEV_KEVENT) { lj->lioj_flags |= LIOJ_KEVENT_POSTED; KNOTE_LOCKED(&lj->klist, 1); } if ((lj->lioj_flags & (LIOJ_SIGNAL|LIOJ_SIGNAL_POSTED)) == LIOJ_SIGNAL && (lj->lioj_signal.sigev_notify == SIGEV_SIGNAL || lj->lioj_signal.sigev_notify == SIGEV_THREAD_ID)) { aio_sendsig(p, &lj->lioj_signal, &lj->lioj_ksi); lj->lioj_flags |= LIOJ_SIGNAL_POSTED; } } } lj->lioj_count--; if (lj->lioj_count == 0) { TAILQ_REMOVE(&ki->kaio_liojoblist, lj, lioj_list); knlist_delete(&lj->klist, curthread, 1); PROC_LOCK(p); sigqueue_take(&lj->lioj_ksi); PROC_UNLOCK(p); AIO_UNLOCK(ki); uma_zfree(aiolio_zone, lj); } else AIO_UNLOCK(ki); if (nerror) return (EIO); return (error); } /* syscall - list directed I/O (REALTIME) */ #ifdef COMPAT_FREEBSD6 int freebsd6_lio_listio(struct thread *td, struct freebsd6_lio_listio_args *uap) { struct aiocb **acb_list; struct sigevent *sigp, sig; struct osigevent osig; int error, nent; if ((uap->mode != LIO_NOWAIT) && (uap->mode != LIO_WAIT)) return (EINVAL); nent = uap->nent; if (nent < 0 || nent > AIO_LISTIO_MAX) return (EINVAL); if (uap->sig && (uap->mode == LIO_NOWAIT)) { error = copyin(uap->sig, &osig, sizeof(osig)); if (error) return (error); error = convert_old_sigevent(&osig, &sig); if (error) return (error); sigp = &sig; } else sigp = NULL; acb_list = malloc(sizeof(struct aiocb *) * nent, M_LIO, M_WAITOK); error = copyin(uap->acb_list, acb_list, nent * sizeof(acb_list[0])); if (error == 0) error = kern_lio_listio(td, uap->mode, (struct aiocb * const *)uap->acb_list, acb_list, nent, sigp, &aiocb_ops_osigevent); free(acb_list, M_LIO); return (error); } #endif /* syscall - list directed I/O (REALTIME) */ int sys_lio_listio(struct thread *td, struct lio_listio_args *uap) { struct aiocb **acb_list; struct sigevent *sigp, sig; int error, nent; if ((uap->mode != LIO_NOWAIT) && (uap->mode != LIO_WAIT)) return (EINVAL); nent = uap->nent; if (nent < 0 || nent > AIO_LISTIO_MAX) return (EINVAL); if (uap->sig && (uap->mode == LIO_NOWAIT)) { error = copyin(uap->sig, &sig, sizeof(sig)); if (error) return (error); sigp = &sig; } else sigp = NULL; acb_list = malloc(sizeof(struct aiocb *) * nent, M_LIO, M_WAITOK); error = copyin(uap->acb_list, acb_list, nent * sizeof(acb_list[0])); if (error == 0) error = kern_lio_listio(td, uap->mode, uap->acb_list, acb_list, nent, sigp, &aiocb_ops); free(acb_list, M_LIO); return (error); } static void aio_physwakeup(struct bio *bp) { struct kaiocb *job = (struct kaiocb *)bp->bio_caller1; struct proc *userp; struct kaioinfo *ki; size_t nbytes; int error, nblks; /* Release mapping into kernel space. */ userp = job->userproc; ki = userp->p_aioinfo; if (job->pbuf) { pmap_qremove((vm_offset_t)job->pbuf->b_data, job->npages); relpbuf(job->pbuf, NULL); job->pbuf = NULL; atomic_subtract_int(&num_buf_aio, 1); AIO_LOCK(ki); ki->kaio_buffer_count--; AIO_UNLOCK(ki); } vm_page_unhold_pages(job->pages, job->npages); bp = job->bp; job->bp = NULL; nbytes = job->uaiocb.aio_nbytes - bp->bio_resid; error = 0; if (bp->bio_flags & BIO_ERROR) error = bp->bio_error; nblks = btodb(nbytes); if (job->uaiocb.aio_lio_opcode == LIO_WRITE) job->outblock += nblks; else job->inblock += nblks; if (error) aio_complete(job, -1, error); else aio_complete(job, nbytes, 0); g_destroy_bio(bp); } /* syscall - wait for the next completion of an aio request */ static int kern_aio_waitcomplete(struct thread *td, struct aiocb **ujobp, struct timespec *ts, struct aiocb_ops *ops) { struct proc *p = td->td_proc; struct timeval atv; struct kaioinfo *ki; struct kaiocb *job; struct aiocb *ujob; long error, status; int timo; ops->store_aiocb(ujobp, NULL); if (ts == NULL) { timo = 0; } else if (ts->tv_sec == 0 && ts->tv_nsec == 0) { timo = -1; } else { if ((ts->tv_nsec < 0) || (ts->tv_nsec >= 1000000000)) return (EINVAL); TIMESPEC_TO_TIMEVAL(&atv, ts); if (itimerfix(&atv)) return (EINVAL); timo = tvtohz(&atv); } if (p->p_aioinfo == NULL) aio_init_aioinfo(p); ki = p->p_aioinfo; error = 0; job = NULL; AIO_LOCK(ki); while ((job = TAILQ_FIRST(&ki->kaio_done)) == NULL) { if (timo == -1) { error = EWOULDBLOCK; break; } ki->kaio_flags |= KAIO_WAKEUP; error = msleep(&p->p_aioinfo, AIO_MTX(ki), PRIBIO | PCATCH, "aiowc", timo); if (timo && error == ERESTART) error = EINTR; if (error) break; } if (job != NULL) { MPASS(job->jobflags & KAIOCB_FINISHED); ujob = job->ujob; status = job->uaiocb._aiocb_private.status; error = job->uaiocb._aiocb_private.error; td->td_retval[0] = status; td->td_ru.ru_oublock += job->outblock; td->td_ru.ru_inblock += job->inblock; td->td_ru.ru_msgsnd += job->msgsnd; td->td_ru.ru_msgrcv += job->msgrcv; aio_free_entry(job); AIO_UNLOCK(ki); ops->store_aiocb(ujobp, ujob); ops->store_error(ujob, error); ops->store_status(ujob, status); } else AIO_UNLOCK(ki); return (error); } int sys_aio_waitcomplete(struct thread *td, struct aio_waitcomplete_args *uap) { struct timespec ts, *tsp; int error; if (uap->timeout) { /* Get timespec struct. */ error = copyin(uap->timeout, &ts, sizeof(ts)); if (error) return (error); tsp = &ts; } else tsp = NULL; return (kern_aio_waitcomplete(td, uap->aiocbp, tsp, &aiocb_ops)); } static int kern_aio_fsync(struct thread *td, int op, struct aiocb *ujob, struct aiocb_ops *ops) { struct proc *p = td->td_proc; struct kaioinfo *ki; if (op != O_SYNC) /* XXX lack of O_DSYNC */ return (EINVAL); ki = p->p_aioinfo; if (ki == NULL) aio_init_aioinfo(p); return (aio_aqueue(td, ujob, NULL, LIO_SYNC, ops)); } int sys_aio_fsync(struct thread *td, struct aio_fsync_args *uap) { return (kern_aio_fsync(td, uap->op, uap->aiocbp, &aiocb_ops)); } /* kqueue attach function */ static int filt_aioattach(struct knote *kn) { struct kaiocb *job = (struct kaiocb *)kn->kn_sdata; /* * The job pointer must be validated before using it, so * registration is restricted to the kernel; the user cannot * set EV_FLAG1. */ if ((kn->kn_flags & EV_FLAG1) == 0) return (EPERM); kn->kn_ptr.p_aio = job; kn->kn_flags &= ~EV_FLAG1; knlist_add(&job->klist, kn, 0); return (0); } /* kqueue detach function */ static void filt_aiodetach(struct knote *kn) { struct knlist *knl; knl = &kn->kn_ptr.p_aio->klist; knl->kl_lock(knl->kl_lockarg); if (!knlist_empty(knl)) knlist_remove(knl, kn, 1); knl->kl_unlock(knl->kl_lockarg); } /* kqueue filter function */ /*ARGSUSED*/ static int filt_aio(struct knote *kn, long hint) { struct kaiocb *job = kn->kn_ptr.p_aio; kn->kn_data = job->uaiocb._aiocb_private.error; if (!(job->jobflags & KAIOCB_FINISHED)) return (0); kn->kn_flags |= EV_EOF; return (1); } /* kqueue attach function */ static int filt_lioattach(struct knote *kn) { struct aioliojob * lj = (struct aioliojob *)kn->kn_sdata; /* * The aioliojob pointer must be validated before using it, so * registration is restricted to the kernel; the user cannot * set EV_FLAG1. */ if ((kn->kn_flags & EV_FLAG1) == 0) return (EPERM); kn->kn_ptr.p_lio = lj; kn->kn_flags &= ~EV_FLAG1; knlist_add(&lj->klist, kn, 0); return (0); } /* kqueue detach function */ static void filt_liodetach(struct knote *kn) { struct knlist *knl; knl = &kn->kn_ptr.p_lio->klist; knl->kl_lock(knl->kl_lockarg); if (!knlist_empty(knl)) knlist_remove(knl, kn, 1); knl->kl_unlock(knl->kl_lockarg); } /* kqueue filter function */ /*ARGSUSED*/ static int filt_lio(struct knote *kn, long hint) { struct aioliojob * lj = kn->kn_ptr.p_lio; return (lj->lioj_flags & LIOJ_KEVENT_POSTED); } #ifdef COMPAT_FREEBSD32 #include #include #include #include #include #include #include struct __aiocb_private32 { int32_t status; int32_t error; uint32_t kernelinfo; }; #ifdef COMPAT_FREEBSD6 typedef struct oaiocb32 { int aio_fildes; /* File descriptor */ uint64_t aio_offset __packed; /* File offset for I/O */ uint32_t aio_buf; /* I/O buffer in process space */ uint32_t aio_nbytes; /* Number of bytes for I/O */ struct osigevent32 aio_sigevent; /* Signal to deliver */ int aio_lio_opcode; /* LIO opcode */ int aio_reqprio; /* Request priority -- ignored */ struct __aiocb_private32 _aiocb_private; } oaiocb32_t; #endif typedef struct aiocb32 { int32_t aio_fildes; /* File descriptor */ uint64_t aio_offset __packed; /* File offset for I/O */ uint32_t aio_buf; /* I/O buffer in process space */ uint32_t aio_nbytes; /* Number of bytes for I/O */ int __spare__[2]; uint32_t __spare2__; int aio_lio_opcode; /* LIO opcode */ int aio_reqprio; /* Request priority -- ignored */ struct __aiocb_private32 _aiocb_private; struct sigevent32 aio_sigevent; /* Signal to deliver */ } aiocb32_t; #ifdef COMPAT_FREEBSD6 static int convert_old_sigevent32(struct osigevent32 *osig, struct sigevent *nsig) { /* * Only SIGEV_NONE, SIGEV_SIGNAL, and SIGEV_KEVENT are * supported by AIO with the old sigevent structure. */ CP(*osig, *nsig, sigev_notify); switch (nsig->sigev_notify) { case SIGEV_NONE: break; case SIGEV_SIGNAL: nsig->sigev_signo = osig->__sigev_u.__sigev_signo; break; case SIGEV_KEVENT: nsig->sigev_notify_kqueue = osig->__sigev_u.__sigev_notify_kqueue; PTRIN_CP(*osig, *nsig, sigev_value.sival_ptr); break; default: return (EINVAL); } return (0); } static int aiocb32_copyin_old_sigevent(struct aiocb *ujob, struct aiocb *kjob) { struct oaiocb32 job32; int error; bzero(kjob, sizeof(struct aiocb)); error = copyin(ujob, &job32, sizeof(job32)); if (error) return (error); CP(job32, *kjob, aio_fildes); CP(job32, *kjob, aio_offset); PTRIN_CP(job32, *kjob, aio_buf); CP(job32, *kjob, aio_nbytes); CP(job32, *kjob, aio_lio_opcode); CP(job32, *kjob, aio_reqprio); CP(job32, *kjob, _aiocb_private.status); CP(job32, *kjob, _aiocb_private.error); PTRIN_CP(job32, *kjob, _aiocb_private.kernelinfo); return (convert_old_sigevent32(&job32.aio_sigevent, &kjob->aio_sigevent)); } #endif static int aiocb32_copyin(struct aiocb *ujob, struct aiocb *kjob) { struct aiocb32 job32; int error; error = copyin(ujob, &job32, sizeof(job32)); if (error) return (error); CP(job32, *kjob, aio_fildes); CP(job32, *kjob, aio_offset); PTRIN_CP(job32, *kjob, aio_buf); CP(job32, *kjob, aio_nbytes); CP(job32, *kjob, aio_lio_opcode); CP(job32, *kjob, aio_reqprio); CP(job32, *kjob, _aiocb_private.status); CP(job32, *kjob, _aiocb_private.error); PTRIN_CP(job32, *kjob, _aiocb_private.kernelinfo); return (convert_sigevent32(&job32.aio_sigevent, &kjob->aio_sigevent)); } static long aiocb32_fetch_status(struct aiocb *ujob) { struct aiocb32 *ujob32; ujob32 = (struct aiocb32 *)ujob; return (fuword32(&ujob32->_aiocb_private.status)); } static long aiocb32_fetch_error(struct aiocb *ujob) { struct aiocb32 *ujob32; ujob32 = (struct aiocb32 *)ujob; return (fuword32(&ujob32->_aiocb_private.error)); } static int aiocb32_store_status(struct aiocb *ujob, long status) { struct aiocb32 *ujob32; ujob32 = (struct aiocb32 *)ujob; return (suword32(&ujob32->_aiocb_private.status, status)); } static int aiocb32_store_error(struct aiocb *ujob, long error) { struct aiocb32 *ujob32; ujob32 = (struct aiocb32 *)ujob; return (suword32(&ujob32->_aiocb_private.error, error)); } static int aiocb32_store_kernelinfo(struct aiocb *ujob, long jobref) { struct aiocb32 *ujob32; ujob32 = (struct aiocb32 *)ujob; return (suword32(&ujob32->_aiocb_private.kernelinfo, jobref)); } static int aiocb32_store_aiocb(struct aiocb **ujobp, struct aiocb *ujob) { return (suword32(ujobp, (long)ujob)); } static struct aiocb_ops aiocb32_ops = { .copyin = aiocb32_copyin, .fetch_status = aiocb32_fetch_status, .fetch_error = aiocb32_fetch_error, .store_status = aiocb32_store_status, .store_error = aiocb32_store_error, .store_kernelinfo = aiocb32_store_kernelinfo, .store_aiocb = aiocb32_store_aiocb, }; #ifdef COMPAT_FREEBSD6 static struct aiocb_ops aiocb32_ops_osigevent = { .copyin = aiocb32_copyin_old_sigevent, .fetch_status = aiocb32_fetch_status, .fetch_error = aiocb32_fetch_error, .store_status = aiocb32_store_status, .store_error = aiocb32_store_error, .store_kernelinfo = aiocb32_store_kernelinfo, .store_aiocb = aiocb32_store_aiocb, }; #endif int freebsd32_aio_return(struct thread *td, struct freebsd32_aio_return_args *uap) { return (kern_aio_return(td, (struct aiocb *)uap->aiocbp, &aiocb32_ops)); } int freebsd32_aio_suspend(struct thread *td, struct freebsd32_aio_suspend_args *uap) { struct timespec32 ts32; struct timespec ts, *tsp; struct aiocb **ujoblist; uint32_t *ujoblist32; int error, i; if (uap->nent < 0 || uap->nent > AIO_LISTIO_MAX) return (EINVAL); if (uap->timeout) { /* Get timespec struct. */ if ((error = copyin(uap->timeout, &ts32, sizeof(ts32))) != 0) return (error); CP(ts32, ts, tv_sec); CP(ts32, ts, tv_nsec); tsp = &ts; } else tsp = NULL; ujoblist = uma_zalloc(aiol_zone, M_WAITOK); ujoblist32 = (uint32_t *)ujoblist; error = copyin(uap->aiocbp, ujoblist32, uap->nent * sizeof(ujoblist32[0])); if (error == 0) { for (i = uap->nent; i > 0; i--) ujoblist[i] = PTRIN(ujoblist32[i]); error = kern_aio_suspend(td, uap->nent, ujoblist, tsp); } uma_zfree(aiol_zone, ujoblist); return (error); } int freebsd32_aio_error(struct thread *td, struct freebsd32_aio_error_args *uap) { return (kern_aio_error(td, (struct aiocb *)uap->aiocbp, &aiocb32_ops)); } #ifdef COMPAT_FREEBSD6 int freebsd6_freebsd32_aio_read(struct thread *td, struct freebsd6_freebsd32_aio_read_args *uap) { return (aio_aqueue(td, (struct aiocb *)uap->aiocbp, NULL, LIO_READ, &aiocb32_ops_osigevent)); } #endif int freebsd32_aio_read(struct thread *td, struct freebsd32_aio_read_args *uap) { return (aio_aqueue(td, (struct aiocb *)uap->aiocbp, NULL, LIO_READ, &aiocb32_ops)); } #ifdef COMPAT_FREEBSD6 int freebsd6_freebsd32_aio_write(struct thread *td, struct freebsd6_freebsd32_aio_write_args *uap) { return (aio_aqueue(td, (struct aiocb *)uap->aiocbp, NULL, LIO_WRITE, &aiocb32_ops_osigevent)); } #endif int freebsd32_aio_write(struct thread *td, struct freebsd32_aio_write_args *uap) { return (aio_aqueue(td, (struct aiocb *)uap->aiocbp, NULL, LIO_WRITE, &aiocb32_ops)); } int freebsd32_aio_mlock(struct thread *td, struct freebsd32_aio_mlock_args *uap) { return (aio_aqueue(td, (struct aiocb *)uap->aiocbp, NULL, LIO_MLOCK, &aiocb32_ops)); } int freebsd32_aio_waitcomplete(struct thread *td, struct freebsd32_aio_waitcomplete_args *uap) { struct timespec32 ts32; struct timespec ts, *tsp; int error; if (uap->timeout) { /* Get timespec struct. */ error = copyin(uap->timeout, &ts32, sizeof(ts32)); if (error) return (error); CP(ts32, ts, tv_sec); CP(ts32, ts, tv_nsec); tsp = &ts; } else tsp = NULL; return (kern_aio_waitcomplete(td, (struct aiocb **)uap->aiocbp, tsp, &aiocb32_ops)); } int freebsd32_aio_fsync(struct thread *td, struct freebsd32_aio_fsync_args *uap) { return (kern_aio_fsync(td, uap->op, (struct aiocb *)uap->aiocbp, &aiocb32_ops)); } #ifdef COMPAT_FREEBSD6 int freebsd6_freebsd32_lio_listio(struct thread *td, struct freebsd6_freebsd32_lio_listio_args *uap) { struct aiocb **acb_list; struct sigevent *sigp, sig; struct osigevent32 osig; uint32_t *acb_list32; int error, i, nent; if ((uap->mode != LIO_NOWAIT) && (uap->mode != LIO_WAIT)) return (EINVAL); nent = uap->nent; if (nent < 0 || nent > AIO_LISTIO_MAX) return (EINVAL); if (uap->sig && (uap->mode == LIO_NOWAIT)) { error = copyin(uap->sig, &osig, sizeof(osig)); if (error) return (error); error = convert_old_sigevent32(&osig, &sig); if (error) return (error); sigp = &sig; } else sigp = NULL; acb_list32 = malloc(sizeof(uint32_t) * nent, M_LIO, M_WAITOK); error = copyin(uap->acb_list, acb_list32, nent * sizeof(uint32_t)); if (error) { free(acb_list32, M_LIO); return (error); } acb_list = malloc(sizeof(struct aiocb *) * nent, M_LIO, M_WAITOK); for (i = 0; i < nent; i++) acb_list[i] = PTRIN(acb_list32[i]); free(acb_list32, M_LIO); error = kern_lio_listio(td, uap->mode, (struct aiocb * const *)uap->acb_list, acb_list, nent, sigp, &aiocb32_ops_osigevent); free(acb_list, M_LIO); return (error); } #endif int freebsd32_lio_listio(struct thread *td, struct freebsd32_lio_listio_args *uap) { struct aiocb **acb_list; struct sigevent *sigp, sig; struct sigevent32 sig32; uint32_t *acb_list32; int error, i, nent; if ((uap->mode != LIO_NOWAIT) && (uap->mode != LIO_WAIT)) return (EINVAL); nent = uap->nent; if (nent < 0 || nent > AIO_LISTIO_MAX) return (EINVAL); if (uap->sig && (uap->mode == LIO_NOWAIT)) { error = copyin(uap->sig, &sig32, sizeof(sig32)); if (error) return (error); error = convert_sigevent32(&sig32, &sig); if (error) return (error); sigp = &sig; } else sigp = NULL; acb_list32 = malloc(sizeof(uint32_t) * nent, M_LIO, M_WAITOK); error = copyin(uap->acb_list, acb_list32, nent * sizeof(uint32_t)); if (error) { free(acb_list32, M_LIO); return (error); } acb_list = malloc(sizeof(struct aiocb *) * nent, M_LIO, M_WAITOK); for (i = 0; i < nent; i++) acb_list[i] = PTRIN(acb_list32[i]); free(acb_list32, M_LIO); error = kern_lio_listio(td, uap->mode, (struct aiocb * const *)uap->acb_list, acb_list, nent, sigp, &aiocb32_ops); free(acb_list, M_LIO); return (error); } #endif Index: user/alc/PQ_LAUNDRY/sys/modules/cam/Makefile =================================================================== --- user/alc/PQ_LAUNDRY/sys/modules/cam/Makefile (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/modules/cam/Makefile (revision 303206) @@ -1,46 +1,49 @@ # $FreeBSD$ S= ${.CURDIR}/../.. -.PATH: $S/cam $S/cam/scsi $S/cam/ata $S/${MACHINE}/${MACHINE} +.PATH: $S/cam $S/cam/scsi $S/cam/ata $S/cam/nvme $S/${MACHINE}/${MACHINE} KMOD= cam # See sys/conf/options for the flags that go into the different opt_*.h files. SRCS= opt_cam.h SRCS+= opt_ada.h SRCS+= opt_scsi.h SRCS+= opt_cd.h SRCS+= opt_pt.h SRCS+= opt_sa.h SRCS+= opt_ses.h +SRCS+= opt_ddb.h SRCS+= device_if.h bus_if.h vnode_if.h SRCS+= cam.c SRCS+= cam_compat.c .if exists($S/${MACHINE}/${MACHINE}/cam_machdep.c) SRCS+= cam_machdep.c .endif SRCS+= cam_iosched.c cam_periph.c cam_queue.c cam_sim.c cam_xpt.c SRCS+= scsi_all.c scsi_cd.c scsi_ch.c SRCS+= scsi_da.c SRCS+= scsi_pass.c SRCS+= scsi_pt.c SRCS+= scsi_sa.c SRCS+= scsi_enc.c SRCS+= scsi_enc_ses.c SRCS+= scsi_enc_safte.c SRCS+= scsi_sg.c SRCS+= scsi_targ_bh.c scsi_target.c SRCS+= scsi_xpt.c SRCS+= smp_all.c SRCS+= ata_all.c SRCS+= ata_xpt.c SRCS+= ata_da.c .if exists($S/${MACHINE}/${MACHINE}/ata_machdep.c) SRCS+= ata_machdep.c .endif SRCS+= ata_pmp.c +SRCS+= nvme_all.c +SRCS+= nvme_xpt.c EXPORT_SYMS= YES # XXX evaluate .include Index: user/alc/PQ_LAUNDRY/sys/modules/uart/Makefile =================================================================== --- user/alc/PQ_LAUNDRY/sys/modules/uart/Makefile (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/modules/uart/Makefile (revision 303206) @@ -1,39 +1,40 @@ # $FreeBSD$ .PATH: ${.CURDIR}/../../dev/uart .if ${MACHINE_CPUARCH} == "sparc64" uart_bus_ebus= uart_bus_ebus.c .endif .if ${MACHINE_CPUARCH} == "arm" uart_dev_lpc= uart_dev_lpc.c .endif .if ${MACHINE_CPUARCH} == "aarch64" || ${MACHINE_CPUARCH} == "arm" || \ ${MACHINE_CPUARCH} == "sparc64" || ${MACHINE_CPUARCH} == "powerpc" ofw_bus_if= ofw_bus_if.h .endif .if ${MACHINE} == "i386" || ${MACHINE} == "amd64" _uart_cpu=uart_cpu_x86.c .else _uart_cpu=uart_cpu_${MACHINE}.c .endif .if exists(${.CURDIR:H:H}/dev/uart/${_uart_cpu}) uart_cpu_machine= ${_uart_cpu} .endif KMOD= uart SRCS= uart_bus_acpi.c ${uart_bus_ebus} uart_bus_isa.c uart_bus_pccard.c \ uart_bus_pci.c uart_bus_puc.c uart_bus_scc.c \ uart_core.c ${uart_cpu_machine} uart_dbg.c \ ${uart_dev_lpc} uart_dev_ns8250.c uart_dev_quicc.c uart_dev_sab82532.c \ uart_dev_z8530.c \ uart_if.c uart_if.h uart_subr.c uart_tty.c -SRCS+= bus_if.h card_if.h device_if.h isa_if.h ${ofw_bus_if} pci_if.h \ +SRCS+= acpi_if.h bus_if.h card_if.h device_if.h isa_if.h ${ofw_bus_if} \ + pci_if.h \ power_if.h pccarddevs.h serdev_if.h SRCS+= opt_platform.h opt_uart.h .include Index: user/alc/PQ_LAUNDRY/sys/netinet/if_ether.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/netinet/if_ether.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/netinet/if_ether.c (revision 303206) @@ -1,1368 +1,1375 @@ /*- * Copyright (c) 1982, 1986, 1988, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)if_ether.c 8.1 (Berkeley) 6/10/93 */ /* * Ethernet address resolution protocol. * TODO: * add "inuse/lock" bit (or ref. count) along with valid bit */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef INET #include #endif #include #define SIN(s) ((const struct sockaddr_in *)(s)) static struct timeval arp_lastlog; static int arp_curpps; static int arp_maxpps = 1; /* Simple ARP state machine */ enum arp_llinfo_state { ARP_LLINFO_INCOMPLETE = 0, /* No LLE data */ ARP_LLINFO_REACHABLE, /* LLE is valid */ ARP_LLINFO_VERIFY, /* LLE is valid, need refresh */ ARP_LLINFO_DELETED, /* LLE is deleted */ }; SYSCTL_DECL(_net_link_ether); static SYSCTL_NODE(_net_link_ether, PF_INET, inet, CTLFLAG_RW, 0, ""); static SYSCTL_NODE(_net_link_ether, PF_ARP, arp, CTLFLAG_RW, 0, ""); /* timer values */ static VNET_DEFINE(int, arpt_keep) = (20*60); /* once resolved, good for 20 * minutes */ static VNET_DEFINE(int, arp_maxtries) = 5; static VNET_DEFINE(int, arp_proxyall) = 0; static VNET_DEFINE(int, arpt_down) = 20; /* keep incomplete entries for * 20 seconds */ static VNET_DEFINE(int, arpt_rexmit) = 1; /* retransmit arp entries, sec*/ VNET_PCPUSTAT_DEFINE(struct arpstat, arpstat); /* ARP statistics, see if_arp.h */ VNET_PCPUSTAT_SYSINIT(arpstat); #ifdef VIMAGE VNET_PCPUSTAT_SYSUNINIT(arpstat); #endif /* VIMAGE */ static VNET_DEFINE(int, arp_maxhold) = 1; #define V_arpt_keep VNET(arpt_keep) #define V_arpt_down VNET(arpt_down) #define V_arpt_rexmit VNET(arpt_rexmit) #define V_arp_maxtries VNET(arp_maxtries) #define V_arp_proxyall VNET(arp_proxyall) #define V_arp_maxhold VNET(arp_maxhold) SYSCTL_INT(_net_link_ether_inet, OID_AUTO, max_age, CTLFLAG_VNET | CTLFLAG_RW, &VNET_NAME(arpt_keep), 0, "ARP entry lifetime in seconds"); SYSCTL_INT(_net_link_ether_inet, OID_AUTO, maxtries, CTLFLAG_VNET | CTLFLAG_RW, &VNET_NAME(arp_maxtries), 0, "ARP resolution attempts before returning error"); SYSCTL_INT(_net_link_ether_inet, OID_AUTO, proxyall, CTLFLAG_VNET | CTLFLAG_RW, &VNET_NAME(arp_proxyall), 0, "Enable proxy ARP for all suitable requests"); SYSCTL_INT(_net_link_ether_inet, OID_AUTO, wait, CTLFLAG_VNET | CTLFLAG_RW, &VNET_NAME(arpt_down), 0, "Incomplete ARP entry lifetime in seconds"); SYSCTL_VNET_PCPUSTAT(_net_link_ether_arp, OID_AUTO, stats, struct arpstat, arpstat, "ARP statistics (struct arpstat, net/if_arp.h)"); SYSCTL_INT(_net_link_ether_inet, OID_AUTO, maxhold, CTLFLAG_VNET | CTLFLAG_RW, &VNET_NAME(arp_maxhold), 0, "Number of packets to hold per ARP entry"); SYSCTL_INT(_net_link_ether_inet, OID_AUTO, max_log_per_second, CTLFLAG_RW, &arp_maxpps, 0, "Maximum number of remotely triggered ARP messages that can be " "logged per second"); #define ARP_LOG(pri, ...) do { \ if (ppsratecheck(&arp_lastlog, &arp_curpps, arp_maxpps)) \ log((pri), "arp: " __VA_ARGS__); \ } while (0) static void arpintr(struct mbuf *); static void arptimer(void *); #ifdef INET static void in_arpinput(struct mbuf *); #endif static void arp_check_update_lle(struct arphdr *ah, struct in_addr isaddr, struct ifnet *ifp, int bridged, struct llentry *la); static void arp_mark_lle_reachable(struct llentry *la); static void arp_iflladdr(void *arg __unused, struct ifnet *ifp); static eventhandler_tag iflladdr_tag; static const struct netisr_handler arp_nh = { .nh_name = "arp", .nh_handler = arpintr, .nh_proto = NETISR_ARP, .nh_policy = NETISR_POLICY_SOURCE, }; /* * Timeout routine. Age arp_tab entries periodically. */ static void arptimer(void *arg) { struct llentry *lle = (struct llentry *)arg; struct ifnet *ifp; int r_skip_req; if (lle->la_flags & LLE_STATIC) { return; } LLE_WLOCK(lle); if (callout_pending(&lle->lle_timer)) { /* * Here we are a bit odd here in the treatment of * active/pending. If the pending bit is set, it got * rescheduled before I ran. The active * bit we ignore, since if it was stopped * in ll_tablefree() and was currently running * it would have return 0 so the code would * not have deleted it since the callout could * not be stopped so we want to go through * with the delete here now. If the callout * was restarted, the pending bit will be back on and * we just want to bail since the callout_reset would * return 1 and our reference would have been removed * by arpresolve() below. */ LLE_WUNLOCK(lle); return; } ifp = lle->lle_tbl->llt_ifp; CURVNET_SET(ifp->if_vnet); switch (lle->ln_state) { case ARP_LLINFO_REACHABLE: /* * Expiration time is approaching. * Let's try to refresh entry if it is still * in use. * * Set r_skip_req to get feedback from * fast path. Change state and re-schedule * ourselves. */ LLE_REQ_LOCK(lle); lle->r_skip_req = 1; LLE_REQ_UNLOCK(lle); lle->ln_state = ARP_LLINFO_VERIFY; callout_schedule(&lle->lle_timer, hz * V_arpt_rexmit); LLE_WUNLOCK(lle); CURVNET_RESTORE(); return; case ARP_LLINFO_VERIFY: LLE_REQ_LOCK(lle); r_skip_req = lle->r_skip_req; LLE_REQ_UNLOCK(lle); if (r_skip_req == 0 && lle->la_preempt > 0) { /* Entry was used, issue refresh request */ struct in_addr dst; dst = lle->r_l3addr.addr4; lle->la_preempt--; callout_schedule(&lle->lle_timer, hz * V_arpt_rexmit); LLE_WUNLOCK(lle); arprequest(ifp, NULL, &dst, NULL); CURVNET_RESTORE(); return; } /* Nothing happened. Reschedule if not too late */ if (lle->la_expire > time_uptime) { callout_schedule(&lle->lle_timer, hz * V_arpt_rexmit); LLE_WUNLOCK(lle); CURVNET_RESTORE(); return; } break; case ARP_LLINFO_INCOMPLETE: case ARP_LLINFO_DELETED: break; } if ((lle->la_flags & LLE_DELETED) == 0) { int evt; if (lle->la_flags & LLE_VALID) evt = LLENTRY_EXPIRED; else evt = LLENTRY_TIMEDOUT; EVENTHANDLER_INVOKE(lle_event, lle, evt); } callout_stop(&lle->lle_timer); /* XXX: LOR avoidance. We still have ref on lle. */ LLE_WUNLOCK(lle); IF_AFDATA_LOCK(ifp); LLE_WLOCK(lle); /* Guard against race with other llentry_free(). */ if (lle->la_flags & LLE_LINKED) { LLE_REMREF(lle); lltable_unlink_entry(lle->lle_tbl, lle); } IF_AFDATA_UNLOCK(ifp); size_t pkts_dropped = llentry_free(lle); ARPSTAT_ADD(dropped, pkts_dropped); ARPSTAT_INC(timeouts); CURVNET_RESTORE(); } /* * Stores link-layer header for @ifp in format suitable for if_output() * into buffer @buf. Resulting header length is stored in @bufsize. * * Returns 0 on success. */ static int arp_fillheader(struct ifnet *ifp, struct arphdr *ah, int bcast, u_char *buf, size_t *bufsize) { struct if_encap_req ereq; int error; bzero(buf, *bufsize); bzero(&ereq, sizeof(ereq)); ereq.buf = buf; ereq.bufsize = *bufsize; ereq.rtype = IFENCAP_LL; ereq.family = AF_ARP; ereq.lladdr = ar_tha(ah); ereq.hdata = (u_char *)ah; if (bcast) ereq.flags = IFENCAP_FLAG_BROADCAST; error = ifp->if_requestencap(ifp, &ereq); if (error == 0) *bufsize = ereq.bufsize; return (error); } /* * Broadcast an ARP request. Caller specifies: * - arp header source ip address * - arp header target ip address * - arp header source ethernet address */ void arprequest(struct ifnet *ifp, const struct in_addr *sip, const struct in_addr *tip, u_char *enaddr) { struct mbuf *m; struct arphdr *ah; struct sockaddr sa; u_char *carpaddr = NULL; uint8_t linkhdr[LLE_MAX_LINKHDR]; size_t linkhdrsize; struct route ro; int error; if (sip == NULL) { /* * The caller did not supply a source address, try to find * a compatible one among those assigned to this interface. */ struct ifaddr *ifa; IF_ADDR_RLOCK(ifp); TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family != AF_INET) continue; if (ifa->ifa_carp) { if ((*carp_iamatch_p)(ifa, &carpaddr) == 0) continue; sip = &IA_SIN(ifa)->sin_addr; } else { carpaddr = NULL; sip = &IA_SIN(ifa)->sin_addr; } if (0 == ((sip->s_addr ^ tip->s_addr) & IA_MASKSIN(ifa)->sin_addr.s_addr)) break; /* found it. */ } IF_ADDR_RUNLOCK(ifp); if (sip == NULL) { printf("%s: cannot find matching address\n", __func__); return; } } if (enaddr == NULL) enaddr = carpaddr ? carpaddr : (u_char *)IF_LLADDR(ifp); if ((m = m_gethdr(M_NOWAIT, MT_DATA)) == NULL) return; m->m_len = sizeof(*ah) + 2 * sizeof(struct in_addr) + 2 * ifp->if_addrlen; m->m_pkthdr.len = m->m_len; M_ALIGN(m, m->m_len); ah = mtod(m, struct arphdr *); bzero((caddr_t)ah, m->m_len); #ifdef MAC mac_netinet_arp_send(ifp, m); #endif ah->ar_pro = htons(ETHERTYPE_IP); ah->ar_hln = ifp->if_addrlen; /* hardware address length */ ah->ar_pln = sizeof(struct in_addr); /* protocol address length */ ah->ar_op = htons(ARPOP_REQUEST); bcopy(enaddr, ar_sha(ah), ah->ar_hln); bcopy(sip, ar_spa(ah), ah->ar_pln); bcopy(tip, ar_tpa(ah), ah->ar_pln); sa.sa_family = AF_ARP; sa.sa_len = 2; /* Calculate link header for sending frame */ bzero(&ro, sizeof(ro)); linkhdrsize = sizeof(linkhdr); error = arp_fillheader(ifp, ah, 1, linkhdr, &linkhdrsize); if (error != 0 && error != EAFNOSUPPORT) { ARP_LOG(LOG_ERR, "Failed to calculate ARP header on %s: %d\n", if_name(ifp), error); return; } ro.ro_prepend = linkhdr; ro.ro_plen = linkhdrsize; ro.ro_flags = 0; m->m_flags |= M_BCAST; m_clrprotoflags(m); /* Avoid confusing lower layers. */ (*ifp->if_output)(ifp, m, &sa, &ro); ARPSTAT_INC(txrequests); } /* * Resolve an IP address into an ethernet address - heavy version. * Used internally by arpresolve(). * We have already checked than we can't use existing lle without * modification so we have to acquire LLE_EXCLUSIVE lle lock. * * On success, desten and flags are filled in and the function returns 0; * If the packet must be held pending resolution, we return EWOULDBLOCK * On other errors, we return the corresponding error code. * Note that m_freem() handles NULL. */ static int arpresolve_full(struct ifnet *ifp, int is_gw, int flags, struct mbuf *m, const struct sockaddr *dst, u_char *desten, uint32_t *pflags, struct llentry **plle) { struct llentry *la = NULL, *la_tmp; struct mbuf *curr = NULL; struct mbuf *next = NULL; int error, renew; char *lladdr; int ll_len; if (pflags != NULL) *pflags = 0; if (plle != NULL) *plle = NULL; if ((flags & LLE_CREATE) == 0) { IF_AFDATA_RLOCK(ifp); la = lla_lookup(LLTABLE(ifp), LLE_EXCLUSIVE, dst); IF_AFDATA_RUNLOCK(ifp); } if (la == NULL && (ifp->if_flags & (IFF_NOARP | IFF_STATICARP)) == 0) { la = lltable_alloc_entry(LLTABLE(ifp), 0, dst); if (la == NULL) { log(LOG_DEBUG, "arpresolve: can't allocate llinfo for %s on %s\n", inet_ntoa(SIN(dst)->sin_addr), if_name(ifp)); m_freem(m); return (EINVAL); } IF_AFDATA_WLOCK(ifp); LLE_WLOCK(la); la_tmp = lla_lookup(LLTABLE(ifp), LLE_EXCLUSIVE, dst); /* Prefer ANY existing lle over newly-created one */ if (la_tmp == NULL) lltable_link_entry(LLTABLE(ifp), la); IF_AFDATA_WUNLOCK(ifp); if (la_tmp != NULL) { lltable_free_entry(LLTABLE(ifp), la); la = la_tmp; } } if (la == NULL) { m_freem(m); return (EINVAL); } if ((la->la_flags & LLE_VALID) && ((la->la_flags & LLE_STATIC) || la->la_expire > time_uptime)) { if (flags & LLE_ADDRONLY) { lladdr = la->ll_addr; ll_len = ifp->if_addrlen; } else { lladdr = la->r_linkdata; ll_len = la->r_hdrlen; } bcopy(lladdr, desten, ll_len); /* Check if we have feedback request from arptimer() */ if (la->r_skip_req != 0) { LLE_REQ_LOCK(la); la->r_skip_req = 0; /* Notify that entry was used */ LLE_REQ_UNLOCK(la); } if (pflags != NULL) *pflags = la->la_flags & (LLE_VALID|LLE_IFADDR); if (plle) { LLE_ADDREF(la); *plle = la; } LLE_WUNLOCK(la); return (0); } renew = (la->la_asked == 0 || la->la_expire != time_uptime); /* * There is an arptab entry, but no ethernet address * response yet. Add the mbuf to the list, dropping * the oldest packet if we have exceeded the system * setting. */ if (m != NULL) { if (la->la_numheld >= V_arp_maxhold) { if (la->la_hold != NULL) { next = la->la_hold->m_nextpkt; m_freem(la->la_hold); la->la_hold = next; la->la_numheld--; ARPSTAT_INC(dropped); } } if (la->la_hold != NULL) { curr = la->la_hold; while (curr->m_nextpkt != NULL) curr = curr->m_nextpkt; curr->m_nextpkt = m; } else la->la_hold = m; la->la_numheld++; } /* * Return EWOULDBLOCK if we have tried less than arp_maxtries. It * will be masked by ether_output(). Return EHOSTDOWN/EHOSTUNREACH * if we have already sent arp_maxtries ARP requests. Retransmit the * ARP request, but not faster than one request per second. */ if (la->la_asked < V_arp_maxtries) error = EWOULDBLOCK; /* First request. */ else error = is_gw != 0 ? EHOSTUNREACH : EHOSTDOWN; if (renew) { int canceled; LLE_ADDREF(la); la->la_expire = time_uptime; canceled = callout_reset(&la->lle_timer, hz * V_arpt_down, arptimer, la); if (canceled) LLE_REMREF(la); la->la_asked++; LLE_WUNLOCK(la); arprequest(ifp, NULL, &SIN(dst)->sin_addr, NULL); return (error); } LLE_WUNLOCK(la); return (error); } /* * Resolve an IP address into an ethernet address. */ int arpresolve_addr(struct ifnet *ifp, int flags, const struct sockaddr *dst, char *desten, uint32_t *pflags, struct llentry **plle) { int error; flags |= LLE_ADDRONLY; error = arpresolve_full(ifp, 0, flags, NULL, dst, desten, pflags, plle); return (error); } /* * Lookups link header based on an IP address. * On input: * ifp is the interface we use * is_gw != 0 if @dst represents gateway to some destination * m is the mbuf. May be NULL if we don't have a packet. * dst is the next hop, * desten is the storage to put LL header. * flags returns subset of lle flags: LLE_VALID | LLE_IFADDR * * On success, full/partial link header and flags are filled in and * the function returns 0. * If the packet must be held pending resolution, we return EWOULDBLOCK * On other errors, we return the corresponding error code. * Note that m_freem() handles NULL. */ int arpresolve(struct ifnet *ifp, int is_gw, struct mbuf *m, const struct sockaddr *dst, u_char *desten, uint32_t *pflags, struct llentry **plle) { struct llentry *la = NULL; if (pflags != NULL) *pflags = 0; if (plle != NULL) *plle = NULL; if (m != NULL) { if (m->m_flags & M_BCAST) { /* broadcast */ (void)memcpy(desten, ifp->if_broadcastaddr, ifp->if_addrlen); return (0); } if (m->m_flags & M_MCAST) { /* multicast */ ETHER_MAP_IP_MULTICAST(&SIN(dst)->sin_addr, desten); return (0); } } IF_AFDATA_RLOCK(ifp); - la = lla_lookup(LLTABLE(ifp), LLE_UNLOCKED, dst); + la = lla_lookup(LLTABLE(ifp), plle ? LLE_EXCLUSIVE : LLE_UNLOCKED, dst); if (la != NULL && (la->r_flags & RLLE_VALID) != 0) { /* Entry found, let's copy lle info */ bcopy(la->r_linkdata, desten, la->r_hdrlen); if (pflags != NULL) *pflags = LLE_VALID | (la->r_flags & RLLE_IFADDR); /* Check if we have feedback request from arptimer() */ if (la->r_skip_req != 0) { LLE_REQ_LOCK(la); la->r_skip_req = 0; /* Notify that entry was used */ LLE_REQ_UNLOCK(la); } + if (plle) { + LLE_ADDREF(la); + *plle = la; + LLE_WUNLOCK(la); + } IF_AFDATA_RUNLOCK(ifp); return (0); } + if (plle && la) + LLE_WUNLOCK(la); IF_AFDATA_RUNLOCK(ifp); return (arpresolve_full(ifp, is_gw, la == NULL ? LLE_CREATE : 0, m, dst, desten, pflags, plle)); } /* * Common length and type checks are done here, * then the protocol-specific routine is called. */ static void arpintr(struct mbuf *m) { struct arphdr *ar; struct ifnet *ifp; char *layer; int hlen; ifp = m->m_pkthdr.rcvif; if (m->m_len < sizeof(struct arphdr) && ((m = m_pullup(m, sizeof(struct arphdr))) == NULL)) { ARP_LOG(LOG_NOTICE, "packet with short header received on %s\n", if_name(ifp)); return; } ar = mtod(m, struct arphdr *); /* Check if length is sufficient */ if (m->m_len < arphdr_len(ar)) { m = m_pullup(m, arphdr_len(ar)); if (m == NULL) { ARP_LOG(LOG_NOTICE, "short packet received on %s\n", if_name(ifp)); return; } ar = mtod(m, struct arphdr *); } hlen = 0; layer = ""; switch (ntohs(ar->ar_hrd)) { case ARPHRD_ETHER: hlen = ETHER_ADDR_LEN; /* RFC 826 */ layer = "ethernet"; break; case ARPHRD_IEEE802: hlen = 6; /* RFC 1390, FDDI_ADDR_LEN */ layer = "fddi"; break; case ARPHRD_ARCNET: hlen = 1; /* RFC 1201, ARC_ADDR_LEN */ layer = "arcnet"; break; case ARPHRD_INFINIBAND: hlen = 20; /* RFC 4391, INFINIBAND_ALEN */ layer = "infiniband"; break; case ARPHRD_IEEE1394: hlen = 0; /* SHALL be 16 */ /* RFC 2734 */ layer = "firewire"; /* * Restrict too long hardware addresses. * Currently we are capable of handling 20-byte * addresses ( sizeof(lle->ll_addr) ) */ if (ar->ar_hln >= 20) hlen = 16; break; default: ARP_LOG(LOG_NOTICE, "packet with unknown hardware format 0x%02d received on " "%s\n", ntohs(ar->ar_hrd), if_name(ifp)); m_freem(m); return; } if (hlen != 0 && hlen != ar->ar_hln) { ARP_LOG(LOG_NOTICE, "packet with invalid %s address length %d received on %s\n", layer, ar->ar_hln, if_name(ifp)); m_freem(m); return; } ARPSTAT_INC(received); switch (ntohs(ar->ar_pro)) { #ifdef INET case ETHERTYPE_IP: in_arpinput(m); return; #endif } m_freem(m); } #ifdef INET /* * ARP for Internet protocols on 10 Mb/s Ethernet. * Algorithm is that given in RFC 826. * In addition, a sanity check is performed on the sender * protocol address, to catch impersonators. * We no longer handle negotiations for use of trailer protocol: * Formerly, ARP replied for protocol type ETHERTYPE_TRAIL sent * along with IP replies if we wanted trailers sent to us, * and also sent them in response to IP replies. * This allowed either end to announce the desire to receive * trailer packets. * We no longer reply to requests for ETHERTYPE_TRAIL protocol either, * but formerly didn't normally send requests. */ static int log_arp_wrong_iface = 1; static int log_arp_movements = 1; static int log_arp_permanent_modify = 1; static int allow_multicast = 0; SYSCTL_INT(_net_link_ether_inet, OID_AUTO, log_arp_wrong_iface, CTLFLAG_RW, &log_arp_wrong_iface, 0, "log arp packets arriving on the wrong interface"); SYSCTL_INT(_net_link_ether_inet, OID_AUTO, log_arp_movements, CTLFLAG_RW, &log_arp_movements, 0, "log arp replies from MACs different than the one in the cache"); SYSCTL_INT(_net_link_ether_inet, OID_AUTO, log_arp_permanent_modify, CTLFLAG_RW, &log_arp_permanent_modify, 0, "log arp replies from MACs different than the one in the permanent arp entry"); SYSCTL_INT(_net_link_ether_inet, OID_AUTO, allow_multicast, CTLFLAG_RW, &allow_multicast, 0, "accept multicast addresses"); static void in_arpinput(struct mbuf *m) { struct rm_priotracker in_ifa_tracker; struct arphdr *ah; struct ifnet *ifp = m->m_pkthdr.rcvif; struct llentry *la = NULL, *la_tmp; struct ifaddr *ifa; struct in_ifaddr *ia; struct sockaddr sa; struct in_addr isaddr, itaddr, myaddr; u_int8_t *enaddr = NULL; int op; int bridged = 0, is_bridge = 0; int carped; struct sockaddr_in sin; struct sockaddr *dst; struct nhop4_basic nh4; uint8_t linkhdr[LLE_MAX_LINKHDR]; struct route ro; size_t linkhdrsize; int lladdr_off; int error; sin.sin_len = sizeof(struct sockaddr_in); sin.sin_family = AF_INET; sin.sin_addr.s_addr = 0; if (ifp->if_bridge) bridged = 1; if (ifp->if_type == IFT_BRIDGE) is_bridge = 1; /* * We already have checked that mbuf contains enough contiguous data * to hold entire arp message according to the arp header. */ ah = mtod(m, struct arphdr *); /* * ARP is only for IPv4 so we can reject packets with * a protocol length not equal to an IPv4 address. */ if (ah->ar_pln != sizeof(struct in_addr)) { ARP_LOG(LOG_NOTICE, "requested protocol length != %zu\n", sizeof(struct in_addr)); goto drop; } if (allow_multicast == 0 && ETHER_IS_MULTICAST(ar_sha(ah))) { ARP_LOG(LOG_NOTICE, "%*D is multicast\n", ifp->if_addrlen, (u_char *)ar_sha(ah), ":"); goto drop; } op = ntohs(ah->ar_op); (void)memcpy(&isaddr, ar_spa(ah), sizeof (isaddr)); (void)memcpy(&itaddr, ar_tpa(ah), sizeof (itaddr)); if (op == ARPOP_REPLY) ARPSTAT_INC(rxreplies); /* * For a bridge, we want to check the address irrespective * of the receive interface. (This will change slightly * when we have clusters of interfaces). */ IN_IFADDR_RLOCK(&in_ifa_tracker); LIST_FOREACH(ia, INADDR_HASH(itaddr.s_addr), ia_hash) { if (((bridged && ia->ia_ifp->if_bridge == ifp->if_bridge) || ia->ia_ifp == ifp) && itaddr.s_addr == ia->ia_addr.sin_addr.s_addr && (ia->ia_ifa.ifa_carp == NULL || (*carp_iamatch_p)(&ia->ia_ifa, &enaddr))) { ifa_ref(&ia->ia_ifa); IN_IFADDR_RUNLOCK(&in_ifa_tracker); goto match; } } LIST_FOREACH(ia, INADDR_HASH(isaddr.s_addr), ia_hash) if (((bridged && ia->ia_ifp->if_bridge == ifp->if_bridge) || ia->ia_ifp == ifp) && isaddr.s_addr == ia->ia_addr.sin_addr.s_addr) { ifa_ref(&ia->ia_ifa); IN_IFADDR_RUNLOCK(&in_ifa_tracker); goto match; } #define BDG_MEMBER_MATCHES_ARP(addr, ifp, ia) \ (ia->ia_ifp->if_bridge == ifp->if_softc && \ !bcmp(IF_LLADDR(ia->ia_ifp), IF_LLADDR(ifp), ifp->if_addrlen) && \ addr == ia->ia_addr.sin_addr.s_addr) /* * Check the case when bridge shares its MAC address with * some of its children, so packets are claimed by bridge * itself (bridge_input() does it first), but they are really * meant to be destined to the bridge member. */ if (is_bridge) { LIST_FOREACH(ia, INADDR_HASH(itaddr.s_addr), ia_hash) { if (BDG_MEMBER_MATCHES_ARP(itaddr.s_addr, ifp, ia)) { ifa_ref(&ia->ia_ifa); ifp = ia->ia_ifp; IN_IFADDR_RUNLOCK(&in_ifa_tracker); goto match; } } } #undef BDG_MEMBER_MATCHES_ARP IN_IFADDR_RUNLOCK(&in_ifa_tracker); /* * No match, use the first inet address on the receive interface * as a dummy address for the rest of the function. */ IF_ADDR_RLOCK(ifp); TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) if (ifa->ifa_addr->sa_family == AF_INET && (ifa->ifa_carp == NULL || (*carp_iamatch_p)(ifa, &enaddr))) { ia = ifatoia(ifa); ifa_ref(ifa); IF_ADDR_RUNLOCK(ifp); goto match; } IF_ADDR_RUNLOCK(ifp); /* * If bridging, fall back to using any inet address. */ IN_IFADDR_RLOCK(&in_ifa_tracker); if (!bridged || (ia = TAILQ_FIRST(&V_in_ifaddrhead)) == NULL) { IN_IFADDR_RUNLOCK(&in_ifa_tracker); goto drop; } ifa_ref(&ia->ia_ifa); IN_IFADDR_RUNLOCK(&in_ifa_tracker); match: if (!enaddr) enaddr = (u_int8_t *)IF_LLADDR(ifp); carped = (ia->ia_ifa.ifa_carp != NULL); myaddr = ia->ia_addr.sin_addr; ifa_free(&ia->ia_ifa); if (!bcmp(ar_sha(ah), enaddr, ifp->if_addrlen)) goto drop; /* it's from me, ignore it. */ if (!bcmp(ar_sha(ah), ifp->if_broadcastaddr, ifp->if_addrlen)) { ARP_LOG(LOG_NOTICE, "link address is broadcast for IP address " "%s!\n", inet_ntoa(isaddr)); goto drop; } if (ifp->if_addrlen != ah->ar_hln) { ARP_LOG(LOG_WARNING, "from %*D: addr len: new %d, " "i/f %d (ignored)\n", ifp->if_addrlen, (u_char *) ar_sha(ah), ":", ah->ar_hln, ifp->if_addrlen); goto drop; } /* * Warn if another host is using the same IP address, but only if the * IP address isn't 0.0.0.0, which is used for DHCP only, in which * case we suppress the warning to avoid false positive complaints of * potential misconfiguration. */ if (!bridged && !carped && isaddr.s_addr == myaddr.s_addr && myaddr.s_addr != 0) { ARP_LOG(LOG_ERR, "%*D is using my IP address %s on %s!\n", ifp->if_addrlen, (u_char *)ar_sha(ah), ":", inet_ntoa(isaddr), ifp->if_xname); itaddr = myaddr; ARPSTAT_INC(dupips); goto reply; } if (ifp->if_flags & IFF_STATICARP) goto reply; bzero(&sin, sizeof(sin)); sin.sin_len = sizeof(struct sockaddr_in); sin.sin_family = AF_INET; sin.sin_addr = isaddr; dst = (struct sockaddr *)&sin; IF_AFDATA_RLOCK(ifp); la = lla_lookup(LLTABLE(ifp), LLE_EXCLUSIVE, dst); IF_AFDATA_RUNLOCK(ifp); if (la != NULL) arp_check_update_lle(ah, isaddr, ifp, bridged, la); else if (itaddr.s_addr == myaddr.s_addr) { /* * Request/reply to our address, but no lle exists yet. * Calculate full link prepend to use in lle. */ linkhdrsize = sizeof(linkhdr); if (lltable_calc_llheader(ifp, AF_INET, ar_sha(ah), linkhdr, &linkhdrsize, &lladdr_off) != 0) goto reply; /* Allocate new entry */ la = lltable_alloc_entry(LLTABLE(ifp), 0, dst); if (la == NULL) { /* * lle creation may fail if source address belongs * to non-directly connected subnet. However, we * will try to answer the request instead of dropping * frame. */ goto reply; } lltable_set_entry_addr(ifp, la, linkhdr, linkhdrsize, lladdr_off); IF_AFDATA_WLOCK(ifp); LLE_WLOCK(la); la_tmp = lla_lookup(LLTABLE(ifp), LLE_EXCLUSIVE, dst); /* * Check if lle still does not exists. * If it does, that means that we either * 1) have configured it explicitly, via * 1a) 'arp -s' static entry or * 1b) interface address static record * or * 2) it was the result of sending first packet to-host * or * 3) it was another arp reply packet we handled in * different thread. * * In all cases except 3) we definitely need to prefer * existing lle. For the sake of simplicity, prefer any * existing lle over newly-create one. */ if (la_tmp == NULL) lltable_link_entry(LLTABLE(ifp), la); IF_AFDATA_WUNLOCK(ifp); if (la_tmp == NULL) { arp_mark_lle_reachable(la); LLE_WUNLOCK(la); } else { /* Free newly-create entry and handle packet */ lltable_free_entry(LLTABLE(ifp), la); la = la_tmp; la_tmp = NULL; arp_check_update_lle(ah, isaddr, ifp, bridged, la); /* arp_check_update_lle() returns @la unlocked */ } la = NULL; } reply: if (op != ARPOP_REQUEST) goto drop; ARPSTAT_INC(rxrequests); if (itaddr.s_addr == myaddr.s_addr) { /* Shortcut.. the receiving interface is the target. */ (void)memcpy(ar_tha(ah), ar_sha(ah), ah->ar_hln); (void)memcpy(ar_sha(ah), enaddr, ah->ar_hln); } else { struct llentry *lle = NULL; sin.sin_addr = itaddr; IF_AFDATA_RLOCK(ifp); lle = lla_lookup(LLTABLE(ifp), 0, (struct sockaddr *)&sin); IF_AFDATA_RUNLOCK(ifp); if ((lle != NULL) && (lle->la_flags & LLE_PUB)) { (void)memcpy(ar_tha(ah), ar_sha(ah), ah->ar_hln); (void)memcpy(ar_sha(ah), lle->ll_addr, ah->ar_hln); LLE_RUNLOCK(lle); } else { if (lle != NULL) LLE_RUNLOCK(lle); if (!V_arp_proxyall) goto drop; /* XXX MRT use table 0 for arp reply */ if (fib4_lookup_nh_basic(0, itaddr, 0, 0, &nh4) != 0) goto drop; /* * Don't send proxies for nodes on the same interface * as this one came out of, or we'll get into a fight * over who claims what Ether address. */ if (nh4.nh_ifp == ifp) goto drop; (void)memcpy(ar_tha(ah), ar_sha(ah), ah->ar_hln); (void)memcpy(ar_sha(ah), enaddr, ah->ar_hln); /* * Also check that the node which sent the ARP packet * is on the interface we expect it to be on. This * avoids ARP chaos if an interface is connected to the * wrong network. */ /* XXX MRT use table 0 for arp checks */ if (fib4_lookup_nh_basic(0, isaddr, 0, 0, &nh4) != 0) goto drop; if (nh4.nh_ifp != ifp) { ARP_LOG(LOG_INFO, "proxy: ignoring request" " from %s via %s\n", inet_ntoa(isaddr), ifp->if_xname); goto drop; } #ifdef DEBUG_PROXY printf("arp: proxying for %s\n", inet_ntoa(itaddr)); #endif } } if (itaddr.s_addr == myaddr.s_addr && IN_LINKLOCAL(ntohl(itaddr.s_addr))) { /* RFC 3927 link-local IPv4; always reply by broadcast. */ #ifdef DEBUG_LINKLOCAL printf("arp: sending reply for link-local addr %s\n", inet_ntoa(itaddr)); #endif m->m_flags |= M_BCAST; m->m_flags &= ~M_MCAST; } else { /* default behaviour; never reply by broadcast. */ m->m_flags &= ~(M_BCAST|M_MCAST); } (void)memcpy(ar_tpa(ah), ar_spa(ah), ah->ar_pln); (void)memcpy(ar_spa(ah), &itaddr, ah->ar_pln); ah->ar_op = htons(ARPOP_REPLY); ah->ar_pro = htons(ETHERTYPE_IP); /* let's be sure! */ m->m_len = sizeof(*ah) + (2 * ah->ar_pln) + (2 * ah->ar_hln); m->m_pkthdr.len = m->m_len; m->m_pkthdr.rcvif = NULL; sa.sa_family = AF_ARP; sa.sa_len = 2; /* Calculate link header for sending frame */ bzero(&ro, sizeof(ro)); linkhdrsize = sizeof(linkhdr); error = arp_fillheader(ifp, ah, 0, linkhdr, &linkhdrsize); /* * arp_fillheader() may fail due to lack of support inside encap request * routing. This is not necessary an error, AF_ARP can/should be handled * by if_output(). */ if (error != 0 && error != EAFNOSUPPORT) { ARP_LOG(LOG_ERR, "Failed to calculate ARP header on %s: %d\n", if_name(ifp), error); return; } ro.ro_prepend = linkhdr; ro.ro_plen = linkhdrsize; ro.ro_flags = 0; m_clrprotoflags(m); /* Avoid confusing lower layers. */ (*ifp->if_output)(ifp, m, &sa, &ro); ARPSTAT_INC(txreplies); return; drop: m_freem(m); } #endif /* * Checks received arp data against existing @la. * Updates lle state/performs notification if necessary. */ static void arp_check_update_lle(struct arphdr *ah, struct in_addr isaddr, struct ifnet *ifp, int bridged, struct llentry *la) { struct sockaddr sa; struct mbuf *m_hold, *m_hold_next; uint8_t linkhdr[LLE_MAX_LINKHDR]; size_t linkhdrsize; int lladdr_off; LLE_WLOCK_ASSERT(la); /* the following is not an error when doing bridging */ if (!bridged && la->lle_tbl->llt_ifp != ifp) { if (log_arp_wrong_iface) ARP_LOG(LOG_WARNING, "%s is on %s " "but got reply from %*D on %s\n", inet_ntoa(isaddr), la->lle_tbl->llt_ifp->if_xname, ifp->if_addrlen, (u_char *)ar_sha(ah), ":", ifp->if_xname); LLE_WUNLOCK(la); return; } if ((la->la_flags & LLE_VALID) && bcmp(ar_sha(ah), la->ll_addr, ifp->if_addrlen)) { if (la->la_flags & LLE_STATIC) { LLE_WUNLOCK(la); if (log_arp_permanent_modify) ARP_LOG(LOG_ERR, "%*D attempts to modify " "permanent entry for %s on %s\n", ifp->if_addrlen, (u_char *)ar_sha(ah), ":", inet_ntoa(isaddr), ifp->if_xname); return; } if (log_arp_movements) { ARP_LOG(LOG_INFO, "%s moved from %*D " "to %*D on %s\n", inet_ntoa(isaddr), ifp->if_addrlen, (u_char *)&la->ll_addr, ":", ifp->if_addrlen, (u_char *)ar_sha(ah), ":", ifp->if_xname); } } /* Calculate full link prepend to use in lle */ linkhdrsize = sizeof(linkhdr); if (lltable_calc_llheader(ifp, AF_INET, ar_sha(ah), linkhdr, &linkhdrsize, &lladdr_off) != 0) return; /* Check if something has changed */ if (memcmp(la->r_linkdata, linkhdr, linkhdrsize) != 0 || (la->la_flags & LLE_VALID) == 0) { /* Try to perform LLE update */ if (lltable_try_set_entry_addr(ifp, la, linkhdr, linkhdrsize, lladdr_off) == 0) return; /* Clear fast path feedback request if set */ la->r_skip_req = 0; } arp_mark_lle_reachable(la); /* * The packets are all freed within the call to the output * routine. * * NB: The lock MUST be released before the call to the * output routine. */ if (la->la_hold != NULL) { m_hold = la->la_hold; la->la_hold = NULL; la->la_numheld = 0; lltable_fill_sa_entry(la, &sa); LLE_WUNLOCK(la); for (; m_hold != NULL; m_hold = m_hold_next) { m_hold_next = m_hold->m_nextpkt; m_hold->m_nextpkt = NULL; /* Avoid confusing lower layers. */ m_clrprotoflags(m_hold); (*ifp->if_output)(ifp, m_hold, &sa, NULL); } } else LLE_WUNLOCK(la); } static void arp_mark_lle_reachable(struct llentry *la) { int canceled, wtime; LLE_WLOCK_ASSERT(la); la->ln_state = ARP_LLINFO_REACHABLE; EVENTHANDLER_INVOKE(lle_event, la, LLENTRY_RESOLVED); if (!(la->la_flags & LLE_STATIC)) { LLE_ADDREF(la); la->la_expire = time_uptime + V_arpt_keep; wtime = V_arpt_keep - V_arp_maxtries * V_arpt_rexmit; if (wtime < 0) wtime = V_arpt_keep; canceled = callout_reset(&la->lle_timer, hz * wtime, arptimer, la); if (canceled) LLE_REMREF(la); } la->la_asked = 0; la->la_preempt = V_arp_maxtries; } /* * Add pernament link-layer record for given interface address. */ static __noinline void arp_add_ifa_lle(struct ifnet *ifp, const struct sockaddr *dst) { struct llentry *lle, *lle_tmp; /* * Interface address LLE record is considered static * because kernel code relies on LLE_STATIC flag to check * if these entries can be rewriten by arp updates. */ lle = lltable_alloc_entry(LLTABLE(ifp), LLE_IFADDR | LLE_STATIC, dst); if (lle == NULL) { log(LOG_INFO, "arp_ifinit: cannot create arp " "entry for interface address\n"); return; } IF_AFDATA_WLOCK(ifp); LLE_WLOCK(lle); /* Unlink any entry if exists */ lle_tmp = lla_lookup(LLTABLE(ifp), LLE_EXCLUSIVE, dst); if (lle_tmp != NULL) lltable_unlink_entry(LLTABLE(ifp), lle_tmp); lltable_link_entry(LLTABLE(ifp), lle); IF_AFDATA_WUNLOCK(ifp); if (lle_tmp != NULL) EVENTHANDLER_INVOKE(lle_event, lle_tmp, LLENTRY_EXPIRED); EVENTHANDLER_INVOKE(lle_event, lle, LLENTRY_RESOLVED); LLE_WUNLOCK(lle); if (lle_tmp != NULL) lltable_free_entry(LLTABLE(ifp), lle_tmp); } void arp_ifinit(struct ifnet *ifp, struct ifaddr *ifa) { const struct sockaddr_in *dst_in; const struct sockaddr *dst; if (ifa->ifa_carp != NULL) return; dst = ifa->ifa_addr; dst_in = (const struct sockaddr_in *)dst; if (ntohl(dst_in->sin_addr.s_addr) == INADDR_ANY) return; arp_announce_ifaddr(ifp, dst_in->sin_addr, IF_LLADDR(ifp)); arp_add_ifa_lle(ifp, dst); } void arp_announce_ifaddr(struct ifnet *ifp, struct in_addr addr, u_char *enaddr) { if (ntohl(addr.s_addr) != INADDR_ANY) arprequest(ifp, &addr, &addr, enaddr); } /* * Sends gratuitous ARPs for each ifaddr to notify other * nodes about the address change. */ static __noinline void arp_handle_ifllchange(struct ifnet *ifp) { struct ifaddr *ifa; TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family == AF_INET) arp_ifinit(ifp, ifa); } } /* * A handler for interface link layer address change event. */ static void arp_iflladdr(void *arg __unused, struct ifnet *ifp) { lltable_update_ifaddr(LLTABLE(ifp)); if ((ifp->if_flags & IFF_UP) != 0) arp_handle_ifllchange(ifp); } static void vnet_arp_init(void) { if (IS_DEFAULT_VNET(curvnet)) { netisr_register(&arp_nh); iflladdr_tag = EVENTHANDLER_REGISTER(iflladdr_event, arp_iflladdr, NULL, EVENTHANDLER_PRI_ANY); } #ifdef VIMAGE else netisr_register_vnet(&arp_nh); #endif } VNET_SYSINIT(vnet_arp_init, SI_SUB_PROTO_DOMAIN, SI_ORDER_SECOND, vnet_arp_init, 0); #ifdef VIMAGE /* * We have to unregister ARP along with IP otherwise we risk doing INADDR_HASH * lookups after destroying the hash. Ideally this would go on SI_ORDER_3.5. */ static void vnet_arp_destroy(__unused void *arg) { netisr_unregister_vnet(&arp_nh); } VNET_SYSUNINIT(vnet_arp_uninit, SI_SUB_PROTO_DOMAIN, SI_ORDER_THIRD, vnet_arp_destroy, NULL); #endif Index: user/alc/PQ_LAUNDRY/sys/netinet/sctp_output.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/netinet/sctp_output.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/netinet/sctp_output.c (revision 303206) @@ -1,13882 +1,13882 @@ /*- * Copyright (c) 2001-2008, by Cisco Systems, Inc. All rights reserved. * Copyright (c) 2008-2012, by Randall Stewart. All rights reserved. * Copyright (c) 2008-2012, by Michael Tuexen. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * a) Redistributions of source code must retain the above copyright notice, * this list of conditions and the following disclaimer. * * b) Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the distribution. * * c) Neither the name of Cisco Systems, Inc. nor the names of its * contributors may be used to endorse or promote products derived * from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF * THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #if defined(INET) || defined(INET6) #include #endif #include #include #define SCTP_MAX_GAPS_INARRAY 4 struct sack_track { uint8_t right_edge; /* mergable on the right edge */ uint8_t left_edge; /* mergable on the left edge */ uint8_t num_entries; uint8_t spare; struct sctp_gap_ack_block gaps[SCTP_MAX_GAPS_INARRAY]; }; const struct sack_track sack_array[256] = { {0, 0, 0, 0, /* 0x00 */ {{0, 0}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 1, 0, /* 0x01 */ {{0, 0}, {0, 0}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x02 */ {{1, 1}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 1, 0, /* 0x03 */ {{0, 1}, {0, 0}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x04 */ {{2, 2}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x05 */ {{0, 0}, {2, 2}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x06 */ {{1, 2}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 1, 0, /* 0x07 */ {{0, 2}, {0, 0}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x08 */ {{3, 3}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x09 */ {{0, 0}, {3, 3}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x0a */ {{1, 1}, {3, 3}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x0b */ {{0, 1}, {3, 3}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x0c */ {{2, 3}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x0d */ {{0, 0}, {2, 3}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x0e */ {{1, 3}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 1, 0, /* 0x0f */ {{0, 3}, {0, 0}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x10 */ {{4, 4}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x11 */ {{0, 0}, {4, 4}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x12 */ {{1, 1}, {4, 4}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x13 */ {{0, 1}, {4, 4}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x14 */ {{2, 2}, {4, 4}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x15 */ {{0, 0}, {2, 2}, {4, 4}, {0, 0} } }, {0, 0, 2, 0, /* 0x16 */ {{1, 2}, {4, 4}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x17 */ {{0, 2}, {4, 4}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x18 */ {{3, 4}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x19 */ {{0, 0}, {3, 4}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x1a */ {{1, 1}, {3, 4}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x1b */ {{0, 1}, {3, 4}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x1c */ {{2, 4}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x1d */ {{0, 0}, {2, 4}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x1e */ {{1, 4}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 1, 0, /* 0x1f */ {{0, 4}, {0, 0}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x20 */ {{5, 5}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x21 */ {{0, 0}, {5, 5}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x22 */ {{1, 1}, {5, 5}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x23 */ {{0, 1}, {5, 5}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x24 */ {{2, 2}, {5, 5}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x25 */ {{0, 0}, {2, 2}, {5, 5}, {0, 0} } }, {0, 0, 2, 0, /* 0x26 */ {{1, 2}, {5, 5}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x27 */ {{0, 2}, {5, 5}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x28 */ {{3, 3}, {5, 5}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x29 */ {{0, 0}, {3, 3}, {5, 5}, {0, 0} } }, {0, 0, 3, 0, /* 0x2a */ {{1, 1}, {3, 3}, {5, 5}, {0, 0} } }, {1, 0, 3, 0, /* 0x2b */ {{0, 1}, {3, 3}, {5, 5}, {0, 0} } }, {0, 0, 2, 0, /* 0x2c */ {{2, 3}, {5, 5}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x2d */ {{0, 0}, {2, 3}, {5, 5}, {0, 0} } }, {0, 0, 2, 0, /* 0x2e */ {{1, 3}, {5, 5}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x2f */ {{0, 3}, {5, 5}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x30 */ {{4, 5}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x31 */ {{0, 0}, {4, 5}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x32 */ {{1, 1}, {4, 5}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x33 */ {{0, 1}, {4, 5}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x34 */ {{2, 2}, {4, 5}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x35 */ {{0, 0}, {2, 2}, {4, 5}, {0, 0} } }, {0, 0, 2, 0, /* 0x36 */ {{1, 2}, {4, 5}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x37 */ {{0, 2}, {4, 5}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x38 */ {{3, 5}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x39 */ {{0, 0}, {3, 5}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x3a */ {{1, 1}, {3, 5}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x3b */ {{0, 1}, {3, 5}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x3c */ {{2, 5}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x3d */ {{0, 0}, {2, 5}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x3e */ {{1, 5}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 1, 0, /* 0x3f */ {{0, 5}, {0, 0}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x40 */ {{6, 6}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x41 */ {{0, 0}, {6, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x42 */ {{1, 1}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x43 */ {{0, 1}, {6, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x44 */ {{2, 2}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x45 */ {{0, 0}, {2, 2}, {6, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x46 */ {{1, 2}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x47 */ {{0, 2}, {6, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x48 */ {{3, 3}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x49 */ {{0, 0}, {3, 3}, {6, 6}, {0, 0} } }, {0, 0, 3, 0, /* 0x4a */ {{1, 1}, {3, 3}, {6, 6}, {0, 0} } }, {1, 0, 3, 0, /* 0x4b */ {{0, 1}, {3, 3}, {6, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x4c */ {{2, 3}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x4d */ {{0, 0}, {2, 3}, {6, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x4e */ {{1, 3}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x4f */ {{0, 3}, {6, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x50 */ {{4, 4}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x51 */ {{0, 0}, {4, 4}, {6, 6}, {0, 0} } }, {0, 0, 3, 0, /* 0x52 */ {{1, 1}, {4, 4}, {6, 6}, {0, 0} } }, {1, 0, 3, 0, /* 0x53 */ {{0, 1}, {4, 4}, {6, 6}, {0, 0} } }, {0, 0, 3, 0, /* 0x54 */ {{2, 2}, {4, 4}, {6, 6}, {0, 0} } }, {1, 0, 4, 0, /* 0x55 */ {{0, 0}, {2, 2}, {4, 4}, {6, 6} } }, {0, 0, 3, 0, /* 0x56 */ {{1, 2}, {4, 4}, {6, 6}, {0, 0} } }, {1, 0, 3, 0, /* 0x57 */ {{0, 2}, {4, 4}, {6, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x58 */ {{3, 4}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x59 */ {{0, 0}, {3, 4}, {6, 6}, {0, 0} } }, {0, 0, 3, 0, /* 0x5a */ {{1, 1}, {3, 4}, {6, 6}, {0, 0} } }, {1, 0, 3, 0, /* 0x5b */ {{0, 1}, {3, 4}, {6, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x5c */ {{2, 4}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x5d */ {{0, 0}, {2, 4}, {6, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x5e */ {{1, 4}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x5f */ {{0, 4}, {6, 6}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x60 */ {{5, 6}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x61 */ {{0, 0}, {5, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x62 */ {{1, 1}, {5, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x63 */ {{0, 1}, {5, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x64 */ {{2, 2}, {5, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x65 */ {{0, 0}, {2, 2}, {5, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x66 */ {{1, 2}, {5, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x67 */ {{0, 2}, {5, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x68 */ {{3, 3}, {5, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x69 */ {{0, 0}, {3, 3}, {5, 6}, {0, 0} } }, {0, 0, 3, 0, /* 0x6a */ {{1, 1}, {3, 3}, {5, 6}, {0, 0} } }, {1, 0, 3, 0, /* 0x6b */ {{0, 1}, {3, 3}, {5, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x6c */ {{2, 3}, {5, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x6d */ {{0, 0}, {2, 3}, {5, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x6e */ {{1, 3}, {5, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x6f */ {{0, 3}, {5, 6}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x70 */ {{4, 6}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x71 */ {{0, 0}, {4, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x72 */ {{1, 1}, {4, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x73 */ {{0, 1}, {4, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x74 */ {{2, 2}, {4, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x75 */ {{0, 0}, {2, 2}, {4, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x76 */ {{1, 2}, {4, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x77 */ {{0, 2}, {4, 6}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x78 */ {{3, 6}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x79 */ {{0, 0}, {3, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x7a */ {{1, 1}, {3, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x7b */ {{0, 1}, {3, 6}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x7c */ {{2, 6}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x7d */ {{0, 0}, {2, 6}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x7e */ {{1, 6}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 1, 0, /* 0x7f */ {{0, 6}, {0, 0}, {0, 0}, {0, 0} } }, {0, 1, 1, 0, /* 0x80 */ {{7, 7}, {0, 0}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0x81 */ {{0, 0}, {7, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0x82 */ {{1, 1}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0x83 */ {{0, 1}, {7, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0x84 */ {{2, 2}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0x85 */ {{0, 0}, {2, 2}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0x86 */ {{1, 2}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0x87 */ {{0, 2}, {7, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0x88 */ {{3, 3}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0x89 */ {{0, 0}, {3, 3}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0x8a */ {{1, 1}, {3, 3}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0x8b */ {{0, 1}, {3, 3}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0x8c */ {{2, 3}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0x8d */ {{0, 0}, {2, 3}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0x8e */ {{1, 3}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0x8f */ {{0, 3}, {7, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0x90 */ {{4, 4}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0x91 */ {{0, 0}, {4, 4}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0x92 */ {{1, 1}, {4, 4}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0x93 */ {{0, 1}, {4, 4}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0x94 */ {{2, 2}, {4, 4}, {7, 7}, {0, 0} } }, {1, 1, 4, 0, /* 0x95 */ {{0, 0}, {2, 2}, {4, 4}, {7, 7} } }, {0, 1, 3, 0, /* 0x96 */ {{1, 2}, {4, 4}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0x97 */ {{0, 2}, {4, 4}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0x98 */ {{3, 4}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0x99 */ {{0, 0}, {3, 4}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0x9a */ {{1, 1}, {3, 4}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0x9b */ {{0, 1}, {3, 4}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0x9c */ {{2, 4}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0x9d */ {{0, 0}, {2, 4}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0x9e */ {{1, 4}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0x9f */ {{0, 4}, {7, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xa0 */ {{5, 5}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xa1 */ {{0, 0}, {5, 5}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xa2 */ {{1, 1}, {5, 5}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xa3 */ {{0, 1}, {5, 5}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xa4 */ {{2, 2}, {5, 5}, {7, 7}, {0, 0} } }, {1, 1, 4, 0, /* 0xa5 */ {{0, 0}, {2, 2}, {5, 5}, {7, 7} } }, {0, 1, 3, 0, /* 0xa6 */ {{1, 2}, {5, 5}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xa7 */ {{0, 2}, {5, 5}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xa8 */ {{3, 3}, {5, 5}, {7, 7}, {0, 0} } }, {1, 1, 4, 0, /* 0xa9 */ {{0, 0}, {3, 3}, {5, 5}, {7, 7} } }, {0, 1, 4, 0, /* 0xaa */ {{1, 1}, {3, 3}, {5, 5}, {7, 7} } }, {1, 1, 4, 0, /* 0xab */ {{0, 1}, {3, 3}, {5, 5}, {7, 7} } }, {0, 1, 3, 0, /* 0xac */ {{2, 3}, {5, 5}, {7, 7}, {0, 0} } }, {1, 1, 4, 0, /* 0xad */ {{0, 0}, {2, 3}, {5, 5}, {7, 7} } }, {0, 1, 3, 0, /* 0xae */ {{1, 3}, {5, 5}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xaf */ {{0, 3}, {5, 5}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xb0 */ {{4, 5}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xb1 */ {{0, 0}, {4, 5}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xb2 */ {{1, 1}, {4, 5}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xb3 */ {{0, 1}, {4, 5}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xb4 */ {{2, 2}, {4, 5}, {7, 7}, {0, 0} } }, {1, 1, 4, 0, /* 0xb5 */ {{0, 0}, {2, 2}, {4, 5}, {7, 7} } }, {0, 1, 3, 0, /* 0xb6 */ {{1, 2}, {4, 5}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xb7 */ {{0, 2}, {4, 5}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xb8 */ {{3, 5}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xb9 */ {{0, 0}, {3, 5}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xba */ {{1, 1}, {3, 5}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xbb */ {{0, 1}, {3, 5}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xbc */ {{2, 5}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xbd */ {{0, 0}, {2, 5}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xbe */ {{1, 5}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xbf */ {{0, 5}, {7, 7}, {0, 0}, {0, 0} } }, {0, 1, 1, 0, /* 0xc0 */ {{6, 7}, {0, 0}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xc1 */ {{0, 0}, {6, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xc2 */ {{1, 1}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xc3 */ {{0, 1}, {6, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xc4 */ {{2, 2}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xc5 */ {{0, 0}, {2, 2}, {6, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xc6 */ {{1, 2}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xc7 */ {{0, 2}, {6, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xc8 */ {{3, 3}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xc9 */ {{0, 0}, {3, 3}, {6, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xca */ {{1, 1}, {3, 3}, {6, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xcb */ {{0, 1}, {3, 3}, {6, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xcc */ {{2, 3}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xcd */ {{0, 0}, {2, 3}, {6, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xce */ {{1, 3}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xcf */ {{0, 3}, {6, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xd0 */ {{4, 4}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xd1 */ {{0, 0}, {4, 4}, {6, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xd2 */ {{1, 1}, {4, 4}, {6, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xd3 */ {{0, 1}, {4, 4}, {6, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xd4 */ {{2, 2}, {4, 4}, {6, 7}, {0, 0} } }, {1, 1, 4, 0, /* 0xd5 */ {{0, 0}, {2, 2}, {4, 4}, {6, 7} } }, {0, 1, 3, 0, /* 0xd6 */ {{1, 2}, {4, 4}, {6, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xd7 */ {{0, 2}, {4, 4}, {6, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xd8 */ {{3, 4}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xd9 */ {{0, 0}, {3, 4}, {6, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xda */ {{1, 1}, {3, 4}, {6, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xdb */ {{0, 1}, {3, 4}, {6, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xdc */ {{2, 4}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xdd */ {{0, 0}, {2, 4}, {6, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xde */ {{1, 4}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xdf */ {{0, 4}, {6, 7}, {0, 0}, {0, 0} } }, {0, 1, 1, 0, /* 0xe0 */ {{5, 7}, {0, 0}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xe1 */ {{0, 0}, {5, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xe2 */ {{1, 1}, {5, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xe3 */ {{0, 1}, {5, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xe4 */ {{2, 2}, {5, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xe5 */ {{0, 0}, {2, 2}, {5, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xe6 */ {{1, 2}, {5, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xe7 */ {{0, 2}, {5, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xe8 */ {{3, 3}, {5, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xe9 */ {{0, 0}, {3, 3}, {5, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xea */ {{1, 1}, {3, 3}, {5, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xeb */ {{0, 1}, {3, 3}, {5, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xec */ {{2, 3}, {5, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xed */ {{0, 0}, {2, 3}, {5, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xee */ {{1, 3}, {5, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xef */ {{0, 3}, {5, 7}, {0, 0}, {0, 0} } }, {0, 1, 1, 0, /* 0xf0 */ {{4, 7}, {0, 0}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xf1 */ {{0, 0}, {4, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xf2 */ {{1, 1}, {4, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xf3 */ {{0, 1}, {4, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xf4 */ {{2, 2}, {4, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xf5 */ {{0, 0}, {2, 2}, {4, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xf6 */ {{1, 2}, {4, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xf7 */ {{0, 2}, {4, 7}, {0, 0}, {0, 0} } }, {0, 1, 1, 0, /* 0xf8 */ {{3, 7}, {0, 0}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xf9 */ {{0, 0}, {3, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xfa */ {{1, 1}, {3, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xfb */ {{0, 1}, {3, 7}, {0, 0}, {0, 0} } }, {0, 1, 1, 0, /* 0xfc */ {{2, 7}, {0, 0}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xfd */ {{0, 0}, {2, 7}, {0, 0}, {0, 0} } }, {0, 1, 1, 0, /* 0xfe */ {{1, 7}, {0, 0}, {0, 0}, {0, 0} } }, {1, 1, 1, 0, /* 0xff */ {{0, 7}, {0, 0}, {0, 0}, {0, 0} } } }; int sctp_is_address_in_scope(struct sctp_ifa *ifa, struct sctp_scoping *scope, int do_update) { if ((scope->loopback_scope == 0) && (ifa->ifn_p) && SCTP_IFN_IS_IFT_LOOP(ifa->ifn_p)) { /* * skip loopback if not in scope * */ return (0); } switch (ifa->address.sa.sa_family) { #ifdef INET case AF_INET: if (scope->ipv4_addr_legal) { struct sockaddr_in *sin; sin = &ifa->address.sin; if (sin->sin_addr.s_addr == 0) { /* not in scope , unspecified */ return (0); } if ((scope->ipv4_local_scope == 0) && (IN4_ISPRIVATE_ADDRESS(&sin->sin_addr))) { /* private address not in scope */ return (0); } } else { return (0); } break; #endif #ifdef INET6 case AF_INET6: if (scope->ipv6_addr_legal) { struct sockaddr_in6 *sin6; /* * Must update the flags, bummer, which means any * IFA locks must now be applied HERE <-> */ if (do_update) { sctp_gather_internal_ifa_flags(ifa); } if (ifa->localifa_flags & SCTP_ADDR_IFA_UNUSEABLE) { return (0); } /* ok to use deprecated addresses? */ sin6 = &ifa->address.sin6; if (IN6_IS_ADDR_UNSPECIFIED(&sin6->sin6_addr)) { /* skip unspecifed addresses */ return (0); } if ( /* (local_scope == 0) && */ (IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr))) { return (0); } if ((scope->site_scope == 0) && (IN6_IS_ADDR_SITELOCAL(&sin6->sin6_addr))) { return (0); } } else { return (0); } break; #endif default: return (0); } return (1); } static struct mbuf * sctp_add_addr_to_mbuf(struct mbuf *m, struct sctp_ifa *ifa, uint16_t * len) { #if defined(INET) || defined(INET6) struct sctp_paramhdr *parmh; struct mbuf *mret; uint16_t plen; #endif switch (ifa->address.sa.sa_family) { #ifdef INET case AF_INET: plen = (uint16_t) sizeof(struct sctp_ipv4addr_param); break; #endif #ifdef INET6 case AF_INET6: plen = (uint16_t) sizeof(struct sctp_ipv6addr_param); break; #endif default: return (m); } #if defined(INET) || defined(INET6) if (M_TRAILINGSPACE(m) >= plen) { /* easy side we just drop it on the end */ parmh = (struct sctp_paramhdr *)(SCTP_BUF_AT(m, SCTP_BUF_LEN(m))); mret = m; } else { /* Need more space */ mret = m; while (SCTP_BUF_NEXT(mret) != NULL) { mret = SCTP_BUF_NEXT(mret); } SCTP_BUF_NEXT(mret) = sctp_get_mbuf_for_msg(plen, 0, M_NOWAIT, 1, MT_DATA); if (SCTP_BUF_NEXT(mret) == NULL) { /* We are hosed, can't add more addresses */ return (m); } mret = SCTP_BUF_NEXT(mret); parmh = mtod(mret, struct sctp_paramhdr *); } /* now add the parameter */ switch (ifa->address.sa.sa_family) { #ifdef INET case AF_INET: { struct sctp_ipv4addr_param *ipv4p; struct sockaddr_in *sin; sin = &ifa->address.sin; ipv4p = (struct sctp_ipv4addr_param *)parmh; parmh->param_type = htons(SCTP_IPV4_ADDRESS); parmh->param_length = htons(plen); ipv4p->addr = sin->sin_addr.s_addr; SCTP_BUF_LEN(mret) += plen; break; } #endif #ifdef INET6 case AF_INET6: { struct sctp_ipv6addr_param *ipv6p; struct sockaddr_in6 *sin6; sin6 = &ifa->address.sin6; ipv6p = (struct sctp_ipv6addr_param *)parmh; parmh->param_type = htons(SCTP_IPV6_ADDRESS); parmh->param_length = htons(plen); memcpy(ipv6p->addr, &sin6->sin6_addr, sizeof(ipv6p->addr)); /* clear embedded scope in the address */ in6_clearscope((struct in6_addr *)ipv6p->addr); SCTP_BUF_LEN(mret) += plen; break; } #endif default: return (m); } if (len != NULL) { *len += plen; } return (mret); #endif } struct mbuf * sctp_add_addresses_to_i_ia(struct sctp_inpcb *inp, struct sctp_tcb *stcb, struct sctp_scoping *scope, struct mbuf *m_at, int cnt_inits_to, uint16_t * padding_len, uint16_t * chunk_len) { struct sctp_vrf *vrf = NULL; int cnt, limit_out = 0, total_count; uint32_t vrf_id; vrf_id = inp->def_vrf_id; SCTP_IPI_ADDR_RLOCK(); vrf = sctp_find_vrf(vrf_id); if (vrf == NULL) { SCTP_IPI_ADDR_RUNLOCK(); return (m_at); } if (inp->sctp_flags & SCTP_PCB_FLAGS_BOUNDALL) { struct sctp_ifa *sctp_ifap; struct sctp_ifn *sctp_ifnp; cnt = cnt_inits_to; if (vrf->total_ifa_count > SCTP_COUNT_LIMIT) { limit_out = 1; cnt = SCTP_ADDRESS_LIMIT; goto skip_count; } LIST_FOREACH(sctp_ifnp, &vrf->ifnlist, next_ifn) { if ((scope->loopback_scope == 0) && SCTP_IFN_IS_IFT_LOOP(sctp_ifnp)) { /* * Skip loopback devices if loopback_scope * not set */ continue; } LIST_FOREACH(sctp_ifap, &sctp_ifnp->ifalist, next_ifa) { #ifdef INET if ((sctp_ifap->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &sctp_ifap->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((sctp_ifap->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &sctp_ifap->address.sin6.sin6_addr) != 0)) { continue; } #endif if (sctp_is_addr_restricted(stcb, sctp_ifap)) { continue; } if (sctp_is_address_in_scope(sctp_ifap, scope, 1) == 0) { continue; } cnt++; if (cnt > SCTP_ADDRESS_LIMIT) { break; } } if (cnt > SCTP_ADDRESS_LIMIT) { break; } } skip_count: if (cnt > 1) { total_count = 0; LIST_FOREACH(sctp_ifnp, &vrf->ifnlist, next_ifn) { cnt = 0; if ((scope->loopback_scope == 0) && SCTP_IFN_IS_IFT_LOOP(sctp_ifnp)) { /* * Skip loopback devices if * loopback_scope not set */ continue; } LIST_FOREACH(sctp_ifap, &sctp_ifnp->ifalist, next_ifa) { #ifdef INET if ((sctp_ifap->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &sctp_ifap->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((sctp_ifap->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &sctp_ifap->address.sin6.sin6_addr) != 0)) { continue; } #endif if (sctp_is_addr_restricted(stcb, sctp_ifap)) { continue; } if (sctp_is_address_in_scope(sctp_ifap, scope, 0) == 0) { continue; } if ((chunk_len != NULL) && (padding_len != NULL) && (*padding_len > 0)) { memset(mtod(m_at, caddr_t)+*chunk_len, 0, *padding_len); SCTP_BUF_LEN(m_at) += *padding_len; *chunk_len += *padding_len; *padding_len = 0; } m_at = sctp_add_addr_to_mbuf(m_at, sctp_ifap, chunk_len); if (limit_out) { cnt++; total_count++; if (cnt >= 2) { /* * two from each * address */ break; } if (total_count > SCTP_ADDRESS_LIMIT) { /* No more addresses */ break; } } } } } } else { struct sctp_laddr *laddr; cnt = cnt_inits_to; /* First, how many ? */ LIST_FOREACH(laddr, &inp->sctp_addr_list, sctp_nxt_addr) { if (laddr->ifa == NULL) { continue; } if (laddr->ifa->localifa_flags & SCTP_BEING_DELETED) /* * Address being deleted by the system, dont * list. */ continue; if (laddr->action == SCTP_DEL_IP_ADDRESS) { /* * Address being deleted on this ep don't * list. */ continue; } if (sctp_is_address_in_scope(laddr->ifa, scope, 1) == 0) { continue; } cnt++; } /* * To get through a NAT we only list addresses if we have * more than one. That way if you just bind a single address * we let the source of the init dictate our address. */ if (cnt > 1) { cnt = cnt_inits_to; LIST_FOREACH(laddr, &inp->sctp_addr_list, sctp_nxt_addr) { if (laddr->ifa == NULL) { continue; } if (laddr->ifa->localifa_flags & SCTP_BEING_DELETED) { continue; } if (sctp_is_address_in_scope(laddr->ifa, scope, 0) == 0) { continue; } if ((chunk_len != NULL) && (padding_len != NULL) && (*padding_len > 0)) { memset(mtod(m_at, caddr_t)+*chunk_len, 0, *padding_len); SCTP_BUF_LEN(m_at) += *padding_len; *chunk_len += *padding_len; *padding_len = 0; } m_at = sctp_add_addr_to_mbuf(m_at, laddr->ifa, chunk_len); cnt++; if (cnt >= SCTP_ADDRESS_LIMIT) { break; } } } } SCTP_IPI_ADDR_RUNLOCK(); return (m_at); } static struct sctp_ifa * sctp_is_ifa_addr_preferred(struct sctp_ifa *ifa, uint8_t dest_is_loop, uint8_t dest_is_priv, sa_family_t fam) { uint8_t dest_is_global = 0; /* dest_is_priv is true if destination is a private address */ /* dest_is_loop is true if destination is a loopback addresses */ /** * Here we determine if its a preferred address. A preferred address * means it is the same scope or higher scope then the destination. * L = loopback, P = private, G = global * ----------------------------------------- * src | dest | result * ---------------------------------------- * L | L | yes * ----------------------------------------- * P | L | yes-v4 no-v6 * ----------------------------------------- * G | L | yes-v4 no-v6 * ----------------------------------------- * L | P | no * ----------------------------------------- * P | P | yes * ----------------------------------------- * G | P | no * ----------------------------------------- * L | G | no * ----------------------------------------- * P | G | no * ----------------------------------------- * G | G | yes * ----------------------------------------- */ if (ifa->address.sa.sa_family != fam) { /* forget mis-matched family */ return (NULL); } if ((dest_is_priv == 0) && (dest_is_loop == 0)) { dest_is_global = 1; } SCTPDBG(SCTP_DEBUG_OUTPUT2, "Is destination preferred:"); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, &ifa->address.sa); /* Ok the address may be ok */ #ifdef INET6 if (fam == AF_INET6) { /* ok to use deprecated addresses? no lets not! */ if (ifa->localifa_flags & SCTP_ADDR_IFA_UNUSEABLE) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "NO:1\n"); return (NULL); } if (ifa->src_is_priv && !ifa->src_is_loop) { if (dest_is_loop) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "NO:2\n"); return (NULL); } } if (ifa->src_is_glob) { if (dest_is_loop) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "NO:3\n"); return (NULL); } } } #endif /* * Now that we know what is what, implement or table this could in * theory be done slicker (it used to be), but this is * straightforward and easier to validate :-) */ SCTPDBG(SCTP_DEBUG_OUTPUT3, "src_loop:%d src_priv:%d src_glob:%d\n", ifa->src_is_loop, ifa->src_is_priv, ifa->src_is_glob); SCTPDBG(SCTP_DEBUG_OUTPUT3, "dest_loop:%d dest_priv:%d dest_glob:%d\n", dest_is_loop, dest_is_priv, dest_is_global); if ((ifa->src_is_loop) && (dest_is_priv)) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "NO:4\n"); return (NULL); } if ((ifa->src_is_glob) && (dest_is_priv)) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "NO:5\n"); return (NULL); } if ((ifa->src_is_loop) && (dest_is_global)) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "NO:6\n"); return (NULL); } if ((ifa->src_is_priv) && (dest_is_global)) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "NO:7\n"); return (NULL); } SCTPDBG(SCTP_DEBUG_OUTPUT3, "YES\n"); /* its a preferred address */ return (ifa); } static struct sctp_ifa * sctp_is_ifa_addr_acceptable(struct sctp_ifa *ifa, uint8_t dest_is_loop, uint8_t dest_is_priv, sa_family_t fam) { uint8_t dest_is_global = 0; /** * Here we determine if its a acceptable address. A acceptable * address means it is the same scope or higher scope but we can * allow for NAT which means its ok to have a global dest and a * private src. * * L = loopback, P = private, G = global * ----------------------------------------- * src | dest | result * ----------------------------------------- * L | L | yes * ----------------------------------------- * P | L | yes-v4 no-v6 * ----------------------------------------- * G | L | yes * ----------------------------------------- * L | P | no * ----------------------------------------- * P | P | yes * ----------------------------------------- * G | P | yes - May not work * ----------------------------------------- * L | G | no * ----------------------------------------- * P | G | yes - May not work * ----------------------------------------- * G | G | yes * ----------------------------------------- */ if (ifa->address.sa.sa_family != fam) { /* forget non matching family */ SCTPDBG(SCTP_DEBUG_OUTPUT3, "ifa_fam:%d fam:%d\n", ifa->address.sa.sa_family, fam); return (NULL); } /* Ok the address may be ok */ SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT3, &ifa->address.sa); SCTPDBG(SCTP_DEBUG_OUTPUT3, "dst_is_loop:%d dest_is_priv:%d\n", dest_is_loop, dest_is_priv); if ((dest_is_loop == 0) && (dest_is_priv == 0)) { dest_is_global = 1; } #ifdef INET6 if (fam == AF_INET6) { /* ok to use deprecated addresses? */ if (ifa->localifa_flags & SCTP_ADDR_IFA_UNUSEABLE) { return (NULL); } if (ifa->src_is_priv) { /* Special case, linklocal to loop */ if (dest_is_loop) return (NULL); } } #endif /* * Now that we know what is what, implement our table. This could in * theory be done slicker (it used to be), but this is * straightforward and easier to validate :-) */ SCTPDBG(SCTP_DEBUG_OUTPUT3, "ifa->src_is_loop:%d dest_is_priv:%d\n", ifa->src_is_loop, dest_is_priv); if ((ifa->src_is_loop == 1) && (dest_is_priv)) { return (NULL); } SCTPDBG(SCTP_DEBUG_OUTPUT3, "ifa->src_is_loop:%d dest_is_glob:%d\n", ifa->src_is_loop, dest_is_global); if ((ifa->src_is_loop == 1) && (dest_is_global)) { return (NULL); } SCTPDBG(SCTP_DEBUG_OUTPUT3, "address is acceptable\n"); /* its an acceptable address */ return (ifa); } int sctp_is_addr_restricted(struct sctp_tcb *stcb, struct sctp_ifa *ifa) { struct sctp_laddr *laddr; if (stcb == NULL) { /* There are no restrictions, no TCB :-) */ return (0); } LIST_FOREACH(laddr, &stcb->asoc.sctp_restricted_addrs, sctp_nxt_addr) { if (laddr->ifa == NULL) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "%s: NULL ifa\n", __func__); continue; } if (laddr->ifa == ifa) { /* Yes it is on the list */ return (1); } } return (0); } int sctp_is_addr_in_ep(struct sctp_inpcb *inp, struct sctp_ifa *ifa) { struct sctp_laddr *laddr; if (ifa == NULL) return (0); LIST_FOREACH(laddr, &inp->sctp_addr_list, sctp_nxt_addr) { if (laddr->ifa == NULL) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "%s: NULL ifa\n", __func__); continue; } if ((laddr->ifa == ifa) && laddr->action == 0) /* same pointer */ return (1); } return (0); } static struct sctp_ifa * sctp_choose_boundspecific_inp(struct sctp_inpcb *inp, sctp_route_t * ro, uint32_t vrf_id, int non_asoc_addr_ok, uint8_t dest_is_priv, uint8_t dest_is_loop, sa_family_t fam) { struct sctp_laddr *laddr, *starting_point; void *ifn; int resettotop = 0; struct sctp_ifn *sctp_ifn; struct sctp_ifa *sctp_ifa, *sifa; struct sctp_vrf *vrf; uint32_t ifn_index; vrf = sctp_find_vrf(vrf_id); if (vrf == NULL) return (NULL); ifn = SCTP_GET_IFN_VOID_FROM_ROUTE(ro); ifn_index = SCTP_GET_IF_INDEX_FROM_ROUTE(ro); sctp_ifn = sctp_find_ifn(ifn, ifn_index); /* * first question, is the ifn we will emit on in our list, if so, we * want such an address. Note that we first looked for a preferred * address. */ if (sctp_ifn) { /* is a preferred one on the interface we route out? */ LIST_FOREACH(sctp_ifa, &sctp_ifn->ifalist, next_ifa) { #ifdef INET if ((sctp_ifa->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((sctp_ifa->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin6.sin6_addr) != 0)) { continue; } #endif if ((sctp_ifa->localifa_flags & SCTP_ADDR_DEFER_USE) && (non_asoc_addr_ok == 0)) continue; sifa = sctp_is_ifa_addr_preferred(sctp_ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; if (sctp_is_addr_in_ep(inp, sifa)) { atomic_add_int(&sifa->refcount, 1); return (sifa); } } } /* * ok, now we now need to find one on the list of the addresses. We * can't get one on the emitting interface so let's find first a * preferred one. If not that an acceptable one otherwise... we * return NULL. */ starting_point = inp->next_addr_touse; once_again: if (inp->next_addr_touse == NULL) { inp->next_addr_touse = LIST_FIRST(&inp->sctp_addr_list); resettotop = 1; } for (laddr = inp->next_addr_touse; laddr; laddr = LIST_NEXT(laddr, sctp_nxt_addr)) { if (laddr->ifa == NULL) { /* address has been removed */ continue; } if (laddr->action == SCTP_DEL_IP_ADDRESS) { /* address is being deleted */ continue; } sifa = sctp_is_ifa_addr_preferred(laddr->ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; atomic_add_int(&sifa->refcount, 1); return (sifa); } if (resettotop == 0) { inp->next_addr_touse = NULL; goto once_again; } inp->next_addr_touse = starting_point; resettotop = 0; once_again_too: if (inp->next_addr_touse == NULL) { inp->next_addr_touse = LIST_FIRST(&inp->sctp_addr_list); resettotop = 1; } /* ok, what about an acceptable address in the inp */ for (laddr = inp->next_addr_touse; laddr; laddr = LIST_NEXT(laddr, sctp_nxt_addr)) { if (laddr->ifa == NULL) { /* address has been removed */ continue; } if (laddr->action == SCTP_DEL_IP_ADDRESS) { /* address is being deleted */ continue; } sifa = sctp_is_ifa_addr_acceptable(laddr->ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; atomic_add_int(&sifa->refcount, 1); return (sifa); } if (resettotop == 0) { inp->next_addr_touse = NULL; goto once_again_too; } /* * no address bound can be a source for the destination we are in * trouble */ return (NULL); } static struct sctp_ifa * sctp_choose_boundspecific_stcb(struct sctp_inpcb *inp, struct sctp_tcb *stcb, sctp_route_t * ro, uint32_t vrf_id, uint8_t dest_is_priv, uint8_t dest_is_loop, int non_asoc_addr_ok, sa_family_t fam) { struct sctp_laddr *laddr, *starting_point; void *ifn; struct sctp_ifn *sctp_ifn; struct sctp_ifa *sctp_ifa, *sifa; uint8_t start_at_beginning = 0; struct sctp_vrf *vrf; uint32_t ifn_index; /* * first question, is the ifn we will emit on in our list, if so, we * want that one. */ vrf = sctp_find_vrf(vrf_id); if (vrf == NULL) return (NULL); ifn = SCTP_GET_IFN_VOID_FROM_ROUTE(ro); ifn_index = SCTP_GET_IF_INDEX_FROM_ROUTE(ro); sctp_ifn = sctp_find_ifn(ifn, ifn_index); /* * first question, is the ifn we will emit on in our list? If so, * we want that one. First we look for a preferred. Second, we go * for an acceptable. */ if (sctp_ifn) { /* first try for a preferred address on the ep */ LIST_FOREACH(sctp_ifa, &sctp_ifn->ifalist, next_ifa) { #ifdef INET if ((sctp_ifa->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((sctp_ifa->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin6.sin6_addr) != 0)) { continue; } #endif if ((sctp_ifa->localifa_flags & SCTP_ADDR_DEFER_USE) && (non_asoc_addr_ok == 0)) continue; if (sctp_is_addr_in_ep(inp, sctp_ifa)) { sifa = sctp_is_ifa_addr_preferred(sctp_ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, sifa)) && (!sctp_is_addr_pending(stcb, sifa)))) { /* on the no-no list */ continue; } atomic_add_int(&sifa->refcount, 1); return (sifa); } } /* next try for an acceptable address on the ep */ LIST_FOREACH(sctp_ifa, &sctp_ifn->ifalist, next_ifa) { #ifdef INET if ((sctp_ifa->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((sctp_ifa->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin6.sin6_addr) != 0)) { continue; } #endif if ((sctp_ifa->localifa_flags & SCTP_ADDR_DEFER_USE) && (non_asoc_addr_ok == 0)) continue; if (sctp_is_addr_in_ep(inp, sctp_ifa)) { sifa = sctp_is_ifa_addr_acceptable(sctp_ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, sifa)) && (!sctp_is_addr_pending(stcb, sifa)))) { /* on the no-no list */ continue; } atomic_add_int(&sifa->refcount, 1); return (sifa); } } } /* * if we can't find one like that then we must look at all addresses * bound to pick one at first preferable then secondly acceptable. */ starting_point = stcb->asoc.last_used_address; sctp_from_the_top: if (stcb->asoc.last_used_address == NULL) { start_at_beginning = 1; stcb->asoc.last_used_address = LIST_FIRST(&inp->sctp_addr_list); } /* search beginning with the last used address */ for (laddr = stcb->asoc.last_used_address; laddr; laddr = LIST_NEXT(laddr, sctp_nxt_addr)) { if (laddr->ifa == NULL) { /* address has been removed */ continue; } if (laddr->action == SCTP_DEL_IP_ADDRESS) { /* address is being deleted */ continue; } sifa = sctp_is_ifa_addr_preferred(laddr->ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, sifa)) && (!sctp_is_addr_pending(stcb, sifa)))) { /* on the no-no list */ continue; } stcb->asoc.last_used_address = laddr; atomic_add_int(&sifa->refcount, 1); return (sifa); } if (start_at_beginning == 0) { stcb->asoc.last_used_address = NULL; goto sctp_from_the_top; } /* now try for any higher scope than the destination */ stcb->asoc.last_used_address = starting_point; start_at_beginning = 0; sctp_from_the_top2: if (stcb->asoc.last_used_address == NULL) { start_at_beginning = 1; stcb->asoc.last_used_address = LIST_FIRST(&inp->sctp_addr_list); } /* search beginning with the last used address */ for (laddr = stcb->asoc.last_used_address; laddr; laddr = LIST_NEXT(laddr, sctp_nxt_addr)) { if (laddr->ifa == NULL) { /* address has been removed */ continue; } if (laddr->action == SCTP_DEL_IP_ADDRESS) { /* address is being deleted */ continue; } sifa = sctp_is_ifa_addr_acceptable(laddr->ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, sifa)) && (!sctp_is_addr_pending(stcb, sifa)))) { /* on the no-no list */ continue; } stcb->asoc.last_used_address = laddr; atomic_add_int(&sifa->refcount, 1); return (sifa); } if (start_at_beginning == 0) { stcb->asoc.last_used_address = NULL; goto sctp_from_the_top2; } return (NULL); } static struct sctp_ifa * sctp_select_nth_preferred_addr_from_ifn_boundall(struct sctp_ifn *ifn, struct sctp_inpcb *inp, struct sctp_tcb *stcb, int non_asoc_addr_ok, uint8_t dest_is_loop, uint8_t dest_is_priv, int addr_wanted, sa_family_t fam, sctp_route_t * ro ) { struct sctp_ifa *ifa, *sifa; int num_eligible_addr = 0; #ifdef INET6 struct sockaddr_in6 sin6, lsa6; if (fam == AF_INET6) { memcpy(&sin6, &ro->ro_dst, sizeof(struct sockaddr_in6)); (void)sa6_recoverscope(&sin6); } #endif /* INET6 */ LIST_FOREACH(ifa, &ifn->ifalist, next_ifa) { #ifdef INET if ((ifa->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &ifa->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((ifa->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &ifa->address.sin6.sin6_addr) != 0)) { continue; } #endif if ((ifa->localifa_flags & SCTP_ADDR_DEFER_USE) && (non_asoc_addr_ok == 0)) continue; sifa = sctp_is_ifa_addr_preferred(ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; #ifdef INET6 if (fam == AF_INET6 && dest_is_loop && sifa->src_is_loop && sifa->src_is_priv) { /* * don't allow fe80::1 to be a src on loop ::1, we * don't list it to the peer so we will get an * abort. */ continue; } if (fam == AF_INET6 && IN6_IS_ADDR_LINKLOCAL(&sifa->address.sin6.sin6_addr) && IN6_IS_ADDR_LINKLOCAL(&sin6.sin6_addr)) { /* * link-local <-> link-local must belong to the same * scope. */ memcpy(&lsa6, &sifa->address.sin6, sizeof(struct sockaddr_in6)); (void)sa6_recoverscope(&lsa6); if (sin6.sin6_scope_id != lsa6.sin6_scope_id) { continue; } } #endif /* INET6 */ /* * Check if the IPv6 address matches to next-hop. In the * mobile case, old IPv6 address may be not deleted from the * interface. Then, the interface has previous and new * addresses. We should use one corresponding to the * next-hop. (by micchie) */ #ifdef INET6 if (stcb && fam == AF_INET6 && sctp_is_mobility_feature_on(stcb->sctp_ep, SCTP_MOBILITY_BASE)) { if (sctp_v6src_match_nexthop(&sifa->address.sin6, ro) == 0) { continue; } } #endif #ifdef INET /* Avoid topologically incorrect IPv4 address */ if (stcb && fam == AF_INET && sctp_is_mobility_feature_on(stcb->sctp_ep, SCTP_MOBILITY_BASE)) { if (sctp_v4src_match_nexthop(sifa, ro) == 0) { continue; } } #endif if (stcb) { if (sctp_is_address_in_scope(ifa, &stcb->asoc.scope, 0) == 0) { continue; } if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, sifa)) && (!sctp_is_addr_pending(stcb, sifa)))) { /* * It is restricted for some reason.. * probably not yet added. */ continue; } } if (num_eligible_addr >= addr_wanted) { return (sifa); } num_eligible_addr++; } return (NULL); } static int sctp_count_num_preferred_boundall(struct sctp_ifn *ifn, struct sctp_inpcb *inp, struct sctp_tcb *stcb, int non_asoc_addr_ok, uint8_t dest_is_loop, uint8_t dest_is_priv, sa_family_t fam) { struct sctp_ifa *ifa, *sifa; int num_eligible_addr = 0; LIST_FOREACH(ifa, &ifn->ifalist, next_ifa) { #ifdef INET if ((ifa->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &ifa->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((ifa->address.sa.sa_family == AF_INET6) && (stcb != NULL) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &ifa->address.sin6.sin6_addr) != 0)) { continue; } #endif if ((ifa->localifa_flags & SCTP_ADDR_DEFER_USE) && (non_asoc_addr_ok == 0)) { continue; } sifa = sctp_is_ifa_addr_preferred(ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) { continue; } if (stcb) { if (sctp_is_address_in_scope(ifa, &stcb->asoc.scope, 0) == 0) { continue; } if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, sifa)) && (!sctp_is_addr_pending(stcb, sifa)))) { /* * It is restricted for some reason.. * probably not yet added. */ continue; } } num_eligible_addr++; } return (num_eligible_addr); } static struct sctp_ifa * sctp_choose_boundall(struct sctp_inpcb *inp, struct sctp_tcb *stcb, struct sctp_nets *net, sctp_route_t * ro, uint32_t vrf_id, uint8_t dest_is_priv, uint8_t dest_is_loop, int non_asoc_addr_ok, sa_family_t fam) { int cur_addr_num = 0, num_preferred = 0; void *ifn; struct sctp_ifn *sctp_ifn, *looked_at = NULL, *emit_ifn; struct sctp_ifa *sctp_ifa, *sifa; uint32_t ifn_index; struct sctp_vrf *vrf; #ifdef INET int retried = 0; #endif /*- * For boundall we can use any address in the association. * If non_asoc_addr_ok is set we can use any address (at least in * theory). So we look for preferred addresses first. If we find one, * we use it. Otherwise we next try to get an address on the * interface, which we should be able to do (unless non_asoc_addr_ok * is false and we are routed out that way). In these cases where we * can't use the address of the interface we go through all the * ifn's looking for an address we can use and fill that in. Punting * means we send back address 0, which will probably cause problems * actually since then IP will fill in the address of the route ifn, * which means we probably already rejected it.. i.e. here comes an * abort :-<. */ vrf = sctp_find_vrf(vrf_id); if (vrf == NULL) return (NULL); ifn = SCTP_GET_IFN_VOID_FROM_ROUTE(ro); ifn_index = SCTP_GET_IF_INDEX_FROM_ROUTE(ro); SCTPDBG(SCTP_DEBUG_OUTPUT2, "ifn from route:%p ifn_index:%d\n", ifn, ifn_index); emit_ifn = looked_at = sctp_ifn = sctp_find_ifn(ifn, ifn_index); if (sctp_ifn == NULL) { /* ?? We don't have this guy ?? */ SCTPDBG(SCTP_DEBUG_OUTPUT2, "No ifn emit interface?\n"); goto bound_all_plan_b; } SCTPDBG(SCTP_DEBUG_OUTPUT2, "ifn_index:%d name:%s is emit interface\n", ifn_index, sctp_ifn->ifn_name); if (net) { cur_addr_num = net->indx_of_eligible_next_to_use; } num_preferred = sctp_count_num_preferred_boundall(sctp_ifn, inp, stcb, non_asoc_addr_ok, dest_is_loop, dest_is_priv, fam); SCTPDBG(SCTP_DEBUG_OUTPUT2, "Found %d preferred source addresses for intf:%s\n", num_preferred, sctp_ifn->ifn_name); if (num_preferred == 0) { /* * no eligible addresses, we must use some other interface * address if we can find one. */ goto bound_all_plan_b; } /* * Ok we have num_eligible_addr set with how many we can use, this * may vary from call to call due to addresses being deprecated * etc.. */ if (cur_addr_num >= num_preferred) { cur_addr_num = 0; } /* * select the nth address from the list (where cur_addr_num is the * nth) and 0 is the first one, 1 is the second one etc... */ SCTPDBG(SCTP_DEBUG_OUTPUT2, "cur_addr_num:%d\n", cur_addr_num); sctp_ifa = sctp_select_nth_preferred_addr_from_ifn_boundall(sctp_ifn, inp, stcb, non_asoc_addr_ok, dest_is_loop, dest_is_priv, cur_addr_num, fam, ro); /* if sctp_ifa is NULL something changed??, fall to plan b. */ if (sctp_ifa) { atomic_add_int(&sctp_ifa->refcount, 1); if (net) { /* save off where the next one we will want */ net->indx_of_eligible_next_to_use = cur_addr_num + 1; } return (sctp_ifa); } /* * plan_b: Look at all interfaces and find a preferred address. If * no preferred fall through to plan_c. */ bound_all_plan_b: SCTPDBG(SCTP_DEBUG_OUTPUT2, "Trying Plan B\n"); LIST_FOREACH(sctp_ifn, &vrf->ifnlist, next_ifn) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "Examine interface %s\n", sctp_ifn->ifn_name); if (dest_is_loop == 0 && SCTP_IFN_IS_IFT_LOOP(sctp_ifn)) { /* wrong base scope */ SCTPDBG(SCTP_DEBUG_OUTPUT2, "skip\n"); continue; } if ((sctp_ifn == looked_at) && looked_at) { /* already looked at this guy */ SCTPDBG(SCTP_DEBUG_OUTPUT2, "already seen\n"); continue; } num_preferred = sctp_count_num_preferred_boundall(sctp_ifn, inp, stcb, non_asoc_addr_ok, dest_is_loop, dest_is_priv, fam); SCTPDBG(SCTP_DEBUG_OUTPUT2, "Found ifn:%p %d preferred source addresses\n", ifn, num_preferred); if (num_preferred == 0) { /* None on this interface. */ SCTPDBG(SCTP_DEBUG_OUTPUT2, "No preferred -- skipping to next\n"); continue; } SCTPDBG(SCTP_DEBUG_OUTPUT2, "num preferred:%d on interface:%p cur_addr_num:%d\n", num_preferred, (void *)sctp_ifn, cur_addr_num); /* * Ok we have num_eligible_addr set with how many we can * use, this may vary from call to call due to addresses * being deprecated etc.. */ if (cur_addr_num >= num_preferred) { cur_addr_num = 0; } sifa = sctp_select_nth_preferred_addr_from_ifn_boundall(sctp_ifn, inp, stcb, non_asoc_addr_ok, dest_is_loop, dest_is_priv, cur_addr_num, fam, ro); if (sifa == NULL) continue; if (net) { net->indx_of_eligible_next_to_use = cur_addr_num + 1; SCTPDBG(SCTP_DEBUG_OUTPUT2, "we selected %d\n", cur_addr_num); SCTPDBG(SCTP_DEBUG_OUTPUT2, "Source:"); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, &sifa->address.sa); SCTPDBG(SCTP_DEBUG_OUTPUT2, "Dest:"); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, &net->ro._l_addr.sa); } atomic_add_int(&sifa->refcount, 1); return (sifa); } #ifdef INET again_with_private_addresses_allowed: #endif /* plan_c: do we have an acceptable address on the emit interface */ sifa = NULL; SCTPDBG(SCTP_DEBUG_OUTPUT2, "Trying Plan C: find acceptable on interface\n"); if (emit_ifn == NULL) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "Jump to Plan D - no emit_ifn\n"); goto plan_d; } LIST_FOREACH(sctp_ifa, &emit_ifn->ifalist, next_ifa) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "ifa:%p\n", (void *)sctp_ifa); #ifdef INET if ((sctp_ifa->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin.sin_addr) != 0)) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "Jailed\n"); continue; } #endif #ifdef INET6 if ((sctp_ifa->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin6.sin6_addr) != 0)) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "Jailed\n"); continue; } #endif if ((sctp_ifa->localifa_flags & SCTP_ADDR_DEFER_USE) && (non_asoc_addr_ok == 0)) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "Defer\n"); continue; } sifa = sctp_is_ifa_addr_acceptable(sctp_ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "IFA not acceptable\n"); continue; } if (stcb) { if (sctp_is_address_in_scope(sifa, &stcb->asoc.scope, 0) == 0) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "NOT in scope\n"); sifa = NULL; continue; } if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, sifa)) && (!sctp_is_addr_pending(stcb, sifa)))) { /* * It is restricted for some reason.. * probably not yet added. */ SCTPDBG(SCTP_DEBUG_OUTPUT2, "Its restricted\n"); sifa = NULL; continue; } } atomic_add_int(&sifa->refcount, 1); goto out; } plan_d: /* * plan_d: We are in trouble. No preferred address on the emit * interface. And not even a preferred address on all interfaces. Go * out and see if we can find an acceptable address somewhere * amongst all interfaces. */ SCTPDBG(SCTP_DEBUG_OUTPUT2, "Trying Plan D looked_at is %p\n", (void *)looked_at); LIST_FOREACH(sctp_ifn, &vrf->ifnlist, next_ifn) { if (dest_is_loop == 0 && SCTP_IFN_IS_IFT_LOOP(sctp_ifn)) { /* wrong base scope */ continue; } LIST_FOREACH(sctp_ifa, &sctp_ifn->ifalist, next_ifa) { #ifdef INET if ((sctp_ifa->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((sctp_ifa->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin6.sin6_addr) != 0)) { continue; } #endif if ((sctp_ifa->localifa_flags & SCTP_ADDR_DEFER_USE) && (non_asoc_addr_ok == 0)) continue; sifa = sctp_is_ifa_addr_acceptable(sctp_ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; if (stcb) { if (sctp_is_address_in_scope(sifa, &stcb->asoc.scope, 0) == 0) { sifa = NULL; continue; } if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, sifa)) && (!sctp_is_addr_pending(stcb, sifa)))) { /* * It is restricted for some * reason.. probably not yet added. */ sifa = NULL; continue; } } goto out; } } #ifdef INET if (stcb) { if ((retried == 0) && (stcb->asoc.scope.ipv4_local_scope == 0)) { stcb->asoc.scope.ipv4_local_scope = 1; retried = 1; goto again_with_private_addresses_allowed; } else if (retried == 1) { stcb->asoc.scope.ipv4_local_scope = 0; } } #endif out: #ifdef INET if (sifa) { if (retried == 1) { LIST_FOREACH(sctp_ifn, &vrf->ifnlist, next_ifn) { if (dest_is_loop == 0 && SCTP_IFN_IS_IFT_LOOP(sctp_ifn)) { /* wrong base scope */ continue; } LIST_FOREACH(sctp_ifa, &sctp_ifn->ifalist, next_ifa) { struct sctp_ifa *tmp_sifa; #ifdef INET if ((sctp_ifa->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((sctp_ifa->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin6.sin6_addr) != 0)) { continue; } #endif if ((sctp_ifa->localifa_flags & SCTP_ADDR_DEFER_USE) && (non_asoc_addr_ok == 0)) continue; tmp_sifa = sctp_is_ifa_addr_acceptable(sctp_ifa, dest_is_loop, dest_is_priv, fam); if (tmp_sifa == NULL) { continue; } if (tmp_sifa == sifa) { continue; } if (stcb) { if (sctp_is_address_in_scope(tmp_sifa, &stcb->asoc.scope, 0) == 0) { continue; } if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, tmp_sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, tmp_sifa)) && (!sctp_is_addr_pending(stcb, tmp_sifa)))) { /* * It is restricted * for some reason.. * probably not yet * added. */ continue; } } if ((tmp_sifa->address.sin.sin_family == AF_INET) && (IN4_ISPRIVATE_ADDRESS(&(tmp_sifa->address.sin.sin_addr)))) { sctp_add_local_addr_restricted(stcb, tmp_sifa); } } } } atomic_add_int(&sifa->refcount, 1); } #endif return (sifa); } /* tcb may be NULL */ struct sctp_ifa * sctp_source_address_selection(struct sctp_inpcb *inp, struct sctp_tcb *stcb, sctp_route_t * ro, struct sctp_nets *net, int non_asoc_addr_ok, uint32_t vrf_id) { struct sctp_ifa *answer; uint8_t dest_is_priv, dest_is_loop; sa_family_t fam; #ifdef INET struct sockaddr_in *to = (struct sockaddr_in *)&ro->ro_dst; #endif #ifdef INET6 struct sockaddr_in6 *to6 = (struct sockaddr_in6 *)&ro->ro_dst; #endif /** * Rules: * - Find the route if needed, cache if I can. * - Look at interface address in route, Is it in the bound list. If so we * have the best source. * - If not we must rotate amongst the addresses. * * Cavets and issues * * Do we need to pay attention to scope. We can have a private address * or a global address we are sourcing or sending to. So if we draw * it out * zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz * For V4 * ------------------------------------------ * source * dest * result * ----------------------------------------- * Private * Global * NAT * ----------------------------------------- * Private * Private * No problem * ----------------------------------------- * Global * Private * Huh, How will this work? * ----------------------------------------- * Global * Global * No Problem *------------------------------------------ * zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz * For V6 *------------------------------------------ * source * dest * result * ----------------------------------------- * Linklocal * Global * * ----------------------------------------- * Linklocal * Linklocal * No problem * ----------------------------------------- * Global * Linklocal * Huh, How will this work? * ----------------------------------------- * Global * Global * No Problem *------------------------------------------ * zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz * * And then we add to that what happens if there are multiple addresses * assigned to an interface. Remember the ifa on a ifn is a linked * list of addresses. So one interface can have more than one IP * address. What happens if we have both a private and a global * address? Do we then use context of destination to sort out which * one is best? And what about NAT's sending P->G may get you a NAT * translation, or should you select the G thats on the interface in * preference. * * Decisions: * * - count the number of addresses on the interface. * - if it is one, no problem except case . * For we will assume a NAT out there. * - if there are more than one, then we need to worry about scope P * or G. We should prefer G -> G and P -> P if possible. * Then as a secondary fall back to mixed types G->P being a last * ditch one. * - The above all works for bound all, but bound specific we need to * use the same concept but instead only consider the bound * addresses. If the bound set is NOT assigned to the interface then * we must use rotation amongst the bound addresses.. */ if (ro->ro_rt == NULL) { /* * Need a route to cache. */ SCTP_RTALLOC(ro, vrf_id, inp->fibnum); } if (ro->ro_rt == NULL) { return (NULL); } fam = ro->ro_dst.sa_family; dest_is_priv = dest_is_loop = 0; /* Setup our scopes for the destination */ switch (fam) { #ifdef INET case AF_INET: /* Scope based on outbound address */ if (IN4_ISLOOPBACK_ADDRESS(&to->sin_addr)) { dest_is_loop = 1; if (net != NULL) { /* mark it as local */ net->addr_is_local = 1; } } else if ((IN4_ISPRIVATE_ADDRESS(&to->sin_addr))) { dest_is_priv = 1; } break; #endif #ifdef INET6 case AF_INET6: /* Scope based on outbound address */ if (IN6_IS_ADDR_LOOPBACK(&to6->sin6_addr) || SCTP_ROUTE_IS_REAL_LOOP(ro)) { /* * If the address is a loopback address, which * consists of "::1" OR "fe80::1%lo0", we are * loopback scope. But we don't use dest_is_priv * (link local addresses). */ dest_is_loop = 1; if (net != NULL) { /* mark it as local */ net->addr_is_local = 1; } } else if (IN6_IS_ADDR_LINKLOCAL(&to6->sin6_addr)) { dest_is_priv = 1; } break; #endif } SCTPDBG(SCTP_DEBUG_OUTPUT2, "Select source addr for:"); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, (struct sockaddr *)&ro->ro_dst); SCTP_IPI_ADDR_RLOCK(); if (inp->sctp_flags & SCTP_PCB_FLAGS_BOUNDALL) { /* * Bound all case */ answer = sctp_choose_boundall(inp, stcb, net, ro, vrf_id, dest_is_priv, dest_is_loop, non_asoc_addr_ok, fam); SCTP_IPI_ADDR_RUNLOCK(); return (answer); } /* * Subset bound case */ if (stcb) { answer = sctp_choose_boundspecific_stcb(inp, stcb, ro, vrf_id, dest_is_priv, dest_is_loop, non_asoc_addr_ok, fam); } else { answer = sctp_choose_boundspecific_inp(inp, ro, vrf_id, non_asoc_addr_ok, dest_is_priv, dest_is_loop, fam); } SCTP_IPI_ADDR_RUNLOCK(); return (answer); } static int sctp_find_cmsg(int c_type, void *data, struct mbuf *control, size_t cpsize) { struct cmsghdr cmh; int tlen, at, found; struct sctp_sndinfo sndinfo; struct sctp_prinfo prinfo; struct sctp_authinfo authinfo; tlen = SCTP_BUF_LEN(control); at = 0; found = 0; /* * Independent of how many mbufs, find the c_type inside the control * structure and copy out the data. */ while (at < tlen) { if ((tlen - at) < (int)CMSG_ALIGN(sizeof(cmh))) { /* There is not enough room for one more. */ return (found); } m_copydata(control, at, sizeof(cmh), (caddr_t)&cmh); if (cmh.cmsg_len < CMSG_ALIGN(sizeof(cmh))) { /* We dont't have a complete CMSG header. */ return (found); } if (((int)cmh.cmsg_len + at) > tlen) { /* We don't have the complete CMSG. */ return (found); } if ((cmh.cmsg_level == IPPROTO_SCTP) && ((c_type == cmh.cmsg_type) || ((c_type == SCTP_SNDRCV) && ((cmh.cmsg_type == SCTP_SNDINFO) || (cmh.cmsg_type == SCTP_PRINFO) || (cmh.cmsg_type == SCTP_AUTHINFO))))) { if (c_type == cmh.cmsg_type) { if ((size_t)(cmh.cmsg_len - CMSG_ALIGN(sizeof(cmh))) < cpsize) { return (found); } /* It is exactly what we want. Copy it out. */ m_copydata(control, at + CMSG_ALIGN(sizeof(cmh)), (int)cpsize, (caddr_t)data); return (1); } else { struct sctp_sndrcvinfo *sndrcvinfo; sndrcvinfo = (struct sctp_sndrcvinfo *)data; if (found == 0) { if (cpsize < sizeof(struct sctp_sndrcvinfo)) { return (found); } memset(sndrcvinfo, 0, sizeof(struct sctp_sndrcvinfo)); } switch (cmh.cmsg_type) { case SCTP_SNDINFO: if ((size_t)(cmh.cmsg_len - CMSG_ALIGN(sizeof(cmh))) < sizeof(struct sctp_sndinfo)) { return (found); } m_copydata(control, at + CMSG_ALIGN(sizeof(cmh)), sizeof(struct sctp_sndinfo), (caddr_t)&sndinfo); sndrcvinfo->sinfo_stream = sndinfo.snd_sid; sndrcvinfo->sinfo_flags = sndinfo.snd_flags; sndrcvinfo->sinfo_ppid = sndinfo.snd_ppid; sndrcvinfo->sinfo_context = sndinfo.snd_context; sndrcvinfo->sinfo_assoc_id = sndinfo.snd_assoc_id; break; case SCTP_PRINFO: if ((size_t)(cmh.cmsg_len - CMSG_ALIGN(sizeof(cmh))) < sizeof(struct sctp_prinfo)) { return (found); } m_copydata(control, at + CMSG_ALIGN(sizeof(cmh)), sizeof(struct sctp_prinfo), (caddr_t)&prinfo); if (prinfo.pr_policy != SCTP_PR_SCTP_NONE) { sndrcvinfo->sinfo_timetolive = prinfo.pr_value; } else { sndrcvinfo->sinfo_timetolive = 0; } sndrcvinfo->sinfo_flags |= prinfo.pr_policy; break; case SCTP_AUTHINFO: if ((size_t)(cmh.cmsg_len - CMSG_ALIGN(sizeof(cmh))) < sizeof(struct sctp_authinfo)) { return (found); } m_copydata(control, at + CMSG_ALIGN(sizeof(cmh)), sizeof(struct sctp_authinfo), (caddr_t)&authinfo); sndrcvinfo->sinfo_keynumber_valid = 1; sndrcvinfo->sinfo_keynumber = authinfo.auth_keynumber; break; default: return (found); } found = 1; } } at += CMSG_ALIGN(cmh.cmsg_len); } return (found); } static int sctp_process_cmsgs_for_init(struct sctp_tcb *stcb, struct mbuf *control, int *error) { struct cmsghdr cmh; int tlen, at; struct sctp_initmsg initmsg; #ifdef INET struct sockaddr_in sin; #endif #ifdef INET6 struct sockaddr_in6 sin6; #endif tlen = SCTP_BUF_LEN(control); at = 0; while (at < tlen) { if ((tlen - at) < (int)CMSG_ALIGN(sizeof(cmh))) { /* There is not enough room for one more. */ *error = EINVAL; return (1); } m_copydata(control, at, sizeof(cmh), (caddr_t)&cmh); if (cmh.cmsg_len < CMSG_ALIGN(sizeof(cmh))) { /* We dont't have a complete CMSG header. */ *error = EINVAL; return (1); } if (((int)cmh.cmsg_len + at) > tlen) { /* We don't have the complete CMSG. */ *error = EINVAL; return (1); } if (cmh.cmsg_level == IPPROTO_SCTP) { switch (cmh.cmsg_type) { case SCTP_INIT: if ((size_t)(cmh.cmsg_len - CMSG_ALIGN(sizeof(cmh))) < sizeof(struct sctp_initmsg)) { *error = EINVAL; return (1); } m_copydata(control, at + CMSG_ALIGN(sizeof(cmh)), sizeof(struct sctp_initmsg), (caddr_t)&initmsg); if (initmsg.sinit_max_attempts) stcb->asoc.max_init_times = initmsg.sinit_max_attempts; if (initmsg.sinit_num_ostreams) stcb->asoc.pre_open_streams = initmsg.sinit_num_ostreams; if (initmsg.sinit_max_instreams) stcb->asoc.max_inbound_streams = initmsg.sinit_max_instreams; if (initmsg.sinit_max_init_timeo) stcb->asoc.initial_init_rto_max = initmsg.sinit_max_init_timeo; if (stcb->asoc.streamoutcnt < stcb->asoc.pre_open_streams) { struct sctp_stream_out *tmp_str; unsigned int i; #if defined(SCTP_DETAILED_STR_STATS) int j; #endif /* Default is NOT correct */ SCTPDBG(SCTP_DEBUG_OUTPUT1, "Ok, default:%d pre_open:%d\n", stcb->asoc.streamoutcnt, stcb->asoc.pre_open_streams); SCTP_TCB_UNLOCK(stcb); SCTP_MALLOC(tmp_str, struct sctp_stream_out *, (stcb->asoc.pre_open_streams * sizeof(struct sctp_stream_out)), SCTP_M_STRMO); SCTP_TCB_LOCK(stcb); if (tmp_str != NULL) { SCTP_FREE(stcb->asoc.strmout, SCTP_M_STRMO); stcb->asoc.strmout = tmp_str; stcb->asoc.strm_realoutsize = stcb->asoc.streamoutcnt = stcb->asoc.pre_open_streams; } else { stcb->asoc.pre_open_streams = stcb->asoc.streamoutcnt; } for (i = 0; i < stcb->asoc.streamoutcnt; i++) { TAILQ_INIT(&stcb->asoc.strmout[i].outqueue); stcb->asoc.strmout[i].chunks_on_queues = 0; stcb->asoc.strmout[i].next_mid_ordered = 0; stcb->asoc.strmout[i].next_mid_unordered = 0; #if defined(SCTP_DETAILED_STR_STATS) for (j = 0; j < SCTP_PR_SCTP_MAX + 1; j++) { stcb->asoc.strmout[i].abandoned_sent[j] = 0; stcb->asoc.strmout[i].abandoned_unsent[j] = 0; } #else stcb->asoc.strmout[i].abandoned_sent[0] = 0; stcb->asoc.strmout[i].abandoned_unsent[0] = 0; #endif stcb->asoc.strmout[i].stream_no = i; stcb->asoc.strmout[i].last_msg_incomplete = 0; stcb->asoc.strmout[i].state = SCTP_STREAM_OPENING; stcb->asoc.ss_functions.sctp_ss_init_stream(&stcb->asoc.strmout[i], NULL); } } break; #ifdef INET case SCTP_DSTADDRV4: if ((size_t)(cmh.cmsg_len - CMSG_ALIGN(sizeof(cmh))) < sizeof(struct in_addr)) { *error = EINVAL; return (1); } memset(&sin, 0, sizeof(struct sockaddr_in)); sin.sin_family = AF_INET; sin.sin_len = sizeof(struct sockaddr_in); sin.sin_port = stcb->rport; m_copydata(control, at + CMSG_ALIGN(sizeof(cmh)), sizeof(struct in_addr), (caddr_t)&sin.sin_addr); if ((sin.sin_addr.s_addr == INADDR_ANY) || (sin.sin_addr.s_addr == INADDR_BROADCAST) || IN_MULTICAST(ntohl(sin.sin_addr.s_addr))) { *error = EINVAL; return (1); } if (sctp_add_remote_addr(stcb, (struct sockaddr *)&sin, NULL, stcb->asoc.port, SCTP_DONOT_SETSCOPE, SCTP_ADDR_IS_CONFIRMED)) { *error = ENOBUFS; return (1); } break; #endif #ifdef INET6 case SCTP_DSTADDRV6: if ((size_t)(cmh.cmsg_len - CMSG_ALIGN(sizeof(cmh))) < sizeof(struct in6_addr)) { *error = EINVAL; return (1); } memset(&sin6, 0, sizeof(struct sockaddr_in6)); sin6.sin6_family = AF_INET6; sin6.sin6_len = sizeof(struct sockaddr_in6); sin6.sin6_port = stcb->rport; m_copydata(control, at + CMSG_ALIGN(sizeof(cmh)), sizeof(struct in6_addr), (caddr_t)&sin6.sin6_addr); if (IN6_IS_ADDR_UNSPECIFIED(&sin6.sin6_addr) || IN6_IS_ADDR_MULTICAST(&sin6.sin6_addr)) { *error = EINVAL; return (1); } #ifdef INET if (IN6_IS_ADDR_V4MAPPED(&sin6.sin6_addr)) { in6_sin6_2_sin(&sin, &sin6); if ((sin.sin_addr.s_addr == INADDR_ANY) || (sin.sin_addr.s_addr == INADDR_BROADCAST) || IN_MULTICAST(ntohl(sin.sin_addr.s_addr))) { *error = EINVAL; return (1); } if (sctp_add_remote_addr(stcb, (struct sockaddr *)&sin, NULL, stcb->asoc.port, SCTP_DONOT_SETSCOPE, SCTP_ADDR_IS_CONFIRMED)) { *error = ENOBUFS; return (1); } } else #endif if (sctp_add_remote_addr(stcb, (struct sockaddr *)&sin6, NULL, stcb->asoc.port, SCTP_DONOT_SETSCOPE, SCTP_ADDR_IS_CONFIRMED)) { *error = ENOBUFS; return (1); } break; #endif default: break; } } at += CMSG_ALIGN(cmh.cmsg_len); } return (0); } static struct sctp_tcb * sctp_findassociation_cmsgs(struct sctp_inpcb **inp_p, uint16_t port, struct mbuf *control, struct sctp_nets **net_p, int *error) { struct cmsghdr cmh; int tlen, at; struct sctp_tcb *stcb; struct sockaddr *addr; #ifdef INET struct sockaddr_in sin; #endif #ifdef INET6 struct sockaddr_in6 sin6; #endif tlen = SCTP_BUF_LEN(control); at = 0; while (at < tlen) { if ((tlen - at) < (int)CMSG_ALIGN(sizeof(cmh))) { /* There is not enough room for one more. */ *error = EINVAL; return (NULL); } m_copydata(control, at, sizeof(cmh), (caddr_t)&cmh); if (cmh.cmsg_len < CMSG_ALIGN(sizeof(cmh))) { /* We dont't have a complete CMSG header. */ *error = EINVAL; return (NULL); } if (((int)cmh.cmsg_len + at) > tlen) { /* We don't have the complete CMSG. */ *error = EINVAL; return (NULL); } if (cmh.cmsg_level == IPPROTO_SCTP) { switch (cmh.cmsg_type) { #ifdef INET case SCTP_DSTADDRV4: if ((size_t)(cmh.cmsg_len - CMSG_ALIGN(sizeof(cmh))) < sizeof(struct in_addr)) { *error = EINVAL; return (NULL); } memset(&sin, 0, sizeof(struct sockaddr_in)); sin.sin_family = AF_INET; sin.sin_len = sizeof(struct sockaddr_in); sin.sin_port = port; m_copydata(control, at + CMSG_ALIGN(sizeof(cmh)), sizeof(struct in_addr), (caddr_t)&sin.sin_addr); addr = (struct sockaddr *)&sin; break; #endif #ifdef INET6 case SCTP_DSTADDRV6: if ((size_t)(cmh.cmsg_len - CMSG_ALIGN(sizeof(cmh))) < sizeof(struct in6_addr)) { *error = EINVAL; return (NULL); } memset(&sin6, 0, sizeof(struct sockaddr_in6)); sin6.sin6_family = AF_INET6; sin6.sin6_len = sizeof(struct sockaddr_in6); sin6.sin6_port = port; m_copydata(control, at + CMSG_ALIGN(sizeof(cmh)), sizeof(struct in6_addr), (caddr_t)&sin6.sin6_addr); #ifdef INET if (IN6_IS_ADDR_V4MAPPED(&sin6.sin6_addr)) { in6_sin6_2_sin(&sin, &sin6); addr = (struct sockaddr *)&sin; } else #endif addr = (struct sockaddr *)&sin6; break; #endif default: addr = NULL; break; } if (addr) { stcb = sctp_findassociation_ep_addr(inp_p, addr, net_p, NULL, NULL); if (stcb != NULL) { return (stcb); } } } at += CMSG_ALIGN(cmh.cmsg_len); } return (NULL); } static struct mbuf * sctp_add_cookie(struct mbuf *init, int init_offset, struct mbuf *initack, int initack_offset, struct sctp_state_cookie *stc_in, uint8_t ** signature) { struct mbuf *copy_init, *copy_initack, *m_at, *sig, *mret; struct sctp_state_cookie *stc; struct sctp_paramhdr *ph; uint8_t *foo; int sig_offset; uint16_t cookie_sz; mret = sctp_get_mbuf_for_msg((sizeof(struct sctp_state_cookie) + sizeof(struct sctp_paramhdr)), 0, M_NOWAIT, 1, MT_DATA); if (mret == NULL) { return (NULL); } copy_init = SCTP_M_COPYM(init, init_offset, M_COPYALL, M_NOWAIT); if (copy_init == NULL) { sctp_m_freem(mret); return (NULL); } #ifdef SCTP_MBUF_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_MBUF_LOGGING_ENABLE) { sctp_log_mbc(copy_init, SCTP_MBUF_ICOPY); } #endif copy_initack = SCTP_M_COPYM(initack, initack_offset, M_COPYALL, M_NOWAIT); if (copy_initack == NULL) { sctp_m_freem(mret); sctp_m_freem(copy_init); return (NULL); } #ifdef SCTP_MBUF_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_MBUF_LOGGING_ENABLE) { sctp_log_mbc(copy_initack, SCTP_MBUF_ICOPY); } #endif /* easy side we just drop it on the end */ ph = mtod(mret, struct sctp_paramhdr *); SCTP_BUF_LEN(mret) = sizeof(struct sctp_state_cookie) + sizeof(struct sctp_paramhdr); stc = (struct sctp_state_cookie *)((caddr_t)ph + sizeof(struct sctp_paramhdr)); ph->param_type = htons(SCTP_STATE_COOKIE); ph->param_length = 0; /* fill in at the end */ /* Fill in the stc cookie data */ memcpy(stc, stc_in, sizeof(struct sctp_state_cookie)); /* tack the INIT and then the INIT-ACK onto the chain */ cookie_sz = 0; for (m_at = mret; m_at; m_at = SCTP_BUF_NEXT(m_at)) { cookie_sz += SCTP_BUF_LEN(m_at); if (SCTP_BUF_NEXT(m_at) == NULL) { SCTP_BUF_NEXT(m_at) = copy_init; break; } } for (m_at = copy_init; m_at; m_at = SCTP_BUF_NEXT(m_at)) { cookie_sz += SCTP_BUF_LEN(m_at); if (SCTP_BUF_NEXT(m_at) == NULL) { SCTP_BUF_NEXT(m_at) = copy_initack; break; } } for (m_at = copy_initack; m_at; m_at = SCTP_BUF_NEXT(m_at)) { cookie_sz += SCTP_BUF_LEN(m_at); if (SCTP_BUF_NEXT(m_at) == NULL) { break; } } sig = sctp_get_mbuf_for_msg(SCTP_SECRET_SIZE, 0, M_NOWAIT, 1, MT_DATA); if (sig == NULL) { /* no space, so free the entire chain */ sctp_m_freem(mret); return (NULL); } SCTP_BUF_LEN(sig) = 0; SCTP_BUF_NEXT(m_at) = sig; sig_offset = 0; foo = (uint8_t *) (mtod(sig, caddr_t)+sig_offset); memset(foo, 0, SCTP_SIGNATURE_SIZE); *signature = foo; SCTP_BUF_LEN(sig) += SCTP_SIGNATURE_SIZE; cookie_sz += SCTP_SIGNATURE_SIZE; ph->param_length = htons(cookie_sz); return (mret); } static uint8_t sctp_get_ect(struct sctp_tcb *stcb) { if ((stcb != NULL) && (stcb->asoc.ecn_supported == 1)) { return (SCTP_ECT0_BIT); } else { return (0); } } #if defined(INET) || defined(INET6) static void sctp_handle_no_route(struct sctp_tcb *stcb, struct sctp_nets *net, int so_locked) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "dropped packet - no valid source addr\n"); if (net) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Destination was "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT1, &net->ro._l_addr.sa); if (net->dest_state & SCTP_ADDR_CONFIRMED) { if ((net->dest_state & SCTP_ADDR_REACHABLE) && stcb) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "no route takes interface %p down\n", (void *)net); sctp_ulp_notify(SCTP_NOTIFY_INTERFACE_DOWN, stcb, 0, (void *)net, so_locked); net->dest_state &= ~SCTP_ADDR_REACHABLE; net->dest_state &= ~SCTP_ADDR_PF; } } if (stcb) { if (net == stcb->asoc.primary_destination) { /* need a new primary */ struct sctp_nets *alt; alt = sctp_find_alternate_net(stcb, net, 0); if (alt != net) { if (stcb->asoc.alternate) { sctp_free_remote_addr(stcb->asoc.alternate); } stcb->asoc.alternate = alt; atomic_add_int(&stcb->asoc.alternate->ref_count, 1); if (net->ro._s_addr) { sctp_free_ifa(net->ro._s_addr); net->ro._s_addr = NULL; } net->src_addr_selected = 0; } } } } } #endif static int sctp_lowlevel_chunk_output(struct sctp_inpcb *inp, struct sctp_tcb *stcb, /* may be NULL */ struct sctp_nets *net, struct sockaddr *to, struct mbuf *m, uint32_t auth_offset, struct sctp_auth_chunk *auth, uint16_t auth_keyid, int nofragment_flag, int ecn_ok, int out_of_asoc_ok, uint16_t src_port, uint16_t dest_port, uint32_t v_tag, uint16_t port, union sctp_sockstore *over_addr, uint8_t mflowtype, uint32_t mflowid, #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) int so_locked SCTP_UNUSED #else int so_locked #endif ) /* nofragment_flag to tell if IP_DF should be set (IPv4 only) */ { /** * Given a mbuf chain (via SCTP_BUF_NEXT()) that holds a packet header * WITH an SCTPHDR but no IP header, endpoint inp and sa structure: * - fill in the HMAC digest of any AUTH chunk in the packet. * - calculate and fill in the SCTP checksum. * - prepend an IP address header. * - if boundall use INADDR_ANY. * - if boundspecific do source address selection. * - set fragmentation option for ipV4. * - On return from IP output, check/adjust mtu size of output * interface and smallest_mtu size as well. */ /* Will need ifdefs around this */ struct mbuf *newm; struct sctphdr *sctphdr; int packet_length; int ret; #if defined(INET) || defined(INET6) uint32_t vrf_id; #endif #if defined(INET) || defined(INET6) struct mbuf *o_pak; sctp_route_t *ro = NULL; struct udphdr *udp = NULL; #endif uint8_t tos_value; #if defined(__APPLE__) || defined(SCTP_SO_LOCK_TESTING) struct socket *so = NULL; #endif if ((net) && (net->dest_state & SCTP_ADDR_OUT_OF_SCOPE)) { SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EFAULT); sctp_m_freem(m); return (EFAULT); } #if defined(INET) || defined(INET6) if (stcb) { vrf_id = stcb->asoc.vrf_id; } else { vrf_id = inp->def_vrf_id; } #endif /* fill in the HMAC digest for any AUTH chunk in the packet */ if ((auth != NULL) && (stcb != NULL)) { sctp_fill_hmac_digest_m(m, auth_offset, auth, stcb, auth_keyid); } if (net) { tos_value = net->dscp; } else if (stcb) { tos_value = stcb->asoc.default_dscp; } else { tos_value = inp->sctp_ep.default_dscp; } switch (to->sa_family) { #ifdef INET case AF_INET: { struct ip *ip = NULL; sctp_route_t iproute; int len; len = SCTP_MIN_V4_OVERHEAD; if (port) { len += sizeof(struct udphdr); } newm = sctp_get_mbuf_for_msg(len, 1, M_NOWAIT, 1, MT_DATA); if (newm == NULL) { sctp_m_freem(m); SCTP_LTRACE_ERR_RET(inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } SCTP_ALIGN_TO_END(newm, len); SCTP_BUF_LEN(newm) = len; SCTP_BUF_NEXT(newm) = m; m = newm; if (net != NULL) { m->m_pkthdr.flowid = net->flowid; M_HASHTYPE_SET(m, net->flowtype); } else { m->m_pkthdr.flowid = mflowid; M_HASHTYPE_SET(m, mflowtype); } packet_length = sctp_calculate_len(m); ip = mtod(m, struct ip *); ip->ip_v = IPVERSION; ip->ip_hl = (sizeof(struct ip) >> 2); if (tos_value == 0) { /* * This means especially, that it is not set * at the SCTP layer. So use the value from * the IP layer. */ tos_value = inp->ip_inp.inp.inp_ip_tos; } tos_value &= 0xfc; if (ecn_ok) { tos_value |= sctp_get_ect(stcb); } if ((nofragment_flag) && (port == 0)) { ip->ip_off = htons(IP_DF); } else { ip->ip_off = htons(0); } /* FreeBSD has a function for ip_id's */ ip_fillid(ip); ip->ip_ttl = inp->ip_inp.inp.inp_ip_ttl; ip->ip_len = htons(packet_length); ip->ip_tos = tos_value; if (port) { ip->ip_p = IPPROTO_UDP; } else { ip->ip_p = IPPROTO_SCTP; } ip->ip_sum = 0; if (net == NULL) { ro = &iproute; memset(&iproute, 0, sizeof(iproute)); memcpy(&ro->ro_dst, to, to->sa_len); } else { ro = (sctp_route_t *) & net->ro; } /* Now the address selection part */ ip->ip_dst.s_addr = ((struct sockaddr_in *)to)->sin_addr.s_addr; /* call the routine to select the src address */ if (net && out_of_asoc_ok == 0) { if (net->ro._s_addr && (net->ro._s_addr->localifa_flags & (SCTP_BEING_DELETED | SCTP_ADDR_IFA_UNUSEABLE))) { sctp_free_ifa(net->ro._s_addr); net->ro._s_addr = NULL; net->src_addr_selected = 0; if (ro->ro_rt) { RTFREE(ro->ro_rt); ro->ro_rt = NULL; } } if (net->src_addr_selected == 0) { /* Cache the source address */ net->ro._s_addr = sctp_source_address_selection(inp, stcb, ro, net, 0, vrf_id); net->src_addr_selected = 1; } if (net->ro._s_addr == NULL) { /* No route to host */ net->src_addr_selected = 0; sctp_handle_no_route(stcb, net, so_locked); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EHOSTUNREACH); sctp_m_freem(m); return (EHOSTUNREACH); } ip->ip_src = net->ro._s_addr->address.sin.sin_addr; } else { if (over_addr == NULL) { struct sctp_ifa *_lsrc; _lsrc = sctp_source_address_selection(inp, stcb, ro, net, out_of_asoc_ok, vrf_id); if (_lsrc == NULL) { sctp_handle_no_route(stcb, net, so_locked); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EHOSTUNREACH); sctp_m_freem(m); return (EHOSTUNREACH); } ip->ip_src = _lsrc->address.sin.sin_addr; sctp_free_ifa(_lsrc); } else { ip->ip_src = over_addr->sin.sin_addr; SCTP_RTALLOC(ro, vrf_id, inp->fibnum); } } if (port) { if (htons(SCTP_BASE_SYSCTL(sctp_udp_tunneling_port)) == 0) { sctp_handle_no_route(stcb, net, so_locked); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EHOSTUNREACH); sctp_m_freem(m); return (EHOSTUNREACH); } udp = (struct udphdr *)((caddr_t)ip + sizeof(struct ip)); udp->uh_sport = htons(SCTP_BASE_SYSCTL(sctp_udp_tunneling_port)); udp->uh_dport = port; udp->uh_ulen = htons((uint16_t) (packet_length - sizeof(struct ip))); if (V_udp_cksum) { udp->uh_sum = in_pseudo(ip->ip_src.s_addr, ip->ip_dst.s_addr, udp->uh_ulen + htons(IPPROTO_UDP)); } else { udp->uh_sum = 0; } sctphdr = (struct sctphdr *)((caddr_t)udp + sizeof(struct udphdr)); } else { sctphdr = (struct sctphdr *)((caddr_t)ip + sizeof(struct ip)); } sctphdr->src_port = src_port; sctphdr->dest_port = dest_port; sctphdr->v_tag = v_tag; sctphdr->checksum = 0; /* * If source address selection fails and we find no * route then the ip_output should fail as well with * a NO_ROUTE_TO_HOST type error. We probably should * catch that somewhere and abort the association * right away (assuming this is an INIT being sent). */ if (ro->ro_rt == NULL) { /* * src addr selection failed to find a route * (or valid source addr), so we can't get * there from here (yet)! */ sctp_handle_no_route(stcb, net, so_locked); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EHOSTUNREACH); sctp_m_freem(m); return (EHOSTUNREACH); } if (ro != &iproute) { memcpy(&iproute, ro, sizeof(*ro)); } SCTPDBG(SCTP_DEBUG_OUTPUT3, "Calling ipv4 output routine from low level src addr:%x\n", (uint32_t) (ntohl(ip->ip_src.s_addr))); SCTPDBG(SCTP_DEBUG_OUTPUT3, "Destination is %x\n", (uint32_t) (ntohl(ip->ip_dst.s_addr))); SCTPDBG(SCTP_DEBUG_OUTPUT3, "RTP route is %p through\n", (void *)ro->ro_rt); if (SCTP_GET_HEADER_FOR_OUTPUT(o_pak)) { /* failed to prepend data, give up */ SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); sctp_m_freem(m); return (ENOMEM); } SCTP_ATTACH_CHAIN(o_pak, m, packet_length); if (port) { #if defined(SCTP_WITH_NO_CSUM) SCTP_STAT_INCR(sctps_sendnocrc); #else sctphdr->checksum = sctp_calculate_cksum(m, sizeof(struct ip) + sizeof(struct udphdr)); SCTP_STAT_INCR(sctps_sendswcrc); #endif if (V_udp_cksum) { SCTP_ENABLE_UDP_CSUM(o_pak); } } else { #if defined(SCTP_WITH_NO_CSUM) SCTP_STAT_INCR(sctps_sendnocrc); #else m->m_pkthdr.csum_flags = CSUM_SCTP; m->m_pkthdr.csum_data = offsetof(struct sctphdr, checksum); SCTP_STAT_INCR(sctps_sendhwcrc); #endif } #ifdef SCTP_PACKET_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LAST_PACKET_TRACING) sctp_packet_log(o_pak); #endif /* send it out. table id is taken from stcb */ #if defined(__APPLE__) || defined(SCTP_SO_LOCK_TESTING) if ((SCTP_BASE_SYSCTL(sctp_output_unlocked)) && (so_locked)) { so = SCTP_INP_SO(inp); SCTP_SOCKET_UNLOCK(so, 0); } #endif SCTP_IP_OUTPUT(ret, o_pak, ro, stcb, vrf_id); #if defined(__APPLE__) || defined(SCTP_SO_LOCK_TESTING) if ((SCTP_BASE_SYSCTL(sctp_output_unlocked)) && (so_locked)) { atomic_add_int(&stcb->asoc.refcnt, 1); SCTP_TCB_UNLOCK(stcb); SCTP_SOCKET_LOCK(so, 0); SCTP_TCB_LOCK(stcb); atomic_subtract_int(&stcb->asoc.refcnt, 1); } #endif SCTP_STAT_INCR(sctps_sendpackets); SCTP_STAT_INCR_COUNTER64(sctps_outpackets); if (ret) SCTP_STAT_INCR(sctps_senderrors); SCTPDBG(SCTP_DEBUG_OUTPUT3, "IP output returns %d\n", ret); if (net == NULL) { /* free tempy routes */ RO_RTFREE(ro); } else { /* * PMTU check versus smallest asoc MTU goes * here */ if ((ro->ro_rt != NULL) && (net->ro._s_addr)) { uint32_t mtu; mtu = SCTP_GATHER_MTU_FROM_ROUTE(net->ro._s_addr, &net->ro._l_addr.sa, ro->ro_rt); if (net->port) { mtu -= sizeof(struct udphdr); } if (mtu && (stcb->asoc.smallest_mtu > mtu)) { sctp_mtu_size_reset(inp, &stcb->asoc, mtu); net->mtu = mtu; } } else if (ro->ro_rt == NULL) { /* route was freed */ if (net->ro._s_addr && net->src_addr_selected) { sctp_free_ifa(net->ro._s_addr); net->ro._s_addr = NULL; } net->src_addr_selected = 0; } } return (ret); } #endif #ifdef INET6 case AF_INET6: { uint32_t flowlabel, flowinfo; struct ip6_hdr *ip6h; struct route_in6 ip6route; struct ifnet *ifp; struct sockaddr_in6 *sin6, tmp, *lsa6, lsa6_tmp; int prev_scope = 0; struct sockaddr_in6 lsa6_storage; int error; u_short prev_port = 0; int len; if (net) { flowlabel = net->flowlabel; } else if (stcb) { flowlabel = stcb->asoc.default_flowlabel; } else { flowlabel = inp->sctp_ep.default_flowlabel; } if (flowlabel == 0) { /* * This means especially, that it is not set * at the SCTP layer. So use the value from * the IP layer. */ flowlabel = ntohl(((struct in6pcb *)inp)->in6p_flowinfo); } flowlabel &= 0x000fffff; len = SCTP_MIN_OVERHEAD; if (port) { len += sizeof(struct udphdr); } newm = sctp_get_mbuf_for_msg(len, 1, M_NOWAIT, 1, MT_DATA); if (newm == NULL) { sctp_m_freem(m); SCTP_LTRACE_ERR_RET(inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } SCTP_ALIGN_TO_END(newm, len); SCTP_BUF_LEN(newm) = len; SCTP_BUF_NEXT(newm) = m; m = newm; if (net != NULL) { m->m_pkthdr.flowid = net->flowid; M_HASHTYPE_SET(m, net->flowtype); } else { m->m_pkthdr.flowid = mflowid; M_HASHTYPE_SET(m, mflowtype); } packet_length = sctp_calculate_len(m); ip6h = mtod(m, struct ip6_hdr *); /* protect *sin6 from overwrite */ sin6 = (struct sockaddr_in6 *)to; tmp = *sin6; sin6 = &tmp; /* KAME hack: embed scopeid */ if (sa6_embedscope(sin6, MODULE_GLOBAL(ip6_use_defzone)) != 0) { SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); return (EINVAL); } if (net == NULL) { memset(&ip6route, 0, sizeof(ip6route)); ro = (sctp_route_t *) & ip6route; memcpy(&ro->ro_dst, sin6, sin6->sin6_len); } else { ro = (sctp_route_t *) & net->ro; } /* * We assume here that inp_flow is in host byte * order within the TCB! */ if (tos_value == 0) { /* * This means especially, that it is not set * at the SCTP layer. So use the value from * the IP layer. */ tos_value = (ntohl(((struct in6pcb *)inp)->in6p_flowinfo) >> 20) & 0xff; } tos_value &= 0xfc; if (ecn_ok) { tos_value |= sctp_get_ect(stcb); } flowinfo = 0x06; flowinfo <<= 8; flowinfo |= tos_value; flowinfo <<= 20; flowinfo |= flowlabel; ip6h->ip6_flow = htonl(flowinfo); if (port) { ip6h->ip6_nxt = IPPROTO_UDP; } else { ip6h->ip6_nxt = IPPROTO_SCTP; } ip6h->ip6_plen = (uint16_t) (packet_length - sizeof(struct ip6_hdr)); ip6h->ip6_dst = sin6->sin6_addr; /* * Add SRC address selection here: we can only reuse * to a limited degree the kame src-addr-sel, since * we can try their selection but it may not be * bound. */ bzero(&lsa6_tmp, sizeof(lsa6_tmp)); lsa6_tmp.sin6_family = AF_INET6; lsa6_tmp.sin6_len = sizeof(lsa6_tmp); lsa6 = &lsa6_tmp; if (net && out_of_asoc_ok == 0) { if (net->ro._s_addr && (net->ro._s_addr->localifa_flags & (SCTP_BEING_DELETED | SCTP_ADDR_IFA_UNUSEABLE))) { sctp_free_ifa(net->ro._s_addr); net->ro._s_addr = NULL; net->src_addr_selected = 0; if (ro->ro_rt) { RTFREE(ro->ro_rt); ro->ro_rt = NULL; } } if (net->src_addr_selected == 0) { sin6 = (struct sockaddr_in6 *)&net->ro._l_addr; /* KAME hack: embed scopeid */ if (sa6_embedscope(sin6, MODULE_GLOBAL(ip6_use_defzone)) != 0) { SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); return (EINVAL); } /* Cache the source address */ net->ro._s_addr = sctp_source_address_selection(inp, stcb, ro, net, 0, vrf_id); (void)sa6_recoverscope(sin6); net->src_addr_selected = 1; } if (net->ro._s_addr == NULL) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "V6:No route to host\n"); net->src_addr_selected = 0; sctp_handle_no_route(stcb, net, so_locked); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EHOSTUNREACH); sctp_m_freem(m); return (EHOSTUNREACH); } lsa6->sin6_addr = net->ro._s_addr->address.sin6.sin6_addr; } else { sin6 = (struct sockaddr_in6 *)&ro->ro_dst; /* KAME hack: embed scopeid */ if (sa6_embedscope(sin6, MODULE_GLOBAL(ip6_use_defzone)) != 0) { SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); return (EINVAL); } if (over_addr == NULL) { struct sctp_ifa *_lsrc; _lsrc = sctp_source_address_selection(inp, stcb, ro, net, out_of_asoc_ok, vrf_id); if (_lsrc == NULL) { sctp_handle_no_route(stcb, net, so_locked); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EHOSTUNREACH); sctp_m_freem(m); return (EHOSTUNREACH); } lsa6->sin6_addr = _lsrc->address.sin6.sin6_addr; sctp_free_ifa(_lsrc); } else { lsa6->sin6_addr = over_addr->sin6.sin6_addr; SCTP_RTALLOC(ro, vrf_id, inp->fibnum); } (void)sa6_recoverscope(sin6); } lsa6->sin6_port = inp->sctp_lport; if (ro->ro_rt == NULL) { /* * src addr selection failed to find a route * (or valid source addr), so we can't get * there from here! */ sctp_handle_no_route(stcb, net, so_locked); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EHOSTUNREACH); sctp_m_freem(m); return (EHOSTUNREACH); } /* * XXX: sa6 may not have a valid sin6_scope_id in * the non-SCOPEDROUTING case. */ bzero(&lsa6_storage, sizeof(lsa6_storage)); lsa6_storage.sin6_family = AF_INET6; lsa6_storage.sin6_len = sizeof(lsa6_storage); lsa6_storage.sin6_addr = lsa6->sin6_addr; if ((error = sa6_recoverscope(&lsa6_storage)) != 0) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "recover scope fails error %d\n", error); sctp_m_freem(m); return (error); } /* XXX */ lsa6_storage.sin6_addr = lsa6->sin6_addr; lsa6_storage.sin6_port = inp->sctp_lport; lsa6 = &lsa6_storage; ip6h->ip6_src = lsa6->sin6_addr; if (port) { if (htons(SCTP_BASE_SYSCTL(sctp_udp_tunneling_port)) == 0) { sctp_handle_no_route(stcb, net, so_locked); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EHOSTUNREACH); sctp_m_freem(m); return (EHOSTUNREACH); } udp = (struct udphdr *)((caddr_t)ip6h + sizeof(struct ip6_hdr)); udp->uh_sport = htons(SCTP_BASE_SYSCTL(sctp_udp_tunneling_port)); udp->uh_dport = port; udp->uh_ulen = htons((uint16_t) (packet_length - sizeof(struct ip6_hdr))); udp->uh_sum = 0; sctphdr = (struct sctphdr *)((caddr_t)udp + sizeof(struct udphdr)); } else { sctphdr = (struct sctphdr *)((caddr_t)ip6h + sizeof(struct ip6_hdr)); } sctphdr->src_port = src_port; sctphdr->dest_port = dest_port; sctphdr->v_tag = v_tag; sctphdr->checksum = 0; /* * We set the hop limit now since there is a good * chance that our ro pointer is now filled */ ip6h->ip6_hlim = SCTP_GET_HLIM(inp, ro); ifp = SCTP_GET_IFN_VOID_FROM_ROUTE(ro); #ifdef SCTP_DEBUG /* Copy to be sure something bad is not happening */ sin6->sin6_addr = ip6h->ip6_dst; lsa6->sin6_addr = ip6h->ip6_src; #endif SCTPDBG(SCTP_DEBUG_OUTPUT3, "Calling ipv6 output routine from low level\n"); SCTPDBG(SCTP_DEBUG_OUTPUT3, "src: "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT3, (struct sockaddr *)lsa6); SCTPDBG(SCTP_DEBUG_OUTPUT3, "dst: "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT3, (struct sockaddr *)sin6); if (net) { sin6 = (struct sockaddr_in6 *)&net->ro._l_addr; /* * preserve the port and scope for link * local send */ prev_scope = sin6->sin6_scope_id; prev_port = sin6->sin6_port; } if (SCTP_GET_HEADER_FOR_OUTPUT(o_pak)) { /* failed to prepend data, give up */ sctp_m_freem(m); SCTP_LTRACE_ERR_RET(inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } SCTP_ATTACH_CHAIN(o_pak, m, packet_length); if (port) { #if defined(SCTP_WITH_NO_CSUM) SCTP_STAT_INCR(sctps_sendnocrc); #else sctphdr->checksum = sctp_calculate_cksum(m, sizeof(struct ip6_hdr) + sizeof(struct udphdr)); SCTP_STAT_INCR(sctps_sendswcrc); #endif if ((udp->uh_sum = in6_cksum(o_pak, IPPROTO_UDP, sizeof(struct ip6_hdr), packet_length - sizeof(struct ip6_hdr))) == 0) { udp->uh_sum = 0xffff; } } else { #if defined(SCTP_WITH_NO_CSUM) SCTP_STAT_INCR(sctps_sendnocrc); #else m->m_pkthdr.csum_flags = CSUM_SCTP_IPV6; m->m_pkthdr.csum_data = offsetof(struct sctphdr, checksum); SCTP_STAT_INCR(sctps_sendhwcrc); #endif } /* send it out. table id is taken from stcb */ #if defined(__APPLE__) || defined(SCTP_SO_LOCK_TESTING) if ((SCTP_BASE_SYSCTL(sctp_output_unlocked)) && (so_locked)) { so = SCTP_INP_SO(inp); SCTP_SOCKET_UNLOCK(so, 0); } #endif #ifdef SCTP_PACKET_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LAST_PACKET_TRACING) sctp_packet_log(o_pak); #endif SCTP_IP6_OUTPUT(ret, o_pak, (struct route_in6 *)ro, &ifp, stcb, vrf_id); #if defined(__APPLE__) || defined(SCTP_SO_LOCK_TESTING) if ((SCTP_BASE_SYSCTL(sctp_output_unlocked)) && (so_locked)) { atomic_add_int(&stcb->asoc.refcnt, 1); SCTP_TCB_UNLOCK(stcb); SCTP_SOCKET_LOCK(so, 0); SCTP_TCB_LOCK(stcb); atomic_subtract_int(&stcb->asoc.refcnt, 1); } #endif if (net) { /* for link local this must be done */ sin6->sin6_scope_id = prev_scope; sin6->sin6_port = prev_port; } SCTPDBG(SCTP_DEBUG_OUTPUT3, "return from send is %d\n", ret); SCTP_STAT_INCR(sctps_sendpackets); SCTP_STAT_INCR_COUNTER64(sctps_outpackets); if (ret) { SCTP_STAT_INCR(sctps_senderrors); } if (net == NULL) { /* Now if we had a temp route free it */ RO_RTFREE(ro); } else { /* * PMTU check versus smallest asoc MTU goes * here */ if (ro->ro_rt == NULL) { /* Route was freed */ if (net->ro._s_addr && net->src_addr_selected) { sctp_free_ifa(net->ro._s_addr); net->ro._s_addr = NULL; } net->src_addr_selected = 0; } if ((ro->ro_rt != NULL) && (net->ro._s_addr)) { uint32_t mtu; mtu = SCTP_GATHER_MTU_FROM_ROUTE(net->ro._s_addr, &net->ro._l_addr.sa, ro->ro_rt); if (mtu && (stcb->asoc.smallest_mtu > mtu)) { sctp_mtu_size_reset(inp, &stcb->asoc, mtu); net->mtu = mtu; if (net->port) { net->mtu -= sizeof(struct udphdr); } } } else if (ifp) { if (ND_IFINFO(ifp)->linkmtu && (stcb->asoc.smallest_mtu > ND_IFINFO(ifp)->linkmtu)) { sctp_mtu_size_reset(inp, &stcb->asoc, ND_IFINFO(ifp)->linkmtu); } } } return (ret); } #endif default: SCTPDBG(SCTP_DEBUG_OUTPUT1, "Unknown protocol (TSNH) type %d\n", ((struct sockaddr *)to)->sa_family); sctp_m_freem(m); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EFAULT); return (EFAULT); } } void sctp_send_initiate(struct sctp_inpcb *inp, struct sctp_tcb *stcb, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { struct mbuf *m, *m_last; struct sctp_nets *net; struct sctp_init_chunk *init; struct sctp_supported_addr_param *sup_addr; struct sctp_adaptation_layer_indication *ali; struct sctp_supported_chunk_types_param *pr_supported; struct sctp_paramhdr *ph; int cnt_inits_to = 0; int ret; uint16_t num_ext, chunk_len, padding_len, parameter_len; /* INIT's always go to the primary (and usually ONLY address) */ net = stcb->asoc.primary_destination; if (net == NULL) { net = TAILQ_FIRST(&stcb->asoc.nets); if (net == NULL) { /* TSNH */ return; } /* we confirm any address we send an INIT to */ net->dest_state &= ~SCTP_ADDR_UNCONFIRMED; (void)sctp_set_primary_addr(stcb, NULL, net); } else { /* we confirm any address we send an INIT to */ net->dest_state &= ~SCTP_ADDR_UNCONFIRMED; } SCTPDBG(SCTP_DEBUG_OUTPUT4, "Sending INIT\n"); #ifdef INET6 if (net->ro._l_addr.sa.sa_family == AF_INET6) { /* * special hook, if we are sending to link local it will not * show up in our private address count. */ if (IN6_IS_ADDR_LINKLOCAL(&net->ro._l_addr.sin6.sin6_addr)) cnt_inits_to = 1; } #endif if (SCTP_OS_TIMER_PENDING(&net->rxt_timer.timer)) { /* This case should not happen */ SCTPDBG(SCTP_DEBUG_OUTPUT4, "Sending INIT - failed timer?\n"); return; } /* start the INIT timer */ sctp_timer_start(SCTP_TIMER_TYPE_INIT, inp, stcb, net); m = sctp_get_mbuf_for_msg(MCLBYTES, 1, M_NOWAIT, 1, MT_DATA); if (m == NULL) { /* No memory, INIT timer will re-attempt. */ SCTPDBG(SCTP_DEBUG_OUTPUT4, "Sending INIT - mbuf?\n"); return; } chunk_len = (uint16_t) sizeof(struct sctp_init_chunk); padding_len = 0; /* Now lets put the chunk header in place */ init = mtod(m, struct sctp_init_chunk *); /* now the chunk header */ init->ch.chunk_type = SCTP_INITIATION; init->ch.chunk_flags = 0; /* fill in later from mbuf we build */ init->ch.chunk_length = 0; /* place in my tag */ init->init.initiate_tag = htonl(stcb->asoc.my_vtag); /* set up some of the credits. */ init->init.a_rwnd = htonl(max(inp->sctp_socket ? SCTP_SB_LIMIT_RCV(inp->sctp_socket) : 0, SCTP_MINIMAL_RWND)); init->init.num_outbound_streams = htons(stcb->asoc.pre_open_streams); init->init.num_inbound_streams = htons(stcb->asoc.max_inbound_streams); init->init.initial_tsn = htonl(stcb->asoc.init_seq_number); /* Adaptation layer indication parameter */ if (inp->sctp_ep.adaptation_layer_indicator_provided) { parameter_len = (uint16_t) sizeof(struct sctp_adaptation_layer_indication); ali = (struct sctp_adaptation_layer_indication *)(mtod(m, caddr_t)+chunk_len); ali->ph.param_type = htons(SCTP_ULP_ADAPTATION); ali->ph.param_length = htons(parameter_len); ali->indication = htonl(inp->sctp_ep.adaptation_layer_indicator); chunk_len += parameter_len; } /* ECN parameter */ if (stcb->asoc.ecn_supported == 1) { parameter_len = (uint16_t) sizeof(struct sctp_paramhdr); ph = (struct sctp_paramhdr *)(mtod(m, caddr_t)+chunk_len); ph->param_type = htons(SCTP_ECN_CAPABLE); ph->param_length = htons(parameter_len); chunk_len += parameter_len; } /* PR-SCTP supported parameter */ if (stcb->asoc.prsctp_supported == 1) { parameter_len = (uint16_t) sizeof(struct sctp_paramhdr); ph = (struct sctp_paramhdr *)(mtod(m, caddr_t)+chunk_len); ph->param_type = htons(SCTP_PRSCTP_SUPPORTED); ph->param_length = htons(parameter_len); chunk_len += parameter_len; } /* Add NAT friendly parameter. */ if (SCTP_BASE_SYSCTL(sctp_inits_include_nat_friendly)) { parameter_len = (uint16_t) sizeof(struct sctp_paramhdr); ph = (struct sctp_paramhdr *)(mtod(m, caddr_t)+chunk_len); ph->param_type = htons(SCTP_HAS_NAT_SUPPORT); ph->param_length = htons(parameter_len); chunk_len += parameter_len; } /* And now tell the peer which extensions we support */ num_ext = 0; pr_supported = (struct sctp_supported_chunk_types_param *)(mtod(m, caddr_t)+chunk_len); if (stcb->asoc.prsctp_supported == 1) { pr_supported->chunk_types[num_ext++] = SCTP_FORWARD_CUM_TSN; if (stcb->asoc.idata_supported) { pr_supported->chunk_types[num_ext++] = SCTP_IFORWARD_CUM_TSN; } } if (stcb->asoc.auth_supported == 1) { pr_supported->chunk_types[num_ext++] = SCTP_AUTHENTICATION; } if (stcb->asoc.asconf_supported == 1) { pr_supported->chunk_types[num_ext++] = SCTP_ASCONF; pr_supported->chunk_types[num_ext++] = SCTP_ASCONF_ACK; } if (stcb->asoc.reconfig_supported == 1) { pr_supported->chunk_types[num_ext++] = SCTP_STREAM_RESET; } if (stcb->asoc.idata_supported) { pr_supported->chunk_types[num_ext++] = SCTP_IDATA; } if (stcb->asoc.nrsack_supported == 1) { pr_supported->chunk_types[num_ext++] = SCTP_NR_SELECTIVE_ACK; } if (stcb->asoc.pktdrop_supported == 1) { pr_supported->chunk_types[num_ext++] = SCTP_PACKET_DROPPED; } if (num_ext > 0) { parameter_len = (uint16_t) sizeof(struct sctp_supported_chunk_types_param) + num_ext; pr_supported->ph.param_type = htons(SCTP_SUPPORTED_CHUNK_EXT); pr_supported->ph.param_length = htons(parameter_len); padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; } /* add authentication parameters */ if (stcb->asoc.auth_supported) { /* attach RANDOM parameter, if available */ if (stcb->asoc.authinfo.random != NULL) { struct sctp_auth_random *randp; if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; padding_len = 0; } randp = (struct sctp_auth_random *)(mtod(m, caddr_t)+chunk_len); parameter_len = (uint16_t) sizeof(struct sctp_auth_random) + stcb->asoc.authinfo.random_len; /* random key already contains the header */ memcpy(randp, stcb->asoc.authinfo.random->key, parameter_len); padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; } /* add HMAC_ALGO parameter */ if (stcb->asoc.local_hmacs != NULL) { struct sctp_auth_hmac_algo *hmacs; if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; padding_len = 0; } hmacs = (struct sctp_auth_hmac_algo *)(mtod(m, caddr_t)+chunk_len); parameter_len = (uint16_t) (sizeof(struct sctp_auth_hmac_algo) + stcb->asoc.local_hmacs->num_algo * sizeof(uint16_t)); hmacs->ph.param_type = htons(SCTP_HMAC_LIST); hmacs->ph.param_length = htons(parameter_len); sctp_serialize_hmaclist(stcb->asoc.local_hmacs, (uint8_t *) hmacs->hmac_ids); padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; } /* add CHUNKS parameter */ if (stcb->asoc.local_auth_chunks != NULL) { struct sctp_auth_chunk_list *chunks; if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; padding_len = 0; } chunks = (struct sctp_auth_chunk_list *)(mtod(m, caddr_t)+chunk_len); parameter_len = (uint16_t) (sizeof(struct sctp_auth_chunk_list) + sctp_auth_get_chklist_size(stcb->asoc.local_auth_chunks)); chunks->ph.param_type = htons(SCTP_CHUNK_LIST); chunks->ph.param_length = htons(parameter_len); sctp_serialize_auth_chunks(stcb->asoc.local_auth_chunks, chunks->chunk_types); padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; } } /* now any cookie time extensions */ if (stcb->asoc.cookie_preserve_req) { struct sctp_cookie_perserve_param *cookie_preserve; if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; padding_len = 0; } parameter_len = (uint16_t) sizeof(struct sctp_cookie_perserve_param); cookie_preserve = (struct sctp_cookie_perserve_param *)(mtod(m, caddr_t)+chunk_len); cookie_preserve->ph.param_type = htons(SCTP_COOKIE_PRESERVE); cookie_preserve->ph.param_length = htons(parameter_len); cookie_preserve->time = htonl(stcb->asoc.cookie_preserve_req); stcb->asoc.cookie_preserve_req = 0; chunk_len += parameter_len; } if (stcb->asoc.scope.ipv4_addr_legal || stcb->asoc.scope.ipv6_addr_legal) { uint8_t i; if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; padding_len = 0; } parameter_len = (uint16_t) sizeof(struct sctp_paramhdr); if (stcb->asoc.scope.ipv4_addr_legal) { parameter_len += (uint16_t) sizeof(uint16_t); } if (stcb->asoc.scope.ipv6_addr_legal) { parameter_len += (uint16_t) sizeof(uint16_t); } sup_addr = (struct sctp_supported_addr_param *)(mtod(m, caddr_t)+chunk_len); sup_addr->ph.param_type = htons(SCTP_SUPPORTED_ADDRTYPE); sup_addr->ph.param_length = htons(parameter_len); i = 0; if (stcb->asoc.scope.ipv4_addr_legal) { sup_addr->addr_type[i++] = htons(SCTP_IPV4_ADDRESS); } if (stcb->asoc.scope.ipv6_addr_legal) { sup_addr->addr_type[i++] = htons(SCTP_IPV6_ADDRESS); } padding_len = 4 - 2 * i; chunk_len += parameter_len; } SCTP_BUF_LEN(m) = chunk_len; /* now the addresses */ /* * To optimize this we could put the scoping stuff into a structure * and remove the individual uint8's from the assoc structure. Then * we could just sifa in the address within the stcb. But for now * this is a quick hack to get the address stuff teased apart. */ m_last = sctp_add_addresses_to_i_ia(inp, stcb, &stcb->asoc.scope, m, cnt_inits_to, &padding_len, &chunk_len); init->ch.chunk_length = htons(chunk_len); if (padding_len > 0) { if (sctp_add_pad_tombuf(m_last, padding_len) == NULL) { sctp_m_freem(m); return; } } SCTPDBG(SCTP_DEBUG_OUTPUT4, "Sending INIT - calls lowlevel_output\n"); ret = sctp_lowlevel_chunk_output(inp, stcb, net, (struct sockaddr *)&net->ro._l_addr, m, 0, NULL, 0, 0, 0, 0, inp->sctp_lport, stcb->rport, htonl(0), net->port, NULL, 0, 0, so_locked); SCTPDBG(SCTP_DEBUG_OUTPUT4, "lowlevel_output - %d\n", ret); SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); (void)SCTP_GETTIME_TIMEVAL(&net->last_sent_time); } struct mbuf * sctp_arethere_unrecognized_parameters(struct mbuf *in_initpkt, int param_offset, int *abort_processing, struct sctp_chunkhdr *cp, int *nat_friendly) { /* * Given a mbuf containing an INIT or INIT-ACK with the param_offset * being equal to the beginning of the params i.e. (iphlen + * sizeof(struct sctp_init_msg) parse through the parameters to the * end of the mbuf verifying that all parameters are known. * * For unknown parameters build and return a mbuf with * UNRECOGNIZED_PARAMETER errors. If the flags indicate to stop * processing this chunk stop, and set *abort_processing to 1. * * By having param_offset be pre-set to where parameters begin it is * hoped that this routine may be reused in the future by new * features. */ struct sctp_paramhdr *phdr, params; struct mbuf *mat, *op_err; char tempbuf[SCTP_PARAM_BUFFER_SIZE]; int at, limit, pad_needed; uint16_t ptype, plen, padded_size; int err_at; *abort_processing = 0; mat = in_initpkt; err_at = 0; limit = ntohs(cp->chunk_length) - sizeof(struct sctp_init_chunk); at = param_offset; op_err = NULL; SCTPDBG(SCTP_DEBUG_OUTPUT1, "Check for unrecognized param's\n"); phdr = sctp_get_next_param(mat, at, ¶ms, sizeof(params)); while ((phdr != NULL) && ((size_t)limit >= sizeof(struct sctp_paramhdr))) { ptype = ntohs(phdr->param_type); plen = ntohs(phdr->param_length); if ((plen > limit) || (plen < sizeof(struct sctp_paramhdr))) { /* wacked parameter */ SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error %d\n", plen); goto invalid_size; } limit -= SCTP_SIZE32(plen); /*- * All parameters for all chunks that we know/understand are * listed here. We process them other places and make * appropriate stop actions per the upper bits. However this * is the generic routine processor's can call to get back * an operr.. to either incorporate (init-ack) or send. */ padded_size = SCTP_SIZE32(plen); switch (ptype) { /* Param's with variable size */ case SCTP_HEARTBEAT_INFO: case SCTP_STATE_COOKIE: case SCTP_UNRECOG_PARAM: case SCTP_ERROR_CAUSE_IND: /* ok skip fwd */ at += padded_size; break; /* Param's with variable size within a range */ case SCTP_CHUNK_LIST: case SCTP_SUPPORTED_CHUNK_EXT: if (padded_size > (sizeof(struct sctp_supported_chunk_types_param) + (sizeof(uint8_t) * SCTP_MAX_SUPPORTED_EXT))) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error chklist %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_SUPPORTED_ADDRTYPE: if (padded_size > SCTP_MAX_ADDR_PARAMS_SIZE) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error supaddrtype %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_RANDOM: if (padded_size > (sizeof(struct sctp_auth_random) + SCTP_RANDOM_MAX_SIZE)) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error random %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_SET_PRIM_ADDR: case SCTP_DEL_IP_ADDRESS: case SCTP_ADD_IP_ADDRESS: if ((padded_size != sizeof(struct sctp_asconf_addrv4_param)) && (padded_size != sizeof(struct sctp_asconf_addr_param))) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error setprim %d\n", plen); goto invalid_size; } at += padded_size; break; /* Param's with a fixed size */ case SCTP_IPV4_ADDRESS: if (padded_size != sizeof(struct sctp_ipv4addr_param)) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error ipv4 addr %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_IPV6_ADDRESS: if (padded_size != sizeof(struct sctp_ipv6addr_param)) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error ipv6 addr %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_COOKIE_PRESERVE: if (padded_size != sizeof(struct sctp_cookie_perserve_param)) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error cookie-preserve %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_HAS_NAT_SUPPORT: *nat_friendly = 1; /* fall through */ case SCTP_PRSCTP_SUPPORTED: if (padded_size != sizeof(struct sctp_paramhdr)) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error prsctp/nat support %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_ECN_CAPABLE: if (padded_size != sizeof(struct sctp_paramhdr)) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error ecn %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_ULP_ADAPTATION: if (padded_size != sizeof(struct sctp_adaptation_layer_indication)) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error adapatation %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_SUCCESS_REPORT: if (padded_size != sizeof(struct sctp_asconf_paramhdr)) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error success %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_HOSTNAME_ADDRESS: { /* We can NOT handle HOST NAME addresses!! */ int l_len; SCTPDBG(SCTP_DEBUG_OUTPUT1, "Can't handle hostname addresses.. abort processing\n"); *abort_processing = 1; if (op_err == NULL) { /* Ok need to try to get a mbuf */ #ifdef INET6 l_len = SCTP_MIN_OVERHEAD; #else l_len = SCTP_MIN_V4_OVERHEAD; #endif l_len += sizeof(struct sctp_chunkhdr); l_len += plen; l_len += sizeof(struct sctp_paramhdr); op_err = sctp_get_mbuf_for_msg(l_len, 0, M_NOWAIT, 1, MT_DATA); if (op_err) { SCTP_BUF_LEN(op_err) = 0; /* * pre-reserve space for ip * and sctp header and * chunk hdr */ #ifdef INET6 SCTP_BUF_RESV_UF(op_err, sizeof(struct ip6_hdr)); #else SCTP_BUF_RESV_UF(op_err, sizeof(struct ip)); #endif SCTP_BUF_RESV_UF(op_err, sizeof(struct sctphdr)); SCTP_BUF_RESV_UF(op_err, sizeof(struct sctp_chunkhdr)); } } if (op_err) { /* If we have space */ struct sctp_paramhdr s; if (err_at % 4) { uint32_t cpthis = 0; pad_needed = 4 - (err_at % 4); m_copyback(op_err, err_at, pad_needed, (caddr_t)&cpthis); err_at += pad_needed; } s.param_type = htons(SCTP_CAUSE_UNRESOLVABLE_ADDR); s.param_length = htons(sizeof(s) + plen); m_copyback(op_err, err_at, sizeof(s), (caddr_t)&s); err_at += sizeof(s); phdr = sctp_get_next_param(mat, at, (struct sctp_paramhdr *)tempbuf, min(sizeof(tempbuf), plen)); if (phdr == NULL) { sctp_m_freem(op_err); /* * we are out of memory but * we still need to have a * look at what to do (the * system is in trouble * though). */ return (NULL); } m_copyback(op_err, err_at, plen, (caddr_t)phdr); } return (op_err); break; } default: /* * we do not recognize the parameter figure out what * we do. */ SCTPDBG(SCTP_DEBUG_OUTPUT1, "Hit default param %x\n", ptype); if ((ptype & 0x4000) == 0x4000) { /* Report bit is set?? */ SCTPDBG(SCTP_DEBUG_OUTPUT1, "report op err\n"); if (op_err == NULL) { int l_len; /* Ok need to try to get an mbuf */ #ifdef INET6 l_len = SCTP_MIN_OVERHEAD; #else l_len = SCTP_MIN_V4_OVERHEAD; #endif l_len += sizeof(struct sctp_chunkhdr); l_len += plen; l_len += sizeof(struct sctp_paramhdr); op_err = sctp_get_mbuf_for_msg(l_len, 0, M_NOWAIT, 1, MT_DATA); if (op_err) { SCTP_BUF_LEN(op_err) = 0; #ifdef INET6 SCTP_BUF_RESV_UF(op_err, sizeof(struct ip6_hdr)); #else SCTP_BUF_RESV_UF(op_err, sizeof(struct ip)); #endif SCTP_BUF_RESV_UF(op_err, sizeof(struct sctphdr)); SCTP_BUF_RESV_UF(op_err, sizeof(struct sctp_chunkhdr)); } } if (op_err) { /* If we have space */ struct sctp_paramhdr s; if (err_at % 4) { uint32_t cpthis = 0; pad_needed = 4 - (err_at % 4); m_copyback(op_err, err_at, pad_needed, (caddr_t)&cpthis); err_at += pad_needed; } s.param_type = htons(SCTP_UNRECOG_PARAM); s.param_length = htons(sizeof(s) + plen); m_copyback(op_err, err_at, sizeof(s), (caddr_t)&s); err_at += sizeof(s); if (plen > sizeof(tempbuf)) { plen = sizeof(tempbuf); } phdr = sctp_get_next_param(mat, at, (struct sctp_paramhdr *)tempbuf, min(sizeof(tempbuf), plen)); if (phdr == NULL) { sctp_m_freem(op_err); /* * we are out of memory but * we still need to have a * look at what to do (the * system is in trouble * though). */ op_err = NULL; goto more_processing; } m_copyback(op_err, err_at, plen, (caddr_t)phdr); err_at += plen; } } more_processing: if ((ptype & 0x8000) == 0x0000) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "stop proc\n"); return (op_err); } else { /* skip this chunk and continue processing */ SCTPDBG(SCTP_DEBUG_OUTPUT1, "move on\n"); at += SCTP_SIZE32(plen); } break; } phdr = sctp_get_next_param(mat, at, ¶ms, sizeof(params)); } return (op_err); invalid_size: SCTPDBG(SCTP_DEBUG_OUTPUT1, "abort flag set\n"); *abort_processing = 1; if ((op_err == NULL) && phdr) { int l_len; #ifdef INET6 l_len = SCTP_MIN_OVERHEAD; #else l_len = SCTP_MIN_V4_OVERHEAD; #endif l_len += sizeof(struct sctp_chunkhdr); l_len += (2 * sizeof(struct sctp_paramhdr)); op_err = sctp_get_mbuf_for_msg(l_len, 0, M_NOWAIT, 1, MT_DATA); if (op_err) { SCTP_BUF_LEN(op_err) = 0; #ifdef INET6 SCTP_BUF_RESV_UF(op_err, sizeof(struct ip6_hdr)); #else SCTP_BUF_RESV_UF(op_err, sizeof(struct ip)); #endif SCTP_BUF_RESV_UF(op_err, sizeof(struct sctphdr)); SCTP_BUF_RESV_UF(op_err, sizeof(struct sctp_chunkhdr)); } } if ((op_err) && phdr) { struct sctp_paramhdr s; if (err_at % 4) { uint32_t cpthis = 0; pad_needed = 4 - (err_at % 4); m_copyback(op_err, err_at, pad_needed, (caddr_t)&cpthis); err_at += pad_needed; } s.param_type = htons(SCTP_CAUSE_PROTOCOL_VIOLATION); s.param_length = htons(sizeof(s) + sizeof(struct sctp_paramhdr)); m_copyback(op_err, err_at, sizeof(s), (caddr_t)&s); err_at += sizeof(s); /* Only copy back the p-hdr that caused the issue */ m_copyback(op_err, err_at, sizeof(struct sctp_paramhdr), (caddr_t)phdr); } return (op_err); } static int sctp_are_there_new_addresses(struct sctp_association *asoc, struct mbuf *in_initpkt, int offset, struct sockaddr *src) { /* * Given a INIT packet, look through the packet to verify that there * are NO new addresses. As we go through the parameters add reports * of any un-understood parameters that require an error. Also we * must return (1) to drop the packet if we see a un-understood * parameter that tells us to drop the chunk. */ struct sockaddr *sa_touse; struct sockaddr *sa; struct sctp_paramhdr *phdr, params; uint16_t ptype, plen; uint8_t fnd; struct sctp_nets *net; int check_src; #ifdef INET struct sockaddr_in sin4, *sa4; #endif #ifdef INET6 struct sockaddr_in6 sin6, *sa6; #endif #ifdef INET memset(&sin4, 0, sizeof(sin4)); sin4.sin_family = AF_INET; sin4.sin_len = sizeof(sin4); #endif #ifdef INET6 memset(&sin6, 0, sizeof(sin6)); sin6.sin6_family = AF_INET6; sin6.sin6_len = sizeof(sin6); #endif /* First what about the src address of the pkt ? */ check_src = 0; switch (src->sa_family) { #ifdef INET case AF_INET: if (asoc->scope.ipv4_addr_legal) { check_src = 1; } break; #endif #ifdef INET6 case AF_INET6: if (asoc->scope.ipv6_addr_legal) { check_src = 1; } break; #endif default: /* TSNH */ break; } if (check_src) { fnd = 0; TAILQ_FOREACH(net, &asoc->nets, sctp_next) { sa = (struct sockaddr *)&net->ro._l_addr; if (sa->sa_family == src->sa_family) { #ifdef INET if (sa->sa_family == AF_INET) { struct sockaddr_in *src4; sa4 = (struct sockaddr_in *)sa; src4 = (struct sockaddr_in *)src; if (sa4->sin_addr.s_addr == src4->sin_addr.s_addr) { fnd = 1; break; } } #endif #ifdef INET6 if (sa->sa_family == AF_INET6) { struct sockaddr_in6 *src6; sa6 = (struct sockaddr_in6 *)sa; src6 = (struct sockaddr_in6 *)src; if (SCTP6_ARE_ADDR_EQUAL(sa6, src6)) { fnd = 1; break; } } #endif } } if (fnd == 0) { /* New address added! no need to look further. */ return (1); } } /* Ok so far lets munge through the rest of the packet */ offset += sizeof(struct sctp_init_chunk); phdr = sctp_get_next_param(in_initpkt, offset, ¶ms, sizeof(params)); while (phdr) { sa_touse = NULL; ptype = ntohs(phdr->param_type); plen = ntohs(phdr->param_length); switch (ptype) { #ifdef INET case SCTP_IPV4_ADDRESS: { struct sctp_ipv4addr_param *p4, p4_buf; phdr = sctp_get_next_param(in_initpkt, offset, (struct sctp_paramhdr *)&p4_buf, sizeof(p4_buf)); if (plen != sizeof(struct sctp_ipv4addr_param) || phdr == NULL) { return (1); } if (asoc->scope.ipv4_addr_legal) { p4 = (struct sctp_ipv4addr_param *)phdr; sin4.sin_addr.s_addr = p4->addr; sa_touse = (struct sockaddr *)&sin4; } break; } #endif #ifdef INET6 case SCTP_IPV6_ADDRESS: { struct sctp_ipv6addr_param *p6, p6_buf; phdr = sctp_get_next_param(in_initpkt, offset, (struct sctp_paramhdr *)&p6_buf, sizeof(p6_buf)); if (plen != sizeof(struct sctp_ipv6addr_param) || phdr == NULL) { return (1); } if (asoc->scope.ipv6_addr_legal) { p6 = (struct sctp_ipv6addr_param *)phdr; memcpy((caddr_t)&sin6.sin6_addr, p6->addr, sizeof(p6->addr)); sa_touse = (struct sockaddr *)&sin6; } break; } #endif default: sa_touse = NULL; break; } if (sa_touse) { /* ok, sa_touse points to one to check */ fnd = 0; TAILQ_FOREACH(net, &asoc->nets, sctp_next) { sa = (struct sockaddr *)&net->ro._l_addr; if (sa->sa_family != sa_touse->sa_family) { continue; } #ifdef INET if (sa->sa_family == AF_INET) { sa4 = (struct sockaddr_in *)sa; if (sa4->sin_addr.s_addr == sin4.sin_addr.s_addr) { fnd = 1; break; } } #endif #ifdef INET6 if (sa->sa_family == AF_INET6) { sa6 = (struct sockaddr_in6 *)sa; if (SCTP6_ARE_ADDR_EQUAL( sa6, &sin6)) { fnd = 1; break; } } #endif } if (!fnd) { /* New addr added! no need to look further */ return (1); } } offset += SCTP_SIZE32(plen); phdr = sctp_get_next_param(in_initpkt, offset, ¶ms, sizeof(params)); } return (0); } /* * Given a MBUF chain that was sent into us containing an INIT. Build a * INIT-ACK with COOKIE and send back. We assume that the in_initpkt has done * a pullup to include IPv6/4header, SCTP header and initial part of INIT * message (i.e. the struct sctp_init_msg). */ void sctp_send_initiate_ack(struct sctp_inpcb *inp, struct sctp_tcb *stcb, struct sctp_nets *src_net, struct mbuf *init_pkt, int iphlen, int offset, struct sockaddr *src, struct sockaddr *dst, struct sctphdr *sh, struct sctp_init_chunk *init_chk, uint8_t mflowtype, uint32_t mflowid, uint32_t vrf_id, uint16_t port, int hold_inp_lock) { struct sctp_association *asoc; struct mbuf *m, *m_tmp, *m_last, *m_cookie, *op_err; struct sctp_init_ack_chunk *initack; struct sctp_adaptation_layer_indication *ali; struct sctp_supported_chunk_types_param *pr_supported; struct sctp_paramhdr *ph; union sctp_sockstore *over_addr; struct sctp_scoping scp; #ifdef INET struct sockaddr_in *dst4 = (struct sockaddr_in *)dst; struct sockaddr_in *src4 = (struct sockaddr_in *)src; struct sockaddr_in *sin; #endif #ifdef INET6 struct sockaddr_in6 *dst6 = (struct sockaddr_in6 *)dst; struct sockaddr_in6 *src6 = (struct sockaddr_in6 *)src; struct sockaddr_in6 *sin6; #endif struct sockaddr *to; struct sctp_state_cookie stc; struct sctp_nets *net = NULL; uint8_t *signature = NULL; int cnt_inits_to = 0; uint16_t his_limit, i_want; int abort_flag; int nat_friendly = 0; struct socket *so; uint16_t num_ext, chunk_len, padding_len, parameter_len; if (stcb) { asoc = &stcb->asoc; } else { asoc = NULL; } if ((asoc != NULL) && (SCTP_GET_STATE(asoc) != SCTP_STATE_COOKIE_WAIT)) { if (sctp_are_there_new_addresses(asoc, init_pkt, offset, src)) { /* * new addresses, out of here in non-cookie-wait * states * * Send an ABORT, without the new address error cause. * This looks no different than if no listener was * present. */ op_err = sctp_generate_cause(SCTP_BASE_SYSCTL(sctp_diag_info_code), "Address added"); sctp_send_abort(init_pkt, iphlen, src, dst, sh, 0, op_err, mflowtype, mflowid, inp->fibnum, vrf_id, port); return; } if (src_net != NULL && (src_net->port != port)) { /* * change of remote encapsulation port, out of here * in non-cookie-wait states * * Send an ABORT, without an specific error cause. This * looks no different than if no listener was * present. */ op_err = sctp_generate_cause(SCTP_BASE_SYSCTL(sctp_diag_info_code), "Remote encapsulation port changed"); sctp_send_abort(init_pkt, iphlen, src, dst, sh, 0, op_err, mflowtype, mflowid, inp->fibnum, vrf_id, port); return; } } abort_flag = 0; op_err = sctp_arethere_unrecognized_parameters(init_pkt, (offset + sizeof(struct sctp_init_chunk)), &abort_flag, (struct sctp_chunkhdr *)init_chk, &nat_friendly); if (abort_flag) { do_a_abort: if (op_err == NULL) { char msg[SCTP_DIAG_INFO_LEN]; snprintf(msg, sizeof(msg), "%s:%d at %s", __FILE__, __LINE__, __func__); op_err = sctp_generate_cause(SCTP_BASE_SYSCTL(sctp_diag_info_code), msg); } sctp_send_abort(init_pkt, iphlen, src, dst, sh, init_chk->init.initiate_tag, op_err, mflowtype, mflowid, inp->fibnum, vrf_id, port); return; } m = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_NOWAIT, 1, MT_DATA); if (m == NULL) { /* No memory, INIT timer will re-attempt. */ if (op_err) sctp_m_freem(op_err); return; } chunk_len = (uint16_t) sizeof(struct sctp_init_ack_chunk); padding_len = 0; /* * We might not overwrite the identification[] completely and on * some platforms time_entered will contain some padding. Therefore * zero out the cookie to avoid putting uninitialized memory on the * wire. */ memset(&stc, 0, sizeof(struct sctp_state_cookie)); /* the time I built cookie */ (void)SCTP_GETTIME_TIMEVAL(&stc.time_entered); /* populate any tie tags */ if (asoc != NULL) { /* unlock before tag selections */ stc.tie_tag_my_vtag = asoc->my_vtag_nonce; stc.tie_tag_peer_vtag = asoc->peer_vtag_nonce; stc.cookie_life = asoc->cookie_life; net = asoc->primary_destination; } else { stc.tie_tag_my_vtag = 0; stc.tie_tag_peer_vtag = 0; /* life I will award this cookie */ stc.cookie_life = inp->sctp_ep.def_cookie_life; } /* copy in the ports for later check */ stc.myport = sh->dest_port; stc.peerport = sh->src_port; /* * If we wanted to honor cookie life extensions, we would add to * stc.cookie_life. For now we should NOT honor any extension */ stc.site_scope = stc.local_scope = stc.loopback_scope = 0; if (inp->sctp_flags & SCTP_PCB_FLAGS_BOUND_V6) { stc.ipv6_addr_legal = 1; if (SCTP_IPV6_V6ONLY(inp)) { stc.ipv4_addr_legal = 0; } else { stc.ipv4_addr_legal = 1; } } else { stc.ipv6_addr_legal = 0; stc.ipv4_addr_legal = 1; } stc.ipv4_scope = 0; if (net == NULL) { to = src; switch (dst->sa_family) { #ifdef INET case AF_INET: { /* lookup address */ stc.address[0] = src4->sin_addr.s_addr; stc.address[1] = 0; stc.address[2] = 0; stc.address[3] = 0; stc.addr_type = SCTP_IPV4_ADDRESS; /* local from address */ stc.laddress[0] = dst4->sin_addr.s_addr; stc.laddress[1] = 0; stc.laddress[2] = 0; stc.laddress[3] = 0; stc.laddr_type = SCTP_IPV4_ADDRESS; /* scope_id is only for v6 */ stc.scope_id = 0; if ((IN4_ISPRIVATE_ADDRESS(&src4->sin_addr)) || (IN4_ISPRIVATE_ADDRESS(&dst4->sin_addr))) { stc.ipv4_scope = 1; } /* Must use the address in this case */ if (sctp_is_address_on_local_host(src, vrf_id)) { stc.loopback_scope = 1; stc.ipv4_scope = 1; stc.site_scope = 1; stc.local_scope = 0; } break; } #endif #ifdef INET6 case AF_INET6: { stc.addr_type = SCTP_IPV6_ADDRESS; memcpy(&stc.address, &src6->sin6_addr, sizeof(struct in6_addr)); stc.scope_id = ntohs(in6_getscope(&src6->sin6_addr)); if (sctp_is_address_on_local_host(src, vrf_id)) { stc.loopback_scope = 1; stc.local_scope = 0; stc.site_scope = 1; stc.ipv4_scope = 1; } else if (IN6_IS_ADDR_LINKLOCAL(&src6->sin6_addr) || IN6_IS_ADDR_LINKLOCAL(&dst6->sin6_addr)) { /* * If the new destination or source * is a LINK_LOCAL we must have * common both site and local scope. * Don't set local scope though * since we must depend on the * source to be added implicitly. We * cannot assure just because we * share one link that all links are * common. */ stc.local_scope = 0; stc.site_scope = 1; stc.ipv4_scope = 1; /* * we start counting for the private * address stuff at 1. since the * link local we source from won't * show up in our scoped count. */ cnt_inits_to = 1; /* * pull out the scope_id from * incoming pkt */ } else if (IN6_IS_ADDR_SITELOCAL(&src6->sin6_addr) || IN6_IS_ADDR_SITELOCAL(&dst6->sin6_addr)) { /* * If the new destination or source * is SITE_LOCAL then we must have * site scope in common. */ stc.site_scope = 1; } memcpy(&stc.laddress, &dst6->sin6_addr, sizeof(struct in6_addr)); stc.laddr_type = SCTP_IPV6_ADDRESS; break; } #endif default: /* TSNH */ goto do_a_abort; break; } } else { /* set the scope per the existing tcb */ #ifdef INET6 struct sctp_nets *lnet; #endif stc.loopback_scope = asoc->scope.loopback_scope; stc.ipv4_scope = asoc->scope.ipv4_local_scope; stc.site_scope = asoc->scope.site_scope; stc.local_scope = asoc->scope.local_scope; #ifdef INET6 /* Why do we not consider IPv4 LL addresses? */ TAILQ_FOREACH(lnet, &asoc->nets, sctp_next) { if (lnet->ro._l_addr.sin6.sin6_family == AF_INET6) { if (IN6_IS_ADDR_LINKLOCAL(&lnet->ro._l_addr.sin6.sin6_addr)) { /* * if we have a LL address, start * counting at 1. */ cnt_inits_to = 1; } } } #endif /* use the net pointer */ to = (struct sockaddr *)&net->ro._l_addr; switch (to->sa_family) { #ifdef INET case AF_INET: sin = (struct sockaddr_in *)to; stc.address[0] = sin->sin_addr.s_addr; stc.address[1] = 0; stc.address[2] = 0; stc.address[3] = 0; stc.addr_type = SCTP_IPV4_ADDRESS; if (net->src_addr_selected == 0) { /* * strange case here, the INIT should have * did the selection. */ net->ro._s_addr = sctp_source_address_selection(inp, stcb, (sctp_route_t *) & net->ro, net, 0, vrf_id); if (net->ro._s_addr == NULL) return; net->src_addr_selected = 1; } stc.laddress[0] = net->ro._s_addr->address.sin.sin_addr.s_addr; stc.laddress[1] = 0; stc.laddress[2] = 0; stc.laddress[3] = 0; stc.laddr_type = SCTP_IPV4_ADDRESS; /* scope_id is only for v6 */ stc.scope_id = 0; break; #endif #ifdef INET6 case AF_INET6: sin6 = (struct sockaddr_in6 *)to; memcpy(&stc.address, &sin6->sin6_addr, sizeof(struct in6_addr)); stc.addr_type = SCTP_IPV6_ADDRESS; stc.scope_id = sin6->sin6_scope_id; if (net->src_addr_selected == 0) { /* * strange case here, the INIT should have * done the selection. */ net->ro._s_addr = sctp_source_address_selection(inp, stcb, (sctp_route_t *) & net->ro, net, 0, vrf_id); if (net->ro._s_addr == NULL) return; net->src_addr_selected = 1; } memcpy(&stc.laddress, &net->ro._s_addr->address.sin6.sin6_addr, sizeof(struct in6_addr)); stc.laddr_type = SCTP_IPV6_ADDRESS; break; #endif } } /* Now lets put the SCTP header in place */ initack = mtod(m, struct sctp_init_ack_chunk *); /* Save it off for quick ref */ stc.peers_vtag = ntohl(init_chk->init.initiate_tag); /* who are we */ memcpy(stc.identification, SCTP_VERSION_STRING, min(strlen(SCTP_VERSION_STRING), sizeof(stc.identification))); memset(stc.reserved, 0, SCTP_RESERVE_SPACE); /* now the chunk header */ initack->ch.chunk_type = SCTP_INITIATION_ACK; initack->ch.chunk_flags = 0; /* fill in later from mbuf we build */ initack->ch.chunk_length = 0; /* place in my tag */ if ((asoc != NULL) && ((SCTP_GET_STATE(asoc) == SCTP_STATE_COOKIE_WAIT) || (SCTP_GET_STATE(asoc) == SCTP_STATE_INUSE) || (SCTP_GET_STATE(asoc) == SCTP_STATE_COOKIE_ECHOED))) { /* re-use the v-tags and init-seq here */ initack->init.initiate_tag = htonl(asoc->my_vtag); initack->init.initial_tsn = htonl(asoc->init_seq_number); } else { uint32_t vtag, itsn; if (hold_inp_lock) { SCTP_INP_INCR_REF(inp); SCTP_INP_RUNLOCK(inp); } if (asoc) { atomic_add_int(&asoc->refcnt, 1); SCTP_TCB_UNLOCK(stcb); new_tag: vtag = sctp_select_a_tag(inp, inp->sctp_lport, sh->src_port, 1); if ((asoc->peer_supports_nat) && (vtag == asoc->my_vtag)) { /* * Got a duplicate vtag on some guy behind a * nat make sure we don't use it. */ goto new_tag; } initack->init.initiate_tag = htonl(vtag); /* get a TSN to use too */ itsn = sctp_select_initial_TSN(&inp->sctp_ep); initack->init.initial_tsn = htonl(itsn); SCTP_TCB_LOCK(stcb); atomic_add_int(&asoc->refcnt, -1); } else { vtag = sctp_select_a_tag(inp, inp->sctp_lport, sh->src_port, 1); initack->init.initiate_tag = htonl(vtag); /* get a TSN to use too */ initack->init.initial_tsn = htonl(sctp_select_initial_TSN(&inp->sctp_ep)); } if (hold_inp_lock) { SCTP_INP_RLOCK(inp); SCTP_INP_DECR_REF(inp); } } /* save away my tag to */ stc.my_vtag = initack->init.initiate_tag; /* set up some of the credits. */ so = inp->sctp_socket; if (so == NULL) { /* memory problem */ sctp_m_freem(m); return; } else { initack->init.a_rwnd = htonl(max(SCTP_SB_LIMIT_RCV(so), SCTP_MINIMAL_RWND)); } /* set what I want */ his_limit = ntohs(init_chk->init.num_inbound_streams); /* choose what I want */ if (asoc != NULL) { if (asoc->streamoutcnt > asoc->pre_open_streams) { i_want = asoc->streamoutcnt; } else { i_want = asoc->pre_open_streams; } } else { i_want = inp->sctp_ep.pre_open_stream_count; } if (his_limit < i_want) { /* I Want more :< */ initack->init.num_outbound_streams = init_chk->init.num_inbound_streams; } else { /* I can have what I want :> */ initack->init.num_outbound_streams = htons(i_want); } /* tell him his limit. */ initack->init.num_inbound_streams = htons(inp->sctp_ep.max_open_streams_intome); /* adaptation layer indication parameter */ if (inp->sctp_ep.adaptation_layer_indicator_provided) { parameter_len = (uint16_t) sizeof(struct sctp_adaptation_layer_indication); ali = (struct sctp_adaptation_layer_indication *)(mtod(m, caddr_t)+chunk_len); ali->ph.param_type = htons(SCTP_ULP_ADAPTATION); ali->ph.param_length = htons(parameter_len); ali->indication = htonl(inp->sctp_ep.adaptation_layer_indicator); chunk_len += parameter_len; } /* ECN parameter */ if (((asoc != NULL) && (asoc->ecn_supported == 1)) || ((asoc == NULL) && (inp->ecn_supported == 1))) { parameter_len = (uint16_t) sizeof(struct sctp_paramhdr); ph = (struct sctp_paramhdr *)(mtod(m, caddr_t)+chunk_len); ph->param_type = htons(SCTP_ECN_CAPABLE); ph->param_length = htons(parameter_len); chunk_len += parameter_len; } /* PR-SCTP supported parameter */ if (((asoc != NULL) && (asoc->prsctp_supported == 1)) || ((asoc == NULL) && (inp->prsctp_supported == 1))) { parameter_len = (uint16_t) sizeof(struct sctp_paramhdr); ph = (struct sctp_paramhdr *)(mtod(m, caddr_t)+chunk_len); ph->param_type = htons(SCTP_PRSCTP_SUPPORTED); ph->param_length = htons(parameter_len); chunk_len += parameter_len; } /* Add NAT friendly parameter */ if (nat_friendly) { parameter_len = (uint16_t) sizeof(struct sctp_paramhdr); ph = (struct sctp_paramhdr *)(mtod(m, caddr_t)+chunk_len); ph->param_type = htons(SCTP_HAS_NAT_SUPPORT); ph->param_length = htons(parameter_len); chunk_len += parameter_len; } /* And now tell the peer which extensions we support */ num_ext = 0; pr_supported = (struct sctp_supported_chunk_types_param *)(mtod(m, caddr_t)+chunk_len); if (((asoc != NULL) && (asoc->prsctp_supported == 1)) || ((asoc == NULL) && (inp->prsctp_supported == 1))) { pr_supported->chunk_types[num_ext++] = SCTP_FORWARD_CUM_TSN; if (((asoc != NULL) && (asoc->idata_supported == 1)) || ((asoc == NULL) && (inp->idata_supported == 1))) { pr_supported->chunk_types[num_ext++] = SCTP_IFORWARD_CUM_TSN; } } if (((asoc != NULL) && (asoc->auth_supported == 1)) || ((asoc == NULL) && (inp->auth_supported == 1))) { pr_supported->chunk_types[num_ext++] = SCTP_AUTHENTICATION; } if (((asoc != NULL) && (asoc->asconf_supported == 1)) || ((asoc == NULL) && (inp->asconf_supported == 1))) { pr_supported->chunk_types[num_ext++] = SCTP_ASCONF; pr_supported->chunk_types[num_ext++] = SCTP_ASCONF_ACK; } if (((asoc != NULL) && (asoc->reconfig_supported == 1)) || ((asoc == NULL) && (inp->reconfig_supported == 1))) { pr_supported->chunk_types[num_ext++] = SCTP_STREAM_RESET; } if (((asoc != NULL) && (asoc->idata_supported == 1)) || ((asoc == NULL) && (inp->idata_supported == 1))) { pr_supported->chunk_types[num_ext++] = SCTP_IDATA; } if (((asoc != NULL) && (asoc->nrsack_supported == 1)) || ((asoc == NULL) && (inp->nrsack_supported == 1))) { pr_supported->chunk_types[num_ext++] = SCTP_NR_SELECTIVE_ACK; } if (((asoc != NULL) && (asoc->pktdrop_supported == 1)) || ((asoc == NULL) && (inp->pktdrop_supported == 1))) { pr_supported->chunk_types[num_ext++] = SCTP_PACKET_DROPPED; } if (num_ext > 0) { parameter_len = (uint16_t) sizeof(struct sctp_supported_chunk_types_param) + num_ext; pr_supported->ph.param_type = htons(SCTP_SUPPORTED_CHUNK_EXT); pr_supported->ph.param_length = htons(parameter_len); padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; } /* add authentication parameters */ if (((asoc != NULL) && (asoc->auth_supported == 1)) || ((asoc == NULL) && (inp->auth_supported == 1))) { struct sctp_auth_random *randp; struct sctp_auth_hmac_algo *hmacs; struct sctp_auth_chunk_list *chunks; if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; padding_len = 0; } /* generate and add RANDOM parameter */ randp = (struct sctp_auth_random *)(mtod(m, caddr_t)+chunk_len); parameter_len = (uint16_t) sizeof(struct sctp_auth_random) + SCTP_AUTH_RANDOM_SIZE_DEFAULT; randp->ph.param_type = htons(SCTP_RANDOM); randp->ph.param_length = htons(parameter_len); SCTP_READ_RANDOM(randp->random_data, SCTP_AUTH_RANDOM_SIZE_DEFAULT); padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; padding_len = 0; } /* add HMAC_ALGO parameter */ hmacs = (struct sctp_auth_hmac_algo *)(mtod(m, caddr_t)+chunk_len); parameter_len = (uint16_t) sizeof(struct sctp_auth_hmac_algo) + sctp_serialize_hmaclist(inp->sctp_ep.local_hmacs, (uint8_t *) hmacs->hmac_ids); hmacs->ph.param_type = htons(SCTP_HMAC_LIST); hmacs->ph.param_length = htons(parameter_len); padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; padding_len = 0; } /* add CHUNKS parameter */ chunks = (struct sctp_auth_chunk_list *)(mtod(m, caddr_t)+chunk_len); parameter_len = (uint16_t) sizeof(struct sctp_auth_chunk_list) + sctp_serialize_auth_chunks(inp->sctp_ep.local_auth_chunks, chunks->chunk_types); chunks->ph.param_type = htons(SCTP_CHUNK_LIST); chunks->ph.param_length = htons(parameter_len); padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; } SCTP_BUF_LEN(m) = chunk_len; m_last = m; /* now the addresses */ /* * To optimize this we could put the scoping stuff into a structure * and remove the individual uint8's from the stc structure. Then we * could just sifa in the address within the stc.. but for now this * is a quick hack to get the address stuff teased apart. */ scp.ipv4_addr_legal = stc.ipv4_addr_legal; scp.ipv6_addr_legal = stc.ipv6_addr_legal; scp.loopback_scope = stc.loopback_scope; scp.ipv4_local_scope = stc.ipv4_scope; scp.local_scope = stc.local_scope; scp.site_scope = stc.site_scope; m_last = sctp_add_addresses_to_i_ia(inp, stcb, &scp, m_last, cnt_inits_to, &padding_len, &chunk_len); /* padding_len can only be positive, if no addresses have been added */ if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; SCTP_BUF_LEN(m) += padding_len; padding_len = 0; } /* tack on the operational error if present */ if (op_err) { parameter_len = 0; for (m_tmp = op_err; m_tmp != NULL; m_tmp = SCTP_BUF_NEXT(m_tmp)) { parameter_len += SCTP_BUF_LEN(m_tmp); } padding_len = SCTP_SIZE32(parameter_len) - parameter_len; SCTP_BUF_NEXT(m_last) = op_err; while (SCTP_BUF_NEXT(m_last) != NULL) { m_last = SCTP_BUF_NEXT(m_last); } chunk_len += parameter_len; } if (padding_len > 0) { m_last = sctp_add_pad_tombuf(m_last, padding_len); if (m_last == NULL) { /* Houston we have a problem, no space */ sctp_m_freem(m); return; } chunk_len += padding_len; padding_len = 0; } /* Now we must build a cookie */ m_cookie = sctp_add_cookie(init_pkt, offset, m, 0, &stc, &signature); if (m_cookie == NULL) { /* memory problem */ sctp_m_freem(m); return; } /* Now append the cookie to the end and update the space/size */ SCTP_BUF_NEXT(m_last) = m_cookie; parameter_len = 0; for (m_tmp = m_cookie; m_tmp != NULL; m_tmp = SCTP_BUF_NEXT(m_tmp)) { parameter_len += SCTP_BUF_LEN(m_tmp); if (SCTP_BUF_NEXT(m_tmp) == NULL) { m_last = m_tmp; } } padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; /* * Place in the size, but we don't include the last pad (if any) in * the INIT-ACK. */ initack->ch.chunk_length = htons(chunk_len); /* * Time to sign the cookie, we don't sign over the cookie signature * though thus we set trailer. */ (void)sctp_hmac_m(SCTP_HMAC, (uint8_t *) inp->sctp_ep.secret_key[(int)(inp->sctp_ep.current_secret_number)], SCTP_SECRET_SIZE, m_cookie, sizeof(struct sctp_paramhdr), (uint8_t *) signature, SCTP_SIGNATURE_SIZE); /* * We sifa 0 here to NOT set IP_DF if its IPv4, we ignore the return * here since the timer will drive a retranmission. */ if (padding_len > 0) { if (sctp_add_pad_tombuf(m_last, padding_len) == NULL) { sctp_m_freem(m); return; } } if (stc.loopback_scope) { over_addr = (union sctp_sockstore *)dst; } else { over_addr = NULL; } (void)sctp_lowlevel_chunk_output(inp, NULL, NULL, to, m, 0, NULL, 0, 0, 0, 0, inp->sctp_lport, sh->src_port, init_chk->init.initiate_tag, port, over_addr, mflowtype, mflowid, SCTP_SO_NOT_LOCKED); SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } static void sctp_prune_prsctp(struct sctp_tcb *stcb, struct sctp_association *asoc, struct sctp_sndrcvinfo *srcv, int dataout) { int freed_spc = 0; struct sctp_tmit_chunk *chk, *nchk; SCTP_TCB_LOCK_ASSERT(stcb); if ((asoc->prsctp_supported) && (asoc->sent_queue_cnt_removeable > 0)) { TAILQ_FOREACH(chk, &asoc->sent_queue, sctp_next) { /* * Look for chunks marked with the PR_SCTP flag AND * the buffer space flag. If the one being sent is * equal or greater priority then purge the old one * and free some space. */ if (PR_SCTP_BUF_ENABLED(chk->flags)) { /* * This one is PR-SCTP AND buffer space * limited type */ if (chk->rec.data.timetodrop.tv_sec >= (long)srcv->sinfo_timetolive) { /* * Lower numbers equates to higher * priority so if the one we are * looking at has a larger or equal * priority we want to drop the data * and NOT retransmit it. */ if (chk->data) { /* * We release the book_size * if the mbuf is here */ int ret_spc; uint8_t sent; if (chk->sent > SCTP_DATAGRAM_UNSENT) sent = 1; else sent = 0; ret_spc = sctp_release_pr_sctp_chunk(stcb, chk, sent, SCTP_SO_LOCKED); freed_spc += ret_spc; if (freed_spc >= dataout) { return; } } /* if chunk was present */ } /* if of sufficient priority */ } /* if chunk has enabled */ } /* tailqforeach */ TAILQ_FOREACH_SAFE(chk, &asoc->send_queue, sctp_next, nchk) { /* Here we must move to the sent queue and mark */ if (PR_SCTP_BUF_ENABLED(chk->flags)) { if (chk->rec.data.timetodrop.tv_sec >= (long)srcv->sinfo_timetolive) { if (chk->data) { /* * We release the book_size * if the mbuf is here */ int ret_spc; ret_spc = sctp_release_pr_sctp_chunk(stcb, chk, 0, SCTP_SO_LOCKED); freed_spc += ret_spc; if (freed_spc >= dataout) { return; } } /* end if chk->data */ } /* end if right class */ } /* end if chk pr-sctp */ } /* tailqforeachsafe (chk) */ } /* if enabled in asoc */ } int sctp_get_frag_point(struct sctp_tcb *stcb, struct sctp_association *asoc) { int siz, ovh; /* * For endpoints that have both v6 and v4 addresses we must reserve * room for the ipv6 header, for those that are only dealing with V4 * we use a larger frag point. */ if (stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_BOUND_V6) { ovh = SCTP_MIN_OVERHEAD; } else { ovh = SCTP_MIN_V4_OVERHEAD; } if (stcb->asoc.idata_supported) { ovh += sizeof(struct sctp_idata_chunk); } else { ovh += sizeof(struct sctp_data_chunk); } if (stcb->asoc.sctp_frag_point > asoc->smallest_mtu) siz = asoc->smallest_mtu - ovh; else siz = (stcb->asoc.sctp_frag_point - ovh); /* * if (siz > (MCLBYTES-sizeof(struct sctp_data_chunk))) { */ /* A data chunk MUST fit in a cluster */ /* siz = (MCLBYTES - sizeof(struct sctp_data_chunk)); */ /* } */ /* adjust for an AUTH chunk if DATA requires auth */ if (sctp_auth_is_required_chunk(SCTP_DATA, stcb->asoc.peer_auth_chunks)) siz -= sctp_get_auth_chunk_len(stcb->asoc.peer_hmac_id); if (siz % 4) { /* make it an even word boundary please */ siz -= (siz % 4); } return (siz); } static void sctp_set_prsctp_policy(struct sctp_stream_queue_pending *sp) { /* * We assume that the user wants PR_SCTP_TTL if the user provides a * positive lifetime but does not specify any PR_SCTP policy. */ if (PR_SCTP_ENABLED(sp->sinfo_flags)) { sp->act_flags |= PR_SCTP_POLICY(sp->sinfo_flags); } else if (sp->timetolive > 0) { sp->sinfo_flags |= SCTP_PR_SCTP_TTL; sp->act_flags |= PR_SCTP_POLICY(sp->sinfo_flags); } else { return; } switch (PR_SCTP_POLICY(sp->sinfo_flags)) { case CHUNK_FLAGS_PR_SCTP_BUF: /* * Time to live is a priority stored in tv_sec when doing * the buffer drop thing. */ sp->ts.tv_sec = sp->timetolive; sp->ts.tv_usec = 0; break; case CHUNK_FLAGS_PR_SCTP_TTL: { struct timeval tv; (void)SCTP_GETTIME_TIMEVAL(&sp->ts); tv.tv_sec = sp->timetolive / 1000; tv.tv_usec = (sp->timetolive * 1000) % 1000000; /* * TODO sctp_constants.h needs alternative time * macros when _KERNEL is undefined. */ timevaladd(&sp->ts, &tv); } break; case CHUNK_FLAGS_PR_SCTP_RTX: /* * Time to live is a the number or retransmissions stored in * tv_sec. */ sp->ts.tv_sec = sp->timetolive; sp->ts.tv_usec = 0; break; default: SCTPDBG(SCTP_DEBUG_USRREQ1, "Unknown PR_SCTP policy %u.\n", PR_SCTP_POLICY(sp->sinfo_flags)); break; } } static int sctp_msg_append(struct sctp_tcb *stcb, struct sctp_nets *net, struct mbuf *m, struct sctp_sndrcvinfo *srcv, int hold_stcb_lock) { int error = 0; struct mbuf *at; struct sctp_stream_queue_pending *sp = NULL; struct sctp_stream_out *strm; /* * Given an mbuf chain, put it into the association send queue and * place it on the wheel */ if (srcv->sinfo_stream >= stcb->asoc.streamoutcnt) { /* Invalid stream number */ SCTP_LTRACE_ERR_RET_PKT(m, NULL, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_now; } if ((stcb->asoc.stream_locked) && (stcb->asoc.stream_locked_on != srcv->sinfo_stream)) { SCTP_LTRACE_ERR_RET_PKT(m, NULL, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_now; } strm = &stcb->asoc.strmout[srcv->sinfo_stream]; /* Now can we send this? */ if ((SCTP_GET_STATE(&stcb->asoc) == SCTP_STATE_SHUTDOWN_SENT) || (SCTP_GET_STATE(&stcb->asoc) == SCTP_STATE_SHUTDOWN_ACK_SENT) || (SCTP_GET_STATE(&stcb->asoc) == SCTP_STATE_SHUTDOWN_RECEIVED) || (stcb->asoc.state & SCTP_STATE_SHUTDOWN_PENDING)) { /* got data while shutting down */ SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ECONNRESET); error = ECONNRESET; goto out_now; } sctp_alloc_a_strmoq(stcb, sp); if (sp == NULL) { SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); error = ENOMEM; goto out_now; } sp->sinfo_flags = srcv->sinfo_flags; sp->timetolive = srcv->sinfo_timetolive; sp->ppid = srcv->sinfo_ppid; sp->context = srcv->sinfo_context; sp->fsn = 0; if (sp->sinfo_flags & SCTP_ADDR_OVER) { sp->net = net; atomic_add_int(&sp->net->ref_count, 1); } else { sp->net = NULL; } (void)SCTP_GETTIME_TIMEVAL(&sp->ts); sp->stream = srcv->sinfo_stream; sp->msg_is_complete = 1; sp->sender_all_done = 1; sp->some_taken = 0; sp->data = m; sp->tail_mbuf = NULL; sctp_set_prsctp_policy(sp); /* * We could in theory (for sendall) sifa the length in, but we would * still have to hunt through the chain since we need to setup the * tail_mbuf */ sp->length = 0; for (at = m; at; at = SCTP_BUF_NEXT(at)) { if (SCTP_BUF_NEXT(at) == NULL) sp->tail_mbuf = at; sp->length += SCTP_BUF_LEN(at); } if (srcv->sinfo_keynumber_valid) { sp->auth_keyid = srcv->sinfo_keynumber; } else { sp->auth_keyid = stcb->asoc.authinfo.active_keyid; } if (sctp_auth_is_required_chunk(SCTP_DATA, stcb->asoc.peer_auth_chunks)) { sctp_auth_key_acquire(stcb, sp->auth_keyid); sp->holds_key_ref = 1; } if (hold_stcb_lock == 0) { SCTP_TCB_SEND_LOCK(stcb); } sctp_snd_sb_alloc(stcb, sp->length); atomic_add_int(&stcb->asoc.stream_queue_cnt, 1); TAILQ_INSERT_TAIL(&strm->outqueue, sp, next); stcb->asoc.ss_functions.sctp_ss_add_to_stream(stcb, &stcb->asoc, strm, sp, 1); m = NULL; if (hold_stcb_lock == 0) { SCTP_TCB_SEND_UNLOCK(stcb); } out_now: if (m) { sctp_m_freem(m); } return (error); } static struct mbuf * sctp_copy_mbufchain(struct mbuf *clonechain, struct mbuf *outchain, struct mbuf **endofchain, int can_take_mbuf, int sizeofcpy, uint8_t copy_by_ref) { struct mbuf *m; struct mbuf *appendchain; caddr_t cp; int len; if (endofchain == NULL) { /* error */ error_out: if (outchain) sctp_m_freem(outchain); return (NULL); } if (can_take_mbuf) { appendchain = clonechain; } else { if (!copy_by_ref && (sizeofcpy <= (int)((((SCTP_BASE_SYSCTL(sctp_mbuf_threshold_count) - 1) * MLEN) + MHLEN))) ) { /* Its not in a cluster */ if (*endofchain == NULL) { /* lets get a mbuf cluster */ if (outchain == NULL) { /* This is the general case */ new_mbuf: outchain = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_NOWAIT, 1, MT_HEADER); if (outchain == NULL) { goto error_out; } SCTP_BUF_LEN(outchain) = 0; *endofchain = outchain; /* get the prepend space */ SCTP_BUF_RESV_UF(outchain, (SCTP_FIRST_MBUF_RESV + 4)); } else { /* * We really should not get a NULL * in endofchain */ /* find end */ m = outchain; while (m) { if (SCTP_BUF_NEXT(m) == NULL) { *endofchain = m; break; } m = SCTP_BUF_NEXT(m); } /* sanity */ if (*endofchain == NULL) { /* * huh, TSNH XXX maybe we * should panic */ sctp_m_freem(outchain); goto new_mbuf; } } /* get the new end of length */ len = (int)M_TRAILINGSPACE(*endofchain); } else { /* how much is left at the end? */ len = (int)M_TRAILINGSPACE(*endofchain); } /* Find the end of the data, for appending */ cp = (mtod((*endofchain), caddr_t)+SCTP_BUF_LEN((*endofchain))); /* Now lets copy it out */ if (len >= sizeofcpy) { /* It all fits, copy it in */ m_copydata(clonechain, 0, sizeofcpy, cp); SCTP_BUF_LEN((*endofchain)) += sizeofcpy; } else { /* fill up the end of the chain */ if (len > 0) { m_copydata(clonechain, 0, len, cp); SCTP_BUF_LEN((*endofchain)) += len; /* now we need another one */ sizeofcpy -= len; } m = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_NOWAIT, 1, MT_HEADER); if (m == NULL) { /* We failed */ goto error_out; } SCTP_BUF_NEXT((*endofchain)) = m; *endofchain = m; cp = mtod((*endofchain), caddr_t); m_copydata(clonechain, len, sizeofcpy, cp); SCTP_BUF_LEN((*endofchain)) += sizeofcpy; } return (outchain); } else { /* copy the old fashion way */ appendchain = SCTP_M_COPYM(clonechain, 0, M_COPYALL, M_NOWAIT); #ifdef SCTP_MBUF_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_MBUF_LOGGING_ENABLE) { sctp_log_mbc(appendchain, SCTP_MBUF_ICOPY); } #endif } } if (appendchain == NULL) { /* error */ if (outchain) sctp_m_freem(outchain); return (NULL); } if (outchain) { /* tack on to the end */ if (*endofchain != NULL) { SCTP_BUF_NEXT(((*endofchain))) = appendchain; } else { m = outchain; while (m) { if (SCTP_BUF_NEXT(m) == NULL) { SCTP_BUF_NEXT(m) = appendchain; break; } m = SCTP_BUF_NEXT(m); } } /* * save off the end and update the end-chain position */ m = appendchain; while (m) { if (SCTP_BUF_NEXT(m) == NULL) { *endofchain = m; break; } m = SCTP_BUF_NEXT(m); } return (outchain); } else { /* save off the end and update the end-chain position */ m = appendchain; while (m) { if (SCTP_BUF_NEXT(m) == NULL) { *endofchain = m; break; } m = SCTP_BUF_NEXT(m); } return (appendchain); } } static int sctp_med_chunk_output(struct sctp_inpcb *inp, struct sctp_tcb *stcb, struct sctp_association *asoc, int *num_out, int *reason_code, int control_only, int from_where, struct timeval *now, int *now_filled, int frag_point, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ); static void sctp_sendall_iterator(struct sctp_inpcb *inp, struct sctp_tcb *stcb, void *ptr, uint32_t val SCTP_UNUSED) { struct sctp_copy_all *ca; struct mbuf *m; int ret = 0; int added_control = 0; int un_sent, do_chunk_output = 1; struct sctp_association *asoc; struct sctp_nets *net; ca = (struct sctp_copy_all *)ptr; if (ca->m == NULL) { return; } if (ca->inp != inp) { /* TSNH */ return; } if (ca->sndlen > 0) { m = SCTP_M_COPYM(ca->m, 0, M_COPYALL, M_NOWAIT); if (m == NULL) { /* can't copy so we are done */ ca->cnt_failed++; return; } #ifdef SCTP_MBUF_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_MBUF_LOGGING_ENABLE) { sctp_log_mbc(m, SCTP_MBUF_ICOPY); } #endif } else { m = NULL; } SCTP_TCB_LOCK_ASSERT(stcb); if (stcb->asoc.alternate) { net = stcb->asoc.alternate; } else { net = stcb->asoc.primary_destination; } if (ca->sndrcv.sinfo_flags & SCTP_ABORT) { /* Abort this assoc with m as the user defined reason */ if (m != NULL) { SCTP_BUF_PREPEND(m, sizeof(struct sctp_paramhdr), M_NOWAIT); } else { m = sctp_get_mbuf_for_msg(sizeof(struct sctp_paramhdr), 0, M_NOWAIT, 1, MT_DATA); SCTP_BUF_LEN(m) = sizeof(struct sctp_paramhdr); } if (m != NULL) { struct sctp_paramhdr *ph; ph = mtod(m, struct sctp_paramhdr *); ph->param_type = htons(SCTP_CAUSE_USER_INITIATED_ABT); ph->param_length = htons((uint16_t) (sizeof(struct sctp_paramhdr) + ca->sndlen)); } /* * We add one here to keep the assoc from dis-appearing on * us. */ atomic_add_int(&stcb->asoc.refcnt, 1); sctp_abort_an_association(inp, stcb, m, SCTP_SO_NOT_LOCKED); /* * sctp_abort_an_association calls sctp_free_asoc() free * association will NOT free it since we incremented the * refcnt .. we do this to prevent it being freed and things * getting tricky since we could end up (from free_asoc) * calling inpcb_free which would get a recursive lock call * to the iterator lock.. But as a consequence of that the * stcb will return to us un-locked.. since free_asoc * returns with either no TCB or the TCB unlocked, we must * relock.. to unlock in the iterator timer :-0 */ SCTP_TCB_LOCK(stcb); atomic_add_int(&stcb->asoc.refcnt, -1); goto no_chunk_output; } else { if (m) { ret = sctp_msg_append(stcb, net, m, &ca->sndrcv, 1); } asoc = &stcb->asoc; if (ca->sndrcv.sinfo_flags & SCTP_EOF) { /* shutdown this assoc */ int cnt; cnt = sctp_is_there_unsent_data(stcb, SCTP_SO_NOT_LOCKED); if (TAILQ_EMPTY(&asoc->send_queue) && TAILQ_EMPTY(&asoc->sent_queue) && (cnt == 0)) { if (asoc->locked_on_sending) { goto abort_anyway; } /* * there is nothing queued to send, so I'm * done... */ if ((SCTP_GET_STATE(asoc) != SCTP_STATE_SHUTDOWN_SENT) && (SCTP_GET_STATE(asoc) != SCTP_STATE_SHUTDOWN_RECEIVED) && (SCTP_GET_STATE(asoc) != SCTP_STATE_SHUTDOWN_ACK_SENT)) { /* * only send SHUTDOWN the first time * through */ if (SCTP_GET_STATE(asoc) == SCTP_STATE_OPEN) { SCTP_STAT_DECR_GAUGE32(sctps_currestab); } SCTP_SET_STATE(asoc, SCTP_STATE_SHUTDOWN_SENT); SCTP_CLEAR_SUBSTATE(asoc, SCTP_STATE_SHUTDOWN_PENDING); sctp_stop_timers_for_shutdown(stcb); sctp_send_shutdown(stcb, net); sctp_timer_start(SCTP_TIMER_TYPE_SHUTDOWN, stcb->sctp_ep, stcb, net); sctp_timer_start(SCTP_TIMER_TYPE_SHUTDOWNGUARD, stcb->sctp_ep, stcb, asoc->primary_destination); added_control = 1; do_chunk_output = 0; } } else { /* * we still got (or just got) data to send, * so set SHUTDOWN_PENDING */ /* * XXX sockets draft says that SCTP_EOF * should be sent with no data. currently, * we will allow user data to be sent first * and move to SHUTDOWN-PENDING */ if ((SCTP_GET_STATE(asoc) != SCTP_STATE_SHUTDOWN_SENT) && (SCTP_GET_STATE(asoc) != SCTP_STATE_SHUTDOWN_RECEIVED) && (SCTP_GET_STATE(asoc) != SCTP_STATE_SHUTDOWN_ACK_SENT)) { if (asoc->locked_on_sending) { /* * Locked to send out the * data */ struct sctp_stream_queue_pending *sp; sp = TAILQ_LAST(&asoc->locked_on_sending->outqueue, sctp_streamhead); if (sp) { if ((sp->length == 0) && (sp->msg_is_complete == 0)) asoc->state |= SCTP_STATE_PARTIAL_MSG_LEFT; } } asoc->state |= SCTP_STATE_SHUTDOWN_PENDING; if (TAILQ_EMPTY(&asoc->send_queue) && TAILQ_EMPTY(&asoc->sent_queue) && (asoc->state & SCTP_STATE_PARTIAL_MSG_LEFT)) { struct mbuf *op_err; char msg[SCTP_DIAG_INFO_LEN]; abort_anyway: snprintf(msg, sizeof(msg), "%s:%d at %s", __FILE__, __LINE__, __func__); op_err = sctp_generate_cause(SCTP_BASE_SYSCTL(sctp_diag_info_code), msg); atomic_add_int(&stcb->asoc.refcnt, 1); sctp_abort_an_association(stcb->sctp_ep, stcb, op_err, SCTP_SO_NOT_LOCKED); atomic_add_int(&stcb->asoc.refcnt, -1); goto no_chunk_output; } sctp_timer_start(SCTP_TIMER_TYPE_SHUTDOWNGUARD, stcb->sctp_ep, stcb, asoc->primary_destination); } } } } un_sent = ((stcb->asoc.total_output_queue_size - stcb->asoc.total_flight) + (stcb->asoc.stream_queue_cnt * sizeof(struct sctp_data_chunk))); if ((sctp_is_feature_off(inp, SCTP_PCB_FLAGS_NODELAY)) && (stcb->asoc.total_flight > 0) && (un_sent < (int)(stcb->asoc.smallest_mtu - SCTP_MIN_OVERHEAD))) { do_chunk_output = 0; } if (do_chunk_output) sctp_chunk_output(inp, stcb, SCTP_OUTPUT_FROM_USR_SEND, SCTP_SO_NOT_LOCKED); else if (added_control) { int num_out, reason, now_filled = 0; struct timeval now; int frag_point; frag_point = sctp_get_frag_point(stcb, &stcb->asoc); (void)sctp_med_chunk_output(inp, stcb, &stcb->asoc, &num_out, &reason, 1, 1, &now, &now_filled, frag_point, SCTP_SO_NOT_LOCKED); } no_chunk_output: if (ret) { ca->cnt_failed++; } else { ca->cnt_sent++; } } static void sctp_sendall_completes(void *ptr, uint32_t val SCTP_UNUSED) { struct sctp_copy_all *ca; ca = (struct sctp_copy_all *)ptr; /* * Do a notify here? Kacheong suggests that the notify be done at * the send time.. so you would push up a notification if any send * failed. Don't know if this is feasible since the only failures we * have is "memory" related and if you cannot get an mbuf to send * the data you surely can't get an mbuf to send up to notify the * user you can't send the data :-> */ /* now free everything */ sctp_m_freem(ca->m); SCTP_FREE(ca, SCTP_M_COPYAL); } static struct mbuf * sctp_copy_out_all(struct uio *uio, int len) { struct mbuf *ret, *at; int left, willcpy, cancpy, error; ret = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_WAITOK, 1, MT_DATA); if (ret == NULL) { /* TSNH */ return (NULL); } left = len; SCTP_BUF_LEN(ret) = 0; /* save space for the data chunk header */ cancpy = (int)M_TRAILINGSPACE(ret); willcpy = min(cancpy, left); at = ret; while (left > 0) { /* Align data to the end */ error = uiomove(mtod(at, caddr_t), willcpy, uio); if (error) { err_out_now: sctp_m_freem(at); return (NULL); } SCTP_BUF_LEN(at) = willcpy; SCTP_BUF_NEXT_PKT(at) = SCTP_BUF_NEXT(at) = 0; left -= willcpy; if (left > 0) { SCTP_BUF_NEXT(at) = sctp_get_mbuf_for_msg(left, 0, M_WAITOK, 1, MT_DATA); if (SCTP_BUF_NEXT(at) == NULL) { goto err_out_now; } at = SCTP_BUF_NEXT(at); SCTP_BUF_LEN(at) = 0; cancpy = (int)M_TRAILINGSPACE(at); willcpy = min(cancpy, left); } } return (ret); } static int sctp_sendall(struct sctp_inpcb *inp, struct uio *uio, struct mbuf *m, struct sctp_sndrcvinfo *srcv) { int ret; struct sctp_copy_all *ca; SCTP_MALLOC(ca, struct sctp_copy_all *, sizeof(struct sctp_copy_all), SCTP_M_COPYAL); if (ca == NULL) { sctp_m_freem(m); SCTP_LTRACE_ERR_RET(inp, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } memset(ca, 0, sizeof(struct sctp_copy_all)); ca->inp = inp; if (srcv) { memcpy(&ca->sndrcv, srcv, sizeof(struct sctp_nonpad_sndrcvinfo)); } /* * take off the sendall flag, it would be bad if we failed to do * this :-0 */ ca->sndrcv.sinfo_flags &= ~SCTP_SENDALL; /* get length and mbuf chain */ if (uio) { ca->sndlen = (int)uio->uio_resid; ca->m = sctp_copy_out_all(uio, ca->sndlen); if (ca->m == NULL) { SCTP_FREE(ca, SCTP_M_COPYAL); SCTP_LTRACE_ERR_RET(inp, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } } else { /* Gather the length of the send */ struct mbuf *mat; ca->sndlen = 0; for (mat = m; mat; mat = SCTP_BUF_NEXT(mat)) { ca->sndlen += SCTP_BUF_LEN(mat); } } ret = sctp_initiate_iterator(NULL, sctp_sendall_iterator, NULL, SCTP_PCB_ANY_FLAGS, SCTP_PCB_ANY_FEATURES, SCTP_ASOC_ANY_STATE, (void *)ca, 0, sctp_sendall_completes, inp, 1); if (ret) { SCTP_PRINTF("Failed to initiate iterator for sendall\n"); SCTP_FREE(ca, SCTP_M_COPYAL); SCTP_LTRACE_ERR_RET_PKT(m, inp, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, EFAULT); return (EFAULT); } return (0); } void sctp_toss_old_cookies(struct sctp_tcb *stcb, struct sctp_association *asoc) { struct sctp_tmit_chunk *chk, *nchk; TAILQ_FOREACH_SAFE(chk, &asoc->control_send_queue, sctp_next, nchk) { if (chk->rec.chunk_id.id == SCTP_COOKIE_ECHO) { TAILQ_REMOVE(&asoc->control_send_queue, chk, sctp_next); if (chk->data) { sctp_m_freem(chk->data); chk->data = NULL; } asoc->ctrl_queue_cnt--; sctp_free_a_chunk(stcb, chk, SCTP_SO_NOT_LOCKED); } } } void sctp_toss_old_asconf(struct sctp_tcb *stcb) { struct sctp_association *asoc; struct sctp_tmit_chunk *chk, *nchk; struct sctp_asconf_chunk *acp; asoc = &stcb->asoc; TAILQ_FOREACH_SAFE(chk, &asoc->asconf_send_queue, sctp_next, nchk) { /* find SCTP_ASCONF chunk in queue */ if (chk->rec.chunk_id.id == SCTP_ASCONF) { if (chk->data) { acp = mtod(chk->data, struct sctp_asconf_chunk *); if (SCTP_TSN_GT(ntohl(acp->serial_number), asoc->asconf_seq_out_acked)) { /* Not Acked yet */ break; } } TAILQ_REMOVE(&asoc->asconf_send_queue, chk, sctp_next); if (chk->data) { sctp_m_freem(chk->data); chk->data = NULL; } asoc->ctrl_queue_cnt--; sctp_free_a_chunk(stcb, chk, SCTP_SO_NOT_LOCKED); } } } static void sctp_clean_up_datalist(struct sctp_tcb *stcb, struct sctp_association *asoc, struct sctp_tmit_chunk **data_list, int bundle_at, struct sctp_nets *net) { int i; struct sctp_tmit_chunk *tp1; for (i = 0; i < bundle_at; i++) { /* off of the send queue */ TAILQ_REMOVE(&asoc->send_queue, data_list[i], sctp_next); asoc->send_queue_cnt--; if (i > 0) { /* * Any chunk NOT 0 you zap the time chunk 0 gets * zapped or set based on if a RTO measurment is * needed. */ data_list[i]->do_rtt = 0; } /* record time */ data_list[i]->sent_rcv_time = net->last_sent_time; data_list[i]->rec.data.cwnd_at_send = net->cwnd; data_list[i]->rec.data.fast_retran_tsn = data_list[i]->rec.data.TSN_seq; if (data_list[i]->whoTo == NULL) { data_list[i]->whoTo = net; atomic_add_int(&net->ref_count, 1); } /* on to the sent queue */ tp1 = TAILQ_LAST(&asoc->sent_queue, sctpchunk_listhead); if ((tp1) && SCTP_TSN_GT(tp1->rec.data.TSN_seq, data_list[i]->rec.data.TSN_seq)) { struct sctp_tmit_chunk *tpp; /* need to move back */ back_up_more: tpp = TAILQ_PREV(tp1, sctpchunk_listhead, sctp_next); if (tpp == NULL) { TAILQ_INSERT_BEFORE(tp1, data_list[i], sctp_next); goto all_done; } tp1 = tpp; if (SCTP_TSN_GT(tp1->rec.data.TSN_seq, data_list[i]->rec.data.TSN_seq)) { goto back_up_more; } TAILQ_INSERT_AFTER(&asoc->sent_queue, tp1, data_list[i], sctp_next); } else { TAILQ_INSERT_TAIL(&asoc->sent_queue, data_list[i], sctp_next); } all_done: /* This does not lower until the cum-ack passes it */ asoc->sent_queue_cnt++; if ((asoc->peers_rwnd <= 0) && (asoc->total_flight == 0) && (bundle_at == 1)) { /* Mark the chunk as being a window probe */ SCTP_STAT_INCR(sctps_windowprobed); } #ifdef SCTP_AUDITING_ENABLED sctp_audit_log(0xC2, 3); #endif data_list[i]->sent = SCTP_DATAGRAM_SENT; data_list[i]->snd_count = 1; data_list[i]->rec.data.chunk_was_revoked = 0; if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_FLIGHT_LOGGING_ENABLE) { sctp_misc_ints(SCTP_FLIGHT_LOG_UP, data_list[i]->whoTo->flight_size, data_list[i]->book_size, (uint32_t) (uintptr_t) data_list[i]->whoTo, data_list[i]->rec.data.TSN_seq); } sctp_flight_size_increase(data_list[i]); sctp_total_flight_increase(stcb, data_list[i]); if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_RWND_ENABLE) { sctp_log_rwnd(SCTP_DECREASE_PEER_RWND, asoc->peers_rwnd, data_list[i]->send_size, SCTP_BASE_SYSCTL(sctp_peer_chunk_oh)); } asoc->peers_rwnd = sctp_sbspace_sub(asoc->peers_rwnd, (uint32_t) (data_list[i]->send_size + SCTP_BASE_SYSCTL(sctp_peer_chunk_oh))); if (asoc->peers_rwnd < stcb->sctp_ep->sctp_ep.sctp_sws_sender) { /* SWS sender side engages */ asoc->peers_rwnd = 0; } } if (asoc->cc_functions.sctp_cwnd_update_packet_transmitted) { (*asoc->cc_functions.sctp_cwnd_update_packet_transmitted) (stcb, net); } } static void sctp_clean_up_ctl(struct sctp_tcb *stcb, struct sctp_association *asoc, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { struct sctp_tmit_chunk *chk, *nchk; TAILQ_FOREACH_SAFE(chk, &asoc->control_send_queue, sctp_next, nchk) { if ((chk->rec.chunk_id.id == SCTP_SELECTIVE_ACK) || (chk->rec.chunk_id.id == SCTP_NR_SELECTIVE_ACK) || /* EY */ (chk->rec.chunk_id.id == SCTP_HEARTBEAT_REQUEST) || (chk->rec.chunk_id.id == SCTP_HEARTBEAT_ACK) || (chk->rec.chunk_id.id == SCTP_FORWARD_CUM_TSN) || (chk->rec.chunk_id.id == SCTP_SHUTDOWN) || (chk->rec.chunk_id.id == SCTP_SHUTDOWN_ACK) || (chk->rec.chunk_id.id == SCTP_OPERATION_ERROR) || (chk->rec.chunk_id.id == SCTP_PACKET_DROPPED) || (chk->rec.chunk_id.id == SCTP_COOKIE_ACK) || (chk->rec.chunk_id.id == SCTP_ECN_CWR) || (chk->rec.chunk_id.id == SCTP_ASCONF_ACK)) { /* Stray chunks must be cleaned up */ clean_up_anyway: TAILQ_REMOVE(&asoc->control_send_queue, chk, sctp_next); if (chk->data) { sctp_m_freem(chk->data); chk->data = NULL; } asoc->ctrl_queue_cnt--; if (chk->rec.chunk_id.id == SCTP_FORWARD_CUM_TSN) asoc->fwd_tsn_cnt--; sctp_free_a_chunk(stcb, chk, so_locked); } else if (chk->rec.chunk_id.id == SCTP_STREAM_RESET) { /* special handling, we must look into the param */ if (chk != asoc->str_reset) { goto clean_up_anyway; } } } } static int sctp_can_we_split_this(struct sctp_tcb *stcb, uint32_t length, uint32_t goal_mtu, uint32_t frag_point, int eeor_on) { /* * Make a decision on if I should split a msg into multiple parts. * This is only asked of incomplete messages. */ if (eeor_on) { /* * If we are doing EEOR we need to always send it if its the * entire thing, since it might be all the guy is putting in * the hopper. */ if (goal_mtu >= length) { /*- * If we have data outstanding, * we get another chance when the sack * arrives to transmit - wait for more data */ if (stcb->asoc.total_flight == 0) { /* * If nothing is in flight, we zero the * packet counter. */ return (length); } return (0); } else { /* You can fill the rest */ return (goal_mtu); } } /*- * For those strange folk that make the send buffer * smaller than our fragmentation point, we can't * get a full msg in so we have to allow splitting. */ if (SCTP_SB_LIMIT_SND(stcb->sctp_socket) < frag_point) { return (length); } if ((length <= goal_mtu) || ((length - goal_mtu) < SCTP_BASE_SYSCTL(sctp_min_residual))) { /* Sub-optimial residual don't split in non-eeor mode. */ return (0); } /* * If we reach here length is larger than the goal_mtu. Do we wish * to split it for the sake of packet putting together? */ if (goal_mtu >= min(SCTP_BASE_SYSCTL(sctp_min_split_point), frag_point)) { /* Its ok to split it */ return (min(goal_mtu, frag_point)); } /* Nope, can't split */ return (0); } static uint32_t sctp_move_to_outqueue(struct sctp_tcb *stcb, struct sctp_stream_out *strq, uint32_t goal_mtu, uint32_t frag_point, int *locked, int *giveup, int eeor_mode, int *bail, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { /* Move from the stream to the send_queue keeping track of the total */ struct sctp_association *asoc; struct sctp_stream_queue_pending *sp; struct sctp_tmit_chunk *chk; struct sctp_data_chunk *dchkh = NULL; struct sctp_idata_chunk *ndchkh = NULL; uint32_t to_move, length; int leading; uint8_t rcv_flags = 0; uint8_t some_taken; uint8_t send_lock_up = 0; SCTP_TCB_LOCK_ASSERT(stcb); asoc = &stcb->asoc; one_more_time: /* sa_ignore FREED_MEMORY */ *locked = 0; sp = TAILQ_FIRST(&strq->outqueue); if (sp == NULL) { *locked = 0; if (send_lock_up == 0) { SCTP_TCB_SEND_LOCK(stcb); send_lock_up = 1; } sp = TAILQ_FIRST(&strq->outqueue); if (sp) { goto one_more_time; } if ((sctp_is_feature_on(stcb->sctp_ep, SCTP_PCB_FLAGS_EXPLICIT_EOR) == 0) && (stcb->asoc.idata_supported == 0) && (strq->last_msg_incomplete)) { SCTP_PRINTF("Huh? Stream:%d lm_in_c=%d but queue is NULL\n", strq->stream_no, strq->last_msg_incomplete); strq->last_msg_incomplete = 0; } to_move = 0; if (send_lock_up) { SCTP_TCB_SEND_UNLOCK(stcb); send_lock_up = 0; } goto out_of; } if ((sp->msg_is_complete) && (sp->length == 0)) { if (sp->sender_all_done) { /* * We are doing differed cleanup. Last time through * when we took all the data the sender_all_done was * not set. */ if ((sp->put_last_out == 0) && (sp->discard_rest == 0)) { SCTP_PRINTF("Gak, put out entire msg with NO end!-1\n"); SCTP_PRINTF("sender_done:%d len:%d msg_comp:%d put_last_out:%d send_lock:%d\n", sp->sender_all_done, sp->length, sp->msg_is_complete, sp->put_last_out, send_lock_up); } if ((TAILQ_NEXT(sp, next) == NULL) && (send_lock_up == 0)) { SCTP_TCB_SEND_LOCK(stcb); send_lock_up = 1; } atomic_subtract_int(&asoc->stream_queue_cnt, 1); TAILQ_REMOVE(&strq->outqueue, sp, next); if ((strq->state == SCTP_STREAM_RESET_PENDING) && (strq->chunks_on_queues == 0) && TAILQ_EMPTY(&strq->outqueue)) { stcb->asoc.trigger_reset = 1; } stcb->asoc.ss_functions.sctp_ss_remove_from_stream(stcb, asoc, strq, sp, send_lock_up); if (sp->net) { sctp_free_remote_addr(sp->net); sp->net = NULL; } if (sp->data) { sctp_m_freem(sp->data); sp->data = NULL; } sctp_free_a_strmoq(stcb, sp, so_locked); /* we can't be locked to it */ *locked = 0; stcb->asoc.locked_on_sending = NULL; if (send_lock_up) { SCTP_TCB_SEND_UNLOCK(stcb); send_lock_up = 0; } /* back to get the next msg */ goto one_more_time; } else { /* * sender just finished this but still holds a * reference */ if (stcb->asoc.idata_supported == 0) *locked = 1; *giveup = 1; to_move = 0; goto out_of; } } else { /* is there some to get */ if (sp->length == 0) { /* no */ if (stcb->asoc.idata_supported == 0) *locked = 1; *giveup = 1; to_move = 0; goto out_of; } else if (sp->discard_rest) { if (send_lock_up == 0) { SCTP_TCB_SEND_LOCK(stcb); send_lock_up = 1; } /* Whack down the size */ atomic_subtract_int(&stcb->asoc.total_output_queue_size, sp->length); if ((stcb->sctp_socket != NULL) && ((stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_TCPTYPE) || (stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_IN_TCPPOOL))) { atomic_subtract_int(&stcb->sctp_socket->so_snd.sb_cc, sp->length); } if (sp->data) { sctp_m_freem(sp->data); sp->data = NULL; sp->tail_mbuf = NULL; } sp->length = 0; sp->some_taken = 1; if (stcb->asoc.idata_supported == 0) *locked = 1; *giveup = 1; to_move = 0; goto out_of; } } some_taken = sp->some_taken; re_look: length = sp->length; if (sp->msg_is_complete) { /* The message is complete */ to_move = min(length, frag_point); if (to_move == length) { /* All of it fits in the MTU */ if (sp->some_taken) { rcv_flags |= SCTP_DATA_LAST_FRAG; } else { rcv_flags |= SCTP_DATA_NOT_FRAG; } sp->put_last_out = 1; if (sp->sinfo_flags & SCTP_SACK_IMMEDIATELY) { rcv_flags |= SCTP_DATA_SACK_IMMEDIATELY; } } else { /* Not all of it fits, we fragment */ if (sp->some_taken == 0) { rcv_flags |= SCTP_DATA_FIRST_FRAG; } sp->some_taken = 1; } } else { to_move = sctp_can_we_split_this(stcb, length, goal_mtu, frag_point, eeor_mode); if (to_move) { /*- * We use a snapshot of length in case it * is expanding during the compare. */ uint32_t llen; llen = length; if (to_move >= llen) { to_move = llen; if (send_lock_up == 0) { /*- * We are taking all of an incomplete msg * thus we need a send lock. */ SCTP_TCB_SEND_LOCK(stcb); send_lock_up = 1; if (sp->msg_is_complete) { /* * the sender finished the * msg */ goto re_look; } } } if (sp->some_taken == 0) { rcv_flags |= SCTP_DATA_FIRST_FRAG; sp->some_taken = 1; } } else { /* Nothing to take. */ if ((sp->some_taken) && (stcb->asoc.idata_supported == 0)) { *locked = 1; } *giveup = 1; to_move = 0; goto out_of; } } /* If we reach here, we can copy out a chunk */ sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* No chunk memory */ *giveup = 1; to_move = 0; goto out_of; } /* * Setup for unordered if needed by looking at the user sent info * flags. */ if (sp->sinfo_flags & SCTP_UNORDERED) { rcv_flags |= SCTP_DATA_UNORDERED; } if (SCTP_BASE_SYSCTL(sctp_enable_sack_immediately) && (sp->sinfo_flags & SCTP_EOF) == SCTP_EOF) { rcv_flags |= SCTP_DATA_SACK_IMMEDIATELY; } /* clear out the chunk before setting up */ memset(chk, 0, sizeof(*chk)); chk->rec.data.rcv_flags = rcv_flags; if (to_move >= length) { /* we think we can steal the whole thing */ if ((sp->sender_all_done == 0) && (send_lock_up == 0)) { SCTP_TCB_SEND_LOCK(stcb); send_lock_up = 1; } if (to_move < sp->length) { /* bail, it changed */ goto dont_do_it; } chk->data = sp->data; chk->last_mbuf = sp->tail_mbuf; /* register the stealing */ sp->data = sp->tail_mbuf = NULL; } else { struct mbuf *m; dont_do_it: chk->data = SCTP_M_COPYM(sp->data, 0, to_move, M_NOWAIT); chk->last_mbuf = NULL; if (chk->data == NULL) { sp->some_taken = some_taken; sctp_free_a_chunk(stcb, chk, so_locked); *bail = 1; to_move = 0; goto out_of; } #ifdef SCTP_MBUF_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_MBUF_LOGGING_ENABLE) { sctp_log_mbc(chk->data, SCTP_MBUF_ICOPY); } #endif /* Pull off the data */ m_adj(sp->data, to_move); /* Now lets work our way down and compact it */ m = sp->data; while (m && (SCTP_BUF_LEN(m) == 0)) { sp->data = SCTP_BUF_NEXT(m); SCTP_BUF_NEXT(m) = NULL; if (sp->tail_mbuf == m) { /*- * Freeing tail? TSNH since * we supposedly were taking less * than the sp->length. */ #ifdef INVARIANTS panic("Huh, freing tail? - TSNH"); #else SCTP_PRINTF("Huh, freeing tail? - TSNH\n"); sp->tail_mbuf = sp->data = NULL; sp->length = 0; #endif } sctp_m_free(m); m = sp->data; } } if (SCTP_BUF_IS_EXTENDED(chk->data)) { chk->copy_by_ref = 1; } else { chk->copy_by_ref = 0; } /* * get last_mbuf and counts of mb usage This is ugly but hopefully * its only one mbuf. */ if (chk->last_mbuf == NULL) { chk->last_mbuf = chk->data; while (SCTP_BUF_NEXT(chk->last_mbuf) != NULL) { chk->last_mbuf = SCTP_BUF_NEXT(chk->last_mbuf); } } if (to_move > length) { /*- This should not happen either * since we always lower to_move to the size * of sp->length if its larger. */ #ifdef INVARIANTS panic("Huh, how can to_move be larger?"); #else SCTP_PRINTF("Huh, how can to_move be larger?\n"); sp->length = 0; #endif } else { atomic_subtract_int(&sp->length, to_move); } if (stcb->asoc.idata_supported == 0) { leading = sizeof(struct sctp_data_chunk); } else { leading = sizeof(struct sctp_idata_chunk); } if (M_LEADINGSPACE(chk->data) < leading) { /* Not enough room for a chunk header, get some */ struct mbuf *m; m = sctp_get_mbuf_for_msg(1, 0, M_NOWAIT, 0, MT_DATA); if (m == NULL) { /* * we're in trouble here. _PREPEND below will free * all the data if there is no leading space, so we * must put the data back and restore. */ if (send_lock_up == 0) { SCTP_TCB_SEND_LOCK(stcb); send_lock_up = 1; } if (sp->data == NULL) { /* unsteal the data */ sp->data = chk->data; sp->tail_mbuf = chk->last_mbuf; } else { struct mbuf *m_tmp; /* reassemble the data */ m_tmp = sp->data; sp->data = chk->data; SCTP_BUF_NEXT(chk->last_mbuf) = m_tmp; } sp->some_taken = some_taken; atomic_add_int(&sp->length, to_move); chk->data = NULL; *bail = 1; sctp_free_a_chunk(stcb, chk, so_locked); to_move = 0; goto out_of; } else { SCTP_BUF_LEN(m) = 0; SCTP_BUF_NEXT(m) = chk->data; chk->data = m; M_ALIGN(chk->data, 4); } } if (stcb->asoc.idata_supported == 0) { SCTP_BUF_PREPEND(chk->data, sizeof(struct sctp_data_chunk), M_NOWAIT); } else { SCTP_BUF_PREPEND(chk->data, sizeof(struct sctp_idata_chunk), M_NOWAIT); } if (chk->data == NULL) { /* HELP, TSNH since we assured it would not above? */ #ifdef INVARIANTS panic("prepend failes HELP?"); #else SCTP_PRINTF("prepend fails HELP?\n"); sctp_free_a_chunk(stcb, chk, so_locked); #endif *bail = 1; to_move = 0; goto out_of; } if (stcb->asoc.idata_supported == 0) { sctp_snd_sb_alloc(stcb, sizeof(struct sctp_data_chunk)); chk->book_size = chk->send_size = (uint16_t) (to_move + sizeof(struct sctp_data_chunk)); } else { sctp_snd_sb_alloc(stcb, sizeof(struct sctp_idata_chunk)); chk->book_size = chk->send_size = (uint16_t) (to_move + sizeof(struct sctp_idata_chunk)); } chk->book_size_scale = 0; chk->sent = SCTP_DATAGRAM_UNSENT; chk->flags = 0; chk->asoc = &stcb->asoc; chk->pad_inplace = 0; chk->no_fr_allowed = 0; if (stcb->asoc.idata_supported == 0) { if (rcv_flags & SCTP_DATA_UNORDERED) { /* Just use 0. The receiver ignores the values. */ chk->rec.data.stream_seq = 0; } else { chk->rec.data.stream_seq = strq->next_mid_ordered; if (rcv_flags & SCTP_DATA_LAST_FRAG) { strq->next_mid_ordered++; } } } else { if (rcv_flags & SCTP_DATA_UNORDERED) { chk->rec.data.stream_seq = strq->next_mid_unordered; if (rcv_flags & SCTP_DATA_LAST_FRAG) { strq->next_mid_unordered++; } } else { chk->rec.data.stream_seq = strq->next_mid_ordered; if (rcv_flags & SCTP_DATA_LAST_FRAG) { strq->next_mid_ordered++; } } } chk->rec.data.stream_number = sp->stream; chk->rec.data.payloadtype = sp->ppid; chk->rec.data.context = sp->context; chk->rec.data.doing_fast_retransmit = 0; chk->rec.data.timetodrop = sp->ts; chk->flags = sp->act_flags; if (sp->net) { chk->whoTo = sp->net; atomic_add_int(&chk->whoTo->ref_count, 1); } else chk->whoTo = NULL; if (sp->holds_key_ref) { chk->auth_keyid = sp->auth_keyid; sctp_auth_key_acquire(stcb, chk->auth_keyid); chk->holds_key_ref = 1; } chk->rec.data.TSN_seq = atomic_fetchadd_int(&asoc->sending_seq, 1); if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_AT_SEND_2_OUTQ) { sctp_misc_ints(SCTP_STRMOUT_LOG_SEND, (uint32_t) (uintptr_t) stcb, sp->length, (uint32_t) ((chk->rec.data.stream_number << 16) | chk->rec.data.stream_seq), chk->rec.data.TSN_seq); } if (stcb->asoc.idata_supported == 0) { dchkh = mtod(chk->data, struct sctp_data_chunk *); } else { ndchkh = mtod(chk->data, struct sctp_idata_chunk *); } /* * Put the rest of the things in place now. Size was done earlier in * previous loop prior to padding. */ #ifdef SCTP_ASOCLOG_OF_TSNS SCTP_TCB_LOCK_ASSERT(stcb); if (asoc->tsn_out_at >= SCTP_TSN_LOG_SIZE) { asoc->tsn_out_at = 0; asoc->tsn_out_wrapped = 1; } asoc->out_tsnlog[asoc->tsn_out_at].tsn = chk->rec.data.TSN_seq; asoc->out_tsnlog[asoc->tsn_out_at].strm = chk->rec.data.stream_number; asoc->out_tsnlog[asoc->tsn_out_at].seq = chk->rec.data.stream_seq; asoc->out_tsnlog[asoc->tsn_out_at].sz = chk->send_size; asoc->out_tsnlog[asoc->tsn_out_at].flgs = chk->rec.data.rcv_flags; asoc->out_tsnlog[asoc->tsn_out_at].stcb = (void *)stcb; asoc->out_tsnlog[asoc->tsn_out_at].in_pos = asoc->tsn_out_at; asoc->out_tsnlog[asoc->tsn_out_at].in_out = 2; asoc->tsn_out_at++; #endif if (stcb->asoc.idata_supported == 0) { dchkh->ch.chunk_type = SCTP_DATA; dchkh->ch.chunk_flags = chk->rec.data.rcv_flags; dchkh->dp.tsn = htonl(chk->rec.data.TSN_seq); dchkh->dp.stream_id = htons((strq->stream_no & 0x0000ffff)); dchkh->dp.stream_sequence = htons((uint16_t) chk->rec.data.stream_seq); dchkh->dp.protocol_id = chk->rec.data.payloadtype; dchkh->ch.chunk_length = htons(chk->send_size); } else { ndchkh->ch.chunk_type = SCTP_IDATA; ndchkh->ch.chunk_flags = chk->rec.data.rcv_flags; ndchkh->dp.tsn = htonl(chk->rec.data.TSN_seq); ndchkh->dp.stream_id = htons(strq->stream_no); ndchkh->dp.reserved = htons(0); ndchkh->dp.msg_id = htonl(chk->rec.data.stream_seq); if (sp->fsn == 0) ndchkh->dp.ppid_fsn.protocol_id = chk->rec.data.payloadtype; else ndchkh->dp.ppid_fsn.fsn = htonl(sp->fsn); sp->fsn++; ndchkh->ch.chunk_length = htons(chk->send_size); } /* Now advance the chk->send_size by the actual pad needed. */ if (chk->send_size < SCTP_SIZE32(chk->book_size)) { /* need a pad */ struct mbuf *lm; int pads; pads = SCTP_SIZE32(chk->book_size) - chk->send_size; lm = sctp_pad_lastmbuf(chk->data, pads, chk->last_mbuf); if (lm != NULL) { chk->last_mbuf = lm; chk->pad_inplace = 1; } chk->send_size += pads; } if (PR_SCTP_ENABLED(chk->flags)) { asoc->pr_sctp_cnt++; } if (sp->msg_is_complete && (sp->length == 0) && (sp->sender_all_done)) { /* All done pull and kill the message */ atomic_subtract_int(&asoc->stream_queue_cnt, 1); if (sp->put_last_out == 0) { SCTP_PRINTF("Gak, put out entire msg with NO end!-2\n"); SCTP_PRINTF("sender_done:%d len:%d msg_comp:%d put_last_out:%d send_lock:%d\n", sp->sender_all_done, sp->length, sp->msg_is_complete, sp->put_last_out, send_lock_up); } if ((send_lock_up == 0) && (TAILQ_NEXT(sp, next) == NULL)) { SCTP_TCB_SEND_LOCK(stcb); send_lock_up = 1; } TAILQ_REMOVE(&strq->outqueue, sp, next); if ((strq->state == SCTP_STREAM_RESET_PENDING) && (strq->chunks_on_queues == 0) && TAILQ_EMPTY(&strq->outqueue)) { stcb->asoc.trigger_reset = 1; } stcb->asoc.ss_functions.sctp_ss_remove_from_stream(stcb, asoc, strq, sp, send_lock_up); if (sp->net) { sctp_free_remote_addr(sp->net); sp->net = NULL; } if (sp->data) { sctp_m_freem(sp->data); sp->data = NULL; } sctp_free_a_strmoq(stcb, sp, so_locked); /* we can't be locked to it */ *locked = 0; stcb->asoc.locked_on_sending = NULL; } else { /* more to go, we are locked */ if (stcb->asoc.idata_supported == 0) *locked = 1; } asoc->chunks_on_out_queue++; strq->chunks_on_queues++; TAILQ_INSERT_TAIL(&asoc->send_queue, chk, sctp_next); asoc->send_queue_cnt++; out_of: if (send_lock_up) { SCTP_TCB_SEND_UNLOCK(stcb); } return (to_move); } static void sctp_fill_outqueue(struct sctp_tcb *stcb, struct sctp_nets *net, int frag_point, int eeor_mode, int *quit_now, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { struct sctp_association *asoc; struct sctp_stream_out *strq; int goal_mtu, moved_how_much, total_moved = 0, bail = 0; int locked, giveup; SCTP_TCB_LOCK_ASSERT(stcb); asoc = &stcb->asoc; switch (net->ro._l_addr.sa.sa_family) { #ifdef INET case AF_INET: goal_mtu = net->mtu - SCTP_MIN_V4_OVERHEAD; break; #endif #ifdef INET6 case AF_INET6: goal_mtu = net->mtu - SCTP_MIN_OVERHEAD; break; #endif default: /* TSNH */ goal_mtu = net->mtu; break; } /* Need an allowance for the data chunk header too */ if (stcb->asoc.idata_supported == 0) { goal_mtu -= sizeof(struct sctp_data_chunk); } else { goal_mtu -= sizeof(struct sctp_idata_chunk); } /* must make even word boundary */ goal_mtu &= 0xfffffffc; if (asoc->locked_on_sending) { /* We are stuck on one stream until the message completes. */ strq = asoc->locked_on_sending; locked = 1; } else { strq = stcb->asoc.ss_functions.sctp_ss_select_stream(stcb, net, asoc); locked = 0; } while ((goal_mtu > 0) && strq) { giveup = 0; bail = 0; moved_how_much = sctp_move_to_outqueue(stcb, strq, goal_mtu, frag_point, &locked, &giveup, eeor_mode, &bail, so_locked); if (moved_how_much) stcb->asoc.ss_functions.sctp_ss_scheduled(stcb, net, asoc, strq, moved_how_much); if (locked) { asoc->locked_on_sending = strq; if ((moved_how_much == 0) || (giveup) || bail) /* no more to move for now */ break; } else { asoc->locked_on_sending = NULL; if ((giveup) || bail) { break; } strq = stcb->asoc.ss_functions.sctp_ss_select_stream(stcb, net, asoc); if (strq == NULL) { break; } } total_moved += moved_how_much; goal_mtu -= (moved_how_much + sizeof(struct sctp_data_chunk)); goal_mtu &= 0xfffffffc; } if (bail) *quit_now = 1; stcb->asoc.ss_functions.sctp_ss_packet_done(stcb, net, asoc); if (total_moved == 0) { if ((stcb->asoc.sctp_cmt_on_off == 0) && (net == stcb->asoc.primary_destination)) { /* ran dry for primary network net */ SCTP_STAT_INCR(sctps_primary_randry); } else if (stcb->asoc.sctp_cmt_on_off > 0) { /* ran dry with CMT on */ SCTP_STAT_INCR(sctps_cmt_randry); } } } void sctp_fix_ecn_echo(struct sctp_association *asoc) { struct sctp_tmit_chunk *chk; TAILQ_FOREACH(chk, &asoc->control_send_queue, sctp_next) { if (chk->rec.chunk_id.id == SCTP_ECN_ECHO) { chk->sent = SCTP_DATAGRAM_UNSENT; } } } void sctp_move_chunks_from_net(struct sctp_tcb *stcb, struct sctp_nets *net) { struct sctp_association *asoc; struct sctp_tmit_chunk *chk; struct sctp_stream_queue_pending *sp; unsigned int i; if (net == NULL) { return; } asoc = &stcb->asoc; for (i = 0; i < stcb->asoc.streamoutcnt; i++) { TAILQ_FOREACH(sp, &stcb->asoc.strmout[i].outqueue, next) { if (sp->net == net) { sctp_free_remote_addr(sp->net); sp->net = NULL; } } } TAILQ_FOREACH(chk, &asoc->send_queue, sctp_next) { if (chk->whoTo == net) { sctp_free_remote_addr(chk->whoTo); chk->whoTo = NULL; } } } int sctp_med_chunk_output(struct sctp_inpcb *inp, struct sctp_tcb *stcb, struct sctp_association *asoc, int *num_out, int *reason_code, int control_only, int from_where, struct timeval *now, int *now_filled, int frag_point, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { /** * Ok this is the generic chunk service queue. we must do the * following: * - Service the stream queue that is next, moving any * message (note I must get a complete message i.e. FIRST/MIDDLE and * LAST to the out queue in one pass) and assigning TSN's. This * only applys though if the peer does not support NDATA. For NDATA * chunks its ok to not send the entire message ;-) * - Check to see if the cwnd/rwnd allows any output, if so we go ahead and * fomulate and send the low level chunks. Making sure to combine * any control in the control chunk queue also. */ struct sctp_nets *net, *start_at, *sack_goes_to = NULL, *old_start_at = NULL; struct mbuf *outchain, *endoutchain; struct sctp_tmit_chunk *chk, *nchk; /* temp arrays for unlinking */ struct sctp_tmit_chunk *data_list[SCTP_MAX_DATA_BUNDLING]; int no_fragmentflg, error; unsigned int max_rwnd_per_dest, max_send_per_dest; int one_chunk, hbflag, skip_data_for_this_net; int asconf, cookie, no_out_cnt; int bundle_at, ctl_cnt, no_data_chunks, eeor_mode; unsigned int mtu, r_mtu, omtu, mx_mtu, to_out; int tsns_sent = 0; uint32_t auth_offset = 0; struct sctp_auth_chunk *auth = NULL; uint16_t auth_keyid; int override_ok = 1; int skip_fill_up = 0; int data_auth_reqd = 0; /* * JRS 5/14/07 - Add flag for whether a heartbeat is sent to the * destination. */ int quit_now = 0; *num_out = 0; *reason_code = 0; auth_keyid = stcb->asoc.authinfo.active_keyid; if ((asoc->state & SCTP_STATE_SHUTDOWN_PENDING) || (asoc->state & SCTP_STATE_SHUTDOWN_RECEIVED) || (sctp_is_feature_on(inp, SCTP_PCB_FLAGS_EXPLICIT_EOR))) { eeor_mode = 1; } else { eeor_mode = 0; } ctl_cnt = no_out_cnt = asconf = cookie = 0; /* * First lets prime the pump. For each destination, if there is room * in the flight size, attempt to pull an MTU's worth out of the * stream queues into the general send_queue */ #ifdef SCTP_AUDITING_ENABLED sctp_audit_log(0xC2, 2); #endif SCTP_TCB_LOCK_ASSERT(stcb); hbflag = 0; if (control_only) no_data_chunks = 1; else no_data_chunks = 0; /* Nothing to possible to send? */ if ((TAILQ_EMPTY(&asoc->control_send_queue) || (asoc->ctrl_queue_cnt == stcb->asoc.ecn_echo_cnt_onq)) && TAILQ_EMPTY(&asoc->asconf_send_queue) && TAILQ_EMPTY(&asoc->send_queue) && stcb->asoc.ss_functions.sctp_ss_is_empty(stcb, asoc)) { nothing_to_send: *reason_code = 9; return (0); } if (asoc->peers_rwnd == 0) { /* No room in peers rwnd */ *reason_code = 1; if (asoc->total_flight > 0) { /* we are allowed one chunk in flight */ no_data_chunks = 1; } } if (stcb->asoc.ecn_echo_cnt_onq) { /* Record where a sack goes, if any */ if (no_data_chunks && (asoc->ctrl_queue_cnt == stcb->asoc.ecn_echo_cnt_onq)) { /* Nothing but ECNe to send - we don't do that */ goto nothing_to_send; } TAILQ_FOREACH(chk, &asoc->control_send_queue, sctp_next) { if ((chk->rec.chunk_id.id == SCTP_SELECTIVE_ACK) || (chk->rec.chunk_id.id == SCTP_NR_SELECTIVE_ACK)) { sack_goes_to = chk->whoTo; break; } } } max_rwnd_per_dest = ((asoc->peers_rwnd + asoc->total_flight) / asoc->numnets); if (stcb->sctp_socket) max_send_per_dest = SCTP_SB_LIMIT_SND(stcb->sctp_socket) / asoc->numnets; else max_send_per_dest = 0; if (no_data_chunks == 0) { /* How many non-directed chunks are there? */ TAILQ_FOREACH(chk, &asoc->send_queue, sctp_next) { if (chk->whoTo == NULL) { /* * We already have non-directed chunks on * the queue, no need to do a fill-up. */ skip_fill_up = 1; break; } } } if ((no_data_chunks == 0) && (skip_fill_up == 0) && (!stcb->asoc.ss_functions.sctp_ss_is_empty(stcb, asoc))) { TAILQ_FOREACH(net, &asoc->nets, sctp_next) { /* * This for loop we are in takes in each net, if * its's got space in cwnd and has data sent to it * (when CMT is off) then it calls * sctp_fill_outqueue for the net. This gets data on * the send queue for that network. * * In sctp_fill_outqueue TSN's are assigned and data is * copied out of the stream buffers. Note mostly * copy by reference (we hope). */ net->window_probe = 0; if ((net != stcb->asoc.alternate) && ((net->dest_state & SCTP_ADDR_PF) || (!(net->dest_state & SCTP_ADDR_REACHABLE)) || (net->dest_state & SCTP_ADDR_UNCONFIRMED))) { if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, net, 1, SCTP_CWND_LOG_FILL_OUTQ_CALLED); } continue; } if ((stcb->asoc.cc_functions.sctp_cwnd_new_transmission_begins) && (net->flight_size == 0)) { (*stcb->asoc.cc_functions.sctp_cwnd_new_transmission_begins) (stcb, net); } if (net->flight_size >= net->cwnd) { /* skip this network, no room - can't fill */ if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, net, 3, SCTP_CWND_LOG_FILL_OUTQ_CALLED); } continue; } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, net, 4, SCTP_CWND_LOG_FILL_OUTQ_CALLED); } sctp_fill_outqueue(stcb, net, frag_point, eeor_mode, &quit_now, so_locked); if (quit_now) { /* memory alloc failure */ no_data_chunks = 1; break; } } } /* now service each destination and send out what we can for it */ /* Nothing to send? */ if (TAILQ_EMPTY(&asoc->control_send_queue) && TAILQ_EMPTY(&asoc->asconf_send_queue) && TAILQ_EMPTY(&asoc->send_queue)) { *reason_code = 8; return (0); } if (asoc->sctp_cmt_on_off > 0) { /* get the last start point */ start_at = asoc->last_net_cmt_send_started; if (start_at == NULL) { /* null so to beginning */ start_at = TAILQ_FIRST(&asoc->nets); } else { start_at = TAILQ_NEXT(asoc->last_net_cmt_send_started, sctp_next); if (start_at == NULL) { start_at = TAILQ_FIRST(&asoc->nets); } } asoc->last_net_cmt_send_started = start_at; } else { start_at = TAILQ_FIRST(&asoc->nets); } TAILQ_FOREACH(chk, &asoc->control_send_queue, sctp_next) { if (chk->whoTo == NULL) { if (asoc->alternate) { chk->whoTo = asoc->alternate; } else { chk->whoTo = asoc->primary_destination; } atomic_add_int(&chk->whoTo->ref_count, 1); } } old_start_at = NULL; again_one_more_time: for (net = start_at; net != NULL; net = TAILQ_NEXT(net, sctp_next)) { /* how much can we send? */ /* SCTPDBG("Examine for sending net:%x\n", (uint32_t)net); */ if (old_start_at && (old_start_at == net)) { /* through list ocmpletely. */ break; } tsns_sent = 0xa; if (TAILQ_EMPTY(&asoc->control_send_queue) && TAILQ_EMPTY(&asoc->asconf_send_queue) && (net->flight_size >= net->cwnd)) { /* * Nothing on control or asconf and flight is full, * we can skip even in the CMT case. */ continue; } bundle_at = 0; endoutchain = outchain = NULL; no_fragmentflg = 1; one_chunk = 0; if (net->dest_state & SCTP_ADDR_UNCONFIRMED) { skip_data_for_this_net = 1; } else { skip_data_for_this_net = 0; } switch (((struct sockaddr *)&net->ro._l_addr)->sa_family) { #ifdef INET case AF_INET: mtu = net->mtu - SCTP_MIN_V4_OVERHEAD; break; #endif #ifdef INET6 case AF_INET6: mtu = net->mtu - SCTP_MIN_OVERHEAD; break; #endif default: /* TSNH */ mtu = net->mtu; break; } mx_mtu = mtu; to_out = 0; if (mtu > asoc->peers_rwnd) { if (asoc->total_flight > 0) { /* We have a packet in flight somewhere */ r_mtu = asoc->peers_rwnd; } else { /* We are always allowed to send one MTU out */ one_chunk = 1; r_mtu = mtu; } } else { r_mtu = mtu; } error = 0; /************************/ /* ASCONF transmission */ /************************/ /* Now first lets go through the asconf queue */ TAILQ_FOREACH_SAFE(chk, &asoc->asconf_send_queue, sctp_next, nchk) { if (chk->rec.chunk_id.id != SCTP_ASCONF) { continue; } if (chk->whoTo == NULL) { if (asoc->alternate == NULL) { if (asoc->primary_destination != net) { break; } } else { if (asoc->alternate != net) { break; } } } else { if (chk->whoTo != net) { break; } } if (chk->data == NULL) { break; } if (chk->sent != SCTP_DATAGRAM_UNSENT && chk->sent != SCTP_DATAGRAM_RESEND) { break; } /* * if no AUTH is yet included and this chunk * requires it, make sure to account for it. We * don't apply the size until the AUTH chunk is * actually added below in case there is no room for * this chunk. NOTE: we overload the use of "omtu" * here */ if ((auth == NULL) && sctp_auth_is_required_chunk(chk->rec.chunk_id.id, stcb->asoc.peer_auth_chunks)) { omtu = sctp_get_auth_chunk_len(stcb->asoc.peer_hmac_id); } else omtu = 0; /* Here we do NOT factor the r_mtu */ if ((chk->send_size < (int)(mtu - omtu)) || (chk->flags & CHUNK_FLAGS_FRAGMENT_OK)) { /* * We probably should glom the mbuf chain * from the chk->data for control but the * problem is it becomes yet one more level * of tracking to do if for some reason * output fails. Then I have got to * reconstruct the merged control chain.. el * yucko.. for now we take the easy way and * do the copy */ /* * Add an AUTH chunk, if chunk requires it * save the offset into the chain for AUTH */ if ((auth == NULL) && (sctp_auth_is_required_chunk(chk->rec.chunk_id.id, stcb->asoc.peer_auth_chunks))) { outchain = sctp_add_auth_chunk(outchain, &endoutchain, &auth, &auth_offset, stcb, chk->rec.chunk_id.id); SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } outchain = sctp_copy_mbufchain(chk->data, outchain, &endoutchain, (int)chk->rec.chunk_id.can_take_data, chk->send_size, chk->copy_by_ref); if (outchain == NULL) { *reason_code = 8; SCTP_LTRACE_ERR_RET(inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); /* update our MTU size */ if (mtu > (chk->send_size + omtu)) mtu -= (chk->send_size + omtu); else mtu = 0; to_out += (chk->send_size + omtu); /* Do clear IP_DF ? */ if (chk->flags & CHUNK_FLAGS_FRAGMENT_OK) { no_fragmentflg = 0; } if (chk->rec.chunk_id.can_take_data) chk->data = NULL; /* * set hb flag since we can use these for * RTO */ hbflag = 1; asconf = 1; /* * should sysctl this: don't bundle data * with ASCONF since it requires AUTH */ no_data_chunks = 1; chk->sent = SCTP_DATAGRAM_SENT; if (chk->whoTo == NULL) { chk->whoTo = net; atomic_add_int(&net->ref_count, 1); } chk->snd_count++; if (mtu == 0) { /* * Ok we are out of room but we can * output without effecting the * flight size since this little guy * is a control only packet. */ sctp_timer_start(SCTP_TIMER_TYPE_ASCONF, inp, stcb, net); /* * do NOT clear the asconf flag as * it is used to do appropriate * source address selection. */ if (*now_filled == 0) { (void)SCTP_GETTIME_TIMEVAL(now); *now_filled = 1; } net->last_sent_time = *now; hbflag = 0; if ((error = sctp_lowlevel_chunk_output(inp, stcb, net, (struct sockaddr *)&net->ro._l_addr, outchain, auth_offset, auth, stcb->asoc.authinfo.active_keyid, no_fragmentflg, 0, asconf, inp->sctp_lport, stcb->rport, htonl(stcb->asoc.peer_vtag), net->port, NULL, 0, 0, so_locked))) { /* * error, we could not * output */ SCTPDBG(SCTP_DEBUG_OUTPUT3, "Gak send error %d\n", error); if (from_where == 0) { SCTP_STAT_INCR(sctps_lowlevelerrusr); } if (error == ENOBUFS) { asoc->ifp_had_enobuf = 1; SCTP_STAT_INCR(sctps_lowlevelerr); } /* error, could not output */ if (error == EHOSTUNREACH) { /* * Destination went * unreachable * during this send */ sctp_move_chunks_from_net(stcb, net); } *reason_code = 7; break; } else { asoc->ifp_had_enobuf = 0; } /* * increase the number we sent, if a * cookie is sent we don't tell them * any was sent out. */ outchain = endoutchain = NULL; auth = NULL; auth_offset = 0; if (!no_out_cnt) *num_out += ctl_cnt; /* recalc a clean slate and setup */ switch (net->ro._l_addr.sa.sa_family) { #ifdef INET case AF_INET: mtu = net->mtu - SCTP_MIN_V4_OVERHEAD; break; #endif #ifdef INET6 case AF_INET6: mtu = net->mtu - SCTP_MIN_OVERHEAD; break; #endif default: /* TSNH */ mtu = net->mtu; break; } to_out = 0; no_fragmentflg = 1; } } } if (error != 0) { /* try next net */ continue; } /************************/ /* Control transmission */ /************************/ /* Now first lets go through the control queue */ TAILQ_FOREACH_SAFE(chk, &asoc->control_send_queue, sctp_next, nchk) { if ((sack_goes_to) && (chk->rec.chunk_id.id == SCTP_ECN_ECHO) && (chk->whoTo != sack_goes_to)) { /* * if we have a sack in queue, and we are * looking at an ecn echo that is NOT queued * to where the sack is going.. */ if (chk->whoTo == net) { /* * Don't transmit it to where its * going (current net) */ continue; } else if (sack_goes_to == net) { /* * But do transmit it to this * address */ goto skip_net_check; } } if (chk->whoTo == NULL) { if (asoc->alternate == NULL) { if (asoc->primary_destination != net) { continue; } } else { if (asoc->alternate != net) { continue; } } } else { if (chk->whoTo != net) { continue; } } skip_net_check: if (chk->data == NULL) { continue; } if (chk->sent != SCTP_DATAGRAM_UNSENT) { /* * It must be unsent. Cookies and ASCONF's * hang around but there timers will force * when marked for resend. */ continue; } /* * if no AUTH is yet included and this chunk * requires it, make sure to account for it. We * don't apply the size until the AUTH chunk is * actually added below in case there is no room for * this chunk. NOTE: we overload the use of "omtu" * here */ if ((auth == NULL) && sctp_auth_is_required_chunk(chk->rec.chunk_id.id, stcb->asoc.peer_auth_chunks)) { omtu = sctp_get_auth_chunk_len(stcb->asoc.peer_hmac_id); } else omtu = 0; /* Here we do NOT factor the r_mtu */ if ((chk->send_size <= (int)(mtu - omtu)) || (chk->flags & CHUNK_FLAGS_FRAGMENT_OK)) { /* * We probably should glom the mbuf chain * from the chk->data for control but the * problem is it becomes yet one more level * of tracking to do if for some reason * output fails. Then I have got to * reconstruct the merged control chain.. el * yucko.. for now we take the easy way and * do the copy */ /* * Add an AUTH chunk, if chunk requires it * save the offset into the chain for AUTH */ if ((auth == NULL) && (sctp_auth_is_required_chunk(chk->rec.chunk_id.id, stcb->asoc.peer_auth_chunks))) { outchain = sctp_add_auth_chunk(outchain, &endoutchain, &auth, &auth_offset, stcb, chk->rec.chunk_id.id); SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } outchain = sctp_copy_mbufchain(chk->data, outchain, &endoutchain, (int)chk->rec.chunk_id.can_take_data, chk->send_size, chk->copy_by_ref); if (outchain == NULL) { *reason_code = 8; SCTP_LTRACE_ERR_RET(inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); /* update our MTU size */ if (mtu > (chk->send_size + omtu)) mtu -= (chk->send_size + omtu); else mtu = 0; to_out += (chk->send_size + omtu); /* Do clear IP_DF ? */ if (chk->flags & CHUNK_FLAGS_FRAGMENT_OK) { no_fragmentflg = 0; } if (chk->rec.chunk_id.can_take_data) chk->data = NULL; /* Mark things to be removed, if needed */ if ((chk->rec.chunk_id.id == SCTP_SELECTIVE_ACK) || (chk->rec.chunk_id.id == SCTP_NR_SELECTIVE_ACK) || /* EY */ (chk->rec.chunk_id.id == SCTP_HEARTBEAT_REQUEST) || (chk->rec.chunk_id.id == SCTP_HEARTBEAT_ACK) || (chk->rec.chunk_id.id == SCTP_SHUTDOWN) || (chk->rec.chunk_id.id == SCTP_SHUTDOWN_ACK) || (chk->rec.chunk_id.id == SCTP_OPERATION_ERROR) || (chk->rec.chunk_id.id == SCTP_COOKIE_ACK) || (chk->rec.chunk_id.id == SCTP_ECN_CWR) || (chk->rec.chunk_id.id == SCTP_PACKET_DROPPED) || (chk->rec.chunk_id.id == SCTP_ASCONF_ACK)) { if (chk->rec.chunk_id.id == SCTP_HEARTBEAT_REQUEST) { hbflag = 1; } /* remove these chunks at the end */ if ((chk->rec.chunk_id.id == SCTP_SELECTIVE_ACK) || (chk->rec.chunk_id.id == SCTP_NR_SELECTIVE_ACK)) { /* turn off the timer */ if (SCTP_OS_TIMER_PENDING(&stcb->asoc.dack_timer.timer)) { sctp_timer_stop(SCTP_TIMER_TYPE_RECV, inp, stcb, net, SCTP_FROM_SCTP_OUTPUT + SCTP_LOC_1); } } ctl_cnt++; } else { /* * Other chunks, since they have * timers running (i.e. COOKIE) we * just "trust" that it gets sent or * retransmitted. */ ctl_cnt++; if (chk->rec.chunk_id.id == SCTP_COOKIE_ECHO) { cookie = 1; no_out_cnt = 1; } else if (chk->rec.chunk_id.id == SCTP_ECN_ECHO) { /* * Increment ecne send count * here this means we may be * over-zealous in our * counting if the send * fails, but its the best * place to do it (we used * to do it in the queue of * the chunk, but that did * not tell how many times * it was sent. */ SCTP_STAT_INCR(sctps_sendecne); } chk->sent = SCTP_DATAGRAM_SENT; if (chk->whoTo == NULL) { chk->whoTo = net; atomic_add_int(&net->ref_count, 1); } chk->snd_count++; } if (mtu == 0) { /* * Ok we are out of room but we can * output without effecting the * flight size since this little guy * is a control only packet. */ if (asconf) { sctp_timer_start(SCTP_TIMER_TYPE_ASCONF, inp, stcb, net); /* * do NOT clear the asconf * flag as it is used to do * appropriate source * address selection. */ } if (cookie) { sctp_timer_start(SCTP_TIMER_TYPE_COOKIE, inp, stcb, net); cookie = 0; } /* Only HB or ASCONF advances time */ if (hbflag) { if (*now_filled == 0) { (void)SCTP_GETTIME_TIMEVAL(now); *now_filled = 1; } net->last_sent_time = *now; hbflag = 0; } if ((error = sctp_lowlevel_chunk_output(inp, stcb, net, (struct sockaddr *)&net->ro._l_addr, outchain, auth_offset, auth, stcb->asoc.authinfo.active_keyid, no_fragmentflg, 0, asconf, inp->sctp_lport, stcb->rport, htonl(stcb->asoc.peer_vtag), net->port, NULL, 0, 0, so_locked))) { /* * error, we could not * output */ SCTPDBG(SCTP_DEBUG_OUTPUT3, "Gak send error %d\n", error); if (from_where == 0) { SCTP_STAT_INCR(sctps_lowlevelerrusr); } if (error == ENOBUFS) { asoc->ifp_had_enobuf = 1; SCTP_STAT_INCR(sctps_lowlevelerr); } if (error == EHOSTUNREACH) { /* * Destination went * unreachable * during this send */ sctp_move_chunks_from_net(stcb, net); } *reason_code = 7; break; } else { asoc->ifp_had_enobuf = 0; } /* * increase the number we sent, if a * cookie is sent we don't tell them * any was sent out. */ outchain = endoutchain = NULL; auth = NULL; auth_offset = 0; if (!no_out_cnt) *num_out += ctl_cnt; /* recalc a clean slate and setup */ switch (net->ro._l_addr.sa.sa_family) { #ifdef INET case AF_INET: mtu = net->mtu - SCTP_MIN_V4_OVERHEAD; break; #endif #ifdef INET6 case AF_INET6: mtu = net->mtu - SCTP_MIN_OVERHEAD; break; #endif default: /* TSNH */ mtu = net->mtu; break; } to_out = 0; no_fragmentflg = 1; } } } if (error != 0) { /* try next net */ continue; } /* JRI: if dest is in PF state, do not send data to it */ if ((asoc->sctp_cmt_on_off > 0) && (net != stcb->asoc.alternate) && (net->dest_state & SCTP_ADDR_PF)) { goto no_data_fill; } if (net->flight_size >= net->cwnd) { goto no_data_fill; } if ((asoc->sctp_cmt_on_off > 0) && (SCTP_BASE_SYSCTL(sctp_buffer_splitting) & SCTP_RECV_BUFFER_SPLITTING) && (net->flight_size > max_rwnd_per_dest)) { goto no_data_fill; } /* * We need a specific accounting for the usage of the send * buffer. We also need to check the number of messages per * net. For now, this is better than nothing and it disabled * by default... */ if ((asoc->sctp_cmt_on_off > 0) && (SCTP_BASE_SYSCTL(sctp_buffer_splitting) & SCTP_SEND_BUFFER_SPLITTING) && (max_send_per_dest > 0) && (net->flight_size > max_send_per_dest)) { goto no_data_fill; } /*********************/ /* Data transmission */ /*********************/ /* * if AUTH for DATA is required and no AUTH has been added * yet, account for this in the mtu now... if no data can be * bundled, this adjustment won't matter anyways since the * packet will be going out... */ data_auth_reqd = sctp_auth_is_required_chunk(SCTP_DATA, stcb->asoc.peer_auth_chunks); if (data_auth_reqd && (auth == NULL)) { mtu -= sctp_get_auth_chunk_len(stcb->asoc.peer_hmac_id); } /* now lets add any data within the MTU constraints */ switch (((struct sockaddr *)&net->ro._l_addr)->sa_family) { #ifdef INET case AF_INET: if (net->mtu > SCTP_MIN_V4_OVERHEAD) omtu = net->mtu - SCTP_MIN_V4_OVERHEAD; else omtu = 0; break; #endif #ifdef INET6 case AF_INET6: if (net->mtu > SCTP_MIN_OVERHEAD) omtu = net->mtu - SCTP_MIN_OVERHEAD; else omtu = 0; break; #endif default: /* TSNH */ omtu = 0; break; } if ((((SCTP_GET_STATE(asoc) == SCTP_STATE_OPEN) || (SCTP_GET_STATE(asoc) == SCTP_STATE_SHUTDOWN_RECEIVED)) && (skip_data_for_this_net == 0)) || (cookie)) { TAILQ_FOREACH_SAFE(chk, &asoc->send_queue, sctp_next, nchk) { if (no_data_chunks) { /* let only control go out */ *reason_code = 1; break; } if (net->flight_size >= net->cwnd) { /* skip this net, no room for data */ *reason_code = 2; break; } if ((chk->whoTo != NULL) && (chk->whoTo != net)) { /* Don't send the chunk on this net */ continue; } if (asoc->sctp_cmt_on_off == 0) { if ((asoc->alternate) && (asoc->alternate != net) && (chk->whoTo == NULL)) { continue; } else if ((net != asoc->primary_destination) && (asoc->alternate == NULL) && (chk->whoTo == NULL)) { continue; } } if ((chk->send_size > omtu) && ((chk->flags & CHUNK_FLAGS_FRAGMENT_OK) == 0)) { /*- * strange, we have a chunk that is * to big for its destination and * yet no fragment ok flag. * Something went wrong when the * PMTU changed...we did not mark * this chunk for some reason?? I * will fix it here by letting IP * fragment it for now and printing * a warning. This really should not * happen ... */ SCTP_PRINTF("Warning chunk of %d bytes > mtu:%d and yet PMTU disc missed\n", chk->send_size, mtu); chk->flags |= CHUNK_FLAGS_FRAGMENT_OK; } if (SCTP_BASE_SYSCTL(sctp_enable_sack_immediately) && ((asoc->state & SCTP_STATE_SHUTDOWN_PENDING) == SCTP_STATE_SHUTDOWN_PENDING)) { struct sctp_data_chunk *dchkh; dchkh = mtod(chk->data, struct sctp_data_chunk *); dchkh->ch.chunk_flags |= SCTP_DATA_SACK_IMMEDIATELY; } if (((chk->send_size <= mtu) && (chk->send_size <= r_mtu)) || ((chk->flags & CHUNK_FLAGS_FRAGMENT_OK) && (chk->send_size <= asoc->peers_rwnd))) { /* ok we will add this one */ /* * Add an AUTH chunk, if chunk * requires it, save the offset into * the chain for AUTH */ if (data_auth_reqd) { if (auth == NULL) { outchain = sctp_add_auth_chunk(outchain, &endoutchain, &auth, &auth_offset, stcb, SCTP_DATA); auth_keyid = chk->auth_keyid; override_ok = 0; SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } else if (override_ok) { /* * use this data's * keyid */ auth_keyid = chk->auth_keyid; override_ok = 0; } else if (auth_keyid != chk->auth_keyid) { /* * different keyid, * so done bundling */ break; } } outchain = sctp_copy_mbufchain(chk->data, outchain, &endoutchain, 0, chk->send_size, chk->copy_by_ref); if (outchain == NULL) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "No memory?\n"); if (!SCTP_OS_TIMER_PENDING(&net->rxt_timer.timer)) { sctp_timer_start(SCTP_TIMER_TYPE_SEND, inp, stcb, net); } *reason_code = 3; SCTP_LTRACE_ERR_RET(inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } /* upate our MTU size */ /* Do clear IP_DF ? */ if (chk->flags & CHUNK_FLAGS_FRAGMENT_OK) { no_fragmentflg = 0; } /* unsigned subtraction of mtu */ if (mtu > chk->send_size) mtu -= chk->send_size; else mtu = 0; /* unsigned subtraction of r_mtu */ if (r_mtu > chk->send_size) r_mtu -= chk->send_size; else r_mtu = 0; to_out += chk->send_size; if ((to_out > mx_mtu) && no_fragmentflg) { #ifdef INVARIANTS panic("Exceeding mtu of %d out size is %d", mx_mtu, to_out); #else SCTP_PRINTF("Exceeding mtu of %d out size is %d\n", mx_mtu, to_out); #endif } chk->window_probe = 0; data_list[bundle_at++] = chk; if (bundle_at >= SCTP_MAX_DATA_BUNDLING) { break; } if (chk->sent == SCTP_DATAGRAM_UNSENT) { if ((chk->rec.data.rcv_flags & SCTP_DATA_UNORDERED) == 0) { SCTP_STAT_INCR_COUNTER64(sctps_outorderchunks); } else { SCTP_STAT_INCR_COUNTER64(sctps_outunorderchunks); } if (((chk->rec.data.rcv_flags & SCTP_DATA_LAST_FRAG) == SCTP_DATA_LAST_FRAG) && ((chk->rec.data.rcv_flags & SCTP_DATA_FIRST_FRAG) == 0)) /* * Count number of * user msg's that * were fragmented * we do this by * counting when we * see a LAST * fragment only. */ SCTP_STAT_INCR_COUNTER64(sctps_fragusrmsgs); } if ((mtu == 0) || (r_mtu == 0) || (one_chunk)) { if ((one_chunk) && (stcb->asoc.total_flight == 0)) { data_list[0]->window_probe = 1; net->window_probe = 1; } break; } } else { /* * Must be sent in order of the * TSN's (on a network) */ break; } } /* for (chunk gather loop for this net) */ } /* if asoc.state OPEN */ no_data_fill: /* Is there something to send for this destination? */ if (outchain) { /* We may need to start a control timer or two */ if (asconf) { sctp_timer_start(SCTP_TIMER_TYPE_ASCONF, inp, stcb, net); /* * do NOT clear the asconf flag as it is * used to do appropriate source address * selection. */ } if (cookie) { sctp_timer_start(SCTP_TIMER_TYPE_COOKIE, inp, stcb, net); cookie = 0; } /* must start a send timer if data is being sent */ if (bundle_at && (!SCTP_OS_TIMER_PENDING(&net->rxt_timer.timer))) { /* * no timer running on this destination * restart it. */ sctp_timer_start(SCTP_TIMER_TYPE_SEND, inp, stcb, net); } if (bundle_at || hbflag) { /* For data/asconf and hb set time */ if (*now_filled == 0) { (void)SCTP_GETTIME_TIMEVAL(now); *now_filled = 1; } net->last_sent_time = *now; } /* Now send it, if there is anything to send :> */ if ((error = sctp_lowlevel_chunk_output(inp, stcb, net, (struct sockaddr *)&net->ro._l_addr, outchain, auth_offset, auth, auth_keyid, no_fragmentflg, bundle_at, asconf, inp->sctp_lport, stcb->rport, htonl(stcb->asoc.peer_vtag), net->port, NULL, 0, 0, so_locked))) { /* error, we could not output */ SCTPDBG(SCTP_DEBUG_OUTPUT3, "Gak send error %d\n", error); if (from_where == 0) { SCTP_STAT_INCR(sctps_lowlevelerrusr); } if (error == ENOBUFS) { SCTP_STAT_INCR(sctps_lowlevelerr); asoc->ifp_had_enobuf = 1; } if (error == EHOSTUNREACH) { /* * Destination went unreachable * during this send */ sctp_move_chunks_from_net(stcb, net); } *reason_code = 6; /*- * I add this line to be paranoid. As far as * I can tell the continue, takes us back to * the top of the for, but just to make sure * I will reset these again here. */ ctl_cnt = bundle_at = 0; continue; /* This takes us back to the * for() for the nets. */ } else { asoc->ifp_had_enobuf = 0; } endoutchain = NULL; auth = NULL; auth_offset = 0; if (!no_out_cnt) { *num_out += (ctl_cnt + bundle_at); } if (bundle_at) { /* setup for a RTO measurement */ tsns_sent = data_list[0]->rec.data.TSN_seq; /* fill time if not already filled */ if (*now_filled == 0) { (void)SCTP_GETTIME_TIMEVAL(&asoc->time_last_sent); *now_filled = 1; *now = asoc->time_last_sent; } else { asoc->time_last_sent = *now; } if (net->rto_needed) { data_list[0]->do_rtt = 1; net->rto_needed = 0; } SCTP_STAT_INCR_BY(sctps_senddata, bundle_at); sctp_clean_up_datalist(stcb, asoc, data_list, bundle_at, net); } if (one_chunk) { break; } } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, net, tsns_sent, SCTP_CWND_LOG_FROM_SEND); } } if (old_start_at == NULL) { old_start_at = start_at; start_at = TAILQ_FIRST(&asoc->nets); if (old_start_at) goto again_one_more_time; } /* * At the end there should be no NON timed chunks hanging on this * queue. */ if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, net, *num_out, SCTP_CWND_LOG_FROM_SEND); } if ((*num_out == 0) && (*reason_code == 0)) { *reason_code = 4; } else { *reason_code = 5; } sctp_clean_up_ctl(stcb, asoc, so_locked); return (0); } void sctp_queue_op_err(struct sctp_tcb *stcb, struct mbuf *op_err) { /*- * Prepend a OPERATIONAL_ERROR chunk header and put on the end of * the control chunk queue. */ struct sctp_chunkhdr *hdr; struct sctp_tmit_chunk *chk; struct mbuf *mat, *last_mbuf; uint32_t chunk_length; uint16_t padding_length; SCTP_TCB_LOCK_ASSERT(stcb); SCTP_BUF_PREPEND(op_err, sizeof(struct sctp_chunkhdr), M_NOWAIT); if (op_err == NULL) { return; } last_mbuf = NULL; chunk_length = 0; for (mat = op_err; mat != NULL; mat = SCTP_BUF_NEXT(mat)) { chunk_length += SCTP_BUF_LEN(mat); if (SCTP_BUF_NEXT(mat) == NULL) { last_mbuf = mat; } } if (chunk_length > SCTP_MAX_CHUNK_LENGTH) { sctp_m_freem(op_err); return; } padding_length = chunk_length % 4; if (padding_length != 0) { padding_length = 4 - padding_length; } if (padding_length != 0) { if (sctp_add_pad_tombuf(last_mbuf, padding_length) == NULL) { sctp_m_freem(op_err); return; } } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* no memory */ sctp_m_freem(op_err); return; } chk->copy_by_ref = 0; chk->send_size = (uint16_t) chunk_length; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->asoc = &stcb->asoc; chk->data = op_err; chk->whoTo = NULL; chk->rec.chunk_id.id = SCTP_OPERATION_ERROR; chk->rec.chunk_id.can_take_data = 0; hdr = mtod(op_err, struct sctp_chunkhdr *); hdr->chunk_type = SCTP_OPERATION_ERROR; hdr->chunk_flags = 0; hdr->chunk_length = htons(chk->send_size); TAILQ_INSERT_TAIL(&chk->asoc->control_send_queue, chk, sctp_next); chk->asoc->ctrl_queue_cnt++; } int sctp_send_cookie_echo(struct mbuf *m, int offset, struct sctp_tcb *stcb, struct sctp_nets *net) { /*- * pull out the cookie and put it at the front of the control chunk * queue. */ int at; struct mbuf *cookie; struct sctp_paramhdr parm, *phdr; struct sctp_chunkhdr *hdr; struct sctp_tmit_chunk *chk; uint16_t ptype, plen; SCTP_TCB_LOCK_ASSERT(stcb); /* First find the cookie in the param area */ cookie = NULL; at = offset + sizeof(struct sctp_init_chunk); for (;;) { phdr = sctp_get_next_param(m, at, &parm, sizeof(parm)); if (phdr == NULL) { return (-3); } ptype = ntohs(phdr->param_type); plen = ntohs(phdr->param_length); if (ptype == SCTP_STATE_COOKIE) { int pad; /* found the cookie */ if ((pad = (plen % 4))) { plen += 4 - pad; } cookie = SCTP_M_COPYM(m, at, plen, M_NOWAIT); if (cookie == NULL) { /* No memory */ return (-2); } #ifdef SCTP_MBUF_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_MBUF_LOGGING_ENABLE) { sctp_log_mbc(cookie, SCTP_MBUF_ICOPY); } #endif break; } at += SCTP_SIZE32(plen); } /* ok, we got the cookie lets change it into a cookie echo chunk */ /* first the change from param to cookie */ hdr = mtod(cookie, struct sctp_chunkhdr *); hdr->chunk_type = SCTP_COOKIE_ECHO; hdr->chunk_flags = 0; /* get the chunk stuff now and place it in the FRONT of the queue */ sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* no memory */ sctp_m_freem(cookie); return (-5); } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_COOKIE_ECHO; chk->rec.chunk_id.can_take_data = 0; chk->flags = CHUNK_FLAGS_FRAGMENT_OK; chk->send_size = plen; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->asoc = &stcb->asoc; chk->data = cookie; chk->whoTo = net; atomic_add_int(&chk->whoTo->ref_count, 1); TAILQ_INSERT_HEAD(&chk->asoc->control_send_queue, chk, sctp_next); chk->asoc->ctrl_queue_cnt++; return (0); } void sctp_send_heartbeat_ack(struct sctp_tcb *stcb, struct mbuf *m, int offset, int chk_length, struct sctp_nets *net) { /* * take a HB request and make it into a HB ack and send it. */ struct mbuf *outchain; struct sctp_chunkhdr *chdr; struct sctp_tmit_chunk *chk; if (net == NULL) /* must have a net pointer */ return; outchain = SCTP_M_COPYM(m, offset, chk_length, M_NOWAIT); if (outchain == NULL) { /* gak out of memory */ return; } #ifdef SCTP_MBUF_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_MBUF_LOGGING_ENABLE) { sctp_log_mbc(outchain, SCTP_MBUF_ICOPY); } #endif chdr = mtod(outchain, struct sctp_chunkhdr *); chdr->chunk_type = SCTP_HEARTBEAT_ACK; chdr->chunk_flags = 0; if (chk_length % 4) { /* need pad */ uint32_t cpthis = 0; int padlen; padlen = 4 - (chk_length % 4); m_copyback(outchain, chk_length, padlen, (caddr_t)&cpthis); } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* no memory */ sctp_m_freem(outchain); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_HEARTBEAT_ACK; chk->rec.chunk_id.can_take_data = 1; chk->flags = 0; chk->send_size = chk_length; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->asoc = &stcb->asoc; chk->data = outchain; chk->whoTo = net; atomic_add_int(&chk->whoTo->ref_count, 1); TAILQ_INSERT_TAIL(&chk->asoc->control_send_queue, chk, sctp_next); chk->asoc->ctrl_queue_cnt++; } void sctp_send_cookie_ack(struct sctp_tcb *stcb) { /* formulate and queue a cookie-ack back to sender */ struct mbuf *cookie_ack; struct sctp_chunkhdr *hdr; struct sctp_tmit_chunk *chk; SCTP_TCB_LOCK_ASSERT(stcb); cookie_ack = sctp_get_mbuf_for_msg(sizeof(struct sctp_chunkhdr), 0, M_NOWAIT, 1, MT_HEADER); if (cookie_ack == NULL) { /* no mbuf's */ return; } SCTP_BUF_RESV_UF(cookie_ack, SCTP_MIN_OVERHEAD); sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* no memory */ sctp_m_freem(cookie_ack); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_COOKIE_ACK; chk->rec.chunk_id.can_take_data = 1; chk->flags = 0; chk->send_size = sizeof(struct sctp_chunkhdr); chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->asoc = &stcb->asoc; chk->data = cookie_ack; if (chk->asoc->last_control_chunk_from != NULL) { chk->whoTo = chk->asoc->last_control_chunk_from; atomic_add_int(&chk->whoTo->ref_count, 1); } else { chk->whoTo = NULL; } hdr = mtod(cookie_ack, struct sctp_chunkhdr *); hdr->chunk_type = SCTP_COOKIE_ACK; hdr->chunk_flags = 0; hdr->chunk_length = htons(chk->send_size); SCTP_BUF_LEN(cookie_ack) = chk->send_size; TAILQ_INSERT_TAIL(&chk->asoc->control_send_queue, chk, sctp_next); chk->asoc->ctrl_queue_cnt++; return; } void sctp_send_shutdown_ack(struct sctp_tcb *stcb, struct sctp_nets *net) { /* formulate and queue a SHUTDOWN-ACK back to the sender */ struct mbuf *m_shutdown_ack; struct sctp_shutdown_ack_chunk *ack_cp; struct sctp_tmit_chunk *chk; m_shutdown_ack = sctp_get_mbuf_for_msg(sizeof(struct sctp_shutdown_ack_chunk), 0, M_NOWAIT, 1, MT_HEADER); if (m_shutdown_ack == NULL) { /* no mbuf's */ return; } SCTP_BUF_RESV_UF(m_shutdown_ack, SCTP_MIN_OVERHEAD); sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* no memory */ sctp_m_freem(m_shutdown_ack); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_SHUTDOWN_ACK; chk->rec.chunk_id.can_take_data = 1; chk->flags = 0; chk->send_size = sizeof(struct sctp_chunkhdr); chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->flags = 0; chk->asoc = &stcb->asoc; chk->data = m_shutdown_ack; chk->whoTo = net; if (chk->whoTo) { atomic_add_int(&chk->whoTo->ref_count, 1); } ack_cp = mtod(m_shutdown_ack, struct sctp_shutdown_ack_chunk *); ack_cp->ch.chunk_type = SCTP_SHUTDOWN_ACK; ack_cp->ch.chunk_flags = 0; ack_cp->ch.chunk_length = htons(chk->send_size); SCTP_BUF_LEN(m_shutdown_ack) = chk->send_size; TAILQ_INSERT_TAIL(&chk->asoc->control_send_queue, chk, sctp_next); chk->asoc->ctrl_queue_cnt++; return; } void sctp_send_shutdown(struct sctp_tcb *stcb, struct sctp_nets *net) { /* formulate and queue a SHUTDOWN to the sender */ struct mbuf *m_shutdown; struct sctp_shutdown_chunk *shutdown_cp; struct sctp_tmit_chunk *chk; m_shutdown = sctp_get_mbuf_for_msg(sizeof(struct sctp_shutdown_chunk), 0, M_NOWAIT, 1, MT_HEADER); if (m_shutdown == NULL) { /* no mbuf's */ return; } SCTP_BUF_RESV_UF(m_shutdown, SCTP_MIN_OVERHEAD); sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* no memory */ sctp_m_freem(m_shutdown); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_SHUTDOWN; chk->rec.chunk_id.can_take_data = 1; chk->flags = 0; chk->send_size = sizeof(struct sctp_shutdown_chunk); chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->flags = 0; chk->asoc = &stcb->asoc; chk->data = m_shutdown; chk->whoTo = net; if (chk->whoTo) { atomic_add_int(&chk->whoTo->ref_count, 1); } shutdown_cp = mtod(m_shutdown, struct sctp_shutdown_chunk *); shutdown_cp->ch.chunk_type = SCTP_SHUTDOWN; shutdown_cp->ch.chunk_flags = 0; shutdown_cp->ch.chunk_length = htons(chk->send_size); shutdown_cp->cumulative_tsn_ack = htonl(stcb->asoc.cumulative_tsn); SCTP_BUF_LEN(m_shutdown) = chk->send_size; TAILQ_INSERT_TAIL(&chk->asoc->control_send_queue, chk, sctp_next); chk->asoc->ctrl_queue_cnt++; return; } void sctp_send_asconf(struct sctp_tcb *stcb, struct sctp_nets *net, int addr_locked) { /* * formulate and queue an ASCONF to the peer. ASCONF parameters * should be queued on the assoc queue. */ struct sctp_tmit_chunk *chk; struct mbuf *m_asconf; int len; SCTP_TCB_LOCK_ASSERT(stcb); if ((!TAILQ_EMPTY(&stcb->asoc.asconf_send_queue)) && (!sctp_is_feature_on(stcb->sctp_ep, SCTP_PCB_FLAGS_MULTIPLE_ASCONFS))) { /* can't send a new one if there is one in flight already */ return; } /* compose an ASCONF chunk, maximum length is PMTU */ m_asconf = sctp_compose_asconf(stcb, &len, addr_locked); if (m_asconf == NULL) { return; } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* no memory */ sctp_m_freem(m_asconf); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_ASCONF; chk->rec.chunk_id.can_take_data = 0; chk->flags = CHUNK_FLAGS_FRAGMENT_OK; chk->data = m_asconf; chk->send_size = len; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->asoc = &stcb->asoc; chk->whoTo = net; if (chk->whoTo) { atomic_add_int(&chk->whoTo->ref_count, 1); } TAILQ_INSERT_TAIL(&chk->asoc->asconf_send_queue, chk, sctp_next); chk->asoc->ctrl_queue_cnt++; return; } void sctp_send_asconf_ack(struct sctp_tcb *stcb) { /* * formulate and queue a asconf-ack back to sender. the asconf-ack * must be stored in the tcb. */ struct sctp_tmit_chunk *chk; struct sctp_asconf_ack *ack, *latest_ack; struct mbuf *m_ack; struct sctp_nets *net = NULL; SCTP_TCB_LOCK_ASSERT(stcb); /* Get the latest ASCONF-ACK */ latest_ack = TAILQ_LAST(&stcb->asoc.asconf_ack_sent, sctp_asconf_ackhead); if (latest_ack == NULL) { return; } if (latest_ack->last_sent_to != NULL && latest_ack->last_sent_to == stcb->asoc.last_control_chunk_from) { /* we're doing a retransmission */ net = sctp_find_alternate_net(stcb, stcb->asoc.last_control_chunk_from, 0); if (net == NULL) { /* no alternate */ if (stcb->asoc.last_control_chunk_from == NULL) { if (stcb->asoc.alternate) { net = stcb->asoc.alternate; } else { net = stcb->asoc.primary_destination; } } else { net = stcb->asoc.last_control_chunk_from; } } } else { /* normal case */ if (stcb->asoc.last_control_chunk_from == NULL) { if (stcb->asoc.alternate) { net = stcb->asoc.alternate; } else { net = stcb->asoc.primary_destination; } } else { net = stcb->asoc.last_control_chunk_from; } } latest_ack->last_sent_to = net; TAILQ_FOREACH(ack, &stcb->asoc.asconf_ack_sent, next) { if (ack->data == NULL) { continue; } /* copy the asconf_ack */ m_ack = SCTP_M_COPYM(ack->data, 0, M_COPYALL, M_NOWAIT); if (m_ack == NULL) { /* couldn't copy it */ return; } #ifdef SCTP_MBUF_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_MBUF_LOGGING_ENABLE) { sctp_log_mbc(m_ack, SCTP_MBUF_ICOPY); } #endif sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* no memory */ if (m_ack) sctp_m_freem(m_ack); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_ASCONF_ACK; chk->rec.chunk_id.can_take_data = 1; chk->flags = CHUNK_FLAGS_FRAGMENT_OK; chk->whoTo = net; if (chk->whoTo) { atomic_add_int(&chk->whoTo->ref_count, 1); } chk->data = m_ack; chk->send_size = ack->len; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->asoc = &stcb->asoc; TAILQ_INSERT_TAIL(&chk->asoc->control_send_queue, chk, sctp_next); chk->asoc->ctrl_queue_cnt++; } return; } static int sctp_chunk_retransmission(struct sctp_inpcb *inp, struct sctp_tcb *stcb, struct sctp_association *asoc, int *cnt_out, struct timeval *now, int *now_filled, int *fr_done, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { /*- * send out one MTU of retransmission. If fast_retransmit is * happening we ignore the cwnd. Otherwise we obey the cwnd and * rwnd. For a Cookie or Asconf in the control chunk queue we * retransmit them by themselves. * * For data chunks we will pick out the lowest TSN's in the sent_queue * marked for resend and bundle them all together (up to a MTU of * destination). The address to send to should have been * selected/changed where the retransmission was marked (i.e. in FR * or t3-timeout routines). */ struct sctp_tmit_chunk *data_list[SCTP_MAX_DATA_BUNDLING]; struct sctp_tmit_chunk *chk, *fwd; struct mbuf *m, *endofchain; struct sctp_nets *net = NULL; uint32_t tsns_sent = 0; int no_fragmentflg, bundle_at, cnt_thru; unsigned int mtu; int error, i, one_chunk, fwd_tsn, ctl_cnt, tmr_started; struct sctp_auth_chunk *auth = NULL; uint32_t auth_offset = 0; uint16_t auth_keyid; int override_ok = 1; int data_auth_reqd = 0; uint32_t dmtu = 0; SCTP_TCB_LOCK_ASSERT(stcb); tmr_started = ctl_cnt = bundle_at = error = 0; no_fragmentflg = 1; fwd_tsn = 0; *cnt_out = 0; fwd = NULL; endofchain = m = NULL; auth_keyid = stcb->asoc.authinfo.active_keyid; #ifdef SCTP_AUDITING_ENABLED sctp_audit_log(0xC3, 1); #endif if ((TAILQ_EMPTY(&asoc->sent_queue)) && (TAILQ_EMPTY(&asoc->control_send_queue))) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "SCTP hits empty queue with cnt set to %d?\n", asoc->sent_queue_retran_cnt); asoc->sent_queue_cnt = 0; asoc->sent_queue_cnt_removeable = 0; /* send back 0/0 so we enter normal transmission */ *cnt_out = 0; return (0); } TAILQ_FOREACH(chk, &asoc->control_send_queue, sctp_next) { if ((chk->rec.chunk_id.id == SCTP_COOKIE_ECHO) || (chk->rec.chunk_id.id == SCTP_STREAM_RESET) || (chk->rec.chunk_id.id == SCTP_FORWARD_CUM_TSN)) { if (chk->sent != SCTP_DATAGRAM_RESEND) { continue; } if (chk->rec.chunk_id.id == SCTP_STREAM_RESET) { if (chk != asoc->str_reset) { /* * not eligible for retran if its * not ours */ continue; } } ctl_cnt++; if (chk->rec.chunk_id.id == SCTP_FORWARD_CUM_TSN) { fwd_tsn = 1; } /* * Add an AUTH chunk, if chunk requires it save the * offset into the chain for AUTH */ if ((auth == NULL) && (sctp_auth_is_required_chunk(chk->rec.chunk_id.id, stcb->asoc.peer_auth_chunks))) { m = sctp_add_auth_chunk(m, &endofchain, &auth, &auth_offset, stcb, chk->rec.chunk_id.id); SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } m = sctp_copy_mbufchain(chk->data, m, &endofchain, 0, chk->send_size, chk->copy_by_ref); break; } } one_chunk = 0; cnt_thru = 0; /* do we have control chunks to retransmit? */ if (m != NULL) { /* Start a timer no matter if we succeed or fail */ if (chk->rec.chunk_id.id == SCTP_COOKIE_ECHO) { sctp_timer_start(SCTP_TIMER_TYPE_COOKIE, inp, stcb, chk->whoTo); } else if (chk->rec.chunk_id.id == SCTP_ASCONF) sctp_timer_start(SCTP_TIMER_TYPE_ASCONF, inp, stcb, chk->whoTo); chk->snd_count++; /* update our count */ if ((error = sctp_lowlevel_chunk_output(inp, stcb, chk->whoTo, (struct sockaddr *)&chk->whoTo->ro._l_addr, m, auth_offset, auth, stcb->asoc.authinfo.active_keyid, no_fragmentflg, 0, 0, inp->sctp_lport, stcb->rport, htonl(stcb->asoc.peer_vtag), chk->whoTo->port, NULL, 0, 0, so_locked))) { SCTP_STAT_INCR(sctps_lowlevelerr); return (error); } endofchain = NULL; auth = NULL; auth_offset = 0; /* * We don't want to mark the net->sent time here since this * we use this for HB and retrans cannot measure RTT */ /* (void)SCTP_GETTIME_TIMEVAL(&chk->whoTo->last_sent_time); */ *cnt_out += 1; chk->sent = SCTP_DATAGRAM_SENT; sctp_ucount_decr(stcb->asoc.sent_queue_retran_cnt); if (fwd_tsn == 0) { return (0); } else { /* Clean up the fwd-tsn list */ sctp_clean_up_ctl(stcb, asoc, so_locked); return (0); } } /* * Ok, it is just data retransmission we need to do or that and a * fwd-tsn with it all. */ if (TAILQ_EMPTY(&asoc->sent_queue)) { return (SCTP_RETRAN_DONE); } if ((SCTP_GET_STATE(asoc) == SCTP_STATE_COOKIE_ECHOED) || (SCTP_GET_STATE(asoc) == SCTP_STATE_COOKIE_WAIT)) { /* not yet open, resend the cookie and that is it */ return (1); } #ifdef SCTP_AUDITING_ENABLED sctp_auditing(20, inp, stcb, NULL); #endif data_auth_reqd = sctp_auth_is_required_chunk(SCTP_DATA, stcb->asoc.peer_auth_chunks); TAILQ_FOREACH(chk, &asoc->sent_queue, sctp_next) { if (chk->sent != SCTP_DATAGRAM_RESEND) { /* No, not sent to this net or not ready for rtx */ continue; } if (chk->data == NULL) { SCTP_PRINTF("TSN:%x chk->snd_count:%d chk->sent:%d can't retran - no data\n", chk->rec.data.TSN_seq, chk->snd_count, chk->sent); continue; } if ((SCTP_BASE_SYSCTL(sctp_max_retran_chunk)) && (chk->snd_count >= SCTP_BASE_SYSCTL(sctp_max_retran_chunk))) { struct mbuf *op_err; char msg[SCTP_DIAG_INFO_LEN]; snprintf(msg, sizeof(msg), "TSN %8.8x retransmitted %d times, giving up", chk->rec.data.TSN_seq, chk->snd_count); op_err = sctp_generate_cause(SCTP_BASE_SYSCTL(sctp_diag_info_code), msg); atomic_add_int(&stcb->asoc.refcnt, 1); sctp_abort_an_association(stcb->sctp_ep, stcb, op_err, so_locked); SCTP_TCB_LOCK(stcb); atomic_subtract_int(&stcb->asoc.refcnt, 1); return (SCTP_RETRAN_EXIT); } /* pick up the net */ net = chk->whoTo; switch (net->ro._l_addr.sa.sa_family) { #ifdef INET case AF_INET: mtu = net->mtu - SCTP_MIN_V4_OVERHEAD; break; #endif #ifdef INET6 case AF_INET6: mtu = net->mtu - SCTP_MIN_OVERHEAD; break; #endif default: /* TSNH */ mtu = net->mtu; break; } if ((asoc->peers_rwnd < mtu) && (asoc->total_flight > 0)) { /* No room in peers rwnd */ uint32_t tsn; tsn = asoc->last_acked_seq + 1; if (tsn == chk->rec.data.TSN_seq) { /* * we make a special exception for this * case. The peer has no rwnd but is missing * the lowest chunk.. which is probably what * is holding up the rwnd. */ goto one_chunk_around; } return (1); } one_chunk_around: if (asoc->peers_rwnd < mtu) { one_chunk = 1; if ((asoc->peers_rwnd == 0) && (asoc->total_flight == 0)) { chk->window_probe = 1; chk->whoTo->window_probe = 1; } } #ifdef SCTP_AUDITING_ENABLED sctp_audit_log(0xC3, 2); #endif bundle_at = 0; m = NULL; net->fast_retran_ip = 0; if (chk->rec.data.doing_fast_retransmit == 0) { /* * if no FR in progress skip destination that have * flight_size > cwnd. */ if (net->flight_size >= net->cwnd) { continue; } } else { /* * Mark the destination net to have FR recovery * limits put on it. */ *fr_done = 1; net->fast_retran_ip = 1; } /* * if no AUTH is yet included and this chunk requires it, * make sure to account for it. We don't apply the size * until the AUTH chunk is actually added below in case * there is no room for this chunk. */ if (data_auth_reqd && (auth == NULL)) { dmtu = sctp_get_auth_chunk_len(stcb->asoc.peer_hmac_id); } else dmtu = 0; if ((chk->send_size <= (mtu - dmtu)) || (chk->flags & CHUNK_FLAGS_FRAGMENT_OK)) { /* ok we will add this one */ if (data_auth_reqd) { if (auth == NULL) { m = sctp_add_auth_chunk(m, &endofchain, &auth, &auth_offset, stcb, SCTP_DATA); auth_keyid = chk->auth_keyid; override_ok = 0; SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } else if (override_ok) { auth_keyid = chk->auth_keyid; override_ok = 0; } else if (chk->auth_keyid != auth_keyid) { /* different keyid, so done bundling */ break; } } m = sctp_copy_mbufchain(chk->data, m, &endofchain, 0, chk->send_size, chk->copy_by_ref); if (m == NULL) { SCTP_LTRACE_ERR_RET(inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } /* Do clear IP_DF ? */ if (chk->flags & CHUNK_FLAGS_FRAGMENT_OK) { no_fragmentflg = 0; } /* upate our MTU size */ if (mtu > (chk->send_size + dmtu)) mtu -= (chk->send_size + dmtu); else mtu = 0; data_list[bundle_at++] = chk; if (one_chunk && (asoc->total_flight <= 0)) { SCTP_STAT_INCR(sctps_windowprobed); } } if (one_chunk == 0) { /* * now are there anymore forward from chk to pick * up? */ for (fwd = TAILQ_NEXT(chk, sctp_next); fwd != NULL; fwd = TAILQ_NEXT(fwd, sctp_next)) { if (fwd->sent != SCTP_DATAGRAM_RESEND) { /* Nope, not for retran */ continue; } if (fwd->whoTo != net) { /* Nope, not the net in question */ continue; } if (data_auth_reqd && (auth == NULL)) { dmtu = sctp_get_auth_chunk_len(stcb->asoc.peer_hmac_id); } else dmtu = 0; if (fwd->send_size <= (mtu - dmtu)) { if (data_auth_reqd) { if (auth == NULL) { m = sctp_add_auth_chunk(m, &endofchain, &auth, &auth_offset, stcb, SCTP_DATA); auth_keyid = fwd->auth_keyid; override_ok = 0; SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } else if (override_ok) { auth_keyid = fwd->auth_keyid; override_ok = 0; } else if (fwd->auth_keyid != auth_keyid) { /* * different keyid, * so done bundling */ break; } } m = sctp_copy_mbufchain(fwd->data, m, &endofchain, 0, fwd->send_size, fwd->copy_by_ref); if (m == NULL) { SCTP_LTRACE_ERR_RET(inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } /* Do clear IP_DF ? */ if (fwd->flags & CHUNK_FLAGS_FRAGMENT_OK) { no_fragmentflg = 0; } /* upate our MTU size */ if (mtu > (fwd->send_size + dmtu)) mtu -= (fwd->send_size + dmtu); else mtu = 0; data_list[bundle_at++] = fwd; if (bundle_at >= SCTP_MAX_DATA_BUNDLING) { break; } } else { /* can't fit so we are done */ break; } } } /* Is there something to send for this destination? */ if (m) { /* * No matter if we fail/or succeed we should start a * timer. A failure is like a lost IP packet :-) */ if (!SCTP_OS_TIMER_PENDING(&net->rxt_timer.timer)) { /* * no timer running on this destination * restart it. */ sctp_timer_start(SCTP_TIMER_TYPE_SEND, inp, stcb, net); tmr_started = 1; } /* Now lets send it, if there is anything to send :> */ if ((error = sctp_lowlevel_chunk_output(inp, stcb, net, (struct sockaddr *)&net->ro._l_addr, m, auth_offset, auth, auth_keyid, no_fragmentflg, 0, 0, inp->sctp_lport, stcb->rport, htonl(stcb->asoc.peer_vtag), net->port, NULL, 0, 0, so_locked))) { /* error, we could not output */ SCTP_STAT_INCR(sctps_lowlevelerr); return (error); } endofchain = NULL; auth = NULL; auth_offset = 0; /* For HB's */ /* * We don't want to mark the net->sent time here * since this we use this for HB and retrans cannot * measure RTT */ /* (void)SCTP_GETTIME_TIMEVAL(&net->last_sent_time); */ /* For auto-close */ cnt_thru++; if (*now_filled == 0) { (void)SCTP_GETTIME_TIMEVAL(&asoc->time_last_sent); *now = asoc->time_last_sent; *now_filled = 1; } else { asoc->time_last_sent = *now; } *cnt_out += bundle_at; #ifdef SCTP_AUDITING_ENABLED sctp_audit_log(0xC4, bundle_at); #endif if (bundle_at) { tsns_sent = data_list[0]->rec.data.TSN_seq; } for (i = 0; i < bundle_at; i++) { SCTP_STAT_INCR(sctps_sendretransdata); data_list[i]->sent = SCTP_DATAGRAM_SENT; /* * When we have a revoked data, and we * retransmit it, then we clear the revoked * flag since this flag dictates if we * subtracted from the fs */ if (data_list[i]->rec.data.chunk_was_revoked) { /* Deflate the cwnd */ data_list[i]->whoTo->cwnd -= data_list[i]->book_size; data_list[i]->rec.data.chunk_was_revoked = 0; } data_list[i]->snd_count++; sctp_ucount_decr(asoc->sent_queue_retran_cnt); /* record the time */ data_list[i]->sent_rcv_time = asoc->time_last_sent; if (data_list[i]->book_size_scale) { /* * need to double the book size on * this one */ data_list[i]->book_size_scale = 0; /* * Since we double the booksize, we * must also double the output queue * size, since this get shrunk when * we free by this amount. */ atomic_add_int(&((asoc)->total_output_queue_size), data_list[i]->book_size); data_list[i]->book_size *= 2; } else { if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_RWND_ENABLE) { sctp_log_rwnd(SCTP_DECREASE_PEER_RWND, asoc->peers_rwnd, data_list[i]->send_size, SCTP_BASE_SYSCTL(sctp_peer_chunk_oh)); } asoc->peers_rwnd = sctp_sbspace_sub(asoc->peers_rwnd, (uint32_t) (data_list[i]->send_size + SCTP_BASE_SYSCTL(sctp_peer_chunk_oh))); } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_FLIGHT_LOGGING_ENABLE) { sctp_misc_ints(SCTP_FLIGHT_LOG_UP_RSND, data_list[i]->whoTo->flight_size, data_list[i]->book_size, (uint32_t) (uintptr_t) data_list[i]->whoTo, data_list[i]->rec.data.TSN_seq); } sctp_flight_size_increase(data_list[i]); sctp_total_flight_increase(stcb, data_list[i]); if (asoc->peers_rwnd < stcb->sctp_ep->sctp_ep.sctp_sws_sender) { /* SWS sender side engages */ asoc->peers_rwnd = 0; } if ((i == 0) && (data_list[i]->rec.data.doing_fast_retransmit)) { SCTP_STAT_INCR(sctps_sendfastretrans); if ((data_list[i] == TAILQ_FIRST(&asoc->sent_queue)) && (tmr_started == 0)) { /*- * ok we just fast-retrans'd * the lowest TSN, i.e the * first on the list. In * this case we want to give * some more time to get a * SACK back without a * t3-expiring. */ sctp_timer_stop(SCTP_TIMER_TYPE_SEND, inp, stcb, net, SCTP_FROM_SCTP_OUTPUT + SCTP_LOC_2); sctp_timer_start(SCTP_TIMER_TYPE_SEND, inp, stcb, net); } } } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, net, tsns_sent, SCTP_CWND_LOG_FROM_RESEND); } #ifdef SCTP_AUDITING_ENABLED sctp_auditing(21, inp, stcb, NULL); #endif } else { /* None will fit */ return (1); } if (asoc->sent_queue_retran_cnt <= 0) { /* all done we have no more to retran */ asoc->sent_queue_retran_cnt = 0; break; } if (one_chunk) { /* No more room in rwnd */ return (1); } /* stop the for loop here. we sent out a packet */ break; } return (0); } static void sctp_timer_validation(struct sctp_inpcb *inp, struct sctp_tcb *stcb, struct sctp_association *asoc) { struct sctp_nets *net; /* Validate that a timer is running somewhere */ TAILQ_FOREACH(net, &asoc->nets, sctp_next) { if (SCTP_OS_TIMER_PENDING(&net->rxt_timer.timer)) { /* Here is a timer */ return; } } SCTP_TCB_LOCK_ASSERT(stcb); /* Gak, we did not have a timer somewhere */ SCTPDBG(SCTP_DEBUG_OUTPUT3, "Deadlock avoided starting timer on a dest at retran\n"); if (asoc->alternate) { sctp_timer_start(SCTP_TIMER_TYPE_SEND, inp, stcb, asoc->alternate); } else { sctp_timer_start(SCTP_TIMER_TYPE_SEND, inp, stcb, asoc->primary_destination); } return; } void sctp_chunk_output(struct sctp_inpcb *inp, struct sctp_tcb *stcb, int from_where, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { /*- * Ok this is the generic chunk service queue. we must do the * following: * - See if there are retransmits pending, if so we must * do these first. * - Service the stream queue that is next, moving any * message (note I must get a complete message i.e. * FIRST/MIDDLE and LAST to the out queue in one pass) and assigning * TSN's * - Check to see if the cwnd/rwnd allows any output, if so we * go ahead and fomulate and send the low level chunks. Making sure * to combine any control in the control chunk queue also. */ struct sctp_association *asoc; struct sctp_nets *net; int error = 0, num_out, tot_out = 0, ret = 0, reason_code; unsigned int burst_cnt = 0; struct timeval now; int now_filled = 0; int nagle_on; int frag_point = sctp_get_frag_point(stcb, &stcb->asoc); int un_sent = 0; int fr_done; unsigned int tot_frs = 0; asoc = &stcb->asoc; do_it_again: /* The Nagle algorithm is only applied when handling a send call. */ if (from_where == SCTP_OUTPUT_FROM_USR_SEND) { if (sctp_is_feature_on(inp, SCTP_PCB_FLAGS_NODELAY)) { nagle_on = 0; } else { nagle_on = 1; } } else { nagle_on = 0; } SCTP_TCB_LOCK_ASSERT(stcb); un_sent = (stcb->asoc.total_output_queue_size - stcb->asoc.total_flight); if ((un_sent <= 0) && (TAILQ_EMPTY(&asoc->control_send_queue)) && (TAILQ_EMPTY(&asoc->asconf_send_queue)) && (asoc->sent_queue_retran_cnt == 0) && (asoc->trigger_reset == 0)) { /* Nothing to do unless there is something to be sent left */ return; } /* * Do we have something to send, data or control AND a sack timer * running, if so piggy-back the sack. */ if (SCTP_OS_TIMER_PENDING(&stcb->asoc.dack_timer.timer)) { sctp_send_sack(stcb, so_locked); (void)SCTP_OS_TIMER_STOP(&stcb->asoc.dack_timer.timer); } while (asoc->sent_queue_retran_cnt) { /*- * Ok, it is retransmission time only, we send out only ONE * packet with a single call off to the retran code. */ if (from_where == SCTP_OUTPUT_FROM_COOKIE_ACK) { /*- * Special hook for handling cookiess discarded * by peer that carried data. Send cookie-ack only * and then the next call with get the retran's. */ (void)sctp_med_chunk_output(inp, stcb, asoc, &num_out, &reason_code, 1, from_where, &now, &now_filled, frag_point, so_locked); return; } else if (from_where != SCTP_OUTPUT_FROM_HB_TMR) { /* if its not from a HB then do it */ fr_done = 0; ret = sctp_chunk_retransmission(inp, stcb, asoc, &num_out, &now, &now_filled, &fr_done, so_locked); if (fr_done) { tot_frs++; } } else { /* * its from any other place, we don't allow retran * output (only control) */ ret = 1; } if (ret > 0) { /* Can't send anymore */ /*- * now lets push out control by calling med-level * output once. this assures that we WILL send HB's * if queued too. */ (void)sctp_med_chunk_output(inp, stcb, asoc, &num_out, &reason_code, 1, from_where, &now, &now_filled, frag_point, so_locked); #ifdef SCTP_AUDITING_ENABLED sctp_auditing(8, inp, stcb, NULL); #endif sctp_timer_validation(inp, stcb, asoc); return; } if (ret < 0) { /*- * The count was off.. retran is not happening so do * the normal retransmission. */ #ifdef SCTP_AUDITING_ENABLED sctp_auditing(9, inp, stcb, NULL); #endif if (ret == SCTP_RETRAN_EXIT) { return; } break; } if (from_where == SCTP_OUTPUT_FROM_T3) { /* Only one transmission allowed out of a timeout */ #ifdef SCTP_AUDITING_ENABLED sctp_auditing(10, inp, stcb, NULL); #endif /* Push out any control */ (void)sctp_med_chunk_output(inp, stcb, asoc, &num_out, &reason_code, 1, from_where, &now, &now_filled, frag_point, so_locked); return; } if ((asoc->fr_max_burst > 0) && (tot_frs >= asoc->fr_max_burst)) { /* Hit FR burst limit */ return; } if ((num_out == 0) && (ret == 0)) { /* No more retrans to send */ break; } } #ifdef SCTP_AUDITING_ENABLED sctp_auditing(12, inp, stcb, NULL); #endif /* Check for bad destinations, if they exist move chunks around. */ TAILQ_FOREACH(net, &asoc->nets, sctp_next) { if (!(net->dest_state & SCTP_ADDR_REACHABLE)) { /*- * if possible move things off of this address we * still may send below due to the dormant state but * we try to find an alternate address to send to * and if we have one we move all queued data on the * out wheel to this alternate address. */ if (net->ref_count > 1) sctp_move_chunks_from_net(stcb, net); } else { /*- * if ((asoc->sat_network) || (net->addr_is_local)) * { burst_limit = asoc->max_burst * * SCTP_SAT_NETWORK_BURST_INCR; } */ if (asoc->max_burst > 0) { if (SCTP_BASE_SYSCTL(sctp_use_cwnd_based_maxburst)) { if ((net->flight_size + (asoc->max_burst * net->mtu)) < net->cwnd) { /* * JRS - Use the congestion * control given in the * congestion control module */ asoc->cc_functions.sctp_cwnd_update_after_output(stcb, net, asoc->max_burst); if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_MAXBURST_ENABLE) { sctp_log_maxburst(stcb, net, 0, asoc->max_burst, SCTP_MAX_BURST_APPLIED); } SCTP_STAT_INCR(sctps_maxburstqueued); } net->fast_retran_ip = 0; } else { if (net->flight_size == 0) { /* * Should be decaying the * cwnd here */ ; } } } } } burst_cnt = 0; do { error = sctp_med_chunk_output(inp, stcb, asoc, &num_out, &reason_code, 0, from_where, &now, &now_filled, frag_point, so_locked); if (error) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Error %d was returned from med-c-op\n", error); if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_MAXBURST_ENABLE) { sctp_log_maxburst(stcb, asoc->primary_destination, error, burst_cnt, SCTP_MAX_BURST_ERROR_STOP); } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, NULL, error, SCTP_SEND_NOW_COMPLETES); sctp_log_cwnd(stcb, NULL, 0xdeadbeef, SCTP_SEND_NOW_COMPLETES); } break; } SCTPDBG(SCTP_DEBUG_OUTPUT3, "m-c-o put out %d\n", num_out); tot_out += num_out; burst_cnt++; if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, NULL, num_out, SCTP_SEND_NOW_COMPLETES); if (num_out == 0) { sctp_log_cwnd(stcb, NULL, reason_code, SCTP_SEND_NOW_COMPLETES); } } if (nagle_on) { /* * When the Nagle algorithm is used, look at how * much is unsent, then if its smaller than an MTU * and we have data in flight we stop, except if we * are handling a fragmented user message. */ un_sent = ((stcb->asoc.total_output_queue_size - stcb->asoc.total_flight) + (stcb->asoc.stream_queue_cnt * sizeof(struct sctp_data_chunk))); if ((un_sent < (int)(stcb->asoc.smallest_mtu - SCTP_MIN_OVERHEAD)) && (stcb->asoc.total_flight > 0) && ((stcb->asoc.locked_on_sending == NULL) || sctp_is_feature_on(inp, SCTP_PCB_FLAGS_EXPLICIT_EOR))) { break; } } if (TAILQ_EMPTY(&asoc->control_send_queue) && TAILQ_EMPTY(&asoc->send_queue) && stcb->asoc.ss_functions.sctp_ss_is_empty(stcb, asoc)) { /* Nothing left to send */ break; } if ((stcb->asoc.total_output_queue_size - stcb->asoc.total_flight) <= 0) { /* Nothing left to send */ break; } } while (num_out && ((asoc->max_burst == 0) || SCTP_BASE_SYSCTL(sctp_use_cwnd_based_maxburst) || (burst_cnt < asoc->max_burst))); if (SCTP_BASE_SYSCTL(sctp_use_cwnd_based_maxburst) == 0) { if ((asoc->max_burst > 0) && (burst_cnt >= asoc->max_burst)) { SCTP_STAT_INCR(sctps_maxburstqueued); asoc->burst_limit_applied = 1; if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_MAXBURST_ENABLE) { sctp_log_maxburst(stcb, asoc->primary_destination, 0, burst_cnt, SCTP_MAX_BURST_APPLIED); } } else { asoc->burst_limit_applied = 0; } } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, NULL, tot_out, SCTP_SEND_NOW_COMPLETES); } SCTPDBG(SCTP_DEBUG_OUTPUT1, "Ok, we have put out %d chunks\n", tot_out); /*- * Now we need to clean up the control chunk chain if a ECNE is on * it. It must be marked as UNSENT again so next call will continue * to send it until such time that we get a CWR, to remove it. */ if (stcb->asoc.ecn_echo_cnt_onq) sctp_fix_ecn_echo(asoc); if (stcb->asoc.trigger_reset) { if (sctp_send_stream_reset_out_if_possible(stcb, so_locked) == 0) { goto do_it_again; } } return; } int sctp_output( struct sctp_inpcb *inp, struct mbuf *m, struct sockaddr *addr, struct mbuf *control, struct thread *p, int flags) { if (inp == NULL) { SCTP_LTRACE_ERR_RET_PKT(m, inp, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, EINVAL); return (EINVAL); } if (inp->sctp_socket == NULL) { SCTP_LTRACE_ERR_RET_PKT(m, inp, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, EINVAL); return (EINVAL); } return (sctp_sosend(inp->sctp_socket, addr, (struct uio *)NULL, m, control, flags, p )); } void send_forward_tsn(struct sctp_tcb *stcb, struct sctp_association *asoc) { struct sctp_tmit_chunk *chk; struct sctp_forward_tsn_chunk *fwdtsn; uint32_t advance_peer_ack_point; int old; if (asoc->idata_supported) { old = 0; } else { old = 1; } SCTP_TCB_LOCK_ASSERT(stcb); TAILQ_FOREACH(chk, &asoc->control_send_queue, sctp_next) { if (chk->rec.chunk_id.id == SCTP_FORWARD_CUM_TSN) { /* mark it to unsent */ chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; /* Do we correct its output location? */ if (chk->whoTo) { sctp_free_remote_addr(chk->whoTo); chk->whoTo = NULL; } goto sctp_fill_in_rest; } } /* Ok if we reach here we must build one */ sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { return; } asoc->fwd_tsn_cnt++; chk->copy_by_ref = 0; /* * We don't do the old thing here since this is used not for on-wire * but to tell if we are sending a fwd-tsn by the stack during * output. And if its a IFORWARD or a FORWARD it is a fwd-tsn. */ chk->rec.chunk_id.id = SCTP_FORWARD_CUM_TSN; chk->rec.chunk_id.can_take_data = 0; chk->flags = 0; chk->asoc = asoc; chk->whoTo = NULL; chk->data = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_NOWAIT, 1, MT_DATA); if (chk->data == NULL) { sctp_free_a_chunk(stcb, chk, SCTP_SO_NOT_LOCKED); return; } SCTP_BUF_RESV_UF(chk->data, SCTP_MIN_OVERHEAD); chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; TAILQ_INSERT_TAIL(&asoc->control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt++; sctp_fill_in_rest: /*- * Here we go through and fill out the part that deals with * stream/seq of the ones we skip. */ SCTP_BUF_LEN(chk->data) = 0; { struct sctp_tmit_chunk *at, *tp1, *last; struct sctp_strseq *strseq; struct sctp_strseq_mid *strseq_m; unsigned int cnt_of_space, i, ovh; unsigned int space_needed; unsigned int cnt_of_skipped = 0; TAILQ_FOREACH(at, &asoc->sent_queue, sctp_next) { if ((at->sent != SCTP_FORWARD_TSN_SKIP) && (at->sent != SCTP_DATAGRAM_NR_ACKED)) { /* no more to look at */ break; } if ((at->rec.data.rcv_flags & SCTP_DATA_UNORDERED) && old) { /* We don't report these */ continue; } cnt_of_skipped++; } if (old) { space_needed = (sizeof(struct sctp_forward_tsn_chunk) + (cnt_of_skipped * sizeof(struct sctp_strseq))); } else { space_needed = (sizeof(struct sctp_forward_tsn_chunk) + (cnt_of_skipped * sizeof(struct sctp_strseq_mid))); } cnt_of_space = (unsigned int)M_TRAILINGSPACE(chk->data); if (stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_BOUND_V6) { ovh = SCTP_MIN_OVERHEAD; } else { ovh = SCTP_MIN_V4_OVERHEAD; } if (cnt_of_space > (asoc->smallest_mtu - ovh)) { /* trim to a mtu size */ cnt_of_space = asoc->smallest_mtu - ovh; } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_TRY_ADVANCE) { sctp_misc_ints(SCTP_FWD_TSN_CHECK, 0xff, 0, cnt_of_skipped, asoc->advanced_peer_ack_point); } advance_peer_ack_point = asoc->advanced_peer_ack_point; if (cnt_of_space < space_needed) { /*- * ok we must trim down the chunk by lowering the * advance peer ack point. */ if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_TRY_ADVANCE) { sctp_misc_ints(SCTP_FWD_TSN_CHECK, 0xff, 0xff, cnt_of_space, space_needed); } if (old) { cnt_of_skipped = cnt_of_space - sizeof(struct sctp_forward_tsn_chunk); cnt_of_skipped /= sizeof(struct sctp_strseq); } else { cnt_of_skipped = cnt_of_space - sizeof(struct sctp_forward_tsn_chunk); cnt_of_skipped /= sizeof(struct sctp_strseq_mid); } /*- * Go through and find the TSN that will be the one * we report. */ at = TAILQ_FIRST(&asoc->sent_queue); if (at != NULL) { for (i = 0; i < cnt_of_skipped; i++) { tp1 = TAILQ_NEXT(at, sctp_next); if (tp1 == NULL) { break; } at = tp1; } } if (at && SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_TRY_ADVANCE) { sctp_misc_ints(SCTP_FWD_TSN_CHECK, 0xff, cnt_of_skipped, at->rec.data.TSN_seq, asoc->advanced_peer_ack_point); } last = at; /*- * last now points to last one I can report, update * peer ack point */ if (last) advance_peer_ack_point = last->rec.data.TSN_seq; if (old) { space_needed = sizeof(struct sctp_forward_tsn_chunk) + cnt_of_skipped * sizeof(struct sctp_strseq); } else { space_needed = sizeof(struct sctp_forward_tsn_chunk) + cnt_of_skipped * sizeof(struct sctp_strseq_mid); } } chk->send_size = space_needed; /* Setup the chunk */ fwdtsn = mtod(chk->data, struct sctp_forward_tsn_chunk *); fwdtsn->ch.chunk_length = htons(chk->send_size); fwdtsn->ch.chunk_flags = 0; if (old) { fwdtsn->ch.chunk_type = SCTP_FORWARD_CUM_TSN; } else { fwdtsn->ch.chunk_type = SCTP_IFORWARD_CUM_TSN; } fwdtsn->new_cumulative_tsn = htonl(advance_peer_ack_point); SCTP_BUF_LEN(chk->data) = chk->send_size; fwdtsn++; /*- * Move pointer to after the fwdtsn and transfer to the * strseq pointer. */ if (old) { strseq = (struct sctp_strseq *)fwdtsn; } else { strseq_m = (struct sctp_strseq_mid *)fwdtsn; } /*- * Now populate the strseq list. This is done blindly * without pulling out duplicate stream info. This is * inefficent but won't harm the process since the peer will * look at these in sequence and will thus release anything. * It could mean we exceed the PMTU and chop off some that * we could have included.. but this is unlikely (aka 1432/4 * would mean 300+ stream seq's would have to be reported in * one FWD-TSN. With a bit of work we can later FIX this to * optimize and pull out duplcates.. but it does add more * overhead. So for now... not! */ at = TAILQ_FIRST(&asoc->sent_queue); for (i = 0; i < cnt_of_skipped; i++) { tp1 = TAILQ_NEXT(at, sctp_next); if (tp1 == NULL) break; if (old && (at->rec.data.rcv_flags & SCTP_DATA_UNORDERED)) { /* We don't report these */ i--; at = tp1; continue; } if (at->rec.data.TSN_seq == advance_peer_ack_point) { at->rec.data.fwd_tsn_cnt = 0; } if (old) { strseq->stream = ntohs(at->rec.data.stream_number); strseq->sequence = ntohs(at->rec.data.stream_seq); strseq++; } else { strseq_m->stream = ntohs(at->rec.data.stream_number); strseq_m->msg_id = ntohl(at->rec.data.stream_seq); if (at->rec.data.rcv_flags & SCTP_DATA_UNORDERED) strseq_m->flags = ntohs(PR_SCTP_UNORDERED_FLAG); else strseq_m->flags = 0; strseq_m++; } at = tp1; } } return; } void sctp_send_sack(struct sctp_tcb *stcb, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { /*- * Queue up a SACK or NR-SACK in the control queue. * We must first check to see if a SACK or NR-SACK is * somehow on the control queue. * If so, we will take and and remove the old one. */ struct sctp_association *asoc; struct sctp_tmit_chunk *chk, *a_chk; struct sctp_sack_chunk *sack; struct sctp_nr_sack_chunk *nr_sack; struct sctp_gap_ack_block *gap_descriptor; const struct sack_track *selector; int mergeable = 0; int offset; caddr_t limit; uint32_t *dup; int limit_reached = 0; unsigned int i, siz, j; unsigned int num_gap_blocks = 0, num_nr_gap_blocks = 0, space; int num_dups = 0; int space_req; uint32_t highest_tsn; uint8_t flags; uint8_t type; uint8_t tsn_map; if (stcb->asoc.nrsack_supported == 1) { type = SCTP_NR_SELECTIVE_ACK; } else { type = SCTP_SELECTIVE_ACK; } a_chk = NULL; asoc = &stcb->asoc; SCTP_TCB_LOCK_ASSERT(stcb); if (asoc->last_data_chunk_from == NULL) { /* Hmm we never received anything */ return; } sctp_slide_mapping_arrays(stcb); sctp_set_rwnd(stcb, asoc); TAILQ_FOREACH(chk, &asoc->control_send_queue, sctp_next) { if (chk->rec.chunk_id.id == type) { /* Hmm, found a sack already on queue, remove it */ TAILQ_REMOVE(&asoc->control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt--; a_chk = chk; if (a_chk->data) { sctp_m_freem(a_chk->data); a_chk->data = NULL; } if (a_chk->whoTo) { sctp_free_remote_addr(a_chk->whoTo); a_chk->whoTo = NULL; } break; } } if (a_chk == NULL) { sctp_alloc_a_chunk(stcb, a_chk); if (a_chk == NULL) { /* No memory so we drop the idea, and set a timer */ if (stcb->asoc.delayed_ack) { sctp_timer_stop(SCTP_TIMER_TYPE_RECV, stcb->sctp_ep, stcb, NULL, SCTP_FROM_SCTP_OUTPUT + SCTP_LOC_3); sctp_timer_start(SCTP_TIMER_TYPE_RECV, stcb->sctp_ep, stcb, NULL); } else { stcb->asoc.send_sack = 1; } return; } a_chk->copy_by_ref = 0; a_chk->rec.chunk_id.id = type; a_chk->rec.chunk_id.can_take_data = 1; } /* Clear our pkt counts */ asoc->data_pkts_seen = 0; a_chk->flags = 0; a_chk->asoc = asoc; a_chk->snd_count = 0; a_chk->send_size = 0; /* fill in later */ a_chk->sent = SCTP_DATAGRAM_UNSENT; a_chk->whoTo = NULL; if (!(asoc->last_data_chunk_from->dest_state & SCTP_ADDR_REACHABLE)) { /*- * Ok, the destination for the SACK is unreachable, lets see if * we can select an alternate to asoc->last_data_chunk_from */ a_chk->whoTo = sctp_find_alternate_net(stcb, asoc->last_data_chunk_from, 0); if (a_chk->whoTo == NULL) { /* Nope, no alternate */ a_chk->whoTo = asoc->last_data_chunk_from; } } else { a_chk->whoTo = asoc->last_data_chunk_from; } if (a_chk->whoTo) { atomic_add_int(&a_chk->whoTo->ref_count, 1); } if (SCTP_TSN_GT(asoc->highest_tsn_inside_map, asoc->highest_tsn_inside_nr_map)) { highest_tsn = asoc->highest_tsn_inside_map; } else { highest_tsn = asoc->highest_tsn_inside_nr_map; } if (highest_tsn == asoc->cumulative_tsn) { /* no gaps */ if (type == SCTP_SELECTIVE_ACK) { space_req = sizeof(struct sctp_sack_chunk); } else { space_req = sizeof(struct sctp_nr_sack_chunk); } } else { /* gaps get a cluster */ space_req = MCLBYTES; } /* Ok now lets formulate a MBUF with our sack */ a_chk->data = sctp_get_mbuf_for_msg(space_req, 0, M_NOWAIT, 1, MT_DATA); if ((a_chk->data == NULL) || (a_chk->whoTo == NULL)) { /* rats, no mbuf memory */ if (a_chk->data) { /* was a problem with the destination */ sctp_m_freem(a_chk->data); a_chk->data = NULL; } sctp_free_a_chunk(stcb, a_chk, so_locked); /* sa_ignore NO_NULL_CHK */ if (stcb->asoc.delayed_ack) { sctp_timer_stop(SCTP_TIMER_TYPE_RECV, stcb->sctp_ep, stcb, NULL, SCTP_FROM_SCTP_OUTPUT + SCTP_LOC_4); sctp_timer_start(SCTP_TIMER_TYPE_RECV, stcb->sctp_ep, stcb, NULL); } else { stcb->asoc.send_sack = 1; } return; } /* ok, lets go through and fill it in */ SCTP_BUF_RESV_UF(a_chk->data, SCTP_MIN_OVERHEAD); space = (unsigned int)M_TRAILINGSPACE(a_chk->data); if (space > (a_chk->whoTo->mtu - SCTP_MIN_OVERHEAD)) { space = (a_chk->whoTo->mtu - SCTP_MIN_OVERHEAD); } limit = mtod(a_chk->data, caddr_t); limit += space; flags = 0; if ((asoc->sctp_cmt_on_off > 0) && SCTP_BASE_SYSCTL(sctp_cmt_use_dac)) { /*- * CMT DAC algorithm: If 2 (i.e., 0x10) packets have been * received, then set high bit to 1, else 0. Reset * pkts_rcvd. */ flags |= (asoc->cmt_dac_pkts_rcvd << 6); asoc->cmt_dac_pkts_rcvd = 0; } #ifdef SCTP_ASOCLOG_OF_TSNS stcb->asoc.cumack_logsnt[stcb->asoc.cumack_log_atsnt] = asoc->cumulative_tsn; stcb->asoc.cumack_log_atsnt++; if (stcb->asoc.cumack_log_atsnt >= SCTP_TSN_LOG_SIZE) { stcb->asoc.cumack_log_atsnt = 0; } #endif /* reset the readers interpretation */ stcb->freed_by_sorcv_sincelast = 0; if (type == SCTP_SELECTIVE_ACK) { sack = mtod(a_chk->data, struct sctp_sack_chunk *); nr_sack = NULL; gap_descriptor = (struct sctp_gap_ack_block *)((caddr_t)sack + sizeof(struct sctp_sack_chunk)); if (highest_tsn > asoc->mapping_array_base_tsn) { siz = (((highest_tsn - asoc->mapping_array_base_tsn) + 1) + 7) / 8; } else { siz = (((MAX_TSN - highest_tsn) + 1) + highest_tsn + 7) / 8; } } else { sack = NULL; nr_sack = mtod(a_chk->data, struct sctp_nr_sack_chunk *); gap_descriptor = (struct sctp_gap_ack_block *)((caddr_t)nr_sack + sizeof(struct sctp_nr_sack_chunk)); if (asoc->highest_tsn_inside_map > asoc->mapping_array_base_tsn) { siz = (((asoc->highest_tsn_inside_map - asoc->mapping_array_base_tsn) + 1) + 7) / 8; } else { siz = (((MAX_TSN - asoc->mapping_array_base_tsn) + 1) + asoc->highest_tsn_inside_map + 7) / 8; } } if (SCTP_TSN_GT(asoc->mapping_array_base_tsn, asoc->cumulative_tsn)) { offset = 1; } else { offset = asoc->mapping_array_base_tsn - asoc->cumulative_tsn; } if (((type == SCTP_SELECTIVE_ACK) && SCTP_TSN_GT(highest_tsn, asoc->cumulative_tsn)) || ((type == SCTP_NR_SELECTIVE_ACK) && SCTP_TSN_GT(asoc->highest_tsn_inside_map, asoc->cumulative_tsn))) { /* we have a gap .. maybe */ for (i = 0; i < siz; i++) { tsn_map = asoc->mapping_array[i]; if (type == SCTP_SELECTIVE_ACK) { tsn_map |= asoc->nr_mapping_array[i]; } if (i == 0) { /* * Clear all bits corresponding to TSNs * smaller or equal to the cumulative TSN. */ tsn_map &= (~0U << (1 - offset)); } selector = &sack_array[tsn_map]; if (mergeable && selector->right_edge) { /* * Backup, left and right edges were ok to * merge. */ num_gap_blocks--; gap_descriptor--; } if (selector->num_entries == 0) mergeable = 0; else { for (j = 0; j < selector->num_entries; j++) { if (mergeable && selector->right_edge) { /* * do a merge by NOT setting * the left side */ mergeable = 0; } else { /* * no merge, set the left * side */ mergeable = 0; gap_descriptor->start = htons((selector->gaps[j].start + offset)); } gap_descriptor->end = htons((selector->gaps[j].end + offset)); num_gap_blocks++; gap_descriptor++; if (((caddr_t)gap_descriptor + sizeof(struct sctp_gap_ack_block)) > limit) { /* no more room */ limit_reached = 1; break; } } if (selector->left_edge) { mergeable = 1; } } if (limit_reached) { /* Reached the limit stop */ break; } offset += 8; } } if ((type == SCTP_NR_SELECTIVE_ACK) && (limit_reached == 0)) { mergeable = 0; if (asoc->highest_tsn_inside_nr_map > asoc->mapping_array_base_tsn) { siz = (((asoc->highest_tsn_inside_nr_map - asoc->mapping_array_base_tsn) + 1) + 7) / 8; } else { siz = (((MAX_TSN - asoc->mapping_array_base_tsn) + 1) + asoc->highest_tsn_inside_nr_map + 7) / 8; } if (SCTP_TSN_GT(asoc->mapping_array_base_tsn, asoc->cumulative_tsn)) { offset = 1; } else { offset = asoc->mapping_array_base_tsn - asoc->cumulative_tsn; } if (SCTP_TSN_GT(asoc->highest_tsn_inside_nr_map, asoc->cumulative_tsn)) { /* we have a gap .. maybe */ for (i = 0; i < siz; i++) { tsn_map = asoc->nr_mapping_array[i]; if (i == 0) { /* * Clear all bits corresponding to * TSNs smaller or equal to the * cumulative TSN. */ tsn_map &= (~0U << (1 - offset)); } selector = &sack_array[tsn_map]; if (mergeable && selector->right_edge) { /* * Backup, left and right edges were * ok to merge. */ num_nr_gap_blocks--; gap_descriptor--; } if (selector->num_entries == 0) mergeable = 0; else { for (j = 0; j < selector->num_entries; j++) { if (mergeable && selector->right_edge) { /* * do a merge by NOT * setting the left * side */ mergeable = 0; } else { /* * no merge, set the * left side */ mergeable = 0; gap_descriptor->start = htons((selector->gaps[j].start + offset)); } gap_descriptor->end = htons((selector->gaps[j].end + offset)); num_nr_gap_blocks++; gap_descriptor++; if (((caddr_t)gap_descriptor + sizeof(struct sctp_gap_ack_block)) > limit) { /* no more room */ limit_reached = 1; break; } } if (selector->left_edge) { mergeable = 1; } } if (limit_reached) { /* Reached the limit stop */ break; } offset += 8; } } } /* now we must add any dups we are going to report. */ if ((limit_reached == 0) && (asoc->numduptsns)) { dup = (uint32_t *) gap_descriptor; for (i = 0; i < asoc->numduptsns; i++) { *dup = htonl(asoc->dup_tsns[i]); dup++; num_dups++; if (((caddr_t)dup + sizeof(uint32_t)) > limit) { /* no more room */ break; } } asoc->numduptsns = 0; } /* * now that the chunk is prepared queue it to the control chunk * queue. */ if (type == SCTP_SELECTIVE_ACK) { a_chk->send_size = (uint16_t) (sizeof(struct sctp_sack_chunk) + (num_gap_blocks + num_nr_gap_blocks) * sizeof(struct sctp_gap_ack_block) + num_dups * sizeof(int32_t)); SCTP_BUF_LEN(a_chk->data) = a_chk->send_size; sack->sack.cum_tsn_ack = htonl(asoc->cumulative_tsn); sack->sack.a_rwnd = htonl(asoc->my_rwnd); sack->sack.num_gap_ack_blks = htons(num_gap_blocks); sack->sack.num_dup_tsns = htons(num_dups); sack->ch.chunk_type = type; sack->ch.chunk_flags = flags; sack->ch.chunk_length = htons(a_chk->send_size); } else { a_chk->send_size = (uint16_t) (sizeof(struct sctp_nr_sack_chunk) + (num_gap_blocks + num_nr_gap_blocks) * sizeof(struct sctp_gap_ack_block) + num_dups * sizeof(int32_t)); SCTP_BUF_LEN(a_chk->data) = a_chk->send_size; nr_sack->nr_sack.cum_tsn_ack = htonl(asoc->cumulative_tsn); nr_sack->nr_sack.a_rwnd = htonl(asoc->my_rwnd); nr_sack->nr_sack.num_gap_ack_blks = htons(num_gap_blocks); nr_sack->nr_sack.num_nr_gap_ack_blks = htons(num_nr_gap_blocks); nr_sack->nr_sack.num_dup_tsns = htons(num_dups); nr_sack->nr_sack.reserved = 0; nr_sack->ch.chunk_type = type; nr_sack->ch.chunk_flags = flags; nr_sack->ch.chunk_length = htons(a_chk->send_size); } TAILQ_INSERT_TAIL(&asoc->control_send_queue, a_chk, sctp_next); asoc->my_last_reported_rwnd = asoc->my_rwnd; asoc->ctrl_queue_cnt++; asoc->send_sack = 0; SCTP_STAT_INCR(sctps_sendsacks); return; } void sctp_send_abort_tcb(struct sctp_tcb *stcb, struct mbuf *operr, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { struct mbuf *m_abort, *m, *m_last; struct mbuf *m_out, *m_end = NULL; struct sctp_abort_chunk *abort; struct sctp_auth_chunk *auth = NULL; struct sctp_nets *net; uint32_t vtag; uint32_t auth_offset = 0; uint16_t cause_len, chunk_len, padding_len; SCTP_TCB_LOCK_ASSERT(stcb); /*- * Add an AUTH chunk, if chunk requires it and save the offset into * the chain for AUTH */ if (sctp_auth_is_required_chunk(SCTP_ABORT_ASSOCIATION, stcb->asoc.peer_auth_chunks)) { m_out = sctp_add_auth_chunk(NULL, &m_end, &auth, &auth_offset, stcb, SCTP_ABORT_ASSOCIATION); SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } else { m_out = NULL; } m_abort = sctp_get_mbuf_for_msg(sizeof(struct sctp_abort_chunk), 0, M_NOWAIT, 1, MT_HEADER); if (m_abort == NULL) { if (m_out) { sctp_m_freem(m_out); } if (operr) { sctp_m_freem(operr); } return; } /* link in any error */ SCTP_BUF_NEXT(m_abort) = operr; cause_len = 0; m_last = NULL; for (m = operr; m; m = SCTP_BUF_NEXT(m)) { cause_len += (uint16_t) SCTP_BUF_LEN(m); if (SCTP_BUF_NEXT(m) == NULL) { m_last = m; } } SCTP_BUF_LEN(m_abort) = sizeof(struct sctp_abort_chunk); chunk_len = (uint16_t) sizeof(struct sctp_abort_chunk) + cause_len; padding_len = SCTP_SIZE32(chunk_len) - chunk_len; if (m_out == NULL) { /* NO Auth chunk prepended, so reserve space in front */ SCTP_BUF_RESV_UF(m_abort, SCTP_MIN_OVERHEAD); m_out = m_abort; } else { /* Put AUTH chunk at the front of the chain */ SCTP_BUF_NEXT(m_end) = m_abort; } if (stcb->asoc.alternate) { net = stcb->asoc.alternate; } else { net = stcb->asoc.primary_destination; } /* Fill in the ABORT chunk header. */ abort = mtod(m_abort, struct sctp_abort_chunk *); abort->ch.chunk_type = SCTP_ABORT_ASSOCIATION; if (stcb->asoc.peer_vtag == 0) { /* This happens iff the assoc is in COOKIE-WAIT state. */ vtag = stcb->asoc.my_vtag; abort->ch.chunk_flags = SCTP_HAD_NO_TCB; } else { vtag = stcb->asoc.peer_vtag; abort->ch.chunk_flags = 0; } abort->ch.chunk_length = htons(chunk_len); /* Add padding, if necessary. */ if (padding_len > 0) { if ((m_last == NULL) || (sctp_add_pad_tombuf(m_last, padding_len) == NULL)) { sctp_m_freem(m_out); return; } } (void)sctp_lowlevel_chunk_output(stcb->sctp_ep, stcb, net, (struct sockaddr *)&net->ro._l_addr, m_out, auth_offset, auth, stcb->asoc.authinfo.active_keyid, 1, 0, 0, stcb->sctp_ep->sctp_lport, stcb->rport, htonl(vtag), stcb->asoc.primary_destination->port, NULL, 0, 0, so_locked); SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } void sctp_send_shutdown_complete(struct sctp_tcb *stcb, struct sctp_nets *net, int reflect_vtag) { /* formulate and SEND a SHUTDOWN-COMPLETE */ struct mbuf *m_shutdown_comp; struct sctp_shutdown_complete_chunk *shutdown_complete; uint32_t vtag; uint8_t flags; m_shutdown_comp = sctp_get_mbuf_for_msg(sizeof(struct sctp_chunkhdr), 0, M_NOWAIT, 1, MT_HEADER); if (m_shutdown_comp == NULL) { /* no mbuf's */ return; } if (reflect_vtag) { flags = SCTP_HAD_NO_TCB; vtag = stcb->asoc.my_vtag; } else { flags = 0; vtag = stcb->asoc.peer_vtag; } shutdown_complete = mtod(m_shutdown_comp, struct sctp_shutdown_complete_chunk *); shutdown_complete->ch.chunk_type = SCTP_SHUTDOWN_COMPLETE; shutdown_complete->ch.chunk_flags = flags; shutdown_complete->ch.chunk_length = htons(sizeof(struct sctp_shutdown_complete_chunk)); SCTP_BUF_LEN(m_shutdown_comp) = sizeof(struct sctp_shutdown_complete_chunk); (void)sctp_lowlevel_chunk_output(stcb->sctp_ep, stcb, net, (struct sockaddr *)&net->ro._l_addr, m_shutdown_comp, 0, NULL, 0, 1, 0, 0, stcb->sctp_ep->sctp_lport, stcb->rport, htonl(vtag), net->port, NULL, 0, 0, SCTP_SO_NOT_LOCKED); SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); return; } static void sctp_send_resp_msg(struct sockaddr *src, struct sockaddr *dst, struct sctphdr *sh, uint32_t vtag, uint8_t type, struct mbuf *cause, uint8_t mflowtype, uint32_t mflowid, uint16_t fibnum, uint32_t vrf_id, uint16_t port) { struct mbuf *o_pak; struct mbuf *mout; struct sctphdr *shout; struct sctp_chunkhdr *ch; #if defined(INET) || defined(INET6) struct udphdr *udp; int ret; #endif int len, cause_len, padding_len; #ifdef INET struct sockaddr_in *src_sin, *dst_sin; struct ip *ip; #endif #ifdef INET6 struct sockaddr_in6 *src_sin6, *dst_sin6; struct ip6_hdr *ip6; #endif /* Compute the length of the cause and add final padding. */ cause_len = 0; if (cause != NULL) { struct mbuf *m_at, *m_last = NULL; for (m_at = cause; m_at; m_at = SCTP_BUF_NEXT(m_at)) { if (SCTP_BUF_NEXT(m_at) == NULL) m_last = m_at; cause_len += SCTP_BUF_LEN(m_at); } padding_len = cause_len % 4; if (padding_len != 0) { padding_len = 4 - padding_len; } if (padding_len != 0) { if (sctp_add_pad_tombuf(m_last, padding_len) == NULL) { sctp_m_freem(cause); return; } } } else { padding_len = 0; } /* Get an mbuf for the header. */ len = sizeof(struct sctphdr) + sizeof(struct sctp_chunkhdr); switch (dst->sa_family) { #ifdef INET case AF_INET: len += sizeof(struct ip); break; #endif #ifdef INET6 case AF_INET6: len += sizeof(struct ip6_hdr); break; #endif default: break; } #if defined(INET) || defined(INET6) if (port) { len += sizeof(struct udphdr); } #endif mout = sctp_get_mbuf_for_msg(len + max_linkhdr, 1, M_NOWAIT, 1, MT_DATA); if (mout == NULL) { if (cause) { sctp_m_freem(cause); } return; } SCTP_BUF_RESV_UF(mout, max_linkhdr); SCTP_BUF_LEN(mout) = len; SCTP_BUF_NEXT(mout) = cause; M_SETFIB(mout, fibnum); mout->m_pkthdr.flowid = mflowid; M_HASHTYPE_SET(mout, mflowtype); #ifdef INET ip = NULL; #endif #ifdef INET6 ip6 = NULL; #endif switch (dst->sa_family) { #ifdef INET case AF_INET: src_sin = (struct sockaddr_in *)src; dst_sin = (struct sockaddr_in *)dst; ip = mtod(mout, struct ip *); ip->ip_v = IPVERSION; ip->ip_hl = (sizeof(struct ip) >> 2); ip->ip_tos = 0; ip->ip_off = 0; ip_fillid(ip); ip->ip_ttl = MODULE_GLOBAL(ip_defttl); if (port) { ip->ip_p = IPPROTO_UDP; } else { ip->ip_p = IPPROTO_SCTP; } ip->ip_src.s_addr = dst_sin->sin_addr.s_addr; ip->ip_dst.s_addr = src_sin->sin_addr.s_addr; ip->ip_sum = 0; len = sizeof(struct ip); shout = (struct sctphdr *)((caddr_t)ip + len); break; #endif #ifdef INET6 case AF_INET6: src_sin6 = (struct sockaddr_in6 *)src; dst_sin6 = (struct sockaddr_in6 *)dst; ip6 = mtod(mout, struct ip6_hdr *); ip6->ip6_flow = htonl(0x60000000); if (V_ip6_auto_flowlabel) { ip6->ip6_flow |= (htonl(ip6_randomflowlabel()) & IPV6_FLOWLABEL_MASK); } ip6->ip6_hlim = MODULE_GLOBAL(ip6_defhlim); if (port) { ip6->ip6_nxt = IPPROTO_UDP; } else { ip6->ip6_nxt = IPPROTO_SCTP; } ip6->ip6_src = dst_sin6->sin6_addr; ip6->ip6_dst = src_sin6->sin6_addr; len = sizeof(struct ip6_hdr); shout = (struct sctphdr *)((caddr_t)ip6 + len); break; #endif default: len = 0; shout = mtod(mout, struct sctphdr *); break; } #if defined(INET) || defined(INET6) if (port) { if (htons(SCTP_BASE_SYSCTL(sctp_udp_tunneling_port)) == 0) { sctp_m_freem(mout); return; } udp = (struct udphdr *)shout; udp->uh_sport = htons(SCTP_BASE_SYSCTL(sctp_udp_tunneling_port)); udp->uh_dport = port; udp->uh_sum = 0; udp->uh_ulen = htons((uint16_t) (sizeof(struct udphdr) + sizeof(struct sctphdr) + sizeof(struct sctp_chunkhdr) + cause_len + padding_len)); len += sizeof(struct udphdr); shout = (struct sctphdr *)((caddr_t)shout + sizeof(struct udphdr)); } else { udp = NULL; } #endif shout->src_port = sh->dest_port; shout->dest_port = sh->src_port; shout->checksum = 0; if (vtag) { shout->v_tag = htonl(vtag); } else { shout->v_tag = sh->v_tag; } len += sizeof(struct sctphdr); ch = (struct sctp_chunkhdr *)((caddr_t)shout + sizeof(struct sctphdr)); ch->chunk_type = type; if (vtag) { ch->chunk_flags = 0; } else { ch->chunk_flags = SCTP_HAD_NO_TCB; } ch->chunk_length = htons((uint16_t) (sizeof(struct sctp_chunkhdr) + cause_len)); len += sizeof(struct sctp_chunkhdr); len += cause_len + padding_len; if (SCTP_GET_HEADER_FOR_OUTPUT(o_pak)) { sctp_m_freem(mout); return; } SCTP_ATTACH_CHAIN(o_pak, mout, len); switch (dst->sa_family) { #ifdef INET case AF_INET: if (port) { if (V_udp_cksum) { udp->uh_sum = in_pseudo(ip->ip_src.s_addr, ip->ip_dst.s_addr, udp->uh_ulen + htons(IPPROTO_UDP)); } else { udp->uh_sum = 0; } } ip->ip_len = htons(len); if (port) { #if defined(SCTP_WITH_NO_CSUM) SCTP_STAT_INCR(sctps_sendnocrc); #else shout->checksum = sctp_calculate_cksum(mout, sizeof(struct ip) + sizeof(struct udphdr)); SCTP_STAT_INCR(sctps_sendswcrc); #endif if (V_udp_cksum) { SCTP_ENABLE_UDP_CSUM(o_pak); } } else { #if defined(SCTP_WITH_NO_CSUM) SCTP_STAT_INCR(sctps_sendnocrc); #else mout->m_pkthdr.csum_flags = CSUM_SCTP; mout->m_pkthdr.csum_data = offsetof(struct sctphdr, checksum); SCTP_STAT_INCR(sctps_sendhwcrc); #endif } #ifdef SCTP_PACKET_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LAST_PACKET_TRACING) { sctp_packet_log(o_pak); } #endif SCTP_IP_OUTPUT(ret, o_pak, NULL, NULL, vrf_id); break; #endif #ifdef INET6 case AF_INET6: ip6->ip6_plen = (uint16_t) (len - sizeof(struct ip6_hdr)); if (port) { #if defined(SCTP_WITH_NO_CSUM) SCTP_STAT_INCR(sctps_sendnocrc); #else shout->checksum = sctp_calculate_cksum(mout, sizeof(struct ip6_hdr) + sizeof(struct udphdr)); SCTP_STAT_INCR(sctps_sendswcrc); #endif if ((udp->uh_sum = in6_cksum(o_pak, IPPROTO_UDP, sizeof(struct ip6_hdr), len - sizeof(struct ip6_hdr))) == 0) { udp->uh_sum = 0xffff; } } else { #if defined(SCTP_WITH_NO_CSUM) SCTP_STAT_INCR(sctps_sendnocrc); #else mout->m_pkthdr.csum_flags = CSUM_SCTP_IPV6; mout->m_pkthdr.csum_data = offsetof(struct sctphdr, checksum); SCTP_STAT_INCR(sctps_sendhwcrc); #endif } #ifdef SCTP_PACKET_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LAST_PACKET_TRACING) { sctp_packet_log(o_pak); } #endif SCTP_IP6_OUTPUT(ret, o_pak, NULL, NULL, NULL, vrf_id); break; #endif default: SCTPDBG(SCTP_DEBUG_OUTPUT1, "Unknown protocol (TSNH) type %d\n", dst->sa_family); sctp_m_freem(mout); SCTP_LTRACE_ERR_RET_PKT(mout, NULL, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, EFAULT); return; } SCTP_STAT_INCR(sctps_sendpackets); SCTP_STAT_INCR_COUNTER64(sctps_outpackets); SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); return; } void sctp_send_shutdown_complete2(struct sockaddr *src, struct sockaddr *dst, struct sctphdr *sh, uint8_t mflowtype, uint32_t mflowid, uint16_t fibnum, uint32_t vrf_id, uint16_t port) { sctp_send_resp_msg(src, dst, sh, 0, SCTP_SHUTDOWN_COMPLETE, NULL, mflowtype, mflowid, fibnum, vrf_id, port); } void sctp_send_hb(struct sctp_tcb *stcb, struct sctp_nets *net, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { struct sctp_tmit_chunk *chk; struct sctp_heartbeat_chunk *hb; struct timeval now; SCTP_TCB_LOCK_ASSERT(stcb); if (net == NULL) { return; } (void)SCTP_GETTIME_TIMEVAL(&now); switch (net->ro._l_addr.sa.sa_family) { #ifdef INET case AF_INET: break; #endif #ifdef INET6 case AF_INET6: break; #endif default: return; } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { SCTPDBG(SCTP_DEBUG_OUTPUT4, "Gak, can't get a chunk for hb\n"); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_HEARTBEAT_REQUEST; chk->rec.chunk_id.can_take_data = 1; chk->flags = 0; chk->asoc = &stcb->asoc; chk->send_size = sizeof(struct sctp_heartbeat_chunk); chk->data = sctp_get_mbuf_for_msg(chk->send_size, 0, M_NOWAIT, 1, MT_HEADER); if (chk->data == NULL) { sctp_free_a_chunk(stcb, chk, so_locked); return; } SCTP_BUF_RESV_UF(chk->data, SCTP_MIN_OVERHEAD); SCTP_BUF_LEN(chk->data) = chk->send_size; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->whoTo = net; atomic_add_int(&chk->whoTo->ref_count, 1); /* Now we have a mbuf that we can fill in with the details */ hb = mtod(chk->data, struct sctp_heartbeat_chunk *); memset(hb, 0, sizeof(struct sctp_heartbeat_chunk)); /* fill out chunk header */ hb->ch.chunk_type = SCTP_HEARTBEAT_REQUEST; hb->ch.chunk_flags = 0; hb->ch.chunk_length = htons(chk->send_size); /* Fill out hb parameter */ hb->heartbeat.hb_info.ph.param_type = htons(SCTP_HEARTBEAT_INFO); hb->heartbeat.hb_info.ph.param_length = htons(sizeof(struct sctp_heartbeat_info_param)); hb->heartbeat.hb_info.time_value_1 = now.tv_sec; hb->heartbeat.hb_info.time_value_2 = now.tv_usec; /* Did our user request this one, put it in */ hb->heartbeat.hb_info.addr_family = (uint8_t) net->ro._l_addr.sa.sa_family; hb->heartbeat.hb_info.addr_len = net->ro._l_addr.sa.sa_len; if (net->dest_state & SCTP_ADDR_UNCONFIRMED) { /* * we only take from the entropy pool if the address is not * confirmed. */ net->heartbeat_random1 = hb->heartbeat.hb_info.random_value1 = sctp_select_initial_TSN(&stcb->sctp_ep->sctp_ep); net->heartbeat_random2 = hb->heartbeat.hb_info.random_value2 = sctp_select_initial_TSN(&stcb->sctp_ep->sctp_ep); } else { net->heartbeat_random1 = hb->heartbeat.hb_info.random_value1 = 0; net->heartbeat_random2 = hb->heartbeat.hb_info.random_value2 = 0; } switch (net->ro._l_addr.sa.sa_family) { #ifdef INET case AF_INET: memcpy(hb->heartbeat.hb_info.address, &net->ro._l_addr.sin.sin_addr, sizeof(net->ro._l_addr.sin.sin_addr)); break; #endif #ifdef INET6 case AF_INET6: memcpy(hb->heartbeat.hb_info.address, &net->ro._l_addr.sin6.sin6_addr, sizeof(net->ro._l_addr.sin6.sin6_addr)); break; #endif default: if (chk->data) { sctp_m_freem(chk->data); chk->data = NULL; } sctp_free_a_chunk(stcb, chk, so_locked); return; break; } net->hb_responded = 0; TAILQ_INSERT_TAIL(&stcb->asoc.control_send_queue, chk, sctp_next); stcb->asoc.ctrl_queue_cnt++; SCTP_STAT_INCR(sctps_sendheartbeat); return; } void sctp_send_ecn_echo(struct sctp_tcb *stcb, struct sctp_nets *net, uint32_t high_tsn) { struct sctp_association *asoc; struct sctp_ecne_chunk *ecne; struct sctp_tmit_chunk *chk; if (net == NULL) { return; } asoc = &stcb->asoc; SCTP_TCB_LOCK_ASSERT(stcb); TAILQ_FOREACH(chk, &asoc->control_send_queue, sctp_next) { if ((chk->rec.chunk_id.id == SCTP_ECN_ECHO) && (net == chk->whoTo)) { /* found a previous ECN_ECHO update it if needed */ uint32_t cnt, ctsn; ecne = mtod(chk->data, struct sctp_ecne_chunk *); ctsn = ntohl(ecne->tsn); if (SCTP_TSN_GT(high_tsn, ctsn)) { ecne->tsn = htonl(high_tsn); SCTP_STAT_INCR(sctps_queue_upd_ecne); } cnt = ntohl(ecne->num_pkts_since_cwr); cnt++; ecne->num_pkts_since_cwr = htonl(cnt); return; } } /* nope could not find one to update so we must build one */ sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { return; } SCTP_STAT_INCR(sctps_queue_upd_ecne); chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_ECN_ECHO; chk->rec.chunk_id.can_take_data = 0; chk->flags = 0; chk->asoc = &stcb->asoc; chk->send_size = sizeof(struct sctp_ecne_chunk); chk->data = sctp_get_mbuf_for_msg(chk->send_size, 0, M_NOWAIT, 1, MT_HEADER); if (chk->data == NULL) { sctp_free_a_chunk(stcb, chk, SCTP_SO_NOT_LOCKED); return; } SCTP_BUF_RESV_UF(chk->data, SCTP_MIN_OVERHEAD); SCTP_BUF_LEN(chk->data) = chk->send_size; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->whoTo = net; atomic_add_int(&chk->whoTo->ref_count, 1); stcb->asoc.ecn_echo_cnt_onq++; ecne = mtod(chk->data, struct sctp_ecne_chunk *); ecne->ch.chunk_type = SCTP_ECN_ECHO; ecne->ch.chunk_flags = 0; ecne->ch.chunk_length = htons(sizeof(struct sctp_ecne_chunk)); ecne->tsn = htonl(high_tsn); ecne->num_pkts_since_cwr = htonl(1); TAILQ_INSERT_HEAD(&stcb->asoc.control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt++; } void sctp_send_packet_dropped(struct sctp_tcb *stcb, struct sctp_nets *net, struct mbuf *m, int len, int iphlen, int bad_crc) { struct sctp_association *asoc; struct sctp_pktdrop_chunk *drp; struct sctp_tmit_chunk *chk; uint8_t *datap; int was_trunc = 0; int fullsz = 0; long spc; int offset; struct sctp_chunkhdr *ch, chunk_buf; unsigned int chk_length; if (!stcb) { return; } asoc = &stcb->asoc; SCTP_TCB_LOCK_ASSERT(stcb); if (asoc->pktdrop_supported == 0) { /*- * peer must declare support before I send one. */ return; } if (stcb->sctp_socket == NULL) { return; } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_PACKET_DROPPED; chk->rec.chunk_id.can_take_data = 1; chk->flags = 0; len -= iphlen; chk->send_size = len; /* Validate that we do not have an ABORT in here. */ offset = iphlen + sizeof(struct sctphdr); ch = (struct sctp_chunkhdr *)sctp_m_getptr(m, offset, sizeof(*ch), (uint8_t *) & chunk_buf); while (ch != NULL) { chk_length = ntohs(ch->chunk_length); if (chk_length < sizeof(*ch)) { /* break to abort land */ break; } switch (ch->chunk_type) { case SCTP_PACKET_DROPPED: case SCTP_ABORT_ASSOCIATION: case SCTP_INITIATION_ACK: /** * We don't respond with an PKT-DROP to an ABORT * or PKT-DROP. We also do not respond to an * INIT-ACK, because we can't know if the initiation * tag is correct or not. */ sctp_free_a_chunk(stcb, chk, SCTP_SO_NOT_LOCKED); return; default: break; } offset += SCTP_SIZE32(chk_length); ch = (struct sctp_chunkhdr *)sctp_m_getptr(m, offset, sizeof(*ch), (uint8_t *) & chunk_buf); } if ((len + SCTP_MAX_OVERHEAD + sizeof(struct sctp_pktdrop_chunk)) > min(stcb->asoc.smallest_mtu, MCLBYTES)) { /* * only send 1 mtu worth, trim off the excess on the end. */ fullsz = len; len = min(stcb->asoc.smallest_mtu, MCLBYTES) - SCTP_MAX_OVERHEAD; was_trunc = 1; } chk->asoc = &stcb->asoc; chk->data = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_NOWAIT, 1, MT_DATA); if (chk->data == NULL) { jump_out: sctp_free_a_chunk(stcb, chk, SCTP_SO_NOT_LOCKED); return; } SCTP_BUF_RESV_UF(chk->data, SCTP_MIN_OVERHEAD); drp = mtod(chk->data, struct sctp_pktdrop_chunk *); if (drp == NULL) { sctp_m_freem(chk->data); chk->data = NULL; goto jump_out; } chk->book_size = SCTP_SIZE32((chk->send_size + sizeof(struct sctp_pktdrop_chunk) + sizeof(struct sctphdr) + SCTP_MED_OVERHEAD)); chk->book_size_scale = 0; if (was_trunc) { drp->ch.chunk_flags = SCTP_PACKET_TRUNCATED; drp->trunc_len = htons(fullsz); /* * Len is already adjusted to size minus overhead above take * out the pkt_drop chunk itself from it. */ chk->send_size = (uint16_t) (len - sizeof(struct sctp_pktdrop_chunk)); len = chk->send_size; } else { /* no truncation needed */ drp->ch.chunk_flags = 0; drp->trunc_len = htons(0); } if (bad_crc) { drp->ch.chunk_flags |= SCTP_BADCRC; } chk->send_size += sizeof(struct sctp_pktdrop_chunk); SCTP_BUF_LEN(chk->data) = chk->send_size; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; if (net) { /* we should hit here */ chk->whoTo = net; atomic_add_int(&chk->whoTo->ref_count, 1); } else { chk->whoTo = NULL; } drp->ch.chunk_type = SCTP_PACKET_DROPPED; drp->ch.chunk_length = htons(chk->send_size); spc = SCTP_SB_LIMIT_RCV(stcb->sctp_socket); if (spc < 0) { spc = 0; } drp->bottle_bw = htonl(spc); if (asoc->my_rwnd) { drp->current_onq = htonl(asoc->size_on_reasm_queue + asoc->size_on_all_streams + asoc->my_rwnd_control_len + stcb->sctp_socket->so_rcv.sb_cc); } else { /*- * If my rwnd is 0, possibly from mbuf depletion as well as * space used, tell the peer there is NO space aka onq == bw */ drp->current_onq = htonl(spc); } drp->reserved = 0; datap = drp->data; m_copydata(m, iphlen, len, (caddr_t)datap); TAILQ_INSERT_TAIL(&stcb->asoc.control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt++; } void sctp_send_cwr(struct sctp_tcb *stcb, struct sctp_nets *net, uint32_t high_tsn, uint8_t override) { struct sctp_association *asoc; struct sctp_cwr_chunk *cwr; struct sctp_tmit_chunk *chk; SCTP_TCB_LOCK_ASSERT(stcb); if (net == NULL) { return; } asoc = &stcb->asoc; TAILQ_FOREACH(chk, &asoc->control_send_queue, sctp_next) { if ((chk->rec.chunk_id.id == SCTP_ECN_CWR) && (net == chk->whoTo)) { /* * found a previous CWR queued to same destination * update it if needed */ uint32_t ctsn; cwr = mtod(chk->data, struct sctp_cwr_chunk *); ctsn = ntohl(cwr->tsn); if (SCTP_TSN_GT(high_tsn, ctsn)) { cwr->tsn = htonl(high_tsn); } if (override & SCTP_CWR_REDUCE_OVERRIDE) { /* Make sure override is carried */ cwr->ch.chunk_flags |= SCTP_CWR_REDUCE_OVERRIDE; } return; } } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_ECN_CWR; chk->rec.chunk_id.can_take_data = 1; chk->flags = 0; chk->asoc = &stcb->asoc; chk->send_size = sizeof(struct sctp_cwr_chunk); chk->data = sctp_get_mbuf_for_msg(chk->send_size, 0, M_NOWAIT, 1, MT_HEADER); if (chk->data == NULL) { sctp_free_a_chunk(stcb, chk, SCTP_SO_NOT_LOCKED); return; } SCTP_BUF_RESV_UF(chk->data, SCTP_MIN_OVERHEAD); SCTP_BUF_LEN(chk->data) = chk->send_size; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->whoTo = net; atomic_add_int(&chk->whoTo->ref_count, 1); cwr = mtod(chk->data, struct sctp_cwr_chunk *); cwr->ch.chunk_type = SCTP_ECN_CWR; cwr->ch.chunk_flags = override; cwr->ch.chunk_length = htons(sizeof(struct sctp_cwr_chunk)); cwr->tsn = htonl(high_tsn); TAILQ_INSERT_TAIL(&stcb->asoc.control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt++; } static int sctp_add_stream_reset_out(struct sctp_tcb *stcb, struct sctp_tmit_chunk *chk, uint32_t seq, uint32_t resp_seq, uint32_t last_sent) { uint16_t len, old_len, i; struct sctp_stream_reset_out_request *req_out; struct sctp_chunkhdr *ch; int at; int number_entries = 0; ch = mtod(chk->data, struct sctp_chunkhdr *); old_len = len = SCTP_SIZE32(ntohs(ch->chunk_length)); /* get to new offset for the param. */ req_out = (struct sctp_stream_reset_out_request *)((caddr_t)ch + len); /* now how long will this param be? */ for (i = 0; i < stcb->asoc.streamoutcnt; i++) { if ((stcb->asoc.strmout[i].state == SCTP_STREAM_RESET_PENDING) && (stcb->asoc.strmout[i].chunks_on_queues == 0) && TAILQ_EMPTY(&stcb->asoc.strmout[i].outqueue)) { number_entries++; } } if (number_entries == 0) { return (0); } if (number_entries == stcb->asoc.streamoutcnt) { number_entries = 0; } if (number_entries > SCTP_MAX_STREAMS_AT_ONCE_RESET) { number_entries = SCTP_MAX_STREAMS_AT_ONCE_RESET; } len = (uint16_t) (sizeof(struct sctp_stream_reset_out_request) + (sizeof(uint16_t) * number_entries)); req_out->ph.param_type = htons(SCTP_STR_RESET_OUT_REQUEST); req_out->ph.param_length = htons(len); req_out->request_seq = htonl(seq); req_out->response_seq = htonl(resp_seq); req_out->send_reset_at_tsn = htonl(last_sent); at = 0; if (number_entries) { for (i = 0; i < stcb->asoc.streamoutcnt; i++) { if ((stcb->asoc.strmout[i].state == SCTP_STREAM_RESET_PENDING) && (stcb->asoc.strmout[i].chunks_on_queues == 0) && TAILQ_EMPTY(&stcb->asoc.strmout[i].outqueue)) { req_out->list_of_streams[at] = htons(i); at++; stcb->asoc.strmout[i].state = SCTP_STREAM_RESET_IN_FLIGHT; if (at >= number_entries) { break; } } } } else { for (i = 0; i < stcb->asoc.streamoutcnt; i++) { stcb->asoc.strmout[i].state = SCTP_STREAM_RESET_IN_FLIGHT; } } if (SCTP_SIZE32(len) > len) { /*- * Need to worry about the pad we may end up adding to the * end. This is easy since the struct is either aligned to 4 * bytes or 2 bytes off. */ req_out->list_of_streams[number_entries] = 0; } /* now fix the chunk length */ ch->chunk_length = htons(len + old_len); chk->book_size = len + old_len; chk->book_size_scale = 0; chk->send_size = SCTP_SIZE32(chk->book_size); SCTP_BUF_LEN(chk->data) = chk->send_size; return (1); } static void sctp_add_stream_reset_in(struct sctp_tmit_chunk *chk, int number_entries, uint16_t * list, uint32_t seq) { uint16_t len, old_len, i; struct sctp_stream_reset_in_request *req_in; struct sctp_chunkhdr *ch; ch = mtod(chk->data, struct sctp_chunkhdr *); old_len = len = SCTP_SIZE32(ntohs(ch->chunk_length)); /* get to new offset for the param. */ req_in = (struct sctp_stream_reset_in_request *)((caddr_t)ch + len); /* now how long will this param be? */ len = (uint16_t) (sizeof(struct sctp_stream_reset_in_request) + (sizeof(uint16_t) * number_entries)); req_in->ph.param_type = htons(SCTP_STR_RESET_IN_REQUEST); req_in->ph.param_length = htons(len); req_in->request_seq = htonl(seq); if (number_entries) { for (i = 0; i < number_entries; i++) { req_in->list_of_streams[i] = htons(list[i]); } } if (SCTP_SIZE32(len) > len) { /*- * Need to worry about the pad we may end up adding to the * end. This is easy since the struct is either aligned to 4 * bytes or 2 bytes off. */ req_in->list_of_streams[number_entries] = 0; } /* now fix the chunk length */ ch->chunk_length = htons(len + old_len); chk->book_size = len + old_len; chk->book_size_scale = 0; chk->send_size = SCTP_SIZE32(chk->book_size); SCTP_BUF_LEN(chk->data) = chk->send_size; return; } static void sctp_add_stream_reset_tsn(struct sctp_tmit_chunk *chk, uint32_t seq) { uint16_t len, old_len; struct sctp_stream_reset_tsn_request *req_tsn; struct sctp_chunkhdr *ch; ch = mtod(chk->data, struct sctp_chunkhdr *); old_len = len = SCTP_SIZE32(ntohs(ch->chunk_length)); /* get to new offset for the param. */ req_tsn = (struct sctp_stream_reset_tsn_request *)((caddr_t)ch + len); /* now how long will this param be? */ len = sizeof(struct sctp_stream_reset_tsn_request); req_tsn->ph.param_type = htons(SCTP_STR_RESET_TSN_REQUEST); req_tsn->ph.param_length = htons(len); req_tsn->request_seq = htonl(seq); /* now fix the chunk length */ ch->chunk_length = htons(len + old_len); chk->send_size = len + old_len; chk->book_size = SCTP_SIZE32(chk->send_size); chk->book_size_scale = 0; SCTP_BUF_LEN(chk->data) = SCTP_SIZE32(chk->send_size); return; } void sctp_add_stream_reset_result(struct sctp_tmit_chunk *chk, uint32_t resp_seq, uint32_t result) { uint16_t len, old_len; struct sctp_stream_reset_response *resp; struct sctp_chunkhdr *ch; ch = mtod(chk->data, struct sctp_chunkhdr *); old_len = len = SCTP_SIZE32(ntohs(ch->chunk_length)); /* get to new offset for the param. */ resp = (struct sctp_stream_reset_response *)((caddr_t)ch + len); /* now how long will this param be? */ len = sizeof(struct sctp_stream_reset_response); resp->ph.param_type = htons(SCTP_STR_RESET_RESPONSE); resp->ph.param_length = htons(len); resp->response_seq = htonl(resp_seq); resp->result = ntohl(result); /* now fix the chunk length */ ch->chunk_length = htons(len + old_len); chk->book_size = len + old_len; chk->book_size_scale = 0; chk->send_size = SCTP_SIZE32(chk->book_size); SCTP_BUF_LEN(chk->data) = chk->send_size; return; } void sctp_send_deferred_reset_response(struct sctp_tcb *stcb, struct sctp_stream_reset_list *ent, int response) { struct sctp_association *asoc; struct sctp_tmit_chunk *chk; struct sctp_chunkhdr *ch; asoc = &stcb->asoc; /* * Reset our last reset action to the new one IP -> response * (PERFORMED probably). This assures that if we fail to send, a * retran from the peer will get the new response. */ asoc->last_reset_action[0] = response; if (asoc->stream_reset_outstanding) { return; } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_STREAM_RESET; chk->rec.chunk_id.can_take_data = 0; chk->flags = 0; chk->asoc = &stcb->asoc; chk->book_size = sizeof(struct sctp_chunkhdr); chk->send_size = SCTP_SIZE32(chk->book_size); chk->book_size_scale = 0; chk->data = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_NOWAIT, 1, MT_DATA); if (chk->data == NULL) { sctp_free_a_chunk(stcb, chk, SCTP_SO_LOCKED); SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return; } SCTP_BUF_RESV_UF(chk->data, SCTP_MIN_OVERHEAD); - sctp_add_stream_reset_result(chk, ent->seq, response); /* setup chunk parameters */ chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; if (stcb->asoc.alternate) { chk->whoTo = stcb->asoc.alternate; } else { chk->whoTo = stcb->asoc.primary_destination; } ch = mtod(chk->data, struct sctp_chunkhdr *); ch->chunk_type = SCTP_STREAM_RESET; ch->chunk_flags = 0; ch->chunk_length = htons(chk->book_size); atomic_add_int(&chk->whoTo->ref_count, 1); SCTP_BUF_LEN(chk->data) = chk->send_size; + sctp_add_stream_reset_result(chk, ent->seq, response); /* insert the chunk for sending */ TAILQ_INSERT_TAIL(&asoc->control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt++; } void sctp_add_stream_reset_result_tsn(struct sctp_tmit_chunk *chk, uint32_t resp_seq, uint32_t result, uint32_t send_una, uint32_t recv_next) { uint16_t len, old_len; struct sctp_stream_reset_response_tsn *resp; struct sctp_chunkhdr *ch; ch = mtod(chk->data, struct sctp_chunkhdr *); old_len = len = SCTP_SIZE32(ntohs(ch->chunk_length)); /* get to new offset for the param. */ resp = (struct sctp_stream_reset_response_tsn *)((caddr_t)ch + len); /* now how long will this param be? */ len = sizeof(struct sctp_stream_reset_response_tsn); resp->ph.param_type = htons(SCTP_STR_RESET_RESPONSE); resp->ph.param_length = htons(len); resp->response_seq = htonl(resp_seq); resp->result = htonl(result); resp->senders_next_tsn = htonl(send_una); resp->receivers_next_tsn = htonl(recv_next); /* now fix the chunk length */ ch->chunk_length = htons(len + old_len); chk->book_size = len + old_len; chk->send_size = SCTP_SIZE32(chk->book_size); chk->book_size_scale = 0; SCTP_BUF_LEN(chk->data) = chk->send_size; return; } static void sctp_add_an_out_stream(struct sctp_tmit_chunk *chk, uint32_t seq, uint16_t adding) { uint16_t len, old_len; struct sctp_chunkhdr *ch; struct sctp_stream_reset_add_strm *addstr; ch = mtod(chk->data, struct sctp_chunkhdr *); old_len = len = SCTP_SIZE32(ntohs(ch->chunk_length)); /* get to new offset for the param. */ addstr = (struct sctp_stream_reset_add_strm *)((caddr_t)ch + len); /* now how long will this param be? */ len = sizeof(struct sctp_stream_reset_add_strm); /* Fill it out. */ addstr->ph.param_type = htons(SCTP_STR_RESET_ADD_OUT_STREAMS); addstr->ph.param_length = htons(len); addstr->request_seq = htonl(seq); addstr->number_of_streams = htons(adding); addstr->reserved = 0; /* now fix the chunk length */ ch->chunk_length = htons(len + old_len); chk->send_size = len + old_len; chk->book_size = SCTP_SIZE32(chk->send_size); chk->book_size_scale = 0; SCTP_BUF_LEN(chk->data) = SCTP_SIZE32(chk->send_size); return; } static void sctp_add_an_in_stream(struct sctp_tmit_chunk *chk, uint32_t seq, uint16_t adding) { uint16_t len, old_len; struct sctp_chunkhdr *ch; struct sctp_stream_reset_add_strm *addstr; ch = mtod(chk->data, struct sctp_chunkhdr *); old_len = len = SCTP_SIZE32(ntohs(ch->chunk_length)); /* get to new offset for the param. */ addstr = (struct sctp_stream_reset_add_strm *)((caddr_t)ch + len); /* now how long will this param be? */ len = sizeof(struct sctp_stream_reset_add_strm); /* Fill it out. */ addstr->ph.param_type = htons(SCTP_STR_RESET_ADD_IN_STREAMS); addstr->ph.param_length = htons(len); addstr->request_seq = htonl(seq); addstr->number_of_streams = htons(adding); addstr->reserved = 0; /* now fix the chunk length */ ch->chunk_length = htons(len + old_len); chk->send_size = len + old_len; chk->book_size = SCTP_SIZE32(chk->send_size); chk->book_size_scale = 0; SCTP_BUF_LEN(chk->data) = SCTP_SIZE32(chk->send_size); return; } int sctp_send_stream_reset_out_if_possible(struct sctp_tcb *stcb, int so_locked) { struct sctp_association *asoc; struct sctp_tmit_chunk *chk; struct sctp_chunkhdr *ch; uint32_t seq; asoc = &stcb->asoc; asoc->trigger_reset = 0; if (asoc->stream_reset_outstanding) { return (EALREADY); } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_STREAM_RESET; chk->rec.chunk_id.can_take_data = 0; chk->flags = 0; chk->asoc = &stcb->asoc; chk->book_size = sizeof(struct sctp_chunkhdr); chk->send_size = SCTP_SIZE32(chk->book_size); chk->book_size_scale = 0; chk->data = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_NOWAIT, 1, MT_DATA); if (chk->data == NULL) { sctp_free_a_chunk(stcb, chk, so_locked); SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } SCTP_BUF_RESV_UF(chk->data, SCTP_MIN_OVERHEAD); /* setup chunk parameters */ chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; if (stcb->asoc.alternate) { chk->whoTo = stcb->asoc.alternate; } else { chk->whoTo = stcb->asoc.primary_destination; } ch = mtod(chk->data, struct sctp_chunkhdr *); ch->chunk_type = SCTP_STREAM_RESET; ch->chunk_flags = 0; ch->chunk_length = htons(chk->book_size); atomic_add_int(&chk->whoTo->ref_count, 1); SCTP_BUF_LEN(chk->data) = chk->send_size; seq = stcb->asoc.str_reset_seq_out; if (sctp_add_stream_reset_out(stcb, chk, seq, (stcb->asoc.str_reset_seq_in - 1), (stcb->asoc.sending_seq - 1))) { seq++; asoc->stream_reset_outstanding++; } else { m_freem(chk->data); chk->data = NULL; sctp_free_a_chunk(stcb, chk, so_locked); return (ENOENT); } asoc->str_reset = chk; /* insert the chunk for sending */ TAILQ_INSERT_TAIL(&asoc->control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt++; if (stcb->asoc.send_sack) { sctp_send_sack(stcb, so_locked); } sctp_timer_start(SCTP_TIMER_TYPE_STRRESET, stcb->sctp_ep, stcb, chk->whoTo); return (0); } int sctp_send_str_reset_req(struct sctp_tcb *stcb, uint16_t number_entries, uint16_t * list, uint8_t send_in_req, uint8_t send_tsn_req, uint8_t add_stream, uint16_t adding_o, uint16_t adding_i, uint8_t peer_asked) { struct sctp_association *asoc; struct sctp_tmit_chunk *chk; struct sctp_chunkhdr *ch; int can_send_out_req = 0; uint32_t seq; asoc = &stcb->asoc; if (asoc->stream_reset_outstanding) { /*- * Already one pending, must get ACK back to clear the flag. */ SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EBUSY); return (EBUSY); } if ((send_in_req == 0) && (send_tsn_req == 0) && (add_stream == 0)) { /* nothing to do */ SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EINVAL); return (EINVAL); } if (send_tsn_req && send_in_req) { /* error, can't do that */ SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EINVAL); return (EINVAL); } else if (send_in_req) { can_send_out_req = 1; } if (number_entries > (MCLBYTES - SCTP_MIN_OVERHEAD - sizeof(struct sctp_chunkhdr) - sizeof(struct sctp_stream_reset_out_request)) / sizeof(uint16_t)) { SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_STREAM_RESET; chk->rec.chunk_id.can_take_data = 0; chk->flags = 0; chk->asoc = &stcb->asoc; chk->book_size = sizeof(struct sctp_chunkhdr); chk->send_size = SCTP_SIZE32(chk->book_size); chk->book_size_scale = 0; chk->data = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_NOWAIT, 1, MT_DATA); if (chk->data == NULL) { sctp_free_a_chunk(stcb, chk, SCTP_SO_LOCKED); SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } SCTP_BUF_RESV_UF(chk->data, SCTP_MIN_OVERHEAD); /* setup chunk parameters */ chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; if (stcb->asoc.alternate) { chk->whoTo = stcb->asoc.alternate; } else { chk->whoTo = stcb->asoc.primary_destination; } atomic_add_int(&chk->whoTo->ref_count, 1); ch = mtod(chk->data, struct sctp_chunkhdr *); ch->chunk_type = SCTP_STREAM_RESET; ch->chunk_flags = 0; ch->chunk_length = htons(chk->book_size); SCTP_BUF_LEN(chk->data) = chk->send_size; seq = stcb->asoc.str_reset_seq_out; if (can_send_out_req) { int ret; ret = sctp_add_stream_reset_out(stcb, chk, seq, (stcb->asoc.str_reset_seq_in - 1), (stcb->asoc.sending_seq - 1)); if (ret) { seq++; asoc->stream_reset_outstanding++; } } if ((add_stream & 1) && ((stcb->asoc.strm_realoutsize - stcb->asoc.streamoutcnt) < adding_o)) { /* Need to allocate more */ struct sctp_stream_out *oldstream; struct sctp_stream_queue_pending *sp, *nsp; int i; #if defined(SCTP_DETAILED_STR_STATS) int j; #endif oldstream = stcb->asoc.strmout; /* get some more */ SCTP_MALLOC(stcb->asoc.strmout, struct sctp_stream_out *, (stcb->asoc.streamoutcnt + adding_o) * sizeof(struct sctp_stream_out), SCTP_M_STRMO); if (stcb->asoc.strmout == NULL) { uint8_t x; stcb->asoc.strmout = oldstream; /* Turn off the bit */ x = add_stream & 0xfe; add_stream = x; goto skip_stuff; } /* * Ok now we proceed with copying the old out stuff and * initializing the new stuff. */ SCTP_TCB_SEND_LOCK(stcb); stcb->asoc.ss_functions.sctp_ss_clear(stcb, &stcb->asoc, 0, 1); for (i = 0; i < stcb->asoc.streamoutcnt; i++) { TAILQ_INIT(&stcb->asoc.strmout[i].outqueue); stcb->asoc.strmout[i].chunks_on_queues = oldstream[i].chunks_on_queues; stcb->asoc.strmout[i].next_mid_ordered = oldstream[i].next_mid_ordered; stcb->asoc.strmout[i].next_mid_unordered = oldstream[i].next_mid_unordered; stcb->asoc.strmout[i].last_msg_incomplete = oldstream[i].last_msg_incomplete; stcb->asoc.strmout[i].stream_no = i; stcb->asoc.strmout[i].state = oldstream[i].state; stcb->asoc.ss_functions.sctp_ss_init_stream(&stcb->asoc.strmout[i], &oldstream[i]); /* now anything on those queues? */ TAILQ_FOREACH_SAFE(sp, &oldstream[i].outqueue, next, nsp) { TAILQ_REMOVE(&oldstream[i].outqueue, sp, next); TAILQ_INSERT_TAIL(&stcb->asoc.strmout[i].outqueue, sp, next); } /* Now move assoc pointers too */ if (stcb->asoc.last_out_stream == &oldstream[i]) { stcb->asoc.last_out_stream = &stcb->asoc.strmout[i]; } if (stcb->asoc.locked_on_sending == &oldstream[i]) { stcb->asoc.locked_on_sending = &stcb->asoc.strmout[i]; } } /* now the new streams */ stcb->asoc.ss_functions.sctp_ss_init(stcb, &stcb->asoc, 1); for (i = stcb->asoc.streamoutcnt; i < (stcb->asoc.streamoutcnt + adding_o); i++) { TAILQ_INIT(&stcb->asoc.strmout[i].outqueue); stcb->asoc.strmout[i].chunks_on_queues = 0; #if defined(SCTP_DETAILED_STR_STATS) for (j = 0; j < SCTP_PR_SCTP_MAX + 1; j++) { stcb->asoc.strmout[i].abandoned_sent[j] = 0; stcb->asoc.strmout[i].abandoned_unsent[j] = 0; } #else stcb->asoc.strmout[i].abandoned_sent[0] = 0; stcb->asoc.strmout[i].abandoned_unsent[0] = 0; #endif stcb->asoc.strmout[i].next_mid_ordered = 0; stcb->asoc.strmout[i].next_mid_unordered = 0; stcb->asoc.strmout[i].stream_no = i; stcb->asoc.strmout[i].last_msg_incomplete = 0; stcb->asoc.ss_functions.sctp_ss_init_stream(&stcb->asoc.strmout[i], NULL); stcb->asoc.strmout[i].state = SCTP_STREAM_CLOSED; } stcb->asoc.strm_realoutsize = stcb->asoc.streamoutcnt + adding_o; SCTP_FREE(oldstream, SCTP_M_STRMO); SCTP_TCB_SEND_UNLOCK(stcb); } skip_stuff: if ((add_stream & 1) && (adding_o > 0)) { asoc->strm_pending_add_size = adding_o; asoc->peer_req_out = peer_asked; sctp_add_an_out_stream(chk, seq, adding_o); seq++; asoc->stream_reset_outstanding++; } if ((add_stream & 2) && (adding_i > 0)) { sctp_add_an_in_stream(chk, seq, adding_i); seq++; asoc->stream_reset_outstanding++; } if (send_in_req) { sctp_add_stream_reset_in(chk, number_entries, list, seq); seq++; asoc->stream_reset_outstanding++; } if (send_tsn_req) { sctp_add_stream_reset_tsn(chk, seq); asoc->stream_reset_outstanding++; } asoc->str_reset = chk; /* insert the chunk for sending */ TAILQ_INSERT_TAIL(&asoc->control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt++; if (stcb->asoc.send_sack) { sctp_send_sack(stcb, SCTP_SO_LOCKED); } sctp_timer_start(SCTP_TIMER_TYPE_STRRESET, stcb->sctp_ep, stcb, chk->whoTo); return (0); } void sctp_send_abort(struct mbuf *m, int iphlen, struct sockaddr *src, struct sockaddr *dst, struct sctphdr *sh, uint32_t vtag, struct mbuf *cause, uint8_t mflowtype, uint32_t mflowid, uint16_t fibnum, uint32_t vrf_id, uint16_t port) { /* Don't respond to an ABORT with an ABORT. */ if (sctp_is_there_an_abort_here(m, iphlen, &vtag)) { if (cause) sctp_m_freem(cause); return; } sctp_send_resp_msg(src, dst, sh, vtag, SCTP_ABORT_ASSOCIATION, cause, mflowtype, mflowid, fibnum, vrf_id, port); return; } void sctp_send_operr_to(struct sockaddr *src, struct sockaddr *dst, struct sctphdr *sh, uint32_t vtag, struct mbuf *cause, uint8_t mflowtype, uint32_t mflowid, uint16_t fibnum, uint32_t vrf_id, uint16_t port) { sctp_send_resp_msg(src, dst, sh, vtag, SCTP_OPERATION_ERROR, cause, mflowtype, mflowid, fibnum, vrf_id, port); return; } static struct mbuf * sctp_copy_resume(struct uio *uio, int max_send_len, int user_marks_eor, int *error, uint32_t * sndout, struct mbuf **new_tail) { struct mbuf *m; m = m_uiotombuf(uio, M_WAITOK, max_send_len, 0, (M_PKTHDR | (user_marks_eor ? M_EOR : 0))); if (m == NULL) { SCTP_LTRACE_ERR_RET(NULL, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, ENOBUFS); *error = ENOBUFS; } else { *sndout = m_length(m, NULL); *new_tail = m_last(m); } return (m); } static int sctp_copy_one(struct sctp_stream_queue_pending *sp, struct uio *uio, int resv_upfront) { sp->data = m_uiotombuf(uio, M_WAITOK, sp->length, resv_upfront, 0); if (sp->data == NULL) { SCTP_LTRACE_ERR_RET(NULL, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, ENOBUFS); return (ENOBUFS); } sp->tail_mbuf = m_last(sp->data); return (0); } static struct sctp_stream_queue_pending * sctp_copy_it_in(struct sctp_tcb *stcb, struct sctp_association *asoc, struct sctp_sndrcvinfo *srcv, struct uio *uio, struct sctp_nets *net, int max_send_len, int user_marks_eor, int *error) { /*- * This routine must be very careful in its work. Protocol * processing is up and running so care must be taken to spl...() * when you need to do something that may effect the stcb/asoc. The * sb is locked however. When data is copied the protocol processing * should be enabled since this is a slower operation... */ struct sctp_stream_queue_pending *sp = NULL; int resv_in_first; *error = 0; /* Now can we send this? */ if ((SCTP_GET_STATE(asoc) == SCTP_STATE_SHUTDOWN_SENT) || (SCTP_GET_STATE(asoc) == SCTP_STATE_SHUTDOWN_ACK_SENT) || (SCTP_GET_STATE(asoc) == SCTP_STATE_SHUTDOWN_RECEIVED) || (asoc->state & SCTP_STATE_SHUTDOWN_PENDING)) { /* got data while shutting down */ SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ECONNRESET); *error = ECONNRESET; goto out_now; } sctp_alloc_a_strmoq(stcb, sp); if (sp == NULL) { SCTP_LTRACE_ERR_RET(NULL, stcb, net, SCTP_FROM_SCTP_OUTPUT, ENOMEM); *error = ENOMEM; goto out_now; } sp->act_flags = 0; sp->sender_all_done = 0; sp->sinfo_flags = srcv->sinfo_flags; sp->timetolive = srcv->sinfo_timetolive; sp->ppid = srcv->sinfo_ppid; sp->context = srcv->sinfo_context; sp->fsn = 0; (void)SCTP_GETTIME_TIMEVAL(&sp->ts); sp->stream = srcv->sinfo_stream; sp->length = (uint32_t) min(uio->uio_resid, max_send_len); if ((sp->length == (uint32_t) uio->uio_resid) && ((user_marks_eor == 0) || (srcv->sinfo_flags & SCTP_EOF) || (user_marks_eor && (srcv->sinfo_flags & SCTP_EOR)))) { sp->msg_is_complete = 1; } else { sp->msg_is_complete = 0; } sp->sender_all_done = 0; sp->some_taken = 0; sp->put_last_out = 0; resv_in_first = sizeof(struct sctp_data_chunk); sp->data = sp->tail_mbuf = NULL; if (sp->length == 0) { *error = 0; goto skip_copy; } if (srcv->sinfo_keynumber_valid) { sp->auth_keyid = srcv->sinfo_keynumber; } else { sp->auth_keyid = stcb->asoc.authinfo.active_keyid; } if (sctp_auth_is_required_chunk(SCTP_DATA, stcb->asoc.peer_auth_chunks)) { sctp_auth_key_acquire(stcb, sp->auth_keyid); sp->holds_key_ref = 1; } *error = sctp_copy_one(sp, uio, resv_in_first); skip_copy: if (*error) { sctp_free_a_strmoq(stcb, sp, SCTP_SO_LOCKED); sp = NULL; } else { if (sp->sinfo_flags & SCTP_ADDR_OVER) { sp->net = net; atomic_add_int(&sp->net->ref_count, 1); } else { sp->net = NULL; } sctp_set_prsctp_policy(sp); } out_now: return (sp); } int sctp_sosend(struct socket *so, struct sockaddr *addr, struct uio *uio, struct mbuf *top, struct mbuf *control, int flags, struct thread *p ) { int error, use_sndinfo = 0; struct sctp_sndrcvinfo sndrcvninfo; struct sockaddr *addr_to_use; #if defined(INET) && defined(INET6) struct sockaddr_in sin; #endif if (control) { /* process cmsg snd/rcv info (maybe a assoc-id) */ if (sctp_find_cmsg(SCTP_SNDRCV, (void *)&sndrcvninfo, control, sizeof(sndrcvninfo))) { /* got one */ use_sndinfo = 1; } } addr_to_use = addr; #if defined(INET) && defined(INET6) if ((addr) && (addr->sa_family == AF_INET6)) { struct sockaddr_in6 *sin6; sin6 = (struct sockaddr_in6 *)addr; if (IN6_IS_ADDR_V4MAPPED(&sin6->sin6_addr)) { in6_sin6_2_sin(&sin, sin6); addr_to_use = (struct sockaddr *)&sin; } } #endif error = sctp_lower_sosend(so, addr_to_use, uio, top, control, flags, use_sndinfo ? &sndrcvninfo : NULL ,p ); return (error); } int sctp_lower_sosend(struct socket *so, struct sockaddr *addr, struct uio *uio, struct mbuf *i_pak, struct mbuf *control, int flags, struct sctp_sndrcvinfo *srcv , struct thread *p ) { unsigned int sndlen = 0, max_len; int error, len; struct mbuf *top = NULL; int queue_only = 0, queue_only_for_init = 0; int free_cnt_applied = 0; int un_sent; int now_filled = 0; unsigned int inqueue_bytes = 0; struct sctp_block_entry be; struct sctp_inpcb *inp; struct sctp_tcb *stcb = NULL; struct timeval now; struct sctp_nets *net; struct sctp_association *asoc; struct sctp_inpcb *t_inp; int user_marks_eor; int create_lock_applied = 0; int nagle_applies = 0; int some_on_control = 0; int got_all_of_the_send = 0; int hold_tcblock = 0; int non_blocking = 0; uint32_t local_add_more, local_soresv = 0; uint16_t port; uint16_t sinfo_flags; sctp_assoc_t sinfo_assoc_id; error = 0; net = NULL; stcb = NULL; asoc = NULL; t_inp = inp = (struct sctp_inpcb *)so->so_pcb; if (inp == NULL) { SCTP_LTRACE_ERR_RET(NULL, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; if (i_pak) { SCTP_RELEASE_PKT(i_pak); } return (error); } if ((uio == NULL) && (i_pak == NULL)) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); return (EINVAL); } user_marks_eor = sctp_is_feature_on(inp, SCTP_PCB_FLAGS_EXPLICIT_EOR); atomic_add_int(&inp->total_sends, 1); if (uio) { if (uio->uio_resid < 0) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); return (EINVAL); } sndlen = (unsigned int)uio->uio_resid; } else { top = SCTP_HEADER_TO_CHAIN(i_pak); sndlen = SCTP_HEADER_LEN(i_pak); } SCTPDBG(SCTP_DEBUG_OUTPUT1, "Send called addr:%p send length %d\n", (void *)addr, sndlen); if ((inp->sctp_flags & SCTP_PCB_FLAGS_TCPTYPE) && (inp->sctp_socket->so_qlimit)) { /* The listener can NOT send */ SCTP_LTRACE_ERR_RET(NULL, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, ENOTCONN); error = ENOTCONN; goto out_unlocked; } /** * Pre-screen address, if one is given the sin-len * must be set correctly! */ if (addr) { union sctp_sockstore *raddr = (union sctp_sockstore *)addr; switch (raddr->sa.sa_family) { #ifdef INET case AF_INET: if (raddr->sin.sin_len != sizeof(struct sockaddr_in)) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } port = raddr->sin.sin_port; break; #endif #ifdef INET6 case AF_INET6: if (raddr->sin6.sin6_len != sizeof(struct sockaddr_in6)) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } port = raddr->sin6.sin6_port; break; #endif default: SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EAFNOSUPPORT); error = EAFNOSUPPORT; goto out_unlocked; } } else port = 0; if (srcv) { sinfo_flags = srcv->sinfo_flags; sinfo_assoc_id = srcv->sinfo_assoc_id; if (INVALID_SINFO_FLAG(sinfo_flags) || PR_SCTP_INVALID_POLICY(sinfo_flags)) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } if (srcv->sinfo_flags) SCTP_STAT_INCR(sctps_sends_with_flags); } else { sinfo_flags = inp->def_send.sinfo_flags; sinfo_assoc_id = inp->def_send.sinfo_assoc_id; } if (sinfo_flags & SCTP_SENDALL) { /* its a sendall */ error = sctp_sendall(inp, uio, top, srcv); top = NULL; goto out_unlocked; } if ((sinfo_flags & SCTP_ADDR_OVER) && (addr == NULL)) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } /* now we must find the assoc */ if ((inp->sctp_flags & SCTP_PCB_FLAGS_CONNECTED) || (inp->sctp_flags & SCTP_PCB_FLAGS_IN_TCPPOOL)) { SCTP_INP_RLOCK(inp); stcb = LIST_FIRST(&inp->sctp_asoc_list); if (stcb) { SCTP_TCB_LOCK(stcb); hold_tcblock = 1; } SCTP_INP_RUNLOCK(inp); } else if (sinfo_assoc_id) { stcb = sctp_findassociation_ep_asocid(inp, sinfo_assoc_id, 0); } else if (addr) { /*- * Since we did not use findep we must * increment it, and if we don't find a tcb * decrement it. */ SCTP_INP_WLOCK(inp); SCTP_INP_INCR_REF(inp); SCTP_INP_WUNLOCK(inp); stcb = sctp_findassociation_ep_addr(&t_inp, addr, &net, NULL, NULL); if (stcb == NULL) { SCTP_INP_WLOCK(inp); SCTP_INP_DECR_REF(inp); SCTP_INP_WUNLOCK(inp); } else { hold_tcblock = 1; } } if ((stcb == NULL) && (addr)) { /* Possible implicit send? */ SCTP_ASOC_CREATE_LOCK(inp); create_lock_applied = 1; if ((inp->sctp_flags & SCTP_PCB_FLAGS_SOCKET_GONE) || (inp->sctp_flags & SCTP_PCB_FLAGS_SOCKET_ALLGONE)) { /* Should I really unlock ? */ SCTP_LTRACE_ERR_RET(NULL, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } if (((inp->sctp_flags & SCTP_PCB_FLAGS_BOUND_V6) == 0) && (addr->sa_family == AF_INET6)) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } SCTP_INP_WLOCK(inp); SCTP_INP_INCR_REF(inp); SCTP_INP_WUNLOCK(inp); /* With the lock applied look again */ stcb = sctp_findassociation_ep_addr(&t_inp, addr, &net, NULL, NULL); if ((stcb == NULL) && (control != NULL) && (port > 0)) { stcb = sctp_findassociation_cmsgs(&t_inp, port, control, &net, &error); } if (stcb == NULL) { SCTP_INP_WLOCK(inp); SCTP_INP_DECR_REF(inp); SCTP_INP_WUNLOCK(inp); } else { hold_tcblock = 1; } if (error) { goto out_unlocked; } if (t_inp != inp) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, ENOTCONN); error = ENOTCONN; goto out_unlocked; } } if (stcb == NULL) { if (addr == NULL) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, ENOENT); error = ENOENT; goto out_unlocked; } else { /* We must go ahead and start the INIT process */ uint32_t vrf_id; if ((sinfo_flags & SCTP_ABORT) || ((sinfo_flags & SCTP_EOF) && (sndlen == 0))) { /*- * User asks to abort a non-existant assoc, * or EOF a non-existant assoc with no data */ SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, ENOENT); error = ENOENT; goto out_unlocked; } /* get an asoc/stcb struct */ vrf_id = inp->def_vrf_id; #ifdef INVARIANTS if (create_lock_applied == 0) { panic("Error, should hold create lock and I don't?"); } #endif stcb = sctp_aloc_assoc(inp, addr, &error, 0, vrf_id, inp->sctp_ep.pre_open_stream_count, inp->sctp_ep.port, p); if (stcb == NULL) { /* Error is setup for us in the call */ goto out_unlocked; } if (stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_TCPTYPE) { stcb->sctp_ep->sctp_flags |= SCTP_PCB_FLAGS_CONNECTED; /* * Set the connected flag so we can queue * data */ soisconnecting(so); } hold_tcblock = 1; if (create_lock_applied) { SCTP_ASOC_CREATE_UNLOCK(inp); create_lock_applied = 0; } else { SCTP_PRINTF("Huh-3? create lock should have been on??\n"); } /* * Turn on queue only flag to prevent data from * being sent */ queue_only = 1; asoc = &stcb->asoc; SCTP_SET_STATE(asoc, SCTP_STATE_COOKIE_WAIT); (void)SCTP_GETTIME_TIMEVAL(&asoc->time_entered); /* initialize authentication params for the assoc */ sctp_initialize_auth_params(inp, stcb); if (control) { if (sctp_process_cmsgs_for_init(stcb, control, &error)) { sctp_free_assoc(inp, stcb, SCTP_PCBFREE_FORCE, SCTP_FROM_SCTP_OUTPUT + SCTP_LOC_5); hold_tcblock = 0; stcb = NULL; goto out_unlocked; } } /* out with the INIT */ queue_only_for_init = 1; /*- * we may want to dig in after this call and adjust the MTU * value. It defaulted to 1500 (constant) but the ro * structure may now have an update and thus we may need to * change it BEFORE we append the message. */ } } else asoc = &stcb->asoc; if (srcv == NULL) srcv = (struct sctp_sndrcvinfo *)&asoc->def_send; if (srcv->sinfo_flags & SCTP_ADDR_OVER) { if (addr) net = sctp_findnet(stcb, addr); else net = NULL; if ((net == NULL) || ((port != 0) && (port != stcb->rport))) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } } else { if (stcb->asoc.alternate) { net = stcb->asoc.alternate; } else { net = stcb->asoc.primary_destination; } } atomic_add_int(&stcb->total_sends, 1); /* Keep the stcb from being freed under our feet */ atomic_add_int(&asoc->refcnt, 1); free_cnt_applied = 1; if (sctp_is_feature_on(inp, SCTP_PCB_FLAGS_NO_FRAGMENT)) { if (sndlen > asoc->smallest_mtu) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EMSGSIZE); error = EMSGSIZE; goto out_unlocked; } } if (SCTP_SO_IS_NBIO(so) || (flags & MSG_NBIO) ) { non_blocking = 1; } /* would we block? */ if (non_blocking) { if (hold_tcblock == 0) { SCTP_TCB_LOCK(stcb); hold_tcblock = 1; } inqueue_bytes = stcb->asoc.total_output_queue_size - (stcb->asoc.chunks_on_out_queue * sizeof(struct sctp_data_chunk)); if ((SCTP_SB_LIMIT_SND(so) < (sndlen + inqueue_bytes + stcb->asoc.sb_send_resv)) || (stcb->asoc.chunks_on_out_queue >= SCTP_BASE_SYSCTL(sctp_max_chunks_on_queue))) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EWOULDBLOCK); if (sndlen > SCTP_SB_LIMIT_SND(so)) error = EMSGSIZE; else error = EWOULDBLOCK; goto out_unlocked; } stcb->asoc.sb_send_resv += sndlen; SCTP_TCB_UNLOCK(stcb); hold_tcblock = 0; } else { atomic_add_int(&stcb->asoc.sb_send_resv, sndlen); } local_soresv = sndlen; if (stcb->asoc.state & SCTP_STATE_ABOUT_TO_BE_FREED) { SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ECONNRESET); error = ECONNRESET; goto out_unlocked; } if (create_lock_applied) { SCTP_ASOC_CREATE_UNLOCK(inp); create_lock_applied = 0; } /* Is the stream no. valid? */ if (srcv->sinfo_stream >= asoc->streamoutcnt) { /* Invalid stream number */ SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } if ((asoc->strmout[srcv->sinfo_stream].state != SCTP_STREAM_OPEN) && (asoc->strmout[srcv->sinfo_stream].state != SCTP_STREAM_OPENING)) { /* * Can't queue any data while stream reset is underway. */ if (asoc->strmout[srcv->sinfo_stream].state > SCTP_STREAM_OPEN) { error = EAGAIN; } else { error = EINVAL; } SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, error); goto out_unlocked; } if ((SCTP_GET_STATE(asoc) == SCTP_STATE_COOKIE_WAIT) || (SCTP_GET_STATE(asoc) == SCTP_STATE_COOKIE_ECHOED)) { queue_only = 1; } /* we are now done with all control */ if (control) { sctp_m_freem(control); control = NULL; } if ((SCTP_GET_STATE(asoc) == SCTP_STATE_SHUTDOWN_SENT) || (SCTP_GET_STATE(asoc) == SCTP_STATE_SHUTDOWN_RECEIVED) || (SCTP_GET_STATE(asoc) == SCTP_STATE_SHUTDOWN_ACK_SENT) || (asoc->state & SCTP_STATE_SHUTDOWN_PENDING)) { if (srcv->sinfo_flags & SCTP_ABORT) { ; } else { SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ECONNRESET); error = ECONNRESET; goto out_unlocked; } } /* Ok, we will attempt a msgsnd :> */ if (p) { p->td_ru.ru_msgsnd++; } /* Are we aborting? */ if (srcv->sinfo_flags & SCTP_ABORT) { struct mbuf *mm; int tot_demand, tot_out = 0, max_out; SCTP_STAT_INCR(sctps_sends_with_abort); if ((SCTP_GET_STATE(asoc) == SCTP_STATE_COOKIE_WAIT) || (SCTP_GET_STATE(asoc) == SCTP_STATE_COOKIE_ECHOED)) { /* It has to be up before we abort */ /* how big is the user initiated abort? */ SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out; } if (hold_tcblock) { SCTP_TCB_UNLOCK(stcb); hold_tcblock = 0; } if (top) { struct mbuf *cntm = NULL; mm = sctp_get_mbuf_for_msg(sizeof(struct sctp_paramhdr), 0, M_WAITOK, 1, MT_DATA); if (sndlen != 0) { for (cntm = top; cntm; cntm = SCTP_BUF_NEXT(cntm)) { tot_out += SCTP_BUF_LEN(cntm); } } } else { /* Must fit in a MTU */ tot_out = sndlen; tot_demand = (tot_out + sizeof(struct sctp_paramhdr)); if (tot_demand > SCTP_DEFAULT_ADD_MORE) { /* To big */ SCTP_LTRACE_ERR_RET(NULL, stcb, net, SCTP_FROM_SCTP_OUTPUT, EMSGSIZE); error = EMSGSIZE; goto out; } mm = sctp_get_mbuf_for_msg(tot_demand, 0, M_WAITOK, 1, MT_DATA); } if (mm == NULL) { SCTP_LTRACE_ERR_RET(NULL, stcb, net, SCTP_FROM_SCTP_OUTPUT, ENOMEM); error = ENOMEM; goto out; } max_out = asoc->smallest_mtu - sizeof(struct sctp_paramhdr); max_out -= sizeof(struct sctp_abort_msg); if (tot_out > max_out) { tot_out = max_out; } if (mm) { struct sctp_paramhdr *ph; /* now move forward the data pointer */ ph = mtod(mm, struct sctp_paramhdr *); ph->param_type = htons(SCTP_CAUSE_USER_INITIATED_ABT); ph->param_length = htons((uint16_t) (sizeof(struct sctp_paramhdr) + tot_out)); ph++; SCTP_BUF_LEN(mm) = tot_out + sizeof(struct sctp_paramhdr); if (top == NULL) { error = uiomove((caddr_t)ph, (int)tot_out, uio); if (error) { /*- * Here if we can't get his data we * still abort we just don't get to * send the users note :-0 */ sctp_m_freem(mm); mm = NULL; } } else { if (sndlen != 0) { SCTP_BUF_NEXT(mm) = top; } } } if (hold_tcblock == 0) { SCTP_TCB_LOCK(stcb); } atomic_add_int(&stcb->asoc.refcnt, -1); free_cnt_applied = 0; /* release this lock, otherwise we hang on ourselves */ sctp_abort_an_association(stcb->sctp_ep, stcb, mm, SCTP_SO_LOCKED); /* now relock the stcb so everything is sane */ hold_tcblock = 0; stcb = NULL; /* * In this case top is already chained to mm avoid double * free, since we free it below if top != NULL and driver * would free it after sending the packet out */ if (sndlen != 0) { top = NULL; } goto out_unlocked; } /* Calculate the maximum we can send */ inqueue_bytes = stcb->asoc.total_output_queue_size - (stcb->asoc.chunks_on_out_queue * sizeof(struct sctp_data_chunk)); if (SCTP_SB_LIMIT_SND(so) > inqueue_bytes) { if (non_blocking) { /* we already checked for non-blocking above. */ max_len = sndlen; } else { max_len = SCTP_SB_LIMIT_SND(so) - inqueue_bytes; } } else { max_len = 0; } if (hold_tcblock) { SCTP_TCB_UNLOCK(stcb); hold_tcblock = 0; } if (asoc->strmout == NULL) { /* huh? software error */ SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EFAULT); error = EFAULT; goto out_unlocked; } /* Unless E_EOR mode is on, we must make a send FIT in one call. */ if ((user_marks_eor == 0) && (sndlen > SCTP_SB_LIMIT_SND(stcb->sctp_socket))) { /* It will NEVER fit */ SCTP_LTRACE_ERR_RET(NULL, stcb, net, SCTP_FROM_SCTP_OUTPUT, EMSGSIZE); error = EMSGSIZE; goto out_unlocked; } if ((uio == NULL) && user_marks_eor) { /*- * We do not support eeor mode for * sending with mbuf chains (like sendfile). */ SCTP_LTRACE_ERR_RET(NULL, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } if (user_marks_eor) { local_add_more = min(SCTP_SB_LIMIT_SND(so), SCTP_BASE_SYSCTL(sctp_add_more_threshold)); } else { /*- * For non-eeor the whole message must fit in * the socket send buffer. */ local_add_more = sndlen; } len = 0; if (non_blocking) { goto skip_preblock; } if (((max_len <= local_add_more) && (SCTP_SB_LIMIT_SND(so) >= local_add_more)) || (max_len == 0) || ((stcb->asoc.chunks_on_out_queue + stcb->asoc.stream_queue_cnt) >= SCTP_BASE_SYSCTL(sctp_max_chunks_on_queue))) { /* No room right now ! */ SOCKBUF_LOCK(&so->so_snd); inqueue_bytes = stcb->asoc.total_output_queue_size - (stcb->asoc.chunks_on_out_queue * sizeof(struct sctp_data_chunk)); while ((SCTP_SB_LIMIT_SND(so) < (inqueue_bytes + local_add_more)) || ((stcb->asoc.stream_queue_cnt + stcb->asoc.chunks_on_out_queue) >= SCTP_BASE_SYSCTL(sctp_max_chunks_on_queue))) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "pre_block limit:%u <(inq:%d + %d) || (%d+%d > %d)\n", (unsigned int)SCTP_SB_LIMIT_SND(so), inqueue_bytes, local_add_more, stcb->asoc.stream_queue_cnt, stcb->asoc.chunks_on_out_queue, SCTP_BASE_SYSCTL(sctp_max_chunks_on_queue)); if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_BLK_LOGGING_ENABLE) { sctp_log_block(SCTP_BLOCK_LOG_INTO_BLKA, asoc, sndlen); } be.error = 0; stcb->block_entry = &be; error = sbwait(&so->so_snd); stcb->block_entry = NULL; if (error || so->so_error || be.error) { if (error == 0) { if (so->so_error) error = so->so_error; if (be.error) { error = be.error; } } SOCKBUF_UNLOCK(&so->so_snd); goto out_unlocked; } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_BLK_LOGGING_ENABLE) { sctp_log_block(SCTP_BLOCK_LOG_OUTOF_BLK, asoc, stcb->asoc.total_output_queue_size); } if (stcb->asoc.state & SCTP_STATE_ABOUT_TO_BE_FREED) { SOCKBUF_UNLOCK(&so->so_snd); goto out_unlocked; } inqueue_bytes = stcb->asoc.total_output_queue_size - (stcb->asoc.chunks_on_out_queue * sizeof(struct sctp_data_chunk)); } if (SCTP_SB_LIMIT_SND(so) > inqueue_bytes) { max_len = SCTP_SB_LIMIT_SND(so) - inqueue_bytes; } else { max_len = 0; } SOCKBUF_UNLOCK(&so->so_snd); } skip_preblock: if (stcb->asoc.state & SCTP_STATE_ABOUT_TO_BE_FREED) { goto out_unlocked; } /* * sndlen covers for mbuf case uio_resid covers for the non-mbuf * case NOTE: uio will be null when top/mbuf is passed */ if (sndlen == 0) { if (srcv->sinfo_flags & SCTP_EOF) { got_all_of_the_send = 1; goto dataless_eof; } else { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out; } } if (top == NULL) { struct sctp_stream_queue_pending *sp; struct sctp_stream_out *strm; uint32_t sndout; SCTP_TCB_SEND_LOCK(stcb); if ((asoc->stream_locked) && (asoc->stream_locked_on != srcv->sinfo_stream)) { SCTP_TCB_SEND_UNLOCK(stcb); SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out; } SCTP_TCB_SEND_UNLOCK(stcb); strm = &stcb->asoc.strmout[srcv->sinfo_stream]; if (strm->last_msg_incomplete == 0) { do_a_copy_in: sp = sctp_copy_it_in(stcb, asoc, srcv, uio, net, max_len, user_marks_eor, &error); if ((sp == NULL) || (error)) { goto out; } SCTP_TCB_SEND_LOCK(stcb); if (sp->msg_is_complete) { strm->last_msg_incomplete = 0; asoc->stream_locked = 0; } else { /* * Just got locked to this guy in case of an * interrupt. */ strm->last_msg_incomplete = 1; if (stcb->asoc.idata_supported == 0) { asoc->stream_locked = 1; asoc->stream_locked_on = srcv->sinfo_stream; } sp->sender_all_done = 0; } sctp_snd_sb_alloc(stcb, sp->length); atomic_add_int(&asoc->stream_queue_cnt, 1); if (srcv->sinfo_flags & SCTP_UNORDERED) { SCTP_STAT_INCR(sctps_sends_with_unord); } TAILQ_INSERT_TAIL(&strm->outqueue, sp, next); stcb->asoc.ss_functions.sctp_ss_add_to_stream(stcb, asoc, strm, sp, 1); SCTP_TCB_SEND_UNLOCK(stcb); } else { SCTP_TCB_SEND_LOCK(stcb); sp = TAILQ_LAST(&strm->outqueue, sctp_streamhead); SCTP_TCB_SEND_UNLOCK(stcb); if (sp == NULL) { /* ???? Huh ??? last msg is gone */ #ifdef INVARIANTS panic("Warning: Last msg marked incomplete, yet nothing left?"); #else SCTP_PRINTF("Warning: Last msg marked incomplete, yet nothing left?\n"); strm->last_msg_incomplete = 0; #endif goto do_a_copy_in; } } while (uio->uio_resid > 0) { /* How much room do we have? */ struct mbuf *new_tail, *mm; if (SCTP_SB_LIMIT_SND(so) > stcb->asoc.total_output_queue_size) max_len = SCTP_SB_LIMIT_SND(so) - stcb->asoc.total_output_queue_size; else max_len = 0; if ((max_len > SCTP_BASE_SYSCTL(sctp_add_more_threshold)) || (max_len && (SCTP_SB_LIMIT_SND(so) < SCTP_BASE_SYSCTL(sctp_add_more_threshold))) || (uio->uio_resid && (uio->uio_resid <= (int)max_len))) { sndout = 0; new_tail = NULL; if (hold_tcblock) { SCTP_TCB_UNLOCK(stcb); hold_tcblock = 0; } mm = sctp_copy_resume(uio, max_len, user_marks_eor, &error, &sndout, &new_tail); if ((mm == NULL) || error) { if (mm) { sctp_m_freem(mm); } goto out; } /* Update the mbuf and count */ SCTP_TCB_SEND_LOCK(stcb); if (stcb->asoc.state & SCTP_STATE_ABOUT_TO_BE_FREED) { /* * we need to get out. Peer probably * aborted. */ sctp_m_freem(mm); if (stcb->asoc.state & SCTP_PCB_FLAGS_WAS_ABORTED) { SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ECONNRESET); error = ECONNRESET; } SCTP_TCB_SEND_UNLOCK(stcb); goto out; } if (sp->tail_mbuf) { /* tack it to the end */ SCTP_BUF_NEXT(sp->tail_mbuf) = mm; sp->tail_mbuf = new_tail; } else { /* A stolen mbuf */ sp->data = mm; sp->tail_mbuf = new_tail; } sctp_snd_sb_alloc(stcb, sndout); atomic_add_int(&sp->length, sndout); len += sndout; if (srcv->sinfo_flags & SCTP_SACK_IMMEDIATELY) { sp->sinfo_flags |= SCTP_SACK_IMMEDIATELY; } /* Did we reach EOR? */ if ((uio->uio_resid == 0) && ((user_marks_eor == 0) || (srcv->sinfo_flags & SCTP_EOF) || (user_marks_eor && (srcv->sinfo_flags & SCTP_EOR)))) { sp->msg_is_complete = 1; } else { sp->msg_is_complete = 0; } SCTP_TCB_SEND_UNLOCK(stcb); } if (uio->uio_resid == 0) { /* got it all? */ continue; } /* PR-SCTP? */ if ((asoc->prsctp_supported) && (asoc->sent_queue_cnt_removeable > 0)) { /* * This is ugly but we must assure locking * order */ if (hold_tcblock == 0) { SCTP_TCB_LOCK(stcb); hold_tcblock = 1; } sctp_prune_prsctp(stcb, asoc, srcv, sndlen); inqueue_bytes = stcb->asoc.total_output_queue_size - (stcb->asoc.chunks_on_out_queue * sizeof(struct sctp_data_chunk)); if (SCTP_SB_LIMIT_SND(so) > stcb->asoc.total_output_queue_size) max_len = SCTP_SB_LIMIT_SND(so) - inqueue_bytes; else max_len = 0; if (max_len > 0) { continue; } SCTP_TCB_UNLOCK(stcb); hold_tcblock = 0; } /* wait for space now */ if (non_blocking) { /* Non-blocking io in place out */ goto skip_out_eof; } /* What about the INIT, send it maybe */ if (queue_only_for_init) { if (hold_tcblock == 0) { SCTP_TCB_LOCK(stcb); hold_tcblock = 1; } if (SCTP_GET_STATE(&stcb->asoc) == SCTP_STATE_OPEN) { /* a collision took us forward? */ queue_only = 0; } else { sctp_send_initiate(inp, stcb, SCTP_SO_LOCKED); SCTP_SET_STATE(asoc, SCTP_STATE_COOKIE_WAIT); queue_only = 1; } } if ((net->flight_size > net->cwnd) && (asoc->sctp_cmt_on_off == 0)) { SCTP_STAT_INCR(sctps_send_cwnd_avoid); queue_only = 1; } else if (asoc->ifp_had_enobuf) { SCTP_STAT_INCR(sctps_ifnomemqueued); if (net->flight_size > (2 * net->mtu)) { queue_only = 1; } asoc->ifp_had_enobuf = 0; } un_sent = ((stcb->asoc.total_output_queue_size - stcb->asoc.total_flight) + (stcb->asoc.stream_queue_cnt * sizeof(struct sctp_data_chunk))); if ((sctp_is_feature_off(inp, SCTP_PCB_FLAGS_NODELAY)) && (stcb->asoc.total_flight > 0) && (stcb->asoc.stream_queue_cnt < SCTP_MAX_DATA_BUNDLING) && (un_sent < (int)(stcb->asoc.smallest_mtu - SCTP_MIN_OVERHEAD))) { /*- * Ok, Nagle is set on and we have data outstanding. * Don't send anything and let SACKs drive out the * data unless we have a "full" segment to send. */ if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_NAGLE_LOGGING_ENABLE) { sctp_log_nagle_event(stcb, SCTP_NAGLE_APPLIED); } SCTP_STAT_INCR(sctps_naglequeued); nagle_applies = 1; } else { if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_NAGLE_LOGGING_ENABLE) { if (sctp_is_feature_off(inp, SCTP_PCB_FLAGS_NODELAY)) sctp_log_nagle_event(stcb, SCTP_NAGLE_SKIPPED); } SCTP_STAT_INCR(sctps_naglesent); nagle_applies = 0; } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_BLK_LOGGING_ENABLE) { sctp_misc_ints(SCTP_CWNDLOG_PRESEND, queue_only_for_init, queue_only, nagle_applies, un_sent); sctp_misc_ints(SCTP_CWNDLOG_PRESEND, stcb->asoc.total_output_queue_size, stcb->asoc.total_flight, stcb->asoc.chunks_on_out_queue, stcb->asoc.total_flight_count); } if (queue_only_for_init) queue_only_for_init = 0; if ((queue_only == 0) && (nagle_applies == 0)) { /*- * need to start chunk output * before blocking.. note that if * a lock is already applied, then * the input via the net is happening * and I don't need to start output :-D */ if (hold_tcblock == 0) { if (SCTP_TCB_TRYLOCK(stcb)) { hold_tcblock = 1; sctp_chunk_output(inp, stcb, SCTP_OUTPUT_FROM_USR_SEND, SCTP_SO_LOCKED); } } else { sctp_chunk_output(inp, stcb, SCTP_OUTPUT_FROM_USR_SEND, SCTP_SO_LOCKED); } if (hold_tcblock == 1) { SCTP_TCB_UNLOCK(stcb); hold_tcblock = 0; } } SOCKBUF_LOCK(&so->so_snd); /*- * This is a bit strange, but I think it will * work. The total_output_queue_size is locked and * protected by the TCB_LOCK, which we just released. * There is a race that can occur between releasing it * above, and me getting the socket lock, where sacks * come in but we have not put the SB_WAIT on the * so_snd buffer to get the wakeup. After the LOCK * is applied the sack_processing will also need to * LOCK the so->so_snd to do the actual sowwakeup(). So * once we have the socket buffer lock if we recheck the * size we KNOW we will get to sleep safely with the * wakeup flag in place. */ if (SCTP_SB_LIMIT_SND(so) <= (stcb->asoc.total_output_queue_size + min(SCTP_BASE_SYSCTL(sctp_add_more_threshold), SCTP_SB_LIMIT_SND(so)))) { if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_BLK_LOGGING_ENABLE) { sctp_log_block(SCTP_BLOCK_LOG_INTO_BLK, asoc, (size_t)uio->uio_resid); } be.error = 0; stcb->block_entry = &be; error = sbwait(&so->so_snd); stcb->block_entry = NULL; if (error || so->so_error || be.error) { if (error == 0) { if (so->so_error) error = so->so_error; if (be.error) { error = be.error; } } SOCKBUF_UNLOCK(&so->so_snd); goto out_unlocked; } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_BLK_LOGGING_ENABLE) { sctp_log_block(SCTP_BLOCK_LOG_OUTOF_BLK, asoc, stcb->asoc.total_output_queue_size); } } SOCKBUF_UNLOCK(&so->so_snd); if (stcb->asoc.state & SCTP_STATE_ABOUT_TO_BE_FREED) { goto out_unlocked; } } SCTP_TCB_SEND_LOCK(stcb); if (sp) { if (sp->msg_is_complete == 0) { strm->last_msg_incomplete = 1; if (stcb->asoc.idata_supported == 0) { asoc->stream_locked = 1; asoc->stream_locked_on = srcv->sinfo_stream; } } else { sp->sender_all_done = 1; strm->last_msg_incomplete = 0; asoc->stream_locked = 0; } } else { SCTP_PRINTF("Huh no sp TSNH?\n"); strm->last_msg_incomplete = 0; asoc->stream_locked = 0; } SCTP_TCB_SEND_UNLOCK(stcb); if (uio->uio_resid == 0) { got_all_of_the_send = 1; } } else { /* We send in a 0, since we do NOT have any locks */ error = sctp_msg_append(stcb, net, top, srcv, 0); top = NULL; if (srcv->sinfo_flags & SCTP_EOF) { /* * This should only happen for Panda for the mbuf * send case, which does NOT yet support EEOR mode. * Thus, we can just set this flag to do the proper * EOF handling. */ got_all_of_the_send = 1; } } if (error) { goto out; } dataless_eof: /* EOF thing ? */ if ((srcv->sinfo_flags & SCTP_EOF) && (got_all_of_the_send == 1)) { int cnt; SCTP_STAT_INCR(sctps_sends_with_eof); error = 0; if (hold_tcblock == 0) { SCTP_TCB_LOCK(stcb); hold_tcblock = 1; } cnt = sctp_is_there_unsent_data(stcb, SCTP_SO_LOCKED); if (TAILQ_EMPTY(&asoc->send_queue) && TAILQ_EMPTY(&asoc->sent_queue) && (cnt == 0)) { if (asoc->locked_on_sending) { goto abort_anyway; } /* there is nothing queued to send, so I'm done... */ if ((SCTP_GET_STATE(asoc) != SCTP_STATE_SHUTDOWN_SENT) && (SCTP_GET_STATE(asoc) != SCTP_STATE_SHUTDOWN_RECEIVED) && (SCTP_GET_STATE(asoc) != SCTP_STATE_SHUTDOWN_ACK_SENT)) { struct sctp_nets *netp; /* only send SHUTDOWN the first time through */ if (SCTP_GET_STATE(asoc) == SCTP_STATE_OPEN) { SCTP_STAT_DECR_GAUGE32(sctps_currestab); } SCTP_SET_STATE(asoc, SCTP_STATE_SHUTDOWN_SENT); SCTP_CLEAR_SUBSTATE(asoc, SCTP_STATE_SHUTDOWN_PENDING); sctp_stop_timers_for_shutdown(stcb); if (stcb->asoc.alternate) { netp = stcb->asoc.alternate; } else { netp = stcb->asoc.primary_destination; } sctp_send_shutdown(stcb, netp); sctp_timer_start(SCTP_TIMER_TYPE_SHUTDOWN, stcb->sctp_ep, stcb, netp); sctp_timer_start(SCTP_TIMER_TYPE_SHUTDOWNGUARD, stcb->sctp_ep, stcb, asoc->primary_destination); } } else { /*- * we still got (or just got) data to send, so set * SHUTDOWN_PENDING */ /*- * XXX sockets draft says that SCTP_EOF should be * sent with no data. currently, we will allow user * data to be sent first and move to * SHUTDOWN-PENDING */ if ((SCTP_GET_STATE(asoc) != SCTP_STATE_SHUTDOWN_SENT) && (SCTP_GET_STATE(asoc) != SCTP_STATE_SHUTDOWN_RECEIVED) && (SCTP_GET_STATE(asoc) != SCTP_STATE_SHUTDOWN_ACK_SENT)) { if (hold_tcblock == 0) { SCTP_TCB_LOCK(stcb); hold_tcblock = 1; } if (asoc->locked_on_sending) { /* Locked to send out the data */ struct sctp_stream_queue_pending *sp; sp = TAILQ_LAST(&asoc->locked_on_sending->outqueue, sctp_streamhead); if (sp) { if ((sp->length == 0) && (sp->msg_is_complete == 0)) asoc->state |= SCTP_STATE_PARTIAL_MSG_LEFT; } } asoc->state |= SCTP_STATE_SHUTDOWN_PENDING; if (TAILQ_EMPTY(&asoc->send_queue) && TAILQ_EMPTY(&asoc->sent_queue) && (asoc->state & SCTP_STATE_PARTIAL_MSG_LEFT)) { struct mbuf *op_err; char msg[SCTP_DIAG_INFO_LEN]; abort_anyway: if (free_cnt_applied) { atomic_add_int(&stcb->asoc.refcnt, -1); free_cnt_applied = 0; } snprintf(msg, sizeof(msg), "%s:%d at %s", __FILE__, __LINE__, __func__); op_err = sctp_generate_cause(SCTP_BASE_SYSCTL(sctp_diag_info_code), msg); sctp_abort_an_association(stcb->sctp_ep, stcb, op_err, SCTP_SO_LOCKED); /* * now relock the stcb so everything * is sane */ hold_tcblock = 0; stcb = NULL; goto out; } sctp_timer_start(SCTP_TIMER_TYPE_SHUTDOWNGUARD, stcb->sctp_ep, stcb, asoc->primary_destination); sctp_feature_off(inp, SCTP_PCB_FLAGS_NODELAY); } } } skip_out_eof: if (!TAILQ_EMPTY(&stcb->asoc.control_send_queue)) { some_on_control = 1; } if (queue_only_for_init) { if (hold_tcblock == 0) { SCTP_TCB_LOCK(stcb); hold_tcblock = 1; } if (SCTP_GET_STATE(&stcb->asoc) == SCTP_STATE_OPEN) { /* a collision took us forward? */ queue_only = 0; } else { sctp_send_initiate(inp, stcb, SCTP_SO_LOCKED); SCTP_SET_STATE(&stcb->asoc, SCTP_STATE_COOKIE_WAIT); queue_only = 1; } } if ((net->flight_size > net->cwnd) && (stcb->asoc.sctp_cmt_on_off == 0)) { SCTP_STAT_INCR(sctps_send_cwnd_avoid); queue_only = 1; } else if (asoc->ifp_had_enobuf) { SCTP_STAT_INCR(sctps_ifnomemqueued); if (net->flight_size > (2 * net->mtu)) { queue_only = 1; } asoc->ifp_had_enobuf = 0; } un_sent = ((stcb->asoc.total_output_queue_size - stcb->asoc.total_flight) + (stcb->asoc.stream_queue_cnt * sizeof(struct sctp_data_chunk))); if ((sctp_is_feature_off(inp, SCTP_PCB_FLAGS_NODELAY)) && (stcb->asoc.total_flight > 0) && (stcb->asoc.stream_queue_cnt < SCTP_MAX_DATA_BUNDLING) && (un_sent < (int)(stcb->asoc.smallest_mtu - SCTP_MIN_OVERHEAD))) { /*- * Ok, Nagle is set on and we have data outstanding. * Don't send anything and let SACKs drive out the * data unless wen have a "full" segment to send. */ if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_NAGLE_LOGGING_ENABLE) { sctp_log_nagle_event(stcb, SCTP_NAGLE_APPLIED); } SCTP_STAT_INCR(sctps_naglequeued); nagle_applies = 1; } else { if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_NAGLE_LOGGING_ENABLE) { if (sctp_is_feature_off(inp, SCTP_PCB_FLAGS_NODELAY)) sctp_log_nagle_event(stcb, SCTP_NAGLE_SKIPPED); } SCTP_STAT_INCR(sctps_naglesent); nagle_applies = 0; } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_BLK_LOGGING_ENABLE) { sctp_misc_ints(SCTP_CWNDLOG_PRESEND, queue_only_for_init, queue_only, nagle_applies, un_sent); sctp_misc_ints(SCTP_CWNDLOG_PRESEND, stcb->asoc.total_output_queue_size, stcb->asoc.total_flight, stcb->asoc.chunks_on_out_queue, stcb->asoc.total_flight_count); } if ((queue_only == 0) && (nagle_applies == 0) && (stcb->asoc.peers_rwnd && un_sent)) { /* we can attempt to send too. */ if (hold_tcblock == 0) { /* * If there is activity recv'ing sacks no need to * send */ if (SCTP_TCB_TRYLOCK(stcb)) { sctp_chunk_output(inp, stcb, SCTP_OUTPUT_FROM_USR_SEND, SCTP_SO_LOCKED); hold_tcblock = 1; } } else { sctp_chunk_output(inp, stcb, SCTP_OUTPUT_FROM_USR_SEND, SCTP_SO_LOCKED); } } else if ((queue_only == 0) && (stcb->asoc.peers_rwnd == 0) && (stcb->asoc.total_flight == 0)) { /* We get to have a probe outstanding */ if (hold_tcblock == 0) { hold_tcblock = 1; SCTP_TCB_LOCK(stcb); } sctp_chunk_output(inp, stcb, SCTP_OUTPUT_FROM_USR_SEND, SCTP_SO_LOCKED); } else if (some_on_control) { int num_out, reason, frag_point; /* Here we do control only */ if (hold_tcblock == 0) { hold_tcblock = 1; SCTP_TCB_LOCK(stcb); } frag_point = sctp_get_frag_point(stcb, &stcb->asoc); (void)sctp_med_chunk_output(inp, stcb, &stcb->asoc, &num_out, &reason, 1, 1, &now, &now_filled, frag_point, SCTP_SO_LOCKED); } SCTPDBG(SCTP_DEBUG_OUTPUT1, "USR Send complete qo:%d prw:%d unsent:%d tf:%d cooq:%d toqs:%d err:%d\n", queue_only, stcb->asoc.peers_rwnd, un_sent, stcb->asoc.total_flight, stcb->asoc.chunks_on_out_queue, stcb->asoc.total_output_queue_size, error); out: out_unlocked: if (local_soresv && stcb) { atomic_subtract_int(&stcb->asoc.sb_send_resv, sndlen); } if (create_lock_applied) { SCTP_ASOC_CREATE_UNLOCK(inp); } if ((stcb) && hold_tcblock) { SCTP_TCB_UNLOCK(stcb); } if (stcb && free_cnt_applied) { atomic_add_int(&stcb->asoc.refcnt, -1); } #ifdef INVARIANTS if (stcb) { if (mtx_owned(&stcb->tcb_mtx)) { panic("Leaving with tcb mtx owned?"); } if (mtx_owned(&stcb->tcb_send_mtx)) { panic("Leaving with tcb send mtx owned?"); } } #endif if (top) { sctp_m_freem(top); } if (control) { sctp_m_freem(control); } return (error); } /* * generate an AUTHentication chunk, if required */ struct mbuf * sctp_add_auth_chunk(struct mbuf *m, struct mbuf **m_end, struct sctp_auth_chunk **auth_ret, uint32_t * offset, struct sctp_tcb *stcb, uint8_t chunk) { struct mbuf *m_auth; struct sctp_auth_chunk *auth; int chunk_len; struct mbuf *cn; if ((m_end == NULL) || (auth_ret == NULL) || (offset == NULL) || (stcb == NULL)) return (m); if (stcb->asoc.auth_supported == 0) { return (m); } /* does the requested chunk require auth? */ if (!sctp_auth_is_required_chunk(chunk, stcb->asoc.peer_auth_chunks)) { return (m); } m_auth = sctp_get_mbuf_for_msg(sizeof(*auth), 0, M_NOWAIT, 1, MT_HEADER); if (m_auth == NULL) { /* no mbuf's */ return (m); } /* reserve some space if this will be the first mbuf */ if (m == NULL) SCTP_BUF_RESV_UF(m_auth, SCTP_MIN_OVERHEAD); /* fill in the AUTH chunk details */ auth = mtod(m_auth, struct sctp_auth_chunk *); bzero(auth, sizeof(*auth)); auth->ch.chunk_type = SCTP_AUTHENTICATION; auth->ch.chunk_flags = 0; chunk_len = sizeof(*auth) + sctp_get_hmac_digest_len(stcb->asoc.peer_hmac_id); auth->ch.chunk_length = htons(chunk_len); auth->hmac_id = htons(stcb->asoc.peer_hmac_id); /* key id and hmac digest will be computed and filled in upon send */ /* save the offset where the auth was inserted into the chain */ *offset = 0; for (cn = m; cn; cn = SCTP_BUF_NEXT(cn)) { *offset += SCTP_BUF_LEN(cn); } /* update length and return pointer to the auth chunk */ SCTP_BUF_LEN(m_auth) = chunk_len; m = sctp_copy_mbufchain(m_auth, m, m_end, 1, chunk_len, 0); if (auth_ret != NULL) *auth_ret = auth; return (m); } #ifdef INET6 int sctp_v6src_match_nexthop(struct sockaddr_in6 *src6, sctp_route_t * ro) { struct nd_prefix *pfx = NULL; struct nd_pfxrouter *pfxrtr = NULL; struct sockaddr_in6 gw6; if (ro == NULL || ro->ro_rt == NULL || src6->sin6_family != AF_INET6) return (0); /* get prefix entry of address */ LIST_FOREACH(pfx, &MODULE_GLOBAL(nd_prefix), ndpr_entry) { if (pfx->ndpr_stateflags & NDPRF_DETACHED) continue; if (IN6_ARE_MASKED_ADDR_EQUAL(&pfx->ndpr_prefix.sin6_addr, &src6->sin6_addr, &pfx->ndpr_mask)) break; } /* no prefix entry in the prefix list */ if (pfx == NULL) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "No prefix entry for "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, (struct sockaddr *)src6); return (0); } SCTPDBG(SCTP_DEBUG_OUTPUT2, "v6src_match_nexthop(), Prefix entry is "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, (struct sockaddr *)src6); /* search installed gateway from prefix entry */ LIST_FOREACH(pfxrtr, &pfx->ndpr_advrtrs, pfr_entry) { memset(&gw6, 0, sizeof(struct sockaddr_in6)); gw6.sin6_family = AF_INET6; gw6.sin6_len = sizeof(struct sockaddr_in6); memcpy(&gw6.sin6_addr, &pfxrtr->router->rtaddr, sizeof(struct in6_addr)); SCTPDBG(SCTP_DEBUG_OUTPUT2, "prefix router is "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, (struct sockaddr *)&gw6); SCTPDBG(SCTP_DEBUG_OUTPUT2, "installed router is "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, ro->ro_rt->rt_gateway); if (sctp_cmpaddr((struct sockaddr *)&gw6, ro->ro_rt->rt_gateway)) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "pfxrouter is installed\n"); return (1); } } SCTPDBG(SCTP_DEBUG_OUTPUT2, "pfxrouter is not installed\n"); return (0); } #endif int sctp_v4src_match_nexthop(struct sctp_ifa *sifa, sctp_route_t * ro) { #ifdef INET struct sockaddr_in *sin, *mask; struct ifaddr *ifa; struct in_addr srcnetaddr, gwnetaddr; if (ro == NULL || ro->ro_rt == NULL || sifa->address.sa.sa_family != AF_INET) { return (0); } ifa = (struct ifaddr *)sifa->ifa; mask = (struct sockaddr_in *)(ifa->ifa_netmask); sin = &sifa->address.sin; srcnetaddr.s_addr = (sin->sin_addr.s_addr & mask->sin_addr.s_addr); SCTPDBG(SCTP_DEBUG_OUTPUT1, "match_nexthop4: src address is "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, &sifa->address.sa); SCTPDBG(SCTP_DEBUG_OUTPUT1, "network address is %x\n", srcnetaddr.s_addr); sin = (struct sockaddr_in *)ro->ro_rt->rt_gateway; gwnetaddr.s_addr = (sin->sin_addr.s_addr & mask->sin_addr.s_addr); SCTPDBG(SCTP_DEBUG_OUTPUT1, "match_nexthop4: nexthop is "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, ro->ro_rt->rt_gateway); SCTPDBG(SCTP_DEBUG_OUTPUT1, "network address is %x\n", gwnetaddr.s_addr); if (srcnetaddr.s_addr == gwnetaddr.s_addr) { return (1); } #endif return (0); } Index: user/alc/PQ_LAUNDRY/sys/netinet6/nd6.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/netinet6/nd6.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/netinet6/nd6.c (revision 303206) @@ -1,2691 +1,2698 @@ /*- * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the project nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $KAME: nd6.c,v 1.144 2001/05/24 07:44:00 itojun Exp $ */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #define ND6_SLOWTIMER_INTERVAL (60 * 60) /* 1 hour */ #define ND6_RECALC_REACHTM_INTERVAL (60 * 120) /* 2 hours */ #define SIN6(s) ((const struct sockaddr_in6 *)(s)) MALLOC_DEFINE(M_IP6NDP, "ip6ndp", "IPv6 Neighbor Discovery"); /* timer values */ VNET_DEFINE(int, nd6_prune) = 1; /* walk list every 1 seconds */ VNET_DEFINE(int, nd6_delay) = 5; /* delay first probe time 5 second */ VNET_DEFINE(int, nd6_umaxtries) = 3; /* maximum unicast query */ VNET_DEFINE(int, nd6_mmaxtries) = 3; /* maximum multicast query */ VNET_DEFINE(int, nd6_useloopback) = 1; /* use loopback interface for * local traffic */ VNET_DEFINE(int, nd6_gctimer) = (60 * 60 * 24); /* 1 day: garbage * collection timer */ /* preventing too many loops in ND option parsing */ static VNET_DEFINE(int, nd6_maxndopt) = 10; /* max # of ND options allowed */ VNET_DEFINE(int, nd6_maxnudhint) = 0; /* max # of subsequent upper * layer hints */ static VNET_DEFINE(int, nd6_maxqueuelen) = 1; /* max pkts cached in unresolved * ND entries */ #define V_nd6_maxndopt VNET(nd6_maxndopt) #define V_nd6_maxqueuelen VNET(nd6_maxqueuelen) #ifdef ND6_DEBUG VNET_DEFINE(int, nd6_debug) = 1; #else VNET_DEFINE(int, nd6_debug) = 0; #endif static eventhandler_tag lle_event_eh, iflladdr_event_eh; VNET_DEFINE(struct nd_drhead, nd_defrouter); VNET_DEFINE(struct nd_prhead, nd_prefix); VNET_DEFINE(struct rwlock, nd6_lock); VNET_DEFINE(int, nd6_recalc_reachtm_interval) = ND6_RECALC_REACHTM_INTERVAL; #define V_nd6_recalc_reachtm_interval VNET(nd6_recalc_reachtm_interval) int (*send_sendso_input_hook)(struct mbuf *, struct ifnet *, int, int); static int nd6_is_new_addr_neighbor(const struct sockaddr_in6 *, struct ifnet *); static void nd6_setmtu0(struct ifnet *, struct nd_ifinfo *); static void nd6_slowtimo(void *); static int regen_tmpaddr(struct in6_ifaddr *); static void nd6_free(struct llentry **, int); static void nd6_free_redirect(const struct llentry *); static void nd6_llinfo_timer(void *); static void nd6_llinfo_settimer_locked(struct llentry *, long); static void clear_llinfo_pqueue(struct llentry *); static void nd6_rtrequest(int, struct rtentry *, struct rt_addrinfo *); static int nd6_resolve_slow(struct ifnet *, int, struct mbuf *, const struct sockaddr_in6 *, u_char *, uint32_t *, struct llentry **); static int nd6_need_cache(struct ifnet *); static VNET_DEFINE(struct callout, nd6_slowtimo_ch); #define V_nd6_slowtimo_ch VNET(nd6_slowtimo_ch) VNET_DEFINE(struct callout, nd6_timer_ch); #define V_nd6_timer_ch VNET(nd6_timer_ch) static void nd6_lle_event(void *arg __unused, struct llentry *lle, int evt) { struct rt_addrinfo rtinfo; struct sockaddr_in6 dst; struct sockaddr_dl gw; struct ifnet *ifp; int type; LLE_WLOCK_ASSERT(lle); if (lltable_get_af(lle->lle_tbl) != AF_INET6) return; switch (evt) { case LLENTRY_RESOLVED: type = RTM_ADD; KASSERT(lle->la_flags & LLE_VALID, ("%s: %p resolved but not valid?", __func__, lle)); break; case LLENTRY_EXPIRED: type = RTM_DELETE; break; default: return; } ifp = lltable_get_ifp(lle->lle_tbl); bzero(&dst, sizeof(dst)); bzero(&gw, sizeof(gw)); bzero(&rtinfo, sizeof(rtinfo)); lltable_fill_sa_entry(lle, (struct sockaddr *)&dst); dst.sin6_scope_id = in6_getscopezone(ifp, in6_addrscope(&dst.sin6_addr)); gw.sdl_len = sizeof(struct sockaddr_dl); gw.sdl_family = AF_LINK; gw.sdl_alen = ifp->if_addrlen; gw.sdl_index = ifp->if_index; gw.sdl_type = ifp->if_type; if (evt == LLENTRY_RESOLVED) bcopy(lle->ll_addr, gw.sdl_data, ifp->if_addrlen); rtinfo.rti_info[RTAX_DST] = (struct sockaddr *)&dst; rtinfo.rti_info[RTAX_GATEWAY] = (struct sockaddr *)&gw; rtinfo.rti_addrs = RTA_DST | RTA_GATEWAY; rt_missmsg_fib(type, &rtinfo, RTF_HOST | RTF_LLDATA | ( type == RTM_ADD ? RTF_UP: 0), 0, RT_DEFAULT_FIB); } /* * A handler for interface link layer address change event. */ static void nd6_iflladdr(void *arg __unused, struct ifnet *ifp) { lltable_update_ifaddr(LLTABLE6(ifp)); } void nd6_init(void) { rw_init(&V_nd6_lock, "nd6"); LIST_INIT(&V_nd_prefix); /* initialization of the default router list */ TAILQ_INIT(&V_nd_defrouter); /* Start timers. */ callout_init(&V_nd6_slowtimo_ch, 0); callout_reset(&V_nd6_slowtimo_ch, ND6_SLOWTIMER_INTERVAL * hz, nd6_slowtimo, curvnet); callout_init(&V_nd6_timer_ch, 0); callout_reset(&V_nd6_timer_ch, hz, nd6_timer, curvnet); nd6_dad_init(); if (IS_DEFAULT_VNET(curvnet)) { lle_event_eh = EVENTHANDLER_REGISTER(lle_event, nd6_lle_event, NULL, EVENTHANDLER_PRI_ANY); iflladdr_event_eh = EVENTHANDLER_REGISTER(iflladdr_event, nd6_iflladdr, NULL, EVENTHANDLER_PRI_ANY); } } #ifdef VIMAGE void nd6_destroy() { callout_drain(&V_nd6_slowtimo_ch); callout_drain(&V_nd6_timer_ch); if (IS_DEFAULT_VNET(curvnet)) { EVENTHANDLER_DEREGISTER(lle_event, lle_event_eh); EVENTHANDLER_DEREGISTER(iflladdr_event, iflladdr_event_eh); } rw_destroy(&V_nd6_lock); } #endif struct nd_ifinfo * nd6_ifattach(struct ifnet *ifp) { struct nd_ifinfo *nd; nd = malloc(sizeof(*nd), M_IP6NDP, M_WAITOK | M_ZERO); nd->initialized = 1; nd->chlim = IPV6_DEFHLIM; nd->basereachable = REACHABLE_TIME; nd->reachable = ND_COMPUTE_RTIME(nd->basereachable); nd->retrans = RETRANS_TIMER; nd->flags = ND6_IFF_PERFORMNUD; /* A loopback interface always has ND6_IFF_AUTO_LINKLOCAL. * XXXHRS: Clear ND6_IFF_AUTO_LINKLOCAL on an IFT_BRIDGE interface by * default regardless of the V_ip6_auto_linklocal configuration to * give a reasonable default behavior. */ if ((V_ip6_auto_linklocal && ifp->if_type != IFT_BRIDGE) || (ifp->if_flags & IFF_LOOPBACK)) nd->flags |= ND6_IFF_AUTO_LINKLOCAL; /* * A loopback interface does not need to accept RTADV. * XXXHRS: Clear ND6_IFF_ACCEPT_RTADV on an IFT_BRIDGE interface by * default regardless of the V_ip6_accept_rtadv configuration to * prevent the interface from accepting RA messages arrived * on one of the member interfaces with ND6_IFF_ACCEPT_RTADV. */ if (V_ip6_accept_rtadv && !(ifp->if_flags & IFF_LOOPBACK) && (ifp->if_type != IFT_BRIDGE)) nd->flags |= ND6_IFF_ACCEPT_RTADV; if (V_ip6_no_radr && !(ifp->if_flags & IFF_LOOPBACK)) nd->flags |= ND6_IFF_NO_RADR; /* XXX: we cannot call nd6_setmtu since ifp is not fully initialized */ nd6_setmtu0(ifp, nd); return nd; } void nd6_ifdetach(struct ifnet *ifp, struct nd_ifinfo *nd) { struct ifaddr *ifa, *next; IF_ADDR_RLOCK(ifp); TAILQ_FOREACH_SAFE(ifa, &ifp->if_addrhead, ifa_link, next) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; /* stop DAD processing */ nd6_dad_stop(ifa); } IF_ADDR_RUNLOCK(ifp); free(nd, M_IP6NDP); } /* * Reset ND level link MTU. This function is called when the physical MTU * changes, which means we might have to adjust the ND level MTU. */ void nd6_setmtu(struct ifnet *ifp) { if (ifp->if_afdata[AF_INET6] == NULL) return; nd6_setmtu0(ifp, ND_IFINFO(ifp)); } /* XXX todo: do not maintain copy of ifp->if_mtu in ndi->maxmtu */ void nd6_setmtu0(struct ifnet *ifp, struct nd_ifinfo *ndi) { u_int32_t omaxmtu; omaxmtu = ndi->maxmtu; switch (ifp->if_type) { case IFT_ARCNET: ndi->maxmtu = MIN(ARC_PHDS_MAXMTU, ifp->if_mtu); /* RFC2497 */ break; case IFT_FDDI: ndi->maxmtu = MIN(FDDIIPMTU, ifp->if_mtu); /* RFC2467 */ break; case IFT_ISO88025: ndi->maxmtu = MIN(ISO88025_MAX_MTU, ifp->if_mtu); break; default: ndi->maxmtu = ifp->if_mtu; break; } /* * Decreasing the interface MTU under IPV6 minimum MTU may cause * undesirable situation. We thus notify the operator of the change * explicitly. The check for omaxmtu is necessary to restrict the * log to the case of changing the MTU, not initializing it. */ if (omaxmtu >= IPV6_MMTU && ndi->maxmtu < IPV6_MMTU) { log(LOG_NOTICE, "nd6_setmtu0: " "new link MTU on %s (%lu) is too small for IPv6\n", if_name(ifp), (unsigned long)ndi->maxmtu); } if (ndi->maxmtu > V_in6_maxmtu) in6_setmaxmtu(); /* check all interfaces just in case */ } void nd6_option_init(void *opt, int icmp6len, union nd_opts *ndopts) { bzero(ndopts, sizeof(*ndopts)); ndopts->nd_opts_search = (struct nd_opt_hdr *)opt; ndopts->nd_opts_last = (struct nd_opt_hdr *)(((u_char *)opt) + icmp6len); if (icmp6len == 0) { ndopts->nd_opts_done = 1; ndopts->nd_opts_search = NULL; } } /* * Take one ND option. */ struct nd_opt_hdr * nd6_option(union nd_opts *ndopts) { struct nd_opt_hdr *nd_opt; int olen; KASSERT(ndopts != NULL, ("%s: ndopts == NULL", __func__)); KASSERT(ndopts->nd_opts_last != NULL, ("%s: uninitialized ndopts", __func__)); if (ndopts->nd_opts_search == NULL) return NULL; if (ndopts->nd_opts_done) return NULL; nd_opt = ndopts->nd_opts_search; /* make sure nd_opt_len is inside the buffer */ if ((caddr_t)&nd_opt->nd_opt_len >= (caddr_t)ndopts->nd_opts_last) { bzero(ndopts, sizeof(*ndopts)); return NULL; } olen = nd_opt->nd_opt_len << 3; if (olen == 0) { /* * Message validation requires that all included * options have a length that is greater than zero. */ bzero(ndopts, sizeof(*ndopts)); return NULL; } ndopts->nd_opts_search = (struct nd_opt_hdr *)((caddr_t)nd_opt + olen); if (ndopts->nd_opts_search > ndopts->nd_opts_last) { /* option overruns the end of buffer, invalid */ bzero(ndopts, sizeof(*ndopts)); return NULL; } else if (ndopts->nd_opts_search == ndopts->nd_opts_last) { /* reached the end of options chain */ ndopts->nd_opts_done = 1; ndopts->nd_opts_search = NULL; } return nd_opt; } /* * Parse multiple ND options. * This function is much easier to use, for ND routines that do not need * multiple options of the same type. */ int nd6_options(union nd_opts *ndopts) { struct nd_opt_hdr *nd_opt; int i = 0; KASSERT(ndopts != NULL, ("%s: ndopts == NULL", __func__)); KASSERT(ndopts->nd_opts_last != NULL, ("%s: uninitialized ndopts", __func__)); if (ndopts->nd_opts_search == NULL) return 0; while (1) { nd_opt = nd6_option(ndopts); if (nd_opt == NULL && ndopts->nd_opts_last == NULL) { /* * Message validation requires that all included * options have a length that is greater than zero. */ ICMP6STAT_INC(icp6s_nd_badopt); bzero(ndopts, sizeof(*ndopts)); return -1; } if (nd_opt == NULL) goto skip1; switch (nd_opt->nd_opt_type) { case ND_OPT_SOURCE_LINKADDR: case ND_OPT_TARGET_LINKADDR: case ND_OPT_MTU: case ND_OPT_REDIRECTED_HEADER: case ND_OPT_NONCE: if (ndopts->nd_opt_array[nd_opt->nd_opt_type]) { nd6log((LOG_INFO, "duplicated ND6 option found (type=%d)\n", nd_opt->nd_opt_type)); /* XXX bark? */ } else { ndopts->nd_opt_array[nd_opt->nd_opt_type] = nd_opt; } break; case ND_OPT_PREFIX_INFORMATION: if (ndopts->nd_opt_array[nd_opt->nd_opt_type] == 0) { ndopts->nd_opt_array[nd_opt->nd_opt_type] = nd_opt; } ndopts->nd_opts_pi_end = (struct nd_opt_prefix_info *)nd_opt; break; /* What about ND_OPT_ROUTE_INFO? RFC 4191 */ case ND_OPT_RDNSS: /* RFC 6106 */ case ND_OPT_DNSSL: /* RFC 6106 */ /* * Silently ignore options we know and do not care about * in the kernel. */ break; default: /* * Unknown options must be silently ignored, * to accommodate future extension to the protocol. */ nd6log((LOG_DEBUG, "nd6_options: unsupported option %d - " "option ignored\n", nd_opt->nd_opt_type)); } skip1: i++; if (i > V_nd6_maxndopt) { ICMP6STAT_INC(icp6s_nd_toomanyopt); nd6log((LOG_INFO, "too many loop in nd opt\n")); break; } if (ndopts->nd_opts_done) break; } return 0; } /* * ND6 timer routine to handle ND6 entries */ static void nd6_llinfo_settimer_locked(struct llentry *ln, long tick) { int canceled; LLE_WLOCK_ASSERT(ln); if (tick < 0) { ln->la_expire = 0; ln->ln_ntick = 0; canceled = callout_stop(&ln->lle_timer); } else { ln->la_expire = time_uptime + tick / hz; LLE_ADDREF(ln); if (tick > INT_MAX) { ln->ln_ntick = tick - INT_MAX; canceled = callout_reset(&ln->lle_timer, INT_MAX, nd6_llinfo_timer, ln); } else { ln->ln_ntick = 0; canceled = callout_reset(&ln->lle_timer, tick, nd6_llinfo_timer, ln); } } if (canceled > 0) LLE_REMREF(ln); } /* * Gets source address of the first packet in hold queue * and stores it in @src. * Returns pointer to @src (if hold queue is not empty) or NULL. * * Set noinline to be dtrace-friendly */ static __noinline struct in6_addr * nd6_llinfo_get_holdsrc(struct llentry *ln, struct in6_addr *src) { struct ip6_hdr hdr; struct mbuf *m; if (ln->la_hold == NULL) return (NULL); /* * assume every packet in la_hold has the same IP header */ m = ln->la_hold; if (sizeof(hdr) > m->m_len) return (NULL); m_copydata(m, 0, sizeof(hdr), (caddr_t)&hdr); *src = hdr.ip6_src; return (src); } /* * Checks if we need to switch from STALE state. * * RFC 4861 requires switching from STALE to DELAY state * on first packet matching entry, waiting V_nd6_delay and * transition to PROBE state (if upper layer confirmation was * not received). * * This code performs a bit differently: * On packet hit we don't change state (but desired state * can be guessed by control plane). However, after V_nd6_delay * seconds code will transition to PROBE state (so DELAY state * is kinda skipped in most situations). * * Typically, V_nd6_gctimer is bigger than V_nd6_delay, so * we perform the following upon entering STALE state: * * 1) Arm timer to run each V_nd6_delay seconds to make sure that * if packet was transmitted at the start of given interval, we * would be able to switch to PROBE state in V_nd6_delay seconds * as user expects. * * 2) Reschedule timer until original V_nd6_gctimer expires keeping * lle in STALE state (remaining timer value stored in lle_remtime). * * 3) Reschedule timer if packet was transmitted less that V_nd6_delay * seconds ago. * * Returns non-zero value if the entry is still STALE (storing * the next timer interval in @pdelay). * * Returns zero value if original timer expired or we need to switch to * PROBE (store that in @do_switch variable). */ static int nd6_is_stale(struct llentry *lle, long *pdelay, int *do_switch) { int nd_delay, nd_gctimer, r_skip_req; time_t lle_hittime; long delay; *do_switch = 0; nd_gctimer = V_nd6_gctimer; nd_delay = V_nd6_delay; LLE_REQ_LOCK(lle); r_skip_req = lle->r_skip_req; lle_hittime = lle->lle_hittime; LLE_REQ_UNLOCK(lle); if (r_skip_req > 0) { /* * Nonzero r_skip_req value was set upon entering * STALE state. Since value was not changed, no * packets were passed using this lle. Ask for * timer reschedule and keep STALE state. */ delay = (long)(MIN(nd_gctimer, nd_delay)); delay *= hz; if (lle->lle_remtime > delay) lle->lle_remtime -= delay; else { delay = lle->lle_remtime; lle->lle_remtime = 0; } if (delay == 0) { /* * The original ng6_gctime timeout ended, * no more rescheduling. */ return (0); } *pdelay = delay; return (1); } /* * Packet received. Verify timestamp */ delay = (long)(time_uptime - lle_hittime); if (delay < nd_delay) { /* * V_nd6_delay still not passed since the first * hit in STALE state. * Reshedule timer and return. */ *pdelay = (long)(nd_delay - delay) * hz; return (1); } /* Request switching to probe */ *do_switch = 1; return (0); } /* * Switch @lle state to new state optionally arming timers. * * Set noinline to be dtrace-friendly */ __noinline void nd6_llinfo_setstate(struct llentry *lle, int newstate) { struct ifnet *ifp; int nd_gctimer, nd_delay; long delay, remtime; delay = 0; remtime = 0; switch (newstate) { case ND6_LLINFO_INCOMPLETE: ifp = lle->lle_tbl->llt_ifp; delay = (long)ND_IFINFO(ifp)->retrans * hz / 1000; break; case ND6_LLINFO_REACHABLE: if (!ND6_LLINFO_PERMANENT(lle)) { ifp = lle->lle_tbl->llt_ifp; delay = (long)ND_IFINFO(ifp)->reachable * hz; } break; case ND6_LLINFO_STALE: /* * Notify fast path that we want to know if any packet * is transmitted by setting r_skip_req. */ LLE_REQ_LOCK(lle); lle->r_skip_req = 1; LLE_REQ_UNLOCK(lle); nd_delay = V_nd6_delay; nd_gctimer = V_nd6_gctimer; delay = (long)(MIN(nd_gctimer, nd_delay)) * hz; remtime = (long)nd_gctimer * hz - delay; break; case ND6_LLINFO_DELAY: lle->la_asked = 0; delay = (long)V_nd6_delay * hz; break; } if (delay > 0) nd6_llinfo_settimer_locked(lle, delay); lle->lle_remtime = remtime; lle->ln_state = newstate; } /* * Timer-dependent part of nd state machine. * * Set noinline to be dtrace-friendly */ static __noinline void nd6_llinfo_timer(void *arg) { struct llentry *ln; struct in6_addr *dst, *pdst, *psrc, src; struct ifnet *ifp; struct nd_ifinfo *ndi; int do_switch, send_ns; long delay; KASSERT(arg != NULL, ("%s: arg NULL", __func__)); ln = (struct llentry *)arg; ifp = lltable_get_ifp(ln->lle_tbl); CURVNET_SET(ifp->if_vnet); ND6_RLOCK(); LLE_WLOCK(ln); if (callout_pending(&ln->lle_timer)) { /* * Here we are a bit odd here in the treatment of * active/pending. If the pending bit is set, it got * rescheduled before I ran. The active * bit we ignore, since if it was stopped * in ll_tablefree() and was currently running * it would have return 0 so the code would * not have deleted it since the callout could * not be stopped so we want to go through * with the delete here now. If the callout * was restarted, the pending bit will be back on and * we just want to bail since the callout_reset would * return 1 and our reference would have been removed * by nd6_llinfo_settimer_locked above since canceled * would have been 1. */ LLE_WUNLOCK(ln); ND6_RUNLOCK(); CURVNET_RESTORE(); return; } ndi = ND_IFINFO(ifp); send_ns = 0; dst = &ln->r_l3addr.addr6; pdst = dst; if (ln->ln_ntick > 0) { if (ln->ln_ntick > INT_MAX) { ln->ln_ntick -= INT_MAX; nd6_llinfo_settimer_locked(ln, INT_MAX); } else { ln->ln_ntick = 0; nd6_llinfo_settimer_locked(ln, ln->ln_ntick); } goto done; } if (ln->la_flags & LLE_STATIC) { goto done; } if (ln->la_flags & LLE_DELETED) { nd6_free(&ln, 0); goto done; } switch (ln->ln_state) { case ND6_LLINFO_INCOMPLETE: if (ln->la_asked < V_nd6_mmaxtries) { ln->la_asked++; send_ns = 1; /* Send NS to multicast address */ pdst = NULL; } else { struct mbuf *m = ln->la_hold; if (m) { struct mbuf *m0; /* * assuming every packet in la_hold has the * same IP header. Send error after unlock. */ m0 = m->m_nextpkt; m->m_nextpkt = NULL; ln->la_hold = m0; clear_llinfo_pqueue(ln); } nd6_free(&ln, 0); if (m != NULL) icmp6_error2(m, ICMP6_DST_UNREACH, ICMP6_DST_UNREACH_ADDR, 0, ifp); } break; case ND6_LLINFO_REACHABLE: if (!ND6_LLINFO_PERMANENT(ln)) nd6_llinfo_setstate(ln, ND6_LLINFO_STALE); break; case ND6_LLINFO_STALE: if (nd6_is_stale(ln, &delay, &do_switch) != 0) { /* * No packet has used this entry and GC timeout * has not been passed. Reshedule timer and * return. */ nd6_llinfo_settimer_locked(ln, delay); break; } if (do_switch == 0) { /* * GC timer has ended and entry hasn't been used. * Run Garbage collector (RFC 4861, 5.3) */ if (!ND6_LLINFO_PERMANENT(ln)) nd6_free(&ln, 1); break; } /* Entry has been used AND delay timer has ended. */ /* FALLTHROUGH */ case ND6_LLINFO_DELAY: if (ndi && (ndi->flags & ND6_IFF_PERFORMNUD) != 0) { /* We need NUD */ ln->la_asked = 1; nd6_llinfo_setstate(ln, ND6_LLINFO_PROBE); send_ns = 1; } else nd6_llinfo_setstate(ln, ND6_LLINFO_STALE); /* XXX */ break; case ND6_LLINFO_PROBE: if (ln->la_asked < V_nd6_umaxtries) { ln->la_asked++; send_ns = 1; } else { nd6_free(&ln, 0); } break; default: panic("%s: paths in a dark night can be confusing: %d", __func__, ln->ln_state); } done: if (ln != NULL) ND6_RUNLOCK(); if (send_ns != 0) { nd6_llinfo_settimer_locked(ln, (long)ndi->retrans * hz / 1000); psrc = nd6_llinfo_get_holdsrc(ln, &src); LLE_FREE_LOCKED(ln); ln = NULL; nd6_ns_output(ifp, psrc, pdst, dst, NULL); } if (ln != NULL) LLE_FREE_LOCKED(ln); CURVNET_RESTORE(); } /* * ND6 timer routine to expire default route list and prefix list */ void nd6_timer(void *arg) { CURVNET_SET((struct vnet *) arg); struct nd_drhead drq; struct nd_defrouter *dr, *ndr; struct nd_prefix *pr, *npr; struct in6_ifaddr *ia6, *nia6; TAILQ_INIT(&drq); /* expire default router list */ ND6_WLOCK(); TAILQ_FOREACH_SAFE(dr, &V_nd_defrouter, dr_entry, ndr) if (dr->expire && dr->expire < time_uptime) defrouter_unlink(dr, &drq); ND6_WUNLOCK(); while ((dr = TAILQ_FIRST(&drq)) != NULL) { TAILQ_REMOVE(&drq, dr, dr_entry); defrouter_del(dr); } /* * expire interface addresses. * in the past the loop was inside prefix expiry processing. * However, from a stricter speci-confrmance standpoint, we should * rather separate address lifetimes and prefix lifetimes. * * XXXRW: in6_ifaddrhead locking. */ addrloop: TAILQ_FOREACH_SAFE(ia6, &V_in6_ifaddrhead, ia_link, nia6) { /* check address lifetime */ if (IFA6_IS_INVALID(ia6)) { int regen = 0; /* * If the expiring address is temporary, try * regenerating a new one. This would be useful when * we suspended a laptop PC, then turned it on after a * period that could invalidate all temporary * addresses. Although we may have to restart the * loop (see below), it must be after purging the * address. Otherwise, we'd see an infinite loop of * regeneration. */ if (V_ip6_use_tempaddr && (ia6->ia6_flags & IN6_IFF_TEMPORARY) != 0) { if (regen_tmpaddr(ia6) == 0) regen = 1; } in6_purgeaddr(&ia6->ia_ifa); if (regen) goto addrloop; /* XXX: see below */ } else if (IFA6_IS_DEPRECATED(ia6)) { int oldflags = ia6->ia6_flags; ia6->ia6_flags |= IN6_IFF_DEPRECATED; /* * If a temporary address has just become deprecated, * regenerate a new one if possible. */ if (V_ip6_use_tempaddr && (ia6->ia6_flags & IN6_IFF_TEMPORARY) != 0 && (oldflags & IN6_IFF_DEPRECATED) == 0) { if (regen_tmpaddr(ia6) == 0) { /* * A new temporary address is * generated. * XXX: this means the address chain * has changed while we are still in * the loop. Although the change * would not cause disaster (because * it's not a deletion, but an * addition,) we'd rather restart the * loop just for safety. Or does this * significantly reduce performance?? */ goto addrloop; } } } else if ((ia6->ia6_flags & IN6_IFF_TENTATIVE) != 0) { /* * Schedule DAD for a tentative address. This happens * if the interface was down or not running * when the address was configured. */ int delay; delay = arc4random() % (MAX_RTR_SOLICITATION_DELAY * hz); nd6_dad_start((struct ifaddr *)ia6, delay); } else { /* * Check status of the interface. If it is down, * mark the address as tentative for future DAD. */ if ((ia6->ia_ifp->if_flags & IFF_UP) == 0 || (ia6->ia_ifp->if_drv_flags & IFF_DRV_RUNNING) == 0 || (ND_IFINFO(ia6->ia_ifp)->flags & ND6_IFF_IFDISABLED) != 0) { ia6->ia6_flags &= ~IN6_IFF_DUPLICATED; ia6->ia6_flags |= IN6_IFF_TENTATIVE; } /* * A new RA might have made a deprecated address * preferred. */ ia6->ia6_flags &= ~IN6_IFF_DEPRECATED; } } /* expire prefix list */ LIST_FOREACH_SAFE(pr, &V_nd_prefix, ndpr_entry, npr) { /* * check prefix lifetime. * since pltime is just for autoconf, pltime processing for * prefix is not necessary. */ if (pr->ndpr_vltime != ND6_INFINITE_LIFETIME && time_uptime - pr->ndpr_lastupdate > pr->ndpr_vltime) { /* * address expiration and prefix expiration are * separate. NEVER perform in6_purgeaddr here. */ prelist_remove(pr); } } callout_reset(&V_nd6_timer_ch, V_nd6_prune * hz, nd6_timer, curvnet); CURVNET_RESTORE(); } /* * ia6 - deprecated/invalidated temporary address */ static int regen_tmpaddr(struct in6_ifaddr *ia6) { struct ifaddr *ifa; struct ifnet *ifp; struct in6_ifaddr *public_ifa6 = NULL; ifp = ia6->ia_ifa.ifa_ifp; IF_ADDR_RLOCK(ifp); TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { struct in6_ifaddr *it6; if (ifa->ifa_addr->sa_family != AF_INET6) continue; it6 = (struct in6_ifaddr *)ifa; /* ignore no autoconf addresses. */ if ((it6->ia6_flags & IN6_IFF_AUTOCONF) == 0) continue; /* ignore autoconf addresses with different prefixes. */ if (it6->ia6_ndpr == NULL || it6->ia6_ndpr != ia6->ia6_ndpr) continue; /* * Now we are looking at an autoconf address with the same * prefix as ours. If the address is temporary and is still * preferred, do not create another one. It would be rare, but * could happen, for example, when we resume a laptop PC after * a long period. */ if ((it6->ia6_flags & IN6_IFF_TEMPORARY) != 0 && !IFA6_IS_DEPRECATED(it6)) { public_ifa6 = NULL; break; } /* * This is a public autoconf address that has the same prefix * as ours. If it is preferred, keep it. We can't break the * loop here, because there may be a still-preferred temporary * address with the prefix. */ if (!IFA6_IS_DEPRECATED(it6)) public_ifa6 = it6; } if (public_ifa6 != NULL) ifa_ref(&public_ifa6->ia_ifa); IF_ADDR_RUNLOCK(ifp); if (public_ifa6 != NULL) { int e; if ((e = in6_tmpifadd(public_ifa6, 0, 0)) != 0) { ifa_free(&public_ifa6->ia_ifa); log(LOG_NOTICE, "regen_tmpaddr: failed to create a new" " tmp addr,errno=%d\n", e); return (-1); } ifa_free(&public_ifa6->ia_ifa); return (0); } return (-1); } /* * Remove prefix and default router list entries corresponding to ifp. Neighbor * cache entries are freed in in6_domifdetach(). */ void nd6_purge(struct ifnet *ifp) { struct nd_drhead drq; struct nd_defrouter *dr, *ndr; struct nd_prefix *pr, *npr; TAILQ_INIT(&drq); /* * Nuke default router list entries toward ifp. * We defer removal of default router list entries that is installed * in the routing table, in order to keep additional side effects as * small as possible. */ ND6_WLOCK(); TAILQ_FOREACH_SAFE(dr, &V_nd_defrouter, dr_entry, ndr) { if (dr->installed) continue; if (dr->ifp == ifp) defrouter_unlink(dr, &drq); } TAILQ_FOREACH_SAFE(dr, &V_nd_defrouter, dr_entry, ndr) { if (!dr->installed) continue; if (dr->ifp == ifp) defrouter_unlink(dr, &drq); } ND6_WUNLOCK(); while ((dr = TAILQ_FIRST(&drq)) != NULL) { TAILQ_REMOVE(&drq, dr, dr_entry); defrouter_del(dr); } /* Nuke prefix list entries toward ifp */ LIST_FOREACH_SAFE(pr, &V_nd_prefix, ndpr_entry, npr) { if (pr->ndpr_ifp == ifp) { /* * Because if_detach() does *not* release prefixes * while purging addresses the reference count will * still be above zero. We therefore reset it to * make sure that the prefix really gets purged. */ pr->ndpr_refcnt = 0; prelist_remove(pr); } } /* cancel default outgoing interface setting */ if (V_nd6_defifindex == ifp->if_index) nd6_setdefaultiface(0); if (ND_IFINFO(ifp)->flags & ND6_IFF_ACCEPT_RTADV) { /* Refresh default router list. */ defrouter_select(); } } /* * the caller acquires and releases the lock on the lltbls * Returns the llentry locked */ struct llentry * nd6_lookup(const struct in6_addr *addr6, int flags, struct ifnet *ifp) { struct sockaddr_in6 sin6; struct llentry *ln; bzero(&sin6, sizeof(sin6)); sin6.sin6_len = sizeof(struct sockaddr_in6); sin6.sin6_family = AF_INET6; sin6.sin6_addr = *addr6; IF_AFDATA_LOCK_ASSERT(ifp); ln = lla_lookup(LLTABLE6(ifp), flags, (struct sockaddr *)&sin6); return (ln); } struct llentry * nd6_alloc(const struct in6_addr *addr6, int flags, struct ifnet *ifp) { struct sockaddr_in6 sin6; struct llentry *ln; bzero(&sin6, sizeof(sin6)); sin6.sin6_len = sizeof(struct sockaddr_in6); sin6.sin6_family = AF_INET6; sin6.sin6_addr = *addr6; ln = lltable_alloc_entry(LLTABLE6(ifp), 0, (struct sockaddr *)&sin6); if (ln != NULL) ln->ln_state = ND6_LLINFO_NOSTATE; return (ln); } /* * Test whether a given IPv6 address is a neighbor or not, ignoring * the actual neighbor cache. The neighbor cache is ignored in order * to not reenter the routing code from within itself. */ static int nd6_is_new_addr_neighbor(const struct sockaddr_in6 *addr, struct ifnet *ifp) { struct nd_prefix *pr; struct ifaddr *dstaddr; struct rt_addrinfo info; struct sockaddr_in6 rt_key; struct sockaddr *dst6; int fibnum; /* * A link-local address is always a neighbor. * XXX: a link does not necessarily specify a single interface. */ if (IN6_IS_ADDR_LINKLOCAL(&addr->sin6_addr)) { struct sockaddr_in6 sin6_copy; u_int32_t zone; /* * We need sin6_copy since sa6_recoverscope() may modify the * content (XXX). */ sin6_copy = *addr; if (sa6_recoverscope(&sin6_copy)) return (0); /* XXX: should be impossible */ if (in6_setscope(&sin6_copy.sin6_addr, ifp, &zone)) return (0); if (sin6_copy.sin6_scope_id == zone) return (1); else return (0); } bzero(&rt_key, sizeof(rt_key)); bzero(&info, sizeof(info)); info.rti_info[RTAX_DST] = (struct sockaddr *)&rt_key; /* Always use the default FIB here. XXME - why? */ fibnum = RT_DEFAULT_FIB; /* * If the address matches one of our addresses, * it should be a neighbor. * If the address matches one of our on-link prefixes, it should be a * neighbor. */ LIST_FOREACH(pr, &V_nd_prefix, ndpr_entry) { if (pr->ndpr_ifp != ifp) continue; if (!(pr->ndpr_stateflags & NDPRF_ONLINK)) { /* Always use the default FIB here. */ dst6 = (struct sockaddr *)&pr->ndpr_prefix; /* Restore length field before retrying lookup */ rt_key.sin6_len = sizeof(rt_key); if (rib_lookup_info(fibnum, dst6, 0, 0, &info) != 0) continue; /* * This is the case where multiple interfaces * have the same prefix, but only one is installed * into the routing table and that prefix entry * is not the one being examined here. In the case * where RADIX_MPATH is enabled, multiple route * entries (of the same rt_key value) will be * installed because the interface addresses all * differ. */ if (!IN6_ARE_ADDR_EQUAL(&pr->ndpr_prefix.sin6_addr, &rt_key.sin6_addr)) continue; } if (IN6_ARE_MASKED_ADDR_EQUAL(&pr->ndpr_prefix.sin6_addr, &addr->sin6_addr, &pr->ndpr_mask)) return (1); } /* * If the address is assigned on the node of the other side of * a p2p interface, the address should be a neighbor. */ dstaddr = ifa_ifwithdstaddr((const struct sockaddr *)addr, RT_ALL_FIBS); if (dstaddr != NULL) { if (dstaddr->ifa_ifp == ifp) { ifa_free(dstaddr); return (1); } ifa_free(dstaddr); } /* * If the default router list is empty, all addresses are regarded * as on-link, and thus, as a neighbor. */ if (ND_IFINFO(ifp)->flags & ND6_IFF_ACCEPT_RTADV && TAILQ_EMPTY(&V_nd_defrouter) && V_nd6_defifindex == ifp->if_index) { return (1); } return (0); } /* * Detect if a given IPv6 address identifies a neighbor on a given link. * XXX: should take care of the destination of a p2p link? */ int nd6_is_addr_neighbor(const struct sockaddr_in6 *addr, struct ifnet *ifp) { struct llentry *lle; int rc = 0; IF_AFDATA_UNLOCK_ASSERT(ifp); if (nd6_is_new_addr_neighbor(addr, ifp)) return (1); /* * Even if the address matches none of our addresses, it might be * in the neighbor cache. */ IF_AFDATA_RLOCK(ifp); if ((lle = nd6_lookup(&addr->sin6_addr, 0, ifp)) != NULL) { LLE_RUNLOCK(lle); rc = 1; } IF_AFDATA_RUNLOCK(ifp); return (rc); } /* * Free an nd6 llinfo entry. * Since the function would cause significant changes in the kernel, DO NOT * make it global, unless you have a strong reason for the change, and are sure * that the change is safe. * * Set noinline to be dtrace-friendly */ static __noinline void nd6_free(struct llentry **lnp, int gc) { struct ifnet *ifp; struct llentry *ln; struct nd_defrouter *dr; ln = *lnp; *lnp = NULL; LLE_WLOCK_ASSERT(ln); ND6_RLOCK_ASSERT(); ifp = lltable_get_ifp(ln->lle_tbl); if ((ND_IFINFO(ifp)->flags & ND6_IFF_ACCEPT_RTADV) != 0) dr = defrouter_lookup_locked(&ln->r_l3addr.addr6, ifp); else dr = NULL; ND6_RUNLOCK(); if ((ln->la_flags & LLE_DELETED) == 0) EVENTHANDLER_INVOKE(lle_event, ln, LLENTRY_EXPIRED); /* * we used to have pfctlinput(PRC_HOSTDEAD) here. * even though it is not harmful, it was not really necessary. */ /* cancel timer */ nd6_llinfo_settimer_locked(ln, -1); if (ND_IFINFO(ifp)->flags & ND6_IFF_ACCEPT_RTADV) { if (dr != NULL && dr->expire && ln->ln_state == ND6_LLINFO_STALE && gc) { /* * If the reason for the deletion is just garbage * collection, and the neighbor is an active default * router, do not delete it. Instead, reset the GC * timer using the router's lifetime. * Simply deleting the entry would affect default * router selection, which is not necessarily a good * thing, especially when we're using router preference * values. * XXX: the check for ln_state would be redundant, * but we intentionally keep it just in case. */ if (dr->expire > time_uptime) nd6_llinfo_settimer_locked(ln, (dr->expire - time_uptime) * hz); else nd6_llinfo_settimer_locked(ln, (long)V_nd6_gctimer * hz); LLE_REMREF(ln); LLE_WUNLOCK(ln); defrouter_rele(dr); return; } if (dr) { /* * Unreachablity of a router might affect the default * router selection and on-link detection of advertised * prefixes. */ /* * Temporarily fake the state to choose a new default * router and to perform on-link determination of * prefixes correctly. * Below the state will be set correctly, * or the entry itself will be deleted. */ ln->ln_state = ND6_LLINFO_INCOMPLETE; } if (ln->ln_router || dr) { /* * We need to unlock to avoid a LOR with rt6_flush() with the * rnh and for the calls to pfxlist_onlink_check() and * defrouter_select() in the block further down for calls * into nd6_lookup(). We still hold a ref. */ LLE_WUNLOCK(ln); /* * rt6_flush must be called whether or not the neighbor * is in the Default Router List. * See a corresponding comment in nd6_na_input(). */ rt6_flush(&ln->r_l3addr.addr6, ifp); } if (dr) { /* * Since defrouter_select() does not affect the * on-link determination and MIP6 needs the check * before the default router selection, we perform * the check now. */ pfxlist_onlink_check(); /* * Refresh default router list. */ defrouter_select(); } /* * If this entry was added by an on-link redirect, remove the * corresponding host route. */ if (ln->la_flags & LLE_REDIRECT) nd6_free_redirect(ln); if (ln->ln_router || dr) LLE_WLOCK(ln); } /* * Save to unlock. We still hold an extra reference and will not * free(9) in llentry_free() if someone else holds one as well. */ LLE_WUNLOCK(ln); IF_AFDATA_LOCK(ifp); LLE_WLOCK(ln); /* Guard against race with other llentry_free(). */ if (ln->la_flags & LLE_LINKED) { /* Remove callout reference */ LLE_REMREF(ln); lltable_unlink_entry(ln->lle_tbl, ln); } IF_AFDATA_UNLOCK(ifp); llentry_free(ln); if (dr != NULL) defrouter_rele(dr); } static int nd6_isdynrte(const struct rtentry *rt, void *xap) { if (rt->rt_flags == (RTF_UP | RTF_HOST | RTF_DYNAMIC)) return (1); return (0); } /* * Remove the rtentry for the given llentry, * both of which were installed by a redirect. */ static void nd6_free_redirect(const struct llentry *ln) { int fibnum; struct sockaddr_in6 sin6; struct rt_addrinfo info; lltable_fill_sa_entry(ln, (struct sockaddr *)&sin6); memset(&info, 0, sizeof(info)); info.rti_info[RTAX_DST] = (struct sockaddr *)&sin6; info.rti_filter = nd6_isdynrte; for (fibnum = 0; fibnum < rt_numfibs; fibnum++) rtrequest1_fib(RTM_DELETE, &info, NULL, fibnum); } /* * Rejuvenate this function for routing operations related * processing. */ void nd6_rtrequest(int req, struct rtentry *rt, struct rt_addrinfo *info) { struct sockaddr_in6 *gateway; struct nd_defrouter *dr; struct ifnet *ifp; gateway = (struct sockaddr_in6 *)rt->rt_gateway; ifp = rt->rt_ifp; switch (req) { case RTM_ADD: break; case RTM_DELETE: if (!ifp) return; /* * Only indirect routes are interesting. */ if ((rt->rt_flags & RTF_GATEWAY) == 0) return; /* * check for default route */ if (IN6_ARE_ADDR_EQUAL(&in6addr_any, &SIN6(rt_key(rt))->sin6_addr)) { dr = defrouter_lookup(&gateway->sin6_addr, ifp); if (dr != NULL) { dr->installed = 0; defrouter_rele(dr); } } break; } } int nd6_ioctl(u_long cmd, caddr_t data, struct ifnet *ifp) { struct in6_ndireq *ndi = (struct in6_ndireq *)data; struct in6_nbrinfo *nbi = (struct in6_nbrinfo *)data; struct in6_ndifreq *ndif = (struct in6_ndifreq *)data; int error = 0; if (ifp->if_afdata[AF_INET6] == NULL) return (EPFNOSUPPORT); switch (cmd) { case OSIOCGIFINFO_IN6: #define ND ndi->ndi /* XXX: old ndp(8) assumes a positive value for linkmtu. */ bzero(&ND, sizeof(ND)); ND.linkmtu = IN6_LINKMTU(ifp); ND.maxmtu = ND_IFINFO(ifp)->maxmtu; ND.basereachable = ND_IFINFO(ifp)->basereachable; ND.reachable = ND_IFINFO(ifp)->reachable; ND.retrans = ND_IFINFO(ifp)->retrans; ND.flags = ND_IFINFO(ifp)->flags; ND.recalctm = ND_IFINFO(ifp)->recalctm; ND.chlim = ND_IFINFO(ifp)->chlim; break; case SIOCGIFINFO_IN6: ND = *ND_IFINFO(ifp); break; case SIOCSIFINFO_IN6: /* * used to change host variables from userland. * intended for a use on router to reflect RA configurations. */ /* 0 means 'unspecified' */ if (ND.linkmtu != 0) { if (ND.linkmtu < IPV6_MMTU || ND.linkmtu > IN6_LINKMTU(ifp)) { error = EINVAL; break; } ND_IFINFO(ifp)->linkmtu = ND.linkmtu; } if (ND.basereachable != 0) { int obasereachable = ND_IFINFO(ifp)->basereachable; ND_IFINFO(ifp)->basereachable = ND.basereachable; if (ND.basereachable != obasereachable) ND_IFINFO(ifp)->reachable = ND_COMPUTE_RTIME(ND.basereachable); } if (ND.retrans != 0) ND_IFINFO(ifp)->retrans = ND.retrans; if (ND.chlim != 0) ND_IFINFO(ifp)->chlim = ND.chlim; /* FALLTHROUGH */ case SIOCSIFINFO_FLAGS: { struct ifaddr *ifa; struct in6_ifaddr *ia; if ((ND_IFINFO(ifp)->flags & ND6_IFF_IFDISABLED) && !(ND.flags & ND6_IFF_IFDISABLED)) { /* ifdisabled 1->0 transision */ /* * If the interface is marked as ND6_IFF_IFDISABLED and * has an link-local address with IN6_IFF_DUPLICATED, * do not clear ND6_IFF_IFDISABLED. * See RFC 4862, Section 5.4.5. */ IF_ADDR_RLOCK(ifp); TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; ia = (struct in6_ifaddr *)ifa; if ((ia->ia6_flags & IN6_IFF_DUPLICATED) && IN6_IS_ADDR_LINKLOCAL(IA6_IN6(ia))) break; } IF_ADDR_RUNLOCK(ifp); if (ifa != NULL) { /* LLA is duplicated. */ ND.flags |= ND6_IFF_IFDISABLED; log(LOG_ERR, "Cannot enable an interface" " with a link-local address marked" " duplicate.\n"); } else { ND_IFINFO(ifp)->flags &= ~ND6_IFF_IFDISABLED; if (ifp->if_flags & IFF_UP) in6_if_up(ifp); } } else if (!(ND_IFINFO(ifp)->flags & ND6_IFF_IFDISABLED) && (ND.flags & ND6_IFF_IFDISABLED)) { /* ifdisabled 0->1 transision */ /* Mark all IPv6 address as tentative. */ ND_IFINFO(ifp)->flags |= ND6_IFF_IFDISABLED; if (V_ip6_dad_count > 0 && (ND_IFINFO(ifp)->flags & ND6_IFF_NO_DAD) == 0) { IF_ADDR_RLOCK(ifp); TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; ia = (struct in6_ifaddr *)ifa; ia->ia6_flags |= IN6_IFF_TENTATIVE; } IF_ADDR_RUNLOCK(ifp); } } if (ND.flags & ND6_IFF_AUTO_LINKLOCAL) { if (!(ND_IFINFO(ifp)->flags & ND6_IFF_AUTO_LINKLOCAL)) { /* auto_linklocal 0->1 transision */ /* If no link-local address on ifp, configure */ ND_IFINFO(ifp)->flags |= ND6_IFF_AUTO_LINKLOCAL; in6_ifattach(ifp, NULL); } else if (!(ND.flags & ND6_IFF_IFDISABLED) && ifp->if_flags & IFF_UP) { /* * When the IF already has * ND6_IFF_AUTO_LINKLOCAL, no link-local * address is assigned, and IFF_UP, try to * assign one. */ IF_ADDR_RLOCK(ifp); TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; ia = (struct in6_ifaddr *)ifa; if (IN6_IS_ADDR_LINKLOCAL(IA6_IN6(ia))) break; } IF_ADDR_RUNLOCK(ifp); if (ifa != NULL) /* No LLA is configured. */ in6_ifattach(ifp, NULL); } } } ND_IFINFO(ifp)->flags = ND.flags; break; #undef ND case SIOCSNDFLUSH_IN6: /* XXX: the ioctl name is confusing... */ /* sync kernel routing table with the default router list */ defrouter_reset(); defrouter_select(); break; case SIOCSPFXFLUSH_IN6: { /* flush all the prefix advertised by routers */ struct nd_prefix *pr, *next; LIST_FOREACH_SAFE(pr, &V_nd_prefix, ndpr_entry, next) { struct in6_ifaddr *ia, *ia_next; if (IN6_IS_ADDR_LINKLOCAL(&pr->ndpr_prefix.sin6_addr)) continue; /* XXX */ /* do we really have to remove addresses as well? */ /* XXXRW: in6_ifaddrhead locking. */ TAILQ_FOREACH_SAFE(ia, &V_in6_ifaddrhead, ia_link, ia_next) { if ((ia->ia6_flags & IN6_IFF_AUTOCONF) == 0) continue; if (ia->ia6_ndpr == pr) in6_purgeaddr(&ia->ia_ifa); } prelist_remove(pr); } break; } case SIOCSRTRFLUSH_IN6: { /* flush all the default routers */ struct nd_drhead drq; struct nd_defrouter *dr; TAILQ_INIT(&drq); defrouter_reset(); ND6_WLOCK(); while ((dr = TAILQ_FIRST(&V_nd_defrouter)) != NULL) defrouter_unlink(dr, &drq); ND6_WUNLOCK(); while ((dr = TAILQ_FIRST(&drq)) != NULL) { TAILQ_REMOVE(&drq, dr, dr_entry); defrouter_del(dr); } defrouter_select(); break; } case SIOCGNBRINFO_IN6: { struct llentry *ln; struct in6_addr nb_addr = nbi->addr; /* make local for safety */ if ((error = in6_setscope(&nb_addr, ifp, NULL)) != 0) return (error); IF_AFDATA_RLOCK(ifp); ln = nd6_lookup(&nb_addr, 0, ifp); IF_AFDATA_RUNLOCK(ifp); if (ln == NULL) { error = EINVAL; break; } nbi->state = ln->ln_state; nbi->asked = ln->la_asked; nbi->isrouter = ln->ln_router; if (ln->la_expire == 0) nbi->expire = 0; else nbi->expire = ln->la_expire + ln->lle_remtime / hz + (time_second - time_uptime); LLE_RUNLOCK(ln); break; } case SIOCGDEFIFACE_IN6: /* XXX: should be implemented as a sysctl? */ ndif->ifindex = V_nd6_defifindex; break; case SIOCSDEFIFACE_IN6: /* XXX: should be implemented as a sysctl? */ return (nd6_setdefaultiface(ndif->ifindex)); } return (error); } /* * Calculates new isRouter value based on provided parameters and * returns it. */ static int nd6_is_router(int type, int code, int is_new, int old_addr, int new_addr, int ln_router) { /* * ICMP6 type dependent behavior. * * NS: clear IsRouter if new entry * RS: clear IsRouter * RA: set IsRouter if there's lladdr * redir: clear IsRouter if new entry * * RA case, (1): * The spec says that we must set IsRouter in the following cases: * - If lladdr exist, set IsRouter. This means (1-5). * - If it is old entry (!newentry), set IsRouter. This means (7). * So, based on the spec, in (1-5) and (7) cases we must set IsRouter. * A quetion arises for (1) case. (1) case has no lladdr in the * neighbor cache, this is similar to (6). * This case is rare but we figured that we MUST NOT set IsRouter. * * is_new old_addr new_addr NS RS RA redir * D R * 0 n n (1) c ? s * 0 y n (2) c s s * 0 n y (3) c s s * 0 y y (4) c s s * 0 y y (5) c s s * 1 -- n (6) c c c s * 1 -- y (7) c c s c s * * (c=clear s=set) */ switch (type & 0xff) { case ND_NEIGHBOR_SOLICIT: /* * New entry must have is_router flag cleared. */ if (is_new) /* (6-7) */ ln_router = 0; break; case ND_REDIRECT: /* * If the icmp is a redirect to a better router, always set the * is_router flag. Otherwise, if the entry is newly created, * clear the flag. [RFC 2461, sec 8.3] */ if (code == ND_REDIRECT_ROUTER) ln_router = 1; else { if (is_new) /* (6-7) */ ln_router = 0; } break; case ND_ROUTER_SOLICIT: /* * is_router flag must always be cleared. */ ln_router = 0; break; case ND_ROUTER_ADVERT: /* * Mark an entry with lladdr as a router. */ if ((!is_new && (old_addr || new_addr)) || /* (2-5) */ (is_new && new_addr)) { /* (7) */ ln_router = 1; } break; } return (ln_router); } /* * Create neighbor cache entry and cache link-layer address, * on reception of inbound ND6 packets. (RS/RA/NS/redirect) * * type - ICMP6 type * code - type dependent information * */ void nd6_cache_lladdr(struct ifnet *ifp, struct in6_addr *from, char *lladdr, int lladdrlen, int type, int code) { struct llentry *ln = NULL, *ln_tmp; int is_newentry; int do_update; int olladdr; int llchange; int flags; uint16_t router = 0; struct sockaddr_in6 sin6; struct mbuf *chain = NULL; u_char linkhdr[LLE_MAX_LINKHDR]; size_t linkhdrsize; int lladdr_off; IF_AFDATA_UNLOCK_ASSERT(ifp); KASSERT(ifp != NULL, ("%s: ifp == NULL", __func__)); KASSERT(from != NULL, ("%s: from == NULL", __func__)); /* nothing must be updated for unspecified address */ if (IN6_IS_ADDR_UNSPECIFIED(from)) return; /* * Validation about ifp->if_addrlen and lladdrlen must be done in * the caller. * * XXX If the link does not have link-layer adderss, what should * we do? (ifp->if_addrlen == 0) * Spec says nothing in sections for RA, RS and NA. There's small * description on it in NS section (RFC 2461 7.2.3). */ flags = lladdr ? LLE_EXCLUSIVE : 0; IF_AFDATA_RLOCK(ifp); ln = nd6_lookup(from, flags, ifp); IF_AFDATA_RUNLOCK(ifp); is_newentry = 0; if (ln == NULL) { flags |= LLE_EXCLUSIVE; ln = nd6_alloc(from, 0, ifp); if (ln == NULL) return; /* * Since we already know all the data for the new entry, * fill it before insertion. */ if (lladdr != NULL) { linkhdrsize = sizeof(linkhdr); if (lltable_calc_llheader(ifp, AF_INET6, lladdr, linkhdr, &linkhdrsize, &lladdr_off) != 0) return; lltable_set_entry_addr(ifp, ln, linkhdr, linkhdrsize, lladdr_off); } IF_AFDATA_WLOCK(ifp); LLE_WLOCK(ln); /* Prefer any existing lle over newly-created one */ ln_tmp = nd6_lookup(from, LLE_EXCLUSIVE, ifp); if (ln_tmp == NULL) lltable_link_entry(LLTABLE6(ifp), ln); IF_AFDATA_WUNLOCK(ifp); if (ln_tmp == NULL) { /* No existing lle, mark as new entry (6,7) */ is_newentry = 1; nd6_llinfo_setstate(ln, ND6_LLINFO_STALE); if (lladdr != NULL) /* (7) */ EVENTHANDLER_INVOKE(lle_event, ln, LLENTRY_RESOLVED); } else { lltable_free_entry(LLTABLE6(ifp), ln); ln = ln_tmp; ln_tmp = NULL; } } /* do nothing if static ndp is set */ if ((ln->la_flags & LLE_STATIC)) { if (flags & LLE_EXCLUSIVE) LLE_WUNLOCK(ln); else LLE_RUNLOCK(ln); return; } olladdr = (ln->la_flags & LLE_VALID) ? 1 : 0; if (olladdr && lladdr) { llchange = bcmp(lladdr, ln->ll_addr, ifp->if_addrlen); } else if (!olladdr && lladdr) llchange = 1; else llchange = 0; /* * newentry olladdr lladdr llchange (*=record) * 0 n n -- (1) * 0 y n -- (2) * 0 n y y (3) * STALE * 0 y y n (4) * * 0 y y y (5) * STALE * 1 -- n -- (6) NOSTATE(= PASSIVE) * 1 -- y -- (7) * STALE */ do_update = 0; if (is_newentry == 0 && llchange != 0) { do_update = 1; /* (3,5) */ /* * Record source link-layer address * XXX is it dependent to ifp->if_type? */ linkhdrsize = sizeof(linkhdr); if (lltable_calc_llheader(ifp, AF_INET6, lladdr, linkhdr, &linkhdrsize, &lladdr_off) != 0) return; if (lltable_try_set_entry_addr(ifp, ln, linkhdr, linkhdrsize, lladdr_off) == 0) { /* Entry was deleted */ return; } nd6_llinfo_setstate(ln, ND6_LLINFO_STALE); EVENTHANDLER_INVOKE(lle_event, ln, LLENTRY_RESOLVED); if (ln->la_hold != NULL) nd6_grab_holdchain(ln, &chain, &sin6); } /* Calculates new router status */ router = nd6_is_router(type, code, is_newentry, olladdr, lladdr != NULL ? 1 : 0, ln->ln_router); ln->ln_router = router; /* Mark non-router redirects with special flag */ if ((type & 0xFF) == ND_REDIRECT && code != ND_REDIRECT_ROUTER) ln->la_flags |= LLE_REDIRECT; if (flags & LLE_EXCLUSIVE) LLE_WUNLOCK(ln); else LLE_RUNLOCK(ln); if (chain != NULL) nd6_flush_holdchain(ifp, ifp, chain, &sin6); /* * When the link-layer address of a router changes, select the * best router again. In particular, when the neighbor entry is newly * created, it might affect the selection policy. * Question: can we restrict the first condition to the "is_newentry" * case? * XXX: when we hear an RA from a new router with the link-layer * address option, defrouter_select() is called twice, since * defrtrlist_update called the function as well. However, I believe * we can compromise the overhead, since it only happens the first * time. * XXX: although defrouter_select() should not have a bad effect * for those are not autoconfigured hosts, we explicitly avoid such * cases for safety. */ if ((do_update || is_newentry) && router && ND_IFINFO(ifp)->flags & ND6_IFF_ACCEPT_RTADV) { /* * guaranteed recursion */ defrouter_select(); } } static void nd6_slowtimo(void *arg) { CURVNET_SET((struct vnet *) arg); struct nd_ifinfo *nd6if; struct ifnet *ifp; callout_reset(&V_nd6_slowtimo_ch, ND6_SLOWTIMER_INTERVAL * hz, nd6_slowtimo, curvnet); IFNET_RLOCK_NOSLEEP(); TAILQ_FOREACH(ifp, &V_ifnet, if_link) { if (ifp->if_afdata[AF_INET6] == NULL) continue; nd6if = ND_IFINFO(ifp); if (nd6if->basereachable && /* already initialized */ (nd6if->recalctm -= ND6_SLOWTIMER_INTERVAL) <= 0) { /* * Since reachable time rarely changes by router * advertisements, we SHOULD insure that a new random * value gets recomputed at least once every few hours. * (RFC 2461, 6.3.4) */ nd6if->recalctm = V_nd6_recalc_reachtm_interval; nd6if->reachable = ND_COMPUTE_RTIME(nd6if->basereachable); } } IFNET_RUNLOCK_NOSLEEP(); CURVNET_RESTORE(); } void nd6_grab_holdchain(struct llentry *ln, struct mbuf **chain, struct sockaddr_in6 *sin6) { LLE_WLOCK_ASSERT(ln); *chain = ln->la_hold; ln->la_hold = NULL; lltable_fill_sa_entry(ln, (struct sockaddr *)sin6); if (ln->ln_state == ND6_LLINFO_STALE) { /* * The first time we send a packet to a * neighbor whose entry is STALE, we have * to change the state to DELAY and a sets * a timer to expire in DELAY_FIRST_PROBE_TIME * seconds to ensure do neighbor unreachability * detection on expiration. * (RFC 2461 7.3.3) */ nd6_llinfo_setstate(ln, ND6_LLINFO_DELAY); } } int nd6_output_ifp(struct ifnet *ifp, struct ifnet *origifp, struct mbuf *m, struct sockaddr_in6 *dst, struct route *ro) { int error; int ip6len; struct ip6_hdr *ip6; struct m_tag *mtag; #ifdef MAC mac_netinet6_nd6_send(ifp, m); #endif /* * If called from nd6_ns_output() (NS), nd6_na_output() (NA), * icmp6_redirect_output() (REDIRECT) or from rip6_output() (RS, RA * as handled by rtsol and rtadvd), mbufs will be tagged for SeND * to be diverted to user space. When re-injected into the kernel, * send_output() will directly dispatch them to the outgoing interface. */ if (send_sendso_input_hook != NULL) { mtag = m_tag_find(m, PACKET_TAG_ND_OUTGOING, NULL); if (mtag != NULL) { ip6 = mtod(m, struct ip6_hdr *); ip6len = sizeof(struct ip6_hdr) + ntohs(ip6->ip6_plen); /* Use the SEND socket */ error = send_sendso_input_hook(m, ifp, SND_OUT, ip6len); /* -1 == no app on SEND socket */ if (error == 0 || error != -1) return (error); } } m_clrprotoflags(m); /* Avoid confusing lower layers. */ IP_PROBE(send, NULL, NULL, mtod(m, struct ip6_hdr *), ifp, NULL, mtod(m, struct ip6_hdr *)); if ((ifp->if_flags & IFF_LOOPBACK) == 0) origifp = ifp; error = (*ifp->if_output)(origifp, m, (struct sockaddr *)dst, ro); return (error); } /* * Lookup link headerfor @sa_dst address. Stores found * data in @desten buffer. Copy of lle ln_flags can be also * saved in @pflags if @pflags is non-NULL. * * If destination LLE does not exists or lle state modification * is required, call "slow" version. * * Return values: * - 0 on success (address copied to buffer). * - EWOULDBLOCK (no local error, but address is still unresolved) * - other errors (alloc failure, etc) */ int nd6_resolve(struct ifnet *ifp, int is_gw, struct mbuf *m, const struct sockaddr *sa_dst, u_char *desten, uint32_t *pflags, struct llentry **plle) { struct llentry *ln = NULL; const struct sockaddr_in6 *dst6; if (pflags != NULL) *pflags = 0; dst6 = (const struct sockaddr_in6 *)sa_dst; /* discard the packet if IPv6 operation is disabled on the interface */ if ((ND_IFINFO(ifp)->flags & ND6_IFF_IFDISABLED)) { m_freem(m); return (ENETDOWN); /* better error? */ } if (m != NULL && m->m_flags & M_MCAST) { switch (ifp->if_type) { case IFT_ETHER: case IFT_FDDI: case IFT_L2VLAN: case IFT_IEEE80211: case IFT_BRIDGE: case IFT_ISO88025: ETHER_MAP_IPV6_MULTICAST(&dst6->sin6_addr, desten); return (0); default: m_freem(m); return (EAFNOSUPPORT); } } IF_AFDATA_RLOCK(ifp); - ln = nd6_lookup(&dst6->sin6_addr, LLE_UNLOCKED, ifp); + ln = nd6_lookup(&dst6->sin6_addr, plle ? LLE_EXCLUSIVE : LLE_UNLOCKED, + ifp); if (ln != NULL && (ln->r_flags & RLLE_VALID) != 0) { /* Entry found, let's copy lle info */ bcopy(ln->r_linkdata, desten, ln->r_hdrlen); if (pflags != NULL) *pflags = LLE_VALID | (ln->r_flags & RLLE_IFADDR); /* Check if we have feedback request from nd6 timer */ if (ln->r_skip_req != 0) { LLE_REQ_LOCK(ln); ln->r_skip_req = 0; /* Notify that entry was used */ ln->lle_hittime = time_uptime; LLE_REQ_UNLOCK(ln); } + if (plle) { + LLE_ADDREF(ln); + *plle = ln; + LLE_WUNLOCK(ln); + } IF_AFDATA_RUNLOCK(ifp); return (0); - } + } else if (plle && ln) + LLE_WUNLOCK(ln); IF_AFDATA_RUNLOCK(ifp); return (nd6_resolve_slow(ifp, 0, m, dst6, desten, pflags, plle)); } /* * Do L2 address resolution for @sa_dst address. Stores found * address in @desten buffer. Copy of lle ln_flags can be also * saved in @pflags if @pflags is non-NULL. * * Heavy version. * Function assume that destination LLE does not exist, * is invalid or stale, so LLE_EXCLUSIVE lock needs to be acquired. * * Set noinline to be dtrace-friendly */ static __noinline int nd6_resolve_slow(struct ifnet *ifp, int flags, struct mbuf *m, const struct sockaddr_in6 *dst, u_char *desten, uint32_t *pflags, struct llentry **plle) { struct llentry *lle = NULL, *lle_tmp; struct in6_addr *psrc, src; int send_ns, ll_len; char *lladdr; /* * Address resolution or Neighbor Unreachability Detection * for the next hop. * At this point, the destination of the packet must be a unicast * or an anycast address(i.e. not a multicast). */ if (lle == NULL) { IF_AFDATA_RLOCK(ifp); lle = nd6_lookup(&dst->sin6_addr, LLE_EXCLUSIVE, ifp); IF_AFDATA_RUNLOCK(ifp); if ((lle == NULL) && nd6_is_addr_neighbor(dst, ifp)) { /* * Since nd6_is_addr_neighbor() internally calls nd6_lookup(), * the condition below is not very efficient. But we believe * it is tolerable, because this should be a rare case. */ lle = nd6_alloc(&dst->sin6_addr, 0, ifp); if (lle == NULL) { char ip6buf[INET6_ADDRSTRLEN]; log(LOG_DEBUG, "nd6_output: can't allocate llinfo for %s " "(ln=%p)\n", ip6_sprintf(ip6buf, &dst->sin6_addr), lle); m_freem(m); return (ENOBUFS); } IF_AFDATA_WLOCK(ifp); LLE_WLOCK(lle); /* Prefer any existing entry over newly-created one */ lle_tmp = nd6_lookup(&dst->sin6_addr, LLE_EXCLUSIVE, ifp); if (lle_tmp == NULL) lltable_link_entry(LLTABLE6(ifp), lle); IF_AFDATA_WUNLOCK(ifp); if (lle_tmp != NULL) { lltable_free_entry(LLTABLE6(ifp), lle); lle = lle_tmp; lle_tmp = NULL; } } } if (lle == NULL) { if (!(ND_IFINFO(ifp)->flags & ND6_IFF_PERFORMNUD)) { m_freem(m); return (ENOBUFS); } if (m != NULL) m_freem(m); return (ENOBUFS); } LLE_WLOCK_ASSERT(lle); /* * The first time we send a packet to a neighbor whose entry is * STALE, we have to change the state to DELAY and a sets a timer to * expire in DELAY_FIRST_PROBE_TIME seconds to ensure do * neighbor unreachability detection on expiration. * (RFC 2461 7.3.3) */ if (lle->ln_state == ND6_LLINFO_STALE) nd6_llinfo_setstate(lle, ND6_LLINFO_DELAY); /* * If the neighbor cache entry has a state other than INCOMPLETE * (i.e. its link-layer address is already resolved), just * send the packet. */ if (lle->ln_state > ND6_LLINFO_INCOMPLETE) { if (flags & LLE_ADDRONLY) { lladdr = lle->ll_addr; ll_len = ifp->if_addrlen; } else { lladdr = lle->r_linkdata; ll_len = lle->r_hdrlen; } bcopy(lladdr, desten, ll_len); if (pflags != NULL) *pflags = lle->la_flags; if (plle) { LLE_ADDREF(lle); *plle = lle; } LLE_WUNLOCK(lle); return (0); } /* * There is a neighbor cache entry, but no ethernet address * response yet. Append this latest packet to the end of the * packet queue in the mbuf. When it exceeds nd6_maxqueuelen, * the oldest packet in the queue will be removed. */ if (lle->la_hold != NULL) { struct mbuf *m_hold; int i; i = 0; for (m_hold = lle->la_hold; m_hold; m_hold = m_hold->m_nextpkt){ i++; if (m_hold->m_nextpkt == NULL) { m_hold->m_nextpkt = m; break; } } while (i >= V_nd6_maxqueuelen) { m_hold = lle->la_hold; lle->la_hold = lle->la_hold->m_nextpkt; m_freem(m_hold); i--; } } else { lle->la_hold = m; } /* * If there has been no NS for the neighbor after entering the * INCOMPLETE state, send the first solicitation. * Note that for newly-created lle la_asked will be 0, * so we will transition from ND6_LLINFO_NOSTATE to * ND6_LLINFO_INCOMPLETE state here. */ psrc = NULL; send_ns = 0; if (lle->la_asked == 0) { lle->la_asked++; send_ns = 1; psrc = nd6_llinfo_get_holdsrc(lle, &src); nd6_llinfo_setstate(lle, ND6_LLINFO_INCOMPLETE); } LLE_WUNLOCK(lle); if (send_ns != 0) nd6_ns_output(ifp, psrc, NULL, &dst->sin6_addr, NULL); return (EWOULDBLOCK); } /* * Do L2 address resolution for @sa_dst address. Stores found * address in @desten buffer. Copy of lle ln_flags can be also * saved in @pflags if @pflags is non-NULL. * * Return values: * - 0 on success (address copied to buffer). * - EWOULDBLOCK (no local error, but address is still unresolved) * - other errors (alloc failure, etc) */ int nd6_resolve_addr(struct ifnet *ifp, int flags, const struct sockaddr *dst, char *desten, uint32_t *pflags) { int error; flags |= LLE_ADDRONLY; error = nd6_resolve_slow(ifp, flags, NULL, (const struct sockaddr_in6 *)dst, desten, pflags, NULL); return (error); } int nd6_flush_holdchain(struct ifnet *ifp, struct ifnet *origifp, struct mbuf *chain, struct sockaddr_in6 *dst) { struct mbuf *m, *m_head; struct ifnet *outifp; int error = 0; m_head = chain; if ((ifp->if_flags & IFF_LOOPBACK) != 0) outifp = origifp; else outifp = ifp; while (m_head) { m = m_head; m_head = m_head->m_nextpkt; error = nd6_output_ifp(ifp, origifp, m, dst, NULL); } /* * XXX * note that intermediate errors are blindly ignored */ return (error); } static int nd6_need_cache(struct ifnet *ifp) { /* * XXX: we currently do not make neighbor cache on any interface * other than ARCnet, Ethernet, FDDI and GIF. * * RFC2893 says: * - unidirectional tunnels needs no ND */ switch (ifp->if_type) { case IFT_ARCNET: case IFT_ETHER: case IFT_FDDI: case IFT_IEEE1394: case IFT_L2VLAN: case IFT_IEEE80211: case IFT_INFINIBAND: case IFT_BRIDGE: case IFT_PROPVIRTUAL: return (1); default: return (0); } } /* * Add pernament ND6 link-layer record for given * interface address. * * Very similar to IPv4 arp_ifinit(), but: * 1) IPv6 DAD is performed in different place * 2) It is called by IPv6 protocol stack in contrast to * arp_ifinit() which is typically called in SIOCSIFADDR * driver ioctl handler. * */ int nd6_add_ifa_lle(struct in6_ifaddr *ia) { struct ifnet *ifp; struct llentry *ln, *ln_tmp; struct sockaddr *dst; ifp = ia->ia_ifa.ifa_ifp; if (nd6_need_cache(ifp) == 0) return (0); ia->ia_ifa.ifa_rtrequest = nd6_rtrequest; dst = (struct sockaddr *)&ia->ia_addr; ln = lltable_alloc_entry(LLTABLE6(ifp), LLE_IFADDR, dst); if (ln == NULL) return (ENOBUFS); IF_AFDATA_WLOCK(ifp); LLE_WLOCK(ln); /* Unlink any entry if exists */ ln_tmp = lla_lookup(LLTABLE6(ifp), LLE_EXCLUSIVE, dst); if (ln_tmp != NULL) lltable_unlink_entry(LLTABLE6(ifp), ln_tmp); lltable_link_entry(LLTABLE6(ifp), ln); IF_AFDATA_WUNLOCK(ifp); if (ln_tmp != NULL) EVENTHANDLER_INVOKE(lle_event, ln_tmp, LLENTRY_EXPIRED); EVENTHANDLER_INVOKE(lle_event, ln, LLENTRY_RESOLVED); LLE_WUNLOCK(ln); if (ln_tmp != NULL) llentry_free(ln_tmp); return (0); } /* * Removes either all lle entries for given @ia, or lle * corresponding to @ia address. */ void nd6_rem_ifa_lle(struct in6_ifaddr *ia, int all) { struct sockaddr_in6 mask, addr; struct sockaddr *saddr, *smask; struct ifnet *ifp; ifp = ia->ia_ifa.ifa_ifp; memcpy(&addr, &ia->ia_addr, sizeof(ia->ia_addr)); memcpy(&mask, &ia->ia_prefixmask, sizeof(ia->ia_prefixmask)); saddr = (struct sockaddr *)&addr; smask = (struct sockaddr *)&mask; if (all != 0) lltable_prefix_free(AF_INET6, saddr, smask, LLE_STATIC); else lltable_delete_addr(LLTABLE6(ifp), LLE_IFADDR, saddr); } static void clear_llinfo_pqueue(struct llentry *ln) { struct mbuf *m_hold, *m_hold_next; for (m_hold = ln->la_hold; m_hold; m_hold = m_hold_next) { m_hold_next = m_hold->m_nextpkt; m_freem(m_hold); } ln->la_hold = NULL; } static int nd6_sysctl_drlist(SYSCTL_HANDLER_ARGS); static int nd6_sysctl_prlist(SYSCTL_HANDLER_ARGS); SYSCTL_DECL(_net_inet6_icmp6); SYSCTL_PROC(_net_inet6_icmp6, ICMPV6CTL_ND6_DRLIST, nd6_drlist, CTLTYPE_OPAQUE | CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, 0, nd6_sysctl_drlist, "S,in6_defrouter", "NDP default router list"); SYSCTL_PROC(_net_inet6_icmp6, ICMPV6CTL_ND6_PRLIST, nd6_prlist, CTLTYPE_OPAQUE | CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, 0, nd6_sysctl_prlist, "S,in6_prefix", "NDP prefix list"); SYSCTL_INT(_net_inet6_icmp6, ICMPV6CTL_ND6_MAXQLEN, nd6_maxqueuelen, CTLFLAG_VNET | CTLFLAG_RW, &VNET_NAME(nd6_maxqueuelen), 1, ""); SYSCTL_INT(_net_inet6_icmp6, OID_AUTO, nd6_gctimer, CTLFLAG_VNET | CTLFLAG_RW, &VNET_NAME(nd6_gctimer), (60 * 60 * 24), ""); static int nd6_sysctl_drlist(SYSCTL_HANDLER_ARGS) { struct in6_defrouter d; struct nd_defrouter *dr; int error; if (req->newptr != NULL) return (EPERM); error = sysctl_wire_old_buffer(req, 0); if (error != 0) return (error); bzero(&d, sizeof(d)); d.rtaddr.sin6_family = AF_INET6; d.rtaddr.sin6_len = sizeof(d.rtaddr); ND6_RLOCK(); TAILQ_FOREACH(dr, &V_nd_defrouter, dr_entry) { d.rtaddr.sin6_addr = dr->rtaddr; error = sa6_recoverscope(&d.rtaddr); if (error != 0) break; d.flags = dr->raflags; d.rtlifetime = dr->rtlifetime; d.expire = dr->expire + (time_second - time_uptime); d.if_index = dr->ifp->if_index; error = SYSCTL_OUT(req, &d, sizeof(d)); if (error != 0) break; } ND6_RUNLOCK(); return (error); } static int nd6_sysctl_prlist(SYSCTL_HANDLER_ARGS) { struct in6_prefix p; struct sockaddr_in6 s6; struct nd_prefix *pr; struct nd_pfxrouter *pfr; time_t maxexpire; int error; char ip6buf[INET6_ADDRSTRLEN]; if (req->newptr) return (EPERM); error = sysctl_wire_old_buffer(req, 0); if (error != 0) return (error); bzero(&p, sizeof(p)); p.origin = PR_ORIG_RA; bzero(&s6, sizeof(s6)); s6.sin6_family = AF_INET6; s6.sin6_len = sizeof(s6); ND6_RLOCK(); LIST_FOREACH(pr, &V_nd_prefix, ndpr_entry) { p.prefix = pr->ndpr_prefix; if (sa6_recoverscope(&p.prefix)) { log(LOG_ERR, "scope error in prefix list (%s)\n", ip6_sprintf(ip6buf, &p.prefix.sin6_addr)); /* XXX: press on... */ } p.raflags = pr->ndpr_raf; p.prefixlen = pr->ndpr_plen; p.vltime = pr->ndpr_vltime; p.pltime = pr->ndpr_pltime; p.if_index = pr->ndpr_ifp->if_index; if (pr->ndpr_vltime == ND6_INFINITE_LIFETIME) p.expire = 0; else { /* XXX: we assume time_t is signed. */ maxexpire = (-1) & ~((time_t)1 << ((sizeof(maxexpire) * 8) - 1)); if (pr->ndpr_vltime < maxexpire - pr->ndpr_lastupdate) p.expire = pr->ndpr_lastupdate + pr->ndpr_vltime + (time_second - time_uptime); else p.expire = maxexpire; } p.refcnt = pr->ndpr_refcnt; p.flags = pr->ndpr_stateflags; p.advrtrs = 0; LIST_FOREACH(pfr, &pr->ndpr_advrtrs, pfr_entry) p.advrtrs++; error = SYSCTL_OUT(req, &p, sizeof(p)); if (error != 0) break; LIST_FOREACH(pfr, &pr->ndpr_advrtrs, pfr_entry) { s6.sin6_addr = pfr->router->rtaddr; if (sa6_recoverscope(&s6)) log(LOG_ERR, "scope error in prefix list (%s)\n", ip6_sprintf(ip6buf, &pfr->router->rtaddr)); error = SYSCTL_OUT(req, &s6, sizeof(s6)); if (error != 0) break; } } ND6_RUNLOCK(); return (error); } Index: user/alc/PQ_LAUNDRY/sys/sys/systm.h =================================================================== --- user/alc/PQ_LAUNDRY/sys/sys/systm.h (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/sys/systm.h (revision 303206) @@ -1,450 +1,452 @@ /*- * Copyright (c) 1982, 1988, 1991, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)systm.h 8.7 (Berkeley) 3/29/95 * $FreeBSD$ */ #ifndef _SYS_SYSTM_H_ #define _SYS_SYSTM_H_ #include #include #include #include #include #include /* for people using printf mainly */ extern int cold; /* nonzero if we are doing a cold boot */ extern int suspend_blocked; /* block suspend due to pending shutdown */ extern int rebooting; /* kern_reboot() has been called. */ extern const char *panicstr; /* panic message */ extern char version[]; /* system version */ extern char compiler_version[]; /* compiler version */ extern char copyright[]; /* system copyright */ extern int kstack_pages; /* number of kernel stack pages */ extern u_long pagesizes[]; /* supported page sizes */ extern long physmem; /* physical memory */ extern long realmem; /* 'real' memory */ extern char *rootdevnames[2]; /* names of possible root devices */ extern int boothowto; /* reboot flags, from console subsystem */ extern int bootverbose; /* nonzero to print verbose messages */ extern int maxusers; /* system tune hint */ extern int ngroups_max; /* max # of supplemental groups */ extern int vm_guest; /* Running as virtual machine guest? */ /* * Detected virtual machine guest types. The intention is to expand * and/or add to the VM_GUEST_VM type if specific VM functionality is * ever implemented (e.g. vendor-specific paravirtualization features). * Keep in sync with vm_guest_sysctl_names[]. */ enum VM_GUEST { VM_GUEST_NO = 0, VM_GUEST_VM, VM_GUEST_XEN, VM_GUEST_HV, VM_GUEST_VMWARE, VM_GUEST_KVM, VM_LAST }; #if defined(WITNESS) || defined(INVARIANT_SUPPORT) void kassert_panic(const char *fmt, ...) __printflike(1, 2); #endif #ifdef INVARIANTS /* The option is always available */ #define KASSERT(exp,msg) do { \ if (__predict_false(!(exp))) \ kassert_panic msg; \ } while (0) #define VNASSERT(exp, vp, msg) do { \ if (__predict_false(!(exp))) { \ vn_printf(vp, "VNASSERT failed\n"); \ kassert_panic msg; \ } \ } while (0) #else #define KASSERT(exp,msg) do { \ } while (0) #define VNASSERT(exp, vp, msg) do { \ } while (0) #endif #ifndef CTASSERT /* Allow lint to override */ #define CTASSERT(x) _Static_assert(x, "compile-time assertion failed") #endif /* * Assert that a pointer can be loaded from memory atomically. * * This assertion enforces stronger alignment than necessary. For example, * on some architectures, atomicity for unaligned loads will depend on * whether or not the load spans multiple cache lines. */ #define ASSERT_ATOMIC_LOAD_PTR(var, msg) \ KASSERT(sizeof(var) == sizeof(void *) && \ ((uintptr_t)&(var) & (sizeof(void *) - 1)) == 0, msg) /* * Assert that a thread is in critical(9) section. */ #define CRITICAL_ASSERT(td) \ KASSERT((td)->td_critnest >= 1, ("Not in critical section")); /* * If we have already panic'd and this is the thread that called * panic(), then don't block on any mutexes but silently succeed. * Otherwise, the kernel will deadlock since the scheduler isn't * going to run the thread that holds any lock we need. */ #define SCHEDULER_STOPPED() __predict_false(curthread->td_stopsched) /* * XXX the hints declarations are even more misplaced than most declarations * in this file, since they are needed in one file (per arch) and only used * in two files. * XXX most of these variables should be const. */ extern int osreldate; extern int envmode; extern int hintmode; /* 0 = off. 1 = config, 2 = fallback */ extern int dynamic_kenv; extern struct mtx kenv_lock; extern char *kern_envp; extern char static_env[]; extern char static_hints[]; /* by config for now */ extern char **kenvp; extern const void *zero_region; /* address space maps to a zeroed page */ extern int unmapped_buf_allowed; #ifdef __LP64__ #define IOSIZE_MAX iosize_max() #define DEVFS_IOSIZE_MAX devfs_iosize_max() #else #define IOSIZE_MAX SSIZE_MAX #define DEVFS_IOSIZE_MAX SSIZE_MAX #endif /* * General function declarations. */ struct inpcb; struct lock_object; struct malloc_type; struct mtx; struct proc; struct socket; struct thread; struct tty; struct ucred; struct uio; struct _jmp_buf; struct trapframe; struct eventtimer; int setjmp(struct _jmp_buf *) __returns_twice; void longjmp(struct _jmp_buf *, int) __dead2; int dumpstatus(vm_offset_t addr, off_t count); int nullop(void); int eopnotsupp(void); int ureadc(int, struct uio *); void hashdestroy(void *, struct malloc_type *, u_long); void *hashinit(int count, struct malloc_type *type, u_long *hashmask); void *hashinit_flags(int count, struct malloc_type *type, u_long *hashmask, int flags); #define HASH_NOWAIT 0x00000001 #define HASH_WAITOK 0x00000002 void *phashinit(int count, struct malloc_type *type, u_long *nentries); void *phashinit_flags(int count, struct malloc_type *type, u_long *nentries, int flags); void g_waitidle(void); void panic(const char *, ...) __dead2 __printflike(1, 2); void vpanic(const char *, __va_list) __dead2 __printflike(1, 0); void cpu_boot(int); void cpu_flush_dcache(void *, size_t); void cpu_rootconf(void); void critical_enter(void); void critical_exit(void); void init_param1(void); void init_param2(long physpages); void init_static_kenv(char *, size_t); void tablefull(const char *); #ifdef EARLY_PRINTF typedef void early_putc_t(int ch); extern early_putc_t *early_putc; #endif int kvprintf(char const *, void (*)(int, void*), void *, int, __va_list) __printflike(1, 0); void log(int, const char *, ...) __printflike(2, 3); void log_console(struct uio *); void vlog(int, const char *, __va_list) __printflike(2, 0); int asprintf(char **ret, struct malloc_type *mtp, const char *format, ...) __printflike(3, 4); int printf(const char *, ...) __printflike(1, 2); int snprintf(char *, size_t, const char *, ...) __printflike(3, 4); int sprintf(char *buf, const char *, ...) __printflike(2, 3); int uprintf(const char *, ...) __printflike(1, 2); int vprintf(const char *, __va_list) __printflike(1, 0); int vasprintf(char **ret, struct malloc_type *mtp, const char *format, __va_list ap) __printflike(3, 0); int vsnprintf(char *, size_t, const char *, __va_list) __printflike(3, 0); int vsnrprintf(char *, size_t, int, const char *, __va_list) __printflike(4, 0); int vsprintf(char *buf, const char *, __va_list) __printflike(2, 0); int ttyprintf(struct tty *, const char *, ...) __printflike(2, 3); int sscanf(const char *, char const *, ...) __nonnull(1) __nonnull(2); int vsscanf(const char *, char const *, __va_list) __nonnull(1) __nonnull(2); long strtol(const char *, char **, int) __nonnull(1); u_long strtoul(const char *, char **, int) __nonnull(1); quad_t strtoq(const char *, char **, int) __nonnull(1); u_quad_t strtouq(const char *, char **, int) __nonnull(1); void tprintf(struct proc *p, int pri, const char *, ...) __printflike(3, 4); void vtprintf(struct proc *, int, const char *, __va_list) __printflike(3, 0); void hexdump(const void *ptr, int length, const char *hdr, int flags); #define HD_COLUMN_MASK 0xff #define HD_DELIM_MASK 0xff00 #define HD_OMIT_COUNT (1 << 16) #define HD_OMIT_HEX (1 << 17) #define HD_OMIT_CHARS (1 << 18) #define ovbcopy(f, t, l) bcopy((f), (t), (l)) void bcopy(const void *from, void *to, size_t len) __nonnull(1) __nonnull(2); void bzero(void *buf, size_t len) __nonnull(1); void explicit_bzero(void *, size_t) __nonnull(1); void *memcpy(void *to, const void *from, size_t len) __nonnull(1) __nonnull(2); void *memmove(void *dest, const void *src, size_t n) __nonnull(1) __nonnull(2); int copystr(const void * __restrict kfaddr, void * __restrict kdaddr, size_t len, size_t * __restrict lencopied) __nonnull(1) __nonnull(2); int copyinstr(const void * __restrict udaddr, void * __restrict kaddr, size_t len, size_t * __restrict lencopied) __nonnull(1) __nonnull(2); int copyin(const void * __restrict udaddr, void * __restrict kaddr, size_t len) __nonnull(1) __nonnull(2); int copyin_nofault(const void * __restrict udaddr, void * __restrict kaddr, size_t len) __nonnull(1) __nonnull(2); int copyout(const void * __restrict kaddr, void * __restrict udaddr, size_t len) __nonnull(1) __nonnull(2); int copyout_nofault(const void * __restrict kaddr, void * __restrict udaddr, size_t len) __nonnull(1) __nonnull(2); int fubyte(volatile const void *base); long fuword(volatile const void *base); int fuword16(volatile const void *base); int32_t fuword32(volatile const void *base); int64_t fuword64(volatile const void *base); int fueword(volatile const void *base, long *val); int fueword32(volatile const void *base, int32_t *val); int fueword64(volatile const void *base, int64_t *val); int subyte(volatile void *base, int byte); int suword(volatile void *base, long word); int suword16(volatile void *base, int word); int suword32(volatile void *base, int32_t word); int suword64(volatile void *base, int64_t word); uint32_t casuword32(volatile uint32_t *base, uint32_t oldval, uint32_t newval); u_long casuword(volatile u_long *p, u_long oldval, u_long newval); int casueword32(volatile uint32_t *base, uint32_t oldval, uint32_t *oldvalp, uint32_t newval); int casueword(volatile u_long *p, u_long oldval, u_long *oldvalp, u_long newval); void realitexpire(void *); int sysbeep(int hertz, int period); void hardclock(int usermode, uintfptr_t pc); void hardclock_cnt(int cnt, int usermode); void hardclock_cpu(int usermode); void hardclock_sync(int cpu); void softclock(void *); void statclock(int usermode); void statclock_cnt(int cnt, int usermode); void profclock(int usermode, uintfptr_t pc); void profclock_cnt(int cnt, int usermode, uintfptr_t pc); int hardclockintr(void); void startprofclock(struct proc *); void stopprofclock(struct proc *); void cpu_startprofclock(void); void cpu_stopprofclock(void); sbintime_t cpu_idleclock(void); void cpu_activeclock(void); void cpu_new_callout(int cpu, sbintime_t bt, sbintime_t bt_opt); void cpu_et_frequency(struct eventtimer *et, uint64_t newfreq); extern int cpu_deepest_sleep; extern int cpu_disable_c2_sleep; extern int cpu_disable_c3_sleep; int cr_cansee(struct ucred *u1, struct ucred *u2); int cr_canseesocket(struct ucred *cred, struct socket *so); int cr_canseeinpcb(struct ucred *cred, struct inpcb *inp); char *kern_getenv(const char *name); void freeenv(char *env); int getenv_int(const char *name, int *data); int getenv_uint(const char *name, unsigned int *data); int getenv_long(const char *name, long *data); int getenv_ulong(const char *name, unsigned long *data); int getenv_string(const char *name, char *data, int size); int getenv_int64(const char *name, int64_t *data); int getenv_uint64(const char *name, uint64_t *data); int getenv_quad(const char *name, quad_t *data); int kern_setenv(const char *name, const char *value); int kern_unsetenv(const char *name); int testenv(const char *name); typedef uint64_t (cpu_tick_f)(void); void set_cputicker(cpu_tick_f *func, uint64_t freq, unsigned var); extern cpu_tick_f *cpu_ticks; uint64_t cpu_tickrate(void); uint64_t cputick2usec(uint64_t tick); #ifdef APM_FIXUP_CALLTODO struct timeval; void adjust_timeout_calltodo(struct timeval *time_change); #endif /* APM_FIXUP_CALLTODO */ #include /* Initialize the world */ void consinit(void); void cpu_initclocks(void); void cpu_initclocks_bsp(void); void cpu_initclocks_ap(void); void usrinfoinit(void); /* Finalize the world */ void kern_reboot(int) __dead2; void shutdown_nice(int); /* Timeouts */ typedef void timeout_t(void *); /* timeout function type */ #define CALLOUT_HANDLE_INITIALIZER(handle) \ { NULL } void callout_handle_init(struct callout_handle *); struct callout_handle timeout(timeout_t *, void *, int); void untimeout(timeout_t *, void *, struct callout_handle); /* Stubs for obsolete functions that used to be for interrupt management */ static __inline intrmask_t splbio(void) { return 0; } static __inline intrmask_t splcam(void) { return 0; } static __inline intrmask_t splclock(void) { return 0; } static __inline intrmask_t splhigh(void) { return 0; } static __inline intrmask_t splimp(void) { return 0; } static __inline intrmask_t splnet(void) { return 0; } static __inline intrmask_t spltty(void) { return 0; } static __inline void splx(intrmask_t ipl __unused) { return; } /* * Common `proc' functions are declared here so that proc.h can be included * less often. */ int _sleep(void *chan, struct lock_object *lock, int pri, const char *wmesg, sbintime_t sbt, sbintime_t pr, int flags) __nonnull(1); #define msleep(chan, mtx, pri, wmesg, timo) \ _sleep((chan), &(mtx)->lock_object, (pri), (wmesg), \ tick_sbt * (timo), 0, C_HARDCLOCK) #define msleep_sbt(chan, mtx, pri, wmesg, bt, pr, flags) \ _sleep((chan), &(mtx)->lock_object, (pri), (wmesg), (bt), (pr), \ (flags)) int msleep_spin_sbt(void *chan, struct mtx *mtx, const char *wmesg, sbintime_t sbt, sbintime_t pr, int flags) __nonnull(1); #define msleep_spin(chan, mtx, wmesg, timo) \ msleep_spin_sbt((chan), (mtx), (wmesg), tick_sbt * (timo), \ 0, C_HARDCLOCK) int pause_sbt(const char *wmesg, sbintime_t sbt, sbintime_t pr, int flags); #define pause(wmesg, timo) \ pause_sbt((wmesg), tick_sbt * (timo), 0, C_HARDCLOCK) #define tsleep(chan, pri, wmesg, timo) \ _sleep((chan), NULL, (pri), (wmesg), tick_sbt * (timo), \ 0, C_HARDCLOCK) #define tsleep_sbt(chan, pri, wmesg, bt, pr, flags) \ _sleep((chan), NULL, (pri), (wmesg), (bt), (pr), (flags)) void wakeup(void *chan) __nonnull(1); void wakeup_one(void *chan) __nonnull(1); /* * Common `struct cdev *' stuff are declared here to avoid #include poisoning */ struct cdev; dev_t dev2udev(struct cdev *x); const char *devtoname(struct cdev *cdev); #ifdef __LP64__ size_t devfs_iosize_max(void); size_t iosize_max(void); #endif int poll_no_poll(int events); /* XXX: Should be void nanodelay(u_int nsec); */ void DELAY(int usec); /* Root mount holdback API */ struct root_hold_token; struct root_hold_token *root_mount_hold(const char *identifier); void root_mount_rel(struct root_hold_token *h); int root_mounted(void); /* * Unit number allocation API. (kern/subr_unit.c) */ struct unrhdr; struct unrhdr *new_unrhdr(int low, int high, struct mtx *mutex); void init_unrhdr(struct unrhdr *uh, int low, int high, struct mtx *mutex); void delete_unrhdr(struct unrhdr *uh); void clean_unrhdr(struct unrhdr *uh); void clean_unrhdrl(struct unrhdr *uh); int alloc_unr(struct unrhdr *uh); int alloc_unr_specific(struct unrhdr *uh, u_int item); int alloc_unrl(struct unrhdr *uh); void free_unr(struct unrhdr *uh, u_int item); void intr_prof_stack_use(struct thread *td, struct trapframe *frame); extern void (*softdep_ast_cleanup)(void); +void counted_warning(unsigned *counter, const char *msg); + #endif /* !_SYS_SYSTM_H_ */ Index: user/alc/PQ_LAUNDRY/sys/ufs/ufs/ufs_lookup.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/ufs/ufs/ufs_lookup.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/ufs/ufs/ufs_lookup.c (revision 303206) @@ -1,1493 +1,1496 @@ /*- * Copyright (c) 1989, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)ufs_lookup.c 8.15 (Berkeley) 6/16/95 */ #include __FBSDID("$FreeBSD$"); #include "opt_ufs.h" #include "opt_quota.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef UFS_DIRHASH #include #endif #include #include #ifdef DIAGNOSTIC static int dirchk = 1; #else static int dirchk = 0; #endif SYSCTL_INT(_debug, OID_AUTO, dircheck, CTLFLAG_RW, &dirchk, 0, ""); /* true if old FS format...*/ #define OFSFMT(vp) ((vp)->v_mount->mnt_maxsymlinklen <= 0) #ifdef QUOTA static int ufs_lookup_upgrade_lock(struct vnode *vp) { int error; ASSERT_VOP_LOCKED(vp, __FUNCTION__); if (VOP_ISLOCKED(vp) == LK_EXCLUSIVE) return (0); error = 0; /* * Upgrade vnode lock, since getinoquota() * requires exclusive lock to modify inode. */ vhold(vp); vn_lock(vp, LK_UPGRADE | LK_RETRY); VI_LOCK(vp); if (vp->v_iflag & VI_DOOMED) error = ENOENT; vdropl(vp); return (error); } #endif static int ufs_delete_denied(struct vnode *vdp, struct vnode *tdp, struct ucred *cred, struct thread *td) { int error; #ifdef UFS_ACL /* * NFSv4 Minor Version 1, draft-ietf-nfsv4-minorversion1-03.txt * * 3.16.2.1. ACE4_DELETE vs. ACE4_DELETE_CHILD */ /* * XXX: Is this check required? */ error = VOP_ACCESS(vdp, VEXEC, cred, td); if (error) return (error); error = VOP_ACCESSX(tdp, VDELETE, cred, td); if (error == 0) return (0); error = VOP_ACCESSX(vdp, VDELETE_CHILD, cred, td); if (error == 0) return (0); error = VOP_ACCESSX(vdp, VEXPLICIT_DENY | VDELETE_CHILD, cred, td); if (error) return (error); #endif /* !UFS_ACL */ /* * Standard Unix access control - delete access requires VWRITE. */ error = VOP_ACCESS(vdp, VWRITE, cred, td); if (error) return (error); /* * If directory is "sticky", then user must own * the directory, or the file in it, else she * may not delete it (unless she's root). This * implements append-only directories. */ if ((VTOI(vdp)->i_mode & ISVTX) && VOP_ACCESS(vdp, VADMIN, cred, td) && VOP_ACCESS(tdp, VADMIN, cred, td)) return (EPERM); return (0); } /* * Convert a component of a pathname into a pointer to a locked inode. * This is a very central and rather complicated routine. * If the filesystem is not maintained in a strict tree hierarchy, * this can result in a deadlock situation (see comments in code below). * * The cnp->cn_nameiop argument is LOOKUP, CREATE, RENAME, or DELETE depending * on whether the name is to be looked up, created, renamed, or deleted. * When CREATE, RENAME, or DELETE is specified, information usable in * creating, renaming, or deleting a directory entry may be calculated. * If flag has LOCKPARENT or'ed into it and the target of the pathname * exists, lookup returns both the target and its parent directory locked. * When creating or renaming and LOCKPARENT is specified, the target may * not be ".". When deleting and LOCKPARENT is specified, the target may * be "."., but the caller must check to ensure it does an vrele and vput * instead of two vputs. * * This routine is actually used as VOP_CACHEDLOOKUP method, and the * filesystem employs the generic vfs_cache_lookup() as VOP_LOOKUP * method. * * vfs_cache_lookup() performs the following for us: * check that it is a directory * check accessibility of directory * check for modification attempts on read-only mounts * if name found in cache * if at end of path and deleting or creating * drop it * else * return name. * return VOP_CACHEDLOOKUP() * * Overall outline of ufs_lookup: * * search for name in directory, to found or notfound * notfound: * if creating, return locked directory, leaving info on available slots * else return error * found: * if at end of path and deleting, return information to allow delete * if at end of path and rewriting (RENAME and LOCKPARENT), lock target * inode and return info to allow rewrite * if not at end, add name to cache; if at end and neither creating * nor deleting, add name to cache */ int ufs_lookup(ap) struct vop_cachedlookup_args /* { struct vnode *a_dvp; struct vnode **a_vpp; struct componentname *a_cnp; } */ *ap; { return (ufs_lookup_ino(ap->a_dvp, ap->a_vpp, ap->a_cnp, NULL)); } int ufs_lookup_ino(struct vnode *vdp, struct vnode **vpp, struct componentname *cnp, ino_t *dd_ino) { struct inode *dp; /* inode for directory being searched */ struct buf *bp; /* a buffer of directory entries */ struct direct *ep; /* the current directory entry */ int entryoffsetinblock; /* offset of ep in bp's buffer */ enum {NONE, COMPACT, FOUND} slotstatus; doff_t slotoffset; /* offset of area with free space */ doff_t i_diroff; /* cached i_diroff value. */ doff_t i_offset; /* cached i_offset value. */ int slotsize; /* size of area at slotoffset */ int slotfreespace; /* amount of space free in slot */ int slotneeded; /* size of the entry we're seeking */ int numdirpasses; /* strategy for directory search */ doff_t endsearch; /* offset to end directory search */ doff_t prevoff; /* prev entry dp->i_offset */ struct vnode *pdp; /* saved dp during symlink work */ struct vnode *tdp; /* returned by VFS_VGET */ doff_t enduseful; /* pointer past last used dir slot */ u_long bmask; /* block offset mask */ int namlen, error; struct ucred *cred = cnp->cn_cred; int flags = cnp->cn_flags; int nameiop = cnp->cn_nameiop; ino_t ino, ino1; int ltype; if (vpp != NULL) *vpp = NULL; dp = VTOI(vdp); if (dp->i_effnlink == 0) return (ENOENT); /* * Create a vm object if vmiodirenable is enabled. * Alternatively we could call vnode_create_vobject * in VFS_VGET but we could end up creating objects * that are never used. */ vnode_create_vobject(vdp, DIP(dp, i_size), cnp->cn_thread); bmask = VFSTOUFS(vdp->v_mount)->um_mountp->mnt_stat.f_iosize - 1; #ifdef QUOTA if ((nameiop == DELETE || nameiop == RENAME) && (flags & ISLASTCN)) { error = ufs_lookup_upgrade_lock(vdp); if (error != 0) return (error); } #endif restart: bp = NULL; slotoffset = -1; /* * We now have a segment name to search for, and a directory to search. * * Suppress search for slots unless creating * file and at end of pathname, in which case * we watch for a place to put the new file in * case it doesn't already exist. */ ino = 0; i_diroff = dp->i_diroff; slotstatus = FOUND; slotfreespace = slotsize = slotneeded = 0; if ((nameiop == CREATE || nameiop == RENAME) && (flags & ISLASTCN)) { slotstatus = NONE; slotneeded = DIRECTSIZ(cnp->cn_namelen); } #ifdef UFS_DIRHASH /* * Use dirhash for fast operations on large directories. The logic * to determine whether to hash the directory is contained within * ufsdirhash_build(); a zero return means that it decided to hash * this directory and it successfully built up the hash table. */ if (ufsdirhash_build(dp) == 0) { /* Look for a free slot if needed. */ enduseful = dp->i_size; if (slotstatus != FOUND) { slotoffset = ufsdirhash_findfree(dp, slotneeded, &slotsize); if (slotoffset >= 0) { slotstatus = COMPACT; enduseful = ufsdirhash_enduseful(dp); if (enduseful < 0) enduseful = dp->i_size; } } /* Look up the component. */ numdirpasses = 1; entryoffsetinblock = 0; /* silence compiler warning */ switch (ufsdirhash_lookup(dp, cnp->cn_nameptr, cnp->cn_namelen, &i_offset, &bp, nameiop == DELETE ? &prevoff : NULL)) { case 0: ep = (struct direct *)((char *)bp->b_data + (i_offset & bmask)); goto foundentry; case ENOENT: i_offset = roundup2(dp->i_size, DIRBLKSIZ); goto notfound; default: /* Something failed; just do a linear search. */ break; } } #endif /* UFS_DIRHASH */ /* * If there is cached information on a previous search of * this directory, pick up where we last left off. * We cache only lookups as these are the most common * and have the greatest payoff. Caching CREATE has little * benefit as it usually must search the entire directory * to determine that the entry does not exist. Caching the * location of the last DELETE or RENAME has not reduced * profiling time and hence has been removed in the interest * of simplicity. */ if (nameiop != LOOKUP || i_diroff == 0 || i_diroff >= dp->i_size) { entryoffsetinblock = 0; i_offset = 0; numdirpasses = 1; } else { i_offset = i_diroff; if ((entryoffsetinblock = i_offset & bmask) && (error = UFS_BLKATOFF(vdp, (off_t)i_offset, NULL, &bp))) return (error); numdirpasses = 2; nchstats.ncs_2passes++; } prevoff = i_offset; endsearch = roundup2(dp->i_size, DIRBLKSIZ); enduseful = 0; searchloop: while (i_offset < endsearch) { /* * If necessary, get the next directory block. */ if ((i_offset & bmask) == 0) { if (bp != NULL) brelse(bp); error = UFS_BLKATOFF(vdp, (off_t)i_offset, NULL, &bp); if (error) return (error); entryoffsetinblock = 0; } /* * If still looking for a slot, and at a DIRBLKSIZE * boundary, have to start looking for free space again. */ if (slotstatus == NONE && (entryoffsetinblock & (DIRBLKSIZ - 1)) == 0) { slotoffset = -1; slotfreespace = 0; } /* * Get pointer to next entry. * Full validation checks are slow, so we only check * enough to insure forward progress through the * directory. Complete checks can be run by patching * "dirchk" to be true. */ ep = (struct direct *)((char *)bp->b_data + entryoffsetinblock); if (ep->d_reclen == 0 || ep->d_reclen > DIRBLKSIZ - (entryoffsetinblock & (DIRBLKSIZ - 1)) || (dirchk && ufs_dirbadentry(vdp, ep, entryoffsetinblock))) { int i; ufs_dirbad(dp, i_offset, "mangled entry"); i = DIRBLKSIZ - (entryoffsetinblock & (DIRBLKSIZ - 1)); i_offset += i; entryoffsetinblock += i; continue; } /* * If an appropriate sized slot has not yet been found, * check to see if one is available. Also accumulate space * in the current block so that we can determine if * compaction is viable. */ if (slotstatus != FOUND) { int size = ep->d_reclen; if (ep->d_ino != 0) size -= DIRSIZ(OFSFMT(vdp), ep); if (size > 0) { if (size >= slotneeded) { slotstatus = FOUND; slotoffset = i_offset; slotsize = ep->d_reclen; } else if (slotstatus == NONE) { slotfreespace += size; if (slotoffset == -1) slotoffset = i_offset; if (slotfreespace >= slotneeded) { slotstatus = COMPACT; slotsize = i_offset + ep->d_reclen - slotoffset; } } } } /* * Check for a name match. */ if (ep->d_ino) { # if (BYTE_ORDER == LITTLE_ENDIAN) if (OFSFMT(vdp)) namlen = ep->d_type; else namlen = ep->d_namlen; # else namlen = ep->d_namlen; # endif if (namlen == cnp->cn_namelen && (cnp->cn_nameptr[0] == ep->d_name[0]) && !bcmp(cnp->cn_nameptr, ep->d_name, (unsigned)namlen)) { #ifdef UFS_DIRHASH foundentry: #endif /* * Save directory entry's inode number and * reclen in ndp->ni_ufs area, and release * directory buffer. */ if (vdp->v_mount->mnt_maxsymlinklen > 0 && ep->d_type == DT_WHT) { slotstatus = FOUND; slotoffset = i_offset; slotsize = ep->d_reclen; enduseful = dp->i_size; cnp->cn_flags |= ISWHITEOUT; numdirpasses--; goto notfound; } ino = ep->d_ino; goto found; } } prevoff = i_offset; i_offset += ep->d_reclen; entryoffsetinblock += ep->d_reclen; if (ep->d_ino) enduseful = i_offset; } notfound: /* * If we started in the middle of the directory and failed * to find our target, we must check the beginning as well. */ if (numdirpasses == 2) { numdirpasses--; i_offset = 0; endsearch = i_diroff; goto searchloop; } if (bp != NULL) brelse(bp); /* * If creating, and at end of pathname and current * directory has not been removed, then can consider * allowing file to be created. */ if ((nameiop == CREATE || nameiop == RENAME || (nameiop == DELETE && (cnp->cn_flags & DOWHITEOUT) && (cnp->cn_flags & ISWHITEOUT))) && (flags & ISLASTCN) && dp->i_effnlink != 0) { /* * Access for write is interpreted as allowing * creation of files in the directory. * * XXX: Fix the comment above. */ if (flags & WILLBEDIR) error = VOP_ACCESSX(vdp, VWRITE | VAPPEND, cred, cnp->cn_thread); else error = VOP_ACCESS(vdp, VWRITE, cred, cnp->cn_thread); if (error) return (error); /* * Return an indication of where the new directory * entry should be put. If we didn't find a slot, * then set dp->i_count to 0 indicating * that the new slot belongs at the end of the * directory. If we found a slot, then the new entry * can be put in the range from dp->i_offset to * dp->i_offset + dp->i_count. */ if (slotstatus == NONE) { dp->i_offset = roundup2(dp->i_size, DIRBLKSIZ); dp->i_count = 0; enduseful = dp->i_offset; } else if (nameiop == DELETE) { dp->i_offset = slotoffset; if ((dp->i_offset & (DIRBLKSIZ - 1)) == 0) dp->i_count = 0; else dp->i_count = dp->i_offset - prevoff; } else { dp->i_offset = slotoffset; dp->i_count = slotsize; if (enduseful < slotoffset + slotsize) enduseful = slotoffset + slotsize; } dp->i_endoff = roundup2(enduseful, DIRBLKSIZ); /* * We return with the directory locked, so that * the parameters we set up above will still be * valid if we actually decide to do a direnter(). * We return ni_vp == NULL to indicate that the entry * does not currently exist; we leave a pointer to * the (locked) directory inode in ndp->ni_dvp. * The pathname buffer is saved so that the name * can be obtained later. * * NB - if the directory is unlocked, then this * information cannot be used. */ cnp->cn_flags |= SAVENAME; return (EJUSTRETURN); } /* * Insert name into cache (as non-existent) if appropriate. */ if ((cnp->cn_flags & MAKEENTRY) != 0) cache_enter(vdp, NULL, cnp); return (ENOENT); found: if (dd_ino != NULL) *dd_ino = ino; if (numdirpasses == 2) nchstats.ncs_pass2++; /* * Check that directory length properly reflects presence * of this entry. */ if (i_offset + DIRSIZ(OFSFMT(vdp), ep) > dp->i_size) { ufs_dirbad(dp, i_offset, "i_size too small"); dp->i_size = i_offset + DIRSIZ(OFSFMT(vdp), ep); DIP_SET(dp, i_size, dp->i_size); dp->i_flag |= IN_CHANGE | IN_UPDATE; } brelse(bp); /* * Found component in pathname. * If the final component of path name, save information * in the cache as to where the entry was found. */ if ((flags & ISLASTCN) && nameiop == LOOKUP) dp->i_diroff = rounddown2(i_offset, DIRBLKSIZ); /* * If deleting, and at end of pathname, return * parameters which can be used to remove file. */ if (nameiop == DELETE && (flags & ISLASTCN)) { if (flags & LOCKPARENT) ASSERT_VOP_ELOCKED(vdp, __FUNCTION__); /* * Return pointer to current entry in dp->i_offset, * and distance past previous entry (if there * is a previous entry in this block) in dp->i_count. * Save directory inode pointer in ndp->ni_dvp for dirremove(). * * Technically we shouldn't be setting these in the * WANTPARENT case (first lookup in rename()), but any * lookups that will result in directory changes will * overwrite these. */ dp->i_offset = i_offset; if ((dp->i_offset & (DIRBLKSIZ - 1)) == 0) dp->i_count = 0; else dp->i_count = dp->i_offset - prevoff; if (dd_ino != NULL) return (0); if ((error = VFS_VGET(vdp->v_mount, ino, LK_EXCLUSIVE, &tdp)) != 0) return (error); error = ufs_delete_denied(vdp, tdp, cred, cnp->cn_thread); if (error) { vput(tdp); return (error); } if (dp->i_number == ino) { VREF(vdp); *vpp = vdp; vput(tdp); return (0); } *vpp = tdp; return (0); } /* * If rewriting (RENAME), return the inode and the * information required to rewrite the present directory * Must get inode of directory entry to verify it's a * regular file, or empty directory. */ if (nameiop == RENAME && (flags & ISLASTCN)) { if (flags & WILLBEDIR) error = VOP_ACCESSX(vdp, VWRITE | VAPPEND, cred, cnp->cn_thread); else error = VOP_ACCESS(vdp, VWRITE, cred, cnp->cn_thread); if (error) return (error); /* * Careful about locking second inode. * This can only occur if the target is ".". */ dp->i_offset = i_offset; if (dp->i_number == ino) return (EISDIR); if (dd_ino != NULL) return (0); if ((error = VFS_VGET(vdp->v_mount, ino, LK_EXCLUSIVE, &tdp)) != 0) return (error); error = ufs_delete_denied(vdp, tdp, cred, cnp->cn_thread); if (error) { vput(tdp); return (error); } #ifdef SunOS_doesnt_do_that /* * The only purpose of this check is to return the correct * error. Assume that we want to rename directory "a" * to a file "b", and that we have no ACL_WRITE_DATA on * a containing directory, but we _do_ have ACL_APPEND_DATA. * In that case, the VOP_ACCESS check above will return 0, * and the operation will fail with ENOTDIR instead * of EACCESS. */ if (tdp->v_type == VDIR) error = VOP_ACCESSX(vdp, VWRITE | VAPPEND, cred, cnp->cn_thread); else error = VOP_ACCESS(vdp, VWRITE, cred, cnp->cn_thread); if (error) { vput(tdp); return (error); } #endif *vpp = tdp; cnp->cn_flags |= SAVENAME; return (0); } if (dd_ino != NULL) return (0); /* * Step through the translation in the name. We do not `vput' the * directory because we may need it again if a symbolic link * is relative to the current directory. Instead we save it * unlocked as "pdp". We must get the target inode before unlocking * the directory to insure that the inode will not be removed * before we get it. We prevent deadlock by always fetching * inodes from the root, moving down the directory tree. Thus * when following backward pointers ".." we must unlock the * parent directory before getting the requested directory. * There is a potential race condition here if both the current * and parent directories are removed before the VFS_VGET for the * inode associated with ".." returns. We hope that this occurs * infrequently since we cannot avoid this race condition without * implementing a sophisticated deadlock detection algorithm. * Note also that this simple deadlock detection scheme will not * work if the filesystem has any hard links other than ".." * that point backwards in the directory structure. */ pdp = vdp; if (flags & ISDOTDOT) { error = vn_vget_ino(pdp, ino, cnp->cn_lkflags, &tdp); if (error) return (error); /* * Recheck that ".." entry in the vdp directory points * to the inode we looked up before vdp lock was * dropped. */ error = ufs_lookup_ino(pdp, NULL, cnp, &ino1); if (error) { vput(tdp); return (error); } if (ino1 != ino) { vput(tdp); goto restart; } *vpp = tdp; } else if (dp->i_number == ino) { VREF(vdp); /* we want ourself, ie "." */ /* * When we lookup "." we still can be asked to lock it * differently. */ ltype = cnp->cn_lkflags & LK_TYPE_MASK; if (ltype != VOP_ISLOCKED(vdp)) { if (ltype == LK_EXCLUSIVE) vn_lock(vdp, LK_UPGRADE | LK_RETRY); else /* if (ltype == LK_SHARED) */ vn_lock(vdp, LK_DOWNGRADE | LK_RETRY); /* * Relock for the "." case may left us with * reclaimed vnode. */ if (vdp->v_iflag & VI_DOOMED) { vrele(vdp); return (ENOENT); } } *vpp = vdp; } else { error = VFS_VGET(pdp->v_mount, ino, cnp->cn_lkflags, &tdp); if (error) return (error); *vpp = tdp; } /* * Insert name into cache if appropriate. */ if (cnp->cn_flags & MAKEENTRY) cache_enter(vdp, *vpp, cnp); return (0); } void ufs_dirbad(ip, offset, how) struct inode *ip; doff_t offset; char *how; { struct mount *mp; mp = ITOV(ip)->v_mount; if ((mp->mnt_flag & MNT_RDONLY) == 0) panic("ufs_dirbad: %s: bad dir ino %ju at offset %ld: %s", mp->mnt_stat.f_mntonname, (uintmax_t)ip->i_number, (long)offset, how); else (void)printf("%s: bad dir ino %ju at offset %ld: %s\n", mp->mnt_stat.f_mntonname, (uintmax_t)ip->i_number, (long)offset, how); } /* * Do consistency checking on a directory entry: * record length must be multiple of 4 * entry must fit in rest of its DIRBLKSIZ block * record must be large enough to contain entry * name is not longer than MAXNAMLEN * name must be as long as advertised, and null terminated */ int ufs_dirbadentry(dp, ep, entryoffsetinblock) struct vnode *dp; struct direct *ep; int entryoffsetinblock; { int i, namlen; # if (BYTE_ORDER == LITTLE_ENDIAN) if (OFSFMT(dp)) namlen = ep->d_type; else namlen = ep->d_namlen; # else namlen = ep->d_namlen; # endif if ((ep->d_reclen & 0x3) != 0 || ep->d_reclen > DIRBLKSIZ - (entryoffsetinblock & (DIRBLKSIZ - 1)) || ep->d_reclen < DIRSIZ(OFSFMT(dp), ep) || namlen > MAXNAMLEN) { /*return (1); */ printf("First bad\n"); goto bad; } if (ep->d_ino == 0) return (0); for (i = 0; i < namlen; i++) if (ep->d_name[i] == '\0') { /*return (1); */ printf("Second bad\n"); goto bad; } if (ep->d_name[i]) goto bad; return (0); bad: return (1); } /* * Construct a new directory entry after a call to namei, using the * parameters that it left in the componentname argument cnp. The * argument ip is the inode to which the new directory entry will refer. */ void ufs_makedirentry(ip, cnp, newdirp) struct inode *ip; struct componentname *cnp; struct direct *newdirp; { #ifdef INVARIANTS if ((cnp->cn_flags & SAVENAME) == 0) panic("ufs_makedirentry: missing name"); #endif newdirp->d_ino = ip->i_number; newdirp->d_namlen = cnp->cn_namelen; bcopy(cnp->cn_nameptr, newdirp->d_name, (unsigned)cnp->cn_namelen + 1); if (ITOV(ip)->v_mount->mnt_maxsymlinklen > 0) newdirp->d_type = IFTODT(ip->i_mode); else { newdirp->d_type = 0; # if (BYTE_ORDER == LITTLE_ENDIAN) { u_char tmp = newdirp->d_namlen; newdirp->d_namlen = newdirp->d_type; newdirp->d_type = tmp; } # endif } } /* * Write a directory entry after a call to namei, using the parameters * that it left in nameidata. The argument dirp is the new directory * entry contents. Dvp is a pointer to the directory to be written, * which was left locked by namei. Remaining parameters (dp->i_offset, * dp->i_count) indicate how the space for the new entry is to be obtained. * Non-null bp indicates that a directory is being created (for the * soft dependency code). */ int ufs_direnter(dvp, tvp, dirp, cnp, newdirbp, isrename) struct vnode *dvp; struct vnode *tvp; struct direct *dirp; struct componentname *cnp; struct buf *newdirbp; int isrename; { struct ucred *cr; struct thread *td; int newentrysize; struct inode *dp; struct buf *bp; u_int dsize; struct direct *ep, *nep; + u_int64_t old_isize; int error, ret, blkoff, loc, spacefree, flags, namlen; char *dirbuf; td = curthread; /* XXX */ cr = td->td_ucred; dp = VTOI(dvp); newentrysize = DIRSIZ(OFSFMT(dvp), dirp); if (dp->i_count == 0) { /* * If dp->i_count is 0, then namei could find no * space in the directory. Here, dp->i_offset will * be on a directory block boundary and we will write the * new entry into a fresh block. */ if (dp->i_offset & (DIRBLKSIZ - 1)) panic("ufs_direnter: newblk"); flags = BA_CLRBUF; if (!DOINGSOFTDEP(dvp) && !DOINGASYNC(dvp)) flags |= IO_SYNC; #ifdef QUOTA if ((error = getinoquota(dp)) != 0) { if (DOINGSOFTDEP(dvp) && newdirbp != NULL) bdwrite(newdirbp); return (error); } #endif + old_isize = dp->i_size; + vnode_pager_setsize(dvp, (u_long)dp->i_offset + DIRBLKSIZ); if ((error = UFS_BALLOC(dvp, (off_t)dp->i_offset, DIRBLKSIZ, cr, flags, &bp)) != 0) { if (DOINGSOFTDEP(dvp) && newdirbp != NULL) bdwrite(newdirbp); + vnode_pager_setsize(dvp, (u_long)old_isize); return (error); } dp->i_size = dp->i_offset + DIRBLKSIZ; DIP_SET(dp, i_size, dp->i_size); dp->i_flag |= IN_CHANGE | IN_UPDATE; - vnode_pager_setsize(dvp, (u_long)dp->i_size); dirp->d_reclen = DIRBLKSIZ; blkoff = dp->i_offset & (VFSTOUFS(dvp->v_mount)->um_mountp->mnt_stat.f_iosize - 1); bcopy((caddr_t)dirp, (caddr_t)bp->b_data + blkoff,newentrysize); #ifdef UFS_DIRHASH if (dp->i_dirhash != NULL) { ufsdirhash_newblk(dp, dp->i_offset); ufsdirhash_add(dp, dirp, dp->i_offset); ufsdirhash_checkblock(dp, (char *)bp->b_data + blkoff, dp->i_offset); } #endif if (DOINGSOFTDEP(dvp)) { /* * Ensure that the entire newly allocated block is a * valid directory so that future growth within the * block does not have to ensure that the block is * written before the inode. */ blkoff += DIRBLKSIZ; while (blkoff < bp->b_bcount) { ((struct direct *) (bp->b_data + blkoff))->d_reclen = DIRBLKSIZ; blkoff += DIRBLKSIZ; } if (softdep_setup_directory_add(bp, dp, dp->i_offset, dirp->d_ino, newdirbp, 1)) dp->i_flag |= IN_NEEDSYNC; if (newdirbp) bdwrite(newdirbp); bdwrite(bp); if ((dp->i_flag & IN_NEEDSYNC) == 0) return (UFS_UPDATE(dvp, 0)); /* * We have just allocated a directory block in an * indirect block. We must prevent holes in the * directory created if directory entries are * written out of order. To accomplish this we * fsync when we extend a directory into indirects. * During rename it's not safe to drop the tvp lock * so sync must be delayed until it is. * * This synchronous step could be removed if fsck and * the kernel were taught to fill in sparse * directories rather than panic. */ if (isrename) return (0); if (tvp != NULL) VOP_UNLOCK(tvp, 0); (void) VOP_FSYNC(dvp, MNT_WAIT, td); if (tvp != NULL) vn_lock(tvp, LK_EXCLUSIVE | LK_RETRY); return (error); } if (DOINGASYNC(dvp)) { bdwrite(bp); return (UFS_UPDATE(dvp, 0)); } error = bwrite(bp); ret = UFS_UPDATE(dvp, 1); if (error == 0) return (ret); return (error); } /* * If dp->i_count is non-zero, then namei found space for the new * entry in the range dp->i_offset to dp->i_offset + dp->i_count * in the directory. To use this space, we may have to compact * the entries located there, by copying them together towards the * beginning of the block, leaving the free space in one usable * chunk at the end. */ /* * Increase size of directory if entry eats into new space. * This should never push the size past a new multiple of * DIRBLKSIZE. * * N.B. - THIS IS AN ARTIFACT OF 4.2 AND SHOULD NEVER HAPPEN. */ if (dp->i_offset + dp->i_count > dp->i_size) { dp->i_size = dp->i_offset + dp->i_count; DIP_SET(dp, i_size, dp->i_size); } /* * Get the block containing the space for the new directory entry. */ error = UFS_BLKATOFF(dvp, (off_t)dp->i_offset, &dirbuf, &bp); if (error) { if (DOINGSOFTDEP(dvp) && newdirbp != NULL) bdwrite(newdirbp); return (error); } /* * Find space for the new entry. In the simple case, the entry at * offset base will have the space. If it does not, then namei * arranged that compacting the region dp->i_offset to * dp->i_offset + dp->i_count would yield the space. */ ep = (struct direct *)dirbuf; dsize = ep->d_ino ? DIRSIZ(OFSFMT(dvp), ep) : 0; spacefree = ep->d_reclen - dsize; for (loc = ep->d_reclen; loc < dp->i_count; ) { nep = (struct direct *)(dirbuf + loc); /* Trim the existing slot (NB: dsize may be zero). */ ep->d_reclen = dsize; ep = (struct direct *)((char *)ep + dsize); /* Read nep->d_reclen now as the bcopy() may clobber it. */ loc += nep->d_reclen; if (nep->d_ino == 0) { /* * A mid-block unused entry. Such entries are * never created by the kernel, but fsck_ffs * can create them (and it doesn't fix them). * * Add up the free space, and initialise the * relocated entry since we don't bcopy it. */ spacefree += nep->d_reclen; ep->d_ino = 0; dsize = 0; continue; } dsize = DIRSIZ(OFSFMT(dvp), nep); spacefree += nep->d_reclen - dsize; #ifdef UFS_DIRHASH if (dp->i_dirhash != NULL) ufsdirhash_move(dp, nep, dp->i_offset + ((char *)nep - dirbuf), dp->i_offset + ((char *)ep - dirbuf)); #endif if (DOINGSOFTDEP(dvp)) softdep_change_directoryentry_offset(bp, dp, dirbuf, (caddr_t)nep, (caddr_t)ep, dsize); else bcopy((caddr_t)nep, (caddr_t)ep, dsize); } /* * Here, `ep' points to a directory entry containing `dsize' in-use * bytes followed by `spacefree' unused bytes. If ep->d_ino == 0, * then the entry is completely unused (dsize == 0). The value * of ep->d_reclen is always indeterminate. * * Update the pointer fields in the previous entry (if any), * copy in the new entry, and write out the block. */ # if (BYTE_ORDER == LITTLE_ENDIAN) if (OFSFMT(dvp)) namlen = ep->d_type; else namlen = ep->d_namlen; # else namlen = ep->d_namlen; # endif if (ep->d_ino == 0 || (ep->d_ino == WINO && namlen == dirp->d_namlen && bcmp(ep->d_name, dirp->d_name, dirp->d_namlen) == 0)) { if (spacefree + dsize < newentrysize) panic("ufs_direnter: compact1"); dirp->d_reclen = spacefree + dsize; } else { if (spacefree < newentrysize) panic("ufs_direnter: compact2"); dirp->d_reclen = spacefree; ep->d_reclen = dsize; ep = (struct direct *)((char *)ep + dsize); } #ifdef UFS_DIRHASH if (dp->i_dirhash != NULL && (ep->d_ino == 0 || dirp->d_reclen == spacefree)) ufsdirhash_add(dp, dirp, dp->i_offset + ((char *)ep - dirbuf)); #endif bcopy((caddr_t)dirp, (caddr_t)ep, (u_int)newentrysize); #ifdef UFS_DIRHASH if (dp->i_dirhash != NULL) ufsdirhash_checkblock(dp, dirbuf - (dp->i_offset & (DIRBLKSIZ - 1)), rounddown2(dp->i_offset, DIRBLKSIZ)); #endif if (DOINGSOFTDEP(dvp)) { (void) softdep_setup_directory_add(bp, dp, dp->i_offset + (caddr_t)ep - dirbuf, dirp->d_ino, newdirbp, 0); if (newdirbp != NULL) bdwrite(newdirbp); bdwrite(bp); } else { if (DOINGASYNC(dvp)) { bdwrite(bp); error = 0; } else { error = bwrite(bp); } } dp->i_flag |= IN_CHANGE | IN_UPDATE; /* * If all went well, and the directory can be shortened, proceed * with the truncation. Note that we have to unlock the inode for * the entry that we just entered, as the truncation may need to * lock other inodes which can lead to deadlock if we also hold a * lock on the newly entered node. */ if (isrename == 0 && error == 0 && dp->i_endoff && dp->i_endoff < dp->i_size) { if (tvp != NULL) VOP_UNLOCK(tvp, 0); error = UFS_TRUNCATE(dvp, (off_t)dp->i_endoff, IO_NORMAL | (DOINGASYNC(dvp) ? 0 : IO_SYNC), cr); if (error != 0) vprint("ufs_direnter: failed to truncate", dvp); #ifdef UFS_DIRHASH if (error == 0 && dp->i_dirhash != NULL) ufsdirhash_dirtrunc(dp, dp->i_endoff); #endif error = 0; if (tvp != NULL) vn_lock(tvp, LK_EXCLUSIVE | LK_RETRY); } return (error); } /* * Remove a directory entry after a call to namei, using * the parameters which it left in nameidata. The entry * dp->i_offset contains the offset into the directory of the * entry to be eliminated. The dp->i_count field contains the * size of the previous record in the directory. If this * is 0, the first entry is being deleted, so we need only * zero the inode number to mark the entry as free. If the * entry is not the first in the directory, we must reclaim * the space of the now empty record by adding the record size * to the size of the previous entry. */ int ufs_dirremove(dvp, ip, flags, isrmdir) struct vnode *dvp; struct inode *ip; int flags; int isrmdir; { struct inode *dp; struct direct *ep, *rep; struct buf *bp; int error; dp = VTOI(dvp); /* * Adjust the link count early so softdep can block if necessary. */ if (ip) { ip->i_effnlink--; if (DOINGSOFTDEP(dvp)) { softdep_setup_unlink(dp, ip); } else { ip->i_nlink--; DIP_SET(ip, i_nlink, ip->i_nlink); ip->i_flag |= IN_CHANGE; } } if (flags & DOWHITEOUT) { /* * Whiteout entry: set d_ino to WINO. */ if ((error = UFS_BLKATOFF(dvp, (off_t)dp->i_offset, (char **)&ep, &bp)) != 0) return (error); ep->d_ino = WINO; ep->d_type = DT_WHT; goto out; } if ((error = UFS_BLKATOFF(dvp, (off_t)(dp->i_offset - dp->i_count), (char **)&ep, &bp)) != 0) return (error); /* Set 'rep' to the entry being removed. */ if (dp->i_count == 0) rep = ep; else rep = (struct direct *)((char *)ep + ep->d_reclen); #ifdef UFS_DIRHASH /* * Remove the dirhash entry. This is complicated by the fact * that `ep' is the previous entry when dp->i_count != 0. */ if (dp->i_dirhash != NULL) ufsdirhash_remove(dp, rep, dp->i_offset); #endif if (ip && rep->d_ino != ip->i_number) panic("ufs_dirremove: ip %ju does not match dirent ino %ju\n", (uintmax_t)ip->i_number, (uintmax_t)rep->d_ino); if (dp->i_count == 0) { /* * First entry in block: set d_ino to zero. */ ep->d_ino = 0; } else { /* * Collapse new free space into previous entry. */ ep->d_reclen += rep->d_reclen; } #ifdef UFS_DIRHASH if (dp->i_dirhash != NULL) ufsdirhash_checkblock(dp, (char *)ep - ((dp->i_offset - dp->i_count) & (DIRBLKSIZ - 1)), rounddown2(dp->i_offset, DIRBLKSIZ)); #endif out: error = 0; if (DOINGSOFTDEP(dvp)) { if (ip) softdep_setup_remove(bp, dp, ip, isrmdir); if (softdep_slowdown(dvp)) error = bwrite(bp); else bdwrite(bp); } else { if (flags & DOWHITEOUT) error = bwrite(bp); else if (DOINGASYNC(dvp) && dp->i_count != 0) bdwrite(bp); else error = bwrite(bp); } dp->i_flag |= IN_CHANGE | IN_UPDATE; /* * If the last named reference to a snapshot goes away, * drop its snapshot reference so that it will be reclaimed * when last open reference goes away. */ if (ip != NULL && (ip->i_flags & SF_SNAPSHOT) != 0 && ip->i_effnlink == 0) UFS_SNAPGONE(ip); return (error); } /* * Rewrite an existing directory entry to point at the inode * supplied. The parameters describing the directory entry are * set up by a call to namei. */ int ufs_dirrewrite(dp, oip, newinum, newtype, isrmdir) struct inode *dp, *oip; ino_t newinum; int newtype; int isrmdir; { struct buf *bp; struct direct *ep; struct vnode *vdp = ITOV(dp); int error; /* * Drop the link before we lock the buf so softdep can block if * necessary. */ oip->i_effnlink--; if (DOINGSOFTDEP(vdp)) { softdep_setup_unlink(dp, oip); } else { oip->i_nlink--; DIP_SET(oip, i_nlink, oip->i_nlink); oip->i_flag |= IN_CHANGE; } error = UFS_BLKATOFF(vdp, (off_t)dp->i_offset, (char **)&ep, &bp); if (error) return (error); if (ep->d_namlen == 2 && ep->d_name[1] == '.' && ep->d_name[0] == '.' && ep->d_ino != oip->i_number) { brelse(bp); return (EIDRM); } ep->d_ino = newinum; if (!OFSFMT(vdp)) ep->d_type = newtype; if (DOINGSOFTDEP(vdp)) { softdep_setup_directory_change(bp, dp, oip, newinum, isrmdir); bdwrite(bp); } else { if (DOINGASYNC(vdp)) { bdwrite(bp); error = 0; } else { error = bwrite(bp); } } dp->i_flag |= IN_CHANGE | IN_UPDATE; /* * If the last named reference to a snapshot goes away, * drop its snapshot reference so that it will be reclaimed * when last open reference goes away. */ if ((oip->i_flags & SF_SNAPSHOT) != 0 && oip->i_effnlink == 0) UFS_SNAPGONE(oip); return (error); } /* * Check if a directory is empty or not. * Inode supplied must be locked. * * Using a struct dirtemplate here is not precisely * what we want, but better than using a struct direct. * * NB: does not handle corrupted directories. */ int ufs_dirempty(ip, parentino, cred) struct inode *ip; ino_t parentino; struct ucred *cred; { doff_t off; struct dirtemplate dbuf; struct direct *dp = (struct direct *)&dbuf; int error, namlen; ssize_t count; #define MINDIRSIZ (sizeof (struct dirtemplate) / 2) for (off = 0; off < ip->i_size; off += dp->d_reclen) { error = vn_rdwr(UIO_READ, ITOV(ip), (caddr_t)dp, MINDIRSIZ, off, UIO_SYSSPACE, IO_NODELOCKED | IO_NOMACCHECK, cred, NOCRED, &count, (struct thread *)0); /* * Since we read MINDIRSIZ, residual must * be 0 unless we're at end of file. */ if (error || count != 0) return (0); /* avoid infinite loops */ if (dp->d_reclen == 0) return (0); /* skip empty entries */ if (dp->d_ino == 0 || dp->d_ino == WINO) continue; /* accept only "." and ".." */ # if (BYTE_ORDER == LITTLE_ENDIAN) if (OFSFMT(ITOV(ip))) namlen = dp->d_type; else namlen = dp->d_namlen; # else namlen = dp->d_namlen; # endif if (namlen > 2) return (0); if (dp->d_name[0] != '.') return (0); /* * At this point namlen must be 1 or 2. * 1 implies ".", 2 implies ".." if second * char is also "." */ if (namlen == 1 && dp->d_ino == ip->i_number) continue; if (dp->d_name[1] == '.' && dp->d_ino == parentino) continue; return (0); } return (1); } static int ufs_dir_dd_ino(struct vnode *vp, struct ucred *cred, ino_t *dd_ino, struct vnode **dd_vp) { struct dirtemplate dirbuf; struct vnode *ddvp; int error, namlen; ASSERT_VOP_LOCKED(vp, "ufs_dir_dd_ino"); if (vp->v_type != VDIR) return (ENOTDIR); /* * First check to see if we have it in the name cache. */ if ((ddvp = vn_dir_dd_ino(vp)) != NULL) { KASSERT(ddvp->v_mount == vp->v_mount, ("ufs_dir_dd_ino: Unexpected mount point crossing")); *dd_ino = VTOI(ddvp)->i_number; *dd_vp = ddvp; return (0); } /* * Have to read the directory. */ error = vn_rdwr(UIO_READ, vp, (caddr_t)&dirbuf, sizeof (struct dirtemplate), (off_t)0, UIO_SYSSPACE, IO_NODELOCKED | IO_NOMACCHECK, cred, NOCRED, NULL, NULL); if (error != 0) return (error); #if (BYTE_ORDER == LITTLE_ENDIAN) if (OFSFMT(vp)) namlen = dirbuf.dotdot_type; else namlen = dirbuf.dotdot_namlen; #else namlen = dirbuf.dotdot_namlen; #endif if (namlen != 2 || dirbuf.dotdot_name[0] != '.' || dirbuf.dotdot_name[1] != '.') return (ENOTDIR); *dd_ino = dirbuf.dotdot_ino; *dd_vp = NULL; return (0); } /* * Check if source directory is in the path of the target directory. */ int ufs_checkpath(ino_t source_ino, ino_t parent_ino, struct inode *target, struct ucred *cred, ino_t *wait_ino) { struct mount *mp; struct vnode *tvp, *vp, *vp1; int error; ino_t dd_ino; vp = tvp = ITOV(target); mp = vp->v_mount; *wait_ino = 0; if (target->i_number == source_ino) return (EEXIST); if (target->i_number == parent_ino) return (0); if (target->i_number == ROOTINO) return (0); for (;;) { error = ufs_dir_dd_ino(vp, cred, &dd_ino, &vp1); if (error != 0) break; if (dd_ino == source_ino) { error = EINVAL; break; } if (dd_ino == ROOTINO) break; if (dd_ino == parent_ino) break; if (vp1 == NULL) { error = VFS_VGET(mp, dd_ino, LK_SHARED | LK_NOWAIT, &vp1); if (error != 0) { *wait_ino = dd_ino; break; } } KASSERT(dd_ino == VTOI(vp1)->i_number, ("directory %ju reparented\n", (uintmax_t)VTOI(vp1)->i_number)); if (vp != tvp) vput(vp); vp = vp1; } if (error == ENOTDIR) panic("checkpath: .. not a directory\n"); if (vp1 != NULL) vput(vp1); if (vp != tvp) vput(vp); return (error); } Index: user/alc/PQ_LAUNDRY/sys/ufs/ufs/ufs_vnops.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/ufs/ufs/ufs_vnops.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/ufs/ufs/ufs_vnops.c (revision 303206) @@ -1,2810 +1,2810 @@ /*- * Copyright (c) 1982, 1986, 1989, 1993, 1995 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)ufs_vnops.c 8.27 (Berkeley) 5/27/95 */ #include __FBSDID("$FreeBSD$"); #include "opt_quota.h" #include "opt_suiddir.h" #include "opt_ufs.h" #include "opt_ffs.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* XXX */ #include #include #include #include #include #include #include #include #include #ifdef UFS_DIRHASH #include #endif #ifdef UFS_GJOURNAL #include FEATURE(ufs_gjournal, "Journaling support through GEOM for UFS"); #endif #ifdef QUOTA FEATURE(ufs_quota, "UFS disk quotas support"); FEATURE(ufs_quota64, "64bit UFS disk quotas support"); #endif #ifdef SUIDDIR FEATURE(suiddir, "Give all new files in directory the same ownership as the directory"); #endif #include static vop_accessx_t ufs_accessx; static int ufs_chmod(struct vnode *, int, struct ucred *, struct thread *); static int ufs_chown(struct vnode *, uid_t, gid_t, struct ucred *, struct thread *); static vop_close_t ufs_close; static vop_create_t ufs_create; static vop_getattr_t ufs_getattr; static vop_ioctl_t ufs_ioctl; static vop_link_t ufs_link; static int ufs_makeinode(int mode, struct vnode *, struct vnode **, struct componentname *); static vop_markatime_t ufs_markatime; static vop_mkdir_t ufs_mkdir; static vop_mknod_t ufs_mknod; static vop_open_t ufs_open; static vop_pathconf_t ufs_pathconf; static vop_print_t ufs_print; static vop_readlink_t ufs_readlink; static vop_remove_t ufs_remove; static vop_rename_t ufs_rename; static vop_rmdir_t ufs_rmdir; static vop_setattr_t ufs_setattr; static vop_strategy_t ufs_strategy; static vop_symlink_t ufs_symlink; static vop_whiteout_t ufs_whiteout; static vop_close_t ufsfifo_close; static vop_kqfilter_t ufsfifo_kqfilter; static vop_pathconf_t ufsfifo_pathconf; SYSCTL_NODE(_vfs, OID_AUTO, ufs, CTLFLAG_RD, 0, "UFS filesystem"); /* * A virgin directory (no blushing please). */ static struct dirtemplate mastertemplate = { 0, 12, DT_DIR, 1, ".", 0, DIRBLKSIZ - 12, DT_DIR, 2, ".." }; static struct odirtemplate omastertemplate = { 0, 12, 1, ".", 0, DIRBLKSIZ - 12, 2, ".." }; static void ufs_itimes_locked(struct vnode *vp) { struct inode *ip; struct timespec ts; ASSERT_VI_LOCKED(vp, __func__); ip = VTOI(vp); if (UFS_RDONLY(ip)) goto out; if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0) return; if ((vp->v_type == VBLK || vp->v_type == VCHR) && !DOINGSOFTDEP(vp)) ip->i_flag |= IN_LAZYMOD; else if (((vp->v_mount->mnt_kern_flag & (MNTK_SUSPENDED | MNTK_SUSPEND)) == 0) || (ip->i_flag & (IN_CHANGE | IN_UPDATE))) ip->i_flag |= IN_MODIFIED; else if (ip->i_flag & IN_ACCESS) ip->i_flag |= IN_LAZYACCESS; vfs_timestamp(&ts); if (ip->i_flag & IN_ACCESS) { DIP_SET(ip, i_atime, ts.tv_sec); DIP_SET(ip, i_atimensec, ts.tv_nsec); } if (ip->i_flag & IN_UPDATE) { DIP_SET(ip, i_mtime, ts.tv_sec); DIP_SET(ip, i_mtimensec, ts.tv_nsec); } if (ip->i_flag & IN_CHANGE) { DIP_SET(ip, i_ctime, ts.tv_sec); DIP_SET(ip, i_ctimensec, ts.tv_nsec); DIP_SET(ip, i_modrev, DIP(ip, i_modrev) + 1); } out: ip->i_flag &= ~(IN_ACCESS | IN_CHANGE | IN_UPDATE); } void ufs_itimes(struct vnode *vp) { VI_LOCK(vp); ufs_itimes_locked(vp); VI_UNLOCK(vp); } /* * Create a regular file */ static int ufs_create(ap) struct vop_create_args /* { struct vnode *a_dvp; struct vnode **a_vpp; struct componentname *a_cnp; struct vattr *a_vap; } */ *ap; { int error; error = ufs_makeinode(MAKEIMODE(ap->a_vap->va_type, ap->a_vap->va_mode), ap->a_dvp, ap->a_vpp, ap->a_cnp); if (error != 0) return (error); if ((ap->a_cnp->cn_flags & MAKEENTRY) != 0) cache_enter(ap->a_dvp, *ap->a_vpp, ap->a_cnp); return (0); } /* * Mknod vnode call */ /* ARGSUSED */ static int ufs_mknod(ap) struct vop_mknod_args /* { struct vnode *a_dvp; struct vnode **a_vpp; struct componentname *a_cnp; struct vattr *a_vap; } */ *ap; { struct vattr *vap = ap->a_vap; struct vnode **vpp = ap->a_vpp; struct inode *ip; ino_t ino; int error; error = ufs_makeinode(MAKEIMODE(vap->va_type, vap->va_mode), ap->a_dvp, vpp, ap->a_cnp); if (error) return (error); ip = VTOI(*vpp); ip->i_flag |= IN_ACCESS | IN_CHANGE | IN_UPDATE; if (vap->va_rdev != VNOVAL) { /* * Want to be able to use this to make badblock * inodes, so don't truncate the dev number. */ DIP_SET(ip, i_rdev, vap->va_rdev); } /* * Remove inode, then reload it through VFS_VGET so it is * checked to see if it is an alias of an existing entry in * the inode cache. XXX I don't believe this is necessary now. */ (*vpp)->v_type = VNON; ino = ip->i_number; /* Save this before vgone() invalidates ip. */ vgone(*vpp); vput(*vpp); error = VFS_VGET(ap->a_dvp->v_mount, ino, LK_EXCLUSIVE, vpp); if (error) { *vpp = NULL; return (error); } return (0); } /* * Open called. */ /* ARGSUSED */ static int ufs_open(struct vop_open_args *ap) { struct vnode *vp = ap->a_vp; struct inode *ip; if (vp->v_type == VCHR || vp->v_type == VBLK) return (EOPNOTSUPP); ip = VTOI(vp); /* * Files marked append-only must be opened for appending. */ if ((ip->i_flags & APPEND) && (ap->a_mode & (FWRITE | O_APPEND)) == FWRITE) return (EPERM); vnode_create_vobject(vp, DIP(ip, i_size), ap->a_td); return (0); } /* * Close called. * * Update the times on the inode. */ /* ARGSUSED */ static int ufs_close(ap) struct vop_close_args /* { struct vnode *a_vp; int a_fflag; struct ucred *a_cred; struct thread *a_td; } */ *ap; { struct vnode *vp = ap->a_vp; int usecount; VI_LOCK(vp); usecount = vp->v_usecount; if (usecount > 1) ufs_itimes_locked(vp); VI_UNLOCK(vp); return (0); } static int ufs_accessx(ap) struct vop_accessx_args /* { struct vnode *a_vp; accmode_t a_accmode; struct ucred *a_cred; struct thread *a_td; } */ *ap; { struct vnode *vp = ap->a_vp; struct inode *ip = VTOI(vp); accmode_t accmode = ap->a_accmode; int error; #ifdef QUOTA int relocked; #endif #ifdef UFS_ACL struct acl *acl; acl_type_t type; #endif /* * Disallow write attempts on read-only filesystems; * unless the file is a socket, fifo, or a block or * character device resident on the filesystem. */ if (accmode & VMODIFY_PERMS) { switch (vp->v_type) { case VDIR: case VLNK: case VREG: if (vp->v_mount->mnt_flag & MNT_RDONLY) return (EROFS); #ifdef QUOTA /* * Inode is accounted in the quotas only if struct * dquot is attached to it. VOP_ACCESS() is called * from vn_open_cred() and provides a convenient * point to call getinoquota(). */ if (VOP_ISLOCKED(vp) != LK_EXCLUSIVE) { /* * Upgrade vnode lock, since getinoquota() * requires exclusive lock to modify inode. */ relocked = 1; vhold(vp); vn_lock(vp, LK_UPGRADE | LK_RETRY); VI_LOCK(vp); if (vp->v_iflag & VI_DOOMED) { vdropl(vp); error = ENOENT; goto relock; } vdropl(vp); } else relocked = 0; error = getinoquota(ip); relock: if (relocked) vn_lock(vp, LK_DOWNGRADE | LK_RETRY); if (error != 0) return (error); #endif break; default: break; } } /* * If immutable bit set, nobody gets to write it. "& ~VADMIN_PERMS" * permits the owner of the file to remove the IMMUTABLE flag. */ if ((accmode & (VMODIFY_PERMS & ~VADMIN_PERMS)) && (ip->i_flags & (IMMUTABLE | SF_SNAPSHOT))) return (EPERM); #ifdef UFS_ACL if ((vp->v_mount->mnt_flag & (MNT_ACLS | MNT_NFS4ACLS)) != 0) { if (vp->v_mount->mnt_flag & MNT_NFS4ACLS) type = ACL_TYPE_NFS4; else type = ACL_TYPE_ACCESS; acl = acl_alloc(M_WAITOK); if (type == ACL_TYPE_NFS4) error = ufs_getacl_nfs4_internal(vp, acl, ap->a_td); else error = VOP_GETACL(vp, type, acl, ap->a_cred, ap->a_td); switch (error) { case 0: if (type == ACL_TYPE_NFS4) { error = vaccess_acl_nfs4(vp->v_type, ip->i_uid, ip->i_gid, acl, accmode, ap->a_cred, NULL); } else { error = vfs_unixify_accmode(&accmode); if (error == 0) error = vaccess_acl_posix1e(vp->v_type, ip->i_uid, ip->i_gid, acl, accmode, ap->a_cred, NULL); } break; default: if (error != EOPNOTSUPP) printf( "ufs_accessx(): Error retrieving ACL on object (%d).\n", error); /* * XXX: Fall back until debugged. Should * eventually possibly log an error, and return * EPERM for safety. */ error = vfs_unixify_accmode(&accmode); if (error == 0) error = vaccess(vp->v_type, ip->i_mode, ip->i_uid, ip->i_gid, accmode, ap->a_cred, NULL); } acl_free(acl); return (error); } #endif /* !UFS_ACL */ error = vfs_unixify_accmode(&accmode); if (error == 0) error = vaccess(vp->v_type, ip->i_mode, ip->i_uid, ip->i_gid, accmode, ap->a_cred, NULL); return (error); } /* ARGSUSED */ static int ufs_getattr(ap) struct vop_getattr_args /* { struct vnode *a_vp; struct vattr *a_vap; struct ucred *a_cred; } */ *ap; { struct vnode *vp = ap->a_vp; struct inode *ip = VTOI(vp); struct vattr *vap = ap->a_vap; VI_LOCK(vp); ufs_itimes_locked(vp); if (ip->i_ump->um_fstype == UFS1) { vap->va_atime.tv_sec = ip->i_din1->di_atime; vap->va_atime.tv_nsec = ip->i_din1->di_atimensec; } else { vap->va_atime.tv_sec = ip->i_din2->di_atime; vap->va_atime.tv_nsec = ip->i_din2->di_atimensec; } VI_UNLOCK(vp); /* * Copy from inode table */ vap->va_fsid = dev2udev(ip->i_dev); vap->va_fileid = ip->i_number; vap->va_mode = ip->i_mode & ~IFMT; vap->va_nlink = ip->i_effnlink; vap->va_uid = ip->i_uid; vap->va_gid = ip->i_gid; if (ip->i_ump->um_fstype == UFS1) { vap->va_rdev = ip->i_din1->di_rdev; vap->va_size = ip->i_din1->di_size; vap->va_mtime.tv_sec = ip->i_din1->di_mtime; vap->va_mtime.tv_nsec = ip->i_din1->di_mtimensec; vap->va_ctime.tv_sec = ip->i_din1->di_ctime; vap->va_ctime.tv_nsec = ip->i_din1->di_ctimensec; vap->va_bytes = dbtob((u_quad_t)ip->i_din1->di_blocks); vap->va_filerev = ip->i_din1->di_modrev; } else { vap->va_rdev = ip->i_din2->di_rdev; vap->va_size = ip->i_din2->di_size; vap->va_mtime.tv_sec = ip->i_din2->di_mtime; vap->va_mtime.tv_nsec = ip->i_din2->di_mtimensec; vap->va_ctime.tv_sec = ip->i_din2->di_ctime; vap->va_ctime.tv_nsec = ip->i_din2->di_ctimensec; vap->va_birthtime.tv_sec = ip->i_din2->di_birthtime; vap->va_birthtime.tv_nsec = ip->i_din2->di_birthnsec; vap->va_bytes = dbtob((u_quad_t)ip->i_din2->di_blocks); vap->va_filerev = ip->i_din2->di_modrev; } vap->va_flags = ip->i_flags; vap->va_gen = ip->i_gen; vap->va_blocksize = vp->v_mount->mnt_stat.f_iosize; vap->va_type = IFTOVT(ip->i_mode); return (0); } /* * Set attribute vnode op. called from several syscalls */ static int ufs_setattr(ap) struct vop_setattr_args /* { struct vnode *a_vp; struct vattr *a_vap; struct ucred *a_cred; } */ *ap; { struct vattr *vap = ap->a_vap; struct vnode *vp = ap->a_vp; struct inode *ip = VTOI(vp); struct ucred *cred = ap->a_cred; struct thread *td = curthread; int error; /* * Check for unsettable attributes. */ if ((vap->va_type != VNON) || (vap->va_nlink != VNOVAL) || (vap->va_fsid != VNOVAL) || (vap->va_fileid != VNOVAL) || (vap->va_blocksize != VNOVAL) || (vap->va_rdev != VNOVAL) || ((int)vap->va_bytes != VNOVAL) || (vap->va_gen != VNOVAL)) { return (EINVAL); } if (vap->va_flags != VNOVAL) { if ((vap->va_flags & ~(SF_APPEND | SF_ARCHIVED | SF_IMMUTABLE | SF_NOUNLINK | SF_SNAPSHOT | UF_APPEND | UF_ARCHIVE | UF_HIDDEN | UF_IMMUTABLE | UF_NODUMP | UF_NOUNLINK | UF_OFFLINE | UF_OPAQUE | UF_READONLY | UF_REPARSE | UF_SPARSE | UF_SYSTEM)) != 0) return (EOPNOTSUPP); if (vp->v_mount->mnt_flag & MNT_RDONLY) return (EROFS); /* * Callers may only modify the file flags on objects they * have VADMIN rights for. */ if ((error = VOP_ACCESS(vp, VADMIN, cred, td))) return (error); /* * Unprivileged processes are not permitted to unset system * flags, or modify flags if any system flags are set. * Privileged non-jail processes may not modify system flags * if securelevel > 0 and any existing system flags are set. * Privileged jail processes behave like privileged non-jail * processes if the security.jail.chflags_allowed sysctl is * is non-zero; otherwise, they behave like unprivileged * processes. */ if (!priv_check_cred(cred, PRIV_VFS_SYSFLAGS, 0)) { if (ip->i_flags & (SF_NOUNLINK | SF_IMMUTABLE | SF_APPEND)) { error = securelevel_gt(cred, 0); if (error) return (error); } /* The snapshot flag cannot be toggled. */ if ((vap->va_flags ^ ip->i_flags) & SF_SNAPSHOT) return (EPERM); } else { if (ip->i_flags & (SF_NOUNLINK | SF_IMMUTABLE | SF_APPEND) || ((vap->va_flags ^ ip->i_flags) & SF_SETTABLE)) return (EPERM); } ip->i_flags = vap->va_flags; DIP_SET(ip, i_flags, vap->va_flags); ip->i_flag |= IN_CHANGE; error = UFS_UPDATE(vp, 0); if (ip->i_flags & (IMMUTABLE | APPEND)) return (error); } /* * If immutable or append, no one can change any of its attributes * except the ones already handled (in some cases, file flags * including the immutability flags themselves for the superuser). */ if (ip->i_flags & (IMMUTABLE | APPEND)) return (EPERM); /* * Go through the fields and update iff not VNOVAL. */ if (vap->va_uid != (uid_t)VNOVAL || vap->va_gid != (gid_t)VNOVAL) { if (vp->v_mount->mnt_flag & MNT_RDONLY) return (EROFS); if ((error = ufs_chown(vp, vap->va_uid, vap->va_gid, cred, td)) != 0) return (error); } if (vap->va_size != VNOVAL) { /* * XXX most of the following special cases should be in * callers instead of in N filesystems. The VDIR check * mostly already is. */ switch (vp->v_type) { case VDIR: return (EISDIR); case VLNK: case VREG: /* * Truncation should have an effect in these cases. * Disallow it if the filesystem is read-only or * the file is being snapshotted. */ if (vp->v_mount->mnt_flag & MNT_RDONLY) return (EROFS); if ((ip->i_flags & SF_SNAPSHOT) != 0) return (EPERM); break; default: /* * According to POSIX, the result is unspecified * for file types other than regular files, * directories and shared memory objects. We * don't support shared memory objects in the file * system, and have dubious support for truncating * symlinks. Just ignore the request in other cases. */ return (0); } if ((error = UFS_TRUNCATE(vp, vap->va_size, IO_NORMAL | ((vap->va_vaflags & VA_SYNC) != 0 ? IO_SYNC : 0), cred)) != 0) return (error); } if (vap->va_atime.tv_sec != VNOVAL || vap->va_mtime.tv_sec != VNOVAL || vap->va_birthtime.tv_sec != VNOVAL) { if (vp->v_mount->mnt_flag & MNT_RDONLY) return (EROFS); if ((ip->i_flags & SF_SNAPSHOT) != 0) return (EPERM); error = vn_utimes_perm(vp, vap, cred, td); if (error != 0) return (error); ip->i_flag |= IN_CHANGE | IN_MODIFIED; if (vap->va_atime.tv_sec != VNOVAL) { ip->i_flag &= ~IN_ACCESS; DIP_SET(ip, i_atime, vap->va_atime.tv_sec); DIP_SET(ip, i_atimensec, vap->va_atime.tv_nsec); } if (vap->va_mtime.tv_sec != VNOVAL) { ip->i_flag &= ~IN_UPDATE; DIP_SET(ip, i_mtime, vap->va_mtime.tv_sec); DIP_SET(ip, i_mtimensec, vap->va_mtime.tv_nsec); } if (vap->va_birthtime.tv_sec != VNOVAL && ip->i_ump->um_fstype == UFS2) { ip->i_din2->di_birthtime = vap->va_birthtime.tv_sec; ip->i_din2->di_birthnsec = vap->va_birthtime.tv_nsec; } error = UFS_UPDATE(vp, 0); if (error) return (error); } error = 0; if (vap->va_mode != (mode_t)VNOVAL) { if (vp->v_mount->mnt_flag & MNT_RDONLY) return (EROFS); if ((ip->i_flags & SF_SNAPSHOT) != 0 && (vap->va_mode & (S_IXUSR | S_IWUSR | S_IXGRP | S_IWGRP | S_IXOTH | S_IWOTH))) return (EPERM); error = ufs_chmod(vp, (int)vap->va_mode, cred, td); } return (error); } #ifdef UFS_ACL static int ufs_update_nfs4_acl_after_mode_change(struct vnode *vp, int mode, int file_owner_id, struct ucred *cred, struct thread *td) { int error; struct acl *aclp; aclp = acl_alloc(M_WAITOK); error = ufs_getacl_nfs4_internal(vp, aclp, td); /* * We don't have to handle EOPNOTSUPP here, as the filesystem claims * it supports ACLs. */ if (error) goto out; acl_nfs4_sync_acl_from_mode(aclp, mode, file_owner_id); error = ufs_setacl_nfs4_internal(vp, aclp, td); out: acl_free(aclp); return (error); } #endif /* UFS_ACL */ /* * Mark this file's access time for update for vfs_mark_atime(). This * is called from execve() and mmap(). */ static int ufs_markatime(ap) struct vop_markatime_args /* { struct vnode *a_vp; } */ *ap; { struct vnode *vp = ap->a_vp; struct inode *ip = VTOI(vp); VI_LOCK(vp); ip->i_flag |= IN_ACCESS; VI_UNLOCK(vp); /* * XXXKIB No UFS_UPDATE(ap->a_vp, 0) there. */ return (0); } /* * Change the mode on a file. * Inode must be locked before calling. */ static int ufs_chmod(vp, mode, cred, td) struct vnode *vp; int mode; struct ucred *cred; struct thread *td; { struct inode *ip = VTOI(vp); int error; /* * To modify the permissions on a file, must possess VADMIN * for that file. */ if ((error = VOP_ACCESSX(vp, VWRITE_ACL, cred, td))) return (error); /* * Privileged processes may set the sticky bit on non-directories, * as well as set the setgid bit on a file with a group that the * process is not a member of. Both of these are allowed in * jail(8). */ if (vp->v_type != VDIR && (mode & S_ISTXT)) { if (priv_check_cred(cred, PRIV_VFS_STICKYFILE, 0)) return (EFTYPE); } if (!groupmember(ip->i_gid, cred) && (mode & ISGID)) { error = priv_check_cred(cred, PRIV_VFS_SETGID, 0); if (error) return (error); } /* * Deny setting setuid if we are not the file owner. */ if ((mode & ISUID) && ip->i_uid != cred->cr_uid) { error = priv_check_cred(cred, PRIV_VFS_ADMIN, 0); if (error) return (error); } ip->i_mode &= ~ALLPERMS; ip->i_mode |= (mode & ALLPERMS); DIP_SET(ip, i_mode, ip->i_mode); ip->i_flag |= IN_CHANGE; #ifdef UFS_ACL if ((vp->v_mount->mnt_flag & MNT_NFS4ACLS) != 0) error = ufs_update_nfs4_acl_after_mode_change(vp, mode, ip->i_uid, cred, td); #endif if (error == 0 && (ip->i_flag & IN_CHANGE) != 0) error = UFS_UPDATE(vp, 0); return (error); } /* * Perform chown operation on inode ip; * inode must be locked prior to call. */ static int ufs_chown(vp, uid, gid, cred, td) struct vnode *vp; uid_t uid; gid_t gid; struct ucred *cred; struct thread *td; { struct inode *ip = VTOI(vp); uid_t ouid; gid_t ogid; int error = 0; #ifdef QUOTA int i; ufs2_daddr_t change; #endif if (uid == (uid_t)VNOVAL) uid = ip->i_uid; if (gid == (gid_t)VNOVAL) gid = ip->i_gid; /* * To modify the ownership of a file, must possess VADMIN for that * file. */ if ((error = VOP_ACCESSX(vp, VWRITE_OWNER, cred, td))) return (error); /* * To change the owner of a file, or change the group of a file to a * group of which we are not a member, the caller must have * privilege. */ if (((uid != ip->i_uid && uid != cred->cr_uid) || (gid != ip->i_gid && !groupmember(gid, cred))) && (error = priv_check_cred(cred, PRIV_VFS_CHOWN, 0))) return (error); ogid = ip->i_gid; ouid = ip->i_uid; #ifdef QUOTA if ((error = getinoquota(ip)) != 0) return (error); if (ouid == uid) { dqrele(vp, ip->i_dquot[USRQUOTA]); ip->i_dquot[USRQUOTA] = NODQUOT; } if (ogid == gid) { dqrele(vp, ip->i_dquot[GRPQUOTA]); ip->i_dquot[GRPQUOTA] = NODQUOT; } change = DIP(ip, i_blocks); (void) chkdq(ip, -change, cred, CHOWN); (void) chkiq(ip, -1, cred, CHOWN); for (i = 0; i < MAXQUOTAS; i++) { dqrele(vp, ip->i_dquot[i]); ip->i_dquot[i] = NODQUOT; } #endif ip->i_gid = gid; DIP_SET(ip, i_gid, gid); ip->i_uid = uid; DIP_SET(ip, i_uid, uid); #ifdef QUOTA if ((error = getinoquota(ip)) == 0) { if (ouid == uid) { dqrele(vp, ip->i_dquot[USRQUOTA]); ip->i_dquot[USRQUOTA] = NODQUOT; } if (ogid == gid) { dqrele(vp, ip->i_dquot[GRPQUOTA]); ip->i_dquot[GRPQUOTA] = NODQUOT; } if ((error = chkdq(ip, change, cred, CHOWN)) == 0) { if ((error = chkiq(ip, 1, cred, CHOWN)) == 0) goto good; else (void) chkdq(ip, -change, cred, CHOWN|FORCE); } for (i = 0; i < MAXQUOTAS; i++) { dqrele(vp, ip->i_dquot[i]); ip->i_dquot[i] = NODQUOT; } } ip->i_gid = ogid; DIP_SET(ip, i_gid, ogid); ip->i_uid = ouid; DIP_SET(ip, i_uid, ouid); if (getinoquota(ip) == 0) { if (ouid == uid) { dqrele(vp, ip->i_dquot[USRQUOTA]); ip->i_dquot[USRQUOTA] = NODQUOT; } if (ogid == gid) { dqrele(vp, ip->i_dquot[GRPQUOTA]); ip->i_dquot[GRPQUOTA] = NODQUOT; } (void) chkdq(ip, change, cred, FORCE|CHOWN); (void) chkiq(ip, 1, cred, FORCE|CHOWN); (void) getinoquota(ip); } return (error); good: if (getinoquota(ip)) panic("ufs_chown: lost quota"); #endif /* QUOTA */ ip->i_flag |= IN_CHANGE; if ((ip->i_mode & (ISUID | ISGID)) && (ouid != uid || ogid != gid)) { if (priv_check_cred(cred, PRIV_VFS_RETAINSUGID, 0)) { ip->i_mode &= ~(ISUID | ISGID); DIP_SET(ip, i_mode, ip->i_mode); } } error = UFS_UPDATE(vp, 0); return (error); } static int ufs_remove(ap) struct vop_remove_args /* { struct vnode *a_dvp; struct vnode *a_vp; struct componentname *a_cnp; } */ *ap; { struct inode *ip; struct vnode *vp = ap->a_vp; struct vnode *dvp = ap->a_dvp; int error; struct thread *td; td = curthread; ip = VTOI(vp); if ((ip->i_flags & (NOUNLINK | IMMUTABLE | APPEND)) || (VTOI(dvp)->i_flags & APPEND)) { error = EPERM; goto out; } #ifdef UFS_GJOURNAL ufs_gjournal_orphan(vp); #endif error = ufs_dirremove(dvp, ip, ap->a_cnp->cn_flags, 0); if (ip->i_nlink <= 0) vp->v_vflag |= VV_NOSYNC; if ((ip->i_flags & SF_SNAPSHOT) != 0) { /* * Avoid deadlock where another thread is trying to * update the inodeblock for dvp and is waiting on * snaplk. Temporary unlock the vnode lock for the * unlinked file and sync the directory. This should * allow vput() of the directory to not block later on * while holding the snapshot vnode locked, assuming * that the directory hasn't been unlinked too. */ VOP_UNLOCK(vp, 0); (void) VOP_FSYNC(dvp, MNT_WAIT, td); vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); } out: return (error); } /* * link vnode call */ static int ufs_link(ap) struct vop_link_args /* { struct vnode *a_tdvp; struct vnode *a_vp; struct componentname *a_cnp; } */ *ap; { struct vnode *vp = ap->a_vp; struct vnode *tdvp = ap->a_tdvp; struct componentname *cnp = ap->a_cnp; struct inode *ip; struct direct newdir; int error; #ifdef INVARIANTS if ((cnp->cn_flags & HASBUF) == 0) panic("ufs_link: no name"); #endif if (VTOI(tdvp)->i_effnlink < 2) panic("ufs_link: Bad link count %d on parent", VTOI(tdvp)->i_effnlink); ip = VTOI(vp); if ((nlink_t)ip->i_nlink >= LINK_MAX) { error = EMLINK; goto out; } /* * The file may have been removed after namei droped the original * lock. */ if (ip->i_effnlink == 0) { error = ENOENT; goto out; } if (ip->i_flags & (IMMUTABLE | APPEND)) { error = EPERM; goto out; } ip->i_effnlink++; ip->i_nlink++; DIP_SET(ip, i_nlink, ip->i_nlink); ip->i_flag |= IN_CHANGE; if (DOINGSOFTDEP(vp)) softdep_setup_link(VTOI(tdvp), ip); error = UFS_UPDATE(vp, !(DOINGSOFTDEP(vp) | DOINGASYNC(vp))); if (!error) { ufs_makedirentry(ip, cnp, &newdir); error = ufs_direnter(tdvp, vp, &newdir, cnp, NULL, 0); } if (error) { ip->i_effnlink--; ip->i_nlink--; DIP_SET(ip, i_nlink, ip->i_nlink); ip->i_flag |= IN_CHANGE; if (DOINGSOFTDEP(vp)) softdep_revert_link(VTOI(tdvp), ip); } out: return (error); } /* * whiteout vnode call */ static int ufs_whiteout(ap) struct vop_whiteout_args /* { struct vnode *a_dvp; struct componentname *a_cnp; int a_flags; } */ *ap; { struct vnode *dvp = ap->a_dvp; struct componentname *cnp = ap->a_cnp; struct direct newdir; int error = 0; switch (ap->a_flags) { case LOOKUP: /* 4.4 format directories support whiteout operations */ if (dvp->v_mount->mnt_maxsymlinklen > 0) return (0); return (EOPNOTSUPP); case CREATE: /* create a new directory whiteout */ #ifdef INVARIANTS if ((cnp->cn_flags & SAVENAME) == 0) panic("ufs_whiteout: missing name"); if (dvp->v_mount->mnt_maxsymlinklen <= 0) panic("ufs_whiteout: old format filesystem"); #endif newdir.d_ino = WINO; newdir.d_namlen = cnp->cn_namelen; bcopy(cnp->cn_nameptr, newdir.d_name, (unsigned)cnp->cn_namelen + 1); newdir.d_type = DT_WHT; error = ufs_direnter(dvp, NULL, &newdir, cnp, NULL, 0); break; case DELETE: /* remove an existing directory whiteout */ #ifdef INVARIANTS if (dvp->v_mount->mnt_maxsymlinklen <= 0) panic("ufs_whiteout: old format filesystem"); #endif cnp->cn_flags &= ~DOWHITEOUT; error = ufs_dirremove(dvp, NULL, cnp->cn_flags, 0); break; default: panic("ufs_whiteout: unknown op"); } return (error); } static volatile int rename_restarts; SYSCTL_INT(_vfs_ufs, OID_AUTO, rename_restarts, CTLFLAG_RD, __DEVOLATILE(int *, &rename_restarts), 0, "Times rename had to restart due to lock contention"); /* * Rename system call. * rename("foo", "bar"); * is essentially * unlink("bar"); * link("foo", "bar"); * unlink("foo"); * but ``atomically''. Can't do full commit without saving state in the * inode on disk which isn't feasible at this time. Best we can do is * always guarantee the target exists. * * Basic algorithm is: * * 1) Bump link count on source while we're linking it to the * target. This also ensure the inode won't be deleted out * from underneath us while we work (it may be truncated by * a concurrent `trunc' or `open' for creation). * 2) Link source to destination. If destination already exists, * delete it first. * 3) Unlink source reference to inode if still around. If a * directory was moved and the parent of the destination * is different from the source, patch the ".." entry in the * directory. */ static int ufs_rename(ap) struct vop_rename_args /* { struct vnode *a_fdvp; struct vnode *a_fvp; struct componentname *a_fcnp; struct vnode *a_tdvp; struct vnode *a_tvp; struct componentname *a_tcnp; } */ *ap; { struct vnode *tvp = ap->a_tvp; struct vnode *tdvp = ap->a_tdvp; struct vnode *fvp = ap->a_fvp; struct vnode *fdvp = ap->a_fdvp; struct vnode *nvp; struct componentname *tcnp = ap->a_tcnp; struct componentname *fcnp = ap->a_fcnp; struct thread *td = fcnp->cn_thread; struct inode *fip, *tip, *tdp, *fdp; struct direct newdir; off_t endoff; int doingdirectory, newparent; int error = 0; struct mount *mp; ino_t ino; #ifdef INVARIANTS if ((tcnp->cn_flags & HASBUF) == 0 || (fcnp->cn_flags & HASBUF) == 0) panic("ufs_rename: no name"); #endif endoff = 0; mp = tdvp->v_mount; VOP_UNLOCK(tdvp, 0); if (tvp && tvp != tdvp) VOP_UNLOCK(tvp, 0); /* * Check for cross-device rename. */ if ((fvp->v_mount != tdvp->v_mount) || (tvp && (fvp->v_mount != tvp->v_mount))) { error = EXDEV; mp = NULL; goto releout; } relock: /* * We need to acquire 2 to 4 locks depending on whether tvp is NULL * and fdvp and tdvp are the same directory. Subsequently we need * to double-check all paths and in the directory rename case we * need to verify that we are not creating a directory loop. To * handle this we acquire all but fdvp using non-blocking * acquisitions. If we fail to acquire any lock in the path we will * drop all held locks, acquire the new lock in a blocking fashion, * and then release it and restart the rename. This acquire/release * step ensures that we do not spin on a lock waiting for release. */ error = vn_lock(fdvp, LK_EXCLUSIVE); if (error) goto releout; if (vn_lock(tdvp, LK_EXCLUSIVE | LK_NOWAIT) != 0) { VOP_UNLOCK(fdvp, 0); error = vn_lock(tdvp, LK_EXCLUSIVE); if (error) goto releout; VOP_UNLOCK(tdvp, 0); atomic_add_int(&rename_restarts, 1); goto relock; } /* * Re-resolve fvp to be certain it still exists and fetch the * correct vnode. */ error = ufs_lookup_ino(fdvp, NULL, fcnp, &ino); if (error) { VOP_UNLOCK(fdvp, 0); VOP_UNLOCK(tdvp, 0); goto releout; } error = VFS_VGET(mp, ino, LK_EXCLUSIVE | LK_NOWAIT, &nvp); if (error) { VOP_UNLOCK(fdvp, 0); VOP_UNLOCK(tdvp, 0); if (error != EBUSY) goto releout; error = VFS_VGET(mp, ino, LK_EXCLUSIVE, &nvp); if (error != 0) goto releout; VOP_UNLOCK(nvp, 0); vrele(fvp); fvp = nvp; atomic_add_int(&rename_restarts, 1); goto relock; } vrele(fvp); fvp = nvp; /* * Re-resolve tvp and acquire the vnode lock if present. */ error = ufs_lookup_ino(tdvp, NULL, tcnp, &ino); if (error != 0 && error != EJUSTRETURN) { VOP_UNLOCK(fdvp, 0); VOP_UNLOCK(tdvp, 0); VOP_UNLOCK(fvp, 0); goto releout; } /* * If tvp disappeared we just carry on. */ if (error == EJUSTRETURN && tvp != NULL) { vrele(tvp); tvp = NULL; } /* * Get the tvp ino if the lookup succeeded. We may have to restart * if the non-blocking acquire fails. */ if (error == 0) { nvp = NULL; error = VFS_VGET(mp, ino, LK_EXCLUSIVE | LK_NOWAIT, &nvp); if (tvp) vrele(tvp); tvp = nvp; if (error) { VOP_UNLOCK(fdvp, 0); VOP_UNLOCK(tdvp, 0); VOP_UNLOCK(fvp, 0); if (error != EBUSY) goto releout; error = VFS_VGET(mp, ino, LK_EXCLUSIVE, &nvp); if (error != 0) goto releout; vput(nvp); atomic_add_int(&rename_restarts, 1); goto relock; } } fdp = VTOI(fdvp); fip = VTOI(fvp); tdp = VTOI(tdvp); tip = NULL; if (tvp) tip = VTOI(tvp); if (tvp && ((VTOI(tvp)->i_flags & (NOUNLINK | IMMUTABLE | APPEND)) || (VTOI(tdvp)->i_flags & APPEND))) { error = EPERM; goto unlockout; } /* * Renaming a file to itself has no effect. The upper layers should * not call us in that case. However, things could change after * we drop the locks above. */ if (fvp == tvp) { error = 0; goto unlockout; } doingdirectory = 0; newparent = 0; ino = fip->i_number; if (fip->i_nlink >= LINK_MAX) { error = EMLINK; goto unlockout; } if ((fip->i_flags & (NOUNLINK | IMMUTABLE | APPEND)) || (fdp->i_flags & APPEND)) { error = EPERM; goto unlockout; } if ((fip->i_mode & IFMT) == IFDIR) { /* * Avoid ".", "..", and aliases of "." for obvious reasons. */ if ((fcnp->cn_namelen == 1 && fcnp->cn_nameptr[0] == '.') || fdp == fip || (fcnp->cn_flags | tcnp->cn_flags) & ISDOTDOT) { error = EINVAL; goto unlockout; } if (fdp->i_number != tdp->i_number) newparent = tdp->i_number; doingdirectory = 1; } if ((fvp->v_type == VDIR && fvp->v_mountedhere != NULL) || (tvp != NULL && tvp->v_type == VDIR && tvp->v_mountedhere != NULL)) { error = EXDEV; goto unlockout; } /* * If ".." must be changed (ie the directory gets a new * parent) then the source directory must not be in the * directory hierarchy above the target, as this would * orphan everything below the source directory. Also * the user must have write permission in the source so * as to be able to change "..". */ if (doingdirectory && newparent) { error = VOP_ACCESS(fvp, VWRITE, tcnp->cn_cred, tcnp->cn_thread); if (error) goto unlockout; error = ufs_checkpath(ino, fdp->i_number, tdp, tcnp->cn_cred, &ino); /* * We encountered a lock that we have to wait for. Unlock * everything else and VGET before restarting. */ if (ino) { VOP_UNLOCK(fdvp, 0); VOP_UNLOCK(fvp, 0); VOP_UNLOCK(tdvp, 0); if (tvp) VOP_UNLOCK(tvp, 0); error = VFS_VGET(mp, ino, LK_SHARED, &nvp); if (error == 0) vput(nvp); atomic_add_int(&rename_restarts, 1); goto relock; } if (error) goto unlockout; if ((tcnp->cn_flags & SAVESTART) == 0) panic("ufs_rename: lost to startdir"); } if (fip->i_effnlink == 0 || fdp->i_effnlink == 0 || tdp->i_effnlink == 0) panic("Bad effnlink fip %p, fdp %p, tdp %p", fip, fdp, tdp); /* * 1) Bump link count while we're moving stuff * around. If we crash somewhere before * completing our work, the link count * may be wrong, but correctable. */ fip->i_effnlink++; fip->i_nlink++; DIP_SET(fip, i_nlink, fip->i_nlink); fip->i_flag |= IN_CHANGE; if (DOINGSOFTDEP(fvp)) softdep_setup_link(tdp, fip); error = UFS_UPDATE(fvp, !(DOINGSOFTDEP(fvp) | DOINGASYNC(fvp))); if (error) goto bad; /* * 2) If target doesn't exist, link the target * to the source and unlink the source. * Otherwise, rewrite the target directory * entry to reference the source inode and * expunge the original entry's existence. */ if (tip == NULL) { if (tdp->i_dev != fip->i_dev) panic("ufs_rename: EXDEV"); if (doingdirectory && newparent) { /* * Account for ".." in new directory. * When source and destination have the same * parent we don't adjust the link count. The * actual link modification is completed when * .. is rewritten below. */ if ((nlink_t)tdp->i_nlink >= LINK_MAX) { error = EMLINK; goto bad; } } ufs_makedirentry(fip, tcnp, &newdir); error = ufs_direnter(tdvp, NULL, &newdir, tcnp, NULL, 1); if (error) goto bad; /* Setup tdvp for directory compaction if needed. */ if (tdp->i_count && tdp->i_endoff && tdp->i_endoff < tdp->i_size) endoff = tdp->i_endoff; } else { if (tip->i_dev != tdp->i_dev || tip->i_dev != fip->i_dev) panic("ufs_rename: EXDEV"); /* * Short circuit rename(foo, foo). */ if (tip->i_number == fip->i_number) panic("ufs_rename: same file"); /* * If the parent directory is "sticky", then the caller * must possess VADMIN for the parent directory, or the * destination of the rename. This implements append-only * directories. */ if ((tdp->i_mode & S_ISTXT) && VOP_ACCESS(tdvp, VADMIN, tcnp->cn_cred, td) && VOP_ACCESS(tvp, VADMIN, tcnp->cn_cred, td)) { error = EPERM; goto bad; } /* * Target must be empty if a directory and have no links * to it. Also, ensure source and target are compatible * (both directories, or both not directories). */ if ((tip->i_mode & IFMT) == IFDIR) { if ((tip->i_effnlink > 2) || !ufs_dirempty(tip, tdp->i_number, tcnp->cn_cred)) { error = ENOTEMPTY; goto bad; } if (!doingdirectory) { error = ENOTDIR; goto bad; } cache_purge(tdvp); } else if (doingdirectory) { error = EISDIR; goto bad; } if (doingdirectory) { if (!newparent) { tdp->i_effnlink--; if (DOINGSOFTDEP(tdvp)) softdep_change_linkcnt(tdp); } tip->i_effnlink--; if (DOINGSOFTDEP(tvp)) softdep_change_linkcnt(tip); } error = ufs_dirrewrite(tdp, tip, fip->i_number, IFTODT(fip->i_mode), (doingdirectory && newparent) ? newparent : doingdirectory); if (error) { if (doingdirectory) { if (!newparent) { tdp->i_effnlink++; if (DOINGSOFTDEP(tdvp)) softdep_change_linkcnt(tdp); } tip->i_effnlink++; if (DOINGSOFTDEP(tvp)) softdep_change_linkcnt(tip); } } if (doingdirectory && !DOINGSOFTDEP(tvp)) { /* * The only stuff left in the directory is "." * and "..". The "." reference is inconsequential * since we are quashing it. We have removed the "." * reference and the reference in the parent directory, * but there may be other hard links. The soft * dependency code will arrange to do these operations * after the parent directory entry has been deleted on * disk, so when running with that code we avoid doing * them now. */ if (!newparent) { tdp->i_nlink--; DIP_SET(tdp, i_nlink, tdp->i_nlink); tdp->i_flag |= IN_CHANGE; } tip->i_nlink--; DIP_SET(tip, i_nlink, tip->i_nlink); tip->i_flag |= IN_CHANGE; } } /* * 3) Unlink the source. We have to resolve the path again to * fixup the directory offset and count for ufs_dirremove. */ if (fdvp == tdvp) { error = ufs_lookup_ino(fdvp, NULL, fcnp, &ino); if (error) panic("ufs_rename: from entry went away!"); if (ino != fip->i_number) panic("ufs_rename: ino mismatch %ju != %ju\n", (uintmax_t)ino, (uintmax_t)fip->i_number); } /* * If the source is a directory with a * new parent, the link count of the old * parent directory must be decremented * and ".." set to point to the new parent. */ if (doingdirectory && newparent) { /* * If tip exists we simply use its link, otherwise we must * add a new one. */ if (tip == NULL) { tdp->i_effnlink++; tdp->i_nlink++; DIP_SET(tdp, i_nlink, tdp->i_nlink); tdp->i_flag |= IN_CHANGE; if (DOINGSOFTDEP(tdvp)) softdep_setup_dotdot_link(tdp, fip); error = UFS_UPDATE(tdvp, !(DOINGSOFTDEP(tdvp) | DOINGASYNC(tdvp))); /* Don't go to bad here as the new link exists. */ if (error) goto unlockout; } else if (DOINGSUJ(tdvp)) /* Journal must account for each new link. */ softdep_setup_dotdot_link(tdp, fip); fip->i_offset = mastertemplate.dot_reclen; ufs_dirrewrite(fip, fdp, newparent, DT_DIR, 0); cache_purge(fdvp); } error = ufs_dirremove(fdvp, fip, fcnp->cn_flags, 0); /* * The kern_renameat() looks up the fvp using the DELETE flag, which * causes the removal of the name cache entry for fvp. * As the relookup of the fvp is done in two steps: * ufs_lookup_ino() and then VFS_VGET(), another thread might do a * normal lookup of the from name just before the VFS_VGET() call, * causing the cache entry to be re-instantiated. * * The same issue also applies to tvp if it exists as * otherwise we may have a stale name cache entry for the new * name that references the old i-node if it has other links * or open file descriptors. */ cache_purge(fvp); if (tvp) cache_purge(tvp); cache_purge_negative(tdvp); unlockout: vput(fdvp); vput(fvp); if (tvp) vput(tvp); /* * If compaction or fsync was requested do it now that other locks * are no longer needed. */ if (error == 0 && endoff != 0) { #ifdef UFS_DIRHASH if (tdp->i_dirhash != NULL) ufsdirhash_dirtrunc(tdp, endoff); #endif UFS_TRUNCATE(tdvp, endoff, IO_NORMAL | IO_SYNC, tcnp->cn_cred); } if (error == 0 && tdp->i_flag & IN_NEEDSYNC) error = VOP_FSYNC(tdvp, MNT_WAIT, td); vput(tdvp); return (error); bad: fip->i_effnlink--; fip->i_nlink--; DIP_SET(fip, i_nlink, fip->i_nlink); fip->i_flag |= IN_CHANGE; if (DOINGSOFTDEP(fvp)) softdep_revert_link(tdp, fip); goto unlockout; releout: vrele(fdvp); vrele(fvp); vrele(tdvp); if (tvp) vrele(tvp); return (error); } #ifdef UFS_ACL static int ufs_do_posix1e_acl_inheritance_dir(struct vnode *dvp, struct vnode *tvp, mode_t dmode, struct ucred *cred, struct thread *td) { int error; struct inode *ip = VTOI(tvp); struct acl *dacl, *acl; acl = acl_alloc(M_WAITOK); dacl = acl_alloc(M_WAITOK); /* * Retrieve default ACL from parent, if any. */ error = VOP_GETACL(dvp, ACL_TYPE_DEFAULT, acl, cred, td); switch (error) { case 0: /* * Retrieved a default ACL, so merge mode and ACL if * necessary. If the ACL is empty, fall through to * the "not defined or available" case. */ if (acl->acl_cnt != 0) { dmode = acl_posix1e_newfilemode(dmode, acl); ip->i_mode = dmode; DIP_SET(ip, i_mode, dmode); *dacl = *acl; ufs_sync_acl_from_inode(ip, acl); break; } /* FALLTHROUGH */ case EOPNOTSUPP: /* * Just use the mode as-is. */ ip->i_mode = dmode; DIP_SET(ip, i_mode, dmode); error = 0; goto out; default: goto out; } /* * XXX: If we abort now, will Soft Updates notify the extattr * code that the EAs for the file need to be released? */ error = VOP_SETACL(tvp, ACL_TYPE_ACCESS, acl, cred, td); if (error == 0) error = VOP_SETACL(tvp, ACL_TYPE_DEFAULT, dacl, cred, td); switch (error) { case 0: break; case EOPNOTSUPP: /* * XXX: This should not happen, as EOPNOTSUPP above * was supposed to free acl. */ printf("ufs_mkdir: VOP_GETACL() but no VOP_SETACL()\n"); /* panic("ufs_mkdir: VOP_GETACL() but no VOP_SETACL()"); */ break; default: goto out; } out: acl_free(acl); acl_free(dacl); return (error); } static int ufs_do_posix1e_acl_inheritance_file(struct vnode *dvp, struct vnode *tvp, mode_t mode, struct ucred *cred, struct thread *td) { int error; struct inode *ip = VTOI(tvp); struct acl *acl; acl = acl_alloc(M_WAITOK); /* * Retrieve default ACL for parent, if any. */ error = VOP_GETACL(dvp, ACL_TYPE_DEFAULT, acl, cred, td); switch (error) { case 0: /* * Retrieved a default ACL, so merge mode and ACL if * necessary. */ if (acl->acl_cnt != 0) { /* * Two possible ways for default ACL to not * be present. First, the EA can be * undefined, or second, the default ACL can * be blank. If it's blank, fall through to * the it's not defined case. */ mode = acl_posix1e_newfilemode(mode, acl); ip->i_mode = mode; DIP_SET(ip, i_mode, mode); ufs_sync_acl_from_inode(ip, acl); break; } /* FALLTHROUGH */ case EOPNOTSUPP: /* * Just use the mode as-is. */ ip->i_mode = mode; DIP_SET(ip, i_mode, mode); error = 0; goto out; default: goto out; } /* * XXX: If we abort now, will Soft Updates notify the extattr * code that the EAs for the file need to be released? */ error = VOP_SETACL(tvp, ACL_TYPE_ACCESS, acl, cred, td); switch (error) { case 0: break; case EOPNOTSUPP: /* * XXX: This should not happen, as EOPNOTSUPP above was * supposed to free acl. */ printf("ufs_makeinode: VOP_GETACL() but no " "VOP_SETACL()\n"); /* panic("ufs_makeinode: VOP_GETACL() but no " "VOP_SETACL()"); */ break; default: goto out; } out: acl_free(acl); return (error); } static int ufs_do_nfs4_acl_inheritance(struct vnode *dvp, struct vnode *tvp, mode_t child_mode, struct ucred *cred, struct thread *td) { int error; struct acl *parent_aclp, *child_aclp; parent_aclp = acl_alloc(M_WAITOK); child_aclp = acl_alloc(M_WAITOK | M_ZERO); error = ufs_getacl_nfs4_internal(dvp, parent_aclp, td); if (error) goto out; acl_nfs4_compute_inherited_acl(parent_aclp, child_aclp, child_mode, VTOI(tvp)->i_uid, tvp->v_type == VDIR); error = ufs_setacl_nfs4_internal(tvp, child_aclp, td); if (error) goto out; out: acl_free(parent_aclp); acl_free(child_aclp); return (error); } #endif /* * Mkdir system call */ static int ufs_mkdir(ap) struct vop_mkdir_args /* { struct vnode *a_dvp; struct vnode **a_vpp; struct componentname *a_cnp; struct vattr *a_vap; } */ *ap; { struct vnode *dvp = ap->a_dvp; struct vattr *vap = ap->a_vap; struct componentname *cnp = ap->a_cnp; struct inode *ip, *dp; struct vnode *tvp; struct buf *bp; struct dirtemplate dirtemplate, *dtp; struct direct newdir; int error, dmode; long blkoff; #ifdef INVARIANTS if ((cnp->cn_flags & HASBUF) == 0) panic("ufs_mkdir: no name"); #endif dp = VTOI(dvp); if ((nlink_t)dp->i_nlink >= LINK_MAX) { error = EMLINK; goto out; } dmode = vap->va_mode & 0777; dmode |= IFDIR; /* * Must simulate part of ufs_makeinode here to acquire the inode, * but not have it entered in the parent directory. The entry is * made later after writing "." and ".." entries. */ error = UFS_VALLOC(dvp, dmode, cnp->cn_cred, &tvp); if (error) goto out; ip = VTOI(tvp); ip->i_gid = dp->i_gid; DIP_SET(ip, i_gid, dp->i_gid); #ifdef SUIDDIR { #ifdef QUOTA struct ucred ucred, *ucp; gid_t ucred_group; ucp = cnp->cn_cred; #endif /* * If we are hacking owners here, (only do this where told to) * and we are not giving it TO root, (would subvert quotas) * then go ahead and give it to the other user. * The new directory also inherits the SUID bit. * If user's UID and dir UID are the same, * 'give it away' so that the SUID is still forced on. */ if ((dvp->v_mount->mnt_flag & MNT_SUIDDIR) && (dp->i_mode & ISUID) && dp->i_uid) { dmode |= ISUID; ip->i_uid = dp->i_uid; DIP_SET(ip, i_uid, dp->i_uid); #ifdef QUOTA if (dp->i_uid != cnp->cn_cred->cr_uid) { /* * Make sure the correct user gets charged * for the space. * Make a dummy credential for the victim. * XXX This seems to never be accessed out of * our context so a stack variable is ok. */ refcount_init(&ucred.cr_ref, 1); ucred.cr_uid = ip->i_uid; ucred.cr_ngroups = 1; ucred.cr_groups = &ucred_group; ucred.cr_groups[0] = dp->i_gid; ucp = &ucred; } #endif } else { ip->i_uid = cnp->cn_cred->cr_uid; DIP_SET(ip, i_uid, ip->i_uid); } #ifdef QUOTA if ((error = getinoquota(ip)) || (error = chkiq(ip, 1, ucp, 0))) { if (DOINGSOFTDEP(tvp)) softdep_revert_link(dp, ip); UFS_VFREE(tvp, ip->i_number, dmode); vput(tvp); return (error); } #endif } #else /* !SUIDDIR */ ip->i_uid = cnp->cn_cred->cr_uid; DIP_SET(ip, i_uid, ip->i_uid); #ifdef QUOTA if ((error = getinoquota(ip)) || (error = chkiq(ip, 1, cnp->cn_cred, 0))) { if (DOINGSOFTDEP(tvp)) softdep_revert_link(dp, ip); UFS_VFREE(tvp, ip->i_number, dmode); vput(tvp); return (error); } #endif #endif /* !SUIDDIR */ ip->i_flag |= IN_ACCESS | IN_CHANGE | IN_UPDATE; ip->i_mode = dmode; DIP_SET(ip, i_mode, dmode); tvp->v_type = VDIR; /* Rest init'd in getnewvnode(). */ ip->i_effnlink = 2; ip->i_nlink = 2; DIP_SET(ip, i_nlink, 2); if (cnp->cn_flags & ISWHITEOUT) { ip->i_flags |= UF_OPAQUE; DIP_SET(ip, i_flags, ip->i_flags); } /* * Bump link count in parent directory to reflect work done below. * Should be done before reference is created so cleanup is * possible if we crash. */ dp->i_effnlink++; dp->i_nlink++; DIP_SET(dp, i_nlink, dp->i_nlink); dp->i_flag |= IN_CHANGE; if (DOINGSOFTDEP(dvp)) softdep_setup_mkdir(dp, ip); error = UFS_UPDATE(dvp, !(DOINGSOFTDEP(dvp) | DOINGASYNC(dvp))); if (error) goto bad; #ifdef MAC if (dvp->v_mount->mnt_flag & MNT_MULTILABEL) { error = mac_vnode_create_extattr(cnp->cn_cred, dvp->v_mount, dvp, tvp, cnp); if (error) goto bad; } #endif #ifdef UFS_ACL if (dvp->v_mount->mnt_flag & MNT_ACLS) { error = ufs_do_posix1e_acl_inheritance_dir(dvp, tvp, dmode, cnp->cn_cred, cnp->cn_thread); if (error) goto bad; } else if (dvp->v_mount->mnt_flag & MNT_NFS4ACLS) { error = ufs_do_nfs4_acl_inheritance(dvp, tvp, dmode, cnp->cn_cred, cnp->cn_thread); if (error) goto bad; } #endif /* !UFS_ACL */ /* * Initialize directory with "." and ".." from static template. */ if (dvp->v_mount->mnt_maxsymlinklen > 0) dtp = &mastertemplate; else dtp = (struct dirtemplate *)&omastertemplate; dirtemplate = *dtp; dirtemplate.dot_ino = ip->i_number; dirtemplate.dotdot_ino = dp->i_number; + vnode_pager_setsize(tvp, DIRBLKSIZ); if ((error = UFS_BALLOC(tvp, (off_t)0, DIRBLKSIZ, cnp->cn_cred, BA_CLRBUF, &bp)) != 0) goto bad; ip->i_size = DIRBLKSIZ; DIP_SET(ip, i_size, DIRBLKSIZ); ip->i_flag |= IN_CHANGE | IN_UPDATE; - vnode_pager_setsize(tvp, (u_long)ip->i_size); bcopy((caddr_t)&dirtemplate, (caddr_t)bp->b_data, sizeof dirtemplate); if (DOINGSOFTDEP(tvp)) { /* * Ensure that the entire newly allocated block is a * valid directory so that future growth within the * block does not have to ensure that the block is * written before the inode. */ blkoff = DIRBLKSIZ; while (blkoff < bp->b_bcount) { ((struct direct *) (bp->b_data + blkoff))->d_reclen = DIRBLKSIZ; blkoff += DIRBLKSIZ; } } if ((error = UFS_UPDATE(tvp, !(DOINGSOFTDEP(tvp) | DOINGASYNC(tvp)))) != 0) { (void)bwrite(bp); goto bad; } /* * Directory set up, now install its entry in the parent directory. * * If we are not doing soft dependencies, then we must write out the * buffer containing the new directory body before entering the new * name in the parent. If we are doing soft dependencies, then the * buffer containing the new directory body will be passed to and * released in the soft dependency code after the code has attached * an appropriate ordering dependency to the buffer which ensures that * the buffer is written before the new name is written in the parent. */ if (DOINGASYNC(dvp)) bdwrite(bp); else if (!DOINGSOFTDEP(dvp) && ((error = bwrite(bp)))) goto bad; ufs_makedirentry(ip, cnp, &newdir); error = ufs_direnter(dvp, tvp, &newdir, cnp, bp, 0); bad: if (error == 0) { *ap->a_vpp = tvp; } else { dp->i_effnlink--; dp->i_nlink--; DIP_SET(dp, i_nlink, dp->i_nlink); dp->i_flag |= IN_CHANGE; /* * No need to do an explicit VOP_TRUNCATE here, vrele will * do this for us because we set the link count to 0. */ ip->i_effnlink = 0; ip->i_nlink = 0; DIP_SET(ip, i_nlink, 0); ip->i_flag |= IN_CHANGE; if (DOINGSOFTDEP(tvp)) softdep_revert_mkdir(dp, ip); vput(tvp); } out: return (error); } /* * Rmdir system call. */ static int ufs_rmdir(ap) struct vop_rmdir_args /* { struct vnode *a_dvp; struct vnode *a_vp; struct componentname *a_cnp; } */ *ap; { struct vnode *vp = ap->a_vp; struct vnode *dvp = ap->a_dvp; struct componentname *cnp = ap->a_cnp; struct inode *ip, *dp; int error; ip = VTOI(vp); dp = VTOI(dvp); /* * Do not remove a directory that is in the process of being renamed. * Verify the directory is empty (and valid). Rmdir ".." will not be * valid since ".." will contain a reference to the current directory * and thus be non-empty. Do not allow the removal of mounted on * directories (this can happen when an NFS exported filesystem * tries to remove a locally mounted on directory). */ error = 0; if (ip->i_effnlink < 2) { error = EINVAL; goto out; } if (dp->i_effnlink < 3) panic("ufs_dirrem: Bad link count %d on parent", dp->i_effnlink); if (!ufs_dirempty(ip, dp->i_number, cnp->cn_cred)) { error = ENOTEMPTY; goto out; } if ((dp->i_flags & APPEND) || (ip->i_flags & (NOUNLINK | IMMUTABLE | APPEND))) { error = EPERM; goto out; } if (vp->v_mountedhere != 0) { error = EINVAL; goto out; } #ifdef UFS_GJOURNAL ufs_gjournal_orphan(vp); #endif /* * Delete reference to directory before purging * inode. If we crash in between, the directory * will be reattached to lost+found, */ dp->i_effnlink--; ip->i_effnlink--; if (DOINGSOFTDEP(vp)) softdep_setup_rmdir(dp, ip); error = ufs_dirremove(dvp, ip, cnp->cn_flags, 1); if (error) { dp->i_effnlink++; ip->i_effnlink++; if (DOINGSOFTDEP(vp)) softdep_revert_rmdir(dp, ip); goto out; } cache_purge(dvp); /* * The only stuff left in the directory is "." and "..". The "." * reference is inconsequential since we are quashing it. The soft * dependency code will arrange to do these operations after * the parent directory entry has been deleted on disk, so * when running with that code we avoid doing them now. */ if (!DOINGSOFTDEP(vp)) { dp->i_nlink--; DIP_SET(dp, i_nlink, dp->i_nlink); dp->i_flag |= IN_CHANGE; error = UFS_UPDATE(dvp, 0); ip->i_nlink--; DIP_SET(ip, i_nlink, ip->i_nlink); ip->i_flag |= IN_CHANGE; } cache_purge(vp); #ifdef UFS_DIRHASH /* Kill any active hash; i_effnlink == 0, so it will not come back. */ if (ip->i_dirhash != NULL) ufsdirhash_free(ip); #endif out: return (error); } /* * symlink -- make a symbolic link */ static int ufs_symlink(ap) struct vop_symlink_args /* { struct vnode *a_dvp; struct vnode **a_vpp; struct componentname *a_cnp; struct vattr *a_vap; char *a_target; } */ *ap; { struct vnode *vp, **vpp = ap->a_vpp; struct inode *ip; int len, error; error = ufs_makeinode(IFLNK | ap->a_vap->va_mode, ap->a_dvp, vpp, ap->a_cnp); if (error) return (error); vp = *vpp; len = strlen(ap->a_target); if (len < vp->v_mount->mnt_maxsymlinklen) { ip = VTOI(vp); bcopy(ap->a_target, SHORTLINK(ip), len); ip->i_size = len; DIP_SET(ip, i_size, len); ip->i_flag |= IN_CHANGE | IN_UPDATE; error = UFS_UPDATE(vp, 0); } else error = vn_rdwr(UIO_WRITE, vp, ap->a_target, len, (off_t)0, UIO_SYSSPACE, IO_NODELOCKED | IO_NOMACCHECK, ap->a_cnp->cn_cred, NOCRED, NULL, NULL); if (error) vput(vp); return (error); } /* * Vnode op for reading directories. */ int ufs_readdir(ap) struct vop_readdir_args /* { struct vnode *a_vp; struct uio *a_uio; struct ucred *a_cred; int *a_eofflag; int *a_ncookies; u_long **a_cookies; } */ *ap; { struct vnode *vp = ap->a_vp; struct uio *uio = ap->a_uio; struct buf *bp; struct inode *ip; struct direct *dp, *edp; u_long *cookies; struct dirent dstdp; off_t offset, startoffset; size_t readcnt, skipcnt; ssize_t startresid; int ncookies; int error; if (uio->uio_offset < 0) return (EINVAL); ip = VTOI(vp); if (ip->i_effnlink == 0) return (0); if (ap->a_ncookies != NULL) { ncookies = uio->uio_resid; if (uio->uio_offset >= ip->i_size) ncookies = 0; else if (ip->i_size - uio->uio_offset < ncookies) ncookies = ip->i_size - uio->uio_offset; ncookies = ncookies / (offsetof(struct direct, d_name) + 4) + 1; cookies = malloc(ncookies * sizeof(*cookies), M_TEMP, M_WAITOK); *ap->a_ncookies = ncookies; *ap->a_cookies = cookies; } else { ncookies = 0; cookies = NULL; } offset = startoffset = uio->uio_offset; startresid = uio->uio_resid; error = 0; while (error == 0 && uio->uio_resid > 0 && uio->uio_offset < ip->i_size) { error = ffs_blkatoff(vp, uio->uio_offset, NULL, &bp); if (error) break; if (bp->b_offset + bp->b_bcount > ip->i_size) readcnt = ip->i_size - bp->b_offset; else readcnt = bp->b_bcount; skipcnt = (size_t)(uio->uio_offset - bp->b_offset) & ~(size_t)(DIRBLKSIZ - 1); offset = bp->b_offset + skipcnt; dp = (struct direct *)&bp->b_data[skipcnt]; edp = (struct direct *)&bp->b_data[readcnt]; while (error == 0 && uio->uio_resid > 0 && dp < edp) { if (dp->d_reclen <= offsetof(struct direct, d_name) || (caddr_t)dp + dp->d_reclen > (caddr_t)edp) { error = EIO; break; } #if BYTE_ORDER == LITTLE_ENDIAN /* Old filesystem format. */ if (vp->v_mount->mnt_maxsymlinklen <= 0) { dstdp.d_namlen = dp->d_type; dstdp.d_type = dp->d_namlen; } else #endif { dstdp.d_namlen = dp->d_namlen; dstdp.d_type = dp->d_type; } if (offsetof(struct direct, d_name) + dstdp.d_namlen > dp->d_reclen) { error = EIO; break; } if (offset < startoffset || dp->d_ino == 0) goto nextentry; dstdp.d_fileno = dp->d_ino; dstdp.d_reclen = GENERIC_DIRSIZ(&dstdp); bcopy(dp->d_name, dstdp.d_name, dstdp.d_namlen); dstdp.d_name[dstdp.d_namlen] = '\0'; if (dstdp.d_reclen > uio->uio_resid) { if (uio->uio_resid == startresid) error = EINVAL; else error = EJUSTRETURN; break; } /* Advance dp. */ error = uiomove((caddr_t)&dstdp, dstdp.d_reclen, uio); if (error) break; if (cookies != NULL) { KASSERT(ncookies > 0, ("ufs_readdir: cookies buffer too small")); *cookies = offset + dp->d_reclen; cookies++; ncookies--; } nextentry: offset += dp->d_reclen; dp = (struct direct *)((caddr_t)dp + dp->d_reclen); } bqrelse(bp); uio->uio_offset = offset; } /* We need to correct uio_offset. */ uio->uio_offset = offset; if (error == EJUSTRETURN) error = 0; if (ap->a_ncookies != NULL) { if (error == 0) { ap->a_ncookies -= ncookies; } else { free(*ap->a_cookies, M_TEMP); *ap->a_ncookies = 0; *ap->a_cookies = NULL; } } if (error == 0 && ap->a_eofflag) *ap->a_eofflag = ip->i_size <= uio->uio_offset; return (error); } /* * Return target name of a symbolic link */ static int ufs_readlink(ap) struct vop_readlink_args /* { struct vnode *a_vp; struct uio *a_uio; struct ucred *a_cred; } */ *ap; { struct vnode *vp = ap->a_vp; struct inode *ip = VTOI(vp); doff_t isize; isize = ip->i_size; if ((isize < vp->v_mount->mnt_maxsymlinklen) || DIP(ip, i_blocks) == 0) { /* XXX - for old fastlink support */ return (uiomove(SHORTLINK(ip), isize, ap->a_uio)); } return (VOP_READ(vp, ap->a_uio, 0, ap->a_cred)); } /* * Calculate the logical to physical mapping if not done already, * then call the device strategy routine. * * In order to be able to swap to a file, the ufs_bmaparray() operation may not * deadlock on memory. See ufs_bmap() for details. */ static int ufs_strategy(ap) struct vop_strategy_args /* { struct vnode *a_vp; struct buf *a_bp; } */ *ap; { struct buf *bp = ap->a_bp; struct vnode *vp = ap->a_vp; struct bufobj *bo; struct inode *ip; ufs2_daddr_t blkno; int error; ip = VTOI(vp); if (bp->b_blkno == bp->b_lblkno) { error = ufs_bmaparray(vp, bp->b_lblkno, &blkno, bp, NULL, NULL); bp->b_blkno = blkno; if (error) { bp->b_error = error; bp->b_ioflags |= BIO_ERROR; bufdone(bp); return (0); } if ((long)bp->b_blkno == -1) vfs_bio_clrbuf(bp); } if ((long)bp->b_blkno == -1) { bufdone(bp); return (0); } bp->b_iooffset = dbtob(bp->b_blkno); bo = ip->i_umbufobj; BO_STRATEGY(bo, bp); return (0); } /* * Print out the contents of an inode. */ static int ufs_print(ap) struct vop_print_args /* { struct vnode *a_vp; } */ *ap; { struct vnode *vp = ap->a_vp; struct inode *ip = VTOI(vp); printf("\tino %lu, on dev %s", (u_long)ip->i_number, devtoname(ip->i_dev)); if (vp->v_type == VFIFO) fifo_printinfo(vp); printf("\n"); return (0); } /* * Close wrapper for fifos. * * Update the times on the inode then do device close. */ static int ufsfifo_close(ap) struct vop_close_args /* { struct vnode *a_vp; int a_fflag; struct ucred *a_cred; struct thread *a_td; } */ *ap; { struct vnode *vp = ap->a_vp; int usecount; VI_LOCK(vp); usecount = vp->v_usecount; if (usecount > 1) ufs_itimes_locked(vp); VI_UNLOCK(vp); return (fifo_specops.vop_close(ap)); } /* * Kqfilter wrapper for fifos. * * Fall through to ufs kqfilter routines if needed */ static int ufsfifo_kqfilter(ap) struct vop_kqfilter_args *ap; { int error; error = fifo_specops.vop_kqfilter(ap); if (error) error = vfs_kqfilter(ap); return (error); } /* * Return POSIX pathconf information applicable to fifos. */ static int ufsfifo_pathconf(ap) struct vop_pathconf_args /* { struct vnode *a_vp; int a_name; int *a_retval; } */ *ap; { switch (ap->a_name) { case _PC_ACL_EXTENDED: case _PC_ACL_NFS4: case _PC_ACL_PATH_MAX: case _PC_MAC_PRESENT: return (ufs_pathconf(ap)); default: return (fifo_specops.vop_pathconf(ap)); } /* NOTREACHED */ } /* * Return POSIX pathconf information applicable to ufs filesystems. */ static int ufs_pathconf(ap) struct vop_pathconf_args /* { struct vnode *a_vp; int a_name; int *a_retval; } */ *ap; { int error; error = 0; switch (ap->a_name) { case _PC_LINK_MAX: *ap->a_retval = LINK_MAX; break; case _PC_NAME_MAX: *ap->a_retval = NAME_MAX; break; case _PC_PATH_MAX: *ap->a_retval = PATH_MAX; break; case _PC_PIPE_BUF: *ap->a_retval = PIPE_BUF; break; case _PC_CHOWN_RESTRICTED: *ap->a_retval = 1; break; case _PC_NO_TRUNC: *ap->a_retval = 1; break; case _PC_ACL_EXTENDED: #ifdef UFS_ACL if (ap->a_vp->v_mount->mnt_flag & MNT_ACLS) *ap->a_retval = 1; else *ap->a_retval = 0; #else *ap->a_retval = 0; #endif break; case _PC_ACL_NFS4: #ifdef UFS_ACL if (ap->a_vp->v_mount->mnt_flag & MNT_NFS4ACLS) *ap->a_retval = 1; else *ap->a_retval = 0; #else *ap->a_retval = 0; #endif break; case _PC_ACL_PATH_MAX: #ifdef UFS_ACL if (ap->a_vp->v_mount->mnt_flag & (MNT_ACLS | MNT_NFS4ACLS)) *ap->a_retval = ACL_MAX_ENTRIES; else *ap->a_retval = 3; #else *ap->a_retval = 3; #endif break; case _PC_MAC_PRESENT: #ifdef MAC if (ap->a_vp->v_mount->mnt_flag & MNT_MULTILABEL) *ap->a_retval = 1; else *ap->a_retval = 0; #else *ap->a_retval = 0; #endif break; case _PC_MIN_HOLE_SIZE: *ap->a_retval = ap->a_vp->v_mount->mnt_stat.f_iosize; break; case _PC_ASYNC_IO: /* _PC_ASYNC_IO should have been handled by upper layers. */ KASSERT(0, ("_PC_ASYNC_IO should not get here")); error = EINVAL; break; case _PC_PRIO_IO: *ap->a_retval = 0; break; case _PC_SYNC_IO: *ap->a_retval = 0; break; case _PC_ALLOC_SIZE_MIN: *ap->a_retval = ap->a_vp->v_mount->mnt_stat.f_bsize; break; case _PC_FILESIZEBITS: *ap->a_retval = 64; break; case _PC_REC_INCR_XFER_SIZE: *ap->a_retval = ap->a_vp->v_mount->mnt_stat.f_iosize; break; case _PC_REC_MAX_XFER_SIZE: *ap->a_retval = -1; /* means ``unlimited'' */ break; case _PC_REC_MIN_XFER_SIZE: *ap->a_retval = ap->a_vp->v_mount->mnt_stat.f_iosize; break; case _PC_REC_XFER_ALIGN: *ap->a_retval = PAGE_SIZE; break; case _PC_SYMLINK_MAX: *ap->a_retval = MAXPATHLEN; break; default: error = EINVAL; break; } return (error); } /* * Initialize the vnode associated with a new inode, handle aliased * vnodes. */ int ufs_vinit(mntp, fifoops, vpp) struct mount *mntp; struct vop_vector *fifoops; struct vnode **vpp; { struct inode *ip; struct vnode *vp; vp = *vpp; ip = VTOI(vp); vp->v_type = IFTOVT(ip->i_mode); if (vp->v_type == VFIFO) vp->v_op = fifoops; ASSERT_VOP_LOCKED(vp, "ufs_vinit"); if (ip->i_number == ROOTINO) vp->v_vflag |= VV_ROOT; *vpp = vp; return (0); } /* * Allocate a new inode. * Vnode dvp must be locked. */ static int ufs_makeinode(mode, dvp, vpp, cnp) int mode; struct vnode *dvp; struct vnode **vpp; struct componentname *cnp; { struct inode *ip, *pdir; struct direct newdir; struct vnode *tvp; int error; pdir = VTOI(dvp); #ifdef INVARIANTS if ((cnp->cn_flags & HASBUF) == 0) panic("ufs_makeinode: no name"); #endif *vpp = NULL; if ((mode & IFMT) == 0) mode |= IFREG; if (VTOI(dvp)->i_effnlink < 2) panic("ufs_makeinode: Bad link count %d on parent", VTOI(dvp)->i_effnlink); error = UFS_VALLOC(dvp, mode, cnp->cn_cred, &tvp); if (error) return (error); ip = VTOI(tvp); ip->i_gid = pdir->i_gid; DIP_SET(ip, i_gid, pdir->i_gid); #ifdef SUIDDIR { #ifdef QUOTA struct ucred ucred, *ucp; gid_t ucred_group; ucp = cnp->cn_cred; #endif /* * If we are not the owner of the directory, * and we are hacking owners here, (only do this where told to) * and we are not giving it TO root, (would subvert quotas) * then go ahead and give it to the other user. * Note that this drops off the execute bits for security. */ if ((dvp->v_mount->mnt_flag & MNT_SUIDDIR) && (pdir->i_mode & ISUID) && (pdir->i_uid != cnp->cn_cred->cr_uid) && pdir->i_uid) { ip->i_uid = pdir->i_uid; DIP_SET(ip, i_uid, ip->i_uid); mode &= ~07111; #ifdef QUOTA /* * Make sure the correct user gets charged * for the space. * Quickly knock up a dummy credential for the victim. * XXX This seems to never be accessed out of our * context so a stack variable is ok. */ refcount_init(&ucred.cr_ref, 1); ucred.cr_uid = ip->i_uid; ucred.cr_ngroups = 1; ucred.cr_groups = &ucred_group; ucred.cr_groups[0] = pdir->i_gid; ucp = &ucred; #endif } else { ip->i_uid = cnp->cn_cred->cr_uid; DIP_SET(ip, i_uid, ip->i_uid); } #ifdef QUOTA if ((error = getinoquota(ip)) || (error = chkiq(ip, 1, ucp, 0))) { if (DOINGSOFTDEP(tvp)) softdep_revert_link(pdir, ip); UFS_VFREE(tvp, ip->i_number, mode); vput(tvp); return (error); } #endif } #else /* !SUIDDIR */ ip->i_uid = cnp->cn_cred->cr_uid; DIP_SET(ip, i_uid, ip->i_uid); #ifdef QUOTA if ((error = getinoquota(ip)) || (error = chkiq(ip, 1, cnp->cn_cred, 0))) { if (DOINGSOFTDEP(tvp)) softdep_revert_link(pdir, ip); UFS_VFREE(tvp, ip->i_number, mode); vput(tvp); return (error); } #endif #endif /* !SUIDDIR */ ip->i_flag |= IN_ACCESS | IN_CHANGE | IN_UPDATE; ip->i_mode = mode; DIP_SET(ip, i_mode, mode); tvp->v_type = IFTOVT(mode); /* Rest init'd in getnewvnode(). */ ip->i_effnlink = 1; ip->i_nlink = 1; DIP_SET(ip, i_nlink, 1); if (DOINGSOFTDEP(tvp)) softdep_setup_create(VTOI(dvp), ip); if ((ip->i_mode & ISGID) && !groupmember(ip->i_gid, cnp->cn_cred) && priv_check_cred(cnp->cn_cred, PRIV_VFS_SETGID, 0)) { ip->i_mode &= ~ISGID; DIP_SET(ip, i_mode, ip->i_mode); } if (cnp->cn_flags & ISWHITEOUT) { ip->i_flags |= UF_OPAQUE; DIP_SET(ip, i_flags, ip->i_flags); } /* * Make sure inode goes to disk before directory entry. */ error = UFS_UPDATE(tvp, !(DOINGSOFTDEP(tvp) | DOINGASYNC(tvp))); if (error) goto bad; #ifdef MAC if (dvp->v_mount->mnt_flag & MNT_MULTILABEL) { error = mac_vnode_create_extattr(cnp->cn_cred, dvp->v_mount, dvp, tvp, cnp); if (error) goto bad; } #endif #ifdef UFS_ACL if (dvp->v_mount->mnt_flag & MNT_ACLS) { error = ufs_do_posix1e_acl_inheritance_file(dvp, tvp, mode, cnp->cn_cred, cnp->cn_thread); if (error) goto bad; } else if (dvp->v_mount->mnt_flag & MNT_NFS4ACLS) { error = ufs_do_nfs4_acl_inheritance(dvp, tvp, mode, cnp->cn_cred, cnp->cn_thread); if (error) goto bad; } #endif /* !UFS_ACL */ ufs_makedirentry(ip, cnp, &newdir); error = ufs_direnter(dvp, tvp, &newdir, cnp, NULL, 0); if (error) goto bad; *vpp = tvp; return (0); bad: /* * Write error occurred trying to update the inode * or the directory so must deallocate the inode. */ ip->i_effnlink = 0; ip->i_nlink = 0; DIP_SET(ip, i_nlink, 0); ip->i_flag |= IN_CHANGE; if (DOINGSOFTDEP(tvp)) softdep_revert_create(VTOI(dvp), ip); vput(tvp); return (error); } static int ufs_ioctl(struct vop_ioctl_args *ap) { switch (ap->a_command) { case FIOSEEKDATA: case FIOSEEKHOLE: return (vn_bmap_seekhole(ap->a_vp, ap->a_command, (off_t *)ap->a_data, ap->a_cred)); default: return (ENOTTY); } } /* Global vfs data structures for ufs. */ struct vop_vector ufs_vnodeops = { .vop_default = &default_vnodeops, .vop_fsync = VOP_PANIC, .vop_read = VOP_PANIC, .vop_reallocblks = VOP_PANIC, .vop_write = VOP_PANIC, .vop_accessx = ufs_accessx, .vop_bmap = ufs_bmap, .vop_cachedlookup = ufs_lookup, .vop_close = ufs_close, .vop_create = ufs_create, .vop_getattr = ufs_getattr, .vop_inactive = ufs_inactive, .vop_ioctl = ufs_ioctl, .vop_link = ufs_link, .vop_lookup = vfs_cache_lookup, .vop_markatime = ufs_markatime, .vop_mkdir = ufs_mkdir, .vop_mknod = ufs_mknod, .vop_open = ufs_open, .vop_pathconf = ufs_pathconf, .vop_poll = vop_stdpoll, .vop_print = ufs_print, .vop_readdir = ufs_readdir, .vop_readlink = ufs_readlink, .vop_reclaim = ufs_reclaim, .vop_remove = ufs_remove, .vop_rename = ufs_rename, .vop_rmdir = ufs_rmdir, .vop_setattr = ufs_setattr, #ifdef MAC .vop_setlabel = vop_stdsetlabel_ea, #endif .vop_strategy = ufs_strategy, .vop_symlink = ufs_symlink, .vop_whiteout = ufs_whiteout, #ifdef UFS_EXTATTR .vop_getextattr = ufs_getextattr, .vop_deleteextattr = ufs_deleteextattr, .vop_setextattr = ufs_setextattr, #endif #ifdef UFS_ACL .vop_getacl = ufs_getacl, .vop_setacl = ufs_setacl, .vop_aclcheck = ufs_aclcheck, #endif }; struct vop_vector ufs_fifoops = { .vop_default = &fifo_specops, .vop_fsync = VOP_PANIC, .vop_accessx = ufs_accessx, .vop_close = ufsfifo_close, .vop_getattr = ufs_getattr, .vop_inactive = ufs_inactive, .vop_kqfilter = ufsfifo_kqfilter, .vop_markatime = ufs_markatime, .vop_pathconf = ufsfifo_pathconf, .vop_print = ufs_print, .vop_read = VOP_PANIC, .vop_reclaim = ufs_reclaim, .vop_setattr = ufs_setattr, #ifdef MAC .vop_setlabel = vop_stdsetlabel_ea, #endif .vop_write = VOP_PANIC, #ifdef UFS_EXTATTR .vop_getextattr = ufs_getextattr, .vop_deleteextattr = ufs_deleteextattr, .vop_setextattr = ufs_setextattr, #endif #ifdef UFS_ACL .vop_getacl = ufs_getacl, .vop_setacl = ufs_setacl, .vop_aclcheck = ufs_aclcheck, #endif }; Index: user/alc/PQ_LAUNDRY/sys/vm/uma_core.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/vm/uma_core.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/vm/uma_core.c (revision 303206) @@ -1,3685 +1,3684 @@ /*- * Copyright (c) 2002-2005, 2009, 2013 Jeffrey Roberson * Copyright (c) 2004, 2005 Bosko Milekic * Copyright (c) 2004-2006 Robert N. M. Watson * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /* * uma_core.c Implementation of the Universal Memory allocator * * This allocator is intended to replace the multitude of similar object caches * in the standard FreeBSD kernel. The intent is to be flexible as well as * efficient. A primary design goal is to return unused memory to the rest of * the system. This will make the system as a whole more flexible due to the * ability to move memory to subsystems which most need it instead of leaving * pools of reserved memory unused. * * The basic ideas stem from similar slab/zone based allocators whose algorithms * are well known. * */ /* * TODO: * - Improve memory usage for large allocations * - Investigate cache size adjustments */ #include __FBSDID("$FreeBSD$"); /* I should really use ktr.. */ /* #define UMA_DEBUG 1 #define UMA_DEBUG_ALLOC 1 #define UMA_DEBUG_ALLOC_1 1 */ #include "opt_ddb.h" #include "opt_param.h" #include "opt_vm.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef DEBUG_MEMGUARD #include #endif /* * This is the zone and keg from which all zones are spawned. The idea is that * even the zone & keg heads are allocated from the allocator, so we use the * bss section to bootstrap us. */ static struct uma_keg masterkeg; static struct uma_zone masterzone_k; static struct uma_zone masterzone_z; static uma_zone_t kegs = &masterzone_k; static uma_zone_t zones = &masterzone_z; /* This is the zone from which all of uma_slab_t's are allocated. */ static uma_zone_t slabzone; /* * The initial hash tables come out of this zone so they can be allocated * prior to malloc coming up. */ static uma_zone_t hashzone; /* The boot-time adjusted value for cache line alignment. */ int uma_align_cache = 64 - 1; static MALLOC_DEFINE(M_UMAHASH, "UMAHash", "UMA Hash Buckets"); /* * Are we allowed to allocate buckets? */ static int bucketdisable = 1; /* Linked list of all kegs in the system */ static LIST_HEAD(,uma_keg) uma_kegs = LIST_HEAD_INITIALIZER(uma_kegs); /* Linked list of all cache-only zones in the system */ static LIST_HEAD(,uma_zone) uma_cachezones = LIST_HEAD_INITIALIZER(uma_cachezones); /* This RW lock protects the keg list */ static struct rwlock_padalign uma_rwlock; /* Linked list of boot time pages */ static LIST_HEAD(,uma_slab) uma_boot_pages = LIST_HEAD_INITIALIZER(uma_boot_pages); /* This mutex protects the boot time pages list */ static struct mtx_padalign uma_boot_pages_mtx; static struct sx uma_drain_lock; /* Is the VM done starting up? */ static int booted = 0; #define UMA_STARTUP 1 #define UMA_STARTUP2 2 /* * This is the handle used to schedule events that need to happen * outside of the allocation fast path. */ static struct callout uma_callout; #define UMA_TIMEOUT 20 /* Seconds for callout interval. */ /* * This structure is passed as the zone ctor arg so that I don't have to create * a special allocation function just for zones. */ struct uma_zctor_args { const char *name; size_t size; uma_ctor ctor; uma_dtor dtor; uma_init uminit; uma_fini fini; uma_import import; uma_release release; void *arg; uma_keg_t keg; int align; uint32_t flags; }; struct uma_kctor_args { uma_zone_t zone; size_t size; uma_init uminit; uma_fini fini; int align; uint32_t flags; }; struct uma_bucket_zone { uma_zone_t ubz_zone; char *ubz_name; int ubz_entries; /* Number of items it can hold. */ int ubz_maxsize; /* Maximum allocation size per-item. */ }; /* * Compute the actual number of bucket entries to pack them in power * of two sizes for more efficient space utilization. */ #define BUCKET_SIZE(n) \ (((sizeof(void *) * (n)) - sizeof(struct uma_bucket)) / sizeof(void *)) #define BUCKET_MAX BUCKET_SIZE(256) struct uma_bucket_zone bucket_zones[] = { { NULL, "4 Bucket", BUCKET_SIZE(4), 4096 }, { NULL, "6 Bucket", BUCKET_SIZE(6), 3072 }, { NULL, "8 Bucket", BUCKET_SIZE(8), 2048 }, { NULL, "12 Bucket", BUCKET_SIZE(12), 1536 }, { NULL, "16 Bucket", BUCKET_SIZE(16), 1024 }, { NULL, "32 Bucket", BUCKET_SIZE(32), 512 }, { NULL, "64 Bucket", BUCKET_SIZE(64), 256 }, { NULL, "128 Bucket", BUCKET_SIZE(128), 128 }, { NULL, "256 Bucket", BUCKET_SIZE(256), 64 }, { NULL, NULL, 0} }; /* * Flags and enumerations to be passed to internal functions. */ enum zfreeskip { SKIP_NONE = 0, SKIP_DTOR, SKIP_FINI }; /* Prototypes.. */ static void *noobj_alloc(uma_zone_t, vm_size_t, uint8_t *, int); static void *page_alloc(uma_zone_t, vm_size_t, uint8_t *, int); static void *startup_alloc(uma_zone_t, vm_size_t, uint8_t *, int); static void page_free(void *, vm_size_t, uint8_t); static uma_slab_t keg_alloc_slab(uma_keg_t, uma_zone_t, int); static void cache_drain(uma_zone_t); static void bucket_drain(uma_zone_t, uma_bucket_t); static void bucket_cache_drain(uma_zone_t zone); static int keg_ctor(void *, int, void *, int); static void keg_dtor(void *, int, void *); static int zone_ctor(void *, int, void *, int); static void zone_dtor(void *, int, void *); static int zero_init(void *, int, int); static void keg_small_init(uma_keg_t keg); static void keg_large_init(uma_keg_t keg); static void zone_foreach(void (*zfunc)(uma_zone_t)); static void zone_timeout(uma_zone_t zone); static int hash_alloc(struct uma_hash *); static int hash_expand(struct uma_hash *, struct uma_hash *); static void hash_free(struct uma_hash *hash); static void uma_timeout(void *); static void uma_startup3(void); static void *zone_alloc_item(uma_zone_t, void *, int); static void zone_free_item(uma_zone_t, void *, void *, enum zfreeskip); static void bucket_enable(void); static void bucket_init(void); static uma_bucket_t bucket_alloc(uma_zone_t zone, void *, int); static void bucket_free(uma_zone_t zone, uma_bucket_t, void *); static void bucket_zone_drain(void); static uma_bucket_t zone_alloc_bucket(uma_zone_t zone, void *, int flags); static uma_slab_t zone_fetch_slab(uma_zone_t zone, uma_keg_t last, int flags); static uma_slab_t zone_fetch_slab_multi(uma_zone_t zone, uma_keg_t last, int flags); static void *slab_alloc_item(uma_keg_t keg, uma_slab_t slab); static void slab_free_item(uma_keg_t keg, uma_slab_t slab, void *item); static uma_keg_t uma_kcreate(uma_zone_t zone, size_t size, uma_init uminit, uma_fini fini, int align, uint32_t flags); static int zone_import(uma_zone_t zone, void **bucket, int max, int flags); static void zone_release(uma_zone_t zone, void **bucket, int cnt); static void uma_zero_item(void *item, uma_zone_t zone); void uma_print_zone(uma_zone_t); void uma_print_stats(void); static int sysctl_vm_zone_count(SYSCTL_HANDLER_ARGS); static int sysctl_vm_zone_stats(SYSCTL_HANDLER_ARGS); #ifdef INVARIANTS static void uma_dbg_free(uma_zone_t zone, uma_slab_t slab, void *item); static void uma_dbg_alloc(uma_zone_t zone, uma_slab_t slab, void *item); #endif SYSINIT(uma_startup3, SI_SUB_VM_CONF, SI_ORDER_SECOND, uma_startup3, NULL); SYSCTL_PROC(_vm, OID_AUTO, zone_count, CTLFLAG_RD|CTLTYPE_INT, 0, 0, sysctl_vm_zone_count, "I", "Number of UMA zones"); SYSCTL_PROC(_vm, OID_AUTO, zone_stats, CTLFLAG_RD|CTLTYPE_STRUCT, 0, 0, sysctl_vm_zone_stats, "s,struct uma_type_header", "Zone Stats"); static int zone_warnings = 1; SYSCTL_INT(_vm, OID_AUTO, zone_warnings, CTLFLAG_RWTUN, &zone_warnings, 0, "Warn when UMA zones becomes full"); /* * This routine checks to see whether or not it's safe to enable buckets. */ static void bucket_enable(void) { bucketdisable = vm_page_count_min(); } /* * Initialize bucket_zones, the array of zones of buckets of various sizes. * * For each zone, calculate the memory required for each bucket, consisting * of the header and an array of pointers. */ static void bucket_init(void) { struct uma_bucket_zone *ubz; int size; for (ubz = &bucket_zones[0]; ubz->ubz_entries != 0; ubz++) { size = roundup(sizeof(struct uma_bucket), sizeof(void *)); size += sizeof(void *) * ubz->ubz_entries; ubz->ubz_zone = uma_zcreate(ubz->ubz_name, size, NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_MTXCLASS | UMA_ZFLAG_BUCKET); } } /* * Given a desired number of entries for a bucket, return the zone from which * to allocate the bucket. */ static struct uma_bucket_zone * bucket_zone_lookup(int entries) { struct uma_bucket_zone *ubz; for (ubz = &bucket_zones[0]; ubz->ubz_entries != 0; ubz++) if (ubz->ubz_entries >= entries) return (ubz); ubz--; return (ubz); } static int bucket_select(int size) { struct uma_bucket_zone *ubz; ubz = &bucket_zones[0]; if (size > ubz->ubz_maxsize) return MAX((ubz->ubz_maxsize * ubz->ubz_entries) / size, 1); for (; ubz->ubz_entries != 0; ubz++) if (ubz->ubz_maxsize < size) break; ubz--; return (ubz->ubz_entries); } static uma_bucket_t bucket_alloc(uma_zone_t zone, void *udata, int flags) { struct uma_bucket_zone *ubz; uma_bucket_t bucket; /* * This is to stop us from allocating per cpu buckets while we're * running out of vm.boot_pages. Otherwise, we would exhaust the * boot pages. This also prevents us from allocating buckets in * low memory situations. */ if (bucketdisable) return (NULL); /* * To limit bucket recursion we store the original zone flags * in a cookie passed via zalloc_arg/zfree_arg. This allows the * NOVM flag to persist even through deep recursions. We also * store ZFLAG_BUCKET once we have recursed attempting to allocate * a bucket for a bucket zone so we do not allow infinite bucket * recursion. This cookie will even persist to frees of unused * buckets via the allocation path or bucket allocations in the * free path. */ if ((zone->uz_flags & UMA_ZFLAG_BUCKET) == 0) udata = (void *)(uintptr_t)zone->uz_flags; else { if ((uintptr_t)udata & UMA_ZFLAG_BUCKET) return (NULL); udata = (void *)((uintptr_t)udata | UMA_ZFLAG_BUCKET); } if ((uintptr_t)udata & UMA_ZFLAG_CACHEONLY) flags |= M_NOVM; ubz = bucket_zone_lookup(zone->uz_count); if (ubz->ubz_zone == zone && (ubz + 1)->ubz_entries != 0) ubz++; bucket = uma_zalloc_arg(ubz->ubz_zone, udata, flags); if (bucket) { #ifdef INVARIANTS bzero(bucket->ub_bucket, sizeof(void *) * ubz->ubz_entries); #endif bucket->ub_cnt = 0; bucket->ub_entries = ubz->ubz_entries; } return (bucket); } static void bucket_free(uma_zone_t zone, uma_bucket_t bucket, void *udata) { struct uma_bucket_zone *ubz; KASSERT(bucket->ub_cnt == 0, ("bucket_free: Freeing a non free bucket.")); if ((zone->uz_flags & UMA_ZFLAG_BUCKET) == 0) udata = (void *)(uintptr_t)zone->uz_flags; ubz = bucket_zone_lookup(bucket->ub_entries); uma_zfree_arg(ubz->ubz_zone, bucket, udata); } static void bucket_zone_drain(void) { struct uma_bucket_zone *ubz; for (ubz = &bucket_zones[0]; ubz->ubz_entries != 0; ubz++) zone_drain(ubz->ubz_zone); } static void zone_log_warning(uma_zone_t zone) { static const struct timeval warninterval = { 300, 0 }; if (!zone_warnings || zone->uz_warning == NULL) return; if (ratecheck(&zone->uz_ratecheck, &warninterval)) printf("[zone: %s] %s\n", zone->uz_name, zone->uz_warning); } static inline void zone_maxaction(uma_zone_t zone) { if (zone->uz_maxaction.ta_func != NULL) taskqueue_enqueue(taskqueue_thread, &zone->uz_maxaction); } static void zone_foreach_keg(uma_zone_t zone, void (*kegfn)(uma_keg_t)) { uma_klink_t klink; LIST_FOREACH(klink, &zone->uz_kegs, kl_link) kegfn(klink->kl_keg); } /* * Routine called by timeout which is used to fire off some time interval * based calculations. (stats, hash size, etc.) * * Arguments: * arg Unused * * Returns: * Nothing */ static void uma_timeout(void *unused) { bucket_enable(); zone_foreach(zone_timeout); /* Reschedule this event */ callout_reset(&uma_callout, UMA_TIMEOUT * hz, uma_timeout, NULL); } /* * Routine to perform timeout driven calculations. This expands the * hashes and does per cpu statistics aggregation. * * Returns nothing. */ static void keg_timeout(uma_keg_t keg) { KEG_LOCK(keg); /* * Expand the keg hash table. * * This is done if the number of slabs is larger than the hash size. * What I'm trying to do here is completely reduce collisions. This * may be a little aggressive. Should I allow for two collisions max? */ if (keg->uk_flags & UMA_ZONE_HASH && keg->uk_pages / keg->uk_ppera >= keg->uk_hash.uh_hashsize) { struct uma_hash newhash; struct uma_hash oldhash; int ret; /* * This is so involved because allocating and freeing * while the keg lock is held will lead to deadlock. * I have to do everything in stages and check for * races. */ newhash = keg->uk_hash; KEG_UNLOCK(keg); ret = hash_alloc(&newhash); KEG_LOCK(keg); if (ret) { if (hash_expand(&keg->uk_hash, &newhash)) { oldhash = keg->uk_hash; keg->uk_hash = newhash; } else oldhash = newhash; KEG_UNLOCK(keg); hash_free(&oldhash); return; } } KEG_UNLOCK(keg); } static void zone_timeout(uma_zone_t zone) { zone_foreach_keg(zone, &keg_timeout); } /* * Allocate and zero fill the next sized hash table from the appropriate * backing store. * * Arguments: * hash A new hash structure with the old hash size in uh_hashsize * * Returns: * 1 on success and 0 on failure. */ static int hash_alloc(struct uma_hash *hash) { int oldsize; int alloc; oldsize = hash->uh_hashsize; /* We're just going to go to a power of two greater */ if (oldsize) { hash->uh_hashsize = oldsize * 2; alloc = sizeof(hash->uh_slab_hash[0]) * hash->uh_hashsize; hash->uh_slab_hash = (struct slabhead *)malloc(alloc, M_UMAHASH, M_NOWAIT); } else { alloc = sizeof(hash->uh_slab_hash[0]) * UMA_HASH_SIZE_INIT; hash->uh_slab_hash = zone_alloc_item(hashzone, NULL, M_WAITOK); hash->uh_hashsize = UMA_HASH_SIZE_INIT; } if (hash->uh_slab_hash) { bzero(hash->uh_slab_hash, alloc); hash->uh_hashmask = hash->uh_hashsize - 1; return (1); } return (0); } /* * Expands the hash table for HASH zones. This is done from zone_timeout * to reduce collisions. This must not be done in the regular allocation * path, otherwise, we can recurse on the vm while allocating pages. * * Arguments: * oldhash The hash you want to expand * newhash The hash structure for the new table * * Returns: * Nothing * * Discussion: */ static int hash_expand(struct uma_hash *oldhash, struct uma_hash *newhash) { uma_slab_t slab; int hval; int i; if (!newhash->uh_slab_hash) return (0); if (oldhash->uh_hashsize >= newhash->uh_hashsize) return (0); /* * I need to investigate hash algorithms for resizing without a * full rehash. */ for (i = 0; i < oldhash->uh_hashsize; i++) while (!SLIST_EMPTY(&oldhash->uh_slab_hash[i])) { slab = SLIST_FIRST(&oldhash->uh_slab_hash[i]); SLIST_REMOVE_HEAD(&oldhash->uh_slab_hash[i], us_hlink); hval = UMA_HASH(newhash, slab->us_data); SLIST_INSERT_HEAD(&newhash->uh_slab_hash[hval], slab, us_hlink); } return (1); } /* * Free the hash bucket to the appropriate backing store. * * Arguments: * slab_hash The hash bucket we're freeing * hashsize The number of entries in that hash bucket * * Returns: * Nothing */ static void hash_free(struct uma_hash *hash) { if (hash->uh_slab_hash == NULL) return; if (hash->uh_hashsize == UMA_HASH_SIZE_INIT) zone_free_item(hashzone, hash->uh_slab_hash, NULL, SKIP_NONE); else free(hash->uh_slab_hash, M_UMAHASH); } /* * Frees all outstanding items in a bucket * * Arguments: * zone The zone to free to, must be unlocked. * bucket The free/alloc bucket with items, cpu queue must be locked. * * Returns: * Nothing */ static void bucket_drain(uma_zone_t zone, uma_bucket_t bucket) { int i; if (bucket == NULL) return; if (zone->uz_fini) for (i = 0; i < bucket->ub_cnt; i++) zone->uz_fini(bucket->ub_bucket[i], zone->uz_size); zone->uz_release(zone->uz_arg, bucket->ub_bucket, bucket->ub_cnt); bucket->ub_cnt = 0; } /* * Drains the per cpu caches for a zone. * * NOTE: This may only be called while the zone is being turn down, and not * during normal operation. This is necessary in order that we do not have * to migrate CPUs to drain the per-CPU caches. * * Arguments: * zone The zone to drain, must be unlocked. * * Returns: * Nothing */ static void cache_drain(uma_zone_t zone) { uma_cache_t cache; int cpu; /* * XXX: It is safe to not lock the per-CPU caches, because we're * tearing down the zone anyway. I.e., there will be no further use * of the caches at this point. * * XXX: It would good to be able to assert that the zone is being * torn down to prevent improper use of cache_drain(). * * XXX: We lock the zone before passing into bucket_cache_drain() as * it is used elsewhere. Should the tear-down path be made special * there in some form? */ CPU_FOREACH(cpu) { cache = &zone->uz_cpu[cpu]; bucket_drain(zone, cache->uc_allocbucket); bucket_drain(zone, cache->uc_freebucket); if (cache->uc_allocbucket != NULL) bucket_free(zone, cache->uc_allocbucket, NULL); if (cache->uc_freebucket != NULL) bucket_free(zone, cache->uc_freebucket, NULL); cache->uc_allocbucket = cache->uc_freebucket = NULL; } ZONE_LOCK(zone); bucket_cache_drain(zone); ZONE_UNLOCK(zone); } static void cache_shrink(uma_zone_t zone) { if (zone->uz_flags & UMA_ZFLAG_INTERNAL) return; ZONE_LOCK(zone); zone->uz_count = (zone->uz_count_min + zone->uz_count) / 2; ZONE_UNLOCK(zone); } static void cache_drain_safe_cpu(uma_zone_t zone) { uma_cache_t cache; uma_bucket_t b1, b2; if (zone->uz_flags & UMA_ZFLAG_INTERNAL) return; b1 = b2 = NULL; ZONE_LOCK(zone); critical_enter(); cache = &zone->uz_cpu[curcpu]; if (cache->uc_allocbucket) { if (cache->uc_allocbucket->ub_cnt != 0) LIST_INSERT_HEAD(&zone->uz_buckets, cache->uc_allocbucket, ub_link); else b1 = cache->uc_allocbucket; cache->uc_allocbucket = NULL; } if (cache->uc_freebucket) { if (cache->uc_freebucket->ub_cnt != 0) LIST_INSERT_HEAD(&zone->uz_buckets, cache->uc_freebucket, ub_link); else b2 = cache->uc_freebucket; cache->uc_freebucket = NULL; } critical_exit(); ZONE_UNLOCK(zone); if (b1) bucket_free(zone, b1, NULL); if (b2) bucket_free(zone, b2, NULL); } /* * Safely drain per-CPU caches of a zone(s) to alloc bucket. * This is an expensive call because it needs to bind to all CPUs * one by one and enter a critical section on each of them in order * to safely access their cache buckets. * Zone lock must not be held on call this function. */ static void cache_drain_safe(uma_zone_t zone) { int cpu; /* * Polite bucket sizes shrinking was not enouth, shrink aggressively. */ if (zone) cache_shrink(zone); else zone_foreach(cache_shrink); CPU_FOREACH(cpu) { thread_lock(curthread); sched_bind(curthread, cpu); thread_unlock(curthread); if (zone) cache_drain_safe_cpu(zone); else zone_foreach(cache_drain_safe_cpu); } thread_lock(curthread); sched_unbind(curthread); thread_unlock(curthread); } /* * Drain the cached buckets from a zone. Expects a locked zone on entry. */ static void bucket_cache_drain(uma_zone_t zone) { uma_bucket_t bucket; /* * Drain the bucket queues and free the buckets, we just keep two per * cpu (alloc/free). */ while ((bucket = LIST_FIRST(&zone->uz_buckets)) != NULL) { LIST_REMOVE(bucket, ub_link); ZONE_UNLOCK(zone); bucket_drain(zone, bucket); bucket_free(zone, bucket, NULL); ZONE_LOCK(zone); } /* * Shrink further bucket sizes. Price of single zone lock collision * is probably lower then price of global cache drain. */ if (zone->uz_count > zone->uz_count_min) zone->uz_count--; } static void keg_free_slab(uma_keg_t keg, uma_slab_t slab, int start) { uint8_t *mem; int i; uint8_t flags; mem = slab->us_data; flags = slab->us_flags; i = start; if (keg->uk_fini != NULL) { for (i--; i > -1; i--) keg->uk_fini(slab->us_data + (keg->uk_rsize * i), keg->uk_size); } if (keg->uk_flags & UMA_ZONE_OFFPAGE) zone_free_item(keg->uk_slabzone, slab, NULL, SKIP_NONE); #ifdef UMA_DEBUG printf("%s: Returning %d bytes.\n", keg->uk_name, PAGE_SIZE * keg->uk_ppera); #endif keg->uk_freef(mem, PAGE_SIZE * keg->uk_ppera, flags); } /* * Frees pages from a keg back to the system. This is done on demand from * the pageout daemon. * * Returns nothing. */ static void keg_drain(uma_keg_t keg) { struct slabhead freeslabs = { 0 }; uma_slab_t slab; uma_slab_t n; /* * We don't want to take pages from statically allocated kegs at this * time */ if (keg->uk_flags & UMA_ZONE_NOFREE || keg->uk_freef == NULL) return; #ifdef UMA_DEBUG printf("%s free items: %u\n", keg->uk_name, keg->uk_free); #endif KEG_LOCK(keg); if (keg->uk_free == 0) goto finished; slab = LIST_FIRST(&keg->uk_free_slab); while (slab) { n = LIST_NEXT(slab, us_link); /* We have no where to free these to */ if (slab->us_flags & UMA_SLAB_BOOT) { slab = n; continue; } LIST_REMOVE(slab, us_link); keg->uk_pages -= keg->uk_ppera; keg->uk_free -= keg->uk_ipers; if (keg->uk_flags & UMA_ZONE_HASH) UMA_HASH_REMOVE(&keg->uk_hash, slab, slab->us_data); SLIST_INSERT_HEAD(&freeslabs, slab, us_hlink); slab = n; } finished: KEG_UNLOCK(keg); while ((slab = SLIST_FIRST(&freeslabs)) != NULL) { SLIST_REMOVE(&freeslabs, slab, uma_slab, us_hlink); keg_free_slab(keg, slab, keg->uk_ipers); } } static void zone_drain_wait(uma_zone_t zone, int waitok) { /* * Set draining to interlock with zone_dtor() so we can release our * locks as we go. Only dtor() should do a WAITOK call since it * is the only call that knows the structure will still be available * when it wakes up. */ ZONE_LOCK(zone); while (zone->uz_flags & UMA_ZFLAG_DRAINING) { if (waitok == M_NOWAIT) goto out; msleep(zone, zone->uz_lockptr, PVM, "zonedrain", 1); } zone->uz_flags |= UMA_ZFLAG_DRAINING; bucket_cache_drain(zone); ZONE_UNLOCK(zone); /* * The DRAINING flag protects us from being freed while * we're running. Normally the uma_rwlock would protect us but we * must be able to release and acquire the right lock for each keg. */ zone_foreach_keg(zone, &keg_drain); ZONE_LOCK(zone); zone->uz_flags &= ~UMA_ZFLAG_DRAINING; wakeup(zone); out: ZONE_UNLOCK(zone); } void zone_drain(uma_zone_t zone) { zone_drain_wait(zone, M_NOWAIT); } /* * Allocate a new slab for a keg. This does not insert the slab onto a list. * * Arguments: * wait Shall we wait? * * Returns: * The slab that was allocated or NULL if there is no memory and the * caller specified M_NOWAIT. */ static uma_slab_t keg_alloc_slab(uma_keg_t keg, uma_zone_t zone, int wait) { uma_alloc allocf; uma_slab_t slab; uint8_t *mem; uint8_t flags; int i; mtx_assert(&keg->uk_lock, MA_OWNED); slab = NULL; mem = NULL; #ifdef UMA_DEBUG printf("alloc_slab: Allocating a new slab for %s\n", keg->uk_name); #endif allocf = keg->uk_allocf; KEG_UNLOCK(keg); if (keg->uk_flags & UMA_ZONE_OFFPAGE) { slab = zone_alloc_item(keg->uk_slabzone, NULL, wait); if (slab == NULL) goto out; } /* * This reproduces the old vm_zone behavior of zero filling pages the * first time they are added to a zone. * * Malloced items are zeroed in uma_zalloc. */ if ((keg->uk_flags & UMA_ZONE_MALLOC) == 0) wait |= M_ZERO; else wait &= ~M_ZERO; if (keg->uk_flags & UMA_ZONE_NODUMP) wait |= M_NODUMP; /* zone is passed for legacy reasons. */ mem = allocf(zone, keg->uk_ppera * PAGE_SIZE, &flags, wait); if (mem == NULL) { if (keg->uk_flags & UMA_ZONE_OFFPAGE) zone_free_item(keg->uk_slabzone, slab, NULL, SKIP_NONE); slab = NULL; goto out; } /* Point the slab into the allocated memory */ if (!(keg->uk_flags & UMA_ZONE_OFFPAGE)) slab = (uma_slab_t )(mem + keg->uk_pgoff); if (keg->uk_flags & UMA_ZONE_VTOSLAB) for (i = 0; i < keg->uk_ppera; i++) vsetslab((vm_offset_t)mem + (i * PAGE_SIZE), slab); slab->us_keg = keg; slab->us_data = mem; slab->us_freecount = keg->uk_ipers; slab->us_flags = flags; BIT_FILL(SLAB_SETSIZE, &slab->us_free); #ifdef INVARIANTS BIT_ZERO(SLAB_SETSIZE, &slab->us_debugfree); #endif if (keg->uk_init != NULL) { for (i = 0; i < keg->uk_ipers; i++) if (keg->uk_init(slab->us_data + (keg->uk_rsize * i), keg->uk_size, wait) != 0) break; if (i != keg->uk_ipers) { keg_free_slab(keg, slab, i); slab = NULL; goto out; } } out: KEG_LOCK(keg); if (slab != NULL) { if (keg->uk_flags & UMA_ZONE_HASH) UMA_HASH_INSERT(&keg->uk_hash, slab, mem); keg->uk_pages += keg->uk_ppera; keg->uk_free += keg->uk_ipers; } return (slab); } /* * This function is intended to be used early on in place of page_alloc() so * that we may use the boot time page cache to satisfy allocations before * the VM is ready. */ static void * startup_alloc(uma_zone_t zone, vm_size_t bytes, uint8_t *pflag, int wait) { uma_keg_t keg; uma_slab_t tmps; int pages, check_pages; keg = zone_first_keg(zone); pages = howmany(bytes, PAGE_SIZE); check_pages = pages - 1; KASSERT(pages > 0, ("startup_alloc can't reserve 0 pages\n")); /* * Check our small startup cache to see if it has pages remaining. */ mtx_lock(&uma_boot_pages_mtx); /* First check if we have enough room. */ tmps = LIST_FIRST(&uma_boot_pages); while (tmps != NULL && check_pages-- > 0) tmps = LIST_NEXT(tmps, us_link); if (tmps != NULL) { /* * It's ok to lose tmps references. The last one will * have tmps->us_data pointing to the start address of * "pages" contiguous pages of memory. */ while (pages-- > 0) { tmps = LIST_FIRST(&uma_boot_pages); LIST_REMOVE(tmps, us_link); } mtx_unlock(&uma_boot_pages_mtx); *pflag = tmps->us_flags; return (tmps->us_data); } mtx_unlock(&uma_boot_pages_mtx); if (booted < UMA_STARTUP2) panic("UMA: Increase vm.boot_pages"); /* * Now that we've booted reset these users to their real allocator. */ #ifdef UMA_MD_SMALL_ALLOC keg->uk_allocf = (keg->uk_ppera > 1) ? page_alloc : uma_small_alloc; #else keg->uk_allocf = page_alloc; #endif return keg->uk_allocf(zone, bytes, pflag, wait); } /* * Allocates a number of pages from the system * * Arguments: * bytes The number of bytes requested * wait Shall we wait? * * Returns: * A pointer to the alloced memory or possibly * NULL if M_NOWAIT is set. */ static void * page_alloc(uma_zone_t zone, vm_size_t bytes, uint8_t *pflag, int wait) { void *p; /* Returned page */ *pflag = UMA_SLAB_KMEM; p = (void *) kmem_malloc(kmem_arena, bytes, wait); return (p); } /* * Allocates a number of pages from within an object * * Arguments: * bytes The number of bytes requested * wait Shall we wait? * * Returns: * A pointer to the alloced memory or possibly * NULL if M_NOWAIT is set. */ static void * noobj_alloc(uma_zone_t zone, vm_size_t bytes, uint8_t *flags, int wait) { TAILQ_HEAD(, vm_page) alloctail; u_long npages; vm_offset_t retkva, zkva; vm_page_t p, p_next; uma_keg_t keg; TAILQ_INIT(&alloctail); keg = zone_first_keg(zone); npages = howmany(bytes, PAGE_SIZE); while (npages > 0) { p = vm_page_alloc(NULL, 0, VM_ALLOC_INTERRUPT | VM_ALLOC_WIRED | VM_ALLOC_NOOBJ); if (p != NULL) { /* * Since the page does not belong to an object, its * listq is unused. */ TAILQ_INSERT_TAIL(&alloctail, p, listq); npages--; continue; } if (wait & M_WAITOK) { VM_WAIT; continue; } /* * Page allocation failed, free intermediate pages and * exit. */ TAILQ_FOREACH_SAFE(p, &alloctail, listq, p_next) { vm_page_unwire(p, PQ_NONE); vm_page_free(p); } return (NULL); } *flags = UMA_SLAB_PRIV; zkva = keg->uk_kva + atomic_fetchadd_long(&keg->uk_offset, round_page(bytes)); retkva = zkva; TAILQ_FOREACH(p, &alloctail, listq) { pmap_qenter(zkva, &p, 1); zkva += PAGE_SIZE; } return ((void *)retkva); } /* * Frees a number of pages to the system * * Arguments: * mem A pointer to the memory to be freed * size The size of the memory being freed * flags The original p->us_flags field * * Returns: * Nothing */ static void page_free(void *mem, vm_size_t size, uint8_t flags) { struct vmem *vmem; if (flags & UMA_SLAB_KMEM) vmem = kmem_arena; else if (flags & UMA_SLAB_KERNEL) vmem = kernel_arena; else panic("UMA: page_free used with invalid flags %d", flags); kmem_free(vmem, (vm_offset_t)mem, size); } /* * Zero fill initializer * * Arguments/Returns follow uma_init specifications */ static int zero_init(void *mem, int size, int flags) { bzero(mem, size); return (0); } /* * Finish creating a small uma keg. This calculates ipers, and the keg size. * * Arguments * keg The zone we should initialize * * Returns * Nothing */ static void keg_small_init(uma_keg_t keg) { u_int rsize; u_int memused; u_int wastedspace; u_int shsize; if (keg->uk_flags & UMA_ZONE_PCPU) { u_int ncpus = (mp_maxid + 1) ? (mp_maxid + 1) : MAXCPU; keg->uk_slabsize = sizeof(struct pcpu); keg->uk_ppera = howmany(ncpus * sizeof(struct pcpu), PAGE_SIZE); } else { keg->uk_slabsize = UMA_SLAB_SIZE; keg->uk_ppera = 1; } /* * Calculate the size of each allocation (rsize) according to * alignment. If the requested size is smaller than we have * allocation bits for we round it up. */ rsize = keg->uk_size; if (rsize < keg->uk_slabsize / SLAB_SETSIZE) rsize = keg->uk_slabsize / SLAB_SETSIZE; if (rsize & keg->uk_align) rsize = (rsize & ~keg->uk_align) + (keg->uk_align + 1); keg->uk_rsize = rsize; KASSERT((keg->uk_flags & UMA_ZONE_PCPU) == 0 || keg->uk_rsize < sizeof(struct pcpu), ("%s: size %u too large", __func__, keg->uk_rsize)); if (keg->uk_flags & UMA_ZONE_OFFPAGE) shsize = 0; else shsize = sizeof(struct uma_slab); keg->uk_ipers = (keg->uk_slabsize - shsize) / rsize; KASSERT(keg->uk_ipers > 0 && keg->uk_ipers <= SLAB_SETSIZE, ("%s: keg->uk_ipers %u", __func__, keg->uk_ipers)); memused = keg->uk_ipers * rsize + shsize; wastedspace = keg->uk_slabsize - memused; /* * We can't do OFFPAGE if we're internal or if we've been * asked to not go to the VM for buckets. If we do this we * may end up going to the VM for slabs which we do not * want to do if we're UMA_ZFLAG_CACHEONLY as a result * of UMA_ZONE_VM, which clearly forbids it. */ if ((keg->uk_flags & UMA_ZFLAG_INTERNAL) || (keg->uk_flags & UMA_ZFLAG_CACHEONLY)) return; /* * See if using an OFFPAGE slab will limit our waste. Only do * this if it permits more items per-slab. * * XXX We could try growing slabsize to limit max waste as well. * Historically this was not done because the VM could not * efficiently handle contiguous allocations. */ if ((wastedspace >= keg->uk_slabsize / UMA_MAX_WASTE) && (keg->uk_ipers < (keg->uk_slabsize / keg->uk_rsize))) { keg->uk_ipers = keg->uk_slabsize / keg->uk_rsize; KASSERT(keg->uk_ipers > 0 && keg->uk_ipers <= SLAB_SETSIZE, ("%s: keg->uk_ipers %u", __func__, keg->uk_ipers)); #ifdef UMA_DEBUG printf("UMA decided we need offpage slab headers for " "keg: %s, calculated wastedspace = %d, " "maximum wasted space allowed = %d, " "calculated ipers = %d, " "new wasted space = %d\n", keg->uk_name, wastedspace, keg->uk_slabsize / UMA_MAX_WASTE, keg->uk_ipers, keg->uk_slabsize - keg->uk_ipers * keg->uk_rsize); #endif keg->uk_flags |= UMA_ZONE_OFFPAGE; } if ((keg->uk_flags & UMA_ZONE_OFFPAGE) && (keg->uk_flags & UMA_ZONE_VTOSLAB) == 0) keg->uk_flags |= UMA_ZONE_HASH; } /* * Finish creating a large (> UMA_SLAB_SIZE) uma kegs. Just give in and do * OFFPAGE for now. When I can allow for more dynamic slab sizes this will be * more complicated. * * Arguments * keg The keg we should initialize * * Returns * Nothing */ static void keg_large_init(uma_keg_t keg) { u_int shsize; KASSERT(keg != NULL, ("Keg is null in keg_large_init")); KASSERT((keg->uk_flags & UMA_ZFLAG_CACHEONLY) == 0, ("keg_large_init: Cannot large-init a UMA_ZFLAG_CACHEONLY keg")); KASSERT((keg->uk_flags & UMA_ZONE_PCPU) == 0, ("%s: Cannot large-init a UMA_ZONE_PCPU keg", __func__)); keg->uk_ppera = howmany(keg->uk_size, PAGE_SIZE); keg->uk_slabsize = keg->uk_ppera * PAGE_SIZE; keg->uk_ipers = 1; keg->uk_rsize = keg->uk_size; /* We can't do OFFPAGE if we're internal, bail out here. */ if (keg->uk_flags & UMA_ZFLAG_INTERNAL) return; /* Check whether we have enough space to not do OFFPAGE. */ if ((keg->uk_flags & UMA_ZONE_OFFPAGE) == 0) { shsize = sizeof(struct uma_slab); if (shsize & UMA_ALIGN_PTR) shsize = (shsize & ~UMA_ALIGN_PTR) + (UMA_ALIGN_PTR + 1); if ((PAGE_SIZE * keg->uk_ppera) - keg->uk_rsize < shsize) keg->uk_flags |= UMA_ZONE_OFFPAGE; } if ((keg->uk_flags & UMA_ZONE_OFFPAGE) && (keg->uk_flags & UMA_ZONE_VTOSLAB) == 0) keg->uk_flags |= UMA_ZONE_HASH; } static void keg_cachespread_init(uma_keg_t keg) { int alignsize; int trailer; int pages; int rsize; KASSERT((keg->uk_flags & UMA_ZONE_PCPU) == 0, ("%s: Cannot cachespread-init a UMA_ZONE_PCPU keg", __func__)); alignsize = keg->uk_align + 1; rsize = keg->uk_size; /* * We want one item to start on every align boundary in a page. To * do this we will span pages. We will also extend the item by the * size of align if it is an even multiple of align. Otherwise, it * would fall on the same boundary every time. */ if (rsize & keg->uk_align) rsize = (rsize & ~keg->uk_align) + alignsize; if ((rsize & alignsize) == 0) rsize += alignsize; trailer = rsize - keg->uk_size; pages = (rsize * (PAGE_SIZE / alignsize)) / PAGE_SIZE; pages = MIN(pages, (128 * 1024) / PAGE_SIZE); keg->uk_rsize = rsize; keg->uk_ppera = pages; keg->uk_slabsize = UMA_SLAB_SIZE; keg->uk_ipers = ((pages * PAGE_SIZE) + trailer) / rsize; keg->uk_flags |= UMA_ZONE_OFFPAGE | UMA_ZONE_VTOSLAB; KASSERT(keg->uk_ipers <= SLAB_SETSIZE, ("%s: keg->uk_ipers too high(%d) increase max_ipers", __func__, keg->uk_ipers)); } /* * Keg header ctor. This initializes all fields, locks, etc. And inserts * the keg onto the global keg list. * * Arguments/Returns follow uma_ctor specifications * udata Actually uma_kctor_args */ static int keg_ctor(void *mem, int size, void *udata, int flags) { struct uma_kctor_args *arg = udata; uma_keg_t keg = mem; uma_zone_t zone; bzero(keg, size); keg->uk_size = arg->size; keg->uk_init = arg->uminit; keg->uk_fini = arg->fini; keg->uk_align = arg->align; keg->uk_free = 0; keg->uk_reserve = 0; keg->uk_pages = 0; keg->uk_flags = arg->flags; keg->uk_allocf = page_alloc; keg->uk_freef = page_free; keg->uk_slabzone = NULL; /* * The master zone is passed to us at keg-creation time. */ zone = arg->zone; keg->uk_name = zone->uz_name; if (arg->flags & UMA_ZONE_VM) keg->uk_flags |= UMA_ZFLAG_CACHEONLY; if (arg->flags & UMA_ZONE_ZINIT) keg->uk_init = zero_init; if (arg->flags & UMA_ZONE_MALLOC) keg->uk_flags |= UMA_ZONE_VTOSLAB; if (arg->flags & UMA_ZONE_PCPU) #ifdef SMP keg->uk_flags |= UMA_ZONE_OFFPAGE; #else keg->uk_flags &= ~UMA_ZONE_PCPU; #endif if (keg->uk_flags & UMA_ZONE_CACHESPREAD) { keg_cachespread_init(keg); } else { if (keg->uk_size > (UMA_SLAB_SIZE - sizeof(struct uma_slab))) keg_large_init(keg); else keg_small_init(keg); } if (keg->uk_flags & UMA_ZONE_OFFPAGE) keg->uk_slabzone = slabzone; /* * If we haven't booted yet we need allocations to go through the * startup cache until the vm is ready. */ if (keg->uk_ppera == 1) { #ifdef UMA_MD_SMALL_ALLOC keg->uk_allocf = uma_small_alloc; keg->uk_freef = uma_small_free; if (booted < UMA_STARTUP) keg->uk_allocf = startup_alloc; #else if (booted < UMA_STARTUP2) keg->uk_allocf = startup_alloc; #endif } else if (booted < UMA_STARTUP2 && (keg->uk_flags & UMA_ZFLAG_INTERNAL)) keg->uk_allocf = startup_alloc; /* * Initialize keg's lock */ KEG_LOCK_INIT(keg, (arg->flags & UMA_ZONE_MTXCLASS)); /* * If we're putting the slab header in the actual page we need to * figure out where in each page it goes. This calculates a right * justified offset into the memory on an ALIGN_PTR boundary. */ if (!(keg->uk_flags & UMA_ZONE_OFFPAGE)) { u_int totsize; /* Size of the slab struct and free list */ totsize = sizeof(struct uma_slab); if (totsize & UMA_ALIGN_PTR) totsize = (totsize & ~UMA_ALIGN_PTR) + (UMA_ALIGN_PTR + 1); keg->uk_pgoff = (PAGE_SIZE * keg->uk_ppera) - totsize; /* * The only way the following is possible is if with our * UMA_ALIGN_PTR adjustments we are now bigger than * UMA_SLAB_SIZE. I haven't checked whether this is * mathematically possible for all cases, so we make * sure here anyway. */ totsize = keg->uk_pgoff + sizeof(struct uma_slab); if (totsize > PAGE_SIZE * keg->uk_ppera) { printf("zone %s ipers %d rsize %d size %d\n", zone->uz_name, keg->uk_ipers, keg->uk_rsize, keg->uk_size); panic("UMA slab won't fit."); } } if (keg->uk_flags & UMA_ZONE_HASH) hash_alloc(&keg->uk_hash); #ifdef UMA_DEBUG printf("UMA: %s(%p) size %d(%d) flags %#x ipers %d ppera %d out %d free %d\n", zone->uz_name, zone, keg->uk_size, keg->uk_rsize, keg->uk_flags, keg->uk_ipers, keg->uk_ppera, (keg->uk_ipers * keg->uk_pages) - keg->uk_free, keg->uk_free); #endif LIST_INSERT_HEAD(&keg->uk_zones, zone, uz_link); rw_wlock(&uma_rwlock); LIST_INSERT_HEAD(&uma_kegs, keg, uk_link); rw_wunlock(&uma_rwlock); return (0); } /* * Zone header ctor. This initializes all fields, locks, etc. * * Arguments/Returns follow uma_ctor specifications * udata Actually uma_zctor_args */ static int zone_ctor(void *mem, int size, void *udata, int flags) { struct uma_zctor_args *arg = udata; uma_zone_t zone = mem; uma_zone_t z; uma_keg_t keg; bzero(zone, size); zone->uz_name = arg->name; zone->uz_ctor = arg->ctor; zone->uz_dtor = arg->dtor; zone->uz_slab = zone_fetch_slab; zone->uz_init = NULL; zone->uz_fini = NULL; zone->uz_allocs = 0; zone->uz_frees = 0; zone->uz_fails = 0; zone->uz_sleeps = 0; zone->uz_count = 0; zone->uz_count_min = 0; zone->uz_flags = 0; zone->uz_warning = NULL; timevalclear(&zone->uz_ratecheck); keg = arg->keg; ZONE_LOCK_INIT(zone, (arg->flags & UMA_ZONE_MTXCLASS)); /* * This is a pure cache zone, no kegs. */ if (arg->import) { if (arg->flags & UMA_ZONE_VM) arg->flags |= UMA_ZFLAG_CACHEONLY; zone->uz_flags = arg->flags; zone->uz_size = arg->size; zone->uz_import = arg->import; zone->uz_release = arg->release; zone->uz_arg = arg->arg; zone->uz_lockptr = &zone->uz_lock; rw_wlock(&uma_rwlock); LIST_INSERT_HEAD(&uma_cachezones, zone, uz_link); rw_wunlock(&uma_rwlock); goto out; } /* * Use the regular zone/keg/slab allocator. */ zone->uz_import = (uma_import)zone_import; zone->uz_release = (uma_release)zone_release; zone->uz_arg = zone; if (arg->flags & UMA_ZONE_SECONDARY) { KASSERT(arg->keg != NULL, ("Secondary zone on zero'd keg")); zone->uz_init = arg->uminit; zone->uz_fini = arg->fini; zone->uz_lockptr = &keg->uk_lock; zone->uz_flags |= UMA_ZONE_SECONDARY; rw_wlock(&uma_rwlock); ZONE_LOCK(zone); LIST_FOREACH(z, &keg->uk_zones, uz_link) { if (LIST_NEXT(z, uz_link) == NULL) { LIST_INSERT_AFTER(z, zone, uz_link); break; } } ZONE_UNLOCK(zone); rw_wunlock(&uma_rwlock); } else if (keg == NULL) { if ((keg = uma_kcreate(zone, arg->size, arg->uminit, arg->fini, arg->align, arg->flags)) == NULL) return (ENOMEM); } else { struct uma_kctor_args karg; int error; /* We should only be here from uma_startup() */ karg.size = arg->size; karg.uminit = arg->uminit; karg.fini = arg->fini; karg.align = arg->align; karg.flags = arg->flags; karg.zone = zone; error = keg_ctor(arg->keg, sizeof(struct uma_keg), &karg, flags); if (error) return (error); } /* * Link in the first keg. */ zone->uz_klink.kl_keg = keg; LIST_INSERT_HEAD(&zone->uz_kegs, &zone->uz_klink, kl_link); zone->uz_lockptr = &keg->uk_lock; zone->uz_size = keg->uk_size; zone->uz_flags |= (keg->uk_flags & (UMA_ZONE_INHERIT | UMA_ZFLAG_INHERIT)); /* * Some internal zones don't have room allocated for the per cpu * caches. If we're internal, bail out here. */ if (keg->uk_flags & UMA_ZFLAG_INTERNAL) { KASSERT((zone->uz_flags & UMA_ZONE_SECONDARY) == 0, ("Secondary zone requested UMA_ZFLAG_INTERNAL")); return (0); } out: if ((arg->flags & UMA_ZONE_MAXBUCKET) == 0) zone->uz_count = bucket_select(zone->uz_size); else zone->uz_count = BUCKET_MAX; zone->uz_count_min = zone->uz_count; return (0); } /* * Keg header dtor. This frees all data, destroys locks, frees the hash * table and removes the keg from the global list. * * Arguments/Returns follow uma_dtor specifications * udata unused */ static void keg_dtor(void *arg, int size, void *udata) { uma_keg_t keg; keg = (uma_keg_t)arg; KEG_LOCK(keg); if (keg->uk_free != 0) { printf("Freed UMA keg (%s) was not empty (%d items). " " Lost %d pages of memory.\n", keg->uk_name ? keg->uk_name : "", keg->uk_free, keg->uk_pages); } KEG_UNLOCK(keg); hash_free(&keg->uk_hash); KEG_LOCK_FINI(keg); } /* * Zone header dtor. * * Arguments/Returns follow uma_dtor specifications * udata unused */ static void zone_dtor(void *arg, int size, void *udata) { uma_klink_t klink; uma_zone_t zone; uma_keg_t keg; zone = (uma_zone_t)arg; keg = zone_first_keg(zone); if (!(zone->uz_flags & UMA_ZFLAG_INTERNAL)) cache_drain(zone); rw_wlock(&uma_rwlock); LIST_REMOVE(zone, uz_link); rw_wunlock(&uma_rwlock); /* * XXX there are some races here where * the zone can be drained but zone lock * released and then refilled before we * remove it... we dont care for now */ zone_drain_wait(zone, M_WAITOK); /* * Unlink all of our kegs. */ while ((klink = LIST_FIRST(&zone->uz_kegs)) != NULL) { klink->kl_keg = NULL; LIST_REMOVE(klink, kl_link); if (klink == &zone->uz_klink) continue; free(klink, M_TEMP); } /* * We only destroy kegs from non secondary zones. */ if (keg != NULL && (zone->uz_flags & UMA_ZONE_SECONDARY) == 0) { rw_wlock(&uma_rwlock); LIST_REMOVE(keg, uk_link); rw_wunlock(&uma_rwlock); zone_free_item(kegs, keg, NULL, SKIP_NONE); } ZONE_LOCK_FINI(zone); } /* * Traverses every zone in the system and calls a callback * * Arguments: * zfunc A pointer to a function which accepts a zone * as an argument. * * Returns: * Nothing */ static void zone_foreach(void (*zfunc)(uma_zone_t)) { uma_keg_t keg; uma_zone_t zone; rw_rlock(&uma_rwlock); LIST_FOREACH(keg, &uma_kegs, uk_link) { LIST_FOREACH(zone, &keg->uk_zones, uz_link) zfunc(zone); } rw_runlock(&uma_rwlock); } /* Public functions */ /* See uma.h */ void uma_startup(void *bootmem, int boot_pages) { struct uma_zctor_args args; uma_slab_t slab; int i; #ifdef UMA_DEBUG printf("Creating uma keg headers zone and keg.\n"); #endif rw_init(&uma_rwlock, "UMA lock"); /* "manually" create the initial zone */ memset(&args, 0, sizeof(args)); args.name = "UMA Kegs"; args.size = sizeof(struct uma_keg); args.ctor = keg_ctor; args.dtor = keg_dtor; args.uminit = zero_init; args.fini = NULL; args.keg = &masterkeg; args.align = 32 - 1; args.flags = UMA_ZFLAG_INTERNAL; /* The initial zone has no Per cpu queues so it's smaller */ zone_ctor(kegs, sizeof(struct uma_zone), &args, M_WAITOK); #ifdef UMA_DEBUG printf("Filling boot free list.\n"); #endif for (i = 0; i < boot_pages; i++) { slab = (uma_slab_t)((uint8_t *)bootmem + (i * UMA_SLAB_SIZE)); slab->us_data = (uint8_t *)slab; slab->us_flags = UMA_SLAB_BOOT; LIST_INSERT_HEAD(&uma_boot_pages, slab, us_link); } mtx_init(&uma_boot_pages_mtx, "UMA boot pages", NULL, MTX_DEF); #ifdef UMA_DEBUG printf("Creating uma zone headers zone and keg.\n"); #endif args.name = "UMA Zones"; args.size = sizeof(struct uma_zone) + (sizeof(struct uma_cache) * (mp_maxid + 1)); args.ctor = zone_ctor; args.dtor = zone_dtor; args.uminit = zero_init; args.fini = NULL; args.keg = NULL; args.align = 32 - 1; args.flags = UMA_ZFLAG_INTERNAL; /* The initial zone has no Per cpu queues so it's smaller */ zone_ctor(zones, sizeof(struct uma_zone), &args, M_WAITOK); #ifdef UMA_DEBUG printf("Creating slab and hash zones.\n"); #endif /* Now make a zone for slab headers */ slabzone = uma_zcreate("UMA Slabs", sizeof(struct uma_slab), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZFLAG_INTERNAL); hashzone = uma_zcreate("UMA Hash", sizeof(struct slabhead *) * UMA_HASH_SIZE_INIT, NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZFLAG_INTERNAL); bucket_init(); booted = UMA_STARTUP; #ifdef UMA_DEBUG printf("UMA startup complete.\n"); #endif } /* see uma.h */ void uma_startup2(void) { booted = UMA_STARTUP2; bucket_enable(); sx_init(&uma_drain_lock, "umadrain"); #ifdef UMA_DEBUG printf("UMA startup2 complete.\n"); #endif } /* * Initialize our callout handle * */ static void uma_startup3(void) { #ifdef UMA_DEBUG printf("Starting callout.\n"); #endif callout_init(&uma_callout, 1); callout_reset(&uma_callout, UMA_TIMEOUT * hz, uma_timeout, NULL); #ifdef UMA_DEBUG printf("UMA startup3 complete.\n"); #endif } static uma_keg_t uma_kcreate(uma_zone_t zone, size_t size, uma_init uminit, uma_fini fini, int align, uint32_t flags) { struct uma_kctor_args args; args.size = size; args.uminit = uminit; args.fini = fini; args.align = (align == UMA_ALIGN_CACHE) ? uma_align_cache : align; args.flags = flags; args.zone = zone; return (zone_alloc_item(kegs, &args, M_WAITOK)); } /* See uma.h */ void uma_set_align(int align) { if (align != UMA_ALIGN_CACHE) uma_align_cache = align; } /* See uma.h */ uma_zone_t uma_zcreate(const char *name, size_t size, uma_ctor ctor, uma_dtor dtor, uma_init uminit, uma_fini fini, int align, uint32_t flags) { struct uma_zctor_args args; uma_zone_t res; bool locked; /* This stuff is essential for the zone ctor */ memset(&args, 0, sizeof(args)); args.name = name; args.size = size; args.ctor = ctor; args.dtor = dtor; args.uminit = uminit; args.fini = fini; #ifdef INVARIANTS /* * If a zone is being created with an empty constructor and * destructor, pass UMA constructor/destructor which checks for * memory use after free. */ if ((!(flags & (UMA_ZONE_ZINIT | UMA_ZONE_NOFREE))) && ctor == NULL && dtor == NULL && uminit == NULL && fini == NULL) { args.ctor = trash_ctor; args.dtor = trash_dtor; args.uminit = trash_init; args.fini = trash_fini; } #endif args.align = align; args.flags = flags; args.keg = NULL; if (booted < UMA_STARTUP2) { locked = false; } else { sx_slock(&uma_drain_lock); locked = true; } res = zone_alloc_item(zones, &args, M_WAITOK); if (locked) sx_sunlock(&uma_drain_lock); return (res); } /* See uma.h */ uma_zone_t uma_zsecond_create(char *name, uma_ctor ctor, uma_dtor dtor, uma_init zinit, uma_fini zfini, uma_zone_t master) { struct uma_zctor_args args; uma_keg_t keg; uma_zone_t res; bool locked; keg = zone_first_keg(master); memset(&args, 0, sizeof(args)); args.name = name; args.size = keg->uk_size; args.ctor = ctor; args.dtor = dtor; args.uminit = zinit; args.fini = zfini; args.align = keg->uk_align; args.flags = keg->uk_flags | UMA_ZONE_SECONDARY; args.keg = keg; if (booted < UMA_STARTUP2) { locked = false; } else { sx_slock(&uma_drain_lock); locked = true; } /* XXX Attaches only one keg of potentially many. */ res = zone_alloc_item(zones, &args, M_WAITOK); if (locked) sx_sunlock(&uma_drain_lock); return (res); } /* See uma.h */ uma_zone_t uma_zcache_create(char *name, int size, uma_ctor ctor, uma_dtor dtor, uma_init zinit, uma_fini zfini, uma_import zimport, uma_release zrelease, void *arg, int flags) { struct uma_zctor_args args; memset(&args, 0, sizeof(args)); args.name = name; args.size = size; args.ctor = ctor; args.dtor = dtor; args.uminit = zinit; args.fini = zfini; args.import = zimport; args.release = zrelease; args.arg = arg; args.align = 0; args.flags = flags; return (zone_alloc_item(zones, &args, M_WAITOK)); } static void zone_lock_pair(uma_zone_t a, uma_zone_t b) { if (a < b) { ZONE_LOCK(a); mtx_lock_flags(b->uz_lockptr, MTX_DUPOK); } else { ZONE_LOCK(b); mtx_lock_flags(a->uz_lockptr, MTX_DUPOK); } } static void zone_unlock_pair(uma_zone_t a, uma_zone_t b) { ZONE_UNLOCK(a); ZONE_UNLOCK(b); } int uma_zsecond_add(uma_zone_t zone, uma_zone_t master) { uma_klink_t klink; uma_klink_t kl; int error; error = 0; klink = malloc(sizeof(*klink), M_TEMP, M_WAITOK | M_ZERO); zone_lock_pair(zone, master); /* * zone must use vtoslab() to resolve objects and must already be * a secondary. */ if ((zone->uz_flags & (UMA_ZONE_VTOSLAB | UMA_ZONE_SECONDARY)) != (UMA_ZONE_VTOSLAB | UMA_ZONE_SECONDARY)) { error = EINVAL; goto out; } /* * The new master must also use vtoslab(). */ if ((zone->uz_flags & UMA_ZONE_VTOSLAB) != UMA_ZONE_VTOSLAB) { error = EINVAL; goto out; } /* * The underlying object must be the same size. rsize * may be different. */ if (master->uz_size != zone->uz_size) { error = E2BIG; goto out; } /* * Put it at the end of the list. */ klink->kl_keg = zone_first_keg(master); LIST_FOREACH(kl, &zone->uz_kegs, kl_link) { if (LIST_NEXT(kl, kl_link) == NULL) { LIST_INSERT_AFTER(kl, klink, kl_link); break; } } klink = NULL; zone->uz_flags |= UMA_ZFLAG_MULTI; zone->uz_slab = zone_fetch_slab_multi; out: zone_unlock_pair(zone, master); if (klink != NULL) free(klink, M_TEMP); return (error); } /* See uma.h */ void uma_zdestroy(uma_zone_t zone) { sx_slock(&uma_drain_lock); zone_free_item(zones, zone, NULL, SKIP_NONE); sx_sunlock(&uma_drain_lock); } /* See uma.h */ void * uma_zalloc_arg(uma_zone_t zone, void *udata, int flags) { void *item; uma_cache_t cache; uma_bucket_t bucket; int lockfail; int cpu; /* Enable entropy collection for RANDOM_ENABLE_UMA kernel option */ random_harvest_fast_uma(&zone, sizeof(zone), 1, RANDOM_UMA); /* This is the fast path allocation */ #ifdef UMA_DEBUG_ALLOC_1 printf("Allocating one item from %s(%p)\n", zone->uz_name, zone); #endif CTR3(KTR_UMA, "uma_zalloc_arg thread %x zone %s flags %d", curthread, zone->uz_name, flags); if (flags & M_WAITOK) { WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL, "uma_zalloc_arg: zone \"%s\"", zone->uz_name); } KASSERT(curthread->td_critnest == 0 || SCHEDULER_STOPPED(), ("uma_zalloc_arg: called with spinlock or critical section held")); #ifdef DEBUG_MEMGUARD if (memguard_cmp_zone(zone)) { item = memguard_alloc(zone->uz_size, flags); if (item != NULL) { if (zone->uz_init != NULL && zone->uz_init(item, zone->uz_size, flags) != 0) return (NULL); if (zone->uz_ctor != NULL && zone->uz_ctor(item, zone->uz_size, udata, flags) != 0) { zone->uz_fini(item, zone->uz_size); return (NULL); } return (item); } /* This is unfortunate but should not be fatal. */ } #endif /* * If possible, allocate from the per-CPU cache. There are two * requirements for safe access to the per-CPU cache: (1) the thread * accessing the cache must not be preempted or yield during access, * and (2) the thread must not migrate CPUs without switching which * cache it accesses. We rely on a critical section to prevent * preemption and migration. We release the critical section in * order to acquire the zone mutex if we are unable to allocate from * the current cache; when we re-acquire the critical section, we * must detect and handle migration if it has occurred. */ critical_enter(); cpu = curcpu; cache = &zone->uz_cpu[cpu]; zalloc_start: bucket = cache->uc_allocbucket; if (bucket != NULL && bucket->ub_cnt > 0) { bucket->ub_cnt--; item = bucket->ub_bucket[bucket->ub_cnt]; #ifdef INVARIANTS bucket->ub_bucket[bucket->ub_cnt] = NULL; #endif KASSERT(item != NULL, ("uma_zalloc: Bucket pointer mangled.")); cache->uc_allocs++; critical_exit(); if (zone->uz_ctor != NULL && zone->uz_ctor(item, zone->uz_size, udata, flags) != 0) { atomic_add_long(&zone->uz_fails, 1); zone_free_item(zone, item, udata, SKIP_DTOR); return (NULL); } #ifdef INVARIANTS uma_dbg_alloc(zone, NULL, item); #endif if (flags & M_ZERO) uma_zero_item(item, zone); return (item); } /* * We have run out of items in our alloc bucket. * See if we can switch with our free bucket. */ bucket = cache->uc_freebucket; if (bucket != NULL && bucket->ub_cnt > 0) { #ifdef UMA_DEBUG_ALLOC printf("uma_zalloc: Swapping empty with alloc.\n"); #endif cache->uc_freebucket = cache->uc_allocbucket; cache->uc_allocbucket = bucket; goto zalloc_start; } /* * Discard any empty allocation bucket while we hold no locks. */ bucket = cache->uc_allocbucket; cache->uc_allocbucket = NULL; critical_exit(); if (bucket != NULL) bucket_free(zone, bucket, udata); /* Short-circuit for zones without buckets and low memory. */ if (zone->uz_count == 0 || bucketdisable) goto zalloc_item; /* * Attempt to retrieve the item from the per-CPU cache has failed, so * we must go back to the zone. This requires the zone lock, so we * must drop the critical section, then re-acquire it when we go back * to the cache. Since the critical section is released, we may be * preempted or migrate. As such, make sure not to maintain any * thread-local state specific to the cache from prior to releasing * the critical section. */ lockfail = 0; if (ZONE_TRYLOCK(zone) == 0) { /* Record contention to size the buckets. */ ZONE_LOCK(zone); lockfail = 1; } critical_enter(); cpu = curcpu; cache = &zone->uz_cpu[cpu]; /* * Since we have locked the zone we may as well send back our stats. */ atomic_add_long(&zone->uz_allocs, cache->uc_allocs); atomic_add_long(&zone->uz_frees, cache->uc_frees); cache->uc_allocs = 0; cache->uc_frees = 0; /* See if we lost the race to fill the cache. */ if (cache->uc_allocbucket != NULL) { ZONE_UNLOCK(zone); goto zalloc_start; } /* * Check the zone's cache of buckets. */ if ((bucket = LIST_FIRST(&zone->uz_buckets)) != NULL) { KASSERT(bucket->ub_cnt != 0, ("uma_zalloc_arg: Returning an empty bucket.")); LIST_REMOVE(bucket, ub_link); cache->uc_allocbucket = bucket; ZONE_UNLOCK(zone); goto zalloc_start; } /* We are no longer associated with this CPU. */ critical_exit(); /* * We bump the uz count when the cache size is insufficient to * handle the working set. */ if (lockfail && zone->uz_count < BUCKET_MAX) zone->uz_count++; ZONE_UNLOCK(zone); /* * Now lets just fill a bucket and put it on the free list. If that * works we'll restart the allocation from the beginning and it * will use the just filled bucket. */ bucket = zone_alloc_bucket(zone, udata, flags); if (bucket != NULL) { ZONE_LOCK(zone); critical_enter(); cpu = curcpu; cache = &zone->uz_cpu[cpu]; /* * See if we lost the race or were migrated. Cache the * initialized bucket to make this less likely or claim * the memory directly. */ if (cache->uc_allocbucket == NULL) cache->uc_allocbucket = bucket; else LIST_INSERT_HEAD(&zone->uz_buckets, bucket, ub_link); ZONE_UNLOCK(zone); goto zalloc_start; } /* * We may not be able to get a bucket so return an actual item. */ #ifdef UMA_DEBUG printf("uma_zalloc_arg: Bucketzone returned NULL\n"); #endif zalloc_item: item = zone_alloc_item(zone, udata, flags); return (item); } static uma_slab_t keg_fetch_slab(uma_keg_t keg, uma_zone_t zone, int flags) { uma_slab_t slab; int reserve; mtx_assert(&keg->uk_lock, MA_OWNED); slab = NULL; reserve = 0; if ((flags & M_USE_RESERVE) == 0) reserve = keg->uk_reserve; for (;;) { /* * Find a slab with some space. Prefer slabs that are partially * used over those that are totally full. This helps to reduce * fragmentation. */ if (keg->uk_free > reserve) { if (!LIST_EMPTY(&keg->uk_part_slab)) { slab = LIST_FIRST(&keg->uk_part_slab); } else { slab = LIST_FIRST(&keg->uk_free_slab); LIST_REMOVE(slab, us_link); LIST_INSERT_HEAD(&keg->uk_part_slab, slab, us_link); } MPASS(slab->us_keg == keg); return (slab); } /* * M_NOVM means don't ask at all! */ if (flags & M_NOVM) break; if (keg->uk_maxpages && keg->uk_pages >= keg->uk_maxpages) { keg->uk_flags |= UMA_ZFLAG_FULL; /* * If this is not a multi-zone, set the FULL bit. * Otherwise slab_multi() takes care of it. */ if ((zone->uz_flags & UMA_ZFLAG_MULTI) == 0) { zone->uz_flags |= UMA_ZFLAG_FULL; zone_log_warning(zone); zone_maxaction(zone); } if (flags & M_NOWAIT) break; zone->uz_sleeps++; msleep(keg, &keg->uk_lock, PVM, "keglimit", 0); continue; } slab = keg_alloc_slab(keg, zone, flags); /* * If we got a slab here it's safe to mark it partially used * and return. We assume that the caller is going to remove * at least one item. */ if (slab) { MPASS(slab->us_keg == keg); LIST_INSERT_HEAD(&keg->uk_part_slab, slab, us_link); return (slab); } /* * We might not have been able to get a slab but another cpu * could have while we were unlocked. Check again before we * fail. */ flags |= M_NOVM; } return (slab); } static uma_slab_t zone_fetch_slab(uma_zone_t zone, uma_keg_t keg, int flags) { uma_slab_t slab; if (keg == NULL) { keg = zone_first_keg(zone); KEG_LOCK(keg); } for (;;) { slab = keg_fetch_slab(keg, zone, flags); if (slab) return (slab); if (flags & (M_NOWAIT | M_NOVM)) break; } KEG_UNLOCK(keg); return (NULL); } /* * uma_zone_fetch_slab_multi: Fetches a slab from one available keg. Returns * with the keg locked. On NULL no lock is held. * * The last pointer is used to seed the search. It is not required. */ static uma_slab_t zone_fetch_slab_multi(uma_zone_t zone, uma_keg_t last, int rflags) { uma_klink_t klink; uma_slab_t slab; uma_keg_t keg; int flags; int empty; int full; /* * Don't wait on the first pass. This will skip limit tests * as well. We don't want to block if we can find a provider * without blocking. */ flags = (rflags & ~M_WAITOK) | M_NOWAIT; /* * Use the last slab allocated as a hint for where to start * the search. */ if (last != NULL) { slab = keg_fetch_slab(last, zone, flags); if (slab) return (slab); KEG_UNLOCK(last); } /* * Loop until we have a slab incase of transient failures * while M_WAITOK is specified. I'm not sure this is 100% * required but we've done it for so long now. */ for (;;) { empty = 0; full = 0; /* * Search the available kegs for slabs. Be careful to hold the * correct lock while calling into the keg layer. */ LIST_FOREACH(klink, &zone->uz_kegs, kl_link) { keg = klink->kl_keg; KEG_LOCK(keg); if ((keg->uk_flags & UMA_ZFLAG_FULL) == 0) { slab = keg_fetch_slab(keg, zone, flags); if (slab) return (slab); } if (keg->uk_flags & UMA_ZFLAG_FULL) full++; else empty++; KEG_UNLOCK(keg); } if (rflags & (M_NOWAIT | M_NOVM)) break; flags = rflags; /* * All kegs are full. XXX We can't atomically check all kegs * and sleep so just sleep for a short period and retry. */ if (full && !empty) { ZONE_LOCK(zone); zone->uz_flags |= UMA_ZFLAG_FULL; zone->uz_sleeps++; zone_log_warning(zone); zone_maxaction(zone); msleep(zone, zone->uz_lockptr, PVM, "zonelimit", hz/100); zone->uz_flags &= ~UMA_ZFLAG_FULL; ZONE_UNLOCK(zone); continue; } } return (NULL); } static void * slab_alloc_item(uma_keg_t keg, uma_slab_t slab) { void *item; uint8_t freei; MPASS(keg == slab->us_keg); mtx_assert(&keg->uk_lock, MA_OWNED); freei = BIT_FFS(SLAB_SETSIZE, &slab->us_free) - 1; BIT_CLR(SLAB_SETSIZE, freei, &slab->us_free); item = slab->us_data + (keg->uk_rsize * freei); slab->us_freecount--; keg->uk_free--; /* Move this slab to the full list */ if (slab->us_freecount == 0) { LIST_REMOVE(slab, us_link); LIST_INSERT_HEAD(&keg->uk_full_slab, slab, us_link); } return (item); } static int zone_import(uma_zone_t zone, void **bucket, int max, int flags) { uma_slab_t slab; uma_keg_t keg; int i; slab = NULL; keg = NULL; /* Try to keep the buckets totally full */ for (i = 0; i < max; ) { if ((slab = zone->uz_slab(zone, keg, flags)) == NULL) break; keg = slab->us_keg; while (slab->us_freecount && i < max) { bucket[i++] = slab_alloc_item(keg, slab); if (keg->uk_free <= keg->uk_reserve) break; } /* Don't grab more than one slab at a time. */ flags &= ~M_WAITOK; flags |= M_NOWAIT; } if (slab != NULL) KEG_UNLOCK(keg); return i; } static uma_bucket_t zone_alloc_bucket(uma_zone_t zone, void *udata, int flags) { uma_bucket_t bucket; int max; /* Don't wait for buckets, preserve caller's NOVM setting. */ bucket = bucket_alloc(zone, udata, M_NOWAIT | (flags & M_NOVM)); if (bucket == NULL) return (NULL); max = MIN(bucket->ub_entries, zone->uz_count); bucket->ub_cnt = zone->uz_import(zone->uz_arg, bucket->ub_bucket, max, flags); /* * Initialize the memory if necessary. */ if (bucket->ub_cnt != 0 && zone->uz_init != NULL) { int i; for (i = 0; i < bucket->ub_cnt; i++) if (zone->uz_init(bucket->ub_bucket[i], zone->uz_size, flags) != 0) break; /* * If we couldn't initialize the whole bucket, put the * rest back onto the freelist. */ if (i != bucket->ub_cnt) { zone->uz_release(zone->uz_arg, &bucket->ub_bucket[i], bucket->ub_cnt - i); #ifdef INVARIANTS bzero(&bucket->ub_bucket[i], sizeof(void *) * (bucket->ub_cnt - i)); #endif bucket->ub_cnt = i; } } if (bucket->ub_cnt == 0) { bucket_free(zone, bucket, udata); atomic_add_long(&zone->uz_fails, 1); return (NULL); } return (bucket); } /* * Allocates a single item from a zone. * * Arguments * zone The zone to alloc for. * udata The data to be passed to the constructor. * flags M_WAITOK, M_NOWAIT, M_ZERO. * * Returns * NULL if there is no memory and M_NOWAIT is set * An item if successful */ static void * zone_alloc_item(uma_zone_t zone, void *udata, int flags) { void *item; item = NULL; #ifdef UMA_DEBUG_ALLOC printf("INTERNAL: Allocating one item from %s(%p)\n", zone->uz_name, zone); #endif if (zone->uz_import(zone->uz_arg, &item, 1, flags) != 1) goto fail; atomic_add_long(&zone->uz_allocs, 1); /* * We have to call both the zone's init (not the keg's init) * and the zone's ctor. This is because the item is going from * a keg slab directly to the user, and the user is expecting it * to be both zone-init'd as well as zone-ctor'd. */ if (zone->uz_init != NULL) { if (zone->uz_init(item, zone->uz_size, flags) != 0) { zone_free_item(zone, item, udata, SKIP_FINI); goto fail; } } if (zone->uz_ctor != NULL) { if (zone->uz_ctor(item, zone->uz_size, udata, flags) != 0) { zone_free_item(zone, item, udata, SKIP_DTOR); goto fail; } } #ifdef INVARIANTS uma_dbg_alloc(zone, NULL, item); #endif if (flags & M_ZERO) uma_zero_item(item, zone); return (item); fail: atomic_add_long(&zone->uz_fails, 1); return (NULL); } /* See uma.h */ void uma_zfree_arg(uma_zone_t zone, void *item, void *udata) { uma_cache_t cache; uma_bucket_t bucket; int lockfail; int cpu; /* Enable entropy collection for RANDOM_ENABLE_UMA kernel option */ random_harvest_fast_uma(&zone, sizeof(zone), 1, RANDOM_UMA); #ifdef UMA_DEBUG_ALLOC_1 printf("Freeing item %p to %s(%p)\n", item, zone->uz_name, zone); #endif CTR2(KTR_UMA, "uma_zfree_arg thread %x zone %s", curthread, zone->uz_name); KASSERT(curthread->td_critnest == 0 || SCHEDULER_STOPPED(), ("uma_zfree_arg: called with spinlock or critical section held")); /* uma_zfree(..., NULL) does nothing, to match free(9). */ if (item == NULL) return; #ifdef DEBUG_MEMGUARD if (is_memguard_addr(item)) { if (zone->uz_dtor != NULL) zone->uz_dtor(item, zone->uz_size, udata); if (zone->uz_fini != NULL) zone->uz_fini(item, zone->uz_size); memguard_free(item); return; } #endif #ifdef INVARIANTS if (zone->uz_flags & UMA_ZONE_MALLOC) uma_dbg_free(zone, udata, item); else uma_dbg_free(zone, NULL, item); #endif if (zone->uz_dtor != NULL) zone->uz_dtor(item, zone->uz_size, udata); /* * The race here is acceptable. If we miss it we'll just have to wait * a little longer for the limits to be reset. */ if (zone->uz_flags & UMA_ZFLAG_FULL) goto zfree_item; /* * If possible, free to the per-CPU cache. There are two * requirements for safe access to the per-CPU cache: (1) the thread * accessing the cache must not be preempted or yield during access, * and (2) the thread must not migrate CPUs without switching which * cache it accesses. We rely on a critical section to prevent * preemption and migration. We release the critical section in * order to acquire the zone mutex if we are unable to free to the * current cache; when we re-acquire the critical section, we must * detect and handle migration if it has occurred. */ zfree_restart: critical_enter(); cpu = curcpu; cache = &zone->uz_cpu[cpu]; zfree_start: /* * Try to free into the allocbucket first to give LIFO ordering * for cache-hot datastructures. Spill over into the freebucket * if necessary. Alloc will swap them if one runs dry. */ bucket = cache->uc_allocbucket; if (bucket == NULL || bucket->ub_cnt >= bucket->ub_entries) bucket = cache->uc_freebucket; if (bucket != NULL && bucket->ub_cnt < bucket->ub_entries) { KASSERT(bucket->ub_bucket[bucket->ub_cnt] == NULL, ("uma_zfree: Freeing to non free bucket index.")); bucket->ub_bucket[bucket->ub_cnt] = item; bucket->ub_cnt++; cache->uc_frees++; critical_exit(); return; } /* * We must go back the zone, which requires acquiring the zone lock, * which in turn means we must release and re-acquire the critical * section. Since the critical section is released, we may be * preempted or migrate. As such, make sure not to maintain any * thread-local state specific to the cache from prior to releasing * the critical section. */ critical_exit(); if (zone->uz_count == 0 || bucketdisable) goto zfree_item; lockfail = 0; if (ZONE_TRYLOCK(zone) == 0) { /* Record contention to size the buckets. */ ZONE_LOCK(zone); lockfail = 1; } critical_enter(); cpu = curcpu; cache = &zone->uz_cpu[cpu]; /* * Since we have locked the zone we may as well send back our stats. */ atomic_add_long(&zone->uz_allocs, cache->uc_allocs); atomic_add_long(&zone->uz_frees, cache->uc_frees); cache->uc_allocs = 0; cache->uc_frees = 0; bucket = cache->uc_freebucket; if (bucket != NULL && bucket->ub_cnt < bucket->ub_entries) { ZONE_UNLOCK(zone); goto zfree_start; } cache->uc_freebucket = NULL; + /* We are no longer associated with this CPU. */ + critical_exit(); /* Can we throw this on the zone full list? */ if (bucket != NULL) { #ifdef UMA_DEBUG_ALLOC printf("uma_zfree: Putting old bucket on the free list.\n"); #endif /* ub_cnt is pointing to the last free item */ KASSERT(bucket->ub_cnt != 0, ("uma_zfree: Attempting to insert an empty bucket onto the full list.\n")); LIST_INSERT_HEAD(&zone->uz_buckets, bucket, ub_link); } - - /* We are no longer associated with this CPU. */ - critical_exit(); /* * We bump the uz count when the cache size is insufficient to * handle the working set. */ if (lockfail && zone->uz_count < BUCKET_MAX) zone->uz_count++; ZONE_UNLOCK(zone); #ifdef UMA_DEBUG_ALLOC printf("uma_zfree: Allocating new free bucket.\n"); #endif bucket = bucket_alloc(zone, udata, M_NOWAIT); if (bucket) { critical_enter(); cpu = curcpu; cache = &zone->uz_cpu[cpu]; if (cache->uc_freebucket == NULL) { cache->uc_freebucket = bucket; goto zfree_start; } /* * We lost the race, start over. We have to drop our * critical section to free the bucket. */ critical_exit(); bucket_free(zone, bucket, udata); goto zfree_restart; } /* * If nothing else caught this, we'll just do an internal free. */ zfree_item: zone_free_item(zone, item, udata, SKIP_DTOR); return; } static void slab_free_item(uma_keg_t keg, uma_slab_t slab, void *item) { uint8_t freei; mtx_assert(&keg->uk_lock, MA_OWNED); MPASS(keg == slab->us_keg); /* Do we need to remove from any lists? */ if (slab->us_freecount+1 == keg->uk_ipers) { LIST_REMOVE(slab, us_link); LIST_INSERT_HEAD(&keg->uk_free_slab, slab, us_link); } else if (slab->us_freecount == 0) { LIST_REMOVE(slab, us_link); LIST_INSERT_HEAD(&keg->uk_part_slab, slab, us_link); } /* Slab management. */ freei = ((uintptr_t)item - (uintptr_t)slab->us_data) / keg->uk_rsize; BIT_SET(SLAB_SETSIZE, freei, &slab->us_free); slab->us_freecount++; /* Keg statistics. */ keg->uk_free++; } static void zone_release(uma_zone_t zone, void **bucket, int cnt) { void *item; uma_slab_t slab; uma_keg_t keg; uint8_t *mem; int clearfull; int i; clearfull = 0; keg = zone_first_keg(zone); KEG_LOCK(keg); for (i = 0; i < cnt; i++) { item = bucket[i]; if (!(zone->uz_flags & UMA_ZONE_VTOSLAB)) { mem = (uint8_t *)((uintptr_t)item & (~UMA_SLAB_MASK)); if (zone->uz_flags & UMA_ZONE_HASH) { slab = hash_sfind(&keg->uk_hash, mem); } else { mem += keg->uk_pgoff; slab = (uma_slab_t)mem; } } else { slab = vtoslab((vm_offset_t)item); if (slab->us_keg != keg) { KEG_UNLOCK(keg); keg = slab->us_keg; KEG_LOCK(keg); } } slab_free_item(keg, slab, item); if (keg->uk_flags & UMA_ZFLAG_FULL) { if (keg->uk_pages < keg->uk_maxpages) { keg->uk_flags &= ~UMA_ZFLAG_FULL; clearfull = 1; } /* * We can handle one more allocation. Since we're * clearing ZFLAG_FULL, wake up all procs blocked * on pages. This should be uncommon, so keeping this * simple for now (rather than adding count of blocked * threads etc). */ wakeup(keg); } } KEG_UNLOCK(keg); if (clearfull) { ZONE_LOCK(zone); zone->uz_flags &= ~UMA_ZFLAG_FULL; wakeup(zone); ZONE_UNLOCK(zone); } } /* * Frees a single item to any zone. * * Arguments: * zone The zone to free to * item The item we're freeing * udata User supplied data for the dtor * skip Skip dtors and finis */ static void zone_free_item(uma_zone_t zone, void *item, void *udata, enum zfreeskip skip) { #ifdef INVARIANTS if (skip == SKIP_NONE) { if (zone->uz_flags & UMA_ZONE_MALLOC) uma_dbg_free(zone, udata, item); else uma_dbg_free(zone, NULL, item); } #endif if (skip < SKIP_DTOR && zone->uz_dtor) zone->uz_dtor(item, zone->uz_size, udata); if (skip < SKIP_FINI && zone->uz_fini) zone->uz_fini(item, zone->uz_size); atomic_add_long(&zone->uz_frees, 1); zone->uz_release(zone->uz_arg, &item, 1); } /* See uma.h */ int uma_zone_set_max(uma_zone_t zone, int nitems) { uma_keg_t keg; keg = zone_first_keg(zone); if (keg == NULL) return (0); KEG_LOCK(keg); keg->uk_maxpages = (nitems / keg->uk_ipers) * keg->uk_ppera; if (keg->uk_maxpages * keg->uk_ipers < nitems) keg->uk_maxpages += keg->uk_ppera; nitems = keg->uk_maxpages * keg->uk_ipers; KEG_UNLOCK(keg); return (nitems); } /* See uma.h */ int uma_zone_get_max(uma_zone_t zone) { int nitems; uma_keg_t keg; keg = zone_first_keg(zone); if (keg == NULL) return (0); KEG_LOCK(keg); nitems = keg->uk_maxpages * keg->uk_ipers; KEG_UNLOCK(keg); return (nitems); } /* See uma.h */ void uma_zone_set_warning(uma_zone_t zone, const char *warning) { ZONE_LOCK(zone); zone->uz_warning = warning; ZONE_UNLOCK(zone); } /* See uma.h */ void uma_zone_set_maxaction(uma_zone_t zone, uma_maxaction_t maxaction) { ZONE_LOCK(zone); TASK_INIT(&zone->uz_maxaction, 0, (task_fn_t *)maxaction, zone); ZONE_UNLOCK(zone); } /* See uma.h */ int uma_zone_get_cur(uma_zone_t zone) { int64_t nitems; u_int i; ZONE_LOCK(zone); nitems = zone->uz_allocs - zone->uz_frees; CPU_FOREACH(i) { /* * See the comment in sysctl_vm_zone_stats() regarding the * safety of accessing the per-cpu caches. With the zone lock * held, it is safe, but can potentially result in stale data. */ nitems += zone->uz_cpu[i].uc_allocs - zone->uz_cpu[i].uc_frees; } ZONE_UNLOCK(zone); return (nitems < 0 ? 0 : nitems); } /* See uma.h */ void uma_zone_set_init(uma_zone_t zone, uma_init uminit) { uma_keg_t keg; keg = zone_first_keg(zone); KASSERT(keg != NULL, ("uma_zone_set_init: Invalid zone type")); KEG_LOCK(keg); KASSERT(keg->uk_pages == 0, ("uma_zone_set_init on non-empty keg")); keg->uk_init = uminit; KEG_UNLOCK(keg); } /* See uma.h */ void uma_zone_set_fini(uma_zone_t zone, uma_fini fini) { uma_keg_t keg; keg = zone_first_keg(zone); KASSERT(keg != NULL, ("uma_zone_set_fini: Invalid zone type")); KEG_LOCK(keg); KASSERT(keg->uk_pages == 0, ("uma_zone_set_fini on non-empty keg")); keg->uk_fini = fini; KEG_UNLOCK(keg); } /* See uma.h */ void uma_zone_set_zinit(uma_zone_t zone, uma_init zinit) { ZONE_LOCK(zone); KASSERT(zone_first_keg(zone)->uk_pages == 0, ("uma_zone_set_zinit on non-empty keg")); zone->uz_init = zinit; ZONE_UNLOCK(zone); } /* See uma.h */ void uma_zone_set_zfini(uma_zone_t zone, uma_fini zfini) { ZONE_LOCK(zone); KASSERT(zone_first_keg(zone)->uk_pages == 0, ("uma_zone_set_zfini on non-empty keg")); zone->uz_fini = zfini; ZONE_UNLOCK(zone); } /* See uma.h */ /* XXX uk_freef is not actually used with the zone locked */ void uma_zone_set_freef(uma_zone_t zone, uma_free freef) { uma_keg_t keg; keg = zone_first_keg(zone); KASSERT(keg != NULL, ("uma_zone_set_freef: Invalid zone type")); KEG_LOCK(keg); keg->uk_freef = freef; KEG_UNLOCK(keg); } /* See uma.h */ /* XXX uk_allocf is not actually used with the zone locked */ void uma_zone_set_allocf(uma_zone_t zone, uma_alloc allocf) { uma_keg_t keg; keg = zone_first_keg(zone); KEG_LOCK(keg); keg->uk_allocf = allocf; KEG_UNLOCK(keg); } /* See uma.h */ void uma_zone_reserve(uma_zone_t zone, int items) { uma_keg_t keg; keg = zone_first_keg(zone); if (keg == NULL) return; KEG_LOCK(keg); keg->uk_reserve = items; KEG_UNLOCK(keg); return; } /* See uma.h */ int uma_zone_reserve_kva(uma_zone_t zone, int count) { uma_keg_t keg; vm_offset_t kva; u_int pages; keg = zone_first_keg(zone); if (keg == NULL) return (0); pages = count / keg->uk_ipers; if (pages * keg->uk_ipers < count) pages++; #ifdef UMA_MD_SMALL_ALLOC if (keg->uk_ppera > 1) { #else if (1) { #endif kva = kva_alloc((vm_size_t)pages * UMA_SLAB_SIZE); if (kva == 0) return (0); } else kva = 0; KEG_LOCK(keg); keg->uk_kva = kva; keg->uk_offset = 0; keg->uk_maxpages = pages; #ifdef UMA_MD_SMALL_ALLOC keg->uk_allocf = (keg->uk_ppera > 1) ? noobj_alloc : uma_small_alloc; #else keg->uk_allocf = noobj_alloc; #endif keg->uk_flags |= UMA_ZONE_NOFREE; KEG_UNLOCK(keg); return (1); } /* See uma.h */ void uma_prealloc(uma_zone_t zone, int items) { int slabs; uma_slab_t slab; uma_keg_t keg; keg = zone_first_keg(zone); if (keg == NULL) return; KEG_LOCK(keg); slabs = items / keg->uk_ipers; if (slabs * keg->uk_ipers < items) slabs++; while (slabs > 0) { slab = keg_alloc_slab(keg, zone, M_WAITOK); if (slab == NULL) break; MPASS(slab->us_keg == keg); LIST_INSERT_HEAD(&keg->uk_free_slab, slab, us_link); slabs--; } KEG_UNLOCK(keg); } /* See uma.h */ static void uma_reclaim_locked(bool kmem_danger) { #ifdef UMA_DEBUG printf("UMA: vm asked us to release pages!\n"); #endif sx_assert(&uma_drain_lock, SA_XLOCKED); bucket_enable(); zone_foreach(zone_drain); if (vm_page_count_min() || kmem_danger) { cache_drain_safe(NULL); zone_foreach(zone_drain); } /* * Some slabs may have been freed but this zone will be visited early * we visit again so that we can free pages that are empty once other * zones are drained. We have to do the same for buckets. */ zone_drain(slabzone); bucket_zone_drain(); } void uma_reclaim(void) { sx_xlock(&uma_drain_lock); uma_reclaim_locked(false); sx_xunlock(&uma_drain_lock); } static int uma_reclaim_needed; void uma_reclaim_wakeup(void) { uma_reclaim_needed = 1; wakeup(&uma_reclaim_needed); } void uma_reclaim_worker(void *arg __unused) { sx_xlock(&uma_drain_lock); for (;;) { sx_sleep(&uma_reclaim_needed, &uma_drain_lock, PVM, "umarcl", 0); if (uma_reclaim_needed) { uma_reclaim_needed = 0; uma_reclaim_locked(true); } } } /* See uma.h */ int uma_zone_exhausted(uma_zone_t zone) { int full; ZONE_LOCK(zone); full = (zone->uz_flags & UMA_ZFLAG_FULL); ZONE_UNLOCK(zone); return (full); } int uma_zone_exhausted_nolock(uma_zone_t zone) { return (zone->uz_flags & UMA_ZFLAG_FULL); } void * uma_large_malloc(vm_size_t size, int wait) { void *mem; uma_slab_t slab; uint8_t flags; slab = zone_alloc_item(slabzone, NULL, wait); if (slab == NULL) return (NULL); mem = page_alloc(NULL, size, &flags, wait); if (mem) { vsetslab((vm_offset_t)mem, slab); slab->us_data = mem; slab->us_flags = flags | UMA_SLAB_MALLOC; slab->us_size = size; } else { zone_free_item(slabzone, slab, NULL, SKIP_NONE); } return (mem); } void uma_large_free(uma_slab_t slab) { page_free(slab->us_data, slab->us_size, slab->us_flags); zone_free_item(slabzone, slab, NULL, SKIP_NONE); } static void uma_zero_item(void *item, uma_zone_t zone) { int i; if (zone->uz_flags & UMA_ZONE_PCPU) { CPU_FOREACH(i) bzero(zpcpu_get_cpu(item, i), zone->uz_size); } else bzero(item, zone->uz_size); } void uma_print_stats(void) { zone_foreach(uma_print_zone); } static void slab_print(uma_slab_t slab) { printf("slab: keg %p, data %p, freecount %d\n", slab->us_keg, slab->us_data, slab->us_freecount); } static void cache_print(uma_cache_t cache) { printf("alloc: %p(%d), free: %p(%d)\n", cache->uc_allocbucket, cache->uc_allocbucket?cache->uc_allocbucket->ub_cnt:0, cache->uc_freebucket, cache->uc_freebucket?cache->uc_freebucket->ub_cnt:0); } static void uma_print_keg(uma_keg_t keg) { uma_slab_t slab; printf("keg: %s(%p) size %d(%d) flags %#x ipers %d ppera %d " "out %d free %d limit %d\n", keg->uk_name, keg, keg->uk_size, keg->uk_rsize, keg->uk_flags, keg->uk_ipers, keg->uk_ppera, (keg->uk_ipers * keg->uk_pages) - keg->uk_free, keg->uk_free, (keg->uk_maxpages / keg->uk_ppera) * keg->uk_ipers); printf("Part slabs:\n"); LIST_FOREACH(slab, &keg->uk_part_slab, us_link) slab_print(slab); printf("Free slabs:\n"); LIST_FOREACH(slab, &keg->uk_free_slab, us_link) slab_print(slab); printf("Full slabs:\n"); LIST_FOREACH(slab, &keg->uk_full_slab, us_link) slab_print(slab); } void uma_print_zone(uma_zone_t zone) { uma_cache_t cache; uma_klink_t kl; int i; printf("zone: %s(%p) size %d flags %#x\n", zone->uz_name, zone, zone->uz_size, zone->uz_flags); LIST_FOREACH(kl, &zone->uz_kegs, kl_link) uma_print_keg(kl->kl_keg); CPU_FOREACH(i) { cache = &zone->uz_cpu[i]; printf("CPU %d Cache:\n", i); cache_print(cache); } } #ifdef DDB /* * Generate statistics across both the zone and its per-cpu cache's. Return * desired statistics if the pointer is non-NULL for that statistic. * * Note: does not update the zone statistics, as it can't safely clear the * per-CPU cache statistic. * * XXXRW: Following the uc_allocbucket and uc_freebucket pointers here isn't * safe from off-CPU; we should modify the caches to track this information * directly so that we don't have to. */ static void uma_zone_sumstat(uma_zone_t z, int *cachefreep, uint64_t *allocsp, uint64_t *freesp, uint64_t *sleepsp) { uma_cache_t cache; uint64_t allocs, frees, sleeps; int cachefree, cpu; allocs = frees = sleeps = 0; cachefree = 0; CPU_FOREACH(cpu) { cache = &z->uz_cpu[cpu]; if (cache->uc_allocbucket != NULL) cachefree += cache->uc_allocbucket->ub_cnt; if (cache->uc_freebucket != NULL) cachefree += cache->uc_freebucket->ub_cnt; allocs += cache->uc_allocs; frees += cache->uc_frees; } allocs += z->uz_allocs; frees += z->uz_frees; sleeps += z->uz_sleeps; if (cachefreep != NULL) *cachefreep = cachefree; if (allocsp != NULL) *allocsp = allocs; if (freesp != NULL) *freesp = frees; if (sleepsp != NULL) *sleepsp = sleeps; } #endif /* DDB */ static int sysctl_vm_zone_count(SYSCTL_HANDLER_ARGS) { uma_keg_t kz; uma_zone_t z; int count; count = 0; rw_rlock(&uma_rwlock); LIST_FOREACH(kz, &uma_kegs, uk_link) { LIST_FOREACH(z, &kz->uk_zones, uz_link) count++; } rw_runlock(&uma_rwlock); return (sysctl_handle_int(oidp, &count, 0, req)); } static int sysctl_vm_zone_stats(SYSCTL_HANDLER_ARGS) { struct uma_stream_header ush; struct uma_type_header uth; struct uma_percpu_stat ups; uma_bucket_t bucket; struct sbuf sbuf; uma_cache_t cache; uma_klink_t kl; uma_keg_t kz; uma_zone_t z; uma_keg_t k; int count, error, i; error = sysctl_wire_old_buffer(req, 0); if (error != 0) return (error); sbuf_new_for_sysctl(&sbuf, NULL, 128, req); sbuf_clear_flags(&sbuf, SBUF_INCLUDENUL); count = 0; rw_rlock(&uma_rwlock); LIST_FOREACH(kz, &uma_kegs, uk_link) { LIST_FOREACH(z, &kz->uk_zones, uz_link) count++; } /* * Insert stream header. */ bzero(&ush, sizeof(ush)); ush.ush_version = UMA_STREAM_VERSION; ush.ush_maxcpus = (mp_maxid + 1); ush.ush_count = count; (void)sbuf_bcat(&sbuf, &ush, sizeof(ush)); LIST_FOREACH(kz, &uma_kegs, uk_link) { LIST_FOREACH(z, &kz->uk_zones, uz_link) { bzero(&uth, sizeof(uth)); ZONE_LOCK(z); strlcpy(uth.uth_name, z->uz_name, UTH_MAX_NAME); uth.uth_align = kz->uk_align; uth.uth_size = kz->uk_size; uth.uth_rsize = kz->uk_rsize; LIST_FOREACH(kl, &z->uz_kegs, kl_link) { k = kl->kl_keg; uth.uth_maxpages += k->uk_maxpages; uth.uth_pages += k->uk_pages; uth.uth_keg_free += k->uk_free; uth.uth_limit = (k->uk_maxpages / k->uk_ppera) * k->uk_ipers; } /* * A zone is secondary is it is not the first entry * on the keg's zone list. */ if ((z->uz_flags & UMA_ZONE_SECONDARY) && (LIST_FIRST(&kz->uk_zones) != z)) uth.uth_zone_flags = UTH_ZONE_SECONDARY; LIST_FOREACH(bucket, &z->uz_buckets, ub_link) uth.uth_zone_free += bucket->ub_cnt; uth.uth_allocs = z->uz_allocs; uth.uth_frees = z->uz_frees; uth.uth_fails = z->uz_fails; uth.uth_sleeps = z->uz_sleeps; (void)sbuf_bcat(&sbuf, &uth, sizeof(uth)); /* * While it is not normally safe to access the cache * bucket pointers while not on the CPU that owns the * cache, we only allow the pointers to be exchanged * without the zone lock held, not invalidated, so * accept the possible race associated with bucket * exchange during monitoring. */ for (i = 0; i < (mp_maxid + 1); i++) { bzero(&ups, sizeof(ups)); if (kz->uk_flags & UMA_ZFLAG_INTERNAL) goto skip; if (CPU_ABSENT(i)) goto skip; cache = &z->uz_cpu[i]; if (cache->uc_allocbucket != NULL) ups.ups_cache_free += cache->uc_allocbucket->ub_cnt; if (cache->uc_freebucket != NULL) ups.ups_cache_free += cache->uc_freebucket->ub_cnt; ups.ups_allocs = cache->uc_allocs; ups.ups_frees = cache->uc_frees; skip: (void)sbuf_bcat(&sbuf, &ups, sizeof(ups)); } ZONE_UNLOCK(z); } } rw_runlock(&uma_rwlock); error = sbuf_finish(&sbuf); sbuf_delete(&sbuf); return (error); } int sysctl_handle_uma_zone_max(SYSCTL_HANDLER_ARGS) { uma_zone_t zone = *(uma_zone_t *)arg1; int error, max; max = uma_zone_get_max(zone); error = sysctl_handle_int(oidp, &max, 0, req); if (error || !req->newptr) return (error); uma_zone_set_max(zone, max); return (0); } int sysctl_handle_uma_zone_cur(SYSCTL_HANDLER_ARGS) { uma_zone_t zone = *(uma_zone_t *)arg1; int cur; cur = uma_zone_get_cur(zone); return (sysctl_handle_int(oidp, &cur, 0, req)); } #ifdef INVARIANTS static uma_slab_t uma_dbg_getslab(uma_zone_t zone, void *item) { uma_slab_t slab; uma_keg_t keg; uint8_t *mem; mem = (uint8_t *)((uintptr_t)item & (~UMA_SLAB_MASK)); if (zone->uz_flags & UMA_ZONE_VTOSLAB) { slab = vtoslab((vm_offset_t)mem); } else { /* * It is safe to return the slab here even though the * zone is unlocked because the item's allocation state * essentially holds a reference. */ ZONE_LOCK(zone); keg = LIST_FIRST(&zone->uz_kegs)->kl_keg; if (keg->uk_flags & UMA_ZONE_HASH) slab = hash_sfind(&keg->uk_hash, mem); else slab = (uma_slab_t)(mem + keg->uk_pgoff); ZONE_UNLOCK(zone); } return (slab); } /* * Set up the slab's freei data such that uma_dbg_free can function. * */ static void uma_dbg_alloc(uma_zone_t zone, uma_slab_t slab, void *item) { uma_keg_t keg; int freei; if (zone_first_keg(zone) == NULL) return; if (slab == NULL) { slab = uma_dbg_getslab(zone, item); if (slab == NULL) panic("uma: item %p did not belong to zone %s\n", item, zone->uz_name); } keg = slab->us_keg; freei = ((uintptr_t)item - (uintptr_t)slab->us_data) / keg->uk_rsize; if (BIT_ISSET(SLAB_SETSIZE, freei, &slab->us_debugfree)) panic("Duplicate alloc of %p from zone %p(%s) slab %p(%d)\n", item, zone, zone->uz_name, slab, freei); BIT_SET_ATOMIC(SLAB_SETSIZE, freei, &slab->us_debugfree); return; } /* * Verifies freed addresses. Checks for alignment, valid slab membership * and duplicate frees. * */ static void uma_dbg_free(uma_zone_t zone, uma_slab_t slab, void *item) { uma_keg_t keg; int freei; if (zone_first_keg(zone) == NULL) return; if (slab == NULL) { slab = uma_dbg_getslab(zone, item); if (slab == NULL) panic("uma: Freed item %p did not belong to zone %s\n", item, zone->uz_name); } keg = slab->us_keg; freei = ((uintptr_t)item - (uintptr_t)slab->us_data) / keg->uk_rsize; if (freei >= keg->uk_ipers) panic("Invalid free of %p from zone %p(%s) slab %p(%d)\n", item, zone, zone->uz_name, slab, freei); if (((freei * keg->uk_rsize) + slab->us_data) != item) panic("Unaligned free of %p from zone %p(%s) slab %p(%d)\n", item, zone, zone->uz_name, slab, freei); if (!BIT_ISSET(SLAB_SETSIZE, freei, &slab->us_debugfree)) panic("Duplicate free of %p from zone %p(%s) slab %p(%d)\n", item, zone, zone->uz_name, slab, freei); BIT_CLR_ATOMIC(SLAB_SETSIZE, freei, &slab->us_debugfree); } #endif /* INVARIANTS */ #ifdef DDB DB_SHOW_COMMAND(uma, db_show_uma) { uint64_t allocs, frees, sleeps; uma_bucket_t bucket; uma_keg_t kz; uma_zone_t z; int cachefree; db_printf("%18s %8s %8s %8s %12s %8s %8s\n", "Zone", "Size", "Used", "Free", "Requests", "Sleeps", "Bucket"); LIST_FOREACH(kz, &uma_kegs, uk_link) { LIST_FOREACH(z, &kz->uk_zones, uz_link) { if (kz->uk_flags & UMA_ZFLAG_INTERNAL) { allocs = z->uz_allocs; frees = z->uz_frees; sleeps = z->uz_sleeps; cachefree = 0; } else uma_zone_sumstat(z, &cachefree, &allocs, &frees, &sleeps); if (!((z->uz_flags & UMA_ZONE_SECONDARY) && (LIST_FIRST(&kz->uk_zones) != z))) cachefree += kz->uk_free; LIST_FOREACH(bucket, &z->uz_buckets, ub_link) cachefree += bucket->ub_cnt; db_printf("%18s %8ju %8jd %8d %12ju %8ju %8u\n", z->uz_name, (uintmax_t)kz->uk_size, (intmax_t)(allocs - frees), cachefree, (uintmax_t)allocs, sleeps, z->uz_count); if (db_pager_quit) return; } } } DB_SHOW_COMMAND(umacache, db_show_umacache) { uint64_t allocs, frees; uma_bucket_t bucket; uma_zone_t z; int cachefree; db_printf("%18s %8s %8s %8s %12s %8s\n", "Zone", "Size", "Used", "Free", "Requests", "Bucket"); LIST_FOREACH(z, &uma_cachezones, uz_link) { uma_zone_sumstat(z, &cachefree, &allocs, &frees, NULL); LIST_FOREACH(bucket, &z->uz_buckets, ub_link) cachefree += bucket->ub_cnt; db_printf("%18s %8ju %8jd %8d %12ju %8u\n", z->uz_name, (uintmax_t)z->uz_size, (intmax_t)(allocs - frees), cachefree, (uintmax_t)allocs, z->uz_count); if (db_pager_quit) return; } } #endif /* DDB */ Index: user/alc/PQ_LAUNDRY/sys/vm/vm_fault.c =================================================================== --- user/alc/PQ_LAUNDRY/sys/vm/vm_fault.c (revision 303205) +++ user/alc/PQ_LAUNDRY/sys/vm/vm_fault.c (revision 303206) @@ -1,1494 +1,1503 @@ /*- * Copyright (c) 1991, 1993 * The Regents of the University of California. All rights reserved. * Copyright (c) 1994 John S. Dyson * All rights reserved. * Copyright (c) 1994 David Greenman * All rights reserved. * * * This code is derived from software contributed to Berkeley by * The Mach Operating System project at Carnegie-Mellon University. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: @(#)vm_fault.c 8.4 (Berkeley) 1/12/94 * * * Copyright (c) 1987, 1990 Carnegie-Mellon University. * All rights reserved. * * Authors: Avadis Tevanian, Jr., Michael Wayne Young * * Permission to use, copy, modify and distribute this software and * its documentation is hereby granted, provided that both the copyright * notice and this permission notice appear in all copies of the * software, derivative works or modified versions, and any portions * thereof, and that both notices appear in supporting documentation. * * CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS" * CONDITION. CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND * FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE. * * Carnegie Mellon requests users of this software to return to * * Software Distribution Coordinator or Software.Distribution@CS.CMU.EDU * School of Computer Science * Carnegie Mellon University * Pittsburgh PA 15213-3890 * * any improvements or extensions that they make and grant Carnegie the * rights to redistribute these changes. */ /* * Page fault handling module. */ #include __FBSDID("$FreeBSD$"); #include "opt_ktrace.h" #include "opt_vm.h" #include #include #include #include #include #include #include #include #include #include #include #include #ifdef KTRACE #include #endif #include #include #include #include #include #include #include #include #include #include #include #define PFBAK 4 #define PFFOR 4 #define VM_FAULT_READ_DEFAULT (1 + VM_FAULT_READ_AHEAD_INIT) #define VM_FAULT_READ_MAX (1 + VM_FAULT_READ_AHEAD_MAX) #define VM_FAULT_DONTNEED_MIN 1048576 struct faultstate { vm_page_t m; vm_object_t object; vm_pindex_t pindex; vm_page_t first_m; vm_object_t first_object; vm_pindex_t first_pindex; vm_map_t map; vm_map_entry_t entry; int lookup_still_valid; struct vnode *vp; }; static void vm_fault_dontneed(const struct faultstate *fs, vm_offset_t vaddr, int ahead); static void vm_fault_prefault(const struct faultstate *fs, vm_offset_t addra, int backward, int forward); static inline void release_page(struct faultstate *fs) { vm_page_xunbusy(fs->m); vm_page_lock(fs->m); vm_page_deactivate(fs->m); vm_page_unlock(fs->m); fs->m = NULL; } static inline void unlock_map(struct faultstate *fs) { if (fs->lookup_still_valid) { vm_map_lookup_done(fs->map, fs->entry); fs->lookup_still_valid = FALSE; } } static void unlock_and_deallocate(struct faultstate *fs) { vm_object_pip_wakeup(fs->object); VM_OBJECT_WUNLOCK(fs->object); if (fs->object != fs->first_object) { VM_OBJECT_WLOCK(fs->first_object); vm_page_lock(fs->first_m); vm_page_free(fs->first_m); vm_page_unlock(fs->first_m); vm_object_pip_wakeup(fs->first_object); VM_OBJECT_WUNLOCK(fs->first_object); fs->first_m = NULL; } vm_object_deallocate(fs->first_object); unlock_map(fs); if (fs->vp != NULL) { vput(fs->vp); fs->vp = NULL; } } static void vm_fault_dirty(vm_map_entry_t entry, vm_page_t m, vm_prot_t prot, vm_prot_t fault_type, int fault_flags, boolean_t set_wd) { boolean_t need_dirty; if (((prot & VM_PROT_WRITE) == 0 && (fault_flags & VM_FAULT_DIRTY) == 0) || (m->oflags & VPO_UNMANAGED) != 0) return; VM_OBJECT_ASSERT_LOCKED(m->object); need_dirty = ((fault_type & VM_PROT_WRITE) != 0 && (fault_flags & VM_FAULT_WIRE) == 0) || (fault_flags & VM_FAULT_DIRTY) != 0; if (set_wd) vm_object_set_writeable_dirty(m->object); else /* * If two callers of vm_fault_dirty() with set_wd == * FALSE, one for the map entry with MAP_ENTRY_NOSYNC * flag set, other with flag clear, race, it is * possible for the no-NOSYNC thread to see m->dirty * != 0 and not clear VPO_NOSYNC. Take vm_page lock * around manipulation of VPO_NOSYNC and * vm_page_dirty() call, to avoid the race and keep * m->oflags consistent. */ vm_page_lock(m); /* * If this is a NOSYNC mmap we do not want to set VPO_NOSYNC * if the page is already dirty to prevent data written with * the expectation of being synced from not being synced. * Likewise if this entry does not request NOSYNC then make * sure the page isn't marked NOSYNC. Applications sharing * data should use the same flags to avoid ping ponging. */ if ((entry->eflags & MAP_ENTRY_NOSYNC) != 0) { if (m->dirty == 0) { m->oflags |= VPO_NOSYNC; } } else { m->oflags &= ~VPO_NOSYNC; } /* * If the fault is a write, we know that this page is being * written NOW so dirty it explicitly to save on * pmap_is_modified() calls later. * * Also tell the backing pager, if any, that it should remove * any swap backing since the page is now dirty. */ if (need_dirty) vm_page_dirty(m); if (!set_wd) vm_page_unlock(m); if (need_dirty) vm_pager_page_unswapped(m); } /* * vm_fault: * * Handle a page fault occurring at the given address, * requiring the given permissions, in the map specified. * If successful, the page is inserted into the * associated physical map. * * NOTE: the given address should be truncated to the * proper page address. * * KERN_SUCCESS is returned if the page fault is handled; otherwise, * a standard error specifying why the fault is fatal is returned. * * The map in question must be referenced, and remains so. * Caller may hold no locks. */ int vm_fault(vm_map_t map, vm_offset_t vaddr, vm_prot_t fault_type, int fault_flags) { struct thread *td; int result; td = curthread; if ((td->td_pflags & TDP_NOFAULTING) != 0) return (KERN_PROTECTION_FAILURE); #ifdef KTRACE if (map != kernel_map && KTRPOINT(td, KTR_FAULT)) ktrfault(vaddr, fault_type); #endif result = vm_fault_hold(map, trunc_page(vaddr), fault_type, fault_flags, NULL); #ifdef KTRACE if (map != kernel_map && KTRPOINT(td, KTR_FAULTEND)) ktrfaultend(result); #endif return (result); } int vm_fault_hold(vm_map_t map, vm_offset_t vaddr, vm_prot_t fault_type, int fault_flags, vm_page_t *m_hold) { vm_prot_t prot; int alloc_req, era, faultcount, nera, result; boolean_t dead, growstack, is_first_object_locked, wired; int map_generation; vm_object_t next_object; int hardfault; struct faultstate fs; struct vnode *vp; vm_offset_t e_end, e_start; vm_page_t m; int ahead, behind, cluster_offset, error, locked, rv; u_char behavior; hardfault = 0; growstack = TRUE; PCPU_INC(cnt.v_vm_faults); fs.vp = NULL; faultcount = 0; nera = -1; RetryFault:; /* * Find the backing store object and offset into it to begin the * search. */ fs.map = map; result = vm_map_lookup(&fs.map, vaddr, fault_type, &fs.entry, &fs.first_object, &fs.first_pindex, &prot, &wired); if (result != KERN_SUCCESS) { if (growstack && result == KERN_INVALID_ADDRESS && map != kernel_map) { result = vm_map_growstack(curproc, vaddr); if (result != KERN_SUCCESS) return (KERN_FAILURE); growstack = FALSE; goto RetryFault; } return (result); } map_generation = fs.map->timestamp; if (fs.entry->eflags & MAP_ENTRY_NOFAULT) { panic("vm_fault: fault on nofault entry, addr: %lx", (u_long)vaddr); } if (fs.entry->eflags & MAP_ENTRY_IN_TRANSITION && fs.entry->wiring_thread != curthread) { vm_map_unlock_read(fs.map); vm_map_lock(fs.map); if (vm_map_lookup_entry(fs.map, vaddr, &fs.entry) && (fs.entry->eflags & MAP_ENTRY_IN_TRANSITION)) { if (fs.vp != NULL) { vput(fs.vp); fs.vp = NULL; } fs.entry->eflags |= MAP_ENTRY_NEEDS_WAKEUP; vm_map_unlock_and_wait(fs.map, 0); } else vm_map_unlock(fs.map); goto RetryFault; } if (wired) fault_type = prot | (fault_type & VM_PROT_COPY); else KASSERT((fault_flags & VM_FAULT_WIRE) == 0, ("!wired && VM_FAULT_WIRE")); + /* + * Try to avoid lock contention on the top-level object through + * special-case handling of some types of page faults, specifically, + * those that are both (1) mapping an existing page from the top- + * level object and (2) not having to mark that object as containing + * dirty pages. Under these conditions, a read lock on the top-level + * object suffices, allowing multiple page faults of a similar type to + * run in parallel on the same top-level object. + */ if (fs.vp == NULL /* avoid locked vnode leak */ && (fault_flags & (VM_FAULT_WIRE | VM_FAULT_DIRTY)) == 0 && /* avoid calling vm_object_set_writeable_dirty() */ ((prot & VM_PROT_WRITE) == 0 || (fs.first_object->type != OBJT_VNODE && (fs.first_object->flags & OBJ_TMPFS_NODE) == 0) || (fs.first_object->flags & OBJ_MIGHTBEDIRTY) != 0)) { VM_OBJECT_RLOCK(fs.first_object); if ((prot & VM_PROT_WRITE) != 0 && (fs.first_object->type == OBJT_VNODE || (fs.first_object->flags & OBJ_TMPFS_NODE) != 0) && (fs.first_object->flags & OBJ_MIGHTBEDIRTY) == 0) goto fast_failed; m = vm_page_lookup(fs.first_object, fs.first_pindex); /* A busy page can be mapped for read|execute access. */ if (m == NULL || ((prot & VM_PROT_WRITE) != 0 && vm_page_busied(m)) || m->valid != VM_PAGE_BITS_ALL) goto fast_failed; result = pmap_enter(fs.map->pmap, vaddr, m, prot, fault_type | PMAP_ENTER_NOSLEEP | (wired ? PMAP_ENTER_WIRED : 0), 0); if (result != KERN_SUCCESS) goto fast_failed; if (m_hold != NULL) { *m_hold = m; vm_page_lock(m); vm_page_hold(m); vm_page_unlock(m); } vm_fault_dirty(fs.entry, m, prot, fault_type, fault_flags, FALSE); VM_OBJECT_RUNLOCK(fs.first_object); if (!wired) vm_fault_prefault(&fs, vaddr, PFBAK, PFFOR); vm_map_lookup_done(fs.map, fs.entry); curthread->td_ru.ru_minflt++; return (KERN_SUCCESS); fast_failed: if (!VM_OBJECT_TRYUPGRADE(fs.first_object)) { VM_OBJECT_RUNLOCK(fs.first_object); VM_OBJECT_WLOCK(fs.first_object); } } else { VM_OBJECT_WLOCK(fs.first_object); } /* * Make a reference to this object to prevent its disposal while we * are messing with it. Once we have the reference, the map is free * to be diddled. Since objects reference their shadows (and copies), * they will stay around as well. * * Bump the paging-in-progress count to prevent size changes (e.g. * truncation operations) during I/O. This must be done after * obtaining the vnode lock in order to avoid possible deadlocks. */ vm_object_reference_locked(fs.first_object); vm_object_pip_add(fs.first_object, 1); fs.lookup_still_valid = TRUE; fs.first_m = NULL; /* * Search for the page at object/offset. */ fs.object = fs.first_object; fs.pindex = fs.first_pindex; while (TRUE) { /* * If the object is marked for imminent termination, * we retry here, since the collapse pass has raced * with us. Otherwise, if we see terminally dead * object, return fail. */ if ((fs.object->flags & OBJ_DEAD) != 0) { dead = fs.object->type == OBJT_DEAD; unlock_and_deallocate(&fs); if (dead) return (KERN_PROTECTION_FAILURE); pause("vmf_de", 1); goto RetryFault; } /* * See if page is resident */ fs.m = vm_page_lookup(fs.object, fs.pindex); if (fs.m != NULL) { /* * Wait/Retry if the page is busy. We have to do this * if the page is either exclusive or shared busy * because the vm_pager may be using read busy for * pageouts (and even pageins if it is the vnode * pager), and we could end up trying to pagein and * pageout the same page simultaneously. * * We can theoretically allow the busy case on a read * fault if the page is marked valid, but since such * pages are typically already pmap'd, putting that * special case in might be more effort then it is * worth. We cannot under any circumstances mess * around with a shared busied page except, perhaps, * to pmap it. */ if (vm_page_busied(fs.m)) { /* * Reference the page before unlocking and * sleeping so that the page daemon is less * likely to reclaim it. */ vm_page_aflag_set(fs.m, PGA_REFERENCED); if (fs.object != fs.first_object) { if (!VM_OBJECT_TRYWLOCK( fs.first_object)) { VM_OBJECT_WUNLOCK(fs.object); VM_OBJECT_WLOCK(fs.first_object); VM_OBJECT_WLOCK(fs.object); } vm_page_lock(fs.first_m); vm_page_free(fs.first_m); vm_page_unlock(fs.first_m); vm_object_pip_wakeup(fs.first_object); VM_OBJECT_WUNLOCK(fs.first_object); fs.first_m = NULL; } unlock_map(&fs); if (fs.m == vm_page_lookup(fs.object, fs.pindex)) { vm_page_sleep_if_busy(fs.m, "vmpfw"); } vm_object_pip_wakeup(fs.object); VM_OBJECT_WUNLOCK(fs.object); PCPU_INC(cnt.v_intrans); vm_object_deallocate(fs.first_object); goto RetryFault; } vm_page_lock(fs.m); vm_page_remque(fs.m); vm_page_unlock(fs.m); /* * Mark page busy for other processes, and the * pagedaemon. If it still isn't completely valid * (readable), jump to readrest, else break-out ( we * found the page ). */ vm_page_xbusy(fs.m); if (fs.m->valid != VM_PAGE_BITS_ALL) goto readrest; break; } KASSERT(fs.m == NULL, ("fs.m should be NULL, not %p", fs.m)); /* * Page is not resident. If the pager might contain the page * or this is the beginning of the search, allocate a new * page. (Default objects are zero-fill, so there is no real * pager for them.) */ if (fs.object->type != OBJT_DEFAULT || fs.object == fs.first_object) { if (fs.pindex >= fs.object->size) { unlock_and_deallocate(&fs); return (KERN_PROTECTION_FAILURE); } /* * Allocate a new page for this object/offset pair. * * Unlocked read of the p_flag is harmless. At * worst, the P_KILLED might be not observed * there, and allocation can fail, causing * restart and new reading of the p_flag. */ if (!vm_page_count_severe() || P_KILLED(curproc)) { #if VM_NRESERVLEVEL > 0 vm_object_color(fs.object, atop(vaddr) - fs.pindex); #endif alloc_req = P_KILLED(curproc) ? VM_ALLOC_SYSTEM : VM_ALLOC_NORMAL; if (fs.object->type != OBJT_VNODE && fs.object->backing_object == NULL) alloc_req |= VM_ALLOC_ZERO; fs.m = vm_page_alloc(fs.object, fs.pindex, alloc_req); } if (fs.m == NULL) { unlock_and_deallocate(&fs); VM_WAITPFAULT; goto RetryFault; } else if (fs.m->valid == VM_PAGE_BITS_ALL) break; } readrest: /* * If the pager for the current object might have the page, * then determine the number of additional pages to read and * potentially reprioritize previously read pages for earlier * reclamation. These operations should only be performed * once per page fault. Even if the current pager doesn't * have the page, the number of additional pages to read will * apply to subsequent objects in the shadow chain. */ if (fs.object->type != OBJT_DEFAULT && nera == -1 && !P_KILLED(curproc)) { KASSERT(fs.lookup_still_valid, ("map unlocked")); era = fs.entry->read_ahead; behavior = vm_map_entry_behavior(fs.entry); if (behavior == MAP_ENTRY_BEHAV_RANDOM) { nera = 0; } else if (behavior == MAP_ENTRY_BEHAV_SEQUENTIAL) { nera = VM_FAULT_READ_AHEAD_MAX; if (vaddr == fs.entry->next_read) vm_fault_dontneed(&fs, vaddr, nera); } else if (vaddr == fs.entry->next_read) { /* * This is a sequential fault. Arithmetically * increase the requested number of pages in * the read-ahead window. The requested * number of pages is "# of sequential faults * x (read ahead min + 1) + read ahead min" */ nera = VM_FAULT_READ_AHEAD_MIN; if (era > 0) { nera += era + 1; if (nera > VM_FAULT_READ_AHEAD_MAX) nera = VM_FAULT_READ_AHEAD_MAX; } if (era == VM_FAULT_READ_AHEAD_MAX) vm_fault_dontneed(&fs, vaddr, nera); } else { /* * This is a non-sequential fault. */ nera = 0; } if (era != nera) { /* * A read lock on the map suffices to update * the read ahead count safely. */ fs.entry->read_ahead = nera; } /* * Prepare for unlocking the map. Save the map * entry's start and end addresses, which are used to * optimize the size of the pager operation below. * Even if the map entry's addresses change after * unlocking the map, using the saved addresses is * safe. */ e_start = fs.entry->start; e_end = fs.entry->end; } /* * Call the pager to retrieve the page if there is a chance * that the pager has it, and potentially retrieve additional * pages at the same time. */ if (fs.object->type != OBJT_DEFAULT) { /* * We have either allocated a new page or found an * existing page that is only partially valid. We * hold a reference on fs.object and the page is * exclusive busied. */ unlock_map(&fs); if (fs.object->type == OBJT_VNODE) { vp = fs.object->handle; if (vp == fs.vp) goto vnode_locked; else if (fs.vp != NULL) { vput(fs.vp); fs.vp = NULL; } locked = VOP_ISLOCKED(vp); if (locked != LK_EXCLUSIVE) locked = LK_SHARED; /* Do not sleep for vnode lock while fs.m is busy */ error = vget(vp, locked | LK_CANRECURSE | LK_NOWAIT, curthread); if (error != 0) { vhold(vp); release_page(&fs); unlock_and_deallocate(&fs); error = vget(vp, locked | LK_RETRY | LK_CANRECURSE, curthread); vdrop(vp); fs.vp = vp; KASSERT(error == 0, ("vm_fault: vget failed")); goto RetryFault; } fs.vp = vp; } vnode_locked: KASSERT(fs.vp == NULL || !fs.map->system_map, ("vm_fault: vnode-backed object mapped by system map")); /* * Page in the requested page and hint the pager, * that it may bring up surrounding pages. */ if (nera == -1 || behavior == MAP_ENTRY_BEHAV_RANDOM || P_KILLED(curproc)) { behind = 0; ahead = 0; } else { /* Is this a sequential fault? */ if (nera > 0) { behind = 0; ahead = nera; } else { /* * Request a cluster of pages that is * aligned to a VM_FAULT_READ_DEFAULT * page offset boundary within the * object. Alignment to a page offset * boundary is more likely to coincide * with the underlying file system * block than alignment to a virtual * address boundary. */ cluster_offset = fs.pindex % VM_FAULT_READ_DEFAULT; behind = ulmin(cluster_offset, atop(vaddr - e_start)); ahead = VM_FAULT_READ_DEFAULT - 1 - cluster_offset; } ahead = ulmin(ahead, atop(e_end - vaddr) - 1); } rv = vm_pager_get_pages(fs.object, &fs.m, 1, &behind, &ahead); if (rv == VM_PAGER_OK) { faultcount = behind + 1 + ahead; hardfault++; break; /* break to PAGE HAS BEEN FOUND */ } if (rv == VM_PAGER_ERROR) printf("vm_fault: pager read error, pid %d (%s)\n", curproc->p_pid, curproc->p_comm); /* * If an I/O error occurred or the requested page was * outside the range of the pager, clean up and return * an error. */ if (rv == VM_PAGER_ERROR || rv == VM_PAGER_BAD) { vm_page_lock(fs.m); vm_page_free(fs.m); vm_page_unlock(fs.m); fs.m = NULL; unlock_and_deallocate(&fs); return (rv == VM_PAGER_ERROR ? KERN_FAILURE : KERN_PROTECTION_FAILURE); } /* * The requested page does not exist at this object/ * offset. Remove the invalid page from the object, * waking up anyone waiting for it, and continue on to * the next object. However, if this is the top-level * object, we must leave the busy page in place to * prevent another process from rushing past us, and * inserting the page in that object at the same time * that we are. */ if (fs.object != fs.first_object) { vm_page_lock(fs.m); vm_page_free(fs.m); vm_page_unlock(fs.m); fs.m = NULL; } } /* * We get here if the object has default pager (or unwiring) * or the pager doesn't have the page. */ if (fs.object == fs.first_object) fs.first_m = fs.m; /* * Move on to the next object. Lock the next object before * unlocking the current one. */ next_object = fs.object->backing_object; if (next_object == NULL) { /* * If there's no object left, fill the page in the top * object with zeros. */ if (fs.object != fs.first_object) { vm_object_pip_wakeup(fs.object); VM_OBJECT_WUNLOCK(fs.object); fs.object = fs.first_object; fs.pindex = fs.first_pindex; fs.m = fs.first_m; VM_OBJECT_WLOCK(fs.object); } fs.first_m = NULL; /* * Zero the page if necessary and mark it valid. */ if ((fs.m->flags & PG_ZERO) == 0) { pmap_zero_page(fs.m); } else { PCPU_INC(cnt.v_ozfod); } PCPU_INC(cnt.v_zfod); fs.m->valid = VM_PAGE_BITS_ALL; /* Don't try to prefault neighboring pages. */ faultcount = 1; break; /* break to PAGE HAS BEEN FOUND */ } else { KASSERT(fs.object != next_object, ("object loop %p", next_object)); VM_OBJECT_WLOCK(next_object); vm_object_pip_add(next_object, 1); if (fs.object != fs.first_object) vm_object_pip_wakeup(fs.object); fs.pindex += OFF_TO_IDX(fs.object->backing_object_offset); VM_OBJECT_WUNLOCK(fs.object); fs.object = next_object; } } vm_page_assert_xbusied(fs.m); /* * PAGE HAS BEEN FOUND. [Loop invariant still holds -- the object lock * is held.] */ /* * If the page is being written, but isn't already owned by the * top-level object, we have to copy it into a new page owned by the * top-level object. */ if (fs.object != fs.first_object) { /* * We only really need to copy if we want to write it. */ if ((fault_type & (VM_PROT_COPY | VM_PROT_WRITE)) != 0) { /* * This allows pages to be virtually copied from a * backing_object into the first_object, where the * backing object has no other refs to it, and cannot * gain any more refs. Instead of a bcopy, we just * move the page from the backing object to the * first object. Note that we must mark the page * dirty in the first object so that it will go out * to swap when needed. */ is_first_object_locked = FALSE; if ( /* * Only one shadow object */ (fs.object->shadow_count == 1) && /* * No COW refs, except us */ (fs.object->ref_count == 1) && /* * No one else can look this object up */ (fs.object->handle == NULL) && /* * No other ways to look the object up */ ((fs.object->type == OBJT_DEFAULT) || (fs.object->type == OBJT_SWAP)) && (is_first_object_locked = VM_OBJECT_TRYWLOCK(fs.first_object)) && /* * We don't chase down the shadow chain */ fs.object == fs.first_object->backing_object) { vm_page_lock(fs.m); vm_page_remove(fs.m); vm_page_unlock(fs.m); vm_page_lock(fs.first_m); vm_page_replace_checked(fs.m, fs.first_object, fs.first_pindex, fs.first_m); vm_page_free(fs.first_m); vm_page_unlock(fs.first_m); vm_page_dirty(fs.m); #if VM_NRESERVLEVEL > 0 /* * Rename the reservation. */ vm_reserv_rename(fs.m, fs.first_object, fs.object, OFF_TO_IDX( fs.first_object->backing_object_offset)); #endif /* * Removing the page from the backing object * unbusied it. */ vm_page_xbusy(fs.m); fs.first_m = fs.m; fs.m = NULL; PCPU_INC(cnt.v_cow_optim); } else { /* * Oh, well, lets copy it. */ pmap_copy_page(fs.m, fs.first_m); fs.first_m->valid = VM_PAGE_BITS_ALL; if (wired && (fault_flags & VM_FAULT_WIRE) == 0) { vm_page_lock(fs.first_m); vm_page_wire(fs.first_m); vm_page_unlock(fs.first_m); vm_page_lock(fs.m); vm_page_unwire(fs.m, PQ_INACTIVE); vm_page_unlock(fs.m); } /* * We no longer need the old page or object. */ release_page(&fs); } /* * fs.object != fs.first_object due to above * conditional */ vm_object_pip_wakeup(fs.object); VM_OBJECT_WUNLOCK(fs.object); /* * Only use the new page below... */ fs.object = fs.first_object; fs.pindex = fs.first_pindex; fs.m = fs.first_m; if (!is_first_object_locked) VM_OBJECT_WLOCK(fs.object); PCPU_INC(cnt.v_cow_faults); curthread->td_cow++; } else { prot &= ~VM_PROT_WRITE; } } /* * We must verify that the maps have not changed since our last * lookup. */ if (!fs.lookup_still_valid) { vm_object_t retry_object; vm_pindex_t retry_pindex; vm_prot_t retry_prot; if (!vm_map_trylock_read(fs.map)) { release_page(&fs); unlock_and_deallocate(&fs); goto RetryFault; } fs.lookup_still_valid = TRUE; if (fs.map->timestamp != map_generation) { result = vm_map_lookup_locked(&fs.map, vaddr, fault_type, &fs.entry, &retry_object, &retry_pindex, &retry_prot, &wired); /* * If we don't need the page any longer, put it on the inactive * list (the easiest thing to do here). If no one needs it, * pageout will grab it eventually. */ if (result != KERN_SUCCESS) { release_page(&fs); unlock_and_deallocate(&fs); /* * If retry of map lookup would have blocked then * retry fault from start. */ if (result == KERN_FAILURE) goto RetryFault; return (result); } if ((retry_object != fs.first_object) || (retry_pindex != fs.first_pindex)) { release_page(&fs); unlock_and_deallocate(&fs); goto RetryFault; } /* * Check whether the protection has changed or the object has * been copied while we left the map unlocked. Changing from * read to write permission is OK - we leave the page * write-protected, and catch the write fault. Changing from * write to read permission means that we can't mark the page * write-enabled after all. */ prot &= retry_prot; } } /* * If the page was filled by a pager, save the virtual address that * should be faulted on next under a sequential access pattern to the * map entry. A read lock on the map suffices to update this address * safely. */ if (hardfault) fs.entry->next_read = vaddr + ptoa(ahead) + PAGE_SIZE; vm_fault_dirty(fs.entry, fs.m, prot, fault_type, fault_flags, TRUE); vm_page_assert_xbusied(fs.m); /* * Page must be completely valid or it is not fit to * map into user space. vm_pager_get_pages() ensures this. */ KASSERT(fs.m->valid == VM_PAGE_BITS_ALL, ("vm_fault: page %p partially invalid", fs.m)); VM_OBJECT_WUNLOCK(fs.object); /* * Put this page into the physical map. We had to do the unlock above * because pmap_enter() may sleep. We don't put the page * back on the active queue until later so that the pageout daemon * won't find it (yet). */ pmap_enter(fs.map->pmap, vaddr, fs.m, prot, fault_type | (wired ? PMAP_ENTER_WIRED : 0), 0); if (faultcount != 1 && (fault_flags & VM_FAULT_WIRE) == 0 && wired == 0) vm_fault_prefault(&fs, vaddr, faultcount > 0 ? behind : PFBAK, faultcount > 0 ? ahead : PFFOR); VM_OBJECT_WLOCK(fs.object); vm_page_lock(fs.m); /* * If the page is not wired down, then put it where the pageout daemon * can find it. */ if ((fault_flags & VM_FAULT_WIRE) != 0) { KASSERT(wired, ("VM_FAULT_WIRE && !wired")); vm_page_wire(fs.m); } else vm_page_activate(fs.m); if (m_hold != NULL) { *m_hold = fs.m; vm_page_hold(fs.m); } vm_page_unlock(fs.m); vm_page_xunbusy(fs.m); /* * Unlock everything, and return */ unlock_and_deallocate(&fs); if (hardfault) { PCPU_INC(cnt.v_io_faults); curthread->td_ru.ru_majflt++; #ifdef RACCT if (racct_enable && fs.object->type == OBJT_VNODE) { PROC_LOCK(curproc); if ((fault_type & (VM_PROT_COPY | VM_PROT_WRITE)) != 0) { racct_add_force(curproc, RACCT_WRITEBPS, PAGE_SIZE + behind * PAGE_SIZE); racct_add_force(curproc, RACCT_WRITEIOPS, 1); } else { racct_add_force(curproc, RACCT_READBPS, PAGE_SIZE + ahead * PAGE_SIZE); racct_add_force(curproc, RACCT_READIOPS, 1); } PROC_UNLOCK(curproc); } #endif } else curthread->td_ru.ru_minflt++; return (KERN_SUCCESS); } /* * Speed up the reclamation of pages that precede the faulting pindex within * the first object of the shadow chain. Essentially, perform the equivalent * to madvise(..., MADV_DONTNEED) on a large cluster of pages that precedes * the faulting pindex by the cluster size when the pages read by vm_fault() * cross a cluster-size boundary. The cluster size is the greater of the * smallest superpage size and VM_FAULT_DONTNEED_MIN. * * When "fs->first_object" is a shadow object, the pages in the backing object * that precede the faulting pindex are deactivated by vm_fault(). So, this * function must only be concerned with pages in the first object. */ static void vm_fault_dontneed(const struct faultstate *fs, vm_offset_t vaddr, int ahead) { vm_map_entry_t entry; vm_object_t first_object, object; vm_offset_t end, start; vm_page_t m, m_next; vm_pindex_t pend, pstart; vm_size_t size; object = fs->object; VM_OBJECT_ASSERT_WLOCKED(object); first_object = fs->first_object; if (first_object != object) { if (!VM_OBJECT_TRYWLOCK(first_object)) { VM_OBJECT_WUNLOCK(object); VM_OBJECT_WLOCK(first_object); VM_OBJECT_WLOCK(object); } } /* Neither fictitious nor unmanaged pages can be reclaimed. */ if ((first_object->flags & (OBJ_FICTITIOUS | OBJ_UNMANAGED)) == 0) { size = VM_FAULT_DONTNEED_MIN; if (MAXPAGESIZES > 1 && size < pagesizes[1]) size = pagesizes[1]; end = rounddown2(vaddr, size); if (vaddr - end >= size - PAGE_SIZE - ptoa(ahead) && (entry = fs->entry)->start < end) { if (end - entry->start < size) start = entry->start; else start = end - size; pmap_advise(fs->map->pmap, start, end, MADV_DONTNEED); pstart = OFF_TO_IDX(entry->offset) + atop(start - entry->start); m_next = vm_page_find_least(first_object, pstart); pend = OFF_TO_IDX(entry->offset) + atop(end - entry->start); while ((m = m_next) != NULL && m->pindex < pend) { m_next = TAILQ_NEXT(m, listq); if (m->valid != VM_PAGE_BITS_ALL || vm_page_busied(m)) continue; /* * Don't clear PGA_REFERENCED, since it would * likely represent a reference by a different * process. * * Typically, at this point, prefetched pages * are still in the inactive queue. Only * pages that triggered page faults are in the * active queue. */ vm_page_lock(m); vm_page_deactivate(m); vm_page_unlock(m); } } } if (first_object != object) VM_OBJECT_WUNLOCK(first_object); } /* * vm_fault_prefault provides a quick way of clustering * pagefaults into a processes address space. It is a "cousin" * of vm_map_pmap_enter, except it runs at page fault time instead * of mmap time. */ static void vm_fault_prefault(const struct faultstate *fs, vm_offset_t addra, int backward, int forward) { pmap_t pmap; vm_map_entry_t entry; vm_object_t backing_object, lobject; vm_offset_t addr, starta; vm_pindex_t pindex; vm_page_t m; int i; pmap = fs->map->pmap; if (pmap != vmspace_pmap(curthread->td_proc->p_vmspace)) return; entry = fs->entry; starta = addra - backward * PAGE_SIZE; if (starta < entry->start) { starta = entry->start; } else if (starta > addra) { starta = 0; } /* * Generate the sequence of virtual addresses that are candidates for * prefaulting in an outward spiral from the faulting virtual address, * "addra". Specifically, the sequence is "addra - PAGE_SIZE", "addra * + PAGE_SIZE", "addra - 2 * PAGE_SIZE", "addra + 2 * PAGE_SIZE", ... * If the candidate address doesn't have a backing physical page, then * the loop immediately terminates. */ for (i = 0; i < 2 * imax(backward, forward); i++) { addr = addra + ((i >> 1) + 1) * ((i & 1) == 0 ? -PAGE_SIZE : PAGE_SIZE); if (addr > addra + forward * PAGE_SIZE) addr = 0; if (addr < starta || addr >= entry->end) continue; if (!pmap_is_prefaultable(pmap, addr)) continue; pindex = ((addr - entry->start) + entry->offset) >> PAGE_SHIFT; lobject = entry->object.vm_object; VM_OBJECT_RLOCK(lobject); while ((m = vm_page_lookup(lobject, pindex)) == NULL && lobject->type == OBJT_DEFAULT && (backing_object = lobject->backing_object) != NULL) { KASSERT((lobject->backing_object_offset & PAGE_MASK) == 0, ("vm_fault_prefault: unaligned object offset")); pindex += lobject->backing_object_offset >> PAGE_SHIFT; VM_OBJECT_RLOCK(backing_object); VM_OBJECT_RUNLOCK(lobject); lobject = backing_object; } if (m == NULL) { VM_OBJECT_RUNLOCK(lobject); break; } if (m->valid == VM_PAGE_BITS_ALL && (m->flags & PG_FICTITIOUS) == 0) pmap_enter_quick(pmap, addr, m, entry->protection); VM_OBJECT_RUNLOCK(lobject); } } /* * Hold each of the physical pages that are mapped by the specified range of * virtual addresses, ["addr", "addr" + "len"), if those mappings are valid * and allow the specified types of access, "prot". If all of the implied * pages are successfully held, then the number of held pages is returned * together with pointers to those pages in the array "ma". However, if any * of the pages cannot be held, -1 is returned. */ int vm_fault_quick_hold_pages(vm_map_t map, vm_offset_t addr, vm_size_t len, vm_prot_t prot, vm_page_t *ma, int max_count) { vm_offset_t end, va; vm_page_t *mp; int count; boolean_t pmap_failed; if (len == 0) return (0); end = round_page(addr + len); addr = trunc_page(addr); /* * Check for illegal addresses. */ if (addr < vm_map_min(map) || addr > end || end > vm_map_max(map)) return (-1); if (atop(end - addr) > max_count) panic("vm_fault_quick_hold_pages: count > max_count"); count = atop(end - addr); /* * Most likely, the physical pages are resident in the pmap, so it is * faster to try pmap_extract_and_hold() first. */ pmap_failed = FALSE; for (mp = ma, va = addr; va < end; mp++, va += PAGE_SIZE) { *mp = pmap_extract_and_hold(map->pmap, va, prot); if (*mp == NULL) pmap_failed = TRUE; else if ((prot & VM_PROT_WRITE) != 0 && (*mp)->dirty != VM_PAGE_BITS_ALL) { /* * Explicitly dirty the physical page. Otherwise, the * caller's changes may go unnoticed because they are * performed through an unmanaged mapping or by a DMA * operation. * * The object lock is not held here. * See vm_page_clear_dirty_mask(). */ vm_page_dirty(*mp); } } if (pmap_failed) { /* * One or more pages could not be held by the pmap. Either no * page was mapped at the specified virtual address or that * mapping had insufficient permissions. Attempt to fault in * and hold these pages. */ for (mp = ma, va = addr; va < end; mp++, va += PAGE_SIZE) if (*mp == NULL && vm_fault_hold(map, va, prot, VM_FAULT_NORMAL, mp) != KERN_SUCCESS) goto error; } return (count); error: for (mp = ma; mp < ma + count; mp++) if (*mp != NULL) { vm_page_lock(*mp); vm_page_unhold(*mp); vm_page_unlock(*mp); } return (-1); } /* * Routine: * vm_fault_copy_entry * Function: * Create new shadow object backing dst_entry with private copy of * all underlying pages. When src_entry is equal to dst_entry, * function implements COW for wired-down map entry. Otherwise, * it forks wired entry into dst_map. * * In/out conditions: * The source and destination maps must be locked for write. * The source map entry must be wired down (or be a sharing map * entry corresponding to a main map entry that is wired down). */ void vm_fault_copy_entry(vm_map_t dst_map, vm_map_t src_map, vm_map_entry_t dst_entry, vm_map_entry_t src_entry, vm_ooffset_t *fork_charge) { vm_object_t backing_object, dst_object, object, src_object; vm_pindex_t dst_pindex, pindex, src_pindex; vm_prot_t access, prot; vm_offset_t vaddr; vm_page_t dst_m; vm_page_t src_m; boolean_t upgrade; #ifdef lint src_map++; #endif /* lint */ upgrade = src_entry == dst_entry; access = prot = dst_entry->protection; src_object = src_entry->object.vm_object; src_pindex = OFF_TO_IDX(src_entry->offset); if (upgrade && (dst_entry->eflags & MAP_ENTRY_NEEDS_COPY) == 0) { dst_object = src_object; vm_object_reference(dst_object); } else { /* * Create the top-level object for the destination entry. (Doesn't * actually shadow anything - we copy the pages directly.) */ dst_object = vm_object_allocate(OBJT_DEFAULT, OFF_TO_IDX(dst_entry->end - dst_entry->start)); #if VM_NRESERVLEVEL > 0 dst_object->flags |= OBJ_COLORED; dst_object->pg_color = atop(dst_entry->start); #endif } VM_OBJECT_WLOCK(dst_object); KASSERT(upgrade || dst_entry->object.vm_object == NULL, ("vm_fault_copy_entry: vm_object not NULL")); if (src_object != dst_object) { dst_entry->object.vm_object = dst_object; dst_entry->offset = 0; dst_object->charge = dst_entry->end - dst_entry->start; } if (fork_charge != NULL) { KASSERT(dst_entry->cred == NULL, ("vm_fault_copy_entry: leaked swp charge")); dst_object->cred = curthread->td_ucred; crhold(dst_object->cred); *fork_charge += dst_object->charge; } else if (dst_object->cred == NULL) { KASSERT(dst_entry->cred != NULL, ("no cred for entry %p", dst_entry)); dst_object->cred = dst_entry->cred; dst_entry->cred = NULL; } /* * If not an upgrade, then enter the mappings in the pmap as * read and/or execute accesses. Otherwise, enter them as * write accesses. * * A writeable large page mapping is only created if all of * the constituent small page mappings are modified. Marking * PTEs as modified on inception allows promotion to happen * without taking potentially large number of soft faults. */ if (!upgrade) access &= ~VM_PROT_WRITE; /* * Loop through all of the virtual pages within the entry's * range, copying each page from the source object to the * destination object. Since the source is wired, those pages * must exist. In contrast, the destination is pageable. * Since the destination object does share any backing storage * with the source object, all of its pages must be dirtied, * regardless of whether they can be written. */ for (vaddr = dst_entry->start, dst_pindex = 0; vaddr < dst_entry->end; vaddr += PAGE_SIZE, dst_pindex++) { again: /* * Find the page in the source object, and copy it in. * Because the source is wired down, the page will be * in memory. */ if (src_object != dst_object) VM_OBJECT_RLOCK(src_object); object = src_object; pindex = src_pindex + dst_pindex; while ((src_m = vm_page_lookup(object, pindex)) == NULL && (backing_object = object->backing_object) != NULL) { /* * Unless the source mapping is read-only or * it is presently being upgraded from * read-only, the first object in the shadow * chain should provide all of the pages. In * other words, this loop body should never be * executed when the source mapping is already * read/write. */ KASSERT((src_entry->protection & VM_PROT_WRITE) == 0 || upgrade, ("vm_fault_copy_entry: main object missing page")); VM_OBJECT_RLOCK(backing_object); pindex += OFF_TO_IDX(object->backing_object_offset); if (object != dst_object) VM_OBJECT_RUNLOCK(object); object = backing_object; } KASSERT(src_m != NULL, ("vm_fault_copy_entry: page missing")); if (object != dst_object) { /* * Allocate a page in the destination object. */ dst_m = vm_page_alloc(dst_object, (src_object == dst_object ? src_pindex : 0) + dst_pindex, VM_ALLOC_NORMAL); if (dst_m == NULL) { VM_OBJECT_WUNLOCK(dst_object); VM_OBJECT_RUNLOCK(object); VM_WAIT; VM_OBJECT_WLOCK(dst_object); goto again; } pmap_copy_page(src_m, dst_m); VM_OBJECT_RUNLOCK(object); dst_m->valid = VM_PAGE_BITS_ALL; dst_m->dirty = VM_PAGE_BITS_ALL; } else { dst_m = src_m; if (vm_page_sleep_if_busy(dst_m, "fltupg")) goto again; vm_page_xbusy(dst_m); KASSERT(dst_m->valid == VM_PAGE_BITS_ALL, ("invalid dst page %p", dst_m)); } VM_OBJECT_WUNLOCK(dst_object); /* * Enter it in the pmap. If a wired, copy-on-write * mapping is being replaced by a write-enabled * mapping, then wire that new mapping. */ pmap_enter(dst_map->pmap, vaddr, dst_m, prot, access | (upgrade ? PMAP_ENTER_WIRED : 0), 0); /* * Mark it no longer busy, and put it on the active list. */ VM_OBJECT_WLOCK(dst_object); if (upgrade) { if (src_m != dst_m) { vm_page_lock(src_m); vm_page_unwire(src_m, PQ_INACTIVE); vm_page_unlock(src_m); vm_page_lock(dst_m); vm_page_wire(dst_m); vm_page_unlock(dst_m); } else { KASSERT(dst_m->wire_count > 0, ("dst_m %p is not wired", dst_m)); } } else { vm_page_lock(dst_m); vm_page_activate(dst_m); vm_page_unlock(dst_m); } vm_page_xunbusy(dst_m); } VM_OBJECT_WUNLOCK(dst_object); if (upgrade) { dst_entry->eflags &= ~(MAP_ENTRY_COW | MAP_ENTRY_NEEDS_COPY); vm_object_deallocate(src_object); } } /* * Block entry into the machine-independent layer's page fault handler by * the calling thread. Subsequent calls to vm_fault() by that thread will * return KERN_PROTECTION_FAILURE. Enable machine-dependent handling of * spurious page faults. */ int vm_fault_disable_pagefaults(void) { return (curthread_pflags_set(TDP_NOFAULTING | TDP_RESETSPUR)); } void vm_fault_enable_pagefaults(int save) { curthread_pflags_restore(save); } Index: user/alc/PQ_LAUNDRY/usr.bin/Makefile =================================================================== --- user/alc/PQ_LAUNDRY/usr.bin/Makefile (revision 303205) +++ user/alc/PQ_LAUNDRY/usr.bin/Makefile (revision 303206) @@ -1,309 +1,310 @@ # From: @(#)Makefile 8.3 (Berkeley) 1/7/94 # $FreeBSD$ .include # XXX MISSING: deroff diction graph learn plot # spell spline struct xsend # XXX Use GNU versions: diff ld patch # Moved to secure: bdes # SUBDIR= alias \ apply \ asa \ awk \ banner \ basename \ brandelf \ bsdcat \ bsdiff \ bzip2 \ bzip2recover \ cap_mkdb \ chat \ chpass \ cksum \ cmp \ col \ colldef \ colrm \ column \ comm \ compress \ cpuset \ csplit \ ctlstat \ cut \ dirname \ dpv \ du \ elf2aout \ elfdump \ enigma \ env \ expand \ false \ fetch \ find \ fmt \ fold \ fstat \ fsync \ gcore \ gencat \ getconf \ getent \ getopt \ grep \ gzip \ head \ hexdump \ id \ ident \ ipcrm \ ipcs \ join \ jot \ keylogin \ keylogout \ killall \ ktrace \ ktrdump \ lam \ lastcomm \ ldd \ leave \ less \ lessecho \ lesskey \ limits \ locale \ localedef \ lock \ lockf \ logger \ login \ logins \ logname \ look \ lorder \ lsvfs \ lzmainfo \ m4 \ mandoc \ mesg \ minigzip \ ministat \ mkdep \ mkfifo \ mkimg \ mklocale \ mktemp \ mkuzip \ mt \ ncal \ netstat \ newgrp \ nfsstat \ nice \ nl \ numactl \ nohup \ opieinfo \ opiekey \ opiepasswd \ pagesize \ passwd \ paste \ patch \ pathchk \ perror \ pr \ printenv \ printf \ procstat \ protect \ rctl \ renice \ resizewin \ rev \ revoke \ rpcinfo \ rs \ rup \ rusers \ rwall \ script \ sdiff \ sed \ send-pr \ seq \ shar \ showmount \ sockstat \ soelim \ sort \ split \ stat \ stdbuf \ su \ systat \ tabs \ tail \ tar \ tcopy \ tee \ time \ timeout \ tip \ top \ touch \ tput \ tr \ true \ truncate \ tset \ tsort \ tty \ uname \ unexpand \ uniq \ unzip \ units \ unvis \ uudecode \ uuencode \ vis \ vmstat \ w \ wall \ wc \ what \ whereis \ which \ whois \ write \ xargs \ xinstall \ xo \ xz \ xzdec \ yes # NB: keep these sorted by MK_* knobs SUBDIR.${MK_AT}+= at SUBDIR.${MK_ATM}+= atm SUBDIR.${MK_BLUETOOTH}+= bluetooth SUBDIR.${MK_BSD_CPIO}+= cpio SUBDIR.${MK_CALENDAR}+= calendar SUBDIR.${MK_CLANG}+= clang SUBDIR.${MK_EE}+= ee SUBDIR.${MK_FILE}+= file SUBDIR.${MK_FINGER}+= finger SUBDIR.${MK_FTP}+= ftp SUBDIR.${MK_GAMES}+= caesar SUBDIR.${MK_GAMES}+= factor SUBDIR.${MK_GAMES}+= fortune SUBDIR.${MK_GAMES}+= grdc SUBDIR.${MK_GAMES}+= morse SUBDIR.${MK_GAMES}+= number SUBDIR.${MK_GAMES}+= pom SUBDIR.${MK_GAMES}+= primes SUBDIR.${MK_GAMES}+= random .if ${MK_GPL_DTC} != "yes" .if ${COMPILER_FEATURES:Mc++11} SUBDIR+= dtc .endif .endif SUBDIR.${MK_GROFF}+= vgrind SUBDIR.${MK_HESIOD}+= hesinfo SUBDIR.${MK_ICONV}+= iconv SUBDIR.${MK_ICONV}+= mkcsmapper SUBDIR.${MK_ICONV}+= mkesdb SUBDIR.${MK_ISCSI}+= iscsictl SUBDIR.${MK_KDUMP}+= kdump SUBDIR.${MK_KDUMP}+= truss SUBDIR.${MK_KERBEROS_SUPPORT}+= compile_et SUBDIR.${MK_LDNS_UTILS}+= drill SUBDIR.${MK_LDNS_UTILS}+= host SUBDIR.${MK_LOCATE}+= locate # XXX msgs? SUBDIR.${MK_MAIL}+= biff SUBDIR.${MK_MAIL}+= from SUBDIR.${MK_MAIL}+= mail SUBDIR.${MK_MAIL}+= msgs SUBDIR.${MK_MAKE}+= bmake SUBDIR.${MK_MAN_UTILS}+= catman .if ${MK_MANDOCDB} == "no" # AND SUBDIR.${MK_MAN_UTILS}+= makewhatis .endif SUBDIR.${MK_MAN_UTILS}+= man SUBDIR.${MK_NETCAT}+= nc SUBDIR.${MK_NIS}+= ypcat SUBDIR.${MK_NIS}+= ypmatch SUBDIR.${MK_NIS}+= ypwhich SUBDIR.${MK_OPENSSH}+= ssh-copy-id SUBDIR.${MK_OPENSSL}+= bc SUBDIR.${MK_OPENSSL}+= chkey SUBDIR.${MK_OPENSSL}+= dc SUBDIR.${MK_OPENSSL}+= newkey SUBDIR.${MK_QUOTAS}+= quota SUBDIR.${MK_RCMDS}+= rlogin SUBDIR.${MK_RCMDS}+= rsh SUBDIR.${MK_RCMDS}+= ruptime SUBDIR.${MK_RCMDS}+= rwho SUBDIR.${MK_SENDMAIL}+= vacation SUBDIR.${MK_TALK}+= talk SUBDIR.${MK_TELNET}+= telnet SUBDIR.${MK_TESTS}+= tests SUBDIR.${MK_TEXTPROC}+= checknr SUBDIR.${MK_TEXTPROC}+= colcrt SUBDIR.${MK_TEXTPROC}+= ul SUBDIR.${MK_TFTP}+= tftp SUBDIR.${MK_TOOLCHAIN}+= addr2line SUBDIR.${MK_TOOLCHAIN}+= ar SUBDIR.${MK_TOOLCHAIN}+= c89 SUBDIR.${MK_TOOLCHAIN}+= c99 SUBDIR.${MK_TOOLCHAIN}+= ctags SUBDIR.${MK_TOOLCHAIN}+= cxxfilt SUBDIR.${MK_TOOLCHAIN}+= elfcopy SUBDIR.${MK_TOOLCHAIN}+= file2c -.if ${MACHINE_ARCH} != "aarch64" && \ # ARM64TODO gprof does not build - ${MACHINE_CPUARCH} != "riscv" # RISCVTODO gprof does not build +# ARM64TODO gprof does not build +# RISCVTODO gprof does not build +.if ${MACHINE_ARCH} != "aarch64" && ${MACHINE_CPUARCH} != "riscv" SUBDIR.${MK_TOOLCHAIN}+= gprof .endif SUBDIR.${MK_TOOLCHAIN}+= indent SUBDIR.${MK_TOOLCHAIN}+= lex SUBDIR.${MK_TOOLCHAIN}+= mkstr SUBDIR.${MK_TOOLCHAIN}+= nm SUBDIR.${MK_TOOLCHAIN}+= readelf SUBDIR.${MK_TOOLCHAIN}+= rpcgen SUBDIR.${MK_TOOLCHAIN}+= unifdef SUBDIR.${MK_TOOLCHAIN}+= size SUBDIR.${MK_TOOLCHAIN}+= strings .if ${MACHINE_ARCH} != "aarch64" # ARM64TODO xlint does not build SUBDIR.${MK_TOOLCHAIN}+= xlint .endif SUBDIR.${MK_TOOLCHAIN}+= xstr SUBDIR.${MK_TOOLCHAIN}+= yacc SUBDIR.${MK_VI}+= vi SUBDIR.${MK_VT}+= vtfontcvt SUBDIR.${MK_USB}+= usbhidaction SUBDIR.${MK_USB}+= usbhidctl SUBDIR.${MK_UTMPX}+= last .if ${MACHINE_CPUARCH} != "riscv" # RISCVTODO users does not build SUBDIR.${MK_UTMPX}+= users .endif SUBDIR.${MK_UTMPX}+= who SUBDIR.${MK_SVN}+= svn SUBDIR.${MK_SVNLITE}+= svn .include SUBDIR:= ${SUBDIR:O:u} SUBDIR_PARALLEL= .include Index: user/alc/PQ_LAUNDRY/usr.bin/gcore/elfcore.c =================================================================== --- user/alc/PQ_LAUNDRY/usr.bin/gcore/elfcore.c (revision 303205) +++ user/alc/PQ_LAUNDRY/usr.bin/gcore/elfcore.c (revision 303206) @@ -1,835 +1,863 @@ /*- * Copyright (c) 2007 Sandvine Incorporated * Copyright (c) 1998 John D. Polstra * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "extern.h" /* * Code for generating ELF core dumps. */ typedef void (*segment_callback)(vm_map_entry_t, void *); /* Closure for cb_put_phdr(). */ struct phdr_closure { Elf_Phdr *phdr; /* Program header to fill in */ Elf_Off offset; /* Offset of segment in core file */ }; /* Closure for cb_size_segment(). */ struct sseg_closure { int count; /* Count of writable segments. */ size_t size; /* Total size of all writable segments. */ }; #ifdef ELFCORE_COMPAT_32 typedef struct fpreg32 elfcore_fpregset_t; typedef struct reg32 elfcore_gregset_t; typedef struct prpsinfo32 elfcore_prpsinfo_t; typedef struct prstatus32 elfcore_prstatus_t; static void elf_convert_gregset(elfcore_gregset_t *rd, struct reg *rs); static void elf_convert_fpregset(elfcore_fpregset_t *rd, struct fpreg *rs); #else typedef fpregset_t elfcore_fpregset_t; typedef gregset_t elfcore_gregset_t; typedef prpsinfo_t elfcore_prpsinfo_t; typedef prstatus_t elfcore_prstatus_t; #define elf_convert_gregset(d,s) *d = *s #define elf_convert_fpregset(d,s) *d = *s #endif typedef void* (*notefunc_t)(void *, size_t *); static void cb_put_phdr(vm_map_entry_t, void *); static void cb_size_segment(vm_map_entry_t, void *); -static void each_writable_segment(vm_map_entry_t, segment_callback, +static void each_dumpable_segment(vm_map_entry_t, segment_callback, void *closure); static void elf_detach(void); /* atexit() handler. */ static void *elf_note_fpregset(void *, size_t *); static void *elf_note_prpsinfo(void *, size_t *); static void *elf_note_prstatus(void *, size_t *); static void *elf_note_thrmisc(void *, size_t *); #if defined(__i386__) || defined(__amd64__) static void *elf_note_x86_xstate(void *, size_t *); #endif #if defined(__powerpc__) static void *elf_note_powerpc_vmx(void *, size_t *); #endif static void *elf_note_procstat_auxv(void *, size_t *); static void *elf_note_procstat_files(void *, size_t *); static void *elf_note_procstat_groups(void *, size_t *); static void *elf_note_procstat_osrel(void *, size_t *); static void *elf_note_procstat_proc(void *, size_t *); static void *elf_note_procstat_psstrings(void *, size_t *); static void *elf_note_procstat_rlimit(void *, size_t *); static void *elf_note_procstat_umask(void *, size_t *); static void *elf_note_procstat_vmmap(void *, size_t *); static void elf_puthdr(pid_t, vm_map_entry_t, void *, size_t, size_t, size_t, int); static void elf_putnote(int, notefunc_t, void *, struct sbuf *); static void elf_putnotes(pid_t, struct sbuf *, size_t *); static void freemap(vm_map_entry_t); static vm_map_entry_t readmap(pid_t); static void *procstat_sysctl(void *, int, size_t, size_t *sizep); static pid_t g_pid; /* Pid being dumped, global for elf_detach */ static int g_status; /* proc status after ptrace attach */ static int elf_ident(int efd, pid_t pid __unused, char *binfile __unused) { Elf_Ehdr hdr; int cnt; uint16_t machine; cnt = read(efd, &hdr, sizeof(hdr)); if (cnt != sizeof(hdr)) return (0); if (!IS_ELF(hdr)) return (0); switch (hdr.e_ident[EI_DATA]) { case ELFDATA2LSB: machine = le16toh(hdr.e_machine); break; case ELFDATA2MSB: machine = be16toh(hdr.e_machine); break; default: return (0); } if (!ELF_MACHINE_OK(machine)) return (0); /* Looks good. */ return (1); } static void elf_detach(void) { int sig; if (g_pid != 0) { /* * Forward any pending signals. SIGSTOP is generated by ptrace * itself, so ignore it. */ sig = WIFSTOPPED(g_status) ? WSTOPSIG(g_status) : 0; if (sig == SIGSTOP) sig = 0; ptrace(PT_DETACH, g_pid, (caddr_t)1, sig); } } /* * Write an ELF coredump for the given pid to the given fd. */ static void elf_coredump(int efd __unused, int fd, pid_t pid) { vm_map_entry_t map; struct sseg_closure seginfo; struct sbuf *sb; void *hdr; size_t hdrsize, notesz, segoff; ssize_t n, old_len; Elf_Phdr *php; int i; /* Attach to process to dump. */ g_pid = pid; if (atexit(elf_detach) != 0) err(1, "atexit"); errno = 0; ptrace(PT_ATTACH, pid, NULL, 0); if (errno) err(1, "PT_ATTACH"); if (waitpid(pid, &g_status, 0) == -1) err(1, "waitpid"); /* Get the program's memory map. */ map = readmap(pid); /* Size the program segments. */ seginfo.count = 0; seginfo.size = 0; - each_writable_segment(map, cb_size_segment, &seginfo); + each_dumpable_segment(map, cb_size_segment, &seginfo); /* * Build the header and the notes using sbuf and write to the file. */ sb = sbuf_new_auto(); hdrsize = sizeof(Elf_Ehdr) + sizeof(Elf_Phdr) * (1 + seginfo.count); + if (seginfo.count + 1 >= PN_XNUM) + hdrsize += sizeof(Elf_Shdr); /* Start header + notes section. */ sbuf_start_section(sb, NULL); /* Make empty header subsection. */ sbuf_start_section(sb, &old_len); sbuf_putc(sb, 0); sbuf_end_section(sb, old_len, hdrsize, 0); /* Put notes. */ elf_putnotes(pid, sb, ¬esz); /* Align up to a page boundary for the program segments. */ sbuf_end_section(sb, -1, PAGE_SIZE, 0); if (sbuf_finish(sb) != 0) err(1, "sbuf_finish"); hdr = sbuf_data(sb); segoff = sbuf_len(sb); /* Fill in the header. */ elf_puthdr(pid, map, hdr, hdrsize, notesz, segoff, seginfo.count); n = write(fd, hdr, segoff); if (n == -1) err(1, "write"); if (n < segoff) errx(1, "short write"); /* Write the contents of all of the writable segments. */ php = (Elf_Phdr *)((char *)hdr + sizeof(Elf_Ehdr)) + 1; for (i = 0; i < seginfo.count; i++) { struct ptrace_io_desc iorequest; uintmax_t nleft = php->p_filesz; iorequest.piod_op = PIOD_READ_D; iorequest.piod_offs = (caddr_t)(uintptr_t)php->p_vaddr; while (nleft > 0) { char buf[8*1024]; size_t nwant; ssize_t ngot; if (nleft > sizeof(buf)) nwant = sizeof buf; else nwant = nleft; iorequest.piod_addr = buf; iorequest.piod_len = nwant; ptrace(PT_IO, pid, (caddr_t)&iorequest, 0); ngot = iorequest.piod_len; if ((size_t)ngot < nwant) errx(1, "short read wanted %zu, got %zd", nwant, ngot); ngot = write(fd, buf, nwant); if (ngot == -1) err(1, "write of segment %d failed", i); if ((size_t)ngot != nwant) errx(1, "short write"); nleft -= nwant; iorequest.piod_offs += ngot; } php++; } sbuf_delete(sb); freemap(map); } /* - * A callback for each_writable_segment() to write out the segment's + * A callback for each_dumpable_segment() to write out the segment's * program header entry. */ static void cb_put_phdr(vm_map_entry_t entry, void *closure) { struct phdr_closure *phc = (struct phdr_closure *)closure; Elf_Phdr *phdr = phc->phdr; phc->offset = round_page(phc->offset); phdr->p_type = PT_LOAD; phdr->p_offset = phc->offset; phdr->p_vaddr = entry->start; phdr->p_paddr = 0; phdr->p_filesz = phdr->p_memsz = entry->end - entry->start; phdr->p_align = PAGE_SIZE; phdr->p_flags = 0; if (entry->protection & VM_PROT_READ) phdr->p_flags |= PF_R; if (entry->protection & VM_PROT_WRITE) phdr->p_flags |= PF_W; if (entry->protection & VM_PROT_EXECUTE) phdr->p_flags |= PF_X; phc->offset += phdr->p_filesz; phc->phdr++; } /* - * A callback for each_writable_segment() to gather information about + * A callback for each_dumpable_segment() to gather information about * the number of segments and their total size. */ static void cb_size_segment(vm_map_entry_t entry, void *closure) { struct sseg_closure *ssc = (struct sseg_closure *)closure; ssc->count++; ssc->size += entry->end - entry->start; } /* * For each segment in the given memory map, call the given function * with a pointer to the map entry and some arbitrary caller-supplied * data. */ static void -each_writable_segment(vm_map_entry_t map, segment_callback func, void *closure) +each_dumpable_segment(vm_map_entry_t map, segment_callback func, void *closure) { vm_map_entry_t entry; for (entry = map; entry != NULL; entry = entry->next) (*func)(entry, closure); } static void elf_putnotes(pid_t pid, struct sbuf *sb, size_t *sizep) { lwpid_t *tids; size_t threads, old_len; ssize_t size; int i; errno = 0; threads = ptrace(PT_GETNUMLWPS, pid, NULL, 0); if (errno) err(1, "PT_GETNUMLWPS"); tids = malloc(threads * sizeof(*tids)); if (tids == NULL) errx(1, "out of memory"); errno = 0; ptrace(PT_GETLWPLIST, pid, (void *)tids, threads); if (errno) err(1, "PT_GETLWPLIST"); sbuf_start_section(sb, &old_len); elf_putnote(NT_PRPSINFO, elf_note_prpsinfo, &pid, sb); for (i = 0; i < threads; ++i) { elf_putnote(NT_PRSTATUS, elf_note_prstatus, tids + i, sb); elf_putnote(NT_FPREGSET, elf_note_fpregset, tids + i, sb); elf_putnote(NT_THRMISC, elf_note_thrmisc, tids + i, sb); #if defined(__i386__) || defined(__amd64__) elf_putnote(NT_X86_XSTATE, elf_note_x86_xstate, tids + i, sb); #endif #if defined(__powerpc__) elf_putnote(NT_PPC_VMX, elf_note_powerpc_vmx, tids + i, sb); #endif } #ifndef ELFCORE_COMPAT_32 elf_putnote(NT_PROCSTAT_PROC, elf_note_procstat_proc, &pid, sb); elf_putnote(NT_PROCSTAT_FILES, elf_note_procstat_files, &pid, sb); elf_putnote(NT_PROCSTAT_VMMAP, elf_note_procstat_vmmap, &pid, sb); elf_putnote(NT_PROCSTAT_GROUPS, elf_note_procstat_groups, &pid, sb); elf_putnote(NT_PROCSTAT_UMASK, elf_note_procstat_umask, &pid, sb); elf_putnote(NT_PROCSTAT_RLIMIT, elf_note_procstat_rlimit, &pid, sb); elf_putnote(NT_PROCSTAT_OSREL, elf_note_procstat_osrel, &pid, sb); elf_putnote(NT_PROCSTAT_PSSTRINGS, elf_note_procstat_psstrings, &pid, sb); elf_putnote(NT_PROCSTAT_AUXV, elf_note_procstat_auxv, &pid, sb); #endif size = sbuf_end_section(sb, old_len, 1, 0); if (size == -1) err(1, "sbuf_end_section"); free(tids); *sizep = size; } /* * Emit one note section to sbuf. */ static void elf_putnote(int type, notefunc_t notefunc, void *arg, struct sbuf *sb) { Elf_Note note; size_t descsz; ssize_t old_len; void *desc; desc = notefunc(arg, &descsz); note.n_namesz = 8; /* strlen("FreeBSD") + 1 */ note.n_descsz = descsz; note.n_type = type; sbuf_bcat(sb, ¬e, sizeof(note)); sbuf_start_section(sb, &old_len); sbuf_bcat(sb, "FreeBSD", note.n_namesz); sbuf_end_section(sb, old_len, sizeof(Elf32_Size), 0); if (descsz == 0) return; sbuf_start_section(sb, &old_len); sbuf_bcat(sb, desc, descsz); sbuf_end_section(sb, old_len, sizeof(Elf32_Size), 0); free(desc); } /* * Generate the ELF coredump header. */ static void elf_puthdr(pid_t pid, vm_map_entry_t map, void *hdr, size_t hdrsize, size_t notesz, size_t segoff, int numsegs) { Elf_Ehdr *ehdr; Elf_Phdr *phdr; + Elf_Shdr *shdr; struct phdr_closure phc; ehdr = (Elf_Ehdr *)hdr; - phdr = (Elf_Phdr *)((char *)hdr + sizeof(Elf_Ehdr)); ehdr->e_ident[EI_MAG0] = ELFMAG0; ehdr->e_ident[EI_MAG1] = ELFMAG1; ehdr->e_ident[EI_MAG2] = ELFMAG2; ehdr->e_ident[EI_MAG3] = ELFMAG3; ehdr->e_ident[EI_CLASS] = ELF_CLASS; ehdr->e_ident[EI_DATA] = ELF_DATA; ehdr->e_ident[EI_VERSION] = EV_CURRENT; ehdr->e_ident[EI_OSABI] = ELFOSABI_FREEBSD; ehdr->e_ident[EI_ABIVERSION] = 0; ehdr->e_ident[EI_PAD] = 0; ehdr->e_type = ET_CORE; ehdr->e_machine = ELF_ARCH; ehdr->e_version = EV_CURRENT; ehdr->e_entry = 0; ehdr->e_phoff = sizeof(Elf_Ehdr); ehdr->e_flags = 0; ehdr->e_ehsize = sizeof(Elf_Ehdr); ehdr->e_phentsize = sizeof(Elf_Phdr); - ehdr->e_phnum = numsegs + 1; ehdr->e_shentsize = sizeof(Elf_Shdr); - ehdr->e_shnum = 0; ehdr->e_shstrndx = SHN_UNDEF; + if (numsegs + 1 < PN_XNUM) { + ehdr->e_phnum = numsegs + 1; + ehdr->e_shnum = 0; + } else { + ehdr->e_phnum = PN_XNUM; + ehdr->e_shnum = 1; + ehdr->e_shoff = ehdr->e_phoff + + (numsegs + 1) * ehdr->e_phentsize; + + shdr = (Elf_Shdr *)((char *)hdr + ehdr->e_shoff); + memset(shdr, 0, sizeof(*shdr)); + /* + * A special first section is used to hold large segment and + * section counts. This was proposed by Sun Microsystems in + * Solaris and has been adopted by Linux; the standard ELF + * tools are already familiar with the technique. + * + * See table 7-7 of the Solaris "Linker and Libraries Guide" + * (or 12-7 depending on the version of the document) for more + * details. + */ + shdr->sh_type = SHT_NULL; + shdr->sh_size = ehdr->e_shnum; + shdr->sh_link = ehdr->e_shstrndx; + shdr->sh_info = numsegs + 1; + } + /* * Fill in the program header entries. */ + phdr = (Elf_Phdr *)((char *)hdr + ehdr->e_phoff); /* The note segement. */ phdr->p_type = PT_NOTE; phdr->p_offset = hdrsize; phdr->p_vaddr = 0; phdr->p_paddr = 0; phdr->p_filesz = notesz; phdr->p_memsz = 0; phdr->p_flags = PF_R; phdr->p_align = sizeof(Elf32_Size); phdr++; /* All the writable segments from the program. */ phc.phdr = phdr; phc.offset = segoff; - each_writable_segment(map, cb_put_phdr, &phc); + each_dumpable_segment(map, cb_put_phdr, &phc); } /* * Free the memory map. */ static void freemap(vm_map_entry_t map) { while (map != NULL) { vm_map_entry_t next = map->next; free(map); map = next; } } /* * Read the process's memory map using kinfo_getvmmap(), and return a list of * VM map entries. Only the non-device read/writable segments are * returned. The map entries in the list aren't fully filled in; only * the items we need are present. */ static vm_map_entry_t readmap(pid_t pid) { vm_map_entry_t ent, *linkp, map; struct kinfo_vmentry *vmentl, *kve; int i, nitems; vmentl = kinfo_getvmmap(pid, &nitems); if (vmentl == NULL) err(1, "cannot retrieve mappings for %u process", pid); map = NULL; linkp = ↦ for (i = 0; i < nitems; i++) { kve = &vmentl[i]; /* * Ignore 'malformed' segments or ones representing memory * mapping with MAP_NOCORE on. * If the 'full' support is disabled, just dump the most * meaningful data segments. */ if ((kve->kve_protection & KVME_PROT_READ) == 0 || (kve->kve_flags & KVME_FLAG_NOCOREDUMP) != 0 || kve->kve_type == KVME_TYPE_DEAD || kve->kve_type == KVME_TYPE_UNKNOWN || ((pflags & PFLAGS_FULL) == 0 && kve->kve_type != KVME_TYPE_DEFAULT && kve->kve_type != KVME_TYPE_VNODE && kve->kve_type != KVME_TYPE_SWAP && kve->kve_type != KVME_TYPE_PHYS)) continue; ent = calloc(1, sizeof(*ent)); if (ent == NULL) errx(1, "out of memory"); ent->start = (vm_offset_t)kve->kve_start; ent->end = (vm_offset_t)kve->kve_end; ent->protection = VM_PROT_READ | VM_PROT_WRITE; if ((kve->kve_protection & KVME_PROT_EXEC) != 0) ent->protection |= VM_PROT_EXECUTE; *linkp = ent; linkp = &ent->next; } free(vmentl); return (map); } /* * Miscellaneous note out functions. */ static void * elf_note_prpsinfo(void *arg, size_t *sizep) { char *cp, *end; pid_t pid; elfcore_prpsinfo_t *psinfo; struct kinfo_proc kip; size_t len; int name[4]; pid = *(pid_t *)arg; psinfo = calloc(1, sizeof(*psinfo)); if (psinfo == NULL) errx(1, "out of memory"); psinfo->pr_version = PRPSINFO_VERSION; psinfo->pr_psinfosz = sizeof(*psinfo); name[0] = CTL_KERN; name[1] = KERN_PROC; name[2] = KERN_PROC_PID; name[3] = pid; len = sizeof(kip); if (sysctl(name, 4, &kip, &len, NULL, 0) == -1) err(1, "kern.proc.pid.%u", pid); if (kip.ki_pid != pid) err(1, "kern.proc.pid.%u", pid); strlcpy(psinfo->pr_fname, kip.ki_comm, sizeof(psinfo->pr_fname)); name[2] = KERN_PROC_ARGS; len = sizeof(psinfo->pr_psargs) - 1; if (sysctl(name, 4, psinfo->pr_psargs, &len, NULL, 0) == 0 && len > 0) { cp = psinfo->pr_psargs; end = cp + len - 1; for (;;) { cp = memchr(cp, '\0', end - cp); if (cp == NULL) break; *cp = ' '; } } else strlcpy(psinfo->pr_psargs, kip.ki_comm, sizeof(psinfo->pr_psargs)); psinfo->pr_pid = pid; *sizep = sizeof(*psinfo); return (psinfo); } static void * elf_note_prstatus(void *arg, size_t *sizep) { lwpid_t tid; elfcore_prstatus_t *status; struct reg greg; tid = *(lwpid_t *)arg; status = calloc(1, sizeof(*status)); if (status == NULL) errx(1, "out of memory"); status->pr_version = PRSTATUS_VERSION; status->pr_statussz = sizeof(*status); status->pr_gregsetsz = sizeof(elfcore_gregset_t); status->pr_fpregsetsz = sizeof(elfcore_fpregset_t); status->pr_osreldate = __FreeBSD_version; status->pr_pid = tid; ptrace(PT_GETREGS, tid, (void *)&greg, 0); elf_convert_gregset(&status->pr_reg, &greg); *sizep = sizeof(*status); return (status); } static void * elf_note_fpregset(void *arg, size_t *sizep) { lwpid_t tid; elfcore_fpregset_t *fpregset; fpregset_t fpreg; tid = *(lwpid_t *)arg; fpregset = calloc(1, sizeof(*fpregset)); if (fpregset == NULL) errx(1, "out of memory"); ptrace(PT_GETFPREGS, tid, (void *)&fpreg, 0); elf_convert_fpregset(fpregset, &fpreg); *sizep = sizeof(*fpregset); return (fpregset); } static void * elf_note_thrmisc(void *arg, size_t *sizep) { lwpid_t tid; struct ptrace_lwpinfo lwpinfo; thrmisc_t *thrmisc; tid = *(lwpid_t *)arg; thrmisc = calloc(1, sizeof(*thrmisc)); if (thrmisc == NULL) errx(1, "out of memory"); ptrace(PT_LWPINFO, tid, (void *)&lwpinfo, sizeof(lwpinfo)); memset(&thrmisc->_pad, 0, sizeof(thrmisc->_pad)); strcpy(thrmisc->pr_tname, lwpinfo.pl_tdname); *sizep = sizeof(*thrmisc); return (thrmisc); } #if defined(__i386__) || defined(__amd64__) static void * elf_note_x86_xstate(void *arg, size_t *sizep) { lwpid_t tid; char *xstate; static bool xsave_checked = false; static struct ptrace_xstate_info info; tid = *(lwpid_t *)arg; if (!xsave_checked) { if (ptrace(PT_GETXSTATE_INFO, tid, (void *)&info, sizeof(info)) != 0) info.xsave_len = 0; xsave_checked = true; } if (info.xsave_len == 0) { *sizep = 0; return (NULL); } xstate = calloc(1, info.xsave_len); ptrace(PT_GETXSTATE, tid, xstate, 0); *(uint64_t *)(xstate + X86_XSTATE_XCR0_OFFSET) = info.xsave_mask; *sizep = info.xsave_len; return (xstate); } #endif #if defined(__powerpc__) static void * elf_note_powerpc_vmx(void *arg, size_t *sizep) { lwpid_t tid; struct vmxreg *vmx; static bool has_vmx = true; struct vmxreg info; tid = *(lwpid_t *)arg; if (has_vmx) { if (ptrace(PT_GETVRREGS, tid, (void *)&info, sizeof(info)) != 0) has_vmx = false; } if (!has_vmx) { *sizep = 0; return (NULL); } vmx = calloc(1, sizeof(*vmx)); memcpy(vmx, &info, sizeof(*vmx)); *sizep = sizeof(*vmx); return (vmx); } #endif static void * procstat_sysctl(void *arg, int what, size_t structsz, size_t *sizep) { size_t len; pid_t pid; int name[4], structsize; void *buf, *p; pid = *(pid_t *)arg; structsize = structsz; name[0] = CTL_KERN; name[1] = KERN_PROC; name[2] = what; name[3] = pid; len = 0; if (sysctl(name, 4, NULL, &len, NULL, 0) == -1) err(1, "kern.proc.%d.%u", what, pid); buf = calloc(1, sizeof(structsize) + len * 4 / 3); if (buf == NULL) errx(1, "out of memory"); bcopy(&structsize, buf, sizeof(structsize)); p = (char *)buf + sizeof(structsize); if (sysctl(name, 4, p, &len, NULL, 0) == -1) err(1, "kern.proc.%d.%u", what, pid); *sizep = sizeof(structsize) + len; return (buf); } static void * elf_note_procstat_proc(void *arg, size_t *sizep) { return (procstat_sysctl(arg, KERN_PROC_PID | KERN_PROC_INC_THREAD, sizeof(struct kinfo_proc), sizep)); } static void * elf_note_procstat_files(void *arg, size_t *sizep) { return (procstat_sysctl(arg, KERN_PROC_FILEDESC, sizeof(struct kinfo_file), sizep)); } static void * elf_note_procstat_vmmap(void *arg, size_t *sizep) { return (procstat_sysctl(arg, KERN_PROC_VMMAP, sizeof(struct kinfo_vmentry), sizep)); } static void * elf_note_procstat_groups(void *arg, size_t *sizep) { return (procstat_sysctl(arg, KERN_PROC_GROUPS, sizeof(gid_t), sizep)); } static void * elf_note_procstat_umask(void *arg, size_t *sizep) { return (procstat_sysctl(arg, KERN_PROC_UMASK, sizeof(u_short), sizep)); } static void * elf_note_procstat_osrel(void *arg, size_t *sizep) { return (procstat_sysctl(arg, KERN_PROC_OSREL, sizeof(int), sizep)); } static void * elf_note_procstat_psstrings(void *arg, size_t *sizep) { return (procstat_sysctl(arg, KERN_PROC_PS_STRINGS, sizeof(vm_offset_t), sizep)); } static void * elf_note_procstat_auxv(void *arg, size_t *sizep) { return (procstat_sysctl(arg, KERN_PROC_AUXV, sizeof(Elf_Auxinfo), sizep)); } static void * elf_note_procstat_rlimit(void *arg, size_t *sizep) { pid_t pid; size_t len; int i, name[5], structsize; void *buf, *p; pid = *(pid_t *)arg; structsize = sizeof(struct rlimit) * RLIM_NLIMITS; buf = calloc(1, sizeof(structsize) + structsize); if (buf == NULL) errx(1, "out of memory"); bcopy(&structsize, buf, sizeof(structsize)); p = (char *)buf + sizeof(structsize); name[0] = CTL_KERN; name[1] = KERN_PROC; name[2] = KERN_PROC_RLIMIT; name[3] = pid; len = sizeof(struct rlimit); for (i = 0; i < RLIM_NLIMITS; i++) { name[4] = i; if (sysctl(name, 5, p, &len, NULL, 0) == -1) err(1, "kern.proc.rlimit.%u", pid); if (len != sizeof(struct rlimit)) errx(1, "kern.proc.rlimit.%u: short read", pid); p += len; } *sizep = sizeof(structsize) + structsize; return (buf); } struct dumpers __elfN(dump) = { elf_ident, elf_coredump }; TEXT_SET(dumpset, __elfN(dump)); Index: user/alc/PQ_LAUNDRY/usr.bin/sed/process.c =================================================================== --- user/alc/PQ_LAUNDRY/usr.bin/sed/process.c (revision 303205) +++ user/alc/PQ_LAUNDRY/usr.bin/sed/process.c (revision 303206) @@ -1,784 +1,785 @@ /*- * Copyright (c) 1992 Diomidis Spinellis. * Copyright (c) 1992, 1993, 1994 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Diomidis Spinellis of Imperial College, University of London. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #ifndef lint static const char sccsid[] = "@(#)process.c 8.6 (Berkeley) 4/20/94"; #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "defs.h" #include "extern.h" static SPACE HS, PS, SS, YS; #define pd PS.deleted #define ps PS.space #define psl PS.len #define psanl PS.append_newline #define hs HS.space #define hsl HS.len static inline int applies(struct s_command *); static void do_tr(const struct s_tr *); static void flush_appends(void); static void lputs(const char *, size_t); static int regexec_e(const regex_t *, const char *, int, int, size_t, size_t); static void regsub(SPACE *, const char *, const char *); static int substitute(const struct s_command *); struct s_appends *appends; /* Array of pointers to strings to append. */ static int appendx; /* Index into appends array. */ int appendnum; /* Size of appends array. */ static int lastaddr; /* Set by applies if last address of a range. */ static int sdone; /* If any substitutes since last line input. */ /* Iov structure for 'w' commands. */ static const regex_t *defpreg; size_t maxnsub; regmatch_t *match; #define OUT() do { \ fwrite(ps, 1, psl, outfile); \ if (psanl) fputc('\n', outfile); \ } while (0) void process(void) { struct s_command *cp; SPACE tspace; - size_t oldpsl = 0; + size_t oldpsl; char *p; int oldpsanl; p = NULL; + oldpsanl = oldpsl = 0; for (linenum = 0; mf_fgets(&PS, REPLACE);) { pd = 0; top: cp = prog; redirect: while (cp != NULL) { if (!applies(cp)) { cp = cp->next; continue; } switch (cp->code) { case '{': cp = cp->u.c; goto redirect; case 'a': if (appendx >= appendnum) if ((appends = realloc(appends, sizeof(struct s_appends) * (appendnum *= 2))) == NULL) err(1, "realloc"); appends[appendx].type = AP_STRING; appends[appendx].s = cp->t; appends[appendx].len = strlen(cp->t); appendx++; break; case 'b': cp = cp->u.c; goto redirect; case 'c': pd = 1; psl = 0; if (cp->a2 == NULL || lastaddr || lastline()) (void)fprintf(outfile, "%s", cp->t); break; case 'd': pd = 1; goto new; case 'D': if (pd) goto new; if (psl == 0 || (p = memchr(ps, '\n', psl)) == NULL) { pd = 1; goto new; } else { psl -= (p + 1) - ps; memmove(ps, p + 1, psl); goto top; } case 'g': cspace(&PS, hs, hsl, REPLACE); break; case 'G': cspace(&PS, "\n", 1, APPEND); cspace(&PS, hs, hsl, APPEND); break; case 'h': cspace(&HS, ps, psl, REPLACE); break; case 'H': cspace(&HS, "\n", 1, APPEND); cspace(&HS, ps, psl, APPEND); break; case 'i': (void)fprintf(outfile, "%s", cp->t); break; case 'l': lputs(ps, psl); break; case 'n': if (!nflag && !pd) OUT(); flush_appends(); if (!mf_fgets(&PS, REPLACE)) exit(0); pd = 0; break; case 'N': flush_appends(); cspace(&PS, "\n", 1, APPEND); if (!mf_fgets(&PS, APPEND)) exit(0); break; case 'p': if (pd) break; OUT(); break; case 'P': if (pd) break; if ((p = memchr(ps, '\n', psl)) != NULL) { oldpsl = psl; oldpsanl = psanl; psl = p - ps; psanl = 1; } OUT(); if (p != NULL) { psl = oldpsl; psanl = oldpsanl; } break; case 'q': if (!nflag && !pd) OUT(); flush_appends(); exit(0); case 'r': if (appendx >= appendnum) if ((appends = realloc(appends, sizeof(struct s_appends) * (appendnum *= 2))) == NULL) err(1, "realloc"); appends[appendx].type = AP_FILE; appends[appendx].s = cp->t; appends[appendx].len = strlen(cp->t); appendx++; break; case 's': sdone |= substitute(cp); break; case 't': if (sdone) { sdone = 0; cp = cp->u.c; goto redirect; } break; case 'w': if (pd) break; if (cp->u.fd == -1 && (cp->u.fd = open(cp->t, O_WRONLY|O_APPEND|O_CREAT|O_TRUNC, DEFFILEMODE)) == -1) err(1, "%s", cp->t); if (write(cp->u.fd, ps, psl) != (ssize_t)psl || write(cp->u.fd, "\n", 1) != 1) err(1, "%s", cp->t); break; case 'x': /* * If the hold space is null, make it empty * but not null. Otherwise the pattern space * will become null after the swap, which is * an abnormal condition. */ if (hs == NULL) cspace(&HS, "", 0, REPLACE); tspace = PS; PS = HS; psanl = tspace.append_newline; HS = tspace; break; case 'y': if (pd || psl == 0) break; do_tr(cp->u.y); break; case ':': case '}': break; case '=': (void)fprintf(outfile, "%lu\n", linenum); } cp = cp->next; } /* for all cp */ new: if (!nflag && !pd) OUT(); flush_appends(); } /* for all lines */ } /* * TRUE if the address passed matches the current program state * (lastline, linenumber, ps). */ #define MATCH(a) \ ((a)->type == AT_RE ? regexec_e((a)->u.r, ps, 0, 1, 0, psl) : \ (a)->type == AT_LINE ? linenum == (a)->u.l : lastline()) /* * Return TRUE if the command applies to the current line. Sets the start * line for process ranges. Interprets the non-select (``!'') flag. */ static inline int applies(struct s_command *cp) { int r; lastaddr = 0; if (cp->a1 == NULL && cp->a2 == NULL) r = 1; else if (cp->a2) if (cp->startline > 0) { switch (cp->a2->type) { case AT_RELLINE: if (linenum - cp->startline <= cp->a2->u.l) r = 1; else { cp->startline = 0; r = 0; } break; default: if (MATCH(cp->a2)) { cp->startline = 0; lastaddr = 1; r = 1; } else if (cp->a2->type == AT_LINE && linenum > cp->a2->u.l) { /* * We missed the 2nd address due to a * branch, so just close the range and * return false. */ cp->startline = 0; r = 0; } else r = 1; } } else if (cp->a1 && MATCH(cp->a1)) { /* * If the second address is a number less than or * equal to the line number first selected, only * one line shall be selected. * -- POSIX 1003.2 * Likewise if the relative second line address is zero. */ if ((cp->a2->type == AT_LINE && linenum >= cp->a2->u.l) || (cp->a2->type == AT_RELLINE && cp->a2->u.l == 0)) lastaddr = 1; else { cp->startline = linenum; } r = 1; } else r = 0; else r = MATCH(cp->a1); return (cp->nonsel ? ! r : r); } /* * Reset the sed processor to its initial state. */ void resetstate(void) { struct s_command *cp; /* * Reset all in-range markers. */ for (cp = prog; cp; cp = cp->code == '{' ? cp->u.c : cp->next) if (cp->a2) cp->startline = 0; /* * Clear out the hold space. */ cspace(&HS, "", 0, REPLACE); } /* * substitute -- * Do substitutions in the pattern space. Currently, we build a * copy of the new pattern space in the substitute space structure * and then swap them. */ static int substitute(const struct s_command *cp) { SPACE tspace; const regex_t *re; regoff_t slen; int lastempty, n; regoff_t le = 0; char *s; s = ps; re = cp->u.s->re; if (re == NULL) { if (defpreg != NULL && cp->u.s->maxbref > defpreg->re_nsub) { linenum = cp->u.s->linenum; errx(1, "%lu: %s: \\%u not defined in the RE", linenum, fname, cp->u.s->maxbref); } } if (!regexec_e(re, ps, 0, 0, 0, psl)) return (0); SS.len = 0; /* Clean substitute space. */ slen = psl; n = cp->u.s->n; lastempty = 1; do { /* Copy the leading retained string. */ if (n <= 1 && (match[0].rm_so > le)) cspace(&SS, s, match[0].rm_so - le, APPEND); /* Skip zero-length matches right after other matches. */ if (lastempty || (match[0].rm_so - le) || match[0].rm_so != match[0].rm_eo) { if (n <= 1) { /* Want this match: append replacement. */ regsub(&SS, ps, cp->u.s->new); if (n == 1) n = -1; } else { /* Want a later match: append original. */ if (match[0].rm_eo - le) cspace(&SS, s, match[0].rm_eo - le, APPEND); n--; } } /* Move past this match. */ s = ps + match[0].rm_eo; slen = psl - match[0].rm_eo; le = match[0].rm_eo; /* * After a zero-length match, advance one byte, * and at the end of the line, terminate. */ if (match[0].rm_so == match[0].rm_eo) { if (*s == '\0' || *s == '\n') slen = -1; else slen--; if (*s != '\0') { cspace(&SS, s++, 1, APPEND); le++; } lastempty = 1; } else lastempty = 0; } while (n >= 0 && slen >= 0 && regexec_e(re, ps, REG_NOTBOL, 0, le, psl)); /* Did not find the requested number of matches. */ if (n > 0) return (0); /* Copy the trailing retained string. */ if (slen > 0) cspace(&SS, s, slen, APPEND); /* * Swap the substitute space and the pattern space, and make sure * that any leftover pointers into stdio memory get lost. */ tspace = PS; PS = SS; psanl = tspace.append_newline; SS = tspace; SS.space = SS.back; /* Handle the 'p' flag. */ if (cp->u.s->p) OUT(); /* Handle the 'w' flag. */ if (cp->u.s->wfile && !pd) { if (cp->u.s->wfd == -1 && (cp->u.s->wfd = open(cp->u.s->wfile, O_WRONLY|O_APPEND|O_CREAT|O_TRUNC, DEFFILEMODE)) == -1) err(1, "%s", cp->u.s->wfile); if (write(cp->u.s->wfd, ps, psl) != (ssize_t)psl || write(cp->u.s->wfd, "\n", 1) != 1) err(1, "%s", cp->u.s->wfile); } return (1); } /* * do_tr -- * Perform translation ('y' command) in the pattern space. */ static void do_tr(const struct s_tr *y) { SPACE tmp; char c, *p; size_t clen, left; int i; if (MB_CUR_MAX == 1) { /* * Single-byte encoding: perform in-place translation * of the pattern space. */ for (p = ps; p < &ps[psl]; p++) *p = y->bytetab[(u_char)*p]; } else { /* * Multi-byte encoding: perform translation into the * translation space, then swap the translation and * pattern spaces. */ /* Clean translation space. */ YS.len = 0; for (p = ps, left = psl; left > 0; p += clen, left -= clen) { if ((c = y->bytetab[(u_char)*p]) != '\0') { cspace(&YS, &c, 1, APPEND); clen = 1; continue; } for (i = 0; i < y->nmultis; i++) if (left >= y->multis[i].fromlen && memcmp(p, y->multis[i].from, y->multis[i].fromlen) == 0) break; if (i < y->nmultis) { cspace(&YS, y->multis[i].to, y->multis[i].tolen, APPEND); clen = y->multis[i].fromlen; } else { cspace(&YS, p, 1, APPEND); clen = 1; } } /* Swap the translation space and the pattern space. */ tmp = PS; PS = YS; psanl = tmp.append_newline; YS = tmp; YS.space = YS.back; } } /* * Flush append requests. Always called before reading a line, * therefore it also resets the substitution done (sdone) flag. */ static void flush_appends(void) { FILE *f; int count, i; char buf[8 * 1024]; for (i = 0; i < appendx; i++) switch (appends[i].type) { case AP_STRING: fwrite(appends[i].s, sizeof(char), appends[i].len, outfile); break; case AP_FILE: /* * Read files probably shouldn't be cached. Since * it's not an error to read a non-existent file, * it's possible that another program is interacting * with the sed script through the filesystem. It * would be truly bizarre, but possible. It's probably * not that big a performance win, anyhow. */ if ((f = fopen(appends[i].s, "r")) == NULL) break; while ((count = fread(buf, sizeof(char), sizeof(buf), f))) (void)fwrite(buf, sizeof(char), count, outfile); (void)fclose(f); break; } if (ferror(outfile)) errx(1, "%s: %s", outfname, strerror(errno ? errno : EIO)); appendx = sdone = 0; } static void lputs(const char *s, size_t len) { static const char escapes[] = "\\\a\b\f\r\t\v"; int c, col, width; const char *p; struct winsize win; static int termwidth = -1; size_t clen, i; wchar_t wc; mbstate_t mbs; if (outfile != stdout) termwidth = 60; if (termwidth == -1) { if ((p = getenv("COLUMNS")) && *p != '\0') termwidth = atoi(p); else if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &win) == 0 && win.ws_col > 0) termwidth = win.ws_col; else termwidth = 60; } if (termwidth <= 0) termwidth = 1; memset(&mbs, 0, sizeof(mbs)); col = 0; while (len != 0) { clen = mbrtowc(&wc, s, len, &mbs); if (clen == 0) clen = 1; if (clen == (size_t)-1 || clen == (size_t)-2) { wc = (unsigned char)*s; clen = 1; memset(&mbs, 0, sizeof(mbs)); } if (wc == '\n') { if (col + 1 >= termwidth) fprintf(outfile, "\\\n"); fputc('$', outfile); fputc('\n', outfile); col = 0; } else if (iswprint(wc)) { width = wcwidth(wc); if (col + width >= termwidth) { fprintf(outfile, "\\\n"); col = 0; } fwrite(s, 1, clen, outfile); col += width; } else if (wc != L'\0' && (c = wctob(wc)) != EOF && (p = strchr(escapes, c)) != NULL) { if (col + 2 >= termwidth) { fprintf(outfile, "\\\n"); col = 0; } fprintf(outfile, "\\%c", "\\abfrtv"[p - escapes]); col += 2; } else { if (col + 4 * clen >= (unsigned)termwidth) { fprintf(outfile, "\\\n"); col = 0; } for (i = 0; i < clen; i++) fprintf(outfile, "\\%03o", (int)(unsigned char)s[i]); col += 4 * clen; } s += clen; len -= clen; } if (col + 1 >= termwidth) fprintf(outfile, "\\\n"); (void)fputc('$', outfile); (void)fputc('\n', outfile); if (ferror(outfile)) errx(1, "%s: %s", outfname, strerror(errno ? errno : EIO)); } static int regexec_e(const regex_t *preg, const char *string, int eflags, int nomatch, size_t start, size_t stop) { int eval; if (preg == NULL) { if (defpreg == NULL) errx(1, "first RE may not be empty"); } else defpreg = preg; /* Set anchors */ match[0].rm_so = start; match[0].rm_eo = stop; eval = regexec(defpreg, string, nomatch ? 0 : maxnsub + 1, match, eflags | REG_STARTEND); switch(eval) { case 0: return (1); case REG_NOMATCH: return (0); } errx(1, "RE error: %s", strregerror(eval, defpreg)); /* NOTREACHED */ } /* * regsub - perform substitutions after a regexp match * Based on a routine by Henry Spencer */ static void regsub(SPACE *sp, const char *string, const char *src) { int len, no; char c, *dst; #define NEEDSP(reqlen) \ /* XXX What is the +1 for? */ \ if (sp->len + (reqlen) + 1 >= sp->blen) { \ sp->blen += (reqlen) + 1024; \ if ((sp->space = sp->back = realloc(sp->back, sp->blen)) \ == NULL) \ err(1, "realloc"); \ dst = sp->space + sp->len; \ } dst = sp->space + sp->len; while ((c = *src++) != '\0') { if (c == '&') no = 0; else if (c == '\\' && isdigit((unsigned char)*src)) no = *src++ - '0'; else no = -1; if (no < 0) { /* Ordinary character. */ if (c == '\\' && (*src == '\\' || *src == '&')) c = *src++; NEEDSP(1); *dst++ = c; ++sp->len; } else if (match[no].rm_so != -1 && match[no].rm_eo != -1) { len = match[no].rm_eo - match[no].rm_so; NEEDSP(len); memmove(dst, string + match[no].rm_so, len); dst += len; sp->len += len; } } NEEDSP(1); *dst = '\0'; } /* * cspace -- * Concatenate space: append the source space to the destination space, * allocating new space as necessary. */ void cspace(SPACE *sp, const char *p, size_t len, enum e_spflag spflag) { size_t tlen; /* Make sure SPACE has enough memory and ramp up quickly. */ tlen = sp->len + len + 1; if (tlen > sp->blen) { sp->blen = tlen + 1024; if ((sp->space = sp->back = realloc(sp->back, sp->blen)) == NULL) err(1, "realloc"); } if (spflag == REPLACE) sp->len = 0; memmove(sp->space + sp->len, p, len); sp->space[sp->len += len] = '\0'; } /* * Close all cached opened files and report any errors */ void cfclose(struct s_command *cp, const struct s_command *end) { for (; cp != end; cp = cp->next) switch(cp->code) { case 's': if (cp->u.s->wfd != -1 && close(cp->u.s->wfd)) err(1, "%s", cp->u.s->wfile); cp->u.s->wfd = -1; break; case 'w': if (cp->u.fd != -1 && close(cp->u.fd)) err(1, "%s", cp->t); cp->u.fd = -1; break; case '{': cfclose(cp->u.c, cp->next); break; } } Index: user/alc/PQ_LAUNDRY/usr.sbin/camdd/camdd.c =================================================================== --- user/alc/PQ_LAUNDRY/usr.sbin/camdd/camdd.c (revision 303205) +++ user/alc/PQ_LAUNDRY/usr.sbin/camdd/camdd.c (revision 303206) @@ -1,3423 +1,3424 @@ /*- * Copyright (c) 1997-2007 Kenneth D. Merry * Copyright (c) 2013, 2014, 2015 Spectra Logic Corporation * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions, and the following disclaimer, * without modification. * 2. Redistributions in binary form must reproduce at minimum a disclaimer * substantially similar to the "NO WARRANTY" disclaimer below * ("Disclaimer") and any redistribution must be conditioned upon * including a substantially similar Disclaimer requirement for further * binary redistribution. * * NO WARRANTY * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * HOLDERS OR CONTRIBUTORS BE LIABLE FOR SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGES. * * Authors: Ken Merry (Spectra Logic Corporation) */ /* * This is eventually intended to be: * - A basic data transfer/copy utility * - A simple benchmark utility * - An example of how to use the asynchronous pass(4) driver interface. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include typedef enum { CAMDD_CMD_NONE = 0x00000000, CAMDD_CMD_HELP = 0x00000001, CAMDD_CMD_WRITE = 0x00000002, CAMDD_CMD_READ = 0x00000003 } camdd_cmdmask; typedef enum { CAMDD_ARG_NONE = 0x00000000, CAMDD_ARG_VERBOSE = 0x00000001, CAMDD_ARG_DEVICE = 0x00000002, CAMDD_ARG_BUS = 0x00000004, CAMDD_ARG_TARGET = 0x00000008, CAMDD_ARG_LUN = 0x00000010, CAMDD_ARG_UNIT = 0x00000020, CAMDD_ARG_TIMEOUT = 0x00000040, CAMDD_ARG_ERR_RECOVER = 0x00000080, CAMDD_ARG_RETRIES = 0x00000100 } camdd_argmask; typedef enum { CAMDD_DEV_NONE = 0x00, CAMDD_DEV_PASS = 0x01, CAMDD_DEV_FILE = 0x02 } camdd_dev_type; struct camdd_io_opts { camdd_dev_type dev_type; char *dev_name; uint64_t blocksize; uint64_t queue_depth; uint64_t offset; int min_cmd_size; int write_dev; uint64_t debug; }; typedef enum { CAMDD_BUF_NONE, CAMDD_BUF_DATA, CAMDD_BUF_INDIRECT } camdd_buf_type; struct camdd_buf_indirect { /* * Pointer to the source buffer. */ struct camdd_buf *src_buf; /* * Offset into the source buffer, in bytes. */ uint64_t offset; /* * Pointer to the starting point in the source buffer. */ uint8_t *start_ptr; /* * Length of this chunk in bytes. */ size_t len; }; struct camdd_buf_data { /* * Buffer allocated when we allocate this camdd_buf. This should * be the size of the blocksize for this device. */ uint8_t *buf; /* * The amount of backing store allocated in buf. Generally this * will be the blocksize of the device. */ uint32_t alloc_len; /* * The amount of data that was put into the buffer (on reads) or * the amount of data we have put onto the src_list so far (on * writes). */ uint32_t fill_len; /* * The amount of data that was not transferred. */ uint32_t resid; /* * Starting byte offset on the reader. */ uint64_t src_start_offset; /* * CCB used for pass(4) device targets. */ union ccb ccb; /* * Number of scatter/gather segments. */ int sg_count; /* * Set if we had to tack on an extra buffer to round the transfer * up to a sector size. */ int extra_buf; /* * Scatter/gather list used generally when we're the writer for a * pass(4) device. */ bus_dma_segment_t *segs; /* * Scatter/gather list used generally when we're the writer for a * file or block device; */ struct iovec *iovec; }; union camdd_buf_types { struct camdd_buf_indirect indirect; struct camdd_buf_data data; }; typedef enum { CAMDD_STATUS_NONE, CAMDD_STATUS_OK, CAMDD_STATUS_SHORT_IO, CAMDD_STATUS_EOF, CAMDD_STATUS_ERROR } camdd_buf_status; struct camdd_buf { camdd_buf_type buf_type; union camdd_buf_types buf_type_spec; camdd_buf_status status; uint64_t lba; size_t len; /* * A reference count of how many indirect buffers point to this * buffer. */ int refcount; /* * A link back to our parent device. */ struct camdd_dev *dev; STAILQ_ENTRY(camdd_buf) links; STAILQ_ENTRY(camdd_buf) work_links; /* * A count of the buffers on the src_list. */ int src_count; /* * List of buffers from our partner thread that are the components * of this buffer for the I/O. Uses src_links. */ STAILQ_HEAD(,camdd_buf) src_list; STAILQ_ENTRY(camdd_buf) src_links; }; #define NUM_DEV_TYPES 2 struct camdd_dev_pass { int scsi_dev_type; struct cam_device *dev; uint64_t max_sector; uint32_t block_len; uint32_t cpi_maxio; }; typedef enum { CAMDD_FILE_NONE, CAMDD_FILE_REG, CAMDD_FILE_STD, CAMDD_FILE_PIPE, CAMDD_FILE_DISK, CAMDD_FILE_TAPE, CAMDD_FILE_TTY, CAMDD_FILE_MEM } camdd_file_type; typedef enum { CAMDD_FF_NONE = 0x00, CAMDD_FF_CAN_SEEK = 0x01 } camdd_file_flags; struct camdd_dev_file { int fd; struct stat sb; char filename[MAXPATHLEN + 1]; camdd_file_type file_type; camdd_file_flags file_flags; uint8_t *tmp_buf; }; struct camdd_dev_block { int fd; uint64_t size_bytes; uint32_t block_len; }; union camdd_dev_spec { struct camdd_dev_pass pass; struct camdd_dev_file file; struct camdd_dev_block block; }; typedef enum { CAMDD_DEV_FLAG_NONE = 0x00, CAMDD_DEV_FLAG_EOF = 0x01, CAMDD_DEV_FLAG_PEER_EOF = 0x02, CAMDD_DEV_FLAG_ACTIVE = 0x04, CAMDD_DEV_FLAG_EOF_SENT = 0x08, CAMDD_DEV_FLAG_EOF_QUEUED = 0x10 } camdd_dev_flags; struct camdd_dev { camdd_dev_type dev_type; union camdd_dev_spec dev_spec; camdd_dev_flags flags; char device_name[MAXPATHLEN+1]; uint32_t blocksize; uint32_t sector_size; uint64_t max_sector; uint64_t sector_io_limit; int min_cmd_size; int write_dev; int retry_count; int io_timeout; int debug; uint64_t start_offset_bytes; uint64_t next_io_pos_bytes; uint64_t next_peer_pos_bytes; uint64_t next_completion_pos_bytes; uint64_t peer_bytes_queued; uint64_t bytes_transferred; uint32_t target_queue_depth; uint32_t cur_active_io; uint8_t *extra_buf; uint32_t extra_buf_len; struct camdd_dev *peer_dev; pthread_mutex_t mutex; pthread_cond_t cond; int kq; int (*run)(struct camdd_dev *dev); int (*fetch)(struct camdd_dev *dev); /* * Buffers that are available for I/O. Uses links. */ STAILQ_HEAD(,camdd_buf) free_queue; /* * Free indirect buffers. These are used for breaking a large * buffer into multiple pieces. */ STAILQ_HEAD(,camdd_buf) free_indirect_queue; /* * Buffers that have been queued to the kernel. Uses links. */ STAILQ_HEAD(,camdd_buf) active_queue; /* * Will generally contain one of our buffers that is waiting for enough * I/O from our partner thread to be able to execute. This will * generally happen when our per-I/O-size is larger than the * partner thread's per-I/O-size. Uses links. */ STAILQ_HEAD(,camdd_buf) pending_queue; /* * Number of buffers on the pending queue */ int num_pending_queue; /* * Buffers that are filled and ready to execute. This is used when * our partner (reader) thread sends us blocks that are larger than * our blocksize, and so we have to split them into multiple pieces. */ STAILQ_HEAD(,camdd_buf) run_queue; /* * Number of buffers on the run queue. */ int num_run_queue; STAILQ_HEAD(,camdd_buf) reorder_queue; int num_reorder_queue; /* * Buffers that have been queued to us by our partner thread * (generally the reader thread) to be written out. Uses * work_links. */ STAILQ_HEAD(,camdd_buf) work_queue; /* * Buffers that have been completed by our partner thread. Uses * work_links. */ STAILQ_HEAD(,camdd_buf) peer_done_queue; /* * Number of buffers on the peer done queue. */ uint32_t num_peer_done_queue; /* * A list of buffers that we have queued to our peer thread. Uses * links. */ STAILQ_HEAD(,camdd_buf) peer_work_queue; /* * Number of buffers on the peer work queue. */ uint32_t num_peer_work_queue; }; static sem_t camdd_sem; static int need_exit = 0; static int error_exit = 0; static int need_status = 0; #ifndef min #define min(a, b) (a < b) ? a : b #endif /* * XXX KDM private copy of timespecsub(). This is normally defined in * sys/time.h, but is only enabled in the kernel. If that definition is * enabled in userland, it breaks the build of libnetbsd. */ #ifndef timespecsub #define timespecsub(vvp, uvp) \ do { \ (vvp)->tv_sec -= (uvp)->tv_sec; \ (vvp)->tv_nsec -= (uvp)->tv_nsec; \ if ((vvp)->tv_nsec < 0) { \ (vvp)->tv_sec--; \ (vvp)->tv_nsec += 1000000000; \ } \ } while (0) #endif /* Generically useful offsets into the peripheral private area */ #define ppriv_ptr0 periph_priv.entries[0].ptr #define ppriv_ptr1 periph_priv.entries[1].ptr #define ppriv_field0 periph_priv.entries[0].field #define ppriv_field1 periph_priv.entries[1].field #define ccb_buf ppriv_ptr0 #define CAMDD_FILE_DEFAULT_BLOCK 524288 #define CAMDD_FILE_DEFAULT_DEPTH 1 #define CAMDD_PASS_MAX_BLOCK 1048576 #define CAMDD_PASS_DEFAULT_DEPTH 6 #define CAMDD_PASS_RW_TIMEOUT 60 * 1000 static int parse_btl(char *tstr, int *bus, int *target, int *lun, camdd_argmask *arglst); void camdd_free_dev(struct camdd_dev *dev); struct camdd_dev *camdd_alloc_dev(camdd_dev_type dev_type, struct kevent *new_ke, int num_ke, int retry_count, int timeout); static struct camdd_buf *camdd_alloc_buf(struct camdd_dev *dev, camdd_buf_type buf_type); void camdd_release_buf(struct camdd_buf *buf); struct camdd_buf *camdd_get_buf(struct camdd_dev *dev, camdd_buf_type buf_type); int camdd_buf_sg_create(struct camdd_buf *buf, int iovec, uint32_t sector_size, uint32_t *num_sectors_used, int *double_buf_needed); uint32_t camdd_buf_get_len(struct camdd_buf *buf); void camdd_buf_add_child(struct camdd_buf *buf, struct camdd_buf *child_buf); int camdd_probe_tape(int fd, char *filename, uint64_t *max_iosize, uint64_t *max_blk, uint64_t *min_blk, uint64_t *blk_gran); struct camdd_dev *camdd_probe_file(int fd, struct camdd_io_opts *io_opts, int retry_count, int timeout); struct camdd_dev *camdd_probe_pass(struct cam_device *cam_dev, struct camdd_io_opts *io_opts, camdd_argmask arglist, int probe_retry_count, int probe_timeout, int io_retry_count, int io_timeout); void *camdd_file_worker(void *arg); camdd_buf_status camdd_ccb_status(union ccb *ccb); int camdd_queue_peer_buf(struct camdd_dev *dev, struct camdd_buf *buf); int camdd_complete_peer_buf(struct camdd_dev *dev, struct camdd_buf *peer_buf); void camdd_peer_done(struct camdd_buf *buf); void camdd_complete_buf(struct camdd_dev *dev, struct camdd_buf *buf, int *error_count); int camdd_pass_fetch(struct camdd_dev *dev); int camdd_file_run(struct camdd_dev *dev); int camdd_pass_run(struct camdd_dev *dev); int camdd_get_next_lba_len(struct camdd_dev *dev, uint64_t *lba, ssize_t *len); int camdd_queue(struct camdd_dev *dev, struct camdd_buf *read_buf); void camdd_get_depth(struct camdd_dev *dev, uint32_t *our_depth, uint32_t *peer_depth, uint32_t *our_bytes, uint32_t *peer_bytes); void *camdd_worker(void *arg); void camdd_sig_handler(int sig); void camdd_print_status(struct camdd_dev *camdd_dev, struct camdd_dev *other_dev, struct timespec *start_time); int camdd_rw(struct camdd_io_opts *io_opts, int num_io_opts, uint64_t max_io, int retry_count, int timeout); int camdd_parse_io_opts(char *args, int is_write, struct camdd_io_opts *io_opts); void usage(void); /* * Parse out a bus, or a bus, target and lun in the following * format: * bus * bus:target * bus:target:lun * * Returns the number of parsed components, or 0. */ static int parse_btl(char *tstr, int *bus, int *target, int *lun, camdd_argmask *arglst) { char *tmpstr; int convs = 0; while (isspace(*tstr) && (*tstr != '\0')) tstr++; tmpstr = (char *)strtok(tstr, ":"); if ((tmpstr != NULL) && (*tmpstr != '\0')) { *bus = strtol(tmpstr, NULL, 0); *arglst |= CAMDD_ARG_BUS; convs++; tmpstr = (char *)strtok(NULL, ":"); if ((tmpstr != NULL) && (*tmpstr != '\0')) { *target = strtol(tmpstr, NULL, 0); *arglst |= CAMDD_ARG_TARGET; convs++; tmpstr = (char *)strtok(NULL, ":"); if ((tmpstr != NULL) && (*tmpstr != '\0')) { *lun = strtol(tmpstr, NULL, 0); *arglst |= CAMDD_ARG_LUN; convs++; } } } return convs; } /* * XXX KDM clean up and free all of the buffers on the queue! */ void camdd_free_dev(struct camdd_dev *dev) { if (dev == NULL) return; switch (dev->dev_type) { case CAMDD_DEV_FILE: { struct camdd_dev_file *file_dev = &dev->dev_spec.file; if (file_dev->fd != -1) close(file_dev->fd); free(file_dev->tmp_buf); break; } case CAMDD_DEV_PASS: { struct camdd_dev_pass *pass_dev = &dev->dev_spec.pass; if (pass_dev->dev != NULL) cam_close_device(pass_dev->dev); break; } default: break; } free(dev); } struct camdd_dev * camdd_alloc_dev(camdd_dev_type dev_type, struct kevent *new_ke, int num_ke, int retry_count, int timeout) { struct camdd_dev *dev = NULL; struct kevent *ke; size_t ke_size; int retval = 0; dev = malloc(sizeof(*dev)); if (dev == NULL) { warn("%s: unable to malloc %zu bytes", __func__, sizeof(*dev)); goto bailout; } bzero(dev, sizeof(*dev)); dev->dev_type = dev_type; dev->io_timeout = timeout; dev->retry_count = retry_count; STAILQ_INIT(&dev->free_queue); STAILQ_INIT(&dev->free_indirect_queue); STAILQ_INIT(&dev->active_queue); STAILQ_INIT(&dev->pending_queue); STAILQ_INIT(&dev->run_queue); STAILQ_INIT(&dev->reorder_queue); STAILQ_INIT(&dev->work_queue); STAILQ_INIT(&dev->peer_done_queue); STAILQ_INIT(&dev->peer_work_queue); retval = pthread_mutex_init(&dev->mutex, NULL); if (retval != 0) { warnc(retval, "%s: failed to initialize mutex", __func__); goto bailout; } retval = pthread_cond_init(&dev->cond, NULL); if (retval != 0) { warnc(retval, "%s: failed to initialize condition variable", __func__); goto bailout; } dev->kq = kqueue(); if (dev->kq == -1) { warn("%s: Unable to create kqueue", __func__); goto bailout; } ke_size = sizeof(struct kevent) * (num_ke + 4); ke = malloc(ke_size); if (ke == NULL) { warn("%s: unable to malloc %zu bytes", __func__, ke_size); goto bailout; } bzero(ke, ke_size); if (num_ke > 0) bcopy(new_ke, ke, num_ke * sizeof(struct kevent)); EV_SET(&ke[num_ke++], (uintptr_t)&dev->work_queue, EVFILT_USER, EV_ADD|EV_ENABLE|EV_CLEAR, 0,0, 0); EV_SET(&ke[num_ke++], (uintptr_t)&dev->peer_done_queue, EVFILT_USER, EV_ADD|EV_ENABLE|EV_CLEAR, 0,0, 0); EV_SET(&ke[num_ke++], SIGINFO, EVFILT_SIGNAL, EV_ADD|EV_ENABLE, 0,0,0); EV_SET(&ke[num_ke++], SIGINT, EVFILT_SIGNAL, EV_ADD|EV_ENABLE, 0,0,0); retval = kevent(dev->kq, ke, num_ke, NULL, 0, NULL); if (retval == -1) { warn("%s: Unable to register kevents", __func__); goto bailout; } return (dev); bailout: free(dev); return (NULL); } static struct camdd_buf * camdd_alloc_buf(struct camdd_dev *dev, camdd_buf_type buf_type) { struct camdd_buf *buf = NULL; uint8_t *data_ptr = NULL; /* * We only need to allocate data space for data buffers. */ switch (buf_type) { case CAMDD_BUF_DATA: data_ptr = malloc(dev->blocksize); if (data_ptr == NULL) { warn("unable to allocate %u bytes", dev->blocksize); goto bailout_error; } break; default: break; } buf = malloc(sizeof(*buf)); if (buf == NULL) { warn("unable to allocate %zu bytes", sizeof(*buf)); goto bailout_error; } bzero(buf, sizeof(*buf)); buf->buf_type = buf_type; buf->dev = dev; switch (buf_type) { case CAMDD_BUF_DATA: { struct camdd_buf_data *data; data = &buf->buf_type_spec.data; data->alloc_len = dev->blocksize; data->buf = data_ptr; break; } case CAMDD_BUF_INDIRECT: break; default: break; } STAILQ_INIT(&buf->src_list); return (buf); bailout_error: if (data_ptr != NULL) free(data_ptr); if (buf != NULL) free(buf); return (NULL); } void camdd_release_buf(struct camdd_buf *buf) { struct camdd_dev *dev; dev = buf->dev; switch (buf->buf_type) { case CAMDD_BUF_DATA: { struct camdd_buf_data *data; data = &buf->buf_type_spec.data; if (data->segs != NULL) { if (data->extra_buf != 0) { void *extra_buf; extra_buf = (void *) data->segs[data->sg_count - 1].ds_addr; free(extra_buf); data->extra_buf = 0; } free(data->segs); data->segs = NULL; data->sg_count = 0; } else if (data->iovec != NULL) { if (data->extra_buf != 0) { free(data->iovec[data->sg_count - 1].iov_base); data->extra_buf = 0; } free(data->iovec); data->iovec = NULL; data->sg_count = 0; } STAILQ_INSERT_TAIL(&dev->free_queue, buf, links); break; } case CAMDD_BUF_INDIRECT: STAILQ_INSERT_TAIL(&dev->free_indirect_queue, buf, links); break; default: err(1, "%s: Invalid buffer type %d for released buffer", __func__, buf->buf_type); break; } } struct camdd_buf * camdd_get_buf(struct camdd_dev *dev, camdd_buf_type buf_type) { struct camdd_buf *buf = NULL; switch (buf_type) { case CAMDD_BUF_DATA: buf = STAILQ_FIRST(&dev->free_queue); if (buf != NULL) { struct camdd_buf_data *data; uint8_t *data_ptr; uint32_t alloc_len; STAILQ_REMOVE_HEAD(&dev->free_queue, links); data = &buf->buf_type_spec.data; data_ptr = data->buf; alloc_len = data->alloc_len; bzero(buf, sizeof(*buf)); data->buf = data_ptr; data->alloc_len = alloc_len; } break; case CAMDD_BUF_INDIRECT: buf = STAILQ_FIRST(&dev->free_indirect_queue); if (buf != NULL) { STAILQ_REMOVE_HEAD(&dev->free_indirect_queue, links); bzero(buf, sizeof(*buf)); } break; default: warnx("Unknown buffer type %d requested", buf_type); break; } if (buf == NULL) return (camdd_alloc_buf(dev, buf_type)); else { STAILQ_INIT(&buf->src_list); buf->dev = dev; buf->buf_type = buf_type; return (buf); } } int camdd_buf_sg_create(struct camdd_buf *buf, int iovec, uint32_t sector_size, uint32_t *num_sectors_used, int *double_buf_needed) { struct camdd_buf *tmp_buf; struct camdd_buf_data *data; uint8_t *extra_buf = NULL; size_t extra_buf_len = 0; int i, retval = 0; data = &buf->buf_type_spec.data; data->sg_count = buf->src_count; /* * Compose a scatter/gather list from all of the buffers in the list. * If the length of the buffer isn't a multiple of the sector size, * we'll have to add an extra buffer. This should only happen * at the end of a transfer. */ if ((data->fill_len % sector_size) != 0) { extra_buf_len = sector_size - (data->fill_len % sector_size); extra_buf = calloc(extra_buf_len, 1); if (extra_buf == NULL) { warn("%s: unable to allocate %zu bytes for extra " "buffer space", __func__, extra_buf_len); retval = 1; goto bailout; } data->extra_buf = 1; data->sg_count++; } if (iovec == 0) { data->segs = calloc(data->sg_count, sizeof(bus_dma_segment_t)); if (data->segs == NULL) { warn("%s: unable to allocate %zu bytes for S/G list", __func__, sizeof(bus_dma_segment_t) * data->sg_count); retval = 1; goto bailout; } } else { data->iovec = calloc(data->sg_count, sizeof(struct iovec)); if (data->iovec == NULL) { warn("%s: unable to allocate %zu bytes for S/G list", __func__, sizeof(struct iovec) * data->sg_count); retval = 1; goto bailout; } } for (i = 0, tmp_buf = STAILQ_FIRST(&buf->src_list); i < buf->src_count && tmp_buf != NULL; i++, tmp_buf = STAILQ_NEXT(tmp_buf, src_links)) { if (tmp_buf->buf_type == CAMDD_BUF_DATA) { struct camdd_buf_data *tmp_data; tmp_data = &tmp_buf->buf_type_spec.data; if (iovec == 0) { data->segs[i].ds_addr = (bus_addr_t) tmp_data->buf; data->segs[i].ds_len = tmp_data->fill_len - tmp_data->resid; } else { data->iovec[i].iov_base = tmp_data->buf; data->iovec[i].iov_len = tmp_data->fill_len - tmp_data->resid; } if (((tmp_data->fill_len - tmp_data->resid) % sector_size) != 0) *double_buf_needed = 1; } else { struct camdd_buf_indirect *tmp_ind; tmp_ind = &tmp_buf->buf_type_spec.indirect; if (iovec == 0) { data->segs[i].ds_addr = (bus_addr_t)tmp_ind->start_ptr; data->segs[i].ds_len = tmp_ind->len; } else { data->iovec[i].iov_base = tmp_ind->start_ptr; data->iovec[i].iov_len = tmp_ind->len; } if ((tmp_ind->len % sector_size) != 0) *double_buf_needed = 1; } } if (extra_buf != NULL) { if (iovec == 0) { data->segs[i].ds_addr = (bus_addr_t)extra_buf; data->segs[i].ds_len = extra_buf_len; } else { data->iovec[i].iov_base = extra_buf; data->iovec[i].iov_len = extra_buf_len; } i++; } if ((tmp_buf != NULL) || (i != data->sg_count)) { warnx("buffer source count does not match " "number of buffers in list!"); retval = 1; goto bailout; } bailout: if (retval == 0) { *num_sectors_used = (data->fill_len + extra_buf_len) / sector_size; } return (retval); } uint32_t camdd_buf_get_len(struct camdd_buf *buf) { uint32_t len = 0; if (buf->buf_type != CAMDD_BUF_DATA) { struct camdd_buf_indirect *indirect; indirect = &buf->buf_type_spec.indirect; len = indirect->len; } else { struct camdd_buf_data *data; data = &buf->buf_type_spec.data; len = data->fill_len; } return (len); } void camdd_buf_add_child(struct camdd_buf *buf, struct camdd_buf *child_buf) { struct camdd_buf_data *data; assert(buf->buf_type == CAMDD_BUF_DATA); data = &buf->buf_type_spec.data; STAILQ_INSERT_TAIL(&buf->src_list, child_buf, src_links); buf->src_count++; data->fill_len += camdd_buf_get_len(child_buf); } typedef enum { CAMDD_TS_MAX_BLK, CAMDD_TS_MIN_BLK, CAMDD_TS_BLK_GRAN, CAMDD_TS_EFF_IOSIZE } camdd_status_item_index; static struct camdd_status_items { const char *name; struct mt_status_entry *entry; } req_status_items[] = { { "max_blk", NULL }, { "min_blk", NULL }, { "blk_gran", NULL }, { "max_effective_iosize", NULL } }; int camdd_probe_tape(int fd, char *filename, uint64_t *max_iosize, uint64_t *max_blk, uint64_t *min_blk, uint64_t *blk_gran) { struct mt_status_data status_data; char *xml_str = NULL; unsigned int i; int retval = 0; retval = mt_get_xml_str(fd, MTIOCEXTGET, &xml_str); if (retval != 0) err(1, "Couldn't get XML string from %s", filename); retval = mt_get_status(xml_str, &status_data); if (retval != XML_STATUS_OK) { warn("couldn't get status for %s", filename); retval = 1; goto bailout; } else retval = 0; if (status_data.error != 0) { warnx("%s", status_data.error_str); retval = 1; goto bailout; } for (i = 0; i < sizeof(req_status_items) / sizeof(req_status_items[0]); i++) { char *name; name = __DECONST(char *, req_status_items[i].name); req_status_items[i].entry = mt_status_entry_find(&status_data, name); if (req_status_items[i].entry == NULL) { errx(1, "Cannot find status entry %s", req_status_items[i].name); } } *max_iosize = req_status_items[CAMDD_TS_EFF_IOSIZE].entry->value_unsigned; *max_blk= req_status_items[CAMDD_TS_MAX_BLK].entry->value_unsigned; *min_blk= req_status_items[CAMDD_TS_MIN_BLK].entry->value_unsigned; *blk_gran = req_status_items[CAMDD_TS_BLK_GRAN].entry->value_unsigned; bailout: free(xml_str); mt_status_free(&status_data); return (retval); } struct camdd_dev * camdd_probe_file(int fd, struct camdd_io_opts *io_opts, int retry_count, int timeout) { struct camdd_dev *dev = NULL; struct camdd_dev_file *file_dev; uint64_t blocksize = io_opts->blocksize; dev = camdd_alloc_dev(CAMDD_DEV_FILE, NULL, 0, retry_count, timeout); if (dev == NULL) goto bailout; file_dev = &dev->dev_spec.file; file_dev->fd = fd; strlcpy(file_dev->filename, io_opts->dev_name, sizeof(file_dev->filename)); strlcpy(dev->device_name, io_opts->dev_name, sizeof(dev->device_name)); if (blocksize == 0) dev->blocksize = CAMDD_FILE_DEFAULT_BLOCK; else dev->blocksize = blocksize; if ((io_opts->queue_depth != 0) && (io_opts->queue_depth != 1)) { warnx("Queue depth %ju for %s ignored, only 1 outstanding " "command supported", (uintmax_t)io_opts->queue_depth, io_opts->dev_name); } dev->target_queue_depth = CAMDD_FILE_DEFAULT_DEPTH; dev->run = camdd_file_run; dev->fetch = NULL; /* * We can effectively access files on byte boundaries. We'll reset * this for devices like disks that can be accessed on sector * boundaries. */ dev->sector_size = 1; if ((fd != STDIN_FILENO) && (fd != STDOUT_FILENO)) { int retval; retval = fstat(fd, &file_dev->sb); if (retval != 0) { warn("Cannot stat %s", dev->device_name); goto bailout_error; } if (S_ISREG(file_dev->sb.st_mode)) { file_dev->file_type = CAMDD_FILE_REG; } else if (S_ISCHR(file_dev->sb.st_mode)) { int type; if (ioctl(fd, FIODTYPE, &type) == -1) err(1, "FIODTYPE ioctl failed on %s", dev->device_name); else { if (type & D_TAPE) file_dev->file_type = CAMDD_FILE_TAPE; else if (type & D_DISK) file_dev->file_type = CAMDD_FILE_DISK; else if (type & D_MEM) file_dev->file_type = CAMDD_FILE_MEM; else if (type & D_TTY) file_dev->file_type = CAMDD_FILE_TTY; } } else if (S_ISDIR(file_dev->sb.st_mode)) { errx(1, "cannot operate on directory %s", dev->device_name); } else if (S_ISFIFO(file_dev->sb.st_mode)) { file_dev->file_type = CAMDD_FILE_PIPE; } else errx(1, "Cannot determine file type for %s", dev->device_name); switch (file_dev->file_type) { case CAMDD_FILE_REG: if (file_dev->sb.st_size != 0) dev->max_sector = file_dev->sb.st_size - 1; else dev->max_sector = 0; file_dev->file_flags |= CAMDD_FF_CAN_SEEK; break; case CAMDD_FILE_TAPE: { uint64_t max_iosize, max_blk, min_blk, blk_gran; /* * Check block limits and maximum effective iosize. * Make sure the blocksize is within the block * limits (and a multiple of the minimum blocksize) * and that the blocksize is <= maximum effective * iosize. */ retval = camdd_probe_tape(fd, dev->device_name, &max_iosize, &max_blk, &min_blk, &blk_gran); if (retval != 0) errx(1, "Unable to probe tape %s", dev->device_name); /* * The blocksize needs to be <= the maximum * effective I/O size of the tape device. Note * that this also takes into account the maximum * blocksize reported by READ BLOCK LIMITS. */ if (dev->blocksize > max_iosize) { warnx("Blocksize %u too big for %s, limiting " "to %ju", dev->blocksize, dev->device_name, max_iosize); dev->blocksize = max_iosize; } /* * The blocksize needs to be at least min_blk; */ if (dev->blocksize < min_blk) { warnx("Blocksize %u too small for %s, " "increasing to %ju", dev->blocksize, dev->device_name, min_blk); dev->blocksize = min_blk; } /* * And the blocksize needs to be a multiple of * the block granularity. */ if ((blk_gran != 0) && (dev->blocksize % (1 << blk_gran))) { warnx("Blocksize %u for %s not a multiple of " "%d, adjusting to %d", dev->blocksize, dev->device_name, (1 << blk_gran), dev->blocksize & ~((1 << blk_gran) - 1)); dev->blocksize &= ~((1 << blk_gran) - 1); } if (dev->blocksize == 0) { errx(1, "Unable to derive valid blocksize for " "%s", dev->device_name); } /* * For tape drives, set the sector size to the * blocksize so that we make sure not to write * less than the blocksize out to the drive. */ dev->sector_size = dev->blocksize; break; } case CAMDD_FILE_DISK: { off_t media_size; unsigned int sector_size; file_dev->file_flags |= CAMDD_FF_CAN_SEEK; if (ioctl(fd, DIOCGSECTORSIZE, §or_size) == -1) { err(1, "DIOCGSECTORSIZE ioctl failed on %s", dev->device_name); } if (sector_size == 0) { errx(1, "DIOCGSECTORSIZE ioctl returned " "invalid sector size %u for %s", sector_size, dev->device_name); } if (ioctl(fd, DIOCGMEDIASIZE, &media_size) == -1) { err(1, "DIOCGMEDIASIZE ioctl failed on %s", dev->device_name); } if (media_size == 0) { errx(1, "DIOCGMEDIASIZE ioctl returned " "invalid media size %ju for %s", (uintmax_t)media_size, dev->device_name); } if (dev->blocksize % sector_size) { errx(1, "%s blocksize %u not a multiple of " "sector size %u", dev->device_name, dev->blocksize, sector_size); } dev->sector_size = sector_size; dev->max_sector = (media_size / sector_size) - 1; break; } case CAMDD_FILE_MEM: file_dev->file_flags |= CAMDD_FF_CAN_SEEK; break; default: break; } } if ((io_opts->offset != 0) && ((file_dev->file_flags & CAMDD_FF_CAN_SEEK) == 0)) { warnx("Offset %ju specified for %s, but we cannot seek on %s", io_opts->offset, io_opts->dev_name, io_opts->dev_name); goto bailout_error; } #if 0 else if ((io_opts->offset != 0) && ((io_opts->offset % dev->sector_size) != 0)) { warnx("Offset %ju for %s is not a multiple of the " "sector size %u", io_opts->offset, io_opts->dev_name, dev->sector_size); goto bailout_error; } else { dev->start_offset_bytes = io_opts->offset; } #endif bailout: return (dev); bailout_error: camdd_free_dev(dev); return (NULL); } /* * Need to implement this. Do a basic probe: * - Check the inquiry data, make sure we're talking to a device that we * can reasonably expect to talk to -- direct, RBC, CD, WORM. * - Send a test unit ready, make sure the device is available. * - Get the capacity and block size. */ struct camdd_dev * camdd_probe_pass(struct cam_device *cam_dev, struct camdd_io_opts *io_opts, camdd_argmask arglist, int probe_retry_count, int probe_timeout, int io_retry_count, int io_timeout) { union ccb *ccb; uint64_t maxsector; uint32_t cpi_maxio, max_iosize, pass_numblocks; uint32_t block_len; struct scsi_read_capacity_data rcap; struct scsi_read_capacity_data_long rcaplong; struct camdd_dev *dev; struct camdd_dev_pass *pass_dev; struct kevent ke; int scsi_dev_type; dev = NULL; scsi_dev_type = SID_TYPE(&cam_dev->inq_data); maxsector = 0; block_len = 0; /* * For devices that support READ CAPACITY, we'll attempt to get the * capacity. Otherwise, we really don't support tape or other * devices via SCSI passthrough, so just return an error in that case. */ switch (scsi_dev_type) { case T_DIRECT: case T_WORM: case T_CDROM: case T_OPTICAL: case T_RBC: + case T_ZBC_HM: break; default: errx(1, "Unsupported SCSI device type %d", scsi_dev_type); break; /*NOTREACHED*/ } ccb = cam_getccb(cam_dev); if (ccb == NULL) { warnx("%s: error allocating ccb", __func__); goto bailout; } CCB_CLEAR_ALL_EXCEPT_HDR(&ccb->csio); scsi_read_capacity(&ccb->csio, /*retries*/ probe_retry_count, /*cbfcnp*/ NULL, /*tag_action*/ MSG_SIMPLE_Q_TAG, &rcap, SSD_FULL_SIZE, /*timeout*/ probe_timeout ? probe_timeout : 5000); /* Disable freezing the device queue */ ccb->ccb_h.flags |= CAM_DEV_QFRZDIS; if (arglist & CAMDD_ARG_ERR_RECOVER) ccb->ccb_h.flags |= CAM_PASS_ERR_RECOVER; if (cam_send_ccb(cam_dev, ccb) < 0) { warn("error sending READ CAPACITY command"); cam_error_print(cam_dev, ccb, CAM_ESF_ALL, CAM_EPF_ALL, stderr); goto bailout; } if ((ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) { cam_error_print(cam_dev, ccb, CAM_ESF_ALL, CAM_EPF_ALL, stderr); goto bailout; } maxsector = scsi_4btoul(rcap.addr); block_len = scsi_4btoul(rcap.length); /* * A last block of 2^32-1 means that the true capacity is over 2TB, * and we need to issue the long READ CAPACITY to get the real * capacity. Otherwise, we're all set. */ if (maxsector != 0xffffffff) goto rcap_done; scsi_read_capacity_16(&ccb->csio, /*retries*/ probe_retry_count, /*cbfcnp*/ NULL, /*tag_action*/ MSG_SIMPLE_Q_TAG, /*lba*/ 0, /*reladdr*/ 0, /*pmi*/ 0, (uint8_t *)&rcaplong, sizeof(rcaplong), /*sense_len*/ SSD_FULL_SIZE, /*timeout*/ probe_timeout ? probe_timeout : 5000); /* Disable freezing the device queue */ ccb->ccb_h.flags |= CAM_DEV_QFRZDIS; if (arglist & CAMDD_ARG_ERR_RECOVER) ccb->ccb_h.flags |= CAM_PASS_ERR_RECOVER; if (cam_send_ccb(cam_dev, ccb) < 0) { warn("error sending READ CAPACITY (16) command"); cam_error_print(cam_dev, ccb, CAM_ESF_ALL, CAM_EPF_ALL, stderr); goto bailout; } if ((ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) { cam_error_print(cam_dev, ccb, CAM_ESF_ALL, CAM_EPF_ALL, stderr); goto bailout; } maxsector = scsi_8btou64(rcaplong.addr); block_len = scsi_4btoul(rcaplong.length); rcap_done: if (block_len == 0) { warnx("Sector size for %s%u is 0, cannot continue", cam_dev->device_name, cam_dev->dev_unit_num); goto bailout_error; } CCB_CLEAR_ALL_EXCEPT_HDR(&ccb->cpi); ccb->ccb_h.func_code = XPT_PATH_INQ; ccb->ccb_h.flags = CAM_DIR_NONE; ccb->ccb_h.retry_count = 1; if (cam_send_ccb(cam_dev, ccb) < 0) { warn("error sending XPT_PATH_INQ CCB"); cam_error_print(cam_dev, ccb, CAM_ESF_ALL, CAM_EPF_ALL, stderr); goto bailout; } EV_SET(&ke, cam_dev->fd, EVFILT_READ, EV_ADD|EV_ENABLE, 0, 0, 0); dev = camdd_alloc_dev(CAMDD_DEV_PASS, &ke, 1, io_retry_count, io_timeout); if (dev == NULL) goto bailout; pass_dev = &dev->dev_spec.pass; pass_dev->scsi_dev_type = scsi_dev_type; pass_dev->dev = cam_dev; pass_dev->max_sector = maxsector; pass_dev->block_len = block_len; pass_dev->cpi_maxio = ccb->cpi.maxio; snprintf(dev->device_name, sizeof(dev->device_name), "%s%u", pass_dev->dev->device_name, pass_dev->dev->dev_unit_num); dev->sector_size = block_len; dev->max_sector = maxsector; /* * Determine the optimal blocksize to use for this device. */ /* * If the controller has not specified a maximum I/O size, * just go with 128K as a somewhat conservative value. */ if (pass_dev->cpi_maxio == 0) cpi_maxio = 131072; else cpi_maxio = pass_dev->cpi_maxio; /* * If the controller has a large maximum I/O size, limit it * to something smaller so that the kernel doesn't have trouble * allocating buffers to copy data in and out for us. * XXX KDM this is until we have unmapped I/O support in the kernel. */ max_iosize = min(cpi_maxio, CAMDD_PASS_MAX_BLOCK); /* * If we weren't able to get a block size for some reason, * default to 512 bytes. */ block_len = pass_dev->block_len; if (block_len == 0) block_len = 512; /* * Figure out how many blocksize chunks will fit in the * maximum I/O size. */ pass_numblocks = max_iosize / block_len; /* * And finally, multiple the number of blocks by the LBA * length to get our maximum block size; */ dev->blocksize = pass_numblocks * block_len; if (io_opts->blocksize != 0) { if ((io_opts->blocksize % dev->sector_size) != 0) { warnx("Blocksize %ju for %s is not a multiple of " "sector size %u", (uintmax_t)io_opts->blocksize, dev->device_name, dev->sector_size); goto bailout_error; } dev->blocksize = io_opts->blocksize; } dev->target_queue_depth = CAMDD_PASS_DEFAULT_DEPTH; if (io_opts->queue_depth != 0) dev->target_queue_depth = io_opts->queue_depth; if (io_opts->offset != 0) { if (io_opts->offset > (dev->max_sector * dev->sector_size)) { warnx("Offset %ju is past the end of device %s", io_opts->offset, dev->device_name); goto bailout_error; } #if 0 else if ((io_opts->offset % dev->sector_size) != 0) { warnx("Offset %ju for %s is not a multiple of the " "sector size %u", io_opts->offset, dev->device_name, dev->sector_size); goto bailout_error; } dev->start_offset_bytes = io_opts->offset; #endif } dev->min_cmd_size = io_opts->min_cmd_size; dev->run = camdd_pass_run; dev->fetch = camdd_pass_fetch; bailout: cam_freeccb(ccb); return (dev); bailout_error: cam_freeccb(ccb); camdd_free_dev(dev); return (NULL); } void * camdd_worker(void *arg) { struct camdd_dev *dev = arg; struct camdd_buf *buf; struct timespec ts, *kq_ts; ts.tv_sec = 0; ts.tv_nsec = 0; pthread_mutex_lock(&dev->mutex); dev->flags |= CAMDD_DEV_FLAG_ACTIVE; for (;;) { struct kevent ke; int retval = 0; /* * XXX KDM check the reorder queue depth? */ if (dev->write_dev == 0) { uint32_t our_depth, peer_depth, peer_bytes, our_bytes; uint32_t target_depth = dev->target_queue_depth; uint32_t peer_target_depth = dev->peer_dev->target_queue_depth; uint32_t peer_blocksize = dev->peer_dev->blocksize; camdd_get_depth(dev, &our_depth, &peer_depth, &our_bytes, &peer_bytes); #if 0 while (((our_depth < target_depth) && (peer_depth < peer_target_depth)) || ((peer_bytes + our_bytes) < (peer_blocksize * 2))) { #endif while (((our_depth + peer_depth) < (target_depth + peer_target_depth)) || ((peer_bytes + our_bytes) < (peer_blocksize * 3))) { retval = camdd_queue(dev, NULL); if (retval == 1) break; else if (retval != 0) { error_exit = 1; goto bailout; } camdd_get_depth(dev, &our_depth, &peer_depth, &our_bytes, &peer_bytes); } } /* * See if we have any I/O that is ready to execute. */ buf = STAILQ_FIRST(&dev->run_queue); if (buf != NULL) { while (dev->target_queue_depth > dev->cur_active_io) { retval = dev->run(dev); if (retval == -1) { dev->flags |= CAMDD_DEV_FLAG_EOF; error_exit = 1; break; } else if (retval != 0) { break; } } } /* * We've reached EOF, or our partner has reached EOF. */ if ((dev->flags & CAMDD_DEV_FLAG_EOF) || (dev->flags & CAMDD_DEV_FLAG_PEER_EOF)) { if (dev->write_dev != 0) { if ((STAILQ_EMPTY(&dev->work_queue)) && (dev->num_run_queue == 0) && (dev->cur_active_io == 0)) { goto bailout; } } else { /* * If we're the reader, and the writer * got EOF, he is already done. If we got * the EOF, then we need to wait until * everything is flushed out for the writer. */ if (dev->flags & CAMDD_DEV_FLAG_PEER_EOF) { goto bailout; } else if ((dev->num_peer_work_queue == 0) && (dev->num_peer_done_queue == 0) && (dev->cur_active_io == 0) && (dev->num_run_queue == 0)) { goto bailout; } } /* * XXX KDM need to do something about the pending * queue and cleanup resources. */ } if ((dev->write_dev == 0) && (dev->cur_active_io == 0) && (dev->peer_bytes_queued < dev->peer_dev->blocksize)) kq_ts = &ts; else kq_ts = NULL; /* * Run kevent to see if there are events to process. */ pthread_mutex_unlock(&dev->mutex); retval = kevent(dev->kq, NULL, 0, &ke, 1, kq_ts); pthread_mutex_lock(&dev->mutex); if (retval == -1) { warn("%s: error returned from kevent",__func__); goto bailout; } else if (retval != 0) { switch (ke.filter) { case EVFILT_READ: if (dev->fetch != NULL) { retval = dev->fetch(dev); if (retval == -1) { error_exit = 1; goto bailout; } } break; case EVFILT_SIGNAL: /* * We register for this so we don't get * an error as a result of a SIGINFO or a * SIGINT. It will actually get handled * by the signal handler. If we get a * SIGINT, bail out without printing an * error message. Any other signals * will result in the error message above. */ if (ke.ident == SIGINT) goto bailout; break; case EVFILT_USER: retval = 0; /* * Check to see if the other thread has * queued any I/O for us to do. (In this * case we're the writer.) */ for (buf = STAILQ_FIRST(&dev->work_queue); buf != NULL; buf = STAILQ_FIRST(&dev->work_queue)) { STAILQ_REMOVE_HEAD(&dev->work_queue, work_links); retval = camdd_queue(dev, buf); /* * We keep going unless we get an * actual error. If we get EOF, we * still want to remove the buffers * from the queue and send the back * to the reader thread. */ if (retval == -1) { error_exit = 1; goto bailout; } else retval = 0; } /* * Next check to see if the other thread has * queued any completed buffers back to us. * (In this case we're the reader.) */ for (buf = STAILQ_FIRST(&dev->peer_done_queue); buf != NULL; buf = STAILQ_FIRST(&dev->peer_done_queue)){ STAILQ_REMOVE_HEAD( &dev->peer_done_queue, work_links); dev->num_peer_done_queue--; camdd_peer_done(buf); } break; default: warnx("%s: unknown kevent filter %d", __func__, ke.filter); break; } } } bailout: dev->flags &= ~CAMDD_DEV_FLAG_ACTIVE; /* XXX KDM cleanup resources here? */ pthread_mutex_unlock(&dev->mutex); need_exit = 1; sem_post(&camdd_sem); return (NULL); } /* * Simplistic translation of CCB status to our local status. */ camdd_buf_status camdd_ccb_status(union ccb *ccb) { camdd_buf_status status = CAMDD_STATUS_NONE; cam_status ccb_status; ccb_status = ccb->ccb_h.status & CAM_STATUS_MASK; switch (ccb_status) { case CAM_REQ_CMP: { if (ccb->csio.resid == 0) { status = CAMDD_STATUS_OK; } else if (ccb->csio.dxfer_len > ccb->csio.resid) { status = CAMDD_STATUS_SHORT_IO; } else { status = CAMDD_STATUS_EOF; } break; } case CAM_SCSI_STATUS_ERROR: { switch (ccb->csio.scsi_status) { case SCSI_STATUS_OK: case SCSI_STATUS_COND_MET: case SCSI_STATUS_INTERMED: case SCSI_STATUS_INTERMED_COND_MET: status = CAMDD_STATUS_OK; break; case SCSI_STATUS_CMD_TERMINATED: case SCSI_STATUS_CHECK_COND: case SCSI_STATUS_QUEUE_FULL: case SCSI_STATUS_BUSY: case SCSI_STATUS_RESERV_CONFLICT: default: status = CAMDD_STATUS_ERROR; break; } break; } default: status = CAMDD_STATUS_ERROR; break; } return (status); } /* * Queue a buffer to our peer's work thread for writing. * * Returns 0 for success, -1 for failure, 1 if the other thread exited. */ int camdd_queue_peer_buf(struct camdd_dev *dev, struct camdd_buf *buf) { struct kevent ke; STAILQ_HEAD(, camdd_buf) local_queue; struct camdd_buf *buf1, *buf2; struct camdd_buf_data *data = NULL; uint64_t peer_bytes_queued = 0; int active = 1; int retval = 0; STAILQ_INIT(&local_queue); /* * Since we're the reader, we need to queue our I/O to the writer * in sequential order in order to make sure it gets written out * in sequential order. * * Check the next expected I/O starting offset. If this doesn't * match, put it on the reorder queue. */ if ((buf->lba * dev->sector_size) != dev->next_completion_pos_bytes) { /* * If there is nothing on the queue, there is no sorting * needed. */ if (STAILQ_EMPTY(&dev->reorder_queue)) { STAILQ_INSERT_TAIL(&dev->reorder_queue, buf, links); dev->num_reorder_queue++; goto bailout; } /* * Sort in ascending order by starting LBA. There should * be no identical LBAs. */ for (buf1 = STAILQ_FIRST(&dev->reorder_queue); buf1 != NULL; buf1 = buf2) { buf2 = STAILQ_NEXT(buf1, links); if (buf->lba < buf1->lba) { /* * If we're less than the first one, then * we insert at the head of the list * because this has to be the first element * on the list. */ STAILQ_INSERT_HEAD(&dev->reorder_queue, buf, links); dev->num_reorder_queue++; break; } else if (buf->lba > buf1->lba) { if (buf2 == NULL) { STAILQ_INSERT_TAIL(&dev->reorder_queue, buf, links); dev->num_reorder_queue++; break; } else if (buf->lba < buf2->lba) { STAILQ_INSERT_AFTER(&dev->reorder_queue, buf1, buf, links); dev->num_reorder_queue++; break; } } else { errx(1, "Found buffers with duplicate LBA %ju!", buf->lba); } } goto bailout; } else { /* * We're the next expected I/O completion, so put ourselves * on the local queue to be sent to the writer. We use * work_links here so that we can queue this to the * peer_work_queue before taking the buffer off of the * local_queue. */ dev->next_completion_pos_bytes += buf->len; STAILQ_INSERT_TAIL(&local_queue, buf, work_links); /* * Go through the reorder queue looking for more sequential * I/O and add it to the local queue. */ for (buf1 = STAILQ_FIRST(&dev->reorder_queue); buf1 != NULL; buf1 = STAILQ_FIRST(&dev->reorder_queue)) { /* * As soon as we see an I/O that is out of sequence, * we're done. */ if ((buf1->lba * dev->sector_size) != dev->next_completion_pos_bytes) break; STAILQ_REMOVE_HEAD(&dev->reorder_queue, links); dev->num_reorder_queue--; STAILQ_INSERT_TAIL(&local_queue, buf1, work_links); dev->next_completion_pos_bytes += buf1->len; } } /* * Setup the event to let the other thread know that it has work * pending. */ EV_SET(&ke, (uintptr_t)&dev->peer_dev->work_queue, EVFILT_USER, 0, NOTE_TRIGGER, 0, NULL); /* * Put this on our shadow queue so that we know what we've queued * to the other thread. */ STAILQ_FOREACH_SAFE(buf1, &local_queue, work_links, buf2) { if (buf1->buf_type != CAMDD_BUF_DATA) { errx(1, "%s: should have a data buffer, not an " "indirect buffer", __func__); } data = &buf1->buf_type_spec.data; /* * We only need to send one EOF to the writer, and don't * need to continue sending EOFs after that. */ if (buf1->status == CAMDD_STATUS_EOF) { if (dev->flags & CAMDD_DEV_FLAG_EOF_SENT) { STAILQ_REMOVE(&local_queue, buf1, camdd_buf, work_links); camdd_release_buf(buf1); retval = 1; continue; } dev->flags |= CAMDD_DEV_FLAG_EOF_SENT; } STAILQ_INSERT_TAIL(&dev->peer_work_queue, buf1, links); peer_bytes_queued += (data->fill_len - data->resid); dev->peer_bytes_queued += (data->fill_len - data->resid); dev->num_peer_work_queue++; } if (STAILQ_FIRST(&local_queue) == NULL) goto bailout; /* * Drop our mutex and pick up the other thread's mutex. We need to * do this to avoid deadlocks. */ pthread_mutex_unlock(&dev->mutex); pthread_mutex_lock(&dev->peer_dev->mutex); if (dev->peer_dev->flags & CAMDD_DEV_FLAG_ACTIVE) { /* * Put the buffers on the other thread's incoming work queue. */ for (buf1 = STAILQ_FIRST(&local_queue); buf1 != NULL; buf1 = STAILQ_FIRST(&local_queue)) { STAILQ_REMOVE_HEAD(&local_queue, work_links); STAILQ_INSERT_TAIL(&dev->peer_dev->work_queue, buf1, work_links); } /* * Send an event to the other thread's kqueue to let it know * that there is something on the work queue. */ retval = kevent(dev->peer_dev->kq, &ke, 1, NULL, 0, NULL); if (retval == -1) warn("%s: unable to add peer work_queue kevent", __func__); else retval = 0; } else active = 0; pthread_mutex_unlock(&dev->peer_dev->mutex); pthread_mutex_lock(&dev->mutex); /* * If the other side isn't active, run through the queue and * release all of the buffers. */ if (active == 0) { for (buf1 = STAILQ_FIRST(&local_queue); buf1 != NULL; buf1 = STAILQ_FIRST(&local_queue)) { STAILQ_REMOVE_HEAD(&local_queue, work_links); STAILQ_REMOVE(&dev->peer_work_queue, buf1, camdd_buf, links); dev->num_peer_work_queue--; camdd_release_buf(buf1); } dev->peer_bytes_queued -= peer_bytes_queued; retval = 1; } bailout: return (retval); } /* * Return a buffer to the reader thread when we have completed writing it. */ int camdd_complete_peer_buf(struct camdd_dev *dev, struct camdd_buf *peer_buf) { struct kevent ke; int retval = 0; /* * Setup the event to let the other thread know that we have * completed a buffer. */ EV_SET(&ke, (uintptr_t)&dev->peer_dev->peer_done_queue, EVFILT_USER, 0, NOTE_TRIGGER, 0, NULL); /* * Drop our lock and acquire the other thread's lock before * manipulating */ pthread_mutex_unlock(&dev->mutex); pthread_mutex_lock(&dev->peer_dev->mutex); /* * Put the buffer on the reader thread's peer done queue now that * we have completed it. */ STAILQ_INSERT_TAIL(&dev->peer_dev->peer_done_queue, peer_buf, work_links); dev->peer_dev->num_peer_done_queue++; /* * Send an event to the peer thread to let it know that we've added * something to its peer done queue. */ retval = kevent(dev->peer_dev->kq, &ke, 1, NULL, 0, NULL); if (retval == -1) warn("%s: unable to add peer_done_queue kevent", __func__); else retval = 0; /* * Drop the other thread's lock and reacquire ours. */ pthread_mutex_unlock(&dev->peer_dev->mutex); pthread_mutex_lock(&dev->mutex); return (retval); } /* * Free a buffer that was written out by the writer thread and returned to * the reader thread. */ void camdd_peer_done(struct camdd_buf *buf) { struct camdd_dev *dev; struct camdd_buf_data *data; dev = buf->dev; if (buf->buf_type != CAMDD_BUF_DATA) { errx(1, "%s: should have a data buffer, not an " "indirect buffer", __func__); } data = &buf->buf_type_spec.data; STAILQ_REMOVE(&dev->peer_work_queue, buf, camdd_buf, links); dev->num_peer_work_queue--; dev->peer_bytes_queued -= (data->fill_len - data->resid); if (buf->status == CAMDD_STATUS_EOF) dev->flags |= CAMDD_DEV_FLAG_PEER_EOF; STAILQ_INSERT_TAIL(&dev->free_queue, buf, links); } /* * Assumes caller holds the lock for this device. */ void camdd_complete_buf(struct camdd_dev *dev, struct camdd_buf *buf, int *error_count) { int retval = 0; /* * If we're the reader, we need to send the completed I/O * to the writer. If we're the writer, we need to just * free up resources, or let the reader know if we've * encountered an error. */ if (dev->write_dev == 0) { retval = camdd_queue_peer_buf(dev, buf); if (retval != 0) (*error_count)++; } else { struct camdd_buf *tmp_buf, *next_buf; STAILQ_FOREACH_SAFE(tmp_buf, &buf->src_list, src_links, next_buf) { struct camdd_buf *src_buf; struct camdd_buf_indirect *indirect; STAILQ_REMOVE(&buf->src_list, tmp_buf, camdd_buf, src_links); tmp_buf->status = buf->status; if (tmp_buf->buf_type == CAMDD_BUF_DATA) { camdd_complete_peer_buf(dev, tmp_buf); continue; } indirect = &tmp_buf->buf_type_spec.indirect; src_buf = indirect->src_buf; src_buf->refcount--; /* * XXX KDM we probably need to account for * exactly how many bytes we were able to * write. Allocate the residual to the * first N buffers? Or just track the * number of bytes written? Right now the reader * doesn't do anything with a residual. */ src_buf->status = buf->status; if (src_buf->refcount <= 0) camdd_complete_peer_buf(dev, src_buf); STAILQ_INSERT_TAIL(&dev->free_indirect_queue, tmp_buf, links); } STAILQ_INSERT_TAIL(&dev->free_queue, buf, links); } } /* * Fetch all completed commands from the pass(4) device. * * Returns the number of commands received, or -1 if any of the commands * completed with an error. Returns 0 if no commands are available. */ int camdd_pass_fetch(struct camdd_dev *dev) { struct camdd_dev_pass *pass_dev = &dev->dev_spec.pass; union ccb ccb; int retval = 0, num_fetched = 0, error_count = 0; pthread_mutex_unlock(&dev->mutex); /* * XXX KDM we don't distinguish between EFAULT and ENOENT. */ while ((retval = ioctl(pass_dev->dev->fd, CAMIOGET, &ccb)) != -1) { struct camdd_buf *buf; struct camdd_buf_data *data; cam_status ccb_status; union ccb *buf_ccb; buf = ccb.ccb_h.ccb_buf; data = &buf->buf_type_spec.data; buf_ccb = &data->ccb; num_fetched++; /* * Copy the CCB back out so we get status, sense data, etc. */ bcopy(&ccb, buf_ccb, sizeof(ccb)); pthread_mutex_lock(&dev->mutex); /* * We're now done, so take this off the active queue. */ STAILQ_REMOVE(&dev->active_queue, buf, camdd_buf, links); dev->cur_active_io--; ccb_status = ccb.ccb_h.status & CAM_STATUS_MASK; if (ccb_status != CAM_REQ_CMP) { cam_error_print(pass_dev->dev, &ccb, CAM_ESF_ALL, CAM_EPF_ALL, stderr); } data->resid = ccb.csio.resid; dev->bytes_transferred += (ccb.csio.dxfer_len - ccb.csio.resid); if (buf->status == CAMDD_STATUS_NONE) buf->status = camdd_ccb_status(&ccb); if (buf->status == CAMDD_STATUS_ERROR) error_count++; else if (buf->status == CAMDD_STATUS_EOF) { /* * Once we queue this buffer to our partner thread, * he will know that we've hit EOF. */ dev->flags |= CAMDD_DEV_FLAG_EOF; } camdd_complete_buf(dev, buf, &error_count); /* * Unlock in preparation for the ioctl call. */ pthread_mutex_unlock(&dev->mutex); } pthread_mutex_lock(&dev->mutex); if (error_count > 0) return (-1); else return (num_fetched); } /* * Returns -1 for error, 0 for success/continue, and 1 for resource * shortage/stop processing. */ int camdd_file_run(struct camdd_dev *dev) { struct camdd_dev_file *file_dev = &dev->dev_spec.file; struct camdd_buf_data *data; struct camdd_buf *buf; off_t io_offset; int retval = 0, write_dev = dev->write_dev; int error_count = 0, no_resources = 0, double_buf_needed = 0; uint32_t num_sectors = 0, db_len = 0; buf = STAILQ_FIRST(&dev->run_queue); if (buf == NULL) { no_resources = 1; goto bailout; } else if ((dev->write_dev == 0) && (dev->flags & (CAMDD_DEV_FLAG_EOF | CAMDD_DEV_FLAG_EOF_SENT))) { STAILQ_REMOVE(&dev->run_queue, buf, camdd_buf, links); dev->num_run_queue--; buf->status = CAMDD_STATUS_EOF; error_count++; goto bailout; } /* * If we're writing, we need to go through the source buffer list * and create an S/G list. */ if (write_dev != 0) { retval = camdd_buf_sg_create(buf, /*iovec*/ 1, dev->sector_size, &num_sectors, &double_buf_needed); if (retval != 0) { no_resources = 1; goto bailout; } } STAILQ_REMOVE(&dev->run_queue, buf, camdd_buf, links); dev->num_run_queue--; data = &buf->buf_type_spec.data; /* * pread(2) and pwrite(2) offsets are byte offsets. */ io_offset = buf->lba * dev->sector_size; /* * Unlock the mutex while we read or write. */ pthread_mutex_unlock(&dev->mutex); /* * Note that we don't need to double buffer if we're the reader * because in that case, we have allocated a single buffer of * sufficient size to do the read. This copy is necessary on * writes because if one of the components of the S/G list is not * a sector size multiple, the kernel will reject the write. This * is unfortunate but not surprising. So this will make sure that * we're using a single buffer that is a multiple of the sector size. */ if ((double_buf_needed != 0) && (data->sg_count > 1) && (write_dev != 0)) { uint32_t cur_offset; int i; if (file_dev->tmp_buf == NULL) file_dev->tmp_buf = calloc(dev->blocksize, 1); if (file_dev->tmp_buf == NULL) { buf->status = CAMDD_STATUS_ERROR; error_count++; goto bailout; } for (i = 0, cur_offset = 0; i < data->sg_count; i++) { bcopy(data->iovec[i].iov_base, &file_dev->tmp_buf[cur_offset], data->iovec[i].iov_len); cur_offset += data->iovec[i].iov_len; } db_len = cur_offset; } if (file_dev->file_flags & CAMDD_FF_CAN_SEEK) { if (write_dev == 0) { /* * XXX KDM is there any way we would need a S/G * list here? */ retval = pread(file_dev->fd, data->buf, buf->len, io_offset); } else { if (double_buf_needed != 0) { retval = pwrite(file_dev->fd, file_dev->tmp_buf, db_len, io_offset); } else if (data->sg_count == 0) { retval = pwrite(file_dev->fd, data->buf, data->fill_len, io_offset); } else { retval = pwritev(file_dev->fd, data->iovec, data->sg_count, io_offset); } } } else { if (write_dev == 0) { /* * XXX KDM is there any way we would need a S/G * list here? */ retval = read(file_dev->fd, data->buf, buf->len); } else { if (double_buf_needed != 0) { retval = write(file_dev->fd, file_dev->tmp_buf, db_len); } else if (data->sg_count == 0) { retval = write(file_dev->fd, data->buf, data->fill_len); } else { retval = writev(file_dev->fd, data->iovec, data->sg_count); } } } /* We're done, re-acquire the lock */ pthread_mutex_lock(&dev->mutex); if (retval >= (ssize_t)data->fill_len) { /* * If the bytes transferred is more than the request size, * that indicates an overrun, which should only happen at * the end of a transfer if we have to round up to a sector * boundary. */ if (buf->status == CAMDD_STATUS_NONE) buf->status = CAMDD_STATUS_OK; data->resid = 0; dev->bytes_transferred += retval; } else if (retval == -1) { warn("Error %s %s", (write_dev) ? "writing to" : "reading from", file_dev->filename); buf->status = CAMDD_STATUS_ERROR; data->resid = data->fill_len; error_count++; if (dev->debug == 0) goto bailout; if ((double_buf_needed != 0) && (write_dev != 0)) { fprintf(stderr, "%s: fd %d, DB buf %p, len %u lba %ju " "offset %ju\n", __func__, file_dev->fd, file_dev->tmp_buf, db_len, (uintmax_t)buf->lba, (uintmax_t)io_offset); } else if (data->sg_count == 0) { fprintf(stderr, "%s: fd %d, buf %p, len %u, lba %ju " "offset %ju\n", __func__, file_dev->fd, data->buf, data->fill_len, (uintmax_t)buf->lba, (uintmax_t)io_offset); } else { int i; fprintf(stderr, "%s: fd %d, len %u, lba %ju " "offset %ju\n", __func__, file_dev->fd, data->fill_len, (uintmax_t)buf->lba, (uintmax_t)io_offset); for (i = 0; i < data->sg_count; i++) { fprintf(stderr, "index %d ptr %p len %zu\n", i, data->iovec[i].iov_base, data->iovec[i].iov_len); } } } else if (retval == 0) { buf->status = CAMDD_STATUS_EOF; if (dev->debug != 0) printf("%s: got EOF from %s!\n", __func__, file_dev->filename); data->resid = data->fill_len; error_count++; } else if (retval < (ssize_t)data->fill_len) { if (buf->status == CAMDD_STATUS_NONE) buf->status = CAMDD_STATUS_SHORT_IO; data->resid = data->fill_len - retval; dev->bytes_transferred += retval; } bailout: if (buf != NULL) { if (buf->status == CAMDD_STATUS_EOF) { struct camdd_buf *buf2; dev->flags |= CAMDD_DEV_FLAG_EOF; STAILQ_FOREACH(buf2, &dev->run_queue, links) buf2->status = CAMDD_STATUS_EOF; } camdd_complete_buf(dev, buf, &error_count); } if (error_count != 0) return (-1); else if (no_resources != 0) return (1); else return (0); } /* * Execute one command from the run queue. Returns 0 for success, 1 for * stop processing, and -1 for error. */ int camdd_pass_run(struct camdd_dev *dev) { struct camdd_buf *buf = NULL; struct camdd_dev_pass *pass_dev = &dev->dev_spec.pass; struct camdd_buf_data *data; uint32_t num_blocks, sectors_used = 0; union ccb *ccb; int retval = 0, is_write = dev->write_dev; int double_buf_needed = 0; buf = STAILQ_FIRST(&dev->run_queue); if (buf == NULL) { retval = 1; goto bailout; } /* * If we're writing, we need to go through the source buffer list * and create an S/G list. */ if (is_write != 0) { retval = camdd_buf_sg_create(buf, /*iovec*/ 0,dev->sector_size, §ors_used, &double_buf_needed); if (retval != 0) { retval = -1; goto bailout; } } STAILQ_REMOVE(&dev->run_queue, buf, camdd_buf, links); dev->num_run_queue--; data = &buf->buf_type_spec.data; ccb = &data->ccb; CCB_CLEAR_ALL_EXCEPT_HDR(&ccb->csio); /* * In almost every case the number of blocks should be the device * block size. The exception may be at the end of an I/O stream * for a partial block or at the end of a device. */ if (is_write != 0) num_blocks = sectors_used; else num_blocks = data->fill_len / pass_dev->block_len; scsi_read_write(&ccb->csio, /*retries*/ dev->retry_count, /*cbfcnp*/ NULL, /*tag_action*/ MSG_SIMPLE_Q_TAG, /*readop*/ (dev->write_dev == 0) ? SCSI_RW_READ : SCSI_RW_WRITE, /*byte2*/ 0, /*minimum_cmd_size*/ dev->min_cmd_size, /*lba*/ buf->lba, /*block_count*/ num_blocks, /*data_ptr*/ (data->sg_count != 0) ? (uint8_t *)data->segs : data->buf, /*dxfer_len*/ (num_blocks * pass_dev->block_len), /*sense_len*/ SSD_FULL_SIZE, /*timeout*/ dev->io_timeout); /* Disable freezing the device queue */ ccb->ccb_h.flags |= CAM_DEV_QFRZDIS; if (dev->retry_count != 0) ccb->ccb_h.flags |= CAM_PASS_ERR_RECOVER; if (data->sg_count != 0) { ccb->csio.sglist_cnt = data->sg_count; ccb->ccb_h.flags |= CAM_DATA_SG; } /* * Store a pointer to the buffer in the CCB. The kernel will * restore this when we get it back, and we'll use it to identify * the buffer this CCB came from. */ ccb->ccb_h.ccb_buf = buf; /* * Unlock our mutex in preparation for issuing the ioctl. */ pthread_mutex_unlock(&dev->mutex); /* * Queue the CCB to the pass(4) driver. */ if (ioctl(pass_dev->dev->fd, CAMIOQUEUE, ccb) == -1) { pthread_mutex_lock(&dev->mutex); warn("%s: error sending CAMIOQUEUE ioctl to %s%u", __func__, pass_dev->dev->device_name, pass_dev->dev->dev_unit_num); warn("%s: CCB address is %p", __func__, ccb); retval = -1; STAILQ_INSERT_TAIL(&dev->free_queue, buf, links); } else { pthread_mutex_lock(&dev->mutex); dev->cur_active_io++; STAILQ_INSERT_TAIL(&dev->active_queue, buf, links); } bailout: return (retval); } int camdd_get_next_lba_len(struct camdd_dev *dev, uint64_t *lba, ssize_t *len) { struct camdd_dev_pass *pass_dev; uint32_t num_blocks; int retval = 0; pass_dev = &dev->dev_spec.pass; *lba = dev->next_io_pos_bytes / dev->sector_size; *len = dev->blocksize; num_blocks = *len / dev->sector_size; /* * If max_sector is 0, then we have no set limit. This can happen * if we're writing to a file in a filesystem, or reading from * something like /dev/zero. */ if ((dev->max_sector != 0) || (dev->sector_io_limit != 0)) { uint64_t max_sector; if ((dev->max_sector != 0) && (dev->sector_io_limit != 0)) max_sector = min(dev->sector_io_limit, dev->max_sector); else if (dev->max_sector != 0) max_sector = dev->max_sector; else max_sector = dev->sector_io_limit; /* * Check to see whether we're starting off past the end of * the device. If so, we need to just send an EOF * notification to the writer. */ if (*lba > max_sector) { *len = 0; retval = 1; } else if (((*lba + num_blocks) > max_sector + 1) || ((*lba + num_blocks) < *lba)) { /* * If we get here (but pass the first check), we * can trim the request length down to go to the * end of the device. */ num_blocks = (max_sector + 1) - *lba; *len = num_blocks * dev->sector_size; retval = 1; } } dev->next_io_pos_bytes += *len; return (retval); } /* * Returns 0 for success, 1 for EOF detected, and -1 for failure. */ int camdd_queue(struct camdd_dev *dev, struct camdd_buf *read_buf) { struct camdd_buf *buf = NULL; struct camdd_buf_data *data; struct camdd_dev_pass *pass_dev; size_t new_len; struct camdd_buf_data *rb_data; int is_write = dev->write_dev; int eof_flush_needed = 0; int retval = 0; int error; pass_dev = &dev->dev_spec.pass; /* * If we've gotten EOF or our partner has, we should not continue * queueing I/O. If we're a writer, though, we should continue * to write any buffers that don't have EOF status. */ if ((dev->flags & CAMDD_DEV_FLAG_EOF) || ((dev->flags & CAMDD_DEV_FLAG_PEER_EOF) && (is_write == 0))) { /* * Tell the worker thread that we have seen EOF. */ retval = 1; /* * If we're the writer, send the buffer back with EOF status. */ if (is_write) { read_buf->status = CAMDD_STATUS_EOF; error = camdd_complete_peer_buf(dev, read_buf); } goto bailout; } if (is_write == 0) { buf = camdd_get_buf(dev, CAMDD_BUF_DATA); if (buf == NULL) { retval = -1; goto bailout; } data = &buf->buf_type_spec.data; retval = camdd_get_next_lba_len(dev, &buf->lba, &buf->len); if (retval != 0) { buf->status = CAMDD_STATUS_EOF; if ((buf->len == 0) && ((dev->flags & (CAMDD_DEV_FLAG_EOF_SENT | CAMDD_DEV_FLAG_EOF_QUEUED)) != 0)) { camdd_release_buf(buf); goto bailout; } dev->flags |= CAMDD_DEV_FLAG_EOF_QUEUED; } data->fill_len = buf->len; data->src_start_offset = buf->lba * dev->sector_size; /* * Put this on the run queue. */ STAILQ_INSERT_TAIL(&dev->run_queue, buf, links); dev->num_run_queue++; /* We're done. */ goto bailout; } /* * Check for new EOF status from the reader. */ if ((read_buf->status == CAMDD_STATUS_EOF) || (read_buf->status == CAMDD_STATUS_ERROR)) { dev->flags |= CAMDD_DEV_FLAG_PEER_EOF; if ((STAILQ_FIRST(&dev->pending_queue) == NULL) && (read_buf->len == 0)) { camdd_complete_peer_buf(dev, read_buf); retval = 1; goto bailout; } else eof_flush_needed = 1; } /* * See if we have a buffer we're composing with pieces from our * partner thread. */ buf = STAILQ_FIRST(&dev->pending_queue); if (buf == NULL) { uint64_t lba; ssize_t len; retval = camdd_get_next_lba_len(dev, &lba, &len); if (retval != 0) { read_buf->status = CAMDD_STATUS_EOF; if (len == 0) { dev->flags |= CAMDD_DEV_FLAG_EOF; error = camdd_complete_peer_buf(dev, read_buf); goto bailout; } } /* * If we don't have a pending buffer, we need to grab a new * one from the free list or allocate another one. */ buf = camdd_get_buf(dev, CAMDD_BUF_DATA); if (buf == NULL) { retval = 1; goto bailout; } buf->lba = lba; buf->len = len; STAILQ_INSERT_TAIL(&dev->pending_queue, buf, links); dev->num_pending_queue++; } data = &buf->buf_type_spec.data; rb_data = &read_buf->buf_type_spec.data; if ((rb_data->src_start_offset != dev->next_peer_pos_bytes) && (dev->debug != 0)) { printf("%s: WARNING: reader offset %#jx != expected offset " "%#jx\n", __func__, (uintmax_t)rb_data->src_start_offset, (uintmax_t)dev->next_peer_pos_bytes); } dev->next_peer_pos_bytes = rb_data->src_start_offset + (rb_data->fill_len - rb_data->resid); new_len = (rb_data->fill_len - rb_data->resid) + data->fill_len; if (new_len < buf->len) { /* * There are three cases here: * 1. We need more data to fill up a block, so we put * this I/O on the queue and wait for more I/O. * 2. We have a pending buffer in the queue that is * smaller than our blocksize, but we got an EOF. So we * need to go ahead and flush the write out. * 3. We got an error. */ /* * Increment our fill length. */ data->fill_len += (rb_data->fill_len - rb_data->resid); /* * Add the new read buffer to the list for writing. */ STAILQ_INSERT_TAIL(&buf->src_list, read_buf, src_links); /* Increment the count */ buf->src_count++; if (eof_flush_needed == 0) { /* * We need to exit, because we don't have enough * data yet. */ goto bailout; } else { /* * Take the buffer off of the pending queue. */ STAILQ_REMOVE(&dev->pending_queue, buf, camdd_buf, links); dev->num_pending_queue--; /* * If we need an EOF flush, but there is no data * to flush, go ahead and return this buffer. */ if (data->fill_len == 0) { camdd_complete_buf(dev, buf, /*error_count*/0); retval = 1; goto bailout; } /* * Put this on the next queue for execution. */ STAILQ_INSERT_TAIL(&dev->run_queue, buf, links); dev->num_run_queue++; } } else if (new_len == buf->len) { /* * We have enough data to completey fill one block, * so we're ready to issue the I/O. */ /* * Take the buffer off of the pending queue. */ STAILQ_REMOVE(&dev->pending_queue, buf, camdd_buf, links); dev->num_pending_queue--; /* * Add the new read buffer to the list for writing. */ STAILQ_INSERT_TAIL(&buf->src_list, read_buf, src_links); /* Increment the count */ buf->src_count++; /* * Increment our fill length. */ data->fill_len += (rb_data->fill_len - rb_data->resid); /* * Put this on the next queue for execution. */ STAILQ_INSERT_TAIL(&dev->run_queue, buf, links); dev->num_run_queue++; } else { struct camdd_buf *idb; struct camdd_buf_indirect *indirect; uint32_t len_to_go, cur_offset; idb = camdd_get_buf(dev, CAMDD_BUF_INDIRECT); if (idb == NULL) { retval = 1; goto bailout; } indirect = &idb->buf_type_spec.indirect; indirect->src_buf = read_buf; read_buf->refcount++; indirect->offset = 0; indirect->start_ptr = rb_data->buf; /* * We've already established that there is more * data in read_buf than we have room for in our * current write request. So this particular chunk * of the request should just be the remainder * needed to fill up a block. */ indirect->len = buf->len - (data->fill_len - data->resid); camdd_buf_add_child(buf, idb); /* * This buffer is ready to execute, so we can take * it off the pending queue and put it on the run * queue. */ STAILQ_REMOVE(&dev->pending_queue, buf, camdd_buf, links); dev->num_pending_queue--; STAILQ_INSERT_TAIL(&dev->run_queue, buf, links); dev->num_run_queue++; cur_offset = indirect->offset + indirect->len; /* * The resulting I/O would be too large to fit in * one block. We need to split this I/O into * multiple pieces. Allocate as many buffers as needed. */ for (len_to_go = rb_data->fill_len - rb_data->resid - indirect->len; len_to_go > 0;) { struct camdd_buf *new_buf; struct camdd_buf_data *new_data; uint64_t lba; ssize_t len; retval = camdd_get_next_lba_len(dev, &lba, &len); if ((retval != 0) && (len == 0)) { /* * The device has already been marked * as EOF, and there is no space left. */ goto bailout; } new_buf = camdd_get_buf(dev, CAMDD_BUF_DATA); if (new_buf == NULL) { retval = 1; goto bailout; } new_buf->lba = lba; new_buf->len = len; idb = camdd_get_buf(dev, CAMDD_BUF_INDIRECT); if (idb == NULL) { retval = 1; goto bailout; } indirect = &idb->buf_type_spec.indirect; indirect->src_buf = read_buf; read_buf->refcount++; indirect->offset = cur_offset; indirect->start_ptr = rb_data->buf + cur_offset; indirect->len = min(len_to_go, new_buf->len); #if 0 if (((indirect->len % dev->sector_size) != 0) || ((indirect->offset % dev->sector_size) != 0)) { warnx("offset %ju len %ju not aligned with " "sector size %u", indirect->offset, (uintmax_t)indirect->len, dev->sector_size); } #endif cur_offset += indirect->len; len_to_go -= indirect->len; camdd_buf_add_child(new_buf, idb); new_data = &new_buf->buf_type_spec.data; if ((new_data->fill_len == new_buf->len) || (eof_flush_needed != 0)) { STAILQ_INSERT_TAIL(&dev->run_queue, new_buf, links); dev->num_run_queue++; } else if (new_data->fill_len < buf->len) { STAILQ_INSERT_TAIL(&dev->pending_queue, new_buf, links); dev->num_pending_queue++; } else { warnx("%s: too much data in new " "buffer!", __func__); retval = 1; goto bailout; } } } bailout: return (retval); } void camdd_get_depth(struct camdd_dev *dev, uint32_t *our_depth, uint32_t *peer_depth, uint32_t *our_bytes, uint32_t *peer_bytes) { *our_depth = dev->cur_active_io + dev->num_run_queue; if (dev->num_peer_work_queue > dev->num_peer_done_queue) *peer_depth = dev->num_peer_work_queue - dev->num_peer_done_queue; else *peer_depth = 0; *our_bytes = *our_depth * dev->blocksize; *peer_bytes = dev->peer_bytes_queued; } void camdd_sig_handler(int sig) { if (sig == SIGINFO) need_status = 1; else { need_exit = 1; error_exit = 1; } sem_post(&camdd_sem); } void camdd_print_status(struct camdd_dev *camdd_dev, struct camdd_dev *other_dev, struct timespec *start_time) { struct timespec done_time; uint64_t total_ns; long double mb_sec, total_sec; int error = 0; error = clock_gettime(CLOCK_MONOTONIC_PRECISE, &done_time); if (error != 0) { warn("Unable to get done time"); return; } timespecsub(&done_time, start_time); total_ns = done_time.tv_nsec + (done_time.tv_sec * 1000000000); total_sec = total_ns; total_sec /= 1000000000; fprintf(stderr, "%ju bytes %s %s\n%ju bytes %s %s\n" "%.4Lf seconds elapsed\n", (uintmax_t)camdd_dev->bytes_transferred, (camdd_dev->write_dev == 0) ? "read from" : "written to", camdd_dev->device_name, (uintmax_t)other_dev->bytes_transferred, (other_dev->write_dev == 0) ? "read from" : "written to", other_dev->device_name, total_sec); mb_sec = min(other_dev->bytes_transferred,camdd_dev->bytes_transferred); mb_sec /= 1024 * 1024; mb_sec *= 1000000000; mb_sec /= total_ns; fprintf(stderr, "%.2Lf MB/sec\n", mb_sec); } int camdd_rw(struct camdd_io_opts *io_opts, int num_io_opts, uint64_t max_io, int retry_count, int timeout) { char *device = NULL; struct cam_device *new_cam_dev = NULL; struct camdd_dev *devs[2]; struct timespec start_time; pthread_t threads[2]; int unit = 0; int error = 0; int i; if (num_io_opts != 2) { warnx("Must have one input and one output path"); error = 1; goto bailout; } bzero(devs, sizeof(devs)); for (i = 0; i < num_io_opts; i++) { switch (io_opts[i].dev_type) { case CAMDD_DEV_PASS: { camdd_argmask new_arglist = CAMDD_ARG_NONE; int bus = 0, target = 0, lun = 0; char name[30]; int rv; if (isdigit(io_opts[i].dev_name[0])) { /* device specified as bus:target[:lun] */ rv = parse_btl(io_opts[i].dev_name, &bus, &target, &lun, &new_arglist); if (rv < 2) { warnx("numeric device specification " "must be either bus:target, or " "bus:target:lun"); error = 1; goto bailout; } /* default to 0 if lun was not specified */ if ((new_arglist & CAMDD_ARG_LUN) == 0) { lun = 0; new_arglist |= CAMDD_ARG_LUN; } } else { if (cam_get_device(io_opts[i].dev_name, name, sizeof name, &unit) == -1) { warnx("%s", cam_errbuf); error = 1; goto bailout; } device = strdup(name); new_arglist |= CAMDD_ARG_DEVICE |CAMDD_ARG_UNIT; } if (new_arglist & (CAMDD_ARG_BUS | CAMDD_ARG_TARGET)) new_cam_dev = cam_open_btl(bus, target, lun, O_RDWR, NULL); else new_cam_dev = cam_open_spec_device(device, unit, O_RDWR, NULL); if (new_cam_dev == NULL) { warnx("%s", cam_errbuf); error = 1; goto bailout; } devs[i] = camdd_probe_pass(new_cam_dev, /*io_opts*/ &io_opts[i], CAMDD_ARG_ERR_RECOVER, /*probe_retry_count*/ 3, /*probe_timeout*/ 5000, /*io_retry_count*/ retry_count, /*io_timeout*/ timeout); if (devs[i] == NULL) { warn("Unable to probe device %s%u", new_cam_dev->device_name, new_cam_dev->dev_unit_num); error = 1; goto bailout; } break; } case CAMDD_DEV_FILE: { int fd = -1; if (io_opts[i].dev_name[0] == '-') { if (io_opts[i].write_dev != 0) fd = STDOUT_FILENO; else fd = STDIN_FILENO; } else { if (io_opts[i].write_dev != 0) { fd = open(io_opts[i].dev_name, O_RDWR | O_CREAT, S_IWUSR |S_IRUSR); } else { fd = open(io_opts[i].dev_name, O_RDONLY); } } if (fd == -1) { warn("error opening file %s", io_opts[i].dev_name); error = 1; goto bailout; } devs[i] = camdd_probe_file(fd, &io_opts[i], retry_count, timeout); if (devs[i] == NULL) { error = 1; goto bailout; } break; } default: warnx("Unknown device type %d (%s)", io_opts[i].dev_type, io_opts[i].dev_name); error = 1; goto bailout; break; /*NOTREACHED */ } devs[i]->write_dev = io_opts[i].write_dev; devs[i]->start_offset_bytes = io_opts[i].offset; if (max_io != 0) { devs[i]->sector_io_limit = (devs[i]->start_offset_bytes / devs[i]->sector_size) + (max_io / devs[i]->sector_size) - 1; devs[i]->sector_io_limit = (devs[i]->start_offset_bytes / devs[i]->sector_size) + (max_io / devs[i]->sector_size) - 1; } devs[i]->next_io_pos_bytes = devs[i]->start_offset_bytes; devs[i]->next_completion_pos_bytes =devs[i]->start_offset_bytes; } devs[0]->peer_dev = devs[1]; devs[1]->peer_dev = devs[0]; devs[0]->next_peer_pos_bytes = devs[0]->peer_dev->next_io_pos_bytes; devs[1]->next_peer_pos_bytes = devs[1]->peer_dev->next_io_pos_bytes; sem_init(&camdd_sem, /*pshared*/ 0, 0); signal(SIGINFO, camdd_sig_handler); signal(SIGINT, camdd_sig_handler); error = clock_gettime(CLOCK_MONOTONIC_PRECISE, &start_time); if (error != 0) { warn("Unable to get start time"); goto bailout; } for (i = 0; i < num_io_opts; i++) { error = pthread_create(&threads[i], NULL, camdd_worker, (void *)devs[i]); if (error != 0) { warnc(error, "pthread_create() failed"); goto bailout; } } for (;;) { if ((sem_wait(&camdd_sem) == -1) || (need_exit != 0)) { struct kevent ke; for (i = 0; i < num_io_opts; i++) { EV_SET(&ke, (uintptr_t)&devs[i]->work_queue, EVFILT_USER, 0, NOTE_TRIGGER, 0, NULL); devs[i]->flags |= CAMDD_DEV_FLAG_EOF; error = kevent(devs[i]->kq, &ke, 1, NULL, 0, NULL); if (error == -1) warn("%s: unable to wake up thread", __func__); error = 0; } break; } else if (need_status != 0) { camdd_print_status(devs[0], devs[1], &start_time); need_status = 0; } } for (i = 0; i < num_io_opts; i++) { pthread_join(threads[i], NULL); } camdd_print_status(devs[0], devs[1], &start_time); bailout: for (i = 0; i < num_io_opts; i++) camdd_free_dev(devs[i]); return (error + error_exit); } void usage(void) { fprintf(stderr, "usage: camdd <-i|-o pass=pass0,bs=1M,offset=1M,depth=4>\n" " <-i|-o file=/tmp/file,bs=512K,offset=1M>\n" " <-i|-o file=/dev/da0,bs=512K,offset=1M>\n" " <-i|-o file=/dev/nsa0,bs=512K>\n" " [-C retry_count][-E][-m max_io_amt][-t timeout_secs][-v][-h]\n" "Option description\n" "-i Specify input device/file and parameters\n" "-o Specify output device/file and parameters\n" "Input and Output parameters\n" "pass=name Specify a pass(4) device like pass0 or /dev/pass0\n" "file=name Specify a file or device, /tmp/foo, /dev/da0, /dev/null\n" " or - for stdin/stdout\n" "bs=blocksize Specify blocksize in bytes, or using K, M, G, etc. suffix\n" "offset=len Specify starting offset in bytes or using K, M, G suffix\n" " NOTE: offset cannot be specified on tapes, pipes, stdin/out\n" "depth=N Specify a numeric queue depth. This only applies to pass(4)\n" "mcs=N Specify a minimum cmd size for pass(4) read/write commands\n" "Optional arguments\n" "-C retry_cnt Specify a retry count for pass(4) devices\n" "-E Enable CAM error recovery for pass(4) devices\n" "-m max_io Specify the maximum amount to be transferred in bytes or\n" " using K, G, M, etc. suffixes\n" "-t timeout Specify the I/O timeout to use with pass(4) devices\n" "-v Enable verbose error recovery\n" "-h Print this message\n"); } int camdd_parse_io_opts(char *args, int is_write, struct camdd_io_opts *io_opts) { char *tmpstr, *tmpstr2; char *orig_tmpstr = NULL; int retval = 0; io_opts->write_dev = is_write; tmpstr = strdup(args); if (tmpstr == NULL) { warn("strdup failed"); retval = 1; goto bailout; } orig_tmpstr = tmpstr; while ((tmpstr2 = strsep(&tmpstr, ",")) != NULL) { char *name, *value; /* * If the user creates an empty parameter by putting in two * commas, skip over it and look for the next field. */ if (*tmpstr2 == '\0') continue; name = strsep(&tmpstr2, "="); if (*name == '\0') { warnx("Got empty I/O parameter name"); retval = 1; goto bailout; } value = strsep(&tmpstr2, "="); if ((value == NULL) || (*value == '\0')) { warnx("Empty I/O parameter value for %s", name); retval = 1; goto bailout; } if (strncasecmp(name, "file", 4) == 0) { io_opts->dev_type = CAMDD_DEV_FILE; io_opts->dev_name = strdup(value); if (io_opts->dev_name == NULL) { warn("Error allocating memory"); retval = 1; goto bailout; } } else if (strncasecmp(name, "pass", 4) == 0) { io_opts->dev_type = CAMDD_DEV_PASS; io_opts->dev_name = strdup(value); if (io_opts->dev_name == NULL) { warn("Error allocating memory"); retval = 1; goto bailout; } } else if ((strncasecmp(name, "bs", 2) == 0) || (strncasecmp(name, "blocksize", 9) == 0)) { retval = expand_number(value, &io_opts->blocksize); if (retval == -1) { warn("expand_number(3) failed on %s=%s", name, value); retval = 1; goto bailout; } } else if (strncasecmp(name, "depth", 5) == 0) { char *endptr; io_opts->queue_depth = strtoull(value, &endptr, 0); if (*endptr != '\0') { warnx("invalid queue depth %s", value); retval = 1; goto bailout; } } else if (strncasecmp(name, "mcs", 3) == 0) { char *endptr; io_opts->min_cmd_size = strtol(value, &endptr, 0); if ((*endptr != '\0') || ((io_opts->min_cmd_size > 16) || (io_opts->min_cmd_size < 0))) { warnx("invalid minimum cmd size %s", value); retval = 1; goto bailout; } } else if (strncasecmp(name, "offset", 6) == 0) { retval = expand_number(value, &io_opts->offset); if (retval == -1) { warn("expand_number(3) failed on %s=%s", name, value); retval = 1; goto bailout; } } else if (strncasecmp(name, "debug", 5) == 0) { char *endptr; io_opts->debug = strtoull(value, &endptr, 0); if (*endptr != '\0') { warnx("invalid debug level %s", value); retval = 1; goto bailout; } } else { warnx("Unrecognized parameter %s=%s", name, value); } } bailout: free(orig_tmpstr); return (retval); } int main(int argc, char **argv) { int c; camdd_argmask arglist = CAMDD_ARG_NONE; int timeout = 0, retry_count = 1; int error = 0; uint64_t max_io = 0; struct camdd_io_opts *opt_list = NULL; if (argc == 1) { usage(); exit(1); } opt_list = calloc(2, sizeof(struct camdd_io_opts)); if (opt_list == NULL) { warn("Unable to allocate option list"); error = 1; goto bailout; } while ((c = getopt(argc, argv, "C:Ehi:m:o:t:v")) != -1){ switch (c) { case 'C': retry_count = strtol(optarg, NULL, 0); if (retry_count < 0) errx(1, "retry count %d is < 0", retry_count); arglist |= CAMDD_ARG_RETRIES; break; case 'E': arglist |= CAMDD_ARG_ERR_RECOVER; break; case 'i': case 'o': if (((c == 'i') && (opt_list[0].dev_type != CAMDD_DEV_NONE)) || ((c == 'o') && (opt_list[1].dev_type != CAMDD_DEV_NONE))) { errx(1, "Only one input and output path " "allowed"); } error = camdd_parse_io_opts(optarg, (c == 'o') ? 1 : 0, (c == 'o') ? &opt_list[1] : &opt_list[0]); if (error != 0) goto bailout; break; case 'm': error = expand_number(optarg, &max_io); if (error == -1) { warn("invalid maximum I/O amount %s", optarg); error = 1; goto bailout; } break; case 't': timeout = strtol(optarg, NULL, 0); if (timeout < 0) errx(1, "invalid timeout %d", timeout); /* Convert the timeout from seconds to ms */ timeout *= 1000; arglist |= CAMDD_ARG_TIMEOUT; break; case 'v': arglist |= CAMDD_ARG_VERBOSE; break; case 'h': default: usage(); exit(1); break; /*NOTREACHED*/ } } if ((opt_list[0].dev_type == CAMDD_DEV_NONE) || (opt_list[1].dev_type == CAMDD_DEV_NONE)) errx(1, "Must specify both -i and -o"); /* * Set the timeout if the user hasn't specified one. */ if (timeout == 0) timeout = CAMDD_PASS_RW_TIMEOUT; error = camdd_rw(opt_list, 2, max_io, retry_count, timeout); bailout: free(opt_list); exit(error); } Index: user/alc/PQ_LAUNDRY/usr.sbin/crashinfo/crashinfo.sh =================================================================== --- user/alc/PQ_LAUNDRY/usr.sbin/crashinfo/crashinfo.sh (revision 303205) +++ user/alc/PQ_LAUNDRY/usr.sbin/crashinfo/crashinfo.sh (revision 303206) @@ -1,304 +1,323 @@ #!/bin/sh # # Copyright (c) 2008 Yahoo!, Inc. # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # 3. Neither the name of the author nor the names of any co-contributors # may be used to endorse or promote products derived from this software # without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. # # $FreeBSD$ usage() { echo "usage: crashinfo [-d crashdir] [-n dumpnr] [-k kernel] [core]" exit 1 } +# Run a single gdb command against a kernel file in batch mode. +# The kernel file is specified as the first argument and the command +# is given in the remaining arguments. +gdb_command() +{ + local k + + k=$1 ; shift + + if [ -x /usr/local/bin/gdb ]; then + /usr/local/bin/gdb -batch -ex "$@" $k + else + echo -e "$@" | /usr/bin/gdb -x /dev/stdin -batch $k + fi +} + find_kernel() { local ivers k kvers ivers=$(awk ' /Version String/ { print nextline=1 next } nextline==1 { if ($0 ~ "^ [A-Za-z ]+: ") { nextline=0 } else { print } }' $INFO) # Look for a matching kernel version. for k in `sysctl -n kern.bootfile` $(ls -t /boot/*/kernel); do - kvers=$(echo 'printf " Version String: %s", version' | \ - gdb -x /dev/stdin -batch $k 2>/dev/null) + kvers=$(gdb_command $k 'printf " Version String: %s", version' \ + 2>/dev/null) if [ "$ivers" = "$kvers" ]; then KERNEL=$k break fi done } CRASHDIR=/var/crash DUMPNR= KERNEL= while getopts "d:n:k:" opt; do case "$opt" in d) CRASHDIR=$OPTARG ;; n) DUMPNR=$OPTARG ;; k) KERNEL=$OPTARG ;; \?) usage ;; esac done shift $((OPTIND - 1)) if [ $# -eq 1 ]; then if [ -n "$DUMPNR" ]; then echo "-n and an explicit vmcore are mutually exclusive" usage fi # Figure out the crash directory and number from the vmcore name. CRASHDIR=`dirname $1` DUMPNR=$(expr $(basename $1) : 'vmcore\.\([0-9]*\)$') if [ -z "$DUMPNR" ]; then echo "Unable to determine dump number from vmcore file $1." exit 1 fi elif [ $# -gt 1 ]; then usage else # If we don't have an explicit dump number, operate on the most # recent dump. if [ -z "$DUMPNR" ]; then if ! [ -r $CRASHDIR/bounds ]; then echo "No crash dumps in $CRASHDIR." exit 1 fi next=`cat $CRASHDIR/bounds` if [ -z "$next" ] || [ "$next" -eq 0 ]; then echo "No crash dumps in $CRASHDIR." exit 1 fi DUMPNR=$(($next - 1)) fi fi VMCORE=$CRASHDIR/vmcore.$DUMPNR INFO=$CRASHDIR/info.$DUMPNR FILE=$CRASHDIR/core.txt.$DUMPNR HOSTNAME=`hostname` if [ ! -e $VMCORE ]; then echo "$VMCORE not found" exit 1 fi if [ ! -e $INFO ]; then echo "$INFO not found" exit 1 fi # If the user didn't specify a kernel, then try to find one. if [ -z "$KERNEL" ]; then find_kernel if [ -z "$KERNEL" ]; then echo "Unable to find matching kernel for $VMCORE" exit 1 fi elif [ ! -e $KERNEL ]; then echo "$KERNEL not found" exit 1 fi echo "Writing crash summary to $FILE." umask 077 # Simulate uname -ostype=$(echo -e printf '"%s", ostype' | gdb -x /dev/stdin -batch $KERNEL) -osrelease=$(echo -e printf '"%s", osrelease' | gdb -x /dev/stdin -batch $KERNEL) -version=$(echo -e printf '"%s", version' | gdb -x /dev/stdin -batch $KERNEL | \ - tr '\t\n' ' ') -machine=$(echo -e printf '"%s", machine' | gdb -x /dev/stdin -batch $KERNEL) +ostype=$(gdb_command $KERNEL 'printf "%s", ostype') +osrelease=$(gdb_command $KERNEL 'printf "%s", osrelease') +version=$(gdb_command $KERNEL 'printf "%s", version' | tr '\t\n' ' ') +machine=$(gdb_command $KERNEL 'printf "%s", machine') exec > $FILE 2>&1 echo "$HOSTNAME dumped core - see $VMCORE" echo date echo echo "$ostype $HOSTNAME $osrelease $version $machine" echo sed -ne '/^ Panic String: /{s//panic: /;p;}' $INFO echo # XXX: /bin/sh on 7.0+ is broken so we can't simply pipe the commands to # kgdb via stdin and have to use a temporary file instead. file=`mktemp /tmp/crashinfo.XXXXXX` if [ $? -eq 0 ]; then echo "bt" >> $file echo "quit" >> $file - kgdb $KERNEL $VMCORE < $file + if [ -x /usr/local/bin/kgdb ]; then + /usr/local/bin/kgdb $KERNEL $VMCORE < $file + else + kgdb $KERNEL $VMCORE < $file + fi rm -f $file echo fi echo echo "------------------------------------------------------------------------" echo "ps -axlww" echo ps -M $VMCORE -N $KERNEL -axlww echo echo "------------------------------------------------------------------------" echo "vmstat -s" echo vmstat -M $VMCORE -N $KERNEL -s echo echo "------------------------------------------------------------------------" echo "vmstat -m" echo vmstat -M $VMCORE -N $KERNEL -m echo echo "------------------------------------------------------------------------" echo "vmstat -z" echo vmstat -M $VMCORE -N $KERNEL -z echo echo "------------------------------------------------------------------------" echo "vmstat -i" echo vmstat -M $VMCORE -N $KERNEL -i echo echo "------------------------------------------------------------------------" echo "pstat -T" echo pstat -M $VMCORE -N $KERNEL -T echo echo "------------------------------------------------------------------------" echo "pstat -s" echo pstat -M $VMCORE -N $KERNEL -s echo echo "------------------------------------------------------------------------" echo "iostat" echo iostat -M $VMCORE -N $KERNEL echo echo "------------------------------------------------------------------------" echo "ipcs -a" echo ipcs -C $VMCORE -N $KERNEL -a echo echo "------------------------------------------------------------------------" echo "ipcs -T" echo ipcs -C $VMCORE -N $KERNEL -T echo # XXX: This doesn't actually work in 5.x+ if false; then echo "------------------------------------------------------------------------" echo "w -dn" echo w -M $VMCORE -N $KERNEL -dn echo fi echo "------------------------------------------------------------------------" echo "nfsstat" echo nfsstat -M $VMCORE -N $KERNEL echo echo "------------------------------------------------------------------------" echo "netstat -s" echo netstat -M $VMCORE -N $KERNEL -s echo echo "------------------------------------------------------------------------" echo "netstat -m" echo netstat -M $VMCORE -N $KERNEL -m echo echo "------------------------------------------------------------------------" echo "netstat -anA" echo netstat -M $VMCORE -N $KERNEL -anA echo echo "------------------------------------------------------------------------" echo "netstat -aL" echo netstat -M $VMCORE -N $KERNEL -aL echo echo "------------------------------------------------------------------------" echo "fstat" echo fstat -M $VMCORE -N $KERNEL echo echo "------------------------------------------------------------------------" echo "dmesg" echo dmesg -a -M $VMCORE -N $KERNEL echo echo "------------------------------------------------------------------------" echo "kernel config" echo config -x $KERNEL echo echo "------------------------------------------------------------------------" echo "ddb capture buffer" echo ddb capture -M $VMCORE -N $KERNEL print Index: user/alc/PQ_LAUNDRY/usr.sbin/ctld/ctl.conf.5 =================================================================== --- user/alc/PQ_LAUNDRY/usr.sbin/ctld/ctl.conf.5 (revision 303205) +++ user/alc/PQ_LAUNDRY/usr.sbin/ctld/ctl.conf.5 (revision 303206) @@ -1,496 +1,587 @@ .\" Copyright (c) 2012 The FreeBSD Foundation .\" Copyright (c) 2015 Alexander Motin .\" All rights reserved. .\" .\" This software was developed by Edward Tomasz Napierala under sponsorship .\" from the FreeBSD Foundation. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd November 9, 2015 +.Dd July 21, 2016 .Dt CTL.CONF 5 .Os .Sh NAME .Nm ctl.conf .Nd CAM Target Layer / iSCSI target daemon configuration file .Sh DESCRIPTION The .Nm configuration file is used by the .Xr ctld 8 daemon. Lines starting with .Ql # are interpreted as comments. The general syntax of the .Nm file is: .Bd -literal -offset indent .No pidfile Ar path .No auth-group Ar name No { .Dl chap Ar user Ar secret .Dl ... } .No portal-group Ar name No { .Dl listen Ar address .\".Dl listen-iser Ar address .Dl discovery-auth-group Ar name .Dl ... } .No target Ar name { .Dl auth-group Ar name .Dl portal-group Ar name .Dl lun Ar number No { .Dl path Ar path .Dl } .Dl ... } .Ed .Ss Global Context .Bl -tag -width indent .It Ic auth-group Ar name Create an .Sy auth-group configuration context, defining a new auth-group, which can then be assigned to any number of targets. .It Ic debug Ar level The debug verbosity level. The default is 0. .It Ic maxproc Ar number The limit for concurrently running child processes handling incoming connections. The default is 30. A setting of 0 disables the limit. .It Ic pidfile Ar path The path to the pidfile. The default is .Pa /var/run/ctld.pid . .It Ic portal-group Ar name Create a .Sy portal-group configuration context, defining a new portal-group, which can then be assigned to any number of targets. .It Ic lun Ar name Create a .Sy lun configuration context, defining a LUN to be exported by any number of targets. .It Ic target Ar name Create a .Sy target configuration context, which can optionally contain one or more .Sy lun contexts. .It Ic timeout Ar seconds The timeout for login sessions, after which the connection will be forcibly terminated. The default is 60. A setting of 0 disables the timeout. .It Ic isns-server Ar address An IPv4 or IPv6 address and optionally port of iSNS server to register on. .It Ic isns-period Ar seconds iSNS registration period. Registered Network Entity not updated during this period will be unregistered. The default is 900. .It Ic isns-timeout Ar seconds Timeout for iSNS requests. The default is 5. .El .Ss auth-group Context .Bl -tag -width indent .It Ic auth-type Ar type Sets the authentication type. Type can be either .Qq Ar none , .Qq Ar deny , .Qq Ar chap , or .Qq Ar chap-mutual . In most cases it is not necessary to set the type using this clause; it is usually used to disable authentication for a given .Sy auth-group . .It Ic chap Ar user Ar secret A set of CHAP authentication credentials. Note that for any .Sy auth-group , the configuration may only contain either .Sy chap or .Sy chap-mutual entries; it is an error to mix them. .It Ic chap-mutual Ar user Ar secret Ar mutualuser Ar mutualsecret A set of mutual CHAP authentication credentials. Note that for any .Sy auth-group , the configuration may only contain either .Sy chap or .Sy chap-mutual entries; it is an error to mix them. .It Ic initiator-name Ar initiator-name An iSCSI initiator name. Only initiators with a name matching one of the defined names will be allowed to connect. If not defined, there will be no restrictions based on initiator name. .It Ic initiator-portal Ar address Ns Op / Ns Ar prefixlen An iSCSI initiator portal: an IPv4 or IPv6 address, optionally followed by a literal slash and a prefix length. Only initiators with an address matching one of the defined addresses will be allowed to connect. If not defined, there will be no restrictions based on initiator address. .El .Ss portal-group Context .Bl -tag -width indent .It Ic discovery-auth-group Ar name Assign a previously defined authentication group to the portal group, to be used for target discovery. By default, portal groups are assigned predefined .Sy auth-group .Qq Ar default , which denies discovery. Another predefined .Sy auth-group , .Qq Ar no-authentication , may be used to permit discovery without authentication. .It Ic discovery-filter Ar filter Determines which targets are returned during discovery. Filter can be either .Qq Ar none , .Qq Ar portal , .Qq Ar portal-name , or .Qq Ar portal-name-auth . When set to .Qq Ar none , discovery will return all targets assigned to that portal group. When set to .Qq Ar portal , discovery will not return targets that cannot be accessed by the initiator because of their .Sy initiator-portal . When set to .Qq Ar portal-name , the check will include both .Sy initiator-portal and .Sy initiator-name . When set to .Qq Ar portal-name-auth , the check will include .Sy initiator-portal , .Sy initiator-name , and authentication credentials. The target is returned if it does not require CHAP authentication, or if the CHAP user and secret used during discovery match those used by the target. Note that when using .Qq Ar portal-name-auth , targets that require CHAP authentication will only be returned if .Sy discovery-auth-group requires CHAP. The default is .Qq Ar none . .It Ic listen Ar address An IPv4 or IPv6 address and port to listen on for incoming connections. .\".It Ic listen-iser Ar address .\"An IPv4 or IPv6 address and port to listen on for incoming connections .\"using iSER (iSCSI over RDMA) protocol. .It Ic offload Ar driver Define iSCSI hardware offload driver to use for this .Sy portal-group . The default is .Qq Ar none . .It Ic option Ar name Ar value The CTL-specific port options passed to the kernel. .It Ic redirect Ar address IPv4 or IPv6 address to redirect initiators to. When configured, all initiators attempting to connect to portal belonging to this .Sy portal-group will get redirected using "Target moved temporarily" login response. Redirection happens before authentication and any .Sy initiator-name or .Sy initiator-portal checks are skipped. .It Ic tag Ar value Unique 16-bit tag value of this .Sy portal-group . If not specified, the value is generated automatically. .It Ic foreign Specifies that this .Sy portal-group is listened by some other host. This host will announce it on discovery stage, but won't listen. .El .Ss target Context .Bl -tag -width indent .It Ic alias Ar text Assign a human-readable description to the target. There is no default. .It Ic auth-group Ar name Assign a previously defined authentication group to the target. By default, targets that do not specify their own auth settings, using clauses such as .Sy chap or .Sy initiator-name , are assigned predefined .Sy auth-group .Qq Ar default , which denies all access. Another predefined .Sy auth-group , .Qq Ar no-authentication , may be used to permit access without authentication. Note that this clause can be overridden using the second argument to a .Sy portal-group clause. .It Ic auth-type Ar type Sets the authentication type. Type can be either .Qq Ar none , .Qq Ar deny , .Qq Ar chap , or .Qq Ar chap-mutual . In most cases it is not necessary to set the type using this clause; it is usually used to disable authentication for a given .Sy target . This clause is mutually exclusive with .Sy auth-group ; one cannot use both in a single target. .It Ic chap Ar user Ar secret A set of CHAP authentication credentials. Note that targets must only use one of .Sy auth-group , chap , No or Sy chap-mutual ; it is a configuration error to mix multiple types in one target. .It Ic chap-mutual Ar user Ar secret Ar mutualuser Ar mutualsecret A set of mutual CHAP authentication credentials. Note that targets must only use one of .Sy auth-group , chap , No or Sy chap-mutual ; it is a configuration error to mix multiple types in one target. .It Ic initiator-name Ar initiator-name An iSCSI initiator name. Only initiators with a name matching one of the defined names will be allowed to connect. If not defined, there will be no restrictions based on initiator name. This clause is mutually exclusive with .Sy auth-group ; one cannot use both in a single target. .It Ic initiator-portal Ar address Ns Op / Ns Ar prefixlen An iSCSI initiator portal: an IPv4 or IPv6 address, optionally followed by a literal slash and a prefix length. Only initiators with an address matching one of the defined addresses will be allowed to connect. If not defined, there will be no restrictions based on initiator address. This clause is mutually exclusive with .Sy auth-group ; one cannot use both in a single target. .Pp The .Sy auth-type , .Sy chap , .Sy chap-mutual , .Sy initiator-name , and .Sy initiator-portal clauses in the target context provide an alternative to assigning an .Sy auth-group defined separately, useful in the common case of authentication settings specific to a single target. .It Ic portal-group Ar name Op Ar ag-name Assign a previously defined portal group to the target. The default portal group is .Qq Ar default , which makes the target available on TCP port 3260 on all configured IPv4 and IPv6 addresses. Optional second argument specifies .Sy auth-group for connections to this specific portal group. If second argument is not specified, target .Sy auth-group is used. .It Ic port Ar name .It Ic port Ar name/pp .It Ic port Ar name/pp/vp Assign specified CTL port (such as "isp0" or "isp2/1") to the target. This is used to export the target through a specific physical - eg Fibre Channel - port, in addition to portal-groups configured for the target. Use .Cm "ctladm portlist" command to retrieve the list of available ports. On startup .Xr ctld 8 configures LUN mapping and enables all assigned ports. Each port can be assigned to only one target. .It Ic redirect Ar address IPv4 or IPv6 address to redirect initiators to. When configured, all initiators attempting to connect to this target will get redirected using "Target moved temporarily" login response. Redirection happens after successful authentication. .It Ic lun Ar number Ar name Export previously defined .Sy lun by the parent target. .It Ic lun Ar number Create a .Sy lun configuration context, defining a LUN exported by the parent target. .Pp This is an alternative to defining the LUN separately, useful in the common case of a LUN being exported by a single target. .El .Ss lun Context .Bl -tag -width indent .It Ic backend Ar block No | Ar ramdisk The CTL backend to use for a given LUN. Valid choices are .Qq Ar block and .Qq Ar ramdisk ; block is used for LUNs backed by files or disk device nodes; ramdisk is a bitsink device, used mostly for testing. The default backend is block. .It Ic blocksize Ar size The blocksize visible to the initiator. The default blocksize is 512 for disks, and 2048 for CD/DVDs. .It Ic ctl-lun Ar lun_id Global numeric identifier to use for a given LUN inside CTL. By default CTL allocates those IDs dynamically, but explicit specification may be needed for consistency in HA configurations. .It Ic device-id Ar string The SCSI Device Identification string presented to the initiator. .It Ic device-type Ar type Specify the SCSI device type to use when creating the LUN. Currently CTL supports Direct Access (type 0), Processor (type 3) and CD/DVD (type 5) LUNs. .It Ic option Ar name Ar value The CTL-specific options passed to the kernel. All CTL-specific options are documented in the .Sx OPTIONS section of .Xr ctladm 8 . .It Ic path Ar path The path to the file, device node, or .Xr zfs 8 volume used to back the LUN. For optimal performance, create the volume with the .Qq Ar volmode=dev property set. .It Ic serial Ar string The SCSI serial number presented to the initiator. .It Ic size Ar size The LUN size, in bytes. .El .Sh FILES .Bl -tag -width ".Pa /etc/ctl.conf" -compact .It Pa /etc/ctl.conf The default location of the .Xr ctld 8 configuration file. .El .Sh EXAMPLES .Bd -literal auth-group ag0 { chap-mutual "user" "secret" "mutualuser" "mutualsecret" chap-mutual "user2" "secret2" "mutualuser" "mutualsecret" initiator-portal 192.168.1.1/16 } auth-group ag1 { auth-type none initiator-name "iqn.2012-06.com.example:initiatorhost1" initiator-name "iqn.2012-06.com.example:initiatorhost2" initiator-portal 192.168.1.1/24 initiator-portal [2001:db8::de:ef] } portal-group pg0 { discovery-auth-group no-authentication listen 0.0.0.0:3260 listen [::]:3260 listen [fe80::be:ef]:3261 } target iqn.2012-06.com.example:target0 { alias "Example target" auth-group no-authentication lun 0 { path /dev/zvol/tank/example_0 blocksize 4096 size 4G } } lun example_1 { path /dev/zvol/tank/example_1 option naa 0x50015178f369f093 } target iqn.2012-06.com.example:target1 { auth-group ag0 portal-group pg0 lun 0 example_1 lun 1 { path /dev/zvol/tank/example_2 option vendor "FreeBSD" } } target naa.50015178f369f092 { port isp0 port isp1 lun 0 example_1 +} +.Ed +.Pp +An equivalent configuration in UCL format, for use with +.Fl u : +.Bd -literal +auth-group { + ag0 { + chap-mutual = [ + { + user = "user" + secret = "secretsecret" + mutual-user = "mutualuser" + mutual-secret = "mutualsecret" + }, + { + user = "user2" + secret = "secret2secret2" + mutual-user = "mutualuser" + mutual-secret = "mutualsecret" + } + ] + } + + ag1 { + auth-type = none + initiator-name = [ + "iqn.2012-06.com.example:initiatorhost1", + "iqn.2012-06.com.example:initiatorhost2" + ] + initiator-portal = [192.168.1.1/24, "[2001:db8::de:ef]"] + } +} + +portal-group { + pg0 { + discovery-auth-group = no-authentication + listen = [ + 0.0.0.0:3260, + "[::]:3260", + "[fe80::be:ef]:3261" + ] + } +} + +lun { + example_0 { + path = /dev/zvol/tank/example_0 + blocksize = 4096 + size = "4G" + } + + example_1 { + path = /dev/zvol/tank/example_1 + options { + naa = "0x50015178f369f093" + } + } + + example_2 { + path = /dev/zvol/tank/example_2 + options { + vendor = "FreeBSD" + } + } +} + +target { + "iqn.2012-06.com.example:target0" { + alias = "Example target" + auth-group = no-authentication + lun = [ + { number = 0, name = example_0 }, + ] + } + + "iqn.2012-06.com.example:target1" { + auth-group = ag0 + portal-group { name = pg0 } + lun = [ + { number = 0, name = example_1 }, + { number = 1, name = example_2 } + ] + } + + naa.50015178f369f092 { + port = isp0 + lun = [ + { number = 0, name = example_1 } + ] + } } .Ed .Sh SEE ALSO .Xr ctl 4 , .Xr ctladm 8 , .Xr ctld 8 , .Xr zfs 8 .Sh AUTHORS The .Nm configuration file functionality for .Xr ctld 8 was developed by .An Edward Tomasz Napierala Aq Mt trasz@FreeBSD.org under sponsorship from the FreeBSD Foundation. Index: user/alc/PQ_LAUNDRY/usr.sbin/ctld/ctld.8 =================================================================== --- user/alc/PQ_LAUNDRY/usr.sbin/ctld/ctld.8 (revision 303205) +++ user/alc/PQ_LAUNDRY/usr.sbin/ctld/ctld.8 (revision 303206) @@ -1,119 +1,122 @@ .\" Copyright (c) 2012 The FreeBSD Foundation .\" All rights reserved. .\" .\" This software was developed by Edward Tomasz Napierala under sponsorship .\" from the FreeBSD Foundation. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd May 22, 2015 +.Dd July 21, 2016 .Dt CTLD 8 .Os .Sh NAME .Nm ctld .Nd CAM Target Layer / iSCSI target daemon .Sh SYNOPSIS .Nm .Op Fl d .Op Fl f Ar config-file +.Op Fl u .Sh DESCRIPTION The .Nm daemon is responsible for managing the CAM Target Layer configuration, accepting incoming iSCSI connections, performing authentication and passing connections to the kernel part of the native iSCSI target. .Pp Upon startup, the .Nm daemon parses the configuration file and exits, if it encounters any errors. Then it compares the configuration with the kernel list of LUNs managed by previously running .Nm instances, removes LUNs no longer existing in the configuration file, and creates new LUNs as necessary. After that it listens for the incoming iSCSI connections, performs authentication, and, if successful, passes the connections to the kernel part of CTL iSCSI target, which handles it from that point. .Pp When it receives a SIGHUP signal, the .Nm reloads its configuration and applies the changes to the kernel. Changes are applied in a way that avoids unnecessary disruptions; for example removing one LUN does not affect other LUNs. .Pp When exiting gracefully, the .Nm daemon removes LUNs it managed and forcibly disconnects all the clients. Otherwise - for example, when killed with SIGKILL - LUNs stay configured and clients remain connected. .Pp To perform administrative actions that apply to already connected sessions, such as forcing termination, use .Xr ctladm 8 . .Pp The following options are available: .Bl -tag -width ".Fl P Ar pidfile" .It Fl f Ar config-file Specifies the name of the configuration file. The default is .Pa /etc/ctl.conf . .It Fl d Debug mode. The daemon sends verbose debug output to standard error, and does not put itself in the background. The daemon will also not fork and will exit after processing one connection. This option is only intended for debugging the target. +.It Fl u +Use UCL configuration file format instead of the traditional non-UCL format. .El .Sh FILES .Bl -tag -width ".Pa /var/run/ctld.pid" -compact .It Pa /etc/ctl.conf The configuration file for .Nm . The file format and configuration options are described in .Xr ctl.conf 5 . .It Pa /var/run/ctld.pid The default location of the .Nm PID file. .El .Sh EXIT STATUS The .Nm utility exits 0 on success, and >0 if an error occurs. .Sh SEE ALSO .Xr ctl 4 , .Xr ctl.conf 5 , .Xr ctladm 8 , .Xr ctlstat 8 .Sh HISTORY The .Nm command appeared in .Fx 10.0 . .Sh AUTHORS The .Nm was developed by .An Edward Tomasz Napierala Aq Mt trasz@FreeBSD.org under sponsorship from the FreeBSD Foundation. Index: user/alc/PQ_LAUNDRY/usr.sbin/ctld/login.c =================================================================== --- user/alc/PQ_LAUNDRY/usr.sbin/ctld/login.c (revision 303205) +++ user/alc/PQ_LAUNDRY/usr.sbin/ctld/login.c (revision 303206) @@ -1,1001 +1,1001 @@ /*- * Copyright (c) 2012 The FreeBSD Foundation * All rights reserved. * * This software was developed by Edward Tomasz Napierala under sponsorship * from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include "ctld.h" #include "iscsi_proto.h" static void login_send_error(struct pdu *request, char class, char detail); static void login_set_nsg(struct pdu *response, int nsg) { struct iscsi_bhs_login_response *bhslr; assert(nsg == BHSLR_STAGE_SECURITY_NEGOTIATION || nsg == BHSLR_STAGE_OPERATIONAL_NEGOTIATION || nsg == BHSLR_STAGE_FULL_FEATURE_PHASE); bhslr = (struct iscsi_bhs_login_response *)response->pdu_bhs; bhslr->bhslr_flags &= 0xFC; bhslr->bhslr_flags |= nsg; bhslr->bhslr_flags |= BHSLR_FLAGS_TRANSIT; } static int login_csg(const struct pdu *request) { struct iscsi_bhs_login_request *bhslr; bhslr = (struct iscsi_bhs_login_request *)request->pdu_bhs; return ((bhslr->bhslr_flags & 0x0C) >> 2); } static void login_set_csg(struct pdu *response, int csg) { struct iscsi_bhs_login_response *bhslr; assert(csg == BHSLR_STAGE_SECURITY_NEGOTIATION || csg == BHSLR_STAGE_OPERATIONAL_NEGOTIATION || csg == BHSLR_STAGE_FULL_FEATURE_PHASE); bhslr = (struct iscsi_bhs_login_response *)response->pdu_bhs; bhslr->bhslr_flags &= 0xF3; bhslr->bhslr_flags |= csg << 2; } static struct pdu * login_receive(struct connection *conn, bool initial) { struct pdu *request; struct iscsi_bhs_login_request *bhslr; request = pdu_new(conn); pdu_receive(request); if ((request->pdu_bhs->bhs_opcode & ~ISCSI_BHS_OPCODE_IMMEDIATE) != ISCSI_BHS_OPCODE_LOGIN_REQUEST) { /* * The first PDU in session is special - if we receive any PDU * different than login request, we have to drop the connection * without sending response ("A target receiving any PDU * except a Login request before the Login Phase is started MUST * immediately terminate the connection on which the PDU * was received.") */ if (initial == false) login_send_error(request, 0x02, 0x0b); log_errx(1, "protocol error: received invalid opcode 0x%x", request->pdu_bhs->bhs_opcode); } bhslr = (struct iscsi_bhs_login_request *)request->pdu_bhs; /* * XXX: Implement the C flag some day. */ if ((bhslr->bhslr_flags & BHSLR_FLAGS_CONTINUE) != 0) { login_send_error(request, 0x03, 0x00); log_errx(1, "received Login PDU with unsupported \"C\" flag"); } if (bhslr->bhslr_version_max != 0x00) { login_send_error(request, 0x02, 0x05); log_errx(1, "received Login PDU with unsupported " "Version-max 0x%x", bhslr->bhslr_version_max); } if (bhslr->bhslr_version_min != 0x00) { login_send_error(request, 0x02, 0x05); log_errx(1, "received Login PDU with unsupported " "Version-min 0x%x", bhslr->bhslr_version_min); } if (initial == false && ISCSI_SNLT(ntohl(bhslr->bhslr_cmdsn), conn->conn_cmdsn)) { login_send_error(request, 0x02, 0x00); log_errx(1, "received Login PDU with decreasing CmdSN: " "was %u, is %u", conn->conn_cmdsn, ntohl(bhslr->bhslr_cmdsn)); } if (initial == false && ntohl(bhslr->bhslr_expstatsn) != conn->conn_statsn) { login_send_error(request, 0x02, 0x00); log_errx(1, "received Login PDU with wrong ExpStatSN: " "is %u, should be %u", ntohl(bhslr->bhslr_expstatsn), conn->conn_statsn); } conn->conn_cmdsn = ntohl(bhslr->bhslr_cmdsn); return (request); } static struct pdu * login_new_response(struct pdu *request) { struct pdu *response; struct connection *conn; struct iscsi_bhs_login_request *bhslr; struct iscsi_bhs_login_response *bhslr2; bhslr = (struct iscsi_bhs_login_request *)request->pdu_bhs; conn = request->pdu_connection; response = pdu_new_response(request); bhslr2 = (struct iscsi_bhs_login_response *)response->pdu_bhs; bhslr2->bhslr_opcode = ISCSI_BHS_OPCODE_LOGIN_RESPONSE; login_set_csg(response, BHSLR_STAGE_SECURITY_NEGOTIATION); memcpy(bhslr2->bhslr_isid, bhslr->bhslr_isid, sizeof(bhslr2->bhslr_isid)); bhslr2->bhslr_initiator_task_tag = bhslr->bhslr_initiator_task_tag; bhslr2->bhslr_statsn = htonl(conn->conn_statsn++); bhslr2->bhslr_expcmdsn = htonl(conn->conn_cmdsn); bhslr2->bhslr_maxcmdsn = htonl(conn->conn_cmdsn); return (response); } static void login_send_error(struct pdu *request, char class, char detail) { struct pdu *response; struct iscsi_bhs_login_response *bhslr2; log_debugx("sending Login Response PDU with failure class 0x%x/0x%x; " "see next line for reason", class, detail); response = login_new_response(request); bhslr2 = (struct iscsi_bhs_login_response *)response->pdu_bhs; bhslr2->bhslr_status_class = class; bhslr2->bhslr_status_detail = detail; pdu_send(response); pdu_delete(response); } static int login_list_contains(const char *list, const char *what) { char *tofree, *str, *token; tofree = str = checked_strdup(list); while ((token = strsep(&str, ",")) != NULL) { if (strcmp(token, what) == 0) { free(tofree); return (1); } } free(tofree); return (0); } static int login_list_prefers(const char *list, const char *choice1, const char *choice2) { char *tofree, *str, *token; tofree = str = checked_strdup(list); while ((token = strsep(&str, ",")) != NULL) { if (strcmp(token, choice1) == 0) { free(tofree); return (1); } if (strcmp(token, choice2) == 0) { free(tofree); return (2); } } free(tofree); return (-1); } static struct pdu * login_receive_chap_a(struct connection *conn) { struct pdu *request; struct keys *request_keys; const char *chap_a; request = login_receive(conn, false); request_keys = keys_new(); keys_load(request_keys, request); chap_a = keys_find(request_keys, "CHAP_A"); if (chap_a == NULL) { login_send_error(request, 0x02, 0x07); log_errx(1, "received CHAP Login PDU without CHAP_A"); } if (login_list_contains(chap_a, "5") == 0) { login_send_error(request, 0x02, 0x01); log_errx(1, "received CHAP Login PDU with unsupported CHAP_A " "\"%s\"", chap_a); } keys_delete(request_keys); return (request); } static void login_send_chap_c(struct pdu *request, struct chap *chap) { struct pdu *response; struct keys *response_keys; char *chap_c, *chap_i; chap_c = chap_get_challenge(chap); chap_i = chap_get_id(chap); response = login_new_response(request); response_keys = keys_new(); keys_add(response_keys, "CHAP_A", "5"); keys_add(response_keys, "CHAP_I", chap_i); keys_add(response_keys, "CHAP_C", chap_c); free(chap_i); free(chap_c); keys_save(response_keys, response); pdu_send(response); pdu_delete(response); keys_delete(response_keys); } static struct pdu * login_receive_chap_r(struct connection *conn, struct auth_group *ag, struct chap *chap, const struct auth **authp) { struct pdu *request; struct keys *request_keys; const char *chap_n, *chap_r; const struct auth *auth; int error; request = login_receive(conn, false); request_keys = keys_new(); keys_load(request_keys, request); chap_n = keys_find(request_keys, "CHAP_N"); if (chap_n == NULL) { login_send_error(request, 0x02, 0x07); log_errx(1, "received CHAP Login PDU without CHAP_N"); } chap_r = keys_find(request_keys, "CHAP_R"); if (chap_r == NULL) { login_send_error(request, 0x02, 0x07); log_errx(1, "received CHAP Login PDU without CHAP_R"); } error = chap_receive(chap, chap_r); if (error != 0) { login_send_error(request, 0x02, 0x07); log_errx(1, "received CHAP Login PDU with malformed CHAP_R"); } /* * Verify the response. */ assert(ag->ag_type == AG_TYPE_CHAP || ag->ag_type == AG_TYPE_CHAP_MUTUAL); auth = auth_find(ag, chap_n); if (auth == NULL) { login_send_error(request, 0x02, 0x01); log_errx(1, "received CHAP Login with invalid user \"%s\"", chap_n); } assert(auth->a_secret != NULL); assert(strlen(auth->a_secret) > 0); error = chap_authenticate(chap, auth->a_secret); if (error != 0) { login_send_error(request, 0x02, 0x01); log_errx(1, "CHAP authentication failed for user \"%s\"", auth->a_user); } keys_delete(request_keys); *authp = auth; return (request); } static void login_send_chap_success(struct pdu *request, const struct auth *auth) { struct pdu *response; struct keys *request_keys, *response_keys; struct rchap *rchap; const char *chap_i, *chap_c; char *chap_r; int error; response = login_new_response(request); login_set_nsg(response, BHSLR_STAGE_OPERATIONAL_NEGOTIATION); /* * Actually, one more thing: mutual authentication. */ request_keys = keys_new(); keys_load(request_keys, request); chap_i = keys_find(request_keys, "CHAP_I"); chap_c = keys_find(request_keys, "CHAP_C"); if (chap_i != NULL || chap_c != NULL) { if (chap_i == NULL) { login_send_error(request, 0x02, 0x07); log_errx(1, "initiator requested target " "authentication, but didn't send CHAP_I"); } if (chap_c == NULL) { login_send_error(request, 0x02, 0x07); log_errx(1, "initiator requested target " "authentication, but didn't send CHAP_C"); } if (auth->a_auth_group->ag_type != AG_TYPE_CHAP_MUTUAL) { login_send_error(request, 0x02, 0x01); log_errx(1, "initiator requests target authentication " "for user \"%s\", but mutual user/secret " "is not set", auth->a_user); } log_debugx("performing mutual authentication as user \"%s\"", auth->a_mutual_user); rchap = rchap_new(auth->a_mutual_secret); error = rchap_receive(rchap, chap_i, chap_c); if (error != 0) { login_send_error(request, 0x02, 0x07); log_errx(1, "received CHAP Login PDU with malformed " "CHAP_I or CHAP_C"); } chap_r = rchap_get_response(rchap); rchap_delete(rchap); response_keys = keys_new(); keys_add(response_keys, "CHAP_N", auth->a_mutual_user); keys_add(response_keys, "CHAP_R", chap_r); free(chap_r); keys_save(response_keys, response); keys_delete(response_keys); } else { log_debugx("initiator did not request target authentication"); } keys_delete(request_keys); pdu_send(response); pdu_delete(response); } static void login_chap(struct connection *conn, struct auth_group *ag) { const struct auth *auth; struct chap *chap; struct pdu *request; /* * Receive CHAP_A PDU. */ log_debugx("beginning CHAP authentication; waiting for CHAP_A"); request = login_receive_chap_a(conn); /* * Generate the challenge. */ chap = chap_new(); /* * Send the challenge. */ log_debugx("sending CHAP_C, binary challenge size is %zd bytes", sizeof(chap->chap_challenge)); login_send_chap_c(request, chap); pdu_delete(request); /* * Receive CHAP_N/CHAP_R PDU and authenticate. */ log_debugx("waiting for CHAP_N/CHAP_R"); request = login_receive_chap_r(conn, ag, chap, &auth); /* * Yay, authentication succeeded! */ log_debugx("authentication succeeded for user \"%s\"; " "transitioning to Negotiation Phase", auth->a_user); login_send_chap_success(request, auth); pdu_delete(request); /* * Leave username and CHAP information for discovery(). */ conn->conn_user = auth->a_user; conn->conn_chap = chap; } static void login_negotiate_key(struct pdu *request, const char *name, const char *value, bool skipped_security, struct keys *response_keys) { int which; size_t tmp; struct connection *conn; conn = request->pdu_connection; if (strcmp(name, "InitiatorName") == 0) { if (!skipped_security) log_errx(1, "initiator resent InitiatorName"); } else if (strcmp(name, "SessionType") == 0) { if (!skipped_security) log_errx(1, "initiator resent SessionType"); } else if (strcmp(name, "TargetName") == 0) { if (!skipped_security) log_errx(1, "initiator resent TargetName"); } else if (strcmp(name, "InitiatorAlias") == 0) { if (conn->conn_initiator_alias != NULL) free(conn->conn_initiator_alias); conn->conn_initiator_alias = checked_strdup(value); } else if (strcmp(value, "Irrelevant") == 0) { /* Ignore. */ } else if (strcmp(name, "HeaderDigest") == 0) { /* * We don't handle digests for discovery sessions. */ if (conn->conn_session_type == CONN_SESSION_TYPE_DISCOVERY) { log_debugx("discovery session; digests disabled"); keys_add(response_keys, name, "None"); return; } which = login_list_prefers(value, "CRC32C", "None"); switch (which) { case 1: log_debugx("initiator prefers CRC32C " "for header digest; we'll use it"); conn->conn_header_digest = CONN_DIGEST_CRC32C; keys_add(response_keys, name, "CRC32C"); break; case 2: log_debugx("initiator prefers not to do " "header digest; we'll comply"); keys_add(response_keys, name, "None"); break; default: log_warnx("initiator sent unrecognized " "HeaderDigest value \"%s\"; will use None", value); keys_add(response_keys, name, "None"); break; } } else if (strcmp(name, "DataDigest") == 0) { if (conn->conn_session_type == CONN_SESSION_TYPE_DISCOVERY) { log_debugx("discovery session; digests disabled"); keys_add(response_keys, name, "None"); return; } which = login_list_prefers(value, "CRC32C", "None"); switch (which) { case 1: log_debugx("initiator prefers CRC32C " "for data digest; we'll use it"); conn->conn_data_digest = CONN_DIGEST_CRC32C; keys_add(response_keys, name, "CRC32C"); break; case 2: log_debugx("initiator prefers not to do " "data digest; we'll comply"); keys_add(response_keys, name, "None"); break; default: log_warnx("initiator sent unrecognized " "DataDigest value \"%s\"; will use None", value); keys_add(response_keys, name, "None"); break; } } else if (strcmp(name, "MaxConnections") == 0) { keys_add(response_keys, name, "1"); } else if (strcmp(name, "InitialR2T") == 0) { keys_add(response_keys, name, "Yes"); } else if (strcmp(name, "ImmediateData") == 0) { if (conn->conn_session_type == CONN_SESSION_TYPE_DISCOVERY) { log_debugx("discovery session; ImmediateData irrelevant"); keys_add(response_keys, name, "Irrelevant"); } else { if (strcmp(value, "Yes") == 0) { conn->conn_immediate_data = true; keys_add(response_keys, name, "Yes"); } else { conn->conn_immediate_data = false; keys_add(response_keys, name, "No"); } } } else if (strcmp(name, "MaxRecvDataSegmentLength") == 0) { tmp = strtoul(value, NULL, 10); if (tmp <= 0) { login_send_error(request, 0x02, 0x00); log_errx(1, "received invalid " "MaxRecvDataSegmentLength"); } if (tmp > conn->conn_data_segment_limit) { log_debugx("capping MaxRecvDataSegmentLength " "from %zd to %zd", tmp, conn->conn_data_segment_limit); tmp = conn->conn_data_segment_limit; } conn->conn_max_data_segment_length = tmp; keys_add_int(response_keys, name, conn->conn_data_segment_limit); } else if (strcmp(name, "MaxBurstLength") == 0) { tmp = strtoul(value, NULL, 10); if (tmp <= 0) { login_send_error(request, 0x02, 0x00); log_errx(1, "received invalid MaxBurstLength"); } if (tmp > MAX_BURST_LENGTH) { log_debugx("capping MaxBurstLength from %zd to %d", tmp, MAX_BURST_LENGTH); tmp = MAX_BURST_LENGTH; } conn->conn_max_burst_length = tmp; - keys_add(response_keys, name, value); + keys_add_int(response_keys, name, tmp); } else if (strcmp(name, "FirstBurstLength") == 0) { tmp = strtoul(value, NULL, 10); if (tmp <= 0) { login_send_error(request, 0x02, 0x00); log_errx(1, "received invalid FirstBurstLength"); } if (tmp > FIRST_BURST_LENGTH) { log_debugx("capping FirstBurstLength from %zd to %d", tmp, FIRST_BURST_LENGTH); tmp = FIRST_BURST_LENGTH; } conn->conn_first_burst_length = tmp; keys_add_int(response_keys, name, tmp); } else if (strcmp(name, "DefaultTime2Wait") == 0) { keys_add(response_keys, name, value); } else if (strcmp(name, "DefaultTime2Retain") == 0) { keys_add(response_keys, name, "0"); } else if (strcmp(name, "MaxOutstandingR2T") == 0) { keys_add(response_keys, name, "1"); } else if (strcmp(name, "DataPDUInOrder") == 0) { keys_add(response_keys, name, "Yes"); } else if (strcmp(name, "DataSequenceInOrder") == 0) { keys_add(response_keys, name, "Yes"); } else if (strcmp(name, "ErrorRecoveryLevel") == 0) { keys_add(response_keys, name, "0"); } else if (strcmp(name, "OFMarker") == 0) { keys_add(response_keys, name, "No"); } else if (strcmp(name, "IFMarker") == 0) { keys_add(response_keys, name, "No"); } else if (strcmp(name, "iSCSIProtocolLevel") == 0) { tmp = strtoul(value, NULL, 10); if (tmp > 2) tmp = 2; keys_add_int(response_keys, name, tmp); } else { log_debugx("unknown key \"%s\"; responding " "with NotUnderstood", name); keys_add(response_keys, name, "NotUnderstood"); } } static void login_redirect(struct pdu *request, const char *target_address) { struct pdu *response; struct iscsi_bhs_login_response *bhslr2; struct keys *response_keys; response = login_new_response(request); login_set_csg(response, login_csg(request)); bhslr2 = (struct iscsi_bhs_login_response *)response->pdu_bhs; bhslr2->bhslr_status_class = 0x01; bhslr2->bhslr_status_detail = 0x01; response_keys = keys_new(); keys_add(response_keys, "TargetAddress", target_address); keys_save(response_keys, response); pdu_send(response); pdu_delete(response); keys_delete(response_keys); } static bool login_portal_redirect(struct connection *conn, struct pdu *request) { const struct portal_group *pg; pg = conn->conn_portal->p_portal_group; if (pg->pg_redirection == NULL) return (false); log_debugx("portal-group \"%s\" configured to redirect to %s", pg->pg_name, pg->pg_redirection); login_redirect(request, pg->pg_redirection); return (true); } static bool login_target_redirect(struct connection *conn, struct pdu *request) { const char *target_address; assert(conn->conn_portal->p_portal_group->pg_redirection == NULL); if (conn->conn_target == NULL) return (false); target_address = conn->conn_target->t_redirection; if (target_address == NULL) return (false); log_debugx("target \"%s\" configured to redirect to %s", conn->conn_target->t_name, target_address); login_redirect(request, target_address); return (true); } static void login_negotiate(struct connection *conn, struct pdu *request) { struct pdu *response; struct iscsi_bhs_login_response *bhslr2; struct keys *request_keys, *response_keys; int i; bool redirected, skipped_security; if (conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) { /* * Query the kernel for MaxDataSegmentLength it can handle. * In case of offload, it depends on hardware capabilities. */ assert(conn->conn_target != NULL); kernel_limits(conn->conn_portal->p_portal_group->pg_offload, &conn->conn_data_segment_limit); } else { conn->conn_data_segment_limit = MAX_DATA_SEGMENT_LENGTH; } if (request == NULL) { log_debugx("beginning operational parameter negotiation; " "waiting for Login PDU"); request = login_receive(conn, false); skipped_security = false; } else skipped_security = true; /* * RFC 3720, 10.13.5. Status-Class and Status-Detail, says * the redirection SHOULD be accepted by the initiator before * authentication, but MUST be be accepted afterwards; that's * why we're doing it here and not earlier. */ redirected = login_target_redirect(conn, request); if (redirected) { log_debugx("initiator redirected; exiting"); exit(0); } request_keys = keys_new(); keys_load(request_keys, request); response = login_new_response(request); bhslr2 = (struct iscsi_bhs_login_response *)response->pdu_bhs; bhslr2->bhslr_tsih = htons(0xbadd); login_set_csg(response, BHSLR_STAGE_OPERATIONAL_NEGOTIATION); login_set_nsg(response, BHSLR_STAGE_FULL_FEATURE_PHASE); response_keys = keys_new(); if (skipped_security && conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) { if (conn->conn_target->t_alias != NULL) keys_add(response_keys, "TargetAlias", conn->conn_target->t_alias); keys_add_int(response_keys, "TargetPortalGroupTag", conn->conn_portal->p_portal_group->pg_tag); } for (i = 0; i < KEYS_MAX; i++) { if (request_keys->keys_names[i] == NULL) break; login_negotiate_key(request, request_keys->keys_names[i], request_keys->keys_values[i], skipped_security, response_keys); } log_debugx("operational parameter negotiation done; " "transitioning to Full Feature Phase"); keys_save(response_keys, response); pdu_send(response); pdu_delete(response); keys_delete(response_keys); pdu_delete(request); keys_delete(request_keys); } static void login_wait_transition(struct connection *conn) { struct pdu *request, *response; struct iscsi_bhs_login_request *bhslr; log_debugx("waiting for state transition request"); request = login_receive(conn, false); bhslr = (struct iscsi_bhs_login_request *)request->pdu_bhs; if ((bhslr->bhslr_flags & BHSLR_FLAGS_TRANSIT) == 0) { login_send_error(request, 0x02, 0x00); log_errx(1, "got no \"T\" flag after answering AuthMethod"); } log_debugx("got state transition request"); response = login_new_response(request); pdu_delete(request); login_set_nsg(response, BHSLR_STAGE_OPERATIONAL_NEGOTIATION); pdu_send(response); pdu_delete(response); login_negotiate(conn, NULL); } void login(struct connection *conn) { struct pdu *request, *response; struct iscsi_bhs_login_request *bhslr; struct keys *request_keys, *response_keys; struct auth_group *ag; struct portal_group *pg; const char *initiator_name, *initiator_alias, *session_type, *target_name, *auth_method; bool redirected, fail, trans; /* * Handle the initial Login Request - figure out required authentication * method and either transition to the next phase, if no authentication * is required, or call appropriate authentication code. */ log_debugx("beginning Login Phase; waiting for Login PDU"); request = login_receive(conn, true); bhslr = (struct iscsi_bhs_login_request *)request->pdu_bhs; if (bhslr->bhslr_tsih != 0) { login_send_error(request, 0x02, 0x0a); log_errx(1, "received Login PDU with non-zero TSIH"); } pg = conn->conn_portal->p_portal_group; memcpy(conn->conn_initiator_isid, bhslr->bhslr_isid, sizeof(conn->conn_initiator_isid)); /* * XXX: Implement the C flag some day. */ request_keys = keys_new(); keys_load(request_keys, request); assert(conn->conn_initiator_name == NULL); initiator_name = keys_find(request_keys, "InitiatorName"); if (initiator_name == NULL) { login_send_error(request, 0x02, 0x07); log_errx(1, "received Login PDU without InitiatorName"); } if (valid_iscsi_name(initiator_name) == false) { login_send_error(request, 0x02, 0x00); log_errx(1, "received Login PDU with invalid InitiatorName"); } conn->conn_initiator_name = checked_strdup(initiator_name); log_set_peer_name(conn->conn_initiator_name); setproctitle("%s (%s)", conn->conn_initiator_addr, conn->conn_initiator_name); redirected = login_portal_redirect(conn, request); if (redirected) { log_debugx("initiator redirected; exiting"); exit(0); } initiator_alias = keys_find(request_keys, "InitiatorAlias"); if (initiator_alias != NULL) conn->conn_initiator_alias = checked_strdup(initiator_alias); assert(conn->conn_session_type == CONN_SESSION_TYPE_NONE); session_type = keys_find(request_keys, "SessionType"); if (session_type != NULL) { if (strcmp(session_type, "Normal") == 0) { conn->conn_session_type = CONN_SESSION_TYPE_NORMAL; } else if (strcmp(session_type, "Discovery") == 0) { conn->conn_session_type = CONN_SESSION_TYPE_DISCOVERY; } else { login_send_error(request, 0x02, 0x00); log_errx(1, "received Login PDU with invalid " "SessionType \"%s\"", session_type); } } else conn->conn_session_type = CONN_SESSION_TYPE_NORMAL; assert(conn->conn_target == NULL); if (conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) { target_name = keys_find(request_keys, "TargetName"); if (target_name == NULL) { login_send_error(request, 0x02, 0x07); log_errx(1, "received Login PDU without TargetName"); } conn->conn_port = port_find_in_pg(pg, target_name); if (conn->conn_port == NULL) { login_send_error(request, 0x02, 0x03); log_errx(1, "requested target \"%s\" not found", target_name); } conn->conn_target = conn->conn_port->p_target; } /* * At this point we know what kind of authentication we need. */ if (conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) { ag = conn->conn_port->p_auth_group; if (ag == NULL) ag = conn->conn_target->t_auth_group; if (ag->ag_name != NULL) { log_debugx("initiator requests to connect " "to target \"%s\"; auth-group \"%s\"", conn->conn_target->t_name, ag->ag_name); } else { log_debugx("initiator requests to connect " "to target \"%s\"", conn->conn_target->t_name); } } else { assert(conn->conn_session_type == CONN_SESSION_TYPE_DISCOVERY); ag = pg->pg_discovery_auth_group; if (ag->ag_name != NULL) { log_debugx("initiator requests " "discovery session; auth-group \"%s\"", ag->ag_name); } else { log_debugx("initiator requests discovery session"); } } if (ag->ag_type == AG_TYPE_DENY) { login_send_error(request, 0x02, 0x01); log_errx(1, "auth-type is \"deny\""); } if (ag->ag_type == AG_TYPE_UNKNOWN) { /* * This can happen with empty auth-group. */ login_send_error(request, 0x02, 0x01); log_errx(1, "auth-type not set, denying access"); } /* * Enforce initiator-name and initiator-portal. */ if (auth_name_check(ag, initiator_name) != 0) { login_send_error(request, 0x02, 0x02); log_errx(1, "initiator does not match allowed initiator names"); } if (auth_portal_check(ag, &conn->conn_initiator_sa) != 0) { login_send_error(request, 0x02, 0x02); log_errx(1, "initiator does not match allowed " "initiator portals"); } /* * Let's see if the initiator intends to do any kind of authentication * at all. */ if (login_csg(request) == BHSLR_STAGE_OPERATIONAL_NEGOTIATION) { if (ag->ag_type != AG_TYPE_NO_AUTHENTICATION) { login_send_error(request, 0x02, 0x01); log_errx(1, "initiator skipped the authentication, " "but authentication is required"); } keys_delete(request_keys); log_debugx("initiator skipped the authentication, " "and we don't need it; proceeding with negotiation"); login_negotiate(conn, request); return; } fail = false; response = login_new_response(request); response_keys = keys_new(); trans = (bhslr->bhslr_flags & BHSLR_FLAGS_TRANSIT) != 0; auth_method = keys_find(request_keys, "AuthMethod"); if (ag->ag_type == AG_TYPE_NO_AUTHENTICATION) { log_debugx("authentication not required"); if (auth_method == NULL || login_list_contains(auth_method, "None")) { keys_add(response_keys, "AuthMethod", "None"); } else { log_warnx("initiator requests " "AuthMethod \"%s\" instead of \"None\"", auth_method); keys_add(response_keys, "AuthMethod", "Reject"); } if (trans) login_set_nsg(response, BHSLR_STAGE_OPERATIONAL_NEGOTIATION); } else { log_debugx("CHAP authentication required"); if (auth_method == NULL || login_list_contains(auth_method, "CHAP")) { keys_add(response_keys, "AuthMethod", "CHAP"); } else { log_warnx("initiator requests unsupported " "AuthMethod \"%s\" instead of \"CHAP\"", auth_method); keys_add(response_keys, "AuthMethod", "Reject"); fail = true; } } if (conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) { if (conn->conn_target->t_alias != NULL) keys_add(response_keys, "TargetAlias", conn->conn_target->t_alias); keys_add_int(response_keys, "TargetPortalGroupTag", pg->pg_tag); } keys_save(response_keys, response); pdu_send(response); pdu_delete(response); keys_delete(response_keys); pdu_delete(request); keys_delete(request_keys); if (fail) { log_debugx("sent reject for AuthMethod; exiting"); exit(1); } if (ag->ag_type != AG_TYPE_NO_AUTHENTICATION) { login_chap(conn, ag); login_negotiate(conn, NULL); } else if (trans) { login_negotiate(conn, NULL); } else { login_wait_transition(conn); } } Index: user/alc/PQ_LAUNDRY =================================================================== --- user/alc/PQ_LAUNDRY (revision 303205) +++ user/alc/PQ_LAUNDRY (revision 303206) Property changes on: user/alc/PQ_LAUNDRY ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head:r303053-303204