Index: user/ngie/stable-10-libnv/share/man/man9/pci.9 =================================================================== --- user/ngie/stable-10-libnv/share/man/man9/pci.9 (revision 293096) +++ user/ngie/stable-10-libnv/share/man/man9/pci.9 (revision 293097) @@ -1,854 +1,854 @@ .\" .\" Copyright (c) 2005 Bruce M Simpson .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd January 3, 2015 .Dt PCI 9 .Os .Sh NAME .Nm pci , .Nm pci_alloc_msi , .Nm pci_alloc_msix , .Nm pci_disable_busmaster , .Nm pci_disable_io , .Nm pci_enable_busmaster , .Nm pci_enable_io , .Nm pci_find_bsf , .Nm pci_find_cap , .Nm pci_find_dbsf , .Nm pci_find_device , .Nm pci_find_extcap , .Nm pci_find_htcap , .Nm pci_find_pcie_root_port , .Nm pci_get_max_read_req , .Nm pci_get_powerstate , .Nm pci_get_vpd_ident , .Nm pci_get_vpd_readonly , .Nm pci_iov_attach , .Nm pci_iov_detach , .Nm pci_msi_count , .Nm pci_msix_count , .Nm pci_pending_msix , .Nm pci_read_config , .Nm pci_release_msi , .Nm pci_remap_msix , .Nm pci_restore_state , .Nm pci_save_state , .Nm pci_set_max_read_req , .Nm pci_set_powerstate , .Nm pci_write_config , .Nm pcie_adjust_config , .Nm pcie_read_config , .Nm pcie_write_config .Nd PCI bus interface .Sh SYNOPSIS .In sys/bus.h .In dev/pci/pcireg.h .In dev/pci/pcivar.h .Ft int .Fn pci_alloc_msi "device_t dev" "int *count" .Ft int .Fn pci_alloc_msix "device_t dev" "int *count" .Ft int .Fn pci_disable_busmaster "device_t dev" .Ft int .Fn pci_disable_io "device_t dev" "int space" .Ft int .Fn pci_enable_busmaster "device_t dev" .Ft int .Fn pci_enable_io "device_t dev" "int space" .Ft device_t .Fn pci_find_bsf "uint8_t bus" "uint8_t slot" "uint8_t func" .Ft int .Fn pci_find_cap "device_t dev" "int capability" "int *capreg" .Ft device_t .Fn pci_find_dbsf "uint32_t domain" "uint8_t bus" "uint8_t slot" "uint8_t func" .Ft device_t .Fn pci_find_device "uint16_t vendor" "uint16_t device" .Ft int .Fn pci_find_extcap "device_t dev" "int capability" "int *capreg" .Ft int .Fn pci_find_htcap "device_t dev" "int capability" "int *capreg" .Ft device_t .Fn pci_find_pcie_root_port "device_t dev" .Ft int .Fn pci_get_max_read_req "device_t dev" .Ft int .Fn pci_get_powerstate "device_t dev" .Ft int .Fn pci_get_vpd_ident "device_t dev" "const char **identptr" .Ft int .Fn pci_get_vpd_readonly "device_t dev" "const char *kw" "const char **vptr" .Ft int .Fn pci_msi_count "device_t dev" .Ft int .Fn pci_msix_count "device_t dev" .Ft int .Fn pci_pending_msix "device_t dev" "u_int index" .Ft uint32_t .Fn pci_read_config "device_t dev" "int reg" "int width" .Ft int .Fn pci_release_msi "device_t dev" .Ft int .Fn pci_remap_msix "device_t dev" "int count" "const u_int *vectors" .Ft void .Fn pci_restore_state "device_t dev" .Ft void .Fn pci_save_state "device_t dev" .Ft int .Fn pci_set_max_read_req "device_t dev" "int size" .Ft int .Fn pci_set_powerstate "device_t dev" "int state" .Ft void .Fn pci_write_config "device_t dev" "int reg" "uint32_t val" "int width" .Ft uint32_t .Fo pcie_adjust_config .Fa "device_t dev" .Fa "int reg" .Fa "uint32_t mask" .Fa "uint32_t val" .Fa "int width" .Fc .Ft uint32_t .Fn pcie_read_config "device_t dev" "int reg" "int width" .Ft void .Fn pcie_write_config "device_t dev" "int reg" "uint32_t val" "int width" .In dev/pci/pci_iov.h .Ft int .Fn pci_iov_attach "device_t dev" "nvlist_t *pf_schema" "nvlist_t *vf_schema" .Ft int .Fn pci_iov_detach "device_t dev" .Sh DESCRIPTION The .Nm set of functions are used for managing PCI devices. The functions are split into several groups: raw configuration access, locating devices, device information, device configuration, and message signaled interrupts. .Ss Raw Configuration Access The .Fn pci_read_config function is used to read data from the PCI configuration space of the device .Fa dev , at offset .Fa reg , with .Fa width specifying the size of the access. .Pp The .Fn pci_write_config function is used to write the value .Fa val to the PCI configuration space of the device .Fa dev , at offset .Fa reg , with .Fa width specifying the size of the access. .Pp The .Fn pcie_adjust_config function is used to modify the value of a register in the PCI-express capability register set of device .Fa dev . The offset .Fa reg specifies a relative offset in the register set with .Fa width specifying the size of the access. The new value of the register is computed by modifying bits set in .Fa mask to the value in .Fa val . Any bits not specified in .Fa mask are preserved. The previous value of the register is returned. .Pp The .Fn pcie_read_config function is used to read the value of a register in the PCI-express capability register set of device .Fa dev . The offset .Fa reg specifies a relative offset in the register set with .Fa width specifying the size of the access. .Pp The .Fn pcie_write_config function is used to write the value .Fa val to a register in the PCI-express capability register set of device .Fa dev . The offset .Fa reg specifies a relative offset in the register set with .Fa width specifying the size of the access. .Pp .Em NOTE : Device drivers should only use these functions for functionality that is not available via another .Fn pci function. .Ss Locating Devices The .Fn pci_find_bsf function looks up the .Vt device_t of a PCI device, given its .Fa bus , .Fa slot , and .Fa func . The .Fa slot number actually refers to the number of the device on the bus, which does not necessarily indicate its geographic location in terms of a physical slot. Note that in case the system has multiple PCI domains, the .Fn pci_find_bsf function only searches the first one. Actually, it is equivalent to: .Bd -literal -offset indent pci_find_dbsf(0, bus, slot, func); .Ed .Pp The .Fn pci_find_dbsf function looks up the .Vt device_t of a PCI device, given its .Fa domain , .Fa bus , .Fa slot , and .Fa func . The .Fa slot number actually refers to the number of the device on the bus, which does not necessarily indicate its geographic location in terms of a physical slot. .Pp The .Fn pci_find_device function looks up the .Vt device_t of a PCI device, given its .Fa vendor and .Fa device IDs. Note that there can be multiple matches for this search; this function only returns the first matching device. .Ss Device Information The .Fn pci_find_cap function is used to locate the first instance of a PCI capability register set for the device .Fa dev . The capability to locate is specified by ID via .Fa capability . Constant macros of the form .Dv PCIY_xxx for standard capability IDs are defined in .In dev/pci/pcireg.h . If the capability is found, then .Fa *capreg is set to the offset in configuration space of the capability register set, and .Fn pci_find_cap returns zero. If the capability is not found or the device does not support capabilities, .Fn pci_find_cap returns an error. .Pp The .Fn pci_find_extcap function is used to locate the first instance of a PCI-express extended capability register set for the device .Fa dev . The extended capability to locate is specified by ID via .Fa capability . Constant macros of the form .Dv PCIZ_xxx for standard extended capability IDs are defined in .In dev/pci/pcireg.h . If the extended capability is found, then .Fa *capreg is set to the offset in configuration space of the extended capability register set, and .Fn pci_find_extcap returns zero. If the extended capability is not found or the device is not a PCI-express device, .Fn pci_find_extcap returns an error. .Pp The .Fn pci_find_htcap function is used to locate the first instance of a HyperTransport capability register set for the device .Fa dev . The capability to locate is specified by type via .Fa capability . Constant macros of the form .Dv PCIM_HTCAP_xxx for standard HyperTransport capability types are defined in .In dev/pci/pcireg.h . If the capability is found, then .Fa *capreg is set to the offset in configuration space of the capability register set, and .Fn pci_find_htcap returns zero. If the capability is not found or the device is not a HyperTransport device, .Fn pci_find_htcap returns an error. .Pp The .Fn pci_find_pcie_root_port function walks up the PCI device hierarchy to locate the PCI-express root port upstream of .Fa dev . If a root port is not found, .Fn pci_find_pcie_root_port returns .Dv NULL . .Pp The .Fn pci_get_vpd_ident function is used to fetch a device's Vital Product Data .Pq VPD identifier string. If the device .Fa dev supports VPD and provides an identifier string, then .Fa *identptr is set to point at a read-only, null-terminated copy of the identifier string, and .Fn pci_get_vpd_ident returns zero. If the device does not support VPD or does not provide an identifier string, then .Fn pci_get_vpd_ident returns an error. .Pp The .Fn pci_get_vpd_readonly function is used to fetch the value of a single VPD read-only keyword for the device .Fa dev . The keyword to fetch is identified by the two character string .Fa kw . If the device supports VPD and provides a read-only value for the requested keyword, then .Fa *vptr is set to point at a read-only, null-terminated copy of the value, and .Fn pci_get_vpd_readonly returns zero. If the device does not support VPD or does not provide the requested keyword, then .Fn pci_get_vpd_readonly returns an error. .Ss Device Configuration The .Fn pci_enable_busmaster function enables PCI bus mastering for the device .Fa dev , by setting the .Dv PCIM_CMD_BUSMASTEREN bit in the .Dv PCIR_COMMAND register. The .Fn pci_disable_busmaster function clears this bit. .Pp The .Fn pci_enable_io function enables memory or I/O port address decoding for the device .Fa dev , by setting the .Dv PCIM_CMD_MEMEN or .Dv PCIM_CMD_PORTEN bit in the .Dv PCIR_COMMAND register appropriately. The .Fn pci_disable_io function clears the appropriate bit. The .Fa space argument specifies which resource is affected; this can be either .Dv SYS_RES_MEMORY or .Dv SYS_RES_IOPORT as appropriate. Device drivers should generally not use these routines directly. The PCI bus will enable decoding automatically when a .Dv SYS_RES_MEMORY or .Dv SYS_RES_IOPORT resource is activated via .Xr bus_alloc_resource 9 or .Xr bus_activate_resource 9 . .Pp The .Fn pci_get_max_read_req function returns the current maximum read request size in bytes for a PCI-express device. If the .Fa dev device is not a PCI-express device, .Fn pci_get_max_read_req returns zero. .Pp The .Fn pci_set_max_read_req sets the PCI-express maximum read request size for .Fa dev . The requested .Fa size may be adjusted, and .Fn pci_set_max_read_req returns the actual size set in bytes. If the .Fa dev device is not a PCI-express device, .Fn pci_set_max_read_req returns zero. .Pp The .Fn pci_get_powerstate function returns the current power state of the device .Fa dev . If the device does not support power management capabilities, then the default state of .Dv PCI_POWERSTATE_D0 is returned. The following power states are defined by PCI: .Bl -hang -width ".Dv PCI_POWERSTATE_UNKNOWN" .It Dv PCI_POWERSTATE_D0 State in which device is on and running. It is receiving full power from the system and delivering full functionality to the user. .It Dv PCI_POWERSTATE_D1 Class-specific low-power state in which device context may or may not be lost. Busses in this state cannot do anything to the bus, to force devices to lose context. .It Dv PCI_POWERSTATE_D2 Class-specific low-power state in which device context may or may not be lost. Attains greater power savings than .Dv PCI_POWERSTATE_D1 . Busses in this state can cause devices to lose some context. Devices .Em must be prepared for the bus to be in this state or higher. .It Dv PCI_POWERSTATE_D3 State in which the device is off and not running. Device context is lost, and power from the device can be removed. .It Dv PCI_POWERSTATE_UNKNOWN State of the device is unknown. .El .Pp The .Fn pci_set_powerstate function is used to transition the device .Fa dev to the PCI power state .Fa state . If the device does not support power management capabilities or it does not support the specific power state .Fa state , then the function will fail with .Er EOPNOTSUPP . .Pp The .Fn pci_iov_attach function is used to advertise that the given device .Pq and associated device driver supports PCI Single-Root I/O Virtualization -.Po SR-IOV Pc . +.Pq SR-IOV . A driver that supports SR-IOV must implement the .Xr PCI_IOV_INIT 9 , .Xr PCI_IOV_ADD_VF 9 and .Xr PCI_IOV_UNINIT 9 methods. This function should be called during the .Xr DEVICE_ATTACH 9 method. If this function returns an error, it is recommended that the device driver still successfully attaches, but runs with SR-IOV disabled. The .Fa pf_schema and .Fa vf_schema parameters are used to define what device-specific configuration parameters the device driver accepts when SR-IOV is enabled for the Physical Function .Pq PF and for individual Virtual Functions .Pq VFs respectively. See .Xr pci_iov_schema 9 for details on how to construct the schema. If either the .Pa pf_schema or .Pa vf_schema is invalid or specifies parameter names that conflict with parameter names that are already in use, .Fn pci_iov_attach will return an error and SR-IOV will not be available on the PF device. If a driver does not accept configuration parameters for either the PF device or the VF devices, the driver must pass an empty schema for that device. The SR-IOV infrastructure takes ownership of the .Fa pf_schema and .Fa vf_schema and is responsible for freeing them. The driver must never free the schemas itself. .Pp The .Fn pci_iov_detach function is used to advise the SR-IOV infrastructure that the driver for the given device is attempting to detach and that all SR-IOV resources for the device must be released. This function must be called during the .Xr DEVICE_DETACH 9 method if .Fn pci_iov_attach was successfully called on the device and .Fn pci_iov_detach has not subsequently been called on the device and returned no error. If this function returns an error, the .Xr DEVICE_DETACH 9 method must fail and return an error, as detaching the PF driver while VF devices are active would cause system instability. This function is safe to call and will always succeed if .Fn pci_iov_attach previously failed with an error on the given device, or if .Fn pci_iov_attach was never called on the device. .Pp The .Fn pci_save_state and .Fn pci_restore_state functions can be used by a device driver to save and restore standard PCI config registers. The .Fn pci_save_state function must be invoked while the device has valid state before .Fn pci_restore_state can be used. If the device is not in the fully-powered state .Pq Dv PCI_POWERSTATE_D0 when .Fn pci_restore_state is invoked, then the device will be transitioned to .Dv PCI_POWERSTATE_D0 before any config registers are restored. .Ss Message Signaled Interrupts Message Signaled Interrupts .Pq MSI and Enhanced Message Signaled Interrupts .Pq MSI-X are PCI capabilities that provide an alternate method for PCI devices to signal interrupts. The legacy INTx interrupt is available to PCI devices as a .Dv SYS_RES_IRQ resource with a resource ID of zero. MSI and MSI-X interrupts are available to PCI devices as one or more .Dv SYS_RES_IRQ resources with resource IDs greater than zero. A driver must ask the PCI bus to allocate MSI or MSI-X interrupts using .Fn pci_alloc_msi or .Fn pci_alloc_msix before it can use MSI or MSI-X .Dv SYS_RES_IRQ resources. A driver is not allowed to use the legacy INTx .Dv SYS_RES_IRQ resource if MSI or MSI-X interrupts have been allocated, and attempts to allocate MSI or MSI-X interrupts will fail if the driver is currently using the legacy INTx .Dv SYS_RES_IRQ resource. A driver is only allowed to use either MSI or MSI-X, but not both. .Pp The .Fn pci_msi_count function returns the maximum number of MSI messages supported by the device .Fa dev . If the device does not support MSI, then .Fn pci_msi_count returns zero. .Pp The .Fn pci_alloc_msi function attempts to allocate .Fa *count MSI messages for the device .Fa dev . The .Fn pci_alloc_msi function may allocate fewer messages than requested for various reasons including requests for more messages than the device .Fa dev supports, or if the system has a shortage of available MSI messages. On success, .Fa *count is set to the number of messages allocated and .Fn pci_alloc_msi returns zero. The .Dv SYS_RES_IRQ resources for the allocated messages will be available at consecutive resource IDs beginning with one. If .Fn pci_alloc_msi is not able to allocate any messages, it returns an error. Note that MSI only supports message counts that are powers of two; requests to allocate a non-power of two count of messages will fail. .Pp The .Fn pci_release_msi function is used to release any allocated MSI or MSI-X messages back to the system. If any MSI or MSI-X .Dv SYS_RES_IRQ resources are allocated by the driver or have a configured interrupt handler, this function will fail with .Er EBUSY . The .Fn pci_release_msi function returns zero on success and an error on failure. .Pp The .Fn pci_msix_count function returns the maximum number of MSI-X messages supported by the device .Fa dev . If the device does not support MSI-X, then .Fn pci_msix_count returns zero. .Pp The .Fn pci_alloc_msix function attempts to allocate .Fa *count MSI-X messages for the device .Fa dev . The .Fn pci_alloc_msix function may allocate fewer messages than requested for various reasons including requests for more messages than the device .Fa dev supports, or if the system has a shortage of available MSI-X messages. On success, .Fa *count is set to the number of messages allocated and .Fn pci_alloc_msix returns zero. For MSI-X messages, the resource ID for each .Dv SYS_RES_IRQ resource identifies the index in the MSI-X table of the corresponding message. A resource ID of one maps to the first index of the MSI-X table; a resource ID two identifies the second index in the table, etc. The .Fn pci_alloc_msix function assigns the .Fa *count messages allocated to the first .Fa *count table indicies. If .Fn pci_alloc_msix is not able to allocate any messages, it returns an error. Unlike MSI, MSI-X does not require message counts that are powers of two. .Pp The .Fn pci_pending_msix function examines the .Fa dev device's Pending Bit Array .Pq PBA to determine the pending status of the MSI-X message at table index .Fa index . If the indicated message is pending, this function returns a non-zero value; otherwise, it returns zero. Passing an invalid .Fa index to this function will result in undefined behavior. .Pp As mentioned in the description of .Fn pci_alloc_msix , MSI-X messages are initially assigned to the first N table entries. A driver may use a different distribution of available messages to table entries via the .Fn pci_remap_msix function. Note that this function must be called after a successful call to .Fn pci_alloc_msix but before any of the .Dv SYS_RES_IRQ resources are allocated. The .Fn pci_remap_msix function returns zero on success, or an error on failure. .Pp The .Fa vectors array should contain .Fa count message vectors. The array maps directly to the MSI-X table in that the first entry in the array specifies the message used for the first entry in the MSI-X table, the second entry in the array corresponds to the second entry in the MSI-X table, etc. The vector value in each array index can either be zero to indicate that no message should be assigned to the corresponding MSI-X table entry, or it can be a number from one to N .Po where N is the count returned from the previous call to .Fn pci_alloc_msix .Pc to indicate which of the allocated messages should be assigned to the corresponding MSI-X table entry. .Pp If .Fn pci_remap_msix succeeds, each MSI-X table entry with a non-zero vector will have an associated .Dv SYS_RES_IRQ resource whose resource ID corresponds to the table index as described above for .Fn pci_alloc_msix . MSI-X table entries that with a vector of zero will not have an associated .Dv SYS_RES_IRQ resource. Additionally, if any of the original messages allocated by .Fn pci_alloc_msix are not used in the new distribution of messages in the MSI-X table, they will be released automatically. Note that if a driver wishes to use fewer messages than were allocated by .Fn pci_alloc_msix , the driver must use a single, contiguous range of messages beginning with one in the new distribution. The .Fn pci_remap_msix function will fail if this condition is not met. .Sh IMPLEMENTATION NOTES The .Vt pci_addr_t type varies according to the size of the PCI bus address space on the target architecture. .Sh SEE ALSO .Xr pci 4 , .Xr pciconf 8 , .Xr bus_alloc_resource 9 , .Xr bus_dma 9 , .Xr bus_release_resource 9 , .Xr bus_setup_intr 9 , .Xr bus_teardown_intr 9 , .Xr devclass 9 , .Xr device 9 , .Xr driver 9 , .Xr rman 9 .Rs .%B FreeBSD Developers' Handbook .%T NewBus .%U http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/developers-handbook/ .Re .Rs .%A Shanley .%A Anderson .%B PCI System Architecture .%N 2nd Edition .%I Addison-Wesley .%O ISBN 0-201-30974-2 .Re .Sh AUTHORS This manual page was written by .An Bruce M Simpson Aq bms@FreeBSD.org and .An John Baldwin Aq jhb@FreeBSD.org . .Sh BUGS The kernel PCI code has a number of references to .Dq "slot numbers" . These do not refer to the geographic location of PCI devices, but to the device number assigned by the combination of the PCI IDSEL mechanism and the platform firmware. This should be taken note of when working with the kernel PCI code. Index: user/ngie/stable-10-libnv/share/man/man9/pci_iov_schema.9 =================================================================== --- user/ngie/stable-10-libnv/share/man/man9/pci_iov_schema.9 (revision 293096) +++ user/ngie/stable-10-libnv/share/man/man9/pci_iov_schema.9 (revision 293097) @@ -1,265 +1,265 @@ .\" .\" Copyright (c) 2014 Sandvine Inc. .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd May 28, 2015 -.Dt pci_iov_schema 9 +.Dd July 8, 2015 +.Dt PCI_IOV_SCHEMA 9 .Os .Sh NAME .Nm pci_iov_schema , .Nm pci_iov_schema_alloc_node , .Nm pci_iov_schema_add_bool , .Nm pci_iov_schema_add_string , .Nm pci_iov_schema_add_uint8 , .Nm pci_iov_schema_add_uint16 , .Nm pci_iov_schema_add_uint32 , .Nm pci_iov_schema_add_uint64 , .Nm pci_iov_schema_add_unicast_mac .Nd PCI SR-IOV config schema interface .Sh SYNOPSIS .In machine/stdarg.h .In sys/nv.h .In sys/iov_schema.h .Ft nvlist_t * .Fn pci_iov_schema_alloc_node "void" .Ft void .Fn pci_iov_schema_add_bool "nvlist_t *schema" "const char *name" \ "uint32_t flags" "int defaultVal" .Ft void .Fn pci_iov_schema_add_string "nvlist_t *schema" "const char *name" \ "uint32_t flags" "const char *defaultVal" .Ft void .Fn pci_iov_schema_add_uint8 "nvlist_t *schema" "const char *name" \ "uint32_t flags" "uint8_t defaultVal" .Ft void .Fn pci_iov_schema_add_uint16 "nvlist_t *schema" "const char *name" \ "uint32_t flags" "uint16_t defaultVal" .Ft void .Fn pci_iov_schema_add_uint32 "nvlist_t *schema" "const char *name" \ "uint32_t flags" "uint32_t defaultVal" .Ft void .Fn pci_iov_schema_add_uint64 "nvlist_t *schema" "const char *name" \ "uint32_t flags" "uint64_t defaultVal" .Ft void .Fn pci_iov_schema_add_unicast_mac "nvlist_t *schema" "const char *name" \ "uint32_t flags" "const uint8_t *defaultVal" .Sh DESCRIPTION The PCI Single-Root I/O Virtualization .Pq SR-IOV configuration schema is a data structure that describes the device-specific configuration parameters that a PF driver will accept when SR-IOV is enabled on the PF device. Each PF driver defines two schema instances: the PF schema and the VF schema. The PF schema describes configuration that applies to the PF device as a whole. The VF schema describes configuration that applies to an individual VF device. Different VF devices may have different configuration applied to them, as long as the configuration for each VF conforms to the VF schema. .Pp A PF driver builds a configuration schema by first allocating a schema node and then adding configuration parameter specifications to the schema. The configuration parameter specification consists of a name and a value type. .Pp Configuration parameter names are case-insensitive. It is an error to specify two or more configuration parameters with the same name. It is also an error to specific a configuration parameter that uses the same name as a configuration parameter used by the SR-IOV infrastructure. See .Xr iovctl.conf 5 for documentation of all configuration parameters used by the SR-IOV infrastructure. .Pp The parameter type constrains the possible values that the configuration parameter may take. .Pp A configuration parameter may be specified as a required parameter by setting the .Dv IOV_SCHEMA_REQUIRED flag in the .Pa flags argument. Required parameters must be specified by the user when SR-IOV is enabled. If the user does not specify a required parameter, the SR-IOV infrastructure will abort the request to enable SR-IOV and return an error to the user. .Pp Alternatively, a configuration parameter may be given a default value by setting the .Dv IOV_SCHEMA_HASDEFAULT flag in the .Pa flags argument. If a configuration parameter has a default value but the user has not specified a value for that parameter, then the SR-IOV infrastructure will apply .Pa defaultVal for that parameter in the configuration before passing it to the PF driver. It is an error for the value of the .Pa defaultVal parameter to not conform to the restrictions of the specified type. If this flag is not specified then the .Pa defaultVal argument is ignored. This flag is not compatible with the .Dv IOV_SCHEMA_REQUIRED flag; it is an error to specify both on the same parameter. .Pp The SR-IOV infrastructure guarantees that all configuration parameters that are either specified as required or given a default value will be present in the configuration passed to the PF driver. Configuration parameters that are neither specified as required nor given a default value are optional and may or may not be present in the configuration passed to the PF driver. .Pp It is highly recommended that a PF driver reserve the use of optional parameters for configuration that is truly optional. For example, a Network Interface PF device might have the option to encapsulate all traffic to and from a VF device in a vlan tag. The PF driver could expose that option as a "vlan" parameter accepting an integer argument specifying the vlan tag. In this case, it would be appropriate to set the "vlan" parameter as an optional parameter as it would be legitimate for a VF to be configured to have no vlan tagging enabled at all. .Pp Alternatively, if the PF device had an boolean option that controlled whether the VF was allowed to change its MAC address, it would not be appropriate to set this parameter as optional. The PF driver must either allow the MAC to change or not, so it would be more appropriate for the PF driver to document the default behaviour by specifying a default value in the schema .Po or potentially force the user to make the choice by setting the parameter to be required .Pc . .Pp Configuration parameters that have security implications must default to the most secure configuration possible. .Pp All device-specific configuration parameters must be documented in the manual page for the PF driver, or in a separate manual page that is cross-referenced from the main driver manual page. .Pp It is not necessary for a PF driver to check for failure from any of these functions. If an error occurs, it is flagged in the schema. The .Xr pci_iov_attach 9 function checks for this error and will fail to initialize SR-IOV on the PF device if an error is set in the schema. If this occurs, it is recommended that the PF driver still succeed in attaching and run with SR-IOV disabled on the device. .Pp The .Fn pci_iov_schema_alloc_node function is used to allocate an empty configuration schema. It is not necessary to check for failure from this function. The SR-IOV infrastructure will gracefully handle failure to allocate a schema and will simply not enable SR-IOV on the PF device. .Pp The .Fn pci_iov_schema_add_bool function is used to specify a configuration parameter in the given schema with the name .Pa name and having a boolean type. Boolean values can only take the value true or false (1 or 0, respectively). .Pp The .Fn pci_iov_schema_add_string function is used to specify a configuration parameter in the given schema with the name .Pa name and having a string type. String values are standard C strings. .Pp The .Fn pci_iov_schema_add_uint8 function is used to specify a configuration parameter in the given schema with the name .Pa name and having a .Vt uint8_t type. Values of type .Vt uint8_t are unsigned integers in the range 0 to 255, inclusive. .Pp The .Fn pci_iov_schema_add_uint16 function is used to specify a configuration parameter in the given schema with the name .Pa name and having a .Vt uint16_t type. Values of type .Vt uint16_t are unsigned integers in the range 0 to 65535, inclusive. .Pp The .Fn pci_iov_schema_add_uint32 function is used to specify a configuration parameter in the given schema with the name .Pa name and having a .Vt uint32_t type. Values of type .Vt uint32_t are unsigned integers in the range 0 to -.Po 2**32 - 1 Pc , +.Pq 2**32 - 1 , inclusive. .Pp The .Fn pci_iov_schema_add_uint64 function is used to specify a configuration parameter in the given schema with the name .Pa name and having a .Vt uint64_t type. Values of type .Vt uint64_t are unsigned integers in the range 0 to -.Po 2**64 - 1 Pc , +.Pq 2**64 - 1 , inclusive. .Pp The .Fn pci_iov_schema_add_unicast_mac function is used to specify a configuration parameter in the given schema with the name .Pa name and having a unicast-mac type. Values of type unicast-mac are binary values exactly 6 bytes long. The MAC address is guaranteed to not be a multicast or broadcast address. .Sh RETURN VALUES The .Fn pci_iov_schema_alloc_node function returns a pointer to the allocated schema, or NULL if a failure occurs. .Sh SEE ALSO .Xr pci 9 , .Xr PCI_IOV_ADD_VF 9 , .Xr PCI_IOV_INIT 9 .Sh AUTHORS This manual page was written by .An Ryan Stone Aq rstone@FreeBSD.org . Index: user/ngie/stable-10-libnv/sys/dev/pci/pci_iov.c =================================================================== --- user/ngie/stable-10-libnv/sys/dev/pci/pci_iov.c (revision 293096) +++ user/ngie/stable-10-libnv/sys/dev/pci/pci_iov.c (revision 293097) @@ -1,976 +1,980 @@ /*- * Copyright (c) 2013-2015 Sandvine Inc. All rights reserved. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_bus.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "pcib_if.h" static MALLOC_DEFINE(M_SRIOV, "sr_iov", "PCI SR-IOV allocations"); static d_ioctl_t pci_iov_ioctl; static struct cdevsw iov_cdevsw = { .d_version = D_VERSION, .d_name = "iov", .d_ioctl = pci_iov_ioctl }; SYSCTL_DECL(_hw_pci); /* * The maximum amount of memory we will allocate for user configuration of an * SR-IOV device. 1MB ought to be enough for anyone, but leave this * configurable just in case. */ static u_long pci_iov_max_config = 1024 * 1024; SYSCTL_ULONG(_hw_pci, OID_AUTO, iov_max_config, CTLFLAG_RWTUN, &pci_iov_max_config, 0, "Maximum allowed size of SR-IOV configuration."); #define IOV_READ(d, r, w) \ pci_read_config((d)->cfg.dev, (d)->cfg.iov->iov_pos + r, w) #define IOV_WRITE(d, r, v, w) \ pci_write_config((d)->cfg.dev, (d)->cfg.iov->iov_pos + r, v, w) static nvlist_t *pci_iov_build_schema(nvlist_t **pf_schema, nvlist_t **vf_schema); static void pci_iov_build_pf_schema(nvlist_t *schema, nvlist_t **driver_schema); static void pci_iov_build_vf_schema(nvlist_t *schema, nvlist_t **driver_schema); static nvlist_t *pci_iov_get_pf_subsystem_schema(void); static nvlist_t *pci_iov_get_vf_subsystem_schema(void); int pci_iov_attach_method(device_t bus, device_t dev, nvlist_t *pf_schema, nvlist_t *vf_schema) { device_t pcib; struct pci_devinfo *dinfo; struct pcicfg_iov *iov; nvlist_t *schema; uint32_t version; int error; int iov_pos; dinfo = device_get_ivars(dev); pcib = device_get_parent(bus); schema = NULL; error = pci_find_extcap(dev, PCIZ_SRIOV, &iov_pos); if (error != 0) return (error); version = pci_read_config(dev, iov_pos, 4); if (PCI_EXTCAP_VER(version) != 1) { if (bootverbose) device_printf(dev, "Unsupported version of SR-IOV (%d) detected\n", PCI_EXTCAP_VER(version)); return (ENXIO); } iov = malloc(sizeof(*dinfo->cfg.iov), M_SRIOV, M_WAITOK | M_ZERO); mtx_lock(&Giant); if (dinfo->cfg.iov != NULL) { error = EBUSY; goto cleanup; } iov->iov_pos = iov_pos; schema = pci_iov_build_schema(&pf_schema, &vf_schema); if (schema == NULL) { error = ENOMEM; goto cleanup; } + + error = pci_iov_validate_schema(schema); + if (error != 0) + goto cleanup; iov->iov_schema = schema; iov->iov_cdev = make_dev(&iov_cdevsw, device_get_unit(dev), UID_ROOT, GID_WHEEL, 0600, "iov/%s", device_get_nameunit(dev)); if (iov->iov_cdev == NULL) { error = ENOMEM; goto cleanup; } dinfo->cfg.iov = iov; iov->iov_cdev->si_drv1 = dinfo; mtx_unlock(&Giant); return (0); cleanup: nvlist_destroy(schema); nvlist_destroy(pf_schema); nvlist_destroy(vf_schema); free(iov, M_SRIOV); mtx_unlock(&Giant); return (error); } int pci_iov_detach_method(device_t bus, device_t dev) { struct pci_devinfo *dinfo; struct pcicfg_iov *iov; mtx_lock(&Giant); dinfo = device_get_ivars(dev); iov = dinfo->cfg.iov; if (iov == NULL) { mtx_unlock(&Giant); return (0); } if (iov->iov_num_vfs != 0 || iov->iov_flags & IOV_BUSY) { mtx_unlock(&Giant); return (EBUSY); } dinfo->cfg.iov = NULL; if (iov->iov_cdev) { destroy_dev(iov->iov_cdev); iov->iov_cdev = NULL; } nvlist_destroy(iov->iov_schema); free(iov, M_SRIOV); mtx_unlock(&Giant); return (0); } static nvlist_t * pci_iov_build_schema(nvlist_t **pf, nvlist_t **vf) { nvlist_t *schema, *pf_driver, *vf_driver; /* We always take ownership of the schemas. */ pf_driver = *pf; *pf = NULL; vf_driver = *vf; *vf = NULL; schema = pci_iov_schema_alloc_node(); if (schema == NULL) goto cleanup; pci_iov_build_pf_schema(schema, &pf_driver); pci_iov_build_vf_schema(schema, &vf_driver); if (nvlist_error(schema) != 0) goto cleanup; return (schema); cleanup: nvlist_destroy(schema); nvlist_destroy(pf_driver); nvlist_destroy(vf_driver); return (NULL); } static void pci_iov_build_pf_schema(nvlist_t *schema, nvlist_t **driver_schema) { nvlist_t *pf_schema, *iov_schema; pf_schema = pci_iov_schema_alloc_node(); if (pf_schema == NULL) { nvlist_set_error(schema, ENOMEM); return; } iov_schema = pci_iov_get_pf_subsystem_schema(); /* * Note that if either *driver_schema or iov_schema is NULL, then * nvlist_move_nvlist will put the schema in the error state and * SR-IOV will fail to initialize later, so we don't have to explicitly * handle that case. */ nvlist_move_nvlist(pf_schema, DRIVER_CONFIG_NAME, *driver_schema); nvlist_move_nvlist(pf_schema, IOV_CONFIG_NAME, iov_schema); nvlist_move_nvlist(schema, PF_CONFIG_NAME, pf_schema); *driver_schema = NULL; } static void pci_iov_build_vf_schema(nvlist_t *schema, nvlist_t **driver_schema) { nvlist_t *vf_schema, *iov_schema; vf_schema = pci_iov_schema_alloc_node(); if (vf_schema == NULL) { nvlist_set_error(schema, ENOMEM); return; } iov_schema = pci_iov_get_vf_subsystem_schema(); /* * Note that if either *driver_schema or iov_schema is NULL, then * nvlist_move_nvlist will put the schema in the error state and * SR-IOV will fail to initialize later, so we don't have to explicitly * handle that case. */ nvlist_move_nvlist(vf_schema, DRIVER_CONFIG_NAME, *driver_schema); nvlist_move_nvlist(vf_schema, IOV_CONFIG_NAME, iov_schema); nvlist_move_nvlist(schema, VF_SCHEMA_NAME, vf_schema); *driver_schema = NULL; } static nvlist_t * pci_iov_get_pf_subsystem_schema(void) { nvlist_t *pf; pf = pci_iov_schema_alloc_node(); if (pf == NULL) return (NULL); pci_iov_schema_add_uint16(pf, "num_vfs", IOV_SCHEMA_REQUIRED, -1); pci_iov_schema_add_string(pf, "device", IOV_SCHEMA_REQUIRED, NULL); return (pf); } static nvlist_t * pci_iov_get_vf_subsystem_schema(void) { nvlist_t *vf; vf = pci_iov_schema_alloc_node(); if (vf == NULL) return (NULL); pci_iov_schema_add_bool(vf, "passthrough", IOV_SCHEMA_HASDEFAULT, 0); return (vf); } static int pci_iov_alloc_bar(struct pci_devinfo *dinfo, int bar, pci_addr_t bar_shift) { struct resource *res; struct pcicfg_iov *iov; device_t dev, bus; u_long start, end; pci_addr_t bar_size; int rid; iov = dinfo->cfg.iov; dev = dinfo->cfg.dev; bus = device_get_parent(dev); rid = iov->iov_pos + PCIR_SRIOV_BAR(bar); bar_size = 1 << bar_shift; res = pci_alloc_multi_resource(bus, dev, SYS_RES_MEMORY, &rid, 0ul, ~0ul, 1, iov->iov_num_vfs, RF_ACTIVE); if (res == NULL) return (ENXIO); iov->iov_bar[bar].res = res; iov->iov_bar[bar].bar_size = bar_size; iov->iov_bar[bar].bar_shift = bar_shift; start = rman_get_start(res); end = rman_get_end(res); return (rman_manage_region(&iov->rman, start, end)); } static void pci_iov_add_bars(struct pcicfg_iov *iov, struct pci_devinfo *dinfo) { struct pci_iov_bar *bar; uint64_t bar_start; int i; for (i = 0; i <= PCIR_MAX_BAR_0; i++) { bar = &iov->iov_bar[i]; if (bar->res != NULL) { bar_start = rman_get_start(bar->res) + dinfo->cfg.vf.index * bar->bar_size; pci_add_bar(dinfo->cfg.dev, PCIR_BAR(i), bar_start, bar->bar_shift); } } } static int pci_iov_parse_config(struct pcicfg_iov *iov, struct pci_iov_arg *arg, nvlist_t **ret) { void *packed_config; nvlist_t *config; int error; config = NULL; packed_config = NULL; if (arg->len > pci_iov_max_config) { error = EMSGSIZE; goto out; } packed_config = malloc(arg->len, M_SRIOV, M_WAITOK); error = copyin(arg->config, packed_config, arg->len); if (error != 0) goto out; config = nvlist_unpack(packed_config, arg->len, NV_FLAG_IGNORE_CASE); if (config == NULL) { error = EINVAL; goto out; } error = pci_iov_schema_validate_config(iov->iov_schema, config); if (error != 0) goto out; error = nvlist_error(config); if (error != 0) goto out; *ret = config; config = NULL; out: nvlist_destroy(config); free(packed_config, M_SRIOV); return (error); } /* * Set the ARI_EN bit in the lowest-numbered PCI function with the SR-IOV * capability. This bit is only writeable on the lowest-numbered PF but * affects all PFs on the device. */ static int pci_iov_set_ari(device_t bus) { device_t lowest; device_t *devlist; int i, error, devcount, lowest_func, lowest_pos, iov_pos, dev_func; uint16_t iov_ctl; /* If ARI is disabled on the downstream port there is nothing to do. */ if (!PCIB_ARI_ENABLED(device_get_parent(bus))) return (0); error = device_get_children(bus, &devlist, &devcount); if (error != 0) return (error); lowest = NULL; for (i = 0; i < devcount; i++) { if (pci_find_extcap(devlist[i], PCIZ_SRIOV, &iov_pos) == 0) { dev_func = pci_get_function(devlist[i]); if (lowest == NULL || dev_func < lowest_func) { lowest = devlist[i]; lowest_func = dev_func; lowest_pos = iov_pos; } } } /* * If we called this function some device must have the SR-IOV * capability. */ KASSERT(lowest != NULL, ("Could not find child of %s with SR-IOV capability", device_get_nameunit(bus))); iov_ctl = pci_read_config(lowest, iov_pos + PCIR_SRIOV_CTL, 2); iov_ctl |= PCIM_SRIOV_ARI_EN; pci_write_config(lowest, iov_pos + PCIR_SRIOV_CTL, iov_ctl, 2); free(devlist, M_TEMP); return (0); } static int pci_iov_config_page_size(struct pci_devinfo *dinfo) { uint32_t page_cap, page_size; page_cap = IOV_READ(dinfo, PCIR_SRIOV_PAGE_CAP, 4); /* * If the system page size is less than the smallest SR-IOV page size * then round up to the smallest SR-IOV page size. */ if (PAGE_SHIFT < PCI_SRIOV_BASE_PAGE_SHIFT) page_size = (1 << 0); else page_size = (1 << (PAGE_SHIFT - PCI_SRIOV_BASE_PAGE_SHIFT)); /* Check that the device supports the system page size. */ if (!(page_size & page_cap)) return (ENXIO); IOV_WRITE(dinfo, PCIR_SRIOV_PAGE_SIZE, page_size, 4); return (0); } static int pci_iov_init(device_t dev, uint16_t num_vfs, const nvlist_t *config) { const nvlist_t *device, *driver_config; device = nvlist_get_nvlist(config, PF_CONFIG_NAME); driver_config = nvlist_get_nvlist(device, DRIVER_CONFIG_NAME); return (PCI_IOV_INIT(dev, num_vfs, driver_config)); } static int pci_iov_init_rman(device_t pf, struct pcicfg_iov *iov) { int error; iov->rman.rm_start = 0; iov->rman.rm_end = ~0ul; iov->rman.rm_type = RMAN_ARRAY; snprintf(iov->rman_name, sizeof(iov->rman_name), "%s VF I/O memory", device_get_nameunit(pf)); iov->rman.rm_descr = iov->rman_name; error = rman_init(&iov->rman); if (error != 0) return (error); iov->iov_flags |= IOV_RMAN_INITED; return (0); } static int pci_iov_setup_bars(struct pci_devinfo *dinfo) { device_t dev; struct pcicfg_iov *iov; pci_addr_t bar_value, testval; int i, last_64, error; iov = dinfo->cfg.iov; dev = dinfo->cfg.dev; last_64 = 0; for (i = 0; i <= PCIR_MAX_BAR_0; i++) { /* * If a PCI BAR is a 64-bit wide BAR, then it spans two * consecutive registers. Therefore if the last BAR that * we looked at was a 64-bit BAR, we need to skip this * register as it's the second half of the last BAR. */ if (!last_64) { pci_read_bar(dev, iov->iov_pos + PCIR_SRIOV_BAR(i), &bar_value, &testval, &last_64); if (testval != 0) { error = pci_iov_alloc_bar(dinfo, i, pci_mapsize(testval)); if (error != 0) return (error); } } else last_64 = 0; } return (0); } static void pci_iov_enumerate_vfs(struct pci_devinfo *dinfo, const nvlist_t *config, uint16_t first_rid, uint16_t rid_stride) { char device_name[VF_MAX_NAME]; const nvlist_t *device, *driver_config, *iov_config; device_t bus, dev, vf; struct pcicfg_iov *iov; struct pci_devinfo *vfinfo; size_t size; int i, error; uint16_t vid, did, next_rid; iov = dinfo->cfg.iov; dev = dinfo->cfg.dev; bus = device_get_parent(dev); size = dinfo->cfg.devinfo_size; next_rid = first_rid; vid = pci_get_vendor(dev); did = IOV_READ(dinfo, PCIR_SRIOV_VF_DID, 2); for (i = 0; i < iov->iov_num_vfs; i++, next_rid += rid_stride) { snprintf(device_name, sizeof(device_name), VF_PREFIX"%d", i); device = nvlist_get_nvlist(config, device_name); iov_config = nvlist_get_nvlist(device, IOV_CONFIG_NAME); driver_config = nvlist_get_nvlist(device, DRIVER_CONFIG_NAME); vf = PCI_CREATE_IOV_CHILD(bus, dev, next_rid, vid, did); if (vf == NULL) break; /* * If we are creating passthrough devices then force the ppt * driver to attach to prevent a VF driver from claiming the * VFs. */ if (nvlist_get_bool(iov_config, "passthrough")) device_set_devclass_fixed(vf, "ppt"); vfinfo = device_get_ivars(vf); vfinfo->cfg.iov = iov; vfinfo->cfg.vf.index = i; pci_iov_add_bars(iov, vfinfo); error = PCI_IOV_ADD_VF(dev, i, driver_config); if (error != 0) { device_printf(dev, "Failed to add VF %d\n", i); pci_delete_child(bus, vf); } } bus_generic_attach(bus); } static int pci_iov_config(struct cdev *cdev, struct pci_iov_arg *arg) { device_t bus, dev; struct pci_devinfo *dinfo; struct pcicfg_iov *iov; nvlist_t *config; int i, error; uint16_t rid_off, rid_stride; uint16_t first_rid, last_rid; uint16_t iov_ctl; uint16_t num_vfs, total_vfs; int iov_inited; mtx_lock(&Giant); dinfo = cdev->si_drv1; iov = dinfo->cfg.iov; dev = dinfo->cfg.dev; bus = device_get_parent(dev); iov_inited = 0; config = NULL; if ((iov->iov_flags & IOV_BUSY) || iov->iov_num_vfs != 0) { mtx_unlock(&Giant); return (EBUSY); } iov->iov_flags |= IOV_BUSY; error = pci_iov_parse_config(iov, arg, &config); if (error != 0) goto out; num_vfs = pci_iov_config_get_num_vfs(config); total_vfs = IOV_READ(dinfo, PCIR_SRIOV_TOTAL_VFS, 2); if (num_vfs > total_vfs) { error = EINVAL; goto out; } error = pci_iov_config_page_size(dinfo); if (error != 0) goto out; error = pci_iov_set_ari(bus); if (error != 0) goto out; error = pci_iov_init(dev, num_vfs, config); if (error != 0) goto out; iov_inited = 1; IOV_WRITE(dinfo, PCIR_SRIOV_NUM_VFS, num_vfs, 2); rid_off = IOV_READ(dinfo, PCIR_SRIOV_VF_OFF, 2); rid_stride = IOV_READ(dinfo, PCIR_SRIOV_VF_STRIDE, 2); first_rid = pci_get_rid(dev) + rid_off; last_rid = first_rid + (num_vfs - 1) * rid_stride; /* We don't yet support allocating extra bus numbers for VFs. */ if (pci_get_bus(dev) != PCI_RID2BUS(last_rid)) { error = ENOSPC; goto out; } iov_ctl = IOV_READ(dinfo, PCIR_SRIOV_CTL, 2); iov_ctl &= ~(PCIM_SRIOV_VF_EN | PCIM_SRIOV_VF_MSE); IOV_WRITE(dinfo, PCIR_SRIOV_CTL, iov_ctl, 2); error = pci_iov_init_rman(dev, iov); if (error != 0) goto out; iov->iov_num_vfs = num_vfs; error = pci_iov_setup_bars(dinfo); if (error != 0) goto out; iov_ctl = IOV_READ(dinfo, PCIR_SRIOV_CTL, 2); iov_ctl |= PCIM_SRIOV_VF_EN | PCIM_SRIOV_VF_MSE; IOV_WRITE(dinfo, PCIR_SRIOV_CTL, iov_ctl, 2); /* Per specification, we must wait 100ms before accessing VFs. */ pause("iov", roundup(hz, 10)); pci_iov_enumerate_vfs(dinfo, config, first_rid, rid_stride); nvlist_destroy(config); iov->iov_flags &= ~IOV_BUSY; mtx_unlock(&Giant); return (0); out: if (iov_inited) PCI_IOV_UNINIT(dev); for (i = 0; i <= PCIR_MAX_BAR_0; i++) { if (iov->iov_bar[i].res != NULL) { pci_release_resource(bus, dev, SYS_RES_MEMORY, iov->iov_pos + PCIR_SRIOV_BAR(i), iov->iov_bar[i].res); pci_delete_resource(bus, dev, SYS_RES_MEMORY, iov->iov_pos + PCIR_SRIOV_BAR(i)); iov->iov_bar[i].res = NULL; } } if (iov->iov_flags & IOV_RMAN_INITED) { rman_fini(&iov->rman); iov->iov_flags &= ~IOV_RMAN_INITED; } nvlist_destroy(config); iov->iov_num_vfs = 0; iov->iov_flags &= ~IOV_BUSY; mtx_unlock(&Giant); return (error); } /* Return true if child is a VF of the given PF. */ static int pci_iov_is_child_vf(struct pcicfg_iov *pf, device_t child) { struct pci_devinfo *vfinfo; vfinfo = device_get_ivars(child); if (!(vfinfo->cfg.flags & PCICFG_VF)) return (0); return (pf == vfinfo->cfg.iov); } static int pci_iov_delete(struct cdev *cdev) { device_t bus, dev, vf, *devlist; struct pci_devinfo *dinfo; struct pcicfg_iov *iov; int i, error, devcount; uint32_t iov_ctl; mtx_lock(&Giant); dinfo = cdev->si_drv1; iov = dinfo->cfg.iov; dev = dinfo->cfg.dev; bus = device_get_parent(dev); devlist = NULL; if (iov->iov_flags & IOV_BUSY) { mtx_unlock(&Giant); return (EBUSY); } if (iov->iov_num_vfs == 0) { mtx_unlock(&Giant); return (ECHILD); } iov->iov_flags |= IOV_BUSY; error = device_get_children(bus, &devlist, &devcount); if (error != 0) goto out; for (i = 0; i < devcount; i++) { vf = devlist[i]; if (!pci_iov_is_child_vf(iov, vf)) continue; error = device_detach(vf); if (error != 0) { device_printf(dev, "Could not disable SR-IOV: failed to detach VF %s\n", device_get_nameunit(vf)); goto out; } } for (i = 0; i < devcount; i++) { vf = devlist[i]; if (pci_iov_is_child_vf(iov, vf)) pci_delete_child(bus, vf); } PCI_IOV_UNINIT(dev); iov_ctl = IOV_READ(dinfo, PCIR_SRIOV_CTL, 2); iov_ctl &= ~(PCIM_SRIOV_VF_EN | PCIM_SRIOV_VF_MSE); IOV_WRITE(dinfo, PCIR_SRIOV_CTL, iov_ctl, 2); IOV_WRITE(dinfo, PCIR_SRIOV_NUM_VFS, 0, 2); iov->iov_num_vfs = 0; for (i = 0; i <= PCIR_MAX_BAR_0; i++) { if (iov->iov_bar[i].res != NULL) { pci_release_resource(bus, dev, SYS_RES_MEMORY, iov->iov_pos + PCIR_SRIOV_BAR(i), iov->iov_bar[i].res); pci_delete_resource(bus, dev, SYS_RES_MEMORY, iov->iov_pos + PCIR_SRIOV_BAR(i)); iov->iov_bar[i].res = NULL; } } if (iov->iov_flags & IOV_RMAN_INITED) { rman_fini(&iov->rman); iov->iov_flags &= ~IOV_RMAN_INITED; } error = 0; out: free(devlist, M_TEMP); iov->iov_flags &= ~IOV_BUSY; mtx_unlock(&Giant); return (error); } static int pci_iov_get_schema_ioctl(struct cdev *cdev, struct pci_iov_schema *output) { struct pci_devinfo *dinfo; void *packed; size_t output_len, size; int error; packed = NULL; mtx_lock(&Giant); dinfo = cdev->si_drv1; packed = nvlist_pack(dinfo->cfg.iov->iov_schema, &size); mtx_unlock(&Giant); if (packed == NULL) { error = ENOMEM; goto fail; } output_len = output->len; output->len = size; if (size <= output_len) { error = copyout(packed, output->schema, size); if (error != 0) goto fail; output->error = 0; } else /* * If we return an error then the ioctl code won't copyout * output back to userland, so we flag the error in the struct * instead. */ output->error = EMSGSIZE; error = 0; fail: free(packed, M_NVLIST); return (error); } static int pci_iov_ioctl(struct cdev *dev, u_long cmd, caddr_t data, int fflag, struct thread *td) { switch (cmd) { case IOV_CONFIG: return (pci_iov_config(dev, (struct pci_iov_arg *)data)); case IOV_DELETE: return (pci_iov_delete(dev)); case IOV_GET_SCHEMA: return (pci_iov_get_schema_ioctl(dev, (struct pci_iov_schema *)data)); default: return (EINVAL); } } struct resource * pci_vf_alloc_mem_resource(device_t dev, device_t child, int *rid, u_long start, u_long end, u_long count, u_int flags) { struct pci_devinfo *dinfo; struct pcicfg_iov *iov; struct pci_map *map; struct resource *res; struct resource_list_entry *rle; u_long bar_start, bar_end; pci_addr_t bar_length; int error; dinfo = device_get_ivars(child); iov = dinfo->cfg.iov; map = pci_find_bar(child, *rid); if (map == NULL) return (NULL); bar_length = 1 << map->pm_size; bar_start = map->pm_value; bar_end = bar_start + bar_length - 1; /* Make sure that the resource fits the constraints. */ if (bar_start >= end || bar_end <= bar_start || count != 1) return (NULL); /* Clamp the resource to the constraints if necessary. */ if (bar_start < start) bar_start = start; if (bar_end > end) bar_end = end; bar_length = bar_end - bar_start + 1; res = rman_reserve_resource(&iov->rman, bar_start, bar_end, bar_length, flags, child); if (res == NULL) return (NULL); rle = resource_list_add(&dinfo->resources, SYS_RES_MEMORY, *rid, bar_start, bar_end, 1); if (rle == NULL) { rman_release_resource(res); return (NULL); } rman_set_rid(res, *rid); if (flags & RF_ACTIVE) { error = bus_activate_resource(child, SYS_RES_MEMORY, *rid, res); if (error != 0) { resource_list_delete(&dinfo->resources, SYS_RES_MEMORY, *rid); rman_release_resource(res); return (NULL); } } rle->res = res; return (res); } int pci_vf_release_mem_resource(device_t dev, device_t child, int rid, struct resource *r) { struct pci_devinfo *dinfo; struct resource_list_entry *rle; int error; dinfo = device_get_ivars(child); if (rman_get_flags(r) & RF_ACTIVE) { error = bus_deactivate_resource(child, SYS_RES_MEMORY, rid, r); if (error != 0) return (error); } rle = resource_list_find(&dinfo->resources, SYS_RES_MEMORY, rid); if (rle != NULL) { rle->res = NULL; resource_list_delete(&dinfo->resources, SYS_RES_MEMORY, rid); } return (rman_release_resource(r)); } Index: user/ngie/stable-10-libnv/sys/dev/pci/pci_iov_schema.c =================================================================== --- user/ngie/stable-10-libnv/sys/dev/pci/pci_iov_schema.c (revision 293096) +++ user/ngie/stable-10-libnv/sys/dev/pci/pci_iov_schema.c (revision 293097) @@ -1,608 +1,869 @@ /*- * Copyright (c) 2014-2015 Sandvine Inc. All rights reserved. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include struct config_type_validator; typedef int (validate_func)(const struct config_type_validator *, const nvlist_t *, const char *name); +typedef int (default_validate_t)(const struct config_type_validator *, + const nvlist_t *); static validate_func pci_iov_schema_validate_bool; static validate_func pci_iov_schema_validate_string; static validate_func pci_iov_schema_validate_uint; static validate_func pci_iov_schema_validate_unicast_mac; +static default_validate_t pci_iov_validate_bool_default; +static default_validate_t pci_iov_validate_string_default; +static default_validate_t pci_iov_validate_uint_default; +static default_validate_t pci_iov_validate_unicast_mac_default; + struct config_type_validator { const char *type_name; validate_func *validate; + default_validate_t *default_validate; uintmax_t limit; }; static struct config_type_validator pci_iov_schema_validators[] = { - { "bool", pci_iov_schema_validate_bool }, - { "string", pci_iov_schema_validate_string }, - { "uint8_t", pci_iov_schema_validate_uint, UINT8_MAX }, - { "uint16_t", pci_iov_schema_validate_uint, UINT16_MAX }, - { "uint32_t", pci_iov_schema_validate_uint, UINT32_MAX }, - { "uint64_t", pci_iov_schema_validate_uint, UINT64_MAX }, - { "unicast-mac", pci_iov_schema_validate_unicast_mac }, + { + .type_name = "bool", + .validate = pci_iov_schema_validate_bool, + .default_validate = pci_iov_validate_bool_default + }, + { + .type_name = "string", + .validate = pci_iov_schema_validate_string, + .default_validate = pci_iov_validate_string_default + }, + { + .type_name = "uint8_t", + .validate = pci_iov_schema_validate_uint, + .default_validate = pci_iov_validate_uint_default, + .limit = UINT8_MAX + }, + { + .type_name = "uint16_t", + .validate = pci_iov_schema_validate_uint, + .default_validate = pci_iov_validate_uint_default, + .limit = UINT16_MAX + }, + { + .type_name = "uint32_t", + .validate = pci_iov_schema_validate_uint, + .default_validate = pci_iov_validate_uint_default, + .limit = UINT32_MAX + }, + { + .type_name = "uint64_t", + .validate = pci_iov_schema_validate_uint, + .default_validate = pci_iov_validate_uint_default, + .limit = UINT64_MAX + }, + { + .type_name = "unicast-mac", + .validate = pci_iov_schema_validate_unicast_mac, + .default_validate = pci_iov_validate_unicast_mac_default, + }, }; static const struct config_type_validator * pci_iov_schema_find_validator(const char *type) { struct config_type_validator *validator; int i; for (i = 0; i < nitems(pci_iov_schema_validators); i++) { validator = &pci_iov_schema_validators[i]; if (strcmp(type, validator->type_name) == 0) return (validator); } return (NULL); } static void pci_iov_schema_add_type(nvlist_t *entry, const char *type) { if (pci_iov_schema_find_validator(type) == NULL) { nvlist_set_error(entry, EINVAL); return; } nvlist_add_string(entry, "type", type); } static void pci_iov_schema_add_required(nvlist_t *entry, uint32_t flags) { if (flags & IOV_SCHEMA_REQUIRED) { if (flags & IOV_SCHEMA_HASDEFAULT) { nvlist_set_error(entry, EINVAL); return; } nvlist_add_bool(entry, "required", 1); } } void pci_iov_schema_add_bool(nvlist_t *schema, const char *name, uint32_t flags, int defaultVal) { nvlist_t *entry; entry = nvlist_create(NV_FLAG_IGNORE_CASE); if (entry == NULL) { nvlist_set_error(schema, ENOMEM); return; } pci_iov_schema_add_type(entry, "bool"); if (flags & IOV_SCHEMA_HASDEFAULT) nvlist_add_bool(entry, "default", defaultVal); pci_iov_schema_add_required(entry, flags); nvlist_move_nvlist(schema, name, entry); } void pci_iov_schema_add_string(nvlist_t *schema, const char *name, uint32_t flags, const char *defaultVal) { nvlist_t *entry; entry = nvlist_create(NV_FLAG_IGNORE_CASE); if (entry == NULL) { nvlist_set_error(schema, ENOMEM); return; } pci_iov_schema_add_type(entry, "string"); if (flags & IOV_SCHEMA_HASDEFAULT) nvlist_add_string(entry, "default", defaultVal); pci_iov_schema_add_required(entry, flags); nvlist_move_nvlist(schema, name, entry); } static void pci_iov_schema_int(nvlist_t *schema, const char *name, const char *type, uint32_t flags, uint64_t defaultVal) { nvlist_t *entry; entry = nvlist_create(NV_FLAG_IGNORE_CASE); if (entry == NULL) { nvlist_set_error(schema, ENOMEM); return; } pci_iov_schema_add_type(entry, type); if (flags & IOV_SCHEMA_HASDEFAULT) nvlist_add_number(entry, "default", defaultVal); pci_iov_schema_add_required(entry, flags); nvlist_move_nvlist(schema, name, entry); } void pci_iov_schema_add_uint8(nvlist_t *schema, const char *name, uint32_t flags, uint8_t defaultVal) { pci_iov_schema_int(schema, name, "uint8_t", flags, defaultVal); } void pci_iov_schema_add_uint16(nvlist_t *schema, const char *name, uint32_t flags, uint16_t defaultVal) { pci_iov_schema_int(schema, name, "uint16_t", flags, defaultVal); } void pci_iov_schema_add_uint32(nvlist_t *schema, const char *name, uint32_t flags, uint32_t defaultVal) { pci_iov_schema_int(schema, name, "uint32_t", flags, defaultVal); } void pci_iov_schema_add_uint64(nvlist_t *schema, const char *name, uint32_t flags, uint64_t defaultVal) { pci_iov_schema_int(schema, name, "uint64_t", flags, defaultVal); } void pci_iov_schema_add_unicast_mac(nvlist_t *schema, const char *name, uint32_t flags, const uint8_t * defaultVal) { nvlist_t *entry; entry = nvlist_create(NV_FLAG_IGNORE_CASE); if (entry == NULL) { nvlist_set_error(schema, ENOMEM); return; } pci_iov_schema_add_type(entry, "unicast-mac"); if (flags & IOV_SCHEMA_HASDEFAULT) nvlist_add_binary(entry, "default", defaultVal, ETHER_ADDR_LEN); pci_iov_schema_add_required(entry, flags); nvlist_move_nvlist(schema, name, entry); } static int pci_iov_schema_validate_bool(const struct config_type_validator * validator, const nvlist_t *config, const char *name) { if (!nvlist_exists_bool(config, name)) return (EINVAL); return (0); } static int pci_iov_schema_validate_string(const struct config_type_validator * validator, const nvlist_t *config, const char *name) { if (!nvlist_exists_string(config, name)) return (EINVAL); return (0); } static int pci_iov_schema_validate_uint(const struct config_type_validator * validator, const nvlist_t *config, const char *name) { uint64_t value; if (!nvlist_exists_number(config, name)) return (EINVAL); value = nvlist_get_number(config, name); if (value > validator->limit) return (EINVAL); return (0); } static int pci_iov_schema_validate_unicast_mac( const struct config_type_validator * validator, const nvlist_t *config, const char *name) { const uint8_t *mac; size_t size; if (!nvlist_exists_binary(config, name)) return (EINVAL); mac = nvlist_get_binary(config, name, &size); if (size != ETHER_ADDR_LEN) return (EINVAL); if (ETHER_IS_MULTICAST(mac)) return (EINVAL); return (0); } static void pci_iov_config_add_default(const nvlist_t *param_schema, const char *name, nvlist_t *config) { const void *binary; size_t len; if (nvlist_exists_binary(param_schema, "default")) { binary = nvlist_get_binary(param_schema, "default", &len); nvlist_add_binary(config, name, binary, len); } else if (nvlist_exists_bool(param_schema, "default")) nvlist_add_bool(config, name, nvlist_get_bool(param_schema, "default")); else if (nvlist_exists_number(param_schema, "default")) nvlist_add_number(config, name, nvlist_get_number(param_schema, "default")); else if (nvlist_exists_nvlist(param_schema, "default")) nvlist_add_nvlist(config, name, nvlist_get_nvlist(param_schema, "default")); else if (nvlist_exists_string(param_schema, "default")) nvlist_add_string(config, name, nvlist_get_string(param_schema, "default")); else panic("Unexpected nvlist type"); +} + +static int +pci_iov_validate_bool_default(const struct config_type_validator * validator, + const nvlist_t *param) +{ + + if (!nvlist_exists_bool(param, DEFAULT_SCHEMA_NAME)) + return (EINVAL); + return (0); +} + +static int +pci_iov_validate_string_default(const struct config_type_validator * validator, + const nvlist_t *param) +{ + + if (!nvlist_exists_string(param, DEFAULT_SCHEMA_NAME)) + return (EINVAL); + return (0); +} + +static int +pci_iov_validate_uint_default(const struct config_type_validator * validator, + const nvlist_t *param) +{ + uint64_t defaultVal; + + if (!nvlist_exists_number(param, DEFAULT_SCHEMA_NAME)) + return (EINVAL); + + defaultVal = nvlist_get_number(param, DEFAULT_SCHEMA_NAME); + if (defaultVal > validator->limit) + return (EINVAL); + return (0); +} + +static int +pci_iov_validate_unicast_mac_default( + const struct config_type_validator * validator, const nvlist_t *param) +{ + const uint8_t *mac; + size_t size; + + if (!nvlist_exists_binary(param, DEFAULT_SCHEMA_NAME)) + return (EINVAL); + + mac = nvlist_get_binary(param, DEFAULT_SCHEMA_NAME, &size); + if (size != ETHER_ADDR_LEN) + return (EINVAL); + + if (ETHER_IS_MULTICAST(mac)) + return (EINVAL); + return (0); +} + +static int +pci_iov_validate_param_schema(const nvlist_t *schema) +{ + const struct config_type_validator *validator; + const char *type; + int error; + + /* All parameters must define a type. */ + if (!nvlist_exists_string(schema, TYPE_SCHEMA_NAME)) + return (EINVAL); + type = nvlist_get_string(schema, TYPE_SCHEMA_NAME); + + validator = pci_iov_schema_find_validator(type); + if (validator == NULL) + return (EINVAL); + + /* Validate that the default value conforms to the type. */ + if (nvlist_exists(schema, DEFAULT_SCHEMA_NAME)) { + error = validator->default_validate(validator, schema); + if (error != 0) + return (error); + + /* Required and Default are mutually exclusive. */ + if (nvlist_exists(schema, REQUIRED_SCHEMA_NAME)) + return (EINVAL); + } + + /* The "Required" field must be a bool. */ + if (nvlist_exists(schema, REQUIRED_SCHEMA_NAME)) { + if (!nvlist_exists_bool(schema, REQUIRED_SCHEMA_NAME)) + return (EINVAL); + } + + return (0); +} + +static int +pci_iov_validate_subsystem_schema(const nvlist_t *dev_schema, const char *name) +{ + const nvlist_t *sub_schema, *param_schema; + const char *param_name; + void *it; + int type, error; + + if (!nvlist_exists_nvlist(dev_schema, name)) + return (EINVAL); + sub_schema = nvlist_get_nvlist(dev_schema, name); + + it = NULL; + while ((param_name = nvlist_next(sub_schema, &type, &it)) != NULL) { + if (type != NV_TYPE_NVLIST) + return (EINVAL); + param_schema = nvlist_get_nvlist(sub_schema, param_name); + + error = pci_iov_validate_param_schema(param_schema); + if (error != 0) + return (error); + } + + return (0); +} + +/* + * Validate that the driver schema does not define any configuration parameters + * whose names collide with configuration parameters defined in the iov schema. + */ +static int +pci_iov_validate_param_collisions(const nvlist_t *dev_schema) +{ + const nvlist_t *iov_schema, *driver_schema; + const char *name; + void *it; + int type; + + driver_schema = nvlist_get_nvlist(dev_schema, DRIVER_CONFIG_NAME); + iov_schema = nvlist_get_nvlist(dev_schema, IOV_CONFIG_NAME); + + it = NULL; + while ((name = nvlist_next(driver_schema, &type, &it)) != NULL) { + if (nvlist_exists(iov_schema, name)) + return (EINVAL); + } + + return (0); +} + +/* + * Validate that we only have IOV and DRIVER subsystems beneath the given + * device schema node. + */ +static int +pci_iov_validate_schema_subsystems(const nvlist_t *dev_schema) +{ + const char *name; + void *it; + int type; + + it = NULL; + while ((name = nvlist_next(dev_schema, &type, &it)) != NULL) { + if (strcmp(name, IOV_CONFIG_NAME) != 0 && + strcmp(name, DRIVER_CONFIG_NAME) != 0) + return (EINVAL); + } + + return (0); +} + +static int +pci_iov_validate_device_schema(const nvlist_t *schema, const char *name) +{ + const nvlist_t *dev_schema; + int error; + + if (!nvlist_exists_nvlist(schema, name)) + return (EINVAL); + dev_schema = nvlist_get_nvlist(schema, name); + + error = pci_iov_validate_subsystem_schema(dev_schema, IOV_CONFIG_NAME); + if (error != 0) + return (error); + + error = pci_iov_validate_subsystem_schema(dev_schema, + DRIVER_CONFIG_NAME); + if (error != 0) + return (error); + + error = pci_iov_validate_param_collisions(dev_schema); + if (error != 0) + return (error); + + return (pci_iov_validate_schema_subsystems(dev_schema)); +} + +/* Validate that we only have PF and VF devices beneath the top-level schema. */ +static int +pci_iov_validate_schema_devices(const nvlist_t *dev_schema) +{ + const char *name; + void *it; + int type; + + it = NULL; + while ((name = nvlist_next(dev_schema, &type, &it)) != NULL) { + if (strcmp(name, PF_CONFIG_NAME) != 0 && + strcmp(name, VF_SCHEMA_NAME) != 0) + return (EINVAL); + } + + return (0); +} + +int +pci_iov_validate_schema(const nvlist_t *schema) +{ + int error; + + error = pci_iov_validate_device_schema(schema, PF_CONFIG_NAME); + if (error != 0) + return (error); + + error = pci_iov_validate_device_schema(schema, VF_SCHEMA_NAME); + if (error != 0) + return (error); + + return (pci_iov_validate_schema_devices(schema)); } /* * Validate that all required parameters from the schema are specified in the * config. If any parameter with a default value is not specified in the * config, add it to config. */ static int pci_iov_schema_validate_required(const nvlist_t *schema, nvlist_t *config) { const nvlist_t *param_schema; const char *name; void *cookie; int type; cookie = NULL; while ((name = nvlist_next(schema, &type, &cookie)) != NULL) { param_schema = nvlist_get_nvlist(schema, name); if (dnvlist_get_bool(param_schema, "required", 0)) { if (!nvlist_exists(config, name)) return (EINVAL); } if (nvlist_exists(param_schema, "default") && !nvlist_exists(config, name)) pci_iov_config_add_default(param_schema, name, config); } return (nvlist_error(config)); } static int pci_iov_schema_validate_param(const nvlist_t *schema_param, const char *name, const nvlist_t *config) { const struct config_type_validator *validator; const char *type; type = nvlist_get_string(schema_param, "type"); validator = pci_iov_schema_find_validator(type); KASSERT(validator != NULL, ("Schema was not validated: Unknown type %s", type)); return (validator->validate(validator, config, name)); } /* * Validate that all parameters in config are defined in the schema. Also * validate that the type of the parameter matches the type in the schema. */ static int pci_iov_schema_validate_types(const nvlist_t *schema, const nvlist_t *config) { const nvlist_t *schema_param; void *cookie; const char *name; int type, error; cookie = NULL; while ((name = nvlist_next(config, &type, &cookie)) != NULL) { if (!nvlist_exists_nvlist(schema, name)) return (EINVAL); schema_param = nvlist_get_nvlist(schema, name); error = pci_iov_schema_validate_param(schema_param, name, config); if (error != 0) return (error); } return (0); } static int pci_iov_schema_validate_device(const nvlist_t *schema, nvlist_t *config, const char *schema_device, const char *config_device) { const nvlist_t *device_schema, *iov_schema, *driver_schema; nvlist_t *device_config, *iov_config, *driver_config; int error; device_config = NULL; iov_config = NULL; driver_config = NULL; device_schema = nvlist_get_nvlist(schema, schema_device); iov_schema = nvlist_get_nvlist(device_schema, IOV_CONFIG_NAME); driver_schema = nvlist_get_nvlist(device_schema, DRIVER_CONFIG_NAME); device_config = dnvlist_take_nvlist(config, config_device, NULL); if (device_config == NULL) { error = EINVAL; goto out; } iov_config = dnvlist_take_nvlist(device_config, IOV_CONFIG_NAME, NULL); if (iov_config == NULL) { error = EINVAL; goto out; } driver_config = dnvlist_take_nvlist(device_config, DRIVER_CONFIG_NAME, NULL); if (driver_config == NULL) { error = EINVAL; goto out; } error = pci_iov_schema_validate_required(iov_schema, iov_config); if (error != 0) goto out; error = pci_iov_schema_validate_required(driver_schema, driver_config); if (error != 0) goto out; error = pci_iov_schema_validate_types(iov_schema, iov_config); if (error != 0) goto out; error = pci_iov_schema_validate_types(driver_schema, driver_config); if (error != 0) goto out; out: /* Note that these functions handle NULL pointers safely. */ nvlist_move_nvlist(device_config, IOV_CONFIG_NAME, iov_config); nvlist_move_nvlist(device_config, DRIVER_CONFIG_NAME, driver_config); nvlist_move_nvlist(config, config_device, device_config); return (error); } static int pci_iov_schema_validate_vfs(const nvlist_t *schema, nvlist_t *config, uint16_t num_vfs) { char device[VF_MAX_NAME]; int i, error; for (i = 0; i < num_vfs; i++) { snprintf(device, sizeof(device), VF_PREFIX"%d", i); error = pci_iov_schema_validate_device(schema, config, VF_SCHEMA_NAME, device); if (error != 0) return (error); } return (0); } /* * Validate that the device node only has IOV and DRIVER subnodes. */ static int pci_iov_schema_validate_device_subsystems(const nvlist_t *config) { void *cookie; const char *name; int type; cookie = NULL; while ((name = nvlist_next(config, &type, &cookie)) != NULL) { if (strcasecmp(name, IOV_CONFIG_NAME) == 0) continue; else if (strcasecmp(name, DRIVER_CONFIG_NAME) == 0) continue; return (EINVAL); } return (0); } /* * Validate that the string is a valid device node name. It must either be "PF" * or "VF-n", where n is an integer in the range [0, num_vfs). */ static int pci_iov_schema_validate_dev_name(const char *name, uint16_t num_vfs) { const char *number_start; char *endp; u_long vf_num; if (strcasecmp(PF_CONFIG_NAME, name) == 0) return (0); /* Ensure that we start with "VF-" */ if (strncasecmp(name, VF_PREFIX, VF_PREFIX_LEN) != 0) return (EINVAL); number_start = name + VF_PREFIX_LEN; /* Filter out name == "VF-" (no number) */ if (number_start[0] == '\0') return (EINVAL); /* Disallow leading whitespace or +/- */ if (!isdigit(number_start[0])) return (EINVAL); vf_num = strtoul(number_start, &endp, 10); if (*endp != '\0') return (EINVAL); /* Disallow leading zeros on VF-[1-9][0-9]* */ if (vf_num != 0 && number_start[0] == '0') return (EINVAL); /* Disallow leading zeros on VF-0 */ if (vf_num == 0 && number_start[1] != '\0') return (EINVAL); if (vf_num >= num_vfs) return (EINVAL); return (0); } /* * Validate that there are no device nodes in config other than the ones for * the PF and the VFs. This includes validating that all config nodes of the * form VF-n specify a VF number that is < num_vfs. */ static int pci_iov_schema_validate_device_names(const nvlist_t *config, uint16_t num_vfs) { const nvlist_t *device; void *cookie; const char *name; int type, error; cookie = NULL; while ((name = nvlist_next(config, &type, &cookie)) != NULL) { error = pci_iov_schema_validate_dev_name(name, num_vfs); if (error != 0) return (error); /* * Note that as this is a valid PF/VF node, we know that * pci_iov_schema_validate_device() has already checked that * the PF/VF node is an nvlist. */ device = nvlist_get_nvlist(config, name); error = pci_iov_schema_validate_device_subsystems(device); if (error != 0) return (error); } return (0); } int pci_iov_schema_validate_config(const nvlist_t *schema, nvlist_t *config) { int error; uint16_t num_vfs; error = pci_iov_schema_validate_device(schema, config, PF_CONFIG_NAME, PF_CONFIG_NAME); if (error != 0) return (error); num_vfs = pci_iov_config_get_num_vfs(config); error = pci_iov_schema_validate_vfs(schema, config, num_vfs); if (error != 0) return (error); return (pci_iov_schema_validate_device_names(config, num_vfs)); } /* * Return value of the num_vfs parameter. config must have already been * validated, which guarantees that the parameter exists. */ uint16_t pci_iov_config_get_num_vfs(const nvlist_t *config) { const nvlist_t *pf, *iov; pf = nvlist_get_nvlist(config, PF_CONFIG_NAME); iov = nvlist_get_nvlist(pf, IOV_CONFIG_NAME); return (nvlist_get_number(iov, "num_vfs")); } /* Allocate a new empty schema node. */ nvlist_t * pci_iov_schema_alloc_node(void) { return (nvlist_create(NV_FLAG_IGNORE_CASE)); } Index: user/ngie/stable-10-libnv/sys/dev/pci/schema_private.h =================================================================== --- user/ngie/stable-10-libnv/sys/dev/pci/schema_private.h (revision 293096) +++ user/ngie/stable-10-libnv/sys/dev/pci/schema_private.h (revision 293097) @@ -1,35 +1,37 @@ /*- * Copyright (c) 2014 Sandvine Inc. All rights reserved. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _SCHEMA_PRIVATE_H_ #define _SCHEMA_PRIVATE_H_ +int pci_iov_validate_schema(const nvlist_t *schema); + int pci_iov_schema_validate_config(const nvlist_t *, nvlist_t *); uint16_t pci_iov_config_get_num_vfs(const nvlist_t *); #endif Index: user/ngie/stable-10-libnv/usr.sbin/iovctl/iovctl.8 =================================================================== --- user/ngie/stable-10-libnv/usr.sbin/iovctl/iovctl.8 (revision 293096) +++ user/ngie/stable-10-libnv/usr.sbin/iovctl/iovctl.8 (revision 293097) @@ -1,123 +1,123 @@ .\" .\" Copyright (c) 2014 Sandvine Inc. .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd May 21, 2014 +.Dd July 8, 2015 .Dt IOVCTL 8 .Os .Sh NAME .Nm iovctl .Nd "PCI SR-IOV configuration utility" .Sh SYNOPSIS .Nm .Fl C .Op Fl f Ar config-file .Op Fl n .Nm .Fl D .Op Fl f Ar config-file | Fl d Ar device .Op Fl n .Nm .Fl S .Op Fl f Ar config-file | Fl d Ar device .Sh DESCRIPTION The .Nm utility creates or destroys PCI Single-Root I/O Virtualization .Pq SR-IOV Virtual Functions -.Po VFs Pc . +.Pq VFs . When invoked with the .Fl C flag, .Nm creates VFs as children of the Physical Function .Pq PF configured in the specified configuration file. When invoked with the .Fl D flag, .Nm destroys all VFs that are children of the specified device. Available PF devices can be seen in .Pa /dev/iov/ . .Pp The following options are available: .Bl -tag -width indent .It Fl C Enable SR-IOV on the specified PF device and create VF children. This operation will fail if the PF already has VF children. This option must be used in conjunction with the .Fl f option. .It Fl d Ar device Specify the PF device to use for the given operation. .Ar device may either be the name of a PF device, or a full path name to a node in .Pa /dev/iov/ . This option may not be used with the .Fl C option. .It Fl D Delete all VF children of the specified PF device. This operation will fail if SR-IOV is not currently enabled on the specified device. .It Fl f Ar config-file Specify the pathname of the configuration file. For the .Fl C option, this file will be used to specify all configuration values. For the .Fl D and .Fl S options, this file will only be used to specify the name of the PF device. .Pp See .Xr iovctl.conf for a description of the config file format and documentation of the configuration parameters that apply to all PF drivers. See the PF driver manual page for configuration parameters specific to particular hardware. .It Fl n Perform a dry-run. Perform all validation of the specified action and print what would be done, but do not perform the actual creation or destruction of VFs. This option may not be used with the .Fl S flag. .It Fl S Read the configuration schema from the specified device and print its contents to stdout. This action may be used to discover the configuration parameters supported on a given PF device. .El .Sh SEE ALSO .Xr iovctl.conf 5 , .Xr rc.conf 5 .Sh AUTHORS This manual page was written by .An Ryan Stone Aq Mt rstone@FreeBSD.org . Index: user/ngie/stable-10-libnv/usr.sbin/iovctl/iovctl.conf.5 =================================================================== --- user/ngie/stable-10-libnv/usr.sbin/iovctl/iovctl.conf.5 (revision 293096) +++ user/ngie/stable-10-libnv/usr.sbin/iovctl/iovctl.conf.5 (revision 293097) @@ -1,167 +1,171 @@ .\" .\" Copyright (c) 2014 Sandvine Inc. .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd May 21, 2014 +.Dd July 8, 2015 .Dt IOVCTL.CONF 5 .Os .Sh NAME .Nm iovctl.conf .Nd IOVCTL configuration file .Sh DESCRIPTION The .Nm file is the configuration file for the .Xr iovctl 8 program. This file specifies configuration parameters for a single Physical Function .Pq PF device. To configure SR-IOV on multiple PF devices, use one configuration file for each PF. The locations of all .Xr iovctl 9 configuration files are specified in .Xr rc.conf 5 . .Pp The .Nm file uses UCL format. UCL syntax is documented at the official UCL website: http://github.com/vstakhov/libucl. .Pp There are three types of sections in the .Nm file. A section is a key at the top level of the file with a list as its value. The list may contain the keys specified in the .Sx OPTIONS section of this manual page. Individual PF driver implementations may specify additional device-specific configuration keys that they will accept. The order in which sections appear in .Nm is ignored. No two sections may have the same key. For example, two sections for VF-1 must not be defined. .Pp The first section type is the PF section. This section always has the key "PF"; therefore, only one such section may be defined. This section defines configuration parameters that apply to the PF as a whole. .Pp The second section type is the VF section. This section has the key "VF-" followed by a VF index. VF indices start at 0 and always increment by 1. Valid VF indices are in the range of 0 to -.Po num_vfs - 1 Pc . +.Pq num_vfs - 1 . The VF index must be given as a decimal integer with no leading zeros. This section defines configuration parameters that apply to a single VF. .Pp The third section type is the default section. This section always has the key "DEFAULT"; therefore, only one such section may be specified. This section defines default configuration parameters that apply to all VFs. All configuration keys that are valid to be applied to a VF are valid in this section. An individual VF section may override a default specified in this section by providing a different value for the configuration parameter. Note that the default section applies to ALL VFs. The default section must appear before any VF sections. The default section may appear before or after the PF section. .Pp The following option types are supported: .Bl -tag -width indent .It boolean Accepts a boolean value of true or false. .It mac-addr Accepts a unicast MAC address specified as a string of the form xx:xx:xx:xx:xx:xx, where xx is one or two hexadecimal digits. .It string Accepts any string value. .It uint8_t -Accepts any integer in the range 0-255, inclusive. +Accepts any integer in the range 0 to 255, inclusive. .It uint16_t -Accepts any integer in the range 0-65535, inclusive. +Accepts any integer in the range 0 to 65535, inclusive. .It uint32_t -Accepts any integer in the range 0-2**32, inclusive. +Accepts any integer in the range 0 to +.Pq 2**32 - 1 , +inclusive. .It uint64_t -Accepts any integer in the range 0-2**64, inclusive. +Accepts any integer in the range 0 to +.Pq 2**64 - 1 , +inclusive. .El .Sh OPTIONS The following parameters are accepted by all PF drivers: .Bl -tag -width indent .It device Pq string This parameter specifies the name of the PF device. This parameter is required to be specified. .It num_vfs Pq uint16_t This parameter specifies the number of VF children to create. This parameter may not be zero. The maximum value of this parameter is device-specific. .El .Pp The following parameters are accepted by all VFs: .Bl -tag -width indent .It passthrough Pq boolean This parameter controls whether the VF is reserved for the use of the .Xr bhyve 8 hypervisor as a PCI passthrough device. If this parameter is set to true, then the VF will be reserved as a PCI passthrough device and it will not be accessible from the host OS. The default value of this parameter is false. .El .Pp See the PF driver manual page for configuration parameters specific to particular hardware. .Sh EXAMPLES This sample file will create 3 VFs as children of the ix0 device. VF-1 and VF-2 are set as .Xr bhyve 8 passthrough devices through the use of the default section. VF-0 is not configured as a passthrough device as it explicitly overrides the default. VF-0 also sets a device-specific parameter named mac-addr. .Bd -literal .offset ident PF { device : "ix0"; num_vfs : 3; } DEFAULT { passthrough : true; } VF-0 { mac-addr : "02:56:48:7e:d9:f7"; passthrough : false; } .Ed .Sh SEE ALSO .Xr iovctl 8 , .Xr rc.conf 5 .Sh AUTHORS This manual page was written by .An Ryan Stone Aq Mt rstone@FreeBSD.org . Index: user/ngie/stable-10-libnv =================================================================== --- user/ngie/stable-10-libnv (revision 293096) +++ user/ngie/stable-10-libnv (revision 293097) Property changes on: user/ngie/stable-10-libnv ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head:r279465,284558,285189,285273