Index: head/share/man/man4/sfxge.4 =================================================================== --- head/share/man/man4/sfxge.4 (revision 311976) +++ head/share/man/man4/sfxge.4 (revision 311977) @@ -1,179 +1,187 @@ .\" Copyright (c) 2011-2016 Solarflare Communications Inc. .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions are met: .\" .\" 1. Redistributions of source code must retain the above copyright notice, .\" this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright notice, .\" this list of conditions and the following disclaimer in the documentation .\" and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" .\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, .\" THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR .\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR .\" CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, .\" EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, .\" PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; .\" OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, .\" WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR .\" OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, .\" EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. .\" .\" The views and conclusions contained in the software and documentation are .\" those of the authors and should not be interpreted as representing official .\" policies, either expressed or implied, of the FreeBSD Project. .\" .\" $FreeBSD$ .\" .Dd February 22, 2015 .Dt SFXGE 4 .Os .Sh NAME .Nm sfxge .Nd "Solarflare 10Gb Ethernet adapter driver" .Sh SYNOPSIS To compile this driver into the kernel, place the following lines in your kernel configuration file: .Bd -ragged -offset indent .Cd "device sfxge" .Ed .Pp To load the driver as a module at boot time, place the following line in .Xr loader.conf 5 : .Bd -literal -offset indent sfxge_load="YES" .Ed .Sh DESCRIPTION The .Nm driver provides support for 10Gb Ethernet adapters based on Solarflare SFC9000 family controllers. The driver supports jumbo frames, transmit/receive checksum offload, TCP Segmentation Offload (TSO), Large Receive Offload (LRO), VLAN checksum offload, VLAN TSO, and Receive Side Scaling (RSS) using MSI-X interrupts. .Pp The driver allocates 1 receive queue, transmit queue, event queue and IRQ per CPU up to a maximum of 64. IRQ affinities should be spread out using .Xr cpuset 1 . Interrupt moderation may be controlled through the sysctl .Va dev.sfxge.%d.int_mod (units are microseconds). .Pp For more information on configuring this device, see .Xr ifconfig 8 . .Pp A large number of MAC, PHY and data path statistics are available under the sysctl .Va dev.sfxge.%d.stats . The adapter's VPD fields including its serial number are available under the sysctl .Va dev.sfxge.%d.vpd . .Sh HARDWARE The .Nm driver supports all 10Gb Ethernet adapters based on Solarflare SFC9000 family controllers. .Sh LOADER TUNABLES Tunables can be set at the .Xr loader 8 prompt before booting the kernel or stored in .Xr loader.conf 5 . Actual values can be obtained using .Xr sysctl 8 . .Bl -tag -width indent .It Va hw.sfxge.rx_ring The maximum number of descriptors in a receive queue ring. Supported values are: 512, 1024, 2048 and 4096. .It Va hw.sfxge.tx_ring The maximum number of descriptors in a transmit queue ring. Supported values are: 512, 1024, 2048 and 4096. .It Va hw.sfxge.tx_dpl_get_max The maximum length of the deferred packet .Dq get-list for queued transmit packets (TCP and non-TCP), used only if the transmit queue lock can be acquired. If a packet is dropped, the .Va tx_get_overflow counter is incremented and the local sender receives ENOBUFS. The value must be greater than 0. .It Va hw.sfxge.tx_dpl_get_non_tcp_max The maximum number of non-TCP packets in the deferred packet .Dq get-list , used only if the transmit queue lock can be acquired. If a packet is dropped, the .Va tx_get_non_tcp_overflow counter is incremented and the local sender receives ENOBUFS. The value must be greater than 0. .It Va hw.sfxge.tx_dpl_put_max The maximum length of the deferred packet .Dq put-list for queued transmit packets, used if the transmit queue lock cannot be acquired. If a packet is dropped, the .Va tx_put_overflow counter is incremented and the local sender receives ENOBUFS. The value must be greater than or equal to 0. .It Va hw.sfxge.tso_fw_assisted Bitmask to enable/disable usage of FW-assisted TSO version if supported by NIC firmware. FATSOv1 (bit 0) and FATSOv2 (bit 1) are supported. All enabled by default. .It Va hw.sfxge.N.max_rss_channels The maximum number of allocated RSS channels for the Nth adapter. If set to 0 or unset, the number of channels is determined by the number of CPU cores. .It Va hw.sfxge.lro.table_size Size of the LRO hash table. Must be a power of 2. A larger table means we can accelerate a larger number of streams. .It Va hw.sfxge.lro.chain_max The maximum length of a hash chain. If chains get too long then the lookup time increases and may exceed the benefit of LRO. .It Va hw.sfxge.lro.idle_ticks The maximum time (in ticks) that a connection can be idle before it's LRO state is discarded. .It Va hw.sfxge.lro.slow_start_packets Number of packets with payload that must arrive in-order before a connection is eligible for LRO. The idea is we should avoid coalescing segments when the sender is in slow-start because reducing the ACK rate can damage performance. .It Va hw.sfxge.lro.loss_packets Number of packets with payload that must arrive in-order following loss before a connection is eligible for LRO. The idea is we should avoid coalescing segments when the sender is recovering from loss, because reducing the ACK rate can damage performance. .It Va hw.sfxge.mcdi_logging Enable logging of MCDI protocol messages (only available if enabled at compile-time). .It Va hw.sfxge.N.mcdi_logging Enable or disable logging of MCDI protocol messages on a per-port basis. The default for each port will be the value of .Va hw.sfxge.mcdi_logging. The logging may also be enabled or disabled after the driver is loaded using the sysctl .Va dev.sfxge.%d.mcdi_logging. +.It Va hw.sfxge.stats_update_period_ms +Period in milliseconds to refresh interface statistics from hardware. +The accepted range is 0 to 65535, the default is 1000 (1 second). +Use zero value to disable periodic statistics update. +Supported on SFN8xxx series adapters with firmware v6.2.1.1033 and later and +SFN5xxx and SFN6xxx series adapters. +SFN7xxx series adapters and SFN8xxx series with earlier firmware use a +fixed 1000 milliseconds statistics update period. .El .Sh SUPPORT For general information and support, go to the Solarflare support website at: .Pa https://support.solarflare.com . .Sh SEE ALSO .Xr cpuset 1 , .Xr arp 4 , .Xr netintro 4 , .Xr ng_ether 4 , .Xr vlan 4 , .Xr ifconfig 8 .Sh AUTHORS The .Nm driver was written by .An Philip Paeps and .An Solarflare Communications, Inc. Index: head/sys/dev/sfxge/sfxge.h =================================================================== --- head/sys/dev/sfxge/sfxge.h (revision 311976) +++ head/sys/dev/sfxge/sfxge.h (revision 311977) @@ -1,479 +1,480 @@ /*- * Copyright (c) 2010-2016 Solarflare Communications Inc. * All rights reserved. * * This software was developed in part by Philip Paeps under contract for * Solarflare Communications, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * 1. Redistributions of source code must retain the above copyright notice, * this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright notice, * this list of conditions and the following disclaimer in the documentation * and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * The views and conclusions contained in the software and documentation are * those of the authors and should not be interpreted as representing official * policies, either expressed or implied, of the FreeBSD Project. * * $FreeBSD$ */ #ifndef _SFXGE_H #define _SFXGE_H #include #include #include #include #include #include #include #include #include #include #include #include "sfxge_ioc.h" /* * Debugging */ #if 0 #define DBGPRINT(dev, fmt, args...) \ device_printf(dev, "%s: " fmt "\n", __func__, ## args) #else #define DBGPRINT(dev, fmt, args...) #endif /* * Backward-compatibility */ #ifndef CACHE_LINE_SIZE /* This should be right on most machines the driver will be used on, and * we needn't care too much about wasting a few KB per interface. */ #define CACHE_LINE_SIZE 128 #endif #ifndef IFCAP_LINKSTATE #define IFCAP_LINKSTATE 0 #endif #ifndef IFCAP_VLAN_HWTSO #define IFCAP_VLAN_HWTSO 0 #endif #ifndef IFM_10G_T #define IFM_10G_T IFM_UNKNOWN #endif #ifndef IFM_10G_KX4 #define IFM_10G_KX4 IFM_10G_CX4 #endif #ifndef IFM_40G_CR4 #define IFM_40G_CR4 IFM_UNKNOWN #endif #if (__FreeBSD_version >= 800501 && __FreeBSD_version < 900000) || \ __FreeBSD_version >= 900003 #define SFXGE_HAVE_DESCRIBE_INTR #endif #ifdef IFM_ETH_RXPAUSE #define SFXGE_HAVE_PAUSE_MEDIAOPTS #endif #ifndef CTLTYPE_U64 #define CTLTYPE_U64 CTLTYPE_QUAD #endif #include "sfxge_rx.h" #include "sfxge_tx.h" #define ROUNDUP_POW_OF_TWO(_n) (1ULL << flsl((_n) - 1)) #define SFXGE_IP_ALIGN 2 #define SFXGE_ETHERTYPE_LOOPBACK 0x9000 /* Xerox loopback */ #define SFXGE_MAGIC_RESERVED 0x8000 #define SFXGE_MAGIC_DMAQ_LABEL_WIDTH 6 #define SFXGE_MAGIC_DMAQ_LABEL_MASK \ ((1 << SFXGE_MAGIC_DMAQ_LABEL_WIDTH) - 1) enum sfxge_sw_ev { SFXGE_SW_EV_RX_QFLUSH_DONE = 1, SFXGE_SW_EV_RX_QFLUSH_FAILED, SFXGE_SW_EV_RX_QREFILL, SFXGE_SW_EV_TX_QFLUSH_DONE, }; #define SFXGE_SW_EV_MAGIC(_sw_ev) \ (SFXGE_MAGIC_RESERVED | ((_sw_ev) << SFXGE_MAGIC_DMAQ_LABEL_WIDTH)) static inline uint16_t sfxge_sw_ev_mk_magic(enum sfxge_sw_ev sw_ev, unsigned int label) { KASSERT((label & SFXGE_MAGIC_DMAQ_LABEL_MASK) == label, ("(label & SFXGE_MAGIC_DMAQ_LABEL_MASK) != label")); return SFXGE_SW_EV_MAGIC(sw_ev) | label; } static inline uint16_t sfxge_sw_ev_rxq_magic(enum sfxge_sw_ev sw_ev, struct sfxge_rxq *rxq) { return sfxge_sw_ev_mk_magic(sw_ev, 0); } static inline uint16_t sfxge_sw_ev_txq_magic(enum sfxge_sw_ev sw_ev, struct sfxge_txq *txq) { return sfxge_sw_ev_mk_magic(sw_ev, txq->type); } enum sfxge_evq_state { SFXGE_EVQ_UNINITIALIZED = 0, SFXGE_EVQ_INITIALIZED, SFXGE_EVQ_STARTING, SFXGE_EVQ_STARTED }; #define SFXGE_EV_BATCH 16384 #define SFXGE_STATS_UPDATE_PERIOD_MS 1000 struct sfxge_evq { /* Structure members below are sorted by usage order */ struct sfxge_softc *sc; struct mtx lock; unsigned int index; enum sfxge_evq_state init_state; efsys_mem_t mem; efx_evq_t *common; unsigned int read_ptr; boolean_t exception; unsigned int rx_done; unsigned int tx_done; /* Linked list of TX queues with completions to process */ struct sfxge_txq *txq; struct sfxge_txq **txqs; /* Structure members not used on event processing path */ unsigned int buf_base_id; unsigned int entries; char lock_name[SFXGE_LOCK_NAME_MAX]; } __aligned(CACHE_LINE_SIZE); #define SFXGE_NDESCS 1024 #define SFXGE_MODERATION 30 enum sfxge_intr_state { SFXGE_INTR_UNINITIALIZED = 0, SFXGE_INTR_INITIALIZED, SFXGE_INTR_TESTING, SFXGE_INTR_STARTED }; struct sfxge_intr_hdl { int eih_rid; void *eih_tag; struct resource *eih_res; }; struct sfxge_intr { enum sfxge_intr_state state; struct resource *msix_res; struct sfxge_intr_hdl *table; int n_alloc; int type; efsys_mem_t status; uint32_t zero_count; }; enum sfxge_mcdi_state { SFXGE_MCDI_UNINITIALIZED = 0, SFXGE_MCDI_INITIALIZED, SFXGE_MCDI_BUSY, SFXGE_MCDI_COMPLETED }; struct sfxge_mcdi { struct mtx lock; efsys_mem_t mem; enum sfxge_mcdi_state state; efx_mcdi_transport_t transport; /* Only used in debugging output */ char lock_name[SFXGE_LOCK_NAME_MAX]; }; struct sfxge_hw_stats { clock_t update_time; efsys_mem_t dma_buf; void *decode_buf; }; enum sfxge_port_state { SFXGE_PORT_UNINITIALIZED = 0, SFXGE_PORT_INITIALIZED, SFXGE_PORT_STARTED }; struct sfxge_port { struct sfxge_softc *sc; struct mtx lock; enum sfxge_port_state init_state; #ifndef SFXGE_HAVE_PAUSE_MEDIAOPTS unsigned int wanted_fc; #endif struct sfxge_hw_stats phy_stats; struct sfxge_hw_stats mac_stats; + uint16_t stats_update_period_ms; efx_link_mode_t link_mode; uint8_t mcast_addrs[EFX_MAC_MULTICAST_LIST_MAX * EFX_MAC_ADDR_LEN]; unsigned int mcast_count; /* Only used in debugging output */ char lock_name[SFXGE_LOCK_NAME_MAX]; }; enum sfxge_softc_state { SFXGE_UNINITIALIZED = 0, SFXGE_INITIALIZED, SFXGE_REGISTERED, SFXGE_STARTED }; struct sfxge_softc { device_t dev; struct sx softc_lock; char softc_lock_name[SFXGE_LOCK_NAME_MAX]; enum sfxge_softc_state init_state; struct ifnet *ifnet; unsigned int if_flags; struct sysctl_oid *stats_node; struct sysctl_oid *txqs_node; struct task task_reset; efx_family_t family; caddr_t vpd_data; size_t vpd_size; efx_nic_t *enp; efsys_lock_t enp_lock; unsigned int rxq_entries; unsigned int txq_entries; bus_dma_tag_t parent_dma_tag; efsys_bar_t bar; struct sfxge_intr intr; struct sfxge_mcdi mcdi; struct sfxge_port port; uint32_t buffer_table_next; struct sfxge_evq *evq[SFXGE_RX_SCALE_MAX]; unsigned int ev_moderation; #if EFSYS_OPT_QSTATS clock_t ev_stats_update_time; uint64_t ev_stats[EV_NQSTATS]; #endif unsigned int max_rss_channels; uma_zone_t rxq_cache; struct sfxge_rxq *rxq[SFXGE_RX_SCALE_MAX]; unsigned int rx_indir_table[EFX_RSS_TBL_SIZE]; struct sfxge_txq *txq[SFXGE_TXQ_NTYPES + SFXGE_RX_SCALE_MAX]; struct ifmedia media; size_t rx_prefix_size; size_t rx_buffer_size; size_t rx_buffer_align; int rx_cluster_size; unsigned int evq_max; unsigned int evq_count; unsigned int rxq_count; unsigned int txq_count; unsigned int tso_fw_assisted; #define SFXGE_FATSOV1 (1 << 0) #define SFXGE_FATSOV2 (1 << 1) #if EFSYS_OPT_MCDI_LOGGING int mcdi_logging; #endif }; #define SFXGE_LINK_UP(sc) \ ((sc)->port.link_mode != EFX_LINK_DOWN && \ (sc)->port.link_mode != EFX_LINK_UNKNOWN) #define SFXGE_RUNNING(sc) ((sc)->ifnet->if_drv_flags & IFF_DRV_RUNNING) #define SFXGE_PARAM(_name) "hw.sfxge." #_name SYSCTL_DECL(_hw_sfxge); /* * From sfxge.c. */ extern void sfxge_schedule_reset(struct sfxge_softc *sc); extern void sfxge_sram_buf_tbl_alloc(struct sfxge_softc *sc, size_t n, uint32_t *idp); /* * From sfxge_dma.c. */ extern int sfxge_dma_init(struct sfxge_softc *sc); extern void sfxge_dma_fini(struct sfxge_softc *sc); extern int sfxge_dma_alloc(struct sfxge_softc *sc, bus_size_t len, efsys_mem_t *esmp); extern void sfxge_dma_free(efsys_mem_t *esmp); extern int sfxge_dma_map_sg_collapse(bus_dma_tag_t tag, bus_dmamap_t map, struct mbuf **mp, bus_dma_segment_t *segs, int *nsegs, int maxsegs); /* * From sfxge_ev.c. */ extern int sfxge_ev_init(struct sfxge_softc *sc); extern void sfxge_ev_fini(struct sfxge_softc *sc); extern int sfxge_ev_start(struct sfxge_softc *sc); extern void sfxge_ev_stop(struct sfxge_softc *sc); extern int sfxge_ev_qpoll(struct sfxge_evq *evq); /* * From sfxge_intr.c. */ extern int sfxge_intr_init(struct sfxge_softc *sc); extern void sfxge_intr_fini(struct sfxge_softc *sc); extern int sfxge_intr_start(struct sfxge_softc *sc); extern void sfxge_intr_stop(struct sfxge_softc *sc); /* * From sfxge_mcdi.c. */ extern int sfxge_mcdi_init(struct sfxge_softc *sc); extern void sfxge_mcdi_fini(struct sfxge_softc *sc); extern int sfxge_mcdi_ioctl(struct sfxge_softc *sc, sfxge_ioc_t *ip); /* * From sfxge_nvram.c. */ extern int sfxge_nvram_ioctl(struct sfxge_softc *sc, sfxge_ioc_t *ip); /* * From sfxge_port.c. */ extern int sfxge_port_init(struct sfxge_softc *sc); extern void sfxge_port_fini(struct sfxge_softc *sc); extern int sfxge_port_start(struct sfxge_softc *sc); extern void sfxge_port_stop(struct sfxge_softc *sc); extern void sfxge_mac_link_update(struct sfxge_softc *sc, efx_link_mode_t mode); extern int sfxge_mac_filter_set(struct sfxge_softc *sc); extern int sfxge_port_ifmedia_init(struct sfxge_softc *sc); extern uint64_t sfxge_get_counter(struct ifnet *ifp, ift_counter c); #define SFXGE_MAX_MTU (9 * 1024) #define SFXGE_ADAPTER_LOCK_INIT(_sc, _ifname) \ do { \ struct sfxge_softc *__sc = (_sc); \ \ snprintf((__sc)->softc_lock_name, \ sizeof((__sc)->softc_lock_name), \ "%s:softc", (_ifname)); \ sx_init(&(__sc)->softc_lock, (__sc)->softc_lock_name); \ } while (B_FALSE) #define SFXGE_ADAPTER_LOCK_DESTROY(_sc) \ sx_destroy(&(_sc)->softc_lock) #define SFXGE_ADAPTER_LOCK(_sc) \ sx_xlock(&(_sc)->softc_lock) #define SFXGE_ADAPTER_UNLOCK(_sc) \ sx_xunlock(&(_sc)->softc_lock) #define SFXGE_ADAPTER_LOCK_ASSERT_OWNED(_sc) \ sx_assert(&(_sc)->softc_lock, LA_XLOCKED) #define SFXGE_PORT_LOCK_INIT(_port, _ifname) \ do { \ struct sfxge_port *__port = (_port); \ \ snprintf((__port)->lock_name, \ sizeof((__port)->lock_name), \ "%s:port", (_ifname)); \ mtx_init(&(__port)->lock, (__port)->lock_name, \ NULL, MTX_DEF); \ } while (B_FALSE) #define SFXGE_PORT_LOCK_DESTROY(_port) \ mtx_destroy(&(_port)->lock) #define SFXGE_PORT_LOCK(_port) \ mtx_lock(&(_port)->lock) #define SFXGE_PORT_UNLOCK(_port) \ mtx_unlock(&(_port)->lock) #define SFXGE_PORT_LOCK_ASSERT_OWNED(_port) \ mtx_assert(&(_port)->lock, MA_OWNED) #define SFXGE_MCDI_LOCK_INIT(_mcdi, _ifname) \ do { \ struct sfxge_mcdi *__mcdi = (_mcdi); \ \ snprintf((__mcdi)->lock_name, \ sizeof((__mcdi)->lock_name), \ "%s:mcdi", (_ifname)); \ mtx_init(&(__mcdi)->lock, (__mcdi)->lock_name, \ NULL, MTX_DEF); \ } while (B_FALSE) #define SFXGE_MCDI_LOCK_DESTROY(_mcdi) \ mtx_destroy(&(_mcdi)->lock) #define SFXGE_MCDI_LOCK(_mcdi) \ mtx_lock(&(_mcdi)->lock) #define SFXGE_MCDI_UNLOCK(_mcdi) \ mtx_unlock(&(_mcdi)->lock) #define SFXGE_MCDI_LOCK_ASSERT_OWNED(_mcdi) \ mtx_assert(&(_mcdi)->lock, MA_OWNED) #define SFXGE_EVQ_LOCK_INIT(_evq, _ifname, _evq_index) \ do { \ struct sfxge_evq *__evq = (_evq); \ \ snprintf((__evq)->lock_name, \ sizeof((__evq)->lock_name), \ "%s:evq%u", (_ifname), (_evq_index)); \ mtx_init(&(__evq)->lock, (__evq)->lock_name, \ NULL, MTX_DEF); \ } while (B_FALSE) #define SFXGE_EVQ_LOCK_DESTROY(_evq) \ mtx_destroy(&(_evq)->lock) #define SFXGE_EVQ_LOCK(_evq) \ mtx_lock(&(_evq)->lock) #define SFXGE_EVQ_UNLOCK(_evq) \ mtx_unlock(&(_evq)->lock) #define SFXGE_EVQ_LOCK_ASSERT_OWNED(_evq) \ mtx_assert(&(_evq)->lock, MA_OWNED) #endif /* _SFXGE_H */ Index: head/sys/dev/sfxge/sfxge_port.c =================================================================== --- head/sys/dev/sfxge/sfxge_port.c (revision 311976) +++ head/sys/dev/sfxge/sfxge_port.c (revision 311977) @@ -1,1005 +1,1035 @@ /*- * Copyright (c) 2010-2016 Solarflare Communications Inc. * All rights reserved. * * This software was developed in part by Philip Paeps under contract for * Solarflare Communications, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * 1. Redistributions of source code must retain the above copyright notice, * this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright notice, * this list of conditions and the following disclaimer in the documentation * and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * The views and conclusions contained in the software and documentation are * those of the authors and should not be interpreted as representing official * policies, either expressed or implied, of the FreeBSD Project. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include "common/efx.h" #include "sfxge.h" +#define SFXGE_PARAM_STATS_UPDATE_PERIOD_MS \ + SFXGE_PARAM(stats_update_period_ms) +static int sfxge_stats_update_period_ms = SFXGE_STATS_UPDATE_PERIOD_MS; +TUNABLE_INT(SFXGE_PARAM_STATS_UPDATE_PERIOD_MS, + &sfxge_stats_update_period_ms); +SYSCTL_INT(_hw_sfxge, OID_AUTO, stats_update_period_ms, CTLFLAG_RDTUN, + &sfxge_stats_update_period_ms, 0, + "netstat interface statistics update period in milliseconds"); + static int sfxge_phy_cap_mask(struct sfxge_softc *, int, uint32_t *); static int sfxge_mac_stat_update(struct sfxge_softc *sc) { struct sfxge_port *port = &sc->port; efsys_mem_t *esmp = &(port->mac_stats.dma_buf); clock_t now; unsigned int min_ticks; unsigned int count; int rc; SFXGE_PORT_LOCK_ASSERT_OWNED(port); if (__predict_false(port->init_state != SFXGE_PORT_STARTED)) { rc = 0; goto out; } - min_ticks = (unsigned int)hz * SFXGE_STATS_UPDATE_PERIOD_MS / 1000; + min_ticks = (unsigned int)hz * port->stats_update_period_ms / 1000; now = ticks; if ((unsigned int)(now - port->mac_stats.update_time) < min_ticks) { rc = 0; goto out; } port->mac_stats.update_time = now; /* If we're unlucky enough to read statistics wduring the DMA, wait * up to 10ms for it to finish (typically takes <500us) */ for (count = 0; count < 100; ++count) { EFSYS_PROBE1(wait, unsigned int, count); /* Try to update the cached counters */ if ((rc = efx_mac_stats_update(sc->enp, esmp, port->mac_stats.decode_buf, NULL)) != EAGAIN) goto out; DELAY(100); } rc = ETIMEDOUT; out: return (rc); } uint64_t sfxge_get_counter(struct ifnet *ifp, ift_counter c) { struct sfxge_softc *sc = ifp->if_softc; uint64_t *mac_stats; uint64_t val; SFXGE_PORT_LOCK(&sc->port); /* Ignore error and use old values */ (void)sfxge_mac_stat_update(sc); mac_stats = (uint64_t *)sc->port.mac_stats.decode_buf; switch (c) { case IFCOUNTER_IPACKETS: val = mac_stats[EFX_MAC_RX_PKTS]; break; case IFCOUNTER_IERRORS: val = mac_stats[EFX_MAC_RX_ERRORS]; break; case IFCOUNTER_OPACKETS: val = mac_stats[EFX_MAC_TX_PKTS]; break; case IFCOUNTER_OERRORS: val = mac_stats[EFX_MAC_TX_ERRORS]; break; case IFCOUNTER_COLLISIONS: val = mac_stats[EFX_MAC_TX_SGL_COL_PKTS] + mac_stats[EFX_MAC_TX_MULT_COL_PKTS] + mac_stats[EFX_MAC_TX_EX_COL_PKTS] + mac_stats[EFX_MAC_TX_LATE_COL_PKTS]; break; case IFCOUNTER_IBYTES: val = mac_stats[EFX_MAC_RX_OCTETS]; break; case IFCOUNTER_OBYTES: val = mac_stats[EFX_MAC_TX_OCTETS]; break; case IFCOUNTER_OMCASTS: val = mac_stats[EFX_MAC_TX_MULTICST_PKTS] + mac_stats[EFX_MAC_TX_BRDCST_PKTS]; break; case IFCOUNTER_OQDROPS: SFXGE_PORT_UNLOCK(&sc->port); return (sfxge_tx_get_drops(sc)); case IFCOUNTER_IMCASTS: /* if_imcasts is maintained in net/if_ethersubr.c */ case IFCOUNTER_IQDROPS: /* if_iqdrops is maintained in net/if_ethersubr.c */ case IFCOUNTER_NOPROTO: /* if_noproto is maintained in net/if_ethersubr.c */ default: SFXGE_PORT_UNLOCK(&sc->port); return (if_get_counter_default(ifp, c)); } SFXGE_PORT_UNLOCK(&sc->port); return (val); } static int sfxge_mac_stat_handler(SYSCTL_HANDLER_ARGS) { struct sfxge_softc *sc = arg1; unsigned int id = arg2; int rc; uint64_t val; SFXGE_PORT_LOCK(&sc->port); if ((rc = sfxge_mac_stat_update(sc)) == 0) val = ((uint64_t *)sc->port.mac_stats.decode_buf)[id]; SFXGE_PORT_UNLOCK(&sc->port); if (rc == 0) rc = SYSCTL_OUT(req, &val, sizeof(val)); return (rc); } static void sfxge_mac_stat_init(struct sfxge_softc *sc) { struct sysctl_ctx_list *ctx = device_get_sysctl_ctx(sc->dev); struct sysctl_oid_list *stat_list; unsigned int id; const char *name; stat_list = SYSCTL_CHILDREN(sc->stats_node); /* Initialise the named stats */ for (id = 0; id < EFX_MAC_NSTATS; id++) { name = efx_mac_stat_name(sc->enp, id); SYSCTL_ADD_PROC( ctx, stat_list, OID_AUTO, name, CTLTYPE_U64|CTLFLAG_RD, sc, id, sfxge_mac_stat_handler, "Q", ""); } } #ifdef SFXGE_HAVE_PAUSE_MEDIAOPTS static unsigned int sfxge_port_wanted_fc(struct sfxge_softc *sc) { struct ifmedia_entry *ifm = sc->media.ifm_cur; if (ifm->ifm_media == (IFM_ETHER | IFM_AUTO)) return (EFX_FCNTL_RESPOND | EFX_FCNTL_GENERATE); return (((ifm->ifm_media & IFM_ETH_RXPAUSE) ? EFX_FCNTL_RESPOND : 0) | ((ifm->ifm_media & IFM_ETH_TXPAUSE) ? EFX_FCNTL_GENERATE : 0)); } static unsigned int sfxge_port_link_fc_ifm(struct sfxge_softc *sc) { unsigned int wanted_fc, link_fc; efx_mac_fcntl_get(sc->enp, &wanted_fc, &link_fc); return ((link_fc & EFX_FCNTL_RESPOND) ? IFM_ETH_RXPAUSE : 0) | ((link_fc & EFX_FCNTL_GENERATE) ? IFM_ETH_TXPAUSE : 0); } #else /* !SFXGE_HAVE_PAUSE_MEDIAOPTS */ static unsigned int sfxge_port_wanted_fc(struct sfxge_softc *sc) { return (sc->port.wanted_fc); } static unsigned int sfxge_port_link_fc_ifm(struct sfxge_softc *sc) { return (0); } static int sfxge_port_wanted_fc_handler(SYSCTL_HANDLER_ARGS) { struct sfxge_softc *sc; struct sfxge_port *port; unsigned int fcntl; int error; sc = arg1; port = &sc->port; if (req->newptr != NULL) { if ((error = SYSCTL_IN(req, &fcntl, sizeof(fcntl))) != 0) return (error); SFXGE_PORT_LOCK(port); if (port->wanted_fc != fcntl) { if (port->init_state == SFXGE_PORT_STARTED) error = efx_mac_fcntl_set(sc->enp, port->wanted_fc, B_TRUE); if (error == 0) port->wanted_fc = fcntl; } SFXGE_PORT_UNLOCK(port); } else { SFXGE_PORT_LOCK(port); fcntl = port->wanted_fc; SFXGE_PORT_UNLOCK(port); error = SYSCTL_OUT(req, &fcntl, sizeof(fcntl)); } return (error); } static int sfxge_port_link_fc_handler(SYSCTL_HANDLER_ARGS) { struct sfxge_softc *sc; struct sfxge_port *port; unsigned int wanted_fc, link_fc; sc = arg1; port = &sc->port; SFXGE_PORT_LOCK(port); if (__predict_true(port->init_state == SFXGE_PORT_STARTED) && SFXGE_LINK_UP(sc)) efx_mac_fcntl_get(sc->enp, &wanted_fc, &link_fc); else link_fc = 0; SFXGE_PORT_UNLOCK(port); return (SYSCTL_OUT(req, &link_fc, sizeof(link_fc))); } #endif /* SFXGE_HAVE_PAUSE_MEDIAOPTS */ static const uint64_t sfxge_link_baudrate[EFX_LINK_NMODES] = { [EFX_LINK_10HDX] = IF_Mbps(10), [EFX_LINK_10FDX] = IF_Mbps(10), [EFX_LINK_100HDX] = IF_Mbps(100), [EFX_LINK_100FDX] = IF_Mbps(100), [EFX_LINK_1000HDX] = IF_Gbps(1), [EFX_LINK_1000FDX] = IF_Gbps(1), [EFX_LINK_10000FDX] = IF_Gbps(10), [EFX_LINK_40000FDX] = IF_Gbps(40), }; void sfxge_mac_link_update(struct sfxge_softc *sc, efx_link_mode_t mode) { struct sfxge_port *port; int link_state; port = &sc->port; if (port->link_mode == mode) return; port->link_mode = mode; /* Push link state update to the OS */ link_state = (SFXGE_LINK_UP(sc) ? LINK_STATE_UP : LINK_STATE_DOWN); sc->ifnet->if_baudrate = sfxge_link_baudrate[port->link_mode]; if_link_state_change(sc->ifnet, link_state); } static void sfxge_mac_poll_work(void *arg, int npending) { struct sfxge_softc *sc; efx_nic_t *enp; struct sfxge_port *port; efx_link_mode_t mode; sc = (struct sfxge_softc *)arg; enp = sc->enp; port = &sc->port; SFXGE_PORT_LOCK(port); if (__predict_false(port->init_state != SFXGE_PORT_STARTED)) goto done; /* This may sleep waiting for MCDI completion */ (void)efx_port_poll(enp, &mode); sfxge_mac_link_update(sc, mode); done: SFXGE_PORT_UNLOCK(port); } static int sfxge_mac_multicast_list_set(struct sfxge_softc *sc) { struct ifnet *ifp = sc->ifnet; struct sfxge_port *port = &sc->port; uint8_t *mcast_addr = port->mcast_addrs; struct ifmultiaddr *ifma; struct sockaddr_dl *sa; int rc = 0; mtx_assert(&port->lock, MA_OWNED); port->mcast_count = 0; if_maddr_rlock(ifp); TAILQ_FOREACH(ifma, &ifp->if_multiaddrs, ifma_link) { if (ifma->ifma_addr->sa_family == AF_LINK) { if (port->mcast_count == EFX_MAC_MULTICAST_LIST_MAX) { device_printf(sc->dev, "Too many multicast addresses\n"); rc = EINVAL; break; } sa = (struct sockaddr_dl *)ifma->ifma_addr; memcpy(mcast_addr, LLADDR(sa), EFX_MAC_ADDR_LEN); mcast_addr += EFX_MAC_ADDR_LEN; ++port->mcast_count; } } if_maddr_runlock(ifp); if (rc == 0) { rc = efx_mac_multicast_list_set(sc->enp, port->mcast_addrs, port->mcast_count); if (rc != 0) device_printf(sc->dev, "Cannot set multicast address list\n"); } return (rc); } static int sfxge_mac_filter_set_locked(struct sfxge_softc *sc) { struct ifnet *ifp = sc->ifnet; struct sfxge_port *port = &sc->port; boolean_t all_mulcst; int rc; mtx_assert(&port->lock, MA_OWNED); all_mulcst = !!(ifp->if_flags & (IFF_PROMISC | IFF_ALLMULTI)); rc = sfxge_mac_multicast_list_set(sc); /* Fallback to all multicast if cannot set multicast list */ if (rc != 0) all_mulcst = B_TRUE; rc = efx_mac_filter_set(sc->enp, !!(ifp->if_flags & IFF_PROMISC), (port->mcast_count > 0), all_mulcst, B_TRUE); return (rc); } int sfxge_mac_filter_set(struct sfxge_softc *sc) { struct sfxge_port *port = &sc->port; int rc; SFXGE_PORT_LOCK(port); /* * The function may be called without softc_lock held in the * case of SIOCADDMULTI and SIOCDELMULTI ioctls. ioctl handler * checks IFF_DRV_RUNNING flag which implies port started, but * it is not guaranteed to remain. softc_lock shared lock can't * be held in the case of these ioctls processing, since it * results in failure where kernel complains that non-sleepable * lock is held in sleeping thread. Both problems are repeatable * on LAG with LACP proto bring up. */ if (__predict_true(port->init_state == SFXGE_PORT_STARTED)) rc = sfxge_mac_filter_set_locked(sc); else rc = 0; SFXGE_PORT_UNLOCK(port); return (rc); } void sfxge_port_stop(struct sfxge_softc *sc) { struct sfxge_port *port; efx_nic_t *enp; port = &sc->port; enp = sc->enp; SFXGE_PORT_LOCK(port); KASSERT(port->init_state == SFXGE_PORT_STARTED, ("port not started")); port->init_state = SFXGE_PORT_INITIALIZED; port->mac_stats.update_time = 0; /* This may call MCDI */ (void)efx_mac_drain(enp, B_TRUE); (void)efx_mac_stats_periodic(enp, &port->mac_stats.dma_buf, 0, B_FALSE); port->link_mode = EFX_LINK_UNKNOWN; /* Destroy the common code port object. */ efx_port_fini(enp); efx_filter_fini(enp); SFXGE_PORT_UNLOCK(port); } int sfxge_port_start(struct sfxge_softc *sc) { uint8_t mac_addr[ETHER_ADDR_LEN]; struct ifnet *ifp = sc->ifnet; struct sfxge_port *port; efx_nic_t *enp; size_t pdu; int rc; uint32_t phy_cap_mask; port = &sc->port; enp = sc->enp; SFXGE_PORT_LOCK(port); KASSERT(port->init_state == SFXGE_PORT_INITIALIZED, ("port not initialized")); /* Initialise the required filtering */ if ((rc = efx_filter_init(enp)) != 0) goto fail_filter_init; /* Initialize the port object in the common code. */ if ((rc = efx_port_init(sc->enp)) != 0) goto fail; /* Set the SDU */ pdu = EFX_MAC_PDU(ifp->if_mtu); if ((rc = efx_mac_pdu_set(enp, pdu)) != 0) goto fail2; if ((rc = efx_mac_fcntl_set(enp, sfxge_port_wanted_fc(sc), B_TRUE)) != 0) goto fail3; /* Set the unicast address */ if_addr_rlock(ifp); bcopy(LLADDR((struct sockaddr_dl *)ifp->if_addr->ifa_addr), mac_addr, sizeof(mac_addr)); if_addr_runlock(ifp); if ((rc = efx_mac_addr_set(enp, mac_addr)) != 0) goto fail4; sfxge_mac_filter_set_locked(sc); /* Update MAC stats by DMA every period */ if ((rc = efx_mac_stats_periodic(enp, &port->mac_stats.dma_buf, - SFXGE_STATS_UPDATE_PERIOD_MS, + port->stats_update_period_ms, B_FALSE)) != 0) goto fail6; if ((rc = efx_mac_drain(enp, B_FALSE)) != 0) goto fail8; if ((rc = sfxge_phy_cap_mask(sc, sc->media.ifm_cur->ifm_media, &phy_cap_mask)) != 0) goto fail9; if ((rc = efx_phy_adv_cap_set(sc->enp, phy_cap_mask)) != 0) goto fail10; port->init_state = SFXGE_PORT_STARTED; /* Single poll in case there were missing initial events */ SFXGE_PORT_UNLOCK(port); sfxge_mac_poll_work(sc, 0); return (0); fail10: fail9: (void)efx_mac_drain(enp, B_TRUE); fail8: (void)efx_mac_stats_periodic(enp, &port->mac_stats.dma_buf, 0, B_FALSE); fail6: fail4: fail3: fail2: efx_port_fini(enp); fail: efx_filter_fini(enp); fail_filter_init: SFXGE_PORT_UNLOCK(port); return (rc); } static int sfxge_phy_stat_update(struct sfxge_softc *sc) { struct sfxge_port *port = &sc->port; efsys_mem_t *esmp = &port->phy_stats.dma_buf; clock_t now; unsigned int count; int rc; SFXGE_PORT_LOCK_ASSERT_OWNED(port); if (__predict_false(port->init_state != SFXGE_PORT_STARTED)) { rc = 0; goto out; } now = ticks; if ((unsigned int)(now - port->phy_stats.update_time) < (unsigned int)hz) { rc = 0; goto out; } port->phy_stats.update_time = now; /* If we're unlucky enough to read statistics wduring the DMA, wait * up to 10ms for it to finish (typically takes <500us) */ for (count = 0; count < 100; ++count) { EFSYS_PROBE1(wait, unsigned int, count); /* Synchronize the DMA memory for reading */ bus_dmamap_sync(esmp->esm_tag, esmp->esm_map, BUS_DMASYNC_POSTREAD); /* Try to update the cached counters */ if ((rc = efx_phy_stats_update(sc->enp, esmp, port->phy_stats.decode_buf)) != EAGAIN) goto out; DELAY(100); } rc = ETIMEDOUT; out: return (rc); } static int sfxge_phy_stat_handler(SYSCTL_HANDLER_ARGS) { struct sfxge_softc *sc = arg1; unsigned int id = arg2; int rc; uint32_t val; SFXGE_PORT_LOCK(&sc->port); if ((rc = sfxge_phy_stat_update(sc)) == 0) val = ((uint32_t *)sc->port.phy_stats.decode_buf)[id]; SFXGE_PORT_UNLOCK(&sc->port); if (rc == 0) rc = SYSCTL_OUT(req, &val, sizeof(val)); return (rc); } static void sfxge_phy_stat_init(struct sfxge_softc *sc) { struct sysctl_ctx_list *ctx = device_get_sysctl_ctx(sc->dev); struct sysctl_oid_list *stat_list; unsigned int id; const char *name; uint64_t stat_mask = efx_nic_cfg_get(sc->enp)->enc_phy_stat_mask; stat_list = SYSCTL_CHILDREN(sc->stats_node); /* Initialise the named stats */ for (id = 0; id < EFX_PHY_NSTATS; id++) { if (!(stat_mask & ((uint64_t)1 << id))) continue; name = efx_phy_stat_name(sc->enp, id); SYSCTL_ADD_PROC( ctx, stat_list, OID_AUTO, name, CTLTYPE_UINT|CTLFLAG_RD, sc, id, sfxge_phy_stat_handler, id == EFX_PHY_STAT_OUI ? "IX" : "IU", ""); } } void sfxge_port_fini(struct sfxge_softc *sc) { struct sfxge_port *port; efsys_mem_t *esmp; port = &sc->port; esmp = &port->mac_stats.dma_buf; KASSERT(port->init_state == SFXGE_PORT_INITIALIZED, ("Port not initialized")); port->init_state = SFXGE_PORT_UNINITIALIZED; port->link_mode = EFX_LINK_UNKNOWN; /* Finish with PHY DMA memory */ sfxge_dma_free(&port->phy_stats.dma_buf); free(port->phy_stats.decode_buf, M_SFXGE); sfxge_dma_free(esmp); free(port->mac_stats.decode_buf, M_SFXGE); SFXGE_PORT_LOCK_DESTROY(port); port->sc = NULL; } +static uint16_t +sfxge_port_stats_update_period_ms(struct sfxge_softc *sc) +{ + int period_ms = sfxge_stats_update_period_ms; + + if (period_ms < 0) { + device_printf(sc->dev, + "treat negative stats update period %d as 0 (disable)\n", + period_ms); + period_ms = 0; + } else if (period_ms > UINT16_MAX) { + device_printf(sc->dev, + "treat too big stats update period %d as %u\n", + period_ms, UINT16_MAX); + period_ms = UINT16_MAX; + } + + return period_ms; +} + int sfxge_port_init(struct sfxge_softc *sc) { struct sfxge_port *port; struct sysctl_ctx_list *sysctl_ctx; struct sysctl_oid *sysctl_tree; efsys_mem_t *mac_stats_buf, *phy_stats_buf; int rc; port = &sc->port; mac_stats_buf = &port->mac_stats.dma_buf; phy_stats_buf = &port->phy_stats.dma_buf; KASSERT(port->init_state == SFXGE_PORT_UNINITIALIZED, ("Port already initialized")); port->sc = sc; SFXGE_PORT_LOCK_INIT(port, device_get_nameunit(sc->dev)); DBGPRINT(sc->dev, "alloc PHY stats"); port->phy_stats.decode_buf = malloc(EFX_PHY_NSTATS * sizeof(uint32_t), M_SFXGE, M_WAITOK | M_ZERO); if ((rc = sfxge_dma_alloc(sc, EFX_PHY_STATS_SIZE, phy_stats_buf)) != 0) goto fail; sfxge_phy_stat_init(sc); DBGPRINT(sc->dev, "init sysctl"); sysctl_ctx = device_get_sysctl_ctx(sc->dev); sysctl_tree = device_get_sysctl_tree(sc->dev); #ifndef SFXGE_HAVE_PAUSE_MEDIAOPTS /* If flow control cannot be configured or reported through * ifmedia, provide sysctls for it. */ port->wanted_fc = EFX_FCNTL_RESPOND | EFX_FCNTL_GENERATE; SYSCTL_ADD_PROC(sysctl_ctx, SYSCTL_CHILDREN(sysctl_tree), OID_AUTO, "wanted_fc", CTLTYPE_UINT|CTLFLAG_RW, sc, 0, sfxge_port_wanted_fc_handler, "IU", "wanted flow control mode"); SYSCTL_ADD_PROC(sysctl_ctx, SYSCTL_CHILDREN(sysctl_tree), OID_AUTO, "link_fc", CTLTYPE_UINT|CTLFLAG_RD, sc, 0, sfxge_port_link_fc_handler, "IU", "link flow control mode"); #endif DBGPRINT(sc->dev, "alloc MAC stats"); port->mac_stats.decode_buf = malloc(EFX_MAC_NSTATS * sizeof(uint64_t), M_SFXGE, M_WAITOK | M_ZERO); if ((rc = sfxge_dma_alloc(sc, EFX_MAC_STATS_SIZE, mac_stats_buf)) != 0) goto fail2; + port->stats_update_period_ms = sfxge_port_stats_update_period_ms(sc); sfxge_mac_stat_init(sc); port->init_state = SFXGE_PORT_INITIALIZED; DBGPRINT(sc->dev, "success"); return (0); fail2: free(port->mac_stats.decode_buf, M_SFXGE); sfxge_dma_free(phy_stats_buf); fail: free(port->phy_stats.decode_buf, M_SFXGE); SFXGE_PORT_LOCK_DESTROY(port); port->sc = NULL; DBGPRINT(sc->dev, "failed %d", rc); return (rc); } static const int sfxge_link_mode[EFX_PHY_MEDIA_NTYPES][EFX_LINK_NMODES] = { [EFX_PHY_MEDIA_CX4] = { [EFX_LINK_10000FDX] = IFM_ETHER | IFM_FDX | IFM_10G_CX4, }, [EFX_PHY_MEDIA_KX4] = { [EFX_LINK_10000FDX] = IFM_ETHER | IFM_FDX | IFM_10G_KX4, }, [EFX_PHY_MEDIA_XFP] = { /* Don't know the module type, but assume SR for now. */ [EFX_LINK_10000FDX] = IFM_ETHER | IFM_FDX | IFM_10G_SR, }, [EFX_PHY_MEDIA_QSFP_PLUS] = { /* Don't know the module type, but assume SR for now. */ [EFX_LINK_10000FDX] = IFM_ETHER | IFM_FDX | IFM_10G_SR, [EFX_LINK_40000FDX] = IFM_ETHER | IFM_FDX | IFM_40G_CR4, }, [EFX_PHY_MEDIA_SFP_PLUS] = { /* Don't know the module type, but assume SX/SR for now. */ [EFX_LINK_1000FDX] = IFM_ETHER | IFM_FDX | IFM_1000_SX, [EFX_LINK_10000FDX] = IFM_ETHER | IFM_FDX | IFM_10G_SR, }, [EFX_PHY_MEDIA_BASE_T] = { [EFX_LINK_10HDX] = IFM_ETHER | IFM_HDX | IFM_10_T, [EFX_LINK_10FDX] = IFM_ETHER | IFM_FDX | IFM_10_T, [EFX_LINK_100HDX] = IFM_ETHER | IFM_HDX | IFM_100_TX, [EFX_LINK_100FDX] = IFM_ETHER | IFM_FDX | IFM_100_TX, [EFX_LINK_1000HDX] = IFM_ETHER | IFM_HDX | IFM_1000_T, [EFX_LINK_1000FDX] = IFM_ETHER | IFM_FDX | IFM_1000_T, [EFX_LINK_10000FDX] = IFM_ETHER | IFM_FDX | IFM_10G_T, }, }; static void sfxge_media_status(struct ifnet *ifp, struct ifmediareq *ifmr) { struct sfxge_softc *sc; efx_phy_media_type_t medium_type; efx_link_mode_t mode; sc = ifp->if_softc; SFXGE_ADAPTER_LOCK(sc); ifmr->ifm_status = IFM_AVALID; ifmr->ifm_active = IFM_ETHER; if (SFXGE_RUNNING(sc) && SFXGE_LINK_UP(sc)) { ifmr->ifm_status |= IFM_ACTIVE; efx_phy_media_type_get(sc->enp, &medium_type); mode = sc->port.link_mode; ifmr->ifm_active |= sfxge_link_mode[medium_type][mode]; ifmr->ifm_active |= sfxge_port_link_fc_ifm(sc); } SFXGE_ADAPTER_UNLOCK(sc); } static efx_phy_cap_type_t sfxge_link_mode_to_phy_cap(efx_link_mode_t mode) { switch (mode) { case EFX_LINK_10HDX: return (EFX_PHY_CAP_10HDX); case EFX_LINK_10FDX: return (EFX_PHY_CAP_10FDX); case EFX_LINK_100HDX: return (EFX_PHY_CAP_100HDX); case EFX_LINK_100FDX: return (EFX_PHY_CAP_100FDX); case EFX_LINK_1000HDX: return (EFX_PHY_CAP_1000HDX); case EFX_LINK_1000FDX: return (EFX_PHY_CAP_1000FDX); case EFX_LINK_10000FDX: return (EFX_PHY_CAP_10000FDX); case EFX_LINK_40000FDX: return (EFX_PHY_CAP_40000FDX); default: EFSYS_ASSERT(B_FALSE); return (EFX_PHY_CAP_INVALID); } } static int sfxge_phy_cap_mask(struct sfxge_softc *sc, int ifmedia, uint32_t *phy_cap_mask) { /* Get global options (duplex), type and subtype bits */ int ifmedia_masked = ifmedia & (IFM_GMASK | IFM_NMASK | IFM_TMASK); efx_phy_media_type_t medium_type; boolean_t mode_found = B_FALSE; uint32_t cap_mask, mode_cap_mask; efx_link_mode_t mode; efx_phy_cap_type_t phy_cap; efx_phy_media_type_get(sc->enp, &medium_type); if (medium_type >= nitems(sfxge_link_mode)) { if_printf(sc->ifnet, "unexpected media type %d\n", medium_type); return (EINVAL); } efx_phy_adv_cap_get(sc->enp, EFX_PHY_CAP_PERM, &cap_mask); for (mode = EFX_LINK_10HDX; mode < EFX_LINK_NMODES; mode++) { if (ifmedia_masked == sfxge_link_mode[medium_type][mode]) { mode_found = B_TRUE; break; } } if (!mode_found) { /* * If media is not in the table, it must be IFM_AUTO. */ KASSERT((cap_mask & (1 << EFX_PHY_CAP_AN)) && ifmedia_masked == (IFM_ETHER | IFM_AUTO), ("%s: no mode for media %#x", __func__, ifmedia)); *phy_cap_mask = (cap_mask & ~(1 << EFX_PHY_CAP_ASYM)); return (0); } phy_cap = sfxge_link_mode_to_phy_cap(mode); if (phy_cap == EFX_PHY_CAP_INVALID) { if_printf(sc->ifnet, "cannot map link mode %d to phy capability\n", mode); return (EINVAL); } mode_cap_mask = (1 << phy_cap); mode_cap_mask |= cap_mask & (1 << EFX_PHY_CAP_AN); #ifdef SFXGE_HAVE_PAUSE_MEDIAOPTS if (ifmedia & IFM_ETH_RXPAUSE) mode_cap_mask |= cap_mask & (1 << EFX_PHY_CAP_PAUSE); if (!(ifmedia & IFM_ETH_TXPAUSE)) mode_cap_mask |= cap_mask & (1 << EFX_PHY_CAP_ASYM); #else mode_cap_mask |= cap_mask & (1 << EFX_PHY_CAP_PAUSE); #endif *phy_cap_mask = mode_cap_mask; return (0); } static int sfxge_media_change(struct ifnet *ifp) { struct sfxge_softc *sc; struct ifmedia_entry *ifm; int rc; uint32_t phy_cap_mask; sc = ifp->if_softc; ifm = sc->media.ifm_cur; SFXGE_ADAPTER_LOCK(sc); if (!SFXGE_RUNNING(sc)) { rc = 0; goto out; } rc = efx_mac_fcntl_set(sc->enp, sfxge_port_wanted_fc(sc), B_TRUE); if (rc != 0) goto out; if ((rc = sfxge_phy_cap_mask(sc, ifm->ifm_media, &phy_cap_mask)) != 0) goto out; rc = efx_phy_adv_cap_set(sc->enp, phy_cap_mask); out: SFXGE_ADAPTER_UNLOCK(sc); return (rc); } int sfxge_port_ifmedia_init(struct sfxge_softc *sc) { efx_phy_media_type_t medium_type; uint32_t cap_mask, mode_cap_mask; efx_link_mode_t mode; efx_phy_cap_type_t phy_cap; int mode_ifm, best_mode_ifm = 0; int rc; /* * We need port state to initialise the ifmedia list. * It requires initialized NIC what is already done in * sfxge_create() when resources are estimated. */ if ((rc = efx_filter_init(sc->enp)) != 0) goto out1; if ((rc = efx_port_init(sc->enp)) != 0) goto out2; /* * Register ifconfig callbacks for querying and setting the * link mode and link status. */ ifmedia_init(&sc->media, IFM_IMASK, sfxge_media_change, sfxge_media_status); /* * Map firmware medium type and capabilities to ifmedia types. * ifmedia does not distinguish between forcing the link mode * and disabling auto-negotiation. 1000BASE-T and 10GBASE-T * require AN even if only one link mode is enabled, and for * 100BASE-TX it is useful even if the link mode is forced. * Therefore we never disable auto-negotiation. * * Also enable and advertise flow control by default. */ efx_phy_media_type_get(sc->enp, &medium_type); efx_phy_adv_cap_get(sc->enp, EFX_PHY_CAP_PERM, &cap_mask); for (mode = EFX_LINK_10HDX; mode < EFX_LINK_NMODES; mode++) { phy_cap = sfxge_link_mode_to_phy_cap(mode); if (phy_cap == EFX_PHY_CAP_INVALID) continue; mode_cap_mask = (1 << phy_cap); mode_ifm = sfxge_link_mode[medium_type][mode]; if ((cap_mask & mode_cap_mask) && mode_ifm) { /* No flow-control */ ifmedia_add(&sc->media, mode_ifm, 0, NULL); #ifdef SFXGE_HAVE_PAUSE_MEDIAOPTS /* Respond-only. If using AN, we implicitly * offer symmetric as well, but that doesn't * mean we *have* to generate pause frames. */ mode_ifm |= IFM_ETH_RXPAUSE; ifmedia_add(&sc->media, mode_ifm, 0, NULL); /* Symmetric */ mode_ifm |= IFM_ETH_TXPAUSE; ifmedia_add(&sc->media, mode_ifm, 0, NULL); #endif /* Link modes are numbered in order of speed, * so assume the last one available is the best. */ best_mode_ifm = mode_ifm; } } if (cap_mask & (1 << EFX_PHY_CAP_AN)) { /* Add autoselect mode. */ mode_ifm = IFM_ETHER | IFM_AUTO; ifmedia_add(&sc->media, mode_ifm, 0, NULL); best_mode_ifm = mode_ifm; } if (best_mode_ifm != 0) ifmedia_set(&sc->media, best_mode_ifm); /* Now discard port state until interface is started. */ efx_port_fini(sc->enp); out2: efx_filter_fini(sc->enp); out1: return (rc); }