cxgbe(4): Add support for NIC suspend/resume and live reset.


cxgbe(4): Add support for NIC suspend/resume and live reset.

Add suspend/resume callbacks to the driver and a live reset built around
them. This commit covers the basic NIC and future commits will expand
this functionality to other stateful parts of the chip. Suspend and
resume operate on the chip (the t?nex nexus device) and affect all its
ports. It is not possible to suspend/resume or reset individual ports.
All these operations can be performed on a running NIC. A reset will
look like a link bounce to the networking stack.

Here are some ways to exercise this functionality:

/* Manual suspend and resume. */

  1. devctl suspend t6nex0
  2. devctl resume t6nex0

    /* Manual reset. */
  3. devctl reset t6nex0

    /* Manual reset with driver sysctl. */
  4. sysctl dev.t6nex.0.reset=1

    /* Automatic adapter reset on any fatal error. */
  5. hw.cxgbe.reset_on_fatal_err=1

Suspend disables the adapter (DMA, interrupts, and the port PHYs) and
marks the hardware as unavailable to the driver. All ifnets associated
with the adapter are still visible to the kernel but operations that
require hardware interaction will fail with ENXIO. All ifnets report
link-down while the adapter is suspended.

Resume will reattach to the card, reconfigure it as before, and recreate
the queues servicing the existing ifnets. The ifnets are able to send
and receive traffic as soon as the link comes back up.

Reset is roughly the same as a suspend and a resume with at least one of
these events in between: D0->D3Hot->D0, FLR, PCIe link retrain.

MFC after: 1 month
Relnotes: yes
Sponsored by: Chelsio Communications


npAuthored on Apr 28 2021, 4:33 AM
R10:f33f2365eeef: geom_uzip(4): fix a typo