Page MenuHomeFreeBSD

aq(4): RX/TX and HW-path correctness and hardening
ClosedPublic

Authored by nick_spun.io on Thu, Jun 4, 11:20 AM.
Referenced Files
Unknown Object (File)
Sun, Jun 21, 8:28 PM
Unknown Object (File)
Sun, Jun 21, 8:28 PM
Unknown Object (File)
Sun, Jun 21, 7:13 AM
Unknown Object (File)
Sun, Jun 21, 7:11 AM
Unknown Object (File)
Sat, Jun 20, 6:10 PM
Unknown Object (File)
Sat, Jun 20, 3:38 PM
Unknown Object (File)
Sat, Jun 20, 10:41 AM
Unknown Object (File)
Fri, Jun 19, 3:23 AM
Subscribers

Details

Summary

Independent correctness fixes, plus robustness against a non-responding
device, malformed descriptor writeback, and torn MMIO reads, and the move
to the FreeBSD bus_space(9) register abstraction.

Correctness:

  • aq_hw_ver_match returned true if any of major/minor/build was >= expected; compare lexicographically so e.g. 2.0.1 is correctly seen as older than 2.1.0.
  • The VLAN hardware-filter iteration used the vlan tag directly as the bitstring index; use vlan_tag + 1 so the active-VLAN bookkeeping lines up with the table.
  • aq_initmedia only registered IFM_AUTO in full-duplex/pause variants, so a bare "ifconfig aq0 media autoselect" matched no entry and returned ENXIO. Add the bare IFM_ETHER|IFM_AUTO entry, matching ix/em/igc/ixv.
  • Convert the per-ring diagnostic counters to counter(9): per-CPU, tear-free, no atomics on the increment path, fixing a data race and a 32-bit torn read against the locklessly-read sysctls. Drop three counters that were never populated (jumbo_pkts, tx_drops, tx_queue_full).

Hardening and modernization:

  • Implement aq_hw_err_from_flags (previously a stub returning 0 that left ~24 call sites as no-ops): detect a non-responding device in the register read path (all-ones, confirmed against reg 0x10) and latch a sticky AQ_HW_FLAG_ERR_UNPLUG -> ENXIO; aq_if_init clears it so a transient detection cannot permanently wedge the device.
  • Bound the RX fragment loop in aq_isc_rxd_pkt_get (EBADMSG once i reaches isc_rx_nsegments) so a malformed never-EOP stream cannot write past the fragment array.
  • Bound the TX completion count in aq_isc_txd_credits_update against the raw HW head pointer (reject head >= tx_size) so a glitched head cannot make iflib free still-in-flight TX mbufs.
  • Remove the dead hardware-RSC branch in aq_isc_rxd_available; it followed wb.next_desp, a raw device value used as an index, but RSC is never enabled so rsc_cnt is always 0. Advance sequentially like the other in-tree iflib drivers; this drops the last raw-hardware-pointer dereference in the RX path.
  • Flush MMIO after interrupt-status acks (AQ_HW_FLUSH) so the write lands before the vector is re-armed under auto-mask-clear; retry the high word in read64_ against a torn lo/hi pair; document the non-atomic IMR read-modify-write in itr_irq_map_en_{rx,tx}_set.
  • Replace the raw-pointer MMIO (the Linux readl/writel idiom) with bus_space(9): store the BAR tag and handle in struct aq_hw and route AQ_READ_REG/AQ_WRITE_REG through bus_space_read_4/write_4. No functional change on amd64.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 74049
Build 70932: arc lint + arc unit

Event Timeline

In general this is fine, i would've split it into a couple of diffs myself just to keep things cleaner. Just address the volatile/atomic access stuff to finish the conversion off and then itll be fine to land.

sys/dev/aq/aq_hw.c
54

this needs to be some atomic barrier/read operation.

sys/dev/aq/aq_hw.h
181

volatile is always a fun sketchy problem here - you're asking the compiler not to optimise out the reads/writes so they're visible from say an interrupt handler, but there's no guarantee about ordering across CPUs.

You're accessing it via atomics anyway, which is doing the barrier/modification or modification/barrier, except in the place where you're checking. I think you should find out what the right barrier/read atomic operation is to use for aq_hw_err_from_flags() and then migrate this to an actual atomic type.

This revision is now accepted and ready to land.Fri, Jun 19, 2:16 AM
This revision now requires review to proceed.Sat, Jun 20, 4:58 AM
This revision is now accepted and ready to land.Sat, Jun 20, 4:10 PM