Recent D3301 change improved thigs with lladdr propagation. However, there are number of outstanding issues that have to be fixed:
- lagg does not propagate lladdr change on vlans on top of it.
- lagg does not update "primary" interface lladdr when setting mac address manually.
- lladdr_event notifications are not always sent (for example, when adding slave port).
- Things not addressed in this review
- livelock on "./checklagg.sh cxl0 cxl1" due to (devd) "ifconfig lagg0 inet6 disabled" and "ifconfig lagg0 create laggport cxl0" cmds
- lagg llq is parsed in LIFO order (not
The ultimate goal is to make lladdr_event be reliably triggered. After that, arp-sending code from if_setlladdr() can be moved arp llhandler and long-standing issue with updating ifaddr lle (kern/145300, kern/152235) can also be addressed.
Actual changelist:
- Rework lagg_port_lladdr / lagg_port_setlladdr logic: now lagg_port_lladdr() consumers are responsible for specifying how lladdr update should be executed. This is done base on update _type_. There are only two of them: LAGG_LLQTYPE_PHYS (represents physical port, update can be skipped if current lladr is the same, BUT if we're executing from taskqueue we have to do both if_setlladdr() and lladdr notification). LAGG_LLQTYPE_VIRT (represents virtual e.g. lagg interface itself. Update is queue anyway, BUT we avoid calling if_setlladdr() (would trigger down/up) and just perform notification).
- Add (hopefully useful) comments on various lladdr queueing calls.
- Simplify "primary" port case in lagg_port_destroy().
- Add LAGG_UNLOCK_ASSERT() to note lagg_proto_detach() returns unlocked.
- Added __noinline to some control function to ease dtrace instrumentation.
See long, boring test plan with explanation/testing various scenarios with checklagg.sh script, slightly modified kernel and dtrace script (sources for all are attached).
Most debug was performed with vlans as lagg ports, however I performed similar testing with ixgbe/cxl drivers with similar results.
Very short version: