Recent D3301 change improved thigs with lladdr propagation. However, there are number of outstanding issues that have to be fixed:
- lagg does not propagate lladdr change on vlans on top of it.
- lagg does not update "primary" interface lladdr when setting mac address manually.
- lladdr_event notifications are not always sent (for example, when adding slave port).
- Things not addressed in this review
- livelock on "./checklagg.sh cxl0 cxl1" due to (devd) "ifconfig lagg0 inet6 disabled" and "ifconfig lagg0 create laggport cxl0" cmds
- lagg llq is parsed in LIFO order (not
The ultimate goal is to make lladdr_event be reliably triggered. After that, arp-sending code from if_setlladdr() can be moved arp llhandler and long-standing issue with updating ifaddr lle (kern/145300, kern/152235) can also be addressed.
Actual changelist:
- Rework lagg_port_lladdr / lagg_port_setlladdr logic: now lagg_port_lladdr() consumers are responsible for specifying how lladdr update should be executed. This is done base on update _type_. There are only two of them: LAGG_LLQTYPE_PHYS (represents physical port, update can be skipped if current lladr is the same, BUT if we're executing from taskqueue we have to do both if_setlladdr() and lladdr notification). LAGG_LLQTYPE_VIRT (represents virtual e.g. lagg interface itself. Update is queue anyway, BUT we avoid calling if_setlladdr() (would trigger down/up) and just perform notification).
- Add (hopefully useful) comments on various lladdr queueing calls.
- Simplify "primary" port case in lagg_port_destroy().
- Add LAGG_UNLOCK_ASSERT() to note lagg_proto_detach() returns unlocked.
- Added __noinline to some control function to ease dtrace instrumentation.
See long, boring test plan with explanation/testing various scenarios with checklagg.sh script, slightly modified kernel and dtrace script (sources for all are attached).
Most debug was performed with vlans as lagg ports, however I performed similar testing with ixgbe/cxl drivers with similar results.
Very short version:
My gut reaction to this is that we probably shouldn't expose the port at all if it's not explicitly set, just leave it nil in this table to allow for more idiomatic usage where you can simply test the port as truthy before using it.f