Page MenuHomeFreeBSD

Yet another if_lagg lladdr/eventhandler fix.
ClosedPublic

Authored by melifaro on Oct 26 2015, 10:37 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Jan 6, 7:58 AM
Unknown Object (File)
Thu, Dec 12, 3:23 AM
Unknown Object (File)
Dec 7 2024, 7:59 PM
Unknown Object (File)
Dec 5 2024, 5:00 PM
Unknown Object (File)
Nov 17 2024, 4:22 PM
Unknown Object (File)
Nov 5 2024, 11:40 AM
Unknown Object (File)
Oct 29 2024, 1:28 PM
Unknown Object (File)
Oct 22 2024, 12:07 PM
Subscribers

Details

Summary

Recent D3301 change improved thigs with lladdr propagation. However, there are number of outstanding issues that have to be fixed:

  1. lagg does not propagate lladdr change on vlans on top of it.
  2. lagg does not update "primary" interface lladdr when setting mac address manually.
  3. lladdr_event notifications are not always sent (for example, when adding slave port).
  4. Things not addressed in this review
  5. livelock on "./checklagg.sh cxl0 cxl1" due to (devd) "ifconfig lagg0 inet6 disabled" and "ifconfig lagg0 create laggport cxl0" cmds
  6. lagg llq is parsed in LIFO order (not

The ultimate goal is to make lladdr_event be reliably triggered. After that, arp-sending code from if_setlladdr() can be moved arp llhandler and long-standing issue with updating ifaddr lle (kern/145300, kern/152235) can also be addressed.

Actual changelist:

  • Rework lagg_port_lladdr / lagg_port_setlladdr logic: now lagg_port_lladdr() consumers are responsible for specifying how lladdr update should be executed. This is done base on update _type_. There are only two of them: LAGG_LLQTYPE_PHYS (represents physical port, update can be skipped if current lladr is the same, BUT if we're executing from taskqueue we have to do both if_setlladdr() and lladdr notification). LAGG_LLQTYPE_VIRT (represents virtual e.g. lagg interface itself. Update is queue anyway, BUT we avoid calling if_setlladdr() (would trigger down/up) and just perform notification).
  • Add (hopefully useful) comments on various lladdr queueing calls.
  • Simplify "primary" port case in lagg_port_destroy().
  • Add LAGG_UNLOCK_ASSERT() to note lagg_proto_detach() returns unlocked.
  • Added __noinline to some control function to ease dtrace instrumentation.

See long, boring test plan with explanation/testing various scenarios with checklagg.sh script, slightly modified kernel and dtrace script (sources for all are attached).
Most debug was performed with vlans as lagg ports, however I performed similar testing with ixgbe/cxl drivers with similar results.

Very short version:

HEAD (r289939)

1[21:34:00] Working on:
2 ix0 90:e2:ba:0f:a0:04
3 ix1 90:e2:ba:0f:a0:05
4[21:34:00] Creating lagg lagg0 on top of ix0
5[21:34:00] Testing inherited mac.. OK
6
7[21:34:05] Creating vlan devices (vlan900, vlan901) on lagg0.. DONE
8[21:34:05] Checking if interfaces has been created with proper macs..
9 vlan900 OK
10 vlan901 OK
11[21:34:10] Adding (slave) port ix1 to lagg0.. DONE
12[21:34:10] Checking if slave port mac has changed.. OK
13[21:34:15] Removing primary port from lagg (ix1 is new primary).. DONE
14Sleeping to give physical interface chance to reinit..
15[21:34:17] Checking if all interfaces has updated their macs..
16 ix0 OK
17 ix1 OK
18 lagg0 OK
19 vlan900 FAIL
20 vlan901 FAIL
21[21:34:22] Adding (slave) port ix0 back.. DONE
22[21:34:22] Checking if slave port mac has changed.. OK
23[21:34:27] Trying to change mac manually to 52:54:00:00:00:00.. DONE
24Sleeping to give physical interface chance to reinit..
25[21:34:29] Checking if all interfaces has updated their macs..
26 ix0 OK
27 ix1 FAIL
28 lagg0 OK
29 vlan900 OK
30 vlan901 OK
31[21:34:34] Cleaning up.. DONE
32[21:34:35] Checking original macs..
33 ix0 OK
34 ix1 OK

PATCHED

1[21:29:45] Working on:
2 ix0 90:e2:ba:0f:a0:04
3 ix1 90:e2:ba:0f:a0:05
4[21:29:45] Creating lagg lagg0 on top of ix0
5[21:29:45] Testing inherited mac.. OK
6[21:29:50] Creating vlan devices (vlan900, vlan901) on lagg0.. DONE
7[21:29:50] Checking if interfaces has been created with proper macs..
8 vlan900 OK
9 vlan901 OK
10[21:29:55] Adding (slave) port ix1 to lagg0.. DONE
11[21:29:55] Checking if slave port mac has changed.. OK
12[21:30:00] Removing primary port from lagg (ix1 is new primary).. DONE
13Sleeping to give physical interface chance to reinit..
14[21:30:02] Checking if all interfaces has updated their macs..
15 ix0 OK
16 ix1 OK
17 lagg0 OK
18 vlan900 OK
19 vlan901 OK
20[21:30:07] Adding (slave) port ix0 back.. DONE
21[21:30:07] Checking if slave port mac has changed.. OK
22[21:30:12] Trying to change mac manually to 52:54:00:00:00:00.. DONE
23Sleeping to give physical interface chance to reinit..
24[21:30:14] Checking if all interfaces has updated their macs..
25 ix0 OK
26 ix1 OK
27 lagg0 OK
28 vlan900 OK
29 vlan901 OK
30[21:30:19] Cleaning up.. DONE
31[21:30:20] Checking original macs..
32 ix0 OK
33 ix1 OK

Pre-D3301 (r288654)

117:57 [0] m@pktgen1 s ./checklagg.sh ix0 ix1
2Working on:
3 ix0 90:e2:ba:0f:a0:04
4 ix1 90:e2:ba:0f:a0:05
5Creating lagg lagg0 on top of ix0
6Testing inherited mac.. OK
7Creating vlan devices (vlan900, vlan901) on lagg0.. DONE
8Checking if interfaces has been created with proper macs..
9 vlan900 OK
10 vlan901 OK
11Adding (slave) port ix1 to lagg0.. DONE
12Checking if slave port mac has changed.. OK
13Removing primary port from lagg (ix1 is new primary).. DONE
14Checking if all interfaces has updated their macs..
15 ix0 OK
16 ix1 FAIL
17 lagg0 OK
18 vlan900 FAIL
19 vlan901 FAIL
20Adding (slave) port ix0 back.. DONE
21Checking if slave port mac has changed.. OK
22Trying to change mac manually to 52:54:00:00:00:00.. DONE
23Checking if all interfaces has updated their macs..
24 ix0 OK
25 ix1 FAIL
26 lagg0 OK
27 vlan900 OK
28 vlan901 OK
29Cleaning up.. DONE
30Checking original macs..
31 ix0 FAIL (was 90:e2:ba:0f:a0:04) now 52:54:00:00:00:00. Restoring.. DONE
32 ix1 FAIL (was 90:e2:ba:0f:a0:05) now 90:e2:ba:0f:a0:04. Restoring.. DONE

Test Plan

Scenario:
virtual machine has 2 interfaces (vlan11, vlan12) on top of vtnet0.
We're building lagg on top of there 2 interfaces (see checklagg.sh for exact details).
Kernel is hacked to

  • allow vlans on top of lagg (one-liner)
  • help dtrace probes by explicitly uninlining several functions
  • dummy arp_iflladdr() lladdr_event callback added to ease dtrace handling

Files:
http://static.ipfw.ru/patches/d4004_checklagg.sh
http://static.ipfw.ru/patches/d4004_arp.d
http://static.ipfw.ru/patches/d4004_kernel.diff

checklagg.sh is running along with arp.d script in other console. Their output is merged and some comments are provided.

TESTCASE 1, HEAD (r289939)

1[22:02:02] Working on:
2 vlan11 52:54:11:11:11:11
3 vlan12 52:54:12:12:12:12
4[22:02:02] Creating lagg lagg0 on top of vlan11
5[22:02:02] Testing inherited mac.. OK
6
7CPU ID FUNCTION:NAME
8## Queing notification request for lagg mac change
9 0 60254 lagg_lladdr:entry Setting 'lagg0' mac to 52:54:11:11:11:11
10 0 60270 lagg_port_lladdr:entry Setting 'lagg0' mac to 52:54:11:11:11:11
11 1 60274 lagg_port_setlladdr:entry Taskqueue handler called for 'lagg0'
12
13## No forced update (hence no down/up), just eventhandler
14 1 17375 arp_iflladdr:entry ARP handler called for 'lagg0' [mac 52:54:11:11:11:11]
15## Ignored set request (same lladdr for physical port)
16 0 60270 lagg_port_lladdr:entry Setting 'vlan11' mac to 52:54:11:11:11:11
17
18[22:02:07] Creating vlan devices (vlan900, vlan901) on lagg0.. DONE
19[22:02:07] Checking if interfaces has been created with proper macs..
20 vlan900 OK
21 vlan901 OK
22[22:02:12] Adding (slave) port vlan12 to lagg0.. DONE
23[22:02:12] Checking if slave port mac has changed.. OK
24
25## Enqueueing slave port lladdr change
26 0 60270 lagg_port_lladdr:entry Setting 'vlan12' mac to 52:54:11:11:11:11
27 1 60274 lagg_port_setlladdr:entry Taskqueue handler called for 'lagg0'
28
29## Set mac + send notification
30 1 33396 if_setlladdr:entry Setting 'vlan12' mac to 52:54:11:11:11:11
31 1 17375 arp_iflladdr:entry ARP handler called for 'vlan12' [mac 52:54:11:11:11:11]
32
33[22:02:17] Removing primary port from lagg (vlan12 is new primary).. DONE
34[22:02:17] Checking if all interfaces has updated their macs..
35 vlan11 OK
36 vlan12 OK
37 lagg0 OK
38 vlan900 OK
39 vlan901 OK
40
41## Enqueue lladdr change for removed interface (ignored since lladdr was not changed)
42 0 60270 lagg_port_lladdr:entry Setting 'vlan11' mac to 52:54:11:11:11:11
43## Choose new mac and change lagg lladdr/enqueue change
44 0 60254 lagg_lladdr:entry Setting 'lagg0' mac to 52:54:12:12:12:12
45 0 60270 lagg_port_lladdr:entry Setting 'lagg0' mac to 52:54:12:12:12:12
46 0 60270 lagg_port_lladdr:entry Setting 'vlan12' mac to 52:54:12:12:12:12
47 1 60274 lagg_port_setlladdr:entry Taskqueue handler called for 'lagg0'
48
49## Handling queue backward
50## Set mac + send notification for vlan12
51 1 33396 if_setlladdr:entry Setting 'vlan12' mac to 52:54:12:12:12:12
52 1 17375 arp_iflladdr:entry ARP handler called for 'vlan12' [mac 52:54:12:12:12:12]
53## Executing event handlers for lagg0 (not setting lladdr )
54 1 33396 if_setlladdr:entry Setting 'vlan901' mac to 52:54:12:12:12:12
55 1 33396 if_setlladdr:entry Setting 'vlan900' mac to 52:54:12:12:12:12
56 1 17375 arp_iflladdr:entry ARP handler called for 'lagg0' [mac 52:54:12:12:12:12]
57
58[22:02:22] Adding (slave) port vlan11 back.. DONE
59[22:02:22] Checking if slave port mac has changed.. OK
60
61## Enqueue lladdr change for newly-attached child interface
62 1 60270 lagg_port_lladdr:entry Setting 'vlan11' mac to 52:54:12:12:12:12
63 0 60274 lagg_port_setlladdr:entry Taskqueue handler called for 'lagg0'
64
65## Set mac + send notification for vlan11
66 0 33396 if_setlladdr:entry Setting 'vlan11' mac to 52:54:12:12:12:12
67 0 17375 arp_iflladdr:entry ARP handler called for 'vlan11' [mac 52:54:12:12:12:12]
68
69[22:02:27] Trying to change mac manually to 52:54:00:00:00:00.. DONE
70[22:02:27] Checking if all interfaces has updated their macs..
71 vlan11 OK
72 vlan12 OK
73 lagg0 OK
74 vlan900 OK
75 vlan901 OK
76
77## Start from setlladdr -> lagg_stop() -> lagg_init()
78 2 33396 if_setlladdr:entry Setting 'lagg0' mac to 52:54:00:00:00:00
79## lagg_init() enqueues change for each port
80 2 60270 lagg_port_lladdr:entry Setting 'vlan11' mac to 52:54:00:00:00:00
81 2 60270 lagg_port_lladdr:entry Setting 'vlan12' mac to 52:54:00:00:00:00
82## taskqueue handler on cpu0
83 0 60274 lagg_port_setlladdr:entry Taskqueue handler called for 'lagg0'
84
85## eventhnadler notification from if_setlladdr() on cpu2
86 2 33396 if_setlladdr:entry Setting 'vlan901' mac to 52:54:00:00:00:00
87## Set mac + send notification for vlan12
88 0 33396 if_setlladdr:entry Setting 'vlan12' mac to 52:54:00:00:00:00
89 0 17375 arp_iflladdr:entry ARP handler called for 'vlan12' [mac 52:54:00:00:00:00]
90## Set mac + send notification for vlan11
91 0 33396 if_setlladdr:entry Setting 'vlan11' mac to 52:54:00:00:00:00
92 0 17375 arp_iflladdr:entry ARP handler called for 'vlan11' [mac 52:54:00:00:00:00]
93## eventhnadler notification from if_setlladdr() on cpu0
94 2 33396 if_setlladdr:entry Setting 'vlan900' mac to 52:54:00:00:00:00
95 2 17375 arp_iflladdr:entry ARP handler called for 'lagg0' [mac 52:54:00:00:00:00]
96
97[22:02:32] Cleaning up.. DONE
98[22:02:32] Checking original macs..
99 vlan11 OK
100 vlan12 OK
101
102 1 60218 lagg_clone_destroy:entry time to shutdown for 'lagg0'
103
104## Enqueing delete for vlan11 (slave port)
105 1 60270 lagg_port_lladdr:entry Setting 'vlan11' mac to 52:54:11:11:11:11
106## Takqueue triggered, waiting for lock?
107 0 60274 lagg_port_setlladdr:entry Taskqueue handler called for 'lagg0'
108
109## Enqueing delete for vlan12 (the only remaining master)
110 1 60270 lagg_port_lladdr:entry Setting 'vlan12' mac to 52:54:12:12:12:12
111## lagg_port_destroy(): no ports, set empty lladdr
112 1 60254 lagg_lladdr:entry Setting 'lagg0' mac to 00:00:00:00:00:00
113 1 60270 lagg_port_lladdr:entry Setting 'lagg0' mac to 00:00:00:00:00:00
114## iterating taskq queue backwards
115## Send notification for lagg0
116 0 17375 arp_iflladdr:entry ARP handler called for 'lagg0' [mac 00:00:00:00:00:00]
117## Set mac + send notification for vlan12
118 0 33396 if_setlladdr:entry Setting 'vlan12' mac to 52:54:12:12:12:12
119 0 17375 arp_iflladdr:entry ARP handler called for 'vlan12' [mac 52:54:12:12:12:12]
120## Set mac + send notification for vlan11
121 0 33396 if_setlladdr:entry Setting 'vlan11' mac to 52:54:11:11:11:11
122 0 17375 arp_iflladdr:entry ARP handler called for 'vlan11' [mac 52:54:11:11:11:11]
123## tasqueue_drain() for empty queue?
124 0 60274 lagg_port_setlladdr:entry Taskqueue handler called for 'lagg0'

TESTCASE 2, PATCHED VERSION

1[22:02:02] Working on:
2 vlan11 52:54:11:11:11:11
3 vlan12 52:54:12:12:12:12
4[22:02:02] Creating lagg lagg0 on top of vlan11
5[22:02:02] Testing inherited mac.. OK
6
7CPU ID FUNCTION:NAME
8## Queing notification request for lagg mac change
9 0 60254 lagg_lladdr:entry Setting 'lagg0' mac to 52:54:11:11:11:11
10 0 60270 lagg_port_lladdr:entry Setting 'lagg0' mac to 52:54:11:11:11:11
11 1 60274 lagg_port_setlladdr:entry Taskqueue handler called for 'lagg0'
12
13## No forced update (hence no down/up), just eventhandler
14 1 17375 arp_iflladdr:entry ARP handler called for 'lagg0' [mac 52:54:11:11:11:11]
15## Ignored set request (same lladdr for physical port)
16 0 60270 lagg_port_lladdr:entry Setting 'vlan11' mac to 52:54:11:11:11:11
17
18[22:02:07] Creating vlan devices (vlan900, vlan901) on lagg0.. DONE
19[22:02:07] Checking if interfaces has been created with proper macs..
20 vlan900 OK
21 vlan901 OK
22[22:02:12] Adding (slave) port vlan12 to lagg0.. DONE
23[22:02:12] Checking if slave port mac has changed.. OK
24
25## Enqueueing slave port lladdr change
26 0 60270 lagg_port_lladdr:entry Setting 'vlan12' mac to 52:54:11:11:11:11
27 1 60274 lagg_port_setlladdr:entry Taskqueue handler called for 'lagg0'
28
29## Set mac + send notification
30 1 33396 if_setlladdr:entry Setting 'vlan12' mac to 52:54:11:11:11:11
31 1 17375 arp_iflladdr:entry ARP handler called for 'vlan12' [mac 52:54:11:11:11:11]
32
33[22:02:17] Removing primary port from lagg (vlan12 is new primary).. DONE
34[22:02:17] Checking if all interfaces has updated their macs..
35 vlan11 OK
36 vlan12 OK
37 lagg0 OK
38 vlan900 OK
39 vlan901 OK
40
41## Enqueue lladdr change for removed interface (ignored since lladdr was not changed)
42 0 60270 lagg_port_lladdr:entry Setting 'vlan11' mac to 52:54:11:11:11:11
43## Choose new mac and change lagg lladdr/enqueue change
44 0 60254 lagg_lladdr:entry Setting 'lagg0' mac to 52:54:12:12:12:12
45 0 60270 lagg_port_lladdr:entry Setting 'lagg0' mac to 52:54:12:12:12:12
46 0 60270 lagg_port_lladdr:entry Setting 'vlan12' mac to 52:54:12:12:12:12
47 1 60274 lagg_port_setlladdr:entry Taskqueue handler called for 'lagg0'
48
49## Handling queue backward
50## Set mac + send notification for vlan12
51 1 33396 if_setlladdr:entry Setting 'vlan12' mac to 52:54:12:12:12:12
52 1 17375 arp_iflladdr:entry ARP handler called for 'vlan12' [mac 52:54:12:12:12:12]
53## Executing event handlers for lagg0 (not setting lladdr )
54 1 33396 if_setlladdr:entry Setting 'vlan901' mac to 52:54:12:12:12:12
55 1 33396 if_setlladdr:entry Setting 'vlan900' mac to 52:54:12:12:12:12
56 1 17375 arp_iflladdr:entry ARP handler called for 'lagg0' [mac 52:54:12:12:12:12]
57
58[22:02:22] Adding (slave) port vlan11 back.. DONE
59[22:02:22] Checking if slave port mac has changed.. OK
60
61## Enqueue lladdr change for newly-attached child interface
62 1 60270 lagg_port_lladdr:entry Setting 'vlan11' mac to 52:54:12:12:12:12
63 0 60274 lagg_port_setlladdr:entry Taskqueue handler called for 'lagg0'
64
65## Set mac + send notification for vlan11
66 0 33396 if_setlladdr:entry Setting 'vlan11' mac to 52:54:12:12:12:12
67 0 17375 arp_iflladdr:entry ARP handler called for 'vlan11' [mac 52:54:12:12:12:12]
68
69[22:02:27] Trying to change mac manually to 52:54:00:00:00:00.. DONE
70[22:02:27] Checking if all interfaces has updated their macs..
71 vlan11 OK
72 vlan12 OK
73 lagg0 OK
74 vlan900 OK
75 vlan901 OK
76
77## Start from setlladdr -> lagg_stop() -> lagg_init()
78 2 33396 if_setlladdr:entry Setting 'lagg0' mac to 52:54:00:00:00:00
79## lagg_init() enqueues change for each port
80 2 60270 lagg_port_lladdr:entry Setting 'vlan11' mac to 52:54:00:00:00:00
81 2 60270 lagg_port_lladdr:entry Setting 'vlan12' mac to 52:54:00:00:00:00
82## taskqueue handler on cpu0
83 0 60274 lagg_port_setlladdr:entry Taskqueue handler called for 'lagg0'
84
85## eventhnadler notification from if_setlladdr() on cpu2
86 2 33396 if_setlladdr:entry Setting 'vlan901' mac to 52:54:00:00:00:00
87## Set mac + send notification for vlan12
88 0 33396 if_setlladdr:entry Setting 'vlan12' mac to 52:54:00:00:00:00
89 0 17375 arp_iflladdr:entry ARP handler called for 'vlan12' [mac 52:54:00:00:00:00]
90## Set mac + send notification for vlan11
91 0 33396 if_setlladdr:entry Setting 'vlan11' mac to 52:54:00:00:00:00
92 0 17375 arp_iflladdr:entry ARP handler called for 'vlan11' [mac 52:54:00:00:00:00]
93## eventhnadler notification from if_setlladdr() on cpu0
94 2 33396 if_setlladdr:entry Setting 'vlan900' mac to 52:54:00:00:00:00
95 2 17375 arp_iflladdr:entry ARP handler called for 'lagg0' [mac 52:54:00:00:00:00]
96
97[22:02:32] Cleaning up.. DONE
98[22:02:32] Checking original macs..
99 vlan11 OK
100 vlan12 OK
101
102 1 60218 lagg_clone_destroy:entry time to shutdown for 'lagg0'
103
104## Enqueing delete for vlan11 (slave port)
105 1 60270 lagg_port_lladdr:entry Setting 'vlan11' mac to 52:54:11:11:11:11
106## Takqueue triggered, waiting for lock?
107 0 60274 lagg_port_setlladdr:entry Taskqueue handler called for 'lagg0'
108
109## Enqueing delete for vlan12 (the only remaining master)
110 1 60270 lagg_port_lladdr:entry Setting 'vlan12' mac to 52:54:12:12:12:12
111## lagg_port_destroy(): no ports, set empty lladdr
112 1 60254 lagg_lladdr:entry Setting 'lagg0' mac to 00:00:00:00:00:00
113 1 60270 lagg_port_lladdr:entry Setting 'lagg0' mac to 00:00:00:00:00:00
114## iterating taskq queue backwards
115## Send notification for lagg0
116 0 17375 arp_iflladdr:entry ARP handler called for 'lagg0' [mac 00:00:00:00:00:00]
117## Set mac + send notification for vlan12
118 0 33396 if_setlladdr:entry Setting 'vlan12' mac to 52:54:12:12:12:12
119 0 17375 arp_iflladdr:entry ARP handler called for 'vlan12' [mac 52:54:12:12:12:12]
120## Set mac + send notification for vlan11
121 0 33396 if_setlladdr:entry Setting 'vlan11' mac to 52:54:11:11:11:11
122 0 17375 arp_iflladdr:entry ARP handler called for 'vlan11' [mac 52:54:11:11:11:11]
123## tasqueue_drain() for empty queue?
124 0 60274 lagg_port_setlladdr:entry Taskqueue handler called for 'lagg0'

files:
F270951
F270953
F270955

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

melifaro retitled this revision from to Yet another if_lagg lladdr/eventhandler fix..
melifaro updated this object.
melifaro edited the test plan for this revision. (Show Details)
melifaro added reviewers: network, ae, araujo, glebius, hiren, hrs.
melifaro edited the test plan for this revision. (Show Details)
hrs edited edge metadata.

Looks good to me. Thank you for your work!

This revision is now accepted and ready to land.Oct 27 2015, 5:55 AM
melifaro edited edge metadata.
melifaro edited the test plan for this revision. (Show Details)

Do pre-commit sync and update some comments.

This revision now requires review to proceed.Nov 1 2015, 7:50 PM
This revision was automatically updated to reflect the committed changes.