Page MenuHomeFreeBSD

Allow changing lagg(4) MTU
ClosedPublic

Authored by freqlabs on Oct 15 2018, 10:34 PM.
Referenced Files
Unknown Object (File)
Sun, Jan 26, 12:13 PM
Unknown Object (File)
Sat, Jan 25, 7:58 PM
Unknown Object (File)
Sat, Jan 25, 6:16 PM
Unknown Object (File)
Wed, Jan 22, 10:12 AM
Unknown Object (File)
Wed, Jan 22, 10:06 AM
Unknown Object (File)
Sat, Jan 18, 10:11 PM
Unknown Object (File)
Sat, Jan 18, 9:45 PM
Unknown Object (File)
Sat, Jan 18, 5:31 PM
Subscribers

Details

Summary

Previously, changing the MTU would require destroying the lagg and creating a new one. Now it is allowed to change the MTU of the lagg interface. The MTU of the ports will be set to match. If any port cannot set the new MTU, all ports are reverted to the original MTU of the lagg.

Additionally, when adding ports, the MTU of a port will be automatically set to the MTU of the lagg. As always, the MTU of the lagg is initially determined by the MTU of the first port added. If adding an interface as a port for some reason fails, that interface is reverted to its original MTU.

Test Plan

I have tested this change over the weekend on one of my systems configured for LACP, and under light use nothing seems amiss so far. I don't see any Kyua tests for net code, so some guidance with testing will be appreciated!

I hope someone familiar with the locking can sanity-check that particular aspect.

I have tested changing MTU several times, and I have tested the following error paths:

  • Attempting to set an invalid MTU (15000) on the lagg to ensure no change is attempted
  • Attempting to set a valid MTU on the lagg (16384) that is unsupported by a port (em) to ensure all ports are reverted and the error message is correctly displayed.

I tested adding a lagg to itself as a port (which did not end well, so I added a check for that and now it ends with a harmless error message).

I tested adding and removing ports several times with different MTUs and checked that the MTU gets changed to correctly match the lagg.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

LGTM! I have spoke with Ryan to get a better scenario about this issue and here there is a reference for that: https://redmine.ixsystems.com/issues/25092

Also I have checked the lock!! I will wait somebody from @manpages to double check the documentation, then I will do the process to include it on 12-RELEASE if possible.

This revision is now accepted and ready to land.Oct 16 2018, 4:34 AM
bcr added a subscriber: bcr.

You need to bump the .Dd at the beginning of the document to the date of the commit since this is a content change.

Bumped the date in the man page.

This revision now requires review to proceed.Oct 16 2018, 7:44 AM
0mp added inline comments.
sys/net/if_lagg.c
1462 ↗(On Diff #49208)

What is this 576?

1474 ↗(On Diff #49208)

I'm not sure if that's important but this is not compliant with style(9). It should be indented with 4 spaces, right?

1481 ↗(On Diff #49208)

Shouldn't it be if (error != 0) as in line 1675?

Also, the manpage patch looks good to me.

sys/net/if_lagg.c
1474 ↗(On Diff #49208)

Whoops, I've linked style(9) incorrectly in the previous comment. Sorry about that!

sys/net/if_lagg.c
1462 ↗(On Diff #49208)

It is the minimum accepted datagram, it cannot be less than 576 octets.

1474 ↗(On Diff #49208)

That is correct!!! I can fix that when I commit it.

1481 ↗(On Diff #49208)

It has the same result!

sys/net/if_lagg.c
1462 ↗(On Diff #49208)

Oh, cool.

I just thought that we should use a constant for values like this one but I guess that it's not really necessary in this case, right?

1481 ↗(On Diff #49208)

True. It's just that style(9) says:

Do not use ! for tests unless it is a boolean, e.g., use:

if (*p == '\0')

not:

if (!*p)

but I'm pretty new here so I'm happy with whatever you find reasonable :)

The value seems to come from RFC791, quote:

[...]
All hosts must be prepared to accept datagrams of up to 576 octets (whether they arrive whole or in fragments)
[...]

but I'm not sure if this really means that the *MTU* has to 576 or larger than that. I think the MTU
*can* be smaller, as the ethernet layer does accept smaller packets just fine.

There's IP_MSS with value 576, in netinet/ip.h -- maybe this is related ?

For additional context, here is the code from if_bridge.c this is based on:

case SIOCSIFMTU:
        if (ifr->ifr_mtu < 576) {
                error = EINVAL;
                break;
        }
        if (LIST_EMPTY(&sc->sc_iflist)) {
                sc->sc_ifp->if_mtu = ifr->ifr_mtu;
                break;
        }
        BRIDGE_LOCK(sc);
        LIST_FOREACH(bif, &sc->sc_iflist, bif_next) {
                if (bif->bif_ifp->if_mtu != ifr->ifr_mtu) {
                        log(LOG_NOTICE, "%s: invalid MTU: %u(%s)"
                            " != %d\n", sc->sc_ifp->if_xname,
                            bif->bif_ifp->if_mtu,
                            bif->bif_ifp->if_xname, ifr->ifr_mtu);
                        error = EINVAL;
                        break;
                }
        }
        if (!error)
                sc->sc_ifp->if_mtu = ifr->ifr_mtu;
        BRIDGE_UNLOCK(sc);
        break;

It was not obvious to me why the minimum MTU should be limited by IPv4 but I included it here erring on the side of caution.

sys/net/if_lagg.c
1481 ↗(On Diff #49208)

It should be if (error == 0) though. The MTU will only change if there is no error.

freqlabs edited the summary of this revision. (Show Details)
freqlabs edited the test plan for this revision. (Show Details)

Addressed feedback I received here and elsewhere on the previous patch. Please consider this alternative approach.

Key changes:

  • There doesn't seem to be a good reason to put a lower bound on the MTU as is done in bridge(4). That has been removed.
  • The optimization of checking if there are no ports to avoid taking a lock seems not worth it given how infrequently MTU changes are likely to occur. That has been removed.
  • While allowing and requiring the user to manually change port MTU brought lagg(4) in line with bridge(4), the eventual goal was to automatically handle changing the MTU of the ports. Now it does just that, avoiding needless churn:
    • Users cannot manually manipulate the MTU of ports on a lagg in LACP mode (this is the original behavior).
    • Setting the MTU on the lagg is allowed, and sets the MTU of all the ports automatically.
    • If setting the MTU on any port fails, all ports are reverted to the original MTU of the lagg.
  • Because no manual MTU manipulation is enabled, no documentation change is necessary.

I didn't see any document stating that lagg(4) MTU specifically cannot be changed, I hope I didn't miss it hiding somewhere though.

freqlabs edited the summary of this revision. (Show Details)
freqlabs edited the test plan for this revision. (Show Details)
freqlabs removed reviewers: manpages, bcr.

Addressed feedback by adding handling of new ports, so creating a lagg or adding new ports to an existing lagg sets all the port interfaces to the correct MTU automatically, too.

I also added a check to prevent a recursive lagg being created, which would cause infinite loops and other awful things if attempted.

I have no specific objections. Looks good to me. It just looks slightly odd, that MTU is handled different from other properties -- set before interface addition to LAGG, while multicast filters set after just before protocol initialization, and capabilities are set after the protocol initialization. As one of downsides this different order required addition of duplicate checks for LAGG nesting.

This revision is now accepted and ready to land.Oct 24 2018, 4:02 PM
This revision was automatically updated to reflect the committed changes.