Page MenuHomeFreeBSD

carp: support VRRPv3
Needs ReviewPublic

Authored by kp on Sat, Apr 13, 8:51 AM.
Tags
None
Referenced Files
F82603144: D44774.id137101.diff
Tue, Apr 30, 7:56 PM
F82600696: D44774.id137101.diff
Tue, Apr 30, 7:36 PM
F82593355: D44774.id137487.diff
Tue, Apr 30, 6:04 PM
F82547008: D44774.diff
Tue, Apr 30, 3:16 AM
Unknown Object (File)
Mon, Apr 29, 7:28 AM
Unknown Object (File)
Fri, Apr 26, 5:07 AM
Unknown Object (File)
Wed, Apr 17, 3:48 PM
Unknown Object (File)
Tue, Apr 16, 12:06 AM

Details

Reviewers
bz
glebius
Group Reviewers
network
pfsense
Core Team
Summary

Allow carp(4) to use the VRRPv3 protocol (RFC 5798). We can distinguish carp and
VRRP based on the protocol version number (carp is 2, VRRPv3 is 3), and support
both from the carp(4) code.

Co-authored by: glebius
Sponsored by: Rubicon Communications, LLC ("Netgate")

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 57465
Build 54353: arc lint + arc unit

Event Timeline

kp requested review of this revision.Sat, Apr 13, 8:51 AM
bz requested changes to this revision.Sun, Apr 14, 11:34 PM
bz added a reviewer: Core Team.
bz added a subscriber: bz.

I will simply express that this will not only open a can of worms by mixing both but the original reasons not to include VRRPv2/3 and hence the "existence" of CARP is also ignored.
Should people think this can go in, I would highly advise to make it 100% compile time option and there's quite a few things to polish (including quoting from copyrighted material without reference).

This revision now requires changes to proceed.Sun, Apr 14, 11:34 PM
In D44774#1020899, @bz wrote:

I will simply express that this will not only open a can of worms by mixing both but the original reasons not to include VRRPv2/3 and hence the "existence" of CARP is also ignored.

I'm assuming you're referring to the supposed patent issues?

There are multiple other open source VRRP implementations, e.g. https://github.com/FDio/vpp/tree/master/src/plugins/vrrp and https://www.keepalived.org
Also, the relevant patents have expired by now (by which I mean, in 2017 and 2012, so 7 and 12 years ago): https://en.wikipedia.org/wiki/Virtual_Router_Redundancy_Protocol#cite_note-6

emaste added inline comments.
sbin/ifconfig/carp.c
99

IMO it's worth at least an explicit assert that it's CARP_VERSION_VRRPv3 here (as documentation), if not going as far as an else if and an "unknown version" else case.

sbin/ifconfig/carp.c
99

Or use switch statement. For enums the compiler will make sure that all possible values are covered.

sys/netinet/ip_carp.c
110–129

Can we unionize vrrp-specific stuff with carp-specific?

178

Where is this structure used? Are we breaking any ABI making it larger? Why not create a new structure for vrrp and do not touch this one?

557–574

IMHO, switch() would look nicer here.

591–592

These two reassignments belong to the block that did m_pullup(). We don't need them at common path.

593–623

IMHO, switch() looks nicer.

630

All my comments to IPv4 input also apply to IPv6 version.

781

Maybe make it const struct mbuf *m and do same to carp_source_is_self()? Ideally we would pass struct ifnet * and IP header here, but that would be too much of change.

966

Can we ifa_free(ifa) right here?

1552–1558

Up to your taste, but I would go with C99 initializer here. That allows reader of the code to be 100% sure that everything that is not set is set to 0 and no stack bytes leak anywhere.

1684

Please use char * in a new code.

P.S. I'm surprised compiler didn't say anything about using caddr_t instead of c_caddr_t.

2808

Just as with the ioctl structure, I'd suggest to keep this one untouched and created a new one for new version.

sys/netinet/ip_carp.h
126–127

I wonder why do we have both declarations of protocol header outside of ip_carp.c? Why are they in ip_carp.h? Anything else may want to see them?

181–184

IMHO, a typical style of the "_t" postfix for a type name is that the named type is a typedef that hides the fact it is a structure, a union or a enum. And later in code you can avoid using word "enum" all around. Also, IMHO, better to avoid plural in the name, cause a single variable of this type is a version, not versions.

This whole comment is one big IMHO! Feel free to stick to your version.

kp marked 11 inline comments as done.Tue, Apr 16, 2:41 PM
kp added inline comments.
sys/netinet/ip_carp.c
110–129

We probably could, but given that they're different data types (int vs. uint8_t/uint16_t) and mean different things I don't think it'd improve anything.

It's not as if a handful of extra bytes per carp address is of any concern.

178

It's used only in the kernel. It's used to abstract the old ioctl configuration api and the new netlink one.
Both the carp_ioctl() and carp_nl_set() functions then call into carp_ioctl_set(), which takes a struct carpkreq. It needs to list the new fields to support the new functionality.

2808

That seems a bit silly. The nl_carp_parsed struct is only used during netlink attribute parsing, and the whole point of netlink is that we can extend the protocol without breaking user/kernel space interfaces.

sys/netinet/ip_carp.h
126–127

I followed the existing struct carp_header here.
It'd arguably be better to move both into ip_carp.c, but I'd prefer to do that outside of this commit.

sys/netinet/ip_carp.c
110–129

It is more about code readability rather than memory savings. A new reader comes and immediately sees that certain fields are shared between protocols and certain are mutually exclusive.

178

Got it. Thanks!

2808

Maybe I mistake, but won't the preceding nlmsghdr now have a bigger nlmsg_len? If so, won't the old ABI request fail?

sys/netinet/ip_carp.h
126–127

Agreed.

sys/netinet/ip_carp.c
110–129

So something like this?

diff --git a/sys/netinet/ip_carp.c b/sys/netinet/ip_carp.c
index 61b5a65d0bf3..714f2f5ed23f 100644
--- a/sys/netinet/ip_carp.c
+++ b/sys/netinet/ip_carp.c
@@ -108,13 +108,19 @@ struct carp_softc {
        struct mtx              sc_mtx;

        int                     sc_vhid;
-       int                     sc_advskew;
-       int                     sc_advbase;
+       union {
+               struct carp_softc_carp {
+                       int                     sc_advskew;
+                       int                     sc_advbase;
+               } carp;
+               struct carp_softc_vrrp {
+                       uint8_t                 sc_vrrp_prio;
+                       uint16_t                sc_vrrp_adv_inter;
+                       uint16_t                sc_vrrp_master_inter;
+               } vrrp;
+       } sc_settings;
        struct in_addr          sc_carpaddr;
        struct in6_addr         sc_carpaddr6;
-       uint8_t                 sc_vrrp_prio;
-       uint16_t                sc_vrrp_adv_inter;
-       uint16_t                sc_vrrp_master_inter;

        int                     sc_naddrs;
        int                     sc_naddrs6;
@@ -327,10 +333,10 @@ SYSCTL_VNET_PCPUSTAT(_net_inet_carp, OID_AUTO, stats, struct carpstats,
        TAILQ_FOREACH((sc), &(ifp)->if_carp->cif_vrs, sc_list)

 #define        DEMOTE_ADVSKEW(sc)                                      \
-    (((sc)->sc_advskew + V_carp_demotion > CARP_MAXSKEW) ?     \
+    (((sc)->sc_settings.carp.sc_advskew + V_carp_demotion > CARP_MAXSKEW) ?    \
     CARP_MAXSKEW :                                             \
-        (((sc)->sc_advskew + V_carp_demotion < 0) ?            \
-        0 : ((sc)->sc_advskew + V_carp_demotion)))
+        (((sc)->sc_settings.carp.sc_advskew + V_carp_demotion < 0) ?           \
+        0 : ((sc)->sc_settings.carp.sc_advskew + V_carp_demotion)))

 static void    carp_input_c(struct mbuf *, struct carp_header *, sa_family_t, int);
 static void    vrrp_input_c(struct mbuf *, int, sa_family_t, int, int, uint16_t);
@@ -885,7 +891,7 @@ carp_input_c(struct mbuf *m, struct carp_header *ch, sa_family_t af, int ttl)
        sc->sc_init_counter = 0;
        sc->sc_counter = tmp_counter;

-       sc_tv.tv_sec = sc->sc_advbase;
+       sc_tv.tv_sec = sc->sc_settings.carp.sc_advbase;
        sc_tv.tv_usec = DEMOTE_ADVSKEW(sc) * 1000000 / 256;
        ch_tv.tv_sec = ch->carp_advbase;
        ch_tv.tv_usec = ch->carp_advskew * 1000000 / 256;
...

I'm not sure that's better (and I seem to have broken things making a mechanical substitution too).

2808

The messages will indeed be slightly larger. Message size is already variable because things like strings only use strlen(string) + 1, so setting up carp on "if1" has a different message length from setting it up on "interface1" already.

That's a long winded way of saying that the message length is not part of the ABI.

sys/netinet/ip_carp.c
110–129

Nope, just this:

@@ -108,13 +108,19 @@ struct carp_softc {
        struct mtx              sc_mtx;

        int                     sc_vhid;
-       int                     sc_advskew;
-       int                     sc_advbase;
+       union {
+               struct {  /* CARP specific context */
+                       int                     sc_advskew;
+                       int                     sc_advbase;
+               };
+               struct {  /* VRRPv4 specific context */
+                       uint8_t                 sc_vrrp_prio;
+                       uint16_t                sc_vrrp_adv_inter;
+                       uint16_t                sc_vrrp_master_inter;
+               };
+       };
        struct in_addr          sc_carpaddr;
        struct in6_addr         sc_carpaddr6;
-       uint8_t                 sc_vrrp_prio;
-       uint16_t                sc_vrrp_adv_inter;
-       uint16_t                sc_vrrp_master_inter;

        int                     sc_naddrs;
        int                     sc_naddrs6;

And no changes to any code or macros!

2808

Got it, thanks!

sys/netinet/ip_carp.c
110–129

I've tried that, and while I can make it work it requires a few additional changes to cope with the conflicts this introduces.

That is, the valid values for sc_advbase and sc_vrrp_adv_inter are different, so changing version between carp and vrrp becomes more complex. This seems to introduce several possible ways for things to break, without adding much more than a hint that the sc_advskew value won't be used in VRRPv3 mode.

However, it occurs to me that we don't actually need a union to hint that these are protocol specific contexts, we could just do this too:

diff --git a/sys/netinet/ip_carp.c b/sys/netinet/ip_carp.c
index 61b5a65d0bf3..5e91c85ae885 100644
--- a/sys/netinet/ip_carp.c
+++ b/sys/netinet/ip_carp.c
@@ -108,13 +108,17 @@ struct carp_softc {
        struct mtx              sc_mtx;

        int                     sc_vhid;
-       int                     sc_advskew;
-       int                     sc_advbase;
-       struct in_addr          sc_carpaddr;
-       struct in6_addr         sc_carpaddr6;
-       uint8_t                 sc_vrrp_prio;
-       uint16_t                sc_vrrp_adv_inter;
-       uint16_t                sc_vrrp_master_inter;
+       struct { /* CARP specific context */
+               int             sc_advskew;
+               int             sc_advbase;
+               struct in_addr  sc_carpaddr;
+               struct in6_addr sc_carpaddr6;
+       };
+       struct { /* VRRPv3 specific context */
+               uint8_t         sc_vrrp_prio;
+               uint16_t        sc_vrrp_adv_inter;
+               uint16_t        sc_vrrp_master_inter;
+       };

        int                     sc_naddrs;
        int                     sc_naddrs6;

Put carp/vrrp3 specific variables in their own structs.

In D44774#1020944, @kp wrote:
In D44774#1020899, @bz wrote:

I will simply express that this will not only open a can of worms by mixing both but the original reasons not to include VRRPv2/3 and hence the "existence" of CARP is also ignored.

I'm assuming you're referring to the supposed patent issues?

There are multiple other open source VRRP implementations, e.g. https://github.com/FDio/vpp/tree/master/src/plugins/vrrp and https://www.keepalived.org
Also, the relevant patents have expired by now (by which I mean, in 2017 and 2012, so 7 and 12 years ago): https://en.wikipedia.org/wiki/Virtual_Router_Redundancy_Protocol#cite_note-6

Absent follow-up I'm going to consider this objection to be settled.

In D44774#1023611, @kp wrote:

Put carp/vrrp3 specific variables in their own structs.

This definitely is better, but I still can't figure out what prevents to unionize it? Do we have a code that switches certain address operation between CARP and VRRP?

In D44774#1023611, @kp wrote:

Put carp/vrrp3 specific variables in their own structs.

This definitely is better, but I still can't figure out what prevents to unionize it? Do we have a code that switches certain address operation between CARP and VRRP?

Putting those variables in a union means that setting sc_vrrp_prio will clobber sc_advskew (and vice versa). These have different constraints and different meanings. It just makes things harder than they need to be when switching between carp and vrrp.

In D44774#1024457, @kp wrote:

Putting those variables in a union means that setting sc_vrrp_prio will clobber sc_advskew (and vice versa). These have different constraints and different meanings. It just makes things harder than they need to be when switching between carp and vrrp.

The switching code should just use a local variable to convert between prio and adnskew. IMHO better make the rarely executed switching code a tiny bit more complex, but avoid carrying unrelated data in the softc. If we dereference CARP specific stuff in VRRP code or vice versa - that's a bug that needs be fixed rather than masked by non-unionizing data.

The switching code should just use a local variable to convert between prio and adnskew.

We can't convert between them. They're not different representations of the same concept, they just represent different things.

If we dereference CARP specific stuff in VRRP code or vice versa - that's a bug that needs be fixed rather than masked by non-unionizing data.

But that's what putting them in a union does. It, for lack of a better word, infects carp with vrrp settings and vice versa.

In D44774#1024640, @kp wrote:

The switching code should just use a local variable to convert between prio and adnskew.

We can't convert between them. They're not different representations of the same concept, they just represent different things.

If we dereference CARP specific stuff in VRRP code or vice versa - that's a bug that needs be fixed rather than masked by non-unionizing data.

But that's what putting them in a union does. It, for lack of a better word, infects carp with vrrp settings and vice versa.

If it is possible to set VRRP values while interface is in CARP mode and vice versa, then it is already a bug and de-unionizing just hides it.

Do you have the branch with this project shared anywhere? I'd like to give it a try.

If it is possible to set VRRP values while interface is in CARP mode and vice versa, then it is already a bug and de-unionizing just hides it.

It's possible to switch version without simultaneously changing the vrrp priority or advertisement interval, which means these values are whatever the carp skew/interval values were.
That is, ifconfig foo carpver 3 is insufficient, users would always have to pass the priority and advertisement interval as well. Or we'd have to have additional code to handle this case and force the default values to be set instead. We'd be adding code to handle issues that simply don't exist without the union.

Do you have the branch with this project shared anywhere? I'd like to give it a try.

https://github.com/kprovost/freebsd-src/tree/netgate/vrrp

I see. There is quite a lot of CARP specific code to be executed when we are VRRP mode and little bit code vice versa. I'm working pull requestes on github for you.

This revision now requires changes to proceed.Mon, Apr 29, 10:26 PM
sys/netinet/ip_carp.c
619

Isn't this the most easy way to panic system remotely?

kp marked an inline comment as done.Tue, Apr 30, 7:10 AM
kp added inline comments.
sys/netinet/ip_carp.c
619

No, because this is the second time we check the version. On line 567 we check for the first time, so asserting it here is appropriate.

sys/netinet/ip_carp.c
619

My bad, sorry.

kp marked an inline comment as done.

Add glebius' changes from https://github.com/freebsd/freebsd-src/compare/main...glebius:FreeBSD:carp-vrrp

Also change sc_addr to a uint8_t array (from sockaddr_dl).

sys/netinet/ip_carp.c
2112

This comment can now be deleted.

2113–2118

And this can go up into the sparse initializer above ^^

  • remove stale comment
  • init sc_addr in the initializer

Thanks! Please write a longer commit message, that would mention preparatory and cleanup stuff that we did and aren't part of VRRP addition:

  • Refactor packet tagging for ether_output(). Separate HMAC preparation (CARP specific) from tagging. In unicast mode (CARP specific) don't put tag at all. Don't put pointer to software context into the tag. Putting just vhid, an integer value, is a safer design.
  • Get rid of unnecessary sockaddr_dl and just use uint8_t array of ETHER_ADDR length.
  • Fix a tiny bug in ioctl handler, where we would accept new advbase, but return EINVAL due to invalid advskew.