Page MenuHomeFreeBSD

net/if_dl.h: make self-contained
ClosedPublic

Authored by ivy on Tue, Apr 29, 6:05 AM.
Tags
None
Referenced Files
F116798032: D50065.diff
Sat, May 10, 9:16 AM
F116791643: D50065.id154439.diff
Sat, May 10, 7:24 AM
Unknown Object (File)
Thu, May 8, 2:44 PM
Unknown Object (File)
Thu, May 8, 1:56 PM
Unknown Object (File)
Thu, May 8, 12:44 PM
Unknown Object (File)
Tue, May 6, 12:05 PM
Unknown Object (File)
Tue, May 6, 2:41 AM
Unknown Object (File)
Mon, May 5, 10:59 AM

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 63922
Build 60806: arc lint + arc unit

Event Timeline

ivy requested review of this revision.Tue, Apr 29, 6:05 AM
sys/net/if_dl.h
35

will sys/_types.h work?

sys/net/if_dl.h
35

i think not because we need caddr_t.

I'd rather try to get rid of caddr_t in this part of network stack. Not much left to be done.

  • use <sys/_types.h> instead of <sys/types.h>
  • replace caddr_t with char *

i am currently waiting on buildworld+buildkernel to test this, but the change
should be semantically identical to the old version.

ivy marked an inline comment as done.Wed, Apr 30, 3:15 AM

as this needs mentor approval here's the new commit message just to make sure i expressed this properly:

net/if_dl.h: make self-contained

include <sys/_types.h> for u_char and u_short.

replace caddr_t with char *.  this avoids having to include the full
<sys/types.h>, and we are trying to get rid of the legacy caddr_t type.
in this case, replace it with char * rather than void * to avoid
breaking existing users and because the intent is to represent an array
of bytes.
sys/net/if_dl.h
70

I think one could argue that (&(s)->sdl_data[(s)->sdl_nlen]) is marginally cleaner, but I don't think I insist.

sys/net/if_dl.h
70

you are completely right and i should not write computer words before having coffee -- we can obviously just remove the cast entirely.

however, can i suggest ((s)->sdl_data + (s)->sdl_nlen)? i think this is slightly more clear at first glance (mainly because indexing an array with a length looks odd) and maybe idiomatically more common. i don't really mind either way though.

sys/net/if_dl.h
70

I tend to have the opposite reaction (indexing feels more natural for arrays in general), but I don't really consider myself a stakeholder here and defer to others' preference (or lack thereof)

make LLADDR more beautiful

From what I know, the caddr_t was a special type added in the very early C, that did not have a void pointer. The whole goal was to guard against any address arithmetic. I'm not sure it guards today, but the purpose was to create a type that points just at memory of unknown type. Disclaimer: that's a rumor, not a written fact. Anyway, ideally all caddr_t's need to be converted to void * not into a pointer that can be incremented/decremented. May I ask you to test your patch with converting to void *? Sorry for adding extra burden on you, but you voluntary entered this burdensome domain of historical BSD includes :)

May I ask you to test your patch with converting to void *? Sorry for adding extra burden on you, but you voluntary entered this burdensome domain of historical BSD includes :)

i will test this but semantically i think array of bytes is the correct return type of LLADDR: it's not a pointer to an arbitrary object or memory region, it's a network address, which is by definition a byte array.

or in other words, i think caddr_t may have been wrong here to begin with.

sys/net/if_dl.h
70

as i have no real preference i have gone with your approach

In D50065#1142309, @ivy wrote:

i will test this

../../../../../../../../../ifdl-headers/lib/libc/net/linkaddr.c:109:21: error: 'char *' and 'void *' are not pointers to compatible types
  109 |         sdl->sdl_alen = cp - LLADDR(sdl);                               
      |                         ~~ ^ ~~~~~~~~~~~

okay, this is in the implementation of link_addr() itself so maybe it doesn't count, but i think the assumption here is that LLADDR is a char type.

i wonder if it makes sense to change it to uint8_t, which is semantically more accurate than char and would also break fewer consumers.

thinking about this even more (sorry for the flood of updates): i don't think we can change the public API here since this has been around since at least 4.4BSD (probably earlier) and there is likely some software out there that still depends on it. so i think we shouldn't change the type of LLADDR to either void * or uint8_t *, even if one of them is arguably more correct. removing the caddr_t cast leaves the type as char *, which is what it already was.

(basically, i don't want to be the person who broke an API which has been a de-facto standard for the last 35 years.)

Understand your feelings, been there :) Some of those "de-facto standards" are ugly and if carefully investigated sometimes appear to be possible to break fix carefully. Sometimes you need to take care of a small number of ports, sometimes even nothing. I've done this numerous times. This kind of work is rewarding in a long term. Looking at the history N years later you are glad that you had gone through the process in the past. The breakage is just a short term discomfort, but carrying ugly legacy into the future is never ending. An example: BSD raw sockets were providing headers in host byte order and with adjusted length. Every new programmer who writes a program for Linux and wants it to be portable to FreeBSD would need to know this "feature" and workaround it. About 10 years ago I took a risk to fix that. That was a painful process that lasted couple weeks, but today the pain is completely forgotten and I believe the non-existence of this feature made a number of people happier.

So, that's why I suggested to start with an exp-run to estimate how much software is affected by link_addr() and related changes. Knowing this estimate you can make more weighted decisions on what can be changed easily and what not so easily.

i definitely appreciate your point and i don't mind putting the work into this to fix it (apparently, i have adopted link_addr()...), i'm just not entirely sure which (if either) change is correct in the first place.

let's say we were starting with a new, modern API in C (or C++) for representing link addresses. which of these seems more correct?

a:

char *sdl_get_linkaddr(struct sockaddr_dl const *);

b:

uint8_t *sdl_get_linkaddr(struct sockaddr_dl const *);

c:

void *sdl_get_linkaddr(struct sockaddr_dl const *);

a) is what we have now, but i don't think this can be right, because the octets of network addresses are not signed, while char is often signed.

c) is okay, but what is the user meant to do with this? the answer is almost certainly that they'll immediately cast it to uint8_t * to access the octets.

b) is better because we're giving the user what they actually want: the octets of the network address.

so... a proposal: i am willing to merge all my link_addr/if_dl.h changes into a single branch (including option (b) here, unless you disagree with my reasoning), ask for an exp run over that, and if it passes, commit the result all at once.

does this seem like a reasonable approach?

Either of the 3 variants is going to be better than caddr_t :) I agree that unsigned type is better than signed. Judging between uint8_t * and void *, I would argue that void * is more correct, as link level addresses are usually fixed length prefixes. But different link level protocols would have different length. AFAIK, neither software, nor hardware treats them as separate octets. The division into octets was invented for human representation only.

However, switching to void * may require a ton of changes, as some consumers may actually treat the result as array of char exactly for the purposes of printf(3). See 8e1af80243ff97f56918b363196832af9406597e, where I gave up in the kernel. :) btw, I left char * there. Feel free to improve that into uint8_t *, if it is possible.

Anyway, good luck with that and thanks for working on it.

in the interest of fixing the original problem, which was to make if_dl.h self-contained, would you be okay with committing the version that returns a char *? i will think about ways to improve this but i'm worried we're missing an easy improvement for want of a much more complicated fix that might end up being impossible.

No objection, since that is already an improvement!

Would be nice to keep CLLADDR around for compatibility. Had to patch net-mgmt/ipgen as it uses this macro; would have been nice if I didn't have to.

put CLLADDR back

as discussed on irc, removing CLLADDR broke at least one port that uses it.
put it back using a (const char *) cast so we don't need c_caddr_t; this
matches the previous API and is still const-correct as we're only adding const.

i'll update the commit message before landing this.

I'd vote against putting it back. I submitted patch to the port upstream.

This revision is now accepted and ready to land.Mon, May 5, 10:16 PM

I'd vote against putting it back. I submitted patch to the port upstream.

Was that the only port identified that depended on it?

after discussion with des and kevans, don't put CLLADDR back in, the consensus
seems to be users should change their code.

This revision now requires review to proceed.Mon, May 5, 10:43 PM
This revision is now accepted and ready to land.Mon, May 5, 10:49 PM

I'd vote against putting it back. I submitted patch to the port upstream.

Was that the only port identified that depended on it?

net-mgmt/ipgen. There may be others, but this is the one I had to patch.

This revision was automatically updated to reflect the committed changes.