Details
Diff Detail
- Repository
- rG FreeBSD src repository
- Lint
Lint Skipped - Unit
Tests Skipped - Build Status
Buildable 63794 Build 60678: arc lint + arc unit
Event Timeline
sys/net/if_dl.h | ||
---|---|---|
35 | will sys/_types.h work? |
sys/net/if_dl.h | ||
---|---|---|
35 | i think not because we need caddr_t. |
I'd rather try to get rid of caddr_t in this part of network stack. Not much left to be done.
- use <sys/_types.h> instead of <sys/types.h>
- replace caddr_t with char *
i am currently waiting on buildworld+buildkernel to test this, but the change
should be semantically identical to the old version.
as this needs mentor approval here's the new commit message just to make sure i expressed this properly:
net/if_dl.h: make self-contained include <sys/_types.h> for u_char and u_short. replace caddr_t with char *. this avoids having to include the full <sys/types.h>, and we are trying to get rid of the legacy caddr_t type. in this case, replace it with char * rather than void * to avoid breaking existing users and because the intent is to represent an array of bytes.
sys/net/if_dl.h | ||
---|---|---|
70 | I think one could argue that (&(s)->sdl_data[(s)->sdl_nlen]) is marginally cleaner, but I don't think I insist. |
sys/net/if_dl.h | ||
---|---|---|
70 | you are completely right and i should not write computer words before having coffee -- we can obviously just remove the cast entirely. however, can i suggest ((s)->sdl_data + (s)->sdl_nlen)? i think this is slightly more clear at first glance (mainly because indexing an array with a length looks odd) and maybe idiomatically more common. i don't really mind either way though. |
sys/net/if_dl.h | ||
---|---|---|
70 | I tend to have the opposite reaction (indexing feels more natural for arrays in general), but I don't really consider myself a stakeholder here and defer to others' preference (or lack thereof) |
From what I know, the caddr_t was a special type added in the very early C, that did not have a void pointer. The whole goal was to guard against any address arithmetic. I'm not sure it guards today, but the purpose was to create a type that points just at memory of unknown type. Disclaimer: that's a rumor, not a written fact. Anyway, ideally all caddr_t's need to be converted to void * not into a pointer that can be incremented/decremented. May I ask you to test your patch with converting to void *? Sorry for adding extra burden on you, but you voluntary entered this burdensome domain of historical BSD includes :)
i will test this but semantically i think array of bytes is the correct return type of LLADDR: it's not a pointer to an arbitrary object or memory region, it's a network address, which is by definition a byte array.
or in other words, i think caddr_t may have been wrong here to begin with.
sys/net/if_dl.h | ||
---|---|---|
70 | as i have no real preference i have gone with your approach |
../../../../../../../../../ifdl-headers/lib/libc/net/linkaddr.c:109:21: error: 'char *' and 'void *' are not pointers to compatible types 109 | sdl->sdl_alen = cp - LLADDR(sdl); | ~~ ^ ~~~~~~~~~~~
okay, this is in the implementation of link_addr() itself so maybe it doesn't count, but i think the assumption here is that LLADDR is a char type.
i wonder if it makes sense to change it to uint8_t, which is semantically more accurate than char and would also break fewer consumers.
thinking about this even more (sorry for the flood of updates): i don't think we can change the public API here since this has been around since at least 4.4BSD (probably earlier) and there is likely some software out there that still depends on it. so i think we shouldn't change the type of LLADDR to either void * or uint8_t *, even if one of them is arguably more correct. removing the caddr_t cast leaves the type as char *, which is what it already was.
(basically, i don't want to be the person who broke an API which has been a de-facto standard for the last 35 years.)
Understand your feelings, been there :) Some of those "de-facto standards" are ugly and if carefully investigated sometimes appear to be possible to break fix carefully. Sometimes you need to take care of a small number of ports, sometimes even nothing. I've done this numerous times. This kind of work is rewarding in a long term. Looking at the history N years later you are glad that you had gone through the process in the past. The breakage is just a short term discomfort, but carrying ugly legacy into the future is never ending. An example: BSD raw sockets were providing headers in host byte order and with adjusted length. Every new programmer who writes a program for Linux and wants it to be portable to FreeBSD would need to know this "feature" and workaround it. About 10 years ago I took a risk to fix that. That was a painful process that lasted couple weeks, but today the pain is completely forgotten and I believe the non-existence of this feature made a number of people happier.
So, that's why I suggested to start with an exp-run to estimate how much software is affected by link_addr() and related changes. Knowing this estimate you can make more weighted decisions on what can be changed easily and what not so easily.
i definitely appreciate your point and i don't mind putting the work into this to fix it (apparently, i have adopted link_addr()...), i'm just not entirely sure which (if either) change is correct in the first place.
let's say we were starting with a new, modern API in C (or C++) for representing link addresses. which of these seems more correct?
a:
char *sdl_get_linkaddr(struct sockaddr_dl const *);
b:
uint8_t *sdl_get_linkaddr(struct sockaddr_dl const *);
c:
void *sdl_get_linkaddr(struct sockaddr_dl const *);
a) is what we have now, but i don't think this can be right, because the octets of network addresses are not signed, while char is often signed.
c) is okay, but what is the user meant to do with this? the answer is almost certainly that they'll immediately cast it to uint8_t * to access the octets.
b) is better because we're giving the user what they actually want: the octets of the network address.
so... a proposal: i am willing to merge all my link_addr/if_dl.h changes into a single branch (including option (b) here, unless you disagree with my reasoning), ask for an exp run over that, and if it passes, commit the result all at once.
does this seem like a reasonable approach?
Either of the 3 variants is going to be better than caddr_t :) I agree that unsigned type is better than signed. Judging between uint8_t * and void *, I would argue that void * is more correct, as link level addresses are usually fixed length prefixes. But different link level protocols would have different length. AFAIK, neither software, nor hardware treats them as separate octets. The division into octets was invented for human representation only.
However, switching to void * may require a ton of changes, as some consumers may actually treat the result as array of char exactly for the purposes of printf(3). See 8e1af80243ff97f56918b363196832af9406597e, where I gave up in the kernel. :) btw, I left char * there. Feel free to improve that into uint8_t *, if it is possible.
Anyway, good luck with that and thanks for working on it.
in the interest of fixing the original problem, which was to make if_dl.h self-contained, would you be okay with committing the version that returns a char *? i will think about ways to improve this but i'm worried we're missing an easy improvement for want of a much more complicated fix that might end up being impossible.
Would be nice to keep CLLADDR around for compatibility. Had to patch net-mgmt/ipgen as it uses this macro; would have been nice if I didn't have to.
put CLLADDR back
as discussed on irc, removing CLLADDR broke at least one port that uses it.
put it back using a (const char *) cast so we don't need c_caddr_t; this
matches the previous API and is still const-correct as we're only adding const.
i'll update the commit message before landing this.
after discussion with des and kevans, don't put CLLADDR back in, the consensus
seems to be users should change their code.