Page MenuHomeFreeBSD

LinuxKPI/ofed/mlx: mark/hide net_device is ifnet code
AbandonedPublic

Authored by bz on Mar 21 2021, 9:50 PM.

Details

Reviewers
hselasky
kib
Summary

Historically LinuxKPI defined struct net_device to be struct ifnet.
With more networking code growing dependency on struct net_device
this shortcut is no longer feasible.
However changing the current driver/ofed code to use ifnet everywhere
is also not an immediate solution despite the code usually already
being of mixed-usage of net_device and ifnet and highly dependent on
strutc ifnet fields and semantics and just being called net_device still.
In order to keep accomodating the current codebase and allowing drivers
to slowly migrate away from it (or renaming some compat functions)
hide the current "net_device is ifnet" implementation under
_NET_DEVICE_IS_IFNET and define that in the files currently relying
on it.
This will allow us to build up more struct net_device compat code in
parallel to the working drivers/ofed implementation.

Obtained-from: bz_iwlwifi
Sponsored-by: The FreeBSD Foundation
MFC-after: 2 weeks
Reviewed-by: ...

Differential Revision:

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 38271
Build 35160: arc lint + arc unit

Event Timeline

bz requested review of this revision.Mar 21 2021, 9:50 PM
This revision is now accepted and ready to land.Mar 22 2021, 12:15 AM
sys/compat/linuxkpi/common/include/linux/inetdevice.h
37 ↗(On Diff #86116)

Could these functions return "struct ifnet *" ?
Or do you plan to use these in WLAN code ?

sys/compat/linuxkpi/common/include/linux/netdevice.h
96

"do {} while (0)" missing here I think.

I believe the long-term goal is to only have one API. These patches won't be MFC'ed until the transition is completed?

Beware the ibcore shadow the configuration of other ifnet devices, like lagg, vlan, which are not created by the network driver itself. How do you plan create struct netdevice for such devices? On the fly?

sys/compat/linuxkpi/common/include/linux/inetdevice.h
37 ↗(On Diff #86116)

I believe I have not come across them for WiFi but at least the IPv4 version seems to exist in Linux by that name from googling.

Given we seem to only use them in ofed and they are kind-of very FreeBSD specific the way they are I wonder if these could move into an ofed/infiniband header entirely in a 2nd stage?

While here I did not want to change them just yet in this go, mainly because I wanted to let you think about longer-term strategy for mlx and ofed. In general I am happy for anything under "_NET_DEVICE_IS_IFNET" to become ifnet while it is not in the public namespace. The same applies to netdevice.h.

sys/compat/linuxkpi/common/include/linux/netdevice.h
96

I just moved them down from above to outside the new `#ifdef` block.

Beware the ibcore shadow the configuration of other ifnet devices, like lagg, vlan, which are not created by the network driver itself. How do you plan create struct netdevice for such devices? On the fly?

I don't plan to. From what I have seen ofed and mlx when it comes to "struct net_device" is rally all just "struct ifnet" for us with a few wrappers. It is neither fully one or the other though the Linux-compat there seems to be mostly lost.

I have no stakes in mlx/ofed and hence did not immediately want to change anything there, which is why I chose the #ifdef _NET_DEVICE_IS_IFNET route so that they can co-exist and someone can think about how to actually solve this there.

I was pondering replacing the few places using struct net_device with struct ifnet at first and completely clear netdevice.h but I'd rather have someone who can test this do it.

sys/ofed/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c
39 ↗(On Diff #86116)

This is one example where we include netdevice.h but as you can see below actually already use struct ifnet directly. In theory this file should include FreeBSD headers directly and not a Linux header anymore.

sys/compat/linuxkpi/common/include/linux/inetdevice.h
37 ↗(On Diff #86116)

Could we slightly rename these functions and put them in sys/net/if.c ?

sys/ofed/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c
39 ↗(On Diff #86116)

Could you make an own patch for this?

sys/compat/linuxkpi/common/include/linux/netdevice.h
89

Why is it important that ifp is the first member of net_device?

93

Note that this is on edge of being invalid C. I did not looked deep enough to see if it is actually.

This kind of re-definitions violates one definition rule, generally. So e.g. LTO would break.

sys/compat/linuxkpi/common/include/linux/inetdevice.h
37 ↗(On Diff #86116)

Yes, I think that would make a lot of sense if we don't have anything yet already doing it.

sys/compat/linuxkpi/common/include/linux/netdevice.h
89

As we currently treat netdev_notifier_info to have ifnet as first field so it works with both definitions of net_device.

93

By that argument we could not have struct foo { } in two different drivers with two different sets of fields either?

sys/compat/linuxkpi/common/include/linux/netdevice.h
89

But you put ifnet *, not ifnet, as the first field.

93

If they are linked into single final blob (like kernel), then yes. So far our toolchains do not make a huge deal from it, perhaps only source-level debuggers like gdb are confused.

But LTO example may change it.

sys/compat/linuxkpi/common/include/linux/netdevice.h
89

Yes we pass an ifnet * of course; see D29365

93

I would just see this as more motivation to get the situation in our ofed/ sorted then

sys/compat/linuxkpi/common/include/linux/inetdevice.h
37 ↗(On Diff #86116)

As a matter of fact after D29428 the two functions ip[4]_dev_find() are only used by ofed/drivers/infiniband/core/ib_cma.c and by ib_addr.c and I am inclined to leave them in a header there rather than adding them to if.c or rather netinet/netinet6 (as they are IP version specific and not interface generic). The reason for that is that they seem to have a very specific KPI (e.g., passing vnet arguments rather than the caller setting them).

@hselasky If that is okay with you I'll prepare another review for that move.

bz retitled this revision from LinuxKPI/ofed/mlx/cxgbe: mark/hide net_device is ifnet code to LinuxKPI/ofed/mlx: mark/hide net_device is ifnet code.Mar 26 2021, 6:27 PM
bz removed a reviewer: np.
This revision now requires review to proceed.Mar 26 2021, 6:27 PM
bz marked 2 inline comments as done.Mar 26 2021, 6:28 PM
bz added inline comments.
sys/compat/linuxkpi/common/include/linux/inetdevice.h
37 ↗(On Diff #86116)

Extracted into D29366. I'll update this one here once that is sorted.

sys/dev/mlx5/mlx5_en/en.h
39 ↗(On Diff #86116)

Does mlx5en compile w/o this header file? I wonder if you can just remove it.

sys/dev/mlx5/mlx5_en/en.h
39 ↗(On Diff #86116)

I'll have a look at all remaining once after the experience with both cxgbe and qlnxr.
It'll happen for the next update of here once inetdevice.h is sorted.
Spring cleaning :-) And thanks a lot for helping so much!

Merge out other driver changes already comitted.
Move one function from netdevice.h to mlx4/mlx4_en/en.h as it is only
used in mlx4 code.
Move one function into ofed code as it is only used there and restructure
the former FreeBSD specific inlcudes.
Remove unused functions from netdevice.h.

bz marked an inline comment as done.Mar 30 2021, 3:49 PM

I've just uploaded an intermediate update. I'll now try to track down the remaining #include <linux/netdevice.h> and see where they are really needed. After that we should probably have a look what to do with the 7 remaining #defines in netdevice.h (I think all but net_eq() are actually used quite a bit).

bz marked an inline comment as done.Mar 30 2021, 4:05 PM
bz added inline comments.
sys/dev/mlx5/mlx5_en/en.h
39 ↗(On Diff #86116)

mlx5 does still need netdevice.h; even if not locally included it'll pull it in from ib_addr.h; it's a 15 line change to remove net netdevice.h bits from mlx5; I'll upload a separate review for those.

mlx5 is now out of the picture as well.

bz marked 2 inline comments as done.Apr 2 2021, 10:23 AM

Hi,

so there are still @kib's concerns about LTO. Any further changes about fixing this will be a lot more intrusive to both ofed and mlx4:

% grep -r net_device sys/dev/mlx4/ | wc -l
      60
% grep -r net_device sys/ofed/ | wc -l
      86

The question I have for you (@hselasky, @kib) is: would you have a look to cleanup ofed and mlx4 or do we want to go ahead with this change for a temporary time?

I will reply on Tuesday.

Any thoughts?

sys/compat/linuxkpi/common/include/linux/netdevice.h
77

Is there a chance that "net_device" can be an extension to "struct ifnet" so that you don't have to use a pointer here.

The network devices are all allocated and adding a few bytes in the end would save a lot of trouble!

Can we change this function to have a size argument?

struct ifnet *
if_alloc_domain(u_char type, int numa_domain)
{
        struct ifnet *ifp;
        u_short idx;
        void *old;

        KASSERT(numa_domain <= IF_NODOM, ("numa_domain too large"));
        if (numa_domain == IF_NODOM)
                ifp = malloc(sizeof(struct ifnet), M_IFNET,
                    M_WAITOK | M_ZERO);
        else
                ifp = malloc_domainset(sizeof(struct ifnet), M_IFNET,
                    DOMAINSET_PREF(numa_domain), M_WAITOK | M_ZERO);

Maybe call "struct netdevice" "struct netdevice_wlan" and let "netdevice" alias ifnet ?

Or have an ifdef:

#ifndef netdevice
#define netdevice ifnet
#endif

And in the WiFi drivers you add somewhere:

-Dnetdevice=netdevice_wlan

What I'm afraid of is that if you make a dual API like you've suggested, that one of them won't work or be maintained.

--HPS

Sorry I haven't been more active here. Many FreeBSD issues on my desk :-)

Maybe call "struct netdevice" "struct netdevice_wlan" and let "netdevice" alias ifnet ?

Or have an ifdef:

#ifndef netdevice
#define netdevice ifnet
#endif

And in the WiFi drivers you add somewhere:

-Dnetdevice=netdevice_wlan

What I'm afraid of is that if you make a dual API like you've suggested, that one of them won't work or be maintained.

Well my point is to not have a dual API; my hope also was that I don't have to cleanup mlx4/ofed code (beyond what I did) and only provided a temporary hack that would allow here to move forward while the other part would be sorted out. Since we have actually cut the hack down to very little.

The point remains that in mlx4/ofed every "net_device" means "ifnet". I'd be totally tempted to sed-replace these but I have no idea how we "maintain" the OFED code (e.g., if there are still generic-code updates?). And if that would be done the only thing left are the callback functions (quoting this from memory).

I think the alternative of using the real net_device KPI in mlx4/ofed was lost 10 years ago when the code was re-written.

sys/compat/linuxkpi/common/include/linux/netdevice.h
77

I'd be happy to make it the other way round and remove the '*' here; there reason it is there like this at the moment is the compatibility with the mlx/ofed variant still left. If that can be sorted we can do here what we want.

I am really hitting this now trying to sort out more remaining LinuxKPI bits and bringing them up-to and into HEAD.
I'll do one more round trip on this extracting the few inline functions used and removing the few unused bits so that as much overlap as possible is gone.