Add a boolean and an int a driver can set in their 'struct pci_driver' definiotion to raise or lower the default LinuxKPI pci return value. This is helpful in case of multiple drivers with overlapping IDs, such as iwlwifi(4) and iwm(4). The reason for the bool and the int is that 0 (BUS_PROBE_SPECIFIC) is a valid return but would also be the unintialised (bss) value. Having the bool allows us to check if we should apply the specific return value or stay with the default. Given 'struct pci_driver' is usually statically initialised we cannot set an easy global default to always use. Suggested by: jhb Sponsored by: The FreeBSD Foundation MFC after: 3 days X-MFC: struct members need to go to the end for stable/13 Differential Revision: https://reviews.freebsd.org/D33915
Details
- Reviewers
imp jhb • hselasky - Group Reviewers
linuxkpi - Commits
- rGc325d9edeff5: LinuxKPI: allow a driver to override the default pci probe result
rG3166cea63244: LinuxKPI: allow a driver to override the default pci probe result
rGb91dd79ba321: LinuxKPI: allow a driver to override the default pci probe result
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
Lint Passed - Unit
No Test Coverage - Build Status
Buildable 44504 Build 41392: arc lint + arc unit
Event Timeline
Sorry for the massive text change on the header file; the current format did not allow easier line wrapping.
Also if there are other people who should review this please add them.
share/man/man9/DEVICE_PROBE.9 | ||
---|---|---|
103–112 ↗ | (On Diff #101517) | I think I understand what you meant, but it took me several minutes. Please check that my rewording for clarity is indeed what you meant and is consistent with the comment in sys/sys/bus.h. Also, you may need to reflow my revised paragraph since I didn't check line lengths. |
This doesn't answer the question of why very well. So why can't all LinuxKPI drivers be BUS_PROBE_DEFAULT-1? It would be functionally the same in all cases, except where there are two drivers of BUS_PROBE_DEFAULT.
Also, why doesn't the framework allow changing the default value? Most other frameworks do allow that.
I personally think, though, no change is needed to these values. It would be better to allow users to specify a preference when there's a tie and to have some value that is default based on the project's notion of which should win.
Yeah, could do the -1.
Also, why doesn't the framework allow changing the default value? Most other frameworks do allow that.
Because there is one probe routine... See D33913 for the related change which then should go BUS_PROBE_DEFAULT-1.
I personally think, though, no change is needed to these values. It would be better to allow users to specify a preference when there's a tie and to have some value that is default based on the project's notion of which should win.
Another way of saying this is that the framework has limitations and currently hard-codes the return value. the way you said it, though, makes it sound like that's the way it has to be, which I was bristling at...
I added jhb for his perspective as well. I'm curious to see if he agrees with my suggestion, or has a different perspective since he's used linuxkpi a lot and also done a lot with subr_bus.c.
I personally think, though, no change is needed to these values. It would be better to allow users to specify a preference when there's a tie and to have some value that is default based on the project's notion of which should win.
Sorry, I'd rather not have it that way. I'd also rather not have users by default fiddle with tuneables or other if it can be avoided. Of course you are correct. I assume I was looking for an easy way out to move forward and being jumping in and out and sleep deprived doesn't always help.
Speaking of that, I should probably go and look at subr_bus.c then and see if we can add a very generic way, not restricted to LinuxKPI, to fiddle with the return value from probe and apply an "adjust" value? Is that what you are trying to hint at?
I added jhb for his perspective as well. I'm curious to see if he agrees with my suggestion, or has a different perspective since he's used linuxkpi a lot and also done a lot with subr_bus.c.
Thanks!
I personally think, though, no change is needed to these values. It would be better to allow users to specify a preference when there's a tie and to have some value that is default based on the project's notion of which should win.
One way to look at this is "I have N drivers returning the same thing, how do I break the tie?"
Right now, you are hard coding the winner, and there's no way to adjust that.
I was proposing *something* that could break the tie.
One way would be to have a 'hw.dev.$DRIVER.probe_boost: X' that would automatically give a boost of X (so add it to the return value, capped at -1) when a driver claims a device. We could have project default values for some drivers that we wanted to tweak. We kinda do this for mfi vs mrsas driver today where they have the same PCI IDs (there's driver specific kludges to do this). This would let users set a tunable when the wrong device wins (and it can also be set at the loader prompt). Generally, this wouldn't be needed, but would give people a no-compile list. It would also allow people to say, more generally, that some generic drivers should win over their more specific counterparts. The down side is that this is a bit of a big hammer in some cases and is only per-driver, not per-device. However, there are other per-device solutions.
The second way would be to have something like 'hw.dev.preferred: A,B,C' which would say, if there's ever a tie in the bidding, prefer the driver whose name is in this list. When there's a tie, and one of the drivers is on the list, that one wins. When there's a tie and both drivers are on the list, the one on the list first wins. This is a bit simpler, but less powerful than the former suggestion. However, it may be sufficiently powerful for our actual needs (some of the advantage of the prior one is likely theoretical). I'm agnostic on the actual, final name.
There's finally currently a way to wire a device name / unit to a driver. This overrides all return values and only tries that device. However, this can be tedious and cumbersome to apply to fix this problem because you need to know the physical location string (bus dependent) of the device, which can be hard to discover in the boot loader. It's more useful in keeping unit numbers the same as cards fail, but is sufficiently general to solve this.
I added jhb for his perspective as well. I'm curious to see if he agrees with my suggestion, or has a different perspective since he's used linuxkpi a lot and also done a lot with subr_bus.c.
Thanks!
I personally think, though, no change is needed to these values. It would be better to allow users to specify a preference when there's a tie and to have some value that is default based on the project's notion of which should win.
I'll take another look at the manual page change when the discussion reaches consensus on the rest.
I think instead you want a way to tag LinuxKPI drivers with the desired probe priority somehow. I can't even find the relevant code in LinuxKPI unless it is badly obscured:
> grep -r BUS_PROBE sys/compat/linuxkpi sys/compat/linuxkpi/common/src/linux_i2cbb.c: return (BUS_PROBE_NOWILDCARD); sys/compat/linuxkpi/common/src/linux_i2c.c: return (BUS_PROBE_NOWILDCARD);
Sigh, in fact, LinuxKPI appears to be incredibly broken here:
static int usb_linux_probe(device_t dev) { struct usb_attach_arg *uaa = device_get_ivars(dev); struct usb_driver *udrv; int err = ENXIO; if (uaa->usb_mode != USB_MODE_HOST) { return (ENXIO); } mtx_lock(&Giant); LIST_FOREACH(udrv, &usb_linux_driver_list, linux_driver_list) { if (usb_linux_lookup_id(udrv->id_table, uaa)) { err = 0; break; } } mtx_unlock(&Giant); return (err); }
There's no valid reason _at all_ for any driver written in this millennium to use a return value of 0 from probe.
linux_pci_probe is a similarly utterly broken:
static int linux_pci_probe(device_t dev) { const struct pci_device_id *id; struct pci_driver *pdrv; if ((pdrv = linux_pci_find(dev, &id)) == NULL) return (ENXIO); if (device_get_driver(dev) != &pdrv->bsddriver) return (ENXIO); device_set_desc(dev, pdrv->name); return (0); }
For PCI at least it seems like it would not be impossible to add a FreeBSD-specific member to struct pci_driver that has to be set in the tree to the BUS_PROBE_* value to return from linux_pci_probe().
Sorry, I completely missed that reply two days ago :-( Would have saved me 10 minutes just now... I came back to follow-up only to find you had already written what I was going to say :-)))))
...
There's no valid reason _at all_ for any driver written in this millennium to use a return value of 0 from probe.
linux_pci_probe is a similarly utterly broken:
static int linux_pci_probe(device_t dev) { const struct pci_device_id *id; struct pci_driver *pdrv; if ((pdrv = linux_pci_find(dev, &id)) == NULL) return (ENXIO); if (device_get_driver(dev) != &pdrv->bsddriver) return (ENXIO); device_set_desc(dev, pdrv->name); return (0); }
Yes, I have that sitting in https://reviews.freebsd.org/D33913 ; feel free to approve there. Should go in independent of any of this.
For PCI at least it seems like it would not be impossible to add a FreeBSD-specific member to struct pci_driver that has to be set in the tree to the BUS_PROBE_* value to return from linux_pci_probe().
I had drafted up a resource_probe_adjust() but I really don't like it. Also just being able to do a -1/0/+1 seemed wrong.
So I looked through and came to the same conclusion (modulo MFC path which is tricky that one) as that suggestion of yours. If the value in pci_driver is 0 then return the BUS_PROBE_DEFAULT from D33913 otherwise do a sanity check and fall back to DEFAULT if needed. I'll draft that up next (as I'd love to be able to have auto-loading from D26651 in for 13.1-R but still prefer iwm(4) by default for overlapping IDs).
The one thing I haven't checked yet, is if we can also add a driver-specific (but possibly genericly implemented -- otherwise per-driver) TUNABLE to overwrite/set this value, so that one could still change order without recompiling. Now that might get me back to the resource_probe_adjust() thing but .. I still don't like that. So maybe I'll leave the TUNABLE to iwlwifi as it is currently the only driver with overlapping IDs and let it be that.
This indeed seems to be the best solution (or the one I like best after all of this). Thanks a lot you two for the valuable input! Very much appreciated!
Update to a totally different solution, will update patch descriptionn, etc in a minute
sys/compat/linuxkpi/common/include/linux/pci.h | ||
---|---|---|
215 | Why not init bsd_bus_probe_return BUS_PROBE_DEFAULT and skip the boolean? |
sys/compat/linuxkpi/common/include/linux/pci.h | ||
---|---|---|
215 | Where do you want to init that? I don't want to touch all the drivers which statically initialise the struct and then pass it to pci_register_driver(). If it gets there and is 0 I still don't know if that is a valid 0 or a bss 0. Am I missing a way without linker scripts to pre-initialise this field in all cases? |
sys/compat/linuxkpi/common/include/linux/pci.h | ||
---|---|---|
215 | The struct passed to pci_register_driver() is not const, so pci_register_driver could set bsd_bus_probe_return to the default value. Then you can make a new function, pci_register_driver_prio(), which takes a second argument, which gets stored into the pci_driver structure? |
sys/compat/linuxkpi/common/include/linux/pci.h | ||
---|---|---|
215 | Which will only work if you actually change the function call and not do (without the bool) the below which is how things get initialised at the moment. As otherwise you'll overwrite the already set value with a default value again. I keep wondering if it would be simpler to simply check for != 0, ignore the SPECIFIC case and move on as this below is likely the only case we'll hit in the next while (or at least so I'd hope)... --- sys/contrib/dev/iwlwifi/pcie/drv.c +++ sys/contrib/dev/iwlwifi/pcie/drv.c @@ -1660,6 +1660,11 @@ static struct pci_driver iwl_pci_driver = { .probe = iwl_pci_probe, .remove = iwl_pci_remove, .driver.pm = IWL_PM_OPS, +#if defined(__FreeBSD__) + /* Allow iwm(4) to attach for conflicting IDs for now. */ + .bsd_has_bus_probe_return = true, + .bsd_bus_probe_return = (BUS_PROBE_DEFAULT - 1), +#endif }; int __must_check iwl_pci_register_driver(void) |
Can you call it ".probe_priority = -1", then return (BUS_PROBE_DEFAULT + pdrv->probe_priority) ?
Yeah I could go back to probe_adjust_priority and do that and do maths again... will update in a minute.
I would still use the word priority "bsd_adjust_probe_return" -> "bsd_adjust_probe_priority" , but you can decide. You're native English :-)
Related: https://reviews.freebsd.org/D34357
This will let the user decide which device wins when multiple have the same priority.
It will not, however, make any of them default, which this commit will do.
As for this patch, It looks fine (I'd like to hear from @jhb too though).
If you want a name, bsd_probe_adjustment would be better I think than bsd_adjust_probe_return, but I'm not hard set on either of them.
Honestly, rather than an adjustment, I would just say that no real driver should ever set it to 0 and have it be a valid BUS_PROBE_ value and return BUS_PROBE_DEFAULT if it is zero. I don't think there is a "valid 0" case that you described (or we shouldn't permit it) and just assume 0 means "bss 0". Relative to existing drivers it is a lot cleaner to change iwlwifi to use BUS_PROBE_VENDOR or some such rather than having to set it it to BUS_PROBE_VENDOR - BUS_PROBE_DEFAULT or the like.
Okay, I'll wait for @jhb
If you want a name, bsd_probe_adjustment would be better I think than bsd_adjust_probe_return, but I'm not hard set on either of them.
I'll make that name change before committing. It's also shorter :-)
Which would be amazing to have in a future time when I do no longer want to penalize iwlwifi over iwm :-) Though in that case a devctl blocklist would also do it (if both were modules, which they are not).
It will not, however, make any of them default, which this commit will do.
Yupp. Thanks for your review time and comments!
Okay, so back to what it was in the previous version without the bool and with 0 is not an option for this (probably as a comment somewhere).. https://reviews.freebsd.org/D33915?vs=102957&id=102959#toc ?
Go back to returning a set BUS_PROBE_* value but no boolean
treating 0 as BSS and leave a comment that we never return BUS_PROBE_SPECIFC.
Did you mean like this, @jhb?
In case anyone has any more comments, please let me know. If I hear nothing back I'll get this in likely Tuesday morning UTC.
sys/compat/linuxkpi/common/src/linux_pci.c | ||
---|---|---|
364 |
sys/compat/linuxkpi/common/src/linux_pci.c | ||
---|---|---|
364 | ACK will add the empty line. And thanks for looking at this again! |
So after MFCing this drm-kmod binary packages on FreeBSD stable/13 and 13.1-RC1 broke; as per MFC rules I appended the field to the end of the struct. https://cgit.freebsd.org/src/commit/?h=releng/13.1&id=c325d9edeff568d3d38891b2916bd5bd0e9bf8e3 However probe will not run if there is something there which is > 0 and we'll treat it as a probe error.
Adjusting the change in linux_pci.c to >= 0 will do the right thing for now; not sure if for the sake of safety in stable/13 and releng/13.1 I also want to make it:
if (pdrv->bsd_probe_return >= 0 || pdrv->bsd_probe_return < BUS_PROBE_GENERIC) { return (BUS_PROBE_DEFAULT); } else { return (pdrv->bsd_probe_return); }
As a side note: that the original extra bool in Diff2 would have likely not altered the struct in the middle with two bools next to each other (the int still appended at the end) but caught this issue...
So, right now we end the structure with a bool. This takes up 4 or so bytes. We could make it a bit field... but we might not need to...
isdrm looks to be unused... Or could be unused. No driver references it today directly.
We do test it in a few places, and we do set it for register_drm_driver, but we could steal the storage and do a strcmp(name, "drmn") in the couple of places where we use it. They are in 'startup' routines and a few extra strcmps would be lost in the noise.
Barring that, since it's purely internal, we could make it a bitfield with one bit for isdrm and the rest for the return value, though that's starting to get close to UB behavior (it's implementation defined if it is signed or unsigned).
I don't think we can add it to the end of the struct, since the struct lives in driver itself.
We should add a static_assert for the size of this struct as well...
Or, I suppose, we could just eliminate the field in 13.1 entirely and always return BUS_PROBE_DEFAULT - 1 (or do so for all !isdrm drivers). None of the other linuxkpi drivers compete for bidding a device with multiple drivers (at least that I'm aware of) so this would have no practical effect on them. That would be a much smaller change, would be easier to test and may be less disruptive.
diff --git a/sys/compat/linuxkpi/common/include/linux/pci.h b/sys/compat/linuxkpi/common/include/linux/pci.h index eb5f46a4879c..8159cabb6408 100644 --- a/sys/compat/linuxkpi/common/include/linux/pci.h +++ b/sys/compat/linuxkpi/common/include/linux/pci.h @@ -235,7 +235,6 @@ struct pci_driver { void (*bsd_iov_uninit)(device_t dev); int (*bsd_iov_add_vf)(device_t dev, uint16_t vfnum, const nvlist_t *vf_config); - int bsd_probe_return; }; struct pci_bus { diff --git a/sys/compat/linuxkpi/common/src/linux_pci.c b/sys/compat/linuxkpi/common/src/linux_pci.c index 0734acac60ac..aeb277fc65e9 100644 --- a/sys/compat/linuxkpi/common/src/linux_pci.c +++ b/sys/compat/linuxkpi/common/src/linux_pci.c @@ -369,11 +369,7 @@ linux_pci_probe(device_t dev) return (ENXIO); device_set_desc(dev, pdrv->name); - /* Assume BSS initialized (should never return BUS_PROBE_SPECIFIC). */ - if (pdrv->bsd_probe_return == 0) - return (BUS_PROBE_DEFAULT); - else - return (pdrv->bsd_probe_return); + return (BUS_PROBE_DEFAULT - 1); } static int diff --git a/sys/contrib/dev/iwlwifi/pcie/drv.c b/sys/contrib/dev/iwlwifi/pcie/drv.c index cd890fae6fbc..58a7111d4f40 100644 --- a/sys/contrib/dev/iwlwifi/pcie/drv.c +++ b/sys/contrib/dev/iwlwifi/pcie/drv.c @@ -1746,10 +1746,6 @@ static struct pci_driver iwl_pci_driver = { .probe = iwl_pci_probe, .remove = iwl_pci_remove, .driver.pm = IWL_PM_OPS, -#if defined(__FreeBSD__) - /* Allow iwm(4) to attach for conflicting IDs for now. */ - .bsd_probe_return = (BUS_PROBE_DEFAULT - 1), -#endif }; int __must_check iwl_pci_register_driver(void)
I know my diff isn't great for stable/13, but for the release it likely suffices until we can fix the KBI in stable/13 and may be less risk for the release than a bigger fix.
Well if you want to fix KBI in stable/13 you probably want to actually append spares NOW for 13.1-RELEASE that are unused so that by the time we start building on 13.1-R rather than 13.0-R release for ports-packages we actually have space to expand. We'll keep running into these problems otherwise for the next 3 or so years...?
...
I like that idea especially for releng/13.1. I'll probably be fine with it on stable/13 for a few more weeks but it'll just push the inevitable -- unless we do add a generic solution in sys/kern/ (sorry, that work tree is currently not avail as the machine was re-purposed for .. ) like that review of yours which helps breaking a tie.
Here's the patch for stable/13, which might also be good for releng/13.1 too. I've not added the static_assert that I want, though...
diff --git a/sys/compat/linuxkpi/common/include/linux/pci.h b/sys/compat/linuxkpi/common/include/linux/pci.h index eb5f46a4879c..7afe426d6b52 100644 --- a/sys/compat/linuxkpi/common/include/linux/pci.h +++ b/sys/compat/linuxkpi/common/include/linux/pci.h @@ -229,13 +229,12 @@ struct pci_driver { devclass_t bsdclass; struct device_driver driver; const struct pci_error_handlers *err_handler; - bool isdrm; + int bsd_probe_return; int (*bsd_iov_init)(device_t dev, uint16_t num_vfs, const nvlist_t *pf_config); void (*bsd_iov_uninit)(device_t dev); int (*bsd_iov_add_vf)(device_t dev, uint16_t vfnum, const nvlist_t *vf_config); - int bsd_probe_return; }; struct pci_bus { diff --git a/sys/compat/linuxkpi/common/src/linux_pci.c b/sys/compat/linuxkpi/common/src/linux_pci.c index 0734acac60ac..523519a91f7b 100644 --- a/sys/compat/linuxkpi/common/src/linux_pci.c +++ b/sys/compat/linuxkpi/common/src/linux_pci.c @@ -119,6 +119,12 @@ struct linux_dma_priv { #define DMA_PRIV_LOCK(priv) mtx_lock(&(priv)->lock) #define DMA_PRIV_UNLOCK(priv) mtx_unlock(&(priv)->lock) +static bool +linux_is_drm(struct pci_driver *pdrv) +{ + return (pdrv->name != NULL && strcmp(pdrv->name, "drmn")); +} + static int linux_pdev_dma_uninit(struct pci_dev *pdev) { @@ -405,7 +411,7 @@ linux_pci_attach_device(device_t dev, struct pci_driver *pdrv, linux_set_current(curthread); parent = device_get_parent(dev); - isdrm = pdrv != NULL && pdrv->isdrm; + isdrm = pdrv != NULL && linux_is_drm(pdrv); if (isdrm) { struct pci_devinfo *dinfo; @@ -674,7 +680,6 @@ linux_pci_register_driver(struct pci_driver *pdrv) dc = devclass_find("pci"); if (dc == NULL) return (-ENXIO); - pdrv->isdrm = false; return (_linux_pci_register_driver(pdrv, dc)); } @@ -688,7 +693,7 @@ linux_pci_reserve_bar(struct pci_dev *pdev, struct resource_list *rl, KASSERT(type == SYS_RES_IOPORT || type == SYS_RES_MEMORY, ("trying to reserve non-BAR type %d", type)); - dev = pdev->pdrv != NULL && pdev->pdrv->isdrm ? + dev = pdev->pdrv != NULL && linux_is_drm(pdev->pdrv) ? device_get_parent(pdev->dev.bsddev) : pdev->dev.bsddev; res = pci_reserve_map(device_get_parent(dev), dev, type, &rid, 0, ~0, 1, 1, 0); @@ -706,7 +711,7 @@ pci_resource_start(struct pci_dev *pdev, int bar) if ((rle = linux_pci_get_bar(pdev, bar, true)) == NULL) return (0); - dev = pdev->pdrv != NULL && pdev->pdrv->isdrm ? + dev = pdev->pdrv != NULL && linux_is_drm(pdev->pdrv) ? device_get_parent(pdev->dev.bsddev) : pdev->dev.bsddev; if (BUS_TRANSLATE_RESOURCE(dev, rle->type, rle->start, &newstart)) { device_printf(pdev->dev.bsddev, "translate of %#jx failed\n", @@ -734,7 +739,6 @@ linux_pci_register_drm_driver(struct pci_driver *pdrv) dc = devclass_create("vgapci"); if (dc == NULL) return (-ENXIO); - pdrv->isdrm = true; pdrv->name = "drmn"; return (_linux_pci_register_driver(pdrv, dc)); }
I've uploaded what I think will work as D34754 but have only compile tested it on all supported architectures.
It's an interesting question as to why we build linuxkpi on i386, arm and powerpc, but solving / answering that will wait for another day.