Page MenuHomeFreeBSD

Add BUS_RESET() and devctl reset.
ClosedPublic

Authored by kib on Mar 19 2019, 4:31 PM.

Details

Summary

BUS_RESET() method is implemented for pcie buses and pci devices.

For pcie bus,

  • all devices below the bus are detached or suspended
  • BUS_RESET_PREPARE() is called on each child
  • the link is disabled on the downstream port, then retrained
  • BUS_RESET_POST() is called on each child
  • all children are re-attached or unsuspended.

PCIe standard only mentions that link disable causes conventional reset. Re-train does not seem to be automatic after the link is re-enabled.

For pci devices, the same dance with detach/prepare and post/reattach is done, but FLR is tried, then power reset if no FLR or FLR failed.

Sponsored by: Mellanox Technologies

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

kib created this revision.Mar 19 2019, 4:31 PM
jhb added a comment.Mar 19 2019, 5:03 PM

It would be nice to wire these up to 'devctl reset' somehow. I had imagined a 'bus_reset_child' method that for PCI would start with FLR but keep trying more invasive things (Linux has a function like this to reset PCI devices). For the devctl ioctl I would have made it suspend the device before reset and resume after the reset, and require -f to reset an attached device.

kib added a comment.Mar 19 2019, 5:26 PM
In D19646#420515, @jhb wrote:

It would be nice to wire these up to 'devctl reset' somehow. I had imagined a 'bus_reset_child' method that for PCI would start with FLR but keep trying more invasive things (Linux has a function like this to reset PCI devices). For the devctl ioctl I would have made it suspend the device before reset and resume after the reset, and require -f to reset an attached device.

Do you mean, adding BUS_DEVICE_RESET() method ? I can do it, but prefer to not mix it with the current changes.

Note that for MLNX use case, we need

  • Conventional reset and not FLR (FLR would not make necessary changes to the hw state)
  • suspend/resume methods are not implemented and if done, would be almost identical to detach/attach.

-

jhb added inline comments.Mar 19 2019, 5:29 PM
sys/dev/pci/pci.c
6383 ↗(On Diff #55238)

Has to be 'device_get_devclass(device_get_parent(port))'

6418 ↗(On Diff #55238)

One issue with doing a link reset first is that it resets all functions on a multifunction device. In theory the D0->D3->D0 only affects a single function whereas a link reset affects everything downstream. Certainly I've passed through individual functions to a VM via ppt (e.g. VFs) while leaving the other functions on the host. It would be surprising to have ppt reset all of the functions which is what your change would do.

6419 ↗(On Diff #55238)

Shouldn't this be 'device_get_devclass(device_get_parent(dev))'? The 'dev' here is a leaf device like em0, so it's declass won't be "pci".

6424 ↗(On Diff #55238)

Given that pcie_link_reset() already does checks, I would replace this entire block with just 'error = pcie_link_reset(device_get_parent(device_get_parent(dev));'

6427 ↗(On Diff #55238)

This check then becomes redundant, and it would have to be. You've already assumed the ivars of 'dev' are PCI ivars, so that means the devclass of dev's parent has to be pci.

6488 ↗(On Diff #55238)

Why not just suspend the devices? You could just use 'bus_generic_suspend()' I think without needing device_get_children, etc.?

kib added inline comments.Mar 19 2019, 5:58 PM
sys/dev/pci/pci.c
6383 ↗(On Diff #55238)

Parent of the port is pcib, not pci class ? I never can understand this two-device_t per device stuff.

6418 ↗(On Diff #55238)

If pci_hard_reset() is not useful for ppt(4), then I do not see much point in bringing it in.

6488 ↗(On Diff #55238)

Suspend is nop for significant number of devices, in particular for mlx5(4). Detach/reattach works for all of them.

kib updated this revision to Diff 55246.Mar 19 2019, 7:42 PM

Do not initiate link-level reset from ppt.
Remove code duplication for checks that device has upper pcie port, this should also fix bugs with mixing port vs. pcib.

imp added inline comments.Mar 26 2019, 8:59 PM
sys/dev/pci/pci.c
6407–6410 ↗(On Diff #55246)

do you need to guard against hz < 50 here? Do we have a lower bound of HZ we support?

6414 ↗(On Diff #55246)

Do we need to loop here to make sure that the link retraining finished? IIRC, the standard wants you to read the status register and wait for PCIE_LINK_STA_TRAINING to clear.

6488 ↗(On Diff #55238)

detach / attach destroys all saved state for the device. Is that OK?

kib updated this revision to Diff 55507.Mar 27 2019, 6:39 PM

Reimplement patch as BUS_RESET(). Connect to devctl(8).
[I know that there is no doc].

kib retitled this revision from PCIe helpers for device resets. to Add BUS_RESET() and devctl reset..Mar 27 2019, 6:44 PM
kib edited the summary of this revision. (Show Details)
imp accepted this revision.Mar 28 2019, 8:48 PM

generally, I'm happy. Once minor nit that you can do or not as you see fit.

sys/kern/bus_if.m
80 ↗(On Diff #55507)

null_reset is the same as the default behavior: return ENXIO when there's no routine given.

This revision is now accepted and ready to land.Mar 28 2019, 8:48 PM
kib updated this revision to Diff 55560.Mar 28 2019, 9:24 PM

Remove unneeded default method.

This revision now requires review to proceed.Mar 28 2019, 9:24 PM
slavash removed a subscriber: slavash.Mar 31 2019, 1:18 PM
slavash added a subscriber: slavash.
kib updated this revision to Diff 55667.Apr 1 2019, 12:40 PM

One more missed place.

imp added inline comments.Apr 1 2019, 2:26 PM
sys/dev/pci/pci_pci.c
47 ↗(On Diff #55667)

Do you need the PCI ioctls here? You don't seem to add a reference to them..

sys/dev/smartpqi/smartpqi_cam.c
742 ↗(On Diff #55667)

This file could be done as a separate commit, since it looks to be independent... I was kinda on the fence mentioning this, so if it's a hassle, it will be fine combining it.

sys/kern/subr_bus.c
3877 ↗(On Diff #55667)

There's code in device_probe which is called from device_probe_and_attach that handles the device being attached properly. It returns -1 which is translated to return 0 which is what you want here. It seems like this check shouldn't be needed, so I'll ask: What is Is the reason you've added the check here and elsewhere?

kib added inline comments.Apr 1 2019, 2:52 PM
sys/dev/pci/pci_pci.c
47 ↗(On Diff #55667)

sys/dev/pcivar.h only exposes pci_devinfo when both sys/bus.h and sys/pciio.h are included. I need it for pdinfo in pcib_reset() below.

sys/dev/smartpqi/smartpqi_cam.c
742 ↗(On Diff #55667)

Yes, all this stuff is probably four commits total, if not more. I split at commit time.

sys/kern/subr_bus.c
3877 ↗(On Diff #55667)

I removed the checks for DS_ATTACHED before device_probe_and_attach() calls.

kib updated this revision to Diff 55674.Apr 1 2019, 2:54 PM

Remove unneeded DS_ATTACHED checks before device_probe_and_attach().

jhb added a comment.Apr 1 2019, 5:53 PM

In general this looks great to me. The only other caveat that perhaps might just need to be documented somewhere, is that if you have detached or suspended a child device explicitly and then do a reset of the parent, the child device will end up attached. That isn't a new problem though (a system-wide suspend doesn't preserve a device being suspended by devctl suspend). Eventually we might need to track the "desired administrative state" of a device_t kind of like IFF_UP vs IFF_RUNNING in ifnet flags as the proper fix for both cases.

sys/kern/bus_if.m
862 ↗(On Diff #55674)

Before commit these need comments of course, and probably manpages for the new methods, but I assume you are just waiting to get the design reviewed first which is fine.

I might have called the new method 'bus_reset_child' since you are resetting a specific child device ('pci_reset' resets a single child PCI device, not the entire PCI bus, 'pcib_reset' resets the child PCI bus, not the bridge device itself, etc.), but that is minor.

kib updated this revision to Diff 55719.Apr 2 2019, 11:32 AM

Rename BUS_RESUME to BUS_RESUME_CHILD.
Add documentation.

imp accepted this revision.Apr 3 2019, 3:54 PM

Looks great, apart from the stupid date nits :). And even if those weren't fixed, it wouldn't be too bad.

lib/libdevctl/devctl.3
27 ↗(On Diff #55719)

should bump this date, no need for a round trip through phab, though, for that :)

usr.sbin/devctl/devctl.8
27 ↗(On Diff #55719)

date here

This revision is now accepted and ready to land.Apr 3 2019, 3:54 PM
jhb added a comment.Apr 3 2019, 6:20 PM

Overall looks good, just some minor nits/suggestions.

lib/libdevctl/devctl.3
391 ↗(On Diff #55719)

Are there any devctl_reset-specific error values (not the EINVAL for a bad flags since the function only passes valid flags to the ioctl) that need to be documented here?

405 ↗(On Diff #55719)

Would be good to add a second paragraph here about a reset always resuming/re-attaching devices.

sys/dev/smartpqi/smartpqi_cam.c
742 ↗(On Diff #55719)

Now that there isn't a DEV_RESET method do you still need the smartpqi changes?

usr.sbin/devctl/devctl.8
126 ↗(On Diff #55719)

Arguably this belongs in a separate BUGS section. Actually, I did document this in devctl(3) in the BUGS section there. We could perhaps add an explicit BUGS section in devctl.8 that points the reader at the section in devctl.3 (or just duplicates the content.)

Ah, you did add it to BUGS below, so I don't think you need the note here.

201 ↗(On Diff #55719)

I would s/un-suspend/resume/g.

kib marked 5 inline comments as done.Apr 3 2019, 6:41 PM
kib added inline comments.
sys/dev/smartpqi/smartpqi_cam.c
742 ↗(On Diff #55719)

I asked myself the same question. I can remove it if you do not like it, but it stomps on the global namespace, which is why I decided to not drop the removal of single-use macro.

kib updated this revision to Diff 55785.Apr 3 2019, 6:41 PM

Update man pages.

This revision now requires review to proceed.Apr 3 2019, 6:41 PM
jhb accepted this revision.Apr 5 2019, 4:46 PM
jhb added inline comments.
sys/dev/smartpqi/smartpqi_cam.c
742 ↗(On Diff #55719)

I don't care strongly, but I wasn't sure if this was a vendor driver and that this would introduce a local diff. sbruno@ might know as he was the last one to bring in a vendor update 8 months ago it seems.

sys/kern/subr_bus.c
3867 ↗(On Diff #55785)

Doxygen comments for the new non-static functions would be consistent with the rest of this file. Sorry I didn't notice that earlier. If you add them, no need for me to review them.

This revision is now accepted and ready to land.Apr 5 2019, 4:46 PM
This revision was automatically updated to reflect the committed changes.