Page MenuHomeFreeBSD

PCIe HotPlug support

Authored by jmg on Oct 18 2015, 9:13 AM.



Initial support for PCIe Hot Plug support. There are two parts.

First is making the pci device adding more of a proper subclass, so that the PCI framework can add a pci device, and the various implementations (like ACPI) can
allocate/create the device w/ the necessary info.

Second is supporting hot plug. There is a kernel option on this part that will disable this functionality. This will initially be disabled, but we should look at enabling it by default to get the most testing.

This does not have the final state machine for controlling hot plug, that still needs some work.

Diff Detail

rS FreeBSD src repository
Lint OK
No Unit Test Coverage
Build Status
Buildable 807
Build 807: arc lint + arc unit

Event Timeline

jmg updated this revision to Diff 9485.Oct 18 2015, 9:13 AM
jmg retitled this revision from to PCIe HotPlug support.
jmg updated this object.
jmg edited the test plan for this revision. (Show Details)
jmg added reviewers: jhb, adrian, imp.
jmg set the repository for this revision to rS FreeBSD src repository.
adrian added inline comments.Oct 18 2015, 5:18 PM

Hm, so I know it requires MSI, but is it possible to also have legacy IRQ bits working too? I ask only because some of the wireless chipsets (eg *cough* earlier QCA pieces) support MSI but it's buggy, so the driver just plainly doesn't use MSI. (eg all the AR92xx PCIe pieces.)

imp edited edge metadata.Oct 18 2015, 6:15 PM

First pass: everything that isn't the actual hotplug stuff. Mostly dealing with consistency with the rest of the system.


that's a funky option name. pcie_hp or pciehp would be better. pci_ehp doesn't make sense to my brain.


and fix it here too. This is a PCIE option...


I'd be very hesitant to remove this DELAY. I think it is really needed for some legacy hardware.


This style change looks independent of the rest of the work.


This cleanup looks independent of the rest of the work.


This isn't necessarily a good test.

There are many devices that don't expect you to read config space for function > 0 since they are single function devices, as defined in the header for function 0. While most are well behaved, an interesting number are not.


history has shown that this isn't a good interface. Pass the length in. I see you take care to try to not overflow, but it isn't best practices...

jmg added a comment.Oct 18 2015, 7:14 PM

address some comments.


Yeh, I agree.. I was trying to stay in the PCI_ "namespace".. I'm fine w/ renaming to PCIE_HP or PCIEHP.


Should I just drop the e then and call it PCI_HP? the PCI_IOV is for PCIe SR-IOV and it doesn't have the E.


I talked w/ gavin (iirc) about this, and this was only required for a special AMD8111 chip that was never put into production (I believe pre production) that FreeBSD did devel on.



ok, missed that w/ my recent white space clean up last night.. I'll commit this separately shortly.


Standard problem of cleaning up code while looking at related code, and trying to pull changes apart.

Shall I just commit this fix then?


Well, pci_child_present is called in two contexts, either a) a device was present and went away (and in this case I believe it's well defined) (so, should be fine) or b) we are probing for devices, but we always check 0, 0 first, before going on. If you follow the code, this code should be virtually identical (here we check all 32 bits instead of just the vendor).. See pci_read_device for the original check.


as this code is static and used in only one place, I'd prefer not to. I would rather inline this code in pci_slot_cap_print than pass in a length here.


This is only to handle the interrupt for the PCIe bridge chip providing Hot Plug support... Devices below it should work fine on legacy interrupts just as they work fine below any other PCIe bridge.

If you have an example of them not working, please let me know.

jmg added a reviewer: gavin.Oct 18 2015, 7:14 PM
imp added a comment.Oct 19 2015, 5:17 AM

just commenting on the comments...
Still haven't had the time to concentrate on pcie_hp.c


Sure, I like that better. PCI_HP it is...


OK. The pre-production status of this part wasn't clear from reading the commit log.


Yes. Reduce the diffs. Easier to review, and easier to bisect changes if one of these cleanups winds up causing trouble.


Document this with a comment then.

emaste added a subscriber: emaste.Oct 19 2015, 4:20 PM
jhb edited edge metadata.Oct 20 2015, 11:23 PM

There are other PCI bus drivers that probably need an add_child method. Maybe just ofw_pci.c? Grep for pci_driver to see who is using it.


You could keep this to avoid the code duplication in pci_add_children(). It could do something like this:

pci_identify_function(device_t pcib, device_t dev, int domain, int busno, int slot, int func)

     if (pci_child_present(pcib, busno, slot, func) == -1 &&
         pci_find_dbsf(domain, busno, slot, func) == NULL) {
             return (PCI_ADD_CHILD(dev, slot, func));
     return (NULL);

Then to handle ARI you would just do:

child = pci_identify_function(...);
if (child != NULL && pci_enable_ari)

I would leave the REG() here to reduce the noise in the diff.


style(9) puts ()'s around return values, so 'return (pci_add_child_size(...))' FWIW.

(Also, throughout the patch).


You should only check the vendor. That is what the PCI spec defines (as it were) in that it says that a vendor ID of 0xFFFFh is always invalid. It makes no guarantees about the device ID.


Is this only for debugging? That's fine for now, but perhaps put it under an #ifdef? You should probably teach pciconf -lc to output these details, but they probably belong there rather than in the kernel outside of a temporary #ifdef.


This will reprobe any devices that were already there. Ideally you would only probe new devices to avoid console spam, etc.


These comments are mostly sentences, please just format them as such (leading capital letter and punctuation).


This comment seems outdated perhaps? It doesn't seem like this is doing that anywhere. NEW_PCIB is probably doing all of this for you. You could perhaps provide a hint to reserve more bus numbers by default if you wish.


For the case of only 1 resource, I think the spec table probably obfuscates more than helps.


Is the idea of putting these here rather than in pcicfg_hp because the same message can be overloaded for both hotplug and other things like AER?

jhb added a comment.Oct 21 2015, 6:37 PM

I had another thought in the shower this morning. Storing resources, etc. in the dinfo is not the right place to do this. The MSI interrupt belongs to the pcib(4) device. This state should be stored in the pcib(4) driver as part of its softc. Pretty much all of pcicfg_hp should move there, and the setup should be done from pcib_attach(), not from the PCI bus layer. This also makes it much more feasible to ensure there are adequate resources reserved for the bridge since you will be doing it in the right place when the bridge is doing the initial probe of its windows.

gavin added inline comments.Oct 23 2015, 12:04 AM

Nope, not me. I know nothing about this line's history other than from commit logs.


If dinfo->cfg.hp.hp_slotcap was set in pci_hotplug_init() rather than _setup it could be used here rather than reading it from the HW each time.


Use dinfo->cfg.hp.hp_slotcap here


I'd rather see this spelled out as the individual bits or'd together. As it is, PCIEM_SLOT_STA_EMASK is just a magic number that means nothing in isolation in the header file.


Save this in dinfo->cfg.hp.hp_slotcap here (rather than in pci_hotplug_setup() as it is now) then all other uses of PCIER_SLOT_CAP can just use this cached copy.

jmg abandoned this revision.Apr 28 2017, 5:41 PM