Page MenuHomeFreeBSD

Allocate BARs for unspecified region in pci.c
AbandonedPublic

Authored by wma on Feb 12 2016, 8:21 AM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Apr 5, 6:14 PM
Unknown Object (File)
Mar 16 2024, 9:27 AM
Unknown Object (File)
Jan 12 2024, 1:23 AM
Unknown Object (File)
Jan 1 2024, 9:44 AM
Unknown Object (File)
Nov 13 2023, 12:07 PM
Unknown Object (File)
Oct 25 2023, 7:32 AM
Unknown Object (File)
Oct 13 2023, 5:39 AM
Unknown Object (File)
Oct 2 2023, 2:30 AM

Details

Reviewers
zbb
jhb
Summary
When a driver of a device with not configured BARs requests access to
memory region, it ends up in PCI RootComplex controller alloc_resource
routine. Since the start, end and count is unknown, it was RC driver
responsibility to perform BAR size/offset discovery. This was done
mostly on ARM and architectures where bootloard does not configure
PCI tree.

To allow removal of duplicated code, perform BAR/start/end/count
enumeration inside pci_reserve_map, where we're already doing
some portions of BAR discovery.

Also, fix an issue where "num" value was not propagated to alloc_resource
when VFs were allocated. This resulted in allocation of memory
space only for BARx_VF0, not for the number requested by the user.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

wma retitled this revision from to Allocate BARs for unspecified region in pci.c.
wma updated this object.
wma edited the test plan for this revision. (Show Details)
wma set the repository for this revision to rS FreeBSD src repository - subversion.

No this is not the right approach at all. The problem is ARM's bus drivers are broken, not the PCI bus. Other platforms besides ARM cope with this just fine. What is supposed to happen is that the call to resource_list_reserve() is going to pass up the "real" count as count. Thus, when the RC driver sees the allocation request it will see a valid count to know what size of resource to allocate. The existing map is specifically _not_ read at this time. During the bus probe, pci_add_map() will try to allocate the firmware-assigned region programmed into the BAR. However, if that fails, then pci_alloc_resource() is going to call pci_reserve_resource(). In that case, the bus should _not_ simply try to reallocate the range that failed before. It is going to fail again. Instead, it passes up the wider start/end range (normally the wildcard range) to the parent bridge driver along with a valid count. That bridge driver is responsible for clipping the request if necessary to the ranges it decodes when asking for resources from its parent (e.g. ACPI host-pci bridge drivers use Producer resource ranges in _CRS to determine the ranges to request from).

Once resource_list_reserve() works, it updates the resource list entry with the start/end/count of the resource it allocated. The call to resource_list_alloc() will then be able to satisfy the original start/end/count from the caller with the resource in the resource list entry (thus there is no need to modify the values passed to resource_list_alloc() in pci_alloc_resource(), if it fails it points to a bug in the parent bridge's allocation).

In summary, this change is completely wrong and will break x86 (among other architectures). Your RC driver is supposed to alloc 'X' bytes in the requested range, nothing more. It should not care about the associated BAR (if there is a PCI->foo bridge there might not even _be_ a BAR for the allocation request as it might be for a foo device, e.g. for a PCI-ISA bridge that is subtractively decoded). It also needs to cope with the range being larger than the size meaning it needs to find "space" in the ranges it decodes from its parent to hand back to the requester.

So for example in sys/arm64/cavium/thunder_pcie.c:thunder_pcie_alloc_resource() you cannot read the BAR register directly. The device might not be a PCI device (it might be an ISA device behind a PCI-ISA bridge). The request might not be for a BAR (once NEW_PCIB is supported), it might be for the I/O window of a PCI-PCI bridge instead.

Instead, you should just be passing the 'start', 'end', 'count' you get to the mem_rman directly. It will allocate a suitable chunk of free space with the proper alignment, etc.

However, this means your rman needs to contain PCI address (the BUS addresses), not phys addresses. In particular, the values you return in the resource you allocate are what gets written into the BAR of the PCI device, so they have to be valid in the PCI address space. Instead, you will need to convert the BUS address to a PHYS address only after a resource is allocated to generate an appropriate bus tag/handle.

The bus_map_resource() API proposed in D5237 should make this easier to implement going forward for NEW_PCIB though it isn't a hard requirement for you right now. However, you need to fix the RC bridge driver, not change the PCI bus.