This is useful for bhyve, which otherwise has to use /dev/io to handle
accesses to I/O port BARs when PCI passthrough is in use. In
particular, /dev/io cannot be used in jails even if devfs access is
granted.
Details
See D31308.
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
Lint Passed - Unit
No Test Coverage - Build Status
Buildable 40679 Build 37568: arc lint + arc unit
Event Timeline
My feel is that it would be more useful to have single ioctl, but encode read/write operation in struct pci_ioport_io. My reasoning is that just reads and writes are not enough, eventually for this API to be more useful, there will be requests for operations like and/or/nor. I do not propose to implement them outright, but it is better to leave the flexibility to introduce them when needed.
sys/dev/pci/pci_user.c | ||
---|---|---|
943 | I believe that this feature should not be limited to the io bars. Or rather, I see no reason why it is useful to limit it. | |
956 | This is an overflow check, am I right? Isn't it better to check that offset+width does not overflow, and then leave only second test for rman_get_size? | |
1045 | I believe ports (or MMIO) reads should be treated same as writes from the security PoV. It is very common for any access to some registers to have side effect. |
Address feedback:
- Add a single ioctl and encode the requested op in the parameters
- Fix FWRITE check
- Permit I/O to any BAR
- Correct the overflow check
sys/dev/pci/pci_user.c | ||
---|---|---|
1045 | Ugh, thanks, I misread the check and inverted it in my head. |
sys/dev/pci/pci_user.c | ||
---|---|---|
949 | It's not required at least for PCI passthrough with ppt. Is there anything preventing a mmap of a memory-mapped BAR if a driver has attached and mapped it already? If not then it probably makes sense to add RF_SHAREABLE. |
sys/dev/pci/pci_user.c | ||
---|---|---|
949 | As I said, I am not sure. My belief is that it is needed on both sides to get the resource shared between two activations, but I do not know for sure. |
sys/dev/pci/pci_user.c | ||
---|---|---|
947 | You need to pick between SYS_RES_IOPORT and SYS_RES_MEMORY here to support memory BARs. I don't think we have a helper for that yet, but you can use this: if (PCI_BAR_MEM(pm->pm_value)) type = SYS_RES_MEMORY; else type = SYS_RES_IOPORT; | |
949 |
So it is true that all parties have to use RF_SHAREABLE. It may also be a feature that you can't do I/O to BARs mapped by drivers (I think ppt doesn't alloc resources for its BARs which is why Mark's use case works). I'm not sure how I feel about what we should do here. Not passing the flag is the most conservative approach, and I think most drivers don't allocate their BARs shareable anyway. Actually, the PCI bus only allows for one allocation of a BAR at a time, though a few places like vgapci allow multiple allocs via a ref count. | |
950–951 | Ugh, new code shouldn't use this. | |
956 | You could move these checks earlier and use pm->pm_size as the BAR size and validate the offset and width before the call to bus_alloc_resource_any. | |
967 | Here and the other places below. |
- Address feedback.
- Rename the ioctl and request structure to indicate that they can be used for both I/O port and memory-mapped BARs.
Generally this is looking good. I agree with the RF_SHAREABLE bit. But to support that protocol correctly, you need to activate/deactivate while holding the allocation, which suggests a more elaborate protocol would be needed than the stateless one we have here. So it's best to omit it entirely.
share/man/man4/pci.4 | ||
---|---|---|
454 ↗ | (On Diff #92993) | I'd be tempted to add 'in bytes' here to make it clear the units. And why aren't 8-byte I/Os supported? |
sys/dev/pci/pci_user.c | ||
937 | How do we know it's a type 0 device here? Do you allow these operations on PCI bridges? If not, perhaps you should filter them out. If so, you need to test for TYPE 1 here. Type 2 is CardBus bridges and likely can be ignored, or at least filtered, given their rarity these days. | |
946 | I think you want an invariant check to make sure this is a valid width. | |
970 | Why not case 8: here? |
Address feedback from Warner:
- man page clarification
- permit 8-byte accesses
- remove the BAR index check and rely on pci_find_bar() instead
Also add a comment noting that the ioctl could be modified to work if a driver
has already allocated the resource.
I'm not seeing anything new, and you addressed all my prior feedback (though the large number of comments in spots makes it a little hard to read)