In the extremely unlikely case that the iommu_gas_map_region() call in bus_dma_iommu_load_ident() failed, we would attempt to unmap the failed entry twice. First in iommu_gas_map_region(), and a second time in the caller. Once is enough, and twice is problematic because it leads to a second RB_REMOVE call on the same tree node. Like it or not, RB_TREE does not handle that possibility.
Details
Diff Detail
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
Kostik, I have another related concerned. What guarantee do we have that dmar_qi_task() is actually finished with the entry when it is freed here or in domain_init_rmrr()?
You mean, that the entry range must still be flushed from the TLB and the entry waited by the qi mechanism? It probably should be done for correctness.
But at least for RMRR entries, I believe that all of them are instantiated before DMAR translation is enabled, or at least they should be. This is probably more important than flushing.
The iommu_gas_free_entry() call here could be performed while the entry is still in the tlb_flush_entries TAILQ. To prevent this from happening, when false is passed as the free argument to iommu_domain_unload_entry(), I think that the IOTLB invalidation should be performed synchronously. Then, we would also no longer need the IOMMU_MAP_ENTRY_QI_NF flag.
But at least for RMRR entries, I believe that all of them are instantiated before DMAR translation is enabled, or at least they should be. This is probably more important than flushing.
Concretely, here is what I'm suggesting:
diff --git a/sys/dev/iommu/iommu_gas.h b/sys/dev/iommu/iommu_gas.h index c32a098538b0..a9d0df5f272f 100644 --- a/sys/dev/iommu/iommu_gas.h +++ b/sys/dev/iommu/iommu_gas.h @@ -43,17 +43,16 @@ #define IOMMU_PGF_ALLOC 0x0004 #define IOMMU_PGF_NOALLOC 0x0008 #define IOMMU_PGF_OBJL 0x0010 #define IOMMU_MAP_ENTRY_PLACE 0x0001 /* Fake entry */ #define IOMMU_MAP_ENTRY_RMRR 0x0002 /* Permanent, not linked by dmamap_link */ #define IOMMU_MAP_ENTRY_MAP 0x0004 /* Busdma created, linked by dmamap_link */ #define IOMMU_MAP_ENTRY_UNMAPPED 0x0010 /* No backing pages */ -#define IOMMU_MAP_ENTRY_QI_NF 0x0020 /* qi task, do not free entry */ #define IOMMU_MAP_ENTRY_READ 0x1000 /* Read permitted */ #define IOMMU_MAP_ENTRY_WRITE 0x2000 /* Write permitted */ #define IOMMU_MAP_ENTRY_SNOOP 0x4000 /* Snoop */ #define IOMMU_MAP_ENTRY_TM 0x8000 /* Transient */ #endif /* !_DEV_IOMMU_IOMMU_GAS_H_ */ diff --git a/sys/x86/iommu/intel_ctx.c b/sys/x86/iommu/intel_ctx.c index bfc607674b57..c6f1effe7b6b 100644 --- a/sys/x86/iommu/intel_ctx.c +++ b/sys/x86/iommu/intel_ctx.c @@ -868,27 +868,25 @@ dmar_domain_free_entry(struct iommu_map_entry *entry, bool free) } void iommu_domain_unload_entry(struct iommu_map_entry *entry, bool free) { struct dmar_domain *domain; struct dmar_unit *unit; domain = IODOM2DOM(entry->domain); unit = DOM2DMAR(domain); - if (unit->qi_enabled) { + if (unit->qi_enabled && free) { DMAR_LOCK(unit); dmar_qi_invalidate_locked(IODOM2DOM(entry->domain), entry->start, entry->end - entry->start, &entry->gseq, true); - if (!free) - entry->flags |= IOMMU_MAP_ENTRY_QI_NF; TAILQ_INSERT_TAIL(&unit->tlb_flush_entries, entry, dmamap_link); DMAR_UNLOCK(unit); } else { domain_flush_iotlb_sync(IODOM2DOM(entry->domain), entry->start, entry->end - entry->start); dmar_domain_free_entry(entry, free); } } static bool diff --git a/sys/x86/iommu/intel_qi.c b/sys/x86/iommu/intel_qi.c index ca58715a227c..57ad59eaa91a 100644 --- a/sys/x86/iommu/intel_qi.c +++ b/sys/x86/iommu/intel_qi.c @@ -355,22 +355,21 @@ dmar_qi_task(void *arg, int pending __unused) DMAR_LOCK(unit); for (;;) { entry = TAILQ_FIRST(&unit->tlb_flush_entries); if (entry == NULL) break; if (!dmar_qi_seq_processed(unit, &entry->gseq)) break; TAILQ_REMOVE(&unit->tlb_flush_entries, entry, dmamap_link); DMAR_UNLOCK(unit); - dmar_domain_free_entry(entry, (entry->flags & - IOMMU_MAP_ENTRY_QI_NF) == 0); + dmar_domain_free_entry(entry, true); DMAR_LOCK(unit); } if (unit->inv_seq_waiters > 0) wakeup(&unit->inv_seq_waiters); DMAR_UNLOCK(unit); } int dmar_init_qi(struct dmar_unit *unit) {