
tychon (Tycho Nightingale)
User

Projects

User Details

User Since
Aug 11 2014, 8:37 PM (544 w, 5 d)

Recent Activity

Apr 14 2021

tychon removed a member for bhyve: tychon.
Apr 14 2021, 3:03 PM

Mar 5 2021

tychon accepted D29094: Only set delayed inval for procs using PTI.
Mar 5 2021, 6:05 PM

Dec 8 2020

tychon accepted D27503: dmar: reserve memory windows of PCIe root port.

I've tested this latest iteration as well.

Dec 8 2020, 8:09 PM

Oct 13 2020

tychon committed rS366678: eliminate possible race in parallel TLB shootdown IPI.
Oct 13 2020, 6:29 PM

Jun 23 2020

tychon committed rS362540: To avoid a startup script race change net.bpf.optimize_writers from.
Jun 23 2020, 1:58 PM

Jun 5 2019

tychon committed rS348687: another occurrence where a very large dma mapping can cause integer overflow.
Jun 5 2019, 1:08 PM

Jun 3 2019

tychon committed rS348571: very large dma mappings can cause integer overflow.
Jun 3 2019, 7:19 PM
tychon closed D20505: very large dma mappings can cause integer overflow.
Jun 3 2019, 7:19 PM
tychon created D20505: very large dma mappings can cause integer overflow.
Jun 3 2019, 5:10 PM
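
The diff for D20505 is not shown in this feed, so the following is only a generic sketch of the class of problem the title names: a 64-bit mapping length that no longer fits once it is narrowed to an `int`. The helper names `sketch_len_fits_int` and `sketch_npages` are hypothetical and do not come from the FreeBSD tree.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: would a 64-bit mapping length survive being
 * stored in an int? Anything above INT_MAX would not.
 */
static bool
sketch_len_fits_int(uint64_t len)
{
	return (len <= (uint64_t)INT_MAX);
}

/*
 * Page count computed entirely in 64 bits, so a multi-gigabyte length
 * cannot wrap. Rounds up when len is not a multiple of page_size.
 */
static uint64_t
sketch_npages(uint64_t len, uint64_t page_size)
{
	return (len / page_size + (len % page_size != 0));
}
```

Keeping lengths and derived counts in a 64-bit type end to end is one way to avoid this kind of truncation; whether the actual commit took that route is not stated here.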

May 17 2019

tychon committed rS347903: Remove unused define.
May 17 2019, 1:08 PM

May 16 2019

tychon committed rS347896: Fix integer overflow in r346386.
May 16 2019, 10:27 PM
tychon closed D20277: revert 4GB workarounds for bge and aac.
May 16 2019, 8:41 PM
tychon committed rS347890: reinstate 4GB DMA boundary workarounds for bge and aac.
May 16 2019, 8:41 PM
tychon added inline comments to D20277: revert 4GB workarounds for bge and aac.
May 16 2019, 8:10 PM
tychon committed rS347836: Allow loading the same DMA address multiple times without any prior.
May 16 2019, 5:41 PM
tychon closed D20181: For the LinuxKPI allow loading the same DMA address multiple times without any prior unload.
May 16 2019, 5:41 PM
tychon added inline comments to D20181: For the LinuxKPI allow loading the same DMA address multiple times without any prior unload.
May 16 2019, 1:53 PM
tychon updated the diff for D20181: For the LinuxKPI allow loading the same DMA address multiple times without any prior unload.

Swapped aarch64 for arm64 per emaste's suggestion and addressed rlibby's nits.

May 16 2019, 1:48 PM
tychon created D20277: revert 4GB workarounds for bge and aac.
May 16 2019, 1:16 PM

May 15 2019

tychon added inline comments to D20263: iommu static analysis cleanup.
May 15 2019, 4:58 PM
tychon added a comment to D20097: Fix regression issues after r346645 in the LinuxKPI.

What is remaining in this review?

May 15 2019, 4:55 PM

May 13 2019

tychon updated the diff for D20181: For the LinuxKPI allow loading the same DMA address multiple times without any prior unload.

Add support for arm64 plus rework _bus_dmamap_pagesneeded to short-circuit if we don't need an exact count but just need to know if *any* are needed.

May 13 2019, 4:30 PM
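
The short-circuit described in the D20181 update above can be sketched roughly as follows. This is a user-space illustration, not the kernel's `_bus_dmamap_pagesneeded`: the names `sketch_pagesneeded` and `sketch_must_bounce`, the 4 KiB page size, and the "bounce if above `lowaddr`" rule are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */

/* Hypothetical rule: a page must bounce if it starts above the device's reach. */
static bool
sketch_must_bounce(uint64_t paddr, uint64_t lowaddr)
{
	return (paddr > lowaddr);
}

/*
 * Walk [buf, buf + buflen) page by page. If pagesneeded is NULL the
 * caller only wants a yes/no answer, so return at the first page that
 * must bounce instead of counting them all.
 */
static bool
sketch_pagesneeded(uint64_t buf, uint64_t buflen, uint64_t lowaddr,
    int *pagesneeded)
{
	uint64_t curaddr = buf;
	uint64_t sgsize;
	int count = 0;

	while (buflen != 0) {
		/* Bytes remaining in the current page, capped at buflen. */
		sgsize = SKETCH_PAGE_SIZE - (curaddr & (SKETCH_PAGE_SIZE - 1));
		if (sgsize > buflen)
			sgsize = buflen;
		if (sketch_must_bounce(curaddr, lowaddr)) {
			if (pagesneeded == NULL)
				return (true);	/* short-circuit */
			count++;
		}
		curaddr += sgsize;
		buflen -= sgsize;
	}
	if (pagesneeded != NULL)
		*pagesneeded = count;
	return (count != 0);
}
```

Passing `NULL` for `pagesneeded` asks only "are any needed?", which lets the walk stop early on large buffers; passing a pointer requests the exact count.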

May 7 2019

tychon added a comment to D20181: For the LinuxKPI allow loading the same DMA address multiple times without any prior unload.

I'll give this a spin and look for regressions in graphics land, but it might take a day or two. Can you add x11 as group reviewer?

May 7 2019, 1:37 PM
tychon updated subscribers of D20181: For the LinuxKPI allow loading the same DMA address multiple times without any prior unload.
May 7 2019, 1:36 PM
tychon created D20181: For the LinuxKPI allow loading the same DMA address multiple times without any prior unload.
May 7 2019, 1:12 PM

May 6 2019

tychon added a comment to D20097: Fix regression issues after r346645 in the LinuxKPI.

Joining a few threads on this back together, below is a fleshed out version (minus arm64) of what I am thinking. It removes some penalty (the additional LOOKUP) when dmar is enabled and significantly reduces overhead in the bounce-without-bounce case.

May 6 2019, 8:38 PM
tychon committed rS347168: zero inputs to vm_page_initfake() for predictable results.
May 6 2019, 12:57 AM
tychon closed D20162: zero vm_page_t inputs to vm_page_initfake() for predictable results.
May 6 2019, 12:57 AM

May 5 2019

tychon updated the diff for D20162: zero vm_page_t inputs to vm_page_initfake() for predictable results.
May 5 2019, 8:02 PM
tychon updated the diff for D20162: zero vm_page_t inputs to vm_page_initfake() for predictable results.

Update to a diff with full context.

May 5 2019, 5:32 PM
tychon created D20162: zero vm_page_t inputs to vm_page_initfake() for predictable results.
May 5 2019, 5:28 PM

May 4 2019

tychon accepted D20154: amd64: fix BUS_SPACE_MAXSIZE to 64bit max value.
In D20154#434058, @kib wrote:

I think that 4G value for BUS_SPACE_MAXSIZE still chomped the PCIe max DMA transfers into 4G chunks.

May 4 2019, 11:05 AM

May 3 2019

tychon added a comment to D20097: Fix regression issues after r346645 in the LinuxKPI.

This is still running fine from a graphics perspective.
DRM has been broken for some time now, what's needed to get this in?

May 3 2019, 4:27 PM

May 2 2019

tychon added a comment to D20097: Fix regression issues after r346645 in the LinuxKPI.

In D20097#433503, @kib wrote:

Why do you suggest that the length check is not needed? Putting the discussion of a possible race aside, why must the lengths of two loads be the same?

May 2 2019, 7:41 PM
tychon added a comment to D20097: Fix regression issues after r346645 in the LinuxKPI.
In D20097#433442, @kib wrote:

I don't see how multiple mappings aren't a bug. The linux API doesn't do any ref-counting. There the first unmap would wipe out both. If there is sharing of an underlying resource that needs to be coordinated at a higher level; this isn't the place. Tolerating it to provide some cover is one thing and I'm not sure I even agree that is the right approach. IMHO fixing the driver is.

Double mappings should not be treated as bugs, just as they are fine for our native busdma KPI. Consider:

  • for bounce, without bounce, bus address == phys address and nothing happens both on map and unmap
  • for bounce, with bounce, second map allocates another set of bounce pages, so bus address is different
  • for DMAR, guest mapping is created anew for each map request, again it is fine.

One of my points is that the physical address is user-controllable in some situations, so the KPI must handle duplicates.

May 2 2019, 2:00 PM
tychon added a comment to D20097: Fix regression issues after r346645 in the LinuxKPI.
In D20097#433274, @kib wrote:

@kib: I was thinking it would be better in a followup commit to add a debug knob to print the backtrace when double mappings happen or when the unmap cannot find the DMA address, and trace this down in the ibcore code. It is apparently a bug. Right now DRM-next and IBCORE works with this patch, and I think the current behaviour to kill old mappings on the same address is fine.

I disagree completely. Suppose that two RDMA clients mmap the same shared memory, and then use the same buffer for transfers.

May 2 2019, 10:35 AM

May 1 2019

tychon added a comment to D20097: Fix regression issues after r346645 in the LinuxKPI.

Would it be prudent to decouple #2 and #3? The fix for #1 raises some interesting implementation questions and may ultimately be fixed better in the driver anyway.

May 1 2019, 3:45 PM

Apr 30 2019

tychon accepted D20097: Fix regression issues after r346645 in the LinuxKPI.

Ideally #2 and #3 would be discrete commits, but I see the value in a single place to point folks to.

Apr 30 2019, 3:48 PM

Apr 29 2019

tychon accepted D20097: Fix regression issues after r346645 in the LinuxKPI.

I really like what you did with the locking. Thanks for pitching in here!!

Apr 29 2019, 4:24 PM

Apr 25 2019

tychon committed rS346687: LinuxKPI buildfix for ppc64 after r346645.
Apr 25 2019, 6:14 PM

Apr 24 2019

tychon committed rS346645: LinuxKPI should use bus_dma(9) to be compatible with an IOMMU.
Apr 24 2019, 8:30 PM
tychon closed D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).
Apr 24 2019, 8:30 PM

Apr 19 2019

tychon committed rS346386: remove the 4GB boundary requirement on PCI DMA segments.
Apr 19 2019, 1:43 PM
tychon closed D19867: remove the 4GB boundary requirement on PCI DMA segments.
Apr 19 2019, 1:43 PM

Apr 18 2019

tychon added a comment to D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).
In D19845#428768, @greg_unrelenting.technology wrote:

Some more i915 GPU testing (w/o the latest update here): after using Firefox (opengl layers, xwayland) for some time, GPU resets start happening

drmn0: Resetting chip for stuck wait on rcs0
drmn0: Resetting chip for stuck wait on rcs0
drmn0: Resetting chip for stuck wait on rcs0
…
DMAR0: Fault Overflow
DMAR0: vgapci0: pci:0:2:0 sid 10 fault acc 0 adt 0x0 reason 0x5 addr 2e09000
DMAR0: Fault Overflow
DMAR0: vgapci0: pci:0:2:0 sid 10 fault acc 0 adt 0x0 reason 0x5 addr 2e09000

and eventually the whole system freezes if I don't quit the compositor / switch to vt console.

Apr 18 2019, 12:36 PM

Apr 17 2019

tychon added inline comments to D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).
Apr 17 2019, 9:52 PM
tychon updated the diff for D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).

Bump __FreeBSD_version and serialize required bus_dma(9) calls.

Apr 17 2019, 9:48 PM

Apr 16 2019

tychon updated the diff for D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).

Fix most trivial of trivial whitespace issues. I just want to avoid the tool chain complaining about any divergences so I'm updating the diff.

Apr 16 2019, 8:03 PM

Apr 14 2019

tychon added a comment to D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).
In D19845#427478, @greg_unrelenting.technology wrote:

Also tested on an AMD Ryzen + Vega system, no regressions. (No IOMMU there because no one wrote a dmar equivalent for AMD IOMMU…)

btw, amdgpu touches dma_mask in one place, had to do this to fix build:

--- i/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ w/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -366,7 +366,7 @@ void amdgpu_amdkfd_get_local_mem_info(struct kgd_dev *kgd,
                                      struct kfd_local_mem_info *mem_info)
 {
        struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
-       uint64_t address_mask = adev->dev->dma_mask ? ~*adev->dev->dma_mask :
+       uint64_t address_mask = adev->dev->dma_priv ? ~*((uint64_t*)adev->dev->dma_priv) :
                                             ~((1ULL << 32) - 1);
        resource_size_t aper_limit = adev->gmc.aper_base + adev->gmc.aper_size;
Apr 14 2019, 2:26 PM
tychon added a comment to D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).
In D19845#427477, @greg_unrelenting.technology wrote:

Tested on my Haswell laptop with drm-v5.0, everything works (both with DMAR on and off), this line is new in dmesg:

vgapci0: dmar0 pci0:0:2:0 rid 10 domain 0 mgaw 48 agaw 48 re-mapped

so it seems like the GPU is IOMMU'd. (full dmesg)

Apr 14 2019, 2:24 PM

Apr 12 2019

tychon committed rS346150: for a cache-only zone the destructor tries to destroy a non-existent keg.
Apr 12 2019, 12:46 PM
tychon closed D19835: for a cache-only zone the destructor tries to destroy a non-existent keg.
Apr 12 2019, 12:46 PM
tychon added inline comments to D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).
Apr 12 2019, 11:23 AM
tychon updated the diff for D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).

Incorporate a few more review comments: add missing BUS_DMA_NOWAIT flags to _bus_dmamap_load_phys() and optimize linux_dma_map_sg_attrs() to coalesce physically contiguous scatter list entries.

Apr 12 2019, 11:19 AM
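
The scatter-list coalescing mentioned in the D19845 update above can be illustrated with a small stand-alone sketch. It does not reproduce `linux_dma_map_sg_attrs()`: `struct sketch_seg` and `sketch_coalesce` are hypothetical names, and the sketch only merges adjacent entries whose physical ranges touch, which is the idea the comment describes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scatter list entry: physical address plus length. */
struct sketch_seg {
	uint64_t paddr;
	uint64_t len;
};

/*
 * Merge runs of physically contiguous entries in place and return the
 * new entry count. Entry i is folded into the current output entry
 * when it begins exactly where that entry ends.
 */
static size_t
sketch_coalesce(struct sketch_seg *segs, size_t nsegs)
{
	size_t out, i;

	if (nsegs == 0)
		return (0);
	out = 0;
	for (i = 1; i < nsegs; i++) {
		if (segs[out].paddr + segs[out].len == segs[i].paddr) {
			segs[out].len += segs[i].len;	/* extend current run */
		} else {
			out++;
			segs[out] = segs[i];		/* start a new run */
		}
	}
	return (out + 1);
}
```

Fewer, larger segments would mean fewer individual mapping operations per scatter list, which is presumably the motivation for the optimization.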
tychon updated the diff for D19835: for a cache-only zone the destructor tries to destroy a non-existent keg.

Might as well make this as good as can be. I combined the tests into one.

Apr 12 2019, 11:02 AM

Apr 11 2019

tychon updated the diff for D19835: for a cache-only zone the destructor tries to destroy a non-existent keg.

Use markj@'s suggestion for a more overt/intuitive fix.

Apr 11 2019, 1:45 PM
tychon updated the summary of D19835: for a cache-only zone the destructor tries to destroy a non-existent keg.
Apr 11 2019, 1:43 PM
tychon added a comment to D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).

And just a heads-up these patches uncovered an issue with the cache-only zone destructor trying to destroy a non-existent keg. That's being worked in D19835.

Apr 11 2019, 1:30 PM
tychon updated the diff for D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).

Address further code review feedback: use non-sleepable allocs, fix weird formatting and add KASSERT(nseg == 1).

Apr 11 2019, 1:25 PM

Apr 10 2019

tychon added inline comments to D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).
Apr 10 2019, 8:52 PM
tychon updated the diff for D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).

Addressed a few code review comments. There is a bit more work to do. Glancing at the feedback, I forgot 'assert that nseg == 1'. Plus I've got the optimization of gluing adjacent sg segments in the works too.

Apr 10 2019, 8:46 PM
tychon updated the diff for D19867: remove the 4GB boundary requirement on PCI DMA segments.

Get rid of PCI_DMA_BOUNDARY entirely.

Apr 10 2019, 1:32 PM

Apr 9 2019

tychon added reviewers for D19835: for a cache-only zone the destructor tries to destroy a non-existent keg: markj, jeff.
Apr 9 2019, 7:04 PM
tychon created D19867: remove the 4GB boundary requirement on PCI DMA segments.
Apr 9 2019, 7:03 PM
tychon updated the diff for D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).

No locking is provided internally by the path-compressed radix trie implementation. It worked shockingly well without until it didn't. Adding locking to make the LinuxKPI DMA routines MT-Safe.

Apr 9 2019, 1:29 PM
tychon committed rS346050: ioatcontrol(8) crc-copy flag bug and misc usage tweak.
Apr 9 2019, 10:33 AM
tychon closed D19855: ioatcontrol(8) crc-copy flag bug and misc usage tweak.
Apr 9 2019, 10:33 AM
tychon created D19855: ioatcontrol(8) crc-copy flag bug and misc usage tweak.
Apr 9 2019, 1:22 AM

Apr 8 2019

tychon updated the diff for D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).
Apr 8 2019, 7:17 PM
tychon added a comment to D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).

While trying to explain the design I realized that the UMA "linux_dma objects" zone doesn't need to be per-device. I can make that global.

Apr 8 2019, 5:52 PM
tychon updated the summary of D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).
Apr 8 2019, 5:50 PM
tychon added a reviewer for D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9): hselasky.
Apr 8 2019, 3:06 PM
tychon created D19845: to be compatible with an IOMMU LinuxKPI should use bus_dma(9).
Apr 8 2019, 3:04 PM

Apr 5 2019

tychon added a reviewer for D19835: for a cache-only zone the destructor tries to destroy a non-existent keg: glebius.
Apr 5 2019, 11:56 PM
tychon created D19835: for a cache-only zone the destructor tries to destroy a non-existent keg.
Apr 5 2019, 11:55 PM

Apr 2 2019

tychon committed rS345813: ioat(4) should use bus_dma(9) for the operation source and destination.
Apr 2 2019, 7:08 PM
tychon closed D19725: ioat(4) should use bus_dma(9) for the operation source and destination addresses to work properly with the VT-d IOMMU.
Apr 2 2019, 7:08 PM
tychon committed rS345812: ioatcontrol(8) could exercise 8k-aligned copy with page-break, crc and.
Apr 2 2019, 7:06 PM
tychon closed D19780: ioatcontrol(8) could exercise 8k-aligned copy with page-break, crc and crc-copy modes.
Apr 2 2019, 7:06 PM
tychon committed rS345811: DMAR driver assumes all physical addresses are backed by a fully.
Apr 2 2019, 6:51 PM
tychon closed D19753: DMAR driver assumes all physical addresses are backed by a fully initialized struct vm_page.
Apr 2 2019, 6:51 PM
tychon added inline comments to D19753: DMAR driver assumes all physical addresses are backed by a fully initialized struct vm_page.
Apr 2 2019, 5:40 PM
tychon updated the diff for D19753: DMAR driver assumes all physical addresses are backed by a fully initialized struct vm_page.

I fixed the issues in the previous version which I should have taken more time to review myself before posting :-(

Apr 2 2019, 5:38 PM
tychon updated the diff for D19753: DMAR driver assumes all physical addresses are backed by a fully initialized struct vm_page.
Apr 2 2019, 4:46 PM
tychon added a comment to D19753: DMAR driver assumes all physical addresses are backed by a fully initialized struct vm_page.
In D19753#424464, @kib wrote:

Hm, sorry for following up immediately.

Can you change the patch slightly, to use the result of PHYS_TO_VM_PAGE() if it is usable? In other words, only fill the missing slots in ma[].

Apr 2 2019, 4:45 PM
tychon added a comment to D19725: ioat(4) should use bus_dma(9) for the operation source and destination addresses to work properly with the VT-d IOMMU.

I think I've addressed, or at least made an attempt to address, all outstanding feedback now. The diff is updated.

Apr 2 2019, 2:27 PM
tychon updated the diff for D19725: ioat(4) should use bus_dma(9) for the operation source and destination addresses to work properly with the VT-d IOMMU.
Apr 2 2019, 2:23 PM
tychon updated the diff for D19780: ioatcontrol(8) could exercise 8k-aligned copy with page-break, crc and crc-copy modes.
Apr 2 2019, 2:18 PM
tychon updated the diff for D19780: ioatcontrol(8) could exercise 8k-aligned copy with page-break, crc and crc-copy modes.

Add || TEST_DMA_8K_PB to pre-condition verification.

Apr 2 2019, 9:59 AM
tychon added a comment to D19780: ioatcontrol(8) could exercise 8k-aligned copy with page-break, crc and crc-copy modes.
In D19780#424304, @cem wrote:

arc-copy modes

typo: "crc-copy" (in the summary)

Looks mostly good to me. A few remarks:

Apr 2 2019, 1:09 AM
tychon updated the diff for D19780: ioatcontrol(8) could exercise 8k-aligned copy with page-break, crc and crc-copy modes.
Apr 2 2019, 1:09 AM
tychon retitled D19780: ioatcontrol(8) could exercise 8k-aligned copy with page-break, crc and crc-copy modes from ioatcontrol(8) could exercise 8k-aligned copy with page-break, crc and arc-copy modes to ioatcontrol(8) could exercise 8k-aligned copy with page-break, crc and crc-copy modes.
Apr 2 2019, 1:04 AM
tychon updated the diff for D19725: ioat(4) should use bus_dma(9) for the operation source and destination addresses to work properly with the VT-d IOMMU.

This revision sets up the '48-bit' DMA address constraint for the 'data' tag and the '40-bit' DMA address constraint for the 'crc data' tag.

Apr 2 2019, 12:57 AM
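
As a rough illustration of what the 48-bit and 40-bit constraints in the D19725 update above amount to, the sketch below computes the highest address reachable with a given number of address bits and checks whether a buffer lies wholly below it. In bus_dma(9) a limit of this kind is normally expressed through the tag's `lowaddr`; the helper names `sketch_max_addr` and `sketch_fits` are hypothetical and are not part of ioat(4).

```c
#include <stdbool.h>
#include <stdint.h>

/* Highest address reachable with `bits` address bits (saturates at 64). */
static uint64_t
sketch_max_addr(unsigned bits)
{
	return (bits >= 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1);
}

/*
 * True if [addr, addr + len) lies wholly at or below the limit.
 * Written to avoid computing addr + len, which could wrap.
 */
static bool
sketch_fits(uint64_t addr, uint64_t len, unsigned bits)
{
	uint64_t max = sketch_max_addr(bits);

	if (len == 0 || addr > max)
		return (false);
	return (len - 1 <= max - addr);
}
```

A buffer that passes the 48-bit check can still fail the 40-bit one, which is why the two tags would need separate constraints.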

Apr 1 2019

tychon created D19780: ioatcontrol(8) could exercise 8k-aligned copy with page-break, crc and crc-copy modes.
Apr 1 2019, 7:38 PM
tychon added inline comments to D19725: ioat(4) should use bus_dma(9) for the operation source and destination addresses to work properly with the VT-d IOMMU.
Apr 1 2019, 7:16 PM
tychon updated the diff for D19725: ioat(4) should use bus_dma(9) for the operation source and destination addresses to work properly with the VT-d IOMMU.
Apr 1 2019, 7:11 PM
tychon committed rS345777: Devices behind downstream bridges should still get DMAR protection.
Apr 1 2019, 7:08 PM
tychon closed D19717: devices behind downstream bridges should still get DMAR protection.
Apr 1 2019, 7:08 PM
tychon added a comment to D19753: DMAR driver assumes all physical addresses are backed by a fully initialized struct vm_page.
In D19753#423602, @kib wrote:

I do not like it. If we always pass fictitious pages, we do not need to pass pages at all, we can get away with physical address only. But I do want to have pages there for several reasons.

I do have the same problem on my bhyve integration branch, but there I do the following:

diff --git a/sys/vm/vm_page.c b/sys/vm/vm_page.c
index a90a6f805b7..1f9d266ba7e 100644
--- a/sys/vm/vm_page.c
+++ b/sys/vm/vm_page.c
@@ -557,7 +557,7 @@ vm_page_startup(vm_offset_t vaddr)
 #ifdef WITNESS
 	int witness_size;
 #endif
-#if defined(__i386__) && defined(VM_PHYSSEG_DENSE)
+#if (defined(__i386__) || defined(__amd64__)) && defined(VM_PHYSSEG_DENSE)
 	long ii;
 #endif
 
@@ -800,7 +800,11 @@ vm_page_startup(vm_offset_t vaddr)
 	 * Initialize the page structures and add every available page to the
 	 * physical memory allocator's free lists.
 	 */
-#if defined(__i386__) && defined(VM_PHYSSEG_DENSE)
+#if (defined(__i386__) || defined(__amd64__)) && defined(VM_PHYSSEG_DENSE)
+	/*
+	 * i386 needs this for copyout(9) calling vm_fault_quick_hold_pages().
+	 * amd64 requires that for DMAR busdma and bhyve IOMMU.
+	 */
 	for (ii = 0; ii < vm_page_array_size; ii++) {
 		m = &vm_page_array[ii];
 		vm_page_init_page(m, (first_page + ii) << PAGE_SHIFT, 0);

As a temporary solution, you might consider using fake pages only for addresses where PHYS_TO_VM_PAGE() failed.

Apr 1 2019, 2:50 PM
tychon updated the diff for D19753: DMAR driver assumes all physical addresses are backed by a fully initialized struct vm_page.
Apr 1 2019, 2:45 PM