Page MenuHomeFreeBSD

DMAR driver assumes all physical addresses are backed by a fully initialized struct vm_page
ClosedPublic

Authored by tychon on Mar 29 2019, 5:30 PM.
Tags
None
Referenced Files
F83614590: D19753.diff
Sun, May 12, 5:35 PM
Unknown Object (File)
Sat, May 11, 2:34 PM
Unknown Object (File)
Tue, Apr 30, 1:02 AM
Unknown Object (File)
Tue, Apr 30, 1:02 AM
Unknown Object (File)
Tue, Apr 30, 1:02 AM
Unknown Object (File)
Apr 7 2024, 5:27 PM
Unknown Object (File)
Apr 5 2024, 3:42 PM
Unknown Object (File)
Feb 20 2024, 5:58 PM
Subscribers

Details

Summary

In dmar_bus_dmamap_load_buffer() there is code to special case the behavior when the system is creating a kernel crash dump and a comment to go along with it. Turns out there are other cases when the system is not dumping that PHYS_TO_VM_PAGE() can also return NULL or a non-initialized struct vm_page. For instance if performing DMA to/from a data-structure allocated in a driver's BSS segment.

To fix the issue in dmar_bus_dmamap_load_buffer() make the ‘dumping’ code the default. Likewise dmar_bus_dmamap_load_phys() makes an analogous call to PHYS_TO_VM_PAGE(); perform the same technique there.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

I do not like it. If we always pass ficitious pages, we do not need to pass pages at all, we can get away with physical address only. But I do want to have pages there for several reasons.

I do have the same problem on my bhyve integration branch, but there I do the following:

diff --git a/sys/vm/vm_page.c b/sys/vm/vm_page.c
index a90a6f805b7..1f9d266ba7e 100644
--- a/sys/vm/vm_page.c
+++ b/sys/vm/vm_page.c
@@ -557,7 +557,7 @@ vm_page_startup(vm_offset_t vaddr)
 #ifdef WITNESS
 	int witness_size;
 #endif
-#if defined(__i386__) && defined(VM_PHYSSEG_DENSE)
+#if (defined(__i386__) || defined(__amd64__)) && defined(VM_PHYSSEG_DENSE)
 	long ii;
 #endif
 
@@ -800,7 +800,11 @@ vm_page_startup(vm_offset_t vaddr)
 	 * Initialize the page structures and add every available page to the
 	 * physical memory allocator's free lists.
 	 */
-#if defined(__i386__) && defined(VM_PHYSSEG_DENSE)
+#if (defined(__i386__) || defined(__amd64__)) && defined(VM_PHYSSEG_DENSE)
+	/*
+	 * i386 needs this for copyout(9) calling vm_fault_quick_hold_pages().
+	 * amd64 requires that for DMAR busdma and bhyve IOMMU.
+	 */
 	for (ii = 0; ii < vm_page_array_size; ii++) {
 		m = &vm_page_array[ii];
 		vm_page_init_page(m, (first_page + ii) << PAGE_SHIFT, 0);

As a temporal solution, you might consider using fake pages only for addresses where PHYS_TO_VM_PAGE() failed.

In D19753#423602, @kib wrote:

I do not like it. If we always pass ficitious pages, we do not need to pass pages at all, we can get away with physical address only. But I do want to have pages there for several reasons.

I do have the same problem on my bhyve integration branch, but there I do the following:

diff --git a/sys/vm/vm_page.c b/sys/vm/vm_page.c
index a90a6f805b7..1f9d266ba7e 100644
--- a/sys/vm/vm_page.c
+++ b/sys/vm/vm_page.c
@@ -557,7 +557,7 @@ vm_page_startup(vm_offset_t vaddr)
 #ifdef WITNESS
 	int witness_size;
 #endif
-#if defined(__i386__) && defined(VM_PHYSSEG_DENSE)
+#if (defined(__i386__) || defined(__amd64__)) && defined(VM_PHYSSEG_DENSE)
 	long ii;
 #endif
 
@@ -800,7 +800,11 @@ vm_page_startup(vm_offset_t vaddr)
 	 * Initialize the page structures and add every available page to the
 	 * physical memory allocator's free lists.
 	 */
-#if defined(__i386__) && defined(VM_PHYSSEG_DENSE)
+#if (defined(__i386__) || defined(__amd64__)) && defined(VM_PHYSSEG_DENSE)
+	/*
+	 * i386 needs this for copyout(9) calling vm_fault_quick_hold_pages().
+	 * amd64 requires that for DMAR busdma and bhyve IOMMU.
+	 */
 	for (ii = 0; ii < vm_page_array_size; ii++) {
 		m = &vm_page_array[ii];
 		vm_page_init_page(m, (first_page + ii) << PAGE_SHIFT, 0);

As a temporal solution, you might consider using fake pages only for addresses where PHYS_TO_VM_PAGE() failed.

You make a good point that the common case was getting penalized. As you suggested I've adjusted to code to fallback to fake pages only when necessary. That keeps things pretty much as they are performance-wise and gets the non-working cases resolved.

I also debated the making the array of struct vm_page obsolete in favor of something else but I'm not well versed in the code enough to spot the obvious down side. Mostly that seemed attractive in that you could perhaps delete the random-sized malloc()/free() combination in favor of a linked list of uma-cached malloc data structures...

This revision is now accepted and ready to land.Apr 2 2019, 3:35 PM

Hm, sorry for following up immediately.

Can you change the patch slightly, to use the result of PHYS_TO_VM_PAGE() if it is usable ? In other words, only fill the missed slots in ma[].

In D19753#424464, @kib wrote:

Hm, sorry for following up immediately.

Can you change the patch slightly, to use the result of PHYS_TO_VM_PAGE() if it is usable ? In other words, only fill the missed slots in ma[].

Sure. Intuitively it seems to me that it will be all or nothing with respect to PHYS_TO_VM_PAGE() working but it's a pretty straightforward change to take a holes-only approach. I've attached a patch to do that. What do you think?

This revision now requires review to proceed.Apr 2 2019, 4:46 PM
sys/x86/iommu/busdma_dmar.c
695 ↗(On Diff #55736)

You probably should check if fma != NULL and not allocate if it was already allocated. Even if we agree that ether all pages are valid or fictitious, still on second page you would leak previously allocated fma.

709 ↗(On Diff #55736)

I do not see a need in the second loop. Could you assign to ma[i] in the if() block started at line 686 ?

I fixed the issues in the previous version which I should have taken more time to review myself before posting :-(

tychon added inline comments.
sys/x86/iommu/busdma_dmar.c
695 ↗(On Diff #55736)

I now see what you originally intended by only filling in the missing pages.

A new diff is attached, gone is leak in the previous version.

709 ↗(On Diff #55736)

Yes, both loops could be merged. It greatly cleaned up the code.

This revision is now accepted and ready to land.Apr 2 2019, 5:48 PM
This revision was automatically updated to reflect the committed changes.
tychon marked 2 inline comments as done.