Page MenuHomeFreeBSD

Extend device_pager to support prefaulting of device memory
AbandonedPublic

Authored by kmacy on May 27 2016, 1:36 AM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Apr 5, 10:57 AM
Unknown Object (File)
Sat, Mar 23, 1:26 AM
Unknown Object (File)
Dec 20 2023, 12:37 AM
Unknown Object (File)
Dec 12 2023, 6:00 PM
Unknown Object (File)
Oct 18 2023, 2:24 PM
Unknown Object (File)
Sep 9 2023, 2:26 PM
Unknown Object (File)
Sep 3 2023, 10:21 AM
Unknown Object (File)
Jul 30 2023, 7:03 PM
Subscribers

Details

Summary

Linux graphics drivers aggressively pre-fault device memory. The current interface prevents that from working - at least in terms of expecations implied by existing asserts in the VM code. This change to the device pager's fault interface let's it know how many pages have been passed and let's it communicate how much read ahead it has done.

It turns out that the driver prefault path isn't invoked at X startup. But during glxgears my instrumentation indicates that this change is in fact adding the *rahead mappings.

Test Plan

It doesn't need any testing per se as it is a no-op for all in-tree consumers.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

kmacy retitled this revision from to Enable device's to support prefaulting of device memory.
kmacy updated this object.
kmacy edited the test plan for this revision. (Show Details)
kmacy added reviewers: adrian, kib.
kmacy set the repository for this revision to rS FreeBSD src repository - subversion.
kmacy retitled this revision from Enable device's to support prefaulting of device memory to Extend device_pager to support prefaulting of device memory.May 27 2016, 6:58 PM

Please explain what is the supposed action of the driver when *rahead is assigned non-zero value.

Of course, the change breaks driver KBI. If it is accepted, something (cdevsw version, i.e. D_VERSION ?) should be bumped.

In D6587#139656, @kib wrote:

Please explain what is the supposed action of the driver when *rahead is assigned non-zero value.

Ahh thanks for pointing out that prefaulting on fictitious pages is broken there too.

vm_fault.c:959 - faultcount != 1

if (faultcount != 1 && (fault_flags & VM_FAULT_WIRE) == 0 &&

	    wired == 0)
		vm_fault_prefault(&fs, vaddr,
		    faultcount > 0 ? behind : PFBAK,
		    faultcount > 0 ? ahead : PFFOR);

This will call vm_fault_prefault with the value of rahead as the 4th argument.

...

		if (m->valid == VM_PAGE_BITS_ALL &&
		    (m->flags & PG_FICTITIOUS) == 0)
			pmap_enter_quick(pmap, addr, m, entry->protection);

What is the reason for that? It's a completely gratuitous restriction.

Of course, the change breaks driver KBI. If it is accepted, something (cdevsw version, i.e. D_VERSION ?) should be bumped.

Alternatively, we can create a new cdev_pager_allocate. However, the existing API doesn't strike me as being worth preserving.

@kib can you explain the check?

If not I'll update the patch with its removal. Thanks again.

  • Fix erroneous limitation on drivers preventing prefaulting of device mappings.
  • Add new version number to conf.h
kmacy added a reviewer: alc.

Still, can you formally explain what are the expectations WRT value of *rahead and state of the device object page queue ?

Also, I wanted to pass the real protection value down to the cdev_pg_fault() for quite some time. Currently prot always indicates write, which causes unneccessary wait in the GEM and TTM page fault handlers. The KBI of the cdev_pg_fault does not need to be changed for that, but since we are discussing the interface update, I think it worth being mentioned.

In D6587#139787, @kib wrote:

Still, can you formally explain what are the expectations WRT value of *rahead and state of the device object page queue ?

I don't understand the question. The answer to what it _sounds_like_ you're asking is self-evident. All the pages that are (effectively) passed up from the i915 gem fault routine are inserted in the object in the appropriate order. Even with prefault failing to call pmap_enter_quick as it did it previously having all the pages in object saved on work later by having all the pages ready for lookup to find. I can provide a walkthrough the new code if it would help.

Also, I wanted to pass the real protection value down to the cdev_pg_fault() for quite some time. Currently prot always indicates write, which causes unneccessary wait in the GEM and TTM page fault handlers. The KBI of the cdev_pg_fault does not need to be changed for that, but since we are discussing the interface update, I think it worth being mentioned.

Within the pager fault routine the prot would logically indicate the actual fault type. If it is always VM_PROT_WRITE that is a VM bug. The ABI as it stands now supports the correct behavior, even if the implementation currently does not.

Thanks for pointing out yet another bug in the VM relating to graphics performance. I had assumed it would DTRT. I can't trace every call in the code and instrument every argument on the off-chance it's wrong - or at least I hope I don't have to.

In D6587#139856, @kmacy wrote:
In D6587#139787, @kib wrote:

Still, can you formally explain what are the expectations WRT value of *rahead and state of the device object page queue ?

I don't understand the question. The answer to what it _sounds_like_ you're asking is self-evident. All the pages that are (effectively) passed up from the i915 gem fault routine are inserted in the object in the appropriate order.

I think the question was referring to behaviour with respect to the devp_pglist queue.

Within the pager fault routine the prot would logically indicate the actual fault type. If it is always VM_PROT_WRITE that is a VM bug. The ABI as it stands now supports the correct behavior, even if the implementation currently does not.

Thanks for pointing out yet another bug in the VM relating to graphics performance. I had assumed it would DTRT. I can't trace every call in the code and instrument every argument on the off-chance it's wrong - or at least I hope I don't have to.

I see - it doesn't pass down fault_type:

			rv = vm_pager_get_pages(fs.object, &fs.m, 1,
			    &behind, &ahead);
			if (rv == VM_PAGER_OK) {
				if (ahead)

... just wow. Thanks again. I will update the patch and fix. I'm also thinking about extending the device_pager API rather than changing it to make it easier to backport to 10.

I see - it doesn't pass down fault_type:

			rv = vm_pager_get_pages(fs.object, &fs.m, 1,
			    &behind, &ahead);
			if (rv == VM_PAGER_OK) {
				if (ahead)

... just wow. Thanks again. I will update the patch and fix. I'm also thinking about extending the device_pager API rather than changing it to make it easier to backport to 10.

Oh dear ...

static int
dev_pager_getpages(vm_object_t object, vm_page_t *ma, int count, int *rbehind,

int *rahead)

{
int error;
<...>

error = object->un_pager.devp.ops->cdev_pg_fault(object,

			IDX_TO_OFF(ma[0]->pindex), PROT_READ, ma, count, rahead);

<...>

This code really is decrepit. Thanks again for forcing me to take a closer look. I'm sure there are other problems too.

In D6587#139880, @kmacy wrote:

I have a patch somewhere which passes correct PROT_XXX down to the vm_pager_getpages(). Nothing non-trivial, but it compiled some time ago. I will try to find it tomorrow.

In D6587#139858, @markj wrote:
In D6587#139856, @kmacy wrote:
In D6587#139787, @kib wrote:

Still, can you formally explain what are the expectations WRT value of *rahead and state of the device object page queue ?

I don't understand the question. The answer to what it _sounds_like_ you're asking is self-evident. All the pages that are (effectively) passed up from the i915 gem fault routine are inserted in the object in the appropriate order.

I think the question was referring to behaviour with respect to the devp_pglist queue.

The driver needs to add to the list itself. My next patch will make that clearer.

In D6587#139881, @kib wrote:
In D6587#139880, @kmacy wrote:

I have a patch somewhere which passes correct PROT_XXX down to the vm_pager_getpages(). Nothing non-trivial, but it compiled some time ago. I will try to find it tomorrow.

Cool. I have something I just hacked out that I'll post soon. I hope that we can merge the two somehow.

In D6587#139787, @kib wrote:

Still, can you formally explain what are the expectations WRT value of *rahead and state of the device object page queue ?

Also, I wanted to pass the real protection value down to the cdev_pg_fault() for quite some time. Currently prot always indicates write, which causes unneccessary wait in the GEM and TTM page fault handlers. The KBI of the cdev_pg_fault does not need to be changed for that, but since we are discussing the interface update, I think it worth being mentioned.

@kib see D6620 for a clarification of the code to answer your first question and a first cut at addressing the issues raised by your second.

I've confirmed that I can manage prefaulting within the driver itself and do not need the assistance or knowledge of the VM.