Use separate vmems to segregate KVA by domain
ClosedPublic

Authored by jeff on Nov 29 2017, 4:13 AM.

Details

Summary

This patch implements domain selection policy at the kmem layer. Each domain gets its own KVA allocator, which holds KVA_QUANTUM-sized spans of address space. The quantum is chosen so that KVA stays aligned across both domains and superpages.

As with the page layer, kmem_ provides both a domain-specific allocation function and a policy-driven one. This will make it easier to push code under domain-specific free queue locks.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
jeff created this revision. Nov 29 2017, 4:13 AM
jeff added inline comments. Nov 29 2017, 7:35 AM
vm/vm_kern.c
518 ↗(On Diff #35941)

This should return ret.

markj accepted this revision as: markj. Nov 30 2017, 8:06 PM
markj added a subscriber: markj.
markj added inline comments.
vm/vm_kern.c
555 ↗(On Diff #35941)

Parens around return value, ditto above.

vm/vm_page.c
2571 ↗(On Diff #35941)

The "domain" parameter comes before "req" and "npages" in vm_page_alloc_contig_domain()'s signature.

vm/vm_reserv.c
718 ↗(On Diff #35941)

The changes in this function should also be made in vm_reserv_alloc_contig().

This revision is now accepted and ready to land. Nov 30 2017, 8:06 PM
kib added a subscriber: kib. Dec 2 2017, 11:07 AM
kib added inline comments.
vm/vm_kern.c
236 ↗(On Diff #35941)

This chunk copies/pastes the same issue into at least three more places. I really do not understand why you insist on the broken handling of sleepable allocations.

541 ↗(On Diff #35941)

So you decided to silently treat allocations of size zero as belonging to domain zero? Perhaps it would be more honest to add an assert disallowing zero-size allocations entirely?

559 ↗(On Diff #35941)

Why do we need this wrapper? IMO, changing kmem_unback() to return int is justifiable.

vm/vm_page.c
2655 ↗(On Diff #35941)

Why is this function needed at all? It is not called anywhere in the new world, unless I am mistaken.

It is rather pointless, IMO, to use the calling thread's domain policy for unrelated reclamation with a potentially global effect.

jeff marked 3 inline comments as done. Dec 2 2017, 7:24 PM
jeff added inline comments.
vm/vm_kern.c
236 ↗(On Diff #35941)

Here there is not even a WAITFAIL to fall back on. We use the malloc flags, which don't include this concept, and we don't have a primitive that waits for any domain to have memory before retrying the loop. I don't consider this solution optimal, but I don't think it's a huge problem either. In order to make progress, I am trying to come up with separable units of work that I can commit, rather than writing one huge, unreviewable, perfect solution all at once. This is just where I decided to make the cut. After this and the UMA change go in, I will have new policy with new iterators, and we can continue to improve the sleeping situation.

541 ↗(On Diff #35941)

It happens in failure cases. I can short-circuit the failure cases elsewhere so that we never call the function, or I could return -1. I just felt it was more defensive to handle it early and return.

559 ↗(On Diff #35941)

I suppose so, yes. It seems like an unrelated detail for all callers outside of vm_kern.c, and I didn't want to be married to this arrangement. Returning the int opens up the opportunity for something outside this file to make use of it or try to interpret it.

vm/vm_page.c
2655 ↗(On Diff #35941)

It is used in mips pmap in ways I don't quite understand. If it is safe to assume domain 0 in those cases I can eliminate this wrapper.

jeff added inline comments. Dec 2 2017, 7:27 PM
vm/vm_page.c
2655 ↗(On Diff #35941)

./arm/nvidia/drm2/tegra_bo.c: if (!vm_page_reclaim_contig(pflags, npages, low, high,
./compat/linuxkpi/common/src/linux_page.c: if (!vm_page_reclaim_contig(req,
./dev/drm2/i915/i915_gem_gtt.c: if (!vm_page_reclaim_contig(req, 1, 0, 0xffffffff,
./dev/drm2/ttm/ttm_bo.c: if (tries < 1 && vm_page_reclaim_contig(req, 1,
./dev/drm2/ttm/ttm_page_alloc.c: if (!vm_page_reclaim_contig(req, 1, 0, 0xffffffff,
./mips/mips/pmap.c: if (!vm_page_reclaim_contig(req, 1, 0, MIPS_KSEG0_LARGEST_PHYS,
./mips/mips/pmap.c: if (nkpg == NULL && vm_page_reclaim_contig(req_class, 1,
./mips/mips/uma_machdep.c: if (m == NULL && vm_page_reclaim_contig(pflags, 1,

kib added inline comments. Dec 3 2017, 10:03 AM
vm/vm_page.c
2655 ↗(On Diff #35941)

I reviewed all uses of the function in drm2 (including ttm) and in the mips pmap. My opinion is that the wrapper may be useful for now, but, as I noted earlier, it must not use the calling thread's domain policy.

The intent of every call to the function (of those that remain after the patch) is to free some pages in a specified physical address range. If, for instance, the current thread's policy only allows a domain that contains no pages from that range, the iteration is pointless.

jeff added inline comments. Dec 5 2017, 9:53 PM
vm/vm_page.c
2655 ↗(On Diff #35941)

The problem will be resolved with my new iterators and policy because the kernel_object policy will be used in place of the thread.

Before I commit that patch I will simply make this a 'foreach domain' loop.

alc added inline comments. Dec 23 2017, 8:05 PM
vm/vm_reserv.c
718 ↗(On Diff #35941)

This assignment makes no sense. A vm_reserv is like a vm_page: it is permanently bound to a range of physical addresses. This field, if it exists at all, should be initialized at startup and never changed.

This revision was automatically updated to reflect the committed changes.