Page MenuHomeFreeBSD

Cache kernel stacks in UMA so that we get NUMA support, better concurrency and statistics with less redundant code.
ClosedPublic

Authored by jeff on Jul 12 2019, 1:03 AM.

Details

Summary

I wanted to allocate kernel stacks through UMA so that we can apply NUMA policy to them as well as utilize other allocator features. In general I prefer to add small features to UMA rather than add another custom allocator.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

jeff created this revision.Jul 12 2019, 1:03 AM
jeff retitled this revision from Cache kernel stacks in UMA so that we get NUMA support, better concurrency and statistics with less redundant code. to Cache kernel stacks in UMA so that we get NUMA support, better concurrency and statistics with less redundant code..Jul 12 2019, 1:05 AM
jeff edited the summary of this revision. (Show Details)
jeff added reviewers: glebius, kib, markj.
kib accepted this revision.Jul 12 2019, 1:01 PM

I think that the patch is functionally fine.

It looks strange that

  1. For newly created stack, you set the backing object domain policy to prefer domain of the current cpu.
  2. You do not encode any preference when allocating from cache.
This revision is now accepted and ready to land.Jul 12 2019, 1:01 PM
jeff added a comment.Jul 12 2019, 8:31 PM
In D20931#453715, @kib wrote:

I think that the patch is functionally fine.
It looks strange that

  1. For newly created stack, you set the backing object domain policy to prefer domain of the current cpu.
  2. You do not encode any preference when allocating from cache.

When allocating from cache the preference was recorded when it was inserted into cache in kstack_import(). The object is initialized and pointed to by the pages.

I think the one thing that is questionable here is the PREFER vs FIXED. I should probably fix them both as PREFER. Do you agree?

markj added inline comments.Jul 12 2019, 9:40 PM
sys/vm/vm_glue.c
347 ↗(On Diff #59670)

I don't think we need to initialize the iterator anymore?

jeff updated this revision to Diff 59727.Jul 13 2019, 5:28 PM

Address review feedback

This revision now requires review to proceed.Jul 13 2019, 5:28 PM
jeff added inline comments.Jul 13 2019, 5:28 PM
sys/vm/vm_glue.c
347 ↗(On Diff #59670)

If we don't initialize it then every ksobj will iterate from the same point. By initializing in this way they each step through the available domains slightly differently.

Was there some other change that made this unnecessary? I don't see it.

kib added a comment.Jul 13 2019, 8:21 PM
In D20931#453942, @jeff wrote:
In D20931#453715, @kib wrote:

I think that the patch is functionally fine.
It looks strange that

  1. For newly created stack, you set the backing object domain policy to prefer domain of the current cpu.
  2. You do not encode any preference when allocating from cache.

When allocating from cache the preference was recorded when it was inserted into cache in kstack_import(). The object is initialized and pointed to by the pages.

So it is UMA_ZONE_NUMA which gives the locality when satisfying the allocation from cache ?

I think the one thing that is questionable here is the PREFER vs FIXED. I should probably fix them both as PREFER. Do you agree?

I think yes.

jeff added a comment.Jul 13 2019, 8:39 PM
In D20931#454068, @kib wrote:
In D20931#453942, @jeff wrote:
In D20931#453715, @kib wrote:

I think that the patch is functionally fine.
It looks strange that

  1. For newly created stack, you set the backing object domain policy to prefer domain of the current cpu.
  2. You do not encode any preference when allocating from cache.

When allocating from cache the preference was recorded when it was inserted into cache in kstack_import(). The object is initialized and pointed to by the pages.

So it is UMA_ZONE_NUMA which gives the locality when satisfying the allocation from cache ?

I think the one thing that is questionable here is the PREFER vs FIXED. I should probably fix them both as PREFER. Do you agree?

I think yes.

Yes, stacks will now be allocated according to the UMA policy and the calling thread's domain. This isn't perfect when we're allocating a thread that we want to run on another domain. That is something that requires some deeper support for domain level locality in the scheduler to really be effective anyhow. This will work for threads that do some static binding.

markj accepted this revision.Jul 14 2019, 5:43 PM
markj added inline comments.
sys/vm/vm_glue.c
347 ↗(On Diff #59670)

That's true, though now the iterator is used only if the preferred domain is depleted. Shouldn't we do this kind of initialization in a more central location, like vm_object_allocate()?

301 ↗(On Diff #59727)

This should be flagged CTLFLAG_RWTUN to ensure that it can be set via loader.conf. We should declare CTLFLAG_MPSAFE as well.

This revision is now accepted and ready to land.Jul 14 2019, 5:43 PM