Page MenuHomeFreeBSD

Reallocate pcpu area on the correct domain.
ClosedPublic

Authored by jeff on Tue, Aug 13, 1:49 AM.

Details

Summary

This patch simply allocates various per-cpu data-structures on the correct domain. The actual pcpu area is reallocated and we waste a page. Recovering the page is likely more costly than the benefit as it will increase the size of the page array.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

jeff created this revision.Tue, Aug 13, 1:49 AM
jeff retitled this revision from Reallocate pcpu area on the correct domain if it is not already. Parts of this diff were originally from kib. to Reallocate pcpu area on the correct domain..Tue, Aug 13, 1:52 AM
jeff edited the summary of this revision. (Show Details)
jeff added reviewers: markj, kib, gallatin, jhb, glebius.
kib added inline comments.Tue, Aug 13, 7:28 AM
sys/amd64/amd64/mp_machdep.c
406 ↗(On Diff #60707)

You are remapping a page in the kernel .bss section, I think that pmap_enter() works there only by a chance for now. I am sure that we do not want to support such use of pmap_enter().

Ideally we would use something like pmap_qenter(), but .bss might be mapped with superpage.

jeff added inline comments.Tue, Aug 13, 7:31 AM
sys/amd64/amd64/mp_machdep.c
406 ↗(On Diff #60707)

What might break it? Why do we not want to support this?

kib added inline comments.Tue, Aug 13, 11:07 AM
sys/amd64/amd64/mp_machdep.c
406 ↗(On Diff #60707)

For instance, vm_page_array elements backing pages of kernel text/data/bss were not initialized until very recently. We never faulted on kernel segments, and pmap_enter() was correspondingly never needed there.

jeff added inline comments.Tue, Aug 13, 5:09 PM
sys/amd64/amd64/mp_machdep.c
406 ↗(On Diff #60707)

Ok but I still don't see a reason not to do this. It is simple and it works. Anything I do will just be a more long winded replacement of the same code.

kib added inline comments.Tue, Aug 13, 5:35 PM
sys/amd64/amd64/mp_machdep.c
406 ↗(On Diff #60707)

I do not object (strongly), just pointing out that the code introduces a new requirement.

Also, this causes demotion of the superpage mapping the data (.data + .bss), I believe.

jeff added a reviewer: mmacy.Thu, Aug 15, 7:29 PM
jeff added a comment.Sun, Aug 18, 8:36 AM

If there is no strong objection I will commit this tomorrow.

kib accepted this revision.Sun, Aug 18, 6:25 PM
This revision is now accepted and ready to land.Sun, Aug 18, 6:25 PM
gallatin requested changes to this revision.Sun, Aug 18, 7:52 PM
gallatin added inline comments.
sys/amd64/amd64/mp_machdep.c
470 ↗(On Diff #60707)

I'm getting a panic here on AMD rome configured in non-numa mode. I think we need to avoid acpi_pxm_get_cpu_locality() when we're not on a numa machine. The following patch seems to fix it:

        /* start each AP */
-       for (cpu = 1; cpu < mp_ncpus; cpu++) {
+       for (domain = 0, cpu = 1; cpu < mp_ncpus; cpu++) {
                apic_id = cpu_apic_ids[cpu];
-               domain = acpi_pxm_get_cpu_locality(apic_id);
+#ifdef NUMA
+               if (vm_ndomains > 1)
+                       domain = acpi_pxm_get_cpu_locality(apic_id);
+#endif
                /* allocate and set up an idle stack data page */
                bootstacks[cpu] = (void *)kmem_malloc_domainset(
This revision now requires changes to proceed.Sun, Aug 18, 7:52 PM
This revision was not accepted when it landed; it landed in state Needs Revision.
This revision was automatically updated to reflect the committed changes.