Page MenuHomeFreeBSD

Fix early counter initialization.
ClosedPublic

Authored by markj on Jul 9 2018, 12:00 AM.
Tags
None
Referenced Files
F103221469: D16190.diff
Fri, Nov 22, 9:11 AM
F103221406: D16190.diff
Fri, Nov 22, 9:10 AM
F103215318: D16190.diff
Fri, Nov 22, 7:45 AM
Unknown Object (File)
Tue, Nov 19, 2:10 PM
Unknown Object (File)
Fri, Nov 8, 12:34 AM
Unknown Object (File)
Tue, Nov 5, 1:19 PM
Unknown Object (File)
Mon, Oct 28, 4:18 PM
Unknown Object (File)
Sat, Oct 26, 3:35 AM
Subscribers

Details

Summary

Some counter arrays are initialized before APs are started. For
instance, vmcounter_startup() runs at SI_SUB_KMEM, before SI_SUB_CPU.
Prior to r336020, counter_u64_alloc() initialized the counters using an
smp_rendezvous(). For early initializations, this ran before
smp_started was set, so we'd only initialize the counter for CPU 0, and
apparently we relied on the rest of the counters already being zeroed(!).

With r336020 we now initialize the counters using a CPU_FOREACH() loop,
and this revealed a problem. On arm64, we do not set the bit for cpuid
0 in all_cpus until SI_SUB_CPU, so the CPU_FOREACH() loop skips CPU 0
(because CPU_ABSENT(0) is true) when initializing some counters. Thus
the trash value 0xdeadc0de is left as the counter value.

Fix the problem by using M_ZERO with UMA, and modify UMA to explicitly
iterate over all cpuids. I do not believe that we can defer
initialization of vm_cnt to after SI_SUB_CPU.

Test Plan

Without this change, the vm_cnt per-CPU counters have a value of
0xdeadc0de + delta. With the change, the values are sane.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

markj edited the test plan for this revision. (Show Details)
markj added reviewers: kib, glebius.

Note that the trash init function isn't PCPU zone-aware, so it only trashes the first 8 bytes of the allocation, i.e., the counter for CPU 0.

This revision is now accepted and ready to land.Jul 9 2018, 7:53 AM
../../../kern/subr_counter.c:64:55: error: expected ')'
        return (uma_zalloc_pcpu(pcpu_zone_64, flags | M_ZERO);
In D16190#343346, @pho wrote:
../../../kern/subr_counter.c:64:55: error: expected ')'
        return (uma_zalloc_pcpu(pcpu_zone_64, flags | M_ZERO);

Oops, sorry. I hand-copied the patch before creating the review.

The description is somewhat incorrect. Now I think that this is a regression from r336020. In particular, pcpu_page_alloc() doesn't honour M_ZERO. Previously we allocated pages for PCPU zones with page_alloc(), which does honour M_ZERO, and we have the following code in keg_alloc_slab():

1027         /*                                                                                                                                                                                                                  
1028          * This reproduces the old vm_zone behavior of zero filling pages the                                                                                                                                               
1029          * first time they are added to a zone.                                                                                                                                                                             
1030          *                                                                                                                                                                                                                  
1031          * Malloced items are zeroed in uma_zalloc.                                                                                                                                                                         
1032          */                                                                                                                                                                                                                 
1033                                                                                                                                                                                                                             
1034         if ((keg->uk_flags & UMA_ZONE_MALLOC) == 0)                                                                                                                                                                         
1035                 wait |= M_ZERO;                                                                                                                                                                                             
1036         else                                                                                                                                                                                                                
1037                 wait &= ~M_ZERO;

So, prior to r336020, the zeroing would work somewhat by accident. The allocation steps looked like this:

  1. Allocate pages to back the PCPU zone, and zero them.
  2. Trash the first 8 bytes.
  3. Use smp_rendezvous() to zero each counter. Before SMP is started, this only zeroes CPU 0's counter, which is the only one that is trashed.

Now, we no longer perform step 1, and in step 3 we use CPU_FOREACH(), which doesn't work before SI_SUB_CPU.

In addition to the change above, I will modify pcpu_page_alloc() to honour M_ZERO (and other malloc flags like M_NODUMP).

... also, pcpu_page_alloc() doesn't do what it's supposed to do for pre-SI_SUB_CPU initializations, since CPU_ABSENT() is true for all CPUs. I won't try to fix that in this review though.

Honour M_ZERO in pcpu_page_alloc() as well.

This revision now requires review to proceed.Jul 9 2018, 10:44 AM
This revision was not accepted when it landed; it landed in state Needs Review.Jul 10 2018, 12:18 AM
This revision was automatically updated to reflect the committed changes.