
Make pmclog buffer pcpu and update constants
Closed, Public

Authored by kmacy on Apr 22 2018, 5:30 AM.

Details

Summary

On non-trivial SMP systems, contention on the pmc_owner mutex causes a substantial share of the captured samples to come from the pmc process itself. This change (a) makes the buffers larger to reduce contention on the global list and (b) makes the working sample buffer per-CPU.
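
The effect of the two parts of the change can be illustrated with a small userspace sketch. This is not the hwpmc code: `struct owner`, `struct logbuf`, `log_append`, `NCPU` and `BUFSZ` are hypothetical names and sizes chosen for illustration, and the locked handoff to the shared list is modeled by a counter rather than a real mutex.

```c
#include <stddef.h>
#include <string.h>

#define NCPU	4	/* hypothetical CPU count */
#define BUFSZ	64	/* hypothetical per-CPU buffer size, in bytes */

struct logbuf {
	size_t	used;		/* bytes filled so far */
	char	data[BUFSZ];
};

struct owner {
	unsigned	handoffs;	/* times the shared list was needed */
	struct logbuf	curbuf[NCPU];	/* one working buffer per CPU */
};

/*
 * Append one record to the working buffer of the given CPU.
 * The common path touches only that CPU's buffer. The shared list is
 * involved only when the buffer is full; that handoff is modeled here
 * by bumping a counter where real code would take the lock.
 * Returns 0 on success, -1 on a bad CPU index or an oversized record.
 */
int
log_append(struct owner *o, int cpu, const void *rec, size_t len)
{
	struct logbuf *b;

	if (cpu < 0 || cpu >= NCPU || len > BUFSZ)
		return (-1);
	b = &o->curbuf[cpu];
	if (b->used + len > BUFSZ) {
		o->handoffs++;	/* lock, queue the full buffer, unlock */
		b->used = 0;	/* continue in a fresh buffer */
	}
	memcpy(b->data + b->used, rec, len);
	b->used += len;
	return (0);
}
```

In this model the shared state is touched once per full buffer rather than once per record, so a larger `BUFSZ` lowers the handoff rate, and giving each CPU its own working buffer keeps the common path free of cross-CPU contention.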

will-it-scale/page_fault1_processes -t 96 -s 30 >& /dev/null &
pmcstat -S UNHALTED_CORE_CYCLES -n 21000000 -O pf1.pmcstat sleep 10
pmcstat -R pf1.pmcstat -z100 -G pf1.stacks

before:
https://github.com/mattmacy/profiling/blob/master/2018.04.22/pf1_orig.svg

after:
https://github.com/mattmacy/profiling/blob/master/2018.04.22/pf1.svg

without pmcstat running:
mmacy@anarchy [~/devel/freebsd|0:09|32] time make -j96 buildkernel -s >& /dev/null
make -j96 buildkernel -s >&/dev/null 2508.41s user 1047.14s system 6000% cpu 59.259 total

with pmcstat running before:
pmcstat -S UNHALTED_CORE_CYCLES -n 21000000 -O /dev/null sleep 600 &
[1] 8918
mmacy@anarchy [~/devel/freebsd|0:29|68] time make -j96 buildkernel -s >& /dev/null
make -j96 buildkernel -s >&/dev/null 3336.68s user 24684.10s system 7442% cpu 6:16.50 total

with pmcstat running after:
pmcstat -S UNHALTED_CORE_CYCLES -n 21000000 -O /dev/null sleep 180 &
mmacy@anarchy [~/devel/freebsd|0:14|48] time make -j96 buildkernel -s >& /dev/null
make -j96 buildkernel -s >&/dev/null 2688.13s user 1547.66s system 6053% cpu 1:09.97 total

fixing the extensive false sharing in pmc drops it a bit further:
make -j96 buildkernel -s >&/dev/null 2697.82s user 1347.35s system 6058% cpu 1:06.77 total

It turns out that the pmcstat mechanism for specifying the rate is a bit brain damaged: the rate has to be specified _before_ the counter.

So the command line when testing sampling overhead is now:
pmcstat -n 1050000 -S UNHALTED_CORE_CYCLES -O /dev/null sleep 6000 &

With a 2.1 GHz clock, this samples at most 2000 times per second.
baseline: make -j96 buildkernel without pmc running
pmc_head: make -j96 buildkernel with pmc running on HEAD
pmc_pcpu: make -j96 buildkernel with pmc running using the attached patch
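
The 2000/s bound quoted above follows from dividing the clock rate by the event count passed to `-n`. A minimal sketch of that arithmetic, assuming the counter advances at most once per core clock cycle and that the bound is per CPU (the helper name `max_samples_per_sec` is hypothetical):

```c
#include <stdint.h>

/*
 * Upper bound on the sampling rate when one sample is taken every
 * `count` events and the event fires at most `clock_hz` times a second.
 * Returns 0 for a zero count instead of dividing by zero.
 */
uint64_t
max_samples_per_sec(uint64_t clock_hz, uint64_t count)
{
	return (count == 0 ? 0 : clock_hz / count);
}
```

For example, `max_samples_per_sec(2100000000, 1050000)` corresponds to the `-n 1050000` command line above, and the same helper applied to a count of 21000000 gives the bound such a count would impose at the same clock.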

Real time before and after:

x pmc_head_real.log
+ pmc_pcpu_real.log
+---------------------------------------------------------------------------+
|+                                                                          |
|+                                                                          |
|+                                                                          |
|+                                                                          |
|++                                                                         |
|++                          x                                              |
|++                xx  x   xxxx             x                              x|
|MA             |___________M___A________________|                          |
+---------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10          76.4        127.62        84.845        88.577     15.100031
+  10         59.71         60.79        60.135        60.179    0.29957192
Difference at 95.0% confidence
        -28.398 +/- 10.0344
        -32.0602% +/- 7.69825%
        (Student's t, pooled s = 10.6794)
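
The figures in the ministat output above can be cross-checked by hand. The following is a sketch of the arithmetic, assuming the standard pooled-variance two-sample t interval and a table value of t ≈ 2.101 for 18 degrees of freedom at 95% confidence; the output itself only states "Student's t, pooled s". The subscripts head and pcpu stand for the x and + data sets.

```latex
\begin{align*}
d &= \bar{x}_{\mathrm{pcpu}} - \bar{x}_{\mathrm{head}}
   = 60.179 - 88.577 = -28.398 \\
s_p &= \sqrt{\frac{(n_{\mathrm{head}} - 1)\, s_{\mathrm{head}}^2
                  + (n_{\mathrm{pcpu}} - 1)\, s_{\mathrm{pcpu}}^2}
                 {n_{\mathrm{head}} + n_{\mathrm{pcpu}} - 2}}
     = \sqrt{\frac{9 (15.100031)^2 + 9 (0.29957192)^2}{18}}
     \approx 10.6794 \\
e &= t_{0.975,\,18} \; s_p \sqrt{\frac{1}{n_{\mathrm{head}}} + \frac{1}{n_{\mathrm{pcpu}}}}
   \approx 2.101 \times 10.6794 \times \sqrt{0.2}
   \approx 10.034 \\
\frac{d}{\bar{x}_{\mathrm{head}}} &= \frac{-28.398}{88.577} \approx -32.06\%
\end{align*}
```

These agree with the reported difference of -28.398 +/- 10.0344 and -32.0602%. The relative error bound (+/- 7.69825%) is not reproduced here, because the output does not show how ministat derives it.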

System time before and after:
x pmc_head_sys.log
+ pmc_pcpu_sys.log
+---------------------------------------------------------------------------+
|+                                                                          |
|+                                                                          |
|+                                                                          |
|+                                                                          |
|+                                                                          |
|+                                                                          |
|++                     x x                                                 |
|++              xx  x  x xx              x                                x|
|A|         |____________M____A________________|                            |
+---------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10       2277.96       6948.53       2949.47      3341.492     1385.2677
+  10        1038.7       1081.06      1070.555      1064.017      15.85404
Difference at 95.0% confidence
        -2277.47 +/- 920.425
        -68.1574% +/- 8.77623%
        (Student's t, pooled s = 979.596)

Overhead of pmc on head:
x baseline_real.log
+ pmc_head_real.log
+---------------------------------------------------------------------------+
|xx                           +                                             |
|xx                           +                                             |
|xx                 ++  +   +++              +                             +|
|AM              |___________M___A_______________|                          |
+---------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10         58.38         59.15         58.86        58.847    0.22504567
+  10          76.4        127.62        84.845        88.577     15.100031
Difference at 95.0% confidence
        29.73 +/- 10.0335
        50.5208% +/- 17.0525%
        (Student's t, pooled s = 10.6785)

x baseline_sys.log
+ pmc_head_sys.log
+---------------------------------------------------------------------------+
|x                      + +                                                 |
|xx              ++  +  + ++              +                                +|
|A           |___________M____A________________|                            |
+---------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10       1010.15       1073.31      1025.465      1031.524     18.135705
+  10       2277.96       6948.53       2949.47      3341.492     1385.2677
Difference at 95.0% confidence
        2309.97 +/- 920.443
        223.937% +/- 89.3039%
        (Student's t, pooled s = 979.616)

Overhead of updated pmc:
x baseline_real.log
+ pmc_pcpu_real.log
+---------------------------------------------------------------------------+
|                x    x                               +                     |
|x        x x xx x    x  x                +    +     ++++    +   +         +|
|       |______AM_____|                        |_______MA________|          |
+---------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10         58.38         59.15         58.86        58.847    0.22504567
+  10         59.71         60.79        60.135        60.179    0.29957192
Difference at 95.0% confidence
        1.332 +/- 0.248939
        2.2635% +/- 0.426506%
        (Student's t, pooled s = 0.264942)

x baseline_sys.log
+ pmc_pcpu_sys.log
+---------------------------------------------------------------------------+
|            x   x                                                +        +|
|x       x   x   x     x       +   x+x     +    +             +   +x   +   +|
|   |____________M_____A_________________||______________A______M_________| |
+---------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10       1010.15       1073.31      1025.465      1031.524     18.135705
+  10        1038.7       1081.06      1070.555      1064.017      15.85404
Difference at 95.0% confidence
        32.493 +/- 16.0042
        3.15% +/- 1.5794%
        (Student's t, pooled s = 17.0331)

Diff Detail

Repository
rS FreeBSD src repository - subversion

Event Timeline

kmacy edited the summary of this revision.

Fix most false sharing in pmc

  • make buffer count a function of the number of cpus
  • allocate buffers and per-cpu structures from the right domain
  • reduce nsamples to the number required assuming at most ~8 samples per tick

remove gratuitous spinlock_{enter,exit} calls

  • spinlock_{enter,exit} are gratuitous only in functions that are called exclusively from hardclock.
  • add copyright to the two heavily modified files

Fix compile issues on non-tier-1 architectures

sys/dev/hwpmc/hwpmc_logging.c
550 ↗(On Diff #42081)

Is synchronization necessary for this? What happens when we race with setting shutdown?

sys/dev/hwpmc/hwpmc_mod.c
265–268 ↗(On Diff #42081)

Is this part related to this diff? It doesn't look necessary. It's also done only for Intel CPUs.

sys/dev/hwpmc/hwpmc_logging.c
550 ↗(On Diff #42081)

I'll protect the loop with an epoch section.

sys/dev/hwpmc/hwpmc_mod.c
265–268 ↗(On Diff #42081)

It's only needed for CPUs that support the counter tables from pmu-tables. The only architectures where it's available are Intel, POWER8/POWER9, and arm64. I guess I can commit that with the corresponding pmcstat changes.

This revision is now accepted and ready to land. May 11 2018, 11:32 PM
sys/dev/hwpmc/hwpmc_logging.c
550 ↗(On Diff #42081)

When pmclog_close() is called, no PMCs are active.

int
pmclog_close(struct pmc_owner *po)
{
	pmclog_process_closelog(po);
	mtx_lock(&pmc_kthread_mtx);

	/*
	 * Schedule the current buffer.
	 */
	pmclog_schedule_all(po);
	wakeup_one(po);

	/*
	 * Initiate shutdown: no new data queued,
	 * thread will close file on last block.
	 */
	po->po_flags |= PMC_PO_SHUTDOWN;
	mtx_unlock(&pmc_kthread_mtx);

	return (0);
}

pmclog_schedule_all() visits every CPU and clears the pmc_sampled flag, so hardclock will not call into this code.

static void
pmclog_schedule_one_cond(void *arg)
{
	struct pmc_owner *po = arg;
	struct pmclog_buffer *plb;

	spinlock_enter();
	if (DPCPU_GET(pmc_sampled)) {
		DPCPU_SET(pmc_sampled, 0);
		pmc_process_samples(PCPU_GET(cpuid), PMC_HR);
		pmc_process_samples(PCPU_GET(cpuid), PMC_SR);
	}
	plb = po->po_curbuf[curcpu];
	if (plb && plb->plb_ptr != plb->plb_base)
		pmclog_schedule_io(po);
	spinlock_exit();
}

This revision was automatically updated to reflect the committed changes.