Only apply MADV_FREE to exec args when the lowmem handler is called
Closed, Public

Authored by markj on Feb 14 2017, 3:23 AM.

Details

Reviewers

Commits

rS313756: Apply MADV_FREE to exec_map entries only after a lowmem event.

Summary

With r311346 we apply MADV_FREE upon every execve. Despite the recent
changes to pmap_advise() and vm_object_madvise(), this still has a large
amount of overhead on many-CPU systems, primarily because
pmap_advise(MADV_FREE) clears dirty bits and thus requires a TLB
shootdown on x86. On an x1.32xlarge EC2 instance (128 vCPUs), the removal of this
overhead gives a ~7.5% reduction in wall time for a -j128 buildkernel, and
nearly a 50% reduction in system CPU time.

To avoid this overhead, use a lowmem handler to move exec args pages
close to the head of the inactive queue prior to an inactive queue scan.
A generation counter is added to track lowmem calls; when freeing exec
args, a pending generation count will cause MADV_FREE to be applied.
This ensures that all but 260KB*ncpu worth of memory will be reclaimed,
and remaining pages will likely be reclaimed upon a subsequent scan.
Given the overhead of applying MADV_FREE, this seems to be a better
tradeoff. (The EC2 instance type in question provides almost 2TB of
RAM.)
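
The generation-counter idea described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the code from the commit: the names args_kva_model, lowmem_gen, model_lowmem() and model_free_args() are hypothetical, and the expensive MADV_FREE step is represented by a plain counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of one cached exec args buffer. */
struct args_kva_model {
	unsigned int gen;	/* lowmem generation this buffer last saw */
	int madvise_calls;	/* stand-in for the costly MADV_FREE path */
};

/* Global generation, bumped once per modelled lowmem event. */
static atomic_uint lowmem_gen;

/* Modelled lowmem handler: only records that memory pressure occurred. */
static void
model_lowmem(void)
{
	atomic_fetch_add(&lowmem_gen, 1);
}

/*
 * Modelled release of an exec args buffer.  The costly step runs only
 * when a lowmem event happened since the buffer last recorded the
 * generation.  Returns true if the costly step ran.
 */
static bool
model_free_args(struct args_kva_model *kva)
{
	unsigned int gen;

	assert(kva != NULL);
	gen = atomic_load(&lowmem_gen);	/* read the counter exactly once */
	if (kva->gen == gen)
		return (false);		/* no pressure since last time */
	kva->gen = gen;
	kva->madvise_calls++;
	return (true);
}
```

In this model, repeated frees with no intervening lowmem event never take the costly path, which is the property the summary relies on. For scale, 260KB*ncpu with 128 vCPUs comes to roughly 32.5MB of possibly unreclaimed memory.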

Diff Detail

Lint: Lint Passed
Unit: No Test Coverage
Build Status: Buildable 7431, Build 7594: arc lint + arc unit

Event Timeline

markj updated this revision to Diff 25131. Feb 14 2017, 3:23 AM

markj retitled this revision from to Only apply MADV_FREE to exec args when the lowmem handler is called.

markj edited the test plan for this revision.

markj updated this object.

markj updated this object. Feb 14 2017, 3:28 AM

markj added a reviewer: kib.

Overall I do not object.

sys/kern/kern_exec.c
1375	I suggest reading exec_args_gen into a local variable and assigning argkva->gen from that variable. I am not sure exactly what could go wrong if we miss an intermediate gen update (a missed madvise() call, perhaps?), but I also do not see why we should leave that question open. N.B. See the other note below.
1405	Use an atomic increment here? It would make sense to use atomic_fetchadd() and pass the return value + 1 to the exec_free_args_kva() call below. Then exec_free_args_kva() would use that value instead of reading exec_args_gen a second time.
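
The pattern suggested in these inline comments might look roughly like the following userspace sketch. It is an illustration only, not the committed change: C11 atomic_fetch_add() stands in for the kernel's atomic_fetchadd family, and gen_counter, cached_range, release_range() and lowmem_event() are hypothetical names. The point is that the counter is touched once, and the resulting value is passed down rather than re-read.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical global generation counter. */
static atomic_uint gen_counter;

/* Hypothetical cached range that remembers the generation it last saw. */
struct cached_range {
	unsigned int gen;
	int advised;		/* stand-in for the madvise-style work */
};

/*
 * Release a range using a generation supplied by the caller, so this
 * function never reads the global counter itself.
 */
static void
release_range(struct cached_range *r, unsigned int gen)
{
	assert(r != NULL);
	if (r->gen != gen) {
		r->advised++;
		r->gen = gen;
	}
}

/*
 * Modelled lowmem event: fetch-and-add returns the old value, so
 * old + 1 is the generation this event created.  That single value is
 * used for the release below.
 */
static unsigned int
lowmem_event(struct cached_range *r)
{
	unsigned int gen;

	gen = atomic_fetch_add(&gen_counter, 1) + 1;
	release_range(r, gen);
	return (gen);
}
```

Because release_range() only ever sees the value computed by the caller, a concurrent increment between two reads of the counter cannot produce an inconsistent generation within one call.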

This revision is now accepted and ready to land. Feb 14 2017, 4:54 AM

Use a consistent generation value when releasing a range to the cache.

This revision now requires review to proceed. Feb 14 2017, 6:33 AM

kib accepted this revision. Feb 14 2017, 6:42 AM

kib edited edge metadata.

This revision is now accepted and ready to land. Feb 14 2017, 6:42 AM

Closed by commit rS313756: Apply MADV_FREE to exec_map entries only after a lowmem event. (authored by markj). Feb 15 2017, 1:51 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path: sys/kern/kern_exec.c
Size: 53 lines

Diff 25131
