
Only apply MADV_FREE to exec args when the lowmem handler is called
Closed (Public)

Authored by markj on Feb 14 2017, 3:23 AM.

Details

Summary

With r311346 we apply MADV_FREE upon every execve. Despite the recent
changes to pmap_advise() and vm_object_madvise(), this still has a large
amount of overhead on many-CPU systems, primarily because
pmap_advise(MADV_FREE) clears dirty bits and thus requires a TLB
shootdown on x86. On an x1.32xlarge EC2 instance (128 vCPUs), the removal of this
overhead gives a ~7.5% reduction in wall time for a -j128 buildkernel, and
nearly a 50% reduction in system CPU time.

To avoid this overhead, use a lowmem handler to move exec args pages
close to the head of the inactive queue prior to an inactive queue scan.
A generation counter is added to track lowmem calls; when freeing exec
args, a pending generation count will cause MADV_FREE to be applied.
This ensures that all but 260KB*ncpu of exec args memory (roughly 32.5MB
with 128 CPUs) will be reclaimed, and the remaining pages will likely be
reclaimed upon a subsequent scan.
Given the overhead of applying MADV_FREE, this seems to be a better
tradeoff. (The EC2 instance type in question provides almost 2TB of
RAM.)
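
The generation-counter scheme described above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the code in the diff: the names `args_buf`, `lowmem_gen`, `model_lowmem()` and `model_free_args()` are hypothetical, and the MADV_FREE call is replaced by a counter so that the logic can be run outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one exec args buffer; not the kernel structure. */
struct args_buf {
	unsigned int gen;	/* generation at which the buffer was last advised */
	unsigned int advised;	/* how many times the stubbed MADV_FREE ran */
};

/* Hypothetical global generation, bumped once per lowmem event. */
static unsigned int lowmem_gen;

/* Stand-in for the lowmem handler: record that a scan is about to run. */
static void
model_lowmem(void)
{
	lowmem_gen++;
}

/*
 * Stand-in for freeing an exec args buffer.  MADV_FREE (here just a
 * counter) is applied only if a lowmem event occurred since the buffer
 * was last advised.  Returns true if the advice was applied.
 */
static bool
model_free_args(struct args_buf *buf)
{
	unsigned int gen;

	assert(buf != NULL);
	gen = lowmem_gen;	/* read the generation once */
	if (buf->gen == gen)
		return (false);	/* no pending lowmem event: skip the cost */
	buf->advised++;		/* the real code would apply MADV_FREE here */
	buf->gen = gen;
	return (true);
}
```

In this model the common execve path pays only for a counter comparison; the expensive step runs at most once per buffer per lowmem event.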

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 7431
Build 7594: arc lint + arc unit

Event Timeline

markj retitled this revision to Only apply MADV_FREE to exec args when the lowmem handler is called.
markj edited the test plan for this revision.
markj updated this object.
markj added a reviewer: kib.
kib edited edge metadata.

Overall I do not object.

sys/kern/kern_exec.c
1375

I suggest reading exec_args_gen into a local variable and assigning argkva->gen from that variable. I am not sure exactly what could go wrong if we miss an intermediate gen update (a missed madvise() call, perhaps?), but I also do not see why we should leave room for that question.

N.B. See another note below.

1405

Use an atomic increment there? It makes sense to use atomic_fetchadd() and pass the return value + 1 to the exec_free_args_kva() call below. Then exec_free_args_kva() would use that value instead of reading exec_args_gen twice.
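
A rough userspace sketch of this suggestion, using C11 atomics in place of the kernel's atomic(9) primitives. All names here (`gen_counter`, `kva_range`, `release_range()`, `free_range()`, `lowmem_event()`) are hypothetical stand-ins and the madvise step is stubbed out as a counter, so this only illustrates the pattern, not the actual change.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for exec_args_gen. */
static atomic_uint gen_counter;

/* Hypothetical stand-in for a cached exec args KVA range. */
struct kva_range {
	unsigned int gen;	/* generation last applied to this range */
	unsigned int advised;	/* count of stubbed MADV_FREE applications */
};

/*
 * Release a range using a generation value supplied by the caller, so
 * the comparison and the store below always see the same value.
 */
static void
release_range(struct kva_range *r, unsigned int gen)
{
	assert(r != NULL);
	if (r->gen != gen) {
		r->advised++;	/* the real code would apply MADV_FREE here */
		r->gen = gen;
	}
}

/* Normal free path: read the counter once and pass the value down. */
static void
free_range(struct kva_range *r)
{
	release_range(r, atomic_load(&gen_counter));
}

/*
 * Lowmem path: atomic_fetch_add() returns the old value, so old + 1 is
 * the new generation; reuse it for every cached range instead of
 * re-reading the counter.  Returns the new generation.
 */
static unsigned int
lowmem_event(struct kva_range *cached, int n)
{
	unsigned int gen;
	int i;

	gen = atomic_fetch_add(&gen_counter, 1) + 1;
	for (i = 0; i < n; i++)
		release_range(&cached[i], gen);
	return (gen);
}
```

Passing the generation as an argument means a concurrent increment cannot slip between the check and the assignment within one release.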

This revision is now accepted and ready to land. Feb 14 2017, 4:54 AM
markj edited edge metadata.
  • Use a consistent generation value when releasing a range to the cache.
This revision now requires review to proceed. Feb 14 2017, 6:33 AM
kib edited edge metadata.
This revision is now accepted and ready to land. Feb 14 2017, 6:42 AM
This revision was automatically updated to reflect the committed changes.