
Only apply MADV_FREE to exec args when the lowmem handler is called
Closed, Public

Authored by markj on Feb 14 2017, 3:23 AM.

Details

Summary

With r311346 we apply MADV_FREE upon every execve. Despite the recent
changes to pmap_advise() and vm_object_madvise(), this still has a large
amount of overhead on many-CPU systems, primarily because
pmap_advise(MADV_FREE) clears dirty bits and thus requires a TLB
shootdown on x86. On an x1.32xlarge EC2 instance (128 vCPUs), the removal of this
overhead gives a ~7.5% reduction in wall time for a -j128 buildkernel, and
nearly a 50% reduction in system CPU time.

To avoid this overhead, use a lowmem handler to move exec args pages
close to the head of the inactive queue prior to an inactive queue scan.
A generation counter is added to track lowmem calls; when freeing exec
args, a pending generation count will cause MADV_FREE to be applied.
This ensures that all but 260KB*ncpu worth of memory will be reclaimed,
and remaining pages will likely be reclaimed upon a subsequent scan.
Given the overhead of applying MADV_FREE, this seems to be a better
tradeoff. (The EC2 instance type in question provides almost 2TB of
RAM.)
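
The generation scheme described above can be modeled in user space. The sketch below is an illustration only, not the code from this revision: `args_buf`, `lowmem_gen`, `model_lowmem()` and `model_free()` are hypothetical names, and a counter stands in for the actual MADV_FREE call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model; these names do not come from the patch. */
struct args_buf {
	unsigned int gen;	/* generation at which this buffer was last advised */
	int advise_calls;	/* stands in for "MADV_FREE was applied" */
};

/* Bumped once per low-memory event. */
static unsigned int lowmem_gen;

/* Low-memory handler: record that a reclaim pass is pending. */
static void
model_lowmem(void)
{
	lowmem_gen++;
}

/*
 * Free path: pay for the advise only if a low-memory event has occurred
 * since this buffer was last advised; otherwise the free is cheap.
 */
static void
model_free(struct args_buf *b)
{
	assert(b != NULL);
	if (b->gen != lowmem_gen) {
		b->advise_calls++;	/* real code would apply MADV_FREE here */
		b->gen = lowmem_gen;
	}
}
```

In this model a buffer freed with no intervening low-memory event skips the advise entirely, and several events between two frees still cost only one advise. For scale, assuming ncpu equals the 128 vCPUs mentioned above, the 260KB*ncpu bound works out to 260 KB * 128 = 33,280 KB, or 32.5 MB.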

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

markj retitled this revision to "Only apply MADV_FREE to exec args when the lowmem handler is called".
markj edited the test plan for this revision.
markj updated this object.
markj added a reviewer: kib.
kib edited edge metadata.

Overall I do not object.

sys/kern/kern_exec.c
1375 (On Diff #25131)

I suggest reading exec_args_gen into a local variable and assigning argkva->gen from that variable. I am not sure exactly what could go wrong if we miss an intermediate gen update (a missed madvise() call, perhaps?), but I also do not see why we should allow that question to arise at all.

N.B. See another note below.
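
A user-space sketch of the snapshot pattern suggested here, using C11 atomics as a stand-in for the kernel's primitives. `kva_entry`, `shared_gen` and `release_with_snapshot()` are hypothetical names chosen for illustration; this is not the code under review.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-buffer record. */
struct kva_entry {
	unsigned int gen;	/* last generation this entry was synced to */
	int advise_calls;	/* stands in for the madvise() call */
};

/* Shared counter that another thread may bump at any time. */
static _Atomic unsigned int shared_gen;

/*
 * Read the shared counter once, then compare and store using that
 * snapshot, so both uses are guaranteed to see the same value.
 */
static unsigned int
release_with_snapshot(struct kva_entry *e)
{
	unsigned int gen = atomic_load(&shared_gen);	/* single read */

	assert(e != NULL);
	if (e->gen != gen) {
		e->advise_calls++;
		e->gen = gen;	/* the same value that was compared */
	}
	return (gen);
}
```

If the counter were instead read once for the comparison and again for the assignment, a concurrent bump between the two reads could leave the entry marked with a generation it was never advised for.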

1405 (On Diff #25131)

Use an atomic increment there? It makes sense to use atomic_fetchadd() and pass the return value + 1 to the exec_free_args_kva() call below. Then exec_free_args_kva() would use that value instead of reading exec_args_gen twice.
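
The fetch-and-add idea can be sketched in portable C11, where `atomic_fetch_add()` plays the role of the kernel primitive named in the comment. `gen_counter`, `bump_generation()`, `gen_entry` and `release_entry()` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical shared generation counter. */
static _Atomic unsigned int gen_counter;

/* Hypothetical per-buffer record. */
struct gen_entry {
	unsigned int gen;
	int advise_calls;	/* stands in for the madvise() call */
};

/*
 * Increment the counter and return the new value in one atomic step.
 * atomic_fetch_add() returns the value held before the addition, so
 * the post-increment generation is that value plus one.
 */
static unsigned int
bump_generation(void)
{
	return (atomic_fetch_add(&gen_counter, 1) + 1);
}

/* Release path takes the generation as an argument; it never re-reads it. */
static void
release_entry(struct gen_entry *e, unsigned int gen)
{
	assert(e != NULL);
	if (e->gen != gen) {
		e->advise_calls++;
		e->gen = gen;
	}
}
```

A caller would write `release_entry(&e, bump_generation())`, so the increment and the value handed to the release path come from the same atomic operation.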

This revision is now accepted and ready to land. Feb 14 2017, 4:54 AM
markj edited edge metadata.
  • Use a consistent generation value when releasing a range to the cache.
This revision now requires review to proceed. Feb 14 2017, 6:33 AM
kib edited edge metadata.
This revision is now accepted and ready to land. Feb 14 2017, 6:42 AM
This revision was automatically updated to reflect the committed changes.