eliminate global serialization points in swap reserve & mmap
Needs Review · Public

Authored by mmacy on Jun 23 2018, 5:12 AM.

Details

Reviewers: jeff, manu, glebius

Summary

For discussion only

  • Change swap reserve to be an unsigned long so that it can be manipulated atomically (a minimal sketch follows this list)
  • Move UMA zone fast-path fields onto a single cache line; the lock shares a cache line with infrequently accessed fields
  • pcpu quota handling for swap reserve and uidinfo ui_vmsize; the pcpu thresholds will need further consideration
  • pcpu refcnts
  • switch ucreds to use pcpu refcnts - for the purposes of the POC, new processes get their own ucred to simplify handling of kill
  • move pmap_remove to a preemptible epoch rather than the current global-mutex-based ad hoc EBR
  • back counter_u64 with domain-correct pages
  • zero counters synchronously at alloc
  • don't set pcpu->pc_domain to a non-zero value if NUMA is not defined
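
As context for the first and third bullets, a minimal sketch (not the code in this diff; the function name and refill policy are illustrative): once swap_reserved is an unsigned long it can be maintained with atomic(9) instead of under a global mutex, and a per-CPU cached slice can then be layered on top so the shared counter is only touched on refill.

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <machine/atomic.h>

    static u_long swap_total;       /* reservation limit, in pages */
    static u_long swap_reserved;    /* pages currently reserved */

    static bool
    swap_reserve_atomic(u_long incr)
    {
            u_long prev;

            prev = atomic_fetchadd_long(&swap_reserved, incr);
            if (prev + incr > swap_total) {
                    /* Over the limit: back the reservation out and fail. */
                    atomic_subtract_long(&swap_reserved, incr);
                    return (false);
            }
            /* Real code must also charge the uid (ui_vmsize) and honor
             * the overcommit policy; this shows only the atomic tally. */
            return (true);
    }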

4k brk calls at 48 processes: 1.8M -> 106M (Linux is 103M)
128M mmap/munmap at 48 processes: 1M -> 7M (Linux is 1.8M)

Diff Detail

Repository: rS FreeBSD src repository - subversion
Lint: Skipped
Unit Tests: Skipped
Build Status: Buildable 17580

Event Timeline

Some great stuff in here. Let's peel off parts while we perfect the rest.

sys/amd64/amd64/pmap.c
494–495

Have you shown this part to kib? Was there any objection?

sys/kern/subr_counter.c
54

We should run this by gleb and commit it alone.

sys/kern/subr_pcpu_refcount.c
111

You may know this, but it has to go through the counter API to work on arches that need a critical section or the like.
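
For context, this is roughly the pattern being asked for; it is illustrative only (the names are hypothetical, not code from this diff). counter(9) hides the per-architecture details, so a per-CPU increment stays correct on platforms that must fall back to a critical section rather than a single uninterruptible instruction.

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/counter.h>
    #include <sys/malloc.h>         /* M_WAITOK */

    static counter_u64_t pcpu_refs; /* hypothetical per-CPU refcount */

    static void
    pcpu_refs_init(void)
    {
            pcpu_refs = counter_u64_alloc(M_WAITOK);
    }

    static void
    pcpu_refs_acquire(void)
    {
            /* Per-CPU add; counter(9) picks the right mechanism per arch. */
            counter_u64_add(pcpu_refs, 1);
    }

    static uint64_t
    pcpu_refs_count(void)
    {
            /* Sums every CPU's slot; may race with concurrent adds. */
            return (counter_u64_fetch(pcpu_refs));
    }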

sys/sys/pcpu.h
211–212

Maybe define it here and have uma inherit it.

sys/vm/swap_pager.c
401–402

You only touch cred once. Don't need the local.

sys/vm/uma.h
284

UMA_PCPU_ZONE_SIZE

sys/vm/uma_core.c
1193

This could be plain vm_page_alloc() and the kernel will pick the domain for you.

1219–1220

These probably require vm_page_lock(p)/vm_page_unlock(p).

1333–1334

page lock.

Otherwise this section looks good. We should put it in its own review and show it to gleb for commit.

1374–1375

UMA_ZONE_PCPU_ZONE_SIZE?

sys/x86/acpica/srat.c
520

What's this for? Avoid assigning a domain if !NUMA?

Does pc_domain default to 0?

mmacy added inline comments.
sys/amd64/amd64/pmap.c
494–495

Not one that could be well articulated. He had concerns about the safety of what is now epoch(9).

sys/kern/subr_counter.c
54

added @glebius to take a look.

sys/kern/subr_pcpu_refcount.c
111

epoch_enter() calls critical_enter() -- it's not possible to preempt.
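
To spell out the point (an illustrative pattern, not this diff's code): a critical section prevents both preemption and migration, so a plain, non-atomic update of the current CPU's slot is safe; the cost is that a reader summing the slots can observe a slightly stale total.

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/pcpu.h>           /* curcpu */

    /*
     * One slot per CPU; real code keeps each slot in per-CPU storage
     * (e.g. the UMA pcpu zone discussed elsewhere in this review) so
     * CPUs do not share cache lines.
     */
    static void
    pcpu_slot_add(uint64_t *slots, uint64_t v)
    {
            critical_enter();
            slots[curcpu] += v;     /* only this CPU ever writes this slot */
            critical_exit();
    }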

sys/sys/pcpu.h
211–212

Ok. That may be better.

sys/vm/uma_core.c
1193

It's already in D15933, which you are on.

sys/x86/acpica/srat.c
520

We discussed this. Yes, pc_domain defaults to 0.
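
For readers following along, the guard under discussion presumably has roughly this shape (a hypothetical helper and parameter, not the actual srat.c hunk):

    #include <sys/param.h>
    #include <sys/pcpu.h>

    /* Only override the default domain of 0 when NUMA is built in. */
    static void
    cpu_set_domain(struct pcpu *pc, int domain)
    {
    #ifdef NUMA
            pc->pc_domain = domain;
    #else
            (void)pc;
            (void)domain;           /* pc_domain stays at its default of 0 */
    #endif
    }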

sys/amd64/amd64/pmap.c
494–495

@jeff see discussion in D15231

  • change vm_offset_t to unsigned long to fix the build on arm
  • use ck functions instead of atomic(9) to build on mips/sparc64/riscv (see the sketch below this list)
  • GC #if 0 code in pmap
  • updated the pmap_delayed_invl_wait comment
  • come up with a more palatable solution to the arm vm_offset_t atomics problem
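
A rough illustration of the ck substitution mentioned above (a sketch of the general pattern only, not the updated diff; 64-bit ck_pr operations are themselves only present where the platform advertises them via ck's CK_F_PR_* feature macros):

    #include <sys/types.h>
    #include <ck_pr.h>

    /*
     * Add to a 64-bit tally with a compare-and-swap loop when the desired
     * atomic(9) variant is not implemented on a given platform.
     */
    static void
    tally_add(uint64_t *p, uint64_t v)
    {
            uint64_t old;

            do {
                    old = ck_pr_load_64(p);
            } while (ck_pr_cas_64(p, old, old + v) == false);
    }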