Details

Reviewers

alc
markj

Commits

rGee511f83b386: vm_reserv: use default pool for free page removal.
rG0078df5f0258: vm_phys: reduce touching of page->pool fields
rGc669b08bd834: vm_phys: reduce touching of page->pool fields

Summary

Change the usage of the pool field in vm_page structs.

Currently, every page belongs to a pool, and the pool field identifies that pool, whether the page is allocated or free.

With this change, the pool field of the first page of a free block is used by the buddy allocator to identify its pool, but the buddy allocator makes no guarantees about the pool field value for allocated pages. The buddy allocator requires that a pool parameter be passed as part of freeing memory. A function that allocates memory may use the pool field of a page to record what pool to pass as that parameter when the memory is freed, but might not need to do so for every allocated page.

Test Plan

Running buildworlds.

Diff Detail

Lint

Lint Skipped

Unit

Tests Skipped

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

In response to offline feedback, reduce needless changes to the pool field, where it is changed from A to B and then back to A.

Break a too-long line.

Add a KASSERT.

In vm_phys_free_pages, I'd be happy to make a change that might not be strictly in the spirit of this change.

I note that m is not dereferenced in the inner do-while loop. It is only computed. So, if it weren't computed in the loop, then after the loop you could have

if (order < VM_NFREEORDER - 1)

pa ^= ((vm_paddr_t)1 << (PAGE_SHIFT + order));

m = &seg->first_page[...];

Then you could replace m_buddy with m.

Avoid an unneeded write to pool in phys_free_pages.

Drop an unneeded assignment to ->pool in _unfree().

alc mentioned this in D40403: vm_page: Implement lazy page initialization.Jun 2 2024, 11:56 PM

Resolve conflicts that arise when ilog2 was introduced to vm_phys.c.

Update after FREEPOOL_LAZYINIT changes.

Remove a blank line.

Update to resolve a vm_reserv.c conflict.

The overall timing results of running a dozen or so make-worlds without (base) and with (pool) this change:

x base.time
+ pool.time
+------------------------------------------------------------------------------+
|+x ++  x+   +    +   x+   +* xx   + xx x  x    x  ++       +                 x|
|    |__________|______M_A_________A_M_______|_______|                         |
+------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  13       3794.47       3828.59       3810.03     3809.0946     8.4516533
+  13       3794.02       3820.52       3804.11     3804.8223     8.8034663
No difference proven at 95.0% confidence

Remove a needless assignment to m->pool

Reran 13 buildworlds after one-line change; here are the results, compared to the same 13 buildworlds on unmodified kernel:

x base.time
+ pool.time
+------------------------------------------------------------------------------+
|                             +                                                |
|                             +            x        +                          |
|x     x             x++  +x +*x  + x x x  x  + x + +      +                  x|
|              |_________|____M___A_MA____________|__|                         |
+------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  13       3794.47       3828.59       3810.03     3809.0946     8.4516533
+  13       3803.98       3820.27       3807.45     3810.5046      5.579744
No difference proven at 95.0% confidence

Let vm_phys_alloc_npages clear all pool values in allocated memory, with the function vm_phys_free_npages supplying the pool as a parameter.

dougm updated this revision to Diff 143046.Sep 6 2024, 5:49 PM

dougm edited the summary of this revision. (Show Details)

Updated stats comparing base buildworld times with times after applying this patch:

x base.time
+ pool3.time
+------------------------------------------------------------------------------+
|                                 +                                            |
|                                 +                                            |
|+ +     x  x  x + x+ x+   x   x  +    * + x    *x+   x +                     x|
|             ||_______________A__M_____________|_____|                        |
+------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  13       3799.46       3826.48       3807.97       3809.38     7.7810464
+  13       3796.29       3817.71       3809.23       3807.95     6.7123679
No difference proven at 95.0% confidence

dougm updated this revision to Diff 143232.Sep 11 2024, 7:02 AM

dougm edited the summary of this revision. (Show Details)

dougm edited the test plan for this revision. (Show Details)

base - unmodified, pool-with these changes

x base
+ pool
+------------------------------------------------------------------------------+
|                                          +         x                         |
|+            + +x *+        x  x  +*   x  +x    +*  x x x    +     +         x|
|              |__________|________AM______AM__________|____|                  |
+------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  13       3800.14       3820.84       3809.25     3809.0408     5.7418775
+  13       3794.66       3817.48       3806.45     3806.2346     6.7908745
No difference proven at 95.0% confidence

No changes, just an update to account for some lines that just got renumbered.

This is much better.

Refresh.

Rebase. Resolve a conflict.

Fix unused variable problem.

I don't see any particular problems with the patch.

sys/vm/vm_kern.c
956	We could hard-code VM_FREEPOOL_DEFAULT here, like we do in vm_page_startup() ("Initialize pages not covered by phys_avail[]..."). I would suggest keeping the code as it is, but it's something to keep in mind if we want to further reduce use of the `pool` field.
sys/vm/vm_phys.c
808–809
sys/vm/vm_reserv.c
483	I believe we can assume here that the pool is VM_FREEPOOL_DEFAULT; note that the pool is hard-coded in the vm_phys_alloc_pages() call elsewhere in this file. In particular, pages allocated from reservations always belong to a VM object.
957–964	This can be VM_FREEPOOL_DIRECT.

This revision is now accepted and ready to land.Jan 20 2025, 6:36 PM

dougm marked 3 inline comments as done.Jan 21 2025, 9:17 PM

Closed by commit rGc669b08bd834: vm_phys: reduce touching of page->pool fields (authored by dougm). · Explain WhyJan 21 2025, 10:46 PM

This revision was automatically updated to reflect the committed changes.

dougm added a commit: rGc669b08bd834: vm_phys: reduce touching of page->pool fields.

Herald added a subscriber: imp. · View Herald TranscriptJan 21 2025, 10:46 PM

dougm added a reverting change: rG18c47eab727c: Revert "vm_phys: reduce touching of page->pool fields". Pho reports, and I have.Jan 23 2025, 5:00 PM

Crash report from pho:

20250123 02:33:01 all (549/951): mmap41.sh
panic: Bad tailq NEXT(0xffffffff81ce3438->tqh_last) != NULL
cpuid = 0
time = 1737596106
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe01821523b0
vpanic() at vpanic+0x136/frame 0xfffffe01821524e0
panic() at panic+0x43/frame 0xfffffe0182152540
vm_phys_free_pages() at vm_phys_free_pages+0x31c/frame 0xfffffe01821525a0
vm_page_zone_release() at vm_page_zone_release+0x70/frame 0xfffffe01821525f0
bucket_drain() at bucket_drain+0x11c/frame 0xfffffe0182152630
zone_put_bucket() at zone_put_bucket+0x1c0/frame 0xfffffe0182152680
cache_free() at cache_free+0x2b4/frame 0xfffffe01821526d0
uma_zfree_arg() at uma_zfree_arg+0x207/frame 0xfffffe0182152720
pctrie_reclaim_resume_cb() at pctrie_reclaim_resume_cb+0x9c/frame 0xfffffe0182152770
vm_object_terminate() at vm_object_terminate+0x132/frame 0xfffffe01821527b0
vm_object_deallocate() at vm_object_deallocate+0x2bb/frame 0xfffffe01821527e0
vm_map_process_deferred() at vm_map_process_deferred+0xa9/frame 0xfffffe0182152800
vm_map_remove() at vm_map_remove+0xcb/frame 0xfffffe0182152830
vmspace_exit() at vmspace_exit+0xa8/frame 0xfffffe0182152860
exit1() at exit1+0x533/frame 0xfffffe01821528d0
sigexit() at sigexit+0x19f/frame 0xfffffe0182152d50
postsig() at postsig+0x1a5/frame 0xfffffe0182152e20
ast_sig() at ast_sig+0x32b/frame 0xfffffe0182152ed0
ast_handler() at ast_handler+0xe8/frame 0xfffffe0182152f10
ast() at ast+0x20/frame 0xfffffe0182152f30
doreti_ast() at doreti_ast+0x1c/frame 0x309c5c2596c0

https://people.freebsd.org/~pho/stress/log/log0569.txt

dougm added inline comments.Jan 26 2025, 9:13 AM

sys/vm/vm_reserv.c
957–964	Taking this suggestion seems to have produced crashes on the mmap41 stress test.

Reposting the last version before review and commit. It passes the mmap41.sh stress test, but @alc expresses skepticism about what's happening in vm_reserv_break.

Clear out some obfuscation from vm_reserv_break().

I ran tests. for a day with D45409.149981.patch. LGTM.

This revision was not accepted when it landed; it landed in state Needs Review.Jan 29 2025, 9:14 AM

Closed by commit rG0078df5f0258: vm_phys: reduce touching of page->pool fields (authored by dougm). · Explain Why

This revision was automatically updated to reflect the committed changes.

dougm added a commit: rG0078df5f0258: vm_phys: reduce touching of page->pool fields.

FYI: I came across this i386 problem after commit 0078df5f0258.

Trying to mount root from ufs:/dev/vtbd0p1 [rw]...
WARNING: WITNESS option enabled, expect reduced performance.
WARNING: DIAGNOSTIC option enabled, expect reduced performance.
WARNING: 32-bit kernels are deprecated and may be removed in FreeBSD 15.0.
panic: vm_page_t 0x5d49be0 phys_addr mismatch ffffffffb8779000 00000000b8779405
cpuid = 3
time = 1738177579
KDB: stack backtrace:
db_trace_self_wrapper(7,114beb40,5d49be0,ffffffff,ffffffff,...) at db_trace_self_wrapper+0x28/frame 0x113f6744
vpanic(148badd,113f6780,113f6780,113f67f0,13a54e9,...) at vpanic+0xf4/frame 0x113f6760
panic(148badd,5d49be0,b8779000,ffffffff,b8779405,...) at panic+0x14/frame 0x113f6774
pmap_pae_remove_pages(114128e8) at pmap_pae_remove_pages+0x599/frame 0x113f67f0
exec_new_vmspace(113f6994,188247c) at exec_new_vmspace+0x1cb/frame 0x113f6810
exec_elf32_imgact(113f6994,16d4801,113f6984,0,0,...) at exec_elf32_imgact+0x5e4/frame 0x113f6874
kern_execve(114beb40,113f6a8c,0,1141284c) at kern_execve+0x72c/frame 0x113f6a74
sys_execve(114beb40,114bedf0,114beb40,13ae913,114bede4,...) at sys_execve+0x4e/frame 0x113f6ac8
syscall(113f6ba8,3b,3b,3b,402aaf,...) at syscall+0x1e6/frame 0x113f6b9c
Xint0x80_syscall() at 0xffc03479/frame 0x113f6b9c
--- syscall (59, FreeBSD ELF32, execve), eip = 0x47e53f, esp = 0xffbfe848, ebp = 0xffbfe858 ---
KDB: enter: panic
[ thread pid 17 tid 100080 ]
Stopped at      kdb_enter+0x34: movl    $0,kdb_why
db> x/s version
version:        FreeBSD 15.0-CURRENT #0 main-n275068-0078df5f0258-dirty: Wed Jan 29 18:44:45 CET 2025\012    pho@mercat1.netperf.freebsd.org:/mnt25/obj/usr/src/i386.i386/sys/PHO\012
db>

In D45409#1111300, @pho wrote:

FYI: I came across this i386 problem after commit 0078df5f0258.

Trying to mount root from ufs:/dev/vtbd0p1 [rw]...
WARNING: WITNESS option enabled, expect reduced performance.
WARNING: DIAGNOSTIC option enabled, expect reduced performance.
WARNING: 32-bit kernels are deprecated and may be removed in FreeBSD 15.0.
panic: vm_page_t 0x5d49be0 phys_addr mismatch ffffffffb8779000 00000000b8779405
cpuid = 3
time = 1738177579
KDB: stack backtrace:
db_trace_self_wrapper(7,114beb40,5d49be0,ffffffff,ffffffff,...) at db_trace_self_wrapper+0x28/frame 0x113f6744
vpanic(148badd,113f6780,113f6780,113f67f0,13a54e9,...) at vpanic+0xf4/frame 0x113f6760
panic(148badd,5d49be0,b8779000,ffffffff,b8779405,...) at panic+0x14/frame 0x113f6774
pmap_pae_remove_pages(114128e8) at pmap_pae_remove_pages+0x599/frame 0x113f67f0
exec_new_vmspace(113f6994,188247c) at exec_new_vmspace+0x1cb/frame 0x113f6810
exec_elf32_imgact(113f6994,16d4801,113f6984,0,0,...) at exec_elf32_imgact+0x5e4/frame 0x113f6874
kern_execve(114beb40,113f6a8c,0,1141284c) at kern_execve+0x72c/frame 0x113f6a74
sys_execve(114beb40,114bedf0,114beb40,13ae913,114bede4,...) at sys_execve+0x4e/frame 0x113f6ac8
syscall(113f6ba8,3b,3b,3b,402aaf,...) at syscall+0x1e6/frame 0x113f6b9c
Xint0x80_syscall() at 0xffc03479/frame 0x113f6b9c
--- syscall (59, FreeBSD ELF32, execve), eip = 0x47e53f, esp = 0xffbfe848, ebp = 0xffbfe858 ---
KDB: enter: panic
[ thread pid 17 tid 100080 ]
Stopped at      kdb_enter+0x34: movl    $0,kdb_why
db> x/s version
version:        FreeBSD 15.0-CURRENT #0 main-n275068-0078df5f0258-dirty: Wed Jan 29 18:44:45 CET 2025\012    pho@mercat1.netperf.freebsd.org:/mnt25/obj/usr/src/i386.i386/sys/PHO\012
db>

Yeah I have almost exactly the same:

Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex SYSMAPS (SYSMAPS) r = 0 (0x1af1b3c) locked @ /usr/src/sys/i386/i386/pmap.c:4597
stack backtrace:
#0 0xfd9123 at ??+0
#1 0xfda008 at ??+0
#2 0x13a5ca7 at ??+0
#3 0x13a5276 at ??+0
#4 0x1382f3f at ??+0
#5 0x28 at ??+0


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x1f01000
fault code              = supervisor write data, reserved bits in PTE
instruction pointer     = 0x20:0x13ab070
stack pointer           = 0x28:0x1ef9ab4
frame pointer           = 0x28:0x1ef9ac0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 0 ()
trap number             = 12
panic: page fault
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper(0,1a4d3e0,1ef9a74,1f01000,c,...) at db_trace_self_wrapper+0x28/frame 0x1ef9900
vpanic(1442402,1ef993c,1ef993c,1ef9968,13a5c44,...) at vpanic+0xf4/frame 0x1ef991c
panic(1442402,149cbab,0,fffff,1aac59b,...) at panic+0x14/frame 0x1ef9930
trap_fatal(8,1470e0e,5e0,1aa05e0,46,...) at trap_fatal+0x344/frame 0x1ef9968
trap_pfault(1f01000,0,0) at trap_pfault+0x6f/frame 0x1ef999c
trap(1ef9a74) at trap+0x326/frame 0x1ef9a68
calltrap() at calltrap+0x8/frame 0x1ef9a68
--- trap 0xc, eip = 0x13ab070, esp = 0x1ef9ab4, ebp = 0x1ef9ac0 ---
memset(1f01000,0,1000) at memset+0x50/frame 0x1ef9ac0
pmap_pae_zero_page(5d74994) at pmap_pae_zero_page+0xb2/frame 0x1ef9aec
vm_page_alloc_noobj_contig_domain(0,8062,1,0,0,ffffffff,ffffffff,1,0,0,6) at vm_page_alloc_noobj_contig_domain+0x20a/frame 0x1ef9b1c
uma_startup1(5f5e000) at uma_startup1+0xc5/frame 0x1ef9bcc
vm_mem_init(0) at vm_mem_init+0x30/frame 0x1ef9bd8
mi_startup() at mi_startup+0x1a4/frame 0x1ef9bf8
btext() at btext+0x5f
KDB: enter: panic

In D45409#1112416, @dim wrote:

In D45409#1111300, @pho wrote:

FYI: I came across this i386 problem after commit 0078df5f0258.

Trying to mount root from ufs:/dev/vtbd0p1 [rw]...
WARNING: WITNESS option enabled, expect reduced performance.
WARNING: DIAGNOSTIC option enabled, expect reduced performance.
WARNING: 32-bit kernels are deprecated and may be removed in FreeBSD 15.0.
panic: vm_page_t 0x5d49be0 phys_addr mismatch ffffffffb8779000 00000000b8779405
cpuid = 3
time = 1738177579
KDB: stack backtrace:
db_trace_self_wrapper(7,114beb40,5d49be0,ffffffff,ffffffff,...) at db_trace_self_wrapper+0x28/frame 0x113f6744
vpanic(148badd,113f6780,113f6780,113f67f0,13a54e9,...) at vpanic+0xf4/frame 0x113f6760
panic(148badd,5d49be0,b8779000,ffffffff,b8779405,...) at panic+0x14/frame 0x113f6774
pmap_pae_remove_pages(114128e8) at pmap_pae_remove_pages+0x599/frame 0x113f67f0
exec_new_vmspace(113f6994,188247c) at exec_new_vmspace+0x1cb/frame 0x113f6810
exec_elf32_imgact(113f6994,16d4801,113f6984,0,0,...) at exec_elf32_imgact+0x5e4/frame 0x113f6874
kern_execve(114beb40,113f6a8c,0,1141284c) at kern_execve+0x72c/frame 0x113f6a74
sys_execve(114beb40,114bedf0,114beb40,13ae913,114bede4,...) at sys_execve+0x4e/frame 0x113f6ac8
syscall(113f6ba8,3b,3b,3b,402aaf,...) at syscall+0x1e6/frame 0x113f6b9c
Xint0x80_syscall() at 0xffc03479/frame 0x113f6b9c
--- syscall (59, FreeBSD ELF32, execve), eip = 0x47e53f, esp = 0xffbfe848, ebp = 0xffbfe858 ---
KDB: enter: panic
[ thread pid 17 tid 100080 ]
Stopped at      kdb_enter+0x34: movl    $0,kdb_why
db> x/s version
version:        FreeBSD 15.0-CURRENT #0 main-n275068-0078df5f0258-dirty: Wed Jan 29 18:44:45 CET 2025\012    pho@mercat1.netperf.freebsd.org:/mnt25/obj/usr/src/i386.i386/sys/PHO\012
db>

Yeah I have almost exactly the same:

Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex SYSMAPS (SYSMAPS) r = 0 (0x1af1b3c) locked @ /usr/src/sys/i386/i386/pmap.c:4597
stack backtrace:
#0 0xfd9123 at ??+0
#1 0xfda008 at ??+0
#2 0x13a5ca7 at ??+0
#3 0x13a5276 at ??+0
#4 0x1382f3f at ??+0
#5 0x28 at ??+0


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x1f01000
fault code              = supervisor write data, reserved bits in PTE
instruction pointer     = 0x20:0x13ab070
stack pointer           = 0x28:0x1ef9ab4
frame pointer           = 0x28:0x1ef9ac0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 0 ()
trap number             = 12
panic: page fault
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper(0,1a4d3e0,1ef9a74,1f01000,c,...) at db_trace_self_wrapper+0x28/frame 0x1ef9900
vpanic(1442402,1ef993c,1ef993c,1ef9968,13a5c44,...) at vpanic+0xf4/frame 0x1ef991c
panic(1442402,149cbab,0,fffff,1aac59b,...) at panic+0x14/frame 0x1ef9930
trap_fatal(8,1470e0e,5e0,1aa05e0,46,...) at trap_fatal+0x344/frame 0x1ef9968
trap_pfault(1f01000,0,0) at trap_pfault+0x6f/frame 0x1ef999c
trap(1ef9a74) at trap+0x326/frame 0x1ef9a68
calltrap() at calltrap+0x8/frame 0x1ef9a68
--- trap 0xc, eip = 0x13ab070, esp = 0x1ef9ab4, ebp = 0x1ef9ac0 ---
memset(1f01000,0,1000) at memset+0x50/frame 0x1ef9ac0
pmap_pae_zero_page(5d74994) at pmap_pae_zero_page+0xb2/frame 0x1ef9aec
vm_page_alloc_noobj_contig_domain(0,8062,1,0,0,ffffffff,ffffffff,1,0,0,6) at vm_page_alloc_noobj_contig_domain+0x20a/frame 0x1ef9b1c
uma_startup1(5f5e000) at uma_startup1+0xc5/frame 0x1ef9bcc
vm_mem_init(0) at vm_mem_init+0x30/frame 0x1ef9bd8
mi_startup() at mi_startup+0x1a4/frame 0x1ef9bf8
btext() at btext+0x5f
KDB: enter: panic

This should be fixed by 4015ff43cbbe

In D45409#1112419, @jrtc27 wrote:

This should be fixed by 4015ff43cbbe

Confirmed, thanks!

I now see some panics on shutdown in CI, e.g. https://cirrus-ci.com/task/5507293774610432

Shutdown NOW!
shutdown: [pid 20]
2025-02-03T20:59:41.136495+00:00 - shutdown 20 - - power-down by root: 
panic: vm_phys_free_pages: unexpected pool param 0
cpuid = 0
time = 1738616381
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0043aa79c0
vpanic() at vpanic+0x136/frame 0xfffffe0043aa7af0
panic() at panic+0x43/frame 0xfffffe0043aa7b50
vm_phys_free_pages() at vm_phys_free_pages+0x2e8/frame 0xfffffe0043aa7bb0
vm_reserv_free_page() at vm_reserv_free_page+0x2aa/frame 0xfffffe0043aa7be0
vm_page_free_prep() at vm_page_free_prep+0x158/frame 0xfffffe0043aa7c10
vm_page_free_toq() at vm_page_free_toq+0x12/frame 0xfffffe0043aa7c40
pctrie_reclaim_resume_cb() at pctrie_reclaim_resume_cb+0x9c/frame 0xfffffe0043aa7c90
vm_object_terminate() at vm_object_terminate+0x132/frame 0xfffffe0043aa7cd0
vm_object_deallocate() at vm_object_deallocate+0x2bb/frame 0xfffffe0043aa7d00
vm_map_process_deferred() at vm_map_process_deferred+0xa9/frame 0xfffffe0043aa7d20
vm_map_remove() at vm_map_remove+0xa6/frame 0xfffffe0043aa7d50
vmspace_exit() at vmspace_exit+0xa8/frame 0xfffffe0043aa7d80
exit1() at exit1+0x533/frame 0xfffffe0043aa7df0
sys_exit() at sys_exit+0xd/frame 0xfffffe0043aa7e00
amd64_syscall() at amd64_syscall+0x15a/frame 0xfffffe0043aa7f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0043aa7f30
--- syscall (1, FreeBSD ELF64, exit), rip = 0xec08aabc83a, rsp = 0xec0896cce08, rbp = 0xec0896cce20 ---
KDB: enter: panic
[ thread pid 20 tid 100064 ]
Stopped at      kdb_enter+0x33: movq    $0,0x1053622(%rip)
db>

(with D48823 applied, previously the test did not check for successful shutdown)

I reproduced the panic on main and submitted PR 284539

markj added inline comments.Feb 3 2025, 9:32 PM

sys/vm/vm_phys.c
813	I suspect that removing this assignment is incorrect - we want to ensure that any page allocated to a consumer has `m->pool != VM_FREEPOOL_LAZYINIT`.

jrtc27 added inline comments.Feb 3 2025, 9:36 PM

sys/vm/vm_phys.c
813	Isn't the whole point behind this change removing this assignment? It's moved from here to the allocation path for the first page. What that means for allocating a range, then freeing the first page before others, I don't know, though.

dougm added inline comments.Feb 3 2025, 9:39 PM

sys/vm/vm_phys.c
813	Some version of this patch did not remove this assignment, but changed it to "m->pool == VM_NFREEPOOL". But that fell out somewhere.

dougm added inline comments.Feb 3 2025, 9:45 PM

sys/vm/vm_phys.c
813	Ignore that comment. It implies that I agree that an assignment to m->pool needs to happen here, and I'm not currently persuaded of that.

dougm added inline comments.Feb 3 2025, 9:56 PM

sys/vm/vm_reserv.c
483	This comment is one that I applied, along with others, and when one of the others proved problematic, I discarded them all. But this one I need to apply again.

dougm added a commit: rGee511f83b386: vm_reserv: use default pool for free page removal..Feb 3 2025, 9:58 PM

jrtc27 added inline comments.Feb 3 2025, 10:00 PM

sys/vm/vm_reserv.c
483	Ok, but if that's the case, shouldn't rv->pages->pool be consistent with that? Does the failed assertion not point at a problem that changing this parameter to a constant will just hide?

markj added inline comments.Feb 3 2025, 10:00 PM

sys/vm/vm_phys.c
813	Sorry, ignore my previous comment.
sys/vm/vm_reserv.c
483	Yes, I just came to the same conclusion. Alternately, setting `m->pool = VM_FREEPOOL_DEFAULT` in vm_reserv_alloc_page() would fix the problem as well I believe. The vm_page_alloc_* function which triggered allocation of the reservation will set `m->pool` for some page in the reservation, not necessarily the first one. I prefer to simply hard-code VM_FREEPOOL_DEFAULT here, though.

markj added inline comments.Feb 3 2025, 10:04 PM

sys/vm/vm_reserv.c
483	No, because rv->pages is just the first page in the reservation, and was possibly never allocated to a consumer, in which case its pool field won't be initialized.

The description of this patch includes:

So, the value of the pool field for allocated memory is undefined an irrelevant. Now, way back when I started this, I was advised to consider a future when there were more than two pools, and the pool fields of the first reservation page would tell me what pool to free into, and so some code that reflected that notion got written, and it wasn't purged effectively enough then the decision was made to just stick with the two pools and let some future person deal with a flood of new pools.

From my inexperienced eye, the problem looks like when we're transferring pages from another pool to the requested pool, each of the m_buddy in vm_phys_split_pages will have its pool updated, but nothing updates the pool of the page being allocated itself. vm_phys_set_pool previously did that, along with various other updates, but now that it's vm_phys_finish_init it does not.

In D45409#1113549, @dougm wrote:

The description of this patch includes:

With this change, the pool field of the first page of a free block is used by the buddy allocator to identify its pool, but the buddy allocator makes no guarantees about the pool field value for allocated pages.

So, the value of the pool field for allocated memory is undefined an irrelevant. Now, way back when I started this, I was advised to consider a future when there were more than two pools, and the pool fields of the first reservation page would tell me what pool to free into, and so some code that reflected that notion got written, and it wasn't purged effectively enough then the decision was made to just stick with the two pools and let some future person deal with a flood of new pools.

Oh, I see, I misread it as being that the first page is always well-defined, not that it's only well-defined for the first page in a set of *free* pages.

vm_phys: reduce touching of page->pool fields
ClosedPublic
Actions

Details

Diff Detail

Event Timeline

Revision Contents
Changeset List

Diff 143046

sys/vm/vm_kern.c

sys/vm/vm_page.c

sys/vm/vm_pagequeue.h

sys/vm/vm_phys.h

sys/vm/vm_phys.c

sys/vm/vm_reserv.c

vm_phys: reduce touching of page->pool fieldsClosedPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 143046

sys/vm/vm_kern.c

sys/vm/vm_page.c

sys/vm/vm_pagequeue.h

sys/vm/vm_phys.h

sys/vm/vm_phys.c

sys/vm/vm_reserv.c

vm_phys: reduce touching of page->pool fields
ClosedPublic
Actions

Revision Contents
Changeset List