In principle, this is fine, but in practice, we mostly call vm_page_insert_after(), so the impact will be limited. That said, I'm still excited by where I think this is headed.
Jun 5 2024
For some reason, this revision did not automatically close upon commit.
Jun 4 2024
Please coordinate this with @markj given his lazy init change.
Jun 3 2024
@jhibbits I would encourage you to apply https://reviews.freebsd.org/D40478 and a few other superpage-related follow-on commits from amd64 to mmu_radix.c. This change would then apply to mmu_radix.c.
Jun 2 2024
In D40403#1036869, @markj wrote:
In D45431#1036870, @jhibbits wrote:
From a (very) quick check, it looks like the same change in amd64 should be made to mmu_radix for powerpc.
Set VM_PROT_NO_PROMOTE in vm_fault_prefault().
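A minimal sketch of what this looks like, assuming vm_fault_prefault()'s existing pmap_enter_quick() call; this is illustrative, not the committed diff:

```c
/*
 * Prefault mappings are speculative, so ask the pmap not to attempt
 * superpage promotion on their account.  The pmap strips
 * VM_PROT_NO_PROMOTE before using the protection bits.
 */
pmap_enter_quick(pmap, addr, m, entry->protection | VM_PROT_NO_PROMOTE);
```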
Jun 1 2024
Revise comment.
May 31 2024
May 24 2024
May 23 2024
May 22 2024
@markj Do you have any other comments on this change?
May 20 2024
May 19 2024
May 17 2024
In D45224#1031950, @gallatin wrote:
In D45224#1031599, @alc wrote:
I tried this on my Ampere Altra test box. Our (Netflix) tree is roughly 1 month old (based at 63b0165cdcbb178df63ac57dcd39c29cf77f346e).
I cherry-picked a803837cec6e17e04849d59afac7b6431c70cb93, c1ebd76c3f283b10afe6b64f29fe68c4d1be3f8c, and b5a1f0406b9d6bba28e57377dcfc8b83bce987ad, and then applied the patch. It's possible that I missed something critical elsewhere; I just looked at the history of sys/arm64/arm64/pmap.c and cherry-picked what seemed to have come in after our latest upstream sync.
In the new kernel, I see:
# sysctl vm.pmap
vm.pmap.l3c.promotions: 4284
vm.pmap.l3c.p_failures: 156
vm.pmap.l3c.mappings: 348
vm.pmap.l3c.demotions: 795
vm.pmap.l2.promotions: 276
vm.pmap.l2.p_failures: 0
vm.pmap.l2.mappings: 0
vm.pmap.l2.demotions: 120
vm.pmap.l2c.demotions: 0
vm.pmap.l1.demotions: 0
vm.pmap.superpages_enabled: 1
vm.pmap.vmid.epoch: 0
vm.pmap.vmid.next: 2
vm.pmap.vmid.bits: 16
vm.pmap.asid.epoch: 0
vm.pmap.asid.next: 12760
vm.pmap.asid.bits: 16
# sysctl vm.pmap.kernel_maps | tail -15
0xffff001d14fb8000-0xffff001d17b0c000 rw--sg WB 0 0 0 21 85
0xffff007fffe24000-0xffff007fffffc000 rw--sg WT 0 0 0 0 118
0xffff007fffffc000-0xffff008000000000 rw--sg DEV 0 0 0 0 1
Direct map:
0xffffa00088300000-0xffffa00088400000 rw--sg WB 0 0 0 0 64
0xffffa00090000000-0xffffa000ebf28000 rw--sg WB 0 0 45 15 74
0xffffa000ec308000-0xffffa000ec30c000 rw--sg WB 0 0 0 0 1
0xffffa000ec310000-0xffffa000ec530000 rw--sg WB 0 0 0 0 136
0xffffa000ec540000-0xffffa000ee550000 rw--sg WB 0 0 0 15 132
0xffffa000eeab8000-0xffffa000ffc8c000 rw--sg WB 0 0 7 24 117
0xffffa000ffc90000-0xffffa00100000000 rw--sg WB 0 0 0 1 92
0xffffa80000000000-0xffffa80080000000 rw--sg WB 0 2 0 0 0
0xffffa80100000000-0xffffa82000000000 rw--sg WB 0 124 0 0 0
May 16 2024
May 12 2024
May 11 2024
Add requirements comment.
May 9 2024
May 8 2024
In D45042#1028354, @gallatin wrote:
In D45042#1028058, @alc wrote:
In D45042#1027957, @markj wrote:
Do we have any idea what the downsides of the change are? If we make the default 64KB, then I'd expect memory usage to increase; do we have any idea what that looks like? It'd be nice to, for example, compare memory usage on a newly booted system with and without this change.
I had the same question. It will clearly impact a lot of page-granularity counters, at the very least causing some confusion for people who look at those counters, e.g.,
./include/jemalloc/internal/arena_inlines_b.h-		arena_stats_add_u64(tsdn, &arena->stats,
./include/jemalloc/internal/arena_inlines_b.h-		    &arena->decay_dirty.stats->nmadvise, 1);
./include/jemalloc/internal/arena_inlines_b.h-		arena_stats_add_u64(tsdn, &arena->stats,
./include/jemalloc/internal/arena_inlines_b.h:		    &arena->decay_dirty.stats->purged, extent_size >> LG_PAGE);
./include/jemalloc/internal/arena_inlines_b.h-		arena_stats_sub_zu(tsdn, &arena->stats, &arena->stats.mapped,
./include/jemalloc/internal/arena_inlines_b.h-		    extent_size);
However, it's not so obvious what the effect on the memory footprint will be. For example, the madvise(MADV_FREE) calls will have coarser granularity. If we set the page size to 64KB, then one in-use 4KB page within a 64KB region will be enough to block the application of madvise(MADV_FREE) to the other 15 pages. Quantifying the impact that this coarsening has will be hard.
This does, however, seem to be the intended workaround: https://github.com/jemalloc/jemalloc/issues/467
Buried in that issue is the claim that Firefox's built-in derivative version of jemalloc eliminated the statically compiled page size.
What direction does the kernel grow the vm map? They apparently reverted support for lg page size values larger than the runtime page size because it caused fragmentation when the kernel grows the vm map downwards.
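To make the madvise(MADV_FREE) coarsening discussed above concrete, here is a small illustrative sketch; ALLOC_PAGE_SIZE and purge_unit() are hypothetical stand-ins, not jemalloc's actual code:

```c
#include <sys/mman.h>
#include <stdbool.h>

/* Hypothetical: the allocator's compiled-in page size (LG_PAGE = 16). */
#define	ALLOC_PAGE_SIZE	(1UL << 16)	/* 64KB */

/*
 * The allocator can only purge whole 64KB "pages", so if any of the
 * sixteen 4KB hardware pages inside the unit is still in use, all
 * sixteen stay resident.
 */
static void
purge_unit(void *unit, bool any_4k_page_live)
{
	if (any_4k_page_live)
		return;		/* one live 4KB page pins the other 15 */
	(void)madvise(unit, ALLOC_PAGE_SIZE, MADV_FREE);
}
```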
May 3 2024
In D45042#1027957, @markj wrote:
Do we have any idea what the downsides of the change are? If we make the default 64KB, then I'd expect memory usage to increase; do we have any idea what that looks like? It'd be nice to, for example, compare memory usage on a newly booted system with and without this change.
Apr 29 2024
Apr 28 2024
Eliminate unnecessary parentheses.
Apr 27 2024
Apr 24 2024
In D44677#1022012, @andrew wrote:
We could look at that as a follow-up; however, I'm unlikely to have time to make such a change and have it ready and well tested for 14.1, given it's due to be branched in just over 2 weeks.
Apr 17 2024
In D44677#1021479, @alc wrote:
Why not deal with this issue in pmap_mapdev{,_attr}()? Specifically, if the given physical address falls within the DMAP region, don't call kva_alloc{,_aligned}(); instead, map the physical address at its corresponding DMAP virtual address. This is not all that different from amd64, where the DMAP (with appropriate attr settings) is used to access a good bit of device memory.
Apr 16 2024
Why not deal with this issue in pmap_mapdev{,_attr}()? Specifically, if the given physical address falls within the DMAP region, don't call kva_alloc{,_aligned}(); instead, map the physical address at its corresponding DMAP virtual address. This is not all that different from amd64, where the DMAP (with appropriate attr settings) is used to access a good bit of device memory.
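A minimal sketch of this approach for arm64, assuming the PHYS_IN_DMAP()/PHYS_TO_DMAP() macros, pmap_change_attr(), and pmap_kenter(); the committed code may differ:

```c
void *
pmap_mapdev_attr(vm_paddr_t pa, vm_size_t size, vm_memattr_t ma)
{
	vm_offset_t va;

	/*
	 * If the direct map already covers the range, reuse it: just
	 * (re)set the attributes and return the DMAP address.
	 */
	if (PHYS_IN_DMAP(pa) && PHYS_IN_DMAP(pa + size - 1) &&
	    pmap_change_attr(PHYS_TO_DMAP(pa), size, ma) == 0)
		return ((void *)PHYS_TO_DMAP(pa));

	/* Otherwise, fall back to allocating fresh KVA. */
	va = kva_alloc(round_page(size + (pa & PAGE_MASK)));
	pmap_kenter(va, round_page(size + (pa & PAGE_MASK)),
	    trunc_page(pa), ma);
	return ((void *)(va + (pa & PAGE_MASK)));
}
```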
Apr 9 2024
In D38852#1018877, @kib wrote:
In D38852#1018757, @alc wrote:
Just so we're all on the same page, I want to point out the following: While this patch achieves contiguity, it doesn't guarantee 2 MB alignment. Let 'F' represent a fully populated 2 MB reservation, 'E' represent a partially populated reservation, where the population begins in the middle and goes to the end, and 'B' be the complement of 'E', where the population begins at the start and ends in the middle. Typically, the physical memory allocation for one chunk of stacks on amd64 looks like 'EFFFB'. While it would be nice to achieve 'FFFF', this patch is already a great improvement over the current state of affairs.
But is it possible at all (perhaps a better question is whether it is worth it at all), since we do have the guard pages?
Apr 8 2024
Just so we're all on the same page, I want to point out the following: While this patch achieves contiguity, it doesn't guarantee 2 MB alignment. Let 'F' represent a fully populated 2 MB reservation, 'E' represent a partially populated reservation, where the population begins in the middle and goes to the end, and 'B' be the complement of 'E', where the population begins at the start and ends in the middle. Typically, the physical memory allocation for one chunk of stacks on amd64 looks like 'EFFFB'. While it would be nice to achieve 'FFFF', this patch is already a great improvement over the current state of affairs.
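For illustration, the 'EFFFB' shape for one chunk of stacks spanning five 2 MB reservations ('#' is populated, '.' is unpopulated):

```
E: [........########]  begins in the middle, runs to the end
F: [################]  fully populated
F: [################]
F: [################]
B: [########........]  begins at the start, ends in the middle
```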
Apr 7 2024
Update to reflect committed change.
Apr 6 2024
Eliminate an unnecessary variable.
In D44575#1017895, @markj wrote:
In D44575#1016703, @andrew wrote:
I do, however, want to point out that a good portion of the reduction in buildworld time is coming from performing a smaller number of icache flushes when creating executable mappings.
Have you looked at teaching the vm code to manage the icache? We currently call cpu_icache_sync_range more than we need to, e.g., if mapping the same physical address twice we will call it twice.
We discussed this a while ago. To minimize icache syncing, I believe we need to identify all of the places where a kernel might modify a user-mapped page via the direct map. I think that hooking uiomove_* would get us most of the way there, but it's hard to be confident that that's sufficient.
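A hedged sketch of the kind of tracking being discussed; PGA_IC_DIRTY is a hypothetical page flag, and the cpu_icache_sync_range() signature is assumed from arm64:

```c
/*
 * Hypothetical per-page flag marking pages whose contents were written
 * through the direct map, so their icache image may be stale.
 */

/* Hook, e.g., in uiomove_fromphys() after writing via the DMAP. */
static void
mark_icache_stale(vm_page_t m)
{
	vm_page_aflag_set(m, PGA_IC_DIRTY);
}

/* In pmap_enter(), before installing an executable mapping of m. */
static void
sync_icache_if_stale(vm_page_t m, vm_prot_t prot)
{
	if ((prot & VM_PROT_EXECUTE) != 0 &&
	    (m->a.flags & PGA_IC_DIRTY) != 0) {
		cpu_icache_sync_range(
		    (void *)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)), PAGE_SIZE);
		vm_page_aflag_clear(m, PGA_IC_DIRTY);
	}
}
```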
Add KASSERT to vm_reserv_is_populated().
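One plausible shape for the assertion, assuming the function's (m, npages) interface and vm_reserv.c's VM_LEVEL_0_NPAGES constant; the committed KASSERT may differ:

```c
/* Sanity-check the caller-supplied page count. */
KASSERT(npages <= VM_LEVEL_0_NPAGES,
    ("vm_reserv_is_populated: npages %d exceeds reservation size", npages));
```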
Apr 3 2024
Rename pmap_enter_object()'s helpers to not have 64k or 2m in their names.
Apr 1 2024
Mar 30 2024
Mar 24 2024
Reopen after reservation size fix was committed.
Mar 18 2024
Correct VM_NFREEORDER for 16KB page size.
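For reference, the arithmetic behind this (assuming arm64's 16KB translation granule, where an L2 block maps 32MB): a fully populated reservation spans 32MB / 16KB = 2048 = 2^11 base pages, so the buddy free lists must reach order 11 for a whole reservation to be allocated as one chunk, i.e., VM_NFREEORDER must be at least 12.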
Mar 13 2024
Teach sysctl vm.pmap.kernel_maps to correctly count ATTR_CONTIGUOUS superpages when the base page size is 16KB.
Mar 12 2024
Add (void) casts. Refill a comment whose lines were too long.
Mar 10 2024
I'd really like to see this committed.
Jan 28 2024
Despite the long name, it's still two characters shorter than the original code. :-)