Jul 22 2024
Update man page.
In D46057#1050008, @kib wrote:
In D46057#1050007, @alc wrote:
In D46057#1049994, @markj wrote:
Out-of-tree code can make the same change, no need for a __FreeBSD_version bump I think. stable/11 is the last branch where kmem_arena is a distinct entity.
I checked drm-515-kmod, drm-61-kmod, and virtualbox-ose-kmod. They don't appear to have any direct references to kmem_arena.
I suspect the nvidia driver could, but I did not check.
In D46057#1049994, @markj wrote:
Out-of-tree code can make the same change, no need for a __FreeBSD_version bump I think. stable/11 is the last branch where kmem_arena is a distinct entity.
Jul 14 2024
In D45863#1047954, @mjg wrote:
Sorry, I forgot to link the benchmark results from a previous iteration of this patch. The metrics I've gathered show that this approach does reduce NOFREE fragmentation.

A few years back I ran buildkernel in a loop; several runs later, fragmentation had increased significantly, to the point where the kernel was not able to use huge pages.
While technically not a blocker for this patch, something is definitely going wrong here -- the same workload run in a loop should have stabilized its NOFREE usage after maybe 2-3 runs, not kept increasing it toward some unknown bound. Someone(tm) should look into it, but admittedly this patch may happen to dodge the impact.
Here is the amd64 pmap change:
diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
index 9f85e903cd74..841957db3b3b 100644
--- a/sys/amd64/amd64/pmap.c
+++ b/sys/amd64/amd64/pmap.c
@@ -5154,8 +5154,8 @@ pmap_growkernel(vm_offset_t addr)
 		pdpe = pmap_pdpe(kernel_pmap, end);
 		if ((*pdpe & X86_PG_V) == 0) {
 			nkpg = pmap_alloc_pt_page(kernel_pmap,
-			    pmap_pdpe_pindex(end), VM_ALLOC_WIRED |
-			    VM_ALLOC_INTERRUPT | VM_ALLOC_ZERO);
+			    pmap_pdpe_pindex(end), VM_ALLOC_INTERRUPT |
+			    VM_ALLOC_NOFREE | VM_ALLOC_WIRED | VM_ALLOC_ZERO);
			if (nkpg == NULL)
				panic("pmap_growkernel: no memory to grow kernel");
			paddr = VM_PAGE_TO_PHYS(nkpg);
@@ -5174,7 +5174,8 @@ pmap_growkernel(vm_offset_t addr)
 	}
Jul 13 2024
Add requested KASSERT()s.
Jul 12 2024
Are there any other comments or questions about this patch?
@kib Since I removed the rtld change, your last comment on that portion of the change no longer appears inline with the patch, so let me address it here.
Remove what was a transitional (and confusing) rtld change. The proper change will be included in the next big patch that introduces real two-level reservation support. The removed change only sought to avoid the pointless 2MB alignment of libraries, like libc.so, that are too small to benefit from 2MB alignment.
Jul 3 2024
Make vm_map_find() a bit smarter about how much extra space to search for when performing ASLR and either VMFS_SUPER_SPACE or VMFS_OPTIMAL_SPACE is specified.
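For readers unfamiliar with this interface, here is a hedged sketch of a caller asking vm_map_find() for superpage-friendly placement; the helper name is hypothetical and the argument list is written from memory of the prototype, so treat it as an illustration rather than a definitive call site.

#include <sys/param.h>
#include <sys/systm.h>
#include <vm/vm.h>
#include <vm/vm_map.h>

/*
 * Hypothetical caller: ask for an anonymous mapping of "length" bytes,
 * letting vm_map_find() pick a superpage-alignable address when
 * VMFS_OPTIMAL_SPACE is requested.  With ASLR enabled, the search may
 * need extra slack beyond "length", which is what the change above
 * adjusts.
 */
static int
example_place_mapping(vm_map_t map, vm_size_t length, vm_offset_t *addr)
{
	*addr = vm_map_min(map);	/* minimum address hint */
	return (vm_map_find(map, NULL, 0, addr, length, vm_map_max(map),
	    VMFS_OPTIMAL_SPACE, VM_PROT_READ | VM_PROT_WRITE, VM_PROT_ALL,
	    0));
}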
In D45046#1045243, @markj wrote:
In D45046#1045072, @bnovkov wrote:
However, I think that the biggest difference here is that these changes will prioritize filling up partially populated reservations, which should pack things more tightly than VM_FREEPOOL_DIRECT. Allocating pages using VM_FREEPOOL_DIRECT will dequeue pages from the 0-order freelist, and there's no guarantee that the queued 0-order pages come from the same reservation.

If a page is in a 0-order freelist, then its buddy is already allocated. (And that page has been in the freelist longer than all of the other free 0-order pages, so it is not likely to see its buddy returned to the free lists in the near future.) vm_phys tries to import the largest possible chunk of contiguous memory into a pool when needed, so unless RAM is already very fragmented, the buddy will be allocated to another consumer of the same free pool, which in this case is likely to be UMA. It's not really obvious to me why this is objectively worse than the reservation-based scheme.
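As a purely illustrative aside on the buddy relationship invoked above, the sketch below (hypothetical names, not the vm_phys code) shows how the buddy of a 2^order-page block is found, and why a page sitting alone in the 0-order freelist implies its buddy is currently allocated.

#include <stdint.h>

#define	EX_PAGE_SIZE	4096UL	/* illustrative base page size */

/*
 * Illustrative sketch: the buddy of the 2^order-page block starting at
 * physical address pa is the adjacent block of the same size, found by
 * flipping the address bit that corresponds to the block size.  If a
 * 0-order page and its buddy were both free, the buddy allocator would
 * have merged them into an order-1 block instead of queueing them
 * separately.
 */
static inline uint64_t
example_buddy_of(uint64_t pa, int order)
{
	return (pa ^ (EX_PAGE_SIZE << order));
}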
Jul 1 2024
Simplify pmap_enter_largepage()'s handling of L3C pages.
Jun 30 2024
Does anyone know whether any real riscv hardware implements the extension that supports additional page sizes, particularly 64KB?
In D45761#1044534, @markj wrote:
I was about to comment that other pmaps still use atomics for these counters, but it seems that amd64's also been using counter(9) for a while. I'm a bit skeptical that that's really necessary (except perhaps for p_failures), but it doesn't have much downside either. It would be nice to make at least arm64 consistent.
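For readers who haven't used the two interfaces mentioned above, here is a hedged sketch contrasting a counter(9) per-CPU counter with a plain atomically updated counter; the counter names and the event function are made up for illustration.

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/counter.h>
#include <machine/atomic.h>

/* Hypothetical counters; the real pmap statistics use different names. */
static COUNTER_U64_DEFINE_EARLY(example_promotions);
static u_long example_p_failures;

static void
example_record_events(void)
{
	/* counter(9): per-CPU increment, no shared cache-line contention. */
	counter_u64_add(example_promotions, 1);

	/* atomic(9): one shared variable, contended when updated frequently. */
	atomic_add_long(&example_p_failures, 1);
}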
Jun 28 2024
Fix PMAP_ENTER_LARGEPAGE
Jun 12 2024
Here is a very simple example. Before the change:
97156 0x2eb08e600000 0x2eb08e64b000 rw-    37    37 1 0 ----- sw
97156 0x2eb08e800000 0x2eb08f021000 rw-  1031 29378 3 0 ----- sw
97156 0x2eb08f021000 0x2eb08f022000 rw-     1 29378 3 0 ----- sw
97156 0x2eb08f022000 0x2eb095f22000 rw- 28346 29378 3 0 --S-- sw
97156 0x2eb096000000 0x2eb09d000000 rw- 20411 20411 1 0 --S-- sw
After the change:
17423 0x258f55e00000 0x258f55e6c000 rw-    38    39 2 0 ----- sw
17423 0x258f55e6c000 0x258f55e6d000 rw-     1    39 2 0 ----- sw
17423 0x258f56000000 0x258f5d700000 rw- 29483 29483 1 0 --S-- sw
17423 0x258f5d800000 0x258f67000000 rw- 27646 27646 1 0 --S-- sw
With this patch in place, I see a small reduction in the number of reservations allocated (e.g., 1.6% during buildworld), fewer partially populated reservations, a small increase in 64KB page promotions on arm64, and fewer map entries in the heap.
Jun 11 2024
I'm going to suggest the following as the final resolution for this issue. It addresses the fragmentation that I described in my last message, and it handles the scenario that you brought up wherein we run out of free space at the current anon_loc, perform an ASLR restart, and want to avoid repeating the ASLR restart on the next mapping.
diff --git a/sys/vm/vm_map.c b/sys/vm/vm_map.c
index 3c7afcb6642f..fa71bb8a01d6 100644
--- a/sys/vm/vm_map.c
+++ b/sys/vm/vm_map.c
@@ -2247,8 +2247,15 @@ vm_map_find(vm_map_t map, vm_object_t object, vm_ooffset_t offset,
 		rv = vm_map_insert(map, object, offset, *addr, *addr + length,
 		    prot, max, cow);
 	}
-	if (rv == KERN_SUCCESS && update_anon)
-		map->anon_loc = *addr + length;
+
+	/*
+	 * Update the starting address for clustered anonymous memory mappings
+	 * if a starting address was not previously defined or an ASLR restart
+	 * placed an anonymous memory mapping at a lower address.
+	 */
+	if (update_anon && rv == KERN_SUCCESS && (map->anon_loc == 0 ||
+	    *addr < map->anon_loc))
+		map->anon_loc = *addr;
 done:
	vm_map_unlock(map);
	return (rv);
@@ -4041,9 +4048,6 @@ vm_map_delete(vm_map_t map, vm_offset_t start, vm_offset_t end)
		    entry->object.vm_object != NULL)
			pmap_map_delete(map->pmap, entry->start, entry->end);
I'm not convinced that we should be creating an ilog2.h header file. I would leave the definitions in the existing libkern.h header file.
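For context, here is a minimal sketch of the kind of definition being discussed, assuming the usual flsl()-based formulation; the macro name is hypothetical and not necessarily the one proposed in the patch.

#include <sys/param.h>
#include <sys/systm.h>

/*
 * Hypothetical illustration only: floor(log2(n)) for n > 0, built on the
 * flsl() primitive that libkern already provides.
 */
#define	example_ilog2(n)	(flsl((long)(n)) - 1)

/* e.g., example_ilog2(4096) evaluates to 12. */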
Jun 7 2024
By the way, you mentioned pmap_enter_l3c() only being called from one place. The next big chunk of @ehs3_rice.edu's work will include a call to pmap_enter_l3c() from pmap_enter(), with pagesizes[] updated to include the L3C page size as psind == 1.
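To make the psind numbering concrete, here is a purely illustrative arrangement (assuming 4KB base pages on arm64); the array below is an example, not the kernel's actual pagesizes[] initialization.

/*
 * Illustrative only: with the planned change, the L3C size becomes the
 * second supported page size, so psind == 1 in a mapping request selects
 * a 64KB contiguous L3 mapping.
 */
static const unsigned long example_pagesizes[] = {
	4 * 1024,	/* psind == 0: base (L3) page */
	64 * 1024,	/* psind == 1: L3C page described above */
};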
Remove two early calls to pmap_pte_bti() that are now redundant.
Jun 6 2024
@kib Do you have any PKU test programs that could be run to check this?
Correct a couple errors in the previous version.