User Details
- User Since
- Dec 14 2014, 5:52 AM (541 w, 5 d)
Yesterday
The key here is that all page allocation functions call vm_page_dequeue(), completing any lingering page queue operations that involve the plinks.q field.
Wed, Apr 30
Are you going to update the patch to include vm_page_grab_pages()?
Tue, Apr 29
Mon, Apr 28
Sat, Apr 26
Aren't all of the changes to vm_page.h valid? Can we have a standalone patch with just those changes?
Fri, Apr 25
Now that iterators are much smaller, consisting of only four 64-bit fields, I think we should seriously consider embedding an iterator in the vm_object for use by operations that require a write lock on the object. If that iterator is used by the functions that insert and remove pages from the trie, then it will always be consistent, and most of the pctrie_iter_reset() calls can be eliminated. With the removal of the memq from the object, this would only grow the size of the vm_object by two 64-bit fields. (If there were some way to avoid having both the root of the pctrie and the iterator's pointer to that root within the vm_object, then the size of the vm_object would only grow by one 64-bit field.)
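The size arithmetic in that comment can be checked with a small user-space sketch. This is not the kernel's actual layout: every struct and field name below is hypothetical, and the only facts taken from the comment are that the iterator has four 64-bit fields, that it holds a pointer to the trie root, and that the memq goes away. The memq is modelled here as a two-pointer list head, which is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical iterator: four 64-bit fields, one of them a root pointer. */
struct toy_iter {
	uint64_t root;	/* pointer to the trie root (assumed field) */
	uint64_t node;	/* current position in the trie (assumed field) */
	uint64_t index;	/* current key (assumed field) */
	uint64_t limit;	/* upper bound for the walk (assumed field) */
};

/* Hypothetical iterator that reuses the object's own root pointer. */
struct toy_iter_noroot {
	uint64_t node;
	uint64_t index;
	uint64_t limit;
};

/* Object today: trie root plus a two-pointer memq list head (assumed). */
struct toy_object_memq {
	uint64_t trie_root;
	uint64_t memq_first;
	uint64_t memq_last;
};

/* Object with the memq removed and a full iterator embedded. */
struct toy_object_iter {
	uint64_t trie_root;
	struct toy_iter iter;
};

/* Object with the memq removed and a root-less iterator embedded. */
struct toy_object_shared_root {
	uint64_t trie_root;
	struct toy_iter_noroot iter;
};

/* Growth of an object of new_size bytes over the memq layout, in fields. */
static size_t
growth_in_fields(size_t new_size)
{
	assert(new_size >= sizeof(struct toy_object_memq));
	return ((new_size - sizeof(struct toy_object_memq)) /
	    sizeof(uint64_t));
}
```

Under these assumptions, growth_in_fields(sizeof(struct toy_object_iter)) is 2 and growth_in_fields(sizeof(struct toy_object_shared_root)) is 1, matching the two cases described in the comment.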
Thu, Apr 24
Wed, Apr 9
The performance results in the TEST PLAN are much improved over what I was seeing last fall for a similar set of changes, in particular the results for vm_page_alloc(). Clearly, the improvements to the implementation of iterators in the last few months, particularly, I suspect, the elimination of the rather large array on the stack, have made a difference. Modulo any low-level changes or fixes that this patch needs, I would be very happy to see it committed. To my mind, this is the most critical step toward eliminating the two linked list pointers from struct vm_page, which I hope will yield a small but measurable performance improvement, particularly for pages allocated from superpage reservations.
Sat, Apr 5
Mar 24 2025
Mar 17 2025
The comment in vm_page.h incorrectly says that vm_page_alloc() accepts VM_ALLOC_WAITOK.
Mar 14 2025
Mar 12 2025
Mar 10 2025
Mar 8 2025
Feb 28 2025
Feb 27 2025
Feb 22 2025
Feb 16 2025
Feb 15 2025
Dec 9 2024
Dec 8 2024
Dec 7 2024
This change increased the cycles spent in both vm_object_split() and vm_object_collapse_scan() during a buildworld. It appears that we're not moving enough pages to overcome the startup costs of the iterator.
Shouldn't this be abandoned?
Dec 6 2024
Dec 5 2024
Nov 30 2024
Nov 29 2024
Update in response to reviewer comments.
Nov 28 2024
Nov 27 2024
Nov 25 2024
Let's limit this change to vm_page_iter_free. It reduces the average cycles in _kmem_unback and vm_object_page_remove by 7% and 5.3%, respectively.
Nov 24 2024
By itself, this is going to be slower. This should be a part of a larger patch that removes the memq, so that we can evaluate whether the elimination of the memq makes up for the higher cost of the iterators.
Nov 23 2024
Nov 20 2024
Nov 17 2024
Nov 16 2024
Cycles to perform a 2MB aligned vm_page_alloc_contig() for shm_create_largepage() on a Ryzen 5900X:
x base
+ iter
[ministat distribution plot omitted]
    N           Min           Max        Median           Avg        Stddev
x  10         16813         17866       16989.5       17074.6     294.94715
+  10         12691         13496         12759       12833.1     239.58643
Difference at 95.0% confidence
	-4241.5 +/- 252.466
	-24.841% +/- 1.2701%
	(Student's t, pooled s = 268.696)
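The summary figures in the ministat output can be rechecked from the per-sample averages and standard deviations it reports. The sketch below is only that arithmetic, not ministat's implementation; the function names are hypothetical. It works with the pooled variance rather than the pooled standard deviation so that it needs no math library.

```c
#include <assert.h>

/* Difference of the means: candidate minus baseline. */
static double
mean_difference(double base_avg, double cand_avg)
{
	return (cand_avg - base_avg);
}

/* The same difference expressed as a percentage of the baseline mean. */
static double
percent_difference(double base_avg, double cand_avg)
{
	assert(base_avg != 0.0);
	return (100.0 * (cand_avg - base_avg) / base_avg);
}

/*
 * Pooled variance of two samples of sizes n1 and n2 with standard
 * deviations s1 and s2.  Its square root is the "pooled s" value.
 */
static double
pooled_variance(int n1, double s1, int n2, double s2)
{
	assert(n1 > 1 && n2 > 1);
	return (((n1 - 1) * s1 * s1 + (n2 - 1) * s2 * s2) /
	    (n1 + n2 - 2));
}
```

With the numbers above, mean_difference(17074.6, 12833.1) gives -4241.5 cycles and percent_difference() gives about -24.84%. pooled_variance(10, 294.94715, 10, 239.58643) is about 72197.7, whose square root is the reported pooled s of 268.696.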
Nov 13 2024
Nov 12 2024
Is this related to the KASSERT that was recently added to vm_object_terminate_single_page?
Oct 30 2024
Oct 27 2024
To be clear, there is no question that this function will get used, so adding it sooner, rather than later, would be okay with me.
I'm looking at patches from both you and @dougm that define this function. However, each of you adds it in a different location within the file. :-)
Oct 26 2024
Incorporating D47277 has increased the average number of cycles to perform vm_page_alloc() to 1325.
Incorporating D47277 yielded the lowest average cycles in vm_page_alloc_contig() that I've seen.
Switching from ..._lookup_le() to ..._lookup_lt() increased the average number of cycles in vm_page_alloc().
D47207 has reduced the average number of cycles to perform vm_page_alloc() from 1295 to 1265.