alc (Alan Cox)
User

Projects

User Details

User Since
Dec 14 2014, 5:52 AM (153 w, 18 h)

Recent Activity

Tue, Nov 7

alc added inline comments to D11943: Modify vm_page_wire() to not dequeue the specified page.
Tue, Nov 7, 5:06 PM

Sun, Nov 5

alc added a comment to D11943: Modify vm_page_wire() to not dequeue the specified page.
In D11943#269227, @alc wrote:

I have a simple test where I load an approximately 16GB sized file into memory and then run a program that performs random accesses of various sizes (and alignments) to the file. Lock contention is not an issue in this test. I am not trying to measure the SMP benefits of the lazy wiring.

When I wrote the patch originally, I tried running a read-only multi-threaded pgbench against a database that fit in RAM. Without the patch, we see lots of inactive queue lock contention as postgres processes wire pages into the buffer cache. With the patch, the contention is completely gone since the queue manipulations are deferred to the bufdaemon.

Sun, Nov 5, 11:15 PM
alc added a comment to D11943: Modify vm_page_wire() to not dequeue the specified page.

I have a simple test where I load an approximately 16GB sized file into memory and then run a program that performs random accesses of various sizes (and alignments) to the file. Lock contention is not an issue in this test. I am not trying to measure the SMP benefits of the lazy wiring.

Sun, Nov 5, 9:46 PM
alc added a comment to D11943: Modify vm_page_wire() to not dequeue the specified page.
In D11943#258507, @alc wrote:

Originally, my hope was that this change would only acquire the inactive queue lock once in vfs_vmio_unwire() for requeueing the page and eliminate the acquisition in vfs_vmio_extend() because we lazily wire each page. However, please take a look at vfs_vmio_unwire(). Unfortunately, the way it is written, the page queue lock will now be acquired twice, first during the vm_page_unwire(PQ_NONE) to dequeue the page and again during vm_page_deactivate() to reenqueue the page. However, if we don't dequeue the page during vm_page_unwire(PQ_NONE) (as in the previous version of this change), then vm_page_deactivate() does nothing because it sees the page is already inactive. In other words, the page doesn't get requeued to the tail.

I tried to address this by adding a new KPI, vm_page_unwire_noq(), which unwires the page without modifying its position in any page queue. This way, the caller has fine-grained control over what happens to the page, though it must be careful to avoid leaks. I modified sendfile_free_page() to use it as well, and restructured it a bit to more closely resemble vfs_vmio_unwire().
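The locking argument in this exchange can be illustrated with a toy model. The sketch below is not FreeBSD code: every name in it (toy_page, toy_unwire_dequeue, toy_unwire_noq, toy_deactivate, toy_requeue_tail) is hypothetical. It only counts how often a pretend page queue lock would be taken under each strategy, assuming a page that starts out wired and already on the inactive queue, and assuming the caller of the no-queue unwire follows it with a single explicit requeue.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in the FreeBSD kernel. */
enum toy_queue { TOY_NONE, TOY_INACTIVE };

struct toy_page {
	enum toy_queue queue;	/* queue the page currently sits on */
	int wire_count;		/* number of outstanding wirings */
	bool at_tail;		/* true once the page was (re)queued at the tail */
};

static int toy_lock_acquisitions;	/* pretend page queue lock counter */

static void
toy_lock(void)
{
	toy_lock_acquisitions++;
}

/* Unwire and, on the last unwire, dequeue the page (one lock acquisition). */
static void
toy_unwire_dequeue(struct toy_page *p)
{
	if (--p->wire_count == 0 && p->queue != TOY_NONE) {
		toy_lock();
		p->queue = TOY_NONE;
		p->at_tail = false;
	}
}

/* Unwire without touching the page's queue state (no lock). */
static void
toy_unwire_noq(struct toy_page *p)
{
	p->wire_count--;
}

/* Deactivate: does nothing if the page is already on the inactive queue. */
static void
toy_deactivate(struct toy_page *p)
{
	if (p->queue == TOY_INACTIVE)
		return;
	toy_lock();
	p->queue = TOY_INACTIVE;
	p->at_tail = true;
}

/* Move the page to the tail of the inactive queue (one lock acquisition). */
static void
toy_requeue_tail(struct toy_page *p)
{
	toy_lock();
	p->queue = TOY_INACTIVE;
	p->at_tail = true;
}

/* Strategy A: dequeue on unwire, then deactivate. */
static int
toy_release_dequeue(struct toy_page *p)
{
	toy_lock_acquisitions = 0;
	toy_unwire_dequeue(p);
	toy_deactivate(p);
	return (toy_lock_acquisitions);
}

/* Strategy B: leave the page queued on unwire, then deactivate. */
static int
toy_release_skip(struct toy_page *p)
{
	toy_lock_acquisitions = 0;
	toy_unwire_noq(p);
	toy_deactivate(p);
	return (toy_lock_acquisitions);
}

/* Strategy C: no-queue unwire followed by one explicit requeue. */
static int
toy_release_noq(struct toy_page *p)
{
	toy_lock_acquisitions = 0;
	toy_unwire_noq(p);
	toy_requeue_tail(p);
	return (toy_lock_acquisitions);
}
```

In this model, strategy A takes the lock twice, strategy B takes it zero times but leaves the page where it was in the queue, and strategy C takes it once while still moving the page to the tail.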

Sun, Nov 5, 7:27 PM

Sat, Nov 4

alc added a comment to D12838: Replace many instances of VM_WAIT with blocking page allocation functions.

Overall, I think that this change is a good one. It will be a lot easier to perform a NUMA-aware VM_WAIT with this change.

Sat, Nov 4, 6:47 PM

Thu, Nov 2

alc added a comment to D12888: Remove object_collapses and object_bypasses counters.

On occasion, I do look at these counters. Converting them to an SMP-friendly alternative is probably a worthwhile endeavor.

Thu, Nov 2, 5:09 PM

Sat, Oct 28

alc added inline comments to D12635: Allow allocations across meta boundaries.
Sat, Oct 28, 5:58 PM

Tue, Oct 24

alc committed rS324960: Micro-optimize the handling of fictitious pages in vm_page_free_prep().
Tue, Oct 24, 5:15 PM

Mon, Oct 23

alc accepted D12764: Fix the VM_NRESERVLEVEL == 0 build.
Mon, Oct 23, 4:50 AM
alc added inline comments to D12764: Fix the VM_NRESERVLEVEL == 0 build.
Mon, Oct 23, 1:01 AM

Sat, Oct 21

alc added a comment to D12725: Eliminate redundant TLB invalidations in the arm64 pmap.

Here are two related observations:

Sat, Oct 21, 5:47 PM

Oct 19 2017

alc added a comment to D12725: Eliminate redundant TLB invalidations in the arm64 pmap.
In D12725#264182, @kib wrote:
In D12725#264164, @alc wrote:
In D12725#264159, @kib wrote:

I agree that the change is functionally correct, but perhaps the intent was to coalesce the invalidations? I.e., either pmap_remove_l3() should stop doing the invalidation, or a variant that does not do the invalidation should be added and used there.

Then, for correctness, you have to introduce the delayed invalidation machinery from amd64. (Or perform a TLB range invalidation operation whenever you change PV list locks of the addresses whose PTEs were destroyed since the last PV list lock change.)

You are right, of course. But I think that this is worth it as well (invalidate ranges on PV list unlocks).

Oct 19 2017, 6:44 PM
alc added a comment to D12725: Eliminate redundant TLB invalidations in the arm64 pmap.
In D12725#264159, @kib wrote:

I agree that the change is functionally correct, but perhaps the intent was to coalesce the invalidations? I.e., either pmap_remove_l3() should stop doing the invalidation, or a variant that does not do the invalidation should be added and used there.

Oct 19 2017, 5:41 PM
alc created D12725: Eliminate redundant TLB invalidations in the arm64 pmap.
Oct 19 2017, 5:42 AM
alc accepted D12663: Move swapout code into vm_swapout.c.
Oct 19 2017, 5:14 AM
alc accepted D12668: Do not overwrite clean blocks on pageout.

It appears to me that we are acquiring a write lock on the object (as opposed to a read lock) just to handle an edge case where we have to call vm_page_clear_dirty().

Oct 19 2017, 5:07 AM
alc committed rS324743: Batch atomic updates to the number of active, inactive, and laundry.
Oct 19 2017, 4:14 AM

Oct 14 2017

alc added inline comments to D12665: Reduce traffic on vm_cnt.v_free_count.
Oct 14 2017, 4:24 AM

Oct 13 2017

alc committed rS324601: Address two problems with sendfile(..., SF_NOCACHE) and apply one.
Oct 13 2017, 4:31 PM
alc accepted D12660: Evaluate the real size of the sblk_zone.
Oct 13 2017, 4:05 PM
alc added a comment to D12660: Evaluate the real size of the sblk_zone.

I think that the change is fine, but it needs a comment. Otherwise, this is going to look awfully strange to someone who isn't familiar with the inner workings of UMA.

Oct 13 2017, 3:34 PM

Oct 9 2017

alc closed D12627: clear allocated blist struct.
Oct 9 2017, 6:19 PM
alc committed rS324444: The recent change to initialization of blists (r324420) relied on '-1'.
Oct 9 2017, 6:19 PM
alc added a comment to D11968: Simplify blist initialization.

Doug has explained the cause and provided a fix in D12627, which I will commit shortly.

Oct 9 2017, 5:28 PM

Oct 8 2017

alc committed rS324420: The blst_radix_init function has two purposes - to compute the number of.
Oct 8 2017, 10:17 PM
alc closed D11968: Simplify blist initialization.
Oct 8 2017, 10:17 PM
alc committed rS324412: MFC r324173.
Oct 8 2017, 5:15 PM
alc added a comment to D11968: Simplify blist initialization.

There are a couple typo-class errors in the summary. Can you fix them so that I can use the summary as the commit message?

Oct 8 2017, 5:07 PM
alc committed rS324411: Replace an unnecessary call to vm_page_activate() by an assertion that.
Oct 8 2017, 4:55 PM
alc accepted D11968: Simplify blist initialization.
Oct 8 2017, 2:59 AM

Oct 7 2017

alc committed rS324400: MFC r305685.
Oct 7 2017, 9:14 PM
alc committed rS324399: MFC r321386,321393.
Oct 7 2017, 8:22 PM
alc committed rS324390: MFC r319542,321003,321378.
Oct 7 2017, 6:37 PM
alc committed rS324389: MFC r320980,321377.
Oct 7 2017, 6:08 PM
alc committed rS324387: MFC r321015.
Oct 7 2017, 5:32 PM
alc committed rS324385: MFC r323973,324087.
Oct 7 2017, 5:20 PM
alc committed rS324384: MFC r323656.
Oct 7 2017, 4:56 PM

Oct 2 2017

alc committed rS324190: Use vm_page_active() rather than directly accessing the page's queue.
Oct 2 2017, 7:30 AM
alc committed rS324189: When mdstart_swap() accesses a page that is already in the active queue,.
Oct 2 2017, 7:14 AM

Oct 1 2017

alc committed rS324173: When an I/O error occurs on page out, there is no need to dirty the page,.
Oct 1 2017, 5:04 PM
alc committed rS324171: MFC r323961.
Oct 1 2017, 4:57 PM
alc committed rS324165: MFC r323981.
Oct 1 2017, 4:29 PM

Sep 30 2017

alc committed rS324131: MFC r323391.
Sep 30 2017, 7:54 PM
alc committed rS324130: MFC r322459,322897.
Sep 30 2017, 7:24 PM
alc committed rS324129: MFC r323868.
Sep 30 2017, 6:53 PM
alc committed rS324128: MFC r323786.
Sep 30 2017, 6:32 PM
alc committed rS324126: MFC r323785.
Sep 30 2017, 6:07 PM

Sep 28 2017

alc committed rS324087: Optimize vm_object_page_remove() by eliminating pointless calls to.
Sep 28 2017, 5:56 PM

Sep 24 2017

alc committed rS323982: Change vm_page_try_to_free() to require a managed page. Essentially,.
Sep 24 2017, 11:35 PM
alc committed rS323981: Modernize the use of vm_page_unwire(). Since r288122, vm_page_unwire().
Sep 24 2017, 10:29 PM
alc committed rS323973: Optimize vm_page_try_to_free(). Specifically, the call to pmap_remove_all().
Sep 24 2017, 4:50 PM
alc committed rS323961: Since the page "frame" doesn't belong to a vm object, it can't be paged.
Sep 24 2017, 2:51 AM

Sep 23 2017

alc added a comment to D11943: Modify vm_page_wire() to not dequeue the specified page.

Originally, my hope was that this change would only acquire the inactive queue lock once in vfs_vmio_unwire() for requeueing the page and eliminate the acquisition in vfs_vmio_extend() because we lazily wire each page. However, please take a look at vfs_vmio_unwire(). Unfortunately, the way it is written, the page queue lock will now be acquired twice, first during the vm_page_unwire(PQ_NONE) to dequeue the page and again during vm_page_deactivate() to reenqueue the page. However, if we don't dequeue the page during vm_page_unwire(PQ_NONE) (as in the previous version of this change), then vm_page_deactivate() does nothing because it sees the page is already inactive. In other words, the page doesn't get requeued to the tail.

Sep 23 2017, 7:30 AM

Sep 21 2017

alc committed rS323868: Modernize calls to vm_page_unwire(). As of r288122, vm_page_unwire().
Sep 21 2017, 3:32 PM

Sep 20 2017

alc added inline comments to D11943: Modify vm_page_wire() to not dequeue the specified page.
Sep 20 2017, 4:26 PM
alc committed rS323786: In r288122, we changed vm_page_unwire() so that it returns a Boolean.
Sep 20 2017, 5:00 AM
alc committed rS323785: Sync with amd64/arm/arm64/i386/mips pmap change r288256:.
Sep 20 2017, 4:20 AM

Sep 19 2017

alc accepted D12411: For unlinked files, do not msync(2) or sync on inactivation.
Sep 19 2017, 4:15 PM
alc added a comment to D12411: For unlinked files, do not msync(2) or sync on inactivation.

As long as the page daemon will still launder the pages when we are short of memory, I don't see a problem with msync() ignoring them.

Sep 19 2017, 3:50 PM

Sep 17 2017

alc committed rS323681: MFC r321840,322041.
Sep 17 2017, 4:46 PM
alc committed rS323665: MFC r321423.
Sep 17 2017, 4:15 AM
alc committed rS323664: MFC r321102.
Sep 17 2017, 3:44 AM
alc committed rS323663: MFC r322404.
Sep 17 2017, 3:34 AM
alc committed rS323662: MFC r322296.
Sep 17 2017, 3:17 AM

Sep 16 2017

alc committed rS323656: Modify blst_leaf_alloc to take only the cursor argument.
Sep 16 2017, 6:12 PM
alc closed D11819: Allow blist allocations to span leaf boundaries, but not meta boundaries.
Sep 16 2017, 6:12 PM
alc accepted D11819: Allow blist allocations to span leaf boundaries, but not meta boundaries.

The next-to-last change to this patch, which zeroed the bitmap in terminator nodes, addressed my last concern. I'm going to commit this patch shortly.

Sep 16 2017, 5:46 PM

Sep 13 2017

alc added inline comments to D11968: Simplify blist initialization.
Sep 13 2017, 4:38 PM
alc accepted D11819: Allow blist allocations to span leaf boundaries, but not meta boundaries.
Sep 13 2017, 4:27 PM
alc added a comment to D11819: Allow blist allocations to span leaf boundaries, but not meta boundaries.

I've performed some testing with the new sysctl that calls blist_stats(). Specifically, I've configured a test machine with 2GB of RAM and 64GB of swap space, and run a "make -j7 buildworld" in a loop. After 6 consecutive builds, the before and after results are as follows.

Sep 13 2017, 4:25 PM

Sep 10 2017

alc added a comment to D12281: Move vmmeter atomic counters into dedicated cache lines.
In D12281#255294, @mjg wrote:
-       u_int v_free_count;     /* (f) pages free */
        u_int v_inactive_target; /* (c) pages desired inactive */
        u_int v_pageout_free_min;   /* (c) min pages reserved for kernel */
        u_int v_interrupt_free_min; /* (c) reserved pages for int code */
        u_int v_free_severe;    /* (c) severe page depletion point */
+       u_int v_free_count VMMETER_ALIGNED;     /* (f) pages free */
        u_int v_wire_count VMMETER_ALIGNED; /* (a) pages wired down */

Made no difference in the buildkernel test, even with kib's page batching patch. Something of the sort will definitely have to be done later anyway and probably helps a little elsewhere, so I can include this bit in the patch if you want.
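The layout idea in the quoted diff, moving the frequently written free count onto its own cache line and away from read-mostly fields, can be sketched in standalone C11. This is only an illustration under stated assumptions: the 64-byte line size, the SKETCH_ALIGNED macro (a stand-in for VMMETER_ALIGNED, whose real definition is not shown here), and the sketch_meter structure are all hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumed cache line size; the real value is machine-dependent. */
#define SKETCH_CACHE_LINE	64
/* Hypothetical stand-in for VMMETER_ALIGNED. */
#define SKETCH_ALIGNED		alignas(SKETCH_CACHE_LINE)

struct sketch_meter {
	/* Read-mostly tunables can safely share one cache line. */
	unsigned int inactive_target;
	unsigned int pageout_free_min;
	unsigned int interrupt_free_min;
	unsigned int free_severe;
	/* Each frequently written counter starts its own cache line. */
	SKETCH_ALIGNED unsigned int free_count;
	SKETCH_ALIGNED unsigned int wire_count;
};

/* Return nonzero if two member offsets fall in different cache lines. */
static int
sketch_on_distinct_lines(size_t off_a, size_t off_b)
{
	return (off_a / SKETCH_CACHE_LINE != off_b / SKETCH_CACHE_LINE);
}

/* Compile-time check that the free count begins a cache line. */
static_assert(offsetof(struct sketch_meter, free_count) %
    SKETCH_CACHE_LINE == 0, "free_count must start a cache line");
```

With this layout, writes to free_count no longer invalidate the line holding the read-mostly fields. Whether that shows up in a benchmark is a separate question, as the buildkernel result quoted above suggests.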

Sep 10 2017, 6:42 PM
alc added a comment to D12281: Move vmmeter atomic counters into dedicated cache lines.

Have you evaluated the effects of isolating the free count? The fields surrounding it are read only.

Sep 10 2017, 6:07 PM
alc accepted D12281: Move vmmeter atomic counters into dedicated cache lines.
Sep 10 2017, 6:00 PM
alc committed rS323391: To analyze the allocation of swap blocks by blist functions, add a method.
Sep 10 2017, 5:46 PM
alc closed D11906: est tool, so that 's' writes data about free space distribution.
Sep 10 2017, 5:46 PM

Sep 9 2017

alc added inline comments to D11943: Modify vm_page_wire() to not dequeue the specified page.
Sep 9 2017, 11:26 PM
alc added inline comments to D11943: Modify vm_page_wire() to not dequeue the specified page.
Sep 9 2017, 10:48 PM
alc added inline comments to D11943: Modify vm_page_wire() to not dequeue the specified page.
Sep 9 2017, 10:33 PM

Sep 7 2017

alc accepted D12248: Speed up vm_page_array initialization.
Sep 7 2017, 4:58 PM
alc added inline comments to D12248: Speed up vm_page_array initialization.
Sep 7 2017, 3:52 PM

Sep 4 2017

alc added inline comments to D11906: est tool, so that 's' writes data about free space distribution.
Sep 4 2017, 4:52 PM

Sep 1 2017

alc added a comment to D11906: est tool, so that 's' writes data about free space distribution.

swap_pager.c needs #include <sys/sbuf.h>

Sep 1 2017, 2:32 AM
alc added inline comments to D11906: est tool, so that 's' writes data about free space distribution.
Sep 1 2017, 2:30 AM
alc added inline comments to D11906: est tool, so that 's' writes data about free space distribution.
Sep 1 2017, 2:28 AM

Aug 28 2017

alc committed rS322971: Update a couple vm_object lock assertions in the swap pager to reflect the.
Aug 28 2017, 5:02 PM
alc committed rS322970: Switching from a global hash table to per-vm_object radix tries for mapping.
Aug 28 2017, 4:56 PM
alc closed D12134: Update vm object lock assertions in the swap pager by committing rS322970: Switching from a global hash table to per-vm_object radix tries for mapping.
Aug 28 2017, 4:56 PM
alc added a comment to D12134: Update vm object lock assertions in the swap pager.
In D12134#251911, @kib wrote:
In D12134#251899, @alc wrote:

This change reveals a problem: vm_fault_soft_fast() is (indirectly) calling vm_pager_page_unswapped() with only a read lock.

My understanding is that this is a valid assert in the following situation: we have a swap object with a valid page which was swapped in and not dirtied. Then, on a write fault, any (fast or normal) fault handlers could free the swap space. It is possible because swap pager page-in leaves the backing swap space intact as an (IMO very small) optimization.

Aug 28 2017, 4:36 AM
alc updated the diff for D12134: Update vm object lock assertions in the swap pager.

Update vm_fault_dirty().

Aug 28 2017, 4:32 AM

Aug 27 2017

alc added a comment to D12134: Update vm object lock assertions in the swap pager.

This change reveals a problem: vm_fault_soft_fast() is (indirectly) calling vm_pager_page_unswapped() with only a read lock.

Aug 27 2017, 5:46 AM

Aug 26 2017

alc created D12134: Update vm object lock assertions in the swap pager.
Aug 26 2017, 9:11 PM
alc added inline comments to D11906: est tool, so that 's' writes data about free space distribution.
Aug 26 2017, 5:02 PM
alc accepted D12084: Synchronize page laundering with pmap_extract_and_hold().
Aug 26 2017, 4:34 PM

Aug 25 2017

alc committed rS322897: Correct a regression in the previous change, r322459. Specifically, the.
Aug 25 2017, 6:47 PM
alc closed D12106: Allocation of last block requires blist cursor reset by committing rS322897: Correct a regression in the previous change, r322459. Specifically, the.
Aug 25 2017, 6:47 PM
alc accepted D12106: Allocation of last block requires blist cursor reset.
Aug 25 2017, 6:17 PM
alc added a comment to D12106: Allocation of last block requires blist cursor reset.

A sentinel is not sufficient. While cursor sits at the end of managed memory, we start with scan = bl_root, and compute blk == cursor, so child == 0, so we loop over the 16 subchildren of root and if we find a free block at p, we report a free block at p+radix. Nothing gets us to actually look at the new sentinel, unless we make the tree one level higher, with the 2nd child of the root consisting of the sentinel only.

Aug 25 2017, 6:17 PM
alc accepted D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.
Aug 25 2017, 5:33 PM
alc added a comment to D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.

I've now completed 8+ hours of testing under a "make -j7 buildworld" workload on 1.5GB of RAM swapping to a Samsung 850 PRO, both with and without the patch. I think that there is too much variability in the execution time to conclude anything about it. However, here is the memory utilization story:

Aug 25 2017, 5:32 PM

Aug 24 2017

alc added inline comments to D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.
Aug 24 2017, 4:03 PM