alc (Alan Cox)
User

Projects

User Details

User Since
Dec 14 2014, 5:52 AM (144 w, 4 d)

Recent Activity

Today

alc committed rS323868: Modernize calls to vm_page_unwire(). As of r288122, vm_page_unwire().
Thu, Sep 21, 3:32 PM

Yesterday

alc added inline comments to D11943: Modify vm_page_wire() to not dequeue the specified page.
Wed, Sep 20, 4:26 PM
alc committed rS323786: In r288122, we changed vm_page_unwire() so that it returns a Boolean.
Wed, Sep 20, 5:00 AM
alc committed rS323785: Sync with amd64/arm/arm64/i386/mips pmap change r288256:
Wed, Sep 20, 4:20 AM

Tue, Sep 19

alc accepted D12411: For unlinked files, do not msync(2) or sync on inactivation.
Tue, Sep 19, 4:15 PM
alc added a comment to D12411: For unlinked files, do not msync(2) or sync on inactivation.

As long as the page daemon will still launder the pages when we are short of memory, I don't see a problem with msync() ignoring them.

Tue, Sep 19, 3:50 PM

Sun, Sep 17

alc committed rS323681: MFC r321840,322041.
Sun, Sep 17, 4:46 PM
alc committed rS323665: MFC r321423.
Sun, Sep 17, 4:15 AM
alc committed rS323664: MFC r321102.
Sun, Sep 17, 3:44 AM
alc committed rS323663: MFC r322404.
Sun, Sep 17, 3:34 AM
alc committed rS323662: MFC r322296.
Sun, Sep 17, 3:17 AM

Sat, Sep 16

alc committed rS323656: Modify blst_leaf_alloc to take only the cursor argument.
Sat, Sep 16, 6:12 PM
alc closed D11819: Allow blist allocations to span leaf boundaries, but not meta boundaries.
Sat, Sep 16, 6:12 PM
alc accepted D11819: Allow blist allocations to span leaf boundaries, but not meta boundaries.

The next to last change to this patch that zeroed the bitmap in terminator nodes addressed my last concern. I'm going to commit this patch shortly.

Sat, Sep 16, 5:46 PM

Wed, Sep 13

alc added inline comments to D11968: Simplify blist initialization.
Wed, Sep 13, 4:38 PM
alc accepted D11819: Allow blist allocations to span leaf boundaries, but not meta boundaries.
Wed, Sep 13, 4:27 PM
alc added a comment to D11819: Allow blist allocations to span leaf boundaries, but not meta boundaries.

I've performed some testing with the new sysctl that calls blist_stats(). Specifically, I've configured a test machine with 2GB of RAM and 64GB of swap space, and run a "make -j7 buildworld" in a loop. After 6 consecutive builds, the before and after results are as follows.

Wed, Sep 13, 4:25 PM

Sun, Sep 10

alc added a comment to D12281: Move vmmeter atomic counters into dedicated cache lines.
In D12281#255294, @mjg wrote:
-       u_int v_free_count;     /* (f) pages free */
        u_int v_inactive_target; /* (c) pages desired inactive */
        u_int v_pageout_free_min;   /* (c) min pages reserved for kernel */
        u_int v_interrupt_free_min; /* (c) reserved pages for int code */
        u_int v_free_severe;    /* (c) severe page depletion point */
+       u_int v_free_count VMMETER_ALIGNED;     /* (f) pages free */
        u_int v_wire_count VMMETER_ALIGNED; /* (a) pages wired down */

Made no difference in the buildkernel test, even with kib's page batching patch. Something of the sort will definitely have to be done later anyway and probably helps a little elsewhere, so I can include this bit in the patch if you want.

Sun, Sep 10, 6:42 PM
alc added a comment to D12281: Move vmmeter atomic counters into dedicated cache lines.

Have you evaluated the effects of isolating the free count? The fields surrounding it are read only.

Sun, Sep 10, 6:07 PM
alc accepted D12281: Move vmmeter atomic counters into dedicated cache lines.
Sun, Sep 10, 6:00 PM
alc committed rS323391: To analyze the allocation of swap blocks by blist functions, add a method.
Sun, Sep 10, 5:46 PM
alc closed D11906: est tool, so that 's' writes data about free space distribution.
Sun, Sep 10, 5:46 PM

Sat, Sep 9

alc added inline comments to D11943: Modify vm_page_wire() to not dequeue the specified page.
Sat, Sep 9, 11:26 PM
alc added inline comments to D11943: Modify vm_page_wire() to not dequeue the specified page.
Sat, Sep 9, 10:48 PM
alc added inline comments to D11943: Modify vm_page_wire() to not dequeue the specified page.
Sat, Sep 9, 10:33 PM

Thu, Sep 7

alc accepted D12248: Speed up vm_page_array initialization.
Thu, Sep 7, 4:58 PM
alc added inline comments to D12248: Speed up vm_page_array initialization.
Thu, Sep 7, 3:52 PM

Mon, Sep 4

alc added inline comments to D11906: est tool, so that 's' writes data about free space distribution.
Mon, Sep 4, 4:52 PM

Fri, Sep 1

alc added a comment to D11906: est tool, so that 's' writes data about free space distribution.

swap_pager.c needs #include <sys/sbuf.h>

Fri, Sep 1, 2:32 AM
alc added inline comments to D11906: est tool, so that 's' writes data about free space distribution.
Fri, Sep 1, 2:30 AM
alc added inline comments to D11906: est tool, so that 's' writes data about free space distribution.
Fri, Sep 1, 2:28 AM

Mon, Aug 28

alc committed rS322971: Update a couple vm_object lock assertions in the swap pager to reflect the.
Mon, Aug 28, 5:02 PM
alc committed rS322970: Switching from a global hash table to per-vm_object radix tries for mapping.
Mon, Aug 28, 4:56 PM
alc closed D12134: Update vm object lock assertions in the swap pager by committing rS322970: Switching from a global hash table to per-vm_object radix tries for mapping.
Mon, Aug 28, 4:56 PM
alc added a comment to D12134: Update vm object lock assertions in the swap pager.
In D12134#251911, @kib wrote:
In D12134#251899, @alc wrote:

This change reveals a problem: vm_fault_soft_fast() is (indirectly) calling vm_pager_page_unswapped() with only a read lock.

My understanding is that this is a valid assert in the following situation: we have a swap object with a valid page which was swapped in and not dirtied. Then, on a write fault, any (fast or normal) fault handlers could free the swap space. It is possible because swap pager page-in leaves the backing swap space intact as an (IMO very small) optimization.

Mon, Aug 28, 4:36 AM
alc updated the diff for D12134: Update vm object lock assertions in the swap pager.

Update vm_fault_dirty().

Mon, Aug 28, 4:32 AM

Sun, Aug 27

alc added a comment to D12134: Update vm object lock assertions in the swap pager.

This change reveals a problem: vm_fault_soft_fast() is (indirectly) calling vm_pager_page_unswapped() with only a read lock.

Sun, Aug 27, 5:46 AM

Sat, Aug 26

alc created D12134: Update vm object lock assertions in the swap pager.
Sat, Aug 26, 9:11 PM
alc added inline comments to D11906: est tool, so that 's' writes data about free space distribution.
Sat, Aug 26, 5:02 PM
alc accepted D12084: Synchronize page laundering with pmap_extract_and_hold().
Sat, Aug 26, 4:34 PM

Fri, Aug 25

alc committed rS322897: Correct a regression in the previous change, r322459. Specifically, the.
Fri, Aug 25, 6:47 PM
alc closed D12106: Allocation of last block requires blist cursor reset by committing rS322897: Correct a regression in the previous change, r322459. Specifically, the.
Fri, Aug 25, 6:47 PM
alc accepted D12106: Allocation of last block requires blist cursor reset.
Fri, Aug 25, 6:17 PM
alc added a comment to D12106: Allocation of last block requires blist cursor reset.

A sentinel is not sufficient. While cursor sits at the end of managed memory, we start with scan = bl_root, and compute blk == cursor, so child == 0, so we loop over the 16 subchildren of root and if we find a free block at p, we report a free block at p+radix. Nothing gets us to actually look at the new sentinel, unless we make the tree one level higher, with the 2nd child of the root consisting of the sentinel only.

Fri, Aug 25, 6:17 PM
alc accepted D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.
Fri, Aug 25, 5:33 PM
alc added a comment to D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.

I've now completed 8+ hours of testing under a "make -j7 buildworld" workload on 1.5GB of RAM swapping to a Samsung 850 PRO, both with and without the patch. I think that there is too much variability in the execution time to conclude anything about it. However, here is the memory utilization story:

Fri, Aug 25, 5:32 PM

Thu, Aug 24

alc added inline comments to D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.
Thu, Aug 24, 4:03 PM
alc added inline comments to D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.
Thu, Aug 24, 4:01 PM
alc added inline comments to D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.
Thu, Aug 24, 3:18 PM
alc added a comment to D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.
In D11435#251430, @kib wrote:
In D11435#251395, @alc wrote:

I've tried to create a test where the vm object has discontiguous swap space usage and the radix tree should be advantageous. In particular, I've tried to create a situation where vm object destruction should be faster, but I'm getting inconsistent results.

In regards to memory utilization, I've looked at a best-case scenario for the old code, where you have contiguous swap space usage. (To be clear, I'm talking about contiguous page index ranges within the vm object being swapped out, not contiguous on-disk storage.) My scenario was using about 48GB of swap space, and the total kernel memory used by the new radix trie code was slightly less than the old code if you account for the size of the swhash array.

Isn't the main advantage of the new code the fact that the per-object tracking of the pindex->swap block relation is protected by the vm object lock instead of the global hash lock? If the memory usage by the new code is approximately the same, as expected, then this reason alone should be significant.

Thu, Aug 24, 3:15 PM
alc added a comment to D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.

I've tried to create a test where the vm object has discontiguous swap space usage and the radix tree should be advantageous. In particular, I've tried to create a situation where vm object destruction should be faster, but I'm getting inconsistent results.

Thu, Aug 24, 4:51 AM

Wed, Aug 23

alc added a comment to D12106: Allocation of last block requires blist cursor reset.

I have run numerous tests since r322459 was committed that have wrapped around the swap area without crashing, so there must be another prerequisite to a crash: Your swap area is a fully allocated tree, i.e., the number of blocks equals the radix. Can we also solve this problem by placing a sentinel entry at the end of a fully allocated tree?

Wed, Aug 23, 3:19 PM

Aug 22 2017

alc added a comment to D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.

Just an FYI, I'm doing some performance testing while sipping my morning coffee.

Aug 22 2017, 4:51 PM

Aug 17 2017

alc added inline comments to D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.
Aug 17 2017, 3:55 PM

Aug 16 2017

alc added a comment to D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.
In D11435#249802, @kib wrote:
In D11435#249799, @alc wrote:

By the way, once upon a time, you asked me about whether we should continue using the blist allocator for swap space or switch to vmem. Perhaps the most compelling reason, which I failed to give at the time, is that every allocated range from an arena consumes a boundary tag. In other words, while vmem coalesces free ranges, it maintains a boundary tag for each allocated range.

Ok.

Also, from a functionality standpoint, with vmem, you have to free an entire allocation at once. In order to support the pager's unswapped function, we would have to allocate swap space one page at a time; so in fact we would wind up with a boundary tag in use for every page of allocated swap space.

Or enhance vmem to allow partial free, with a specialized interface. I think that this could be done in a way similar to the stack gap handling for edges, and new bt allocations if in middle.
I do not propose to do this, just discussing a hypothetical possibility.

Aug 16 2017, 4:07 PM
alc added a comment to D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.

By the way, once upon a time, you asked me about whether we should continue using the blist allocator for swap space or switch to vmem. Perhaps the most compelling reason, which I failed to give at the time, is that every allocated range from an arena consumes a boundary tag. In other words, while vmem coalesces free ranges, it maintains a boundary tag for each allocated range.

Aug 16 2017, 3:47 PM
alc added inline comments to D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.
Aug 16 2017, 3:04 PM

Aug 15 2017

alc accepted D11688: Add OBJ_PG_DTOR flag to VM object.
Aug 15 2017, 4:48 PM
alc added inline comments to D11113: Intel SGX driver.
Aug 15 2017, 4:01 PM
alc added a comment to D11688: Add OBJ_PG_DTOR flag to VM object.
In D11688#249553, @br wrote:
In D11688#249549, @alc wrote:

I have no objections to this change. I'm just interested in hearing a little more about why it is needed. In your SGX driver are you essentially maintaining a private free page list for managing the pool of physical memory that backs enclaves?

Exactly. We have the enclave page cache (EPC) memory pool reserved by the processor, and we register it as a fictitious range of physical memory. That EPC pool is encrypted and backs the enclave's control information, code, stack, data, etc.
https://reviews.freebsd.org/D11113

Aug 15 2017, 3:45 PM
alc added a comment to D11688: Add OBJ_PG_DTOR flag to VM object.

I have no objections to this change. I'm just interested in hearing a little more about why it is needed. In your SGX driver are you essentially maintaining a private free page list for managing the pool of physical memory that backs enclaves?

Aug 15 2017, 2:36 PM
alc added inline comments to D11943: Modify vm_page_wire() to not dequeue the specified page.
Aug 15 2017, 5:10 AM
alc accepted D11984: Add vm_page_alloc_after().

I'm happy with this.

Aug 15 2017, 4:00 AM
alc added inline comments to D11984: Add vm_page_alloc_after().
Aug 15 2017, 3:35 AM

Aug 13 2017

alc added a comment to D11784: Dynamically grow the slab size to control wasted memory.
In D11784#248990, @jeff wrote:

Three broad comments:

We don't necessarily know which allocations will be handed off to do DMA on. This may be safe as busdma becomes more sophisticated but in the past network drivers definitely assumed mbufs were contiguous, for example. There may still be advantages to allocating aligned kva or possibly just contig physical addresses and using the direct map. You could in principle still use a constant offset from the aligned boundary to find the slab. You would also produce shorter scatter/gather descriptors for those zones that are involved in dma. Eliminating vtoslab may not be that important with the per-cpu caches in play but we do need to consider the implications of contiguous virtual addresses not being physically contiguous.

Aug 13 2017, 6:39 PM
alc committed rS322459: The *_meta_* functions include a radix parameter, a blk parameter, and.
Aug 13 2017, 4:40 PM
alc closed D11964: Drop blk argument to meta functions by committing rS322459: The *_meta_* functions include a radix parameter, a blk parameter, and.
Aug 13 2017, 4:40 PM
alc added a comment to D11819: Allow blist allocations to span leaf boundaries, but not meta boundaries.

I've asked Doug to create some new statistics code (see D11906) so that we can quantify the effects of this patch.

Aug 13 2017, 4:11 PM
alc added inline comments to D11906: est tool, so that 's' writes data about free space distribution.
Aug 13 2017, 4:08 PM

Aug 12 2017

alc added inline comments to D11943: Modify vm_page_wire() to not dequeue the specified page.
Aug 12 2017, 10:12 PM
alc accepted D11964: Drop blk argument to meta functions.

This change reduces the text size by 7% on amd64.

Aug 12 2017, 6:43 PM
alc added a comment to D11964: Drop blk argument to meta functions.

I suggest changing the "radix" field of struct blist to be unsigned as well.

Aug 12 2017, 6:02 PM
alc added a comment to D11964: Drop blk argument to meta functions.

This change doesn't compile:

../../../kern/subr_blist.c:457:46: error: too few arguments to function call,
      expected 4, have 3
                return (blst_leaf_alloc(scan, cursor, count));
                        ~~~~~~~~~~~~~~~                    ^
../../../kern/subr_blist.c:358:1: note: 'blst_leaf_alloc' declared here
static daddr_t
^
1 error generated.
Aug 12 2017, 6:01 PM
alc added a comment to D11984: Add vm_page_alloc_after().

I think that the callers should be rewritten to use vm_radix_lookup_le(), e.g.,

Aug 12 2017, 5:20 PM

Aug 11 2017

alc committed rS322404: An invalid page can't be dirty.
Aug 11 2017, 4:28 PM

Aug 10 2017

alc accepted D11945: Micro-optimize kmem_unback().
Aug 10 2017, 11:21 PM
alc updated subscribers of D11943: Modify vm_page_wire() to not dequeue the specified page.
Aug 10 2017, 6:45 PM
alc added a comment to D11942: Have vm_page_grab_pages() support VM_ALLOC_NOWAIT.

Brett, I added you to this change, because it will decrease the time spent in vm_radix_lookup() by your shm_open()/sendfile() test case.

Aug 10 2017, 6:42 PM
alc updated subscribers of D11942: Have vm_page_grab_pages() support VM_ALLOC_NOWAIT.
Aug 10 2017, 6:40 PM
alc added a comment to D11945: Micro-optimize kmem_unback().

To be clear, I'm happy with the concept, and in general eliminating vm_page_lookup() calls inside of loops.

Aug 10 2017, 4:04 PM
alc accepted D11942: Have vm_page_grab_pages() support VM_ALLOC_NOWAIT.
Aug 10 2017, 3:50 PM

Aug 9 2017

alc committed rS322296: Introduce vm_page_grab_pages(), which is intended to replace loops calling.
Aug 9 2017, 4:23 AM
alc closed D11926: Add vm_page_grab_pages() by committing rS322296: Introduce vm_page_grab_pages(), which is intended to replace loops calling.
Aug 9 2017, 4:23 AM

Aug 8 2017

alc updated the diff for D11926: Add vm_page_grab_pages().

Update the description of VM_ALLOC_NOBUSY.

Aug 8 2017, 5:02 PM
alc added a comment to D11926: Add vm_page_grab_pages().

Should vm_page_grab_pages() also include this assertion from vm_page_grab():

KASSERT((allocflags & VM_ALLOC_SBUSY) == 0 ||
    (allocflags & VM_ALLOC_IGN_SBUSY) != 0,
    ("vm_page_grab: VM_ALLOC_SBUSY/VM_ALLOC_IGN_SBUSY mismatch"));
Aug 8 2017, 4:42 PM
alc added inline comments to D11926: Add vm_page_grab_pages().
Aug 8 2017, 4:37 PM
alc created D11926: Add vm_page_grab_pages().
Aug 8 2017, 4:31 PM

Aug 4 2017

alc committed rS322041: In case readers are misled by expressions that combine multiplication and.
Aug 4 2017, 4:23 AM
alc closed D11815: Add parens to radix_to_skip for reader clarity by committing rS322041: In case readers are misled by expressions that combine multiplication and.
Aug 4 2017, 4:23 AM
alc committed rS322035: Add myself.
Aug 4 2017, 3:20 AM

Aug 2 2017

alc added inline comments to D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.
Aug 2 2017, 11:43 PM
alc accepted D11815: Add parens to radix_to_skip for reader clarity.
Aug 2 2017, 8:13 PM
alc added inline comments to D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.
Aug 2 2017, 4:51 AM
alc added inline comments to D11435: Replace global swhash in swap pager with per-object trie to track swap blocks assigned to the object pages.
Aug 2 2017, 4:47 AM

Aug 1 2017

alc accepted D11790: Batch v_wire_count updates in the pmap code.
Aug 1 2017, 5:12 AM
alc added inline comments to D11790: Batch v_wire_count updates in the pmap code.
Aug 1 2017, 4:19 AM
alc committed rS321840: The blist_meta_* routines that process a subtree take arguments 'radix' and.
Aug 1 2017, 3:51 AM

Jul 31 2017

alc accepted D11791: Batch v_wire_count decrements in vm_hold_free_pages().
Jul 31 2017, 6:33 PM
alc added inline comments to D11784: Dynamically grow the slab size to control wasted memory.
Jul 31 2017, 4:49 PM
alc added inline comments to D11790: Batch v_wire_count updates in the pmap code.
Jul 31 2017, 4:01 PM