Page MenuHomeFreeBSD

alc (Alan Cox)
User

Projects

User Details

User Since
Dec 14 2014, 5:52 AM (248 w, 5 d)

Recent Activity

Today

alc added a comment to D21685: Try translating the fault address with pmap unlocked..

As an aside, I want to ask why pmap_update_entry() both blocks interrupts and performs critical_enter()? Doesn't the former suffice?

Fri, Sep 20, 5:44 PM
alc updated the diff for D21685: Try translating the fault address with pmap unlocked..

On a translation fault within the kernel address space, perform a lock-free page table walk that handles transient invalidations due to superpage promotion or demotion.

Fri, Sep 20, 3:22 PM
alc commandeered D21685: Try translating the fault address with pmap unlocked..
Fri, Sep 20, 2:59 PM

Yesterday

alc committed rS352541: MF r349585.
MF r349585
Thu, Sep 19, 8:45 PM
alc added a comment to D21685: Try translating the fault address with pmap unlocked..

Here is a patch based on Mark's suggestion.

Index: arm64/arm64/pmap.c
===================================================================
--- arm64/arm64/pmap.c  (revision 352346)
+++ arm64/arm64/pmap.c  (working copy)
@@ -5810,13 +5810,21 @@ pmap_fault(pmap_t pmap, uint64_t esr, uint64_t far
        case ISS_DATA_DFSC_TF_L1:
        case ISS_DATA_DFSC_TF_L2:
        case ISS_DATA_DFSC_TF_L3:
+               if (pmap == kernel_pmap) {
+                       /*
+                        * The translation fault may have occurred within a
+                        * critical section.  Therefore, we must determine if
+                        * the address was invalid due to a break-before-make
+                        * sequence without acquiring the pmap lock.
+                        */
+                       if (pmap_kextract(far) != 0)
+                               rv = KERN_SUCCESS;
+                       break;
+               }
                PMAP_LOCK(pmap);
                /* Ask the MMU to check the address */
                intr = intr_disable();
-               if (pmap == kernel_pmap)
-                       par = arm64_address_translate_s1e1r(far);
-               else
-                       par = arm64_address_translate_s1e0r(far);
+               par = arm64_address_translate_s1e0r(far);
                intr_restore(intr);
                PMAP_UNLOCK(pmap);
Thu, Sep 19, 6:00 PM
alc committed rS352519: MFC r350335:.
MFC r350335:
Thu, Sep 19, 3:12 PM
alc committed rS352517: MFC r349526:.
MFC r349526:
Thu, Sep 19, 2:36 PM

Wed, Sep 18

alc added a comment to D21685: Try translating the fault address with pmap unlocked..

I don't see why this is sufficient. If the unlocked translation attempt fails, e.g., because the mapping is being promoted or demoted a second time, we may still attempt to acquire a mutex in a scenario where that is not allowed.
Can we use the same trick that pmap_kextract() does to identify mappings that are transiently invalid?

Wed, Sep 18, 3:18 PM
alc committed rS352485: MFC r349768.
MFC r349768
Wed, Sep 18, 2:38 PM
alc committed rS352484: MFC r350546.
MFC r350546
Wed, Sep 18, 2:27 PM
alc committed rS352475: MFC r350463.
MFC r350463
Wed, Sep 18, 7:25 AM
alc committed rS352469: MFC r350347.
MFC r350347
Wed, Sep 18, 7:01 AM

Tue, Sep 17

alc committed rS352452: MFC r349117, r349122, r349183, r349897, r349943, r350004, r350029, r350038,.
MFC r349117, r349122, r349183, r349897, r349943, r350004, r350029, r350038,
Tue, Sep 17, 5:28 PM
alc resigned from D20675: Eliminate incorrect 2 pagesized block skipping while preparing swap device..
Tue, Sep 17, 5:36 AM

Mon, Sep 16

alc accepted D21668: remove dead code from vm_map_entry_unlink.

When you commit the log entry should: (1) reference the prior commit when this code became dead code and (2) say, "Tested by: pho (as part of a larger change)"

Mon, Sep 16, 11:21 PM
alc added a comment to D21642: Remove more unused identifiers from r351198..
In D21642#472706, @kib wrote:
In D21642#472503, @alc wrote:

Yes, I put the const in the wrong place. So, code within vm_page.c will avoid the dereference, but code outside vm_page.c that uses the declaration from vm_page.h will still perform the dereference. Declaring vm_page_array as an array would deal with both cases.

I'm still missing something. vm_page_array cannot be defined in vm_page.h, and if it is not defined there I do not see how the compiler can elide a memory dereference.

It cannot be defined, but can be declared, same as now. Or I do not understand what you are trying to say.

I also do not see how an array symbol can be initialized to VM_MIN_KERNEL_ADDRESS.

You use .set in some asm file. No C definition is provided.

Mon, Sep 16, 5:28 PM
alc committed rS352374: MFC r349323, r349442, r349866, r349975.
MFC r349323, r349442, r349866, r349975
Mon, Sep 16, 4:54 AM
alc committed rS352373: MFC r349003, r349031, r349042, r349129, r349290, r349618, r349798.
MFC r349003, r349031, r349042, r349129, r349290, r349618, r349798
Mon, Sep 16, 2:32 AM

Sun, Sep 15

alc committed rS352368: MFC r349070.
MFC r349070
Sun, Sep 15, 9:32 PM
alc committed rS352367: MFC r349905.
MFC r349905
Sun, Sep 15, 9:27 PM
alc added a comment to D21642: Remove more unused identifiers from r351198..
In D21642#472496, @alc wrote:
In D21642#472450, @alc wrote:

We could go further and make vm_page_array a const pointer on amd64, ...

If you start down this path, I would go a step further: Declare vm_page_array as an array, instead of a pointer, and define the symbol's value to VM_MIN_KERNEL_ADDRESS. This would eliminate the memory dereference to retrieve the start of the array.

Sorry, I don't quite see what you are proposing. My suggestion also eliminated the memory dereference, but I don't see how declaring vm_page_array as an array would help anything.

I'm afraid not. Consider this simplified example.

const int *ptr = (int *)0x10000000;
...

My proposal would be equivalent to int *const ptr = (int *)0x10000000, which yields

func:                                   # @func                                              
        .cfi_startproc                                                                         
# %bb.0:                                # %entry                  
        pushq   %rbp                           
        .cfi_def_cfa_offset 16
        .cfi_offset %rbp, -16                                                                  
        movq    %rsp, %rbp                                                                     
        .cfi_def_cfa_register %rbp                                                             
        movl    268435456, %eax                
        popq    %rbp                       
        .cfi_def_cfa %rsp, 8                                                                   
        retq                                                                                   
.Lfunc_end0:                                                                                   
        .size   func, .Lfunc_end0-func                                                                                                                                                        
        .cfi_endproc
Sun, Sep 15, 8:46 PM
alc added a comment to D21642: Remove more unused identifiers from r351198..
In D21642#472450, @alc wrote:

We could go further and make vm_page_array a const pointer on amd64, ...

If you start down this path, I would go a step further: Declare vm_page_array as an array, instead of a pointer, and define the symbol's value to VM_MIN_KERNEL_ADDRESS. This would eliminate the memory dereference to retrieve the start of the array.

Sorry, I don't quite see what you are proposing. My suggestion also eliminated the memory dereference, but I don't see how declaring vm_page_array as an array would help anything.

Sun, Sep 15, 8:11 PM
alc accepted D21642: Remove more unused identifiers from r351198..

We could go further and make vm_page_array a const pointer on amd64, ...

If you start down this path, I would go a step further: Declare vm_page_array as an array, instead of a pointer, and define the set the symbol's value to VM_MIN_KERNEL_ADDRESS. This would eliminate the memory dereference to retrieve the start of the array.

Sun, Sep 15, 5:42 PM
alc committed rS352358: MFC r348828.
MFC r348828
Sun, Sep 15, 5:22 PM
alc added a comment to D20664: Changes to vm_map_lookup_entry.

How does this perform on the microbenchmarks that you've previously used?

Sun, Sep 15, 4:01 AM
alc added inline comments to D21548: (vm object 1) Replace busy checks and sleeps with acquires where it is trivial to do so..
Sun, Sep 15, 3:27 AM
alc added inline comments to D21642: Remove more unused identifiers from r351198..
Sun, Sep 15, 1:40 AM

Sat, Sep 14

alc accepted D21639: Fix some issues with r352110..
Sat, Sep 14, 10:37 PM

Fri, Sep 13

alc added inline comments to D21642: Remove more unused identifiers from r351198..
Fri, Sep 13, 6:37 PM
alc added a comment to D21566: Improve MD page fault handlers..
In D21566#470372, @kib wrote:

... Note that vm_map_wire() use of vm_fault() caused ktrace recording the wired pages, which IMO is not right.

Fri, Sep 13, 5:48 PM
alc added inline comments to D21566: Improve MD page fault handlers..
Fri, Sep 13, 3:01 PM

Thu, Sep 12

alc added inline comments to D21566: Improve MD page fault handlers..
Thu, Sep 12, 5:43 PM
alc added inline comments to D21566: Improve MD page fault handlers..
Thu, Sep 12, 3:27 PM

Wed, Sep 11

alc added inline comments to D21566: Improve MD page fault handlers..
Wed, Sep 11, 11:26 PM
alc added inline comments to D21566: Improve MD page fault handlers..
Wed, Sep 11, 11:11 PM
alc added inline comments to D21566: Improve MD page fault handlers..
Wed, Sep 11, 10:46 PM

Tue, Sep 10

alc added inline comments to D20486: Change synchonization rules for page reference counting..
Tue, Sep 10, 2:59 PM

Fri, Sep 6

alc added inline comments to D21255: Remove the page lock dependency on busy sleeps by using sleepq to avoid races..
Fri, Sep 6, 2:41 PM
alc accepted D21255: Remove the page lock dependency on busy sleeps by using sleepq to avoid races..
Fri, Sep 6, 2:37 PM

Tue, Sep 3

alc accepted D21495: Fix nits in pmap_page_array_startup()..

Thank you.

Tue, Sep 3, 6:35 PM
alc added a comment to D21491: Keep the vm_page array mapped into KVA on amd64..

I believe that PA_MIN_ADDRESS can be removed, as it was introduced by the original version of this code.

Tue, Sep 3, 3:19 PM

Mon, Sep 2

alc accepted D21352: procctl(PROC_STACKGAP_CTL).
Mon, Sep 2, 10:43 PM
alc added inline comments to D21352: procctl(PROC_STACKGAP_CTL).
Mon, Sep 2, 9:19 PM
alc added inline comments to D21352: procctl(PROC_STACKGAP_CTL).
Mon, Sep 2, 4:27 PM

Sun, Sep 1

alc added inline comments to D21352: procctl(PROC_STACKGAP_CTL).
Sun, Sep 1, 9:44 PM
D21352: procctl(PROC_STACKGAP_CTL) is now accepted and ready to land.

The code looks good.

Sun, Sep 1, 5:22 PM
alc added inline comments to D21352: procctl(PROC_STACKGAP_CTL).
Sun, Sep 1, 12:35 AM
alc added a comment to D21352: procctl(PROC_STACKGAP_CTL).
In D21352#467724, @kib wrote:

Split current program stackgap state control, and the state after execve. Current state can be only disabled, while state after exec can be re-enabled.

Sun, Sep 1, 12:20 AM

Sat, Aug 31

alc added a comment to D21352: procctl(PROC_STACKGAP_CTL).

My only hesitation from "rubber stamping" this change is the difficulty of describing the complete semantics of this option, specifically, the semantics that arise when reenabling the stack gap. Suppose that we have disabled and then reenabled stack gaps for a process. Stacks created before gaps are disabled will then have their original gap enforced, unless the stack has grown into the previously defined gap region. Stacks created while gaps are disabled will have no gap after gaps are reenabled. Consequently, I want to ask: Should we actually support reenabling stack gaps on a process once they've been disabled?

Sat, Aug 31, 4:57 PM
alc added inline comments to D21352: procctl(PROC_STACKGAP_CTL).
Sat, Aug 31, 4:05 PM

Wed, Aug 28

alc added a comment to D21424: Remove some assertions related to queue state and wiring..
In D21424#466489, @kib wrote:

I do not object against the removals. But please note that we had systematic bugs where pages were 'forgotten' to be added to a queue after allocation or some uses, essentially making them unswappable. So some way to assert that the page is visible to the pagedaemon is needed.

Wed, Aug 28, 2:53 PM
D21440: Use vm_page_grab(VM_ALLOC_WIRED) when appropriate. is now accepted and ready to land.
Wed, Aug 28, 2:49 PM

Tue, Aug 27

alc added inline comments to D21420: Fix logic in domainset_empty_vm()..
Tue, Aug 27, 2:39 PM

Mon, Aug 26

alc added a comment to D21352: procctl(PROC_STACKGAP_CTL).
In D21352#465323, @kib wrote:

That said, do you have any objections or suggestions for the procctl(2) knob ?

Mon, Aug 26, 6:03 AM

Fri, Aug 23

alc accepted D21384: Store stack_guard_page * PAGE_SIZE into the gap->next_read field at the time of the stack creation..
Fri, Aug 23, 10:14 PM
alc added a comment to D21352: procctl(PROC_STACKGAP_CTL).

I would also like to suggest that we store the stack_guard_page value that was in force at the time that the guard entry was created in the map entry and use this stored value in vm_map_growstack(). The next_read field should be unused by guard entries.

Fri, Aug 23, 4:18 PM
alc added a comment to D21352: procctl(PROC_STACKGAP_CTL).
In D21352#465082, @kib wrote:
In D21352#465045, @alc wrote:
In D21352#464986, @kib wrote:

There is one detail, I forgot it, and think that it is mostly irrelevant, but it is still interesting. This was the reason why vm_map_protect() does clip and then skips GUARD entries. When creating a guarded stack, libthr does mmap(total size of stack and guard), then mprotect(guard). This clips out part of the guard to the permanent guard, the rest is the shrinking entry.

Actually, I think that it is relevant. As far as I can tell, the explicitly created libthr guard doesn't disable the security.bsd.stack_guard_page guard. If so, we are actually killing the thread before it hits the libthr guard.

Well, it is relevant in the sense that the guard is created. I mean that it is irrelevant whether libthr does mprotect() or mmap(MAP_GUARD) to create the extra guard at the bottom.

Fri, Aug 23, 4:10 PM
alc added a comment to D21352: procctl(PROC_STACKGAP_CTL).
In D21352#464986, @kib wrote:
In D21352#464984, @alc wrote:
In D21352#464943, @kib wrote:
In D21352#464862, @alc wrote:

Yes?

Almost, but not quite. There is pthread feature where thread can be created with some attributes. Among them are the stack size and guard size. We create the stack of the specified size as you described (two entries, guard has reserved part which cannot be grown into), and below the libthr.so puts yet another guard entry, of the size specified in the attribute. JVM hacks on that guard, not on the implicit kernel guard.

Does libthr create a third, guard entry by default, or only if you pass an explicit "attr"?

Default guard size is zero, so it is not created explicitly. Third is it because you count two guards and mapped part ?

Fri, Aug 23, 6:18 AM
alc accepted D21357: Move OBJT_VNODE specific code from vm_object_terminate() to vnode_destroy_vobject()..
Fri, Aug 23, 5:42 AM

Thu, Aug 22

alc added inline comments to D21368: Fix some issues in vm_pqbatch_process_page()..
Thu, Aug 22, 10:46 PM
alc accepted D21372: Do not clear page flags in vm_page_pqbatch_submit()..
Thu, Aug 22, 10:12 PM
D20814: Make entry simplification one-way is now accepted and ready to land.

This looks good. Can you please repeat the performance test with and without this patch.

Thu, Aug 22, 9:52 PM
alc accepted D21369: Make vm_pqbatch_submit_page() externally visible..
Thu, Aug 22, 9:45 PM
alc added a comment to D21352: procctl(PROC_STACKGAP_CTL).
In D21352#464943, @kib wrote:
In D21352#464862, @alc wrote:

Yes?

Almost, but not quite. There is pthread feature where thread can be created with some attributes. Among them are the stack size and guard size. We create the stack of the specified size as you described (two entries, guard has reserved part which cannot be grown into), and below the libthr.so puts yet another guard entry, of the size specified in the attribute. JVM hacks on that guard, not on the implicit kernel guard.

Thu, Aug 22, 8:20 PM
alc added a comment to D21352: procctl(PROC_STACKGAP_CTL).
In D21352#464519, @kib wrote:
In D21352#464514, @alc wrote:

If I remember correctly, this would be applicable to openjdk. Specifically, my recollection is that there were different classes of threads, such as system maintenance threads and application code execution threads, and we wound up with two guard entries for application threads, one created by the kernel and another by the JVM.

See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=239894

Thu, Aug 22, 5:17 PM

Aug 21 2019

alc added a comment to D21352: procctl(PROC_STACKGAP_CTL).

If I remember correctly, this would be applicable to openjdk. Specifically, my recollection is that there were different classes of threads, such as system maintenance threads and application code execution threads, and we wound up with two guard entries for application threads, one created by the kernel and another by the JVM.

Aug 21 2019, 4:13 PM
alc added a comment to D20814: Make entry simplification one-way.

Simplify the patch.

Aug 21 2019, 4:00 PM

Aug 20 2019

alc accepted D21343: Avoid requeuing active pages in vm_swapout_object_deactivate_pages()..
Aug 20 2019, 10:51 PM
alc added inline comments to D21341: Slightly simplify page queue code..
Aug 20 2019, 3:12 PM
alc added inline comments to D21331: Eliminate redundant code by introducing a vm_page_grab_valid().
Aug 20 2019, 3:01 PM

Aug 17 2019

alc added a comment to D20814: Make entry simplification one-way.

My major concern with this change is that it is doing two things at once, and I would ask that these two things be done separately, so that they can be individually evaluated. As the title says, this change is making simplification one way. However, most of this change is actually about decommissioning the "prev" pointer in the map entry.

Aug 17 2019, 6:20 PM

Aug 16 2019

alc accepted D21272: Fix 'lock "vm reserv" already initialized' panic..
Aug 16 2019, 3:27 PM

Aug 14 2019

alc accepted D21191: Don't trim bsd_label.
Aug 14 2019, 7:25 PM
alc added a comment to D21191: Don't trim bsd_label.

I would recommend "MFC after: 3 days".

Aug 14 2019, 4:37 PM
D21191: Don't trim bsd_label is now accepted and ready to land.
Aug 14 2019, 4:36 PM
alc added inline comments to D21191: Don't trim bsd_label.
Aug 14 2019, 3:51 PM

Aug 9 2019

alc added inline comments to D21191: Don't trim bsd_label.
Aug 9 2019, 4:28 AM

Aug 8 2019

alc added inline comments to D21191: Don't trim bsd_label.
Aug 8 2019, 11:11 PM
alc added inline comments to D21191: Don't trim bsd_label.
Aug 8 2019, 10:55 PM
alc added inline comments to D21191: Don't trim bsd_label.
Aug 8 2019, 10:50 PM
alc added a comment to D20675: Eliminate incorrect 2 pagesized block skipping while preparing swap device..

Please take a look at the latest comment on D20599. It shows that the two-page reservation is required to preserve the BSD label when the swap partition is the first partition within a slice. Therefore, I recommend that this proposed change be abandoned.

Aug 8 2019, 5:05 PM
alc added inline comments to D20599: simple trimming before swapon.
Aug 8 2019, 4:45 PM
alc committed rS350741: Ordinarily, during a superpage promotion or demotion within a pmap, the.
Ordinarily, during a superpage promotion or demotion within a pmap, the
Aug 8 2019, 6:26 AM
Herald added a reviewer for D21169: Handle concurrent pmap_kextract() and promotion/demotion on arm64: manu.
Aug 8 2019, 6:26 AM

Aug 7 2019

D21169: Handle concurrent pmap_kextract() and promotion/demotion on arm64 now requires review to proceed.

Deindent most of pmap_kextract().

Aug 7 2019, 4:09 PM

Aug 6 2019

alc added a comment to D20863: prototype for trimming.

Before we get into the design of the priority mechanism, I'd like to discuss some of the test results and what I think that we should do in response to them.

Aug 6 2019, 10:07 PM
alc added reviewers for D21169: Handle concurrent pmap_kextract() and promotion/demotion on arm64: andrew, markj.
Aug 6 2019, 6:17 PM
alc added a comment to D21149: Enable superpage promotion within the kernel pmap on arm64.

Please try D21169 in combination with this change.

Aug 6 2019, 3:20 PM
alc created D21169: Handle concurrent pmap_kextract() and promotion/demotion on arm64.
Aug 6 2019, 3:15 PM

Aug 5 2019

alc added a comment to D21149: Enable superpage promotion within the kernel pmap on arm64.

Hasn't this opened a raise between promotion/demotion and pmap_kextract? pmap_kextract walks the page table so may find a zero entry from the break-before-make sequence.

Aug 5 2019, 6:19 PM
Herald added a reviewer for D21149: Enable superpage promotion within the kernel pmap on arm64: manu.
Aug 5 2019, 2:44 AM
alc committed rS350579: Enable superpage promotion within the kernel pmap..
Enable superpage promotion within the kernel pmap.
Aug 5 2019, 2:44 AM

Aug 3 2019

alc created D21149: Enable superpage promotion within the kernel pmap on arm64.
Aug 3 2019, 5:32 PM
alc accepted D20327: Don't reset memory attributes when mapping physical addresses for ACPI..
Aug 3 2019, 1:12 AM

Aug 2 2019

alc committed rS350546: Because of AArch64's weak memory consistency model, we need to include a.
Because of AArch64's weak memory consistency model, we need to include a
Aug 2 2019, 10:37 PM
Herald added a reviewer for D21126: Fixes to page table page mapping: manu.
Aug 2 2019, 10:37 PM
alc added a comment to D21126: Fixes to page table page mapping.

Suppose we pmap_enter() a zero-filled page. How do we guarantee that the PTP update isn't visible to another CPU before the page's contents are zero'ed?

Aug 2 2019, 4:29 PM

Aug 1 2019

alc added a comment to D21126: Fixes to page table page mapping.

Add a comment to pmap_demote_l2_locked() about pmap_fill_l3().

Aug 1 2019, 3:49 PM
alc added inline comments to D20863: prototype for trimming.
Aug 1 2019, 5:56 AM
alc added a comment to D20863: prototype for trimming.
In D20863#455860, @alc wrote:

Mark, I discussed this patch with Doug today and recommended that he modify swapdev_trim() so that it never sleeps and instead returns NULL. In particular, I don't think that we want the laundry thread sleeping because of a trim operation encountering a shortage of free memory. Thoughts?

Sorry, I missed this. Yes, I think that's correct. We occasionally see deadlocks in CAM in Peter's tests, where we rely on a M_WAITOK allocation succeeding in the I/O path.
I have been wondering lately if it would be reasonable to let the laundry thread maintain a chunk of reserved memory to handle these cases, but I think we should avoid introducing new ones whenever possible.

Aug 1 2019, 5:50 AM