Page MenuHomeFreeBSD

add map header sentinel to vm_map search tree
AbandonedPublic

Authored by dougm on Jan 20 2018, 7:09 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Dec 31, 5:40 AM
Unknown Object (File)
Sun, Dec 29, 12:26 PM
Unknown Object (File)
Fri, Dec 27, 7:44 PM
Unknown Object (File)
Thu, Dec 26, 6:19 AM
Unknown Object (File)
Sun, Dec 22, 1:15 PM
Unknown Object (File)
Fri, Dec 20, 10:11 AM
Unknown Object (File)
Fri, Dec 20, 9:26 AM
Unknown Object (File)
Wed, Dec 18, 2:04 PM
Subscribers

Details

Reviewers
alc
Summary

Make the map->header sentinel a part of the initial vm_map binary search tree. Change the order of search tree comparisons so that the end field is tested before the start field, so that the map->header sentinel will always be the leftmost member of the search tree. Change the method of handling search failure in splay so that the new tree root after failure always becomes the largest map entry less than the address sought; such an entry must exist because the sentinel is less than any address sought. Have splay handle tree insertions and deletions. Have splay work on maps, rather than map entries. Change the splay code to eliminate duplicated comparisons. Rename vm_map_entry_splay to vm_map_splay.

Test Plan

I request that Peter Holm apply his robustness tests to this patch.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
In D13999#295233, @dougm_rice.edu wrote:

Throw everything back in.

uhub4: 2 ports with 2 removable, self powered
uhub5: 2 ports with 2 removable, self powered
WARNING: / was not properly dismounted


Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address   = 0xc
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc0fbef60
stack pointer           = 0x28:0xd49009a0
frame pointer           = 0x28:0xd4900a00
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1 (init)
[ thread pid 1 tid 100002 ]
Stopped at      vm_map_splay+0x300:     movl    0xc(%eax),%eax
db> bt
Tracing pid 1 tid 100002 td 0xd60f9a80
vm_map_splay(bfbfe000,d6102804,d4900a20,44f,d6102804,...) at vm_map_splay+0x300/frame 0xd4900a00
vm_map_entry_unlink(d6102804,d84931dc,c1942b00,c1942b10,c16a2f0f,...) at vm_map_entry_unlink+0x90/frame 0xd4900a30
vm_map_entry_delete(d6102804,d84931dc,bfbff000,c8f,0,...) at vm_map_entry_delete+0x2f/frame 0xd4900acc
vm_map_delete(d6102804,0,bfc00000,d4900b88,bfc00000,...) at vm_map_delete+0x291/frame 0xd4900b2c
vm_map_remove(d6102804,0,bfc00000,ffffffff,d6102804,...) at vm_map_remove+0xbf/frame 0xd4900b6c
vmspace_dofree(d6102804,1,c16a131b,ffffffff,0,0,5,d6102804) at vmspace_dofree+0x36/frame 0xd4900b88
vmspace_free(d6102804,0,0,3,fffffffe,...) at vmspace_free+0x90/frame 0xd4900bb4
post_execve(d60f9a80,fffffffe,d6102804,bfbfefe4,0,...) at post_execve+0xbb/frame 0xd4900bd8
sys_execve(d60f9a80,d4900c90,0,0,d4900c80,...) at sys_execve+0x80/frame 0xd4900c40
start_init(0,d4900ce8,c1644afc,407,0,...) at start_init+0x2ae/frame 0xd4900cac
fork_exit(c0c0a390,0,d4900ce8) at fork_exit+0x7e/frame 0xd4900cd4
fork_trampoline() at fork_trampoline+0x8/frame 0xd4900cd4
--- trap 0, eip = 0, esp = 0xd4900d20, ebp = 0 ---
db> dump
Cannot dump: no dump device specified.
db> show registers
cs                0x20
ds                0x28
es                0x28
fs                 0x8
gs                0x3b
ss                0x28
eax                  0
ecx         0xd4900a20
edx         0xd6102801
ebx         0xd6102804
esp         0xd49009a0
ebp         0xd4900a00
esi              0x44f
edi         0xd6102804
eip         0xc0fbef60  vm_map_splay+0x300
efl            0x10282
vm_map_splay+0x300:     movl    0xc(%eax),%eax
db> 
(kgdb) l *vm_map_splay+0x300
0xc0fbef60 is in vm_map_splay (../../../vm/vm_map.c:1008).
1003                    root->left = NULL;
1004                    *io = root;
1005                    root = ltree;
1006                    for (;;) {
1007                            /* root is never NULL in here. */
1008                            y = root->right;
1009                            if (y == NULL)
1010                                    break;
1011                            x = root;
1012                            root = y->right;
(kgdb)

Endeavor to properly delete a tree node with no left child.

In D13999#295250, @dougm_rice.edu wrote:

Endeavor to properly delete a tree node with no left child.

Yes, that boots. Starting tests right now.

Is there an assert somewhere that the map header is never returned by a lookup ?

In D13999#295780, @kib wrote:

Is there an assert somewhere that the map header is never returned by a lookup ?

Without this change, it is possible that vm_map_lookup_entry will return with *entry == &map->header, when the root is null or when a binary search is made for an address smaller than any in the tree. So there's no assertion to fail on that condition, and it would be incorrect to add one.

That will still be possible after this change, so that there's no cause to add an assertion.

In D13999#295781, @dougm_rice.edu wrote:
In D13999#295780, @kib wrote:

Is there an assert somewhere that the map header is never returned by a lookup ?

Without this change, it is possible that vm_map_lookup_entry will return with *entry == &map->header, when the root is null or when a binary search is made for an address smaller than any in the tree. So there's no assertion to fail on that condition, and it would be incorrect to add one.

That will still be possible after this change, so that there's no cause to add an assertion.

Ok. I looked at the consumers, and either they explicitly check for the lookup address falling into the map range, or do found_entry->next if result is false.

I mean that with your previous and this changes, the header map entry looks more and more as the normal entry, which might make some futher code change erroneously fail to skip it. I want to have some protection and diagnostic for that error.

In D13999#296096, @kib wrote:
In D13999#295781, @dougm_rice.edu wrote:
In D13999#295780, @kib wrote:

Is there an assert somewhere that the map header is never returned by a lookup ?

Without this change, it is possible that vm_map_lookup_entry will return with *entry == &map->header, when the root is null or when a binary search is made for an address smaller than any in the tree. So there's no assertion to fail on that condition, and it would be incorrect to add one.

That will still be possible after this change, so that there's no cause to add an assertion.

Ok. I looked at the consumers, and either they explicitly check for the lookup address falling into the map range, or do found_entry->next if result is false.

I mean that with your previous and this changes, the header map entry looks more and more as the normal entry, which might make some futher code change erroneously fail to skip it. I want to have some protection and diagnostic for that error.

The previous change made the header map entry look less like a normal entry, since normal entries have start < end, and the header map entry did too, before the change, which is why extra checks were needed in handling it. By making it less normal, with end < start for that node only, it is now harder to mistake for a normal entry.

This change modifies tree-related fields, left and right, and adj_free/max_free, which are modified in tree manipulation. Outside tree-manipulation code here, left and right are not examined. Outside of this file, adj_free and max_free are not examined. So I don't see the big impact globally from this change. The only functions that can return &map->header after this change are the one's that could return it already.

I'm not sure how to apply the protection and diagnostic changes you seek, other than by putting back in all the &map->header tests I took out of loop bound checks in the previous change, because they were no longer necessary. I'd welcome an example of the sort of change you're looking for.

Rewrite the splay root-picking code a bit to eliminate a few redundant writes.

The patch hasn't changed since February, but some of the context around it has, so updating to deal with conflicts.

Thanks to Peter Holm for point out necessary changes to this patch on account of changes elsewhere. min_offset and max_offset fields have been renamed.

In D13999#366507, @dougm_rice.edu wrote:

Thanks to Peter Holm for point out necessary changes to this patch on account of changes elsewhere. min_offset and max_offset fields have been renamed.

I have tested this patch with all of the stress2 tests. I also ran a buildworld / installworld. No problems seen.

Drop unneeded argument to vm_map_entry_link. Drop unnecessary calls to vm_map_entry_resize_free from vm_map_simplify_entry, by setting start/end fields before removing neighbor. Reduce number of uses of 'prev' member when iterating through consecutive map entries.

Can you extract the helpers functions into separate review ?

sys/vm/vm_map.c
1669

Change the return type to bool ?

1675

Almost all () there and below are excessive, and no need in space before the excessive '(' at this line.

1678

No need in additional space for indent.

1692

Blank line is needed before start of multi-line comment.

1706

!= NULL

Fix CTR3 compilation error.

Apply recommended style changes.

Update to adjust to recent changes in adjacent code.

In D13999#376599, @dougm_rice.edu wrote:

Update to adjust to recent changes in adjacent code.

startup_alloc from "UMA Zones", 12 boot pages left
startup_alloc from "UMA Zones", 8 boot pages left
startup_alloc from "UMA Zones", 4 boot pages left
kernel trap 12 with interrupts disabled
Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex vm map (system) (vm map (system)) r = 0 (0xfffff80100001098) locked @ vm/vm_kern.c:730
stack backtrace:
#0 0xffffffff80bf9563 at ??+0
#1 0xffffffff80bfa4af at ??+0
#2 0xffffffff81069063 at ??+0
#3 0xffffffff8106869a at ??+0
#4 0xffffffff81043185 at ??+0
#5 0xffffffff80ed2843 at ??+0
#6 0xffffffff80ed0aa2 at ??+0
#7 0xffffffff80ecf717 at ??+0
#8 0xffffffff80b27ba8 at ??+0
#9 0xffffffff8034502c at ??+0


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff80ed21c8
stack pointer           = 0x28:0xffffffff8270e2d0
frame pointer           = 0x28:0xffffffff8270e2f0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 0 ()
[ thread pid 0 tid 0 ]
Stopped at      vm_map_splay+0x1c8:     movq    %r11,(%r9)
db> bt
Tracing pid 0 tid 0 td 0xffffffff8208e100
vm_map_splay() at vm_map_splay+0x1c8/frame 0xffffffff8270e2f0
vm_map_insert() at vm_map_insert+0x453/frame 0xffffffff8270e390
kmem_init() at kmem_init+0x72/frame 0xffffffff8270e3c0
vm_mem_init() at vm_mem_init+0x47/frame 0xffffffff8270e3d0
mi_startup() at mi_startup+0x118/frame 0xffffffff8270e3f0
btext() at btext+0x2c
db> x/s version
version:        FreeBSD 13.0-CURRENT #0 r339523M: Sun Oct 21 08:17:58 CEST 2018\012    pho@flix1a.net

Fix a problem recently introduced that breaks an insert of a new last item.

In D13999#376605, @dougm_rice.edu wrote:

Fix a problem recently introduced that breaks an insert of a new last item.

start_init: trying /sbin/init
Expensive timeout(9) function: 0xffffffff80a01180(0xffffffff81af3bd0) 0.010507480 s


Fatal trap 9: general protection fault while in kernel mode
cpuid = 32; apic id = 26
instruction pointer     = 0x20:0xffffffff80ed205d
stack pointer           = 0x28:0xfffffe00edb16fc0
frame pointer           = 0x28:0xfffffe00edb16fe0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 132 (ifconfig)
[ thread pid 132 tid 100753 ]
Stopped at      vm_map_splay+0x5d:      cmpq    %r8,ll+0x7(%rcx)
db> bt
Tracing pid 132 tid 100753 td 0xfffff8010f80d580
vm_map_splay() at vm_map_splay+0x5d/frame 0xfffffe00edb16fe0
vm_map_simplify_entry() at vm_map_simplify_entry+0x1c1/frame 0xfffffe00edb17010
vm_map_unwire() at vm_map_unwire+0x4b3/frame 0xfffffe00edb170a0
userland_sysctl() at userland_sysctl+0x20b/frame 0xfffffe00edb17150
sys___sysctl() at sys___sysctl+0x5f/frame 0xfffffe00edb17200
amd64_syscall() at amd64_syscall+0x293/frame 0xfffffe00edb17330
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00edb17330
--- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x80047bfda, rsp = 0x7fffffffe418, rbp = 0x7fffffffe450 ---
db> x/s version
version:        FreeBSD 13.0-CURRENT #2 r339523M: Sun Oct 21 09:50:31 CEST 2018\012    pho@flix1a.netperf.freebsd.org:/usr/src/sys/amd64/compile/PHO\012
db>

Back-up to a version that worked once, and hope that it still does.

In D13999#376620, @dougm_rice.edu wrote:

Back-up to a version that worked once, and hope that it still does.

Yes, this boots.

Updating to adapt changes to surrounding code. Hoping that this code still works.

Diff 39 looks promising. Uptime so far is one hour.

I ran 1/2 of the stress2 tests, for one day, with D13999.id50452.diff without seeing any problems.

Restore vm_map_entry_splay with its original arguments. Break its split and merge phases into separate functions, and use those functions, instead of vm_map_entry_splay, to implement link and unlink.

Ran all of the stress2 tests with D13999.51105. LGTM.

Drop after_where argument from link. Cleanup unlink.

In D13999#390115, @dougm_rice.edu wrote:

Drop after_where argument from link. Cleanup unlink.

Well, ...

real memory  = 103079215104 (98304 MB)
panic: kmem_suballoc: bad status return of 3
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff82712240
vpanic() at vpanic+0x1a3/frame 0xffffffff827122a0
panic() at panic+0x43/frame 0xffffffff82712300
kmem_suballoc() at kmem_suballoc+0xb5/frame 0xffffffff82712330
vm_ksubmap_init() at vm_ksubmap_init+0x21d/frame 0xffffffff82712370
cpu_startup() at cpu_startup+0x21c/frame 0xffffffff827123a0
mi_startup() at mi_startup+0x25f/frame 0xffffffff827123f0
btext() at btext+0x2c
KDB: enter: panic
[ thread pid 0 tid 0 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
db> x/ verson
kdb_enter+0x3b: 2504c748        (\360\367\20111000000000
db>

Fix a broken adj_free calculation in _link. Eliminate two gratuitous differences from the original.

If I do not see any other problems during tests, this one seems unrelated IMHO:
https://people.freebsd.org/~pho/stress/log/dougm016.txt

In vm_map_lookup_entry, add a KASSERT to usurp an exit case, so that I can either delete the exit case, or find the caller passing the bad argument that creates the need for the exit case and fix the caller.

In D13999#391004, @dougm_rice.edu wrote:

In vm_map_lookup_entry, add a KASSERT to usurp an exit case, so that I can either delete the exit case, or find the caller passing the bad argument that creates the need for the exit case and fix the caller.

[root@flix1a /usr/src/sys/amd64/compile/PHO]# cc -c -O0 -pipe -fno-strict-aliasing  -g -nostdinc  -I. -I../../.. -I../../../contrib/ck/include -I../../../contrib/libfdt -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h   -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -MD  -MF.depend.vm_map.o -MTvm_map.o -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float  -fno-asynchronous-unwind-tables -ffreestanding -fwrapv -fstack-protector -gdwarf-2 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wcast-qual -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__ -Wmissing-include-dirs -fdiagnostics-show-option -Wno-unknown-pragmas -Wno-error-tautological-compare -Wno-error-empty-body -Wno-error-parentheses-equality -Wno-error-unused-function -Wno-error-pointer-sign -Wno-error-shift-negative-value -Wno-address-of-packed-member  -mno-aes -mno-avx  -std=iso9899:1999 -Werror  ../../../vm/vm_map.c
../../../vm/vm_map.c:1154:26: error: ordered comparison between pointer and integer ('vm_offset_t' (aka 'unsigned long') and 'void *') [-Werror]
        KASSERT(vm_map_min(map) <= && address < vm_map_max(map),
                ~~~~~~~~~~~~~~~ ^  ~~~~~~~~~~
../../../sys/systm.h:102:24: note: expanded from macro 'KASSERT'
        if (__predict_false(!(exp)))                                    \
                              ^~~
../../../sys/cdefs.h:455:51: note: expanded from macro '__predict_false'
#define __predict_false(exp)    __builtin_expect((exp), 0)
                                                  ^~~
../../../vm/vm_map.c:1155:65: error: invalid conversion specifier ')' [-Werror,-Wformat-invalid-specifier]
            ("vm_map_lookup_entry: query address %p out of range [%p, %)",
                                                                      ~^
../../../sys/systm.h:103:17: note: expanded from macro 'KASSERT'
                kassert_panic msg;                                      \
                              ^
../../../vm/vm_map.c:1156:48: error: data argument not used by format string [-Werror,-Wformat-extra-args]
             (void *)address, (void*)vm_map_min(map), (void*)vm_map_max(map)));
                                                      ^
../../../sys/systm.h:103:17: note: expanded from macro 'KASSERT'
                kassert_panic msg;                                      \
                              ^~~
../../../vm/vm_map.c:1154:32: error: use of undeclared label 'address'
        KASSERT(vm_map_min(map) <= && address < vm_map_max(map),
                                      ^
4 errors generated.
[root@flix1a /usr/src/sys/amd64/compile/PHO]# 
$

I booted with it anyway and got this one:

20181130 15:06:49 all (18/590): pmc3.sh
hwpmc: SOFT/16/64/0x67<INT,USR,SYS,REA,WRI> TSC/1/64/0x20<REA> IAP/4/48/0x3ff<INT,USR,SYS,EDG,THR,REA,WRI,INV,QUA,PRC> IAF/3/48/0x67<INT,USR,SYS,REA,WRI>
panic: vm_map_lookup_entry: query address 0 out of range [0x1000, 0x800000000000]
cpuid = 47
time = 1543586811
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0131340b10
vpanic() at vpanic+0x1a3/frame 0xfffffe0131340b70
panic() at panic+0x43/frame 0xfffffe0131340bd0
vm_map_lookup_entry() at vm_map_lookup_entry+0xa5/frame 0xfffffe0131340c40
vm_map_lookup() at vm_map_lookup+0x6b/frame 0xfffffe0131340cf0
vm_fault_hold() at vm_fault_hold+0x7d/frame 0xfffffe0131340e40
vm_fault() at vm_fault+0x60/frame 0xfffffe0131340e80
trap_pfault() at trap_pfault+0x188/frame 0xfffffe0131340ed0
trap() at trap+0x2ba/frame 0xfffffe0131340fe0
calltrap() at calltrap+0x8/frame 0xfffffe0131340fe0
--- trap 0xc, rip = 0xffffffff8107b155, rsp = 0xfffffe01313410b0, rbp = 0xfffffe01313410b0 ---
copyin_smap_erms() at copyin_smap_erms+0xa5/frame 0xfffffe01313410b0
pmc_syscall_handler() at pmc_syscall_handler+0xdd1/frame 0xfffffe01313411f0
amd64_syscall() at amd64_syscall+0x293/frame 0xfffffe0131341330
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0131341330
--- syscall (0, FreeBSD ELF64, nosys), rip = 0x800440eba, rsp = 0x7fffdf3f7f48, rbp = 0x7fffdf3f7fb0 ---
In D13999#391313, @dougm_rice.edu wrote:

Make it compile.

panic: vm_map_lookup_entry: query address 0 out of range [0x1000, 0x800000000000), vmcore.83
https://people.freebsd.org/~pho/stress/log/dougm017.txt

In vm_fault.c, make sure that vm_map_lookup is not invoked with and invalid address.

In D13999#391381, @dougm_rice.edu wrote:

In vm_fault.c, make sure that vm_map_lookup is not invoked with and invalid address.

../../../vm/vm_fault.c:575:39: error: use of undeclared identifier 'addr'; did you mean 'vaddr'?
        result = (vaddr < vm_map_min(map) || addr >= vm_map_max(map)) ?
                                             ^~~~
                                             vaddr
../../../vm/vm_fault.c:546:41: note: 'vaddr' declared here
In D13999#391385, @dougm_rice.edu wrote:

Fix typo.

panic: vm_map_lookup_entry: query address 0x800000000000 out of range [0x1000, 0x800000000000)
https://people.freebsd.org/~pho/stress/log/dougm018.txt

with a mmap() syscall fuzz test.

Allow lookups for the address at the upper bound of the map range.

In vm_map_wire, use VM_MAP_RANGE_CHECK before testing start and end for equality, since it might make them equal and let the function exit before a possible lookup for address vm_map_max, which is outside the valid range.

In D13999#391495, @dougm_rice.edu wrote:

In vm_map_wire, use VM_MAP_RANGE_CHECK before testing start and end for equality, since it might make them equal and let the function exit before a possible lookup for address vm_map_max, which is outside the valid range.

panic: vm_map_lookup_entry: query address 0x800000000000 out of range [0x1000, 0x800000000000)
https://people.freebsd.org/~pho/stress/log/dougm019.txt

Abandon attempts to make callers of vm_map_lookup_entry adhere to range limits on their address inquires, and just go back to having vm_map_lookup_entry check for out-of-bounds address.

Change the tree code to use the header sentinel value as two entries in the tree, both as a least and greatest tree element, by examining the start value of a node before the end value if and only if the node is a right child of its parent.

Reduce the code size in unlink by reusing splay_split.

With D13999.id51793.diff I see:

panic: page fault
cpuid = 6
time = 1544449196
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00bbff8ff0
vpanic() at vpanic+0x1a3/frame 0xfffffe00bbff9050
panic() at panic+0x43/frame 0xfffffe00bbff90b0
trap_fatal() at trap_fatal+0x35f/frame 0xfffffe00bbff9100
trap_pfault() at trap_pfault+0x62/frame 0xfffffe00bbff9150
trap() at trap+0x2ba/frame 0xfffffe00bbff9260
calltrap() at calltrap+0x8/frame 0xfffffe00bbff9260
--- trap 0xc, rip = 0xffffffff80eec33b, rsp = 0xfffffe00bbff9338, rbp = 0xfffffe00bbff9340 ---
vm_map_entry_splay_split() at vm_map_entry_splay_split+0xb/frame 0xfffffe00bbff9340
vm_map_entry_unlink() at vm_map_entry_unlink+0xa9/frame 0xfffffe00bbff9390
vm_map_entry_delete() at vm_map_entry_delete+0x1c/frame 0xfffffe00bbff93e0
vm_map_delete() at vm_map_delete+0x32c/frame 0xfffffe00bbff9450
vm_map_remove() at vm_map_remove+0x81/frame 0xfffffe00bbff9480
exec_new_vmspace() at exec_new_vmspace+0x183/frame 0xfffffe00bbff94e0
exec_elf64_imgact() at exec_elf64_imgact+0x6bb/frame 0xfffffe00bbff95b0
kern_execve() at kern_execve+0x67a/frame 0xfffffe00bbff98f0
sys_execve() at sys_execve+0x4c/frame 0xfffffe00bbff9970
amd64_syscall() at amd64_syscall+0x293/frame 0xfffffe00bbff9ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00bbff9ab0
--- syscall (59, FreeBSD ELF64, sys_execve), rip = 0x2ad00a, rsp = 0x7fffffffe6e8, rbp = 0x7fffffffe6f0 ---
KDB: enter: panic
[ thread pid 46 tid 100221 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
db> x/s version
version:        FreeBSD 13.0-CURRENT #0 r341786M: Mon Dec 10 14:37:19 CET 2018\012    pho@t2.osted.lan:/usr/src/sys/amd64/compile/PHO\012
db>

Add null checks for root children in vm_map_entry_unlink.

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff80ee85d3
stack pointer           = 0x28:0xfffffe00bbff92d0
frame pointer           = 0x28:0xfffffe00bbff9340
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 46 (init)
trap number             = 12
panic: page fault
cpuid = 1
time = 1544461569
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00bbff8f90
vpanic() at vpanic+0x1a3/frame 0xfffffe00bbff8ff0
panic() at panic+0x43/frame 0xfffffe00bbff9050
trap_fatal() at trap_fatal+0x35f/frame 0xfffffe00bbff90a0
trap_pfault() at trap_pfault+0x62/frame 0xfffffe00bbff90f0
trap() at trap+0x2ba/frame 0xfffffe00bbff9200
calltrap() at calltrap+0x8/frame 0xfffffe00bbff9200
--- trap 0xc, rip = 0xffffffff80ee85d3, rsp = 0xfffffe00bbff92d0, rbp = 0xfffffe00bbff9340 ---
vm_map_entry_unlink() at vm_map_entry_unlink+0x143/frame 0xfffffe00bbff9340
vm_map_entry_delete() at vm_map_entry_delete+0x1d/frame 0xfffffe00bbff93b0
vm_map_delete() at vm_map_delete+0x28a/frame 0xfffffe00bbff9440
vm_map_remove() at vm_map_remove+0xa6/frame 0xfffffe00bbff9480
exec_new_vmspace() at exec_new_vmspace+0x183/frame 0xfffffe00bbff94e0
exec_elf64_imgact() at exec_elf64_imgact+0x6bb/frame 0xfffffe00bbff95b0
kern_execve() at kern_execve+0x67a/frame 0xfffffe00bbff98f0
sys_execve() at sys_execve+0x4c/frame 0xfffffe00bbff9970
amd64_syscall() at amd64_syscall+0x293/frame 0xfffffe00bbff9ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00bbff9ab0
--- syscall (59, FreeBSD ELF64, sys_execve), rip = 0x2ad00a, rsp = 0x7fffffffe6e8, rbp = 0x7fffffffe6f0 ---
KDB: enter: panic
[ thread pid 46 tid 100221 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
db> x/s version
version:        FreeBSD 13.0-CURRENT #2 r341786M: Mon Dec 10 18:03:16 CET 2018\012    pho@t2.osted.lan:/usr/src/sys/amd64/compile/PHO\012
db> 

(kgdb) l *0xffffffff80ee85d3
0xffffffff80ee85d3 is in vm_map_entry_unlink (../../../vm/vm_map.c:1073).
1068                        &llist, &rlist);
1069            if (root->right != NULL)
1070                    vm_map_entry_splay_split(entry->start, root->right,
1071                        &llist, &rlist);
1072            llist->next = rlist;
1073            rlist->prev = llist;
1074            llist->adj_free = rlist->start - llist->end;
1075            /* Make the predecessor of the entry the new root */
1076            root = vm_map_entry_splay_merge(llist, llist->right, rlist, NULL, NULL);
1077            map->root = root;
(kgdb)

Handle the special case of deleting the max entry.

Fatal trap 12: page fault while in kernel mode
cpuid = 11; apic id = 0b
fault virtual address   = 0x18
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80ee5fb3
stack pointer           = 0x28:0xfffffe00bbff97e0
frame pointer           = 0x28:0xfffffe00bbff9810
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 46 (sh)
trap number             = 12
panic: page fault
cpuid = 11
time = 1544513729
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00bbff94a0
vpanic() at vpanic+0x1a3/frame 0xfffffe00bbff9500
panic() at panic+0x43/frame 0xfffffe00bbff9560
trap_fatal() at trap_fatal+0x35f/frame 0xfffffe00bbff95b0
trap_pfault() at trap_pfault+0x62/frame 0xfffffe00bbff9600
trap() at trap+0x2ba/frame 0xfffffe00bbff9710
calltrap() at calltrap+0x8/frame 0xfffffe00bbff9710
--- trap 0xc, rip = 0xffffffff80ee5fb3, rsp = 0xfffffe00bbff97e0, rbp = 0xfffffe00bbff9810 ---
vm_map_entry_splay() at vm_map_entry_splay+0x63/frame 0xfffffe00bbff9810
vm_map_lookup_entry() at vm_map_lookup_entry+0x112/frame 0xfffffe00bbff9870
vm_map_delete() at vm_map_delete+0x5f/frame 0xfffffe00bbff9900
kern_munmap() at kern_munmap+0x8a/frame 0xfffffe00bbff9970
amd64_syscall() at amd64_syscall+0x293/frame 0xfffffe00bbff9ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00bbff9ab0
--- syscall (73, FreeBSD ELF64, sys_munmap), rip = 0x80024174a, rsp = 0x7fffffffd268, rbp = 0x7fffffffd3c0 ---
KDB: enter: panic
[ thread pid 46 tid 100221 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
db> x/s version
version:        FreeBSD 13.0-CURRENT #4 r341786M: Tue Dec 11 08:32:27 CET 2018\012    pho@t2.osted.lan:/usr/src/sys/amd64/compile/PHO\012
db> dump
Cannot dump: no dump device specified.
db> show reg
cs                        0x20
ds                        0x3b  ll+0x1a
es                        0x3b  ll+0x1a
fs                        0x13
gs                        0x1b
ss                        0x28  ll+0x7
rax                       0x12
rcx                       0x80  ll+0x5f
rdx         0xfffffe00bbff9390
rbx         0xffffffff813150b9
rsp         0xfffffe00bbff9490
rbp         0xfffffe00bbff94a0
rsi                       0x80  ll+0x5f
rdi         0xffffffff81e95358  cnputs_mtx
r8                           0
r9                           0
r10         0xffffffff81342cd5
r11                        0x1
r12                        0x1
r13         0xfffffe00bbff9720
r14         0xfffffe00bbff9540
r15         0xfffff80089b63580
rip         0xffffffff80bf252b  kdb_enter+0x3b
rflags                    0x82  ll+0x61
kdb_enter+0x3b: movq    $0,kdb_why
db>

(kgdb) l *0xffffffff80ee5fb3
0xffffffff80ee5fb3 is in vm_map_entry_splay (../../../vm/vm_map.c:1018).
1013                     * subtree and make it the root.  There is such a
1014                     * node, since map->header is in the tree and left of
1015                     * all addresses.
1016                     */
1017                    root = llist;
1018                    llist = root->right;
1019                    root->right = NULL;
1020            }
1021            return (vm_map_entry_splay_merge(root, llist, rlist,
1022                root->left, root->right));
(kgdb)

Just change unlink from the last working version by walking the left and right spines of the found node, without relying on splay_split to do it, and find the new root after that.

Fatal trap 12: page fault while in kernel mode
cpuid = 5; apic id = 05
fault virtual address   = 0x18
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80ee5fa3
stack pointer           = 0x28:0xfffffe00bbff97e0
frame pointer           = 0x28:0xfffffe00bbff9810
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 46 (sh)
trap number             = 12
panic: page fault
cpuid = 5
time = 1544527104
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00bbff94a0
vpanic() at vpanic+0x1a3/frame 0xfffffe00bbff9500
panic() at panic+0x43/frame 0xfffffe00bbff9560
trap_fatal() at trap_fatal+0x35f/frame 0xfffffe00bbff95b0
trap_pfault() at trap_pfault+0x62/frame 0xfffffe00bbff9600
trap() at trap+0x2ba/frame 0xfffffe00bbff9710
calltrap() at calltrap+0x8/frame 0xfffffe00bbff9710
--- trap 0xc, rip = 0xffffffff80ee5fa3, rsp = 0xfffffe00bbff97e0, rbp = 0xfffffe00bbff9810 ---
vm_map_entry_splay() at vm_map_entry_splay+0x53/frame 0xfffffe00bbff9810
vm_map_lookup_entry() at vm_map_lookup_entry+0x112/frame 0xfffffe00bbff9870
vm_map_delete() at vm_map_delete+0x5f/frame 0xfffffe00bbff9900
kern_munmap() at kern_munmap+0x8a/frame 0xfffffe00bbff9970
amd64_syscall() at amd64_syscall+0x293/frame 0xfffffe00bbff9ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00bbff9ab0
--- syscall (73, FreeBSD ELF64, sys_munmap), rip = 0x80024174a, rsp = 0x7fffffffd268, rbp = 0x7fffffffd3c0 ---
KDB: enter: panic
[ thread pid 46 tid 100221 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
db> x/s version
version:        FreeBSD 13.0-CURRENT #6 r341786M: Tue Dec 11 12:15:24 CET 2018\012    pho@t2.osted.lan:/usr/src/sys/amd64/compile/PHO\012
db> 
(kgdb) l *0xffffffff80ee5fa3
0xffffffff80ee5fa3 is in vm_map_entry_splay (../../../vm/vm_map.c:1017).
1012                     * subtree and make it the root.  There is such a
1013                     * node, since map->header is in the tree and left of
1014                     * all addresses.
1015                     */
1016                    root = llist;
1017                    llist = root->right;
1018                    root->right = NULL;
1019            }
1020            return (vm_map_entry_splay_merge(root, llist, rlist,
1021                root->left, root->right));
(kgdb)

Just change the processing of the left spine a wee bit from what worked before, and see if that breaks.

Fix dumb error in the last one.

In D13999#394831, @dougm_rice.edu wrote:

Fix dumb error in the last one.

This version boots :)

Dare to change unlink again to walk the right spine also, and try to put the pieces together again afterward.

In D13999#394837, @dougm_rice.edu wrote:

Dare to change unlink again to walk the right spine also, and try to put the pieces together again afterward.

This version also boots.

The essential parts of this have been checked in with D17794, so I'm discarding this.