
HPTS Is missing some support for Vnets
Closed, Public

Authored by rrs on Jun 11 2018, 12:35 PM.
Tags
None

Details

Summary

It turns out that one of the crashes Larry Rosenman found in rack was not a rack
bug at all, but a crash that occurs when the hpts system tries to take the INP_INFO
lock. The hpts code is missing a vnet set, so with vnets enabled it goes "boom".

This fixes that.

Test Plan

I have given Larry a patch to test for both rack and hpts. Enabling
vnets and running rack would be an easy way to verify this :-)

Diff Detail

Lint: Lint Skipped
Unit: Tests Skipped

Event Timeline

I've got this and D15758 running on my home box since:
system boot Jun 11 08:24

We'll see

jtl requested changes to this revision. Jun 11 2018, 2:37 PM

I think there are more changes needed.

For example, the INP_INFO lock which is dropped on line 1260 is acquired on line 1220. If you don't set the VNET prior to that, you could end up acquiring and dropping different locks. (Keep in mind that tcbinfo itself is a VNET'd variable.) That wouldn't be good.
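The hazard described above can be sketched in user space. This is only an illustration, not the kernel code: `fake_vnet`, `fake_lock`, `cur_vnet`, and the helper functions are all hypothetical names invented for this sketch, and the "lock" is just a counter. It assumes the lock is looked up through whichever vnet is current at the time of each call, which is the property that makes a late vnet set dangerous.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a lock: just counts acquires minus releases. */
struct fake_lock {
	int held;
};

/* Hypothetical stand-in for a vnet; each instance owns its own info lock. */
struct fake_vnet {
	struct fake_lock info_lock;
};

/* Models the "current vnet" that a virtualized variable resolves through. */
static struct fake_vnet *cur_vnet = NULL;

/* Look up the info lock through whichever vnet is current right now. */
static struct fake_lock *
cur_info_lock(void)
{
	assert(cur_vnet != NULL);	/* no current vnet means no valid lock */
	return (&cur_vnet->info_lock);
}

static void
info_lock_acquire(void)
{
	cur_info_lock()->held++;
}

static void
info_lock_release(void)
{
	cur_info_lock()->held--;
}

/* Problem ordering: the vnet changes between acquire and release. */
static void
late_vnet_set(struct fake_vnet *stale, struct fake_vnet *conn)
{
	cur_vnet = stale;
	info_lock_acquire();	/* takes the stale vnet's lock */
	cur_vnet = conn;	/* vnet set arrives too late */
	info_lock_release();	/* releases a lock that was never taken */
}

/* Safer ordering: set the vnet first, then acquire and release. */
static void
early_vnet_set(struct fake_vnet *saved, struct fake_vnet *conn)
{
	cur_vnet = conn;	/* set before touching the lock */
	info_lock_acquire();
	info_lock_release();	/* same vnet, so same lock */
	cur_vnet = saved;	/* restore the previous vnet */
}
```

With the late ordering, the stale vnet's counter is left at 1 and the connection vnet's counter at -1, i.e. one lock is leaked and another is released without being held. With the early ordering both counters return to 0.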

This revision now requires changes to proceed. Jun 11 2018, 2:37 PM

I got another crash even with this:
borg.lerctr.org /var/crash $ more core.txt.3
borg.lerctr.org dumped core - see /var/crash/vmcore.3

Mon Jun 11 09:42:19 CDT 2018

FreeBSD borg.lerctr.org 12.0-CURRENT FreeBSD 12.0-CURRENT #37 r334925M: Mon Jun 11 08:08:06 CDT 2018 root@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/VT-LER amd64

panic: page fault

GNU gdb (GDB) 8.1 [GDB v8.1 for FreeBSD]
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd12.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...done.
done.

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 20
fault virtual address = 0x28
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80da5cc3
stack pointer = 0x28:0xfffffe344db4e9f0
frame pointer = 0x28:0xfffffe344db4ea60
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1

processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (swi1: hpts)
trap number = 12
panic: page fault
cpuid = 0
time = 1528727669
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe344db4e6a0
vpanic() at vpanic+0x1a3/frame 0xfffffe344db4e700
panic() at panic+0x43/frame 0xfffffe344db4e760
trap_fatal() at trap_fatal+0x35f/frame 0xfffffe344db4e7b0
trap_pfault() at trap_pfault+0x49/frame 0xfffffe344db4e810
trap() at trap+0x2ba/frame 0xfffffe344db4e920
calltrap() at calltrap+0x8/frame 0xfffffe344db4e920
--- trap 0xc, rip = 0xffffffff80da5cc3, rsp = 0xfffffe344db4e9f0, rbp = 0xfffffe344db4ea60 ---
tcp_input_data() at tcp_input_data+0xb3/frame 0xfffffe344db4ea60
tcp_hpts_thread() at tcp_hpts_thread+0x817/frame 0xfffffe344db4eb20
intr_event_execute_handlers() at intr_event_execute_handlers+0x99/frame 0xfffffe344db4eb60
ithread_loop() at ithread_loop+0xb7/frame 0xfffffe344db4ebb0
fork_exit() at fork_exit+0x84/frame 0xfffffe344db4ebf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe344db4ebf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---

Uptime: 1h10m42s
Dumping 9638 out of 130994 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at ./machine/pcpu.h:231
231		__asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) #0 __curthread () at ./machine/pcpu.h:231
#1 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:366
#2 0xffffffff80b844d2 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:446
#3 0xffffffff80b84ab3 in vpanic (fmt=<optimized out>, ap=0xfffffe344db4e740)
    at /usr/src/sys/kern/kern_shutdown.c:863
#4 0xffffffff80b84b03 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:790
#5 0xffffffff8105b6ff in trap_fatal (frame=0xfffffe344db4e930, eva=40)
    at /usr/src/sys/amd64/amd64/trap.c:892
#6 0xffffffff8105b759 in trap_pfault (frame=0xfffffe344db4e930, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:728
#7 0xffffffff8105ad7a in trap (frame=0xfffffe344db4e930)
    at /usr/src/sys/amd64/amd64/trap.c:427
#8 <signal handler called>
#9 0xffffffff80da5cc3 in tcp_input_data (hpts=0xfffff801ac2b2d00,
    tv=0xfffffe344db4ead0) at /usr/src/sys/netinet/tcp_hpts.c:1220
#10 0xffffffff80da57f7 in tcp_hptsi (hpts=<optimized out>, ctick=0x1092)
    at /usr/src/sys/netinet/tcp_hpts.c:1686
#11 tcp_hpts_thread (ctx=<optimized out>)
    at /usr/src/sys/netinet/tcp_hpts.c:1810
#12 0xffffffff80b46369 in intr_event_execute_handlers (p=<optimized out>,
    ie=0xfffff801ac300700) at /usr/src/sys/kern/kern_intr.c:1013
#13 0xffffffff80b46a57 in ithread_execute_handlers (ie=<optimized out>,
    p=<optimized out>) at /usr/src/sys/kern/kern_intr.c:1026
#14 ithread_loop (arg=0xfffff801ac31c700)
    at /usr/src/sys/kern/kern_intr.c:1106
#15 0xffffffff80b43754 in fork_exit (
    callout=0xffffffff80b469a0 <ithread_loop>, arg=0xfffff801ac31c700,
    frame=0xfffffe344db4ec00) at /usr/src/sys/kern/kern_fork.c:1039
#16 <signal handler called>
(kgdb)

Set the vnet much earlier, and move all the related bits earlier too.

This revision was not accepted when it landed; it landed in state Needs Review. Jun 12 2018, 11:54 PM
This revision was automatically updated to reflect the committed changes.