
Fix infinite loop on older hardware
Abandoned, Public

Authored by shurd on Apr 30 2019, 4:55 PM.

Details

Reviewers
rrs
gallatin
Group Reviewers
transport
Summary

On an old multisocket AMD system with RACK enabled, I was
seeing all cores stuck in tcp_hptsi() via the again loop. Ensuring
that p_on_queue_cnt is non-zero resolves it. I want to get an opinion
from rrs@ before committing, so I'm putting it here until he gets
back.
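
The failure mode described above can be illustrated with a toy model. This is only a sketch and not the kernel code: apart from p_on_queue_cnt and the "again" label, which are named in the summary, every identifier (toy_hpts, toy_run_pass, toy_hptsi, the guard and max_passes parameters) is hypothetical. It assumes the loop re-runs whenever the clock has moved on during a pass, which on a slow machine can happen on every pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-CPU pacer state. Only p_on_queue_cnt
 * is a name taken from the review summary. */
struct toy_hpts {
	uint32_t p_on_queue_cnt;	/* entries still queued */
	uint32_t cur_slot;		/* slot we have processed up to */
	uint32_t now_slot;		/* slot the clock says we should be at */
};

/* One pass: catch up to the clock and service at most one entry.
 * Assumption: the pass is slow enough that the clock always advances
 * by one slot while it runs. */
static void
toy_run_pass(struct toy_hpts *h)
{
	h->cur_slot = h->now_slot;
	if (h->p_on_queue_cnt > 0)
		h->p_on_queue_cnt--;
	h->now_slot++;		/* time moved on while we worked */
}

/* Returns the number of passes made. max_passes only exists so the
 * unguarded variant terminates in this demo. */
static unsigned
toy_hptsi(struct toy_hpts *h, bool guard, unsigned max_passes)
{
	unsigned passes = 0;

	assert(max_passes > 0);
again:
	toy_run_pass(h);
	passes++;
	if (passes >= max_passes)
		return (passes);
	/* Without the guard, falling behind the clock alone triggers
	 * another pass, even when nothing is left to service. */
	if (h->cur_slot != h->now_slot &&
	    (!guard || h->p_on_queue_cnt > 0))
		goto again;
	return (passes);
}
```

In this model the guarded loop stops as soon as the queue is empty, while the unguarded loop spins until the artificial cap is reached.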

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 24013
Build 22900: arc lint + arc unit

Event Timeline

shurd created this revision. Apr 30 2019, 4:55 PM
rrs added a comment. Jun 5 2019, 6:47 PM

There are far more problems than this. And it's not explicitly with HPTS, but it becomes the canary in the coal mine.
The basic problem is in r347381. This causes a huge slowdown of the system, at least with INVARIANTS. HPTS in
the form you are dealing with does not handle wheel wrap well. We are backing out that change in our upstream
merge, and there will be changes I will be committing to fix wheel wrap in HPTS.

Right now I don't have Drew's changes in those, so until the merge is done I won't have the final code.

I would suggest backing out r347381, or at least picking up https://reviews.freebsd.org/D20372,
which has a fix, although TFO may still be broken (Michael is investigating that). We will back out
r347381 for now.

I should hopefully have something in Phabricator by this weekend.

rrs added a comment. Jun 5 2019, 6:51 PM

Sorry, wrong revision. The trouble is in the DSACK commit, which is r347382; that's what we are backing out of our sync
(off-by-one error, must be a programmer :-) ).

tuexen added a subscriber: tuexen. Jun 23 2019, 8:55 PM

I also observed a lockup of a bhyve VM on a slow system, where the VM used the following kernel config file:

tuexen@syzkaller:~ % cat head/sys/amd64/conf/SYZKALLER 
include         GENERIC
ident           SYZKALLER

options         TCPHPTS
options         COVERAGE
options         KCOV
options         DEBUG_REDZONE
options         ALT_BREAK_TO_DEBUGGER

The system locked up during boot:

sudo sh /usr/share/examples/bhyve/vmrun.sh -c 2 -m 2048M -t tap0 -d syzkaller.img SYZKALLER
results in
Loading kernel...
/boot/kernel/kernel text=0x1dfe464 data=0x247778+0x81b278 syms=[0x8+0x1b9870+0x8+0x1a5209]
Loading configured modules...
/boot/entropy size=0x1000
GDB: no debug ports present
KDB: debugger backends: ddb
KDB: current backend: ddb
---<<BOOT>>---
Copyright (c) 1992-2019 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.0-CURRENT r349293 SYZKALLER amd64
FreeBSD clang version 8.0.0 (tags/RELEASE_800/final 356365) (based on LLVM 8.0.0)
WARNING: WITNESS option enabled, expect reduced performance.
VT: init without driver.
CPU: AMD GX-412TC SOC                                (998.05-MHz K8-class CPU)
 Origin="AuthenticAMD"  Id=0x730f01  Family=0x16  Model=0x30  Stepping=1
 Features=0x1783fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2,HTT>
 Features2=0xbed82203<SSE3,PCLMULQDQ,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,HV>
 AMD Features=0x26500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,LM>
 AMD Features2=0xc4031fb<LAHF,CMP,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,SKINIT,WDT,Topology,DBE,PTSC>
 Structured Extended Features=0x8<BMI1>
 XSAVE Features=0x1<XSAVEOPT>
 TSC: P-state invariant
Hypervisor: Origin = "bhyve bhyve "
real memory  = 2147483648 (2048 MB)
avail memory = 2030579712 (1936 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <BHYVE  BVMADT  >
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 2 package(s) x 1 core(s)
random: unblocking device.
ioapic0 <Version 1.1> irqs 0-31 on motherboard
Launching APs: 1
random: entropy device external interface
[ath_hal] loaded
kbd1 at kbdmux0
module_register_init: MOD_LOAD (vesa, 0xffffffff817bf860, 0) error 19
000.000051 [4254] netmap_init               netmap: loaded module
nexus0
cryptosoft0: <software crypto> on motherboard
acpi0: <BHYVE BVXSDT> on motherboard
acpi0: Power Button (fixed)
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 16777216 Hz quality 950
Event timer "HPET" frequency 16777216 Hz quality 550
Event timer "HPET1" frequency 16777216 Hz quality 450
Event timer "HPET2" frequency 16777216 Hz quality 450
Event timer "HPET3" frequency 16777216 Hz quality 450
Event timer "HPET4" frequency 16777216 Hz quality 450
Event timer "HPET5" frequency 16777216 Hz quality 450
Event timer "HPET6" frequency 16777216 Hz quality 450
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
isab0: <PCI-ISA bridge> at device 1.0 on pci0
isa0: <ISA bus> on isab0
virtio_pci0: <VirtIO PCI Network adapter> port 0x2000-0x201f mem 0xc0000000-0xc0001fff irq 16 at device 2.0 on pci0
vtnet0: <VirtIO Networking Adapter> on virtio_pci0
vtnet0: Ethernet address: 00:a0:98:0d:02:0c
vtnet0: netmap queues/slots: TX 1/1024, RX 1/512
000.000150 [ 503] vtnet_netmap_attach       vtnet attached txq=1, txd=1024 rxq=1, rxd=512
virtio_pci1: <VirtIO PCI Block adapter> port 0x2040-0x207f mem 0xc0002000-0xc0003fff irq 17 at device 3.0 on pci0
vtblk0: <VirtIO Block Adapter> on virtio_pci1
vtblk0: 32768MB (67108864 512 byte sectors)
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
driver bug: Unable to set devclass (class: atkbdc devname: (unknown))
Unhandled ps2 mouse command 0xe1
                               psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model Generic PS/2 mouse, device ID 0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: console (9600,n,8,1)
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
vga0: <Generic ISA VGA> at port 0x3b0-0x3bb iomem 0xb0000-0xb7fff pnpid PNP0900 on isa0
Timecounters tick every 10.000 msec
usb_needs_explore_all: no devclass
Trying to mount root from ufs:/dev/vtbd0s1a [rw]...
TCP Hpts created 2 swi interrupt threads and bound 0 to cpus
WARNING: WITNESS option enabled, expect reduced performance.
Setting hostuuid: 9a74566d-71b5-11e9-85d8-00a098f23af6.
Setting hostid: 0x2ca0ec52.
Deprecated code (to be removed in FreeBSD 14): FreeBSD 12.x ABI compat
Deprecated code (to be removed in FreeBSD 14): FreeBSD 12.x ABI compat

Please note that no TCP module was actually loaded.

With the above patch, the kernel boots fine.

@rrs Should the above patch get committed or do you prefer a different fix?

D20834 fixes the issue for me.
@shurd: Could you test if D20834 fixes your issue?

shurd abandoned this revision. Jul 9 2019, 9:25 PM

Cannot reproduce with D20834 applied.