rescue: Implement a direct dumper for arm64 and amd64
Needs Revision, Public

Authored by jhibbits on Thu, Oct 31, 3:52 PM.

Details

Summary

See the rescue.4 manual page for details on configuration.

amd64 and arm64 support is implemented. On arm64, this should work
regardless of whether the host uses an FDT or ACPI root bus; on amd64
the feature should work whether booted via EFI or legacy BIOS.

There are several independent pieces of the implementation:

  • Build-time configuration. There are two new kernel configuration options, RESCUE and RESCUE_SUPPORT. Compile rescue kernels with "options RESCUE", and compile host kernels with "options RESCUE_SUPPORT" so that they can use a rescue kernel. Set the RESCUE_EMBED make option to embed a rescue kernel into a host kernel.
  • Run-time configuration. Enable rescue-kernel-on-panic by setting the debug.rescue_minidump tunable to 1 in the host kernel. When the tunable is set, rescue_kernel_init() allocates a physically contiguous chunk of memory for use by the rescue kernel. The reservation is populated with an aligned copy of the kernel, the host kernel's environment, and metadata (such as a DTB or an EFI memory map).
  • Handoff on panic. When debug.rescue_minidump is set, an attempt to dump calls rescue_kernel_exec(), which does some setup and jumps to the rescue kernel's entry point. initarm() and hammer_time() have some special hooks to pull metadata out of the reservation. In general I have tried to avoid modifying locore. This and the previous item are implemented in machine/rescue_machdep.c.
  • Once the rescue kernel has booted it behaves just like a regular kernel, i.e., there is no logic specific to rescue kernels. The one difference is that rescue kernels have a /dev/dumper, which can be used to read a minidump out of the host kernel's RAM. This is implemented in machine/rescue_dumper.c.
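
The last item can be illustrated from userland inside the rescue kernel. The following is a minimal sketch, not code from this patch: it assumes /dev/dumper supports plain sequential read(2) calls that return 0 at the end of the minidump (rescue.4 is the authority on the actual semantics). The helper names copy_fd() and save_dump() are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Copy everything readable from in_fd to out_fd.
 * Returns the number of bytes copied, or -1 on error.
 */
static long long
copy_fd(int in_fd, int out_fd)
{
	char buf[65536];
	long long total = 0;

	for (;;) {
		ssize_t n = read(in_fd, buf, sizeof(buf));

		if (n == 0)
			return (total);		/* end of input */
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry the read */
			return (-1);
		}
		/* write(2) may be partial, so loop until the chunk is out. */
		for (ssize_t off = 0; off < n; ) {
			ssize_t w = write(out_fd, buf + off, (size_t)(n - off));

			if (w < 0) {
				if (errno == EINTR)
					continue;
				return (-1);
			}
			off += w;
		}
		total += n;
	}
}

/*
 * Read a dump from src (assumed here to be /dev/dumper) and store it
 * in the file dst. Returns the number of bytes saved, or -1 on error.
 */
static long long
save_dump(const char *src, const char *dst)
{
	int in_fd, out_fd;
	long long n;

	in_fd = open(src, O_RDONLY);
	if (in_fd < 0)
		return (-1);
	out_fd = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (out_fd < 0) {
		close(in_fd);
		return (-1);
	}
	n = copy_fd(in_fd, out_fd);
	/* Make sure the dump is on stable storage before reporting success. */
	if (n >= 0 && fsync(out_fd) != 0)
		n = -1;
	close(in_fd);
	if (close(out_fd) != 0)
		n = -1;
	return (n);
}
```

A startup script or small program in the rescue kernel's userland could then call save_dump("/dev/dumper", "/var/crash/vmcore.0"); the destination path is only an example.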

Original patch by Mark Johnston.

Obtained from: Juniper Networks, Inc.
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 60282
Build 57166: arc lint + arc unit

Event Timeline

Really cool work. I think this could use a better description though (and possibly a rename) - in current usage the rescue prefix suggests rescue(8) and it's not clear to me what a "direct dumper" is. The summary message should include a brief description of what this actually is.

We call it a 'rescue kernel' at Juniper because it "rescues" the core: it is a small kernel, embedded in the main kernel, whose entire purpose is to save the core of the panicked kernel directly to disk. We use it because some of our devices have no swap at all, or not enough swap to hold even a minidump's worth of pages, so we need something that can take a dump of the panicked system and write it directly to a file.

ehem_freebsd_m5p.com added inline comments.
sys/dev/xen/bus/xen_intr.c, line 55

This is wrong. sys/dev/xen/bus/xen_intr.c is pure-MI. This could go in sys/x86/include/xen/arch-intr.h though.

This revision now requires changes to proceed. Thu, Oct 31, 5:15 PM