kern_dump.c add kernel support for compressed kernel dumps
Abandoned · Public

Authored by markj on Jul 12 2015, 2:18 AM.

Details

Summary

The interface for enabling dump compression mirrors that for compressed
user program core dumps. Specifically, the kern.compress_kernel_dumps
sysctl determines whether kernel dumps will be compressed, and
kern.compress_kernel_dumps_gzlevel specifies the compression level.
The compression sysctl has no effect until a dumper is chosen, since the
compression buffer size is determined by the max I/O size of the
underlying dump device.

The implementation works by interposing a compression stream between the
MI kernel dump routines and the raw dump device. The compression stream
flushes to disk whenever its input buffer is full, and once in
dump_finish(), so no modifications to (mini)dumpsys are needed.
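
The interposition described above can be sketched roughly as follows. This is a minimal illustration, not the code from the diff: the names `struct stream`, `stream_write`, `stream_finish`, `count_sink` and the 8-byte `BUFSZ` are all hypothetical, and the compression step itself is left out so that only the buffering and flush behaviour is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUFSZ 8	/* hypothetical; stands in for the device's max I/O size */

/* Receives each full buffer, plus the final partial one. */
typedef int (*sink_fn)(const void *data, size_t len, void *arg);

struct stream {
	unsigned char buf[BUFSZ];
	size_t used;
	sink_fn sink;
	void *arg;
};

static void
stream_init(struct stream *s, sink_fn sink, void *arg)
{
	assert(sink != NULL);
	s->used = 0;
	s->sink = sink;
	s->arg = arg;
}

/* Hand the buffered bytes to the sink and reset the buffer. */
static int
stream_flush(struct stream *s)
{
	int error;

	if (s->used == 0)
		return (0);
	error = s->sink(s->buf, s->used, s->arg);
	s->used = 0;
	return (error);
}

/* Called by the dump routines in place of a direct device write. */
static int
stream_write(struct stream *s, const void *data, size_t len)
{
	const unsigned char *p = data;
	size_t n;
	int error;

	while (len > 0) {
		n = BUFSZ - s->used;
		if (n > len)
			n = len;
		memcpy(s->buf + s->used, p, n);
		s->used += n;
		p += n;
		len -= n;
		/* Flush as soon as the input buffer is full. */
		if (s->used == BUFSZ && (error = stream_flush(s)) != 0)
			return (error);
	}
	return (0);
}

/* Called once at the end of the dump to push out the remainder. */
static int
stream_finish(struct stream *s)
{
	return (stream_flush(s));
}

/* Example sink that only counts what it is given. */
struct counts {
	size_t flushes;
	size_t bytes;
};

static int
count_sink(const void *data, size_t len, void *arg)
{
	struct counts *c = arg;

	(void)data;
	c->flushes++;
	c->bytes += len;
	return (0);
}
```

Because callers only ever see `stream_write()`, the routines that produce the dump need no knowledge of the buffering; the single `stream_finish()` call plays the role that the summary assigns to dump_finish().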

The kernel dump header is modified slightly: in addition to the
dumplength field, which gives the size of the dump, we now also have a
dumpextent field, which gives the space between kernel dump headers.
With uncompressed dumps, these are equivalent. The kernel dump header
version is bumped accordingly, and we add a new header magic type,
GZDUMPMAGIC.
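
To illustrate how the two fields relate, here is a small sketch under stated assumptions. `struct dump_hdr` is a simplified, hypothetical stand-in and not the real kernel dump header layout; the helper names are invented for this example, and the reading of dumpextent as "reserved space between the leading and trailing headers" follows the wording of the summary above.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, hypothetical header; not the real kernel dump header. */
struct dump_hdr {
	uint64_t dumplength;	/* size of the dump data itself */
	uint64_t dumpextent;	/* space between the two dump headers */
};

/* Uncompressed dump: the two fields are equivalent. */
static void
hdr_fill_plain(struct dump_hdr *h, uint64_t length)
{
	h->dumplength = length;
	h->dumpextent = length;
}

/* Compressed dump: the data may occupy less than the space between headers. */
static void
hdr_fill_compressed(struct dump_hdr *h, uint64_t complen, uint64_t extent)
{
	assert(complen <= extent);
	h->dumplength = complen;
	h->dumpextent = extent;
}

/* Offset of the trailing header, given where the dump data starts. */
static uint64_t
hdr_trailer_offset(const struct dump_hdr *h, uint64_t data_start)
{
	return (data_start + h->dumpextent);
}

/* Bytes after the dump data that a reader should not treat as dump contents. */
static uint64_t
hdr_slack(const struct dump_hdr *h)
{
	return (h->dumpextent - h->dumplength);
}
```

A reader in the style of savecore would use dumpextent to locate the trailing header and dumplength to decide how many bytes to copy out.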

Diff Detail

Repository: rS FreeBSD src repository - subversion
Lint: Lint Passed
Unit: No Test Coverage

Event Timeline

markj updated this revision to Diff 6867. Jul 12 2015, 2:18 AM

markj retitled this revision from to kern_dump.c add kernel support for compressed kernel dumps.

markj edited the test plan for this revision.

markj added a reviewer: rstone.

markj updated this object.

markj added a subscriber: cem.

Herald added a subscriber: imp. Jul 12 2015, 2:18 AM

cem added inline comments. Jul 12 2015, 6:04 PM

sys/kern/kern_dump.c
109–111	Whitespace looks off here, but that may just be Phabricator.
118–119	What's this construct for?
148–153	What about if the space is too small for an uncompressed dump? Follow-up question: Why don't we just start writing compressed dumps at the beginning of the medium? Edit: Below in `dump_start()`, that is exactly what we do, but only if the space would be too small. I'd prefer to simplify things a little and just start writing at the beginning always. Shrug.
291–292	I'd probably use a different errno. EINVAL? ENXIO suggests the device disappeared.
356–359	Is it valid to send unaligned blocks to dump_write_raw? But the interface requires block-unit sizes? Seems inconsistent.
400–405	Won't this panic if we switch dumpers without disabling first (`MPASS(gzs == NULL)` in `dump_gz_configure`)?

markj added inline comments. Jul 12 2015, 8:09 PM

sys/kern/kern_dump.c
109–111	Yeah, it's Phabricator.
118–119	sysctl_handle_int? It does two things here: if the user is reading the sysctl, it returns the current value of compress_kernel_dumps. And if the user is modifying it, it copies the original value out and copies the new value into the local var "value".
148–153	If the space is too small, we return an error before attempting to write anything. If it's too small for even a compressed dump, we won't find that out until we've hit the end of the partition. That's not ideal, but I don't see a good way around that. Writing to the end of the device comes from the fact that the dump device is usually also the swap partition. When the system boots up after a panic, it'll fsck the local filesystems before it recovers the dump, and fsck could swap if it's dealing with a large filesystem and the amount of physical memory available is small. Writing the dump to the beginning of the device increases the risk that it'll be overwritten with swapped pages. I suppose there could be other processes that cause this, but fsck is the main example I think. So, this is just a small robustness measure. Obviously it's not foolproof, but everything involved in kernel dumps is best-effort.
291–292	Thanks, EINVAL makes more sense.
356–359	Yeah, this is somewhat weird. There's some reasoning behind it: dump_write_raw() wants blocks that are multiples of di->blocksize in size. The gzio buffer's size is di->maxiosize, which must be a multiple of di->blocksize. So dump_gz_write_cb will always be invoked with length % di->blocksize == 0 except once, when the stream is flushed at the end of the dump (i.e. in our last write). Hence the roundup(). But, we want dumpoff to contain the true length, since it'll be used to fill in the dumplength field in the header later. Otherwise savecore will read some extra garbage beyond the end of the dump, and gzip will complain when it encounters that. This at least deserves a comment.
400–405	No: you can't switch dumpers without unsetting the current dumper first, which frees gzs. See the check above that returns EBUSY.
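
The roundup() reasoning in the 356–359 reply can be sketched as below. This is an illustration only, not the code under review: `struct dumper`, `raw_write` and `gz_write_cb` are hypothetical simplifications of the dumper state, dump_write_raw() and dump_gz_write_cb, and no data is actually written.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to the next multiple of y (y > 0), like the kernel's roundup(). */
#define ROUNDUP(x, y)	((((x) + ((y) - 1)) / (y)) * (y))

struct dumper {
	size_t blocksize;	/* device block size */
	size_t written;		/* bytes sent to the device, including padding */
	size_t dumpoff;		/* true length of the compressed data so far */
};

/* Stand-in for the raw write routine: accepts whole blocks only. */
static int
raw_write(struct dumper *d, size_t len)
{
	if (len % d->blocksize != 0)
		return (-1);
	d->written += len;
	return (0);
}

/*
 * Flush callback.  Every call except the final flush passes a multiple of
 * the block size, so the roundup only has an effect on the last write.
 * dumpoff advances by the unpadded length so that it can later be used
 * as the dump length without covering the padding.
 */
static int
gz_write_cb(struct dumper *d, size_t len)
{
	int error;

	assert(d->blocksize > 0);
	error = raw_write(d, ROUNDUP(len, d->blocksize));
	if (error != 0)
		return (error);
	d->dumpoff += len;
	return (0);
}
```

With a 512-byte block size, a full 4096-byte flush followed by a final 100-byte flush sends 4608 bytes to the device while dumpoff ends at 4196, which is the length the decompressor should be given.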