Page MenuHomeFreeBSD

Disable free space checking in savecore.
AbandonedPublic

Authored by asomers on May 19 2015, 4:13 PM.

Details

Reviewers
pjd
Group Reviewers
manpages
Summary

Disable free space checking in savecore. It's file_size estimate is
so wildly conservative that it's basically useless IMHO. The estimate does not
consider the effects of sparseness (~ 3.5x space reduction), ZFS compression (~
32x space reduction), or gzip compression (~ 68x space reduction).

It is an open question what savecore's behavior should be when it actually runs
out of space. This patch implements no special handling. Do you think that
savecore should leave a fragmentary vmcore file behind or delete it on an
ENOSPC or EDQUOT failure?

Test Plan

Put /var/crash on a file system with insufficient free space for an
uncompressed, dense, core dump. On a stock 11.0 system, such a dump takes
about 1.2GB. It should have enough space for a sparse dump, which needs about
200MB. Create a dump and try to save it with savecore. It should succeed
despite the low free space.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint OK
Unit
No Unit Test Coverage

Event Timeline

asomers updated this revision to Diff 5483.May 19 2015, 4:13 PM
asomers retitled this revision from to Disable free space checking in savecore..
asomers updated this object.
asomers edited the test plan for this revision. (Show Details)
asomers added a reviewer: pjd.
asomers added a subscriber: markj.
def added a subscriber: def.Oct 16 2015, 10:51 PM

I have made some inline comments. I think a partial vmcore shouldn't be removed because another panic might happen before a user tries to run savecore manually again.

sbin/savecore/savecore.c
290

This empty line should be removed.

613–615

If we don't check free space then we shouldn't write this message.

621

savecore can fail during saving a core dump. writebounds should be called after the core was successfully written. Otherwise a user can delete old core dumps (because of low maxdumps value) when it repeatedly tries to save the new core dump without enough space. Also it would be better to overwrite a partial core dump with a full core dump once a user has more free space.

In D2587#81386, @def wrote:

I have made some inline comments. I think a partial vmcore shouldn't be removed because another panic might happen before a user tries to run savecore manually again.

Hm, I'm not sure I agree. If the FS really doesn't have enough space, this will just cause savecore to consume all the remaining space with a kernel dump fragment, and I'm pretty sure a truncated vmcore is mostly useless anyway.

Generally I think it'd be wrong to entirely remove this checking. There are lots of systems in which /var/crash resides on a UFS FS and the space checking is mostly correct. In particular, Isilon has code to compress kernel dumps before writing them to the dumpdev (see D3054-59), so savecore will not create a sparse file in this case, and we frequently create large cores that cannot be extracted to /var/crash. Why not make the check conditional on a command-line option? This option could be set automatically if -z is specified.

asomers added a comment.EditedOct 16 2015, 11:35 PM

Hm, I'm not sure I agree. If the FS really doesn't have enough space, this will just cause savecore to consume all the remaining space with a kernel dump fragment, and I'm pretty sure a truncated vmcore is mostly useless anyway.

Agreed. I think it would be better to remove a truncated vmcore. pjd suggested removing the partial vmcore and not clearing the dump from the dump device. I agree with him on that.

Generally I think it'd be wrong to entirely remove this checking. There are lots of systems in which /var/crash resides on a UFS FS and the space checking is mostly correct. In particular, Isilon has code to compress kernel dumps before writing them to the dumpdev (see D3054-59), so savecore will not create a sparse file in this case, and we frequently create large cores that cannot be extracted to /var/crash. Why not make the check conditional on a command-line option? This option could be set automatically if -z is specified.

The space checking is not mostly correct. savecore will always save either a compressed dump or a sparse dump. In my experiments even a sparse dump consumes far less disk space than its nominal size.

OTOH, if Isilon's compressed core dump code gets merged, then the space checking would be useful again. I don't like the idea of adding another command-line option though. How does this sound?

If core dump is compressed
   If /var/crash has sufficient space
       Write compressed core file
else
    if user specified "-z"
        Try to write compressed core file
    else
        Try to write sparse core file
If the write succeeded
    Clear the core from the dump device
else
    Delete the partial core file
    Log an error message

Hm, I'm not sure I agree. If the FS really doesn't have enough space, this will just cause savecore to consume all the remaining space with a kernel dump fragment, and I'm pretty sure a truncated vmcore is mostly useless anyway.

Agreed. I think it would be better to remove a truncated vmcore. pjd suggested removing the partial vmcore and not clearing the dump from the dump device. I agree with him on that.

Generally I think it'd be wrong to entirely remove this checking. There are lots of systems in which /var/crash resides on a UFS FS and the space checking is mostly correct. In particular, Isilon has code to compress kernel dumps before writing them to the dumpdev (see D3054-59), so savecore will not create a sparse file in this case, and we frequently create large cores that cannot be extracted to /var/crash. Why not make the check conditional on a command-line option? This option could be set automatically if -z is specified.

The space checking is not mostly correct. savecore will always save either a compressed dump or a sparse dump. In my experiments even a sparse dump consumes far less disk space than its nominal size.
OTOH, if Isilon's compressed core dump code gets merged, then the space checking would be useful again. I don't like the idea of adding another command-line option though. How does this sound?

If core dump is compressed
   If /var/crash has sufficient space
       Write compressed core file
else
    if user specified "-z"
        Try to write compressed core file
    else
        Try to write sparse core file
If the write succeeded
    Clear the core from the dump device
else
    Delete the partial core file
    Log an error message

This makes sense to me. I do plan on merging the compression code before 11.0, so the checking will indeed be useful in that case.

asomers abandoned this revision.Oct 22 2015, 5:03 PM

Per discussion, I'm abandoning this revision. After D3057 and D3059 are complete, I will reimplement these changes. That work is tracked at https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203963