fsck_ffs(8): do bufinit() just before gjournal_check()
ClosedPublic
Actions

Authored by rew on May 28 2021, 9:25 PM.

Details

Reviewers

jhb
mckusick
chs

Commits

rGf8145bd4bcc0: fsck_ufs: fix segfault with gjournal
rG47ec64b3e801: fsck_ufs: fix segfault with gjournal
rG441e69e419ef: fsck_ufs: fix segfault with gjournal

Summary

revert commit f190f9193bc10a8193 to fix issue reported by jhb.

revert commit 84768d114951e88288 by chs that fixed breakage introduced by
f190f9193bc10a8193.

add a bufinit() call before gjournal_check() to fix bug 245907 - this approach seems like the most simplistic solution at the moment.

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

rew created this revision.May 28 2021, 9:25 PM

Herald added a subscriber: imp. · View Herald TranscriptMay 28 2021, 9:25 PM

rew requested review of this revision.May 28 2021, 9:25 PM

Harbormaster completed remote builds in B39555: Diff 90090.May 28 2021, 9:25 PM

rew edited the summary of this revision. (Show Details)May 28 2021, 11:12 PM

rew added reviewers: jhb, mckusick, chs.

what is the "issue reported by jhb" that you mention?

In D30537#685836, @chs wrote:

what is the "issue reported by jhb" that you mention?

It's reported on the mailing list. On 32-bit platforms, fsck dies with a malloc of size (0xa5a5a5a5) because the sblock isn't initialized yet by sbread(), so on HEAD it is filed with malloc junk. This results in an error as malloc() fails. On 64-bit platforms, it's still malloc junk, but the size is no longer preposterous, so malloc() dutifully tries to allocate that much data (and due to junk filling, tries to touch it all) and eventually runs the system out of swap. On stable systems these are probably resulting in calls to malloc(0) and subsequent buffer overflows.

I think this fix is probably sensible for the immediate term, and it's probably good to get this fixed sooner rather than later given it's been breaking CI for several days now.

This revision is now accepted and ready to land.May 29 2021, 12:24 AM

jhb added a subscriber: jrtc27.May 29 2021, 12:24 AM

In D30537#685858, @jhb wrote:

In D30537#685836, @chs wrote:

what is the "issue reported by jhb" that you mention?

It's reported on the mailing list. On 32-bit platforms, fsck dies with a malloc of size (0xa5a5a5a5) because the sblock isn't initialized yet by sbread(), so on HEAD it is filed with malloc junk. This results in an error as malloc() fails. On 64-bit platforms, it's still malloc junk, but the size is no longer preposterous, so malloc() dutifully tries to allocate that much data (and due to junk filling, tries to touch it all) and eventually runs the system out of swap. On stable systems these are probably resulting in calls to malloc(0) and subsequent buffer overflows.

On most 64-bit systems it probably works, as it's "only" ~2.8 GiB of data being allocated. It'll be slow and bloated, but most people's primary systems can easily handle that. It's only once you get to small VMs, or emulating in QEMU, that you start to run out of memory and swap. Presumably the original patch was tested on a system that had enough memory to cope.

I travelled to the north coast where we have been offline for two days, but now back online. Will look into this directly.

I think a better change is to simply preinitialize fs_bsize in sblock_init to MAXBSIZE which will be valid for all UFS filesystems so that bufinit will do a resonable allocation.

diffs392 BDownload

This revision now requires changes to proceed.May 29 2021, 3:06 AM

I reverted the breaking commits until a solution is found.

jrtc27 summed it up pretty well.

I only tested the original patch on 64-bit and the malloc() call was succeeding because there was ample memory.

Had I tested this on 32-bit, I would have caught it during testing - my fault.

In D30537#685867, @mckusick wrote:

I think a better change is to simply preinitialize fs_bsize in sblock_init to MAXBSIZE which will be valid for all UFS filesystems so that bufinit will do a resonable allocation.

diffs392 BDownload

If you're doing that, why not just make bufinit pass MAXBSIZE to malloc instead rather than initialising the in-memory superblock with a dummy value? The latter seems dangerous, it's nice to be able to have malloc junk catch things that read parts of the superblock that haven't been read from disk yet.

In D30537#685871, @jrtc27 wrote:

In D30537#685867, @mckusick wrote:

I think a better change is to simply preinitialize fs_bsize in sblock_init to MAXBSIZE which will be valid for all UFS filesystems so that bufinit will do a resonable allocation.

diffs392 BDownload

If you're doing that, why not just make bufinit pass MAXBSIZE to malloc instead rather than initialising the in-memory superblock with a dummy value? The latter seems dangerous, it's nice to be able to have malloc junk catch things that read parts of the superblock that haven't been read from disk yet.

I concur with your suggestion. In bufinit() replace use of sblock.fs_bsize with MAXBSIZE.

Given that you rolled back the two changes, your suggested change of just adding a bufinit() in the gjournal case is the easiest and minimal change to solving this bug.

This revision is now accepted and ready to land.May 29 2021, 3:03 PM

Closed by commit rG441e69e419ef: fsck_ufs: fix segfault with gjournal (authored by rew). · Explain WhyJun 3 2021, 2:33 AM

This revision was automatically updated to reflect the committed changes.

rew added a commit: rG441e69e419ef: fsck_ufs: fix segfault with gjournal.

rew added a commit: rG47ec64b3e801: fsck_ufs: fix segfault with gjournal.Jun 11 2021, 4:59 PM

rew added a commit: rGf8145bd4bcc0: fsck_ufs: fix segfault with gjournal.Nov 20 2021, 6:34 PM