Page MenuHomeFreeBSD

fsck_ffs(8): do bufinit() just before gjournal_check()
ClosedPublic

Authored by rew on May 28 2021, 9:25 PM.

Details

Summary

revert commit f190f9193bc10a8193 to fix issue reported by jhb.

revert commit 84768d114951e88288 by chs that fixed breakage introduced by
f190f9193bc10a8193.

add a bufinit() call before gjournal_check() to fix bug 245907 - this approach seems like the most simplistic solution at the moment.

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

rew requested review of this revision.May 28 2021, 9:25 PM
rew added reviewers: jhb, mckusick, chs.

what is the "issue reported by jhb" that you mention?

In D30537#685836, @chs wrote:

what is the "issue reported by jhb" that you mention?

It's reported on the mailing list. On 32-bit platforms, fsck dies with a malloc of size (0xa5a5a5a5) because the sblock isn't initialized yet by sbread(), so on HEAD it is filed with malloc junk. This results in an error as malloc() fails. On 64-bit platforms, it's still malloc junk, but the size is no longer preposterous, so malloc() dutifully tries to allocate that much data (and due to junk filling, tries to touch it all) and eventually runs the system out of swap. On stable systems these are probably resulting in calls to malloc(0) and subsequent buffer overflows.

I think this fix is probably sensible for the immediate term, and it's probably good to get this fixed sooner rather than later given it's been breaking CI for several days now.

This revision is now accepted and ready to land.May 29 2021, 12:24 AM
In D30537#685858, @jhb wrote:
In D30537#685836, @chs wrote:

what is the "issue reported by jhb" that you mention?

It's reported on the mailing list. On 32-bit platforms, fsck dies with a malloc of size (0xa5a5a5a5) because the sblock isn't initialized yet by sbread(), so on HEAD it is filed with malloc junk. This results in an error as malloc() fails. On 64-bit platforms, it's still malloc junk, but the size is no longer preposterous, so malloc() dutifully tries to allocate that much data (and due to junk filling, tries to touch it all) and eventually runs the system out of swap. On stable systems these are probably resulting in calls to malloc(0) and subsequent buffer overflows.

On most 64-bit systems it probably works, as it's "only" ~2.8 GiB of data being allocated. It'll be slow and bloated, but most people's primary systems can easily handle that. It's only once you get to small VMs, or emulating in QEMU, that you start to run out of memory and swap. Presumably the original patch was tested on a system that had enough memory to cope.

I travelled to the north coast where we have been offline for two days, but now back online. Will look into this directly.

I think a better change is to simply preinitialize fs_bsize in sblock_init to MAXBSIZE which will be valid for all UFS filesystems so that bufinit will do a resonable allocation.

This revision now requires changes to proceed.May 29 2021, 3:06 AM

I reverted the breaking commits until a solution is found.

jrtc27 summed it up pretty well.

I only tested the original patch on 64-bit and the malloc() call was succeeding because there was ample memory.

Had I tested this on 32-bit, I would have caught it during testing - my fault.

I think a better change is to simply preinitialize fs_bsize in sblock_init to MAXBSIZE which will be valid for all UFS filesystems so that bufinit will do a resonable allocation.

If you're doing that, why not just make bufinit pass MAXBSIZE to malloc instead rather than initialising the in-memory superblock with a dummy value? The latter seems dangerous, it's nice to be able to have malloc junk catch things that read parts of the superblock that haven't been read from disk yet.

I think a better change is to simply preinitialize fs_bsize in sblock_init to MAXBSIZE which will be valid for all UFS filesystems so that bufinit will do a resonable allocation.

If you're doing that, why not just make bufinit pass MAXBSIZE to malloc instead rather than initialising the in-memory superblock with a dummy value? The latter seems dangerous, it's nice to be able to have malloc junk catch things that read parts of the superblock that haven't been read from disk yet.

I concur with your suggestion. In bufinit() replace use of sblock.fs_bsize with MAXBSIZE.

Given that you rolled back the two changes, your suggested change of just adding a bufinit() in the gjournal case is the easiest and minimal change to solving this bug.

This revision is now accepted and ready to land.May 29 2021, 3:03 PM
This revision was automatically updated to reflect the committed changes.