Page MenuHomeFreeBSD

don't abort writing of a core dump after EFAULT
ClosedPublic

Authored by avg on Jan 18 2017, 6:17 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Jan 24, 8:08 AM
Unknown Object (File)
Tue, Jan 21, 7:23 AM
Unknown Object (File)
Sun, Jan 12, 9:20 PM
Unknown Object (File)
Dec 20 2024, 4:08 PM
Unknown Object (File)
Dec 8 2024, 4:28 PM
Unknown Object (File)
Dec 7 2024, 8:36 AM
Unknown Object (File)
Nov 23 2024, 7:52 PM
Unknown Object (File)
Nov 23 2024, 11:52 AM
Subscribers

Details

Summary

It's possible to get EFAULT when writing a segment backed by a file
if the segment extends beyond the file. That can happen, for example,
if the processed mmap-ed the file and the file got truncated before
all pages were faulted in.
The core dump could still be useful if we skip the rest of the segment
and proceed to other segements.

Test Plan

Tested with the attached program

.

Before the change:

kernel: Failed to write core file for process mmap_trunc_core (error 14)
kernel: pid 77718 (mmap_trunc_core), uid 1001: exited on signal 6

After the change:

kernel: Failed to fully fault in a core file segment at VA 0x800645000 with size 0x4000 to be written at offset 0x29000 for process mmap_trunc_core
kernel: pid 4901 (mmap_trunc_core), uid 1001: exited on signal 6 (core dumped)

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 6864
Build 7066: arc lint + arc unit

Event Timeline

avg retitled this revision from to don't abort writing of a core dump after EFAULT.
avg updated this object.
avg edited the test plan for this revision. (Show Details)
avg added reviewers: jhb, kib.

Is this the right place to be squashing this error? It seems like some paths into this routine are not dumping file segments. Is there somewhere specific to file segments we can catch the error?

Additionally, what happens to the undumped pages in the core? Are following segments indexed correctly? How does gdb etc handle these files?

I suspect that something might have to be written to the last byte of the segment which write fails with EFAULT, so that the core file gets right size if the segment is at the end.

In D9233#190933, @cem wrote:

Is this the right place to be squashing this error? It seems like some paths into this routine are not dumping file segments. Is there somewhere specific to file segments we can catch the error?

Actually, core_output is the only place where core_write is called with UIO_USERSPACE, so that's the only case where EFAULT is expected.
But you are right, of course, it's better to move the logic to core_output.
Or, prehaps, even to coredump?
@kib what do you think?

Additionally, what happens to the undumped pages in the core? Are following segments indexed correctly? How does gdb etc handle these files?

The undumped pages should appear as holes in the core dump, so they should be transparent to gdb (appear to be zero-filled).
That's because the next segment is going to be written at a specific, correct offse.
Except, of course, as @kib has noted in his comment, that won't work for the last segment.

In D9233#190936, @kib wrote:

I suspect that something might have to be written to the last byte of the segment which write fails with EFAULT, so that the core file gets right size if the segment is at the end.

This is a good point. In practice we've never seen the problem happening for the last segment. Maybe because the segments are ordered from lower to higher addresses and we always have the shared pages and stack(s) at the end. But it's better to have a reliable solution indeed.

My first instinct was to use vn_truncate but it takes a file, not a vnode.
Let me think about this. Maybe writing an explicit zero at the end of the segment is indeed the best solution.

julian added a reviewer: julian.
julian added a subscriber: julian.

looks good to me. Hit this in 10.3

This revision is now accepted and ready to land.Jan 19 2017, 3:42 AM
In D9233#190954, @avg wrote:
In D9233#190933, @cem wrote:

Is this the right place to be squashing this error? It seems like some paths into this routine are not dumping file segments. Is there somewhere specific to file segments we can catch the error?

Actually, core_output is the only place where core_write is called with UIO_USERSPACE, so that's the only case where EFAULT is expected.
But you are right, of course, it's better to move the logic to core_output.
Or, prehaps, even to coredump?
@kib what do you think?

Additionally, what happens to the undumped pages in the core? Are following segments indexed correctly? How does gdb etc handle these files?

The undumped pages should appear as holes in the core dump, so they should be transparent to gdb (appear to be zero-filled).
That's because the next segment is going to be written at a specific, correct offse.
Except, of course, as @kib has noted in his comment, that won't work for the last segment.

If the goal of moving the check for EFAULT is to also fix gzipped case, then I think the only good variant is to add code to compress_chunk() to feed zeroes into the compressor. In other words, the existing patch is fine as is (almost), while gzip case requires unrelated change.

avg edited edge metadata.
  • zero out the data to be compressed on a copyin failure (to avoid garbage in it)
  • write a few "real" zeroes at the end of the segment to ensure that the core file ends up with the correct size (if we failed to fault in the whole segment)
This revision now requires review to proceed.Jan 20 2017, 12:49 PM
kib edited edge metadata.

Except the single-byte write, it looks fine.

sys/kern/imgact_elf.c
1252

Why not write a single byte ?
Also, we have zero_region.

This revision is now accepted and ready to land.Jan 20 2017, 12:59 PM
avg edited edge metadata.
avg edited edge metadata.

Use zero_region, write just a single byte.
Using zero_region meant that I had to add some __DECONST pepper.

This revision now requires review to proceed.Jan 20 2017, 1:13 PM
kib edited edge metadata.
This revision is now accepted and ready to land.Jan 20 2017, 1:26 PM
This revision was automatically updated to reflect the committed changes.
sys/kern/imgact_elf.c
1264–1266

It seems like this could be abused by a low-privilege userspace process to DoS the system.

sys/kern/imgact_elf.c
1264–1266

IMO, this is not much different from the existing logging that prints this:

Failed to write core file for process mmap_trunc_core (error 14)

sys/kern/imgact_elf.c
1264–1266

Yes, that's true.