Page MenuHomeFreeBSD

don't abort writing of a core dump after EFAULT
ClosedPublic

Authored by avg on Jan 18 2017, 6:17 PM.

Details

Summary

It's possible to get EFAULT when writing a segment backed by a file
if the segment extends beyond the file. That can happen, for example,
if the processed mmap-ed the file and the file got truncated before
all pages were faulted in.
The core dump could still be useful if we skip the rest of the segment
and proceed to other segements.

Test Plan

Tested with the attached program

.

Before the change:

kernel: Failed to write core file for process mmap_trunc_core (error 14)
kernel: pid 77718 (mmap_trunc_core), uid 1001: exited on signal 6

After the change:

kernel: Failed to fully fault in a core file segment at VA 0x800645000 with size 0x4000 to be written at offset 0x29000 for process mmap_trunc_core
kernel: pid 4901 (mmap_trunc_core), uid 1001: exited on signal 6 (core dumped)

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

avg updated this revision to Diff 24168.Jan 18 2017, 6:17 PM
avg retitled this revision from to don't abort writing of a core dump after EFAULT.
avg updated this object.
avg edited the test plan for this revision. (Show Details)
avg added reviewers: jhb, kib.
cem added a subscriber: cem.Jan 18 2017, 6:49 PM

Is this the right place to be squashing this error? It seems like some paths into this routine are not dumping file segments. Is there somewhere specific to file segments we can catch the error?

Additionally, what happens to the undumped pages in the core? Are following segments indexed correctly? How does gdb etc handle these files?

kib edited edge metadata.Jan 18 2017, 7:12 PM

I suspect that something might have to be written to the last byte of the segment which write fails with EFAULT, so that the core file gets right size if the segment is at the end.

avg added a comment.Jan 18 2017, 9:39 PM
In D9233#190933, @cem wrote:

Is this the right place to be squashing this error? It seems like some paths into this routine are not dumping file segments. Is there somewhere specific to file segments we can catch the error?

Actually, core_output is the only place where core_write is called with UIO_USERSPACE, so that's the only case where EFAULT is expected.
But you are right, of course, it's better to move the logic to core_output.
Or, prehaps, even to coredump?
@kib what do you think?

Additionally, what happens to the undumped pages in the core? Are following segments indexed correctly? How does gdb etc handle these files?

The undumped pages should appear as holes in the core dump, so they should be transparent to gdb (appear to be zero-filled).
That's because the next segment is going to be written at a specific, correct offse.
Except, of course, as @kib has noted in his comment, that won't work for the last segment.

avg added a comment.Jan 18 2017, 9:50 PM
In D9233#190936, @kib wrote:

I suspect that something might have to be written to the last byte of the segment which write fails with EFAULT, so that the core file gets right size if the segment is at the end.

This is a good point. In practice we've never seen the problem happening for the last segment. Maybe because the segments are ordered from lower to higher addresses and we always have the shared pages and stack(s) at the end. But it's better to have a reliable solution indeed.

My first instinct was to use vn_truncate but it takes a file, not a vnode.
Let me think about this. Maybe writing an explicit zero at the end of the segment is indeed the best solution.

julian accepted this revision.Jan 19 2017, 3:42 AM
julian added a reviewer: julian.
julian added a subscriber: julian.

looks good to me. Hit this in 10.3

This revision is now accepted and ready to land.Jan 19 2017, 3:42 AM
kib added a comment.Jan 19 2017, 7:32 AM
In D9233#190954, @avg wrote:
In D9233#190933, @cem wrote:

Is this the right place to be squashing this error? It seems like some paths into this routine are not dumping file segments. Is there somewhere specific to file segments we can catch the error?

Actually, core_output is the only place where core_write is called with UIO_USERSPACE, so that's the only case where EFAULT is expected.
But you are right, of course, it's better to move the logic to core_output.
Or, prehaps, even to coredump?
@kib what do you think?

Additionally, what happens to the undumped pages in the core? Are following segments indexed correctly? How does gdb etc handle these files?

The undumped pages should appear as holes in the core dump, so they should be transparent to gdb (appear to be zero-filled).
That's because the next segment is going to be written at a specific, correct offse.
Except, of course, as @kib has noted in his comment, that won't work for the last segment.

If the goal of moving the check for EFAULT is to also fix gzipped case, then I think the only good variant is to add code to compress_chunk() to feed zeroes into the compressor. In other words, the existing patch is fine as is (almost), while gzip case requires unrelated change.

avg updated this revision to Diff 24231.Jan 20 2017, 12:49 PM
avg edited edge metadata.
  • zero out the data to be compressed on a copyin failure (to avoid garbage in it)
  • write a few "real" zeroes at the end of the segment to ensure that the core file ends up with the correct size (if we failed to fault in the whole segment)
This revision now requires review to proceed.Jan 20 2017, 12:49 PM
kib accepted this revision.Jan 20 2017, 12:59 PM
kib edited edge metadata.

Except the single-byte write, it looks fine.

sys/kern/imgact_elf.c
1245 ↗(On Diff #24231)

Why not write a single byte ?
Also, we have zero_region.

This revision is now accepted and ready to land.Jan 20 2017, 12:59 PM
avg edited the test plan for this revision. (Show Details)Jan 20 2017, 1:01 PM
avg edited edge metadata.
avg updated this revision to Diff 24232.Jan 20 2017, 1:13 PM
avg edited edge metadata.

Use zero_region, write just a single byte.
Using zero_region meant that I had to add some __DECONST pepper.

This revision now requires review to proceed.Jan 20 2017, 1:13 PM
kib accepted this revision.Jan 20 2017, 1:26 PM
kib edited edge metadata.
This revision is now accepted and ready to land.Jan 20 2017, 1:26 PM
This revision was automatically updated to reflect the committed changes.
cem added inline comments.Jan 20 2017, 5:23 PM
sys/kern/imgact_elf.c
1259–1261 ↗(On Diff #24231)

It seems like this could be abused by a low-privilege userspace process to DoS the system.

avg added inline comments.Jan 20 2017, 7:14 PM
sys/kern/imgact_elf.c
1259–1261 ↗(On Diff #24231)

IMO, this is not much different from the existing logging that prints this:

Failed to write core file for process mmap_trunc_core (error 14)

cem added inline comments.Jan 20 2017, 7:21 PM
sys/kern/imgact_elf.c
1259–1261 ↗(On Diff #24231)

Yes, that's true.