It's possible to get EFAULT when writing a segment backed by a file
if the segment extends beyond the file. That can happen, for example,
if the processed mmap-ed the file and the file got truncated before
all pages were faulted in.
The core dump could still be useful if we skip the rest of the segment
and proceed to other segements.
Details
- Reviewers
kib jhb julian - Commits
- rS312532: don't abort writing of a core dump after EFAULT
Tested with the attached program
.Before the change:
kernel: Failed to write core file for process mmap_trunc_core (error 14) kernel: pid 77718 (mmap_trunc_core), uid 1001: exited on signal 6
After the change:
kernel: Failed to fully fault in a core file segment at VA 0x800645000 with size 0x4000 to be written at offset 0x29000 for process mmap_trunc_core kernel: pid 4901 (mmap_trunc_core), uid 1001: exited on signal 6 (core dumped)
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
Lint Not Applicable - Unit
Tests Not Applicable
Event Timeline
Is this the right place to be squashing this error? It seems like some paths into this routine are not dumping file segments. Is there somewhere specific to file segments we can catch the error?
Additionally, what happens to the undumped pages in the core? Are following segments indexed correctly? How does gdb etc handle these files?
I suspect that something might have to be written to the last byte of the segment which write fails with EFAULT, so that the core file gets right size if the segment is at the end.
Actually, core_output is the only place where core_write is called with UIO_USERSPACE, so that's the only case where EFAULT is expected.
But you are right, of course, it's better to move the logic to core_output.
Or, prehaps, even to coredump?
@kib what do you think?
Additionally, what happens to the undumped pages in the core? Are following segments indexed correctly? How does gdb etc handle these files?
The undumped pages should appear as holes in the core dump, so they should be transparent to gdb (appear to be zero-filled).
That's because the next segment is going to be written at a specific, correct offse.
Except, of course, as @kib has noted in his comment, that won't work for the last segment.
This is a good point. In practice we've never seen the problem happening for the last segment. Maybe because the segments are ordered from lower to higher addresses and we always have the shared pages and stack(s) at the end. But it's better to have a reliable solution indeed.
My first instinct was to use vn_truncate but it takes a file, not a vnode.
Let me think about this. Maybe writing an explicit zero at the end of the segment is indeed the best solution.
If the goal of moving the check for EFAULT is to also fix gzipped case, then I think the only good variant is to add code to compress_chunk() to feed zeroes into the compressor. In other words, the existing patch is fine as is (almost), while gzip case requires unrelated change.
- zero out the data to be compressed on a copyin failure (to avoid garbage in it)
- write a few "real" zeroes at the end of the segment to ensure that the core file ends up with the correct size (if we failed to fault in the whole segment)
Except the single-byte write, it looks fine.
sys/kern/imgact_elf.c | ||
---|---|---|
1245 ↗ | (On Diff #24231) | Why not write a single byte ? |
Use zero_region, write just a single byte.
Using zero_region meant that I had to add some __DECONST pepper.
sys/kern/imgact_elf.c | ||
---|---|---|
1259–1261 ↗ | (On Diff #24231) | It seems like this could be abused by a low-privilege userspace process to DoS the system. |
sys/kern/imgact_elf.c | ||
---|---|---|
1259–1261 ↗ | (On Diff #24231) | IMO, this is not much different from the existing logging that prints this:
|
sys/kern/imgact_elf.c | ||
---|---|---|
1259–1261 ↗ | (On Diff #24231) | Yes, that's true. |