After debugging with a user on discord why their process is not
generating coredumps, it became clear that we can use better error
messages -- particularly to distinguish between the different EFAULT
cases. For those that are denied by system policies, include some
more specific pointers to the relevant knob.
Details
Diff Detail
- Repository
- rG FreeBSD src repository
- Lint
Lint Not Applicable - Unit
Tests Not Applicable
Event Timeline
@kib had also pointed out that we could do better in previous reviews, but to then hit a case where it was more annoying to debug this quickly was rather convincing...
I suggested earlier that you can reuse exterr there, and Mark reasonably asked what place would be fetching the error string. I think that coredump() is the good location, It would not use the infra to copy out exterr to userspace, but it can get the string and do uprintf() for instance. Then the existing and future conversions to EXTERROR() would immediately benefit coredump code as a whole.
Hmm, it seems like the exterr bits may need some additional work here? The primary issues that I see for this are:
- Currently it wants TDP2_UEXTERR set to actually record anything, but for in-kernel usage we wouldn't seem to care about that
- The utility of exterr for this purpose drops on kernel configs that don't have BLOAT_KERNEL_WITH_EXTERR set
For #1, I guess one could just set td->td_kexterr outside of the conditional in exterr_set() and only set TDP2_EXTERR and do ktrexterr() if the thread has it configured. ((edit: well, I guess we'd need TDP2_EXTERR to see that the exterr is valid)
#2 is mainly an issue because I wanted to use this to improve the message we already send to syslog, which is currently pretty unhelpful for these specific cases.
I suggest coredump() might just set TDP2_UEXTERR on principle that it wants it, and the current thread is not supposed to return to userspace at all after it.
If there is no BLOAT_KERNEL_WITH_EXTERR in config, we probably should output whatever debugging gibberish we can, like numeric category + line number etc.
Opinions?
exterr usage propsal -- sys/exterrvar.h is split out into a separate commit locally:
kern: exterr: don't discard non-literal strings for !BLOAT configs This provides an escape hatch for callers to be able to provide a msg if they deem it critical enough, without worrying about whether the kernel is configured with the BLOAT_KERNEL_WITH_EXTERR config option or not.
Sure, no objection.
If there is no BLOAT_KERNEL_WITH_EXTERR in config, we probably should output whatever debugging gibberish we can, like numeric category + line number etc.
Opinions?
My main concern is that I don't want to regress this syslog message (and I'd prefer we make the cases changed here more clear), but I think there's a compromise that could be made. How about this version?
sys/kern/kern_ucoredump.c | ||
---|---|---|
159 | As I understand, this is mostly identical to the log() at the end of the function. | |
190 | And this is the third instance of the same log(). | |
200 | Move this above switch(rv) and do it just based on TDP2_EXTERR? | |
sys/sys/exterrvar.h | ||
37 ↗ | (On Diff #158944) | I do not object against this approach. |
sys/sys/exterrvar.h | ||
---|---|---|
37 ↗ | (On Diff #158944) | I had considered it and went with this as less-invasive since EXTERRORV() would require a full set of _SET_ERROR* macros to propagate it through, but I guess these macros will probably remain pretty static from here on so the maintenance burden probably isn't much (and it's not really my call to make, from that perspective). |
Take #3 - less chaotic, adopt an EXTERRORV() instead
sys/kern/kern_ucoredump.c | ||
---|---|---|
126 | Grr; this will go away. |
sys/kern/kern_ucoredump.c | ||
---|---|---|
162 | I would still print exterr->error %d. | |
179 | ||
sys/sys/exterrvar.h | ||
47 ↗ | (On Diff #159029) | |
37 ↗ | (On Diff #158944) | Sorry, I am not that proficient in english. Do you mean, by 'not my call to make', that I should have coded ERRORV macros? I sure could do it, but you did it great and this is not that much code, actually. |
sys/sys/exterrvar.h | ||
---|---|---|
37 ↗ | (On Diff #158944) | No, sorry- it's effectively "not my [decision] to make" -- whether it's acceptable or not is your decision as the maintainer, I will write whatever you're happy with since I'm just doing a drive-by here and am not as likely as you are to make other changes in the area in the future. |