- Dump an NT_X86_XSTATE note if XSAVE is in use. This note is designed to match what Linux does in that 1) it dumps the entire XSAVE area including the fxsave state, and 2) it stashes a copy of the current xsave mask in the unused padding between the fxsave state and the xstate header at the same location used by Linux.
- Teach readelf() to recognize NT_X86_XSTATE notes.
- Change PT_GET/SETXSTATE to take the entire XSAVE state instead of only the extra portion. This avoids having to always make two ptrace() calls to get or set the full XSAVE state.
- Add a PT_GET_XSTATE_INFO which returns the length of the current XSTATE save area (so the size of the buffer needed for PT_GETXSTATE) and the current XSAVE mask (%xcr0).
Details
- Reviewers
kib - Commits
- rS274817: Improve support for XSAVE with debuggers.
Using my patches to gdb, I was able to examine ymm registers in test
programs using AVX under both i386 and amd64. Core dumps work for
both i386 and amd64 as well as for 32-bit binaries on amd64 (and
gdb can examine ymm regs for all 3 cases). gcore works in all 3
cases. Was able to modify a ymm register using 'set' in gdb at runtime
for both i386 and amd64. Note that gdb (neither in tree nor in
ports) cannot currently handle 32-bit binaries on amd64 since it is
missing a hook to force usage of ld-elf32.so.1.
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
No Lint Coverage - Unit
No Test Coverage
Event Timeline
sys/x86/include/ptrace.h | ||
---|---|---|
48 | I do not see why userspace needs the _OLD values exposed. I suggest either keep them under _KERNEL or move to .c files. |
sys/x86/include/ptrace.h | ||
---|---|---|
48 | Ah, yes, I agree. Under _KERNEL is fine. I think it's useful to leave them in the header so it's clear that these values are not available for future operations. |
sys/amd64/amd64/elf_machdep.c | ||
---|---|---|
153 | Strictly speaking, the comment is wrong. Thread may use FPU if core dumping write path utilizes FPU in kernel. Think about things like geli. Worse, if context switch happens after fpugetregs() but before bcopy() finished, and FPU state was still loaded into hw, the context switch would spill the state into the save area, overriding your mask. | |
sys/amd64/amd64/ptrace_machdep.c | ||
76 | I somewhat dislike this. My feel is that for ABI extensibility and stability it would be better to supply both return values from the same path, i.e. define some structute which contains both mask and ext_state size, and copy it out. Also, a padding for future extensions might be due. | |
sys/compat/ia32/ia32_sysvec.c | ||
196 | Why copy of dst into buf is needed ? | |
211 | Why ? buf is not used afterfard. | |
sys/kern/imgact_elf.c | ||
1619 | Use static const char freebsd[] = "FreeBSD" ? And then, you can just use the pointer to array and sizeof() of it, instead of magic '8' and strlen ? | |
sys/x86/include/fpu.h | ||
74 | Very nice cleanup, probably should be committed separately. |
sys/amd64/amd64/elf_machdep.c | ||
---|---|---|
153 | For the first bit, I'm worried that overwriting this will cause problems with the thread reloading the state from the save area. (i.e. the xrstor would fault due to the garbage at this location). Alternatively, I could use a gross hack to determine the offset into 'buf' and do this after elf64_populate_note(). It's just really gross to do (and I'm trying to avoid having to create another temporary buffer. For one, I can't return an error from here and the rest of the coredump code uses M_NOWAIT allocations). So I think a temporary buffer is out. If we know for certain that xrstor won't happen after this point (the thread can't go back into userland to do a DNA trap, and presumably even if something like geli is used it will just do an xsave of the current state and then load in a kernel-side state, but it won't do an xrstor when it is done, it would leave fpcurthread clear and be waiting for a DNA trap?) then a critical section might be ok. Otherwise I will have to frob buf afterwards, which is gross, but doable I guess. | |
sys/amd64/amd64/ptrace_machdep.c | ||
76 | Yeah, I started out returning only one thing and then added xcr. I can just do a structure. I'm not sure what else I would add to it, but perhaps we could require 'data' to be the size of the structure being fetched to make it extendable in the future? | |
sys/compat/ia32/ia32_sysvec.c | ||
211 | I did this (in all 3 versions) so that buf would be ready for a future MD note added in the future. However, I can drop that and someone else can add it in the future if needed. | |
sys/kern/imgact_elf.c | ||
1619 | This is actually copy and pasted from elsewhere in the file, but I would prefer this, yes. You might even be able to just use 'sizeof("FreeBSD")' directly. However, a 'static const char note_name[]' that the entire file used would be better. |
sys/amd64/amd64/elf_machdep.c | ||
---|---|---|
153 | There are two options. One, which is harder, is to allow to compose note data from pieces. Another, much simpler, is to drop state from hardware in advance. Like critical_enter(); fpudrop(); critical_leave(); do your stuff; restore the save area; fpudrop() ensures that context switch does not need to do the save. Since using FPU in interrupt context is disabled, this should work. | |
sys/amd64/amd64/ptrace_machdep.c | ||
76 | Yes, I want the structure. And I think that requiring data be the size is good idea. ptrace(PT_LWPINFO) already does this. | |
sys/compat/ia32/ia32_sysvec.c | ||
211 | Fine, keep it, but please add a comment. |
Feedback from Konstantin:
- Use FREEBSD_ABI_VENDOR instead of magic constants for the note name.
- Optionally return a pointer to the note payload to the caller of elfNN_populate_note() and use this to save xsave_mask in the note.
- Export a structure for xstate_info instead of a return value and
sys/amd64/amd64/elf_machdep.c | ||
---|---|---|
153 | Hmm, I ended up letting elfNN_populate_note() return a pointer to the start of the payload inside the note. I then use that to stash the saved copy of the mask. (So I never write into the thread's saved area directly). Let me know what you think of that approach vs the other. |