Page MenuHomeFreeBSD

Add PT_GETREGSET
Needs ReviewPublic

Authored by andrew on Apr 5 2019, 12:36 PM.

Details

Reviewers
jhb
kib
manu
Group Reviewers
manpages
Summary

This adds the PT_GETREGSET and PT_SETREGSET ptrace types. These can be
used to access all the registers from a specified core dump note type.
The NT_PRSTATUS and NT_FPREGSET notes are initially supported. Other
machine-dependant types are expected to be added in the future.

The ptrace addr points to a struct iovec pointing at memory to hold the
registers along with its length. On success the length in the iovec is
updated to tell userspace the actual length the kernel wrote or, if the
base address is NULL, the length the kernel would have written.

Because the data field is an int the arguments are backwards when
compared to the Linux PTRACE_GETREGSET call.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 41476
Build 38365: arc lint + arc unit

Event Timeline

Hmm, do you have plans for a PT_SETREGSET as well? To replace the ARM VFP, and PPC VMX, etc. sets we need set, not just get. Also, as I said previously on IRC I would like to have a PT_GETREGSET_INFO where the iovec in addr points to a structure. The first field would always be a uint64_t size member, but the rest of the fields could vary by note type. For XSTATE the structure would also hold the xcr0 mask to replace PT_GETXSTATE_INFO, so something like:

struct regset_info {
    uint64_t len;
};

and then in x86/include/ptrace.h:

struct xstate_regset_info {
     uint64_t len;
     uint64_t xsave_mask;
sys/kern/sys_process.c
630

style(9) is to use ()'s with sizeof

665

Perhaps use 'goto out' here instead?

705

And move the free of kern_vec.iov_base to here under an 'out' label. This fixes a leak when the first copyout fails. It will also make adding other functions that use an iovec simpler by consolidating the error handling.

If you return whole XSTATE from ptrace(PT_GETREGSET, NT_X86_XSTATE), what is the point of the new API ? It would make much more sense to me if there was an API to get components like YMM/ZMM/PKRU instead of returning whole blob and then requiring the caller to parse it combining XCR0 bitmask with either prior knowledge of the component sizes, or querying the offsets on live machine.

sys/amd64/amd64/elf_machdep.c
261

I believe you can do the assignment in fpu,c where cpu_max_ext_state_size is calculated, there is no need in SYSINIT.

268

This looks strange, I see that it matches the note behavior but still. If !use_xsave, you would return x87+XMM state, which is strict subset of XSAVE state.

sys/sys/ptrace.h
39

Why this include is needed in public header ? I believe it is fair to require users to #include <sys/uio.h> if they need PT_GETREGSET.

In D19831#425341, @kib wrote:

If you return whole XSTATE from ptrace(PT_GETREGSET, NT_X86_XSTATE), what is the point of the new API ? It would make much more sense to me if there was an API to get components like YMM/ZMM/PKRU instead of returning whole blob and then requiring the caller to parse it combining XCR0 bitmask with either prior knowledge of the component sizes, or querying the offsets on live machine.

Both GDB and lldb are happy to just get the blob. However, we could use this to replace some of the "one-off" register sets we have now like the VFP registers for ARM, the two separate vector register sets for PPC. I plan to add a new register set on x86 for fs_base/gs_base mostly because I need a new core dump note to permit TLS to work in gdb on core dumps. ARM needs the TLS register available in cores and via ptrace for which a new register set would make sense (and/or extending the existing 'struct reg' which PT_GETREGSET makes easier). One thing that you should be able to do, btw is use PT_GETREGSET to fetch the normal and floating point register sets. Floating point is just NT_FPREGSET. struct reg is a bit harder. My guess is you'd let NT_PRSTATUS return struct reg (but not the full core dump note), but we should see what Linux does perhaps for this.

In D19831#425330, @jhb wrote:

Hmm, do you have plans for a PT_SETREGSET as well?

Yes, when I have time to investigate how Linux handles a set with an incorrect length.

To replace the ARM VFP, and PPC VMX, etc. sets we need set, not just get. Also, as I said previously on IRC I would like to have a PT_GETREGSET_INFO where the iovec in addr points to a structure. The first field would always be a uint64_t size member, but the rest of the fields could vary by note type.

What would the length hold if the buffer was too small for the data but large enough for the length? I have an update that will set the iovec length to the expected buffer length when the base is NULL so userspace can query it.

In D19831#425359, @jhb wrote:

One thing that you should be able to do, btw is use PT_GETREGSET to fetch the normal and floating point register sets. Floating point is just NT_FPREGSET. struct reg is a bit harder. My guess is you'd let NT_PRSTATUS return struct reg (but not the full core dump note), but we should see what Linux does perhaps for this.

It looks like Linux does this. It then seems to have a special case for NT_PRSTATUS when building a core dump.

sys/amd64/amd64/elf_machdep.c
261

I'm planning on splitting this out into a new review. I kept it in for some context on how the KPI should be used.

sys/kern/sys_process.c
630

This is mostly to be consistent with the style in the switch. I'm planning on adding COMPAT32 support, but will need to add an iovec32 first. This will need this non-style sizeof.

  • Remove xsave. It will move to a new review.
  • Add PT_SETREGSET
  • Use goto out for error handling
  • Remove sys/_iovec.h from sys/ptrace.h

I also return the size when the base is NULL in PT_GETREGSET.

This looks fine to me, modulo compat32 and man page updates.

And compat32 rises interesting question, is ELF_REGSET() interface should be per-ABI ? E.g., do we want to have it specialized for the case when Linux process asks for a registers blob, or linuxolator should decode the values of the argument and re-arrange the FreeBSD native regsets into linuxish structures ?

Add COMPAT32 and update the man page

Correct the man page, addr points at a struct iovec.

I've tested on arm64. I still need to test the compat32 code but haven't yet as my current testing is on a machine without 32bit arm support.

Have you considered doing the malloc() of the temp buffer and copyin/copyout inside of proc_read_regset and proc_write_regset? This simplifies sys_ptrace() and the freebsd32 wrapper quite a bit as they simply have to copyin the iovec and don't have to deal with trying to free() the buffer, etc. It also solves your problem of having to worry about malloc()'ing an unvalidated size as you can defer the malloc until after you've located the regset structure and validated the length. You can drop the PROC_LOCK in those two functions to do the malloc + copyin and grab it again when filling the malloc'ed buffer.

sys/compat/freebsd32/freebsd32_misc.c
937

Probably this should be the 'struct iovec' and the 'struct iovec32' should be down in the 'r32' union. In general the 'r32' union are the 32-bit sized things, and the 'r' union is the native sized things. The reason the register sets are 32-bit in 'r' is because kern_ptrace() expects 'reg32' when the calling ABI is 32-bit. Then 'add = &r' does the right thing, etc.

1000

Why does this have to be a kernel pointer at all? Can't it just be a PTRIN() of the 32-bit pointer?

  • Move malloc into proc_read_regset/proc_write_regset
  • Add a test

Tested on arm64 kernel with both arm64 and armv7 programs calling ptrace with the following parent -> child relationship between the tester and target:

arm64 -> arm64
arm64 -> armv7
armv7 -> armv7
lib/libc/sys/ptrace.2
482

You did not explained the length handling

sys/kern/imgact_elf.c
2228

Eventually what we'd like to do I think is to add a hook which iterates over the linker set and writes out core dump notes for each regset making use of the 'get_regset' hook. The one trick is that NT_PRSTATUS has to come first, but we could use this hook to retire this custom function and also the custom dumping of register sets for VFP, XSAVE, etc. Not sure if you were already planning on doing that as a followup to this?

sys/kern/sys_process.c
200

We may consider in the future removing this check and always passing down 'vec->iov_len' as the size. This would let us grow an existing register set while still supporting the old size for compatibility. (Examples to consider here are fs_base/gs_base which Linux added to NT_PRSTATUS, or if we wanted to expand NT_PRSTATUS on arm64 to include the TLS register.) In practice I think we will just add new regsets for missing registers instead, but it's a thought. (fs_base and gs_base are kind of odd as we have ptrace ops for them today, but we don't include them in core dumps yet)

266

This KASSERT() can be removed since regset->set() doesn't change size.

sys/kern/imgact_elf.c
2228

I tried this, however found it started to get very complex. We might be able to use the same scheme for all the per-thread data, however the per-proc data expect to take a sbuf where the size may not be known ahead of time. Because of this they may extend the size to fit breaking when called from ptrace.

XSAVE would work (as the earlier version of this patch shows) because, while it's dynamic, the size can be fixed during the boot process.

sys/kern/sys_process.c
200

The intention is for userspace to query the size by passing in a NULL base and for it to handle data that's not the size it was expecting, e.g. new data was added to the kernel userspace doesn't expect. I have an initial patch for lldb to use this interface on arm64, so will make sure it can handle incorrectly sized data.

sys/kern/imgact_elf.c
2228

Eh, I was only suggesting this for the per-thread register set notes. For example, note_fpregset here should be fully replaceable by this scheme.

sys/kern/sys_process.c
200

I don't know that userland is capable of handling forwards compatibility in that way unless we promise userland that if a register set grows beyond a a size it expects, the original N bytes it understands still have the same meaning. OTOH, if the kernel is the one to manage compatibility, it will know what the old format an old application might expect is and be able to read/write the old format. This is generally consistent with what we do with other compat shims where the kernel is the one that deals with this, not applications. However, perhaps we won't ever grow NT_PRSTATUS and instead just need to handle the variable-sized cases like XSAVE and SVE where the register set size is not a compile-time constant.

sys/kern/sys_process.c
200

Wouldn't userspace already have to handle it when reading a coredump?

I was testing on Linux to see what they expose & found it to be just the registers. You can ask for any length that is a multiple of the base register size, e.g. on amd64 it is 27 * 8 byte registers, so you could pass in any multiple of 8 & get up to 216 bytes of register data back.