Page MenuHomeFreeBSD

Ease the life of PT_TO_SCE/PT_TO_SCX users when debuggee sleeps in sigsuspend(2)/sig{timed,}wait(2).
ClosedPublic

Authored by kib on Nov 25 2019, 4:31 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Apr 17, 7:44 AM
Unknown Object (File)
Dec 23 2023, 9:39 AM
Unknown Object (File)
Nov 26 2023, 1:28 PM
Unknown Object (File)
Nov 26 2023, 12:32 AM
Unknown Object (File)
Nov 22 2023, 9:25 PM
Unknown Object (File)
Nov 22 2023, 2:53 PM
Unknown Object (File)
Nov 22 2023, 2:50 PM
Unknown Object (File)
Nov 22 2023, 2:50 PM
Subscribers

Details

Summary

If PT_TO_SCE or PT_TO_SCX were issued after the target already entered sleep in sigsuspend(2) or sigtimedwait(2), there is no debugging events generated until a relevant signal is delivered. As result, debugger hangs waiting for an event.

Introduce spurious EINTR returned in that situations to help debugger to gain control earlier.

Requested and tested by: kevans

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 27761

Event Timeline

sys/kern/kern_sig.c
1330

After looking around a bit more, should we use the PTRACE_SYSCALL spelling in these spots?

sys/kern/kern_sig.c
1330

IMO it is some obfuscation, esp. for implementation side, but let use it.

kib marked an inline comment as done.

Use PTRACE_SYSCALL.

I think this looks good to me, thanks! It does behave differently now in that truss(1) will cause us to exit these syscalls if it attaches in the middle -- but every application I can think of using them will swiftly return to sigsuspend() state once it realizes nothing productive happened.

This revision is now accepted and ready to land.Nov 25 2019, 6:24 PM

I think this looks good to me, thanks! It does behave differently now in that truss(1) will cause us to exit these syscalls if it attaches in the middle -- but every application I can think of using them will swiftly return to sigsuspend() state once it realizes nothing productive happened.

Sorry, I am confused. truss(1) attach should generate a SIGSTOP that interrupts the system call anyway.

Why do we interrupt only these two distinguished system calls?

sys/kern/kern_sig.c
1330

Should it be conditional on error == 0?

I think this looks good to me, thanks! It does behave differently now in that truss(1) will cause us to exit these syscalls if it attaches in the middle -- but every application I can think of using them will swiftly return to sigsuspend() state once it realizes nothing productive happened.

Sorry, I am confused. truss(1) attach should generate a SIGSTOP that interrupts the system call anyway.

The SIGSTOP interrupts, then zsh returns to sigsuspend(2) following that. The application then goes to trace syscall entry and we're stuck in sigsuspend(2) at that point.

Why do we interrupt only these two distinguished system calls?

There are perhaps others, but sigsuspend was the first candidate because it's the most problematic one and sigwait/sigtimedwait are a natural extension to this.

I think this looks good to me, thanks! It does behave differently now in that truss(1) will cause us to exit these syscalls if it attaches in the middle -- but every application I can think of using them will swiftly return to sigsuspend() state once it realizes nothing productive happened.

Sorry, I am confused. truss(1) attach should generate a SIGSTOP that interrupts the system call anyway.

The SIGSTOP interrupts, then zsh returns to sigsuspend(2) following that. The application then goes to trace syscall entry and we're stuck in sigsuspend(2) at that point.

Why do we interrupt only these two distinguished system calls?

There are perhaps others, but sigsuspend was the first candidate because it's the most problematic one and sigwait/sigtimedwait are a natural extension to this.

I think it's because kern_sigsuspend() and kern_sigtimedwait() call cursig() and thus may block in ptracestop() before returning to the usermode boundary.

So to be clear, the problem is that a debugger is expecting to see a system call entry following a PT_CONTINUE following the attach?

kib marked an inline comment as done.EditedNov 26 2019, 5:43 PM

I think it's because kern_sigsuspend() and kern_sigtimedwait() call cursig() and thus may block in ptracestop() before returning to the usermode boundary.

So to be clear, the problem is that a debugger is expecting to see a system call entry following a PT_CONTINUE following the attach?

This is not quite a problem, but rather the inconvenience. In this situation debugger needs to understand that the process is waiting for a signal in a syscall that does not return until the signal is generated, and do something. It is quite involved for the debugger to infer this data, and much simpler for the kernel to fabricate a spurious EINTR instead.

Also, I believe it is consistent with behavior of other sleeping syscalls like select(2), where the situation causes spurious EINTR.

Only inject EINTR in sigtimedwait() when error == 0.

This revision now requires review to proceed.Nov 26 2019, 5:49 PM
In D22546#493292, @kib wrote:

I think it's because kern_sigsuspend() and kern_sigtimedwait() call cursig() and thus may block in ptracestop() before returning to the usermode boundary.

So to be clear, the problem is that a debugger is expecting to see a system call entry following a PT_CONTINUE following the attach?

This is not quite a problem, but rather the inconvenience. In this situation debugger needs to understand that the process is waiting for a signal in a syscall that does not return until the signal is generated, and do something. It is quite involved for the debugger to infer this data, and much simpler for the kernel to fabricate a spurious EINTR instead.

Also, I believe it is consistent with behavior of other sleeping syscalls like select(2), where the situation causes spurious EINTR.

I also suspect the new behavior is consistent with how Linux handles this situation, but I haven't dug too far to confirm. The application I'm working with has no problems with Linux+zsh, but it follows the same model: platform-independent layer calls into its platform-specific ptrace layer to attach (PT_ATTACH + SIGCONT + TRACEFORK, this is close enough in both Linux/FreeBSD versions), then PI layer proceeds to hook to syscall entry and call a syscall of its own in the target. The platform layer doesn't know at attach time what the caller's intent is, because it may just be doing an ATTACH/DETACH to make sure it's permissible to do so.

In D22546#493292, @kib wrote:

Also, I believe it is consistent with behavior of other sleeping syscalls like select(2), where the situation causes spurious EINTR.

I agree it is more consistent than before, but from my reading:

  • it does not result in EINTR for a mere ptrace attach, and
  • it will cause a spurious EINTR for every invocation of the system call after PTRACE_SYSCALL is set. That is, I cannot see how the comment fragment "after userspace entered the syscall" is honoured.
In D22546#493292, @kib wrote:

Also, I believe it is consistent with behavior of other sleeping syscalls like select(2), where the situation causes spurious EINTR.

I agree it is more consistent than before, but from my reading:

  • it does not result in EINTR for a mere ptrace attach, and

For PT_ATTACH it is not needed, I believe. There a SIGSTOP is generated and we would do all the magic inside ptracestop().

  • it will cause a spurious EINTR for every invocation of the system call after PTRACE_SYSCALL is set. That is, I cannot see how the comment fragment "after userspace entered the syscall" is honoured.

Note that the check for PTRACE_SYSCALL is only performed after the syscall code slept. In other words, we enter the syscall, check for signals, then either perform plain sleep or thread_suspend_check(), and for the new code to trigger, this sleep must be cancelled. It can happen only due to a signal delivered or due to the debugging action. Then we set error/increment has_sig, but still restart the loop and if an actual signal was delivered, we process it instead of our spurious action.

Let me reformulate the above explanation this way: I did not added more wakeups, only a terminating condition for the loop.

In D22546#493348, @kib wrote:
In D22546#493292, @kib wrote:

Also, I believe it is consistent with behavior of other sleeping syscalls like select(2), where the situation causes spurious EINTR.

I agree it is more consistent than before, but from my reading:

  • it does not result in EINTR for a mere ptrace attach, and

For PT_ATTACH it is not needed, I believe. There a SIGSTOP is generated and we would do all the magic inside ptracestop().

Yes, but a plain PT_ATTACH/PT_CONTINUE in kern_sigsuspend() will not terminate the loop, since the SIGSTOP is consumed by the debugger, whereas a PT_ATTACH/CONTINUE while e.g., select(2) or kevent(2) is sleeping will cause EINTR to be returned to user mode due to the interrupted sleep. I suspect that this inconsistency is not very important though.

  • it will cause a spurious EINTR for every invocation of the system call after PTRACE_SYSCALL is set. That is, I cannot see how the comment fragment "after userspace entered the syscall" is honoured.

Note that the check for PTRACE_SYSCALL is only performed after the syscall code slept. In other words, we enter the syscall, check for signals, then either perform plain sleep or thread_suspend_check(), and for the new code to trigger, this sleep must be cancelled. It can happen only due to a signal delivered or due to the debugging action. Then we set error/increment has_sig, but still restart the loop and if an actual signal was delivered, we process it instead of our spurious action.

Let me reformulate the above explanation this way: I did not added more wakeups, only a terminating condition for the loop.

Ok, so kern_sigsuspend() will behave the same as before except in the case where the debugger intercepts and discards a signal. That seems ok to me.

sys/kern/kern_sig.c
1321

If the timeout elapsed, we will clear error and return EAGAIN in the next loop iteration, but the new logic will convert it to EINTR if a debugger is attached and PTRACE_SYSCALL is set.

kib marked an inline comment as done.

Report spurious EINTR from sigtimedwait() only if there is no other errors to return.

This revision is now accepted and ready to land.Nov 27 2019, 4:18 PM
This revision was automatically updated to reflect the committed changes.