Synchronize the sigfastblock value on entry to sigsuspend
ClosedPublic
Actions

Authored by jtl on Mar 11 2021, 8:46 PM.

Details

Reviewers

kib
markj
jhb

Commits

rGdbec10e08808: Fetch the sigfastblock value in syscalls that wait for signals

Summary

We have seen several cases of processes which have become "stuck" in kern_sigsuspend(). They exhibit these symptoms:

The kernel shows that the user has blocked signals:

(kgdb) print $td->td_pflags
$1 = 0x80000101
(kgdb) print $td->td_sigblock_ptr
$2 = (void *) 0x801012038
(kgdb) print $td->td_sigblock_val
$3 = 0x10

The userspace side shows that is has unblocked signals:

#0  _sigsuspend () at _sigsuspend.S:4
4	_sigsuspend.S: No such file or directory.
(gdb) print *(int *)0x801012038
$1 = 0x0

The kernel sigblock value is out of sync with the userspace value. Normally, this would get resolved on the next syscall return when ast() would read the userspace sigblock value and do the right thing. However, when we sleep while waiting for signals, we instead deadlock. (Because the signals are blocked, cursig() says there are no signals. So, kern_sigsuspend() simply loops without ever returning.)

We should probably consider adding a similar check to kern_sigtimedwait().

It appears this condition became noticable sometime between main-c255535-gd189a74dfdcd and main-c256326-g24a8f6d36996; however, the condition happens infrequently enough that it will be very hard to bisect.

Test Plan

I can't reliably reproduce this condition, so I can't easily test it.

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

jtl created this revision.Mar 11 2021, 8:46 PM

Herald added a subscriber: imp. · View Herald TranscriptMar 11 2021, 8:46 PM

jtl requested review of this revision.Mar 11 2021, 8:46 PM

It is strange indeed, and it sounds more like a self-inflicting action from userspace. Code in rtld or libthr should not leak sigfastblock block, but of course bugs are possible.
The values you show in the userspace/kernelspace block word make a random memory corruption unlikely.

I think that the proposed patch is a good protective measure, and yes kern_sigtimedwait() should be patched same way.

This revision is now accepted and ready to land.Mar 12 2021, 11:47 AM

This is the change I am planning to commit once the regression tests finish running.

This revision now requires review to proceed.Mar 12 2021, 5:35 PM

In D29225#654397, @kib wrote:

It is strange indeed, and it sounds more like a self-inflicting action from userspace. Code in rtld or libthr should not leak sigfastblock block, but of course bugs are possible.

I don't think the user space code is leaking a sigfastblock. It looks like libthr is modifying the block value up and down around the fork() call (and the code where we see this error does call fork() multiple times in fast succession). But, libthr must not see the SIGFASTBLOCK_PEND flag when it is unblocking signals, which means it doesn't run the sigfastblock(SIGFASTBLOCK_UNBLOCK) syscall. And, for some reason, we must have cached the old value of the sigfastblock word before it was decremented.

The values you show in the userspace/kernelspace block word make a random memory corruption unlikely.

Agreed. And, it seems to always be 0x10.

I think that the proposed patch is a good protective measure, and yes kern_sigtimedwait() should be patched same way.

Thanks! I've updated the diff to include the change for kern_sigtimedwait(), and to remove my musings from the comment in kern_sigsuspend(). I'm running the regression tests now. Once that is finished I plan to commit the change.

kib accepted this revision.Mar 12 2021, 5:47 PM

This revision is now accepted and ready to land.Mar 12 2021, 5:47 PM

Closed by commit rGdbec10e08808: Fetch the sigfastblock value in syscalls that wait for signals (authored by jtl). · Explain WhyMar 12 2021, 6:34 PM

This revision was automatically updated to reflect the committed changes.

jtl added a commit: rGdbec10e08808: Fetch the sigfastblock value in syscalls that wait for signals.

Can you please MFC this to 13.0? I believe it affects Nginx restarts:

https://lists.freebsd.org/pipermail/freebsd-stable/2021-March/093371.html