This change replaces the mutex with an sx lock for the interpreter list to avoid the problem of holding a non-sleep lock during a page fault as reported by witness. It also uses atomics where possible to avoid having to acquire the exclusive lock. In addition, it consistently uses memset()/memcpy() instead of bzero()/bcopy().
For reasons that are not clear, this does not cleanly apply to head.
Updated to revision 279400.
firstname.lastname@example.org:/usr/src # patch -p0 < /var/tmp/patch.txt
Hmm... Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/kern/imgact_binmisc.c
|===================================================================
|--- sys/kern/imgact_binmisc.c
|+++ sys/kern/imgact_binmisc.c
--------------------------
Patching file sys/kern/imgact_binmisc.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 39.
Hunk #3 succeeded at 98.
Hunk #4 succeeded at 113.
Hunk #5 succeeded at 154.
Hunk #6 succeeded at 203.
Hunk #7 succeeded at 260.
Hunk #8 succeeded at 288.
Hunk #9 succeeded at 312.
Hunk #10 succeeded at 333.
Hunk #11 succeeded at 351.
Hunk #12 succeeded at 383.
Hunk #13 succeeded at 405.
Hunk #14 succeeded at 420.
Hunk #15 succeeded at 557.
Hunk #16 failed at 594.
Hunk #17 succeeded at 650.
Hunk #18 succeeded at 661.
Hunk #19 succeeded at 721.
Hunk #20 succeeded at 737.
1 out of 20 hunks failed--saving rejects to sys/kern/imgact_binmisc.c.rej
done
It does cleanly apply now. Poudriere now fails with:
email@example.com:/home/sbruno # poudriere bulk -a -j 11mips64 -p 11mips32
[00:00:00] ====>> Creating the reference jail... done
[00:00:01] ====>> Mounting system devices for 11mips64-11mips32
[00:00:01] ====>> Mounting ports/packages/distfiles
[00:00:01] ====>> Using packages from previously failed build
[00:00:01] ====>> Mounting packages from: /usr/local/poudriere/data/packages/11mips64-11mips32
[00:00:01] ====>> Warning: Blacklisting (from /usr/local/etc/poudriere.d/blacklist): lang/erlang
[00:00:01] ====>> Warning: Blacklisting (from /usr/local/etc/poudriere.d/blacklist): lang/erlang-runtime15
[00:00:01] ====>> Warning: Blacklisting (from /usr/local/etc/poudriere.d/blacklist): lang/erlang-runtime16
[00:00:01] ====>> Warning: Blacklisting (from /usr/local/etc/poudriere.d/blacklist): lang/erlang-runtime17
[00:00:01] ====>> Warning: Blacklisting (from /usr/local/etc/poudriere.d/blacklist): lang/ratfor
[00:00:01] ====>> Warning: Blacklisting (from /usr/local/etc/poudriere.d/blacklist): security/gpgme
[00:00:01] ====>> Warning: Blacklisting (from /usr/local/etc/poudriere.d/blacklist): security/p5-Module-Signature
/etc/resolv.conf -> /usr/local/poudriere/data/.m/11mips64-11mips32/ref/etc/resolv.conf
[00:00:01] ====>> Starting jail 11mips64-11mips32
make: "/usr/ports/Mk/bsd.port.mk" line 1191: warning: "/usr/bin/uname -p" returned non-zero status
make: "/usr/ports/Mk/bsd.port.mk" line 1196: warning: "/usr/bin/uname -s" returned non-zero status
make: "/usr/ports/Mk/bsd.port.mk" line 1199: warning: "/usr/bin/uname -r" returned non-zero status
make: "/usr/ports/Mk/bsd.port.mk" line 1219: UNAME_r () and OSVERSION (1100058) do not agree on major version number.
[00:00:02] ====>> Cleaning up
[00:00:02] ====>> Umounting file systems
the problem of holding a non-sleep lock during a page fault as reported by witness
What page fault?
A cursory reading suggests the problem is instead an M_WAITOK malloc coming from sbuf_auto. The allocation could be done prior to taking any locks or, better, avoided entirely. The string in question is always freed as the function is exiting, and "/dev/fd/%d" will fit in a small buffer on the stack with no problem.
As such, while I don't see problems with using sx locks here, it looks like the code could be otherwise improved and then switched to rwlocks instead.
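To make the stack-buffer suggestion concrete, here is a minimal userland sketch, not the actual kernel code. The names FD_PATH_MAX and format_fd_path() are hypothetical and do not appear in imgact_binmisc.c; the point is only that the formatted path has a small, fixed upper bound, so no sleeping allocation is needed while a lock is held.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical bound: "/dev/fd/" (with its NUL) plus a generous
 * 3 characters per byte of int, which covers the sign and all digits.
 */
#define FD_PATH_MAX	(sizeof("/dev/fd/") + 3 * sizeof(int) + 1)

/*
 * Hypothetical helper: format the descriptor path into caller-provided
 * storage.  Returns 0 on success, -1 if the result would not fit.
 */
static int
format_fd_path(char *buf, size_t len, int fd)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "/dev/fd/%d", fd);
	return ((n < 0 || (size_t)n >= len) ? -1 : 0);
}
```

A caller would declare `char path[FD_PATH_MAX];` on its own stack and pass `sizeof(path)` as the length, so nothing has to be freed on exit.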
The page fault is caused by the bcopy() on line 670 when there are a lot of command-line arguments. The bcopy() moves everything down to make room for the interpreter path and arguments.
That isn't where the fault happens but, yes, it seems that buffer could be pre-allocated before getting the lock.
I am not sure if the bcopy() could be done without holding a lock.
Maybe call it 'interp_list_lock'?
With an sx lock you can use the old logic order since you can hold it across the malloc().
It would seem simpler to just use an xlock here and use plain ops for updating ibe_flags since my guess is that enabling/disabling entries isn't really a hot path?
oh ... is this a potential race here if we *do not* hold the lock across the call to imgact_binmisc_new_entry() (e.g. we don't restore the old lock order?)
hrm ... don't we need to make this atomic because we *REALLY* want to make sure this happens during a review of the interpreter list?
No, there is no race here. It is just that imgact_binmisc_new_entry() mallocs a new entry, but if it doesn't need the entry it free()'s it. With sleep locks we can malloc() it while holding the lock and, therefore, don't have to go to the trouble of pre-allocating a new entry.
Yes, this isn't really a hot path and we could just exclusively lock it here and not use the atomics to clear the flag.
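For comparison, the lock-free flag update being weighed against the exclusive lock might look roughly like the following. This is a userland sketch using C11 <stdatomic.h> rather than the kernel's atomic(9) primitives, and struct entry, ENTRY_ENABLED, and the three helpers are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define ENTRY_ENABLED	0x0001u		/* hypothetical flag bit */

/* Hypothetical stand-in for a list entry; only the flags word matters. */
struct entry {
	_Atomic uint32_t flags;
};

/* Set the enabled bit without disturbing the other flag bits. */
static void
entry_enable(struct entry *e)
{
	atomic_fetch_or(&e->flags, ENTRY_ENABLED);
}

/* Clear the enabled bit without disturbing the other flag bits. */
static void
entry_disable(struct entry *e)
{
	atomic_fetch_and(&e->flags, ~ENTRY_ENABLED);
}

/* Readers see either the old or the new value, never a torn one. */
static bool
entry_is_enabled(struct entry *e)
{
	return ((atomic_load(&e->flags) & ENTRY_ENABLED) != 0);
}
```

The atomic read-modify-write only protects the flags word itself. It does not keep the entry from being removed from the list, so a lookup still needs at least a shared lock to find the entry, which is part of why simply taking the exclusive lock is the simpler option on a cold path.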
No, this was memcpy(), but memcpy() doesn't play well with overlapping memory, which is required here. Sean changed it to bcopy(), so now it looks like a whitespace diff.
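The overlap issue can be illustrated in plain C. The sketch below is not the code from imgact_binmisc.c; prepend_bytes() is a hypothetical helper, and it uses memmove(), the standard C routine that, like bcopy(), is defined for overlapping source and destination. It shifts the existing argument bytes toward the end of the buffer and then copies a new prefix (for example an interpreter path) into the gap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: buf holds `used` bytes out of `cap`.  Make room
 * for `plen` bytes at the front and copy `prefix` there.  Returns the
 * new used length, or 0 if the prefix does not fit.
 */
static size_t
prepend_bytes(char *buf, size_t cap, size_t used, const char *prefix,
    size_t plen)
{
	assert(used <= cap);
	if (plen > cap - used)
		return (0);

	/* Source and destination overlap, so memcpy() is undefined here. */
	memmove(buf + plen, buf, used);

	/* The prefix is separate storage, so memcpy() is fine. */
	memcpy(buf, prefix, plen);
	return (used + plen);
}
```

Note that the argument order differs between the two calls: memmove(dst, src, len) versus bcopy(src, dst, len).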