/bin/sh can, with its non-blocking I/O on stdout, get an EAGAIN during a write, leading to spurious build errors
Needs ReviewPublic

Authored by jkh on Feb 3 2015, 11:16 PM.
Tags
None
Subscribers
None

Details

Summary

With poudriere and really large port builds (e.g. where many ports need to be built at once), we’re getting build failures where the shell itself fails with a write error, e.g.:
...
py-lxml devel/py-daemon sysutils/py-psutil sysutils/nut graphics/png graphics/graphite2 www/py-requests emulators/mtools dns/py-bonjour math/py-networkx www/py-ws4py textproc/py-libxml2 net/py-pysphere devel/py-rose devel/qt5-core graphics/php55-gd devel/py-mimeparse www/py-flask-bootstrap www/mod_mpm_itk security/sudo devel/py-jsonpointer security/py-openssl devel/py-simplejson net/smbldap-tools ftp/proftpd www/node net/ladvd devel/py-pyee benchmarks/iozone sysutils/tmux net/trafshow devel/py-ipaddr sysutils/hptcli net-mgmt/sipcalc graphics/cairo shells/mksh comms/lrzsz www/py-flup sysutils/screen freenas/freenas-10gui devel/git print/harfbuzz devel/goprintf: write error on stdout

With a little DTracing, we figured out that /bin/sh is using non-blocking I/O on stdout to work around some threading issues (apparently FreeBSD’s pthread library, at some point, blocked background threads talking to stdout because of implementation limitations that may or may not still exist). Because stdout is non-blocking, write(2) can return EAGAIN when the output buffer is temporarily full. The fix, as ugly as it looks, is to simply sleep for a bit and try again.
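
The retry idea can be sketched roughly as follows. This is not the code from the patch: write_all_retry is a hypothetical name, and the sketch waits in poll(2) until the descriptor is writable instead of sleeping for a fixed interval, which is a swapped-in variation on "sleep for a bit and try again".

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption, not taken from the patch): write all
 * of buf to fd, waiting and retrying when a non-blocking descriptor reports
 * EAGAIN. Returns 0 on success, -1 on any other error with errno left set.
 */
int
write_all_retry(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n >= 0) {
			/* Partial writes are normal on pipes; advance and go on. */
			p += n;
			len -= (size_t)n;
			continue;
		}
		if (errno == EINTR)
			continue;
		if (errno == EAGAIN || errno == EWOULDBLOCK) {
			/* Block here until the descriptor accepts more data. */
			struct pollfd pfd = { .fd = fd, .events = POLLOUT };

			if (poll(&pfd, 1, -1) == -1 && errno != EINTR)
				return (-1);
			continue;
		}
		return (-1);
	}
	return (0);
}
```

A caller would use write_all_retry(STDOUT_FILENO, buf, len) in place of a bare write(2). On a blocking descriptor the EAGAIN branch is simply never taken, so the helper behaves like an ordinary write loop.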

Test Plan

Compile /bin/sh with this fix and go through a full poudriere build (this may be onerous; the actual fix only really kicks in during serious output to stdout).

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

jkh retitled this revision to /bin/sh can, with its non-blocking I/O on stdout, get an EAGAIN during a write, leading to spurious build errors.
jkh updated this object.
jkh edited the test plan for this revision.
jkh added reviewers: jilles, rwatson, bapt.
jkh set the repository for this revision to rS FreeBSD src repository - subversion.

Actually, after reading https://lists.freebsd.org/pipermail/freebsd-bugs/2012-February/047528.html I am starting to wonder if this wouldn't be better addressed at a stdio / write syscall wrapper level. We're also seeing highly intermittent errors with other utilities, like cat(1).

Can you find out which file is returning [EAGAIN] while writing, and who is setting it non-blocking temporarily? It is also possible that a kernel bug causes [EAGAIN] on a write even though non-blocking mode is off.
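
One way to answer the first half of that question from inside a process is to inspect the file status flags at the moment the write fails. A minimal sketch, assuming a hypothetical helper name fd_is_nonblocking; fcntl(2) with F_GETFL is the standard call:

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical diagnostic helper: returns 1 if O_NONBLOCK is set on fd,
 * 0 if it is clear, and -1 if fcntl(2) fails (for example EBADF).
 */
int
fd_is_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags == -1)
		return (-1);
	return ((flags & O_NONBLOCK) != 0);
}
```

If fd_is_nonblocking(STDOUT_FILENO) returns 0 right after a write fails with EAGAIN, that would point toward the kernel-bug possibility rather than toward some process having set the flag.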

Note that sh enables non-blocking mode in two situations, but this is on new open files not shared with any other process (here-document pipe and [ENOEXEC] executable).
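
Sharing matters here because O_NONBLOCK is a file status flag kept on the open file description, not on the individual descriptor, so any process or descriptor that shares stdout's description can change the mode for everyone. A small demonstration under that assumption (the function name is hypothetical and this is not code from sh):

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical demo: set O_NONBLOCK through a dup(2)'d descriptor and check
 * whether it is visible through the original. Returns 1 if the flag is
 * shared, 0 if it is not, and -1 if the setup fails.
 */
int
nonblock_shared_across_dup(void)
{
	int fds[2], copy, flags, result = -1;

	if (pipe(fds) == -1)
		return (-1);
	copy = dup(fds[1]);
	if (copy != -1) {
		flags = fcntl(copy, F_GETFL);
		if (flags != -1 &&
		    fcntl(copy, F_SETFL, flags | O_NONBLOCK) != -1) {
			/* Read the flags back through the original descriptor. */
			flags = fcntl(fds[1], F_GETFL);
			if (flags != -1)
				result = (flags & O_NONBLOCK) != 0;
		}
		close(copy);
	}
	close(fds[0]);
	close(fds[1]);
	return (result);
}
```

The same sharing applies to descriptors inherited across fork(2), which is why a sibling process writing to the same pipe could be the one flipping the mode.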

The file is always stdout. We have been unable to determine *definitively* who is setting it non-blocking, but /bin/sh both opens files non-blocking in exec.c and changes the file mode to non-blocking in redir.c, so if I had to guess, it would be stdout redirected to a pipe.

With that patch applied locally, I can confirm that I no longer get the error I used to see from time to time with poudriere (write error on stdout).