Differential D23747

Improve sh(1) built-in read command performance when using a seekable fd
Closed, Public

Authored by hrs on Feb 18 2020, 8:50 PM.

Details

Reviewers

jilles

Commits

rS359077: MFC of r358152 and r328235:
rS358152: Improve performance of "read" built-in command when using a seekable

Summary

This change makes the sh(1) built-in read command use a small buffer
where it currently uses a 1-byte buffer.

The sh(1) built-in read command calls read(2) with a 1-byte buffer
because it must stop exactly at a newline character, without consuming
any input beyond it, even on a byte stream from a non-seekable file
descriptor. Because of this, the following script makes more than
6,000 read(2) calls to print a 6 KiB file:

while read IN; do echo "$IN"; done < /COPYRIGHT

When the input byte stream is seekable, it is possible to read a block
and then reposition the file pointer to just after the newline
character that was found. This reduces the number of read(2) calls.
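
The read-a-block-then-seek-back idea can be sketched as follows. This is
a minimal illustration, not the code from this revision: the function
name read_line_sketch, the 256-byte block size, and the truncation
behavior are all assumptions made for the example, and EINTR handling is
omitted. It falls back to 1-byte reads when lseek(2) reports that the
descriptor is not seekable.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from the actual patch): read one line
 * from fd into out, without the trailing newline, NUL-terminated.
 * Overlong lines are silently truncated to outsz - 1 bytes.
 * Returns the stored length, or -1 on EOF/error with nothing read.
 */
ssize_t
read_line_sketch(int fd, char *out, size_t outsz)
{
	char buf[256];		/* assumed block size */
	size_t len = 0;
	int seekable;

	assert(out != NULL && outsz > 0);

	/* lseek(2) fails (ESPIPE) on pipes, sockets and FIFOs. */
	seekable = lseek(fd, 0, SEEK_CUR) != (off_t)-1;

	for (;;) {
		/* Block read if we can seek back, else one byte at a time. */
		ssize_t n = read(fd, buf, seekable ? sizeof(buf) : 1);
		if (n <= 0) {
			out[len] = '\0';
			return (len > 0 ? (ssize_t)len : -1);
		}

		char *nl = memchr(buf, '\n', (size_t)n);
		size_t take = nl != NULL ? (size_t)(nl - buf) : (size_t)n;

		for (size_t i = 0; i < take; i++)
			if (len + 1 < outsz)
				out[len++] = buf[i];

		if (nl != NULL) {
			/* Hand back the bytes read past the newline. */
			off_t unread = (off_t)n - (off_t)(take + 1);
			if (unread > 0)
				(void)lseek(fd, -unread, SEEK_CUR);
			out[len] = '\0';
			return ((ssize_t)len);
		}
	}
}
```

With this shape, a file of short lines costs roughly one read(2) and one
lseek(2) per line instead of one read(2) per byte, and the file offset
is left just past the newline so that the next reader of the same
descriptor sees the following line.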

Theoretically, when multiple built-in commands read the same seekable
stream in a single pipe chain, they could share the buffer. However,
to keep things simple, this change just makes each invocation of the
built-in read command allocate a buffer and deallocate it when done.
Although this causes read(2) to read the same regions multiple times,
the impact on overall performance should be small.

It seems bash and ksh have a similar optimization, while zsh does not.