sh: use larger BUFSIZ
Abandoned, Public

Authored by trasz on Oct 16 2018, 12:37 PM.

Details

Reviewers
jilles
Summary

Make sh(1) use a larger BUFSIZ - 128k (131072 bytes) instead of 1024 bytes.
This significantly reduces the number of read syscalls.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 20271
Build 19744: arc lint + arc unit

Event Timeline

I think it would make more sense to define a new buffer size constant than to modify BUFSIZ.

The new size 131072 may be a bit large, given the processing sh does on the data. What benchmark benefits from increasing it so much? I do expect gains can be made by increasing from 1024 though.

Also note, if sh is reading from a pipe on stdin it's usually a POSIX violation to have a buffer size greater than 1. This is because it shall be possible for spawned utilities to read from stdin and have this interleave properly with the shell reading commands. If stdin is a regular file, this goal could also be achieved by seeking backwards when forking a utility.
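The seek-backwards idea for regular files can be sketched as follows. This is only an illustration, not code from sh(1): the helper name read_line_and_rewind is hypothetical, and it assumes the descriptor is seekable (on a pipe, lseek fails with ESPIPE, which is why a one-byte buffer is needed there).

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from sh(1): read with a large buffer,
 * then seek back so the file offset sits just past the first newline.
 * A utility forked afterwards and reading the same descriptor then
 * starts exactly where the shell stopped.
 *
 * Returns the line length including the newline, 0 at EOF, -1 on error.
 * If no newline is found in the buffer, all bytes read are consumed.
 */
ssize_t
read_line_and_rewind(int fd, char *buf, size_t bufsize)
{
	ssize_t n, used;
	char *nl;

	assert(buf != NULL && bufsize > 0);

	n = read(fd, buf, bufsize);
	if (n <= 0)
		return (n);

	nl = memchr(buf, '\n', (size_t)n);
	used = (nl != NULL) ? (nl - buf) + 1 : n;

	/* Give back the bytes that were read past the end of the line. */
	if (used < n && lseek(fd, (off_t)(used - n), SEEK_CUR) == (off_t)-1)
		return (-1);
	return (used);
}
```

A real shell would only need to rewind right before forking a utility rather than after every line; the per-line rewind here just keeps the sketch short.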

jilles requested changes to this revision. Oct 16 2018, 11:00 PM
This revision now requires changes to proceed. Oct 16 2018, 11:00 PM

Use a better constant name.

Ok, fixed the constant name. As for the size - it's not driven by any specific benchmark; I'm basically fixing various obvious inefficiencies I see in truss output, as a byproduct of another project. My thinking here is that a larger buffer is pretty much free; we don't keep a number of those buffers around, do we? Given that /etc/rc.subr is 49281 bytes, 128k seems like a reasonable choice to me.
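To put rough numbers on the /etc/rc.subr example, here is a back-of-the-envelope sketch. The function name reads_needed is hypothetical, and it assumes that every read on a regular file fills the buffer while data remains and that one extra read returning 0 detects end of file.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical estimate, not sh(1) code: number of read(2) calls needed
 * to consume a regular file of file_size bytes with a buf_size buffer.
 * Assumes each read returns as much as fits, plus one final read that
 * returns 0 at end of file.
 */
size_t
reads_needed(size_t file_size, size_t buf_size)
{
	assert(buf_size > 0);

	/* Round up: a partially filled last buffer still costs one read. */
	size_t data_reads = (file_size + buf_size - 1) / buf_size;

	return (data_reads + 1);
}
```

Under those assumptions, reads_needed(49281, 1024) gives 50 and reads_needed(49281, 131072) gives 2.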

OUTBUFSIZ is somewhat of a strange name given that these buffers contain shell input.

I'd like these buffers to be small enough that repeatedly executing the . (source) special builtin does not map and unmap memory every time.

Heh, I clearly remember looking at output.c, which uses OUTBUFSIZ, and thinking "ok, let's follow the convention here". I guess I followed it a bit too closely.

Now, regarding the mmaps and munmaps - good point. I did a little experiment, though, and I don't see a negative difference. For small scripts:

% cat foo.sh 
#!/bin/sh

. ./bar.sh
. ./bar.sh

[trasz@v2:~]% cat bar.sh 
#!/bin/sh

bar=42

... the interesting part of the diff between the truss outputs looks like this ("przed" is the run before the change, "po" the run after):

--- przed       2018-10-17 07:24:36.600474000 +0100
+++ po  2018-10-17 07:24:17.023569000 +0100
@@ -112,21 +112,21 @@ sigaction(SIGTERM,0x0,{ SIG_DFL SA_RESTART ss_t }) = 0
 sigaction(SIGTERM,{ SIG_DFL 0x0 ss_t },0x0)     = 0 (0x0)
 fstatat(AT_FDCWD,".",{ mode=drwxr-xr-x ,inode=653262,size=4608,blksize=32768 },0x0) = 0 (0x0)
 fstatat(AT_FDCWD,"/home/trasz",{ mode=drwxr-xr-x ,inode=653262,size=4608,blksize=32768 },0x0) = 0 (0x0)
-read(10,"#!/bin/sh\n\n. ./bar.sh\n. ./bar"...,1024) = 34 (0x22)
+read(10,"#!/bin/sh\n\n. ./bar.sh\n. ./bar"...,131072) = 34 (0x22)
 openat(AT_FDCWD,"./bar.sh",O_RDONLY|O_CLOEXEC,00) = 3 (0x3)
 fcntl(3,F_DUPFD_CLOEXEC,0xa)                    = 11 (0xb)
 close(3)                                        = 0 (0x0)
 mmap(0x0,20480,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON|MAP_ALIGNED(12),-1,0x0) = 34367475712 (0x800761000)
-mmap(0x0,20480,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON|MAP_ALIGNED(12),-1,0x0) = 34367496192 (0x800766000)
-read(11,"#!/bin/sh\n\nbar=42\n\n",1024)                 = 19 (0x13)
-read(11,0x800766000,1024)                       = 0 (0x0)
+mmap(0x0,167936,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON|MAP_ALIGNED(12),-1,0x0) = 34367496192 (0x800766000)
+read(11,"#!/bin/sh\n\nbar=42\n\n",131072)       = 19 (0x13)
+read(11,0x800766980,131072)                     = 0 (0x0)
 close(11)                                       = 0 (0x0)
 openat(AT_FDCWD,"./bar.sh",O_RDONLY|O_CLOEXEC,00) = 3 (0x3)
 fcntl(3,F_DUPFD_CLOEXEC,0xa)                    = 11 (0xb)
 close(3)                                        = 0 (0x0)
-read(11,"#!/bin/sh\n\nbar=42\n\n",1024)                 = 19 (0x13)
-read(11,0x800766000,1024)                       = 0 (0x0)
+read(11,"#!/bin/sh\n\nbar=42\n\n",131072)       = 19 (0x13)
+read(11,0x800766640,131072)                     = 0 (0x0)
 close(11)                                       = 0 (0x0)
-read(10,0x2299d0,1024)                          = 0 (0x0)
+read(10,0x2299d0,131072)                        = 0 (0x0)
 exit(0x0)
 process exit, rval = 0

For larger files - if I replace foo.sh with /etc/defaults/rc.conf - it's 215 syscalls before vs 141 after.

Ok, let's put this on hold. More careful benchmarking shows there's something seriously weird wrt performance going on.