Page MenuHomeFreeBSD

Add sponge(1)
ClosedPublic

Authored by eadler on Nov 1 2017, 7:55 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Dec 20, 1:36 AM
Unknown Object (File)
Wed, Dec 18, 4:33 PM
Unknown Object (File)
Wed, Dec 18, 4:03 AM
Unknown Object (File)
Tue, Dec 17, 7:55 AM
Unknown Object (File)
Mon, Dec 9, 6:28 AM
Unknown Object (File)
Sat, Dec 7, 3:03 AM
Unknown Object (File)
Thu, Dec 5, 5:02 AM
Unknown Object (File)
Thu, Dec 5, 3:58 AM

Details

Reviewers
None
Group Reviewers
manpages
Commits
rS326555: sponge(1) Minor commit for commit log
Summary

This adds a naive implementation of sponge(1) to base.

sponge(1) is useful in pipelines to write to files read earlier without
setting up a temporary file inline

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

I like it. This looks great, but I'm concerned about performance for larger files since realloc copies data and can wind up with a fragmented heap that might artificially limit what you can do. Simple lists of ranges would suffice to avoid these copies without greatly increasing the complexity of the code.

usr.bin/sponge/sponge.1
48 ↗(On Diff #34626)

It might be useful to specify what happens when memory can't be allocated for the entire file.

usr.bin/sponge/sponge.c
20 ↗(On Diff #34626)

so if we fail to allocate, no output is produced. If so, this is OK.

88 ↗(On Diff #34626)

why *4? This artificially may limit the file size as we approach the VM limits. *2 will result in more allocation but will let us get much closer to the limit without going nuts.

Also, this strategy is great for small files, but sucks for large files since we call realloc, which could be copying a crapton of data to keep the output code simple.

110 ↗(On Diff #34626)

Why mix raw access and stdio? Seems like we could easily use all one or the other.

117 ↗(On Diff #34626)

There's no harm in always closing the output, we're about to exit anyway.

Please wait for the next update and then review

usr.bin/sponge/sponge.c
110 ↗(On Diff #34626)

because I was using printf for debugging and Just Worked. replacing with writev to avoid the reallocf problem you mentioned

117 ↗(On Diff #34626)

true, but I dislike it, and it makes debugging annoying.

Hi Eitan,

how about adding a small EXAMPLE section to the man page to show usage?

danfe added inline comments.
usr.bin/sponge/sponge.c
20 ↗(On Diff #34654)

style(9) violation: should be static void *\nsafe_malloc.

27 ↗(On Diff #34654)

return (ret);

30 ↗(On Diff #34654)

Ditto.

37 ↗(On Diff #34654)

Ditto.

47 ↗(On Diff #34654)

Ditto.

51 ↗(On Diff #34654)

Move { to the next line.

52 ↗(On Diff #34654)

Usage is typically sent to stderr.

56 ↗(On Diff #34654)

Ditto.

65 ↗(On Diff #34654)

Local variables are utterly unsorted. See style(9).

eadler marked 13 inline comments as done.

Attempt style(9)

An alternative implementation would write to a temporary file, then rename onto the destination once stdin closes. This is how ksh93's >;pathname redirection works (this does not mean I think this definitely should be done like that, or that the shell is the best place to implement this feature).

usr.bin/sponge/sponge.1
26 ↗(On Diff #34702)

Convention seems to be not to add the leading zero here.

53 ↗(On Diff #34702)

We should perhaps add a warning that the file will be written even if earlier components of the pipeline failed.

usr.bin/sponge/sponge.c
15 ↗(On Diff #34702)

whitespace style: static void *safe_malloc...

77 ↗(On Diff #34702)

style: spaces

79 ↗(On Diff #34702)

-h for help is an unusual convention. It is very common for -h to mean something else, such as affecting a symlink instead of its target or making output more human-readable.

116 ↗(On Diff #34702)

This memory is leaked (never used).

132 ↗(On Diff #34702)

You can use err(3) here and elsewhere to simplify the code a bit.

136 ↗(On Diff #34702)

If there are too many iovecs (sysconf(_SC_IOV_MAX), 16 MB with 16 KB buffers on FreeBSD), writev() will do nothing and return the error [EINVAL]. To avoid this, the write should be split up.

If sysconf(_SC_IOV_MAX) returns -1 or if you want to simplify the code, use the constant _XOPEN_IOV_MAX. This may lead to more system calls, but that may not be a problem.

143 ↗(On Diff #34702)

The error from close() should be checked.

Some more style(9) issues...

usr.bin/sponge/sponge.c
19 ↗(On Diff #34702)

Per our style(9), * should be part of the variable, not the type. Also applies to prototypes above.

22 ↗(On Diff #34702)

Local variables are typically separated from the rest of of the function with extra linefeed.

28 ↗(On Diff #34702)

Still not done.

34 ↗(On Diff #34702)

Ditto.

40 ↗(On Diff #34702)

Again, return (ret);.

43 ↗(On Diff #34702)

Again, should be static void * (note the space).

46 ↗(On Diff #34702)

Missing linefeed.

52 ↗(On Diff #34702)

Again, return (ret);.

98 ↗(On Diff #34702)

Per style(9), infinite loops are spelled as for (;;) in FreeBSD.

eadler marked an inline comment as done.

almost there...

usr.bin/sponge/sponge.1
53 ↗(On Diff #34702)

Its the same for all unix commands, but its likely worth adding a reminder here.

usr.bin/sponge/sponge.c
79 ↗(On Diff #34702)

It is annoying to me when utilities use -h for anything other than help.

136 ↗(On Diff #34702)

this is ridiculous. Will fix in the next iteration.

wblock added inline comments.
usr.bin/sponge/sponge.1
40 ↗(On Diff #34767)

Redundant.

utility reads standard in until complete, then opens
​the output file and writes to it.
41 ↗(On Diff #34767)

I would say "useful" instead of "valuable".

42 ↗(On Diff #34767)
This makes it useful in pipelines that read a file and then write to it.
43 ↗(On Diff #34767)

s/The following/These/

46 ↗(On Diff #34767)

s/Opens/Open/

51 ↗(On Diff #34767)

s/In the event that/If/
and this needs a comma and some other stuff. I doubt you are referring to a specific malloc, but rather to memory allocation in general. So:

If an attempt to allocate memory fails,
54 ↗(On Diff #34767)

s/will be/is/

54 ↗(On Diff #34767)

Seems like this ought to be in a BUGS section at the end.

55 ↗(On Diff #34767)

s/have//

67 ↗(On Diff #34767)

This should use double quotes. Too bad we don't have a macro for air quotes, because that's pretty much what this is.

68 ↗(On Diff #34767)
.Cm sort file | sponge file
eadler marked 30 inline comments as done.

style(9)

usr.bin/sponge/sponge.1
54 ↗(On Diff #34767)

this isn't a bug. Its statement on how shells work.

usr.bin/sponge/sponge.1
53 ↗(On Diff #36115)

missing space in iswritten

usr.bin/sponge/sponge.c
136 ↗(On Diff #34702)

Ok, but to be portable you still need something like

if (maxiovec == -1)
    maxiovec =  _XOPEN_IOV_MAX;

since I expect haters of fixed limits to return -1 from sysconf(_SC_IOV_MAX).

eadler marked an inline comment as done.

last iteration - I hope!

eadler marked an inline comment as done.

add unit test

This revision was automatically updated to reflect the committed changes.