Page MenuHomeFreeBSD

newsyslog: simplify compression code
ClosedPublic

Authored by dnelson_1901_yahoo.com on Jul 30 2018, 8:29 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Dec 3, 2:00 AM
Unknown Object (File)
Mon, Dec 2, 9:35 AM
Unknown Object (File)
Thu, Nov 28, 1:58 AM
Unknown Object (File)
Mon, Nov 25, 5:12 PM
Unknown Object (File)
Mon, Nov 25, 5:12 PM
Unknown Object (File)
Mon, Nov 25, 5:12 PM
Unknown Object (File)
Mon, Nov 25, 5:11 PM
Unknown Object (File)
Mon, Nov 25, 5:11 PM
Subscribers

Details

Summary

Remove the compression suffix macros and move them directly into the compress_type array.

Remove the hardcoded sizes on the suffix and compression args arrays.

Simplify the compression args arrays at the expense of a __DECONST call when calling execv().

Rewrite do_zipwork. The COMPRESS_* macros can directly index the compress_types array, so the outer loop is not needed. Convert fixed-length strings into asprintf or sbuf calls.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

I'm open to any other simplification ideas, but the storing of the compression options in their own separate arrays plus that nested loop particularly hurt my eyes.

Rebase after rS337050 .

I'm open to making COMPRESS_SUFFIX_MAXLEN dynamic too, but I can't think of
a clean way to do it. age_old_log() feels like it wants to be rewritten
using an sbuf :)

  • newsyslog: calculate the maximum compression suffix length at runtime
usr.sbin/newsyslog/newsyslog.c
2017 ↗(On Diff #46186)

You might consider asserting that the index is valid.

2018 ↗(On Diff #46186)

I would make a const local pointer to compress_type[zwork->zw_conf->compress], and use that here. You could get rid of the "nargs" local variable in the process.

2433 ↗(On Diff #46186)

We're building a path here... I would argue that MAXPATHLEN on its own should be sufficiently large. IMO the real issue is that this function and others are not checking for truncation.

In general alloca()'s use is discouraged.

dnelson_1901_yahoo.com added inline comments.
usr.sbin/newsyslog/newsyslog.c
2018 ↗(On Diff #46186)

I actually did that in my initial draft to reduce line length so every line didn't wrap, but after I removed the nested loop there was enough space so I removed it :) Easy enough to put back.

2433 ↗(On Diff #46186)

Is an alloca() any worse than the current VLA usage?

I think the idea was "allocate a buffer large enough that truncation can't happen". The filename built here may be chopped up to find the names of older logfiles (which may have different or no suffixes), so even if this filename's length exceeds MAXPATHLEN, one of the older archive files might not.

get_logfile_suffix() looks like it might incorrectly match the wrong filename due to truncation, though. I'll see what can be done there.

usr.sbin/newsyslog/newsyslog.c
2433 ↗(On Diff #46186)

How is it a VLA? Its size is known at compile-time.

In a common case (archtodir == 0 && timefnamefmt == NULL), we just build up the path and pass it to stat(2), which will fail if the total length exceeds MAXPATHLEN. The array size calculation is also bogus for the timefnamefmt != NULL case, where we add a suffix other than ".0". IMHO there is no reason to play these tricks to avoid truncation when the result is going to be the same anyway. If you still feel strongly that alloc() is a better way to go, I have no objection to it.

dnelson_1901_yahoo.com added a subscriber: gad.

The rabbit hole goes deeper than I thought. The do_zipwork function came into being with @gad 's rS130167 commit back in 2004. Then the array of compression types was added in rS218127, then compression arguments were added in rS326617. Some of the code from the original commit remained but no longer makes sense, or is now being done in the wrong place.

I rearranged the code to make the compression arg arrays just an array of the flags themselves, and reordered do_zipwork() to put an easy exit up top, and then build the argument array in order with a single loop, using dynamic string allocation instead of static arrays.

usr.sbin/newsyslog/newsyslog.c
2433 ↗(On Diff #46186)

You're right, it's a constant; the math expression fooled me. I'll leave it as an alloca() for now. Rewriting the whole thing to use an sbuf will guarantee no overflows, and should be more readable than the current code. I think that should happen in a separate patch though.

  • newsyslog: more cleanup
  • newsyslog: more extensive cleanup

Thanks, this mostly looks good to me. Just a few nits.

usr.sbin/newsyslog/newsyslog.c
142 ↗(On Diff #46279)

The addition of -f should be a separate change.

2041 ↗(On Diff #46279)

Since you're using __DECONST below, why not here too?

Actually, I think you can instead declare "args" to be of type const char ** and just have one __DECONST in the execv() call:

execv(pgm_path, __DECONST(char *const *, args));
2044 ↗(On Diff #46279)

Style: redundant braces.

2065 ↗(On Diff #46279)

Style: redundant braces.

dnelson_1901_yahoo.com added inline comments.
usr.sbin/newsyslog/newsyslog.c
2041 ↗(On Diff #46279)

strdup() takes a const char* and returns a char *, so no const stripping is needed there. A better question would be "why are you strdup()'ing at all" :) It's not modified or passed anywhere. Removing that also removes the free(args[0]) call.

  • newsyslog: don't add -f to zstd args in this patch
  • newsyslog: remove unneeded braces
  • newsyslog: convert args to const char**
  • newsyslog: don't strdup pgm_name

Thanks, this looks ok to me.

usr.sbin/newsyslog/newsyslog.c
2041 ↗(On Diff #46279)

Right, I was wondering why you were using strdup() instead of DECONSTing.

This revision is now accepted and ready to land.Aug 4 2018, 8:13 PM

I haven't looked at newsyslog in awhile, but I'm happy to see someone attempting this cleanup.

disclaimer: I use reviews.freebsd so rarely that I'm not very good at it.

I think that style(9) suggests that the line:

	const char *pgm_name, *pgm_path, **args;

should be:

	const char **args, *pgm_name, *pgm_path;

(alphabetical order)

When I started to write this review, I believed there was a more significant issue I wanted to point out. But as I went to write that up I realized I had overlooked some code, so there was no problem where I had thought there was a problem. But I'll submit this anyway, if only to prove that I do read my email. 🙂

  • newsyslog: re-alphabetize variables

Oops. They were in order, until I made args a const pointer.

This is my second use of reviews, and I definitely like the inline comments (e.g you could have clicked on line 2005 in the diff and added an entry there) and the arc command to push out updated diffs straight from a checked-out source tree (svn, git, or hg). A whole lot simpler than manually diffing and then uploading to bugzilla.

This revision now requires review to proceed.Aug 7 2018, 1:01 AM
In D16518#352929, @gad wrote:

When I started to write this review, I believed there was a more significant issue I wanted to point out. But as I went to write that up I realized I had overlooked some code, so there was no problem where I had thought there was a problem. But I'll submit this anyway, if only to prove that I do read my email. 🙂

Does this qualify for a "Reviewed by: gad" in the commit message? Normally one "accepts" the revision to indicate that they've reviewed the diff.

In D16518#352929, @gad wrote:

[...] I'll submit this anyway, if only to prove that I do read my email. 🙂

Does this qualify for a "Reviewed by: gad" in the commit message?

I was a little reluctant to claim "reviewed by". I did look over the entire update, but I didn't try to run the result. On the other hand, I did look it over and I don't have any objections to it at this point. So I guess that's good for a "reviewed by".

This revision is now accepted and ready to land.Aug 7 2018, 6:18 PM
This revision was automatically updated to reflect the committed changes.