Page MenuHomeFreeBSD

bsdgrep: provide primitive literal matcher instead of erroring out on fgrep use
ClosedPublic

Authored by kevans on Aug 17 2017, 3:35 AM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Jan 13, 9:34 PM
Unknown Object (File)
Wed, Jan 1, 6:37 PM
Unknown Object (File)
Dec 10 2024, 5:46 AM
Unknown Object (File)
Nov 10 2024, 11:09 AM
Unknown Object (File)
Oct 7 2024, 3:30 PM
Unknown Object (File)
Oct 7 2024, 5:41 AM
Unknown Object (File)
Sep 5 2024, 7:09 AM
Unknown Object (File)
Aug 30 2024, 9:19 PM
Subscribers

Details

Summary

fgrep/grep -F will error out at runtime if compiled with a regex(3)
that does not define REG_NOSPEC or REG_LITERAL. glibc is one such regex(3)
implementation, and as it turns out they don't support literal matching at all.

Provide a primitive literal matcher for use with glibc and other implementations
that don't support literal matching so that we don't completely lose
fgrep/grep -F if compiled against libgnuregex on stable/10, stable/11, or
other systems that we don't necessarily support.

This is a wholly unoptimized implementation with no plans to optimize it as of
now. This is due to both its use-case being primarily on unsupported systems in
the near-distant future and that it's reinventing the wheel that we already
have available as a feature of regex(3).

More fgrep tests have been added to more extensively test this one-off
implementation.

MFC after: 2 weeks?

Test Plan

Run the tests, Luke

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

contrib/netbsd-tests/usr.bin/grep/t_grep.sh
716 ↗(On Diff #32151)

Probably worth adding a test for something like -e "BAR" -e "bAz"

726 ↗(On Diff #32151)

any reason 'h' is missing, just out of curiosity?

usr.bin/grep/grep.c
743–747 ↗(On Diff #32151)

I'd generally have a slight preference for

#ifdef WITH_INTERNAL_NOSPEC
        if (!matchall && grepbehave != GREP_FIXED) {
#else
        if (!matchall) {
#endif
usr.bin/grep/util.c
360 ↗(On Diff #32151)

Might be worth extending this comment with info you have in the proposed commit message above (i.e. does not apply in default FreeBSD config, applies if built against libgnuregex or on non-FreeBSD OSes).

372 ↗(On Diff #32151)

Ought not use _-prefixed variables, what about strstr_fn = strcasestr or such?

  • Address concerns by emaste@
kevans added inline comments.
contrib/netbsd-tests/usr.bin/grep/t_grep.sh
716 ↗(On Diff #32151)

Right- actual mixed-case stuff makes sense =p.

726 ↗(On Diff #32151)

Keyboard goofs- 'h' takes extra effort on my Pinebook's keyboard. When I was typing this out, I failed to put in that extra effort and then just carried on with the typo.

usr.bin/grep/util.c
515 ↗(On Diff #32204)

I forgot to update my working copy -- this will not be included since it already went in.

372 ↗(On Diff #32151)

It did indeed feel dirty, but for some reason I hadn't thought of a _fn suffix.

contrib/netbsd-tests/usr.bin/grep/t_grep.sh
726 ↗(On Diff #32151)

:)

Could you please commit the testcases separately? It seems like they are good, but orthogonal to fixing fgrep.

contrib/netbsd-tests/usr.bin/grep/t_grep.sh
690 ↗(On Diff #32204)

Aren't some testcases missing dealing with newline separated patterns?

-F, --fixed-strings
         Interpret PATTERN as a list of fixed strings, separated by
         newlines, any of which is to be matched.
usr.bin/grep/grep.c
727 ↗(On Diff #32204)

Could you please add a comment as to why this is the case?

usr.bin/grep/util.c
74 ↗(On Diff #32204)

Spurious space after "char *".

kevans marked 2 inline comments as done.
  • Address ngie@'s concerns
  • Kill an extra space around strstr_fn's definition

I left a rather verbose comment around the GREP_FIXED cflag setting to explain both
motivation behind the preproc bits and what to do if we're wrong, since there's
one other place that would need amended in that case.

I could, but the only motivation behind these tests was to make sure that my implementation of the fixed matcher is functional -- without having to write the fixed matcher, I would not have bothered with these tests since I'd expect regex(3)'s REG_NOSPEC bits to have already been tested to a better standard than I could duplicate.

That being said, I have no strong opinion either way on whether they get committed together or separately.

contrib/netbsd-tests/usr.bin/grep/t_grep.sh
690 ↗(On Diff #32204)

Negative; that's not an fgrep specific feature, that's just how grep in general interprets PATTERN. This is reflected better in bsdgrep(1), where this isn't specifically mentioned in -F. Breaking up the pattern by newlines happens long before it gets to the bit that I care to test the most, and should happen elsewhere.

This revision is now accepted and ready to land.Aug 19 2017, 4:43 AM
This revision was automatically updated to reflect the committed changes.