Paths

Table of Contentst

Correct an out-of-bounds read in regcomp when the RE is bad.
ClosedPublic
Actions

Authored by brooks on Apr 29 2017, 12:48 AM.

Details

Reviewers

pfg
emaste
jhb
ngie
ache

Commits

rS318030: MFC r317707:
rS318029: MFC r317707:
rS317707: Correct an out-of-bounds read in regcomp when the RE is bad.

Summary

When passed the invalid regular expression "a**", the error is eventually
detected and seterr() is called. It sets p->error appropriatly and
p->next and p->end to nuls which is a never used char nuls[10] which is
zeros due to .bss initialization. Unfortunatly, p_ere_exp() and
p_simp_re() both have fall through cases where they set the error,
decrement p->next and access it which means a read from what ever .bss
variable comes before nuls.

Found with regex_test:repet_multi and CHERI bounds checking.

Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
MFC after: 1 week

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

brooks created this revision.Apr 29 2017, 12:48 AM

This bug was introduced with wide character support in rS132019.

emaste accepted this revision.Apr 29 2017, 2:07 AM

emaste added a subscriber: kevans.

This revision is now accepted and ready to land.Apr 29 2017, 2:07 AM

LGTM, but let me add ache as he knows this code better.

lib/libc/regex/regcomp.c
657 ↗	(On Diff #27842)	Typo in the comment: "Definitely"

ngie accepted this revision.Apr 29 2017, 3:34 AM

Fix typo.

This revision now requires review to proceed.May 1 2017, 3:19 AM

Harbormaster completed remote builds in B8998: Diff 27882.May 1 2017, 3:19 AM

This bug was introduced with wide character support in rS132019.

I am unsure if it's better to return or just break .... apparently the return just works so .. I'll accept it.

This revision is now accepted and ready to land.May 1 2017, 4:11 AM

In D10541#218737, @pfg wrote:

I am unsure if it's better to return or just break .... apparently the return just works so .. I'll accept it.

The error-handling stuff here feels wrong enough that kicking control back up to the top-level expression parser is likely safer.

Honestly, we could *probably* do a little more to safely exit out of regcomp in case of an error. Calls to the top-level parser can be nested, so this could bail out of one level and land us back in the sub-expression bits of this same function further up.

In D10541#218757, @kevans91_ksu.edu wrote:

In D10541#218737, @pfg wrote:

I am unsure if it's better to return or just break .... apparently the return just works so .. I'll accept it.

The error-handling stuff here feels wrong enough that kicking control back up to the top-level expression parser is likely safer.

Honestly, we could *probably* do a little more to safely exit out of regcomp in case of an error. Calls to the top-level parser can be nested, so this could bail out of one level and land us back in the sub-expression bits of this same function further up.

The struct parse error handling stuff is bizarre. I guess it's a pre-branch predictor optimization to keep error handling out of the fast path (after all, it's not like you need to consume the buffer). I think this code could do with a refactor, and probably quite a lot of fuzzing.

In D10541#218866, @brooks wrote:

The struct parse error handling stuff is bizarre. I guess it's a pre-branch predictor optimization to keep error handling out of the fast path (after all, it's not like you need to consume the buffer). I think this code could do with a refactor, and probably quite a lot of fuzzing.

Is such an optimization even worth it at this point? Especially given that this is in regcomp -- the lesser of two evils in regex(3).

I do have some refactoring for relevant parts in a local branch that will make it to review once D10315 gets hammered out, I may revisit this sooner rather than later.

Is such an optimization even worth it at this point? Especially given that this is in regcomp -- the lesser of two evils in regex(3).

Almost certainly not worth it. I think this is a reasonable change for now, and introduce a further refactoring later.

In D10541#218920, @emaste wrote:

Is such an optimization even worth it at this point? Especially given that this is in regcomp -- the lesser of two evils in regex(3).

Almost certainly not worth it. I think this is a reasonable change for now, and introduce a further refactoring later.

Given past experience breaking things, it's best to introduce tests beforehand, then request an exp- run with any and all refactoring.

Closed by commit rS317707: Correct an out-of-bounds read in regcomp when the RE is bad. (authored by brooks). · Explain WhyMay 2 2017, 9:20 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents
Changeset List

Path

Size

head/

lib/

libc/

regex/

regcomp.c

4 lines

Diff 27949

View Options

head/lib/libc/regex/regcomp.c

Correct an out-of-bounds read in regcomp when the RE is bad.ClosedPublicActions