diff mbox series

grep: report missing left operand of --and

Message ID 98171911-ba39-27f1-d068-4d381bcd4804@web.de (mailing list archive)
State Superseded
Headers show
Series grep: report missing left operand of --and | expand

Commit Message

René Scharfe June 28, 2021, 6:58 p.m. UTC
Git grep allows combining two patterns with --and.  It checks and
reports if the second pattern is missing when compiling the expression.
A missing first pattern, however, is only reported later at match time.
Thus no error is returned if no matching is done, e.g. because no file
matches the also given pathspec.

When that happens we get an expression tree with an GREP_NODE_AND node
and a NULL pointer to the missing left child.  free_pattern_expr()
tries to dereference it during the cleanup at the end, which result in
a segmentation fault.

Fix this by verifying the presence of the left operand at expression
compilation time.

Reported-by: Matthew Hughes <matthewhughes934@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
Whether the check in match_expr_eval() can now be turned into a BUG is
left as an exercise for the reader. ;-)

 grep.c          |  2 ++
 t/t7810-grep.sh | 10 ++++++++++
 2 files changed, 12 insertions(+)

--
2.32.0

Comments

Ævar Arnfjörð Bjarmason June 29, 2021, 5:52 p.m. UTC | #1
On Mon, Jun 28 2021, René Scharfe wrote:

> Git grep allows combining two patterns with --and.  It checks and
> reports if the second pattern is missing when compiling the expression.
> A missing first pattern, however, is only reported later at match time.
> Thus no error is returned if no matching is done, e.g. because no file
> matches the also given pathspec.
>
> When that happens we get an expression tree with an GREP_NODE_AND node
> and a NULL pointer to the missing left child.  free_pattern_expr()
> tries to dereference it during the cleanup at the end, which result in
> a segmentation fault.
>
> Fix this by verifying the presence of the left operand at expression
> compilation time.
>
> Reported-by: Matthew Hughes <matthewhughes934@gmail.com>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
> Whether the check in match_expr_eval() can now be turned into a BUG is
> left as an exercise for the reader. ;-)
>
>  grep.c          |  2 ++
>  t/t7810-grep.sh | 10 ++++++++++
>  2 files changed, 12 insertions(+)
>
> diff --git a/grep.c b/grep.c
> index 8f91af1cb0..7d0ea4e956 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -655,6 +655,8 @@ static struct grep_expr *compile_pattern_and(struct grep_pat **list)
>  	struct grep_expr *x, *y, *z;
>
>  	x = compile_pattern_not(list);
> +	if (!x)
> +		die("Not a valid grep expression");
>  	p = *list;
>  	if (p && p->token == GREP_AND) {
>  		if (!p->next)
> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
> index 5830733f3d..c581239674 100755
> --- a/t/t7810-grep.sh
> +++ b/t/t7810-grep.sh
> @@ -11,6 +11,13 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>
>  . ./test-lib.sh
>
> +test_invalid_grep_expression() {
> +	params="$@" &&
> +	test_expect_success "invalid expression: grep $params" '
> +		test_must_fail git grep $params -- nonexisting
> +	'
> +}
> +
>  cat >hello.c <<EOF
>  #include <assert.h>
>  #include <stdio.h>
> @@ -89,6 +96,9 @@ test_expect_success 'grep should not segfault with a bad input' '
>  	test_must_fail git grep "("
>  '
>
> +test_invalid_grep_expression -e A --and
> +test_invalid_grep_expression --and -e A
> +
>  for H in HEAD ''
>  do
>  	case "$H" in

This seems like an incomplete fix, for the exact same thing with --or we
silently return 1, as we would if we exited early in free_pattern_expr
on !x, which aside from the segfault I think we should probably make a
habit in our own free()-like functions.

Whatever we're doing about the --and segfault it seems like we should do
the same under --or, no?

Your first test also passes before your fix, it's only the latter that
segfaults. The first one emits:

    fatal: --and not followed by pattern expression

So having that in a leading patch to indicate no behavior was changed
would be better.

Instead of the "Not a valid grep expression" error let's instead say
something like:

    fatal: --[and|or] must follow a pattern expression

The error (which I know you just copied from elsewhere) is misleading,
it's not the pattern that's not valid (as to me it implies), but our own
--and/--or option usage.

And the "excercise for the reader" is a bit flippant, do we actually hit
that condition now? If not and we're sure we won't now seems like the
time to add a BUG() there, and to change the "Not a valid grep
expression" to "internal error in --and/--or parsing" or something.

Thanks for the patch!
René Scharfe June 29, 2021, 6:35 p.m. UTC | #2
Am 29.06.21 um 19:52 schrieb Ævar Arnfjörð Bjarmason:
>
> On Mon, Jun 28 2021, René Scharfe wrote:
>
>> Git grep allows combining two patterns with --and.  It checks and
>> reports if the second pattern is missing when compiling the expression.
>> A missing first pattern, however, is only reported later at match time.
>> Thus no error is returned if no matching is done, e.g. because no file
>> matches the also given pathspec.
>>
>> When that happens we get an expression tree with an GREP_NODE_AND node
>> and a NULL pointer to the missing left child.  free_pattern_expr()
>> tries to dereference it during the cleanup at the end, which result in
>> a segmentation fault.
>>
>> Fix this by verifying the presence of the left operand at expression
>> compilation time.
>>
>> Reported-by: Matthew Hughes <matthewhughes934@gmail.com>
>> Signed-off-by: René Scharfe <l.s.r@web.de>
>> ---
>> Whether the check in match_expr_eval() can now be turned into a BUG is
>> left as an exercise for the reader. ;-)
>>
>>  grep.c          |  2 ++
>>  t/t7810-grep.sh | 10 ++++++++++
>>  2 files changed, 12 insertions(+)
>>
>> diff --git a/grep.c b/grep.c
>> index 8f91af1cb0..7d0ea4e956 100644
>> --- a/grep.c
>> +++ b/grep.c
>> @@ -655,6 +655,8 @@ static struct grep_expr *compile_pattern_and(struct grep_pat **list)
>>  	struct grep_expr *x, *y, *z;
>>
>>  	x = compile_pattern_not(list);
>> +	if (!x)
>> +		die("Not a valid grep expression");
>>  	p = *list;
>>  	if (p && p->token == GREP_AND) {
>>  		if (!p->next)
>> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
>> index 5830733f3d..c581239674 100755
>> --- a/t/t7810-grep.sh
>> +++ b/t/t7810-grep.sh
>> @@ -11,6 +11,13 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>>
>>  . ./test-lib.sh
>>
>> +test_invalid_grep_expression() {
>> +	params="$@" &&
>> +	test_expect_success "invalid expression: grep $params" '
>> +		test_must_fail git grep $params -- nonexisting
>> +	'
>> +}
>> +
>>  cat >hello.c <<EOF
>>  #include <assert.h>
>>  #include <stdio.h>
>> @@ -89,6 +96,9 @@ test_expect_success 'grep should not segfault with a bad input' '
>>  	test_must_fail git grep "("
>>  '
>>
>> +test_invalid_grep_expression -e A --and
>> +test_invalid_grep_expression --and -e A
>> +
>>  for H in HEAD ''
>>  do
>>  	case "$H" in
>
> This seems like an incomplete fix, for the exact same thing with --or we
> silently return 1, as we would if we exited early in free_pattern_expr
> on !x, which aside from the segfault I think we should probably make a
> habit in our own free()-like functions.
>
> Whatever we're doing about the --and segfault it seems like we should do
> the same under --or, no?

No, --or is a special case and needs special handling.  Currently it's
ignored.  If we want to berate the user for using it without expressions
left and right then we need to start actively handling it.

> Your first test also passes before your fix, it's only the latter that
> segfaults. The first one emits:
>
>     fatal: --and not followed by pattern expression
>
> So having that in a leading patch to indicate no behavior was changed
> would be better.

True, the first test is just nice to have.  I can remove it to reduce
confusion.

> Instead of the "Not a valid grep expression" error let's instead say
> something like:
>
>     fatal: --[and|or] must follow a pattern expression

Good point.

> The error (which I know you just copied from elsewhere) is misleading,
> it's not the pattern that's not valid (as to me it implies), but our own
> --and/--or option usage.

That's what's meant with extended pattern, I think.

> And the "excercise for the reader" is a bit flippant, do we actually hit
> that condition now? If not and we're sure we won't now seems like the
> time to add a BUG() there, and to change the "Not a valid grep
> expression" to "internal error in --and/--or parsing" or something.

I don't know.

René
diff mbox series

Patch

diff --git a/grep.c b/grep.c
index 8f91af1cb0..7d0ea4e956 100644
--- a/grep.c
+++ b/grep.c
@@ -655,6 +655,8 @@  static struct grep_expr *compile_pattern_and(struct grep_pat **list)
 	struct grep_expr *x, *y, *z;

 	x = compile_pattern_not(list);
+	if (!x)
+		die("Not a valid grep expression");
 	p = *list;
 	if (p && p->token == GREP_AND) {
 		if (!p->next)
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 5830733f3d..c581239674 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -11,6 +11,13 @@  export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME

 . ./test-lib.sh

+test_invalid_grep_expression() {
+	params="$@" &&
+	test_expect_success "invalid expression: grep $params" '
+		test_must_fail git grep $params -- nonexisting
+	'
+}
+
 cat >hello.c <<EOF
 #include <assert.h>
 #include <stdio.h>
@@ -89,6 +96,9 @@  test_expect_success 'grep should not segfault with a bad input' '
 	test_must_fail git grep "("
 '

+test_invalid_grep_expression -e A --and
+test_invalid_grep_expression --and -e A
+
 for H in HEAD ''
 do
 	case "$H" in