t0091-bugreport.sh: actually verify some content of report

Message ID	20210411143354.25134-1-martin.agren@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <git-owner@kernel.org> From: =?utf-8?q?Martin_=C3=85gren?= <martin.agren@gmail.com> To: =?utf-8?q?SZEDER_G=C3=A1bor?= <szeder.dev@gmail.com> Cc: Emily Shaffer <emilyshaffer@google.com>, =?utf-8?b?w4Z2YXIgQXJuZmrDtnI=?= =?utf-8?b?w7AgQmphcm1hc29u?= <avarab@gmail.com>, "Junio C Hamano" <gitster@pobox.com>, git@vger.kernel.org Subject: [PATCH] t0091-bugreport.sh: actually verify some content of report Date: Sun, 11 Apr 2021 16:33:54 +0200 Message-Id: <20210411143354.25134-1-martin.agren@gmail.com> In-Reply-To: <20210409212724.GE2947267@szeder.dev> References: <20210409212724.GE2947267@szeder.dev> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	t0091-bugreport.sh: actually verify some content of report \| expand t0091-bugreport.sh: actually verify some content of report

Martin Ågren April 11, 2021, 2:33 p.m. UTC

In the first test in this script, 'creates a report with content in the
right places', we generate a report and pipe it into our helper
`check_all_headers_populated()`. The idea of the helper is to find all
lines that look like headers ("[Some Header Here]") and to check that
the next line is non-empty. This is supposed to catch erroneous outputs
such as the following:

  [A Header]
  something
  more here

  [Another Header]

  [Too Early Header]
  contents

However, we provide the lines of the bug report as filenames to grep,
meaning we mostly end up spewing errors:

  grep: : No such file or directory
  grep: [System Info]: No such file or directory
  grep: git version:: No such file or directory
  grep: git version 2.31.1.164.g984c2561cd: No such file

This doesn't disturb the test, which tugs along and reports success, not
really having verified the contents of the report at all.

Note that after 788a776069 ("bugreport: collect list of populated
hooks", 2020-05-07), the bug report, which is created in our hook-less
test repo, contains an empty section with the enabled hooks. Thus, even
the intention of our helper is a bit misguided: there is nothing
inherently wrong with having an empty section in the bug report.

Let's instead grep for some contents that we expect to find in a bug
report. We won't verify that they appear in the right order, but at
least we end up verifying the contents more than before this commit.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 > It does scare me..

 Maybe something like this?

 t/t0091-bugreport.sh | 26 +++++---------------------
 1 file changed, 5 insertions(+), 21 deletions(-)

Junio C Hamano April 12, 2021, 5:17 p.m. UTC | #1

Martin Ågren <martin.agren@gmail.com> writes:

> In the first test in this script, 'creates a report with content in the
> right places', we generate a report and pipe it into our helper
> `check_all_headers_populated()`. The idea of the helper is to find all
> lines that look like headers ("[Some Header Here]") and to check that
> the next line is non-empty. This is supposed to catch erroneous outputs
> such as the following:
>
>   [A Header]
>   something
>   more here
>
>   [Another Header]
>
>   [Too Early Header]
>   contents
>
> However, we provide the lines of the bug report as filenames to grep,
> meaning we mostly end up spewing errors:
>
>   grep: : No such file or directory
>   grep: [System Info]: No such file or directory
>   grep: git version:: No such file or directory
>   grep: git version 2.31.1.164.g984c2561cd: No such file
>
> This doesn't disturb the test, which tugs along and reports success, not
> really having verified the contents of the report at all.
>
> Note that after 788a776069 ("bugreport: collect list of populated
> hooks", 2020-05-07), the bug report, which is created in our hook-less
> test repo, contains an empty section with the enabled hooks. Thus, even
> the intention of our helper is a bit misguided: there is nothing
> inherently wrong with having an empty section in the bug report.
>
> Let's instead grep for some contents that we expect to find in a bug
> report. We won't verify that they appear in the right order, but at
> least we end up verifying the contents more than before this commit.

Nicely described.  I agree that the original intent (let alone the
implementation) is misguided and we should allow an empty section as
a perfectly normal thing.

> Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
>  > It does scare me..
>
>  Maybe something like this?
> ...
> +test_expect_success 'creates a report with content' '
>  	test_when_finished rm git-bugreport-check-headers.txt &&
>  	git bugreport -s check-headers &&
> -	check_all_headers_populated <git-bugreport-check-headers.txt
> +	grep "^Please answer " git-bugreport-check-headers.txt &&
> +	grep "^\[System Info\]$" git-bugreport-check-headers.txt &&
> +	grep "^git version:$" git-bugreport-check-headers.txt &&
> +	grep "^\[Enabled Hooks\]$" git-bugreport-check-headers.txt
>  '

It is a different matter if it is sufficient to ensure only certain
selected lines appear in the report, though.  As all the lines lost
by this fix comes from 238b439d (bugreport: add tool to generate
debugging info, 2020-04-16), it would be nice to hear from Emily.

Thanks.

Martin Ågren April 13, 2021, 6:32 p.m. UTC | #2

On Mon, 12 Apr 2021 at 19:17, Junio C Hamano <gitster@pobox.com> wrote:
>
> Martin Ågren <martin.agren@gmail.com> writes:
>
> > In the first test in this script, 'creates a report with content in the
> > right places', we generate a report and pipe it into our helper
> > `check_all_headers_populated()`. The idea of the helper is to find all
> > lines that look like headers ("[Some Header Here]") and to check that
> > the next line is non-empty. This is supposed to catch erroneous outputs
> > such as the following:
...
> > Let's instead grep for some contents that we expect to find in a bug
> > report. We won't verify that they appear in the right order, but at
> > least we end up verifying the contents more than before this commit.
>
> Nicely described.  I agree that the original intent (let alone the
> implementation) is misguided and we should allow an empty section as
> a perfectly normal thing.

> > +test_expect_success 'creates a report with content' '
> >       test_when_finished rm git-bugreport-check-headers.txt &&
> >       git bugreport -s check-headers &&
> > -     check_all_headers_populated <git-bugreport-check-headers.txt
> > +     grep "^Please answer " git-bugreport-check-headers.txt &&
> > +     grep "^\[System Info\]$" git-bugreport-check-headers.txt &&
> > +     grep "^git version:$" git-bugreport-check-headers.txt &&
> > +     grep "^\[Enabled Hooks\]$" git-bugreport-check-headers.txt
> >  '
>
> It is a different matter if it is sufficient to ensure only certain
> selected lines appear in the report, though.  As all the lines lost
> by this fix comes from 238b439d (bugreport: add tool to generate
> debugging info, 2020-04-16), it would be nice to hear from Emily.

Maybe something like

       awk '\''BEGIN { sect="" }
               /^\[.*]$/ { sect=$0 }
               /./ { print sect, $0 }'\'' \
           git-bugreport-check-headers.txt >prefixed &&
       grep "^ Thank you for filling out a Git bug report" prefixed &&
       grep "^ Please review the rest of the bug report below" prefixed &&
       grep "^ You can delete any lines you don.t wish to share" prefixed &&
       grep "\[System Info\] git version ..." prefixed

Something like that could be used to verify that a line goes into the
right section, as opposed to just seeing that it appears *somewhere*. Or
maybe

  grep -e Thank.you -e Please.review -e You.can.delete -e "^\[" \
       -e git.version git-bugreport-check-headers.txt >actual

then setting up an "expect" and comparing. That would help us verify the
order, including which section things appear in. Slightly less friendly
for comparing loosely, compared to the awk-then-grep above.

Let's see what Emily thinks about the various alternatives. Maybe she can
think of something else.

Martin

Ævar Arnfjörð Bjarmason April 13, 2021, 7:27 p.m. UTC | #3

On Tue, Apr 13 2021, Martin Ågren wrote:

> On Mon, 12 Apr 2021 at 19:17, Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Martin Ågren <martin.agren@gmail.com> writes:
>>
>> > In the first test in this script, 'creates a report with content in the
>> > right places', we generate a report and pipe it into our helper
>> > `check_all_headers_populated()`. The idea of the helper is to find all
>> > lines that look like headers ("[Some Header Here]") and to check that
>> > the next line is non-empty. This is supposed to catch erroneous outputs
>> > such as the following:
> ...
>> > Let's instead grep for some contents that we expect to find in a bug
>> > report. We won't verify that they appear in the right order, but at
>> > least we end up verifying the contents more than before this commit.
>>
>> Nicely described.  I agree that the original intent (let alone the
>> implementation) is misguided and we should allow an empty section as
>> a perfectly normal thing.
>
>> > +test_expect_success 'creates a report with content' '
>> >       test_when_finished rm git-bugreport-check-headers.txt &&
>> >       git bugreport -s check-headers &&
>> > -     check_all_headers_populated <git-bugreport-check-headers.txt
>> > +     grep "^Please answer " git-bugreport-check-headers.txt &&
>> > +     grep "^\[System Info\]$" git-bugreport-check-headers.txt &&
>> > +     grep "^git version:$" git-bugreport-check-headers.txt &&
>> > +     grep "^\[Enabled Hooks\]$" git-bugreport-check-headers.txt
>> >  '
>>
>> It is a different matter if it is sufficient to ensure only certain
>> selected lines appear in the report, though.  As all the lines lost
>> by this fix comes from 238b439d (bugreport: add tool to generate
>> debugging info, 2020-04-16), it would be nice to hear from Emily.
>
> Maybe something like
>
>        awk '\''BEGIN { sect="" }
>                /^\[.*]$/ { sect=$0 }
>                /./ { print sect, $0 }'\'' \
>            git-bugreport-check-headers.txt >prefixed &&
>        grep "^ Thank you for filling out a Git bug report" prefixed &&
>        grep "^ Please review the rest of the bug report below" prefixed &&
>        grep "^ You can delete any lines you don.t wish to share" prefixed &&
>        grep "\[System Info\] git version ..." prefixed
>
> Something like that could be used to verify that a line goes into the
> right section, as opposed to just seeing that it appears *somewhere*. Or
> maybe
>
>   grep -e Thank.you -e Please.review -e You.can.delete -e "^\[" \
>        -e git.version git-bugreport-check-headers.txt >actual
>
> then setting up an "expect" and comparing. That would help us verify the
> order, including which section things appear in. Slightly less friendly
> for comparing loosely, compared to the awk-then-grep above.
>
> Let's see what Emily thinks about the various alternatives. Maybe she can
> think of something else.

I think a straight-up test_cmp is preferrable, both for correctness and
also as self-documentation, you can see from the test what the full
expected output is like.

Obviously in this case we can't do a test_cmp on the raw output, as it
contains various things from uname.

But it looks like we could do that if we do some light awk/perl/sed
munging of the "[System Info]" and "[Enabled Hooks]" section(s).

Or, since we also control the generator we could pass a --no-system-info
and/or --no-hooks-info, which may be something some people want for
privacy/reporting reasons anyway (e.g. I've often used perlbug and
deleted that whole info, because info there has no relevance to the
specific issue I'm reporting).

SZEDER Gábor April 13, 2021, 7:44 p.m. UTC | #4

On Sun, Apr 11, 2021 at 04:33:54PM +0200, Martin Ågren wrote:
> In the first test in this script, 'creates a report with content in the
> right places', we generate a report and pipe it into our helper
> `check_all_headers_populated()`. The idea of the helper is to find all
> lines that look like headers ("[Some Header Here]") and to check that
> the next line is non-empty. This is supposed to catch erroneous outputs
> such as the following:
> 
>   [A Header]
>   something
>   more here
> 
>   [Another Header]
> 
>   [Too Early Header]
>   contents
> 
> However, we provide the lines of the bug report as filenames to grep,
> meaning we mostly end up spewing errors:
> 
>   grep: : No such file or directory
>   grep: [System Info]: No such file or directory
>   grep: git version:: No such file or directory
>   grep: git version 2.31.1.164.g984c2561cd: No such file
> 
> This doesn't disturb the test, which tugs along and reports success, not
> really having verified the contents of the report at all.
> 
> Note that after 788a776069 ("bugreport: collect list of populated
> hooks", 2020-05-07), the bug report, which is created in our hook-less
> test repo, contains an empty section with the enabled hooks. Thus, even
> the intention of our helper is a bit misguided: there is nothing
> inherently wrong with having an empty section in the bug report.
> 
> Let's instead grep for some contents that we expect to find in a bug
> report. We won't verify that they appear in the right order, but at
> least we end up verifying the contents more than before this commit.
> 
> Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
>  > It does scare me..
> 
>  Maybe something like this?

Thanks!

>  t/t0091-bugreport.sh | 26 +++++---------------------
>  1 file changed, 5 insertions(+), 21 deletions(-)
> 
> diff --git a/t/t0091-bugreport.sh b/t/t0091-bugreport.sh
> index 526304ff95..9111c4c26f 100755
> --- a/t/t0091-bugreport.sh
> +++ b/t/t0091-bugreport.sh
> @@ -4,29 +4,13 @@ test_description='git bugreport'
>  
>  . ./test-lib.sh
>  
> -# Headers "[System Info]" will be followed by a non-empty line if we put some
> -# information there; we can make sure all our headers were followed by some
> -# information to check if the command was successful.
> -HEADER_PATTERN="^\[.*\]$"
> -
> -check_all_headers_populated () {
> -	while read -r line
> -	do
> -		if test "$(grep "$HEADER_PATTERN" "$line")"
> -		then
> -			echo "$line"
> -			read -r nextline
> -			if test -z "$nextline"; then
> -				return 1;
> -			fi
> -		fi
> -	done
> -}
> -
> -test_expect_success 'creates a report with content in the right places' '
> +test_expect_success 'creates a report with content' '
>  	test_when_finished rm git-bugreport-check-headers.txt &&
>  	git bugreport -s check-headers &&
> -	check_all_headers_populated <git-bugreport-check-headers.txt
> +	grep "^Please answer " git-bugreport-check-headers.txt &&

This "Please answer" is translated and you look for it with plain
'grep' instead of 'test_i18ngrep', which is fine nowadays...  However,
Junio queued this patch on top of v2.29.3, which is old enough to
still have the GETTEXT_POISON CI job, and fails because of this.

> +	grep "^\[System Info\]$" git-bugreport-check-headers.txt &&
> +	grep "^git version:$" git-bugreport-check-headers.txt &&
> +	grep "^\[Enabled Hooks\]$" git-bugreport-check-headers.txt
>  '

I have to wonder, however, whether this is worth testing at all.

>  
>  test_expect_success 'dies if file with same name as report already exists' '
> -- 
> 2.31.1.163.ga65ce7f831
>

Emily Shaffer April 13, 2021, 10:21 p.m. UTC | #5

On Tue, Apr 13, 2021 at 09:27:33PM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> 
> On Tue, Apr 13 2021, Martin Ågren wrote:
> 
> > On Mon, 12 Apr 2021 at 19:17, Junio C Hamano <gitster@pobox.com> wrote:
> >>
> >> Martin Ågren <martin.agren@gmail.com> writes:
> >>
> >> > In the first test in this script, 'creates a report with content in the
> >> > right places', we generate a report and pipe it into our helper
> >> > `check_all_headers_populated()`. The idea of the helper is to find all
> >> > lines that look like headers ("[Some Header Here]") and to check that
> >> > the next line is non-empty. This is supposed to catch erroneous outputs
> >> > such as the following:
> > ...
> >> > Let's instead grep for some contents that we expect to find in a bug
> >> > report. We won't verify that they appear in the right order, but at
> >> > least we end up verifying the contents more than before this commit.
> >>
> >> Nicely described.  I agree that the original intent (let alone the
> >> implementation) is misguided and we should allow an empty section as
> >> a perfectly normal thing.
> >
> >> > +test_expect_success 'creates a report with content' '
> >> >       test_when_finished rm git-bugreport-check-headers.txt &&
> >> >       git bugreport -s check-headers &&
> >> > -     check_all_headers_populated <git-bugreport-check-headers.txt
> >> > +     grep "^Please answer " git-bugreport-check-headers.txt &&
> >> > +     grep "^\[System Info\]$" git-bugreport-check-headers.txt &&
> >> > +     grep "^git version:$" git-bugreport-check-headers.txt &&
> >> > +     grep "^\[Enabled Hooks\]$" git-bugreport-check-headers.txt
> >> >  '
> >>
> >> It is a different matter if it is sufficient to ensure only certain
> >> selected lines appear in the report, though.  As all the lines lost
> >> by this fix comes from 238b439d (bugreport: add tool to generate
> >> debugging info, 2020-04-16), it would be nice to hear from Emily.
> >
> > Maybe something like
> >
> >        awk '\''BEGIN { sect="" }
> >                /^\[.*]$/ { sect=$0 }
> >                /./ { print sect, $0 }'\'' \
> >            git-bugreport-check-headers.txt >prefixed &&
> >        grep "^ Thank you for filling out a Git bug report" prefixed &&
> >        grep "^ Please review the rest of the bug report below" prefixed &&
> >        grep "^ You can delete any lines you don.t wish to share" prefixed &&
> >        grep "\[System Info\] git version ..." prefixed
> >
> > Something like that could be used to verify that a line goes into the
> > right section, as opposed to just seeing that it appears *somewhere*. Or
> > maybe
> >
> >   grep -e Thank.you -e Please.review -e You.can.delete -e "^\[" \
> >        -e git.version git-bugreport-check-headers.txt >actual
> >
> > then setting up an "expect" and comparing. That would help us verify the
> > order, including which section things appear in. Slightly less friendly
> > for comparing loosely, compared to the awk-then-grep above.
> >
> > Let's see what Emily thinks about the various alternatives. Maybe she can
> > think of something else.

My apologies for the slow reply :)

> I think a straight-up test_cmp is preferrable, both for correctness and
> also as self-documentation, you can see from the test what the full
> expected output is like.

Yeah, I like the sound of this.

> 
> Obviously in this case we can't do a test_cmp on the raw output, as it
> contains various things from uname.
> 
> But it looks like we could do that if we do some light awk/perl/sed
> munging of the "[System Info]" and "[Enabled Hooks]" section(s).
> 
> Or, since we also control the generator we could pass a --no-system-info
> and/or --no-hooks-info, which may be something some people want for
> privacy/reporting reasons anyway (e.g. I've often used perlbug and
> deleted that whole info, because info there has no relevance to the
> specific issue I'm reporting).

This approach sounds more appealing to me than awk munging. I think
you're right that folks may not want to share it in some cases.

Thanks for noticing.
 - Emily

t0091-bugreport.sh: actually verify some content of report

Commit Message

Comments

Patch