diff mbox series

chainlint.pl: recognize test bodies defined via heredoc

Message ID 20240702235034.88219-1-ericsunshine@charter.net (mailing list archive)
State New, archived
Headers show
Series chainlint.pl: recognize test bodies defined via heredoc | expand

Commit Message

Eric Sunshine July 2, 2024, 11:50 p.m. UTC
From: Eric Sunshine <sunshine@sunshineco.com>

In order to check tests for semantic problems, chainlint.pl scans test
scripts, looking for tests defined as:

    test_expect_success [prereq] title '
        body
    '

where `body` is a single string which is then treated as a standalone
chunk of code and "linted" to detect semantic issues. (The same happens
for `test_expect_failure` definitions.)

The introduction of test definitions in which the test body is instead
presented via a heredoc rather than as a single string creates a blind
spot in the linting process since such invocations are not recognized by
chainlint.pl.

Address this shortcoming by also recognizing tests defined as:

    test_expect_success [prereq] title - <<\EOT
        body
    EOT

A minor complication is that chainlint.pl has never considered heredoc
bodies significant since it doesn't scan them for semantic problems,
thus it has always simply thrown them away. However, with the new
`test_expect_success` calling sequence, heredoc bodies become
meaningful, thus need to be captured.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---

This is a clean-room implementation which serves the same purpose as a
change proposed[1] by Peff; it was created before I looked at Peff's
proposal. The two independent implementations turned out quite similar,
but the one implemented by this patch takes a more formal and paranoid
stance. In particular, unlike Peff's patch, it doesn't trust that the
most-recently-seen heredoc body is one associated with the
`test_expect_success` invocation.

This patch can sit either at the top or bottom of Peff's series[2].

There was also related discussion of improving the chainlint self-test
infrastructure[3], however, such proposed changes needn't hold up Peff's
series[2]; such improvements can be applied after the dust settles. On
the other hand, Peff, if you plan to reroll for some reason, feel free
to incorporate this patch into your series.

[1]: https://lore.kernel.org/git/20240702005144.GA27170@coredump.intra.peff.net/
[2]: https://lore.kernel.org/git/20240701220815.GA20293@coredump.intra.peff.net/
[3]: https://lore.kernel.org/git/20240702211913.GB120950@coredump.intra.peff.net/

 t/chainlint.pl | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

Comments

Jeff King July 6, 2024, 6:01 a.m. UTC | #1
On Tue, Jul 02, 2024 at 07:50:34PM -0400, Eric Sunshine wrote:

> This is a clean-room implementation which serves the same purpose as a
> change proposed[1] by Peff; it was created before I looked at Peff's
> proposal. The two independent implementations turned out quite similar,
> but the one implemented by this patch takes a more formal and paranoid
> stance. In particular, unlike Peff's patch, it doesn't trust that the
> most-recently-seen heredoc body is one associated with the
> `test_expect_success` invocation.

Thanks for working on this! I think this is better than the patch I
showed earlier. But I am still glad to have worked on that one, because
there is no way I'd be able to intelligently review that one without
having poked at the code so much myself.

> This patch can sit either at the top or bottom of Peff's series[2].
> 
> There was also related discussion of improving the chainlint self-test
> infrastructure[3], however, such proposed changes needn't hold up Peff's
> series[2]; such improvements can be applied after the dust settles. On
> the other hand, Peff, if you plan to reroll for some reason, feel free
> to incorporate this patch into your series.

IMHO we want it all to come together. We should not allow "<<\EOT"
without making sure we can chainlint the test bodies, and we should not
make such a big change to chainlint.pl without tests to make sure it
works.

I'll post some patches in a moment:

  [1/3]: chainlint.pl: fix line number reporting
  [2/3]: t/chainlint: add test_expect_success call to test snippets
  [3/3]: t/chainlint: add tests for test body in heredoc

with the idea that we'd apply your patch here on top of what Junio has
queued in jk/test-body-in-here-doc, and then these three on top. For
Junio's sanity, I'll roll it all up into one series. But I wanted to
show it to you incrementally first, especially because I think the fixes
from patch 1/3 above should probably just get squashed in (or even
rewritten). I'll discuss the bugs they fix below.

> diff --git a/t/chainlint.pl b/t/chainlint.pl
> index 1bbd985b78..eba509b8e1 100755
> --- a/t/chainlint.pl
> +++ b/t/chainlint.pl
> @@ -174,6 +174,8 @@ sub swallow_heredocs {
>  		$$b =~ /(?:\G|\n)$indent\Q$$tag[0]\E(?:\n|\z)/gc;
>  		if (pos($$b) > $start) {
>  			my $body = substr($$b, $start, pos($$b) - $start);
> +			$self->{parser}->{heredocs}->{$$tag[0]} =
> +			    substr($body, 0, length($body) - length($&));
>  			$self->{lineno} += () = $body =~ /\n/sg;
>  			next;
>  		}

OK, this part looks familiar. :)

> @@ -232,7 +234,8 @@ sub new {
>  	my $self = bless {
>  		buff => [],
>  		stop => [],
> -		output => []
> +		output => [],
> +		heredocs => {},
>  	} => $class;
>  	$self->{lexer} = Lexer->new($self, $s);
>  	return $self;

I think initializing is not strictly necessary here, since we'd only try
to read tags if we saw a here-doc. But there might be some invalid cases
where we could convince higher-level code to look for tags even though
there were none (and generate a perl warning about trying to dereference
undef as a hashref).

On the flip side, what about cleaning up? The "heretags" array is
emptied as we parse the heredocs in swallow_heredocs(). But I think once
a ShellParser's $self->{heredocs}->{FOO} is written, it will hang around
forever (even though it's valid only for that one command). Probably not
a big deal, but there's probably some correct spot to reset it.

> @@ -616,7 +619,9 @@ sub unwrap {
>  
>  sub check_test {
>  	my $self = shift @_;
> -	my ($title, $body) = map(unwrap, @_);
> +	my $title = unwrap(shift @_);
> +	my $body = unwrap(shift @_);
> +	$body = shift @_ if $body eq '-';
>  	$self->{ntests}++;
>  	my $parser = TestParser->new(\$body);
>  	my @tokens = $parser->parse();

This has two problems related to line numbers. You can't see it in the
context, but we later do:

  my $lineno = $_[1]->[3];

Now that we're shifting @_, that array item is gone.

The second is that the line number for the here-doc is actually one past
the initial line number of the test_expect_success. That works
automatically for hanging single-quotes, since the newline from that
line is inside the quoted area. But for a here-doc, we have to account
for it manually. In my original patch I prepended "\n", but you can also
just increment $lineno (which is what I did in the fix I'm about to
send).

> @@ -649,8 +654,13 @@ sub parse_cmd {
>  	return @tokens unless @tokens && $tokens[0]->[0] =~ /^test_expect_(?:success|failure)$/;
>  	my $n = $#tokens;
>  	$n-- while $n >= 0 && $tokens[$n]->[0] =~ /^(?:[;&\n|]|&&|\|\|)$/;
> -	$self->check_test($tokens[1], $tokens[2]) if $n == 2; # title body
> -	$self->check_test($tokens[2], $tokens[3]) if $n > 2;  # prereq title body
> +	my $herebody;
> +	if ($n >= 2 && $tokens[$n-1]->[0] eq '-' && $tokens[$n]->[0] =~ /^<<-?(.+)$/) {
> +		$herebody = $self->{heredocs}->{$1};
> +		$n--;
> +	}
> +	$self->check_test($tokens[1], $tokens[2], $herebody) if $n == 2; # title body
> +	$self->check_test($tokens[2], $tokens[3], $herebody) if $n > 2;  # prereq title body
>  	return @tokens;
>  }

OK, mostly as expected. I think the check for "-" here is redundant with
what's in check_test(). We could just feed the heredoc body either way,
and in the nonsense case of:

  test_expect_success 'title' 'test body' <<EOT
  nobody reads this!
  EOT

the heredoc data would just be ignored.

Requiring "<<" at the end is somewhat limiting. E.g. this is valid:

  test_expect_success <<EOT 'title' -
  the test body
  EOT

I don't expect anybody to do that, but it would be nice to be more
robust if we can. I think the tokens are still wrapped at this point, so
we could read through all of them looking for "<<" anywhere, without
getting confused by "$(cat <<INNER_HEREDOC)". I think, anyway (I didn't
test).

I didn't address either of those comments in the patches I'm about to
send.

-Peff
Junio C Hamano July 6, 2024, 10:15 p.m. UTC | #2
Jeff King <peff@peff.net> writes:

> I'll post some patches in a moment:
>
>   [1/3]: chainlint.pl: fix line number reporting
>   [2/3]: t/chainlint: add test_expect_success call to test snippets
>   [3/3]: t/chainlint: add tests for test body in heredoc
>
> with the idea that we'd apply your patch here on top of what Junio has
> queued in jk/test-body-in-here-doc, and then these three on top.

Would the final form be to have Eric's preparatory enhancement to
chainlint and then these three first, and finally the "here-docs"
conversion I queued from you earlier?
Jeff King July 6, 2024, 11:11 p.m. UTC | #3
On Sat, Jul 06, 2024 at 03:15:32PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I'll post some patches in a moment:
> >
> >   [1/3]: chainlint.pl: fix line number reporting
> >   [2/3]: t/chainlint: add test_expect_success call to test snippets
> >   [3/3]: t/chainlint: add tests for test body in heredoc
> >
> > with the idea that we'd apply your patch here on top of what Junio has
> > queued in jk/test-body-in-here-doc, and then these three on top.
> 
> Would the final form be to have Eric's preparatory enhancement to
> chainlint and then these three first, and finally the "here-docs"
> conversion I queued from you earlier?

I had planned on top (leaving a brief moment where chainlint would
ignore the new format), but either way is fine.

My biggest question is around my patch 1 above:

  - is it worth squashing in to Eric's patch? I didn't want to do that
    without getting his OK on the approach.

  - instead of bumping the line number in the caller, should the lexer
    record the line number of the here-doc to be used later?

  - the test harness in the Makefile strips the line numbers from the
    chainlint output, so it's hard to verify those fixes. I saw them
    only because the combination of the two bugs meant that the here-doc
    had a "line 0" in it, which was enough to confuse the "sed"
    invocation in the Makefile.

    I did manually verify that it is OK after my fix, but do we want
    that to be part of the chainlint tests? Just leaving the line
    numbers in is a maintenance nightmare, since it depends on the order
    of concatenating all of the tests together (so our "expect" files
    would depend on all of the previous tests). But if we wanted to get
    fancy, we could perhaps store relative offsets in the expect file. I
    think it gets pretty complicated, though, since we print only
    problematic lines.

I was going to give it a few days for Eric to chime in on those points,
and then assemble a final version for you to apply (but I certainly
don't mind if you want to pick up what's here in the meantime).

-Peff
Eric Sunshine July 8, 2024, 3:40 a.m. UTC | #4
On Sat, Jul 6, 2024 at 2:01 AM Jeff King <peff@peff.net> wrote:
> On Tue, Jul 02, 2024 at 07:50:34PM -0400, Eric Sunshine wrote:
> I'll post some patches in a moment:
>
>   [1/3]: chainlint.pl: fix line number reporting
>   [2/3]: t/chainlint: add test_expect_success call to test snippets
>   [3/3]: t/chainlint: add tests for test body in heredoc
>
> with the idea that we'd apply your patch here on top of what Junio has
> queued in jk/test-body-in-here-doc, and then these three on top. For
> Junio's sanity, I'll roll it all up into one series. But I wanted to
> show it to you incrementally first, especially because I think the fixes
> from patch 1/3 above should probably just get squashed in (or even
> rewritten). I'll discuss the bugs they fix below.

Considering the excellent explanation you crafted in your patch, I'd
like to say that it should remain separate from mine. However, since
you caught the problems in review, it would be irresponsible of us to
let my patch into the permanent history as-is. So, feel free to squash
your fixes into my patch. Perhaps add a Co-authored-by:? The bit from
your [1/3] commit message about incrementing $lineno for the
heredoc-body case might be worth squashing in too?

I wrote one minor (perhaps non-actionable) comment in response to
patch [3/3]. The patches all looked fine to me, so:

    Acked-by: Eric Sunshine <sunshine@sunshineco.com>

> > @@ -232,7 +234,8 @@ sub new {
> >       my $self = bless {
> >               buff => [],
> >               stop => [],
> > -             output => []
> > +             output => [],
> > +             heredocs => {},
> >       } => $class;
> >       $self->{lexer} = Lexer->new($self, $s);
> >       return $self;
>
> I think initializing is not strictly necessary here, since we'd only try
> to read tags if we saw a here-doc. But there might be some invalid cases
> where we could convince higher-level code to look for tags even though
> there were none (and generate a perl warning about trying to dereference
> undef as a hashref).

You're right, it's not necessary to initialize here, but it feels more
consistent to do so. That said, I don't feel strongly either way.

> On the flip side, what about cleaning up? The "heretags" array is
> emptied as we parse the heredocs in swallow_heredocs(). But I think once
> a ShellParser's $self->{heredocs}->{FOO} is written, it will hang around
> forever (even though it's valid only for that one command). Probably not
> a big deal, but there's probably some correct spot to reset it.

There are a few reasons I wasn't overly concerned about cleaning up in
this case:

(1) The parsers are all short-lived, so the collected heredoc bodies
won't stick around long anyhow. For each test checked, a TestParser is
created and destroyed. For each script mentioned on the command-line,
a ScriptParser is created and destroyed. None of these parsers stick
around for long, though, a ScriptParser outlives a TestParser.

(2) The heredoc bodies in question tend to be pretty small, so it's
not consuming an inordinate amount of memory even if a single parser
latches bodies of multiple heredocs.

(3) We tend to be quite consistent about naming our heredoc tag (i.e.
"EOF", "EOT"), so a latched body in the parser's %heredocs hash is
very likely to get overwritten, thus the hash is probably not going to
eat up a lot of memory. Given the entire test suite, I'd be quite
surprised if any one parser ever latches more than three heredoc
bodies at a time, and the vast majority of parsers are likely latching
zero or one heredoc body.

(4) I couldn't really think of a correct spot to reset %heredocs.

That said, after reading your message, I did try implementing an
approach in which the heredoc body gets attached to the `<<` or `<<-`
token. That way, a heredoc body would be cleaned along with its
associated lexer token. However, the implementation got too ugly and
increased cognitive load too much for my liking, so I abandoned it.

> >  sub check_test {
> >       my $self = shift @_;
> > -     my ($title, $body) = map(unwrap, @_);
> > +     my $title = unwrap(shift @_);
> > +     my $body = unwrap(shift @_);
> > +     $body = shift @_ if $body eq '-';
> >       $self->{ntests}++;
> >       my $parser = TestParser->new(\$body);
> >       my @tokens = $parser->parse();
>
> This has two problems related to line numbers. You can't see it in the
> context, but we later do:
>
>   my $lineno = $_[1]->[3];
>
> Now that we're shifting @_, that array item is gone.

Ugh, this is embarrassing. I did run chainlint.pl on t0600 in which I
had intentionally broken some &&-chains, so I saw the output, but
somehow I overlooked that it broke the line numbering entirely.

> The second is that the line number for the here-doc is actually one past
> the initial line number of the test_expect_success. That works
> automatically for hanging single-quotes, since the newline from that
> line is inside the quoted area. But for a here-doc, we have to account
> for it manually. In my original patch I prepended "\n", but you can also
> just increment $lineno (which is what I did in the fix I'm about to
> send).

Nicely spotted. Simply incrementing $lineno does feel a bit hacky, but
I agree that it is probably good enough for now; it doesn't seem
likely that it will break any time soon. But I also agree with the
commentary you wrote in patch [1/3] that it probably would be easy
enough to latch the line number of the beginning of the heredoc body
and employ that value. That would certainly be more robust.

> > @@ -649,8 +654,13 @@ sub parse_cmd {
> >       return @tokens unless @tokens && $tokens[0]->[0] =~ /^test_expect_(?:success|failure)$/;
> >       my $n = $#tokens;
> >       $n-- while $n >= 0 && $tokens[$n]->[0] =~ /^(?:[;&\n|]|&&|\|\|)$/;
> > -     $self->check_test($tokens[1], $tokens[2]) if $n == 2; # title body
> > -     $self->check_test($tokens[2], $tokens[3]) if $n > 2;  # prereq title body
> > +     my $herebody;
> > +     if ($n >= 2 && $tokens[$n-1]->[0] eq '-' && $tokens[$n]->[0] =~ /^<<-?(.+)$/) {
> > +             $herebody = $self->{heredocs}->{$1};
> > +             $n--;
> > +     }
> > +     $self->check_test($tokens[1], $tokens[2], $herebody) if $n == 2; # title body
> > +     $self->check_test($tokens[2], $tokens[3], $herebody) if $n > 2;  # prereq title body
> >       return @tokens;
> >  }
>
> OK, mostly as expected. I think the check for "-" here is redundant with
> what's in check_test(). We could just feed the heredoc body either way,
> and in the nonsense case of:
>
>   test_expect_success 'title' 'test body' <<EOT
>   nobody reads this!
>   EOT
>
> the heredoc data would just be ignored.

Right. I went back and forth with this, never sure if this code was
overkill. On the other hand, we could make this more paranoid and
complain if we see either of these cases:

(1) "-" but no heredoc
(2) heredoc present but something other than "-"

> Requiring "<<" at the end is somewhat limiting. E.g. this is valid:
>
>   test_expect_success <<EOT 'title' -
>   the test body
>   EOT

True, I didn't even think about that.

> I don't expect anybody to do that, but it would be nice to be more
> robust if we can. I think the tokens are still wrapped at this point, so
> we could read through all of them looking for "<<" anywhere, without
> getting confused by "$(cat <<INNER_HEREDOC)". I think, anyway (I didn't
> test).

Correct. The stuff inside "$(...)" does get parsed and linted, but by
the time ScriptParser::parse_cmd() sees it, `$(cat <<INNER_HEREDOC)`
is just a single (string) token.
Eric Sunshine July 8, 2024, 3:51 a.m. UTC | #5
On Sat, Jul 6, 2024 at 7:11 PM Jeff King <peff@peff.net> wrote:
> My biggest question is around my patch 1 above:
>
>   - is it worth squashing in to Eric's patch? I didn't want to do that
>     without getting his OK on the approach.

Given the effort you put into the commit message and diagnosing my
bugs, my knee-jerk response is that it would be nice to keep your
patch separate so you retain authorship. But it also would be
irresponsible for us to let my buggy patch into the project history
as-is since you caught the problems at review time. So, squashing your
fixes in seems like the correct approach.

>   - instead of bumping the line number in the caller, should the lexer
>     record the line number of the here-doc to be used later?

It would be more robust to do so, but I suspect things will be fine
for a long time even without such an enhancement. But I also agree
with your commentary in patch [1/3] that it probably would be easy to
latch the line number at the point at which the heredoc body is
latched.

>   - the test harness in the Makefile strips the line numbers from the
>     chainlint output, so it's hard to verify those fixes. I saw them
>     only because the combination of the two bugs meant that the here-doc
>     had a "line 0" in it, which was enough to confuse the "sed"
>     invocation in the Makefile.
>
>     I did manually verify that it is OK after my fix, but do we want
>     that to be part of the chainlint tests? Just leaving the line
>     numbers in is a maintenance nightmare, since it depends on the order
>     of concatenating all of the tests together (so our "expect" files
>     would depend on all of the previous tests). But if we wanted to get
>     fancy, we could perhaps store relative offsets in the expect file. I
>     think it gets pretty complicated, though, since we print only
>     problematic lines.

Given the way the Makefile currently concatenates all the self-tests,
it would indeed be a nightmare to retain the line numbers. In the long
run, we probably ought someday to adopt Ævar's idea of checking the
self-test files individually[*] rather than en masse. With that
approach, it may make sense to revisit whether or not line numbers
should be present in the "expected" files.

[*] https://lore.kernel.org/git/CAPig+cSBjsosRqoAafYN94Cco8+7SdUt0ND_jHS+jVPoM4K0JA@mail.gmail.com/
Jeff King July 8, 2024, 9:05 a.m. UTC | #6
On Sun, Jul 07, 2024 at 11:40:19PM -0400, Eric Sunshine wrote:

> > On the flip side, what about cleaning up? The "heretags" array is
> > emptied as we parse the heredocs in swallow_heredocs(). But I think once
> > a ShellParser's $self->{heredocs}->{FOO} is written, it will hang around
> > forever (even though it's valid only for that one command). Probably not
> > a big deal, but there's probably some correct spot to reset it.
> 
> There are a few reasons I wasn't overly concerned about cleaning up in
> this case:
> 
> (1) The parsers are all short-lived, so the collected heredoc bodies
> won't stick around long anyhow. For each test checked, a TestParser is
> created and destroyed. For each script mentioned on the command-line,
> a ScriptParser is created and destroyed. None of these parsers stick
> around for long, though, a ScriptParser outlives a TestParser.
> 
> (2) The heredoc bodies in question tend to be pretty small, so it's
> not consuming an inordinate amount of memory even if a single parser
> latches bodies of multiple heredocs.
> 
> (3) We tend to be quite consistent about naming our heredoc tag (i.e.
> "EOF", "EOT"), so a latched body in the parser's %heredocs hash is
> very likely to get overwritten, thus the hash is probably not going to
> eat up a lot of memory. Given the entire test suite, I'd be quite
> surprised if any one parser ever latches more than three heredoc
> bodies at a time, and the vast majority of parsers are likely latching
> zero or one heredoc body.
> 
> (4) I couldn't really think of a correct spot to reset %heredocs.

All of that makes sense to me, especially (4). :)

> That said, after reading your message, I did try implementing an
> approach in which the heredoc body gets attached to the `<<` or `<<-`
> token. That way, a heredoc body would be cleaned along with its
> associated lexer token. However, the implementation got too ugly and
> increased cognitive load too much for my liking, so I abandoned it.

OK, thanks for trying. I do think sticking it into the token stream
would make sense, but if the implementation got tricky, it is probably
not worth the effort. We can always revisit it later if we find some
reason that it would be useful to do it that way.

> > OK, mostly as expected. I think the check for "-" here is redundant with
> > what's in check_test(). We could just feed the heredoc body either way,
> > and in the nonsense case of:
> >
> >   test_expect_success 'title' 'test body' <<EOT
> >   nobody reads this!
> >   EOT
> >
> > the heredoc data would just be ignored.
> 
> Right. I went back and forth with this, never sure if this code was
> overkill. On the other hand, we could make this more paranoid and
> complain if we see either of these cases:
> 
> (1) "-" but no heredoc
> (2) heredoc present but something other than "-"

Those seem like good things to check for, and not too hard to add. I'll
see if I can work up some tests.

> > Requiring "<<" at the end is somewhat limiting. E.g. this is valid:
> >
> >   test_expect_success <<EOT 'title' -
> >   the test body
> >   EOT
> 
> True, I didn't even think about that.
> 
> > I don't expect anybody to do that, but it would be nice to be more
> > robust if we can. I think the tokens are still wrapped at this point, so
> > we could read through all of them looking for "<<" anywhere, without
> > getting confused by "$(cat <<INNER_HEREDOC)". I think, anyway (I didn't
> > test).
> 
> Correct. The stuff inside "$(...)" does get parsed and linted, but by
> the time ScriptParser::parse_cmd() sees it, `$(cat <<INNER_HEREDOC)`
> is just a single (string) token.

OK, I'll see if I can generalize it a bit (and add a test).

-Peff
Jeff King July 8, 2024, 9:08 a.m. UTC | #7
On Sun, Jul 07, 2024 at 11:51:15PM -0400, Eric Sunshine wrote:

> >     I did manually verify that it is OK after my fix, but do we want
> >     that to be part of the chainlint tests? Just leaving the line
> >     numbers in is a maintenance nightmare, since it depends on the order
> >     of concatenating all of the tests together (so our "expect" files
> >     would depend on all of the previous tests). But if we wanted to get
> >     fancy, we could perhaps store relative offsets in the expect file. I
> >     think it gets pretty complicated, though, since we print only
> >     problematic lines.
> 
> Given the way the Makefile currently concatenates all the self-tests,
> it would indeed be a nightmare to retain the line numbers. In the long
> run, we probably ought someday to adopt Ævar's idea of checking the
> self-test files individually[*] rather than en masse. With that
> approach, it may make sense to revisit whether or not line numbers
> should be present in the "expected" files.
> 
> [*] https://lore.kernel.org/git/CAPig+cSBjsosRqoAafYN94Cco8+7SdUt0ND_jHS+jVPoM4K0JA@mail.gmail.com/

I took a look at running each test individually. It's surprisingly quite
a bit slower! About 4s instead of 200ms. There's a bit of low-hanging
fruit to get it down to ~1.7s (which I'll include in my series). But in
the end I punted on that for now, but did add line-number checks. Each
expect file just knows its own numbers, and I use a bit of perl to
handle the running offset.

-Peff
Eric Sunshine July 8, 2024, 7:46 p.m. UTC | #8
On Mon, Jul 8, 2024 at 5:08 AM Jeff King <peff@peff.net> wrote:
> On Sun, Jul 07, 2024 at 11:51:15PM -0400, Eric Sunshine wrote:
> > Given the way the Makefile currently concatenates all the self-tests,
> > it would indeed be a nightmare to retain the line numbers. In the long
> > run, we probably ought someday to adopt Ævar's idea of checking the
> > self-test files individually[*] rather than en masse. With that
> > approach, it may make sense to revisit whether or not line numbers
> > should be present in the "expected" files.
> >
> > [*] https://lore.kernel.org/git/CAPig+cSBjsosRqoAafYN94Cco8+7SdUt0ND_jHS+jVPoM4K0JA@mail.gmail.com/
>
> I took a look at running each test individually. It's surprisingly quite
> a bit slower! About 4s instead of 200ms.

I'm not surprised. As currently implemented, `make test` chainlints
the self-tests and the Git test scripts unconditionally, even if none
of them have changed. As I understand it, Ævar idea was that the
costly initial `make test` would be offset by subsequent `make test`
invocations since `make` will only recheck the self-test files and Git
test scripts if they have been changed. His particular use-case, as I
recall, was when running the full `make test` repeatedly, such as with
`git rebase --exec 'make test' HEAD~n` to ensure that the entire test
suite passes for each patch of a multi-patch series prior to
submitting the series; the repeated cost of linting unchanged files
adds up, especially when the series is long.

> There's a bit of low-hanging
> fruit to get it down to ~1.7s (which I'll include in my series). But in
> the end I punted on that for now, but did add line-number checks. Each
> expect file just knows its own numbers, and I use a bit of perl to
> handle the running offset.

Okay.
Eric Sunshine July 8, 2024, 8:06 p.m. UTC | #9
On Mon, Jul 8, 2024 at 5:05 AM Jeff King <peff@peff.net> wrote:
> On Sun, Jul 07, 2024 at 11:40:19PM -0400, Eric Sunshine wrote:
> > (3) We tend to be quite consistent about naming our heredoc tag (i.e.
> > "EOF", "EOT"), so a latched body in the parser's %heredocs hash is
> > very likely to get overwritten, thus the hash is probably not going to
> > eat up a lot of memory. Given the entire test suite, I'd be quite
> > surprised if any one parser ever latches more than three heredoc
> > bodies at a time, and the vast majority of parsers are likely latching
> > zero or one heredoc body.

One thing we may want to measure is how much extra time we're wasting
for the (very) common case of latching heredoc bodies only to then
ignore them. In particular, we may want to add a flag to ShellParser
telling it whether or not to latch heredoc bodies, and enable that
flag in subclass ScriptParser, but leave it disabled in subclass
TestParser since only ScriptParser currently cares about the heredoc
body.

> > (4) I couldn't really think of a correct spot to reset %heredocs.
>
> All of that makes sense to me, especially (4). :)
>
> > That said, after reading your message, I did try implementing an
> > approach in which the heredoc body gets attached to the `<<` or `<<-`
> > token. That way, a heredoc body would be cleaned along with its
> > associated lexer token. However, the implementation got too ugly and
> > increased cognitive load too much for my liking, so I abandoned it.
>
> OK, thanks for trying. I do think sticking it into the token stream
> would make sense, but if the implementation got tricky, it is probably
> not worth the effort. We can always revisit it later if we find some
> reason that it would be useful to do it that way.

In the long run, I think we probably want to build a full parse tree,
attach relevant information (such as a heredoc body) to each node, and
then walk the tree, rather than trying to perform on-the-fly lints and
other operations on the token stream as is currently the case.

This encapsulation would not only solve the problem of releasing
related resources (such as releasing the heredoc body when the `<<` or
`<<-` node is released), but it would also make it possible to perform
other lints I've had in mind. For instance, a while ago, I added (but
did not submit) a lint to check for `cd` outside of a subshell. After
implementing that, I realized that the cd-outside-subshell lint would
be useful, not just within test bodies, but also at the script level
itself. However, because actual linting functionality resides entirely
in TestParser, I wasn't able to reuse the code for detecting
cd-outside-subshell at the script level, and ended up having to write
duplicate linting code in ScriptParser. If, on the other hand, the
linting code was just handed a parse tree, then it wouldn't matter if
that parse tree came from parsing a test body or parsing a script.

All (or most) of the checks in t/check-non-portable-shell.pl could
also be incorporated into chainlint.pl (though that makes the name
"chainlint.pl" even more of an anachronism than it already is since it
outgrew "chain linting" when it starting checking for missing `||
return`, if not before then.)
Eric Sunshine July 8, 2024, 8:17 p.m. UTC | #10
On Mon, Jul 8, 2024 at 3:46 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Mon, Jul 8, 2024 at 5:08 AM Jeff King <peff@peff.net> wrote:
> > I took a look at running each test individually. It's surprisingly quite
> > a bit slower! About 4s instead of 200ms.
>
> I'm not surprised. As currently implemented, `make test` chainlints
> the self-tests and the Git test scripts unconditionally, even if none
> of them have changed. As I understand it, Ævar idea was that the
> costly initial `make test` would be offset by subsequent `make test`
> invocations since `make` will only recheck the self-test files and Git
> test scripts if they have been changed. His particular use-case, as I
> recall, was when running the full `make test` repeatedly, such as with
> `git rebase --exec 'make test' HEAD~n` to ensure that the entire test
> suite passes for each patch of a multi-patch series prior to
> submitting the series; the repeated cost of linting unchanged files
> adds up, especially when the series is long.

By the way, regarding your 4s instead of 200ms result, I don't think
that is necessarily reflective of what can be achieved. In particular,
to properly measure the effect, you also need to remove all the
threading support from chainlint.pl since using "ithreads" adds a
not-insignificant amount of time to script startup, especially on
Windows, but even on Unix it is quite noticeable.

To test this, I think you can just replace this block:

    unless ($Config{useithreads} && eval {
        require threads; threads->import();
        require Thread::Queue; Thread::Queue->import();
        1;
        }) {
        push(@stats, check_script(1, sub { shift(@scripts); }, sub {
print(@_); }));
        show_stats($start_time, \@stats) if $show_stats;
        exit(exit_code(\@stats));
    }

with:

    if (1) {
        push(@stats, check_script(1, sub { shift(@scripts); }, sub {
print(@_); }));
        show_stats($start_time, \@stats) if $show_stats;
        exit(exit_code(\@stats));
    }
Jeff King July 10, 2024, 12:37 a.m. UTC | #11
On Mon, Jul 08, 2024 at 04:17:46PM -0400, Eric Sunshine wrote:

> By the way, regarding your 4s instead of 200ms result, I don't think
> that is necessarily reflective of what can be achieved. In particular,
> to properly measure the effect, you also need to remove all the
> threading support from chainlint.pl since using "ithreads" adds a
> not-insignificant amount of time to script startup, especially on
> Windows, but even on Unix it is quite noticeable.

Yes, that is the low-hanging fruit I found. ;) Just adding:

  $jobs = @scripts if @scripts < $jobs;

cuts the time to run all scripts individually from ~4s to ~1.7s.

Removing the threading entirely goes to ~1.1s. I hadn't tried that until
now, but it is probably worth doing for the case of $jobs == 1.

-Peff
Jeff King July 10, 2024, 12:48 a.m. UTC | #12
On Mon, Jul 08, 2024 at 04:06:47PM -0400, Eric Sunshine wrote:

> One thing we may want to measure is how much extra time we're wasting
> for the (very) common case of latching heredoc bodies only to then
> ignore them. In particular, we may want to add a flag to ShellParser
> telling it whether or not to latch heredoc bodies, and enable that
> flag in subclass ScriptParser, but leave it disabled in subclass
> TestParser since only ScriptParser currently cares about the heredoc
> body.

I doubt it's much to hold on to a few extra small buffers. But it should
be easy to measure. Here are hyperfine results for checking all of our
test scripts both before (old.pl) and after (new.pl) your chainlint.pl,
with threading disabled:

  Benchmark 1: perl old.pl t[0-9]*.sh
    Time (mean ± σ):      4.215 s ±  0.052 s    [User: 4.187 s, System: 0.028 s]
    Range (min … max):    4.124 s …  4.288 s    10 runs
  
  Benchmark 2: perl new.pl t[0-9]*.sh
    Time (mean ± σ):      4.295 s ±  0.060 s    [User: 4.264 s, System: 0.031 s]
    Range (min … max):    4.229 s …  4.419 s    10 runs
  
  Summary
    perl old.pl t[0-9]*.sh ran
      1.02 ± 0.02 times faster than perl new.pl t[0-9]*.sh

So it does seem to make a small difference, but we're within the noise.

> In the long run, I think we probably want to build a full parse tree,
> attach relevant information (such as a heredoc body) to each node, and
> then walk the tree, rather than trying to perform on-the-fly lints and
> other operations on the token stream as is currently the case.
> [...]

Yeah, all of that sounds very sensible long term, but probably not worth
worrying about for this topic.

-Peff
Jeff King July 10, 2024, 1:09 a.m. UTC | #13
On Mon, Jul 08, 2024 at 05:08:37AM -0400, Jeff King wrote:

> I took a look at running each test individually. It's surprisingly quite
> a bit slower! About 4s instead of 200ms. There's a bit of low-hanging
> fruit to get it down to ~1.7s (which I'll include in my series). But in
> the end I punted on that for now, but did add line-number checks. Each
> expect file just knows its own numbers, and I use a bit of perl to
> handle the running offset.

By the way, in case you are wondering why I haven't sent out the next
iteration of the series: I am stuck trying to figure out some Windows
line-ending nonsense.

The chainlint.pl parser chokes on CRLF line endings. So Windows CI
produces:

  runneradmin@fv-az1390-742 MINGW64 /d/a/git/git/t
  # perl chainlint.pl chainlint/for-loop.test
  'nternal error scanning character '

(the funny overwrite is because the invalid char is a CR). I tried a few
simple things to skip past this error, but the problem is pervasive. We
really just want to have perl handle the line endings on read. And doing
this works:

  # PERLIO=:crlf perl chainlint.pl chainlint/for-loop.test
  # chainlint: chainlint/for-loop.test
  # chainlint: for-loop
  [...etc, normal output...]

Which gives me all sorts of questions:

  - isn't crlf handling usually the default for perl builds on Windows?
    I guess this is probably getting into weird mingw vs native Windows
    distinctions that generally leave me perplexed.

  - why wasn't this a problem before? I'm guessing again in the "weird
    mingw stuff" hand-waving way that when we used "sed" to assemble
    everything, it stripped the CR's in the "chainlinttmp/tests" file.
    And in my series, that "cat" is replaced with a perl script (that
    writes the "tests" and "expect" files together).

  - why doesn't "PERLIO=:crlf make check-chainlint" work? It seems that
    perl spawned from "make" behaves differently. More mingw weirdness?

I'm tempted to just do this:

--- a/t/chainlint.pl
+++ b/t/chainlint.pl
@@ -779,7 +779,7 @@ sub check_script {
        while (my $path = $next_script->()) {
                $nscripts++;
                my $fh;
-               unless (open($fh, "<", $path)) {
+               unless (open($fh, "<:unix:crlf", $path)) {
                        $emit->("?!ERR?! $path: $!\n");
                        next;
                }

It feels like a hack, but it makes the parser's assumptions explicit,
and it should just work everywhere.

-Peff
Eric Sunshine July 10, 2024, 2:38 a.m. UTC | #14
On Tue, Jul 9, 2024 at 8:48 PM Jeff King <peff@peff.net> wrote:
> On Mon, Jul 08, 2024 at 04:06:47PM -0400, Eric Sunshine wrote:
> > One thing we may want to measure is how much extra time we're wasting
> > for the (very) common case of latching heredoc bodies only to then
> > ignore them. In particular, we may want to add a flag to ShellParser
> > telling it whether or not to latch heredoc bodies, and enable that
> > flag in subclass ScriptParser, but leave it disabled in subclass
> > TestParser since only ScriptParser currently cares about the heredoc
> > body.
>
> I doubt it's much to hold on to a few extra small buffers.

I was more concerned about the extra substr() consuming additional CPU
time and inflating wall-clock time.

> So it does seem to make a small difference, but we're within the noise.

Okay. Thanks for measuring.

> > In the long run, I think we probably want to build a full parse tree,
> > attach relevant information (such as a heredoc body) to each node, and
> > then walk the tree, rather than trying to perform on-the-fly lints and
> > other operations on the token stream as is currently the case.
> > [...]
>
> Yeah, all of that sounds very sensible long term, but probably not worth
> worrying about for this topic.

Agreed.
Eric Sunshine July 10, 2024, 3:02 a.m. UTC | #15
On Tue, Jul 9, 2024 at 9:09 PM Jeff King <peff@peff.net> wrote:
> The chainlint.pl parser chokes on CRLF line endings. So Windows CI
> produces:
>
>   runneradmin@fv-az1390-742 MINGW64 /d/a/git/git/t
>   # perl chainlint.pl chainlint/for-loop.test
>   'nternal error scanning character '

As far as I understand, chainlint is disabled in the Windows CI. Did
you manually re-enable it for testing? Or are you just running it
manually in the Windows CI?

> We really just want to have perl handle the line endings on read. And doing
> this works:
>
>   # PERLIO=:crlf perl chainlint.pl chainlint/for-loop.test
>   # chainlint: chainlint/for-loop.test
>   # chainlint: for-loop
>   [...etc, normal output...]
>
> Which gives me all sorts of questions:
>
>   - isn't crlf handling usually the default for perl builds on Windows?
>     I guess this is probably getting into weird mingw vs native Windows
>     distinctions that generally leave me perplexed.

Could be. I'm not sure how the Windows CI is provisioned, whether with
some native-compiled Perl or with msys2/mingw Perl.

>   - why wasn't this a problem before? I'm guessing again in the "weird
>     mingw stuff" hand-waving way that when we used "sed" to assemble
>     everything, it stripped the CR's in the "chainlinttmp/tests" file.
>     And in my series, that "cat" is replaced with a perl script (that
>     writes the "tests" and "expect" files together).

Assuming you manually re-enabled chaintlint in the Windows CI for this
testing or are running it manually, it may be the case that
chainlint.pl has never been run in the Windows CI. Specifically,
chainlint in Windows CI was disabled by a87e427e35 (ci: speed up
Windows phase, 2019-01-29) which predates the switchover from
chainlint.sed to chainlint.pl by d00113ec34 (t/Makefile: apply
chainlint.pl to existing self-tests, 2022-09-01). So, it's quite
possible that chainlint.pl has never run in Windows CI. But, perhaps
I'm misunderstanding or missing some piece of information.

That said, I did thoroughly test chainlint.pl on Windows using Git For
Windows, and it did run in that environment. (But if the Windows CI
environment is somehow different, then that might explain the
problem?)

>   - why doesn't "PERLIO=:crlf make check-chainlint" work? It seems that
>     perl spawned from "make" behaves differently. More mingw weirdness?

That could indeed be an msys2 issue. It will automatically convert
colon ":" to semicolon ";" in environment variables since the PATH
separator on Windows is ";", not ":" as it is on Unix. Moreover, the
":" to ";" switcheroo logic is not restricted only to PATH since there
are other PATH-like variables in common use, so it's applied to all
environment variables.

> I'm tempted to just do this:
>
>         while (my $path = $next_script->()) {
>                 $nscripts++;
>                 my $fh;
> -               unless (open($fh, "<", $path)) {
> +               unless (open($fh, "<:unix:crlf", $path)) {
>
> It feels like a hack, but it makes the parser's assumptions explicit,
> and it should just work everywhere.

Yep, if this makes it work, then it seems like a good way forward,
especially since I don't think there's any obvious way to work around
the ":" to ";" switcheroo performed by msys2.
Jeff King July 10, 2024, 7:06 a.m. UTC | #16
On Tue, Jul 09, 2024 at 11:02:01PM -0400, Eric Sunshine wrote:

> On Tue, Jul 9, 2024 at 9:09 PM Jeff King <peff@peff.net> wrote:
> > The chainlint.pl parser chokes on CRLF line endings. So Windows CI
> > produces:
> >
> >   runneradmin@fv-az1390-742 MINGW64 /d/a/git/git/t
> >   # perl chainlint.pl chainlint/for-loop.test
> >   'nternal error scanning character '
> 
> As far as I understand, chainlint is disabled in the Windows CI. Did
> you manually re-enable it for testing? Or are you just running it
> manually in the Windows CI?

Neither. As far as I can tell, we still run the "check-chainlint" target
as part of "make test", and that's what I saw fail. For instance:

  https://github.com/peff/git/actions/runs/9856301557/job/27213352807

Every one of the "win test" jobs failed, with the same outcome: running
check-chainlint triggered the "internal scanning error".

> Assuming you manually re-enabled chaintlint in the Windows CI for this
> testing or are running it manually, it may be the case that
> chainlint.pl has never been run in the Windows CI. Specifically,
> chainlint in Windows CI was disabled by a87e427e35 (ci: speed up
> Windows phase, 2019-01-29) which predates the switchover from
> chainlint.sed to chainlint.pl by d00113ec34 (t/Makefile: apply
> chainlint.pl to existing self-tests, 2022-09-01). So, it's quite
> possible that chainlint.pl has never run in Windows CI. But, perhaps
> I'm misunderstanding or missing some piece of information.

I think that commit would prevent it from running as part of the actual
test scripts. But we'd still do check-chainlint to run the chainlint
self-tests. And because it only sets "--no-chain-lint" in GIT_TEST_OPTS
and not GIT_TEST_CHAIN_LINT=0, I think that the bulk run of chainlint.pl
by t/Makefile is still run (and then ironically, when that is run the
Makefile manually suppresses the per-script runs, so that
--no-chain-lint option is truly doing nothing).

And I think is true even with the ci/run-test-slice.sh approach that the
Windows tests use. They still drive it through "make", and just override
the $(T) variable.

> >   - why doesn't "PERLIO=:crlf make check-chainlint" work? It seems that
> >     perl spawned from "make" behaves differently. More mingw weirdness?
> 
> That could indeed be an msys2 issue. It will automatically convert
> colon ":" to semicolon ";" in environment variables since the PATH
> separator on Windows is ";", not ":" as it is on Unix. Moreover, the
> ":" to ";" switcheroo logic is not restricted only to PATH since there
> are other PATH-like variables in common use, so it's applied to all
> environment variables.

Ah, good thinking. I'm not sure if that's it, though. Just PERLIO=crlf
should behave the same way (the ":" is technically a separator, and it
is only a style suggestion that you prepend one). Likewise a space is
supposed to be OK, too, so PERLIO="unix crlf" should work. But neither
seems to work for me. So I'm still puzzled.

> > I'm tempted to just do this:
> >
> >         while (my $path = $next_script->()) {
> >                 $nscripts++;
> >                 my $fh;
> > -               unless (open($fh, "<", $path)) {
> > +               unless (open($fh, "<:unix:crlf", $path)) {
> >
> > It feels like a hack, but it makes the parser's assumptions explicit,
> > and it should just work everywhere.
> 
> Yep, if this makes it work, then it seems like a good way forward,
> especially since I don't think there's any obvious way to work around
> the ":" to ";" switcheroo performed by msys2.

OK, I'll add that to my series, then. The fact that we weren't really
_intending_ to run chainlint there makes me tempted to just punt and
disable it. But AFAICT we have been running it for a while, and it could
benefit people on Windows (though it is a bit funky that we do a full
check-chainlint in each slice). And I suspect disabling it reliably
might be a trickier change than what I wrote above anyway. ;)

-Peff
Eric Sunshine July 10, 2024, 7:29 a.m. UTC | #17
On Wed, Jul 10, 2024 at 3:06 AM Jeff King <peff@peff.net> wrote:
> On Tue, Jul 09, 2024 at 11:02:01PM -0400, Eric Sunshine wrote:
> > That could indeed be an msys2 issue. It will automatically convert
> > colon ":" to semicolon ";" in environment variables since the PATH
> > separator on Windows is ";", not ":" as it is on Unix. Moreover, the
> > ":" to ";" switcheroo logic is not restricted only to PATH since there
> > are other PATH-like variables in common use, so it's applied to all
> > environment variables.
>
> Ah, good thinking. I'm not sure if that's it, though. Just PERLIO=crlf
> should behave the same way (the ":" is technically a separator, and it
> is only a style suggestion that you prepend one). Likewise a space is
> supposed to be OK, too, so PERLIO="unix crlf" should work. But neither
> seems to work for me. So I'm still puzzled.

Me too. Perhaps Dscho would have more insight(?).
diff mbox series

Patch

diff --git a/t/chainlint.pl b/t/chainlint.pl
index 1bbd985b78..eba509b8e1 100755
--- a/t/chainlint.pl
+++ b/t/chainlint.pl
@@ -174,6 +174,8 @@  sub swallow_heredocs {
 		$$b =~ /(?:\G|\n)$indent\Q$$tag[0]\E(?:\n|\z)/gc;
 		if (pos($$b) > $start) {
 			my $body = substr($$b, $start, pos($$b) - $start);
+			$self->{parser}->{heredocs}->{$$tag[0]} =
+			    substr($body, 0, length($body) - length($&));
 			$self->{lineno} += () = $body =~ /\n/sg;
 			next;
 		}
@@ -232,7 +234,8 @@  sub new {
 	my $self = bless {
 		buff => [],
 		stop => [],
-		output => []
+		output => [],
+		heredocs => {},
 	} => $class;
 	$self->{lexer} = Lexer->new($self, $s);
 	return $self;
@@ -616,7 +619,9 @@  sub unwrap {
 
 sub check_test {
 	my $self = shift @_;
-	my ($title, $body) = map(unwrap, @_);
+	my $title = unwrap(shift @_);
+	my $body = unwrap(shift @_);
+	$body = shift @_ if $body eq '-';
 	$self->{ntests}++;
 	my $parser = TestParser->new(\$body);
 	my @tokens = $parser->parse();
@@ -649,8 +654,13 @@  sub parse_cmd {
 	return @tokens unless @tokens && $tokens[0]->[0] =~ /^test_expect_(?:success|failure)$/;
 	my $n = $#tokens;
 	$n-- while $n >= 0 && $tokens[$n]->[0] =~ /^(?:[;&\n|]|&&|\|\|)$/;
-	$self->check_test($tokens[1], $tokens[2]) if $n == 2; # title body
-	$self->check_test($tokens[2], $tokens[3]) if $n > 2;  # prereq title body
+	my $herebody;
+	if ($n >= 2 && $tokens[$n-1]->[0] eq '-' && $tokens[$n]->[0] =~ /^<<-?(.+)$/) {
+		$herebody = $self->{heredocs}->{$1};
+		$n--;
+	}
+	$self->check_test($tokens[1], $tokens[2], $herebody) if $n == 2; # title body
+	$self->check_test($tokens[2], $tokens[3], $herebody) if $n > 2;  # prereq title body
 	return @tokens;
 }