[4/4] am: fix --interactive HEAD tree resolution

Message ID	20190520121301.GD11212@sigill.intra.peff.net (mailing list archive)
State	New, archived
Headers	show Return-Path: <git-owner@kernel.org> Date: Mon, 20 May 2019 08:13:01 -0400 From: Jeff King <peff@peff.net> To: Alejandro Sanchez <asanchez1987@gmail.com> Cc: git@vger.kernel.org Subject: [PATCH 4/4] am: fix --interactive HEAD tree resolution Message-ID: <20190520121301.GD11212@sigill.intra.peff.net> References: <20190520120636.GA12634@sigill.intra.peff.net> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20190520120636.GA12634@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk
Series	fix BUG() with "git am -i --resolved" \| expand [0/4] fix BUG() with "git am -i --resolved" [1/4] am: simplify prompt response handling [2/4] am: read interactive input from stdin [3/4] am: drop tty requirement for --interactive [4/4] am: fix --interactive HEAD tree resolution

Jeff King May 20, 2019, 12:13 p.m. UTC

In interactive mode, "git am -i --resolved" will try to generate a patch
based on what is in the index, so that it can prompt "apply this
patch?". To do so it needs the tree of HEAD, which it tries to get with
get_oid_tree(). However, this doesn't yield a tree oid; the "tree" part
just means "if you must disambiguate short oids, then prefer trees" (and
we do not need to disambiguate at all, since we are feeding a ref name).

Instead, we must parse the oid as a commit (which should always be true
in a non-corrupt repository), and access its tree pointer manually.

This has been broken since the conversion to C in 7ff2683253
(builtin-am: implement -i/--interactive, 2015-08-04), but there was no
test coverage because of interactive-mode's insistence on having a tty.
That was lifted in the previous commit, so we can now add a test for
this case.

Note that before this patch, the test would result in a BUG() which
comes from 3506dc9445 (has_uncommitted_changes(): fall back to empty
tree, 2018-07-11). But before that, we'd have simply segfaulted (and in
fact this is the exact type of case the BUG() added there was trying to
catch!).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/am.c              | 14 ++++++++---
 t/t4257-am-interactive.sh | 49 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+), 3 deletions(-)
 create mode 100755 t/t4257-am-interactive.sh

Johannes Schindelin May 23, 2019, 7:12 a.m. UTC | #1

Hi Peff,

On Mon, 20 May 2019, Jeff King wrote:

> In interactive mode, "git am -i --resolved" will try to generate a patch
> based on what is in the index, so that it can prompt "apply this
> patch?". To do so it needs the tree of HEAD, which it tries to get with
> get_oid_tree(). However, this doesn't yield a tree oid; the "tree" part
> just means "if you must disambiguate short oids, then prefer trees" (and
> we do not need to disambiguate at all, since we are feeding a ref name).
>
> Instead, we must parse the oid as a commit (which should always be true
> in a non-corrupt repository), and access its tree pointer manually.
>
> This has been broken since the conversion to C in 7ff2683253
> (builtin-am: implement -i/--interactive, 2015-08-04), but there was no
> test coverage because of interactive-mode's insistence on having a tty.
> That was lifted in the previous commit, so we can now add a test for
> this case.
>
> Note that before this patch, the test would result in a BUG() which
> comes from 3506dc9445 (has_uncommitted_changes(): fall back to empty
> tree, 2018-07-11). But before that, we'd have simply segfaulted (and in
> fact this is the exact type of case the BUG() added there was trying to
> catch!).

What an old breakage! Thanks for analyzing and fixing it.

> diff --git a/builtin/am.c b/builtin/am.c
> index ea16b844f1..33bd7a6eab 100644
> --- a/builtin/am.c
> +++ b/builtin/am.c
> @@ -1339,9 +1339,17 @@ static void write_index_patch(const struct am_state *state)
>  	struct rev_info rev_info;
>  	FILE *fp;
>
> -	if (!get_oid_tree("HEAD", &head))
> -		tree = lookup_tree(the_repository, &head);
> -	else
> +	if (!get_oid("HEAD", &head)) {
> +		struct object *obj;
> +		struct commit *commit;
> +
> +		obj = parse_object_or_die(&head, NULL);
> +		commit = object_as_type(the_repository, obj, OBJ_COMMIT, 0);
> +		if (!commit)
> +			die("unable to parse HEAD as a commit");

Wouldn't this be easier to read like this:

		struct commit *commit =
			lookup_commit_reference(the_repository, &head);

> +
> +		tree = get_commit_tree(commit);
> +	} else
>  		tree = lookup_tree(the_repository,
>  				   the_repository->hash_algo->empty_tree);
>
> diff --git a/t/t4257-am-interactive.sh b/t/t4257-am-interactive.sh
> new file mode 100755
> index 0000000000..6989bf7aba
> --- /dev/null
> +++ b/t/t4257-am-interactive.sh
> @@ -0,0 +1,49 @@
> +#!/bin/sh
> +
> +test_description='am --interactive tests'
> +. ./test-lib.sh
> +
> +test_expect_success 'set up patches to apply' '
> +	test_commit unrelated &&
> +	test_commit no-conflict &&
> +	test_commit conflict-patch file patch &&
> +	git format-patch --stdout -2 >mbox &&
> +
> +	git reset --hard unrelated &&
> +	test_commit conflict-master file master base
> +'
> +
> +# Sanity check our setup.
> +test_expect_success 'applying all patches generates conflict' '
> +	test_must_fail git am mbox &&
> +	echo resolved >file &&
> +	git add -u &&
> +	git am --resolved
> +'
> +
> +test_expect_success 'interactive am can apply a single patch' '
> +	git reset --hard base &&
> +	printf "%s\n" y n | git am -i mbox &&

Since we want contributors to copy-edit our test cases (even if they do
not happen to be Unix shell scripting experts), it would be better to
write

	test_write_lines y n | git am -i mbox &&

here. Same for similar `printf` invocations further down.

> +
> +	echo no-conflict >expect &&
> +	git log -1 --format=%s >actual &&
> +	test_cmp expect actual

I would prefer

	test no-conflict = "$(git show -s --format=%s HEAD)"

or even better:

test_cmp_head_oneline () {
	if test "$1" != "$(git show -s --format=%s HEAD)"
	then
		echo >&4 "HEAD's oneline is '$(git show -s \
			--format=%s HEAD)'; expected '$1'"
		return 1
	fi
}

> +'
> +
> +test_expect_success 'interactive am can resolve conflict' '
> +	git reset --hard base &&
> +	printf "%s\n" y y | test_must_fail git am -i mbox &&
> +	echo resolved >file &&
> +	git add -u &&
> +	printf "%s\n" v y | git am -i --resolved &&

Maybe a comment, to explain to the casual reader what the "v" and the "y"
are supposed to do?

> +
> +	echo conflict-patch >expect &&
> +	git log -1 --format=%s >actual &&
> +	test_cmp expect actual &&
> +
> +	echo resolved >expect &&
> +	git cat-file blob HEAD:file >actual &&
> +	test_cmp expect actual
> +'

After wrapping my head around the intentions of these commands, I agree
that they test for the right thing.

Thanks!
Dscho

> +
> +test_done
> --
> 2.22.0.rc1.539.g7bfcdfe86d
>

Jeff King May 24, 2019, 6:39 a.m. UTC | #2

On Thu, May 23, 2019 at 09:12:27AM +0200, Johannes Schindelin wrote:

> > +	if (!get_oid("HEAD", &head)) {
> > +		struct object *obj;
> > +		struct commit *commit;
> > +
> > +		obj = parse_object_or_die(&head, NULL);
> > +		commit = object_as_type(the_repository, obj, OBJ_COMMIT, 0);
> > +		if (!commit)
> > +			die("unable to parse HEAD as a commit");
> 
> Wouldn't this be easier to read like this:
> 
> 		struct commit *commit =
> 			lookup_commit_reference(the_repository, &head);

Just the first two lines, I assume you mean; we still have to die
ourselves. There is a lookup_commit_or_die(), but weirdly it warns if
there was any tag dereferencing. I guess that would never happen here,
since we're reading HEAD (and most of the existing calls appear to be
for HEAD). So I'll go with that.

> > +test_expect_success 'interactive am can apply a single patch' '
> > +	git reset --hard base &&
> > +	printf "%s\n" y n | git am -i mbox &&
> 
> Since we want contributors to copy-edit our test cases (even if they do
> not happen to be Unix shell scripting experts), it would be better to
> write
> 
> 	test_write_lines y n | git am -i mbox &&
> 
> here. Same for similar `printf` invocations further down.

I think test_write_lines is mostly about avoiding echo chains, but it's
probably a little more readable to avoid having to say "\n". I'll adopt
that.

> > +	echo no-conflict >expect &&
> > +	git log -1 --format=%s >actual &&
> > +	test_cmp expect actual
> 
> I would prefer
> 
> 	test no-conflict = "$(git show -s --format=%s HEAD)"
> 
> or even better:
> 
> test_cmp_head_oneline () {
> 	if test "$1" != "$(git show -s --format=%s HEAD)"
> 	then
> 		echo >&4 "HEAD's oneline is '$(git show -s \
> 			--format=%s HEAD)'; expected '$1'"
> 		return 1
> 	fi
> }

This, I disagree with. IMHO comparing command output using "test" is
harder to read and produces worse debugging output (unless you do a
helper as you showed, which I think makes the readability even worse).
Not to mention that it raises questions of the shell's whitespace
handling (though that does not matter for this case).

What's your complaint with test_cmp? Is it the extra process? Could we
perhaps deal with that by having it use `read` for the happy-path?

Or do you prefer having a one-liner? I'd rather come up with a more
generic helper to cover this case, that can run any command and compare
it to a single argument (or stdin). E.g.,:

  test_cmp_cmd no-conflict git log -1 --format=%s

or

  test_cmp_cmd - git foo <<-\EOF
  multi-line
  expectation
  EOF

But I'd rather approach those issues separately and systematically, and
not hold up this bug fix.

> > +test_expect_success 'interactive am can resolve conflict' '
> > +	git reset --hard base &&
> > +	printf "%s\n" y y | test_must_fail git am -i mbox &&
> > +	echo resolved >file &&
> > +	git add -u &&
> > +	printf "%s\n" v y | git am -i --resolved &&
> 
> Maybe a comment, to explain to the casual reader what the "v" and the "y"
> are supposed to do?

OK. The "v" is actually optional, but I figured it would not hurt to
have us print the patch we just generated. I'll add a comment.

> After wrapping my head around the intentions of these commands, I agree
> that they test for the right thing.

Thanks!

-Peff

Johannes Schindelin May 28, 2019, 11:06 a.m. UTC | #3

Hi Peff,

On Fri, 24 May 2019, Jeff King wrote:

> On Thu, May 23, 2019 at 09:12:27AM +0200, Johannes Schindelin wrote:
>
> > > +	echo no-conflict >expect &&
> > > +	git log -1 --format=%s >actual &&
> > > +	test_cmp expect actual
> >
> > I would prefer
> >
> > 	test no-conflict = "$(git show -s --format=%s HEAD)"
> >
> > or even better:
> >
> > test_cmp_head_oneline () {
> > 	if test "$1" != "$(git show -s --format=%s HEAD)"
> > 	then
> > 		echo >&4 "HEAD's oneline is '$(git show -s \
> > 			--format=%s HEAD)'; expected '$1'"
> > 		return 1
> > 	fi
> > }
>
> This, I disagree with. IMHO comparing command output using "test" is
> harder to read and produces worse debugging output (unless you do a
> helper as you showed, which I think makes the readability even worse).
> Not to mention that it raises questions of the shell's whitespace
> handling (though that does not matter for this case).
>
> What's your complaint with test_cmp? Is it the extra process? Could we
> perhaps deal with that by having it use `read` for the happy-path?

I would prefer it if we adopted a more descriptive style in the test
suite, as I always found that style to be much easier to work with (you
might have guessed that I am spending a lot of time chasing test
failures).

Succinctness is just one benefit of that.

A more important benefit is that you can teach a helper that verifies
onelines to show very useful information in case of a failure, something
that `test_cmp` cannot do because it is totally agnostic to what it
compares.

> Or do you prefer having a one-liner? I'd rather come up with a more
> generic helper to cover this case, that can run any command and compare
> it to a single argument (or stdin). E.g.,:
>
>   test_cmp_cmd no-conflict git log -1 --format=%s
>
> or
>
>   test_cmp_cmd - git foo <<-\EOF
>   multi-line
>   expectation
>   EOF

I guess that you and me go into completely opposite directions here. I
want something *less* general. Because I want to optimize for the
unfortunate times when a test fails and most likely somebody else than the
original author of the test case is tasked with figuring out what the heck
goes wrong.

You seem to want to optimize for writing test cases. Which I find -- with
all due respect -- the wrong thing to optimize for. It is already dirt
easy to write new test cases. But *good* test cases (i.e. easy to debug
ones)? Not so much.

> But I'd rather approach those issues separately and systematically, and
> not hold up this bug fix.

Sure.

> > > +test_expect_success 'interactive am can resolve conflict' '
> > > +	git reset --hard base &&
> > > +	printf "%s\n" y y | test_must_fail git am -i mbox &&
> > > +	echo resolved >file &&
> > > +	git add -u &&
> > > +	printf "%s\n" v y | git am -i --resolved &&
> >
> > Maybe a comment, to explain to the casual reader what the "v" and the "y"
> > are supposed to do?
>
> OK. The "v" is actually optional, but I figured it would not hurt to
> have us print the patch we just generated. I'll add a comment.

Thank you.

Ciao,
Dscho

Jeff King May 28, 2019, 9:35 p.m. UTC | #4

On Tue, May 28, 2019 at 01:06:21PM +0200, Johannes Schindelin wrote:

> > Or do you prefer having a one-liner? I'd rather come up with a more
> > generic helper to cover this case, that can run any command and compare
> > it to a single argument (or stdin). E.g.,:
> >
> >   test_cmp_cmd no-conflict git log -1 --format=%s
> >
> > or
> >
> >   test_cmp_cmd - git foo <<-\EOF
> >   multi-line
> >   expectation
> >   EOF
> 
> I guess that you and me go into completely opposite directions here. I
> want something *less* general. Because I want to optimize for the
> unfortunate times when a test fails and most likely somebody else than the
> original author of the test case is tasked with figuring out what the heck
> goes wrong.
> 
> You seem to want to optimize for writing test cases. Which I find -- with
> all due respect -- the wrong thing to optimize for. It is already dirt
> easy to write new test cases. But *good* test cases (i.e. easy to debug
> ones)? Not so much.

Hmm. I too want the test output to be useful to people other than the
test author. But I find the output from test_cmp perfectly fine there.
My first step in digging into a failure is usually to look at what
commands the test is running, which generally makes it obvious why we
are expecting one thing and seeing another (or at least, just as obvious
as a hand-written message).

So to me the two are equal on that front, which makes me want to go with
the thing that is shorter to write, as it makes it more likely the test
writer will write it. The _worst_ option IMHO is a straight-up use of
"test" which provides no output at all in the test log of what value we
_did_ see. That requires the person looking into the failure to re-run
the test, which is hard if it's a remote CI, or if the failure does not
always reproduce.

-Peff

Johannes Schindelin May 29, 2019, 11:56 a.m. UTC | #5

Hi Peff,

On Tue, 28 May 2019, Jeff King wrote:

> On Tue, May 28, 2019 at 01:06:21PM +0200, Johannes Schindelin wrote:
>
> > > Or do you prefer having a one-liner? I'd rather come up with a more
> > > generic helper to cover this case, that can run any command and compare
> > > it to a single argument (or stdin). E.g.,:
> > >
> > >   test_cmp_cmd no-conflict git log -1 --format=%s
> > >
> > > or
> > >
> > >   test_cmp_cmd - git foo <<-\EOF
> > >   multi-line
> > >   expectation
> > >   EOF
> >
> > I guess that you and me go into completely opposite directions here. I
> > want something *less* general. Because I want to optimize for the
> > unfortunate times when a test fails and most likely somebody else than the
> > original author of the test case is tasked with figuring out what the heck
> > goes wrong.
> >
> > You seem to want to optimize for writing test cases. Which I find -- with
> > all due respect -- the wrong thing to optimize for. It is already dirt
> > easy to write new test cases. But *good* test cases (i.e. easy to debug
> > ones)? Not so much.
>
> Hmm. I too want the test output to be useful to people other than the
> test author. But I find the output from test_cmp perfectly fine there.
> My first step in digging into a failure is usually to look at what
> commands the test is running, which generally makes it obvious why we
> are expecting one thing and seeing another (or at least, just as obvious
> as a hand-written message).
>
> So to me the two are equal on that front, which makes me want to go with
> the thing that is shorter to write, as it makes it more likely the test
> writer will write it. The _worst_ option IMHO is a straight-up use of
> "test" which provides no output at all in the test log of what value we
> _did_ see. That requires the person looking into the failure to re-run
> the test, which is hard if it's a remote CI, or if the failure does not
> always reproduce.

If you think your version is easier to debug, then I won't object.

Thanks,
Dscho

Alejandro Sanchez Sept. 26, 2019, 2:20 p.m. UTC | #6

Hi,

Are there any updates to this problem?

Thank you,
Alex


On Wed, May 29, 2019 at 1:57 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Peff,
>
> On Tue, 28 May 2019, Jeff King wrote:
>
> > On Tue, May 28, 2019 at 01:06:21PM +0200, Johannes Schindelin wrote:
> >
> > > > Or do you prefer having a one-liner? I'd rather come up with a more
> > > > generic helper to cover this case, that can run any command and compare
> > > > it to a single argument (or stdin). E.g.,:
> > > >
> > > >   test_cmp_cmd no-conflict git log -1 --format=%s
> > > >
> > > > or
> > > >
> > > >   test_cmp_cmd - git foo <<-\EOF
> > > >   multi-line
> > > >   expectation
> > > >   EOF
> > >
> > > I guess that you and me go into completely opposite directions here. I
> > > want something *less* general. Because I want to optimize for the
> > > unfortunate times when a test fails and most likely somebody else than the
> > > original author of the test case is tasked with figuring out what the heck
> > > goes wrong.
> > >
> > > You seem to want to optimize for writing test cases. Which I find -- with
> > > all due respect -- the wrong thing to optimize for. It is already dirt
> > > easy to write new test cases. But *good* test cases (i.e. easy to debug
> > > ones)? Not so much.
> >
> > Hmm. I too want the test output to be useful to people other than the
> > test author. But I find the output from test_cmp perfectly fine there.
> > My first step in digging into a failure is usually to look at what
> > commands the test is running, which generally makes it obvious why we
> > are expecting one thing and seeing another (or at least, just as obvious
> > as a hand-written message).
> >
> > So to me the two are equal on that front, which makes me want to go with
> > the thing that is shorter to write, as it makes it more likely the test
> > writer will write it. The _worst_ option IMHO is a straight-up use of
> > "test" which provides no output at all in the test log of what value we
> > _did_ see. That requires the person looking into the failure to re-run
> > the test, which is hard if it's a remote CI, or if the failure does not
> > always reproduce.
>
> If you think your version is easier to debug, then I won't object.
>
> Thanks,
> Dscho

Jeff King Sept. 26, 2019, 9:11 p.m. UTC | #7

On Thu, Sep 26, 2019 at 04:20:05PM +0200, Alejandro Sanchez wrote:

> Are there any updates to this problem?

The fix for the original bug went into Git v2.22.1.

-Peff

[4/4] am: fix --interactive HEAD tree resolution

Commit Message

Comments

Patch