[v4,1/2] diff-files --raw: show correct post-image of intent-to-add files
diff mbox series

Message ID 69256ab9107c3dba0dc007b69cc0ce98a9b91f9a.1593107621.git.gitgitgadget@gmail.com
State New
Headers show
Series
  • Fix difftool problem with intent-to-add files
Related show

Commit Message

Hariom Verma via GitGitGadget June 25, 2020, 5:53 p.m. UTC
From: Johannes Schindelin <johannes.schindelin@gmx.de>

The documented behavior of `git diff-files --raw` is to display

	[...] 0{40} if creation, unmerged or "look at work tree".

This happens for example when showing modified, unstaged files.

For intent-to-add files, we used to show the empty blob's hash instead.
In c26022ea8f5 (diff: convert diff_addremove to struct object_id,
2017-05-30), we made that worse by inadvertently changing that to the
hash of the empty tree.

Let's make the behavior consistent with modified files by showing
all-zero values also for intent-to-add files.

Accordingly, this patch adjusts the expectations set by the regression
test introduced in feea6946a5b (diff-files: treat "i-t-a" files as
"not-in-index", 2020-06-20).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 diff-lib.c            | 3 +--
 t/t2203-add-intent.sh | 5 ++---
 2 files changed, 3 insertions(+), 5 deletions(-)

Comments

Junio C Hamano June 25, 2020, 6:08 p.m. UTC | #1
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> diff --git a/diff-lib.c b/diff-lib.c
> index 61812f48c2..25fd2dee19 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -220,8 +220,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>  			} else if (revs->diffopt.ita_invisible_in_index &&
>  				   ce_intent_to_add(ce)) {
>  				diff_addremove(&revs->diffopt, '+', ce->ce_mode,
> -					       the_hash_algo->empty_tree, 0,
> -					       ce->name, 0);
> +					       &null_oid, 0, ce->name, 0);

This (even if the preimage were correctly using empty_blob) is *so*
simple a change that it is very surprising that the new test in
[2/2] passes without any other code change.

It means that difftool was written correctly in the first place,
right?

Nicely done.
Junio C Hamano June 25, 2020, 6:21 p.m. UTC | #2
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> The documented behavior of `git diff-files --raw` is to display
>
> 	[...] 0{40} if creation, unmerged or "look at work tree".

"on the right hand (i.e. postimage) side" is missing, which is
crucial in this description.

> This happens for example when showing modified, unstaged files.

Modified I would understand.  We notice that the contents on the
work tree is different from what is in the index, and we tell the
consumer "look at work tree".  I do not think you meant "unstaged",
though.  If truly removed from the index, diff-files won't even know
about the path.  You probably meant "removed from the working tree",
but 0{40} in that case means totally different thing (i.e. it would
be a (D)eletion record, so 0{40} on the postimage side is a filler,
not even "look at work tree").  What would make more sense to
describe might be

	This happens for paths that are modified, or unmodified but
	stat-dirty.

Either case, we tell the consumer to check the "work tree".

> For intent-to-add files, we used to show the empty blob's hash instead.
> In c26022ea8f5 (diff: convert diff_addremove to struct object_id,
> 2017-05-30), we made that worse by inadvertently changing that to the
> hash of the empty tree.
>
> Let's make the behavior consistent with modified files by showing
> all-zero values also for intent-to-add files.

Well described.

Thanks.
Srinidhi Kaushik June 26, 2020, 5:49 p.m. UTC | #3
Sorry for the late reply.

> Let's make the behavior consistent with modified files by showing
> all-zero values also for intent-to-add files.
>
-- >8 --
>
>                               diff_addremove(&revs->diffopt, '+', ce->ce_mode,
> -                                            the_hash_algo->empty_tree, 0,
> -                                            ce->name, 0);
> +                                            &null_oid, 0, ce->name, 0);
>                               continue;
>                       }
>
> diff --git a/t/t2203-add-intent.sh b/t/t2203-add-intent.sh
> index 8a5d55054f..cf0175ad6e 100755
> --- a/t/t2203-add-intent.sh
> +++ b/t/t2203-add-intent.sh
> @@ -240,7 +240,6 @@ test_expect_success 'i-t-a files shown as new for "diff", "diff-files"; not-new
>
>       hash_e=$(git hash-object empty) &&
>       hash_n=$(git hash-object not-empty) &&
> -     hash_t=$(git hash-object -t tree /dev/null) &&
>
>       cat >expect.diff_p <<-EOF &&
>       diff --git a/empty b/empty
> @@ -259,8 +258,8 @@ test_expect_success 'i-t-a files shown as new for "diff", "diff-files"; not-new
>        create mode 100644 not-empty
>       EOF
>       cat >expect.diff_a <<-EOF &&
> -     :000000 100644 0000000 $(git rev-parse --short $hash_t) A$(printf "\t")empty
> -     :000000 100644 0000000 $(git rev-parse --short $hash_t) A$(printf "\t")not-empty
> +     :000000 100644 0000000 0000000 A$(printf "\t")empty
> +     :000000 100644 0000000 0000000 A$(printf "\t")not-empty

This change (and the new test in [PATCH v4 2/2]) looks good to me.

I learnt quite a bit about what the string "0{40}" means in the context
of `diff'  post-image from the conversations around this patch. It was
very helpful.

Thank you.
Johannes Schindelin July 1, 2020, 9:46 a.m. UTC | #4
Hi Junio,


On Thu, 25 Jun 2020, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > diff --git a/diff-lib.c b/diff-lib.c
> > index 61812f48c2..25fd2dee19 100644
> > --- a/diff-lib.c
> > +++ b/diff-lib.c
> > @@ -220,8 +220,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
> >  			} else if (revs->diffopt.ita_invisible_in_index &&
> >  				   ce_intent_to_add(ce)) {
> >  				diff_addremove(&revs->diffopt, '+', ce->ce_mode,
> > -					       the_hash_algo->empty_tree, 0,
> > -					       ce->name, 0);
> > +					       &null_oid, 0, ce->name, 0);
>
> This (even if the preimage were correctly using empty_blob) is *so*
> simple a change that it is very surprising that the new test in
> [2/2] passes without any other code change.
>
> It means that difftool was written correctly in the first place,
> right?

Well, it means that difftool does not even consume the OID. Sure, it
parses it, but then it ignores it. All it _really_ is interested in is
that status flag (`A`), so technically, my fix in 1/2 is not even needed
after `sk/diff-files-show-i-t-a-as-new` to let 2/2 pass.

Ciao,
Dscho

>
> Nicely done.
>
Johannes Schindelin July 1, 2020, 9:52 a.m. UTC | #5
Hi Junio,

On Thu, 25 Jun 2020, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > The documented behavior of `git diff-files --raw` is to display
> >
> > 	[...] 0{40} if creation, unmerged or "look at work tree".
>
> "on the right hand (i.e. postimage) side" is missing, which is
> crucial in this description.

True. Sorry for omitting that.

> > This happens for example when showing modified, unstaged files.
>
> Modified I would understand.  We notice that the contents on the
> work tree is different from what is in the index, and we tell the
> consumer "look at work tree".  I do not think you meant "unstaged",

Right. I probably should have written "[...] when showing files with
unstaged modifications".

> though.  If truly removed from the index, diff-files won't even know
> about the path.  You probably meant "removed from the working tree",
> but 0{40} in that case means totally different thing (i.e. it would
> be a (D)eletion record, so 0{40} on the postimage side is a filler,
> not even "look at work tree").  What would make more sense to
> describe might be
>
> 	This happens for paths that are modified, or unmodified but
> 	stat-dirty.

Yes, but that includes more than I wanted to describe because the
stat-dirty does not matter for intent-to-add files, and I wanted to point
out the analogy between unstaged modifications and intent-to-add-files.

I noticed that you merged this commit into `next` already, so I am
assuming that you do not want me to send a fifth iteration ;-)

Ciao,
Dscho

> Either case, we tell the consumer to check the "work tree".
>
> > For intent-to-add files, we used to show the empty blob's hash instead.
> > In c26022ea8f5 (diff: convert diff_addremove to struct object_id,
> > 2017-05-30), we made that worse by inadvertently changing that to the
> > hash of the empty tree.
> >
> > Let's make the behavior consistent with modified files by showing
> > all-zero values also for intent-to-add files.
>
> Well described.
>
> Thanks.
>
Junio C Hamano July 1, 2020, 9:02 p.m. UTC | #6
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Thu, 25 Jun 2020, Junio C Hamano wrote:
>
>> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
>> writes:
>>
>> > diff --git a/diff-lib.c b/diff-lib.c
>> > index 61812f48c2..25fd2dee19 100644
>> > --- a/diff-lib.c
>> > +++ b/diff-lib.c
>> > @@ -220,8 +220,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>> >  			} else if (revs->diffopt.ita_invisible_in_index &&
>> >  				   ce_intent_to_add(ce)) {
>> >  				diff_addremove(&revs->diffopt, '+', ce->ce_mode,
>> > -					       the_hash_algo->empty_tree, 0,
>> > -					       ce->name, 0);
>> > +					       &null_oid, 0, ce->name, 0);
>>
>> This (even if the preimage were correctly using empty_blob) is *so*
>> simple a change that it is very surprising that the new test in
>> [2/2] passes without any other code change.
>>
>> It means that difftool was written correctly in the first place,
>> right?
>
> Well, it means that difftool does not even consume the OID. Sure, it
> parses it, but then it ignores it. All it _really_ is interested in is
> that status flag (`A`),

Ah, that's an ultimately defensive code.  No matter what is on the
right hand side of the raw output, as long as it knows that the
right hand side is a file on the working tree, it can safely ignore
the object name and directly go to the working tree file.

Nice ;-)

> so technically, my fix in 1/2 is not even needed
> after `sk/diff-files-show-i-t-a-as-new` to let 2/2 pass.

Sure, but I think it is the right thing to do nevertheless.  Giving
the object name for an empty blob would be wrong unless the working
tree file is known to be empty (and empty tree?  what were we even
thinking, gee...).

Patch
diff mbox series

diff --git a/diff-lib.c b/diff-lib.c
index 61812f48c2..25fd2dee19 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -220,8 +220,7 @@  int run_diff_files(struct rev_info *revs, unsigned int option)
 			} else if (revs->diffopt.ita_invisible_in_index &&
 				   ce_intent_to_add(ce)) {
 				diff_addremove(&revs->diffopt, '+', ce->ce_mode,
-					       the_hash_algo->empty_tree, 0,
-					       ce->name, 0);
+					       &null_oid, 0, ce->name, 0);
 				continue;
 			}
 
diff --git a/t/t2203-add-intent.sh b/t/t2203-add-intent.sh
index 8a5d55054f..cf0175ad6e 100755
--- a/t/t2203-add-intent.sh
+++ b/t/t2203-add-intent.sh
@@ -240,7 +240,6 @@  test_expect_success 'i-t-a files shown as new for "diff", "diff-files"; not-new
 
 	hash_e=$(git hash-object empty) &&
 	hash_n=$(git hash-object not-empty) &&
-	hash_t=$(git hash-object -t tree /dev/null) &&
 
 	cat >expect.diff_p <<-EOF &&
 	diff --git a/empty b/empty
@@ -259,8 +258,8 @@  test_expect_success 'i-t-a files shown as new for "diff", "diff-files"; not-new
 	 create mode 100644 not-empty
 	EOF
 	cat >expect.diff_a <<-EOF &&
-	:000000 100644 0000000 $(git rev-parse --short $hash_t) A$(printf "\t")empty
-	:000000 100644 0000000 $(git rev-parse --short $hash_t) A$(printf "\t")not-empty
+	:000000 100644 0000000 0000000 A$(printf "\t")empty
+	:000000 100644 0000000 0000000 A$(printf "\t")not-empty
 	EOF
 
 	git add -N empty not-empty &&