diff mbox series

[1/3] diff: use filespec path to set up tempfiles for ext-diff

Message ID Y7gAHenwmIo4gXTb@coredump.intra.peff.net (mailing list archive)
State Accepted
Commit a0f83e777660dbf7d9526c05d94fc920e459aed9
Headers show
Series fixing "diff --relative" with external diff | expand

Commit Message

Jeff King Jan. 6, 2023, 11:03 a.m. UTC
When we're going to run an external diff, we have to make the contents
of the pre- and post-images available either by dumping them to a
tempfile, or by pointing at a valid file in the worktree. The logic of
this is all handled by prepare_temp_file(), and we just pass in the
filename and the diff_filespec.

But there's a gotcha here. The "filename" we have is a logical filename
and not necessarily a path on disk or in the repository. This matters in
at least one case: when using "--relative", we may have a name like
"foo", even though the file content is found at "subdir/foo". As a
result, we look for the wrong path, fail to find "foo", and claim that
the file has been deleted (passing "/dev/null" to the external diff,
rather than the correct worktree path).

We can fix this by passing the pathname from the diff_filespec, which
should always be a full repository path (and that's what we want even if
reusing a worktree file, since we're always operating from the top-level
of the working tree).

The breakage seems to go all the way back to cd676a5136 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12). As far as I can tell, before then "name" would always have
been the same as the filespec's "path".

There are two related cases I looked at that aren't buggy:

  1. the only other caller of prepare_temp_file() is run_textconv(). But
     it always passes the filespec's path field, so it's OK.

  2. I wondered if file renames/copies might cause similar confusion.
     But they don't, because run_external_diff() receives two names in
     that case: "name" and "other", which correspond to the two sides of
     the diff. And we did correctly pass "other" when handling the
     post-image side. Barring the use of "--relative", that would always
     match "two->path", the path of the second filespec (and the rename
     destination).

So the only bug is just the interaction with external diff drivers and
--relative.

Reported-by: Carl Baldwin <carl@ecbaldwin.net>
Signed-off-by: Jeff King <peff@peff.net>
---
 diff.c                   |  2 +-
 t/t4045-diff-relative.sh | 29 +++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+), 1 deletion(-)

Comments

Junio C Hamano Jan. 6, 2023, 12:48 p.m. UTC | #1
Jeff King <peff@peff.net> writes:

> We can fix this by passing the pathname from the diff_filespec, which
> should always be a full repository path (and that's what we want even if
> reusing a worktree file, since we're always operating from the top-level
> of the working tree).

Very sensible.

> The breakage seems to go all the way back to cd676a5136 (diff
> --relative: output paths as relative to the current subdirectory,
> 2008-02-12).

Not surprising.  When I wrote all the rest of "diff", I didn't
plan to do "--relative" ;-)

> So the only bug is just the interaction with external diff drivers and
> --relative.
>
> Reported-by: Carl Baldwin <carl@ecbaldwin.net>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  diff.c                   |  2 +-
>  t/t4045-diff-relative.sh | 29 +++++++++++++++++++++++++++++
>  2 files changed, 30 insertions(+), 1 deletion(-)

Thanks for a clear description.  The fix looks trivially obvious and
correct.

> diff --git a/diff.c b/diff.c
> index 9b14543e6e..59039773a1 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -4281,7 +4281,7 @@ static void add_external_diff_name(struct repository *r,
>  				   const char *name,
>  				   struct diff_filespec *df)
>  {
> -	struct diff_tempfile *temp = prepare_temp_file(r, name, df);
> +	struct diff_tempfile *temp = prepare_temp_file(r, df->path, df);
Jeff King Jan. 6, 2023, 1:10 p.m. UTC | #2
On Fri, Jan 06, 2023 at 09:48:57PM +0900, Junio C Hamano wrote:

> > The breakage seems to go all the way back to cd676a5136 (diff
> > --relative: output paths as relative to the current subdirectory,
> > 2008-02-12).
> 
> Not surprising.  When I wrote all the rest of "diff", I didn't
> plan to do "--relative" ;-)

:) Commit cd676a5136 mentions that "diff --relative" does not interact
well with "--no-index", and that was one of the things I tested while
poking (to make sure I did not make anything worse). And indeed, it
seems that --relative is mostly ignored there. We could follow through
on the plan from the end of the commit message to forbid combining the
two, but it may not be that important given how long it has been an
issue (and that I think people may set diff.relative in their config
these days).

> Thanks for a clear description.  The fix looks trivially obvious and
> correct.
> [...]
> > -	struct diff_tempfile *temp = prepare_temp_file(r, name, df);
> > +	struct diff_tempfile *temp = prepare_temp_file(r, df->path, df);

One nagging concern I had is whether "df->path" might ever point to
something unexpected (or even be NULL). The fact that textconv
unconditionally passes it made me feel a lot better. But I also ended up
walking back to the source of the "name" and "other" fields, which is
this code in run_diff():

	name  = one->path;
	other = (strcmp(name, two->path) ? two->path : NULL);
	attr_path = name;
	if (o->prefix_length)
		strip_prefix(o->prefix_length, &name, &other);

So those values really are just aliases for one->path and two->path,
modulo the prefix stripping (which as you might guess, came from
cd676a5136 itself). And using them directly instead of the stripped
versions is definitely the right thing to do.

-Peff
diff mbox series

Patch

diff --git a/diff.c b/diff.c
index 9b14543e6e..59039773a1 100644
--- a/diff.c
+++ b/diff.c
@@ -4281,7 +4281,7 @@  static void add_external_diff_name(struct repository *r,
 				   const char *name,
 				   struct diff_filespec *df)
 {
-	struct diff_tempfile *temp = prepare_temp_file(r, name, df);
+	struct diff_tempfile *temp = prepare_temp_file(r, df->path, df);
 	strvec_push(argv, temp->name);
 	strvec_push(argv, temp->hex);
 	strvec_push(argv, temp->mode);
diff --git a/t/t4045-diff-relative.sh b/t/t4045-diff-relative.sh
index 198dfc9190..9b46c4c1be 100755
--- a/t/t4045-diff-relative.sh
+++ b/t/t4045-diff-relative.sh
@@ -164,6 +164,35 @@  check_diff_relative_option subdir file2 true --no-relative --relative
 check_diff_relative_option . file2 false --no-relative --relative=subdir
 check_diff_relative_option . file2 true --no-relative --relative=subdir
 
+test_expect_success 'external diff with --relative' '
+	test_when_finished "git reset --hard" &&
+	echo changed >file1 &&
+	echo changed >subdir/file2 &&
+
+	write_script mydiff <<-\EOF &&
+	# hacky pretend diff; the goal here is just to make sure we got
+	# passed sensible input that we _could_ diff, without relying on
+	# the specific output of a system diff tool.
+	echo "diff a/$1 b/$1" &&
+	echo "--- a/$1" &&
+	echo "+++ b/$1" &&
+	echo "@@ -1 +0,0 @@" &&
+	sed "s/^/-/" "$2" &&
+	sed "s/^/+/" "$5"
+	EOF
+
+	cat >expect <<-\EOF &&
+	diff a/file2 b/file2
+	--- a/file2
+	+++ b/file2
+	@@ -1 +0,0 @@
+	-other content
+	+changed
+	EOF
+	GIT_EXTERNAL_DIFF=./mydiff git diff --relative=subdir >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'setup diff --relative unmerged' '
 	test_commit zero file0 &&
 	test_commit base subdir/file0 &&