diff: handle NULL meta-info when spawning external diff

Message ID 20240129015708.GA1762343@coredump.intra.peff.net (mailing list archive)
State Accepted
Commit 85a9a63c9268b18b24f25f6a14d6ae9966c3566d

Commit Message

Jeff King Jan. 29, 2024, 1:57 a.m. UTC
On Sun, Jan 28, 2024 at 12:24:39PM -0800, Wilfred Hughes wrote:

> It looks like git crashes if diff.external is set and the user
> compares files that have different permissions. Here's a repro:
> 
> $ mkdir demo
> $ cd demo
> $ git init .
> Initialized empty Git repository in /tmp/demo/.git/
> 
> $ git config diff.external /bin/echo
> $ touch foo bar
> $ chmod 755 foo
> $ git diff --no-ext-diff --no-index foo bar
> diff --git 1/foo 2/bar
> old mode 100755
> new mode 100644
> 
> $ git diff --no-index foo bar
> zsh: segmentation fault (core dumped)  git diff --no-index foo bar

Thanks for providing a simple reproduction recipe. There's a pretty
straight-forward fix below, though it leaves open some question of
whether there's another bug lurking with --no-index (but either way, I
think we'd want this simple fix as a first step).

-- >8 --
Subject: diff: handle NULL meta-info when spawning external diff

Running this:

  $ touch foo bar
  $ chmod +x foo
  $ git -c diff.external=echo diff --ext-diff --no-index foo bar

results in a segfault. The issue is that run_diff_cmd() passes a NULL
"xfrm_msg" variable to run_external_diff(), which feeds it to
strvec_push(), causing the segfault. The bug dates back to 82fbf269b9
(run_external_diff: use an argv_array for the command line, 2014-04-19),
though it mostly only ever worked accidentally.  Before then, we just
stuck the NULL pointer into a "const char **" array, so our NULL ended
up acting as an extra end-of-argv sentinel (which was OK, because it was
the last thing in the array).
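
To illustrate why the old code got away with this, here is a minimal
sketch (not the historical Git code; build_legacy_argv() and count_argv()
are hypothetical names) of a fixed "const char **" array in which a NULL
xfrm_msg simply terminates the list one slot early:

```c
#include <stddef.h>

/* Count entries the way an exec-style consumer would: stop at the first NULL. */
static size_t count_argv(const char **argv)
{
	size_t n = 0;

	while (argv[n])
		n++;
	return n;
}

/*
 * Hypothetical stand-in for the pre-2014 approach: fill a caller-provided
 * array (at least 4 slots) and terminate it. If xfrm_msg is NULL, it is
 * indistinguishable from the terminator that follows it.
 */
static size_t build_legacy_argv(const char **argv, const char *pgm,
				const char *other, const char *xfrm_msg)
{
	size_t i = 0;

	argv[i++] = pgm;
	if (other) {
		argv[i++] = other;
		argv[i++] = xfrm_msg; /* may be NULL: acts as an early sentinel */
	}
	argv[i] = NULL;
	return count_argv(argv);
}
```

With a NULL xfrm_msg the consumer sees only the program name and "other",
which is harmless only because nothing meaningful follows that slot.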

Curiously, though, this is only a problem with --no-index. We set up
xfrm_msg by calling fill_metainfo(). This result may be empty, or may
have text like "index 1234..5678\n", "rename from foo\nrename to
bar\n", etc. In run_external_diff(), we only look at xfrm_msg if the
"other" variable is not NULL. That variable is set when the paths of the
two sides of the diff pair aren't the same (in which case the
destination path becomes "other"). So normally it would kick in only for
a rename, in which case xfrm_msg should not be NULL (it would have the
rename information in it).

But with a "--no-index" of two blobs, we of course have two different
pathnames, and thus end up with a non-NULL "other" filename (which is
always just a repeat of the file2-name), but possibly a NULL xfrm_msg.

So how to fix it? I have a feeling that --no-index always passing
"other" to the external diff command is probably a bug. There was no
rename, and the name is always redundant with existing information we
pass (and this may even cause us to pass a useless "xfrm_msg" that
contains an "index 1234..5678" line). So one option would be to change
that behavior. We don't seem to have ever documented the "other" or
"xfrm_msg" parameters for external diffs.

But I'm not sure what fallout we might have from changing that behavior
now. So this patch takes the less-risky option, and simply teaches
run_external_diff() to avoid passing xfrm_msg when it's NULL. That makes
it agnostic to whether "other" and "xfrm_msg" always come as a pair. It
fixes the segfault now, and if we want to change the --no-index "other"
behavior on top, it will handle that, too.
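
As a rough, self-contained sketch of the idea (struct argvec and its
helpers are hypothetical simplifications, not Git's strvec API): a vector
that copies each pushed string must dereference it, so the caller guards
the optional argument instead of pushing NULL.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a string vector; not the real strvec. */
struct argvec {
	char **v;
	size_t nr, alloc;
};

/* Copies "value" before storing it, so "value" must not be NULL. */
static int argvec_push(struct argvec *a, const char *value)
{
	size_t len = strlen(value) + 1; /* would crash on a NULL value */
	char *copy;

	if (a->nr + 2 > a->alloc) { /* room for the new entry and the terminator */
		size_t alloc = a->alloc ? a->alloc * 2 : 8;
		char **v = realloc(a->v, alloc * sizeof(*v));

		if (!v)
			return -1;
		a->v = v;
		a->alloc = alloc;
	}
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, value, len);
	a->v[a->nr++] = copy;
	a->v[a->nr] = NULL; /* keep the vector NULL-terminated */
	return 0;
}

static void argvec_clear(struct argvec *a)
{
	for (size_t i = 0; i < a->nr; i++)
		free(a->v[i]);
	free(a->v);
	a->v = NULL;
	a->nr = a->alloc = 0;
}

/* Same shape as the patched branch: "other" without metainfo is accepted. */
static int push_rename_args(struct argvec *a, const char *other,
			    const char *xfrm_msg)
{
	if (!other)
		return 0;
	if (argvec_push(a, other) < 0)
		return -1;
	if (xfrm_msg && argvec_push(a, xfrm_msg) < 0)
		return -1;
	return 0;
}
```

The guard lives in the caller rather than in the push helper, so the
helper keeps its "never NULL" contract and the optional argument is
handled where its meaning is known.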

Reported-by: Wilfred Hughes <me@wilfred.me.uk>
Signed-off-by: Jeff King <peff@peff.net>
---
 diff.c                   |  3 ++-
 t/t4053-diff-no-index.sh | 12 ++++++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

Comments

Junio C Hamano Jan. 29, 2024, 6:37 p.m. UTC | #1
Jeff King <peff@peff.net> writes:

>> $ git diff --no-index foo bar
>> zsh: segmentation fault (core dumped)  git diff --no-index foo bar
>
> Thanks for providing a simple reproduction recipe. There's a pretty
> straight-forward fix below, though it leaves open some question of
> whether there's another bug lurking with --no-index (but either way, I
> think we'd want this simple fix as a first step).

Yup, I agree with you that the "--no-index" mode violates the basic
design that "the other path" and "xfrm_msg" go hand-in-hand.  In its
two tree comparison mode "git diff --no-index A/ B/", it should be
able to behave sensibly, but in its two files comparison mode to
compare plain regular files 'foo' and 'bar', there is nothing it can
do reasonably, I am afraid.  You could say that the change is
renaming 'foo' to create 'bar', and feed consistent data that is
aligned with that rename to external diff, which might be slightly
more logical than showing a change to 'foo' that has no rename
involved (i.e. omitting "other name"), but neither is satisfying.

> But I'm not sure what fallout we might have from changing that behavior
> now. So this patch takes the less-risky option, and simply teaches
> run_external_diff() to avoid passing xfrm_msg when it's NULL. That makes
> it agnostic to whether "other" and "xfrm_msg" always come as a pair. It
> fixes the segfault now, and if we want to change the --no-index "other"
> behavior on top, it will handle that, too.

Sounds sensible.

Thanks.  Will queue.

Jeff King Jan. 30, 2024, 6:06 a.m. UTC | #2
On Mon, Jan 29, 2024 at 10:37:29AM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >> $ git diff --no-index foo bar
> >> zsh: segmentation fault (core dumped)  git diff --no-index foo bar
> >
> > Thanks for providing a simple reproduction recipe. There's a pretty
> > straight-forward fix below, though it leaves open some question of
> > whether there's another bug lurking with --no-index (but either way, I
> > think we'd want this simple fix as a first step).
> 
> Yup, I agree with you that the "--no-index" mode violates the basic
> design that "the other path" and "xfrm_msg" go hand-in-hand.  In its
> two tree comparison mode "git diff --no-index A/ B/", it should be
> able to behave sensibly, but in its two files comparison mode to
> compare plain regular files 'foo' and 'bar', there is nothing it can
> do reasonably, I am afraid.  You could say that the change is
> renaming 'foo' to create 'bar', and feed consistent data that is
> aligned with that rename to external diff, which might be slightly
> more logical than showing a change to 'foo' that has no rename
> involved (i.e. omitting "other name"), but neither is satisfying.

Yeah, I think the two-tree mode does behave correctly, and this is
really just about the two-blob mode. I agree that one could think of it
as a rename or not, depending on how much you want to read into the
importance of the names (after all, you could compare a/foo and b/foo,
which is sort of a moral equivalent of the usual two-tree case).

The current behavior is somewhere in between, though. You get an "other"
name passed to the external diff, but the metainfo argument makes no
mention of a rename (it's either blank for an exact rename, or may
contain an "index" line if there was a content change).

I'm not sure anybody really cares that much either way, though. It's
external diff, which I suspect hardly anybody uses, and those extra
fields aren't even documented in the first place.

-Peff

Junio C Hamano Jan. 30, 2024, 4:29 p.m. UTC | #3
Jeff King <peff@peff.net> writes:

> The current behavior is somewhere in between, though. You get an "other"
> name passed to the external diff, but the metainfo argument makes no
> mention of a rename (it's either blank for an exact rename, or may
> contain an "index" line if there was a content change).
>
> I'm not sure anybody really cares that much either way, though. It's
> external diff, which I suspect hardly anybody uses, and those extra
> fields aren't even documented in the first place.

Oh, we probably should fix the documentation eventually, then.

But I agree that in this case, whatever stops the segfault would be
good enough.

I am surprised to learn that this 8th hidden parameter dates back to
427dcb4b ([PATCH] Diff overhaul, adding half of copy detection.,
2005-05-21), and it is more surprising that even before it happened,
the external diff interface with 7 parameters was already
documented, which happened with 03ea2802 ([PATCH 2/2] core-git
documentation update, 2005-05-08).  Before the addition of the copy
detection, the presence of the "other" was how you learned if we saw
a rename (because there was no copy, the only reason "other" is
there was due to a rename).  With copy detection added, extra bits
of information needed to be passed and we started passing the
xfrm_msg as well through the interface.  At least, by dumping it to
the end-user, an external diff driver could help the end-user tell
if that "other" came from a rename or from a copy, even if it did
not understand it itself.

And of course, after merely 6 weeks since the inception, Git did not
have the "--no-index" mode (we did not even have a unified "git
diff" frontend), so this was never a problem back then.
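
For readers writing an external diff driver, the argument shapes discussed
in this thread can be told apart by count. The sketch below is only an
illustration under the assumption that the driver sees the 7 documented
parameters (path old-file old-hex old-mode new-file new-hex new-mode),
optionally followed by "other" and then the metainfo string; the enum and
function names are hypothetical.

```c
enum ext_diff_shape {
	EXT_DIFF_PLAIN,      /* 7 parameters: no "other" name */
	EXT_DIFF_OTHER_ONLY, /* 8 parameters: "other" without metainfo */
	EXT_DIFF_OTHER_META, /* 9 parameters: "other" plus metainfo */
	EXT_DIFF_UNEXPECTED  /* anything else is left to the caller */
};

/* argc includes argv[0] (the program name), so subtract it first. */
static enum ext_diff_shape classify_ext_diff_args(int argc)
{
	switch (argc - 1) {
	case 7:
		return EXT_DIFF_PLAIN;
	case 8:
		return EXT_DIFF_OTHER_ONLY;
	case 9:
		return EXT_DIFF_OTHER_META;
	default:
		return EXT_DIFF_UNEXPECTED;
	}
}

/* Returns the metainfo argument if it was passed, else ""; never NULL. */
static const char *ext_diff_metainfo(int argc, char **argv)
{
	if (classify_ext_diff_args(argc) == EXT_DIFF_OTHER_META)
		return argv[9];
	return "";
}
```

The 8-parameter case is the one the new test exercises: a mode-only
"--no-index" comparison passes "other" but no metainfo.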

Patch

diff --git a/diff.c b/diff.c
index a89a6a6128..ccfa1fca0d 100644
--- a/diff.c
+++ b/diff.c
@@ -4384,7 +4384,8 @@  static void run_external_diff(const char *pgm,
 		add_external_diff_name(o->repo, &cmd.args, two);
 		if (other) {
 			strvec_push(&cmd.args, other);
-			strvec_push(&cmd.args, xfrm_msg);
+			if (xfrm_msg)
+				strvec_push(&cmd.args, xfrm_msg);
 		}
 	}
 
diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index 5ce345d309..651ec77660 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -205,6 +205,18 @@  test_expect_success POSIXPERM,SYMLINKS 'diff --no-index normalizes: mode not lik
 	test_cmp expected actual
 '
 
+test_expect_success POSIXPERM 'external diff with mode-only change' '
+	echo content >not-executable &&
+	echo content >executable &&
+	chmod +x executable &&
+	echo executable executable $(test_oid zero) 100755 \
+		not-executable $(test_oid zero) 100644 not-executable \
+		>expect &&
+	test_expect_code 1 git -c diff.external=echo diff \
+		--no-index executable not-executable >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success "diff --no-index treats '-' as stdin" '
 	cat >expect <<-EOF &&
 	diff --git a/- b/a/1