diff mbox series

diff-tree: fix crash when used with --remerge-diff

Message ID pull.1771.git.1723123250958.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series diff-tree: fix crash when used with --remerge-diff | expand

Commit Message

Xing Xin Aug. 8, 2024, 1:20 p.m. UTC
From: Xing Xin <xingxin.xx@bytedance.com>

When using "git-diff-tree" to get the tree diff for merge commits with
the diff format set to `remerge`, a bug is triggered as shown below:

    $ git diff-tree -r --remerge-diff 363337e6eb
    363337e6eb812d0c0d785ed4261544f35559ff8b
    BUG: log-tree.c:1006: did a remerge diff without remerge_objdir?!?

This bug is reported by `log-tree.c:do_remerge_diff`, where a bug check
added in commit 7b90ab467a (log: clean unneeded objects during log
--remerge-diff, 2022-02-02) detects the absence of `remerge_objdir` when
attempting to clean up temporary objects generated during the remerge
process.

After some further digging, I find that the remerge-related diff options
were introduced in db757e8b8d (show, log: provide a --remerge-diff
capability, 2022-02-02), which also affect the setup of `rev_info` for
"git-diff-tree", but were not accounted for in the original
implementation (inferred from the commit message).

This commit fixes the bug by adding initialization logic for
`remerge_objdir` in `builtin/diff-tree.c`, mirroring the logic in
`builtin/log.c:cmd_log_walk_no_free`. A final cleanup for
`remerge_objdir` is also included.

Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
---
    diff-tree: fix crash when used with --remerge-diff

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1771%2Fblanet%2Fxx%2Ffix-diff-tree-crash-on-remerge-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1771/blanet/xx/fix-diff-tree-crash-on-remerge-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1771

 builtin/diff-tree.c     | 13 +++++++++++++
 t/t4069-remerge-diff.sh | 35 +++++++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)


base-commit: 406f326d271e0bacecdb00425422c5fa3f314930

Comments

Elijah Newren Aug. 8, 2024, 4:03 p.m. UTC | #1
On Thu, Aug 8, 2024 at 6:20 AM blanet via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Xing Xin <xingxin.xx@bytedance.com>
>
> When using "git-diff-tree" to get the tree diff for merge commits with
> the diff format set to `remerge`, a bug is triggered as shown below:
>
>     $ git diff-tree -r --remerge-diff 363337e6eb
>     363337e6eb812d0c0d785ed4261544f35559ff8b
>     BUG: log-tree.c:1006: did a remerge diff without remerge_objdir?!?

Wow, this bug is around for 2.5 years, and then we both independently
notice and fix it within 3 weeks of each other:
https://github.com/git/git/commit/e5890667c7598e813edee0ac4e76d6e3cdd525ec

My patch is incomplete as it's missing a testcase, and you submitted
first, so let's stick with your fix, though.

> This bug is reported by `log-tree.c:do_remerge_diff`, where a bug check
> added in commit 7b90ab467a (log: clean unneeded objects during log
> --remerge-diff, 2022-02-02) detects the absence of `remerge_objdir` when
> attempting to clean up temporary objects generated during the remerge
> process.
>
> After some further digging, I find that the remerge-related diff options
> were introduced in db757e8b8d (show, log: provide a --remerge-diff
> capability, 2022-02-02), which also affect the setup of `rev_info` for
> "git-diff-tree", but were not accounted for in the original
> implementation (inferred from the commit message).
>
> This commit fixes the bug by adding initialization logic for
> `remerge_objdir` in `builtin/diff-tree.c`, mirroring the logic in
> `builtin/log.c:cmd_log_walk_no_free`. A final cleanup for
> `remerge_objdir` is also included.

The commit message from my patch also included an explanation for why
diff-tree was the only caller that was missing the necessary logic
(see the last paragraph, which kind of references the one before it as
well).

> Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
> ---
>     diff-tree: fix crash when used with --remerge-diff
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1771%2Fblanet%2Fxx%2Ffix-diff-tree-crash-on-remerge-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1771/blanet/xx/fix-diff-tree-crash-on-remerge-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1771
>
>  builtin/diff-tree.c     | 13 +++++++++++++
>  t/t4069-remerge-diff.sh | 35 +++++++++++++++++++++++++++++++++++
>  2 files changed, 48 insertions(+)
>
> diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
> index 0d3c611aac0..813be486dad 100644
> --- a/builtin/diff-tree.c
> +++ b/builtin/diff-tree.c
> @@ -9,6 +9,7 @@
>  #include "repository.h"
>  #include "revision.h"
>  #include "tree.h"
> +#include "tmp-objdir.h"

The includes other than this one are in alphabetical order; can you
move this a line before?

Also, as an aside, folks in this project often just put includes at
the end, but I think it's a bad practice.  Whenever someone needs to
backport fixes or merge separate patch topics into seen/next/etc. or
even merge not-yet-upstream topics with newer upstream versions, this
practice increases the odds of unnecessary conflicts.  And it makes it
harder for the next person who comes along to spot whether a header is
already included (and sometimes leaves us including headers twice).
While each case is a small amount of toil so we tend to overlook it,
it's totally unnecessary toil in many cases.  Putting includes in
alphabetical order (other than the one include required to be first,
git-compat-util.h or its documented stand-ins) can often remove this
unnecessary toil.  Anyway, thanks for letting me vent.  :-)

>  static struct rev_info log_tree_opt;
>
> @@ -166,6 +167,13 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix)
>
>         opt->diffopt.rotate_to_strict = 1;
>
> +       if (opt->remerge_diff) {
> +               opt->remerge_objdir = tmp_objdir_create("remerge-diff");
> +               if (!opt->remerge_objdir)
> +                       die(_("unable to create temporary object directory"));
> +               tmp_objdir_replace_primary_odb(opt->remerge_objdir, 1);
> +       }
> +
>         /*
>          * NOTE!  We expect "a..b" to expand to "^a b" but it is
>          * perfectly valid for revision range parser to yield "b ^a",
> @@ -230,5 +238,10 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix)
>                 diff_free(&opt->diffopt);
>         }
>
> +       if (opt->remerge_diff) {
> +               tmp_objdir_destroy(opt->remerge_objdir);
> +               opt->remerge_objdir = NULL;
> +       }
> +
>         return diff_result_code(&opt->diffopt);
>  }

Your fix exactly matches mine, other than the header include location.

> diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
> index 07323ebafe0..ca8f999caba 100755
> --- a/t/t4069-remerge-diff.sh
> +++ b/t/t4069-remerge-diff.sh
> @@ -110,6 +110,41 @@ test_expect_success 'can filter out additional headers with pickaxe' '
>         test_must_be_empty actual
>  '
>
> +test_expect_success 'remerge-diff also works for git-diff-tree' '
> +       # With a clean merge
> +       git diff-tree -r -p --remerge-diff --no-commit-id bc_resolution >actual &&
> +       test_must_be_empty actual &&
> +
> +       # With both a resolved conflict and an unrelated change
> +       cat <<-EOF >tmp &&
> +       diff --git a/numbers b/numbers
> +       remerge CONFLICT (content): Merge conflict in numbers
> +       index a1fb731..6875544 100644
> +       --- a/numbers
> +       +++ b/numbers
> +       @@ -1,13 +1,9 @@
> +        1
> +        2
> +       -<<<<<<< b0ed5cb (change_a)
> +       -three
> +       -=======
> +       -tres
> +       ->>>>>>> 6cd3f82 (change_b)
> +       +drei
> +        4
> +        5
> +        6
> +        7
> +       -eight
> +       +acht
> +        9
> +       EOF
> +       sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
> +       git diff-tree -r -p --remerge-diff --no-commit-id ab_resolution >tmp &&
> +       sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
> +       test_cmp expect actual
> +'
> +
>  test_expect_success 'setup non-content conflicts' '
>         git switch --orphan base &&

Test looks good too.

I'll be happy to add my Reviewed-by if you fix the header include order.
Junio C Hamano Aug. 8, 2024, 5:24 p.m. UTC | #2
Elijah Newren <newren@gmail.com> writes:

> The commit message from my patch also included an explanation for why
> diff-tree was the only caller that was missing the necessary logic
> (see the last paragraph, which kind of references the one before it as
> well).

... which we may want to resurrect.

> Test looks good too.
>
> I'll be happy to add my Reviewed-by if you fix the header include order.

Thanks for a review.
Xing Xin Aug. 9, 2024, 7:25 a.m. UTC | #3
At 2024-08-09 00:03:53, "Elijah Newren" <newren@gmail.com> wrote:
>On Thu, Aug 8, 2024 at 6:20 AM blanet via GitGitGadget
><gitgitgadget@gmail.com> wrote:
>>
>> From: Xing Xin <xingxin.xx@bytedance.com>
>>
>> When using "git-diff-tree" to get the tree diff for merge commits with
>> the diff format set to `remerge`, a bug is triggered as shown below:
>>
>>     $ git diff-tree -r --remerge-diff 363337e6eb
>>     363337e6eb812d0c0d785ed4261544f35559ff8b
>>     BUG: log-tree.c:1006: did a remerge diff without remerge_objdir?!?
>
>Wow, this bug is around for 2.5 years, and then we both independently
>notice and fix it within 3 weeks of each other:
>https://github.com/git/git/commit/e5890667c7598e813edee0ac4e76d6e3cdd525ec
>
>My patch is incomplete as it's missing a testcase, and you submitted
>first, so let's stick with your fix, though.

Wow, such an interesting coincidence! And thanks for your quick reply!

>> This bug is reported by `log-tree.c:do_remerge_diff`, where a bug check
>> added in commit 7b90ab467a (log: clean unneeded objects during log
>> --remerge-diff, 2022-02-02) detects the absence of `remerge_objdir` when
>> attempting to clean up temporary objects generated during the remerge
>> process.
>>
>> After some further digging, I find that the remerge-related diff options
>> were introduced in db757e8b8d (show, log: provide a --remerge-diff
>> capability, 2022-02-02), which also affect the setup of `rev_info` for
>> "git-diff-tree", but were not accounted for in the original
>> implementation (inferred from the commit message).
>>
>> This commit fixes the bug by adding initialization logic for
>> `remerge_objdir` in `builtin/diff-tree.c`, mirroring the logic in
>> `builtin/log.c:cmd_log_walk_no_free`. A final cleanup for
>> `remerge_objdir` is also included.
>
>The commit message from my patch also included an explanation for why
>diff-tree was the only caller that was missing the necessary logic
>(see the last paragraph, which kind of references the one before it as
>well).

Your explanations better illustrate the impact of this bug,  I've quoted them
in the new patch commit message.

[snip]

>> diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
>> index 0d3c611aac0..813be486dad 100644
>> --- a/builtin/diff-tree.c
>> +++ b/builtin/diff-tree.c
>> @@ -9,6 +9,7 @@
>>  #include "repository.h"
>>  #include "revision.h"
>>  #include "tree.h"
>> +#include "tmp-objdir.h"
>
>The includes other than this one are in alphabetical order; can you
>move this a line before?
>
>Also, as an aside, folks in this project often just put includes at
>the end, but I think it's a bad practice.  Whenever someone needs to
>backport fixes or merge separate patch topics into seen/next/etc. or
>even merge not-yet-upstream topics with newer upstream versions, this
>practice increases the odds of unnecessary conflicts.  And it makes it
>harder for the next person who comes along to spot whether a header is
>already included (and sometimes leaves us including headers twice).
>While each case is a small amount of toil so we tend to overlook it,
>it's totally unnecessary toil in many cases.  Putting includes in
>alphabetical order (other than the one include required to be first,
>git-compat-util.h or its documented stand-ins) can often remove this
>unnecessary toil.  Anyway, thanks for letting me vent.  :-)

Noted. I'll move the new include to the correct position. Thanks for the
guidance!

>>  static struct rev_info log_tree_opt;
>>
>> @@ -166,6 +167,13 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix)
>>
>>         opt->diffopt.rotate_to_strict = 1;
>>
>> +       if (opt->remerge_diff) {
>> +               opt->remerge_objdir = tmp_objdir_create("remerge-diff");
>> +               if (!opt->remerge_objdir)
>> +                       die(_("unable to create temporary object directory"));
>> +               tmp_objdir_replace_primary_odb(opt->remerge_objdir, 1);
>> +       }
>> +
>>         /*
>>          * NOTE!  We expect "a..b" to expand to "^a b" but it is
>>          * perfectly valid for revision range parser to yield "b ^a",
>> @@ -230,5 +238,10 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix)
>>                 diff_free(&opt->diffopt);
>>         }
>>
>> +       if (opt->remerge_diff) {
>> +               tmp_objdir_destroy(opt->remerge_objdir);
>> +               opt->remerge_objdir = NULL;
>> +       }
>> +
>>         return diff_result_code(&opt->diffopt);
>>  }
>
>Your fix exactly matches mine, other than the header include location.

High five! :-)

[snip]

Xing Xin
diff mbox series

Patch

diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
index 0d3c611aac0..813be486dad 100644
--- a/builtin/diff-tree.c
+++ b/builtin/diff-tree.c
@@ -9,6 +9,7 @@ 
 #include "repository.h"
 #include "revision.h"
 #include "tree.h"
+#include "tmp-objdir.h"
 
 static struct rev_info log_tree_opt;
 
@@ -166,6 +167,13 @@  int cmd_diff_tree(int argc, const char **argv, const char *prefix)
 
 	opt->diffopt.rotate_to_strict = 1;
 
+	if (opt->remerge_diff) {
+		opt->remerge_objdir = tmp_objdir_create("remerge-diff");
+		if (!opt->remerge_objdir)
+			die(_("unable to create temporary object directory"));
+		tmp_objdir_replace_primary_odb(opt->remerge_objdir, 1);
+	}
+
 	/*
 	 * NOTE!  We expect "a..b" to expand to "^a b" but it is
 	 * perfectly valid for revision range parser to yield "b ^a",
@@ -230,5 +238,10 @@  int cmd_diff_tree(int argc, const char **argv, const char *prefix)
 		diff_free(&opt->diffopt);
 	}
 
+	if (opt->remerge_diff) {
+		tmp_objdir_destroy(opt->remerge_objdir);
+		opt->remerge_objdir = NULL;
+	}
+
 	return diff_result_code(&opt->diffopt);
 }
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
index 07323ebafe0..ca8f999caba 100755
--- a/t/t4069-remerge-diff.sh
+++ b/t/t4069-remerge-diff.sh
@@ -110,6 +110,41 @@  test_expect_success 'can filter out additional headers with pickaxe' '
 	test_must_be_empty actual
 '
 
+test_expect_success 'remerge-diff also works for git-diff-tree' '
+	# With a clean merge
+	git diff-tree -r -p --remerge-diff --no-commit-id bc_resolution >actual &&
+	test_must_be_empty actual &&
+
+	# With both a resolved conflict and an unrelated change
+	cat <<-EOF >tmp &&
+	diff --git a/numbers b/numbers
+	remerge CONFLICT (content): Merge conflict in numbers
+	index a1fb731..6875544 100644
+	--- a/numbers
+	+++ b/numbers
+	@@ -1,13 +1,9 @@
+	 1
+	 2
+	-<<<<<<< b0ed5cb (change_a)
+	-three
+	-=======
+	-tres
+	->>>>>>> 6cd3f82 (change_b)
+	+drei
+	 4
+	 5
+	 6
+	 7
+	-eight
+	+acht
+	 9
+	EOF
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+	git diff-tree -r -p --remerge-diff --no-commit-id ab_resolution >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'setup non-content conflicts' '
 	git switch --orphan base &&