[0/3] Output fixes for --remerge-diff

Message ID pull.1342.git.1661926908.gitgitgadget@gmail.com (mailing list archive)

Message

Elijah Newren via GitGitGadget Aug. 31, 2022, 6:21 a.m. UTC
Philippe Blain found and reported a couple issues with the output of
--remerge-diff[1]. After digging in, I think one of them actually counts as
two separate issues, so here's a series with three patches to fix these
issues. Each includes testcases to keep us from regressing.

Note: The issue fixed by the third commit for --remerge-diff is also an
issue exhibited by 'git log --cc $FILTER_RULES $COMMIT' (or by -c instead of
--cc). However, as far as I can tell the causes are different and come from
separate codepaths; this series focuses on --remerge-diff and hence makes no
attempt to fix independent (even if similar) --cc or -c issues.

[1]
https://lore.kernel.org/git/43cf2a1d-058a-fd79-befe-7d9bc62581ed@gmail.com/

Elijah Newren (3):
  diff: have submodule_format logic avoid additional diff headers
  diff: fix filtering of additional headers under --remerge-diff
  diff: fix filtering of merge commits under --remerge-diff

 diff.c                  |  9 +++++++--
 t/t4069-remerge-diff.sh | 30 +++++++++++++++++++++++++++++-
 2 files changed, 36 insertions(+), 3 deletions(-)


base-commit: 6c8e4ee870332d11e4bba84901654b355a9ff016
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1342%2Fnewren%2Fremerge-diff-output-fixes-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1342/newren/remerge-diff-output-fixes-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1342

Comments

Junio C Hamano Sept. 1, 2022, 1:13 a.m. UTC | #1
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Philippe Blain found and reported a couple issues with the output of
> --remerge-diff[1]. After digging in, I think one of them actually counts as
> two separate issues, so here's a series with three patches to fix these
> issues. Each includes testcases to keep us from regressing.

Including this to 'seen' seems to break the leaks-check CI job X-<.

https://github.com/git/git/runs/8124648321?check_suite_focus=true
Elijah Newren Sept. 1, 2022, 3:47 a.m. UTC | #2
On Wed, Aug 31, 2022 at 6:13 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > Philippe Blain found and reported a couple issues with the output of
> > --remerge-diff[1]. After digging in, I think one of them actually counts as
> > two separate issues, so here's a series with three patches to fix these
> > issues. Each includes testcases to keep us from regressing.
>
> Including this to 'seen' seems to break the leaks-check CI job X-<.
>
> https://github.com/git/git/runs/8124648321?check_suite_focus=true

That's...surprising.  Any chance of a mis-merge?

I ask for two reasons:
  * This series, built on main, passed the leaks-check job.
  * The link you provide points to t4069 as the test failing, but the
second patch of this series removes the TEST_PASSES_SANITIZE_LEAK=true
line from t4069, which should make that test a no-op for the
leaks-check job.
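The skip mechanism Elijah is referring to can be sketched roughly as follows (a minimal sketch with an invented function name; the real gate lives in t/test-lib.sh and its variable handling differs in detail):

```shell
# Minimal sketch of the gate described above: under the leaks-check CI
# job (which exports GIT_TEST_PASSING_SANITIZE_LEAK=true), a test script
# that does not declare TEST_PASSES_SANITIZE_LEAK=true is skipped in its
# entirety.  The function name here is invented for illustration.
maybe_skip_for_leak_check () {
	if test "$GIT_TEST_PASSING_SANITIZE_LEAK" = "true" &&
	   test "$TEST_PASSES_SANITIZE_LEAK" != "true"
	then
		echo "skipping: not marked TEST_PASSES_SANITIZE_LEAK=true"
		return 1	# caller turns this into "skip all tests"
	fi
}
```

So once the second patch drops the TEST_PASSES_SANITIZE_LEAK=true line from t4069, the whole script should be skipped by the leaks-check job, which is why a failure there was surprising.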
Elijah Newren Sept. 1, 2022, 4:01 a.m. UTC | #3
On Wed, Aug 31, 2022 at 8:47 PM Elijah Newren <newren@gmail.com> wrote:
>
> On Wed, Aug 31, 2022 at 6:13 PM Junio C Hamano <gitster@pobox.com> wrote:
> >
> > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> >
> > > Philippe Blain found and reported a couple issues with the output of
> > > --remerge-diff[1]. After digging in, I think one of them actually counts as
> > > two separate issues, so here's a series with three patches to fix these
> > > issues. Each includes testcases to keep us from regressing.
> >
> > Including this to 'seen' seems to break the leaks-check CI job X-<.
> >
> > https://github.com/git/git/runs/8124648321?check_suite_focus=true
>
> That's...surprising.  Any chance of a mis-merge?
>
> I ask for two reasons:
>   * This series, built on main, passed the leaks-check job.
>   * The link you provide points to t4069 as the test failing, but the
> second patch of this series removes the TEST_PASSES_SANITIZE_LEAK=true
> line from t4069, which should make that test a no-op for the
> leaks-check job.

Actually, this looks like not a mis-merge but some kind of faulty `git am`
application.  The merge in question isn't available for me to fetch,
but clicking through the UI from the link you provided eventually leads
me to:

    https://github.com/git/git/commit/81f120208d02afee71543d4f588b471950f156f2

which does NOT match what I submitted:

    https://lore.kernel.org/git/feac97494600e522125b7bb202f4dc5ca020ca99.1661926908.git.gitgitgadget@gmail.com/

It's close, but despite still including this part of my commit message:

"""
This also removes the TEST_PASSES_SANITIZE_LEAK=true declaration from
t4069, as there is apparently some kind of memory leak with the pickaxe
code.
"""

it's missing this part of the diff:

diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
index e3e6fbd97b2..95a16d19aec 100755
--- a/t/t4069-remerge-diff.sh
+++ b/t/t4069-remerge-diff.sh
@@ -2,7 +2,6 @@

 test_description='remerge-diff handling'

-TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh

 # This test is ort-specific


That part of the diff is important.  I did not add any leaks to the
code (I did run the leaks-checking job and looked through the output
to verify that none of them involved any codepath I added or
modified), but I did add some test code which exercises pre-existing
memory leaks, and testing those codepaths is critical to verify I got
the appropriate fixes.  Any idea what happened here?

Either way, I'm going to resubmit the series due to your other
suggestion.  So long as the unfortunate munging doesn't occur again,
things should be fine if you just take the new series.
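One quick way to check for this kind of munging, sketched here against a throwaway repository (the lore URL and the applied commit from above would be the real inputs), is to compare stable patch IDs of the patch as posted and the commit as applied:

```shell
# Hedged sketch: `git patch-id --stable` fingerprints a diff
# independently of hunk offsets and whitespace, so comparing the ID of
# the posted patch with `git show <applied-commit> | git patch-id`
# exposes am-time mangling like that described above.  The repository
# here is a stand-in for a real checkout.
git init -q demo && cd demo
git config user.name tester && git config user.email tester@example.com
git commit -q --allow-empty -m base
echo one >file && git add file && git commit -q -m "add file"
git format-patch -1 --stdout >submitted.patch          # what was posted
posted=$(git patch-id --stable <submitted.patch | cut -d' ' -f1)
applied=$(git show HEAD | git patch-id --stable | cut -d' ' -f1)
test "$posted" = "$applied" && echo "patch applied intact"
```

Differing IDs would confirm that the applied commit diverged from the submission.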
Junio C Hamano Sept. 1, 2022, 3:24 p.m. UTC | #4
Elijah Newren <newren@gmail.com> writes:

> On Wed, Aug 31, 2022 at 6:13 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>> > Philippe Blain found and reported a couple issues with the output of
>> > --remerge-diff[1]. After digging in, I think one of them actually counts as
>> > two separate issues, so here's a series with three patches to fix these
>> > issues. Each includes testcases to keep us from regressing.
>>
>> Including this to 'seen' seems to break the leaks-check CI job X-<.
>>
>> https://github.com/git/git/runs/8124648321?check_suite_focus=true
>
> That's...surprising.  Any chance of a mis-merge?
>
> I ask for two reasons:
>   * This series, built on main, passed the leaks-check job.

Ah, that.

Yes, I did rebase it to 'maint' to be nice to our users as this is
not a new feature development but a bugfix or two.

This is why I hate the leak-check CI job (yes, I do help maintain
all parts of the tree, but it does not mean I have to love every bit
of the codebase, and this is one of the things I love to hate).

Instead of saying "subcommand X with feature Y? It ought to be clean,
so complain if the leak checker finds something. Subcommand Z? It is
known to be unclean, so do not bother", it says "In this test in its
entirety, we currently happen to use only the ones that are clean"
and penalizes developers who want to use an unclean tool merely for
checking.  The approach is fundamentally flawed and does not play
well with multiple integration branches, just like we saw here.
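The per-invocation alternative Junio describes could look something like the sketch below (all names invented for illustration): a table of commands believed leak-free, consulted for each command the test runs rather than once per test script.

```shell
# Illustrative sketch of the per-invocation idea: keep a list of
# commands believed leak-free, and only treat a leak-checker report as
# a CI failure when it comes from one of them.  The list and function
# name are invented, not taken from the git tree.
leak_free_commands="diff status rev-parse"

leak_report_is_failure () {
	for clean in $leak_free_commands
	do
		test "$1" = "$clean" && return 0	# known clean: fail the job
	done
	return 1	# known leaky: log the report, keep going
}
```

An unclean command could then still be exercised "merely for checking" without its known leaks turning into job failures.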
Ævar Arnfjörð Bjarmason Sept. 1, 2022, 6:46 p.m. UTC | #5
On Thu, Sep 01 2022, Junio C Hamano wrote:

> Elijah Newren <newren@gmail.com> writes:
>
>> On Wed, Aug 31, 2022 at 6:13 PM Junio C Hamano <gitster@pobox.com> wrote:
>>>
>>> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>>
>>> > Philippe Blain found and reported a couple issues with the output of
>>> > --remerge-diff[1]. After digging in, I think one of them actually counts as
>>> > two separate issues, so here's a series with three patches to fix these
>>> > issues. Each includes testcases to keep us from regressing.
>>>
>>> Including this to 'seen' seems to break the leaks-check CI job X-<.
>>>
>>> https://github.com/git/git/runs/8124648321?check_suite_focus=true
>>
>> That's...surprising.  Any chance of a mis-merge?
>>
>> I ask for two reasons:
>>   * This series, built on main, passed the leaks-check job.
>
> Ah, that.
>
> Yes, I did rebase it to 'maint' to be nice to our users as this is
> not a new feature development but a bugfix or two.
>
> This is why I hate the leak-check CI job (yes, I do help maintain
> all parts of the tree, but it does not mean I have to love every bit
> of the codebase, and this is one of the things I love to hate).
>
> Instead of saying "subcommand X with feature Y? It ought to be clean,
> so complain if the leak checker finds something. Subcommand Z? It is
> known to be unclean, so do not bother", it says "In this test in its
> entirety, we currently happen to use only the ones that are clean"
> and penalizes developers who want to use an unclean tool merely for
> checking.  The approach is fundamentally flawed and does not play
> well with multiple integration branches, just like we saw here.

We've discussed doing it that way before. I wouldn't be fundamentally
opposed, but I do think we're far enough along the way to being
leak-free that we'd want to mark more than just a "top-level" command as
leak-free.

It's also just not the case that we could even do that in all but the
most trivial cases. Most commands still leak somewhere in some obscure
case, but we now have entire tests where the code they run in the
common cases doesn't leak.

However, in this case this seems to just be a case that Elijah tested
his code on base X, and you applied it on base Y.

I don't really see how the approach you're suggesting would be any more
likely to be resilient in the face of that. Then we'd presumably use
some command that's leak-free on "master", but that command wouldn't be
leak-free on "maint".
Junio C Hamano Sept. 1, 2022, 7:54 p.m. UTC | #6
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> We've discussed doing it that way before. I wouldn't be fundamentally
> opposed, but I do think we're far enough along the way to being
> leak-free that we'd want to mark more than just a "top-level" command as
> leak-free.

Two things.  Seeing the leak-check breakage quite often, I doubt how
much trust I can place in your "far enough along the way" statement.
Also, I do not think we thought the alternative was to mark only the
top-level.  The test could inspect the crash after the fact and say
"ah, allocations made by xcalloc() called from this and that function
are still known to be leaky, so do not stop and mark the CI job a
failure for this one", for example?
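Something close to that after-the-fact filtering already exists in LeakSanitizer's stock suppression mechanism, which ignores any leak whose allocation stack contains a listed function.  A hedged sketch (the function names below are placeholders, not an audited list of known-leaky call sites):

```shell
# Hedged sketch: an LSan suppression file maps fairly directly onto
# "allocations from this and that function are still known to be leaky,
# so do not fail the job for them".  The entries are placeholders.
cat >known-leaks.supp <<\EOF
leak:diffcore_pickaxe
leak:parse_pathspec
EOF
# A SANITIZE=leak test run would then pick the file up via LSAN_OPTIONS:
#   LSAN_OPTIONS=suppressions=$PWD/known-leaks.supp ./t4069-remerge-diff.sh
```

Suppressed leaks are still summarized in the sanitizer's output, so they remain visible without stopping the run.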