diff mbox series

[v3,GSOC,RFC] format-patch: pass --left-only to range-diff

Message ID pull.898.v3.git.1615285726482.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series [v3,GSOC,RFC] format-patch: pass --left-only to range-diff | expand

Commit Message

ZheNing Hu March 9, 2021, 10:28 a.m. UTC
From: ZheNing Hu <adlternative@gmail.com>

In https://lore.kernel.org/git/YBx5rmVsg1LJhSKN@nand.local/,
Taylor Blau proposing `git format-patch --cover-letter
--range-diff` may mistakenly place upstream commit in the
range-diff output. Teach `format-patch` pass `--left-only`
to range-diff,can avoid this kind of mistake.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
    [GSOC][RFC] format-patch: pass --left-only to range-diff
    
    With the help of Taylor Blau, I understood why the upstream commit
    appeared in the range-diff, and completed the writing of the test.
    
    this want to fix #876 Thanks.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-898%2Fadlternative%2Fformat-patch-range-diff-right-only-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-898/adlternative/format-patch-range-diff-right-only-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/898

Range-diff vs v2:

 1:  8daffd4f7546 ! 1:  5c58eb186d41 [GSOC][RFC] format-patch: pass --left-only to range-diff
     @@ builtin/log.c: int cmd_format_patch(int argc, const char **argv, const char *pre
       			    N_("percentage by which creation is weighted")),
       		OPT_END()
      @@ builtin/log.c: int cmd_format_patch(int argc, const char **argv, const char *prefix)
     + 					     _("Interdiff against v%d:"));
     + 	}
       
     - 	if (creation_factor < 0)
     - 		creation_factor = RANGE_DIFF_CREATION_FACTOR_DEFAULT;
     --	else if (!rdiff_prev)
     --		die(_("--creation-factor requires --range-diff"));
     --
     -+	else if (!rdiff_prev) {
     ++	if (!rdiff_prev) {
      +		if (creation_factor >= 0)
      +			die(_("--creation-factor requires --range-diff"));
      +		if (left_only)
      +			die(_("--left-only requires --range-diff"));
      +	}
     ++
     + 	if (creation_factor < 0)
     + 		creation_factor = RANGE_DIFF_CREATION_FACTOR_DEFAULT;
     +-	else if (!rdiff_prev)
     +-		die(_("--creation-factor requires --range-diff"));
     + 
       	if (rdiff_prev) {
       		if (!cover_letter && total != 1)
     - 			die(_("--range-diff requires --cover-letter or single patch"));
      @@ builtin/log.c: int cmd_format_patch(int argc, const char **argv, const char *prefix)
       		if (thread)
       			gen_message_id(&rev, "cover");
       		make_cover_letter(&rev, !!output_directory,
      -				  origin, nr, list, branch_name, quiet);
      +				  origin, nr, list, branch_name, quiet,
     -+					left_only);
     ++				  left_only);
       		print_bases(&bases, rev.diffopt.file);
       		print_signature(rev.diffopt.file);
       		total++;
     @@ t/t3206-range-diff.sh: test_expect_success '--left-only/--right-only' '
      +	git rebase $base --onto main &&
      +	tip="$(git rev-parse my-feature)" &&
      +	git format-patch --range-diff $base $old $tip --cover-letter  &&
     -+	grep  "> 1: .* feature$"  0000-cover-letter.patch &&
     ++	grep "> 1: .* feature$" 0000-cover-letter.patch &&
      +	git format-patch --range-diff $base $old $tip --left-only --cover-letter &&
     -+	! grep  "> 1: .* feature$"  0000-cover-letter.patch
     ++	! grep "> 1: .* feature$" 0000-cover-letter.patch
      +'
     -+
      +
       test_done


 Documentation/git-format-patch.txt |  5 ++++-
 builtin/log.c                      | 20 +++++++++++++++-----
 t/t3206-range-diff.sh              | 27 +++++++++++++++++++++++++++
 3 files changed, 46 insertions(+), 6 deletions(-)


base-commit: be7935ed8bff19f481b033d0d242c5d5f239ed50

Comments

Junio C Hamano March 12, 2021, 10:50 p.m. UTC | #1
"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: ZheNing Hu <adlternative@gmail.com>
>
> In https://lore.kernel.org/git/YBx5rmVsg1LJhSKN@nand.local/,
> Taylor Blau proposing `git format-patch --cover-letter
> --range-diff` may mistakenly place upstream commit in the
> range-diff output. Teach `format-patch` pass `--left-only`
> to range-diff,can avoid this kind of mistake.

The above is a bit too dense for average readers to grok.  Even if
the readers refer to the external reference, it is unclear where the
"may mistakenly" can come from and why "--left-only" would be
useful (and our log message should not depend on external material
so heavily to begin with).

So let's think aloud to see what use case this may be helpful, and
how the proposed solution makes the world a better place.

If I understand correctly, the use case this tries to help is this:

 * You had sent the v1 iteration of topic.  It was in the range
   B1..T1 where B1 is the tip of the integration branch (like
   'master') from the upstream.

 * To prepare for the v2 iteration, not only you updated individual
   commits, you rebased the series on a new upstream.  Now the topic
   is in the range B2..T2, where B2 is the tip of the integration
   branch from the upstream, and it is very likely that B2 is a
   descendant of B1.

And you want to find out how your commits in T2 (new iteration)
compares with those in T1 (old iteration).  Normally,

    $ git range-diff T1...T2

would be the shortest-to-type and correct version but that is
invalidated because you rebased.

    ---o---B1--b---b---b---B2
            \               \
	     t---t---T1      s---s---s---T2

You'd have commits B1..T1 on the left hand side of the range-diff,
while the right hand side has not just B2..T2 but also commits in
the range B1..B2, too.

By using --left-only (i.e. show only those pair that maps from
commits in the left range), you can exclude the commits in the
B1..B2.

    $ git range-diff --left-only T1...T2

I however wonder what --left-only (Suppress commits that are missing
from the first range) would do to commits in range B2..T2 (they are
all yours) that are (1) added since the v1 iteration, or (2)
modified so drastically that no matching commit is found.  With the
right invocation, of course,

    $ git range-diff B1..T1 B2..T2

you would not have such a problem.  If 2 't's in B1..T1 correspond
to 2 of the 3 's's in B2..T2, at least the presense of the third 's'
that did not match would show up in the output, making it clear that
you have one more commit relative to the earlier iteration.  If use
of --left-only filters it out, the output may be misleading to the
readers, no?

I started writing (or "thinking aloud") hoping that I can help
coming up with a better log message to describe the problem being
solved, but I ended up with "does this make the system better?"
ZheNing Hu March 13, 2021, 4:01 a.m. UTC | #2
Junio C Hamano <gitster@pobox.com> 于2021年3月13日周六 上午6:51写道:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > In https://lore.kernel.org/git/YBx5rmVsg1LJhSKN@nand.local/,
> > Taylor Blau proposing `git format-patch --cover-letter
> > --range-diff` may mistakenly place upstream commit in the
> > range-diff output. Teach `format-patch` pass `--left-only`
> > to range-diff,can avoid this kind of mistake.
>
> The above is a bit too dense for average readers to grok.  Even if
> the readers refer to the external reference, it is unclear where the
> "may mistakenly" can come from and why "--left-only" would be
> useful (and our log message should not depend on external material
> so heavily to begin with).
>

You are right, commit information with the original thread link may make
it difficult for readers to read. I will pay attention.

> So let's think aloud to see what use case this may be helpful, and
> how the proposed solution makes the world a better place.
>
> If I understand correctly, the use case this tries to help is this:
>
>  * You had sent the v1 iteration of topic.  It was in the range
>    B1..T1 where B1 is the tip of the integration branch (like
>    'master') from the upstream.
>
>  * To prepare for the v2 iteration, not only you updated individual
>    commits, you rebased the series on a new upstream.  Now the topic
>    is in the range B2..T2, where B2 is the tip of the integration
>    branch from the upstream, and it is very likely that B2 is a
>    descendant of B1.
>
> And you want to find out how your commits in T2 (new iteration)
> compares with those in T1 (old iteration).  Normally,
>
>     $ git range-diff T1...T2
>
> would be the shortest-to-type and correct version but that is
> invalidated because you rebased.
>
>     ---o---B1--b---b---b---B2
>             \               \
>              t---t---T1      s---s---s---T2
>
> You'd have commits B1..T1 on the left hand side of the range-diff,
> while the right hand side has not just B2..T2 but also commits in
> the range B1..B2, too.
>
> By using --left-only (i.e. show only those pair that maps from
> commits in the left range), you can exclude the commits in the
> B1..B2.
>
>     $ git range-diff --left-only T1...T2
>
> I however wonder what --left-only (Suppress commits that are missing
> from the first range) would do to commits in range B2..T2 (they are
> all yours) that are (1) added since the v1 iteration, or (2)
> modified so drastically that no matching commit is found.  With the
> right invocation, of course,
>
>     $ git range-diff B1..T1 B2..T2
>
> you would not have such a problem.  If 2 't's in B1..T1 correspond
> to 2 of the 3 's's in B2..T2, at least the presense of the third 's'
> that did not match would show up in the output, making it clear that
> you have one more commit relative to the earlier iteration.  If use
> of --left-only filters it out, the output may be misleading to the
> readers, no?
>
> I started writing (or "thinking aloud") hoping that I can help
> coming up with a better log message to describe the problem being
> solved, but I ended up with "does this make the system better?"

Junio, thank you for elaborating this issue in detail and clearly.
I probably understand what you mean by "git range-diff B1..T1 B2..T2"
 to correctly output the commits on my two version topic branch, without
including the upstream commits of B1..B2.So we don’t even need to specify
the `--left-only` to avoid the output of B1...B2, right?

The only thing I can think of now is that if users tend to use T1...T2
to compare
 the differences between the two topics, will the upstream commit in
B1...B2 appear
more abrupt?

Thanks.
Junio C Hamano March 13, 2021, 11:23 p.m. UTC | #3
ZheNing Hu <adlternative@gmail.com> writes:

> Junio C Hamano <gitster@pobox.com> 于2021年3月13日周六 上午6:51写道:
>>
>> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>> > From: ZheNing Hu <adlternative@gmail.com>
>> >
>> > In https://lore.kernel.org/git/YBx5rmVsg1LJhSKN@nand.local/,
>> > Taylor Blau proposing `git format-patch --cover-letter
>> > --range-diff` may mistakenly place upstream commit in the
>> > range-diff output. Teach `format-patch` pass `--left-only`
>> > to range-diff,can avoid this kind of mistake.
>>
>> The above is a bit too dense for average readers to grok.  Even if
>> the readers refer to the external reference, it is unclear where the
>> "may mistakenly" can come from and why "--left-only" would be
>> useful (and our log message should not depend on external material
>> so heavily to begin with).
>>
>
> You are right, commit information with the original thread link may make
> it difficult for readers to read. I will pay attention.
>
>> So let's think aloud to see what use case this may be helpful, and
>> how the proposed solution makes the world a better place.
>>
>> If I understand correctly, the use case this tries to help is this:
>>
>>  * You had sent the v1 iteration of topic.  It was in the range
>>    B1..T1 where B1 is the tip of the integration branch (like
>>    'master') from the upstream.
>>
>>  * To prepare for the v2 iteration, not only you updated individual
>>    commits, you rebased the series on a new upstream.  Now the topic
>>    is in the range B2..T2, where B2 is the tip of the integration
>>    branch from the upstream, and it is very likely that B2 is a
>>    descendant of B1.
>>
>> And you want to find out how your commits in T2 (new iteration)
>> compares with those in T1 (old iteration).  Normally,
>>
>>     $ git range-diff T1...T2
>>
>> would be the shortest-to-type and correct version but that is
>> invalidated because you rebased.
>>
>>     ---o---B1--b---b---b---B2
>>             \               \
>>              t---t---T1      s---s---s---T2
>>
>> You'd have commits B1..T1 on the left hand side of the range-diff,
>> while the right hand side has not just B2..T2 but also commits in
>> the range B1..B2, too.
>>
>> By using --left-only (i.e. show only those pair that maps from
>> commits in the left range), you can exclude the commits in the
>> B1..B2.
>>
>>     $ git range-diff --left-only T1...T2
>>
>> I however wonder what --left-only (Suppress commits that are missing
>> from the first range) would do to commits in range B2..T2 (they are
>> all yours) that are (1) added since the v1 iteration, or (2)
>> modified so drastically that no matching commit is found.  With the
>> right invocation, of course,
>>
>>     $ git range-diff B1..T1 B2..T2
>>
>> you would not have such a problem.  If 2 't's in B1..T1 correspond
>> to 2 of the 3 's's in B2..T2, at least the presense of the third 's'
>> that did not match would show up in the output, making it clear that
>> you have one more commit relative to the earlier iteration.  If use
>> of --left-only filters it out, the output may be misleading to the
>> readers, no?
>>
>> I started writing (or "thinking aloud") hoping that I can help
>> coming up with a better log message to describe the problem being
>> solved, but I ended up with "does this make the system better?"
>
> Junio, thank you for elaborating this issue in detail and clearly.
> I probably understand what you mean by "git range-diff B1..T1 B2..T2"
>  to correctly output the commits on my two version topic branch, without
> including the upstream commits of B1..B2.So we don’t even need to specify
> the `--left-only` to avoid the output of B1...B2, right?
>
> The only thing I can think of now is that if users tend to use T1...T2
> to compare
>  the differences between the two topics, will the upstream commit in
> B1...B2 appear
> more abrupt?

Yes, it would be, but that is why you need to educate users what
causes it, and what the right way to avoid unrelated commits from
appearing, and how this --left-only fits in the solution.

If some of the time, "--left-only T1...T2" would give you the same
result as the more strict "B1..T1 B2..T2", that may be why users may
want to use the "--left-only" instead as an easy/lazy alternative.

But I suspect that it would give an incorrect result some of the
time---for example, in the above example, wouldn't one of the
commits labeled as 's' be completely hidden?  And if that is the
case, the end-user documentation would need to warn about it, and
explain that it is a easy/lazy alternative that can produce
incorrect result in the log message.
ZheNing Hu March 14, 2021, 2:16 a.m. UTC | #4
Junio C Hamano <gitster@pobox.com> 于2021年3月14日周日 上午7:23写道:

>
> Yes, it would be, but that is why you need to educate users what
> causes it, and what the right way to avoid unrelated commits from
> appearing, and how this --left-only fits in the solution.
>
> If some of the time, "--left-only T1...T2" would give you the same
> result as the more strict "B1..T1 B2..T2", that may be why users may
> want to use the "--left-only" instead as an easy/lazy alternative.
>
> But I suspect that it would give an incorrect result some of the
> time---for example, in the above example, wouldn't one of the
> commits labeled as 's' be completely hidden?  And if that is the
> case, the end-user documentation would need to warn about it, and
> explain that it is a easy/lazy alternative that can produce
> incorrect result in the log message.

Thanks, I will try to illustrate these issues in the document.

My another thinking is:
Since `--left-only` inhibits "B1..B2" and "B2..T2" ( let the user
choose the left B1..T1 ), To some extent, `--right-only` can also
add ( let the user choose the right B2..T2 ). A separate `--left-only`
will be strange to the user ( If user call T2...T1 ). Since the
`git rebase --apply` will internal call `git format-patch -k --stdout
--full-index --cherry-pick --right-only ...`, I don't know what to deal
with this `--right-only` yet, because I don't how to teach git to judge
if the `--right-only` is pass from user or `git rebase --apply`, Is there
any good way?  If can solve this problem,  the user can choose the left
or right side of the free choice. (Or users don't need `--right-only?`)
ZheNing Hu March 14, 2021, 2:37 a.m. UTC | #5
ZheNing Hu <adlternative@gmail.com> 于2021年3月14日周日 上午10:16写道:
>
> Junio C Hamano <gitster@pobox.com> 于2021年3月14日周日 上午7:23写道:
>
> >
> > Yes, it would be, but that is why you need to educate users what
> > causes it, and what the right way to avoid unrelated commits from
> > appearing, and how this --left-only fits in the solution.
> >
> > If some of the time, "--left-only T1...T2" would give you the same
> > result as the more strict "B1..T1 B2..T2", that may be why users may
> > want to use the "--left-only" instead as an easy/lazy alternative.
> >
> > But I suspect that it would give an incorrect result some of the
> > time---for example, in the above example, wouldn't one of the
> > commits labeled as 's' be completely hidden?  And if that is the
> > case, the end-user documentation would need to warn about it, and
> > explain that it is a easy/lazy alternative that can produce
> > incorrect result in the log message.
>
> Thanks, I will try to illustrate these issues in the document.
>
> My another thinking is:
> Since `--left-only` inhibits "B1..B2" and "B2..T2" ( let the user
> choose the left B1..T1 ), To some extent, `--right-only` can also
> add ( let the user choose the right B2..T2 ). A separate `--left-only`
> will be strange to the user ( If user call T2...T1 ). Since the
> `git rebase --apply` will internal call `git format-patch -k --stdout
> --full-index --cherry-pick --right-only ...`, I don't know what to deal
> with this `--right-only` yet, because I don't how to teach git to judge
> if the `--right-only` is pass from user or `git rebase --apply`, Is there
> any good way?  If can solve this problem,  the user can choose the left
> or right side of the free choice. (Or users don't need `--right-only?`)

Let me refute my own point just now :`--right-only` can not show "B2..T2",
but "B1..T1", and it may be useful only when user want an inverted "T2...T1".
ZheNing Hu March 14, 2021, 2:41 a.m. UTC | #6
> Let me refute my own point just now :`--right-only` can not show "B2..T2",
> but "B1..T1", and it may be useful only when user want an inverted "T2...T1".

sorry,
s/B1..T1/B1..T2/
diff mbox series

Patch

diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt
index 3e49bf221087..8c5eca0ba1e3 100644
--- a/Documentation/git-format-patch.txt
+++ b/Documentation/git-format-patch.txt
@@ -27,7 +27,7 @@  SYNOPSIS
 		   [--[no-]encode-email-headers]
 		   [--no-notes | --notes[=<ref>]]
 		   [--interdiff=<previous>]
-		   [--range-diff=<previous> [--creation-factor=<percent>]]
+		   [--range-diff=<previous> [--creation-factor=<percent>] [--left-only]]
 		   [--filename-max-length=<n>]
 		   [--progress]
 		   [<common diff options>]
@@ -301,6 +301,9 @@  material (this may change in the future).
 	creation/deletion cost fudge factor. See linkgit:git-range-diff[1])
 	for details.
 
+--left-only:
+	Used with `--range-diff`, only emit output related to the first range.
+
 --notes[=<ref>]::
 --no-notes::
 	Append the notes (see linkgit:git-notes[1]) for the commit
diff --git a/builtin/log.c b/builtin/log.c
index f67b67d80ed1..21fed9db82d6 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1153,7 +1153,7 @@  static void make_cover_letter(struct rev_info *rev, int use_separate_file,
 			      struct commit *origin,
 			      int nr, struct commit **list,
 			      const char *branch_name,
-			      int quiet)
+			      int quiet, int left_only)
 {
 	const char *committer;
 	struct shortlog log;
@@ -1228,7 +1228,8 @@  static void make_cover_letter(struct rev_info *rev, int use_separate_file,
 			.creation_factor = rev->creation_factor,
 			.dual_color = 1,
 			.diffopt = &opts,
-			.other_arg = &other_arg
+			.other_arg = &other_arg,
+			.left_only = left_only
 		};
 
 		diff_setup(&opts);
@@ -1732,6 +1733,7 @@  int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	struct strbuf rdiff2 = STRBUF_INIT;
 	struct strbuf rdiff_title = STRBUF_INIT;
 	int creation_factor = -1;
+	int left_only = 0;
 
 	const struct option builtin_format_patch_options[] = {
 		OPT_CALLBACK_F('n', "numbered", &numbered, NULL,
@@ -1814,6 +1816,8 @@  int cmd_format_patch(int argc, const char **argv, const char *prefix)
 			     parse_opt_object_name),
 		OPT_STRING(0, "range-diff", &rdiff_prev, N_("refspec"),
 			   N_("show changes against <refspec> in cover letter or single patch")),
+		OPT_BOOL(0, "left-only", &left_only,
+			 N_("only emit output related to the first range")),
 		OPT_INTEGER(0, "creation-factor", &creation_factor,
 			    N_("percentage by which creation is weighted")),
 		OPT_END()
@@ -2083,10 +2087,15 @@  int cmd_format_patch(int argc, const char **argv, const char *prefix)
 					     _("Interdiff against v%d:"));
 	}
 
+	if (!rdiff_prev) {
+		if (creation_factor >= 0)
+			die(_("--creation-factor requires --range-diff"));
+		if (left_only)
+			die(_("--left-only requires --range-diff"));
+	}
+
 	if (creation_factor < 0)
 		creation_factor = RANGE_DIFF_CREATION_FACTOR_DEFAULT;
-	else if (!rdiff_prev)
-		die(_("--creation-factor requires --range-diff"));
 
 	if (rdiff_prev) {
 		if (!cover_letter && total != 1)
@@ -2134,7 +2143,8 @@  int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		if (thread)
 			gen_message_id(&rev, "cover");
 		make_cover_letter(&rev, !!output_directory,
-				  origin, nr, list, branch_name, quiet);
+				  origin, nr, list, branch_name, quiet,
+				  left_only);
 		print_bases(&bases, rev.diffopt.file);
 		print_signature(rev.diffopt.file);
 		total++;
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index 1b26c4c2ef91..6178a12dd4b1 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -748,4 +748,31 @@  test_expect_success '--left-only/--right-only' '
 	test_cmp expect actual
 '
 
+test_expect_success 'format-patch --range-diff --left-only' '
+	rm -fr repo &&
+	git init repo &&
+	cd repo &&
+	git branch -M main &&
+	echo "base" >base &&
+	git add base &&
+	git commit -m "base" &&
+	git checkout -b my-feature &&
+	echo "feature" >feature &&
+	git add feature &&
+	git commit -m "feature" &&
+	base="$(git rev-parse main)" &&
+	old="$(git rev-parse my-feature)" &&
+	git checkout main &&
+	echo "other" >>base &&
+	git add base &&
+	git commit -m "new" &&
+	git checkout my-feature &&
+	git rebase $base --onto main &&
+	tip="$(git rev-parse my-feature)" &&
+	git format-patch --range-diff $base $old $tip --cover-letter  &&
+	grep "> 1: .* feature$" 0000-cover-letter.patch &&
+	git format-patch --range-diff $base $old $tip --left-only --cover-letter &&
+	! grep "> 1: .* feature$" 0000-cover-letter.patch
+'
+
 test_done