diff mbox series

[v2,12/14] range-diff: add section header instead of diff header

Message ID 20190705170630.27500-13-t.gummerer@gmail.com (mailing list archive)
State New, archived
Headers show
Series [v2,01/14] apply: replace marc.info link with public-inbox | expand

Commit Message

Thomas Gummerer July 5, 2019, 5:06 p.m. UTC
Currently range-diff keeps the diff header of the inner diff
intact (apart from stripping lines starting with index).  This diff
header is somewhat useful, especially when files get different
names in different ranges.

However there is no real need to keep the whole diff header for that.
The main reason we currently do that is probably because it is easy to
do.

Introduce a new range diff hunk header, that's enclosed by "##",
similar to how line numbers in diff hunks are enclosed by "@@", and
give human readable information of what exactly happened to the file,
including the file name.

This improves the readability of the range-diff by giving more concise
information to the users.  For example if a file was renamed in one
iteration, but not in another, the diff of the headers would be quite
noisy.  However the diff of a single line is concise and should be
easier to understand.

Additionaly, this allows us to add these range diff section headers to
the outer diffs hunk headers using a custom userdiff pattern, which
should help making the range-diff more readable.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 range-diff.c           | 35 ++++++++++++----
 t/t3206-range-diff.sh  | 91 +++++++++++++++++++++++++++++++++++++++---
 t/t3206/history.export | 84 ++++++++++++++++++++++++++++++++++++--
 3 files changed, 193 insertions(+), 17 deletions(-)

Comments

Johannes Schindelin July 5, 2019, 7:35 p.m. UTC | #1
Hi Thomas,

On Fri, 5 Jul 2019, Thomas Gummerer wrote:

> Currently range-diff keeps the diff header of the inner diff
> intact (apart from stripping lines starting with index).  This diff
> header is somewhat useful, especially when files get different
> names in different ranges.
>
> However there is no real need to keep the whole diff header for that.
> The main reason we currently do that is probably because it is easy to
> do.
>
> Introduce a new range diff hunk header, that's enclosed by "##",
> similar to how line numbers in diff hunks are enclosed by "@@", and
> give human readable information of what exactly happened to the file,
> including the file name.
>
> This improves the readability of the range-diff by giving more concise
> information to the users.  For example if a file was renamed in one
> iteration, but not in another, the diff of the headers would be quite
> noisy.  However the diff of a single line is concise and should be
> easier to understand.
>
> Additionaly, this allows us to add these range diff section headers to

s/Additionaly/Additionally/

> the outer diffs hunk headers using a custom userdiff pattern, which
> should help making the range-diff more readable.

Makes sense.

> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> ---
>  range-diff.c           | 35 ++++++++++++----
>  t/t3206-range-diff.sh  | 91 +++++++++++++++++++++++++++++++++++++++---
>  t/t3206/history.export | 84 ++++++++++++++++++++++++++++++++++++--
>  3 files changed, 193 insertions(+), 17 deletions(-)
>
> diff --git a/range-diff.c b/range-diff.c
> index b31fbab026..cc01f7f573 100644
> --- a/range-diff.c
> +++ b/range-diff.c
> @@ -10,6 +10,7 @@
>  #include "commit.h"
>  #include "pretty.h"
>  #include "userdiff.h"
> +#include "apply.h"
>
>  struct patch_util {
>  	/* For the search for an exact match */
> @@ -95,12 +96,36 @@ static int read_patches(const char *range, struct string_list *list)
>  		}
>
>  		if (starts_with(line, "diff --git")) {
> +			struct patch patch;

If you append ` = { 0 }` (or ` = { NULL }`, depending on the first field
of that struct, you don't need that `memset()` later.

> +			struct strbuf root = STRBUF_INIT;
> +			int linenr = 0;
> +
>  			in_header = 0;
>  			strbuf_addch(&buf, '\n');
>  			if (!util->diff_offset)
>  				util->diff_offset = buf.len;
> -			strbuf_addch(&buf, ' ');
> -			strbuf_addstr(&buf, line);
> +			memset(&patch, 0, sizeof(patch));
> +			line[len - 1] = '\n';

I guess `parse_git_header()` cannot handle lines ending in a NUL?

> +			len = parse_git_header(&root, &linenr, 1, line,
> +					       len, size, &patch);
> +			if (len < 0)
> +				die(_("could not parse git header"));

Maybe include the line's contents, like ` '%.*s'", (int)len, line`?

> +			strbuf_addstr(&buf, " ## ");
> +			if (patch.is_new > 0)
> +				strbuf_addf(&buf, "%s (new)", patch.new_name);
> +			else if (patch.is_delete > 0)
> +				strbuf_addf(&buf, "%s (deleted)", patch.old_name);
> +			else if (patch.is_rename)
> +				strbuf_addf(&buf, "%s => %s", patch.old_name, patch.new_name);
> +			else
> +				strbuf_addstr(&buf, patch.new_name);
> +
> +			if (patch.new_mode && patch.old_mode &&
> +			    patch.old_mode != patch.new_mode)
> +				strbuf_addf(&buf, " (mode change %06o => %06o)",
> +					    patch.old_mode, patch.new_mode);
> +
> +			strbuf_addstr(&buf, " ##");
>  		} else if (in_header) {
>  			if (starts_with(line, "Author: ")) {
>  				strbuf_addstr(&buf, line);
> @@ -117,17 +142,13 @@ static int read_patches(const char *range, struct string_list *list)
>  			if (!(p = strstr(p, "@@")))
>  				die(_("invalid hunk header in inner diff"));
>  			strbuf_addstr(&buf, p);
> -		} else if (!line[0] || starts_with(line, "index "))
> +		} else if (!line[0])
>  			/*
>  			 * A completely blank (not ' \n', which is context)
>  			 * line is not valid in a diff.  We skip it
>  			 * silently, because this neatly handles the blank
>  			 * separator line between commits in git-log
>  			 * output.
> -			 *
> -			 * We also want to ignore the diff's `index` lines
> -			 * because they contain exact blob hashes in which
> -			 * we are not interested.

Oh, are we interested in them again?

>  			 */
>  			continue;
>  		else if (line[0] == '>') {
> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> index 9f89af7178..c277756057 100755
> --- a/t/t3206-range-diff.sh
> +++ b/t/t3206-range-diff.sh
> @@ -181,6 +181,85 @@ test_expect_success 'changed commit with sm config' '
>  	test_cmp expected actual
>  '
>
> +test_expect_success 'renamed file' '
> +	git range-diff --no-color --submodule=log topic...renamed-file >actual &&
> +	sed s/Z/\ /g >expected <<-EOF &&
> +	1:  4de457d = 1:  f258d75 s/5/A/
> +	2:  fccce22 ! 2:  017b62d s/4/A/
> +	    @@
> +	    ZAuthor: Thomas Rast <trast@inf.ethz.ch>
> +	    Z
> +	    -    s/4/A/
> +	    +    s/4/A/ + rename file
> +	    Z
> +	    - ## file ##
> +	    + ## file => renamed-file ##

I guess there is no good way to suppress the `- ## file ##` line in this
case? It is a bit distracting...

> +	    Z@@
> +	    Z 1
> +	    Z 2
> +	3:  147e64e ! 3:  3ce7af6 s/11/B/
> +	    @@
> +	    Z
> +	    Z    s/11/B/
> +	    Z
> +	    - ## file ##
> +	    + ## renamed-file ##
> +	    Z@@ A
> +	    Z 8
> +	    Z 9
> +	4:  a63e992 ! 4:  1e6226b s/12/B/
> +	    @@
> +	    Z
> +	    Z    s/12/B/
> +	    Z
> +	    - ## file ##
> +	    + ## renamed-file ##
> +	    Z@@ A
> +	    Z 9
> +	    Z 10
> +	EOF
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'file added and later removed' '
> +	git range-diff --no-color --submodule=log topic...added-removed >actual &&
> +	sed s/Z/\ /g >expected <<-EOF &&
> +	1:  4de457d = 1:  096b1ba s/5/A/
> +	2:  fccce22 ! 2:  d92e698 s/4/A/
> +	    @@
> +	    ZAuthor: Thomas Rast <trast@inf.ethz.ch>
> +	    Z
> +	    -    s/4/A/
> +	    +    s/4/A/ + new-file
> +	    Z
> +	    Z ## file ##
> +	    Z@@
> +	    @@
> +	    Z A
> +	    Z 6
> +	    Z 7
> +	    +
> +	    + ## new-file (new) ##
> +	3:  147e64e ! 3:  9a1db4d s/11/B/
> +	    @@
> +	    ZAuthor: Thomas Rast <trast@inf.ethz.ch>
> +	    Z
> +	    -    s/11/B/
> +	    +    s/11/B/ + remove file
> +	    Z
> +	    Z ## file ##
> +	    Z@@ A
> +	    @@
> +	    Z 12
> +	    Z 13
> +	    Z 14
> +	    +
> +	    + ## new-file (deleted) ##
> +	4:  a63e992 = 4:  fea3b5c s/12/B/
> +	EOF
> +	test_cmp expected actual
> +'
> +
>  test_expect_success 'no commits on one side' '
>  	git commit --amend -m "new message" &&
>  	git range-diff master HEAD@{1} HEAD
> @@ -197,9 +276,9 @@ test_expect_success 'changed message' '
>  	    Z
>  	    +    Also a silly comment here!
>  	    +
> -	    Z diff --git a/file b/file
> -	    Z --- a/file
> -	    Z +++ b/file
> +	    Z ## file ##
> +	    Z@@
> +	    Z 1
>  	3:  147e64e = 3:  b9cb956 s/11/B/
>  	4:  a63e992 = 4:  8add5f1 s/12/B/
>  	EOF
> @@ -216,9 +295,9 @@ test_expect_success 'dual-coloring' '
>  	:     <RESET>
>  	:    <REVERSE><GREEN>+<RESET><BOLD>    Also a silly comment here!<RESET>
>  	:    <REVERSE><GREEN>+<RESET>
> -	:      diff --git a/file b/file<RESET>
> -	:      --- a/file<RESET>
> -	:      +++ b/file<RESET>
> +	:      ## file ##<RESET>
> +	:    <CYAN> @@<RESET>
> +	:      1<RESET>

I am a bit confused where these last two lines come from all of a
sudden... They were not there before, and I do not see any code change in
this patch that would be responsible for them, either...

Could you help me understand?

>  	:<RED>3:  0559556 <RESET><YELLOW>!<RESET><GREEN> 3:  b9cb956<RESET><YELLOW> s/11/B/<RESET>
>  	:    <REVERSE><CYAN>@@<RESET>
>  	:      9<RESET>
> diff --git a/t/t3206/history.export b/t/t3206/history.export
> index b8ffff0940..7bb3814962 100644
> --- a/t/t3206/history.export
> +++ b/t/t3206/history.export
> @@ -22,8 +22,8 @@ data 51
>  19
>  20
>
> -reset refs/heads/removed
> -commit refs/heads/removed
> +reset refs/heads/renamed-file
> +commit refs/heads/renamed-file

Hmm. Is the `removed` ref no longer required by the 'removed a commit'
test case?

>  mark :2
>  author Thomas Rast <trast@inf.ethz.ch> 1374424921 +0200
>  committer Thomas Rast <trast@inf.ethz.ch> 1374484724 +0200
> @@ -599,6 +599,82 @@ s/12/B/
>  from :46
>  M 100644 :28 file
>
> -reset refs/heads/removed
> -from :47
> +commit refs/heads/added-removed
> +mark :48
> +author Thomas Rast <trast@inf.ethz.ch> 1374485014 +0200
> +committer Thomas Gummerer <t.gummerer@gmail.com> 1556574151 +0100

Neat ;-)

> +data 7
> +s/5/A/
> +from :2
> +M 100644 :3 file
> +
> +blob
> +mark :49
> +data 0
> +
> +commit refs/heads/added-removed
> +mark :50
> +author Thomas Rast <trast@inf.ethz.ch> 1374485024 +0200
> +committer Thomas Gummerer <t.gummerer@gmail.com> 1556574177 +0100
> +data 18
> +s/4/A/ + new-file
> +from :48
> +M 100644 :5 file
> +M 100644 :49 new-file
> +
> +commit refs/heads/added-removed
> +mark :51
> +author Thomas Rast <trast@inf.ethz.ch> 1374485036 +0200
> +committer Thomas Gummerer <t.gummerer@gmail.com> 1556574177 +0100
> +data 22
> +s/11/B/ + remove file
> +from :50
> +M 100644 :7 file
> +D new-file
> +
> +commit refs/heads/added-removed
> +mark :52
> +author Thomas Rast <trast@inf.ethz.ch> 1374485044 +0200
> +committer Thomas Gummerer <t.gummerer@gmail.com> 1556574177 +0100
> +data 8
> +s/12/B/
> +from :51
> +M 100644 :9 file
> +
> +commit refs/heads/renamed-file
> +mark :53
> +author Thomas Rast <trast@inf.ethz.ch> 1374485014 +0200
> +committer Thomas Gummerer <t.gummerer@gmail.com> 1556574309 +0100
> +data 7
> +s/5/A/
> +from :2
> +M 100644 :3 file
> +
> +commit refs/heads/renamed-file
> +mark :54
> +author Thomas Rast <trast@inf.ethz.ch> 1374485024 +0200
> +committer Thomas Gummerer <t.gummerer@gmail.com> 1556574312 +0100
> +data 21
> +s/4/A/ + rename file
> +from :53
> +D file
> +M 100644 :5 renamed-file
> +
> +commit refs/heads/renamed-file
> +mark :55
> +author Thomas Rast <trast@inf.ethz.ch> 1374485036 +0200
> +committer Thomas Gummerer <t.gummerer@gmail.com> 1556574319 +0100
> +data 8
> +s/11/B/
> +from :54
> +M 100644 :7 renamed-file
> +
> +commit refs/heads/renamed-file
> +mark :56
> +author Thomas Rast <trast@inf.ethz.ch> 1374485044 +0200
> +committer Thomas Gummerer <t.gummerer@gmail.com> 1556574319 +0100
> +data 8
> +s/12/B/
> +from :55
> +M 100644 :9 renamed-file

I have to admit that I allowed myself not to study this script too
closely, trusting that the range-diff explains better what commit history
it creates than the fast-import script.

Thanks,
Dscho

>
> --
> 2.22.0.510.g264f2c817a
>
>
Thomas Gummerer July 8, 2019, 11:44 a.m. UTC | #2
On 07/05, Johannes Schindelin wrote:
> >  range-diff.c           | 35 ++++++++++++----
> >  t/t3206-range-diff.sh  | 91 +++++++++++++++++++++++++++++++++++++++---
> >  t/t3206/history.export | 84 ++++++++++++++++++++++++++++++++++++--
> >  3 files changed, 193 insertions(+), 17 deletions(-)
> >
> > diff --git a/range-diff.c b/range-diff.c
> > index b31fbab026..cc01f7f573 100644
> > --- a/range-diff.c
> > +++ b/range-diff.c
> > @@ -10,6 +10,7 @@
> >  #include "commit.h"
> >  #include "pretty.h"
> >  #include "userdiff.h"
> > +#include "apply.h"
> >
> >  struct patch_util {
> >  	/* For the search for an exact match */
> > @@ -95,12 +96,36 @@ static int read_patches(const char *range, struct string_list *list)
> >  		}
> >
> >  		if (starts_with(line, "diff --git")) {
> > +			struct patch patch;
> 
> If you append ` = { 0 }` (or ` = { NULL }`, depending on the first field
> of that struct, you don't need that `memset()` later.

Thanks!

> > +			struct strbuf root = STRBUF_INIT;
> > +			int linenr = 0;
> > +
> >  			in_header = 0;
> >  			strbuf_addch(&buf, '\n');
> >  			if (!util->diff_offset)
> >  				util->diff_offset = buf.len;
> > -			strbuf_addch(&buf, ' ');
> > -			strbuf_addstr(&buf, line);
> > +			memset(&patch, 0, sizeof(patch));
> > +			line[len - 1] = '\n';
> 
> I guess `parse_git_header()` cannot handle lines ending in a NUL?

Yeah, it doesn't deal well with them.  We might be able to change
that, but I haven't looked into that tbh.

> > +			len = parse_git_header(&root, &linenr, 1, line,
> > +					       len, size, &patch);
> > +			if (len < 0)
> > +				die(_("could not parse git header"));
> 
> Maybe include the line's contents, like ` '%.*s'", (int)len, line`?

Will do.

> > @@ -117,17 +142,13 @@ static int read_patches(const char *range, struct string_list *list)
> >  			if (!(p = strstr(p, "@@")))
> >  				die(_("invalid hunk header in inner diff"));
> >  			strbuf_addstr(&buf, p);
> > -		} else if (!line[0] || starts_with(line, "index "))
> > +		} else if (!line[0])
> >  			/*
> >  			 * A completely blank (not ' \n', which is context)
> >  			 * line is not valid in a diff.  We skip it
> >  			 * silently, because this neatly handles the blank
> >  			 * separator line between commits in git-log
> >  			 * output.
> > -			 *
> > -			 * We also want to ignore the diff's `index` lines
> > -			 * because they contain exact blob hashes in which
> > -			 * we are not interested.
> 
> Oh, are we interested in them again?

Not really, but we no longer need to ignore them explicitly here.  The
'index' lines are just another part of the diff header, which we parse
using 'parse_git_diff_header()' now, and then replace completely.

Previously we singled that line from the git diff headers out, because
they would result in a particularly noisy range diff, as the blob
hashes would change in a lot of cases, but are not really
interesting.  We could have ignored things such as the "similarity
index" as well, as for example a 1% change in the similarity wouldn't
matter that much, but I assume we just didn't because that's much less
likely to change between different versions of a patch series.

> >  			 */
> >  			continue;
> >  		else if (line[0] == '>') {
> > diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> > index 9f89af7178..c277756057 100755
> > --- a/t/t3206-range-diff.sh
> > +++ b/t/t3206-range-diff.sh
> > @@ -181,6 +181,85 @@ test_expect_success 'changed commit with sm config' '
> >  	test_cmp expected actual
> >  '
> >
> > +test_expect_success 'renamed file' '
> > +	git range-diff --no-color --submodule=log topic...renamed-file >actual &&
> > +	sed s/Z/\ /g >expected <<-EOF &&
> > +	1:  4de457d = 1:  f258d75 s/5/A/
> > +	2:  fccce22 ! 2:  017b62d s/4/A/
> > +	    @@
> > +	    ZAuthor: Thomas Rast <trast@inf.ethz.ch>
> > +	    Z
> > +	    -    s/4/A/
> > +	    +    s/4/A/ + rename file
> > +	    Z
> > +	    - ## file ##
> > +	    + ## file => renamed-file ##
> 
> I guess there is no good way to suppress the `- ## file ##` line in this
> case? It is a bit distracting...

No, I can't think of a good way.  I'm also not sure it would be right
to remove it.  In this case it means that in the previous version this
was only called 'file', while in the new version it was renamend in
this patch to 'renamed-file', so it does give some useful information.

Not sure how else we could represent that.  If we just had a 
'## file => renamed-file ##' section header, I'd expect that 'file'
has been renamed to 'renamed-file' in both versions.

> > @@ -216,9 +295,9 @@ test_expect_success 'dual-coloring' '
> >  	:     <RESET>
> >  	:    <REVERSE><GREEN>+<RESET><BOLD>    Also a silly comment here!<RESET>
> >  	:    <REVERSE><GREEN>+<RESET>
> > -	:      diff --git a/file b/file<RESET>
> > -	:      --- a/file<RESET>
> > -	:      +++ b/file<RESET>
> > +	:      ## file ##<RESET>
> > +	:    <CYAN> @@<RESET>
> > +	:      1<RESET>
> 
> I am a bit confused where these last two lines come from all of a
> sudden... They were not there before, and I do not see any code change in
> this patch that would be responsible for them, either...
> 
> Could you help me understand?

Sure.  The actual change (in the range-diff) here is that "Also a
silly comment here!" was added to the commit message.  The diff header
is context lines after that.

We now replace the diff header with the new "section header", which is
only a single line, so we get a couple of additional lines of the
context of the subsequent inner diff.  


> >  	:<RED>3:  0559556 <RESET><YELLOW>!<RESET><GREEN> 3:  b9cb956<RESET><YELLOW> s/11/B/<RESET>
> >  	:    <REVERSE><CYAN>@@<RESET>
> >  	:      9<RESET>
> > diff --git a/t/t3206/history.export b/t/t3206/history.export
> > index b8ffff0940..7bb3814962 100644
> > --- a/t/t3206/history.export
> > +++ b/t/t3206/history.export
> > @@ -22,8 +22,8 @@ data 51
> >  19
> >  20
> >
> > -reset refs/heads/removed
> > -commit refs/heads/removed
> > +reset refs/heads/renamed-file
> > +commit refs/heads/renamed-file
> 
> Hmm. Is the `removed` ref no longer required by the 'removed a commit'
> test case?

It is, and it still exists.  I'm not entirely familar with the format
for fast-export/fast-import scripts.  What I did was just fast-import
the existing script, create the new refs that were required for the
tests and then fast-export'ed it again.

So not sure exactly why this changed, but the 'removed' ref still
exists :)

> >  mark :2
> >  author Thomas Rast <trast@inf.ethz.ch> 1374424921 +0200
> >  committer Thomas Rast <trast@inf.ethz.ch> 1374484724 +0200
> > @@ -599,6 +599,82 @@ s/12/B/
> >  from :46
> >  M 100644 :28 file
> >
> > -reset refs/heads/removed
> > -from :47
> > +commit refs/heads/added-removed
> > +mark :48
> > +author Thomas Rast <trast@inf.ethz.ch> 1374485014 +0200
> > +committer Thomas Gummerer <t.gummerer@gmail.com> 1556574151 +0100
> 
> Neat ;-)
> 
> > +data 7
> > +s/5/A/
> > +from :2
> > +M 100644 :3 file
> > +
> > +blob
> > +mark :49
> > +data 0
> > +
> > +commit refs/heads/added-removed
> > +mark :50
> > +author Thomas Rast <trast@inf.ethz.ch> 1374485024 +0200
> > +committer Thomas Gummerer <t.gummerer@gmail.com> 1556574177 +0100
> > +data 18
> > +s/4/A/ + new-file
> > +from :48
> > +M 100644 :5 file
> > +M 100644 :49 new-file
> > +
> > +commit refs/heads/added-removed
> > +mark :51
> > +author Thomas Rast <trast@inf.ethz.ch> 1374485036 +0200
> > +committer Thomas Gummerer <t.gummerer@gmail.com> 1556574177 +0100
> > +data 22
> > +s/11/B/ + remove file
> > +from :50
> > +M 100644 :7 file
> > +D new-file
> > +
> > +commit refs/heads/added-removed
> > +mark :52
> > +author Thomas Rast <trast@inf.ethz.ch> 1374485044 +0200
> > +committer Thomas Gummerer <t.gummerer@gmail.com> 1556574177 +0100
> > +data 8
> > +s/12/B/
> > +from :51
> > +M 100644 :9 file
> > +
> > +commit refs/heads/renamed-file
> > +mark :53
> > +author Thomas Rast <trast@inf.ethz.ch> 1374485014 +0200
> > +committer Thomas Gummerer <t.gummerer@gmail.com> 1556574309 +0100
> > +data 7
> > +s/5/A/
> > +from :2
> > +M 100644 :3 file
> > +
> > +commit refs/heads/renamed-file
> > +mark :54
> > +author Thomas Rast <trast@inf.ethz.ch> 1374485024 +0200
> > +committer Thomas Gummerer <t.gummerer@gmail.com> 1556574312 +0100
> > +data 21
> > +s/4/A/ + rename file
> > +from :53
> > +D file
> > +M 100644 :5 renamed-file
> > +
> > +commit refs/heads/renamed-file
> > +mark :55
> > +author Thomas Rast <trast@inf.ethz.ch> 1374485036 +0200
> > +committer Thomas Gummerer <t.gummerer@gmail.com> 1556574319 +0100
> > +data 8
> > +s/11/B/
> > +from :54
> > +M 100644 :7 renamed-file
> > +
> > +commit refs/heads/renamed-file
> > +mark :56
> > +author Thomas Rast <trast@inf.ethz.ch> 1374485044 +0200
> > +committer Thomas Gummerer <t.gummerer@gmail.com> 1556574319 +0100
> > +data 8
> > +s/12/B/
> > +from :55
> > +M 100644 :9 renamed-file
> 
> I have to admit that I allowed myself not to study this script too
> closely, trusting that the range-diff explains better what commit history
> it creates than the fast-import script.
> 
> Thanks,
> Dscho
> 
> >
> > --
> > 2.22.0.510.g264f2c817a
> >
> >
Johannes Schindelin July 8, 2019, 1:12 p.m. UTC | #3
Hi Thomas,

On Mon, 8 Jul 2019, Thomas Gummerer wrote:

> On 07/05, Johannes Schindelin wrote:
>
> > >  			 */
> > >  			continue;
> > >  		else if (line[0] == '>') {
> > > diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> > > index 9f89af7178..c277756057 100755
> > > --- a/t/t3206-range-diff.sh
> > > +++ b/t/t3206-range-diff.sh
> > > @@ -181,6 +181,85 @@ test_expect_success 'changed commit with sm config' '
> > >  	test_cmp expected actual
> > >  '
> > >
> > > +test_expect_success 'renamed file' '
> > > +	git range-diff --no-color --submodule=log topic...renamed-file >actual &&
> > > +	sed s/Z/\ /g >expected <<-EOF &&
> > > +	1:  4de457d = 1:  f258d75 s/5/A/
> > > +	2:  fccce22 ! 2:  017b62d s/4/A/
> > > +	    @@
> > > +	    ZAuthor: Thomas Rast <trast@inf.ethz.ch>
> > > +	    Z
> > > +	    -    s/4/A/
> > > +	    +    s/4/A/ + rename file
> > > +	    Z
> > > +	    - ## file ##
> > > +	    + ## file => renamed-file ##
> >
> > I guess there is no good way to suppress the `- ## file ##` line in this
> > case? It is a bit distracting...
>
> No, I can't think of a good way.  I'm also not sure it would be right
> to remove it.  In this case it means that in the previous version this
> was only called 'file', while in the new version it was renamend in
> this patch to 'renamed-file', so it does give some useful information.

Oh, I misunderstood! You're right, this is useful information, and I just
have to learn how to read that variant of the range-diffs.

In other words: please leave this part of your patch series as-is.

> > > @@ -216,9 +295,9 @@ test_expect_success 'dual-coloring' '
> > >  	:     <RESET>
> > >  	:    <REVERSE><GREEN>+<RESET><BOLD>    Also a silly comment here!<RESET>
> > >  	:    <REVERSE><GREEN>+<RESET>
> > > -	:      diff --git a/file b/file<RESET>
> > > -	:      --- a/file<RESET>
> > > -	:      +++ b/file<RESET>
> > > +	:      ## file ##<RESET>
> > > +	:    <CYAN> @@<RESET>
> > > +	:      1<RESET>
> >
> > I am a bit confused where these last two lines come from all of a
> > sudden... They were not there before, and I do not see any code change in
> > this patch that would be responsible for them, either...
> >
> > Could you help me understand?
>
> Sure.  The actual change (in the range-diff) here is that "Also a
> silly comment here!" was added to the commit message.  The diff header
> is context lines after that.
>
> We now replace the diff header with the new "section header", which is
> only a single line, so we get a couple of additional lines of the
> context of the subsequent inner diff.

You know what? This is my typical mistake when reading uncolored
range-diffs: _of course_ I missed that this is talking about context
lines. I really thought they were added by the new iteration. My bad. And
thanks for explaining this patiently to me.

> > >  	:<RED>3:  0559556 <RESET><YELLOW>!<RESET><GREEN> 3:  b9cb956<RESET><YELLOW> s/11/B/<RESET>
> > >  	:    <REVERSE><CYAN>@@<RESET>
> > >  	:      9<RESET>
> > > diff --git a/t/t3206/history.export b/t/t3206/history.export
> > > index b8ffff0940..7bb3814962 100644
> > > --- a/t/t3206/history.export
> > > +++ b/t/t3206/history.export
> > > @@ -22,8 +22,8 @@ data 51
> > >  19
> > >  20
> > >
> > > -reset refs/heads/removed
> > > -commit refs/heads/removed
> > > +reset refs/heads/renamed-file
> > > +commit refs/heads/renamed-file
> >
> > Hmm. Is the `removed` ref no longer required by the 'removed a commit'
> > test case?
>
> It is, and it still exists.  I'm not entirely familar with the format
> for fast-export/fast-import scripts.  What I did was just fast-import
> the existing script, create the new refs that were required for the
> tests and then fast-export'ed it again.
>
> So not sure exactly why this changed, but the 'removed' ref still
> exists :)

Right, it would still exist because earlier parts of the script would have
created that ref (otherwise `reset refs/heads/removed` would have failed).

Your strategy to update the script sounds like the best way to go, it just
runs afoul of topological ordering that somehow makes this patch look as
if you had removed parts of the commit history. I believe you when you say
that you didn't, of course.

Thanks,
Dscho
diff mbox series

Patch

diff --git a/range-diff.c b/range-diff.c
index b31fbab026..cc01f7f573 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -10,6 +10,7 @@ 
 #include "commit.h"
 #include "pretty.h"
 #include "userdiff.h"
+#include "apply.h"
 
 struct patch_util {
 	/* For the search for an exact match */
@@ -95,12 +96,36 @@  static int read_patches(const char *range, struct string_list *list)
 		}
 
 		if (starts_with(line, "diff --git")) {
+			struct patch patch;
+			struct strbuf root = STRBUF_INIT;
+			int linenr = 0;
+
 			in_header = 0;
 			strbuf_addch(&buf, '\n');
 			if (!util->diff_offset)
 				util->diff_offset = buf.len;
-			strbuf_addch(&buf, ' ');
-			strbuf_addstr(&buf, line);
+			memset(&patch, 0, sizeof(patch));
+			line[len - 1] = '\n';
+			len = parse_git_header(&root, &linenr, 1, line,
+					       len, size, &patch);
+			if (len < 0)
+				die(_("could not parse git header"));
+			strbuf_addstr(&buf, " ## ");
+			if (patch.is_new > 0)
+				strbuf_addf(&buf, "%s (new)", patch.new_name);
+			else if (patch.is_delete > 0)
+				strbuf_addf(&buf, "%s (deleted)", patch.old_name);
+			else if (patch.is_rename)
+				strbuf_addf(&buf, "%s => %s", patch.old_name, patch.new_name);
+			else
+				strbuf_addstr(&buf, patch.new_name);
+
+			if (patch.new_mode && patch.old_mode &&
+			    patch.old_mode != patch.new_mode)
+				strbuf_addf(&buf, " (mode change %06o => %06o)",
+					    patch.old_mode, patch.new_mode);
+
+			strbuf_addstr(&buf, " ##");
 		} else if (in_header) {
 			if (starts_with(line, "Author: ")) {
 				strbuf_addstr(&buf, line);
@@ -117,17 +142,13 @@  static int read_patches(const char *range, struct string_list *list)
 			if (!(p = strstr(p, "@@")))
 				die(_("invalid hunk header in inner diff"));
 			strbuf_addstr(&buf, p);
-		} else if (!line[0] || starts_with(line, "index "))
+		} else if (!line[0])
 			/*
 			 * A completely blank (not ' \n', which is context)
 			 * line is not valid in a diff.  We skip it
 			 * silently, because this neatly handles the blank
 			 * separator line between commits in git-log
 			 * output.
-			 *
-			 * We also want to ignore the diff's `index` lines
-			 * because they contain exact blob hashes in which
-			 * we are not interested.
 			 */
 			continue;
 		else if (line[0] == '>') {
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index 9f89af7178..c277756057 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -181,6 +181,85 @@  test_expect_success 'changed commit with sm config' '
 	test_cmp expected actual
 '
 
+test_expect_success 'renamed file' '
+	git range-diff --no-color --submodule=log topic...renamed-file >actual &&
+	sed s/Z/\ /g >expected <<-EOF &&
+	1:  4de457d = 1:  f258d75 s/5/A/
+	2:  fccce22 ! 2:  017b62d s/4/A/
+	    @@
+	    ZAuthor: Thomas Rast <trast@inf.ethz.ch>
+	    Z
+	    -    s/4/A/
+	    +    s/4/A/ + rename file
+	    Z
+	    - ## file ##
+	    + ## file => renamed-file ##
+	    Z@@
+	    Z 1
+	    Z 2
+	3:  147e64e ! 3:  3ce7af6 s/11/B/
+	    @@
+	    Z
+	    Z    s/11/B/
+	    Z
+	    - ## file ##
+	    + ## renamed-file ##
+	    Z@@ A
+	    Z 8
+	    Z 9
+	4:  a63e992 ! 4:  1e6226b s/12/B/
+	    @@
+	    Z
+	    Z    s/12/B/
+	    Z
+	    - ## file ##
+	    + ## renamed-file ##
+	    Z@@ A
+	    Z 9
+	    Z 10
+	EOF
+	test_cmp expected actual
+'
+
+test_expect_success 'file added and later removed' '
+	git range-diff --no-color --submodule=log topic...added-removed >actual &&
+	sed s/Z/\ /g >expected <<-EOF &&
+	1:  4de457d = 1:  096b1ba s/5/A/
+	2:  fccce22 ! 2:  d92e698 s/4/A/
+	    @@
+	    ZAuthor: Thomas Rast <trast@inf.ethz.ch>
+	    Z
+	    -    s/4/A/
+	    +    s/4/A/ + new-file
+	    Z
+	    Z ## file ##
+	    Z@@
+	    @@
+	    Z A
+	    Z 6
+	    Z 7
+	    +
+	    + ## new-file (new) ##
+	3:  147e64e ! 3:  9a1db4d s/11/B/
+	    @@
+	    ZAuthor: Thomas Rast <trast@inf.ethz.ch>
+	    Z
+	    -    s/11/B/
+	    +    s/11/B/ + remove file
+	    Z
+	    Z ## file ##
+	    Z@@ A
+	    @@
+	    Z 12
+	    Z 13
+	    Z 14
+	    +
+	    + ## new-file (deleted) ##
+	4:  a63e992 = 4:  fea3b5c s/12/B/
+	EOF
+	test_cmp expected actual
+'
+
 test_expect_success 'no commits on one side' '
 	git commit --amend -m "new message" &&
 	git range-diff master HEAD@{1} HEAD
@@ -197,9 +276,9 @@  test_expect_success 'changed message' '
 	    Z
 	    +    Also a silly comment here!
 	    +
-	    Z diff --git a/file b/file
-	    Z --- a/file
-	    Z +++ b/file
+	    Z ## file ##
+	    Z@@
+	    Z 1
 	3:  147e64e = 3:  b9cb956 s/11/B/
 	4:  a63e992 = 4:  8add5f1 s/12/B/
 	EOF
@@ -216,9 +295,9 @@  test_expect_success 'dual-coloring' '
 	:     <RESET>
 	:    <REVERSE><GREEN>+<RESET><BOLD>    Also a silly comment here!<RESET>
 	:    <REVERSE><GREEN>+<RESET>
-	:      diff --git a/file b/file<RESET>
-	:      --- a/file<RESET>
-	:      +++ b/file<RESET>
+	:      ## file ##<RESET>
+	:    <CYAN> @@<RESET>
+	:      1<RESET>
 	:<RED>3:  0559556 <RESET><YELLOW>!<RESET><GREEN> 3:  b9cb956<RESET><YELLOW> s/11/B/<RESET>
 	:    <REVERSE><CYAN>@@<RESET>
 	:      9<RESET>
diff --git a/t/t3206/history.export b/t/t3206/history.export
index b8ffff0940..7bb3814962 100644
--- a/t/t3206/history.export
+++ b/t/t3206/history.export
@@ -22,8 +22,8 @@  data 51
 19
 20
 
-reset refs/heads/removed
-commit refs/heads/removed
+reset refs/heads/renamed-file
+commit refs/heads/renamed-file
 mark :2
 author Thomas Rast <trast@inf.ethz.ch> 1374424921 +0200
 committer Thomas Rast <trast@inf.ethz.ch> 1374484724 +0200
@@ -599,6 +599,82 @@  s/12/B/
 from :46
 M 100644 :28 file
 
-reset refs/heads/removed
-from :47
+commit refs/heads/added-removed
+mark :48
+author Thomas Rast <trast@inf.ethz.ch> 1374485014 +0200
+committer Thomas Gummerer <t.gummerer@gmail.com> 1556574151 +0100
+data 7
+s/5/A/
+from :2
+M 100644 :3 file
+
+blob
+mark :49
+data 0
+
+commit refs/heads/added-removed
+mark :50
+author Thomas Rast <trast@inf.ethz.ch> 1374485024 +0200
+committer Thomas Gummerer <t.gummerer@gmail.com> 1556574177 +0100
+data 18
+s/4/A/ + new-file
+from :48
+M 100644 :5 file
+M 100644 :49 new-file
+
+commit refs/heads/added-removed
+mark :51
+author Thomas Rast <trast@inf.ethz.ch> 1374485036 +0200
+committer Thomas Gummerer <t.gummerer@gmail.com> 1556574177 +0100
+data 22
+s/11/B/ + remove file
+from :50
+M 100644 :7 file
+D new-file
+
+commit refs/heads/added-removed
+mark :52
+author Thomas Rast <trast@inf.ethz.ch> 1374485044 +0200
+committer Thomas Gummerer <t.gummerer@gmail.com> 1556574177 +0100
+data 8
+s/12/B/
+from :51
+M 100644 :9 file
+
+commit refs/heads/renamed-file
+mark :53
+author Thomas Rast <trast@inf.ethz.ch> 1374485014 +0200
+committer Thomas Gummerer <t.gummerer@gmail.com> 1556574309 +0100
+data 7
+s/5/A/
+from :2
+M 100644 :3 file
+
+commit refs/heads/renamed-file
+mark :54
+author Thomas Rast <trast@inf.ethz.ch> 1374485024 +0200
+committer Thomas Gummerer <t.gummerer@gmail.com> 1556574312 +0100
+data 21
+s/4/A/ + rename file
+from :53
+D file
+M 100644 :5 renamed-file
+
+commit refs/heads/renamed-file
+mark :55
+author Thomas Rast <trast@inf.ethz.ch> 1374485036 +0200
+committer Thomas Gummerer <t.gummerer@gmail.com> 1556574319 +0100
+data 8
+s/11/B/
+from :54
+M 100644 :7 renamed-file
+
+commit refs/heads/renamed-file
+mark :56
+author Thomas Rast <trast@inf.ethz.ch> 1374485044 +0200
+committer Thomas Gummerer <t.gummerer@gmail.com> 1556574319 +0100
+data 8
+s/12/B/
+from :55
+M 100644 :9 renamed-file