Document change in format of raw diff output format

Message ID 20181122105836.GA36193@retiro.local

Commit Message

Greg Hurrell Nov. 22, 2018, 10:58 a.m. UTC
I was troubleshooting some breakage in some code that consumes the output of `git log --raw`. Comparing two machines with different versions of Git, I discovered that the output format changed somewhere between v2.14.5:

:000000 100644 000000000... 9773b7718... A      content/snippets/1157.md

and v2.19.0:

:000000 100644 000000000 9773b7718 A    content/snippets/1157.md

A quick search turns up some patches related to the GIT_PRINT_SHA1_ELLIPSIS env variable, which can be used to force the old output format, and which landed in v2.16.0, I think.
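
For code that has to consume both forms, a tolerant parser is one option. The following is a minimal Python sketch, not something from Git itself: `RawEntry` and `parse_raw_line` are hypothetical names, it assumes real Git output in which the status and path fields are tab-separated, and it makes no assumption about how many hex digits the abbreviated hashes contain.

```python
import re
from typing import List, NamedTuple, Optional

# One raw-format line:
#   :<old mode> <new mode> <old hash> <new hash> <status>[score]\t<path>[\t<path>]
# The optional "..." after each hash covers the older default output.
RAW_LINE = re.compile(
    r"^:(?P<old_mode>[0-7]{6}) (?P<new_mode>[0-7]{6}) "
    r"(?P<old_hash>[0-9a-f]+)(?:\.\.\.)? "
    r"(?P<new_hash>[0-9a-f]+)(?:\.\.\.)? "
    r"(?P<status>[A-Z])(?P<score>[0-9]*)"
    r"\t(?P<paths>.+)$"
)


class RawEntry(NamedTuple):
    """Hypothetical container for one parsed raw diff line."""
    old_mode: str
    new_mode: str
    old_hash: str
    new_hash: str
    status: str
    score: Optional[int]
    paths: List[str]


def parse_raw_line(line: str) -> RawEntry:
    """Parse a single raw diff line, with or without trailing ellipses."""
    m = RAW_LINE.match(line.rstrip("\n"))
    if m is None:
        raise ValueError(f"not a raw diff line: {line!r}")
    # Only copies and renames carry a similarity score (e.g. C68, R86).
    score = int(m["score"]) if m["score"] else None
    return RawEntry(
        m["old_mode"], m["new_mode"],
        m["old_hash"], m["new_hash"],
        m["status"], score,
        m["paths"].split("\t"),
    )
```

Combined-diff lines (those starting with `::`) are deliberately rejected by this sketch, since they carry additional mode and hash fields.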

Does it sound right that we should update the documentation in diff-format.txt to show what the new output format is? The examples all show the old output format, which isn't produced by default any more.

Something like the following? If the answer is yes, I can turn it into a real patch.

Cheers,
Greg

Comments

Jeff King Nov. 22, 2018, 4:01 p.m. UTC | #1
On Thu, Nov 22, 2018 at 11:58:36AM +0100, Greg Hurrell wrote:

> I was troubleshooting some breakage in some code that consumes the
> output of `git log --raw` and looking on two machines with different
> versions of Git just now I discovered the output format has changed
> somewhere between v2.14.5:
> 
> :000000 100644 000000000... 9773b7718... A      content/snippets/1157.md
> 
> and v2.19.0:
> 
> :000000 100644 000000000 9773b7718 A    content/snippets/1157.md
> 
> A quick search turns up some patches related to the
> GIT_PRINT_SHA1_ELLIPSIS env variable, which can be used to force the
> old output format, and which landed in v2.16.0, I think.

Yes. The actual commit that flipped the default is 7cb6ac1e4b (diff:
diff_aligned_abbrev: remove ellipsis after abbreviated SHA-1 value,
2017-12-03). There's more discussion of the possibility of breakage in
this subthread:

  https://public-inbox.org/git/83D263E58ABD46188756D41FE311E469@PhilipOakley/

> Does it sound right that we should update the documentation in
> diff-format.txt to show what the new output format is? The examples
> all show the old output format, which isn't produced by default any
> more.

Yes, we should definitely update the documentation to show the modern
format. I think that was just an oversight in the original series.

> diff --git a/Documentation/diff-format.txt b/Documentation/diff-format.txt
> index 706916c94c..33776459d0 100644
> --- a/Documentation/diff-format.txt
> +++ b/Documentation/diff-format.txt
> @@ -26,12 +26,12 @@ line per changed file.
>  An output line is formatted this way:
> 
>  ------------------------------------------------
> -in-place edit  :100644 100644 bcd1234... 0123456... M file0
> -copy-edit      :100644 100644 abcd123... 1234567... C68 file1 file2
> -rename-edit    :100644 100644 abcd123... 1234567... R86 file1 file3
> -create         :000000 100644 0000000... 1234567... A file4
> -delete         :100644 000000 1234567... 0000000... D file5
> -unmerged       :000000 000000 0000000... 0000000... U file6
> +in-place edit  :100644 100644 bcd123456 012345678 M file0
> +copy-edit      :100644 100644 abcd12345 123456789 C68 file1 file2
> +rename-edit    :100644 100644 abcd12345 123456789 R86 file1 file3
> +create         :000000 100644 000000000 123456789 A file4
> +delete         :100644 000000 123456789 000000000 D file5
> +unmerged       :000000 000000 000000000 000000000 U file6
>  ------------------------------------------------

Yeah, this looks like an improvement.

I think in general that we'd continue to show 7 characters now, just
without the extra dots (though it's auto-scaled based on the number of
objects in the repo these days, so it's not even really a constant).

>  That is, from the left to the right:
> @@ -75,7 +75,7 @@ and it is out of sync with the index.
>  Example:
> 
>  ------------------------------------------------
> -:100644 100644 5be4a4...... 000000...... M file.c
> +:100644 100644 5be4a4abc 000000000 M file.c
>  ------------------------------------------------

I'm not even sure what this original was trying to show. I don't think
we ever produced that many dots. :)

Thanks for noticing.

-Peff

PS As you noticed, we don't promise that git-log output will
   never change between versions. For machine-consumption you probably
   want to use plumbing like "git rev-list | git diff-tree --stdin",
   which produces unabbreviated hashes.
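
A minimal Python sketch of that plumbing pipeline follows. The helper names `group_by_commit` and `raw_changes` are hypothetical, and the `-r` flag (recurse into subtrees) is an addition that is not part of the command above. It assumes `git` is on the PATH and that skipping root and merge commits is acceptable, since `git diff-tree --stdin` only reports those when given options such as `--root` or `-m`.

```python
import subprocess
from typing import Iterable, Iterator, Optional, Tuple


def group_by_commit(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], str]]:
    """Pair each raw diff line with the commit ID line that precedes it."""
    commit = None
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith(":"):
            # A raw diff entry belonging to the most recent commit ID seen.
            yield commit, line
        else:
            # diff-tree prints the commit ID on its own line before the entries.
            commit = line


def raw_changes(repo: str, rev: str = "HEAD") -> Iterator[Tuple[Optional[str], str]]:
    """Run `git rev-list <rev> | git diff-tree --stdin -r` in `repo`."""
    rev_list = subprocess.Popen(
        ["git", "-C", repo, "rev-list", rev],
        stdout=subprocess.PIPE,
    )
    diff_tree = subprocess.Popen(
        ["git", "-C", repo, "diff-tree", "--stdin", "-r"],
        stdin=rev_list.stdout,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Drop our copy of the pipe so rev-list sees a closed pipe if diff-tree exits.
    rev_list.stdout.close()
    try:
        yield from group_by_commit(diff_tree.stdout)
    finally:
        diff_tree.stdout.close()
        diff_tree.wait()
        rev_list.wait()
```

Because the hashes in this output are unabbreviated, a consumer does not need to care about ellipses or the abbreviation length at all.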

Greg Hurrell Nov. 23, 2018, 9:09 a.m. UTC | #2
Jeff King wrote:

> On Thu, Nov 22, 2018 at 11:58:36AM +0100, Greg Hurrell wrote:
> 
> > diff --git a/Documentation/diff-format.txt b/Documentation/diff-format.txt
> > index 706916c94c..33776459d0 100644
> > --- a/Documentation/diff-format.txt
> > +++ b/Documentation/diff-format.txt
> > @@ -26,12 +26,12 @@ line per changed file.
> >  An output line is formatted this way:
> > 
> >  ------------------------------------------------
> > -in-place edit  :100644 100644 bcd1234... 0123456... M file0
> > -copy-edit      :100644 100644 abcd123... 1234567... C68 file1 file2
> > -rename-edit    :100644 100644 abcd123... 1234567... R86 file1 file3
> > -create         :000000 100644 0000000... 1234567... A file4
> > -delete         :100644 000000 1234567... 0000000... D file5
> > -unmerged       :000000 000000 0000000... 0000000... U file6
> > +in-place edit  :100644 100644 bcd123456 012345678 M file0
> > +copy-edit      :100644 100644 abcd12345 123456789 C68 file1 file2
> > +rename-edit    :100644 100644 abcd12345 123456789 R86 file1 file3
> > +create         :000000 100644 000000000 123456789 A file4
> > +delete         :100644 000000 123456789 000000000 D file5
> > +unmerged       :000000 000000 000000000 000000000 U file6
> >  ------------------------------------------------
> 
> Yeah, this looks like an improvement.
> 
> I think in general that we'd continue to show 7 characters now, just
> without the extra dots (though it's auto-scaled based on the number of
> objects in the repo these days, so it's not even really a constant).

That's funny. I looked at the output on (what I thought was) a small
repo and it was showing me 9-character abbreviated hashes. I guess I
just got lucky. Tested on a basically empty repo and 7 does look to be
the default.

> PS As you noticed, we don't promise that git-log output will
>    never change between versions. For machine-consumption you probably
>    want to use plumbing like "git rev-list | git diff-tree --stdin",
>    which produces unabbreviated hashes.

Thanks for the tip. My mistake was thinking that the `--raw` made the
`git log` output somehow more plumbing-ish, but I've gone ahead and
switched to using git-rev-list plus git-diff-tree instead.

Anyway, patch follows.

Patch

diff --git a/Documentation/diff-format.txt b/Documentation/diff-format.txt
index 706916c94c..33776459d0 100644
--- a/Documentation/diff-format.txt
+++ b/Documentation/diff-format.txt
@@ -26,12 +26,12 @@  line per changed file.
 An output line is formatted this way:

 ------------------------------------------------
-in-place edit  :100644 100644 bcd1234... 0123456... M file0
-copy-edit      :100644 100644 abcd123... 1234567... C68 file1 file2
-rename-edit    :100644 100644 abcd123... 1234567... R86 file1 file3
-create         :000000 100644 0000000... 1234567... A file4
-delete         :100644 000000 1234567... 0000000... D file5
-unmerged       :000000 000000 0000000... 0000000... U file6
+in-place edit  :100644 100644 bcd123456 012345678 M file0
+copy-edit      :100644 100644 abcd12345 123456789 C68 file1 file2
+rename-edit    :100644 100644 abcd12345 123456789 R86 file1 file3
+create         :000000 100644 000000000 123456789 A file4
+delete         :100644 000000 123456789 000000000 D file5
+unmerged       :000000 000000 000000000 000000000 U file6
 ------------------------------------------------

 That is, from the left to the right:
@@ -75,7 +75,7 @@  and it is out of sync with the index.
 Example:

 ------------------------------------------------
-:100644 100644 5be4a4...... 000000...... M file.c
+:100644 100644 5be4a4abc 000000000 M file.c
 ------------------------------------------------

 Without the `-z` option, pathnames with "unusual" characters are
@@ -100,7 +100,7 @@  from the format described above in the following way:
 Example:

 ------------------------------------------------
-::100644 100644 100644 fabadb8... cc95eb0... 4866510... MM     describe.c
+::100644 100644 100644 fabadb827 cc95eb0f2 4866510ea MM        describe.c
 ------------------------------------------------

 Note that 'combined diff' lists only files which were modified from