diff mbox series

[v3,2/2] diff: index-line: respect --abbrev in object's name

Message ID 760df7782dad9e9df7bb284ec57249e697a4cc92.1597364493.git.congdanhqx@gmail.com (mailing list archive)
State New, archived
Headers show
Series None | expand

Commit Message

Đoàn Trần Công Danh Aug. 14, 2020, 12:23 a.m. UTC
A handful of Git's commands respect `--abbrev' for customizing length
of abbreviation of object names.

For diff-family, Git supports 2 different options for 2 different
purposes, `--full-index' for showing diff-patch object's name in full,
and `--abbrev' to customize the length of object names in diff-raw and
diff-tree header lines, without any options to customise the length of
object names in diff-patch format. When working with diff-patch format,
we only have two options, either full index, or default abbrev length.

Although, that consistent is documented, it doesn't stop users from
trying to use `--abbrev' with the hope of customising diff-patch's
objects' name's abbreviation.

Let's resolve that inconsistency.

To preserve backward compatibility with old script that specify both
`--full-index' and `--abbrev', always shows full object id
if `--full-index' is specified.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 Documentation/diff-options.txt                |  9 +++---
 diff.c                                        |  5 +++-
 t/t4013-diff-various.sh                       |  3 ++
 ...ff.diff-tree_--root_-p_--abbrev=10_initial | 29 +++++++++++++++++++
 ...--root_-p_--full-index_--abbrev=10_initial | 29 +++++++++++++++++++
 ...f.diff-tree_--root_-p_--full-index_initial | 29 +++++++++++++++++++
 6 files changed, 99 insertions(+), 5 deletions(-)
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
 create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_initial

Comments

SZEDER Gábor Aug. 14, 2020, 3:18 p.m. UTC | #1
On Fri, Aug 14, 2020 at 07:23:10AM +0700, Đoàn Trần Công Danh wrote:
> A handful of Git's commands respect `--abbrev' for customizing length
> of abbreviation of object names.
> 
> For diff-family, Git supports 2 different options for 2 different
> purposes, `--full-index' for showing diff-patch object's name in full,
> and `--abbrev' to customize the length of object names in diff-raw and
> diff-tree header lines, without any options to customise the length of
> object names in diff-patch format. When working with diff-patch format,
> we only have two options, either full index, or default abbrev length.
> 
> Although, that consistent is documented, it doesn't stop users from
> trying to use `--abbrev' with the hope of customising diff-patch's
> objects' name's abbreviation.
> 
> Let's resolve that inconsistency.
> 
> To preserve backward compatibility with old script that specify both
> `--full-index' and `--abbrev', always shows full object id
> if `--full-index' is specified.
> 
> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> ---
>  Documentation/diff-options.txt                |  9 +++---
>  diff.c                                        |  5 +++-
>  t/t4013-diff-various.sh                       |  3 ++
>  ...ff.diff-tree_--root_-p_--abbrev=10_initial | 29 +++++++++++++++++++
>  ...--root_-p_--full-index_--abbrev=10_initial | 29 +++++++++++++++++++
>  ...f.diff-tree_--root_-p_--full-index_initial | 29 +++++++++++++++++++
>  6 files changed, 99 insertions(+), 5 deletions(-)
>  create mode 100644 t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
>  create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
>  create mode 100644 t/t4013/diff.diff-tree_--root_-p_--full-index_initial
> 
> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index 7987d72b02..c11efa7865 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -441,10 +441,11 @@ endif::git-format-patch[]
>  --abbrev[=<n>]::
>  	Instead of showing the full 40-byte hexadecimal object
>  	name in diff-raw format output and diff-tree header
> -	lines, show only a partial prefix.  This is
> -	independent of the `--full-index` option above, which controls
> -	the diff-patch output format.  Non default number of
> -	digits can be specified with `--abbrev=<n>`.
> +	lines, show only a partial prefix.
> +	In diff-patch output format, `--full-index` takes higher
> +	precedent, i.e. if `--full-index` is specified, full blob
> +	names will be shown regardless of `--abbrev`.
> +	Non default number of digits can be specified with `--abbrev=<n>`.
>  
>  -B[<n>][/<m>]::
>  --break-rewrites[=[<n>][/<m>]]::
> diff --git a/diff.c b/diff.c
> index f9709de7b4..20dedfe2a9 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -4319,7 +4319,10 @@ static void fill_metainfo(struct strbuf *msg,
>  	}
>  	if (one && two && !oideq(&one->oid, &two->oid)) {
>  		const unsigned hexsz = the_hash_algo->hexsz;
> -		int abbrev = o->flags.full_index ? hexsz : DEFAULT_ABBREV;
> +		int abbrev = o->abbrev ? o->abbrev : DEFAULT_ABBREV;
> +
> +		if (o->flags.full_index)
> +			abbrev = hexsz;
>  
>  		if (o->flags.binary) {
>  			mmfile_t mf;
> diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
> index 3f60f7d96c..e6eb4dd4c7 100755
> --- a/t/t4013-diff-various.sh
> +++ b/t/t4013-diff-various.sh
> @@ -221,6 +221,9 @@ diff-tree --root -r --abbrev=4 initial
>  :noellipses diff-tree --root -r --abbrev=4 initial
>  diff-tree -p initial
>  diff-tree --root -p initial
> +diff-tree --root -p --abbrev=10 initial
> +diff-tree --root -p --full-index initial
> +diff-tree --root -p --full-index --abbrev=10 initial
>  diff-tree --patch-with-stat initial
>  diff-tree --root --patch-with-stat initial
>  diff-tree --patch-with-raw initial
> diff --git a/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial b/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
> new file mode 100644
> index 0000000000..7518a9044e
> --- /dev/null
> +++ b/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
> @@ -0,0 +1,29 @@
> +$ git diff-tree --root -p --abbrev=10 initial
> +444ac553ac7612cc88969031b02b3767fb8a353a
> +diff --git a/dir/sub b/dir/sub
> +new file mode 100644
> +index 0000000000..35d242ba79
> +--- /dev/null
> ++++ b/dir/sub
> +@@ -0,0 +1,2 @@
> ++A
> ++B
> +diff --git a/file0 b/file0
> +new file mode 100644
> +index 0000000000..01e79c32a8
> +--- /dev/null
> ++++ b/file0
> +@@ -0,0 +1,3 @@
> ++1
> ++2
> ++3
> +diff --git a/file2 b/file2
> +new file mode 100644
> +index 0000000000..01e79c32a8
> +--- /dev/null
> ++++ b/file2
> +@@ -0,0 +1,3 @@
> ++1
> ++2
> ++3
> +$
> diff --git a/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial b/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
> new file mode 100644
> index 0000000000..69f913fbe5
> --- /dev/null
> +++ b/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
> @@ -0,0 +1,29 @@
> +$ git diff-tree --root -p --full-index --abbrev=10 initial
> +444ac553ac7612cc88969031b02b3767fb8a353a
> +diff --git a/dir/sub b/dir/sub
> +new file mode 100644
> +index 0000000000000000000000000000000000000000..35d242ba79ae89ac695e26b3d4c27a8e6f028f9e
> +--- /dev/null
> ++++ b/dir/sub
> +@@ -0,0 +1,2 @@
> ++A
> ++B
> +diff --git a/file0 b/file0
> +new file mode 100644
> +index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
> +--- /dev/null
> ++++ b/file0
> +@@ -0,0 +1,3 @@
> ++1
> ++2
> ++3
> +diff --git a/file2 b/file2
> +new file mode 100644
> +index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
> +--- /dev/null
> ++++ b/file2
> +@@ -0,0 +1,3 @@
> ++1
> ++2
> ++3
> +$
> diff --git a/t/t4013/diff.diff-tree_--root_-p_--full-index_initial b/t/t4013/diff.diff-tree_--root_-p_--full-index_initial
> new file mode 100644
> index 0000000000..1b0b6717fa
> --- /dev/null
> +++ b/t/t4013/diff.diff-tree_--root_-p_--full-index_initial
> @@ -0,0 +1,29 @@
> +$ git diff-tree --root -p --full-index initial
> +444ac553ac7612cc88969031b02b3767fb8a353a
> +diff --git a/dir/sub b/dir/sub
> +new file mode 100644
> +index 0000000000000000000000000000000000000000..35d242ba79ae89ac695e26b3d4c27a8e6f028f9e
> +--- /dev/null
> ++++ b/dir/sub
> +@@ -0,0 +1,2 @@
> ++A
> ++B
> +diff --git a/file0 b/file0
> +new file mode 100644
> +index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
> +--- /dev/null
> ++++ b/file0
> +@@ -0,0 +1,3 @@
> ++1
> ++2
> ++3
> +diff --git a/file2 b/file2
> +new file mode 100644
> +index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
> +--- /dev/null
> ++++ b/file2
> +@@ -0,0 +1,3 @@
> ++1
> ++2
> ++3
> +$

All these new tests break when run with GIT_TEST_DEFAULT_HASH=sha256
with something like:

  + test_cmp expect actual
  --- expect	2020-08-14 15:05:12.209285397 +0000
  +++ actual	2020-08-14 15:05:12.205285279 +0000
  @@ -2,7 +2,7 @@
   0000000000000000000000000000000000000000000000000000000000000000
   diff --git a/dir/sub b/dir/sub
   new file mode ffff44
  -index ffffffffff..fffffffa79
  +index ffffffffff..fffffff895
   --- /dev/null
   +++ b/dir/sub
   @@ -0,0 +1,2 @@
  @@ -10,7 +10,7 @@
   +B
   diff --git a/file0 b/file0
   new file mode ffff44
  -index ffffffffff..fffffff2a8
  +index ffffffffff..fffffff5d7
   --- /dev/null
   +++ b/file0
   @@ -0,0 +1,3 @@
  @@ -19,7 +19,7 @@
   +3
   diff --git a/file2 b/file2
   new file mode ffff44
  -index ffffffffff..fffffff2a8
  +index ffffffffff..fffffff5d7
   --- /dev/null
   +++ b/file2
   @@ -0,0 +1,3 @@
  error: last command exited with $?=1
Junio C Hamano Aug. 14, 2020, 5 p.m. UTC | #2
SZEDER Gábor <szeder.dev@gmail.com> writes:

>> +diff --git a/file0 b/file0
>> +new file mode 100644
>> +index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
>> +--- /dev/null
>> ++++ b/file0
>> +@@ -0,0 +1,3 @@
>> ++1
>> ++2
>> ++3
>> +diff --git a/file2 b/file2
>> +new file mode 100644
>> +index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
>> +--- /dev/null
>> ++++ b/file2
>> +@@ -0,0 +1,3 @@
>> ++1
>> ++2
>> ++3
>> +$
>
> All these new tests break when run with GIT_TEST_DEFAULT_HASH=sha256
> with something like:
>
>   + test_cmp expect actual
>   --- expect	2020-08-14 15:05:12.209285397 +0000
>   +++ actual	2020-08-14 15:05:12.205285279 +0000
>   @@ -2,7 +2,7 @@
>    0000000000000000000000000000000000000000000000000000000000000000
>    diff --git a/dir/sub b/dir/sub
>    new file mode ffff44
>   -index ffffffffff..fffffffa79
>   +index ffffffffff..fffffff895

Ouch.  These apparently come from

process_diffs () {
	_x04="[0-9a-f][0-9a-f][0-9a-f][0-9a-f]" &&
	_x07="$_x05[0-9a-f][0-9a-f]" &&
	sed -e "s/$OID_REGEX/$ZERO_OID/g" \
	    -e "s/From $_x40 /From $ZERO_OID /" \
	    -e "s/from $_x40)/from $ZERO_OID)/" \
	    -e "s/commit $_x40\$/commit $ZERO_OID/" \
	    -e "s/commit $_x40 (/commit $ZERO_OID (/" \
	    -e "s/$_x40 $_x40 $_x40/$ZERO_OID $ZERO_OID $ZERO_OID/" \
	    -e "s/$_x40 $_x40 /$ZERO_OID $ZERO_OID /" \
	    -e "s/^$_x40 $_x40$/$ZERO_OID $ZERO_OID/" \
	    -e "s/^$_x40 /$ZERO_OID /" \
	    -e "s/^$_x40$/$ZERO_OID/" \
	    -e "s/$_x07\.\.$_x07/fffffff..fffffff/g" \
	    -e "s/$_x07,$_x07\.\.$_x07/fffffff,fffffff..fffffff/g" \
	    -e "s/$_x07 $_x07 $_x07/fffffff fffffff fffffff/g" \
	    -e "s/$_x07 $_x07 /fffffff fffffff /g" \
	    -e "s/Merge: $_x07 $_x07/Merge: fffffff fffffff/g" \
	    -e "s/$_x07\.\.\./fffffff.../g" \
	    -e "s/ $_x04\.\.\./ ffff.../g" \
	    -e "s/ $_x04/ ffff/g" \
	    "$1"
}

Hash-independence may be good, but it should not munge expected mode
bits from 100644 to ffff44, which I think is a bug in the original
introduced in 72f936b1 (t4013: make test hash independent,
2020-02-07).

When we are adjusting the abbrev length of the index line, of course
$_x07 would not be sufficient to match the abbreviated object name
in full, so a79 vs 895 can be explained and is a bug in this patch
that did not update the process_diffs helper.

Another thing that I find somewhat problematic in the original
(brian cc'ed) is that it does not special case all-zero object name
specially.  By turning any and all instances of $_x40 to $ZERO_OID,
we lose the distinction between a random-looking object name which
got turned into $ZERO_OID by the processing, and an object name that
was $ZERO_OID from the beginning, so we won't catch a possible
future bug where new file's preimage object name is not $ZERO_OID
(this is plausible when you try to show an intent-to-add entry; the
diff between the index and the working tree would be "new file"
patch, but the index entry records the object name for an empty
blob, e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 instead of $ZERO_OID
can easily be emitted by a mistaken implementation).
Junio C Hamano Aug. 14, 2020, 6:59 p.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

> Ouch.  These apparently come from
>
> process_diffs () {
> ...
> }
>
> Hash-independence may be good, but it should not munge expected mode
> bits from 100644 to ffff44, which I think is a bug in the original
> introduced in 72f936b1 (t4013: make test hash independent,
> 2020-02-07).
>
> When we are adjusting the abbrev length of the index line, of course
> $_x07 would not be sufficient to match the abbreviated object name
> in full, so a79 vs 895 can be explained and is a bug in this patch
> that did not update the process_diffs helper.
>
> Another thing that I find somewhat problematic in the original
> (brian cc'ed) is that it does not special case all-zero object name
> specially.  By turning any and all instances of $_x40 to $ZERO_OID,
> we lose the distinction between a random-looking object name which
> got turned into $ZERO_OID by the processing, and an object name that
> was $ZERO_OID from the beginning, so we won't catch a possible
> future bug where new file's preimage object name is not $ZERO_OID
> (this is plausible when you try to show an intent-to-add entry; the
> diff between the index and the working tree would be "new file"
> patch, but the index entry records the object name for an empty
> blob, e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 instead of $ZERO_OID
> can easily be emitted by a mistaken implementation).

So here is what I came up with as a possible starting point.  The
idea is to grab hexadecimal strings at locations the original tried
to isolate with various contexts, and

 - if the input happens to be all zero, use '0', otherwise use 'f'

 - if the input is 40-bytes (i.e. unabbreviated object name in the
   SHA-1 world), repeat the character chosen in the first step as
   many times as there are chars in $ZERO_OID

 - otherwise, repeat the character chosen in the first step as many
   times as there are chars in the input.

 - regardless of all of the above, special case possible in-tree
   blob modes (100644, 100755 and 120000) and don't munge them.

I haven't tried it with the patch that started this discussion
thread, nor with SHA-256 build, though.


 t/t4013-diff-various.sh | 58 +++++++++++++++++++++++++++++++------------------
 1 file changed, 37 insertions(+), 21 deletions(-)

diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 43267d6024..b33e60ab9d 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -130,27 +130,43 @@ test_expect_success setup '
 EOF
 
 process_diffs () {
-	_x04="[0-9a-f][0-9a-f][0-9a-f][0-9a-f]" &&
-	_x07="$_x05[0-9a-f][0-9a-f]" &&
-	sed -e "s/$OID_REGEX/$ZERO_OID/g" \
-	    -e "s/From $_x40 /From $ZERO_OID /" \
-	    -e "s/from $_x40)/from $ZERO_OID)/" \
-	    -e "s/commit $_x40\$/commit $ZERO_OID/" \
-	    -e "s/commit $_x40 (/commit $ZERO_OID (/" \
-	    -e "s/$_x40 $_x40 $_x40/$ZERO_OID $ZERO_OID $ZERO_OID/" \
-	    -e "s/$_x40 $_x40 /$ZERO_OID $ZERO_OID /" \
-	    -e "s/^$_x40 $_x40$/$ZERO_OID $ZERO_OID/" \
-	    -e "s/^$_x40 /$ZERO_OID /" \
-	    -e "s/^$_x40$/$ZERO_OID/" \
-	    -e "s/$_x07\.\.$_x07/fffffff..fffffff/g" \
-	    -e "s/$_x07,$_x07\.\.$_x07/fffffff,fffffff..fffffff/g" \
-	    -e "s/$_x07 $_x07 $_x07/fffffff fffffff fffffff/g" \
-	    -e "s/$_x07 $_x07 /fffffff fffffff /g" \
-	    -e "s/Merge: $_x07 $_x07/Merge: fffffff fffffff/g" \
-	    -e "s/$_x07\.\.\./fffffff.../g" \
-	    -e "s/ $_x04\.\.\./ ffff.../g" \
-	    -e "s/ $_x04/ ffff/g" \
-	    "$1"
+	perl -e '
+		my $oid_length = length($ARGV[0]);
+		my $x40 = "[0-9a-f]{40}";
+		my $xab = "[0-9a-f]{4,16}";
+		my $orx = "[0-9a-f]" x $oid_length;
+
+		sub munge_oid {
+			my ($oid) = @_;
+			my $x;
+
+			if ($oid =~ /^(100644|100755|120000)$/) {
+				return $oid;
+			}
+
+			if ($oid =~ /^0*$/) {
+				$x = "0";
+			} else {
+				$x = "f";
+			}
+
+			if (length($oid) == 40) {
+				return $x x $oid_length;
+			} else {
+				return $x x length($oid);
+			}
+		}
+
+		while (<STDIN>) {
+			s/($orx)/munge_oid($1)/ge;
+			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
+			s/commit ($x40)($| \()/"commit " . munge_oid($1) . $2/ge;
+			s/($x40) /munge_oid($1) . " "/ge;
+			s/^($x40)($| )/munge_oid($1) . $2/e;
+			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
+			print;
+		}
+	' "$ZERO_OID" <"$1"
 }
 
 V=$(git version | sed -e 's/^git version //' -e 's/\./\\./g')
brian m. carlson Aug. 15, 2020, 12:21 a.m. UTC | #4
On 2020-08-14 at 18:59:53, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Ouch.  These apparently come from
> >
> > process_diffs () {
> > ...
> > }
> >
> > Hash-independence may be good, but it should not munge expected mode
> > bits from 100644 to ffff44, which I think is a bug in the original
> > introduced in 72f936b1 (t4013: make test hash independent,
> > 2020-02-07).
> >
> > When we are adjusting the abbrev length of the index line, of course
> > $_x07 would not be sufficient to match the abbreviated object name
> > in full, so a79 vs 895 can be explained and is a bug in this patch
> > that did not update the process_diffs helper.
> >
> > Another thing that I find somewhat problematic in the original
> > (brian cc'ed) is that it does not special case all-zero object name
> > specially.  By turning any and all instances of $_x40 to $ZERO_OID,
> > we lose the distinction between a random-looking object name which
> > got turned into $ZERO_OID by the processing, and an object name that
> > was $ZERO_OID from the beginning, so we won't catch a possible
> > future bug where new file's preimage object name is not $ZERO_OID
> > (this is plausible when you try to show an intent-to-add entry; the
> > diff between the index and the working tree would be "new file"
> > patch, but the index entry records the object name for an empty
> > blob, e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 instead of $ZERO_OID
> > can easily be emitted by a mistaken implementation).

Yeah, it wasn't intended that we munge mode bits, and I definitely agree
we'd be better off distinguishing between all-zero and non-zero OIDs.

As you might imagine, this is not my favorite test because have a large
amount of diff formats, and even I think the giant list of sed
expressions I wrote is hideous.  It is, however, reasonably
comprehensive, which is pretty much the only nice thing that can be said
about it.

> So here is what I came up with as a possible starting point.  The
> idea is to grab hexadecimal strings at locations the original tried
> to isolate with various contexts, and
> 
>  - if the input happens to be all zero, use '0', otherwise use 'f'
> 
>  - if the input is 40-bytes (i.e. unabbreviated object name in the
>    SHA-1 world), repeat the character chosen in the first step as
>    many times as there are chars in $ZERO_OID
> 
>  - otherwise, repeat the character chosen in the first step as many
>    times as there are chars in the input.
> 
>  - regardless of all of the above, special case possible in-tree
>    blob modes (100644, 100755 and 120000) and don't munge them.
> 
> I haven't tried it with the patch that started this discussion
> thread, nor with SHA-256 build, though.

This seems like a sane approach.

> diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
> index 43267d6024..b33e60ab9d 100755
> --- a/t/t4013-diff-various.sh
> +++ b/t/t4013-diff-various.sh
> @@ -130,27 +130,43 @@ test_expect_success setup '
>  EOF
>  
>  process_diffs () {
> -	_x04="[0-9a-f][0-9a-f][0-9a-f][0-9a-f]" &&
> -	_x07="$_x05[0-9a-f][0-9a-f]" &&
> -	sed -e "s/$OID_REGEX/$ZERO_OID/g" \
> -	    -e "s/From $_x40 /From $ZERO_OID /" \
> -	    -e "s/from $_x40)/from $ZERO_OID)/" \
> -	    -e "s/commit $_x40\$/commit $ZERO_OID/" \
> -	    -e "s/commit $_x40 (/commit $ZERO_OID (/" \
> -	    -e "s/$_x40 $_x40 $_x40/$ZERO_OID $ZERO_OID $ZERO_OID/" \
> -	    -e "s/$_x40 $_x40 /$ZERO_OID $ZERO_OID /" \
> -	    -e "s/^$_x40 $_x40$/$ZERO_OID $ZERO_OID/" \
> -	    -e "s/^$_x40 /$ZERO_OID /" \
> -	    -e "s/^$_x40$/$ZERO_OID/" \
> -	    -e "s/$_x07\.\.$_x07/fffffff..fffffff/g" \
> -	    -e "s/$_x07,$_x07\.\.$_x07/fffffff,fffffff..fffffff/g" \
> -	    -e "s/$_x07 $_x07 $_x07/fffffff fffffff fffffff/g" \
> -	    -e "s/$_x07 $_x07 /fffffff fffffff /g" \
> -	    -e "s/Merge: $_x07 $_x07/Merge: fffffff fffffff/g" \
> -	    -e "s/$_x07\.\.\./fffffff.../g" \
> -	    -e "s/ $_x04\.\.\./ ffff.../g" \
> -	    -e "s/ $_x04/ ffff/g" \
> -	    "$1"
> +	perl -e '
> +		my $oid_length = length($ARGV[0]);
> +		my $x40 = "[0-9a-f]{40}";
> +		my $xab = "[0-9a-f]{4,16}";
> +		my $orx = "[0-9a-f]" x $oid_length;
> +
> +		sub munge_oid {
> +			my ($oid) = @_;
> +			my $x;
> +
> +			if ($oid =~ /^(100644|100755|120000)$/) {
> +				return $oid;
> +			}
> +
> +			if ($oid =~ /^0*$/) {
> +				$x = "0";
> +			} else {
> +				$x = "f";
> +			}
> +
> +			if (length($oid) == 40) {
> +				return $x x $oid_length;
> +			} else {
> +				return $x x length($oid);
> +			}
> +		}
> +
> +		while (<STDIN>) {
> +			s/($orx)/munge_oid($1)/ge;
> +			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
> +			s/commit ($x40)($| \()/"commit " . munge_oid($1) . $2/ge;
> +			s/($x40) /munge_oid($1) . " "/ge;
> +			s/^($x40)($| )/munge_oid($1) . $2/e;
> +			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
> +			print;
> +		}
> +	' "$ZERO_OID" <"$1"
>  }

This is much nicer, but I think we need the following on top of it
because we have a couple of tricky cases the original didn't consider:

* Some of our 64-bit object IDs get processed twice, turning them into
  88-character object IDs, which don't match.  We therefore need "\b".
* The new unabbreviated index lines aren't accounted for, so I included
  them by possibly matching "\.\.".
* We have some lines that look like "commit $OID (from $OID)" that
  aren't accounted for.  Because we now have an optional OID in
  munge_oid, I had to account for that as well.

So this is what's on top:

diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index b5c7e1a63b..dfc87a0d19 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -140,6 +140,8 @@ process_diffs () {
 			my ($oid) = @_;
 			my $x;
 
+			return "" unless length $oid;
+
 			if ($oid =~ /^(100644|100755|120000)$/) {
 				return $oid;
 			}
@@ -160,8 +162,8 @@ process_diffs () {
 		while (<STDIN>) {
 			s/($orx)/munge_oid($1)/ge;
 			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
-			s/commit ($x40)($| \()/"commit " . munge_oid($1) . $2/ge;
-			s/($x40) /munge_oid($1) . " "/ge;
+			s/commit ($x40)($| \(from )($x40?)/"commit " .  munge_oid($1) . $2 . munge_oid($3)/ge;
+			s/\b($x40)( |\.\.|$)/munge_oid($1) . $2/ge;
 			s/^($x40)($| )/munge_oid($1) . $2/e;
 			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
 			print;

Or, a fresh original version:

-- %< --
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index e6eb4dd4c7..dfc87a0d19 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -130,27 +130,45 @@ test_expect_success setup '
 EOF
 
 process_diffs () {
-	_x04="[0-9a-f][0-9a-f][0-9a-f][0-9a-f]" &&
-	_x07="$_x05[0-9a-f][0-9a-f]" &&
-	sed -e "s/$OID_REGEX/$ZERO_OID/g" \
-	    -e "s/From $_x40 /From $ZERO_OID /" \
-	    -e "s/from $_x40)/from $ZERO_OID)/" \
-	    -e "s/commit $_x40\$/commit $ZERO_OID/" \
-	    -e "s/commit $_x40 (/commit $ZERO_OID (/" \
-	    -e "s/$_x40 $_x40 $_x40/$ZERO_OID $ZERO_OID $ZERO_OID/" \
-	    -e "s/$_x40 $_x40 /$ZERO_OID $ZERO_OID /" \
-	    -e "s/^$_x40 $_x40$/$ZERO_OID $ZERO_OID/" \
-	    -e "s/^$_x40 /$ZERO_OID /" \
-	    -e "s/^$_x40$/$ZERO_OID/" \
-	    -e "s/$_x07\.\.$_x07/fffffff..fffffff/g" \
-	    -e "s/$_x07,$_x07\.\.$_x07/fffffff,fffffff..fffffff/g" \
-	    -e "s/$_x07 $_x07 $_x07/fffffff fffffff fffffff/g" \
-	    -e "s/$_x07 $_x07 /fffffff fffffff /g" \
-	    -e "s/Merge: $_x07 $_x07/Merge: fffffff fffffff/g" \
-	    -e "s/$_x07\.\.\./fffffff.../g" \
-	    -e "s/ $_x04\.\.\./ ffff.../g" \
-	    -e "s/ $_x04/ ffff/g" \
-	    "$1"
+	perl -e '
+		my $oid_length = length($ARGV[0]);
+		my $x40 = "[0-9a-f]{40}";
+		my $xab = "[0-9a-f]{4,16}";
+		my $orx = "[0-9a-f]" x $oid_length;
+
+		sub munge_oid {
+			my ($oid) = @_;
+			my $x;
+
+			return "" unless length $oid;
+
+			if ($oid =~ /^(100644|100755|120000)$/) {
+				return $oid;
+			}
+
+			if ($oid =~ /^0*$/) {
+				$x = "0";
+			} else {
+				$x = "f";
+			}
+
+			if (length($oid) == 40) {
+				return $x x $oid_length;
+			} else {
+				return $x x length($oid);
+			}
+		}
+
+		while (<STDIN>) {
+			s/($orx)/munge_oid($1)/ge;
+			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
+			s/commit ($x40)($| \(from )($x40?)/"commit " .  munge_oid($1) . $2 . munge_oid($3)/ge;
+			s/\b($x40)( |\.\.|$)/munge_oid($1) . $2/ge;
+			s/^($x40)($| )/munge_oid($1) . $2/e;
+			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
+			print;
+		}
+	' "$ZERO_OID" <"$1"
 }
 
 V=$(git version | sed -e 's/^git version //' -e 's/\./\\./g')
-- %< --

Anyone is welcome to have my sign-off if they pick up any part of this
patch.
Đoàn Trần Công Danh Aug. 15, 2020, 2:27 a.m. UTC | #5
On 2020-08-15 00:21:20+0000, "brian m. carlson" <sandals@crustytoothpaste.net> wrote:
> On 2020-08-14 at 18:59:53, Junio C Hamano wrote:
> > Junio C Hamano <gitster@pobox.com> writes:
> > 
> > > Ouch.  These apparently come from
> > >
> > > process_diffs () {
> > > ...
> > > }
> > >
> > > Hash-independence may be good, but it should not munge expected mode
> > > bits from 100644 to ffff44, which I think is a bug in the original
> > > introduced in 72f936b1 (t4013: make test hash independent,
> > > 2020-02-07).
> > >
> > > When we are adjusting the abbrev length of the index line, of course
> > > $_x07 would not be sufficient to match the abbreviated object name
> > > in full, so a79 vs 895 can be explained and is a bug in this patch
> > > that did not update the process_diffs helper.
> > >
> > > Another thing that I find somewhat problematic in the original
> > > (brian cc'ed) is that it does not special case all-zero object name
> > > specially.  By turning any and all instances of $_x40 to $ZERO_OID,
> > > we lose the distinction between a random-looking object name which
> > > got turned into $ZERO_OID by the processing, and an object name that
> > > was $ZERO_OID from the beginning, so we won't catch a possible
> > > future bug where new file's preimage object name is not $ZERO_OID
> > > (this is plausible when you try to show an intent-to-add entry; the
> > > diff between the index and the working tree would be "new file"
> > > patch, but the index entry records the object name for an empty
> > > blob, e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 instead of $ZERO_OID
> > > can easily be emitted by a mistaken implementation).
> 
> Yeah, it wasn't intended that we munge mode bits, and I definitely agree
> we'd be better off distinguishing between all-zero and non-zero OIDs.
> 
> As you might imagine, this is not my favorite test because have a large
> amount of diff formats, and even I think the giant list of sed
> expressions I wrote is hideous.  It is, however, reasonably
> comprehensive, which is pretty much the only nice thing that can be said
> about it.
> 
> > So here is what I came up with as a possible starting point.  The
> > idea is to grab hexadecimal strings at locations the original tried
> > to isolate with various contexts, and
> > 
> >  - if the input happens to be all zero, use '0', otherwise use 'f'
> > 
> >  - if the input is 40-bytes (i.e. unabbreviated object name in the
> >    SHA-1 world), repeat the character chosen in the first step as
> >    many times as there are chars in $ZERO_OID
> > 
> >  - otherwise, repeat the character chosen in the first step as many
> >    times as there are chars in the input.
> > 
> >  - regardless of all of the above, special case possible in-tree
> >    blob modes (100644, 100755 and 120000) and don't munge them.
> > 
> > I haven't tried it with the patch that started this discussion
> > thread, nor with SHA-256 build, though.
> 
> This seems like a sane approach.
> 
> > diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
> > index 43267d6024..b33e60ab9d 100755
> > --- a/t/t4013-diff-various.sh
> > +++ b/t/t4013-diff-various.sh
> > @@ -130,27 +130,43 @@ test_expect_success setup '
> >  EOF
> >  
> >  process_diffs () {
> > -	_x04="[0-9a-f][0-9a-f][0-9a-f][0-9a-f]" &&
> > -	_x07="$_x05[0-9a-f][0-9a-f]" &&
> > -	sed -e "s/$OID_REGEX/$ZERO_OID/g" \
> > -	    -e "s/From $_x40 /From $ZERO_OID /" \
> > -	    -e "s/from $_x40)/from $ZERO_OID)/" \
> > -	    -e "s/commit $_x40\$/commit $ZERO_OID/" \
> > -	    -e "s/commit $_x40 (/commit $ZERO_OID (/" \
> > -	    -e "s/$_x40 $_x40 $_x40/$ZERO_OID $ZERO_OID $ZERO_OID/" \
> > -	    -e "s/$_x40 $_x40 /$ZERO_OID $ZERO_OID /" \
> > -	    -e "s/^$_x40 $_x40$/$ZERO_OID $ZERO_OID/" \
> > -	    -e "s/^$_x40 /$ZERO_OID /" \
> > -	    -e "s/^$_x40$/$ZERO_OID/" \
> > -	    -e "s/$_x07\.\.$_x07/fffffff..fffffff/g" \
> > -	    -e "s/$_x07,$_x07\.\.$_x07/fffffff,fffffff..fffffff/g" \
> > -	    -e "s/$_x07 $_x07 $_x07/fffffff fffffff fffffff/g" \
> > -	    -e "s/$_x07 $_x07 /fffffff fffffff /g" \
> > -	    -e "s/Merge: $_x07 $_x07/Merge: fffffff fffffff/g" \
> > -	    -e "s/$_x07\.\.\./fffffff.../g" \
> > -	    -e "s/ $_x04\.\.\./ ffff.../g" \
> > -	    -e "s/ $_x04/ ffff/g" \
> > -	    "$1"
> > +	perl -e '
> > +		my $oid_length = length($ARGV[0]);
> > +		my $x40 = "[0-9a-f]{40}";
> > +		my $xab = "[0-9a-f]{4,16}";
> > +		my $orx = "[0-9a-f]" x $oid_length;
> > +
> > +		sub munge_oid {
> > +			my ($oid) = @_;
> > +			my $x;
> > +
> > +			if ($oid =~ /^(100644|100755|120000)$/) {
> > +				return $oid;
> > +			}
> > +
> > +			if ($oid =~ /^0*$/) {
> > +				$x = "0";
> > +			} else {
> > +				$x = "f";
> > +			}
> > +
> > +			if (length($oid) == 40) {
> > +				return $x x $oid_length;
> > +			} else {
> > +				return $x x length($oid);
> > +			}
> > +		}
> > +
> > +		while (<STDIN>) {
> > +			s/($orx)/munge_oid($1)/ge;
> > +			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
> > +			s/commit ($x40)($| \()/"commit " . munge_oid($1) . $2/ge;
> > +			s/($x40) /munge_oid($1) . " "/ge;
> > +			s/^($x40)($| )/munge_oid($1) . $2/e;
> > +			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
> > +			print;
> > +		}
> > +	' "$ZERO_OID" <"$1"
> >  }
> 
> This is much nicer, but I think we need the following on top of it
> because we have a couple of tricky cases the original didn't consider:
> 
> * Some of our 64-bit object IDs get processed twice, turning them into
>   88-character object IDs, which don't match.  We therefore need "\b".
> * The new unabbreviated index lines aren't accounted for, so I included
>   them by possibly matching "\.\.".
> * We have some lines that look like "commit $OID (from $OID)" that
>   aren't accounted for.  Because we now have an optional OID in
>   munge_oid, I had to account for that as well.
> 
> So this is what's on top:
> 
> diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
> index b5c7e1a63b..dfc87a0d19 100755
> --- a/t/t4013-diff-various.sh
> +++ b/t/t4013-diff-various.sh
> @@ -140,6 +140,8 @@ process_diffs () {
>  			my ($oid) = @_;
>  			my $x;
>  
> +			return "" unless length $oid;
> +
>  			if ($oid =~ /^(100644|100755|120000)$/) {
>  				return $oid;
>  			}
> @@ -160,8 +162,8 @@ process_diffs () {
>  		while (<STDIN>) {
>  			s/($orx)/munge_oid($1)/ge;
>  			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
> -			s/commit ($x40)($| \()/"commit " . munge_oid($1) . $2/ge;
> -			s/($x40) /munge_oid($1) . " "/ge;
> +			s/commit ($x40)($| \(from )($x40?)/"commit " .  munge_oid($1) . $2 . munge_oid($3)/ge;
> +			s/\b($x40)( |\.\.|$)/munge_oid($1) . $2/ge;
>  			s/^($x40)($| )/munge_oid($1) . $2/e;
>  			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
>  			print;
> 
> Or, a fresh original version:
> 
> -- %< --
> diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
> index e6eb4dd4c7..dfc87a0d19 100755
> --- a/t/t4013-diff-various.sh
> +++ b/t/t4013-diff-various.sh
> @@ -130,27 +130,45 @@ test_expect_success setup '
>  EOF
>  
>  process_diffs () {
> -	_x04="[0-9a-f][0-9a-f][0-9a-f][0-9a-f]" &&
> -	_x07="$_x05[0-9a-f][0-9a-f]" &&
> -	sed -e "s/$OID_REGEX/$ZERO_OID/g" \
> -	    -e "s/From $_x40 /From $ZERO_OID /" \
> -	    -e "s/from $_x40)/from $ZERO_OID)/" \
> -	    -e "s/commit $_x40\$/commit $ZERO_OID/" \
> -	    -e "s/commit $_x40 (/commit $ZERO_OID (/" \
> -	    -e "s/$_x40 $_x40 $_x40/$ZERO_OID $ZERO_OID $ZERO_OID/" \
> -	    -e "s/$_x40 $_x40 /$ZERO_OID $ZERO_OID /" \
> -	    -e "s/^$_x40 $_x40$/$ZERO_OID $ZERO_OID/" \
> -	    -e "s/^$_x40 /$ZERO_OID /" \
> -	    -e "s/^$_x40$/$ZERO_OID/" \
> -	    -e "s/$_x07\.\.$_x07/fffffff..fffffff/g" \
> -	    -e "s/$_x07,$_x07\.\.$_x07/fffffff,fffffff..fffffff/g" \
> -	    -e "s/$_x07 $_x07 $_x07/fffffff fffffff fffffff/g" \
> -	    -e "s/$_x07 $_x07 /fffffff fffffff /g" \
> -	    -e "s/Merge: $_x07 $_x07/Merge: fffffff fffffff/g" \
> -	    -e "s/$_x07\.\.\./fffffff.../g" \
> -	    -e "s/ $_x04\.\.\./ ffff.../g" \
> -	    -e "s/ $_x04/ ffff/g" \
> -	    "$1"
> +	perl -e '
> +		my $oid_length = length($ARGV[0]);
> +		my $x40 = "[0-9a-f]{40}";
> +		my $xab = "[0-9a-f]{4,16}";
> +		my $orx = "[0-9a-f]" x $oid_length;
> +
> +		sub munge_oid {
> +			my ($oid) = @_;
> +			my $x;
> +
> +			return "" unless length $oid;
> +
> +			if ($oid =~ /^(100644|100755|120000)$/) {
> +				return $oid;
> +			}
> +
> +			if ($oid =~ /^0*$/) {
> +				$x = "0";
> +			} else {
> +				$x = "f";
> +			}
> +
> +			if (length($oid) == 40) {
> +				return $x x $oid_length;
> +			} else {
> +				return $x x length($oid);
> +			}
> +		}
> +
> +		while (<STDIN>) {
> +			s/($orx)/munge_oid($1)/ge;
> +			s/From ($x40)( |\))/"From " . munge_oid($1) . $2/ge;
> +			s/commit ($x40)($| \(from )($x40?)/"commit " .  munge_oid($1) . $2 . munge_oid($3)/ge;
> +			s/\b($x40)( |\.\.|$)/munge_oid($1) . $2/ge;
> +			s/^($x40)($| )/munge_oid($1) . $2/e;
> +			s/($xab)(\.\.|,| |\.\.\.|$)/munge_oid($1) . $2/ge;
> +			print;
> +		}
> +	' "$ZERO_OID" <"$1"
>  }
>  
>  V=$(git version | sed -e 's/^git version //' -e 's/\./\\./g')
> -- %< --
> 
> Anyone is welcome to have my sign-off if they pick up any part of this
> patch.

This one seems to work with:

	make -j9 test GIT_TEST_DEFAULT_HASH=sha256

If noone step up and write this into a patch in some days,
I'll take this as first step in my series.

Also waiting some days so other people could come up with better idea,
sed's y seems to be able to work if we don't have the constraint
on all-0 oid.
Junio C Hamano Aug. 17, 2020, 4:17 p.m. UTC | #6
Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

>> Anyone is welcome to have my sign-off if they pick up any part of this
>> patch.

Likewise.

> This one seems to work with:
>
> 	make -j9 test GIT_TEST_DEFAULT_HASH=sha256
>
> If noone step up and write this into a patch in some days,
> I'll take this as first step in my series.
>
> Also waiting some days so other people could come up with better idea,
> sed's y seems to be able to work if we don't have the constraint
> on all-0 oid.

y/// would help with the "same length" transformation, but I think
the primary reason why "sed" would not work well in this example is
"locate the things to tranform" part is harder to separate from
"here is how to transform a token" part when written in "sed".

Yes, you could use /pattern/{ s/from/to/; s/from2/to/; ...} but
at least I didn't see an obvious way to simplify the original and
reduce its repetitiveness.
Junio C Hamano Aug. 20, 2020, 4:56 a.m. UTC | #7
Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

>> Anyone is welcome to have my sign-off if they pick up any part of this
>> patch.
>
> This one seems to work with:
>
> 	make -j9 test GIT_TEST_DEFAULT_HASH=sha256
>
> If noone step up and write this into a patch in some days,
> I'll take this as first step in my series.

So, is there anything happening on this front?

Thanks.
diff mbox series

Patch

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 7987d72b02..c11efa7865 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -441,10 +441,11 @@  endif::git-format-patch[]
 --abbrev[=<n>]::
 	Instead of showing the full 40-byte hexadecimal object
 	name in diff-raw format output and diff-tree header
-	lines, show only a partial prefix.  This is
-	independent of the `--full-index` option above, which controls
-	the diff-patch output format.  Non default number of
-	digits can be specified with `--abbrev=<n>`.
+	lines, show only a partial prefix.
+	In diff-patch output format, `--full-index` takes higher
+	precedent, i.e. if `--full-index` is specified, full blob
+	names will be shown regardless of `--abbrev`.
+	Non default number of digits can be specified with `--abbrev=<n>`.
 
 -B[<n>][/<m>]::
 --break-rewrites[=[<n>][/<m>]]::
diff --git a/diff.c b/diff.c
index f9709de7b4..20dedfe2a9 100644
--- a/diff.c
+++ b/diff.c
@@ -4319,7 +4319,10 @@  static void fill_metainfo(struct strbuf *msg,
 	}
 	if (one && two && !oideq(&one->oid, &two->oid)) {
 		const unsigned hexsz = the_hash_algo->hexsz;
-		int abbrev = o->flags.full_index ? hexsz : DEFAULT_ABBREV;
+		int abbrev = o->abbrev ? o->abbrev : DEFAULT_ABBREV;
+
+		if (o->flags.full_index)
+			abbrev = hexsz;
 
 		if (o->flags.binary) {
 			mmfile_t mf;
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 3f60f7d96c..e6eb4dd4c7 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -221,6 +221,9 @@  diff-tree --root -r --abbrev=4 initial
 :noellipses diff-tree --root -r --abbrev=4 initial
 diff-tree -p initial
 diff-tree --root -p initial
+diff-tree --root -p --abbrev=10 initial
+diff-tree --root -p --full-index initial
+diff-tree --root -p --full-index --abbrev=10 initial
 diff-tree --patch-with-stat initial
 diff-tree --root --patch-with-stat initial
 diff-tree --patch-with-raw initial
diff --git a/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial b/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
new file mode 100644
index 0000000000..7518a9044e
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--abbrev=10_initial
@@ -0,0 +1,29 @@ 
+$ git diff-tree --root -p --abbrev=10 initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000..35d242ba79
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000..01e79c32a8
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000..01e79c32a8
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
diff --git a/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial b/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
new file mode 100644
index 0000000000..69f913fbe5
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--full-index_--abbrev=10_initial
@@ -0,0 +1,29 @@ 
+$ git diff-tree --root -p --full-index --abbrev=10 initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000000000000000000000000000000000..35d242ba79ae89ac695e26b3d4c27a8e6f028f9e
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$
diff --git a/t/t4013/diff.diff-tree_--root_-p_--full-index_initial b/t/t4013/diff.diff-tree_--root_-p_--full-index_initial
new file mode 100644
index 0000000000..1b0b6717fa
--- /dev/null
+++ b/t/t4013/diff.diff-tree_--root_-p_--full-index_initial
@@ -0,0 +1,29 @@ 
+$ git diff-tree --root -p --full-index initial
+444ac553ac7612cc88969031b02b3767fb8a353a
+diff --git a/dir/sub b/dir/sub
+new file mode 100644
+index 0000000000000000000000000000000000000000..35d242ba79ae89ac695e26b3d4c27a8e6f028f9e
+--- /dev/null
++++ b/dir/sub
+@@ -0,0 +1,2 @@
++A
++B
+diff --git a/file0 b/file0
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file0
+@@ -0,0 +1,3 @@
++1
++2
++3
+diff --git a/file2 b/file2
+new file mode 100644
+index 0000000000000000000000000000000000000000..01e79c32a8c99c557f0757da7cb6d65b3414466d
+--- /dev/null
++++ b/file2
+@@ -0,0 +1,3 @@
++1
++2
++3
+$