Blob hash of binary files in patches generated by git format-patch is shown in full form instead of short form

Message ID 499c9922-eb42-c2a8-b4b4-8e5197ea0fc6@gmail.com (mailing list archive)
State Accepted
Commit 8925eff9038497d2b6c45489e14dde945e32d96d

Commit Message

Bagas Sanjaya March 21, 2021, 1:05 p.m. UTC
Thank you for filling out a Git bug report!
Please answer the following questions to help us understand your issue.

What did you do before the bug happened? (Steps to reproduce your issue)

I'm trying to run format-patch on commits that contain binary files (images).
Each commit adds an image and its alt description text in a separate
file.

Full steps:

   First, initialize an empty repo and populate it with commits:
   - cd /tmp
   - mkdir bin-patch && cd bin-patch
   - git init
   - echo "test format-patch binary files" > README
   - git add * && git commit -m "init README"
   - git checkout -b test
   - wget -c [1] -O stackoverflow.png && echo "Stack Overflow" > stackoverflow.alt
   - git add * && git commit -m "Add Stack Overflow logo"
   - wget -c [2] -O idntm.jpg && echo "Indonesia's Next Top Model cast" > idntm.alt
   - git add * && git commit -m "Add IdNTM cast poster"

   Now prepare patches as usual (with cover letter ignored for this purpose):
   - git format-patch --cover-letter -M master

   (image links):
   [1]: https://upload.wikimedia.org/wikipedia/commons/thumb/0/02/Stack_Overflow_logo.svg/1280px-Stack_Overflow_logo.svg.png
   [2]: https://upload.wikimedia.org/wikipedia/en/9/9f/IndonesiaNTM1Cast.jpg
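
The steps above need network access for the two images. As a rough,
network-free sketch of the same reproduction, the script below swaps in a
few bytes containing a NUL for the image (an assumption: Git treats such
content as binary). The file names `logo.bin` and `logo.alt`, the `commit`
helper, and the inline identity are all made up for illustration.

```shell
#!/bin/sh
# Sketch only: reproduce the index-line difference without downloading images.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Hypothetical helper: commit with an inline identity so the sketch does not
# depend on the user's global Git configuration.
commit() {
    git -c user.name=Tester -c user.email=tester@example.com \
        -c commit.gpgsign=false commit -q -m "$1"
}

echo "test format-patch binary files" > README
git add README && commit "init README"

printf '\0\1\2\3' > logo.bin       # binary stand-in for stackoverflow.png
echo "Stack Overflow" > logo.alt   # text file, as in the report
git add logo.bin logo.alt && commit "Add logo"

# Keep only the index lines of the patch generated for the last commit.
index_lines=$(git format-patch --stdout -1 HEAD | grep '^index ')
printf '%s\n' "$index_lines"
```

With a default configuration this should print one abbreviated index line
(for `logo.alt`) and one with full-length object names (for `logo.bin`).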

What did you expect to happen? (Expected behavior)

Blob hashes in the `index` header of generated patches for binary files
(images) use the short form (7 characters), just like for text files (alt
descriptions).

What happened instead? (Actual behavior)

Blob hashes in the `index` header of generated patches for images use the
full (long) form.

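For the first patch (Stack Overflow commit), the diff header for the image reads:
```
diff --git a/stackoverflow.png b/stackoverflow.png
new file mode 100644
index 0000000000000000000000000000000000000000..969908ad3161a66af31f2441cfea4ae002a5ec67
```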

What's different between what you expected and what actually happened?

Blob hashes for binary files are shown in full form, whereas blob hashes
for text files are abbreviated.

Anything else you want to add:
(none)

Please review the rest of the bug report below.
You can delete any lines you don't wish to share.


[System Info]
git version:
git version 2.31.0.29.g98164e9585
cpu: x86_64
built from commit: 98164e9585e02e31dcf1377a553efe076c15f8c6
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
uname: Linux 5.11.6-kernelorg-upstream-generic #1 SMP Fri Mar 12 06:35:27 WIB 2021 x86_64
compiler info: gnuc: 9.3
libc info: glibc: 2.31
$SHELL (typically, interactive shell): /bin/bash


[Enabled Hooks]
(none)

Comments

Junio C Hamano March 21, 2021, 5:31 p.m. UTC | #1
Bagas Sanjaya <bagasdotme@gmail.com> writes:

> What's different between what you expected and what actually happened?
>
> Blob hashes for binary files are shown in full form, whereas blob hashes
> for text files are abbreviated.

This is working as intended, designed and implemented.  

The textual patch is meant to be applicable on target text that may
even have been slightly modified from the original from which the
patch was taken, and the abbreviated object name on the "index" line
is there mostly for human's sanity check and as a visual aid.
Ordinarily it is not used to actually find the matching blob object
(and it is not an error if there is no matching blob object in the
repository that a patch application is attempted in).

But the binary patch is designed to be applicable only to an exact
copy of the original and nowhere else.  The object name is given in
full, instead of using abbreviated form, to ensure that we do not
try to apply a binary patch to an object whose name is "similar".
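
To make the distinction concrete, here is a minimal sketch (not taken from
Git's source) of how a blob's object name is derived and what the
abbreviated form drops. It assumes a SHA-1 repository and the coreutils
`sha1sum` tool; `blob_name` is a hypothetical helper, not a Git command.

```shell
#!/bin/sh
# Hypothetical helper: print the SHA-1 object name Git assigns to a blob
# holding the contents of file "$1", i.e. the hash of the header
# "blob <size>" plus a NUL byte, followed by the file data.
blob_name() {
    size=$(wc -c < "$1" | tr -d ' ')
    { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

tmp=$(mktemp)
printf 'Stack Overflow\n' > "$tmp"   # same content as stackoverflow.alt

full=$(blob_name "$tmp")                   # 40 hex digits: names exactly one blob
short=$(printf '%s' "$full" | cut -c1-7)   # 7-digit prefix: other blobs could share it

echo "full:  $full"
echo "short: $short"
rm -f "$tmp"
```

The 7-digit prefix is enough as a visual aid on a text patch, but only the
full name can confirm that a binary patch is being applied to the exact
preimage. Inside a repository, `git hash-object <file>` should report the
same full name for unfiltered content.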

Thanks.

Bagas Sanjaya March 22, 2021, 5:47 a.m. UTC | #2
On 22/03/21 00.31, Junio C Hamano wrote:
> Bagas Sanjaya <bagasdotme@gmail.com> writes:
> 
>> What's different between what you expected and what actually happened?
>>
>> Blob hashes for binary files are shown in full form, whereas blob hashes
>> for text files are abbreviated.
> 
> This is working as intended, designed and implemented.
> 
> The textual patch is meant to be applicable on target text that may
> even have been slightly modified from the original from which the
> patch was taken, and the abbreviated object name on the "index" line
> is there mostly for human's sanity check and as a visual aid.
> Ordinarily it is not used to actually find the matching blob object
> (and it is not an error if there is no matching blob object in the
> repository that a patch application is attempted in).
> 
> But the binary patch is designed to be applicable only to an exact
> copy of the original and nowhere else.  The object name is given in
> full, instead of using abbreviated form, to ensure that we do not
> try to apply a binary patch to an object whose name is "similar".
> 
> Thanks.
> 

Hmm... but I don't see that in the documentation for git format-patch.
Maybe I need to send a doc update.

Bagas Sanjaya March 22, 2021, 10:06 a.m. UTC | #3
Oh oh oh, I see the diff format for generating patches is described in the
documentation for git diff-files.

Patch

```
diff --git a/stackoverflow.png b/stackoverflow.png
new file mode 100644
index 0000000000000000000000000000000000000000..969908ad3161a66af31f2441cfea4ae002a5ec67
```

while the diff header for the alt description reads:
```
diff --git a/stackoverflow.alt b/stackoverflow.alt
new file mode 100644
index 0000000..9368417
```

Similarly, for the second patch (IdNTM poster), the diff header for the image reads:
```
diff --git a/idntm.jpg b/idntm.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..8921ab9540e0a36a53f8c6632482fb04d5d0cc6c
```

while the diff header for the alt description reads:
```
diff --git a/idntm.alt b/idntm.alt
new file mode 100644
index 0000000..719feb9
```