diff mbox series

"git diff" does not show a diff for newly added, binary files

Message ID 3473764.PTxrJRyG3s@earendil (mailing list archive)
State New, archived

Commit Message

Thorsten Otto April 4, 2023, 9:58 a.m. UTC
"git diff" does not show a diff for newly added, binary files

What did you do before the bug happened? (Steps to reproduce your issue)

$ git init .
$ touch a
$ git add a
$ git commit -m "first commit"
$ dd if=/dev/zero of=b count=1
$ git add b
$ echo hello > c
$ git add c
$ git diff --cached

What did you expect to happen? (Expected behavior)

I expected a binary diff for the new file, just like it is done
when comparing two different, already committed revisions.

What happened instead? (Actual behavior)

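The "git diff" command only showed a diff for the text file c,
but not for the binary file b:

diff --git a/b b/b
new file mode 100644
index 0000000..a64a5a9
Binary files /dev/null and b/b differ
diff --git a/c b/c
new file mode 100644
index 0000000..ce01362
--- /dev/null
+++ b/c
@@ -0,0 +1 @@
+hello
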
[System Info]
git version:
git version 2.39.0
cpu: x86_64
no commit associated with this build
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
uname: Linux 6.1.3-1-default #1 SMP PREEMPT_DYNAMIC Wed Jan  4 11:03:53 UTC 2023 (a5315fb) x86_64
compiler info: gnuc: 12.2
libc info: glibc: 2.36
$SHELL (typically, interactive shell): /bin/bash


[Enabled Hooks]

Comments

Kristoffer Haugsbakk April 4, 2023, 10:17 a.m. UTC | #1
On Tue, Apr 4, 2023, at 11:58, Thorsten Otto wrote:
> "git diff" does not show a diff for newly added, binary files
> […]
> I expected a binary diff for the new file, just like it is done
> when comparing two different, already committed revisions.

Do you use `.gitattributes` to get these binary diffs? What does it look like?
Thorsten Otto April 4, 2023, 10:39 a.m. UTC | #2
On Tuesday, 4 April 2023 12:17:43 CEST Kristoffer Haugsbakk wrote:

> Do you use `.gitattributes` to get these binary diffs? What does it look
> like?

No, the repo was just created for demonstration purposes. But when I commit
that last change and then run "git format-patch -1", I get something like

diff --git a/b b/b
new file mode 100644
index 0000000000000000000000000000000000000000..a64a5a93fb4aef4d5f63d79cb2582731b9ac5063
GIT binary patch
literal 512
NcmZQz7zHCa1ONg600961

literal 0
HcmV?d00001


I would expect "git diff" to output the same information. I don't see a reason 
why it outputs the diff for a new text file, but not for a binary file?
Kristoffer Haugsbakk April 4, 2023, 10:45 a.m. UTC | #3
On Tue, Apr 4, 2023, at 12:39, Thorsten Otto wrote:
> I would expect "git diff" to output the same information. I don't see a reason 
> why it outputs the diff for a new text file, but not for a binary file?

Has it done that before? As in, has git(1) ever behaved the way you expect
on this point? Because `git diff` has never diffed binary files for me
unless I instructed it to do so via some `.gitattributes` configuration.
Thorsten Otto April 4, 2023, 11:23 a.m. UTC | #4
On Tuesday, 4 April 2023 12:45:35 CEST Kristoffer Haugsbakk wrote:
> Has it done that before?

Not that I know of.

> unless I instructed it to do it via some `.gitattributes`
> configuration.

How can that be done? I mean, git detects that file to be binary. I certainly 
don't want to treat it as text, and then dump binary data to the terminal when 
they differ ;)
Kristoffer Haugsbakk April 4, 2023, 11:29 a.m. UTC | #5
On Tue, Apr 4, 2023, at 13:23, Thorsten Otto wrote:
> How can that be done? I mean, git detects that file to be binary. I certainly 
> don't want to treat it as text, and then dump binary data to the terminal when 
> they differ ;)

I used something like this when I last needed to diff binary files: https://superuser.com/a/706286/259670
Jeff King April 4, 2023, 3:36 p.m. UTC | #6
On Tue, Apr 04, 2023 at 12:39:09PM +0200, Thorsten Otto wrote:

> On Tuesday, 4 April 2023 12:17:43 CEST Kristoffer Haugsbakk wrote:
> 
> > Do you use `.gitattributes` to get these binary diffs? What does it look
> > like?
> 
> No, the repo was just created for demonstration purposes. But when I commit
> that last change and then run "git format-patch -1", I get something like
> 
> diff --git a/b b/b
> new file mode 100644
> index 0000000000000000000000000000000000000000..a64a5a93fb4aef4d5f63d79cb2582731b9ac5063
> GIT binary patch
> literal 512
> NcmZQz7zHCa1ONg600961
> 
> literal 0
> HcmV?d00001
> 
> 
> I would expect "git diff" to output the same information. I don't see a reason 
> why it outputs the diff for a new text file, but not for a binary file?

The behavior you're seeing is expected.

The default for "git diff" is not to show binary patches, because they
are often gigantic, and are meaningless to human readers. You can use
"--binary" if you want to see them.

The default for "git format-patch" is different, because there the point
is to send the patch to somebody to be applied with "git am", so it's
important that all information is included.

-Peff
Junio C Hamano April 4, 2023, 3:37 p.m. UTC | #7
Thorsten Otto <admin@tho-otto.de> writes:

> "git diff" does not show a diff for newly added, binary files

The command is working as designed, whether for newly added files,
modified ones, or deleted ones.  In your sample output, we can see
"Binary files differ", which is the default "diff" for binary
contents.  This is in line with how other people's "diff" programs work.

> What happened instead? (Actual behavior)
>
> The "git diff" command only showed a diff for the text file c, 
> but not for the binary file b:
>
> diff --git a/b b/b
> new file mode 100644
> index 0000000..a64a5a9
> Binary files /dev/null and b/b differ
> diff --git a/c b/c
> new file mode 100644
> index 0000000..ce01362
> --- /dev/null
> +++ b/c
> @@ -0,0 +1 @@
> +hello

As the primary purpose of "git format-patch" is to convey the
content change between a pair of states (i.e. the change that brings
the state at a particular commit to the state at a commit that is
its child), it implicitly enables the "binary patch" option, but for
"git diff", which is meant to be an interactive browser of changes
for human consumption, the "binary patch" option is off by default.

You can run "git diff --binary".
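
For illustration, here is a minimal sketch of the difference between the default output and "--binary", following the reproduction steps from the original report. The throwaway repository, the identity set through the GIT_AUTHOR_*/GIT_COMMITTER_* variables, and the variable names default_out and binary_out are assumptions made for this example, not part of the report.

```shell
#!/bin/sh
# Sketch only: runs in a scratch repository created with mktemp.
set -e
repo=$(mktemp -d)
cd "$repo"

# Placeholder identity so that "git commit" works without user configuration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

git init -q .
touch a
git add a
git commit -q -m "first commit"

# One 512-byte block of zeros, which git detects as binary.
dd if=/dev/zero of=b count=1 2>/dev/null
git add b

# Default: a one-line "Binary files ... differ" summary.
default_out=$(git diff --cached)
# With --binary: a "GIT binary patch" section that "git apply" can consume.
binary_out=$(git diff --cached --binary)

printf '%s\n' "--- default ---" "$default_out"
printf '%s\n' "--- with --binary ---" "$binary_out"
```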
Jeff King April 4, 2023, 3:39 p.m. UTC | #8
On Tue, Apr 04, 2023 at 11:58:38AM +0200, Thorsten Otto wrote:

> "git diff" does not show a diff for newly added, binary files
> 
> What did you do before the bug happened? (Steps to reproduce your issue)
> 
> $ git init .
> $ touch a
> $ git add a
> $ git commit -m "first commit"
> $ dd if=/dev/zero of=b count=1
> $ git add b
> $ echo hello > c
> $ git add c
> $ git diff --cached
> 
> What did you expect to happen? (Expected behavior)
> 
> I expected a binary diff for the new file, just like it is done
> when comparing two different, already committed revisions.

I responded elsewhere in the thread mentioning "git diff --binary", but
note this part of the report is a little misleading. The difference is
not showing newly added files versus committed revisions. The difference
is between "git diff" and "git format-patch". If you commit the result
above and then run:

  git diff HEAD^ HEAD

it will likewise not show the binary patch (unless you specify
--binary). Likewise for "git show", etc. I think that format-patch is
the only command with binary diffs turned on by default.

-Peff
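
The point above can be checked with a short sketch that commits the binary file and then compares three commands on the same commit. The scratch repository, the placeholder identity, and the variable names diff_out, show_out and patch_out are assumptions for this example only.

```shell
#!/bin/sh
# Sketch only: runs in a scratch repository created with mktemp.
set -e
repo=$(mktemp -d)
cd "$repo"

# Placeholder identity so that "git commit" works without user configuration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

git init -q .
touch a
git add a
git commit -q -m "first commit"

dd if=/dev/zero of=b count=1 2>/dev/null
git add b
git commit -q -m "add binary file b"

# Both of these use the default diff output for binary content.
diff_out=$(git diff HEAD^ HEAD)
show_out=$(git show)
# format-patch writes the patch to stdout here instead of a file.
patch_out=$(git format-patch -1 --stdout)

printf '%s\n' "--- git diff HEAD^ HEAD ---" "$diff_out"
printf '%s\n' "--- git show ---" "$show_out"
printf '%s\n' "--- git format-patch -1 --stdout ---" "$patch_out"
```

Adding "--binary" to the "git diff" or "git show" invocation should produce the same "GIT binary patch" section that format-patch emits.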
Junio C Hamano April 4, 2023, 3:42 p.m. UTC | #9
"Kristoffer Haugsbakk" <code@khaugsbakk.name> writes:

> On Tue, Apr 4, 2023, at 11:58, Thorsten Otto wrote:
>> "git diff" does not show a diff for newly added, binary files
>> […]
>> I expected a binary diff for the new file, just like it is done
>> when comparing two different, already committed revisions.
>
> Do you use `.gitattributes` to get these binary diffs? What does it look like?

The attribute "binary" can be used like this

    *.mybin	binary

to declare that all files with the .mybin suffix are to be treated
as binary files.  "git diff", "git format-patch", etc. will treat
such a file as "binary".

What they actually do to "binary files" is a different story.  The
internal diff machinery by default shows "Binary files A and B differ"
just like everybody else's "diff" program does, but has an option to
show the "binary patch" format we invented.  "git diff --binary" enables
the option, and for some commands, the option is enabled by default.
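
As a rough sketch of how the attribute and the option interact, the following marks a plain-text file as binary through .gitattributes and then diffs it with and without "--binary". The file name sample.mybin, the scratch repository, and the variable names are hypothetical and chosen only for this example.

```shell
#!/bin/sh
# Sketch only: runs in a scratch repository created with mktemp.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Declare every *.mybin path as binary, as in the attribute line above.
printf '*.mybin\tbinary\n' > .gitattributes

# The content is plain text, but the attribute overrides content detection.
echo hello > sample.mybin
git add .gitattributes sample.mybin

# Ask git which value the "binary" attribute has for this path.
attr_out=$(git check-attr binary -- sample.mybin)
# Default diff output versus the binary patch format.
default_out=$(git diff --cached -- sample.mybin)
binary_out=$(git diff --cached --binary -- sample.mybin)

printf '%s\n' "$attr_out"
printf '%s\n' "--- default ---" "$default_out"
printf '%s\n' "--- with --binary ---" "$binary_out"
```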

Patch

diff --git a/b b/b
new file mode 100644
index 0000000..a64a5a9
Binary files /dev/null and b/b differ
diff --git a/c b/c
new file mode 100644
index 0000000..ce01362
--- /dev/null
+++ b/c
@@ -0,0 +1 @@ 
+hello