diff mbox series

[2/3] diff-format.txt: correct misleading wording

Message ID 5ff3a52556716e92f7207c75660cec2eb3da2587.1655123383.git.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit 3b396c899f946604328d863154c4c988e1d777c1
Headers show
Series Fix a few documentation errors around the raw diff output | expand

Commit Message

Philippe Blain June 13, 2022, 12:29 p.m. UTC
From: Philippe Blain <levraiphilippeblain@gmail.com>

Near the end of the "Raw output format" section, an example shows the
output of 'git diff-files' for a tracked file modified on disk but not
yet added to the index. However the wording is:

    <sha1> is shown as all 0's if a file is new on the filesystem
    and it is out of sync with the index.

which is confusing since it can be understood to mean that 'file' is a
new, yet untracked file, in which case 'git diff-files' does not care
about it at all.

When this example was introduced all the way back in c64b9b8860
(Reference documentation for the core git commands., 2005-05-05), 'old'
and 'new' referred to the two entities being compared, depending on the
command being used (diff-index, diff-tree or diff-files - which at the
time were diff-cache, diff-tree and show-diff). The wording used at the
time was:

    <new-sha1> is shown as all 0's if new is a file on the
    filesystem and it is out of sync with the cache.

This section was reworked in 81e50eabf0 ([PATCH] The diff-raw
format updates., 2005-05-21) and the mention of the meaning of 'new' and
'old' was removed. Then in f73ae1fc5d (Some typos and light editing of
various manpages, 2005-10-05), the wording was changed to what it is
now.

In addition, in b6d8f309d9 ([PATCH] diff-raw format update take #2.,
2005-05-23), the section was further reworked and did not use '<sha1>'
anymore, making the example the sole user of this token.

Rework the introductory sentence of the example to instead refer to
'sha1 for "dst"', which is what the text description above it uses, and
fix the wording so that we do not mention a "new file".

While at it, also tweak the wording used in the description of the raw
format to explicitely state that all 0's are used for the destination
hash if the working tree is out of sync with the index, instead of the
more vague "look at worktree".

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
---
 Documentation/diff-format.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Junio C Hamano June 13, 2022, 6:56 p.m. UTC | #1
"Philippe Blain via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Philippe Blain <levraiphilippeblain@gmail.com>
>
> Near the end of the "Raw output format" section, an example shows the
> output of 'git diff-files' for a tracked file modified on disk but not
> yet added to the index. However the wording is:
>
>     <sha1> is shown as all 0's if a file is new on the filesystem
>     and it is out of sync with the index.
>
> which is confusing since it can be understood to mean that 'file' is a
> new, yet untracked file, in which case 'git diff-files' does not care
> about it at all.

I do not think such an understanding is sensible, as "untracked
file" cannot be "out of sync with the index", because even its stale
version wouldn't be in the index if it is untracked.

But I agree that not all people are sensible, and it would be nicer
if the documentation helped them, too ;-)

> When this example was introduced all the way back in c64b9b8860
> (Reference documentation for the core git commands., 2005-05-05), 'old'
> and 'new' referred to the two entities being compared, depending on the
> command being used (diff-index, diff-tree or diff-files - which at the
> time were diff-cache, diff-tree and show-diff). The wording used at the
> time was:
>
>     <new-sha1> is shown as all 0's if new is a file on the
>     filesystem and it is out of sync with the cache.

Yeah, I remember this version of wording.

> Rework the introductory sentence of the example to instead refer to
> 'sha1 for "dst"', which is what the text description above it uses, and
> fix the wording so that we do not mention a "new file".

That's good.  We may need to upgrade them to 'object name' to wean
ourselves away from SHA-1 but that should be a separate topic.

> While at it, also tweak the wording used in the description of the raw
> format to explicitely state that all 0's are used for the destination
> hash if the working tree is out of sync with the index, instead of the
> more vague "look at worktree".

I am not sure if that is a good idea.  Those who understand what the
"work tree out of sync with" phrase mean would understand "look at
work tree" but the reverse would not be true.

The other hunk about "new" -> "dst" is a good change regardless, but
even there, "out of sync with" may need to be rewritten to make it
easier to understand.  Is it different from "the index does not know
the exact value (hence you need to look in the working tree if you
really cared to find it out, perhaps with hash-object)"?

Thanks.

> Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
> ---
>  Documentation/diff-format.txt | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/diff-format.txt b/Documentation/diff-format.txt
> index b8319eeb41d..a3ae8747a22 100644
> --- a/Documentation/diff-format.txt
> +++ b/Documentation/diff-format.txt
> @@ -43,7 +43,7 @@ That is, from the left to the right:
>  . a space.
>  . sha1 for "src"; 0\{40\} if creation or unmerged.
>  . a space.
> -. sha1 for "dst"; 0\{40\} if deletion, unmerged or "look at work tree".
> +. sha1 for "dst"; 0\{40\} if deletion, unmerged or "work tree out of sync with the index".
>  . a space.
>  . status, followed by optional "score" number.
>  . a tab or a NUL when `-z` option is used.
> @@ -69,8 +69,8 @@ percentage of similarity between the source and target of the move or
>  copy).  Status letter M may be followed by a score (denoting the
>  percentage of dissimilarity) for file rewrites.
>  
> -<sha1> is shown as all 0's if a file is new on the filesystem
> -and it is out of sync with the index.
> +The sha1 for "dst" is shown as all 0's if a file on the filesystem
> +is out of sync with the index.
>  
>  Example:
Philippe Blain June 14, 2022, 12:31 p.m. UTC | #2
Le 2022-06-13 à 14:56, Junio C Hamano a écrit :
> "Philippe Blain via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Philippe Blain <levraiphilippeblain@gmail.com>
>>
>> Near the end of the "Raw output format" section, an example shows the
>> output of 'git diff-files' for a tracked file modified on disk but not
>> yet added to the index. However the wording is:
>>
>>     <sha1> is shown as all 0's if a file is new on the filesystem
>>     and it is out of sync with the index.
>>
>> which is confusing since it can be understood to mean that 'file' is a
>> new, yet untracked file, in which case 'git diff-files' does not care
>> about it at all.
> 
> I do not think such an understanding is sensible, as "untracked
> file" cannot be "out of sync with the index", because even its stale
> version wouldn't be in the index if it is untracked.
> 
> But I agree that not all people are sensible, and it would be nicer
> if the documentation helped them, too ;-)

Well, call me not sensible then ;) When I read that, I had trouble understanding
what was meant precisely because I had what you wrote in mind.

> 
>> When this example was introduced all the way back in c64b9b8860
>> (Reference documentation for the core git commands., 2005-05-05), 'old'
>> and 'new' referred to the two entities being compared, depending on the
>> command being used (diff-index, diff-tree or diff-files - which at the
>> time were diff-cache, diff-tree and show-diff). The wording used at the
>> time was:
>>
>>     <new-sha1> is shown as all 0's if new is a file on the
>>     filesystem and it is out of sync with the cache.
> 
> Yeah, I remember this version of wording.
> 
>> Rework the introductory sentence of the example to instead refer to
>> 'sha1 for "dst"', which is what the text description above it uses, and
>> fix the wording so that we do not mention a "new file".
> 
> That's good.  We may need to upgrade them to 'object name' to wean
> ourselves away from SHA-1 but that should be a separate topic.
> 
>> While at it, also tweak the wording used in the description of the raw
>> format to explicitely state that all 0's are used for the destination
>> hash if the working tree is out of sync with the index, instead of the
>> more vague "look at worktree".
> 
> I am not sure if that is a good idea.  Those who understand what the
> "work tree out of sync with" phrase mean would understand "look at
> work tree" but the reverse would not be true.

I'm not sure I agree. Even looking at it from a grammatical perspective,
"the hash is all zero if a path is unmerged, is a deletion or is out of sync
with the index" makes more sense to me than "the hash is all zero if a path is 
unmerged, is a deletion or "look at worktree""...

> 
> The other hunk about "new" -> "dst" is a good change regardless, but
> even there, "out of sync with" may need to be rewritten to make it
> easier to understand.  Is it different from "the index does not know
> the exact value (hence you need to look in the working tree if you
> really cared to find it out, perhaps with hash-object)"?

I think it's clear what is meant, I don't think we need to change it.

Thanks,

Philippe.
diff mbox series

Patch

diff --git a/Documentation/diff-format.txt b/Documentation/diff-format.txt
index b8319eeb41d..a3ae8747a22 100644
--- a/Documentation/diff-format.txt
+++ b/Documentation/diff-format.txt
@@ -43,7 +43,7 @@  That is, from the left to the right:
 . a space.
 . sha1 for "src"; 0\{40\} if creation or unmerged.
 . a space.
-. sha1 for "dst"; 0\{40\} if deletion, unmerged or "look at work tree".
+. sha1 for "dst"; 0\{40\} if deletion, unmerged or "work tree out of sync with the index".
 . a space.
 . status, followed by optional "score" number.
 . a tab or a NUL when `-z` option is used.
@@ -69,8 +69,8 @@  percentage of similarity between the source and target of the move or
 copy).  Status letter M may be followed by a score (denoting the
 percentage of dissimilarity) for file rewrites.
 
-<sha1> is shown as all 0's if a file is new on the filesystem
-and it is out of sync with the index.
+The sha1 for "dst" is shown as all 0's if a file on the filesystem
+is out of sync with the index.
 
 Example: