
[v2] docs: explain the order of output in the batched mode of git-cat-file(1)

Message ID pull.1768.v2.git.git.1724234729288.gitgitgadget@gmail.com (mailing list archive)
State Superseded

Commit Message

ahmed akef Aug. 21, 2024, 10:05 a.m. UTC
From: ahmed akef <aemed.akef.1@gmail.com>

The batched mode of git-cat-file(1) reads multiple objects from stdin
and prints their respective contents to stdout.
The order in which those objects are printed is not documented
and may not be immediately obvious to the user.
Document it.

Signed-off-by: ahmed akef <aemed.akef.1@gmail.com>
---
    docs: explain the order of output in the batched mode of git-cat-file(1)
    
    This is the same change as https://github.com/git/git/pull/1761, but due
    to missteps that PR got closed and I couldn't fix it. I also applied the
    review comments from @pks-t.
    cc: Patrick Steinhardt ps@pks.im

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1768%2Fahmedakef%2Fexplain-the-order-of-output-in-cat-file-batch-operations-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1768/ahmedakef/explain-the-order-of-output-in-cat-file-batch-operations-v2
Pull-Request: https://github.com/git/git/pull/1768

Range-diff vs v1:

 1:  86a982884ec ! 1:  3f742957aa1 docs: explain the order of output in the batched mode of git-cat-file(1)
     @@ Documentation/git-cat-file.txt: BATCH OUTPUT
       If `--batch` or `--batch-check` is given, `cat-file` will read objects
      -from stdin, one per line, and print information about them. By default,
      -the whole line is considered as an object, as if it were fed to
     -+from stdin, one per line, and print information about them sequentially in the same order.
     -+By default, the whole line is considered as an object, as if it were fed to
     - linkgit:git-rev-parse[1].
     +-linkgit:git-rev-parse[1].
     ++from stdin, one per line, and print information about them in the same
     ++order as they have been read from stdin. By default, the whole line is
     ++considered as an object, as if it were fed to linkgit:git-rev-parse[1].
       
       When `--batch-command` is given, `cat-file` will read commands from stdin,
     + one per line, and print information based on the command given. With


 Documentation/git-cat-file.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)


base-commit: 80ccd8a2602820fdf896a8e8894305225f86f61d
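
As an illustration of the behavior this patch documents, here is a minimal
Python sketch that writes several object names to `git cat-file --batch-check`
and pairs each output line with the input at the same position. The helper
names `pair_in_order` and `batch_check` are hypothetical and not part of Git.
The sketch assumes that `git` is on PATH, that it runs inside a repository,
and that the default output format (one line per input line) is in use.

```python
import subprocess


def pair_in_order(names, output_lines):
    """Pair each input name with the output line at the same position.

    Relies on the documented ordering: the Nth output line describes the
    Nth object name that was written to stdin.
    """
    if len(names) != len(output_lines):
        raise ValueError("expected exactly one output line per input name")
    return list(zip(names, output_lines))


def batch_check(names, repo="."):
    """Run `git cat-file --batch-check` once for all names.

    Assumption: no name contains a newline, and the default format is
    used, so every input line yields a single output line.
    """
    result = subprocess.run(
        ["git", "-C", repo, "cat-file", "--batch-check"],
        input="".join(name + "\n" for name in names),
        capture_output=True,
        text=True,
        check=True,
    )
    return pair_in_order(names, result.stdout.splitlines())
```

A call such as `batch_check(["HEAD", "HEAD^{tree}", "no-such-object"])` would
return one `(name, line)` pair per input, in input order, so the caller does
not need to match results by object ID.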

Comments

Junio C Hamano Aug. 21, 2024, 5:19 p.m. UTC | #1
"ahmed akef via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  If `--batch` or `--batch-check` is given, `cat-file` will read objects
> -from stdin, one per line, and print information about them. By default,
> -the whole line is considered as an object, as if it were fed to
> -linkgit:git-rev-parse[1].
> +from stdin, one per line, and print information about them in the same
> +order as they have been read from stdin. By default, the whole line is
> +considered as an object, as if it were fed to linkgit:git-rev-parse[1].

A few "Huh?" I had while reading the above.

 * "as they have been read from stdin"; drop "from stdin" here, as
   we already know we are talking about the mode that reads object
   names from the standard input and there is no need to repeat it.

 * "considered as an object" -> "considered to be an object name" or
   "used as an object name" [*].  This primarily comes from my
   spinal reflex against "consider as", plus my desire to be more
   precise in terminology.

Thanks.

Nothing mentioned below should be part of this patch, but as I
noticed it while studying the current documentation to prepare this
review, I'll record them as #leftoverbits.

The description of what lines read from the standard input should
look like needs more work.  Documentation on "--batch" says "the
input lines must specify the path, separated by whitespace", but is
it clear that it expects "<object name>", followed by a whitespace
(not necessarily a single SP), followed by "<path>"?  Without prior
knowledge, I wouldn't be surprised if somebody read the text as
asking for paths separated by whitespace, e.g.

    README.txt COPYING Makefile

that gives three paths.  The text needs to be tightened to say
something like "must give the path after the object name, separated
by whitespace.  The path is used to find the textconv and smudge
filter".

The section also says "See the section BATCH OUTPUT below for
details." but the section it refers to does not say anything about
this whitespace thing.  It also is unclear what would happen if the
input line says:

    :COPYING Makefile

Would it apply the textconv/filters meant for Makefile to the blob
stored at COPYING in the index?  If we say

    :README.txt
    
would the command be smart enough to know that the blob came from
the path README.txt and apply the textconv/filters meant for that
path, without the input repeating the same information twice like:

    :README.txt README.txt

or something silly like that?
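
To make the "<object name>, whitespace, <path>" reading concrete, here is a
small Python sketch that builds one such input line. The function name
`format_batch_line` is hypothetical, and the sketch assumes the line is split
at the first whitespace, so the object name itself must not contain any. It
does not answer the open questions above about which path's filters apply.

```python
def format_batch_line(object_name, path):
    """Build one stdin line of the form '<object name> <path>'.

    Assumption: the reader splits the line at the first whitespace, so an
    object name containing whitespace cannot be expressed in this form.
    """
    if not object_name or any(ch.isspace() for ch in object_name):
        raise ValueError("object name must be non-empty and contain no whitespace")
    if not path or "\n" in path:
        raise ValueError("path must be non-empty and fit on a single line")
    return f"{object_name} {path}\n"
```

For example, `format_batch_line(":README.txt", "README.txt")` produces the
line `:README.txt README.txt`, the repeated form mentioned above.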


[Reference]

* https://www.britannica.com/dictionary/eb/qa/consider-and-consider-as
ahmed akef Aug. 22, 2024, 7:31 a.m. UTC | #2
> * "as they have been read from stdin"; drop "from stdin" here, as
>   we already know we are talking about the mode that reads object
>   names from the standard input and there is no need to repeat it.

it is needed to explain that git will not do any optimization to the
order of paths before printing the output. I entered a discussion with
someone who was worried that git may optimize the paths order because
it is not stated clearly that output follows the same order as input.

> * "considered as an object" -> "considered to be an object name" or
>   "used as an object name" [*].  This primarily comes from my
>   spinal reflex against "consider as", plus my desire to be more
>   precise in terminology.

this is not related to my changes, I just moved this line but didn't change it.


Thanks,
ahmed akef


Junio C Hamano Aug. 22, 2024, 3:07 p.m. UTC | #3
Ahmed Akef <aemed.akef.1@gmail.com> writes:

Administrivia: please do not break the discussion thread by dropping
In-Reply-To: and / or References: headers.  For those who are
following from sidelines, this is a response to

    https://lore.kernel.org/git/xmqqa5h5ztd9.fsf@gitster.g/

>> * "as they have been read from stdin"; drop "from stdin" here, as
>>   we already know we are talking about the mode that reads object
>>   names from the standard input and there is no need to repeat it.
>
> it is needed to explain that git will not do any optimization to the
> order of paths before printing the output.

I do not think "from stdin" is necessary for that.  The sentence
begins ...

>  If `--batch` or `--batch-check` is given, `cat-file` will read objects
> +from stdin, one per line, and print information about them in the same
> +order as they have been read from stdin. By default, the whole line is
> +considered as an object, as if it were fed to linkgit:git-rev-parse[1].

... by explaining that the command reads from the standard input, and
does something to each in the same order as they were read.  If you
already said you are reading from the standard input, the order you
read them is the order you read them from the standard input.

Drop the "from stdin" from "as they have been read from stdin".

Patch

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index bd95a6c10a7..c39074b9ee6 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -270,9 +270,9 @@  BATCH OUTPUT
 ------------
 
 If `--batch` or `--batch-check` is given, `cat-file` will read objects
-from stdin, one per line, and print information about them. By default,
-the whole line is considered as an object, as if it were fed to
-linkgit:git-rev-parse[1].
+from stdin, one per line, and print information about them in the same
+order as they have been read from stdin. By default, the whole line is
+considered as an object, as if it were fed to linkgit:git-rev-parse[1].
 
 When `--batch-command` is given, `cat-file` will read commands from stdin,
 one per line, and print information based on the command given. With